CodexBloom - Programming Q&A Platform

Issues when using tf.data.Dataset with a custom data generator in TensorFlow 2.12

👀 Views: 213 💬 Answers: 1 📅 Created: 2025-06-17
tensorflow tf.data data-generator TensorFlow-2.12 Python

I'm trying to use a custom data generator with `tf.data.Dataset` in TensorFlow 2.12, but I'm running into an issue where the dataset keeps throwing an `InvalidArgumentError: Fetching from iterator failed` when I attempt to iterate through it.

I've defined my generator class like this:

```python
import tensorflow as tf
import numpy as np

class MyDataGenerator:
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index < len(self.data):
            batch_data = self.data[self.index]
            batch_labels = self.labels[self.index]
            self.index += 1
            return batch_data, batch_labels
        else:
            raise StopIteration

data = np.random.rand(100, 32)
labels = np.random.randint(0, 2, (100, 1))

generator = MyDataGenerator(data, labels)

dataset = tf.data.Dataset.from_generator(
    lambda: generator,
    output_signature=(tf.TensorSpec(shape=(32,), dtype=tf.float32),
                      tf.TensorSpec(shape=(1,), dtype=tf.int32))
)
```

When I try to consume the dataset with:

```python
for batch_data, batch_labels in dataset:
    print(batch_data.shape, batch_labels.shape)
```

I get the following error:

```
InvalidArgumentError: Fetching from iterator failed: DataGenerator has not been initialized
```

I've checked that my generator class is correctly implemented, and I can manually iterate through it without any issues outside of TensorFlow. I also ensured that the output signatures match the shapes of the data I'm generating. The error persists even when I switch to `tf.data.Dataset.from_tensor_slices` instead. I've also tried placing the generator inside a `tf.function`, but it didn't help.

What could I be missing here? The project is a Python application. Has anyone else encountered this?
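One thing that may be worth checking: `MyDataGenerator` returns `self` from `__iter__` and never resets `self.index`, so the single instance captured by `lambda: generator` can only be walked once. Below is a minimal sketch of a re-iterable variant. The class name `ReiterableGenerator` and the small list inputs are hypothetical, chosen only for illustration, and the sketch uses plain Python so it runs without TensorFlow.

```python
class ReiterableGenerator:
    """Hypothetical variant of MyDataGenerator that can be iterated repeatedly."""

    def __init__(self, data, labels):
        if len(data) != len(labels):
            raise ValueError("data and labels must have the same length")
        self.data = data
        self.labels = labels

    def __iter__(self):
        # Because this method contains `yield`, every call to iter() builds
        # a fresh generator object that starts again from the first sample.
        for sample, label in zip(self.data, self.labels):
            yield sample, label


# Illustrative inputs only; any indexable sequences of equal length work.
gen = ReiterableGenerator(
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0], [1], [0]],
)

first_pass = list(gen)   # consumes one full pass
second_pass = list(gen)  # a second pass yields the same samples again
```

With this shape, something like `lambda: ReiterableGenerator(data, labels)` could be passed to `tf.data.Dataset.from_generator` so that each pass over the dataset starts from the first sample. Whether this alone resolves the reported `InvalidArgumentError` is not established here; it only removes the single-use behaviour.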