CodexBloom - Programming Q&A Platform

ValueError when using tf.data.Dataset with MultiWorkerMirroredStrategy in TensorFlow 2.11

👀 Views: 94 đŸ’Ŧ Answers: 1 📅 Created: 2025-06-14
tensorflow tf.data distributed-training Python

I'm setting up distributed training with `tf.distribute.MultiWorkerMirroredStrategy` in TensorFlow 2.11 and running into a problem with `tf.data.Dataset`. Specifically, when I attempt to map a function over the dataset, it fails with:

```
ValueError: Tensor conversion failed.
```

I create the dataset from a directory of images using `tf.keras.preprocessing.image_dataset_from_directory` and then apply a simple preprocessing function with `map`, but it produces this error when the training is distributed across multiple workers. Here's the code snippet I'm using:

```python
import tensorflow as tf

# Setting up the strategy
strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():
    batch_size = 32
    dataset = tf.keras.preprocessing.image_dataset_from_directory(
        'path/to/data',
        batch_size=batch_size,
        image_size=(224, 224)
    )

    def preprocess_image(image, label):
        image = tf.image.resize(image, (224, 224))
        # Normalize pixel values to [0, 1]
        return image / 255.0, label

    dataset = dataset.map(preprocess_image)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
```

I've also tried `dataset.shard` to manage the distribution explicitly, but that didn't fix the problem either. I verified that all paths and data formats are correct. Can anyone explain why this is happening and how to properly set up a dataset for multi-worker training in this case?

For context: I'm on Debian using the latest version of Python. I appreciate any insights!
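For reference, this is roughly how I attempted the explicit sharding (a minimal runnable sketch: the `tf.data.Dataset.range` stand-in replaces my real image dataset, and the `TF_CONFIG` parsing assumes the standard multi-worker environment variable set by the launcher):

```python
import json
import os

import tensorflow as tf

# Each worker derives its shard index from TF_CONFIG,
# which the multi-worker launcher is expected to set
tf_config = json.loads(os.environ.get("TF_CONFIG", "{}"))
workers = tf_config.get("cluster", {}).get("worker", [])
num_workers = max(len(workers), 1)
worker_index = tf_config.get("task", {}).get("index", 0)

# Stand-in for the real image dataset so the sketch runs anywhere
dataset = tf.data.Dataset.range(8)

# shard(n, i) keeps every n-th element starting at element i,
# so each worker sees a disjoint subset of the data
dataset = dataset.shard(num_shards=num_workers, index=worker_index)
```

With a single worker (empty `TF_CONFIG`), this shards with `num_shards=1`, which is a no-op, so the failure only shows up in the actual multi-worker run.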