CodexBloom - Programming Q&A Platform

Azure Data Factory: Issues with Copy Activity Performance When Using Azure Blob Storage as Source and Sink

👀 Views: 47 💬 Answers: 1 📅 Created: 2025-06-11
azure-data-factory azure-blob-storage performance-optimization json

I'm sure I'm missing something obvious here, but I'm experiencing significant performance issues when using Azure Data Factory to copy data from an Azure Blob Storage source to another Blob Storage sink. The pipeline uses a Copy Data activity, but throughput is far below what I expected: the source dataset is around 10 GB, yet the copy often takes several hours to complete, and I'm not sure why. I've tried setting the parallel copies option on the activity to 10, but there's no noticeable improvement.

Here is the configuration I'm using for the Copy activity:

```json
{
  "name": "CopyFromBlobToBlob",
  "type": "Copy",
  "inputs": [
    {
      "referenceName": "SourceBlobDataset",
      "type": "DatasetReference"
    }
  ],
  "outputs": [
    {
      "referenceName": "SinkBlobDataset",
      "type": "DatasetReference"
    }
  ],
  "typeProperties": {
    "source": {
      "type": "BlobSource",
      "recursive": true
    },
    "sink": {
      "type": "BlobSink"
    },
    "parallelCopies": 10
  }
}
```

I've verified that both the source and sink storage accounts are in the same region, so cross-region latency shouldn't be a factor. I also checked the performance metrics for both storage accounts, and they don't show any throttling, yet the copy operation still takes far too long.

Are there any best practices or specific configurations I might be overlooking that could improve the performance of this copy activity? Are there settings I can tweak, or alternative strategies I could employ to speed up this data transfer? Has anyone else encountered this?
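For reference, this is the variant I was planning to test next: explicitly setting `dataIntegrationUnits` in `typeProperties` instead of leaving it on auto, since my reading of the docs suggests ADF may otherwise pick a low DIU value for blob-to-blob copies. The value of 32 below is just a guess on my part, not something I've validated:

```json
{
  "name": "CopyFromBlobToBlob",
  "type": "Copy",
  "description": "Experiment: explicit DIUs alongside parallelCopies; 32 is an untested guess",
  "inputs": [
    { "referenceName": "SourceBlobDataset", "type": "DatasetReference" }
  ],
  "outputs": [
    { "referenceName": "SinkBlobDataset", "type": "DatasetReference" }
  ],
  "typeProperties": {
    "source": { "type": "BlobSource", "recursive": true },
    "sink": { "type": "BlobSink" },
    "dataIntegrationUnits": 32,
    "parallelCopies": 10
  }
}
```

If anyone can confirm whether raising DIUs actually helps for blob-to-blob copies, or whether the bottleneck is more likely the number and size of the files being copied, that would be much appreciated.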