CodexBloom - Programming Q&A Platform

Issues with Merge Sort Performance in Python - Unexpectedly Slow Runtime on Large Datasets

πŸ‘€ Views: 46 πŸ’¬ Answers: 1 πŸ“… Created: 2025-06-06
python algorithms sorting

I'm implementing Merge Sort in Python to sort large datasets, and I'm running into performance that seems unusual. Merge Sort should run in O(n log n) time, but when sorting lists of over a million integers, my implementation is much slower than I anticipated. I've tried optimizing the merge step by minimizing memory usage, but performance still isn't what I expected.

Here's the code I'm currently using:

```python
def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2   # Find the midpoint of the array
        L = arr[:mid]         # Divide the elements into 2 halves
        R = arr[mid:]

        merge_sort(L)         # Sort the first half
        merge_sort(R)         # Sort the second half

        # Merge the two sorted halves back into arr
        i = j = k = 0
        while i < len(L) and j < len(R):
            if L[i] < R[j]:
                arr[k] = L[i]
                i += 1
            else:
                arr[k] = R[j]
                j += 1
            k += 1

        # Copy any remaining elements of L
        while i < len(L):
            arr[k] = L[i]
            i += 1
            k += 1

        # Copy any remaining elements of R
        while j < len(R):
            arr[k] = R[j]
            j += 1
            k += 1

    return arr
```

I'm calling the function like this:

```python
import random

large_dataset = [random.randint(0, 1000000) for _ in range(1000000)]
merge_sort(large_dataset)
```

The runtime exceeds 10 seconds on average for datasets of this size, which feels inefficient. I've verified that my machine has plenty of RAM and CPU headroom. I'm not sure whether I'm hitting an inherent Python limitation or whether my merge implementation can be optimized further. Any insights on potential pitfalls or suggestions for improving performance would be greatly appreciated! I've been using Python for about a year now. Thanks for taking the time to read this!
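Edit: in case anyone wants to reproduce this, here's a minimal sketch of how I've been timing it, with the built-in list.sort as a baseline (this is just a quick harness I put together locally, not a rigorous benchmark):

```python
import random
import time

def benchmark(sort_fn, data):
    """Time one run of sort_fn on a fresh copy of data."""
    working_copy = list(data)  # Copy so each run sorts unsorted input
    start = time.perf_counter()
    sort_fn(working_copy)
    return time.perf_counter() - start

large_dataset = [random.randint(0, 1000000) for _ in range(1000000)]

# My recursive implementation vs. the C-implemented built-in sort
print(f"merge_sort: {benchmark(merge_sort, large_dataset):.2f}s")
print(f"list.sort:  {benchmark(list.sort, large_dataset):.2f}s")
```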