How to perform a weighted average calculation in a Pandas DataFrame with grouped data?

👀 Views: 43 💬 Answers: 1 📅 Created: 2025-08-30

I'm trying to calculate a weighted average from a Pandas DataFrame that contains sales data, but I'm running into issues with the grouping and aggregation. I have a DataFrame structured like this:

```python
import pandas as pd

data = {
    'Product': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Sales': [200, 300, 400, 500, 600, 700],
    'Weight': [1, 2, 1, 3, 2, 1]
}
df = pd.DataFrame(data)
print(df)
```

Each row holds a sales figure for a product along with its corresponding weight. I want to compute the weighted average sales for each product. I thought I could group by 'Product' and then apply a custom aggregation function, but I'm not sure how to do this correctly. I attempted the following code:

```python
def weighted_avg(group):
    # Sum of Sales * Weight divided by the sum of weights within the group
    return (group['Sales'] * group['Weight']).sum() / group['Weight'].sum()

result = df.groupby('Product').apply(weighted_avg)
print(result)
```

However, I'm getting the following output instead of the expected weighted averages:

```
Product
A    250.0
B    466.666...
C    600.0
dtype: float64
```

The average for Product B seems incorrect; I expected it to reflect the weights across the sales values. The values for some of the other products don't match my expectations either. I've checked various documentation but haven't found a clear explanation or solution.

How can I ensure that the weighted average is calculated accurately for each product in my DataFrame? Am I approaching this the right way, or is there a better approach, especially for larger datasets?

My development environment is CentOS; I'm also running this on Windows 10 with the latest version of Python. Any help would be greatly appreciated!
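
For reference, here is a minimal sketch of a vectorized alternative that avoids `apply`, which may matter on larger datasets. It assumes the same sample data as above; the `weighted_sales` column and the `sums`, `vectorized`, and `check` names are hypothetical helpers introduced only for this illustration. It also cross-checks each group against `numpy.average`. Working the sample data by hand gives (200·1 + 300·2) / 3 ≈ 266.67 for A, (400·1 + 500·3) / 4 = 475.0 for B, and (600·2 + 700·1) / 3 ≈ 633.33 for C, which differs from the output pasted above, so it may be worth re-running the original snippet against a freshly built DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Product': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Sales': [200, 300, 400, 500, 600, 700],
    'Weight': [1, 2, 1, 3, 2, 1],
})

# Per product: sum(Sales * Weight) and sum(Weight), computed in one grouped sum.
# 'weighted_sales' is a hypothetical helper column, not part of the original data.
sums = (
    df.assign(weighted_sales=df['Sales'] * df['Weight'])
      .groupby('Product')[['weighted_sales', 'Weight']]
      .sum()
)
vectorized = sums['weighted_sales'] / sums['Weight']

# Cross-check each group with numpy.average, which accepts a weights argument.
check = {
    product: np.average(group['Sales'], weights=group['Weight'])
    for product, group in df.groupby('Product')
}

print(vectorized)
print(check)
```

The grouped-sum version does the multiplication once over the whole column and then relies on the built-in `sum` aggregation, so it does not call a Python function per group the way `apply` does.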