Unexpected NaNs when combining .shift() and .rolling() after groupby in pandas 1.4.2
I'm working on a project and hit a roadblock. I'm relatively new to this, so bear with me.

I'm running into an issue while trying to calculate a rolling average using the `.shift()` function after a `.groupby()` operation in pandas 1.4.2. My goal is to create a new column that contains the average of the previous values within each group.

Here's the DataFrame I'm working with:

```python
import pandas as pd

data = {
    'group': ['A', 'A', 'A', 'B', 'B', 'B'],
    'value': [10, 20, 30, 40, 50, 60]
}
df = pd.DataFrame(data)
```

When I attempt to calculate the rolling average like this:

```python
# Attempting to calculate the rolling average
df['rolling_avg'] = df.groupby('group')['value'].shift().rolling(window=2).mean()
```

I expect the `rolling_avg` for group 'A' to be [NaN, 10.0, 20.0] and for group 'B' to be [NaN, 40.0, 50.0]. However, the resulting `rolling_avg` column ends up being:

```
0     NaN
1     NaN
2    20.0
3     NaN
4     NaN
5    50.0
Name: value, dtype: float64
```

It seems like the rolling average is not being calculated correctly within each group. I've tried different approaches, including using `apply` after the groupby, but they all yield similar results. Here's what I tried:

```python
# Using apply - still not working as expected
df['rolling_avg'] = df.groupby('group')['value']\
    .apply(lambda x: x.shift().rolling(window=2).mean())
```

The resulting column still doesn't contain the expected rolling averages and is NaN for many entries.

I'm confused about why `.shift()` seems to affect the rolling calculation in this way. Can someone help me understand what I'm doing wrong, or suggest a better approach to achieve the desired result?

I'm running Python on Linux (CentOS), and this happens in both development and production.

Thanks in advance!
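
For comparison, here is a minimal sketch of one way to keep both the shift and the rolling window inside each group. It assumes that "average of the previous values" means the mean of up to two preceding rows in the same group. The helper name `prev_rolling_mean` and the `min_periods=1` setting are illustrative choices, not something from the attempts above. With an integer window, `rolling` defaults `min_periods` to the window size, so `window=2` returns NaN until two non-NaN values are available, which would account for the extra NaN at the start of each group.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B'],
    'value': [10, 20, 30, 40, 50, 60],
})

def prev_rolling_mean(s, window=2):
    # Hypothetical helper: shift within the group so the current row is
    # excluded, then average up to `window` previous values.
    # min_periods=1 (an assumption) allows a result from a single value.
    return s.shift().rolling(window=window, min_periods=1).mean()

# transform applies the helper to each group separately and returns a
# result aligned with the original index.
df['rolling_avg'] = df.groupby('group')['value'].transform(prev_rolling_mean)
print(df)
```

Under these assumptions, the third row of group 'A' comes out as 15.0 (the mean of 10 and 20) rather than 20.0. A value of 20.0 there is what `.shift()` alone would produce, so it may be worth checking which of the two is actually wanted.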