CodexBloom - Programming Q&A Platform

Pandas: calculating rolling averages on a time series DataFrame with uneven time intervals

👀 Views: 410 💬 Answers: 1 📅 Created: 2025-06-11
pandas dataframe time-series Python

I've been banging my head against this for hours and can't find a clear answer. I'm trying to calculate a rolling average on a time series DataFrame where the timestamps are not evenly spaced. I want to compute a 7-day rolling average of the 'value' column, but I'm running into issues because of the irregular time intervals. When I use the `rolling` method with a `window` parameter, I get unexpected results in cases where the timestamps are far apart.

Here's the code snippet I'm using:

```python
import pandas as pd

dates = ['2023-01-01', '2023-01-02', '2023-01-10', '2023-01-15', '2023-01-20']
values = [10, 20, 30, 40, 50]

df = pd.DataFrame({'date': pd.to_datetime(dates), 'value': values})
df.set_index('date', inplace=True)

# Attempting to calculate the 7-day rolling average
rolling_avg = df['value'].rolling(window='7D').mean()
print(rolling_avg)
```

The output I get is:

```
date
2023-01-01    10.0
2023-01-02    15.0
2023-01-10    20.0
2023-01-15    30.0
2023-01-20    40.0
Name: value, dtype: float64
```

This doesn't seem right to me. I expect values whose dates fall outside the 7-day window not to contribute to the rolling average, but it appears to be averaging values even when they are more than a week apart. I've tried the `min_periods` parameter to see if it makes a difference, but I still see similar results.

Is there a way to correctly compute a rolling average while accounting for the uneven distribution of timestamps in my DataFrame? Any insights or best practices would be greatly appreciated.

I'm developing on Ubuntu 20.04 with Python. Thanks!
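For comparison, here is a minimal sketch of how a time-based window and a count-based window behave on the same five rows. It assumes a recent pandas version and a sorted `DatetimeIndex`. The names `s`, `time_based`, `count_based`, and `strict` are illustrative and not from the original post. The numbers in the reported output happen to match what a count-based window of three rows with `min_periods=1` produces, so it may be worth checking which kind of window the failing code actually ran with.

```python
import pandas as pd

dates = ['2023-01-01', '2023-01-02', '2023-01-10', '2023-01-15', '2023-01-20']
values = [10, 20, 30, 40, 50]
s = pd.Series(values, index=pd.to_datetime(dates), name='value')

# Time-based window: for a row at time t, only rows with timestamps in
# (t - 7 days, t] are averaged (the default closed='right' for offset windows).
time_based = s.rolling('7D').mean()
print(time_based)    # values: 10.0, 15.0, 30.0, 35.0, 45.0

# Count-based window: the last 3 rows are averaged no matter how far apart
# their timestamps are.
count_based = s.rolling(window=3, min_periods=1).mean()
print(count_based)   # values: 10.0, 15.0, 20.0, 30.0, 40.0

# min_periods=2 turns windows that contain a single observation into NaN,
# which makes the sparse stretches of the series visible.
strict = s.rolling('7D', min_periods=2).mean()
print(strict)        # values: NaN, 15.0, NaN, 35.0, 45.0
```

If the time-based result is not what you see locally, checking `df.index.dtype` and `df.index.is_monotonic_increasing` is a reasonable first step, since offset windows such as `'7D'` need a datetime-like, monotonic index.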