Pandas DataFrame Resampling with Timezones Results in Unexpected Data Loss
I'm trying to resample a Pandas DataFrame to a higher frequency, but some of my data seems to be missing after the operation. My DataFrame contains timestamps in UTC, and I'm resampling it to a 15-minute frequency while converting to another timezone (America/New_York). Here's a simplified version of my code:

```python
import pandas as pd
import pytz

timestamps = pd.date_range('2023-10-01', periods=100, freq='H')
data = {'value': range(100)}
df = pd.DataFrame(data, index=timestamps)
df.index = df.index.tz_localize('UTC')

# Resample to 15-minute frequency and convert timezone
resampled_df = df.resample('15T').mean().tz_convert('America/New_York')
```

After running this code, I noticed that `resampled_df` has fewer rows than I anticipated. When I inspect it, the rows corresponding to certain time intervals appear to be completely missing. I also get this warning: "A value is trying to be set on a copy of a slice from a DataFrame. This may result in unintended behavior."

I initially thought the warning might be related to how I'm handling the DataFrame, but the missing rows concern me more. I've tried checking for duplicate timestamps and verifying the timezones across the DataFrame, but neither turned up anything.

Could this behavior be a result of how Pandas manages timezone-aware timestamps during resampling? How can I ensure that no data is lost during this operation? Any insights or best practices for resampling with timezone conversions would be greatly appreciated.

I'm working in a Windows 10 environment.
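For reference, the checks I mentioned look roughly like the sketch below. The names `expected_rows`, `actual_rows`, and `nan_rows` are only illustrative and not part of my real code, and I'm using the `'h'` and `'15min'` frequency aliases here in place of `'H'` and `'15T'`. It also counts rows whose value is NaN, on the assumption that a 15-minute slot with no source data might show up as NaN rather than being dropped.

```python
import pandas as pd

# Rebuild the example frame: 100 hourly timestamps, localized to UTC.
timestamps = pd.date_range('2023-10-01', periods=100, freq='h', tz='UTC')
df = pd.DataFrame({'value': range(100)}, index=timestamps)

# Same operation as in the question: 15-minute bins, then timezone conversion.
resampled_df = df.resample('15min').mean().tz_convert('America/New_York')

# Check 1: duplicate timestamps in the source index.
has_duplicates = df.index.has_duplicates

# Check 2: timezone of the index before and after the conversion.
source_tz = str(df.index.tz)
target_tz = str(resampled_df.index.tz)

# Check 3: number of 15-minute slots between the first and last timestamp,
# compared with the number of rows actually returned.
expected_rows = (df.index[-1] - df.index[0]) // pd.Timedelta(minutes=15) + 1
actual_rows = len(resampled_df)

# Check 4: rows that are present in the result but hold NaN.
nan_rows = int(resampled_df['value'].isna().sum())

print('duplicates in source index:', has_duplicates)
print('timezones:', source_tz, '->', target_tz)
print('expected rows:', expected_rows, 'actual rows:', actual_rows)
print('rows with NaN value:', nan_rows)
```

Comparing `expected_rows` with `actual_rows` should show whether rows are really dropped, and `nan_rows` should show whether they are present but empty.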
This is my first time working with Python 3.9. Any ideas how to fix this?