CodexBloom - Programming Q&A Platform

Error Parsing Complex Nested CSV with Dynamic Headers Using Pandas in Python

👀 Views: 31 💬 Answers: 1 📅 Created: 2025-07-02
pandas csv data-parsing Python

I'm deploying to production and I'm stuck. I'm working on a project where I need to parse a complex CSV file that contains dynamic headers and nested structures. The CSV format I'm dealing with looks something like this:

```
ID, Name, Attributes
1, Alice, age=30; city=New York
2, Bob, age=25; city=Los Angeles
3, Charlie, age=35; city=Boston
```

I want to transform this data into a structured format that separates the attributes into distinct columns. After reading the CSV with Pandas, I'm trying to split the 'Attributes' column into multiple columns based on the key-value pairs, but I'm encountering unexpected behavior. Here's what I have so far:

```python
import pandas as pd

df = pd.read_csv('data.csv')

# Attempting to split the Attributes column
attributes = df['Attributes'].str.split('; ', expand=True)
attributes.columns = ['Attribute1', 'Attribute2']

# Now I want to split each attribute into key and value
for col in attributes.columns:
    key_value = attributes[col].str.split('=', expand=True)
    key_value.columns = [f'{col}_Key', f'{col}_Value']
    df = pd.concat([df, key_value], axis=1)

print(df)
```

When I run this code, I get the following warning: `SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame`. Additionally, the resulting DataFrame has NaN values for some rows in the newly created columns, which is unexpected since every row in the CSV should have corresponding attributes.

I've tried calling `df.copy()` before manipulating the DataFrame, hoping to resolve the warning, but it hasn't helped. I'm unsure how to handle the dynamic nature of the attributes, given that they can vary in count and structure.

Does anyone have suggestions on how to parse this format without losing data or running into warnings? I've been using Python for about a year now. Any help would be greatly appreciated!
Could this be a known issue?
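
For reference, here is a minimal sketch of the output I'm after, assuming every attribute is a `key=value` pair separated by `;`. The helper `parse_attributes` is a hypothetical name of my own, and the sample rows are inlined through `io.StringIO` only so the snippet runs without `data.csv`. One thing worth noting: the sample header has a space after each comma, so without `skipinitialspace=True` pandas would name the column `' Attributes'` rather than `'Attributes'`.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real run would pass the file path.
csv_text = """ID, Name, Attributes
1, Alice, age=30; city=New York
2, Bob, age=25; city=Los Angeles
3, Charlie, age=35; city=Boston
"""


def parse_attributes(cell):
    """Turn 'age=30; city=New York' into {'age': '30', 'city': 'New York'}."""
    if not isinstance(cell, str):
        # Missing cells (NaN) become an empty dict, i.e. no attributes.
        return {}
    pairs = (item.split('=', 1) for item in cell.split(';') if '=' in item)
    return {key.strip(): value.strip() for key, value in pairs}


# skipinitialspace=True drops the blank that follows each comma,
# in the header as well as in the data rows.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# One dict per row; pandas takes the union of the keys as columns, so rows
# with different attribute sets get NaN only where a key is truly absent.
attrs = pd.DataFrame(df['Attributes'].map(parse_attributes).tolist(), index=df.index)

result = df.drop(columns='Attributes').join(attrs)
print(result)
```

Building the new columns from dictionaries means the column names come from the keys themselves rather than from a fixed `['Attribute1', 'Attribute2']` list, so a row with three attributes or a different key order should not break the assignment. The values stay as strings here; converting `age` to a number would be a separate step.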