CodexBloom - Programming Q&A Platform

Regex Not Capturing Multiple Consecutive Delimiters in CSV Parsing with Python

👀 Views: 63 💬 Answers: 1 📅 Created: 2025-06-03
regex python csv

I'm parsing a CSV file whose fields are separated by commas, where a field can itself contain commas if it is enclosed in double quotes. The problem arises when a line has multiple consecutive commas, which should represent empty fields: my current regex is not capturing these empty fields as expected. Here's the regex I'm using:

```python
import re

csv_line = 'field1,,"field, with comma",field4'
regex_pattern = r'(?:(?<=,)|^)("[^"]*"|[^,]*)'
fields = re.findall(regex_pattern, csv_line)
print(fields)
```

The output I get is:

```
['field1', 'field', with comma', 'field4']
```

As you can see, the empty fields are not being captured. I've tried adjusting the regex to match empty strings, for example by adding `|''` as an alternative, but that doesn't work correctly either. I'm using Python 3.9 (this is my first time working with it), and I need this to work for input lines where the number of empty fields can vary.

How can I adjust my regex so that it properly captures empty fields between consecutive commas? What would be the recommended way to handle this?
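For comparison, the field list described above can be produced with the standard library's `csv` module. This is only a minimal sketch for checking a regex against, not the regex fix being asked for; `parse_line` is a hypothetical helper name, and the default `csv` dialect (comma delimiter, double-quote quoting) is assumed to match the input.

```python
import csv


def parse_line(line):
    """Split one CSV line into fields; commas inside quotes stay in their field."""
    # csv.reader accepts any iterable of lines, so wrap the single line in a list.
    return next(csv.reader([line]))


csv_line = 'field1,,"field, with comma",field4'
print(parse_line(csv_line))
# ['field1', '', 'field, with comma', 'field4']
```

Consecutive commas come back as empty strings, so the number of empty fields can vary freely. Unlike the regex capture group, `csv.reader` also strips the enclosing quotes from quoted fields.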