CodexBloom - Programming Q&A Platform

Regex to Extract CSV Fields with Unescaped Commas in Java - Need Guidance with Edge Cases

👀 Views: 9286 💬 Answers: 1 📅 Created: 2025-06-09
regex java csv

I'm trying to parse a CSV string in Java using regex, but I'm running into issues when the fields contain unescaped commas. For example, the string `"value1, with comma",value2,"value3, with another, comma"` should be parsed into three fields, but my current regex pattern does not handle this correctly.

I initially tried the following pattern:

```java
String regex = "([\"']?[^,\"']*[\"']?|[\"'][^\"']*[\"'])";
```

However, this fails because it doesn't account for commas within quoted fields. I want to make sure that commas inside quotes do not split the fields.

After several attempts to modify the regex, I ended up with:

```java
String regex = "(?:\"([^\"]*)\"|([^,]*))";
```

This pattern works for some cases, but still fails when there are nested commas or when a field starts or ends with whitespace. When I test it with the input string above, I get the following output:

```
[value1, with comma, value2, value3, with another, comma]
```

Clearly, it's not capturing the quoted fields correctly.

I'm using Java 11 and the `Pattern` class for regex processing. My development environment is macOS, and I see the same behavior in production on Windows 10. I need a solution that captures the fields accurately, including handling optional whitespace and ensuring that commas inside quoted fields do not split them.

Any insights on how to improve my regex would be greatly appreciated! Thanks in advance!
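
To make the expected behavior concrete, here is a minimal sketch of one possible direction, not a definitive implementation. It assumes that quoted fields use double quotes only, that a quoted field never contains an embedded quote (so `""` escapes are not handled), and that whitespace around a field should be trimmed. The class name `CsvFieldSketch` and the method `split` are hypothetical names chosen for illustration. The idea is to anchor each match with `\G` so fields are consumed one after another, and to capture the delimiter so the loop knows when the last field has been read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsvFieldSketch {
    // \G        : start exactly where the previous match ended
    // group 1   : contents of a double-quoted field (commas allowed inside)
    // group 2   : an unquoted field, matched lazily so trailing spaces are excluded
    // group 3   : the delimiter, either a comma or empty at end of input
    private static final Pattern FIELD = Pattern.compile(
            "\\G\\s*(?:\"([^\"]*)\"|([^,\"]*?))\\s*(,|$)");

    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        boolean finished = false;
        while (!finished && m.find()) {
            String quoted = m.group(1);
            fields.add(quoted != null ? quoted : m.group(2));
            // An empty delimiter means "$" matched, so this was the last field.
            finished = m.group(3).isEmpty();
        }
        if (!finished) {
            // The pattern could not consume the whole line (e.g. an unclosed quote).
            throw new IllegalArgumentException("Malformed CSV line: " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        String input = "\"value1, with comma\",value2,\"value3, with another, comma\"";
        for (String field : split(input)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

With the sample string from the question this should yield three fields, with the commas inside the quotes left intact. If the real data can contain escaped quotes or multi-line fields, a dedicated CSV parser is probably a safer choice than a single regex.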