CodexBloom - Programming Q&A Platform

Regex for Extracting Key-Value Pairs from Custom Configuration Strings - Handling Special Characters

👀 Views: 91 💬 Answers: 1 📅 Created: 2025-08-27
regex python configuration

I'm performance testing a personal project and have been struggling with this for a few days, so I could really use some help. I'm trying to write a regex to extract key-value pairs from a custom configuration string in the format `key1=value1; key2=value2; key3=value3;`. The values can contain spaces and special characters like commas and periods, but the keys are always simple alphanumeric strings with underscores.

For example, given the string `username=john_doe; age=30; email=john@example.com;` I want to extract the keys `username`, `age`, and `email` together with their values. I've started with the regex pattern `([a-zA-Z0-9_]+)=([^;]+)`, which works for most cases. Here's my implementation in Python using the `re` module:

```python
import re

config_string = 'username=john_doe; age=30; email=john@example.com;'
pattern = r'([a-zA-Z0-9_]+)=([^;]+)'
matches = re.findall(pattern, config_string)
print(matches)
```

This prints `[('username', 'john_doe'), ('age', '30'), ('email', 'john@example.com')]` as expected, but I'm running into issues when the values contain special characters. A string like `path=C:\Program Files\MyApp; version=1.0.0;` is parsed correctly, but if a value itself contains a semicolon, as in `description=This is a test; value=50%;`, the regex fails to capture everything correctly, because `[^;]+` treats every `;` as the delimiter before the next pair.

I tried modifying my pattern to allow escaped semicolons or quotes around values, but so far I haven't found a working solution. Any suggestions on how to make my regex robust enough to handle such cases? I'm using Python 3.9.7, and I'd like to keep it efficient because the configuration strings can be quite large.

This is part of a larger service I'm building. Has anyone else encountered this? Could this be a known issue? Any advice would be greatly appreciated!
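
For reference, here is a minimal sketch of the escaped-semicolon and quoted-value idea mentioned in the question. It rests on assumptions that the original format does not state: a literal semicolon inside a value is written either as `\;` or by wrapping the whole value in double quotes, and quoted values contain no double quotes themselves. The names `PAIR_RE` and `parse_config` are hypothetical, chosen only for illustration.

```python
import re

# Assumed conventions (not part of the original format):
#   * a quoted value may contain semicolons: key="a; b";
#   * an unquoted value may contain an escaped semicolon, written as backslash + ";"
# Any other backslash is kept literally, so Windows paths still work.
PAIR_RE = re.compile(
    r'''
    ([A-Za-z0-9_]+)          # key: letters, digits, underscores
    \s*=\s*
    (?:
        "([^"]*)"            # quoted value, semicolons allowed inside
      |
        ((?:\\;|[^;])*)      # unquoted value, escaped semicolons allowed
    )
    \s*(?:;|$)               # pair ends at a semicolon or at end of input
    ''',
    re.VERBOSE,
)


def parse_config(config_string):
    """Return a dict of key/value pairs found in config_string."""
    pairs = {}
    for match in PAIR_RE.finditer(config_string):
        key, quoted, bare = match.group(1), match.group(2), match.group(3)
        if quoted is not None:
            value = quoted
        else:
            # Trim surrounding whitespace and turn the escape back into ";".
            value = bare.strip().replace(r'\;', ';')
        pairs[key] = value
    return pairs


print(parse_config(r'path=C:\Program Files\MyApp; version=1.0.0;'))
print(parse_config(r'description=This is a test\; still one value; value=50%;'))
# {'description': 'This is a test; still one value', 'value': '50%'}
print(parse_config('description="This is a test; still one value"; value=50%;'))
# {'description': 'This is a test; still one value', 'value': '50%'}
```

The pattern is compiled once and scans the string in a single pass with `finditer`, which should hold up reasonably on large inputs. One known limitation of this sketch: an unquoted value that ends in a backslash (for example a path like `C:\dir\`) is ambiguous with the `\;` escape, so such values would need to be quoted.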