CodexBloom - Programming Q&A Platform

Regex for Extracting Version Numbers from Dependency Files - Handling Different Formats

๐Ÿ‘€ Views: 1 ๐Ÿ’ฌ Answers: 1 ๐Ÿ“… Created: 2025-09-20
regex python versioning Python

Recently started working with a project that involves parsing various dependency files, and I need to extract version numbers from these files using regex. Iโ€™ve come across different formats like `package.json`, `requirements.txt`, and `Gemfile.lock`, where versions can look like `"^1.0.0"`, `1.2.3`, or even `~> 2.1.0`. I'm trying to create a robust regex pattern that can match these variations while excluding any other text. Iโ€™ve tried a simple regex like this: ```regex "?([0-9]+\.[0-9]+\.[0-9]+)"? ``` While it captures basic semantic versioning formats, it doesnโ€™t account for cases like the caret or tilde prefixes. I also considered using: ```regex [~^]?([0-9]+\.[0-9]+\.[0-9]+) ``` but Iโ€™m unsure if this will suffice across all files. Here's a snippet from my code where Iโ€™m applying this regex in Python: ```python import re pattern = r'[~^]?([0-9]+\.[0-9]+\.[0-9]+)' with open('package.json', 'r') as file: content = file.read() versions = re.findall(pattern, content) print(versions) # This prints only the matched versions ``` Also, how would I handle cases where versions could be in different formats, like `1.0.0-beta` or `2.1.0+build.123`? Any insights into improving this regex pattern or alternate approaches would help a lot.