CodexBloom - Programming Q&A Platform

Regex Not Capturing Version Numbers with Optional Pre-release Identifiers in Python

👀 Views: 92 💬 Answers: 1 📅 Created: 2025-06-12
regex python string-manipulation

I'm trying to extract version numbers from a string that may include optional pre-release identifiers, and I suspect I'm missing something obvious. The format I'm aiming for is `MAJOR.MINOR.PATCH` with an optional pre-release suffix, like `1.0.0-alpha`, `2.1.3-beta.1`, or simply `3.4.5`. However, my current regex fails to match certain patterns, especially when the pre-release identifier includes a numeric sequence.

Here's the regex I've been using:

```python
import re

pattern = r'\b(\d+)\.(\d+)\.(\d+)(?:-(\w+(?:\.\d+)?))?\b'
text = "The current versions are 1.0.0-alpha, 2.1.3-beta.1, and 3.4.5."
matches = re.findall(pattern, text)
print(matches)
```

When I run this code, the output is:

```
[('1', '0', '0', 'alpha'), ('2', '1', '3', 'beta'), ('3', '4', '5', None)]
```

The major, minor, and patch numbers are captured correctly, but when the pre-release identifier contains a numeric suffix (like `beta.1`), the identifier is split into `beta` and `1` instead of being captured as a whole, which is not what I intended.

I tried modifying the regex so that the pre-release capture group accepts both letters and numbers, but I'm not sure I've implemented it correctly. My updated regex looks like this:

```python
pattern = r'\b(\d+)\.(\d+)\.(\d+)(?:-(\w+(?:\.\d+)?))?\b'
```

However, this still doesn't yield the correct result, and I'm unsure how to capture the entire pre-release string as one group.

Can anyone suggest a regex pattern that correctly captures version numbers, including numeric pre-release identifiers, while keeping the entire suffix in a single group? For context, this is part of a larger service I'm building, running Python in a Docker container on Linux.

Thanks in advance for any help.
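For comparison, here is a minimal sketch of one pattern that keeps the whole pre-release suffix in a single group. It assumes SemVer-style identifiers, meaning dot-separated runs of ASCII letters, digits, and hyphens after the `-`. The names `VERSION_RE` and `find_versions` are hypothetical and only used for illustration. It swaps `\b` for explicit lookarounds and uses `finditer` with `Match.groups()`, which reports a missing optional group as `None` (whereas `re.findall` reports it as an empty string).

```python
import re

# Hypothetical pattern, assuming SemVer-style pre-release identifiers.
VERSION_RE = re.compile(
    r"""
    (?<![\w.])                                  # not glued to a preceding word char or dot
    (\d+)\.(\d+)\.(\d+)                         # MAJOR.MINOR.PATCH
    (?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?   # optional pre-release, kept as one group
    (?![\w-])                                   # not followed by a word char or hyphen
    """,
    re.VERBOSE,
)


def find_versions(text):
    """Return (major, minor, patch, prerelease) tuples; prerelease is None if absent."""
    return [m.groups() for m in VERSION_RE.finditer(text)]


text = "The current versions are 1.0.0-alpha, 2.1.3-beta.1, and 3.4.5."
print(find_versions(text))
# [('1', '0', '0', 'alpha'), ('2', '1', '3', 'beta.1'), ('3', '4', '5', None)]
```

The `(?:\.[0-9A-Za-z-]+)*` part allows any number of dot-separated segments, so suffixes such as `rc.1.2` would also stay in one group. Separately, on CPython's `re` module the sub-pattern `\w+(?:\.\d+)?` should already match `beta.1` as a single unit, so it may be worth re-running the posted snippet in isolation to confirm which pattern actually produced the split result.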