Regex to Match Version Numbers in Python - Handling Leading Zeros and Dots
I'm trying to parse version numbers from a string in my Python application, but my regex isn't handling certain edge cases correctly. Specifically, I'm having trouble with leading zeros and multiple dot separators. The format I want to capture looks like `1.0.0`, `2.3.4`, or even `3.0.0-alpha`, but I get unexpected results when there are leading zeros, as in `01.02.03`, or when the component after the last dot is not strictly numeric.

Here's the regex I've been using:

```python
import re

version_regex = r'\b(\d+)(?:\.(\d+)(?:\.(\d+))?(?:-(\w+))?)?\b'
test_strings = [
    'Version 1.0.0',
    'Version 01.02.03',
    'Release 2.3.4-alpha',
    'Update 3.0.0',
]

for s in test_strings:
    match = re.search(version_regex, s)
    print(f"Testing '{s}': {match.groups() if match else 'No match'}")
```

When I run this code, it correctly matches `1.0.0` and `2.3.4-alpha`, but it fails to capture `01.02.03`, treating it as an invalid version. The output shows `Testing 'Version 01.02.03': No match`, which is frustrating because I expect leading zeros to be valid in this context.

I've tried modifying the regex to allow leading zeros, for example by changing `\d+` to `\d*`, but that leads to unintended matches, such as `01.02.03` being treated as a single number. I want leading zeros to be accepted while still keeping the major, minor, and patch components distinct.

Can anyone suggest an alternative regex pattern that captures version numbers with leading zeros without incorrectly combining the components? This happens in both development and production on Ubuntu 22.04. What am I doing wrong, and is there documentation I should be reading?

Thanks for any help you can provide!
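For reference, here is a minimal sketch of the behaviour I'm after. The pattern `VERSION_RE` and the helper `parse_version` are hypothetical names of my own. I'm assuming that all three numeric components are required and that the suffix is a single run of word characters, so this may well not be the right approach.

```python
import re

# Hypothetical pattern: every numeric component is \d+, so leading zeros
# such as "01" are accepted, and the "-suffix" part is optional.
VERSION_RE = re.compile(
    r"(?<![\w.])"              # not preceded by a word character or a dot
    r"(?P<major>\d+)"
    r"\.(?P<minor>\d+)"
    r"\.(?P<patch>\d+)"
    r"(?:-(?P<prerelease>\w+))?"
    r"(?!\w)"                  # not followed by a word character
)


def parse_version(text):
    """Return the captured components as strings, or None if nothing matches."""
    match = VERSION_RE.search(text)
    return match.groupdict() if match else None


for s in ["Version 1.0.0", "Version 01.02.03", "Release 2.3.4-alpha"]:
    print(s, "->", parse_version(s))
```

The components come back as strings so the leading zeros are preserved. Wrapping each one in `int()` afterwards would normalise `01` to `1` if the zeros don't need to be kept.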