Regex Not Working as Expected for Extracting Version Numbers from Strings in Python - Need Help with Edge Cases
I'm working on a personal project (part of a larger web app, developed on Ubuntu) and have a question that's been bugging me. I'm trying to extract version numbers from a list of strings using regex in Python. The version numbers I'm interested in are in the format `major.minor.patch`, such as `1.0.2`, `2.3.0`, or `1.10.5`. However, I'm running into issues because some strings contain additional text or formatting, which causes my regex to give unexpected results in certain edge cases.

For instance, I have a string like `"Version 2.3.0 released on 2023-10-10"` and another like `"Update: 1.10.5 (stable)"`. I want to capture only the version numbers, but my current pattern seems to match either too much or nothing at all.

Here's the regex I initially wrote:

```python
import re

text = [
    'Version 2.3.0 released on 2023-10-10',
    'Update: 1.10.5 (stable)',
    'Patch 3.0',
    '2.3.0-beta',
    '1.0.1-alpha'
]

# Three dot-separated groups of digits, bounded by word boundaries
pattern = r'\b\d+\.\d+\.\d+\b'

for line in text:
    match = re.search(pattern, line)
    if match:
        print(f'Found version: {match.group()}')
    else:
        print('No version found')
```

This code works for some of the strings, but for versions with a suffix such as `-beta` or `-alpha`, the suffix is not captured. Additionally, I noticed that it doesn't handle leading zeros correctly, where a version number might look like `01.02.03`. I've tried tweaking the regex to account for optional leading zeros, but I can't seem to get it right.

I would appreciate any guidance on how to modify the regex pattern to handle these edge cases, so that I capture valid version numbers regardless of additional text or formats. The main issues I'm running into are:

- Not capturing version suffixes (like `-beta` or `-alpha`)
- Handling leading zeros correctly

What am I doing wrong, and is there a better approach? Any suggestions would be helpful.

Thanks in advance for your help!
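To make the two edge cases concrete, here is a minimal sketch of the behavior I think I want. It is not a definitive solution. It assumes a suffix is a hyphen followed by dot-separated alphanumeric parts (as in `-beta` or `-alpha.1`), that leading zeros should be normalized away rather than rejected, and that a four-part string like `1.2.3.4` should not count as a version. The names `VERSION_RE` and `extract_version` are hypothetical, made up for this illustration.

```python
import re

# Hypothetical pattern (assumed suffix format: -beta, -alpha.1, ...).
VERSION_RE = re.compile(
    r'(?<![\w.])'                                # not preceded by a word char or dot
    r'(\d+)\.(\d+)\.(\d+)'                       # major.minor.patch
    r'(?:-([0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*))?'   # optional hyphenated suffix
    r'(?!\.?\w)'                                 # not followed by more version-like text
)


def extract_version(line):
    """Return the first version found in line as a normalized string, or None."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    major, minor, patch, suffix = match.groups()
    # int() drops leading zeros, so '01.02.03' becomes '1.2.3'.
    version = f'{int(major)}.{int(minor)}.{int(patch)}'
    return f'{version}-{suffix}' if suffix else version


if __name__ == '__main__':
    samples = [
        'Version 2.3.0 released on 2023-10-10',
        'Update: 1.10.5 (stable)',
        'Patch 3.0',
        '2.3.0-beta',
        '1.0.1-alpha',
        '01.02.03',
    ]
    for line in samples:
        print(f'{line!r} -> {extract_version(line)}')
```

Capturing the three numeric parts as separate groups keeps the normalization step out of the regex itself, which seemed simpler than trying to encode "optional leading zeros" in the pattern. Whether `Patch 3.0` should be treated as a version at all (it is not in this sketch) is still an open question for me.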