CodexBloom - Programming Q&A Platform

Regex for Matching Version Numbers in Semantic Format - Handling Pre-release Identifiers

👀 Views: 95 💬 Answers: 1 📅 Created: 2025-08-25
regex python semantic-versioning

This might be a silly question, but I've hit a wall. I'm working on a Python project where I need to validate semantic versioning strings, specifically ones that include pre-release identifiers. The versions I want to match follow the format `MAJOR.MINOR.PATCH[-PRERELEASE]`, where `PRERELEASE` is alphanumeric and may contain dots, dashes, or underscores. I have a working regex for the `MAJOR.MINOR.PATCH` part, but I'm having trouble extending it to correctly capture the `PRERELEASE` segment.

Here's the regex pattern I've been using to match the basic versioning scheme:

```python
import re

version_pattern = r'^(\d+)\.(\d+)\.(\d+)$'
```

This correctly captures the major, minor, and patch numbers. But when I extend it to include the pre-release part like so:

```python
version_pattern = r'^(\d+)\.(\d+)\.(\d+)([-\w.]+)?$'
```

I get unexpected results for inputs like `1.0.0-alpha+001`, where the `+001` part should be ignored in the version check. I want a valid version with a pre-release identifier to match without the build metadata (everything after the `+` symbol) being treated as part of the version.

To troubleshoot, I tried breaking the regex down further, but now I'm confused about how to structure it without making it overly complex or allowing invalid formats. I'm on Python 3.10 and using the standard `re` module.

Here are some examples of the versions I want to validate:

- `1.0.0`
- `2.1.3-alpha`
- `3.2.1-beta.1`
- `4.5.6-beta+exp.sha.5114f85`

Does anyone have a suggestion for a regex that accurately matches valid semantic versions, including the optional pre-release identifier, while ignoring anything after the `+` sign? Any insights on how to improve my pattern would be greatly appreciated! For context, I'm running this on Ubuntu.
I recently upgraded to Python 3.11, and this is part of a larger desktop app I'm building. How would you solve this?
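For reference, here is one possible sketch of a pattern for the format described in the question. Two things are worth noting about the extended pattern quoted in the post: its character class has no `+` and it is anchored with `$`, so `1.0.0-alpha+001` is rejected outright, and because the hyphen is not required, a string like `1.0.0alpha` is accepted. The version below requires the leading `-` for the pre-release part and matches the `+BUILD` suffix without capturing it. The names `SEMVER_PATTERN` and `parse_version` are made up for illustration. Underscores are allowed in the pre-release because the post asks for them (the SemVer 2.0.0 spec does not), and leading zeros are tolerated as in the original pattern; treat both as assumptions to adjust.

```python
import re

# Illustrative names; not part of any library.
SEMVER_PATTERN = re.compile(
    r"""
    (?P<major>[0-9]+)\.(?P<minor>[0-9]+)\.(?P<patch>[0-9]+)    # MAJOR.MINOR.PATCH
    (?:-(?P<prerelease>[0-9A-Za-z_-]+(?:\.[0-9A-Za-z_-]+)*))?  # optional -PRERELEASE
    (?:\+[0-9A-Za-z.-]+)?                                      # optional +BUILD, not captured
    """,
    re.VERBOSE,
)


def parse_version(text):
    """Return (major, minor, patch, prerelease), or None if text is not valid.

    Build metadata after '+' is accepted but left out of the result.
    """
    # fullmatch anchors both ends, so no ^ or $ is needed in the pattern.
    match = SEMVER_PATTERN.fullmatch(text)
    if match is None:
        return None
    return (
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
        match["prerelease"],  # None when there is no pre-release part
    )


if __name__ == "__main__":
    samples = ["1.0.0", "2.1.3-alpha", "3.2.1-beta.1", "4.5.6-beta+exp.sha.5114f85"]
    for sample in samples:
        print(sample, "->", parse_version(sample))
```

Using `fullmatch` instead of `^...$` also avoids `$` quietly accepting a trailing newline. If strict SemVer compliance matters, dropping `_` from the pre-release classes and rejecting leading zeros in numeric parts would be the next changes to consider.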