CodexBloom - Programming Q&A Platform

Regex to Match Flexible URL Patterns in PHP - Missing Edge Cases

👀 Views: 324 💬 Answers: 1 📅 Created: 2025-06-05
regex php url-validation PHP

I'm trying to use a regex in PHP to validate a set of URLs that can come in various formats, but I'm running into issues with certain edge cases. My current pattern is supposed to match URLs that may include or omit the 'http://' or 'https://' prefix and may also have a 'www.' subdomain. Here's the pattern I've been using:

```php
$pattern = '/^(https?:\/\/)?(www\.)?([a-zA-Z0-9-]+\.[a-zA-Z]{2,})(\/[^\s]*)?$/';

$urls = [
    'http://example.com',
    'https://www.example.com/path/to/page',
    'www.example.com',
    'example.com',
    'ftp://example.com', // should not match
    'example',           // should not match
];

foreach ($urls as $url) {
    if (preg_match($pattern, $url)) {
        echo "$url is valid.\n";
    } else {
        echo "$url is invalid.\n";
    }
}
```

However, I noticed that URLs like 'ftp://example.com' are being treated as valid when they should not be. Additionally, I want the regex to handle URLs with trailing slashes correctly, but I suspect my current implementation is too lenient. The output I get is:

```
http://example.com is valid.
https://www.example.com/path/to/page is valid.
www.example.com is valid.
example.com is valid.
ftp://example.com is valid.
example is valid.
```

I would appreciate guidance on refining this pattern so that it strictly validates URLs as intended, specifically rejecting FTP links and invalid formats like 'example'. What adjustments should I make so that only valid HTTP/S URLs are accepted? I'm developing on macOS with PHP.
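For reference, here is a minimal sketch of a stricter check that meets the requirements described above. It is one possible approach, not a definitive validator. The function name `isHttpUrl` is hypothetical, and the rules it encodes are assumptions: multi-label hosts such as `sub.example.co.uk` are allowed, labels may not start or end with a hyphen, and `\z` is used instead of `$` so that a trailing newline is rejected. Traced by hand, the pattern as posted already seems to reject 'ftp://example.com' and 'example', so the output shown may have come from a different version of the pattern.

```php
<?php
// Hypothetical helper (not from the original post): returns true only for
// http/https URLs, where the scheme and a leading "www." are both optional.
function isHttpUrl(string $url): bool
{
    $pattern = '~^(?:https?://)?'                     // optional http:// or https://
             . '(?:www\.)?'                           // optional www. prefix
             . '(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+' // one or more host labels, each followed by a dot
             . '[a-z]{2,}'                            // top-level domain, letters only (assumption)
             . '(?:/\S*)?'                            // optional path, including a bare trailing slash
             . '\z~i';                                // absolute end of string, case-insensitive

    return preg_match($pattern, $url) === 1;
}

$urls = [
    'http://example.com',
    'https://www.example.com/path/to/page',
    'www.example.com',
    'example.com',
    'example.com/',
    'ftp://example.com', // rejected: scheme is not http/https
    'example',           // rejected: no dot-separated domain
];

foreach ($urls as $url) {
    echo $url, isHttpUrl($url) ? " is valid.\n" : " is invalid.\n";
}
```

With this sketch, the first five URLs are reported as valid and the last two as invalid. Using `~` as the delimiter avoids escaping every forward slash. If full standards-level validation is needed, a regex like this is only a first filter.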