CodexBloom - Programming Q&A Platform

Python Regex for Extracting IP Addresses from Log Files - Handling Mixed Formats

👀 Views: 11 💬 Answers: 1 📅 Created: 2025-08-06
regex python ip-addresses

This might be a silly question, but I am trying to extract IP addresses from a server log file using Python's `re` module, and I keep running into issues when the IP addresses are in mixed formats. The logs contain both IPv4 and IPv6 addresses, as well as some malformed entries. I want to capture only valid IP addresses and ignore the malformed ones.

Here's what I have so far:

```python
import re

log_data = '''
2023-10-01 12:00:00 192.168.1.1 - Request
2023-10-01 12:01:00 2001:0db8:85a3:0000:0000:8a2e:0370:7334 - Request
2023-10-01 12:02:00 256.256.256.256 - Invalid
2023-10-01 12:03:00 10.0.0.1 - Request
2023-10-01 12:04:00 - Malformed
'''

# Regex pattern for matching IPv4 and IPv6
pattern = r'(?:(?:[0-9]{1,3}\.){3}[0-9]{1,3}|(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4})'
matches = re.findall(pattern, log_data)
print(matches)
```

When I run this code, `matches` returns:

```
['192.168.1.1', '2001:0db8:85a3:0000:0000:8a2e:0370:7334', '256.256.256.256', '10.0.0.1']
```

The problem is that the malformed address `256.256.256.256` appears in the results, because `[0-9]{1,3}` accepts any one- to three-digit number and does not limit each octet to 0-255. I need a pattern that captures only valid IPv4 and IPv6 addresses and ignores malformed ones. I've tried adding boundaries and some additional constraints to the regex, but I keep getting either false positives or missed valid addresses.

Any help crafting a more precise pattern would be greatly appreciated. This is part of a larger mobile app I'm building. I'm developing on Ubuntu 20.04, and this is my first time working with Python 3.10. Is there a better approach?

Thanks for any help you can provide!
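One possible direction, offered as a minimal sketch rather than a definitive answer: instead of encoding every validity rule in the regex, let the standard library's `ipaddress` module decide whether a candidate is valid. The sketch below assumes that addresses in the log are separated from other fields by whitespace, as in the sample above. The helper name `extract_valid_ips` is hypothetical and only used for illustration.

```python
import ipaddress

# Same sample data as in the question.
log_data = """\
2023-10-01 12:00:00 192.168.1.1 - Request
2023-10-01 12:01:00 2001:0db8:85a3:0000:0000:8a2e:0370:7334 - Request
2023-10-01 12:02:00 256.256.256.256 - Invalid
2023-10-01 12:03:00 10.0.0.1 - Request
2023-10-01 12:04:00 - Malformed
"""


def extract_valid_ips(text):
    """Return whitespace-delimited tokens that parse as IPv4 or IPv6.

    Hypothetical helper; assumes each address is its own token.
    """
    valid = []
    for line in text.splitlines():
        for token in line.split():
            try:
                # ip_address() raises ValueError for anything that is not
                # a well-formed IPv4 or IPv6 address (e.g. 256.256.256.256).
                ipaddress.ip_address(token)
            except ValueError:
                continue
            valid.append(token)
    return valid


print(extract_valid_ips(log_data))
# ['192.168.1.1', '2001:0db8:85a3:0000:0000:8a2e:0370:7334', '10.0.0.1']
```

This also accepts compressed IPv6 forms such as `2001:db8::1`, which the original pattern with exactly eight groups would miss. If the addresses can be embedded in other text (for example `src=10.0.0.1;`), a loose regex could still be used to find candidates, with `ipaddress.ip_address` as the final validity check.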