CodexBloom - Programming Q&A Platform

Regex to Capture HTML Attributes with Optional Spaces in JavaScript - Handling Edge Cases

👀 Views: 46 💬 Answers: 1 📅 Created: 2025-06-03
regex javascript html

I'm working with a regex in JavaScript to extract attributes from a string of HTML. I've tried several approaches, but none seem to work.

The goal is to match attributes that may or may not have spaces around the equals sign. For example, I want to capture `data-id="123"`, `class='example'`, and `style ="color: red;"` from the input string. However, I'm struggling with cases where the spacing is inconsistent, particularly when the attribute is formatted like `style ="color: red;"` or `disabled =""`.

I've tried the following regex pattern:

```javascript
const regex = /\b(?<attribute>\w+)\s*=\s*(?<value>"[^"]*"|'[^']*'|[^\s>]+)/g;
```

When I test it, it captures most attributes correctly, but it fails for attributes without a value and sometimes miscaptures when the attributes have different spacing. For example, when I run:

```javascript
const htmlString = '<div class="test" style="color: blue;" disabled >';
const matches = [...htmlString.matchAll(regex)];
console.log(matches);
```

I get the expected output for `class` and `style`, but `disabled` is not captured at all, and I noticed unexpected behavior whenever the tag contains attributes without values.

Any suggestions on how to modify the regex to handle these edge cases? What's the best practice here?

For context: I'm using Node.js 14.17.0 on Ubuntu and regex101 for testing my patterns, and I would like to ensure compatibility across different browsers as well. This is for a desktop app, and I'm coming from a different tech stack and still learning JavaScript.

Cheers for any assistance!
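One way to approach the edge cases described in the question is to make the `= value` part of the pattern optional, so that a bare attribute such as `disabled` still produces a match. The sketch below is only an illustration under a few assumptions: `parseAttributes` is a hypothetical helper name, the input is a single opening tag, and valueless attributes are represented as `true` (that representation is a choice made here, not a standard). The name class also allows hyphens, because `\w+` does not match `-`, so the original pattern would report `data-id` as `id`.

```javascript
// Hypothetical helper: collects the attributes of ONE opening tag into an object.
// A sketch, not a full HTML parser; comments, scripts, and malformed markup
// are out of scope.
function parseAttributes(tag) {
  // Strip the leading "<tagName" and the trailing ">" or "/>" so the tag name
  // itself is not mistaken for a valueless attribute.
  const inner = tag
    .replace(/^\s*<\s*[^\s\/>]+/, '')
    .replace(/\/?\s*>\s*$/, '');

  // name: any run of characters that are not whitespace, quotes, "<", ">", "/" or "=".
  // value (optional): double-quoted, single-quoted, or unquoted.
  const attrRegex =
    /(?<name>[^\s"'<>\/=]+)(?:\s*=\s*(?<value>"[^"]*"|'[^']*'|[^\s"'=<>`]+))?/g;

  const attributes = {};
  for (const match of inner.matchAll(attrRegex)) {
    const { name, value } = match.groups;
    if (value === undefined) {
      // No "=" part was present, e.g. `disabled`.
      attributes[name] = true;
    } else {
      // Quoted alternatives always start and end with the same quote character.
      const quoted = value[0] === '"' || value[0] === "'";
      attributes[name] = quoted ? value.slice(1, -1) : value;
    }
  }
  return attributes;
}

const attrs = parseAttributes(
  '<div class="test" style ="color: blue;" data-id=\'123\' disabled >'
);
console.log(attrs.class);      // test
console.log(attrs.style);      // color: blue;
console.log(attrs['data-id']); // 123
console.log(attrs.disabled);   // true
```

Named capture groups and `String.prototype.matchAll` are available in Node.js 14 and in current evergreen browsers, so the pattern should behave the same in both. If the input is a whole document rather than a single tag, a real HTML parser is likely to be more robust than any regex.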