CodexBloom - Programming Q&A Platform

Trouble with `stringr::str_extract` not matching regex as expected in R 4.3.1

👀 Views: 351 💬 Answers: 1 📅 Created: 2025-06-13
regex stringr data-wrangling R

I'm running into a problem with the `stringr::str_extract` function: it fails to match strings the way I expected. I'm using R version 4.3.1 and trying to extract email addresses from a character vector. My regex pattern seems correct, but I'm getting `NA` for most of the entries, even though I can see valid email formats in the data. Here's the code I'm using:

```r
library(stringr)

text_data <- c('Contact us at info@example.com', 'Support: support@domain.org', 'No email here', 'hello@world.net is my email')

# Regex pattern for extracting emails
email_pattern <- '[\w\.-]+@[\w\.-]+\.\w{2,4}'

# Extracting emails
extracted_emails <- str_extract(text_data, email_pattern)
print(extracted_emails)
```

When I run this, the output shows `NA` for 'Contact us at info@example.com' and 'hello@world.net is my email', but for the second entry it correctly identifies the email. I tried adjusting the regex to account for variations, but the problem persists.

Is there something I'm missing about how `str_extract` handles text, or about the regex pattern itself? I'd appreciate any insight into why this is happening and how I can reliably extract all email addresses from my text data. Is there a better approach? I'm working on a microservice that needs to handle this.

Thanks in advance!
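One thing worth checking, assuming the pattern is written exactly as posted: in an ordinary R string literal the backslash is itself an escape character, so `\w` and `\.` are not valid escapes and have to be written as `\\w` and `\\.` (or the whole pattern can be given as a raw string, available since R 4.0). The sketch below is a minimal illustration of that fix on the same sample data, not a full email validator; the variable `email_pattern_raw` and the two-address sample string are made up for the example.

```r
library(stringr)

text_data <- c('Contact us at info@example.com',
               'Support: support@domain.org',
               'No email here',
               'hello@world.net is my email')

# Backslashes doubled so the regex engine receives \w, \. and \-
email_pattern <- "[\\w.\\-]+@[\\w.\\-]+\\.\\w{2,4}"

# Equivalent raw-string form (R >= 4.0): no doubling needed
email_pattern_raw <- r"([\w.\-]+@[\w.\-]+\.\w{2,4})"
same_pattern <- identical(email_pattern, email_pattern_raw)

# First match per element; NA where nothing matches
extracted_emails <- str_extract(text_data, email_pattern)
print(extracted_emails)

# All matches per element, for strings that hold several addresses
# (hypothetical sample string)
all_emails <- str_extract_all("a@b.com and c@d.org", email_pattern)[[1]]
print(all_emails)
```

With this pattern, `str_extract` should return an address for every element that contains one and `NA` only for 'No email here'. Note that `\\w{2,4}` caps the top-level domain at four characters, so longer TLDs would be truncated; widening that range is a judgment call for your data.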