CodexBloom - Programming Q&A Platform

Unexpected results when using the purrr package for nested data manipulation in R

Views: 84 | Answers: 1 | Created: 2025-06-10
r dplyr purrr

I'm working with a nested data frame in R using the `purrr` package, and I'm getting unexpected results when applying a function to each group. I have a data frame `df` that contains student scores for multiple subjects, and I want to calculate the average score per subject for each student. Here's a sample of my data:

```r
library(dplyr)
library(tidyr)  # provides nest()
library(purrr)

df <- data.frame(
  student = c('Alice', 'Alice', 'Bob', 'Bob', 'Charlie', 'Charlie'),
  subject = c('Math', 'Science', 'Math', 'Science', 'Math', 'Science'),
  score = c(90, 85, 88, 92, 95, 80)
) %>%
  group_by(student)
```

I want to calculate the average score per subject for each student using `map`, but I’m getting an unexpected output. Here's the code I've written:

```r
average_scores <- df %>%
  nest(data = c(subject, score)) %>%
  mutate(avg = map(data, ~ mean(.x$score)))
```

The code runs, but the averages are not what I want. Instead of individual averages for each subject, it returns the overall average across all subjects for each student. The final output looks something like this:

```r
# A tibble: 3 x 3
  student data           avg
  <chr>   <list>       <dbl>
1 Alice   <data.frame>  87.5
2 Bob     <data.frame>  90.0
3 Charlie <data.frame>  87.5
```

I expected the `avg` column to reflect the average score of each `subject` for each student. I’ve also tried `summarise` instead. Grouped by `student` alone, it only gives me one average per student rather than breaking it down by subject, so I grouped by both columns:

```r
result <- df %>%
  group_by(student, subject) %>%
  summarise(avg = mean(score))
```

But this doesn't fit my need either, since it produces multiple rows per student, one for each subject. I need a single row per student with their average scores.

Can anyone help me understand how to achieve this correctly with `purrr`, or suggest an alternative approach that might work better? Any insight would be greatly appreciated!

For context, I've been using R for about a year, and the project is a desktop app built with R.
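
For reference, here is a minimal sketch of two ways the "one row per student, one average per subject" shape described above might be produced. It assumes the sample `df` from the question (without the initial `group_by`) and that `tidyr` is available. The names `wide_avg` and `nested_avg` are made up for illustration and are not from the original post.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same sample data as in the question, left ungrouped here.
df <- data.frame(
  student = c("Alice", "Alice", "Bob", "Bob", "Charlie", "Charlie"),
  subject = c("Math", "Science", "Math", "Science", "Math", "Science"),
  score   = c(90, 85, 88, 92, 95, 80)
)

# Option 1 (no purrr): average per student/subject pair,
# then spread the subjects into columns so each student is one row.
wide_avg <- df %>%
  group_by(student, subject) %>%
  summarise(avg = mean(score), .groups = "drop") %>%
  pivot_wider(names_from = subject, values_from = avg)

# Option 2 (purrr): keep the nested workflow, but split each student's
# scores by subject before averaging. mean(.x$score) alone pools all
# subjects together, which is why the original code gave one overall number.
nested_avg <- df %>%
  nest(data = c(subject, score)) %>%
  mutate(avg = map(data, ~ map_dbl(split(.x$score, .x$subject), mean))) %>%
  unnest_wider(avg)  # named vector elements become Math / Science columns

print(wide_avg)
print(nested_avg)
```

Both results should have one row per student with `Math` and `Science` columns. Because the sample data has exactly one score per student and subject, each "average" here simply equals that single score. The difference from the overall mean would only show up with several scores per subject.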