CodexBloom - Programming Q&A Platform

Unexpected results when using `tidyr::pivot_wider()` with a custom aggregation function in R

๐Ÿ‘€ Views: 15 ๐Ÿ’ฌ Answers: 1 ๐Ÿ“… Created: 2025-07-08
r tidyr data-wrangling R

I'm performance testing and I tried several approaches but none seem to work... I am working with an scenario with `tidyr::pivot_wider()` when trying to use it with a custom aggregation function. I have a data frame that contains sales data with multiple entries for the same product on different dates, and I want to summarize the sales by summing them up while also keeping track of the maximum sale for each product. However, when I use `pivot_wider()`, I get unexpected results where the aggregated values seem incorrect. Hereโ€™s a simplified version of my data frame: ```r library(dplyr) library(tidyr) df <- data.frame( product = c('A', 'A', 'B', 'B', 'A'), date = as.Date(c('2023-01-01', '2023-01-02', '2023-01-01', '2023-01-02', '2023-01-03')), sales = c(10, 20, 30, 15, 25) ) ``` I want to pivot this data to get the total sales and maximum sales for each product across the dates. I tried the following code: ```r result <- df %>% pivot_wider( names_from = date, values_from = sales, values_fn = list(sales = list(sum, max)) ) ``` However, this results in an behavior: `behavior: need to convert list to a single value`. I also experimented with using `dplyr::summarise()` before pivoting, like this: ```r summarized_df <- df %>% group_by(product) %>% summarise(total_sales = sum(sales), max_sales = max(sales)) result <- summarized_df %>% pivot_wider( names_from = product, values_from = c(total_sales, max_sales) ) ``` But this doesnโ€™t provide the pivoted structure based on dates and fails to keep the date context. Iโ€™m using `tidyr` version 1.1.4 and `dplyr` version 1.0.7. What am I doing wrong? How can I correctly aggregate the sales data while preserving the necessary structure using `pivot_wider()`? I'm working on a service that needs to handle this. What's the correct way to implement this? My development environment is CentOS. Is there a simpler solution I'm overlooking?