CodexBloom - Programming Q&A Platform

PostgreSQL: Trouble with CTE performance when joining large datasets using ROW_NUMBER()

👀 Views: 55 💬 Answers: 1 📅 Created: 2025-06-12
postgresql performance sql cte

I'm seeing significant performance problems when using a Common Table Expression (CTE) with `ROW_NUMBER()` to partition and filter a large dataset before joining it with another table. I have two tables: `sales`, containing millions of records, and `products`, with a few thousand entries. The goal is to select the top 10 sales per product category by sales amount, and then join this filtered result with the `products` table to include product details.

Here's the SQL query I'm using:

```sql
WITH RankedSales AS (
    SELECT s.product_id,
           s.amount,
           ROW_NUMBER() OVER (PARTITION BY s.category_id ORDER BY s.amount DESC) AS rank
    FROM sales s
)
SELECT p.product_name, rs.amount
FROM RankedSales rs
JOIN products p ON rs.product_id = p.id
WHERE rs.rank <= 10;
```

The query returns the expected results, but it takes a very long time to execute, and it gets slower as the `sales` table grows. I've tried adding indexes on `sales.category_id` and `sales.amount`, as well as on `products.id`, but performance hasn't improved much. When I look at the execution plan, it shows that the CTE is materialized and that the `rank <= 10` filter is applied after the join, which seems inefficient. I've also rewritten the query without a CTE, using a subquery instead, but performance remains largely the same.

Is there a better way to optimize this query? Could a different window function or indexing strategy help? Any insights would be appreciated!

This issue appeared after updating to the stable release.
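For anyone trying to reproduce this, here is a rough sketch of one variant that may be worth measuring. It is not a confirmed fix. It assumes the indexes mentioned above are two separate single-column indexes, and it swaps them for one composite index whose column order matches the window's `PARTITION BY` and `ORDER BY`, which may let the planner read rows already in the order the window needs instead of sorting the whole table. The index name `idx_sales_category_amount` and the alias `rn` are made up for illustration.

```sql
-- Hypothetical index name. The column order mirrors
-- PARTITION BY category_id ORDER BY amount DESC in the window below.
CREATE INDEX IF NOT EXISTS idx_sales_category_amount
    ON sales (category_id, amount DESC);

-- Capture the actual plan with timings and buffer usage, so the sort,
-- window, and join steps can be compared before and after the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.product_name, rs.amount
FROM (
    SELECT s.product_id,
           s.amount,
           ROW_NUMBER() OVER (PARTITION BY s.category_id
                              ORDER BY s.amount DESC) AS rn
    FROM sales s
) rs
JOIN products p ON p.id = rs.product_id
WHERE rs.rn <= 10;
```

Comparing the `EXPLAIN` output with and without the composite index should show whether the sort feeding the window step disappears and where the `rn <= 10` filter is applied relative to the join.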