CodexBloom - Programming Q&A Platform

PostgreSQL Performance Issue with Complex Subquery in SELECT Statement

👀 Views: 3 💬 Answers: 1 📅 Created: 2025-06-03
postgresql performance sql optimization

I've been struggling with this for a few days now and, after trying multiple solutions I found online, I still can't figure it out. I'm seeing a significant performance problem with a SQL query in PostgreSQL 13.3 that uses a correlated subquery in the SELECT list. The query looks like this:

```sql
SELECT a.id,
       a.name,
       (
           SELECT COUNT(*)
           FROM orders o
           WHERE o.user_id = a.id
             AND o.status = 'completed'
       ) AS completed_orders
FROM users a
WHERE a.active = true;
```

The query takes over 10 seconds on a dataset of about 500,000 users, and it seems to lock the database during execution. I've tried creating an index on the `orders` table covering `user_id` and `status`, but it didn't help much. The database runs on an AWS RDS db.t3.medium instance.

I also tried rewriting the query with a LEFT JOIN instead of the subquery, but the performance barely improved:

```sql
SELECT a.id,
       a.name,
       COUNT(o.id) AS completed_orders
FROM users a
LEFT JOIN orders o
       ON o.user_id = a.id
      AND o.status = 'completed'
WHERE a.active = true
GROUP BY a.id, a.name;
```

I analyzed the execution plan as well. It shows the subquery causing a sequential scan on the `orders` table, which has about 2 million rows, and that seems to be the bottleneck.

What can I do to optimize this query further? Are there specific indexing strategies or query rewriting techniques that could improve performance? This query backs an API I'm building, so it needs to be fast. For context: I'm using SQL on macOS and recently upgraded to the latest version. Any examples would be super helpful.
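To make the indexing and rewriting ideas above concrete, here is a rough sketch of the kind of thing that could be tried. It is not a confirmed fix. The index name `idx_orders_completed_user` is a placeholder, and the sketch assumes `orders.user_id` references `users.id` and that `'completed'` is only one of several status values. Whether the planner actually picks the index, and whether any of this is faster on a db.t3.medium, would have to be checked against the real plan.

```sql
-- 1. Refresh planner statistics, then capture the actual plan.
--    Note: EXPLAIN ANALYZE really executes the query.
ANALYZE orders;

EXPLAIN (ANALYZE, BUFFERS)
SELECT a.id,
       a.name,
       (
           SELECT COUNT(*)
           FROM orders o
           WHERE o.user_id = a.id
             AND o.status = 'completed'
       ) AS completed_orders
FROM users a
WHERE a.active = true;

-- 2. Hypothetical partial index: only completed orders are indexed,
--    so the index is smaller than one on (user_id, status).
--    CONCURRENTLY avoids blocking writes, but it cannot run inside
--    a transaction block.
CREATE INDEX CONCURRENTLY idx_orders_completed_user
    ON orders (user_id)
    WHERE status = 'completed';

-- 3. Alternative rewrite: aggregate orders once, then join the result,
--    instead of counting per user row.
SELECT a.id,
       a.name,
       COALESCE(c.completed_orders, 0) AS completed_orders
FROM users a
LEFT JOIN (
    SELECT user_id,
           COUNT(*) AS completed_orders
    FROM orders
    WHERE status = 'completed'
    GROUP BY user_id
) c ON c.user_id = a.id
WHERE a.active = true;
```

In the third query a sequential scan on `orders` is not necessarily a bad sign, since the table is read once rather than once per user. Comparing the `EXPLAIN (ANALYZE, BUFFERS)` output before and after each change is the only reliable way to tell which variant helps.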