MySQL: Slow JOIN Performance with Large Tables and Subqueries

👀 Views: 41 💬 Answers: 1 📅 Created: 2025-08-28

I'm working on a personal project and I'm running into a serious performance problem when executing JOIN operations involving large tables and subqueries in MySQL 8.0. I have two tables: `orders` (around 1 million rows) and `customers` (about 500,000 rows). My query is meant to fetch customer details along with their total order amounts, but it takes an excessively long time to execute.

Here's the query I'm using:

```sql
SELECT c.customer_id, c.name, SUM(o.amount) AS total_amount
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE c.status = 'active'
GROUP BY c.customer_id, c.name;
```

I've noticed that performance drops drastically when I add the `WHERE` clause filtering by `c.status`. Without it, the query takes about 5 seconds; with the filter, it jumps to over 40 seconds.

I've tried creating indexes on the `customer_id` column in both tables and on the `status` column in the `customers` table. Here's what I did:

```sql
CREATE INDEX idx_customer_id ON customers(customer_id);
CREATE INDEX idx_order_customer ON orders(customer_id);
CREATE INDEX idx_customer_status ON customers(status);
```

Despite these indexes, performance has not improved significantly. I've also run `EXPLAIN` on the query, and it shows that a full table scan is still being performed on the `customers` table.

I'm unsure how to optimize this further. Should I consider restructuring the query, or perhaps use a different approach? For context, the project is a REST API backed by a SQL database, and my development environment is Ubuntu.

Any ideas what could be causing this? Any advice would be appreciated!
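
For reference, this is the kind of restructuring I have in mind. It is an untested sketch, not something I have benchmarked: the index names are made up, and it assumes `customer_id` is unique in `customers` (otherwise the rewritten query would not return the same rows as the original). The idea is to aggregate `orders` once per customer in a derived table and then join that result to the filtered `customers` rows, with composite indexes so each side can be read from an index.

```sql
-- Hypothetical composite indexes (names are my own invention).
-- Filter on status, then read customer_id and name from the index itself.
CREATE INDEX idx_customer_status_id_name
    ON customers (status, customer_id, name);

-- Cover both the join/grouping column and the summed column on orders.
CREATE INDEX idx_order_customer_amount
    ON orders (customer_id, amount);

-- Pre-aggregate orders per customer, then join to active customers.
-- Assumes customer_id is unique in customers.
SELECT c.customer_id, c.name, t.total_amount
FROM customers c
JOIN (
    SELECT customer_id, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer_id
) AS t ON t.customer_id = c.customer_id
WHERE c.status = 'active';
```

I would check the plan with `EXPLAIN` before and after to see whether the full scan on `customers` actually goes away. If most customers are `active`, the optimizer may still prefer a scan, so I'm not sure the `status` index helps on its own.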