MySQL 8.0 - How to optimize JOIN performance on large tables with different indexing strategies?
Hey everyone, I'm running into an issue that's driving me crazy. I'm working on a MySQL 8.0 database where I need to join two large tables, `orders` and `customers`, to generate a report. The `orders` table has around 1 million rows, and the `customers` table has approximately 500,000 rows. The JOIN takes over 30 seconds to complete.

This is the query:

```sql
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31';
```

I've tried adding indexes on the `customer_id` column in both `orders` and `customers`:

```sql
CREATE INDEX idx_customer_id_orders ON orders(customer_id);
CREATE INDEX idx_customer_id_customers ON customers(customer_id);
```

However, the performance hasn't improved significantly. I ran `EXPLAIN` on the query, and the plan still shows a full table scan on `orders`: its `type` is `ALL`, which suggests the index isn't being used.

I've also considered partitioning the `orders` table by `order_date`, but I'm unsure whether that would bring the desired performance benefit.

Is there a better indexing strategy I should consider, or should I explore other optimization techniques such as query restructuring or caching? Any insights on best practices for handling large JOINs like this would be greatly appreciated.
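
For anyone who wants to reproduce this, here is a minimal sketch of the plan check and of one index variant that could be tried. It assumes the table and column names from the query above. The index name `idx_orders_date_customer` is hypothetical, and whether the optimizer actually picks the index depends on how many rows fall inside the date range.

```sql
-- Re-check the plan; the `type`, `key`, and `rows` columns for `orders`
-- show whether an index is chosen and how many rows MySQL expects to read.
EXPLAIN
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- Candidate composite index (hypothetical name).
-- Leading with order_date lets the optimizer consider a range scan for the
-- WHERE clause; customer_id and order_id are included so the orders side of
-- this particular query can be answered from the index alone.
CREATE INDEX idx_orders_date_customer
    ON orders (order_date, customer_id, order_id);

-- Refresh index statistics so the optimizer sees the new index accurately.
ANALYZE TABLE orders;
```

After creating the index, running the same `EXPLAIN` again should show whether `type` for `orders` changes from `ALL` to `range` and whether `key` names the new index. If the date range covers most of the table, the optimizer may still decide a full scan is cheaper.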