GCP BigQuery query returning 'Exceeded rate limits' errors on large datasets despite optimized SQL
This might be a silly question, but I'm working with Google BigQuery to analyze a large dataset (over 1 billion rows) and keep getting an `Exceeded rate limits` error when executing my SQL queries, particularly queries that involve JOIN operations across multiple tables. I've optimized the SQL as much as possible by using `WITH` clauses to structure subqueries, but it still fails with the rate limit error. Here's a simplified version of the query I'm using:

```sql
WITH filtered_data AS (
  SELECT *
  FROM `my_project.my_dataset.large_table`
  WHERE some_condition = 'value'
)
SELECT
  fd.*,
  lt.other_column
FROM filtered_data fd
JOIN `my_project.my_dataset.another_table` lt
  ON fd.id = lt.id;
```

I've tried breaking the query into smaller parts and running them individually, but that feels inefficient. I also checked my project quota limits in the GCP console, and I appear to be well below the allocated limits for both queries and resources. I attempted to adjust the default query settings and set `use_legacy_sql` to false, hoping it would improve performance, but to no avail.

I'm using the `google-cloud-bigquery` library (version 2.26.0) in Python to make the calls; a simplified sketch of how I submit the query is at the end of this post. I also recently upgraded to the latest stable Python release. Could this be a known issue?

Is there a recommended approach for handling large queries in BigQuery that would help avoid these rate limits, or a configuration setting I might have missed? I'm building an application that needs to handle this reliably, so any insights on best practices would be greatly appreciated.
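For reference, this is roughly how I'm submitting the query from Python. It's a simplified sketch: the project, dataset, and table names are placeholders, and the row handling at the end just stands in for my actual processing.

```python
from google.cloud import bigquery

# Placeholder project ID; the real client picks up credentials from the environment.
client = bigquery.Client(project="my_project")

sql = """
WITH filtered_data AS (
  SELECT *
  FROM `my_project.my_dataset.large_table`
  WHERE some_condition = 'value'
)
SELECT fd.*, lt.other_column
FROM filtered_data fd
JOIN `my_project.my_dataset.another_table` lt
  ON fd.id = lt.id
"""

# Standard SQL is already the default in google-cloud-bigquery 2.x; I set the
# flag explicitly because I had been experimenting with it.
job_config = bigquery.QueryJobConfig(use_legacy_sql=False)

# Submit the query and block until it finishes; this is where the
# "Exceeded rate limits" error is raised.
query_job = client.query(sql, job_config=job_config)
rows = query_job.result()

for row in rows:
    # Stand-in for my actual downstream processing of each row.
    print(dict(row))
```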