GCP BigQuery Partitioned Table Query Performance Issues with Python Client Library
I am encountering significant performance issues when querying a partitioned BigQuery table with the Python client library. The table is partitioned by ingestion time, and I am querying the last 30 days of data. Despite using a partition filter, the query takes much longer than expected. Here's a snippet of the code where I set up the query:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Filter on the _PARTITIONTIME pseudo-column so BigQuery can prune partitions.
query = '''
    SELECT *
    FROM `my_project.my_dataset.my_partitioned_table`
    WHERE _PARTITIONTIME BETWEEN
        TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
        AND CURRENT_TIMESTAMP()
'''

job = client.query(query)
results = job.result()  # Waits for the job to complete.

for row in results:
    print(row)
```

The dataset has appropriate partitioning, and I am using the latest version of the client library (2.30.0). Even so, query execution times average around 30 seconds for just a few hundred rows. I also checked in the BigQuery Console, and the execution plan shows an unexpectedly large number of bytes scanned, which seems unnecessary given the partition filter.

I have tried optimizing the query by limiting the selected columns and specifying the partition range explicitly in the WHERE clause (see the snippets at the end of this post), but performance hasn't improved.

Are there any additional best practices I might be overlooking when querying partitioned tables in BigQuery? Also, is there a way to analyze why the query is scanning so much data? Any insights would be greatly appreciated!
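For reference, this is the variant I tried with explicit column selection and literal partition bounds instead of computed expressions (the column names below are placeholders for my actual schema):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Variant I tried: select only needed columns (placeholder names) and
# use literal timestamps for the partition bounds instead of
# TIMESTAMP_SUB(CURRENT_TIMESTAMP(), ...) expressions.
query = '''
    SELECT col_a, col_b, col_c
    FROM `my_project.my_dataset.my_partitioned_table`
    WHERE _PARTITIONTIME BETWEEN TIMESTAMP('2021-10-01')
                             AND TIMESTAMP('2021-10-31')
'''

results = client.query(query).result()
```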
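And this is how I have been inspecting the bytes estimate from Python rather than the console, using the client library's dry-run option so nothing actually executes or gets billed. Is this the right way to analyze the scan, or is there a better tool for it?

```python
from google.cloud import bigquery

client = bigquery.Client()

query = '''
    SELECT *
    FROM `my_project.my_dataset.my_partitioned_table`
    WHERE _PARTITIONTIME BETWEEN
        TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
        AND CURRENT_TIMESTAMP()
'''

# Dry run: BigQuery validates the query and estimates bytes processed
# without running it; disable the cache so the estimate is not masked.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(query, job_config=job_config)

print(f"Estimated bytes processed: {job.total_bytes_processed}")
```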