CodexBloom - Programming Q&A Platform

GCP Dataflow job fails with 'InvalidArgument' when using BigQueryIO with table partitioning

👀 Views: 7607 💬 Answers: 1 📅 Created: 2025-06-09
gcp dataflow bigquery java

I'm hitting an 'InvalidArgument' error when running a Dataflow job that reads from a partitioned BigQuery table using the `BigQueryIO` API. I've tried several approaches, but none seem to work. The table is partitioned by ingestion time, and I'm trying to read the latest day's worth of data. Here's the code snippet I'm using:

```java
PCollection<TableRow> rows = p.apply("ReadFromBQ",
    BigQueryIO.readTableRows()
        .from("my_project:my_dataset.my_table"
            + "@{`_PARTITIONTIME` = TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)}")
        .withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ));
```

When I execute this job, I receive the following error message:

```
Invalid argument: Unable to find the specified partition.
```

I believe I have specified the partition correctly, but Dataflow does not seem to recognize it. I've verified that the partition exists by querying directly in the BigQuery console:

```sql
SELECT *
FROM `my_project.my_dataset.my_table`
WHERE _PARTITIONTIME = TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);
```

This query returns results as expected. I've also tried switching the method to `BigQueryIO.TypedRead.Method.QUERY` and specifying the SQL directly, but that didn't resolve the error either.

I'm using Beam SDK version 2.30.0, and the job runs in streaming mode. My development environment is Linux. Any guidance on how to properly access the partitioned table, or on what might be misconfigured, would be greatly appreciated. Could this be a known issue?
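For completeness, here is a minimal sketch of the decorator-style table reference I'm considering as a workaround. Note the helper name `yesterdayPartition` is mine, and I'm assuming the `$YYYYMMDD` table decorator applies to daily ingestion-time partitioned tables; the commented-out Beam calls are just to show where the reference would plug in:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class PartitionRef {
    // Assumption: daily ingestion-time partitions are addressable with a
    // "$YYYYMMDD" table decorator, e.g. "my_table$20250608".
    static final DateTimeFormatter PARTITION_FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Builds a table reference pointing at yesterday's partition,
    // relative to the supplied date.
    static String yesterdayPartition(String tableSpec, LocalDate today) {
        return tableSpec + "$" + today.minusDays(1).format(PARTITION_FMT);
    }

    public static void main(String[] args) {
        String ref = yesterdayPartition("my_project:my_dataset.my_table",
                                        LocalDate.of(2025, 6, 9));
        System.out.println(ref); // my_project:my_dataset.my_table$20250608

        // Hypothetical Beam usage (not compiled here; requires the Beam SDK):
        //   BigQueryIO.readTableRows().from(ref)
        //
        // With DIRECT_READ, another option I've seen is keeping the plain table
        // name and pushing the filter down via withRowRestriction, e.g.:
        //   .withRowRestriction(
        //       "_PARTITIONDATE = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)")
    }
}
```
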