CodexBloom - Programming Q&A Platform

GCP Dataflow Pipeline Fails with 'InvalidTableReference' Error When Writing to BigQuery

👀 Views: 0 💬 Answers: 1 📅 Created: 2025-08-20
gcp dataflow bigquery apache-beam Python

I'm working through a tutorial and I'm stuck on an issue with my Apache Beam Dataflow pipeline while trying to write data to BigQuery. Despite what looks like a correct configuration, I keep getting an 'InvalidTableReference' error. My BigQuery dataset and table are both set up, and I confirmed that my service account has the necessary permissions.

Here's the relevant code snippet for writing to BigQuery:

```python
import apache_beam as beam
from apache_beam import Pipeline
from apache_beam.io import WriteToBigQuery
from apache_beam.options.pipeline_options import PipelineOptions

project_id = 'my-gcp-project'
dataset_id = 'my_dataset'
table_id = 'my_table'


def run():
    with Pipeline(options=PipelineOptions()) as p:
        (p
         | 'ReadFromSource' >> beam.io.ReadFromText('gs://my-bucket/source-data.txt')
         # Turn each CSV line into a row dict for BigQuery.
         | 'TransformData' >> beam.Map(lambda x: {'field1': x.split(',')[0], 'field2': int(x.split(',')[1])})
         | 'WriteToBigQuery' >> WriteToBigQuery(
             table='{}:{}.{})'.format(project_id, dataset_id, table_id),
             write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND
         ))


if __name__ == '__main__':
    run()
```

The error message I'm receiving is:

```
ValueError: Invalid table reference 'my-gcp-project:my_dataset.my_table': Table 'my-gcp-project:my_dataset.my_table' not found.
```

I've double-checked the dataset and table names, and they exist in the BigQuery interface. I'm using the following library versions:

- Apache Beam: 2.34.0
- Google Cloud BigQuery client: 2.34.0

I also verified that the service account used by Dataflow has the roles `BigQuery Data Editor` and `BigQuery User`. Is there something I might be missing regarding the table reference or permissions that could lead to this error?

This issue appeared after updating to Python 3.9. I'm running Python in a Docker container; the environments involved are Windows 11 and Debian.

Thanks for taking the time to read this! I'd really appreciate any guidance on this.
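
One thing that may be worth ruling out is a malformed table string. The format string in the snippet, `'{}:{}.{})'`, would produce a trailing `)`; whether that is a transcription typo or the actual cause is not clear from the error message shown. Below is a minimal sketch, using only the standard library, of a check that could run before the pipeline starts. The helper name `check_table_spec` and the regular expression are assumptions made for illustration, not part of the Beam API, and the pattern is deliberately stricter and simpler than BigQuery's real naming rules.

```python
import re

# Simplified pattern for a "project:dataset.table" spec. This is an assumption
# for illustration; BigQuery's actual naming rules are broader than this.
TABLE_SPEC_RE = re.compile(
    r"(?P<project>[a-z][a-z0-9-]*):"
    r"(?P<dataset>[A-Za-z0-9_]+)\."
    r"(?P<table>[A-Za-z0-9_]+)"
)


def check_table_spec(spec):
    """Return (project, dataset, table), or raise ValueError if spec is malformed."""
    match = TABLE_SPEC_RE.fullmatch(spec)
    if match is None:
        raise ValueError("Malformed table spec: {!r}".format(spec))
    return match.group("project"), match.group("dataset"), match.group("table")


project_id = 'my-gcp-project'
dataset_id = 'my_dataset'
table_id = 'my_table'

# Format string exactly as written in the question (note the trailing ')').
as_written = '{}:{}.{})'.format(project_id, dataset_id, table_id)
# Same format string without the trailing ')'.
without_paren = '{}:{}.{}'.format(project_id, dataset_id, table_id)

for spec in (as_written, without_paren):
    try:
        print(spec, '->', check_table_spec(spec))
    except ValueError as exc:
        print(exc)
```

If the string turns out to be fine, another option that avoids string formatting altogether is to pass the pieces separately, since `WriteToBigQuery` also accepts `dataset` and `project` arguments alongside `table`.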