GCP Dataflow job never reaches status RUNNING when using Apache Beam 2.34.0
I've searched everywhere and can't find a clear answer. I'm trying to run a Dataflow job using Apache Beam 2.34.0, but the job never reaches the **RUNNING** status: every time I trigger it, it fails and rolls back. My pipeline reads data from a Pub/Sub topic and writes it to BigQuery. I've confirmed that the Dataflow runner is set up correctly and that I have sufficient permissions. I also recently upgraded to Python 3.9, in case that matters.

Here's a simplified version of my code:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    project='my-gcp-project',
    runner='DataflowRunner',
    temp_location='gs://my-temp-bucket/temp',
    region='us-central1'
)

with beam.Pipeline(options=options) as p:
    (p
     | 'Read from PubSub' >> beam.io.ReadFromPubSub(
            subscription='projects/my-gcp-project/subscriptions/my-subscription')
     | 'Transform Data' >> beam.Map(lambda x: x.upper())
     | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
            table='my_dataset.my_table',
            schema='AUTODETECT',
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND
        ))
```

I've tried changing the region and using different temp locations, but the problem persists. I also checked the Google Cloud Console, and no other jobs are running at the same time. I even ran a simplified version of the pipeline that just writes a static value to BigQuery, and the same behavior occurs.

Is there something I might be missing in my setup, or could this be a bug in the Beam library? Any pointers in the right direction would be greatly appreciated.
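
For reference, the simplified test pipeline I mentioned looks roughly like this (it uses the same pipeline options as above; the field name, static value, and explicit schema string are just placeholders for this minimal test):

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Same options as the main pipeline above.
options = PipelineOptions(
    project='my-gcp-project',
    runner='DataflowRunner',
    temp_location='gs://my-temp-bucket/temp',
    region='us-central1'
)

with beam.Pipeline(options=options) as p:
    (p
     # A single hard-coded row, just to test that the job can start and write.
     | 'Create static row' >> beam.Create([{'message': 'hello'}])
     | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
            table='my_dataset.my_table',
            schema='message:STRING',
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND
        ))
```

Even this minimal batch job fails and rolls back in the same way, which is why I suspect the issue is in my setup rather than in the Pub/Sub read.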