AWS Glue ETL Job Fails with 'InvalidInput' Error After Adding New Data Source
I'm working on a personal project: an AWS Glue ETL job that processes data from an S3 bucket and writes it to another S3 destination. Everything worked until I added a new data source to the job. Now I get the following error in the Glue console:

`InvalidInput: The specified input does not exist.`

I've double-checked the data source paths and the permissions. The IAM role associated with the Glue job has access to both the source and destination buckets.

Here's a simplified version of my Glue job script:

```python
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
job = Job(glueContext)
# Initialize the job before reading any sources
job.init(args['JOB_NAME'], args)

# Define the first data source
source1 = glueContext.create_dynamic_frame.from_catalog(
    database="my_database",
    table_name="my_first_table"
)

# Define the new data source
source2 = glueContext.create_dynamic_frame.from_catalog(
    database="my_database",
    table_name="my_new_table"
)

# Processing logic
combined = source1.union(source2)

# Write output
output = glueContext.write_dynamic_frame.from_options(
    frame=combined,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/output/"},
    format="json"
)

job.commit()
```

I've verified that `my_database` and `my_new_table` exist in the Glue Data Catalog. The job works perfectly if I only use `my_first_table`, but adding `my_new_table` consistently produces the `InvalidInput` error.

Has anyone run into similar issues with Glue ETL jobs, especially after modifying the data sources? Any suggestions on how to debug this, or on additional logging that might help diagnose the problem, would be greatly appreciated.

For context: I'm using Python on Ubuntu 20.04.
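A check that may help narrow this down is whether the S3 location recorded in the catalog for `my_new_table` actually contains any objects, since a table can exist in the Data Catalog while its underlying path is empty or mistyped. Below is a minimal sketch of that check. It assumes boto3-style Glue and S3 clients are passed in, and the helper names (`table_location`, `split_s3_uri`, `location_has_objects`, `check_table`) are hypothetical, introduced only for illustration.

```python
from urllib.parse import urlparse


def table_location(glue_client, database, table):
    """Return the storage location the Data Catalog records for a table."""
    response = glue_client.get_table(DatabaseName=database, Name=table)
    # Tables without an S3-backed storage descriptor yield an empty string.
    return response["Table"].get("StorageDescriptor", {}).get("Location", "")


def split_s3_uri(uri):
    """Split 's3://bucket/prefix/' into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("s3", "s3a", "s3n") or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def location_has_objects(s3_client, uri):
    """Return True if at least one object exists under the given S3 URI."""
    bucket, prefix = split_s3_uri(uri)
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


def check_table(glue_client, s3_client, database, table):
    """Return (catalog location, whether that location holds any objects)."""
    location = table_location(glue_client, database, table)
    return location, location_has_objects(s3_client, location)
```

With boto3 installed and credentials for the job's role, this could be called as `check_table(boto3.client("glue"), boto3.client("s3"), "my_database", "my_new_table")`. If the second element of the result is `False`, or a `ValueError` is raised for the location, the catalog entry is pointing somewhere the job cannot read from.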
This is happening in both development and production. Thanks for any help you can provide!