I want to truncate my Redshift table before loading a CSV file into it.
Error: airflow.exceptions.AirflowException: Invalid arguments were passed to S3ToRedshiftOperator (task_id: dag_run_s3_to_redshift). Invalid arguments were: **kwargs: {'method': 'REPLACE'}
Code below:
task_fail_s3_to_redshift = S3ToRedshiftOperator(
    s3_bucket=S3_BUCKET,
    s3_key="{{ti.xcom_pull(task_ids='export_db',key='FILE_PATH_1')}}",
    schema="dw_stage",
    table="task_fail",
    copy_options=['csv', "IGNOREHEADER 1"],
    redshift_conn_id='redshift',
    method='REPLACE',
    task_id='task_fail_s3_to_redshift',
)
start >> task_fail_s3_to_redshift >> end
CodePudding user response:
The method parameter was added in a PR and is available in:
apache-airflow-providers-amazon >= 2.4.0
The error means you are using an older version of the Amazon provider, in which S3ToRedshiftOperator does not accept method, so the argument is rejected as an invalid keyword.
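To confirm which provider version is actually installed in the environment that runs the DAG, a minimal sketch using only the standard library (Python 3.8+) might look like this. The helper name installed_provider_version is hypothetical and not part of Airflow; the package name is the one quoted above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution name of the Amazon provider, as quoted in the answer.
PROVIDER_PACKAGE = "apache-airflow-providers-amazon"


def installed_provider_version(package=PROVIDER_PACKAGE):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version, or None when the provider is not installed.
    print(installed_provider_version())
```

Run it with the same interpreter that the Airflow scheduler and workers use, since a different virtualenv may report a different version.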
Your options are:
1. Upgrade the provider:
pip install apache-airflow-providers-amazon --upgrade
2. If upgrading is not an option, use the deprecated truncate_table parameter:
task_fail_s3_to_redshift = S3ToRedshiftOperator(
    s3_bucket=S3_BUCKET,
    s3_key="{{ti.xcom_pull(task_ids='export_db',key='FILE_PATH_1')}}",
    schema="dw_stage",
    table="task_fail",
    copy_options=['csv', "IGNOREHEADER 1"],
    redshift_conn_id='redshift',
    truncate_table=True,
    task_id='task_fail_s3_to_redshift',
)
Since all you want is to empty the table before the load, truncate_table=True will give you the same functionality.
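If the same DAG file has to run against environments with different provider versions, the choice between the two parameters could be made from the version string. This is only a sketch under stated assumptions: truncate_kwargs and parse_version are hypothetical helpers, the parsing is deliberately naive (it reads only the leading digits of the first three dot-separated components), and the 2.4.0 threshold is the one quoted above.

```python
import re

# First provider version that accepts `method`, per the answer above.
METHOD_MIN_VERSION = (2, 4, 0)


def parse_version(text):
    """Naively turn a version string such as '2.4.0rc1' into a 3-int tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '10.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def truncate_kwargs(provider_version):
    """Pick the keyword argument that empties the table before the COPY.

    Hypothetical helper: returns method='REPLACE' for providers at or above
    the threshold, and the older truncate_table=True otherwise.
    """
    if parse_version(provider_version) >= METHOD_MIN_VERSION:
        return {"method": "REPLACE"}
    return {"truncate_table": True}
```

The returned dictionary could then be unpacked into the operator call alongside the other arguments, for example by adding **truncate_kwargs(provider_version) in place of the method or truncate_table line, where provider_version is the installed provider's version string.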
