AWS Glue Job Fails with “CSV data source does not support map data type” Error
Mar 28, 2021
AWS Glue is a serverless ETL service for processing large volumes of data from various sources for analytics. Recently I ran into a “CSV data source does not support map data type” error in a newly created Glue job. In a nutshell, the job performed the following steps:
- Read the data from S3 using create_dynamic_frame_from_options
- Perform some required transformations
- Write the transformed data to Amazon Redshift using write_dynamic_frame_from_jdbc_conf
The Glue job was failing during this write step. Let's look at it in a little more detail:
1. The data was read from S3 in Parquet format:
datasource0 = glueContext.create_dynamic_frame_from_options(
    connection_type="s3",
    connection_options={
        "paths": [S3_location]
    },
    format="parquet"
)
2. The schema of the data was as follows:
datasource0.printSchema()
root
|-- id: string
|-- version: int
|-- description: string
|-- type: string
|-- status: string
|-- rel_metadata: map
|    |-- keyType: string
|    |-- valueType: string
|-- mod_metadata: map
|    |-- keyType: string
|    |-- valueType: string
|-- event_type: string
|-- created_at: long
|-- last_updated: long
|…
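Since the error message names the map data type, the two map fields in this schema (rel_metadata and mod_metadata) are the natural suspects. As a minimal sketch, the snippet below scans the printed schema text and lists the top-level fields of type map. The helper name find_map_columns and the SCHEMA_TEXT constant are hypothetical, introduced only for illustration, and the schema string contains only the fields visible above (the original output is truncated).

```python
import re

# Schema text as printed by datasource0.printSchema() above.
# Assumption: only the fields visible in the article are included,
# because the original output is truncated.
SCHEMA_TEXT = """\
root
|-- id: string
|-- version: int
|-- description: string
|-- type: string
|-- status: string
|-- rel_metadata: map
|    |-- keyType: string
|    |-- valueType: string
|-- mod_metadata: map
|    |-- keyType: string
|    |-- valueType: string
|-- event_type: string
|-- created_at: long
|-- last_updated: long
"""

# Matches only top-level fields: the line must start with "|-- ".
# Nested lines start with "|" followed by spaces, so they are skipped.
TOP_LEVEL_FIELD = re.compile(r"^\|-- (?P<name>[^:]+): (?P<type>\w+)\s*$")


def find_map_columns(schema_text):
    """Return the names of top-level fields whose printed type is 'map'.

    Hypothetical helper for illustration; it works on the printed text,
    not on a live DynamicFrame.
    """
    map_columns = []
    for line in schema_text.splitlines():
        match = TOP_LEVEL_FIELD.match(line)
        if match and match.group("type") == "map":
            map_columns.append(match.group("name"))
    return map_columns


if __name__ == "__main__":
    print(find_map_columns(SCHEMA_TEXT))
```

Inside an actual Glue job, the same check could presumably be made against datasource0.toDF().schema by testing each field's dataType against pyspark.sql.types.MapType. That variant is not shown here because it needs a Spark runtime.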