Minh DOAN · AWS EMR Serverless with Spark: A Guide to writing to DynamoDB with Docker Testing… · Hello, this is a part of my data journey as a junior Data Engineer, where I would like to share with you and future me how I solved… · Nov 3, 2023
Minh DOAN · Cosine Similarity for large scale Movies Recommendations evaluation · Hello, this is a part of my data journey as a junior Data Engineer, where I would like to share with you and future me how I solved… · Nov 2, 2023
Minh DOAN · Set up a customised Airflow local server for triggering a PySpark job on EMR Serverless · Hello! In this article, I'll guide you on using Airflow to schedule a PySpark job run in EMR Serverless. If you're curious about… · Nov 2, 2023
Minh DOAN · Spark submit with PySpark and AWS EMR Serverless 6.9.0 · Hello, I am writing this because I know that there are many Data Engineers out there who are struggling to run a PySpark script in a… · Nov 2, 2023