Upsert in Delta Lake: Part 4

Aravinth
1 min read · Jan 30, 2020


Welcome to the fourth part of the series. This post covers how to upsert/merge data from an Apache Spark DataFrame into a Delta table.

The previous posts can be found here:

Introduction to Delta Lake

Delta Time Travel for Data Lake

Partitioned Delta Lake

Photo by Ryoji Iwata on Unsplash

Merge in SQL

A relational database management system uses the SQL MERGE statement to INSERT new records or UPDATE existing ones, depending on whether a match condition holds.
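To make the insert-or-update behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite does not implement MERGE, so this uses its `INSERT ... ON CONFLICT DO UPDATE` form as a stand-in (available in SQLite 3.24 and later). The `sales` table and its `order_id` and `amount` columns are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory database with a hypothetical sales table keyed on order_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

# Incoming batch: order 2 already exists (update), order 3 is new (insert).
updates = [(2, 25.0), (3, 30.0)]
conn.executemany(
    """
    INSERT INTO sales (order_id, amount) VALUES (?, ?)
    ON CONFLICT(order_id) DO UPDATE SET amount = excluded.amount
    """,
    updates,
)

rows = conn.execute(
    "SELECT order_id, amount FROM sales ORDER BY order_id"
).fetchall()
print(rows)  # [(1, 10.0), (2, 25.0), (3, 30.0)]
```

Order 1 is untouched, order 2 is updated in place, and order 3 is inserted, which is the same outcome a MERGE with a matched-update and a not-matched-insert clause would give.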

Upsert into a table using Merge:

You can upsert data from an Apache Spark DataFrame into a Delta table using the merge operation. This operation is similar to the SQL MERGE command but has additional support for deletes and extra conditions in updates, inserts, and deletes.

I have used a sales.csv file for my demo.

Let’s get started
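A Delta merge can be sketched roughly as follows, using the Delta Lake Python API (`DeltaTable.forPath` and the `merge` builder). This is an illustration under assumptions, not the exact demo code: the key column `order_id`, the table path `/tmp/delta/sales`, and the helper name `upsert_sales` are all hypothetical, and it assumes a Spark session configured with the Delta Lake package and an existing Delta table at that path.

```python
def upsert_sales(spark, updates_df, table_path, key="order_id"):
    """Merge updates_df into the Delta table at table_path, matching rows on key.

    The key column name and the table path are assumptions for this sketch.
    """
    # Imported lazily so this module still loads where delta-spark is not installed.
    from delta.tables import DeltaTable

    target = DeltaTable.forPath(spark, table_path)
    (
        target.alias("t")
        .merge(updates_df.alias("s"), f"t.{key} = s.{key}")
        .whenMatchedUpdateAll()     # key found in target: overwrite with source values
        .whenNotMatchedInsertAll()  # key not found in target: insert the source row
        .execute()
    )


# Example usage (requires a Delta-enabled SparkSession named `spark`):
# updates = spark.read.option("header", "true").csv("sales.csv")
# upsert_sales(spark, updates, "/tmp/delta/sales")
```

The two `when...` clauses mirror the UPDATE and INSERT branches of a SQL MERGE. The same builder also accepts an optional condition on each clause and a `whenMatchedDelete` clause, which is where the extra support for deletes and conditions mentioned above comes in.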

In the next part of this series, I will cover streaming in Delta Lake.

Thanks for reading!!!!

See you soon :)
