Pandas >> Data Combination(1): merge()
May 3, 2022
In this tutorial, we will explain how to combine (join) multiple DataFrames using merge().
Table of Contents
- Introduction to merge()
- Data Preparation
- JOIN
- INNER JOIN
- LEFT JOIN
- RIGHT JOIN
- OUTER JOIN
- CROSS JOIN
- LEFT EXCLUSIVE/ANTI JOIN
- RIGHT EXCLUSIVE/ANTI JOIN
- FULL OUTER EXCLUSIVE/ANTI JOIN
- JOIN by multiple columns
- Removing duplicate columns
- Specify suffixes for columns with the same name
- Conclusion
Introduction to merge()
First, let’s look at the definition of the merge() function.
pandas.merge: Merge DataFrame or named Series objects with a database-style join.
https://pandas.pydata.org/docs/reference/api/pandas.merge.html
pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
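To make the signature concrete, here is a minimal sketch of two calls. The sample DataFrames and their column names (key, left_val, right_val) are made up for illustration and are not the data used later in this tutorial.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Default how='inner': keep only the keys present in both DataFrames
inner = pd.merge(left, right, on="key")

# how='left': keep every row of `left`; rows with no match get NaN
left_joined = pd.merge(left, right, how="left", on="key")

print(inner)
print(left_joined)
```

With this data, the inner join should keep only the keys "b" and "c", while the left join keeps all three rows of left and fills right_val with NaN for "a".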
Let’s explain the main parameters.
- left: The left-hand DataFrame (or named Series) in the join.
- right: The right-hand DataFrame (or named Series) in the join.
- how: How to join the two DataFrames. There are…