Python Tip #6 — Pandas Merge

Asha Ponraj
Analytics Vidhya
Published in
2 min readOct 25, 2020

Pandas concat & append works like an array concatenation either column wise or row wise.

But Merge & Join works similar to Database table joins. Lets see Merge in this article and learn about Join in next post!

So, Merge can be used when we need to join two data frames having different columns and there is a key column in both the data frames.

For example, if we have two data frames, one with store details and another with product details of all the stores, we can merge the data frames using shop Id.

Signature:
pd.merge(left,
right,
how: str = ‘inner’,
on=None,
left_on=None,
right_on=None,
left_index: bool = False,
right_index: bool = False,
sort: bool = False,
suffixes=(‘_x’, ‘_y’),
copy: bool = True,
indicator: bool = False,
validate=None,
)

  • left & right params are data.
  • how -> mentions how the merge should happen. It takes values ‘left’, ‘right’, ‘inner’, ‘outer’. default value is ‘inner’.
  • ‘left’ -> Collects all data from left dataframe and common data of left and right.
  • ‘right’ -> Collects all data from right dataframe and common data of left and right.
  • ‘inner’ -> Collects only the common data from both the left and right dataframes. This more like an intersection.
  • ‘outer’ -> Collects all from left and right including common.
  • on -> Mentions which column should be taken as key column that exists in both the left and right dataframe.
import pandas as pd shop = { 'Shop_id' : ['SP01', 'SP02', 'SP03', 'SP04'], 'City' : ['Chennai', 'Madurai', 'Trichy', 'Coimbatore'], 'ZipCode' : [600001, 625001, 620001, 641001] } shop = pd.DataFrame(shop) shopproduct = { 'Shop_id' : ['SP01', 'SP02', 'SP02', 'SP03', 'SP03', 'SP03', 'SP05'], 'product_id' : ['p01', 'p02', 'p03', 'p01', 'p02', 'p03', 'p02'], 'price' : [220, 500, 145, 225, 510, 150, 505] } product = pd.DataFrame(product) product

‘left’ -> Collects all data from left dataframe and common data of left and right.

pd.merge(shop, product, how = 'left', on = 'Shop_id')

‘right’ -> Collects all data from right dataframe and common data of left and right.

pd.merge(shop, product, how = 'right', on = 'Shop_id')

‘inner’ -> Collects only the common data from both the left and right dataframes. This more like an intersection.

pd.merge(shop, product, how = 'inner', on = 'Shop_id')

‘outer’ -> Collects all from left and right including common.

pd.merge(shop, product, how = 'outer', on = 'Shop_id')

Excited to try this? Enjoy!

We will meet with a new tip in Python. Thank you! 👍

Like to support? Just click the heart icon ❤️.

Happy Programming!🎈.

Originally published at https://devskrol.com on October 25, 2020.

--

--