5 Pandas Fundamentals | Python
The Pandas library, built on top of the Python language, is arguably one of the most powerful tools currently available for data manipulation and analysis. A good understanding of this library is therefore a step in the right direction toward becoming a better data scientist, statistician, or analyst.
In this article we will cover 5 essentials in Pandas.
1. Merging / Joining
Joining two or more DataFrames (tables) in Pandas is done using the merge function.
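As a minimal sketch of what such a join looks like, the snippet below merges two small tables on a shared column. The table names (customers, orders), the column names, and the values are made up for illustration and are not from any real dataset.

```python
import pandas as pd

# Hypothetical example tables; names and values are invented for illustration.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Ann", "Bob", "Cara"]}
)
orders = pd.DataFrame(
    {"customer_id": [1, 1, 3, 4], "amount": [25.0, 40.0, 15.5, 60.0]}
)

# Join the two tables on their shared column.
# With no "how" argument, merge performs an inner join.
merged = pd.merge(customers, orders, on="customer_id")
print(merged)
```

Only customers 1 and 3 appear in both tables, so the result keeps their rows and drops customer 2 (no orders) and the order for customer 4 (no matching customer).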
pandas.merge (Documentation):
pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None) → DataFrame
Let's go through some of the parameters:
- left and right: The left and right tables you want to merge, respectively.
- how: One of {'left', 'right', 'outer', 'inner'}; the default is 'inner'.
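To make these parameters concrete, here is a small sketch that merges the same two tables once per "how" option and records how many rows each result has. The tables, the key column, and the row_counts dictionary are hypothetical names chosen only for this example.

```python
import pandas as pd

# Hypothetical tables: keys "b" and "c" appear in both, "a" only on the left,
# "d" only on the right.
left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

row_counts = {}
for how in ["inner", "left", "right", "outer"]:
    # Same inputs each time; only the join type changes.
    result = pd.merge(left, right, how=how, on="key")
    row_counts[how] = len(result)
    print(f"how={how!r}")
    print(result, end="\n\n")
```

Rows that have no match on the other side are kept or dropped depending on the option, and the kept ones are filled with NaN in the columns coming from the other table.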
Analogous to the SQL JOIN operator, the “how” parameter provides four types of joins:
— Inner : Returns records that have matching values in both left and right tables.