Basic Data Operations — Part 2
The most basic data operations which we use in day to day basis for data wrangling, in comparison with SQL query for the same operation. This article mainly focuses on condition based filtering in Python. For more you could refer python documentation. Enjoy reading!
Let us consider image given below as reference dataset. Referring this as dataset1
Filtering data based on single column condition
SQL Query:
Select * from dataset1 where Country IN ('Canada')
OR
Select * from dataset1 where Country = 'Canada'
In SQL, for exact value match use =
sign and IN
operator. For partial value match use like
function.
SQL Output:
Python Code [Approach -1]:
Approach 1 code will use isin
function to filter out the data. We have used two steps, firstly create boolean variable satisfying the filtering condition then use boolean variable to filter rows
For more information, refer this (Source)
country_list = ['Canada']
dataset1[dataset1['Country'].isin(country_list)]
Python Code [Approach -2]:
Approach 2 code will use loc
and isin
function to filter out the data.
For more information, refer this (Source)
country_list = ['Canada']
dataset1.loc[dataset1['Country'].isin(country_list)]
In Python, for exact value match use ==
sign and isin
function. For partial value match use functions like startswith
, endswith
and other funcitons.
Python Output:
Filtering data based on multiple column conditions
For multiple column based conditions, separate the conditions with AND/OR based on requirement in both SQL and Python. For instance, we need to find all rows in dataset1
that belongs to country Cananda
and product name ends with nana
SQL Query Example:
Select * from dataset1 where Country IN ('Canada') AND Product like '%nana'
Python Code Example using Approach 1:
country_list = ['Canada']
dataset1[dataset1['Country'].isin(country_list) and dataset1['Product'].str.endswith('nana')]
Python Code Example using Approach 2:
country_list = ['Canada']
dataset1.loc[dataset1['Country'].isin(country_list) and dataset1['Product'].str.endswith('nana')]
Output:
Thanks for reading!
Basic Data Operations — Part 3, Coming Soon.
Stay Connected 😀