Basic functions of Python Pandas that will help you in data science

Photo by Xtina Yu on Unsplash

For the past couple of weeks, I’ve been working on data cleaning. It is an easy task if you only need to clean a few rows of data. It becomes hideous if you have to clean a thousand rows manually. That is where the great Pandas comes to the rescue.

If you are wondering how such a cute, fluffy animal can help with this problem: it is not a real panda, but a Python package called Pandas. You can read more about what Python is here. Python has a lot of packages, and Pandas is one of them. This package focuses on data analysis. It works like Excel, but in Python.

Okay, let’s look at how to use Python Pandas. The first thing you need to do is install Python on your computer. You can read how to install Python here. You can check that Python is installed correctly by typing this in your terminal:

python

Your terminal should show something like this:

The next step is to install Pandas. You can do that by simply typing this in your terminal:

pip install pandas

Now you should be ready to play with Python Pandas. To start using it, open your terminal and type:

python

Same as before, you should see something like this:

Then import the Pandas package by typing:

import pandas as pd
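If the import finishes without an error, Pandas is installed. As a small optional check (an addition to the steps above, not something the rest of the tutorial depends on), you can print the installed version:

```python
import pandas as pd

# __version__ is a plain string; the exact value depends on your installation
print(pd.__version__)
```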

If you want to read the official documentation of Python Pandas, you can click this link. Here are some of its basic functions:

1. Read CSV

You can download the CSV file here. Then go to the folder that contains the CSV file in your terminal, run python, import pandas again, and type:

dataframe = pd.read_csv('file_name.csv')

This loads your CSV data into the dataframe variable. You can look at your data with these commands:

# show all rows
dataframe
# show only the top 5 rows
dataframe.head()
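To try this without downloading anything, here is a minimal sketch that reads CSV text from memory instead of a file. The column names and values are made up for illustration; with a real file you would pass its path to read_csv as shown above.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for file_name.csv
csv_text = """name,age,city
Ana,31,Jakarta
Budi,25,Bandung
Citra,42,Surabaya
Dewi,19,Medan
Eko,37,Semarang
Fitri,28,Yogyakarta
"""

# read_csv accepts a file path or any file-like object
dataframe = pd.read_csv(io.StringIO(csv_text))

print(dataframe.shape)    # (number of rows, number of columns)
print(dataframe.head())   # first 5 rows by default
print(dataframe.head(2))  # first 2 rows
```

Passing a number to head() is handy when five rows are more than you need.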

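You can select which rows you want to see with .loc, which looks rows up by their index label. With the default index, labels start at 0:

# show only the first row (index label 0)
dataframe.loc[0]
# show a range of rows (index labels 1 and 2)
dataframe.loc[range(1,3)]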
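Keep in mind that .loc selects by index label, so with the default index the first row has label 0. Pandas also offers .iloc, which selects by position. A minimal sketch of the difference, using a small made-up dataframe:

```python
import pandas as pd

# Small hypothetical dataframe with the default index 0, 1, 2, 3
dataframe = pd.DataFrame({"letter": ["a", "b", "c", "d"]})

first_by_label = dataframe.loc[0]      # row whose index label is 0
first_by_position = dataframe.iloc[0]  # row at position 0
label_slice = dataframe.loc[1:2]       # label slice includes both ends: labels 1 and 2
position_slice = dataframe.iloc[1:3]   # position slice excludes the end: positions 1 and 2

print(first_by_label["letter"])
print(label_slice)
```

With the default index both approaches return the same rows; they start to differ once the index is no longer 0, 1, 2, and so on, for example after filtering.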

2. Write CSV

dataframe.to_csv('new_file_name.csv', index=False)

This writes the dataframe to new_file_name.csv. Passing index=False leaves the row index out of the file.
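To see what actually ends up in the file, here is a minimal sketch that writes a small made-up dataframe to a temporary folder and reads it back. The temporary folder is only there so the example cleans up after itself; in practice you would write to a normal path.

```python
import os
import tempfile
import pandas as pd

# Hypothetical data for illustration
dataframe = pd.DataFrame({"name": ["Ana", "Budi"], "age": [31, 25]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "new_file_name.csv")
    # index=False keeps the row index out of the file
    dataframe.to_csv(path, index=False)
    with open(path) as f:
        written_text = f.read()
    reloaded = pd.read_csv(path)

print(written_text)  # raw file content, header line first
print(reloaded)
```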

3. Data Manipulation

This is the part that helps me most with data cleaning. For this example, we are not using the CSV data; we are going to create a dataframe ourselves. To do that, type this command:

new_dataframe = pd.DataFrame({"integer_col": [1,2,3,4,5], "string_col": ['hello', 'my', 'beautiful', 'world', '!'], "float_col": [0.1, 0.2, 3.3, 4.5, 52.2348], "boolean_col": [True, False, True, True, False]})

After that, you should get this dataframe

If you want to get only the rows where boolean_col is True, you can use this command:

new_dataframe.loc[(new_dataframe.boolean_col == True)]

And you will get this result
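As a quick check, the filter can be run on a freshly built copy of the dataframe. This is a minimal sketch that also shows a slightly shorter form of the same filter: since boolean_col already holds True/False values, it can be used directly as the mask.

```python
import pandas as pd

new_dataframe = pd.DataFrame({
    "integer_col": [1, 2, 3, 4, 5],
    "string_col": ["hello", "my", "beautiful", "world", "!"],
    "float_col": [0.1, 0.2, 3.3, 4.5, 52.2348],
    "boolean_col": [True, False, True, True, False],
})

# A boolean column works as the mask on its own; == True is equivalent here
only_true = new_dataframe.loc[new_dataframe.boolean_col]

print(only_true)
print(only_true.index.tolist())  # index labels of the rows that were kept
```

The kept rows hold on to their original index labels, which is why .loc lookups by label still work on the filtered result.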

If you want to change the boolean_col value of the row with index 4 (the fifth row) to True, you can simply do this:

new_dataframe.loc[4, 'boolean_col'] = True

Now, when you filter for rows where boolean_col is True, you will get the row with index 4 as well. You can also change multiple rows at once. Example:

new_dataframe.loc[(new_dataframe.boolean_col == True), 'string_col'] = "okay fine :("

After that, your dataframe will look like this:
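The two updates can be replayed end to end on a fresh dataframe. This is a minimal sketch that repeats the single-cell change and then the multi-row change, so the final content of string_col can be inspected:

```python
import pandas as pd

new_dataframe = pd.DataFrame({
    "integer_col": [1, 2, 3, 4, 5],
    "string_col": ["hello", "my", "beautiful", "world", "!"],
    "float_col": [0.1, 0.2, 3.3, 4.5, 52.2348],
    "boolean_col": [True, False, True, True, False],
})

# Single cell: row with index label 4, column boolean_col
new_dataframe.loc[4, "boolean_col"] = True

# Multiple rows: every row where boolean_col is True gets the new string
new_dataframe.loc[new_dataframe.boolean_col == True, "string_col"] = "okay fine :("

print(new_dataframe)
```

Only the row with index 1 keeps its original string, because it is the only row where boolean_col is still False.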

If you want to set a null value, you need to import NumPy:

import numpy as np

After that you can set the null value by doing this:

new_dataframe.loc[4, 'boolean_col'] = np.nan
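Once a value is null, you will usually want to find it again. Here is a minimal sketch on a small made-up dataframe; isna() and dropna() are standard Pandas methods, but they go a little beyond the steps above, so treat this as an optional extra.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration
new_dataframe = pd.DataFrame({
    "integer_col": [1, 2, 3],
    "float_col": [0.1, 0.2, 3.3],
})

# Set one cell to null
new_dataframe.loc[2, "float_col"] = np.nan

# isna() marks null cells with True
missing_mask = new_dataframe["float_col"].isna()
print(missing_mask.tolist())

# dropna() returns a copy without the rows that contain nulls
print(new_dataframe.dropna())
```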

There are still many other interesting features, but this should be enough to clean a basic CSV file. I hope this tutorial helps you.

Saddhana Arta Daniswara
