Learn Complete Pandas in a fun way!!! PART 1

Poornachandra Kashi
Published in GeekyNerds
2 min read · Oct 8, 2020

Hey everyone, it's been a long time since I last wrote an article. This will be the first article in a series of Data Science blogs.

So, let's get started.

What is Pandas?
It is a Python module used to work with tabular data.

More technically, it is a software library used for data manipulation and analysis. It offers data structures and operations for manipulating numerical tables and time series.

Importing Pandas Module

To use the pandas module in our project, we first need to import it.

import pandas

In most cases, though, people import it under the shorter alias pd:

import pandas as pd

Let’s Create a Dataframe

What is a Dataframe?
Generally, a dataframe is an object which stores data in rows and columns. We can create a dataframe by hand, or by extracting the data from files like Excel sheets and CSV (comma-separated values) files. In a dataframe each column has its own name, and a column can contain values of any datatype, like int, float, tuple, etc. A column name is usually a string, and each row has an index, which by default is an integer.

The one constraint a dataframe follows is that all of its columns must be of the same length.

Credits: Tutorialspoint

Code the Dataframe😊

Coding is the part we like best in computer science. So let's get our hands dirty with some code now.

There are two ways to create a dataframe.
The first is to add the elements column by column.

df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'address': ['123 Main St.', '456 Maple Ave.', '789 Broadway'],
    'age': [34, 28, 51]
})

Here the data is passed in the form of a dictionary. Each dictionary key is the name of a column, and the list stored under that key holds the values of that column.
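To see what pandas actually builds from that dictionary, here is a small sketch that rebuilds df1 and looks at its column names, row index and column datatypes. The last part is my own illustration, not something from the steps above: it passes columns of different lengths on purpose, to show the same-length constraint in action.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'address': ['123 Main St.', '456 Maple Ave.', '789 Broadway'],
    'age': [34, 28, 51]
})

print(df1.columns.tolist())  # ['name', 'address', 'age']
print(df1.index.tolist())    # [0, 1, 2] (the default integer index)
print(df1.dtypes)            # one datatype per column

# Illustration only: 'name' has 2 values but 'age' has 3,
# so pandas refuses to build the dataframe.
length_error = None
try:
    pd.DataFrame({
        'name': ['John Smith', 'Jane Doe'],
        'age': [34, 28, 51]
    })
except ValueError as err:
    length_error = err
    print('Could not create the dataframe:', err)
```

If you run it, the first dataframe is created without trouble, while the second attempt ends in a ValueError.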

The other way to create a dataframe is to use nested lists, where each inner list represents one row of data.

df2 = pd.DataFrame(
    [
        ['John Smith', '123 Main St.', 34],
        ['Jane Doe', '456 Maple Ave.', 28],
        ['Joe Schmo', '789 Broadway', 51]
    ],
    columns=['name', 'address', 'age']
)

We use the columns argument to pass the list of column names for the dataframe.
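Both ways should end up with the same table. As a quick check, this sketch builds the dataframe column by column and row by row, then compares the two with the equals method.

```python
import pandas as pd

# Way 1: a dictionary, one list per column
df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'address': ['123 Main St.', '456 Maple Ave.', '789 Broadway'],
    'age': [34, 28, 51]
})

# Way 2: nested lists, one inner list per row
df2 = pd.DataFrame(
    [
        ['John Smith', '123 Main St.', 34],
        ['Jane Doe', '456 Maple Ave.', 28],
        ['Joe Schmo', '789 Broadway', 51]
    ],
    columns=['name', 'address', 'age']
)

# equals compares the shape, the values and the column datatypes
same = df1.equals(df2)
print(same)  # True
```

Which way you pick mostly depends on how your data is already laid out: by column or by row.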

So far I have shown you how to create your own dataframe by adding the data manually. But in some cases you already have the data and want to use it in your project.
The data might be inside CSV files, where CSV stands for comma-separated values.

To load the data from a CSV file, do this:

data = pd.read_csv('file.csv')

Here the read_csv function is called, and file.csv is the file passed to it as an argument.
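If you don't have a CSV file at hand, you can still try read_csv. In this sketch the CSV content is made up for illustration, and io.StringIO stands in for a real file on disk so the example runs anywhere.

```python
import io
import pandas as pd

# Made-up CSV content: the first line is the header with the column names
csv_text = """name,address,age
John Smith,123 Main St.,34
Jane Doe,456 Maple Ave.,28
Joe Schmo,789 Broadway,51
"""

# read_csv accepts a file path or a file-like object such as StringIO
data = pd.read_csv(io.StringIO(csv_text))

print(data.head())   # the first rows of the dataframe
print(data.shape)    # (3, 3): 3 rows and 3 columns
```

The header line becomes the column names, and every following line becomes one row.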

We can also save the data to a CSV file by calling the method below:

data.to_csv('new_file.csv')
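One thing worth knowing when saving: by default to_csv also writes the row index as the first column. Here is a small sketch of a save-and-load round trip, with and without the index. The temporary folder and the two-column dataframe are just assumptions to keep the example self-contained.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'age': [34, 28, 51]
})

# A temporary folder, so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'new_file.csv')

    # Default: the row index is written as an unnamed first column
    df.to_csv(path)
    with_index = pd.read_csv(path)

    # index=False leaves the row index out of the file
    df.to_csv(path, index=False)
    without_index = pd.read_csv(path)

print(with_index.columns.tolist())     # ['Unnamed: 0', 'name', 'age']
print(without_index.columns.tolist())  # ['name', 'age']
```

So if you don't want that extra column when you load the file again, pass index=False while saving.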

To be continued in the next part...
