Pandas — Operations
Pandas Dataframe Examples: Column Operations — #PySeries#Episode 14
print("Hello Pandas operations!")
Preparing our Notebook:
import numpy as np
import pandas as pd
The DataFrame:
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})
df.head()
Operations in a DataFrame
Finding Unique Values:
df['col2'].unique()
array([444, 555, 666])
Another form, counting the unique values:
len(df['col2'].unique())
3
Yet another way:
df['col2'].nunique()
3
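As a quick check that the three forms above agree, here is a minimal sketch that rebuilds the same df and compares them. The variable names (unique_values, count_via_len, count_via_nunique) are only illustrative.

```python
import pandas as pd

# Same DataFrame as in this post
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

# NumPy array of the distinct values, in order of first appearance
unique_values = df['col2'].unique()

# Count the distinct values by hand...
count_via_len = len(unique_values)

# ...or let pandas count them directly
count_via_nunique = df['col2'].nunique()

print(unique_values)      # [444 555 666]
print(count_via_len)      # 3
print(count_via_nunique)  # 3
```

One difference worth knowing: nunique() skips missing values by default, while unique() keeps NaN as a value, so the two counts can differ on columns with missing data.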
Counting Values
df['col2'].value_counts()
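To make the result concrete, here is a small sketch of what value_counts returns for our column: a Series whose index holds the distinct values and whose values hold how often each one occurs. The names counts and shares are just for illustration.

```python
import pandas as pd

# Same DataFrame as in this post
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

# Occurrences of each distinct value, most frequent first
counts = df['col2'].value_counts()
print(counts)

# Look up how often a single value appears
print(counts.loc[444])  # 2

# normalize=True returns fractions instead of raw counts
shares = df['col2'].value_counts(normalize=True)
print(shares.loc[444])  # 0.5
```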
Conditional Selections
df
Returning a DataFrame where 'col1' is greater than two:
df[df['col1']>2]
Combining Conditional Selection
df[(df['col1']>2) & (df['col2']==444)]
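A short sketch of how the combined conditions behave on our df. Each condition needs its own parentheses because & and | bind more tightly than comparisons like > and ==, and Python's plain and/or do not work element-wise on a Series. The names both, either and negated are only illustrative.

```python
import pandas as pd

# Same DataFrame as in this post
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

# AND: both conditions must hold for a row to be kept
both = df[(df['col1'] > 2) & (df['col2'] == 444)]

# OR: at least one condition must hold
either = df[(df['col1'] > 2) | (df['col2'] == 444)]

# NOT: invert a condition with ~
negated = df[~(df['col1'] > 2)]

print(list(both.index))     # [3]
print(list(either.index))   # [0, 2, 3]
print(list(negated.index))  # [0, 1]
```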
Apply Method
One of the most powerful methods in our tool belt when using Pandas.
We can grab a column and call a built-in method on it:
df['col2'].sum()
2109
But we can also apply our own custom function:
def times2(x):
    return x * 2
We can then apply our function to each element in that column:
df['col2'].apply(times2)
Let's go ahead and do the same with a lambda expression.
This is one of the most flexible features in Pandas: the ability to apply a custom lambda expression to every element, without defining a named function first:
df['col2'].apply(lambda x: x * 2)
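Here is a minimal sketch comparing the named function, the lambda, and plain arithmetic on the column. For something as simple as doubling, the arithmetic form gives the same result. Applying the built-in len to 'col3' is an extra illustration that is not part of the steps above.

```python
import pandas as pd

# Same DataFrame as in this post
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

def times2(x):
    """Double a single value."""
    return x * 2

# Three ways to double every element of col2
doubled_named = df['col2'].apply(times2)
doubled_lambda = df['col2'].apply(lambda x: x * 2)
doubled_arithmetic = df['col2'] * 2

print(doubled_named.tolist())  # [888, 1110, 1332, 888]

# apply also accepts built-in functions, here the length of each string
lengths = df['col3'].apply(len)
print(lengths.tolist())        # [3, 3, 3, 3]
```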
Removing Columns
df
drop returns a new DataFrame. If you want the change to happen in place, you have to specify inplace=True. Without it, df itself keeps 'col1':
df.drop('col1', axis=1)
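The difference between the default behavior and inplace=True can be sketched as follows. The names without_col1 and df_copy are only illustrative, and the copy is there so that the original df stays intact for the comparison.

```python
import pandas as pd

# Same DataFrame as in this post
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

# Default: drop returns a new DataFrame and leaves df untouched
without_col1 = df.drop('col1', axis=1)
print(list(without_col1.columns))  # ['col2', 'col3']
print(list(df.columns))            # ['col1', 'col2', 'col3']

# inplace=True modifies the object itself and returns None
df_copy = df.copy()
df_copy.drop('col1', axis=1, inplace=True)
print(list(df_copy.columns))       # ['col2', 'col3']

# The columns keyword is an alternative to axis=1
same_result = df.drop(columns='col1')
```

Reassigning the result (df = df.drop('col1', axis=1)) is a common alternative to inplace=True.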
Returning the Column Names & Index Attributes
Columns:
df.columns
Index(['col1', 'col2', 'col3'], dtype='object')
Index:
df.index
RangeIndex(start=0, stop=4, step=1)
Sorting & Ordering a DataFrame
Just pass in the column we want to sort by:
df
df.sort_values('col2')
df.sort_values(by='col2')
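A few variations on sort_values, sketched on the same df. Note that the rows keep their original index labels after sorting, which is why the index looks shuffled in the output. The variable names are only illustrative.

```python
import pandas as pd

# Same DataFrame as in this post
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

# Largest values first
descending = df.sort_values(by='col2', ascending=False)
print(descending['col2'].tolist())  # [666, 555, 444, 444]

# Sort by col2, breaking ties with col1
by_two_columns = df.sort_values(by=['col2', 'col1'])
print(list(by_two_columns.index))   # [0, 3, 1, 2]

# reset_index(drop=True) renumbers the rows from 0 after sorting
renumbered = descending.reset_index(drop=True)
print(list(renumbered.index))       # [0, 1, 2, 3]
```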
Booleans: Checking for Null Values
df.isnull()
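Our df has no missing values, so isnull() returns False everywhere. To see the method do something, here is a sketch with a small hypothetical DataFrame (df_missing, not part of this post's data) that does contain gaps.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column
df_missing = pd.DataFrame({'a': [1, np.nan, 3],
                           'b': ['x', None, 'z']})

# Boolean DataFrame of the same shape: True where a value is missing
mask = df_missing.isnull()
print(mask)

# True counts as 1, so summing gives the number of missing values per column
missing_per_column = mask.sum()
print(missing_per_column)
```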
Pivot Tables
The pivot table takes simple column-wise data as input and groups the entries into a two-dimensional table that provides a **multidimensional summarization of the data**.
As we build up the pivot table, I think it’s easiest to take it one step at a time. Add items and check each step to verify you are getting the results you expect. Don’t be afraid to play with the order and the variables to see what presentation makes the most sense for your needs.
data = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                     'B': ['one', 'one', 'two', 'two', 'one', 'one'],
                     'C': ['x', 'y', 'x', 'y', 'x', 'y'],
                     'D': [1, 3, 2, 5, 4, 1]})
# data
df = pd.DataFrame(data)
df
The pivot_table method takes three main arguments (values, index, and columns):
df.pivot_table(values='D', index=['A','B'], columns='C')
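To check the result step by step, here is a sketch that builds the same pivot table and inspects a few cells. Combinations that never occur in the data, such as ('bar', 'two') with 'x', come out as NaN. By default pivot_table aggregates with the mean. The second table, with aggfunc='sum' and fill_value=0, is an extra variation and not part of the steps above.

```python
import pandas as pd

# Same data as in this post
data = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                     'B': ['one', 'one', 'two', 'two', 'one', 'one'],
                     'C': ['x', 'y', 'x', 'y', 'x', 'y'],
                     'D': [1, 3, 2, 5, 4, 1]})

# Rows are (A, B) pairs, columns are the values of C, cells hold the mean of D
table = data.pivot_table(values='D', index=['A', 'B'], columns='C')
print(table)

# A combination that exists in the data
print(table.loc[('bar', 'one'), 'x'])           # 4.0

# A combination that does not exist becomes NaN
print(pd.isna(table.loc[('bar', 'two'), 'x']))  # True

# Variation: sum D per A and C, filling empty cells with 0
summed = data.pivot_table(values='D', index='A', columns='C',
                          aggfunc='sum', fill_value=0)
print(summed)
```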
print("Thanks everyone! See you in the Next Pandas Lecture o/")
Colab File link:)
Credits & References:
Jose Portilla, Python for Data Science and Machine Learning Bootcamp: Learn how to use NumPy, Pandas, Seaborn, Matplotlib, Plotly, Scikit-Learn, Machine Learning, Tensorflow, and more!