Pandas — DataFrames

The Primary Pandas Data Structure! It Is a Dict-Like Container for Series Objects— #PySeries#Episode 08

Jungletronics, Sep 9, 2020 (6 min read)

Hello, let’s see Pandas AGAIN!

This time, DataFrame!

Fig 1. Pandas is a fast, powerful, flexible, and easy to use open-source data analysis and manipulation tool,
built on top of the Python programming language.

Here are the topics for our study of Pandas:

Fig 2. Numpy & Pandas Together!

The second topic will be this one: DataFrames!

DATAFRAMES

The primary Pandas data structure!

Can be thought of as a dict-like container for Series objects.
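To make the "dict-like container" idea concrete, here is a minimal sketch. The two Series `w` and `x` and their values are hypothetical, made up only for this illustration:

```python
import pandas as pd

# Two hypothetical Series that share the same index labels.
w = pd.Series([1.0, 2.0, 3.0], index=['A', 'B', 'C'])
x = pd.Series([10.0, 20.0, 30.0], index=['A', 'B', 'C'])

# Build a DataFrame from a dict of Series: the keys become column names.
df = pd.DataFrame({'W': w, 'X': x})

print(list(df.keys()))      # column names, just like dict keys
print('W' in df)            # membership tests look at column names
for name, column in df.items():
    # Each value stored under a key is a Series.
    print(name, type(column).__name__)
```

So the column names behave like dictionary keys, and each value behind a key is a Series.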

And now the setup for creating our data:

Let's seed the random number generator, so your data is the same as mine (in case you want to follow along :)
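A small sketch of what seeding does. The seed value 101 is an assumption (the post does not state which value was used); any fixed integer gives the same effect:

```python
import numpy as np

np.random.seed(101)             # assumed seed value, any fixed integer works
first = np.random.randn(5, 4)   # 5 rows x 4 columns of random numbers

np.random.seed(101)             # re-seeding restarts the same sequence
second = np.random.randn(5, 4)

# Same seed, same numbers.
print(np.array_equal(first, second))   # prints True
```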

How To Create a DataFrame

For the purposes of our study, here is how:

pd.DataFrame(data, index, columns), where index holds the row labels and columns holds the column labels:
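Here is a sketch of that call. It assumes a 5x4 table of random numbers with row labels A to E and column labels W to Z, which is what the figures and later examples in this post suggest; the seed value is also an assumption:

```python
import numpy as np
import pandas as pd

np.random.seed(101)                      # assumed seed value
data = np.random.randn(5, 4)             # the values: 5 rows x 4 columns
row_labels = ['A', 'B', 'C', 'D', 'E']   # assumed row labels (index)
column_labels = ['W', 'X', 'Y', 'Z']     # assumed column labels

df = pd.DataFrame(data, index=row_labels, columns=column_labels)
print(df)
```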

Note: if you copy code from this post, you may need to retype the single quotes as plain ones ('), since curly quotes (‘ ’) are not valid in Python ;)

Now call the object:

Fig 3. Here is the table, which is much easier to read this way, right?

Each of these columns and rows is itself a Series!

INDEXING & SELECTION IN PANDAS

Using Brackets Notation:

Just pass in the column name, e.g. 'W':
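A minimal sketch of bracket selection, assuming the same 5x4 frame with rows A to E and columns W to Z (the seed value is an assumption):

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

w_column = df['W']   # bracket notation with a single column name
print(w_column)
```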

See what type of object df['W'] is:

See, 'W' is just a Series!

And The DataFrame itself?

The df itself is the DataFrame!
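The two type checks above can be reproduced with a sketch like this one, again assuming the 5x4 frame with rows A to E and columns W to Z:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

column_type = type(df['W'])   # a single column
frame_type = type(df)         # the whole table

print(column_type)
print(frame_type)
```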

Using Dot Notation (SQL-like):

Note: not recommended, because a column name can be confused with a real method or attribute of the df object!

So, always use bracket notation when it comes to retrieving a Series from df :)

Anyway, here you have it!
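A sketch of dot notation and of the pitfall mentioned in the note. The small `clash` frame with a column named 'size' is hypothetical, made up to show the name collision:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

# Dot notation returns the same Series as bracket notation here.
print(df.W.equals(df['W']))

# Hypothetical frame whose column name collides with a DataFrame attribute.
clash = pd.DataFrame({'size': [1, 2, 3]})
print(clash.size)      # the DataFrame attribute: total number of elements
print(clash['size'])   # bracket notation still reaches the column
```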

Getting Multiple Columns back!

Pass in a List, please!
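For instance, assuming the same 5x4 frame as before, note the double brackets: the outer pair is the selection, the inner pair is the list of column names.

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

subset = df[['W', 'Z']]   # a list of names returns a DataFrame, not a Series
print(subset)
```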

Fig 4. Running df[['W','Z']]: getting multiple columns back!

Creating a New Column

Just assign to a new column name, with some arithmetic on existing Series on the right side:
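A sketch of the assignment, assuming the same 5x4 frame with columns W to Z:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

# The addition is done row by row, matching on the index labels.
df['new'] = df['W'] + df['Y']
print(df)
```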

Fig 5. Running df['new'] = df['W'] + df['Y']: creating a new column!

Dropping Columns

By default, drop returns a new DataFrame and leaves the original untouched. Pandas requires that you say so explicitly, with inplace=True, if you really want to modify your data in place (affect the original DataFrame);

This safeguard keeps you from accidentally losing information;

In case you’ve done a bunch of adjustments to your data, you don’t want to accidentally lose it, right?

Think of inplace=True as something like ‘commit’ in a DB!
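A sketch contrasting the two behaviours, assuming the same frame with an extra 'new' column:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])
df['new'] = df['W'] + df['Y']

# Without inplace: a new DataFrame comes back, df keeps the column.
without_new = df.drop('new', axis=1)      # axis=1 means "columns"
still_there = 'new' in df.columns
print(still_there)

# With inplace=True: df itself is modified and nothing is returned.
result = df.drop('new', axis=1, inplace=True)
print(result)
print(list(df.columns))
```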

Fig 6. Running df.drop(‘new’, axis=1, inplace=True) — Dropping Columns!

Dropping Rows

This time I am not doing this in place!

Note: axis=0 is the default, so you don’t need to specify it here:)
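A sketch of dropping a row without touching the original, assuming the same frame with rows A to E:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

# axis=0 (rows) is the default, so it is written out only for clarity.
dropped_df = df.drop('E', axis=0)

print(list(dropped_df.index))   # row E is gone in the returned copy
print(list(df.index))           # the original df still has row E
```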

Fig 7. Running df.drop('E', axis=0): dropping without 'commit' :) Now you can work with the returned dropped_df object. If you specify inplace=True, drop returns None instead :/

See that our DataFrame has not been affected yet by the last drop! We didn’t make it in place, remember?

See, df isn’t affected yet!

Fig 8. Running df, showing the original DataFrame again!

Selecting Rows

There are two indexers:

  1. loc -> label-based index
  2. iloc -> integer position-based index

IT’S A LITTLE WEIRD HOW THESE ARE CALLED IN PANDAS:

THEY USE SQUARE BRACKETS, NOT PARENTHESES!

But that’s the way it works for Pandas!
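A sketch of selecting a row by its label with loc, assuming the same frame with rows A to E and columns W to Z:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

row_a = df.loc['A']   # square brackets, with the row label inside
print(row_a)          # a Series whose index is the column names
```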

Or alternatively, use iloc and pass the integer position of the row you want!
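A sketch of position-based selection with iloc, assuming the same frame. Positions start at 0, so position 2 is the third row (label C in this assumed frame):

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

row_by_position = df.iloc[2]   # third row, counted from 0
row_by_label = df.loc['C']     # the same row, selected by its label

print(row_by_position.equals(row_by_label))
```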

Returning a Single Value

Same idea as before, just locating a single cell: pass the row label and the column label to loc.
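A sketch of picking one cell. The choice of row 'B' and column 'Y' is only an example, on the same assumed frame:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

value = df.loc['B', 'Y']   # row label first, then column label
print(value)               # a single number, not a Series
```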

Returning a SUB-SET of the DataFrame

Just pass two lists of the rows and columns you want!
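A sketch of the subset selection on the same assumed frame, with the list of rows first and the list of columns second:

```python
import numpy as np
import pandas as pd

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(np.random.randn(5, 4),
                  index=['A', 'B', 'C', 'D', 'E'],
                  columns=['W', 'X', 'Y', 'Z'])

sub_df = df.loc[['A', 'B'], ['W', 'Y']]   # rows A and B, columns W and Y
print(sub_df)
```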

Fig 9. Running df.loc[[‘A’, ‘B’],[‘W’, ‘Y’]] — Creating a data sub-set!

And that’s it!

Ok, we’re going to stop here for now and continue the discussion in the next PySeries Episode!

Thank You for reading this post! Bye!

We’re gonna be alright. Live From home!

The code bundle for this episode is available at:

GitHub Repo link

Colab Link

Credits & References:

Jose Portilla — Python for Data Science and Machine Learning Bootcamp — Learn how to use NumPy, Pandas, Seaborn , Matplotlib , Plotly , Scikit-Learn , Machine Learning, Tensorflow , and more!

Posts Related:

00Episode#PySeries — Python — Jupiter Notebook Quick Start with VSCode — How to Set your Win10 Environment to use Jupiter Notebook

01Episode#PySeries — Python — Python 4 Engineers — Exercises! An overview of the Opportunities Offered by Python in Engineering!

02Episode#PySeries — Python — Geogebra Plus Linear Programming- We’ll Create a Geogebra program to help us with our linear programming

03Episode#PySeries — Python — Python 4 Engineers — More Exercises! — Another Round to Make Sure that Python is Really Amazing!

04Episode#PySeries — Python — Linear Regressions — The Basics — How to Understand Linear Regression Once and For All!

05Episode#PySeries — Python — NumPy Init & Python Review — A Crash Python Review & Initialization at Numpy lib.

06Episode#PySeries — Python — NumPy Arrays & Jupyter Notebook — Arithmetic Operations, Indexing & Selection, and Conditional Selection

07Episode#PySeries — Python — Pandas — Intro & Series — What it is? How to use it?

08Episode#PySeries — Python —Pandas DataFrames — The primary Pandas data structure! It is a dict-like container for Series objects (this one)

09Episode#PySeries — Python — Python 4 Engineers — Even More Exercises! — More Practicing Coding Questions in Python!

10Episode#PySeries — Python — Pandas — Hierarchical Index & Cross-section — Open your Colab notebook and here are the follow-up exercises!

11Episode#PySeries — Python — Pandas — Missing Data — Let’s Continue the Python Exercises — Filling & Dropping Missing Data

12Episode#PySeries — Python — Pandas — Group By — Grouping large amounts of data and compute operations on these groups

13Episode#PySeries — Python — Pandas — Merging, Joining & Concatenations — Facilities For Easily Combining Together Series or DataFrame

14Episode#PySeries — Python — Pandas — Pandas Dataframe Examples: Column Operations

15Episode#PySeries — Python — Python 4 Engineers — Keeping It In The Short-Term Memory — Test Yourself! Coding in Python, Again!

16Episode#PySeries — NumPy — NumPy Review, Again;) — Python Review Free Exercises

17Episode#PySeries — Generators in Python — Python Review Free Hints

18Episode#PySeries — Pandas Review…Again;) — Python Review Free Exercise

19Episode#PySeries — MatlibPlot & Seaborn Python Libs — Reviewing theses Plotting & Statistics Packs

20Episode#PySeries — Seaborn Python Review — Reviewing theses Plotting & Statistics Packs

31 Episode#PySeries — Pandas — DATAFRAMES — When should I use pandas DataFrame?#PySeries#Episode 31

😎 Gilberto Oliveira Jr | 🖥️ Computer Engineer | 🐍 Python | 🧩 C | 💎 Rails | 🤖 AI & IoT | ✍️