Python DataFrame slicing in the easiest way (How to find a company from 5000 companies)

wsh
Python Data Analysis
5 min read · Aug 14, 2021

Python Data Analysis Basics (PDA3)

Python Financial Analysis | Home

DataFrame is one of the most common data types you see in data science. A DataFrame represents a table like the one you see in Excel. The existence of DataFrame is a major reason why we want to do data analysis in Python instead of in Excel. It’s super convenient, has a lot of functions, and is easy to use.

The moment you feel it’s more convenient than Excel is when you access specific columns or cells. Imagine finding one cell in a table that contains thousands of columns and millions of rows. It might not even be possible to open such a file with Excel on a cheap computer. Tables of that size are not rare in practical data science. But with DataFrame, it’s just a single line of code!

This story is about DataFrame slicing. Slicing in programming means extracting the data you want. Here are some examples of slicing.

You can download the data set from this link
https://drive.google.com/drive/folders/1Ux2u1s5mctYiywS08sv7_3_PbnWd8v0G?usp=sharing

Examples of DataFrame slicing

0. Prerequisites

We first need to import packages and read the dataset “meta.csv”. The CSV file is available from the link above. If you don’t know what a DataFrame is in the first place, I believe this story will help you (some of the content overlaps):
#2 Handling table like data in Python with DataFrame (Python Financial Analysis)
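As a minimal sketch of this step, the snippet below imports pandas and reads a CSV into a DataFrame. To keep it runnable without the download, it reads a tiny stand-in table from a string. Only the column names “ticker” and “market_cap” and the ticker “AAPL” come from this story; the other tickers and all numbers are made-up placeholders.

```python
import io

import pandas as pd

# Stand-in for "meta.csv" so the sketch runs without the download.
# "AAA" and "BBB" are hypothetical tickers and the numbers are placeholders.
csv_text = """ticker,market_cap
AAPL,300.0
AAA,200.0
BBB,100.0
"""

# With the real file, this line would be: meta = pd.read_csv("meta.csv")
meta = pd.read_csv(io.StringIO(csv_text))

print(type(meta))  # the result of read_csv is a DataFrame
```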

The full Python code is available at the end

1. Let’s see a DataFrame object first!

This code just displays the content of the table “meta.csv”. As you can also see in Excel, it has several columns like “ticker”. Each row holds the information of one company. Because the number of rows is over 5000, it does not seem easy to find a specific company in Excel (I admit I don’t know much about Excel). The integers in the first column are called the “index” in Python. The second image is an example of console output.
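Displaying a DataFrame could look roughly like this. The table here is a small hypothetical stand-in for “meta.csv” (placeholder tickers and numbers, apart from “AAPL”), so the output is much shorter than the real one.

```python
import pandas as pd

# Hypothetical stand-in for the table read from "meta.csv".
meta = pd.DataFrame({
    "ticker": ["AAPL", "AAA", "BBB"],
    "market_cap": [300.0, 200.0, 100.0],  # placeholder values
})

print(meta)              # the whole table; pandas abbreviates very long tables
print(meta.shape)        # (number of rows, number of columns)
print(list(meta.index))  # the integer index shown in the first column
```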

2. Get a column

If we want to make a list of all tickers, or calculate the average market cap, we need to extract just one column from the table. That kind of operation is quite common in data science. The example code extracts the column “ticker” and prints it to the console.
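Here is a small sketch of extracting one column, again on a hypothetical stand-in table with placeholder values. Selecting a single column with square brackets gives back a pandas Series.

```python
import pandas as pd

# Hypothetical stand-in for the table read from "meta.csv".
meta = pd.DataFrame({
    "ticker": ["AAPL", "AAA", "BBB"],
    "market_cap": [300.0, 200.0, 100.0],  # placeholder values
})

tickers = meta["ticker"]                 # one column, returned as a Series
ticker_list = tickers.tolist()           # plain Python list of all tickers
average_cap = meta["market_cap"].mean()  # average of the market cap column

print(tickers)
print(ticker_list)
print(average_cap)
```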

3. Get multiple columns

If we want just the pairs of tickers and market caps of companies, we pass the list of column names [“ticker”, “market_cap”] to the DataFrame “meta”.
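A sketch of selecting several columns at once follows. The extra “sector” column and all values are assumptions added only to show that the unselected columns are dropped. Note the double brackets: the outer pair is the selection, the inner pair is the list of column names.

```python
import pandas as pd

# Hypothetical stand-in table; "sector" and all values are illustrative only.
meta = pd.DataFrame({
    "ticker": ["AAPL", "AAA", "BBB"],
    "sector": ["Tech", "Energy", "Retail"],
    "market_cap": [300.0, 200.0, 100.0],
})

# Passing a list of column names returns a smaller DataFrame.
pairs = meta[["ticker", "market_cap"]]

print(pairs)
```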

4. Get the 10th row

Let’s say that you were interested in a specific company when you saw the table, and wanted to access its exact information. The “iloc[]” property of DataFrame can do that for you. We just give the row number. The example extracts the row at position 10. Note that indexing starts from 0 in Python, so position 10 is the 11th row when counting from one.
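The following sketch shows “iloc[]” with a single row number. The 20-row table is generated with hypothetical tickers “T00” to “T19” and placeholder market caps, so it is easy to see which row comes back.

```python
import pandas as pd

# Hypothetical 20-row stand-in table: tickers T00..T19 with placeholder values.
meta = pd.DataFrame({
    "ticker": [f"T{i:02d}" for i in range(20)],
    "market_cap": [float(i) for i in range(20)],
})

# Positions start at 0, so position 10 is the 11th row of the table.
row = meta.iloc[10]

print(row)            # a single row comes back as a Series
print(row["ticker"])  # fields of the row are accessed by column name
```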

5. Get rows of index from 10 to 15

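You can also extract a range of rows. Just give the start row number and the end row number with a colon “:” between them. The example selects rows of index from 10 to 15. Note that “iloc[]” excludes the end position, so “iloc[10:15]” stops at row 14 and “iloc[10:16]” is needed to include row 15.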
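A sketch of range slicing, using the same kind of hypothetical 20-row table. It also shows the label-based “loc[]”, which includes the end label, as an alternative when the index is the default 0, 1, 2, and so on.

```python
import pandas as pd

# Hypothetical 20-row stand-in table: tickers T00..T19 with placeholder values.
meta = pd.DataFrame({
    "ticker": [f"T{i:02d}" for i in range(20)],
    "market_cap": [float(i) for i in range(20)],
})

rows_10_to_14 = meta.iloc[10:15]  # end position 15 is excluded
rows_10_to_15 = meta.iloc[10:16]  # end at 16 to include position 15
by_label = meta.loc[10:15]        # loc slices by label and includes the end

print(rows_10_to_15)
```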

6. Get a row whose ticker is “AAPL” (Apple)

The next example is more practical. If you know a company’s ticker, you can find it in the table of 5000 rows with just one line of code.
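One common way to do this in pandas is boolean indexing, sketched below on a hypothetical stand-in table with placeholder values. The comparison produces one True/False value per row, and passing that mask to the DataFrame keeps only the rows marked True.

```python
import pandas as pd

# Hypothetical stand-in for the table read from "meta.csv".
meta = pd.DataFrame({
    "ticker": ["AAA", "AAPL", "BBB"],
    "market_cap": [200.0, 300.0, 100.0],  # placeholder values
})

mask = meta["ticker"] == "AAPL"  # Series of True/False, one per row
apple = meta[mask]               # DataFrame with only the matching rows

# The same thing in one line:
apple_one_line = meta[meta["ticker"] == "AAPL"]

print(apple)
```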

7. Get “market_cap” of a company whose ticker is “AAPL” (Apple)

The examples we’ve seen so far are about searching over the rows. But we can combine two conditions. This example extracts the market cap of Apple. We first specify that the row must have the ticker “AAPL”. After that, we add another condition: the result should contain just the one column “market_cap”.

But you can see that this returns a “Series” object. What we want is a float value. So, we access the first entry of the Series with “iloc[0]”. The final result is the second image.
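The two steps can be sketched as follows, on a hypothetical stand-in table with placeholder values. This version uses “loc[]” with a row mask and a column name, which is one possible way to write it and not necessarily the exact form used in the original code.

```python
import pandas as pd

# Hypothetical stand-in for the table read from "meta.csv".
meta = pd.DataFrame({
    "ticker": ["AAA", "AAPL", "BBB"],
    "market_cap": [200.0, 300.0, 100.0],  # placeholder values
})

# Row condition and column name together: the result is still a Series.
caps = meta.loc[meta["ticker"] == "AAPL", "market_cap"]

# Take the first (and here only) entry to get a single number.
apple_cap = float(caps.iloc[0])

print(caps)
print(apple_cap)
```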

Full Python code

The dataset “meta.csv” is available from the link above.

Related articles

#2 Handling table like data in Python with DataFrame (Python Financial Analysis)

List of articles

1. Python Financial Analysis

1 Read fundamental data from a CSV in Python
2 Handling table like data in Python with DataFrame
3 Make graphs of stock price in Python
4.1 Make custom market index — prerequisites
4.2 Make custom market index — make your own index
4.3 Make custom market index — market cap based index
5.1 Analyze COVID-19 Impacts by Sector in Python — compare weighted average prices
5.2 Analyze COVID-19 Impacts by Market Caps in Python — compare weighted average prices
5.3 Find companies that lost or gained from the COVID19 pandemic

2. Python Data Analysis Basics (easiest ways)

Python “datetime” in the easiest way (how to handle dates in data science with Python)
Python DataFrame slicing in the easiest way (How to find a company from 5000 companies)

Other Links

Python Financial Analysis | Home
Python Data Analysis | Home
