2 Easy Steps to Extract Numeric Columns in Python

Shashanka Shekhar
Python in Plain English
3 min read · Feb 23, 2024


Python is a high-level, general-purpose and interpreted programming language. It is known for its ease of use, powerful standard library and dynamic semantics. Python is widely used in various sectors including machine learning, artificial intelligence, data analysis, web development and many more. Its simple, easy-to-learn syntax emphasizes readability and therefore reduces the cost of program maintenance.

Photo by Eusebiu Soica on Unsplash

The problem we will be solving

A snapshot of our data

This is the Big Mart Sales dataset, which contains the sales of each product at a particular BigMart outlet. Each product has a number of attributes that affect the sales it generates. The shape of the data is (8523, 12).

There are a total of 8523 rows and 12 columns

BMS is the DataFrame storing our data; BMS.info() lists its 12 columns

Here we want to extract all the numeric columns in our DataFrame. In this dataset, the numeric columns are those whose Dtype is int64 or float64.
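To see the Dtype of each column directly, a small sketch like the one below may help. It uses a tiny stand-in for BMS, not the real data: the values are made up, and the two text columns (Item_Identifier and Outlet_Size) are names assumed here for illustration.

```python
import pandas as pd

# Toy stand-in for BMS: values are made up, and the two text columns
# (Item_Identifier, Outlet_Size) are assumed names for illustration.
BMS = pd.DataFrame({
    "Item_Identifier": ["A1", "B2", "C3"],
    "Item_Weight": [9.3, 5.92, 17.5],
    "Outlet_Establishment_Year": [1999, 2009, 1999],
    "Outlet_Size": ["Medium", "Small", "High"],
})

# One dtype per column; this is the Dtype that BMS.info() also reports
print(BMS.dtypes)

# A single column's dtype compares equal to its string name
is_float = BMS["Item_Weight"].dtype == "float64"
is_int = BMS["Outlet_Establishment_Year"].dtype == "int64"
print(is_float, is_int)
```

The comparison against a string such as "int64" is what the rest of this article relies on.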

1. Using list comprehension to extract numeric columns:

num_cols = [cols for cols in BMS.columns if BMS[cols].dtype in ['int64', 'float64']]

We will use the list comprehension above to extract the numeric columns. We could have used a loop to do the same, but list comprehensions are concise, easy to use and get the job done in a single, readable line of code.
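For comparison, here is roughly what the loop version would look like. This is a sketch on a small made-up DataFrame standing in for BMS (the column Outlet_Size and all values are assumptions for illustration); both approaches should give the same list.

```python
import pandas as pd

# Toy stand-in for BMS with made-up values
BMS = pd.DataFrame({
    "Item_Weight": [9.3, 5.92, 17.5],
    "Outlet_Size": ["Medium", "Small", "High"],
    "Outlet_Establishment_Year": [1999, 2009, 1999],
})

# Loop version: start with an empty list and append matching column names
num_cols_loop = []
for cols in BMS.columns:
    if BMS[cols].dtype in ["int64", "float64"]:
        num_cols_loop.append(cols)

# List comprehension version: the same logic in one line
num_cols = [cols for cols in BMS.columns if BMS[cols].dtype in ["int64", "float64"]]

print(num_cols_loop)
print(num_cols)
```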

Let’s focus on different portions of the code one by one:

  1. for cols in BMS.columns: here BMS.columns gives us all the column names of the BMS DataFrame. The for clause needs a loop variable, which here is cols, so cols is assigned each column name of BMS in turn.
  2. if BMS[cols].dtype in ['int64', 'float64']: for each column name assigned to cols, BMS[cols].dtype gives the data type of that column. If it is in ['int64', 'float64'], the condition is True and the column name held in cols is added to the list num_cols.
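One thing to keep in mind is that this condition matches only int64 and float64, so a column stored as, say, int32 would be skipped. pandas also has a built-in method, DataFrame.select_dtypes, which is a different technique from the list comprehension used in this article and selects every numeric dtype when given include="number". The sketch below compares the two on a made-up DataFrame where one column is deliberately stored as int32 (an assumption for illustration; the real BMS data has no such column according to BMS.info()).

```python
import pandas as pd

# Toy stand-in for BMS; the int32 column is a deliberate, hypothetical case
BMS = pd.DataFrame({
    "Item_Weight": [9.3, 5.92, 17.5],
    "Outlet_Establishment_Year": pd.Series([1999, 2009, 1999], dtype="int32"),
    "Outlet_Size": ["Medium", "Small", "High"],
})

# The article's approach: only exact int64 / float64 matches are kept
strict = [cols for cols in BMS.columns if BMS[cols].dtype in ["int64", "float64"]]

# Built-in alternative: keeps every numeric dtype, including int32
broad = BMS.select_dtypes(include="number").columns.tolist()

print(strict)
print(broad)
```

For the Big Mart data both approaches should return the same columns, since its numeric columns are all int64 or float64.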

2. Checking the contents of the num_cols list:

Using the print function to display num_cols

We call print(num_cols) to display the output. As you can see in the image above, the columns in num_cols are 'Item_Weight', 'Item_Visibility', 'Item_MRP', 'Outlet_Establishment_Year' and 'Item_Outlet_Sales'. These are all int64 or float64 columns, which can be confirmed against the output of BMS.info().
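Instead of comparing against BMS.info() by eye, the check can also be done in code. The following is a minimal sketch on a made-up stand-in for BMS (Item_Identifier and all values are assumptions for illustration). It also shows that the list of names can be used to pull out just the numeric columns.

```python
import pandas as pd

# Toy stand-in for BMS with made-up values
BMS = pd.DataFrame({
    "Item_Identifier": ["A1", "B2", "C3"],
    "Item_Weight": [9.3, 5.92, 17.5],
    "Item_MRP": [249.81, 48.27, 141.62],
    "Outlet_Establishment_Year": [1999, 2009, 1999],
})

num_cols = [cols for cols in BMS.columns if BMS[cols].dtype in ["int64", "float64"]]
print(num_cols)

# Cross-check: look up the dtype of each selected column
print(BMS[num_cols].dtypes)

# The list of names can also be used to select only those columns
numeric_part = BMS[num_cols]
print(numeric_part.shape)
```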

Similarly, we can extract the categorical columns from the DataFrame too. Give it a try.
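If you want to check your attempt, one possible way is to flip the condition. This sketch assumes that "categorical" means every column that is not int64 or float64, and it runs on a made-up stand-in for BMS (Item_Identifier, Outlet_Size and all values are assumptions for illustration).

```python
import pandas as pd

# Toy stand-in for BMS with made-up values
BMS = pd.DataFrame({
    "Item_Identifier": ["A1", "B2", "C3"],
    "Item_Weight": [9.3, 5.92, 17.5],
    "Outlet_Size": ["Medium", "Small", "High"],
    "Outlet_Establishment_Year": [1999, 2009, 1999],
})

# Same list comprehension as before, with "in" changed to "not in"
cat_cols = [cols for cols in BMS.columns if BMS[cols].dtype not in ["int64", "float64"]]
print(cat_cols)
```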



