Pandas Query for SQL-like Querying

A data scientist’s Python tutorial for querying dataframes with the pandas query function

Matt Przybyla
Towards Data Science
4 min read · May 8, 2020

Photo by Mélody P on Unsplash [1]

Table of Contents

  1. Dataset
  2. Pandas
  3. Query
  4. Tutorial Code
  5. Summary
  6. References

Dataset

The dataset used in this tutorial is a dummy dataset, created to mimic a dataframe with both text and numeric features. Feel free to follow along with your own .csv file, as long as it has text columns, numeric columns, or both.
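The original dummy dataset is not included here, so the following is only a minimal sketch of what such a dataframe could look like. The column names (`name`, `department`, `age`, `salary`) and all values are assumptions made up for illustration, not the author's data.

```python
import pandas as pd

# Hypothetical dummy data: every column name and value below is invented
# for illustration and does not come from the original tutorial.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara", "Dan"],                          # text
        "department": ["sales", "engineering", "sales", "marketing"],   # text
        "age": [34, 45, 29, 52],                                        # numeric (int)
        "salary": [52000.0, 88000.0, 47000.0, 61000.0],                 # numeric (float)
    }
)

# Inspect the column types to confirm the mix of text and numeric features.
print(df.dtypes)

# To use your own data instead, read a .csv file (the path is a placeholder):
# df = pd.read_csv("your_file.csv")
```

Any dataframe with at least one text column and one numeric column should work equally well for the rest of the tutorial.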

Pandas

Pandas [2] is one of the most common libraries used by data scientists and machine learning engineers. It is mainly used for exploratory data analysis when building a model, as well as for ad-hoc analysis of model results. Among its many functions is query, the subject of this tutorial.

Query

The query function in pandas acts much like the ‘where’ clause in SQL. Its advantage is that you can filter a dataframe in place, without switching back and forth between pandas, Jupyter Notebook, and whatever SQL platform you are using. Some other benefits are listed below:
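To make the comparison with a SQL ‘where’ clause concrete, here is a minimal sketch using `DataFrame.query`. The dataframe, its column names, and the `min_salary` variable are assumptions invented for illustration; only `query` itself and the `@` syntax for referencing local variables come from pandas.

```python
import pandas as pd

# Hypothetical dummy data (column names and values are made up for illustration).
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara", "Dan"],
        "department": ["sales", "engineering", "sales", "marketing"],
        "age": [34, 45, 29, 52],
        "salary": [52000.0, 88000.0, 47000.0, 61000.0],
    }
)

# Roughly equivalent SQL:
#   SELECT * FROM df WHERE age > 30 AND department = 'sales'
result = df.query("age > 30 and department == 'sales'")

# The same filter written with ordinary boolean indexing, for comparison.
mask_result = df[(df["age"] > 30) & (df["department"] == "sales")]

# Local Python variables can be referenced inside the query string with '@'.
min_salary = 50000  # hypothetical threshold
high_paid = df.query("salary >= @min_salary")

print(result)
print(high_paid)
```

Both `result` and `mask_result` select the same rows; the query string simply reads closer to SQL, which is the point of the comparison.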
