Pandas Query for SQL-like Querying
A data scientist’s Python tutorial for querying dataframes with the pandas query function
Table of Contents
- Dataset
- Pandas
- Query
- Tutorial Code
- Summary
- References
Dataset
The dataset used in this tutorial is a dummy dataset created to mimic a dataframe with both text and numeric features. Feel free to follow along with your own .csv file, as long as it has text columns, numeric columns, or both.
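As a rough stand-in for such a dataset, the sketch below builds a small dataframe with two text columns and two numeric columns. The column names and values are made up for illustration and are not the dataset used in this tutorial. The commented-out read_csv line shows how you might load your own file instead (the file name is a placeholder).

```python
import pandas as pd

# Hypothetical dummy data: the column names and values are invented for
# illustration and are not the author's actual dataset.
df = pd.DataFrame(
    {
        "name": ["alice", "bob", "carol", "dave"],         # text feature
        "city": ["Austin", "Boston", "Austin", "Denver"],  # text feature
        "age": [34, 27, 45, 31],                           # numeric feature
        "score": [88.5, 92.0, 79.25, 85.0],                # numeric feature
    }
)

# To follow along with your own file instead, something like this should work
# ("your_file.csv" is a placeholder path):
# df = pd.read_csv("your_file.csv")

# Inspect the column types to confirm the mix of text and numeric features
print(df.dtypes)
```

Any dataframe with a similar mix of column types should work for the rest of the tutorial.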
Pandas
Pandas [2] is one of the most common libraries used by data scientists and machine learning engineers. It is mainly used in the exploratory data analysis step of building a model, as well as in ad-hoc analysis of model results. Among its many functions is query, the focus of this tutorial.
Query
The query function in pandas acts similarly to the ‘where’ clause in SQL: it filters the rows of a dataframe based on a condition. Its benefit, however, is that you do not need to keep switching between pandas, Jupyter Notebook, and the SQL platform you are currently using. Some other benefits are listed below: