TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Pandas for Data Science: A Beginner’s Guide, Part I

Learn the essential Pandas methods you need to start building data science projects with Python: head(), info(), sum(), and describe().

Christine Egan
Published in TDS Archive
4 min read · Apr 4, 2021

Image by 995645 from Pixabay

I. About Pandas for Data Science in Python

Pandas is a data analysis library for Python, built on top of NumPy. This flexible library is useful for manipulating and analyzing data in a variety of structures, but it is especially well suited to tabular data, such as SQL tables and Excel spreadsheets. In this tutorial, I will focus on the most essential functions for wrangling labeled tabular data in Pandas with Python.
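To make "labeled tabular data" concrete before we download anything, here is a minimal sketch of the four methods this guide covers. The table is a tiny made-up example, not the Kaggle data set, and its column names (band_name, origin, formed, fans) are assumptions chosen purely for illustration.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
# (this is NOT the Kaggle data set used later in the tutorial).
bands = pd.DataFrame(
    {
        "band_name": ["Alpha", "Beta", "Gamma", "Delta"],
        "origin": ["Sweden", "USA", None, "Finland"],  # one missing value
        "formed": [1990, 1985, 2001, 1996],
        "fans": [4195, 1762, 2697, 1055],
    }
)

# head(n) returns the first n rows (the default is 5).
first_rows = bands.head(2)
print(first_rows)

# info() prints each column's dtype and non-null count.
bands.info()

# isna() marks missing cells as True; sum() then counts them per column.
missing_per_column = bands.isna().sum()
print(missing_per_column)

# sum() on a single numeric column adds up its values.
total_fans = bands["fans"].sum()
print(total_fans)

# describe() summarizes the numeric columns (count, mean, std, quartiles, ...).
summary = bands.describe()
print(summary)
```

Each column carries a label, so you can select it by name (as in bands["fans"]) rather than by position, which is what makes Pandas convenient for this kind of data.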

II. Getting Some Data

1. Follow this link to Kaggle and download the Metal Bands by Nation data set into your project directory.

2. Open up your terminal, navigate to your project’s directory, and open a Jupyter notebook with the following command:

$ jupyter notebook

In your browser, a new tab will open showing the project directory. On the top right, you’ll see a drop-down menu that reads “New”. From the drop-down, select the name of your virtual environment. This will create a new Jupyter notebook that uses the packages you have…
