Pandas for Data Science: A Beginner’s Guide, Part I
Learn the essential Pandas methods you need to start building data science projects with Python: head(), info(), sum(), and describe().
I. About Pandas for Data Science in Python
Pandas is a data analysis library for Python. This flexible library can manipulate and analyze data in a variety of structures, but it is especially useful for tabular data, such as SQL tables and Excel spreadsheets. In this tutorial, I will focus on the most essential functions for wrangling labeled tabular data with Pandas.
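To give a feel for what "labeled tabular data" means before we touch the real data set, here is a minimal sketch of the four methods named above. The table and its column names ("band", "country", "fans") are made up for illustration and are not taken from the Kaggle data.

```python
import pandas as pd

# Hypothetical toy table, invented for illustration only
# (this is NOT the Kaggle data set used later in the tutorial).
df = pd.DataFrame(
    {
        "band": ["A", "B", "C", "D"],
        "country": ["X", "Y", "X", "Z"],
        "fans": [100, 250, 75, 300],
    }
)

# head(n) returns the first n rows (5 by default).
first_rows = df.head(2)
print(first_rows)

# info() prints column names, dtypes, and non-null counts.
df.info()

# sum() adds up the values in a column.
total_fans = df["fans"].sum()
print(total_fans)

# describe() returns summary statistics for the numeric columns.
summary = df.describe()
print(summary)
```

Each column has a label, so you can refer to it by name (df["fans"]) instead of by position, which is what makes Pandas convenient for spreadsheet-like data.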
II. Getting Some Data
1. Follow this link to Kaggle and download the Metal Bands by Nation data set into your project directory.
2. Open up your terminal, navigate to your project’s directory, and open a Jupyter notebook with the following command:
$ jupyter notebook
In your browser, a new tab will open showing the project directory. At the top right, you'll see a drop-down menu labeled "New". From the drop-down, select the name of your virtual environment. This will create a new Jupyter notebook that uses the packages you have…