Published in Geek Culture

Tutorial — How to Create a Dataframe in Pandas From Scratch?

Learn how to create a dataframe and export it to CSV format!

Photo by Dmitry Ratushny on Unsplash

Python is the first choice for many data scientists when it comes to data science and machine learning. It is a high-level, general-purpose programming language used for a wide range of tasks, including automation, data analytics, databases, machine learning, scientific computing, web scraping, and image processing. It is open source, free to use, and runs on all major operating systems.

Photo by Christina Morillo from Pexels

Python has a large ecosystem of third-party libraries and frameworks for carrying out the complex tasks of data science, machine learning, and big data. These libraries include Pandas, NumPy, Matplotlib, SciPy, Seaborn, and many others. Today we will discuss one of the most popular of them, Pandas, which is used for data manipulation and analysis.

Photo by Stone Wang on Unsplash

In this tutorial, we’ll create a dataframe from scratch using Pandas. Pandas is a free, open-source library for the Python programming language, designed for fast and reliable data analysis. It provides tools for reading and writing data between in-memory data structures and different file formats. Pandas also handles missing values and helps you reshape messy data into an orderly form. You can read more about Pandas here.

Photo by Eva Elijas from Pexels

We’ll learn how to create a dataframe with Pandas in a Jupyter notebook. The data we are using today shows the top 10 metropolitan cities across all continents along with their population, country, area, and other details. This is what our data looks like:

Source: Wikipedia

We will look at two ways of creating a dataframe with Pandas:

  1. Using a list of lists and pd.DataFrame()
  2. Using a dictionary and pd.DataFrame()

To start working with Pandas, we simply import the library under the alias pd and check its version with the following commands:

Image by Author
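
The import step can be sketched as follows. This is a minimal example that assumes Pandas is already installed in the environment running the notebook; the version string it prints will differ from machine to machine.

```python
import pandas as pd  # pd is the conventional alias for the Pandas library

# Print the installed Pandas version (the value depends on your installation)
print(pd.__version__)
```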

First, we will create a dataframe using a list of lists and pd.DataFrame(). We start by defining the data as a list of lists, where each inner list holds one row.

Image by Author
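
A minimal sketch of the list-of-lists step might look like this. The article's Wikipedia table is not reproduced in the text, so the rows below (city, country, continent) are illustrative placeholders rather than the original data, and the variable name data is an assumption.

```python
# Placeholder rows for illustration only; they are NOT the article's table.
# Each inner list is one row: [city, country, continent].
data = [
    ["Tokyo", "Japan", "Asia"],
    ["Delhi", "India", "Asia"],
    ["Cairo", "Egypt", "Africa"],
    ["Mexico City", "Mexico", "North America"],
]

# Show the raw Python structure before it is turned into a dataframe
print(data)
```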

You will get the following output:

Image by Author

Now, let's create the dataframe from this list using pd.DataFrame():

Image by Author
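
Building the dataframe from a list of lists could look like the sketch below. The rows and the column names (City, Country, Continent) are placeholders chosen for illustration, not the article's original table.

```python
import pandas as pd

# Placeholder rows for illustration only; they are NOT the article's table.
data = [
    ["Tokyo", "Japan", "Asia"],
    ["Delhi", "India", "Asia"],
    ["Cairo", "Egypt", "Africa"],
    ["Mexico City", "Mexico", "North America"],
]

# Hypothetical column names, one per position in each inner list
columns = ["City", "Country", "Continent"]

# Each inner list becomes one row of the dataframe
df = pd.DataFrame(data, columns=columns)
print(df)
```

Each inner list becomes one row, and the columns argument supplies the column labels. Without it, Pandas labels the columns 0, 1, 2.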

You will get the following output:

Image by Author

Now, let’s create a dataframe using the dictionary method:

Image by Author
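
For the dictionary method, each key becomes a column name and each value is the list of entries for that column. A minimal sketch, with placeholder values (not the article's table) and a hypothetical variable name data_dict:

```python
# Placeholder values for illustration only; they are NOT the article's table.
# Keys are the column names, values are the column contents.
data_dict = {
    "City": ["Tokyo", "Delhi", "Cairo", "Mexico City"],
    "Country": ["Japan", "India", "Egypt", "Mexico"],
    "Continent": ["Asia", "Asia", "Africa", "North America"],
}

# Show the raw Python structure before it is turned into a dataframe
print(data_dict)
```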

You will get the following output:

Image by Author

Now, let's create the dataframe from this dictionary:

Image by Author
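
Passing a dictionary straight to pd.DataFrame() could look like the sketch below. As before, the values and the name data_dict are assumptions made for illustration.

```python
import pandas as pd

# Placeholder values for illustration only; they are NOT the article's table.
data_dict = {
    "City": ["Tokyo", "Delhi", "Cairo", "Mexico City"],
    "Country": ["Japan", "India", "Egypt", "Mexico"],
    "Continent": ["Asia", "Asia", "Africa", "North America"],
}

# The dictionary keys become the column labels, so no columns argument is needed
df = pd.DataFrame(data_dict)
print(df)
```

All lists in the dictionary must have the same length; otherwise pd.DataFrame() raises a ValueError.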

You will get the following output:

Image By Author

Now let's export this dataframe to CSV format:

Image By Author
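
The export step can be sketched with DataFrame.to_csv(). The file name cities.csv and the placeholder rows are assumptions made for this example, not values from the article.

```python
import pandas as pd

# Placeholder values for illustration only; they are NOT the article's table.
df = pd.DataFrame({
    "City": ["Tokyo", "Delhi", "Cairo", "Mexico City"],
    "Country": ["Japan", "India", "Egypt", "Mexico"],
    "Continent": ["Asia", "Asia", "Africa", "North America"],
})

# Write the dataframe to a CSV file in the current working directory.
# index=False leaves the row index (0, 1, 2, ...) out of the file.
df.to_csv("cities.csv", index=False)

# Read the file back to confirm the export worked
restored = pd.read_csv("cities.csv")
print(restored)
```

If index=False is omitted, the row index is written as an extra unnamed first column, which then shows up when the file is read back.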

Conclusion

In this article, we learned how to create a dataframe using the Pandas library. If you liked my article or have any suggestions for me, please let me know by commenting below.

Thank You!
