Pandas Library for Data Analysis in Python
The Pandas library is a powerful and versatile tool for data analysis in Python. This article provides a basic introduction to Pandas, including how to import it, create Series (one-dimensional data structures), and select data from a Series.
What is Pandas?
Pandas is a free, open-source Python library designed for data analysis. It offers a wide range of functionality for working with data, including:
- Importing data from various sources like CSV, Excel, and SQL databases
- Cleaning and manipulating data
- Performing data analysis and calculations
- Creating data visualizations (through plotting methods built on Matplotlib)
Importing Pandas and Creating a Series
Here’s how to import Pandas and create a Series:
import pandas as pd
# Create a Series from a list
data = [1, 2, 3, 4, 5]
series = pd.Series(data)
print(series)
This code creates a Series whose values are the items of the list. Because no index is given, Pandas assigns the default integer index, which starts at 0 (here 0 through 4).
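To make the default index concrete, here is a minimal sketch that inspects the index labels of the Series above and then replaces them with custom ones through the index argument. The letter labels are illustrative only and are not part of the original example.

```python
import pandas as pd

data = [1, 2, 3, 4, 5]

# With no index argument, pandas labels the rows 0, 1, 2, ...
series = pd.Series(data)
default_labels = list(series.index)
print(default_labels)   # [0, 1, 2, 3, 4]

# Passing index= replaces the default labels (illustrative letters here).
labelled = pd.Series(data, index=["a", "b", "c", "d", "e"])
custom_labels = list(labelled.index)
print(custom_labels)    # ['a', 'b', 'c', 'd', 'e']

# The values are the same in both cases; only the labels differ.
print(series.tolist() == labelled.tolist())   # True
```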
Another way to create a Series is from a dictionary:
data = {"Name": "Alice", "Age": 30, "City": "New York"}
series = pd.Series(data)
print(series)
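In this case, the dictionary keys become the index labels of the Series, and the dictionary values become the data. Because the values mix strings and an integer, the Series is stored with the generic object dtype.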
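Once a Series has an index, its values can be selected by label or by position. The sketch below shows a few common ways to do this; the variable names (person, numbers, and so on) are chosen for illustration.

```python
import pandas as pd

person = pd.Series({"Name": "Alice", "Age": 30, "City": "New York"})

# Select a single value by its index label.
name = person["Name"]
print(name)   # Alice

# Select a single value by integer position with .iloc.
age = person.iloc[1]
print(age)    # 30

# Select several labels at once; the result is a new, smaller Series.
subset = person[["Name", "City"]]
print(subset)

# On a numeric Series, a boolean condition keeps only the matching values.
numbers = pd.Series([1, 2, 3, 4, 5])
large = numbers[numbers > 3]
print(large.tolist())   # [4, 5]
```

Label-based selection can also be written as person.loc["Name"], which makes it explicit that a label, not a position, is meant.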
Increasing the Column Count
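A Series is one-dimensional, so it always holds exactly one column of values. Passing a list of lists to the pd.Series() function does not add columns; each sub-list is stored as a single element of the Series.
For example:
data = [[1, "A"], [2, "B"], [3, "C"]]
series = pd.Series(data, index=["Row1", "Row2", "Row3"])
print(series)
This code creates a Series with three elements, one per sub-list, labelled with the custom index Row1, Row2, and Row3. Its dtype is object, and no columns named “data1” or “data2” are created. To store the same data in two columns, pass the list of lists to pd.DataFrame() instead, optionally with a columns argument to name the columns.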
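Because a Series is one-dimensional, a list of lists only becomes multiple columns when it is passed to a DataFrame. The following sketch contrasts the two constructors on the same data. The column names "data1" and "data2" are passed explicitly here; they are not defaults, and without the columns argument Pandas would label the columns 0 and 1.

```python
import pandas as pd

data = [[1, "A"], [2, "B"], [3, "C"]]
rows = ["Row1", "Row2", "Row3"]

# A Series keeps each sub-list as one element, so there is still one dimension.
series = pd.Series(data, index=rows)
print(series.shape)   # (3,)

# A DataFrame splits each sub-list across columns.
frame = pd.DataFrame(data, index=rows, columns=["data1", "data2"])
print(frame)
print(frame.shape)    # (3, 2)

# Each column of a DataFrame is itself a Series.
letters = frame["data2"]
print(letters.tolist())   # ['A', 'B', 'C']
```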
With these basics, you can import Pandas, create a Series from a list or a dictionary, and understand how its index labels the data.
This article covers only a small part of what Pandas can do. There’s much more to explore, including DataFrames (two-dimensional data structures), data cleaning, analysis, and visualization.
The next part of this series is [Exploring Pandas: Reading CSV Files into DataFrames].