Transforming JSON Data to Pandas

A Financial Example of AAPL Income Statement

Sebastien Callebaut
stockviz
3 min read · May 30, 2023

Transforming data from one format to another is a common task. JSON (JavaScript Object Notation) is a popular data format for representing structured data, and Pandas is a powerful data manipulation library in Python.

Here, we explore how to transform data from a JSON file, specifically the income statement data of Apple Inc. (AAPL), into a Pandas DataFrame.

Step 1: Loading the JSON Data

First, we need to load the JSON data into Python. Assuming you have a JSON file named “aapl_income_statement.json” containing the AAPL income statement data, we can use the json module in Python to read the file and parse its contents into a Python object:

import json

with open("aapl_income_statement.json") as file:
    # Parse the file contents into Python dictionaries and lists
    data = json.load(file)
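
If the file might contain malformed JSON, a small wrapper can make the failure easier to diagnose. The following is a minimal sketch: load_json is a hypothetical helper name that does not come from the original example, and the UTF-8 encoding is an assumption about how the file was saved.

```python
import json


def load_json(path):
    """Parse a JSON file and return the resulting Python object.

    Hypothetical helper for illustration; assumes the file is UTF-8 encoded.
    A missing file still raises FileNotFoundError as usual.
    """
    with open(path, encoding="utf-8") as file:
        try:
            return json.load(file)
        except json.JSONDecodeError as err:
            # Re-raise with the file name so the error points at the culprit
            raise ValueError(f"{path} is not valid JSON: {err}") from err


# Usage, assuming the file from this step exists in the working directory:
# data = load_json("aapl_income_statement.json")
```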

Step 2: Exploring the JSON Structure

Before transforming the JSON data into a Pandas DataFrame, let’s take a look at its structure. Understanding the structure will help us navigate through the data and extract the relevant information. You can use the print function to display the data structure:

print(data)
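
For a large file, printing everything at once can be hard to read. As a rough sketch, the hypothetical helper below (summarize is not part of the original example) walks the parsed object and reports only its keys, list lengths, and value types. The sample dictionary is a made-up stand-in with placeholder values, not the real layout of the AAPL file.

```python
def summarize(obj, depth=0, max_depth=3):
    """Return a list of lines describing the nesting of a parsed JSON object."""
    pad = "  " * depth
    if isinstance(obj, dict):
        lines = [f"{pad}dict, keys: {list(obj)}"]
        if depth < max_depth:
            for key, value in obj.items():
                lines.append(f"{pad}  {key}:")
                lines += summarize(value, depth + 2, max_depth)
        return lines
    if isinstance(obj, list):
        lines = [f"{pad}list, length: {len(obj)}"]
        if obj and depth < max_depth:
            # Inspect only the first element; records in a list usually share a shape
            lines += summarize(obj[0], depth + 1, max_depth)
        return lines
    # Scalars (str, int, float, bool, None): report the type name only
    return [f"{pad}{type(obj).__name__}"]


# Made-up stand-in for the parsed file; the keys and values are placeholders
sample = {"symbol": "AAPL", "records": [{"endDate": "2022-09-30", "eps": 1.0}]}
print("\n".join(summarize(sample)))
```

Calling summarize(data) on the object loaded in Step 1 would show where the list of income statement records sits inside the file.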

Step 4: Creating a Pandas DataFrame

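This step assumes the income statement records have already been pulled out of the parsed JSON into a variable named income_statement_data, typically a list of dictionaries with one entry per reporting period. Where those records sit inside data depends on the structure of your file. We can then pass them directly to the pd.DataFrame constructor from the Pandas library, which turns each record into a row and each key into a column: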

import pandas as pd

df = pd.DataFrame(income_statement_data)
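
The snippet above relies on income_statement_data, which has to be extracted from the parsed JSON first. The sketch below shows one way this could look under a hypothetical layout: the top-level key incomeStatementHistory and the fields totalRevenue and netIncome are assumptions made for illustration (only endDate and eps appear elsewhere in this article), and all numbers are placeholders rather than Apple's reported figures.

```python
import pandas as pd

# Hypothetical stand-in for the parsed file; real files may be nested differently
data = {
    "symbol": "AAPL",
    "incomeStatementHistory": [
        {"endDate": "2022-09-30", "totalRevenue": 300, "netIncome": 75, "eps": 3.0},
        {"endDate": "2021-09-30", "totalRevenue": 250, "netIncome": 50, "eps": 2.0},
    ],
}

# Pull out the list of per-period records (the key name is an assumption)
income_statement_data = data["incomeStatementHistory"]

# Each dictionary becomes a row; each key becomes a column
df = pd.DataFrame(income_statement_data)
print(df)
```

If the individual records contain nested dictionaries, pd.json_normalize can flatten them into columns instead.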

Step 5: Data Cleaning and Manipulation

At this stage, we have the income statement data loaded into a Pandas DataFrame. However, we may need to perform some data cleaning and manipulation to make the DataFrame more useful for analysis. Depending on the specific requirements, you can apply various operations such as removing unnecessary columns, converting data types, and handling missing values.

For instance, you can drop unwanted columns using the drop method:

df = df.drop(["endDate", "eps"], axis=1)
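
The other operations mentioned above, converting data types and handling missing values, might look like the following sketch. The column names totalRevenue and netIncome and all values are placeholders chosen for illustration. Unlike the drop example, this version keeps endDate and uses it as a date index, which is a design choice that suits time series work rather than a requirement.

```python
import pandas as pd

# Placeholder data: revenue arrives as strings and one value is missing
df = pd.DataFrame(
    {
        "endDate": ["2022-09-30", "2021-09-30", "2020-09-30"],
        "totalRevenue": ["300", "250", None],
        "netIncome": [75, 50, 40],
    }
)

# Convert date strings to datetime values
df["endDate"] = pd.to_datetime(df["endDate"])

# Convert numeric strings to numbers; unparseable or missing values become NaN
df["totalRevenue"] = pd.to_numeric(df["totalRevenue"], errors="coerce")

# Drop rows where revenue is missing, then index by date in chronological order
df = df.dropna(subset=["totalRevenue"])
df = df.set_index("endDate").sort_index()
print(df)
```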

Step 6: Data Analysis and Visualization

Once the data is in a Pandas DataFrame, you can leverage the extensive capabilities of Pandas and other Python libraries to analyze and visualize the data. For example, you can calculate summary statistics, plot time series data, or compare different financial metrics.
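
As a small illustration of what such an analysis could look like, the sketch below computes summary statistics, a net margin, and period-over-period revenue growth. The column names and numbers are placeholders, not Apple's reported figures, and the frame is assumed to be indexed by date in chronological order.

```python
import pandas as pd

# Placeholder figures indexed by period end date, oldest first
df = pd.DataFrame(
    {"totalRevenue": [200.0, 250.0, 300.0], "netIncome": [40.0, 50.0, 75.0]},
    index=pd.to_datetime(["2020-09-30", "2021-09-30", "2022-09-30"]),
)

# Summary statistics (count, mean, std, min, quartiles, max) per column
summary = df.describe()

# Net margin: share of revenue that ends up as net income
df["netMargin"] = df["netIncome"] / df["totalRevenue"]

# Revenue growth relative to the previous period (NaN for the first row)
df["revenueGrowth"] = df["totalRevenue"].pct_change()

print(summary)
print(df)
```

With matplotlib installed, df["totalRevenue"].plot() would draw the revenue series as a line chart.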

We explored how to transform JSON data into a Pandas DataFrame using the example of Apple Inc.’s income statement. We covered the steps involved, including loading the JSON data, extracting the required information, creating a DataFrame, and performing data cleaning and manipulation. With the data in a Pandas DataFrame, you can further analyze and visualize the financial metrics to gain insights into the company’s performance. By mastering the art of transforming data from various formats, you can unlock the full potential of Python’s data manipulation and analysis capabilities.

Additionally, if you are looking for a user-friendly platform where you can perform such analysis effortlessly, StockViz is an excellent choice. Give it a try today!

It is important to keep in mind that this article is not intended as specific investment advice, but rather serves to educate investors about potential investment strategies and tools. As always, it is essential to conduct thorough research and analysis before making any investment decisions, and to consult with a professional financial advisor or broker if necessary.

Sebastien Callebaut
stockviz

Using data and coding to make better investing decisions. Co-founder of stockviz.com