Converting nested JSON structures to Pandas DataFrames
Published in
4 min readMay 10, 2020
The Problem
APIs and document databases sometimes return nested JSON objects and you’re trying to promote some of those nested keys into column headers but loading the data into pandas gives you something like this:
df = pd.DataFrame.from_records(results["issues"], columns=["key", "fields"])
The problem is that the API returned a nested JSON structure and the keys that we care about are at different levels in the object.
What we want is this:
An example
We’re going to use data returned from the Jira API as an example. The Jira API often includes metadata about fields. Let’s say these are the fields we care about
- key: Within the list of issue records this is a top-level JSON key so no problem pulling it into its own column.
- summary: This is at the second level, inside the “fields” object.