Visualize Open Data using MongoDB

Li-Ting Liao · Published in Dev Diaries · 4 min read · Aug 19, 2020

Using Python to connect to Taiwan Government PM2.5 open data API and upload batch data to MongoDB — Part 1

Goal

MongoDB is currently the most popular NoSQL database and is quite simple to use.

I gave it a try by quickly pulling the government's open data API's PM2.5 monitoring data (which comes in JSON format) and uploading it to MongoDB for both storage and visualization.

The demo looks like this:

What I’m going to do:

  • Connect to the API
  • Select the data points I would like to show on my visualization
  • Parse the ISO-8601 timestamps (in UTC) and convert them to the local time zone
  • Upload the data to MongoDB with the PyMongo driver
  • Create a dashboard with MongoDB Charts

So, let’s get started.

Process

Import all required libraries:
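As a rough sketch, the libraries used throughout this post are requests for the API call, dateutil and pytz for the time conversion, and PyMongo for the upload:

import requests                   # call the open data API
import dateutil.parser            # parse ISO-8601 timestamps
import pytz                       # convert UTC to the Asia/Taipei time zone
from pymongo import MongoClient   # connect to MongoDB Atlas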

Connect to the API from the 環保署智慧城鄉空品微型感測器監測資料 (EPA smart city and township air-quality micro-sensor monitoring data) page with "requests":

Make sure the URL parameters follow the API instructions. Here I request only PM2.5 data, only the latest value per station, a random sample of n=100 stations, and only observation values greater than 0.
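To give an idea of what that request could look like, here is a minimal sketch. The endpoint URL is a placeholder and the parameter names are assumptions, so check the API page for the exact query string:

# Placeholder endpoint; replace with the URL given on the open data page.
API_URL = "https://example.gov.tw/air-quality/sensors"

# Assumed parameter names: restrict to PM2.5, latest reading only,
# a random sample of 100 stations, and observed values greater than 0.
params = {
    "name": "PM2.5",
    "latest": "true",
    "sample": 100,
    "min_value": 0,
}

response = requests.get(API_URL, params=params)
response.raise_for_status()   # fail early if the request did not succeed
data = response.json()        # the API returns JSON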

Note that I convert the time format as shown below (using an example value here), because those sensors' timestamps are ISO-8601 strings in Greenwich time (UTC+0):

iso8601_utc0 = "2020-08-18T00:41:38.000Z"
UTC_0 = dateutil.parser.parse(iso8601_utc0)
UTC_8 = UTC_0.astimezone(pytz.timezone("Asia/Taipei"))

Print out the very first value to double-check. All looks good:

{'name': 'PM2.5', 'stationID': '10062399613', 'observedArea': {'type': 'Point', 'coordinates': [120.2777716, 23.047355]}, 'iso8601_UTC_0': '2020-08-19T02:33:59.000Z', 'UTC_0': '2020-08-19 02:33:59+00:00', 'UTC_8': '2020-08-19 10:33:59+08:00', 'result': 5.0, 'unitOfMeasurement': 'μg/m3'}
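For reference, here is a minimal sketch of how each document could be assembled before upload, continuing from the imports above. The field names follow the printed record, but the shape of the raw API response (raw_records and the keys read from each record) is an assumption:

docs = []
for record in raw_records:                   # raw_records: parsed API response (assumed shape)
    iso8601_utc0 = record["time"]            # assumed key holding the ISO-8601 timestamp
    UTC_0 = dateutil.parser.parse(iso8601_utc0)
    UTC_8 = UTC_0.astimezone(pytz.timezone("Asia/Taipei"))
    docs.append({
        "name": "PM2.5",
        "stationID": record["stationID"],
        "observedArea": record["observedArea"],   # GeoJSON point with the station coordinates
        "iso8601_UTC_0": iso8601_utc0,
        "UTC_0": str(UTC_0),
        "UTC_8": str(UTC_8),
        "result": record["result"],
        "unitOfMeasurement": "μg/m3",
    })

print(docs[0])   # double-check the first document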

Next, before doing anything in MongoDB Atlas (the cloud platform), I used the following steps to launch a free cluster for this demo. Here's how:

  • Create a project.
  • Create a free cluster (here I used AWS).
  • After 1–3 minutes, the cluster will be ready to go.
  • Remember to whitelist your current IP address for the cluster (following the MongoDB manual).

Then, create a database "test" and a collection "test", and configure the connection so the application can reach this database:

  • Enter the free cluster and go to the "Collections" tab.
  • Create a database and a collection.
  • Go back to the cluster page to configure the database connection.
  • Copy the "connection string" and paste it into the application code.

Now we're ready to connect to MongoDB. Here I uploaded the data with insert_many(), one of the PyMongo driver methods, which lets us insert multiple documents into MongoDB at once:
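Here is a minimal sketch, assuming the database and collection are both named "test" as created above and that docs is the list of documents built earlier. Replace the placeholder connection string with the one copied from Atlas:

from pymongo import MongoClient

# Paste the connection string copied from the Atlas "Connect" dialog here.
client = MongoClient("mongodb+srv://<username>:<password>@<cluster-address>/test")

collection = client["test"]["test"]     # database "test", collection "test"

result = collection.insert_many(docs)   # docs: the list of documents built above
print(len(result.inserted_ids), "documents uploaded")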

All are uploaded:

Note that MongoDB assigns each document a unique "_id" field by default.

Finally, we can start making charts!

  • Activate MongoDB Charts for first-time use.
  • Then add a data source.
  • Choose our cluster.
  • Choose our database and collection to import data from.
  • Add a new dashboard and we're all set to make charts!
  • Choose the data source once we're on the chart editing page.
  • First, I created a heat map with the settings above.
  • I used the same logic to create a scatter plot with customized visualization settings.
  • Lastly, I created a bar chart showing the average PM2.5 result over time.
  • Color the bars by the mean of the result.
  • Make sure our time/date fields are formatted as intended.

Once all of these charts are saved, they appear on an interactive dashboard:

Conclusion

So around 10:30 AM on Aug 19, 2020, I connected to the Taiwan government's PM2.5 monitoring API to see the latest air quality. The 100 randomly sampled data points were processed, uploaded, and presented on MongoDB as an interactive dashboard. It looked like PM2.5 intensity was higher in the northern part of Taiwan. I can zoom in to see where each data point came from. Lastly, I can see how the average PM2.5 intensity changes over time across the entire island.

That’s all. Hope you find this helpful :)

Have a good day!

