Visualize Open Data using MongoDB in Real-Time

Li-Ting Liao
Published in Dev Diaries
4 min read · Aug 20, 2020

Using Python to connect to the Taiwan government’s PM2.5 open data API and schedule real-time updates to MongoDB (Part 2)

Goal

This time I’m using the same PM2.5 open data API from Part 1 to show how to refresh real-time data into MongoDB every 2 minutes (the interval at which the government’s portal refreshes its API). MongoDB’s strength is that it’s simple to use, especially with data in JSON document format, which makes connecting to open data much easier. We can also show real-time data changes directly from our database using its Charts and Dashboard features.

How convenient!

The demo below uses Taipei City (the capital of Taiwan) as an example:

Recorded at 13:30–14:00 on Aug 20, 2020

Skills covered:

  • Connect to the API with the parameters required to select all sensor data for Taipei City
  • Insert the first batch of data into MongoDB
  • Set a schedule to pull each new batch of PM2.5 data from the API into MongoDB
  • Create charts and add them to a dashboard

So, let’s get started.

Process

Import all required libraries:

Connect to the API with the parameters required to return only the sensors in Taipei City. The raw data covers 100 sensors in total.

All of the data is stored in the “first_batch” variable.
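A minimal sketch of this step, using only the standard library. The endpoint URL and the “city” parameter name are placeholders I made up for illustration (the real ones are those used in Part 1), and I’m assuming the API returns a JSON list of records:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder: substitute the real endpoint from Part 1.
API_URL = "https://example.invalid/pm25"


def build_url(base_url, params):
    """Append URL-encoded query parameters to the base URL."""
    return f"{base_url}?{urlencode(params)}"


def fetch_batch(url, timeout=30):
    """Download one batch of readings and parse the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def load_first_batch():
    """Fetch every Taipei City sensor reading in a single request."""
    # "city" is an assumed parameter name; check the API documentation.
    return fetch_batch(build_url(API_URL, {"city": "臺北市"}))

# Usage (needs a real endpoint, so it is not run here):
# first_batch = load_first_batch()
```

Keeping the URL building separate from the download makes it easy to check the query string before sending any request.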

The first element of the “first_batch” list is one sensor station’s reading:

print(first_batch[0])
# output:
{'_id': '10189360662', 'name': 'PM2.5', 'areaDescription': '營建混合物土資場', 'city': '臺北市', 'township': '北投區', 'observedArea': {'type': 'Point', 'coordinates': [121.4871916, 25.121195]}, 'iso8601_UTC_0': '2020-08-20T05:22:58.000Z', 'UTC_0': '2020-08-20 05:22:58+00:00', 'UTC_8': '2020-08-20 13:22:58+08:00', 'result': 22.0, 'unitOfMeasurement': 'μg/m3'}
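The record carries the same timestamp three ways: the raw ISO 8601 string, a UTC string, and a Taiwan local time (UTC+8) string. Here is a sketch of how the two derived fields could be computed from the raw one. The helper name add_time_fields is my own, and this is not necessarily how the original script did it:

```python
from datetime import datetime, timedelta, timezone

TAIPEI_TZ = timezone(timedelta(hours=8))  # Taiwan is UTC+8 all year


def add_time_fields(iso_utc):
    """Derive the UTC and UTC+8 strings from the raw ISO 8601 timestamp."""
    utc_time = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%S.%fZ")
    utc_time = utc_time.replace(tzinfo=timezone.utc)  # trailing Z means UTC
    return {
        "iso8601_UTC_0": iso_utc,
        "UTC_0": str(utc_time),
        "UTC_8": str(utc_time.astimezone(TAIPEI_TZ)),
    }


fields = add_time_fields("2020-08-20T05:22:58.000Z")
print(fields["UTC_0"])  # 2020-08-20 05:22:58+00:00
print(fields["UTC_8"])  # 2020-08-20 13:22:58+08:00
```

The two printed values match the “UTC_0” and “UTC_8” fields of the sample record above.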

Then connect to my MongoDB Atlas cluster and insert the first batch of data.
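A sketch of the insert step with PyMongo. The connection string and database name are placeholders you need to fill in yourself; “test2” is the collection name used later in this post. The helper names are my own:

```python
def connect_collection(uri, db_name, collection_name):
    """Open a client connection and return one collection handle."""
    from pymongo import MongoClient  # third-party: pip install pymongo

    return MongoClient(uri)[db_name][collection_name]


def insert_first_batch(collection, docs):
    """Insert every station document and return how many were written."""
    result = collection.insert_many(docs)
    return len(result.inserted_ids)

# Usage (placeholders in angle brackets must be replaced, so not run here):
# collection = connect_collection(
#     "mongodb+srv://<user>:<password>@<cluster>/", "<database>", "test2")
# insert_first_batch(collection, first_batch)
```

Because each document already has its own “_id”, running insert_many a second time with the same stations raises a duplicate key error. That is why the later refreshes update by “_id” instead of inserting.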

Next, set a scheduler to pull the latest PM2.5 readings from the API every 2 minutes, stopping at whatever time we choose, and update the data in MongoDB by “_id”, i.e. the “stationID” of each station.
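One way to sketch this is a plain loop with time.sleep rather than a scheduling library (I’m swapping that in for simplicity; the original may have used something else). Here fetch stands for any function that returns the latest batch, and replace_one with upsert=True is the PyMongo call that overwrites a document matched by “_id” or inserts it if it is missing:

```python
import time
from datetime import datetime


def upsert_batch(collection, docs):
    """Replace each station's document, matching on its _id."""
    for doc in docs:
        collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
    return len(docs)


def run_until(fetch, collection, stop_at, interval_seconds=120,
              now=datetime.now, sleep=time.sleep):
    """Fetch and upsert a batch every interval until stop_at is reached."""
    rounds = 0
    while now() < stop_at:
        upsert_batch(collection, fetch())
        rounds += 1
        sleep(interval_seconds)
    return rounds

# Usage (needs a live collection and API, so it is not run here):
# run_until(load_latest_batch, collection, datetime(2020, 8, 20, 14, 0))
```

The now and sleep parameters exist only so the loop can be tested without actually waiting; load_latest_batch in the usage comment is a hypothetical fetch function.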

In MongoDB it will look like this:

PM2.5 intensity score was 19.47.
After 2 min, it became 20.16.

Lastly, create each chart on the dashboard as follows:

Add new data source (my real-time data is saved in collection “test2”).
Create a new dashboard.
Create a heat map.
Once we drag the chart into the dashboard, we can turn on the dashboard’s auto-refresh feature. While our application runs in the background and keeps updating the data in MongoDB, our charts are updated accordingly.
We can also create a scatter plot with customized tooltips. Here we can see a construction site, which may explain the higher PM2.5 level.
Note that the date format of the time series line chart needs to be modified in the Customize tab.
We can also create a gauge chart (the maximum PM2.5 score is 100).

Conclusion

With the above 4 charts, our dashboard is ready:

We can further adjust the colors according to the intensity levels set by the government, e.g. in Taiwan, 0–30 μg/m3 is low, 30–50 μg/m3 is medium, etc. Below, both maps show how the PM2.5 intensity changed slightly across different sensors in Taipei City within 5 minutes. This clip was recorded later than the previous demo, around 19:00–19:30 on the same day.

The bottom-left corner of the scatter plot shows how much time is left before MongoDB refreshes the data again. Or just stare at the clip below for 10 seconds and you may spot the difference :D

Recorded at 19:00–19:30 on Aug 20, 2020

That’s it. Hope you find this helpful.

Have a wonderful day!
