Visualize Open Data using MongoDB in Real-Time
Using Python to connect to the Taiwan government's PM2.5 open data API and schedule real-time updates to MongoDB (Part 2)
Goal
This time I use the same PM2.5 open data API as in Part 1 to show how to refresh real-time data in MongoDB every 2 minutes, which matches how often the government portal refreshes its API. MongoDB's strength is that it is simple to use, especially with data in JSON document format, and this makes connecting to open data much easier. We can also show real-time data changes directly from the database using its Charts and Dashboard features.
How convenient!
The demo below uses Taipei City (the capital of Taiwan) as an example:
Skills covered:
- Connect to the API with the required parameters to select all sensor data in Taipei City
- Insert the first batch of data into MongoDB
- Schedule the extraction of each new batch of PM2.5 data from the API into MongoDB
- Create charts and add them to a dashboard
So, let’s get started.
Process
Import all required libraries:
Connect to the API with the required parameters to select all sensor data in Taipei City. The raw data looks like this (100 sensors in total):
All the data is stored in the “first_batch” variable:
The first element of the “first_batch” list is a single sensor station's reading:
print(first_batch[0])  # output:
{'_id': '10189360662', 'name': 'PM2.5', 'areaDescription': '營建混合物土資場', 'city': '臺北市', 'township': '北投區', 'observedArea': {'type': 'Point', 'coordinates': [121.4871916, 25.121195]}, 'iso8601_UTC_0': '2020-08-20T05:22:58.000Z', 'UTC_0': '2020-08-20 05:22:58+00:00', 'UTC_8': '2020-08-20 13:22:58+08:00', 'result': 22.0, 'unitOfMeasurement': 'μg/m3'}
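The connection step above can be sketched roughly as follows. This is a minimal sketch, not the exact code used for the demo: the endpoint URL and its query parameters are not shown here, so `base_url` and `params` are left for the caller to supply, and the helper names are made up for illustration. It also assumes the endpoint returns a JSON list of records already shaped like the sample above; if the real response is nested, an extra reshaping step would be needed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, params):
    """Append query parameters to the endpoint URL."""
    return f"{base_url}?{urlencode(params)}"


def fetch_records(base_url, params, timeout=30):
    """Download and decode the JSON body (assumed to be a list of records)."""
    with urlopen(build_url(base_url, params), timeout=timeout) as resp:
        return json.load(resp)


def only_city(records, city="臺北市"):
    """Keep records whose 'city' field matches (a client-side safety net)."""
    return [r for r in records if r.get("city") == city]


# Hypothetical usage; fill in the URL and parameters from the portal's docs:
# first_batch = only_city(fetch_records(API_URL, API_PARAMS))
```

The `only_city` filter is optional when the API already filters by city; it only relies on the `city` field that appears in each record.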
Then connect to MongoDB Atlas and insert the first batch of data:
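A minimal sketch of this step, assuming the PyMongo driver. The connection string, database name, and collection name are placeholders, and `insert_first_batch` is a hypothetical helper name. The function accepts any collection object, so the connection itself is shown only in comments.

```python
def insert_first_batch(collection, first_batch):
    """Insert every station document in one call and return how many were written.

    Each document already carries its own "_id", so MongoDB uses it as the key.
    """
    if not first_batch:
        # PyMongo's insert_many rejects an empty list, so skip the call.
        return 0
    result = collection.insert_many(first_batch)
    return len(result.inserted_ids)


# Hypothetical usage with PyMongo (placeholders, not real values):
# from pymongo import MongoClient
# client = MongoClient("<your Atlas connection string>")
# collection = client["<database>"]["<collection>"]
# insert_first_batch(collection, first_batch)
```

Because the station ID is used as "_id", running this insert a second time against the same collection would fail with a duplicate-key error, which is why later batches are written as updates instead.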
Next, set up a scheduler that pulls the latest PM2.5 readings from the API every 2 minutes, stopping at whatever time we choose, and updates the data in MongoDB by “_id”, i.e. the “stationID” of each station:
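One way to sketch such a scheduler is a plain loop with a sleep between rounds. This is an illustration under assumptions rather than the exact code behind the demo: `fetch` stands for any function that returns the latest batch, `upsert_batch` and `run_schedule` are hypothetical names, and the collection is expected to offer PyMongo's `update_one` method.

```python
import time
from datetime import datetime


def upsert_batch(collection, batch):
    """Update each station's document by "_id", inserting it if it is missing."""
    for doc in batch:
        # "_id" identifies the document, so leave it out of the fields to set.
        fields = {k: v for k, v in doc.items() if k != "_id"}
        collection.update_one({"_id": doc["_id"]}, {"$set": fields}, upsert=True)
    return len(batch)


def run_schedule(fetch, collection, stop_at, interval_s=120,
                 now=datetime.now, sleep=time.sleep):
    """Fetch and upsert a batch every interval_s seconds until stop_at is reached."""
    rounds = 0
    while now() < stop_at:
        upsert_batch(collection, fetch())
        rounds += 1
        sleep(interval_s)
    return rounds
```

The `now` and `sleep` parameters default to the real clock and only exist so the loop can be tried out without waiting. Keying on "_id" means each station keeps a single document that is overwritten with its newest reading.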
In MongoDB it will look like this:
Lastly, create each chart on the dashboard as follows:
Conclusion
With the above 4 charts, our dashboard is ready:
We can further adjust the colors according to the intensity levels set by the government; in Taiwan, for example, 0–30 μg/m3 is low, 30–50 μg/m3 is medium, and so on. Below, both maps show how the PM2.5 intensity changed slightly across different sensors in Taipei City within 5 minutes. This clip was recorded later than the previous demo, around 19:00–19:30 on the same day.
The bottom-left corner of the scatter plot shows how much time is left before MongoDB refreshes the data again. Or just watch the clip below for 10 seconds and you may spot the difference :D
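If you prefer to color by category rather than by raw value, one option is to tag each reading with its intensity level before writing it to MongoDB. The sketch below only encodes the two bands mentioned above; anything higher is grouped under a generic label because the remaining bands are not listed here. Treating exactly 30 as "medium" is an assumption, and `pm25_level` is a hypothetical helper name.

```python
def pm25_level(value):
    """Map a PM2.5 reading (μg/m3) to a coarse intensity label."""
    if value < 30:
        return "low"
    if value <= 50:
        return "medium"
    # Higher bands are not spelled out in this article, so group them together.
    return "above medium"
```

For the sample reading shown earlier (22.0 μg/m3), this returns "low".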
That’s it. Hope you find this helpful.
Have a wonderful day!