Published in jakartasmartcity

Visualizing Jakarta Mobility Trends and Traffic Using Data from Public APIs

Written by Muhammad Hadhari, Hansen Wiguna, and Ayu Andika

Table of Contents

Data Science Trainee Program

Background

Google Covid-19 Community Mobility Reports

Apple Mobility Trends Report

HERE Traffic Flow

Conclusion

Data Science Trainee Program

Jakarta Smart City first caught my eye in my second year of university. I’m fascinated by how they try to make Jakarta better in every aspect of urban quality of life. Imagine a city where interconnected technology works seamlessly to improve public safety, transportation, sustainability, and economic development. That's what Jakarta Smart City wants to achieve. Amazing, right? That's why I’m so grateful to be part of the Data Science Trainee Program - Batch III.

In the first month, we were introduced to how the Data and Analytics Team at Jakarta Smart City tackles urban problems and makes decisions based on data. The topics varied, but all related to current problems such as COVID-19, transportation, floods, air quality, CCTV, and Customer Relationship Management (CRM). We were also taught Data Science Methodology, Design Thinking, Hypothesis Formulation, and Scientific Research Writing by experts in each subject. We learned not only from professionals but also from our peers: we shared the projects we had done, our final-year university projects, and the papers we had just read. I still can’t believe how much knowledge and insight I gained from this program.

In the second month, we applied our knowledge and skills to solve problems through a project and later presented the results to the team. We were grouped by project topic, and each pair of trainees was accompanied by three mentors.

Background

There are several risk factors for COVID-19 transmission and mortality, including socioeconomic variables, mobility factors, social distancing rules, and demographic and environmental variables. Identifying these risk variables helps public health officials find high-risk groups and develop health intervention programs to minimize and mitigate disease spread.

In response, we set out to build a dashboard that portrays Jakarta's mobility trends using a variety of public data, in order to better understand how disease transmission relates to community movement and mobility levels.

Jakarta Mobility Dashboard

Google Covid-19 Community Mobility Reports

In April 2020, Google made its Community Mobility Reports freely available. The dataset records daily visits to categories of places (e.g., grocery stores, parks, and train stations) and compares them to a baseline from before the pandemic outbreak. The baseline is the normal value for that day of the week, expressed as the median value over the five-week period from January 3 to February 6, 2020. Comparing against the normal value for the same day of the week is useful because people’s activities differ between weekdays and weekends.
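To make the baseline idea concrete, here is a minimal sketch of how a "percent change from baseline" could be computed. Google does not publish raw visit counts, so the numbers below are made up, and the function names are ours, not part of any Google tooling.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Baseline window described in the report: 3 January to 6 February 2020.
BASELINE_START = date(2020, 1, 3)
BASELINE_END = date(2020, 2, 6)


def weekday_baselines(daily_visits):
    """Return the median visit count per weekday (0 = Monday) in the window."""
    by_weekday = defaultdict(list)
    for day, visits in daily_visits.items():
        if BASELINE_START <= day <= BASELINE_END:
            by_weekday[day.weekday()].append(visits)
    return {weekday: median(values) for weekday, values in by_weekday.items()}


def percent_change(day, visits, baselines):
    """Compare one day's visits with the baseline for the same weekday."""
    baseline = baselines[day.weekday()]
    return (visits - baseline) / baseline * 100


# Hypothetical visit counts for the five Fridays in the baseline window.
visits = {
    date(2020, 1, 3): 100,
    date(2020, 1, 10): 120,
    date(2020, 1, 17): 110,
    date(2020, 1, 24): 90,
    date(2020, 1, 31): 130,
}
baselines = weekday_baselines(visits)
change = percent_change(date(2021, 7, 9), 66, baselines)  # also a Friday
print(f"{change:+.0f}% vs. baseline")  # -40% vs. baseline
```

A Friday is only ever compared with the Friday baseline, which is why a quiet Sunday does not show up as a drop relative to a busy weekday.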

Google Covid-19 Community Mobility Reports website

On Google’s website, the report is presented as PDFs and the data can be downloaded in CSV format. However, we wanted the data to flow into the dashboard automatically, without having to download and upload files by hand. Our solution was to connect the Google Mobility Report to Google Data Studio using BigQuery.

1. Choose BigQuery as a data source at Google Data Studio

2. Select CUSTOM QUERY

3. Enter our custom query

SELECT
  country_region AS country,
  sub_region_1 AS region,
  date,
  retail_and_recreation_percent_change_from_baseline AS retail,
  grocery_and_pharmacy_percent_change_from_baseline AS grocery,
  parks_percent_change_from_baseline AS parks,
  transit_stations_percent_change_from_baseline AS transit,
  workplaces_percent_change_from_baseline AS workplaces,
  residential_percent_change_from_baseline AS residential
FROM
  `bigquery-public-data.covid19_google_mobility.mobility_report`
WHERE
  country_region = "Indonesia" AND sub_region_1 = "Jakarta"
ORDER BY date

Query results preview

This query reads from the public table bigquery-public-data.covid19_google_mobility.mobility_report, which can be explored further on the Google Cloud Platform Marketplace. We also restrict the data to the Jakarta region with sub_region_1 = "Jakarta".
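For readers who want to sanity-check the query offline, the same selection can be sketched in plain Python against the CSV download. This is only an illustration: it assumes the CSV uses the same column names as the BigQuery table, and select_region is a hypothetical helper, not part of our dashboard.

```python
import csv
import io

# Source column -> short alias, mirroring the SELECT list in the query.
COLUMN_ALIASES = {
    "country_region": "country",
    "sub_region_1": "region",
    "date": "date",
    "retail_and_recreation_percent_change_from_baseline": "retail",
    "grocery_and_pharmacy_percent_change_from_baseline": "grocery",
    "parks_percent_change_from_baseline": "parks",
    "transit_stations_percent_change_from_baseline": "transit",
    "workplaces_percent_change_from_baseline": "workplaces",
    "residential_percent_change_from_baseline": "residential",
}


def select_region(csv_text, country="Indonesia", region="Jakarta"):
    """Filter to one region, rename the columns, and order rows by date."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        {alias: row[column] for column, alias in COLUMN_ALIASES.items()}
        for row in reader
        if row["country_region"] == country and row["sub_region_1"] == region
    ]
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted(rows, key=lambda row: row["date"])
```

Values are kept as strings here; convert them to numbers before plotting.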

Google’s Jakarta mobility visualizations on the dashboard

The Emergency Community Activities Restrictions Enforcement (Emergency CARE), instructed by Indonesia's Ministry of Home Affairs, took effect from 3 to 25 July 2021 and was implemented in the Java and Bali regions. The Google Mobility data show that Jakartans' mobility increased after the Emergency CARE period, especially for public places such as Transit stations (Stasiun Transit), Retail & Recreation (Ritel & Rekreasi), and Parks (Taman).

Apple Mobility Trends Report

Apple Mobility Trends Report website

Apple’s Mobility Trends Reports, which are based on data from Apple Maps, show how human mobility has changed since January 2020. Unlike the Google data, Apple categorizes its data by type of user mobility: driving, walking, and public transport. Unfortunately, only driving and walking data are available for the Jakarta region.

The CSV file and charts on the website show the relative volume of directions requests per country/region, sub-region, or city compared to a baseline volume on 13 January 2020.

CSV download link

Apple doesn’t provide an API or a stable path for pulling the data directly into our dashboard. We have to download a CSV file from the website that contains mobility trends data for all countries. Hence, we needed a way to fetch this data automatically and reshape the table so it can be easily visualized later.

Apple’s mobility trends report raw data preview

Our solution uses Python to scrape the download link from Apple’s website (the link changes every week), perform data wrangling, and store the data in a spreadsheet. The challenge was that accessing spreadsheets via the Google Sheets API requires authenticating and authorizing our application, while we wanted the Python script to run automatically and update the data without any end-user interaction. Therefore, we access the spreadsheet on behalf of a bot account using a Service Account.

Here’s how to get one:

  1. Enable API Access for a Project if you haven’t done it yet.
  2. Go to “APIs & Services > Credentials” and choose “Create credentials > Service account key”.
  3. Fill out the form
  4. Click “Create” and “Done”.
  5. Press “Manage service accounts” above Service Accounts.
  6. Press on ⋮ near recently created service account and select “Manage keys” and then click on “ADD KEY > Create new key”.
  7. Select JSON key type and press “Create”. Then download the credentials.
  8. Go to the spreadsheet that we’re going to use and share it with a client_email from the step above. Just like we do with any other Google account.
  9. Schedule the Python script to run weekly using Task Scheduler (Windows). You can check this article: Python Script Automation Using Task Scheduler

Final Python script

Apple’s mobility trends data preview
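The wrangling and upload steps described above can be sketched roughly as follows. This is not our exact script. It assumes Apple's CSV is a wide table with the metadata columns listed in ID_COLUMNS followed by one column per date, that Jakarta appears under region = "Jakarta", and that the gspread library is used with the service-account JSON key. The function names and the key_file and spreadsheet_key parameters are placeholders.

```python
import csv
import io

# Metadata columns assumed to precede the per-date columns in Apple's CSV.
ID_COLUMNS = (
    "geo_type",
    "region",
    "transportation_type",
    "alternative_name",
    "sub-region",
    "country",
)


def reshape_apple_csv(csv_text, region="Jakarta"):
    """Turn the wide table (one column per date) into long rows for one region."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        if record["region"] != region:
            continue
        for column, value in record.items():
            if column in ID_COLUMNS or not value:
                continue  # skip metadata columns and days with no data
            rows.append([column, record["transportation_type"], float(value)])
    rows.sort()  # order by date, then by transportation type
    return [["date", "transportation_type", "relative_volume"]] + rows


def upload_rows(rows, key_file, spreadsheet_key):
    """Replace the first worksheet's contents using a service-account key."""
    import gspread  # third-party; imported lazily so reshaping works without it

    client = gspread.service_account(filename=key_file)
    worksheet = client.open_by_key(spreadsheet_key).sheet1
    worksheet.clear()
    worksheet.append_rows(rows)
```

The long format (one row per date and transportation type) is easier to chart in a dashboard than one column per date, because new dates become new rows instead of new columns.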

Once the data is stored and updated weekly, we connect our dashboard to the spreadsheet and use the data to create visualizations.

Jakarta mobility trends

As the visualizations above show, Jakarta's mobility in 2021 was lowest in July. This suggests that the Emergency Community Activities Restrictions Enforcement (3 to 25 July 2021) effectively decreased the mobility of Jakartans.

HERE Traffic Flow

HERE Traffic API documentation page

The HERE Traffic API is a REST API that provides access to real-time traffic flow data in XML or JSON, including information on speed and congestion for the region(s) defined in each request. The API can also deliver additional data such as the geometry of the road segments in relation to the flow.

Here’s how to get the data:

  1. Create a freemium account on developer.here.com
  2. Generate your API key
  3. Go to Flow within a Bounding Box documentation to set your bounding box (bbox)
  4. For Traffic Flow data, request URL as follows: https://traffic.ls.hereapi.com/traffic/6.1/flow.json?bbox={input}&apiKey={input}
  5. Substitute your bbox and apiKey values into the request URL

Jakarta bounding box
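Steps 4 and 5 can be sketched in Python as below. The bbox format (upper-left and lower-right corners, each as latitude,longitude, separated by a semicolon) reflects our reading of the Flow within a Bounding Box documentation, so please confirm it there. The coordinates are rough placeholders around Jakarta rather than the exact box we used, and build_flow_url is a hypothetical helper.

```python
from urllib.parse import urlencode

FLOW_ENDPOINT = "https://traffic.ls.hereapi.com/traffic/6.1/flow.json"


def build_flow_url(top_left, bottom_right, api_key):
    """Build the Traffic Flow request URL for a bounding box."""
    # Assumed format: "lat1,lon1;lat2,lon2" (upper-left; lower-right).
    bbox = "{},{};{},{}".format(*top_left, *bottom_right)
    # Keep "," and ";" unescaped so the bbox stays readable in the URL.
    query = urlencode({"bbox": bbox, "apiKey": api_key}, safe=",;")
    return f"{FLOW_ENDPOINT}?{query}"


# Placeholder corners (latitude, longitude); replace with your own box and key.
url = build_flow_url((-6.0, 106.6), (-6.4, 107.0), "YOUR_API_KEY")
print(url)
```

The resulting URL can then be fetched with any HTTP client and the response parsed as JSON.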

With HERE traffic flow data, we can visualize real-time traffic and its patterns. Our bounding box covers most of the Jabodetabek area. The data is updated every minute and has ~160k rows. If we stored the data directly as a DataFrame, the table would be ~10MB for each minute. Therefore, we decided to store only the variables we consider important, which takes only ~1MB or ~15k rows for each minute. The Python script below shows how we transform the data from JSON to a DataFrame and shrink its size.
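As a rough illustration of that transformation, the sketch below flattens the nested response into one small row per flow item. It assumes the 6.1 response nests as RWS > RW > FIS > FI, with location details under TMC and speed values under CF. The fields kept here are our guess at a useful subset, not necessarily the variables stored for the dashboard.

```python
def flatten_flow(payload):
    """Flatten a Traffic Flow JSON payload into a list of small dict rows."""
    rows = []
    for roadway_set in payload.get("RWS", []):
        for roadway in roadway_set.get("RW", []):
            for flow_set in roadway.get("FIS", []):
                for item in flow_set.get("FI", []):
                    location = item.get("TMC", {})
                    for flow in item.get("CF", []):
                        rows.append({
                            "created": payload.get("CREATED_TIMESTAMP"),
                            "road": roadway.get("DE"),
                            "segment": location.get("DE"),
                            "length": location.get("LE"),
                            "speed": flow.get("SP"),
                            "free_flow_speed": flow.get("FF"),
                            "jam_factor": flow.get("JF"),
                        })
    return rows
```

Passing the resulting list to pandas.DataFrame(rows) gives a table with one column per kept field; dropping everything else (for example road geometry) is what keeps each snapshot small.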

HERE traffic data preview

Visualization of Jakarta traffic on the dashboard

Conclusion

There is a lot of public data that we can use to analyze the situation in Jakarta. In this project, we collected mobility data from several public APIs and integrated them into one dashboard. With this dashboard, we can monitor and examine the extent to which factors related to government responses and disease prevalence change human mobility in Jakarta during the COVID-19 pandemic.

This article was written by Muhammad Hadhari (Data Science Trainee), Hansen Wiguna (Business Analyst & Lead Sub-Team), and Ayu Andika (Data Analyst) from the Jakarta Smart City Data and Analytics Team. All opinions expressed in this article are personal and do not represent the point of view of Jakarta Smart City or the DKI Jakarta Provincial Government.