Visualize Hong Kong Public Estate location using Mapbox web services APIs and Open Data offered by DATA.GOV.HK

Yin
4 min read · Jun 4, 2022

Map powered by Mapbox API services

Motivation

In my recent work, I needed to identify whether an address belongs to a public housing estate. Both the estate name and the house (block) name had to be included in the analysis. My quick-and-dirty approach was to search for 'HSE' and 'EST' in the residential address field using R's built-in function grepl(), which helps flag addresses that may belong to a public estate. However, not all customers filled in both the estate name and the house name, or the corresponding district, in the application form. A reference list of public housing estates was therefore required. In this article, I will explain:

1. Web Scrape Wikipedia table using Python package: Beautiful Soup

2. Read Open Data provided by DATA.GOV.HK

3. Visualize Hong Kong Public Estate location using Mapbox web services APIs

1. How to read a Wikipedia table

Wikipedia provides a rich resource of public information created and edited by volunteers around the world. I was able to find multiple lists of public housing estates in Hong Kong, organised by district.

Import requests and BeautifulSoup library

# Request library
import requests
html = requests.get('https://en.wikipedia.org/wiki/List_of_public_housing_estates_in_Hong_Kong')

# Parsing a page with BeautifulSoup
from bs4 import BeautifulSoup
soup = BeautifulSoup(html.text, 'html.parser')

Examining the HTML code shows that every table carries class='wikitable'. All of these tables can be located with the BeautifulSoup function .find_all(), and the cell data can be read by indexing into .select('td') and calling .get_text().

HTML code showing tables with class=’wikitable’

Identify, read and store instances in pandas data frame
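The steps above can be sketched roughly as follows. This is a minimal illustration rather than the exact code behind the screenshot: the helper name read_wikitables and the tiny SAMPLE_HTML snippet (with made-up rows) are assumptions, used so the example runs without a network call. It also assumes the first row of each table holds the column headers.

```python
from bs4 import BeautifulSoup

# Tiny stand-in for the Wikipedia page (hypothetical rows, for illustration only).
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Name</th><th>Type</th></tr>
  <tr><td>Estate A</td><td>Public</td></tr>
  <tr><td>Estate B</td><td>TPS</td></tr>
</table>
<table class="wikitable">
  <tr><th>Name</th><th>Type</th></tr>
  <tr><td>Estate C</td><td>Public</td></tr>
</table>
"""


def read_wikitables(html_text):
    """Return one dict per data row across every table with class 'wikitable'."""
    soup = BeautifulSoup(html_text, "html.parser")
    rows = []
    for table in soup.find_all("table", class_="wikitable"):
        first_row = table.find("tr")
        if first_row is None:  # empty table, nothing to read
            continue
        # Assumption: the first row carries the column names.
        headers = [th.get_text(strip=True) for th in first_row.find_all("th")]
        for tr in table.select("tr"):
            cells = tr.select("td")
            if not cells:  # header row has no <td> cells
                continue
            values = [td.get_text(strip=True) for td in cells]
            rows.append(dict(zip(headers, values)))
    return rows


rows = read_wikitables(SAMPLE_HTML)
```

For the real page, html.text from the request above can be passed in place of SAMPLE_HTML, and pandas.DataFrame(rows) turns the resulting list of dicts into a data frame.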

Output 1 will be

Output based on web scrape of Wikipedia table

2. Read Open Data provided by DATA.GOV.HK

Open Government Data (“OGD”) initiatives have proliferated among governments since the mid-2000s. Officials open their data to the public, which increases transparency and raises citizens’ awareness of government activities and services.

The Hong Kong government has provided a public sector information portal, DATA.GOV.HK, since 2012. Datasets are available across several domains, such as City Management, Commerce and Industry, Finance, Environment, Housing and 15 other areas. Public Housing Estates is one of the available datasets and is provided by the Hong Kong Housing Authority. The data is in JSON format and contains additional fields compared with the Wikipedia tables, such as (a) the houses of each estate, (b) geolocation and (c) the address and contact information of the tenancy/property/carpark management offices. Link: https://data.gov.hk/en-data/dataset/hk-housing-eslocator-eslocator/resource/15048a51-0d18-463b-8f2c-3b0edbda1f62

Open Data supported by DATA.GOV.HK

The data can easily be read using the requests.get() function, as explained in my previous article. The length of the result shows that there are 235 public estates in Hong Kong.

import requests
response = requests.get("https://www.housingauthority.gov.hk/datagovhk/prh-estates.json")
json_data = response.json()
len(json_data)

Each record provides the number of blocks (houses) and the corresponding block names in three languages. The house names are separated by “<br>” in English and by “\n” in both Traditional and Simplified Chinese.

JSON format data offered by Hong Kong Housing Authority

The data retrieval involves two layers: the Estate (i) level and the House (j) level. The “No. of Blocks” data field determines the number of houses for loop j. Exception handling is added in the for loop because, for the 110th and 183rd items, the “Name of Block(s)” field does not yield the expected number of elements after splitting.
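A rough sketch of that two-level loop is shown here. The field names “No. of Blocks” and “Name of Block(s)” come from the dataset as described above, but the nested language keys ("en", "zh-Hant"), the "Estate Name" key, the helper name flatten_estates and the two sample records are all assumptions made for illustration. The real JSON layout should be checked before reusing this.

```python
# Two made-up records; the nested language-key layout is an assumption.
SAMPLE = [
    {
        "Estate Name": {"en": "Estate A"},
        "No. of Blocks": {"en": "2"},
        "Name of Block(s)": {"en": "Alpha House<br>Beta House",
                             "zh-Hant": "甲樓\n乙樓"},
    },
    {   # Claims 3 blocks but lists only 2 names, like the problem items above.
        "Estate Name": {"en": "Estate B"},
        "No. of Blocks": {"en": "3"},
        "Name of Block(s)": {"en": "Gamma House<br>Delta House",
                             "zh-Hant": "丙樓\n丁樓"},
    },
]


def flatten_estates(records):
    """Return (rows, problems): one row per house, plus estates whose counts disagree."""
    rows, problems = [], []
    for i, estate in enumerate(records):              # estate level (i)
        name = estate["Estate Name"]["en"]
        blocks = estate["Name of Block(s)"]
        names_en = [s.strip() for s in blocks["en"].split("<br>")]
        names_zh = [s.strip() for s in blocks["zh-Hant"].split("\n")]
        try:
            n_blocks = int(estate["No. of Blocks"]["en"])
        except (ValueError, TypeError):
            problems.append((i, name))                # count is not a plain number
            continue
        for j in range(n_blocks):                     # house level (j)
            try:
                rows.append({"estate": name,
                             "house_en": names_en[j],
                             "house_zh": names_zh[j]})
            except IndexError:
                # Fewer names than "No. of Blocks" claims: note it and move on.
                problems.append((i, name))
                break
    return rows, problems


rows, problems = flatten_estates(SAMPLE)
```

Collecting the mismatched estates in a separate list, instead of silently skipping them, makes it easy to review those records by hand afterwards.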

Output 2 will be

Output based on DATA.GOV.HK

3. Visualize Hong Kong Public Estate location using Mapbox web services APIs

Mapbox is an online maps API provider for creating and requesting maps. A default public token is assigned to the user after creating a Mapbox account, and an API access token is required to request maps. A map can be created with simple code using the Python package plotly, as below, with the following features:

· Geolocation by lat and lon.

· Zoom level of the map.

· Display of information when the user hovers over a data point.

· Color by 3-level regions (Hong Kong Island, Kowloon, and New Territories)

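import plotly.express as px

# token: your Mapbox access token string, defined beforehand
px.set_mapbox_access_token(token)
fig = px.scatter_mapbox(
    df_map,
    lat='Latitude', lon='Longitude',
    size_max=20, zoom=10,
    hover_data=['Estate Name en', 'Estate Name zh', 'Year of Intake'],
    color='Region',
    # Assuming Region holds text labels, a discrete palette applies;
    # color_continuous_scale only affects numeric columns.
    color_discrete_sequence=["black", "purple", "red"],
)
fig.show()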

Output 3 with interactive map!
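Since the access token should not be hard-coded in a shared notebook, one option is to read it from an environment variable. The sketch below is only a suggestion: the helper name load_mapbox_token and the variable name MAPBOX_TOKEN are my own assumptions, not Mapbox or plotly conventions. It relies on the fact that Mapbox public tokens start with "pk.".

```python
import os


def load_mapbox_token(env=None, var="MAPBOX_TOKEN"):
    """Fetch a Mapbox public token from the environment, failing early if it looks wrong."""
    env = os.environ if env is None else env
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var} environment variable to your Mapbox token")
    if not token.startswith("pk."):
        # Public tokens begin with "pk."; anything else is probably the wrong value.
        raise ValueError("expected a public token starting with 'pk.'")
    return token


# Example with an explicit mapping instead of the real environment.
token = load_mapbox_token({"MAPBOX_TOKEN": " pk.example "})
```

The returned value can then be passed to px.set_mapbox_access_token().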

How can we use this technique for financial institutions?

One frequent use of map functions in the banking industry is the ATM/branch locator. Bank websites and mobile applications usually embed a locator that lets customers search for ATM/branch locations, along with information such as opening hours, address, contact number and even a link to an e-ticketing reservation system.
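As a rough idea of what sits behind such a locator, the sketch below picks the closest branch to a customer using the haversine great-circle distance. It is only an illustration, not how any particular bank implements it: the function names and the three branch coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest(customer, places):
    """Return the (name, lat, lon) tuple in places closest to customer (lat, lon)."""
    return min(places,
               key=lambda p: haversine_km(customer[0], customer[1], p[1], p[2]))


# Hypothetical branch coordinates, for illustration only.
BRANCHES = [
    ("Branch A", 22.28, 114.16),
    ("Branch B", 22.32, 114.17),
    ("Branch C", 22.45, 114.03),
]

closest = nearest((22.30, 114.17), BRANCHES)
```

A linear scan like this is fine for a few hundred branches; a spatial index would only matter for much larger point sets.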

Any other thoughts? Please share how your organization uses geolocation/spatial information~

Yin

A McLaren F1 racing fan who is interested in FinTech and Data Analytics