Opening a Café in Kuala Lumpur: An Exploratory Study Using the Foursquare API
Purpose: The purpose of this study is to establish key areas in the city of Kuala Lumpur using an alternative method, describe the main activities available within the city and identify the best places to open a café in the vicinity.
Design/methodology/approach: The study used publicly available data on government clinics and private hospitals around Kuala Lumpur. The data analysis was supported mainly by the Foursquare API and the pandas, scikit-learn and folium packages. The number of clusters, K, was identified using a combination of the “elbow” method, the silhouette score and the sum of squared errors.
Findings: The study established 45 key areas in Kuala Lumpur. It recommends “eating”, “food-hopping” and “café-hopping” as the main activities for visitors to the city. Finally, the study identified 3 clusters within the city suitable for a newly opened café. A few “pocket areas” were also identified where a new café can thrive with little to no competition.
Limitations/implications: The Foursquare API data may not be up to date for the city, and the study lacks additional data sources to corroborate the findings.
Value: The study contributes to the existing body of analysis of the city. The method used can be modified to analyze Covid-19 clustering and to identify risk areas or zones.
1.0 Introduction
The dazzling city of Kuala Lumpur is a federal territory of Malaysia, which is situated in the heart of Southeast Asia. The city has had a population of more than 1.73 million in recent years.
Kuala Lumpur is the cultural, financial and economic centre of Malaysia. It is also home to the Parliament of Malaysia and to the Istana Negara (Royal Palace), the official residence of the Yang di-Pertuan Agong (the King). The city is the country's model of modern urban lifestyle, comparable to its neighbor, Singapore.
The city consists of 11 parliamentary constituencies, which form divisions within the city. Each constituency represents a seat in the Parliament of Malaysia, filled by the political party representative who wins the General Election for that area.
Coffee is among the favorite drinks of the city's dwellers. Coffee consumption habits in the city are very dynamic, as they reflect city trends and lifestyle. This stimulates the establishment of café businesses and in turn escalates competition among them, as evidenced by the various cafés with unique themes and approaches that have opened in the city in recent years.
This study uses data on government clinics and private hospitals to establish key areas within Kuala Lumpur. As the main objective of government clinics is to provide healthcare services to the general public, the clinics are sited where they are easily accessible. The same goes for private hospitals, even though they are profit-driven.
1.1 Objectives of the Study
This exploratory study aims to answer three basic business questions:
1.1.1 What is the key area within Kuala Lumpur?
Because of the way Kuala Lumpur is administered, the city is divided in an unusual way. For example, it is divided into 11 parliamentary constituencies, which is too coarse a division for a 243 km² area.
Malaysia also uses a postcode system. However, Kuala Lumpur has more than 200 postcodes and they are not evenly distributed, which means some areas contain a much denser concentration of distinct postcodes than others.
Thus, this study employs an alternative approach to establish Kuala Lumpur's key areas. We use data on government clinics and private hospitals around the city to address this question.
1.1.2 Where is the best place to open a Café in Kuala Lumpur?
With intense competition among café establishments in Kuala Lumpur, this study sheds some light for new business owners who are strategizing their business plans.
Most new businesses struggle when they have to compete head-on with bigger, more mature direct competitors, which may have the advantage of established brand names and a large base of loyal customers.
One strategy is to locate as far as possible from direct competition and grow organically from there. If this is not possible, focusing on a more ‘diluted’ location is an alternative strategy.
This study aims to give new café owners ideas for reducing head-on competition with existing businesses by locating their establishment strategically.
1.1.3 What is the best description of Kuala Lumpur?
As we are looking for places to open a café, why not go a little further and analyze the city as a whole?
There are many activities to do in Kuala Lumpur. Based on Foursquare data, we further analyze the top attractions and activities in this vibrant city.
1.2 Main Audience of This Study
The main audiences for this study are as follows:
Prospective café owners: This study aims to give prospective café owners insights on how to strategize their café opening in Kuala Lumpur.
Coffee lovers: Coffee lovers can focus their search for the best coffee on the areas most densely populated with café establishments.
Tourists: With further analysis describing Kuala Lumpur, tourists can get some insight into what to expect from their next visit to the city.
Data scientists: The approach employed in this study may give data scientists ideas on how to tackle problems with the same theme. As limited resources are available for this city, we hope this study adds to the existing body of analysis of Kuala Lumpur and its surroundings.
2.0 Data Description
- List and location of government clinics around Kuala Lumpur. The data was filtered and web-scraped from the official website of the Malaysia Ministry of Health.
- List of private hospitals around Kuala Lumpur. The data was web-scraped from a Wikipedia page and stored on GitHub.
- Location data from the Foursquare API. This is used to analyze the surrounding activities of the city.
- Location data from the HERE Location Services API. This is used for geocoding and reverse geocoding.
- Kuala Lumpur GeoJSON. The data is taken from the demarcation lines of the previous general election and is used to demarcate the area within the city of Kuala Lumpur.
3.1 Establishment of Kuala Lumpur Key Areas
We web-scraped the list of government clinics in Kuala Lumpur from the Ministry of Health’s website using pandas. A snapshot (head) of the scraped result is as follows:
For the private hospitals, we had previously web-scraped the data from Wikipedia and stored it on GitHub as a .CSV file. We imported the data into the notebook using the pandas read_csv function. A snapshot (head) of the extraction is as follows:
The two datasets were then combined and all unnecessary columns were dropped. A snapshot is as follows:
We have now established Kuala Lumpur's key areas from the combination of these two lists.
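The combining step described above can be sketched with pandas as follows. This is a minimal illustration only: the sample rows, the column names ("Name", "Address", "Phone", "Beds") and the helper combine_facilities are hypothetical and are not taken from the study's actual data.

```python
import pandas as pd

# Hypothetical sample rows; the real lists come from the Ministry of Health
# website (clinics) and a Wikipedia page (private hospitals).
clinics = pd.DataFrame({
    "Name": ["Klinik Kesihatan A", "Klinik Kesihatan B"],
    "Address": ["Address A", "Address B"],
    "Phone": ["03-0000 0001", "03-0000 0002"],
})
hospitals = pd.DataFrame({
    "Name": ["Hospital C"],
    "Address": ["Address C"],
    "Beds": [100],
})


def combine_facilities(clinics, hospitals, keep=("Name", "Address")):
    """Stack both lists and keep only the columns needed downstream."""
    combined = pd.concat([clinics, hospitals], ignore_index=True, sort=False)
    return combined[list(keep)]


key_areas = combine_facilities(clinics, hospitals)
print(key_areas.head())
```

Columns that exist in only one list (such as the assumed "Phone" or "Beds") are simply discarded by the final column selection.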
3.2 Geocoding of Kuala Lumpur key areas
We used the HERE Location Services API to find the coordinates of each key area. Duplicate records, areas named ‘NA’ and areas outside Kuala Lumpur were dropped.
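The cleaning rules described above might look like the sketch below. It assumes the geocoder output has already been loaded into a data frame. The column names ("Area", "City", "Latitude", "Longitude"), the sample coordinates and the helper clean_key_areas are illustrative assumptions, and checking a "City" field is only one possible way to detect areas outside Kuala Lumpur.

```python
import pandas as pd

# Hypothetical geocoding output; coordinates are rough illustrative values.
geocoded = pd.DataFrame({
    "Area": ["Setapak", "Setapak", "NA", "Cheras", "Petaling Jaya"],
    "City": ["Kuala Lumpur", "Kuala Lumpur", "Kuala Lumpur",
             "Kuala Lumpur", "Petaling Jaya"],
    "Latitude": [3.19, 3.19, 3.15, 3.10, 3.11],
    "Longitude": [101.71, 101.71, 101.70, 101.74, 101.61],
})


def clean_key_areas(df, city="Kuala Lumpur"):
    """Drop unnamed areas, areas outside the city, and duplicate areas."""
    df = df[df["Area"].notna() & (df["Area"] != "NA")]
    df = df[df["City"] == city]
    df = df.drop_duplicates(subset="Area")
    return df.reset_index(drop=True)


cleaned = clean_key_areas(geocoded)
print(cleaned)
```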
Further, we extracted all the information needed for the next steps. A snapshot (head) of the extraction is as follows:
3.3 Visualization of key areas
We used folium to visualize the latitude and longitude output of the geocoding exercise. We then added the Kuala Lumpur GeoJSON to demarcate the Kuala Lumpur area on the map.
3.4 Nearby venues around Kuala Lumpur City Centre
We used the Foursquare API to extract popular venues around the city centre together with their ratings. For this purpose, we set a radius of 1 km and a limit of 100 venues from the centre point of Kuala Lumpur. The result (head) is as follows:
From this data, we generated the “Top 20 Popular Venue Categories in Kuala Lumpur”.
We also generated the “Top 20 Highest Score Venue Categories in Kuala Lumpur City Centre”.
3.5 Nearby venues around each Kuala Lumpur key area
We expanded the analysis to each of the Kuala Lumpur key areas. For this, we used the Foursquare API with a 2 km radius and a limit of 100 venues per location.
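The request parameters described above (a radius and a limit of 100 venues per location) can be illustrated by building the request URL without sending it. This sketch assumes the legacy Foursquare v2 "venues/explore" endpoint and its query parameters. The study does not state which endpoint or API version it called, so treat the URL, the placeholder credentials, the version date and the example coordinates as assumptions.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint: legacy Foursquare v2 "explore" API.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, radius_m=2000, limit=100,
                      client_id="YOUR_CLIENT_ID",
                      client_secret="YOUR_CLIENT_SECRET",
                      version="20200101"):
    """Build (but do not send) a venue exploration request URL."""
    params = {
        "client_id": client_id,          # placeholder credential
        "client_secret": client_secret,  # placeholder credential
        "v": version,                    # API version date (assumed value)
        "ll": f"{lat},{lng}",            # latitude,longitude of the key area
        "radius": radius_m,              # search radius in metres
        "limit": limit,                  # maximum venues per call
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


# Approximate coordinates used purely as an example point.
url = build_explore_url(3.139, 101.6869)
print(url)
```

Calling the same helper with radius_m=1000 would correspond to the 1 km city-centre query in section 3.4.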
We collected 4,476 venues in 297 unique categories from the Foursquare API calls. The head of the extraction is as follows:
We transformed the data using one-hot encoding and analyzed how frequently each venue category occurs in each area, keeping up to the top 10 venue categories per area. The result (head), transferred to a pandas data frame, is as follows:
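The one-hot encoding and ranking step can be sketched as below. The sample venues, the column names ("Area", "Venue Category") and the helper top_categories are hypothetical. The sketch only shows the general technique of encoding categories, averaging per area and keeping the most frequent ones.

```python
import pandas as pd

# Hypothetical venues for two areas.
venues = pd.DataFrame({
    "Area": ["A", "A", "A", "B", "B", "B"],
    "Venue Category": ["Café", "Café", "Hotel",
                       "Malay Restaurant", "Malay Restaurant", "Café"],
})


def top_categories(venues, top_n=10):
    """Rank venue categories by their relative frequency within each area."""
    # One column per category, 1.0 where the venue belongs to it.
    onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
    onehot.insert(0, "Area", venues["Area"])
    # The mean of the one-hot columns is the share of each category per area.
    freq = onehot.groupby("Area").mean()
    n = min(top_n, freq.shape[1])
    rows = []
    for area, row in freq.iterrows():
        ranked = row.sort_values(ascending=False).head(n)
        rows.append([area] + list(ranked.index))
    columns = ["Area"] + [f"Top {i + 1}" for i in range(n)]
    return pd.DataFrame(rows, columns=columns)


top = top_categories(venues, top_n=2)
print(top)
```

The per-area frequency table (freq inside the helper) is also the natural input for the clustering step, since each area becomes one numeric row.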
3.6 Determination of K
To proceed with clustering, we need to determine the number of clusters, K. For this purpose, we ran the unsupervised K-means algorithm over a range of K values and applied the Elbow Method. The range of K was set to be up to 8 for optimum results.
The result, however, did not show any distinctive “elbow” from which to ascertain K. As an alternative, we used the Silhouette Score, read together with the Sum of Squared Errors (SSE), for better accuracy.
From the figure above, we can see that the Sum of Squared Distances, or SSE, decreases as K becomes larger, while the Silhouette Score is inconsistent across values of K. Thus, for a more meaningful clustering, we chose K=7 for this study, since it has the lowest SSE and a considerably high Silhouette Score.
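Reading SSE and Silhouette Score side by side, as described above, can be sketched with scikit-learn as follows. The feature matrix here is synthetic (three artificial groups of 15 points) and stands in for the per-area category frequencies. The helper evaluate_k, the random seed and the inclusive reading of "up to 8" as K = 2..8 are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def evaluate_k(X, k_values, seed=0):
    """Return (k, SSE, silhouette score) for each candidate k."""
    results = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        sse = model.inertia_  # sum of squared distances to the nearest centroid
        sil = silhouette_score(X, model.labels_)
        results.append((k, sse, sil))
    return results


# Synthetic stand-in data: 45 points in 3 well-separated groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.05, size=(15, 4))
               for c in (0.0, 0.5, 1.0)])

scores = evaluate_k(X, range(2, 9))
for k, sse, sil in scores:
    print(f"K={k}  SSE={sse:.3f}  silhouette={sil:.3f}")
```

Because SSE generally keeps falling as K grows, it cannot pick K on its own. The silhouette column is what tells apart values of K that merely split existing groups further.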
3.7 Key areas clustering
Based on the determined value of K, we ran KMeans and merged the cluster label of each area with our result from section 3.5. The outcome (head) of the merge is as follows:
With this data, we used folium to visualize the clusters on the map:
Key areas based on cluster are summarized as follows:
Based on the analysis of each cluster, we can summarize the top venues for each cluster as follows:
Cluster 1: Chinese and Malay restaurants
Cluster 2: Malay restaurants
Cluster 3: Hotels
Cluster 4: Chinese restaurants
Cluster 5: Japanese restaurants, Chinese restaurants, cafés and ice cream shops
Cluster 6: Malay restaurants
Cluster 7: Indian restaurants and ice cream shops
3.8 Specific Analysis on Café location around Kuala Lumpur
Initially, we used the data from section 3.5, filtered to the venue categories “Café” and “Coffee Shop”. We used folium to visualize these venues on the map as follows:
The visualization gave us a rough idea of the best locations to open a café, but it was inconclusive due to a lack of data. We therefore used the Foursquare API to extract café and coffee shop establishments specifically around the key areas from section 3.2. The category IDs for this extraction were taken from the Foursquare documentation site.
The request returned 2,239 results. A snapshot of the extraction is as follows:
Using folium, we generated a heatmap to help us analyze the density of cafés around Kuala Lumpur. We also demarcated the city centre.
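As a numeric complement to reading density off a heatmap, one could count the cafés within a fixed distance of each key area. The sketch below does this with the haversine great-circle formula. It is not the method used in this study, and the coordinates, the 2 km radius and the helper names are illustrative assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def cafes_within(area, cafes, radius_km=2.0):
    """Count cafés whose distance from the area centre is within the radius."""
    return sum(
        1 for lat, lon in cafes
        if haversine_km(area[0], area[1], lat, lon) <= radius_km
    )


# Hypothetical key area and café coordinates (latitude, longitude).
area = (3.15, 101.70)
cafes = [(3.151, 101.701), (3.16, 101.70), (3.30, 101.70)]

count = cafes_within(area, cafes)
print(count)
```

Key areas with a low count would be candidates for closer inspection, in the same spirit as the low-density regions of the heatmap.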
To further assist in analyzing the heatmap, we overlaid the cluster data on the folium map as follows:
4.0 Results and Discussions
As mentioned earlier, this exploratory study aims to answer some basic business questions. We discuss the results for each of the three business questions from section 1.1.
4.1 What is the key area within Kuala Lumpur?
This study explores an alternative approach to establishing key areas in Kuala Lumpur. Based on the data on government clinics and private hospitals around Kuala Lumpur, we identified 45 locations that can serve as the main centroids of Kuala Lumpur.
4.2 Where is the best place to open a Café in Kuala Lumpur?
Based on the analysis of the heatmap, we advise new café establishments to:
1. avoid opening a café within clusters 1, 2, 3 and 7;
2. focus on opening their café within clusters 4, 5 and 6 for more manageable competition; and
3. explore the few “pocket areas” we identified within the heatmap, namely Bukit Tunku, Penchala, Setapak, Desa Melawati and Sungai Besi.
These “pocket areas” are marked with a black “x” in the heatmap above. They appear to be less crowded and may have low competition. However, not all “pocket areas” are worth exploring: those marked with a red “x”, for example, are large cemeteries, which explains the absence of cafés there.
Clusters 1, 2, 3 and 7 are crowded with cafés, so competition within these areas may be fierce. The high concentration of cafés there is attributable to their location in the middle of Kuala Lumpur City Centre.
Thus, we recommend that new café owners consider opening their café in line with points 2 and 3 above. Competing head-on with more popular, mature establishments may be a bad idea at the beginning of the business. The owner of a new café can instead develop the brand first and move gradually into the city centre once the business is sizeable and the timing is appropriate.
4.3 What is the best description of Kuala Lumpur?
4.3.1 The City Centre
Based on our analysis in the earlier stage of this study, we can say that the best activity in Kuala Lumpur is “eating” or “food-hopping”. This is because 12 out of the 20 popular venues in Kuala Lumpur are related to food and beverages. This is not a surprise considering the diverse demography of Kuala Lumpur's population.
To be brief, there are 3 major races in Malaysia: Bumiputra (the original settlers of Malaysia), Chinese (whose ancestors arrived from Mainland China) and Indian (whose ancestors arrived from India). These 3 races practice their own traditions, cultures and sets of beliefs. This diversity has also translated into diverse types of food in Malaysia.
“Café-hopping” is another activity to consider whenever you are in Kuala Lumpur. Café and Coffee Shop establishments are among the top 10 popular venues in Kuala Lumpur, which is supported by the opening of cafés with various themes around the city in recent years. You might not want to miss this experience.
4.3.2 Kuala Lumpur Key Areas
The results from section 3.7 are consistent with section 4.3.1: the top venues of 6 out of 7 clusters are related to food and beverages. This further strengthens our suggestion in section 4.3.1 that “eating”, “food-hopping” and “café-hopping” are the main go-to activities while you are in Kuala Lumpur.
The study has established key areas in Kuala Lumpur, identified the main activities available within the city and analyzed the best places to open a café within the city. We identified clusters that are suitable breeding grounds for new cafés and a number of “pocket areas” in Kuala Lumpur that, if used wisely, could become a great advantage to a new café owner.
The analysis used free Foursquare API calls, which are limited to 100 results per call. Thus, the results may not take into account all available data within the search area.
The locations recommended in this study are based solely on data available in Foursquare, so additional data from other sources would help to corroborate the findings. We also recommend an “on-site” study before executing a business plan in any of these areas.
Further, the data for Kuala Lumpur may not be up to date. Unlike data from Google, which is frequently updated by its users, data from Foursquare may lack this attribute. There is also a possibility that the data includes businesses that are no longer in operation.
5.2 Direction of Future Study
This study also aims to provide a foundation for future studies of Kuala Lumpur and its surroundings. It can be enhanced by using other APIs, such as the Google Maps API, to provide more insights about Kuala Lumpur and its surroundings.
Finally, in relation to the practicality of this study, the same method can be used or modified to analyze Covid-19 clustering and to identify risk areas or zones within the city.