Photo by Isaac Smith on Unsplash

The case study series — learning data science independently.

(English) Data visualization — 3rd

3rd case study.

Arul Hasbi
May 22 · 14 min read

Introduction

Having written statistics and machine learning case studies at the start of my independent data science journey, I’d like to add data visualization as my next case study and walk through the whole process in detail. The main topic is how COVID-19 impacted Malaysia's economy. *As a side note, all of the supporting information and data used in the visualization were obtained from Malaysia Open Government Data between the middle of October and the end of December 2020. Therefore, the information presented in the visualization does not reflect current circumstances.

I carried out this project as part of a three-person group: myself 🇮🇩, Ibnu 🇮🇩, and Mokarram 🇧🇩. Although the ultimate objective of the case study sounds trivial (visualizing information related to Malaysia’s economy amid COVID-19), it is important to understand the issue comprehensively so that we can filter out unnecessary data and keep the data that really matters. Once the data is curated, we can transform it into information through graphical representations and arrange it in a storytelling manner. We used a number of tools throughout the project: Alteryx, Python, Power BI, Microsoft Excel, and Tableau. As usual, the content of the case study is split into sections. Without further ado, let’s dive in.

Background

The uncertainty caused by the widespread COVID-19 pandemic has impacted countries all over the world. Two significant sectors were struck simultaneously: health and the economy. Governments have taken numerous actions, such as policy reforms, in these sectors to protect their people and nations. Like any other country, Malaysia has been fighting hard to reduce the impact of the pandemic and rebuild its economy. It is also regarded as one of the countries that handled the crisis successfully despite lacking critical advantages and extensive resources. During the early surge of COVID-19 cases (1st quarter of 2020), the Health Director-General, Dr. Noor Hisham Abdullah, and frontline health professionals at the Ministry of Health pursued aggressive and strategic contact tracing to slow the transmission of the virus.

In late March, the region considered a red zone (hotspot) was placed under immediate lockdown, and mobile testing was deployed to test all residents for symptoms. In the 2nd quarter of 2020, the Malaysian government enforced further movement restrictions to prioritize Malaysians’ safety and health. The restrictions came as the Movement Control Order (MCO), the Conditional Movement Control Order (CMCO), and the Recovery Movement Control Order (RMCO). These restrictions successfully contained the spread of COVID-19, yet they affected Malaysia’s overall economic activity.

Problem statement

The Movement Control Order (MCO) of March 18 affected Malaysia’s economic activities the most: most companies had their employees work from home, and some employees had to stop working. During this period, the Malaysian economy operated at 45%, as stated on the PMO site. It was followed by the Conditional Movement Control Order (CMCO) on May 4 and the Recovery Movement Control Order (RMCO) from June 10 until August 31. These later orders reopened the Malaysian economy in stages under strict Standard Operating Procedures (SOPs).

Lately, there has been a rising trend in COVID-19 cases, which Health Director-General Dr. Noor Hisham Abdullah described as a possible new wave. The increase led the government to enforce another CMCO, emphasizing work from home for 800,000 private-sector workers and 200,000 civil servants in several areas. Once again, sudden uncertainty is affecting the plan to revive the economy. Thus, the Economic Action Council (EAC) needs support in understanding the current economic issues so that it can plan strategically and recommend feasible action plans that contribute to economic recovery.

Business understanding

To better understand how COVID-19 impacted the Malaysian economy, a systematic approach is critical. Building a comprehensive business understanding requires identifying six elements: the business case, literature review, business objective, business use case, business questions, and insights.

The business case encompasses three subsequent activities: understanding the requirement, identifying gaps, and possible solutions.

We performed a SWOT analysis to get a quick overview of Malaysia’s current economic situation and issues. The main references for the analysis were the Malaysia Economic Performance reports for Q1 2020 and Q2 2020, published by the Department of Statistics Malaysia. We also read news coverage of current economic conditions and issues in Malaysia. The result of the SWOT analysis is shown in Figure 1.

Figure 1. SWOT analysis

From the opportunities section of the SWOT analysis, two domains stand out as potential business cases related to the impact of COVID-19 on Malaysia's economy and to the EAC's core functions: tourism and the labor market. A brief elaboration explains why these domains are relevant.

Tourism

According to the Tourism Satellite Account 2019, the Malaysian tourism industry contributed 15.9% of GDP, and more than half of that came from international tourism expenditure. 2020 was expected to be a big year for the industry thanks to the “Visit Malaysia 2020” campaign, yet what arrived was not millions of guests but a virus. The situation has been difficult because sub-sectors such as hotels, airlines, transportation, and local businesses (restaurants, street stalls, etc.) have all suffered. The tourism industry therefore needs to recover so that it can contribute to the country's recovery. Since the largest share of tourism income came from international expenditure, an action plan to intensify domestic tourism is required.

Labor market

To sustain productivity, industries have started to embrace digitalization and further spur Industrial Revolution 4.0, in which automation becomes a larger part of the economy. Yet the majority of economic activity in Malaysia takes place in labor-intensive industries. In addition, more than 50% of jobs and vacancies in the labor market are in the semi-skilled category, and 48% of employment is in SMEs spread across five production sectors. According to the latest labor market report from the Department of Statistics Malaysia, the unemployment rate is still 4.7% after reaching 4.9% in June. The unemployment issue might become even more concerning given the recent increase in COVID-19 cases, and lower labor demand might further affect the labor market.

Two issues might come up: an oversupply in the job market, especially of new graduates, and a resulting “graduate mismatch,” in which graduates who were expected to take high-skilled jobs occupy low-skilled jobs because of limited job opportunities. If this issue is prolonged, the graduates’ talents and skills cannot be fully used for the economy. Therefore, an action plan is needed that uses the momentum of this crisis to push industries to create more skilled jobs that absorb the growing supply of graduate labor, thus contributing to the nation's economy in the long run.

Based on the identified gaps, there are two solutions that might be worth considering by the EAC.

  1. Create a short-term action plan to intensify and promote domestic tourism.
  2. Create a long-term action plan to stimulate sectors, especially e-commerce, delivery services, and information and technology, to create more skilled jobs that absorb the growing graduate labor supply and put its potential to work for the economy.

The literature review provides the reasoning for why the identified domains, the tourism industry and the labor market, are relevant business cases for the EAC to consider with regard to the impact of COVID-19 on Malaysia’s economy. The summary of the literature review is shown in Figure 2.

Figure 2. Literature review

The literature review highlights findings from three journal papers that support the relevance of the tourism industry and the labor market as business cases for the EAC with regard to the impact of COVID-19 on Malaysia’s economy.

Tourism

For the tourism domain, Foo et al. (2020) discussed how the COVID-19 outbreak severely affected the Malaysian tourism industry. The paper further stated that the government had already provided economic stimulus packages to help sustain businesses, especially within the tourism industry.

However, the economic stimulus packages are only a reliever, not a cure; they will not last long enough to keep businesses from eventually failing. Thus, Mokhtar et al. (2020) suggested several strategies to help sustain businesses within the tourism sector. Among the five strategies, strengthening domestic tourism was mentioned. This strategy aligns with the problem stated by the Department of Statistics Malaysia: more than 50% of the national tourism sector's contribution comes from international tourism expenditure.

Labor market

For the labor market domain, Fields (2006) emphasized the concept of stimulating economic development through an effective labor market. He further discussed the importance of policy objectives for achieving economic development via the labor market. Recall from the earlier business case discussion that there are two issues in this domain: oversupply and a high potential for “graduate mismatch”.

The key problem in this domain is the low absorption of the growing graduate supply in the labor market due to limited job opportunities. Despite the underemployment issue, the Department of Statistics Malaysia reported an increase in employed persons, particularly in e-commerce, delivery services, and information and technology-related activities. These sectors are an opportunity to absorb the growing graduate labor supply.

The COVID-19 pandemic has caused a remarkable global loss of life and poses an unparalleled threat to public health, food supplies, and overall economic conditions worldwide. The world economy is struggling to survive, and the economic condition in Malaysia is no exception. The remedies therefore lie in policy implementation.

Since our business case is COVID-19, we set our business objective as minimizing the impact of the pandemic on the economy by diversifying the oversupplied labor force to intensify domestic tourism. Following the CRISP-DM process, the business objective needs to be structured under the SMART criteria. SMART stands for specific, measurable, achievable, relevant, and time-bound. Figure 3 shows the justification that the business objective fulfills three of the five criteria.

Figure 3. SMART assessment

The EAC is the stakeholder for the analytics team. The EAC should adopt policies that aid the tourism industry in order to reduce the negative impact of COVID-19 on the economy. More than 50% of the industry's income comes from international tourists, so any change in the tourism sector can have a significant impact on the economy. Moreover, the labor market has been significantly damaged by the pandemic. A large proportion of the labor force has already lost its only source of income, which slows economic growth by reducing consumer spending.

The labor market has been facing a large labor supply while the demand for labor is declining. Moreover, a rising unemployment rate has several consequences, such as moral and ethical degradation, mental breakdown among youth, a higher dependency ratio, and a lower growth rate (a 1% increase in unemployment can decrease GDP by 2%). Therefore, diversification of the labor force is much needed. Diversification is possible by either creating new job facilities or creating more employment opportunities.
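The rule of thumb quoted above (a 1% rise in unemployment costing roughly 2% of GDP) can be turned into a quick back-of-the-envelope calculation. The Python sketch below is only an illustration: the function name is made up, treating the 2-to-1 ratio as a fixed constant is a simplifying assumption, and the example input is hypothetical rather than a figure from our datasets.

```python
# Rule of thumb from the text: each 1-point rise in unemployment is
# associated with roughly a 2% drop in GDP. Treating this as a fixed
# ratio is a simplifying assumption made for illustration only.
GDP_LOSS_PER_UNEMPLOYMENT_POINT = 2.0


def estimated_gdp_change(unemployment_before: float, unemployment_after: float) -> float:
    """Return the estimated GDP change in percent (negative means a loss)."""
    delta = unemployment_after - unemployment_before
    return round(-GDP_LOSS_PER_UNEMPLOYMENT_POINT * delta, 2)


# Hypothetical example: unemployment rising from 3.0% to 4.0%.
print(estimated_gdp_change(3.0, 4.0))  # -2.0
```

A negative result reads as an estimated loss, so the hypothetical one-point rise maps to a loss of about 2% of GDP.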

Business questions guide the insights behind analytical decisions. The business questions for tourism and the labor market explicitly describe the current situation, identify possible choices, and indicate clues for better insights. For the labor market, they not only evaluate the current situation but also indicate the choices the government can make to offset the economic loss.

Figure 4. Business questions & insights

Data understanding

The data understanding phase consists of three tasks: data description, data validity, and data exploration. The outcomes of these tasks are explained below.

Most of the data were collected from MAMPU Data Terbuka. A common characteristic of the datasets is that they come in summary format (accumulations, percentage shares, etc.). Additionally, not all of the datasets fulfilled our main requirements, such as including the latest 2020 data for Q1, Q2, and Q3. Thus, we tried to collect as many datasets as possible while staying relevant to the predefined business objective.

  1. Unemployment Rate (1991–2020)
  2. Tourism Ratio (2005–2019)
  3. Labour Force Participation Rate (1982–2020)
  4. International Tourist Receipt (2000–2019)
  5. International Tourist Arrival (2000–2020)
  6. Inbound Tourism Expenditure of Tourists by Products (2005–2018)
  7. GDP Growth by Sectors (2018–2020)
  8. Employed Persons in the Tourism Industry (2005–2018)
  9. Domestic Tourism Expenditure of Tourists by Products (2005–2018)
  10. Covid Case (2020)
  11. Consumer Price Index (2019–2020)
  12. Consumer Confidence Index (2010–2020)

We initially collected 20+ datasets and then curated them down to the 12 that matter most for addressing the predefined business questions. The datasets are used in different contexts. Figure 5 shows the structure of these contexts.

Figure 5. Data usage context structure

The structure consists of three main contexts: background, objective, and conclusion. The datasets in the background context support the factual basis of our business use case, the local tourism industry and the labor market. The datasets in the objective context provide a summary of insights on opportunities for intensifying the local tourism industry for the EAC. The dataset in the conclusion context provides additional confidence in the local tourism intensification plan.

After completing the dataset collection, we needed to check data quality. Quality checking is important for spotting issues such as invalid values, missing values, and atypical values within the datasets. The issues found are later resolved through data processing with Alteryx and Python.
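To make the three kinds of issues concrete, here is a minimal Python sketch of such a check on a single numeric column. It is not the exact check we ran: the function name, the lower bound of 0 for "invalid" values, and the 1.5 × IQR rule for "atypical" values are all assumptions chosen for illustration, and the sample numbers are placeholders.

```python
from statistics import quantiles


def quality_report(values, lower_bound=0.0):
    """Classify entries of a numeric column as missing, invalid, or atypical.

    values      -- list of numbers, with None marking a missing entry
    lower_bound -- smallest value that makes sense (assumed 0 for rates and counts)
    """
    missing = [i for i, v in enumerate(values) if v is None]
    invalid = [i for i, v in enumerate(values) if v is not None and v < lower_bound]

    # Atypical values: outside 1.5 * IQR of the valid entries (an assumed rule).
    valid = [v for v in values if v is not None and v >= lower_bound]
    q1, _, q3 = quantiles(valid, n=4)
    spread = 1.5 * (q3 - q1)
    atypical = [
        i for i, v in enumerate(values)
        if v is not None and v >= lower_bound and (v < q1 - spread or v > q3 + spread)
    ]
    return {"missing": missing, "invalid": invalid, "atypical": atypical}


# Placeholder numbers, not taken from the actual datasets.
sample = [3.2, 3.4, None, 3.3, -1.0, 3.1, 3.5, 12.0, 3.3]
report = quality_report(sample)
print(report)
```

The report lists row positions rather than fixing anything, so the decision on how to handle each issue stays with the later processing step.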

The data validity check showed that most of the issues were atypical and null values. To tackle them, we apply some quick preprocessing. A null value can either be removed or imputed. Imputation requires information from other variables. If no variable can support an imputation, we remove the affected row(s) from the dataset.
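The impute-or-remove rule can be sketched in a few lines of Python. This is a simplified illustration, not our actual Alteryx workflow: the function name and column names are hypothetical, the numbers are placeholders, and filling a missing total with the sum of its components is just one assumed way that other variables can support an imputation.

```python
def impute_or_drop(rows, target, helpers):
    """Fill a missing `target` from the sum of `helpers`, else drop the row."""
    cleaned = []
    for row in rows:
        if row[target] is not None:
            cleaned.append(dict(row))
        elif all(row.get(h) is not None for h in helpers):
            # Helper variables are available, so the gap can be imputed.
            cleaned.append({**row, target: sum(row[h] for h in helpers)})
        # Otherwise nothing supports an imputation and the row is left out.
    return cleaned


# Placeholder rows with hypothetical column names.
rows = [
    {"year": 2016, "domestic": 60.0, "inbound": 80.0, "total": 140.0},
    {"year": 2017, "domestic": 65.0, "inbound": 85.0, "total": None},
    {"year": 2018, "domestic": None, "inbound": 90.0, "total": None},
]
cleaned = impute_or_drop(rows, target="total", helpers=["domestic", "inbound"])
print([(r["year"], r["total"]) for r in cleaned])
```

In this example the 2017 row is kept with an imputed total, while the 2018 row is removed because one of its helper values is missing as well.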

To understand the collected datasets better, we performed a quick data exploration to see whether their variables lend themselves to univariate, bivariate, or multivariate analysis. Data exploration matters because it can reveal insights that are relevant to our predefined business use case and help answer the business questions. As a result, we found that the datasets are standalone (not related to each other), but one dataset can serve as a helper to explain another (for example, as an alternative source for imputation).

Proposed insights

We have performed three processes: data description, data validity, and data exploration. Along the way, we identified new insights that might help the business objective. Figure 6 shows the potential insight(s).

Figure 6. Proposed insights

Visualizations

We built the data visualization from data that had been preprocessed with Alteryx, Python, and Microsoft Excel. We used Tableau to create the visualizations and compile them into separate dashboards that follow the predefined data usage context structure (refer to Figure 5). We used the Tableau story feature to arrange the dashboards in a storytelling manner. Below is a series of figures that show the visualization results. Each visualization has a description above it, and we made sure the graphs are self-explanatory.

Covid-19 in Malaysia

Figure 7. A Sankey diagram of covid-19 cases in 2020
Figure 8. A Nightingale Rose diagram of total covid-19 cases grouped by states
Figure 9. A dashboard that shows the total covid-19 cases grouped by states represented in a geographical map and ranked bar chart (highest to lowest)
Figure 10. A dashboard that shows the total covid-19 cases all states combined and daily new cases grouped by states represented in an area chart and line chart
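The ranked bar chart and the area chart in these dashboards rest on two simple aggregations: total cases per state sorted from highest to lowest, and a running total across all states. We built them in Tableau, so the Python sketch below is only an assumed equivalent of that logic, with placeholder state names and numbers.

```python
from collections import defaultdict
from itertools import accumulate

# Placeholder records of (day, state, new cases); not real figures.
records = [
    (1, "State A", 120), (1, "State B", 90), (1, "State C", 15),
    (2, "State A", 150), (2, "State B", 110), (2, "State C", 10),
]

state_totals = defaultdict(int)
day_totals = defaultdict(int)
for day, state, cases in records:
    state_totals[state] += cases
    day_totals[day] += cases

# Ranked bar chart: states ordered from highest to lowest total.
ranked = sorted(state_totals.items(), key=lambda item: item[1], reverse=True)

# Area chart: running total of all states combined, day by day.
cumulative = list(accumulate(day_totals[day] for day in sorted(day_totals)))

print(ranked)
print(cumulative)
```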

Tourism in Malaysia

Figure 11. A dashboard that shows the international tourist receipt (2005 to 2019) and tourism expenditure for domestic and international (2005 to 2018)
Figure 12. A dashboard that shows the international tourist arrival (2000 to 2020)

The labor market in Malaysia

Figure 13. A dashboard that shows the employment in the domestic tourism industry (2005 to 2018) and another one grouped by sectors
Figure 14. The scatter plots that show the correlation of employment across the tourism sectors
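The strength of the relationship shown in a scatter plot can be summarized with a Pearson correlation coefficient. The sketch below is a plain Python illustration under assumptions: the sector names and yearly employment numbers are placeholders, and we are not claiming this is the exact statistic behind the Tableau charts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder yearly employment (thousands) for two hypothetical sectors.
accommodation = [200, 210, 225, 240, 260]
food_and_beverage = [800, 830, 880, 930, 1000]
r = pearson(accommodation, food_and_beverage)
print(round(r, 3))
```

A value close to 1 means employment in the two sectors tends to rise together, while a value close to -1 means one falls as the other rises.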

Metric

Figure 15. The area chart shows the business confidence index

How do the analytics presented in the visualization feed the business understanding?

The analytics arranged within the stories were built systematically and are interconnected with one another, which makes it easier to connect the dots between story components. Let’s lay out the narrative flow to make this easier to follow. Our objective is to diversify the oversupplied labor force to intensify domestic tourism in Malaysia, given the unemployment caused by the COVID-19 pandemic. So, what do we need to do? We collect datasets that revolve around COVID-19 in Malaysia, the domestic tourism industry, employment in the tourism industry, and the business confidence expectation index. Afterward, the datasets are turned into many lower-level visualizations and arranged sequentially to provide, hopefully, a fundamental basis for addressing the objective.

The collection of lower-level visualizations is then grouped into the three main components of the full story. How does it work? Each component provides the fundamental basis for the next one, acting as its transition and background. For a clear view of how this looks, see Figure 16.

Figure 16. A role of components in the full story

There are two terms that need to be understood: ask and answer. What do they mean? “Ask” means that a component needs the previous component to provide additional information that justifies the information inside the asking component. “Answer” means that a component provides additional information to the subsequent component. Thus, relating back to the defined objective, it makes sense that the idea of intensifying the domestic tourism industry needs to be backed by the COVID-19 case data, after which we are able to say “hey, something is happening in the domestic tourism industry due to COVID-19. Let’s make a strategy to tackle that”.

Then, after looking at the insights inside the objective component of the story, a strategic plan might emerge and later be executed. To see whether anything changes after the strategic plan is executed, a metric needs to be evaluated. The business confidence index (BCI) might be considered because it measures business confidence expectations for the upcoming period, such as the next month or year, on a national scale.

Conclusion

Certainly, the analytics developed in this report could be improved to provide more comprehensive information for the business understanding, leading to more effective business decisions. In developing these analytics, we faced two major challenges: the data came in summary form, and it lacked recent periods, i.e., 2019 and 2020. To compensate for the missing information, we collected many datasets and then filtered the most useful ones to address the objective. For more detailed analytics, advanced modeling could be developed with the help of techniques such as machine learning. Advanced mathematical modeling might also help estimate budgets for targeted sectors in the domestic tourism industry so that they absorb the labor force effectively, ultimately addressing the defined objective.

References

1. Fields, G. S. (2006). A public lecture: Labour markets and economic development. Journal of Eastern Caribbean Studies, 31(2), 72–84.
2. Foo, L. P., Chin, M. Y., Tan, K. L., & Phuah, K. T. (2020). The impact of COVID-19 on tourism industry in Malaysia. Current Issues in Tourism, 0(0), 1–5. https://doi.org/10.1080/13683500.2020.1777951
3. Mokhtar, M. R., Faizun, M., Yazid, M., & Shamsudin, M. F. (2020). Sustainability of tourism industry in Malaysia. Journal of Postgraduate Current Business Research, 1–3.

Nerd For Tech

From Confusion to Clarification

Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/.

Arul Hasbi

Written by

Currently staying in KL, Malaysia and “independently” studying business intelligence & data science during covid-19 pandemic.
