Data Science and Machine Learning

Let machines learn from data to transform your business

Raymond Ng, AI Coach, MSc (KM)
Knowledge Management
6 min read · Dec 7, 2019

Where are we in human history?
Nowadays people are being connected by mobile devices anytime, anywhere. This provides a foundation for people to share information, exchange ideas, or even manage projects through social media in collaborative ways. Almost all forms of data are being converted to digital formats, and it is becoming feasible for them to be stored in cloud storage and exchanged through high-speed networks. We are currently in the age of digitalization.

Characteristics of the age of digitalization
The processing power of computers has grown enormously over the last few decades, moving from standalone machines to the cloud, and possibly to quantum computers in the near future. A computer can now be virtual instead of physical. Virtual computers are built on cloud technology and are becoming more powerful than ever. In addition to processing power, cloud storage is becoming practically unlimited, and its cost is affordable. With fiber-optic transmission, networks can carry a huge volume of data at very high speed. In the age of digitalization, people can manipulate a huge volume of data at any time, anywhere, with any device.

Where does big data come from?
Every day a huge amount of data is created by enterprises, governments, schools, and individuals. Collectively, this data is called “big data”. Its sources include:

  • Open data from governments
  • Dialogues from social media sites
  • Emails
  • Cookies
  • Instant messages
  • Customer data
  • Shared corporate data
  • Sensors
  • Cameras
  • Wearable devices
  • GPS
Sources of Big Data

What is data science?
Data science is the discipline of making sense of big data through techniques such as visualization, data mining, pattern recognition, and machine learning. By making sense of big data, you can obtain insight into a specific situation. Insights are important for making the right decisions or even predictions. For example, by analyzing customer buying behavior, you can better control your inventory or provide the services your customers are looking for. As a matter of fact, data science does not depend on very advanced technologies. Instead, it is based on statistics and mathematics. In other words, applying statistical and mathematical theory to a huge volume of data is the core of data science. Below is a typical data science process.

Data Science Process, credit: Wikipedia

What is machine learning?
Machine learning is a major subject within data science. There are two main kinds of machine learning: supervised learning, which learns from examples with known outcomes, and unsupervised learning, which looks for structure in data without known outcomes. The major functions of machine learning are classification (example: decide whether to grant a loan to a customer), regression (example: sales forecast), and clustering (example: group your customers). Classification and regression are supervised, while clustering is unsupervised. One of the main applications is predictive analysis.

Machine Learning — the big picture
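
To make the clustering function above more concrete, here is a minimal sketch that groups customers by a single attribute with a hand-written one-dimensional k-means. The spending figures and the name `kmeans_1d` are made up for illustration; a real project would normally rely on a library implementation and use several attributes.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters with a basic k-means loop."""
    # Deterministic start: pick k evenly spaced values from the sorted data.
    ordered = sorted(values)
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centers = [ordered[round(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # nothing moved, so the grouping is stable
        centers = new_centers
    return centers, clusters


# Hypothetical annual spending per customer (illustrative numbers only).
spending = [120, 150, 130, 900, 950, 870, 5000, 5200]
centers, groups = kmeans_1d(spending, k=3)
print(groups)
```

No outcome labels are supplied here: the three customer groups emerge from the data alone, which is what makes clustering an unsupervised technique.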

Traditional programming vs machine learning
To understand how machine learning works, it helps to see how it differs from traditional programming. With traditional programming, we write a program, feed data to it, and obtain the outcome it generates. With machine learning, we do not write the program first. Instead, we collect historical data and its corresponding outcomes (for example, which kinds of customers bought which kinds of products). A model is then generated from that data by using a specific machine learning function, such as classification. By using the generated model, future buying behavior can be predicted. Decisions based on this kind of predictive analysis are only as good as the model, so its accuracy should be checked before it is relied on.

Traditional Programming vs Machine Learning
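
The contrast can be sketched in a few lines of Python. In the first function a person writes the rule. In the second, the rule (here just a single threshold) is derived from historical data and outcomes. The data, the threshold of 50, and all function names are hypothetical and chosen only to illustrate the idea.

```python
def rule_based(income):
    """Traditional programming: a person writes the rule by hand."""
    return income >= 50  # threshold chosen by the programmer


def learn_threshold(history):
    """Machine learning: pick the threshold that best fits past outcomes.

    history is a list of (income, bought) pairs.
    """
    candidates = sorted({income for income, _ in history})

    def correct(threshold):
        # Count how many past outcomes this threshold would have predicted.
        return sum((income >= threshold) == bought for income, bought in history)

    return max(candidates, key=correct)


# Hypothetical history: (monthly income in thousands, bought the premium plan?)
history = [(20, False), (25, False), (32, False),
           (38, True), (45, True), (60, True)]

threshold = learn_threshold(history)


def learned_model(income):
    """The generated model: the same kind of rule, but derived from data."""
    return income >= threshold


print(threshold, rule_based(40), learned_model(40))
```

With this made-up history the learned threshold differs from the hand-written one, so the two approaches disagree about a customer with an income of 40.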

Example — property prices prediction
Here is a use case for predictive analysis. Property prices are determined by many factors, such as location, age, and transportation. By collecting those factors together with the corresponding property prices, a model for predicting property prices can be generated. Anyone can then estimate the price of a property by feeding its current factors to the model generated by machine learning.

A Model Generation Process
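
As a rough sketch of this idea, the code below fits a straight line to a single factor (property age) using ordinary least squares. It is a simplification: a realistic model would combine many factors, and the ages and prices here are made-up numbers that lie exactly on a line so the result is easy to check. The names `fit_line` and `predict_price` are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one factor: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: property age in years and price in thousands.
ages = [1, 5, 10, 20, 30]
prices = [980, 900, 800, 600, 400]

# "Generating the model" means estimating the two numbers of the line.
slope, intercept = fit_line(ages, prices)


def predict_price(age):
    """Use the fitted line to estimate the price of an unseen property."""
    return slope * age + intercept


estimate = predict_price(15)
print(round(slope, 2), round(intercept, 2), round(estimate, 2))
```

The fitted slope is negative for this data, meaning that each additional year of age lowers the estimated price.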

One more example — granting the loan
Here is another example of machine learning. To build a model that predicts whether to grant a loan to a customer, you need to extract historical loan data from a bank’s database. What data would you extract? You may consider collecting the data (attributes) below from every applicant. A model can be generated by applying a suitable machine learning algorithm. Once the model is ready for use, you can decide whether to grant a loan based on the same set of data (attributes) submitted by a new applicant. You may ask whether this kind of prediction is accurate. Its accuracy depends on the algorithm selected and the quality of the data collected. At the very least, the prediction can complement human judgment.

Model for Granting the Loan
A Prediction
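
A minimal sketch of such a model is shown below, using a k-nearest-neighbors vote: a new applicant receives the outcome that was most common among the most similar past applicants. The two attributes (annual income and existing debt), the records, and the name `knn_predict` are assumptions made for illustration and are not taken from any real bank data.

```python
import math
from collections import Counter


def knn_predict(training, applicant, k=3):
    """Classify an applicant by majority vote of the k most similar past cases."""
    by_distance = sorted(training, key=lambda row: math.dist(row[0], applicant))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical loan history:
# ((annual income, existing debt) in thousands, past decision)
loan_history = [
    ((30, 25), "reject"),
    ((35, 30), "reject"),
    ((40, 35), "reject"),
    ((80, 10), "grant"),
    ((90, 5), "grant"),
    ((100, 20), "grant"),
]

decision_a = knn_predict(loan_history, (85, 8))   # high income, low debt
decision_b = knn_predict(loan_history, (33, 28))  # low income, high debt
print(decision_a, decision_b)
```

In practice the attributes would usually be rescaled first so that no single attribute dominates the distance, and the choice of `k` would be tuned against held-out data.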

Predictive analysis
Predictive analysis involves two main parts. The first part is to feed training data to a machine learning algorithm to generate the model. The second part is to feed new data to the model to generate the results. The picture below shows the whole process.

Predictive Analysis Flow
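
The two parts can be sketched as two separate functions: `train` turns training data into a model, and `predict` applies that model to new data. The algorithm used here (a nearest-centroid classifier on one attribute) was picked only for brevity, and the data and names are hypothetical.

```python
def train(training_data):
    """Part 1: turn labeled training data into a model.

    The model is simply the average attribute value for each label.
    """
    totals = {}
    for value, label in training_data:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def predict(model, new_value):
    """Part 2: feed new data to the model and get a result."""
    # Choose the label whose average is closest to the new value.
    return min(model, key=lambda label: abs(new_value - model[label]))


# Hypothetical training data: (visits per month, customer type)
training_data = [
    (1, "occasional"), (2, "occasional"), (3, "occasional"),
    (10, "frequent"), (12, "frequent"), (14, "frequent"),
]

model = train(training_data)                       # part 1
results = [predict(model, v) for v in (2, 11, 5)]  # part 2
print(model, results)
```

Keeping the two parts separate means the model can be generated once and then reused for every new piece of data that arrives.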

Machine learning — the applications
Machine learning is becoming popular in the business world. It changes the way businesses are being operated and managed. Here are some of the major applications:

  • Price Prediction: to predict optimal prices based on historical sales records
  • Risk Assessment: to predict the risk associated with decisions such as issuing a loan
  • Propensity Modeling: to predict future customer actions based on historical behavior
  • Diagnosis: to make better diagnoses by leveraging large collections of historical examples
  • Document Classification: to automatically classify documents into different categories
  • Recommendation System: to discover what else a customer may like based on the properties of the items the customer already likes
  • Sales Forecasting: to estimate the future sales volume based on actions taken
Machine Learning Applications

Steps to applying machine learning to your organization

  1. What are the business problems?
    a. How to gain new customers?
    b. How to sell more products/services?
    c. How to add efficiencies to a process?
    d. Customer segmentation
  2. Look for corresponding data sources
    a. Internal data (if none exists, create a plan to collect it continuously)
    b. External data
  3. In what ways could machine learning help address business problems?
    a. Choose the right machine learning models
    b. Evaluation
    c. Deployment
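
The evaluation step (3b above) can be sketched as a hold-out test: part of the labeled data is kept aside, the model is built from the rest, and accuracy is measured on the part the model has never seen. The 1-nearest-neighbor model, the data, and the function names below are assumptions chosen to keep the example short.

```python
def nearest_neighbor(train_rows, value):
    """Predict the label of the closest training example (1-nearest neighbor)."""
    _, label = min(train_rows, key=lambda row: abs(row[0] - value))
    return label


def accuracy(train_rows, test_rows):
    """Share of held-out examples whose label is predicted correctly."""
    hits = sum(nearest_neighbor(train_rows, v) == label for v, label in test_rows)
    return hits / len(test_rows)


# Hypothetical labeled data: (monthly spending, customer segment)
data = [(10, "low"), (12, "low"), (15, "low"), (30, "high"),
        (50, "high"), (55, "high"), (60, "high"), (65, "high")]

# Hold out every fourth example for testing; build the model from the rest.
test_rows = data[3::4]
train_rows = [row for i, row in enumerate(data) if i % 4 != 3]

score = accuracy(train_rows, test_rows)
print(score)
```

With this made-up data the borderline customer who spends 30 is misclassified, which is the kind of weakness that evaluation is meant to reveal before deployment.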
