Machine learning revelations

Agata Gajec
Mar 28, 2019 · 8 min read
Photo from Adobe Stock

The article was written in cooperation with Stefan Ogonek.

Below, I will give you an idea of what machine learning is and what you can expect from this seemingly arcane field. There is no need to be afraid: a rational expectation is not Terminator-style killing machines or a rise of the robots. On the contrary, I will try to convince you that machine learning is a wonderful field, deeply rooted in science and engineering, that can give your organization a competitive advantage in the market.

What machine learning is not

Comparison of Artificial Intelligence and Machine Learning

Machine learning is a very important part of AI, but only a part of it. This should not be surprising: after all, AI is about agents acting rationally in an environment. Acting rationally takes much more than the ability to extract patterns from data. An agent also needs to perceive its surroundings, build knowledge about the world, reason about it, and act on its conclusions. Machine learning is only one part of this great journey.

What is machine learning

Machine learning is similar to inductive reasoning: general patterns are recognized from specific observations. The difference lies in scale and in who does the reasoning. In classical induction the reasoning agent is a human working with hundreds, or at most thousands, of observations. In machine learning the reasoning agent is an algorithm, and the training data can easily run into millions of examples.

How can we generalize from data, and how do these algorithms work? Some of the most commonly used machine learning approaches are described below.

Supervised learning

  • Linear regression
    This is a standard, highly interpretable method for regression, that is, for modeling the relationship between input variables (ideally independent of each other) and the output variables that depend on them. For example, this could be a simple linear dependency between floor area and house price, which is easy to plot in 2D.
    A business case might be to understand what drives product sales: price, competitors' prices, quality, etc.
  • Logistic regression
    This model is similar to linear regression but is used for classification. In its basic form the output is binary: the model predicts whether something is true or not, usually by estimating the probability that it is. In our house example, we might want to know whether a house will be sold in the next 6 months; the output is either yes or no.
    Another business case might be deciding whether a loan is likely to be repaid.
  • Decision trees
    A decision tree is a highly interpretable model that can solve both classification and regression problems. It is highly interpretable because it is represented as a tree that splits the data at each branch according to the value of some feature. Each leaf of the tree holds an output value.
    A business case in our house example would be to recognize which features are most important for determining house price. Another case might be to understand which product features make a purchase most likely.
  • Random forest
    This method is an example of ensemble learning: its result is a combination of many models of the same type. In the case of a random forest, it is easy to guess that the model type is a decision tree. A random forest usually improves accuracy over a single decision tree by combining (averaging or voting over) the results of many trees, each trained on a random sample of the data and features. The price is losing the high interpretability of a single decision tree.
    A business case might be, for example, to predict power usage in an electrical grid.
  • Naive Bayes
    This is a classification technique. It uses Bayes' theorem to calculate the probability of an event from knowledge of the factors that might affect it, under the "naive" assumption that those factors are independent of each other. It is a rather simple technique, but for text categorization it is competitive with more advanced approaches.
    A typical business case is to categorize emails as spam or not based on occurrences of words in the text.
  • Support vector machines (SVMs)
    This technique is usually used for classification but can also be applied to regression problems. The model represents the training data as points in space and tries to find a gap between the categories that is as wide as possible. Then, when a new vector is classified, it falls on one side of this gap, and that side determines the result.
    A nice business case might be to determine whether a user is likely to click on an ad or not.
  • Artificial neural networks (ANNs)
    Neural networks deserve a separate text of their own, so here are just a few words to explain how they work. Neural networks try to mimic the way neural connections in the human brain work. They usually contain a few layers of neurons. Each neuron in one layer is connected to some neurons in the next layer. The first layer represents the input vector, and the last layer represents the output. Neural networks are good at representing complicated non-linear dependencies between the input and the output. They have been known since the 1950s, but until around the 2000s we did not have enough computing power to make them work for non-trivial examples. The huge improvement in the field was also made possible by the rediscovery of the backpropagation algorithm in the 1980s. There are different kinds of neural networks: standard neural networks, convolutional neural networks, and recurrent neural networks. I will not go into their specifics here.
    Possible use cases are broad: in principle, neural networks can address all of the use cases mentioned for the other models. Fine-tuned by an expert, they can often achieve much better results than the models described previously. One typical case is handwritten text recognition. It was the introduction of neural networks that made this task feasible for industry and reduced the manual effort required from users.
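
To make the linear regression bullet above more concrete, here is a minimal sketch that fits a straight line to floor area and house price using ordinary least squares in plain Python. The data points and the function names fit_line and predict are invented for illustration; they do not come from a real dataset or library, and in practice a dedicated library would normally be used instead.

```python
# Ordinary least squares for one input variable: price = slope * area + intercept.
# All numbers below are made up for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) that minimize the squared prediction error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Predict the output for a new input using the fitted line."""
    return slope * x + intercept

floor_area_m2 = [50, 70, 90, 110]       # hypothetical floor areas
price_thousands = [120, 160, 200, 240]  # hypothetical prices (2 * area + 20)

slope, intercept = fit_line(floor_area_m2, price_thousands)
print(slope, intercept)                 # 2.0 20.0
print(predict(slope, intercept, 100))   # 220.0
```

The fitted slope is what makes the model interpretable: with this toy data, each additional square meter adds 2 (thousand) to the predicted price.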
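
The spam example from the Naive Bayes bullet above can be sketched in a similar way. The snippet below is a simplified illustration, not a production filter: the four training messages are invented, the names train and classify are hypothetical, and the add-one (Laplace) smoothing is an assumption added here so that a word never seen for a label does not force the probability to zero.

```python
import math
from collections import Counter

# Tiny invented training set: (message text, label).
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

def train(messages):
    """Count messages per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = {}
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return label_counts, word_counts

def classify(text, label_counts, word_counts):
    """Return the label with the highest posterior log-probability."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Prior: share of training messages that carry this label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Likelihood of the word given the label, with add-one smoothing.
            # Words are treated as independent, which is the "naive" part.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

label_counts, word_counts = train(TRAINING)
print(classify("free money", label_counts, word_counts))      # spam
print(classify("monday meeting", label_counts, word_counts))  # ham
```

Working with logarithms instead of multiplying raw probabilities is a common design choice: it avoids numerical underflow when messages contain many words.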

Unsupervised learning

  • K-means clustering
    This algorithm clusters the data into k separate groups (hence the name of the algorithm). Each group contains the input vectors that are nearest to one another, or more precisely nearest to the group's mean, in d-dimensional space, where d is the number of attributes in each vector.
    A business case would be, for example, to find groups of houses that are similar to one another according to parameters such as price, price per square meter, city, location, number of floors, etc. Another obvious business case is segmenting customers into different groups to better target each group with marketing campaigns.
  • Hierarchical clustering
    This algorithm creates a hierarchical tree of clusters in which each node represents a group. The subnodes of a node represent a further split of that node's group.
    In a typical use case, you can cluster your customers into more and more detailed groups. As those groups form a hierarchy, you can, for example, target a marketing campaign at the group represented by a node on any level of the tree, depending on your specific needs.
  • Recommender systems
    Recommender systems are not a separate technique. Rather, they often use clustering algorithms to identify groups of similar users or items, for which similar things should be recommended.
    A use case might be to recommend which movies a user should watch, based on the user's similarity to other users.
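
The k-means bullet above can be illustrated with a short sketch in plain Python. It assumes the common iterative procedure (often called Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The house data, the function names, and the choice to pass the starting centroids explicitly are all assumptions made for this example; real implementations usually pick the starting centroids randomly and scale the attributes first.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids are stable."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        assignments = [nearest(p, centroids) for p in points]
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            # Keep the old centroid if no point was assigned to it.
            new_centroids.append(mean_point(members) if members else old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Final assignment against the centroids that are actually returned.
    return centroids, [nearest(p, centroids) for p in points]

# Hypothetical houses: (floor area in m2, price in thousands).
houses = [(50, 150), (55, 160), (60, 155), (120, 400), (130, 420), (125, 410)]

# k = 2, starting from the first small house and the first large house.
centroids, assignments = k_means(houses, [houses[0], houses[3]])
print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centroids)    # [(55.0, 155.0), (125.0, 410.0)]
```

Here d = 2 because each house has two attributes, and the two resulting groups correspond to small, cheaper houses and large, more expensive ones.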
Photo from Unsplash

Reinforcement learning

How to approach machine learning

nomtek

www.nomtek.com

Agata Gajec

Written by

I'm currently doing my master's degree in Big Data Analytics at Wrocław University of Science and Technology

nomtek

nomtek

www.nomtek.com Nomtek is an app design and development agency founded in 2009 with offices in US, UK, Germany, Poland.
