When it comes to using machine learning, which category does your company fall into?

Enrique Dans

3 min read · Nov 21, 2018

A piece in MIT Tech Review, “The rare form of machine learning that can spot hackers who have already broken in”, looks at machine learning algorithms developed for cybersecurity. Rather than following the traditional approach of detecting patterns based on previous or known attacks, these algorithms focus on identifying attackers who have already managed to enter the system and on preventing them from stealing information.

The approach uses unsupervised learning algorithms that compete with each other to detect possible anomalous behavior. Instead of relying on what has been learned from previous security incidents, unsupervised learning looks for anomalies without a human telling it what to look for: it examines countless examples of behavior within the corporate network and picks out those with anomalous patterns. Thus, employees’ routine movement through the corporate network, their information searches or their access to company resources could be identified as standard, risk-free behavior, while the patterns of an intruder trying to collect information in a particular way could be recognized as an attempted attack and quarantined.
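To make the idea of unlabeled anomaly detection concrete, here is a minimal sketch in Python. It is not the method described in the MIT Tech Review piece: it swaps in a much simpler technique, a robust z-score based on the median absolute deviation, and the account names, the activity counts and the `flag_anomalies` helper are all hypothetical. What it shares with the approach above is that no one tells the code what an attack looks like; it only measures how far each account sits from typical behavior.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, scaled by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Typical values are all identical, so any value that differs is extreme.
        return [0.0 if v == med else float("inf") for v in values]
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(activity, threshold=3.5):
    """Return the accounts whose activity is unusually far from the norm."""
    accounts = list(activity)
    scores = robust_z_scores([activity[a] for a in accounts])
    return [a for a, s in zip(accounts, scores) if abs(s) > threshold]


# Hypothetical data: distinct internal resources each account touched in a day.
resources_accessed = {
    "alice": 12, "bob": 15, "carol": 11, "dave": 14,
    "erin": 13, "frank": 12, "intruder": 240,
}

suspicious = flag_anomalies(resources_accessed)
print(suspicious)
```

A real deployment would look at many signals at once rather than a single count, and the threshold of 3.5 is a common rule of thumb, not a value taken from the article.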

Managing a range of risks and detecting anomalies are just some of the areas where machine learning is showing promise, and I’m using them here simply as an example. In the case of supervised learning, combining different types of learning makes it possible to establish which patterns are suspicious; alternatively, unsupervised algorithms can be left to detect out-of-the-ordinary patterns on their own. The range of possibilities is enormous, and the investments and advances made today will undoubtedly become the competitive advantages of tomorrow.

As companies begin to see machine learning as something within their reach, something that doesn’t necessarily require hiring expensive data scientists and that can be done with relatively simple and even visual tools, ideas like this serve as showcases that should attract the interest of those who have not yet tried this type of technology. Two other articles in MIT Tech Review should also raise awareness: each uses a very simple and (at least for geeks) amusing flow diagram. The first helps us to understand whether a project is based on artificial intelligence, and the second explains the conceptual differences between the various types of machine learning algorithms.

What’s your company’s position with respect to machine learning? It’s important to remember that preparing data properly is the secret of successful analysis. If defining objectives takes up around 10% of the effort devoted to most machine learning projects, the second phase, data preparation, can amount to up to 80%. Next comes the phase of creating models and obtaining predictions, which is becoming more accessible: the tools for this are increasingly visual, simple and easy to manage. This phase, previously associated with data scientists, who tend to be hard to find and to retain, now takes up barely 5% of a project. The final phase, evaluating the results obtained, tends to consume the remaining 5%.
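As a rough illustration of what those proportions mean in practice, the sketch below splits a project’s effort across the four phases. The shares are the approximate figures quoted above (the 80% for data preparation is an upper bound), while the 100-day project size and the `split_effort` helper are hypothetical, chosen only to make the arithmetic easy to read.

```python
# Approximate share of effort per phase, as quoted in the paragraph above.
PHASE_SHARES = {
    "defining objectives": 0.10,
    "preparing data": 0.80,
    "modeling and prediction": 0.05,
    "evaluating results": 0.05,
}


def split_effort(total_days):
    """Split a project's total effort across phases by the shares above."""
    return {phase: total_days * share for phase, share in PHASE_SHARES.items()}


# Hypothetical project size, used only for illustration.
plan = split_effort(100)
for phase, days in plan.items():
    print(f"{phase}: {days:.0f} days")
```

On these figures, a team that budgets mainly for modeling is planning for about a twentieth of the actual work.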

Most companies fall into one of three categories. The first group has taken some tentative steps: looking at the tools involved, perhaps trying out a dataset, and reading about or evaluating machine learning. The second group is the early adopters, whose models have been in production for around two years. The third group has models in production dating back five years or more, with all that this entails in terms of experience and capitalizing on results. Which category does your company fall into?

(In Spanish, here)
