Machine Learning in Style!

Part 1: Before machine learning.

Published in

Data Science Innovations

2 min read · Oct 9, 2020

It is okay NOT to use machine learning!

Don’t be afraid to develop a solution without machine learning. Machine learning is cool, but it requires a lot of resources, especially if you are using a GPU to train and run the model. It also needs DATA! Not just any DATA that happens to be related to the problem you are trying to solve, but GOOD DATA! A model trained on a poor dataset is very likely to underperform a simple heuristic approach.

For example, if you want to rank products by popularity, a simple heuristic such as sorting by number of units sold can be enough to outperform many machine learning models. So, until you have access to good data, it is not worth investing your time in machine learning.
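To make the heuristic concrete, here is a minimal sketch in Python. The Product type, the rank_by_popularity function, and the sample catalog are all made up for illustration; the only idea taken from the text is "sort by units sold".

```python
from typing import List, NamedTuple


class Product(NamedTuple):
    # Hypothetical product record; a real catalog will have more fields.
    name: str
    units_sold: int


def rank_by_popularity(products: List[Product]) -> List[Product]:
    """Return products ordered from most to least units sold."""
    return sorted(products, key=lambda p: p.units_sold, reverse=True)


# Made-up sample data, for illustration only.
catalog = [
    Product("mug", 120),
    Product("poster", 45),
    Product("t-shirt", 310),
]

for rank, product in enumerate(rank_by_popularity(catalog), start=1):
    print(rank, product.name, product.units_sold)
```

There is no training, no GPU, and nothing to retrain. This is the kind of baseline a machine learning model has to beat before it is worth the effort.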

Start collecting as much data as early as possible!

It is always a good idea to track as many metrics as possible in your current system early on, because:

It is easier to gain permission from the system’s users earlier on.
If you think that something might be a concern in the future, it is better to have historical data.
You will notice which things change and which stay the same.

For example, on a social media platform, you can measure metrics like shares per visitor, upvotes per visitor, comments per visitor, etc. to estimate the toxicity of a post.
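As a rough sketch, per-visitor metrics like these can be computed from raw counts in a few lines of Python. The function name, the argument names, and the numbers are assumptions made for illustration; how you turn these rates into a toxicity signal is a separate question that this sketch does not answer.

```python
from typing import Dict


def per_visitor_metrics(
    shares: int, upvotes: int, comments: int, visitors: int
) -> Dict[str, float]:
    """Normalize raw engagement counts for one post by its visitor count."""
    if visitors <= 0:
        # No visitors yet: report zeros instead of dividing by zero.
        return {
            "shares_per_visitor": 0.0,
            "upvotes_per_visitor": 0.0,
            "comments_per_visitor": 0.0,
        }
    return {
        "shares_per_visitor": shares / visitors,
        "upvotes_per_visitor": upvotes / visitors,
        "comments_per_visitor": comments / visitors,
    }


# Made-up counts for a single post.
metrics = per_visitor_metrics(shares=10, upvotes=50, comments=4, visitors=200)
print(metrics)
```

Logging the raw counts alongside the ratios keeps your options open, since you can always recompute a different metric later from the same history.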

By being more liberal about gathering metrics, you can gain a broader picture of your system. If you notice a problem, you can add a metric to track it!

Simplicity is the ultimate sophistication!

It is always a good idea to choose machine learning over a complex heuristic solution. A simple heuristic might be just enough to make your product usable. A complex heuristic, on the other hand, is unmaintainable.

Once you have the data and you know what you are trying to achieve, move on to machine learning. Like any other product, the solution needs to be updated constantly, and a machine learning model is easier to update and maintain than a complex heuristic.

With good features comes great gains!

With the plethora of machine learning algorithms available now, most of the performance boost comes from great features and not from great machine learning algorithms. We need to think more like an engineer and less like a machine learning enthusiast to make great products!

So a good starting point would be:

Start with a reasonable objective.
Add common sense features in a simple way.

Adding complexity only slows down future iterations of your solution. A good time to diverge from this approach is when you have tried all the simple tricks you have in hand and still need to push your model’s performance further.
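As an illustration of "common sense features in a simple way", here is a small Python sketch for the product ranking example. The feature names and inputs (units_sold, launch_date, num_reviews) are hypothetical; the point is only that each feature is cheap to compute and easy to explain.

```python
from datetime import date
from typing import Dict


def simple_features(
    units_sold: int, launch_date: date, today: date, num_reviews: int
) -> Dict[str, float]:
    """Build a few obvious, easy-to-debug features for one product."""
    # Treat same-day launches as one day live to avoid dividing by zero.
    days_live = max((today - launch_date).days, 1)
    return {
        "units_sold": float(units_sold),
        "units_per_day": units_sold / days_live,
        "has_reviews": 1.0 if num_reviews > 0 else 0.0,
    }


# Made-up values for a single product.
features = simple_features(
    units_sold=300,
    launch_date=date(2020, 1, 1),
    today=date(2020, 1, 31),
    num_reviews=12,
)
print(features)
```

A plain dictionary like this can be fed to almost any model later, and each entry can be inspected by hand when something looks wrong.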

To be continued…