#10: The key success factor in ML projects
To have a thorough knowledge of the domain.
This is the last story in the series "Flight checks for any (big) machine learning project".
Beyond all the technologies available, the papers, the AI revolution and so on, we must go back to the roots. Let’s remember that we are building a model. An ML model — like any model — is an abstract representation of reality. In our case, it is a model that seeks to capture signals from reality and then predict something on that basis.
On its own, the model cannot simply capture reality and act accordingly. The data scientist is the one who translates those signals from reality into this mathematical construct that can be “trained”: the model.
If we want to improve the model, at first glance it seems that there are only two options:
- To capture more and better signals,
- To increase the power of the model so that it learns more with the same signals.
Increasing the power of the model may provide improvements, but doing so is not especially hard; it might even become a commodity (an infrastructure problem) in the future. The key to winning the game lies in improving the information we give to the model.
But the model cannot explore the domain and represent reality on its own. That’s where the work of the data scientist is key. And it is not a simple task (which is largely why it is an extremely valuable profession).
In order to get more and better signals, it is key to thoroughly understand the domain to be modeled. Ask all the relevant questions, become a domain expert.
If you don’t understand the details of your business, you’re going to fail. — Jeff Bezos
At Mercado Libre we are aware of this: a key factor in achieving the best AI implementation is having deep knowledge of each problem.
For example, if the problem to be solved is to predict the delivery time of a product, it will be necessary to become an expert in the whole delivery process, product availability, user experience… and even to figure out the day-to-day life of a delivery worker. Some detective work must be carried out to understand the problem.
Only after exploring all these aspects of the problem can better variables be added to the models. It is usually a bad smell when a team does not formalize this work and dedicate enough time to it.
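To make the delivery-time example a bit more concrete, here is a minimal sketch of how that detective work might turn into model variables. Everything in it is an assumption made for illustration: the `Order` fields, the carrier cutoff hour, and the no-pickup day are hypothetical and do not describe any real Mercado Libre system.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Order:
    # All fields are hypothetical, chosen only for illustration.
    placed_at: datetime   # when the buyer confirmed the purchase
    in_stock: bool        # whether the item is already in the warehouse
    distance_km: float    # warehouse-to-buyer distance


# Assumed operational facts that talking to the delivery team might surface.
CARRIER_CUTOFF_HOUR = 14   # orders placed at or after this hour ship the next day
NO_PICKUP_WEEKDAYS = {6}   # Monday=0 ... Sunday=6; assume no pickup on Sundays


def build_delivery_features(order: Order) -> dict:
    """Turn raw order fields into signals that reflect the delivery process."""
    weekday = order.placed_at.weekday()
    return {
        "distance_km": order.distance_km,
        "needs_restock": int(not order.in_stock),
        "missed_cutoff": int(order.placed_at.hour >= CARRIER_CUTOFF_HOUR),
        "placed_on_no_pickup_day": int(weekday in NO_PICKUP_WEEKDAYS),
    }


if __name__ == "__main__":
    order = Order(datetime(2024, 3, 10, 16, 30), in_stock=True, distance_km=12.5)
    print(build_delivery_features(order))
```

None of these variables require a more powerful model. They only exist because someone asked how the delivery process actually works, which is the point of this story.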
Iteration by iteration, it is essential to get to the root of the problem.
So, here we are, at the end of this journey.
I hope you enjoyed this flight and feel safe to take off with your own ML projects now.
Please feel free to comment on this series and contact us if you are interested in adding further flight checks, or even co-piloting with us!
There are many opportunities for Data Analysts, Data Scientists, Data and Machine Learning Engineers, Developers, and other tech profiles all around Latam. Find your next challenge at https://careers-meli.mercadolibre.com/
Acknowledgments
- The whole team that was part of Fraud Prevention over the last 5 years, for giving me the opportunity to work with them on solving very difficult problems, failing in the process, and learning from these experiences, especially Juan Gabriel Yonzo, Martín Pozzer, Guillermo Calvi, Franco Arito, Gustavo Baldani.
- The Meli Tech Blog Team for giving me the opportunity to share these stories and supporting me in the process: Barbara Michalla, Nahuel Barrios.
- The UX team for designing all the amazing illustrations for each story in the series: Laura Guarie, Guido Gaudioso, Vanina Ogueta.
- Cecilia Sassone, for the great translation and editing support.
- And my current Fraud Tech Team for helping me build all the amazing tools we are creating to empower MELI with the technological capital we need to bring all the users a really secure UX.
See you around!