Machine learning and Software engineering, in balance

# 5: Any Machine Learning Project Is a Software Project First

Nicolas Rodriguez Presta
Mercado Libre Tech

--

This is story #5 of the series Flight checks for any (big) machine learning project.

Well, we already have KPIs and a team too, and we managed to productize a baseline quickly without losing sight of the accumulated technical debt.

At this point, you might notice that if we change the object of study and, instead of a machine-learning project, build a management system or an app for online payments, about 90% of what we have discussed so far applies in the same way. That is no coincidence: any machine-learning project is a software project, and the same rules apply to it.

Machine learning equals software engineering

I think this point is essential: after the inevitable “Peak of Inflated Expectations” of the hype cycle, there may be a disconnect from what a machine-learning project in industry actually involves.

The fact is that the research stage, where many great minds focus on pushing the state of the art, follows subtly different rules from those that apply when productizing a system with large-scale impact. Both stages are important and necessary, but they are different. The common mistake is failing to recognize that difference, when embracing it can boost what both stages have to offer.

Data scientists need to iterate, prototype and put ideas into practice quickly, to confirm or reject hypotheses and generate new ones. During that stage it is acceptable to leave code versioning, unit testing, delivery processes and similar concerns on the back burner, because the focus is on iterating quickly and exploring.

What is wrong is to think that the entire life cycle of the project can be carried out under that same premise without major negative impacts on the final product. This is why we need teams with different roles that reinforce each other and contribute to the project constructively.

Everything that applies to a healthy software project applies to a machine-learning project as well:

  • Good code management
  • Clear delivery processes
  • Good documentation
  • Alerts and monitoring
  • Uptime measurement
  • Testing and test coverage metrics
  • SLAs and bug management
  • CI/CD
  • Peer review
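
To make the testing item concrete, here is a minimal sketch of how an ordinary unit test can cover a piece of machine-learning code. The `min_max_scale` helper and its test names are hypothetical, invented for this illustration only; the test functions follow the `test_*` naming that a runner such as pytest discovers, but they use plain `assert` so nothing beyond the standard library is needed.

```python
import math


def min_max_scale(values):
    """Scale a list of numbers to the [0, 1] range.

    Hypothetical preprocessing helper, used here only as something to test.
    """
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if math.isclose(lo, hi):
        # Constant feature: avoid dividing by zero and map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scales_to_unit_range():
    assert min_max_scale([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_constant_feature_maps_to_zero():
    assert min_max_scale([3.0, 3.0]) == [0.0, 0.0]


def test_empty_input_is_rejected():
    try:
        min_max_scale([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

Nothing here is specific to machine learning, which is the point: edge cases such as a constant feature or an empty batch are regular software bugs waiting to happen, and a regular test suite running in CI catches them.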

This explanation may seem unnecessary, but in my experience it is not.

And a piece of advice for passionate AI, machine-learning and data-science enthusiasts: although this field requires you to master new skills, always keep good general software practices in your skillset (and, as a project manager, make sure your team has them). It is a significant differentiator that can work to your advantage.

No one should underestimate having a good card up their sleeve.

Now you’re halfway through, and still more flight checks are yet to come. Keep up & fly safe!
