Embracing Open Source ML at Feedzai with OpenML

Pedro Rijo
Published in Feedzai Techblog
Nov 28, 2018

In the past few years we have seen Artificial Intelligence (AI) and Machine Learning (ML) go from a mythical beast to a ubiquitous technology.

Much of the credit goes to companies such as Google, Apple, Microsoft, Amazon, and many others for bringing ML to its current state. These companies have not only been pushing the field forward, but also working to make AI accessible to everyone by releasing frameworks and services, several of them as Open Source Software (OSS): TensorFlow, Core ML 2, Azure Machine Learning, Amazon Machine Learning, and Spark ML are just a few examples.

Photo by Andy Kelly on Unsplash

ML goes open source

A few years ago, a data science team was stuck using Weka or some internally developed framework. Nowadays there are plenty of choices available. You just need to pick one. Or several. Whatever you need to get the job done.

As in any other software field, each of the existing frameworks has advantages and disadvantages. Your team is not comfortable with Scala? Well, then maybe it should use TensorFlow instead of Spark ML. Your team already has many tools in Python? Well, then maybe it should use scikit-learn. There’s a new, ultra-shiny, ultra-promising ML tool that’s revolutionizing your field? Maybe your team should evaluate it.

Until now, the Feedzai platform allowed teams to do the full data science loop within the platform, using its own ML and data processing tools. While we are proud of the platform we have built throughout the years, we understand that there is a learning curve when adopting a new tool. Furthermore, when adopting a new platform, teams usually have to start from scratch, throwing away tools that took years to develop.

Welcome to Feedzai OpenML

Most existing ML platforms force your data scientists to work within their confines: to use their proprietary data science methods, models, tools, languages, and libraries. It was against this backdrop that Feedzai decided to create the Feedzai OpenML engine.

Feedzai OpenML represents the opposite approach. Feedzai is embracing the data science community. We want your teams to integrate existing approaches with Feedzai’s system.

To better aid our fraud-fighting mission, the Feedzai OpenML engine allows you to integrate any tool and framework into the Feedzai platform! Your team has deployed a model trained using scikit-learn and you want to keep it? Just use our scikit-learn SDK. Or maybe your team has an awesome tool in R? We've also got you covered with our R SDK.

Here’s the full list of SDKs already built:

These SDKs allow your team to use any tool on our platform. This means you can import your current models into the Feedzai platform. Furthermore, the integration is so deep that you can even train H2O models directly from our platform, removing the need to switch between several tools! And Feedzai OpenML allows this level of integration to be extended to any platform or framework you may wish for. Just imagine installing the Feedzai platform and having Feedzai’s industry knowledge partnering with your existing tools to fight fraud.

But you may be asking: what about that ultra-shiny-cool-amazing-new framework released 42 seconds ago? We've got you covered: you just need to create a simple adapter on our publicly available Feedzai OpenML API, and your favorite tool will be available on the Feedzai platform: https://github.com/feedzai/feedzai-openml
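To give a feel for what "a simple adapter" means, here is a minimal sketch of the adapter pattern in Python. It does not use the actual Feedzai OpenML API: `ClassificationModel`, `ThresholdFraudScorer`, and `ThresholdScorerAdapter` are hypothetical names invented for this illustration, and the "external tool" is a toy stand-in. The idea is simply that the platform talks to one small contract, and the adapter translates that contract into whatever your tool already exposes.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class ClassificationModel(ABC):
    """Hypothetical platform-side contract (NOT the real Feedzai OpenML interface)."""

    @abstractmethod
    def class_distribution(self, instance: Sequence[float]) -> List[float]:
        """Return one probability per class for a single instance."""

    def classify(self, instance: Sequence[float]) -> int:
        """Return the index of the most probable class."""
        dist = self.class_distribution(instance)
        return max(range(len(dist)), key=dist.__getitem__)


class ThresholdFraudScorer:
    """Toy stand-in for an existing external tool with its own API."""

    def __init__(self, limit: float) -> None:
        self.limit = limit

    def score(self, amount: float) -> float:
        # Fraud score in [0, 1]: grows linearly with the amount, capped at 1.
        return max(0.0, min(amount / self.limit, 1.0))


class ThresholdScorerAdapter(ClassificationModel):
    """Adapter: exposes the external scorer through the platform-side contract."""

    def __init__(self, scorer: ThresholdFraudScorer, amount_index: int = 0) -> None:
        self._scorer = scorer
        self._amount_index = amount_index  # position of the amount in the instance

    def class_distribution(self, instance: Sequence[float]) -> List[float]:
        fraud = self._scorer.score(instance[self._amount_index])
        # Class 0 = legitimate, class 1 = fraud.
        return [1.0 - fraud, fraud]


adapter = ThresholdScorerAdapter(ThresholdFraudScorer(limit=1000.0))
print(adapter.class_distribution([250.0]))
print(adapter.classify([900.0]))
```

The external scorer is left untouched; only the thin adapter knows about both sides. A real provider would follow the interfaces defined in the repository linked above.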

Interested in learning more about Feedzai OpenML? Stay tuned for more posts about the challenges of developing our SDKs!

Feel free to raise any questions and to contribute directly on GitHub! Or if you want to take part in building the next-generation data science tool, join our team.
