SOCML 2017

I recently returned from Google Research’s Self-Organizing Conference on Machine Learning (SOCML), also known as pre-NIPS 2017. The conference was conceived by Ian Goodfellow, the inventor of Generative Adversarial Networks (GANs), with the support of Google’s university relations team, as a means of accelerating AI research. You can read the report from the inaugural conference held at OpenAI.

This year, I was a bit apprehensive about attending because last year’s conference offered little for Applied Scientists like myself, who are chiefly interested in applying cutting-edge research to industry problems. After some thought, I decided to head back to SOCML and lead a few discussions around Machine Learning (ML) for production. Attending turned out to be the right decision: a lot of very actionable conversations came out of the unconference.

ML in Production

I led the conversation on ML in production environments. What made it interesting was the presence of several companies with varying data needs. Participants ranged from Element AI, the cutting-edge incubator co-founded by Yoshua Bengio, to companies like H2O.ai, Shopify, LinkedIn, and more. While many of us were interested in applying the latest techniques to industry problems and pulling from best practices, we realized that there is still a lot of work to be done in translating these ideas to practice before we can safely use them in production environments.

As a potential first step towards moving research into practice, participants suggested the following:
- Realize the new algorithm in code.
- Rewrite the algorithm to work on distributed data (if necessary).
- Separate the inference (scoring) from the training.
- Move the trained model to a cloud service like AWS, wrap it in Flask and give it an API endpoint.
The underlying assumption is that if you are able to do all of the above, then you have solved some of the major challenges involved in using the new algorithm in a production environment.
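As a rough illustration of the third and fourth steps, here is a minimal sketch of keeping training and scoring separate and putting the scoring side behind an HTTP endpoint. Everything in it is an assumption made for illustration: the one-variable least-squares fit merely stands in for "the new algorithm", the function names (`train`, `save_model`, `load_model`, `score`, `handle_request`, `serve`) are hypothetical, and Python's built-in `http.server` is used in place of Flask only to keep the sketch free of dependencies. The distributed-data step is not covered.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# --- Training side: runs offline and produces a model artifact ---

def train(xs, ys):
    """Fit y = w*x + b by least squares (placeholder for the new algorithm)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return {"w": w, "b": b}

def save_model(model, path):
    """Persist the trained parameters so the scoring service can load them."""
    with open(path, "w") as f:
        json.dump(model, f)

# --- Inference side: needs only the saved artifact, not the training code ---

def load_model(path):
    with open(path) as f:
        return json.load(f)

def score(model, x):
    return model["w"] * x + model["b"]

def handle_request(model, body):
    """Pure request handler: JSON bytes in, JSON bytes out."""
    payload = json.loads(body)
    return json.dumps({"prediction": score(model, payload["x"])}).encode()

def make_handler(model):
    """Build an HTTP handler class bound to an already-loaded model."""
    class ScoreHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            response = handle_request(model, self.rfile.read(length))
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(response)
    return ScoreHandler

def serve(model_path, port=8080):
    """Load the artifact once at startup, then answer scoring requests."""
    model = load_model(model_path)
    HTTPServer(("", port), make_handler(model)).serve_forever()
```

With Flask, the body of `handle_request` would sit inside a route function instead of a `BaseHTTPRequestHandler`. The point of the split is the same either way: the service loads a saved artifact and never imports the training code.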

ML for Recommendation Systems

I also led the conversation around recommender systems. This session delved into scalability, evaluation, sparsity, what’s next, and which tools are giving good results for recommender systems. Speaking of tools, these two open-source libraries were brought up as worth considering:
- FAISS: Facebook’s open-source library for efficient similarity search and clustering. You can try out the code on their GitHub repo.
- Photon: LinkedIn’s open-source ML library, which supports large-scale linear, logistic, and Poisson regression with L1, L2, and elastic-net regularization. You can try out the code on their GitHub repo.

Many companies are finding that these two open-source tools perform better than Spark alone.
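To make the similarity-search idea concrete, here is a small sketch of the exact, brute-force version of the problem: given a user vector, rank every item by cosine similarity and keep the top k. This is not the FAISS API. The item names, vectors, and the helper functions `cosine` and `top_k_similar` are made up for illustration. A scan like this costs time proportional to the number of items for every query, which is the cost that dedicated similarity-search libraries are built to reduce.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_k_similar(query, item_vectors, k):
    """Return the ids of the k items most similar to the query (exact search)."""
    scored = [(cosine(query, vec), item_id) for item_id, vec in item_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]

# Hypothetical item embeddings and a hypothetical user embedding.
items = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.9, 0.1, 0.0],
    "item_c": [0.0, 1.0, 0.0],
    "item_d": [0.0, 0.0, 1.0],
}
user = [1.0, 0.2, 0.0]

recommended = top_k_similar(user, items, k=2)
print(recommended)  # ['item_b', 'item_a']
```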

Interest moves from RL to GANs

In terms of where the field seems to be moving, the biggest change from last year to this year was the overwhelming shift of interest from Reinforcement Learning to GANs. For those who are interested in learning about GANs, you can pull the code for Ian Goodfellow’s groundbreaking paper from its accompanying repository.
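For readers new to the topic, it may help to see the objective at the heart of that paper. A GAN trains a generator G against a discriminator D in a two-player minimax game, where D tries to tell real samples x from generated samples G(z), and G tries to fool it:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here p_data is the distribution of the real data, p_z is the noise prior fed to the generator, and D(x) is the discriminator’s estimated probability that x came from the real data.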

AI Ethics and Policy

Like last year at OpenAI, this year’s SOCML had a large contingent deeply concerned about AI safety, ethics, and policy. One of the outstanding questions for the community is whether to treat AI as we treat nuclear weapons or as a type of low-scale hacking akin to cyber warfare. For now, the community is divided on this issue. One thing that is consistent, however, is scientists’ concern about the potential use of their inventions to harm society.

In all, I was happy to attend SOCML and look forward to continued collaborations with my colleagues both on the applied and research sides of AI. It is often through these kinds of conversations that we discover what works at scale.