Converging minds at SysML 2019

By Nikos Kaltsas, Senior Machine Learning Engineer, and Aris Valtazanos, Head of Machine Learning Engineering, QuantumBlack

2019 brought us the second edition of SysML, a research conference focused on the intersection of systems and machine learning, born out of events like NeurIPS and ICML. This year’s conference was held at Stanford University, where over 500 delegates took advantage of the wide range of demos, keynotes and technical talks on offer.

QuantumBlack’s interdisciplinary Machine Learning Engineering (MLE) team saw SysML as an opportunity to deepen our understanding of the field and improve how we collaborate with other teams. Within QuantumBlack we work closely alongside the Data Science team to build robust machine learning models; this is becoming increasingly important as our clients’ needs shift from single-use prototypes to persistent, production-grade solutions.

Besides the learning opportunity, it was also a great chance for us to do some MLE team bonding!

The QuantumBlack team at SysML 2019

Our takeaways

As with any conference, it’s a challenge to see and hear absolutely everything that’s on show. However, with such a strong QuantumBlack contingent at the event, we managed to capture the key themes. Below is a summary of our top SysML 2019 takeaways.

As the conference progressed, three key themes emerged:

  1. Maturing model lifecycles: This theme was no surprise to us, as models are now regularly used in production and have to be maintained. The solutions we saw presented were inspired by well-established software engineering practices.
  2. The value of data: Several speakers reminded us of the value of the data underpinning a model, and of the risk that subtle changes in that data cause models to produce unexpected results. The consequences affect not only performance but potentially security and privacy too; a topic that attracted a variety of interesting proposed solutions.
  3. Improving Deep Neural Network (DNN) training: This was the most talked-about theme, and one that we see emerging at every major deep learning conference. Multiple groups presented work on optimising communication in a DNN graph, compressing outputs, and parallelising and distributing workloads; one common compression idea is sketched after this list.
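
To make the compression idea concrete, here is a minimal, illustrative sketch of top-k sparsification, one widely used approach in this space (our own toy example, not any specific paper’s method): only the k largest-magnitude entries of a gradient or output tensor are transmitted, together with their indices.

```python
import numpy as np

def topk_compress(tensor, k):
    """Keep only the k largest-magnitude entries (indices + values)."""
    idx = np.argpartition(np.abs(tensor), -k)[-k:]
    return idx, tensor[idx]

def topk_decompress(idx, values, size):
    """Rebuild a dense tensor, zero-filling the dropped entries."""
    dense = np.zeros(size, dtype=values.dtype)
    dense[idx] = values
    return dense

gradient = np.random.randn(1_000_000).astype(np.float32)
idx, values = topk_compress(gradient, k=1_000)       # transmit ~0.1% of entries
restored = topk_decompress(idx, values, gradient.size)
print(f"sent {idx.size:,} of {gradient.size:,} values")
```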

Insightful papers

TicTac from the University of Illinois (authored by Sayed Hadi Hashemi, Sangeetha Abdu Jyothi and Roy H Campbell) covered the topic of accelerating distributed Deep Learning. Platforms like TensorFlow and PyTorch give us great tools to train and deploy DNNs, but training and inference time, which is directly related to cost, still has vast room for improvement. This paper tackles the problem from the perspective of achieving optimal communication in a distributed DNN graph. The authors present a novel technique that substantially improves both training and inference by optimising parameter transfers to account for the order of graph execution.
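
To build intuition for why transfer ordering matters, consider the following toy simulation (our own illustration with invented op names and timings, not the paper’s implementation). Ops execute in sequence, each must wait for its parameter to arrive over a serial link, and sending parameters in execution order reduces stalls:

```python
from itertools import accumulate

# Forward-pass ops in execution order, with (invented) compute times.
ops = [("conv1", 5), ("conv2", 5), ("fc1", 5)]
# (Invented) transfer time for each op's parameter over a serial link.
xfer = {"conv1": 4, "conv2": 4, "fc1": 4}

def makespan(transfer_order):
    # arrival[p]: time at which the link finishes sending parameter p.
    arrival = dict(zip(transfer_order,
                       accumulate(xfer[p] for p in transfer_order)))
    t = 0
    for op, compute in ops:
        t = max(t, arrival[op]) + compute  # stall until the param arrives
    return t

print(makespan(["fc1", "conv2", "conv1"]))  # reversed order: 27
print(makespan(["conv1", "conv2", "fc1"]))  # execution order: 19
```

Sending parameters in the order they are consumed removes the time the first ops would otherwise spend waiting, which is the intuition behind optimising transfer schedules.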

The Data Validation paper from Google (authored by Eric Breck, Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang and Martin Zinkevich) focused on an immensely important aspect of guaranteeing the quality of results: validating the data a model consumes. It tackles the early detection of errors in input data, saving engineering hours and ensuring the quality of models as data is refreshed over time. The examples presented showcase real use cases where missing features or schema changes were detected and fixed early, leading to time and cost savings.
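
TensorFlow Data Validation (TFDV) is the open-source library that grew out of this work at Google; a minimal sketch of the workflow might look like the following (the DataFrames are hypothetical stand-ins for real pipelines):

```python
import pandas as pd
import tensorflow_data_validation as tfdv

# Hypothetical training and serving batches standing in for real pipelines.
train_df = pd.DataFrame({"age": [34, 51, 29], "country": ["GR", "UK", "GR"]})
serving_df = pd.DataFrame({"age": [42, None, 38]})   # 'country' went missing

# 1. Summarise the training data and infer a schema from it.
train_stats = tfdv.generate_statistics_from_dataframe(train_df)
schema = tfdv.infer_schema(train_stats)

# 2. Validate fresh data against that schema as it arrives.
serving_stats = tfdv.generate_statistics_from_dataframe(serving_df)
anomalies = tfdv.validate_statistics(serving_stats, schema)

# The anomalies report flags the missing 'country' feature before it
# can silently degrade the model.
print(anomalies)
```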

Impactful presentations

This year we saw a number of product-focused presentations, including new software and upgrades to major existing frameworks. One of the most exciting was around TensorFlow, with the major announcements being eager execution by default and automatic graph generation coming with version 2.0, as well as JavaScript integration through TensorFlow.js.
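
To illustrate what these two changes mean in practice, here is a minimal sketch (our own example): in TensorFlow 2.0, operations run eagerly by default, while decorating a function with tf.function traces it into a graph and automatically converts data-dependent Python control flow.

```python
import tensorflow as tf  # 2.x

# Eager execution is the default: ops run immediately, no Session needed.
x = tf.constant([[1.0, 2.0]])
w = tf.constant([[3.0], [4.0]])
print(tf.matmul(x, w))            # tf.Tensor([[11.]], shape=(1, 1), ...)

# @tf.function traces the function into a graph; the data-dependent
# Python branch below is rewritten into graph-native control flow.
@tf.function
def square_if_positive(t):
    if tf.reduce_sum(t) > 0:
        return t * t
    else:
        return -t

print(square_if_positive(tf.constant([2.0, -1.0])))  # [4., 1.]
```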

Another notable presentation was ease.ml, a bright attempt at bringing the well-established notion of continuous integration (CI) to the machine learning world. It was great to see the community building on practitioners’ feedback while also taking inspiration from the world of traditional software engineering.
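
We won’t reproduce the ease.ml interface here, but the underlying idea translates naturally into a test that runs on every commit. A hand-rolled sketch (hypothetical numbers and helper, not the ease.ml API) might look like:

```python
# A hand-rolled illustration of CI for ML, not the ease.ml API: a commit
# is rejected unless the candidate model at least matches the currently
# deployed baseline on a held-out test set, within a chosen tolerance.

TOLERANCE = 0.01  # allow up to 1 percentage point of evaluation noise

def quality_gate(candidate_accuracy: float, baseline_accuracy: float) -> bool:
    """Return True if the candidate model may be merged/deployed."""
    return candidate_accuracy >= baseline_accuracy - TOLERANCE

# Hypothetical scores produced earlier in the CI pipeline.
baseline_accuracy = 0.912
candidate_accuracy = 0.917

if not quality_gate(candidate_accuracy, baseline_accuracy):
    raise SystemExit("Model quality gate failed: blocking the merge.")
print("Model quality gate passed.")
```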

Looking ahead to 2020

Not only was it great to hear about the latest trends in systems and machine learning at SysML this year, but we also picked up practical tips for several areas we encounter in our everyday work, such as data validation for machine learning.

Next year, we hope to see the conference cover an even wider range of machine learning topics beyond deep learning, as well as more on the challenges of building machine learning systems for complex real-world applications.

Above all, we are incredibly excited about the new community being formed through SysML, and QuantumBlack looks forward to being part of it!

If you are interested in learning more about us, please visit the QuantumBlack website; for specific roles, contact us at careers@quantumblack.com.