Three conferences and a week of Deep Learning

Apr 10, 2018

by Mohan Reddy, CTO, The Hive

During the week of March 24th I attended three significant conferences in the field of AI: Scaled Machine Learning 2018, GTC 2018, and the TensorFlow Dev Summit 2018. This post is a brief reflection on the techniques and technologies showcased at these conferences, how they align with The Hive's thematic viewpoints, and how they are used in our portfolio companies.

Scaled Machine Learning 2018

Scaled Machine Learning 2018 was held at Stanford University on March 24th, 2018. Organized by Matroid, the conference is always very well attended.

Machine learning applications are becoming mainstream, and it is increasingly important to move from static model serving to dynamic, real-time decision making. This generates new requirements for existing distributed frameworks: real-time model training, construction of task graphs, and model serving over diverse resources. In this light it is apt to recall Yoshua Bengio's 2017 talk, in which he spoke about five key ingredients of machine learning systems:

1. Lots & lots of data

2. Very flexible models

3. Enough computing power

4. Powerful priors that can defeat the curse of dimensionality

5. Computationally efficient inference

The Scaled Machine Learning Conference set out to address the scaling of machine learning and deep learning architectures, and it had an impressive lineup of speakers. Reza Zadeh, the conference organizer, referred to them as the "ML Avengers" and likened the summit to the 1927 Solvay Conference on Electrons and Photons in Brussels.

Source: Twitter

Ion Stoica spoke about Ray, a system for distributed AI, and how RISELab is building open-source platforms, tools, and algorithms for secure, explainable, real-time intelligent decisions on live data. I have used Ray for reinforcement learning applications, and it provides a powerful combination of flexibility, performance, and ease of use.
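To give a taste of that ease of use, here is a minimal sketch of Ray's task API; the function and values are my own illustration, not from the talk:

```python
import ray

ray.init()  # start a local Ray runtime

@ray.remote
def square(x):
    # Runs asynchronously as a task, potentially on another worker or machine
    return x * x

# Launching tasks returns futures immediately; ray.get blocks for the results
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```

The same decorator-based model extends to stateful actors, which is what makes Ray a natural fit for reinforcement learning workloads.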

Andrej Karpathy talked about the Software 2.0 stack, in which programs are written as neural networks and the architecture and code are created by computers rather than by hand. Google appears to be at the forefront of this thought process and is developing some of its systems using the Software 2.0 paradigm. This is an active area of research at The Hive, and we will present some of our work in this area in a future blog.
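To make the contrast concrete, here is a toy example of my own (not Karpathy's): in Software 1.0 the behavior is hand-written code, while in Software 2.0 the behavior lives in learned weights.

```python
import numpy as np

# Software 1.0: the behavior is written explicitly by a programmer
def is_positive_v1(x):
    return x > 0

# Software 2.0: the behavior is learned from data; the "source code" is weights
X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

w, b = 0.0, 0.0
for _ in range(2000):  # plain gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(w * X + b)))  # sigmoid predictions
    w -= 0.5 * np.mean((p - y) * X)
    b -= 0.5 * np.mean(p - y)

def is_positive_v2(x):
    # The learned parameters w, b now encode the "program"
    return 1.0 / (1.0 + np.exp(-(w * x + b))) > 0.5

print(is_positive_v1(1.5), is_positive_v2(1.5))  # True True
```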

Jeff Dean spoke about AutoML and neural architecture search, TPU design considerations, and areas where ML can be applied (meta-learn everything!). We believe that one of the most important areas of ML application will be datacenter systems. Peritus, one of our portfolio companies, is at the forefront of SystemsML, with the aim of building autonomous software.

Source: Twitter. Slide from Jeff Dean's talk: teaching machines to learn to solve new problems without human ML expert intervention.
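For intuition about what architecture search automates, here is a toy random-search loop over a hyperparameter space. Real NAS systems use reinforcement learning or evolution over far richer spaces; the names and the scoring stub here are mine:

```python
import random

random.seed(0)

# Toy search space over architecture choices
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "units_per_layer": [32, 64, 128],
    "activation": ["relu", "tanh"],
}

def sample_architecture():
    """Draw one candidate architecture at random from the search space."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(arch):
    """Stand-in for building, training, and validating the candidate model."""
    return random.random()  # replace with real validation accuracy

candidates = [sample_architecture() for _ in range(20)]
best = max(candidates, key=evaluate)
print("best candidate found:", best)
```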

There were other interesting talks by industry luminaries describing techniques and experiences that are highly applicable to the kinds of problems we are solving in AI today.

NVIDIA GPU Technology Conference 2018

The NVIDIA GPU Technology Conference, one of the biggest conferences in the AI field, was held at the San Jose Convention Center from March 26th to 29th, 2018.

Source: Twitter. The line for the keynote; 8,000+ attendees.

GTC has become a premier AI/deep learning event, showcasing the latest breakthroughs in self-driving cars, smart cities, healthcare, big data, high-performance computing, and virtual reality.

In my opinion, it is the single best place to see how the future of technology is shaping up.

In his keynote address, Jensen Huang said that Nvidia's GPUs today are 25 times faster than they were five years ago, and he coined a new "Supercharged Law" along the lines of Moore's Law; IEEE Spectrum refers to it as Huang's Law. AlexNet took six days to train on two of Nvidia's GTX 580s; on the DGX-2 it takes 18 minutes (six days is roughly 8,640 minutes, so a speedup of about a factor of 500). The DGX-2 is indeed a supercomputer and will pave the way for great research in deep learning, advanced image processing, and related fields.

Source: conference photo. NVIDIA CEO and co-founder Jensen Huang unveiling the DGX-2, the world's largest GPU. "The innovation isn't just about chips, it's about the entire stack." (Jensen Huang)

Among the many announcements, Project Clara was particularly interesting. Clara, a medical imaging supercomputer deployed on NVIDIA's cloud platform, transforms standard medical images such as X-rays, ultrasound scans, CTs, MRIs, PETs, and mammograms into high-resolution cinematic renderings. This is going to be a game changer in medical imaging.

NVIDIA has done a good job of building out the AI ecosystem: faster GPU systems, software tools for deep learning training such as DIGITS, and TensorRT for inference. TensorRT is now natively integrated with TensorFlow 1.7. Huge kudos to them for empowering AI systems for the long term.
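For reference, converting a frozen TensorFlow graph with the integrated optimizer looked roughly like this in the TF 1.7 contrib API. This is a sketch: `frozen_graph_def` and the "logits" output name stand in for a real model.

```python
# Sketch of the TF 1.7 TensorRT integration (tensorflow.contrib.tensorrt).
# Assumes frozen_graph_def is a frozen GraphDef whose output node is "logits".
import tensorflow.contrib.tensorrt as trt

trt_graph_def = trt.create_inference_graph(
    input_graph_def=frozen_graph_def,  # frozen model to optimize
    outputs=["logits"],                # output node names
    max_batch_size=1,                  # largest batch size expected at inference
    max_workspace_size_bytes=1 << 30,  # scratch memory TensorRT may allocate
    precision_mode="FP16",             # run eligible subgraphs in half precision
)
# trt_graph_def can now be imported and served like any other GraphDef.
```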

TensorFlow Developer Summit 2018

The TensorFlow Dev Summit was held at the Computer History Museum on March 30th, 2018. It was a great day packed with learnings and exciting new announcements. There have been great advances in many different fields using TensorFlow; one notable example is scientists in Africa using TensorFlow to detect diseases in cassava plants and improve yields for farmers.

Other noteworthy announcements include:

  • TensorFlow 1.7
  • TensorFlow.js: support for machine learning in JavaScript.
  • Swift for TensorFlow: TensorFlow support in the Swift language.
  • TensorBoard debugger GUI plugin
  • TensorFlow Hub: a library for sharing reusable pieces of machine learning models, called modules.
  • Eager Execution

I am particularly excited about eager execution. Eager execution evaluates operations immediately, without an extra graph-building step: operations return concrete values instead of constructing a computational graph to run later. This makes it easier to get started with TensorFlow, debug models, and reduce boilerplate code.
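A minimal sketch of the difference, using the TF 1.x-era API (in later 2.x releases eager execution became the default):

```python
import tensorflow as tf

tf.enable_eager_execution()  # opt in; in TensorFlow 2.x this is the default

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)

# Operations run immediately and return concrete values: no Session, no graph step
print(y)          # tf.Tensor with values [[ 7. 10.] [15. 22.]]
print(y.numpy())  # a plain NumPy array, easy to inspect while debugging
```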

TensorFlow Hub is a good addition. In the machine learning world, sharing models matters as much as sharing code. Sharing pre-trained models makes it easier for developers to customize them for their respective domains without spending on the thousands of GPU-hours of compute that training from scratch usually takes.
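Here is a minimal sketch of consuming a Hub module with the TF 1.x API, using one of the publicly listed text-embedding modules; the sentences are my own placeholders:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Load a pre-trained text-embedding module by URL (downloaded and cached locally)
embed = hub.Module("https://tfhub.dev/google/nnlm-en-dim128/1")
embeddings = embed(["machine learning at scale", "model sharing"])

with tf.Session() as sess:
    # Text modules carry lookup tables as well as variables
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    print(sess.run(embeddings).shape)  # (2, 128): one 128-dim vector per sentence
```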

Source: tensorflow.org

TensorFlow is the most popular machine learning library, with strong distributed support, good performance, and wide platform coverage, and it is widely adopted across the industry. We use TensorFlow in various applications across The Hive portfolio, and the recent announcements will be immensely useful in scaling those applications.


The Hive is a venture fund and co-creation studio based in Palo Alto, CA, that co-creates startups focused on AI-powered applications in the enterprise.