TF Dev Summit 2019: Top 5 TensorFlow Announcements

Learnings from Google’s TensorFlow Conference

Ray Yamamoto Hilton
Eliiza-AI
5 min read · Mar 18, 2019


What we need is moiré TensorFlow

Not covered here: it might also be of interest to take a look at Google’s new Coral Edge TPUs

I had the opportunity to attend the #TFDevSummit in Sunnyvale, CA this year, and while there was a LOT to take in, these are my top picks of what we will be keeping a close eye on.

For the uninitiated, this HackerNoon post does a good job of giving an overview of what TensorFlow actually is. Over the last few years it has gained a lot of traction and is quickly becoming the de facto standard for end-to-end machine learning workflows: https://www.tensorflow.org/

An overview of what TensorFlow covers

The picture above shows how TF is attempting to cover an increasingly large part of the overall machine learning workflow, not just training, but also operational and data processing functions.

Just to give a quick overview: the main focus of the event was TF2 (which is actually a whole bunch of things). The main goals are:

Simpler

  • Simplified APIs
  • Keras Layers API adopted as the main way to interact with TF
  • Eager execution by default (see the sketch after these lists)

Cleaner

  • Removal of duplicate and legacy code
  • Better abstractions

Better Documentation

  • I love that this is being touted as a major feature of TF2
  • The documentation for TF so far has been lacking and often out of date
  • TF2 coincides with a major website and online resources update
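
To make the “simpler” goal concrete, here is a minimal sketch (with a placeholder model, not anything shown at the summit) of the two headline defaults: eager execution, so ops run immediately without a Session, and the Keras Layers API as the front door for model building.

```python
import tensorflow as tf  # TF 2.x

# Eager execution is on by default: this op runs immediately,
# no Session or explicit graph construction required.
x = tf.constant([[1.0, 2.0]])
print((x * 3.0).numpy())  # [[3. 6.]]

# The Keras Layers API is the main way to build models in TF2.
# The layer sizes here are arbitrary placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(2,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')
```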

The relatively benign, evolutionary nature of these goals is a sign of maturity in the platform, and a sign that it is gradually settling down (in a good way). OK, on to the top picks:

It’s good to have aspirational goals

1. TensorFlow Extended

  • Seems to include everything, including the kitchen sink
  • Think of it as TF-orientated, mildly-opinionated CI/CD for machine learning
  • TFX is everything else that is needed to make an end-to-end machine learning workflow: generating training/test data, generating statistics over the training data, generating schemas, validating examples, transforming data, training, evaluating models (against previous versions), and pushing models to TF Serving, TF Lite, etc. (see the sketch after this list)
  • While it’s designed to be relatively agnostic, the currently supported orchestration layer is Apache Airflow
  • For some reason, known only to Google, it currently requires Python 2.7 (ARGH)
TFX aims to cover all those other niggly bits of your machine learning workflow
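
As a rough illustration of what that “mildly-opinionated CI/CD” looks like in code, here is a hedged sketch of wiring up the first few TFX components. The component names are real TFX components, but constructor arguments have shifted between TFX releases, and the data path and module file below are hypothetical placeholders.

```python
# Sketch of the start of a TFX pipeline. Component names are real;
# exact constructor signatures vary between TFX releases, and
# 'data/' and 'preprocessing.py' are hypothetical placeholders.
from tfx.components import (
    CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator, Transform,
)

example_gen = CsvExampleGen(input_base='data/')          # generate train/eval examples
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])
validator = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'],
)
transform = Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file='preprocessing.py',                      # user-defined preprocessing
)
# ...followed by Trainer, Evaluator and Pusher, then handed to an
# orchestrator such as Apache Airflow.
```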

2. TensorFlow2 Distribution Strategies

  • In TF 1.x, code had to be written specifically to take advantage of multiple CPUs, GPUs, TPUs, etc.
  • In TF 2.0, the introduction of distribution strategies allows your code to be scaled over multiple types of processors, as well as multiple nodes, without changing your core code
  • This will make scaling from one’s local machine to harnessing the power of Google’s Cloud ML (i.e. multiple nodes, all with multiple GPUs) a much simpler task (see the sketch below).
Easily scale your code over multi-GPUs and multi-nodes using strategy scopes
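
Here is a minimal sketch of that strategy-scope pattern (the model and shapes are placeholders): build and compile the model inside the scope, and the strategy takes care of replicating it across devices.

```python
import tensorflow as tf

# MirroredStrategy replicates the model across all local GPUs.
# Swapping in a different strategy (e.g. for TPUs or multi-worker
# setups) should not require changes to the model code itself.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Variables created in here are mirrored across replicas.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')

# model.fit(train_dataset)  # the training call itself is unchanged
```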

3. TensorFlow Lite Optimisations & Selective Operators

  • TFLite does a lot of clever optimisations
  • Quantising can reduce model size and also increase performance on simpler CPUs, at the expense of some accuracy
  • Quantising during training offers the best results, but TFLite’s post-training quantisation comes pretty damn close
  • This means we can confidently train only a float-based model and quantise it after the fact for devices that require it (see the sketch after this list).
  • TFLite has traditionally supported only a subset of all available TF operations.
  • Selective operators allow support for additional operators to be added as needed, the trade-off being a larger deployment size and some performance cost
  • Given that writing custom ops can be quite cumbersome, this is a very welcome trade-off.
  • It’s also worth mentioning that TFLite supports GPU acceleration on Android and iOS (via Metal)
  • Oh, TF Lite also has an (experimental) Swift API!
The new TF logo makes my sense of perspective hurt
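
For reference, both post-training quantisation and selective operators hang off the TFLite converter. Here is a minimal sketch, with a placeholder Keras model standing in for a real one:

```python
import tensorflow as tf

# A tiny placeholder model, just to make the sketch self-contained.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Post-training quantisation: train in float, quantise afterwards
# for the devices that need it.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Selective operators: fall back to full TF ops for anything TFLite's
# built-ins don't cover, trading deployment size and some performance.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]

tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
```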

4. TensorFlow.js + WebGL Acceleration

5. tf.data & TF Datasets

  • Data processing and pipelining right within TF
  • Support for TFRecords and other TF primitives
  • Used in TFX to transform data
  • TF Datasets provides an easy way to pull in and process open datasets.
  • Data isn’t hosted by TF; it’s brought in from its original location and processed into TFRecords on demand, according to how you want to batch/split/etc. (see the sketch below)
  • The project is open, so we’re encouraged to contribute our own datasets
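
As a minimal sketch of the two working together (MNIST is just a convenient example dataset here):

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# TF Datasets pulls the data from its original location and
# materialises it as TFRecords on first use.
ds = tfds.load('mnist', split='train', as_supervised=True)

# tf.data then handles the shuffle/batch/prefetch pipeline.
ds = (ds.shuffle(10_000)
        .batch(32)
        .prefetch(tf.data.experimental.AUTOTUNE))

# for images, labels in ds: ...  # feed straight into training
```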

Summary

We’re very excited by all the above developments and will be investigating and adopting them in our day-to-day workflows. I’d be very keen to hear what others made of the conference announcements and what is most exciting for your research or company.

If you would like to read more on all of the above, here is a collection of resources referenced in this article

…hang on, why haven’t you mentioned Swift for TensorFlow?

I have purposely not mentioned Swift for TensorFlow here, as I think this effort is worthy of an entire article in its own right. It is also operating over a much longer term, so it is not something that will likely be relevant to most people in the near future.
