TF Dev Summit 2019: Top 5 TensorFlow Announcements

Learnings from Google’s TensorFlow Conference

Ray Yamamoto Hilton
Eliiza-AI
5 min read · Mar 18, 2019


What we need is moiré TensorFlow

Not covered here: it might also be of interest to take a look at Google’s new Coral Edge TPUs

I had the opportunity to attend the #TFDevSummit in Sunnyvale, CA this year, and while there was a LOT to take in, these are my top picks of what we will be keeping a close eye on.

For the uninitiated, this HackerNoon post does a good job of giving an overview of what TensorFlow actually is. Over the last few years it has gained a lot of traction and is quickly becoming the de facto standard for end-to-end machine learning workflows: https://www.tensorflow.org/

An overview of what TensorFlow covers

The picture above shows how TF is attempting to cover an increasingly large part of the overall machine learning workflow, not just training, but also operational and data processing functions.

Just to give a quick overview: the main focus of the event was TF2 (which is actually a whole bunch of things). The main goals are:

Simpler

  • Simplified APIs
  • Keras Layers API adopted as the main way to interact with TF
  • Eager execution by default (see the sketch after these lists)

Cleaner

  • Removal of duplicate and legacy code
  • Better abstractions

Better Documentation

  • I love that this is being touted as a major feature of TF2
  • The documentation for TF so far has been lacking and often out of date
  • TF2 coincides with a major website and online resources update
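
To make the “simpler” goal concrete, here is a minimal sketch (with a placeholder model, not anything shown at the summit) of the two headline defaults: eager execution, so ops run immediately without a Session, and the Keras Layers API as the front door for model building.

```python
import tensorflow as tf  # TF 2.x

# Eager execution is on by default: this op runs immediately,
# no Session or explicit graph construction required.
x = tf.constant([[1.0, 2.0]])
print((x * 3.0).numpy())  # [[3. 6.]]

# The Keras Layers API is the main way to build models in TF2.
# The layer sizes here are arbitrary placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(2,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')
```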

The relatively benign, evolutionary nature of these goals is a sign of maturity in the platform, and a sign that it is gradually settling down (in a good way). OK, on to the top picks:

It’s good to have aspirational goals

1. TensorFlow Extended

  • Seems to include everything, including the kitchen sink
  • Think of it as TF-orientated, mildly-opinionated CI/CD for machine learning
  • TFX is everything else that is needed to make an end-to-end machine learning workflow: generating training/test data, generating statistics over the training data, generating schemas, validating examples, transforming data, training, evaluating models (against previous versions), and pushing models to TF Serving, TF Lite, etc. (see the sketch after this list)
  • While it’s designed to be relatively agnostic, the currently supported orchestration layer is Apache Airflow
  • For some reason, known only to Google, it currently requires Python 2.7 (ARGH)
TFX aims to cover all those other niggly bits of your machine learning workflow
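
As a rough illustration of what that “mildly-opinionated CI/CD” looks like in code, here is a hedged sketch of wiring up the first few TFX components. The component names are real TFX components, but constructor arguments have shifted between TFX releases, and the data path and module file below are hypothetical placeholders.

```python
# Sketch of the start of a TFX pipeline. Component names are real;
# exact constructor signatures vary between TFX releases, and
# 'data/' and 'preprocessing.py' are hypothetical placeholders.
from tfx.components import (
    CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator, Transform,
)

example_gen = CsvExampleGen(input_base='data/')          # generate train/eval examples
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])
validator = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'],
)
transform = Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file='preprocessing.py',                      # user-defined preprocessing
)
# ...followed by Trainer, Evaluator and Pusher, then handed to an
# orchestrator such as Apache Airflow.
```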

2. TensorFlow2 Distribution Strategies

  • In TF 1.x, code had to be written specifically to take advantage of multiple CPUs, GPUs, TPUs, etc.
  • In TF 2.0, the introduction of distribution strategies allows your code to be scaled over multiple types of processors, as well as multiple nodes, without changing your core code
  • This will make scaling from one’s local machine to harnessing the power of Google’s Cloud ML (i.e. multiple nodes, all with multiple GPUs) a much simpler task (see the sketch below).
Easily scale your code over multi-GPUs and multi-nodes using strategy scopes
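
Here is a minimal sketch of that strategy-scope pattern (the model and shapes are placeholders): build and compile the model inside the scope, and the strategy takes care of replicating it across devices.

```python
import tensorflow as tf

# MirroredStrategy replicates the model across all local GPUs.
# Swapping in a different strategy (e.g. for TPUs or multi-worker
# setups) should not require changes to the model code itself.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Variables created in here are mirrored across replicas.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')

# model.fit(train_dataset)  # the training call itself is unchanged
```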

3. TensorFlow Lite Optimisations & Selective Operators

  • TFLite does a lot of clever optimisations
  • Quantising can reduce model size and also increase performance on simpler CPUs, at the expense of some accuracy
  • Quantising during training offers the best results, but TFLite’s post-training quantisation comes pretty damn close
  • This means we can confidently train only a float-based model and quantise it after the fact for devices that require it (see the sketch after this list).
  • TFLite has traditionally supported only a subset of all available TF operations.
  • Selective operators allow support for additional operators to be added as needed, the trade-off being a larger deployment size and some performance cost
  • Given that writing custom ops can be quite cumbersome, this is a very welcome trade-off.
  • It’s also worth mentioning that TFLite supports GPU acceleration on Android and iOS (via Metal)
  • Oh, TF Lite also has an (experimental) Swift API!
The new TF logo makes my sense of perspective hurt
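
For reference, both post-training quantisation and selective operators hang off the TFLite converter. Here is a minimal sketch, with a placeholder Keras model standing in for a real one:

```python
import tensorflow as tf

# A tiny placeholder model, just to make the sketch self-contained.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Post-training quantisation: train in float, quantise afterwards
# for the devices that need it.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Selective operators: fall back to full TF ops for anything TFLite's
# built-ins don't cover, trading deployment size and some performance.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]

tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
```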

4. TensorFlow.js + WebGL Acceleration

5. tf.data & TF Datasets

  • Data processing and pipelining right within TF
  • Support for TFRecords and other TF primitives
  • Used in TFX to transform data
  • TF Datasets provides an easy way to pull in and process open datasets.
  • Data isn’t hosted by TF; it’s brought in from its original location and processed into TFRecords on demand, according to how you want to batch/split/etc. (see the sketch below)
  • The project is open, so we’re encouraged to contribute our own datasets
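
As a minimal sketch of the two working together (MNIST is just a convenient example dataset here):

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# TF Datasets pulls the data from its original location and
# materialises it as TFRecords on first use.
ds = tfds.load('mnist', split='train', as_supervised=True)

# tf.data then handles the shuffle/batch/prefetch pipeline.
ds = (ds.shuffle(10_000)
        .batch(32)
        .prefetch(tf.data.experimental.AUTOTUNE))

# for images, labels in ds: ...  # feed straight into training
```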

Summary

We’re very excited by all the above developments and will be investigating and adopting them in our day-to-day workflows. I’d be very keen to hear what others made of the conference announcements and what is most exciting for your research or company.

If you would like to read more on all of the above, here is a collection of resources referenced in this article

…hang on, why haven’t you mentioned Swift for TensorFlow?

I have purposely not mentioned Swift for TensorFlow here, as I think this effort is worthy of an entire article in its own right. It is also operating over a much longer term, so it is not something that will likely be relevant to most people in the near future.
