Highlights from O’Reilly’s TensorFlow World 2019

Marco Angel Bertani-Økland · Published in Compendium · 9 min read · Nov 11, 2019

Ben Lorica and Edd Wilder-James welcoming us to TensorFlow World 2019

I’m now on my way back to Norway after attending the world’s largest gathering of TensorFlow experts and enthusiasts, organized by O’Reilly. The first two days were dedicated to tutorials, and the last two to the main conference program. Take a look at the conference schedule for an overview of the program, and at the associated YouTube channel, where the talks will be available in about three weeks.

Me at the registration booth

The conference had a strong focus on the following themes:

  • TensorFlow 2.0
  • Kubeflow, the machine learning toolkit for Kubernetes.
  • TensorFlow Extended (TFX). If you are not familiar with TFX, you can think of it as a set of best practice templates open-sourced by Google on how to build machine learning pipelines.

In what follows, I will discuss just a few of the talks that impacted me the most. This selection reflects my personal bias, of course, so it explores only a “slice” of the conference, given my personal experience. Please look at other posts about TensorFlow World to get a fuller picture of this great conference.

Production ML pipelines with TensorFlow Extended (TFX)

This tutorial was led by Aurélien Géron, who did a great job taking us through all of TFX’s components for building end-to-end pipelines. The workshop was organized around Colab notebooks, which you can try for yourself in the following GitHub repo. The notebooks are so well written that even if you didn’t attend the tutorial, you should be able to follow the important steps. Several other speakers took us through the following themes (with corresponding demo notebooks in the repo mentioned above):

  • Fairness Indicators for TFX, released during the conference. This is a first step towards measuring bias in TensorFlow. At the moment there are no mitigation algorithms implemented, and the focus is on classification problems only. IBM already has a great library, AIF360, that not only detects bias but also mitigates it. I believe the community would benefit if these two projects worked together to standardize how we approach fairness problems. We need fewer tools solving the same issues, and a better understanding of how to go from abstract moral principles to a correct technical implementation of fairness. I spent some time during the conference lobbying this idea to the Google and IBM teams behind these projects, and it seems they have already started talking about these issues. I really hope a fruitful cooperation comes out of it, from which the whole community will benefit. Both Google and IBM gave talks about the ethical use of AI. Make sure you watch them when they become available!
  • How to build a CI/CD pipeline with TFX on Google Cloud (code available here). This is a good starting point if you plan to integrate TFX into a CI/CD pipeline. The code comes with Terraform scripts that automate the whole setup on GCP. Definitely check out the code and give it a try!
  • Neural Structured Learning (video), which tackles the problem of how to exploit structure in your data. Such structure is usually not represented in your features, and this technique lets you express relationships between your examples. Applications include increasing model accuracy when you have little labeled data, and making your model robust to adversarial attacks.
  • TensorFlow Serving. In short: don’t use Gunicorn/Flask for deep learning inference. They are not optimized for this kind of workload; done wrong, you may block your GPU, and there are several other issues. Watch the talk when it becomes available.
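To make the fairness-measurement idea above concrete, here is a minimal sketch, in plain Python, of two classic group-fairness metrics: statistical parity difference and disparate impact. The data is made up for illustration; libraries like Fairness Indicators and AIF360 compute these (and many more) at scale over real evaluation sets.

```python
# Toy evaluation data: (belongs_to_protected_group, model_predicted_positive).
records = [
    (True, True), (True, False), (True, False), (True, False),
    (False, True), (False, True), (False, False), (False, True),
]

def positive_rate(rows):
    """Fraction of rows that received a positive decision."""
    return sum(pred for _, pred in rows) / len(rows)

protected = [r for r in records if r[0]]
rest = [r for r in records if not r[0]]

# Statistical parity difference: 0 means equal positive rates across groups.
spd = positive_rate(protected) - positive_rate(rest)
# Disparate impact: ratio of positive rates; the "80% rule" flags values below 0.8.
di = positive_rate(protected) / positive_rate(rest)

print(f"statistical parity difference: {spd:+.2f}")  # -0.50
print(f"disparate impact: {di:.2f}")                 # 0.33
```

Both metrics flag the toy model above as biased against the protected group, which is exactly the kind of signal Fairness Indicators surfaces, sliced per group, in a TFX pipeline.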

All in all, this was a fantastic tutorial. But let me be picky about a few things that were not covered, which is understandable given the time constraints:

  • Hyper-parameter tuning. It is not clear to me how you do hyper-parameter tuning of a model in TFX. One way is to create a custom component (there is an example notebook in the repo). After talking to the TFX team, I heard they are working on a standard component that will take care of this and make things simpler for us.
  • Fixing random seeds for reproducibility. As far as I understood, the random seed is not exposed in TFX from the Beam implementation of some of the components. I may have misunderstood here; I have an ongoing discussion with the TFX team, and I’ll update this post when I know more.
  • Better sampling techniques. TFX comes with a way to gather statistics about your data. These are used downstream in the pipeline to detect anomalies like feature skew, bad train/test splits or model bias, to name a few. The way I see it, it would be very beneficial to do a first pass over the data and calculate these statistics on the whole dataset. One could then have a standard mechanism to split the data into train/test/eval given those statistics (and possibly other constraints to prevent data leakage). This would be a preventive measure, instead of discovering imbalances in the data at a later stage and redoing the whole process. At the moment, as far as I know, there is no “simple” way to perform this kind of sampling in TFX, but a good starting point is using Beam’s composite transforms to build your own TFX component. Any takers?
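The statistics-driven split described in the last bullet can be prototyped without TFX at all. Here is a minimal sketch in plain Python (the `label` field and the 80/10/10 fractions are hypothetical) of a stratified train/test/eval split that uses per-class counts — the kind of dataset-wide statistic StatisticsGen computes — to keep the class balance identical across splits:

```python
import random
from collections import defaultdict

def stratified_split(examples, key, fractions=(0.8, 0.1, 0.1), seed=42):
    """Split examples into train/test/eval, preserving the distribution of `key`."""
    rng = random.Random(seed)  # fixed seed, so the split is reproducible
    by_class = defaultdict(list)
    for ex in examples:
        by_class[ex[key]].append(ex)
    splits = ([], [], [])
    for group in by_class.values():
        rng.shuffle(group)
        a = int(len(group) * fractions[0])
        b = a + int(len(group) * fractions[1])
        splits[0].extend(group[:a])   # train
        splits[1].extend(group[a:b])  # test
        splits[2].extend(group[b:])   # eval
    return splits

data = [{"label": i % 2, "x": i} for i in range(100)]  # perfectly balanced classes
train, test, eval_ = stratified_split(data, key="label")
print(len(train), len(test), len(eval_))  # 80 10 10
```

Wrapped in a Beam composite transform, the same logic would become the custom TFX component the bullet above asks for.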

Managing the full deployment lifecycle of TensorFlow models with the MLflow Model Registry (sponsored by Databricks)

Talk by Clemens Mewald

A great overview of how MLflow can help you keep track of experiments when building an ML pipeline. The talk opened with a great comparison of the development workflows of normal software engineering and machine learning. Clemens then showed how MLflow covers the parts that normal software development tools don’t. Finally, we got a demo of MLflow’s new Model Registry, which exposes a great user interface where teams can work together to share, experiment with, test and monitor ML models. It also integrates with approval and governance workflows. All in all, a great tool for us practitioners!

As far as I know, no component in TFX or Kubeflow offers functionality equivalent to MLflow’s Model Registry. But conference rumours from the Kubeflow contributors suggest that work on such a component is in progress and that it will be released soon. It seems to me that the community needs to standardize the formats for model governance, so that we can work across technologies and reduce fragmentation. From the rumours I’ve heard, people are already thinking along these lines, so hopefully things will work out fine.

Performant, scalable models in TensorFlow 2.0 with tf.data, tf.function, and tf.distribute

Talk by Taylor Robie and Priya Gupta. (video)

If you care about squeezing the best possible performance out of your TensorFlow pipelines, you should watch this talk. It comes with lots of practical tips that you can try in the accompanying Colab notebook while you watch.
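As a taste of the kind of tips the talk covers, here is a minimal sketch of the standard TF 2.0 input-pipeline recipe: parallelize the `map`, batch, then `prefetch` so input processing overlaps with training on the accelerator, and compile the training step with `tf.function`. The `preprocess` function and the dataset are placeholders for illustration.

```python
import tensorflow as tf

def preprocess(x):
    # Stand-in for real parsing/augmentation work.
    return tf.cast(x, tf.float32) / 255.0

dataset = (
    tf.data.Dataset.range(1000)
    # Run the map in parallel; AUTOTUNE picks the parallelism level for you.
    .map(preprocess, num_parallel_calls=tf.data.experimental.AUTOTUNE)
    .batch(32)
    # Prefetch so the next batch is prepared while the current one trains.
    .prefetch(tf.data.experimental.AUTOTUNE)
)

@tf.function  # trace the Python step into a graph for speed
def train_step(batch):
    # Stand-in for a real forward/backward pass.
    return tf.reduce_mean(batch)
```

The Colab notebook from the talk walks through how each of these calls shows up in a profiler trace.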

TensorFlow Privacy: Learning with differential privacy for training data

Talk by Úlfar Erlingsson

This was the talk that impacted me the most during the whole conference. It addressed some of the issues that I’ve been wondering about for quite some time now. The talk covered what differential privacy is, how to train machine learning models with privacy, and how to use the tensorflow_privacy library.

Your first thought might have been: “Sure, privacy, well, my data is not sensitive, so I don’t need that”, and you’d be wrong!

“… differential privacy is a stability guarantee which is fundamentally aligned with the central goal of statistics, namely, to learn from data about the population as a whole and not about specific individuals.”
http://blog.mrtz.org/2015/03/13/practicing-differential-privacy.html
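For reference, the formal guarantee behind this quote: a randomized mechanism $M$ is $(\varepsilon, \delta)$-differentially private if, for all datasets $D$ and $D'$ differing in a single example, and all sets of outputs $S$,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
```

A small $\varepsilon$ means the output distribution barely changes when any one individual’s data is added or removed, which is exactly the sense in which the algorithm learns about the population rather than about specific individuals.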

Training with privacy helps your model learn from the population, so that it is able to generalize. Neural networks tend to memorize training data, especially data lying at the tails of the distribution. Informally speaking, when you train with privacy, you do it in such a way that the model obtained from training with a certain individual example is very similar to the one you would have obtained without that example in the first place. I know, it sounds like magic, and in a sense there are some nice mathematical spells under the hood!
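The training-time mechanics behind this (DP-SGD, which tensorflow_privacy packages as drop-in optimizers) can be sketched in a few lines of plain Python. The gradients and constants below are made up for illustration; real implementations also track the privacy budget spent.

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """One DP-SGD step: clip each example's gradient, average, add Gaussian noise.

    Clipping bounds any single example's influence on the update; the noise hides
    whether a particular example was present at all. Together they give the
    differential privacy guarantee.
    """
    rng = random.Random(seed)
    clipped = []
    for g in per_example_grads:
        norm = math.sqrt(sum(v * v for v in g))
        scale = min(1.0, clip_norm / norm)  # shrink gradients larger than clip_norm
        clipped.append([v * scale for v in g])
    n = len(clipped)
    return [
        sum(g[i] for g in clipped) / n
        + rng.gauss(0.0, noise_multiplier * clip_norm) / n
        for i in range(len(clipped[0]))
    ]

grads = [[3.0, 4.0], [0.3, 0.4]]  # one gradient vector per training example
print(dp_sgd_step(grads))
```

With `noise_multiplier=0` this reduces to ordinary clipped SGD; the larger the noise, the stronger the privacy and the slower the learning.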

One benefit of the technique is that it lets you determine, in a consistent way, which instances in your training data are “well-represented”, given how your model “sees” the data. The intuition is that you get a “score” that lets you order your training data by an “outlier rank” (see arxiv:1910.13427, section 7, for examples). This requires several iterations of training with privacy enabled. The payoff is that it tells you what kind of data you have enough of, and what kind you need to collect more of. So it can also be used as an interpretability technique (see arxiv:1910.13427, section 4)! Úlfar tells me code will soon be released that will let us play with these ideas in a more concrete manner. I’m really looking forward to it, and I’ll update the article when I get the link.

The other advantage is GDPR compliance. What happens when you train on a specific user’s data and the user requests that their data be deleted? Do you have to throw away your model and start again? Well, if you were smart and trained with privacy enabled, you don’t need to do a thing! The model can’t really remember any particular example it was trained on, because differential privacy sets bounds on the information the algorithm can leak about its inputs (see section 9.3 of arxiv:1802.08232).

One cool application is fine-tuning language models, like BERT, on sensitive data. Doing the fine-tuning with privacy enabled gives you a model that respects user privacy.

A cool tip to remember: if your training and evaluation loss curves follow each other very closely as they converge, then your model is training correctly with privacy enabled. Intuitively, this happens because the model should not be able to tell which data comes from the training set and which from the evaluation set.

Usually you will see a bit of a performance drop when you use this technique. That is not necessarily bad, since it also protects you from overfitting your model architecture to a given problem (see arxiv:1806.0045). So you get a better estimate of the true skill of your model in the real world. Hopefully you are now as convinced as I am to use this technique when possible. One caveat: you will have to drink more coffee than usual, since training times can increase by 10x–100x (see section 10.1 of arxiv:1802.08232 for a full overview of the limitations and future work).

Natural language processing using transformer architectures

Talk by Aurélien Géron

Well, this talk finally gave me the push I’ve been needing to dive into transformer architectures. Aurélien explains in a nice, intuitive way how these architectures are put together and what each component does. If you have been waiting for that push too, I really recommend his talk.

The conference opened an amusement park for us to celebrate Halloween!

Conclusion

All in all, it was a great experience coming to TensorFlow World 2019. I had lots of great discussions with the other participants, the TFX team, Google, Nvidia, IBM and some of the speakers. I return home humbled by the knowledge of the community, and full of best practices and new ideas for tackling the problems our customers face.

And to finish, I’m probably being extremely unfair to the rest of the talks at TensorFlow World. To be honest, I often struggled to pick a talk, given the availability of so many great speakers. Fortunately, you can follow TensorFlow’s YouTube channel and choose the talks that interest you. Let me know in the comments which other talks impacted you the most.

I’m happy to connect with you if you want to discuss more about TensorFlow, TFX, Kubeflow, Interpretability or Fairness. And if you want to see how we can help you with your use case, you can visit our home page and reach out to us!

Useful links to play with

I leave you with a list of several interesting links from other talks and tutorials at TensorFlow World. Thanks to Rahul Joglekar for some of these links!

Acknowledgements

I would like to thank Ole-Magnus Bergby, Michael Gfeller and Josephine Honoré for their feedback on earlier drafts of this article.
