Comparing auto-diff and dynamic model sub-classing approaches with PyTorch 1.x and TensorFlow 2.x

Source: Author

The data science community is a vibrant and collaborative space. We learn from each other’s publications, debate ideas on forums and online outlets, and share lots (and lots) of code. A natural side effect of this collaborative spirit is the high likelihood of encountering unfamiliar tools used by our colleagues. Because we don’t work in a vacuum, it often makes sense to gain familiarity with several languages and libraries in a given subject area in order to collaborate and learn most effectively.

It’s no surprise, then, that many data scientists and Machine Learning engineers have two popular Machine Learning frameworks…


Comparing auto-diff and dynamic model subclassing approaches in PyTorch 1.x and TensorFlow 2.x by training a linear regression from scratch, with a custom dynamic model class and a manual training loop/loss function

Source: Author

This short article focuses on how to use dynamic subclassed models with the Module and Model APIs of PyTorch 1.x and TensorFlow 2.x, respectively, and on how these frameworks use auto-diff in the training loop to obtain the gradients of the loss, implementing from scratch a very naive gradient descent step.
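As a rough sketch of that idea, here is a minimal PyTorch version: a dynamic subclassed Module plus a hand-written training loop where autograd provides the gradients and the parameter update is a naive gradient-descent step. The data and hyperparameters below are illustrative assumptions, not taken from the article.

```python
import torch

# A dynamic subclassed model: parameters declared explicitly,
# forward pass written as plain Python (analogous to tf.keras.Model subclassing).
class LinearRegression(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(1))
        self.bias = torch.nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return self.weight * x + self.bias

torch.manual_seed(0)

# Toy linear data with noise (true slope 2.0, intercept 0.5 are assumptions).
x = torch.linspace(-1.0, 1.0, 100)
y = 2.0 * x + 0.5 + 0.1 * torch.randn(100)

model = LinearRegression()
lr = 0.1
for epoch in range(200):
    loss = ((model(x) - y) ** 2).mean()  # manual MSE loss
    loss.backward()                      # autograd fills p.grad for each parameter
    with torch.no_grad():                # naive gradient-descent update, by hand
        for p in model.parameters():
            p -= lr * p.grad
            p.grad = None                # reset so gradients don't accumulate
```

The TensorFlow 2.x counterpart replaces `loss.backward()` with a `tf.GradientTape` recording the forward pass, but the overall structure of the loop is the same.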

Generate some linear data with a bit of noise

To keep the focus on the core auto-diff/auto-grad functionality, we will use the simplest possible model, a linear regression model, and we will first generate, using NumPy, some linear data with a random amount of noise added.
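A minimal sketch of that data-generation step (the slope, intercept, and noise level here are illustrative assumptions):

```python
import numpy as np

np.random.seed(0)

# Hypothetical "true" parameters the regression should recover.
W_true, b_true = 2.0, 0.5

# 100 evenly spaced inputs, plus Gaussian noise on the targets.
x = np.linspace(-1.0, 1.0, 100).astype(np.float32)
noise = np.random.normal(0.0, 0.1, size=x.shape).astype(np.float32)
y = W_true * x + b_true + noise
```

Because the noise is small relative to the slope, a least-squares fit on this data lands close to the chosen `W_true` and `b_true`, which makes it easy to verify the training loop later.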




Hacking the new Apple ML Compute framework to accelerate training of neural networks across CPU and GPU

With the recent wave of operating system versions (Big Sur, iOS 14, etc.) announced at the latest WWDC, Apple quite silently introduced a new ML framework to accelerate training of neural networks across the CPU or one or more available GPUs.

ML Compute is not exactly a new ML framework, but rather a new API that leverages the high-performance BNNS primitives made available by the Accelerate framework for the CPU, and Metal Performance Shaders for the GPU.

After looking at the documentation and starting to use it on an iOS/macOS application, I understood that this is not really a simple, high level…




Did you know you can fully train a LeNet convolutional neural network (CNN) with the MNIST dataset directly on iOS devices? And that the performance isn't bad at all?!

Photo by Eric Tompkins on Unsplash

In a previous article, I focused on transfer learning scenarios with Core ML, and in particular we saw how to create a new model on an iOS device, import embedding weights from a previously-trained model, and train the rest of the layers on-device, using private and local data (see https://heartbeat.fritz.ai/core-ml-on-device-training-with-transfer-learning-from-swift-for-tensorflow-models-1264b444e18d).

Moving forward in my long journey towards developing a Swift federated learning infrastructure, this time I’ve investigated how to train, from scratch on iOS devices, a…




A demonstration of the potential of the SwiftCoreMLTools library on a long journey towards a Swift federated learning infrastructure

As a first small step towards a federated learning platform that supports mobile and wearable devices (in particular, devices within the Apple ecosystem), I've been developing a Swift library called SwiftCoreMLTools that mimics in Swift a subset of the functionality of Apple's CoreMLTools Python library.

To briefly give an overview of the library I'm working on: SwiftCoreMLTools exposes a DSL (function builder), as well as a classic API, to declare Core ML models from scratch, make them potentially re-trainable on device, and eventually import weights from models trained on other frameworks.

The reasons I’ve been developing this library (which, by the…




Hacking Core ML protobuf data structures to export Swift for TensorFlow models to Core ML, and personalizing S4TF models on-device using Core ML 3 model personalization

Federated learning, transfer learning, and model personalization

For a healthcare research project I’m working on, I’m investigating a federated learning platform that supports mobile and wearable devices, in particular within the Apple ecosystem.

Federated learning represents a tremendous opportunity for the adoption of machine learning in many use cases, and especially where efficiency and privacy concerns require us to distribute the training process, instead of centrally collecting data on the cloud and applying traditional ML pipelines.

There are some fantastic toolkits already available (for example, TensorFlow Federated is emerging to address this scenario), but all of these existing solutions are based on Python, and there still remains the issue of…

Jacopo Mangiavacchi

Microsoft Senior Data Scientist — Google Machine Learning Developer Expert (ML GDE) — Former IBM Senior Architect and Engineer
