Weights&Biases: An introductory guide

Marco Varrone
Polimi Data Scientists
7 min read · Oct 18, 2019

A big slice of the time spent training a Machine Learning model goes into tracking, saving and organizing runs. If we don't put enough effort into building a stable system around our models, we soon lose control of the project and end up retraining the same model multiple times because we didn't save it, or because we didn't record its performance at the end of training. As always in Computer Science, poor design leads to a lot of wasted time later on.

But each project has its own characteristics and mechanisms. It is not always possible to design an environment that can track every kind of model, from the simple Linear Regression of a random personal project to the latest state-of-the-art Transformer from OpenAI, unless an entire team dedicates work, time and money to it.

A huge number of tools and platforms have recently been developed to track the performance of Machine Learning models over time and across different executions.

Weights&Biases

Weights&Biases is a platform that helps developers working in Deep Learning. By adding a few lines of code to your script, you can start tracking almost everything about your models: performance, architecture, parameters used, system information (e.g., number of CPUs/GPUs used), running time and much more. The tracking code is designed to be non-invasive, so enabling or disabling it requires minimal effort.
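To give an idea of how little code is involved, here is a minimal sketch of manual tracking built around wandb.init and wandb.log; the project name and the train_one_epoch stub are placeholders of mine, not part of the wandb API:

import random
import wandb

def train_one_epoch():
    # placeholder for a real training step; returns fake metrics
    return random.random(), random.random()

# start a run and record the hyperparameters we want to keep track of
wandb.init(project="my_project", config={"learning_rate": 0.001, "epochs": 20})

for epoch in range(wandb.config.epochs):
    loss, accuracy = train_one_epoch()
    # send this epoch's metrics to the W&B dashboard
    wandb.log({"loss": loss, "accuracy": accuracy})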

The information is sent to the dedicated project page on the W&B website, from which you can set up cool visualizations, aggregate information and compare the trained models. The last time I joined a group to participate in a Kaggle competition, I didn't know about this platform. Sharing models and results was incredibly difficult: after losing too much time trying to understand which version of the code had produced a certain result, we had to spend even more time setting up an environment to keep track of that.

One of the advantages of storing the data remotely is that it is easy to collaborate on the same project and share the results. The platform has launched a service called Benchmarks, which allows people to share their implementations for a specific task. This way, whoever starts working on that task already has access to a list of approaches (state of the art included), with the associated implementations and scores.

Tracking runs with W&B

There would be so much more to talk about, but it’s time to show a basic example of how it works!

Neural Network classifier

The task we will tackle is the classification of handwritten digits from the MNIST dataset. It is one of the most widely used datasets, and you will find a ton of tutorials online if you don't have much experience with neural networks. We will use the Keras library to implement our simple neural network, but even if you have never used it, don't worry: we can treat the implementation as a black box for now and still appreciate all the features of Weights&Biases!

As I said, I will not go into the details of the implementation because that is not the scope of this blog post. It is not the best code for solving the task, but I wanted to keep it as simple as possible. The neural network is composed of an input layer of size 28*28=784 (the 28x28 input image flattened into a single vector), a single hidden layer of size 516 and an output layer of size 10 (the number of digits, i.e. the classes to predict).

import tensorflow.keras as keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import RMSprop

num_classes = 10
epochs = 20

# split the data between train and validation sets
(x_train, y_train), (x_valid, y_valid) = keras.datasets.mnist.load_data()

# transform each 28 by 28 pixels image into a vector of length 28*28
x_train = x_train.reshape(-1, 784)
x_valid = x_valid.reshape(-1, 784)

# convert class vectors to binary class matrices (one-hot encoding)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_valid = keras.utils.to_categorical(y_valid, num_classes)

# define a simple neural network with 1 hidden layer of size 516
model = Sequential()
model.add(Dense(516, activation='relu', input_shape=(28*28,)))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs=epochs,
          verbose=1,
          validation_data=(x_valid, y_valid))

Add tracking code

Now it's time to introduce the Weights&Biases tracker into our code!

The steps are the following:

  1. Register on the platform. The service is free unless you want to use it at a big, enterprise level or you want to create collaborative, but private, projects. For students, academics and open-source projects, the latter limitation does not apply. After signing up you will receive an API key, which is required to link your code to your account.
  2. Create a new project. A project can be public, meaning that anyone has read access but cannot upload runs; open, in which anyone can also add their own runs and reports; or private, in which only the owner has access.
  3. Install the wandb Python library to automatically track the training process, then connect to your account using the API key. The process is made easy by pip, the Python package manager.
pip install --upgrade wandb 
wandb login YOUR_API_KEY

  4. Insert the code into our script to start tracking.

The lines added to our code are shown in the following script (the unchanged part is elided).

import tensorflow.keras as keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import RMSprop
import wandb
from wandb.keras import WandbCallback

# start a new run and link it to the "introduction_wandb" project
wandb.init(project="introduction_wandb")

num_classes = 10
# ... the rest of the script is unchanged, up to the call to model.fit ...
model.fit(x_train, y_train,
          epochs=epochs,
          verbose=1,
          validation_data=(x_valid, y_valid),
          callbacks=[WandbCallback()])

Visualization

By running the code, the model trains for 20 epochs, reaching a validation accuracy of 97.45%, but that is not the only information W&B is storing. Let's take a look at the project page.

The run we have executed is now shown on the left side, with a random name (the name can be changed from the page or set directly from the Python code).
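For instance, a name can be assigned when the run is initialized; a minimal sketch (the name value here is just an example of mine):

import wandb

# give the run an explicit name instead of the randomly generated one
wandb.init(project="introduction_wandb", name="baseline-516-units")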

If we click on upbeat-yogurt-1 we have access to a lot of information that W&B has automatically recorded.

The Overview section contains data about the running time, the training and validation accuracy of the model, and more. From the Python script it is possible to track additional parameters, for instance which activation function has been used, although I will not go in-depth on that here.
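As a small sketch of how that could look (the config keys below are my own choice, not names required by W&B):

import wandb

# store the hyperparameters of this run so they show up in the Overview section
wandb.init(project="introduction_wandb",
           config={"hidden_size": 516,
                   "activation": "relu",
                   "optimizer": "rmsprop"})

# values can also be added or updated after initialization
wandb.config.update({"epochs": 20})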

If we click on the Chart section we can see plots of how the metrics of our model changed from epoch to epoch. The most interesting ones in this case are the training and validation accuracy.

If we run the script again with a different setting (e.g. changing the number of layers or their size), the plots are drawn one on top of the other, so we can directly compare the performance of the runs.
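One way to do this is to record the changed setting in the run's config and read it back when building the model; a sketch under that assumption (the hidden_size key is hypothetical):

import wandb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

num_classes = 10

# a second run with a larger hidden layer; recording the size in the
# config makes it easy to tell the overlaid curves apart on the charts
wandb.init(project="introduction_wandb", config={"hidden_size": 1024})

model = Sequential()
model.add(Dense(wandb.config.hidden_size, activation='relu', input_shape=(28*28,)))
model.add(Dense(num_classes, activation='softmax'))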

The System section shows information about, for example, how much CPU, memory and GPU the model has used. This is becoming more and more important in the Deep Learning field, because a modest improvement in accuracy may not be worth a huge increase in resource consumption.

The Model section shows a summary of the neural network architecture.

A very useful section is Logs, which shows all the shell output produced during the training process. It is incredibly handy for checking warnings and errors even when we no longer have access to the terminal.

And finally, there is the Files section, from which you can download the trained model and lots of other data.
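Files can also be uploaded to a run explicitly; a small sketch, assuming the Keras model from the example above:

# save the trained model locally and sync the file to the run's Files section
model.save("model.h5")
wandb.save("model.h5")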

In this introduction to Weights&Biases I have shown only the basic features of the platform. It's incredible how many tricks I have discovered while using it, and they will be covered in future posts: GitHub integration, collaborative projects, automatic hyperparameter tuning, reports, gradient visualization, alerts through Slack and many more.

The project page on the Weights&Biases platform is public, so you can access and explore it at https://app.wandb.ai/mrc-varrone/introduction_wandb even if you don’t have an account.

I hope you enjoyed this guide!

This is the first of a series of blog posts published by the PoliMi Data Scientists community. We are a community of students at Politecnico di Milano that organizes events and writes resources on Data Science and Machine Learning topics.

If you have suggestions or want to get in touch with us, you can write to us on our Facebook page.

Marco Varrone
PhD student in Computational Biology at the University of Lausanne. MSc in Computer Engineering at Politecnico di Milano. Member of PoliMi Data Scientists.