What is a random seed and why is it important?
The random seed is a number that’s used to initialize the pseudorandom number generator. It can have a huge impact on the training results. There are different ways that the pseudorandom number generator can be used in ML. Here are a few examples:
- Initial weights of the model. When a model is not fully pre-trained, one of the most common approaches is to generate the missing weights randomly.
- Dropout. Dropout is a common technique in ML that zeroes out randomly chosen activations during training and is switched off during evaluation, so that the full model is used.
- Augmentation. Augmentation is a well-known technique, especially for semi-supervised problems. When the training data is limited, transformations on the available data are used to synthesize new data. In most cases, the transformations and how they are applied are chosen randomly (e.g. whether to change the brightness, and by how much).
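To make the first bullet concrete, here is a minimal sketch using only the Python standard library. The `init_weights` function is a hypothetical toy stand-in for a framework's weight initializer, not how PyTorch Lightning does it; the point is only that the seed fully determines the "random" starting values.

```python
import random


def init_weights(seed, n=4):
    """Toy stand-in for a framework's random weight initializer (hypothetical)."""
    rng = random.Random(seed)  # independent generator initialized with the seed
    return [rng.gauss(0.0, 0.02) for _ in range(n)]


same_a = init_weights(seed=0)
same_b = init_weights(seed=0)
other = init_weights(seed=1)

print(same_a == same_b)  # same seed: the two lists are identical
print(same_a == other)   # different seed: the lists differ
```

Dropout masks and augmentation choices are drawn from a pseudorandom generator in the same way, so they depend on the seed too.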
As you can see, the random seed can have an influence on the result of training in several ways and add a huge variance. One thing you do not need when tuning hyper-parameters is variance.
The purpose of experimenting with hyper-parameters is to find the combination that produces the best results, but when the random seed is not fixed, it is not clear whether a difference came from the hyper-parameter change or from the seed change. Therefore, you need a way to train with a fixed seed and different hyper-parameters.
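One way to organize such experiments is to enumerate every combination of seed and hyper-parameters up front. The sketch below is only an illustration: the seed values, the optimizer names, and the learning rates are hypothetical placeholders, and `build_runs` is a helper invented for this example.

```python
import itertools

# Hypothetical search space for illustration; replace with your own values.
seeds = [0, 1, 2]
optimizers = ["sgd", "adam"]
learning_rates = [0.1, 0.01, 0.001]


def build_runs(seeds, optimizers, learning_rates):
    """Return one config dict per (seed, optimizer, lr) combination."""
    return [
        {"seed": s, "optimizer": o, "lr": lr}
        for s, o, lr in itertools.product(seeds, optimizers, learning_rates)
    ]


runs = build_runs(seeds, optimizers, learning_rates)

# Runs that share a seed differ only in their hyper-parameters, so they can be
# compared directly. Repeating the grid for several seeds shows whether a
# conclusion holds regardless of the seed.
for config in runs[:3]:
    print(config)
```

Each config would then be passed to your training script, which fixes the seed before building the model and the data pipeline.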
Later in this tutorial, I will show you how to effectively fix a seed for tuning hyper-parameters and how to monitor the results using Aim.
How to fix the seed in PyTorch Lightning
Fixing the seed for all imported modules is not as easy as it may seem. The way to fix the random seed for vanilla, non-framework code is to use standard Python random.seed(seed), but it is not enough for PL.
PL, like other frameworks, uses its own generated seeds. There are several ways to fix the seed manually. For PL, we use pl.seed_everything(seed). See the docs here.
Note: in other libraries you would use something like:
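A minimal sketch of what that could look like, assuming you want to seed Python, NumPy, and PyTorch by hand. The helper name `seed_all` is made up for this example; the calls inside it (`random.seed`, `np.random.seed`, `torch.manual_seed`) are the libraries' own seeding functions, and each optional library is skipped if it is not installed.

```python
import random


def seed_all(seed):
    """Seed the global generators of whichever common libraries are installed."""
    random.seed(seed)  # Python's built-in generator

    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's legacy global generator
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
        torch.manual_seed(seed)  # PyTorch's global generator
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

    return seed


seed_all(42)
first = random.random()
seed_all(42)
second = random.random()
print(first == second)  # re-seeding replays the same sequence
```

Call it once at the start of the script, before the model and the data loaders are created.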
Find the full code for this and other tutorials here.
Analyzing the Training Runs
After each set of training runs you need to analyze the results/logs. Use Aim to group the runs by metrics/hyper-parameters and have multiple charts of different metrics on the same screen.
Do the following steps to see the different effects of the optimizers:
- Go to dashboard, explore by experiment
- Add loss to SELECT and divide into subplots by metric
- Group by experiment to make all metrics of similar color
- Group by style by optimizer to see how the different optimizers affect loss and accuracy
Here is how it looks on Aim:
From the final result it is clear that the SGD optimizer (broken lines) achieved higher accuracy and lower loss during training.
If you apply the same settings to the learning rate, this is the result:
To analyze how the learning rate affects the experiments, do the following steps:
- Remove both previous groupings
- Group by color by learning rate
As you can see, the purple lines (lr = 0.01) show significantly lower loss and higher accuracy.
We showed that in this case, the choice of the optimizer and learning rate is not dependent on the random seed and it is safe to say that the SGD optimizer with a 0.01 learning rate is the best choice we can make.
On top of this, if we also add grouping by style by optimizer:
Now, it is obvious that the runs with the SGD optimizer and lr=0.01 (green, broken lines) are the best choices for all the seeds we have tried.
Fixing random seeds is a useful technique that can help you step up your hyper-parameter tuning. This is how to use Aim and PyTorch Lightning to tune hyper-parameters with a fixed seed.