Using an appropriate scale to pick hyperparameters

In the last article, you saw how sampling at random over the range of hyperparameters lets you search the space of hyperparameters more efficiently. But it turns out that sampling at random doesn’t mean sampling uniformly at random over the range of valid values. Instead, it’s important to pick the appropriate scale on which to explore the hyperparameters.
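
To make the difference between scales concrete, here is a minimal sketch that compares sampling uniformly on a linear scale with sampling uniformly on a logarithmic scale. The logarithmic scale is only one possible choice, picked here as an assumption for illustration. The bounds `low` and `high`, the helper names, and the 0.1 threshold are likewise hypothetical and do not come from the article.

```python
import math
import random


def sample_uniform(low, high, rng):
    """Draw a value uniformly at random on a linear scale."""
    return rng.uniform(low, high)


def sample_log_uniform(low, high, rng):
    """Draw a value uniformly at random on a log10 scale (low and high must be > 0)."""
    exponent = rng.uniform(math.log10(low), math.log10(high))
    return 10 ** exponent


def fraction_below(draws, threshold):
    """Return the share of draws that fall below the threshold."""
    return sum(d < threshold for d in draws) / len(draws)


rng = random.Random(0)  # fixed seed so the run is repeatable
low, high = 1e-4, 1.0   # hypothetical bounds, chosen only for illustration

linear_draws = [sample_uniform(low, high, rng) for _ in range(10_000)]
log_draws = [sample_log_uniform(low, high, rng) for _ in range(10_000)]

# Compare how much of the search budget each scale spends below 0.1.
print("linear scale:", fraction_below(linear_draws, 0.1))
print("log scale:   ", fraction_below(log_draws, 0.1))
```

With these assumed bounds, you would expect the linear scale to spend only about 10% of its draws below 0.1, while the log scale spends about 75% there, because three of the four decades between 0.0001 and 1 lie below 0.1. Which scale is appropriate depends on the hyperparameter you are searching over.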

Let’s look at one case. Say you’re searching for the hyperparameter alpha, the learning rate. And let’s say you suspect that 0.0001 may be on the low end, or perhaps it might be as…