A Simple Example with HyperparametersJS

In this article, we’ll be looking at a super simple example written in Tensorflow.js and going over how we can optimize some of its hyperparameters with HyperparametersJS.

My last article gives a brief overview of machine learning in JavaScript, hyperparameter optimization, and HyperparametersJS, and I’d suggest you quickly read it before continuing.

Perhaps the easiest way to understand how HyperparametersJS works is to look at a simple Tensorflow model and optimize its hyperparameters. Our example is based on the Getting Started example from the Tensorflow.js repo. (You can find more of their examples here.)

Creating a Simple Tensorflow Model

Let’s make a model that, given an input value x, will predict 2x - 1. For example, if we give 6 as the input, we expect 11 as the output.

The first thing we’ll do is create a model. As our model is straightforward, all we really need is a tf.sequential model. (tf.sequential is part of the Layers API, which is modeled after the Keras Python library.) Then, we’ll add a single dense layer that takes an input of shape [1] (since we’re only giving one variable as an input) and has 1 unit (since our prediction is a single value). Here’s what we have so far:

const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape: [1]}));

Next, let’s compile the model, where we’ll prepare it for training and testing. We’re required to specify the loss and optimizer. Since we’re doing a regression problem, we’re using mean squared error, but could also use something like mean absolute error.

model.compile({
  loss: 'meanSquaredError',
  optimizer: 'sgd'
});

Now, let’s add some training data. We’ll create two tensors (Tensorflow’s way to store data): a tensor xs with input values and a tensor ys with output values. (Note that we also have to specify the shape, in this case [6, 1], since we have six training examples with one variable each.)

const xs = tf.tensor2d([-1, 0, 1, 2, 3, 4], [6, 1]);
const ys = tf.tensor2d([-3, -1, 1, 3, 5, 7], [6, 1]);

Now that we’ve got some data, let’s train our model. Using tfjs’s model.fit function, we define the input values (xs in this case), expected output values (ys), and other configurations (like epochs, batch size, etc). Because model.fit returns a promise, we can use async/await to wait for the execution to be completed. Here’s what we have:

await model.fit(xs, ys, {epochs: 250});
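For some intuition about what fitting does here, the plain-JavaScript sketch below trains the same kind of model (one weight, one bias) by gradient descent on mean squared error. It’s only an illustration and not how tfjs works internally. The names xsPlain, ysPlain, fitLine, and predictPlain are made up for this sketch, and the zero starting values, the learning rate of 0.01, and the one-update-per-epoch loop are all assumptions.

```javascript
// Plain-JavaScript illustration of fitting y = w * x + b by gradient descent
// on mean squared error. This is a sketch, NOT tfjs's actual implementation.
const xsPlain = [-1, 0, 1, 2, 3, 4];
const ysPlain = [-3, -1, 1, 3, 5, 7];

function fitLine(xs, ys, epochs, learningRate) {
  let w = 0; // assumed starting weight (tfjs picks its own initial values)
  let b = 0; // assumed starting bias
  const n = xs.length;
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradW = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const error = w * xs[i] + b - ys[i];
      gradW += (2 / n) * error * xs[i]; // d(MSE)/dw
      gradB += (2 / n) * error;         // d(MSE)/db
    }
    // one update per epoch, using all six examples at once
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w, b };
}

const fitted = fitLine(xsPlain, ysPlain, 250, 0.01); // 0.01 is an assumed rate
const predictPlain = (x) => fitted.w * x + fitted.b;
console.log(fitted, predictPlain(20));
```

With these assumed settings, the weight and bias end up near 2 and -1 without quite reaching them, which is why we only expect a prediction of roughly 39 rather than exactly 39.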

Now that we’ve trained the model, we can use tfjs’s model.predict function to make it predict the result for an input it has never seen in the training data. We’ll give a tensor as the input like so:

// we expect to see ~39 as the output
model.predict(tf.tensor2d([20], [1, 1]));

When we put all our code so far in an async function, add some html/scripts, and call it, we get the code below, and if you copy/paste it in an html file (or use this Stackblitz example), it should run:

Note: I used tfjs’s dataSync() method to retrieve the prediction data.
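Assembled from the snippets above, the JavaScript part of that page might look like the sketch below. The function name run and the choice to pass tf in as a parameter are assumptions made for this sketch (in a page that loads tfjs through a script tag, tf is simply a global); everything else is the code we’ve already written.

```javascript
// Sketch: the snippets above gathered into one async function.
// `run` and its `tf` parameter are hypothetical names for this illustration.
async function run(tf) {
  // create the model: a single dense layer with one input and one unit
  const model = tf.sequential();
  model.add(tf.layers.dense({ units: 1, inputShape: [1] }));

  // prepare the model for training
  model.compile({ loss: 'meanSquaredError', optimizer: 'sgd' });

  // training data for y = 2x - 1
  const xs = tf.tensor2d([-1, 0, 1, 2, 3, 4], [6, 1]);
  const ys = tf.tensor2d([-3, -1, 1, 3, 5, 7], [6, 1]);

  // train, then predict for an unseen input
  await model.fit(xs, ys, { epochs: 250 });
  const prediction = model.predict(tf.tensor2d([20], [1, 1]));

  // dataSync() copies the tensor's values into a typed array
  return prediction.dataSync()[0];
}
```

In the browser, you could then call run(tf).then(value => console.log(value)) once the tfjs script has loaded.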

Optimizing the Model’s Hyperparameters using HyperparametersJS

For a simple example, let’s find the best optimizer and epochs using hpjs.

For this, we’ll make two functions: a train function and an optimization function. We’ll split what we have from the tfjs example above into the two functions and add some more code.

For the train function, we’ll start by defining a search space. This defines the model’s hyperparameters we want to optimize. We’ll do this using some of hpjs’s parameter expressions. If you’re unfamiliar with them, I’ve gone over each of the expressions in detail at the end of my previous article.

const space = {
optimizer: hpjs.choice(['sgd', 'adam', 'adagrad', 'rmsprop']),
epochs: hpjs.quniform(50, 250, 50),
};

The parameter expressions we’ve used here are hpjs.choice and hpjs.quniform. hpjs.choice will randomly select one of the objects (optimizers in this case) from the list each time the function is called, while hpjs.quniform will randomly select a number between 50 and 250, with a step size of 50, each time the function is called.
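To make those two sampling rules concrete, here’s a small plain-JavaScript sketch of them. It isn’t hpjs’s source code: sampleChoice, sampleQuniform, and sampleSpace are made-up names, the rng parameter is a hypothetical hook for swapping in a seeded generator, and the rounding rule for quniform is an assumption borrowed from Python’s hyperopt, round(uniform(low, high) / q) * q.

```javascript
// Sketch of the sampling rules described above; not hpjs's actual source.
// `rng` is a hypothetical hook: any function returning a number in [0, 1).
function sampleChoice(options, rng = Math.random) {
  return options[Math.floor(rng() * options.length)];
}

// Assumed definition (as in hyperopt): round(uniform(low, high) / q) * q
function sampleQuniform(low, high, q, rng = Math.random) {
  const u = low + rng() * (high - low);
  return Math.round(u / q) * q;
}

// Draw one set of hyperparameters from the search space
function sampleSpace(rng = Math.random) {
  return {
    optimizer: sampleChoice(['sgd', 'adam', 'adagrad', 'rmsprop'], rng),
    epochs: sampleQuniform(50, 250, 50, rng),
  };
}

for (let i = 0; i < 3; i++) {
  console.log(sampleSpace());
}
```

One thing this sketch makes visible: under the assumed rounding rule, the end values 50 and 250 come up about half as often as 100, 150, and 200, because only half of a step rounds to each end.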

Next, let’s make the optimization function. We’ll be returning the loss, which we’ll use to measure which parameters are "best."

const optFunction = async ({ optimizer, epochs }, { xs, ys }) => {
  // create a simple sequential model
  const model = tf.sequential();
  // add a dense layer to the model and compile
  model.add(tf.layers.dense({ units: 1, inputShape: [1] }));
  model.compile({
    loss: 'meanSquaredError',
    optimizer,
  });
  // train model using defined data
  const h = await model.fit(xs, ys, { epochs });
  // the loss after the final epoch
  const loss = h.history.loss[h.history.loss.length - 1];
  // print out each optimizer and its loss
  console.log(optimizer, loss);
  // return the model, loss, and status, which is necessary
  return { model, loss, status: hpjs.STATUS_OK };
};

Here, we’re passing in our hyperparameters as the first argument and the input/output data as the second argument.

Now, let’s finish the train function. In it, we’ll use the hpjs fmin function to find the optimal hyperparameters (the set of hyperparameters for which the model had the minimum loss).

const trials = await hpjs.fmin(
optFunction, space, hpjs.search.randomSearch, 6,
{ rng: new hpjs.RandomState(654321), xs, ys }
);
const opt = trials.argmin;

In fmin, we’re passing the optimization function we just wrote, the search space, the search method (random search in this case), the number of times we’ll call the optimization function (6 in this case), and an object where we can pass anything we want. In it, we’re passing a seeded random generator (rng) for replicable results as well as the training data. We could also create the data in the optimization function instead of passing it in like above, but that would be wasteful since it would be recreated every time that function runs (6 times in this case).

Below that, we define a const opt, which contains the optimizer and epochs from the run that had the smallest loss, i.e. the “best” optimizer and epochs. From there, we can display the best optimizer, epochs, and a prediction using some basic HTML/JavaScript. Here’s what we have now: (Again, you can copy/paste it in an HTML file and it should run. Or, here’s a Stackblitz example to play around with.)

Because of the seeded random generator, we expect to see the sgd optimizer and 250 epochs as the best hyperparameters.
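If you’re curious what a random search boils down to, here’s a minimal plain-JavaScript sketch of the idea: sample, evaluate, and keep the trial with the smallest loss. This is not hpjs.fmin itself. The names randomSearch, toyObjective, and toySample are hypothetical, the toy objective is invented so the sketch runs without tfjs, and it’s synchronous for brevity (with a real model you’d await the objective, as we did above).

```javascript
// Minimal random-search loop; a sketch of the idea, not hpjs.fmin itself.
function randomSearch(objective, sampleParams, maxTrials) {
  const trials = [];
  for (let i = 0; i < maxTrials; i++) {
    const params = sampleParams();          // draw one point from the space
    const loss = objective(params);         // evaluate it
    trials.push({ params, loss });
  }
  // argmin: the trial with the smallest loss
  const best = trials.reduce((a, b) => (b.loss < a.loss ? b : a));
  return { trials, best };
}

// Toy objective (hypothetical): pretend the loss shrinks as epochs grow.
const toyObjective = ({ epochs }) => 1 / epochs;
// Toy sampler (hypothetical): epochs drawn from 50, 100, 150, 200, 250.
const toySample = () => ({ epochs: 50 * (1 + Math.floor(Math.random() * 5)) });

const searchResult = randomSearch(toyObjective, toySample, 6);
console.log(searchResult.best);
```

Since the samples are drawn independently, nothing learned in one trial steers the next one; the only "optimization" is picking the best trial at the end.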

Optimizing Learning Rate and the Importance of Search Spaces

Searching for the optimal number of epochs is a bit pointless here since, for a model this simple, training for more epochs almost always gives a lower training loss, but it made for a pretty easy example. Let’s now look at the learning rate and see how optimizing it can be a bit more involved.

Defining a Search Space

The first thing we need to do is define our search space, but there are multiple ways to do this with learning rate. Let’s take a look at some examples:

Random Choice

const space = hpjs.choice([0.0001, 0.001, 0.01, 0.1]);

We could define the space as a hpjs.choice and simply put in common learning rates, one of which will be randomly chosen every time the optimization function runs. However, this is limiting in that the optimal learning rate is probably in between these learning rates and we’re unlikely to capture it.

Uniform Distribution

const space = hpjs.uniform(0.0001, 0.1);

We could also define the space as a hpjs.uniform, where a random value between 0.0001 and 0.1 is chosen each time the opt function runs. This would allow for any of the values between our high and low to be chosen, unlike in the previous example. But because every value is equally likely, the samples are heavily biased towards learning rates closer to 0.1 than 0.0001: about 90% of them land between 0.01 and 0.1, and only about 1% land between 0.0001 and 0.001.
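You can check that bias with a few lines of plain JavaScript. The sketch below doesn’t use hpjs; sampleUniform and decadeShares are made-up helper names, and it simply counts how many uniform samples fall into each power-of-ten range.

```javascript
// Count how often uniform samples from [0.0001, 0.1) land in each decade.
// Plain JavaScript, no hpjs; helper names are hypothetical.
function sampleUniform(low, high, rng = Math.random) {
  return low + rng() * (high - low);
}

function decadeShares(sampler, n) {
  const counts = { '1e-4 to 1e-3': 0, '1e-3 to 1e-2': 0, '1e-2 to 1e-1': 0 };
  for (let i = 0; i < n; i++) {
    const v = sampler();
    if (v < 0.001) counts['1e-4 to 1e-3']++;
    else if (v < 0.01) counts['1e-3 to 1e-2']++;
    else counts['1e-2 to 1e-1']++;
  }
  // turn counts into fractions of all samples
  for (const key of Object.keys(counts)) counts[key] /= n;
  return counts;
}

const uniformShares = decadeShares(() => sampleUniform(0.0001, 0.1), 30000);
console.log(uniformShares);
```

The exact fractions change from run to run, but the largest range should take the overwhelming majority of the samples.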

Loguniform Distribution

To help with the problem of the uniform distribution for learning rate, namely that we’re much more likely to be testing higher learning rates, we can use a loguniform distribution. If the density of a uniform distribution is a horizontal line, the density of a loguniform distribution is a curve that falls off in proportion to 1/x:

[Figure: A loguniform distribution between 1 and 100]

With this distribution, if we’re looking for learning rates between 0.0001 and 0.1, the randomly chosen learning rate is equally likely to fall in each of the ranges 0.0001 to 0.001, 0.001 to 0.01, and 0.01 to 0.1, and this is what we want.

When defining a space using loguniform, you may think that if you’re looking for a learning rate between 0.0001 and 0.1, you can define the space as:

// this doesn't work like you probably think it does
const space = hpjs.loguniform(0.0001, 0.1);

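But hpjs.loguniform doesn’t quite work like that. Its low and high arguments are bounds on the natural logarithm of the value, so the sampled value lies between exp(low) and exp(high). To get bounds of 0.0001 and 0.1 (that is, 10^-4 and 10^-1), you’d use JavaScript’s Math.log function and multiply Math.log(10) by the exponents you want (-4 and -1 here). So to get our desired functionality, we instead define it like this: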

// this is what we actually want
const space = hpjs.loguniform(-4*Math.log(10), -1*Math.log(10));
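Here’s a plain-JavaScript sketch of why the second form works. It assumes loguniform follows the same rule as Python’s hyperopt, exp(uniform(low, high)), which is consistent with the definition above but isn’t taken from hpjs’s source. sampleLoguniform and the other names are made up for this sketch.

```javascript
// Assumed rule (as in hyperopt): loguniform(low, high) = exp(uniform(low, high)).
// Plain JavaScript, no hpjs; names are hypothetical.
function sampleLoguniform(low, high, rng = Math.random) {
  return Math.exp(low + rng() * (high - low));
}

const logLow = -4 * Math.log(10);  // ln(0.0001)
const logHigh = -1 * Math.log(10); // ln(0.1)

// Fraction of samples in [1e-4, 1e-3), [1e-3, 1e-2), [1e-2, 1e-1)
const shares = [0, 0, 0];
const total = 30000;
for (let i = 0; i < total; i++) {
  const lr = sampleLoguniform(logLow, logHigh);
  // log10(lr) is between -4 and -1; shift so the decades map to 0, 1, 2
  // (clamped to guard against floating-point rounding at the edges)
  const decade = Math.min(2, Math.max(0, Math.floor(Math.log10(lr) + 4)));
  shares[decade] += 1 / total;
}
console.log(shares);

// Passing the raw bounds instead samples between exp(0.0001) and exp(0.1),
// i.e. a value slightly above 1, which is not a useful learning rate.
const naive = sampleLoguniform(0.0001, 0.1);
console.log(naive);
```

Each of the three shares should come out close to one third, while the naive call always returns something between about 1.0001 and 1.105.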

Overall, we should expect the loguniform distribution to help us find better learning rates than choice and uniform because it isn’t limited to a handful of fixed values and it samples every order of magnitude in our range equally often.

Passing Optimizers as Objects

Another thing worth mentioning is that with hpjs, you can also pass optimizers as objects instead of specifying the optimizer name as a string like we did above. This lets us set the optimizer’s learning rate ourselves. To do this, we would still define our optimizer search space as usual:

const space = {
learningRate: hpjs.loguniform(-4*Math.log(10), -1*Math.log(10)),
optimizer: hpjs.choice(['sgd', 'adagrad', 'adam', 'adamax',
'rmsprop'])
};

Then, we create an object that maps each name to the tfjs function that creates that optimizer, like this:

const optimizers = {
sgd: tf.train.sgd,
adagrad: tf.train.adagrad,
adam: tf.train.adam,
adamax: tf.train.adamax,
rmsprop: tf.train.rmsprop,
}

Now, in model.compile (in optFunction, after adding learningRate to the hyperparameters it destructures), we can look up the optimizer’s function by name and call it with the learning rate as its argument to create the optimizer object:

model.compile({
loss: 'meanSquaredError',
optimizer: optimizers[optimizer](learningRate),
});

Remember that optimizers is the const we defined above, while optimizer is a randomly chosen optimizer from our search space.

Here’s a Stackblitz for a full example using this functionality.

Conclusion

In this article, we optimized a simple Tensorflow.js model with hpjs, optimized its learning rate, and also looked at some neat things you can do with hpjs. Feel free to leave feedback and check out our GitHub page and website!