An introduction to Deep Learning using Flux Part I: A simple linear regression example

SophieB
5 min read · Jan 23, 2022


Image from https://fluxml.ai/

Flux is a Deep Learning library written entirely in Julia. Flux’s website states that “it makes easy things easy while remaining fully hackable”. This makes Flux an ideal library to get started in Deep Learning because you can look under the hood and see how algorithms are implemented. Also, some members of the community say that in the future, Julia might have a more predominant role in Machine Learning.

In this post, we go through a simple linear regression example using Julia and Flux. To train a Machine Learning model in Flux, we need to perform the following five steps:

1. Get the data.

2. Define the model.

3. Define the loss function.

4. Set an optimisation routine.

5. Train the model.

If you want to follow along, then make sure that you install Julia and Flux. However, you can try both Julia and Flux without installing them by using this Colab notebook as a starting point. For a quick introduction to Julia, see Learn Julia For Beginners and Julia for Pythonistas.

Note: For more information on basic concepts about Deep Learning, see Part II.

Before going through the steps listed above, import the packages we need:
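The two packages used throughout this post are Flux itself and Plots:

```julia
# Flux provides the Deep Learning building blocks; Plots handles visualisation
using Flux
using Plots
```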

Besides Flux, we also import the Plots package for creating visualisations in Julia. You can learn more about visualisation in Julia in this post.

Step 1: Get the data

For this example, we generate our own data, but keep in mind that you can also use your own data when creating models with Flux. We create random data with two variables that have a linear relationship, and we add some noise so that the data looks a bit more realistic.
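A sketch of one way to generate such data (the function name, the slope of 3, the intercept of 2, and the noise scale are all assumptions):

```julia
# Generate n points with a linear relationship y = 3x + 2, plus a little
# Gaussian noise so the data looks more realistic
function generate_data(n)
    x = rand(Float32, n)                            # n points in [0, 1)
    y = 3f0 .* x .+ 2f0 .+ 0.1f0 .* randn(Float32, n)
    return x, y
end

x_train, y_train = generate_data(500)
x_test, y_test = generate_data(100)
```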

Now, we plot the data we just generated to check its distribution.
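A sketch of the plotting step (the labels are assumptions):

```julia
# Scatter both sets on one figure to compare their distributions
scatter(x_train, y_train, label = "train")
scatter!(x_test, y_test, label = "test")
```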

The train and test data have more or less the same distribution. Both data sets show a linear relationship. Feel free to experiment with the function we used for generating the data and create a function that returns data that is not correlated.

We check the size of the train data.
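Since the data was generated as a plain vector, the size is one-dimensional:

```julia
size(x_train)   # (500,)
```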

Before moving on to the next step, we need to reshape our data so that we can pass it to the model we’re creating. The train data (x_train) is a vector with 500 elements. However, Flux expects the data as a 1x500 matrix: one row per feature, one column per observation.

We use the following routine to reshape the data.
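A minimal sketch of that routine:

```julia
# reduce folds hcat (horizontal concatenation) over the vector,
# turning the 500 scalars into a single 1×500 matrix
x = reduce(hcat, x_train)
```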

We use the hcat (horizontal concatenation) function together with reduce (a higher-order function) to obtain the correct shape. Then, we wrap the code above in a function so we can reuse it.

Now, we apply the function to all of the data and check the final shape.
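Wrapping the routine in a function and applying it to every array might look like this (the function name is an assumption):

```julia
# Reusable reshaping helper: length-n vector -> 1×n matrix
load_data(x) = reduce(hcat, x)

x_train, y_train = load_data(x_train), load_data(y_train)
x_test, y_test = load_data(x_test), load_data(y_test)
size(x_train)   # (1, 500)
```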

Step 2: Define the model

We want to create a simple linear regression model m(x) = W*x + b. To define this type of model, we set a single neuron with no activation function. In Flux, we can use the Dense function to define this model:
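A Dense layer with one input, one output, and no activation function computes exactly W*x .+ b (in recent Flux versions the same layer is written Dense(1 => 1)):

```julia
# A single neuron with the identity activation: model(x) == W*x .+ b
model = Dense(1, 1)
```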

Our model is initialized with random values for the parameters W and b (weights and biases). These values won’t be useful for predicting new data. Therefore, we need to train our model so we can find better values for W and b.

Before we move on, we need to collect all of the parameters so we can access and update them during the training steps. We do this with the params function.
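Collecting the parameters is a one-liner:

```julia
# Gather the trainable parameters (W and b) so training can update them
ps = Flux.params(model)
```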

Step 3: Define the loss function

We need a way to measure how good our model’s predictions are. This measure is known as the loss function. Flux ships with many ready-to-use loss functions. For this simple example, we’ll use the mean squared error (MSE).
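Using Flux’s built-in MSE, the loss might be defined as:

```julia
# Mean squared error between the model's predictions and the targets
loss(x, y) = Flux.Losses.mse(model(x), y)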

Step 4: Set an optimisation routine

Now, we set the optimisation routine (optimiser) that we’ll use to train our model. The optimiser minimises the loss function, i.e. it finds the values of the weights and biases that achieve the smallest loss.

For this example, we use the gradient descent algorithm but Flux offers many optimisation routines.
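Out of the box, that looks like:

```julia
# Plain gradient descent; Descent() defaults to a learning rate of 0.1
opt = Descent()
```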

We can also set hyperparameters for the optimisation routines. However, for this example, we’ll just use gradient descent out of the box.

Before we train our model, we compute the predictions and the current loss to compare with the final results.
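A sketch of that baseline check (the variable names are assumptions):

```julia
# Predictions and loss with the randomly initialised W and b,
# kept so we can compare against the results after training
initial_predictions = model(x_train)
initial_loss = loss(x_train, y_train)
```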

Step 5: Train the model

Finally, we are ready to train our model. Before calling Flux’s training routine, we need to zip the train data.
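train! iterates over a collection of (input, target) tuples; with a single full batch, wrapping the two arrays in a one-element collection plays that role:

```julia
# One (input, target) tuple covering the whole training set
data = [(x_train, y_train)]
```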

In Flux, we can execute one training step with the train! function.
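A single step looks like this:

```julia
# One training step: compute gradients of `loss` w.r.t. `ps` over `data`
# and let the optimiser update the parameters
Flux.train!(loss, ps, data, opt)
```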

We can also call train! inside a for loop, executing one training step per iteration (an epoch) until the parameters W and b minimise the loss function.
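For example (200 epochs is an assumption; the right number depends on the data and learning rate):

```julia
# Repeat the training step for a fixed number of epochs
for epoch in 1:200
    Flux.train!(loss, ps, data, opt)
end
```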

Note: We can have more control of the training routine by using a different approach. For more information on training in Flux, see Training.

After training our model, we plot the results to see what our model’s predictions look like. Note that since we reshaped our data, we need to transform it back to its original shape.
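A sketch of that final plot, assuming the initial predictions were saved earlier:

```julia
# vec turns the 1×500 matrices back into plain vectors for plotting
scatter(vec(x_train), vec(y_train), label = "data")
scatter!(vec(x_train), vec(initial_predictions), label = "initial predictions")
scatter!(vec(x_train), vec(model(x_train)), label = "final predictions")
```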

The plot above shows both the model’s initial predictions (with random W and b) and the final ones (after training). We can see that the model improved very quickly after a few iterations. However, when training a model with real data things get more complicated and we need to use more complex training routines.

Full code

We can run the full script at once. First, create a file with extension .jl with the code below and then run it as julia name-of-your-file.jl.
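Putting the five steps together, a self-contained version of the script might look like this (the data-generating function, its coefficients, and the epoch count are assumptions):

```julia
using Flux, Plots

# Step 1: generate noisy linear data and reshape it to 1×n
function generate_data(n)
    x = rand(Float32, n)
    y = 3f0 .* x .+ 2f0 .+ 0.1f0 .* randn(Float32, n)
    return reduce(hcat, x), reduce(hcat, y)
end
x_train, y_train = generate_data(500)

# Step 2: a single neuron with no activation: m(x) = W*x .+ b
model = Dense(1, 1)
ps = Flux.params(model)

# Step 3: mean squared error loss
loss(x, y) = Flux.Losses.mse(model(x), y)

# Step 4: plain gradient descent
opt = Descent()

# Step 5: train on the full batch for a fixed number of epochs
data = [(x_train, y_train)]
for epoch in 1:200
    Flux.train!(loss, ps, data, opt)
end

println("final loss: ", loss(x_train, y_train))
scatter(vec(x_train), vec(y_train), label = "data")
scatter!(vec(x_train), vec(model(x_train)), label = "model")
```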

Final remarks

As you can see, it is very easy to create a simple model using Flux. However, Flux offers much more than that: we can create more complex models such as CNNs and RNNs. For more information on Flux and how you can use other examples as a starting point for your own projects, see Flux’s docs and the Model Zoo.
