Linear Regression from Scratch in Java

Learner1067
Published in Analytics Vidhya
5 min read · Apr 11, 2020

Why AI/ML: As we move slowly but steadily into the age of algorithms, it is a great time to reflect on how we have traditionally solved problems with our present set of tools. The need for AI/ML cannot be ignored, because there are far more challenging problems that those tools cannot solve. For example, try writing code that predicts whether an input image shows an orange or an apple. Even if you manage that, think of the other cases: What if the input image is in black and white? What if the image contains leaves along with the fruit? Surely, piling on if and switch blocks is not the ideal way to solve this problem.

In this article I explain how to write linear regression from scratch in Java, without using any framework. The reason for avoiding frameworks is to understand and explain the internals of the algorithm in detail. Enough theory, let's get back to coding.

Linear Regression: Mathematically, a linear equation in one variable can be defined as below.

y = mx+c
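As a quick illustration with made-up numbers (not from the article's data set), if m = 2 and c = 3, then an input of x = 5 yields y = 13:

```java
public class LineDemo {

    // Evaluate y = m*x + c for a single input
    static double predict(double m, double c, double x) {
        return m * x + c;
    }

    public static void main(String[] args) {
        // illustrative values only
        System.out.println(predict(2.0, 3.0, 5.0)); // prints 13.0
    }
}
```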

To relate this to a real-world example, let's assume a small company wants to predict the number of 25 L drinking-water bottles needed in a month based on that month's average temperature. They want to establish a relationship between the number of bottles and the month's average temperature [to keep it simple, let's assume the employee count is fixed]. The company wants to use its historical data to build a model that can help achieve this.

Example of the data set:

Month's Avg Temperature | Number of Bottles Consumed in Month
35                      | 4
37                      | 8

Let's try to relate it to the equation.

Temperature is x and the number of bottles is y. Now we have to find m and c so that, when we plug x into the equation below, it helps us predict the number of bottles.

y = mx + c

Problem Statement: Find optimal values of m and c so that we can use the above equation to predict values.

Solution: To find optimal values of m and c, we start with random initial values for both. We then gradually keep correcting the values until we achieve the required precision in prediction [please do not think about overfitting for now]. This gradual correction process is where we use the mathematics learned in high school. The values are corrected during the training phase using the historical data. The formulas to update m and c are shown below.

m1 = m0 - learningRate * ∂(M.S.E)/∂m
c1 = c0 - learningRate * ∂(M.S.E)/∂c

The learning rate is fixed at 0.001 [the learning rate is a hyperparameter which can be optimized using grid search or a random optimizer (out of scope here)]. M.S.E is the mean squared error:

M.S.E = (1/n) * Σ (predictedᵢ - actualᵢ)²

Now let's use differentiation to calculate the rate at which MSE changes with respect to m and c. In high school I always wondered why we were learning calculus; seeing it in action now is a pleasure.

∂(M.S.E)/∂m = (2/n) * Σ (predictedᵢ - actualᵢ) * xᵢ
∂(M.S.E)/∂c = (2/n) * Σ (predictedᵢ - actualᵢ)

For this article we are going to use stochastic gradient descent [the other variants are batch and mini-batch]. Stochastic gradient descent takes one entry at a time from the training data, calculates the predicted value for it, and then adjusts the weights based on the error of that single prediction. The derivative above sums over n records, but here n is 1, so the 1/n factor can be neglected (the constant factor 2 is likewise absorbed into the learning rate). The number of epochs is also a hyperparameter, fixed for now, but it can be optimized like the learning rate using grid search or a random optimizer.
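Before looking at the full training loop, here is a minimal sketch of a single stochastic update, assuming m = c = 0 to start, a learning rate of 0.001, and the first row of the sample table (x = 35, y = 4). The method and variable names are illustrative, not part of the article's class:

```java
public class SgdStepDemo {

    // One stochastic gradient descent update for y = m*x + c,
    // using the same rule as the training loop (factor 2 absorbed
    // into the learning rate).
    static double[] step(double m, double c, double x, double y, double lr) {
        double predicted = m * x + c;
        double error = predicted - y;      // signed prediction error
        double newM = m - lr * error * x;  // m1 = m0 - lr * error * x
        double newC = c - lr * error;      // c1 = c0 - lr * error
        return new double[] { newM, newC };
    }

    public static void main(String[] args) {
        // x = 35, y = 4 from the sample table; m = c = 0 initially.
        // error = -4, so m moves to 0.001 * 4 * 35 = 0.14 and c to 0.004.
        double[] updated = step(0.0, 0.0, 35.0, 4.0, 0.001);
        System.out.println("m = " + updated[0] + ", c = " + updated[1]);
    }
}
```

Repeating this step over every training row, for many epochs, is exactly what the training method below does.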


public void trainSGD(double[][] trainData, double[] result) {

    if (trainData == null || trainData.length <= 0) {
        throw new RuntimeException("Input data cannot be null or empty");
    }
    // Stochastic gradient descent: one update per training example
    for (int e = 0; e < epochs; e++) {
        double mse = 0d;
        for (int i = 0; i < trainData.length; i++) {
            double[] tempInput = trainData[i];

            Optional<Double> predictedValueOptional = this.predict(tempInput);

            double predictedValue = predictedValueOptional.get();

            double error = predictedValue - result[i];
            mse = error * error + mse;

            for (int j = 0; j < weights.length; j++) {
                weights[j] = weights[j] - learningRate * error * tempInput[j];
            }
            beta = beta - learningRate * error;
        }
        // mean of the squared errors accumulated over this epoch
        mse = mse / trainData.length;
        System.out.println(" MSE " + mse + " Weights " + Arrays.toString(weights) + " Beta " + beta);
    }
}

Just to explain the code in a bit more detail I am using for loops, but ideally we should use matrix operations instead. I will cover that in the next article.

Here is the complete code.

package org.ai.hope.core;

import java.util.Arrays;
import java.util.Optional;

public class LinearRegression {

    private double beta;

    private double[] weights;

    private double learningRate = 0.001d;

    private int epochs;

    public LinearRegression(int featuresCount, int epochs) {
        weights = new double[featuresCount];
        this.epochs = epochs;
    }

    public Optional<Double> predict(double[] inputs) {
        if (inputs == null || inputs.length <= 0) {
            return Optional.empty();
        }

        double result = 0d;
        for (int i = 0; i < inputs.length; i++) {
            result = inputs[i] * weights[i] + result;
        }

        result = result + beta;

        return Optional.of(result);
    }

    public void trainSGD(double[][] trainData, double[] result) {

        if (trainData == null || trainData.length <= 0) {
            throw new RuntimeException("Input data cannot be null or empty");
        }
        // Stochastic gradient descent: one update per training example
        for (int e = 0; e < epochs; e++) {
            double mse = 0d;
            for (int i = 0; i < trainData.length; i++) {
                double[] tempInput = trainData[i];

                Optional<Double> predictedValueOptional = this.predict(tempInput);

                double predictedValue = predictedValueOptional.get();

                double error = predictedValue - result[i];
                mse = error * error + mse;

                for (int j = 0; j < weights.length; j++) {
                    weights[j] = weights[j] - learningRate * error * tempInput[j];
                }
                beta = beta - learningRate * error;
            }
            // mean of the squared errors accumulated over this epoch
            mse = mse / trainData.length;
            System.out.println(" MSE " + mse + " Weights " + Arrays.toString(weights) + " Beta " + beta);
        }
    }
}

Code for running the program with sample data.

private static void trainModel() {
    double[][] trainSet = { { 20 }, { 16 }, { 19.8 }, { 18.4 }, { 17.1 }, { 15.5 } };
    double[] result = { 88.6, 71.6, 93.3, 84.3, 80.6, 75.2 };
    LinearRegression linearRegression = new LinearRegression(trainSet[0].length, 1000);
    linearRegression.trainSGD(trainSet, result);
}

public static void main(String[] args) {
    trainModel();
}
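As a sanity check on what SGD should converge towards, the same one-feature data can also be fit in closed form with the ordinary least squares formulas m = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and c = ȳ − m·x̄. This helper is not part of the article's class, just an independent verification sketch:

```java
public class ClosedFormCheck {

    // Ordinary least squares for a single feature:
    //   m = sum((x - xMean) * (y - yMean)) / sum((x - xMean)^2)
    //   c = yMean - m * xMean
    static double[] fit(double[] x, double[] y) {
        double xMean = 0, yMean = 0;
        for (int i = 0; i < x.length; i++) {
            xMean += x[i];
            yMean += y[i];
        }
        xMean /= x.length;
        yMean /= y.length;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - xMean) * (y[i] - yMean);
            sxx += (x[i] - xMean) * (x[i] - xMean);
        }
        double m = sxy / sxx;
        return new double[] { m, yMean - m * xMean };
    }

    public static void main(String[] args) {
        // the same sample data used in trainModel()
        double[] x = { 20, 16, 19.8, 18.4, 17.1, 15.5 };
        double[] y = { 88.6, 71.6, 93.3, 84.3, 80.6, 75.2 };
        double[] fit = fit(x, y);
        // roughly m ≈ 4.05, c ≈ 10.13 for this data
        System.out.println("m = " + fit[0] + ", c = " + fit[1]);
    }
}
```

If the SGD training runs long enough, its printed weight and beta should approach these values.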

Stay tuned.

Please drop me feedback note at nirmal1067@gmail.com
