Linear Regression with Gradient Descent in C++

Ronan Kelly
Dec 30, 2021

I want to explain how I used C++ to create a simple machine learning library, written from scratch, that (as of now) supports only linear regression. While many may think C++ is archaic due to its unusual syntax and generally low-level nature, I wanted to take advantage of some of its cool features, such as implicit constructors, templates, and operator overloading, to create a fast and easy-to-use library for regression problems.

The Base Stuff

Before I began the actual machine learning portion, I wanted to create a wrapper around the std::vector structure, as well as some sort of structure for multi-dimensional data, to mimic Python’s NumPy and Pandas libraries. These structures let me load and modify data with simple operators, without excessive loops everywhere, and read and parse CSV files. Unfortunately, there is no support for string parsing yet, so one-hot encoding is off the table and only numeric values work at the moment. Here is a snippet of the ‘Series’ class I created.
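The sketch below captures the idea in miniature; construction from an initializer list, scalar multiplication through operator overloading, and the element-wise apply() helper are my assumptions about the class, not its exact API.

#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <iostream>
#include <vector>

// A thin wrapper around std::vector with NumPy-style conveniences.
class Series {
    std::vector<double> data;
public:
    Series(std::initializer_list<double> values) : data(values) {}
    Series(std::size_t n, double fill) : data(n, fill) {}

    // Scalar multiplication through operator overloading.
    Series operator*(double s) const {
        Series out = *this;
        for (double &v : out.data) v *= s;
        return out;
    }

    // Element-wise transform, e.g. sqrt over the whole Series.
    template <typename F>
    Series apply(F f) const {
        Series out = *this;
        for (double &v : out.data) v = f(v);
        return out;
    }

    std::size_t size() const { return data.size(); }

    friend std::ostream &operator<<(std::ostream &os, const Series &s) {
        os << "<Series size=(" << s.data.size() << "), objects=[";
        for (std::size_t i = 0; i < s.data.size(); ++i)
            os << s.data[i] << (i + 1 < s.data.size() ? ", " : "");
        return os << "]>";
    }
};

int main() {
    std::cout << "Test 1\n" << Series{5, 6, 7} * 2 << "\n";
    std::cout << "Test 2\n" << Series(5, 25.0) << "\n";
    std::cout << "Test 3\n"
              << Series{1, 2, 3, 4, 5, 6, 7}.apply([](double v) { return std::sqrt(v); })
              << "\n";
}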

Running these tests prints the following:

Test 1
<Series size=(3), objects=[10, 12, 14]>
Test 2
<Series size=(5), objects=[25, 25, 25, 25, 25]>
Test 3
<Series size=(7), objects=[1, 1.41421, 1.73205, 2, 2.23607, 2.44949, 2.64575]>

Now we have support for vector and scalar multiplication, which will help us a lot! I also created a class called DataFrame, which uses the Series class to store values mapped to a string key, similar to the Pandas DataFrame structure. Here is a sample below.
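The sample that produced the output below loaded a CSV with 8 columns and 128 rows, including a “Prices” column. A condensed sketch of the interface it exercises could look like this; from_csv() and the member layout are assumptions, and plain vectors stand in for Series to keep the sketch self-contained.

#include <cstddef>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Condensed sketch of a DataFrame: named columns of numeric data.
class DataFrame {
    std::vector<std::string> names;                   // preserves column order
    std::map<std::string, std::vector<double>> cols;  // column name -> values
public:
    static DataFrame from_csv(const std::string &path) {
        DataFrame df;
        std::ifstream file(path);
        std::string line, cell;

        // The first line holds the column headers.
        if (std::getline(file, line)) {
            std::stringstream header(line);
            while (std::getline(header, cell, ','))
                df.names.push_back(cell);
        }

        // Remaining lines hold numeric values only (no one-hot encoding yet).
        while (std::getline(file, line)) {
            if (line.empty()) continue;
            std::stringstream row(line);
            for (const std::string &name : df.names) {
                std::getline(row, cell, ',');
                df.cols[name].push_back(std::stod(cell));
            }
        }
        return df;
    }

    // Pythonic column access: df["Prices"]
    const std::vector<double> &operator[](const std::string &name) const {
        return cols.at(name);
    }

    friend std::ostream &operator<<(std::ostream &os, const DataFrame &df) {
        auto it = df.names.empty() ? df.cols.end() : df.cols.find(df.names[0]);
        std::size_t rows = (it == df.cols.end()) ? 0 : it->second.size();
        return os << "<DataFrame size=(" << df.names.size() << ", " << rows << ")>";
    }
};

int main() {
    DataFrame df = DataFrame::from_csv("housing.csv");  // hypothetical file name
    std::cout << "DataFrame info\n" << df << "\n";
}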

Similar to the Series class, the DataFrame takes advantage of features like operator overloading to make things simple, even Pythonic, to use. The above code snippet produces the output below.

DataFrame info
<DataFrame size=(8, 128)>
Prices
<Series size=(128), objects=[114300, 114200, 114800, 94700, 119800, 114600, 151600, 150700, 119200, 104000, 132500, 123000, 102600, 126300, 176800, 145800, 147100, 83600, 111400, 167200, 116200, 113800, 91700, 106100, 156400, 149300, 137000, 99300, 69100, 188000, 182000, 112300, 135000, 139600, 117800, 117100, 117500, 147000, 131300, 108200, 106600, 133600, 105600, 154000, 166500, 103200, 129800, 90300, 115900, 107500, 151100, 91100, 117400, 130800, 81300, 125700, 140900, 152300, 138100, 155400, 180900, 100900, 161300, 120500, 130300, 111100, 126200, 151900, 93600, 165600, 166700, 157600, 107300, 125700, 144200, 106900, 129800, 176500, 121300, 143600, 143400, 184300, 164800, 147700, 90500, 188300, 102700, 172500, 127700, 97800, 143100, 116500, 142600, 157100, 160600, 152500, 133300, 126800, 145500, 171000, 103200, 123100, 136800, 211200, 82300, 146900, 108500, 134000, 117000, 108700, 111600, 114900, 123600, 115700, 124500, 102500, 199500, 117800, 150200, 109700, 110400, 105600, 144800, 119700, 147900, 113500, 149900, 124600]>

Machine Learning with Linear Regression

With these classes out of the way, it was time to begin coding our first ML algorithm: linear regression. Linear regression is a fairly simple algorithm that uses a weighted sum to predict values. While the notation for a line on a Cartesian graph may look like y = mx + b, in machine learning the notation (accounting for multiple inputs, or features) is a bit different.

y = w₁x₁ + w₂x₂ + … + wₙxₙ + B

Linear Regression Format

The y value at the beginning is what we want to predict, and the multiple w’s in front of the x’s can be thought of as slopes, though from now on we’ll call them weights. That fancy B at the end can be thought of as the y-intercept and is known as the bias. The goal is to update the weights and the bias until the function is accurate enough, through an iterative process called gradient descent.
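Concretely, each gradient descent step measures how a loss function L (assumed here to be the usual mean squared error for regression) changes with respect to each parameter, then nudges that parameter the opposite way, scaled by a small learning rate α:

wᵢ ← wᵢ − α · ∂L/∂wᵢ
B ← B − α · ∂L/∂B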

I created a class called LinearRegressor which has the functions fit() and predict(), very similar to the interfaces that Python uses for machine learning frameworks like scikit-learn or XGBoost. The magic of fitting the model to the data lives in the fit() function. As we see in the header definition below, the LinearRegressor class also has private properties to store the weights and the bias.
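A sketch of what that header might look like, reusing the Series class from earlier; the parameter names and default hyperparameters are assumptions, not the repo’s exact signatures.

#include <cstddef>
#include <vector>

// Sketch of the LinearRegressor interface (assumes the Series class above).
class LinearRegressor {
public:
    // Learn the weights and bias from the training data via gradient descent.
    // x holds one Series per feature column; y holds the target values.
    void fit(const std::vector<Series> &x, const Series &y,
             double learning_rate = 0.01, std::size_t epochs = 1000);

    // Compute the weighted sum w1*x1 + ... + wn*xn + B for one row of features.
    double predict(const std::vector<double> &row) const;

    // Helpers to inspect the learned parameters.
    std::vector<double> get_weights() const { return weights; }
    double get_bias() const { return bias; }

private:
    std::vector<double> weights;  // one weight per feature
    double bias = 0.0;            // the bias term B
};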

The function for fitting the data takes the x values, or features, as a vector of Series, where each Series is a column of values, and the ‘y’ parameter is a single Series of the known-correct values from the dataset. In addition, there are a couple of helper methods to get the weights and bias. Due to the general complexity of the algorithm in C++, I won’t explain it line by line, but I will talk about the general concept.
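At a high level, fit() repeats the gradient descent update shown earlier until the parameters settle. A condensed, self-contained sketch of that loop under a mean-squared-error loss, using plain vectors in place of Series and assumed hyperparameter defaults, might look like this:

#include <cstddef>
#include <vector>

// Condensed sketch of gradient descent for linear regression under a
// mean-squared-error loss.
void fit(const std::vector<std::vector<double>> &x,  // x[j] = feature column j
         const std::vector<double> &y,               // known-correct targets
         std::vector<double> &weights, double &bias,
         double learning_rate = 0.01, std::size_t epochs = 1000) {
    const std::size_t m = y.size();  // number of samples
    const std::size_t n = x.size();  // number of features
    weights.assign(n, 0.0);
    bias = 0.0;

    for (std::size_t epoch = 0; epoch < epochs; ++epoch) {
        std::vector<double> grad_w(n, 0.0);
        double grad_b = 0.0;

        for (std::size_t i = 0; i < m; ++i) {
            // Prediction: weighted sum of the features plus the bias.
            double pred = bias;
            for (std::size_t j = 0; j < n; ++j) pred += weights[j] * x[j][i];

            // Accumulate the gradient of the mean squared error.
            double err = pred - y[i];
            for (std::size_t j = 0; j < n; ++j) grad_w[j] += 2.0 * err * x[j][i] / m;
            grad_b += 2.0 * err / m;
        }

        // Step every parameter against its gradient.
        for (std::size_t j = 0; j < n; ++j) weights[j] -= learning_rate * grad_w[j];
        bias -= learning_rate * grad_b;
    }
}

Each epoch makes a full pass over the data, averages the gradient of the squared error, and steps every weight and the bias against it; the learning rate controls how large each step is.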

The full source code is available on GitHub: https://github.com/rmiguelkelly/mlpp
