Regression: a short introduction to the method used in fastai
What is regression?
Regression is the task of finding the relationship between a dependent variable and one or more independent variables. Different regression methods are used in AI for different purposes, such as predicting behaviour, finding the midpoint of a face in an image, or finding the size and position of a circle in a square.
Regression techniques are widely employed to solve tasks where the goal is to predict continuous values. In computer vision, they cover a wide range of applications, such as head-pose estimation, facial landmark detection, human pose estimation, age estimation, and image registration.
There are at least 15 types of regression used in machine learning. The one used in Lesson 6 is not among those listed below, so there are more. The listed types are:
Linear Regression; Polynomial Regression; Logistic Regression; Quantile Regression; Ridge Regression; Lasso Regression; Elastic Net Regression; Principal Components Regression (PCR); Partial Least Squares (PLS) Regression; Support Vector Regression; Ordinal Regression; Poisson Regression; Negative Binomial Regression; Quasi Poisson Regression; Cox Regression; Tobit Regression
For linear regression, or the line of best fit, we look for the best fit assuming the formula y = bx + a, where y is the dependent variable, x is the independent variable, b is the slope and a is the intercept. An example of solving a linear regression on a random set of numbers looks like the header image.
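To make the formula y = bx + a concrete, here is a minimal sketch of a least-squares fit with NumPy. The function name fit_line and the data points are made up for illustration and do not come from the lesson.

```python
import numpy as np

def fit_line(x, y):
    """Return slope b and intercept a of the least-squares line y = b*x + a."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x.
    b = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    return b, a

# Illustrative data only: points lying exactly on y = 2x + 1.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
b, a = fit_line(x, y)
```

With noisy data the points would no longer sit exactly on the line, and b and a would describe the line that minimises the squared vertical distances to the points.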
However, in machine learning the data does not necessarily fit a straight line, in which case we need to select a different type of regression model to understand the data we are working with. There are links to the different methods should you wish to do deeper research; the scope of this article only covers chapter 6 of the fast.ai lecture.
Since this article specifically refers to lesson 6 of the fast.ai course, we will use the key point regression model that finds the midpoint of faces in the Biwi Kinect head pose set.
For this purpose, we will follow the example used by Jeremy Howard in his lecture on the subject and attempt to understand the method used:
The preparation of the data can be found in the notebook 06_multicat.ipynb in Colab; our focus here is on the regression method.
The Biwi Kinect head pose set has two kinds of file. The RGB image file is the independent variable, and the corresponding pose file, which gives the position of the midpoint of the head, is the dependent variable. A function, get_ctr, extracts the centre point from the pose file and returns it as a tensor, which can be passed to get_y in the DataBlock.
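This article does not show how the centre point is computed. As a rough sketch only, assume the pose file stores the head centre as a 3D position in camera coordinates and that a calibration file supplies the camera intrinsics (both are assumptions here). A generic pinhole-camera projection would then turn that position into pixel coordinates. The function name, the matrix K and the numbers are hypothetical, and the notebook's get_ctr may differ in detail.

```python
import numpy as np

def project_to_pixels(point_3d, intrinsics):
    """Project a 3D point in camera coordinates to 2D pixel coordinates."""
    x, y, z = point_3d
    fx, fy = intrinsics[0][0], intrinsics[1][1]   # focal lengths in pixels
    cx, cy = intrinsics[0][2], intrinsics[1][2]   # principal point in pixels
    # Perspective divide by depth, then shift by the principal point.
    col = fx * x / z + cx
    row = fy * y / z + cy
    return np.array([col, row])

# Hypothetical calibration and head position, not values from the data set.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
head_center = np.array([100.0, -50.0, 1000.0])
center_px = project_to_pixels(head_center, K)
```

The result is a pair of continuous values, which is exactly the kind of label a PointBlock expects.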
In this example the images have been halved in size to reduce training time.
The code for key point regression using fast AI is as follows:
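from fastai.vision.all import *

# get_ctr is defined in the notebook's data-preparation section
biwi = DataBlock(
    blocks=(ImageBlock, PointBlock),      # image in, (x, y) point out
    get_items=get_image_files,            # collect every image file
    get_y=get_ctr,                        # label: centre point of the head
    # hold out one person's folder as the validation set
    splitter=FuncSplitter(lambda o: o.parent.name=='13'),
    batch_tfms=[*aug_transforms(size=(240,320)),
                Normalize.from_stats(*imagenet_stats)]
)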
The DataBlock has ImageBlock as the independent variable, as expected, and PointBlock (a tensor with 2 values) as the dependent variable. Because the labels are points, any augmentation applied to an image is also applied to its label, so the centre point stays in the right location. Combining ImageBlock and PointBlock signals that this is image regression with a dependent variable made of two continuous values.
get_items calls get_image_files to collect the image files.
get_y calls get_ctr, the function described above that returns the centre point for an image.
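To see why the label has to follow the image, consider a horizontal flip. The sketch below is a plain NumPy illustration with a made-up toy image. It is not fastai's implementation, only the idea behind it.

```python
import numpy as np

def flip_horizontal(image, point):
    """Flip an image left-to-right and move its (col, row) key point to match."""
    height, width = image.shape[:2]
    flipped_image = image[:, ::-1]
    col, row = point
    # A pixel in column c lands in column (width - 1 - c) after the flip.
    flipped_point = (width - 1 - col, row)
    return flipped_image, flipped_point

# Toy 4x6 image with a single bright pixel marking the "face centre".
image = np.zeros((4, 6))
point = (1, 2)                      # column 1, row 2
image[point[1], point[0]] = 1.0

flipped_image, flipped_point = flip_horizontal(image, point)
```

If only the image were flipped, the original point (1, 2) would now sit on an empty pixel, and the model would be trained on a wrong label.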
The splitter must not be random, otherwise the validation results will not be meaningful. The data in the Biwi Kinect head pose set is organised with one person per folder, so it is recommended to split by person (in this case by folder). With a random split the same person would appear in both the training and validation sets, and the model could score well on validation simply by over fitting to people it has already seen.
In this data set it is important that the validation set contains only people who do not appear in the training set. In the lesson 6 example, person 13 was held out of training and used as the validation set; this person was selected at random. This requirement is specific to this data set because of the way the data was prepared. In other data sets a random validation set could be used.
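The idea of holding out one person can be sketched in plain Python. The helper split_by_person and the file names below are hypothetical and only mimic a folder-per-person layout; in fastai the same effect comes from FuncSplitter.

```python
from pathlib import Path

def split_by_person(files, valid_person):
    """Split files into (train, valid), holding out one person's folder."""
    train, valid = [], []
    for f in files:
        # The parent folder name identifies the person in the image.
        if Path(f).parent.name == valid_person:
            valid.append(f)
        else:
            train.append(f)
    return train, valid

# Hypothetical file names, used only to show the mechanics of the split.
files = [
    "biwi/01/frame_00003_rgb.jpg",
    "biwi/01/frame_00004_rgb.jpg",
    "biwi/13/frame_00003_rgb.jpg",
    "biwi/13/frame_00004_rgb.jpg",
    "biwi/20/frame_00003_rgb.jpg",
]
train, valid = split_by_person(files, valid_person="13")
```

Every image of the held-out person ends up in the validation list, so the validation score reflects how well the model handles a face it has never seen.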
Data augmentation and normalization are both applied to these sets. In this example normalization is requested explicitly with a line of code (Normalize.from_stats), but it seems that this is now a built-in feature, in which case the line might not be needed. It is best to check that it works, though.
The lesson notes that, apart from fastai, no other libraries seem to apply the correct data augmentation to the coordinates automatically. If you are using another library, it is advisable to disable augmentation for this specific kind of problem.
Don’t forget to check the output at this stage. A manual check that the results are as expected can save time: if there is an anomaly, you can fix it before training. See the image below:
The model is now ready to train.
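For reference, normalizing with ImageNet statistics means subtracting a per-channel mean and dividing by a per-channel standard deviation. The sketch below uses the commonly quoted ImageNet values and channel-last NumPy arrays for simplicity; fastai itself works on channel-first tensors, and the helper names here are made up.

```python
import numpy as np

# Per-channel (R, G, B) mean and standard deviation commonly used for ImageNet.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize(image):
    """Normalize an H x W x 3 image with values in [0, 1], channel by channel."""
    return (image - IMAGENET_MEAN) / IMAGENET_STD

def denormalize(image):
    """Undo normalize, for example before displaying an image."""
    return image * IMAGENET_STD + IMAGENET_MEAN

# Toy image: every pixel equals the ImageNet mean colour.
image = np.ones((2, 2, 3)) * IMAGENET_MEAN
normalized = normalize(image)
```

A pretrained model expects inputs scaled the same way as the data it was trained on, which is why these particular statistics are used.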
Sources
Linear Regression: Fun and Easy Machine Learning. https://www.youtube.com/watch?v=CtKeHnfK5uA
06_multicat.ipynb, Jeremy Howard. https://colab.research.google.com/github/fastai/fastbook/blob/master/06_multicat.ipynb#scrollTo=wkvmMezplMgP
Biwi Kinect head pose set. Authors: Gabriele Fanelli, Matthias Dantone, Juergen Gall, Andrea Fossati, and Luc Van Gool.
Deep Learning Applications. M. Arif Wani, Mehmed Kantardzic, and Moamar Sayed-Mouchaweh.
A Comprehensive Analysis of Deep Regression. Stéphane Lathuilière, Pablo Mesejo, Xavier Alameda-Pineda, and Radu Horaud.
15 Types of Regression in Data Science. https://www.listendata.com/2018/03/regression-analysis.html