[Week 4] Make World Green Again!

İhsan Baran Sönmez
Published in bbm406f17
Dec 16, 2017

This week, we tried regression models to estimate the peak NO2 concentration. The inputs are three early-hour NO2 concentration measurements together with the relative humidity measurements for the same hours. The maximum of the hourly measurements is used as the true label. The training set consists of 12,785 samples, and a separate test set of 3,000 samples is used for validation.
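To make the input layout concrete, here is a minimal sketch of how one sample could be assembled from a day of hourly readings. The helper name `make_sample`, the choice of the first three hours as the "early" hours, and the readings themselves are assumptions for illustration only; the post does not specify them.

```python
# Hypothetical helper: which hours count as "early" is not stated in the post,
# so the first three entries of the day are used purely for illustration.
EARLY_HOURS = (0, 1, 2)


def make_sample(no2_hourly, humidity_hourly, early_hours=EARLY_HOURS):
    """Return (features, label) for one day of hourly measurements."""
    # Three early NO2 readings followed by the humidity of the same hours.
    features = [no2_hourly[h] for h in early_hours]
    features += [humidity_hourly[h] for h in early_hours]
    # The peak NO2 value of the day is the regression target.
    label = max(no2_hourly)
    return features, label


# Made-up readings, shortened to six hours to keep the example small.
no2 = [21.0, 25.5, 30.0, 48.0, 61.5, 40.0]
humidity = [70.0, 68.0, 65.0, 55.0, 50.0, 52.0]

features, label = make_sample(no2, humidity)
print(features, label)
```

Each sample therefore has six input features and one scalar label.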

1- Linear Regression Model

As seen from the graph, the learned parameters fit the test data into an interval and discard the unusual values. The R² score on the test set is 0.74, where 1 represents a perfect prediction.
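For readers unfamiliar with the R² score, the sketch below computes it from its usual definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the numbers are made up for illustration and are not the values from our experiment.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of the labels.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up peak values, for illustration only.
y_true = [40.0, 55.0, 70.0, 85.0]

perfect = r_squared(y_true, y_true)        # predictions equal the labels
baseline = r_squared(y_true, [62.5] * 4)   # always predict the mean (62.5)
print(perfect, baseline)
```

A model that reproduces the labels exactly scores 1, while a model that always predicts the mean of the labels scores 0, so a score of 0.74 sits between those two reference points.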

2- Polynomial Regression Model

With a suitable alpha value, the polynomial model fits the data better than the linear regression model.

4th-order polynomial regression

Degree 4:

alpha = 10 — MSE: 64.56, Variance score: 0.63

alpha = 0.001 — MSE: 41.11, Variance score: 0.77

alpha = 1e-06 — MSE: 41.02, Variance score: 0.77

7th-order polynomial regression

Degree 7:

alpha = 10 — MSE: 60.22, Variance score: 0.66

alpha = 0.001 — MSE: 40.94, Variance score: 0.77

alpha = 1e-06 — MSE: 43.51, Variance score: 0.75

The results show how the scores relate to the alpha values and to the polynomial degree. As the value of alpha increases, the model complexity decreases. High alpha values may cause underfitting, as seen in the first plot of the figures. Very small alpha values, however, may cause overfitting. That’s why we tried a few alpha values and chose 0.001 as the best value for this model.

The change in degree did not affect the results significantly. However, with alpha = 0.001, degree 7 gave the best results.
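The sweep over degrees and alpha values can be sketched as follows. This is not our actual training code: it assumes that alpha is an L2 (ridge) penalty, uses a single synthetic input feature instead of the six real ones, and the helper names `poly_features`, `fit_ridge`, and `mse` are hypothetical.

```python
import numpy as np


def poly_features(x, degree):
    """Columns x^0, x^1, ..., x^degree for a 1-D input array."""
    return np.vander(x, degree + 1, increasing=True)


def fit_ridge(X, y, alpha):
    """Closed-form ridge solution: solve (X^T X + alpha * I) w = X^T y."""
    penalty = alpha * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # do not penalise the intercept column
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


def true_fn(x):
    # Synthetic target, NOT the NO2 data from the post.
    return np.sin(3 * x)


rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 200)
x_test = rng.uniform(-1, 1, 100)
y_train = true_fn(x_train) + rng.normal(0, 0.1, 200)
y_test = true_fn(x_test) + rng.normal(0, 0.1, 100)

results = {}
for degree in (4, 7):
    X_train = poly_features(x_train, degree)
    X_test = poly_features(x_test, degree)
    for alpha in (10, 0.001, 1e-6):
        weights = fit_ridge(X_train, y_train, alpha)
        results[(degree, alpha)] = mse(y_test, X_test @ weights)
        print(f"degree={degree} alpha={alpha} MSE={results[(degree, alpha)]:.4f}")
```

On this toy data the large penalty (alpha = 10) shrinks the coefficients so strongly that the fit degrades, which mirrors the underfitting we saw with alpha = 10. Whether a very small alpha overfits depends on the data and the degree, so it is worth comparing test scores rather than assuming.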

Comments on the Error

Mean squared error is used as the error function. The resulting error is higher than we expected. The extreme cases might be the reason: the models might need more features describing the air conditions, or the extreme cases might be caused by a sudden external factor.
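A small worked example shows why a few extreme cases can dominate the mean squared error. The values below are made up for illustration and do not come from our dataset; the function name `mean_squared_error` is defined locally here.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Four typical days plus one extreme peak that the model misses by 80.
y_true = [50.0, 60.0, 55.0, 65.0, 150.0]
y_pred = [52.0, 58.0, 57.0, 63.0, 70.0]

mse_all = mean_squared_error(y_true, y_pred)
mse_typical = mean_squared_error(y_true[:-1], y_pred[:-1])
print(mse_all, mse_typical)
```

Because the errors are squared, the single missed peak contributes 6400 to the sum while the four typical days contribute 16 in total, so one extreme day accounts for almost all of the reported error.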
