[WEEK 6] Prediction Of Real Estate Price

Muhammed İkbal Arslan
Published in bbm406f18 · 3 min read · Jan 7, 2019

Theme: Prediction Of Real Estate Price and Image Classification with Textual and Image Features

Team Members: Batuhan Ündar, Muhammed İkbal Arslan, Enes Koçak


Recap

I would like to start this week's blog post by recapping some of the methods we planned in previous weeks and the path we have followed.

As you may recall, our main goal in this project is to classify the images in the data we collect and to estimate prices using parameters such as location, number of bathrooms, number of rooms, square meters, building age, and how luxurious the house is. We also outlined the following steps:

  1. Train three convolutional neural networks.
  2. Feed the outputs of the CNNs, together with the other continuous parameters, into an artificial neural network.
  3. While this work continues, run some comparisons. To that end, this week we tested our models on the dataset mentioned below.
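To make step 2 concrete, here is a minimal sketch of the idea: the CNN outputs are treated as extra feature columns and concatenated with the continuous listing parameters before being passed to a small fully connected network. The array shapes, the use of `MLPRegressor` as the artificial neural network, and the synthetic prices are all our assumptions for illustration, not the project's actual code.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Stand-in for the feature vectors the three CNNs would output:
# one 8-dimensional vector per network per house (24 values total).
n_houses = 200
cnn_features = rng.normal(size=(n_houses, 3 * 8))

# Continuous listing parameters: square meters, rooms, bathrooms, age.
continuous = rng.uniform(low=[50, 1, 1, 0], high=[300, 6, 3, 40],
                         size=(n_houses, 4))

# Synthetic prices, just so the sketch runs end to end.
prices = 1000 * continuous[:, 0] + 5000 * continuous[:, 1] \
         + rng.normal(scale=5000, size=n_houses)

# Concatenate CNN outputs with the continuous parameters and feed
# the joint vector to a small fully connected regression network.
X = np.hstack([cnn_features, continuous])
ann = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                   random_state=0)
ann.fit(X, prices)
print(X.shape)  # (200, 28)
```

In the real pipeline the random `cnn_features` would be replaced by the activations of the trained CNNs.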

To have a baseline for comparison with the dataset we collected, we applied the same process to kc_house_data, a United States house-sales dataset from Kaggle. This time, however, we did not use any image features; we simply simplified the US dataset, which has features our own dataset lacks, such as estimated square footage of the home and square footage of the lot.
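The simplification step amounts to keeping only the columns that both datasets share. A minimal pandas sketch, where the tiny in-memory frame stands in for `pd.read_csv("kc_house_data.csv")` and the chosen column names are our assumption about which features overlap:

```python
import pandas as pd

# Tiny stand-in for the kc_house_data table from Kaggle.
df = pd.DataFrame({
    "price":         [221900, 538000, 180000],
    "bedrooms":      [3, 3, 2],
    "bathrooms":     [1.0, 2.25, 1.0],
    "sqft_living":   [1180, 2570, 770],
    "sqft_lot":      [5650, 7242, 10000],   # US-only feature, dropped
    "sqft_living15": [1340, 1690, 2720],    # US-only feature, dropped
    "yr_built":      [1955, 1951, 1933],
})

# Keep only the columns that overlap with our own listings.
shared = ["price", "bedrooms", "bathrooms", "sqft_living", "yr_built"]
simplified = df[shared]
print(list(simplified.columns))
```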

We evaluated KNeighbors Regression and Linear Regression models, expecting that these additional features would improve the results.

With KNeighbors Regression, the model achieves an average training score of 0.519 on the shuffled data, while the average testing score is 0.442. Similarly, Linear Regression on the same data reaches a training score of 0.348 and a testing score of 0.335. Frankly, these results disappointed us. The distributions of predicted versus expected values for both regression models are shown below:
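The averaged train/test scores above can be reproduced in spirit with a few shuffled splits; the sketch below uses synthetic data via `make_regression` in place of the real table, and the split count and sizes are assumptions, so the numbers it prints will differ from the post's.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the simplified kc_house_data table.
X, y = make_regression(n_samples=500, n_features=5, noise=30.0,
                       random_state=0)

def averaged_scores(model, n_splits=5):
    """Average train/test R^2 over several shuffled splits."""
    train_scores, test_scores = [], []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.25, random_state=seed, shuffle=True)
        model.fit(X_tr, y_tr)
        train_scores.append(model.score(X_tr, y_tr))
        test_scores.append(model.score(X_te, y_te))
    return np.mean(train_scores), np.mean(test_scores)

knn_train, knn_test = averaged_scores(KNeighborsRegressor())
lin_train, lin_test = averaged_scores(LinearRegression())
print(f"KNN    train {knn_train:.3f}  test {knn_test:.3f}")
print(f"Linear train {lin_train:.3f}  test {lin_test:.3f}")
```

Note that `model.score` for these regressors is the R² coefficient, which is what the training/testing scores in the text refer to.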

KNeighbors Regression
Linear Regression

The Linear SVR (SVM) model gave us such poor results that including them here would be embarrassing :)
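One common first fix when Linear SVR scores badly, and this is our assumption about the cause rather than something we verified, is that it is sensitive to feature scale, so it is usually wrapped in a scaling pipeline before being written off. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic stand-in for the house-price table.
X, y = make_regression(n_samples=400, n_features=5, noise=20.0,
                       random_state=0)

# Standardize features before the epsilon-insensitive linear SVR.
svr = make_pipeline(StandardScaler(),
                    LinearSVR(max_iter=10000, random_state=0))
svr.fit(X, y)
print(round(svr.score(X, y), 3))
```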

After comparing our dataset with the US dataset, we concluded that the results would not improve as much as expected and that continuing with our own data would be the better choice. We believe that once the image classification stage is in place, the results will reach the level we expect. Those were our experiences this week.

See you in next week's post, the last one!

Dataset

https://www.kaggle.com/harlfoxem/housesalesprediction

References

https://towardsdatascience.com/regression-predict-house-price-lesson-2-5e23bec1c09d


Hacettepe University - Computer Science & Engineering