[Week4-YelpGuesser]

YelpGuesser · Published in bbm406f16 · 2 min read · Dec 30, 2016

Choose the Most Suitable Algorithm and Get the Best Predictions!

Hi Everyone

Today we are going to talk about the SVR algorithm. We have already created our datasets and decided which one to use in the project. Now we have to determine which algorithm is most suitable for it.

We take the sentiment analysis concept a little bit further and try to guess how many stars will be given based on the text of a review. Stars can be integer or float values, so we know we have to use regression: what we need is to be able to predict actual numeric values, and classification is not going to work for that. Our goal is always the same: to reduce the error as much as possible while maximizing the margin. We first thought about SVM, but SVM is about classification. After that, we decided to use SVR (Support Vector Regression), which fits our project.

Taken from: http://www.saedsayad.com/support_vector_machine_reg.htm
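
To make the "reduce the error" part a bit more concrete, here is a minimal sketch of the epsilon-insensitive loss that SVR is usually described with: errors smaller than epsilon cost nothing, and larger errors cost linearly. The function name, the epsilon value and the star values are our own illustrative choices, not code from our project.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Made-up true and predicted star ratings, for illustration only
true_stars = [5.0, 1.0, 3.0]
predicted_stars = [4.95, 2.0, 3.5]

losses = epsilon_insensitive_loss(true_stars, predicted_stars)
print(losses)
```

The first prediction is off by less than epsilon, so it is not penalized at all. The other two are penalized only for the part of the error that falls outside the tube.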

SVR helps us find a regression function, and it can also improve the calculation speed. For these reasons, we will use SVR in our project. We will also use the bag-of-words model, which counts how many times each word appears in a text. We build a vocabulary from the words in the reviews, and thanks to that we can prepare our feature vectors. After that, we can use SVR to predict the stars.
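
The bag-of-words step can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the two reviews are made up, words are split on whitespace, and the helper name `to_feature_vector` is hypothetical rather than taken from our project code.

```python
from collections import Counter

# Made-up reviews, for illustration only
reviews = ["good food good service", "bad food"]

# Vocabulary: every distinct word, sorted so the column order is stable
vocabulary = sorted({word for review in reviews for word in review.lower().split()})


def to_feature_vector(text, vocabulary):
    """Count how many times each vocabulary word appears in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


features = [to_feature_vector(review, vocabulary) for review in reviews]
print(vocabulary)
print(features)
```

Each review becomes one row of counts, with one column per vocabulary word. Rows like these are the kind of feature vectors we can hand to a regressor.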
SVR is very simple to try out with scikit-learn. Here it is on just some example data:

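from sklearn import svm

# Two example word-count vectors (15 features each) and their star ratings
L = [[1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 0, 1, 1, 1],
     [1, 0, 0, 2, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0]]
y = [5, 1]
clf = svm.SVR()
clf.fit(L, y)
print(clf.predict([[1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0],
                   [1, 0, 3, 1, 100, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0]]))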
In our run, it predicted: [2.18126925 3.]. We will do the same in our project.
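
Putting the two pieces together, the whole idea could look roughly like the sketch below. It assumes scikit-learn's `CountVectorizer` for the bag-of-words step and `SVR` with its default parameters. The reviews and star ratings are made up for illustration, and this is not our final project code.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# Made-up training reviews and star ratings, for illustration only
train_reviews = [
    "great food and great service",
    "terrible food and rude staff",
    "good place with friendly staff",
    "bad service and cold food",
]
train_stars = [5, 1, 4, 2]

# CountVectorizer builds the vocabulary and the word-count vectors,
# then SVR fits a regression function on those vectors
model = make_pipeline(CountVectorizer(), SVR())
model.fit(train_reviews, train_stars)

predicted = model.predict(["great food", "terrible service"])
print(predicted)
```

The predictions come back as floats, so they would still need to be rounded or clipped if we want them to look like real star ratings.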

Thank you for reading. See you next week !
