Week 6 - Hate Speech Detection on Social Media

Published in bbm406f19, Jan 5, 2020

Team Members: Yiğit Barkın Ünal, Gökhan Özeloğlu, Ege Çınar

Introduction

Hi everyone! Last week we discussed data preprocessing and applied Logistic Regression as the first model for our project. Its predictions turned out to be better than we had expected. This week, I am going to talk about accuracy in the first part, and about Support Vector Machines and 1D Convolutional Neural Network models for text classification in the second part.

Accuracy

We usually look at accuracy to decide whether our model makes good predictions. But accuracy is not always the right measure, because the training and test data are often imbalanced. For example, if 99% of the gathered tweets are “not hate speech” and our model predicts “not hate speech” for every tweet, its accuracy would be 99% even though it never detects a single hate speech tweet. One solution is to balance the data by discarding part of the over-represented class, but this means losing data. We will talk about this and other possible solutions next week.
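To make the 99% example concrete, here is a minimal sketch in plain Python. The 990/10 split, the helper names `accuracy`, `recall` and `undersample`, and the use of random undersampling are our own illustrative assumptions, not the actual dataset or code of the project.

```python
import random

# Toy data (assumed): 990 "not hate speech" (0) and 10 "hate speech" (1) tweets.
labels = [0] * 990 + [1] * 10

# A trivial "model" that predicts "not hate speech" for every tweet.
predictions = [0] * len(labels)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive examples that the model found."""
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0

def undersample(y_true, seed=0):
    """Keep every minority example and an equal-sized random sample
    of the majority class. Returns the indices that are kept."""
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y_true) if t == 1]
    neg = [i for i, t in enumerate(y_true) if t == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))

print(accuracy(labels, predictions))  # 0.99, looks great
print(recall(labels, predictions))    # 0.0, no hate speech tweet is found

kept = undersample(labels)
balanced_labels = [labels[i] for i in kept]
balanced_preds = [predictions[i] for i in kept]
print(len(balanced_labels))                       # 20, so 980 tweets are discarded
print(accuracy(balanced_labels, balanced_preds))  # 0.5
```

On the balanced subset the same always-negative model drops to 50% accuracy, which reflects its real usefulness much better, but 980 of the 1000 tweets had to be thrown away to get there.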

Support Vector Machines (SVM)

A support vector machine is a non-probabilistic binary linear classifier. It finds the decision boundary that best separates the vectors that belong to a given group from the vectors that do not, choosing the boundary with the largest margin to the closest vectors on each side.
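As a rough illustration of how such a boundary can be found, the sketch below trains a linear SVM with subgradient descent on the regularized hinge loss, using only plain Python. The 2D toy vectors stand in for tweet feature vectors, and the function names and hyperparameters (`lr`, `reg`, `epochs`) are assumptions chosen for this example. It is not the implementation we use in the project.

```python
def score(w, b, x):
    """Signed score of x relative to the boundary w.x + b = 0."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lr=0.1, reg=0.01, epochs=200):
    """Minimize reg/2 * ||w||^2 + hinge loss with per-example updates.
    Labels must be +1 (in the group) or -1 (not in the group)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1:
                # Inside the margin or misclassified: the hinge term is
                # active, so move the boundary toward classifying xi.
                w = [wj - lr * (reg * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Safely outside the margin: only the regularizer acts.
                w = [wj - lr * reg * wj for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if score(w, b, x) >= 0 else -1

# Toy, linearly separable feature vectors (assumed for illustration).
X = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0],
     [-2.0, -2.0], [-3.0, -2.5], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])  # matches y on this toy data
print(predict(w, b, [4.0, 4.0]))      # 1
print(predict(w, b, [-4.0, -4.0]))    # -1
```

Note that `predict` returns only a class label, not a probability, which is what "non-probabilistic" means here.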

1D Convolutional Neural Networks (1D CNN)

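Convolutional Neural Networks work well and efficiently on computer vision problems thanks to their convolution layers. We can apply the same idea to our problem: instead of sliding a filter across the pixels of an image, a 1D convolution slides a filter along the sequence of words in a tweet.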
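The core operation can be sketched in a few lines of plain Python. The 2-dimensional "embeddings", the filter values and the function names below are made up for illustration. A real model would learn many filters and would typically be built with a deep learning library, so treat this only as a picture of the idea.

```python
def conv1d(sequence, kernel, bias=0.0):
    """Slide one filter along a sequence of token vectors.
    `kernel` has one row per token in the window (its width), and each
    row has the same length as a token vector. No padding is used."""
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        total = bias
        for token_vec, kernel_row in zip(window, kernel):
            total += sum(t * k for t, k in zip(token_vec, kernel_row))
        outputs.append(total)
    return outputs

def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]

def global_max_pool(values):
    """Keep the strongest response, wherever it occurs in the tweet."""
    return max(values)

# A 5-token tweet with hypothetical 2-dimensional word embeddings.
tweet = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]

# One filter that looks at 2 neighbouring tokens at a time.
kernel = [[1.0, -1.0], [0.5, 0.5]]

feature_map = relu(conv1d(tweet, kernel))
print(feature_map)                   # [1.5, 0.0, 0.0, 1.5]
print(global_max_pool(feature_map))  # 1.5
```

Each filter produces one pooled number per tweet, and a classifier layer on top of these numbers would then decide between "hate speech" and "not hate speech".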

We will add the results of SVM and 1D CNN next week.

Conclusion

This week we talked about accuracy, SVM, and 1D CNN. We will add the results of SVM and 1D CNN in the following weeks. Stay with us!

See you next week!