Naomi Weinberger
4 min read · Oct 20, 2021

Hate Speech Classification Model Part 1

**Due to the nature of this project, this notebook contains some hateful language. This in no way reflects the views of the author.**

According to justice.gov, approximately 250,000 hate crimes were committed each year between 2004 and 2015, and the majority of them were never reported to law enforcement.

A 2019 study conducted by NYU found a correlation between areas with a high rate of hate crimes and areas with a high rate of hate speech. Given this relationship, building a better filter for hate speech on Twitter is imperative.

I took two datasets from Kaggle, one entitled Hate Speech and Offensive Language and the other entitled Twitter Sentiment Analysis, and synthesized them. The hate speech dataset categorized tweets into three classes: hate speech, offensive language, and neither. The Twitter sentiment analysis dataset categorized tweets as either sexist/racist or neither. After exploring the data, I scraped Twitter using tweepy (searching for some common words from the previous datasets) and created a validation set, which I later used to test the model.
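The scraping step isn't shown in full here, but a minimal sketch of it might look like the following. The credentials, the SEARCH_TERM placeholder, and the tweepy v3-style calls are illustrative assumptions, not the project's exact code:

import tweepy
import pandas as pd

# Placeholder credentials -- substitute your own Twitter API keys
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# Search for a common word from the training data (placeholder term)
scraped = [status.full_text for status in
           tweepy.Cursor(api.search, q="SEARCH_TERM", lang="en",
                         tweet_mode="extended").items(200)]

# The collected tweets can then be labeled to serve as a validation set
validation_df = pd.DataFrame({"tweet": scraped})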

I started by importing the necessary libraries:

import numpy as np
np.random.seed(0)
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import re
import string

import nltk
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')
from nltk.corpus import stopwords
from nltk import word_tokenize, FreqDist, ngrams
from nltk.stem.porter import PorterStemmer
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import TweetTokenizer

stopword = set(stopwords.words('english'))
stemmer = nltk.SnowballStemmer("english")

from wordcloud import WordCloud, STOPWORDS

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_curve, auc, confusion_matrix,
                             plot_confusion_matrix, classification_report)

from xgboost import XGBClassifier as xgb
from imblearn.pipeline import Pipeline as IMBPipeline
from imblearn.over_sampling import SMOTE

import preprocessor as p  # tweet-preprocessor
from collections import Counter

I then joined the two datasets together, first making the column names and class labels consistent and then concatenating them with pd.concat:

hatespeech_df = pd.read_csv("labeled_data - Copy.csv")
hatespeech_df2 = pd.read_csv("train_E6oV3lV.csv")

# Keep only the class label and tweet text from the first dataset
hatespeech_df_1 = hatespeech_df[['class', 'tweet']]

# Map the second dataset's labels onto the first dataset's scheme:
# 1 (racist/sexist) -> 0 (hate speech), 0 (neither) -> 2 (neither)
hatespeech_df2['class'] = hatespeech_df2['label'].replace([1, 0], [0, 2])
df_2 = hatespeech_df2[['class', 'tweet']]

hatespeech = pd.concat([hatespeech_df_1, df_2])
hatespeech.shape

For my EDA, I started by plotting the number of tweets in each class:

sns.countplot(x='class', data=hatespeech)

Next I generated word clouds for each class, adding some extra stop words to the default list. The code below shows the hate speech class; I did the same for the other two classes.

# 'Hate' is the hate-speech subset (class 0); defined here so the snippet runs
Hate = hatespeech[hatespeech['class'] == 0]

hatetext = " ".join(tweet for tweet in Hate.tweet)

# Extend the default stop words with Twitter handles, punctuation, and
# other noise that showed up in the data
stop_words = set(STOPWORDS)
new_stopwords = ['b','dtype','a','i','@white_thunduh','@WhaleLookyHere','@VigxRArts','@NoChillPaz',
    '@MarkRoundtreeJr','@HowdyDowdy1','@DevilGrimz','@CB_Baby24',':','name','youu',"you's",'tweet',
    'tellin','really','Length','@viva_based','@mleew17','@UrKindOfBrand','@T_Madison_x',
    '@ShenikaRoberts','@LifeAsKing','@C_G_Anderson','!','20',',','.','..','...','...\n...','0sbaby',
    '1','19190','2','24774','24775','24778','24780','24781','3','4','5','?','@2','@8','110','202',
    'type','dia','Name','1430','object','24777','@Blackman38Tide','I','hope','24776','24751','184',
    'PEOPL','At','biggest','know',"I'm",'"','...\n ...','24685','24576','*','Dawg','RT','“','need',
    'lo','The','ain','young','bout','dwn',"'",'-','/','0','24736','24737','24767','24779','24782',
    '40','4163','63','66','67',':/','@Addicted2Guys','@AllAboutManFeet','@Allyhaaaaa','@N_tel',
    '@ViVaLa_Ari','@mayasolovely','http://t.co/3gzUpfuMev','s','th','tho','~','|','&','As','Eileen',
    'Dahlia','SimplyAddictedToGuys','yaya','avi','User','co','ð','@user']
new_stopwords_list = stop_words.union(new_stopwords)

# generate and visualize the word cloud
wordcloud_h = WordCloud(stopwords=new_stopwords_list).generate(hatetext)
fig = plt.figure(figsize=(15, 8))
plt.imshow(wordcloud_h, interpolation='bilinear')
plt.axis("off")
plt.title('Hate Speech Word Cloud')
plt.show()

I also computed and plotted word frequencies, again for all three classes:

# Helper assumed from earlier in the notebook (in this form or similar):
# tokenize a string and lemmatize each token
lemmatizer = WordNetLemmatizer()
def lemmatize_text(text):
    return [lemmatizer.lemmatize(w) for w in word_tokenize(text)]

# Join the tweets into one string before lemmatizing (str(Series) would
# pull in index numbers and 'dtype'/'Length' artifacts)
lt_hate = lemmatize_text(" ".join(Hate['tweet']))

# Drop stop words, then plot the 50 most frequent words
hate_2 = [word for word in lt_hate if word not in new_stopwords_list]
freq_hate = FreqDist(hate_2)
freq_hate.plot(50)