We will build a Keras classifier that distinguishes positive from negative sentiment in statements about movies. Then we will load the trained model in the browser to run predictions on user input.
I created a repository on GitHub with the code required to follow the tutorial. If you spot any errors or face any problems, please raise an issue there.
We will use the dataset for sentiment classification. The dataset contains 7086 labeled statements about movies. Label 1 means positive sentiment and label 0 means negative sentiment.
Preprocessing the data
For text analysis we first need to preprocess the data. After uploading the data, we split each sentence into words and populate the training set.
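A minimal sketch of this step, assuming each line of the raw file holds a label, a tab, and the statement. The name load_dataset and the two sample lines are made up for illustration, and the real file layout may differ.

```python
def load_dataset(lines):
    """Split raw lines into tokenized statements and integer labels.

    Assumes every non-empty line looks like "<label>\t<statement>".
    """
    x_train, y_train = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label, text = line.split("\t", 1)
        x_train.append(text.split())  # whitespace split into words
        y_train.append(int(label))
    return x_train, y_train


# Illustrative lines only, not taken from the real dataset.
sample_lines = [
    "1\tthis movie was great",
    "0\tthis movie was boring",
]
x_train, y_train = load_dataset(sample_lines)
```

With a real file, the same function can be called on the open file object, since iterating over it yields lines.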
Next we create the dictionary word_index, which maps words to integers for the embedding. The dictionary can become very large, so we count how often each word occurs in the training set and only include words that occur more than some threshold number of times.
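The counting and filtering could look roughly like this. The function name build_word_index, the parameter threshold, and the choice to start indices at 1 (keeping 0 free for padding) are assumptions of this sketch, not details confirmed by the tutorial code.

```python
from collections import Counter


def build_word_index(tokenized_statements, threshold=1):
    """Map each sufficiently frequent word to a positive integer.

    Index 0 is left unused so it can serve as the padding value (assumption).
    """
    counts = Counter(
        word for statement in tokenized_statements for word in statement
    )
    word_index = {}
    for word, count in counts.items():
        if count > threshold:  # keep words seen more than `threshold` times
            word_index[word] = len(word_index) + 1
    return word_index


statements = [["good", "movie"], ["good", "film"], ["bad", "movie"], ["good"]]
word_index = build_word_index(statements, threshold=1)
# "good" occurs 3 times and "movie" twice; "film" and "bad" are dropped.
```

Raising the threshold shrinks the dictionary, at the cost of ignoring rarer words.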
The process method turns a statement into words by removing punctuation.
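One possible shape for such a method is sketched below. Lowercasing and the use of ASCII punctuation only are assumptions added here; the tutorial only states that punctuation is removed.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def process(statement):
    """Lowercase a statement, strip punctuation, and split it into words."""
    return statement.lower().translate(_PUNCTUATION_TABLE).split()


words = process("I loved it, really!")
```

Note that apostrophes are removed as well, so a word like "don't" becomes "dont".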
After that we need to map each statement to a sequence of integers. To do so, we first find the statement with the largest number of words.
Now we can convert each statement into a sequence of integers using our dictionary word_index. Note that every sequence has the same fixed length max_tokens, which is the number of tokens in the longest statement. Shorter sequences are padded at the start (pre-padding), since it gives better results.
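The mapping and pre-padding can be sketched in plain Python as follows. The function name to_padded_sequences, the padding value 0, and the decision to drop words missing from word_index are assumptions of this sketch; the tutorial code may use a Keras utility instead.

```python
def to_padded_sequences(tokenized_statements, word_index):
    """Convert statements to equal-length integer sequences, padded in front."""
    # Length of the longest statement, counted in words.
    max_tokens = max(len(statement) for statement in tokenized_statements)
    padded = []
    for statement in tokenized_statements:
        # Words that are not in the dictionary are skipped (assumption).
        sequence = [word_index[w] for w in statement if w in word_index]
        # Pre-padding: zeros go in front so the real tokens sit at the end.
        padded.append([0] * (max_tokens - len(sequence)) + sequence)
    return padded, max_tokens


word_index = {"good": 1, "movie": 2}
statements = [["good", "movie", "tonight"], ["good"]]
x_pad, max_tokens = to_padded_sequences(statements, word_index)
```

Because unknown words are dropped, a sequence can end up shorter than its statement, and the missing positions are filled with zeros like any other padding.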
Now we are ready to create the Keras model. The first layer is the embedding layer, followed by 3 GRU layers. Finally we add a dense layer with sigmoid activation. We compile the model using an Adam optimizer.
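A hedged sketch of such an architecture is shown below. The embedding size, the GRU unit counts, the learning rate, and the binary cross-entropy loss are assumptions chosen for illustration; only the layer types, the sigmoid output, and the Adam optimizer come from the description above.

```python
def build_model(vocab_size, embedding_dim=8, units=(16, 8, 4)):
    """Embedding -> three stacked GRU layers -> sigmoid output, compiled with Adam."""
    # Imported lazily so this helper can be defined even where TensorFlow
    # is not installed; calling it does require TensorFlow.
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import GRU, Dense, Embedding
    from tensorflow.keras.optimizers import Adam

    model = Sequential([
        # vocab_size should cover every index in word_index plus padding.
        Embedding(input_dim=vocab_size, output_dim=embedding_dim),
        # The first two GRU layers return full sequences for the next GRU.
        GRU(units[0], return_sequences=True),
        GRU(units[1], return_sequences=True),
        # The last GRU returns only its final state.
        GRU(units[2]),
        # A single sigmoid unit gives a score between 0 and 1.
        Dense(1, activation="sigmoid"),
    ])
    model.compile(
        loss="binary_crossentropy",  # assumed loss for binary labels
        optimizer=Adam(learning_rate=1e-3),
        metrics=["accuracy"],
    )
    return model
```

If index 0 is reserved for padding, a natural call would be build_model(vocab_size=len(word_index) + 1).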
Then we train the model for 5 epochs with a batch size of 32, holding out 5% of the data for validation. The model reaches 97% accuracy on the validation set.
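With those settings, the training call might look like the sketch below. The wrapper name train is hypothetical, and x_train and y_train are assumed to be NumPy arrays holding the padded sequences and the labels, since Keras needs array-like data for validation_split.

```python
def train(model, x_train, y_train):
    """Fit the model with the settings quoted above."""
    return model.fit(
        x_train,
        y_train,
        validation_split=0.05,  # hold out 5% of the data for validation
        epochs=5,
        batch_size=32,
    )
```

The accuracy you see will depend on the data split and the random initialization, so it may not match the figure above exactly.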
We can inspect the model summary and then save the trained model to disk.
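Both steps are one-liners on a Keras model. In this sketch the helper name summarize_and_save and the file name sentiment_model.h5 are hypothetical; the tutorial's actual file name is not shown here.

```python
def summarize_and_save(model, path="sentiment_model.h5"):
    """Print the layer summary, then write the model to `path`."""
    model.summary()   # prints layers, output shapes, and parameter counts
    model.save(path)  # an .h5 suffix selects the HDF5 format
    return path
```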
Now that we are done with the model, we will port it to run in the browser. First we need to convert it into JSON format. Before this step you will need to install the tensorflowjs tools, for example with pip install tensorflowjs.
Then convert the Keras model into a format that tensorflowjs understands, for example with the tensorflowjs_converter command-line tool.
This creates one JSON file containing the model metadata, plus other files with names like group1-shard1of1 that contain the computed values of the weights.
Porting the model to the browser
To keep the processing simple, we load the dictionary generated by the Keras code.
word_index now contains the same word-to-index pairs as in the previous section. After that we need some helper methods to process the text, tokenize it, and map it to integers using the dictionary.
Now we can load the converted model into the browser.
Finally, we combine all the steps in one method.
Using the sample text, we get a prediction value of 0.98, which is very close to