Deploying BERT on Heroku

Het Pandya
Published in Analytics Vidhya · 5 min read · Aug 10, 2020

I’ve always been a big fan of Natural Language Processing. Since I love machines, I’ve always looked for ways to communicate with them too.

Isn’t it cool when you ask your machine something and it answers back? 😍

Image source — Iron Man

BERT (Bidirectional Encoder Representations from Transformers) can be used for one such application, i.e. Question Answering. You give the deep learning model a paragraph to read and then ask it a question about that paragraph. There are many such applications where BERT has proved to be strong in NLP (Natural Language Understanding, to be more precise). BERT was introduced in 2018, and since then variations of it have come into existence, like ALBERT, RoBERTa, MobileBERT, etc. You can read the original paper here.

We shall use a BERT model from Hugging Face that has been fine-tuned on a question-answering dataset, the Stanford Question Answering Dataset (SQuAD), and deploy the fine-tuned model on Heroku for real-time inference. You can find all the material for this post on my GitHub repo.

Now, let’s begin coding.

Note

Make sure you follow this folder structure:

/web-app
|--templates
|----index.html
|--app.py
|--requirements.txt
|--Procfile

Step 1

Since this has to be deployed on Heroku, let’s make sure Heroku installs all the libraries needed to run the program.

Make a file named “requirements.txt” and put the following libraries in the file:

https://download.pytorch.org/whl/cpu/torch-1.3.1%2Bcpu-cp36-cp36m-linux_x86_64.whl
transformers==3.0.2
numpy==1.19.1
flask
joblib==0.16.0
sentencepiece==0.1.91
urllib3==1.25.10

Step 2

Make a file called “app.py” and put the following code in it:

import os

import torch
from flask import Flask, render_template, request
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

name = "mrm8488/bert-small-finetuned-squadv2"

tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForQuestionAnswering.from_pretrained(name)


def answer_question(question, answer_text):
    '''
    Takes a `question` string and an `answer_text` string and tries to
    identify the words within `answer_text` that answer the question.
    '''
    # Tokenize the input text and get the corresponding token indices
    token_indices = tokenizer.encode(question, answer_text)

    # Search the input indices for the first instance of the `[SEP]` token.
    sep_index = token_indices.index(tokenizer.sep_token_id)

    # The question (plus `[CLS]` and `[SEP]`) forms the first segment.
    seg_one = sep_index + 1

    # The remaining tokens lie in the second segment.
    seg_two = len(token_indices) - seg_one

    # Construct the list of 0s and 1s.
    segment_ids = [0] * seg_one + [1] * seg_two

    # Get the answer scores for the question
    start_scores, end_scores = model(torch.tensor([token_indices]),  # The tokens representing our combined question and answer text
                                     token_type_ids=torch.tensor([segment_ids]))  # The segment IDs to differentiate question from answer text

    # Find the tokens with the highest `start` and `end` scores.
    answer_begin = torch.argmax(start_scores)
    answer_end = torch.argmax(end_scores)

    # Get the string versions of the input tokens.
    indices_tokens = tokenizer.convert_ids_to_tokens(token_indices)

    answer = indices_tokens[answer_begin:answer_end + 1]
    # Remove special tokens
    answer = [word.replace("▁", "") if word.startswith("▁") else word for word in answer]  # use this when using model "twmkn9/albert-base-v2-squad2"
    answer = " ".join(answer).replace("[CLS]", "").replace("[SEP]", "").replace(" ##", "")

    return answer


app = Flask(__name__)


@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        form = request.form
        result = []
        bert_abstract = form['paragraph']
        question = form['question']
        result.append(form['question'])
        result.append(answer_question(question, bert_abstract))
        result.append(form['paragraph'])

        return render_template("index.html", result=result)

    return render_template("index.html")


if __name__ == '__main__':
    port = int(os.environ.get("PORT", 5000))
    app.run(host='0.0.0.0', port=port)

This will download the BERT model and the tokenizer that goes with it. We have used “mrm8488/bert-small-finetuned-squadv2” from Hugging Face since it is considerably smaller than the other BERT models, and we only have 512 MB of slug space on the Heroku free tier. Next, we create a Flask server to receive the inputs in the form of paragraphs and questions.
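To make the bookkeeping inside answer_question concrete, here is a small sketch of the segment-ID construction and the word-piece cleanup, using hand-made token IDs and tokens (the numbers and strings below are invented for illustration; a real tokenizer produces them for you):

```python
# Toy token IDs: [CLS] who made it ? [SEP] tony made it . [SEP]
SEP_ID = 102  # BERT's conventional [SEP] id; illustrative only
token_indices = [101, 2040, 2081, 2009, 1029, 102, 4116, 2081, 2009, 1012, 102]

# The first [SEP] closes the question segment.
sep_index = token_indices.index(SEP_ID)
seg_one = sep_index + 1                 # [CLS] + question + [SEP]
seg_two = len(token_indices) - seg_one  # the paragraph segment
segment_ids = [0] * seg_one + [1] * seg_two

print(segment_ids)  # -> [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Word-piece cleanup, as done at the end of answer_question:
tokens = ["[CLS]", "tony", "stark", "created", "the", "mark", "##s", "[SEP]"]
answer = " ".join(tokens).replace("[CLS]", "").replace("[SEP]", "").replace(" ##", "")
print(answer.strip())  # -> tony stark created the marks
```

Note how "##" word pieces are glued back onto the preceding token, so "mark" + "##s" becomes "marks".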

Now, make a folder named “templates” and, inside it, create a file named “index.html”. Put the following code in the file:

<!DOCTYPE html>
<html>
<head>
    <title>Bert Question Answering</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <link href="//netdna.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" rel="stylesheet" media="screen">
    <style>
        .container {
            max-width: 1000px;
        }
    </style>
</head>
<body>
    <div class="container">
        <div class="row-sm-5 row-sm-offset-1">
            <h4>Enter a paragraph and a question to test BERT</h4>
            <form role="form" method='POST' action='/'>
                <div class="form-group">
                    <textarea name="paragraph" class="form-control" id="url-box" placeholder="Enter a paragraph" style="max-width: 300px;" autofocus required>
{% if result %}
{{ result[2] }}
{% endif %}
</textarea>
                    <br>
                    <input type="text" name="question" class="form-control" id="question-box" placeholder="Enter a question" style="max-width: 300px;" required>
                </div>
                <button type="submit" class="btn btn-default">Predict</button>
            </form>
            <br>
        </div>

        <div class="row-sm-5 row-sm-offset-1">
            {% if result %}
            <h4>Question = {{ result[0] }}</h4>
            <h4>Answer = {{ result[1] }}</h4>
            {% endif %}
        </div>
    </div>
</body>
</html>

With the above code, we create a form to receive the inputs.

Step 3

Create a file named “Procfile”, without any extension, and put the following line in it:

web: python app.py

This tells Heroku what command to run once the application has been deployed.
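As a side note, Heroku’s docs generally recommend a production WSGI server such as gunicorn instead of Flask’s built-in development server. If you prefer that route, a Procfile along these lines would work (gunicorn would also need to be added to requirements.txt):

```
web: gunicorn app:app --bind 0.0.0.0:$PORT
```

Heroku supplies the PORT environment variable at runtime, so gunicorn binds to whatever port the dyno was assigned.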

Step 4

We have our code ready. Now let’s talk to Heroku. Make sure you have the Heroku CLI and git installed.

Once done, type the following commands into your command terminal:

heroku login

This will log you in to the Heroku CLI.

Next, type this to create a Heroku application:

heroku create your_app_name

Your app name can be anything, as long as it is unique.

Then type the following commands to deploy your app to Heroku:

git init
git add .
git commit -m 'initial commit'
git push heroku master

Hurrayyy! Your app has been deployed! If anything goes wrong, `heroku logs --tail` shows the build and runtime logs. Let’s see how it performs.

Open a browser window and type the following web address:

https://your_app_name.herokuapp.com

You should see a web page like this:

BERT Question and Answering Page

Since I am a huge fan of Iron Man, I shall use this paragraph taken from here:

Just A Rather Very Intelligent System (J.A.R.V.I.S.) was originally Tony Stark’s natural-language user interface computer system, named after Edwin Jarvis, the butler who worked for Howard Stark. Over time, he was upgraded into an artificially intelligent system, tasked with running business for Stark Industries as well as security for Tony Stark’s Mansion and Stark Tower. After creating the Mark II armor, Stark uploaded J.A.R.V.I.S. into all of the Iron Man Armors, as well as allowing him to interact with the other Avengers, giving them valuable information during combat.

Let’s ask BERT ‘who created Mark II?’

BERT inference results
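If you would rather script the request than use the form, the same POST can be sent with Python’s standard library. This is a sketch: the URL is a placeholder for your own app’s name, and the field names simply match those in index.html.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

APP_URL = "https://your_app_name.herokuapp.com/"  # placeholder

# The field names must match the form inputs in index.html.
payload = urlencode({
    "paragraph": "J.A.R.V.I.S. was originally Tony Stark's natural-language "
                 "user interface computer system.",
    "question": "who created Mark II",
}).encode()

req = Request(APP_URL, data=payload)  # supplying data makes this a POST

# Uncomment once your app is deployed; the answer appears in the
# rendered page after "Answer =".
# html = urlopen(req).read().decode()
```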

I have deployed my BERT app here if you wish to give it a try.

And that’s it for now for deploying BERT on Heroku. Thank you for reading!😄
