Understanding BERT with Huggingface
Using BERT and Huggingface to create a Question Answer Model
In my last post on BERT, I talked in some detail about BERT transformers and how they work at a basic level. I went through the BERT architecture, training data, and training tasks.
But, as I like to say, we don’t really understand something until we implement it ourselves. So, in this post, we will implement a question answering neural network using BERT and the HuggingFace library.
What is a Question Answering Task?
In this task, we give a question and a paragraph containing the answer to our BERT architecture, and the objective is to determine the start and end span of the answer within that paragraph.
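To make the task concrete, here is a quick sketch using the ready-made transformers pipeline, with a toy paragraph and question of my own (the pipeline downloads a default pretrained QA model under the hood):

```python
# A quick illustration of the QA task with the transformers pipeline.
# The paragraph and question are toy examples for illustration.
from transformers import pipeline

qa = pipeline("question-answering")  # downloads a default pretrained QA model

context = "BERT was released by researchers at Google in 2018."
question = "Who released BERT?"

result = qa(question=question, context=context)
# The pipeline returns the answer text along with the start and end
# character indices of the answer span inside the paragraph.
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'Google'}
```

This is exactly the start/end span prediction we will build up ourselves in the rest of the post.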
As explained in the previous post, in the above example we provide two inputs to the BERT architecture: the paragraph and the question, separated by the [SEP] token. The purple layers are the output of the BERT encoder. We now define two vectors, S and E (which will be learned during fine-tuning), both of shape (1×768). We then get start and end scores by taking the dot product of these vectors with each of the token output vectors, and apply a softmax over those scores to get a probability for every position.
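Here is a simplified sketch of this span-prediction head in PyTorch. I pack the S and E vectors together as the two output rows of a Linear(768, 2) layer, which is also how HuggingFace’s BertForQuestionAnswering arranges its head; the head below is untrained and only meant to show the shapes involved.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

# The two rows of this layer's weight matrix play the role of S and E
# (both of shape 1x768, learned during fine-tuning).
qa_head = nn.Linear(768, 2)

question = "Who released BERT?"
paragraph = "BERT was released by researchers at Google in 2018."
inputs = tokenizer(question, paragraph, return_tensors="pt")  # adds [CLS] and [SEP]

with torch.no_grad():
    token_outputs = bert(**inputs).last_hidden_state  # shape: (1, seq_len, 768)

logits = qa_head(token_outputs)              # dot products with S and E -> (1, seq_len, 2)
start_logits, end_logits = logits.split(1, dim=-1)
start_probs = start_logits.squeeze(-1).softmax(dim=-1)  # softmax over token positions
end_probs = end_logits.squeeze(-1).softmax(dim=-1)
print(start_probs.shape, end_probs.shape)    # torch.Size([1, seq_len]) each
```

During fine-tuning, a cross-entropy loss between these probabilities and the true start and end positions is what actually teaches the model the S and E vectors.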