Get ready for our pre-trained BERT

Hanna Bergenwald
Published in Peltarion · 5 min read · Nov 4, 2019

We’ve taken the next step in putting full-scale language model capabilities at our users’ fingertips in an easy, accessible way. After releasing Natural Language Processing (NLP) capabilities on our platform this September, we have now taken an even bigger leap. Let us introduce: BERT.

What is BERT and why it matters

BERT stands for Bidirectional Encoder Representations from Transformers and is a new, state-of-the-art deep learning language processing model. The idea and theory behind BERT were originally introduced in a 2018 research paper from Google AI Language.

Since its introduction, BERT has quickly outperformed other Natural Language Processing (NLP) methods, established itself as the state-of-the-art solution and proven a huge breakthrough for the entire industry.

BERT performs significantly better than previous language models

Here’s why:

  1. BERT is bidirectional. It understands the meaning of a word from its full context, looking both left and right. For example, take the two sentences “The cat didn’t eat the food because it wasn’t hungry” and “The cat didn’t eat the food because it wasn’t fresh.” For a human, it is clear that “it” refers to the cat in the first sentence and to the food in the second. Because BERT looks at the context in both directions (unlike traditional language models, which read in one direction only), it too can work out whether “it” refers to the cat or to the food. A short sketch after this list shows this bidirectional guessing in action.
  2. BERT uses attention, which lets it handle long stretches of text more easily. For example, consider a text containing the sentence “We have great neighbors where we live in Stockholm,” followed a few lines further down by “On one occasion last year, we had a party for some of our closest friends and accidentally played the music a little bit too loud. They became really angry.” Previous language models could not identify who “they” refers to because of the distance between the words “neighbors” and “they.” BERT, on the other hand, understands that it’s the neighbors being referred to.
  3. A BERT model can be used for many different purposes. Once pre-trained, BERT can be fine-tuned with relatively small amounts of labeled data for a domain-specific problem and achieve high accuracy very quickly.
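To make the bidirectionality concrete, here is a minimal fill-mask sketch using the open-source Hugging Face transformers library (an illustration on our part; the Peltarion Platform itself requires no code). BERT has to read both sides of the blank to rank plausible completions:

```python
# Minimal fill-mask sketch with the open-source transformers library.
# Illustration only: the Peltarion Platform requires no code.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# The best completion depends on context from both directions:
# a word like "hungry" fits the cat, "fresh" fits the food.
for prediction in unmasker(
    "The cat didn't eat the food because it wasn't [MASK].", top_k=3
):
    print(f"{prediction['token_str']:>10}  {prediction['score']:.3f}")
```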

Pre-trained BERT now available on the Peltarion Platform

BERT is a complex model with many stacked layers and parameters: 12 stacked encoder layers and 110 million parameters, to be exact. Building and training a BERT model from scratch is therefore very expensive, takes a lot of time and requires huge amounts of data. This is why we provide a pre-trained BERT model, so that you only need to input your data and define the size of the vocabulary you want to use.
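As a quick sanity check on those numbers, a couple of lines with the open-source transformers library confirm the layer and parameter counts of the base BERT model (again, an illustration; the platform hides these details from you):

```python
# Inspect the base BERT model with the open-source transformers library.
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
print(model.config.num_hidden_layers)              # 12 stacked encoder layers
print(sum(p.numel() for p in model.parameters()))  # roughly 110 million parameters
```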

What used to be an expensive and complex process, requiring weeks of training and knowledge of which specific code libraries to use, now takes just a few clicks. Using BERT on the Peltarion Platform gives you access to a pre-trained BERT model that can deliver very accurate results, even when trained on a small dataset.

Being able to train an existing BERT model on a domain-specific topic with a small amount of labeled data and a short training time can enable numerous companies to build business value quickly.

Get started using BERT

Currently, we provide a pre-trained BERT model for text classification in English. For recommendations on using our pre-trained BERT, check out this documentation.
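For a rough idea of what fine-tuning BERT for text classification involves under the hood, here is a minimal sketch using the open-source transformers library and a hypothetical two-example sentiment dataset. On the Peltarion Platform, the equivalent steps happen behind the graphical interface:

```python
# Rough sketch of fine-tuning BERT for text classification with the
# open-source transformers library. The tiny dataset is hypothetical;
# real fine-tuning loops over a labeled dataset for a few epochs.
import torch
from transformers import BertForSequenceClassification, BertTokenizer

texts = ["What a wonderful movie!", "A complete waste of time."]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # fresh classification head on top of BERT
)

# Turn raw text into the padded token tensors BERT expects.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# A single gradient step, just to show the shape of the training loop.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
print(f"loss after one step: {outputs.loss.item():.3f}")
```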

Text classification can be applied to a variety of different aspects, across different kinds of media, and for different functions.

Needless to say, the applications of text classification are plentiful. Here’s just a small selection of examples:

  • Tag content or products: Improve the browsing experience or identify related content on your website. Useful for outlets such as e-commerce sites, blogs or online news… you name it
  • Ticket classification: Save time and enable a quicker response by classifying incoming customer service requests as a first step, e.g., by topic, urgency or language. This way, the ticket ends up with the right person capable of answering the request straight away
  • Analyze sentiment: Get quick insight into how your customers perceive a product or campaign through sentiment analysis of sources such as social media feeds, forums or emails. This opens the door to new opportunities: continuously quantify how your customers feel about your brand and product by analyzing written communication, turning this into live KPIs you can track and act on straight away, or segment your customers based on how they talk about your product or brand online, so you can target each segment with a different strategy. A minimal sketch of a sentiment classifier follows this list
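As a taste of what a trained sentiment classifier produces, here is a minimal sketch using the transformers sentiment pipeline. Note that its default checkpoint is a distilled BERT variant fine-tuned on movie reviews, used here purely for illustration, not the Peltarion model:

```python
# Quick sentiment-analysis sketch with the open-source transformers library.
# The pipeline's default checkpoint is a distilled BERT variant fine-tuned
# for sentiment; shown only to illustrate a text classifier's output.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Our customers love the new release!"))
# Expect something like: [{'label': 'POSITIVE', 'score': 0.99...}]
```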

Ready to get started? Begin with our movie review tutorial here and learn the basics of training BERT for your needs.

What’s up next? Stay tuned for more advanced BERT models on the Peltarion Platform in the coming months, covering tasks such as similarity, search, analysis and question answering.

Originally published at peltarion.com.

Author

Hanna Bergenwald is the Head of Product at Peltarion. She has more than 10 years of experience in Product Management and worked at Google, Tele2 and Viaplay before joining Peltarion, most recently as Head of Product for the streaming service Viafree.
