Is Flair a suitable alternative to SpaCy?

Sapphire Duffy
6 min read · Jan 7, 2020

Flair vs SpaCy

Flair is a powerful open-source NLP (Natural Language Processing) library developed by Zalando Research. It is built directly on PyTorch, a well-regarded deep-learning framework, and is at version 0.4.3 at the time of writing. During this spike, I investigated both Flair and SpaCy to compare their pros and cons and to evaluate whether Flair is a suitable alternative to SpaCy.

Flair covers the following NLP tasks, including pre-trained models and support for training your own:

  1. Named Entity Recognition (NER)
  2. Part-of-Speech Tagging (PoS)
  3. Text Classification
  4. Training Custom Models

For this investigation, I focused specifically on NER (Named Entity Recognition), also known as entity chunking or entity extraction. NER classifies the named entities present in a text into pre-defined categories such as people, countries, organisations and dates. A few use cases of NER are:

  • Customer Support
  • Classifying content, for example for news providers
  • Search Algorithms
  • Recommendation Systems

NER example of Flair

In the diagram, the sentence at the bottom is fed as a character sequence into a bidirectional character language model that was pre-trained on extremely large unlabelled text corpora. For each word, a contextual embedding is retrieved by extracting the first and last character cell states from the language model. This word embedding is then passed into a vanilla BiLSTM-CRF sequence labeller.
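The extraction step described above can be illustrated with a toy sketch. It assumes my reading of the approach: the forward language model's state at a word's last character is concatenated with the backward model's state at its first character. The per-character states here are placeholder numbers rather than real language-model output, and `word_spans` and `word_embeddings` are hypothetical helper names, not part of Flair.

```python
def word_spans(text):
    """Return (word, start, end) for whitespace-separated words; end is inclusive."""
    spans, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word) - 1
        spans.append((word, start, end))
        pos = end + 1
    return spans


def word_embeddings(text, forward_states, backward_states):
    """Concatenate the forward state at each word's last character
    with the backward state at its first character."""
    return [
        (word, forward_states[end] + backward_states[start])
        for word, start, end in word_spans(text)
    ]


text = "George Washington went to Washington"
# Placeholder states: one tiny vector per character (a real model would
# produce these from the character language model).
forward_states = [[float(i)] for i in range(len(text))]
backward_states = [[float(-i)] for i in range(len(text))]

embeddings = word_embeddings(text, forward_states, backward_states)
for word, vector in embeddings:
    print(word, vector)
```

Because the states are read at character positions inside the sentence, the two occurrences of "Washington" receive different vectors, which is the point of a contextual embedding. In the real model the difference comes from the surrounding text, not just the position.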

Flair provides two pre-trained NER models. The architecture is identical in both, a bi-LSTM on top of a word embedding layer, but the NER dataset used to train each classifier is different.

Another impressive point about Flair is that, according to its authors, it outperforms the previous best methods on a range of NLP tasks.

Install Snippet Example:

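# Notebook cell: install Flair first
!pip3 install flair

from flair.data import Sentence
from flair.models import SequenceTagger

# Example input; replace with your own text
document = 'George Washington went to Washington.'

# Use .load('ner') to select the other pre-trained NER model
model = SequenceTagger.load('ner-ontonotes-fast')
sentence = Sentence(document)
model.predict(sentence)
sentence.to_dict(tag_type='ner')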
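To compare Flair's output with other libraries, it helps to reduce the dictionary to a set of (text, label) pairs. The sketch below assumes the dictionary returned by `to_dict(tag_type='ner')` has an `entities` list whose items carry `text` and `type` keys, which is my understanding of Flair 0.4.x; check this against your installed version. The `example` dictionary is hand-written for illustration and `entity_pairs` is a hypothetical helper.

```python
def entity_pairs(sentence_dict):
    """Reduce a Flair-style sentence dictionary to a set of (text, label) pairs.

    Assumes each item in sentence_dict['entities'] has 'text' and 'type' keys.
    """
    return {
        (entity["text"].strip(), entity["type"])
        for entity in sentence_dict.get("entities", [])
    }


# Hand-written stand-in for the result of sentence.to_dict(tag_type='ner')
example = {
    "text": "George Washington went to Washington .",
    "labels": [],
    "entities": [
        {"text": "George Washington", "start_pos": 0, "end_pos": 17, "type": "PER"},
        {"text": "Washington", "start_pos": 26, "end_pos": 36, "type": "LOC"},
    ],
}

pairs = entity_pairs(example)
print(pairs)
```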

What about SpaCy?

SpaCy is also a free, open-source library, built for advanced NLP in Python. It is designed for production use and helps you build applications that process and “understand” large volumes of text. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning. SpaCy has many features such as tokenisation, POS tagging, NER, dependency parsing, sentence boundary detection (SBD) and more.

SpaCy example

SpaCy uses a simple classifier for its NER model: a shallow feedforward neural network with a single hidden layer, made powerful by some clever feature engineering. Before any input features are fed into the classifier, a stack of weighted bloom embedding layers merges neighbouring features together. This gives each word a unique representation for each distinct context it appears in.
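The idea behind a bloom (hash) embedding can be shown with a small, self-contained sketch. This is not SpaCy's implementation: it is a toy version that hashes a feature string with several seeds into a small shared table and sums the selected rows, so that a compact table can still give most features a distinct vector. The table size, the number of hashes and the function names are all assumptions chosen for illustration.

```python
import hashlib
import random


def bucket(feature, seed, n_rows):
    """Map a feature string to a row index, using the seed to vary the hash."""
    digest = hashlib.md5(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_rows


def bloom_embed(feature, table, n_hashes=4):
    """Sum the rows selected by n_hashes different hashes of the feature."""
    n_rows, width = len(table), len(table[0])
    vector = [0.0] * width
    for seed in range(n_hashes):
        row = table[bucket(feature, seed, n_rows)]
        vector = [v + r for v, r in zip(vector, row)]
    return vector


# A small table of random weights; in a real model these rows are learned.
rng = random.Random(0)
table = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(50)]

print(bloom_embed("apple", table))
print(bloom_embed("Apple", table))
```

Two features only share a vector if all of their hashes collide, which is why the table can be much smaller than the vocabulary.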

Install Snippet example:

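# Notebook cell: download the large English model first
!python3 -m spacy download en_core_web_lg

import spacy

sp_lg = spacy.load('en_core_web_lg')
# `document` is the input text string
# Build the set of (entity text, label) pairs found in the document
{(ent.text.strip(), ent.label_) for ent in sp_lg(document).ents}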
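Once both libraries produce sets of (text, label) pairs, they can be compared with plain set operations. The sketch below is one hedged way to do that. The two input sets are hand-written placeholders, not real output from either library, and `compare_entities` is a hypothetical helper. It also assumes both models use the same label scheme; otherwise the labels need mapping first.

```python
def compare_entities(flair_entities, spacy_entities):
    """Split two sets of (text, label) pairs into shared and library-specific hits."""
    return {
        "both": flair_entities & spacy_entities,
        "flair_only": flair_entities - spacy_entities,
        "spacy_only": spacy_entities - flair_entities,
    }


# Placeholder inputs for illustration only
flair_entities = {("George Washington", "PERSON"), ("Washington", "GPE")}
spacy_entities = {
    ("George Washington", "PERSON"),
    ("Washington", "GPE"),
    ("first", "ORDINAL"),
}

report = compare_entities(flair_entities, spacy_entities)
for name, entities in report.items():
    print(name, sorted(entities))
```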

So is Flair a suitable alternative to SpaCy?

SpaCy is well documented and well engineered, which is why many people trust it. It is an open-source library, published and distributed under the MIT license, for performing simple to advanced NLP tasks such as tokenisation, part-of-speech tagging, named entity recognition, text classification, calculating semantic similarity between texts, lemmatisation and dependency parsing. It is also known as the fastest NLP framework out there. It is easy to learn and use because it offers a single, highly optimised tool for each task. SpaCy provides built-in word vectors, support is active and development is ongoing. However, the library has some drawbacks: its accuracy can be too limited for some use cases, and it does not support many languages.

Flair, on the other hand, is an open-source library designed to reach the state of the art in NER. It is modular enough to integrate new NLP developments easily. Flair supports a number of languages and, like SpaCy, is simple to use. One of its major drawbacks is that it is known to be slow. However, it can be optimised to the point where inference time is divided by 10.
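Since speed is the main trade-off here, it is worth measuring it on your own data. The following is a minimal timing sketch, not a rigorous benchmark: `time_inference` is a hypothetical helper that accepts any callable, so either library's prediction step can be wrapped and passed in. The `fake_predict` function is a stand-in so the example runs without either library installed.

```python
import time


def time_inference(predict, documents, repeats=3):
    """Return the best average time per document (in seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for doc in documents:
            predict(doc)
        best = min(best, time.perf_counter() - start)
    return best / len(documents)


def fake_predict(doc):
    """Stand-in for a real NER call; it only splits the text into words."""
    return doc.split()


documents = ["George Washington went to Washington."] * 100
seconds_per_doc = time_inference(fake_predict, documents)
print(f"{seconds_per_doc:.8f} seconds per document")
```

Taking the best of several runs reduces noise from one-off effects such as model warm-up. For GPU models, batching and input length will also affect the numbers considerably.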

The table below outlines the pros and cons of SpaCy and Flair:

Case Study — SpaCy vs Flair to anonymise French Legal Cases:

A company called Lefebvre Sarrut reported some interesting findings on the NER libraries SpaCy and Flair, from a collaboration with the French administration and a French Supreme Court. They decided to switch from SpaCy to Flair for several reasons. They stated that SpaCy’s accuracy was too limited for their needs, and that the out-of-the-box accuracy of Flair was better than SpaCy’s on their data by a large margin, even after their improvements to SpaCy. Flair, however, was slow: processing their complete inventory would have taken 30 days on a single recent GPU.

SpaCy didn’t have a ready-to-download, pre-trained generic French language model. A feature for this did exist, but it wasn’t very well documented. Using SpaCy on languages other than English may therefore show poor results with pre-trained NER models. Flair surprised them, because it is sometimes described as a library that does not meet “industrial performance”. Flair is modular by design, meaning that you can choose the representation you want to use. Zalando shared many models in their research paper, and the team used a combination of them: a mix of FastText embeddings and a character-based pre-trained language model (both trained on French Wikipedia), with no fine-tuning of the language model on legal data.

Overall, after optimisation, processing the complete inventory takes less than 3 days instead of 30. A three-day run is something they can perform once or twice a year (when the model is significantly improved, for instance because of more annotated data). SpaCy is still faster, but the optimisations they describe make Flair a better solution for their use case.

Other alternative tools:

There are many other pre-trained NER models out there provided by popular open-source NLP libraries:

  • NLTK
  • GATE
  • Polyglot
  • DeepPavlov
  • AllenNLP
  • Stanford CoreNLP

The graph below shows the pros and cons of popular NLP libraries:

Overall Findings:

Between Flair and SpaCy, which library is superior really depends on the use case. As mentioned, SpaCy is faster, but the optimisations make Flair a far better solution for certain use cases. SpaCy is very popular and its documentation is phenomenal. However, I do think the future looks bright for Flair.

Sapphire Duffy

AI Engineer @ Kainos | Director Women Who Code Belfast #Product #Techie #Community