Finding Startups with BERT

How we use Natural Language Processing to identify new companies.

Axel Springer Tech · Apr 1, 2020

by Christoph Schwienheer

An important part of our work at hy is the systematic analysis of start-up companies and venture capital investments. Alongside comprehensive sector studies, the analysis of competitive fields and the identification of new partners or investment opportunities play a central role.

Semantic search using sample companies

In addition to proprietary company databases, we carry out customer projects with scouting technology we developed in-house. The internal database stores, indexes, and analyses full-text data such as company descriptions, website content, and news texts. Building on classic search technology such as Elasticsearch and query expansion, we keep testing new techniques to further improve our scouting results. Most recently, we developed new software based on BERT (Bidirectional Encoder Representations from Transformers), which helps us find a large number of similar start-ups based on just a few reference companies.

Similarity search with BERT: The tool efficiently finds companies that are similar to given example companies.
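To make the idea concrete, here is a minimal sketch of example-based retrieval, not our production system: it encodes descriptions with a BERT-family sentence encoder (the model name and the toy company data below are illustrative assumptions) and ranks candidates by cosine similarity to the averaged reference vector.

```python
# Illustrative sketch of example-based company search (not hy's production
# code): encode descriptions with a BERT-family sentence encoder, average
# the reference vectors, and return the nearest candidates.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any BERT-style encoder works

candidates = {  # hypothetical companies, for illustration only
    "Acme Pay":   "Mobile payment and banking app for freelancers.",
    "FieldSense": "Drone-based crop monitoring for precision agriculture.",
    "LoanLeaf":   "Digital lending platform for small businesses.",
}
references = ["Neobank offering accounts and credit to SMEs."]

cand_vecs = model.encode(list(candidates.values()), normalize_embeddings=True)
query_vec = model.encode(references, normalize_embeddings=True).mean(axis=0)
query_vec /= np.linalg.norm(query_vec)  # re-normalize the averaged query

scores = cand_vecs @ query_vec  # cosine similarity on unit vectors
for name, score in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {name}")
```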

The Technology: Basics of BERT

BERT implements several techniques that make it a powerful basis for start-up similarity analysis. First, the model processes language in context and thus recognizes that a word's meaning depends on its usage. For example, it can tell that the word “bank” expresses something different in the company descriptions “specialist for river bank reinforcement” and “bank and financial services provider”. The model also learns which kinds of words matter for determining similarity and which play a less decisive role.
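As an illustration, the following sketch (using the public Hugging Face `transformers` library and the `bert-base-uncased` checkpoint as stand-ins, not necessarily the model we deploy) extracts the contextual vector of “bank” from different sentences and compares them:

```python
# Sketch: BERT assigns context-dependent vectors to the same word.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of `word` inside `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (tokens, 768)
    position = inputs["input_ids"][0].tolist().index(
        tokenizer.convert_tokens_to_ids(word))
    return hidden[position]

financial  = embed_word("The bank approved the loan application.", "bank")
financial2 = embed_word("She deposited her savings at the bank.", "bank")
river      = embed_word("They reinforced the river bank after the flood.", "bank")

cos = torch.nn.CosineSimilarity(dim=0)
print("financial vs river bank:", cos(financial, river).item())
print("financial vs financial :", cos(financial, financial2).item())
# The two financial usages score noticeably higher: BERT separates the
# word senses by context.
```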

Furthermore, BERT lets us use a training approach known as transfer learning. The method has been applied successfully in computer vision for several years, where it enabled a great leap forward, and after intensive research the breakthrough for language processing finally came in 2018. A model is first trained on a very large amount of full-text data, such as news or Wikipedia articles, and learns many facets of the basic structure of language. This serves as a foundation into which additional data from the target domain can be incorporated more efficiently. Finally, the model is trained to solve the actual target problem, such as text classification or similarity analysis. This step-by-step training allows large pre-trained models to be reused, so that the target problem can be solved precisely with relatively little annotated data; the sketch after the figure below illustrates these stages.

Transfer learning: a model pre-trained on large text corpora is adapted step by step to the target domain and task.
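A minimal sketch of these stages, assuming the Hugging Face `transformers` and `datasets` libraries and toy labelled examples in place of real domain data:

```python
# Sketch of staged training: reuse pre-trained BERT weights and fine-tune
# a fresh classification head on a small labelled set (toy data below).
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Stage 1 happened upstream: the encoder was pre-trained on generic text.
# Only the task head below starts from random initialization.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

data = Dataset.from_dict({
    "text": ["Digital lending platform for SMEs.",
             "Industrial screw and fastener manufacturer."],
    "label": [1, 0],  # e.g. fintech vs. not fintech (illustrative labels)
}).map(lambda b: tokenizer(b["text"], truncation=True, padding="max_length",
                           max_length=64), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-finetuned", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()  # thanks to pre-training, little labelled data is needed
```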

A lot has happened since the release of BERT in November 2018. Google and Microsoft already use the technology in production, and the research community keeps reporting ever-improving models. Many AI research groups at large technology companies (e.g. Facebook, Salesforce) and universities are currently extending the basic architecture or optimizing it for specific use cases.

“In light of the impressive empirical results of ELMo, ULMFiT, and OpenAI it only seems to be a question of time until pretrained word embeddings will be dethroned and replaced by pretrained language models in the toolbox of every NLP practitioner. This will likely open many new applications for NLP in settings with limited amounts of labeled data. The king is dead, long live the king!”

An accurate forecast by Sebastian Ruder in July 2018

Integration into the Ecosystem Manager

Due to our extensive project history, we have access to a large amount of manually curated and categorized company data. Part of it serves as training data, from which the model learns which kinds of language constructs matter for similarity analysis in general. Another part serves as evaluation data, letting us check how well the new development performs compared to previous approaches. Using this data and the BERT base model, we implemented a similarity analysis that predicts the degree of similarity of previously unseen companies, giving us a powerful tool that efficiently supports our scouting for new start-ups.
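As a hedged sketch of what such a pipeline can look like (with illustrative pairs and similarity labels in place of our curated data), a BERT-based bi-encoder can be fine-tuned on labelled company pairs and then score previously unseen pairs:

```python
# Sketch: fine-tune a bi-encoder on labelled company pairs, then predict
# the similarity degree of unseen pairs (toy data, illustrative labels).
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util

model = SentenceTransformer("all-MiniLM-L6-v2")

train_examples = [  # label = curated similarity degree in [0, 1]
    InputExample(texts=["Neobank for freelancers.",
                        "Digital lending platform for SMEs."], label=0.8),
    InputExample(texts=["Neobank for freelancers.",
                        "Drone-based crop monitoring."], label=0.1),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)
model.fit(train_objectives=[(loader, losses.CosineSimilarityLoss(model))],
          epochs=1, warmup_steps=0)

# Score a previously unseen pair.
a, b = model.encode(["Payment app for gig workers.",
                     "Mobile banking for the self-employed."])
print(util.cos_sim(a, b).item())
```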

We have integrated this technology into the hy Ecosystem Manager, making this new approach accessible to our customers. Interested? Sign up for a free demo version at www.ecosystem-manager.co

This article was originally published on hy.co
