Similarity Company Analysis: A Free Model for Comprehensive Competitor Identification

Maxime Carrière
Nov 30, 2023

When you invest or start a business in a specific industry, identifying competitors can be challenging. Platforms like Refinitiv or Bloomberg offer this service, but their average cost exceeds 20k per year. To address this issue, I’ve created a free model that finds similar companies based on publicly available online information.

How does the model work?

Model breakdown

The model has three parts: (1) a search model, (2) a parse model, and (3) a matching model.

Using just the company name, website, and a brief description, the search model retrieves pages from Google and other sources.

Next, the parse model eliminates common words and focuses on the important information, using word frequency and semantic relevance tools like KeyBERT and RAKE.
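As a rough illustration of the word-frequency part of this step, here is a minimal sketch using only the Python standard library. It is not the actual parse model: the stop-word list, the `extract_keywords` function, and the sample text are all hypothetical, and tools like KeyBERT or RAKE would add semantic relevance on top of simple counts.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real parser would use a far larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is",
              "are", "on", "with", "that", "its", "it", "as", "by", "also"}

def extract_keywords(text, top_n=5):
    """Return the top_n most frequent words after removing common words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]

# Hypothetical page text standing in for retrieved search results.
page = ("Tesla designs electric cars and batteries. "
        "The batteries power cars and energy storage for homes. "
        "Tesla is also developing autonomous cars.")

keywords = extract_keywords(page, top_n=3)
print(keywords)
```

Frequency alone tends to favor generic terms, which is presumably why the model pairs it with semantic relevance tools.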

Lastly, the matching model compares these extracted words with our database, which includes information on over 70,000 companies worldwide.
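To make the matching idea concrete, the sketch below scores a set of extracted words against a small database using cosine similarity over word counts. This is a simplified stand-in that assumes plain word overlap; the real model compares semantic similarity, and the company names, keywords, and function names here are invented for illustration.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Cosine similarity between two bags of words (0.0 when either is empty)."""
    a, b = Counter(words_a), Counter(words_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical miniature database: company name -> extracted keywords.
DATABASE = {
    "AutoCo": ["electric", "cars", "batteries"],
    "BankCo": ["loans", "banking", "payments"],
    "GridCo": ["batteries", "energy", "storage"],
}

def rank_matches(query_words, database):
    """Return (company, score) pairs sorted from most to least similar."""
    scores = {name: cosine_similarity(query_words, words)
              for name, words in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranked = rank_matches(["cars", "batteries", "autonomous"], DATABASE)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```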

Similarity Matrix

The matching model generates a matrix showing how semantically similar a company’s information is to our database (left). By applying straightforward Hierarchical Clustering, you can see distinct clusters in darker colors, each representing various fields (right). For example, the large cluster in the center is linked to tech companies, further divided into smaller sectors.
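The clustering step can be sketched on a toy matrix. The example below is a minimal pure-Python agglomerative clustering with average linkage, which is one common form of hierarchical clustering; the article does not say which linkage the model uses, so that choice, the function name, and the 5×5 matrix are assumptions.

```python
def average_linkage_clusters(similarity, n_clusters):
    """Merge the two most similar clusters (by average pairwise similarity)
    until only n_clusters remain. Returns lists of row indices."""
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(similarity))]

    def avg_sim(c1, c2):
        total = sum(similarity[i][j] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None  # (similarity, index_x, index_y) of the best pair so far
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Hypothetical similarity matrix: companies 0-2 resemble each other, as do 3-4.
SIM = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.1],
    [0.1, 0.2, 0.1, 1.0, 0.85],
    [0.2, 0.1, 0.1, 0.85, 1.0],
]

groups = average_linkage_clusters(SIM, 2)
print(groups)
```

Reordering the rows and columns of the matrix by cluster is what makes the blocks along the diagonal visible. For a database of tens of thousands of companies, an optimized library routine such as those in `scipy.cluster.hierarchy` would be a more practical choice than this quadratic loop.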

What is the output?

The model provides the top 10 matches for both public and private companies. It includes their countries of operation, websites, and a confidence index (green: high; orange: medium; red: low).
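One way such an output could be assembled is shown below: sort the similarity scores, keep the top entries, and map each score to a color. The thresholds (0.7 and 0.4), the function names, and the sample scores are hypothetical; the article does not state how the confidence index is actually computed.

```python
def confidence_label(score, high=0.7, medium=0.4):
    """Map a similarity score to a color band (thresholds are assumptions)."""
    if score >= high:
        return "green"
    if score >= medium:
        return "orange"
    return "red"

def top_matches(scores, n=10):
    """Return the n best (company, score, color) tuples, best first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return [(name, score, confidence_label(score)) for name, score in ranked]

# Hypothetical similarity scores for a handful of companies.
sample_scores = {"AutoCo": 0.82, "GridCo": 0.55, "BankCo": 0.12, "ChipCo": 0.71}

results = top_matches(sample_scores, n=3)
for name, score, color in results:
    print(f"{name}: {score:.2f} ({color})")
```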

Why choose this over Bloomberg?

This model’s advantage lies in its reliance on up-to-date online research, which keeps the information on companies current. Additionally, the flexibility of the short description allows for a detailed breakdown of a company. For example, Tesla operates in several areas: cars, batteries, and autonomous vehicles. If the description focuses on a specific sub-sector, the model identifies competitors within that sector while maintaining an overall understanding of the industry.

Where to try? How to help?

The model is entirely free to use at https://tailskew.com/, and no information is needed. Feel free to use it and provide as much feedback as possible to help enhance the model and share what you’d like to see in the future!

Maxime Carrière
