Using correlation to find similar stocks (including video)

Fidap Analytics
Fidap
Published in
3 min readMar 12, 2022

There are a few reasons you may want to find similar stocks —

  1. Find competitors
  2. Find alternative investments
  3. Avoid having too concentrated of a portfolio

A quick Youtube summary of this article is available as well —

If you want to find companies similar to company X, there are several methods you can try.

First, you can take the characteristics of the company — perhaps the sector and the market cap, and find similar companies that way.

A slightly more quantitative alternative is to use correlations to identify which other stocks move similarly.

Let’s use Snowflake as an example here. Snowflake is a database company with a market cap of about $55bn. As a baseline, let’s try the first method out.

First, we grab the market cap, sector, and industry (query is here).

SELECT
ticker,
name,
marketcap,
sector,
industry
FROM
fidap.US_EQUITIES.TICKERS
WHERE
ticker = 'SNOW'

Results are here —

So let’s see similar stocks to Snowflake by looking at companies in the same industry with a similar market cap — between say $30bn and $80bn.

SELECT
ticker,
name,
marketcap,
sector,
industry
FROM
fidap.US_EQUITIES.TICKERS
WHERE
sector = 'Technology'
AND marketcap > 30e9
AND marketcap < 80e9
AND industry = 'Software - Application'

Results are here —

We get a smattering of similar-ish companies. We could add in constraints around revenue growth, IPO’ed around the same time, etc.

Let’s try the other method — the correlation method. The query is here.

Here are the results —

There’s only one company in both lists — Datadog. Interestingly, there are more companies in the “Software-Infrastructure” industry that are correlated with Snowflake vs “Software-Application.” I’d argue that Snowflake is more infrastructure than it is an application as a database company. At a quick glance, the companies surfaced via the correlation method feel a bit more similar to Snowflake.

Let’s try this with another company — Coinbase. Here are the companies most correlated with Coinbase —

While this picked up some crypto heavy companies, such as Argo, Microstrategy, Riot and Marathon, it also picked up a few more seemingly random companies like Toast and Udemy.

The query itself is actually reasonably complex —

WITH main_stock AS 
(SELECT DATE, TICKER AS main_ticker, PC_1D AS main_pc
FROM FIDAP.US_EQUITIES.DAILY
WHERE DATE >= DATEADD(day, -500, CURRENT_DATE()) AND TICKER = 'COIN' AND main_pc IS NOT NULL AND main_pc != 'inf'
),
other_stock AS
(SELECT d.DATE, d.TICKER AS other_ticker, t.NAME AS company_name, round(t.marketcap/1e9,2) as mc_bn, t.industry, d.PC_1D AS other_pc
FROM FIDAP.US_EQUITIES.DAILY AS d
JOIN FIDAP.US_EQUITIES.TICKERS AS t
ON d.TICKER = t.TICKER
WHERE d.DATE >= DATEADD(day, -500, CURRENT_DATE()) AND d.TICKER != 'COIN' AND other_pc IS NOT NULL AND other_pc != 'inf' AND
t.MARKETCAP >= 1e9 AND t.isdelisted = 'N'
)
SELECT DISTINCT main_ticker, other_ticker, company_name, industry, mc_bn, round(CORR(main_pc, other_pc),2) AS correlation
FROM main_stock AS a
JOIN other_stock AS o
ON a.DATE = o.DATE
GROUP BY main_ticker, other_ticker, company_name, industry, mc_bn
HAVING correlation IS NOT NULL
ORDER BY correlation DESC
LIMIT 10

As you can see above, we are using a 500 day moving average, and restricting the market cap to > $1bn. You can fork the query and change the tickers.

--

--