Diversify Your Stock Portfolio with Graph Analytics

Learn how to use the correlation between stock prices to infer a similarity network between stocks, and then use that network to help diversify your portfolio.

Photo by Daniel Lloyd Blunk-Fernández on Unsplash
Graph model schema. Image by the author.
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/stocks/stock_prices.csv" as row
MERGE (s:Stock{name:row.Name})
CREATE (s)-[:TRADING_DAY]->(:StockTradingDay{date: date(row.Date), close:toFloat(row.Close), volume: toFloat(row.Volume)});
MATCH (s:Stock)-[:TRADING_DAY]->(day)
WITH s, day
ORDER BY day.date ASC
WITH s, collect(day) as nodes, collect(day.close) as closes
SET s.close_array = closes
WITH nodes
CALL apoc.nodes.link(nodes, 'NEXT_DAY')
RETURN distinct 'done' AS result
Linked list between trading days for a single stock. Image by the author.
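As a rough illustration of what the query above does for a single stock, here is a minimal Python sketch. The dates and prices are made up, and the variable names are hypothetical. It sorts the trading days by date, collects the closing prices into an array, and pairs each day with the following one, which is what the NEXT_DAY relationships represent.

```python
from datetime import date

# Hypothetical trading-day records for one stock, deliberately out of order
days = [
    {"date": date(2021, 5, 5), "close": 12.5},
    {"date": date(2021, 5, 3), "close": 12.0},
    {"date": date(2021, 5, 4), "close": 12.2},
]

# Order the days chronologically, like ORDER BY day.date ASC
ordered = sorted(days, key=lambda d: d["date"])

# Ordered closing prices, analogous to the close_array property
close_array = [d["close"] for d in ordered]

# Consecutive (day, next day) pairs, analogous to NEXT_DAY relationships
next_day = [(a["date"], b["date"]) for a, b in zip(ordered, ordered[1:])]
```

A list of n trading days yields n - 1 links, so the chain has a clear first and last day.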

Inferring relationships based on the correlation coefficient

We will use the Pearson similarity as the correlation metric. The authors of the above-mentioned research paper use more sophisticated correlation metrics, but that is beyond the scope of this blog post.

MATCH (s:Stock)
WITH {item:id(s), weights: s.close_array} AS stockData
WITH collect(stockData) AS input
CALL gds.alpha.similarity.pearson.write({
data: input,
topK: 3,
similarityCutoff: 0.2
})
YIELD nodes, similarityPairs
RETURN nodes, similarityPairs
A subgraph of the inferred similarity network between stock tickers. Image by the author.
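To make the Pearson step more concrete, here is a minimal Python sketch of the idea behind the topK and similarityCutoff parameters. It is a conceptual illustration only, not how the Graph Data Science library implements the procedure. The tickers and prices are made up, the helper names `pearson` and `top_k_similar` are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / sqrt(var_x * var_y)

def top_k_similar(closes, top_k=3, cutoff=0.2):
    """For each ticker, keep at most top_k other tickers whose
    correlation is at or above the cutoff, strongest first."""
    result = {}
    for a, series_a in closes.items():
        scored = [
            (b, pearson(series_a, series_b))
            for b, series_b in closes.items()
            if b != a
        ]
        kept = [(b, score) for b, score in scored if score >= cutoff]
        kept.sort(key=lambda pair: pair[1], reverse=True)
        result[a] = kept[:top_k]
    return result

# Hypothetical closing prices: AAA and BBB move together, CCC moves against them
closes = {
    "AAA": [10.0, 11.0, 12.0, 13.0],
    "BBB": [20.0, 22.0, 24.0, 26.0],
    "CCC": [30.0, 29.0, 28.0, 27.0],
}
similar = top_k_similar(closes)
```

In this toy data, AAA and BBB end up linked to each other, while CCC gets no links because its correlation with both is negative and falls below the cutoff.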
CALL gds.louvain.write({
Network visualization of stock similarity community structure. Image by the author.
MATCH (s:Stock)-[:TRADING_DAY]->(day)
CALL apoc.create.addLabels( day, [s.name]) YIELD node
RETURN distinct 'done';
MATCH (s:Stock)-[:TRADING_DAY]->(day)
WHERE NOT ()-[:NEXT_DAY]->(day)
MATCH p=(day)-[:NEXT_DAY*0..]->(next_day)
SET next_day.index = length(p);
MATCH (s:Stock)
CALL apoc.math.regr(s.name, 'close', 'index') YIELD slope
SET s.slope = slope;
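The slope stored on each stock is the slope of a simple linear regression of the closing price against the trading-day index. A minimal Python sketch of that calculation, using the standard least-squares formula and made-up prices, could look like the following. The function name `regression_slope` is hypothetical, and the sketch assumes at least two data points.

```python
def regression_slope(ys):
    """Least-squares slope of ys against their positions 0, 1, 2, ...

    Assumes len(ys) >= 2; a single point has no defined slope.
    """
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Hypothetical closing prices indexed by trading day
rising = regression_slope([10.0, 12.0, 14.0, 16.0])
falling = regression_slope([16.0, 15.0, 14.0, 13.0])
```

A positive slope means the price trended upward over the window, and a negative slope means it trended downward.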
MATCH (s:Stock)
WITH s.louvain AS community, s.slope AS slope, s.name AS ticker
ORDER BY slope DESC
RETURN community, collect(ticker)[..3] as potential_investments
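If the goal is to pick the best-performing tickers in each community, the selection can be sketched in Python as follows. The tickers, community ids, and slopes are made up, and the function name `top_per_community` is hypothetical. The sketch ranks all stocks by slope and keeps up to three per community.

```python
from collections import defaultdict

def top_per_community(stocks, n=3):
    """Return up to n tickers per community, steepest slope first."""
    ranked = sorted(stocks, key=lambda s: s["slope"], reverse=True)
    picks = defaultdict(list)
    for stock in ranked:
        if len(picks[stock["community"]]) < n:
            picks[stock["community"]].append(stock["ticker"])
    return dict(picks)

# Hypothetical tickers, communities, and slopes
stocks = [
    {"ticker": "AAA", "community": 0, "slope": 0.4},
    {"ticker": "BBB", "community": 0, "slope": 1.2},
    {"ticker": "CCC", "community": 0, "slope": -0.3},
    {"ticker": "DDD", "community": 0, "slope": 0.9},
    {"ticker": "EEE", "community": 1, "slope": 0.1},
]
picks = top_per_community(stocks)
```

Choosing from different communities is what provides the diversification, since stocks in the same community tend to move together.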


This is not financial advice, so do your own research before investing. Keep in mind that this blog post only looks at a 90-day window for NASDAQ-100 stocks, during which the markets were doing well, so the results might not do much to diversify your risk.



Tomaz Bratanic

Data explorer. Turn everything into a graph. Author of Graph algorithms for Data Science at Manning publication. http://mng.bz/GGVN