Neo4j Cosine Similarity Algorithm example
May 12: Flags, One Month Graph Challenge
In this series of short posts I build one simple graph every day. The domain model of each graph is somehow related to the day’s history: a historical event, a celebration, or a person. I do this challenge to learn Neo4j data modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired, and next month will be your own One Month Graph Challenge. #OMGChallenge
Today is the Day of the State Emblem and National Flag of the Republic of Belarus. Belarus is my homeland, which is why I picked this event from the day’s history.
I plan to create a small dataset based on the flags of some European countries, split by color. Then I want to find out how similar the flags are. I hope the algo library and Cypher are ready for this challenge.
Colors and flags:
It was boring even for the 22 random flags I picked. Sorry if I skipped your country’s flag; you are welcome to extend this list.
It is important to note that each relationship has a “weight” for the color in the flag, which depends on the percentage of the flag that the color covers. So, for example, the flag of Poland gives 50 “points” each to red and white, while the Russian flag gives 33 “points” each to red, white, and blue.
First of all, let’s do a little filtering and find the most used color:
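As an illustration only (this is not the Cypher query from the post), here is a minimal Python sketch of the same aggregation. It assumes that "most used" means the color appearing in the most flags, with total weight as a tie-breaker; that reading is my assumption. The `flags` dictionary holds only the two flags whose weights are described above, and its name and shape are hypothetical.

```python
from collections import Counter

# Color weights per flag. Only the two flags whose weights are given
# in the text are included; extend the dictionary with your own data.
flags = {
    "Poland": {"red": 50, "white": 50},
    "Russia": {"red": 33, "white": 33, "blue": 33},
}

flag_count = Counter()    # number of flags that use each color
total_weight = Counter()  # sum of the color's weights over all flags

for colors in flags.values():
    for color, weight in colors.items():
        flag_count[color] += 1
        total_weight[color] += weight

# Rank by flag count, then by total weight, then alphabetically.
ranking = sorted(
    flag_count,
    key=lambda c: (-flag_count[c], -total_weight[c], c),
)

for color in ranking:
    print(color, flag_count[color], total_weight[color])
```

With just these two flags, red and white tie on both measures, so the alphabetical tie-breaker decides their order.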
And finally, let’s find the most similar flags among all the countries, based on the color weights:
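To make the measure itself concrete, here is a small Python sketch of cosine similarity between two flags' color weights. It does not call the algo library and is not the query from the post. Treating a color that is absent from a flag as weight 0 is an assumption of this sketch, and the helper name `cosine_similarity` is hypothetical.

```python
from math import sqrt

# Color weights per flag; only the two flags described in the text.
flags = {
    "Poland": {"red": 50, "white": 50},
    "Russia": {"red": 33, "white": 33, "blue": 33},
}


def cosine_similarity(a, b):
    """Cosine of the angle between two color-weight vectors.

    Colors missing from a flag are treated as weight 0 (an assumption).
    """
    colors = sorted(set(a) | set(b))
    va = [a.get(c, 0) for c in colors]
    vb = [b.get(c, 0) for c in colors]
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = sqrt(sum(x * x for x in va))
    norm_b = sqrt(sum(y * y for y in vb))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for an empty vector; return 0
    return dot / (norm_a * norm_b)


similarity = cosine_similarity(flags["Poland"], flags["Russia"])
print(round(similarity, 4))  # 0.8165
```

Because cosine similarity depends only on the direction of the vectors, scaling all of a flag's weights by the same factor does not change the result; only the proportions between colors matter.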
Of course, I do not consider the structure of the flag here: for example, two horizontal stripes, three vertical stripes, a cross, or any other pattern. Still, the results are pretty cool. Now I know that the flag of Belarus is pretty similar to the flag of Portugal. I never thought about that.
With this small graph I found one more use case for the algorithm from the algo library, and I am very happy about that. In this example it would be interesting to somehow express the similarity of flag patterns as well; then maybe we would get different results. What do you think, guys? Any ideas how to do this?