Top 5 interesting applications of graph tech (a guide for tech-savvy business gurus)
Traditional databases were conceived to digitize paper forms and automate well structured business processes, and still have their uses.
Graph databases were built to discover hidden connections in your data. They enable unprecedented levels of analytical capability, unlock functionality that an RDBMS can only dream of, and speed up development while maintaining flexibility. They are also whiteboard-friendly, meaning it’s easy for people in non-technical roles to understand what’s going on.
“The best entrepreneurs know this: every great business is built around a secret that’s hidden from the outside. A great company is a conspiracy to change the world; when you share your secret, the recipient becomes a fellow conspirator.” Peter Thiel, Zero to One
Such a secret was harnessed by Google in its early days. You might know that 98% of their revenue (110B USD of it) comes from the search engine, which is built on top of a graph. The initial algorithm is called PageRank (both because it ranks pages and because Larry Page came up with it). Numerous other empires were built on top of this powerful tech. Here’s an example of the knowledge that can be extracted, in real time, from the world map once it is put in a graph.
Still, graph database usage remains low. I believe this is because of a divide between the business world and the people building the tech. With my graph database consulting services, I aim to help bridge it, using a factory-like playbook that explores what’s possible or solves your problem in 5 days.
I will give 5 examples of what’s possible when graph databases are used. Let’s get straight to it.
5. Recommender systems
What it does: Seeks to predict the “rating” or “preference” a user would give to an item, based on the user’s past actions and the item’s characteristics.
Added business value: Upsell at checkout or during shopping, product personalization.
How it works:
How to read the graph above:
Look at each group of three elements: circle, line, circle. See the blue circle with name: iSushi? Start there, follow the red line marked LOCATED_IN, and read the red dot that says location: New York.
Here’s the knowledge we get from there: A restaurant named iSushi is located in New York.
Let’s try another one. We’ll start with the same node, but follow the line that says SERVES. So, a restaurant named iSushi serves cuisine of type Sushi. Congratulations, you can now read graphs and extract knowledge from them.
A recommender system looks at the structure of a graph and infers new knowledge. Because the face in the blue circle up top (call him Dan) has two friends who like sushi places located in New York, we could recommend a sushi place in New York to Dan.
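The inference described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the friend names (Ana, Bob), the extra restaurants, and the edge dictionaries are made up for the example, and a real system would run this as a query inside the graph database rather than over in-memory dictionaries.

```python
from collections import Counter

# Toy graph stored as one dictionary per relationship type.
# Everything except Dan, iSushi, Sushi and New York is hypothetical.
FRIEND_OF = {"Dan": ["Ana", "Bob"]}
LIKES = {"Ana": ["iSushi"], "Bob": ["Sushi Zen"]}
SERVES = {"iSushi": "Sushi", "Sushi Zen": "Sushi", "Pasta Bar": "Italian"}
LOCATED_IN = {"iSushi": "New York", "Sushi Zen": "New York", "Pasta Bar": "New York"}


def recommend(user):
    """Suggest restaurants matching the (cuisine, city) pair the user's friends like most."""
    taste = Counter()
    for friend in FRIEND_OF.get(user, []):
        for restaurant in LIKES.get(friend, []):
            taste[(SERVES[restaurant], LOCATED_IN[restaurant])] += 1
    if not taste:
        return []
    (cuisine, city), _ = taste.most_common(1)[0]
    already_liked = set(LIKES.get(user, []))
    return sorted(
        name
        for name in SERVES
        if SERVES[name] == cuisine
        and LOCATED_IN[name] == city
        and name not in already_liked
    )


suggestions = recommend("Dan")
```

Here both of Dan's friends like sushi places in New York, so the sketch suggests the New York sushi restaurants and leaves out the Italian one.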
Who uses it?
Facebook, when it recommends new friends and people you might know. Airbnb, when it suggests new places. The examples could go on and on.
4. Visualizing and interacting with a high-level view of your data
What it does:
Graphs are whiteboard friendly by default. Your current database is not.
Humans are visual creatures; we are naturally trained to derive a lot of meaning from pictures.
Notice how little I had to write to explain this one.
Neo4j supports visualization out of the box. There are also various third-party tools, both open source (meaning free) and subscription- or license-based.
Added business value:
Faster communication between business and development teams.
Immediate insight into the data, for strategic decision making.
Who uses it?
Companies leveraging the graph database power.
3. Network analytics
What it does:
Probably the most prominent example of network analytics is the six degrees of separation: the idea that all people are six or fewer social connections away from each other. A chain of “a friend of a friend” statements can be made to connect any two people in a maximum of six steps. But that’s just the tip of the iceberg.
There are three main categories these algorithms can fall into:
a. Uncovering knowledge (answering questions such as what movies has Tom Hanks starred in, who is the most common co-star in Tom Hanks movies, what is the shortest path between San Francisco and New York, etc.)
All this is possible because graph databases make connections (or relationships) a first-class citizen, whereas other databases primarily deal in tables (silos of data that can be interconnected, but with additional effort and expensive computation).
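To make the shortest-path question concrete, here is a minimal sketch using breadth-first search over a toy road map. The cities between San Francisco and New York and the connections between them are assumptions chosen for illustration, not real route data, and a graph database would normally expose this as a built-in query rather than hand-written code.

```python
from collections import deque

# Hypothetical, undirected connections between cities.
ROUTES = {
    "San Francisco": ["Denver", "Los Angeles"],
    "Los Angeles": ["San Francisco", "Dallas"],
    "Denver": ["San Francisco", "Chicago"],
    "Dallas": ["Los Angeles", "Chicago"],
    "Chicago": ["Denver", "Dallas", "New York"],
    "New York": ["Chicago"],
}


def shortest_path(graph, start, goal):
    """Return the path with the fewest hops from start to goal, or None if unreachable."""
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


path = shortest_path(ROUTES, "San Francisco", "New York")
```

The same traversal answers the six degrees question: the number of hops between two people is the length of the returned path minus one.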
b. Detecting topologies(or shape of the graph/network)
Out of the box, the tech is able to present a summary of the shape of the network the data draws (referred to by specialists as the centrality of the network).
Betweenness centrality has to be the most prominent example of this. Imagine a network of friends, like the one below.
A quick look at this graph, and you understand that You are the only broker of information between your friends and the rest of the people in the network. Betweenness centrality counts how many times a node appears on the shortest paths between all pairs of other nodes.
What it actually tells you is how much influence a node has on the flow of information in a graph. But it has other interpretations, as do many concepts in this space.
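As a rough sketch of how betweenness centrality can be computed, the code below counts, for every pair of nodes, the share of shortest paths that pass through each other node. The friend network is a made-up stand-in for the picture (the names Ana, Bob, Cara and Dev are assumptions), and the brute-force approach is only suitable for small graphs; production libraries use faster algorithms.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected friend network: "You" bridges two groups.
FRIENDS = {
    "Ana": ["You", "Bob"],
    "Bob": ["You", "Ana"],
    "You": ["Ana", "Bob", "Cara"],
    "Cara": ["You", "Dev"],
    "Dev": ["Cara"],
}


def bfs_counts(graph, source):
    """Return (distances, number of shortest paths) from source to each reachable node."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                paths[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                paths[neighbor] += paths[node]
    return dist, paths


def betweenness(graph):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    info = {node: bfs_counts(graph, node) for node in graph}
    score = {node: 0.0 for node in graph}
    for s, t in combinations(graph, 2):
        dist_s, paths_s = info[s]
        dist_t, paths_t = info[t]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in graph:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on a shortest s-t path when the two legs add up exactly.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += paths_s[v] * paths_t[v] / paths_s[t]
    return score


scores = betweenness(FRIENDS)
```

In this toy network "You" gets the highest score, because every shortest path between the Ana/Bob side and the Cara/Dev side runs through it.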
By putting all the roads on the map in a graph, Atlas is able to compute the roads that are most important to the traffic in a city. A city hall could use this to keep traffic disruption to a minimum when doing maintenance work on roads. Another example: given phone conversations between people, computing the probability that a certain user has information another user has.
Detecting communities (clusters of friends in a social network, for instance) is another application of this. Louvain, an algorithm that came out in 2008, was used to propose subreddit recommendations to Reddit users, based on their general behaviour. Twitter and YouTube use the same algorithm to extract topics from their social platforms.
c. Finding similarities between networks
One customer I consulted for had a problem: they were unable to find similar elements in a webpage by looking at the HTML source code. Various AI methods did exist, but the answers were fuzzy. Using graphs, I was able to step in and save the day. We did not even use a database, just on-the-fly, in-memory computation.
Imagine you are the owner of an app that sets up groups of friends to meet. What groups of friends would you merge together? What restaurant would you suggest to them?
The inner workings of these are quite involved, as a couple dozen algorithms are at play. The main insight is that various problems are more easily solved when the data is expressed in a graph format.
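One of the simplest similarity measures, shown here purely as an illustration, is the Jaccard index: the size of the overlap between two neighbor sets divided by the size of their union. The sketch applies it to the friend-group question above; the group names and the cuisines they like are hypothetical, and this is not a description of the method used in the consulting project mentioned earlier.

```python
from itertools import combinations

# Hypothetical LIKES edges from friend groups to cuisines.
GROUP_LIKES = {
    "Climbers": {"Sushi", "Ramen", "Tacos"},
    "Book club": {"Sushi", "Ramen", "Pizza"},
    "Runners": {"Burgers", "Pizza"},
}


def jaccard(a, b):
    """Overlap of two sets divided by the size of their union (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_pair(groups):
    """Return the pair of group names whose liked cuisines overlap the most."""
    return max(
        combinations(sorted(groups), 2),
        key=lambda pair: jaccard(groups[pair[0]], groups[pair[1]]),
    )


first, second = most_similar_pair(GROUP_LIKES)
shared_cuisines = GROUP_LIKES[first] & GROUP_LIKES[second]
```

The two most similar groups are candidates for merging, and the cuisines they share are a natural starting point for the restaurant suggestion.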
Of all 5 uses presented in this article, this is probably the most heavily used concept in the graph database space, and it’s central to any development model.
Financial companies leverage this to do fraud detection and perform audits, cybersecurity professionals use it to analyze traces, and many, many others use graphs on a daily basis to support critical components of their software and applications.
2. Topic extraction/making sense of text
What it does: Text search is the most prominent form of search out there. The problem of searching through large sets of data was so prominent, a tech company that does just that has a market value of 6.29B USD. Now, graph databases enhance these possibilities, and add meta-information to all the searchable text.
For instance, let’s say some major news website has an article (among millions), and a phrase reads: “Senator M visited the Los Angeles County Museum of Art this afternoon”. Then, let’s say we have another article, and another phrase that reads: “Senator M visited San Diego today, to meet with N”.
Using graphs, we can easily build a system that searches for news inside the state of California that involve a senator, or even a government official, with no extra labeling and effort.
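A minimal sketch of why no extra labeling is needed: once entities from the text are linked into a graph, a search only has to follow IS_A and LOCATED_IN relationships upward. The article titles, the third article, and the edge dictionaries below are assumptions for illustration; extracting the entities from raw text is the hard part and is not shown.

```python
# Hypothetical knowledge edges: each key points to its more general parent.
IS_A = {"Senator M": "Senator", "Senator": "Government official"}
LOCATED_IN = {
    "Los Angeles County Museum of Art": "Los Angeles",
    "Los Angeles": "California",
    "San Diego": "California",
}

# Entities assumed to have been extracted from each article already.
ARTICLES = [
    {"title": "Museum visit", "person": "Senator M", "place": "Los Angeles County Museum of Art"},
    {"title": "Meeting with N", "person": "Senator M", "place": "San Diego"},
    {"title": "Concert review", "person": "Singer Q", "place": "Austin"},
]


def ancestors(start, edges):
    """Follow edges upward from start and return every node reached, start included."""
    seen = []
    node = start
    while node is not None and node not in seen:
        seen.append(node)
        node = edges.get(node)
    return seen


def search(role, region):
    """Titles of articles about someone with the given role, somewhere inside the region."""
    return [
        article["title"]
        for article in ARTICLES
        if role in ancestors(article["person"], IS_A)
        and region in ancestors(article["place"], LOCATED_IN)
    ]


california_senator_news = search("Senator", "California")
```

Searching for "Government official" instead of "Senator" returns the same two articles, because the role is resolved through the graph rather than through tags on each article.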
Natural language processing is one of the hardest disciplines inside Computer Science, if you ask me. But the gist of it is this: graphs enable connections and exploring paths that other technologies strive to achieve.
Let users search the content you produce more effectively. Let decision makers in your organization have a clearer picture and make data-driven decisions. Build the next Google. Maybe read the next point before jumping straight to that.
1. Knowledge graphs
“There are still many large white spaces on the map of human knowledge. You can go discover them. So do it. Get out there and fill in the blank spaces. Every single moment is a possibility to go to these new places and explore them.” Peter Thiel
What it does:
What is knowledge? We need knowledge to gain understanding of how the world works. Essentially knowledge informs a comprehensible, sharable model of how the world works, a representation we can grasp and share and use to gain and tap into understanding. We need understanding to make decisions and act effectively.
Knowledge for that reason belongs beneath understanding in a pyramid of understanding.
“Understanding is a very compressed representation of the world,” AI researcher Eric Baum says in his book What is Thought? For humans, the representation or model evolves over a lifetime and the shorthand or compression occurs gradually.
For machines, effective, compressed representation hinges on logical consistency of a globally coherent entity-relationship model, also known as a graph with nodes as entities and edges as logically established relationships. Graphs can be much more relationship dense than tables are, and so can capture and convey more contextually derived meaning.
Knowledge graphs capture information about the world, and can effectively compress information, to gain understanding.
While there are several systems in place, I know of none that fully realize this vision. Wikipedia has been parsed and put in a nice graph database, and will be the topic of a separate article.
Build the singularity.
Bonus: Deploying to the cloud is a hot topic nowadays, and LinkedIn predicts that the hottest skill for techies this year will be cloud knowledge. While building the Chinese version, LinkedIn used Neo4j, the leading graph database. Their conclusion was that they were able to move very fast, and didn’t have the need for a database administrator.
Last tip: while you won’t be needing an administrator, you will be needing a cloud deployment, and I use GrapheneDB.
I hope you have gained an understanding of what’s possible with graph databases and why they’re different, and that by now you have started thinking about what’s possible in your organization. What would you build or enhance using this?
A surgeon uses a scalpel, a lumberjack uses an axe, and a carpenter uses a hammer. Knowing when to use each tech is key. If you plan on using graphs, make sure you get a seasoned expert to help you fully leverage what’s possible.