Building a LinkedIn-like Knowledge Graph with Neo4j and Spring Boot

Abhishek Ranjan
5 min read · Apr 27, 2023

Introduction

As a Software Architect at leading tech companies, including those similar to LinkedIn, I have had the opportunity to work with some of the most cutting-edge technologies in the industry. One such powerful combination is using Neo4j, a highly scalable graph database, along with Spring Boot, a framework that simplifies the process of creating stand-alone applications.

In this article, I will share my experience and insights on how to leverage Neo4j with Spring Boot to build a knowledge graph like LinkedIn. The goal is to create a scalable, performant, and flexible system that can efficiently manage and query vast amounts of interconnected professional data.

Understanding the Requirements

A knowledge graph like LinkedIn requires an efficient way to store and manage relationships among users, organizations, job positions, skills, and other professional attributes. This necessitates a database that can handle complex relationships and provide an intuitive query language to retrieve relevant information efficiently.

Choosing Neo4j as the Graph Database

Neo4j is a good fit for this task because of its native graph storage and processing and its expressive Cypher query language. Let’s start by designing the graph model for our knowledge graph. At a high level, it has five node types: User, Skill, Position, Company, and Industry. A User HAS_SKILL Skill, HAS_POSITION Position, WORKS_AT a Company, and is CONNECTED_TO other Users; a Position BELONGS_TO a Company, and a Company HAS_INDUSTRY an Industry.
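The relationship types used by the entity classes later in this article can be written down as (source, relationship, target) triples. The sketch below does only that, in plain Java 16+; `Edge` and `GraphModel` are illustrative names and not part of Neo4j or Spring Data Neo4j.

```java
import java.util.List;

// Illustrative only: one schema-level edge of the graph model.
record Edge(String from, String type, String to) {
    @Override
    public String toString() {
        // Rendered in Cypher-like pattern notation for readability.
        return "(:" + from + ")-[:" + type + "]->(:" + to + ")";
    }
}

class GraphModel {
    // Relationship types taken from the entity classes in this article.
    static final List<Edge> EDGES = List.of(
        new Edge("User", "HAS_SKILL", "Skill"),
        new Edge("User", "HAS_POSITION", "Position"),
        new Edge("User", "WORKS_AT", "Company"),
        new Edge("User", "CONNECTED_TO", "User"),
        new Edge("Position", "BELONGS_TO", "Company"),
        new Edge("Company", "HAS_INDUSTRY", "Industry"));

    public static void main(String[] args) {
        EDGES.forEach(System.out::println);
    }
}
```

Listing the edges this way makes it easy to check that every relationship in a Cypher query corresponds to one the model actually defines.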

Integrating Neo4j with Spring Boot

To begin, we need to set up a Spring Boot project and add the required dependencies for Neo4j. You can either use Spring Initializr or manually add the following dependencies to your build.gradle or pom.xml file:

For Gradle:

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-data-neo4j'
    // Optional: the starter already brings in a driver version managed by Spring Boot
    implementation 'org.neo4j.driver:neo4j-java-driver:4.3.4'
}

For Maven:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-neo4j</artifactId>
    </dependency>
    <!-- Optional: the starter already brings in a driver version managed by Spring Boot -->
    <dependency>
        <groupId>org.neo4j.driver</groupId>
        <artifactId>neo4j-java-driver</artifactId>
        <version>4.3.4</version>
    </dependency>
</dependencies>

Creating Domain Entities and Repositories

We will create Java classes to represent our graph entities and Spring Data Neo4j repositories to handle database operations. For example, the User entity and its corresponding repository would look like this:

User.java:

import java.util.List;

import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;
import org.springframework.data.neo4j.core.schema.Relationship;

@Node
public class User {

    // The username doubles as the node's unique identifier
    @Id
    private final String username;
    private final String name;

    @Relationship(type = "HAS_SKILL")
    private List<Skill> skills;

    @Relationship(type = "HAS_POSITION")
    private List<Position> positions;

    @Relationship(type = "WORKS_AT")
    private Company company;

    @Relationship(type = "CONNECTED_TO")
    private List<User> connections;

    // Constructors, getters, and setters omitted for brevity
}

UserRepository.java:

import org.springframework.data.neo4j.repository.Neo4jRepository;

// The second type parameter is the type of the @Id field (username)
public interface UserRepository extends Neo4jRepository<User, String> {
}

Now, let’s define the additional entities and relationships in our graph model:

Skill.java:

import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;

@Node
public class Skill {

    @Id
    private final String name;

    // Constructors, getters, and setters omitted for brevity
}

Position.java:

import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;
import org.springframework.data.neo4j.core.schema.Relationship;

@Node
public class Position {

    @Id
    private final String title;

    @Relationship(type = "BELONGS_TO")
    private Company company;

    // Constructors, getters, and setters omitted for brevity
}

Company.java:

import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;
import org.springframework.data.neo4j.core.schema.Relationship;

@Node
public class Company {

    @Id
    private final String name;

    @Relationship(type = "HAS_INDUSTRY")
    private Industry industry;

    // Constructors, getters, and setters omitted for brevity
}

Industry.java:

import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;

@Node
public class Industry {

    @Id
    private final String name;

    // Constructors, getters, and setters omitted for brevity
}

Implementing Query and Data Manipulation Operations

With our entities and repositories in place, we can now implement the necessary query and data manipulation operations. For instance, to find users with a specific skill, we can extend the UserRepository with a custom Cypher query:

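import java.util.List;

import org.springframework.data.neo4j.repository.Neo4jRepository;
import org.springframework.data.neo4j.repository.query.Query;

public interface UserRepository extends Neo4jRepository<User, String> {

    // $skillName is bound to the method parameter of the same name
    @Query("MATCH (u:User)-[:HAS_SKILL]->(s:Skill) WHERE s.name = $skillName RETURN u")
    List<User> findBySkill(String skillName);
}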

We can also implement more complex queries, such as finding users who have a specific skill and work in a particular industry:

// Both patterns share the variable u, so a user must satisfy both
@Query("MATCH (u:User)-[:HAS_SKILL]->(s:Skill), "
     + "(u)-[:WORKS_AT]->(c:Company)-[:HAS_INDUSTRY]->(i:Industry) "
     + "WHERE s.name = $skillName AND i.name = $industryName RETURN u")
List<User> findBySkillAndIndustry(String skillName, String industryName);
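To make the pattern concrete, here is a minimal in-memory sketch of what the combined query asks for: users who have the named skill and whose company belongs to the named industry. The record types, the sample data, and `SkillIndustryFilter` are hypothetical stand-ins written in plain Java (16+); the real query is evaluated inside Neo4j and does not load every user into memory.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-ins for the Company and User nodes.
record CompanyInfo(String name, String industry) {}
record Member(String username, List<String> skills, CompanyInfo company) {}

class SkillIndustryFilter {
    // Mirrors the two conditions of the Cypher pattern:
    // a HAS_SKILL edge to skillName, and WORKS_AT -> HAS_INDUSTRY reaching industryName.
    static List<String> findBySkillAndIndustry(
            List<Member> members, String skillName, String industryName) {
        return members.stream()
            .filter(m -> m.skills().contains(skillName))
            .filter(m -> m.company() != null
                      && m.company().industry().equals(industryName))
            .map(Member::username)
            .collect(Collectors.toList());
    }

    // Made-up sample data for illustration only.
    static List<Member> sample() {
        CompanyInfo acme = new CompanyInfo("Acme", "Software");
        CompanyInfo bank = new CompanyInfo("Bank", "Finance");
        return List.of(
            new Member("alice", List.of("Java", "Cypher"), acme),
            new Member("bob", List.of("Java"), bank),
            new Member("carol", List.of("Python"), acme),
            new Member("dave", List.of("Java"), null)); // no WORKS_AT edge
    }
}
```

A user with the skill but no company (like the hypothetical `dave`) is excluded, just as a node without a WORKS_AT relationship would not match the second pattern.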

Creating Controllers and Services

Next, we’ll create controllers and services to handle API requests and perform business logic. For example, we can create a UserController to expose an endpoint that retrieves users based on a specific skill:

UserController.java:

import java.util.List;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class UserController {

    private final UserService userService;

    // Spring injects the service through the constructor
    public UserController(UserService userService) {
        this.userService = userService;
    }

    // Handles GET /users?skill=...
    @GetMapping("/users")
    public List<User> getUsersBySkill(@RequestParam String skill) {
        return userService.findUsersBySkill(skill);
    }
}

UserService.java:

import java.util.List;

import org.springframework.stereotype.Service;

@Service
public class UserService {

    private final UserRepository userRepository;

    // Spring injects the repository through the constructor
    public UserService(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    public List<User> findUsersBySkill(String skill) {
        return userRepository.findBySkill(skill);
    }

    public List<User> findUsersBySkillAndIndustry(String skill, String industry) {
        return userRepository.findBySkillAndIndustry(skill, industry);
    }
}
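One benefit of constructor injection is that the service can be exercised without a running database. The sketch below uses hypothetical stand-in types (`SkillLookup`, `SkillService`, `SkillServiceDemo`) rather than the article's Spring classes so that it compiles on its own; in the real project you would hand `UserService` a test double of `UserRepository` instead.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the single repository method the service calls.
interface SkillLookup {
    List<String> findBySkill(String skillName);
}

// Same shape as UserService: the dependency arrives through the constructor.
class SkillService {
    private final SkillLookup lookup;

    SkillService(SkillLookup lookup) {
        this.lookup = lookup;
    }

    List<String> findUsersBySkill(String skill) {
        return lookup.findBySkill(skill);
    }
}

class SkillServiceDemo {
    // Builds the service around an in-memory fake instead of a database.
    static SkillService withFakeData() {
        Map<String, List<String>> bySkill = Map.of("Java", List.of("alice", "bob"));
        return new SkillService(skill -> bySkill.getOrDefault(skill, List.of()));
    }

    public static void main(String[] args) {
        System.out.println(withFakeData().findUsersBySkill("Java"));
    }
}
```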

Configuration and Deployment

Before deploying our application, we need to configure the connection to the Neo4j database. Update the application.properties file with the following settings:

spring.neo4j.uri=bolt://localhost:7687
spring.neo4j.authentication.username=neo4j
spring.neo4j.authentication.password=your_password

Make sure to replace your_password with the password for your Neo4j instance.

To run the application locally, execute the following command in the project’s root directory:

./gradlew bootRun

Or, if you’re using Maven:

./mvnw spring-boot:run

This will start the Spring Boot application and expose the API endpoints.

Testing and Further Improvements

To test the functionality of our knowledge graph, we can use tools like Postman or curl to send requests to the exposed API endpoints. For example, to get users with a specific skill, we can send a GET request to /users?skill=Java.
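The same request can be sent from Java with the JDK's built-in HTTP client (Java 11+). This is a minimal sketch that assumes the application is running locally on Spring Boot's default port 8080; `UsersBySkillClient` and `buildRequest` are illustrative names.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class UsersBySkillClient {
    // Builds GET {baseUrl}/users?skill=... with the skill value URL-encoded.
    static HttpRequest buildRequest(String baseUrl, String skill) {
        String query = "skill=" + URLEncoder.encode(skill, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users?" + query))
            .GET()
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the app is listening on localhost:8080.
        HttpRequest request = buildRequest("http://localhost:8080", "Java");
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Encoding the parameter matters for skill names with reserved characters, such as C++, where a raw `+` would otherwise be read as a space.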

There are many ways to further improve and expand the knowledge graph, such as:

  • Implementing additional API endpoints for managing users, positions, companies, and industries
  • Implementing pagination and filtering for API endpoints
  • Enhancing the data model with more entities and relationships, such as education, certifications, and recommendations
  • Integrating with external data sources to enrich the knowledge graph
  • Implementing advanced analytics and recommendations using Neo4j’s graph algorithms
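As a taste of the recommendations idea, a simple "people you may know" feature can start from second-degree connections: users connected to your connections but not yet to you. The sketch below shows the logic on an in-memory adjacency map in plain Java; it is not Neo4j's graph algorithms library, and `ConnectionSuggestions` and its sample data are hypothetical. In the database, the same idea would be expressed as a two-hop traversal over CONNECTED_TO.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConnectionSuggestions {
    // Returns connections of the user's connections, excluding the user
    // and anyone the user is already directly connected to.
    static Set<String> secondDegree(Map<String, List<String>> connections, String user) {
        List<String> direct = connections.getOrDefault(user, List.of());
        Set<String> result = new LinkedHashSet<>();
        for (String friend : direct) {
            for (String candidate : connections.getOrDefault(friend, List.of())) {
                if (!candidate.equals(user) && !direct.contains(candidate)) {
                    result.add(candidate);
                }
            }
        }
        return result;
    }

    // Made-up undirected connection graph for illustration only.
    static Map<String, List<String>> sample() {
        return Map.of(
            "alice", List.of("bob", "carol"),
            "bob", List.of("alice", "dave"),
            "carol", List.of("alice", "dave", "erin"),
            "dave", List.of("bob", "carol"),
            "erin", List.of("carol"));
    }
}
```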

Conclusion

In this article, we’ve explored the process of creating a LinkedIn-like knowledge graph using Neo4j and Spring Boot. We’ve covered the high-level design of the graph model, the integration of Neo4j with Spring Boot, creating domain entities and repositories, implementing custom queries, and exposing API endpoints. By following these steps and expanding upon the provided examples, you can create a powerful, scalable, and flexible knowledge graph for professional data.
