Create a Data Marvel — Part 7: Connecting the Graph

Jennifer Reif
Neo4j Developer Blog


*Update*: All parts of this series are published and related content available.

Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9, Part 10

Completed GitHub project (+related content)

In the posts leading up to this one, we have taken data from the Marvel API, organized it into a graph model, imported the data to Neo4j, created a Spring application from the Spring Initializr page, and set up 5 out of 6 of our entities (Character, Creator, Event, Series, Story) with domain, repository, and controller classes. Finally, we are ready to start knitting all these pieces together using the ComicIssue domain and code the connectors that define a graph — the relationships!

In this post, we will work through the code for our ComicIssue model and repository classes. The controller class and one additional class for a service will follow in the next post, where we will also talk about what purpose the extra class fills and what goes into it. For now, let’s code!

The ComicIssue class

This class will look similar to the other classes we coded in the last couple of blog posts. To create it, simply right-click on our comicissue folder and choose New->Java Class. Name it ComicIssue, and the IDE will generate the skeleton for us.

We will have our annotations, Lombok shortcuts, and our fields. We can see this in the class code below.

    package com.example.demo.comicissue;

    import ...

    @Data
    @NoArgsConstructor
    @RequiredArgsConstructor
    @NodeEntity
    public class ComicIssue {
        @Id @GeneratedValue
        private Long neoId;

        @NonNull
        private Long id;
        @NonNull
        private Integer pageCount;
        @NonNull
        private Double issueNumber;
        @NonNull
        private String name, thumbnail, resourceURI;

        @Relationship(type = "INCLUDES")
        private List<Character> characters = new ArrayList<>();
        @Relationship(type = "CREATED_BY")
        private List<Creator> creators = new ArrayList<>();
        @Relationship(type = "PART_OF")
        private List<Event> events = new ArrayList<>();
        @Relationship(type = "BELONGS_TO")
        private List<Series> series = new ArrayList<>();
        @Relationship(type = "MADE_OF")
        private List<Story> stories = new ArrayList<>();

        public List<Character> getCharacters() { return characters; }
        public List<Creator> getCreators() { return creators; }
        public List<Event> getEvents() { return events; }
        public List<Series> getSeries() { return series; }
        public List<Story> getStories() { return stories; }
    }

What is different is the bit after our fields. We have added some annotations, array lists, and getters here. This is where we connect our ComicIssue entity to our other entities. Recall the data model that we drew back in our first post.

Before today’s post, none of those arrows between our nodes existed in our application (they do in the database, though). Let us take just one of these relationships and examine the code a bit more closely.

    @Relationship(type = "INCLUDES")
    private List<Character> characters = new ArrayList<>();

    public List<Character> getCharacters() { return characters; }

In that block, we use the @Relationship annotation from the object-graph mapper (OGM) to specify that a relationship exists between the current entity (ComicIssue) and the entity type held in the list (in this case, Character). The type argument of the annotation names the relationship type that connects the two entities. Since several different relationship types can exist in our graph (CREATED_BY, MADE_OF, etc.), we want to specify that it is the INCLUDES relationship that connects a ComicIssue and a Character.

The line beneath the annotation then sets up an ArrayList variable that will contain a list of Character entities. Finally, we add a “getter” method that will also return a list of type Character. This means that when we retrieve a ComicIssue, we can access the characters that are connected to that ComicIssue by an INCLUDES relationship.

We use this same logic to set up the relationships between the ComicIssue and each of our other entities, as we saw in the initial code block. With those connections, we will be able to retrieve any of the other entities connected to a particular ComicIssue.
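To make that traversal concrete without a running database, here is a minimal plain-Java sketch. It uses no Spring or OGM at all, so the annotations only appear in comments. The class names Issue, MarvelCharacter, and RelationshipSketch, along with the sample data, are made up for illustration (MarvelCharacter is renamed only to avoid confusion with java.lang.Character). In the real application, the OGM fills the list from the INCLUDES relationships stored in Neo4j; here we fill it by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the Character entity (would carry @NodeEntity in the real project).
class MarvelCharacter {
    final String name;

    MarvelCharacter(String name) { this.name = name; }
}

// Stand-in for the ComicIssue entity.
class Issue {
    final String name;

    // In the real class this field is annotated with @Relationship(type = "INCLUDES").
    private final List<MarvelCharacter> characters = new ArrayList<>();

    Issue(String name) { this.name = name; }

    List<MarvelCharacter> getCharacters() { return characters; }
}

public class RelationshipSketch {
    // Walks from one issue to its connected characters through the getter.
    public static List<String> characterNames(Issue issue) {
        List<String> names = new ArrayList<>();
        for (MarvelCharacter c : issue.getCharacters()) {
            names.add(c.name);
        }
        return names;
    }

    // Hand-built sample data; the OGM would populate this from the database.
    public static Issue sampleIssue() {
        Issue issue = new Issue("Sample Issue #1");
        issue.getCharacters().add(new MarvelCharacter("Captain America"));
        issue.getCharacters().add(new MarvelCharacter("Iron Man"));
        return issue;
    }

    public static void main(String[] args) {
        System.out.println(characterNames(sampleIssue()));
    }
}
```

The point of the sketch is only the shape: one entity holds a list of another entity, and a getter exposes that list to the rest of the application.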

The repository class

Now that we have the model class completed, we can set up our data access layer with the repository class. Just as we did with our other classes, we will create a new Java class, name it (ComicIssueRepo), and set the Kind to Interface.

This interface will be just a bit different from our previous ones because we will be accessing all of our data through the ComicIssue entities, so we need to define some methods to retrieve different kinds of information. We will still extend our interface from the Neo4jRepository, same as we did with our other repositories.

    package com.example.demo.comicissue;

    import ...

    public interface ComicIssueRepo extends Neo4jRepository<ComicIssue, Long> {
        ComicIssue findByName(@Param("name") String name);

        Iterable<ComicIssue> findByNameLike(@Param("name") String name);

        @Query("MATCH (i:ComicIssue)-[r]-(n) RETURN i,r,n LIMIT {limit}")
        Collection<ComicIssue> graph(@Param("limit") int limit);
    }

In our code, we decided to define 3 methods — findByName(), findByNameLike(), and graph(). The first method allows us to search for a ComicIssue by name and retrieve it from the database. It uses the @Param annotation to pass the String name that we will type in to search. Since we are searching for a particular ComicIssue, we only expect a single result to come back.

The next method runs a looser, pattern-based search that can return a group of matching comic issues. If we wanted to look for anything that had “Captain America” in the name, then we would see many results retrieved. Again, we have our @Param annotation to allow us to pass a String value as the ComicIssue name, and the method will return an Iterable of type ComicIssue with the list of possible results.

These two methods are different from the third method because we didn’t use the @Query annotation on them. This is because Spring Data Neo4j (through the OGM) will derive the first two queries for us based on the method name. Common methods such as findByName() and findByNameLike() can be auto-mapped because they are straightforward and broadly applicable. This way, we don’t need to write queries for these common methods and can focus on other data that we need to retrieve with custom queries. Let us review this third method and explain each component.
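The contrast between the two derived finders can be sketched in plain Java with an in-memory stand-in. This is not how Spring Data Neo4j implements them. The class FinderSketch, the sample names, and the use of simple substring matching for the "like" search are all assumptions for illustration, and how the real findByNameLike() interprets its argument depends on the Spring Data Neo4j and OGM versions in use. The sketch also returns an Optional for the exact match, whereas the repository method above returns a ComicIssue directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory stand-in for the repository; names only, no database.
public class FinderSketch {
    private final List<String> issueNames = new ArrayList<>();

    public FinderSketch(List<String> names) {
        issueNames.addAll(names);
    }

    // Mirrors the intent of findByName(): exact match, at most one result.
    public Optional<String> findByName(String name) {
        for (String n : issueNames) {
            if (n.equals(name)) {
                return Optional.of(n);
            }
        }
        return Optional.empty();
    }

    // Mirrors the intent of findByNameLike(): looser match, possibly many results.
    // Substring matching is an assumption made for this sketch.
    public List<String> findByNameLike(String fragment) {
        List<String> matches = new ArrayList<>();
        for (String n : issueNames) {
            if (n.contains(fragment)) {
                matches.add(n);
            }
        }
        return matches;
    }

    // Made-up sample names.
    public static FinderSketch sample() {
        return new FinderSketch(Arrays.asList(
                "Captain America #1",
                "Captain America #2",
                "Avengers #4"));
    }

    public static void main(String[] args) {
        FinderSketch repo = sample();
        System.out.println(repo.findByName("Avengers #4"));
        System.out.println(repo.findByNameLike("Captain America"));
    }
}
```

The exact search either finds its one issue or finds nothing, while the looser search hands back every name that contains the fragment.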

    @Query("MATCH (i:ComicIssue)-[r]-(n) RETURN i,r,n LIMIT {limit}")
    Collection<ComicIssue> graph(@Param("limit") int limit);

It begins with an @Query annotation, which lets us write the Cypher query to be executed. In this case, we want to retrieve a section of the graph to show on the screen for a visualization. Our query, then, matches ComicIssue entities that have any type of relationship to any other entity. It returns the ComicIssue nodes (alias i), the relationships (alias r), and the connected nodes (alias n), and limits the results to a number that we pass in through the limit parameter. Since we have a large amount of data, the limit keeps the visualization and the browser from being overwhelmed by everything in the database. The line after the annotation is our method definition, which uses the @Param annotation to pass in the argument for limit and expects a collection (a type of Iterable) of ComicIssue entities in return.
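One detail worth keeping in mind is that LIMIT in Cypher caps the number of result rows, and each row here is one (i, r, n) combination, not one ComicIssue. The plain-Java sketch below imitates that with a hand-built list of rows. The class LimitSketch, its Row type, and the sample data are hypothetical, and the row order is an assumption too, since the query has no ORDER BY and the database is free to return rows in any order.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of how a row limit relates to distinct issues.
public class LimitSketch {
    // One result row of: MATCH (i:ComicIssue)-[r]-(n) RETURN i,r,n
    public static final class Row {
        final String issue;
        final String relType;
        final String neighbor;

        Row(String issue, String relType, String neighbor) {
            this.issue = issue;
            this.relType = relType;
            this.neighbor = neighbor;
        }
    }

    // Keeps at most `limit` rows, like LIMIT does on the query result.
    public static List<Row> limit(List<Row> rows, int limit) {
        int end = Math.max(0, Math.min(limit, rows.size()));
        return new ArrayList<>(rows.subList(0, end));
    }

    // Collects the distinct issues that appear in the remaining rows.
    public static Set<String> distinctIssues(List<Row> rows) {
        Set<String> issues = new LinkedHashSet<>();
        for (Row row : rows) {
            issues.add(row.issue);
        }
        return issues;
    }

    // Made-up rows: Issue A has three relationships, Issue B has two.
    public static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("Issue A", "INCLUDES", "Captain America"));
        rows.add(new Row("Issue A", "INCLUDES", "Iron Man"));
        rows.add(new Row("Issue A", "CREATED_BY", "Creator X"));
        rows.add(new Row("Issue B", "INCLUDES", "Thor"));
        rows.add(new Row("Issue B", "BELONGS_TO", "Series Y"));
        return rows;
    }

    public static void main(String[] args) {
        List<Row> limited = limit(sampleRows(), 3);
        System.out.println(limited.size() + " rows, issues: " + distinctIssues(limited));
    }
}
```

With this sample data, a limit of 3 keeps three rows that all belong to the same issue, so the number of distinct ComicIssue entities that come back can be smaller than the limit we pass in.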

That’s all we need! This interface serves as our data access layer and acts as the go-between for the database and the different types of requests.

What I Learned

Though there were a few bits of new information in these classes, they were really only there to add the connections between this central class (ComicIssue) and the other classes. Our class structures and their functions matched our expectations from previous code, so we just needed to map the relationships and set up the methods for accessing the data through the ComicIssue.

As always, I will share the highlights of what I learned.

  1. Much of the code mirrored what we did with other classes, so I could focus on the new piece — the relationships between entities. Understanding what annotations did what and which arguments were needed or unnecessary took some research and some testing. Your use case and data will not be exactly what someone else has, so you will also need to experiment a bit to ensure you don’t have any useless code. You cannot rely simply on copy/paste (nor do you probably want to).
  2. This project was a bit trickier than other existing code examples out there because we had so many entities that all connected to each other (most examples have 2 or 3 entities, ours has 6). However, this just prepares us for the variety of complex data sets we will meet in practice. Real data will not be simple or consistently organized, so projects like this are closer to real-world scenarios.

Next Steps

Progress in these later posts may seem slower, but we are stair-stepping our way through more difficult material and code. Knowing how each piece fits together and why something functions the way it does is vital to replicating the process with different use cases and data. Understanding why helps you apply the logic and create your own applications without relying on copy/paste or trial-and-error code.

In the next post, we will walk through the controller class and the service class, too! This will wrap up our code for handling requests and data before we drop in the final piece for a pretty user interface with HTML. See you soon!


Jennifer Reif is an avid developer and problem-solver. She enjoys learning new technologies, sometimes on a daily basis! Her Twitter handle is @JMHReif.