RAG on knowledge graphs using Zephyr-7B

Manoj Kumar Vohra
Nov 1, 2023 · 7 min read


Introduction

In this article, we will learn to build a RAG framework that leverages information from knowledge graphs.

We will be using the following components in this framework:

Neo4j as the knowledge graph

Zephyr-7B as the underlying language model, from HuggingFace H4

LangChain for end-to-end stitching

Understanding the components

The knowledge graph

We will be using a dummy dataset of movies and actors connected to each other through different types of relationships. Please refer to the schema below:

Graph Schema

Here, Person is an entity that can play multiple roles, such as actor, producer, director or critic. A Person node contains information like the person's name and year of birth.

The Movie entity is the main entity and carries information like the title, release year and tagline.

Graph Sample Set

Zephyr language model

Zephyr is a series of language models from HuggingFace that are trained to act as helpful assistants. Zephyr-7B-β is the second model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO).
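Since Zephyr-7B-β is a chat-tuned model, it expects prompts in its chat format, with `<|system|>`, `<|user|>` and `<|assistant|>` role markers. A minimal sketch of building such a prompt by hand is shown below; in practice the HuggingFace tokenizer's `apply_chat_template()` produces this for you, so treat this as illustrative only.

```python
# Sketch: Zephyr-7B-beta's chat format, per its model card. Each turn
# is introduced by a role marker and terminated by the </s> token.
def build_zephyr_prompt(system: str, user: str) -> str:
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_zephyr_prompt(
    "You are a helpful movie expert.",
    "Who directed The Matrix?",
)
print(prompt)
```

The model generates its answer as a continuation after the final `<|assistant|>` marker.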

For more details, refer to the model card on HuggingFace.

Neo4j vector index

We will not go into the details of what a vector index or database is here; there are dedicated resources on this topic in public knowledge hubs.

Neo4j released its vector index as a public beta in Neo4j 5.11, and it reached general availability in Neo4j 5.13.

Neo4j vector indexes are powered by the Apache Lucene indexing and search library. Lucene implements a Hierarchical Navigable Small World (HNSW) graph to perform approximate k-nearest-neighbor (k-ANN) queries over the vector fields.

Before creating the index, you have to generate vector embeddings and store them as a property (the vector field) on the graph nodes. These embeddings are stored as LIST&lt;FLOAT&gt; properties on a node, where each dimensional component of the vector is an element of the LIST.
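The index compares these vectors with a similarity function; we will use cosine similarity when we create our index later. To make the comparison concrete, here is a minimal sketch of cosine similarity on plain Python lists (the index itself computes this far more efficiently via HNSW):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score 1.0, orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 5.0]))  # 0.0
```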

To generate embeddings, we can use the built-in Neo4j ML procedures, which support multiple providers. At the time of writing, the supported providers are Vertex.AI, OpenAI and AWS Bedrock. The command syntax for each of them is shown below:

// Vertex.AI
CALL apoc.ml.vertexai.embedding(['Some Text'], $accessToken, $project, {}) yield index, text, embedding;

// OpenAI
CALL apoc.ml.openai.embedding(['Some Text'], $apiKey, {}) yield index, text, embedding;

// AWS Bedrock
CALL apoc.ml.bedrock.embedding(texts, $config) yield index, text, embedding;

I didn’t have subscriptions for any of these providers, so I decided to host my own local embeddings server with open source embeddings using sentence transformers 😉.

I used the sentence-transformers model all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space.
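As a sketch, a minimal local embeddings server could look like the following. The endpoint path `/generate_embeddings` and the `{"embeddings": [...]}` response shape match the contract the custom Neo4j procedure expects; the `embed()` function here is a deterministic placeholder, since a real server would instead return `SentenceTransformer('all-MiniLM-L6-v2').encode(text).tolist()`.

```python
# Minimal stdlib sketch of the embeddings server: plain-text POST body
# in, JSON {"embeddings": [...]} out. embed() is a placeholder, NOT a
# real embedding model.
import hashlib
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

DIM = 384  # all-MiniLM-L6-v2 output dimension

def embed(text: str) -> list[float]:
    # Placeholder: derive DIM pseudo-components from a hash of the text.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 255.0 for i in range(DIM)]

class EmbeddingsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/generate_embeddings":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length).decode("utf-8")
        body = json.dumps({"embeddings": embed(text)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 8081) -> None:
    HTTPServer(("127.0.0.1", port), EmbeddingsHandler).serve_forever()
```

Calling `serve()` exposes the endpoint on port 8081, where the custom procedure below will POST node text.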

Then I defined a custom Neo4j procedure to generate embeddings for a node property using my embeddings server.

Here is the full class code:

package com.mkv.custom;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.stream.Stream;

import org.neo4j.procedure.Description;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.Procedure;

import com.fasterxml.jackson.databind.ObjectMapper;

public class Embeddings {

    // Result wrapper: the procedure streams records of this shape, and
    // Jackson deserializes the server response into it.
    public static class CustomEmbeddingsResult {

        public final List<Double> embeddings;

        public CustomEmbeddingsResult() {
            this.embeddings = null;
        }

        public CustomEmbeddingsResult(List<Double> embeddings) {
            this.embeddings = embeddings;
        }
    }

    private static final ObjectMapper objectMapper = new ObjectMapper();

    @Procedure(name = "com.mkv.custom.embeddings")
    @Description("com.mkv.custom.embeddings('s1') - return vector embeddings for the input text.")
    public Stream<CustomEmbeddingsResult> embeddings(
            @Name("text") String text) throws URISyntaxException, IOException, InterruptedException {
        if (text == null) {
            // Return an empty stream, not null, when there is no input.
            return Stream.empty();
        }

        // POST the raw text to the local embeddings server.
        String POST_URL = "http://127.0.0.1:8081/generate_embeddings";
        URI postURI = new URI(POST_URL);
        HttpRequest httpRequestPost = HttpRequest.newBuilder()
                .uri(postURI)
                .POST(HttpRequest.BodyPublishers.ofString(text))
                .header("Content-Type", "text/plain")
                .build();
        HttpClient httpClient = HttpClient.newHttpClient();

        HttpResponse<String> postResponse = httpClient.send(httpRequestPost, HttpResponse.BodyHandlers.ofString());
        CustomEmbeddingsResult result = objectMapper.readValue(postResponse.body(), CustomEmbeddingsResult.class);
        return Stream.of(new CustomEmbeddingsResult(result.embeddings));
    }
}

And the test case:

package com.mkv.custom;

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestInstance;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Session;
import org.neo4j.driver.Value;
import org.neo4j.harness.Neo4j;
import org.neo4j.harness.Neo4jBuilders;

import java.util.List;

import static org.assertj.core.api.Assertions.assertThat;

@TestInstance(TestInstance.Lifecycle.PER_CLASS)
public class EmbeddingsTest {

    private Neo4j embeddedDatabaseServer;

    @BeforeAll
    void initializeNeo4j() {
        this.embeddedDatabaseServer = Neo4jBuilders.newInProcessBuilder()
                .withDisabledServer()
                .withProcedure(Embeddings.class)
                .build();
    }

    @AfterAll
    void closeNeo4j() {
        this.embeddedDatabaseServer.close();
    }

    @Test
    void generateEmbeddings() {
        // This is in a try-block, to make sure we close the driver after the test
        try (Driver driver = GraphDatabase.driver(embeddedDatabaseServer.boltURI());
             Session session = driver.session()) {

            // When
            Record result = session.run("CALL com.mkv.custom.embeddings('Hello') YIELD embeddings RETURN embeddings").single();

            // Then
            Value actual_embeddings = result.get("embeddings");

            List<Double> expected_embeddings = List.of(-0.0627717524766922, 0.054958831518888474, 0.05216477811336517);

            assertThat(actual_embeddings.get(0).asDouble()).isEqualTo(expected_embeddings.get(0));
            assertThat(actual_embeddings.get(1).asDouble()).isEqualTo(expected_embeddings.get(1));
            assertThat(actual_embeddings.get(2).asDouble()).isEqualTo(expected_embeddings.get(2));
        }
    }
}

Finally, the Maven dependency details for the Neo4j plugin:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.mkv.custom</groupId>
    <artifactId>embedding-apocs</artifactId>
    <version>1.0.0</version>

    <packaging>jar</packaging>
    <name>Custom Neo4j Procs</name>
    <description>Custom Neo4j Procs For Generating Embeddings</description>

    <properties>
        <java.version>17</java.version>
        <maven.compiler.release>${java.version}</maven.compiler.release>

        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <neo4j.version>5.12.0</neo4j.version>
        <jackson.version>2.13.4</jackson.version>
        <neo4j-java-driver.version>5.12.0</neo4j-java-driver.version>
        <junit-jupiter.version>5.10.0</junit-jupiter.version>
        <maven-shade-plugin.version>3.5.1</maven-shade-plugin.version>
        <maven-compiler-plugin.version>3.11.0</maven-compiler-plugin.version>
        <assertj.version>3.24.2</assertj.version>
        <maven-surefire-plugin.version>3.1.2</maven-surefire-plugin.version>
        <maven.version>3.8.1</maven.version>
        <maven-enforcer-plugin.version>3.4.1</maven-enforcer-plugin.version>
    </properties>

    <dependencies>
        <dependency>
            <!-- This gives us the Procedure API our runtime code uses.
                 We have a `provided` scope on it, because when this is
                 deployed in a Neo4j Instance, the API will be provided
                 by Neo4j. If you add non-Neo4j dependencies to this
                 project, their scope should normally be `compile` -->
            <groupId>org.neo4j</groupId>
            <artifactId>neo4j</artifactId>
            <version>${neo4j.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <!-- Test Dependencies -->
        <dependency>
            <!-- This is used for a utility that lets us start Neo4j with
                 a specific Procedure, which is nice for writing tests. -->
            <groupId>org.neo4j.test</groupId>
            <artifactId>neo4j-harness</artifactId>
            <version>${neo4j.version}</version>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-nop</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <!-- Used to send cypher statements to our procedure. -->
            <groupId>org.neo4j.driver</groupId>
            <artifactId>neo4j-java-driver</artifactId>
            <version>${neo4j-java-driver.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-engine</artifactId>
            <version>${junit-jupiter.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.assertj</groupId>
            <artifactId>assertj-core</artifactId>
            <version>${assertj.version}</version>
            <scope>test</scope>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-enforcer-plugin</artifactId>
                <version>${maven-enforcer-plugin.version}</version>
                <executions>
                    <execution>
                        <id>enforce</id>
                        <goals>
                            <goal>enforce</goal>
                        </goals>
                        <phase>validate</phase>
                        <configuration>
                            <rules>
                                <requireJavaVersion>
                                    <version>${java.version}</version>
                                </requireJavaVersion>
                                <requireMavenVersion>
                                    <version>${maven.version}</version>
                                </requireMavenVersion>
                            </rules>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven-compiler-plugin.version}</version>
            </plugin>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${maven-surefire-plugin.version}</version>
            </plugin>
            <plugin>
                <!-- This generates a jar-file with our procedure code,
                     plus any dependencies marked as `compile` scope.
                     This should then be deployed in the `plugins` directory
                     of each Neo4j instance in your deployment.
                     After a restart, the procedure is available for calling. -->
                <artifactId>maven-shade-plugin</artifactId>
                <version>${maven-shade-plugin.version}</version>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

The above procedure will be invoked to generate embeddings.

Sample Embeddings

Time to generate embeddings for Movie nodes

Now that the setup to create embeddings is in place, it's time to define the relevant information in the Movie nodes that will be translated into embeddings.

To do this, we condense information from all the Person nodes connected to a Movie node and create a new source field, which summarises information like the movie title, release year, plot, actors and directors in a single property (highlighted in the green section below).

Then we generate embeddings from the text in the source field using our custom procedure and store them in the embeddings field/property (highlighted in the orange section below).
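As a sketch, the condensation step might look like the following. The `build_source` helper and the `released` property, `ACTED_IN` and `DIRECTED` relationship types are assumptions about the sample schema; only the `source` and `embeddings` property names and the `com.mkv.custom.embeddings` procedure come from the setup above.

```python
# Condense a movie's own properties plus its connected people into one
# free-text `source` string, which is what gets embedded.
def build_source(title, released, tagline, actors, directors):
    return (
        f"Movie: {title} ({released}). Tagline: {tagline}. "
        f"Actors: {', '.join(actors)}. Directors: {', '.join(directors)}."
    )

# Cypher along these lines would then embed each source field with the
# custom procedure and store the result on the node (assumed schema):
EMBED_QUERY = """
MATCH (m:Movie)
WHERE m.source IS NOT NULL
CALL com.mkv.custom.embeddings(m.source) YIELD embeddings
SET m.embeddings = embeddings
"""

print(build_source("The Matrix", 1999, "Welcome to the Real World",
                   ["Keanu Reeves", "Carrie-Anne Moss"],
                   ["Lana Wachowski", "Lilly Wachowski"]))
```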

With embeddings in place on the Movie nodes, we can now define our Neo4j vector index (name: movies_extended_info_embeddings) on the embeddings field.

// Create the vector index
CALL db.index.vector.createNodeIndex('movies_extended_info_embeddings', 'Movie', 'embeddings', 384, 'cosine')

The index takes some time to populate; once it is ready, we can query it.

Setup RAG

With the Neo4j index in place, we will set up our RAG QA pipeline.

Here we will use LangChain to define a vector store handle on our existing index.


from langchain.vectorstores import Neo4jVector
from langchain.embeddings import HuggingFaceEmbeddings

index_name = "movies_extended_info_embeddings"

embeddings = HuggingFaceEmbeddings(
    model_name='sentence-transformers/all-MiniLM-L6-v2',
    model_kwargs={'device': 'cpu'}
)

vectorDb = Neo4jVector.from_existing_graph(
    embedding=embeddings,
    url='bolt://localhost:7687',
    username='neo4jadmin',
    password='XXXXXX',
    index_name=index_name,
    node_label="Movie",
    text_node_properties=["title", "tagline"],
    embedding_node_property="embeddings",
    search_type="vector"
)

Testing the retriever with a similarity search:

query = "Who played in the Matrix?"
retriever = vectorDb.as_retriever()
retriever.get_relevant_documents(query)[0]

Setting up the RetrievalQA chain

import os

from langchain.chains import RetrievalQA
from langchain.llms import HuggingFaceHub

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "XXXXXXXXXXXXXXXXXX"

repo_id = "HuggingFaceH4/zephyr-7b-beta"
llm = HuggingFaceHub(
    repo_id=repo_id, model_kwargs={"temperature": 0.5, "max_new_tokens": 512}
)
chain = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)
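The `chain_type="stuff"` option means the retrieved documents are simply concatenated ("stuffed") into the prompt context before the question. A minimal sketch of that idea, independent of LangChain (the template wording here is illustrative, not LangChain's actual default prompt):

```python
# Conceptual sketch of the "stuff" chain type: all retrieved documents
# are placed verbatim into a single prompt, which is then sent to the
# LLM for answering.
def stuff_prompt(question: str, docs: list[str]) -> str:
    context = "\n\n".join(docs)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

docs = [
    "title: The Matrix\ntagline: Welcome to the Real World",
    "title: The Matrix Reloaded\ntagline: Free your mind",
]
print(stuff_prompt("Who played in The Matrix?", docs))
```

With the chain defined, running a query (e.g. `chain.run(query)`) retrieves the most similar Movie documents from the index and generates an answer from that stuffed context.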

Summary

We see that Zephyr-7B, despite being significantly smaller than popular LLMs like gpt-3.5-turbo and Llama-70B, is comparably performant at summarisation, understanding context and following instructions.

Neo4j's support for vector indexes has tremendous potential for exploiting graph relationships through natural-language queries.


Manoj Kumar Vohra

A curious learner and leader, working with Adobe to craft its data analytics solutions that promise speed and results.