Embeddings + Knowledge Graphs: The Future of Generative AI

Thcookieh · Mar 4, 2024

Embeddings and knowledge graphs deserve special attention because they are set to change the way we interact with AI in the near future.

Embeddings are vectors that represent concepts or entities. They create a common numerical language across different types of data, such as text, images, and audio, packaging meaning in the form of numbers. Because meaning becomes arithmetic, we can perform analogy-style operations on it.

Example: child + age ≈ man. By reducing relationships between words to vector arithmetic, embeddings make this kind of operation easy, as the sketch below shows.
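A minimal sketch of that arithmetic, assuming the gensim library and its downloadable pre-trained GloVe vectors are available (the classic king − man + woman ≈ queen analogy stands in for the child + age example):

```python
# Minimal sketch of analogy arithmetic on word embeddings.
# Assumes gensim and the pre-trained "glove-wiki-gigaword-50" vectors.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # downloads on first use

# The classic analogy: king - man + woman ≈ queen.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Cosine similarity lets us compare any two concepts as numbers.
print(vectors.similarity("child", "man"))
```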

Knowledge graphs are databases that store information about entities and their relationships. They can be used to represent real-world knowledge, such as historical facts, relationships between people and places, or information about products and services.

The main difference in this way of representing information lies in how relationships are stored: they are semantic, meaning they read almost as naturally as language, and they can be as complex or as simple as you need.

What do they have to do with Artificial Intelligence? The answer lies in the fact that an AI can learn to write queries that interact with the database in a semantic, or natural, way, as in the sketch below.
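A minimal sketch of that interaction, using the official Neo4j Python driver; the connection details and the Person/City labels are illustrative assumptions, not a prescribed schema:

```python
# Minimal sketch: storing and querying facts as a graph with the neo4j driver.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Relationships are stored explicitly and read almost like a sentence.
    session.run(
        "MERGE (p:Person {name: $name}) "
        "MERGE (c:City {name: $city}) "
        "MERGE (p)-[:LIVES_IN]->(c)",
        name="Ada", city="London",
    )
    result = session.run(
        "MATCH (p:Person)-[:LIVES_IN]->(c:City {name: $city}) RETURN p.name AS name",
        city="London",
    )
    print([record["name"] for record in result])

driver.close()
```

Notice that the Cypher query is close to how you would phrase the fact out loud, which is exactly what makes it a good target for an LLM to generate.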

Why are they important?

Both approaches power RAG, or Retrieval-Augmented Generation: providing a language-expert machine with the context it needs so that it does not have to rely on its own memory of concepts.

Remember that grammar lives in both facts and lies: a model can phrase a fabrication just as fluently as a fact.

Using RAG helps avoid the dreaded hallucinations and lets LLMs focus on what they are very good at: language tasks (a minimal sketch of the pattern follows this list), such as:

  • Translating to other languages.
  • Summarizing and synthesizing.
  • Explaining in simpler ways.
  • Comparing concepts and ideas.
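The pattern itself is small. Here is a minimal sketch, where retrieve_context and call_llm are hypothetical placeholders for your retrieval layer and your model provider:

```python
# Minimal sketch of the RAG pattern: retrieve context first, then let the LLM
# work on language rather than recall. Both helpers are illustrative stubs.

def retrieve_context(question: str) -> str:
    """Look up relevant facts in your knowledge base (graph, vector store, ...)."""
    return "Acme Corp reported revenue of 12.3M in Q4 2023."  # illustrative stub

def call_llm(prompt: str) -> str:
    """Send the prompt to whichever LLM provider you use."""
    raise NotImplementedError

def answer(question: str) -> str:
    context = retrieve_context(question)
    prompt = (
        "Answer using ONLY the context below. If the context is not enough, say so.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```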

What if we combine both?

Vector Indexes: What you need to make your RAGs more robust.

Vector indexes are a relatively new Neo4j feature that brings vector-database-style similarity search into the knowledge graph, combining embeddings and graph data. They allow language models to query knowledge graphs more efficiently and accurately.
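A minimal sketch of creating and querying such an index, assuming Neo4j 5.13+ and the official Python driver; the Concept label, the embedding property, and the 384 dimensions are illustrative assumptions:

```python
# Minimal sketch of a Neo4j vector index: create it, then ask for nearest nodes.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Index the `embedding` property of :Concept nodes for cosine similarity.
    session.run(
        "CREATE VECTOR INDEX concept_embeddings IF NOT EXISTS "
        "FOR (c:Concept) ON (c.embedding) "
        "OPTIONS {indexConfig: {"
        " `vector.dimensions`: 384,"
        " `vector.similarity_function`: 'cosine'}}"
    )
    # Find the 5 concepts closest to a query embedding.
    result = session.run(
        "CALL db.index.vector.queryNodes('concept_embeddings', 5, $query_embedding) "
        "YIELD node, score RETURN node.name AS name, score",
        query_embedding=[0.1] * 384,  # replace with a real query embedding
    )
    print([(r["name"], r["score"]) for r in result])

driver.close()
```

Cosine similarity is a common default for text embeddings, but the index also supports other similarity functions if your embedding model calls for them.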

Benefits of embeddings, knowledge graphs, and vector indexes:

  • Semantic Storage: Information is stored in a natural way, similar to the language we use to express relationships between concepts, making it easier to explore facts.
  • Segmented Search: Retrieval is restricted to a relevant subgraph and then ranks, compares, and filters within it, instead of comparing a query against every node in the graph.
  • Fact-based Recommendations: By asking the model to explain a result retrieved from the subgraph, we relieve it of having to hold the answer in memory; it only has to compare, rephrase, translate, and so on.

Use cases:

  • Financial Information Inquiry: Use this solution to query your company’s financial information and make sense of it without consuming more resources than necessary.
  • Recommendation Systems: Offer the best response to your queries for product sales or service recommendations using embeddings.
  • Search Engines: Forget about having special filters to find the information you want from your database, and let an LLM search for you.

How can I use this technology?

  1. Convert your data source into a semantic graph in Neo4j.
  2. Make each node a concept and feed this concept with information that complements it, such as examples, descriptions, etc.
  3. Store an embedding that represents the node as a property on that node.
  4. Create a Vector Index on the field that represents the embedding within the node.
  5. Use an LLM to mediate between you and your graph.
  6. Make queries that call up the information inside your graph.
  7. Compare your query with the nodes and get the most similar one.
  8. Have your LLM explain the result to give it context.
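Here is a minimal sketch of steps 3 to 8, assuming the vector index from the previous sketch, sentence-transformers for the embeddings, and a hypothetical call_llm helper standing in for your model provider:

```python
# Minimal end-to-end sketch: embed the query, retrieve the most similar concept
# node through the vector index, and let the LLM explain it.
from neo4j import GraphDatabase
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def call_llm(prompt: str) -> str:
    """Placeholder for your LLM provider of choice."""
    raise NotImplementedError

def ask(question: str) -> str:
    query_embedding = model.encode(question).tolist()  # step 6: embed the query
    with driver.session() as session:
        record = session.run(
            "CALL db.index.vector.queryNodes('concept_embeddings', 1, $embedding) "
            "YIELD node, score "
            "RETURN node.name AS name, node.description AS description, score",
            embedding=query_embedding,
        ).single()  # step 7: the most similar node
    prompt = (
        "Explain the following concept to the user in plain language.\n"
        f"Concept: {record['name']}\n"
        f"Description: {record['description']}\n"
        f"User question: {question}"
    )
    return call_llm(prompt)  # step 8: the LLM explains the retrieved fact
```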

Here is my attempt at building one: a Code Search Engine with Neo4j.

Keep in mind that you will need to know how to create your database, how to constrain your interpretation model, how to restrict the queries it is allowed to run against the database, and how to build the workflow around it.

Therefore, it is always a good idea to approach a professional who can help you shape your idea.

Thank you so much for reading my post. If you got this far, please consider subscribing to my newsletter, sharing, commenting, or leaving a clap on the post. It helps us a lot, and it's a constant motivation to keep creating content like this.

We have a lot on our hands at the moment, but we love sharing content; your interaction is a good reminder that taking a moment to write helps others and is time well spent. Don't forget to check out our social media and our agency if you want us to help you build your business around AI.
