Semantic Search and Knowledge Graphs

Naresh Ganesan
Everything is Connected
4 min read · May 13, 2023
Search, the knowledge graph way
source: https://unsplash.com/photos/fmCr42xCLtk

In the era of Artificial Intelligence, data is considered the new oil. Since 2010, data generation and consumption have grown exponentially, and people’s lifestyle preferences have been highly influenced by data. To cater to current lifestyles, businesses have to make better sense of exploding high-dimensional data, which has led to creative ways of automating both the management and the analysis of data. Traditional data modelling and management techniques have been slow to catch up with high-dimensional data. Recently, many businesses have started employing graph-based storage and knowledge graph-based intelligence systems, mainly because of their flexibility in handling high-dimensional data use cases.

Now that the stage is set for high-dimensional, data-driven applications, we will see how this “high-dimensional” aspect of data influences one of the crucial components of business: “Search”. The meaning of Search has evolved steadily over time. To name a few of the search techniques that have been used:

  1. Keyword Retrieval
  2. PageRank
  3. Personalization
  4. Natural Language Processing (user intent query)
  5. Knowledge Graph (connected data)

Search Landscape

Today, Search encompasses a variety of activities: web search (information), e-commerce search (products and reviews), social media search (people, images, and video), entertainment search (movies and songs), map search (locations), and so on. This makes clear the variety and the high-dimensional, connected nature of the data against which search is applied.

Search
source: https://unsplash.com/photos/f74kZNWhfps

Search has grown beyond finding information on the web to discovering new things and making connections between people, concepts, etc. This connected nature of the world, with its remarkable hidden patterns, requires a data representation that supports complex relationships and also evolves as we discover new relationships over time. This sets the context for Knowledge Graphs, which are good at representing highly interconnected data.

Knowledge Graph

A Knowledge Graph is a type of knowledge representation that captures knowledge as a graph of nodes and edges, where nodes represent entities or concepts, and edges represent the relationships between them. Every node and edge may have additional metadata about the entity or relationship it represents.
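The definition above can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not how any particular graph database stores data; the class names (`Node`, `Edge`, `KnowledgeGraph`), the relation names, and the sample entities are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity or concept, with free-form metadata."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, typed relationship between two nodes, with metadata."""
    source: str
    relation: str
    target: str
    properties: dict = field(default_factory=dict)


class KnowledgeGraph:
    def __init__(self):
        self.nodes = {}  # node_id -> Node
        self.edges = []  # list of Edge, in insertion order

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, relation, target, **properties):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(source, relation, target, properties))

    def neighbours(self, node_id, relation=None):
        """Return ids of nodes reachable over one outgoing edge."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (relation is None or e.relation == relation)
        ]


# Hypothetical sample data.
kg = KnowledgeGraph()
kg.add_node("alice", "Person", name="Alice")
kg.add_node("acme", "Company", name="Acme Corp")
kg.add_node("python", "Skill", name="Python")
kg.add_edge("alice", "WORKS_AT", "acme", since=2021)  # metadata on the edge
kg.add_edge("alice", "HAS_SKILL", "python")

print(kg.neighbours("alice"))              # ['acme', 'python']
print(kg.neighbours("alice", "WORKS_AT"))  # ['acme']
```

Keeping metadata in a plain dictionary on both nodes and edges is what lets the schema evolve: a new property or relation type can be added without changing the structure.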

source: https://arxiv.org/pdf/2002.00388.pdf

Let’s understand the features that make a Knowledge Graph one of the best forms of data representation for uncovering hidden patterns and relationships in evolving data.

Semantic — data (Representation and Retrieval)

Semantic Representation

The core essence of a Knowledge Graph is its data representation. Data is represented by encoding the meaning of terms and concepts in a structured and standardised way. The representation is only as good as our knowledge of the concepts, domain, or world being modelled. There are many ways to build a semantic graph, for example:

  1. Handcrafted modelling
  2. NLP techniques: named entity recognition (NER), entity linking, dependency parsing, part-of-speech (POS) tagging, etc.

Semantic Retrieval

The meaning of the query is used to identify entities or concepts, which are then used to retrieve relevant information. This is an active area of research. Some of the techniques in use:

  1. Pattern-based query generation: This technique involves using pre-defined patterns or templates to generate queries based on the user’s search query. For example, if a user searches for “actors who starred in action movies,” a pattern-based query generator might use a pre-defined pattern to generate a query that retrieves all actors who have starred in action movies.
  2. Query expansion: This technique involves expanding the initial query to include additional concepts and entities that are related to the user’s search query. This can help to retrieve more relevant information from the knowledge graph.
  3. Entity and relationship extraction: This technique involves using NLP algorithms to extract entities and relationships from the user’s search query and using them to generate a structured query that captures the user’s intent.
  4. Natural language processing (NLP)-based query generation: This technique involves using NLP algorithms to analyse the user’s search query and generate a structured query that captures the user’s intent. NLP-based query generators can handle more complex queries that involve multiple entities and relationships.
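As a rough illustration of the first technique in the list, pattern-based query generation, the sketch below maps the “actors who starred in action movies” example onto a structured query over a toy set of triples. Everything here is an assumption made for illustration: the template format, the relation names `ACTED_IN` and `HAS_GENRE`, the helper names, and the made-up actors and movies. A production system would more likely emit a query in a graph query language than evaluate patterns by hand.

```python
import re

# Toy graph as (subject, relation, object) triples; all names are made up.
TRIPLES = [
    ("Asha Rao", "ACTED_IN", "Night Chase"),
    ("Asha Rao", "ACTED_IN", "Steel Harbor"),
    ("Ben Ortiz", "ACTED_IN", "Paper Hearts"),
    ("Chen Li", "ACTED_IN", "Steel Harbor"),
    ("Night Chase", "HAS_GENRE", "action"),
    ("Steel Harbor", "HAS_GENRE", "action"),
    ("Paper Hearts", "HAS_GENRE", "romance"),
]

# Each template pairs a regex over the user's text with triple patterns.
# Terms starting with "?" are variables; "{genre}" is filled from the match.
TEMPLATES = [
    (
        re.compile(r"actors who starred in (?P<genre>\w+) movies", re.IGNORECASE),
        [("?actor", "ACTED_IN", "?movie"), ("?movie", "HAS_GENRE", "{genre}")],
        "?actor",
    ),
]


def generate_query(text):
    """Return (triple_patterns, answer_variable) for the first matching template."""
    for regex, patterns, answer in TEMPLATES:
        match = regex.search(text)
        if match:
            slots = {k: v.lower() for k, v in match.groupdict().items()}
            filled = [tuple(term.format(**slots) for term in p) for p in patterns]
            return filled, answer
    return None


def run_query(patterns, answer, triples):
    """Evaluate the patterns by extending variable bindings one pattern at a time."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or reject on a conflicting binding.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        bindings = extended
    return sorted({b[answer] for b in bindings})


query = generate_query("actors who starred in action movies")
print(query)
print(run_query(*query, TRIPLES))
```

The template list is the whole "intelligence" of this approach: it is predictable and easy to debug, but any phrasing that no template covers returns `None`, which is where the NLP-based techniques in the list come in.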

Advantages of Semantic search:

  1. Context-aware: it is based on the meaning of the concept, entity, or relation
  2. Personalised: it is based on the user’s intent
  3. Disambiguating: it can clearly differentiate between concepts
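To make the disambiguation point concrete, here is a deliberately simple sketch that picks between two entities sharing the name “jaguar” by counting how many query words overlap with each entity's context. The candidate ids, the context word sets, and the `disambiguate` helper are all hypothetical; in a real graph the context would more plausibly be derived from each entity's neighbouring nodes, and the scoring would be far more robust than word overlap.

```python
# Hypothetical candidate entities for an ambiguous mention, each with a
# small set of context words standing in for its graph neighbourhood.
CANDIDATES = {
    "jaguar": [
        {"id": "Jaguar_(animal)", "context": {"cat", "animal", "jungle", "predator", "habitat"}},
        {"id": "Jaguar_(car)", "context": {"car", "vehicle", "engine", "price", "dealer"}},
    ],
}


def disambiguate(mention, query):
    """Return the candidate id whose context shares the most words with the query."""
    candidates = CANDIDATES.get(mention.lower(), [])
    if not candidates:
        return None
    tokens = set(query.lower().split())
    # On a tie, max() keeps the first candidate in the list.
    best = max(candidates, key=lambda c: len(c["context"] & tokens))
    return best["id"]


print(disambiguate("jaguar", "jaguar engine price"))
print(disambiguate("jaguar", "where is the jaguar habitat"))
```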

Survey of Knowledge Graph research

source: Categorization of research on knowledge graphs, https://arxiv.org/pdf/2002.00388.pdf

Knowledge Graph reasoning

At a basic level, knowledge graphs can support reasoning-style use cases, uncovering hidden or implicit patterns using more advanced techniques. A survey of the different reasoning techniques explored in the community is included in the references.
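One of the simplest ways to surface an implicit relationship is rule-based inference. The sketch below applies a single transitivity rule to a `LOCATED_IN` relation until no new facts appear. It is only an illustration under assumed names (`infer_transitive`, `LOCATED_IN`, `HAS_TYPE`) and is not drawn from the survey, which covers a much wider range of approaches.

```python
def infer_transitive(triples, relation):
    """Forward-chain the rule (a r b), (b r c) => (a r c) until nothing new appears.

    Returns only the inferred triples, not the ones given as input.
    """
    known = set(triples)
    while True:
        new = {
            (a, relation, d)
            for (a, r1, b) in known if r1 == relation
            for (c, r2, d) in known if r2 == relation and b == c
        } - known
        if not new:
            return known - set(triples)
        known |= new


facts = [
    ("Chennai", "LOCATED_IN", "Tamil Nadu"),
    ("Tamil Nadu", "LOCATED_IN", "India"),
    ("India", "LOCATED_IN", "Asia"),
    ("Chennai", "HAS_TYPE", "City"),  # ignored by the rule: different relation
]

for triple in sorted(infer_transitive(facts, "LOCATED_IN")):
    print(triple)
```

None of the three inferred facts (Chennai in India, Chennai in Asia, Tamil Nadu in Asia) is stored explicitly, yet a search for “cities in Asia” could now reach Chennai.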

Next, we will cover the process of building a semantic Knowledge Graph from multiple sources, and how different NLP techniques can be applied in our data augmentation and search process.

References:

Data growth

https://onlinelibrary.wiley.com/doi/full/10.1002/aaai.12033

Data Landscape

https://searchengineland.com/modern-search-landscape-how-where-reach-target-audience-388290

Knowledge graph example from the industry

https://engineering.linkedin.com/blog/2016/10/building-the-linkedin-knowledge-graph

https://www.statista.com/statistics/871513/worldwide-data-created/

https://www.theguardian.com/technology/2013/jan/19/google-search-knowledge-graph-singhal-interview

How to build a KG

https://www.stardog.com/blog/how-to-build-a-semantic-search-engine-using-a-knowledge-graph/

https://www.quora.com/What-algorithms-are-behind-Googles-Knowledge-Graph

Reasoning

https://www.sciencedirect.com/science/article/pii/S1674862X2200012X

adhoc

https://support.google.com/knowledgepanel/answer/9787176?hl=en#:~:text=Facts%20in%20the%20Knowledge%20Graph,stock%20prices%2C%20and%20weather%20forecasts.

https://arxiv.org/pdf/2002.00388.pdf

Naresh Ganesan
Everything is Connected

Software professional with a passion for solving real-world challenges — Currently, building an AI platform to model any business domain @ Jio