Microbial Knowledge Graph with BioCypher and SemSpect
Construct and analyze a Biolink-compatible graph
This article has been updated to reflect the newest changes in BioCypher 0.5.35.
A knowledge graph (KG) is a type of database that stores and organizes knowledge in a graph-like structure, where nodes represent entities and edges represent relationships between them. It is designed to capture complex relationships and dependencies among different pieces of information, enabling more effective search, retrieval, and analysis of data. Knowledge graphs are commonly used in biology and healthcare.
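To make the node-and-edge idea concrete, here is a minimal sketch in plain Python. It is not BioCypher's data model: the class names `Node` and `Edge`, the `HAS_PARENT` relationship type, and the `neighbors` helper are all assumptions chosen for illustration. The two NCBI Taxonomy identifiers are real (561 is the genus Escherichia, 562 is Escherichia coli).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """An entity in the graph (hypothetical structure, for illustration)."""
    id: str
    label: str
    name: str


@dataclass(frozen=True)
class Edge:
    """A directed, typed relationship between two node ids."""
    source: str
    target: str
    type: str


# Two entities, keyed by their identifiers.
nodes = {
    "NCBITaxon:561": Node("NCBITaxon:561", "Taxon", "Escherichia"),
    "NCBITaxon:562": Node("NCBITaxon:562", "Taxon", "Escherichia coli"),
}

# One relationship: the species points to its parent genus.
# "HAS_PARENT" is an illustrative relationship name, not a standard term.
edges = [Edge("NCBITaxon:562", "NCBITaxon:561", "HAS_PARENT")]


def neighbors(node_id: str, edge_type: str) -> list[Node]:
    """Return the nodes reached from node_id by following edges of edge_type."""
    return [
        nodes[e.target]
        for e in edges
        if e.source == node_id and e.type == edge_type
    ]


parents = neighbors("NCBITaxon:562", "HAS_PARENT")
print([n.name for n in parents])
```

A graph database such as Neo4j stores the same two kinds of objects, but adds indexing and a query language so that traversals like `neighbors` scale to millions of nodes.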
OpenAI’s GPT-3 and ChatGPT have made constructing and querying knowledge graphs a lot easier. Now, every biologist can build their own knowledge graph. It is no surprise that biological knowledge graphs are popping up like mushrooms on Medium, LinkedIn, and GitHub (1, 2, 3, 4, 5, 6, 7, and 8). But there are two problems.
First, many of these private knowledge graphs use non-standardized vocabulary. For example, a taxon can be labeled Taxon in one knowledge graph and NCBITaxon in another. As a result, it is difficult to merge knowledge graphs from different creators, and sometimes even graphs from the same creator across projects. So bioinformaticians usually create their graphs from scratch and often…