What is DBpedia, and why is it important?

DBpedia is a community project that creates and provides public access to critical structured data for what’s commonly referred to as the Linked Open Data Cloud. Data is published strictly in line with “Linked Open Data” principles that mandate the following:

  1. Entities are identified using hyperlinks (HTTP URIs).
  2. Entities are described using RDF sentences/statements in which subjects and predicates are identified by HTTP URIs, while objects may be identified by either an HTTP URI or a literal.
  3. Entity descriptions are published to HTTP networks (e.g., the World Wide Web) as RDF documents, in which the content from principle #2 above is serialized using any of a variety of formats (e.g., HTML, JSON-LD, RDF-Turtle, RDF-XML).
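
To make the three principles concrete, here is a minimal Python sketch of a single statement and its serialization. The class and function names (`IRI`, `Literal`, `Triple`, `to_ntriples`) are hypothetical helpers invented for this illustration, not part of any DBpedia tooling. The Berlin URIs are only sample identifiers. The output notation is N-Triples, one of several RDF notations, chosen here because it fits on one line.

```python
from typing import NamedTuple, Union


class IRI(NamedTuple):
    """An entity or relationship type identified by an HTTP URI."""
    value: str


class Literal(NamedTuple):
    """A plain string value; only valid in the object position."""
    value: str


class Triple(NamedTuple):
    """One RDF statement: subject -> predicate -> object."""
    subject: IRI
    predicate: IRI
    obj: Union[IRI, Literal]


def to_ntriples(triple: Triple) -> str:
    """Serialize one statement as a single N-Triples line."""
    def term(t: Union[IRI, Literal]) -> str:
        if isinstance(t, IRI):
            return f"<{t.value}>"
        # Escape the characters that would break a quoted literal.
        escaped = (
            t.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


# Illustrative statement: subject and predicate are HTTP URIs, object is a literal.
statement = Triple(
    IRI("http://dbpedia.org/resource/Berlin"),
    IRI("http://www.w3.org/2000/01/rdf-schema#label"),
    Literal("Berlin"),
)
print(to_ntriples(statement))
```

Swapping the `Literal` for another `IRI` in the object position turns the statement into a link between two entities, which is what makes the resulting graph web-like.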


The DBpedia project was officially launched in 2007 at the World Wide Web conference in Banff — when the Linked Open Data Cloud was only a tiny collection of bubbles coalescing around DBpedia.

Linked Open Data Cloud, circa 2007. Source: http://cs.smith.edu/dftwiki/images/e/e0/AboutDBPedia.jpg

Today, courtesy of the viral power of hyperlinks and eternal value of structured data, the Linked Open Data cloud has grown to a massive collection of web-accessible and web-like structured data represented as RDF Predicate/Property Graphs (or subject→predicate→object Triples).

Linked Open Data Cloud, circa 2016. Source: http://lod-cloud.net/versions/2014-08-30/lod-cloud.svg

The DBpedia project comprises three main areas:

  1. Structured Data Extractors & Transformers: these extract entities, entity relationship types, and entity relationships from Wikipedia documents.
  2. Deployment of Linked Open Data: this makes entity relationships available to the Web in Linked Open Data form. Entities and entity relationship types are identified using hyperlinks (HTTP URIs), which are then used to build web-like (or webby) entity relationship graphs from RDF sentences/statements, in a variety of notations and document content types.
  3. Live Web Query Services: these provide ad-hoc SPARQL Query Language access to structured data, with query results delivered as Relational Tables or Entity Relationship Graphs in a variety of document types/formats (e.g., HTML, JSON-LD, RDF-Turtle, RDF-XML, CSV, JSON, OData, and others).
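
As a rough illustration of the third area, the Python sketch below builds the URL for an ad-hoc SPARQL query over HTTP. It rests on assumptions not stated in this post: that the public endpoint lives at `https://dbpedia.org/sparql`, and that it accepts a `format` parameter alongside the standard `query` parameter to choose the result document type. The helper name `build_query_url` is hypothetical. The sketch only constructs the URL and sends nothing over the network.

```python
from urllib.parse import urlencode

# Assumption: address of the public DBpedia SPARQL endpoint (not given in this post).
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: fetch a few labels for one sample entity.
QUERY = """
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Berlin>
      <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 5
"""


def build_query_url(sparql: str,
                    result_format: str = "text/csv",
                    endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that carries the query and the desired result format.

    `query` is the parameter defined by the SPARQL Protocol; `format` is
    assumed to be understood by the endpoint for selecting the output type.
    """
    params = {"query": sparql, "format": result_format}
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Paste the printed URL into a browser or pass it to an HTTP client.
    print(build_query_url(QUERY))
```

A SELECT query like this one returns a relational table (here requested as CSV), whereas CONSTRUCT or DESCRIBE queries return an entity relationship graph.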

Why is DBpedia Important?

DBpedia complements Wikipedia by exposing Wikipedia knowledge in a form compatible with tools for ad-hoc structured data querying, business intelligence & analytics, entity extraction, natural language processing, reasoning & inference, machine learning services, and artificial intelligence in general.

By providing the critical kernel around which the Linked Open Data Cloud blossomed, DBpedia enabled that cloud to become today’s massive reference database that allows anyone to look up the description of an entity based on either its hyperlink-based identifier or its literal label.

DBpedia has been, and continues to operate as, a major focal point for research and expertise related to artificial intelligence (machine learning, natural language processing, and knowledge management).

Enterprises such as Apple (via Siri), Google (via Freebase and Google Knowledge Graph), and IBM (via Watson) have benefited immensely from DBpedia’s contributions, particularly in their high-visibility artificial intelligence projects.

DBpedia also remains a staple for academic pursuits in the areas of Information Architecture, Ontology Design, Artificial Intelligence, Machine Learning, Natural Language Processing, and more. It has provided fodder for experts driving projects in these areas at Google, Microsoft, Facebook, Oracle, IBM, Apple, and many others.

Echoing the international nature of Wikipedia, DBpedia has also spawned a number of country- and/or language-specific derivatives.

How can you help DBpedia?

Since its inception in 2007, the project has been run and financed on a voluntary basis.

More recently, the DBpedia Association has been created to enable the project to attract financial assistance, in the form of donations, from the broader public that benefits from its services.

Individuals and corporations can now make donations to DBpedia to aid its growth and to improve the quality of its services.


OpenLink Software Blog

Blog Publication Hub focused on Data Access, Integration, Flow, and Management Tech

Written by Kingsley Uyi Idehen

Founder & CEO, OpenLink Software — provider of Secure, High-Performance, and Cross-Platform Data Access, Integration, Virtualization, and Management Technology.
