Linked Data Benchmark Council (LDBC) & OpenLink Virtuoso
Thanks to their built-in support for storing and analyzing highly interconnected data, graph and RDF databases are proving themselves to be the most viable solution for developing applications driven by conceptual graphs of Linked Open Data.
Linked Data Benchmark Council (LDBC)
The Linked Data Benchmark Council (LDBC) is an organization that emerged from an EU FP7 ICT project which brought together a community of academic researchers and industry. The project's main objective was the development of industrial-strength benchmarks for NoSQL databases supporting Relational Property Graphs (RDF-based and non-RDF-based alike). Following completion of the project, the LDBC was established as an independent authority responsible for specifying benchmarks, handling benchmarking procedures, and verifying and publishing results for software systems designed to manage graph and RDF data.
Membership
The council was established by the following founding members:
- OpenLink Software
- Ontotext AD
- Neo Technology
- Sparsity Technologies
- STI International
- FORTH (Foundation for Research and Technology - Hellas)
- Peter Boncz
Benchmarks
The LDBC has developed two benchmarks so far: the Social Network Benchmark (SNB) and the Semantic Publishing Benchmark (SPB).
SNB is aimed at testing graph data management technologies for three scenarios:
- interactive: transactional query workload
- business intelligence: analytical query workload
- graph analytics: graph analysis algorithms, such as PageRank
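To make the graph analytics scenario more concrete, here is a minimal sketch of PageRank computed by power iteration over a small directed graph. It is an illustration only, not code from the SNB software: the `pagerank` function, the damping factor of 0.85, the iteration count, and the `knows` edge list are all assumptions chosen for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list (illustrative sketch)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        # Every node passes an equal share of its rank along each outgoing edge.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical "knows" edges between four people in a social graph.
knows = [("alice", "bob"), ("bob", "carol"), ("carol", "alice"), ("dave", "alice")]
ranks = pagerank(knows)
for person, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{person}: {score:.4f}")
```

A benchmarked system would run such an algorithm over the full generated data set, where data volume and graph structure rather than the algorithm itself dominate the cost.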
All the SNB workloads share a common scalable synthetic data set, produced by a state-of-the-art data generator. SNB is designed to be a plausible look-alike of all aspects of operating a social network site, as one of the most representative and relevant use cases of modern graph-like applications. Each workload produces a single metric for performance at a given scale, and a price/performance metric at the same scale. The full disclosure further breaks down the composition of the metric into its constituent parts, e.g., execution times for each exemplar query. Currently, the Interactive Workload has been released, while the other two workloads are under development. An extract from the audit results, presented in the following table, shows the superiority of Virtuoso over the other systems being tested.
The SNB specification contains a description of the benchmark, a detailed explanation of the data used in the whole LDBC-SNB benchmark, a detailed description for the complete Interactive Workload, and instructions on how to generate the data and run the benchmark with the provided software. All information about its software components can be found on the SNB developer page.
The Semantic Publishing Benchmark (SPB) is based on the BBC News website, and models a mixed workload of queries and updates with a limited amount of semantic inferencing. In particular, LDBC worked with the BBC (British Broadcasting Corporation) to define this benchmark, for which BBC donated workloads, ontologies, and data. SPB performance is measured by producing a workload of CRUD (Create, Read, Update, Delete) operations which are executed simultaneously. The benchmark offers a data generator which uses real reference data to produce datasets of various sizes, and tests the scalability aspect of RDF systems. The benchmark workload consists of:
- editorial operations that add new data, and alter or delete existing data
- aggregation operations that retrieve content according to various criteria
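The shape of such a mixed workload can be sketched as a set of concurrent agents working against one shared store. This is a toy illustration under stated assumptions, not the SPB driver: `InMemoryStore` stands in for an RDF database, and the agent counts, tag names, and operation counts are hypothetical values picked for the example.

```python
import random
import threading


class InMemoryStore:
    """Hypothetical stand-in for an RDF store: maps a work ID to a set of tags."""

    def __init__(self):
        self._works = {}
        self._lock = threading.Lock()

    def create(self, work_id, tags):
        with self._lock:
            self._works[work_id] = set(tags)

    def update(self, work_id, tag):
        with self._lock:
            if work_id in self._works:
                self._works[work_id].add(tag)

    def delete(self, work_id):
        with self._lock:
            self._works.pop(work_id, None)

    def find_by_tag(self, tag):
        with self._lock:
            return [w for w, tags in self._works.items() if tag in tags]


TAGS = ["politics", "sport", "science"]  # illustrative tag vocabulary


def editorial_agent(store, agent_id, n_ops, counts):
    """Adds new data and alters or deletes existing data."""
    rng = random.Random(agent_id)
    for _ in range(n_ops):
        work_id = f"work-{agent_id}-{rng.randrange(20)}"
        op = rng.choice(["create", "update", "delete"])
        if op == "create":
            store.create(work_id, [rng.choice(TAGS)])
        elif op == "update":
            store.update(work_id, rng.choice(TAGS))
        else:
            store.delete(work_id)
        counts[("editorial", agent_id)] += 1  # each agent writes only its own key


def aggregation_agent(store, agent_id, n_ops, counts):
    """Retrieves content according to a criterion (here: a tag)."""
    rng = random.Random(1000 + agent_id)
    for _ in range(n_ops):
        store.find_by_tag(rng.choice(TAGS))
        counts[("aggregation", agent_id)] += 1


def run_workload(n_editorial=2, n_aggregation=4, ops_per_agent=500):
    store = InMemoryStore()
    counts = {}
    threads = []
    agent_kinds = (("editorial", editorial_agent, n_editorial),
                   ("aggregation", aggregation_agent, n_aggregation))
    for kind, target, n_agents in agent_kinds:
        for agent_id in range(n_agents):
            counts[(kind, agent_id)] = 0
            threads.append(threading.Thread(
                target=target, args=(store, agent_id, ops_per_agent, counts)))
    # Start every agent before joining so both operation types run together.
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    totals = {"editorial": 0, "aggregation": 0}
    for (kind, _), completed in counts.items():
        totals[kind] += completed
    return totals


print(run_workload())
```

A real benchmark run would time the agents and report completed operations per unit of time for each operation type; the sketch uses fixed operation counts only to keep the example deterministic.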
Virtuoso was dominant on this benchmark as well.
The SPB specification contains the description of the benchmark and the data generator, and all information about its software components can be found on the SPB developer page.