Dataspaces have arrived
It took less than 10 years to deliver on Alon Halevy’s vision.
Michael Franklin, Alon Halevy and David Maier introduced the idea of dataspaces based on the observation that data management solutions can be understood along two dimensions: administrative proximity and semantic integration.
In their seminal 2005 ACM SIGMOD paper they therefore proposed the term dataspace as an extension of the traditional notion of a database. The paper also shows an example dataspace.
While some of the components (looking at you, XML and WSDL) are arguably outdated and may cause eye cancer, you get the idea, right?
Guess what? In 2013, less than 10 years after their paper, dataspaces have arrived in the form of the Hadoop ecosystem. We are now in a position to design and deploy dataspaces, addressing a variety of data sources and data formats with a range of ‘schema-awareness’, from strongly typed RDBMS tables through JSON to plain text. We can now query, manipulate and manage the data sources and integrate them in a true pay-as-you-go approach.
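To make the pay-as-you-go idea a bit more tangible, here is a minimal sketch in plain Python. It is not Hadoop code. The three tiny sources, their contents and the function names `keyword_search` and `customers_in` are all made up for illustration. The point is only the shape: a schema-agnostic keyword search works across everything from day one, and structured queries become available wherever a schema happens to be known.

```python
import json
import sqlite3

# Three hypothetical sources with decreasing schema-awareness.

# 1. Strongly typed: a relational table (in-memory SQLite as a stand-in).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
db.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Alice", "Galway"), (2, "Bob", "Dublin")],
)

# 2. Semi-structured: JSON documents whose fields may vary per record.
json_docs = [json.loads(s) for s in (
    '{"user": "alice", "event": "login", "city": "Galway"}',
    '{"user": "bob", "event": "purchase"}',
)]

# 3. Unstructured: plain text.
text_log = (
    "2013-05-01 alice opened ticket from Galway\n"
    "2013-05-02 carol closed ticket"
)


def keyword_search(term):
    """Baseline service: case-insensitive keyword search over every source."""
    term = term.lower()
    hits = []
    for row in db.execute("SELECT id, name, city FROM customers"):
        if any(term in str(value).lower() for value in row):
            hits.append(("rdbms", row))
    for doc in json_docs:
        if any(term in str(value).lower() for value in doc.values()):
            hits.append(("json", doc))
    for line in text_log.splitlines():
        if term in line.lower():
            hits.append(("text", line))
    return hits


def customers_in(city):
    """Structured query, possible only where a schema is known."""
    rows = db.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    for source, hit in keyword_search("Galway"):
        print(source, hit)
    print(customers_in("Galway"))
```

In a real deployment the same roles would be played by components of the Hadoop ecosystem rather than in-process Python objects. The sketch just shows why you do not need to agree on one global schema before getting value out of the data.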
#polyglotpersistence #lambdaarchitecture