Using Artificial Intelligence to get from Open Data to Linked Open Data — Part 1
AI (Artificial Intelligence) is fundamentally about Software Agents (and the machines they drive) being able to perform Reasoning & Inference against data from a variety of sources in varying circumstances.
The city of Boston has recently published a vast collection of Open Data. One of these datasets is about the locations of electric car charging stations.
Challenge
One significant challenge is that the fields (a/k/a attributes, relationship types, or relations) used to construct longitude and latitude data have identifiers that are too tightly scoped to the CSV document in which they were originally published.
Solution
Produce a Linked Open Data rendition of the data without changing the source data, and in a way that keeps reflecting changes made to the source over time.
Linked Open Data implications
- Every entity (including entity relationship types) is identified using an HTTP URI (hyperlink)
- Entity descriptions take the form of a collection of RDF Language sentences/statements
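To make these two implications concrete, here is a minimal Python sketch that turns one CSV row into RDF statements in N-Triples notation. The dataset URL, the `#class` type, and the `<dataset>#<FieldName>` pattern for field identifiers come from the query later in this post. The sample row and the `#row1` style of entity identifier are assumptions made purely for illustration; the identifiers actually minted by the Sponger may differ.

```python
import csv
import io

# Dataset URL as used in the query in this post; fields are identified as <dataset>#<FieldName>.
DATASET = "http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Made-up sample row for illustration only (not real station data).
SAMPLE_CSV = "Station_Name,City,Latitude,Longitude\nExample Garage,Boston,42.0,-71.0\n"


def row_to_ntriples(row_number, row):
    """Return one N-Triples statement per field of a CSV row, plus a type statement."""
    # Assumption: the entity is identified by a row-number fragment on the dataset URL.
    subject = f"<{DATASET}#row{row_number}>"
    statements = [f"{subject} <{RDF_TYPE}> <{DATASET}#class> ."]
    for field, value in row.items():
        # Escape backslashes and double quotes so the literal stays valid.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        statements.append(f'{subject} <{DATASET}#{field}> "{escaped}" .')
    return statements


def csv_to_ntriples(text):
    """Convert CSV text (with a header line) into a list of N-Triples statements."""
    statements = []
    for number, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        statements.extend(row_to_ntriples(number, row))
    return statements


if __name__ == "__main__":
    print("\n".join(csv_to_ntriples(SAMPLE_CSV)))
```

Every subject and predicate in the output is an HTTP URI, and the description of the station is simply the collection of statements that share its subject.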
How?
By producing a set of built-in and/or custom inference rules for an RDF-aware Software Agent (e.g., Virtuoso) that enables it to generate the change-sensitive Linked Open Data rendition.
Note: in this post, built-in inference rules are the key to the solution; custom inference rules will be addressed in part 2, demonstrating an alternative approach.
Steps using Built-In Inference Rules
- Create a rule using rdfs:subPropertyOf relations that map the local field names (relations, attributes, properties, predicates) for latitude and longitude to the geo:lat and geo:long terms from the Geo Ontology.
- Ingest the electric car charging station data using a live instance of the Virtuoso Sponger (Linked Open Data Middleware service).
- Produce Linked Open Data using a SPARQL Query (leveraging the Inference Rule pragma) against the electric car charging station dataset.
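The effect of the rule in the first step can be sketched in a few lines of Python. This is not how Virtuoso implements inference; it is a simplified illustration that materializes the statements entailed by rdfs:subPropertyOf over an in-memory set of triples. The station identifier (`#row1`) and the coordinate values are hypothetical.

```python
DATASET = "http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
SUB_PROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

# Rule statements: map the CSV-scoped fields to Geo Ontology terms.
RULES = {
    (f"{DATASET}#Latitude", SUB_PROPERTY_OF, f"{GEO}lat"),
    (f"{DATASET}#Longitude", SUB_PROPERTY_OF, f"{GEO}long"),
}

# Hypothetical station and values, for illustration only.
STATION = f"{DATASET}#row1"
DATA = {
    (STATION, f"{DATASET}#Latitude", "42.0"),
    (STATION, f"{DATASET}#Longitude", "-71.0"),
}


def infer_sub_properties(data, rules):
    """Return the data plus every triple entailed by the rdfs:subPropertyOf rules."""
    # Map each property to its direct super-properties.
    supers = {}
    for sub, predicate, sup in rules:
        if predicate == SUB_PROPERTY_OF:
            supers.setdefault(sub, set()).add(sup)

    inferred = set(data)
    pending = list(data)
    while pending:
        s, p, o = pending.pop()
        for sup in supers.get(p, ()):
            triple = (s, sup, o)
            if triple not in inferred:
                inferred.add(triple)
                # Re-queue so chains of sub-properties are followed too.
                pending.append(triple)
    return inferred


def match_predicate(triples, predicate):
    """Return sorted (subject, object) pairs for triples using the given predicate."""
    return sorted((s, o) for s, p, o in triples if p == predicate)


if __name__ == "__main__":
    print("without inference:", match_predicate(DATA, f"{GEO}lat"))
    print("with inference:   ", match_predicate(infer_sub_properties(DATA, RULES), f"{GEO}lat"))
```

Without the rule, a pattern on geo:lat matches nothing because the source data only uses the CSV-scoped field identifiers; with the rule applied, the same pattern finds the station. The source data itself is never modified, which is the point of the approach.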
SPARQL Query with Built-In Inference Rule Enabled
## Context Rule for Built-In Reasoning & Inference
## See: http://kingsley.idehen.net/DAV/home/kidehen/Public/SPASQL/built-in-inference-rules/geo-data.sql
DEFINE input:inference "urn:geospatial:cleanup:inference:rules"
DEFINE get:soft "soft"
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT DISTINCT
?s1 AS ?webid
?s3 AS ?latitude
?s2 AS ?longitude
?s4 AS ?name
?s8 AS ?address
?s6 AS ?fuelType
?s5 AS ?city
FROM <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv>
WHERE {
?s1 a <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#class> .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#Longitude> ?s2 .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#Latitude> ?s3 .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#Station_Name> ?s4 .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#City> ?s5 .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#Fuel_Type_Code> ?s6 .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#Status_Code> ?s7 .
?s1 <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv#Street_Address> ?s8 .
?s1 geo:lat ?lat1 ;
geo:long ?lng1 .
}
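For readers who want to submit a query like this programmatically, the following Python sketch builds request URLs for a SPARQL endpoint with and without the inference pragma. The endpoint address is an assumption (a local Virtuoso instance on its default HTTP port); the helper names and the shortened query body are hypothetical. The rule name is the one used in the query above. No request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a Virtuoso instance exposing its SPARQL endpoint at this address.
ENDPOINT = "http://localhost:8890/sparql"
# Inference rule name, as used in the query in this post.
RULE_NAME = "urn:geospatial:cleanup:inference:rules"
DATASET = "http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv"

# Shortened version of the query: only the patterns that depend on inference.
QUERY_BODY = f"""PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT DISTINCT ?s ?lat ?long
FROM <{DATASET}>
WHERE {{ ?s geo:lat ?lat ; geo:long ?long . }}"""


def build_query(with_inference):
    """Return the query text, optionally prefixed with the inference pragma."""
    pragma = f'DEFINE input:inference "{RULE_NAME}"\n' if with_inference else ""
    return pragma + QUERY_BODY


def build_request_url(with_inference, result_format="text/csv"):
    """Return a GET URL carrying the query and the desired result format."""
    params = {"query": build_query(with_inference), "format": result_format}
    return f"{ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_request_url(with_inference=True))
    print(build_request_url(with_inference=False))
```

The only difference between the two URLs is the `DEFINE input:inference` line at the start of the query text, which mirrors the two live links that follow: one with the reasoning context enabled and one without.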
Live SPARQL Query Links
- Query Results with Reasoning & Inference Context enabled
- Query Results without Reasoning & Inference enabled — basically an empty page.
Screenshots from Faceted Browsing Service #1
Screenshots from Faceted Browsing Service #2
Screenshot from our HTML5-based PivotViewer
Live Demo Links
- Virtuoso Faceted Browser Page
- PivotViewer Page
- RESTful Virtuoso ETL Service Import Page (a/k/a Sponger Middleware for Linked Data Transformation)
Related
- Demonstration of built-in reasoning and inference — https://medium.com/virtuoso-blog/using-british-royal-family-data-snippets-to-demonstrate-sparql-query-language-based-reasoning-56626a152419
- Demonstration of custom inference and reasoning (using Virtuoso Macro Language) — https://www.linkedin.com/pulse/reasoning-inference-using-british-royal-family-part-idehen
- Inference rule definitions referenced by the query above: http://kingsley.idehen.net/DAV/home/kidehen/Public/SPASQL/built-in-inference-rules/geo-data.sql