Published in Wallscope

The Olympics: How to Build a Linked Data Application

Combining RDFox and Wallscope’s Platform

Photo from Unsplash - Edited by Felicity Mulford and Angus Addlesee

The Problem

The Planned Output

Athlete View
Sport View
Continent View

The Data

  • A knowledge graph about the Olympics (containing 1.8 million triples) created for a different use case and therefore requiring some refactoring.
  • A small tabular dataset that we transformed into linked data.
  • Reddit submissions (30k) posted during the Olympic Games, from London 2012 to PyeongChang 2018. This is our unstructured textual data.

Knowledge Graph

Colours are for display purposes only. The prefix “walls” represents <http://wallscope.co.uk/ontology/olympics/> in this case. All other prefixes can be found on prefix.cc
RDFox Graph Visualisation - coming with v4

Small Tabular Dataset

@prefix noc: <http://wallscope.co.uk/resource/olympics/NOC/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <https://schema.org/> .
noc:POR dbo:continent dbr:Europe .
dbr:Europe a schema:Continent ;
rdfs:label "Europe"@en .
To link two entities (NOC and continent) I needed a predicate. I typed “continent” into the predicate search and used the top result.
To find an appropriate entity type for our continent entities, I typed “continent” into the type search. As this article is read by humans, I figured the dbo result might cause confusion because it looks so similar to the predicate (the only difference is the capital C). I therefore chose the schema result.
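As a rough illustration of the transformation above, here is a minimal Python sketch that turns (NOC code, continent) rows into N-Triples using the same predicate and type as the Turtle snippet. The function name `rows_to_ntriples` and the two-column row layout are my own assumptions; only the POR/Europe row comes from this article.

```python
# Namespaces copied from the Turtle snippet above.
NOC = "http://wallscope.co.uk/resource/olympics/NOC/"
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SCHEMA = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_ntriples(rows):
    """Turn (NOC code, continent name) rows into N-Triples lines.

    Hypothetical helper: the real tabular dataset may have more columns.
    """
    lines = []
    seen_continents = set()
    for noc_code, continent in rows:
        continent_iri = DBR + continent.replace(" ", "_")
        # Link the NOC to its continent with dbo:continent.
        lines.append(f"<{NOC}{noc_code}> <{DBO}continent> <{continent_iri}> .")
        # Describe each continent only once.
        if continent not in seen_continents:
            seen_continents.add(continent)
            lines.append(f"<{continent_iri}> <{RDF_TYPE}> <{SCHEMA}Continent> .")
            lines.append(f'<{continent_iri}> <{RDFS}label> "{continent}"@en .')
    return lines


# Only this row appears in the article; the table layout is assumed.
rows = [("POR", "Europe")]
for line in rows_to_ntriples(rows):
    print(line)
```

Emitting N-Triples rather than prefixed Turtle keeps the sketch free of any RDF library; the three lines it prints are equivalent to the triples shown above.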

Reddit

Data Foundry is getting a new face! Here is a sneak preview of the prototype - designed by Dorota Burdach

The Reasoning

Example summary of all material purchases. Source

Rules in Practice

[?athlete, wso:athleteInGames, ?games]
:-
    [?instance, wso:athlete, ?athlete],
    [?instance, wso:games, ?games].

[?ath, wso:minAge, ?min]
:-
    AGGREGATE(
        [?ath, foaf:age, ?age]
        ON ?ath BIND MIN(?age) AS ?min ) .

[?ath, wso:earliestYear, ?min]
:-
    AGGREGATE(
        [?ath, wso:athleteInGames, ?g],
        [?g, dbp:year, ?y]
        ON ?ath BIND MIN(?y) AS ?min ) .

[?ath, wso:birthYear, ?by]
:-
    [?ath, wso:earliestYear, ?ey],
    [?ath, wso:minAge, ?age],
    BIND(?ey - ?age AS ?by) .

[?part, a, wso:Participation],
[?part, wso:hasAthlete, ?ath],
[?part, wso:hasGames, ?g],
[?part, wso:hasYear, ?y],
[?part, wso:hasAthleteAge, ?age],
[?part, wso:hasCountry, ?ctry]
:-
    [?ath, wso:athleteInGames, ?g],
    [?ath, wso:birthYear, ?by],
    [?ath, wso:hasCountry, ?ctry],
    [?ath, foaf:age, ?age],
    [?g, dbp:year, ?y],
    FILTER( ?age + ?by = ?y),
    BIND(IRI( CONCAT(STR(wsr:), "participation/",
        REPLACE(STR(?ath), STR(wsr:),""), "_",
        REPLACE(STR(?g), STR(wsr:),""))) AS ?part ) .

[?part, wso:medalsAtGames, ?ct]
:-
    AGGREGATE(
        [?part, wso:hasInstance, ?inst],
        [?inst, wso:medal, ?med]
        ON ?part
        BIND COUNT(?med) AS ?ct ) .

[?ath, wso:totalMedalCount, ?mc]
:-
    AGGREGATE(
        [?part, wso:hasAthlete, ?ath],
        [?part, wso:medalsAtGames, ?meds]
        ON ?ath
        BIND SUM(?meds) AS ?mc
    ) .

[?ath, wso:totalMedalCount, 0]
:-
    [?ath, a, foaf:Person],
    NOT EXIST ?meds, ?part IN (
        [?part, wso:hasAthlete, ?ath],
        [?part, wso:medalsAtGames, ?meds] ) .

[?year, wso:yearHasAverageMedals, ?avg]
:-
    AGGREGATE(
        [?ath, wso:birthYear, ?year],
        [?ath, wso:totalMedalCount, ?tot]
        ON ?year BIND AVG(?tot) AS ?avg
    ) .
Depiction of a hierarchy of rules (each node represents a rule). If data enters that matches the bottom right rule’s conditions, all the marked rules update the graph as required.
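To make the arithmetic behind the rule chain easier to follow, here is a minimal Python sketch of what the rules compute: birth year as the earliest Games year minus the youngest recorded age, a medal total that defaults to 0, and the average medal total per birth year. The flat `instances` tuples and all values in them are hypothetical, the intermediate Participation nodes are collapsed away, and plain Python recomputes everything from scratch rather than updating incrementally as RDFox does.

```python
from collections import defaultdict

# Hypothetical event instances:
# (athlete, Games year, age at those Games, medal or None).
instances = [
    ("athlete_a", 2012, 22, "Gold"),
    ("athlete_a", 2012, 22, None),
    ("athlete_a", 2016, 26, "Silver"),
    ("athlete_b", 2016, 30, None),
]


def birth_years(instances):
    """birthYear = earliestYear - minAge, as in the wso:birthYear rule."""
    min_age = {}
    earliest_year = {}
    for athlete, year, age, _medal in instances:
        min_age[athlete] = min(age, min_age.get(athlete, age))
        earliest_year[athlete] = min(year, earliest_year.get(athlete, year))
    return {a: earliest_year[a] - min_age[a] for a in min_age}


def total_medals(instances):
    """Count medals per athlete; athletes without medals keep a total of 0."""
    totals = {athlete: 0 for athlete, *_rest in instances}
    for athlete, _year, _age, medal in instances:
        if medal is not None:
            totals[athlete] += 1
    return totals


def average_medals_by_birth_year(instances):
    """Group athletes by birth year and average their medal totals."""
    years = birth_years(instances)
    totals = total_medals(instances)
    grouped = defaultdict(list)
    for athlete, year in years.items():
        grouped[year].append(totals[athlete])
    return {year: sum(vals) / len(vals) for year, vals in grouped.items()}


print(birth_years(instances))
print(total_medals(instances))
print(average_medals_by_birth_year(instances))
```

Initialising every athlete's total to 0 plays the role of the negation rule: without it, athletes who never won a medal would be missing from the per-year average and would skew it upwards.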

The Queries

Chart Queries

PREFIX wso: <http://wallscope.co.uk/ontology/olympics/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT (YEAR(?date) - ?birthYear AS ?age) ?avgMedalCount
WHERE {
BIND(xsd:dateTime(NOW()) AS ?date)
?birthYear wso:yearHasAverageMedals ?avgMedalCount .
}
ORDER BY ?age
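The query converts each birth year into a present-day age at query time, so the chart axis stays current without storing ages in the graph. A minimal Python sketch of the same calculation follows; the function name `medals_by_age` and the input dictionary are hypothetical stand-ins for the `wso:yearHasAverageMedals` triples.

```python
from datetime import datetime, timezone


def medals_by_age(avg_by_birth_year, now=None):
    """Return (age, average medal count) rows, ordered by age.

    `avg_by_birth_year` maps a birth year to an average medal count.
    """
    now = now or datetime.now(timezone.utc)
    rows = [(now.year - birth_year, avg)
            for birth_year, avg in avg_by_birth_year.items()]
    return sorted(rows, key=lambda row: row[0])


# Made-up values, with a fixed "now" so the result is reproducible.
fixed_now = datetime(2020, 6, 1, tzinfo=timezone.utc)
print(medals_by_age({1990: 2.0, 1986: 0.0}, now=fixed_now))
```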
RDFox Query Console
RDFox Faceted Search - Finding the average African female swimmer’s height in 2016.
PREFIX wso: <http://wallscope.co.uk/ontology/olympics/>
PREFIX wSport: <http://wallscope.co.uk/resource/olympics/sport/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT
(((AVG(?mWeight) + AVG(?fWeight))/2) AS ?avgWeight)
(((AVG(?mHeight) + AVG(?fHeight))/2) AS ?avgHeight)
(((AVG(?mAge) + AVG(?fAge))/2) AS ?avgAge)
WHERE {
?cis wso:continentInSportAverageMaleWeight ?mWeight ;
wso:continentInSportAverageMaleHeight ?mHeight ;
wso:continentInSportAverageMaleAge ?mAge ;
wso:continentInSportAverageFemaleWeight ?fWeight ;
wso:continentInSportAverageFemaleHeight ?fHeight ;
wso:continentInSportAverageFemaleAge ?fAge ;
wso:hasContinent ?continent ;
wso:hasSport ?sport .
# When user selects "Africa", ?continent is set to dbr:Africa.
# When user selects "Swimming", ?sport is set to wSport:Swimming.
}
A HiCCUP “recipe” for the query above. The chart can populate live but through a controlled endpoint.
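Note that the query averages the male and female averages rather than averaging over all athletes, so both sexes carry equal weight regardless of how many athletes of each competed. Here is a minimal Python sketch of that arithmetic, assuming hypothetical rows and field names; passing `None` for a filter mimics leaving `?continent` or `?sport` unbound.

```python
from statistics import mean

# Hypothetical "continent in sport" rows; the values are made up.
rows = [
    {"continent": "Africa", "sport": "Swimming",
     "male_height": 186.0, "female_height": 172.0},
    {"continent": "Europe", "sport": "Swimming",
     "male_height": 188.0, "female_height": 174.0},
]


def balanced_average(rows, continent, sport, measure):
    """Average the male and female averages of `measure` over matching rows."""
    selected = [
        r for r in rows
        if (continent is None or r["continent"] == continent)
        and (sport is None or r["sport"] == sport)
    ]
    if not selected:
        return None
    male = mean(r[f"male_{measure}"] for r in selected)
    female = mean(r[f"female_{measure}"] for r in selected)
    return (male + female) / 2


print(balanced_average(rows, "Africa", "Swimming", "height"))
print(balanced_average(rows, None, "Swimming", "height"))
```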
PREFIX wso: <http://wallscope.co.uk/ontology/olympics/>
PREFIX wSex: <http://wallscope.co.uk/resource/olympics/gender/>
PREFIX wSport: <http://wallscope.co.uk/resource/olympics/sport/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name ?mc
WHERE {
?instance wso:event ?event ;
wso:athlete ?athlete .
?event rdfs:subClassOf wSport:Swimming . # Swimming for example.
?athlete foaf:gender wSex:M ; # Switch "M" to "F" for female.
wso:totalMedalCount ?mc ;
rdfs:label ?name .
}
ORDER BY DESC(?mc)
LIMIT 5
Query results in RDFox’s query console.

News Queries

PREFIX dct: <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX walls: <http://wallscope.co.uk/ontology/>
CONSTRUCT{
?file a walls:File ;
dct:subject ?subs ;
schema:text ?text ;
schema:url ?url .
?subs rdfs:label ?match .
}
WHERE {
{SELECT ?file ?subs ?match ?text ?url
WHERE {
BIND(CONCAT(".*",CONCAT(replace("Michael Phelps"," ",".*"),".*")) as ?candidate)
?file dct:subject ?subs ; schema:text ?text ; schema:url ?url .
?subs rdfs:label ?match .
FILTER regex(lcase(str(?match)), lcase(str(?candidate)))
BIND(SHA512(CONCAT(str(?file), str(RAND()))) as ?random)
}
ORDER BY ?random
LIMIT 10
}
}
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wallscope.co.uk/ontology/File> .
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://purl.org/dc/terms/subject> <http://wallscope.co.uk/resource/cc00aafe-a54c-41ae-8ea0-a10570e493c9> .
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://schema.org/text> "Michael Phelps Wins Gold in Men's Swimming 200M Butterfly | Olympics 201...\n" .
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://schema.org/url> <https://www.reddit.com/r/olympics/comments/4x4hwi/michael_phelps_wins_gold_in_mens_swimming_200m/> .
<http://wallscope.co.uk/resource/cc00aafe-a54c-41ae-8ea0-a10570e493c9> <http://www.w3.org/2000/01/rdf-schema#label> "Michael Phelps Wins Gold" .
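The matching and sampling logic in the first news query can be sketched in Python as below. This is only an approximation of the SPARQL: the function names are hypothetical, and Python's `re` and `random` modules stand in for `regex` and `RAND()`.

```python
import hashlib
import random
import re


def build_candidate(name):
    """Mirror CONCAT(".*", replace(name, " ", ".*"), ".*")."""
    return ".*" + name.replace(" ", ".*") + ".*"


def matches(label, name):
    """Case-insensitive match, as with lcase() on both sides in the query."""
    return re.search(build_candidate(name).lower(), label.lower()) is not None


def sample_files(files, limit=10, rng=random):
    """Order files by a hash of (file + random number), then keep `limit`."""
    keyed = [
        (hashlib.sha512(f"{f}{rng.random()}".encode()).hexdigest(), f)
        for f in files
    ]
    return [f for _key, f in sorted(keyed)[:limit]]


print(build_candidate("Michael Phelps"))
print(matches("Michael Phelps Wins Gold", "Michael Phelps"))
```

Because SPARQL's `regex` already matches anywhere inside the string, the leading and trailing `.*` are harmless but not strictly needed; the `.*` between the words is what lets "Michael Phelps" also match labels containing a middle name.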
PREFIX dct: <http://purl.org/dc/terms/>
CONSTRUCT { <file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> dct:subject ?subs }
WHERE {
VALUES (?type) {
(<http://wallscope.co.uk/ontology/nlp/PERSON>)
(<http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing>)
(<http://schema.org/Organization>)
(<http://www.w3.org/2004/02/skos/core#Concept>)
}
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> dct:subject ?subs .
?subs a ?type .
}
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://purl.org/dc/terms/subject> <http://wallscope.co.uk/resource/cc00aafe-a54c-41ae-8ea0-a10570e493c9> .
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://purl.org/dc/terms/subject> <http://wallscope.co.uk/resource/baede3ee-9471-494d-8d11-3544b53e1067> .
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT {
?mention dct:relation ?s .
?s rdfs:label ?name .
}
WHERE {
BIND(<http://wallscope.co.uk/resource/baede3ee-9471-494d-8d11-3544b53e1067> as ?mention)
VALUES (?types) {
(<http://xmlns.com/foaf/0.1/Person>)
(<https://schema.org/Continent>)
(<http://dbpedia.org/ontology/Sport>)
}
?s a ?types ; rdfs:label ?name .
?mention rdfs:label ?mentionLabel .
BIND(replace(str(?mentionLabel)," ",".*") as ?candidate)
FILTER regex(?name, ?candidate)
}
LIMIT 1
<http://wallscope.co.uk/resource/baede3ee-9471-494d-8d11-3544b53e1067> <http://purl.org/dc/terms/relation> <http://wallscope.co.uk/resource/olympics/sport/Swimming> .
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://purl.org/dc/terms/subject> <http://wallscope.co.uk/resource/olympics/sport/Swimming> .
<file://DF/reddit/results-sm/olympic-rs-2016-08-4420.txt> <http://purl.org/dc/terms/subject> <http://wallscope.co.uk/resource/olympics/MichaelFredPhelpsII> .
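The last linking step, from an extracted mention to an entity in the Olympics graph, can be sketched in Python as follows. The two entity IRIs are taken from the output above, but their labels and the mention labels are my assumptions, as is the function name `link_mention`.

```python
import re

# IRIs from the query output above; the labels are assumed for illustration.
entities = [
    ("http://wallscope.co.uk/resource/olympics/sport/Swimming", "Swimming"),
    ("http://wallscope.co.uk/resource/olympics/MichaelFredPhelpsII",
     "Michael Fred Phelps, II"),
]


def link_mention(mention_label, entities):
    """Return the IRI of the first entity whose name matches the mention.

    Spaces become ".*" so a mention can skip words such as middle names.
    The match is case-sensitive, as in the query (no lcase() is applied).
    """
    pattern = mention_label.replace(" ", ".*")
    for iri, name in entities:
        if re.search(pattern, name):
            return iri
    return None


print(link_mention("Michael Phelps", entities))
print(link_mention("Swimming", entities))
```

Two caveats carry over from the query: the mention label is used as a regular expression without escaping, so labels containing special characters could misbehave, and `LIMIT 1` without an `ORDER BY` returns an arbitrary match, whereas this sketch simply takes the first in list order.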
Screenshot from dashboard interface.

The Dashboard

Athlete View - Allyson Felix
Sport View - Athletics
Continent View - Africa
Continent View - North America
Athlete View - Jesse Owens

The Conclusion

If you want to discuss how you could benefit from anything discussed in this article, please feel free to get in touch with us here or here.

Angus Addlesee

Research Associate at Heriot-Watt University. Studying a PhD in Artificial Intelligence. Contact details at http://addlesee.co.uk/