Linked Open Data Accelerates Citizen Data Science
It’s been a while since the LOD Challenge 2021 awards symposium was held in March 2022, but I would like to share some highlights about the event, our activities, and the recent status of LOD usage in Japan.
LOD stands for Linked Open Data, which was proposed as the most effective method for public institutions and companies to publish open data and facilitate its wider usage. The LOD Challenge has been held annually since 2011 by the local semantic web community to promote the spread of the technology.
During that time, the importance of open data has become widely recognized, and the amount of open data on the web has increased significantly. Open data is no longer just a way for a few research institutions to share scientific journals and life science data. Open data initiatives have become essential for governmental organizations in many countries, and citizens are actually accessing that data to solve social problems.
At the same time, we have been improving technical methods for sharing data on the web. Publishing data in machine-readable formats, rather than just putting it on a website as a document file, makes it easier to analyze the data and build applications. The 5-star deployment scheme for open data has become widely known as a guideline, even though it may not always be a mandatory requirement for data publishers.
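To make the step from a machine-readable file to linked data more concrete, here is a minimal sketch in Python. It assumes a small CSV table of populations and turns each row into N-Triples, giving every record a URI and, where available, a link to an external resource. The column names, the `example.org` and `example.com` URIs, and the helper functions are all hypothetical and chosen only for illustration; real publishers would use their own identifiers and vocabularies.

```python
import csv
import io

# Hypothetical URIs, used only for illustration.
BASE = "http://example.org/city/"
POPULATION = "http://example.org/property/population"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# A tiny machine-readable table (made-up values).
SAMPLE_CSV = """code,population,same_as
A001,12345,http://example.com/resource/CityA
A002,678,
"""


def row_to_ntriples(row):
    """Turn one CSV row (a dict) into a list of N-Triples lines."""
    subject = "<" + BASE + row["code"] + ">"
    population = int(row["population"])
    triples = [
        f'{subject} <{POPULATION}> "{population}"^^<{XSD_INTEGER}> .'
    ]
    # Linking to other data is what the fifth star of the scheme asks for.
    if row.get("same_as"):
        triples.append(f"{subject} <{OWL_SAME_AS}> <{row['same_as']}> .")
    return triples


def csv_to_ntriples(text):
    """Convert a whole CSV document into N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(row_to_ntriples(row))
    return lines


if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV):
        print(line)
```

In this sketch the CSV alone would correspond to the third star, the URIs to the fourth, and the `owl:sameAs` link to the fifth.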
Today, many public institutions, such as statistical centers and cultural agencies, are actively providing LOD. With these trusted data sources at the core, other open data, such as maps, content from media companies, and asset information of museums and libraries, have been linked.
The National Statistics Center provides an API for “Statistical LOD”, so users can write programs to analyze data and create applications. Such users are called “citizen data scientists”, and they will play an important role in solving social issues in future societies with higher computer literacy.
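As a rough idea of what such a program might look like, the following Python sketch sends a query to a SPARQL endpoint over HTTP and flattens the JSON results into plain rows. It is not the actual Statistical LOD API: the endpoint URL is a placeholder, the query is a generic one that works on any dataset, and the function names are made up for this example. It only assumes that the endpoint follows the standard SPARQL protocol and can return results as JSON.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint URL (an assumption); replace it with a real SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"

# A generic query that lists a few triples from whatever dataset is served.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return the rows (requires network access)."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        return bindings_to_rows(json.load(response))
```

Calling `run_query(ENDPOINT, QUERY)` against a real endpoint would return up to five rows, each a dictionary with the keys `s`, `p`, and `o`, which can then be passed on to an analysis or visualization step.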
In the LOD Challenge 2021, we found many submissions utilizing existing open data. For example, the “Population Pyramider”, using the Statistical LOD, can immediately visualize changes in the population of your city, town, or village. Once data is widely used, more data will be published and maintained, creating a data-sharing ecosystem and cultivating an open data culture.
Oracle was pleased to participate in the LOD Challenge as a sponsor. As a technology partner, we also presented a seminar on how to build a SPARQL endpoint (a server for publishing LOD) using the always-free tier of Oracle Cloud. We offered the attendees a cloud account that does not require a credit card so that even students can have their own servers, for free, without time limitations.
The Oracle Award was given to the submission “Course of Study LOD.” The judges rated the value and quality of this dataset highly, and the work also won the event's Grand Prize.
Here is an excerpt from the judges’ evaluation:
“This project contributes to the construction of the model for the national study courses, which will have a wide range of uses. It also shows a careful and creative method for LOD creation, with components such as the HTML page for each resource URI, English summary page, issue management and website publication using GitHub. We hope that the project will continue to be enhanced, for example, by linking with external data and publishing SPARQL endpoints, and that it will become a best practice for future LOD construction.”
The list of award-winning projects can be found here (in Japanese).
LOD Challenge 2022 is currently accepting submissions, and Oracle Cloud accounts were again provided at the seminar this year. If you are interested in creating your own SPARQL endpoint using the always-free tier of Oracle Cloud, please check out my article on how to do so.