Every Nation consists of institutions and stands to others in what we can think of as a diplomatic social network. Were this information to be published using Linked Open Data, what useful things could we do to increase Trust on the Internet? How can we help create an immune system for the body politic?
Here are 13 use cases for the Web of Nations:
- Enhance Trust in Small Business
(Here I explain how the verification works, so this part is a must-read)
- Help Legitimate Institutions of Knowledge stand out
- Make Fake News sites stand out
- Stop Phishing attacks
- Stop GUI Confusion attacks
- Improve Trust in Apps
- Help Search Engines and Social Networks attribute content
- Help AI spam filters recognise legitimate content
- Cheaper and Richer EV X509 Certificates
- Anchor flexible Verifiable Credentials (immunity certificates, drivers licences…)
- Trusting Linked Data
- Anchor Provenance
- Machine Readable GDPR Policies
In this post, I will look in more detail at each use case.
Why would they?
Nations have to keep information about the institutions that make them up (schools, universities, hospitals, law courts, police units, armies, …) as they have to pay for them. They have to keep information about the companies that operate on their territory, as those pay the taxes. Nations and local authorities have been employing people to do this for centuries. Now they are doing it online. See for example the UK Companies House, which also keeps a list of equivalent registrars worldwide. Making the data machine-readable is a simple next step, and many are already working on it. For example, Companies House in the UK provides a RESTful API producing JSON documents. Publishing such records as linked data would also make them queryable with SPARQL, the W3C query standard for RDF.
Countries also have to keep information about their diplomatic relations to other countries, if only because they have to keep embassies open there. Like personal social network profiles, states have many reasons to keep that rich information up to date. Making it machine-readable is but a small further step. If all these internal and external links are available, we get what I have called a Web of Nations.
Compare this to existing internet registrars such as those for DNS or X509. These can publish information about the relation of web sites to the companies that run them. But they have little incentive to maintain more than the minimum information: they can’t quickly close companies that lie, and they can’t easily pursue companies either. In any case, keeping such data up to date is no part of their business model. It is part of the “business model” of states.
If enough nations were to settle on ontologies to publish such linked data on the Web, how could it be put to use?
Trusted information about Web Sites
The web sites we navigate to give us very little information about the location of the company running them, what type of business or institution it is, who the owners are, whether they have legal problems, …
The early Web could give hints as to this in the top-level domains. Yet, their principal use is to be short and memorable, which conflicts with providing rich semantic information. Thus:
- The only two-character country codes that people know are those of their own country. Nearly nobody knows all the others by heart.
- Furthermore, it has become widespread for companies to repurpose country top-level domains. For example, .io, the code for the British Indian Ocean Territory, is often used by data-oriented companies to stand for Input/Output. And .co, the code for the Republic of Colombia, is used by people who don’t want to pay a ransom to a .com domain squatter.
- The explosion of new generic top-level domains has further diluted the formal meaning one can attach to them, with domains such as .wow competing with .xxx and .com.
- Further geographic top-level domains such as .paris, .москва, … don’t make life easier for people choosing a domain name: should they buy a domain ending in .shop or .london? And for end-users why would .cat not refer to feline animals rather than to Catalonia?
Top Level Domains are thus useful mostly to help people remember web sites they visit often. But not much more.
Allowing Web sites to add links to the official machine-readable records about the companies or institutions running them (which need to link back) would allow browsers to display this information in specially designed panes. Having this would enable the following use cases:
1. Enhance Trust in small businesses
Small businesses such as bakers, watchmakers, accountants, software shops, pharmacies, … don’t have the money for global advertising campaigns to get their domain name to be widely known. They can only get going by showing their educational credentials online and have their reputation grow through customer satisfaction. At present, those who find links about them online, cannot quickly tell if the site linked to is the one they intended to reach, a fake or a competitor. Having official information about the web site made visible in the browser would be very helpful to increase local business confidence. An example would be wanting to buy a watch from a Swiss watchmaker. It would help to know the web site was that of a shop in Switzerland.
How does it work?
A browser connecting to a web site such as https://co-operating.systems/ would receive a link 1a (see first diagram above) to the record at companieshouse.gov.uk. If that record contains a link back 1b, the pair constitutes the first link in the chain proving that the description of the company is about the company behind the web site. The link 2a from that record to gov.uk and the link back 2b to Companies House prove that the registrar is not a fake one. For UK citizens that would be the end of the verification procedure. For a French citizen, though, the browser agent would also need to establish a link 3a from gov.uk to gouv.fr, the trust authority of French citizens, and a link back 3b. This is a link in the diplomatic peer-to-peer web of trust between nations, which browsers would likely cache. The whole chain tells the browser that the record at companieshouse.gov.uk is the legal one describing the co-operating.systems web site, which French law will accept in legal proceedings. This rich and live information could then be displayed elegantly in the browser on first arriving at the web site, and it could be enriched further by information from French authorities if needed.
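The chain of two-way links described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LINKS` table is a hypothetical in-memory snapshot standing in for documents a browser would actually fetch, the URLs are simplified to site roots rather than real record addresses, and the function names are invented for this example.

```python
from typing import Dict, List, Set

# Hypothetical snapshot of published links: resource -> resources it links to.
# A real agent would fetch these from the web site, the registry record and
# the governments' own linked data documents.
LINKS: Dict[str, Set[str]] = {
    "https://co-operating.systems/": {"https://companieshouse.gov.uk/"},  # 1a
    "https://companieshouse.gov.uk/": {
        "https://co-operating.systems/",  # 1b
        "https://gov.uk/",                # 2a
    },
    "https://gov.uk/": {
        "https://companieshouse.gov.uk/",  # 2b
        "https://gouv.fr/",                # 3a
    },
    "https://gouv.fr/": {"https://gov.uk/"},  # 3b
}


def linked_both_ways(a: str, b: str, links: Dict[str, Set[str]]) -> bool:
    """True if a links to b and b links back to a."""
    return b in links.get(a, set()) and a in links.get(b, set())


def verify_chain(chain: List[str], links: Dict[str, Set[str]]) -> bool:
    """True if every consecutive pair in the chain is linked in both directions."""
    if len(chain) < 2:
        return False
    return all(linked_both_ways(a, b, links) for a, b in zip(chain, chain[1:]))


# A UK citizen's agent stops at gov.uk; a French citizen's agent goes on to gouv.fr.
uk_chain = [
    "https://co-operating.systems/",
    "https://companieshouse.gov.uk/",
    "https://gov.uk/",
]
fr_chain = uk_chain + ["https://gouv.fr/"]

print(verify_chain(uk_chain, LINKS))
print(verify_chain(fr_chain, LINKS))
```

A site that points at a registry record without being linked back from it fails at the first pair, which is the property the later use cases rely on.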
2. Help Legitimate Institutions of Knowledge stand out
Institutions hold most human knowledge. But the World Wide Web is a global medium that extends far beyond the local physical space we grow up in and where we learn to recognise hospitals, schools, the police station, etc. Physical hospitals cannot appear out of nowhere overnight, but on the Web they can. This is made all the more problematic as widely available machine translation technologies are exposing people to web sites published in regions they know little of.
Having official information readily available in the browser showing the domain of expertise of the company or institution behind a web site would hugely increase our ability to work together on urgent problems.
3. Make Fake News Web sites stand out
Newspapers are not perfect. They have biases and different standards of quality control on the news they provide. All that said, registered ones have to abide by codes of conduct for their country, can be pursued in court for defamation, must allow a right to reply, and have to correct stories if required to do so. Fake News sites, on the other hand, work outside any legal framework, yet take on the look and feel of real ones. There is one thing the fake news sites would not do: provide links to official records about their activities, as these would either result in them being struck from the records or show them to be engaging in some other field of work such as advertising. On the other hand, traditional news sites already have to carry the costs of their legal responsibilities. They would, therefore, be happy for their status to be visible to people reaching their web site.
4. Stop many Phishing attack vectors
A widely used Phishing attack vector consists of getting people to click on a link to a web site that looks exactly like one they use. That web site then entices them to enter a password, a credit card number, or other sensitive information. To make the scam more effective, the attackers try to make the domain name of the web site as close as possible to the one they are faking.
Yet, if browsers displayed the official rich information about the ownership of web sites (especially when first landing there), then it would be immediately apparent to the user that the web site they had reached was not the intended one.
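As a rough sketch of what such a display could be driven by, the following Python looks up the host of the page being visited in a table of official records. Everything here is an assumption made for illustration: the `OFFICIAL_RECORDS` table, its field names, the `.example` domains and the banner wording are hypothetical, and a real browser would obtain the record by following verified links rather than from a local dictionary.

```python
from typing import Dict
from urllib.parse import urlsplit

# Hypothetical extract of registry records, keyed by the host each record links back to.
OFFICIAL_RECORDS: Dict[str, Dict[str, str]] = {
    "bank.example": {
        "name": "Example Bank plc",
        "registrar": "registrar.example",
        "country": "GB",
    },
}


def ownership_banner(url: str, records: Dict[str, Dict[str, str]]) -> str:
    """Return the text a browser pane could show when first landing on a site."""
    host = (urlsplit(url).hostname or "").lower()
    record = records.get(host)
    if record is None:
        # A look-alike domain has no official record linking back to it.
        return f"{host}: no official record links back to this site"
    return (
        f"{host}: {record['name']}, "
        f"registered with {record['registrar']} ({record['country']})"
    )


print(ownership_banner("https://bank.example/", OFFICIAL_RECORDS))
print(ownership_banner("https://bank-login.example/signin", OFFICIAL_RECORDS))
```

The point of the sketch is that the look-alike host is a different key: however similar the page looks, it cannot borrow the record of the site it imitates.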
5. Stop GUI Confusion Attacks
Applications running on consumer Operating Systems can take over the screen entirely. If they can do that, they can also download the look and feel of the OS or indeed of any other App, and so get the user to open an emulated App instead of the real one. When opened, the emulated App can act as a Trojan Horse, passing all information the user enters on to a malicious third party.
If the device provided a second screen where reliable information about the running application could be shown, with an About button to the official information about the company that built it, then it would be easy for users to notice such types of attacks.
A second screen? Yes. All MacBook Pros now come with a second screen known as the Touch Bar, which could be controlled by macOS. But what about mobile phones or other devices? Smartwatches could play that role in that case.
6. Improve Trust in Apps
We download most of our Apps from an App Store, be it the Apple or the Android one. These stores take on the job of verifying the code in the App for various attack vectors that they may contain. Whatever the quality of the verification tools, these cannot determine the legal status of the companies that made them. Only official company information provided by the national registrars can provide that. Linking to such information would then allow people to know what company is liable for a malfunctioning App and which legal jurisdiction may be involved in case it comes to conflict. It would also enable purchasers to distinguish the App they want from the many similarly-named ones available.
Social Networks and Search Engines
7. Better metadata for published stories
We access most of our news through search engines or social networking sites. These often present short views of a story headline and summary. On Social Networks such stories are often quickly reposted without people taking time to verify the source. If these Social Networks could have access to practical legal information about the company publishing a story, they could present handy links to it. This information would help determine how much to trust the story. It would allow their users to distinguish publishers that have declared legal obligations from those that do not, allowing them to be more sceptical when presented with a story that does not have such backing. As linking into official information cannot be obligatory, this will not, of course, exclude anonymous story writing.
8. Help AI-based Spam Filters recognise legitimate content
Social Networks are using AI to detect spam, fake news stories, phishing attacks and more. But without a grounding in legal institutions that rely on police on the ground, law courts, democratic checks and balances, etc., the AI can only work on statistical patterns in content and behaviour. Such pattern-based judgement can easily be misled, as when Facebook started flagging Coronavirus posts as spam, even though they pointed to legitimate news sources. Indeed the post you are now reading was also flagged as spam.
The law is not a statistical matter. A judge does not condemn someone because most cases that looked like this one were judged that way, but by taking the particular facts of the case into account and interpreting these in view of previous similar cases, in light of a historical retelling of the past that acknowledges the successes and mistakes that were made, with a view to informing future judgements. (See the work by Prof Robert Brandom on the subject.)
9. Cheaper and Richer EV X509 Certificates
The deployment of https-based web sites has grown tremendously since Let's Encrypt, a certificate authority backed by the EFF among others, began offering free and easy-to-use certificates, helping make the Web more secure by increasing the difficulty of man-in-the-middle attacks on the wire. Yet this has also led to a considerable growth of https-based phishing attacks, since these free certificates only contain one piece of information: the domain.
Extended Validation (EV) certificates cost more (>$300 a year) and contain a little more information: usually just the address of the company headquarters. EV certs are highlighted by browsers to distinguish them from plain domain name certificates such as those provided by Let's Encrypt. The expense of the verification process can explain the cost of EV certs. If Registries used standard ontologies and allowed their records to contain links back to the web sites of the company described (as shown in the JSON example above), then providers of Extended Validation X509 server certificates could find such details automatically. As more information would be available from Registrars, it would also become possible to add more information to the certificates and even link to the live registry record.
10. Anchor Verifiable Credentials
The W3C Verifiable Claims Working Group has put together a Use Case Document to help them develop new Verifiable Credentials standards. These cover everything from age verification credentials, to academic credentials, drivers licenses, health insurance cards, all the way potentially to full-blown machine-readable passports.
The recently released Verifiable Credentials Data Model has an issuer field that is a URL naming the issuer of the Credential. Who the issuer is, is of course of paramount importance in a claim. For a Drivers License Claim, the issuer has to be a recognised institution that can verify the claimant's ability to drive. A medical Immunity Credential could help people who have been immunised by catching SARS-CoV-2 to travel, but it needs to be signed by the right medical authorities. The Web of Nations can help verifiers check that an institution in a foreign country is legally entitled to produce such claims. It can also help developers write Apps that keep up to date on the institutions globally that can sign specific claims.
The much older X509 certificates also have an issuer field, which the Issuer Alternative Name extension can complement. By keeping a map between these issuers and their public keys, browsers can verify the certificates received on connecting to secure web sites.
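A minimal sketch of such an issuer check in Python follows. It assumes a hypothetical table, `AUTHORISED_ISSUERS`, that an App could build from national registry data, and made-up `.example` URLs and credential type names. It only looks at the `issuer` and `type` properties of the credential and does not verify the signature, which a real verifier must also do.

```python
from typing import Any, Dict, Optional, Set

# Hypothetical extract built from registry data:
# credential type -> issuers legally entitled to sign that kind of claim.
AUTHORISED_ISSUERS: Dict[str, Set[str]] = {
    "DriversLicenseCredential": {"https://licensing-agency.example/"},
    "ImmunityCredential": {"https://health-authority.example/"},
}


def issuer_id(credential: Dict[str, Any]) -> Optional[str]:
    """The issuer may be given as a plain URL or as an object with an `id`."""
    issuer = credential.get("issuer")
    if isinstance(issuer, dict):
        return issuer.get("id")
    return issuer


def issuer_is_authorised(
    credential: Dict[str, Any], authorised: Dict[str, Set[str]]
) -> bool:
    """True if the issuer is listed for at least one of the credential's types."""
    types = credential.get("type", [])
    if isinstance(types, str):
        types = [types]
    who = issuer_id(credential)
    return any(who in authorised.get(t, set()) for t in types)


licence = {
    "type": ["VerifiableCredential", "DriversLicenseCredential"],
    "issuer": "https://licensing-agency.example/",
    "credentialSubject": {"id": "https://alice.example/#me"},
}

print(issuer_is_authorised(licence, AUTHORISED_ISSUERS))
```

Keeping the table keyed by credential type reflects the idea in the text that an institution is entitled to sign specific claims, not claims in general.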
Yet, for end-users looking at such certificates, the issuer remains completely opaque. Who are these trusted certificate authorities that we find in our browser? Being able to complement that Issuer Alternative Name with rich information from national registrars would also allow us, users, to get a much better feel as to how much these should be trusted.
11. Trusting Linked Data
This Web of Nations is a Linked Open Data (LOD) project. LOD has been growing in leaps and bounds, as shown by the Linked Open Data Cloud.
Nevertheless, having LOD on the Web does not yet tell us if we can trust the data. Before one can make a trust evaluation, one has to know who published it, if they have the systems in place to validate the data and if they are responsible for the data published. In a friend of a friend peer to peer social network, where the relations are between individuals that know each other, those making the claims are held responsible by their peers in the social network. The Trust can be built there by two-way links from one person to another and feedback through interactions on other channels, including face to face meetings. But it can’t scale to the world as argued in Why did the PGP Web of Trust fail?
Without the Web of Nations, establishing who published the data and if they are a legitimate source, can require a lot of technical knowledge available only to a few specialists who spend all their time working on the internet. For an example of the amount of reasoning one needs to go through to establish the source of some data see the 15 March 2020 mail to the W3C LOD mailing list, where the question of whether some data on the Covid-19 virus is reliable comes up.
12. Anchor Provenance
Of course, it is legitimate for someone to re-publish data that came from another source if provenance information accompanies the data. For this, the W3C has developed a set of Provenance standards. The entry points for that are the PROV-Overview spec and the book Provenance: An Introduction to PROV by Luc Moreau and Paul Groth.
But Provenance information by itself is not enough, for it does not tell one if the entity from which the data came is entitled to make such claims. For that again, we will often need to know more about the legal status of the body publishing the data. Having downloaded PROV-enabled data on the spread of Covid-19 from some web site located in a foreign country (choose one that you know nothing about) does not tell one what type of institution published it. Here again, the Web of Nations can help anchor the final part of the Trust needed to get big linked data projects to work reliably.
Being able to locate web sites in the legal space makes it possible to solve some further problems.
13. Machine-readable GDPR policies
At present, all web sites feel obliged to ask users if they want to accept even something as benign as cookies for identification. From the users’ point of view, these constant requests are leading to consent fatigue, which instead of getting people to think about the danger of giving away too much information, risks teaching them to consent to anything automatically. If web sites could instead publish machine-readable versions of their GDPR policies, then browsers could provide a friendly user interface to display this on-demand. But more importantly, users could select strategies to decide which GDPR clauses they find acceptable and for which ones they want to be alerted. I would, for example, automatically accept any cookie if the information remained on that site.
For that to work though, one has to know which legal space the web site is operating under. For if no laws bound that web site, then any GDPR policy they publish has no legal value. Thus, here again, having a link from the Web site to a record issued by a national registrar linked to a Web of Nations is essential.
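To make the idea of user-selected strategies concrete, here is a small Python sketch. The `Policy` and `Clause` shapes are hypothetical, since no machine-readable GDPR vocabulary is specified in this post, and the single rule shown is just the example preference given above: accept when the information stays on the site, and only when the site's legal space has been verified.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Decision(Enum):
    ACCEPT = "accept"
    ALERT = "alert"


@dataclass
class Clause:
    purpose: str                     # e.g. "identification"
    shared_with_third_parties: bool  # does the information leave the site?


@dataclass
class Policy:
    site: str
    jurisdiction_verified: bool  # is the site linked to a national registrar record?
    clauses: List[Clause]


def decide(policy: Policy, clause: Clause) -> Decision:
    """One possible user strategy, applied clause by clause."""
    if not policy.jurisdiction_verified:
        # A policy not bound by any law has no legal value: always ask the user.
        return Decision.ALERT
    if not clause.shared_with_third_parties:
        # The information remains on the site: accept automatically.
        return Decision.ACCEPT
    return Decision.ALERT


policy = Policy(
    site="https://shop.example/",
    jurisdiction_verified=True,
    clauses=[
        Clause(purpose="identification", shared_with_third_parties=False),
        Clause(purpose="advertising", shared_with_third_parties=True),
    ],
)

for clause in policy.clauses:
    print(clause.purpose, decide(policy, clause).value)
```

The jurisdiction check comes first on purpose: without it, the rest of the policy is only a promise that nobody can enforce.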
Note: for a historical view on machine-readable privacy policies, starting with the W3C P3P standards finalised in 2002 up to current work see the thread on the semantic web mailing list, from the 18 March 2020.
We find thus that the Web of Nations will play a foundational role in grounding trust on the Web, linking the legal spaces that hold societies together with the new form of writing that the Web is.
The use cases above are just those I was able to think of this weekend. If you have further suggestions, please do not hesitate to leave them in the comments below, on Twitter or via email at firstname.lastname@example.org.