Fixing Orwellian Reputation Systems
Including a proposal for a potential solution
Reputation systems on the Internet are very helpful, very powerful, and can sometimes become very scary. In this article, I will explain traditional reputation systems (based on blacklisting), and propose a system that works the other way around (based on whitelisting), giving users more relevant search results, and more control over their own data.
(If you already understand reputation systems and their drawbacks, feel free to skip this first part and jump to “Reputation Systems Based on Whitelisting”.)
Reputation Systems Based on Blacklisting
Think of all the (electronic) reputation systems you use on a daily basis. From Google Search (PageRank), to Twitter, Facebook, spam-filters, finding Uber drivers, and eBay sellers, etc. Without reputation systems, the Internet is an information hell. We need proper systems to organize all this info.
Most of these reputation systems are based on blacklisting, and trust everything by default:
- The platform owner scans / indexes all of the (user-)generated content;
- Content they deem irrelevant (which is subjective, and 100% up to the platform owner, or their government, to define) is filtered out;
- Only the content the platform owner wishes you to see is presented to you.
An algorithm determines a general score, rates the product, service, user, etc., accordingly, and displays this info to other users. Trolls, sock puppets, Sybil swarms, etc., are sometimes partially detected by the algorithm, but almost never all of them. This is why, in many cases, moderators are involved. They are a costly resource, and prone to all sorts of human flaws. All in the hope of presenting everyone with the best information possible, of course.
Somewhere along those steps, your personal preferences are probably stored, and sold to the highest bidder. Because that’s how you make money off a traditional reputation system on the Interwebs, right?
What almost all reputation systems have in common is that they’re centralized. This means that not only does all of the collected information belong to the system’s owner, but so does all of your personal information generated on their platform. After all, you agreed to their terms …
Apart from having no idea what people (or faceless organizations) are doing with ‘your’ information (search preferences, medical data, baby and wedding photos, chat logs, address books …), there are other risks.
Centralized reputation systems have power over what people THINK. They’re the new version of the newspaper, radio, and TV, all in one. There’s a gazillion ways to influence a person via reputation systems. We tell these systems what we like, all day long, by using them, and get presented with what the algorithm ‘thinks’ we like best. We all know that line on Amazon:
“Since you like … , you probably also like …”.
In cases of censorship on these platforms, it could be hard to even accuse the platform of censorship, if the information has never been shown. And if the user isn’t even aware of the existence of the content, how will they ever prove it hasn’t been presented to them?
“If a tree falls in a forest and no one is around to hear it, does it make a sound?”
Unless someone finds a way to reverse engineer the user interface on top of a centralized reputation system, or has another trick up their sleeve (smartest-kid-in-the-room principle), there’s a big chance nobody will ever know …
Now think of the political influence a decent reputation system can buy you. In other words: centralized, blacklist-based reputation systems equal power.
Reputation Systems Based on Whitelisting
Reputation systems based on whitelisting trust nothing at first.
In order to be known to such a system, a link has to be (manually) established, which is incredibly inefficient. We like to automate things, don’t we?
Let’s start with your address book, a.k.a. your contact list. It’s already in place, and it likely already has many contacts in there, collected over time. If you’re living the 21st century to the max, and don’t use 19th-century technologies anymore, you probably have a few connections on Twitter, Facebook, LinkedIn, IRC, etc.
For the sake of argument, let’s say the average Twitter, Facebook, or LinkedIn account has some 300 connections. Many of these (centralized) networks only allow you to connect and exchange information directly with those 300 connections; your first-degree connections.
When you have 300 connections, each of them connected to 300 others, you have tens of thousands of ‘friends of friends’ (2nd-degree connections). Extend this to the third degree, and you could quite easily have tens of millions of connections in your network . . .
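A quick back-of-the-envelope check, a minimal sketch assuming a flat 300 contacts per person and no overlap between friend lists (real friend lists certainly overlap, so treat these as upper bounds):

```python
# Rough reach estimate for a social graph, assuming everyone has
# exactly 300 contacts and no two friend lists overlap.
CONTACTS_PER_PERSON = 300

for degree in (1, 2, 3):
    reach = CONTACTS_PER_PERSON ** degree
    print(f"degree {degree}: ~{reach:,} connections")

# degree 1: ~300 connections
# degree 2: ~90,000 connections
# degree 3: ~27,000,000 connections
```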
Now imagine searching for something specific in a personal network with tens of millions of connections. Instead of using a search engine, or a forum full of folks you don’t know, you can search among your friends, and their friends, and theirs … for that barber, a movie, or VPN suggestions from your IT buddies (hello American Internet users! o/ ).
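What could such a search look like under the hood? Here is a minimal sketch; the in-memory graph (`contacts`), the rating store (`ratings`), and the `search` helper are all hypothetical stand-ins, and a real system would query a distributed address book instead:

```python
from collections import deque

# Hypothetical in-memory data: who knows whom, and who rated what.
contacts = {
    "me":    ["alice", "bob"],
    "alice": ["carol"],
    "bob":   ["dave"],
    "carol": ["eve"],
}
ratings = {
    "carol": {"Joe's Barbershop": +1},
    "dave":  {"SomeVPN": -1},
}

def search(topic, start="me", max_degree=3):
    """Collect ratings for `topic` from contacts up to `max_degree` hops away."""
    seen, queue, results = {start}, deque([(start, 0)]), []
    while queue:
        person, degree = queue.popleft()
        if topic in ratings.get(person, {}):
            results.append((person, degree, ratings[person][topic]))
        if degree < max_degree:
            for friend in contacts.get(person, []):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, degree + 1))
    return results

print(search("Joe's Barbershop"))  # [('carol', 2, 1)]
```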
Imagine a reputation system where YOU censor the content. First you whitelist your 1st-degree connections into your address book, and then you blacklist any irrelevant 2nd-to-∞-degree connections, in a user-friendly way (a rough sketch of such records follows this list):
- You crawl all of your existing contacts on your social networks, through an easy-to-use interface, and store them in an address book (a lightweight database distributed among your own devices). Congratulations! You just took your first step towards owning your digital identity, social network, reputation, and any data linked to you, or that you choose to share!
- You now go to a restaurant, have dinner there, and add a rating for that restaurant to your address book. You can then make this rating public, keep it private, or share it with specific groups. As you see fit.
- You decide who to listen to, when, and within what context. You now have your own (huge) network to filter from. This could be a friend who’s good at flagging trolls on Reddit, a company that provides a certain type of ‘Content Filtering as a Service’, DNS, yellow pages, etc., or even the output from a device or application you own, or trust, and that trusts you. ;-)
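To make the above a bit more concrete, here is a rough sketch of what entries in such a local address book might look like. Every name and field here is illustrative, and not any existing project’s actual data model:

```python
from dataclasses import dataclass, field

# Hypothetical records for a local, device-synced address book.

@dataclass
class Contact:
    name: str
    pubkey: str                  # cryptographic identity of the contact
    source: str                  # where the link came from, e.g. "twitter"
    trusted: bool = True         # whitelisted by default once added

@dataclass
class Rating:
    subject: str                 # e.g. "Luigi's Trattoria"
    score: int                   # e.g. -1, 0, +1
    visibility: str = "private"  # "private", "group:<name>", or "public"

@dataclass
class AddressBook:
    contacts: list[Contact] = field(default_factory=list)
    ratings: list[Rating] = field(default_factory=list)

    def shareable(self):
        """Everything you've chosen to publish to (parts of) your network."""
        return [r for r in self.ratings if r.visibility != "private"]

book = AddressBook()
book.contacts.append(Contact("Alice", pubkey="ab12...", source="twitter"))
book.ratings.append(Rating("Luigi's Trattoria", +1, visibility="public"))
print(book.shareable())
```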
You need no one’s permission with such a system. Perhaps you could even use it to set up trust-based mesh networking among drones, or between drivers and passengers, in search of an efficient way to cut out middlemen like Uber or AirBnB. Why not, right? Markets should be efficient, some say.
No one gets to decide who you may listen to anymore, unless you actually prefer it that way. Then it’s okay: you can simply whitelist censors into your network, and filter results based on what they have to say / hide.
Friends, Strangers, Trolls & Sock Puppets
Many websites and apps are plagued by trolls, sock puppets, and other parties that spread information irrelevant to you. And that’s apart from the fact that you don’t even know ~99% of the people who write the comments you read. Together with censorship, this is inherent to reputation systems based on blacklisting.
But what is the chance that a person closely tied to your personal network is a sock puppet, or a troll? And even if they are to others, are they to you?
If someone’s connected to you, via someone else, and you find that person to be annoying / intrusive / offensive (for whatever reason), you could simply down-vote them. Doing this will ensure that their messages are no longer shown to you.
So, let’s say that a friend of a friend makes a bunch of fake accounts, and has those fake accounts rate each other. If you visualize your (whitelisted) network, it will show a huge Sybil swarm connected to your friend’s friend. You can then simply down-vote that connection, and all the fake accounts will disappear with it.
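A minimal sketch of that pruning step: anything reachable only through a down-voted (blacklisted) connection simply drops out of your view. The graph and the `visible_network` helper below are hypothetical:

```python
# Hypothetical trust graph: your friend's friend "mallory" created a
# swarm of fake accounts ("fake1".."fake3") that only rate each other.
contacts = {
    "me":      ["friend"],
    "friend":  ["mallory"],
    "mallory": ["fake1", "fake2"],
    "fake1":   ["fake2", "fake3"],
    "fake2":   ["fake3"],
}

def visible_network(start, blacklist, graph):
    """Everyone reachable from `start` without passing through a blacklisted node."""
    seen, stack = set(), [start]
    while stack:
        person = stack.pop()
        if person in seen or person in blacklist:
            continue
        seen.add(person)
        stack.extend(graph.get(person, []))
    return seen - {start}

print(sorted(visible_network("me", blacklist=set(), graph=contacts)))
# ['fake1', 'fake2', 'fake3', 'friend', 'mallory']

print(sorted(visible_network("me", blacklist={"mallory"}, graph=contacts)))
# ['friend']  (one down-vote, and the whole Sybil swarm is gone)
```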
Next, you could share this new information about the fake accounts with your own connections, alerting them to that (fake) connection. They can then decide whether to act on your message, or ignore it.
Conclusion
First of all, I’m biased on this subject, since I’m a big fan of Identifi. It is a proof-of-concept for a P2P address book, combining cryptographic identities with whitelisted reputation, much as described above, for the most part.
Existing reputation systems offer many advantages, but they also have very serious drawbacks. So, could we all be our own Google? Can we make our reputation on Uber or AirBnB portable, and take it elsewhere? What if we could easily filter for (more) relevant data, rated by the people, companies, or even machines and applications we personally trust? Or maybe even by the parties they trust? Everyone should be able to decide for themselves who they want to listen or talk to, since nothing seems more subjective than confidence in a person or thing these days. /fakenews
PS: If you’re curious how I believe we could scale any of the above, my answer is probably that there will be more (micro-)intermediaries, not fewer.