Imagine being able to find out which books inspired Shakespeare, or what music Beethoven listened to throughout the years.

Services like Goodreads, Last.fm and Letterboxd are building the tools that help preserve exactly this kind of data. They want to share the media items we consume, along with rich metadata about them. By default, our media consumption is made public. That can encourage our friends to join, but what should we do with the parts of ourselves we do not want shared?

The problem with public

Things we consume can deliver strong signals about who we are and what we are pursuing at the moment. If you read books on fighting alcoholism, chances are you are not opening up to the whole world about it. Similarly, if you ride a Harley, you do not want to be caught enjoying Taylor Swift.

The first media consumption logging sites were social networks: you shared the things you wanted to show your friends. As these sites move from social recommendations to recommendation engines, giving them only a partial picture of our taste hurts recommendation accuracy.

Many people choose not to use these sites at all. For example, if you are not prepared to share all the music you listen to, it becomes inconvenient to try to separate private and public listening.

Introducing private

The simple solution is to add private sharing controls, i.e., “I read this book, but do not want to share it with my friends”.

There are a few subtleties with this approach. First, sites must be very careful in using this data. Using private entries to feed a user's own recommendations, or to feed anonymized aggregate databases, is probably safe. Matching similar users based on private data is not safe, since the matches could be used to infer what a user has hidden*.
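As a rough illustration of that separation, here is a minimal sketch in Python. The names (`LogEntry`, `items_for_own_recommendations`, `items_for_user_matching`, `public_similarity`) are hypothetical and not taken from any of the services mentioned; Jaccard similarity simply stands in for whatever user-matching measure a real site might use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One logged item; `private` marks entries the user chose to hide."""
    item_id: str
    private: bool = False


def items_for_own_recommendations(entries):
    """All items, private ones included: only the owner sees the result."""
    return {e.item_id for e in entries}


def items_for_user_matching(entries):
    """Public items only: private entries never influence user matching."""
    return {e.item_id for e in entries if not e.private}


def public_similarity(entries_a, entries_b):
    """Jaccard similarity of two users, computed from public entries only."""
    a = items_for_user_matching(entries_a)
    b = items_for_user_matching(entries_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Under this sketch a hidden book still shapes the owner's recommendations, but it cannot raise or lower their similarity to anyone else, so a match between two users reveals nothing about private entries.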

Another concern: users might be tempted to make everything private by default, which can limit interactions between users and hurt viral growth.

To deal with the latter, Pinterest capped the number of “secret” boards at five. If we permit a natural proportion of public and private, we can invite users to be open with their friends when they can and leave room for privacy when required.

It is not clear that people would be satisfied, for example, with five private book collections on Goodreads. I would suggest an approach that limits the proportion of hidden reads, not the absolute number.
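A proportional limit could be sketched roughly like this in Python. Everything here is an assumption made for illustration: the `ReadingLog` type, the 20% cap, and the small free allowance that lets new users hide their first few entries before the ratio applies.

```python
from dataclasses import dataclass


@dataclass
class ReadingLog:
    """Per-user counts of public and private (hidden) entries."""
    public_count: int = 0
    private_count: int = 0


def can_hide_another(log, max_private_percent=20, free_private=3):
    """Return True if one more private entry keeps the user within quota.

    The first `free_private` hidden entries are always allowed; after that,
    private entries may make up at most `max_private_percent` of the log.
    Both values are illustrative defaults, not recommendations.
    """
    private_after = log.private_count + 1
    total_after = log.public_count + private_after
    if private_after <= free_private:
        return True
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return private_after * 100 <= max_private_percent * total_after
```

With these defaults, a user with 20 public and 4 private entries may hide one more (5 of 25 is exactly 20%), while a user with 20 public and 5 private entries would first need to log more public items.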

There are challenges in finding a balanced implementation of these features, but private logging is a necessity for achieving great recommendations. Let's hope this idea makes it out of the backlogs.


* That is to say, a mathematician might be able to infer that you have read the Kama Sutra with 68.3% certainty.