How widely companies share data

Nick Halstead
Published in DataScan
4 min read · Jan 22, 2018

Threats galore for centralised data. Three ideas for 2018 to stay ahead of competitors. Is your company’s data actually valuable in the AI era? How Bitcoin’s utopia has failed.

All included in this week’s digest on the world of data. 👇🏼

How widely companies share data. Katharine Schwab discusses PayPal’s recent spreadsheet, which outlines all the third parties with whom they share customer data. Designer and researcher Rebecca Ricks neatly visualised this information in a new tree diagram. I recommend exploring it to see very clearly how many companies PayPal are sending their customers’ data to (including name, DOB and photos):

It would be a great resource for consumers if all companies that collect and share their customers’ personal data were up-front about how they were sharing it–and better yet, if they presented it through this kind of visualization, which makes it much easier to comprehend the overwhelming scale of the list and dig into particular categories.

A recent SAP Hybris survey found that 66% of US consumers expect companies to “be transparent about how the data are being used with partners”. Consumers are happy to share information, but in return they expect understandable and honest communication about how their data is being used, where it is being moved to and who is accessing it.

— What 100 cities are learning from each other by sharing their data.

Threats galore for centralised data. Interesting article debating the “safety of people’s privacy” when data is pooled in a central location, like in India’s leaky Aadhaar system:

There are huge legal implications regarding the privacy of the data. Because the data is now available in a long trail linking a person’s every habit, the government has total access to his entire record, including health, sexual preference, etc. The consequences of government agencies having access to all information on a person are not measurable, particularly when it comes to mala fide action. The protection would, however, come under Data Protection laws that will have to come into force so that the individual does not suffer. The scary part is not the legislation but the follow-up by courts and law enforcing authorities. How effective would that be once a person’s digital footprint has been exposed?

— More on the Aadhaar privacy firestorm here. Plus, the “rising tension” between new, innovative products and consumer data protection.

3 ideas for 2018 to stay ahead of competitors. Kasia Moreno, editorial director at Forbes Insights, discusses the top themes for this year and how companies can use them as differentiators. Moreno emphasises how “this is going to be the year of data”, due to the GDPR coming into force, the growth of smart data sharing and companies creating business value from customer data. Importantly:

Companies have been focused on changing their business models from selling products to selling services, based on data. For example, a maker of a product, if it has data about the use of that product from its customers, can advise on maintenance, thus cutting down on downtime. Since the maker has historical product performance information about both the maker and the client, the trove of knowledge to share and use to improve business is quite vast. This means the ability to add to ongoing services. The trick is to figure out which data and which services the client will find useful.

But this customer most probably also shares and obtains data from other companies, and some of it may be even more useful than the data from the product maker. What’s more, this useful data can often come from a relative startup in the industry, which does not have historical data at all, but finds one useful piece of data — sometimes publicly available — to mesh with the client data and provide the most useful services. I call such data phantom data. It emerges unexpectedly, out of the blue — even though it was there all along, only impossible for many to see — and instantly seems both obvious and visionary. Companies must be on a constant hunt for such phantom data.

CES isn’t about the hardware anymore: it’s about software, data and the connection behind it all.

Is your company’s data actually valuable in the AI era? Writing for Harvard Business Review, Ajay Agrawal, Joshua Gans and Avi Goldfarb argue that data is helpful to build prediction machines — but not operate them:

The data you have now is training data. You use that data as input to train an algorithm. And you use that algorithm to generate predictions to inform actions.

So, yes, that does mean your data is valuable. But it does not mean your business can survive the storm. Once your data is used to train a prediction machine, it is devalued. It is not useful anymore for that sort of prediction. And there are only so many predictions your data will be useful for.
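To make the authors' distinction concrete, here is a minimal Python sketch of the train-then-predict flow they describe. Everything in it is an illustrative assumption rather than anything from the HBR article: the data (machine hours versus maintenance calls) is made up, and a simple least-squares line stands in for the "prediction machine".

```python
from statistics import mean


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Use the trained model alone; the training data is not consulted."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical (training) data: hours run vs. maintenance calls.
hours = [100, 200, 300, 400]
calls = [1, 2, 3, 4]

model = train(hours, calls)     # the data is used once, to build the model
forecast = predict(model, 500)  # afterwards only the two fitted numbers are needed
print(forecast)
```

Once `train` has run, the whole model is just two numbers, and `predict` never looks at `hours` or `calls` again. That is the sense in which the data is valuable for building the machine but not for operating it.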

Miscellaneous

Stop. Calling. Bitcoin. Decentralised. → Bitcoin’s utopia has failed as big players hold all the power. 👏🏼

The Google Brain Team — Looking back on 2017. 💡

Updates to Uber’s open source project for differential privacy. 💯

Amazon won’t say if it hands your Echo data to the government. 🔍

Inside Telegram’s ambitious $1.2B ICO to create the next Ethereum. 🤔

Bitconnect, which has been labelled a Ponzi scheme, shuts down. 💸

Dark Energy Survey releases first three years of data. 🚀

The end of the conference era? 📉

New monthly dataset shows where people fall into Wikipedia rabbit holes. 🙊

Data viz: The temperature of the world since 1850:

Enjoy reading this? Join my free weekly data digest. 🚀
