How much is data worth in the age of generative AI?

Enrique Dans
4 min read · Feb 20, 2024

IMAGE: Two big stacks of paper files with a few labels, in black and white
IMAGE: Ag Ku — Pixabay

Reddit, which as recently as October threatened to block Google’s access to its pages, has now accepted an offer from an AI algorithm development company that wants to use the social news site’s content to train its models. Meanwhile, Apple and OpenAI continue to offer mainstream media outlets multimillion-dollar deals for access to their news. All of which raises some very interesting questions: how much is data worth, where is it, and how can it be monetized?

The recent explosive growth of generative AI is based on a key decision that the companies developing it made without much thought: instead of training their algorithms on specific data sets, they trawled information from the web. Rulings against the owners of sites such as LinkedIn, which seemed to suggest that openly published data can be collected through web scraping and used freely, were countered by rulings against companies like Clearview, which clearly abused this practice and created a flagrant privacy nightmare.

Enrique Dans

Professor of Innovation at IE Business School and blogger (in English here and in Spanish at enriquedans.com)