Small data is the next big thing in machine learning

Virginia Dignum
4 min read · Feb 22, 2018

Big data worries me. And the idea that Artificial Intelligence (AI) cannot do without big data worries me most. Just this week, a very quick and by no means significant mini-survey during the World Summit AI in Amsterdam revealed that around 20% of the participants expect that more and bigger data is key to the further development of AI. In fact, I believe that small data will be the next big thing in AI; it will make AI smarter. Here is why:

Big data demands a lot of effort, is very expensive and brings with it many problems. One needs to collect that data, maintain it, manage it, and develop all kinds of governance structures to deal with it. It brings problems with privacy and security, and it is vulnerable to attacks, misuse and misinterpretation. Data is context- and time-dependent, which means that if we want to keep it up to date we need to continuously collect more data, about more situations, in more contexts. Thus big data leads to more data, which leads to bigger data. Of course, there are many reasons and many domains where one needs big data. But I claim that AI is not one of those domains.

In AI, big data is mostly used for machine learning (ML). Current ML techniques are mostly applications of probabilistic theories, in different flavours. Basically, ML is the search for correlations within data: the more data, the more certain one is that the correlations are correct. Hence the need for big, bigger, biggest data. Typically, an image recognition algorithm needs to be trained with several…
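
To make that statistical intuition concrete, here is a minimal sketch (not from the article itself; it assumes Python with NumPy and SciPy, and uses synthetic data with a fixed, modest underlying correlation). It estimates the same correlation from samples of increasing size, showing how the estimate stabilises and the uncertainty shrinks as the sample grows — the usual argument for ever bigger data.

```python
# Minimal illustration: estimating a fixed, modest correlation
# from samples of increasing size.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def estimate_correlation(n):
    """Draw n points from two weakly correlated variables and
    return the estimated Pearson correlation and its p-value."""
    x = rng.normal(size=n)
    y = 0.3 * x + rng.normal(size=n)  # true correlation is modest (~0.29)
    r, p = stats.pearsonr(x, y)
    return r, p

for n in (50, 500, 5000, 50000):
    r, p = estimate_correlation(n)
    print(f"n={n:6d}  estimated r={r:+.3f}  p-value={p:.1e}")
```

Running this, the estimated correlation at small n scatters around the true value, while at large n it converges and the p-value collapses: more data buys more certainty about a correlation, which is exactly why current ML pushes toward bigger datasets.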
