Data Centric AI: What’s it all about?

For the past year or so, Data Centric AI has been a popular topic of discussion online. Every Data Science enthusiast knows that AI is all about data, so why “Data Centric AI”? Does it sound ambiguous? In this article, I will talk about Data Centric AI, a campaign launched by Professor Andrew Ng, which I think is a must-know for all Data Science enthusiasts and practitioners.

Clearing the ambiguity

Data Centric AI is a movement launched by Professor Andrew Ng to shift the attention of AI/ML practitioners from the model or algorithm to the quality of the data. Ng says that “Data is food for AI”. The movement focuses on producing quality data rather than optimizing the model for a given dataset; in other words, it thinks in the reverse direction. The idea is to first obtain high-quality, consistent data and only then move on to modelling.

What could change

Traditionally, the data is held constant and the model is optimized to get the best performance out of it. In other words, for a given dataset, the model is tuned to fit the data, noise included, until maximum performance is achieved. This is known as Model Centric AI.

With the Data Centric approach, the model is kept fixed and the quality of the data is improved. Even adding more data might not be the right solution at times if the quality of the added data is poor.
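
To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not anything taken from Ng's material: the toy dataset, the three hypothetical annotators, the deliberately simple 1-nearest-neighbour model, and helper names such as fix_inconsistent_labels are all made up for this example. The model code is never touched; only the labels are made consistent by a majority vote over duplicate inputs.

```python
from collections import Counter, defaultdict


def true_label(x):
    # Ground truth for the toy problem: inputs 5..9 are class 1, the rest class 0
    return 1 if x >= 5 else 0


def train_nearest_neighbor(examples):
    """'Train' the fixed model: 1-nearest-neighbour simply memorizes the data."""
    return list(examples)


def predict(model, x):
    # min() returns the first closest example, so ties resolve by data order
    _, nearest_label = min(model, key=lambda ex: abs(ex[0] - x))
    return nearest_label


def accuracy(model, test_set):
    correct = sum(predict(model, x) == y for x, y in test_set)
    return correct / len(test_set)


def fix_inconsistent_labels(examples):
    """Data-centric step: relabel each input with the majority vote of its duplicates."""
    votes = defaultdict(Counter)
    for x, y in examples:
        votes[x][y] += 1
    return [(x, votes[x].most_common(1)[0][0]) for x, _ in examples]


# Toy training data (an assumption): each input 0..9 is labelled three times,
# and for inputs 2, 4 and 6 one hypothetical annotator disagrees with the others.
noisy_train = []
for x in range(10):
    labels = [true_label(x)] * 3
    if x in (2, 4, 6):
        labels[0] = 1 - labels[0]
    noisy_train.extend((x, y) for y in labels)

test_set = [(x, true_label(x)) for x in range(10)]

# Same model code in both runs; only the training labels differ
model_before = train_nearest_neighbor(noisy_train)
model_after = train_nearest_neighbor(fix_inconsistent_labels(noisy_train))

acc_before = accuracy(model_before, test_set)
acc_after = accuracy(model_after, test_set)
print(f"same model, inconsistent labels: {acc_before:.2f}")
print(f"same model, consistent labels:   {acc_after:.2f}")
```

On this toy data the unchanged model goes from 0.70 to 1.00 accuracy once the labels are made consistent. Real datasets are rarely this tidy, and a majority vote is only one of many possible ways to improve data quality, but the sketch shows where the effort goes in a Data Centric workflow.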

What I think

Like many others, I think this movement will bring about a major change in how the field of AI is perceived. Even now, it is common to see people neglect data quality and consistency and jump straight to modelling. That may or may not give good results, because the quality and performance of a model depend on the data you give it. Remember, AI is not magic; it is just mathematical algorithms written in code. Give it good data and it is far more likely to give good results, even in real-world scenarios. What you give is what the model will learn; it’s that simple.

