Towards an AI just for the elite?

Nova Talent · Published in The Nova Network · 3 min read · Jul 27, 2020

This article was published on Nova Connect by our member Pablo Vicente, Data Scientist at Morgan Stanley.

At the beginning of the century, we witnessed a rapid development of AI in general and Machine Learning in particular. Such growth was possible thanks to three factors:

  1. The availability of enough data to train Machine Learning models
  2. Breakthroughs within Machine Learning subfields such as Computer Vision and NLP
  3. Graphics Processing Units (GPUs), which brought cheap compute power and massive parallelization

These elements enabled the development of the field and fuelled increasing interest from both industry and academia. As a consequence, Machine Learning and Data Science are now two of the hottest topics in technology, attracting talent and investment like never before.

However, the race to beat the latest metrics in contests such as ImageNet, and the eagerness of tech companies to position themselves as leading innovators, is reversing the path walked during the last decades. In certain fields, algorithms are becoming harder to run and ridiculously expensive to train, to the point where only an elite set of companies can afford it. If the research community spent the last few decades working towards making AI more accessible, recent advances seem to be making the field less democratic. In my opinion, there are multiple reasons behind this trend, but I will only mention one and elaborate on a second.

Recent advances seem to make AI less accessible to the broader public. Why is that?

First, there is an increasing number of papers whose results cannot be reproduced because the datasets are not public. Having the algorithm alone is no longer enough to use it, and companies know it. Joelle Pineau is one of the scientists, along with many others, trying to change this [1].

The second and more worrisome factor is the exponential growth in the number of parameters, which makes it impossible for most companies to reproduce the results or use these models in production. There is a trend towards increasing the size of models in order to obtain algorithms with better generalization capabilities, most notably language models. The latest version of GPT, developed by OpenAI, has a total of 175 billion parameters [2]. That figure probably does not tell you much, but there is a universal language all of us understand: money. Training GPT-3 costs more than $12 million [3], and most likely many attempts were needed to come up with the right configuration, which can bring the total cost up to hundreds of millions… and that is just for one model! Not many companies can spend that much money on research, let alone on training one single algorithm. We can find the following quote in the GPT-3 paper [4]:

Unfortunately, a bug in the filtering caused us to ignore some overlaps, and due to the cost of training it was not feasible to retrain the model.
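
To get a feel for the orders of magnitude involved, here is a rough back-of-envelope sketch in Python. The "6 FLOPs per parameter per training token" rule of thumb, the token count, the sustained GPU throughput and the price per GPU-hour are illustrative assumptions of mine, not figures taken from the paper, so treat the result as an order-of-magnitude estimate only.

    # Back-of-envelope estimate of GPT-3 training compute and cost.
    # All hardware and price figures below are illustrative assumptions.
    n_params = 175e9   # parameter count reported for GPT-3 [2]
    n_tokens = 300e9   # assumed number of training tokens

    # Rule of thumb: roughly 6 FLOPs per parameter per training token
    total_flops = 6 * n_params * n_tokens          # ~3.15e23 FLOPs

    # Assumed sustained throughput of one V100-class GPU in mixed precision,
    # well below its theoretical peak to account for real-world utilization
    sustained_flops_per_gpu = 28e12                # 28 TFLOPS

    gpu_hours = total_flops / sustained_flops_per_gpu / 3600

    price_per_gpu_hour = 1.5                       # assumed cloud price, USD
    cost_usd = gpu_hours * price_per_gpu_hour

    print(f"Total compute : {total_flops:.2e} FLOPs")
    print(f"GPU-hours     : {gpu_hours:,.0f}")
    print(f"Estimated cost: ${cost_usd / 1e6:.1f} million")

Even with these optimistic assumptions, a single training run lands in the millions of dollars, the same order of magnitude as the estimates cited above, and that is before counting any exploratory or failed runs.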

The following graph, adapted from the DistilBERT paper [5], shows the growth in the number of parameters of some of the latest language models. There is a clear exponential trend, which makes one wonder whether the impressive results of these models are due to their size or their architecture.
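
The same trend can be seen directly from the parameter counts of these models; the short Python sketch below prints them as a crude log-scale chart. The counts are approximate, publicly reported figures from my own recollection rather than values read off the graph, so take them as rough orders of magnitude.

    import math

    # Approximate, publicly reported parameter counts (in millions) for some
    # well-known language models; rough, illustrative figures only.
    models = {
        "ELMo (2018)":        94,
        "BERT-large (2018)":  340,
        "DistilBERT (2019)":  66,
        "GPT-2 (2019)":       1_500,
        "Megatron-LM (2019)": 8_300,
        "T5-11B (2019)":      11_000,
        "GPT-3 (2020)":       175_000,
    }

    # A crude log-scale bar chart: each new flagship model is roughly an order
    # of magnitude larger than the last, while DistilBERT moves the other way.
    for name, millions in models.items():
        bar = "#" * int(math.log10(millions) * 5)
        print(f"{name:<20} {millions:>8,}M  {bar}")

Even on a logarithmic scale the counts climb steadily, which is exactly the exponential trend the original figure illustrates.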

References

[1] https://www.wired.com/story/artificial-intelligenc...

[2] https://www.technologyreview.com/2020/07/20/100545...

[3] https://lambdalabs.com/blog/demystifying-gpt-3/

[4] https://arxiv.org/pdf/2005.14165.pdf

[5] https://arxiv.org/pdf/1910.01108.pdf
