The Rise of Open-source AI

Will a new era of innovation dethrone tech giants?

Seth Grief-Albert
QMIND Technology Review
10 min read · Jun 5, 2023

--

Image generated by OpenAI’s DALL·E 2

Introduction: A New Era

Technology is an interesting phenomenon. It can exist in one form or another for quite a while, hidden or inaccessible, until one day it seems to emerge fully formed and is voraciously consumed by the public. The newest gold rush of the 21st century is upon us: the era of commercialized Artificial Intelligence (AI) is accelerating, and with it, the battle for its potential power and profits is being waged.

The ‘AI’ that has risen to extraordinary prevalence is more aptly labeled by its specific subset, Large Language Models (LLMs), but popular culture has a way of assigning buzzwords that stick to an entire field. Many people first heard about “this new AI technology” from the news or a family member rather than their usual Twitter feed or technical blog – such is the speed of its spread into the marketplace. The field of AI has been making incredible progress for years, but the moment of consumer explosion came on November 30th, 2022, when OpenAI released ChatGPT to the public on a simple webpage. The following chart speaks for itself:

Visual retrieved from: https://explodingtopics.com/blog/chatgpt-users#

It is clear that people are hungry to interact with AI technology. Imagine yourself as a developer who sees this chart, or as the executive of a giant tech corporation, or the founder of a startup. Borrowing from history, we can liken the current era of AI to the Medieval period: rulers held power and riches and guarded their castles while the commoners were excluded, and from time to time roving barbarians would attempt to usurp them. Large corporations (Google, Meta, Microsoft, etc.) sit in the throne room, while developers and academics play the role of outsiders. How does this story unfold? Let’s return to the beginning of 2023.

Faster, Faster, Faster

ChatGPT had been out for over a month, and the hype around Large Language Models was in full swing. But what exactly is an LLM? They originate as ‘foundation models’: neural networks that have gone through long training runs over vast quantities of text. The result is a model that can respond to language queries with reasonable-sounding answers. These models have billions of parameters (often called weights) that, for our purposes, can be thought of as all of the patterns learned during training. These parameters are malleable and can be fine-tuned: the model is trained further on a task more specific than general next-word prediction. The most popular fine-tuned task we’ve seen thus far has been natural conversation, but a close runner-up is having the model follow written instructions.
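
To make this concrete, here is a minimal sketch of querying a small, publicly available foundation model, assuming the Hugging Face transformers library is installed; GPT-2 stands in for a modern LLM, which works the same way at a vastly larger scale.

```python
# A minimal sketch of prompting a foundation model, assuming the Hugging Face
# `transformers` library. GPT-2 is a small stand-in for a modern LLM.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")

# The model continues the text token by token, sampling from learned patterns.
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```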

ChatGPT’s foundation model is GPT-3, which OpenAI refined into the GPT-3.5 series and tuned for conversation using Reinforcement Learning from Human Feedback (RLHF). In essence, real people ranked batches of responses the LLM had given, selecting the ones most like helpful, natural conversation; those preferences trained a reward model, which in turn guided further fine-tuning. This human-in-the-loop process turned a predictive-text model (one that had already been around for over two years) into the global marvel we know today. Fine-tuning is a powerful tool that can turn the chaos of data into the semblance of order.
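
The reward-modelling step at the core of RLHF can be sketched in a few lines. The toy example below, assuming PyTorch, uses hypothetical placeholder scores to show the standard pairwise preference loss that pushes a reward model to score human-chosen responses above rejected ones.

```python
# A toy sketch of the reward-modelling step inside RLHF, assuming PyTorch.
# In practice the scores come from a learned reward model; here they are
# hypothetical placeholders for a batch of (chosen, rejected) response pairs.
import torch
import torch.nn.functional as F

reward_chosen = torch.tensor([1.2, 0.3, 2.1])    # scores of human-preferred responses
reward_rejected = torch.tensor([0.4, 0.9, 1.0])  # scores of rejected responses

# Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).
# Minimizing it teaches the reward model to rank preferred responses higher.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss.item())
```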

But OpenAI was not the only player in the game. Quietly working away, Meta was preparing the release of its own foundation model. On February 24th, 2023, LLaMA (Large Language Model Meta AI) was launched into the world. It didn’t take long for things to go awry.

The LLaMA gets loose. Image retrieved from istockphoto.com

Fast forward less than a week. The afternoon of March 2nd, 2023, marked a new beginning for AI decentralization: a file containing the weights of LLaMA was leaked to the public by an anonymous user on the imageboard 4chan. This prompted an explosion of interest that took the world of AI by storm. It didn’t take much time for the weights to make their way across the internet into GitHub and Hugging Face territory – essentially the internet’s front pages of software and AI, respectively.

Off to the Races

The floodgates had been opened: with LLaMA’s weights at everyone’s fingertips, anyone could tap into the power of a foundation model. Open-source software lets individuals around the world read, modify, and build on a shared codebase. Think of Wikipedia, where anyone can edit a massive encyclopedia – open-source software is the same idea applied to a decentralized codebase. It is public and anti-siloed, which makes it accessible and thus incredibly popular with software developers globally. It is also the perfect petri dish for AI experimentation.

Stanford was quick to get in on the LLaMA action. Working from the initial leak for just over a week, researchers fine-tuned the language model, dubbed it Alpaca, and released it on March 13th. For the staggeringly low cost of roughly $600, they had achieved results rivaling state-of-the-art models in instruction following, a popular branch of language modelling. But they were still bound by Meta’s license, right? Well, kind of. Alongside Alpaca came widespread adoption of low-rank fine-tuning, in which only a small set of adapter weights is trained and shared – loosening the practical grip of Meta’s property rights over derivatives. This strategy let anyone repeat Stanford’s process on consumer-grade hardware in a remarkably short period of time: we’re talking a beefy computer and a couple of hours.
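
As a rough illustration, here is what low-rank fine-tuning looks like in code, assuming the Hugging Face peft and transformers libraries; GPT-2 stands in for a LLaMA-class model, and the hyperparameters are illustrative only.

```python
# A minimal sketch of low-rank adaptation (LoRA), assuming the Hugging Face
# `peft` and `transformers` libraries. GPT-2 stands in for a LLaMA-class model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Freeze the base model and inject small trainable low-rank matrices into the
# attention layers; only these adapters need to be trained and shared.
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Because the adapter weights are tiny relative to the full model, this is what put fine-tuning within reach of a single consumer GPU and an afternoon of compute.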

Amid an onslaught of applications built on LLaMA, an established developer group called Nomic AI released a monumental project, GPT4All:

GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer grade CPUs.

– Nomic AI

Now, with this open-source ecosystem bootstrapped from harvested language models and rogue data, even more people could build for themselves. You want to run an LLM without connecting to the internet? You got it. Privacy? Built in.
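
Running a model this way takes only a few lines. The sketch below assumes the gpt4all Python bindings are installed; the model filename is illustrative, since the available model files have changed over time.

```python
# A minimal sketch of local inference with the `gpt4all` Python bindings.
# The model file is downloaded once; after that, everything runs offline
# on a consumer-grade CPU.
from gpt4all import GPT4All

# Hypothetical model filename; check the GPT4All catalog for current options.
model = GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")
response = model.generate("Why might someone run an LLM locally?", max_tokens=100)
print(response)
```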

Open-source AI was booming – getting faster, cheaper, and more distributed by the day. This caught the attention of some big players. On May 4th, an internal Google document leaked, titled ‘We Have No Moat, and Neither Does OpenAI’. Written by a researcher there as a personal opinion, it chronicled the fast pace of LLM development since the start of 2023 and made a case that open-source is in direct competition with Google’s research (or more specifically, Google DeepMind). The titular “moat” ties in wonderfully with our Medieval castle analogy: how are major technology corporations supposed to defend their AI advancements when the world of open-source seems to be lapping them?

“We need them more than they need us”

Put yourself in the shoes of an AI researcher at Google. Closely following the acceleration of open-source AI research, with its timescales of days and weeks, you might feel left behind as part of a massive company that works at a slower, more deliberate pace. That is how a large corporation is structured: intense market analysis, conformity to existing policy, strategic imperatives. It is surely difficult to resist the omnipresent whirlwind of AI hype.

By their nature, private companies silo talent to maintain a competitive advantage. OpenAI began as a nonprofit that published public-facing research; now, as a capped-profit organization, it keeps its most valuable progress to itself. But is the metaphorical silo of Big Tech actually under threat? Is the castle doomed to be invaded by open-source barbarians, or more catastrophically, made obsolete?

Defending the Castle

On the contrary, I believe these castles are naturally well defended, and it would be premature to imagine the end of Big Tech’s competitive advantage so soon. Let’s go through a few of the “moats” that solidify these giants’ positions in the market.

Users: More than half of all humans alive today use Google products. An astronomical number of people also use products and services from Microsoft, Meta, and now OpenAI. Let’s face it: most people don’t know how to interact with the bleeding edge of open-source software, let alone what “forking a GitHub repository” means. It took an intuitive interface for ChatGPT to make global rounds, and we should be careful not to conflate technological progress with technology adoption. People stick with software already present in their lives. Even if third-party options are cheaper, more advanced, or more private, most will trust the reliable packaging they know and love.

Accessibility is extremely important in reaching consumers. A recent application of image-generation AI exemplified this: Lensa AI, which blew up around the release of ChatGPT in late November 2022. Delivered through a preexisting photography app, Lensa let you upload a few pictures of yourself and receive a set of AI-generated avatars. The app saw enormous profits in a remarkably short period. It turns out the same photo-to-avatar concept had existed a few weeks before Lensa’s release – the difference was that the earlier platform lived on a website, not in an app. Consumers interested in image-generation AI were best reached through their phones, not their computers. Interestingly, the delivery channel that bottlenecked the image-generation market was exactly what allowed LLMs to flourish: OpenAI deployed ChatGPT to the public through a simple webpage. A plausible explanation for the difference is the end result for the user – generating a cool avatar to send to your friends is easier on your phone, while getting a program like ChatGPT to do your homework is more convenient on the web. Microsoft followed the example set by OpenAI and released Bing Chat through its browser to a user base of millions.

Computing Platforms: I recently heard the analogy that if AI is the locomotive, computing power is its coal. It may be helpful to look back at the Industrial Revolution: who was getting rich? Setting up industry was of course lucrative, but think also of the large-scale coal-mining operations so characteristic of the time. The reality of hosting large AI models is that they demand an abundance of energy and robust infrastructure, and the only actors truly capable of serving such extreme demand are big technology corporations: Google Cloud, Microsoft Azure, and Amazon Web Services. Even if open-source research laps every individual AI lab in revolutionary style, these giants still sell the coal.

World-Class Research and Development: The largest technology corporations have recruited some of the best minds across AI fields. What differentiates open-source from any single research lab is the enormous volume of good work, which in the case of LLMs seems to hold some advantage over smaller amounts of concentrated, superb-quality research. Still, the best silos can be incredibly effective when they need to be.

The competitive edge of big technology corporations is that they chase profits at all costs. They are machines, perpetually working to churn data into revenue. It is becoming clear that, at least in the minds of these corporations, rogue developer communities will not stand in their way. With all the trusting users they could ever want, the power to host and distribute models, and the promise of the next generation of AI research, it would be naïve to proclaim the imminent fall of these well-fortified bastions.

Looking Beyond the Hype

Finally, the state of AI progress itself must be accounted for. LLMs are built upon transformer models, introduced to the world in the seminal 2017 paper ‘Attention Is All You Need’, and advancement around these models is just scratching the surface of their potential. It may be helpful to imagine research as climbing a tree, with the transformer tree appearing quite tall and bearing a lot of fruit.
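
At the heart of the transformer is scaled dot-product attention, which fits in a few lines of code. Here is a minimal NumPy sketch of the mechanism from ‘Attention Is All You Need’, with toy dimensions chosen purely for illustration.

```python
# A minimal sketch of scaled dot-product attention, the core operation of the
# transformer architecture, using plain NumPy with toy dimensions.
import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how strongly each query matches each key
    # Softmax over keys (max-subtracted for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # each output is a weighted mix of values

# Toy example: 3 tokens, embedding dimension 4.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(attention(Q, K, V).shape)      # (3, 4): one attended vector per token
```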

While lucrative now, eventually a new tree may need to be planted: one that ushers in the next generation of foundation models and their accompanying modifications. On the road to increasing the general capabilities of artificial intelligence, perhaps language models aren’t the be-all and end-all. In a scenario where the hype around these models dies out, I would stake my money on research labs at institutions like Google and OpenAI making the novel advancements, if only because an enormous share of breakthrough work has already emerged from them.

Since the document leak on May 4th, a lot has happened with Google’s deployment strategy. At the recent I/O, Google’s annual developer showcase, AI was a main theme, with announcements including the PaLM 2 model and a wider rollout of the Bard chatbot.

What Happens Next?

Predicting the future of AI from current knowledge is an elusive task. Is there potential for the power of cutting-edge artificial intelligence to be distributed into the hands of ethical actors? Training foundation models is incredibly expensive and almost entirely undertaken by large corporations, and that seems unlikely to change drastically in the foreseeable future. For getting research and technology to consumers, it is the avenues of delivery to market that seem to matter most. But a distinction should be made: research and development are not necessarily aligned with the goals of technology deployment. Big Tech will remain a powerful player, if only in its capacity to put AI into people’s everyday lives.

We have seen that when access to competitive resources is granted to the public through open-source, incredible progress can happen extremely fast. Imagine a world where AI researchers could bring their ideas to the implementation phase more quickly and spend more of their effort tackling the forefront of their fields. With the right guidelines, I think this is preferable to a corporate monopoly on state-of-the-art research.

This raises the question: what kind of world are we approaching? Will researchers and developers continue to rely on corporate giants to grace them with foundation models? How will the landscape of open-source AI evolve, and will it stand the test of time? These questions don’t have definite answers. If one thing is clear, it is that we are only at the beginning of this paradigm.

Any road followed precisely to its end leads precisely nowhere. Climb the mountain just a little bit to test that it’s a mountain. From the top of the mountain, you cannot see the mountain.

– Frank Herbert | Dune

If you have any questions or comments, please feel free to reach out to me on LinkedIn!

This article was written for QMIND — Canada’s largest student-run organization in disruptive technology and artificial intelligence.


Seth Grief-Albert
Applied Mathematics and Engineering Student @ Queen’s University