Meta Unleashes LLaMA: The New Large Language Model Outperforming GPT-3

Domenico Lombardi ✌️
Feb 24, 2023 · 3 min read

Battle of Giants: LLaMA Takes on GPT-3 in the Race for Dominance Among Large Language Models

Photo by Greg Bulla on Unsplash

Meta has recently unveiled a new large language model (LLM) called LLaMA. Available in several sizes, ranging from 7 billion to 65 billion parameters, the model has shown impressive results, outperforming GPT-3 on most benchmarks despite having far fewer parameters.

Like other LLMs, LLaMA is built on the transformer architecture and trained with the AdamW optimizer, whose decoupled weight decay helps reduce overfitting and improve training efficiency. This optimizer, combined with the diverse range of data used for training, has allowed LLaMA to achieve impressive results, especially when compared to state-of-the-art LLMs such as GPT-3.
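To make the optimizer choice concrete, here is a minimal PyTorch sketch pairing a toy transformer with AdamW. This is not LLaMA's actual training code; the model size, learning rate, and weight decay are illustrative assumptions.

```python
# Minimal sketch (not LLaMA's training code): a tiny transformer
# paired with AdamW. Hyperparameters are illustrative, not Meta's.
import torch
import torch.nn as nn

# Tiny stand-in transformer: a single encoder layer for demonstration.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True),
    num_layers=1,
)

# AdamW decouples weight decay from the gradient update, which tends to
# regularize large models more predictably than classic Adam + L2 penalty.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)

x = torch.randn(2, 16, 128)      # (batch, sequence, embedding)
loss = model(x).pow(2).mean()    # dummy loss, just to drive one step
loss.backward()
optimizer.step()
optimizer.zero_grad()
```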

Another striking feature of LLaMA is that it is trained exclusively on publicly available data from multiple domains, including CommonCrawl, Github, Wikipedia, and ArXiv (see the table below).

(Table: LLaMA's pre-training data mixture. Source: the LLaMA paper, Touvron et al., 2023.)
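As a rough intuition for how such a mixture is used, here is a hypothetical sketch of weighted sampling across corpora. The corpus names follow the article, but the weights are made-up placeholders, not the proportions Meta actually used.

```python
# Hypothetical sketch: pick which corpus the next training document
# comes from. Weights below are placeholders, NOT the paper's values.
import random

corpora = {
    "CommonCrawl": 0.70,   # illustrative weight only
    "Github":      0.10,
    "Wikipedia":   0.10,
    "ArXiv":       0.10,
}

def sample_source(weights: dict[str, float]) -> str:
    """Pick a corpus to draw the next training document from."""
    names, probs = zip(*weights.items())
    return random.choices(names, weights=probs, k=1)[0]

# Each training batch is assembled by repeatedly sampling a source.
batch_sources = [sample_source(corpora) for _ in range(8)]
print(batch_sources)
```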

The performance of LLaMA against other prominent LLMs is evaluated on the Massive Multitask Language Understanding benchmark (MMLU), introduced by Hendrycks et al. (2020). The benchmark consists of multiple-choice questions covering various domains of knowledge, including the humanities, STEM, and the social sciences. The evaluation was conducted in the 5-shot setting, using examples provided by the benchmark. On MMLU, LLaMA fell a few percentage points behind both Chinchilla-70B and PaLM-540B, on average and across most domains. However, this is not unexpected: LLaMA was pre-trained on a limited amount of books and academic papers from ArXiv, Gutenberg, and Books3, amounting to only 177 GB. In contrast, the models that outperformed it had access to up to 2 TB of books during training.

(Table: 5-shot MMLU results for LLaMA and other LLMs. Source: the LLaMA paper.)
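To illustrate what "5-shot" means in practice, here is a simplified sketch of how such a prompt can be assembled: five solved example questions are prepended to the test question, and the model must answer A, B, C, or D. The dictionary fields (question, options, answer) are assumed placeholders; the real benchmark supplies the development examples.

```python
# Simplified sketch of 5-shot MMLU prompt construction.
# The example dict layout is a hypothetical placeholder.
CHOICES = ["A", "B", "C", "D"]

def format_example(q: dict, with_answer: bool = True) -> str:
    """Render one question; include the answer only for the solved shots."""
    lines = [q["question"]]
    lines += [f"{c}. {opt}" for c, opt in zip(CHOICES, q["options"])]
    lines.append(f"Answer: {q['answer'] if with_answer else ''}".rstrip())
    return "\n".join(lines)

def build_5shot_prompt(dev_examples: list[dict], test_q: dict) -> str:
    """Prepend five solved examples, then the unanswered test question."""
    shots = [format_example(e) for e in dev_examples[:5]]
    return "\n\n".join(shots + [format_example(test_q, with_answer=False)])
```

The model's score is then simply the fraction of test questions for which its predicted letter matches the correct answer.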

Unlike GPT-3, LLaMA has been released to the research community, with the accompanying code published as open source under the GNU GPL v3.0 license. This means that LLaMA can be used, modified, and distributed for free, provided that any derivative works, including modifications, are released under the same license. The open-source approach adopted by Meta is expected to encourage further research and collaboration in the natural language processing (NLP) community, as well as democratize access to NLP technology.

Note: This story was written with the help of a generative AI tool

If you like the article and would like to support me make sure to:

  • 👏 Clap for the story (50 claps) to help this article be featured
  • Follow me and view more content on my Medium profile
  • 🔔 Follow Me: LinkedIn


Domenico Lombardi ✌️

Deep tech startup founder passionate about machine learning, startups, and sustainability. Sharing insights on how technology can create a better future.