TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

The History of Open-Source LLMs: Better Base Models (Part Two)

How LLaMA, MPT, Falcon, and LLaMA-2 put open-source LLMs on the map…

16 min read · Nov 18, 2023

(Photo by Iñaki del Olmo on Unsplash)

Open-source research on large language models (LLMs) is incredibly valuable, as it aims to democratize a powerful and influential technology. Although open-source LLMs are now commonly used and widely studied, this area of research struggled at first: early open-source LLMs performed poorly and were heavily criticized. Within this overview, we will study a line of research that changed this narrative by making high-performing pre-trained LLMs available to everyone. Given that pre-training a language model is so expensive, the models studied here are especially impactful: once these high-performing base models were created and released, anyone could conduct research on top of them at marginal added cost.
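
To make the "marginal added cost" point concrete, here is a minimal sketch of how a released base model can be downloaded and queried with the Hugging Face transformers library. The checkpoint name, prompt, and generation settings are illustrative assumptions rather than details from this overview.

```python
# Minimal sketch: reusing a released open-source base model for inference.
# The checkpoint name and prompt below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/falcon-7b"  # other open base models (e.g., MPT, LLaMA-2) load similarly

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Running inference on a pre-trained checkpoint costs a tiny fraction
# of the compute needed to pre-train the model from scratch.
inputs = tokenizer("Open-source LLMs are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```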

“The capabilities of LLMs are remarkable considering the seemingly straightforward nature of the training methodology.” — from [14]

The current series. This overview is part two of a three-part series on the history of open-source LLMs. The first part in the series overviewed initial attempts at creating open-source LLMs. Here, we will study the most popular open-source base models (i.e., language models that…

Published in TDS Archive

Written by Cameron R. Wolfe, Ph.D.

Research @ Netflix • Deep Learning Ph.D. • I make AI understandable
