TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Understanding Large Language Models: The Physics of (Chat)GPT and BERT

12 min read · Jul 20, 2023


ChatGPT and ice crystals may have more in common than one might think (credit: 15414483@pixabay)

ChatGPT and, more broadly, Large Language Models (LLMs) have become ubiquitous in our lives. Yet the mathematics and internal structure of LLMs remain obscure to most of the general public.

So, how can we move beyond perceiving LLMs like ChatGPT as magical black boxes? Physics may provide an answer.

Everyone is somewhat familiar with our physical world: objects such as cars, tables, and planets are composed of trillions of atoms, governed by a simple set of physical laws. Similarly, complex entities like ChatGPT have emerged from simple building blocks, and are capable of generating highly sophisticated concepts like art and science.

It turns out that the equations governing the building blocks of LLMs are analogous to our physical laws. So, by understanding how complexity arises from simple physical laws, we might glean some insight into how and why LLMs work.
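To make this analogy concrete, here is a minimal sketch (my own illustration, not code from the article) of one such correspondence: the softmax that appears inside a transformer's attention mechanism has exactly the same mathematical form as a Boltzmann distribution from statistical physics, p_i ∝ exp(−E_i / T), if each attention score is reinterpreted as a negative energy.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)   # shift for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()

# Attention view: scores are query-key dot products (toy vectors).
q = np.array([1.0, 0.5])
keys = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [0.7, 0.7]])
attn = softmax(keys @ q)

# Physics view: identical math, with energies E_i and temperature T.
E = -(keys @ q)   # reinterpret each score as a (negative) energy
T = 1.0
boltzmann = np.exp(-E / T) / np.exp(-E / T).sum()

print(np.allclose(attn, boltzmann))  # the two distributions coincide
```

The shift by the maximum score in `softmax` changes nothing mathematically (it cancels in the ratio), which is precisely why the attention weights and the Boltzmann probabilities come out identical.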

Complexity from Simplicity




Written by Tim Lou, PhD

Data Scientist @ TTD | ex Researcher @ Berkeley/LBNL | Particle Physics PhD @ Princeton | Podcast @ quirkcast.org