TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Open Language Models

Everything You Should Know About Evaluating Large Language Models

From perplexity to measuring general intelligence

10 min read · Aug 28, 2023

Image generated by the author using Stable Diffusion.

As open-source language models become more readily available, it is easy to get lost in all the options.

How do we determine their performance and compare them? And how can we confidently say that one model is better than another?

This article provides some answers by presenting training and evaluation metrics, along with general and task-specific benchmarks, to give you a clear picture of your model’s performance.

If you missed it, take a look at the first article in the Open Language Models series:

Perplexity

Language models define a probability distribution over a vocabulary of words to predict the next word in a sequence. Given a text, a language model assigns a probability to each word in its vocabulary, and the next word is chosen from that distribution, in the simplest case by selecting the most likely one.
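
To make the idea concrete, here is a minimal sketch of how raw model scores become a next-word distribution. The four-word vocabulary, the scores, and the context sentence are invented for illustration and do not come from any real model. A softmax turns the scores into probabilities, and a greedy rule picks the highest one.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocabulary and scores for the context "The cat sat on the"
vocab = ["mat", "dog", "moon", "sofa"]
logits = [3.2, 0.1, -1.0, 2.5]

probs = softmax(logits)
next_word_distribution = dict(zip(vocab, probs))

# Greedy selection: pick the word with the highest probability
most_likely = max(next_word_distribution, key=next_word_distribution.get)

for word, p in next_word_distribution.items():
    print(f"{word}: {p:.3f}")
print("Selected:", most_likely)
```

Real models work over vocabularies of tens of thousands of tokens, and often sample from the distribution instead of always taking the top entry, but the principle is the same.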

Published in TDS Archive

Written by Donato Riccio

AI Engineer specialized in Large Language Models.
