Meta’s Multi-token Model, A New Beginning for AI?

A New Type of Faster… and Smarter LLMs?

Ignacio de Gregorio
9 min read · Jul 15, 2024
Image generated by the author using GPT-4o

A new, better way of training Large Language Models (LLMs)?

Meta offers exactly this with new research: a model that predicts multiple tokens at every step, not just one, and, unlike previous proposals, does so with no training overhead.

This not only speeds up the model's text generation but could also make it smarter, meaning we might be about to enter a new training paradigm for frontier AI.
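To make the idea concrete, here is a minimal sketch of what "predicting several tokens per step" can look like: a shared transformer trunk with a few parallel output heads, each one predicting a token further into the future. This is my own PyTorch illustration (the name MultiTokenLM, the layer sizes, and the head count are made up for the example), not Meta's actual code.

```python
import torch
import torch.nn as nn

class MultiTokenLM(nn.Module):
    """Toy sketch of multi-token prediction: one shared trunk,
    several output heads, each predicting a different future token."""

    def __init__(self, vocab_size=32000, d_model=512, n_future=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=4)
        # One linear head per future offset (t+1, t+2, ...), all sharing the trunk.
        self.heads = nn.ModuleList(
            [nn.Linear(d_model, vocab_size) for _ in range(n_future)]
        )

    def forward(self, tokens):
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.trunk(self.embed(tokens), mask=mask)  # (batch, seq, d_model)
        # Head i emits logits for the token i+1 steps ahead of each position.
        return [head(h) for head in self.heads]
```

In a setup like this, each head would get its own next-token-style loss against the sequence shifted by the matching offset, so the extra supervision reuses the same forward pass through the trunk, which is where the "no training overhead" claim comes from.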

A Weak Form of Learning

Today, all LLMs are trained in the same, very inefficient way.

A Universal Task Interface for LLMs

When training a deep neural network, you must define the task you want the model to optimize.

In the LLM case, that task is next-word prediction: the model receives an input text sequence and predicts the next token (a word or a subword).

Then, that token is appended to the sequence, and the entire sequence is fed through the model again to generate the next token.
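That loop is the standard autoregressive recipe every current LLM follows at inference time. A minimal sketch of it, assuming a Hugging Face-style model and tokenizer (greedy decoding, no sampling):

```python
import torch

def generate(model, tokenizer, prompt, max_new_tokens=50):
    """Standard next-token decoding: predict one token, append it,
    and feed the whole sequence back in for the next step."""
    tokens = tokenizer.encode(prompt, return_tensors="pt")
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits = model(tokens).logits                # (1, seq_len, vocab_size)
            next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
            tokens = torch.cat([tokens, next_token], dim=-1)            # append, repeat
    return tokenizer.decode(tokens[0])
```

In this naive form, every new token triggers another forward pass over the whole sequence; real systems cache intermediate results, but the one-token-per-pass bottleneck remains.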

And as with all neural networks today, we need to find a way to measure each prediction from the model and use this signal (how good or…
