From Tokens to Concepts: Meta Introduces Large Concept Models in Multilingual AI
Large Language Models (LLMs) have become indispensable tools for diverse natural language processing (NLP) tasks. Traditional LLMs operate at the token level, generating output one word or subword at a time. However, human cognition works on multiple levels of abstraction, enabling deeper analysis and creative reasoning.
Addressing this gap, in a new paper, Large Concept Models: Language Modeling in a Sentence Representation Space, a research team at Meta introduces the Large Concept Model (LCM), a novel architecture that processes input at a higher semantic level. This shift allows the LCM to achieve strong zero-shot generalization across languages, outperforming existing LLMs of comparable size.
The key motivation behind the LCM's design is to enable reasoning at the level of concepts rather than tokens, where a concept corresponds roughly to a sentence. To achieve this, the LCM operates in a semantic sentence-embedding space known as SONAR: instead of predicting the next token, the model encodes each sentence into a fixed-size SONAR vector and predicts the embedding of the sentence that follows. SONAR has already demonstrated strong performance on semantic similarity metrics such as xsim and has been used successfully in…
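To make the idea of "language modeling in a sentence representation space" concrete, here is a minimal sketch in PyTorch: a small autoregressive transformer that consumes a sequence of sentence embeddings and regresses the embedding of the next sentence. The dimensions, architecture, and random stand-in embeddings are illustrative assumptions for exposition; this is not Meta's actual LCM, and the real system uses the SONAR encoder and decoder rather than random vectors.

```python
# A minimal, illustrative sketch (not Meta's implementation): an autoregressive
# model that operates on sentence ("concept") embeddings instead of tokens.
import torch
import torch.nn as nn

CONCEPT_DIM = 1024  # assumed embedding size; SONAR produces fixed-size sentence vectors


class ToyConceptModel(nn.Module):
    """Predicts the next sentence embedding from the preceding ones."""

    def __init__(self, dim: int = CONCEPT_DIM, layers: int = 2, heads: int = 8):
        super().__init__()
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(block, num_layers=layers)
        self.head = nn.Linear(dim, dim)  # regress the next concept embedding

    def forward(self, concepts: torch.Tensor) -> torch.Tensor:
        # concepts: (batch, num_sentences, dim); a causal mask keeps it autoregressive
        mask = nn.Transformer.generate_square_subsequent_mask(concepts.size(1))
        return self.head(self.backbone(concepts, mask=mask))


# Stand-in for SONAR-encoded sentences: 4 documents, 6 sentences each.
# In the real pipeline these vectors would come from the SONAR text encoder.
sentences = torch.randn(4, 6, CONCEPT_DIM)
model = ToyConceptModel()
predicted = model(sentences)

# Simplified training objective: match each prediction to the actual next sentence embedding.
loss = nn.functional.mse_loss(predicted[:, :-1], sentences[:, 1:])
print(loss.item())
```

In the actual system, the stand-in embeddings would be produced by SONAR's text encoder and the predicted embedding decoded back into text by SONAR's decoder; because the same embedding space covers many languages, the concept-level model itself never needs to know which language it is processing.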