DAIR.AI

Democratizing Artificial Intelligence Research, Education, and Technologies

[LLMs News] GPT-4o mini, Codestral Mamba, Prompt Engineering Methods, Mistral Nemo, Spreadsheet LLM

Huge efforts in small language models and improving reasoning efficiency in LLMs.

Jul 22

LLM News: ESM3, CriticGPT, Gemma 2, LLM Compiler, LongRAG, GraphReader

Lots of exciting developments in AI and LLMs in the past couple of days.

Jul 1

[LLM News] Claude 3.5 Sonnet, Open-Sora, Context Caching, PlanRAG, Safe SuperIntelligence Inc

Developments in AI and LLMs just don’t slow down.

Jun 22

Papers Explained: Mistral 7B

Mistral 7B is an LLM engineered for superior performance and efficiency. It leverages grouped-query attention (GQA) for faster inference…

Oct 22, 2023

Papers Explained 63: LLaMA 2 Long

LLaMA 2 Long is a series of long-context LLMs built through continual pretraining from Llama 2 with longer training sequences that support…

Oct 19, 2023

Papers Explained 62: Code Llama

Code Llama is a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models…

Oct 15, 2023

Papers Explained 61: Humpback

Instruction back translation is a scalable method to build a high-quality instruction following language model by automatically labeling…

Oct 12, 2023

Papers Explained 60: Llama 2

Llama 2 is a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters…

Oct 8, 2023

Papers Explained 59: Falcon

As larger models require pretraining on trillions of tokens, it is unclear how scalable the curation of “high-quality” corpora is, such as…

Oct 5, 2023

Most Popular

Detecting Sarcasm with Deep Convolutional Neural Networks

Overview: This paper addresses a key NLP problem known as sarcasm detection using a combination of models based on convolutional neural…

Apr 30, 2018

A Light Introduction to Transfer Learning for NLP

In this post, I will introduce transfer learning for natural language processing and key questions necessary to better understand this…

Jul 26, 2018

Deep Learning for NLP: An Overview of Recent Trends

In a timely new paper, Young and colleagues discuss some of the recent trends in deep learning-based natural language processing (NLP)…

Aug 23, 2018
