DAIR.AI

Democratizing Artificial Intelligence Research, Education, and Technologies

Prompt Caching with Claude 3.5 Sonnet

This new video tutorial provides an overview of the recent prompt caching feature introduced by Anthropic for their Claude 3.5 Sonnet…

Aug 23

Fine-tuning with GPT-4o Models

We are working on tiny tutorials that explore fine-tuning with LLMs.

Aug 22

[LLMs News] GPT-4o mini, Codestral Mamba, Prompt Engineering Methods, Mistral Nemo, Spreadsheet LLM

Huge efforts in small language models and improving reasoning efficiency in LLMs.

Jul 22

LLM News: ESM3, CriticGPT, Gemma 2, LLM Compiler, LongRAG, GraphReader

Lots of exciting developments in AI and LLMs in the past couple of days.

Jul 1

[LLM News] Claude 3.5 Sonnet, Open-Sora, Context Caching, PlanRAG, Safe SuperIntelligence Inc

Developments in AI and LLMs just don’t slow down.

Jun 22

Papers Explained: Mistral 7B

Mistral 7B is an LLM engineered for superior performance and efficiency. It leverages grouped-query attention (GQA) for faster inference…

Oct 22, 2023

Papers Explained 63: LLaMA 2 Long

LLaMA 2 Long is a series of long-context LLMs built through continual pretraining from LLaMA 2 with longer training sequences that support…

Oct 19, 2023

Papers Explained 62: Code Llama

Code Llama is a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models…

Oct 15, 2023

Papers Explained 61: Humpback

Instruction back translation is a scalable method to build a high-quality instruction-following language model by automatically labeling…

Oct 12, 2023

Most Popular

Detecting Sarcasm with Deep Convolutional Neural Networks

This paper addresses a key NLP problem known as sarcasm detection using a combination of models based on convolutional neural…

Apr 30, 2018

A Light Introduction to Transfer Learning for NLP

In this post, I will introduce transfer learning for natural language processing and key questions necessary to better understand this…

Jul 26, 2018

Deep Learning for NLP: An Overview of Recent Trends

In a timely new paper, Young and colleagues discuss some of the recent trends in deep learning-based natural language processing (NLP)…

Aug 23, 2018
