
DAIR.AI

Democratizing Artificial Intelligence Research, Education, and Technologies

    Prompt Caching with Claude 3.5 Sonnet

    This new video tutorial provides an overview of the recent prompt caching feature introduced by Anthropic for their Claude 3.5 Sonnet…
    elvis
    Aug 23, 2024
    Fine-tuning with GPT-4o Models

    We are working on tiny tutorials that explore fine-tuning with LLMs.
    elvis
    Aug 22, 2024
    [LLMs News] GPT-4o mini, Codestral Mamba, Prompt Engineering Methods, Mistral Nemo, Spreadsheet LLM

    Huge efforts in small language models and improving reasoning efficiency in LLMs.
    elvis
    Jul 22, 2024
    LLM News: ESM3, CriticGPT, Gemma 2, LLM Compiler, LongRAG, GraphReader

    Lots of exciting developments in AI and LLMs in the past couple of days.
    elvis
    Jul 1, 2024
    [LLM News] Claude 3.5 Sonnet, Open-Sora, Context Caching, PlanRAG, Safe SuperIntelligence Inc

    Developments in AI and LLMs just don’t slow down.
    elvis
    Jun 22, 2024
    Papers Explained: Mistral 7B

    Mistral 7B is an LLM engineered for superior performance and efficiency. It leverages grouped-query attention (GQA) for faster inference…
    Ritvik Rastogi
    Oct 22, 2023
    Papers Explained 63: LLaMA 2 Long

    LLaMA 2 Long is a series of long-context LLMs built through continual pretraining from LLaMA 2 with longer training sequences that support…
    Ritvik Rastogi
    Oct 19, 2023
    Papers Explained 62: Code Llama

    Code Llama is a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models…
    Ritvik Rastogi
    Oct 15, 2023
    Papers Explained 61: Humpback

    Instruction backtranslation is a scalable method to build a high-quality instruction-following language model by automatically labeling…
    Ritvik Rastogi
    Oct 12, 2023
    Most Popular
    Detecting Sarcasm with Deep Convolutional Neural Networks

    Overview: This paper addresses a key NLP problem known as sarcasm detection using a combination of models based on convolutional neural…
    elvis
    Apr 30, 2018
    A Light Introduction to Transfer Learning for NLP

    In this post, I will introduce transfer learning for natural language processing and key questions necessary to better understand this…
    elvis
    Jul 26, 2018
    Deep Learning for NLP: An Overview of Recent Trends

    In a timely new paper, Young and colleagues discuss some of the recent trends in deep-learning-based natural language processing (NLP)…
    elvis
    Aug 23, 2018