OXEN AI. Build World-Class AI Datasets. Together.

Pinned: We version our code, why not our data? (in Oxen Herd)
All machine learning solutions start with a good dataset. The author of “Deep Learning with Python” goes as far as stating…
Oct 31, 2022
How to Train Diffusion for Text from Scratch | Oxen.ai
This is part two of a series on Diffusion for Text with Score Entropy Discrete Diffusion (SEDD) models. Today we will be diving into the…
Apr 30

How to train Mistral 7B as a “Self-Rewarding Language Model” | Oxen.ai
About a month ago we went over the “Self-Rewarding Language Models” paper by the team at Meta AI with the Oxen.ai Community. The paper felt…
Mar 20

ArXiv Dives — Diffusion Transformers
Diffusion transformers achieve state-of-the-art quality generating images by replacing the commonly used U-Net backbone with a transformer…
Mar 13

Arxiv Dives — Toolformer: Language models can teach themselves to use tools
Large Language Models (LLMs) show remarkable capabilities to solve new tasks from a few textual instructions, but they also paradoxically…
Feb 13

Self-Rewarding Language Models — ArXiv Dives with Oxen.ai
The goal of this paper is to see if we can create a self-improving feedback loop to achieve “superhuman agents”. Current language models…
Feb 6

Arxiv Dives — Direct Preference Optimization (DPO)
This paper provides a simple and stable alternative to RLHF for aligning Large Language Models with human preferences called “Direct…
Jan 30

Arxiv Dives — Efficient Streaming Language Models with Attention Sinks
This paper introduces the concept of an Attention Sink which helps Large Language Models (LLMs) maintain the coherence of text into the…
Jan 20

Arxiv Dives — How Mixture of Experts works with Mixtral 8x7B
Mixtral 8x7B is an open source mixture of experts large language model released by the team at Mistral.ai that outperforms Llama-2 70B and…
Jan 13

Arxiv Dives — LLaVA 🌋 an open source Large Multimodal Model (LMM)
What is LLaVA?
Jan 7