Jeffrey Ip · LLM Evaluation Metrics: Everything You Need for LLM Evaluation · Although evaluating the outputs of Large Language Models (LLMs) is essential for anyone looking to ship robust LLM applications, LLM… · Jan 223
Jeffrey Ip · A Step-By-Step Guide to Evaluating an LLM Text Summarization Task · When you imagine what a good summary for a 10-page research paper looks like, you likely picture a concise, comprehensive overview that… · Dec 18, 2023
Jeffrey Ip · How to Evaluate LLM Applications · ChatGPT, the leading code generator, has exploded in popularity over the past year thanks to the seemingly omniscient GPT-4. Its ability to… · Nov 9, 2023
Jeffrey Ip · Why we replaced Pinecone with PGVector · Pinecone, the leading closed-source vector database provider, is known for being fast, scalable, and easy to use. Its ability to allow… · Oct 31, 2023
Jeffrey Ip · How to Build an LLM Evaluation Framework, from Scratch · Let’s set the stage: I’m about to change my prompt template for the 44th time when I get a message from my manager: “Hey Jeff, I hope… · Apr 8
Jeffrey Ip · LLM Testing in 2024: Top Methods and Strategies · Just a week ago, I was on a call with a DeepEval user who told me she considers testing and evaluating large language models (LLMs) as… · Feb 262
Jeffrey Ip · The Ultimate Guide to Fine-Tune LLaMA 2, With Evaluations · Fine-tuning a Large Language Model (LLM) comes with tons of benefits when compared to relying on proprietary foundational models such as… · Feb 21
Jeffrey Ip · LLM Benchmarking: Evaluating LLMs in 2024 · Picture LLMs ranging from 7 billion to over 100 billion parameters, each more powerful than the last. Among them are the giants: Mistral 7… · Jan 8
Jeffrey Ip · Why OpenAI Assistants is a Big Win for LLM Evaluation · A week after the famous, or infamous, OpenAI Dev Day, we at Confident AI released JudgementalGPT, an LLM agent built using OpenAI’s… · Nov 22, 2023
Jeffrey Ip · What is Retrieval Augmented Generation (RAG)? · Large language models like ChatGPT are powerful and versatile generators of natural language, but also extremely limited by the data… · Oct 25, 2023