Guilherme Baptista
LBPE Score: A New Perspective for Evaluating AI LLMs
Evaluating 9 AI models, ChatGPT outperforms Gemini and others in areas not captured by popular benchmarks. Mistral AI is getting closer to…
9 min read · Jan 1, 2024

Guilherme Baptista
Gemini claims superiority over ChatGPT: I tried to replicate their findings
My MMLU test reproduction matches GPT-4’s results but contradicts those of GPT-3.5 and Gemini Pro, including their reported performance…
5 min read · Dec 25, 2023