Gemini Pro vs. GPT-3.5: Another Evaluation, Another Conclusion
Changing hyperparameters is all you need
When Google announced Gemini, they presented an evaluation showing that Gemini Pro, the best version of Gemini currently available through Google’s API, significantly outperforms GPT-3.5 on some benchmarks.
We don’t know much about this evaluation. Many parameters, such as the prompts and the decoding hyperparameters, were not disclosed, yet we know they have a huge influence on the final results.
To better understand how Gemini compares to GPT models, CMU’s NeuLab performed a new evaluation on a much larger set of tasks:
- Knowledge-based question answering (MMLU)
- Reasoning (BIG-Bench Hard)
- Math (GSM8K, SVAMP, ASDiv, MAWPS)
- Code generation (HumanEval, ODEX)
- Translation (FLORES)
- Web instruction following (WebArena)
They ran Gemini Pro, GPT-3.5 Turbo, GPT-4 Turbo, and Mixtral on these benchmarks using the same prompts and the same decoding hyperparameters for all of them.
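Holding the decoding hyperparameters constant across providers can be as simple as defining one shared configuration and mapping it onto each API’s request format. The sketch below illustrates the idea in Python; the function names, config values, and structure are illustrative assumptions, not the actual NeuLab evaluation harness.

```python
# A minimal sketch of controlling decoding hyperparameters across APIs.
# The config values and helper functions are illustrative assumptions,
# not the actual harness used in the NeuLab evaluation.

# One shared decoding configuration, reused for every model under test.
DECODING_CONFIG = {"temperature": 0.0, "top_p": 1.0, "max_output_tokens": 1024}


def openai_request(model: str, prompt: str) -> dict:
    """Build keyword arguments for an OpenAI chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": DECODING_CONFIG["temperature"],
        "top_p": DECODING_CONFIG["top_p"],
        "max_tokens": DECODING_CONFIG["max_output_tokens"],
    }


def gemini_request(prompt: str) -> dict:
    """Build a request dict with a generation_config for the Gemini API."""
    return {
        "contents": prompt,
        "generation_config": {
            "temperature": DECODING_CONFIG["temperature"],
            "top_p": DECODING_CONFIG["top_p"],
            "max_output_tokens": DECODING_CONFIG["max_output_tokens"],
        },
    }


# Both requests now carry identical decoding settings, so any score
# difference on a benchmark cannot be blamed on sampling parameters.
oa = openai_request("gpt-3.5-turbo", "What is 2+2?")
gm = gemini_request("What is 2+2?")
assert oa["temperature"] == gm["generation_config"]["temperature"]
assert oa["top_p"] == gm["generation_config"]["top_p"]
```

Pinning the temperature to 0 (greedy decoding) is a common choice for benchmark reproducibility, since it removes sampling randomness from run to run.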