Genius or Subpar AI Mathematician? New Study Questions ChatGPT’s Mathematical Capabilities
The November 2022 release of ChatGPT garnered unprecedented public and media attention. OpenAI’s conversational large language model (LLM) was widely applauded for its ability to answer complex queries, generate correct computer code and coherent long-form essays, and even solve math problems. But might that last claim have been premature?
In the new paper Mathematical Capabilities of ChatGPT, a research team from the University of Oxford, TU Wien, University of Cambridge, University of Vienna, and Princeton University tests ChatGPT’s mathematical capabilities on publicly available and hand-crafted datasets and evaluates its suitability as an assistant to professional mathematicians. The team concludes that despite the glowing media reviews, ChatGPT’s mathematical abilities “are significantly below those of an average mathematics graduate student.”
The team summarizes their main contributions as follows:
- Insight for mathematical use is provided. We show for which types of questions and which domains of mathematics ChatGPT may be useful and how it could be integrated into the…