AIGuys

Deflating the AI hype and bringing real research and insights on the latest SOTA AI research papers. We at AIGuys believe in quality over quantity and are always looking to create more nuanced and detail-oriented content.

(How) Do LLMs Reason and Plan?

Large Reasoning Models (LRMs) have been all the rage for the last few months. The age of LLMs is over; now it is the LRMs' time. Be it Gemini 2.5, Claude's thinking mode, or OpenAI's o-series models, all of them have moved towards reasoning models. Fundamentally, all of them are still LLMs, but suddenly these models feel much better and smarter. Their ability to reason and plan seems to have increased manyfold. Every week another benchmark gets crushed, but as responsible researchers and AI enthusiasts, we must ask: how much of this development is real, and how much of it is just hype, a marketing gimmick?

Table of Contents

  • Why You Shouldn’t Trust Benchmarks
  • Types of Large Reasoning Models (Test Time Scaling & Post Training Methods)
  • Confusion About LLMs’ Reasoning Capabilities
  • How Good Are LRMs?
  • Is RL Overhyped In Making Machines Smarter?
  • Conclusion

Photo by Estée Janssens on Unsplash

Why You Shouldn’t Trust Benchmarks

I feel that most benchmarks are compromised in one way or another. Personally, I don't trust any benchmark anymore. They are mere indicators, not measures of absolute performance.

There have been many AI benchmark leaks in the past few years.
