Top researchers from OpenAI, DeepMind, and Anthropic are grappling with the trade-offs between capability and safety, and what they mean for the future of the industry.
tl;dr — Current AI models can “think out loud” in English (Chain of Thought, or CoT), giving us a rare chance to monitor their reasoning for safety risks. This is a fragile opportunity. Future AI might learn to hide its thoughts or use unreadable internal “language,” especially if we’re not careful about how we train it.
Top AI labs agree we must actively work to preserve this "monitorability" as a key safety layer, treating it as a precious resource before the window closes.
Dr. Adnan Masood is an Engineer, Thought Leader, Author, AI/ML PhD, Stanford Scholar, Harvard Alum, Microsoft Regional Director, and STEM Robotics Coach.