Category: Model Interpretability

Top Models Like GPT-5.4 and Claude Opus Exposed for 'Fake Reasoning': Is the Problem-Solving Process Just a 'Performance'?
Google's New Research Identifies the Crucial Tokens Where Large Models Ponder Deeply!
"Removing One Layer" Makes the Model Better at Tasks? HIT(SZ) | Yang Shuo's Team Discovers Task-Interfering Layers in VLMs
How Does Claude 4 Think? Senior Researchers Respond: RLHF Paradigm is Out, RLVR Proven in Programming/Mathematics