Category: Large Language Models
- New Work from Danqi Chen's Group at Princeton: RLHF Insufficient, RLVR Bounded? RLMT Forges a Third Path
- Meta Open-Sources the First Code World Model, Igniting the AI Community and Enabling "True Reasoning" for Agents
- The More You Think, The More You Err: CoT "Deep Deliberation" as a Catalyst for LLM Hallucinations!
- Boost LLM Reasoning Accuracy to 99% Without Fine-Tuning! Try DeepConf, a Lightweight Inference Framework | Latest from Meta
- Stanford Proposes New RL Paradigm: 3B Model Agent Outperforms Claude, GPT-4
- Why Do Large Language Models Hallucinate? OpenAI's Latest Research Uncovers the Reasons
- Stanford's Latest Research: Even the Strongest LLMs Struggle with Cutting-Edge Code! Gemini 2.5 Pro's Success Rate Under 40%
- Microsoft Introduces rStar2-Agent: "Thinking Smarter" Proves Far More Effective and Efficient Than Simply "Thinking Longer"
- [Master's Thoughts] Martin Fowler's AI Musings: We're in an Era Where Even the "Problem" Isn't Clear
- Meta Introduces Deep Think with Confidence: Boosting Reasoning Accuracy and Efficiency with Minimal Changes
- MCP Tool Stacking Is a Trap! Developer Guru: Command Line's 'Brittleness' Crushes AI! Better to Axe It Down to a Single Code Executor: 7 Calls Become 1! Netizens: Should've Abandoned Black Box Tools Long Ago!
- LLMs Dominate Math Boards, Yet Forget How to Chat? CMU et al. Reveal Striking Differences Between SFT and RL!
- A New Revolution in Reward Models! SWIFT Reads "Inner Voice" Instead of Text, Creating a Faster, Stronger, and More Cost-Effective AI Judge
- The "Mirage" of Chain-of-Thought Reasoning: An In-depth Look at LLM Generalization
- GPT-5 vs Claude Opus 4.1: Coding Capability Assessment
- In-depth Dissection of Large Models: From DeepSeek-V3 to Kimi K2, Understanding Mainstream LLM Architectures
- ARPO: Agentic Reinforced Policy Optimization, Enabling Agents to Explore One Step Further at Critical Moments
- Open-Sourcing the Largest High-Quality Scientific Reasoning Post-Training Dataset to Quickly Turn Qwen3 and Others into "Scientists"
- Wang Mengdi's Team Reviews "Self-Evolving Agents": From Static LLMs to Artificial Superintelligence (ASI)
- Anthropic Team Uncovers 'Persona Vectors' to Control Large Language Model Behavior, Cracking the Black Box of AI Madness
- Is Your Model's Attention Drifting? RUC and Tsinghua University Introduce LeaF: Pruning Distracting Tokens for Focused Learning
- Can Models Truly "Reflect on Code"? Beihang University Releases Repository-Level Understanding and Generation Benchmark, Refreshing the LLM Understanding Evaluation Paradigm
- ReaGAN: Empowering Each Node as an Intelligent Reasoning Expert in Graphs
- Google's Challenge: DeepSeek, Kimi and More to Compete in First Large Model Showdown Starting Tomorrow
- RAG Revolution! Graph-R1, the First RL-driven Graph Reasoning Agent