Category: Large Language Models

AI Visionary Fei-Fei Li's Extensive Article Ignites Silicon Valley: Large Language Models Are on the Wrong Path, Spatial Intelligence Is the Only Way to AGI
Meta's Two Latest Agent Learning Papers Are Quite Interesting!
Inoculation Prompting: Making Large Language Models "Misbehave" During Training to Improve Test-Time Alignment
We Planted a Word in Claude's Mind, and It Began to "Rationalize"! Anthropic's Latest Research: AI Possesses Introspective Abilities!
GPT Models Becoming More Conservative? Stanford Manning Team Proposes Verbalized Sampling to Make Models "Think a Bit More"
Abandoning Manual Annotation! Chinese Team Proposes Self-Evolution Algorithm for Multimodal Large Models
Abandoning Fine-Tuning: Stanford Co-releases Agentic Context Engineering (ACE), Boosting Model Performance by 10% and Reducing Token Costs by 83%
Just Released! Tsinghua and Partners Open Source UltraRAG 2.0! Performance Soars by 12%
Google Enters the CUA Battleground, Launches Gemini 2.5 Computer Use: Allowing AI to Directly Operate the Browser
LLMs in Document Intelligence: Survey, Progress, and Future Trends
Chinese Team Trains "Spiking Large Model," Boosting Inference Speed by 100 Times
NeurIPS'25! AutoPrune: A Plug-and-Play Adaptive Pruning Framework for Large Models
SJTU & Stanford Propose "Long Code Compression Tool": 5.6x Extreme Slimming Without Performance Drop
New Work from Danqi Chen's Group at Princeton: RLHF Insufficient, RLVR Bounded? RLMT Forges a Third Path
First Code World Model Ignites AI Community, Enabling "True Reasoning" for Agents, Meta Open-Sources It
The More You Think, The More You Err: CoT "Deep Deliberation" as a Catalyst for LLM Hallucinations!
Boost LLM Reasoning Accuracy to 99% Without Fine-Tuning! Try DeepConf, a Lightweight Inference Framework | Latest from Meta
Stanford Proposes New RL Paradigm: 3B Model Agent Outperforms Claude, GPT-4
Why Do Large Language Models Hallucinate? OpenAI's Latest Research Uncovers the Reasons
Stanford's Latest Research: Even the Strongest LLMs Struggle with Cutting-Edge Code! Gemini 2.5 Pro's Success Rate Under 40%
Microsoft Introduces rStar2-Agent: "Thinking Smarter" Proves Far More Effective and Efficient Than Simply "Thinking Longer"
[Master's Thoughts] Martin Fowler's AI Musings: We're in an Era Where Even the "Problem" Isn't Clear
Meta Introduces Deep Think with Confidence: Boosting Reasoning Accuracy and Efficiency with Minimal Changes
MCP Tool Stacking is a Trap! Developer Guru: Command Line's 'Brittleness' Crushes AI! Better to Axe It Down to a Single Code Executor: 7 Calls Become 1! Netizens: Should've Abandoned Black Box Tools Long Ago!
LLMs Dominate Math Boards, Yet Forget How to Chat? CMU et al. Reveal Striking Differences Between SFT and RL!