Category: Large Language Models
- Models Are Too Fond of Cheating! Cursor Reveals the Inside Story of Composer 2's Reinforcement Learning: Models Can Detect 'Fake Environments', and Floating-Point Non-Determinism Is a Fatal Flaw in RL Training
- Stop Handwriting Skills! Microsoft’s Latest Research: Train Your Skills Like Neural Networks
- Unbelievable… FaceWall Had AI Write a Training Framework, and It Trained the Strongest 1B Model by Itself: MiniCPM5-1B
- On May 20 (520), Meet China's New 'Model King' Qwen3.7-Max!
- Burning Too Many Tokens Using Claude Code on Large Codebases? This Open-Source Tool Cuts Tool Calls by 92%
- 35B-Parameter Science Model Rivals Trillion-Parameter Giants: 'Intern' Science Model Intern-S2-Preview Open-Sourced
- WWW'26 | A New Paradigm for Cross-Task Adaptive Multi-Agent Collaboration
- Math Majors in Crisis! Fields Medalist Tests ChatGPT 5.5 Pro, Gets a Thesis-Level Result in 17 Minutes
- Zero Index, Zero Embedding, Pure Grep: DCI Does Deep Research Directly on Raw Corpora
- Attention Is All You Need Author Returns: Can a 99% Sparse Transformer Be Even Faster?
- The Ghost of Markov: From Predicting the Next Word to Predicting the Next Action
- Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies in LLM Reinforcement Learning
- Compression Is All You Need — A Letter on Mathematics and AI from Fields Medalist Michael Freedman
- Token-Level Precise Length Control: 3B Model Beats GPT 5.4 and Claude
- Perhaps the Most Impressive AI Paper of Recent Years: After Giving AI Reasoning Real-Time Subtitles, Its Inner Thoughts Are Shocking!
- Hardcore: Google's Jeff Dean Says the Bottleneck for Million-Chip LLM Pre-training Has Been Completely Broken!
- Static Benchmarks 'Outdated'? OpenKG Continues to Update Its Knowledge-Enhanced Dynamic LLM Evaluation Leaderboard, Dynamic OneEval-202605
- Leaderboard-Hacking AIs Wiped Out! Meta-Stanford's Hellish Test Leaves GPT/Claude/Gemini Scoring Zero
- Karpathy and Claude Code Creator Boris Drop Latest Interview That's Shaking the Programmer World
- Abstract-CoT: Reasoning Tokens Slashed 11.6x, Chain-of-Thought Without Words Shatters LLM Efficiency Ceiling
- Your Agent Isn't Really Learning—It's Just Flipping Through a Notebook
- The Era of Software 3.0 Has Arrived
- Qwen-Scope: Seeing Through the 'Hidden Thoughts' of Large Models
- The Father of GPT Throws AI Back to 1930: Never Saw a Line of Code, Yet 'Invented' Python!
- Scaling Pain: Lessons from Serving Ultra-Large-Scale Coding Agents