Category: Large Language Models
- Can AI "Admit Its Own Mistakes"? Solving the "Rashomon" of Multi-Agent Collaboration, Earning ICML 2025 Spotlight
- Stanford Chinese Team's Surprise Upset! AI Writes Pure CUDA-C Kernels, Outperforming PyTorch?
- Large Models Struggling with Sudoku?! Transformer Author's Startup Releases Ranking: o3-mini-high's "Variant Sudoku" Accuracy Only 2.9%
- Andrej Karpathy Praises Stanford Team's New Work: Achieving Millisecond-Level Inference with Llama-1B
- Tsinghua University's New RAG Framework: DO-RAG Accuracy Soars by 33%!
- LLM + RL Questioned: Deliberately Using Incorrect Rewards Still Significantly Boosts Math Benchmarks, Causing a Stir in the AI Community
- Qwen Team Releases Long-Context Reasoning Model QwenLong-L1, Surpassing o3-mini
- All-In Podcast Transcript: Gemini Leads "Infinite Context," AI Ascends from Tool to Cognitive Collaborator
- Llama Paper Authors "Flee," Only 3 Remaining from 14-Person Team, French Unicorn Mistral Becomes the Biggest Winner
- Prolonged Reasoning ≠ High Accuracy! Adaptive Switching Between "Quick Response" and "In-Depth Thought": A Win-Win for Saving Tokens and Improving Accuracy
- ICML 2025 | Bursting the AI Bubble with "Human Testing Methods": Building a New Paradigm for Capability-Oriented Adaptive Assessment
- Alibaba Open-Sources New Qwen Model: A Dragon Boat Festival Gift!
- ICML 2025 | Fast and Powerful Liger! Transformer Instantly Switches to Linear RNN with Only 20M Tokens of Fine-Tuning
- Can LLMs Understand Math? Latest Research Reveals Fatal Flaws in Large Models' Mathematical Reasoning
- How She Brought "System 2" to Large Language Models | An Interview with Dr. Li Zhang from Microsoft Research Asia
- 312 Trajectories Boost Performance by 241%! SJTU and SII Open-Source Computer Agent Surpasses Claude 3.7
- Historic First! o3 Finds Linux Kernel Zero-Day Vulnerability, Uncovered After 100 Scans of 12,000 Lines of Code, No Tools Required
- Statistically Controllable Data Synthesis! New Framework Breaks LLM Data Generation Limitations, McGill University Team Launches LLMSynthor
- Deep Dive | Interview with Character.AI CEO: The Best Applications Haven't Been Invented Yet, AI Field is Like Alchemy, Nobody Knows Exactly What Will Work
- The Smarter AI Gets, The Less Obedient It Becomes! New Study: Strongest Reasoning Models Only Follow Instructions 50% of the Time
- Breaking the Chain-of-Thought Reasoning Bottleneck! "Soft Thinking" Enables LLMs to Learn Human Abstract Abilities, with Reduced Token Usage
- How Does Claude 4 Think? Senior Researchers Respond: RLHF Paradigm is Out, RLVR Proven in Programming/Mathematics
- Interpretation of Seed1.5-VL Technical Report
- Train a Tiny LLM from Scratch for Just ¥8 in 9 Hours! Full Tutorial Including Reasoning, MoE, and More
- Gemini Diffusion: 1500 tokens/sec, Lightning Fast!