Category: Deep Learning

RLVR Reinforcement Learning Training Costs Plummet 98%! 12 PEFT Methods Head-to-Head, Results Are Surprising...
Attention Is Not What You Need? Reframing Sequence Modeling with Geometric Aesthetics via Grassmann Manifolds
Wenfeng Liang Co-Authors: DeepSeek Kicks Off the New Year with a New Chapter in Macro Architecture, Cracking Gradient Explosion and the Memory Wall
[In-Depth] Ilya Sutskever's Selected Paper: The Platonic Representation Hypothesis
SJTU PhD's Latest Insights: Clarifying Reinforcement Learning with Just Two Questions
A New Perspective on NAS: Graph Neural Networks Drive a Universal Architecture Space, and Hybrid Convolution-Transformer Performance Leaps!
Is Cancer Truly Close to Being Conquered by AI? Google Announces Two Breakthroughs in Two Days
NTU and Others Propose A-MemGuard: Locking AI Memory, Dropping Poisoning Attack Success Rate by Over 95%
Mamba Architecture Heads to ICLR 2026: Can the Transformer, AI's Core Brain, Keep Its Throne?
Recursive Reasoning Model HRM Reimagined! TRM, a Two-Layer Network with 7M Parameters, Outperforms LLMs!
In-depth Dissection of Large Models: From DeepSeek-V3 to Kimi K2, Understanding Mainstream LLM Architectures
Xiaohongshu Open-Sources First Multimodal Large Model, dots.vlm1, Performance Rivals SOTA!
Google Open-Sources DeepPolisher, Halving Genome Assembly Error Rates; Jeff Dean: "Exciting!"
Qwen Updates Overnight: Runs on an RTX 3090, 3B Activated Parameters Rival GPT-4o
Hierarchical Reasoning Model
Andrew Ng Launches Free LLM Post-Training Course, Covering Three Major Optimization Methods: SFT, DPO, RL
A Recent Survey on Continual Reinforcement Learning Technologies
Alibaba Open-Sources Breakthrough Agent Overnight, Directly Challenges OpenAI with State-of-the-Art Performance!
Did "More is Better" Fail? ModelSwitch Jumps Out of the Sampling Black Hole, Rewriting the LLM Inference Paradigm
Kaiming He's New Work: Adding Regularization to Diffusion Models for Performance Improvement with No Pre-training or Data Augmentation, Simple to Implement
R1-like Training No Longer Focuses Only on Result Correctness! CUHK Launches SophiaVL-R1 Model
10 Lines of Code, 15% Improvement in AIME24/25! Unveiling the Entropy Mechanism in Large Language Model Reinforcement Learning
No Manual Annotation Needed! AI Self-Generates Training Data, Unlocking Reasoning Capabilities via "Deduction-Induction-Abduction"
Deep Learning: Mamba Core Author's New Work Replaces DeepSeek's Attention Mechanism, Designed for Inference
Andrej Karpathy Praises Stanford Team's New Work: Achieving Millisecond-Level Inference with Llama-1B