Category: Sparse Attention

TIP × AsyncTLS: Distillation Training Cuts Tokens by Half, Sparse Attention Inference Surges 4.7x
Long Context Reduced by 60% + 95% Sparsity: A Double Breakthrough Today Sets New Records in Inference Efficiency
DeepSeek R2's Secret Weapon Revealed! The Technology That Just Won Liang Wen-feng a Top Prize Lets AI Read Long Texts 11 Times Faster
Ushering in the Era of On-Device Long Text! OpenBMB's New Architecture Makes MiniCPM up to 220x Faster