Category: Neural Networks
- Rotate Attention by 90 Degrees! Today, Kimi's 'Attention Residuals' Takes Off
- Less is More: Recursive Reasoning with Tiny Networks
- Understanding neural networks through sparse circuits
- ByteDance Seed's New Work DeltaFormer: An Attempt at Next-Generation Model Architecture
- Global Attention + Positional Attention Sets a New SOTA! Nearly 100% Accuracy!
- 10 Years of Hard Research, Millions Wasted! The AI Black Box Remains Unsolved, and Google Openly Breaks Ranks
- Continuous Thought Machines Are Here! Launched by a Startup from One of the Eight Transformer Authors, Letting AI Stop Making 'One-Step' Snap Decisions