Category: Vision-Language Models
- "Removing One Layer" Makes the Model Better at Tasks? Yang Shuo's Team at HIT(SZ) Discovers Task-Interfering Layers in VLMs
- Abandoning Manual Annotation! Chinese Team Proposes Self-Evolution Algorithm for Multimodal Large Models
- Interpretation of Seed1.5-VL Technical Report
- Multimodal Large Models Collectively Fail, GPT-4o Has Only a 50% Safety Pass Rate: SIUO Reveals Cross-Modal Safety Blind Spots