Category: AI Safety
- Demis Hassabis's Stunning Confession: The AI I Built Could Extinguish Humanity, But No One Can Stop It Now
- Models Have Gained Introspective Capabilities, But Their Inner Doors Were Locked | Hao's Paper Talk
- Global AI Agents Gone Rogue! Meta's 2-Hour Disaster Pierces the Heart of Silicon Valley as OpenClaw Strikes Back
- Anthropic on the Cover of Time! Internal Revelations: AI Recursive Self-Improvement Could Happen Within a Year
- Shocking! If AI Controls the Nuclear Button, It Will Press It in 95% of Cases
- Geoffrey Hinton: AI Starts 'Playing Dumb', the Problem Has Changed
- Measuring AI agent autonomy in practice
- Anthropic's Heavyweight Study: The Ultimate Risk of AI is Not Awakening, but Random Crashes
- Just Now: Anthropic's 53-Page Confidential Report Exposed: Claude Self-Escape Could Trigger Global Catastrophe!
- Anthropic Discovers AI 'Broken Windows Effect': Teaching It to Cut Corners Leads to Learning Lies and Sabotage
- A Detour on the Road to AGI: Shanghai AILab's Bombshell Finding - Self-Evolving Agents May 'Misevolve'
- Understanding neural networks through sparse circuits
- Google Enters the CUA Battleground, Launches Gemini 2.5 Computer Use: Allowing AI to Directly Operate the Browser
- Anthropic Team Uncovers 'Persona Vectors' to Control Large Language Model Behavior, Cracking the Black Box of Erratic AI
- AI's "Dual Personality" Exposed: OpenAI's Latest Research Finds AI's "Good and Evil Switch," Enabling One-Click Activation of its Dark Side
- One of the Greatest AI Interviews of the Century: AI Safety, Agents, OpenAI, and Other Key Topics
- More Toxic, Yet Safer? Harvard Team's Latest Research: 10% Toxic Training Data Makes Large Models Invulnerable
- AI Acts as Its Own Network Administrator, Achieving a "Safety Aha-Moment" and Reducing Risk by 9.6%
- Sakana AI's New Research: The Birth of the Darwin-Gödel Machine, Which Improves Itself by Rewriting Its Own Code in Self-Referential, Open-Ended Evolution
- Claude 4 Completely Out of Control! Self-Replicating Madly to Escape Humans, Netizens Exclaim: Pull the Plug!
- Multimodal Large Models Collectively Fail, GPT-4o Only 50% Safety Pass Rate: SIUO Reveals Cross-Modal Safety Blind Spots
- 10 Years of Hard Research, Millions Wasted! The AI Black Box Remains Unsolved, and Google Drops the Pretense
- Turing Awardee, "Godfather of AI" Hinton: When Superintelligence Awakens, Humanity May Be Powerless to Control
- AI Self-Replication Risk: AISI Launches RepliBench Benchmark
- Is the AGI Race Headed for Loss of Control? MIT: Even Under the Strongest Oversight, the Probability of Losing Control Still Exceeds 48%, and Total Loss-of-Control Risk Exceeds 90%!