Category: AI Agents
- Codex Lead Reveals OpenAI Internal Development: Reinvented Weekly! Codex Has Evolved into a Teammate, Can Run Overnight and Self-Test! Advice for Newcomers: Fundamentals Never Go Out of Style; Windows Version Coming Soon
- OpenAI Codex Product Lead: Programming as We Know It Is Over
- Qwen3.5: Towards Native Multimodal Agents
- The Bitter Lesson! ROLL Team Shares: Practical Experience in Agentic RL Training
- WebMCP: A Bomb Google Planted in Chrome 146
- Just Released: Claude 4.6 and GPT-5.3-Codex Simultaneously Launched!
- Is Cursor No Longer Cool? A Top 0.01% Expert Defects to Claude, and Their 10,000-Word Defection Notes Go Viral!
- Let AI Level Itself Up: Meta Pushes Coding to Superintelligence with Self-play RL
- LAMER: Meta-Reinforcement Learning Enables Language Agents to Perform Active Exploration
- Can Large Models Handle Precision Work Too?! MIT Top Conference Paper Teaches AI to Operate Industrial CAD Software
- The Significance of Gemini 3: AI Has Surpassed the 'Hallucination Phase', Approaching Humans, 'Human-Machine Collaboration' Will Shift from 'Humans Correcting AI' to 'Humans Guiding AI Work'
- Claude Launches Skills Feature and Agent Skills Development Guide
- Letting CoT "Evolve" with the Environment: AgileThinker Achieves "Thinking While Doing" | Latest from Tsinghua
- Reinforcement Learning + Large Model Memory: Mem-α, Enabling Agents to "Learn How to Remember" for the First Time
- The More You Fail, The Faster You Learn! Trajectory Rewriting Allows AI Agents to Create Perfect Experiences from Mistakes!
- The Two Major Pain Points of Agent Long-Range Search Have Been Solved! CAS DeepMiner Runs Nearly 100 Rounds with 32k Context, Open Source Performance Closes in on Closed Source.
- Abandoning Fine-Tuning: Stanford Co-releases Agentic Context Engineering (ACE), Boosting Model Performance by 10% and Reducing Token Costs by 83%
- Google Enters the CUA Battleground, Launches Gemini 2.5 Computer Use: Allowing AI to Directly Operate the Browser
- Stanford Proposes New RL Paradigm: 3B Model Agent Outperforms Claude, GPT-4
- OpenAI Board Chair: "Per-Token Billing" Is Completely Wrong, Market Will Eventually Choose "Outcome-Based Pricing"
- ARPO: Agentic Reinforced Policy Optimization, Enabling Agents to Explore One Step Further at Critical Moments
- RAG Can Also Reason! Thoroughly Solving the Multi-Source Heterogeneous Knowledge Challenge
- OpenAI Podcast Revisited: The AI Coding War! Developers Are the Most Fortunate: Specialized Code Models Will Emerge! Host Leaks: "I Like Claude the Most!"
- RL Scaling Breakthrough! DeepSWE Open-Source AI Agent Tops Leaderboard, Training Methods and Weights Fully Released
- One of the Greatest AI Interviews of the Century: AI Safety, Agents, OpenAI, and Other Key Topics