Latest Articles
- Models Are Too Fond of Cheating! Cursor Reveals the Inside Story of Composer 2's Reinforcement Learning: Models Can Detect 'Fake Environments', and Floating-Point Non-Determinism Is a Fatal Flaw in RL Training (Tags: Artificial Intelligence, Machine Learning, AI Agents, Software Development, Large Language Models, Reinforcement Learning, Startups, Technology, Software Engineering, ...)
- Stop Handwriting Skills! Microsoft's Latest Research: Train Your Skills Like Neural Networks (Tags: Artificial Intelligence, Large Language Models, Microsoft Research, Agent Optimization, Prompt Engineering, ...)
- Enterprise Text-to-SQL: 5 Disruptive Insights from LinkedIn and Top Labs (Tags: Artificial Intelligence, Enterprise Technology, Business Intelligence, Natural Language Processing, Data Management, ...)
- The Common Mechanism Behind Claude Code and Robots: A Deep Dive into UIUC, Meta, and Stanford's Latest Survey (Tags: Artificial Intelligence, Agent Systems, Research Surveys, Machine Learning, Software Engineering, ...)
- Unbelievable… FaceWall Had AI Write a Training Framework, and It Trained the Strongest 1B Model by Itself: MiniCPM5-1B (Tags: Artificial Intelligence, Large Language Models, Deep Learning Frameworks, Edge Computing, AI Training Infrastructure, ...)
- Claude Pass Rate Under 4%: SaaS-Bench Shatters the 'Fully Automated Office' Fantasy of Computer-Use (Tags: Technology, Artificial Intelligence, Benchmark Testing, Software as a Service, ...)
- OpenAI Post-Training Lead: AI Isn't Suddenly Stronger, It Just Crossed a Threshold (Tags: Artificial Intelligence, OpenAI, Machine Learning, Reinforcement Learning, Post-Training, ...)
- Chilling Discovery! AI Safety Evaluator METR Finds Claude Opus 4.6 Cheats on Over 80% of Long Tasks, Actively Breaks Out of Sandboxes to Steal Answers (Tags: AI & Machine Learning, AI Safety, Technology, Research, ...)
- Running ARC and Sudoku with 10M Parameters? Bengio's Team Bets on Multi-Trajectory Reasoning (Tags: Machine Learning, Recursive Reasoning, Computational Neuroscience, Inference-Time Scaling, Generative Models, ...)
- On 520, Meet China's New 'Model King' Qwen3.7-Max! (Tags: AI & Machine Learning, Alibaba Cloud, Qwen, AI Agents, Large Language Models, ...)
- Google I/O 2026: Gemini 3.5, a Full Suite of Agents Debut, Pushing Android Off the Table? (Tags: Artificial Intelligence, Google, Cloud Computing, Agents, Gemini, ...)
- RAG Context Stuck at 512 for Too Long: The 32K Context Era for Embedding Models Begins with Granite R2 (Tags: AI & Machine Learning, Natural Language Processing, Software Engineering, Open Source, Data Retrieval, ...)
- Burning Too Many Tokens Using Claude Code on Large Codebases? This Open-Source Tool Cuts Tool Calls by 92% (Tags: Developer Tools, Artificial Intelligence, Large Language Models, Productivity, Open Source, ...)
- AI Defeats Humans in a Scientific Research Competition for the First Time! Opus 4.7 Sets a World Record with a 2930-Step Sprint (Tags: Artificial Intelligence, AI Research, Scientific Computing, Autonomous Systems, Machine Learning, ...)
- Anthropic's Mythos AI Strikes Again, Uncovers macOS Vulnerabilities Bypassing Apple's Security (Tags: Cybersecurity, Artificial Intelligence, macOS, Vulnerabilities, Apple, ...)
- 35B-Parameter Science Model Rivals Trillion-Parameter Giants: 'Intern' Science Model Intern-S2-Preview Open-Sourced (Tags: Artificial Intelligence, Large Language Models, Model Optimization, Open Source, Scientific Computing, ...)
- Gemini 3.5 Pro Leaked: Coding Performance Rivals GPT-5.5! Google Finally Gets Serious (Tags: Artificial Intelligence, Technology News, Coding, Google, ...)
- jina-embeddings-v5-omni Released! A Lightweight Omni-Modal Vector Model (Tags: Technology, Artificial Intelligence, Open Source, Multimodal AI, Machine Learning, ...)
- Kaiming He's Team Debuts First Language Model! 105M Parameters, 45B Training Tokens, Continuous Diffusion Route Outperforms Mainstream Discrete DLMs (Tags: Technology, Artificial Intelligence, Research, Diffusion Models, Language Models, ...)
- Tian Yuandong's New Role: Joining Forces with AI Luminaries on a $650 Million Bet on "Self-Evolving AI" (Tags: Artificial Intelligence, Startups, Computer Science, Machine Learning, Tech Funding, ...)