Latest Articles
- DeepSeek R2's Secret Weapon Revealed! The Technology Just Awarded a Top Prize to Liang Wen-feng Allows AI to Read Long Texts 11 Times Faster (Tags: AI Technology, Large Language Models, DeepSeek, Natural Language Processing, Sparse Attention...)
- AI Safety and Contemplation: Computational Models for Aligning Mind with AGI (Tags: Contemplative AI, AI Alignment, Active Inference, Mindfulness, Buddhism...)
- Qwen Updates Overnight: Runs on RTX 3090, 3B Activated Parameters Rival GPT-4o (Tags: Large Language Models, Qwen, GPU Computing, Deep Learning, Open-source AI...)
- Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces (Tags: Artificial Intelligence, Dual Process Theory, Large Language Models, Planning and Reasoning, Transformer Models...)
- Reshaping the Browser! Microsoft Adds AI Agent to Edge, Automating Search, Prediction, and Integration (Tags: Browser, Artificial Intelligence, Automation, Copilot, Microsoft Edge...)
- Do Multimodal Large Language Models Truly 'Understand' the World? Unveiling Core Knowledge Deficits in MLLMs (Tags: Multimodal AI, Core Knowledge, Large Language Models, Cognitive Science, Machine Learning...)
- Hierarchical Reasoning Model (Tags: Artificial Intelligence, Deep Learning, Large Language Models, Algorithmic Reasoning, Neuroscience-Inspired AI...)
- Why Can't Language Models Directly Output Answers with Confidence? (Tags: Language Models, Reinforcement Learning, Artificial Intelligence, Uncertainty Quantification, Model Calibration...)
- DeepSeek-GRPO Importance Weight Design Flaw? Explaining Qwen3's New Reinforcement Learning Algorithm GSPO (Tags: Reinforcement Learning, Large Language Models, Algorithm Optimization, MoE Models, Qwen3...)
- New Book Recommendation: "Reshuffle: Who Wins When AI Restacks the Knowledge Economy" (Tags: Artificial Intelligence, Knowledge Economy, Book Review, Future of Work, Business Strategy...)
- Must-Read: In-depth Comparison of Mainstream LLM Architectures, Covering Llama, Qwen, DeepSeek, and Six Other Models (Tags: LLM Architectures, Mixture of Experts, Deep Learning Architectures, Large Language Models, Model Comparison, Normalization Layers, Attention Mechanisms...)
- Kimi K2's Key Training Technique: QK-Clip! (Tags: Large Language Models, Attention Mechanisms, QK-Clip, Deep Learning Optimizers, Training Stability, Model Optimization...)
- Crushing DeepSeek V3! Alibaba Open-Sources New Qwen-3, Dominating Benchmarks with a Clear Lead (Tags: Large Language Models, Open Source AI, AI Benchmarks, LLM Performance, Alibaba...)
- New Book Recommendation: "God, AI and the End of History: Understanding the Book of Revelation in an Age of Intelligent Machines" (Tags: Theology, Artificial Intelligence, Book Review, Christian Apologetics, Eschatology...)
- New Book Recommendation: "The AI-Centered Enterprise: Reshaping Organizations with Context Aware AI" (Tags: Enterprise AI, Generative AI, Business Strategy, Context-Aware AI, Organizational Transformation...)
- New Book Recommendation: "Navigating Data Science: Unleashing the Creative Potential of Artificial Intelligence" | Exploring the Fusion of Data Science and AI (Tags: Data Science, Artificial Intelligence, Technology Trends, Book Review, Generative AI...)
- Large Models Reveal New Weakness! Old Memories Unforgettable, New Memories Indistinguishable, Accuracy Plummets | ICML'25 (Tags: Large Language Models, Memory Limitations, Cognitive Science, Prompt Engineering, Proactive Interference...)
- Transformer Killer! Google DeepMind's New MoR Architecture Emerges, A New Generation's King Has Arrived (Tags: AI Model Architectures, Large Language Models, Memory Efficiency, Inference Optimization, Recurrent Neural Networks...)
- Meta Team's Breakthrough: Large Model "Hallucinations" Plummet to 5%! Is a Single Sentence Question the Key? (Tags: AI, Large Language Models, Meta AI Research, AI Reliability, LLM Hallucination...)
- The Darkest Job Season Ever: An Oxford Master's Graduate Unemployed for Six Months and Millions in Debt, All Because AI Took My Job (Tags: Graduate Employment, AI Impact, Student Debt, UK, Job Market...)