SJTU's 54-Page Review Illuminates the Evolutionary Path of Agent Cognitive Externalization

Swapping to the latest base models often fails to yield a qualitative leap in Agent performance. Conversely, equipping the same model with persistent memory, reusable skill documentation, and standardized tool interfaces delivers immediate, tangible results. Anyone involved in Agent engineering is likely familiar with this sensation: what lies outside the model often matters more than the model itself. But is there a unified framework to explain this phenomenon? A 54-page review paper from a team at Shanghai Jiao Tong University provides the answer: Externalization.

Recently, researchers from Shanghai Jiao Tong University, Sun Yat-sen University, Shanghai Institute for Advanced Study, Carnegie Mellon University, and OPPO submitted a comprehensive review to arXiv on April 9, 2026. For the first time, it systematically organizes the four pillars of LLM Agents—Memory, Skills, Protocols, and Harness Engineering—through the unified lens of "Externalization." The core thesis is clear: The actual progress of Agents increasingly depends on external cognitive infrastructure rather than improvements in the model's inherent capabilities.

  • Paper Title: Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

  • Affiliations: Shanghai Jiao Tong University, Sun Yat-sen University, Shanghai Institute for Advanced Study, Carnegie Mellon University, OPPO

  • Paper Link: https://arxiv.org/abs/2604.08224 (Submitted April 9, 2026)

  • Authors: The lead author is Zhou Chenyu, a PhD student at Shanghai Jiao Tong University. Corresponding authors include Dr. Wang Jun from OPPO Research Institute, and Professors Liu Weiwen, Lin Jianghao, and Zhang Weinan from Shanghai Jiao Tong University.


Figure 1: Externalization as the organizing principle for LLM Agent design. The historical arc of human cognitive externalization is set against the LLM Agent externalization arc (Memory, Skills, Protocols, then Harness), alongside the literature landscape.

Models Are Strong, But Agents Remain Unreliable: Where Is the Contradiction?

Over the past two years, the parameter scale and reasoning capabilities of large models have continued to climb. However, engineers familiar with Agent deployment share a common experience: upgrading to a stronger base model often yields less significant improvements than enhancing external infrastructure. Persistent memory, reusable skills, standardized tool interfaces, sandbox constraints, and execution logs—these elements "outside the model" are increasingly determining whether an Agent is truly usable.

The paper attributes this phenomenon to three structural mismatches:

  • Continuity Mismatch: Context windows are limited and ephemeral; models cannot stably maintain state across sessions. Every session starts anew, requiring previously accumulated context to be rebuilt from scratch.

  • Consistency Mismatch: Complex multi-step processes are often re-derived rather than executed stably. For the same task, execution paths and quality vary depending on when they are invoked.

  • Coordination Mismatch: Interactions with tools, services, and other Agents rely on temporary agreements that are fragile and non-portable. Once an interface changes, the entire call chain may fail simultaneously.

The paper draws on cognitive scientist Don Norman's theory of "Cognitive Artifacts" to explain this. A shopping list does not expand human memory capacity; it transforms a problem of "recall" into one of "recognition." A map does not improve a person's sense of direction; it makes spatial relationships visible rather than implicit. The power of external artifacts lies in Representational Transformation: they reorganize the form of the problem so that the subject can solve it more reliably with existing capabilities.

The same logic is unfolding in LLM Agents. The paper's core argument is that externalization is the unifying logic behind recent architectural evolution in Agents, not merely a pile of engineering tricks.

From Weights to Harness: Three Shifts in the Carrier of Capability

Figure 2: Evolution of community themes across three capability layers (2022–2026). Focus shifts from parameter knowledge and prompt engineering to Harness-level infrastructure.

  • Weights Layer (2022–2023): Capability was nearly synonymous with model parameters, dominated by scaling laws. This laid the foundation, but knowledge was hard to update selectively, behavior was difficult to audit, and personalization was nearly impossible.

  • Context Layer (2023–2024): Prompt engineering, Chain-of-Thought (CoT), and Retrieval-Augmented Generation (RAG) rose to prominence. Models remained frozen while prompt templates iterated rapidly. The difficult problem of "recall" was partially converted to "recognition," but state remained ephemeral, and cross-step coordination remained fragile.

  • Harness Layer (2024–Present): Reliability now depends on external memory, tool registration, protocols, sandboxes, and orchestration. "Agent engineering is increasingly becoming Harness engineering"—a pattern followed by OpenHands, SWE-agent, Deep Research, and others.

All Roads Lead to Externalization: Memory, Skills, Protocols, and Harness

Looking back at recent technical advances in the Agent field, memory systems, skill systems, protocol standardization, and Harness engineering itself appear to be four independent research lines solving different problems. However, the paper points out that they are essentially doing the same thing: migrating specific layers of cognitive burden from inside the model to external structures. This is not a coincidence but an inevitable convergence for reliable Agent deployment. The intersection of these four routes is Externalization.

Memory externalizes state, turning "recall" into "retrieval" to solve continuity mismatches. Skills externalize professional expertise, turning "improvisation" into "composition and reuse" to solve consistency mismatches. Protocols externalize interaction structures, turning "temporary agreements" into "structured contracts" to solve coordination mismatches. Finally, Harness externalizes the Agent's cognitive environment itself: execution flows, sandboxes, observations, and permissions, which were previously implicit in every model call, are now explicitly extracted to become inspectable, configurable, and governable infrastructure.

Memory: Externalized State

Figure 3: The full process of memory as externalized state—from raw context to four layers of memory content, through memory system architecture (monolithic → hierarchical orchestration → adaptive), finally integrating with Harness.

The paper organizes Agent memory into four layers: Working Context (current task state, open files, partially completed plans), Situational Experience (past run records and failure trajectories), Semantic Knowledge (domain facts, user preferences, general heuristics), and Personalized Memory (specific user habits and constraints).

Memory architectures have evolved with demand: from monolithic systems that stuff all history into the prompt, to retrieval-based systems that pair active state with external storage, to hierarchical architectures orchestrated by semantics or time, and finally to adaptive memory systems that adjust their retrieval strategies based on feedback. The core effect is the same throughout: the model no longer has to "recall" from its weights; it "retrieves" from persistent storage.
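The four layers and the recall-to-retrieval shift can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than the paper's implementation: the `Layer`, `MemoryRecord`, and `MemoryStore` names are hypothetical, the store lives in process memory instead of on disk, and naive word overlap stands in for embedding search.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Layer(Enum):
    """The four memory layers named in the review."""
    WORKING_CONTEXT = "working_context"
    SITUATIONAL_EXPERIENCE = "situational_experience"
    SEMANTIC_KNOWLEDGE = "semantic_knowledge"
    PERSONALIZED_MEMORY = "personalized_memory"


@dataclass
class MemoryRecord:
    layer: Layer
    text: str


@dataclass
class MemoryStore:
    """Hypothetical external store; a real system would persist to disk or a database."""
    records: list[MemoryRecord] = field(default_factory=list)

    def write(self, layer: Layer, text: str) -> None:
        self.records.append(MemoryRecord(layer, text))

    def retrieve(self, query: str, layer: Optional[Layer] = None, k: int = 3) -> list[MemoryRecord]:
        """Rank records by word overlap with the query (a stand-in for embedding search)."""
        query_words = set(query.lower().split())
        candidates = [r for r in self.records if layer is None or r.layer == layer]
        scored = [(len(query_words & set(r.text.lower().split())), r) for r in candidates]
        scored = [(score, r) for score, r in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [r for _, r in scored[:k]]


store = MemoryStore()
store.write(Layer.SEMANTIC_KNOWLEDGE, "The project uses pytest for all unit tests")
store.write(Layer.SITUATIONAL_EXPERIENCE, "Last run failed because the lint step was skipped")
store.write(Layer.PERSONALIZED_MEMORY, "The user prefers small pull requests")

# Only the best-matching record is pulled into the prompt, not the whole history.
hits = store.retrieve("which tests does the project run", k=1)
```

The design point is that the prompt receives only the top-ranked records, so what the Agent "remembers" becomes a query against storage rather than something the model must reproduce from its weights.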

Skills: Externalized Professional Expertise

Figure 4: The full process of skills as externalized expertise—from acquisition (manual writing, distillation, discovery, composition) to skill artifacts, via activation pipelines (registration, progressive disclosure, composition), finally binding to runtime.

Skill systems package reusable procedural expertise into explicit artifacts. A complete skill comprises three components: Operational Procedures (task skeletons and decomposition steps), Decision Heuristics (local strategies for branching decisions), and Normative Constraints (compliance, safety, and operational boundaries).

Skills are generated along four paths: Manual Writing (experts hand-crafting instruction files such as SKILL.md), Trajectory Distillation (extracting reusable procedures from historical run records), Autonomous Discovery (Agents exploring an environment and inducing skills from it, e.g., Voyager), and Compositional Construction (assembling high-level capabilities from existing low-level skills). Skills move from "discovery" to "execution" through stages of registration, progressive disclosure (expanding from summary to full detail on demand), and composition, finally binding to specific tools, APIs, and protocols at runtime.

The core effect: the model no longer has to "improvise" a workflow from scratch each time; it "composes" one from pre-validated components.
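A small sketch may help show how the three skill components and progressive disclosure fit together. The `Skill` and `SkillRegistry` classes and the example skill are hypothetical illustrations, not an API from the paper or from any skill framework.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    summary: str                                            # short description, always visible
    procedure: list[str]                                    # operational procedure (ordered steps)
    heuristics: list[str] = field(default_factory=list)     # decision heuristics
    constraints: list[str] = field(default_factory=list)    # normative constraints


class SkillRegistry:
    """Hypothetical registry: skills are registered once and disclosed in two stages."""

    def __init__(self) -> None:
        self._skills: dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def index(self) -> str:
        """Stage 1: one summary line per skill, cheap enough to keep in every prompt."""
        return "\n".join(f"{s.name}: {s.summary}" for s in self._skills.values())

    def expand(self, name: str) -> str:
        """Stage 2: full detail, loaded only once the skill has been selected."""
        s = self._skills[name]
        lines = [f"Skill: {s.name}", "Procedure:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(s.procedure, start=1)]
        lines += ["Heuristics:"] + [f"  - {h}" for h in s.heuristics]
        lines += ["Constraints:"] + [f"  - {c}" for c in s.constraints]
        return "\n".join(lines)


registry = SkillRegistry()
registry.register(Skill(
    name="submit_pr",
    summary="Open a pull request for a finished change",
    procedure=["Run the test suite", "Push the branch", "Open the pull request"],
    heuristics=["If tests fail, fix them before pushing"],
    constraints=["Never push directly to the main branch"],
))
```

Under this arrangement the context window carries only `registry.index()` by default, and `registry.expand("submit_pr")` is paid for only when the skill is actually needed.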

Protocols: Externalized Interaction Structures

Figure 5: Evolution of protocols in Harness engineering—from isolated model calls to standardized protocols to decentralized Agentic Web. Harness manages three interaction types: with tools, perceiving environments, and collaborating with Agents and humans.

Protocols fix interaction structures into machine-readable contracts, externalizing four types of burdens: calling syntax (parameter formats and types), lifecycle semantics (state transitions and completion conditions), permission and trust boundaries (authorization rules), and discovery metadata (declarations of available capabilities).

The paper outlines three major protocol families:

  • Agent-Tool Protocols (e.g., MCP): Standardize tool discovery and invocation via JSON-RPC, enabling dynamic registration and modular expansion of tools.

  • Agent-Agent Protocols (e.g., A2A): Define structured semantics for task delegation, progress exchange, and capability discovery, supporting interoperability in an open Agent ecosystem.

  • Agent-User Protocols (e.g., AG-UI): Make the runtime observable and portable through typed execution events and state flows, allowing user interfaces to track Agent behavior in real time.

The core effect: Temporary agreements become structured contracts, making cross-system coordination governable rather than fragile.
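To illustrate what a "structured contract" buys in the agent-tool case, the sketch below validates arguments against a declared tool description before serializing a JSON-RPC 2.0 request. The tool descriptor format and the `build_tool_call` helper are hypothetical simplifications (real protocols typically describe inputs with JSON Schema), and the `tools/call` method name with a `name`/`arguments` payload follows our reading of MCP's tool-call request; the specification remains the authority.

```python
import json
from itertools import count

# Hypothetical tool declaration: discovery metadata plus calling syntax.
RUN_TESTS_TOOL = {
    "name": "run_tests",
    "description": "Run the project's test suite",
    "params": {"path": str, "verbose": bool},
}

_request_ids = count(1)


def build_tool_call(tool: dict, arguments: dict) -> str:
    """Check arguments against the declared contract, then serialize a JSON-RPC 2.0 request."""
    declared = tool["params"]
    unknown = set(arguments) - set(declared)
    missing = set(declared) - set(arguments)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    for key, expected_type in declared.items():
        if not isinstance(arguments[key], expected_type):
            raise TypeError(f"{key} must be {expected_type.__name__}")
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        # Method and params shape modeled on MCP's tool call (an assumption; see lead-in).
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }
    return json.dumps(request)


wire_message = build_tool_call(RUN_TESTS_TOOL, {"path": "tests/", "verbose": True})
```

A malformed call is rejected before it leaves the Agent, which is the practical difference between a contract and a temporary agreement buried in prompt text.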

Harness: The Unified Cognitive Environment

Figure 6: Overall architecture of an Externalized Agent. Harness sits at the center, surrounded by the three externalization dimensions: Memory, Skills, and Protocols. Operational elements like sandboxing, observability, compression, evaluation, and approval loops coordinate in the middle layer.

Harness externalizes the cognitive environment on which the three dimensions above rely. Execution flows, sandboxes, observations, and permissions, previously implicit in every model call, are explicitly extracted to become inspectable, configurable, and governable infrastructure. The Harness is both the runtime that hosts memory, skills, and protocols and the key to turning the entire system from a "black box" into a "white box." The paper analyzes its composition across six design dimensions:

  1. Agent Loop and Control Flow: The complete cycle of Perception-Retrieval-Planning-Execution-Observation, managing termination conditions, recursion boundaries, and resource consumption.

  2. Sandboxing and Execution Isolation: Filesystem isolation, network restrictions, and cloud sandboxes serve as both security and cognitive boundaries.

  3. Human Supervision and Approval Gating: Pre-execution approval, post-execution review, and escalation triggers, treating autonomy as a configurable parameter.

  4. Observability and Structured Feedback: Structured logs of tool calls, tracing actions back to their antecedents, supporting debugging, auditing, and internal feedback loops.

  5. Configuration, Permissions, and Policy Encoding: Three-level hierarchical constraints (User, Project, Organization) enforced at runtime via declarative rules.

  6. Context Budget Management: Balancing the competition for the context window among memory, skills, and protocols through history summarization, priority-driven content eviction, and staged skill loading.
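Of the six dimensions, context budget management is the easiest to show in a few lines. The sketch below is one possible reading of priority-driven eviction, not the paper's algorithm: the `ContextItem` type and `fit_to_budget` function are hypothetical, token cost is approximated by counting whitespace-separated words, and the selection is a simple greedy pass.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str      # "memory", "skill", or "protocol"
    text: str
    priority: int    # higher means more important to keep

    @property
    def cost(self) -> int:
        # Crude token estimate: whitespace-separated words (an assumption, not a real tokenizer).
        return len(self.text.split())


def fit_to_budget(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Greedily admit items from highest to lowest priority, skipping any that no longer fit.

    The survivors are returned in their original order so the prompt stays coherent.
    """
    kept: set[int] = set()
    remaining = budget
    ranked = sorted(range(len(items)), key=lambda i: items[i].priority, reverse=True)
    for i in ranked:
        if items[i].cost <= remaining:
            kept.add(i)
            remaining -= items[i].cost
    return [item for i, item in enumerate(items) if i in kept]


candidates = [
    ContextItem("memory", "summary of the previous five sessions", priority=1),
    ContextItem("skill", "submit_pr procedure steps", priority=3),
    ContextItem("protocol", "run_tests tool declaration text", priority=2),
]
prompt_items = fit_to_budget(candidates, budget=8)
```

With a budget of eight words, the skill and protocol entries are admitted and the low-priority history summary is evicted; a production Harness would presumably summarize that history rather than drop it outright.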

Memory, skills, and protocols form a self-reinforcing loop within the Harness: memory experience distills into skills, and skill execution trajectories settle back into memory; protocols standardize how skills are invoked and write structured results back to persistent state; richer memory leads to better skills, better skills generate richer execution trajectories, and the cycle continues.

A Scenario: Changing Only the "External Environment" Without Swapping the Model

Consider a software engineering Agent tasked with implementing a new feature, running tests, and submitting a PR in a large code repository. The paper uses this example to directly illustrate the significance of externalization.

  • Without Externalization: The model must cram the repository structure, project conventions, workflow state, and tool interactions into a fragile prompt window. A single error requires restarting the entire process. As task complexity increases, the management cost of prompt templates rises super-linearly.

  • With Externalization: Persistent project memory provides cross-session context; reusable skill documents encode project conventions and workflows; protocolized tool interfaces ensure call formats remain correct; and the Harness handles step ordering, output validation, and failure recovery.
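The Harness's share of this scenario (step ordering, output validation, and failure recovery) can be sketched as a small control loop. The `Step` type, the `run_plan` function, and the simulated flaky test run are hypothetical and chosen only for illustration; a real Harness would invoke tools and a model at each step.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[], str]              # performs the action and returns its output
    is_valid: Callable[[str], bool]     # output validation


def run_plan(steps: list[Step], max_retries: int = 2) -> list[dict]:
    """Run steps in order; retry a step whose output fails validation; halt if retries run out."""
    log: list[dict] = []
    for step in steps:
        for attempt in range(1, max_retries + 2):
            output = step.run()
            ok = step.is_valid(output)
            # Structured log entry: every attempt is recorded for later auditing.
            log.append({"step": step.name, "attempt": attempt, "ok": ok})
            if ok:
                break
        else:
            # Retries exhausted: stop here so later steps never run on a bad state.
            break
    return log


test_runs = {"count": 0}


def flaky_tests() -> str:
    """Simulated test run that fails once and then passes."""
    test_runs["count"] += 1
    return "PASS" if test_runs["count"] >= 2 else "FAIL"


plan = [
    Step("implement_feature", lambda: "diff applied", lambda out: out == "diff applied"),
    Step("run_tests", flaky_tests, lambda out: out == "PASS"),
    Step("open_pr", lambda: "pr opened", lambda out: bool(out)),
]
execution_log = run_plan(plan)
```

In this run a failed test triggers a retry of that step alone, not a restart of the whole task, and the log records each attempt so the recovery is inspectable afterwards.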

The base model can remain completely unchanged; what changes is the representation of the task it faces. This is the core argument of the entire paper: The improvement in Agent reliability comes less from stronger reasoners and more from better-organized cognitive systems. The question for evaluating an Agent system shifts from "How strong is the model?" to "Which burdens have been externalized so the model no longer needs to solve them from scratch every time?"

Future Directions

The paper concludes by pointing out six frontier directions:

  • Expansion of Externalization Boundaries: Planning goals, verification logic, and orchestration strategies themselves are becoming Harness objects, not just content executed by the Harness.

  • From Digital to Embodied: Embodied Agents are undergoing the same externalization pattern. The separation between high-level planners and low-latency execution modules mirrors the same externalization logic in physical systems.

  • Self-Evolving Harness: Using reinforcement learning, program synthesis, or imitation learning to automatically update infrastructure offers broad prospects but simultaneously amplifies governance risks.

  • Safety and Governance: Novel attack surfaces such as memory poisoning, malicious skill injection, and protocol deception warrant specific attention. Mandatory review gating and provenance tracing are essential safeguards for mature systems.

  • Shared Infrastructure and Multi-Agent Ecosystems: When memory, skills, and protocols can be shared across Agents, collective learning and division of labor become possible, though this brings governance challenges like infrastructure drift.

  • Evaluation of Externalization: Existing benchmarks severely lack metrics for infrastructure contributions. New dimensions such as transferability, maintainability, and context efficiency need to be established.

From memory to skills, to protocols, and finally to Harness, the value of this review lies not in listing technical details but in providing a system-level explanatory framework. To summarize in one sentence: Better Agents are not just better reasoners; they are better-organized cognitive systems.
