New Intelligence Report
Editors: Hao Kun, KingZH
【New Intelligence Lead】 A three-line bash script dashed off by an Aussie sheep farmer was adopted by OpenAI, Anthropic, and Nous Research's Hermes Agent within 11 days.
You wake up, and Claude Code has shipped another update!
To enable Claude to work continuously until a task is complete, Claude Code has recently launched a new feature: /goal.
Set a completion condition, and Claude keeps working until it is met!
Anyone who has used AI coding tools knows how crucial this is!
You give an agent a task. It runs for three rounds, modifies two files, and then suddenly stops to ask, "What should I do next?"
Wait, you haven't finished fixing that bug yet!
Agents are getting smarter and writing code faster, but as of early 2026, no one had cracked the ability to "finish a task from start to finish."
Then, an Aussie sheep farmer, Geoffrey Huntley, solved it with three lines of bash.
while :; do
    cat PROMPT.md | claude-code --continue
done
He named it the Ralph Loop, a tribute to Ralph Wiggum from *The Simpsons*, the kid who never quite understands the situation but never gives up.
The logic is brutally simple: an infinite loop feeds the same prompt to the agent over and over. Progress is written to the file system and Git history, so when the context window fills up, the loop starts a fresh instance that reads those files and picks up where the last one left off.
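For experimentation, the same idea can be written in a bounded form. This is a minimal sketch, not Huntley's script: the `ralph_loop` function, the marker file that signals completion, and the round cap are all assumptions added for illustration. The agent command is passed in as arguments, so no particular CLI or flag is assumed.

```bash
# Bounded variant of the loop: re-feed the same prompt until a marker file
# appears or a round cap is reached. Marker file and cap are assumptions.
ralph_loop() {
    prompt_file=$1   # file whose contents are fed to the agent every round
    done_marker=$2   # hypothetical: the agent creates this file when finished
    max_rounds=$3    # safety cap so the loop cannot run forever
    shift 3          # remaining arguments: the agent command to run

    round=0
    while [ ! -e "$done_marker" ] && [ "$round" -lt "$max_rounds" ]; do
        round=$((round + 1))
        # Each round is a fresh process; all state lives on disk, not in memory.
        "$@" < "$prompt_file" || echo "round $round: agent exited non-zero" >&2
    done
    [ -e "$done_marker" ]   # exit status 0 only if the marker exists
}
```

A call might look like `ralph_loop PROMPT.md DONE 50 my-agent`, where `my-agent` is a placeholder for whatever command reads a prompt on standard input. The original has neither a cap nor a marker; it simply runs until someone stops it.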
Primitive, inelegant, but incredibly effective.
So effective that OpenAI saw it, Nous Research saw it, and Anthropic saw it.
In 11 days, three top-tier AI labs independently wrote those three lines of bash into their official products.
At that moment, everyone understood one thing—
The final push for general AI might not come from a smarter model, but from a model that "gets things done."
In other words, the core battlefield of AI coding is shifting from "generating code" to "closed-loop delivery."
11 Days, Three Paths, Same Destination
On April 30th, OpenAI's Codex was the first to launch the /goal feature.
Greg Brockman simply posted on X, "Codex now has Ralph loop++ built-in."
A week later, Hermes Agent followed suit. Four days after that, Claude Code joined in.
11 days. Three companies. The same command. The same functionality.
But their implementation paths couldn't be more different.
Codex "never forgets," Hermes "never leaves a project unfinished," and Claude Code "never deceives itself."
Codex: Store the Goal as a Database Record
OpenAI was the first to act among the three, and its solution was the most straightforward.
In Codex, /goal is a persistent workflow object, stored in the local app-server's state layer.
Close the terminal, shut your laptop, even restart the system—the goal won't be lost. Next time you open Codex, it automatically resumes.
The model reports progress status through a structured update_goal tool. When the token budget is exhausted, it triggers a "soft landing" instead of a hard stop.
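The two ideas in this design, a goal that outlives the process and a budget that ends in a soft landing, can be sketched with nothing more than a state file. Everything here is hypothetical: the key=value layout, the status names `active` and `budget_limited`, and the helper functions are invented for illustration. Codex's actual state layer and the schema of `update_goal` are not described in this article.

```bash
# Hypothetical goal store: a key=value file that survives process restarts.
GOAL_FILE=${GOAL_FILE:-./goal.state}

goal_set() {    # goal_set <objective> <token_budget>
    printf 'objective=%s\nstatus=active\ntokens_left=%s\n' "$1" "$2" > "$GOAL_FILE"
}

goal_get() {    # goal_get <key>  -> prints the stored value
    sed -n "s/^$1=//p" "$GOAL_FILE"
}

goal_spend() {  # goal_spend <tokens>: charge the budget, soft-land at zero
    objective=$(goal_get objective)
    goal_status=$(goal_get status)
    left=$(( $(goal_get tokens_left) - $1 ))
    if [ "$left" -le 0 ]; then
        left=0
        goal_status=budget_limited   # soft landing: keep the goal, record why work paused
    fi
    printf 'objective=%s\nstatus=%s\ntokens_left=%s\n' \
        "$objective" "$goal_status" "$left" > "$GOAL_FILE"
}
```

Because the record sits on disk, a new process can call `goal_get objective` and carry on. When the budget runs out, the goal is kept and marked rather than thrown away.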
One user reported running the feature for 14 hours straight, pausing for 5 hours to sleep, and returning to find that Codex picked up where it had left off and completed a device driver project.
Well engineered, clean, but restrained.
Hermes Agent: If One Can't Finish It, Assemble a Team
Hermes Agent had the biggest ambition.
Here, /goal is just the tip of the iceberg. The real highlight is the multi-agent kanban system. Hermes upgraded "making AI finish the job" from a single-agent problem to a team collaboration problem.
The kanban board's foundation is a local SQLite database, with persistent storage that survives restarts.
You create a task card on it, and Hermes directly breaks it down into multiple sub-tasks, assigning them to different agent workers. Each worker is an independent OS process with its own identity, model configuration, and working directory.
The kanban board and /goal are two complementary systems. /goal keeps a single agent locked onto its goal (the Ralph loop), while the kanban board handles task scheduling across multiple agents. One goes deep, the other goes wide.
Finally, there is a five-layer defense mechanism against unfinished work.
Layer 1: Heartbeat detection. Each worker periodically reports to the kanban board, proving it is still alive.
Layer 2: Zombie reclamation. Worker didn't respond in time? The system automatically declares it dead, reclaims its tasks, and reassigns them. There is even zombie detection logic specific to Darwin, the kernel underneath macOS.
Layer 3: Exit interception. Worker exited without completing its task? The system automatically marks it as blocked, preventing it from taking new jobs, stopping "slacking agents" from repeatedly taking tasks without doing them.
Layer 4: Hallucination interception. This is the most ruthless layer. If the AI says "I'm done," that alone doesn't count. The system verifies that the output the agent claims to have produced was actually written to disk. Agent claims it created a file but didn't? Caught, rolled back, and restarted.
Layer 5: Retry budget. Each task has its own max_retries setting. It retries up to N times, and once that budget is spent, it escalates to a human. It never loops indefinitely until something crashes.
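Several of these layers reduce to small, testable checks. The sketch below covers the heartbeat timeout (layers 1 and 2), the on-disk verification (layer 4), and the retry budget (layer 5). It is illustrative only: the function names, arguments, and the `requeue` and `escalate` outcomes are hypothetical and make no claim about how Hermes Agent implements them.

```bash
# Layers 1-2: a worker is presumed dead once its last heartbeat is too old.
worker_is_zombie() {   # worker_is_zombie <last_heartbeat_epoch> <now_epoch> <timeout_s>
    [ $(( $2 - $1 )) -gt "$3" ]
}

# Layer 4: do not trust "I'm done"; every claimed file must exist and be non-empty.
claims_verified() {    # claims_verified <file>...
    for f in "$@"; do
        [ -s "$f" ] || return 1
    done
    return 0
}

# Layer 5: decide what happens to a failed or reclaimed task.
next_action() {        # next_action <retries_so_far> <max_retries>
    if [ "$1" -lt "$2" ]; then
        echo requeue    # budget left: hand the task to another worker
    else
        echo escalate   # budget spent: stop retrying and ask a human
    fi
}
```

A scheduler tick under these assumptions would call `worker_is_zombie` for each running task, pass any reclaimed task through `next_action`, and accept a "done" report only if `claims_verified` succeeds on the files the worker says it wrote.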
Claude Code: The Doer and the Reviewer Cannot Be the Same
Anthropic was the last of the three to act, but its solution was the most ingenious.
Essentially, Claude Code's /goal is a session-level Stop Hook.
You set a completion condition (e.g., "all tests in the test/auth directory pass and lint reports no errors"), and Claude starts working.
The key design is in the verification step. After each round of work, the system does not let Claude itself judge "Am I done?"
It sends the conversation log and your completion condition together to an independent small model (defaulting to Haiku), letting this small model act as the referee.
If the small model deems the task unfinished, it must return a specific reason (e.g., "test_login.py still has 2 failures"). This reason is then injected into Claude's context for the next round, guiding it to continue.
If the small model considers the task complete, the goal is automatically cleared, and the task ends.
Notably, this referee model does not call any tools, read files, or run commands. It only looks at the content Claude produced in the conversation.
Therefore, your completion condition must be something Claude can demonstrate within the dialogue.
The condition can be up to 4,000 characters long, so you can write it in great detail.
You can even add constraints to it, such as "do not modify other test files" or "stop if not completed within 20 rounds."
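The split between doer and referee can be sketched as a loop in which one function does the work and a separate function, seeing only the condition and the transcript, decides whether to stop. Both functions are stand-ins supplied by the caller. `run_goal`, the convention of printing `DONE` or a reason, and the round cap are assumptions for illustration, not Claude Code's hook interface.

```bash
# run_goal <condition> <max_rounds> <worker_fn> <judge_fn>
#   worker_fn <feedback>               -> prints this round's transcript
#   judge_fn  <condition> <transcript> -> prints "DONE" or the reason it is not done
run_goal() {
    condition=$1 max_rounds=$2 worker=$3 judge=$4
    feedback=""   # the referee's last reason, injected into the next round
    round=0
    while [ "$round" -lt "$max_rounds" ]; do
        round=$((round + 1))
        transcript=$("$worker" "$feedback")
        # The referee receives text only: no tools, no file access.
        verdict=$("$judge" "$condition" "$transcript")
        if [ "$verdict" = "DONE" ]; then
            echo "goal met after $round round(s)"
            return 0
        fi
        feedback=$verdict
    done
    echo "stopped after $max_rounds rounds: $feedback"
    return 1
}
```

Because the judge sees only the transcript, a worker that never prints its test results can never be declared finished in this sketch. That is why the completion condition has to be something the conversation itself can show.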
The Finals Are Underway: The Workflow Entry Point
Let's zoom out for a moment.
Claude Code is backed by Anthropic. Codex is backed by OpenAI. Hermes Agent interfaces with both companies' models and is also a primary distribution channel for models like DeepSeek V4.
These three paths perfectly cover the three ecosystem entry points in the ASI finals.
And they are all competing for the same thing—workflow.
Whichever agent first gets developers into the habit of "set a goal and walk away" will lock in the workflow entry point.
Because once a habit is formed, the cost of switching climbs steeply.
You won't easily leave an agent infrastructure where kanban scheduling, resuming after an interruption, and checkpoint rollback are already running smoothly.
A seemingly small /goal command is actually digging a moat around the entire agent workflow.
References:
https://code.claude.com/docs/en/goal
https://github.com/NousResearch/hermes-agent/releases/tag/v2026.5.7
https://github.com/anthropics/claude-code/releases/tag/v2.1.139