OpenAI Open-Sourced the Codex Orchestration Spec: Symphony

Published on April 27, 2026

Original link: https://openai.com/index/open-source-codex-orchestration-symphony/

Repository link: https://github.com/openai/symphony

Six months ago, while building an internal productivity tool, the OpenAI team made a then-controversial decision — no human-written code would be allowed in the repository. Every single line had to be generated by Codex.

To achieve this, they re-engineered their workflow from the ground up: they built an agent-friendly repository, invested heavily in automated testing and guardrails, and treated Codex as a full-fledged team member. That journey was documented in a previous blog post, "Harness Engineering."

The approach proved viable, but then they hit a new bottleneck — context switching.

To solve this, they built a system called Symphony. Symphony is an agent orchestrator that turns a project management board like Linear into the control center for coding agents: every pending task gets its own agent, the agent runs continuously, and humans just step in to review the results.

This article covers how Symphony was built — leading to a 500% increase in merged PRs for some teams — and how you can use it to turn your own issue tracker into a tireless agent orchestrator.

The Ceiling of Interactive Coding Agents

Coding agents are getting remarkably good, but whether you interact via the web or the command line, they remain fundamentally interactive tools.

As agent usage scaled within OpenAI, a new kind of burden emerged: every engineer was juggling several Codex sessions simultaneously — assigning tasks, reviewing outputs, nudging direction, and repeating the loop. In practice, most people could handle three to five sessions before their efficiency began to drop. They'd forget which session was doing what, hop between terminals to steer agents back on track, and debug long-running tasks that stalled midway.

Agents were fast, but human attention was the bottleneck. It was like assembling a team of highly capable junior engineers and then assigning a human engineer to micro-manage each one. That model wasn't going to scale.

A Shift in Perspective

The team realized they were optimizing the wrong thing. Their previous workflow was organized around "sessions" and "merging PRs," but those are just means, not ends. The fundamental unit of software work is the deliverable — issues, tasks, tickets, milestones.

So they asked themselves: What if, instead of supervising agents directly, we let them pull work from the task tracker on their own?

This idea became Symphony: a written specification (a "spec") for a supervisor that orchestrates the agents' work.

Turning the Issue Tracker into an Agent Orchestrator

Symphony's core philosophy is dead simple: Every pending task should have an agent working to complete it. Instead of monitoring Codex sessions across multiple tabs, the issue tracker becomes the control center.

In this workflow, every Linear issue maps to an isolated agent workspace. Symphony continuously monitors the task board, ensuring that every active task has a running agent until completion. If an agent crashes or gets stuck, Symphony restarts it. If a new task appears, Symphony picks it up and assigns it.

The entire workflow is driven by ticket state, treating Linear like a state machine.
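As a rough illustration of that loop, the sketch below shows a single polling pass that makes the set of running agents match the board. It is not taken from the Symphony spec or its reference implementation: the state names, the `Issue` and `Orchestrator` classes, and the `start_agent` stand-in are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical ticket states; a real board defines its own.
ACTIVE_STATES = {"Todo", "In Progress"}


@dataclass
class Issue:
    id: str
    state: str


@dataclass
class Orchestrator:
    # Maps issue id -> a handle for the agent working on it (here just a label).
    running: dict = field(default_factory=dict)

    def start_agent(self, issue: Issue) -> str:
        # Stand-in for creating an isolated workspace and launching an agent.
        return f"agent-for-{issue.id}"

    def reconcile(self, issues: list, crashed: set) -> None:
        """One polling pass: make the running agents match the active issues."""
        active = {i.id: i for i in issues if i.state in ACTIVE_STATES}

        # Stop tracking agents whose issue is no longer active.
        for issue_id in list(self.running):
            if issue_id not in active:
                del self.running[issue_id]

        # Start agents for new issues and restart crashed ones.
        for issue_id, issue in active.items():
            if issue_id not in self.running or issue_id in crashed:
                self.running[issue_id] = self.start_agent(issue)


orch = Orchestrator()
board = [Issue("ENG-1", "Todo"), Issue("ENG-2", "Done")]
orch.reconcile(board, crashed=set())
print(sorted(orch.running))  # ['ENG-1']
```

Because the pass is driven only by the current ticket states, running it repeatedly is safe: an issue that already has a healthy agent is left alone.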

[Figure: Coding agents working with the team, using Linear as a state machine]

In practice, Symphony decouples "work" from "sessions" and "PRs." Some issues might generate multiple PRs across different repositories; others are purely research and analysis, never touching code at all.

Once work is abstracted this way, a single ticket can represent a much larger unit of work.

The team routinely uses Symphony to orchestrate complex feature development and infrastructure migrations. For example, you can simply file a task for an agent to analyze the codebase, Slack history, or Notion docs and produce an implementation plan. Once the plan is approved, the agent then generates a task tree, breaking the work into multiple phases with defined dependencies.

The agent only processes non-blocked tasks, so execution is naturally as parallel as the dependencies allow. If a React upgrade is tagged as dependent on a Vite migration, the agent waits for the Vite migration to finish before starting the React upgrade, exactly as you'd expect.
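The "only non-blocked tasks" rule can be sketched in a few lines. The task names and the `blocked_by` mapping below are made up for illustration; how a real tracker records dependencies will differ.

```python
def ready_tasks(blocked_by: dict, done: set) -> list:
    """Return tasks that are not done and whose blockers are all done."""
    return sorted(
        task
        for task, blockers in blocked_by.items()
        if task not in done and all(b in done for b in blockers)
    )


# Hypothetical task tree: the React upgrade depends on the Vite migration.
blocked_by = {
    "vite-migration": [],
    "react-upgrade": ["vite-migration"],
    "update-docs": [],
}

print(ready_tasks(blocked_by, done=set()))
# ['update-docs', 'vite-migration']
print(ready_tasks(blocked_by, done={"vite-migration"}))
# ['react-upgrade', 'update-docs']
```

Everything returned by one call can be worked on at the same time, which is where the parallelism comes from.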

Agents also create their own tasks. During implementation or review, they frequently discover improvements beyond the current task scope — performance issues, refactoring opportunities, better architectural approaches. When this happens, they proactively create a new issue for human evaluation and prioritization. Often, these follow-up tasks are then picked up and completed by agents as well.

This way of working dramatically lowers the psychological cost of starting ambiguous tasks. The agent built something completely off-track? No problem — that's valuable information in itself, and it cost almost nothing. Just file a ticket for an agent to explore a prototype, and feel free to discard it if it doesn't pan out.

Because the orchestrator runs on a devbox and never goes offline, you can submit tasks any time, knowing an agent will definitely handle them. One engineer on the team managed to land three significant changes using only the Linear App on his phone from a remote cabin with patchy cell service.

A New Way of Working Unlocks Exploration

The most visible change after adopting Symphony was in the volume of output. In some internal teams at OpenAI, the number of merged PRs increased by 500% in the first three weeks. Externally, Linear founder Karri Saarinen also noticed a distinct spike in workspace creation around Symphony's release. But the deeper change was in how the team perceived work.

When engineers no longer had to monitor Codex sessions, the "psychological cost" of every code change was fundamentally transformed. They no longer needed to manually drive the implementation, so the perceived cost of each change plummeted.

This directly influenced behavior: throwing an exploratory task into Symphony became trivial. Test an idea, run a refactor, validate a hypothesis — just keep the results that look valuable.

At the same time, the barrier to initiating work dropped. Product managers and designers can now directly submit feature requests to Symphony. They don't need to know how to code or manage a Codex session. They describe a feature and receive a complete review package in return, including a video demo of the feature running in the actual product.

In large monorepos — like the one OpenAI uses internally — Symphony has another strong suit: it watches CI, automatically rebases when necessary, resolves conflicts, retries flaky checks, and shepherds changes all the way to the main branch without human intervention. When a ticket enters the Merging state, you can generally rely on it landing smoothly.

[Figure: Before and after Symphony comparison chart]

After introducing Symphony, more work is delegated to agents, freeing human effort to focus on the harder, more exploratory tasks.

A New Way of Working Brings New Problems

Operating at this level naturally comes with its own costs. Shifting from directly interacting with an agent to delegating at the ticket level means losing the opportunity to intervene midway and course-correct. Sometimes an agent produces something completely off the mark, but those failures are valuable in themselves — they reveal gaps in the system, pushing the entire mechanism to become more robust.

Faced with this, the team opted not to manually patch results. Instead, they added guardrails and skills so the agent could get it right the next time. Over time, this drove them to continuously add new capabilities to the harness: running end-to-end tests, operating the application via Chrome DevTools, managing QA smoke tests. They also significantly improved documentation, articulating more clearly "what good output looks like."

Naturally, not all tasks suit the Symphony approach. Some problems still require an engineer to work directly in an interactive Codex session — especially fuzzy problems or work demanding strong judgment and deep domain expertise. In reality, this category of work is often what engineers find most interesting and rewarding anyway.

What Symphony covers is the broad swath of routine implementation work, allowing engineers to focus on chewing through one hard problem at a time, rather than thrashing between a dozen small tasks.

The team also discovered that treating agents as dumb nodes in a state machine is a dead end. Models are getting stronger and can tackle problems far bigger than the initial box drawn around them. The early approach of simply having Codex "implement a task" quickly felt too constraining. Codex is perfectly capable of creating multiple PRs, reading review feedback, and addressing revision comments. So they gave it access to the gh CLI, the ability to read CI logs, and other tools, empowering it to do more, such as closing stale PRs and generating reports on completed and abandoned work. These tasks were completely beyond the scope of the original "implement feature" brief.

Ultimately, the team gravitated toward assigning agents goals rather than prescribing rigid state transitions — much like a good manager gives a direct report an objective rather than dictating every single step. The model's value lies in its reasoning ability: give it tools and context, and let it figure things out.

Building Symphony with Symphony

Open Symphony's code repository and the first thing you'll notice is that, on a technical level, Symphony is just a SPEC.md file: a definition of the problem and a description of the intended solution. The team didn't build a complex supervisory system; instead, they clearly defined the problem and the desired direction, providing high-level guidance to the agent.

# Symphony Service Specification
Status: Draft v1 (language-agnostic)
Purpose: Define a service that orchestrates coding agents to get project work done.

## 1. Problem Statement
Symphony is a long-running automation service that continuously reads work from an issue tracker
(Linear in this specification version), creates an isolated workspace for each issue, and runs a
coding agent session for that issue inside the workspace.
...

The reference implementation is written in Elixir — because when the cost of code is near zero, you can finally choose a language for its inherent qualities, and Elixir has natural advantages in concurrency. But the core concept is literally a Markdown document. The team actively encourages developers to just toss this spec at their favorite coding agent and have it implement its own version.

The first version of Symphony was nothing more than a Codex session running inside tmux, polling Linear and launching sub-agents for new tasks. It worked, but wasn't reliable. The second version lived inside the main project repository, which was already designed for agent use. The team had previously built the harness that enables agents to do high-quality work; Symphony simply connected it all together.

Once the basic functionality was up, the team began using Symphony to build Symphony itself.

After an internal demo showing it managing tasks and automatically attaching work-in-progress video proof, the response was surprisingly strong: Symphony's project channel began to grow rapidly, with various teams adopting it organically. At OpenAI, internal product-market fit is a prerequisite for an external release, and the internal usage data clearly indicated that Symphony was worth sharing.

So the team abstracted the core idea into a standalone SPEC.md and let Codex implement it. They chose Elixir for the reference implementation and continuously iterated on both the spec and the implementation in parallel. To polish the spec, they had Codex implement it in TypeScript, Go, Rust, Java, and Python, using the results to hunt down ambiguities in the spec and simplify the system design. Every language was made to work.

Throughout this build process, the team stripped away a large amount of accidental complexity, like dependencies on specific repositories or Linear MCP. No longer tied to internal repositories or internal workflows, the core approach became remarkably simple:

For every pending task, ensure an agent is continuously running in its own workspace.

Beyond helping with specific work, the development workflow itself became something agents understood how to follow. This process (picking up an issue, checking out the repo, setting it to "In Progress" so the PM knows someone is actively working on it, opening a PR, moving it to "Review" status, attaching a video, and so on) was previously maintained by engineer instinct and never documented. Now it is written into a WORKFLOW.md file, and Symphony ensures the agents follow the steps. If you later want agents to attach a self-reflection to completed work, you just add it to WORKFLOW.md; Symphony will guide the agent to that step.
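One way to picture this is a workflow held as ordered data, so that adding a step is a one-line change. This is only a sketch: the step names are paraphrased from the paragraph above, and the list-based representation is an assumption, not the actual format of WORKFLOW.md.

```python
# Hypothetical step names, loosely following the process described above.
WORKFLOW = [
    "pick up issue",
    "check out repo",
    "set status: In Progress",
    "open PR",
    "set status: Review",
    "attach video",
]


def next_step(workflow, completed):
    """Return the first step not yet completed, or None when all are done."""
    for step in workflow:
        if step not in completed:
            return step
    return None


done = {"pick up issue", "check out repo"}
print(next_step(WORKFLOW, done))  # set status: In Progress

# Adding a new requirement only touches the workflow definition.
extended = WORKFLOW + ["attach self-reflection"]
print(next_step(extended, set(WORKFLOW)))  # attach self-reflection
```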

The entire process also leverages Codex's App Server mode — a built-in mode specifically designed for headless operation, enabling programmatic interaction with Codex via a well-defined JSON-RPC API, like starting a thread or responding to turn events. This is significantly more convenient and scalable than interacting through the CLI or tmux sessions.
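For readers unfamiliar with JSON-RPC, the sketch below shows generic JSON-RPC 2.0 framing: requests carry an `id` and expect a response, while notifications (such as events) carry no `id`. The method names `example/start_thread` and `example/turn_event` are placeholders invented for this example; the App Server's real method names, parameters, and exact wire format should be taken from its own documentation.

```python
import itertools
import json

_ids = itertools.count(1)


def make_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request with a fresh numeric id."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    return json.dumps(message)


def is_notification(message: dict) -> bool:
    """Per JSON-RPC 2.0, a message with a method and no id is a notification."""
    return "method" in message and "id" not in message


# Placeholder method and params, not the real App Server API.
request = make_request("example/start_thread", {"cwd": "/tmp/workspace"})
print(json.loads(request)["method"])  # example/start_thread

event = {"jsonrpc": "2.0", "method": "example/turn_event", "params": {}}
print(is_notification(event))  # True
```

Matching each response to its request by `id` is what lets one orchestrator process drive many agent threads programmatically, which is awkward to do by scraping a terminal.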

Codex App Server is a perfect fit for this scenario: you get the harness Codex provides while still having plenty of knobs and hooks to tap into. For example, to avoid exposing the Linear access token to sub-agents, the team used dynamic tool invocation, exposing the raw linear_graphql function to agents — allowing them to execute arbitrary requests against Linear without needing MCP or exposing the token to the container.
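The token-hiding idea can be sketched as a closure: the orchestrator builds a tool function that already holds the token, and the agent only ever supplies a query. This is an illustration under assumptions, not Symphony's implementation. The `make_linear_graphql_tool` helper, the injected `send` transport, and the header layout are all hypothetical; only the tool name `linear_graphql` comes from the text above, and the endpoint URL is Linear's public GraphQL endpoint.

```python
import json
from typing import Callable, Optional

LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"


def make_linear_graphql_tool(token: str, send: Callable[[str, dict, bytes], dict]):
    """Build a tool function that closes over the token.

    The agent calls the returned function with a query and variables only;
    the token stays in the orchestrator process.
    """

    def linear_graphql(query: str, variables: Optional[dict] = None) -> dict:
        # Header layout is an assumption; check Linear's API docs for the exact form.
        headers = {"Authorization": token, "Content-Type": "application/json"}
        body = json.dumps({"query": query, "variables": variables or {}}).encode()
        return send(LINEAR_GRAPHQL_URL, headers, body)

    return linear_graphql


# A fake transport for demonstration; a real one would perform the HTTP POST.
def fake_send(url: str, headers: dict, body: bytes) -> dict:
    return {"url": url, "query": json.loads(body)["query"]}


tool = make_linear_graphql_tool("secret-token", fake_send)
result = tool("{ viewer { id } }")
print(result["query"])  # { viewer { id } }
```

Nothing the agent passes in or gets back contains the token, so the sandboxed process never needs it in its environment.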

Next Steps

Symphony is an intentionally ultra-minimal orchestration layer. The goal of open-sourcing it is to demonstrate the potential of pairing Codex App Server with workflow tools like Linear. As a reference implementation, the team doesn't intend to maintain Symphony as a standalone product. Instead, it's offered as a blueprint — much like when many developers tossed that Harness Engineering blog post at their own agents to scaffold their repositories. The hope is that people will similarly toss the Symphony spec and repo at their favorite coding agent and build a version tailored to their own team.

The real power comes from Codex and its App Server. Symphony merely connects Codex and Linear, two tools already in use, and solves the problem of managing work. As coding agents' reasoning and instruction-following capabilities continue to improve, the bottleneck for other companies may also start to shift from "writing code" to "managing agent work." What's exciting is that the barrier to experimenting with these kinds of coding agent systems is now surprisingly low. If you want to try something, just have Codex build it.

Community Response

Since its release, the project has racked up over 15,000 stars on GitHub (as of April 23).

The community is already taking action — someone had an agent build a complete dashboard in their Elixir ERP application in one shot, complete with GenServer and a custom agent that captures production bugs and auto-submits fixes; another person implemented their own version using a Go + Charm CLI TUI stack; and someone else forked Symphony to support Claude Code and GitHub Issues, making it available via Homebrew.

That's exactly the point of releasing an open specification — pointing the way and letting the community run with it.

Symphony: Every Issue Gets Its Own Agent, Humans Just Review the Results