Codex Ran for 22 Hours, Earned a Real $16.88: Altman's Vision of 'AI Workers' Is Here

This appears to be the first time Codex has earned a 'paycheck.'

Over the weekend, a developer named Chris posted on X:

He gave Codex a simple command to 'go earn $5.' Over the next 22 hours, Codex found a bounty path for an open-source security audit, completed and submitted a PR, and followed up with the maintainer.

A few days later, $16.88 landed in his account.

Chris then did the math: if repeated daily, that's a monthly income of $506.40.
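Chris's extrapolation can be reproduced in a few lines. This is a minimal sketch that assumes a 30-day month and one identical payout every day; both assumptions come from the arithmetic in the post, not from any observed earnings pattern.

```python
from decimal import Decimal

# Single payout reported by Chris.
payout = Decimal("16.88")

# Assumption: a 30-day month with exactly one identical payout per day.
days_per_month = 30

monthly = payout * days_per_month
print(monthly)  # 506.40
```

The figure is a best-case projection: it assumes every day's run finds a bounty, gets merged, and pays the same amount.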

A screenshot showing Chris's tweet and the calculation of potential monthly earnings from the Codex experiment.

The post caused a sensation in the developer community.

Some called it 'the first order for an AI worker,' and Chris himself excitedly declared in the post that it made him 'see Altman's vision of AI earning money for you begin to come true.'

Altman once predicted in a personal blog post that 2025 might see the first AI agents 'join the workforce' and materially change company output.

Chris views this experiment as an early validation of that prediction.

How It Earned the $16.88

Based on the process Chris published on X, the timeline likely unfolded like this:

The starting point was an extremely simple command. Chris told Codex: Go find work on GitHub and earn $5.

After receiving this minimal instruction, Codex located a bounty platform. However, how it specifically targeted tasks and whether it used additional tools or manual configuration is currently unverifiable, as no public logs are available.

After taking the job, it executed it: reading code, modifying code, and submitting a pull request (PR). This is Codex's strongest and best-validated capability.

Then came communication: replying to comments back and forth with the maintainer. This is historically where AI agents are most likely to stumble. Codex navigated this round as well, and the PR was ultimately merged.

According to Chris's account, he received a $16.88 payment a few days after the PR was merged and the verification process was completed.

Although Chris publicly shared a payment receipt screenshot and some conversation logs, the detailed process still lacks independent third-party verification. Nonetheless, this does not detract from the story's main narrative.

Codex's capabilities are clearly documented by OpenAI, which defines it as 'a cloud-based software engineering agent.'

After a user inputs a task instruction, Codex processes multiple tasks in parallel in the background, each corresponding to a code change or PR. https://openai.com/index/introducing-codex/

It can read and edit files, run testing frameworks, linters, and type checkers, commit code changes, and open GitHub pull requests for review.

In other words, the pipeline of 'write code → run tests → submit PR → keep logs' is officially supported by Codex.

Chris said Codex 'found an open-source security audit bounty path.' OpenAI launched a Codex Security feature specifically for engineering and security teams, enabling per-commit scanning of connected GitHub repositories and validating high-confidence security vulnerabilities in an isolated environment.

The 'security audit bounty path' Chris mentioned is consistent with the capabilities described for this product line.

But a crucial precondition here is overlooked by many.

During the agent execution phase, Codex has internet access turned off by default. The Codex cloud documentation clearly states:

By default, Codex blocks internet access during agent execution. Network permissions are still available during the script installation phase to install dependencies; users can manually enable agent internet access per environment as needed.

This is a core constraint.

If it is turned on, what then? OpenAI lists a string of risks: prompt injection from untrusted web content, code or key exfiltration, downloading malicious dependencies, pulling content with restrictive licenses.

So, if the events Chris described truly happened, Codex being able to 'find a bounty path' likely means he actively enabled internet access, or Codex accomplished this action through a combination of GitHub, a browser, MCP, or other tools.

This is a product of 'model + tools + permissions + network,' not a bare capability of the model. If Chris's account holds, it also suggests that Codex, under specific conditions, has the technical ability to run the 'find vulnerabilities → submit PR → follow up on reviews' pipeline.

The Accounting Isn't That Simple

Let's do the math again.

Chris said a total of about 10-15 security audit projects were run this time, consuming 22M tokens. The $16.88 is for the first project that 'got the green light' and paid out. There are still multiple pending audits awaiting confirmation.

The OpenAI API public pricing page shows that the output price for GPT-5.5 is $30 per 1 million tokens, and input is $5 per 1 million tokens. Chris himself cited this token price in a follow-up comment, using it to extrapolate future profit margins.

However, Codex as a product has task quota limits based on subscription plans like ChatGPT Pro, Team, and Enterprise. The actual consumption logic is entirely different from raw API billing.

Chris did not disclose the breakdown of input versus output tokens among the 22M consumed, nor did he clarify whether he used subscription quotas or went directly through the API. He also did not mention task failure rates, retry costs, or manual troubleshooting time.
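Because the input/output split is unknown, the API-equivalent cost can only be bounded. The sketch below uses the prices and token count quoted in the article and assumes pure per-token API billing with no caching discounts or subscription quota, which may not match how Chris actually ran Codex. The function and variable names are illustrative.

```python
import math

TOKENS_MILLIONS = 22.0   # total tokens Chris reported, in millions
INPUT_PRICE = 5.0        # USD per 1M input tokens, as quoted in the article
OUTPUT_PRICE = 30.0      # USD per 1M output tokens, as quoted in the article
PAYOUT = 16.88           # USD, the one bounty paid out so far


def api_cost(output_share: float) -> float:
    """Cost in USD if `output_share` of all tokens were output tokens."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    output_m = TOKENS_MILLIONS * output_share
    input_m = TOKENS_MILLIONS - output_m
    return input_m * INPUT_PRICE + output_m * OUTPUT_PRICE


def payouts_to_cover(cost: float) -> int:
    """Number of $16.88 payouts needed to cover a given cost."""
    return math.ceil(cost / PAYOUT)


lowest = api_cost(0.0)    # every token billed as input
highest = api_cost(1.0)   # every token billed as output

print(f"API cost range: ${lowest:.2f} to ${highest:.2f}")   # $110.00 to $660.00
print(payouts_to_cover(lowest), payouts_to_cover(highest))  # 7 40
```

Under these assumptions, even the cheapest case needs about seven equal payouts to break even, and the most expensive case needs more payouts than the 10-15 audits that were run.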

His real logic isn't about calculating current profitability. He's betting that, with model costs potentially dropping 10x every year, this closed loop will become cheaper and cheaper to run. He stated in a follow-up thread:

"Why are you all assuming I spent a lot of money? GPT-5.5 output is $30/million tokens now. Next year it will drop to $2, and by then both sides will be making a killing."
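The two figures in this argument can be checked against each other. This is a rough sketch using only the numbers quoted above; the assumption that all 22M tokens are billed as output is a deliberate worst case, not something Chris reported.

```python
current_output_price = 30.0   # USD per 1M output tokens, as quoted by Chris
predicted_price = 2.0         # USD per 1M output tokens, Chris's prediction for next year
annual_drop_factor = 10.0     # the "10x every year" decline cited in the article

# Decline implied by Chris's own numbers.
implied_factor = current_output_price / predicted_price    # 15.0

# Price after one year if the decline were exactly 10x.
price_at_10x = current_output_price / annual_drop_factor   # 3.0

# Worst case for the same run next year: all 22M tokens billed as output.
worst_case_next_year = 22 * predicted_price                # 44.0

print(implied_factor, price_at_10x, worst_case_next_year)  # 15.0 3.0 44.0
```

A drop from $30 to $2 is a 15x decline, somewhat steeper than the 10x-per-year trend. Even at $2, the worst-case cost of this run would still exceed the single $16.88 payout, so the bet depends on more of the pending audits paying out.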

Therefore, this $16.88 is currently more of an experimental signal that 'something can be run through' than a replicable business model.

GitHub Has Already Paved the Way

This wasn't solely Codex's accomplishment. If we break down all the actions behind this single order (finding work, doing work, communicating, and receiving payment), GitHub has been quietly laying the groundwork for each step.

Agent HQ: GitHub Sets Up Workstations for AI

In February of this year, GitHub integrated Claude and OpenAI Codex into Agent HQ and made it available in public preview to Copilot Pro+ and Enterprise users. This is akin to GitHub setting up a workstation for AI.

The GitHub Agent HQ agent selection interface allows programming tasks to be assigned to Copilot, Claude, Codex, and custom agents.

GitHub's official description is:

"Agents run asynchronously by default. You can track progress in real-time, or review completed sessions afterward, viewing detailed logs to see what the agent did and why."

This means the role you previously assigned to a junior engineer can now also be filled by a coding agent. This is precisely the workflow direction GitHub is endorsing and advancing at the system level.

Four Key Interfaces

Mapping the process Codex went through this time to the capabilities GitHub has already laid out makes it clearer:

First, the interface for finding work. GitHub has ready-made issues, PRs, repository contexts, and an Agents tab. As for bounty tasks, these rely more on third-party platforms like Algora, IssueHunt, or a project's own mechanisms. An agent doesn't need to crawl the entire web; it can go to these few places to find structured 'work.'

Second, the interface for doing work. Repository read/write permissions and the Codespaces sandbox environment allow an agent to clone, modify, and run tests within an isolated environment provided by GitHub itself, without needing to set up its own infrastructure.

Third, the interface for communication. When a comment arrives, the PR review channel, the @-mention mechanism, and comment threads tell the agent precisely who is responding and which section of code in the PR is being referenced.

Fourth, the interface for receiving payment. Bounty platforms like Algora and IssueHunt have already integrated with the GitHub issue workflow. The Algora platform can automatically settle payments upon PR merge, so receiving payment no longer means having to 'integrate Stripe separately and write a bunch of code.'

None of these four interfaces is new on its own, but their significance changes when they are combined within the same workbench and reorganized in an 'agent-friendly' manner.
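The four interfaces can be pictured as an ordered pipeline. The sketch below only restates the mapping from the text as data; the `Stage` class and `describe` helper are hypothetical names for illustration and do not correspond to any GitHub or Codex API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    """One step of the 'find work -> get paid' loop described above."""
    name: str
    interfaces: Tuple[str, ...]


# Mapping taken from the article's four points; not an official taxonomy.
PIPELINE = (
    Stage("find work", ("issues", "PRs", "repository context", "Agents tab",
                        "bounty platforms such as Algora or IssueHunt")),
    Stage("do work", ("repository read/write permissions", "Codespaces sandbox")),
    Stage("communicate", ("PR review channel", "@-mentions", "comment threads")),
    Stage("receive payment", ("bounty platform settlement on PR merge",)),
)


def describe(pipeline: Tuple[Stage, ...]) -> List[str]:
    """Render each stage as a numbered line listing its interfaces."""
    return [f"{i}. {stage.name}: {', '.join(stage.interfaces)}"
            for i, stage in enumerate(pipeline, start=1)]


for line in describe(PIPELINE):
    print(line)
```

Laying the stages out this way makes the point of the section concrete: each step already has a structured entry point, so an agent only has to move from one to the next.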

Not Just Codex

The pathway Codex successfully navigated this time is reusable for any agent that has been connected to Agent HQ.

GitHub's official Octoverse report shows that in 2025, the platform averages 43.2 million merged PRs per month, a 23% year-over-year increase, with AI-related repositories growing by 178% year-over-year.

Agent-driven development workflows are moving from experimentation to scale.

Key data from the GitHub Octoverse 2025 annual report: 43.2 million PRs are merged on the platform monthly, and the total number of AI-related repositories has reached 4.3 million, a 178% year-over-year increase.

Codex's $16.88 payment this time is more of a milestone: on the road GitHub has already paved, the first car has completed the entire route and generated revenue.

The Last Few Puzzle Pieces

So, what is still missing for 'AI autonomously earning money' to truly become a reality?

Based on public information, the bounty Codex took on this time was not top-tier difficulty. A single $16.88 bounty corresponds to a small-to-medium fix. The whole process took a few days, and there weren't many back-and-forth rounds with the maintainer. Therefore, this is more of a demonstration that a 'pathway is established' than a sign that 'the pathway is mature.'

There may also have been more manual intervention than one might imagine.

Account and GitHub authorization needed to be configured by a human, internet access had to be manually enabled, and the final code review and merge required human confirmation.

OpenAI officially states that users still need to manually review and verify all agent-generated code.

The PR page after a Codex task is completed includes a modification summary, terminal test logs (tests passed), and a code comparison (diff), based on which the user decides whether to merge.

This means Chris was indispensable in this matter: he had to give the starting command, as the agent wouldn't wake up on its own and decide to earn money today; he had to help Codex connect the payment channel; he had to act as a fallback, ready to step in if Codex got stuck.

Therefore, a more accurate characterization of the event is this: it was a 'successful end-to-end run-through under Chris's supervision,' which is still some distance from a truly unattended automatic money-making machine.

But the balance of human-agent collaboration is rapidly tilting toward the agent.

OpenAI says that interaction with Codex 'will increasingly resemble asynchronous collaboration with a colleague,' with agents handling more complex tasks over longer periods.

$16.88 won't change anyone's life.

But if this experiment is eventually fully verified, what will the next order be worth?

References:

https://x.com/chatgpt21/status/2053556436475461786

https://openai.com/index/introducing-codex/

https://github.blog/news-insights/company-news/pick-your-agent-use-claude-and-codex-on-agent-hq/

https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/
