News

OpenAI Begins Article Series on Codex CLI Internals

News Room · Published 3 February 2026 (last updated 9:44 AM)

OpenAI recently published the first in a series of articles detailing the design and functionality of their Codex software development agent. The inaugural post highlights the internals of the Codex harness, the core component in the Codex CLI.

Like all AI agents, the harness consists of a loop that takes input from a user and uses an LLM to generate tool calls or responses back to the user. Because of LLM constraints, however, the loop also employs strategies to manage context and reduce prompt cache misses; some of these were lessons learned the hard way, in the form of bugs reported by users. Because the CLI uses the open Responses API, it is LLM-agnostic: it can work with any model wrapped by this API, including locally hosted open models. According to OpenAI, their CLI design and lessons can therefore benefit anyone building an agent on this API:

[We] highlighted practical considerations and best practices that apply to anyone building an agent loop on top of the Responses API. While the agent loop provides the foundation for Codex, it’s only the beginning. In upcoming posts, we’ll dig into the CLI’s architecture, explore how tool use is implemented, and take a closer look at Codex’s sandboxing model.

The article describes what happens in one round, or turn, of a user's conversation with the agent. The turn begins with assembling an initial prompt for the LLM. This consists of instructions, a system message containing general rules for the agent, such as coding standards; tools, a list of MCP servers the agent can invoke; and input, a list of text, images, and file inputs, including things like AGENTS.md, local environment information, and the user's input message. All of this is packaged into a JSON object and sent to the Responses API.
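The shape of that initial request can be sketched as follows. This is a minimal illustration of the structure the article describes, not the real Responses API schema: the field contents and helper names here are assumptions for illustration.

```python
import json

# Hypothetical sketch of assembling a first-turn request body:
# instructions (system rules), tools (MCP servers), and input items
# (AGENTS.md, environment info, and the user's message).
def build_initial_prompt(instructions, tools, user_message, agents_md, env_info):
    input_items = [
        {"type": "input_text", "text": agents_md},    # project AGENTS.md
        {"type": "input_text", "text": env_info},     # local environment info
        {"type": "input_text", "text": user_message}, # the user's message
    ]
    return json.dumps({
        "instructions": instructions,  # e.g. coding standards
        "tools": tools,                # MCP servers the agent may invoke
        "input": input_items,
    })

payload = build_initial_prompt(
    instructions="Follow the project's coding standards.",
    tools=[{"type": "mcp", "server_label": "docs"}],
    user_message="Fix the failing test.",
    agents_md="# AGENTS.md\nRun tests with pytest.",
    env_info="os=linux cwd=/repo",
)
```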

This triggers LLM inference, which produces a stream of output events. Some of these events may indicate that the agent should call one of the tools; in this case the agent invokes the tool with the specified inputs and collects the output. Other events indicate reasoning outputs from the LLM, which are typically steps in a plan. Both tool calls and reasoning are then appended to the initial prompt, which is passed to the LLM again for more iterations of reasoning or tool calling. Each of these iterations is a turn of the "inner" loop. The conversation turn is finished when the LLM emits a done event, which includes a response message for the user.
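The inner loop above can be sketched in a few lines. The event shapes and the `call_model` / `run_tool` stand-ins are illustrative assumptions, not the real Responses API or Codex internals:

```python
# Minimal sketch of the agent's inner loop: call the model, execute any
# requested tools, append results, and repeat until a "done" event arrives.
def run_turn(prompt, call_model, run_tool):
    transcript = list(prompt)
    while True:  # each pass is one inner-loop turn
        for event in call_model(transcript):
            if event["type"] == "tool_call":
                # Append both the call and its result so the next
                # inference pass sees what happened.
                transcript.append(event)
                result = run_tool(event["name"], event["arguments"])
                transcript.append({"type": "tool_result", "output": result})
            elif event["type"] == "reasoning":
                transcript.append(event)  # keep plan steps in context
            elif event["type"] == "done":
                return event["message"], transcript

# Tiny fake model: first call asks for a tool, second call finishes the turn.
_calls = {"n": 0}
def fake_model(transcript):
    if _calls["n"] == 0:
        _calls["n"] += 1
        return [{"type": "reasoning", "text": "list files first"},
                {"type": "tool_call", "name": "shell", "arguments": "ls"}]
    return [{"type": "done", "message": "Done: repo contains file.txt"}]

message, transcript = run_turn(
    [{"type": "input_text", "text": "What files are here?"}],
    fake_model,
    lambda name, args: "file.txt",
)
```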

A major challenge in this scheme is LLM inference performance: it is “quadratic in terms of the amount of JSON sent to the Responses API over the course of the conversation.” This is why prompt caching is key: by reusing the output of a previous inference call, inference performance becomes linear instead of quadratic. Changing things like the list of tools will invalidate the cache, and Codex CLI’s initial support for MCP had a bug that “failed to enumerate the tools in a consistent order⁠,” which caused cache misses.
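The fix implied by that bug is to serialize the tool list deterministically, so that successive requests share an identical prefix and hit the prompt cache. A minimal sketch (the tool names here are made up for illustration):

```python
import json

# Serialize tools in a stable order so two requests with the same tool set
# produce byte-identical JSON, regardless of enumeration order.
def serialize_tools(tools):
    return json.dumps(sorted(tools, key=lambda t: t["name"]))

first = serialize_tools([{"name": "shell"}, {"name": "apply_patch"}])
second = serialize_tools([{"name": "apply_patch"}, {"name": "shell"}])
# Identical serialization regardless of input order -> cache hit, not a miss.
```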

Codex CLI also uses compaction to reduce the amount of text in the LLM context. Once the conversation length exceeds a set number of tokens, the agent calls a special Responses API endpoint that produces a smaller representation of the conversation, which then replaces the previous input.
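A hypothetical compaction trigger might look like the following. The `compact` callback stands in for the special Responses API endpoint mentioned above; the rough four-characters-per-token estimate and the limit are illustrative assumptions, not Codex's actual values:

```python
# If a rough token estimate exceeds the limit, replace the transcript with
# a single compacted summary item; otherwise pass it through unchanged.
def maybe_compact(transcript, compact, limit_tokens=100_000):
    estimated = sum(len(item.get("text", "")) for item in transcript) // 4
    if estimated > limit_tokens:
        return [compact(transcript)]  # smaller representation replaces input
    return transcript

long_history = [{"type": "input_text", "text": "x" * 500_000}]
compacted = maybe_compact(long_history,
                          lambda t: {"type": "summary", "text": "…summary…"})
```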

Hacker News users discussing the article praised OpenAI’s decision to open-source Codex CLI, pointing out that Claude Code is closed. One user wrote:

I remember they announced that Codex CLI is opensource…This is a big deal and very useful for anyone wanting to learn how coding agents work, especially coming from a major lab like OpenAI. I’ve also contributed some improvements to their CLI a while ago and have been following their releases and PRs to broaden my knowledge.

The Codex CLI source code, bug tracking, and fix history are available on GitHub.
