After Operator (web navigation) and Deep Research (information synthesis), OpenAI on Friday announced a preview of a new agent dedicated to software engineering: Codex (not to be confused with the first version of Codex, launched in 2021). This agent, integrated into the ChatGPT interface, is designed to automate certain programming tasks such as generating code, detecting and fixing bugs, writing tests, and even creating pull requests.
Unlike traditional code assistance systems, which merely offer completions or suggestions, this agent operates more autonomously. Tasks are executed in an isolated cloud environment, configured with the technical context provided by the user (notably the content of their code repository). This allows the agent to carry out complex operations sequentially or in parallel while checking its own work to some degree: it can, for example, execute code, analyze the results, adjust its own modifications, and generate outputs such as pull requests ready for review.
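The execute, analyze, adjust cycle described above can be illustrated with a minimal sketch. It is not OpenAI's implementation: the function names `run_check` and `iterate_until_passing` are hypothetical, a temporary directory and a subprocess stand in for the isolated cloud environment, and a fixed list of candidate revisions stands in for the changes a model would propose after reading the failure output.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_check(source, check):
    """Run a candidate plus its check in a fresh temp directory.

    Returns (passed, stderr). The temp directory is only a stand-in
    for an isolated environment; it is not a real sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(source + "\n" + check + "\n")
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            cwd=workdir,
            timeout=30,
        )
    return result.returncode == 0, result.stderr


def iterate_until_passing(candidates, check, max_attempts=3):
    """Try candidate revisions in order and stop at the first that passes.

    Assumption: a real agent would generate each new revision from the
    previous failure output; here the revisions are supplied up front.
    """
    log = []
    for attempt, source in enumerate(candidates[:max_attempts], start=1):
        passed, _stderr = run_check(source, check)
        log.append((attempt, passed))
        if passed:
            return source, log
    return None, log


if __name__ == "__main__":
    buggy = "def add(a, b):\n    return a - b"
    fixed = "def add(a, b):\n    return a + b"
    accepted, history = iterate_until_passing([buggy, fixed], "assert add(2, 3) == 5")
    print(history)
    print(accepted)
```

In this toy run the first revision fails its check and the second is accepted, which mirrors the idea of an agent revising its own changes before handing over a result for review.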
This functionality relies on a model named codex-1, a variant of OpenAI's o3 reasoning model. This model has been specifically fine-tuned through reinforcement learning on software development scenarios, with the objective of producing code that is readable, consistent with the project's style, and in line with best practices.
Functionality and Availability
Codex is accessible from the ChatGPT sidebar (for Pro, Team, and Enterprise plan users). Two main entries are offered:
“Code” to request the execution of a task (implementation, correction, etc.)
“Ask” to query the agent about an existing file or structure (function, class, dependency, etc.)
The time required for execution depends on the complexity of the task and varies, according to OpenAI, from a few minutes to half an hour. Several companies, including Cisco, Superhuman, Temporal, and Kodiak, are experimenting with the tool in real-world use cases such as legacy code maintenance, automated test generation, or project documentation.
The service is currently limited to Pro, Team, and Enterprise subscribers; its extension to Plus users is announced for a later date.
To better understand
What is the potential regulatory and compliance impact of using an isolated cloud-based environment for software engineering?
Using an isolated cloud-based environment raises regulatory concerns, particularly regarding data security and compliance with data protection standards like GDPR. Companies must ensure their cloud hosting practices align with these regulations to avoid legal risks.
How does fine-tuning the codex-1 model with reinforcement learning enhance its performance in software development?
Fine-tuning the codex-1 model with reinforcement learning lets it learn from its mistakes during training: attempts that succeed on software development tasks are reinforced, and those that fail are discouraged. This improves its ability to generate code that is consistent and in line with modern development practices, and makes it more accurate in understanding and implementing programming tasks.