
A detailed OpenAI Codex tutorial

16 min read
Sep 10, 2025
content
What is OpenAI Codex?
Setting up Codex inside ChatGPT
Step 1: Locating the Codex tool
Step 2: Starting the setup process
Step 3: Connecting to GitHub
Interacting with Codex: The two modes of operation
The “Ask” mode: Our AI code analyst
The “Code” mode: Our AI action taker
Guiding OpenAI Codex with the AGENTS.md file
What should we include in an AGENTS.md file?
OpenAI Codex in action: A daily planner case study
Understanding the codebase with the “Ask” mode
Fixing a bug with Code mode
Creating the pull request
Beyond the basics: Real-world considerations
The future of delegation: Trust, security, and a new workflow
Final thoughts

Imagine starting your day not by tackling the small bugs in the backlog, but by delegating them. We assign a list of tedious fixes to a capable AI teammate and then dive deep into the architectural work that truly needs our focus. This isn’t a distant future; it’s the new paradigm of AI-powered delegation that OpenAI Codex explores, shifting from a real-time pair programmer to an asynchronous AI software engineering agent.

As developers, our daily work is often a balancing act. Alongside building major features, we face an endless stream of small but essential tasks: fixing bugs, updating documentation, or refactoring a function. Each task requires a context switch, pulling our focus away from deeper problems. With an AI developer tool like Codex, we can package these assignments and hand them off to a capable AI teammate to handle on our behalf, boosting our developer productivity. Instead of suggesting the next line, Codex is designed to take on an entire task, work independently, and return a completed solution for our review.

Informational note: It’s important to clarify that OpenAI has two distinct offerings with similar names:

  • Codex CLI tool

  • Codex cloud-based agent

This blog focuses on the cloud-based Codex agent. This proprietary, closed-source service is integrated directly within the ChatGPT interface and is unavailable as a standalone SDK. It performs its tasks by creating a secure, remote copy of our repository.

This is separate from the Codex CLI, an open-source tool designed to run locally in our terminal, allowing it to operate directly on the files on our machine.

This blog will explore what OpenAI Codex is, how to set it up, and how to use it to solve a real-world bug, from prompt to pull request. Whether you are an individual developer looking to boost your productivity or a team lead evaluating new AI tools, this guide provides a practical roadmap for getting started.

What is OpenAI Codex?#

OpenAI Codex is a cloud-based software engineering agent released in 2025 as part of the ChatGPT ecosystem. It performs development tasks in a secure, sandboxed environment: a temporary, isolated space where code can be run safely without affecting our main repository.

Unlike tools inside our editor, Codex operates at a different level of abstraction. To solve the problem of constant context-switching, it doesn’t just suggest changes; it actively carries them out by handling tasks asynchronously. When we submit a request (like fixing a bug or adding tests), Codex launches a dedicated container to perform the work. It runs terminal commands, analyzes dependencies, and checks its results, ultimately presenting a solution for our review.

At its core, Codex is powered by codex-1, a variant of OpenAI’s o3 model that was specifically fine-tuned on actual software engineering workflows, as detailed in the OpenAI Codex official documentation. This includes reinforcement learning on real-world repositories, helping the agent produce outputs that match standard pull request formats, coding styles, and CI constraints.

Codex is being rolled out to users on ChatGPT Team, Enterprise, and Pro plans, with access for Plus and Edu users planned for the future. It represents OpenAI’s dedicated effort to build an agent that reduces developer toil and accelerates the entire software development life cycle.

Informational note: While Codex is built on top of OpenAI’s language models, it differs from general-purpose GPT tools. Codex is fine-tuned specifically for software engineering tasks, with system behaviors and workflows optimized for writing, testing, and submitting production-grade code, not just generating text.

Setting up Codex inside ChatGPT#

Before Codex can begin working on our codebase, we need to complete a brief one-time setup to connect Codex with our development environment. The process is straightforward and guided by the ChatGPT interface. Let’s walk through it together.

Step 1: Locating the Codex tool#

Our first step is to find the Codex agent within ChatGPT. After logging into a supported account (Team, Enterprise, or Pro), we can locate Codex in the main sidebar on the left-hand side of the screen.

Locating the Codex in the ChatGPT interface

Step 2: Starting the setup process#

Once we click “Codex” in the sidebar, ChatGPT opens a new tab dedicated to the Codex workspace. This is where all task execution, environment configuration, and code interactions will happen.

Codex tab in ChatGPT

We simply click the “Get Started” button to initiate the setup. This begins the process of linking Codex with our GitHub account.

The “Connect to GitHub” page

Step 3: Connecting to GitHub#

Codex operates directly on repositories, so it needs permission to access our GitHub account. After we click the “Connect to GitHub” button, a prompt asks us to continue to GitHub.

Connecting to GitHub

After clicking “Continue to GitHub,” a pop-up window will ask us to install and authorize the GitHub connector. This allows Codex to view and interact with repositories across our account or only the specific ones we select.

Installing and authorizing GitHub

Once authorized, we’re taken to a screen where we can add our GitHub account under the “GitHub organization” tab. This step ensures Codex can discover and prepare our repositories for task execution.

Creating the starting environment

We can see a toggle for “Agent internet access.” This is a crucial security and functionality setting. By default, it is turned off, which means Codex operates in a completely isolated sandbox without access to the public internet. This powerful security feature prevents the agent from reaching external websites or APIs. However, if our task requires installing or updating dependencies from a package manager (like npm or pip), we must enable this option. For our initial setup, we can leave it turned off, but it is an important setting to remember for tasks that involve managing external packages.

Once we have selected our initial repository, we complete the process by clicking the “Create environment” button. Codex will then prepare the workspace, which may take a moment as it sets up the sandbox in the background. Once it’s finished, we will be brought to the main Codex interface. This is our command center for assigning new tasks and monitoring their progress.

Our setup is complete, and we are ready to put Codex to work.

The Codex interface

Interacting with Codex: The two modes of operation#

Interacting with Codex is fundamentally different from using a real-time AI assistant. It operates on an asynchronous delegation model, freeing us from waiting on a task and enabling multitasking. After we submit a task, the agent works on it independently in its cloud environment. A typical task can take 1 to 30 minutes, depending on complexity. This workflow encourages us to assign a task, switch our focus to another problem, and then return to review the finished work.
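To make the delegation model concrete, here is a minimal sketch of the "submit, keep working, review later" pattern using Python’s standard library. It is only an analogy: `delegated_task` and `delegate_and_continue` are hypothetical names, the short sleep stands in for minutes of agent work, and nothing here calls Codex or any OpenAI API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def delegated_task(description: str) -> str:
    """Hypothetical stand-in for an agent working on a task on its own."""
    time.sleep(0.1)  # placeholder for minutes of independent work
    return f"proposed changes for: {description}"


def delegate_and_continue(descriptions: list[str]) -> list[str]:
    """Submit every task up front, do other work, then collect results for review."""
    with ThreadPoolExecutor(max_workers=max(1, len(descriptions))) as pool:
        futures = [pool.submit(delegated_task, d) for d in descriptions]
        # We are free to focus on something else while the tasks run.
        _ = sum(range(1000))  # placeholder for our own deep work
        # result() blocks only at the moment we come back to review.
        return [future.result() for future in futures]


if __name__ == "__main__":
    for result in delegate_and_continue(["fix duplicate tasks", "add unit tests"]):
        print(result)
```

The point of the sketch is the ordering: all tasks are handed off first, and waiting happens only at review time, which mirrors how we would batch several Codex tasks before returning to inspect their output.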

To manage these interactions, Codex offers two primary modes: Ask and Code. Each is designed for a different type of developer workflow, and understanding the distinction is key to using the agent effectively.

The “Ask” mode: Our AI code analyst#

In the “Ask” mode, Codex acts as an analyst, not an implementer: it performs a read-only analysis of our codebase. It can explore the repository to understand its structure, dependencies, and logic, but it will not change any files. Because it doesn’t need to prepare an environment for running tests or applying changes, this mode is generally faster.

It’s the perfect choice for tasks needing insight, not implementation. Common use cases for the “Ask” mode include:

  • Codebase onboarding: Asking Codex to “Explain the purpose of the auth module” or “Summarize how the payment processing flow works.”

  • Architecture review: Requesting Codex to “Generate a Mermaid.js diagram of the full request flow for the primary API endpoint.”

  • Refactoring strategy: Brainstorming improvements by asking, “What are some ways we could refactor utils.js to be more modular and testable?”

Note: Setting realistic expectations for the “Ask” mode is important. While incredibly powerful for well-structured projects, its ability to analyze a codebase is proportional to the guidance it receives. Broad prompts may yield generic or incomplete answers in large, complex, poorly documented repositories. For best results, we should follow OpenAI's advice: narrow the focus of our prompts to specific files or modules, and use a detailed AGENTS.md file to provide essential context.

The “Code” mode: Our AI action taker#

As the name implies, the “Code” mode is for when we want Codex to actively write or modify our code. This is a more involved process in which the agent creates a full-fledged, interactive environment. It can run tests, execute linters, and validate its work using other tools defined in our setup scripts. The final output is not just text but a concrete set of code changes that can be turned directly into a pull request.

This is our go-to mode for delegating actionable tasks. It’s ideal for situations like:

  • Applying bug fixes: Providing a stack trace and instructing Codex to “Find and fix the bug in <my-package> causing this error.”

  • Writing unit tests: Pointing to a file and asking Codex to “Add comprehensive unit tests for the functions in calculations.js.”

  • Automating refactors: Instructing Codex to “Rename the legacyApi function to newApiV2 across the entire project.”

Now that we understand the ‘what’ and ‘how’ of interacting with Codex, the next logical question is: How do we ensure the code it generates meets our project’s specific standards? Let’s discuss how to tailor Codex’s behavior to our team's unique workflow.

Guiding OpenAI Codex with the AGENTS.md file#

While Codex can operate without any special configuration, its true power is unlocked when we provide context and project-specific rules. As with any engineering teammate, we get the best results when we give clear instructions and context. With Codex, we primarily achieve this through a special file called AGENTS.md. It is an optional but highly recommended “contributor’s guide” written for our AI teammate.

An AGENTS.md file is a standard Markdown file in our repository that gives Codex clear, explicit instructions on operating within our codebase. It serves as a bridge between our development team’s standards and the AI’s execution, ensuring the code it produces is not only functional but also consistent, high-quality, and easy to review.

When Codex begins a task, it automatically searches for AGENTS.md files relevant to the code it needs to modify. If it finds multiple files in nested directories, it will prioritize the instructions in the file “closest” to the changed code. This allows for both global, repository-wide rules and more specific, component-level guidance.
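One way to picture the “closest file wins” rule is as an upward walk through the directory tree. The sketch below is an assumption about how such a lookup could work, not Codex’s actual implementation; `find_closest_agents_file` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def find_closest_agents_file(changed_file: Path, repo_root: Path) -> Optional[Path]:
    """Walk upward from the changed file's directory and return the first
    AGENTS.md found, stopping at the repository root (hypothetical helper)."""
    repo_root = repo_root.resolve()
    current = changed_file.resolve().parent
    if current != repo_root and repo_root not in current.parents:
        raise ValueError(f"{changed_file} is not inside {repo_root}")
    while True:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            return candidate
        if current == repo_root:
            return None  # reached the root without finding a guide
        current = current.parent
```

Under this reading, a change to a file in `src/components` would pick up `src/components/AGENTS.md` if it exists and fall back to the repository-level AGENTS.md otherwise.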

What should we include in an AGENTS.md file?#

A comprehensive AGENTS.md file acts as a complete blueprint for the agent. Based on best practices, here are the key areas to document:

  • Project structure: Outline the key directories and their purposes. This helps Codex understand where to find relevant code and where it should (and should not) make changes.

  • Coding conventions: Define the team’s standards for languages, style, naming, and commenting. For example, specify whether to use TypeScript, how to name React components, or whether to use a particular CSS methodology like Tailwind CSS.

  • Testing protocols: Provide the exact commands needed to run our test suite. This enables Codex to validate its work before presenting it for review. We can specify commands for running all tests, specific files, or tests with coverage.

  • Pull request (PR) guidelines: Instruct Codex on how to format PR titles and descriptions, how to reference related issues, and whether to include screenshots for UI changes. This ensures its contributions align with our team’s workflow.

  • Programmatic checks: List any linting, type-checking, or build commands that must be run before code can be merged. Codex will attempt to run these checks to ensure its changes are safe and correct.

Here is an example of what a simple AGENTS.md file might look like:

# AI Contributor Guide: E-Commerce Platform
This guide helps our AI agents, including OpenAI Codex, understand how to contribute to our codebase effectively.
## 1. Project Structure Overview
Our project is a standard Next.js application.
- All primary source code is located in `/src`.
- Reusable React components are in `/src/components`.
- API route handlers are located in `/src/pages/api`.
- Do not modify files in the `/public` directory.
## 2. Coding Conventions
- **Language:** Please use TypeScript for all new `.ts` and `.tsx` files.
- **Style:** We use Prettier for code formatting. Please ensure your code follows this style.
- **Comments:** Add JSDoc comments to any new utility functions explaining their purpose, parameters, and return values.
## 3. Testing Requirements
We use Jest for unit testing. Before submitting a pull request, please ensure all changes are covered by tests.
- To run all tests, use the command: `npm test`
- To run tests for a specific file you've changed, use: `npm test -- <path/to/file>`
## 4. Pull Request (PR) Guidelines
When creating a PR, please follow these rules:
- **Title:** Use the format `feat: A brief description` for new features or `fix: A brief description` for bug fixes.
- **Description:** Link to the corresponding ticket number from our issue tracker (e.g., "Closes TICKET-123").
- **UI Changes:** If you make any changes to the UI, please include a screenshot in the PR description.
Example AGENTS.md file

By providing a detailed guide like this, we transform Codex from a generic tool into a specialized teammate who understands our project’s unique requirements. This leads to contributions that are faster to review and easier to merge.
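As a quick self-check before relying on such a guide, we could scan its headings for the areas recommended above. This is a rough sketch, not part of Codex: the keyword lists and the `missing_topics` helper are hypothetical, and heading matching by keyword is deliberately simplistic.

```python
import re

# Hypothetical keyword lists for the recommended areas; adjust to taste.
RECOMMENDED_TOPICS = {
    "project structure": ("structure",),
    "coding conventions": ("convention", "style"),
    "testing": ("test",),
    "pull requests": ("pull request", "pr guideline"),
    "programmatic checks": ("lint", "check", "build"),
}


def missing_topics(agents_md: str) -> list[str]:
    """Return the recommended topics that no Markdown heading appears to cover."""
    headings = [
        match.group(1).lower()
        for match in re.finditer(r"^#{1,6}\s+(.*)$", agents_md, flags=re.MULTILINE)
    ]
    missing = []
    for topic, keywords in RECOMMENDED_TOPICS.items():
        if not any(keyword in heading for heading in headings for keyword in keywords):
            missing.append(topic)
    return missing
```

Run against the headings of the example file above, this check would flag only the programmatic checks area, since that example has no section for linting, type-checking, or build commands.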

Now that we understand the theory behind the two modes and how to guide the agent, let’s test both modes with a practical, end-to-end example.

OpenAI Codex in action: A daily planner case study#

We’ve created a simple “My Daily Planner” application using Python and Flask to see Codex in action. The app allows users to add, complete, and delete tasks. Given below is the code for the application.

Flask==2.3.3
Werkzeug==2.3.7
The complete code for the “My Daily Planner” application

Here we can see the demo of the above application:

First, let’s imagine we are a new developer joining this project. Our initial goal isn’t to fix anything but simply to understand the existing codebase.

Understanding the codebase with the “Ask” mode#

We’ll ask Codex to perform a high-level analysis of our repository. We want it to act as a senior developer, giving us a project tour. Here is the prompt we will use in the Codex interface for the planner application. We will see how Codex breaks down our application.

Prompt: Please analyze this repository and explain its structure and core functionality. I need a summary that covers the main purpose of each file and how they work together.

After a few minutes, Codex returned with a remarkably detailed and accurate breakdown of the entire project.

This summary is an excellent starting point. It confirms the application’s features and highlights the key file, app.py, and the specific routes that handle the core logic. Now that we have a good understanding of the codebase, we can move on to fixing one of its known issues.

Fixing a bug with Code mode#

Our planner application has a logical flaw: it allows users to add duplicate tasks. If “Prepare for exam” is already on the list, a user can add it again, which shouldn’t be possible. This time, we will switch to the “Code” mode, as we want the agent to actively modify the files and provide a fix. Here’s the prompt we will use to see how Codex handles a direct instruction to find and fix a specific bug.

Prompt: There is a bug in the application that allows users to add duplicate tasks. Please modify the /add route in app.py to prevent this. Before inserting a new task, the code should check if a task with the exact same text already exists in the database. If it does, the code shouldn’t add the task and should display an appropriate message instead.

After a few minutes, Codex updated its interface to show us the proposed changes. It presents a clean “Diff” view, allowing us to review the exact lines of code it has added or modified to solve the problem.
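For readers who want a feel for the kind of check the prompt asks for, here is a minimal sketch using only the standard library. It is not the diff Codex produced: the `tasks` table, its `task_text` column, and the `add_task` and `make_demo_db` helpers are all hypothetical, and the real application’s schema may differ.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical tasks table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        "id INTEGER PRIMARY KEY, "
        "task_text TEXT NOT NULL, "
        "completed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_task(conn: sqlite3.Connection, text: str) -> bool:
    """Insert a task unless one with the exact same text already exists.

    Returns True if the task was inserted and False if it was a duplicate.
    """
    existing = conn.execute(
        "SELECT 1 FROM tasks WHERE task_text = ? LIMIT 1", (text,)
    ).fetchone()
    if existing is not None:
        return False  # the caller can show a "task already exists" message
    conn.execute("INSERT INTO tasks (task_text) VALUES (?)", (text,))
    conn.commit()
    return True


if __name__ == "__main__":
    db = make_demo_db()
    print(add_task(db, "Prepare for exam"))
    print(add_task(db, "Prepare for exam"))
```

In a Flask route, a False return value would typically translate into a flashed message and a redirect back to the task list. The comparison here is an exact, case-sensitive match, in line with the prompt’s “exact same text” wording.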

To address the problem of “opaque-box” AI, where we can’t see the process, Codex provides transparency through a dedicated “Logs” tab. Clicking on this tab reveals the step-by-step terminal commands the agent executed within its sandbox. We can see everything from cloning the repository and installing dependencies to running tests and encountering errors.

This is an invaluable feature for understanding how the agent reasoned about our request and for debugging the process if a task fails or produces an unexpected outcome. It builds trust by making the AI’s work auditable rather than opaque.

We will review the proposed changes and, once we’re satisfied, have Codex create the pull request. The final verification will happen after we merge the code and run the application locally.

Creating the pull request#

With the code change reviewed, we can now click the “Create PR” button. Codex will automatically create a new branch, commit the changes, and open a pull request in our GitHub repository. Based on our initial prompt, it even populates the title and description to save us the manual effort of writing boilerplate PR descriptions.

Codex is creating the pull request for fixing the duplicate task insertion issue

The task we delegated is complete. We have successfully used Codex to identify, understand, and fix a bug in our application without writing a single line of code ourselves. Below is the updated code for the “My Daily Planner” application, which doesn’t allow the insertion of duplicate tasks.

Flask==2.3.3
Werkzeug==2.3.7
The updated code for the “My Daily Planner” application, which fixes the issue of duplicate insertion

Here we can see the demo of the updated application:

Beyond the basics: Real-world considerations#

While our daily planner example shows an ideal workflow, it’s helpful to consider insights from practical use in more complex development environments. To get the most out of Codex, here are a few key points to keep in mind for advanced scenarios:

  • Navigating large monorepos: Codex can operate effectively within large monorepos, but its success depends on clear guidance. Vague prompts tend to produce poor results; our instructions must be precise, pointing to specific files or packages, and be supported by a well-defined AGENTS.md file to help the agent navigate the complexity.

  • Single-repository focus: The current workflow is designed around a single repository at a time. When we start a task, Codex clones one specific repo into its sandbox and works exclusively within that context. It cannot natively see or interact with code in other repositories (like other microservices), as this would require advanced, custom configurations.

  • Best for discrete tasks: The agent excels at well-defined, discrete tasks that can be completed in a single pass. For complex refactoring that might require several rounds of iterative changes on one feature branch, the current workflow, which is optimized for creating a new pull request per task, can sometimes be less fluid than working directly in an IDE.

The future of delegation: Trust, security, and a new workflow#

Our hands-on example showcases a powerful workflow and highlights the core principles needed for successful AI delegation: trust and guidance. This is where features like the AGENTS.md file become paramount, acting as a contract to ensure the AI adheres to our team's standards.

Furthermore, OpenAI’s security-first architecture is fundamental to this trust. By running every task in a fresh, isolated sandbox with network access disabled by default, the system is designed to protect our codebase, ensuring the agent only interacts with the code and tools we explicitly provide.

Final thoughts#

OpenAI Codex represents a significant step toward an asynchronous development model. It may not replace a senior developer for complex, architectural work, but it has already proven to be a powerful ally for clearing backlogs and handling the daily churn of software maintenance.

By delegating these routine tasks, we can reclaim our most valuable resources (time and focus) and apply them to the bigger challenges. If you’re ready to try AI delegation in your next sprint, start by identifying a small, well-defined bug in your backlog. Guide Codex with a clear prompt and an AGENTS.md file, and see how this new way of working can accelerate your development life cycle.

Written By:
Kamran Lodhi
