Codex CLI Model Guide 2026 — o4-mini vs GPT-4.1 vs o3 Compared

Available Models Overview

Codex CLI supports all models available on the OpenAI platform. As of 2026, the most commonly used options are:

Model ID	Alias	Positioning	Context Window
`codex-mini-latest`	o4-mini (default)	Strong reasoning, moderate cost	200K tokens
`gpt-4.1`	GPT-4.1	Fastest response, no reasoning-token overhead	1M tokens
`gpt-4.1-mini`	GPT-4.1 Mini	Lowest cost, simple tasks	1M tokens
`o3`	o3	Deepest reasoning, complex problems	200K tokens
`o4-mini-high`	o4-mini High	Higher reasoning budget variant	200K tokens

The default model is codex-mini-latest (o4-mini). For most users, this is the right choice — don't change it unless you have a specific performance or cost need.

Performance & Cost Comparison

Model	Code Quality	Reasoning	Speed	Input Cost	Output Cost
o4-mini (default)	★★★★☆	★★★★★	★★★☆☆	$1.1 / 1M	$4.4 / 1M
GPT-4.1	★★★★☆	★★★☆☆	★★★★★	$2.0 / 1M	$8.0 / 1M
GPT-4.1 Mini	★★★☆☆	★★☆☆☆	★★★★★	$0.4 / 1M	$1.6 / 1M
o3	★★★★★	★★★★★	★★☆☆☆	$10.0 / 1M	$40.0 / 1M
o4-mini-high	★★★★★	★★★★★	★★★☆☆	$1.1 / 1M	$4.4 / 1M

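Prices are approximate and quoted in USD per 1M tokens. See OpenAI's current pricing page for exact rates. Reasoning models also bill their internal reasoning tokens as output, so the actual cost of a task is higher than its visible output suggests.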
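As a rough illustration of how per-1M rates translate into the cost of a single run, here is a minimal shell sketch. The token counts are hypothetical, the rates are the approximate figures from the table, and the function name estimate_cost is made up for this example. It assumes reasoning tokens are billed at the output rate, so for reasoning models they belong in the output count.

```shell
# Estimate the cost of one run in USD.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_PRICE_PER_1M OUTPUT_PRICE_PER_1M
# Assumption: for reasoning models, OUTPUT_TOKENS includes reasoning tokens.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Hypothetical run: 50,000 input tokens and 8,000 output tokens.
estimate_cost 50000 8000 1.1 4.4   # o4-mini rates from the table
estimate_cost 50000 8000 2.0 8.0   # GPT-4.1 rates from the table
```

With these inputs the two calls print 0.0902 and 0.1640. Treat the result as an order-of-magnitude estimate only, since the real token counts are not known until the run finishes.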

o4-mini (Default): Best All-Around Choice

Recommended for: Day-to-day Codex CLI usage, CI/CD tasks, multi-step refactoring, AGENTS.md-guided workflows.

o4-mini is the default Codex CLI model and the best choice for most users. As a reasoning model, it "thinks" through complex tasks before acting — making it meaningfully better than GPT-4.1 for:

Multi-file refactoring (requires understanding inter-file dependencies)
CI/CD automation tasks (requires planning execution steps)
Debugging complex bugs (requires tracing root causes)
Understanding large codebase structure (requires holistic reasoning)

At $1.1/$4.4 per 1M tokens with excellent SWE-bench performance, o4-mini is the safe default. If you're unsure which model to use, stick with o4-mini.

Using the default model

$ codex            # defaults to codex-mini-latest (o4-mini)
$ codex exec "analyze test failures and suggest fixes"

GPT-4.1: Speed and Economy

Recommended for: Quick code edits, large file processing, simple batch tasks, latency-sensitive scenarios.

GPT-4.1's standout advantages are fast responses and a 1M-token context window. It is not a reasoning model: it generates output directly, without an internal thinking step, which makes it faster and more economical for simple, well-defined tasks:

Modifying a single function or method (clear, bounded scope)
Adding comments, adjusting formatting (no reasoning needed)
Processing very large code files (needs the 1M context window)
Quick Q&A and code explanations

Note: GPT-4.1's per-token rates ($2.0 input / $8.0 output per 1M) are higher than o4-mini's ($1.1 / $4.4), but because it produces no reasoning tokens, which are billed as output, the total cost of a simple task can be similar or even lower.
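To make that trade-off concrete, the sketch below computes the number of reasoning tokens at which an o4-mini run costs the same as a GPT-4.1 run with identical visible input and output. It uses the approximate rates from the comparison table and assumes reasoning tokens are billed at the output rate. The function name and the token counts are hypothetical.

```shell
# Reasoning-token count at which o4-mini and GPT-4.1 cost the same
# for a run with the same visible input and output tokens.
# Usage: breakeven_reasoning_tokens INPUT_TOKENS OUTPUT_TOKENS
breakeven_reasoning_tokens() {
  awk -v in_tok="$1" -v out_tok="$2" 'BEGIN {
    # Approximate table rates in USD per 1M tokens.
    o4_in = 1.1; o4_out = 4.4; g41_in = 2.0; g41_out = 8.0
    # Extra amount GPT-4.1 charges for the visible tokens (scaled by 1M).
    extra = in_tok * (g41_in - o4_in) + out_tok * (g41_out - o4_out)
    # o4-mini spends that difference on reasoning tokens at its output rate.
    printf "%d\n", int(extra / o4_out)
  }'
}

# Hypothetical small edit: 20,000 input tokens, 1,000 output tokens.
breakeven_reasoning_tokens 20000 1000
```

For this example the function prints 4909: if o4-mini spends more than roughly 4,900 reasoning tokens on the task, GPT-4.1 comes out cheaper, and below that o4-mini does.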

Using GPT-4.1

$ codex --model gpt-4.1
$ codex exec --model gpt-4.1 "add JSDoc comments to all exported functions"

GPT-4.1 Mini: Lowest Cost Option

Recommended for: Simple scripts, formatting, log analysis, cost-sensitive automation tasks.

GPT-4.1 Mini is the cheapest model in this guide ($0.4 / 1M input, $1.6 / 1M output), which makes it a good fit for batch operations that don't require complex reasoning. For tasks that need context understanding or decision-making, switch back to o4-mini.

Using GPT-4.1 Mini

$ codex exec --model gpt-4.1-mini "normalize all files to 2-space indentation"

o3: Maximum Reasoning Power, High Cost

Recommended for: Extremely complex architecture redesigns, cross-module dependency analysis, deep security vulnerability audits.

o3 is one of OpenAI's most powerful reasoning models, but also the most expensive ($10 / 1M input — roughly 9× o4-mini). Standard Codex CLI tasks don't require o3 — reserve it for scenarios where o4-mini demonstrably struggles with extreme complexity.

Cost warning: a single o3 run on a large codebase task can cost $1–5. Always test on a small scope before running o3 in CI/CD or batch workflows.
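One way to act on that warning is to gate o3 runs behind a worst-case cost check. The sketch below is a hypothetical helper, not a Codex CLI feature: it takes upper bounds on input and output tokens that you choose yourself, applies the approximate o3 rates quoted above, and succeeds only if the worst case fits the budget.

```shell
# Succeed only when the worst-case o3 cost stays within a budget.
# Usage: within_budget MAX_INPUT_TOKENS MAX_OUTPUT_TOKENS BUDGET_USD
# Rates are the approximate o3 figures from this guide ($10 / $40 per 1M tokens).
within_budget() {
  awk -v in_tok="$1" -v out_tok="$2" -v budget="$3" 'BEGIN {
    cost = (in_tok * 10.0 + out_tok * 40.0) / 1000000
    if (cost > budget) exit 1
    exit 0
  }'
}

# Hypothetical limits: 200,000 input tokens, 50,000 output tokens, $5 budget.
if within_budget 200000 50000 5; then
  echo "ok: worst case fits the budget"
else
  echo "skip: worst case exceeds the budget"
fi
```

The worst case here is $4.00, so the check passes. In a pipeline, the failing branch could fall back to a cheaper model instead of skipping the task.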

Model Selection by Task Type

Task Type	Recommended Model	Reason
CI/CD automation (GitHub Actions)	o4-mini	Reasoning + planning, cost-controlled
Multi-file refactoring	o4-mini	Needs to understand cross-file dependencies
Single function edits / small tasks	GPT-4.1	Faster, no reasoning overhead needed
Batch formatting / comments	GPT-4.1 Mini	Cheapest, task is simple and mechanical
Very large files (>200K tokens)	GPT-4.1	Exceeds the 200K window of the reasoning models; 1M context window required
Complex bug debugging	o4-mini or o4-mini-high	Reasoning models better at root-cause tracing
Architecture redesign / ultra-complex	o3	Maximum reasoning; only when necessary
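The table can be encoded as a small lookup so that scripts pick a model consistently. This is only a sketch: the category labels (ci, refactor, and so on) and the function name pick_model are invented for the example, and the final command is printed rather than executed.

```shell
# Map a task category to a model ID, following the table above.
# The category labels are hypothetical names chosen for this sketch.
pick_model() {
  case "$1" in
    ci|refactor)           echo "codex-mini-latest" ;;  # o4-mini (default)
    small-edit|large-file) echo "gpt-4.1" ;;
    batch-format)          echo "gpt-4.1-mini" ;;
    debug)                 echo "o4-mini-high" ;;       # the table also allows o4-mini
    architecture)          echo "o3" ;;
    *)                     echo "codex-mini-latest" ;;  # unknown category: use the default
  esac
}

model="$(pick_model batch-format)"
echo "codex exec --model $model \"normalize code style\""
```

Unknown categories fall back to the default model, which matches the guide's advice to stay on o4-mini when unsure.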

How to Configure Your Model

Method 1: config.toml (Persistent Default)

Set a default model in ~/.codex/config.toml — applies to every Codex session:

~/.codex/config.toml

model = "gpt-4.1"
# Options: codex-mini-latest, gpt-4.1, gpt-4.1-mini, o3, o4-mini-high
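To check which persistent default is currently set, a quick shell helper can read the model line back out of the file. This is a naive line match rather than a real TOML parser, the function name configured_model is hypothetical, and it assumes the key is written on one line as model = "...".

```shell
# Print the first model = "..." value found in a Codex config file.
# Prints nothing if the file or the key is missing.
# Usage: configured_model [PATH]   (defaults to ~/.codex/config.toml)
configured_model() {
  [ -f "${1:-$HOME/.codex/config.toml}" ] || return 0
  sed -n 's/^[[:space:]]*model[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' \
    "${1:-$HOME/.codex/config.toml}" | head -n 1
}

current="$(configured_model)"
echo "configured model: ${current:-none set, the CLI default applies}"
```

Because it matches any line that starts with model =, it would also pick up a model key inside a TOML table, so treat the output as a convenience check only.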

Method 2: --model Flag (One-Time Override)

Override the model for a single invocation without touching global config:

Command-line model selection

# Interactive mode
$ codex --model gpt-4.1

# Non-interactive mode (codex exec)
$ codex exec --model o4-mini-high "refactor the auth module"

# CI/CD via environment variable (assumes your Codex CLI version reads CODEX_MODEL; confirm with codex --help)
$ CODEX_MODEL=gpt-4.1-mini codex exec "normalize code style"

Method 3: AGENTS.md Suggestion (Project-Level)

Suggest a model in your project's AGENTS.md so Codex uses the right configuration when working on that project:

AGENTS.md example

# Project-level model preference (advisory, can be overridden by --model)
preferred_model: o4-mini-high

## Project context
This is a financial compliance system with very high accuracy requirements.
Prefer models with stronger reasoning capability.

For more on writing effective AGENTS.md files, see: AGENTS.md Guide.

Cost Optimization Tips

Task-Tiered Model Strategy

Assigning different models to different task types is the most direct way to reduce costs. A typical CI/CD pipeline might look like this:

GitHub Actions tiered model example

# Simple tasks: GPT-4.1 Mini (cheapest)
codex exec --model gpt-4.1-mini "check code style and list non-compliant files"

# Medium tasks: o4-mini (default, best value)
codex exec "analyze test failures and suggest root cause fixes"

# Complex tasks: o4-mini-high (deeper reasoning needed)
codex exec --model o4-mini-high "refactor auth module while preserving backward compatibility"
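The tiering above can be wrapped in a single helper so each pipeline step only names its tier. The sketch below is one possible shape under stated assumptions: run_tiered and the DRY_RUN variable are hypothetical names, and by default the helper only prints the command it would run, so it can be tried without spending tokens.

```shell
# Build one codex exec command per task, choosing the model by tier.
# DRY_RUN defaults to 1, so commands are printed, not executed.
# Usage: run_tiered TIER PROMPT    (TIER: simple | medium | complex)
run_tiered() {
  tier="$1"; prompt="$2"
  case "$tier" in
    simple)  set -- codex exec --model gpt-4.1-mini "$prompt" ;;
    complex) set -- codex exec --model o4-mini-high "$prompt" ;;
    *)       set -- codex exec "$prompt" ;;   # medium: default model
  esac
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

run_tiered simple  "check code style and list non-compliant files"
run_tiered medium  "analyze test failures and suggest root cause fixes"
run_tiered complex "refactor auth module while preserving backward compatibility"
```

Setting DRY_RUN=0 in the job environment makes the same calls execute for real. Keeping the dry run as the default is a deliberate choice, so a misconfigured pipeline prints commands instead of incurring cost.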

Use --quiet to Reduce Output Tokens

In CI/CD and non-interactive scenarios, --quiet suppresses explanatory prose and reduces output token consumption:

Reducing output overhead

$ codex exec --quiet "fix all lint errors"  # skips explanatory text

Set max_tokens for Short-Output Tasks

For tasks where you only need brief output, cap the maximum output tokens in config.toml:

~/.codex/config.toml

model = "gpt-4.1-mini"
max_tokens = 4096  # 4096 is plenty for most simple tasks

For more cost strategies, see: Pricing & Billing Guide.

🦙 Want to eliminate API costs entirely? Codex CLI supports local models via Ollama (Qwen2.5-Coder, DeepSeek-Coder, etc.) for zero-cost, offline coding. See the Ollama Local Model Setup Guide.

Frequently Asked Questions

What model does Codex CLI use by default?

The default is codex-mini-latest, which maps to o4-mini. It's the best balance of reasoning capability, cost, and speed for most tasks.

Should I use o4-mini or GPT-4.1 with Codex CLI?

It depends on task complexity. o4-mini is better for complex reasoning tasks (refactoring, CI/CD, debugging). GPT-4.1 is faster, and often cheaper overall, for simple, clearly scoped tasks (formatting, single function edits). For most Codex users, o4-mini is the better default.

How do I switch models in Codex CLI?

Two ways: (1) Set a persistent default in ~/.codex/config.toml with model = "gpt-4.1"; (2) Use the --model flag for a one-time override: codex --model gpt-4.1 or codex exec --model o3 "...".

When will GPT-5 be available in Codex CLI?

OpenAI continuously updates available models. Once GPT-5 series models are available via API, you can use them directly with Codex CLI's --model flag. Check the OpenAI API page for the latest model list.

Does the subscription version use the same models as API billing?

The subscription version (Codex in ChatGPT Plus/Pro) uses models managed by OpenAI, typically with stricter usage limits. The API billing version lets you freely switch models and precisely control costs — more flexible but requires managing your own budget.
