Available Models Overview
Codex CLI supports all models available on the OpenAI platform. As of 2026, the most commonly used options are:
| Model ID | Alias | Positioning | Context Window |
|---|---|---|---|
| codex-mini-latest | o4-mini (default) | Strong reasoning, moderate cost | 200K tokens |
| gpt-4.1 | GPT-4.1 | Fastest response, no reasoning overhead | 1M tokens |
| gpt-4.1-mini | GPT-4.1 Mini | Lowest cost, simple tasks | 1M tokens |
| o3 | o3 | Deepest reasoning, complex problems | 200K tokens |
| o4-mini-high | o4-mini High | Higher reasoning budget variant | 200K tokens |
The default model is codex-mini-latest (o4-mini). For most users, this is the right choice — don't change it unless you have a specific performance or cost need.
Performance & Cost Comparison
| Model | Code Quality | Reasoning | Speed | Input Cost | Output Cost |
|---|---|---|---|---|---|
| o4-mini (default) | ★★★★☆ | ★★★★★ | ★★★☆☆ | $1.1 / 1M | $4.4 / 1M |
| GPT-4.1 | ★★★★☆ | ★★★☆☆ | ★★★★★ | $2.0 / 1M | $8.0 / 1M |
| GPT-4.1 Mini | ★★★☆☆ | ★★☆☆☆ | ★★★★★ | $0.4 / 1M | $1.6 / 1M |
| o3 | ★★★★★ | ★★★★★ | ★★☆☆☆ | $10.0 / 1M | $40.0 / 1M |
| o4-mini-high | ★★★★★ | ★★★★★ | ★★★☆☆ | $1.1 / 1M | $4.4 / 1M |
Prices are approximate. See OpenAI's current pricing page for exact rates. Actual costs vary due to reasoning token overhead.
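To make the reasoning-token overhead concrete, here is a minimal sketch of a per-run cost estimate. It uses the approximate prices from the table above and assumes reasoning tokens are billed at the output rate; the function name `estimate_cost` and the token counts are hypothetical, chosen only for illustration.

```python
# Approximate prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "o4-mini": (1.1, 4.4),
    "gpt-4.1": (2.0, 8.0),
    "gpt-4.1-mini": (0.4, 1.6),
    "o3": (10.0, 40.0),
    "o4-mini-high": (1.1, 4.4),
}


def estimate_cost(model, input_tokens, output_tokens, reasoning_tokens=0):
    """Return the estimated USD cost of one run.

    Assumption: reasoning tokens are billed at the output rate, so they
    are added to the visible output tokens before pricing.
    """
    input_price, output_price = PRICES[model]
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * input_price + billed_output * output_price) / 1_000_000


# Hypothetical task: 50K tokens of context and 2K tokens of visible output.
# The reasoning model is assumed to spend 6K extra reasoning tokens.
print(round(estimate_cost("o4-mini", 50_000, 2_000, reasoning_tokens=6_000), 4))  # 0.0902
print(round(estimate_cost("gpt-4.1", 50_000, 2_000), 4))  # 0.116
```

With these made-up numbers o4-mini is still cheaper, but the gap narrows as the reasoning-token count grows.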
o4-mini (Default): Best All-Around Choice
Recommended for: Day-to-day Codex CLI usage, CI/CD tasks, multi-step refactoring, AGENTS.md-guided workflows.
o4-mini is the default Codex CLI model and the best choice for most users. As a reasoning model, it "thinks" through complex tasks before acting — making it meaningfully better than GPT-4.1 for:
- Multi-file refactoring (requires understanding inter-file dependencies)
- CI/CD automation tasks (requires planning execution steps)
- Debugging complex bugs (requires tracing root causes)
- Understanding large codebase structure (requires holistic reasoning)
At $1.1 input and $4.4 output per 1M tokens, and with excellent SWE-bench performance, o4-mini is the safe default. If you're unsure which model to use, stick with o4-mini.
$ codex # defaults to codex-mini-latest (o4-mini)
$ codex exec "analyze test failures and suggest fixes"
GPT-4.1: Speed and Economy
Recommended for: Quick code edits, large file processing, simple batch tasks, latency-sensitive scenarios.
GPT-4.1's standout advantages are the fastest response times and the largest context window (1M tokens). It is not a reasoning model: it generates output directly without an internal thinking step, which makes it faster and more economical for simple, well-defined tasks:
- Modifying a single function or method (clear, bounded scope)
- Adding comments, adjusting formatting (no reasoning needed)
- Processing very large code files (needs the 1M context window)
- Quick Q&A and code explanations
Note: GPT-4.1's input cost ($2.0 / 1M) is higher than o4-mini ($1.1 / 1M), but because it doesn't use reasoning tokens, total task cost can be similar or even lower for simple tasks.
$ codex --model gpt-4.1
$ codex exec --model gpt-4.1 "add JSDoc comments to all exported functions"
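The note on input cost can be turned into a rough break-even calculation. The sketch below uses the approximate prices quoted in this guide and assumes reasoning tokens are billed at the output rate; `break_even_reasoning_tokens` is a hypothetical helper, not part of Codex CLI.

```python
# Approximate prices in USD per 1M tokens, as quoted in this guide.
O4_MINI_INPUT, O4_MINI_OUTPUT = 1.1, 4.4
GPT41_INPUT, GPT41_OUTPUT = 2.0, 8.0


def break_even_reasoning_tokens(input_tokens, output_tokens):
    """Reasoning tokens at which o4-mini costs the same as GPT-4.1.

    o4-mini: 1.1*I + 4.4*(O + R)
    GPT-4.1: 2.0*I + 8.0*O
    Setting the two equal and solving for R gives the expression below.
    """
    input_saving = (GPT41_INPUT - O4_MINI_INPUT) * input_tokens
    output_saving = (GPT41_OUTPUT - O4_MINI_OUTPUT) * output_tokens
    return (input_saving + output_saving) / O4_MINI_OUTPUT


# Hypothetical small edit: 10K input tokens, 1K visible output tokens.
print(round(break_even_reasoning_tokens(10_000, 1_000)))  # 2864
```

Under these assumptions, GPT-4.1 becomes the cheaper option once o4-mini would spend more than roughly 2,900 reasoning tokens on the same task.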
GPT-4.1 Mini: Lowest Cost Option
Recommended for: Simple scripts, formatting, log analysis, cost-sensitive automation tasks.
GPT-4.1 Mini is the cheapest available model ($0.4 / 1M input) — great for batch operations that don't require complex reasoning. For tasks that need context understanding or decision-making, switch back to o4-mini.
$ codex exec --model gpt-4.1-mini "normalize all files to 2-space indentation"
o3: Maximum Reasoning Power, High Cost
Recommended for: Extremely complex architecture redesigns, cross-module dependency analysis, deep security vulnerability audits.
o3 is one of OpenAI's most powerful reasoning models, but also the most expensive ($10 / 1M input — roughly 9× o4-mini). Standard Codex CLI tasks don't require o3 — reserve it for scenarios where o4-mini demonstrably struggles with extreme complexity.
Cost warning: o3 on a large codebase task can cost $1–5 per run. Always test on a small scope before running o3 in CI/CD or batch workflows.
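One way to choose a "small scope" is to work backwards from a budget. This is a rough sketch using the approximate o3 prices from this guide; `max_input_tokens` is a hypothetical helper, and the expected output size (which should include reasoning tokens) is something you would have to estimate yourself.

```python
O3_INPUT_PRICE = 10.0   # USD per 1M input tokens (approximate)
O3_OUTPUT_PRICE = 40.0  # USD per 1M output tokens (approximate)


def max_input_tokens(budget_usd, expected_output_tokens):
    """Largest input size that keeps one o3 run within budget_usd.

    expected_output_tokens should include any reasoning tokens you
    expect the model to spend, since those are assumed to be billed
    at the output rate.
    """
    output_cost = expected_output_tokens * O3_OUTPUT_PRICE / 1_000_000
    remaining = budget_usd - output_cost
    if remaining <= 0:
        return 0
    return int(remaining * 1_000_000 / O3_INPUT_PRICE)


# With a $1.00 cap and about 5K output tokens, the input budget is about 80,000 tokens.
print(max_input_tokens(1.00, 5_000))
```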
Model Selection by Task Type
| Task Type | Recommended Model | Reason |
|---|---|---|
| CI/CD automation (GitHub Actions) | o4-mini | Reasoning + planning, cost-controlled |
| Multi-file refactoring | o4-mini | Needs to understand cross-file dependencies |
| Single function edits / small tasks | GPT-4.1 | Faster, no reasoning overhead needed |
| Batch formatting / comments | GPT-4.1 Mini | Cheapest, task is simple and mechanical |
| Very large files (>200K tokens) | GPT-4.1 | Exceeds the 200K window of the reasoning models; needs the 1M context window |
| Complex bug debugging | o4-mini or o4-mini-high | Reasoning models better at root-cause tracing |
| Architecture redesign / ultra-complex | o3 | Maximum reasoning; only when necessary |
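The table above can be encoded as a simple lookup. The following is an illustrative sketch only: the task-type keys and the `pick_model` function are hypothetical, the model IDs and context windows are the ones listed earlier in this guide, and the fallback to gpt-4.1 for oversized prompts is an assumption about how you might want to route.

```python
# Hypothetical task labels mapped to the model IDs recommended in the table.
RECOMMENDED = {
    "ci_cd": "codex-mini-latest",
    "multi_file_refactor": "codex-mini-latest",
    "small_edit": "gpt-4.1",
    "batch_formatting": "gpt-4.1-mini",
    "large_file": "gpt-4.1",
    "complex_debugging": "o4-mini-high",
    "architecture_redesign": "o3",
}

# Context windows in tokens, from the overview table in this guide.
CONTEXT_WINDOW = {
    "codex-mini-latest": 200_000,
    "o4-mini-high": 200_000,
    "o3": 200_000,
    "gpt-4.1": 1_000_000,
    "gpt-4.1-mini": 1_000_000,
}


def pick_model(task_type, prompt_tokens=0):
    """Return a model ID for the task, falling back to the default.

    If the prompt would not fit the recommended model's context window,
    switch to gpt-4.1, which has the 1M-token window.
    """
    model = RECOMMENDED.get(task_type, "codex-mini-latest")
    if prompt_tokens > CONTEXT_WINDOW[model]:
        model = "gpt-4.1"
    return model


print(pick_model("batch_formatting"))              # gpt-4.1-mini
print(pick_model("multi_file_refactor", 600_000))  # gpt-4.1
```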
How to Configure Your Model
Method 1: config.toml (Persistent Default)
Set a default model in ~/.codex/config.toml — applies to every Codex session:
model = "gpt-4.1"
# Options: codex-mini-latest, gpt-4.1, gpt-4.1-mini, o3, o4-mini-high
Method 2: --model Flag (One-Time Override)
Override the model for a single invocation without touching global config:
# Interactive mode
$ codex --model gpt-4.1
# Non-interactive mode (codex exec)
$ codex exec --model o4-mini-high "refactor the auth module"
# CI/CD via environment variable
$ CODEX_MODEL=gpt-4.1-mini codex exec "normalize code style"
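The relationship between Methods 1 and 2 can be modeled as a small precedence rule. This sketch is an assumption based on the descriptions above (a one-time `--model` override beats the persistent config.toml default, which beats the built-in default); it does not reflect Codex CLI's actual implementation, and the environment variable is left out because this guide does not state where it ranks.

```python
DEFAULT_MODEL = "codex-mini-latest"  # built-in default named in this guide


def resolve_model(cli_model=None, config_model=None):
    """Return the model a session would use under the assumed precedence.

    cli_model:    value passed via --model, if any
    config_model: value of `model` in ~/.codex/config.toml, if any
    """
    if cli_model:
        return cli_model
    if config_model:
        return config_model
    return DEFAULT_MODEL


print(resolve_model())                                          # codex-mini-latest
print(resolve_model(config_model="gpt-4.1"))                    # gpt-4.1
print(resolve_model(cli_model="o3", config_model="gpt-4.1"))    # o3
```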
Method 3: AGENTS.md Suggestion (Project-Level)
Suggest a model in your project's AGENTS.md so Codex uses the right configuration when working on that project:
# Project-level model preference (advisory, can be overridden by --model)
preferred_model: o4-mini-high
## Project context
This is a financial compliance system with very high accuracy requirements.
Prefer models with stronger reasoning capability.
For more on writing effective AGENTS.md files, see: AGENTS.md Guide.
Cost Optimization Tips
Task-Tiered Model Strategy
Assigning different models to different task types is the most direct way to reduce costs. A typical CI/CD pipeline might look like this:
# Simple tasks: GPT-4.1 Mini (cheapest)
codex exec --model gpt-4.1-mini "check code style and list non-compliant files"
# Medium tasks: o4-mini (default, best value)
codex exec "analyze test failures and suggest root cause fixes"
# Complex tasks: o4-mini-high (deeper reasoning needed)
codex exec --model o4-mini-high "refactor auth module while preserving backward compatibility"
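If the pipeline is driven from a script, the tiering can live in one place. The sketch below only builds the command lines shown above and does not run them; the tier names and `build_command` are hypothetical, and the only CLI syntax used is `codex exec` with the `--model` flag from this guide.

```python
import shlex

# Hypothetical tier names mapped to model IDs; None means "use the default model".
TIERS = {
    "simple": "gpt-4.1-mini",
    "medium": None,
    "complex": "o4-mini-high",
}


def build_command(tier, prompt):
    """Return the argv list for a `codex exec` call in the given tier."""
    argv = ["codex", "exec"]
    model = TIERS[tier]
    if model is not None:
        argv += ["--model", model]
    argv.append(prompt)
    return argv


# Print the command as a shell-safe string for inspection.
print(shlex.join(build_command("simple", "check code style")))
# codex exec --model gpt-4.1-mini 'check code style'
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems in prompts.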
Use --quiet to Reduce Output Tokens
In CI/CD and non-interactive scenarios, --quiet suppresses explanatory prose and reduces output token consumption:
$ codex exec --quiet "fix all lint errors" # skips explanatory text
Set max_tokens for Short-Output Tasks
For tasks where you only need brief output, cap the maximum output tokens in config.toml:
model = "gpt-4.1-mini"
max_tokens = 4096 # 4096 is plenty for most simple tasks
For more cost strategies, see: Pricing & Billing Guide.
Frequently Asked Questions
What model does Codex CLI use by default?
The default is codex-mini-latest, which maps to o4-mini. It's the best balance of reasoning capability, cost, and speed for most tasks.
Should I use o4-mini or GPT-4.1 with Codex CLI?
It depends on task complexity. o4-mini is better for complex reasoning tasks (refactoring, CI/CD, debugging). GPT-4.1 is faster, and often cheaper overall, for simple, clearly scoped tasks (formatting, single-function edits). For most Codex users, o4-mini is the better default.
How do I switch models in Codex CLI?
Two ways: (1) Set a persistent default in ~/.codex/config.toml with model = "gpt-4.1"; (2) Use the --model flag for a one-time override: codex --model gpt-4.1 or codex exec --model o3 "...".
When will GPT-5 be available in Codex CLI?
OpenAI continuously updates available models. Once GPT-5 series models are available via API, you can use them directly with Codex CLI's --model flag. Check the OpenAI API page for the latest model list.
Does the subscription version use the same models as API billing?
The subscription version (Codex in ChatGPT Plus/Pro) uses models managed by OpenAI, typically with stricter usage limits. The API billing version lets you freely switch models and precisely control costs — more flexible but requires managing your own budget.