The Cheapest OpenAI Model That Still Does the Job
GPT-5 Nano at $0.05/$0.40 per million tokens handles classification and extraction; Mini covers most production text. The decision table, with a worked 94x cost spread.
The cheapest OpenAI model is GPT-5 Nano: $0.05 per million input tokens and $0.40 per million output as of June 2026. The more useful answer is the cheapest model per task type: Nano for classification and extraction, GPT-5 Mini for most production text, GPT-5 and up only where judgment is the product. On the identical classification job worked through below, the spread between the cheapest and the most expensive current model is roughly 94x, so this choice outweighs every prompt-level optimization you will ever ship.
The price ladder
June 2026 API prices per million tokens (live numbers at openai.com/api/pricing):
| Model | Input | Output | Sweet spot |
|---|---|---|---|
| GPT-5 Nano | $0.05 | $0.40 | Classification, extraction, routing, dedup |
| GPT-5 Mini | $0.25 | $2.00 | Summaries, support drafts, structured rewrites |
| o4-mini | $0.55 | $2.20 | Budget reasoning: math checks, code triage |
| GPT-5 | $1.25 | $10.00 | Default for customer-facing generation |
| GPT-5.4 | $2.50 | $15.00 | Harder reasoning, review passes |
| GPT-5.5 | $5.00 | $30.00 | Agent planners, frontier-quality judgment |
GPT-5 also offers cached input at $0.125 per million, which matters when a long system prompt repeats across calls. Per-model workload math for the top tier is worked through in GPT-5.5 API cost: per-token prices and real workload math, and the full pricing mechanics (caching, batch, gotchas) live in OpenAI API pricing explained.
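To see why the cached rate matters, here is a minimal sketch of the input-side bill for calls that share one long system prompt. The two GPT-5 rates come from the text above. The 2,000-token prefix, the call count, and the simplification that every call's prefix bills at the cached rate are illustrative assumptions, not documented behavior.

```python
GPT5_INPUT = 1.25          # $ per million uncached input tokens (from the table)
GPT5_CACHED_INPUT = 0.125  # $ per million cached input tokens (from the text)


def input_cost(calls, prefix_tokens, unique_tokens, cached=True):
    """Input-side dollars for `calls` requests that share one prompt prefix.

    Simplifying assumption: with caching on, the shared prefix bills at the
    cached rate on every call; the per-call unique tokens never do.
    """
    prefix_rate = GPT5_CACHED_INPUT if cached else GPT5_INPUT
    per_call = prefix_tokens * prefix_rate + unique_tokens * GPT5_INPUT
    return calls * per_call / 1_000_000


# Hypothetical workload: 100,000 calls, 2,000-token system prompt, 500 unique tokens.
uncached = input_cost(100_000, 2_000, 500, cached=False)
with_cache = input_cost(100_000, 2_000, 500, cached=True)
print(f"uncached: ${uncached:,.2f}  cached: ${with_cache:,.2f}")
```

Under those assumptions the input bill falls from $312.50 to $87.50. The longer the shared prefix is relative to the unique part of each call, the bigger the gap.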
The same job at four prices
Classify 100,000 support tickets, 500 input and 20 output tokens each: 50M input, 2M output tokens total.
| Model | Input cost | Output cost | Total job cost |
|---|---|---|---|
| GPT-5 Nano | $2.50 | $0.80 | $3.30 |
| GPT-5 Mini | $12.50 | $4.00 | $16.50 |
| GPT-5 | $62.50 | $20.00 | $82.50 |
| GPT-5.5 | $250.00 | $60.00 | $310.00 |
Checking one row: 50M input x $0.05/1M = $2.50, plus 2M output x $0.40/1M = $0.80, so the whole job costs $3.30 on Nano. On the same classification job, the spread between GPT-5 Nano and GPT-5.5 is roughly 94x. If a team is running tickets through GPT-5.5 “to be safe”, they are paying $306.70 per hundred thousand tickets for safety nobody measured.
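The whole table can be reproduced with a few lines of arithmetic. This is a minimal sketch: the prices are copied from the ladder above, and `job_cost` is a name chosen for this example, not part of any SDK.

```python
# (input, output) dollars per million tokens, copied from the price ladder.
PRICES = {
    "GPT-5 Nano": (0.05, 0.40),
    "GPT-5 Mini": (0.25, 2.00),
    "GPT-5": (1.25, 10.00),
    "GPT-5.5": (5.00, 30.00),
}


def job_cost(model, n_requests, in_tokens, out_tokens):
    """Dollar cost of n_requests calls at fixed per-call token counts."""
    in_price, out_price = PRICES[model]
    total_in = n_requests * in_tokens
    total_out = n_requests * out_tokens
    return (total_in * in_price + total_out * out_price) / 1_000_000


# 100,000 tickets, 500 input and 20 output tokens each.
costs = {model: job_cost(model, 100_000, 500, 20) for model in PRICES}
for model, cost in costs.items():
    print(f"{model:<11} ${cost:,.2f}")

spread = costs["GPT-5.5"] / costs["GPT-5 Nano"]
print(f"spread: {spread:.0f}x")
```

Swapping in your own request count and token sizes gives the same comparison for your workload. The spread shifts with the input/output mix, because the per-token ratio between GPT-5.5 and Nano is 100x on input and 75x on output.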
How to find your floor
The method beats the folklore: write an eval set of 100 to 200 real examples with expected outputs before touching model names. Then downgrade until the evals fail, and step back up one tier. Task-type starting points:
- Nano: single-label decisions, JSON extraction from consistent formats, language detection, spam filtering. Wrong for anything open-ended.
- Mini: summaries under a page, templated support replies, title generation, rewrites with clear instructions.
- o4-mini: the budget reasoning slot. It thinks before answering, and those reasoning tokens bill as output, so its effective cost runs above the sticker for hard problems. Good for math validation and code-review triage.
- GPT-5: the production default when output quality is customer-visible.
- GPT-5.4 / GPT-5.5: planning steps in agents, final review passes, work where a wrong answer costs more than the model does.
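The downgrade-until-failure method can be sketched as a short loop. Everything here is illustrative: `passes_evals` stands in for whatever runs your eval set against a model and compares the pass rate to your bar, and the ladder is the price table in cheapest-first order.

```python
from typing import Callable, Optional, Sequence

# Cheapest first, in the order of the price ladder above.
LADDER = ["GPT-5 Nano", "GPT-5 Mini", "o4-mini", "GPT-5", "GPT-5.4", "GPT-5.5"]


def find_floor(
    ladder: Sequence[str],
    passes_evals: Callable[[str], bool],
) -> Optional[str]:
    """Walk down from the most expensive model and stop at the first failure.

    Returns the last model that passed (the "step back up one tier" result),
    or None if even the top model fails the evals.
    """
    floor = None
    for model in reversed(ladder):
        if not passes_evals(model):
            break
        floor = model
    return floor


# Stand-in eval results, made up purely to exercise the loop.
made_up_passing = {"GPT-5 Mini", "o4-mini", "GPT-5", "GPT-5.4", "GPT-5.5"}
print(find_floor(LADDER, lambda model: model in made_up_passing))
```

In practice `passes_evals` would call the model on each of the 100 to 200 examples and check the outputs against the expected ones. The loop stops at the first failing tier, so it never spends money evaluating models below the floor.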
Cascades beat single-model choices
Production systems rarely need one model; they need a cheap default and an escalation path. Route everything to Nano with a confidence check, then escalate the uncertain cases (15% in this example) to GPT-5:
100,000 tickets through Nano = $3.30
15,000 escalations through GPT-5
(7.5M in x $1.25 + 0.3M out x $10)/1M = $12.38
total = $15.68
That is 81% below the $82.50 all-GPT-5 bill, with GPT-5 applied only to the tickets Nano is unsure about. The confidence check can be as simple as asking Nano to emit a certainty field and escalating anything below a threshold.
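Here is a minimal sketch of both halves of the cascade: the cost arithmetic and the routing rule. The prices come from the ladder above. The 0.8 threshold and the `cheap_classify` and `strong_classify` callables are hypothetical placeholders for your own Nano and GPT-5 calls, and a model's self-reported certainty is assumed to be a usable signal, which is worth checking against your evals.

```python
# (input, output) dollars per million tokens, from the price ladder.
PRICES = {"GPT-5 Nano": (0.05, 0.40), "GPT-5": (1.25, 10.00)}


def job_cost(model, n_requests, in_tokens, out_tokens):
    """Dollar cost of n_requests calls at fixed per-call token counts."""
    in_price, out_price = PRICES[model]
    return n_requests * (in_tokens * in_price + out_tokens * out_price) / 1_000_000


def cascade_cost(n_requests, in_tokens, out_tokens, escalation_rate):
    """Every request hits Nano; the escalated share is also sent to GPT-5."""
    escalated = round(n_requests * escalation_rate)
    return (job_cost("GPT-5 Nano", n_requests, in_tokens, out_tokens)
            + job_cost("GPT-5", escalated, in_tokens, out_tokens))


def route(ticket, cheap_classify, strong_classify, threshold=0.8):
    """Keep the cheap answer when its certainty clears the threshold.

    cheap_classify(ticket) -> (label, certainty between 0 and 1)
    strong_classify(ticket) -> label
    """
    label, certainty = cheap_classify(ticket)
    if certainty >= threshold:
        return label, "cheap"
    return strong_classify(ticket), "escalated"


total = cascade_cost(100_000, 500, 20, escalation_rate=0.15)
all_gpt5 = job_cost("GPT-5", 100_000, 500, 20)
print(f"cascade: ${total:.2f}, saving {1 - total / all_gpt5:.0%} vs all-GPT-5")
```

The escalation rate is the number to measure first. Because escalated tickets are billed on both models, the cascade only saves money while the rate stays well below 100%.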
Two levers that change the answer
Batch halves everything offline. The Batch API runs at a 50% discount for results within 24 hours. The Nano cascade above drops from $15.68 to about $7.84 if the job can wait overnight; the mechanics and fit are in the Batch API: when 50% off is worth the wait.
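The batch arithmetic is a single multiplication. This small sketch assumes both the Nano pass and the GPT-5 escalations go through batch. The escalations depend on the Nano results, so the two passes run one after the other and their waits can stack.

```python
BATCH_DISCOUNT = 0.50  # 50% off input and output tokens, per the text


def batch_cost(sync_cost):
    """Cost of the same job when every call goes through the Batch API."""
    return sync_cost * (1 - BATCH_DISCOUNT)


cascade_sync = 15.68  # Nano-plus-GPT-5 cascade total from the section above
print(f"cascade via batch: ${batch_cost(cascade_sync):.2f}")
```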
Flat capacity makes the question moot for bulk work. Model-shaving exists because every token is metered. Work routed through a subscription-backed lane bills against a flat ChatGPT plan instead, so the per-token spread stops mattering for whatever the Codex lane serves. The available models are whatever Codex exposes, and capacity comes as plan windows (estimates, not guarantees). When bulk jobs move off the meter, model choice goes back to being a quality decision instead of a budget one. The arithmetic is in the API vs subscription cost comparison.
Price your own workload both ways in the calculator; it takes token counts and shows the per-model meter cost next to the flat-lane setup.
Frequently asked questions
What is the cheapest OpenAI model in 2026?
GPT-5 Nano, at $0.05 per million input tokens and $0.40 per million output tokens as of June 2026. It handles classification, extraction, routing, and other narrow tasks well. GPT-5 Mini at $0.25/$2 is the cheapest model most teams can run for general production text work.
Is GPT-5 Nano good enough for production?
For narrow, well-specified tasks, yes: classification, entity extraction, deduplication, routing, and moderation pre-filters. It is the wrong choice for open-ended writing or multi-step reasoning. The reliable pattern is an eval suite: downgrade until your evals fail, then step back up one tier.
When is GPT-5.5 worth the price?
When the task carries judgment that cheaper models measurably fail: agent planning, hard debugging, high-stakes drafts. At $5/$30 per million tokens it costs 100x GPT-5 Nano on input and 75x on output, which works out to roughly 94x on an input-heavy job like the ticket classification above, so it earns its keep as the escalation tier, not the default.
How much does the OpenAI Batch API save?
Batch runs at a 50% discount on both input and output tokens in exchange for results within 24 hours instead of seconds. For offline jobs like backfills, nightly classification, or bulk summarization, it halves whatever model price you chose.