Controlling AI Costs in GitHub Actions

CI runs AI on every push, with retries and matrix jobs multiplying calls. Capped per-repo keys, request logs per key, and when CI belongs on a key vs a plan.

CI is the easiest place to lose control of AI spend because nobody is watching when it runs. Workflows fire on every push, retry what fails, and fan out across matrix jobs, so one innocent-looking review step becomes hundreds of model calls a day. The control pattern is structural: one capped, scoped key per repo, a request log that attributes every call to its key, and a deliberate choice about which CI traffic belongs on an API key and which belongs on a flat ChatGPT plan.

Why CI burns AI budget differently

An engineer at a desk notices an expensive loop. A workflow does not. Three properties make CI spend its own category:

  • Event-driven volume. Every push, every PR update, every scheduled run triggers the job. Nobody decides to spend; the trigger decides.
  • Retries are policy. Failed jobs re-run, and each re-run repeats every AI call inside it at full price.
  • Matrix multiplication. A job that runs across 4 OS-and-version combinations makes its AI calls 4 times.

A loop bug that costs you a coffee during work hours runs unattended at 3 a.m. in CI. Caps exist for exactly that hour.

The setup

Create a ProxyLLM key just for CI, cap it, and store it as a repository or organization secret. Then export the OpenAI-compatible variables in the workflow:

name: ai-review
on: pull_request

jobs:
  review:
    runs-on: ubuntu-latest
    env:
      OPENAI_BASE_URL: https://api.proxyllm.ai/v1
      OPENAI_API_KEY: ${{ secrets.PROXYLLM_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      # any step that reads OpenAI-compatible env vars
      # now bills through the capped CI key
      - run: node scripts/ai-review.mjs

Most CI AI tools read these variables already, so the change is configuration, not code. Issue one key per repository rather than one for the whole org: the cap then bounds each repo’s automation separately, and the request log gives you spend per repo for free.

What an AI-assisted org pays

Per-repo math at OpenAI’s June 2026 list prices for GPT-5.4 ($2.50 per million input, $15 per million output). A PR review reads about 12,000 tokens of diff and guidelines and writes 800; a failure-triage step runs 5 calls at about 6,000 input and 400 output each:

JobCost eachRuns/dayCost/day
PR review$0.04212$0.50
Failure triage$0.1058$0.84
Per repo$1.34
Per repo/month~$40

One repo is cheap. Twenty active repos are about $806 a month, and that figure scales with commit velocity, which is the one thing you never want to discourage. The flat path: ChatGPT Pro 5x at $100 plus ProxyLLM at $129 is $229 a month, with an estimated $3,500 of API-equivalent capacity behind it. Capacity figures are planning estimates, never guarantees; overflow falls back to a second connected account, then your own API key, and the log shows the lane per call.

CI AI spend scales with commit velocity, so the meter effectively charges your team for shipping faster.

API key or ChatGPT plan: where the line sits

Both lanes are legitimate; they fit different CI shapes.

API-key auth fits shared CI. The official Codex GitHub Action is built for API-key secrets, which is the documented pattern for shared runners; the setup is covered in the Codex GitHub Action guide. It also fits low volumes, where a few dollars a month of metered calls beats any fixed fee.

Personal-plan credentials do not belong in shared runners. OpenAI’s terms tie a ChatGPT account to its owner, and stuffing personal Codex auth into an org-wide secret blurs exactly that line. We wrote the honest version of this question up in can GitHub Actions use your ChatGPT plan.

The plan lane fits CI when the account stays yours. Through a hosted endpoint, your ChatGPT account lives in one isolated container you connected yourself via OpenAI’s device-code flow, and CI holds only a scoped, capped gateway key, never the account credentials. Workflow volume then bills to the flat plan. Programmatic Codex use is intended functionality, and OpenAI has the final call on its services.

The decision rule we give teams: shared or client-owned CI goes on API-key auth; your own org’s volume CI goes on your own plan through a scoped key.

Caps and logs are the actual feature

Spend control in CI is two primitives. A hard cap per key turns the 3 a.m. retry storm into a stopped key and a red job instead of a surprise invoice; set it near 2x the repo’s normal month so it only trips on genuine anomalies. The per-key request log answers “what did repo X spend this week, and on which workflow” as a filter, with each call’s lane and API-equivalent value attached. The wider capping playbook, including OpenAI’s own budget alerts, is in how to cap OpenAI API spending.

The condensed setup lives on the GitHub Actions integration page. If you can count your active repos and daily PRs, the calculator tells you which side of the $229 line your CI sits on.

Frequently asked questions

How do I control OpenAI costs in GitHub Actions?

Give CI its own scoped key with a hard budget cap instead of reusing a production key. Store it as a repository or organization secret, point OPENAI_BASE_URL at a gateway that logs per key, and review spend per repo in the request log. A cap turns a looping workflow into a stopped key rather than an invoice.

Should GitHub Actions use an API key or a ChatGPT plan?

Shared or team CI belongs on API-key auth: the official Codex GitHub Action is built for API keys, and personal-plan credentials do not belong in shared runners. A ChatGPT plan fits CI when the connected account is yours and stays in its own container, with CI holding only a scoped gateway key.

Why did my CI AI bill spike?

The usual culprits are push frequency, retry loops, and matrix builds, each of which multiplies AI calls without anyone deciding to spend more. Per-key request logs find it fast: filter by the CI key, sort by day, and the workflow that changed stands out.

How do I see AI spend per repository?

Issue one key per repository and read the per-key request log. Each call is attributed to its key with the lane that served it and its API-equivalent value, so per-repo cost reporting is a filter instead of a reconstruction.

More on Integrations
Codex Hosted · the main feature

Run your AI workloads on your ChatGPT subscription.

ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.