LiteLLM vs OpenRouter: Self-Hosted Router or Hosted Marketplace?
LiteLLM is an open-source router you run yourself, with no markup. OpenRouter is a hosted marketplace charging about 5.5% on credits. How to choose on fees, ops, and models.
LiteLLM and OpenRouter solve the same surface problem, one OpenAI-compatible endpoint for many models, with opposite architectures. LiteLLM is an open-source router you deploy and operate yourself: no markup, your keys, your pager. OpenRouter is a hosted marketplace: one key for hundreds of models, nothing to run, and a fee of about 5.5% when you buy credits as of June 2026. Choose by deciding who should operate the router, then check whether a flat-rate lane belongs underneath either one.
What each one actually is
LiteLLM is an open-source proxy server. You write a config listing your models and provider keys, deploy it with Docker or Kubernetes, and your apps call it as if it were the OpenAI API. It translates requests to OpenAI, Anthropic, Google, Bedrock, and dozens of other providers, and it adds the control plane self-hosters want: virtual keys, per-key budgets, rate limits, and logging callbacks. Every token bills directly to your own provider accounts at list price.
OpenRouter is a hosted API. You create an account, load credits or bring your own provider keys, and call one endpoint that fronts hundreds of models. OpenRouter operates the routing, the provider failover, and the activity log. You operate nothing.
One disclosure before the scoring starts: we build ProxyLLM, a product that overlaps this category, and it appears later in this article where it is relevant. The LiteLLM-versus-OpenRouter call itself does not depend on us.
The fee math
OpenRouter charges roughly 5.5% when you purchase credits, and a smaller fee on traffic that uses your own provider keys (their fee page has current numbers; these are June 2026 figures). LiteLLM charges nothing, but self-hosting is never free. Honest accounting puts a server at $20 to $50 a month and maintenance at a few engineer hours a month: upgrades on a fast-moving repo, key rotation, dashboard checks, the occasional broken deploy.
| Monthly inference spend | OpenRouter fee (~5.5%) | LiteLLM markup | LiteLLM real cost (est.) |
|---|---|---|---|
| $500 | ~$28 | $0 | server + ~2-4 eng hours/mo |
| $2,000 | ~$110 | $0 | same |
| $10,000 | ~$550 | $0 | same |
Price those hours at $100 each and LiteLLM’s true cost lands around $220 to $450 a month regardless of volume. The crossover follows directly: below a few thousand dollars of monthly inference, OpenRouter’s fee is usually cheaper than honest ops accounting. Well above it, self-hosting pays for itself. OpenRouter’s fee appears on an invoice; LiteLLM’s fee appears in your on-call rotation.
Model surface and routing
OpenRouter wins breadth with zero configuration. New models appear on the marketplace without you touching anything, and its routing can fail over across multiple providers hosting the same open-weight model.
LiteLLM supports a comparable range of providers, but only the ones you bring keys for, and you configure the routing yourself: fallback chains, load balancing across deployments, retries. That is more work and more control. If your request path must stay on your own infrastructure for privacy or compliance reasons, LiteLLM is the only one of the two that can promise it.
Who carries the pager
With LiteLLM you own uptime, scaling, the database behind virtual keys, and secrets handling. When an upgrade breaks a provider integration at a bad hour, that is your incident. With OpenRouter you inherit their uptime and their deprecation schedule, and you accept a third party in every request path. Neither position is wrong. They are different jobs, and the broader alternatives list sorts more tools along the same axis.
Where a flat-rate lane fits either
Both tools meter per token, because the providers underneath them do. Routing changes which meter runs, never the fact of the meter. If a large share of your spend is OpenAI-bound, that is the line item worth restructuring, and per-token vs flat-rate pricing is the framework we use.
This is where our product enters. ProxyLLM’s Codex Hosted runs OpenAI’s official Codex CLI on managed servers, signed in with your own ChatGPT account, so OpenAI-bound work bills to the flat plan instead of per-token API pricing: $129 a month, no inference markup. Our planning estimates put Plus at roughly $700 of API-equivalent work a month and Pro tiers at roughly $3,500 to $14,000, estimates rather than guarantees. The honest caveats: the Codex lane serves OpenAI models only and returns complete responses rather than streams.
It composes cleanly with LiteLLM, which can list our endpoint as one more OpenAI-compatible upstream and route bulk OpenAI work through it while everything else stays metered. OpenRouter is a closed marketplace, so there you run the lanes side by side instead. The full pairing logic is in ProxyLLM vs LiteLLM.
Side by side
| Axis | LiteLLM | OpenRouter |
|---|---|---|
| Deployment | Self-hosted, your infra | Hosted, nothing to run |
| Price | Free OSS (enterprise tier sold) | ~5.5% on credits (June 2026) |
| Inference markup | None, your keys at list price | None on tokens; fee on credits |
| Model surface | Providers you bring keys for | Hundreds of models, one key |
| Failover | You configure it | Built into the marketplace |
| Budgets and keys | Virtual keys, per-key budgets | Account-level controls |
| Request path | Stays on your infrastructure | Passes through OpenRouter |
| Maintenance | Yours | Theirs |
Which one should you run?
Run LiteLLM when you have real DevOps capacity, multi-provider traffic, and a reason to keep the request path in-house; at high volume the absent fee compounds in your favor. Use OpenRouter when you want maximum model breadth this afternoon and would rather pay a visible percentage than operate one more service. Both are good tools, which is why both anchor our gateway roundup.
And whichever router wins, check your OpenAI line item before settling. If it is most of the bill, a flat-rate lane under the router may move more money than the router choice itself; the calculator maps your current spend to a plan tier in about thirty seconds.
Frequently asked questions
What is the difference between LiteLLM and OpenRouter?
LiteLLM is an open-source proxy you deploy on your own infrastructure. It routes requests to providers through your own API keys and adds no markup. OpenRouter is a hosted marketplace: one API key reaches hundreds of models, and OpenRouter charges a fee of about 5.5% when you buy credits, as of June 2026.
Does LiteLLM charge a markup on requests?
No. LiteLLM is open-source software, so inference bills at each provider's list price through your own keys. Your real costs are the server it runs on and the engineering hours to deploy, upgrade, and monitor it. LiteLLM also sells an enterprise tier with support and extra features.
What does OpenRouter's fee actually buy?
One key and one bill for hundreds of models, failover across providers, a unified activity log, and zero routers to operate. For teams that want model breadth without running infrastructure, that bundle is frequently worth the roughly 5.5% fee.
Can I use LiteLLM and OpenRouter together?
Yes, and many teams do. LiteLLM can treat OpenRouter as one upstream provider among several, which gives you a self-hosted control plane plus marketplace breadth behind a single internal endpoint. Flat-rate lanes such as ProxyLLM's Codex Hosted can sit behind LiteLLM the same way.