Model integration · Together AI

Together for open-model hosting.

Treat Together as one more hosted open-model option behind the same endpoint. Sub-keys, budget caps, and request logs stay consistent across every host you test.


$129/month SaaS. Bring your own model keys. No inference markup.

Three steps to connect.

Pass Together-hosted models through

Together serves open and fine-tuned model families behind an OpenAI-compatible API. For now, connect through OpenRouter with your own key; native Together key storage is a future expansion.

Keep ProxyLLM in front

For model families exposed through OpenRouter, send traffic to https://api.proxyllm.ai/v1 and keep usage analytics, sub-keys, and caps in one place.

Compare open-model hosts

Run the same prompt and model family across hosts, then compare latency, errors, and cost in the request logs before choosing the production lane.
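A minimal client-side sketch of that comparison, assuming you also want to time requests yourself alongside the request logs. The `Sample` and `HostSummary` shapes and the `timeRequest` and `summarize` helpers are hypothetical names for illustration, not part of ProxyLLM or the OpenAI SDK. Cost is left out because it comes from the request logs, not from the client.

```typescript
// Hypothetical record for one timed request; not a ProxyLLM log format.
interface Sample {
  model: string;
  latencyMs: number;
  ok: boolean;
}

// Hypothetical per-model summary built from the samples.
interface HostSummary {
  model: string;
  requests: number;
  errorRate: number; // failed / total, between 0 and 1
  medianLatencyMs: number | null; // successful requests only; null if none succeeded
}

// Times one request and records whether it threw.
// `send` is supplied by the caller, for example a wrapper around
// client.chat.completions.create with a fixed prompt.
async function timeRequest(
  model: string,
  send: (model: string) => Promise<unknown>,
): Promise<Sample> {
  const start = Date.now();
  try {
    await send(model);
    return { model, latencyMs: Date.now() - start, ok: true };
  } catch {
    return { model, latencyMs: Date.now() - start, ok: false };
  }
}

// Median of a list of numbers, or null for an empty list.
function median(values: number[]): number | null {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Groups samples by model ID and reports error rate and median latency.
function summarize(samples: Sample[]): HostSummary[] {
  const byModel = new Map<string, Sample[]>();
  for (const s of samples) {
    const group = byModel.get(s.model) ?? [];
    group.push(s);
    byModel.set(s.model, group);
  }
  return Array.from(byModel.entries()).map(([model, group]) => {
    const okLatencies = group.filter((s) => s.ok).map((s) => s.latencyMs);
    return {
      model,
      requests: group.length,
      errorRate: (group.length - okLatencies.length) / group.length,
      medianLatencyMs: median(okLatencies),
    };
  });
}
```

Collect several samples per model ID with the same prompt, pass them to `summarize`, and treat the result as a rough cross-check. A handful of requests is a small sample, so the request logs remain the better basis for a production decision.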

Compare hosted open models.

Call Together-backed models where your configured provider exposes them.

client.ts

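import OpenAI from "openai";

// Point the OpenAI SDK at the ProxyLLM endpoint.
const client = new OpenAI({
  baseURL: "https://api.proxyllm.ai/v1",
  apiKey: "pk_live_...",
});

// Top-level await requires an ES module context.
const r = await client.chat.completions.create({
  model: "together/meta-llama/llama-3.1-70b-instruct",
  messages: [{ role: "user", content: "Rewrite this onboarding email." }],
});

// Print the model's reply.
console.log(r.choices[0]?.message.content);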

Codex Hosted · the main feature

Run your AI workloads on your ChatGPT subscription.

ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.


$129/month · normal SaaS pricing

Pick model hosts with data.

Latency, cost, and error rates matter as much as the model name. ProxyLLM shows each one per request, with no markup on inference.
