Together for open model hosting.
Treat Together as one more hosted open-model option behind the same endpoint. Sub-keys, budget caps, and request logs stay consistent across every host you test.
$129/month SaaS. Bring your own model keys. No inference markup.
Three steps to connect.
Pass Together-hosted models through
Together serves open and fine-tuned model families behind an OpenAI-compatible API. Today you reach those models through OpenRouter using your own key; native Together key storage is a future expansion.
Keep ProxyLLM in front
For model families exposed through OpenRouter, send traffic to https://api.proxyllm.ai/v1 and keep usage analytics, sub-keys, and caps in one place.
Compare open-model hosts
Run the same prompt and model family across hosts, then compare latency, errors, and cost in the request logs before choosing the production lane.
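A minimal sketch of what that comparison could look like on the client side, assuming you want a quick local reading alongside the request logs. The `compareHosts` helper and its types are hypothetical names for illustration, not part of ProxyLLM or the OpenAI SDK; the actual request is injected as a function so the helper stays independent of any one client.

```typescript
// Hypothetical helper for illustration; not a ProxyLLM or OpenAI SDK API.
// `call` is whatever function sends one prompt to one model id.
type Completion = (model: string, prompt: string) => Promise<unknown>;

interface HostResult {
  model: string;
  ok: boolean;
  latencyMs: number;
  error?: string;
}

async function compareHosts(
  call: Completion,
  models: string[],
  prompt: string,
): Promise<HostResult[]> {
  const results: HostResult[] = [];
  // Run sequentially so one host's request does not compete with another's.
  for (const model of models) {
    const start = Date.now();
    try {
      await call(model, prompt);
      results.push({ model, ok: true, latencyMs: Date.now() - start });
    } catch (err) {
      // Record the failure instead of aborting the whole comparison.
      results.push({
        model,
        ok: false,
        latencyMs: Date.now() - start,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  }
  return results;
}

// Possible wiring with an OpenAI SDK client pointed at ProxyLLM:
//   const call: Completion = (model, prompt) =>
//     client.chat.completions.create({
//       model,
//       messages: [{ role: "user", content: prompt }],
//     });
//   const rows = await compareHosts(call, [/* model ids to compare */], "Rewrite this onboarding email.");
```

A single run per host is noisy, so treat these numbers as a rough first pass and rely on the request logs for cost and for trends over many requests.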
Compare hosted open models.
Call Together-backed models where your configured provider exposes them.
import OpenAI from "openai";
// Point the OpenAI SDK at the ProxyLLM endpoint.
const client = new OpenAI({
  baseURL: "https://api.proxyllm.ai/v1",
  apiKey: "pk_live_...",
});
// Request a Together-hosted model by its provider-prefixed id.
const r = await client.chat.completions.create({
  model: "together/meta-llama/llama-3.1-70b-instruct",
  messages: [{ role: "user", content: "Rewrite this onboarding email." }],
});
Run your AI workloads on your ChatGPT subscription.
ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.
Pick model hosts with data.
Latency, cost, and error rates matter as much as the model name. ProxyLLM shows each one per request, with no markup on inference.