Provider goes down — your product goes with it.
The Anthropic API drops for 30 minutes. Your agent stops mid-task. Your user closes the tab. Your churn goes up.
Relay is the reliable delivery layer for LLM APIs. One line of code adds automatic retries, provider failover, smart caching, and a live dashboard, so your product stays up even when Anthropic or OpenAI doesn't.
```ts
import { Relay } from '@relay/sdk';

const llm = new Relay({ apiKey: process.env.RELAY_KEY });

// Auto retry, failover, cache — handled.
const reply = await llm.messages.create({
  model: 'claude-sonnet-4-6',
  messages: [{ role: 'user', content: 'ping' }],
  relay: { fallback: 'gpt-4o', cache: 'semantic' },
});
```

Four bugs you have hit this quarter — and will keep hitting until something else handles them for you.
An outage takes the Anthropic API down for half an hour. Your agent stops mid-task, your user closes the tab, and your churn goes up.
Peak hours are exactly when you can't afford 429s. Every retry you don't write yourself is a request that quietly dies.
Every project ships its own retry loop. Half of them get the jitter wrong. None of them fail over to a backup provider.
Same system prompt, same user input — different request id. Without caching, every duplicate burns tokens you'll never see again.
Replace @anthropic-ai/sdk with @relay/sdk. Same API, different superpowers.
```ts
// 47 lines of retry boilerplate, written badly.
async function callLLM(body) {
  let attempt = 0;
  while (attempt < 3) {
    try {
      return await anthropic.messages.create(body);
    } catch (e) {
      attempt++;
      await sleep(2 ** attempt * 1000);
    }
  }
  throw new Error('Gave up');
}
// Still no failover. No cache. No metrics.
```

```ts
// One line. Reliability handled.
const reply = await llm.messages.create(body);

// What Relay just did for you:
// ✓ exponential backoff with jitter
// ✓ failover Anthropic → OpenAI on 5xx
// ✓ exact + semantic cache lookup
// ✓ streaming preserved end-to-end
// ✓ live metrics in your dashboard
```

When Anthropic returns 5xx, Relay routes the next attempt to OpenAI — with model mapping you can override per request.
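Since `fallback` rides along per request (see the snippet up top), overriding the mapping for one call is just a different value. A sketch; the gpt-4o-mini choice here is only an example, not a default.

```ts
// Per-request override: on a 5xx from Anthropic, retry this call on
// OpenAI as gpt-4o-mini instead of the default mapping. The `fallback`
// option comes from the example up top; the model choice is illustrative.
const summary = await llm.messages.create({
  model: 'claude-sonnet-4-6',
  messages: [{ role: 'user', content: 'Summarize this thread.' }],
  relay: { fallback: 'gpt-4o-mini' },
});
```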
Decorrelated jitter, configurable ceilings, surfaced in the dashboard so you see what was retried and why.
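Curious what decorrelated jitter actually computes? A minimal sketch of the standard algorithm; the base and cap values are placeholders, not Relay's defaults.

```ts
// Decorrelated jitter: each delay is drawn uniformly from
// [base, previousDelay * 3] and capped at a ceiling, so retries
// spread out instead of stampeding in sync. Placeholder values;
// Relay's actual base and ceiling are the configurable knobs above.
function nextDelay(prevMs: number, baseMs = 500, capMs = 30_000): number {
  const raw = baseMs + Math.random() * (prevMs * 3 - baseMs);
  return Math.min(capMs, raw);
}

// Usage: seed with the base, then feed each delay back in.
let delay = 500;
delay = nextDelay(delay); // somewhere in ~500–1500ms
delay = nextDelay(delay); // grows, but never past capMs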
Exact-match on day one, semantic with embeddings on Hobby and up. Stop paying twice for the same answer.
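In code, the cache mode rides along per request, as in the snippet up top. A sketch; `'semantic'` appears there, while `'exact'` is an assumed value name for the exact-match mode.

```ts
// 'semantic' is shown in the example up top; 'exact' is an assumed
// value name for the day-one exact-match mode.
const reply = await llm.messages.create({
  model: 'claude-sonnet-4-6',
  messages: [{ role: 'user', content: 'Summarize our refund policy.' }],
  relay: { cache: 'semantic' }, // near-duplicate prompts hit the cache too
});
```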
Latency, cost, cache hit rate, retries, failover events — searchable, filterable, and streaming as it happens.
You keep paying Anthropic directly. We never see plaintext keys — AES-256-GCM, audited access.
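For the curious: AES-256-GCM is authenticated encryption, so tampering gets detected, not just hidden. An illustrative Node snippet of the scheme itself, not Relay's key-storage code:

```ts
import { createCipheriv, randomBytes } from 'node:crypto';

// Illustrative only: this shows the AES-256-GCM scheme named above,
// not Relay's implementation. A 32-byte key, a fresh 12-byte IV per
// encryption, and an auth tag that flags any tampering.
function encryptApiKey(plaintext: string, masterKey: Buffer) {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}
```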
Runs on Cloudflare Workers across 300+ cities. Under 30ms of proxy overhead from anywhere in the world.
Swap @anthropic-ai/sdk for @relay/sdk. Same method signatures, same streaming. Your existing code keeps working.
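Concretely, the swap looks like this; the Relay constructor shape comes from the snippet up top.

```ts
// Before: direct Anthropic SDK.
// import Anthropic from '@anthropic-ai/sdk';
// const llm = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// After: one import swap; call sites stay identical.
import { Relay } from '@relay/sdk';

const llm = new Relay({ apiKey: process.env.RELAY_KEY });

const reply = await llm.messages.create({
  model: 'claude-sonnet-4-6',
  messages: [{ role: 'user', content: 'ping' }],
});
```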
Paste your Anthropic and OpenAI keys into the dashboard. We encrypt them with AES-256-GCM and never log plaintext.
Every call hits a Cloudflare Worker in 300+ cities. Cache lookup, retry policy and failover happen before bytes leave the region.
Open the dashboard. Filter requests by status, model, latency or cost. The next outage shows up as a green failover chip.
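If it helps to picture step 3, here is a hypothetical sketch of that request path. Every identifier below is invented for illustration; none of it is Relay's actual internals.

```ts
// Hypothetical pseudocode of the edge pipeline described in step 3.
// All identifiers here are invented for illustration.
type LLMRequest = { model: string; messages: unknown[] };
type LLMResponse = { ok: boolean; body: unknown };

declare function lookupCache(req: LLMRequest): Promise<LLMResponse | null>;
declare function retryWithJitter(fn: () => Promise<LLMResponse>): Promise<LLMResponse>;
declare function callProvider(p: 'anthropic' | 'openai', req: LLMRequest): Promise<LLMResponse>;

async function handleAtEdge(req: LLMRequest): Promise<LLMResponse> {
  const hit = await lookupCache(req);      // exact match first, then semantic
  if (hit) return hit;                     // cache hit: no provider call at all

  for (const provider of ['anthropic', 'openai'] as const) {
    const res = await retryWithJitter(() => callProvider(provider, req));
    if (res.ok) return res;                // a 5xx falls through to the next provider
  }
  throw new Error('all providers exhausted');
}
```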
Hack on a side project.
Indie devs shipping to real users.
Small teams in production.
When downtime costs real money.
Join the private beta. Early signups get six months of Pro on the house when we open the gates.