Hyponema

Bring your own voice stack. Change it whenever you want.

Bring the transcriber that fits your language. Bring the brain your agent actually needs. Bring the voice that holds the persona. Hyponema orchestrates them — and the conversation, the memory, and the persona stay yours, swappable per session.

No spam. Early access invites only.

ONE PLATFORM, YOUR PROVIDERS

Bring your providers.
Keep the relationship.

Transcriber, model, and voice — all swappable, mid-conversation if you need to. We own the orchestration and the memory. That’s what makes the agent yours, not your provider’s.

  1. Automatic failover

    If your main provider has a bad minute, the next one in line takes the turn. Your user never feels it.

  2. Your keys, your bill

    You bring your provider keys, encrypted with your name on them. We never see them in plaintext.

  3. Sensible defaults

    Fastest, highest-quality, or cheapest combos out of the box. Tune later, per agent or per session.

OpenAI
Anthropic
Gemini
Inworld
Deepgram
Azure
ElevenLabs
Cartesia

VENDOR-NEUTRAL

One registry. Every provider. New ones drop in as config.

MOVE AS FAST AS THE MODELS DO

New model on Tuesday. You ship it Wednesday.

Voice models leapfrog every quarter. Our customers don’t rebuild their agent when a competitor ships something better. They change one line and the next call uses it.

  1. Switch any time

    Save the change in the dashboard. The next session picks it up. No redeploy, no downtime.

  2. Per-session swap

    Different agents in the same workspace can use different stacks; the same agent can pick a different combo per session via voice_stack overrides.

  3. Plug in your own models

    Anything that speaks an OpenAI-compatible chat-completion API plugs in as a custom LLM. Custom LLMs do not participate in the cascade chain — they run as the single primary, no automatic fallback.

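To make the custom-LLM path concrete, here is a minimal sketch of the request shape involved. Any server that accepts an OpenAI-style chat-completion payload can be registered; the base URL and model name below are hypothetical placeholders, and the exact registration flow is not shown here.

```python
# Shape of an OpenAI-compatible chat-completion request. A server that
# accepts this payload at POST {base}/chat/completions can plug in as a
# custom LLM. The URL and model name are illustrative, not real endpoints.

CUSTOM_LLM_BASE = "https://llm.internal.example.com/v1"  # hypothetical

def build_chat_request(model: str, user_text: str) -> dict:
    """Build the JSON body for POST {base}/chat/completions."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a voice agent."},
            {"role": "user", "content": user_text},
        ],
        "stream": True,  # voice agents stream tokens to keep time-to-first-token low
    }

payload = build_chat_request("my-finetune-v2", "Hello?")
```

Because a custom LLM runs as the single primary with no cascade behind it, the server behind that URL is the only thing answering the turn.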
agent_42 · voice_stack
stt: deepgram → fallback: azure
llm: anthropic → fallback: openai
tts: cartesia → fallback: elevenlabs
✓ saved · v4 · live next session

Architecture

What it does when it matters.

cascade_log · session_4f2a
stage     provider   status   latency
stt       deepgram   502      2400ms
stt       azure      200      380ms
cascade · fallback A served the turn

Automatic failover

When the main provider stumbles, the backup catches the turn. Set a main and up to two backups per layer. We record every attempt — which provider, how long it took, and why.
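The cascade above can be sketched in a few lines: walk the chain, record every attempt with its latency and error kind, and return as soon as one provider serves the turn. The provider callables and error names here are illustrative stand-ins, not the real integration.

```python
import time

# Sketch of a per-layer failover cascade: primary plus up to two backups,
# with every attempt logged (provider, outcome, latency).

def run_with_failover(chain, request):
    """chain: list of (name, callable). Returns (result, attempt_log)."""
    attempts = []
    for name, provider in chain:
        start = time.monotonic()
        try:
            result = provider(request)
            attempts.append({"provider": name, "status": "ok",
                             "latency_ms": int((time.monotonic() - start) * 1000)})
            return result, attempts
        except Exception as exc:
            attempts.append({"provider": name, "status": "error",
                             "error_kind": type(exc).__name__,
                             "latency_ms": int((time.monotonic() - start) * 1000)})
    raise RuntimeError("all providers in the chain failed", attempts)

# Simulate the log above: the primary returns a 5xx, the backup serves the turn.
def flaky_deepgram(req): raise ConnectionError("502 upstream")
def healthy_azure(req): return "transcript"

text, log = run_with_failover(
    [("deepgram", flaky_deepgram), ("azure", healthy_azure)], b"audio")
```

The caller only ever sees `"transcript"`; the failed attempt lives in the log, which is what makes the bad minute invisible to the user but visible to you.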

voice_stack.json · v3 → v4 · live
-  "tts": "elevenlabs"
+  "tts": "cartesia"
next session picks up the new chain

Switch any time

New provider in one save. Edit the stack, save it. The next session uses the new one. No deploy, no downtime, no break in the conversation.

session.region → chain · 3 active
  • United States · us-east → Cartesia
  • European Union · eu-frankfurt → Azure
  • India · ap-mumbai → Sarvam
compliance-aware · routes resolve at session.start

Three layers, three chains

STT, LLM, and TTS each have their own primary → fallback → fallback chain (at most three attempts per layer, and each layer cascades independently). TTS fallbacks carry their own voice_id because voice catalogs aren't portable across providers; STT fallbacks may override the language for multilingual recovery.
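Put together, a full three-layer stack might look like the sketch below. The shape extends the dashboard view with the per-fallback fields described above; provider names come from the page, but the voice IDs, language codes, and exact field names are illustrative assumptions.

```python
# Sketch of a three-layer voice_stack: each layer is its own ordered chain,
# TTS entries carry per-provider voice_id, STT fallbacks may override the
# language. Values are illustrative, not a documented schema.

VOICE_STACK = {
    "stt": [
        {"provider": "deepgram", "language": "en"},
        {"provider": "azure", "language": "en-IN"},        # multilingual recovery
    ],
    "llm": [
        {"provider": "anthropic"},
        {"provider": "openai"},
    ],
    "tts": [
        {"provider": "cartesia", "voice_id": "warm-us-female"},
        {"provider": "elevenlabs", "voice_id": "rachel-v2"},  # catalogs aren't portable
    ],
}

# Each layer cascades on its own: a TTS failover never touches STT or LLM.
for layer, chain in VOICE_STACK.items():
    assert 1 <= len(chain) <= 3, f"{layer}: primary plus at most two fallbacks"
```

Keeping a distinct `voice_id` per TTS entry is what lets a failover keep a recognizable voice instead of falling back to a provider's default.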

stt · latency by provider · P50/P95
cartesia      140/220ms
deepgram      180/320ms
assemblyai    280/540ms
whisper       480/940ms
TTFT · last 1k attempts

Speed you can see

Every step, every layer, on the record. How long the transcription took, the model took, the voice took. Every call, broken down. Open it in the dashboard or pull it through the SDK.

EVERY CALL ON THE RECORD

When something looks off, you have the trail. Replayable. Exportable.

THE NUMBERS

More than one provider isn’t just safer. It’s cheaper, faster, and future-proof.

The customer who gets the new model first doesn’t rebuild their agent. They change one line of voice_stack JSON.

9
Providers shipped.

Deepgram, AssemblyAI, Whisper for STT. Anthropic, OpenAI, Gemini for LLM. Cartesia, ElevenLabs, OpenAI TTS for voice. Custom OpenAI-compatible LLMs plug in as a single primary.

0
Vendor lock-in.

Persona, memory, and orchestration stay with you. Providers are interchangeable; the relationship doesn’t break when you swap one.

2
Failover depth, per layer.

Each of STT, LLM, and TTS has its own independent primary → fallback → fallback chain. Every failover lands in error_event with provider, model, error_kind, fallback_overhead_ms.
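For concreteness, one error_event row might look like the sketch below. The four named fields come from the text above; the surrounding fields and all values are made up for illustration.

```python
# Illustrative shape of a single error_event record. Only provider, model,
# error_kind, and fallback_overhead_ms are named by the docs; everything
# else here is a hypothetical example value.

error_event = {
    "session": "4f2a",
    "layer": "stt",
    "provider": "deepgram",
    "model": "example-stt-model",     # hypothetical model name
    "error_kind": "upstream_5xx",
    "fallback_overhead_ms": 380,      # extra latency paid to reach the backup
}

required = {"provider", "model", "error_kind", "fallback_overhead_ms"}
assert required <= error_event.keys()
```

Having `fallback_overhead_ms` as a first-class field is what lets you answer "how much did that bad minute cost the user" without replaying the call.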

How memory pairs with voice

The memory engine and the voice stack don’t depend on each other. Swap providers freely. The relationship survives.

How your keys stay safe

AES-256-GCM envelope encryption. Per-credential DEK, KMS-managed KEK. Decrypted in-memory only for the seconds a call lasts.
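The envelope scheme reads as: each credential gets a fresh data-encryption key (DEK), and only a KEK-wrapped copy of that DEK is stored. The sketch below shows the pattern with the third-party `cryptography` package; in production the KEK would live in a KMS and never leave it, so the local KEK here is a stand-in for that KMS call.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Sketch of AES-256-GCM envelope encryption: per-credential DEK, wrapped
# under a KEK. A local key stands in for the KMS-managed KEK.

def encrypt_credential(kek: bytes, api_key: bytes) -> dict:
    """Encrypt one provider key under a fresh per-credential DEK."""
    dek = AESGCM.generate_key(bit_length=256)       # per-credential DEK
    nonce = os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(nonce, api_key, None)
    # Wrap the DEK under the KEK; only the wrapped DEK is ever stored.
    dek_nonce = os.urandom(12)
    wrapped_dek = AESGCM(kek).encrypt(dek_nonce, dek, None)
    return {"ct": ciphertext, "nonce": nonce,
            "wrapped_dek": wrapped_dek, "dek_nonce": dek_nonce}

def decrypt_credential(kek: bytes, blob: dict) -> bytes:
    """Unwrap the DEK (a KMS decrypt call in production), then open the key."""
    dek = AESGCM(kek).decrypt(blob["dek_nonce"], blob["wrapped_dek"], None)
    return AESGCM(dek).decrypt(blob["nonce"], blob["ct"], None)

kek = AESGCM.generate_key(bit_length=256)           # stand-in for the KMS KEK
blob = encrypt_credential(kek, b"sk-live-example")
```

The plaintext key exists only inside `decrypt_credential`'s scope, which is the in-memory-for-seconds property the page describes.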

Voice agents built for years, not minutes.

Bring your own keys. Join the waitlist for early access.

No spam. Early access invites only.

Or read the docs →