Hyponema

Bring your own voice stack. Change it whenever you want.

Bring the transcriber that fits your language. Bring the brain your agent actually needs. Bring the voice that holds the persona. Hyponema orchestrates them — and the conversation, the memory, and the persona stay yours, swappable per session.

No spam. Early access invites only.

ONE PLATFORM, YOUR PROVIDERS

Bring your providers.
Keep the relationship.

Transcriber, model, and voice — all swappable, mid-conversation if you need to. We own the orchestration and the memory. That’s what makes the agent yours, not your provider’s.

  1. Automatic failover

    If your main provider has a bad minute, the next one in line takes the turn. Your user never feels it.

  2. Your keys, your bill

    You bring your provider keys, encrypted with your name on them. We never see them in plaintext.

  3. Sensible defaults

    Fastest, highest-quality, or cheapest combos out of the box. Tune later, per agent or per session.

OpenAI
Anthropic
Gemini
Inworld
Deepgram
Azure
ElevenLabs
Cartesia

VENDOR-NEUTRAL

One registry. Every provider. New ones drop in as config.

MOVE AS FAST AS THE MODELS DO

New model on Tuesday. You ship it Wednesday.

Voice models leapfrog every quarter. Our customers don’t rebuild their agent when a competitor ships something better. They change one line and the next call uses it.

  1. Switch any time

    Save the change in the dashboard. The next session picks it up. No redeploy, no downtime.

  2. Per-session swap

    Different agents in the same workspace can use different stacks; the same agent can pick a different combo per session via voice_stack overrides.

  3. Plug in your own models

    Anything that speaks an OpenAI-compatible chat-completion API plugs in as a custom LLM. Custom LLMs do not participate in the cascade chain — they run as the single primary, no automatic fallback.

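To make the custom-LLM path concrete, here is a minimal sketch of the request shape involved. Any server that accepts an OpenAI-style chat-completion payload can be registered; the base URL and model name below are hypothetical placeholders, and the exact registration flow is not shown here.

```python
# Shape of an OpenAI-compatible chat-completion request. A server that
# accepts this payload at POST {base}/chat/completions can plug in as a
# custom LLM. The URL and model name are illustrative, not real endpoints.

CUSTOM_LLM_BASE = "https://llm.internal.example.com/v1"  # hypothetical

def build_chat_request(model: str, user_text: str) -> dict:
    """Build the JSON body for POST {base}/chat/completions."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a voice agent."},
            {"role": "user", "content": user_text},
        ],
        "stream": True,  # voice agents stream tokens to keep time-to-first-token low
    }

payload = build_chat_request("my-finetune-v2", "Hello?")
```

Because a custom LLM runs as the single primary with no cascade behind it, the server behind that URL is the only thing answering the turn.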
agent_42 · voice_stack
stt: deepgram → fallback: azure
llm: anthropic → fallback: openai
tts: cartesia → fallback: elevenlabs
✓ saved · v4 · live next session

Architecture

What it does when it matters.

cascade_log · session_4f2a
stage     provider   status   latency
stt       deepgram   502      2400ms
stt       azure      200      380ms
cascade · fallback A served the turn

Automatic failover

When the main provider stumbles, the backup catches the turn. Set a main and up to two backups per layer. We record every attempt — which provider, how long it took, and why.
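The cascade above can be sketched in a few lines: walk the chain, record every attempt with its latency and error kind, and return as soon as one provider serves the turn. The provider callables and error names here are illustrative stand-ins, not the real integration.

```python
import time

# Sketch of a per-layer failover cascade: primary plus up to two backups,
# with every attempt logged (provider, outcome, latency).

def run_with_failover(chain, request):
    """chain: list of (name, callable). Returns (result, attempt_log)."""
    attempts = []
    for name, provider in chain:
        start = time.monotonic()
        try:
            result = provider(request)
            attempts.append({"provider": name, "status": "ok",
                             "latency_ms": int((time.monotonic() - start) * 1000)})
            return result, attempts
        except Exception as exc:
            attempts.append({"provider": name, "status": "error",
                             "error_kind": type(exc).__name__,
                             "latency_ms": int((time.monotonic() - start) * 1000)})
    raise RuntimeError("all providers in the chain failed", attempts)

# Simulate the log above: the primary returns a 5xx, the backup serves the turn.
def flaky_deepgram(req): raise ConnectionError("502 upstream")
def healthy_azure(req): return "transcript"

text, log = run_with_failover(
    [("deepgram", flaky_deepgram), ("azure", healthy_azure)], b"audio")
```

The caller only ever sees `"transcript"`; the failed attempt lives in the log, which is what makes the bad minute invisible to the user but visible to you.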

voice_stack.json · v3 → v4 · live
-  "tts": "elevenlabs"
+  "tts": "cartesia"
next session picks up the new chain

Switch any time

New provider in one save. Edit the stack, save it. The next session uses the new one. No deploy, no downtime, no break in the conversation.

session.region → chain · 3 active
  • United States · us-east → Cartesia
  • European Union · eu-frankfurt → Azure
  • India · ap-mumbai → Sarvam
compliance-aware · routes resolve at session.start

Three layers, three chains

STT, LLM, and TTS each have their own primary → fallback → fallback chain (at most three attempts per layer, and each layer cascades independently). TTS fallbacks carry their own voice_id because voice catalogs aren't portable across providers; STT fallbacks may override the language for multilingual recovery.
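Put together, a full three-layer stack might look like the sketch below. The shape extends the dashboard view with the per-fallback fields described above; provider names come from the page, but the voice IDs, language codes, and exact field names are illustrative assumptions.

```python
# Sketch of a three-layer voice_stack: each layer is its own ordered chain,
# TTS entries carry per-provider voice_id, STT fallbacks may override the
# language. Values are illustrative, not a documented schema.

VOICE_STACK = {
    "stt": [
        {"provider": "deepgram", "language": "en"},
        {"provider": "azure", "language": "en-IN"},        # multilingual recovery
    ],
    "llm": [
        {"provider": "anthropic"},
        {"provider": "openai"},
    ],
    "tts": [
        {"provider": "cartesia", "voice_id": "warm-us-female"},
        {"provider": "elevenlabs", "voice_id": "rachel-v2"},  # catalogs aren't portable
    ],
}

# Each layer cascades on its own: a TTS failover never touches STT or LLM.
for layer, chain in VOICE_STACK.items():
    assert 1 <= len(chain) <= 3, f"{layer}: primary plus at most two fallbacks"
```

Keeping a distinct `voice_id` per TTS entry is what lets a failover keep a recognizable voice instead of falling back to a provider's default.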

stt · latency by provider · P50/P95
cartesia      140/220ms
deepgram      180/320ms
assemblyai    280/540ms
whisper       480/940ms
TTFT · last 1k attempts

Speed you can see

Every step, every layer, on the record. How long the transcription took, the model took, the voice took. Every call, broken down. Open it in the dashboard or pull it through the SDK.

EVERY CALL ON THE RECORD

When something looks off, you have the trail. Replayable. Exportable.

THE NUMBERS

More than one provider isn’t just safer. It’s cheaper, faster, and future-proof.

The customer who gets the new model first doesn’t rebuild their agent. They change one line of voice_stack JSON.

9
Providers shipped.

Deepgram, AssemblyAI, Whisper for STT. Anthropic, OpenAI, Gemini for LLM. Cartesia, ElevenLabs, OpenAI TTS for voice. Custom OpenAI-compatible LLMs plug in as a single primary.

0
Vendor lock-in.

Persona, memory, and orchestration stay with you. Providers are interchangeable; the relationship doesn’t break when you swap one.

2
Failover depth, per layer.

Each of STT, LLM, and TTS has its own independent primary → fallback → fallback chain. Every failover lands in error_event with provider, model, error_kind, fallback_overhead_ms.
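For concreteness, one error_event row might look like the sketch below. The four named fields come from the text above; the surrounding fields and all values are made up for illustration.

```python
# Illustrative shape of a single error_event record. Only provider, model,
# error_kind, and fallback_overhead_ms are named by the docs; everything
# else here is a hypothetical example value.

error_event = {
    "session": "4f2a",
    "layer": "stt",
    "provider": "deepgram",
    "model": "example-stt-model",     # hypothetical model name
    "error_kind": "upstream_5xx",
    "fallback_overhead_ms": 380,      # extra latency paid to reach the backup
}

required = {"provider", "model", "error_kind", "fallback_overhead_ms"}
assert required <= error_event.keys()
```

Having `fallback_overhead_ms` as a first-class field is what lets you answer "how much did that bad minute cost the user" without replaying the call.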

How memory pairs with voice

The memory engine and the voice stack don’t depend on each other. Swap providers freely. The relationship survives.

How your keys stay safe

AES-256-GCM envelope encryption. Per-credential DEK, KMS-managed KEK. Decrypted in-memory only for the seconds a call lasts.
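The envelope scheme reads as: each credential gets a fresh data-encryption key (DEK), and only a KEK-wrapped copy of that DEK is stored. The sketch below shows the pattern with the third-party `cryptography` package; in production the KEK would live in a KMS and never leave it, so the local KEK here is a stand-in for that KMS call.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Sketch of AES-256-GCM envelope encryption: per-credential DEK, wrapped
# under a KEK. A local key stands in for the KMS-managed KEK.

def encrypt_credential(kek: bytes, api_key: bytes) -> dict:
    """Encrypt one provider key under a fresh per-credential DEK."""
    dek = AESGCM.generate_key(bit_length=256)       # per-credential DEK
    nonce = os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(nonce, api_key, None)
    # Wrap the DEK under the KEK; only the wrapped DEK is ever stored.
    dek_nonce = os.urandom(12)
    wrapped_dek = AESGCM(kek).encrypt(dek_nonce, dek, None)
    return {"ct": ciphertext, "nonce": nonce,
            "wrapped_dek": wrapped_dek, "dek_nonce": dek_nonce}

def decrypt_credential(kek: bytes, blob: dict) -> bytes:
    """Unwrap the DEK (a KMS decrypt call in production), then open the key."""
    dek = AESGCM(kek).decrypt(blob["dek_nonce"], blob["wrapped_dek"], None)
    return AESGCM(dek).decrypt(blob["nonce"], blob["ct"], None)

kek = AESGCM.generate_key(bit_length=256)           # stand-in for the KMS KEK
blob = encrypt_credential(kek, b"sk-live-example")
```

The plaintext key exists only inside `decrypt_credential`'s scope, which is the in-memory-for-seconds property the page describes.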

Voice agents built for years, not minutes.

Bring your own keys. Join the waitlist for early access.

No spam. Early access invites only.

Or read the docs →