Hume EVI
also known as Empathic Voice Interface, Hume Speech-to-Speech
Type: full-code · Vendor: Hume AI · Language: API · License: proprietary · Status: active · Status in practice: emerging
Hosted speech-to-speech voice API from Hume AI that pairs an emotionally aware response model with a configurable supplemental LLM, measuring vocal prosody and adapting tone in real time.
Description. EVI is Hume AI's real-time speech-to-speech interface. It streams measurements of the tune, rhythm and timbre of the user's voice, responds with matching prosody, and remains interruptible at all times. EVI is configured through a Config object (system prompt, voice, supplemental LLM, tools) and can be supplemented with partner LLMs from Anthropic, OpenAI, Google or Fireworks. Both developer-defined and built-in tools are supported, but parallel function calls are not yet supported. Chat Groups link sessions so that a conversation can resume across disconnects.
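To make the four Config pieces concrete, here is a minimal Python sketch of a config payload. The class names, field names (`system_prompt`, `voice`, `supplemental_llm`, `tools`) and the placeholder values are illustrative assumptions that mirror the description above; they are not Hume's actual schema, which should be taken from the Hume configuration docs.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


# Hypothetical shapes: names follow the prose description, not Hume's real schema.
@dataclass
class SupplementalLLM:
    provider: str  # e.g. "anthropic", "openai", "google", "fireworks"
    model: str     # placeholder model identifier


@dataclass
class EviConfigSketch:
    system_prompt: str
    voice: str
    supplemental_llm: Optional[SupplementalLLM] = None  # optional, per the description
    tools: List[str] = field(default_factory=list)      # names of tools to expose

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclass, so the result is plain JSON.
        return json.dumps(asdict(self), indent=2)


config = EviConfigSketch(
    system_prompt="You are a calm, concise support agent.",
    voice="example-voice",  # placeholder voice name
    supplemental_llm=SupplementalLLM(provider="anthropic", model="example-model"),
    tools=["get_weather"],
)
print(config.to_json())
```

Keeping the supplemental LLM optional reflects that EVI can run with or without a partner model.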
Agent loop shape. Hosted speech-to-speech loop over a WebSocket. Caller audio streams in; EVI emits prosody measurements, decides the response (optionally via a supplemental LLM), streams generated speech back, and is interruptible by design. Tool calls are routed to the developer's backend, except for Hume's built-in tools, which EVI invokes itself. A Chat Group ID can be passed to resume a conversation across reconnects.
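The client side of this loop can be sketched as a handler that consumes server events in arrival order. This is a network-free simulation: the event type names (`prosody`, `audio_output`, `user_interruption`) and their fields are assumptions chosen to match the description, not Hume's documented message schema.

```python
from typing import Dict, Iterable, List, Tuple


def run_voice_loop(events: Iterable[Dict]) -> Tuple[List[str], List[Dict]]:
    """Process simulated server events; return (pending playback, prosody log)."""
    playback_queue: List[str] = []  # generated speech chunks not yet played
    prosody_log: List[Dict] = []    # prosody measurements received so far

    for event in events:
        kind = event.get("type")
        if kind == "prosody":
            # Measurements of the user's voice arrive alongside the conversation.
            prosody_log.append(event["scores"])
        elif kind == "audio_output":
            # Generated speech is streamed back in chunks and queued for playback.
            playback_queue.append(event["chunk"])
        elif kind == "user_interruption":
            # The user spoke over the agent: drop any speech still queued.
            playback_queue.clear()
        # Any other event type is ignored in this sketch.

    return playback_queue, prosody_log


simulated = [
    {"type": "prosody", "scores": {"calm": 0.8}},
    {"type": "audio_output", "chunk": "a1"},
    {"type": "audio_output", "chunk": "a2"},
    {"type": "user_interruption"},
    {"type": "audio_output", "chunk": "b1"},
]
queue, prosody = run_voice_loop(simulated)
```

Clearing the queue on interruption is what makes the agent stop speaking promptly; the next response then starts from an empty queue.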
Primary use cases
- emotionally aware voice agents across consumer and support apps
- multilingual speech-to-speech experiences on EVI 4 / 4-mini
- function-calling voice agents that hit external APIs
- long-running conversations resumed via Chat Groups
Key concepts
- EVI speech-to-speech (docs) — Streaming voice loop that measures vocal prosody and matches tone.
- Configuration (docs) — EVI Config defines system behavior, voice and supplemental LLM.
- Supplemental LLM (docs) — Optional partner LLMs from Anthropic / OpenAI / Google / Fireworks plug into the loop.
- Tool use → tool-use (docs) — Developer-defined functions and Hume built-in tools; no parallel calls.
- Multilingual EVI 4-mini → multilingual-voice-agent (docs) — Languages: English, Japanese, Korean, Spanish, French, Portuguese, Italian, German, Russian, Hindi, Arabic.
- Chat Groups → agent-resumption (docs) — Link related chats so a conversation persists across disconnects.
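As an illustration of the Chat Groups idea, the sketch below builds a connection URL that optionally carries a Chat Group identifier, so a reconnecting client can ask to continue the earlier conversation. The base URL is a placeholder, and the query parameter names (`config_id`, `resumed_chat_group_id`) are assumptions to verify against Hume's API reference.

```python
from typing import Optional
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; not Hume's real WebSocket URL.
BASE_URL = "wss://example.invalid/evi/chat"


def build_chat_url(config_id: str, chat_group_id: Optional[str] = None) -> str:
    """Return a connection URL; include the group id only when resuming."""
    params = {"config_id": config_id}
    if chat_group_id is not None:
        # Assumed parameter name for resuming within an existing Chat Group.
        params["resumed_chat_group_id"] = chat_group_id
    return f"{BASE_URL}?{urlencode(params)}"


first_url = build_chat_url("cfg-123")
resume_url = build_chat_url("cfg-123", chat_group_id="grp-1")
```

In practice the client would store the group identifier reported for the first chat and pass it back on every reconnect.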
Patterns this full-code implements
- ★★Agent Resumption
Chat Groups bundle individual chats so a conversation can resume across reconnects; a chat_group_id connects each chat to its group.
- ★Multilingual Voice Agent Stack
EVI 4-mini ships an explicit multilingual roster spanning eleven languages.
- ★★Tool Use
Developer-defined function calls plus Hume built-in tools; explicit caveat that parallel calls are not supported.
- ★★Stop / Cancel
EVI is documented as always interruptible, stopping rapidly when the user interjects, then resuming with the right context.
- ★★Multi-Model Routing
EVI can be supplemented with configurable partner LLMs from multiple vendors; the language-model configuration page explicitly lists Claude, GPT, and Gemini as external options.
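Because parallel calls are not supported, a developer backend for the Tool Use pattern can handle tool calls strictly one at a time. The sketch below shows one way to do that; the message fields (`name`, `tool_call_id`, `parameters`), the response types (`tool_response`, `tool_error`) and the `get_weather` tool are hypothetical and should be checked against Hume's tool-use docs.

```python
import json
from typing import Callable, Dict, List


def get_weather(city: str) -> str:
    """Hypothetical developer-defined tool; a real one would call an external API."""
    return f"Sunny in {city}"


# Registry mapping tool names to the functions that implement them.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def handle_tool_calls(calls: List[Dict]) -> List[Dict]:
    """Run each tool call in order, one at a time, and collect the replies."""
    responses: List[Dict] = []
    for call in calls:
        fn = TOOLS.get(call["name"])
        if fn is None:
            # Unknown tool: report an error instead of raising.
            responses.append({
                "type": "tool_error",
                "tool_call_id": call["tool_call_id"],
                "error": f"unknown tool: {call['name']}",
            })
            continue
        # Assumption: arguments arrive as a JSON-encoded string.
        args = json.loads(call["parameters"])
        responses.append({
            "type": "tool_response",
            "tool_call_id": call["tool_call_id"],
            "content": fn(**args),
        })
    return responses


replies = handle_tool_calls([
    {"name": "get_weather", "tool_call_id": "t1", "parameters": '{"city": "Oslo"}'},
    {"name": "book_flight", "tool_call_id": "t2", "parameters": "{}"},
])
```

Echoing the call identifier in each reply lets the caller match results to requests even though calls never overlap.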