Vapi
Type: low-code · Vendor: Vapi · Language: Web product / API · License: proprietary · Status: active · Status in practice: mature
Hosted voice AI platform that orchestrates a transcriber, model and voice provider into a phone-callable assistant, with squads for multi-assistant handoff, function-calling tools and multilingual voice agents.
Description. Vapi is positioned as an orchestration layer over three pluggable modules — transcriber (STT), model (LLM) and voice (TTS) — plus a suite of real-time orchestration models for endpointing, interruption handling, backchanneling and emotion detection. Developers configure an Assistant, attach Tools (function calling, custom and code tools, transfer call), and optionally compose Squads where specialised assistants hand off to one another mid-call. Multilingual mode adds automatic language detection.
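Sketch. The three-module layout described above can be pictured as a plain configuration object. This is an illustration only: the type names, field names and provider strings below are assumptions made for this example, not Vapi's documented schema.

```typescript
// Hypothetical types for illustration; not taken from Vapi's API reference.
type ModuleConfig = { provider: string; model?: string };

interface AssistantSketch {
  name: string;
  transcriber: ModuleConfig; // speech-to-text module
  model: ModuleConfig;       // LLM module
  voice: ModuleConfig;       // text-to-speech module
  tools: string[];           // names of attached tools
}

type ModuleSlot = "transcriber" | "model" | "voice";

// Return a copy of the assistant with one module swapped;
// the other two modules and the original object are left untouched.
function withModule(
  assistant: AssistantSketch,
  slot: ModuleSlot,
  config: ModuleConfig,
): AssistantSketch {
  const next: AssistantSketch = { ...assistant };
  next[slot] = config;
  return next;
}

// Placeholder provider strings, chosen only to make the example concrete.
const receptionist: AssistantSketch = {
  name: "receptionist",
  transcriber: { provider: "deepgram" },
  model: { provider: "openai" },
  voice: { provider: "playht" },
  tools: ["transferCall"],
};

const withNewVoice = withModule(receptionist, "voice", { provider: "elevenlabs" });
```

Keeping each module behind its own slot is what makes a provider swap a one-field change in this sketch.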
Agent loop shape. Hosted real-time pipeline. Caller audio flows into Vapi's transcriber; the LLM produces a response interleaved with tool calls; a chosen TTS speaks the result back; a suite of real-time models layered on top of STT/LLM/TTS handles endpointing, interruption, backchanneling and emotion. Squads compose multiple specialised assistants and route the live call between them via assistantDestinations.
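Sketch. The turn described above (transcribe, let the LLM alternate between tool calls and speech, then synthesise) can be approximated with stub functions. Every name here is a hypothetical stand-in rather than a Vapi API, the stubs are synchronous for brevity where a real pipeline streams audio asynchronously, and the cap on tool calls is an assumption added to keep the loop bounded.

```typescript
// All names are hypothetical stand-ins; none of this is Vapi's API.
type LlmStep =
  | { kind: "say"; text: string }
  | { kind: "toolCall"; name: string; args: Record<string, string> };

interface Pipeline {
  transcribe(audio: string): string;    // STT stub
  nextStep(history: string[]): LlmStep; // LLM stub
  runTool(name: string, args: Record<string, string>): string;
  speak(text: string): string;          // TTS stub
}

const FALLBACK = "Sorry, I could not complete that request.";

// One caller turn: STT, then the LLM loops over tool calls until it
// produces something to say (or hits the cap), then TTS.
function handleTurn(p: Pipeline, audio: string, maxToolCalls = 3): string {
  const history: string[] = ["user: " + p.transcribe(audio)];
  let toolCalls = 0;
  while (true) {
    const step = p.nextStep(history);
    if (step.kind === "say") return p.speak(step.text);
    if (toolCalls >= maxToolCalls) return p.speak(FALLBACK);
    toolCalls += 1;
    history.push("tool " + step.name + ": " + p.runTool(step.name, step.args));
  }
}
```

The endpointing, interruption and backchanneling models mentioned above would sit around this loop; they are left out of the sketch.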
Primary use cases
- outbound and inbound phone agents
- customer support and lead qualification at scale
- appointment scheduling and reception
- multi-assistant workflows via squads
Key concepts
- Assistant (docs) — Configurable agent with transcriber, model, voice and tools.
- Orchestration layer (docs) — Vapi sits over transcriber + model + voice and adds real-time models.
- Squads → conversation-handoff (docs) — Multiple specialised assistants that hand off mid-call.
- Tools → tool-use (docs) — Function calling, custom tools, code tools and call transfer.
- Multilingual → multilingual-voice-agent (docs) — Multilingual assistants with automatic language detection.
- Voice pipeline configuration (docs) — Endpointing, interruption handling, backchanneling, filler injection.
Patterns this low-code platform implements
- ★★Conversation Handoff to Human
Squads let calls hand off between specialised assistants via assistantDestinations.
- ★Multilingual Voice Agent Stack
Explicit multilingual mode with automatic language detection across the orchestrated STT/LLM/TTS stack; multiple transcriber providers expose multi/multilingual modes.
- ★★Tool Use
Four explicit tool types: Default (built-in), Custom (webhooks), Code (TypeScript on Vapi infra), Integration (Make/GHL).
- ★★Stop / Cancel
Voice pipeline configuration covers timing and interruption handling; orchestration models include interruption handling and backchanneling.
- ★★Multi-Model Routing
Assistant config selects the transcriber, LLM and voice independently; each of the three modules can be swapped among supported providers (OpenAI, Groq, Deepgram, ElevenLabs, PlayHT, etc.).
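Sketch. The squad handoff listed under Conversation Handoff above can be modelled as a lookup over each member's allowed destinations. Only the name assistantDestinations comes from the text; the surrounding shapes, the assistant names and the "stay on the current assistant" fallback are assumptions made for this example.

```typescript
// Hypothetical shapes; only `assistantDestinations` is a name from the text above.
interface SquadMember {
  assistantName: string;
  assistantDestinations: string[]; // assistants this member may hand off to
}

// Return the assistant that should own the call after a handoff request.
// The call stays with the current assistant if the target is not an
// allowed destination or is not a member of the squad.
function handOff(squad: SquadMember[], current: string, requested: string): string {
  const member = squad.filter((m) => m.assistantName === current)[0];
  if (member === undefined) {
    throw new Error("Unknown assistant: " + current);
  }
  const isAllowed = member.assistantDestinations.indexOf(requested) !== -1;
  const exists = squad.some((m) => m.assistantName === requested);
  return isAllowed && exists ? requested : current;
}

// Example squad with made-up assistant names.
const squad: SquadMember[] = [
  { assistantName: "triage", assistantDestinations: ["billing", "scheduling"] },
  { assistantName: "billing", assistantDestinations: ["triage"] },
  { assistantName: "scheduling", assistantDestinations: [] },
];
```

In this sketch, handOff(squad, "triage", "billing") moves the call to billing, while handOff(squad, "scheduling", "billing") leaves it with scheduling because that member lists no destinations.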