LiveKit Agents
Type: full-code · Vendor: LiveKit · Language: Python, Node.js · License: Apache-2.0 · Status: active · Status in practice: mature · First released: 2023-10-19
Open-source realtime agent framework that lets a Python or Node.js process join a LiveKit room as a full participant, with an STT-LLM-TTS pipeline, turn detection, tool calling and worker-based job dispatch.
Description. LiveKit Agents is a framework for building realtime voice, video and multimodal agents that operate as participants in LiveKit rooms. The agent code acts as a stateful bridge between AI models and users: it streams audio through an STT-LLM-TTS pipeline, runs a custom turn-detection model for lifelike conversation flow, handles interruptions, and exposes a function_tool decorator so any LLM can call tools (including forwarding calls to the frontend). Agents run as workers; the agent server boots a 'job' subprocess that joins each room. Plugins cover STT, LLM, TTS and Realtime APIs.
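The interruption handling described above can be illustrated with a small sketch. It uses only Python's asyncio and does not import livekit-agents: fake_stt, fake_llm, fake_tts, run_turn and demo are hypothetical stand-ins, and the real framework streams audio frames rather than whole strings. The point is only the shape: the agent's reply runs as a cancellable task, and a user barge-in cancels it mid-utterance.

```python
import asyncio


async def fake_stt(audio: str) -> str:
    """Hypothetical STT stand-in: the 'audio' is already text here."""
    await asyncio.sleep(0)
    return audio.strip()


async def fake_llm(transcript: str) -> str:
    """Hypothetical LLM stand-in that echoes the transcript."""
    await asyncio.sleep(0)
    return f"You said: {transcript}"


async def fake_tts(text: str, spoken: list[str]) -> None:
    """Hypothetical TTS stand-in: 'speaks' one word at a time so it can be cut off."""
    for word in text.split():
        await asyncio.sleep(0.01)
        spoken.append(word)


async def run_turn(audio: str, spoken: list[str]) -> None:
    """One STT -> LLM -> TTS turn."""
    transcript = await fake_stt(audio)
    reply = await fake_llm(transcript)
    await fake_tts(reply, spoken)


async def demo() -> tuple[list[str], bool]:
    spoken: list[str] = []
    turn = asyncio.create_task(run_turn("hello there agent how are you", spoken))
    await asyncio.sleep(0.025)  # the user starts talking over the agent
    turn.cancel()               # interruption: stop the reply in progress
    try:
        await turn
    except asyncio.CancelledError:
        interrupted = True
    else:
        interrupted = False
    return spoken, interrupted


if __name__ == "__main__":
    words, interrupted = asyncio.run(demo())
    print(f"interrupted={interrupted}, spoken so far: {' '.join(words)}")
```

Only the first few words of the eight-word reply are spoken before the cancellation lands; a production pipeline would also need to truncate the chat history to what was actually heard.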
Agent loop shape. Worker-dispatched, room-scoped event loop. The agent server (worker) registers with the LiveKit server; when a room is created, a job is dispatched to it, and the job subprocess joins the room, instantiates an AgentSession with STT, LLM, TTS and turn-detection plugins, and runs the streaming pipeline. Tools defined with @function_tool are exposed to the LLM, and tool calls can be forwarded to the frontend.
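The dispatch flow above can be modelled in a few lines. This is a conceptual sketch, not the livekit-agents API: SketchWorker and SketchDispatcher are hypothetical names, the "job" is a list entry rather than a real subprocess, and the least-loaded selection rule is an assumption standing in for whatever load balancing the real system applies.

```python
from dataclasses import dataclass, field


@dataclass
class SketchWorker:
    """Hypothetical stand-in for a registered worker."""
    name: str
    rooms: list[str] = field(default_factory=list)  # one entry per running job

    def accept(self, room: str) -> str:
        # The real framework would boot a job subprocess here that joins the room.
        self.rooms.append(room)
        return f"{self.name} joined {room}"


@dataclass
class SketchDispatcher:
    """Hypothetical stand-in for the component that hands out jobs."""
    workers: list[SketchWorker] = field(default_factory=list)

    def register(self, worker: SketchWorker) -> None:
        self.workers.append(worker)

    def room_created(self, room: str) -> str:
        # Assumed policy: pick the worker with the fewest active jobs
        # (ties go to the worker that registered first).
        worker = min(self.workers, key=lambda w: len(w.rooms))
        return worker.accept(room)


if __name__ == "__main__":
    dispatcher = SketchDispatcher()
    dispatcher.register(SketchWorker("worker-a"))
    dispatcher.register(SketchWorker("worker-b"))
    for room in ("room-1", "room-2", "room-3"):
        print(dispatcher.room_created(room))
```

Each room maps to exactly one job, which is what makes the loop room-scoped.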
Primary use cases
- realtime voice agents over WebRTC and telephony
- multimodal assistants that hear, see and speak
- multi-agent handoff over LiveKit rooms
- outbound calling, transcription and realtime translation
Key concepts
- Agent (docs) — A realtime participant that runs on the server and bridges users with AI models.
- STT-LLM-TTS pipeline (docs) — Streaming audio pipeline with reliable turn detection and interruption handling.
- Turn detection → stop-cancel (docs) — Custom transformer model that detects end-of-turn, so the agent is less likely to cut in before the user has finished speaking.
- @function_tool → tool-use (docs) — Decorator that exposes Python functions as LLM-callable tools, forwardable to frontend.
- Worker / job (docs) — The agent server boots a job subprocess per room with load balancing.
- Plugins (docs) — Mix-and-match STT, LLM, TTS and Realtime API integrations.
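To make the decorator-based tool concept concrete, here is a minimal sketch of how a decorator can turn a plain Python function into something an LLM can call by name. It deliberately does not use the real @function_tool: sketch_tool, TOOLS, describe, call_tool and lookup_weather are hypothetical names, and the schema format is an assumption chosen for readability.

```python
import inspect
from typing import Any, Callable

# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {}


def sketch_tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register fn under its own name and return it unchanged."""
    TOOLS[fn.__name__] = fn
    return fn


def describe(fn: Callable[..., Any]) -> dict:
    """Build a simple description an LLM could be shown (assumed format)."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


@sketch_tool
def lookup_weather(city: str, days: int) -> str:
    """Return a canned forecast for a city."""
    return f"{days}-day forecast for {city}: sunny"


def call_tool(name: str, arguments: dict) -> Any:
    """Dispatch a tool call of the form the model might emit."""
    return TOOLS[name](**arguments)


if __name__ == "__main__":
    print(describe(lookup_weather))
    print(call_tool("lookup_weather", {"city": "Oslo", "days": 2}))
```

The function name, docstring and type hints carry all the information the model needs, which is why a decorator is a convenient surface for tool definitions.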
Patterns this full-code framework implements
- ★★Stop / Cancel
Pipeline ships reliable turn detection and interruption handling; custom transformer model detects end-of-turn.
- ★★Tool Use
@function_tool decorator exposes tools to any LLM; tool calls can be forwarded to the frontend.
- ★★Event-Driven Agent
Agent is a realtime participant whose code reacts to room events and audio frames as they arrive.
- ★★Conversation Handoff to Human
A handoff transfers session control from one agent to another; returning a different agent from a tool call triggers the handoff automatically and adds an AgentHandoff item to the chat context. Note that this mechanism is agent-to-agent.
- ★★Session Isolation
The agent server boots a separate job subprocess for each room, with load balancing, so each session runs in its own process.
- ★★Model Context Protocol
First-class MCP support via MCPToolset passed to an agent's tools parameter; MCPServerHTTP for remote and MCPServerStdio for local subprocess servers.
- ★Multilingual Voice Agent Stack
Plugins expose multilingual modes (e.g. Deepgram nova-3 language='multi'); realtime translation is a documented use case.
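The handoff behaviour listed above (a tool call that returns a different agent hands over the session and records a handoff item) can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: SketchAgent, SketchSession, run_tool and the dictionary item shapes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class SketchAgent:
    """Hypothetical agent: a name plus zero-argument tools."""
    name: str
    tools: dict[str, Callable[[], Any]] = field(default_factory=dict)


@dataclass
class SketchSession:
    """Hypothetical session that tracks the active agent and chat context."""
    agent: SketchAgent
    chat_ctx: list[dict] = field(default_factory=list)

    def run_tool(self, tool_name: str) -> None:
        result = self.agent.tools[tool_name]()
        self.chat_ctx.append({"type": "tool_call", "name": tool_name})
        # Assumed rule: a tool that returns a different agent triggers a handoff.
        if isinstance(result, SketchAgent) and result is not self.agent:
            self.chat_ctx.append(
                {"type": "agent_handoff", "from": self.agent.name, "to": result.name}
            )
            self.agent = result


if __name__ == "__main__":
    billing = SketchAgent("billing")
    greeter = SketchAgent("greeter", tools={"to_billing": lambda: billing})
    session = SketchSession(greeter)
    session.run_tool("to_billing")
    print(session.agent.name, session.chat_ctx)
```

Because the chat context survives the switch, the receiving agent can see that a handoff occurred and who it came from.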