Framework · Voice & Conversational

LiveKit Agents

Open-source realtime agent framework that lets a Python or Node.js process join a LiveKit room as a full participant, with an STT-LLM-TTS pipeline, turn detection, tool calling and worker-based job dispatch.

Description

LiveKit Agents is a framework for building realtime voice, video and multimodal agents that operate as participants in LiveKit rooms. The agent code acts as a stateful bridge between AI models and users: it streams audio through an STT-LLM-TTS pipeline, runs a custom turn-detection model to decide when the user has finished speaking, handles interruptions, and exposes a function_tool decorator so that any LLM can call tools, including tool calls that are forwarded to the frontend. Agent code runs as a worker (also called an agent server) that registers with a LiveKit server; for each dispatch request the agent server boots a 'job' subprocess that joins the room. Plugins cover STT, LLM, TTS and Realtime APIs.
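The turn flow described above can be pictured with a small standard-library sketch. It is only a toy model under stated assumptions: toy_tool, TOOLS, fake_stt, fake_llm, fake_tts and run_turn are hypothetical names invented for illustration, the "audio" is plain UTF-8 text, and none of this is the LiveKit API or how function_tool is implemented.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical tool registry. LiveKit's real decorator is function_tool;
# its internals are not modelled here.
TOOLS: Dict[str, Callable[..., Awaitable[str]]] = {}


def toy_tool(fn: Callable[..., Awaitable[str]]) -> Callable[..., Awaitable[str]]:
    """Register an async function so the stand-in LLM can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@toy_tool
async def lookup_weather(city: str) -> str:
    # Illustrative tool with a canned answer.
    return f"Sunny in {city}"


async def fake_stt(audio: bytes) -> str:
    # Stand-in for speech-to-text: the "audio" is just UTF-8 text.
    return audio.decode("utf-8")


async def fake_llm(text: str) -> str:
    # Stand-in for an LLM: a keyword rule decides whether to call a tool.
    prefix = "weather in "
    if text.lower().startswith(prefix):
        return await TOOLS["lookup_weather"](city=text[len(prefix):])
    return f"You said: {text}"


async def fake_tts(text: str) -> bytes:
    # Stand-in for text-to-speech: encode the reply back to bytes.
    return text.encode("utf-8")


async def run_turn(audio: bytes) -> bytes:
    """One user turn through the STT -> LLM (with tools) -> TTS stages."""
    transcript = await fake_stt(audio)
    reply = await fake_llm(transcript)
    return await fake_tts(reply)


if __name__ == "__main__":
    print(asyncio.run(run_turn(b"weather in Oslo")))
```

In the real framework each stage is a streaming plugin and the LLM itself chooses which tool to call; the keyword rule here exists only to keep the sketch deterministic.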

Solution

Worker-dispatched, room-scoped event loop. A worker (agent server) registers with the LiveKit server; when a room is created, the LiveKit server dispatches a job to the worker. The worker boots a job subprocess, which joins the room, instantiates an AgentSession with STT, LLM, TTS and turn-detection plugins, and runs the streaming pipeline. Tools defined with @function_tool are exposed to the LLM, including tool calls that are forwarded to the frontend.
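The dispatch sequence can be sketched with asyncio alone. This is a toy model under stated assumptions, not LiveKit code: ToyServer, ToyRoom, register_worker, create_room and wait_for_jobs are hypothetical names, and an asyncio task stands in for the job subprocess that a real worker would boot.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List


@dataclass
class ToyRoom:
    """Hypothetical room: just a name and a list of participant identities."""
    name: str
    participants: List[str] = field(default_factory=list)


JobEntrypoint = Callable[[ToyRoom], Awaitable[None]]


class ToyServer:
    """Stand-in for the server side: tracks workers and dispatches jobs."""

    def __init__(self) -> None:
        self._workers: List[JobEntrypoint] = []
        self._jobs: List["asyncio.Task[None]"] = []

    def register_worker(self, entrypoint: JobEntrypoint) -> None:
        # A worker announces itself by handing over its job entrypoint.
        self._workers.append(entrypoint)

    def create_room(self, name: str) -> ToyRoom:
        room = ToyRoom(name)
        if self._workers:
            # One job per room. A real worker would boot a subprocess;
            # an asyncio task keeps this sketch short.
            job = asyncio.ensure_future(self._workers[0](room))
            self._jobs.append(job)
        return room

    async def wait_for_jobs(self) -> None:
        await asyncio.gather(*self._jobs)


async def entrypoint(room: ToyRoom) -> None:
    # The job joins the room as a participant; in the real framework the
    # session with STT, LLM, TTS and turn detection would be started here.
    room.participants.append("agent")


async def main() -> List[str]:
    server = ToyServer()
    server.register_worker(entrypoint)
    room = server.create_room("support-call")
    await server.wait_for_jobs()
    return room.participants


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping the job scoped to one room means per-conversation state lives and dies with that job, which is the property the worker model is meant to provide.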

Primary use cases

  • realtime voice agents over WebRTC and telephony
  • multimodal assistants that hear, see and speak
  • multi-agent handoff over LiveKit rooms
  • outbound calling, transcription and realtime translation
