LiveKit Agents

Realtime framework for production-grade multimodal and voice AI agents.

Introduction

The Agents framework allows you to add a Python or Node.js program to any LiveKit room as a full realtime participant. The SDK includes a complete set of tools and abstractions that make it easy to feed realtime media and data through an AI pipeline that works with any provider, and to publish realtime results back to the room.

If you want to get your hands on the code right away, follow this quickstart guide. It takes just a few minutes to build your first voice agent.

Voice AI quickstart

Build a simple voice assistant with Python in less than 10 minutes.

Use cases

Some applications for agents include:

  • Multimodal assistant: Talk, text, or screen share with an AI assistant.
  • Telehealth: Bring AI into realtime telemedicine consultations, with or without humans in the loop.
  • Call center: Deploy AI to the front lines of customer service with inbound and outbound call support.
  • Realtime translation: Translate conversations in realtime.
  • NPCs: Add lifelike NPCs backed by language models instead of static scripts.
  • Robotics: Put your robot's brain in the cloud, giving it access to the most powerful models.


Framework overview

Diagram showing framework overview.

Your agent code operates as a stateful, realtime bridge between powerful AI models and your users. While AI models typically run in data centers with reliable connectivity, users often connect from mobile networks with varying quality.

WebRTC ensures smooth communication between agents and users, even over unstable connections. LiveKit WebRTC is used between the frontend and the agent, while the agent communicates with your backend using HTTP and WebSockets. This setup provides the benefits of WebRTC without its typical complexity.

The agents SDK includes components for handling the core challenges of realtime voice AI, such as streaming audio through an STT-LLM-TTS pipeline, reliable turn detection, handling interruptions, and LLM orchestration. It supports plugins for most major AI providers, with more continually added. The framework is fully open source and supported by an active community.

Other framework features include:

  • Voice, video, and text: Build agents that can process realtime input and produce output in any modality.
  • Tool use: Define tools that are compatible with any LLM, and even forward tool calls to your frontend.
  • Multi-agent handoff: Break down complex workflows into simpler tasks.
  • Extensive integrations: Integrate with nearly every major AI provider for LLMs, STT, TTS, and more.
  • State-of-the-art turn detection: Use the custom turn detection model for lifelike conversation flow.
  • Made for developers: Build your agents in code, not configuration.
  • Production ready: Includes built-in worker orchestration, load balancing, and Kubernetes compatibility.
  • Open source: The framework and entire LiveKit ecosystem are open source under the Apache 2.0 license.

How agents connect to LiveKit

Diagram showing a high-level view of how agents work.

When your agent code starts, it first registers with a LiveKit server (either self-hosted or LiveKit Cloud) to run as a "worker" process. The worker waits until it receives a dispatch request. To fulfill the request, the worker boots a "job" subprocess that joins the room. By default, workers are dispatched to each new room created in your LiveKit project. To learn more about workers, see the Worker lifecycle guide.

After your agent and user join a room, the agent and your frontend app can communicate using LiveKit WebRTC. This enables fast, reliable realtime communication even under poor network conditions. LiveKit also includes full support for telephony, so the user can join the call from a phone instead of a frontend app.

To learn more about how LiveKit works overall, see the Intro to LiveKit guide.

Getting started

Follow these guides to learn more and get started with LiveKit Agents.

Voice AI quickstart

Build a simple voice assistant with Python in less than 10 minutes.
