Introduction
The Agents framework allows you to add a Python or Node.js program to any LiveKit room as a full realtime participant. The SDK includes a complete set of tools and abstractions that make it easy to feed realtime media and data through an AI pipeline that works with any provider, and to publish realtime results back to the room.
If you want to get your hands on the code right away, follow this quickstart guide. It takes just a few minutes to build your first voice agent.
Voice AI quickstart
Build a simple voice assistant with Python in less than 10 minutes.
GitHub repository
Python source code and examples for the LiveKit Agents SDK.
SDK reference
Python reference docs for the LiveKit Agents SDK.
Use cases
Some applications for agents include:
- Multimodal assistant: Talk, text, or screen share with an AI assistant.
- Telehealth: Bring AI into realtime telemedicine consultations, with or without humans in the loop.
- Call center: Deploy AI to the front lines of customer service with inbound and outbound call support.
- Realtime translation: Translate conversations in realtime.
- NPCs: Add lifelike NPCs backed by language models instead of static scripts.
- Robotics: Put your robot's brain in the cloud, giving it access to the most powerful models.
The following recipes demonstrate some of these use cases:
Medical Office Triage
Restaurant Agent
Company Directory
Pipeline Translator
Framework overview
Your agent code operates as a stateful, realtime bridge between powerful AI models and your users. While AI models typically run in data centers with reliable connectivity, users often connect from mobile networks with varying quality.
WebRTC ensures smooth communication between agents and users, even over unstable connections. LiveKit WebRTC is used between the frontend and the agent, while the agent communicates with your backend using HTTP and WebSockets. This setup provides the benefits of WebRTC without its typical complexity.
The Agents SDK includes components for handling the core challenges of realtime voice AI, such as streaming audio through an STT-LLM-TTS pipeline, detecting turns reliably, handling interruptions, and orchestrating LLM calls. It supports plugins for most major AI providers, with more continually added. The framework is fully open source and supported by an active community.
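The flow an STT-LLM-TTS pipeline follows can be sketched in plain Python. This is a conceptual illustration only, with hypothetical stub classes (`FakeSTT`, `FakeLLM`, `FakeTTS`, `VoicePipeline`) standing in for provider plugins; it is not the LiveKit Agents API:

```python
# Conceptual sketch of an STT-LLM-TTS voice pipeline. All names here are
# hypothetical stubs, not the LiveKit Agents SDK: each stage consumes the
# previous stage's output, and an interruption drops any queued speech.

class FakeSTT:
    """Stands in for a speech-to-text provider plugin."""
    def transcribe(self, audio_frame: str) -> str:
        return f"transcript of {audio_frame}"

class FakeLLM:
    """Stands in for an LLM provider plugin."""
    def complete(self, transcript: str) -> str:
        return f"reply to '{transcript}'"

class FakeTTS:
    """Stands in for a text-to-speech provider plugin."""
    def synthesize(self, text: str) -> str:
        return f"audio for '{text}'"

class VoicePipeline:
    def __init__(self, stt, llm, tts):
        self.stt, self.llm, self.tts = stt, llm, tts
        self.pending_speech: list[str] = []

    def on_user_audio(self, audio_frame: str) -> str:
        # STT -> LLM -> TTS, queuing synthesized speech for playback.
        transcript = self.stt.transcribe(audio_frame)
        reply = self.llm.complete(transcript)
        speech = self.tts.synthesize(reply)
        self.pending_speech.append(speech)
        return speech

    def on_interruption(self) -> None:
        # The user started speaking: cancel queued agent speech.
        self.pending_speech.clear()

pipeline = VoicePipeline(FakeSTT(), FakeLLM(), FakeTTS())
pipeline.on_user_audio("frame-1")
pipeline.on_interruption()  # pending speech is now empty
```

In the real SDK, each stage streams incrementally rather than returning whole results, which is what makes low-latency turn taking and mid-utterance interruption possible.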
Other framework features include:
- Voice, video, and text: Build agents that can process realtime input and produce output in any modality.
- Tool use: Define tools that are compatible with any LLM, and even forward tool calls to your frontend.
- Multi-agent handoff: Break down complex workflows into simpler tasks.
- Extensive integrations: Integrate with nearly every major AI provider for LLMs, STT, TTS, and more.
- State-of-the-art turn detection: Use the custom turn detection model for lifelike conversation flow.
- Made for developers: Build your agents in code, not configuration.
- Production ready: Includes built-in worker orchestration, load balancing, and Kubernetes compatibility.
- Open source: The framework and entire LiveKit ecosystem are open source under the Apache 2.0 license.
How agents connect to LiveKit
When your agent code starts, it first registers with a LiveKit server (either self-hosted or LiveKit Cloud) to run as a "worker" process. The worker waits until it receives a dispatch request. To fulfill this request, the worker boots a "job" subprocess that joins the room. By default, your workers are dispatched to each new room created in your LiveKit project. To learn more about workers, see the Worker lifecycle guide.
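The register-dispatch-job flow described above can be sketched as follows. The `Server` and `Worker` classes here are hypothetical stand-ins for illustration, not the LiveKit Agents API:

```python
# Conceptual sketch of the worker lifecycle (hypothetical classes, not the
# LiveKit Agents API): a worker registers with a server, waits for dispatch
# requests, and starts one job per room it is dispatched to.

class Server:
    """Stands in for a LiveKit server (self-hosted or LiveKit Cloud)."""
    def __init__(self):
        self.workers = []

    def register(self, worker) -> None:
        # A worker registers when its process starts.
        self.workers.append(worker)

    def create_room(self, room_name: str) -> None:
        # By default, every registered worker is dispatched to each new room.
        for worker in self.workers:
            worker.on_dispatch(room_name)

class Worker:
    def __init__(self):
        self.jobs = []

    def on_dispatch(self, room_name: str) -> None:
        # In the real framework this boots a separate "job" subprocess;
        # here we only record the job that would join the room.
        self.jobs.append(f"job joined {room_name}")

server = Server()
worker = Worker()
server.register(worker)        # worker registers at startup
server.create_room("room-1")   # new room triggers a dispatch
```

Running jobs in separate subprocesses isolates each session, so one misbehaving session cannot take down the worker or other rooms it is serving.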
After your agent and user join a room, the agent and your frontend app can communicate using LiveKit WebRTC. This enables reliable and fast realtime communication in any network conditions. LiveKit also includes full support for telephony, so the user can join the call from a phone instead of a frontend app.
To learn more about how LiveKit works overall, see the Intro to LiveKit guide.
Getting started
Follow these guides to learn more and get started with LiveKit Agents.
Voice AI quickstart
Build a simple voice assistant with Python in less than 10 minutes.
Recipes
A comprehensive collection of examples, guides, and recipes for LiveKit Agents.
Intro to LiveKit
An overview of the LiveKit ecosystem.
Web and mobile frontends
Put your agent in your pocket with a custom web or mobile app.
Telephony integration
Your agent can place and receive calls with LiveKit's SIP integration.
Building voice agents
Comprehensive documentation to build advanced voice AI apps with LiveKit.
Worker lifecycle
Learn how to manage your agents with workers and jobs.
Deploying to production
Guide to deploying your voice agent in a production environment.
Integration guides
Explore the full list of AI providers available for LiveKit Agents.