Overview
This guide walks you through the setup of your very first voice assistant using LiveKit Agents for Python. In less than 10 minutes, you'll have a voice assistant that you can speak to in your terminal, browser, telephone, or native app.
Requirements
The following sections describe the minimum requirements to get started with LiveKit Agents.
Python
LiveKit Agents requires Python 3.9 or later.
The Node.js beta is still in development and has not yet reached v1.0. See the v0.x documentation for Node.js reference and join the LiveKit Community Slack to be the first to know when the next release is available.
LiveKit server
You need a LiveKit server instance to transport realtime media between user and agent. The easiest way to get started is with a free LiveKit Cloud account. Create a project and use the API keys in the following steps. You may also self-host LiveKit if you prefer.
AI providers
LiveKit Agents integrates with most AI model providers and supports both high-performance STT-LLM-TTS voice pipelines and lifelike multimodal models.
The rest of this guide assumes you use one of the following two starter packs, which provide the best combination of value, features, and ease of setup.
Your agent strings together three specialized providers into a high-performance voice pipeline. You need accounts and API keys for each.
| Component | Provider | Plugin | Required Key |
|---|---|---|---|
| STT | Deepgram | `livekit-plugins-deepgram` | `DEEPGRAM_API_KEY` |
| LLM | OpenAI | `livekit-plugins-openai` | `OPENAI_API_KEY` |
| TTS | Cartesia | `livekit-plugins-cartesia` | `CARTESIA_API_KEY` |
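As a mental model, each user utterance moves through the three stages in order: speech is transcribed (STT), the transcript is sent to the LLM, and the reply is synthesized back to audio (TTS). The sketch below illustrates that hand-off with stand-in functions; the real plugins stream audio and tokens rather than passing plain strings.

```python
# Illustrative only: stand-in functions model the STT -> LLM -> TTS hand-off.
# None of these are LiveKit APIs; they exist only to show the data flow.

def stt(audio: str) -> str:
    """Pretend transcription: the 'audio' is already text in this sketch."""
    return audio

def llm(transcript: str) -> str:
    """Pretend language model: return a canned assistant reply."""
    return f"You said: {transcript}"

def tts(reply: str) -> str:
    """Pretend synthesis: tag the text that would become audio."""
    return f"<audio>{reply}</audio>"

def voice_pipeline(user_audio: str) -> str:
    # The pipeline is just function composition over the three stages.
    return tts(llm(stt(user_audio)))

print(voice_pipeline("hello agent"))
```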
Setup
Use the instructions in the following sections to set up your new project.
Packages
In addition to LiveKit Agents and plugins for the pipeline type you've chosen, install plugins for noise cancellation, VAD, and turn detection to make your voice AI app best-in-class.
This example integrates LiveKit Cloud enhanced noise cancellation. If you're not using LiveKit Cloud, omit the plugin and the `noise_cancellation` parameter from the following code.
LiveKit Agents v1.0 is currently available as a release candidate. The following commands install the latest pre-release packages.
```shell
pip install \
  "livekit-agents[openai,silero,deepgram,cartesia,turn-detector]~=1.0rc" \
  "livekit-plugins-noise-cancellation~=0.2" \
  "python-dotenv"
```
Environment variables
Create a file named `.env` and add your LiveKit credentials along with the necessary API keys for your AI providers.
```shell
DEEPGRAM_API_KEY=<Your Deepgram API Key>
OPENAI_API_KEY=<Your OpenAI API Key>
CARTESIA_API_KEY=<Your Cartesia API Key>
LIVEKIT_API_KEY=<your API Key>
LIVEKIT_API_SECRET=<your API Secret>
LIVEKIT_URL=<your LiveKit server URL>
```
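The `python-dotenv` package loads these `KEY=value` pairs into the process environment when `load_dotenv()` is called. As a rough illustration of what that does under the hood, here is a simplified loader (the real library additionally handles quoting, comments at end of line, and variable interpolation):

```python
import os

def load_env_file(path: str) -> None:
    """Minimal .env loader: one KEY=value per line, no quoting rules.

    Simplified sketch of python-dotenv's behavior; existing environment
    variables are not overwritten, matching load_dotenv()'s default.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```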
Agent code
Create a file named `main.py` containing the following code for your first voice agent.
```python
from dotenv import load_dotenv

from livekit import agents
from livekit.agents.voice import AgentSession, Agent, room_io
from livekit.plugins import (
    openai,
    cartesia,
    deepgram,
    noise_cancellation,
    silero,
    turn_detector,
)

load_dotenv()


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o"),
        tts=cartesia.TTS(),
        vad=silero.VAD.load(),
        turn_detection=turn_detector.EOUModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=room_io.RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    # Instruct the agent to speak first
    await session.generate_reply()


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```
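The pipeline components are interchangeable. For example, to use OpenAI's TTS instead of Cartesia, you would change only the `tts` parameter of the `AgentSession` (this assumes the `openai` plugin exposes a `TTS` class with a no-argument constructor; check the plugin reference for the exact options):

```python
session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o"),
    tts=openai.TTS(),  # swapped in for cartesia.TTS()
    vad=silero.VAD.load(),
    turn_detection=turn_detector.EOUModel(),
)
```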
Download model files
To use the `silero` and `turn-detector` plugins, you first need to download the model files. It's recommended to do this separately before the first run:
```shell
python main.py download-files
```
Speak to your agent
Start your agent in `console` mode to run inside your terminal:
```shell
python main.py console
```
Your agent speaks to you in the terminal, and you can speak to it as well.
Connect to playground
Start your agent in `dev` mode to connect it to LiveKit and make it available from anywhere on the internet:
```shell
python main.py dev
```
Use the Agents playground to speak with your agent and explore its full range of multimodal capabilities.
Congratulations, your agent is up and running. Continue to use the playground or `console` mode as you build and test your agent.
In `console` mode, the agent runs locally and is only available within your terminal.
Run your agent in `dev` (development / debug) or `start` (production) mode to connect to LiveKit and join rooms.
Next steps
Follow these guides to bring your voice AI app to life in the real world.
Web and mobile frontends
Put your agent in your pocket with a custom web or mobile app.
Telephony integration
Your agent can place and receive calls with LiveKit's SIP integration.
Building voice agents
Comprehensive documentation to build advanced voice AI apps with LiveKit.
Deploying to production
Guide to deploying your voice agent in a production environment.
Plugin reference
Explore the full list of AI providers available for LiveKit Agents.
Recipes
Get inspired by LiveKit's collection of recipes and example apps.