Voice AI quickstart

Build and deploy a simple voice assistant in less than 10 minutes.

Overview

This guide walks you through the setup of your very first voice assistant using LiveKit Agents for Python. In less than 10 minutes, you'll have a voice assistant that you can speak to in your terminal, browser, telephone, or native app.

Requirements

The following sections describe the minimum requirements to get started with LiveKit Agents.

  • LiveKit Agents requires Python >= 3.9.
  • This guide uses the uv package manager.

LiveKit Cloud

This guide assumes you have signed up for a free LiveKit Cloud account. LiveKit Cloud offers realtime media transport and agent deployment. Create a free project and use the API keys in the following steps to get started.

While this guide assumes LiveKit Cloud, you can adapt the instructions to self-host the open-source LiveKit server instead. In that case, you need your own deployment environment for production and should remove the enhanced noise cancellation plugin from the agent code, since it's available only with LiveKit Cloud.

LiveKit CLI

Use the LiveKit CLI to manage LiveKit API keys and deploy your agent to LiveKit Cloud.

  1. Install the LiveKit CLI. On macOS, use Homebrew:

    brew install livekit-cli
  2. Link your LiveKit Cloud project to the CLI:

    lk cloud auth

    This opens a browser window to authenticate and link your project to the CLI.
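To confirm the project is linked, you can list the projects known to the CLI (assuming a recent CLI version):

lk project list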

AI providers

LiveKit Agents integrates with most AI model providers, supporting both high-performance STT-LLM-TTS voice pipelines and lifelike multimodal models.

The rest of this guide assumes you use the following starter pack, which provides a strong combination of value, features, and ease of setup.

Your agent strings together three specialized providers into a high-performance STT-LLM-TTS voice pipeline. You need an account and API key for each.

[Diagram: the STT-LLM-TTS voice pipeline]

Component | Provider | Required key     | Alternatives
STT       | Deepgram | DEEPGRAM_API_KEY | STT integrations
LLM       | OpenAI   | OPENAI_API_KEY   | LLM integrations
TTS       | Cartesia | CARTESIA_API_KEY | TTS integrations
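Each row of the table corresponds to one plugin constructor in the agent code later in this guide, and each plugin reads its API key from the environment by default. As a minimal sketch (the models shown match the full example below):

from livekit.plugins import cartesia, deepgram, openai

stt = deepgram.STT(model="nova-3", language="multi")  # uses DEEPGRAM_API_KEY
llm = openai.LLM(model="gpt-4o-mini")                 # uses OPENAI_API_KEY
tts = cartesia.TTS(model="sonic-2")                   # uses CARTESIA_API_KEY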

Setup

Use the instructions in the following sections to set up your new project.

Project initialization

Create a new project for the voice agent.

Run the following commands to use uv to create a new project ready to use for your new voice agent.

uv init livekit-voice-agent --bare
cd livekit-voice-agent

Install packages

Install the following packages to build a complete voice AI agent with your STT-LLM-TTS pipeline, noise cancellation, and turn detection:

uv add \
"livekit-agents[deepgram,openai,cartesia,silero,turn-detector]~=1.2" \
"livekit-plugins-noise-cancellation~=0.2" \
"python-dotenv"

Environment variables

Run the following command to load your LiveKit Cloud API keys into a .env.local file:

lk app env -w

Now open this file and add the keys for your selected AI providers. The file should look like this:

DEEPGRAM_API_KEY=<Your Deepgram API Key>
OPENAI_API_KEY=<Your OpenAI API Key>
CARTESIA_API_KEY=<Your Cartesia API Key>
LIVEKIT_API_KEY=<your API Key>
LIVEKIT_API_SECRET=<your API Secret>
LIVEKIT_URL=<your LiveKit server URL>
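To sanity-check that the keys load correctly, you can print one value back with a quick one-liner (a hypothetical check, not part of the agent itself):

uv run python -c "from dotenv import load_dotenv; import os; load_dotenv('.env.local'); print(os.environ['LIVEKIT_URL'])"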

Agent code

Create a file named agent.py with your agent code:

from dotenv import load_dotenv

from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import (
    openai,
    cartesia,
    deepgram,
    noise_cancellation,
    silero,
)
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STT(model="nova-3", language="multi"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(model="sonic-2", voice="f786b574-daa5-4673-aa0c-cbe3e8534c02"),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            # For telephony applications, use `BVCTelephony` instead for best results
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
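If your agent serves telephone callers, the inline comment above points to a telephony-tuned variant of the noise cancellation model. A minimal variation of the session.start call (assuming the same plugin API) looks like this; if you self-host rather than use LiveKit Cloud, omit the noise_cancellation argument entirely:

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            # telephony-tuned background voice cancellation
            noise_cancellation=noise_cancellation.BVCTelephony(),
        ),
    )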

Download model files

To use the turn-detector, silero, and noise-cancellation plugins, you first need to download the model files:

uv run agent.py download-files

Speak to your agent

Console mode is available for Python only. If you're using Node.js, skip this step and continue to Connect to playground.

Start your agent in console mode to run inside your terminal:

uv run agent.py console

Your agent speaks to you in the terminal, and you can speak to it as well.

Connect to playground

Start your agent in dev mode to connect it to LiveKit and make it available from anywhere on the internet:

uv run agent.py dev

Use the Agents playground to speak with your agent and explore its full range of multimodal capabilities.

Agent CLI modes

In the dev and start modes, your agent connects to LiveKit Cloud and joins rooms:

  • dev mode: Run your agent in development mode for testing and debugging.

  • start mode: Run your agent in production mode.

For Python agents, run the following command to start your agent in production mode:

uv run agent.py start

Python agents can also use console mode, which runs locally and is only available within your terminal.

Deploy to LiveKit Cloud

From the root of your project, run the following command with the LiveKit CLI. Make sure you have already linked your LiveKit Cloud project with lk cloud auth.

lk agent create

The CLI creates Dockerfile, .dockerignore, and livekit.toml files in your current directory, then registers your agent with your LiveKit Cloud project and deploys it.

After the deployment completes, you can access your agent in the playground, or continue to use the console mode as you build and test your agent locally.
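Recent versions of the CLI also let you check on a deployed agent from the command line; for example (assuming your CLI version supports the agent subcommands):

lk agent status
lk agent logs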

Next steps

Follow these guides to bring your voice AI app to life in the real world.