Overview
This guide walks you through the setup of your very first voice assistant using LiveKit Agents for Python. In less than 10 minutes, you'll have a voice assistant that you can speak to in your terminal, browser, telephone, or native app.
Requirements
The following sections describe the minimum requirements to get started with LiveKit Agents.
Python
LiveKit Agents requires Python 3.9 or later.
The Node.js beta is still in development and has not yet reached v1.0. See the v0.x documentation for Node.js reference and join the LiveKit Community Slack to be the first to know when the next release is available.
LiveKit server
You need a LiveKit server instance to transport realtime media between user and agent. The easiest way to get started is with a free LiveKit Cloud account. Create a project and use the API keys in the following steps. You may also self-host LiveKit if you prefer.
AI providers
LiveKit Agents integrates with most AI model providers and supports both high-performance STT-LLM-TTS voice pipelines and lifelike multimodal models.
The rest of this guide assumes you use one of the following two starter packs, which provide the best combination of value, features, and ease of setup.
Your agent strings together three specialized providers into a high-performance voice pipeline. You need accounts and API keys for each.
| Component | Provider | Plugin | Required Key |
|---|---|---|---|
| STT | Deepgram | `livekit-plugins-deepgram` | `DEEPGRAM_API_KEY` |
| LLM | OpenAI | `livekit-plugins-openai` | `OPENAI_API_KEY` |
| TTS | Cartesia | `livekit-plugins-cartesia` | `CARTESIA_API_KEY` |
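As a mental model, each user utterance moves through the three stages in order: speech is transcribed (STT), the transcript is sent to the LLM, and the reply is synthesized back to audio (TTS). The sketch below illustrates that hand-off with stand-in functions; the real plugins stream audio and tokens rather than passing plain strings.

```python
# Illustrative only: stand-in functions model the STT -> LLM -> TTS hand-off.
# None of these are LiveKit APIs; they exist only to show the data flow.

def stt(audio: str) -> str:
    """Pretend transcription: the 'audio' is already text in this sketch."""
    return audio

def llm(transcript: str) -> str:
    """Pretend language model: return a canned assistant reply."""
    return f"You said: {transcript}"

def tts(reply: str) -> str:
    """Pretend synthesis: tag the text that would become audio."""
    return f"<audio>{reply}</audio>"

def voice_pipeline(user_audio: str) -> str:
    # The pipeline is just function composition over the three stages.
    return tts(llm(stt(user_audio)))

print(voice_pipeline("hello agent"))
```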
Setup
Use the instructions in the following sections to set up your new project.
Packages
In addition to LiveKit Agents and plugins for the pipeline type you've chosen, install plugins for noise cancellation, VAD, and turn detection to make your voice AI app best-in-class.
This example integrates LiveKit Cloud enhanced noise cancellation. If you're not using LiveKit Cloud, omit the plugin and the `noise_cancellation` parameter from the following code.
LiveKit Agents v1.0 is currently available as a release candidate. The following commands install the latest pre-release packages.
```shell
pip install \
  "livekit-agents[openai,silero,deepgram,cartesia,turn-detector]~=1.0rc" \
  "livekit-plugins-noise-cancellation~=0.2" \
  "python-dotenv"
```
Environment variables
Create a file named `.env` and add your LiveKit credentials along with the necessary API keys for your AI providers.
```shell
DEEPGRAM_API_KEY=<Your Deepgram API Key>
OPENAI_API_KEY=<Your OpenAI API Key>
CARTESIA_API_KEY=<Your Cartesia API Key>
LIVEKIT_API_KEY=<your API Key>
LIVEKIT_API_SECRET=<your API Secret>
LIVEKIT_URL=<your LiveKit server URL>
```
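The `python-dotenv` package loads these `KEY=value` pairs into the process environment when `load_dotenv()` is called. As a rough illustration of what that does under the hood, here is a simplified loader (the real library additionally handles quoting, comments at end of line, and variable interpolation):

```python
import os

def load_env_file(path: str) -> None:
    """Minimal .env loader: one KEY=value per line, no quoting rules.

    Simplified sketch of python-dotenv's behavior; existing environment
    variables are not overwritten, matching load_dotenv()'s default.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```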
Agent code
Create a file named `main.py` containing the following code for your first voice agent.
```python
from dotenv import load_dotenv

from livekit import agents
from livekit.agents.voice import AgentSession, Agent, room_io
from livekit.plugins import (
    openai,
    cartesia,
    deepgram,
    noise_cancellation,
    silero,
    turn_detector,
)

load_dotenv()


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o"),
        tts=cartesia.TTS(),
        vad=silero.VAD.load(),
        turn_detection=turn_detector.EOUModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=room_io.RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    # Instruct the agent to speak first
    await session.generate_reply()


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```
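The pipeline components are interchangeable. For example, to use OpenAI's TTS instead of Cartesia, you would change only the `tts` parameter of the `AgentSession` (this assumes the `openai` plugin exposes a `TTS` class with a no-argument constructor; check the plugin reference for the exact options):

```python
session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o"),
    tts=openai.TTS(),  # swapped in for cartesia.TTS()
    vad=silero.VAD.load(),
    turn_detection=turn_detector.EOUModel(),
)
```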
Download model files
To use the `silero` and `turn-detector` plugins, you first need to download the model files. It's recommended to do this separately before the first run:
```shell
python main.py download-files
```
Speak to your agent
Start your agent in `console` mode to run inside your terminal:
```shell
python main.py console
```
Your agent speaks to you in the terminal, and you can speak to it as well.
Connect to playground
Start your agent in `dev` mode to connect it to LiveKit and make it available from anywhere on the internet:
```shell
python main.py dev
```
Use the Agents playground to speak with your agent and explore its full range of multimodal capabilities.
Congratulations, your agent is up and running. Continue to use the playground or `console` mode as you build and test your agent.
In `console` mode, the agent runs locally and is only available within your terminal.
Run your agent in `dev` (development / debug) or `start` (production) mode to connect to LiveKit and join rooms.
Next steps
Follow these guides to bring your voice AI app to life in the real world.
Web and mobile frontends
Put your agent in your pocket with a custom web or mobile app.
Telephony integration
Your agent can place and receive calls with LiveKit's SIP integration.
Building voice agents
Comprehensive documentation to build advanced voice AI apps with LiveKit.
Deploying to production
Guide to deploying your voice agent in a production environment.
Plugin reference
Explore the full list of AI providers available for LiveKit Agents.
Recipes
Get inspired by LiveKit's collection of recipes and example apps.