AI Voice Assistant Quickstart

Build an AI-powered voice assistant that engages in realtime conversations using LiveKit, Python, and Next.js.

This quickstart walks you through building a conversational AI application with Python and Next.js. It uses LiveKit's Agents Framework and React Components Library to create an AI-powered voice assistant that holds realtime conversations with users. By the end, you will have a basic voice assistant application that you can run and talk to.

Note

If you're interested in using the OpenAI Realtime API, see the Speech-to-speech quickstart.

Prerequisites

Note

By default, the example agent uses Deepgram for STT and OpenAI for TTS and LLM. However, you aren't required to use these providers.

Steps

The following steps take you through the process of creating a voice assistant using the LiveKit CLI and some minimal templates.

Set up a LiveKit account and install the CLI

  1. Create an account or sign in to your LiveKit Cloud account.

  2. Install the LiveKit CLI and authenticate using lk cloud auth.

Bootstrap an agent from template

  1. Clone the starter template for a simple Python voice agent:

    lk app create --template voice-pipeline-agent-python

  2. Enter your OpenAI API key and Deepgram API key when prompted. If you aren't using Deepgram and OpenAI, see Customizing plugins.

  3. Install dependencies and start your agent:

    cd <agent_dir>
    python3 -m venv venv
    source venv/bin/activate
    python3 -m pip install -r requirements.txt
    python3 agent.py dev

    You can edit the agent.py file to customize the system prompt and other aspects of your agent.

Bootstrap a frontend from template

  1. Clone the Voice Assistant Frontend Next.js app starter template using the CLI:

    lk app create --template voice-assistant-frontend

  2. Install dependencies and start your frontend application:

    cd <frontend_dir>
    pnpm install
    pnpm dev

Launch your app and talk to your agent

  1. Visit your locally running application (by default, http://localhost:3000).
  2. Select Connect and start a conversation with your agent.
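
If the page does not load, it can help to confirm that the frontend dev server is actually listening before debugging further. The following is a minimal sketch using only the Python standard library; the helper name is_listening is hypothetical, and port 3000 is the default mentioned above:

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Assumes the frontend runs on the default port from the step above.
    print("frontend reachable:", is_listening("localhost", 3000))
```

A result of False usually means pnpm dev is not running or is using a different port.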

Customizing plugins

You can change the VAD, STT, TTS, and LLM plugins your agent uses by editing the agent.py file. By default, the sandbox voice assistant is configured to use Silero for VAD, Deepgram for STT, and OpenAI for both TTS and the LLM (the gpt-4o-mini model):

    assistant = VoiceAssistant(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
    )
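
To make the role of each plugin slot concrete, here is a conceptual sketch in plain Python. It does not use the LiveKit API: respond_to_turn and the lambda stand-ins are hypothetical names for illustration only, and a real voice agent streams audio asynchronously rather than calling functions in sequence.

```python
def respond_to_turn(audio, vad, stt, llm, tts):
    """Run one user turn through stand-in VAD -> STT -> LLM -> TTS stages."""
    if not vad(audio):      # VAD: does this audio contain speech?
        return None
    text = stt(audio)       # STT: speech to text
    reply = llm(text)       # LLM: generate a text reply
    return tts(reply)       # TTS: text back to audio


# Toy stand-ins: "audio" is a plain string here so the example runs anywhere.
spoken = respond_to_turn(
    "hello",
    vad=lambda audio: bool(audio.strip()),
    stt=lambda audio: audio,
    llm=lambda text: f"You said: {text}",
    tts=lambda text: f"<audio:{text}>",
)
print(spoken)
```

In this picture, swapping a provider corresponds to passing a different callable for one slot while the others stay unchanged.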

You can modify your agent to use different providers. For example, to use Cartesia for TTS, follow these steps:

  1. Edit agent.py and update the plugin import to include cartesia:

    from livekit.plugins import cartesia, deepgram, openai, silero

  2. Update the tts plugin for your assistant in agent.py:

    tts=cartesia.TTS(),

  3. Add your Cartesia API key to the .env.local file as a CARTESIA_API_KEY environment variable:

    CARTESIA_API_KEY="<cartesia_api_key>"

  4. Install the plugin locally:

    pip install livekit-plugins-cartesia

  5. Start your agent:

    python3 agent.py dev
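
A missing or empty API key is a common reason for an agent to fail at startup after a provider change. The sketch below checks a .env.local file for the keys you expect. It is an illustration only: the helper names read_env_file and missing_keys are hypothetical, the parser handles only simple KEY=VALUE lines, and the DEEPGRAM_API_KEY and OPENAI_API_KEY names are assumed here (only CARTESIA_API_KEY appears in the steps above), so adjust the list to match your plugins.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines; skip blanks and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double or single quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(path, required):
    """Return the required keys that are absent or empty in the file."""
    env = read_env_file(path)
    return [key for key in required if not env.get(key)]


# Assumed key list for the Cartesia setup; adjust to your plugins.
REQUIRED = ["DEEPGRAM_API_KEY", "OPENAI_API_KEY", "CARTESIA_API_KEY"]

if __name__ == "__main__":
    env_path = Path(".env.local")
    if env_path.exists():
        print("Missing keys:", missing_keys(env_path, REQUIRED) or "none")
    else:
        print(".env.local not found in the current directory")
```

Run it from the agent directory before starting the agent; an empty result means every listed key has a value.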

Next steps