AI Voice Assistant Quickstart

Build an AI-powered voice assistant that engages in realtime conversations using LiveKit, Python, and NextJS.

This quickstart tutorial walks you through the steps to build a conversational AI application using Python and NextJS. It uses LiveKit's Agents SDK and React Components Library to create an AI-powered voice assistant that can engage in realtime conversations with users. By the end, you will have a basic voice assistant application that you can run and interact with.


Prerequisites

To complete this tutorial, you need a LiveKit Cloud account (to use the Sandbox), the LiveKit CLI (lk), and Python 3 installed locally.

Note: By default, the example agent uses Deepgram for STT and OpenAI for TTS and LLM. However, you aren't required to use these providers.

Steps

1. Create a LiveKit Sandbox app

A sandbox allows you to quickly create and deploy an agent locally, and test the agent using a frontend web client. To create a sandbox, follow these steps:

  1. Sign in to LiveKit Sandbox.

  2. Select Create app for the Voice assistant template.

  3. Follow the Finish setting up your sandbox app instructions provided after you create your sandbox. After you run the lk app create command, enter your OpenAI and Deepgram API keys at the prompts; these are stored in the project's .env.local file (see the example after these steps). If you aren't using Deepgram or OpenAI, you can skip the prompts.

  4. Follow the instructions in the command output to create and start your agent:

    • Create a virtual environment and install requirements:

      cd <your_sandbox_id>
      python3 -m venv venv
      source venv/bin/activate
      pip install -r requirements.txt
    • If you aren't using Deepgram and OpenAI, see Customizing plugins below. Otherwise, start your agent:

      python3 agent.py dev
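
For reference, your provider API keys are stored alongside your LiveKit credentials in the .env.local file at the project root. The exact variable names depend on the template version, so treat the following as an illustrative sketch and check the generated file:

    LIVEKIT_URL="wss://<your_project>.livekit.cloud"
    LIVEKIT_API_KEY="<livekit_api_key>"
    LIVEKIT_API_SECRET="<livekit_api_secret>"
    OPENAI_API_KEY="<openai_api_key>"
    DEEPGRAM_API_KEY="<deepgram_api_key>"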

2. Launch your sandbox and talk to your agent

  1. Sign in to LiveKit Sandbox.
  2. In the Your Sandbox apps section, select Launch for <your_sandbox_id> sandbox.
  3. Select Connect and start a conversation with your agent.

Customizing plugins

You can change the VAD, STT, TTS, and LLM plugins your agent uses by editing the agent.py file. By default, the sandbox voice assistant is configured to use Silero for VAD, Deepgram for STT, and OpenAI for TTS and LLM using the gpt-4o-mini model:

    assistant = VoiceAssistant(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
    )
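
This snippet lives inside the agent's entrypoint in agent.py. The generated file may differ in its details, but a minimal sketch of the surrounding structure, assuming the 0.x Agents SDK that provides the VoiceAssistant class, looks roughly like this:

    from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm
    from livekit.agents.voice_assistant import VoiceAssistant
    from livekit.plugins import deepgram, openai, silero

    async def entrypoint(ctx: JobContext):
        # Seed the conversation with a system prompt.
        initial_ctx = llm.ChatContext().append(
            role="system",
            text="You are a helpful voice assistant. Keep your answers short.",
        )

        # Connect to the LiveKit room, subscribing to audio only.
        await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

        # Plugin configuration shown above.
        assistant = VoiceAssistant(
            vad=silero.VAD.load(),
            stt=deepgram.STT(),
            llm=openai.LLM(model="gpt-4o-mini"),
            tts=openai.TTS(),
            chat_ctx=initial_ctx,
        )
        assistant.start(ctx.room)

        # Greet the user once connected.
        await assistant.say("Hey, how can I help you today?", allow_interruptions=True)

    if __name__ == "__main__":
        # `python3 agent.py dev` runs this worker against your LiveKit project.
        cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))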

You can modify your agent to use different providers. For example, to use Cartesia for TTS, follow these steps (the resulting configuration is shown after the list):

  1. Edit agent.py and update the imported plugins list to include cartesia:

    from livekit.plugins import cartesia, deepgram, openai, silero
  2. Update the tts plugin for your assistant in agent.py:

    tts=cartesia.TTS(),
  3. Update the .env.local file to include your Cartesia API key by adding a CARTESIA_API_KEY environment variable:

    CARTESIA_API_KEY="<cartesia_api_key>"
  4. Install the plugin locally:

    pip install livekit-plugins-cartesia
  5. Start your agent:

    python3 agent.py dev
  6. Launch your sandbox and talk to your agent.
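
After these changes, the assistant construction in agent.py matches the default configuration shown earlier except for the tts argument:

    assistant = VoiceAssistant(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(),  # Cartesia now provides text-to-speech
        chat_ctx=initial_ctx,
    )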

Next steps

  • For a list of additional plugins you can use, see Available LiveKit plugins.

  • You can customize your frontend client by using the frontend sandbox as a base project. To clone the frontend sandbox locally, run the following command:

    lk app create --template voice-assistant-frontend
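
Once cloned, the frontend runs like a typical NextJS project. The package manager and scripts are defined by the template's README and package.json rather than by this guide, but assuming standard npm scripts the workflow looks roughly like:

    cd <your_frontend_app_name>
    npm install
    npm run dev

Configure any environment variables the template expects (for example, your LiveKit URL and API credentials) before starting the dev server.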