This quickstart tutorial walks you through the steps to build a conversational AI application using Python and Next.js. It uses LiveKit's Agents Framework and React Components Library to create an AI-powered voice assistant that can engage in realtime conversations with users. By the end, you will have a basic voice assistant application that you can run and interact with.
If you're interested in using the OpenAI Realtime API, see the Speech-to-speech quickstart.
Prerequisites
By default, the example agent uses Deepgram for STT and OpenAI for TTS and LLM, so you'll need an API key for each. You aren't required to use these providers, however; see Customizing plugins.
Steps
The following steps take you through the process of creating a voice assistant using the LiveKit CLI and some minimal templates.
Set up a LiveKit account and install the CLI
Create an account or sign in to your LiveKit Cloud account.
Install the LiveKit CLI and authenticate it using:

```bash
lk cloud auth
```
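If you don't have the CLI installed yet, LiveKit provides an install script and a Homebrew formula. The commands below are a sketch based on LiveKit's published install instructions; verify them against the current docs for your platform:

```bash
# macOS, via Homebrew
brew install livekit-cli

# Linux, via the install script
curl -sSL https://get.livekit.io/cli | bash
```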
Bootstrap an agent from template
Clone the starter template for a simple Python voice agent:
```bash
lk app create --template voice-pipeline-agent-python
```

Enter your OpenAI API Key and Deepgram API Key when prompted. If you aren't using Deepgram and OpenAI, see Customizing plugins.
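The bootstrap step writes your keys into the agent's `.env.local` file (the same file referenced in Customizing plugins). As a rough sketch, assuming the standard LiveKit environment variable names, the generated file looks something like this:

```bash
# .env.local (values filled in by lk app create; names may differ slightly by template version)
LIVEKIT_URL="wss://<project>.livekit.cloud"
LIVEKIT_API_KEY="<livekit_api_key>"
LIVEKIT_API_SECRET="<livekit_api_secret>"
OPENAI_API_KEY="<openai_api_key>"
DEEPGRAM_API_KEY="<deepgram_api_key>"
```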
Install dependencies and start your agent:
```bash
cd <agent_dir>
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt
python3 agent.py dev
```

You can edit the `agent.py` file to customize the system prompt and other aspects of your agent.
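For example, the system prompt lives in the initial chat context near the top of `agent.py` (the `initial_ctx` passed to the assistant below). The template's exact wording may differ, but it is structured roughly like this:

```python
from livekit.agents import llm

# Initial chat context that seeds the assistant's system prompt
initial_ctx = llm.ChatContext().append(
    role="system",
    text=(
        "You are a voice assistant created by LiveKit. "
        "Keep your responses short and conversational."
    ),
)
```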
Bootstrap a frontend from template
Clone the Voice Assistant Frontend Next.js app starter template using the CLI:
```bash
lk app create --template voice-assistant-frontend
```

Install dependencies and start your frontend application:

```bash
cd <frontend_dir>
pnpm install
pnpm dev
```
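As with the agent, `lk app create` populates the frontend's `.env.local` with your LiveKit Cloud connection details. A minimal sketch, assuming the template's standard variable names (check the generated file in your project for the exact ones):

```bash
# .env.local (values filled in by lk app create)
LIVEKIT_URL="wss://<project>.livekit.cloud"
LIVEKIT_API_KEY="<livekit_api_key>"
LIVEKIT_API_SECRET="<livekit_api_secret>"
```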
Launch your app and talk to your agent
- Visit your locally running application (by default, http://localhost:3000).
- Select Connect and start a conversation with your agent.
Customizing plugins
You can change the VAD, STT, TTS, and LLM plugins your agent uses by editing the `agent.py` file. By default, the sandbox voice assistant is configured to use Silero for VAD, Deepgram for STT, and OpenAI for TTS and LLM with the gpt-4o-mini model:
```python
assistant = VoiceAssistant(
    vad=silero.VAD.load(),
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=openai.TTS(),
    chat_ctx=initial_ctx,
)
```
You can modify your agent to use different providers. For example, follow these steps to use Cartesia for TTS:
1. Edit `agent.py` and update the imported plugins list to include `cartesia`:

   ```python
   from livekit.plugins import cartesia, deepgram, openai, silero
   ```

2. Update the `tts` plugin for your assistant in `agent.py`:

   ```python
   tts=cartesia.TTS(),
   ```

3. Update the `.env.local` file to include your Cartesia API key by adding a `CARTESIA_API_KEY` environment variable:

   ```bash
   CARTESIA_API_KEY="<cartesia_api_key>"
   ```

4. Install the plugin locally:

   ```bash
   pip install livekit-plugins-cartesia
   ```

5. Start your agent:

   ```bash
   python3 agent.py dev
   ```
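After these changes, the assistant constructor shown above looks like this, with only the `tts` argument swapped:

```python
assistant = VoiceAssistant(
    vad=silero.VAD.load(),
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=cartesia.TTS(),  # Cartesia instead of OpenAI for TTS
    chat_ctx=initial_ctx,
)
```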
Next steps
- For a list of additional plugins you can use, see Available LiveKit plugins.
- Let your friends and colleagues talk to your agent by connecting it to a LiveKit Sandbox.
- Create an agent that accepts incoming calls using SIP.
- Create an agent that makes outbound calls using SIP.