The `MultimodalAgent` class in the LiveKit Agents Framework uses the OpenAI Realtime API for speech-to-speech interactions between AI voice assistants and end users. It is implemented in both our Python and Node.js Agents Framework libraries.
If you're not using the OpenAI Realtime API, see the Voice agent with STT, LLM, TTS quickstart.
Prerequisites
To complete this quickstart, you'll need:
- A LiveKit Cloud account
- An OpenAI API key
Steps
The following steps walk you through creating a LiveKit account and using the LiveKit CLI to bootstrap an agent and a frontend from minimal templates. By the end of the quickstart, you'll have a running agent and a frontend you can use to talk to it.
Set up a LiveKit account and install the CLI
Create an account or sign in to your LiveKit Cloud account.
Install the LiveKit CLI and authenticate using `lk cloud auth`.
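If you don't already have the CLI, it's typically installed with Homebrew on macOS or the install script on Linux (check the LiveKit CLI repository for current instructions for your platform):

```shell
# macOS (Homebrew)
brew install livekit-cli

# Linux (install script)
curl -sSL https://get.livekit.io/cli | bash

# link the CLI to your LiveKit Cloud project
lk cloud auth
```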
Bootstrap an agent from template
Clone a starter template for your preferred language using the CLI:
```shell
lk app create --template multimodal-agent-python
```

Enter your OpenAI API key when prompted.
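The CLI stores your credentials in an environment file in the new project directory (commonly `.env.local`; the exact filename depends on the template). Its contents look roughly like this:

```shell
# Illustrative contents of the generated env file; actual values come from your
# LiveKit Cloud project and the OpenAI key you entered above.
LIVEKIT_URL=wss://<your-project>.livekit.cloud
LIVEKIT_API_KEY=<your-livekit-api-key>
LIVEKIT_API_SECRET=<your-livekit-api-secret>
OPENAI_API_KEY=<your-openai-api-key>
```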
Install dependencies and start your agent:
```shell
cd <agent_dir>
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt
python3 agent.py dev
```

You can edit the `agent.py` file to customize the system prompt and other aspects of your agent.
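For reference, a minimal agent built on the Python library looks roughly like the sketch below; the `agent.py` generated by the template may differ in structure and defaults.

```python
# A trimmed sketch of a MultimodalAgent worker (illustrative; the template's
# generated agent.py is the source of truth).
from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.agents.multimodal import MultimodalAgent
from livekit.plugins import openai


async def entrypoint(ctx: JobContext):
    # Connect to the room and wait for a user to join.
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()

    # The Realtime API handles the full speech-to-speech loop in one model,
    # so there is no separate STT/LLM/TTS pipeline to configure.
    model = openai.realtime.RealtimeModel(
        instructions="You are a friendly voice assistant. Keep answers brief.",
        voice="alloy",
        temperature=0.8,
    )

    agent = MultimodalAgent(model=model)
    agent.start(ctx.room, participant)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

Changing the `instructions` string is the quickest way to adjust your agent's system prompt.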
Bootstrap a frontend from template
Clone the Voice Assistant Frontend Next.js app starter template using the CLI:
```shell
lk app create --template voice-assistant-frontend
```

Install dependencies and start your frontend application:

```shell
cd <frontend_dir>
pnpm install
pnpm dev
```
Launch your app and talk to your agent
- Visit your locally running application (http://localhost:3000 by default).
- Select Connect and start a conversation with your agent.
Next steps
- Learn more in the OpenAI Realtime API integration guide.
- Let your friends and colleagues talk to your agent by connecting it to a LiveKit Sandbox.
- Create an agent that accepts incoming calls using SIP.
- Create an agent that makes outbound calls using SIP.