OpenAI and LiveKit | LiveKit Docs

Chat with OpenAI's latest models in a LiveKit demo inspired by OpenAI.fm

OpenAI ecosystem support

OpenAI provides some of the most powerful AI models and services today, which integrate into LiveKit Agents in the following ways:

Realtime API: The original production-grade speech-to-speech model. Build lifelike voice assistants with just one model.
GPT 4o, o1-mini, and more: Smart and creative models for voice AI.
STT models: From industry-standard whisper-1 to leading-edge gpt-4o-transcribe.
TTS models: Use OpenAI's latest gpt-4o-mini-tts to generate lifelike speech in a voice pipeline.

LiveKit Agents supports OpenAI models through the OpenAI developer platform as well as Azure OpenAI Service. See the Azure AI integration guide for more information on Azure OpenAI.

Getting started

Use the following guide to speak to your own OpenAI-powered voice AI agent in less than 10 minutes.

Voice AI quickstart

Build your first voice AI app with the OpenAI Realtime API or GPT-4o.

Realtime playground

Experiment with the OpenAI Realtime API and personalities like the Snarky Teenager or Opera Singer.

LiveKit Agents overview

LiveKit Agents is an open source framework for building realtime AI apps in Python and Node.js. It supports complex voice AI workflows with multiple agents and discrete processing steps, and includes built-in load balancing.

LiveKit provides SIP support for telephony integration and full-featured frontend SDKs in multiple languages. It uses WebRTC transport for end-user devices, enabling high-quality, low-latency realtime experiences. To learn more, see LiveKit Agents.

Realtime API

LiveKit Agents serves as a bridge between your frontend — connected over WebRTC — and the OpenAI Realtime API — connected over WebSockets. LiveKit automatically converts Realtime API audio response buffers to WebRTC audio streams synchronized with text, and handles business logic like interruption handling automatically. You can add your own logic within your agent, and use LiveKit features for realtime state and data to coordinate with your frontend.

Additional benefits of LiveKit Agents for the Realtime API include:

Noise cancellation: One line of code to remove background noise and speakers from your input audio.
Telephony: Inbound and outbound calling using SIP trunks.
Interruption handling: Automatically handles context truncation on interruption.
Transcription sync: Realtime API text output is synced to audio playback automatically.

Loading diagram…

Realtime API quickstart

Use the Voice AI quickstart with the Realtime API to get up and running in less than 10 minutes.

Web and mobile frontends

Put your agent in your pocket with a custom web or mobile app.

Telephony integration

Your agent can place and receive calls with LiveKit's SIP integration.

Building voice agents

Comprehensive documentation to build advanced voice AI apps with LiveKit.

Recipes

Get inspired by LiveKit's collection of recipes and example apps.

OpenAI plugin documentation

The following links provide more information on each available OpenAI component in LiveKit Agents.

Realtime API

LiveKit Agents docs for the OpenAI Realtime API.

OpenAI Models

LiveKit Agents docs for gpt-4o, o1-mini, and other OpenAI LLMs.

OpenAI STT

LiveKit Agents docs for whisper-1, gpt-4o-transcribe, and other OpenAI STT models.

OpenAI TTS

LiveKit Agents docs for tts-1, gpt-4o-mini-tts, and other OpenAI TTS models.