OpenAI and LiveKit | LiveKit Docs

Chat with OpenAI's latest models in a LiveKit demo inspired by OpenAI.fm

OpenAI ecosystem support

OpenAI provides some of the most powerful AI models and services today, which integrate into LiveKit Agents in the following ways:

Realtime API: The original production-grade speech-to-speech model. Build lifelike voice assistants with just one model.
GPT 4o, o1-mini, and more: Smart and creative models for voice AI.
STT models: From industry-standard whisper-1 to leading-edge gpt-4o-transcribe.
TTS models: Use OpenAI's latest gpt-4o-mini-tts to generate lifelike speech in a voice pipeline.

LiveKit Agents supports OpenAI models through the OpenAI developer platform as well as Azure OpenAI Service.

Getting started

Use the following guide to speak to your own OpenAI-powered voice AI agent in less than 10 minutes.

Voice AI quickstart

Build your first voice AI app with the OpenAI Realtime API or GPT-4o.

Realtime playground

Experiment with the OpenAI Realtime API and personalities like the Snarky Teenager or Opera Singer.

LiveKit Agents overview

LiveKit Agents is an open-source framework for building realtime AI apps using WebRTC transport to end-user devices and WebSockets or HTTPS for backend services.

Agent workflows: Build complex voice AI apps with discrete stages and handoffs.
Telephony: Inbound and outbound calling using SIP trunks.
Frontend SDKs: Full-featured SDKs and UI components for JavaScript, Swift, Kotlin, Flutter, React Native, and Unity.
Python and Node.js: Build voice AI apps in Python or Node.js.
Dispatch and load balancing: Built-in support for request distribution and load balancing.
LiveKit Cloud: Fully-managed LiveKit server with global scale and low latency (you can also self-host).

What is WebRTC?

WebRTC provides significant advantages over other options for building realtime applications such as websockets.

Optimized for media: Purpose-built for audio and video with advanced codecs and compression algorithms.
Network resilient: Performs reliably even in challenging network conditions due to UDP, adaptive bitrate, and more.
Broad compatibility: Natively supported in all modern browsers.

LiveKit handles all of the complexity of running production-grade WebRTC infrastructure while extending support to mobile apps, backends, and telephony.

Realtime API

LiveKit Agents serves as a bridge between your frontend — connected over WebRTC — and the OpenAI Realtime API — connected over WebSockets. LiveKit automatically converts Realtime API audio response buffers to WebRTC audio streams synchronized with text, and handles business logic like interruption handling automatically. You can add your own logic within your agent, and use LiveKit features for realtime state and data to coordinate with your frontend.

Additional benefits of LiveKit Agents for the Realtime API include:

Noise cancellation: One line of code to remove background noise and speakers from your input audio.
Telephony: Inbound and outbound calling using SIP trunks.
Interruption handling: Automatically handles context truncation on interruption.
Transcription sync: Realtime API text output is synced to audio playback automatically.

Loading diagram…

Realtime API quickstart

Use the Voice AI quickstart with the Realtime API to get up and running in less than 10 minutes.

Web and mobile frontends

Put your agent in your pocket with a custom web or mobile app.

Telephony integration

Your agent can place and receive calls with LiveKit's SIP integration.

Building voice agents

Comprehensive documentation to build advanced voice AI apps with LiveKit.

Recipes

Get inspired by LiveKit's collection of recipes and example apps.

OpenAI plugin documentation

The following links provide more information on each available OpenAI component in LiveKit Agents.

Realtime API

LiveKit Agents docs for the OpenAI Realtime API.

OpenAI Models

LiveKit Agents docs for gpt-4o, o1-mini, and other OpenAI LLMs.

OpenAI STT

LiveKit Agents docs for whisper-1, gpt-4o-transcribe, and other OpenAI STT models.

OpenAI TTS

LiveKit Agents docs for tts-1, gpt-4o-mini-tts, and other OpenAI TTS models.