OpenAI and LiveKit

Build world-class realtime AI apps with OpenAI and LiveKit Agents.

Try LiveKit.fm

Chat with OpenAI's latest models in a LiveKit demo inspired by OpenAI.fm

OpenAI ecosystem support

OpenAI provides some of the most powerful AI models and services today, which integrate into LiveKit Agents in the following ways:

  • Realtime API: The original production-grade speech-to-speech model. Build lifelike voice assistants with just one model.
  • GPT-4o, o1-mini, and more: Smart and creative models for voice AI.
  • STT models: From industry-standard whisper-1 to leading-edge gpt-4o-transcribe.
  • TTS models: Use OpenAI's latest gpt-4o-mini-tts to generate lifelike speech in a voice pipeline.

LiveKit Agents supports OpenAI models through the OpenAI developer platform as well as Azure OpenAI Service.

Getting started

Use the following guide to speak to your own OpenAI-powered voice AI agent in less than 10 minutes.

LiveKit Agents overview

LiveKit Agents is an open-source framework for building realtime AI apps using WebRTC transport to end-user devices and WebSockets or HTTPS for backend services.

  • Agent workflows: Build complex voice AI apps with discrete stages and handoffs.
  • Telephony: Inbound and outbound calling using SIP trunks.
  • Frontend SDKs: Full-featured SDKs and UI components for JavaScript, Swift, Kotlin, Flutter, React Native, and Unity.
  • Python and Node.js: Build voice AI apps in Python or Node.js.
  • Dispatch and load balancing: Built-in support for request distribution and load balancing.
  • LiveKit Cloud: Fully-managed LiveKit server with global scale and low latency (you can also self-host).

What is WebRTC?

WebRTC provides significant advantages over other options for building realtime applications, such as WebSockets.

  • Optimized for media: Purpose-built for audio and video with advanced codecs and compression algorithms.
  • Network resilient: Performs reliably even in challenging network conditions due to UDP, adaptive bitrate, and more.
  • Broad compatibility: Natively supported in all modern browsers.

LiveKit handles all of the complexity of running production-grade WebRTC infrastructure while extending support to mobile apps, backends, and telephony.

Realtime API

LiveKit Agents serves as a bridge between your frontend, which connects over WebRTC, and the OpenAI Realtime API, which connects over WebSockets. LiveKit automatically converts Realtime API audio response buffers into WebRTC audio streams synchronized with text, and handles interruptions for you. You can add your own logic within your agent, and use LiveKit features for realtime state and data to coordinate with your frontend.

Additional benefits of LiveKit Agents for the Realtime API include:

  • Noise cancellation: One line of code to remove background noise and other speakers from your input audio.
  • Telephony: Inbound and outbound calling using SIP trunks.
  • Interruption handling: Automatically handles context truncation on interruption.
  • Transcription sync: Realtime API text output is synced to audio playback automatically.
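
To make the interruption handling and transcription sync items above concrete, here is a minimal, standard-library-only sketch of the underlying idea: when the user interrupts, only the part of the reply that was actually played should remain in the conversation context. This is an illustration, not LiveKit's implementation. The `TimedWord` type, the `truncate_on_interruption` helper, and the per-word timings are all hypothetical and assumed for the example.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    """One word of the assistant's reply with hypothetical playback timings."""
    text: str
    start_ms: int  # playback offset at which the word starts
    end_ms: int    # playback offset at which the word finishes


def truncate_on_interruption(words: list[TimedWord], played_ms: int) -> str:
    """Return only the words that finished playing before the interruption."""
    heard = [w.text for w in words if w.end_ms <= played_ms]
    return " ".join(heard)


# Illustrative reply; the timings are made up for this example.
reply = [
    TimedWord("Your", 0, 200),
    TimedWord("order", 200, 500),
    TimedWord("ships", 500, 800),
    TimedWord("on", 800, 900),
    TimedWord("Friday", 900, 1400),
]

# The user starts speaking 850 ms into playback, so "on" and "Friday"
# were never fully heard and are dropped from the stored context.
heard_text = truncate_on_interruption(reply, played_ms=850)
print(heard_text)
```

With LiveKit Agents you do not write this logic yourself. The sketch only shows why truncation depends on text being aligned with audio playback: without that alignment, the stored context could include words the user never heard.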

Realtime API quickstart

Use the Voice AI quickstart with the Realtime API to get up and running in less than 10 minutes.

OpenAI plugin documentation

The following links provide more information on each available OpenAI component in LiveKit Agents.