OpenAI STT integration guide

How to use the OpenAI STT plugin for LiveKit Agents.

Overview

OpenAI provides speech-to-text (STT) support through its latest gpt-4o-transcribe model as well as whisper-1. You can use the open source OpenAI plugin for LiveKit Agents to build voice AI applications with fast, accurate transcription.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[openai]~=1.0rc"

Authentication

The OpenAI plugin requires an OpenAI API key.

Set OPENAI_API_KEY in your .env file.
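
For example, your .env file could contain a single line like the following (the placeholder is hypothetical; substitute your own key):

OPENAI_API_KEY=<your-openai-api-key>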

Usage

Use OpenAI STT within an AgentSession or as a standalone transcription service. For example, you can use OpenAI STT as the stt component in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    stt=openai.STT(
        model="gpt-4o-transcribe",
    ),
    # ... llm, tts, etc.
)
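
You can also use the STT as a standalone transcription service outside of an AgentSession. The sketch below assumes the plugin exposes a recognize() method for non-streaming transcription that returns a speech event with an alternatives list; the exact interface may differ between plugin versions.

from livekit import rtc
from livekit.plugins import openai

async def transcribe(frames: list[rtc.AudioFrame]) -> str:
    stt = openai.STT(model="gpt-4o-transcribe")
    # recognize() sends the buffered audio frames to OpenAI in one request
    # and returns a speech event containing the transcription alternatives.
    event = await stt.recognize(buffer=frames)
    return event.alternatives[0].text

Pass in audio frames captured from your own source (for example, a LiveKit track or a decoded file) and run the coroutine with asyncio.run() or from an existing event loop.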

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (WhisperModels | string) Optional. Default: gpt-4o-transcribe

Model to use for transcription. See OpenAI's documentation for a list of supported models.

language (string) Optional. Default: en

Language of input audio in ISO-639-1 format. See OpenAI's documentation for a list of supported languages.
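
As an illustration of the parameters above, here is a minimal sketch that sets both the model and the language explicitly (the values shown are just examples):

from livekit.plugins import openai

stt = openai.STT(
    model="gpt-4o-transcribe",
    language="en",
)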

Additional resources

The following resources provide more information about using OpenAI with LiveKit Agents.