OpenAI STT integration guide

How to use the OpenAI STT plugin for LiveKit Agents.

Overview

OpenAI provides speech-to-text (STT) support through its latest gpt-4o-transcribe model as well as whisper-1. You can use the open source OpenAI plugin for LiveKit Agents to build voice AI applications with fast, accurate transcription.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[openai]~=1.0rc"

Authentication

The OpenAI plugin requires an OpenAI API key.

Set OPENAI_API_KEY in your .env file.
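
For example, your .env file could contain a single line like the following (the placeholder is hypothetical; substitute your own key):

OPENAI_API_KEY=<your-openai-api-key>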

Usage

Use OpenAI STT within an AgentSession or as a standalone transcription service. For example, you can use OpenAI STT as the stt component in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    stt=openai.STT(
        model="gpt-4o-transcribe",
    ),
    # ... llm, tts, etc.
)
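
You can also use the STT as a standalone transcription service outside of an AgentSession. The sketch below assumes the plugin exposes a recognize() method for non-streaming transcription that returns a speech event with an alternatives list; the exact interface may differ between plugin versions.

from livekit import rtc
from livekit.plugins import openai

async def transcribe(frames: list[rtc.AudioFrame]) -> str:
    stt = openai.STT(model="gpt-4o-transcribe")
    # recognize() sends the buffered audio frames to OpenAI in one request
    # and returns a speech event containing the transcription alternatives.
    event = await stt.recognize(buffer=frames)
    return event.alternatives[0].text

Pass in audio frames captured from your own source (for example, a LiveKit track or a decoded file) and run the coroutine with asyncio.run() or from an existing event loop.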

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (WhisperModels | string) Optional. Default: gpt-4o-transcribe

Model to use for transcription. See OpenAI's documentation for a list of supported models.

language (string) Optional. Default: en

Language of input audio in ISO-639-1 format. See OpenAI's documentation for a list of supported languages.
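
As an illustration of the parameters above, here is a minimal sketch that sets both the model and the language explicitly (the values shown are just examples):

from livekit.plugins import openai

stt = openai.STT(
    model="gpt-4o-transcribe",
    language="en",
)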

Additional resources

The following resources provide more information about using OpenAI with LiveKit Agents.