Overview
OpenAI provides STT support via the latest gpt-4o-transcribe model as well as whisper-1. You can use the open source OpenAI plugin for LiveKit Agents to build voice AI applications with fast, accurate transcription.
Quick reference
This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.
Installation
Install the plugin from PyPI (Python) or npm (Node.js):

Python:
pip install "livekit-agents[openai]~=1.2"

Node.js:
pnpm add @livekit/agents-plugin-openai@1.x
Authentication
The OpenAI plugin requires an OpenAI API key. Set OPENAI_API_KEY in your .env file.
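For example, a minimal .env file might look like the following (the key value is a placeholder for your own credential):

```shell
# .env — loaded by the agent at startup
OPENAI_API_KEY=<your-openai-api-key>
```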
Usage
Use OpenAI STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.
```python
from livekit.plugins import openai

session = AgentSession(
    stt=openai.STT(model="gpt-4o-transcribe"),
    # ... llm, tts, etc.
)
```
```typescript
import * as openai from '@livekit/agents-plugin-openai';

const session = new voice.AgentSession({
  stt: new openai.STT({ model: 'gpt-4o-transcribe' }),
  // ... llm, tts, etc.
});
```
Parameters
This section describes some of the available parameters. See the plugin reference links in the Additional resources section for a complete list of all available parameters.
model: Model to use for transcription. See OpenAI's documentation for a list of supported models.

language: Language of input audio in ISO-639-1 format. See OpenAI's documentation for a list of supported languages.
Additional resources
The following resources provide more information about using OpenAI with LiveKit Agents.