LiveKit Inference

Access the best AI models for voice agents, included in LiveKit Cloud.

Overview

Diagram: LiveKit Inference serving an STT-LLM-TTS pipeline for a voice agent.

LiveKit Inference provides access to many of the best models and providers for voice agents, including models from OpenAI, Google, AssemblyAI, Deepgram, Cartesia, ElevenLabs, and more. LiveKit Inference is included in LiveKit Cloud, and does not require any additional plugins. See the guides for LLM, STT, and TTS for supported models and configuration options.

To learn more about LiveKit Inference, see the blog post Introducing LiveKit Inference: A unified model interface for voice AI.

To use LiveKit Inference models, pass the `inference` module classes to your `AgentSession`:

**Python**

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    stt=inference.STT(
        model="deepgram/flux-general",
        language="en",
    ),
    llm=inference.LLM(
        model="openai/gpt-4.1-mini",
    ),
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    ),
)
```

**Node.js**

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  stt: new inference.STT({
    model: 'deepgram/flux-general',
    language: 'en',
  }),
  llm: new inference.LLM({
    model: 'openai/gpt-4.1-mini',
  }),
  tts: new inference.TTS({
    model: 'cartesia/sonic-3',
    voice: '9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
  }),
});
```

String descriptors

As a shortcut, you can pass a model descriptor string directly instead of using the `inference` classes. A descriptor takes the form `provider/model`, optionally followed by a colon and a model option: a language code for STT, or a voice ID for TTS. This is a convenient way to get started quickly.

**Python**

```python
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
)
```

**Node.js**

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  stt: 'deepgram/nova-3:en',
  llm: 'openai/gpt-4.1-mini',
  tts: 'cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
});
```

For detailed parameter references and model-specific options, see the individual model guides for LLM, STT, and TTS.

Models

The following tables list all models currently available through LiveKit Inference.

Pricing

See the latest pricing for all LiveKit Inference models.

Large language models (LLM)

| Model name | Model ID |
| --- | --- |
| GPT-4o | openai/gpt-4o |
| GPT-4o mini | openai/gpt-4o-mini |
| GPT-4.1 | openai/gpt-4.1 |
| GPT-4.1 mini | openai/gpt-4.1-mini |
| GPT-4.1 nano | openai/gpt-4.1-nano |
| GPT-5 | openai/gpt-5 |
| GPT-5 mini | openai/gpt-5-mini |
| GPT-5 nano | openai/gpt-5-nano |
| GPT-5.1 | openai/gpt-5.1 |
| GPT-5.1 Chat Latest | openai/gpt-5.1-chat-latest |
| GPT-5.2 | openai/gpt-5.2 |
| GPT-5.2 Chat Latest | openai/gpt-5.2-chat-latest |
| GPT OSS 120B | openai/gpt-oss-120b |
| Gemini 3 Pro | google/gemini-3-pro |
| Gemini 3 Flash | google/gemini-3-flash |
| Gemini 2.5 Pro | google/gemini-2.5-pro |
| Gemini 2.5 Flash | google/gemini-2.5-flash |
| Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite |
| Kimi K2 Instruct | moonshotai/kimi-k2-instruct |
| DeepSeek V3 | deepseek-ai/deepseek-v3 |
| DeepSeek V3.2 | deepseek-ai/deepseek-v3.2 |

Speech-to-text (STT)

| Provider | Model name | Languages |
| --- | --- | --- |
| AssemblyAI | Universal-Streaming | English only |
| AssemblyAI | Universal-Streaming-Multilingual | 6 languages |
| Cartesia | Ink Whisper | 100 languages |
| Deepgram | Flux | English only |
| Deepgram | Nova-3 | Multilingual, 9 languages |
| Deepgram | Nova-3 Medical | English only |
| Deepgram | Nova-2 | Multilingual, 33 languages |
| Deepgram | Nova-2 Medical | English only |
| Deepgram | Nova-2 Conversational AI | English only |
| Deepgram | Nova-2 Phonecall | English only |
| ElevenLabs | Scribe V2 Realtime | 41 languages |

Text-to-speech (TTS)

| Provider | Model ID | Languages |
| --- | --- | --- |
| Cartesia | cartesia/sonic-3 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | cartesia/sonic-2 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | cartesia/sonic-turbo | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | cartesia/sonic | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Deepgram | deepgram/aura-2 | en, es |
| ElevenLabs | elevenlabs/eleven_flash_v2 | en |
| ElevenLabs | elevenlabs/eleven_flash_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| ElevenLabs | elevenlabs/eleven_turbo_v2 | en |
| ElevenLabs | elevenlabs/eleven_turbo_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| ElevenLabs | elevenlabs/eleven_multilingual_v2 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru |
| Inworld | inworld/inworld-tts-1.5-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi |
| Inworld | inworld/inworld-tts-1.5-mini | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi |
| Inworld | inworld/inworld-tts-1-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru |
| Inworld | inworld/inworld-tts-1 | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru |
| Rime | rime/arcana | en, es, fr, de, ar, he, hi, ja, pt |
| Rime | rime/mistv2 | en, es, fr, de |

Billing

LiveKit Inference billing is based on usage. Discounted rates are available on the Scale plan, and custom rates are available on the Enterprise plan. Refer to the following articles for more information on quotas, limits, and billing for LiveKit Inference. The latest pricing is always available on the LiveKit Inference pricing page.