LiveKit Inference

Access the best AI models for voice agents, included in LiveKit Cloud.

Overview

[Diagram: LiveKit Inference serving an STT-LLM-TTS pipeline for a voice agent.]

LiveKit Inference provides access to many of the best models and providers for voice agents, including models from OpenAI, Google, AssemblyAI, Deepgram, Cartesia, ElevenLabs, and more. LiveKit Inference is included in LiveKit Cloud, and does not require any additional plugins. See the guides for LLM, STT, and TTS for supported models and configuration options.

To learn more about LiveKit Inference, see the blog post Introducing LiveKit Inference: A unified model interface for voice AI.

For LiveKit Inference models, use the inference module classes in your AgentSession:

Python:

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    stt=inference.STT(
        model="deepgram/flux-general",
        language="en",
    ),
    llm=inference.LLM(
        model="openai/gpt-4.1-mini",
    ),
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    ),
)
```

Node.js:

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  stt: new inference.STT({
    model: "deepgram/flux-general",
    language: "en",
  }),
  llm: new inference.LLM({
    model: "openai/gpt-4.1-mini",
  }),
  tts: new inference.TTS({
    model: "cartesia/sonic-3",
    voice: "9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
  }),
});
```

String descriptors

As a shortcut, you can pass a model descriptor string directly instead of using the inference classes. This is a convenient way to get started quickly.

Python:

```python
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
)
```

Node.js:

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  stt: "deepgram/nova-3:en",
  llm: "openai/gpt-4.1-mini",
  tts: "cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
});
```

For detailed parameter references and model-specific options, see the individual model guides for LLM, STT, and TTS.

Models

The following tables list all models currently available through LiveKit Inference.

Pricing

See the latest pricing for all LiveKit Inference models.

Large language models (LLM)

| Model family | Model name | Model ID |
| --- | --- | --- |
| DeepSeek | DeepSeek-V3 | deepseek-ai/deepseek-v3 |
| DeepSeek | DeepSeek-V3.2 (Retired) | deepseek-ai/deepseek-v3.2 |
| Google | Gemini 2.5 Flash | google/gemini-2.5-flash |
| Google | Gemini 2.5 Flash-Lite | google/gemini-2.5-flash-lite |
| Google | Gemini 2.5 Pro | google/gemini-2.5-pro |
| Google | Gemini 3 Flash | google/gemini-3-flash-preview |
| Google | Gemini 3.1 Flash Lite | google/gemini-3.1-flash-lite-preview |
| Google | Gemini 3.1 Pro | google/gemini-3.1-pro-preview |
| Google | Gemini 2.0 Flash (Retired) | google/gemini-2.0-flash |
| Google | Gemini 2.0 Flash-Lite (Retired) | google/gemini-2.0-flash-lite |
| Google | Gemini 3 Pro (Retired) | google/gemini-3-pro-preview |
| Moonshot AI | Kimi K2.5 | moonshotai/kimi-k2.5 |
| Moonshot AI | Kimi K2 Instruct (Retired) | moonshotai/kimi-k2-instruct |
| OpenAI | GPT-4.1 | openai/gpt-4.1 |
| OpenAI | GPT-4.1 mini | openai/gpt-4.1-mini |
| OpenAI | GPT-4.1 nano | openai/gpt-4.1-nano |
| OpenAI | GPT-4o | openai/gpt-4o |
| OpenAI | GPT-4o mini | openai/gpt-4o-mini |
| OpenAI | GPT-5 | openai/gpt-5 |
| OpenAI | GPT-5 mini | openai/gpt-5-mini |
| OpenAI | GPT-5 nano | openai/gpt-5-nano |
| OpenAI | GPT-5.1 | openai/gpt-5.1 |
| OpenAI | GPT-5.1 Chat | openai/gpt-5.1-chat-latest |
| OpenAI | GPT-5.2 | openai/gpt-5.2 |
| OpenAI | GPT-5.2 Chat | openai/gpt-5.2-chat-latest |
| OpenAI | GPT-5.3 Chat | openai/gpt-5.3-chat-latest |
| OpenAI | GPT-5.4 | openai/gpt-5.4 |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b |
| Qwen | Qwen3 235B-A22B Instruct | qwen/qwen3-235b-a22b-instruct |

Speech-to-text (STT)

| Provider | Model name | Model ID | Languages |
| --- | --- | --- | --- |
| AssemblyAI | Universal-3 Pro Streaming | assemblyai/u3-rt-pro | 6 languages |
| AssemblyAI | Universal-Streaming | assemblyai/universal-streaming | English only |
| AssemblyAI | Universal-Streaming-Multilingual | assemblyai/universal-streaming-multilingual | 6 languages |
| Cartesia | Ink Whisper | cartesia/ink-whisper | 100 languages |
| Deepgram | Flux | deepgram/flux-general-en | English only |
| Deepgram | Nova-2 | deepgram/nova-2 | Multilingual, 33 languages |
| Deepgram | Nova-2 Conversational AI | deepgram/nova-2-conversationalai | English only |
| Deepgram | Nova-2 Medical | deepgram/nova-2-medical | English only |
| Deepgram | Nova-2 Phone Call | deepgram/nova-2-phonecall | English only |
| Deepgram | Nova-3 (Monolingual) | deepgram/nova-3 | 44 languages |
| Deepgram | Nova-3 Medical | deepgram/nova-3-medical | English only |
| Deepgram | Nova-3 (Multilingual) | deepgram/nova-3-multi | Multilingual |
| ElevenLabs | Scribe v2 Realtime | elevenlabs/scribe_v2_realtime | 190 languages |

Text-to-speech (TTS)

| Provider | Model name | Model ID | Languages |
| --- | --- | --- | --- |
| Cartesia | Sonic | cartesia/sonic | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | Sonic 2 | cartesia/sonic-2 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | Sonic 3 | cartesia/sonic-3 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | Sonic 3 (2025-10-27) | cartesia/sonic-3-2025-10-27 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | Sonic 3 (2026-01-12) | cartesia/sonic-3-2026-01-12 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | Sonic Turbo | cartesia/sonic-turbo | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Deepgram | Aura-2 | deepgram/aura-2 | en, en-US, en-PH, en-GB, en-AU, es, es-CO, es-MX, es-ES, es-419, es-AR, nl, nl-NL, fr, fr-FR, de, de-DE, it, it-IT, ja, ja-JP |
| Deepgram | Aura-1 (Retired) | deepgram/aura | en, en-US, en-IE, en-GB |
| ElevenLabs | Eleven Flash v2 | elevenlabs/eleven_flash_v2 | en |
| ElevenLabs | Eleven Flash v2.5 | elevenlabs/eleven_flash_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| ElevenLabs | Eleven Multilingual v2 | elevenlabs/eleven_multilingual_v2 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru |
| ElevenLabs | Eleven Turbo v2 | elevenlabs/eleven_turbo_v2 | en |
| ElevenLabs | Eleven Turbo v2.5 | elevenlabs/eleven_turbo_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| Inworld | Inworld TTS 1 | inworld/inworld-tts-1 | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi, he, ar |
| Inworld | Inworld TTS 1 Max | inworld/inworld-tts-1-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi, he, ar |
| Inworld | Inworld TTS 1.5 Max | inworld/inworld-tts-1.5-max | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Inworld | Inworld TTS 1.5 Mini | inworld/inworld-tts-1.5-mini | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Rime | Arcana | rime/arcana | en, es, fr, de, hi, he, ja, pt, ar |
| Rime | Mist | rime/mist | en |
| Rime | Mist v2 | rime/mistv2 | en, es, fr, de |
| xAI | Text to Speech | xai/tts-1 | auto, en, ar-EG, ar-SA, ar-AE, bn, zh, fr, de, hi, id, it, ja, ko, pt-BR, pt-PT, ru, es-MX, es-ES, tr, vi |

Billing

LiveKit Inference billing is usage-based. Discounted rates are available on the Scale plan, and custom rates are available on the Enterprise plan. Refer to the following articles for more information on quotas, limits, and billing for LiveKit Inference. The latest pricing is always available on the LiveKit Inference pricing page.