Overview
LiveKit Inference provides access to many of the best models and providers for voice agents, including models from OpenAI, Google, AssemblyAI, Deepgram, Cartesia, ElevenLabs, and more. LiveKit Inference is included with LiveKit Cloud and requires no additional plugins. See the guides for LLM, STT, and TTS for supported models and configuration options.
To learn more about LiveKit Inference, see the blog post Introducing LiveKit Inference: A unified model interface for voice AI.
For LiveKit Inference models, use the inference module classes in your AgentSession:
```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    stt=inference.STT(model="deepgram/flux-general", language="en"),
    llm=inference.LLM(model="openai/gpt-4.1-mini"),
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    ),
)
```
```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  stt: new inference.STT({ model: "deepgram/flux-general", language: "en" }),
  llm: new inference.LLM({ model: "openai/gpt-4.1-mini" }),
  tts: new inference.TTS({
    model: "cartesia/sonic-3",
    voice: "9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
  }),
});
```
String descriptors
As a shortcut, you can pass a model descriptor string directly instead of using the inference classes. This is a convenient way to get started quickly.
```python
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
)
```
```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  stt: "deepgram/nova-3:en",
  llm: "openai/gpt-4.1-mini",
  tts: "cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
});
```
For detailed parameter references and model-specific options, see the individual model guides for LLM, STT, and TTS.
Models
The following tables list all models currently available through LiveKit Inference.
Pricing
See the latest pricing for all LiveKit Inference models.
Large language models (LLM)
| Model family | Model name | Model ID |
|---|---|---|
| DeepSeek | DeepSeek-V3 | deepseek-ai/deepseek-v3 |
| DeepSeek | DeepSeek-V3.2 (Retired) | deepseek-ai/deepseek-v3.2 |
| Gemini | Gemini 2.5 Flash | google/gemini-2.5-flash |
| Gemini | Gemini 2.5 Flash-Lite | google/gemini-2.5-flash-lite |
| Gemini | Gemini 2.5 Pro | google/gemini-2.5-pro |
| Gemini | Gemini 3 Flash | google/gemini-3-flash-preview |
| Gemini | Gemini 3.1 Flash Lite | google/gemini-3.1-flash-lite-preview |
| Gemini | Gemini 3.1 Pro | google/gemini-3.1-pro-preview |
| Gemini | Gemini 2.0 Flash (Retired) | google/gemini-2.0-flash |
| Gemini | Gemini 2.0 Flash-Lite (Retired) | google/gemini-2.0-flash-lite |
| Gemini | Gemini 3 Pro (Retired) | google/gemini-3-pro-preview |
| Kimi | Kimi K2.5 | moonshotai/kimi-k2.5 |
| Kimi | Kimi K2 Instruct (Retired) | moonshotai/kimi-k2-instruct |
| GPT | GPT-4.1 | openai/gpt-4.1 |
| GPT | GPT-4.1 mini | openai/gpt-4.1-mini |
| GPT | GPT-4.1 nano | openai/gpt-4.1-nano |
| GPT | GPT-4o | openai/gpt-4o |
| GPT | GPT-4o mini | openai/gpt-4o-mini |
| GPT | GPT-5 | openai/gpt-5 |
| GPT | GPT-5 mini | openai/gpt-5-mini |
| GPT | GPT-5 nano | openai/gpt-5-nano |
| GPT | GPT-5.1 | openai/gpt-5.1 |
| GPT | GPT-5.1 Chat | openai/gpt-5.1-chat-latest |
| GPT | GPT-5.2 | openai/gpt-5.2 |
| GPT | GPT-5.2 Chat | openai/gpt-5.2-chat-latest |
| GPT | GPT-5.3 Chat | openai/gpt-5.3-chat-latest |
| GPT | GPT-5.4 | openai/gpt-5.4 |
| GPT | GPT OSS 120B | openai/gpt-oss-120b |
| Qwen | Qwen3 235B-A22B Instruct | qwen/qwen3-235b-a22b-instruct |
Speech-to-text (STT)
| Provider | Model name | Model ID | Languages |
|---|---|---|---|
| AssemblyAI | Universal-3 Pro Streaming | assemblyai/u3-rt-pro | 6 languages |
| AssemblyAI | Universal-Streaming | assemblyai/universal-streaming | English only |
| AssemblyAI | Universal-Streaming-Multilingual | assemblyai/universal-streaming-multilingual | 6 languages |
| Cartesia | Ink Whisper | cartesia/ink-whisper | 100 languages |
| Deepgram | Flux | deepgram/flux-general-en | English only |
| Deepgram | Nova-2 | deepgram/nova-2 | Multilingual (33 languages) |
| Deepgram | Nova-2 Conversational AI | deepgram/nova-2-conversationalai | English only |
| Deepgram | Nova-2 Medical | deepgram/nova-2-medical | English only |
| Deepgram | Nova-2 Phone Call | deepgram/nova-2-phonecall | English only |
| Deepgram | Nova-3 (Monolingual) | deepgram/nova-3 | 44 languages |
| Deepgram | Nova-3 Medical | deepgram/nova-3-medical | English only |
| Deepgram | Nova-3 (Multilingual) | deepgram/nova-3-multi | Multilingual |
| ElevenLabs | Scribe v2 Realtime | elevenlabs/scribe_v2_realtime | 190 languages |
Text-to-speech (TTS)
| Provider | Model name | Model ID | Languages |
|---|---|---|---|
| Cartesia | Sonic | cartesia/sonic | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | Sonic 2 | cartesia/sonic-2 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | Sonic 3 | cartesia/sonic-3 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | Sonic 3 (2025-10-27) | cartesia/sonic-3-2025-10-27 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | Sonic 3 (2026-01-12) | cartesia/sonic-3-2026-01-12 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | Sonic Turbo | cartesia/sonic-turbo | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Deepgram | Aura-2 | deepgram/aura-2 | en, en-US, en-PH, en-GB, en-AU, es, es-CO, es-MX, es-ES, es-419, es-AR, nl, nl-NL, fr, fr-FR, de, de-DE, it, it-IT, ja, ja-JP |
| Deepgram | Aura-1 (Retired) | deepgram/aura | en, en-US, en-IE, en-GB |
| ElevenLabs | Eleven Flash v2 | elevenlabs/eleven_flash_v2 | en |
| ElevenLabs | Eleven Flash v2.5 | elevenlabs/eleven_flash_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| ElevenLabs | Eleven Multilingual v2 | elevenlabs/eleven_multilingual_v2 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru |
| ElevenLabs | Eleven Turbo v2 | elevenlabs/eleven_turbo_v2 | en |
| ElevenLabs | Eleven Turbo v2.5 | elevenlabs/eleven_turbo_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| Inworld | Inworld TTS 1 | inworld/inworld-tts-1 | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi, he, ar |
| Inworld | Inworld TTS 1 Max | inworld/inworld-tts-1-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi, he, ar |
| Inworld | Inworld TTS 1.5 Max | inworld/inworld-tts-1.5-max | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Inworld | Inworld TTS 1.5 Mini | inworld/inworld-tts-1.5-mini | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Rime | Arcana | rime/arcana | en, es, fr, de, hi, he, ja, pt, ar |
| Rime | Mist | rime/mist | en |
| Rime | Mist v2 | rime/mistv2 | en, es, fr, de |
| xAI | Text to Speech | xai/tts-1 | auto, en, ar-EG, ar-SA, ar-AE, bn, zh, fr, de, hi, id, it, ja, ko, pt-BR, pt-PT, ru, es-MX, es-ES, tr, vi |
Billing
LiveKit Inference billing is based on usage. Discounted rates are available on the Scale plan, and custom rates are available on the Enterprise plan. Refer to the following articles for more information on quotas, limits, and billing for LiveKit Inference. The latest pricing is always available on the LiveKit Inference pricing page.