Overview
LiveKit Inference provides access to many of the best models and providers for voice agents, including models from OpenAI, Google, AssemblyAI, Deepgram, Cartesia, ElevenLabs, and more. LiveKit Inference is included in LiveKit Cloud, and does not require any additional plugins. See the guides for LLM, STT, and TTS for supported models and configuration options.
To learn more about LiveKit Inference, see the blog post Introducing LiveKit Inference: A unified model interface for voice AI.
For LiveKit Inference models, use the inference module classes in your AgentSession:
```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    stt=inference.STT(model="deepgram/flux-general", language="en"),
    llm=inference.LLM(model="openai/gpt-4.1-mini"),
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    ),
)
```
```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  stt: new inference.STT({ model: 'deepgram/flux-general', language: 'en' }),
  llm: new inference.LLM({ model: 'openai/gpt-4.1-mini' }),
  tts: new inference.TTS({
    model: 'cartesia/sonic-3',
    voice: '9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
  }),
});
```
String descriptors
As a shortcut, you can pass a model descriptor string directly instead of using the inference classes. This is a convenient way to get started quickly.
```python
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
)
```
```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  stt: 'deepgram/nova-3:en',
  llm: 'openai/gpt-4.1-mini',
  tts: 'cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
});
```
For detailed parameter references and model-specific options, see the individual model guides for LLM, STT, and TTS.
Models
The following tables list all models currently available through LiveKit Inference.
Pricing
See the latest pricing for all LiveKit Inference models.
Large language models (LLM)
| Model name | Model ID |
|---|---|
| GPT-4o | openai/gpt-4o |
| GPT-4o mini | openai/gpt-4o-mini |
| GPT-4.1 | openai/gpt-4.1 |
| GPT-4.1 mini | openai/gpt-4.1-mini |
| GPT-4.1 nano | openai/gpt-4.1-nano |
| GPT-5 | openai/gpt-5 |
| GPT-5 mini | openai/gpt-5-mini |
| GPT-5 nano | openai/gpt-5-nano |
| GPT-5.1 | openai/gpt-5.1 |
| GPT-5.1 Chat Latest | openai/gpt-5.1-chat-latest |
| GPT-5.2 | openai/gpt-5.2 |
| GPT-5.2 Chat Latest | openai/gpt-5.2-chat-latest |
| GPT OSS 120B | openai/gpt-oss-120b |
| Gemini 3 Pro | google/gemini-3-pro |
| Gemini 3 Flash | google/gemini-3-flash |
| Gemini 2.5 Pro | google/gemini-2.5-pro |
| Gemini 2.5 Flash | google/gemini-2.5-flash |
| Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite |
| Kimi K2 Instruct | moonshotai/kimi-k2-instruct |
| DeepSeek V3 | deepseek-ai/deepseek-v3 |
| DeepSeek V3.2 | deepseek-ai/deepseek-v3.2 |
Speech-to-text (STT)
| Provider | Model name | Languages |
|---|---|---|
| AssemblyAI | Universal-Streaming | English only |
| AssemblyAI | Universal-Streaming-Multilingual | 6 languages |
| Cartesia | Ink Whisper | 100 languages |
| Deepgram | Flux | English only |
| Deepgram | Nova-3 | Multilingual, 9 languages |
| Deepgram | Nova-3 Medical | English only |
| Deepgram | Nova-2 | Multilingual, 33 languages |
| Deepgram | Nova-2 Medical | English only |
| Deepgram | Nova-2 Conversational AI | English only |
| Deepgram | Nova-2 Phonecall | English only |
| ElevenLabs | Scribe V2 Realtime | 41 languages |
Text-to-speech (TTS)
| Model ID | Languages |
|---|---|
| cartesia/sonic-3 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| cartesia/sonic-2 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| cartesia/sonic-turbo | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| cartesia/sonic | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| deepgram/aura-2 | en, es |
| elevenlabs/eleven_flash_v2 | en |
| elevenlabs/eleven_flash_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| elevenlabs/eleven_turbo_v2 | en |
| elevenlabs/eleven_turbo_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| elevenlabs/eleven_multilingual_v2 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru |
| inworld/inworld-tts-1.5-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi |
| inworld/inworld-tts-1.5-mini | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi |
| inworld/inworld-tts-1-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru |
| inworld/inworld-tts-1 | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru |
| rime/arcana | en, es, fr, de, ar, he, hi, ja, pt |
| rime/mistv2 | en, es, fr, de |
Billing
LiveKit Inference billing is based on usage. Discounted rates are available on the Scale plan, and custom rates are available on the Enterprise plan. Refer to the following articles for more information on quotas, limits, and billing. The latest pricing is always available on the LiveKit Inference pricing page.