LiveKit Inference

Access the best AI models for voice agents, included in LiveKit Cloud.

Overview

Diagram: LiveKit Inference serving an STT-LLM-TTS pipeline for a voice agent.

LiveKit Inference provides access to many of the best models and providers for voice agents, including models from OpenAI, Google, AssemblyAI, Deepgram, Cartesia, ElevenLabs, and more. LiveKit Inference is included in LiveKit Cloud, and does not require any additional plugins. See the guides for LLM, STT, and TTS for supported models and configuration options.

To learn more about LiveKit Inference, see the blog post Introducing LiveKit Inference: A unified model interface for voice AI.

To use LiveKit Inference models, pass the `inference` module classes to your `AgentSession`:

**Python**

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    stt=inference.STT(
        model="deepgram/flux-general",
        language="en",
    ),
    llm=inference.LLM(
        model="openai/gpt-4.1-mini",
    ),
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    ),
)
```

**Node.js**

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  stt: new inference.STT({
    model: 'deepgram/flux-general',
    language: 'en',
  }),
  llm: new inference.LLM({
    model: 'openai/gpt-4.1-mini',
  }),
  tts: new inference.TTS({
    model: 'cartesia/sonic-3',
    voice: '9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
  }),
});
```

String descriptors

As a shortcut, you can pass a model descriptor string directly instead of using the `inference` classes. A descriptor takes the form `provider/model`, optionally followed by a colon and a model option: a language code for STT, or a voice ID for TTS. This is a convenient way to get started quickly.

**Python**

```python
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
)
```

**Node.js**

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  stt: 'deepgram/nova-3:en',
  llm: 'openai/gpt-4.1-mini',
  tts: 'cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
});
```

For detailed parameter references and model-specific options, see the individual model guides for LLM, STT, and TTS.

Models

The following tables list all models currently available through LiveKit Inference.

Pricing

See the latest pricing for all LiveKit Inference models.

Large language models (LLM)

| Model name | Model ID |
| --- | --- |
| GPT-4o | openai/gpt-4o |
| GPT-4o mini | openai/gpt-4o-mini |
| GPT-4.1 | openai/gpt-4.1 |
| GPT-4.1 mini | openai/gpt-4.1-mini |
| GPT-4.1 nano | openai/gpt-4.1-nano |
| GPT-5 | openai/gpt-5 |
| GPT-5 mini | openai/gpt-5-mini |
| GPT-5 nano | openai/gpt-5-nano |
| GPT-5.1 | openai/gpt-5.1 |
| GPT-5.1 Chat Latest | openai/gpt-5.1-chat-latest |
| GPT-5.2 | openai/gpt-5.2 |
| GPT-5.2 Chat Latest | openai/gpt-5.2-chat-latest |
| GPT OSS 120B | openai/gpt-oss-120b |
| Gemini 3 Pro | google/gemini-3-pro |
| Gemini 3 Flash | google/gemini-3-flash |
| Gemini 2.5 Pro | google/gemini-2.5-pro |
| Gemini 2.5 Flash | google/gemini-2.5-flash |
| Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite |
| Kimi K2 Instruct | moonshotai/kimi-k2-instruct |
| DeepSeek V3 | deepseek-ai/deepseek-v3 |
| DeepSeek V3.2 | deepseek-ai/deepseek-v3.2 |

Speech-to-text (STT)

| Provider | Model name | Languages |
| --- | --- | --- |
| AssemblyAI | Universal-Streaming | English only |
| AssemblyAI | Universal-Streaming-Multilingual | 6 languages |
| Cartesia | Ink Whisper | 100 languages |
| Deepgram | Flux | English only |
| Deepgram | Nova-3 | Multilingual, 9 languages |
| Deepgram | Nova-3 Medical | English only |
| Deepgram | Nova-2 | Multilingual, 33 languages |
| Deepgram | Nova-2 Medical | English only |
| Deepgram | Nova-2 Conversational AI | English only |
| Deepgram | Nova-2 Phonecall | English only |
| ElevenLabs | Scribe V2 Realtime | 41 languages |

Text-to-speech (TTS)

| Provider | Model ID | Languages |
| --- | --- | --- |
| Cartesia | cartesia/sonic-3 | en, de, es, fr, ja, pt, zh, hi, ko, it, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa |
| Cartesia | cartesia/sonic-2 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | cartesia/sonic-turbo | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Cartesia | cartesia/sonic | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
| Deepgram | deepgram/aura-2 | en, es |
| ElevenLabs | elevenlabs/eleven_flash_v2 | en |
| ElevenLabs | elevenlabs/eleven_flash_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| ElevenLabs | elevenlabs/eleven_turbo_v2 | en |
| ElevenLabs | elevenlabs/eleven_turbo_v2_5 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi |
| ElevenLabs | elevenlabs/eleven_multilingual_v2 | en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru |
| Inworld | inworld/inworld-tts-1.5-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi |
| Inworld | inworld/inworld-tts-1.5-mini | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru, hi |
| Inworld | inworld/inworld-tts-1-max | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru |
| Inworld | inworld/inworld-tts-1 | en, es, fr, ko, nl, zh, de, it, ja, pl, pt, ru |
| Rime | rime/arcana | en, es, fr, de, ar, he, hi, ja, pt |
| Rime | rime/mistv2 | en, es, fr, de |

Billing

LiveKit Inference billing is based on usage. Discounted rates are available on the Scale plan, and custom rates are available on the Enterprise plan. Refer to the following articles for more information on quotas, limits, and billing for LiveKit Inference. The latest pricing is always available on the LiveKit Inference pricing page.