# LiveKit Inference

> Access the best AI models for voice agents, included in LiveKit Cloud.

## Overview

![Overview showing LiveKit Inference serving a STT-LLM-TTS pipeline for a voice agent.](/images/agents/inference.svg)

LiveKit Inference provides access to many of the best models and providers for voice agents, including models from OpenAI, Google, AssemblyAI, Deepgram, Cartesia, ElevenLabs, and more. It is included in LiveKit Cloud and does not require any additional plugins. See the guides for [LLM](https://docs.livekit.io/agents/models/llm.md), [STT](https://docs.livekit.io/agents/models/stt.md), and [TTS](https://docs.livekit.io/agents/models/tts.md) for supported models and configuration options.

To learn more about LiveKit Inference, see the blog post [Introducing LiveKit Inference: A unified model interface for voice AI](https://blog.livekit.io/introducing-livekit-inference/).

For LiveKit Inference models, use the `inference` module classes in your `AgentSession`:

**Python**:

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    stt=inference.STT(
        model="deepgram/flux-general",
        language="en"
    ),
    llm=inference.LLM(
        model="openai/gpt-5.3-chat-latest",
    ),
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    ),
)

```

---

**Node.js**:

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
    stt: new inference.STT({
        model: "deepgram/flux-general",
        language: "en"
    }),
    llm: new inference.LLM({
        model: "openai/gpt-5.3-chat-latest",
    }),
    tts: new inference.TTS({
        model: "cartesia/sonic-3",
        voice: "9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    }),
});

```

## String descriptors

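As a shortcut, you can pass a model descriptor string directly instead of using the inference classes. This is a convenient way to get started quickly. In the examples below, the STT descriptor appends a language code after a colon, and the TTS descriptor appends a voice ID in the same way.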

**Python**:

```python
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3:en",
    llm="openai/gpt-5.3-chat-latest",
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
)

```

---

**Node.js**:

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
    stt: "deepgram/nova-3:en",
    llm: "openai/gpt-5.3-chat-latest",
    tts: "cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
});

```

For detailed parameter references and model-specific options, see the individual model guides for [LLM](https://docs.livekit.io/agents/models/llm.md), [STT](https://docs.livekit.io/agents/models/stt.md), and [TTS](https://docs.livekit.io/agents/models/tts.md).
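
The descriptor strings in the examples above appear to follow a `provider/model[:option]` shape, where the optional suffix is a language code for STT and a voice ID for TTS. The following sketch splits a descriptor into those parts. It is only an illustration of the format as inferred from the examples on this page: `parse_descriptor` and `ModelDescriptor` are hypothetical names, not part of the LiveKit SDK, and the SDK's own parsing may differ.

```python
from typing import NamedTuple, Optional


class ModelDescriptor(NamedTuple):
    """Parts of a descriptor string (hypothetical type, not a LiveKit class)."""

    provider: str            # e.g. "deepgram"
    model: str               # e.g. "nova-3"
    option: Optional[str]    # text after ":", such as a language code or voice ID


def parse_descriptor(descriptor: str) -> ModelDescriptor:
    """Split "provider/model[:option]" into its parts.

    Assumption: the format is inferred from the examples on this page.
    """
    # Separate the optional suffix from the model ID at the first colon.
    model_id, colon, option = descriptor.partition(":")
    # Separate the provider from the model name at the first slash.
    provider, slash, model = model_id.partition("/")
    if not slash or not provider or not model:
        raise ValueError(f"expected 'provider/model[:option]', got {descriptor!r}")
    return ModelDescriptor(provider, model, option if colon else None)


stt = parse_descriptor("deepgram/nova-3:en")
llm = parse_descriptor("openai/gpt-5.3-chat-latest")
print(stt.provider, stt.model, stt.option)
print(llm.provider, llm.model, llm.option)
```

A helper like this can be useful for logging or validating configuration before it is passed to `AgentSession`. The session itself accepts the raw strings directly, as shown above.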

## Models

The following tables list all models currently available through LiveKit Inference.

- **[Pricing](https://livekit.com/pricing/inference)**: See the latest pricing for all LiveKit Inference models.

### Large language models (LLM)

| Model family | Model name | Provided by |
| ------------- | ---------- | ----------- |
| DeepSeek | DeepSeek-V3 | Baseten |
|   | DeepSeek-V3.1 | Baseten |
|   | DeepSeek-V3.2 | DeepSeek |
| Gemini | Gemini 2.0 Flash | Google |
|   | Gemini 2.0 Flash-Lite | Google |
|   | Gemini 2.5 Flash | Google |
|   | Gemini 2.5 Flash-Lite | Google |
|   | Gemini 2.5 Pro | Google |
|   | Gemini 3 Flash | Google |
|   | Gemini 3 Pro | Google |
|   | Gemini 3.1 Flash Lite | Google |
|   | Gemini 3.5 Flash | Google |
|   | Gemini 3.1 Pro | Google |
| Kimi | Kimi K2 Instruct | Baseten |
|   | Kimi K2.5 | Baseten |
| OpenAI | GPT-4.1 | Azure, OpenAI |
|   | GPT-4.1 mini | Azure, OpenAI |
|   | GPT-4.1 nano | Azure, OpenAI |
|   | GPT-4o | Azure, OpenAI |
|   | GPT-4o mini | Azure, OpenAI |
|   | GPT-5 | Azure, OpenAI |
|   | GPT-5 mini | Azure, OpenAI |
|   | GPT-5 nano | Azure, OpenAI |
|   | GPT-5.1 | Azure, OpenAI |
|   | GPT-5.1 Chat | Azure, OpenAI |
|   | GPT-5.2 | Azure, OpenAI |
|   | GPT-5.2 Chat | Azure, OpenAI |
|   | GPT-5.3 Chat | Azure, OpenAI |
|   | GPT-5.4 | Azure, OpenAI |
|   | GPT-5.4 mini | OpenAI |
|   | GPT-5.4 nano | OpenAI |
|   | GPT-5.5 | Azure, OpenAI |
|   | ChatGPT Latest | OpenAI |
|   | GPT OSS 120B | Baseten, Cerebras, Groq |
| Azure | ChatGPT Latest | Azure |
| xAI | Grok 4.1 Fast | xAI |
|   | Grok 4.1 Fast Reasoning | xAI |
|   | Grok 4.20 | xAI |
|   | Grok 4.20 Reasoning | xAI |
|   | Grok 4.20 Multi-Agent | xAI |

### Speech-to-text (STT)

| Provider | Model name | Languages |
| -------- | -------- | --------- |
| [Speechmatics](https://docs.livekit.io/agents/models/stt/speechmatics.md) | Speechmatics Enhanced | 61 languages |
|   | Speechmatics Standard | 61 languages |
| [AssemblyAI](https://docs.livekit.io/agents/models/stt/assemblyai.md) | Universal-3 Pro Streaming | 6 languages |
|   | Universal-Streaming | English only |
|   | Universal-Streaming-Multilingual | Multilingual, 6 languages |
| [Cartesia](https://docs.livekit.io/agents/models/stt/cartesia.md) | Ink Whisper | 100 languages |
|   | Ink 2 | English only |
|   | Ink 2 Latest | English only |
|   | Ink 2 (2026-04-15) | English only |
| [Deepgram](https://docs.livekit.io/agents/models/stt/deepgram.md) | Flux | English only |
|   | Flux (Multilingual) | Multilingual, 10 languages |
|   | Nova-2 | Multilingual, 33 languages |
|   | Nova-2 Conversational AI | English only |
|   | Nova-2 Medical | English only |
|   | Nova-2 Phone Call | English only |
|   | Nova-3 | Multilingual, 45 languages |
|   | Nova-3 Medical | English only |
| [ElevenLabs](https://docs.livekit.io/agents/models/stt/elevenlabs.md) | Scribe v2 Realtime | 190 languages |
| [xAI](https://docs.livekit.io/agents/models/stt/xai.md) | Speech to Text | 25 languages |

### Text-to-speech (TTS)

| Provider | Model ID | Languages |
| -------- | -------- | --------- |
| [Cartesia](https://docs.livekit.io/agents/models/tts/cartesia.md) | `cartesia/sonic` | `en`, `fr`, `de`, `es`, `pt`, `zh`, `ja`, `hi`, `it`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr` |
|   | `cartesia/sonic-2` | `en`, `fr`, `de`, `es`, `pt`, `zh`, `ja`, `ko` |
|   | `cartesia/sonic-3` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3-latest` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-latest` | `en`, `de`, `es`, `ja`, `pt`, `zh`, `hi`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3.5` | `en`, `de`, `es`, `ja`, `pt`, `zh`, `hi`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3.5-2026-05-04` | `en`, `de`, `es`, `ja`, `pt`, `zh`, `hi`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3-2025-10-27` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3-2026-01-12` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-turbo` | `en`, `fr`, `de`, `es`, `pt`, `zh`, `ja`, `hi`, `ko` |
| [Deepgram](https://docs.livekit.io/agents/models/tts/deepgram.md) | `deepgram/aura` | `en`, `en-US`, `en-IE`, `en-GB` |
|   | `deepgram/aura-2` | `en`, `en-US`, `en-PH`, `en-GB`, `en-AU`, `es`, `es-CO`, `es-MX`, `es-ES`, `es-419`, `es-AR`, `nl`, `nl-NL`, `fr`, `fr-FR`, `de`, `de-DE`, `it`, `it-IT`, `ja`, `ja-JP` |
| [ElevenLabs](https://docs.livekit.io/agents/models/tts/elevenlabs.md) | `elevenlabs/eleven_flash_v2` | `en` |
|   | `elevenlabs/eleven_flash_v2_5` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru`, `hu`, `no`, `vi` |
|   | `elevenlabs/eleven_multilingual_v2` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru` |
|   | `elevenlabs/eleven_turbo_v2` | `en` |
|   | `elevenlabs/eleven_turbo_v2_5` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru`, `hu`, `no`, `vi` |
|   | `elevenlabs/eleven_v3` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru`, `hu`, `no`, `vi` |
| [Inworld](https://docs.livekit.io/agents/models/tts/inworld.md) | `inworld/inworld-tts-2` | `en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1` | `en`, `es`, `fr`, `ko`, `nl`, `zh`, `de`, `it`, `ja`, `pl`, `pt`, `ru`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1-max` | `en`, `es`, `fr`, `ko`, `nl`, `zh`, `de`, `it`, `ja`, `pl`, `pt`, `ru`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1.5-max` | `en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1.5-mini` | `en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar` |
| [Rime](https://docs.livekit.io/agents/models/tts/rime.md) | `rime/arcana` | `en`, `es`, `fr`, `de`, `hi`, `he`, `ja`, `pt`, `ar` |
|   | `rime/coda` | `en`, `es`, `fr`, `de`, `pt`, `ja` |
|   | `rime/mist` | `en` |
|   | `rime/mistv2` | `en`, `es`, `fr`, `de` |
|   | `rime/mistv3` | `en`, `es`, `fr`, `de`, `hi` |
| [xAI](https://docs.livekit.io/agents/models/tts/xai.md) | `xai/tts-1` | `auto`, `en`, `ar-EG`, `ar-SA`, `ar-AE`, `bn`, `zh`, `fr`, `de`, `hi`, `id`, `it`, `ja`, `ko`, `pt-BR`, `pt-PT`, `ru`, `es-MX`, `es-ES`, `tr`, `vi` |
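
When an agent must speak a particular language, the table above can be used to narrow the choice of TTS model. The sketch below copies the language codes for a few rows into a dictionary and filters by exact code match. The dictionary covers only a subset of the table, and `TTS_LANGUAGES` and `models_supporting` are hypothetical names used for illustration, not part of the LiveKit SDK.

```python
# Language codes copied from the TTS table above for a few models.
# This is a partial, hand-maintained subset, not an API response.
TTS_LANGUAGES = {
    "rime/mist": {"en"},
    "rime/mistv2": {"en", "es", "fr", "de"},
    "rime/mistv3": {"en", "es", "fr", "de", "hi"},
    "cartesia/sonic-turbo": {"en", "fr", "de", "es", "pt", "zh", "ja", "hi", "ko"},
}


def models_supporting(language: str) -> list[str]:
    """Return the model IDs in TTS_LANGUAGES that list the given language code."""
    return sorted(
        model_id
        for model_id, languages in TTS_LANGUAGES.items()
        if language in languages
    )


print(models_supporting("hi"))  # ['cartesia/sonic-turbo', 'rime/mistv3']
```

The match is on the exact code, so a regional code such as `en-US` would need its own entry or a fallback to the base language.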

## Billing

LiveKit Inference billing is based on usage. Discounted rates are available on the Scale plan, and custom rates are available on the Enterprise plan. The latest pricing is always available on the [LiveKit Inference pricing page](https://livekit.com/pricing/inference). For more information on quotas, limits, and billing for LiveKit Inference, refer to the following articles.

- **[Quotas and limits](https://docs.livekit.io/deploy/admin/quotas-and-limits.md)**: Guide to quotas and limits for LiveKit Cloud plans.

- **[Billing](https://docs.livekit.io/deploy/admin/billing.md)**: Guide to LiveKit Cloud invoices and billing cycles.

---

This document was rendered at 2026-06-07T11:33:15.365Z.
For the latest version of this document, see [https://docs.livekit.io/agents/models/inference.md](https://docs.livekit.io/agents/models/inference.md).

To explore all LiveKit documentation, see [llms.txt](https://docs.livekit.io/llms.txt).