# Text-to-speech (TTS) models overview

> Voices and plugins to add realtime speech to your voice agents.

## Overview

Voice agent speech is produced by a TTS model, configured with a voice profile that specifies tone, accent, and other qualitative characteristics of the speech. The TTS model runs on output from an [LLM](https://docs.livekit.io/agents/models/llm.md) to speak the agent's response to the user.

You can choose a voice model served through LiveKit Inference, which is included with LiveKit Cloud. With LiveKit Inference, your agent runs on LiveKit's infrastructure to minimize latency. No separate provider API key is required, and usage and rate limits are managed through LiveKit Cloud. Use a plugin instead if you prefer to manage billing and rate limits yourself, or need access to a provider not currently available through LiveKit Inference.

## LiveKit Inference

LiveKit Inference provides a curated set of TTS models with managed billing and rate limits. Each model comes with a limited selection of [suggested voices](#voices), and a wider selection is listed in each provider's documentation.

### Models

The following models are available in [LiveKit Inference](https://docs.livekit.io/agents/models.md#inference). Refer to the guide for each model for more details on additional configuration options.

| Provider | Model ID | Languages |
| -------- | -------- | --------- |
| [Cartesia](https://docs.livekit.io/agents/models/tts/cartesia.md) | `cartesia/sonic` | `en`, `fr`, `de`, `es`, `pt`, `zh`, `ja`, `hi`, `it`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr` |
|   | `cartesia/sonic-2` | `en`, `fr`, `de`, `es`, `pt`, `zh`, `ja`, `ko` |
|   | `cartesia/sonic-3` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3-latest` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-latest` | `en`, `de`, `es`, `ja`, `pt`, `zh`, `hi`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3.5` | `en`, `de`, `es`, `ja`, `pt`, `zh`, `hi`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3.5-2026-05-04` | `en`, `de`, `es`, `ja`, `pt`, `zh`, `hi`, `ko`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3-2025-10-27` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-3-2026-01-12` | `en`, `de`, `es`, `fr`, `ja`, `pt`, `zh`, `hi`, `ko`, `it`, `nl`, `pl`, `ru`, `sv`, `tr`, `tl`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `hu`, `no`, `vi`, `bn`, `th`, `he`, `ka`, `id`, `te`, `gu`, `kn`, `ml`, `mr`, `pa` |
|   | `cartesia/sonic-turbo` | `en`, `fr`, `de`, `es`, `pt`, `zh`, `ja`, `hi`, `ko` |
| [Deepgram](https://docs.livekit.io/agents/models/tts/deepgram.md) | `deepgram/aura` | `en`, `en-US`, `en-IE`, `en-GB` |
|   | `deepgram/aura-2` | `en`, `en-US`, `en-PH`, `en-GB`, `en-AU`, `es`, `es-CO`, `es-MX`, `es-ES`, `es-419`, `es-AR`, `nl`, `nl-NL`, `fr`, `fr-FR`, `de`, `de-DE`, `it`, `it-IT`, `ja`, `ja-JP` |
| [ElevenLabs](https://docs.livekit.io/agents/models/tts/elevenlabs.md) | `elevenlabs/eleven_flash_v2` | `en` |
|   | `elevenlabs/eleven_flash_v2_5` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru`, `hu`, `no`, `vi` |
|   | `elevenlabs/eleven_multilingual_v2` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru` |
|   | `elevenlabs/eleven_turbo_v2` | `en` |
|   | `elevenlabs/eleven_turbo_v2_5` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru`, `hu`, `no`, `vi` |
|   | `elevenlabs/eleven_v3` | `en`, `ja`, `zh`, `de`, `hi`, `fr`, `ko`, `pt`, `it`, `es`, `id`, `nl`, `tr`, `fil`, `pl`, `sv`, `bg`, `ro`, `ar`, `cs`, `el`, `fi`, `hr`, `ms`, `sk`, `da`, `ta`, `uk`, `ru`, `hu`, `no`, `vi` |
| [Inworld](https://docs.livekit.io/agents/models/tts/inworld.md) | `inworld/inworld-tts-2` | `en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1` | `en`, `es`, `fr`, `ko`, `nl`, `zh`, `de`, `it`, `ja`, `pl`, `pt`, `ru`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1-max` | `en`, `es`, `fr`, `ko`, `nl`, `zh`, `de`, `it`, `ja`, `pl`, `pt`, `ru`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1.5-max` | `en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar` |
|   | `inworld/inworld-tts-1.5-mini` | `en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar` |
| [Rime](https://docs.livekit.io/agents/models/tts/rime.md) | `rime/arcana` | `en`, `es`, `fr`, `de`, `hi`, `he`, `ja`, `pt`, `ar` |
|   | `rime/coda` | `en`, `es`, `fr`, `de`, `pt`, `ja` |
|   | `rime/mist` | `en` |
|   | `rime/mistv2` | `en`, `es`, `fr`, `de` |
|   | `rime/mistv3` | `en`, `es`, `fr`, `de`, `hi` |
| [xAI](https://docs.livekit.io/agents/models/tts/xai.md) | `xai/tts-1` | `auto`, `en`, `ar-EG`, `ar-SA`, `ar-AE`, `bn`, `zh`, `fr`, `de`, `hi`, `id`, `it`, `ja`, `ko`, `pt-BR`, `pt-PT`, `ru`, `es-MX`, `es-ES`, `tr`, `vi` |
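
When choosing a model for a target language, it can help to treat the table above as data. The following is a minimal sketch that copies only the Rime rows from the table into a dictionary and filters them by language code. The names `RIME_MODEL_LANGUAGES` and `models_supporting` are hypothetical helpers for illustration and are not part of the LiveKit SDK.

```python
# Rime rows copied from the table above; extend with other providers as needed.
RIME_MODEL_LANGUAGES: dict[str, list[str]] = {
    "rime/arcana": ["en", "es", "fr", "de", "hi", "he", "ja", "pt", "ar"],
    "rime/coda": ["en", "es", "fr", "de", "pt", "ja"],
    "rime/mist": ["en"],
    "rime/mistv2": ["en", "es", "fr", "de"],
    "rime/mistv3": ["en", "es", "fr", "de", "hi"],
}


def models_supporting(language: str) -> list[str]:
    """Return the model IDs whose language list contains `language`."""
    return [
        model_id
        for model_id, languages in RIME_MODEL_LANGUAGES.items()
        if language in languages
    ]


print(models_supporting("hi"))  # → ['rime/arcana', 'rime/mistv3']
```

The lookup is an exact string match, so a regional code such as `en-US` only matches a row that lists it explicitly.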

### Custom voices

You can create and use custom voices with LiveKit Inference. Upload or record a sample, and LiveKit clones it to all [supported TTS providers](https://docs.livekit.io/agents/models/tts/custom-voices.md#providers) on your plan. You can then use the clone in your agent sessions with any of those providers.

- **[Custom voices](https://docs.livekit.io/agents/models/tts/custom-voices.md)**: Create voice clones from short audio samples.

### Suggested voices

The following voices are good choices for overall quality and performance. Each provider has a much larger selection of voices to choose from, which you can find in their documentation. In addition to the voices below, you can choose to use other TTS provider voices through LiveKit Inference.

Each value in the ID column combines the model ID and the voice ID, separated by a colon. Use these values to configure the voice in your agent session.

| Provider | Name | Description | Language | ID |
| -------- | ---- | ----------- | -------- | -------- |
| Cartesia | Blake | Energetic American adult male | `en-US` | `cartesia/sonic-3:a167e0f3-df7e-4d52-a9c3-f949145efdab` |
| Cartesia | Daniela | Calm and trusting Mexican female | `es-MX` | `cartesia/sonic-3:5c5ad5e7-1020-476b-8b91-fdcbe9cc313c` |
| Cartesia | Jacqueline | Confident, young American adult female | `en-US` | `cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc` |
| Cartesia | Robyn | Neutral, mature Australian female | `en-AU` | `cartesia/sonic-3:f31cc6a7-c1e8-4764-980c-60a361443dd1` |
| Deepgram | Apollo | Comfortable, casual male | `en-US` | `deepgram/aura-2:apollo` |
| Deepgram | Athena | Smooth, professional female | `en-US` | `deepgram/aura-2:athena` |
| Deepgram | Odysseus | Calm, professional male | `en-US` | `deepgram/aura-2:odysseus` |
| Deepgram | Theia | Expressive, polite female | `en-AU` | `deepgram/aura-2:theia` |
| ElevenLabs | Alice | Clear and engaging, friendly British woman | `en-GB` | `elevenlabs/eleven_turbo_v2_5:Xb7hH8MSUJpSbSDYk0k2` |
| ElevenLabs | Chris | Natural and real American male | `en-US` | `elevenlabs/eleven_turbo_v2_5:iP95p4xoKVk53GoZ742B` |
| ElevenLabs | Eric | A smooth tenor Mexican male | `es-MX` | `elevenlabs/eleven_turbo_v2_5:cjVigY5qzO86Huf0OWal` |
| ElevenLabs | Jessica | Young and popular, playful American female | `en-US` | `elevenlabs/eleven_turbo_v2_5:cgSgspJ2msm6clMCkdW9` |
| Rime | Astra | Chipper, upbeat American female | `en-US` | `rime/arcana:astra` |
| Rime | Celeste | Chill Gen-Z American female | `en-US` | `rime/arcana:celeste` |
| Rime | Luna | Chill but excitable American female | `en-US` | `rime/arcana:luna` |
| Rime | Ursa | Young, emo American male | `en-US` | `rime/arcana:ursa` |
| Inworld | Ashley | Warm, natural American female | `en-US` | `inworld/inworld-tts-1:Ashley` |
| Inworld | Diego | Soothing, gentle Mexican male | `es-MX` | `inworld/inworld-tts-1:Diego` |
| Inworld | Edward | Fast-talking, emphatic American male | `en-US` | `inworld/inworld-tts-1:Edward` |
| Inworld | Olivia | Upbeat, friendly British female | `en-GB` | `inworld/inworld-tts-1:Olivia` |
| xAI | Ara | Warm, friendly | `en-US` | `xai/tts-1:ara` |
| xAI | Eve | Energetic, upbeat | `en-US` | `xai/tts-1:eve` |
| xAI | Leo | Authoritative, strong | `en-US` | `xai/tts-1:leo` |
| xAI | Rex | Confident, clear | `en-US` | `xai/tts-1:rex` |
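
Each ID in the table above has the form `model:voice`. If your code passes the model and voice as separate values, a small helper can split the combined ID. This is a minimal sketch; `split_voice_descriptor` is a hypothetical name and is not part of the LiveKit SDK. It assumes the model ID itself contains no colon, which holds for every row in the table.

```python
def split_voice_descriptor(descriptor: str) -> tuple[str, str]:
    """Split a 'model:voice' ID into its (model, voice) parts."""
    # partition splits at the first colon only, so the voice part is kept whole.
    model, separator, voice = descriptor.partition(":")
    model, voice = model.strip(), voice.strip()
    if not separator or not model or not voice:
        raise ValueError(f"expected 'model:voice', got {descriptor!r}")
    return model, voice


model, voice = split_voice_descriptor(
    "cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"
)
print(model)  # → cartesia/sonic-3
print(voice)  # → 9626c31c-bec5-4cca-baa8-f8ba9e84c8bc
```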

## Plugins

The LiveKit Agents framework includes open source [plugins](https://docs.livekit.io/agents/models.md#plugins) for a wide range of TTS providers. Use a plugin when you need provider-specific features not available through Inference, want to manage billing directly, or need a provider not currently in Inference. Plugins require your own API key and account.

| Provider | Python | Node.js |
| -------- | ------ | ------- |
| [Amazon Polly](https://docs.livekit.io/agents/models/tts/plugins/aws.md) | ✓ | — |
| [AsyncAI](https://docs.livekit.io/agents/models/tts/plugins/asyncai.md) | ✓ | — |
| [Azure AI Speech](https://docs.livekit.io/agents/models/tts/plugins/azure.md) | ✓ | — |
| [Azure OpenAI](https://docs.livekit.io/agents/models/tts/plugins/azure-openai.md) | ✓ | — |
| [Baseten](https://docs.livekit.io/agents/models/tts/plugins/baseten.md) | ✓ | — |
| [Camb.ai](https://docs.livekit.io/agents/models/tts/plugins/cambai.md) | ✓ | — |
| [Cartesia](https://docs.livekit.io/agents/models/tts/plugins/cartesia.md) | ✓ | ✓ |
| [Deepgram](https://docs.livekit.io/agents/models/tts/plugins/deepgram.md) | ✓ | ✓ |
| [ElevenLabs](https://docs.livekit.io/agents/models/tts/plugins/elevenlabs.md) | ✓ | ✓ |
| [Gemini](https://docs.livekit.io/agents/models/tts/plugins/gemini.md) | ✓ | — |
| [Google Cloud](https://docs.livekit.io/agents/models/tts/plugins/google.md) | ✓ | — |
| [Gradium](https://docs.livekit.io/agents/models/tts/plugins/gradium.md) | ✓ | — |
| [Groq](https://docs.livekit.io/agents/models/tts/plugins/groq.md) | ✓ | — |
| [Hume](https://docs.livekit.io/agents/models/tts/plugins/hume.md) | ✓ | — |
| [Inworld](https://docs.livekit.io/agents/models/tts/plugins/inworld.md) | ✓ | ✓ |
| [LMNT](https://docs.livekit.io/agents/models/tts/plugins/lmnt.md) | ✓ | — |
| [MiniMax](https://docs.livekit.io/agents/models/tts/plugins/minimax.md) | ✓ | — |
| [Mistral AI](https://docs.livekit.io/agents/models/tts/plugins/mistralai.md) | ✓ | ✓ |
| [Murf AI](https://docs.livekit.io/agents/models/tts/plugins/murf.md) | ✓ | — |
| [Neuphonic](https://docs.livekit.io/agents/models/tts/plugins/neuphonic.md) | ✓ | ✓ |
| [Nvidia](https://docs.livekit.io/agents/models/tts/plugins/nvidia.md) | ✓ | — |
| [OpenAI](https://docs.livekit.io/agents/models/tts/plugins/openai.md) | ✓ | ✓ |
| [Resemble AI](https://docs.livekit.io/agents/models/tts/plugins/resemble.md) | ✓ | ✓ |
| [Rime](https://docs.livekit.io/agents/models/tts/plugins/rime.md) | ✓ | ✓ |
| [Sarvam](https://docs.livekit.io/agents/models/tts/plugins/sarvam.md) | ✓ | ✓ |
| [Simplismart](https://docs.livekit.io/agents/models/tts/plugins/simplismart.md) | ✓ | — |
| [SLNG](https://docs.livekit.io/agents/models/tts/plugins/slng.md) | ✓ | — |
| [Smallest AI](https://docs.livekit.io/agents/models/tts/plugins/smallestai.md) | ✓ | — |
| [Soniox](https://docs.livekit.io/agents/models/tts/plugins/soniox.md) | ✓ | — |
| [Speechify](https://docs.livekit.io/agents/models/tts/plugins/speechify.md) | ✓ | — |
| [Speechmatics](https://docs.livekit.io/agents/models/tts/plugins/speechmatics.md) | ✓ | — |
| [Spitch](https://docs.livekit.io/agents/models/tts/plugins/spitch.md) | ✓ | — |
| [xAI](https://docs.livekit.io/agents/models/tts/plugins/xai.md) | ✓ | — |

Have another provider in mind? LiveKit is open source and welcomes [new plugin contributions](https://docs.livekit.io/agents/models.md#contribute).

## TTS usage

To set up TTS in an `AgentSession`, provide a descriptor with both the desired model and voice. LiveKit Inference manages the connection to the model automatically. Consult the [suggested voices](#voices) list for a starting point, or view the model reference for more voices.

**Python**:

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    tts=inference.TTS(
        model="cartesia/sonic-3",
        voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        language="en",
    ),
    # ... llm, stt, etc.
)

```

---

**Node.js**:

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
    tts: new inference.TTS({
        model: "cartesia/sonic-3",
        voice: "9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        language: "en",
    }),
    // ... llm, stt, etc.
})

```

### Additional parameters

More configuration options, such as custom pronunciation, are available for each model. To set additional parameters, use the `TTS` class from the `inference` module. Consult each model reference for examples and available parameters.

### Language codes

All TTS plugins and LiveKit Inference use the `LanguageCode` type for the `language` parameter. `LanguageCode` accepts any common language format and normalizes it automatically to [BCP-47](https://www.rfc-editor.org/info/bcp47). You don't need to look up the specific format each provider expects. Pass any of the following and the framework handles the conversion:

- ISO 639-1: `"en"`, `"es"`, `"fr"`
- BCP-47 with region: `"en-US"`, `"zh-Hans-CN"`
- ISO 639-3: `"eng"`, `"spa"`
- Language names: `"english"`, `"spanish"`
- Underscored variants: `"en_us"` (normalized to `"en-US"`)

For example, the following inputs are all accepted and normalize as shown:

**Python**:

```python
from livekit.agents import LanguageCode

LanguageCode("english")  # → "en"
LanguageCode("eng")      # → "en"
LanguageCode("en")       # → "en"
LanguageCode("en-US")    # → "en-US"
LanguageCode("en_us")    # → "en-US"

```

`LanguageCode` is a `str` subclass, so you can use it anywhere a string is expected. It also provides properties for extracting parts of the code:

- `.language`: Base ISO 639-1 code (e.g., `"en"` from `"en-US"`).
- `.region`: Region subtag, if present (e.g., `"US"` from `"en-US"`).
- `.iso`: ISO 639-1 tag with region (e.g., `"zh-CN"` from `"cmn-Hans-CN"`).

---

**Node.js**:

```typescript
import { normalizeLanguage } from '@livekit/agents';

normalizeLanguage("english")  // → "en"
normalizeLanguage("eng")      // → "en"
normalizeLanguage("en")       // → "en"
normalizeLanguage("en-US")    // → "en-US"
normalizeLanguage("en_us")    // → "en-US"

```

In Node.js, `LanguageCode` is a branded `string` type. Use `normalizeLanguage()` to convert a plain string to a `LanguageCode`, and the standalone helper functions to extract parts of the code:

- `getBaseLanguage(lang)`: Base ISO 639-1 code (e.g., `"en"` from `"en-US"`).
- `getLanguageRegion(lang)`: Region subtag, if present (e.g., `"US"` from `"en-US"`).
- `getIsoLanguage(lang)`: ISO 639-1 tag with region (e.g., `"zh-CN"` from `"cmn-Hans-CN"`).
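
To make the separator and casing rules concrete, here is a simplified, pure-Python stand-in. It is not LiveKit's implementation: `normalize_tag` is a hypothetical function that only replaces underscores and applies the conventional BCP-47 casing (lowercase language, title-case script, uppercase region). It does not convert language names or ISO 639-3 codes, and it ignores extension subtags.

```python
def normalize_tag(tag: str) -> str:
    """Apply conventional BCP-47 separators and casing to a language tag."""
    parts = tag.replace("_", "-").split("-")
    normalized = [parts[0].lower()]  # primary language subtag is lowercase
    for subtag in parts[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            normalized.append(subtag.title())  # script, e.g. "Hans"
        elif len(subtag) == 2 and subtag.isalpha():
            normalized.append(subtag.upper())  # region, e.g. "US"
        else:
            normalized.append(subtag.lower())  # numeric regions and anything else
    return "-".join(normalized)


print(normalize_tag("en_us"))       # → en-US
print(normalize_tag("ZH-hans-cn"))  # → zh-Hans-CN
print(normalize_tag("es-419"))      # → es-419
```

In an agent, use `LanguageCode` or `normalizeLanguage()` rather than a hand-written function like this one.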

### Custom TTS

To create an entirely custom TTS, implement the [TTS node](https://docs.livekit.io/agents/build/nodes.md#tts_node) in your agent.

### Standalone TTS

You can use a `TTS` instance as a standalone component by creating a stream. Use `push_text` to add text to the stream, and then consume a stream of `SynthesizedAudio` to publish as [realtime audio](https://docs.livekit.io/transport/media.md) to another participant.

Here is an example of a standalone TTS app:

**Filename: `agent.py`**

```python
import asyncio
from typing import AsyncIterable

from livekit import agents, rtc
from livekit.agents import AgentServer
from livekit.agents.tts import SynthesizedAudio
from livekit.plugins import cartesia


server = AgentServer()

@server.rtc_session(agent_name="my-agent")
async def my_agent(ctx: agents.JobContext):
    text_stream: AsyncIterable[str] = ... # you need to provide a stream of text

    tts = cartesia.TTS(model="sonic-english")

    # match the audio source to the TTS output format so frames can be captured as-is
    audio_source = rtc.AudioSource(tts.sample_rate, tts.num_channels)

    track = rtc.LocalAudioTrack.create_audio_track("agent-audio", audio_source)
    await ctx.room.local_participant.publish_track(track)

    tts_stream = tts.stream()

    # define the consumer before scheduling it; each item carries one audio frame
    async def send_audio(audio_stream: AsyncIterable[SynthesizedAudio]):
        async for a in audio_stream:
            await audio_source.capture_frame(a.frame)

    # create a task to consume and publish audio frames
    audio_task = asyncio.create_task(send_audio(tts_stream))

    # push text into the stream, TTS stream will emit audio frames along with events
    # indicating sentence (or segment) boundaries.
    async for text in text_stream:
        tts_stream.push_text(text)
    tts_stream.end_input()

    # wait until all synthesized audio has been published
    await audio_task

if __name__ == "__main__":
    agents.cli.run_app(server)

```
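
The example above leaves `text_stream` for you to supply. As one possible source, the sketch below is an async generator that yields a fixed string one sentence at a time. `sentence_stream` and `collect` are hypothetical helpers for illustration, not LiveKit APIs. In a real agent the text would more likely come from an LLM or another upstream producer.

```python
import asyncio
import re
from typing import AsyncIterator


async def sentence_stream(text: str, delay: float = 0.0) -> AsyncIterator[str]:
    """Yield `text` one sentence at a time, with an optional pause between chunks."""
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if sentence:
            # Keep a trailing space so consecutive chunks stay separated.
            yield sentence + " "
            await asyncio.sleep(delay)  # simulate text arriving over time


async def collect(text: str) -> list[str]:
    """Drain the stream into a list, for demonstration only."""
    return [chunk async for chunk in sentence_stream(text)]


chunks = asyncio.run(collect("Hello there. How can I help you today?"))
print(chunks)  # → ['Hello there. ', 'How can I help you today? ']
```

Any object that supports `async for` and yields strings can fill the same role.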

## Additional resources

The following resources cover related topics that may be useful for your application.

- **[Agent speech docs](https://docs.livekit.io/agents/build/audio.md)**: Explore the speech capabilities and features of LiveKit Agents.

- **[Pipeline nodes](https://docs.livekit.io/agents/build/nodes.md)**: Learn how to customize the behavior of your agent by overriding nodes in the voice pipeline.

- **[Inference pricing](https://livekit.com/pricing/inference#tts)**: The latest pricing information for TTS models in LiveKit Inference.

---

This document was rendered at 2026-06-07T11:33:40.386Z.
For the latest version of this document, see [https://docs.livekit.io/agents/models/tts.md](https://docs.livekit.io/agents/models/tts.md).

To explore all LiveKit documentation, see [llms.txt](https://docs.livekit.io/llms.txt).