ElevenLabs TTS integration guide

How to use the ElevenLabs TTS plugin for LiveKit Agents.

Overview

ElevenLabs provides an AI text-to-speech (TTS) service with thousands of human-like voices across many languages. With LiveKit's ElevenLabs integration and the Agents framework, you can build voice AI applications that sound realistic.

Quick reference

This section provides a quick reference for the ElevenLabs TTS plugin. For more information, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[elevenlabs]~=1.0rc"

Authentication

The ElevenLabs plugin requires an ElevenLabs API key.

Set ELEVEN_API_KEY in your .env file.
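
As a minimal sketch, a startup check like the following can fail fast when the key is missing. It assumes the .env file has already been loaded into the process environment by whatever loader your project uses. The helper name require_eleven_api_key is hypothetical and is not part of the plugin.

```python
import os


def require_eleven_api_key(env=None):
    """Return the ElevenLabs API key, or raise if it is not set.

    Hypothetical helper for illustration only; not part of the plugin.
    `env` defaults to the process environment and can be replaced with
    a plain dict when testing.
    """
    env = os.environ if env is None else env
    key = env.get("ELEVEN_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ELEVEN_API_KEY is not set; add it to your .env file")
    return key
```

Calling this once before creating the session surfaces a missing key as a clear error instead of a failure on the first synthesis request.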

Usage

Use ElevenLabs TTS within an AgentSession or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import elevenlabs

session = AgentSession(
    tts=elevenlabs.TTS(
        model="eleven_turbo_v2_5",
        voice=elevenlabs.Voice(
            id="EXAVITQu4vr4xnSDxMaL",
            name="Bella",
            category="premade",
            # Per-voice overrides; omit settings to use the voice's defaults.
            settings=elevenlabs.VoiceSettings(
                stability=0.71,
                speed=1.0,
                similarity_boost=0.5,
                style=0.0,
                use_speaker_boost=True,
            ),
        ),
        language="en",
        streaming_latency=3,
        enable_ssml_parsing=False,
        chunk_length_schedule=[80, 120, 200, 260],
    ),
    # ... llm, stt, etc.
)

Parameters

This section describes some of the parameters you can set when you create an ElevenLabs TTS. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: eleven_turbo_v2_5)

ID of the model to use for generation. To learn more, see the ElevenLabs documentation.

voice (Voice, optional, default: DEFAULT_VOICE)

Voice configuration. To learn more, see the ElevenLabs documentation.

  • id (string, required)

    ID of the voice to use for generation. To learn more, see the ElevenLabs documentation.

  • name (string, required)
  • category (string, required)
  • settings (VoiceSettings, optional)

    Voice settings that override the default settings for a given voice. To learn more, see voice_settings in the ElevenLabs documentation.

    • stability (float, required)
    • similarity_boost (float, required)
    • style (float, optional, default: 0)
    • use_speaker_boost (bool, optional, default: true)

language (string, optional, default: en)

Language of output audio in ISO-639-1 format. To learn more, see the ElevenLabs documentation.

streaming_latency (int, optional, default: 3)

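Latency optimization level for streaming. This appears to map to the ElevenLabs optimize_streaming_latency setting, which is a level rather than a duration in seconds; higher values trade some quality for lower latency.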

enable_ssml_parsing (bool, optional, default: false)

Enable Speech Synthesis Markup Language (SSML) parsing for input text. Set to true to customize pronunciation using SSML.

chunk_length_schedule (list[int], optional, default: [80, 120, 200, 260])

Schedule for chunk lengths. Valid values range from 50 to 500.
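
The range constraint above can be checked before the values reach the plugin. The following is a minimal sketch; the function name validate_chunk_length_schedule is hypothetical, and whether the plugin performs its own validation is not something this guide states.

```python
def validate_chunk_length_schedule(schedule):
    """Return the schedule as a list after checking every value.

    Hypothetical helper for illustration only. Each entry must be an
    integer between 50 and 500 inclusive, matching the documented range.
    """
    values = list(schedule)
    for value in values:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"chunk length must be an int, got {value!r}")
        if not 50 <= value <= 500:
            raise ValueError(f"chunk length {value} is outside 50..500")
    return values
```

For example, validate_chunk_length_schedule([80, 120, 200, 260]) returns the list unchanged, while a schedule containing 40 raises ValueError.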

Customizing pronunciation

ElevenLabs supports customizing pronunciation for specific words or phrases using SSML phoneme tags. This is useful to ensure correct pronunciation of certain words, even when they are missing from the voice's lexicon. SSML is parsed only when enable_ssml_parsing is set to true. To learn more, see Pronunciation.
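
As a rough illustration of what a phoneme tag looks like, the sketch below builds one as a string using only the Python standard library. The helper name phoneme_tag is hypothetical. The alphabet values "ipa" and "cmu-arpabet" are taken from the ElevenLabs pronunciation documentation as best understood here; treat them, and which models honor the tag, as assumptions to confirm there.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_tag(word, ph, alphabet="ipa"):
    """Wrap `word` in an SSML phoneme tag with pronunciation `ph`.

    Hypothetical helper for illustration only. Attribute values are
    quoted and the word is escaped so the result stays well-formed.
    """
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )


# Text to send to the TTS, with one word given an explicit pronunciation.
text = "I say " + phoneme_tag("tomato", "T AH M EY T OW", "cmu-arpabet") + "."
```

The resulting text has an effect only when the TTS is created with enable_ssml_parsing=True.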

Additional resources

The following resources provide more information about using ElevenLabs with LiveKit Agents.

Voice AI quickstart

Get started with LiveKit Agents and ElevenLabs TTS.