Cartesia TTS integration guide

How to use the Cartesia TTS plugin for LiveKit Agents.

Try the playground

Chat with a voice assistant built with LiveKit and Cartesia TTS

Overview

Cartesia provides customizable, natural-sounding speech synthesis with low latency in many languages. You can use the Cartesia TTS plugin for LiveKit Agents to build voice AI applications that sound realistic.

Quick reference

This section includes a brief overview of the Cartesia TTS plugin. For more information, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[cartesia]~=1.0rc"

Authentication

The Cartesia plugin requires a Cartesia API key.

Set CARTESIA_API_KEY in your .env file.
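
A missing key usually surfaces only when the first synthesis request fails, so it can help to check for it at startup. The following is a minimal sketch, not part of the plugin: require_cartesia_api_key is a hypothetical helper, and it assumes your project has already loaded the .env file into the process environment (for example with a tool such as python-dotenv).

```python
import os


def require_cartesia_api_key(env=None):
    """Return the Cartesia API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of the Cartesia plugin.
    `env` defaults to the process environment and can be any mapping,
    which makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; add it to your .env file"
        )
    return key
```

Calling require_cartesia_api_key() once before creating the session turns a late authentication failure into an immediate, readable error.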

Usage

Use Cartesia TTS within an AgentSession or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import cartesia

session = AgentSession(
    tts=cartesia.TTS(
        model="sonic-english",
        voice="c2ac25f9-ecc4-4f56-9095-651354df60c0",
        speed=0.8,  # slightly faster than normal
        emotion=["curiosity:highest", "positivity:high"],
    ),
    # ... llm, stt, etc.
)

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: sonic)

ID of the model to use for generation. See supported models.

voice (string | list[float], optional, default: c2ac25f9-ecc4-4f56-9095-651354df60c0)

ID of the voice to use for generation, or an embedding array. See official documentation.

speed (string | float, optional, default: 1.0)

Speed of generated speech. Either a float in range [-1.0, 1.0], or one of "fastest", "fast", "normal", "slow", "slowest". See speed options.

emotion (list[string], optional, default: neutral)

Emotion of generated speech. See emotion options.

language (string, optional, default: en)

Language of input text in ISO-639-1 format. For a list of languages supported by each model, see supported models.
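
Because speed accepts either a float or a named preset, a small check before constructing the TTS instance can catch typos early. The sketch below encodes only the rule stated for the speed parameter above (a float in [-1.0, 1.0] or one of the five named values). validate_speed and NAMED_SPEEDS are hypothetical names for illustration, not part of the plugin, and the plugin may perform its own validation.

```python
# Named presets listed in the speed parameter description.
NAMED_SPEEDS = ("fastest", "fast", "normal", "slow", "slowest")


def validate_speed(speed):
    """Return `speed` unchanged if it matches the documented forms.

    Hypothetical helper: accepts a named preset or a number in
    [-1.0, 1.0], and raises otherwise.
    """
    if isinstance(speed, str):
        if speed not in NAMED_SPEEDS:
            raise ValueError(
                f"speed must be one of {NAMED_SPEEDS}, got {speed!r}"
            )
        return speed
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        raise TypeError("speed must be a string or a float")
    value = float(speed)
    if not -1.0 <= value <= 1.0:
        raise ValueError(f"speed must be in [-1.0, 1.0], got {value}")
    return value
```

For example, validate_speed(0.8) and validate_speed("slow") pass their input through, while validate_speed(1.5) and validate_speed("quick") raise ValueError.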

Customizing pronunciation

Cartesia TTS allows you to customize pronunciation using Speech Synthesis Markup Language (SSML). To learn more, see Specify Custom Pronunciations.

Additional resources

The following resources provide more information about using Cartesia with LiveKit Agents.

Voice AI quickstart

Get started with LiveKit Agents and Cartesia TTS.
