Chat with a voice assistant built with LiveKit and Cartesia TTS
Overview
Cartesia provides customizable speech synthesis (TTS) across a number of different languages and produces natural-sounding speech with low latency. With LiveKit's Cartesia integration and the Agents framework, you can build AI voice applications that sound realistic. For a demonstration of what you can build, try out the LiveKit voice assistant with Cartesia.
If you're looking to build an AI voice assistant with Cartesia, check out our Voice Agent Quickstart guide and use the Cartesia TTS module as demonstrated below.
Quick reference
Environment variables
CARTESIA_API_KEY=<your-cartesia-api-key>
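A missing key usually only surfaces once the first synthesis request fails, so it can help to check for it at startup. The following is a minimal sketch, not part of the plugin: `get_cartesia_api_key` is a hypothetical helper, and the optional `env` argument exists only so the function can be exercised without touching the real environment.

```python
import os


def get_cartesia_api_key(env=None):
    """Return the Cartesia API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value the same as an unset variable.
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; export it before starting the agent."
        )
    return key
```

Calling `get_cartesia_api_key()` once when the process starts turns a late authentication failure into an immediate, readable error.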
TTS
LiveKit's Cartesia integration provides a text-to-speech (TTS) interface. You can use it in a VoicePipelineAgent or as a standalone speech generator. For a complete reference of all available parameters, see the plugin reference.
Usage
from livekit.plugins.cartesia import tts

cartesia_tts = tts.TTS(
    model="sonic-english",
    voice="c2ac25f9-ecc4-4f56-9095-651354df60c0",
    speed=0.8,
    emotion=["curiosity:highest", "positivity:high"],
)
Parameters
model: ID of the model to use for generation. See supported models.
voice: ID of the voice to use for generation, or an embedding array. See official documentation.
speed: Speed of generated speech. Either a float in range [-1.0, 1.0], or one of "fastest", "fast", "normal", "slow", "slowest". See speed options.
emotion: Emotion of generated speech. See emotion options.
language: Language of input text in ISO-639-1 format. For a list of languages supported by each model, see supported models.
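The speed and emotion values described above can be checked before they are passed to the TTS constructor. This is a rough sketch under stated assumptions: `validate_speed` and `split_emotion` are hypothetical helpers, not plugin functions. The speed rules mirror the range and names listed above, and the "name:level" emotion shape is inferred from the usage example, so consult the emotion options for the authoritative format.

```python
# Named speeds as listed in the parameter description above.
SPEED_NAMES = {"fastest", "fast", "normal", "slow", "slowest"}


def validate_speed(speed):
    """Return the speed unchanged if it is a valid name or a float in [-1.0, 1.0]."""
    if isinstance(speed, str):
        if speed not in SPEED_NAMES:
            raise ValueError(f"unknown speed name: {speed!r}")
        return speed
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        raise TypeError("speed must be a string or a number")
    if not -1.0 <= speed <= 1.0:
        raise ValueError("numeric speed must be in the range [-1.0, 1.0]")
    return float(speed)


def split_emotion(tag):
    """Split an emotion tag such as "curiosity:highest" into (name, level).

    Assumption: a tag without a colon carries no level, returned as None.
    """
    name, sep, level = tag.partition(":")
    return name, (level if sep else None)
```

For example, `validate_speed(0.8)` and `validate_speed("slow")` pass through, while `validate_speed(1.5)` raises a `ValueError` before any request is made.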