Cartesia STT integration guide

How to use the Cartesia STT plugin for LiveKit Agents.

Available in: Python

Overview

Cartesia provides speech recognition through its Ink-Whisper model, which is optimized for real-time transcription in conversational settings. With LiveKit's Cartesia integration and the Agents framework, you can build AI agents that transcribe speech accurately and with low latency.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[cartesia]~=1.2"

Authentication

The Cartesia plugin requires a Cartesia API key.

Set CARTESIA_API_KEY in your .env file.
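
A missing key usually surfaces only when the first request is made, so it can help to check for it at startup. The following is a minimal sketch, not part of the plugin: the helper name get_cartesia_api_key is hypothetical, and it assumes the contents of your .env file have already been loaded into the process environment (for example by your process manager or a dotenv loader).

```python
import os


def get_cartesia_api_key(env=os.environ):
    """Return the Cartesia API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; `env` is any mapping of
    environment variables and defaults to the real process environment.
    """
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set. Add it to your .env file or "
            "export it in the shell before starting the agent."
        )
    return key
```

Calling get_cartesia_api_key() once before constructing the session turns a late authentication failure into an immediate, readable error.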

Usage

Use Cartesia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import cartesia

session = AgentSession(
    stt=cartesia.STT(
        model="ink-whisper",
    ),
    # ... llm, tts, etc.
)

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: ink-whisper)

Selected model to use for STT. See Cartesia STT models for supported values.

language (string, optional, default: en)

Language of input audio in ISO-639-1 format. See Cartesia STT models for supported values.
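
If the language code comes from user input or configuration, a small check can catch malformed values before they reach the plugin. The sketch below is illustrative only: normalize_language is a hypothetical helper, and it verifies only the two-letter shape of an ISO 639-1 code, not whether Cartesia supports that language.

```python
import re

# ISO 639-1 codes are exactly two letters, such as "en" or "fr".
_ISO_639_1_SHAPE = re.compile(r"[a-z]{2}")


def normalize_language(code):
    """Lowercase and trim a language code, rejecting non two-letter values.

    Hypothetical helper: this checks format only and does not consult
    Cartesia's list of supported languages.
    """
    normalized = code.strip().lower()
    if not _ISO_639_1_SHAPE.fullmatch(normalized):
        raise ValueError(
            f"Expected a two-letter ISO 639-1 language code, got {code!r}"
        )
    return normalized
```

The result can then be passed as the language parameter described above, for example cartesia.STT(model="ink-whisper", language=normalize_language("EN")).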

Additional resources

The following resources provide more information about using Cartesia with LiveKit Agents.