Cartesia STT integration guide

How to use the Cartesia STT plugin for LiveKit Agents.

Overview

Cartesia provides speech recognition through its Ink-Whisper model, which is optimized for real-time transcription in conversational settings. With LiveKit's Cartesia integration and the Agents framework, you can build AI agents that produce accurate transcriptions with low latency.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[cartesia]~=1.0"

Authentication

The Cartesia plugin requires a Cartesia API key.

Set CARTESIA_API_KEY in your .env file.
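
A startup check can catch a missing key before the agent begins handling audio. The following is a minimal sketch, assuming the contents of your .env file have already been loaded into the process environment. The helper name require_cartesia_key is hypothetical and is not part of the plugin.

```python
import os


def require_cartesia_key(env=None):
    """Return the Cartesia API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict when testing.
    """
    env = os.environ if env is None else env
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CARTESIA_API_KEY is not set; add it to your .env file")
    return key
```

Calling require_cartesia_key() once at startup turns a missing or empty key into an immediate, readable error.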

Usage

Use Cartesia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import cartesia

session = AgentSession(
    stt=cartesia.STT(
        model="ink-whisper",
    ),
    # ... llm, tts, etc.
)

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: ink-whisper)

Selected model to use for STT. See Cartesia STT models for supported values.

language (string, optional, default: en)

Language of input audio in ISO-639-1 format. See Cartesia STT models for supported values.
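
The two parameters above can be collected and sanity-checked before the STT instance is constructed. This is a sketch only: build_stt_options is a hypothetical helper, not part of the plugin, and it checks only that the language looks like a two-letter ISO-639-1 code, not that the selected model supports it.

```python
import re

# ISO-639-1 codes are two lowercase letters, e.g. "en" or "fr".
_ISO_639_1 = re.compile(r"[a-z]{2}")


def build_stt_options(model="ink-whisper", language="en"):
    """Hypothetical helper: validate and collect keyword arguments for cartesia.STT."""
    if not _ISO_639_1.fullmatch(language):
        raise ValueError(f"language must be an ISO-639-1 code, got {language!r}")
    return {"model": model, "language": language}


options = build_stt_options(language="fr")
# With the plugin installed, these could be passed as: cartesia.STT(**options)
```

Keeping the options in a plain dictionary makes them easy to log or load from configuration. Check Cartesia's model documentation to confirm that a given language is supported.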

Additional resources

The following resources provide more information about using Cartesia with LiveKit Agents.