
Cartesia STT

How to use Cartesia STT with LiveKit Agents.

Overview

Cartesia speech-to-text is available in LiveKit Agents through LiveKit Inference and the Cartesia plugin. Pricing for LiveKit Inference is available on the pricing page.

LiveKit Inference

Use LiveKit Inference to access Cartesia STT without a separate Cartesia API key.

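Model name: Ink Whisper
Model ID: cartesia/ink-whisper
Languages: en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, hi, fi, vi, he, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su, yue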

Usage

To use Cartesia, use the STT class from the inference module:

Python:

    from livekit.agents import AgentSession, inference

    session = AgentSession(
        stt=inference.STT(
            model="cartesia/ink-whisper",
            language="en",
        ),
        # ... llm, tts, vad, turn_handling, etc.
    )

Node.js:

    import { AgentSession, inference } from '@livekit/agents';

    const session = new AgentSession({
        stt: new inference.STT({
            model: "cartesia/ink-whisper",
            language: "en",
        }),
        // ... llm, tts, vad, turnHandling, etc.
    });

Parameters

model (string, required)

The model to use for the STT.

language (LanguageCode, optional)

Language code for the transcription. If not set, the provider default applies.

extra_kwargs (dict, optional)

Additional parameters to pass to the Cartesia STT API, including min_volume and max_silence_duration_secs. See the provider's documentation for more information.

In Node.js this parameter is called modelOptions.
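
As a rough sketch of how extra_kwargs might be assembled in Python: the helper names build_cartesia_kwargs and make_stt are hypothetical, and any values you pass are your own choice, not recommended settings. Only the two key names and the extra_kwargs parameter come from the description above. The livekit.agents import is deferred into make_stt so the dictionary logic can be exercised without the package installed.

```python
def build_cartesia_kwargs(min_volume=None, max_silence_duration_secs=None):
    """Collect only the options that were explicitly set (hypothetical helper)."""
    candidates = {
        "min_volume": min_volume,
        "max_silence_duration_secs": max_silence_duration_secs,
    }
    # Leave unset options out so the provider defaults still apply.
    return {key: value for key, value in candidates.items() if value is not None}


def make_stt(language="en", **options):
    """Build an inference STT for Cartesia, forwarding options as extra_kwargs."""
    # Imported lazily so this module loads without livekit-agents installed.
    from livekit.agents import inference

    return inference.STT(
        model="cartesia/ink-whisper",
        language=language,
        extra_kwargs=build_cartesia_kwargs(**options),
    )
```

Filtering out unset options keeps the request minimal, so anything you do not specify falls back to the provider default rather than to a value hard-coded in your agent.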

String descriptors

As a shortcut, you can also pass a model descriptor string directly to the stt argument in your AgentSession:

Python:

    from livekit.agents import AgentSession

    session = AgentSession(
        stt="cartesia/ink-whisper:en",
        # ... llm, tts, vad, turn_handling, etc.
    )

Node.js:

    import { AgentSession } from '@livekit/agents';

    const session = new AgentSession({
        stt: "cartesia/ink-whisper:en",
        // ... llm, tts, vad, turnHandling, etc.
    });
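
The descriptor in the example above appears to follow a model:language pattern. The sketch below builds and splits such strings under that assumption; stt_descriptor and split_descriptor are hypothetical helpers, not part of LiveKit Agents, and the pattern is inferred from the single example shown rather than from a formal specification.

```python
def stt_descriptor(model, language):
    """Join a model ID and a language code into one descriptor string."""
    if ":" in model or ":" in language:
        raise ValueError("model and language must not contain ':'")
    return f"{model}:{language}"


def split_descriptor(descriptor):
    """Split a descriptor into (model, language); language is None if absent."""
    model, _, language = descriptor.partition(":")
    return model, (language or None)
```

For example, stt_descriptor("cartesia/ink-whisper", "en") produces the same string that is passed to the stt argument above, which can be handy when the language comes from configuration rather than a literal.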

Plugin

Use the Cartesia plugin to connect directly to Cartesia's STT API with your own API key.

Available in: Python

Installation

Install the plugin from PyPI:

uv add "livekit-agents[cartesia]~=1.4"

Authentication

The Cartesia plugin requires a Cartesia API key.

Set CARTESIA_API_KEY in your .env file.
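
A missing key is easier to diagnose at startup than on the first transcription request. The following is a minimal sketch of such a check; require_api_key is a hypothetical helper, and it assumes the .env file has already been loaded into the process environment (for example by a dotenv loader or your process manager), since Python does not read .env files on its own.

```python
import os


def require_api_key(name="CARTESIA_API_KEY", environ=None):
    """Return the named key from the environment, or raise a clear error."""
    # Accept an explicit mapping so the check can be tested without
    # touching the real process environment.
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling require_api_key() once before creating the session turns a silent misconfiguration into an immediate, descriptive failure.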

Usage

Use Cartesia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.

    from livekit.agents import AgentSession
    from livekit.plugins import cartesia

    session = AgentSession(
        stt=cartesia.STT(
            model="ink-whisper",
        ),
        # ... llm, tts, etc.
    )

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: ink-whisper)

Selected model to use for STT. See Cartesia STT models for supported values.

language (LanguageCode, optional, default: en)

Language code for the input audio. For supported languages, see Cartesia STT models.

Additional resources

The following resources provide more information about using Cartesia with LiveKit Agents.