Cartesia STT

How to use Cartesia STT with LiveKit Agents.

Overview

Cartesia speech-to-text is available in LiveKit Agents through LiveKit Inference and the Cartesia plugin. Pricing for LiveKit Inference is available on the pricing page.

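Model name: Ink Whisper
Model ID: cartesia/ink-whisper
Languages: en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, vi, he, hi, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, fi, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su, yue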
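
Because Ink Whisper accepts a fixed set of language codes, it can be useful to validate a code before passing it as the language parameter. The following is a minimal sketch and not part of LiveKit or Cartesia: the helper name is_supported_language is hypothetical, and the set simply copies the codes listed above.

```python
# Language codes listed for cartesia/ink-whisper in the table above.
INK_WHISPER_LANGUAGES = frozenset(
    """
    en zh de es ru ko fr ja pt tr pl ca nl ar sv it id vi he hi
    uk el ms cs ro da hu ta no th ur hr bg lt la mi ml cy sk te
    fa fi lv bn sr az sl kn et mk br eu is hy ne mn bs kk sq sw
    gl mr pa si km sn yo so af oc ka be tg sd gu am yi lo uz fo
    ht ps tk nn mt sa lb my bo tl mg as tt haw ln ha ba jw su yue
    """.split()
)


def is_supported_language(code: str) -> bool:
    """Return True if `code` is one of the listed codes (hypothetical helper)."""
    # Normalize whitespace and case so "EN " matches "en".
    return code.strip().lower() in INK_WHISPER_LANGUAGES
```

A check such as is_supported_language("en") can run before the session is created, so an unsupported code is caught in your own code rather than by the provider.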

LiveKit Inference

Use LiveKit Inference to access Cartesia STT without a separate Cartesia API key.

Usage

To use Cartesia, use the STT class from the inference module:

Python:

    from livekit.agents import AgentSession, inference

    session = AgentSession(
        stt=inference.STT(
            model="cartesia/ink-whisper",
            language="en",
        ),
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession, inference } from '@livekit/agents';

    const session = new AgentSession({
      stt: new inference.STT({
        model: "cartesia/ink-whisper",
        language: "en",
      }),
      // ... llm, tts, vad, turn_detection, etc.
    });

Parameters

model (string, required)

The model to use for STT.

language (string, optional)

Language code for the transcription. If not set, the provider default applies.

extra_kwargs (dict, optional)

Additional parameters to pass to the Cartesia STT API, including min_volume and max_silence_duration_secs. See the provider's documentation for more information.

In Node.js this parameter is called modelOptions.
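
As a rough illustration of how these provider options might be assembled, the sketch below builds an extra_kwargs dictionary and drops any option that is left unset. The helper cartesia_stt_options is hypothetical, and the numeric values are placeholders rather than recommended settings; check Cartesia's documentation for the valid ranges.

```python
from typing import Optional


def cartesia_stt_options(
    min_volume: Optional[float] = None,
    max_silence_duration_secs: Optional[float] = None,
) -> dict:
    """Collect Cartesia STT options, omitting unset ones (hypothetical helper)."""
    options = {
        "min_volume": min_volume,
        "max_silence_duration_secs": max_silence_duration_secs,
    }
    # Keep only the options the caller actually provided.
    return {name: value for name, value in options.items() if value is not None}


# Placeholder values chosen only for illustration.
extra_kwargs = cartesia_stt_options(min_volume=0.1, max_silence_duration_secs=0.5)

# The result would then be passed along, for example:
#   inference.STT(model="cartesia/ink-whisper", language="en", extra_kwargs=extra_kwargs)
```

Omitting unset options leaves the provider defaults in place instead of sending explicit null values.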

String descriptors

As a shortcut, you can also pass a model descriptor string directly to the stt argument in your AgentSession:

Python:

    from livekit.agents import AgentSession

    session = AgentSession(
        stt="cartesia/ink-whisper:en",
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession } from '@livekit/agents';

    const session = new AgentSession({
      stt: "cartesia/ink-whisper:en",
      // ... llm, tts, vad, turn_detection, etc.
    });
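
Judging from the example above, a descriptor combines the model ID and a language code separated by a colon. The sketch below splits such a string, which can be handy for logging or validation. Note that parse_stt_descriptor is a hypothetical helper, and both the model:language layout and the treatment of the language part as optional are assumptions inferred from the example, not a formal specification.

```python
from typing import Optional, Tuple


def parse_stt_descriptor(descriptor: str) -> Tuple[str, Optional[str]]:
    """Split "model:language" into its parts (hypothetical helper).

    The language is None when the descriptor has no ":language" suffix.
    """
    model, sep, language = descriptor.partition(":")
    if not model:
        raise ValueError(f"missing model ID in descriptor: {descriptor!r}")
    return model, (language if sep and language else None)
```

For the descriptor shown above, the helper yields the model ID cartesia/ink-whisper and the language en.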

Plugin

Use the Cartesia plugin to connect directly to Cartesia's STT API with your own API key.

Available in: Python

Installation

Install the plugin from PyPI:

uv add "livekit-agents[cartesia]~=1.4"

Authentication

The Cartesia plugin requires a Cartesia API key.

Set CARTESIA_API_KEY in your .env file.
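
A missing key may not surface until the agent first tries to reach Cartesia, so an early check can make the failure easier to diagnose. This is a minimal sketch using only the standard library. The helper require_env is hypothetical, and it assumes the .env file has already been loaded into the process environment (for example, by a dotenv loader or your process manager).

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail fast (hypothetical helper)."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling require_env("CARTESIA_API_KEY") at startup raises immediately if the variable is absent or empty.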

Usage

Use Cartesia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.

    from livekit.agents import AgentSession
    from livekit.plugins import cartesia

    session = AgentSession(
        stt=cartesia.STT(
            model="ink-whisper",
        ),
        # ... llm, tts, etc.
    )

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: ink-whisper)

Selected model to use for STT. See Cartesia STT models for supported values.

language (string, optional, default: en)

Language of input audio in ISO-639-1 format. See Cartesia STT models for supported values.

Additional resources

The following resources provide more information about using Cartesia with LiveKit Agents.