
Cartesia STT

Reference for Cartesia STT in LiveKit Inference.

Overview

LiveKit Inference offers transcription powered by Cartesia. Pricing information is available on the pricing page.

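Model name: Ink Whisper
Model ID: cartesia/ink-whisper
Languages: en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, vi, he, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su, yue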
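Because the language list is long, it can help to validate a language code before building an STT configuration. The following is a minimal sketch, not part of the LiveKit API: `SUPPORTED_LANGUAGES` and `is_supported_language` are hypothetical names, and the codes are copied from the Ink Whisper entry above.

```python
# Language codes copied from the Ink Whisper entry in this reference.
# SUPPORTED_LANGUAGES and is_supported_language are hypothetical helpers,
# not part of the LiveKit SDK.
SUPPORTED_LANGUAGES = frozenset(
    """
    en zh de es ru ko fr ja pt tr pl ca nl ar sv it id vi he uk
    el ms cs ro da hu ta no th ur hr bg lt la mi ml cy sk te fa
    lv bn sr az sl kn et mk br eu is hy ne mn bs kk sq sw gl mr
    pa si km sn yo so af oc ka be tg sd gu am yi lo uz fo ht ps
    tk nn mt sa lb my bo tl mg as tt haw ln ha ba jw su yue
    """.split()
)


def is_supported_language(code: str) -> bool:
    """Return True if `code` appears in the Ink Whisper language list."""
    return code.strip().lower() in SUPPORTED_LANGUAGES
```

A check like this only reflects the list as published here; the provider remains the source of truth for what is accepted at runtime.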

Usage

To use Cartesia, pass a descriptor string in the form model:language (for example, cartesia/ink-whisper:en) to the stt argument in your AgentSession:

Python:

    from livekit.agents import AgentSession

    session = AgentSession(
        stt="cartesia/ink-whisper:en",
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession } from '@livekit/agents';

    const session = new AgentSession({
        stt: "cartesia/ink-whisper:en",
        // ... llm, tts, vad, turn_detection, etc.
    });
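When the model or language comes from configuration rather than a literal, the descriptor can be assembled programmatically. This is a minimal sketch that assumes the model:language format shown in the example above; `stt_descriptor` is a hypothetical helper, not a LiveKit function.

```python
def stt_descriptor(model: str, language: str) -> str:
    """Build a "model:language" descriptor string.

    Hypothetical helper: it only mirrors the format used in the example
    above ("cartesia/ink-whisper:en") and is not part of the LiveKit SDK.
    """
    model = model.strip()
    language = language.strip()
    if not model or not language:
        raise ValueError("model and language must both be non-empty")
    if ":" in model or ":" in language:
        # A stray colon would make the descriptor ambiguous.
        raise ValueError("model and language must not contain ':'")
    return f"{model}:{language}"


descriptor = stt_descriptor("cartesia/ink-whisper", "en")
print(descriptor)  # cartesia/ink-whisper:en
```

The resulting string can then be passed as the stt argument in place of the literal.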

Parameters

To customize additional parameters, use the STT class from the inference module:

Python:

    from livekit.agents import AgentSession, inference

    session = AgentSession(
        stt=inference.STT(
            model="cartesia/ink-whisper",
            language="en"
        ),
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession, inference } from '@livekit/agents';

    const session = new AgentSession({
        stt: new inference.STT({
            model: "cartesia/ink-whisper",
            language: "en"
        }),
        // ... llm, tts, vad, turn_detection, etc.
    });

model (string, required)

The ID of the model to use for transcription, for example cartesia/ink-whisper.

language (string, optional)

Language code for the transcription. If not set, the provider default applies.

extra_kwargs (dict, optional)

Additional parameters to pass to the Cartesia STT API, including min_volume and max_silence_duration_secs. See the provider's documentation for more information.
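As an illustration of extra_kwargs, the sketch below builds the dictionary separately and then passes it to inference.STT. The helper name `cartesia_extra_kwargs`, the numeric values, and the range checks (min_volume between 0.0 and 1.0, a positive silence duration) are assumptions made for this example; consult the provider's documentation for the actual accepted ranges.

```python
from typing import Optional


def cartesia_extra_kwargs(
    min_volume: Optional[float] = None,
    max_silence_duration_secs: Optional[float] = None,
) -> dict:
    """Collect only the options that were explicitly set.

    Hypothetical helper. The range checks are assumptions for this
    sketch, not limits confirmed by this reference.
    """
    extra = {}
    if min_volume is not None:
        if not 0.0 <= min_volume <= 1.0:
            raise ValueError("min_volume is assumed to lie in [0.0, 1.0]")
        extra["min_volume"] = float(min_volume)
    if max_silence_duration_secs is not None:
        if max_silence_duration_secs <= 0:
            raise ValueError("max_silence_duration_secs must be positive")
        extra["max_silence_duration_secs"] = float(max_silence_duration_secs)
    return extra


def build_stt():
    """Create the STT instance; requires the livekit-agents package."""
    from livekit.agents import inference  # imported lazily on purpose

    return inference.STT(
        model="cartesia/ink-whisper",
        language="en",
        # Illustrative values only.
        extra_kwargs=cartesia_extra_kwargs(
            min_volume=0.1,
            max_silence_duration_secs=0.5,
        ),
    )
```

Omitted options are left out of the dictionary entirely, so the provider defaults apply to anything not set.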

Additional resources

The following links provide more information about Cartesia in LiveKit Inference.