Overview
Cartesia speech-to-text is available in LiveKit Agents through LiveKit Inference and the Cartesia plugin. Pricing for LiveKit Inference is available on the pricing page.
| Model name | Model ID | Languages |
|---|---|---|
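| Ink Whisper | cartesia/ink-whisper | en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, vi, he, hi, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, fi, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su, yue |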
LiveKit Inference
Use LiveKit Inference to access Cartesia STT without a separate Cartesia API key.
Usage
To use Cartesia, use the STT class from the inference module:
Python:

    from livekit.agents import AgentSession, inference

    session = AgentSession(
        stt=inference.STT(
            model="cartesia/ink-whisper",
            language="en",
        ),
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession, inference } from '@livekit/agents';

    const session = new AgentSession({
      stt: new inference.STT({
        model: "cartesia/ink-whisper",
        language: "en",
      }),
      // ... llm, tts, vad, turn_detection, etc.
    });
Parameters
model (string, required): The model ID to use for STT, such as cartesia/ink-whisper.
language (string, optional): Language code for the transcription. If not set, the provider default applies.
extra_kwargs (dict, optional): Additional parameters to pass to the Cartesia STT API, including min_volume and max_silence_duration_secs. See the provider's documentation for more information.
In Node.js, this parameter is called modelOptions.
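The parameters above can be combined as in the following sketch. It assumes the Python keyword for the additional-parameters dict is `extra_kwargs`, which you should confirm against the current reference. `build_stt_options` and `create_session` are hypothetical helpers, and any values you pass for `min_volume` or `max_silence_duration_secs` are your own choices rather than recommended settings. The LiveKit import is deferred so the option-building logic can be checked without the package installed.

```python
from typing import Any, Dict, Optional


def build_stt_options(
    language: Optional[str] = None,
    min_volume: Optional[float] = None,
    max_silence_duration_secs: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for inference.STT."""
    options: Dict[str, Any] = {"model": "cartesia/ink-whisper"}
    if language is not None:
        options["language"] = language

    # Provider-specific settings go into one dict (assumed keyword: extra_kwargs).
    extra: Dict[str, Any] = {}
    if min_volume is not None:
        extra["min_volume"] = min_volume
    if max_silence_duration_secs is not None:
        extra["max_silence_duration_secs"] = max_silence_duration_secs
    if extra:
        options["extra_kwargs"] = extra
    return options


def create_session(**kwargs: Any):
    """Build an AgentSession; requires livekit-agents to be installed."""
    from livekit.agents import AgentSession, inference

    return AgentSession(stt=inference.STT(**build_stt_options(**kwargs)))
```

Leaving an argument as `None` omits it entirely, so the provider default applies, as described for the language parameter above.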
String descriptors
As a shortcut, you can also pass a model descriptor string directly to the stt argument in your AgentSession:
Python:

    from livekit.agents import AgentSession

    session = AgentSession(
        stt="cartesia/ink-whisper:en",
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession } from '@livekit/agents';

    const session = new AgentSession({
      stt: "cartesia/ink-whisper:en",
      // ... llm, tts, vad, turn_detection, etc.
    });
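The descriptor in these examples has the form model ID, colon, language code. The sketch below builds and splits such strings. Both helper names are hypothetical, and the code only mirrors the format shown here; it is not LiveKit's own parsing logic. It also assumes the language suffix may be omitted, in which case the provider default would apply.

```python
from typing import Optional, Tuple


def make_descriptor(model: str, language: Optional[str] = None) -> str:
    """Join a model ID and an optional language code into one descriptor."""
    return f"{model}:{language}" if language else model


def split_descriptor(descriptor: str) -> Tuple[str, Optional[str]]:
    """Split a descriptor at its first colon into (model, language)."""
    model, sep, language = descriptor.partition(":")
    # No colon, or nothing after it, means no language was specified.
    return model, (language if sep and language else None)


descriptor = make_descriptor("cartesia/ink-whisper", "en")
model, language = split_descriptor(descriptor)
```

Building the string from separate values can be convenient when the language comes from user settings or configuration rather than a literal.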
Plugin
Use the Cartesia plugin to connect directly to Cartesia's STT API with your own API key.
Installation
Install the plugin from PyPI:
uv add "livekit-agents[cartesia]~=1.4"
Authentication
The Cartesia plugin requires a Cartesia API key.
Set CARTESIA_API_KEY in your .env file.
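A missing key is easier to diagnose if it is checked at startup. The following sketch uses a hypothetical helper, `get_cartesia_api_key`, that reads the variable from the process environment and fails early with a clear message. It assumes the .env file has already been loaded into the environment, for example with python-dotenv's `load_dotenv()`.

```python
import os
from typing import Mapping, Optional


def get_cartesia_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return CARTESIA_API_KEY, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; add it to your .env file or environment."
        )
    return key
```

Passing a mapping instead of relying on `os.environ` keeps the check easy to test without touching real credentials.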
Usage
Use Cartesia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.
    from livekit.agents import AgentSession
    from livekit.plugins import cartesia

    session = AgentSession(
        stt=cartesia.STT(model="ink-whisper"),
        # ... llm, tts, etc.
    )
Parameters
This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.
model (string, optional, default: ink-whisper): Selected model to use for STT. See Cartesia STT models for supported values.
language (string, optional, default: en): Language of the input audio in ISO-639-1 format. See Cartesia STT models for supported values.
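As a sketch of how these two parameters might be made configurable, the helper below reads them from environment variables and falls back to the defaults listed above. The variable names `STT_MODEL` and `STT_LANGUAGE` and both function names are hypothetical. The `language` keyword is assumed from the parameter description, so check the plugin reference before relying on it.

```python
import os
from typing import Dict, Mapping, Optional

DEFAULT_MODEL = "ink-whisper"  # default listed in this section
DEFAULT_LANGUAGE = "en"        # default listed in this section


def plugin_stt_options(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Resolve plugin STT options from the environment (hypothetical names)."""
    env = os.environ if env is None else env
    return {
        # Empty or missing values fall back to the documented defaults.
        "model": env.get("STT_MODEL") or DEFAULT_MODEL,
        "language": env.get("STT_LANGUAGE") or DEFAULT_LANGUAGE,
    }


def create_plugin_stt(env: Optional[Mapping[str, str]] = None):
    """Build the plugin STT; requires the Cartesia plugin to be installed."""
    from livekit.plugins import cartesia

    return cartesia.STT(**plugin_stt_options(env))
```

The result of `create_plugin_stt()` can then be passed as the `stt` argument of an `AgentSession`, as in the usage example.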
Additional resources
The following resources provide more information about using Cartesia with LiveKit Agents.
Python package
The livekit-plugins-cartesia package on PyPI.
Plugin reference
Reference for the Cartesia STT plugin.
GitHub repo
View the source or contribute to the LiveKit Cartesia STT plugin.
Cartesia docs
Cartesia STT docs.
Voice AI quickstart
Get started with LiveKit Agents and Cartesia STT.
Cartesia TTS
Guide to the Cartesia TTS plugin with LiveKit Agents.