Overview
Cartesia provides speech recognition through its Ink-Whisper model, which is optimized for real-time transcription in conversational settings. With LiveKit's Cartesia integration and the Agents framework, you can build AI agents that produce high-accuracy transcriptions with very low latency.
Quick reference
This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.
Installation
Install the plugin from PyPI:
pip install "livekit-agents[cartesia]~=1.0"
Authentication
The Cartesia plugin requires a Cartesia API key.
Set CARTESIA_API_KEY in your .env file.
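As a minimal sketch of the step above, the snippet below reads the key from the process environment and fails early if it is missing. The helper name require_cartesia_api_key is hypothetical and not part of the plugin. It also assumes that your .env file has already been loaded into the environment, for example by a dotenv loader or your process manager.

```python
import os


def require_cartesia_api_key(env=None):
    """Return the Cartesia API key, or raise if it is not set.

    Hypothetical helper for illustration only; it is not part of
    livekit-plugins-cartesia.
    """
    # Allow a mapping to be passed in so the check is easy to test;
    # fall back to the real process environment otherwise.
    env = os.environ if env is None else env
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; add it to your .env file."
        )
    return key
```

Calling this once at startup surfaces a missing key immediately, rather than when the first transcription request is made.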
Usage
Use Cartesia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.
from livekit.agents import AgentSession
from livekit.plugins import cartesia

session = AgentSession(
    stt=cartesia.STT(model="ink-whisper"),
    # ... llm, tts, etc.
)
Parameters
This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.
model
Selected model to use for STT. See Cartesia STT models for supported values.
language
Language of input audio in ISO-639-1 format. See Cartesia STT models for supported values.
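The following sketch shows one way to pass both parameters while checking that the language looks like an ISO-639-1 code (two lowercase letters). The helpers build_stt_options and make_stt are hypothetical names, and the "en" default is an assumption made for illustration. The check only validates the shape of the code; it does not confirm that Cartesia supports that language.

```python
import re

# ISO-639-1 codes are two lowercase ASCII letters, for example "en" or "fr".
_ISO_639_1 = re.compile(r"[a-z]{2}")


def build_stt_options(model="ink-whisper", language="en"):
    """Return keyword arguments for the STT constructor.

    Hypothetical helper; the "en" default is an assumption for this sketch.
    """
    if not _ISO_639_1.fullmatch(language):
        raise ValueError(
            f"language must be a two-letter ISO-639-1 code, got {language!r}"
        )
    return {"model": model, "language": language}


def make_stt(**overrides):
    """Create a Cartesia STT instance from the validated options."""
    # Imported lazily so the validation above works without the plugin installed.
    from livekit.plugins import cartesia

    return cartesia.STT(**build_stt_options(**overrides))
```

For example, make_stt(language="fr") would request French transcription with the ink-whisper model, provided that language appears in the Cartesia STT models list.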
Additional resources
The following resources provide more information about using Cartesia with LiveKit Agents.
Python package
The livekit-plugins-cartesia package on PyPI.
Plugin reference
Reference for the Cartesia STT plugin.
GitHub repo
View the source or contribute to the LiveKit Cartesia STT plugin.
Cartesia docs
Cartesia STT docs.
Voice AI quickstart
Get started with LiveKit Agents and Cartesia STT.
Cartesia TTS
Guide to the Cartesia TTS integration with LiveKit Agents.