Rime TTS

How to use Rime TTS with LiveKit Agents.

Overview

Rime text-to-speech is available in LiveKit Agents through LiveKit Inference and the Rime plugin. Pricing for LiveKit Inference is available on the pricing page.

Model ID       Languages
rime/arcana    en, es, fr, de, ar, he, hi, ja, pt
rime/mistv2    en, es, fr, de
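
As a quick sanity check before configuring a session, the model list above can be encoded as a lookup. This is only an illustrative sketch: `SUPPORTED_LANGUAGES` and `supports_language` are hypothetical names, not part of LiveKit Agents, and the data simply mirrors the list on this page.

```python
# Hypothetical helper (not a LiveKit API): mirrors the model/language list above.
SUPPORTED_LANGUAGES = {
    "rime/arcana": {"en", "es", "fr", "de", "ar", "he", "hi", "ja", "pt"},
    "rime/mistv2": {"en", "es", "fr", "de"},
}


def supports_language(model: str, language: str) -> bool:
    """Return True if the list above includes `language` for `model`."""
    return language.lower() in SUPPORTED_LANGUAGES.get(model, set())
```

For example, `supports_language("rime/mistv2", "ja")` is false because Japanese is listed only for `rime/arcana`.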

LiveKit Inference

Use LiveKit Inference to access Rime TTS without a separate Rime API key.

Usage

The simplest way to use Rime TTS is to pass it to the tts argument in your AgentSession, including the model and voice to use:

Python:

from livekit.agents import AgentSession

session = AgentSession(
    tts="rime/arcana:celeste",
    # ... llm, stt, vad, turn_detection, etc.
)

Node.js:

import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  tts: "rime/arcana:celeste",
  // ... llm, stt, vad, turn_detection, etc.
});
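
The descriptor string in the examples above packs the model ID and the voice into one value, separated by a colon. The sketch below shows how such a string maps onto the separate model and voice settings. `TTSDescriptor` and `split_tts_descriptor` are hypothetical names used for illustration; this is not how LiveKit Agents parses the string internally, and whether a descriptor without a voice is accepted is not stated on this page.

```python
from typing import NamedTuple, Optional


class TTSDescriptor(NamedTuple):
    model: str
    voice: Optional[str]


def split_tts_descriptor(descriptor: str) -> TTSDescriptor:
    """Split a "provider/model:voice" string into its model and voice parts."""
    model, sep, voice = descriptor.partition(":")
    if not model:
        raise ValueError("descriptor must start with a model ID")
    # A missing or empty voice part is reported as None (illustrative choice).
    return TTSDescriptor(model=model, voice=voice if sep and voice else None)


print(split_tts_descriptor("rime/arcana:celeste"))
# TTSDescriptor(model='rime/arcana', voice='celeste')
```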

Parameters

To customize additional parameters, use the TTS class from the inference module:

Python:

from livekit.agents import AgentSession, inference

session = AgentSession(
    tts=inference.TTS(
        model="rime/arcana",
        voice="celeste",
        language="en",
    ),
    # ... llm, stt, vad, turn_detection, etc.
)

Node.js:

import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  tts: new inference.TTS({
    model: "rime/arcana",
    voice: "celeste",
    language: "en",
  }),
  // ... llm, stt, vad, turn_detection, etc.
});

model (string, required)

The model ID from the models list.

voice (string, required)

See voices for guidance on selecting a voice.

language (string, optional)

Two-letter language code for the input text. Note that the Rime API uses three-letter abbreviations (e.g. eng for English), but LiveKit Inference uses two-letter codes instead for consistency with other providers.

extra_kwargs (dict, optional)

Additional parameters to pass to the Rime TTS API. See the provider's documentation for more information.

In Node.js this parameter is called modelOptions.
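
To make the difference between the two-letter and three-letter language codes concrete, here is a small sketch that maps the two-letter codes listed on this page to their ISO 639-3 equivalents. LiveKit Inference does not require you to do this conversion yourself. Only `eng` is confirmed by the text above; whether Rime uses exactly these abbreviations for the other languages is an assumption to check against Rime's documentation. `ISO_639_1_TO_3` and `to_three_letter` are hypothetical names.

```python
# ISO 639-1 (two-letter) to ISO 639-3 (three-letter) codes.
# Assumption: Rime's own abbreviations may differ from ISO 639-3
# for some languages; only "eng" is confirmed on this page.
ISO_639_1_TO_3 = {
    "en": "eng",
    "es": "spa",
    "fr": "fra",
    "de": "deu",
    "ar": "ara",
    "he": "heb",
    "hi": "hin",
    "ja": "jpn",
    "pt": "por",
}


def to_three_letter(code: str) -> str:
    """Look up the ISO 639-3 code for a two-letter language code."""
    try:
        return ISO_639_1_TO_3[code.lower()]
    except KeyError:
        raise ValueError(f"unsupported language code: {code!r}") from None
```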

Voices

LiveKit Inference supports all of the voices available in the Rime API. The Rime voices documentation lists the default voices and the wider set available through the API. To use a voice, copy its name into your LiveKit agent session.

The following is a small sample of the Rime voices available in LiveKit Inference.

Astra (🇺🇸): Chipper, upbeat American female
Celeste (🇺🇸): Chill Gen-Z American female
Luna (🇺🇸): Chill but excitable American female
Ursa (🇺🇸): Young, emo American male

Customizing pronunciation

Rime TTS supports customizing pronunciation. To learn more, see the Custom Pronunciation guide.

Plugin

Use the Rime plugin to connect directly to Rime's TTS API with your own API key.

Available in: Python | Node.js

Installation

Install the plugin:

Python:

uv add "livekit-agents[rime]~=1.4"

Node.js:

pnpm add @livekit/agents-plugin-rime@1.x

Authentication

The Rime plugin requires a Rime API key.

Set RIME_API_KEY in your .env file.
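
It can help to fail early with a clear message when the key is missing. The following is a minimal sketch using only the standard library. It assumes the variable has already been loaded into the process environment (for example by a dotenv loader or your shell); `resolve_rime_api_key` is a hypothetical helper, not part of the plugin.

```python
import os
from typing import Mapping, Optional


def resolve_rime_api_key(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return an explicitly passed key, else RIME_API_KEY from the environment."""
    key = (explicit or env.get("RIME_API_KEY", "")).strip()
    if not key:
        raise RuntimeError("Set RIME_API_KEY or pass api_key explicitly")
    return key
```

The returned value could then be passed as the plugin's `api_key` parameter, which is otherwise read from the environment variable.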

Usage

Use Rime TTS within an AgentSession or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.

Python:

from livekit.agents import AgentSession
from livekit.plugins import rime

session = AgentSession(
    tts=rime.TTS(
        model="arcana",
        speaker="celeste",
        speed_alpha=0.9,
    ),
    # ... llm, stt, etc.
)

Node.js:

import { voice } from '@livekit/agents';
import * as rime from '@livekit/agents-plugin-rime';

const session = new voice.AgentSession({
  tts: new rime.TTS({
    modelId: "arcana",
    speaker: "celeste",
    speedAlpha: 0.9,
  }),
  // ... llm, stt, etc.
});

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

model (string, optional, default: arcana)

ID of the model to use. To learn more, see Models.

speaker (string, optional, default: celeste)

ID of the voice to use for speech generation. To learn more, see Voices.

audio_format (TTSEncoding, optional, default: pcm)

Audio format to use. Valid values are: pcm and mp3.

sample_rate (integer, optional, default: 16000)

Sample rate of the generated audio. Set this rate to best match your application needs. To learn more, see Recommendations for reducing response time.

speed_alpha (float, optional, default: 1.0)

Adjusts the speed of speech. Lower than 1.0 results in faster speech; higher than 1.0 results in slower speech.

reduce_latency (boolean, optional, default: false)

When set to true, turns off text normalization to reduce the amount of time spent preparing input text for TTS inference. This might result in the mispronunciation of digits and abbreviations. To learn more, see Recommendations for reducing response time.

phonemize_between_brackets (boolean, optional, default: false)

When set to true, allows the use of custom pronunciation strings in text. To learn more, see Custom pronunciation.

api_key (string, optional, env: RIME_API_KEY)

Rime API Key. Required if the environment variable isn't set.
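
One way to keep these options in a single place is a small settings object that carries the documented defaults and rejects obviously invalid values before they reach the plugin. This is a sketch under stated assumptions: `RimeSettings` is a hypothetical class, the field names and defaults are copied from the list above, and the positivity checks on `sample_rate` and `speed_alpha` are sanity assumptions rather than documented limits.

```python
from dataclasses import asdict, dataclass

VALID_AUDIO_FORMATS = ("pcm", "mp3")  # values listed for audio_format above


@dataclass(frozen=True)
class RimeSettings:
    """Hypothetical container mirroring the defaults documented above."""

    model: str = "arcana"
    speaker: str = "celeste"
    audio_format: str = "pcm"
    sample_rate: int = 16000
    speed_alpha: float = 1.0
    reduce_latency: bool = False
    phonemize_between_brackets: bool = False

    def __post_init__(self) -> None:
        if self.audio_format not in VALID_AUDIO_FORMATS:
            raise ValueError(f"audio_format must be one of {VALID_AUDIO_FORMATS}")
        # Assumed sanity checks; the actual accepted ranges are not documented here.
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be positive")
        if self.speed_alpha <= 0:
            raise ValueError("speed_alpha must be positive")

    def to_kwargs(self) -> dict:
        """Return the settings as keyword arguments."""
        return asdict(self)


settings = RimeSettings(speed_alpha=0.9)
# Assumed usage with the plugin: rime.TTS(**settings.to_kwargs())
```

Because the field names match the parameter names listed above, the resulting dictionary is intended to be unpacked into `rime.TTS(...)`; check the plugin reference to confirm the accepted types, in particular for `audio_format`.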

Additional resources

The following resources provide more information about using Rime with LiveKit Agents.