Sarvam TTS plugin guide

How to use the Sarvam TTS plugin for LiveKit Agents.

Available in: Python

Overview

This plugin allows you to use Sarvam as a TTS provider for your voice agents.

Installation

Install the plugin from PyPI:

uv add "livekit-agents[sarvam]~=1.4"

Authentication

The Sarvam plugin requires a Sarvam API key.

Set SARVAM_API_KEY in your .env file.
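
As a rough sketch, a startup check like the one below can surface a missing key early instead of failing on the first synthesis request. The helper name require_sarvam_api_key is hypothetical and not part of the plugin. It assumes the .env file has already been loaded into the process environment (for example with python-dotenv's load_dotenv()).

```python
import os


def require_sarvam_api_key(env=None):
    """Return the Sarvam API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of the Sarvam plugin.
    `env` defaults to the process environment and can be any mapping.
    """
    env = os.environ if env is None else env
    key = env.get("SARVAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SARVAM_API_KEY is not set; add it to your .env file")
    return key
```

Calling require_sarvam_api_key() once before creating the session keeps the failure close to its cause.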

Usage

Use Sarvam TTS within an AgentSession or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import sarvam

session = AgentSession(
    tts=sarvam.TTS(
        target_language_code="hi-IN",
        model="bulbul:v3-beta",
        speaker="shubh",
        pace=1.0,
    ),
    # ... llm, stt, etc.
)

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

target_language_code (string, required)

BCP-47 language code for one of the supported Indian languages. For example: hi-IN for Hindi, en-IN for Indian English. See the Sarvam documentation for a complete list of supported languages.

model (string, optional, default: bulbul:v2)

The Sarvam TTS model to use. Valid values are:

  • bulbul:v2
  • bulbul:v3-beta

speaker (string, optional, default: varies by model)

Voice to use for synthesis. Default depends on the selected model:

  • anushka for bulbul:v2
  • shubh for bulbul:v3-beta

Speakers are validated for model compatibility.
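
The model-dependent default can be pictured as a simple lookup. This is only an illustrative sketch: DEFAULT_SPEAKERS and default_speaker are hypothetical names, and the values are copied from the list above rather than read from the plugin, which remains the source of truth.

```python
# Defaults as listed on this page; assumed here, not read from the plugin.
DEFAULT_SPEAKERS = {
    "bulbul:v2": "anushka",
    "bulbul:v3-beta": "shubh",
}


def default_speaker(model="bulbul:v2"):
    """Return the documented default speaker for a Sarvam TTS model.

    Hypothetical helper for illustration; raises ValueError for models
    that are not listed on this page.
    """
    try:
        return DEFAULT_SPEAKERS[model]
    except KeyError:
        raise ValueError(f"unknown Sarvam TTS model: {model!r}") from None
```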

pitch (float, optional, default: 0.0)

Voice pitch adjustment. Valid range: -20.0 to 20.0. Included in the synthesis payload for bulbul:v2.

pace (float, optional, default: 1.0)

Speech rate multiplier. Valid range: 0.5 to 2.0.

loudness (float, optional, default: 1.0)

Volume multiplier. Valid range: 0.5 to 2.0. Included in the synthesis payload for bulbul:v2.

enable_preprocessing (boolean, optional, default: false)

Controls whether normalization of English words and numeric entities (for example, numbers and dates) is performed. Set to true for better handling of mixed-language text.

Only supported for the bulbul:v2 model. This value is ignored for bulbul:v3-beta.

speech_sample_rate (int, optional, default: 22050)

Output sample rate in Hz. Supported values: 8000, 16000, 22050, 24000.
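
The ranges documented above can be checked before constructing the TTS instance. The following is a minimal sketch, not the plugin's own validation: check_tts_options, RANGES, and SUPPORTED_SAMPLE_RATES are hypothetical names, and the limits are taken from this page.

```python
# Limits as documented on this page; assumed here, not read from the plugin.
RANGES = {
    "pitch": (-20.0, 20.0),
    "pace": (0.5, 2.0),
    "loudness": (0.5, 2.0),
}
SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 24000)


def check_tts_options(pitch=0.0, pace=1.0, loudness=1.0, speech_sample_rate=22050):
    """Return a list of human-readable problems; an empty list means all values fit."""
    problems = []
    values = {"pitch": pitch, "pace": pace, "loudness": loudness}
    for name, value in values.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low} to {high}")
    if speech_sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(
            f"speech_sample_rate={speech_sample_rate} is not one of {SUPPORTED_SAMPLE_RATES}"
        )
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every out-of-range value at once, which is convenient when the options come from user-editable configuration.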

Additional resources

The following resources provide more information about using Sarvam with LiveKit Agents.