Options specific to bulbul:v2

interface TTSV2Options {
    apiKey?: string;
    baseURL?: string;
    enablePreprocessing?: boolean;
    loudness?: number;
    model?: "bulbul:v2";
    pace?: number;
    pitch?: number;
    sampleRate?: number;
    sentenceTokenizer?: tokenize.SentenceTokenizer;
    speaker?: string;
    streaming?: boolean;
    targetLanguageCode?: string;
}

Hierarchy

  • TTSBaseOptions
    • TTSV2Options

Properties

apiKey?: string

Sarvam API key. Defaults to the SARVAM_API_KEY environment variable
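The fallback described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `resolveApiKey` and the `Env` type are hypothetical names, and the environment is passed in explicitly so the snippet stays self-contained.

```typescript
// Hypothetical helper; not part of the plugin's API.
type Env = Record<string, string | undefined>;

function resolveApiKey(apiKey: string | undefined, env: Env): string {
  // An explicit option wins; otherwise fall back to SARVAM_API_KEY.
  const key = apiKey ?? env["SARVAM_API_KEY"];
  if (!key) {
    throw new Error("Sarvam API key missing: pass apiKey or set SARVAM_API_KEY");
  }
  return key;
}

// Stand-in environments for illustration.
const fromOption = resolveApiKey("explicit-key", { SARVAM_API_KEY: "env-key" });
const fromEnv = resolveApiKey(undefined, { SARVAM_API_KEY: "env-key" });
```

In a Node.js process, `process.env` would typically be passed as the second argument.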

baseURL?: string

Base URL for the Sarvam API

enablePreprocessing?: boolean

Enable text preprocessing (v2 only)

loudness?: number

Loudness, 0.3 to 3.0 (v2 only)

model?: "bulbul:v2"

pace?: number

Speech pace. Range for v2: 0.3 to 3.0; for v3: 0.5 to 2.0. Default: 1.0

pitch?: number

Pitch adjustment, -0.75 to 0.75 (v2 only)
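Since loudness, pace, and pitch each have a documented v2 range, a caller may want to check values before constructing the TTS. The sketch below assumes nothing about how the plugin itself validates input. `V2Prosody`, `V2_RANGES`, and `checkV2Prosody` are hypothetical names, and the ranges are copied from the property descriptions on this page.

```typescript
// Local mirror of the three numeric fields (hypothetical type, for illustration).
interface V2Prosody {
  loudness?: number;
  pace?: number;
  pitch?: number;
}

// Inclusive [min, max] ranges as documented for bulbul:v2.
const V2_RANGES: Record<keyof V2Prosody, [number, number]> = {
  loudness: [0.3, 3.0],
  pace: [0.3, 3.0],
  pitch: [-0.75, 0.75],
};

// Hypothetical helper: returns a list of problems (empty when everything is in range).
function checkV2Prosody(opts: V2Prosody): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(V2_RANGES) as (keyof V2Prosody)[]) {
    const value = opts[name];
    if (value === undefined) continue; // unset fields are skipped
    const [min, max] = V2_RANGES[name];
    if (Number.isNaN(value) || value < min || value > max) {
      problems.push(`${name} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return problems;
}

const ok = checkV2Prosody({ pace: 1.0, pitch: 0.2 });
const bad = checkV2Prosody({ loudness: 5, pitch: -1 });
```

Returning a list rather than throwing on the first failure lets a caller report every out-of-range field at once.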

sampleRate?: number

Output sample rate in Hz (default 24000)

sentenceTokenizer?: tokenize.SentenceTokenizer

Sentence tokenizer for streaming (default: basic sentence tokenizer)

speaker?: string

Speaker voice (v2 voices). Default: 'anushka'

streaming?: boolean

Whether to use native WebSocket streaming for stream(). Set to false to prefer non-streaming REST synthesis (used by Agent via TTS StreamAdapter). Default: true.
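The effect of this flag, including its default, can be pictured with a small sketch. `pickSynthesisPath` and `SynthesisPath` are hypothetical names used only to illustrate the documented behavior; the plugin's internal dispatch may be organized differently.

```typescript
type SynthesisPath = "websocket" | "rest";

// Hypothetical helper: applies the documented default (true) when the flag is unset.
function pickSynthesisPath(opts: { streaming?: boolean }): SynthesisPath {
  const streaming = opts.streaming ?? true;
  return streaming ? "websocket" : "rest";
}

const byDefault = pickSynthesisPath({});
const restOnly = pickSynthesisPath({ streaming: false });
```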

targetLanguageCode?: string

Target language code (BCP-47)
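As an illustration of the expected shape, the sketch below checks that a string looks like a "language-REGION" tag such as "hi-IN". This is a deliberate simplification: full BCP-47 permits scripts, variants, and extensions, and this page does not state which codes the API accepts. The helper name and the example tag are assumptions.

```typescript
// Simplified check: accepts only "language-REGION" tags such as "hi-IN".
// This is not a full BCP-47 parser.
const LANGUAGE_REGION = /^[a-z]{2,3}-[A-Z]{2}$/;

// Hypothetical helper; not part of the plugin's API.
function looksLikeLanguageRegionTag(code: string): boolean {
  return LANGUAGE_REGION.test(code);
}

const hindiIndia = looksLikeLanguageRegionTag("hi-IN");
const bareName = looksLikeLanguageRegionTag("hindi");
```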