Interface STTOptions

interface STTOptions {
    agentContext?: string;
    agentContextCarryover?: boolean;
    apiKey?: string;
    baseUrl: string;
    bufferSizeMs: number;
    domain?: string;
    encoding: STTEncoding;
    endOfTurnConfidenceThreshold?: number;
    formatTurns?: boolean;
    inactivityTimeout?: number;
    keytermsPrompt?: string[];
    languageDetection?: boolean;
    maxSpeakers?: number;
    maxTurnSilence?: number;
    minTurnSilence?: number;
    mode?: "min_latency" | "balanced" | "max_accuracy";
    previousContextNTurns?: number;
    prompt?: string;
    sampleRate: number;
    speakerLabels?: boolean;
    speechModel: STTModels;
    vadThreshold?: number;
    voiceFocus?: VoiceFocus;
    voiceFocusThreshold?: number;
}

Properties

`Optional` agentContext

agentContext?: string

Only supported with the Universal-3 Pro model family.

`Optional` agentContextCarryover

agentContextCarryover?: boolean

When the model supports it, lets an AgentSession push each assistant reply into agentContext so that it is carried into the model's conversation context. Defaults to false; set to true to enable. Prior user turns are carried automatically by the model regardless of this flag. Ignored on models without context support.

`Optional` apiKey

apiKey?: string

baseUrl

baseUrl: string

bufferSizeMs

bufferSizeMs: number

Duration, in milliseconds, of the audio buffered into each chunk before it is sent to AssemblyAI. Corresponds to buffer_size_seconds in the Python framework; the value is in seconds there and in milliseconds here, following this repo's time-unit convention.
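As a rough illustration of the unit difference, the sketch below converts a Python-style value in seconds to milliseconds and estimates how many samples one chunk holds. The helper names and the sample values (0.05 s, 16000 Hz) are assumptions chosen for the example, not part of this package.

```typescript
// Hypothetical helper: convert a Python-style buffer size (seconds)
// to the millisecond value expected by bufferSizeMs.
function bufferSecondsToMs(seconds: number): number {
  return Math.round(seconds * 1000);
}

// Hypothetical helper: number of samples per channel in one chunk
// of the given duration at the given sample rate.
function samplesPerChunk(sampleRate: number, bufferSizeMs: number): number {
  return Math.floor((sampleRate * bufferSizeMs) / 1000);
}

// Illustrative values only.
const bufferSizeMs = bufferSecondsToMs(0.05);
const samples = samplesPerChunk(16000, bufferSizeMs);
console.log(bufferSizeMs, samples);
```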

`Optional` domain

domain?: string

encoding

encoding: STTEncoding

`Optional` endOfTurnConfidenceThreshold

endOfTurnConfidenceThreshold?: number

`Optional` formatTurns

formatTurns?: boolean

`Optional` inactivityTimeout

inactivityTimeout?: number

Session inactivity timeout in seconds. AssemblyAI accepts integer values from 5 to 3600; when unset, no inactivity timeout is applied.
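A caller may want to check the value before connecting. The following is a minimal sketch of such a check based only on the range stated above; the function name is hypothetical and the package may or may not perform its own validation.

```typescript
// Hypothetical pre-flight check for inactivityTimeout.
// undefined is accepted: when unset, no inactivity timeout is applied.
function isValidInactivityTimeout(value: number | undefined): boolean {
  if (value === undefined) return true;
  // Accepted values are integers from 5 to 3600 seconds, inclusive.
  return Number.isInteger(value) && value >= 5 && value <= 3600;
}

console.log(isValidInactivityTimeout(30));   // within range
console.log(isValidInactivityTimeout(2));    // below the minimum
console.log(isValidInactivityTimeout(12.5)); // not an integer
```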

`Optional` keytermsPrompt

keytermsPrompt?: string[]

`Optional` languageDetection

languageDetection?: boolean

`Optional` maxSpeakers

maxSpeakers?: number

`Optional` maxTurnSilence

maxTurnSilence?: number

Maximum silence (ms) before end-of-turn is forced regardless of confidence.

`Optional` minTurnSilence

minTurnSilence?: number

Minimum silence (ms) before a confident end-of-turn is finalized.
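The interplay between minTurnSilence, maxTurnSilence, and endOfTurnConfidenceThreshold can be pictured with the sketch below. It is only a model of the semantics described in these property notes: the actual decision is made by the service, the function and type names are hypothetical, and the use of inclusive comparisons is an assumption.

```typescript
// Hypothetical mirror of the three turn-detection settings.
interface TurnSilenceSettings {
  endOfTurnConfidenceThreshold: number;
  minTurnSilence: number; // ms
  maxTurnSilence: number; // ms
}

// Illustrative decision rule, not the service's implementation.
function shouldEndTurn(
  silenceMs: number,
  confidence: number,
  settings: TurnSilenceSettings,
): boolean {
  // Enough silence forces end-of-turn regardless of confidence.
  if (silenceMs >= settings.maxTurnSilence) return true;
  // Otherwise a confident end-of-turn needs at least the minimum silence.
  return (
    confidence >= settings.endOfTurnConfidenceThreshold &&
    silenceMs >= settings.minTurnSilence
  );
}

// Placeholder numbers for illustration only.
const exampleSettings: TurnSilenceSettings = {
  endOfTurnConfidenceThreshold: 0.5,
  minTurnSilence: 200,
  maxTurnSilence: 1500,
};
console.log(shouldEndTurn(300, 0.9, exampleSettings));
```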

`Optional` mode

mode?: "min_latency" | "balanced" | "max_accuracy"

Accuracy/latency preset for the Universal-3 Pro model family: min_latency, balanced, or max_accuracy. Explicit turn-silence values still take precedence over mode defaults.
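The precedence rule can be sketched as a simple merge in which explicit values win over the preset. Where this resolution actually happens is not stated here, so treat the code as a mental model only; resolveTurnSilence and the numbers in PLACEHOLDER_DEFAULTS are hypothetical and are not the real preset values.

```typescript
type Mode = "min_latency" | "balanced" | "max_accuracy";

interface TurnSilence {
  minTurnSilence: number; // ms
  maxTurnSilence: number; // ms
}

// Hypothetical merge: explicit turn-silence values override the mode's defaults.
function resolveTurnSilence(
  mode: Mode,
  modeDefaults: Record<Mode, TurnSilence>,
  explicit: Partial<TurnSilence>,
): TurnSilence {
  const base = modeDefaults[mode];
  return {
    minTurnSilence: explicit.minTurnSilence ?? base.minTurnSilence,
    maxTurnSilence: explicit.maxTurnSilence ?? base.maxTurnSilence,
  };
}

// Placeholder numbers for illustration only; NOT the real presets.
const PLACEHOLDER_DEFAULTS: Record<Mode, TurnSilence> = {
  min_latency: { minTurnSilence: 100, maxTurnSilence: 1000 },
  balanced: { minTurnSilence: 200, maxTurnSilence: 1500 },
  max_accuracy: { minTurnSilence: 400, maxTurnSilence: 2500 },
};

// Only maxTurnSilence is set explicitly; minTurnSilence falls back to the preset.
const resolved = resolveTurnSilence("balanced", PLACEHOLDER_DEFAULTS, {
  maxTurnSilence: 2000,
});
console.log(resolved);
```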

`Optional` previousContextNTurns

previousContextNTurns?: number

Only supported with the Universal-3 Pro model family. Set at connection time only.

`Optional` prompt

prompt?: string

Only supported with the Universal-3 Pro model family.

sampleRate

sampleRate: number

`Optional` speakerLabels

speakerLabels?: boolean

Enable speaker diarization. Note: AssemblyAI returns per-word speaker labels, but the JS framework's stt.SpeechData type does not yet expose a speakerId field (unlike the Python framework), so the labels are not currently surfaced on emitted events. Setting this to true still takes effect server-side. Once the base SpeechData interface gains speaker support, #processStreamEvent should also forward data.words[].speaker.
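If per-word speaker labels were available to application code, consecutive words could be grouped into per-speaker segments roughly as follows. The WordWithSpeaker shape is an assumption modeled on the data.words[].speaker reference above; the real payload type may differ, and nothing here is exposed by the package today.

```typescript
// Hypothetical word shape; only the speaker field is implied by the note above.
interface WordWithSpeaker {
  text: string;
  speaker?: string;
}

interface SpeakerSegment {
  speaker: string | undefined;
  text: string;
}

// Merge consecutive words that share a speaker label into one segment.
function groupBySpeaker(words: WordWithSpeaker[]): SpeakerSegment[] {
  const segments: SpeakerSegment[] = [];
  for (const word of words) {
    const last = segments[segments.length - 1];
    if (last !== undefined && last.speaker === word.speaker) {
      last.text += " " + word.text;
    } else {
      segments.push({ speaker: word.speaker, text: word.text });
    }
  }
  return segments;
}

const segments = groupBySpeaker([
  { text: "hello", speaker: "A" },
  { text: "there", speaker: "A" },
  { text: "hi", speaker: "B" },
]);
console.log(segments);
```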

speechModel

speechModel: STTModels

`Optional` vadThreshold

vadThreshold?: number

`Optional` voiceFocus

voiceFocus?: VoiceFocus

Isolate the primary voice and suppress background noise. Connect-time only.

`Optional` voiceFocusThreshold

voiceFocusThreshold?: number

Background audio suppression aggressiveness, from 0.0 to 1.0. Connect-time only.
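Because the value can only be set at connect time, it may be worth checking it before opening a session. The sketch below enforces the documented 0.0 to 1.0 range; the function name is hypothetical and the package may handle out-of-range values differently.

```typescript
// Hypothetical pre-connect check for voiceFocusThreshold.
// Returns the value unchanged when valid, or undefined when unset.
function checkVoiceFocusThreshold(value: number | undefined): number | undefined {
  if (value === undefined) return undefined;
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new RangeError(
      `voiceFocusThreshold must be between 0.0 and 1.0, got ${value}`,
    );
  }
  return value;
}

console.log(checkVoiceFocusThreshold(0.6));
```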
