SpeechData contains metadata about this SpeechEvent.

interface SpeechData {
    confidence: number;
    endTime: number;
    language: LanguageCode;
    metadata?: Record<string, unknown>;
    sourceLanguages?: LanguageCode[];
    speakerId?: null | string;
    startTime: number;
    text: string;
    words?: TimedString[];
}

Properties

confidence: number

Confidence score of the transcription, from 0 to 1.
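
A common use of the confidence score is to discard low-confidence results. The following is a minimal sketch, not part of the library: `ScoredSpeech`, `filterByConfidence`, and the default threshold of 0.6 are hypothetical names and values chosen for illustration.

```typescript
// Structural subset of SpeechData used by this sketch (assumption: only these
// two fields are needed). A full SpeechData object also satisfies it.
interface ScoredSpeech {
  confidence: number;
  text: string;
}

// Hypothetical helper: keep only the results at or above a threshold.
// The 0.6 default is an arbitrary illustrative value, not a library default.
function filterByConfidence<T extends ScoredSpeech>(
  items: T[],
  minConfidence = 0.6,
): T[] {
  return items.filter((item) => item.confidence >= minConfidence);
}

const alternatives: ScoredSpeech[] = [
  { text: "turn on the lights", confidence: 0.92 },
  { text: "turn on the flights", confidence: 0.41 },
];

const kept = filterByConfidence(alternatives);
console.log(kept.map((item) => item.text));
```

A suitable threshold depends on the provider, since confidence scores are not necessarily calibrated the same way across STT services.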

endTime: number

End time of the speech segment in seconds.

language: LanguageCode

Language code of the speech.

metadata?: Record<string, unknown>

Optional plugin-specific metadata (e.g. voice profile, provider diagnostics).

Plugins may populate this with provider-specific data that doesn't map to standard fields.
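
Because metadata values are typed as unknown, they need to be narrowed before use. A minimal sketch of one way to do that follows. The helper `readStringMetadata` and the keys `voiceProfile` and `latencyMs` are hypothetical; actual keys depend on the plugin.

```typescript
// Structural subset of SpeechData used by this sketch.
interface WithMetadata {
  metadata?: Record<string, unknown>;
}

// Hypothetical helper: return a metadata value only if it is a string.
// Returns undefined when metadata is absent, the key is missing,
// or the value has a different type.
function readStringMetadata(
  data: WithMetadata,
  key: string,
): string | undefined {
  const value = data.metadata?.[key];
  return typeof value === "string" ? value : undefined;
}

// The keys below are made up for illustration.
const sample: WithMetadata = {
  metadata: { voiceProfile: "narrator", latencyMs: 120 },
};

const profile = readStringMetadata(sample, "voiceProfile"); // a string
const latency = readStringMetadata(sample, "latencyMs"); // not a string, so undefined
const missing = readStringMetadata({}, "voiceProfile"); // no metadata, so undefined
console.log(profile, latency, missing);
```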

sourceLanguages?: LanguageCode[]

The source languages spoken by the user.

Populated by two kinds of STT services. For services that support translation, language holds the target language and sourceLanguages holds the original spoken language(s). For multi-language detection services, language holds the dominant language and sourceLanguages holds all detected languages, sorted by prevalence.

May contain multiple entries when a single utterance spans multiple source languages.
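
In both cases described above, sourceLanguages (when present) lists what the user actually spoke, so a consumer can fall back to language when it is absent. A minimal sketch, assuming `LanguageCode` is string-like; the alias below is a stand-in for the library's type, and `LanguageFields` and `spokenLanguages` are hypothetical names.

```typescript
// Stand-in for the library's LanguageCode type (assumption: a string such as "en").
type LanguageCode = string;

// Structural subset of SpeechData used by this sketch.
interface LanguageFields {
  language: LanguageCode;
  sourceLanguages?: LanguageCode[];
}

// Hypothetical helper: the languages the user spoke.
// Prefers sourceLanguages when populated, otherwise falls back to language.
function spokenLanguages(data: LanguageFields): LanguageCode[] {
  const sources = data.sourceLanguages;
  return sources !== undefined && sources.length > 0
    ? [...sources]
    : [data.language];
}

// Translation case: spoken Spanish, transcribed into English.
const translated: LanguageFields = { language: "en", sourceLanguages: ["es"] };
// Multi-language detection case: English dominant, French also detected.
const mixed: LanguageFields = { language: "en", sourceLanguages: ["en", "fr"] };
// Service that does not populate sourceLanguages.
const plain: LanguageFields = { language: "de" };

console.log(spokenLanguages(translated), spokenLanguages(mixed), spokenLanguages(plain));
```

Note that the two cases cannot be told apart from these fields alone: whether language is a translation target or the dominant detected language depends on the service in use.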

speakerId?: null | string

Speaker identifier when the provider supports diarization.
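
Since speakerId may be a string, null, or absent, code that groups segments by speaker has to handle all three. A minimal sketch follows; `SpeakerSegment`, `groupBySpeaker`, and the `"unknown"` bucket name are hypothetical, and the speaker IDs are made-up values.

```typescript
// Structural subset of SpeechData used by this sketch.
interface SpeakerSegment {
  speakerId?: null | string;
  text: string;
}

// Hypothetical helper: collect transcribed text per speaker.
// Segments with a null or missing speakerId share one fallback bucket.
function groupBySpeaker(
  segments: SpeakerSegment[],
  unknownKey = "unknown",
): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const segment of segments) {
    const key = segment.speakerId ?? unknownKey;
    const texts = groups.get(key) ?? [];
    texts.push(segment.text);
    groups.set(key, texts);
  }
  return groups;
}

const grouped = groupBySpeaker([
  { speakerId: "S1", text: "Hello." },
  { speakerId: "S2", text: "Hi there." },
  { speakerId: "S1", text: "How are you?" },
  { speakerId: null, text: "(crosstalk)" },
  { text: "(no diarization)" },
]);

console.log([...grouped.entries()]);
```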

startTime: number

Start time of the speech segment in seconds.
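
With both startTime and endTime expressed in seconds, the length of a segment is their difference. A minimal sketch; `TimedSegment` and `segmentDuration` are hypothetical names, and clamping to zero is a defensive choice of this example rather than documented library behavior.

```typescript
// Structural subset of SpeechData used by this sketch.
interface TimedSegment {
  startTime: number;
  endTime: number;
}

// Hypothetical helper: segment length in seconds.
// Clamped at zero in case a provider reports endTime before startTime.
function segmentDuration(segment: TimedSegment): number {
  return Math.max(0, segment.endTime - segment.startTime);
}

const duration = segmentDuration({ startTime: 1.25, endTime: 3.5 });
console.log(duration); // 2.25
```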

text: string

Transcribed text.

words?: TimedString[]

Word-level timing information.