Soniox STT plugin guide | LiveKit Documentation

Available in Python

Overview

This plugin allows you to use Soniox as an STT provider for your voice agents.

Installation

Install the plugin from PyPI:

uv add "livekit-agents[soniox]~=1.5"

Authentication

The Soniox plugin requires an API key from the Soniox console.

Set SONIOX_API_KEY in your .env file.
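
A startup check can make a missing key fail with a clear message. The following is a minimal sketch using only the standard library. The require_env helper is hypothetical and not part of the plugin, and it assumes your .env file has already been loaded into the process environment:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Hypothetical error message; adjust to your project's conventions.
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling require_env("SONIOX_API_KEY") before creating the session surfaces a configuration problem at startup.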

Usage

Use Soniox STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.

Set STT options for Soniox using the params argument:

from livekit.agents import AgentSession
from livekit.plugins import soniox

session = AgentSession(
   stt=soniox.STT(
      params=soniox.STTOptions(
         model="stt-rt-v4",
         language_hints=["en"],
      )
   ),
   # ... llm, tts, etc.
)

Speaker diarization

You can enable speaker diarization so the STT assigns a speaker identifier to each word or segment. When enabled, each token includes a speaker field and the STT reports capabilities.diarization=True.

The following example enables speaker diarization:

from livekit.agents import AgentSession
from livekit.plugins import soniox

session = AgentSession(
   stt=soniox.STT(
      params=soniox.STTOptions(
         model="stt-rt-v4",
         language_hints=["en"],
         enable_speaker_diarization=True,
      )
   ),
   # ... llm, tts, etc.
)

You can use MultiSpeakerAdapter to detect the primary speaker and format the transcripts by speaker. To learn more, see Speaker diarization and primary speaker detection.
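
To illustrate what formatting a transcript by speaker involves, here is a minimal standard-library sketch. It does not use MultiSpeakerAdapter or any plugin type: the Token class is a hypothetical stand-in that carries only a text and a speaker field, and the "Speaker N:" label format is an assumption chosen for the example.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Token:
    """Hypothetical stand-in for a diarized token; not a plugin type."""
    text: str
    speaker: str


def format_by_speaker(tokens: list[Token]) -> list[str]:
    """Merge consecutive tokens from the same speaker into one labeled line."""
    lines = []
    # groupby only merges adjacent items, so speaker turns stay in spoken order.
    for speaker, group in groupby(tokens, key=lambda t: t.speaker):
        words = " ".join(t.text for t in group)
        lines.append(f"Speaker {speaker}: {words}")
    return lines


tokens = [
    Token("Hello", "1"),
    Token("there", "1"),
    Token("Hi", "2"),
    Token("again", "1"),
]
transcript = format_by_speaker(tokens)
```

Because only adjacent tokens are merged, a speaker who talks, is interrupted, and talks again appears as two separate turns.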

Realtime translation

To use realtime translation, pass a TranslationConfig to STTOptions. Soniox supports two translation modes: one-way and two-way.

One-way translation

To translate from any detected language into a single target language, set type to "one_way" and specify the target_language. For example, to translate any spoken language into English:

from livekit.agents import AgentSession
from livekit.plugins import soniox

session = AgentSession(
   stt=soniox.STT(
      params=soniox.STTOptions(
         model="stt-rt-v4",
         translation=soniox.TranslationConfig(
            type="one_way",
            target_language="en",
         ),
      )
   ),
   # ... llm, tts, etc.
)

Two-way translation

To translate back and forth between two languages, set type to "two_way" and specify language_a and language_b. For example, to translate between English and Spanish:

from livekit.agents import AgentSession
from livekit.plugins import soniox

session = AgentSession(
   stt=soniox.STT(
      params=soniox.STTOptions(
         model="stt-rt-v4",
         translation=soniox.TranslationConfig(
            type="two_way",
            language_a="en",
            language_b="es",
         ),
      )
   ),
   # ... llm, tts, etc.
)

When translation is active, the first SpeechData in the alternatives list of each SpeechEvent (alternatives[0]) contains the translated text in the text field. The original spoken language and transcription are available in the source_languages and source_texts fields.
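
The field layout described above can be sketched as follows. The SpeechData and SpeechEvent classes here are simplified stand-ins that carry only the fields named in this section, not the real LiveKit classes. Treating source_languages and source_texts as parallel lists of strings is an assumption, and describe_translation is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechData:
    """Stand-in with only the fields named above; the real class has more."""
    text: str
    source_languages: list[str] = field(default_factory=list)
    source_texts: list[str] = field(default_factory=list)


@dataclass
class SpeechEvent:
    """Stand-in event holding a list of alternatives."""
    alternatives: list[SpeechData]


def describe_translation(event: SpeechEvent) -> str:
    """Summarize the translated text alongside the original transcription."""
    if not event.alternatives:
        return ""
    best = event.alternatives[0]  # translated text lives on the first alternative
    originals = ", ".join(
        f"[{lang}] {text}"
        for lang, text in zip(best.source_languages, best.source_texts)
    )
    return f"{best.text} (from: {originals})" if originals else best.text


event = SpeechEvent(
    alternatives=[
        SpeechData(
            text="Good morning",
            source_languages=["es"],
            source_texts=["Buenos días"],
        )
    ]
)
summary = describe_translation(event)
```

In an agent, the same access pattern would apply to the events emitted by the STT, subject to the actual field types in the plugin reference.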

Parameters

The soniox.STT constructor takes an STTOptions object as the params argument. This section describes some of the available options. See the STTOptions reference for a complete list.

model (string). Default: stt-rt-v4

The Soniox STT model to use. See the Soniox documentation for a complete list of supported models.

context (string). Default: None

Free-form text that provides additional context or vocabulary to bias transcription towards domain-specific terms.
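
Since context is free-form text, one option is to assemble it from a short description and a list of domain terms. This is a minimal sketch: the build_context helper and the "Terms:" layout are assumptions made for illustration, not a format that Soniox prescribes.

```python
def build_context(terms: list[str], description: str = "") -> str:
    """Join an optional description and a de-duplicated term list into free-form text."""
    # dict.fromkeys drops duplicates while preserving the original order.
    unique_terms = list(dict.fromkeys(t.strip() for t in terms if t.strip()))
    parts = []
    if description.strip():
        parts.append(description.strip())
    if unique_terms:
        parts.append("Terms: " + ", ".join(unique_terms))
    return "\n".join(parts)


context = build_context(
    ["LiveKit", "Soniox", "diarization", "LiveKit"],
    description="Support call about a voice agent deployment.",
)
# Pass the result as STTOptions(context=context) when configuring the plugin.
```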

enable_language_identification (boolean). Default: true

When true, Soniox attempts to identify the language of the input audio.

enable_speaker_diarization (boolean). Default: false

Set to True to enable speaker diarization.

translation (TranslationConfig). Default: None

Enables realtime translation. See the Realtime translation section for details and examples.

Additional resources

The following resources provide more information about using Soniox with LiveKit Agents.

Python package

The livekit-plugins-soniox package on PyPI.

Plugin reference

Reference for the Soniox STT plugin.

GitHub repo

View the source or contribute to the LiveKit Soniox STT plugin.

Soniox docs

Soniox's full docs site.

Voice AI quickstart

Get started with LiveKit Agents and Soniox.