Overview
This plugin lets you use Soniox as a speech-to-text (STT) provider for your voice agents.
Installation
Install the plugin from PyPI:
uv add "livekit-agents[soniox]~=1.4"
Authentication
The Soniox plugin requires an API key from the Soniox console.
Set SONIOX_API_KEY in your .env file.
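A missing key is easier to diagnose when the agent fails at startup rather than on the first transcription request. The following is a minimal sketch of such a check, not part of the plugin: `require_env` is a hypothetical helper, and it assumes the variable has already been loaded into the process environment (for example by a dotenv loader or by your shell).

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; it is not provided by the plugin.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

Calling `require_env("SONIOX_API_KEY")` at the top of your entrypoint raises a descriptive `RuntimeError` if the key is absent or empty.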
Usage
Use Soniox STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.
Set STT options for Soniox using the params argument:
    from livekit.agents import AgentSession
    from livekit.plugins import soniox

    session = AgentSession(
        stt=soniox.STT(
            params=soniox.STTOptions(
                model="stt-rt-v3",
                language_hints=["en"],
            )
        ),
        # ... llm, tts, etc.
    )
Parameters
The soniox.STT constructor takes an STTOptions object as the params argument. This section describes some of the available options. See the STTOptions reference for a complete list.
model (string, optional, default: stt-rt-v3): The Soniox STT model to use. See the Soniox documentation for a complete list of supported models.
context (string, optional, default: None): Free-form text that provides additional context or vocabulary to bias transcription towards domain-specific terms.
enable_language_identification (boolean, optional, default: true): When true, Soniox attempts to identify the language of the input audio.
enable_speaker_diarization (boolean, optional, default: false): Set to True to enable speaker diarization.
Speaker diarization
You can enable speaker diarization so the STT assigns a speaker identifier to each word or segment. When enabled, each token includes a speaker field and the STT reports capabilities.diarization=True.
The following example enables speaker diarization:
    from livekit.agents import AgentSession
    from livekit.plugins import soniox

    session = AgentSession(
        stt=soniox.STT(
            params=soniox.STTOptions(
                model="stt-rt-v3",
                language_hints=["en"],
                enable_speaker_diarization=True,
            )
        ),
        # ... llm, tts, etc.
    )
You can use MultiSpeakerAdapter to detect the primary speaker and format the transcripts by speaker. To learn more, see Speaker diarization and primary speaker detection.
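To illustrate what per-token speaker labels make possible, here is a minimal standalone sketch that merges consecutive tokens from the same speaker into transcript lines. It does not use the plugin or MultiSpeakerAdapter and makes no claim about how either works internally: the `Token` dataclass and `format_by_speaker` function are hypothetical names, and the sample tokens are made up for the example.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Token:
    # Hypothetical stand-in for a diarized token; not the plugin's actual type.
    text: str
    speaker: str


def format_by_speaker(tokens: list[Token]) -> list[str]:
    """Merge consecutive tokens from the same speaker into one transcript line."""
    lines = []
    # groupby only groups adjacent items, so each speaker turn stays separate.
    for speaker, group in groupby(tokens, key=lambda t: t.speaker):
        text = "".join(t.text for t in group).strip()
        lines.append(f"Speaker {speaker}: {text}")
    return lines


# Made-up sample data for illustration.
tokens = [
    Token("Hello", "1"),
    Token(" there.", "1"),
    Token(" Hi,", "2"),
    Token(" how", "2"),
    Token(" are you?", "2"),
    Token(" Good.", "1"),
]

for line in format_by_speaker(tokens):
    print(line)
```

Because grouping is by adjacent tokens only, a speaker who talks twice produces two separate lines, which preserves the order of the conversation.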
Additional resources
The following resources provide more information about using Soniox with LiveKit Agents.