Overview
Speechmatics provides enterprise-grade speech-to-text APIs. Their advanced speech models deliver highly accurate transcriptions across diverse languages, dialects, and accents. You can use the LiveKit Speechmatics plugin with the Agents framework to build voice AI agents that provide reliable, real-time transcriptions.
Quick reference
This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.
Installation
Install the plugin from PyPI:
pip install "livekit-agents[speechmatics]~=1.0"
Authentication
The Speechmatics plugin requires an API key.
Set SPEECHMATICS_API_KEY in your .env file.
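The following is a minimal sketch of failing fast when the key is missing. It assumes the variable has already been loaded into the process environment, for example by your shell or a dotenv loader. The helper name require_speechmatics_key is hypothetical and is not part of the plugin.

```python
import os


def require_speechmatics_key(env=None):
    """Return the Speechmatics API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of the LiveKit plugin.
    """
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    key = env.get("SPEECHMATICS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SPEECHMATICS_API_KEY is not set; add it to your .env file"
        )
    return key
```

Calling this once at startup surfaces a missing key before the agent tries to open a transcription session.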
Usage
Use Speechmatics STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.
from livekit.agents import AgentSession
from livekit.plugins import speechmatics

session = AgentSession(
    stt=speechmatics.STT(
        transcription_config=speechmatics.types.TranscriptionConfig(
            operating_point="enhanced",
            enable_partials=True,
            language="en",
            output_locale="en-US",
            diarization="speaker",
            enable_entities=True,
            # Custom words, optionally with phonetic hints
            additional_vocab=[
                {"content": "financial crisis"},
                {"content": "gnocchi", "sounds_like": ["nyohki", "nokey", "nochi"]},
                {"content": "CEO", "sounds_like": ["C.E.O."]},
            ],
            max_delay=0.7,
            max_delay_mode="flexible",
        ),
        audio_settings=speechmatics.types.AudioSettings(
            encoding="pcm_s16le",
            sample_rate=16000,
        ),
    ),
    # ... llm, tts, etc.
)
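The additional_vocab entries in the example above are plain dictionaries with a content key and an optional sounds_like list. As a small sketch, a helper like the one below could build that list from your own word list. The name vocab_entry is hypothetical and not part of the plugin; only the dictionary shape is taken from the example.

```python
def vocab_entry(content, sounds_like=None):
    """Build one additional_vocab entry in the shape used in the usage example.

    Hypothetical helper for illustration; not part of the LiveKit plugin.
    """
    entry = {"content": content}
    # Only include "sounds_like" when pronunciation hints were supplied.
    if sounds_like:
        entry["sounds_like"] = list(sounds_like)
    return entry


additional_vocab = [
    vocab_entry("financial crisis"),
    vocab_entry("gnocchi", ["nyohki", "nokey", "nochi"]),
    vocab_entry("CEO", ["C.E.O."]),
]
```

The resulting list can be passed as the additional_vocab value in place of the literal list.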
Parameters
This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.
operating_point: Operating point to use for transcription, chosen according to the accuracy and complexity you need. To learn more, see Accuracy Reference.
enable_partials: Partial transcripts are preliminary transcriptions that are updated as more context becomes available, until the higher-accuracy final transcript is returned. Partials are returned faster but without post-processing such as formatting.
language: ISO 639-1 language code. Each language is global and handles different dialects and accents. For the full list, see Supported Languages.
output_locale: RFC-5646 language code for transcription output. For supported locales, see Output Locale.
diarization: Setting this to speaker labels each detected speaker in the transcribed output, for example S1, S2. For more information, see Speaker Diarization.
additional_vocab: Custom words to add for each transcription job. To learn more, see Custom Dictionary.
enable_entities: Outputs the written form of entities such as phone numbers, email addresses, and currency amounts in the transcript. To learn more about the supported entities, see Entities.
max_delay: The delay, in seconds, between the end of a spoken word and the return of the final transcript results.
max_delay_mode: If set to flexible, the final transcript is delayed until proper numeral formatting is complete. To learn more, see Numeral Formatting.
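To illustrate how partial and final transcripts relate, here is a minimal sketch of a buffer that shows the latest partial until a final replaces it. It assumes each final supersedes the partial pending at that moment. The class and method names are hypothetical and do not reflect the plugin's event API.

```python
class TranscriptBuffer:
    """Keep finalized text plus the most recent partial (illustrative only)."""

    def __init__(self):
        self._finals = []   # finalized segments, in arrival order
        self._partial = ""  # latest preliminary text, if any

    def on_partial(self, text):
        # A newer partial replaces the previous one rather than appending.
        self._partial = text

    def on_final(self, text):
        # The final supersedes whatever partial was pending.
        self._finals.append(text)
        self._partial = ""

    @property
    def text(self):
        parts = self._finals + ([self._partial] if self._partial else [])
        return " ".join(parts)


buf = TranscriptBuffer()
buf.on_partial("good")
buf.on_partial("good morn")
buf.on_final("Good morning.")
buf.on_partial("how are")
```

After these calls, buf.text combines the formatted final segment with the unformatted partial that is still pending.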
Additional resources
The following resources provide more information about using Speechmatics with LiveKit Agents.
Python package
The livekit-plugins-speechmatics package on PyPI.
Plugin reference
Reference for the Speechmatics STT plugin.
GitHub repo
View the source or contribute to the LiveKit Speechmatics STT plugin.
Speechmatics docs
Speechmatics STT docs.
Voice AI quickstart
Get started with LiveKit Agents and Speechmatics STT.