Gladia integration guide

How to use the Gladia STT plugin for LiveKit Agents.

Overview

Gladia provides accurate speech recognition optimized for enterprise use cases. You can use the open source Gladia integration for LiveKit Agents to build voice AI with fast, accurate transcription and optional translation features.

Quick reference

This section provides a brief overview of the Gladia STT plugin. For more information, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[gladia]~=1.0"

Authentication

The Gladia plugin requires a Gladia API key.

Set GLADIA_API_KEY in your .env file.
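
A missing key is easier to diagnose at startup than on the first transcription request. The following is a minimal sketch, not part of the plugin: require_gladia_api_key is a hypothetical helper, and it assumes the .env file has already been loaded into the process environment (for example by your shell or a dotenv loader).

```python
import os


def require_gladia_api_key(env=None):
    """Return the Gladia API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict in tests.
    """
    env = os.environ if env is None else env
    key = env.get("GLADIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GLADIA_API_KEY is not set; add it to your .env file")
    return key
```

Calling this helper once before creating the session turns a silent misconfiguration into an immediate, readable error.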

Initialization

Use Gladia STT in an AgentSession or as a standalone transcription service. For example, you can use this STT in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import gladia

session = AgentSession(
    stt=gladia.STT(),
    # ... llm, tts, etc.
)

Realtime translation

To use realtime translation, set translation_enabled to True, list the expected audio languages in languages, and list the desired target languages in translation_target_languages.

For example, to transcribe and translate a mixed English and French audio stream into English, set the following options:

gladia.STT(
    translation_enabled=True,
    languages=["en", "fr"],
    translation_target_languages=["en"],
)

Note that if you specify more than one target language, the plugin emits a separate transcription event for each. When used in an AgentSession, this adds each transcription to the conversation history, in order, which might confuse the LLM.
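
One way to catch the multiple-target case early is to build the translation options through a small helper that warns about it. This is only a sketch: translation_options is a hypothetical function, not part of the plugin, and it returns a plain dict using the three option names described above.

```python
import warnings


def translation_options(languages, targets):
    """Build keyword arguments for gladia.STT with translation enabled.

    Hypothetical helper for illustration only.
    """
    if not targets:
        raise ValueError("translation needs at least one target language")
    if len(targets) > 1:
        # Each target language yields its own transcription event.
        warnings.warn(
            "multiple target languages produce one transcription event each",
            stacklevel=2,
        )
    return {
        "translation_enabled": True,
        "languages": list(languages),
        "translation_target_languages": list(targets),
    }


options = translation_options(["en", "fr"], ["en"])
# With the plugin installed: stt = gladia.STT(**options)
```

Keeping a single target language in an AgentSession avoids duplicate entries in the conversation history.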

Updating options

Use the update_options method to change STT options at runtime:

gladia_stt = gladia.STT()
gladia_stt.update_options(
    languages=["ja", "en"],
    translation_enabled=True,
    translation_target_languages=["fr"],
)
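
If several parts of an agent need to change the translation target, it can help to wrap the call in one function. The sketch below assumes only what the example above shows, namely that update_options accepts these three keyword arguments; switch_translation is a hypothetical name, and stt is any object with a compatible update_options method.

```python
def switch_translation(stt, source_languages, target_language):
    """Point an existing STT instance at a new translation target.

    Hypothetical wrapper: forwards the same keyword arguments that the
    update_options example above uses.
    """
    stt.update_options(
        languages=list(source_languages),
        translation_enabled=True,
        translation_target_languages=[target_language],
    )


# With the plugin installed:
# switch_translation(gladia_stt, ["ja", "en"], "fr")
```

Because the wrapper always passes a single target language, it also sidesteps the one-event-per-target behavior of multi-target translation.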

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

languages (list[string], optional). Default: []

List of languages to use for transcription. If empty, Gladia will auto-detect the language.

code_switching (bool, optional). Default: false

Enable switching between languages during recognition.

energy_filter (bool, optional). Default: true

Enable voice activity detection with energy filtering.

translation_enabled (bool, optional). Default: false

Enable real-time translation.

translation_target_languages (list[string], optional). Default: []

List of target languages for translation.
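
The parameters and defaults listed above can be gathered in one typed object so that a configuration is easy to review and reuse. This is a sketch under assumptions: GladiaOptions is a hypothetical dataclass, not part of the plugin, and it assumes that passing a default value explicitly behaves the same as omitting it.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class GladiaOptions:
    """Hypothetical container whose defaults mirror the list above."""

    languages: list[str] = field(default_factory=list)  # empty means auto-detect
    code_switching: bool = False
    energy_filter: bool = True
    translation_enabled: bool = False
    translation_target_languages: list[str] = field(default_factory=list)

    def to_kwargs(self) -> dict:
        """Return the options as keyword arguments for gladia.STT."""
        return asdict(self)


# Mixed English and French audio with language switching, no translation.
bilingual = GladiaOptions(languages=["en", "fr"], code_switching=True)
kwargs = bilingual.to_kwargs()
# With the plugin installed: stt = gladia.STT(**kwargs)
```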

Additional resources

The following resources provide more information about using Gladia with LiveKit Agents.