Overview
LiveKit Inference offers transcription powered by AssemblyAI. Pricing information is available on the pricing page.
| Model name | Model ID | Languages |
|---|---|---|
| Universal-Streaming | assemblyai/universal-streaming | en, en-US |
| Universal-Streaming-Multilingual | assemblyai/universal-streaming-multilingual | en, en-US, en-GB, en-AU, en-CA, en-IN, en-NZ, es, es-ES, es-MX, es-AR, es-CO, es-CL, es-PE, es-VE, es-EC, es-GT, es-CU, es-BO, es-DO, es-HN, es-PY, es-SV, es-NI, es-CR, es-PA, es-UY, es-PR, fr, fr-FR, fr-CA, fr-BE, fr-CH, de, de-DE, de-AT, de-CH, it, it-IT, it-CH, pt, pt-BR, pt-PT |
Usage
To use AssemblyAI, pass a descriptor string that joins the model ID and language code with a colon to the stt argument of your AgentSession:
Python:

    from livekit.agents import AgentSession

    session = AgentSession(
        stt="assemblyai/universal-streaming:en",
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession } from '@livekit/agents';

    const session = new AgentSession({
      stt: "assemblyai/universal-streaming:en",
      // ... llm, tts, vad, turn_detection, etc.
    });
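The descriptor in the examples above is a model ID and a language code joined by a colon. As a minimal sketch, the helper below builds such a string and checks the language against the model table first. The function name make_stt_descriptor and the SUPPORTED_LANGUAGES mapping are hypothetical and not part of the LiveKit SDK; the language codes are copied from the table in the overview.

```python
# Language codes per model ID, copied from the model table in the overview.
SUPPORTED_LANGUAGES = {
    "assemblyai/universal-streaming": {"en", "en-US"},
    "assemblyai/universal-streaming-multilingual": set(
        "en en-US en-GB en-AU en-CA en-IN en-NZ "
        "es es-ES es-MX es-AR es-CO es-CL es-PE es-VE es-EC es-GT es-CU "
        "es-BO es-DO es-HN es-PY es-SV es-NI es-CR es-PA es-UY es-PR "
        "fr fr-FR fr-CA fr-BE fr-CH de de-DE de-AT de-CH "
        "it it-IT it-CH pt pt-BR pt-PT".split()
    ),
}


def make_stt_descriptor(model: str, language: str) -> str:
    """Hypothetical helper: return 'model:language' after validating both parts."""
    if model not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unknown model ID: {model!r}")
    if language not in SUPPORTED_LANGUAGES[model]:
        raise ValueError(f"{model} does not list language {language!r}")
    return f"{model}:{language}"


descriptor = make_stt_descriptor("assemblyai/universal-streaming", "en")
print(descriptor)  # assemblyai/universal-streaming:en
```

The resulting string can be passed as the stt argument in place of a hand-written literal, which catches a mistyped language code before the session starts.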
Parameters
To customize additional parameters, use the STT class from the inference module:
Python:

    from livekit.agents import AgentSession, inference

    session = AgentSession(
        stt=inference.STT(
            model="assemblyai/universal-streaming",
            language="en",
        ),
        # ... llm, tts, vad, turn_detection, etc.
    )

Node.js:

    import { AgentSession, inference } from '@livekit/agents';

    const session = new AgentSession({
      stt: new inference.STT({
        model: "assemblyai/universal-streaming",
        language: "en",
      }),
      // ... llm, tts, vad, turn_detection, etc.
    });
model (string, required): The model to use for the STT.
language (string, optional): Language code for the transcription. If not set, the provider default applies.
extra_kwargs (dict, optional): Additional parameters to pass to the AssemblyAI Universal Streaming API, including format_turns, end_of_turn_confidence_threshold, min_end_of_turn_silence_when_confident, max_turn_silence, and keyterms_prompt. See the provider's documentation for more information.
In Node.js this parameter is called modelOptions.
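As a rough sketch of how the provider options might be assembled in Python, the helper below collects them into a dict and rejects misspelled keys. The helper name assemblyai_options is hypothetical, the values are illustrative rather than tuning recommendations, and the Python parameter name extra_kwargs in the trailing comment is an assumption. The text above introduces the option names with "including", so the real API may accept more keys than this set; extend it as needed.

```python
from typing import Any

# Option names listed in the parameter description above.
KNOWN_OPTIONS = {
    "format_turns",
    "end_of_turn_confidence_threshold",
    "min_end_of_turn_silence_when_confident",
    "max_turn_silence",
    "keyterms_prompt",
}


def assemblyai_options(**options: Any) -> dict[str, Any]:
    """Hypothetical helper: return the options as a dict, rejecting unknown keys."""
    unknown = set(options) - KNOWN_OPTIONS
    if unknown:
        raise ValueError(f"unknown AssemblyAI option(s): {sorted(unknown)}")
    return dict(options)


# Illustrative values only; check the provider's documentation for types and units.
opts = assemblyai_options(
    format_turns=True,
    end_of_turn_confidence_threshold=0.7,
    max_turn_silence=2400,  # assumed to be milliseconds
    keyterms_prompt=["LiveKit", "AssemblyAI"],
)

# Assumed usage (requires livekit-agents; the parameter name is an assumption):
# stt = inference.STT(
#     model="assemblyai/universal-streaming",
#     language="en",
#     extra_kwargs=opts,
# )
```

Validating keys up front means a typo such as max_turn_silense fails with a clear error locally instead of surfacing later as a provider-side error or an ignored option.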
Turn detection
AssemblyAI includes a custom phrase endpointing model that uses both audio and linguistic information to detect turn boundaries. To use this model for turn detection, set turn_detection="stt" in the AgentSession constructor. You should also provide a VAD plugin for responsive interruption handling.
    from livekit.agents import AgentSession, inference
    from livekit.plugins import silero

    session = AgentSession(
        turn_detection="stt",
        stt=inference.STT(
            model="assemblyai/universal-streaming",
            language="en",
        ),
        vad=silero.VAD.load(),  # Recommended for responsive interruption handling
        # ... llm, tts, etc.
    )
Additional resources
The following links provide more information about AssemblyAI in LiveKit Inference.