Deepgram integration guide

Overview

Deepgram provides advanced speech recognition technology and AI-driven audio processing solutions. Customizable speech models allow you to fine-tune transcription performance for your specific use case. With LiveKit's Deepgram integration and the Agents framework, you can build AI agents that provide high-accuracy transcriptions.

Note

If you're looking to build an AI voice assistant with Deepgram, check out our Voice Agent Quickstart guide and use the Deepgram STT and/or TTS module as demonstrated below.

Quick reference

Environment variables

DEEPGRAM_API_KEY=<your-deepgram-api-key>
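Assuming the key is supplied through the process environment as shown above, it can help to check for it at startup rather than on the first request. The following is a minimal sketch; the helper name `require_deepgram_key` is hypothetical and is not part of the LiveKit or Deepgram APIs.

```python
import os


def require_deepgram_key(env=os.environ):
    """Return the Deepgram API key, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of any LiveKit or
    Deepgram API.
    """
    key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DEEPGRAM_API_KEY is not set")
    return key
```

Calling `require_deepgram_key()` once before constructing any STT or TTS object surfaces a missing key as a clear startup error.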

STT

LiveKit's Deepgram integration provides a speech-to-text (STT) interface that can be used as the first stage in a VoicePipelineAgent or as a standalone transcription service. For a complete reference of all available parameters, see the plugin reference for Python or Node.

Usage

from livekit.plugins.deepgram import stt

deepgram_stt = stt.STT(
    model="nova-2-general",
    interim_results=True,
    smart_format=True,
    punctuate=True,
    filler_words=True,
    profanity_filter=False,
    keywords=[("LiveKit", 1.5)],
    language="en-US",
)
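With interim_results enabled, a consumer typically sees several preliminary transcripts for an utterance before the final one arrives. The sketch below illustrates one way to handle that: interim text overwrites a pending buffer, and final text is committed. The `TranscriptEvent` type and `collect_transcript` function are hypothetical stand-ins written for this example; they are not LiveKit or Deepgram types, and the plugin's real event objects may be shaped differently.

```python
from dataclasses import dataclass


@dataclass
class TranscriptEvent:
    """Hypothetical stand-in for an STT event; not a LiveKit type."""
    text: str
    is_final: bool


def collect_transcript(events):
    """Return (committed_text, pending_text) after consuming all events.

    Interim text replaces the pending buffer; final text is committed
    and clears the pending buffer.
    """
    committed = []
    pending = ""
    for event in events:
        if event.is_final:
            committed.append(event.text)
            pending = ""
        else:
            pending = event.text
    return " ".join(committed), pending
```

Keeping interim text separate from committed text means a UI can show the pending buffer immediately while downstream logic acts only on final results.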

Parameters

model (string, optional, default: nova-2-general)

ID of the model to use for inference. To learn more, see supported models.

interim_results (bool, optional, default: true)

Enable preliminary results before the final transcription is available.

smart_format (bool, optional, default: true)

Enable smart formatting to improve the readability of transcriptions.

punctuate (bool, optional, default: true)

Enable punctuation in transcriptions.

filler_words (bool, optional, default: true)

Include filler words such as "uh" and "um" in transcriptions, which can improve turn detection.

profanity_filter (bool, optional, default: false)

Replace recognized profanity with asterisks in transcriptions.

keywords (list[tuple[string, float]], optional, default: [])

A list of (keyword, intensifier) pairs. A positive intensifier boosts recognition of the keyword; a negative intensifier suppresses it.

language (string, optional, default: en)

Language of the input audio as an ISO-639-1 code (for example, en), optionally with a region subtag as in the usage example (en-US).
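The keywords parameter above pairs each term with a numeric intensifier. Deepgram's HTTP API expresses such a pair as `term:intensifier` in its `keywords` query parameter. The following is a minimal sketch of validating pairs and rendering them in that form; the helper `format_keywords` is hypothetical, and whether the plugin performs this exact serialization internally is an assumption.

```python
def format_keywords(keywords):
    """Render (term, intensifier) pairs as "term:intensifier" strings.

    Hypothetical helper for illustration; not part of the plugin API.
    """
    formatted = []
    for term, intensifier in keywords:
        if not term or not term.strip():
            raise ValueError("keyword term must be a non-empty string")
        # Positive intensifiers boost a term; negative ones suppress it.
        formatted.append(f"{term.strip()}:{float(intensifier)}")
    return formatted
```

Validating the list before passing it to the STT constructor catches empty terms or non-numeric intensifiers early, instead of at request time.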

TTS

LiveKit's Deepgram integration also provides a text-to-speech (TTS) interface. This can be used in a VoicePipelineAgent or as a standalone speech generator. For a complete reference of all available parameters, see the plugin reference.

Usage

from livekit.plugins.deepgram import tts

deepgram_tts = tts.TTS(
    model="aura-asteria-en",
)

Parameters

model (string, optional, default: aura-asteria-en)

ID of the model to use for generation. To learn more, see supported models.