Google Cloud TTS integration guide

How to use the Google Cloud TTS plugin for LiveKit Agents.

Overview

Google Cloud TTS provides a wide voice selection and generates speech with humanlike intonation. With LiveKit's Google Cloud TTS integration and the Agents framework, you can build voice AI applications that sound realistic.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the plugin from PyPI:

pip install "livekit-agents[google]~=1.0rc"

Authentication

Google Cloud credentials must be provided by one of the following methods:

  • Passed in the credentials_info dictionary.
  • Saved in a JSON file whose path is given by credentials_file (or by the GOOGLE_APPLICATION_CREDENTIALS environment variable).
  • Supplied through Application Default Credentials. To learn more, see How Application Default Credentials works.
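The three methods above can be sketched as a small helper that decides which keyword arguments to hand to the TTS constructor. This is a minimal sketch, not the plugin's own logic: the helper name and the GOOGLE_TTS_CREDENTIALS_JSON variable are hypothetical and exist only for illustration, while credentials_info, credentials_file, and GOOGLE_APPLICATION_CREDENTIALS come from this guide.

```python
import json
import os


def google_credential_kwargs(env=None):
    """Pick credential keyword arguments for the TTS constructor.

    GOOGLE_TTS_CREDENTIALS_JSON is a hypothetical variable name used only
    in this sketch; GOOGLE_APPLICATION_CREDENTIALS is the standard one.
    """
    env = os.environ if env is None else env

    # Method 1: inline service-account JSON, parsed into a dictionary.
    inline_json = env.get("GOOGLE_TTS_CREDENTIALS_JSON")
    if inline_json:
        return {"credentials_info": json.loads(inline_json)}

    # Method 2: path to a service-account JSON file.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        return {"credentials_file": key_path}

    # Method 3: pass nothing and rely on Application Default Credentials.
    return {}
```

The returned dictionary could then be unpacked into the constructor, for example google.TTS(**google_credential_kwargs(), voice_name="en-US-Standard-H").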

Usage

Use Google Cloud TTS in an AgentSession or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import google

session = AgentSession(
    tts=google.TTS(
        gender="female",
        voice_name="en-US-Standard-H",
    ),
    # ... llm, stt, etc.
)

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list of all available parameters.

language (SpeechLanguages | string, optional, default: en-US)

Specify output language. For a full list of languages, see Supported voices and languages.

gender (Gender | string, optional, default: neutral)

Voice gender. Valid values are male, female, and neutral.

voice_name (string, optional)

Name of the voice to use for speech. For a full list of voices, see Supported voices and languages.
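Because language, gender, and voice_name are set independently, it can help to validate them together before constructing the TTS. The following is a minimal sketch under stated assumptions: build_tts_options is a hypothetical helper (not part of the plugin), and the prefix check assumes that voice names begin with their language code, as en-US-Standard-H does. Confirm that assumption against Supported voices and languages before relying on it.

```python
VALID_GENDERS = {"male", "female", "neutral"}


def build_tts_options(language="en-US", gender="neutral", voice_name=None):
    """Validate the three voice parameters and return them as kwargs."""
    if gender not in VALID_GENDERS:
        raise ValueError(f"gender must be one of {sorted(VALID_GENDERS)}")

    options = {"language": language, "gender": gender}

    if voice_name is not None:
        # Assumption: a voice name starts with its language code,
        # for example "en-US-Standard-H" for language "en-US".
        if not voice_name.lower().startswith(language.lower() + "-"):
            raise ValueError(
                f"voice {voice_name!r} does not match language {language!r}"
            )
        options["voice_name"] = voice_name

    return options
```

The result could be unpacked into the constructor, for example google.TTS(**build_tts_options(gender="female", voice_name="en-US-Standard-H")).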

credentials_info (dict, optional)

Key-value pairs of authentication credential information.

credentials_file (string, optional)

Name of the JSON file that contains authentication credentials for Google Cloud.

Customizing speech

Google Cloud TTS supports Speech Synthesis Markup Language (SSML) to customize pronunciation and speech. To learn more, see the SSML reference.
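As an illustration of what SSML input looks like, the sketch below wraps a list of sentences in a speak element and separates them with fixed pauses using the break element. The build_ssml helper is hypothetical and only builds the markup string; how SSML is passed through the plugin is not covered here, so check the plugin reference for that.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=300):
    """Wrap sentences in <speak>, separated by fixed-length pauses."""
    # <break time="..."/> inserts a pause of the given duration.
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # Escape &, <, and > so plain text cannot break the markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    return f"<speak>{body}</speak>"


print(build_ssml(["Hello", "Tom & Jerry"], pause_ms=500))
# <speak>Hello<break time="500ms"/>Tom &amp; Jerry</speak>
```

Escaping matters because characters such as & are common in ordinary text and would otherwise make the SSML document invalid.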

Additional resources

The following resources provide more information about using Google Cloud with LiveKit Agents.