Gemini TTS plugin guide | LiveKit Documentation

Available in (Beta): Python, Node.js

Overview

This plugin allows you to use Gemini TTS as a TTS provider for your voice agents.

Installation

Install the plugin from PyPI (Python) or npm (Node.js):

Python:

uv add "livekit-agents[google]~=1.5"

Node.js:

pnpm add @livekit/agents-plugin-google@1.x

Authentication

Provide credentials using one of the following methods:

Vertex AI: Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the service account key file. For more information about mounting files as secrets when deploying to LiveKit Cloud, see File-mounted secrets.
Gemini API: Set the api_key argument or the GOOGLE_API_KEY environment variable.
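
As a quick pre-flight check, the two credential methods above can be sketched in plain Python. This is a minimal illustration and not part of the plugin: the helper name detect_gemini_auth, its return labels, and the choice to check the API key before the service account file are all assumptions made for this example.

```python
import os
from typing import Mapping, Optional


def detect_gemini_auth(
    env: Optional[Mapping[str, str]] = None,
    api_key: Optional[str] = None,
) -> str:
    """Report which credential method appears to be configured.

    Hypothetical helper for illustration only; the precedence used here
    (API key first) is an assumption, not documented plugin behavior.
    """
    env = os.environ if env is None else env

    # Gemini API: an explicit api_key argument or GOOGLE_API_KEY.
    if api_key or env.get("GOOGLE_API_KEY"):
        return "gemini_api"

    # Vertex AI: GOOGLE_APPLICATION_CREDENTIALS must point at a key file.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        if not os.path.isfile(key_path):
            raise FileNotFoundError(
                f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {key_path}"
            )
        return "vertex_ai"

    raise RuntimeError(
        "No credentials found: set GOOGLE_API_KEY (or pass api_key), "
        "or set GOOGLE_APPLICATION_CREDENTIALS."
    )


# Example with an explicit environment mapping instead of the real one.
print(detect_gemini_auth(env={"GOOGLE_API_KEY": "example-key"}))
```

Running a check like this at startup surfaces a missing key file or unset variable before the agent tries to synthesize speech.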

Usage

Use Gemini TTS in an AgentSession or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import google

session = AgentSession(
    tts=google.beta.GeminiTTS(
        model="gemini-2.5-flash-preview-tts",
        voice_name="Zephyr",
        instructions="Speak in a friendly and engaging tone.",
    ),
    # ... llm, stt, etc.
)

import { voice } from '@livekit/agents';
import * as google from '@livekit/agents-plugin-google';

const session = new voice.AgentSession({
    // Options are passed as a single object.
    tts: new google.beta.TTS({
        model: "gemini-2.5-flash-preview-tts",
        voiceName: "Zephyr",
        instructions: "Speak in a friendly and engaging tone.",
    }),
    // ... llm, stt, etc.
});

Parameters

This section describes some of the available parameters. See the plugin reference links in the Additional resources section for a complete list of all available parameters.

model (string, default: gemini-2.5-flash-preview-tts)

The model to use for speech generation. For a list of models, see Supported models.

voice_name (string, default: Kore)

Name of the voice to use for speech. In Node.js, this option is named voiceName. For supported voices, see Voice options and Supported voices and languages.

instructions (string)

Prompt to control the style, tone, accent, and pace. To learn more, see Controlling speech style with prompts.

customPronunciations (CustomPronunciations)

Available in Node.js

Pronunciation instructions for the Gemini model. Pass an object with a pronunciations array, where each entry specifies a phrase, pronunciation, and optional phoneticEncoding. These instructions are formatted as prompt text sent to the Gemini model alongside the speech generation request.

const tts = new google.beta.TTS({
  model: 'gemini-2.5-flash-preview-tts',
  voiceName: 'Aoede',
  customPronunciations: {
    pronunciations: [
      {
        phrase: 'LiveKit',
        pronunciation: 'Live Kit',
      },
      {
        phrase: 'gRPC',
        pronunciation: 'gee arr pee see',
        phoneticEncoding: 'IPA', // optional
      },
    ],
  },
});
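
Because customPronunciations is listed as Node.js only, a Python agent could try a similar effect by folding pronunciation hints into the instructions prompt. The sketch below is only an illustration under stated assumptions: build_instructions is a hypothetical helper, the hint wording is invented for this example and is not the format the Node.js plugin generates, and the model may or may not follow such hints.

```python
from typing import Iterable, Optional, Tuple


def build_instructions(
    style: str,
    pronunciations: Optional[Iterable[Tuple[str, str]]] = None,
) -> str:
    """Combine a style prompt with pronunciation hints into one string.

    Hypothetical helper; the hint phrasing below is an assumption.
    """
    lines = [style.strip()]
    for phrase, spoken in pronunciations or ():
        # One hint per line, appended after the style prompt.
        lines.append(f'Pronounce "{phrase}" as "{spoken}".')
    return "\n".join(lines)


instructions = build_instructions(
    "Speak in a friendly and engaging tone.",
    [("LiveKit", "Live Kit"), ("gRPC", "gee arr pee see")],
)
print(instructions)

# The result can then be passed as the instructions argument, for example:
# google.beta.GeminiTTS(voice_name="Zephyr", instructions=instructions)
```

Keeping the style prompt and the hints in one helper makes it easy to reuse the same pronunciation list across agents.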

Additional resources

The following resources provide more information about using Gemini TTS with LiveKit Agents.

Python plugin

Reference GitHub PyPI

Node.js plugin

Reference GitHub NPM

Gemini TTS docs

Gemini Developer API docs for TTS.

Voice AI quickstart

Get started with LiveKit Agents and Gemini TTS.

Google ecosystem guide

Overview of the entire Google AI and LiveKit Agents integration.