Google Gemini LLM

How to use Google Gemini models with LiveKit Agents.


Overview

Google Gemini models are available in LiveKit Agents through LiveKit Inference and the Gemini plugin. With LiveKit Inference, your agent runs on LiveKit's infrastructure to minimize latency. No separate provider API key is required, and usage and rate limits are managed through LiveKit Cloud. Use the plugin instead if you want to manage your own billing and rate limits. Pricing for LiveKit Inference is available on the pricing page.

LiveKit Inference

Use LiveKit Inference to access Gemini models without a separate Google API key.

| Model name | Model ID | Providers |
| --- | --- | --- |
| Gemini 2.5 Flash | google/gemini-2.5-flash | google |
| Gemini 2.5 Flash-Lite | google/gemini-2.5-flash-lite | google |
| Gemini 2.5 Pro | google/gemini-2.5-pro | google |
| Gemini 3 Flash | google/gemini-3-flash-preview | google |
| Gemini 3.1 Flash Lite | google/gemini-3.1-flash-lite-preview | google |
| Gemini 3.1 Pro | google/gemini-3.1-pro-preview | google |
| Gemini 2.0 Flash (retired) | google/gemini-2.0-flash | google |
| Gemini 2.0 Flash-Lite (retired) | google/gemini-2.0-flash-lite | google |
| Gemini 3 Pro (retired) | google/gemini-3-pro-preview | google |
Retired models
Retired models are no longer accessible. If you're using a retired model, switch to a currently available model.

Usage

To use Gemini, create an LLM instance from the inference module. You can use this LLM in the Voice AI quickstart:

Python:

from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="google/gemini-2.5-flash-lite",
        extra_kwargs={
            "max_completion_tokens": 1000,
        },
    ),
    # ... tts, stt, vad, turn_handling, etc.
)

Node.js:

import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  llm: new inference.LLM({
    model: "google/gemini-2.5-flash-lite",
    modelOptions: {
      max_completion_tokens: 1000,
    },
  }),
  // ... tts, stt, vad, turnHandling, etc.
});

Parameters

The following are parameters for configuring Gemini models with LiveKit Inference. For model behavior parameters like temperature and max_completion_tokens, see model parameters.

model (string, required)

The model ID from the models list.

provider (string, optional)

Set a specific provider to use for the LLM. Refer to the models list for available providers. If not set, LiveKit Inference uses the best available provider and bills accordingly.

extra_kwargs (dict, optional)

Additional parameters to pass to the Gemini Chat Completions API, such as max_tokens or temperature. See model parameters for supported fields.

In Node.js this parameter is called modelOptions.

Model parameters

Pass the following parameters inside extra_kwargs (Python) or modelOptions (Node.js). For more details about each parameter in the list, see Inference parameters.

| Parameter | Type | Default | Notes |
| --- | --- | --- | --- |
| temperature | float | 1 | Controls the randomness of the model's output. Valid range: 0-2. |
| top_p | float | 1 | Alternative to temperature. Valid range: 0-1. |
| frequency_penalty | float | 0 | Reduces the model's likelihood to repeat tokens that have already appeared. Valid range: -2.0 to 2.0. |
| presence_penalty | float | 0 | Increases the model's likelihood to introduce new topics. Valid range: -2.0 to 2.0. |
| max_completion_tokens | int | | Maximum number of tokens to generate. Also accepted as max_tokens. |
| seed | int | | Enables deterministic sampling. The system makes a best effort to return the same result for identical requests. |
| stop | str \| list[str] | | Sequences that stop generation. |
| tool_choice | ToolChoice \| Literal['auto', 'required', 'none'] | "auto" | Controls how the model uses tools. |
| reasoning_effort | "low" \| "medium" \| "high" | | Controls thinking depth. Maps to thinking token budgets of 1024, 8192, and 24576 respectively. Only supported by thinking-capable models. |
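As an illustration of the ranges above, the following sketch validates a parameter dict before it is passed as extra_kwargs (Python) or modelOptions (Node.js). The validate_model_params helper is hypothetical, not part of the LiveKit SDK; the ranges come from the table.

```python
# Hypothetical helper: checks a parameter dict against the documented
# ranges before it is passed as extra_kwargs / modelOptions.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}

def validate_model_params(params: dict) -> dict:
    """Raise ValueError if any documented parameter is out of range."""
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name}={params[name]} outside [{low}, {high}]")
    if params.get("reasoning_effort") not in (None, "low", "medium", "high"):
        raise ValueError("reasoning_effort must be 'low', 'medium', or 'high'")
    return params

params = validate_model_params({"temperature": 0.7, "reasoning_effort": "low"})
```

Validating early surfaces a bad value at session construction time rather than as a provider-side error mid-call.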

String descriptors

As a shortcut, you can also pass a model ID directly to the llm argument in your AgentSession:

Python:

from livekit.agents import AgentSession

session = AgentSession(
    llm="google/gemini-2.5-flash-lite",
    # ... tts, stt, vad, turn_handling, etc.
)

Node.js:

import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  llm: "google/gemini-2.5-flash-lite",
  // ... tts, stt, vad, turnHandling, etc.
});

Plugin

LiveKit's plugin support for Google lets you connect directly to Google's Gemini API with your own API key.

Available in: Python | Node.js

Installation

Install the plugin from PyPI (Python) or npm (Node.js):

Python:

uv add "livekit-agents[google]~=1.4"

Node.js:

pnpm add @livekit/agents-plugin-google@1.x

Authentication

The Google plugin requires authentication based on your chosen service:

  • For Vertex AI, you must set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the service account key file. For more information about mounting files as secrets when deploying to LiveKit Cloud, see File-mounted secrets.
  • For Google Gemini API, set the GOOGLE_API_KEY environment variable.
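For example, a minimal shell sketch of both options; the placeholder values are illustrative and must be replaced with your own credentials:

```shell
# Google Gemini API: authenticate with an API key.
export GOOGLE_API_KEY="<your-api-key>"

# Vertex AI: authenticate with a service account key file.
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```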

Usage

Use Gemini within an AgentSession or as a standalone LLM service. For example, you can use this LLM in the Voice AI quickstart.

Python:

from livekit.agents import AgentSession
from livekit.plugins import google

session = AgentSession(
    llm=google.LLM(
        model="gemini-3-flash-preview",
    ),
    # ... tts, stt, vad, turn_handling, etc.
)

Node.js:

import { voice } from '@livekit/agents';
import * as google from '@livekit/agents-plugin-google';

const session = new voice.AgentSession({
  llm: new google.LLM({
    model: "gemini-3-flash-preview",
  }),
  // ... tts, stt, vad, turnHandling, etc.
});

Parameters

This section describes some of the available parameters. For a complete reference of all available parameters, see the plugin reference.

model (ChatModels | str). Default: gemini-3-flash-preview

ID of the model to use. For a full list, see Gemini models.

api_key (str). Environment variable: GOOGLE_API_KEY

API key for the Google Gemini API.

vertexai (bool). Default: false

True to use Vertex AI; false to use Google AI.

project (str). Environment variable: GOOGLE_CLOUD_PROJECT

Google Cloud project to use (Vertex AI only). Required if using Vertex AI and the environment variable isn't set.

location (str). Environment variable: GOOGLE_CLOUD_LOCATION

Google Cloud location to use (Vertex AI only). Required if using Vertex AI and the environment variable isn't set.

Provider tools

Available in: Python only

Tip

The experimental _gemini_tools parameter used with Google LLMs has been removed in favor of these provider tools.

Google Gemini supports the following provider tools that enable the model to use built-in capabilities executed on the model server. These tools can be used alongside function tools defined in your agent's codebase.

| Tool | Description | Parameters |
| --- | --- | --- |
| GoogleSearch | Search Google for up-to-date information. | exclude_domains, blocking_confidence, time_range_filter |
| GoogleMaps | Search for places and businesses using Google Maps. | auth_config, enable_widget |
| URLContext | Provide context from URLs. | None |
| FileSearch | Search file stores. | file_search_store_names (required), top_k, metadata_filter |
| ToolCodeExecution | Execute code snippets. | None |
Current limitations

Currently, only the Gemini Live API supports using provider tools alongside function tools.

When using text models, you can use either provider tools or function tools, but not both. See issue #53 for more details.

from livekit.plugins import google

agent = MyAgent(
    llm=google.LLM(model="gemini-2.5-flash"),
    tools=[google.tools.GoogleSearch()],  # replace with any supported provider tool
)

Additional resources

The following resources provide more information about using Google Gemini with LiveKit Agents.