Google Gemini LLM

Reference for the Google Gemini models served via LiveKit Inference.

Overview

LiveKit Inference offers Gemini models through Google Vertex AI. Pricing is available on the pricing page.

| Model name | Model ID | Providers |
| --- | --- | --- |
| Gemini 2.5 Pro | google/gemini-2.5-pro | google |
| Gemini 2.5 Flash | google/gemini-2.5-flash | google |
| Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite | google |
| Gemini 2.0 Flash | google/gemini-2.0-flash | google |
| Gemini 2.0 Flash Lite | google/gemini-2.0-flash-lite | google |

Usage

To use Gemini, pass the model ID to the llm argument in your AgentSession. LiveKit Inference manages the connection to the model automatically.

Python:

```python
from livekit.agents import AgentSession

session = AgentSession(
    llm="google/gemini-2.5-flash-lite",
    # ... tts, stt, vad, turn_detection, etc.
)
```

Node.js:

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  llm: 'google/gemini-2.5-flash-lite',
  // ... tts, stt, vad, turn_detection, etc.
});
```

Parameters

To customize additional parameters, use the LLM class from the inference module.

Python:

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="google/gemini-2.5-flash-lite",
        extra_kwargs={
            "max_completion_tokens": 1000,
        },
    ),
    # ... tts, stt, vad, turn_detection, etc.
)
```

Node.js:

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  llm: new inference.LLM({
    model: 'google/gemini-2.5-flash-lite',
    extraKwargs: {
      max_completion_tokens: 1000,
    },
  }),
  // ... tts, stt, vad, turn_detection, etc.
});
```
model (string, Required)

The model ID from the models list.

provider (string, Optional)

Set a specific provider to use for the LLM. Refer to the models list for available providers. If not set, LiveKit Inference uses the best available provider, and bills accordingly.

extra_kwargs (dict, Optional)

Additional parameters to pass to the Gemini Chat Completions API, such as max_completion_tokens.
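Putting the parameters together, a minimal configuration sketch in Python that pins the provider explicitly instead of letting LiveKit Inference pick one (the "google" provider value is taken from the models table above):

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="google/gemini-2.5-pro",
        # Pin the provider rather than letting LiveKit Inference
        # route to the best available one.
        provider="google",
        extra_kwargs={
            "max_completion_tokens": 1000,
        },
    ),
    # ... tts, stt, vad, turn_detection, etc.
)
```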

Additional resources

The following links provide more information about Gemini in LiveKit Inference.