Overview
This guide walks you through building a voice AI assistant with Google Gemini and LiveKit Agents. In less than 10 minutes, you'll have a voice assistant that you can speak to in your terminal, browser, or on the phone.
LiveKit Agents overview
LiveKit Agents is an open source framework for building realtime AI apps in Python and Node.js. It supports complex voice AI workflows with multiple agents and discrete processing steps, and includes built-in load balancing.
LiveKit provides SIP support for telephony integration and full-featured frontend SDKs in multiple languages. It uses WebRTC transport for end-user devices, enabling high-quality, low-latency realtime experiences. To learn more, see LiveKit Agents.
Google AI ecosystem support
Google AI provides some of the most powerful AI models and services today, which integrate into LiveKit Agents in the following ways:
- Gemini Live API: A speech-to-speech realtime model with live video input.
- Gemini: A family of general-purpose, high-performance LLMs.
- Gemini TTS: A speech synthesis model that generates customizable speech from text.
- Google Cloud STT and TTS: Affordable, production-grade models for transcription and speech synthesis.
LiveKit Agents supports Google AI through the Gemini API and Vertex AI.
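If you want to use Vertex AI rather than a Gemini API key, authentication typically goes through a Google Cloud service account instead. The following is a sketch of the environment variables commonly used by the underlying google-genai SDK; the variable names here are an assumption, so verify them against the plugin reference for your installed version:

```shell
# Assumed Vertex AI environment (verify against the plugin docs):
GOOGLE_APPLICATION_CREDENTIALS=<path to your service account JSON file>
GOOGLE_CLOUD_PROJECT=<your Google Cloud project ID>
GOOGLE_CLOUD_LOCATION=<your region, e.g. us-central1>
```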
Requirements
The following sections describe the minimum requirements to get started:
- LiveKit Agents requires Python >= 3.10.
- This guide uses the uv package manager.
LiveKit Cloud
This guide assumes you have signed up for a free LiveKit Cloud account. LiveKit Cloud includes agent deployment, model inference, and realtime media transport. Create a free project and use the API keys in the following steps to get started.
While this guide assumes LiveKit Cloud, the instructions can be adapted for self-hosting the open-source LiveKit server instead. For self-hosting in production, set up a custom deployment environment.
LiveKit Docs MCP server
If you're using an AI coding assistant, you should install the LiveKit Docs MCP server to get the most out of it. This ensures your agent has access to the latest documentation and examples.
LiveKit CLI
Use the LiveKit CLI to manage LiveKit API keys and deploy your agent to LiveKit Cloud.
Install the LiveKit CLI:
Install the LiveKit CLI with Homebrew on macOS:

```shell
brew install livekit-cli
```

Install with the install script on Linux:

```shell
curl -sSL https://get.livekit.io/cli | bash
```

Install with winget on Windows:

```shell
winget install LiveKit.LiveKitCLI
```

Tip: You can also download the latest precompiled binaries here.
Alternatively, build from source. The repo uses Git LFS for embedded video resources, so ensure git-lfs is installed on your machine before proceeding:

```shell
git clone https://github.com/livekit/livekit-cli
cd livekit-cli
make install
```

Link your LiveKit Cloud project to the CLI:

```shell
lk cloud auth
```

This opens a browser window to authenticate and link your project to the CLI.
AI models
Voice agents require one or more AI models to provide understanding, intelligence, and speech. LiveKit Agents supports both high-performance STT-LLM-TTS voice pipelines constructed from multiple specialized models, as well as realtime models with direct speech-to-speech capabilities.
The rest of this guide presents two options for getting started with Gemini:
Use the Gemini Live API for an expressive and lifelike voice experience with a single realtime model. This is the simplest way to get started with Gemini.
| Model | Required Key |
|---|---|
| Gemini Live API | GOOGLE_API_KEY |
String together three specialized Google services into a high-performance voice pipeline.
| Component | Model |
|---|---|
| STT | Google Cloud STT (Chirp) |
| LLM | Gemini 2.5 Flash |
| TTS | Google Cloud TTS |
Setup
Use the instructions in the following sections to set up your new project.
Project initialization
Create a new project for the voice agent.
Run the following commands to use uv to create a new project ready to use for your new voice agent:
```shell
uv init livekit-gemini-agent --bare
cd livekit-gemini-agent
```
Install packages
Install the following packages to build a voice AI agent with the Gemini Live API, noise cancellation, and voice activity detection:
```shell
uv add \
  "livekit-agents[silero,google]~=1.2" \
  "livekit-plugins-noise-cancellation~=0.2" \
  "python-dotenv"
```
Install the following packages to build a complete voice AI agent with Gemini, noise cancellation, and turn detection:
```shell
uv add \
  "livekit-agents[silero,turn-detector,google]~=1.2" \
  "livekit-plugins-noise-cancellation~=0.2" \
  "python-dotenv"
```
Environment variables
Run the following command to load your LiveKit Cloud API keys into a .env.local file:
```shell
lk app env -w
```
Add your Google API key from Google AI Studio:

```shell
LIVEKIT_API_KEY=<your API Key>
LIVEKIT_API_SECRET=<your API Secret>
LIVEKIT_URL=<your LiveKit server URL>
GOOGLE_API_KEY=<Your Google API Key>
```
Add your Google API key from Google AI Studio. For Google Cloud STT, you also need to set up Google Cloud credentials:

```shell
LIVEKIT_API_KEY=<your API Key>
LIVEKIT_API_SECRET=<your API Secret>
LIVEKIT_URL=<your LiveKit server URL>
GOOGLE_API_KEY=<Your Google API Key>
GOOGLE_APPLICATION_CREDENTIALS=<Path to your Google Cloud service account JSON file>
```
Google Cloud STT requires a Google Cloud project with the Speech-to-Text API enabled. Create a service account key and download the JSON file. To learn more, see Google Cloud authentication.
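The agent code below loads these variables with python-dotenv's `load_dotenv(".env.local")`. In essence, that call does something like the following stdlib-only sketch (the real library also handles quoting edge cases, interpolation, and `export` prefixes):

```python
import os


def load_env_file(path: str) -> dict[str, str]:
    """Parse KEY=VALUE lines from an env file into os.environ.

    Simplified sketch of python-dotenv's load_dotenv; real environment
    variables take precedence, matching load_dotenv's default behavior.
    """
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip().strip('"').strip("'")
            loaded[key] = value
            os.environ.setdefault(key, value)
    return loaded
```

This is only to show what the dependency does; in the project itself, keep using `load_dotenv` rather than this sketch.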
Agent code
Create a file named agent.py with your agent code.
Gemini Live API (realtime model):

```python
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentServer, AgentSession, Agent, room_io
from livekit.plugins import (
    google,
    noise_cancellation,
    silero,
)

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant powered by Gemini.")


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    session = AgentSession(
        llm=google.realtime.RealtimeModel(
            voice="Puck",
        ),
        vad=silero.VAD.load(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
```
STT-LLM-TTS pipeline:

```python
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentServer, AgentSession, Agent, room_io
from livekit.plugins import google, noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful voice AI assistant powered by Google.
            You eagerly assist users with their questions by providing information from your extensive knowledge.
            Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols.
            You are curious, friendly, and have a sense of humor.""",
        )


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    session = AgentSession(
        stt=google.STT(
            model="chirp",
        ),
        llm=google.LLM(
            model="gemini-2.5-flash",
        ),
        tts=google.TTS(
            gender="female",
            voice_name="en-US-Standard-H",
        ),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance.",
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
```
Download model files
If you're using the turn-detector plugin, you first need to download the model files:
```shell
uv run agent.py download-files
```
Speak to your agent
Start your agent in console mode to run inside your terminal:
```shell
uv run agent.py console
```
Your agent speaks to you in the terminal, and you can speak to it as well.

Connect to playground
Start your agent in dev mode to connect it to LiveKit and make it available from anywhere on the internet:
```shell
uv run agent.py dev
```
Use the Agents playground to speak with your agent and explore its full range of multimodal capabilities.
Deploy to LiveKit Cloud
From the root of your project, run the following command with the LiveKit CLI. Ensure you have linked your LiveKit Cloud project.
```shell
lk agent create
```
The CLI creates Dockerfile, .dockerignore, and livekit.toml files in your current directory, then registers your agent with your LiveKit Cloud project and deploys it.
After the deployment completes, you can access your agent in the playground, or continue to use the console mode as you build and test your agent locally.
Additional resources
The following links provide more information on each available Google component in LiveKit Agents.
Gemini Vision Assistant
Build a vision-aware voice assistant with Gemini Live.
Gemini LLM
LiveKit Agents plugin for Google Gemini models.
Gemini TTS
LiveKit Agents plugin for Gemini TTS.
Gemini Live API
LiveKit Agents plugin for the Gemini Live API.
Google Cloud STT
LiveKit Agents plugin for Google Cloud STT.
Google Cloud TTS
LiveKit Agents plugin for Google Cloud TTS.