Play with the Gemini Multimodal Live API in this LiveKit-powered playground

Google AI ecosystem support
Google AI provides some of the most powerful AI models and services today, which integrate into LiveKit Agents in the following ways:
- Gemini: A family of general purpose high-performance LLMs.
- Google Cloud STT and TTS: Affordable, production-grade models for transcription and speech synthesis.
- Gemini Multimodal Live API: A speech-to-speech realtime model with live video input.
LiveKit Agents supports Google AI through the Gemini API and Vertex AI.
Getting started
Use the Voice AI quickstart to build a voice AI app with Gemini. Select an STT-LLM-TTS pipeline model type and add the following components to build on Gemini.
Voice AI quickstart
Build your first voice AI app with Google Gemini.
Install the Google plugin:
pip install "livekit-agents[google]~=1.0rc"
Add your Google API key to your .env file:
GOOGLE_API_KEY=<your-google-api-key>
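As a minimal sketch of how an agent might confirm the key is available before starting, the stdlib-only helper below reads simple KEY=VALUE lines and fails fast when GOOGLE_API_KEY is missing or still set to the placeholder. The names `load_env_file` and `require_google_api_key` are hypothetical and not part of LiveKit; many projects use a package such as python-dotenv for this step instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env-style file (hypothetical helper)."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_google_api_key(path: str = ".env") -> str:
    """Return GOOGLE_API_KEY from the process environment or the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY")
    # A value starting with "<" is treated as an unfilled placeholder.
    if not key or key.startswith("<"):
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
    return key
```

Calling `require_google_api_key()` at the top of an entrypoint surfaces a missing key as a clear startup error rather than as a failed request later in the session.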
Use the Google LLM component to initialize your AgentSession:
from livekit.agents import AgentSession
from livekit.plugins import google

# ...

# in your entrypoint function
session = AgentSession(
    llm=google.LLM(
        model="gemini-2.0-flash",
    ),
    # ... stt, tts, vad, turn_detection, etc.
)
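The session above reaches Gemini through the Gemini API using GOOGLE_API_KEY. As a rough sketch of how the same setup might be switched to Vertex AI, the hypothetical helper below builds the keyword arguments for google.LLM. The `vertexai`, `project`, and `location` parameter names are assumptions about the plugin's constructor and should be checked against the Google plugin reference; `google_llm_kwargs` and `build_session` are illustrative names, not LiveKit APIs.

```python
from typing import Any, Optional


def google_llm_kwargs(
    model: str = "gemini-2.0-flash",
    *,
    use_vertex: bool = False,
    project: Optional[str] = None,
    location: Optional[str] = None,
) -> dict[str, Any]:
    """Build keyword arguments for google.LLM (hypothetical helper).

    The `vertexai`, `project`, and `location` names are assumptions about the
    plugin's constructor; verify them before relying on this.
    """
    kwargs: dict[str, Any] = {"model": model}
    if use_vertex:
        # This sketch asks for both values explicitly instead of relying on defaults.
        if not project or not location:
            raise ValueError("this sketch expects an explicit project and location for Vertex AI")
        kwargs.update(vertexai=True, project=project, location=location)
    return kwargs


def build_session(**llm_options: Any):
    """Create an AgentSession with a Google LLM.

    Imports are deferred so google_llm_kwargs can be used and tested
    without the plugin installed.
    """
    from livekit.agents import AgentSession
    from livekit.plugins import google

    return AgentSession(llm=google.LLM(**google_llm_kwargs(**llm_options)))
```

With no arguments, `build_session()` mirrors the snippet above. Passing `use_vertex=True` together with a project and location selects the Vertex AI path under the stated assumptions.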
LiveKit Agents overview
LiveKit Agents is an open-source framework for building realtime AI apps using WebRTC transport to end-user devices and WebSockets or HTTPS for backend services.
- Agent workflows: Build complex voice AI apps with discrete stages and handoffs.
- Telephony: Inbound and outbound calling using SIP trunks.
- Frontend SDKs: Full-featured SDKs and UI components for JavaScript, Swift, Kotlin, Flutter, React Native, and Unity.
- Python and Node.js: Build voice AI apps in Python or Node.js.
- Dispatch and load balancing: Built-in support for request distribution and load balancing.
- LiveKit Cloud: Fully-managed LiveKit server with global scale and low latency (you can also self-host).
WebRTC provides significant advantages over alternatives such as WebSockets for building realtime applications:
- Optimized for media: Purpose-built for audio and video with advanced codecs and compression algorithms.
- Network resilient: Performs reliably even in challenging network conditions due to UDP, adaptive bitrate, and more.
- Broad compatibility: Natively supported in all modern browsers.
LiveKit handles all of the complexity of running production-grade WebRTC infrastructure while extending support to mobile apps, backends, and telephony.
Google plugin documentation
The following links provide more information on each available Google component in LiveKit Agents.