Overview
The Grok Voice Agent API enables low-latency, two-way voice interactions using Grok models. LiveKit's xAI plugin includes a RealtimeModel class that lets you build agents that hold natural, human-like voice conversations.
The Grok Voice Agent API is compatible with OpenAI's Realtime API.
Quick reference
This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.
Installation
Install the xAI plugin:
uv add "livekit-agents[xai]"
Authentication
The xAI plugin requires an xAI API key.
Set XAI_API_KEY in your .env file.
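As a minimal sketch, assuming the key from your .env file is exposed to the process as the XAI_API_KEY environment variable, a startup check like the following can fail fast with a clear message when the key is missing. The helper name require_xai_api_key is hypothetical and is not part of the xAI plugin.

```python
import os


def require_xai_api_key(env=None):
    """Return the xAI API key, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of the xAI plugin.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("XAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("XAI_API_KEY is not set; add it to your .env file")
    return key
```

Calling this once before the agent starts surfaces a configuration problem early, rather than when the first realtime connection is attempted.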
Usage
Use the Grok Voice Agent API within an AgentSession. For example, you can use it in the Voice AI quickstart.
from livekit.agents import AgentSession
from livekit.plugins import xai

session = AgentSession(
    llm=xai.realtime.RealtimeModel(),
)
Parameters
This section describes some of the available parameters. For a complete reference of all available parameters, see the plugin reference links in the Additional resources section.
voice: Voice to use for speech generation. For a list of available voices, see Available voices.
api_key: xAI API key.
turn_detection: Configuration for turn detection. Server VAD is enabled by default with the following settings: threshold=0.5, prefix_padding_ms=300, silence_duration_ms=200.
Tools
xAI supports provider tools that enable the model to use built-in capabilities executed on the model server. These tools can be used alongside function tools defined in your agent's codebase.
Available tools include:
XSearch: Perform keyword search, semantic search, user search, and thread fetch on X
WebSearch: Search the web and browse pages
FileSearch: Search uploaded knowledge bases (collections) on xAI
For example, the following code shows an agent that retrieves top trending topics and passes them to a function tool for summarization.
from livekit.agents import Agent, AgentSession, RunContext, function_tool
from livekit.plugins import xai


class MyAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are an AI assistant that has the capability of searching X.",
            llm=xai.realtime.RealtimeModel(),
            # Provider tool, executed on the model server
            tools=[xai.realtime.XSearch()],
        )

    # Function tool, executed in your agent's codebase
    @function_tool
    async def summarize_trending_topics(self, context: RunContext, topics: list[str]) -> str:
        """Summarizes the trending topics, which are provided by other tools.

        Args:
            topics: The trending topics on X
        """
        # Keep at most the first three topics
        if len(topics) > 3:
            topics = topics[:3]
        return f"The top three topics are: {topics}"
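The body of a function tool like the one above is ordinary Python, so its logic can be checked without a live session. The sketch below copies that logic into a standalone function; the name format_top_topics is hypothetical and used only for illustration.

```python
def format_top_topics(topics: list[str]) -> str:
    """Mirror of the tool body above: keep at most three topics and format them.

    Hypothetical standalone helper; not part of LiveKit or the xAI plugin.
    """
    if len(topics) > 3:
        topics = topics[:3]
    return f"The top three topics are: {topics}"


print(format_top_topics(["AI", "space", "sports", "music"]))
# prints: The top three topics are: ['AI', 'space', 'sports']
```

Keeping the tool method as a thin wrapper around a plain function like this is one way to make the tool's behavior easy to unit test.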
Turn detection
The Grok Voice Agent API includes built-in VAD-based turn detection, enabled by default with optimized settings:
from livekit.agents import AgentSession
from livekit.plugins import xai
from openai.types.beta.realtime.session import TurnDetection

session = AgentSession(
    llm=xai.realtime.RealtimeModel(
        turn_detection=TurnDetection(
            type="server_vad",
            threshold=0.5,
            prefix_padding_ms=300,
            silence_duration_ms=200,
            create_response=True,
            interrupt_response=True,
        )
    ),
)
threshold: Higher values require louder audio to activate, which works better in noisy environments.
prefix_padding_ms: Amount of audio to include before detected speech.
silence_duration_ms: Duration of silence required to detect the end of speech (shorter values give faster turn detection).
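To build intuition for how these three settings interact, here is a simplified, self-contained simulation. It is an illustrative sketch only and is not xAI's actual VAD implementation: it assumes one speech-probability value per fixed-length frame, and the names VadSettings and detect_turns are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VadSettings:
    # Defaults mirror the documented server VAD defaults.
    threshold: float = 0.5
    prefix_padding_ms: int = 300
    silence_duration_ms: int = 200


def detect_turns(levels, settings=None, frame_ms=100):
    """Return (start_ms, end_ms) spans for each detected speech turn.

    levels: one speech-probability value in [0, 1] per frame of frame_ms.
    """
    settings = settings or VadSettings()
    turns = []
    start = None      # start of the current turn, including prefix padding
    silence_ms = 0    # consecutive silence inside the current turn
    last_end = 0      # end of the previous turn, so padding never overlaps it
    for i, level in enumerate(levels):
        t = i * frame_ms
        if level >= settings.threshold:
            if start is None:
                # Include audio from just before speech was detected.
                start = max(last_end, t - settings.prefix_padding_ms)
            silence_ms = 0
        elif start is not None:
            silence_ms += frame_ms
            if silence_ms >= settings.silence_duration_ms:
                # Enough silence: the turn ended where the silence began.
                last_end = t + frame_ms - silence_ms
                turns.append((start, last_end))
                start, silence_ms = None, 0
    if start is not None:
        # Input ended mid-turn; close the turn at the last speech frame.
        turns.append((start, len(levels) * frame_ms - silence_ms))
    return turns


levels = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.2, 0.9, 0.1, 0.1, 0.1]
print(detect_turns(levels))
# prints: [(100, 800)]
print(detect_turns(levels, VadSettings(silence_duration_ms=100)))
# prints: [(100, 600), (600, 800)]
print(detect_turns(levels, VadSettings(threshold=0.95)))
# prints: []
```

In this toy model, lowering silence_duration_ms ends turns sooner but splits the speech at a brief pause, while raising threshold makes the detector ignore quieter audio entirely.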
Additional resources
The following resources provide more information about using xAI with LiveKit Agents.