xAI Grok Voice Agent API plugin

How to use xAI's Grok Voice Agent API with LiveKit Agents.

Available in: Python

Overview

The Grok Voice Agent API enables low-latency, two-way voice interactions using Grok models. LiveKit's xAI plugin includes a RealtimeModel class that allows you to create agents with natural, human-like voice conversations.

The Grok Voice Agent API is compatible with OpenAI's Realtime API.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the xAI plugin:

uv add "livekit-agents[xai]"

Authentication

The xAI plugin requires an xAI API key.

Set XAI_API_KEY in your .env file.
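To fail fast when the key is missing, a small startup check can help. The following is a minimal sketch, not part of the plugin: the helper name `require_xai_api_key` is hypothetical, and it assumes the .env file has already been loaded into the process environment (for example by a dotenv loader).

```python
import os
from collections.abc import Mapping


def require_xai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the xAI API key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; the plugin itself reads XAI_API_KEY.
    """
    key = env.get("XAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "XAI_API_KEY is not set; add it to your .env file or environment."
        )
    return key
```

Calling `require_xai_api_key()` once at startup surfaces a missing key before the agent tries to connect.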

Usage

Use the Grok Voice Agent API within an AgentSession. For example, you can use it in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import xai

session = AgentSession(
    llm=xai.realtime.RealtimeModel(),
)

Parameters

This section describes some of the available parameters. For a complete reference of all available parameters, see the plugin reference links in the Additional resources section.

voice (str, optional, default: 'ara')

Voice to use for speech generation. For a list of available voices, see Available voices.

api_key (str, required, env: XAI_API_KEY)

xAI API key.

turn_detection (TurnDetection | None, optional)

Configuration for turn detection. Server VAD is enabled by default with the following settings: threshold=0.5, prefix_padding_ms=300, silence_duration_ms=200.

Tools

xAI supports provider tools that enable the model to use built-in capabilities executed on the model server. These tools can be used alongside function tools defined in your agent's codebase.

Available tools include:

  • XSearch: Perform keyword search, semantic search, user search, and thread fetch on X
  • WebSearch: Search the web and browse pages
  • FileSearch: Search uploaded knowledge bases (collections) on xAI

For example, the following code shows an agent that retrieves top trending topics and passes them to a function tool for summarization.

from livekit.agents import Agent, AgentSession, RunContext, function_tool
from livekit.plugins import xai


class MyAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are an AI assistant that has the capability of searching X.",
            llm=xai.realtime.RealtimeModel(),
            tools=[xai.realtime.XSearch()],
        )

    @function_tool
    async def summarize_trending_topics(self, context: RunContext, topics: list[str]) -> str:
        """Summarizes the trending topics, which are provided by other tools.

        Args:
            topics: The trending topics on X
        """
        # Keep only the first three topics.
        if len(topics) > 3:
            topics = topics[:3]
        return f"The top three topics are: {topics}"
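To see what the function tool hands back to the model, the truncation logic can be run on its own. This is a standalone sketch that copies only the body of the tool above, without the agent, the `RunContext`, or the decorator; the topic names are made up for illustration.

```python
def summarize_trending_topics(topics: list[str]) -> str:
    # Same logic as the function tool body, minus the agent context.
    if len(topics) > 3:
        topics = topics[:3]
    return f"The top three topics are: {topics}"


# Illustrative input only; real topics would come from the XSearch provider tool.
print(summarize_trending_topics(["AI", "Space", "Sports", "Music", "Film"]))
# prints: The top three topics are: ['AI', 'Space', 'Sports']
```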

Turn detection

The Grok Voice Agent API includes built-in VAD-based turn detection, which is enabled by default. The following example passes the default settings explicitly:

from livekit.agents import AgentSession
from livekit.plugins import xai
from openai.types.beta.realtime.session import TurnDetection

session = AgentSession(
    llm=xai.realtime.RealtimeModel(
        turn_detection=TurnDetection(
            type="server_vad",
            threshold=0.5,
            prefix_padding_ms=300,
            silence_duration_ms=200,
            create_response=True,
            interrupt_response=True,
        )
    ),
)

  • threshold: Higher values require louder audio to activate, which works better in noisy environments.
  • prefix_padding_ms: Amount of audio, in milliseconds, to include before the detected start of speech.
  • silence_duration_ms: Duration of silence, in milliseconds, required before speech is considered to have stopped. Shorter values produce faster turn detection.
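
To build intuition for how the three settings interact, the following toy simulation applies them to a list of per-frame activation levels. It is a simplified sketch and not xAI's server-side VAD: the names `VadSettings` and `detect_turns`, the fixed frame size, and the 0.0 to 1.0 activation scale are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VadSettings:
    # Defaults mirror the documented server VAD defaults.
    threshold: float = 0.5
    prefix_padding_ms: int = 300
    silence_duration_ms: int = 200


def detect_turns(
    levels: list[float], frame_ms: int, settings: VadSettings
) -> list[tuple[int, int]]:
    """Return (start_ms, end_ms) spans for each detected turn.

    `levels` holds one hypothetical activation value (0.0 to 1.0) per frame.
    """
    turns: list[tuple[int, int]] = []
    start_ms = None  # start of the current turn, including prefix padding
    silence_ms = 0   # trailing silence accumulated inside the current turn
    for i, level in enumerate(levels):
        t = i * frame_ms
        if level >= settings.threshold:
            if start_ms is None:
                # Speech begins: reach back to include audio before activation.
                start_ms = max(0, t - settings.prefix_padding_ms)
            silence_ms = 0
        elif start_ms is not None:
            silence_ms += frame_ms
            if silence_ms >= settings.silence_duration_ms:
                # Enough silence: the turn ended where the silence began.
                turns.append((start_ms, t + frame_ms - silence_ms))
                start_ms = None
                silence_ms = 0
    if start_ms is not None:
        # Input ended mid-turn: close the turn at the last active frame.
        turns.append((start_ms, len(levels) * frame_ms - silence_ms))
    return turns


# 100 ms frames: a short pause at 600 ms sits inside the utterance.
levels = [0.1, 0.1, 0.1, 0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1]
print(detect_turns(levels, 100, VadSettings()))
# prints: [(100, 800)]
print(detect_turns(levels, 100, VadSettings(silence_duration_ms=100)))
# prints: [(100, 600), (400, 800)]
```

With the default 200 ms of required silence, the 100 ms pause is absorbed into one turn. Lowering silence_duration_ms to 100 ends the turn sooner but splits the utterance in two, which is the trade-off behind "shorter = faster turn detection".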

Additional resources

The following resources provide more information about using xAI with LiveKit Agents.