# xAI LLM

> How to use xAI's Grok models with LiveKit Agents.

- **[Use in Agent Builder](https://cloud.livekit.io/projects/p_/agents/builder/new?llm=xai%2Fgrok-4-1-fast-non-reasoning)**: Create a new agent in your browser using xai/grok-4-1-fast-non-reasoning

## Overview

xAI's Grok models are available in LiveKit Agents through [LiveKit Inference](https://docs.livekit.io/agents/models/inference.md) and the [xAI plugin](#plugin). With LiveKit Inference, your agent runs on LiveKit's infrastructure to minimize latency. No separate provider API key is required, and usage and rate limits are managed through LiveKit Cloud. Use the plugin instead if you want to manage your own billing and rate limits. Pricing for LiveKit Inference is available on the [pricing page](https://livekit.com/pricing/inference#llm).

## LiveKit Inference

Use [LiveKit Inference](https://docs.livekit.io/agents/models/inference.md) to access Grok models without a separate xAI API key.

| Model name | Model ID | Providers |
| ---------- | -------- | -------- |
| Grok 4.1 Fast | `xai/grok-4-1-fast-non-reasoning` | `xai` |
| Grok 4.1 Fast Reasoning | `xai/grok-4-1-fast-reasoning` | `xai` |
| Grok 4.20 | `xai/grok-4.20-0309-non-reasoning` | `xai` |
| Grok 4.20 Reasoning | `xai/grok-4.20-0309-reasoning` | `xai` |
| Grok 4.20 Multi-Agent | `xai/grok-4.20-multi-agent-0309` | `xai` |

### Usage

To use Grok, use the `LLM` class from the `inference` module. You can use this LLM in the [Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai.md):

**Python**:

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="xai/grok-4-1-fast-non-reasoning",
        extra_kwargs={
            "max_completion_tokens": 1000
        }
    ),
    # ... tts, stt, vad, turn_handling, etc.
)

```

---

**Node.js**:

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
    llm: new inference.LLM({
        model: "xai/grok-4-1-fast-non-reasoning",
        modelOptions: {
            max_completion_tokens: 1000
        }
    }),
    // ... tts, stt, vad, turnHandling, etc.
});

```

### Parameters

The following are parameters for configuring Grok models with LiveKit Inference. For model behavior parameters like `temperature` and `max_completion_tokens`, see [model parameters](#model-parameters).

- **`model`** _(string)_: The model ID from the [models list](#inference).

- **`provider`** _(string)_ (optional): Set a specific provider to use for the LLM. Refer to the [models list](#inference) for available providers. If not set, LiveKit Inference uses the best available provider, and bills accordingly.

- **`extra_kwargs`** _(dict)_ (optional): Additional parameters to pass to the xAI Chat Completions API, such as `max_completion_tokens` or `temperature`. See [model parameters](#model-parameters) for supported fields.

In Node.js this parameter is called `modelOptions`.

#### Model parameters

Pass the following parameters inside `extra_kwargs` (Python) or `modelOptions` (Node.js). For more details about each parameter in the list, see [Inference parameters](https://docs.livekit.io/reference/agents/inference-llm-parameters.md).

| Parameter | Type | Default | Notes |
| --------- | ---- | ------- | ----- |
| temperature | `float` | `1` | Controls the randomness of the model's output. Valid range: `0`-`2`. |
| top_p | `float` | `1` | Alternative to `temperature`. Model considers the results of the tokens with `top_p` probability mass. Valid range: `0`-`1`. |
| max_completion_tokens | `int` |  | The maximum number of tokens that can be generated in the chat completion. |
| frequency_penalty | `float` | `0` | Positive values decrease the model's likelihood to repeat the same line verbatim. Valid range: `-2.0`-`2.0`. Not supported by reasoning models. |
| presence_penalty | `float` | `0` | Positive values increase the model's likelihood to talk about new topics. Valid range: `-2.0`-`2.0`. Not supported by `grok-3` or reasoning models. |
| stop | `str \| list[str]` |  | Up to 4 string sequences (for example, `["\n"]`) that cause the API to stop generating further tokens. |
| logprobs | `bool` |  | If true, returns the log probabilities of each output token. |
| top_logprobs | `int` |  | Number of most likely tokens to return at each token position with associated log probability. Requires `logprobs: true`. |
| seed | `int` |  | If specified, xAI will make a best effort to sample deterministically for repeated requests with the same seed and parameters. |
| parallel_tool_calls | `bool` | `true` | Whether the model is allowed to call multiple tools simultaneously. |
| tool_choice | `ToolChoice \| Literal['auto', 'required', 'none']` | `"auto"` | Controls how the model uses tools. |
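
The ranges in the table above can be checked before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named `build_model_params` (it is not part of LiveKit Agents) that validates a few of the documented constraints and returns a dict suitable for `extra_kwargs`:

```python
from typing import Any

# Documented valid ranges from the table above: (min, max), inclusive.
_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def build_model_params(**params: Any) -> dict[str, Any]:
    """Hypothetical helper: validate documented constraints, return a dict."""
    for name, (low, high) in _RANGES.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    # `stop` accepts a single string or up to 4 string sequences.
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    # `top_logprobs` is only meaningful when `logprobs` is true.
    if "top_logprobs" in params and not params.get("logprobs"):
        raise ValueError("top_logprobs requires logprobs=True")

    return dict(params)


model_params = build_model_params(temperature=0.7, max_completion_tokens=1000)
```

In Python, the result can then be passed as `extra_kwargs=model_params` when constructing `inference.LLM`.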

### String descriptors

As a shortcut, you can also pass a [model ID](#inference) string directly to the `llm` argument in your `AgentSession`:

**Python**:

```python
from livekit.agents import AgentSession

session = AgentSession(
    llm="xai/grok-4-1-fast-non-reasoning",
    # ... tts, stt, vad, turn_handling, etc.
)

```

---

**Node.js**:

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
    llm: "xai/grok-4-1-fast-non-reasoning",
    // ... tts, stt, vad, turnHandling, etc.
});

```
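
Because the descriptor is a plain string, it can be selected at runtime. The following is a minimal sketch, assuming a hypothetical `pick_grok_model` helper (not a LiveKit API) that maps a reasoning flag to the Grok 4.1 Fast model IDs from the models table above:

```python
# Model IDs copied from the LiveKit Inference models table above.
GROK_4_1_FAST = {
    False: "xai/grok-4-1-fast-non-reasoning",
    True: "xai/grok-4-1-fast-reasoning",
}


def pick_grok_model(reasoning: bool = False) -> str:
    """Hypothetical helper: return the Grok 4.1 Fast descriptor to use."""
    return GROK_4_1_FAST[reasoning]


llm_descriptor = pick_grok_model(reasoning=True)
# The string can then be passed as AgentSession(llm=llm_descriptor, ...).
```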

## Plugin

LiveKit's plugin support for xAI lets you connect directly to xAI's API with your own API key. The Python plugin uses the **Responses API**, which supports xAI provider tools (`WebSearch`, `FileSearch`, `XSearch`).

The Node.js plugin uses xAI's Chat Completions endpoint via the OpenAI plugin. It does not support the Responses API or provider tools.

Available in:
- [x] Node.js
- [x] Python

### Installation

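Install the plugin for your language. In Python, xAI support is installed as the `xai` extra of `livekit-agents`. In Node.js, it is provided by the OpenAI plugin: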

**Python**:

```shell
uv add "livekit-agents[xai]~=1.5"

```

---

**Node.js**:

```shell
pnpm add @livekit/agents-plugin-openai@1.x

```

### Authentication

Set the following environment variable in your `.env` file:

```shell
XAI_API_KEY=<your-xai-api-key>

```
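
It can help to fail fast at startup if the key is missing. The following is a minimal sketch, assuming the variable has already been loaded into the process environment (for example, by a dotenv loader or your process manager); `require_env` is a hypothetical helper, not part of the plugin:

```python
import os


def require_env(name: str) -> str:
    """Hypothetical helper: return the variable's value or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Example call at agent startup:
# api_key = require_env("XAI_API_KEY")
```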

### Usage

Use xAI within an `AgentSession` or as a standalone LLM service. For example, you can use this LLM in the [Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai.md):

**Python**:

```python
from livekit.agents import AgentSession
from livekit.plugins import xai

# Use Responses API (recommended)
session = AgentSession(
    llm=xai.responses.LLM(
        model="grok-4-1-fast-non-reasoning",
    ),
    # ... tts, stt, vad, turn_handling, etc.
)

```

---

**Node.js**:

```typescript
import { voice } from '@livekit/agents';
import * as openai from '@livekit/agents-plugin-openai';

const session = new voice.AgentSession({
    llm: openai.LLM.withXAI({
        model: "grok-3",
    }),
    // ... tts, stt, vad, turnHandling, etc.
});

```

### Parameters

This section describes some of the available parameters. For a complete reference of all available parameters, see the plugin reference links in the [Additional resources](#additional-resources) section.

- **`model`** _(str)_ (optional) - Default: `grok-4-1-fast-non-reasoning`: Grok model to use. To learn more, see the [xAI Grok models](https://docs.x.ai/docs/models) page.

- **`temperature`** _(float)_ (optional) - Default: `1.0`: Sampling temperature that controls the randomness of the model's output. Higher values make the output more random, while lower values make it more focused and deterministic.

Valid values are typically between `0` and `2`, but the range can vary by model. To learn more, see the optional parameters for [Responses](https://docs.x.ai/docs/api-reference#create-new-response).

- **`parallel_tool_calls`** _(bool)_ (optional): Controls whether the model can make multiple tool calls in parallel. When enabled, the model can make multiple tool calls simultaneously, which can improve performance for complex tasks.

- **`tool_choice`** _(ToolChoice | Literal['auto', 'required', 'none'])_ (optional) - Default: `auto`: Controls how the model uses tools. String options are as follows:

- `'auto'`: Let the model decide.
- `'required'`: Force tool usage.
- `'none'`: Disable tool usage.
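
The parameters above can be combined in one constructor call. The following is a minimal sketch, assuming the Python plugin is installed and that `xai.responses.LLM` accepts these parameters as keyword arguments, as the list above suggests. The import is deferred into a function so the settings can be inspected without the plugin present, and `make_llm` is a hypothetical name:

```python
# Settings drawn from the parameter list above; the values are illustrative.
LLM_SETTINGS = {
    "model": "grok-4-1-fast-non-reasoning",
    "temperature": 0.7,           # documented range: 0 to 2
    "parallel_tool_calls": True,  # allow multiple tool calls in parallel
    "tool_choice": "auto",        # let the model decide when to use tools
}


def make_llm():
    """Hypothetical factory: build the plugin LLM from LLM_SETTINGS."""
    from livekit.plugins import xai  # requires the xAI plugin to be installed

    return xai.responses.LLM(**LLM_SETTINGS)
```

Call `make_llm()` and pass the result as the `llm` argument of `AgentSession`.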

### Provider tools

Available in:
- [ ] Node.js
- [x] Python

xAI supports the following [provider tools](https://docs.livekit.io/agents/logic/tools.md#provider-tools) that enable the model to use built-in capabilities executed on the model server. These tools can be used alongside function tools defined in your agent's codebase. Provider tools work with both the Responses API and the [Grok Voice Agent API](https://docs.livekit.io/agents/models/realtime/plugins/xai.md).

| Tool | Description | Parameters |
| ---- | ----------- | ---------- |
| `XSearch` | Search X (Twitter) posts. | `allowed_x_handles` |
| `WebSearch` | Search the web and browse pages. | None |
| `FileSearch` | Search uploaded document [collections](https://docs.x.ai/docs/key-information/collections). | `vector_store_ids` (required), `max_num_results` |

```python
from livekit.plugins import xai

# MyAgent is a placeholder for your own agent class.
agent = MyAgent(
    llm=xai.responses.LLM(),
    tools=[xai.XSearch(), xai.WebSearch()],  # replace with any supported provider tool
)

```
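
Tools that take parameters are configured when they are constructed. The following is a minimal sketch, assuming the tool classes accept the parameters from the table above as keyword arguments. The handle and collection ID are placeholders, and `make_provider_tools` is a hypothetical name:

```python
# Placeholder values; replace with your own handle and collection ID.
X_HANDLES = ["livekit"]
VECTOR_STORE_IDS = ["<your-collection-id>"]


def make_provider_tools() -> list:
    """Hypothetical factory: build provider tools for the Responses API LLM."""
    from livekit.plugins import xai  # requires the xAI plugin to be installed

    return [
        # Assumed keyword names, taken from the Parameters column above.
        xai.XSearch(allowed_x_handles=X_HANDLES),
        xai.FileSearch(vector_store_ids=VECTOR_STORE_IDS, max_num_results=5),
    ]
```

The returned list can be passed as the `tools` argument alongside any function tools defined in your agent.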

## Additional resources

The following links provide more information about the xAI Grok LLM integration.

- **[xAI docs](https://docs.x.ai/docs/overview)**: xAI Grok documentation.

- **[Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai.md)**: Get started with LiveKit Agents and xAI Grok.

- **[Grok Voice Agent API](https://docs.livekit.io/agents/models/realtime/plugins/xai.md)**: Use Grok Voice Agent API for low-latency voice interactions.

---

This document was rendered at 2026-06-07T11:36:47.827Z.
For the latest version of this document, see [https://docs.livekit.io/agents/models/llm/xai.md](https://docs.livekit.io/agents/models/llm/xai.md).