# Kimi LLM

> How to use Kimi models with LiveKit Agents.

- **[Use in Agent Builder](https://cloud.livekit.io/projects/p_/agents/builder/new?llm=moonshotai%2Fkimi-k2.5)**: Create a new agent in your browser using moonshotai/kimi-k2.5

## Overview

Kimi models are available in LiveKit Agents through [LiveKit Inference](https://docs.livekit.io/agents/models/inference.md). Pricing is available on the [pricing page](https://livekit.com/pricing/inference#llm).

## LiveKit Inference

Use [LiveKit Inference](https://docs.livekit.io/agents/models/inference.md) to access Kimi models without a separate Moonshot API key.

| Model name | Model ID | Providers |
| ---------- | -------- | -------- |
| Kimi K2 Instruct | `moonshotai/kimi-k2-instruct` | `baseten` |
| Kimi K2.5 | `moonshotai/kimi-k2.5` | `baseten` |

## Usage

To use Kimi, pass an instance of the `LLM` class from the `inference` module to your `AgentSession`. You can use this LLM in the [Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai.md):

**Python**:

```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="moonshotai/kimi-k2.5",
        provider="baseten",
        extra_kwargs={
            "max_completion_tokens": 1000,
        },
    ),
    # ... tts, stt, vad, turn_handling, etc.
)
```

---

**Node.js**:

```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
    llm: new inference.LLM({
        model: "moonshotai/kimi-k2.5",
        provider: "baseten",
        modelOptions: {
            max_completion_tokens: 1000,
        },
    }),
    // ... tts, stt, vad, turnHandling, etc.
});
```

### Parameters

The following are parameters for configuring Kimi models with LiveKit Inference. For model behavior parameters like `temperature` and `max_completion_tokens`, see [model parameters](#model-parameters).

- **`model`** _(string)_: The model ID from the [models list](#inference).

- **`provider`** _(string)_ (optional): Set a specific provider to use for the LLM. Refer to the [models list](#inference) for available providers. If not set, LiveKit Inference uses the best available provider, and bills accordingly.

- **`extra_kwargs`** _(dict)_ (optional): Additional parameters to pass to the Kimi Chat Completions API, such as `max_completion_tokens` or `temperature`. See [model parameters](#model-parameters) for supported fields.

In Node.js this parameter is called `modelOptions`.
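
As a rough illustration of how these three parameters fit together, the sketch below assembles the keyword arguments for `inference.LLM` and rejects model IDs or providers that do not appear in the models table on this page. The helper `kimi_llm_kwargs` and the `KIMI_MODELS` mapping are hypothetical names introduced for this example. They are not part of LiveKit Agents, and the block itself runs without the SDK installed.

```python
from typing import Any, Optional

# Model IDs and providers copied from the models table on this page.
# KIMI_MODELS is a hypothetical name used only in this sketch.
KIMI_MODELS: dict[str, tuple[str, ...]] = {
    "moonshotai/kimi-k2-instruct": ("baseten",),
    "moonshotai/kimi-k2.5": ("baseten",),
}


def kimi_llm_kwargs(
    model: str, provider: Optional[str] = None, **model_params: Any
) -> dict[str, Any]:
    """Build keyword arguments for inference.LLM (hypothetical helper)."""
    if model not in KIMI_MODELS:
        raise ValueError(f"unknown Kimi model ID: {model}")
    kwargs: dict[str, Any] = {"model": model}
    # provider is optional; leave it out so LiveKit Inference can choose one.
    if provider is not None:
        if provider not in KIMI_MODELS[model]:
            raise ValueError(f"{provider} is not listed for {model}")
        kwargs["provider"] = provider
    # Model behavior parameters travel inside extra_kwargs.
    if model_params:
        kwargs["extra_kwargs"] = dict(model_params)
    return kwargs


kwargs = kimi_llm_kwargs(
    "moonshotai/kimi-k2.5", provider="baseten", max_completion_tokens=1000
)
# With livekit-agents installed, this could be passed on as:
#   llm = inference.LLM(**kwargs)
```

Omitting `provider` leaves the key out of the result entirely, which matches the documented behavior of letting LiveKit Inference pick a provider.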

#### Model parameters

Pass the following parameters inside `extra_kwargs` (Python) or `modelOptions` (Node.js). For more details about each parameter in the list, see [Inference parameters](https://docs.livekit.io/reference/agents/inference-llm-parameters.md).

| Parameter | Type | Default | Notes |
| --------- | ---- | ------- | ----- |
| temperature | `float` | Varies by model | Valid range: `0`–`1`. Fixed at `0.6` for `kimi-k2` and cannot be modified for `kimi-k2.5`. |
| top_p | `float` | Varies by model | Valid range: `0`–`1`. Fixed at `0.95` for `kimi-k2.5` and cannot be modified. |
| max_completion_tokens | `int` | `1024` | Maximum number of tokens to generate. Must not exceed the remaining token budget after inputs. |
| frequency_penalty | `float` | `0` | Reduces the model's likelihood to repeat the same line verbatim. Valid range: `-2.0`–`2.0`. Fixed for `kimi-k2.5`. |
| presence_penalty | `float` | `0` | Increases the model's likelihood to talk about new topics. Valid range: `-2.0`–`2.0`. Fixed for `kimi-k2.5`. |
| stop | `str \| list[str]` |  | Sequences that stop generation. Up to 5 sequences, each up to 32 bytes. |
| n | `int` | `1` | Number of completions to generate. Valid range: `1`–`5`. Fixed at `1` for `kimi-k2.5`. |
| prompt_cache_key | `str` |  | Key for caching responses for similar requests. |
| safety_identifier | `str` |  | Hashed user identifier for policy violation monitoring. |
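
The ranges and limits in the table can be checked locally before a request is sent. The following is a minimal sketch under stated assumptions: `check_model_params` and the constants around it are hypothetical names, not part of LiveKit Agents, and the set of parameters treated as fixed for `kimi-k2.5` is one reading of the Notes column rather than a confirmed API contract.

```python
from typing import Any

# Inclusive (min, max) ranges copied from the table above.
RANGES: dict[str, tuple[float, float]] = {
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "n": (1, 5),
}

# Assumption: parameters the table describes as fixed or unmodifiable for kimi-k2.5.
FIXED_FOR_K2_5 = {"temperature", "top_p", "frequency_penalty", "presence_penalty", "n"}

MAX_STOP_SEQUENCES = 5
MAX_STOP_BYTES = 32


def check_model_params(model: str, params: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems: list[str] = []
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside {low}..{high}")
        if model.endswith("kimi-k2.5") and name in FIXED_FOR_K2_5:
            problems.append(f"{name} is fixed for kimi-k2.5")
    stop = params.get("stop")
    if stop is not None:
        # stop may be a single string or a list of strings.
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > MAX_STOP_SEQUENCES:
            problems.append(
                f"stop has {len(sequences)} sequences, limit is {MAX_STOP_SEQUENCES}"
            )
        for seq in sequences:
            # The limit is stated in bytes, so measure the UTF-8 encoding.
            if len(seq.encode("utf-8")) > MAX_STOP_BYTES:
                problems.append(f"stop sequence {seq!r} exceeds {MAX_STOP_BYTES} bytes")
    return problems
```

For example, `check_model_params("moonshotai/kimi-k2.5", {"top_p": 0.5})` flags `top_p` as fixed, while the same parameters with `moonshotai/kimi-k2-instruct` pass. Parameters without a documented range, such as `max_completion_tokens`, are passed through unchecked.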

### String descriptors

As a shortcut, you can also pass a [model ID](#inference) directly to the `llm` argument in your `AgentSession`:

**Python**:

```python
from livekit.agents import AgentSession

session = AgentSession(
    llm="moonshotai/kimi-k2.5",
    # ... tts, stt, vad, turn_handling, etc.
)

```

---

**Node.js**:

```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
    llm: "moonshotai/kimi-k2.5",
    // ... tts, stt, vad, turnHandling, etc.
});

```

## Additional resources

The following links provide more information about Kimi in LiveKit Inference.

- **[Baseten Plugin](https://docs.livekit.io/agents/models/llm/baseten.md)**: Plugin to use your own Baseten account instead of LiveKit Inference.

- **[Baseten docs](https://docs.baseten.co/development/model-apis/overview)**: Baseten's official Model API documentation.

---

This document was rendered at 2026-06-07T11:35:26.228Z.
For the latest version of this document, see [https://docs.livekit.io/agents/models/llm/kimi.md](https://docs.livekit.io/agents/models/llm/kimi.md).

To explore all LiveKit documentation, see [llms.txt](https://docs.livekit.io/llms.txt).