Kimi LLM

How to use Kimi models with LiveKit Agents.

Overview

Kimi models are available in LiveKit Agents through LiveKit Inference. Pricing is available on the pricing page.

LiveKit Inference

Use LiveKit Inference to access Kimi models without a separate Moonshot API key.

Model name | Model ID | Providers
Kimi K2.5 | moonshotai/kimi-k2.5 | baseten
Kimi K2 Instruct (Retired) | moonshotai/kimi-k2-instruct | baseten

Retired models
Retired models are no longer accessible. If you're using a retired model, switch to a currently available model.
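
One way to guard against a retired model ID in configuration is a small lookup that runs before the session is built. The sketch below is only an illustration: the model IDs and their status come from the table above, while `KIMI_MODELS`, `DEFAULT_MODEL`, and `resolve_kimi_model` are hypothetical names, not part of LiveKit Agents.

```python
import warnings

# Model IDs and status copied from the models table above.
# The mapping and helper are hypothetical conveniences, not LiveKit APIs.
KIMI_MODELS = {
    "moonshotai/kimi-k2.5": "available",
    "moonshotai/kimi-k2-instruct": "retired",
}

# Assumption: fall back to the only model listed as currently available.
DEFAULT_MODEL = "moonshotai/kimi-k2.5"


def resolve_kimi_model(model_id: str) -> str:
    """Return a usable model ID, replacing retired IDs with DEFAULT_MODEL."""
    status = KIMI_MODELS.get(model_id)
    if status is None:
        raise ValueError(f"unknown Kimi model ID: {model_id!r}")
    if status == "retired":
        # Retired models are no longer accessible, so warn and switch.
        warnings.warn(f"{model_id} is retired; using {DEFAULT_MODEL} instead")
        return DEFAULT_MODEL
    return model_id
```

The returned string can then be passed as the `model` argument shown in the Usage section. Raising an error instead of falling back is an equally reasonable choice if a silent model change is undesirable.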

Usage

To use Kimi, use the LLM class from the inference module. You can use this LLM in the Voice AI quickstart:

Python:

from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="moonshotai/kimi-k2.5",
        provider="baseten",
        extra_kwargs={
            "max_completion_tokens": 1000
        }
    ),
    # ... tts, stt, vad, turn_handling, etc.
)

Node.js:

import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  llm: new inference.LLM({
    model: "moonshotai/kimi-k2.5",
    provider: "baseten",
    modelOptions: {
      max_completion_tokens: 1000
    }
  }),
  // ... tts, stt, vad, turnHandling, etc.
});

Parameters

The following are parameters for configuring Kimi models with LiveKit Inference. For model behavior parameters like temperature and max_completion_tokens, see model parameters.

model (string, required)

The model ID from the models list.

provider (string)

Set a specific provider to use for the LLM. Refer to the models list for available providers. If not set, LiveKit Inference uses the best available provider, and bills accordingly.

extra_kwargs (dict)

Additional parameters to pass to the Kimi Chat Completions API, such as max_completion_tokens or temperature. See model parameters for supported fields.

In Node.js this parameter is called modelOptions.
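
Since `provider` and `extra_kwargs` are both optional, it can help to assemble the constructor arguments in one place and leave out anything that is not set. The following is a minimal sketch under that assumption; `build_llm_kwargs` is a hypothetical helper, and only the `model`, `provider`, and `extra_kwargs` names come from this page.

```python
from typing import Any, Optional


def build_llm_kwargs(
    model: str,
    provider: Optional[str] = None,
    extra_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build keyword arguments for the LLM class, omitting unset options."""
    kwargs: dict[str, Any] = {"model": model}
    if provider is not None:
        # Omitting provider leaves the choice to LiveKit Inference.
        kwargs["provider"] = provider
    if extra_kwargs:
        # Copy so later changes to the caller's dict do not leak in.
        kwargs["extra_kwargs"] = dict(extra_kwargs)
    return kwargs
```

With `livekit.agents` installed, the result would be unpacked into the constructor shown in the Usage section, for example `inference.LLM(**build_llm_kwargs("moonshotai/kimi-k2.5"))`.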

Model parameters

Pass the following parameters inside extra_kwargs (Python) or modelOptions (Node.js). For more details about each parameter in the list, see Inference parameters.

Parameter | Type | Default | Notes
temperature | float | Varies by model | Valid range: 0 to 1. Fixed at 0.6 for kimi-k2 and cannot be modified for kimi-k2.5.
top_p | float | Varies by model | Valid range: 0 to 1. Fixed at 0.95 for kimi-k2.5 and cannot be modified.
max_completion_tokens | int | 1024 | Maximum number of tokens to generate. Must not exceed the remaining token budget after inputs.
frequency_penalty | float | 0 | Reduces the model's likelihood to repeat the same line verbatim. Valid range: -2.0 to 2.0. Fixed for kimi-k2.5.
presence_penalty | float | 0 | Increases the model's likelihood to talk about new topics. Valid range: -2.0 to 2.0. Fixed for kimi-k2.5.
stop | str or list[str] | | Sequences that stop generation. Up to 5 sequences, each up to 32 bytes.
n | int | 1 | Number of completions to generate. Valid range: 1 to 5. Fixed at 1 for kimi-k2.5.
prompt_cache_key | str | | Key for caching responses for similar requests.
safety_identifier | str | | Hashed user identifier for policy violation monitoring.
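
The ranges and limits in this table can be checked locally before the options are sent. The sketch below is a hypothetical pre-flight check, not a LiveKit or Moonshot API: `check_kimi_options` and its constants are made-up names, and the limits are transcribed from the table on the assumption that its ranges are inclusive.

```python
from typing import Any

# Ranges transcribed from the parameter table; assumed to be inclusive.
_RANGES = {
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "n": (1, 5),
}

# Parameters the table describes as fixed for kimi-k2.5.
_FIXED_FOR_K25 = {"temperature", "top_p", "frequency_penalty", "presence_penalty", "n"}


def check_kimi_options(model: str, options: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: list[str] = []
    for name, value in options.items():
        if name in _RANGES:
            low, high = _RANGES[name]
            if not (low <= value <= high):
                problems.append(f"{name}={value} is outside {low}..{high}")
        if model.endswith("kimi-k2.5") and name in _FIXED_FOR_K25:
            problems.append(f"{name} is fixed for kimi-k2.5")
    stop = options.get("stop")
    if stop is not None:
        # stop may be a single string or a list of strings.
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > 5:
            problems.append("stop accepts at most 5 sequences")
        for seq in sequences:
            if len(seq.encode("utf-8")) > 32:
                problems.append(f"stop sequence {seq!r} exceeds 32 bytes")
    return problems
```

A caller might log the returned problems and drop the offending keys before building `extra_kwargs`. The check does not cover the `max_completion_tokens` budget, which depends on the input length and cannot be verified from the options alone.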

String descriptors

As a shortcut, you can also pass a model ID directly to the llm argument in your AgentSession:

Python:

from livekit.agents import AgentSession

session = AgentSession(
    llm="moonshotai/kimi-k2.5",
    # ... tts, stt, vad, turn_handling, etc.
)

Node.js:

import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  llm: "moonshotai/kimi-k2.5",
  // ... tts, stt, vad, turnHandling, etc.
});

Additional resources

The following links provide more information about Kimi in LiveKit Inference.