Overview
Kimi models are available in LiveKit Agents through LiveKit Inference. Pricing is available on the pricing page.
LiveKit Inference
Use LiveKit Inference to access Kimi models without a separate Moonshot API key.
| Model name | Model ID | Providers |
|---|---|---|
| Kimi K2.5 | moonshotai/kimi-k2.5 | baseten |
| Kimi K2 Instruct (retired) | moonshotai/kimi-k2-instruct | baseten |
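Because one of the listed models is retired, it can help to check a model ID before building a session. The following is a minimal sketch, not part of the LiveKit SDK: `KIMI_MODELS` and `check_model` are hypothetical names, and the dictionary is only a hand-copied mirror of the table above that you would need to keep in sync yourself.

```python
from typing import Optional

# Hypothetical local mirror of the models table; not provided by LiveKit.
KIMI_MODELS = {
    "moonshotai/kimi-k2.5": {
        "name": "Kimi K2.5",
        "providers": ["baseten"],
        "retired": False,
    },
    "moonshotai/kimi-k2-instruct": {
        "name": "Kimi K2 Instruct",
        "providers": ["baseten"],
        "retired": True,
    },
}


def check_model(model_id: str, provider: Optional[str] = None) -> str:
    """Return model_id if it is listed, not retired, and served by provider."""
    entry = KIMI_MODELS.get(model_id)
    if entry is None:
        raise ValueError(f"unknown Kimi model ID: {model_id!r}")
    if entry["retired"]:
        raise ValueError(f"{entry['name']} ({model_id}) is retired")
    if provider is not None and provider not in entry["providers"]:
        raise ValueError(f"{provider!r} is not listed for {model_id}")
    return model_id
```

Calling `check_model("moonshotai/kimi-k2.5", "baseten")` returns the ID unchanged, while the retired ID raises `ValueError`.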
Usage
To use Kimi, use the LLM class from the inference module. You can use this LLM in the Voice AI quickstart:
Python:

    from livekit.agents import AgentSession, inference

    session = AgentSession(
        llm=inference.LLM(
            model="moonshotai/kimi-k2.5",
            provider="baseten",
            extra_kwargs={"max_completion_tokens": 1000},
        ),
        # ... tts, stt, vad, turn_handling, etc.
    )

Node.js:

    import { AgentSession, inference } from '@livekit/agents';

    const session = new AgentSession({
      llm: new inference.LLM({
        model: "moonshotai/kimi-k2.5",
        provider: "baseten",
        modelOptions: { max_completion_tokens: 1000 },
      }),
      // ... tts, stt, vad, turnHandling, etc.
    });
Parameters
The following are parameters for configuring Kimi models with LiveKit Inference. For model behavior parameters like temperature and max_completion_tokens, see model parameters.
model (string): The model ID from the models list.

provider (string): Set a specific provider to use for the LLM. Refer to the models list for available providers. If not set, LiveKit Inference uses the best available provider and bills accordingly.

extra_kwargs (dict): Additional parameters to pass to the Kimi Chat Completions API, such as max_completion_tokens or temperature. See model parameters for supported fields. In Node.js this parameter is called modelOptions.
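To illustrate how the three parameters fit together, here is a minimal sketch that assembles keyword arguments for `inference.LLM`. The helper name `kimi_llm_kwargs` is hypothetical and not part of the SDK; the only assumption about LiveKit is that `inference.LLM` accepts `model`, `provider`, and `extra_kwargs` as keywords, as in the usage example above.

```python
from typing import Any, Optional


def kimi_llm_kwargs(
    model: str,
    provider: Optional[str] = None,
    **model_params: Any,
) -> dict[str, Any]:
    """Build keyword arguments for inference.LLM (hypothetical helper)."""
    kwargs: dict[str, Any] = {"model": model}
    if provider is not None:
        # Pin a provider; when omitted, LiveKit Inference chooses one.
        kwargs["provider"] = provider
    if model_params:
        # Model behavior parameters travel inside extra_kwargs.
        kwargs["extra_kwargs"] = dict(model_params)
    return kwargs


# Provider pinned, with one model parameter.
pinned = kimi_llm_kwargs(
    "moonshotai/kimi-k2.5", provider="baseten", max_completion_tokens=1000
)
# Provider left for LiveKit Inference to select.
auto = kimi_llm_kwargs("moonshotai/kimi-k2.5")

# With livekit-agents installed, either dict could then be unpacked:
#   llm = inference.LLM(**pinned)
```

Leaving `provider` out of the dictionary, rather than passing `None`, keeps the call equivalent to not setting the parameter at all.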
Model parameters
Pass the following parameters inside extra_kwargs (Python) or modelOptions (Node.js). For more details about each parameter in the list, see Inference parameters.
| Parameter | Type | Default | Notes |
|---|---|---|---|
| temperature | float | Varies by model | Valid range: 0–1. Fixed at 0.6 for kimi-k2 and cannot be modified for kimi-k2.5. |
| top_p | float | Varies by model | Valid range: 0–1. Fixed at 0.95 for kimi-k2.5 and cannot be modified. |
| max_completion_tokens | int | 1024 | Maximum number of tokens to generate. Must not exceed the remaining token budget after inputs. |
| frequency_penalty | float | 0 | Reduces the model's likelihood to repeat the same line verbatim. Valid range: -2.0–2.0. Fixed for kimi-k2.5. |
| presence_penalty | float | 0 | Increases the model's likelihood to talk about new topics. Valid range: -2.0–2.0. Fixed for kimi-k2.5. |
| stop | str or list[str] | | Sequences that stop generation. Up to 5 sequences, each up to 32 bytes. |
| n | int | 1 | Number of completions to generate. Valid range: 1–5. Fixed at 1 for kimi-k2.5. |
| prompt_cache_key | str | | Key for caching responses for similar requests. |
| safety_identifier | str | | Hashed user identifier for policy violation monitoring. |
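The ranges in the table can be checked locally before a request is sent. The sketch below is one possible client-side check, not something LiveKit provides: `validate_model_params`, `RANGES`, and `FIXED_FOR_K25` are hypothetical names, the byte limit on stop sequences is assumed to be counted in UTF-8, and treating temperature as fixed for kimi-k2.5 is this sketch's reading of the table.

```python
from typing import Any

# Inclusive (min, max) ranges copied from the table above.
RANGES = {
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "n": (1, 5),
}
# Parameters the table describes as fixed for kimi-k2.5 (assumed reading).
FIXED_FOR_K25 = {"temperature", "top_p", "frequency_penalty", "presence_penalty", "n"}


def validate_model_params(model: str, params: dict[str, Any]) -> dict[str, Any]:
    """Raise ValueError if params contradict the table; otherwise return a copy."""
    for name, value in params.items():
        if model.endswith("kimi-k2.5") and name in FIXED_FOR_K25:
            raise ValueError(f"{name} is fixed for kimi-k2.5")
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}..{high}")
        if name == "stop":
            # Accept a single string or a list of strings.
            sequences = [value] if isinstance(value, str) else list(value)
            if len(sequences) > 5:
                raise ValueError("stop accepts at most 5 sequences")
            # Assumption: the 32-byte limit is measured on UTF-8 bytes.
            if any(len(s.encode("utf-8")) > 32 for s in sequences):
                raise ValueError("each stop sequence is limited to 32 bytes")
    return dict(params)
```

The returned copy can be passed as `extra_kwargs` (Python) once it validates. Parameters the sketch does not know about, such as `max_completion_tokens`, pass through unchecked.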
String descriptors
As a shortcut, you can also pass a model ID directly to the llm argument in your AgentSession:
Python:

    from livekit.agents import AgentSession

    session = AgentSession(
        llm="moonshotai/kimi-k2.5",
        # ... tts, stt, vad, turn_handling, etc.
    )

Node.js:

    import { AgentSession } from '@livekit/agents';

    const session = new AgentSession({
      llm: "moonshotai/kimi-k2.5",
      // ... tts, stt, vad, turnHandling, etc.
    });
Additional resources
The following links provide more information about Kimi in LiveKit Inference.