Overview
Kimi models are available in LiveKit Agents through LiveKit Inference. Pricing is available on the pricing page.
LiveKit Inference
Use LiveKit Inference to access Kimi models without a separate Moonshot API key.
| Model name | Model ID | Providers |
|---|---|---|
| Kimi K2.5 | moonshotai/kimi-k2.5 | baseten |
| Kimi K2 Instruct (retired) | moonshotai/kimi-k2-instruct | baseten |
Usage
To use Kimi models, create an LLM instance from the inference module and pass it to your AgentSession. You can use this LLM in the Voice AI quickstart:
```python
from livekit.agents import AgentSession, inference

session = AgentSession(
    llm=inference.LLM(
        model="moonshotai/kimi-k2.5",
        provider="baseten",
        extra_kwargs={"max_completion_tokens": 1000},
    ),
    # ... tts, stt, vad, turn_handling, etc.
)
```
```typescript
import { AgentSession, inference } from '@livekit/agents';

const session = new AgentSession({
  llm: new inference.LLM({
    model: 'moonshotai/kimi-k2.5',
    provider: 'baseten',
    modelOptions: { max_completion_tokens: 1000 },
  }),
  // ... tts, stt, vad, turnHandling, etc.
});
```
Parameters
The following are parameters for configuring Kimi models with LiveKit Inference. For model behavior parameters like temperature and max_completion_tokens, see model parameters.
- model (string): The model ID from the models list.
- provider (string): Set a specific provider to use for the LLM. Refer to the models list for available providers. If not set, LiveKit Inference uses the best available provider, and bills accordingly.
- extra_kwargs (dict): Additional parameters to pass to the Kimi Chat Completions API, such as max_tokens or temperature. See model parameters for supported fields. In Node.js, this parameter is called modelOptions.
Model parameters
Pass the following parameters inside extra_kwargs (Python) or modelOptions (Node.js). For more details about each parameter in the list, see Inference parameters.
| Parameter | Type | Default | Notes |
|---|---|---|---|
| temperature | float | Varies by model | Valid range: 0–1. Fixed at 0.6 for kimi-k2; cannot be modified for kimi-k2.5. |
top_p | float | Varies by model | Valid range: 0–1. Fixed at 0.95 for kimi-k2.5 and cannot be modified. |
max_completion_tokens | int | 1024 | Maximum number of tokens to generate. Must not exceed the remaining token budget after inputs. |
frequency_penalty | float | 0 | Reduces the model's likelihood to repeat the same line verbatim. Valid range: -2.0–2.0. Fixed for kimi-k2.5. |
presence_penalty | float | 0 | Increases the model's likelihood to talk about new topics. Valid range: -2.0–2.0. Fixed for kimi-k2.5. |
| stop | str or list[str] | — | Sequences that stop generation. Up to 5 sequences, each up to 32 bytes. |
n | int | 1 | Number of completions to generate. Valid range: 1–5. Fixed at 1 for kimi-k2.5. |
| prompt_cache_key | str | — | Key for caching responses for similar requests. |
| safety_identifier | str | — | Hashed user identifier for policy violation monitoring. |
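The ranges in the table above can be sketched as a small validation helper that checks an extra_kwargs dict before it is passed to the LLM. This is a hypothetical function, not part of the LiveKit SDK; it only encodes the documented constraints for the mutable parameters:

```python
# Hypothetical helper (not part of the LiveKit SDK) that checks the
# documented ranges for Kimi model parameters before they are passed
# as extra_kwargs.
def validate_kimi_params(params: dict) -> dict:
    # Valid ranges from the model parameters table
    ranges = {
        "temperature": (0.0, 1.0),
        "top_p": (0.0, 1.0),
        "frequency_penalty": (-2.0, 2.0),
        "presence_penalty": (-2.0, 2.0),
        "n": (1, 5),
    }
    for name, (lo, hi) in ranges.items():
        if name in params and not lo <= params[name] <= hi:
            raise ValueError(f"{name} must be in [{lo}, {hi}]")

    # stop accepts a string or a list of up to 5 sequences, each at most 32 bytes
    stop = params.get("stop")
    if stop is not None:
        sequences = [stop] if isinstance(stop, str) else stop
        if len(sequences) > 5:
            raise ValueError("at most 5 stop sequences are allowed")
        if any(len(s.encode("utf-8")) > 32 for s in sequences):
            raise ValueError("each stop sequence must be at most 32 bytes")
    return params
```

A validated dict can then be passed directly, for example as extra_kwargs=validate_kimi_params({"max_completion_tokens": 512, "temperature": 0.6}). Note that parameters marked as fixed for a given model (such as temperature for kimi-k2.5) are rejected or ignored by the provider even if they fall within the documented range.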
String descriptors
As a shortcut, you can also pass a model ID directly to the llm argument in your AgentSession:
```python
from livekit.agents import AgentSession

session = AgentSession(
    llm="moonshotai/kimi-k2.5",
    # ... tts, stt, vad, turn_handling, etc.
)
```
```typescript
import { AgentSession } from '@livekit/agents';

const session = new AgentSession({
  llm: 'moonshotai/kimi-k2.5',
  // ... tts, stt, vad, turnHandling, etc.
});
```
Additional resources
The following links provide more information about Kimi in LiveKit Inference.