
Cerebras LLM integration guide

How to use Cerebras inference with LiveKit Agents.

Available in: Python | Node.js

Overview

Cerebras provides access to Llama 3.1 and 3.3 models through their inference API. These models are multilingual and text-only, making them suitable for a variety of agent applications.

Usage

Install the OpenAI plugin to add Cerebras support:

pip install "livekit-agents[openai]~=1.2"

Set the following environment variable in your .env file:

CEREBRAS_API_KEY=<your-cerebras-api-key>
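
If you launch the agent script directly with Python, the .env file is not loaded automatically. A minimal sketch using python-dotenv (an assumption; any environment loader works):

# Load CEREBRAS_API_KEY from .env before constructing the LLM.
# python-dotenv is one common choice, not a requirement of the plugin.
from dotenv import load_dotenv

load_dotenv()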

Create a Cerebras LLM using the with_cerebras method:

from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    llm=openai.LLM.with_cerebras(
        model="llama3.1-8b",
        temperature=0.7,
    ),
    # ... tts, stt, vad, turn_detection, etc.
)
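
For context, here is a minimal sketch of how the session might be wired into an agent worker, assuming the standard LiveKit Agents 1.x entrypoint pattern. The Agent instructions and worker setup below are illustrative, not part of the Cerebras plugin itself:

from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import openai

async def entrypoint(ctx: agents.JobContext):
    # Build the session with a Cerebras-backed LLM.
    session = AgentSession(
        llm=openai.LLM.with_cerebras(model="llama3.1-8b"),
        # ... tts, stt, vad, turn_detection, etc.
    )
    # Start the session in the job's room with a simple agent.
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful assistant."),
    )

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))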

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list.

model: str | CerebrasChatModels (optional, default: llama3.1-8b)

Model to use for inference. To learn more, see supported models.

temperature: float (optional, default: 1.0)

A measure of randomness in output. A lower value results in more predictable output, while a higher value results in more creative output.

Valid values are between 0 and 1.5. To learn more, see the Cerebras documentation.

parallel_tool_calls: bool (optional)

Set to True to allow the model to make multiple tool calls in parallel.

tool_choice: ToolChoice | Literal['auto', 'required', 'none'] (optional, default: 'auto')

Controls whether the model may use tools during response generation: 'auto' lets the model decide, 'required' forces a tool call, and 'none' disables tools.
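
Putting these parameters together, a configuration tuned for tool-heavy workloads might look like the sketch below. The llama3.3-70b model name is an assumption based on Cerebras's Llama 3.3 support; check supported models for exact identifiers:

from livekit.plugins import openai

# Lower temperature for more predictable tool arguments;
# tool_choice="required" forces a tool call on every turn, and
# parallel_tool_calls lets independent calls run in one round trip.
llm = openai.LLM.with_cerebras(
    model="llama3.3-70b",  # assumed identifier; see supported models
    temperature=0.3,
    parallel_tool_calls=True,
    tool_choice="required",
)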

The following links provide more information about the Cerebras LLM integration.

Voice AI quickstart

Get started with LiveKit Agents and Cerebras.