Cerebras LLM integration guide

How to use Cerebras inference with LiveKit Agents.

Overview

Cerebras provides access to Llama 3.1 and 3.3 models through its inference API. These models are multilingual and text-only, making them suitable for a variety of agent applications.

Usage

Install the OpenAI plugin to add Cerebras support:

pip install "livekit-agents[openai]~=1.0"

Set the following environment variable in your .env file:

CEREBRAS_API_KEY=<your-cerebras-api-key>
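If you run the agent outside an environment that loads .env files automatically, load the file yourself before creating the session. A minimal sketch, assuming the python-dotenv package is installed:

from dotenv import load_dotenv

load_dotenv()  # reads CEREBRAS_API_KEY from .env into the process environment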

Create a Cerebras LLM using the with_cerebras method:

from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    llm=openai.LLM.with_cerebras(
        model="llama3.1-8b",
        temperature=0.7,
    ),
    # ... tts, stt, vad, turn_detection, etc.
)
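By default the plugin reads the API key from the CEREBRAS_API_KEY environment variable. If you manage credentials some other way, you can pass the key directly; a minimal sketch, assuming an api_key keyword in the plugin's current signature:

from livekit.plugins import openai

llm = openai.LLM.with_cerebras(
    model="llama3.1-8b",
    api_key="<your-cerebras-api-key>",  # assumed keyword; omit to fall back to CEREBRAS_API_KEY
)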

Parameters

This section describes some of the available parameters. See the plugin reference for a complete list.

model: str | CerebrasChatModels (optional, default: llama3.1-8b)

Model to use for inference. To learn more, see supported models.

temperature: float (optional, default: 1.0)

A measure of randomness in output. A lower value results in more predictable output, while a higher value results in more creative output.

Valid values are between 0 and 1.5. To learn more, see the Cerebras documentation.

parallel_tool_calls: bool (optional)

Set to true to allow the model to make multiple tool calls in parallel.

tool_choice: ToolChoice | Literal['auto', 'required', 'none'] (optional, default: auto)

Specifies whether to use tools during response generation.
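The parameters above can be combined in a single with_cerebras call. A sketch putting the documented options together (the values are illustrative, not recommendations):

from livekit.plugins import openai

llm = openai.LLM.with_cerebras(
    model="llama3.1-8b",        # see supported models for alternatives
    temperature=0.7,            # 0 to 1.5; lower values are more predictable
    parallel_tool_calls=True,   # allow multiple tool calls in one response
    tool_choice="auto",         # let the model decide when to invoke tools
)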

Additional resources

The following links provide more information about the Cerebras LLM integration.

Voice AI quickstart

Get started with LiveKit Agents and Cerebras.