Cerebras and LiveKit

Build voice AI on the world's fastest inference.

Try Cerebras AI

Experience Cerebras's fast inference in a LiveKit-powered voice AI playground.

Cerebras ecosystem support

Cerebras provides high-throughput, low-latency AI inference for open models like Llama and DeepSeek. Cerebras is an OpenAI-compatible LLM provider, and LiveKit Agents provides full support for Cerebras inference via the OpenAI plugin.
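Because the Cerebras API is OpenAI-compatible, any OpenAI-style client can talk to it directly, which is what lets the OpenAI plugin work unchanged. As a rough sketch of what "OpenAI-compatible" means in practice, the snippet below builds a raw chat-completion request with only the standard library; the base URL `https://api.cerebras.ai/v1` and the `/chat/completions` path are assumptions here, so confirm them against the Cerebras docs before sending real traffic.

```python
import json
import urllib.request

# Hedged sketch: an OpenAI-compatible provider accepts a plain POST to
# <base-url>/chat/completions with a bearer token and a "messages" payload.
# The base URL below is an assumption; verify it in the Cerebras docs.
CEREBRAS_BASE_URL = "https://api.cerebras.ai/v1"

def build_chat_request(api_key: str, model: str, content: str) -> urllib.request.Request:
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }).encode()
    return urllib.request.Request(
        f"{CEREBRAS_BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("<your-cerebras-api-key>", "llama-3.3-70b", "Hello!")
print(req.full_url)  # → https://api.cerebras.ai/v1/chat/completions
```

In the quickstart you never send this request yourself; the OpenAI plugin handles it for you.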

Getting started

Use the Voice AI quickstart to build a voice AI app with Cerebras. Select an STT-LLM-TTS pipeline model type and add the following components to use Cerebras as your LLM.

Voice AI quickstart

Build your first voice AI app with Cerebras.

Install the OpenAI plugin:

pip install "livekit-agents[openai]~=1.0"

Add your Cerebras API key to your .env file:

CEREBRAS_API_KEY=<your-cerebras-api-key>
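The plugin picks the key up from the process environment rather than taking it as a constructor argument. The sketch below illustrates that lookup, assuming the `.env` file has been loaded into the environment (for example by python-dotenv or your process manager); the `get_cerebras_key` helper is hypothetical, not part of the plugin.

```python
import os

# Hedged sketch: a hypothetical helper showing the environment lookup the
# plugin relies on. The assumption is that your .env file has already been
# loaded into os.environ by your tooling.
def get_cerebras_key() -> str:
    key = os.environ.get("CEREBRAS_API_KEY")
    if not key:
        raise RuntimeError("CEREBRAS_API_KEY is not set; add it to your .env file")
    return key

os.environ["CEREBRAS_API_KEY"] = "csk-example"  # stand-in value for the demo
print(get_cerebras_key())  # → csk-example
```

If the key is missing at startup, fail fast with a clear error like the one above rather than letting the first LLM call fail deep inside the session.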

Use the Cerebras LLM to initialize your AgentSession:

from livekit.agents import AgentSession
from livekit.plugins import openai

# ...

# in your entrypoint function
session = AgentSession(
    llm=openai.LLM.with_cerebras(
        model="llama-3.3-70b",
    ),
)

For a full list of supported models, including DeepSeek, see the Cerebras docs.

LiveKit Agents overview

LiveKit Agents is an open source framework for building realtime AI apps in Python and Node.js. It supports complex voice AI workflows with multiple agents and discrete processing steps, and includes built-in load balancing.

LiveKit provides SIP support for telephony integration and full-featured frontend SDKs in multiple languages. It uses WebRTC transport for end-user devices, enabling high-quality, low-latency realtime experiences. To learn more, see LiveKit Agents.

Further reading

More information about integrating Cerebras is available in the following article:

Cerebras integration guide

LiveKit docs on Cerebras integration.