Cerebras ecosystem support
Cerebras provides high-throughput, low-latency AI inference for open models such as Llama and DeepSeek. Cerebras is an OpenAI-compatible LLM provider, and LiveKit Agents fully supports Cerebras inference through the OpenAI plugin.
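"OpenAI-compatible" means Cerebras accepts the standard OpenAI chat-completions request shape, so any OpenAI-style client can talk to it by swapping the base URL. A minimal sketch of the request an OpenAI-compatible client would send, assuming Cerebras's documented `https://api.cerebras.ai/v1` endpoint:

```python
import json

# Cerebras's OpenAI-compatible API endpoint (as documented by Cerebras).
CEREBRAS_BASE_URL = "https://api.cerebras.ai/v1"

def build_chat_request(api_key: str, model: str, user_text: str) -> dict:
    """Build an OpenAI-style chat completion request targeting Cerebras.

    Illustrative helper, not part of LiveKit or Cerebras SDKs: it only
    assembles the URL, headers, and JSON body a client would POST.
    """
    return {
        "url": f"{CEREBRAS_BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": user_text}],
        }),
    }
```

In practice the OpenAI plugin handles this wiring for you; the sketch just shows why no Cerebras-specific client library is needed.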
Getting started
Use the Voice AI quickstart to build a voice AI app with Cerebras. Select an STT-LLM-TTS pipeline model type and add the following components to build on Cerebras.
Install the OpenAI plugin:
pip install "livekit-agents[openai]~=1.0"
Add your Cerebras API key to your .env file:
CEREBRAS_API_KEY=<your-cerebras-api-key>
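The plugin reads this key from the environment at runtime. A small startup check can fail fast if the key is missing; `require_cerebras_key` below is a hypothetical helper, not part of LiveKit:

```python
import os

def require_cerebras_key() -> str:
    """Return CEREBRAS_API_KEY from the environment, or raise a clear error.

    Assumes the .env file has already been loaded into the process
    environment (e.g. via python-dotenv or your process manager).
    """
    key = os.environ.get("CEREBRAS_API_KEY")
    if not key:
        raise RuntimeError(
            "CEREBRAS_API_KEY is not set; add it to your .env file"
        )
    return key
```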
Use the Cerebras LLM to initialize your AgentSession:
from livekit.plugins import openai

# ...

# in your entrypoint function
session = AgentSession(
    llm=openai.LLM.with_cerebras(
        model="llama-3.3-70b",
    ),
)
For a full list of supported models, including DeepSeek, see the Cerebras docs.
LiveKit Agents overview
LiveKit Agents is an open source framework for building realtime AI apps in Python and Node.js. It supports complex voice AI workflows with multiple agents and discrete processing steps, and includes built-in load balancing.
LiveKit provides SIP support for telephony integration and full-featured frontend SDKs in multiple languages. It uses WebRTC transport for end-user devices, enabling high-quality, low-latency realtime experiences. To learn more, see LiveKit Agents.
Further reading
More information about integrating Cerebras is available in the following article:
Cerebras integration guide
LiveKit docs on Cerebras integration.