
Large language model (LLM) integrations

Guides for adding LLM integrations to your agents.

Overview

Large language models (LLMs) are AI models that generate text output from text input. In voice AI apps, they sit between speech-to-text (STT) and text-to-speech (TTS) and are responsible for tool calls and for generating the agent's text responses.

Available providers

The agents framework includes plugins for the following LLM providers out of the box. Choose a provider from the table below for a step-by-step guide. You can also implement the LLM node to provide custom behavior or an alternative provider, as shown in the sketch after the table. All providers support high-performance, low-latency streaming and tool calls. Support for other features, such as vision, structured output, and custom models, varies by provider and is noted in each guide.

| Provider | Notes | Available in |
| --- | --- | --- |
| | Wide range of models from Llama, DeepSeek, Mistral, and more. | Python |
| Anthropic | Claude family of models. | Python |
| | | Python |
| | | Python, Node.js |
| | Models from Llama, DeepSeek, and more. | Python |
| LangGraph | Use a LangGraph workflow for your agent LLM. | Python |
| Mistral AI | Mistral family of models (for use with La Plateforme). | Python |
| | | Python, Node.js |
| | | Python, Node.js |
| | Models from Llama and DeepSeek. | Python, Node.js |
| | | Python, Node.js |
| | Wide range of models from Llama, DeepSeek, Mistral, and more. | Python |
| | Stateful API with memory features. | Python |
| | Self-hosted models from Llama, DeepSeek, and more. | Python |
| | | Python, Node.js |
| | Models from Llama, DeepSeek, OpenAI, Mistral, and more. | Python, Node.js |
| | Models from Llama, DeepSeek, Mistral, and more. | Python, Node.js |
| xAI | Grok family of models. | Python, Node.js |
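
If your provider isn't listed, you can override the agent's LLM node with your own implementation. The following is a minimal sketch, assuming the llm_node override point on Agent and a hypothetical streaming client for your provider; see the plugins overview for the exact signature in your framework version:

from livekit.agents import Agent

class CustomLLMAgent(Agent):
    async def llm_node(self, chat_ctx, tools, model_settings):
        # chat_ctx holds the conversation history, and tools the available
        # function tools. Yield the response text incrementally;
        # AgentSession consumes it like any plugin-generated LLM stream.
        async for delta in my_provider_stream(chat_ctx):  # hypothetical helper
            yield delta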

Have another provider in mind? LiveKit is open source and welcomes new plugin contributions.

Realtime models

Realtime models like the OpenAI Realtime API, Gemini Live, and Amazon Nova Sonic are capable of consuming and producing speech directly. LiveKit Agents supports them as an alternative to using an LLM plugin, without the need for STT and TTS. To learn more, see Realtime models.
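
For example, a realtime model takes the place of the STT, LLM, and TTS components in an AgentSession. A minimal sketch using the OpenAI Realtime API plugin with its default model and voice settings:

from livekit.agents import AgentSession
from livekit.plugins import openai

# The realtime model consumes and produces speech directly,
# so no separate STT or TTS plugins are needed.
session = AgentSession(
    llm=openai.realtime.RealtimeModel()
)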

How to use

The following sections describe high-level usage only.

For more detailed information about installing and using plugins, see the plugins overview.

Usage in AgentSession

Construct an AgentSession or Agent with an LLM instance created by your desired plugin:

from livekit.agents import AgentSession
from livekit.plugins import openai

# The session uses this LLM for all agent responses and tool calls.
session = AgentSession(
    llm=openai.LLM(model="gpt-4o-mini")
)
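
The same parameter slots into a full voice pipeline alongside STT and TTS. A sketch assuming the Deepgram and Cartesia plugins with their default models; any STT and TTS providers work the same way:

from livekit.agents import AgentSession
from livekit.plugins import cartesia, deepgram, openai

# STT -> LLM -> TTS: the LLM receives transcribed user speech, and its
# text output is synthesized back into audio.
session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=cartesia.TTS(),
)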

Standalone usage

You can also use an LLM instance in a standalone fashion with its simple streaming interface. It expects a ChatContext object, which contains the conversation history. The return value is a stream of ChatChunks. This interface is the same across all LLM providers, regardless of their underlying API design:

from livekit.agents import ChatContext
from livekit.plugins import openai

llm = openai.LLM(model="gpt-4o-mini")

# Build the conversation history to send to the model.
chat_ctx = ChatContext()
chat_ctx.add_message(role="user", content="Hello, this is a test message!")

# chat() returns a stream of ChatChunks; delta holds the newly generated text.
async with llm.chat(chat_ctx=chat_ctx) as stream:
    async for chunk in stream:
        print("Received chunk:", chunk.delta)

Tool usage

All LLM providers support tools (sometimes called "functions"). LiveKit Agents has full support for them within an AgentSession. For more information, see Tool definition and use.
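
For example, a tool can be defined as an agent method with the function_tool decorator. The following is a minimal sketch; the weather lookup and its return value are hypothetical:

from livekit.agents import Agent, RunContext, function_tool

class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful assistant.")

    @function_tool
    async def lookup_weather(self, context: RunContext, location: str) -> str:
        """Look up current weather information for a given location."""
        # The LLM decides when to call this tool and fills in `location`.
        return f"Sunny with a high of 21 degrees in {location}."  # hypothetical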

Vision usage

All LLM providers support vision in most of their models. LiveKit Agents supports vision input from a URL or from realtime video frames. Consult your model provider for details on compatible image types, external URL support, and other constraints. For more information, see Vision.
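
As a sketch of image input from a URL, assuming the ImageContent type exported by livekit.agents.llm and an illustrative image URL:

from livekit.agents import ChatContext
from livekit.agents.llm import ImageContent

chat_ctx = ChatContext()
chat_ctx.add_message(
    role="user",
    content=[
        "Describe this image.",
        ImageContent(image="https://example.com/photo.jpg"),  # illustrative URL
    ],
)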

Further reading