Azure OpenAI LLM plugin guide

How to use the Azure OpenAI LLM plugin for LiveKit Agents.

Available in: Python | Node.js

Overview

This plugin allows you to use Azure OpenAI as an LLM provider for your voice agents.

LiveKit Inference

Azure OpenAI is also available in LiveKit Inference, with billing and integration handled automatically. See the docs for more information.

Quick reference

This section includes basic usage and reference material for each API mode. For links to more detailed documentation, see Additional resources.

The Azure OpenAI plugin supports two API modes:

  • The Responses API is available in Python only and is the newer endpoint with support for provider tools such as web search and file search. It is the recommended option for new Python projects when you want the latest capabilities and tooling.
  • The Chat Completions API is available in both Python and Node.js. Use it with Node.js or when you need the established endpoint or compatibility with existing code.

To learn more about OpenAI's different modes, see API modes.

Responses API

Available in: Python only

The Responses API is the recommended option for Python projects.

Installation

Install the Azure plugin to add Azure OpenAI support:

uv add "livekit-agents[azure]~=1.4"

Authentication

Use an Azure OpenAI API key or a Microsoft Entra ID token. Set the following in your .env file:

  • AZURE_OPENAI_API_KEY or AZURE_OPENAI_ENTRA_TOKEN
  • AZURE_OPENAI_ENDPOINT
  • OPENAI_API_VERSION
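
For example, a minimal .env file might look like the following (all values are placeholders; use the key, endpoint, and API version from your own Azure OpenAI resource):

```shell
# Azure OpenAI credentials (placeholder values)
AZURE_OPENAI_API_KEY=<your-api-key>
AZURE_OPENAI_ENDPOINT=https://<endpoint>.openai.azure.com/
OPENAI_API_VERSION=2024-10-01-preview
```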

Usage

Use the Azure plugin and azure.responses.LLM() within an AgentSession or as a standalone LLM. For example, you can use this LLM in the Voice AI quickstart.

from livekit.plugins import azure

session = AgentSession(
    llm=azure.responses.LLM(
        model="gpt-4o",
        azure_deployment="<model-deployment>",
        azure_endpoint="https://<endpoint>.openai.azure.com/",  # or AZURE_OPENAI_ENDPOINT
        api_version="2024-10-01-preview",  # or OPENAI_API_VERSION
    ),
    # ... tts, stt, vad, turn_detection, etc.
)

Chat Completions API

Available in: Python | Node.js

Use the Chat Completions API mode with Node.js or when you prefer the Chat Completions format.

Installation

Install the OpenAI plugin to add Azure OpenAI support:

Python:

    uv add "livekit-agents[openai]~=1.4"

Node.js:

    pnpm add @livekit/agents-plugin-openai@1.x

Authentication

Use an Azure OpenAI API key or a Microsoft Entra ID token. Set the following in your .env file:

  • AZURE_OPENAI_API_KEY or AZURE_OPENAI_ENTRA_TOKEN
  • AZURE_OPENAI_ENDPOINT
  • OPENAI_API_VERSION

Usage

Use the Azure OpenAI plugin within an AgentSession or as a standalone LLM. For example, you can use this LLM in the Voice AI quickstart.

Python:

    from livekit.plugins import openai

    session = AgentSession(
        llm=openai.LLM.with_azure(
            azure_deployment="<model-deployment>",
            azure_endpoint="https://<endpoint>.openai.azure.com/",  # or AZURE_OPENAI_ENDPOINT
            api_key="<api-key>",  # or AZURE_OPENAI_API_KEY
            api_version="2024-10-01-preview",  # or OPENAI_API_VERSION
        ),
        # ... tts, stt, vad, turn_detection, etc.
    )

Node.js:

    import * as openai from '@livekit/agents-plugin-openai';

    const session = new voice.AgentSession({
      llm: openai.LLM.withAzure({
        azureDeployment: "<model-deployment>",
        azureEndpoint: "https://<endpoint>.openai.azure.com/", // or AZURE_OPENAI_ENDPOINT
        apiKey: "<api-key>", // or AZURE_OPENAI_API_KEY
        apiVersion: "2024-10-01-preview", // or OPENAI_API_VERSION
      }),
      // ... tts, stt, vad, turn_detection, etc.
    });

Parameters

The following parameters are commonly used. For a complete list, see the plugin reference links in the Additional resources section.

azure_deployment (string, required)

Name of your model deployment.

entra_token (string, optional)

Microsoft Entra ID authentication token. Required if not using API key authentication. To learn more, see Azure's Authentication documentation.

temperature (float, optional, default: 0.1)

Controls the randomness of the model's output. Higher values, for example 0.8, make the output more random, while lower values, for example 0.2, make it more focused and deterministic.

Valid values are between 0 and 2.

parallel_tool_calls (bool, optional)

Controls whether the model can make multiple tool calls in parallel. When enabled, the model can make multiple tool calls simultaneously, which can improve performance for complex tasks.

tool_choice (ToolChoice | Literal['auto', 'required', 'none'], optional, default: 'auto')

Controls how the model uses tools. Set to 'auto' to let the model decide, 'required' to force tool usage, or 'none' to disable tool usage.
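
As an illustrative sketch, the parameters above can be passed directly to the constructor. The parameter names come from the table above; the deployment name and chosen values are placeholders, not recommendations:

```python
from livekit.plugins import openai

# Example configuration using the parameters documented above.
# Values shown are placeholders for illustration only.
llm = openai.LLM.with_azure(
    azure_deployment="<model-deployment>",
    temperature=0.2,           # lower value -> more focused, deterministic output (valid range: 0-2)
    parallel_tool_calls=True,  # allow the model to make multiple tool calls at once
    tool_choice="auto",        # let the model decide when to use tools
)
```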

Additional resources

The following links provide more information about the Azure OpenAI LLM plugin.