Unified agent interface
Agents 1.0 introduces `AgentSession`, a single, unified agent orchestrator that serves as the foundation for all types of agents built using the framework. With this change, the `VoicePipelineAgent` and `MultimodalAgent` classes have been deprecated, and 0.x agents need to be updated to use `AgentSession` to be compatible with 1.0 and later.
`AgentSession` contains a superset of the functionality of `VoicePipelineAgent` and `MultimodalAgent`, allowing you to switch between pipelined and speech-to-speech models without changing your core application logic.
Here is a typical 0.x setup:

```python
from livekit.agents import JobContext, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import (
    cartesia,
    deepgram,
    google,
    silero,
)


async def entrypoint(ctx: JobContext):
    await ctx.connect()
    participant = await ctx.wait_for_participant()

    initial_ctx = llm.ChatContext().append(
        role="system",
        text="You are a helpful voice AI assistant.",
    )

    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=google.LLM(),
        tts=cartesia.TTS(),
        chat_ctx=initial_ctx,
    )
    agent.start(ctx.room, participant)

    await agent.say("Hey, how can I help you today?", allow_interruptions=True)
```
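And here is a minimal sketch of the equivalent setup on `AgentSession` in 1.0, keeping the same plugins; the system prompt moves into the `Agent`'s `instructions`:

```python
from livekit.agents import Agent, AgentSession, JobContext
from livekit.plugins import cartesia, deepgram, google, silero


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # AgentSession replaces both VoicePipelineAgent and MultimodalAgent
    session = AgentSession(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=google.LLM(),
        tts=cartesia.TTS(),
    )

    await session.start(
        agent=Agent(instructions="You are a helpful voice AI assistant."),
        room=ctx.room,
    )

    await session.say("Hey, how can I help you today?", allow_interruptions=True)
```

To switch to a speech-to-speech model, replace the `stt`/`llm`/`tts` trio with a single realtime model passed as `llm`; the rest of the code stays the same.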
Customizing pipeline behavior
We’ve introduced more flexibility for developers to customize the behavior of agents built on 1.0 through the new concept of pipeline nodes, which enable custom processing within the pipeline steps while also delegating to the default implementation of each node as needed.
Pipeline nodes replace the `before_llm_cb` and `before_tts_cb` callbacks.
before_llm_cb -> llm_node
`before_llm_cb` has been replaced by `llm_node`. This node can be used to modify the chat context before sending it to the LLM, or to integrate with custom LLM providers without having to create a plugin. As long as it returns `AsyncIterable[llm.ChatChunk]`, the LLM node forwards the chunks to the next node in the pipeline.
```python
async def add_rag_context(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    rag_context: str = retrieve(chat_ctx)  # retrieve() is a placeholder for your RAG lookup
    chat_ctx.append(text=rag_context, role="system")


agent = VoicePipelineAgent(
    ...
    before_llm_cb=add_rag_context,
)
```
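In 1.0, the equivalent is to override `llm_node` on your `Agent`. Below is a sketch under the same assumption of a hypothetical `retrieve()` RAG helper; `Agent.default.llm_node()` delegates to the default implementation:

```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm


class RAGAgent(Agent):
    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[llm.FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[llm.ChatChunk]:
        # retrieve() is a placeholder for your RAG lookup, as in the 0.x example
        rag_context: str = retrieve(chat_ctx)
        chat_ctx.add_message(role="system", content=rag_context)

        # delegate the actual LLM call to the default implementation
        async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            yield chunk
```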
before_tts_cb -> tts_node
`before_tts_cb` has been replaced by `tts_node`. This node gives greater flexibility in customizing the TTS pipeline: it's possible to modify the text before synthesis, as well as the audio buffers after synthesis.
```python
from livekit.agents import tokenize


def _before_tts_cb(agent: VoicePipelineAgent, text: str | AsyncIterable[str]):
    # The TTS is incorrectly pronouncing "LiveKit", so we'll replace it with
    # MFA-style IPA spelling for Cartesia
    return tokenize.utils.replace_words(
        text=text, replacements={"livekit": r"<<l|aj|v|cʰ|ɪ|t|>>"}
    )


agent = VoicePipelineAgent(
    ...
    before_tts_cb=_before_tts_cb,
)
```
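In 1.0, the same adjustment moves into a `tts_node` override; the hypothetical `PronunciationAgent` name below is ours. A sketch:

```python
from collections.abc import AsyncIterable

from livekit import rtc
from livekit.agents import Agent, ModelSettings


class PronunciationAgent(Agent):
    async def tts_node(
        self,
        text: AsyncIterable[str],
        model_settings: ModelSettings,
    ) -> AsyncIterable[rtc.AudioFrame]:
        # rewrite the text stream before handing it to the default TTS step
        async def adjust_pronunciation(input_text: AsyncIterable[str]) -> AsyncIterable[str]:
            async for chunk in input_text:
                yield chunk.replace("livekit", "<<l|aj|v|cʰ|ɪ|t|>>")

        # delegate synthesis to the default implementation
        async for frame in Agent.default.tts_node(
            self, adjust_pronunciation(text), model_settings
        ):
            yield frame
```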
Tool definition and use
Agents 1.0 streamlines how tools are defined for use within your agents, making it easier to add and maintain agent tools. When migrating from 0.x to 1.0, developers need to make the following changes to existing uses of function calling within their agents to be compatible with versions 1.0 and later.
- The `@llm.ai_callable` decorator for function definitions has been replaced with the new `@function_tool` decorator.
- If you define your functions within an `Agent` and use the `@function_tool` decorator, these tools are automatically accessible to the LLM. In this scenario, you are no longer required to define your functions in an `llm.FunctionContext` class and pass them into the agent constructor.
- Argument types are now inferred from the function signature and docstring. `Annotated` types are no longer supported.
- Functions take in a `RunContext` object, which provides access to the current agent state.
```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.agents.multimodal import MultimodalAgent


class AssistantFnc(llm.FunctionContext):
    @llm.ai_callable()
    async def get_weather(self, ...):
        ...


fnc_ctx = AssistantFnc()

pipeline_agent = VoicePipelineAgent(
    ...
    fnc_ctx=fnc_ctx,
)

multimodal_agent = MultimodalAgent(
    ...
    fnc_ctx=fnc_ctx,
)
```
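In 1.0, the same tool can be declared directly on the `Agent`. Here is a sketch with a stubbed weather lookup standing in for a real implementation:

```python
from livekit.agents import Agent, RunContext, function_tool


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")

    @function_tool
    async def get_weather(self, context: RunContext, location: str) -> str:
        """Look up the current weather for a location.

        Args:
            location: The city to look up.
        """
        # stubbed result; replace with a real weather API call
        return f"It's sunny in {location}."
```

Because `get_weather` is defined on the `Agent` and decorated with `@function_tool`, it is automatically registered with the LLM; no `FunctionContext` or `fnc_ctx` argument is needed.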
Chat context
ChatContext has been overhauled in 1.0 to provide a more powerful and flexible API for managing chat history. It now accounts for differences between LLM providers—such as stateless and stateful APIs—while exposing a unified interface.
Chat history can now include three types of items:
- `ChatMessage`: a message associated with a role (e.g., user, assistant). Each message includes a list of `content` items, which can contain text, images, or audio.
- `FunctionCall`: a function call initiated by the LLM.
- `FunctionCallOutput`: the result returned from a function call.
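As a sketch of how these items can be inspected, here is a hypothetical helper that walks `chat_ctx.items` and branches on the three types:

```python
from livekit.agents import llm


def summarize_history(chat_ctx: llm.ChatContext) -> None:
    # chat_ctx.items holds the full, ordered history
    for item in chat_ctx.items:
        if isinstance(item, llm.ChatMessage):
            print(item.role, item.text_content)
        elif isinstance(item, llm.FunctionCall):
            print("tool call:", item.name, item.arguments)
        elif isinstance(item, llm.FunctionCallOutput):
            print("tool result:", item.output)
```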
Updating chat context
In 0.x, updating the chat context required modifying `chat_ctx.messages` directly. This approach was error-prone and difficult to time correctly, especially with realtime APIs.
In v1.0, there are two supported ways to update the chat context:
- Agent handoff – transferring control to a new agent, which will have its own chat context.
- Explicit update – calling `agent.update_chat_ctx()` to modify the context directly, as sketched below.
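Here is a sketch of the explicit-update path, using the `on_enter` lifecycle hook of a hypothetical agent. Because `agent.chat_ctx` is read-only, you copy it, modify the copy, and apply it with `update_chat_ctx()`:

```python
from livekit.agents import Agent


class HistoryAwareAgent(Agent):
    async def on_enter(self) -> None:
        # copy the read-only context, modify it, then apply the update
        chat_ctx = self.chat_ctx.copy()
        chat_ctx.add_message(
            role="system",
            content="The user has been transferred from the intake agent.",
        )
        await self.update_chat_ctx(chat_ctx)
```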
Transcriptions
Agents 1.0 brings several changes to how transcriptions are handled:

- Transcriptions now use text streams with the topic `lk.transcription`.
- The old transcription protocol is deprecated and will be removed in v1.1; for now, both protocols are used for backwards compatibility.
- Upcoming versions of the SDKs and components will standardize on text streams for transcriptions.
Accepting text input
Agents 1.0 introduces improved support for text input. Previously, text had to be manually intercepted and injected into the agent via `ChatManager`.

In this version, agents automatically receive text input from a text stream on the `lk.chat` topic.
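For example, a client written with the LiveKit Python SDK might deliver text to the agent like this (a sketch assuming the SDK's text-stream `send_text` method with a `topic` argument):

```python
from livekit import rtc


async def send_text_to_agent(room: rtc.Room, message: str) -> None:
    # text sent on the lk.chat topic is picked up by the agent automatically
    await room.local_participant.send_text(message, topic="lk.chat")
```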
`ChatManager` has been removed in Python SDK v1.0.
State change events
User state
The `user_started_speaking` and `user_stopped_speaking` events are no longer emitted. They've been combined into a single `user_state_changed` event.
@agent.on("user_started_speaking")def on_user_started_speaking():print("User started speaking")
Agent state
@agent.on("agent_started_speaking")def on_agent_started_speaking():# Log transcribed message from userprint("Agent started speaking")
Other events
Agent events were overhauled in version 1.0. For details, see the events page.
Removed features
OpenAI Assistants API support has been removed in 1.0.
The beta integration with the Assistants API in the OpenAI LLM plugin has been removed; its stateful model made it difficult to manage state consistently between the API and the agent.