Agents 0.x migration guide

Migrate your Python-based agents from version 0.x to 1.0.

Unified agent interface

Agents 1.0 introduces AgentSession, a single, unified agent orchestrator that serves as the foundation for all types of agents built using the framework. With this change, the VoicePipelineAgent and MultimodalAgent classes have been deprecated and 0.x agents will need to be updated to use AgentSession in order to be compatible with 1.0 and later.

AgentSession contains a superset of the functionality of VoicePipelineAgent and MultimodalAgent, allowing you to switch between pipelined and speech-to-speech models without changing your core application logic.

A typical 0.x pipeline agent:

from livekit.agents import JobContext, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import (
    cartesia,
    deepgram,
    google,
    silero,
)


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    initial_ctx = llm.ChatContext().append(
        role="system",
        text="You are a helpful voice AI assistant.",
    )

    participant = await ctx.wait_for_participant()

    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=google.LLM(),
        tts=cartesia.TTS(),
        chat_ctx=initial_ctx,
    )
    agent.start(ctx.room, participant)

    await agent.say("Hey, how can I help you today?", allow_interruptions=True)
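
For comparison, here's a minimal sketch of the same agent on 1.0 using AgentSession, with the greeting carried over via session.say:

from livekit.agents import Agent, AgentSession, JobContext
from livekit.plugins import cartesia, deepgram, google, silero


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    session = AgentSession(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=google.LLM(),
        tts=cartesia.TTS(),
    )
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful voice AI assistant."),
    )

    await session.say("Hey, how can I help you today?", allow_interruptions=True)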

Customizing pipeline behavior

We’ve introduced more flexibility for developers to customize the behavior of agents built on 1.0 through the new concept of pipeline nodes, which enable custom processing within the pipeline steps while also delegating to the default implementation of each node as needed.

Pipeline nodes replace the before_llm_cb and before_tts_cb callbacks.

before_llm_cb -> llm_node

before_llm_cb has been replaced by llm_node. This node can be used to modify the chat context before sending it to the LLM, or to integrate with custom LLM providers without having to create a plugin. As long as it returns AsyncIterable[llm.ChatChunk], the LLM node will forward the chunks to the next node in the pipeline.

async def add_rag_context(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # retrieve() is an application-defined lookup that returns relevant context
    rag_context: str = retrieve(chat_ctx)
    chat_ctx.append(text=rag_context, role="system")


agent = VoicePipelineAgent(
    ...
    before_llm_cb=add_rag_context,
)
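
On 1.0, the same customization is done by overriding llm_node on an Agent subclass. Here's a minimal sketch, assuming the documented node signature and the same application-defined retrieve() helper; delegating to Agent.default.llm_node streams llm.ChatChunk items to the next node:

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class RAGAgent(Agent):
    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings,
    ):
        # Inject retrieved context before inference, then delegate to the
        # default implementation
        rag_context: str = retrieve(chat_ctx)
        chat_ctx.add_message(role="system", content=rag_context)
        return Agent.default.llm_node(self, chat_ctx, tools, model_settings)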

before_tts_cb -> tts_node

before_tts_cb has been replaced by tts_node. This node gives greater flexibility in customizing the TTS pipeline. It's possible to modify the text before synthesis, as well as the audio buffers after synthesis.

def _before_tts_cb(agent: VoicePipelineAgent, text: str | AsyncIterable[str]):
    # The TTS is incorrectly pronouncing "LiveKit", so we'll replace it with
    # MFA-style IPA spelling for Cartesia
    return tokenize.utils.replace_words(
        text=text, replacements={"livekit": r"<<l|aj|v|cʰ|ɪ|t|>>"}
    )


agent = VoicePipelineAgent(
    ...
    before_tts_cb=_before_tts_cb,
)
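
On 1.0, override tts_node instead. A sketch of the same pronunciation fix, assuming tokenize.utils.replace_words still accepts an async text stream; the default node performs the actual synthesis and returns audio frames:

from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings, tokenize


class PronunciationAgent(Agent):
    async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
        # Rewrite the text stream before synthesis, then delegate to the
        # default TTS node
        adjusted = tokenize.utils.replace_words(
            text=text, replacements={"livekit": r"<<l|aj|v|cʰ|ɪ|t|>>"}
        )
        return Agent.default.tts_node(self, adjusted, model_settings)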

Tool definition and use

Agents 1.0 streamlines the way tools are defined, making it easier to add and maintain agent tools. When migrating from 0.x, developers will need to make the following changes to existing function calling code in order to be compatible with versions 1.0 and later.

  • The @llm.ai_callable decorator for function definition has been replaced with the new @function_tool decorator (see the sketch after the 0.x example below).
  • If you define your functions within an Agent and use the @function_tool decorator, these tools are automatically accessible to the LLM. In this scenario, you are no longer required to define your functions in a llm.FunctionContext class and pass them into the agent constructor.
  • Argument types are now inferred from the function signature and docstring. Annotated types are no longer supported.
  • Functions take in a RunContext object, which provides access to the current agent state.
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.agents.multimodal import MultimodalAgent


class AssistantFnc(llm.FunctionContext):
    @llm.ai_callable()
    async def get_weather(
        self,
        ...
    ):
        ...


fnc_ctx = AssistantFnc()

pipeline_agent = VoicePipelineAgent(
    ...
    fnc_ctx=fnc_ctx,
)

multimodal_agent = MultimodalAgent(
    ...
    fnc_ctx=fnc_ctx,
)
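
The 1.0 equivalent, defined directly on the Agent. This sketch uses a hypothetical get_weather tool; the argument types come from the signature and the docstring becomes the tool description:

from livekit.agents import Agent, RunContext, function_tool


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")

    # Tools decorated with @function_tool on the Agent are automatically
    # exposed to the LLM; no llm.FunctionContext is required
    @function_tool()
    async def get_weather(self, context: RunContext, location: str) -> str:
        """Look up the current weather for a location.

        Args:
            location: The city or region to look up.
        """
        # Hypothetical static result for illustration
        return f"The weather in {location} is sunny."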

Chat context

ChatContext has been overhauled in 1.0 to provide a more powerful and flexible API for managing chat history. It now accounts for differences between LLM providers—such as stateless and stateful APIs—while exposing a unified interface.

Chat history can now include three types of items (see the sketch after this list):

  • ChatMessage: a message associated with a role (e.g., user, assistant). Each message includes a list of content items, which can contain text, images, or audio.
  • FunctionCall: a function call initiated by the LLM.
  • FunctionCallOutput: the result returned from a function call.
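
For illustration, a sketch that walks a chat history and prints each item; the type tags and fields shown here (type, role, text_content, name, arguments, output) are assumed from the 1.0 item models:

from livekit.agents import llm


def print_history(chat_ctx: llm.ChatContext) -> None:
    for item in chat_ctx.items:
        if item.type == "message":
            # text_content joins only the text parts of the content list
            print(f"{item.role}: {item.text_content}")
        elif item.type == "function_call":
            print(f"tool call: {item.name}({item.arguments})")
        elif item.type == "function_call_output":
            print(f"tool result: {item.output}")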

Updating chat context

In 0.x, updating the chat context required modifying chat_ctx.messages directly. This approach was error-prone and difficult to time correctly, especially with realtime APIs.

In v1.0, there are two supported ways to update the chat context:

  • Agent handoff - transferring control to a new agent, which will have its own chat context.
  • Explicit update - calling agent.update_chat_ctx() to modify the context directly (see the sketch after this list).
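
As an example of an explicit update, here's a sketch that appends a message when an agent becomes active, assuming the on_enter lifecycle hook:

from livekit.agents import Agent


class MyAgent(Agent):
    async def on_enter(self) -> None:
        # Copy the read-only context, modify the copy, then apply it
        chat_ctx = self.chat_ctx.copy()
        chat_ctx.add_message(
            role="system",
            content="The user has just joined the call.",
        )
        await self.update_chat_ctx(chat_ctx)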

Transcriptions

Agents 1.0 changes how transcriptions are handled:

  • Transcriptions now use text streams with the topic lk.transcription (see the listener sketch after this list).
  • The old transcription protocol is deprecated and will be removed in v1.1.
  • For now, both protocols are used for backward compatibility.
  • Upcoming versions of the SDKs and components will standardize on text streams for transcriptions.
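
As a sketch of the new protocol from the receiving side, a client built with the Python rtc SDK could register a handler for the lk.transcription topic (register_text_stream_handler and read_all are assumed from the text streams API):

import asyncio

from livekit import rtc


def listen_for_transcriptions(room: rtc.Room) -> None:
    def on_transcription(reader: rtc.TextStreamReader, participant_identity: str):
        async def read() -> None:
            # Collect the full transcription segment from the stream
            text = await reader.read_all()
            print(f"{participant_identity}: {text}")

        asyncio.create_task(read())

    room.register_text_stream_handler("lk.transcription", on_transcription)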

Accepting text input

Agents 1.0 introduces improved support for text input. Previously, text had to be manually intercepted and injected into the agent via ChatManager.

In this version, agents automatically receive text input from a text stream on the lk.chat topic.

The ChatManager has been removed in Python SDK v1.0.
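
To supply text input from another participant, publish to the lk.chat topic; a minimal sketch assuming the rtc SDK's send_text method:

from livekit import rtc


async def send_user_message(room: rtc.Room, message: str) -> None:
    # Text published on the lk.chat topic is picked up by the agent
    # automatically; no ChatManager is needed
    await room.local_participant.send_text(message, topic="lk.chat")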

State change events

User state

user_started_speaking and user_stopped_speaking events are no longer emitted. They've been combined into a single user_state_changed event.

@agent.on("user_started_speaking")
def on_user_started_speaking():
    print("User started speaking")

Agent state

Similarly, the agent_started_speaking and agent_stopped_speaking events are no longer emitted. They've been combined into a single agent_state_changed event.

@agent.on("agent_started_speaking")
def on_agent_started_speaking():
    print("Agent started speaking")

Other events

Agent events were overhauled in version 1.0. For details, see the events page.

Removed features

  • OpenAI Assistants API support has been removed in 1.0.

    The beta integration with the Assistants API in the OpenAI LLM plugin has been removed. Its stateful design made it difficult to manage state consistently between the API and the agent.