Overview
This guide provides an overview of the changes between Agents v0.x and Agents 1.0 for Python, released in April 2025. Agents running on v0.x continue to work in LiveKit Cloud, but this version of the framework is no longer receiving updates or support. Migrate your agents to 1.x to continue receiving the latest features and bug fixes.
Unified agent interface
Agents 1.0 introduces AgentSession, a single, unified agent orchestrator that serves as the foundation for all types of agents built using the framework. With this change, the VoicePipelineAgent and MultimodalAgent classes have been deprecated and 0.x agents will need to be updated to use AgentSession in order to be compatible with 1.0 and later.
AgentSession contains a superset of the functionality of VoicePipelineAgent and MultimodalAgent, allowing you to switch between pipelined and speech-to-speech models without changing your core application logic.
The following code highlights the differences between Agents v0.x and Agents 1.0. For a full working example, see the Voice AI quickstart.
Agents v0.x:

    from livekit.agents import JobContext, llm
    from livekit.agents.pipeline import VoicePipelineAgent
    from livekit.plugins import (
        cartesia,
        deepgram,
        google,
        silero,
    )


    async def entrypoint(ctx: JobContext):
        initial_ctx = llm.ChatContext().append(
            role="system",
            text="You are a helpful voice AI assistant.",
        )

        agent = VoicePipelineAgent(
            vad=silero.VAD.load(),
            stt=deepgram.STT(),
            llm=google.LLM(),
            tts=cartesia.TTS(),
        )

        await agent.start(room, participant)
        await agent.say("Hey, how can I help you today?", allow_interruptions=True)
Agents 1.0:

    from livekit.agents import (
        AgentServer,
        AgentSession,
        Agent,
        JobContext,
        llm,
        room_io,
        TurnHandlingOptions,
    )
    from livekit.plugins import (
        elevenlabs,
        deepgram,
        google,
        openai,
        silero,
        noise_cancellation,
    )
    from livekit.plugins.turn_detector.multilingual import MultilingualModel


    class Assistant(Agent):
        def __init__(self) -> None:
            super().__init__(instructions="You are a helpful voice AI assistant.")


    server = AgentServer()


    @server.rtc_session(agent_name="my-agent")
    async def my_agent(ctx: JobContext):
        session = AgentSession(
            stt=deepgram.STT(),
            llm=google.LLM(),
            tts=elevenlabs.TTS(),
            vad=silero.VAD.load(),
            turn_handling=TurnHandlingOptions(
                turn_detection=MultilingualModel(),
            ),
        )

        # if using realtime api, use the following
        # session = AgentSession(
        #     llm=openai.realtime.RealtimeModel(voice="echo"),
        # )

        await session.start(
            room=ctx.room,
            agent=Assistant(),
            room_options=room_io.RoomOptions(
                audio_input=room_io.AudioInputOptions(
                    noise_cancellation=noise_cancellation.BVC(),
                ),
            ),
        )

        # Instruct the agent to speak first
        await session.generate_reply(instructions="say hello to the user")
Customizing pipeline behavior
Agents 1.0 gives developers more flexibility to customize agent behavior through the new concept of pipeline nodes. A node lets you add custom processing at a pipeline step while still delegating to that node's default implementation as needed.
Pipeline nodes replace the before_llm_cb and before_tts_cb callbacks.
before_llm_cb -> llm_node
before_llm_cb has been replaced by llm_node. This node can be used to modify the chat context before sending it to the LLM, or to integrate with custom LLM providers without having to create a plugin. As long as it returns AsyncIterable[llm.ChatChunk], the LLM node forwards the chunks to the next node in the pipeline.
Agents v0.x:

    async def add_rag_context(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
        rag_context: str = retrieve(chat_ctx)
        chat_ctx.append(text=rag_context, role="system")


    agent = VoicePipelineAgent(
        ...
        before_llm_cb=add_rag_context,
    )
Agents 1.0:

    from typing import AsyncIterable

    from livekit.agents import Agent, ModelSettings, llm


    class MyAgent(Agent):
        # override method from superclass to customize behavior
        async def llm_node(
            self,
            chat_ctx: llm.ChatContext,
            tools: list[llm.FunctionTool],
            model_settings: ModelSettings,
        ) -> AsyncIterable[llm.ChatChunk]:
            # retrieve() stands in for your own retrieval helper
            rag_context: str = retrieve(chat_ctx)
            chat_ctx.add_message(content=rag_context, role="system")

            # update the context for persistence
            # await self.update_chat_ctx(chat_ctx)

            return Agent.default.llm_node(self, chat_ctx, tools, model_settings)
before_tts_cb -> tts_node
before_tts_cb has been replaced by tts_node. This node gives greater flexibility in customizing the TTS pipeline. It's possible to modify the text before synthesis, as well as the audio buffers after synthesis.
Agents v0.x:

    def _before_tts_cb(agent: VoicePipelineAgent, text: str | AsyncIterable[str]):
        # The TTS is incorrectly pronouncing "LiveKit", so we'll replace it with MFA-style IPA
        # spelling for Cartesia
        return tokenize.utils.replace_words(
            text=text, replacements={"livekit": r"<<l|aj|v|cʰ|ɪ|t|>>"}
        )


    agent = VoicePipelineAgent(
        ...
        before_tts_cb=_before_tts_cb,
    )
Agents 1.0:

    class MyAgent(Agent):
        async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
            # use the default implementation, but pre-process the text first
            replacements = {"livekit": r"<<l|aj|v|cʰ|ɪ|t|>>"}
            text = tokenize.utils.replace_words(text=text, replacements=replacements)
            return Agent.default.tts_node(self, text, model_settings)
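To make the text pre-processing step more concrete, the following sketch applies word replacements to a stream of text chunks using only the standard library. It is not the framework's tokenize.utils.replace_words implementation: the replace_words_stream helper, its buffering strategy, and the "LYVE-kit" respelling are assumptions made for illustration.

```python
import asyncio
import re
from typing import AsyncIterable, AsyncIterator


async def replace_words_stream(
    text: AsyncIterable[str], replacements: dict[str, str]
) -> AsyncIterator[str]:
    """Hypothetical helper: case-insensitive whole-word replacement over streamed text."""
    if not replacements:
        # Nothing to replace, so pass the chunks through untouched.
        async for chunk in text:
            yield chunk
        return

    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in replacements) + r")\b",
        re.IGNORECASE,
    )
    lookup = {word.lower(): value for word, value in replacements.items()}

    def substitute(segment: str) -> str:
        return pattern.sub(lambda m: lookup[m.group(0).lower()], segment)

    buffer = ""
    async for chunk in text:
        buffer += chunk
        # Emit only up to the last whitespace so that a word split
        # across two chunks is still matched as a whole.
        cut = max(buffer.rfind(" "), buffer.rfind("\n"))
        if cut >= 0:
            yield substitute(buffer[: cut + 1])
            buffer = buffer[cut + 1 :]
    if buffer:
        # Flush whatever is left once the stream ends.
        yield substitute(buffer)


async def demo() -> str:
    async def chunks() -> AsyncIterator[str]:
        # "LiveKit" is deliberately split across two chunks.
        for piece in ["Welcome to Live", "Kit, the realtime ", "platform."]:
            yield piece

    parts = [seg async for seg in replace_words_stream(chunks(), {"livekit": "LYVE-kit"})]
    return "".join(parts)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Buffering to the last whitespace trades a little latency for correctness when the LLM emits a word in pieces. An async generator like this could be passed to the default TTS node in place of the raw text stream.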
Tool definition and use
Agents 1.0 streamlines how tools are defined, making them easier to add and maintain. When migrating from 0.x to 1.0, make the following changes to existing function-calling code so that it is compatible with versions 1.0 and later.
- The @llm.ai_callable decorator for function definition has been replaced with the new @function_tool decorator.
- If you define your functions within an Agent and use the @function_tool decorator, these tools are automatically accessible to the LLM. In this scenario, you are no longer required to define your functions in an llm.FunctionContext class and pass them into the agent constructor.
- Argument types are now inferred from the function signature and docstring. Annotated types are no longer supported.
- Functions take in a RunContext object, which provides access to the current agent state.
Agents v0.x:

    from livekit.agents import llm
    from livekit.agents.pipeline import VoicePipelineAgent
    from livekit.agents.multimodal import MultimodalAgent


    class AssistantFnc(llm.FunctionContext):
        @llm.ai_callable()
        async def get_weather(self, ...)
            ...


    fnc_ctx = AssistantFnc()

    pipeline_agent = VoicePipelineAgent(
        ...
        fnc_ctx=fnc_ctx,
    )

    multimodal_agent = MultimodalAgent(
        ...
        fnc_ctx=fnc_ctx,
    )
Agents 1.0:

    from typing import Any

    from livekit.agents import Agent, RunContext, function_tool


    class MyAgent(Agent):
        @function_tool()
        async def get_weather(
            self,
            context: RunContext,
            location: str,
        ) -> dict[str, Any]:
            """Look up weather information for a given location.

            Args:
                location: The location to look up weather information for.
            """
            return {"weather": "sunny", "temperature_f": 70}
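Because argument types and descriptions now come from the function signature and docstring, it can help to see what that kind of inference looks like. The standard-library sketch below shows one possible approach. It is not how @function_tool is implemented: describe_tool, the JSON-style output shape, and the extra days argument are hypothetical.

```python
import inspect
import re
from typing import Any, Callable, get_type_hints

# Minimal mapping from Python annotations to JSON-schema-style type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(
    fn: Callable[..., Any], skip: tuple[str, ...] = ("self", "context")
) -> dict[str, Any]:
    """Hypothetical helper: build a tool description from a signature and docstring."""
    hints = get_type_hints(fn)
    doc = inspect.getdoc(fn) or ""
    summary, _, args_section = doc.partition("Args:")
    # Collect "name: description" pairs from the Google-style Args section.
    arg_docs = dict(re.findall(r"^\s*(\w+):\s*(.+)$", args_section, flags=re.MULTILINE))

    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        if name in skip:
            # The receiver and the run context are not model-facing arguments.
            continue
        properties[name] = {
            "type": _JSON_TYPES.get(hints.get(name), "string"),
            "description": arg_docs.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "name": fn.__name__,
        "description": summary.strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


async def get_weather(context: object, location: str, days: int = 1) -> dict[str, Any]:
    """Look up weather information for a given location.

    Args:
        location: The location to look up weather information for.
        days: Number of forecast days to return.
    """
    return {"weather": "sunny", "temperature_f": 70}


schema = describe_tool(get_weather)
```

The practical takeaway for migration is that clear type hints and an Args section in the docstring carry the information that Annotated metadata used to provide.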
Chat context
ChatContext has been overhauled in 1.0 to provide a more powerful and flexible API for managing chat history. It now accounts for differences between LLM providers — such as stateless and stateful APIs — while exposing a unified interface.
Chat history can now include three types of items:
- ChatMessage: a message associated with a role (e.g., user, assistant). Each message includes a list of content items, which can contain text, images, or audio.
- FunctionCall: a function call initiated by the LLM.
- FunctionCallOutput: the result returned from a function call.
Updating chat context
In 0.x, updating the chat context required modifying chat_ctx.messages directly. This approach was error-prone and difficult to time correctly, especially with realtime APIs.
In v1.x, there are two supported ways to update the chat context:
- Agent handoff: transferring control to a new agent, which will have its own chat context.
- Explicit update: calling agent.update_chat_ctx() to modify the context directly.
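The explicit-update path follows a copy, modify, then replace pattern instead of mutating the live history. The sketch below models that pattern with simplified stand-in classes. ToyChatContext, ToyAgent, and Message are hypothetical and are not the LiveKit classes, and the real update_chat_ctx() is awaited, whereas this toy version is synchronous.

```python
from __future__ import annotations

from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Message:
    """Simplified stand-in for a chat message with a role and content items."""

    role: str
    content: list[str]


@dataclass
class ToyChatContext:
    items: list[Message] = field(default_factory=list)

    def copy(self) -> ToyChatContext:
        # A deep copy keeps edits to the draft away from the live context.
        return deepcopy(self)

    def add_message(self, role: str, content: str) -> None:
        self.items.append(Message(role=role, content=[content]))


class ToyAgent:
    def __init__(self) -> None:
        self._chat_ctx = ToyChatContext()

    @property
    def chat_ctx(self) -> ToyChatContext:
        return self._chat_ctx

    def update_chat_ctx(self, chat_ctx: ToyChatContext) -> None:
        # Swap in the new context in one step rather than mutating in place.
        self._chat_ctx = chat_ctx.copy()


agent = ToyAgent()

# 1. Work on a copy of the current context.
draft = agent.chat_ctx.copy()
draft.add_message(role="system", content="The user prefers metric units.")

# 2. The live context is untouched until the update is applied.
items_before_update = len(agent.chat_ctx.items)

# 3. Apply the whole draft at once.
agent.update_chat_ctx(draft)
```

Applying the draft in a single call gives the framework one well-defined moment to synchronize state, which is what made direct edits to chat_ctx.messages hard to time in 0.x.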
Transcriptions
Agents 1.0 changes how transcriptions are delivered to frontends. The old and new mechanisms are completely separate systems. Writing to one does not trigger the other.
Legacy path (deprecated)
In v0.x, agents called publish_transcription() on LocalParticipant to send transcription segments to the room. The LiveKit server relayed these as TranscriptionReceived events on the client side. Frontend code listened for these events directly on the room or participant object.
Both publish_transcription() and the TranscriptionReceived event are deprecated and will be removed in a future version. The React hook useTrackTranscriptions from @livekit/components-react, which consumes these events, is also deprecated.
New path (text streams)
In Agents 1.0, transcriptions use text streams with topic lk.transcription. The agent framework publishes transcriptions through stream_text() instead of publish_transcription(), and frontends receive them with registerTextStreamHandler('lk.transcription', ...).
For React, the useTranscriptions hook replaces useTrackTranscriptions.
Backwards compatibility
The current agent framework publishes to both the legacy and text stream paths simultaneously, so existing frontends that listen for TranscriptionReceived continue to work during the migration period. However, if you write directly to the lk.transcription text stream (outside of the agent framework), those messages are only delivered through text streams and do not trigger TranscriptionReceived.
For full details on receiving transcriptions in your frontend, see Text and transcriptions: Frontend rendering.
Accepting text input
Agents 1.0 introduces improved support for text input. Previously, text had to be manually intercepted and injected into the agent via ChatManager.
In this version, agents automatically receive text input from a text stream on the lk.chat topic.
The ChatManager has been removed in Python SDK v1.0.
State change events
User state
user_started_speaking and user_stopped_speaking events are no longer emitted. They've been combined into a single user_state_changed event.
Agents v0.x:

    @agent.on("user_started_speaking")
    def on_user_started_speaking():
        print("User started speaking")
Agents 1.0:

    from livekit.agents import UserStateChangedEvent


    @session.on("user_state_changed")
    def on_user_state_changed(ev: UserStateChangedEvent):
        # new_state can be "speaking", "listening", or "away"
        print(f"state change from {ev.old_state} to {ev.new_state}")
Agent state
Agents v0.x:

    @agent.on("agent_started_speaking")
    def on_agent_started_speaking():
        print("Agent started speaking")
Agents 1.0:

    from livekit.agents import AgentStateChangedEvent


    @session.on("agent_state_changed")
    def on_agent_state_changed(ev: AgentStateChangedEvent):
        # new_state can be "initializing", "idle", "listening", "thinking", or "speaking"
        # new_state is set as a participant attribute `lk.agent.state` to notify frontends
        print(f"state change from {ev.old_state} to {ev.new_state}")
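Code that relied on the separate started and stopped events can recover the same signals from the old_state and new_state fields of a state change event. The sketch below is one way to do that with plain Python. StateChange is a stand-in for the event objects, and started_speaking, stopped_speaking, and make_handler are hypothetical helpers, not part of the framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StateChange:
    """Stand-in for the old_state / new_state pair carried by the 1.0 events."""

    old_state: str
    new_state: str


def started_speaking(ev: StateChange) -> bool:
    # A transition into "speaking" replaces the old *_started_speaking event.
    return ev.new_state == "speaking" and ev.old_state != "speaking"


def stopped_speaking(ev: StateChange) -> bool:
    # A transition out of "speaking" replaces the old *_stopped_speaking event.
    return ev.old_state == "speaking" and ev.new_state != "speaking"


def make_handler(
    on_start: Callable[[], None], on_stop: Callable[[], None]
) -> Callable[[StateChange], None]:
    """Adapt two legacy-style callbacks to a single state-change handler."""

    def handle(ev: StateChange) -> None:
        if started_speaking(ev):
            on_start()
        elif stopped_speaking(ev):
            on_stop()

    return handle


log: list[str] = []
handler = make_handler(lambda: log.append("started"), lambda: log.append("stopped"))

for event in [
    StateChange("listening", "speaking"),
    StateChange("speaking", "listening"),
    StateChange("listening", "away"),  # neither callback fires here
]:
    handler(event)
```

Because the helpers read only old_state and new_state, a handler built this way could be called from a function registered with session.on("user_state_changed") or session.on("agent_state_changed").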
Other events
Agent events were overhauled in version 1.0. For details, see the events page.
Removed features
OpenAI Assistants API support has been removed in 1.0.
The beta integration with the Assistants API in the OpenAI LLM plugin has been removed. Its stateful model made it difficult to keep state consistent between the API and the agent.