Overview
This guide covers the full testing API for LiveKit Agents, including test setup, result navigation, assertions, mocking, and multi-turn conversation testing. The examples use pytest for Python and Vitest for Node.js, but are adaptable to other testing frameworks.
When restructuring your project to add tests, ensure you update your Dockerfile too if you move your agent entrypoint file. The default template assumes src/agent.py for Python projects. See Builds and Dockerfiles for details.
Installation
You must install both the pytest and pytest-asyncio packages to write tests for your agent.
```shell
uv add pytest pytest-asyncio
```
You must install vitest to write tests for your agent.
```shell
pnpm add -D vitest
```
Always call initializeLogger({ pretty: false, level: 'warn' }) at the top of your test files to suppress verbose CLI output.
Test setup
Each test typically follows the same pattern:
```python
import pytest

from livekit.agents import AgentSession, inference

# Import your agent class
from agent import Assistant


@pytest.mark.asyncio  # Or your async testing framework of choice
async def test_your_agent() -> None:
    async with (
        # You must create an LLM instance for the `judge` method
        inference.LLM(model="openai/gpt-5.3-chat-latest") as llm,
        # Create a session for the life of this test.
        # LLM is not required - it will use the agent's LLM if you don't provide one here
        AgentSession(llm=llm) as session,
    ):
        # Start the agent in the session
        await session.start(Assistant())

        # Run a single conversation turn based on the given user input
        result = await session.run(user_input="Hello")

        # ...your assertions go here...
```
```typescript
import { inference, initializeLogger, voice } from '@livekit/agents';
import { describe, it, beforeAll, afterAll } from 'vitest';

// Import your agent class
import { Agent } from './agent';

// Initialize logger to suppress CLI output
initializeLogger({ pretty: false, level: 'warn' });

const { AgentSession } = voice;

describe('YourAgent', () => {
  let session: voice.AgentSession;
  let llm: inference.LLM;

  beforeAll(async () => {
    // You must create an LLM instance for the `judge` method
    llm = new inference.LLM({ model: 'openai/gpt-5.3-chat-latest' });

    // Create a session for the life of this test.
    // LLM is not required - it will use the agent's LLM if you don't provide one here
    session = new AgentSession({ llm });

    // Start the agent in the session
    await session.start({ agent: new Agent() });
  });

  afterAll(async () => {
    await session?.close();
  });

  it('should test your agent', async () => {
    // Run a single conversation turn based on the given user input
    const result = await session.run({ userInput: 'Hello' }).wait();

    // ...your assertions go here...
  });
});
```
Result structure
The run method executes a single conversation turn and returns a RunResult, which contains each of the events that occurred during the turn, in order, and offers a fluent assertion API.
A simple turn with no tool calls produces a single event: the assistant's reply message.
However, a more complex turn may contain tool calls, tool outputs, handoffs, and one or more messages.
To validate these multi-part turns, you can use any of the following approaches.
Sequential navigation
- Step through events one at a time with `next_event()`.
- Validate each event with `is_*` assertions like `is_message()`.
- Call `no_more_events()` at the end to assert no unexpected events remain.
For example, to validate that the agent responds with a friendly greeting, you can use the following code:
```python
result.expect.next_event().is_message(role="assistant")
```
```typescript
result.expect.nextEvent().isMessage({ role: 'assistant' });
```
Skipping events
You can also skip events without validation:
- `skip_next(n)`: Skip one or more events. Defaults to 1.
- `skip_next_event_if(type, ...)`: Skip the next event only if it matches the given type and optional filters (for example, `role` for messages, `name` for function calls). Returns the matching Assert, or `None` if the next event doesn't match.
- `next_event(type=...)`: Advance to the next event of the given type, skipping everything else. Raises an assertion error if no match is found.
Example:
```python
result.expect.skip_next()  # skips one event
result.expect.skip_next(2)  # skips two events

# Skips the next event if it's an assistant message
result.expect.skip_next_event_if(type="message", role="assistant")

# Skips the next event if it's a call to lookup_weather
result.expect.skip_next_event_if(type="function_call", name="lookup_weather")

# Advances to the next function call, skipping non-function-call events.
# Raises an assertion error if not found.
result.expect.next_event(type="function_call")
```
```typescript
result.expect.skipNext(); // skips one event
result.expect.skipNext(2); // skips two events

// Skips the next event if it's an assistant message
result.expect.skipNextEventIf({ type: 'message', role: 'assistant' });

// Advances to the next assistant message, skipping anything else.
// If no matching event is found, an assertion error is raised.
result.expect.nextEvent({ type: 'message', role: 'assistant' });
```
Passing a type to next_event() returns a type-specific Assert (for example, FunctionCallAssert) that doesn't have is_* methods. Don't chain .is_function_call() after next_event(type="function_call").
To assert additional properties like function name, either omit type and chain the is_* method, or check the event directly:
```python
# Option 1: chain is_function_call on a generic EventAssert
result.expect.next_event().is_function_call(name="lookup_weather")

# Option 2: advance to any function call, then check the name
fnc = result.expect.next_event(type="function_call")
assert fnc.event().item.name == "lookup_weather"
```
Indexed access
Access a specific event by index without advancing the cursor. You can use negative indices to count from the end of the list; for example, -1 refers to the last event.
```python
result.expect[0].is_message(role="assistant")
```
```typescript
result.expect.at(0).isMessage({ role: 'assistant' });
```
Search
Search for events regardless of position with contains_* methods like contains_message(). You can also search within a range using slices ([:] in Python, .range() in Node.js).
```python
result.expect.contains_message(role="assistant")
result.expect[0:2].contains_message(role="assistant")
```
```typescript
result.expect.containsMessage({ role: 'assistant' });
result.expect.range(0, 2).containsMessage({ role: 'assistant' });
```
Assertions
The test framework includes assertion helpers to validate messages, tool calls, and agent handoffs within each result. Use exact assertions like is_message() to check a specific event, or search assertions like contains_message() to find a match anywhere in a range of events.
Message assertions
Use is_message() and contains_message() to test individual messages. Both accept an optional role argument.
```python
result.expect.next_event().is_message(role="assistant")
result.expect[0:2].contains_message(role="assistant")
```
```typescript
result.expect.nextEvent().isMessage({ role: 'assistant' });
result.expect.range(0, 2).containsMessage({ role: 'assistant' });
```
Access additional properties with the event() method:
- `event().item.content` - Message content
- `event().item.role` - Message role
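To make the shape concrete, here is a minimal sketch using hypothetical stand-in dataclasses (the real event and message classes ship with LiveKit Agents and carry additional fields; only the `role` and `content` access pattern is taken from this guide):

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; not the real LiveKit classes.
@dataclass
class ChatMessage:
    role: str
    content: list


@dataclass
class MessageEvent:
    item: ChatMessage


# A turn's message event, as returned by event() on a message assertion
event = MessageEvent(item=ChatMessage(role="assistant", content=["Hello! How can I help?"]))

# Read the same properties the assertion API exposes via event().item
assert event.item.role == "assistant"
assert "help" in event.item.content[0].lower()
```

This is useful when an exact-match or LLM-based assertion is too coarse and you want a plain Python check on the message text.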
LLM-based judgment
Use judge() to evaluate whether a message matches a given intent. Pass an LLM instance and an intent string describing the expected content. The LLM judges the message against the intent without surrounding conversation context.
```python
result = await session.run(user_input="Hello")

await (
    result.expect.next_event()
    .is_message(role="assistant")
    .judge(llm, intent="Offers a friendly introduction and offer of assistance.")
)
```
```typescript
const result = await session.run({ userInput: 'Hello' }).wait();

await result.expect.nextEvent().isMessage({ role: 'assistant' }).judge(llm, {
  intent: 'Offers a friendly introduction and offer of assistance.',
});
```
The llm argument can be any LLM instance and does not need to be the same one used in the agent itself.
Tool call assertions
Test three aspects of tool use:
- Function calls: The agent calls the correct tool with the correct arguments.
- Function call outputs: The tool returns the expected output.
- Agent response: The agent responds appropriately based on the tool output.
The following example tests all three:
```python
result = await session.run(user_input="What's the weather in Tokyo?")

# Test that the agent's first conversation item is a function call
fnc_call = result.expect.next_event().is_function_call(
    name="lookup_weather", arguments={"location": "Tokyo"}
)

# Test that the tool returned the expected output to the agent
result.expect.next_event().is_function_call_output(
    output="sunny with a temperature of 70 degrees."
)

# Test that the agent's response is appropriate based on the tool output
await (
    result.expect.next_event()
    .is_message(role="assistant")
    .judge(
        llm,
        intent="Informs the user that the weather in Tokyo is sunny with a temperature of 70 degrees.",
    )
)

# Verify the agent's turn is complete, with no additional messages or function calls
result.expect.no_more_events()
```
```typescript
const result = await session.run({ userInput: "What's the weather in Tokyo?" }).wait();

// Test that the agent's first conversation item is a function call
result.expect.nextEvent().isFunctionCall({ name: 'getWeather', args: { location: 'Tokyo' } });

// Test that the tool returned the expected output to the agent
result.expect.nextEvent().isFunctionCallOutput();

// Test that the agent's response is appropriate based on the tool output
await result.expect.nextEvent().isMessage({ role: 'assistant' }).judge(llm, {
  intent: 'Informs the user that the weather in Tokyo is sunny with a temperature of 70 degrees.',
});

// Verify the agent's turn is complete, with no additional messages or function calls
result.expect.noMoreEvents();
```
Access individual properties with the event() method:
- `is_function_call().event().item.name` - Function name
- `is_function_call().event().item.arguments` - Function arguments
- `is_function_call_output().event().item.output` - Raw function output
- `is_function_call_output().event().item.is_error` - Whether the output is an error
- `is_function_call_output().event().item.call_id` - The function call ID
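As a sketch of the shapes involved, the following uses hypothetical stand-in dataclasses (the real function call and output item classes ship with LiveKit Agents; only the property names above are taken from this guide):

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; not the real LiveKit classes.
@dataclass
class FunctionCall:
    name: str
    arguments: dict


@dataclass
class FunctionCallOutput:
    call_id: str
    output: str
    is_error: bool


# The items a tool-call turn produces, as exposed via event().item
call = FunctionCall(name="lookup_weather", arguments={"location": "Tokyo"})
out = FunctionCallOutput(
    call_id="call_1",
    output="sunny with a temperature of 70 degrees.",
    is_error=False,
)

# Plain Python checks on the same properties listed above
assert call.name == "lookup_weather"
assert call.arguments["location"] == "Tokyo"
assert out.is_error is False
```

Direct property access like this is handy when you need to assert on details the fluent API doesn't cover, such as comparing the output against a computed value.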
Agent handoff assertions
Use is_agent_handoff() and contains_agent_handoff() to test that the agent performs a handoff to a new agent.
```python
# The next event must be an agent handoff to the specified agent
result.expect.next_event().is_agent_handoff(new_agent_type=MyAgent)

# A handoff must occur somewhere in the turn
result.expect.contains_agent_handoff(new_agent_type=MyAgent)
```
```typescript
// The next event must be an agent handoff to the specified agent
result.expect.nextEvent().isAgentHandoff({ newAgentType: MyAgent });

// A handoff must occur somewhere in the turn
result.expect.containsAgentHandoff({ newAgentType: MyAgent });
```
Mocking tools
In many cases, you should mock your tools for testing. Mocks make it easy to exercise edge cases, such as errors and other unexpected behavior, and they avoid dependencies on external services that you don't need to test against.
mock_tools requires LiveKit Agents 1.2.6 or later.
Use the mock_tools helper in a with block to mock one or more tools for a specific Agent. To mock a tool that raises an error:
```python
from livekit.agents import mock_tools

# Mock a tool error
with mock_tools(
    Assistant,
    {"lookup_weather": lambda: RuntimeError("Weather service is unavailable")},
):
    result = await session.run(user_input="What's the weather in Tokyo?")
    await result.expect.next_event(type="message").judge(
        llm,
        intent="Should inform the user that an error occurred while looking up the weather.",
    )
```
For more complex mocks, pass a function instead of a lambda:
```python
def _mock_weather_tool(location: str) -> str:
    if location == "Tokyo":
        return "sunny with a temperature of 70 degrees."
    else:
        return "UNSUPPORTED_LOCATION"


# Mock a specific tool response
with mock_tools(Assistant, {"lookup_weather": _mock_weather_tool}):
    result = await session.run(user_input="What's the weather in Tokyo?")
    await result.expect.next_event(type="message").judge(
        llm,
        intent="Should indicate the weather in Tokyo is sunny with a temperature of 70 degrees.",
    )

    result = await session.run(user_input="What's the weather in Paris?")
    await result.expect.next_event(type="message").judge(
        llm,
        intent="Should indicate that weather lookups in Paris are not supported.",
    )
```
Testing multiple turns
You can test multiple turns of a conversation by executing the run method multiple times. The conversation history builds automatically across turns.
```python
# First turn
result1 = await session.run(user_input="Hello")
await result1.expect.next_event().is_message(role="assistant").judge(
    llm, intent="Friendly greeting"
)

# Second turn builds on conversation history
result2 = await session.run(user_input="What's the weather like in Tokyo?")
result2.expect.next_event().is_function_call(name="lookup_weather")
result2.expect.next_event().is_function_call_output()
await result2.expect.next_event().is_message(role="assistant").judge(
    llm, intent="Provides weather information"
)
```
```typescript
// First turn
const result1 = await session.run({ userInput: 'Hello' }).wait();
await result1.expect.nextEvent().isMessage({ role: 'assistant' }).judge(llm, {
  intent: 'Friendly greeting',
});

// Second turn builds on conversation history
const result2 = await session.run({ userInput: "What's the weather like in Tokyo?" }).wait();
result2.expect.nextEvent().isFunctionCall({ name: 'getWeather' });
result2.expect.nextEvent().isFunctionCallOutput();
await result2.expect.nextEvent().isMessage({ role: 'assistant' }).judge(llm, {
  intent: 'Provides weather information',
});
```
Loading conversation history
To load conversation history manually, use the ChatContext class just as in your agent code:
```python
from livekit.agents import ChatContext

agent = Assistant()
await session.start(agent)

# update_chat_ctx is on the Agent instance, not the session.
# In tests where you don't hold a reference, use session.current_agent.
chat_ctx = ChatContext()
chat_ctx.add_message(role="user", content="My name is Alice")
chat_ctx.add_message(role="assistant", content="Nice to meet you, Alice!")
await agent.update_chat_ctx(chat_ctx)

# Test that the agent remembers the context
result = await session.run(user_input="What's my name?")
await result.expect.next_event().is_message(role="assistant").judge(
    llm, intent="Should remember and mention the user's name is Alice"
)
```
```typescript
// Rename the import to avoid shadowing the `llm` test variable passed to judge()
import { llm as llmLib } from '@livekit/agents';

const { ChatContext } = llmLib;

const agent = new Assistant();
await session.start({ agent });

// updateChatCtx is on the Agent instance, not the session.
// In tests where you don't hold a reference, use session.currentAgent.
const chatCtx = new ChatContext();
chatCtx.addMessage({ role: 'user', content: 'My name is Alice' });
chatCtx.addMessage({ role: 'assistant', content: 'Nice to meet you, Alice!' });
await agent.updateChatCtx(chatCtx);

// Test that the agent remembers the context
const result = await session.run({ userInput: "What's my name?" }).wait();
await result.expect.nextEvent().isMessage({ role: 'assistant' }).judge(llm, {
  intent: "Should remember and mention the user's name is Alice",
});
```