
Tool definition and use

Let your agents call external tools and more.

Overview

LiveKit Agents has full support for LLM tool use. This feature allows you to create a custom library of tools to extend your agent's context, create interactive experiences, and overcome LLM limitations. Within a tool, you can call external APIs, hand off control to another agent in a workflow, forward requests to your frontend over RPC, and run any other code your application needs.

Tool definition

The LLM has access to any tools you add to your agent class.

Add tools to your agent class with the @function_tool decorator.

from typing import Any

from livekit.agents import function_tool, Agent, RunContext


class MyAgent(Agent):
    @function_tool()
    async def lookup_weather(
        self,
        context: RunContext,
        location: str,
    ) -> dict[str, Any]:
        """Look up weather information for a given location.

        Args:
            location: The location to look up weather information for.
        """
        return {"weather": "sunny", "temperature_f": 70}
Best practices

A good tool definition is key to reliable tool use from your LLM. Be specific about what the tool does, when it should or should not be used, what the arguments are for, and what type of return value to expect.
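For example, a tool docstring might spell out when to use the tool, what its argument means, and what it returns. The following is an illustrative sketch only; the tool name, fields, and wording are placeholders:

@function_tool()
async def check_order_status(
    self,
    context: RunContext,
    order_id: str,
) -> dict[str, Any]:
    """Look up the current status of an existing order.

    Use this tool only after the user provides an order ID. Do not use it to
    place or modify orders.

    Args:
        order_id: The alphanumeric order ID, for example "A12345".

    Returns:
        A dictionary with the order status and an estimated delivery date.
    """
    # Placeholder implementation for illustration only
    return {"status": "shipped", "estimated_delivery": "2025-06-01"}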

Name and description

By default, the tool name is the name of the function, and the description is its docstring. Override this behavior with the name and description arguments to the @function_tool decorator.
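For example, to expose the tool under a different name with a more prescriptive description (the values shown are illustrative):

@function_tool(
    name="get_weather",
    description="Look up the current weather for a city. Use only when the user asks about weather.",
)
async def lookup_weather(self, context: RunContext, location: str) -> dict[str, Any]:
    return {"weather": "sunny", "temperature_f": 70}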

Arguments

The tool arguments are copied automatically by name from the function arguments. Type hints for arguments are included, if present.

Place additional information about the tool arguments, if needed, in the tool description.

Return value

The tool return value is automatically converted to a string before being sent to the LLM. The LLM generates a new reply or additional tool calls based on the return value. Return None or nothing at all to complete the tool silently without requiring a reply from the LLM.
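For example, a tool that only records information and needs no spoken follow-up can return nothing. This is a minimal sketch; the logging logic is a placeholder:

@function_tool()
async def log_feedback(self, context: RunContext, feedback: str) -> None:
    # Store the feedback, then complete silently. Returning None means the LLM
    # is not prompted to generate a reply for this tool call.
    print(f"User feedback received: {feedback}")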

You can use the return value to initiate a handoff to a different Agent within a workflow. Optionally, you can return a tool result to the LLM as well. The tool call and subsequent LLM reply are completed prior to the handoff.

In Python, return a tuple that includes both the Agent instance and the result. If there is no tool result, you can return the new Agent instance by itself.

In Node.js, return an instance of llm.handoff, which specifies the new Agent instance and the tool's return value, if any.

When a handoff occurs, prompt the LLM to inform the user:

@function_tool()
async def my_tool(context: RunContext):
    return SomeAgent(), "Transferring the user to SomeAgent"

Structured output

Some LLMs can return structured JSON payloads that define behavior like TTS style separately from the spoken text.

In this example, the LLM streams a JSON object that has both TTS style directives and a spoken response. The TTS style is applied once per message and the spoken response is stripped out for downstream processing. The example contains two code blocks: the format of the JSON and the parsing logic, and an implementation example in an agent workflow.

Tip

This example uses a cast for the LLM and TTS instances. It's specifically built to work with OpenAI (or OpenAI-compatible) APIs. Read more in the OpenAI Structured Outputs docs.

See the following example for the full implementation:

Structured Output

Handle structured output from the LLM by overriding the `llm_node` and `tts_node` methods.

Core components: Definition and parsing

This code block has two components: the ResponseEmotion schema definition and the process_structured_output parsing function.

  • ResponseEmotion: Defines the structure of the JSON object, with both the TTS style directives (voice_instructions) and the spoken response.

  • process_structured_output: Incrementally parses the JSON object, optionally applies a callback for TTS style directives, and only streams the spoken response.

from typing import Annotated, AsyncIterable, Callable, Optional, TypedDict

from pydantic import Field
from pydantic_core import from_json


class ResponseEmotion(TypedDict):
    voice_instructions: Annotated[
        str,
        Field(..., description="Concise TTS directive for tone, emotion, intonation, and speed"),
    ]
    response: str


async def process_structured_output(
    text: AsyncIterable[str],
    callback: Optional[Callable[[ResponseEmotion], None]] = None,
) -> AsyncIterable[str]:
    last_response = ""
    acc_text = ""
    async for chunk in text:
        acc_text += chunk
        try:
            resp: ResponseEmotion = from_json(acc_text, allow_partial="trailing-strings")
        except ValueError:
            continue

        if callback:
            callback(resp)

        if not resp.get("response"):
            continue

        new_delta = resp["response"][len(last_response) :]
        if new_delta:
            yield new_delta
        last_response = resp["response"]

Agent method implementation

This agent implementation example overrides default behavior with custom logic using the LLM and TTS nodes: llm_node and tts_node.

  • llm_node: Casts the LLM instance to the OpenAI type and streams its output with the ResponseEmotion schema set as the response format.

  • tts_node: Processes the streamed JSON with a callback that applies the TTS style directives (voice_instructions), then streams the audio from the response.

async def llm_node(
    self, chat_ctx: ChatContext, tools: list[FunctionTool], model_settings: ModelSettings
):
    # not all LLMs support structured output, so we need to cast to the specific LLM type
    llm = cast(openai.LLM, self.llm)
    tool_choice = model_settings.tool_choice if model_settings else NOT_GIVEN
    async with llm.chat(
        chat_ctx=chat_ctx,
        tools=tools,
        tool_choice=tool_choice,
        response_format=ResponseEmotion,
    ) as stream:
        async for chunk in stream:
            yield chunk


async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
    instruction_updated = False

    def output_processed(resp: ResponseEmotion):
        nonlocal instruction_updated
        if resp.get("voice_instructions") and resp.get("response") and not instruction_updated:
            # when the response isn't empty, we can assume voice_instructions is complete.
            # (if the LLM sent the fields in the right order)
            instruction_updated = True
            logger.info(
                f"Applying TTS instructions before generating response audio: "
                f'"{resp["voice_instructions"]}"'
            )

            tts = cast(openai.TTS, self.tts)
            tts.update_options(instructions=resp["voice_instructions"])

    # process_structured_output strips the TTS instructions and only synthesizes the verbal part
    # of the LLM output
    return Agent.default.tts_node(
        self, process_structured_output(text, callback=output_processed), model_settings
    )

RunContext

Tools support a special context argument of type RunContext, which provides access to the current session, function_call, speech_handle, and userdata. Consult the documentation on speech and state within workflows for more information about how to use these features.
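For example, a tool can read and update shared state through userdata and speak through the session. The following is a minimal sketch; the MyUserData class and its fields are hypothetical, and it assumes the session was created with userdata=MyUserData():

from dataclasses import dataclass, field

from livekit.agents import function_tool, RunContext


@dataclass
class MyUserData:
    # Hypothetical per-session state; define whatever fields your app needs
    order_items: list[str] = field(default_factory=list)


@function_tool()
async def add_to_order(context: RunContext[MyUserData], item: str) -> str:
    """Add an item to the user's current order."""
    # userdata holds the object passed to AgentSession(userdata=...)
    context.userdata.order_items.append(item)

    # The current session is also available, for example to speak immediately
    await context.session.say(f"Adding {item} to your order.")
    return f"{item} was added to the order."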

Adding tools dynamically

You can exercise more control over the tools available by setting the tools argument directly.

To share a tool between multiple agents, define it outside of their class and then provide it to each. The RunContext is especially useful for this purpose to access the current session, agent, and state.

Tools passed in the tools argument are available alongside any tools registered within the class using the @function_tool decorator.

from livekit.agents import function_tool, Agent, RunContext


@function_tool()
async def lookup_user(
    context: RunContext,
    user_id: str,
) -> dict:
    """Look up a user's information by ID."""
    return {"name": "John Doe", "email": "john.doe@example.com"}


class AgentA(Agent):
    def __init__(self):
        super().__init__(
            tools=[lookup_user],
            # ...
        )


class AgentB(Agent):
    def __init__(self):
        super().__init__(
            tools=[lookup_user],
            # ...
        )

Use agent.update_tools() to update available tools after creating an agent. This replaces all tools, including those registered automatically within the agent class. To reference existing tools before replacement, access the agent.tools property:

# add a tool
await agent.update_tools(agent.tools + [tool_a])

# remove a tool
await agent.update_tools([tool for tool in agent.tools if tool != tool_a])

# replace all tools
await agent.update_tools([tool_a, tool_b])

Creating tools programmatically

To create a tool on the fly, use function_tool as a function rather than as a decorator. You must supply a name, description, and callable function. This is useful to compose specific tools based on the same underlying code or load them from external sources such as a database or Model Context Protocol (MCP) server.

In the following example, the app has a single function to set any user profile field but gives the agent one tool per field for improved reliability:

from livekit.agents import function_tool, Agent, RunContext


class Assistant(Agent):
    def _set_profile_field_func_for(self, field: str):
        async def set_value(context: RunContext, value: str):
            # custom logic to set input
            return f"field {field} was set to {value}"

        return set_value

    def __init__(self):
        super().__init__(
            tools=[
                function_tool(
                    self._set_profile_field_func_for("phone"),
                    name="set_phone_number",
                    description="Call this function when user has provided their phone number.",
                ),
                function_tool(
                    self._set_profile_field_func_for("email"),
                    name="set_email",
                    description="Call this function when user has provided their email.",
                ),
                # ... other tools ...
            ],
            # instructions, etc ...
        )

Creating tools from raw schema

For advanced use cases, you can create tools directly from a raw function calling schema. This is useful when integrating with existing function definitions, loading tools from external sources, or working with schemas that don't map cleanly to Python function signatures.

Use the raw_schema parameter in the @function_tool decorator to provide the full function schema:

from livekit.agents import function_tool, RunContext

raw_schema = {
    "type": "function",
    "name": "get_weather",
    "description": "Get weather for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and country e.g. New York"
            }
        },
        "required": ["location"],
        "additionalProperties": False,
    },
}


@function_tool(raw_schema=raw_schema)
async def get_weather(raw_arguments: dict[str, object], context: RunContext):
    location = raw_arguments["location"]

    # Your implementation here

    return f"The weather of {location} is ..."

When using raw schemas, function parameters are passed to your handler as a dictionary named raw_arguments. You can extract values from this dictionary using the parameter names defined in your schema.

You can also create tools programmatically using function_tool as a function with a raw schema:

from livekit.agents import function_tool, Agent, RunContext


def create_database_tool(table_name: str, operation: str):
    schema = {
        "type": "function",
        "name": f"{operation}_{table_name}",
        "description": f"Perform {operation} operation on {table_name} table",
        "parameters": {
            "type": "object",
            "properties": {
                "record_id": {
                    "type": "string",
                    "description": f"ID of the record to {operation}"
                }
            },
            "required": ["record_id"],
        },
    }

    async def handler(raw_arguments: dict[str, object], context: RunContext):
        record_id = raw_arguments["record_id"]

        # Perform database operation
        return f"Performed {operation} on {table_name} for record {record_id}"

    return function_tool(handler, raw_schema=schema)


# Create tools dynamically
user_tools = [
    create_database_tool("users", "read"),
    create_database_tool("users", "update"),
    create_database_tool("users", "delete"),
]


class DataAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a database assistant.",
            tools=user_tools,
        )

Error handling

Raise the ToolError exception to return an error to the LLM in place of a response. You can include a custom message to describe the error and/or recovery options.

@function_tool()
async def lookup_weather(
    self,
    context: RunContext,
    location: str,
) -> dict[str, Any]:
    if location == "mars":
        raise ToolError("This location is coming soon. Please join our mailing list to stay updated.")
    else:
        return {"weather": "sunny", "temperature_f": 70}

Model Context Protocol (MCP)

Available in Python only.

LiveKit Agents has full support for MCP servers to load tools from external sources.

To use it, first install the mcp optional dependencies:

pip install "livekit-agents[mcp]~=1.2"

Then pass the MCP server URL to the AgentSession or Agent constructor. The tools will be automatically loaded like any other tool.

from livekit.agents import mcp

session = AgentSession(
    # ... other arguments ...
    mcp_servers=[
        mcp.MCPServerHTTP(
            "https://your-mcp-server.com"
        )
    ]
)

from livekit.agents import mcp

agent = Agent(
    # ... other arguments ...
    mcp_servers=[
        mcp.MCPServerHTTP(
            "https://your-mcp-server.com"
        )
    ]
)

Forwarding to the frontend

Forward tool calls to a frontend app using RPC. This is useful when the data needed to fulfill the function call is only available at the frontend. You may also use RPC to trigger actions or UI updates in a structured way.

For instance, here's a function that accesses the user's live location from their web browser:

Agent implementation

import json

from livekit.agents import function_tool, get_job_context, RunContext, ToolError


@function_tool()
async def get_user_location(
    context: RunContext,
    high_accuracy: bool
):
    """Retrieve the user's current geolocation as lat/lng.

    Args:
        high_accuracy: Whether to use high accuracy mode, which is slower but more precise

    Returns:
        A dictionary containing latitude and longitude coordinates
    """
    try:
        room = get_job_context().room
        participant_identity = next(iter(room.remote_participants))
        response = await room.local_participant.perform_rpc(
            destination_identity=participant_identity,
            method="getUserLocation",
            payload=json.dumps({
                "highAccuracy": high_accuracy
            }),
            response_timeout=10.0 if high_accuracy else 5.0,
        )
        return response
    except Exception:
        raise ToolError("Unable to retrieve user location")

Frontend implementation

The following example uses the JavaScript SDK. The same pattern works for other SDKs. For more examples, see the RPC documentation.

import { RpcError, RpcInvocationData } from 'livekit-client';

localParticipant.registerRpcMethod(
  'getUserLocation',
  async (data: RpcInvocationData) => {
    try {
      let params = JSON.parse(data.payload);
      const position: GeolocationPosition = await new Promise((resolve, reject) => {
        navigator.geolocation.getCurrentPosition(resolve, reject, {
          enableHighAccuracy: params.highAccuracy ?? false,
          timeout: data.responseTimeout,
        });
      });

      return JSON.stringify({
        latitude: position.coords.latitude,
        longitude: position.coords.longitude,
      });
    } catch (error) {
      throw new RpcError(1, "Could not retrieve user location");
    }
  }
);

Slow tool calls

For best practices on providing feedback to the user during long-running tool calls, see the section on user feedback in the External data and RAG guide.

External tools and MCP

To load tools from an external source, such as a Model Context Protocol (MCP) server, use the function_tool function and register the tools with the tools property or the update_tools() method; a rough sketch of this pattern appears after the example card below. See the following example for a complete MCP implementation:


MCP Agent

A voice AI agent with an integrated Model Context Protocol (MCP) server for the LiveKit API.
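The snippet below is a rough sketch of that pattern under stated assumptions: the list of raw schemas and the call_external_tool callable stand in for whatever your external source provides, while function_tool and update_tools are the SDK pieces described earlier on this page.

from livekit.agents import function_tool, RunContext


def make_external_tool(schema: dict, call_external_tool):
    # call_external_tool is a hypothetical async callable that forwards the
    # invocation to your external source (database, MCP server, and so on)
    async def handler(raw_arguments: dict[str, object], context: RunContext):
        return await call_external_tool(schema["name"], raw_arguments)

    return function_tool(handler, raw_schema=schema)


async def register_external_tools(agent, schemas: list[dict], call_external_tool):
    tools = [make_external_tool(schema, call_external_tool) for schema in schemas]

    # Register the loaded tools alongside the agent's existing tools
    await agent.update_tools(agent.tools + tools)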
