This documentation is for v0.x of the LiveKit Agents framework.
For the latest documentation, see Tool definition and use.
Function calling (also known as "tool calling" or "tool use") is a powerful LLM capability that allows AI models to interact with external functions and tools, enabling your agent to retrieve additional context before generating a response or to take real-world actions.
For example, when a user says "What's the weather like in New York?", the LLM can intelligently detect the need to call your weather function before it responds. As another example, you could use function calling to trigger an action in the frontend, such as presenting a map to the user or querying their current location.
Both VoicePipelineAgent and MultimodalAgent have built-in support for function calling, making it easy to create rich voice-powered applications.
Usage
```python
import aiohttp
import logging
from typing import Annotated

from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.agents.multimodal import MultimodalAgent

logger = logging.getLogger("weather-agent")


# first define a class that inherits from llm.FunctionContext
class AssistantFnc(llm.FunctionContext):
    # the llm.ai_callable decorator marks this function as a tool available to the LLM
    # by default, it'll use the docstring as the function's description
    @llm.ai_callable()
    async def get_weather(
        self,
        # by using the Annotated type, arg description and type are available to the LLM
        location: Annotated[
            str, llm.TypeInfo(description="The location to get the weather for")
        ],
    ):
        """Called when the user asks about the weather. This function will return the weather for the given location."""
        logger.info(f"getting weather for {location}")
        url = f"https://wttr.in/{location}?format=%C+%t"
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                if response.status == 200:
                    weather_data = await response.text()
                    # the return value of the function call is sent back to the LLM
                    # as a tool response. The LLM's next response will include this data
                    return f"The weather in {location} is {weather_data}."
                else:
                    # raise an exception (not a bare string) so the failure is surfaced properly
                    raise Exception(
                        f"Failed to get weather data, status code: {response.status}"
                    )


fnc_ctx = AssistantFnc()

# pass the function context to the agent
pipeline_agent = VoicePipelineAgent(
    # ...
    fnc_ctx=fnc_ctx,
)

multimodal_agent = MultimodalAgent(
    # ...
    fnc_ctx=fnc_ctx,
)
```
```typescript
import { llm, multimodal } from '@livekit/agents';
import { z } from 'zod';

// first define an llm.FunctionContext object
const fncCtx: llm.FunctionContext = {
  // each function is listed by name, making it available to the LLM
  weather: {
    // description is passed to the LLM
    description: 'Get the weather in a location',
    // use zod (https://zod.dev/) for parameter typing and automatic schema generation for the LLM
    parameters: z.object({
      location: z.string().describe('The location to get the weather for'),
    }),
    // this function becomes available as a tool for the LLM, and will be called with the parameters defined above
    execute: async ({ location }) => {
      console.info(`getting weather for ${location}`);
      const url = `https://wttr.in/${location}?format=%C+%t`;
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`Failed to get weather data, status code: ${response.status}`);
      }
      const weatherData = await response.text();
      // the return value of the function call is sent back to the LLM
      // as a tool response. The LLM's next response will include this data
      return `The weather in ${location} is ${weatherData}.`;
    },
  },
};

// pass the function context to the agent
const agent = new multimodal.MultimodalAgent({
  // ...
  fncCtx,
});
```
Forwarding to the frontend
Function calls can be forwarded to a frontend application using RPC. This is useful when the data needed to fulfill the function call is only available at the frontend. It can also be used to trigger actions or UI updates in a structured way.
For instance, here's a function that accesses the user's live location from their web browser:
Agent implementation
```python
import json
from typing import Annotated

from livekit.agents import llm


class AssistantFnc(llm.FunctionContext):
    @llm.ai_callable()
    async def get_user_location(
        self,
        high_accuracy: Annotated[
            bool,
            llm.TypeInfo(description="Whether to use high accuracy mode, which is slower"),
        ] = False,
    ):
        """Retrieve the user's current geolocation as lat/lng."""
        try:
            # `ctx` (the JobContext) and `participant` are assumed to be
            # available from your agent's entrypoint
            return await ctx.room.local_participant.perform_rpc(
                destination_identity=participant.identity,
                method="getUserLocation",
                payload=json.dumps({"highAccuracy": high_accuracy}),
                response_timeout=10.0 if high_accuracy else 5.0,
            )
        except Exception:
            return "Unable to retrieve user location"
```
```typescript
import { z } from 'zod';
import type { llm } from '@livekit/agents';

const fncCtx: llm.FunctionContext = {
  getUserLocation: {
    description: "Retrieve the user's current geolocation as lat/lng.",
    parameters: z.object({
      highAccuracy: z.boolean().describe('Whether to use high accuracy mode, which is slower'),
    }),
    execute: async (params) => {
      try {
        // `ctx` (the JobContext) and `participant` are assumed to be
        // available from your agent's entrypoint
        return await ctx.room.localParticipant!.performRpc({
          destinationIdentity: participant.identity,
          method: 'getUserLocation',
          payload: JSON.stringify(params),
          responseTimeout: params.highAccuracy ? 10_000 : 5_000,
        });
      } catch (error) {
        return 'Unable to retrieve user location';
      }
    },
  },
};
```
Frontend implementation
```typescript
import { RpcError, RpcInvocationData } from 'livekit-client';

localParticipant.registerRpcMethod(
  'getUserLocation',
  async (data: RpcInvocationData) => {
    try {
      const params = JSON.parse(data.payload);
      const position: GeolocationPosition = await new Promise((resolve, reject) => {
        navigator.geolocation.getCurrentPosition(resolve, reject, {
          enableHighAccuracy: params.highAccuracy ?? false,
          timeout: data.responseTimeout,
        });
      });
      return JSON.stringify({
        latitude: position.coords.latitude,
        longitude: position.coords.longitude,
      });
    } catch (error) {
      throw new RpcError(1, 'Could not retrieve user location');
    }
  },
);
```
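The same pattern works in the other direction for triggering UI updates: the agent's tool calls `performRpc` with coordinates, and the frontend handler updates the interface. The sketch below assumes a hypothetical `showMap` method name and `{ lat, lng }` payload shape; neither is part of the LiveKit API, and payload parsing is split into a plain function so it stays independently testable.

```typescript
// hypothetical "showMap" RPC: the agent-side tool would call
// performRpc({ method: 'showMap', payload: JSON.stringify({ lat, lng }), ... }),
// and the frontend registers handleShowMap via
// localParticipant.registerRpcMethod('showMap', (data) => handleShowMap(data.payload))

type ShowMapPayload = { lat: number; lng: number };

// validate the payload the agent sends before touching the UI
export function parseShowMapPayload(payload: string): ShowMapPayload {
  const parsed = JSON.parse(payload);
  if (typeof parsed.lat !== 'number' || typeof parsed.lng !== 'number') {
    throw new Error('showMap payload must include numeric lat and lng');
  }
  return { lat: parsed.lat, lng: parsed.lng };
}

export async function handleShowMap(payload: string): Promise<string> {
  const { lat, lng } = parseShowMapPayload(payload);
  // in a real app, update your map UI here, e.g. mapView.panTo({ lat, lng });
  // the returned string goes back to the agent as the RPC response
  return JSON.stringify({ displayed: true, lat, lng });
}
```

Returning a structured acknowledgment lets the LLM confirm to the user that the map was shown, just as the weather tool's return value flows back into the model's response.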