Function calling (also known as "tool calling" or "tool use") is a powerful LLM capability that allows AI models to interact with external functions and tools. This allows your agent to retrieve additional context before generating a response or take real-world actions.
For example, when a user says "What's the weather like in New York?", the LLM can intelligently detect the need to call your weather function before it responds. As another example, you could use function calling to trigger an action in the frontend, such as presenting a map to the user or querying their current location.
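To build intuition for what happens under the hood, here is a minimal, framework-free sketch of the tool-calling loop: the model emits a structured tool call (name plus JSON arguments), and the runtime dispatches it to a registered function. All names here (`TOOLS`, `tool`, `dispatch`) are hypothetical illustrations, not part of any LiveKit API.

```python
import json

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {}

def tool(fn):
    """Register a function so the runtime can dispatch LLM tool calls to it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(location: str) -> str:
    # A real implementation would call a weather API here.
    return f"Sunny, 22C in {location}"

def dispatch(tool_call: str) -> str:
    """Decode a tool call emitted by the model and run the matching function."""
    call = json.loads(tool_call)
    return TOOLS[call["name"]](**call["arguments"])

# When the model decides a tool is needed, it emits structured output like:
result = dispatch('{"name": "get_weather", "arguments": {"location": "New York"}}')
```

The agent frameworks below implement this same loop for you: they describe your registered functions to the LLM, parse its tool calls, invoke your code, and feed the return value back into the conversation.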
Both VoicePipelineAgent and MultimodalAgent have built-in support for function calling, making it easy to create rich voice-powered applications.
Usage
import aiohttp
import logging
from typing import Annotated

from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.agents.multimodal import MultimodalAgent

logger = logging.getLogger(__name__)


# first define a class that inherits from llm.FunctionContext
class AssistantFnc(llm.FunctionContext):
    # the llm.ai_callable decorator marks this function as a tool available to the LLM
    # by default, it'll use the docstring as the function's description
    @llm.ai_callable()
    async def get_weather(
        self,
        # by using the Annotated type, arg description and type are available to the LLM
        location: Annotated[
            str, llm.TypeInfo(description="The location to get the weather for")
        ],
    ):
        """Called when the user asks about the weather. This function will return the weather for the given location."""
        logger.info(f"getting weather for {location}")
        url = f"https://wttr.in/{location}?format=%C+%t"
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                if response.status == 200:
                    weather_data = await response.text()
                    # the return value of the function call is sent back to the LLM
                    # as a tool response. The LLM's next response will include this data
                    return f"The weather in {location} is {weather_data}."
                else:
                    raise Exception(
                        f"Failed to get weather data, status code: {response.status}"
                    )


fnc_ctx = AssistantFnc()

# pass the function context to the agent
pipeline_agent = VoicePipelineAgent(
    ...,
    fnc_ctx=fnc_ctx,
)
multimodal_agent = MultimodalAgent(
    ...,
    fnc_ctx=fnc_ctx,
)
Forwarding with RPC
Function calls can be forwarded to a frontend application using RPC. This is useful when the data needed to fulfill the function call is only available at the frontend. It can also be used to trigger actions or UI updates in a structured way.
For instance, here's a function that accesses the user's live location from their web browser:
Agent implementation
import json
from typing import Annotated

from livekit.agents import llm


class AssistantFnc(llm.FunctionContext):
    @llm.ai_callable()
    async def get_user_location(
        self,
        high_accuracy: Annotated[
            bool, llm.TypeInfo(description="Whether to use high accuracy mode, which is slower")
        ] = False,
    ):
        """Retrieve the user's current geolocation as lat/lng."""
        try:
            # ctx is the agent's job context and participant is the remote
            # participant to query; both are assumed to be in scope here
            return await ctx.room.local_participant.perform_rpc(
                destination_identity=participant.identity,
                method="getUserLocation",
                payload=json.dumps({"highAccuracy": high_accuracy}),
                response_timeout=10.0 if high_accuracy else 5.0,
            )
        except Exception:
            return "Unable to retrieve user location"
Frontend implementation
import { RpcError, RpcInvocationData } from 'livekit-client';

localParticipant.registerRpcMethod(
  'getUserLocation',
  async (data: RpcInvocationData) => {
    try {
      const params = JSON.parse(data.payload);
      const position: GeolocationPosition = await new Promise((resolve, reject) => {
        navigator.geolocation.getCurrentPosition(resolve, reject, {
          enableHighAccuracy: params.highAccuracy ?? false,
          timeout: data.responseTimeout,
        });
      });
      return JSON.stringify({
        latitude: position.coords.latitude,
        longitude: position.coords.longitude,
      });
    } catch (error) {
      throw new RpcError(1, 'Could not retrieve user location');
    }
  },
);