Restaurant agent

Build a multi-agent restaurant system using handoffs and shared state between agents.

Overview

In this recipe, build a voice AI restaurant system where a greeter agent routes callers to specialist agents for reservations, takeaway orders, and checkout. The example uses agent handoffs and demonstrates the following patterns:

  • Shared state with userdata: a typed state object stored in session.userdata tracks customer information, order details, and payment across agents.
  • Shared tools: tools like update_name and update_phone (updateName and updatePhone in Node.js) are defined once and included in multiple agents.
  • Context preservation: a base class copies truncated chat history from the previous agent so each specialist has conversational continuity.
  • Per-agent voice: each agent uses a distinct TTS voice to signal the transition to the caller.

When to use this pattern

Agent handoffs are one pattern among several for structuring multi-agent workflows. Handoffs are a good fit when:

  • Each phase has distinct instructions and tools that would bloat a single agent prompt.
  • The caller should hear a distinct voice or persona for each phase.
  • Transitions between phases are clear-cut and driven by user intent.

For alternatives, see the Workflows guide, which compares single-agent tools, the supervisor pattern, handoffs, and task groups.

Prerequisites

To complete this guide, you need the following prerequisites:

  • Create an agent using the Voice AI quickstart. This gives you a working project with API keys and dependencies installed. Replace the contents of your agent file with the code in this recipe.
  • Install the pyyaml package (Python only). The summarize helper uses YAML to serialize state for the LLM.

Define shared state

Start by defining a UserData type to hold everything the agents collect during a call: the customer name, phone number, order items, and payment details. Every agent and tool reads and writes to this single object through session.userdata.

The custom summarize method serializes the current state (as YAML in Python, as JSON in Node.js) and injects it into the chat context when an agent takes over, so each specialist knows what data has already been collected.

Add the following imports and UserData definition at the top of your agent file:

Python:

import logging
from dataclasses import dataclass, field
from typing import Annotated

import yaml
from dotenv import load_dotenv
from pydantic import Field

from livekit.agents import Agent, AgentServer, AgentSession, JobContext, RunContext, cli, inference
from livekit.agents.llm import function_tool
from livekit.plugins import silero

logger = logging.getLogger("restaurant-example")
logger.setLevel(logging.INFO)

load_dotenv()

# Each agent uses a distinct voice to signal transitions to the caller.
voices = {
    "greeter": "e07c00bc-4134-4eae-9ea4-1a55fb45746b",
    "reservation": "9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    "takeaway": "5ee9feff-1265-424a-9d7f-8e4d431a12c7",
    "checkout": "a167e0f3-df7e-4d52-a9c3-f949145efdab",
}


@dataclass
class UserData:
    customer_name: str | None = None
    customer_phone: str | None = None
    reservation_time: str | None = None
    order: list[str] | None = None
    customer_credit_card: str | None = None
    customer_credit_card_expiry: str | None = None
    customer_credit_card_cvv: str | None = None
    expense: float | None = None
    checked_out: bool | None = None
    agents: dict[str, Agent] = field(default_factory=dict)
    prev_agent: Agent | None = None

    def summarize(self) -> str:
        data = {
            "customer_name": self.customer_name or "unknown",
            "customer_phone": self.customer_phone or "unknown",
            "reservation_time": self.reservation_time or "unknown",
            "order": self.order or "unknown",
            "credit_card": {
                "number": self.customer_credit_card or "unknown",
                "expiry": self.customer_credit_card_expiry or "unknown",
                "cvv": self.customer_credit_card_cvv or "unknown",
            }
            if self.customer_credit_card
            else None,
            "expense": self.expense or "unknown",
            "checked_out": self.checked_out or False,
        }
        # YAML is more compact and easier for the LLM to parse than JSON.
        return yaml.dump(data)


RunContext_T = RunContext[UserData]

Node.js:

import {
  type JobContext,
  type JobProcess,
  ServerOptions,
  cli,
  dedent,
  defineAgent,
  inference,
  llm,
  voice,
} from '@livekit/agents';
import * as silero from '@livekit/agents-plugin-silero';
import { fileURLToPath } from 'node:url';
import { z } from 'zod';

// Each agent uses a distinct voice to signal transitions to the caller.
const voices = {
  greeter: 'e07c00bc-4134-4eae-9ea4-1a55fb45746b',
  reservation: '9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
  takeaway: '5ee9feff-1265-424a-9d7f-8e4d431a12c7',
  checkout: 'a167e0f3-df7e-4d52-a9c3-f949145efdab',
};

type UserData = {
  customer: Partial<{
    name: string;
    phone: string;
  }>;
  creditCard: Partial<{
    number: string;
    expiry: string;
    cvv: string;
  }>;
  reservationTime?: string;
  order?: string[];
  expense?: number;
  checkedOut?: boolean;
  agents: Record<string, voice.Agent<UserData>>;
  prevAgent?: voice.Agent<UserData>;
};

function createUserData(agents: Record<string, voice.Agent<UserData>>): UserData {
  return {
    customer: {},
    creditCard: {},
    agents,
  };
}

function summarize({
  customer,
  reservationTime,
  order,
  creditCard,
  expense,
  checkedOut,
}: UserData) {
  return JSON.stringify(
    {
      customer: customer.name ?? 'unknown',
      customerPhone: customer.phone ?? 'unknown',
      reservationTime: reservationTime ?? 'unknown',
      order: order ?? 'unknown',
      // creditCard is always an object, so test the number to decide
      // whether any card details have been collected yet.
      creditCard: creditCard.number
        ? {
            number: creditCard.number ?? 'unknown',
            expiry: creditCard.expiry ?? 'unknown',
            cvv: creditCard.cvv ?? 'unknown',
          }
        : undefined,
      expense: expense ?? 'unknown',
      checkedOut: checkedOut ?? false,
    },
    null,
    2,
  );
}

Define shared tools and a base agent

Define tools that multiple agents share, such as collecting a customer name or phone number. These are standalone tool definitions rather than methods on a single agent, so any agent can include them in its tool set.

The BaseAgent class handles the common entry logic that every specialist agent needs, in on_enter (Python) or onEnter (Node.js). When an agent becomes active, it copies truncated chat history from the previous agent and injects a system message with the current UserData state. This gives each specialist enough conversational context to continue naturally without carrying the full history.

Add the shared tools and base class below the UserData definition:

Python:

# Shared tools that multiple agents reuse.
@function_tool()
async def update_name(
    name: Annotated[str, Field(description="The customer's name")],
    context: RunContext_T,
) -> str:
    """Called when the user provides their name.
    Confirm the spelling with the user before calling the function."""
    userdata = context.userdata
    userdata.customer_name = name
    return f"The name is updated to {name}"


@function_tool()
async def update_phone(
    phone: Annotated[str, Field(description="The customer's phone number")],
    context: RunContext_T,
) -> str:
    """Called when the user provides their phone number.
    Confirm the spelling with the user before calling the function."""
    userdata = context.userdata
    userdata.customer_phone = phone
    return f"The phone number is updated to {phone}"


@function_tool()
async def to_greeter(context: RunContext_T) -> tuple[Agent, str]:
    """Called when user asks any unrelated questions or requests
    any other services not in your job description."""
    curr_agent: BaseAgent = context.session.current_agent
    return await curr_agent._transfer_to_agent("greeter", context)


class BaseAgent(Agent):
    """Base class that every specialist agent extends. Handles two things
    that all agents need: copying context from the previous agent on entry,
    and transferring control to the next agent on exit."""

    async def on_enter(self) -> None:
        """Called by the framework when this agent becomes active."""
        agent_name = self.__class__.__name__
        logger.info(f"entering task {agent_name}")
        userdata: UserData = self.session.userdata
        chat_ctx = self.chat_ctx.copy()

        # Copy the last few turns from the previous agent so this agent
        # has conversational continuity without carrying the full history.
        # truncate(max_items=6) keeps context growth bounded across handoffs.
        if isinstance(userdata.prev_agent, Agent):
            truncated_chat_ctx = userdata.prev_agent.chat_ctx.copy(
                exclude_instructions=True,
                exclude_function_call=False,
                exclude_handoff=True,
                exclude_config_update=True,
            ).truncate(max_items=6)
            existing_ids = {item.id for item in chat_ctx.items}
            items_copy = [item for item in truncated_chat_ctx.items if item.id not in existing_ids]
            chat_ctx.items.extend(items_copy)

        # Inject the serialized UserData as a system message so this agent
        # knows the customer name, order, and other collected data.
        chat_ctx.add_message(
            role="system",
            content=f"You are {agent_name} agent. Current user data is {userdata.summarize()}",
        )
        await self.update_chat_ctx(chat_ctx)
        self.session.generate_reply(tool_choice="none")

    async def _transfer_to_agent(self, name: str, context: RunContext_T) -> tuple[Agent, str]:
        """Look up the next agent by name from the shared registry and hand
        off control. Returning an (Agent, str) tuple from a tool triggers
        the framework's handoff mechanism."""
        userdata = context.userdata
        current_agent = context.session.current_agent
        next_agent = userdata.agents[name]
        userdata.prev_agent = current_agent
        return next_agent, f"Transferring to {name}."

Node.js:

// Shared tools that multiple agents reuse.
const updateName = llm.tool({
  description:
    'Called when the user provides their name. Confirm the spelling with the user before calling the function.',
  parameters: z.object({
    name: z.string().describe('The customer name'),
  }),
  execute: async ({ name }, { ctx }: llm.ToolOptions<UserData>) => {
    ctx.userData.customer.name = name;
    return `The name is updated to ${name}`;
  },
});

const updatePhone = llm.tool({
  description:
    'Called when the user provides their phone number. Confirm the spelling with the user before calling the function.',
  parameters: z.object({
    phone: z.string().describe('The customer phone number'),
  }),
  execute: async ({ phone }, { ctx }: llm.ToolOptions<UserData>) => {
    ctx.userData.customer.phone = phone;
    return `The phone number is updated to ${phone}`;
  },
});

const toGreeter = llm.tool({
  description:
    'Called when user asks any unrelated questions or requests any other services not in your job description.',
  execute: async (_, { ctx }: llm.ToolOptions<UserData>) => {
    const currAgent = ctx.session.currentAgent as BaseAgent;
    return await currAgent.transferToAgent({
      name: 'greeter',
      ctx,
    });
  },
});

// Base class that every specialist agent extends. Handles two things
// that all agents need: copying context from the previous agent on entry,
// and transferring control to the next agent on exit.
class BaseAgent extends voice.Agent<UserData> {
  name: string;

  constructor(options: voice.AgentOptions<UserData> & { name: string }) {
    const { name, ...opts } = options;
    super(opts);
    this.name = name;
  }

  // Called by the framework when this agent becomes active.
  async onEnter(): Promise<void> {
    const userdata = this.session.userData;
    const chatCtx = this.chatCtx.copy();

    // Copy the last few turns from the previous agent so this agent
    // has conversational continuity without carrying the full history.
    // truncate(6) keeps context growth bounded across handoffs.
    if (userdata.prevAgent) {
      const truncatedChatCtx = userdata.prevAgent.chatCtx
        .copy({
          excludeInstructions: true,
          excludeFunctionCall: false,
        })
        .truncate(6);
      const existingIds = new Set(chatCtx.items.map((item) => item.id));
      const newItems = truncatedChatCtx.items.filter((item) => !existingIds.has(item.id));
      chatCtx.items.push(...newItems);
    }

    // Inject the serialized UserData as a system message so this agent
    // knows the customer name, order, and other collected data.
    chatCtx.addMessage({
      role: 'system',
      content: `You are ${this.name} agent. Current user data is ${summarize(userdata)}`,
    });
    await this.updateChatCtx(chatCtx);
    this.session.generateReply({ toolChoice: 'none' });
  }

  // Look up the next agent by name from the shared registry and hand
  // off control. Returning llm.handoff() from a tool triggers
  // the framework's handoff mechanism.
  async transferToAgent(options: { name: string; ctx: voice.RunContext<UserData> }) {
    const { name, ctx } = options;
    const userdata = ctx.userData;
    const currentAgent = ctx.session.currentAgent;
    const nextAgent = userdata.agents[name];
    if (!nextAgent) {
      throw new Error(`Agent ${name} not found`);
    }
    userdata.prevAgent = currentAgent;
    return llm.handoff({
      agent: nextAgent,
      returns: `Transferring to ${name}`,
    });
  }
}

Shared agent capabilities

BaseAgent, together with the agent registry in UserData, gives every specialist agent four shared capabilities:

  • Truncated context: truncate(max_items=6) (Python) or truncate(6) (Node.js) keeps only the last few turns from the previous agent. This prevents the context window from growing across handoffs while preserving enough history for conversational continuity.
  • State injection: The agent injects the summarize() output as a system message so the specialist knows the customer name, order, and any other collected data without needing the full history.
  • Agent transfer: In Python, returning an (Agent, str) tuple from a tool triggers a handoff. In Node.js, returning llm.handoff() does the same. The framework switches the active agent and uses the string as the transition message.
  • Pre-instantiated agents: The UserData.agents dictionary holds all agent instances, so the system reuses them across handoffs rather than recreating each time.
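
The first two capabilities can be sketched without the framework. This is a minimal illustration only: ChatItem and build_entry_context are hypothetical names, plain lists stand in for the real chat context classes, and a simple tail slice stands in for truncate, whose exact rules in the library may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ChatItem:
    """Hypothetical stand-in for one chat context item."""

    id: str
    role: str
    content: str


def build_entry_context(
    own_items: list[ChatItem],
    prev_items: list[ChatItem],
    state_summary: str,
    agent_name: str,
    max_items: int = 6,
) -> list[ChatItem]:
    """Merge the tail of the previous agent's history into this agent's
    context, then append a system message carrying the shared state."""
    merged = list(own_items)
    existing_ids = {item.id for item in merged}
    # Keep only the most recent turns so context stays bounded across handoffs.
    tail = prev_items[-max_items:] if max_items > 0 else []
    # Skip items this agent already holds, e.g. when control returns to it.
    merged.extend(item for item in tail if item.id not in existing_ids)
    merged.append(
        ChatItem(
            id=f"state-{len(merged)}",
            role="system",
            content=f"You are {agent_name} agent. Current user data is {state_summary}",
        )
    )
    return merged


history = [ChatItem(id=f"m{i}", role="user", content=f"turn {i}") for i in range(10)]
own = [history[8]]  # this agent already holds one of the recent turns
ctx = build_entry_context(own, history, "customer_name: Ana", "Reservation")
print([item.id for item in ctx])
```

Deduplicating by item ID matters because agents are reused: when control returns to an agent it has visited before, part of the previous agent's tail may already be in its own context.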

Implement the greeter

The greeter is the entry point. It receives the caller, explains the menu, and routes to the reservation or takeaway agent based on what the caller wants. Each routing tool returns a handoff.

Add the greeter class below the BaseAgent:

Python:

class Greeter(BaseAgent):
    def __init__(self, menu: str) -> None:
        super().__init__(
            instructions=(
                f"You are a friendly restaurant receptionist. The menu is: {menu}\n"
                "Your jobs are to greet the caller and understand if they want to "
                "make a reservation or order takeaway. Guide them to the right agent using tools."
            ),
            llm=inference.LLM(
                model="openai/gpt-4.1-mini", extra_kwargs={"parallel_tool_calls": False}
            ),
            tts=inference.TTS(model="cartesia/sonic-3", voice=voices["greeter"]),
        )
        self.menu = menu

    @function_tool()
    async def to_reservation(self, context: RunContext_T) -> tuple[Agent, str]:
        """Called when user wants to make or update a reservation.
        This function handles transitioning to the reservation agent
        who will collect the necessary details like reservation time,
        customer name and phone number."""
        return await self._transfer_to_agent("reservation", context)

    @function_tool()
    async def to_takeaway(self, context: RunContext_T) -> tuple[Agent, str]:
        """Called when the user wants to place a takeaway order.
        This includes handling orders for pickup, delivery, or when the user wants to
        proceed to checkout with their existing order."""
        return await self._transfer_to_agent("takeaway", context)

Node.js:

function createGreeterAgent(menu: string) {
  const greeter = new BaseAgent({
    name: 'greeter',
    instructions: `You are a friendly restaurant receptionist. The menu is: ${menu}\nYour jobs are to greet the caller and understand if they want to make a reservation or order takeaway. Guide them to the right agent using tools.`,
    llm: new inference.LLM({ model: 'openai/gpt-4.1-mini' }),
    tts: new inference.TTS({ model: 'cartesia/sonic-3', voice: voices.greeter }),
    tools: {
      toReservation: llm.tool({
        description: dedent`
          Called when user wants to make or update a reservation.
          This function handles transitioning to the reservation agent
          who will collect the necessary details like reservation time,
          customer name and phone number.
        `,
        execute: async (_, { ctx }): Promise<llm.AgentHandoff> => {
          return await greeter.transferToAgent({
            name: 'reservation',
            ctx,
          });
        },
      }),
      toTakeaway: llm.tool({
        description: dedent`
          Called when the user wants to place a takeaway order.
          This includes handling orders for pickup, delivery, or when the user wants to
          proceed to checkout with their existing order.
        `,
        execute: async (_, { ctx }): Promise<llm.AgentHandoff> => {
          return await greeter.transferToAgent({
            name: 'takeaway',
            ctx,
          });
        },
      }),
    },
  });
  return greeter;
}

Implement specialist agents

Three specialist agents handle the restaurant workflow: reservation, takeaway, and checkout. Each overrides the TTS voice so callers hear a distinct voice when control transfers. Tools validate that required data is present before allowing transitions, ensuring orders are complete before reaching checkout.

Reservation agent

Collects the reservation time, customer name, and phone number. Add this below the greeter:

Python:

class Reservation(BaseAgent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a reservation agent at a restaurant. Your jobs are to ask for "
            "the reservation time, then customer's name, and phone number. Then "
            "confirm the reservation details with the customer.",
            tools=[update_name, update_phone, to_greeter],
            tts=inference.TTS(model="cartesia/sonic-3", voice=voices["reservation"]),
        )

    @function_tool()
    async def update_reservation_time(
        self,
        time: Annotated[str, Field(description="The reservation time")],
        context: RunContext_T,
    ) -> str:
        """Called when the user provides their reservation time.
        Confirm the time with the user before calling the function."""
        userdata = context.userdata
        userdata.reservation_time = time
        return f"The reservation time is updated to {time}"

    @function_tool()
    async def confirm_reservation(self, context: RunContext_T) -> str | tuple[Agent, str]:
        """Called when the user confirms the reservation."""
        userdata = context.userdata
        if not userdata.customer_name or not userdata.customer_phone:
            return "Please provide your name and phone number first."
        if not userdata.reservation_time:
            return "Please provide reservation time first."
        return await self._transfer_to_agent("greeter", context)

Node.js:

function createReservationAgent() {
  const reservation = new BaseAgent({
    name: 'reservation',
    instructions: `You are a reservation agent at a restaurant. Your jobs are to ask for the reservation time, then customer's name, and phone number. Then confirm the reservation details with the customer.`,
    tts: new inference.TTS({ model: 'cartesia/sonic-3', voice: voices.reservation }),
    tools: {
      updateName,
      updatePhone,
      toGreeter,
      updateReservationTime: llm.tool({
        description: dedent`
          Called when the user provides their reservation time.
          Confirm the time with the user before calling the function.
        `,
        parameters: z.object({
          time: z.string().describe('The reservation time'),
        }),
        execute: async ({ time }, { ctx }) => {
          ctx.userData.reservationTime = time;
          return `The reservation time is updated to ${time}`;
        },
      }),
      confirmReservation: llm.tool({
        description: `Called when the user confirms the reservation.`,
        execute: async (_, { ctx }): Promise<llm.AgentHandoff | string> => {
          const userdata = ctx.userData;
          if (!userdata.customer.name || !userdata.customer.phone) {
            return 'Please provide your name and phone number first.';
          }
          if (!userdata.reservationTime) {
            return 'Please provide reservation time first.';
          }
          return await reservation.transferToAgent({
            name: 'greeter',
            ctx,
          });
        },
      }),
    },
  });
  return reservation;
}

Takeaway agent

Manages the food order and routes to checkout once the caller confirms. Add this below the reservation agent:

Python:

class Takeaway(BaseAgent):
    def __init__(self, menu: str) -> None:
        super().__init__(
            instructions=(
                f"You are a takeaway agent that takes orders from the customer. "
                f"Our menu is: {menu}\n"
                "Clarify special requests and confirm the order with the customer."
            ),
            tools=[to_greeter],
            tts=inference.TTS(model="cartesia/sonic-3", voice=voices["takeaway"]),
        )

    @function_tool()
    async def update_order(
        self,
        items: Annotated[list[str], Field(description="The items of the full order")],
        context: RunContext_T,
    ) -> str:
        """Called when the user creates or updates their order."""
        userdata = context.userdata
        userdata.order = items
        return f"The order is updated to {items}"

    @function_tool()
    async def to_checkout(self, context: RunContext_T) -> str | tuple[Agent, str]:
        """Called when the user confirms the order."""
        userdata = context.userdata
        if not userdata.order:
            return "No takeaway order found. Please make an order first."
        return await self._transfer_to_agent("checkout", context)

Node.js:

function createTakeawayAgent(menu: string) {
  const takeaway = new BaseAgent({
    name: 'takeaway',
    instructions: `You are a takeaway agent that takes orders from the customer. Our menu is: ${menu}\nClarify special requests and confirm the order with the customer.`,
    tts: new inference.TTS({ model: 'cartesia/sonic-3', voice: voices.takeaway }),
    tools: {
      toGreeter,
      updateOrder: llm.tool({
        description: `Called when the user creates or updates their order.`,
        parameters: z.object({
          items: z.array(z.string()).describe('The items of the full order'),
        }),
        execute: async ({ items }, { ctx }) => {
          ctx.userData.order = items;
          return `The order is updated to ${items}`;
        },
      }),
      toCheckout: llm.tool({
        description: `Called when the user confirms the order.`,
        execute: async (_, { ctx }): Promise<llm.AgentHandoff | string> => {
          const userdata = ctx.userData;
          // An empty array is truthy in JavaScript, so check the length too.
          // This matches the Python check, which also rejects an empty order.
          if (!userdata.order || userdata.order.length === 0) {
            return 'No takeaway order found. Please make an order first.';
          }
          return await takeaway.transferToAgent({
            name: 'checkout',
            ctx,
          });
        },
      }),
    },
  });
  return takeaway;
}

Checkout agent

Confirms the expense and collects payment information before completing the order. Add this below the takeaway agent:

Demo only

This example stores raw credit card data in memory for simplicity. In production, use a payment processor like Stripe and never store raw card numbers.

Python:

class Checkout(BaseAgent):
    def __init__(self, menu: str) -> None:
        super().__init__(
            instructions=(
                f"You are a checkout agent at a restaurant. The menu is: {menu}\n"
                "You are responsible for confirming the expense of the "
                "order and then collecting customer's name, phone number and credit card "
                "information, including the card number, expiry date, and CVV step by step."
            ),
            tools=[update_name, update_phone, to_greeter],
            tts=inference.TTS(model="cartesia/sonic-3", voice=voices["checkout"]),
        )

    @function_tool()
    async def confirm_expense(
        self,
        expense: Annotated[float, Field(description="The expense of the order")],
        context: RunContext_T,
    ) -> str:
        """Called when the user confirms the expense."""
        userdata = context.userdata
        userdata.expense = expense
        return f"The expense is confirmed to be {expense}"

    @function_tool()
    async def update_credit_card(
        self,
        number: Annotated[str, Field(description="The credit card number")],
        expiry: Annotated[str, Field(description="The expiry date of the credit card")],
        cvv: Annotated[str, Field(description="The CVV of the credit card")],
        context: RunContext_T,
    ) -> str:
        """Called when the user provides their credit card number, expiry date, and CVV.
        Confirm the spelling with the user before calling the function."""
        userdata = context.userdata
        userdata.customer_credit_card = number
        userdata.customer_credit_card_expiry = expiry
        userdata.customer_credit_card_cvv = cvv
        return f"The credit card number is updated to {number}"

    @function_tool()
    async def confirm_checkout(self, context: RunContext_T) -> str | tuple[Agent, str]:
        """Called when the user confirms the checkout."""
        userdata = context.userdata
        if not userdata.expense:
            return "Please confirm the expense first."
        if (
            not userdata.customer_credit_card
            or not userdata.customer_credit_card_expiry
            or not userdata.customer_credit_card_cvv
        ):
            return "Please provide the credit card information first."
        userdata.checked_out = True
        return await to_greeter(context)

    @function_tool()
    async def to_takeaway(self, context: RunContext_T) -> tuple[Agent, str]:
        """Called when the user wants to update their order."""
        return await self._transfer_to_agent("takeaway", context)

Node.js:

function createCheckoutAgent(menu: string) {
  const checkout = new BaseAgent({
    name: 'checkout',
    instructions: `You are a checkout agent at a restaurant. The menu is: ${menu}\nYou are responsible for confirming the expense of the order and then collecting customer's name, phone number and credit card information, including the card number, expiry date, and CVV step by step.`,
    tts: new inference.TTS({ model: 'cartesia/sonic-3', voice: voices.checkout }),
    tools: {
      updateName,
      updatePhone,
      toGreeter,
      confirmExpense: llm.tool({
        description: `Called when the user confirms the expense.`,
        parameters: z.object({
          expense: z.number().describe('The expense of the order'),
        }),
        execute: async ({ expense }, { ctx }) => {
          ctx.userData.expense = expense;
          return `The expense is confirmed to be ${expense}`;
        },
      }),
      updateCreditCard: llm.tool({
        description: dedent`
          Called when the user provides their credit card number, expiry date, and CVV.
          Confirm the spelling with the user before calling the function.
        `,
        parameters: z.object({
          number: z.string().describe('The credit card number'),
          expiry: z.string().describe('The expiry date of the credit card'),
          cvv: z.string().describe('The CVV of the credit card'),
        }),
        execute: async ({ number, expiry, cvv }, { ctx }) => {
          ctx.userData.creditCard = { number, expiry, cvv };
          return `The credit card number is updated to ${number}`;
        },
      }),
      confirmCheckout: llm.tool({
        description: `Called when the user confirms the checkout.`,
        execute: async (_, { ctx }): Promise<llm.AgentHandoff | string> => {
          const userdata = ctx.userData;
          if (!userdata.expense) {
            return 'Please confirm the expense first.';
          }
          if (
            !userdata.creditCard.number ||
            !userdata.creditCard.expiry ||
            !userdata.creditCard.cvv
          ) {
            return 'Please provide the credit card information first.';
          }
          userdata.checkedOut = true;
          return await checkout.transferToAgent({
            name: 'greeter',
            ctx,
          });
        },
      }),
      toTakeaway: llm.tool({
        description: `Called when the user wants to update their order.`,
        execute: async (_, { ctx }): Promise<llm.AgentHandoff> => {
          return await checkout.transferToAgent({
            name: 'takeaway',
            ctx,
          });
        },
      }),
    },
  });
  return checkout;
}

Notice how the tools that complete a phase validate required fields before allowing the handoff. confirmReservation checks for a name, phone number, and reservation time. toCheckout checks for an order. confirmCheckout checks for a confirmed expense and credit card details. (The Python tools use the snake_case equivalents, such as confirm_reservation.) If validation fails, the tool returns an error string instead of a handoff, and the LLM uses that message to ask the caller for the missing information.
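
The validate-then-hand-off shape is the same in all three tools, so it can be factored into one helper. The sketch below is framework-free and rests on assumptions: CallState, guarded_handoff, and RESERVATION_FIELDS are hypothetical names, and a (target, message) pair of strings stands in for the real (Agent, str) handoff value.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CallState:
    """Hypothetical, trimmed-down version of the shared state."""

    customer_name: str | None = None
    customer_phone: str | None = None
    reservation_time: str | None = None


def guarded_handoff(
    state: CallState, required: dict[str, str], target: str
) -> str | tuple[str, str]:
    """Return an error string naming the first missing field, or a
    (target, message) pair that stands in for a handoff."""
    for field_name, label in required.items():
        if not getattr(state, field_name):
            return f"Please provide {label} first."
    return target, f"Transferring to {target}."


# Maps each required field to the wording used when it is missing.
RESERVATION_FIELDS = {
    "customer_name": "your name",
    "customer_phone": "your phone number",
    "reservation_time": "the reservation time",
}

state = CallState(customer_name="Ana")
blocked = guarded_handoff(state, RESERVATION_FIELDS, "greeter")

state.customer_phone = "555-0100"
state.reservation_time = "7 pm"
allowed = guarded_handoff(state, RESERVATION_FIELDS, "greeter")
```

Returning a plain string on failure keeps the current agent active. The LLM receives the message as the tool result and can ask the caller for exactly the field that is missing.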

Set up the session

Create all agent instances up front and store them in UserData.agents. This agent registry pattern lets any agent look up and hand off to any other agent by name. Start the session with the greeter as the initial agent.
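
The registry idea can be tried on its own before wiring it into a session. This is a sketch under assumptions: StubAgent and Registry are hypothetical stand-ins, not LiveKit classes, and the explicit missing-name check mirrors what the Node.js transferToAgent does. The Python recipe code indexes the dictionary directly instead.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Hypothetical placeholder for an agent instance."""

    name: str
    times_entered: int = 0


@dataclass
class Registry:
    """Holds every agent instance plus the agent that handed off last."""

    agents: dict[str, StubAgent] = field(default_factory=dict)
    prev_agent: StubAgent | None = None

    def transfer(self, current: StubAgent, name: str) -> StubAgent:
        """Look up the next agent by name and remember who handed off."""
        if name not in self.agents:
            known = ", ".join(sorted(self.agents))
            raise KeyError(f"Agent {name!r} not found; known agents: {known}")
        self.prev_agent = current
        next_agent = self.agents[name]
        next_agent.times_entered += 1
        return next_agent


registry = Registry()
registry.agents.update(
    {name: StubAgent(name) for name in ("greeter", "reservation", "takeaway", "checkout")}
)

greeter = registry.agents["greeter"]
first = registry.transfer(greeter, "takeaway")
back = registry.transfer(first, "greeter")
again = registry.transfer(back, "takeaway")
```

Because the instances are created once, `again` is the same object as `first`. Any per-agent state, such as that agent's own chat context, therefore survives a round trip through the greeter.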

Replace the entrypoint at the bottom of your agent file with the following:

Python:

server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    menu = "Pizza: $10, Salad: $5, Ice Cream: $3, Coffee: $2"
    userdata = UserData()
    userdata.agents.update(
        {
            "greeter": Greeter(menu),
            "reservation": Reservation(),
            "takeaway": Takeaway(menu),
            "checkout": Checkout(menu),
        }
    )
    session = AgentSession[UserData](
        userdata=userdata,
        stt=inference.STT(model="deepgram/nova-3"),
        llm=inference.LLM(model="openai/gpt-4.1-mini"),
        tts=inference.TTS(model="cartesia/sonic-3"),
        vad=silero.VAD.load(),
        max_tool_steps=5,
    )
    await session.start(
        agent=userdata.agents["greeter"],
        room=ctx.room,
    )


if __name__ == "__main__":
    cli.run_app(server)

Node.js:

export default defineAgent({
  prewarm: async (proc: JobProcess) => {
    proc.userData.vad = await silero.VAD.load();
  },
  entry: async (ctx: JobContext) => {
    const menu = 'Pizza: $10, Salad: $5, Ice Cream: $3, Coffee: $2';
    const userData = createUserData({
      greeter: createGreeterAgent(menu),
      reservation: createReservationAgent(),
      takeaway: createTakeawayAgent(menu),
      checkout: createCheckoutAgent(menu),
    });
    const vad = ctx.proc.userData.vad! as silero.VAD;
    const session = new voice.AgentSession({
      vad,
      stt: new inference.STT({ model: 'deepgram/nova-3' }),
      llm: new inference.LLM({ model: 'openai/gpt-4.1-mini' }),
      tts: new inference.TTS({ model: 'cartesia/sonic-3' }),
      userData,
      maxToolSteps: 5,
    });
    await session.start({
      agent: userData.agents.greeter!,
      room: ctx.room,
    });
  },
});

cli.runApp(new ServerOptions({ agent: fileURLToPath(import.meta.url) }));

Run it

Start the agent in development mode:

Python:

uv run src/agent.py dev

Node.js:

pnpm dev

Open the link printed by the CLI to speak to your agent in the Agent Console. Try asking to make a reservation, then place a takeaway order. Each agent uses a different voice, so the transition is audible.

How it works

When a caller connects, the greeter agent takes control, introduces the restaurant, and asks what the caller needs. Based on the response, the LLM picks a routing tool (to_reservation or to_takeaway in Python, toReservation or toTakeaway in Node.js) and the framework hands off to the corresponding specialist.

The reservation path collects a time, name, and phone number, then hands back to the greeter. The takeaway path collects the order, then routes to the checkout agent, which confirms the expense and collects credit card details before marking the order complete. Every specialist also includes the toGreeter tool, so the caller can return to the main menu at any point by asking an unrelated question.
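
The call flow described above amounts to a small directed graph. The sketch below writes down the edges that the recipe's tools provide and checks a path against them. TRANSITIONS and walk are hypothetical names used only for illustration. The framework does not require you to declare the graph this way.

```python
# Each key is an agent; each value is the set of agents its tools can hand off to.
TRANSITIONS: dict[str, set[str]] = {
    "greeter": {"reservation", "takeaway"},
    "reservation": {"greeter"},
    "takeaway": {"checkout", "greeter"},
    "checkout": {"takeaway", "greeter"},
}


def walk(path: list[str], start: str = "greeter") -> list[str]:
    """Follow a list of requested handoffs, rejecting any edge that no
    tool in the recipe provides."""
    visited = [start]
    for target in path:
        current = visited[-1]
        if target not in TRANSITIONS[current]:
            raise ValueError(f"{current} has no tool that hands off to {target}")
        visited.append(target)
    return visited


# A takeaway call: greeter -> takeaway -> checkout -> back to greeter.
takeaway_call = walk(["takeaway", "checkout", "greeter"])
print(" -> ".join(takeaway_call))
```

Writing the edges out makes one design choice visible: the greeter cannot reach checkout directly, so a caller always passes through the takeaway agent, whose tool refuses to hand off without an order.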

Full source code

The full source code is available on GitHub: