Company directory phone assistant

Build a phone assistant that can transfer calls to different departments using SIP REFER.

The assistant handles two input paths: a room-level sip_dtmf_received handler routes keypad presses immediately, and a route_to_department tool uses GetDtmfTask to collect a selection from callers who speak or ask the agent to route them.

Prerequisites

To complete this guide, you need the following prerequisites:

Setting up the environment

Create an environment file with the necessary credentials and phone numbers:

# Initialize environment variables
# The .env.local file should look like:
# BILLING_PHONE_NUMBER=+12345678901
# TECH_SUPPORT_PHONE_NUMBER=+12345678901
# CUSTOMER_SERVICE_PHONE_NUMBER=+12345678901
# LIVEKIT_URL=wss://your-url-goes-here.livekit.cloud
# LIVEKIT_API_KEY=your-key-here
# LIVEKIT_API_SECRET=your-secret-here
from dotenv import load_dotenv
load_dotenv(".env.local")
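
The agent reads each transfer number from the environment only when a caller selects that department, so a missing variable would otherwise surface mid-call. The following is a minimal sketch of a startup check, not part of the recipe itself. The names REQUIRED_VARS and missing_env_vars are hypothetical helpers, not LiveKit or python-dotenv APIs.

```python
import os
from typing import Mapping, Sequence

# Hypothetical list mirroring the variables in the .env.local example above.
REQUIRED_VARS = (
    "BILLING_PHONE_NUMBER",
    "TECH_SUPPORT_PHONE_NUMBER",
    "CUSTOMER_SERVICE_PHONE_NUMBER",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names in `required` that are unset or blank in `environ`."""
    return [name for name in required if not environ.get(name, "").strip()]
```

One way to use it is to call missing_env_vars() right after load_dotenv(".env.local") and raise an error if the returned list is not empty.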

Implementing the phone assistant

Create a custom Agent class that extends the base Agent class. UserData tracks the selected department, a cached LiveKitAPI client, the JobContext, and the active SIP caller:

import asyncio
import logging
import os
from dataclasses import dataclass
from typing import Optional

from livekit import rtc, api
from livekit.agents import (
    Agent,
    AgentServer,
    AgentSession,
    JobContext,
    JobProcess,
    RunContext,
    ToolError,
    cli,
    function_tool,
    inference,
    room_io,
)
from livekit.agents.beta.workflows.dtmf_inputs import GetDtmfTask
from livekit.plugins import ai_coustics, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel
from livekit.protocol import sip as proto_sip

logger = logging.getLogger("phone-assistant")

# Keypad digit -> (environment variable holding the number, department name)
DEPARTMENTS = {
    "1": ("BILLING_PHONE_NUMBER", "Billing"),
    "2": ("TECH_SUPPORT_PHONE_NUMBER", "Tech Support"),
    "3": ("CUSTOMER_SERVICE_PHONE_NUMBER", "Customer Service"),
}


@dataclass
class UserData:
    """Store user data and state for the phone assistant."""

    selected_department: Optional[str] = None
    livekit_api: Optional[api.LiveKitAPI] = None
    ctx: Optional[JobContext] = None
    sip_caller: Optional[rtc.RemoteParticipant] = None


RunContext_T = RunContext[UserData]


class PhoneAssistant(Agent):
    """A voice-enabled phone assistant that routes callers to a department."""

    def __init__(self) -> None:
        instructions = (
            "You are a friendly assistant at Vandelay Industries providing support. "
            "When a caller wants to reach a department, call the route_to_department "
            "tool to collect their selection. The available departments are:\n"
            "- 1 for Billing\n"
            "- 2 for Technical Support\n"
            "- 3 for Customer Service"
        )
        super().__init__(instructions=instructions)

    async def on_enter(self) -> None:
        """Called when the agent is first activated."""
        logger.info("PhoneAssistant activated")
        greeting = (
            "Hi, thanks for calling Vandelay Industries — global leader in fine latex goods! "
            "You can press 1 for Billing, 2 for Technical Support, "
            "or 3 for Customer Service. You can also just talk to me, since I'm a LiveKit agent."
        )
        await self.session.generate_reply(user_input=greeting)

Routing callers

The assistant supports two routing paths. A DTMF-driven fast path handles keypad presses directly, and a voice-driven tool uses GetDtmfTask to collect a selection from callers who speak.

DTMF fast path

When the caller presses a keypad digit, the room-level sip_dtmf_received handler in the entrypoint (shown later) calls route_digit directly. The routing decision bypasses the LLM, so a keypress starts the transfer without waiting for the model to interpret the input. The LLM is used only to generate the spoken transfer announcement.

    async def route_digit(self, digit: str) -> None:
        """Route the caller to a department based on an already-received digit."""
        userdata = self.session.userdata
        if digit not in DEPARTMENTS or userdata.sip_caller is None:
            return
        env_var, dept_name = DEPARTMENTS[digit]
        userdata.selected_department = dept_name
        logger.info(f"DTMF routing: digit={digit} department={dept_name}")
        # Stop any speech in progress before announcing the transfer.
        self.session.interrupt()
        await self.session.generate_reply(
            instructions=f"Tell the caller they're being transferred to our {dept_name} department and to please hold.",
            allow_interruptions=False,
        )
        await asyncio.sleep(6)
        await self._transfer_call(userdata.sip_caller.identity, f"tel:{os.getenv(env_var)}")
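
Note that os.getenv returns None for an unset variable, so the f-string above would produce the target "tel:None". The following is a minimal sketch of a stricter way to build the transfer target. The helper name transfer_uri is hypothetical and is not part of the recipe or of any LiveKit API.

```python
import os
from typing import Mapping


def transfer_uri(env_var: str, environ: Mapping[str, str] = os.environ) -> str:
    """Build a tel: URI from the phone number stored in `env_var`.

    Raises ValueError instead of returning "tel:None" when the variable
    is unset or blank.
    """
    number = environ.get(env_var, "").strip()
    if not number:
        raise ValueError(f"{env_var} is not set")
    return f"tel:{number}"
```

With a helper like this, route_digit could catch the ValueError and apologize to the caller before any transfer is attempted.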

Voice tool with GetDtmfTask

The route_to_department tool runs when the LLM decides the caller wants to be routed — for example, the caller says "transfer me" or "connect me with billing." GetDtmfTask is a prebuilt task that collects digits from the caller, accepting both DTMF keypad tones and spoken digits. Configure it with num_digits=1 to collect a single menu selection.

The tool wraps GetDtmfTask in a retry loop: if collection fails (timeout, missed digits), GetDtmfTask raises ToolError and the loop re-prompts. Invalid selections are also handled inside the tool rather than being returned to the LLM, so the caller gets a consistent re-prompt experience.

TransferSIPParticipant requires the participant_identity of the SIP caller, which is assigned at dispatch time and might differ from the caller's phone number. The entrypoint captures the SIP caller once via ctx.wait_for_participant and stores it in UserData.sip_caller, so both routing paths reference it directly. To learn more, see Identifying SIP callers.

    @function_tool()
    async def route_to_department(self, context: RunContext_T) -> str:
        """Collect a department selection from the caller and transfer their call."""
        userdata = context.userdata
        if userdata.sip_caller is None:
            return "No active SIP caller to transfer."

        # Re-prompt until the caller provides a digit that maps to a department.
        while True:
            try:
                result = await GetDtmfTask(
                    num_digits=1,
                    chat_ctx=self.chat_ctx.copy(
                        exclude_instructions=True,
                        exclude_function_call=True,
                        exclude_handoff=True,
                        exclude_config_update=True,
                    ),
                    extra_instructions=(
                        "Ask the caller to press or say 1 for Billing, 2 for Technical Support, "
                        "or 3 for Customer Service. Give them a moment to respond."
                    ),
                )
            except ToolError as e:
                # Collection failed (for example, a timeout); relay the error and retry.
                await self.session.generate_reply(
                    instructions=e.message, allow_interruptions=False
                )
                continue
            if result.user_input in DEPARTMENTS:
                break
            await self.session.generate_reply(
                instructions=(
                    "Apologize that the selection wasn't recognized, then remind the caller "
                    "to press or say 1 for Billing, 2 for Technical Support, or 3 for Customer Service."
                ),
                allow_interruptions=False,
            )

        env_var, dept_name = DEPARTMENTS[result.user_input]
        userdata.selected_department = dept_name
        await self.session.generate_reply(
            instructions=f"Tell the caller they're being transferred to our {dept_name} department and to please hold.",
            allow_interruptions=False,
        )
        await asyncio.sleep(6)
        await self._transfer_call(
            userdata.sip_caller.identity, f"tel:{os.getenv(env_var)}"
        )
        return f"Transferring to {dept_name} department."
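
The loop above retries without limit, which may be undesirable if a caller never provides a valid digit. The following is a minimal sketch of a bounded version of the same retry pattern, written in plain asyncio so it runs without LiveKit. The names collect_choice, ask, and max_attempts are hypothetical; ask stands in for a single GetDtmfTask run.

```python
import asyncio
from typing import Awaitable, Callable, Optional

VALID_CHOICES = {"1", "2", "3"}  # mirrors the DEPARTMENTS keys


async def collect_choice(
    ask: Callable[[], Awaitable[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call `ask` until it returns a valid choice or attempts run out.

    Any exception raised by `ask` counts as a failed attempt, similar to
    the ToolError branch in the tool above.
    """
    for _ in range(max_attempts):
        try:
            choice = await ask()
        except Exception:
            continue  # collection failed; try again
        if choice in VALID_CHOICES:
            return choice
    return None  # no valid selection; the caller can fall back to conversation
```

When the result is None, the tool could return a short message to the LLM instead of transferring, so the conversation continues normally.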

Handling SIP call transfers

Both routing paths call _transfer_call, which sends the SIP REFER through the trunk:

    async def _transfer_call(self, participant_identity: str, transfer_to: str) -> None:
        """Transfer the SIP call to another number."""
        logger.info(f"Transferring call for participant {participant_identity} to {transfer_to}")
        try:
            userdata = self.session.userdata
            # Create the API client on first use and cache it in UserData.
            if not userdata.livekit_api:
                userdata.livekit_api = api.LiveKitAPI(
                    url=os.getenv('LIVEKIT_URL'),
                    api_key=os.getenv('LIVEKIT_API_KEY'),
                    api_secret=os.getenv('LIVEKIT_API_SECRET'),
                )
            transfer_request = proto_sip.TransferSIPParticipantRequest(
                participant_identity=participant_identity,
                room_name=userdata.ctx.room.name,
                transfer_to=transfer_to,
                play_dialtone=True,
            )
            await userdata.livekit_api.sip.transfer_sip_participant(transfer_request)
        except Exception as e:
            logger.error(f"Failed to transfer call: {e}", exc_info=True)
            await self.session.generate_reply(
                user_input="I'm sorry, I couldn't transfer your call. Is there something else I can help with?"
            )

Starting the agent

Set up an AgentServer with an rtc_session handler. The prewarm callback loads the Silero VAD model once per worker process. LiveKit Inference provides STT, LLM, and TTS, so no additional API keys are required. Before starting the session, register a shutdown callback to close the LiveKitAPI client. After the session starts, register the sip_dtmf_received handler that drives the DTMF fast path and capture the SIP caller so the transfer methods can reference it:

server = AgentServer()


def prewarm(proc: JobProcess):
    # Load the Silero VAD model once per worker process.
    proc.userdata["vad"] = silero.VAD.load()


server.setup_fnc = prewarm


@server.rtc_session(agent_name="company-directory")
async def entrypoint(ctx: JobContext) -> None:
    ctx.log_context_fields = {"room": ctx.room.name}
    userdata = UserData(ctx=ctx)
    session = AgentSession(
        userdata=userdata,
        stt=inference.STT(model="deepgram/nova-3", language="multi"),
        llm=inference.LLM(model="xai/grok-4-1-fast-non-reasoning"),
        tts=inference.TTS(
            model="cartesia/sonic-3",
            voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        ),
        turn_detection=MultilingualModel(),
        vad=ctx.proc.userdata["vad"],
        preemptive_generation=True,
        max_tool_steps=3,
    )

    async def cleanup():
        # Close the cached LiveKitAPI client when the job shuts down.
        if userdata.livekit_api:
            await userdata.livekit_api.aclose()
            userdata.livekit_api = None

    ctx.add_shutdown_callback(cleanup)

    agent = PhoneAssistant()
    await session.start(
        agent=agent,
        room=ctx.room,
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=ai_coustics.audio_enhancement(
                    model=ai_coustics.EnhancerModel.QUAIL_VF_L
                ),
            ),
        ),
    )

    @ctx.room.on("sip_dtmf_received")
    def on_dtmf(ev: rtc.SipDTMF) -> None:
        logger.info(f"DTMF input: {ev.digit}")
        if ev.digit not in DEPARTMENTS:
            return
        asyncio.create_task(agent.route_digit(ev.digit))

    # Capture the SIP caller once. The identity is set at dispatch time and
    # might not match the phone number.
    userdata.sip_caller = await ctx.wait_for_participant(
        kind=rtc.ParticipantKind.PARTICIPANT_KIND_SIP,
    )
    await ctx.connect()


if __name__ == "__main__":
    cli.run_app(server)
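
The on_dtmf handler above schedules a new route_digit task for every valid keypress, so a caller who presses a key twice could trigger two overlapping transfers. The asyncio event loop also keeps only weak references to tasks, so the Python documentation recommends holding a reference to tasks created with create_task. The following is a minimal sketch of a guard that addresses both points in plain asyncio. The class name SingleFlightRouter is hypothetical and is not a LiveKit API.

```python
import asyncio
from typing import Awaitable, Callable, Set


class SingleFlightRouter:
    """Run at most one routing coroutine at a time and keep task references."""

    def __init__(self, route: Callable[[str], Awaitable[None]]) -> None:
        self._route = route
        self._busy = False
        self._tasks: Set[asyncio.Task] = set()

    def submit(self, digit: str) -> bool:
        """Schedule routing for `digit`; return False if one is already running."""
        if self._busy:
            return False
        self._busy = True
        task = asyncio.create_task(self._run(digit))
        self._tasks.add(task)  # hold a reference until the task finishes
        task.add_done_callback(self._tasks.discard)
        return True

    async def _run(self, digit: str) -> None:
        try:
            await self._route(digit)
        finally:
            self._busy = False  # allow a new attempt, for example after a failed transfer
```

Under these assumptions, the entrypoint could create router = SingleFlightRouter(agent.route_digit) once and have on_dtmf call router.submit(ev.digit) in place of asyncio.create_task.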

How it works

  1. An inbound call dispatches the agent and adds the SIP caller to the room.
  2. The agent greets the caller and describes the menu options.
  3. If the caller presses a keypad digit, the room-level sip_dtmf_received handler fires and calls route_digit directly. The agent announces the transfer and calls TransferSIPParticipant.
  4. If the caller speaks instead, the LLM invokes route_to_department, which runs GetDtmfTask to collect a single digit. On timeouts or invalid selections, the tool re-prompts the caller.
  5. TransferSIPParticipant sends a SIP REFER through the trunk to forward the caller to the selected department.

Full agent code

The following is the complete agent.py file combining every section above:

import asyncio
import logging
import os
from dataclasses import dataclass
from typing import Optional

from dotenv import load_dotenv
from livekit import rtc, api
from livekit.agents import (
    Agent,
    AgentServer,
    AgentSession,
    JobContext,
    JobProcess,
    RunContext,
    ToolError,
    cli,
    function_tool,
    inference,
    room_io,
)
from livekit.agents.beta.workflows.dtmf_inputs import GetDtmfTask
from livekit.plugins import ai_coustics, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel
from livekit.protocol import sip as proto_sip

load_dotenv(".env.local")

logger = logging.getLogger("phone-assistant")

# Keypad digit -> (environment variable holding the number, department name)
DEPARTMENTS = {
    "1": ("BILLING_PHONE_NUMBER", "Billing"),
    "2": ("TECH_SUPPORT_PHONE_NUMBER", "Tech Support"),
    "3": ("CUSTOMER_SERVICE_PHONE_NUMBER", "Customer Service"),
}


@dataclass
class UserData:
    """Store user data and state for the phone assistant."""

    selected_department: Optional[str] = None
    livekit_api: Optional[api.LiveKitAPI] = None
    ctx: Optional[JobContext] = None
    sip_caller: Optional[rtc.RemoteParticipant] = None


RunContext_T = RunContext[UserData]


class PhoneAssistant(Agent):
    """A voice-enabled phone assistant that routes callers to a department."""

    def __init__(self) -> None:
        instructions = (
            "You are a friendly assistant at Vandelay Industries providing support. "
            "When a caller wants to reach a department, call the route_to_department "
            "tool to collect their selection. The available departments are:\n"
            "- 1 for Billing\n"
            "- 2 for Technical Support\n"
            "- 3 for Customer Service"
        )
        super().__init__(instructions=instructions)

    async def on_enter(self) -> None:
        logger.info("PhoneAssistant activated")
        greeting = (
            "Hi, thanks for calling Vandelay Industries — global leader in fine latex goods! "
            "You can press 1 for Billing, 2 for Technical Support, "
            "or 3 for Customer Service. You can also just talk to me, since I'm a LiveKit agent."
        )
        await self.session.generate_reply(user_input=greeting)

    async def route_digit(self, digit: str) -> None:
        """Route the caller to a department based on an already-received digit."""
        userdata = self.session.userdata
        if digit not in DEPARTMENTS or userdata.sip_caller is None:
            return
        env_var, dept_name = DEPARTMENTS[digit]
        userdata.selected_department = dept_name
        logger.info(f"DTMF routing: digit={digit} department={dept_name}")
        self.session.interrupt()
        await self.session.generate_reply(
            instructions=f"Tell the caller they're being transferred to our {dept_name} department and to please hold.",
            allow_interruptions=False,
        )
        await asyncio.sleep(6)
        await self._transfer_call(userdata.sip_caller.identity, f"tel:{os.getenv(env_var)}")

    @function_tool()
    async def route_to_department(self, context: RunContext_T) -> str:
        """Collect a department selection from the caller and transfer their call."""
        userdata = context.userdata
        if userdata.sip_caller is None:
            return "No active SIP caller to transfer."

        while True:
            try:
                result = await GetDtmfTask(
                    num_digits=1,
                    chat_ctx=self.chat_ctx.copy(
                        exclude_instructions=True,
                        exclude_function_call=True,
                        exclude_handoff=True,
                        exclude_config_update=True,
                    ),
                    extra_instructions=(
                        "Ask the caller to press or say 1 for Billing, 2 for Technical Support, "
                        "or 3 for Customer Service. Give them a moment to respond."
                    ),
                )
            except ToolError as e:
                await self.session.generate_reply(
                    instructions=e.message, allow_interruptions=False
                )
                continue
            if result.user_input in DEPARTMENTS:
                break
            await self.session.generate_reply(
                instructions=(
                    "Apologize that the selection wasn't recognized, then remind the caller "
                    "to press or say 1 for Billing, 2 for Technical Support, or 3 for Customer Service."
                ),
                allow_interruptions=False,
            )

        env_var, dept_name = DEPARTMENTS[result.user_input]
        userdata.selected_department = dept_name
        await self.session.generate_reply(
            instructions=f"Tell the caller they're being transferred to our {dept_name} department and to please hold.",
            allow_interruptions=False,
        )
        await asyncio.sleep(6)
        await self._transfer_call(
            userdata.sip_caller.identity, f"tel:{os.getenv(env_var)}"
        )
        return f"Transferring to {dept_name} department."

    async def _transfer_call(self, participant_identity: str, transfer_to: str) -> None:
        """Transfer the SIP call to another number."""
        logger.info(f"Transferring call for participant {participant_identity} to {transfer_to}")
        try:
            userdata = self.session.userdata
            if not userdata.livekit_api:
                userdata.livekit_api = api.LiveKitAPI(
                    url=os.getenv('LIVEKIT_URL'),
                    api_key=os.getenv('LIVEKIT_API_KEY'),
                    api_secret=os.getenv('LIVEKIT_API_SECRET'),
                )
            transfer_request = proto_sip.TransferSIPParticipantRequest(
                participant_identity=participant_identity,
                room_name=userdata.ctx.room.name,
                transfer_to=transfer_to,
                play_dialtone=True,
            )
            await userdata.livekit_api.sip.transfer_sip_participant(transfer_request)
        except Exception as e:
            logger.error(f"Failed to transfer call: {e}", exc_info=True)
            await self.session.generate_reply(
                user_input="I'm sorry, I couldn't transfer your call. Is there something else I can help with?"
            )


server = AgentServer()


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


server.setup_fnc = prewarm


@server.rtc_session(agent_name="company-directory")
async def entrypoint(ctx: JobContext) -> None:
    ctx.log_context_fields = {"room": ctx.room.name}
    userdata = UserData(ctx=ctx)
    session = AgentSession(
        userdata=userdata,
        stt=inference.STT(model="deepgram/nova-3", language="multi"),
        llm=inference.LLM(model="xai/grok-4-1-fast-non-reasoning"),
        tts=inference.TTS(
            model="cartesia/sonic-3",
            voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        ),
        turn_detection=MultilingualModel(),
        vad=ctx.proc.userdata["vad"],
        preemptive_generation=True,
        max_tool_steps=3,
    )

    async def cleanup():
        if userdata.livekit_api:
            await userdata.livekit_api.aclose()
            userdata.livekit_api = None

    ctx.add_shutdown_callback(cleanup)

    agent = PhoneAssistant()
    await session.start(
        agent=agent,
        room=ctx.room,
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=ai_coustics.audio_enhancement(
                    model=ai_coustics.EnhancerModel.QUAIL_VF_L
                ),
            ),
        ),
    )

    @ctx.room.on("sip_dtmf_received")
    def on_dtmf(ev: rtc.SipDTMF) -> None:
        logger.info(f"DTMF input: {ev.digit}")
        if ev.digit not in DEPARTMENTS:
            return
        asyncio.create_task(agent.route_digit(ev.digit))

    userdata.sip_caller = await ctx.wait_for_participant(
        kind=rtc.ParticipantKind.PARTICIPANT_KIND_SIP,
    )
    await ctx.connect()


if __name__ == "__main__":
    cli.run_app(server)