Overview
Your agent can connect to external data sources to retrieve information, store data, or take other actions. In general, you can install any Python package or add custom code to the agent to use any database or API that you need.
For instance, your agent might need to:
- Load a user's profile information from a database before starting a conversation.
- Search a private knowledge base for information to accurately answer user queries.
- Perform read/write/update operations on a database or service such as a calendar.
- Store conversation history or other data to a remote server.
This guide covers best practices for job initialization, retrieval-augmented generation (RAG), tool calls, and other techniques that connect your agent to external data sources and systems.
Initial context
By default, each `AgentSession` begins with an empty chat context. You can load user- or task-specific data into the agent's context before connecting to the room and starting the session. For instance, this agent greets the user by name based on the job metadata.
```python
import json

from livekit import agents
from livekit.agents import Agent, AgentSession, ChatContext


class Assistant(Agent):
    def __init__(self, chat_ctx: ChatContext) -> None:
        super().__init__(chat_ctx=chat_ctx, instructions="You are a helpful voice AI assistant.")


async def entrypoint(ctx: agents.JobContext):
    # Simple lookup, but you could use a database or API here if needed
    metadata = json.loads(ctx.job.metadata)
    user_name = metadata["user_name"]

    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, vad, turn_detection, etc.
    )

    initial_ctx = ChatContext()
    initial_ctx.add_message(role="assistant", content=f"The user's name is {user_name}.")

    await session.start(
        room=ctx.room,
        agent=Assistant(chat_ctx=initial_ctx),
        # ... room_input_options, etc.
    )

    await session.generate_reply(
        instructions="Greet the user by name and offer your assistance."
    )
```
If your agent requires external data in order to start, the following tips can help minimize the impact on the user experience:
- For static data (not user-specific), load it in the prewarm function (see the sketch after this list).
- Send user-specific data in the job metadata, room metadata, or participant attributes rather than loading it in the entrypoint.
- If you must make a network call in the entrypoint, do so before `ctx.connect()`. This ensures your frontend doesn't show the agent participant before it is listening to incoming audio.
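For example, here is a minimal sketch of the prewarm pattern. It assumes the Silero VAD plugin as the static resource being loaded, but any expensive, user-independent setup works the same way:

```python
from livekit import agents
from livekit.agents import JobProcess
from livekit.plugins import silero


def prewarm(proc: JobProcess):
    # Runs once per worker process, before any job is assigned
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: agents.JobContext):
    # Reuse the preloaded resource instead of loading it per job
    vad = ctx.proc.userdata["vad"]
    # ... build the AgentSession with `vad` and start it


if __name__ == "__main__":
    agents.cli.run_app(
        agents.WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm)
    )
```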
Tool calls
To achieve the highest degree of precision or take external actions, you can offer the LLM a choice of tools to use in its response. These tools can be as generic or as specific as needed for your use case.
For instance, define tools for `search_calendar`, `create_event`, `update_event`, and `delete_event` to give the LLM complete access to the user's calendar. Use participant attributes or job metadata to pass the user's calendar ID and access tokens to the agent.
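As a sketch, the search tool might look like the following. `my_calendar_api` is a hypothetical client standing in for your actual calendar service:

```python
import json

from livekit.agents import RunContext, function_tool


@function_tool()
async def search_calendar(
    self,
    context: RunContext,
    start_date: str,
    end_date: str,
) -> str:
    """Search the user's calendar for events between start_date and end_date (ISO 8601 dates)."""
    # my_calendar_api is a hypothetical client; in practice, read the user's
    # calendar ID and access token from job metadata or participant attributes
    events = await my_calendar_api.list_events(start=start_date, end=end_date)
    return json.dumps(events)
```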
Tool definition and use
Guide to defining and using custom tools in LiveKit Agents.
Add context during conversation
You can use the on_user_turn_completed node to perform a RAG lookup based on the user's most recent turn, prior to the LLM generating a response. This method can be highly performant as it avoids the extra round-trips involved in tool calls, but it's only available for STT-LLM-TTS pipelines that have access to the user's turn in text form. Additionally, the results are only as good as the accuracy of the search function you implement.
For instance, you can use vector search to retrieve additional context relevant to the user's query and inject it into the chat context for the next LLM generation. Here is a simple example:
```python
from livekit.agents import ChatContext, ChatMessage


async def on_user_turn_completed(
    self, turn_ctx: ChatContext, new_message: ChatMessage,
) -> None:
    # RAG function definition omitted for brevity
    rag_content = await my_rag_lookup(new_message.text_content())
    turn_ctx.add_message(
        role="assistant",
        content=f"Additional information relevant to the user's next message: {rag_content}",
    )
```
User feedback
It’s important to provide users with direct status feedback, for example to explain a delay or a failure. Here are a few example use cases:
- When an operation takes more than a few hundred milliseconds.
- When performing write operations such as sending an email or scheduling a meeting.
- When the agent is unable to perform an operation.
The following sections describe various techniques to provide this feedback to the user.
Verbal status updates
Use Agent speech to provide verbal feedback to the user during a long-running tool call or other operation.
In the following example, the agent speaks a status update only if the call takes longer than a specified timeout. The update is dynamically generated based on the query, and could be extended to include an estimate of the remaining time or other information.
```python
import asyncio

from livekit.agents import RunContext, function_tool


@function_tool()
async def search_knowledge_base(
    self,
    context: RunContext,
    query: str,
) -> str:
    # Send a verbal status update to the user after a short delay
    async def _speak_status_update(delay: float = 0.5):
        await asyncio.sleep(delay)
        await context.session.generate_reply(instructions=f"""
            You are searching the knowledge base for "{query}" but it is taking a little while.
            Update the user on your progress, but be very brief.
        """)

    status_update_task = asyncio.create_task(_speak_status_update(0.5))

    # Perform search (function definition omitted for brevity)
    result = await _perform_search(query)

    # Cancel the status update if the search completed before the timeout
    status_update_task.cancel()
    return result
```
For more information, see the following article:
Agent speech
Explore the speech capabilities and features of LiveKit Agents.
"Thinking" sounds
Add background audio to play a "thinking" sound automatically while tool calls are in progress. This can lend a more natural feel to the agent's responses.
```python
from livekit import agents
from livekit.agents import AgentSession, AudioConfig, BackgroundAudioPlayer, BuiltinAudioClip


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, vad, turn_detection, etc.
    )

    await session.start(
        room=ctx.room,
        # ... agent, etc.
    )

    background_audio = BackgroundAudioPlayer(
        # One of these clips plays while tool calls are in progress
        thinking_sound=[
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING, volume=0.8),
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING2, volume=0.7),
        ],
    )
    await background_audio.start(room=ctx.room, agent_session=session)
```
Frontend UI
If your app includes a frontend, you can add custom UI to represent the status of the agent's operations. For instance, present a popup for a long-running operation that the user can optionally cancel:
```python
import asyncio
import json

from livekit.agents import RunContext, function_tool, get_job_context


@function_tool()
async def perform_deep_search(
    self,
    context: RunContext,
    summary: str,
    query: str,
) -> str:
    """Initiate a deep internet search that will reference many external sources to answer the given query. This may take 1-5 minutes to complete.

    Summary: A user-friendly summary of the query
    Query: the full query to be answered
    """
    async def _notify_frontend(query: str):
        room = get_job_context().room
        response = await room.local_participant.perform_rpc(
            destination_identity=next(iter(room.remote_participants)),
            # frontend method that shows a cancellable popup
            # (method definition omitted for brevity, see RPC docs)
            method="start_deep_search",
            payload=json.dumps({
                "summary": summary,
                "estimated_completion_time": 300,
            }),
            # Allow the frontend a long time to return a response
            response_timeout=500,
        )
        # In this example the frontend has a Cancel button that returns
        # "cancelled" to stop the task
        if response == "cancelled":
            deep_search_task.cancel()

    notify_frontend_task = asyncio.create_task(_notify_frontend(query))

    # Perform deep search (function definition omitted for brevity)
    deep_search_task = asyncio.create_task(_perform_deep_search(query))

    try:
        result = await deep_search_task
    except asyncio.CancelledError:
        result = "Search cancelled by user"
    finally:
        notify_frontend_task.cancel()

    return result
```
For more information and examples, see the following articles:
Web and mobile frontends
Guide to building a custom web or mobile frontend for your agent.
RPC
Learn how to use RPC to communicate with your agent from the frontend.
Fine-tuned models
Sometimes the best way to get relevant results is to fine-tune a model for your specific use case. You can explore the available LLM integrations to find a provider that supports fine-tuning, or use Ollama to integrate a custom model.
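For example, a custom model served by Ollama can be used through its OpenAI-compatible endpoint. The following is a minimal sketch assuming the OpenAI plugin and a hypothetical fine-tuned model named "my-finetune" running locally:

```python
from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    # "my-finetune" is a hypothetical model name; Ollama exposes an
    # OpenAI-compatible API at /v1 by default
    llm=openai.LLM.with_ollama(
        model="my-finetune",
        base_url="http://localhost:11434/v1",
    ),
    # ... stt, tts, vad, turn_detection, etc.
)
```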
RAG providers and services
You can integrate with any RAG provider or tool of your choice to enhance your agent with additional context. Suggested providers and tools include:
- LlamaIndex - Framework for connecting custom data to LLMs.
- Mem0 - Memory layer for AI assistants.
- TurboPuffer - Fast serverless vector search built on object storage.
- Pinecone - Managed vector database for AI applications.
- Annoy - Open source Python library from Spotify for nearest neighbor search.
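As a local example, here is a sketch of a nearest-neighbor lookup with Annoy that could back the `my_rag_lookup` function used earlier. `embed_query` and `load_chunks` are hypothetical helpers, and the index is assumed to be prebuilt from your documents:

```python
from annoy import AnnoyIndex

EMBEDDING_DIM = 384  # must match the embedding model used to build the index

# Load a prebuilt index and the text chunks it was built from;
# chunks[i] is the text embedded as item i (load_chunks is hypothetical)
index = AnnoyIndex(EMBEDDING_DIM, "angular")
index.load("knowledge_base.ann")
chunks: list[str] = load_chunks("knowledge_base.json")


async def my_rag_lookup(query: str) -> str:
    # embed_query is a hypothetical async embedding function
    vector = await embed_query(query)
    neighbor_ids = index.get_nns_by_vector(vector, 3)
    return "\n".join(chunks[i] for i in neighbor_ids)
```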
Additional examples
The following examples show how to implement RAG and other techniques: