Overview
The LiveKit Agents framework lets you build sophisticated voice AI apps with multiple personas, conversation phases, or specialized capabilities using agents, handoffs, and tasks.
Core constructs
An agent session is the main orchestrator of your voice AI app and can be composed of one or more agents. Agents are one of the core building blocks of a workflow that also includes tasks and tools. Each plays a distinct role in creating a flexible, maintainable system:
Agents hold long-lived control of a session. They define instructions, reasoning behavior, and tools, and can transfer control to another agent when different rules or capabilities are required.
Tools are user-defined functions callable by the model. They allow the agent to perform actions beyond generative text, such as reading from or writing to external systems. Tool invocations are model-driven: the LLM chooses to call them based on context, and the returned results are fed back to the model for continued reasoning. Tools can also trigger agent handoffs.
Tasks are short-lived units of work that run to completion and return a typed result. Unlike agents, tasks do not persist; they take temporary control only while executing. Tasks can include tool definitions used to complete their objectives.
Task groups run sequences of tasks for multi-step operations. They allow users to revisit earlier steps when corrections are needed, and all tasks in a group share conversation context. The summarized result is returned to the controlling agent when the group finishes.
This architecture makes workflows explicit and predictable: agents manage ongoing conversational control, tasks encapsulate discrete operations, tools execute side effects and enable handoffs, and task groups coordinate ordered multi-step flows with regression support. Together, these constructs form a testable and maintainable execution model for non-trivial voice AI systems.
Best practices
Before building your workflow, map out the conversation phases, identify where different personas or capabilities are needed, and determine which operations are short-lived versus continuous. The following guidelines help you choose the right pattern for each part of your workflow:
Create separate agents when you need distinct reasoning behavior or tool access.
Use tasks for discrete operations that must complete before continuing the conversation (for example, consent collection, data capture, or verification).
Expose external actions through tools with clear purpose and meaningful return values that contribute to reasoning.
Plan how conversation context is preserved or reset across agents. Some transitions require full continuity; others benefit from a clean slate.
Use a task group for ordered multi-step processes that might need to revisit earlier steps.
Build workflows incrementally. Add tests and evals to verify tool, task, and agent behavior.
Design for user experience: announce handoffs, preserve relevant context to avoid repetition, and handle correction paths cleanly.
Following these patterns keeps complex workflows predictable, testable, and extensible.
Additional resources
For more information on specific topics related to building voice AI workflows, see the following topics:
Agents and handoffs
Define agents and agent handoffs to build multi-agent voice AI workflows.
Tasks & task groups
Use tasks and task groups to execute discrete operations and build complex workflows.
Prompting guide
Complete guide to writing good instructions for your agents.
Tool definition and use
Use tools to call external services, inject custom logic, agent handoffs,and more.
Testing & evaluation
Test every aspect of your agents with a custom test suite.