Overview
Virtual avatars add lifelike video output to your voice AI agents. You can integrate a variety of providers with LiveKit Agents in just a few lines of code.
Available providers
The following providers are available. Choose a provider from this list for a step-by-step guide:
Have another provider in mind? LiveKit is open source and welcomes new plugin contributions.
How it works
The virtual avatar integrations work automatically with the AgentSession class. The plugin adds a separate participant, the avatar worker, to the room. The agent session sends its audio output to the avatar worker instead of publishing it to the room, and the avatar worker uses that audio to publish synchronized audio and video tracks to the room for the end user.
To add a virtual avatar:
- Install the selected plugin and set up its API keys
- Create an AgentSession, as in the voice AI quickstart
- Create an AvatarSession and configure it as necessary
- Start the avatar session, passing in the AgentSession instance
- Start the AgentSession with audio output disabled (the audio is sent to the avatar session instead)
Sample code
Here is an example using Tavus:
```python
from livekit import agents
from livekit.agents import AgentSession, RoomOutputOptions
from livekit.plugins import tavus

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, etc.
    )

    avatar = tavus.AvatarSession(
        replica_id="...",  # ID of the Tavus replica to use
        persona_id="...",  # ID of the Tavus persona to use (see preceding section for configuration details)
    )

    # Start the avatar and wait for it to join
    await avatar.start(session, room=ctx.room)

    # Start your agent session with the user
    await session.start(
        room=ctx.room,
        room_output_options=RoomOutputOptions(
            # Disable audio output to the room. The avatar plugin publishes audio separately.
            audio_enabled=False,
        ),
        # ... agent, room_input_options, etc.
    )
```
Avatar workers
To minimize latency, the avatar provider joins the LiveKit room directly as a secondary participant and publishes synchronized audio and video to the room. In your frontend app, you must distinguish between the agent (your Python program running the AgentSession) and the avatar worker.
You can identify an avatar worker as a participant of kind agent with the attribute lk.publish_on_behalf. Check for these values in your frontend code to associate the worker's audio and video tracks with the agent.
```typescript
const agent = room.remoteParticipants.find(
  (p) => p.kind === Kind.Agent && p.attributes['lk.publish_on_behalf'] === null
);
const avatarWorker = room.remoteParticipants.find(
  (p) => p.kind === Kind.Agent && p.attributes['lk.publish_on_behalf'] === agent.identity
);
```
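Once you've identified the avatar worker, you can attach its tracks the same way as any other remote participant's. The following is a minimal sketch using livekit-client's RoomEvent.TrackSubscribed event; it assumes the `room` and `agent` variables from the snippet above, and the `avatar-container` element ID is illustrative:

```typescript
import { RoomEvent } from 'livekit-client';

room.on(RoomEvent.TrackSubscribed, (track, publication, participant) => {
  // Only handle media published by the avatar worker on behalf of the agent
  if (participant.attributes['lk.publish_on_behalf'] !== agent.identity) return;

  // attach() returns an <audio> or <video> element playing the track
  const element = track.attach();
  document.getElementById('avatar-container')?.appendChild(element);
});
```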
In React apps, use the useVoiceAssistant hook to get the correct audio and video tracks automatically:
```typescript
const {
  agent,      // The agent participant
  audioTrack, // The avatar worker's audio track
  videoTrack, // The avatar worker's video track
} = useVoiceAssistant();
```
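For example, to render the avatar's video you can pass the returned track reference to a video component. Here is a minimal sketch assuming the VideoTrack component from @livekit/components-react; the AvatarTile component name is illustrative:

```tsx
import { useVoiceAssistant, VideoTrack } from '@livekit/components-react';

function AvatarTile() {
  const { videoTrack } = useVoiceAssistant();

  // Render nothing until the avatar worker has published its video track
  if (!videoTrack) return null;
  return <VideoTrack trackRef={videoTrack} />;
}
```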
Frontend starter apps
The following frontend starter apps include out-of-the-box support for virtual avatars.
- Web: A web voice assistant app built with React and Next.js.
- Swift: A native iOS, macOS, and visionOS voice assistant built in SwiftUI.
- Agents Playground: A virtual workbench to test your multimodal AI agent.