Virtual avatar integrations

Guides for adding virtual avatars to your agents.

Overview

Virtual avatars add lifelike video output for your voice AI agents. You can integrate a variety of providers into LiveKit Agents with just a few lines of code.

Available providers

The following providers are available. Choose a provider for a step-by-step integration guide:

Have another provider in mind? LiveKit is open source and welcomes new plugin contributions.

How it works

Virtual avatar integrations work automatically with the AgentSession class. The plugin adds a separate participant, the avatar worker, to the room. The agent session sends its audio output to the avatar worker instead of publishing it to the room; the avatar worker then uses that audio to publish synchronized audio and video tracks to the room for the end user.

To add a virtual avatar:

  1. Install the selected plugin and configure its API keys
  2. Create an AgentSession, as in the voice AI quickstart
  3. Create an AvatarSession and configure it as necessary
  4. Start the avatar session, passing in the AgentSession instance
  5. Start the AgentSession with audio output disabled (the audio is sent to the avatar session instead)

Sample code

Here is an example using Tavus:

from livekit import agents
from livekit.agents import AgentSession, RoomOutputOptions
from livekit.plugins import tavus


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, etc.
    )

    avatar = tavus.AvatarSession(
        replica_id="...",  # ID of the Tavus replica to use
        persona_id="...",  # ID of the Tavus persona to use (see the Tavus integration guide for configuration details)
    )

    # Start the avatar and wait for it to join
    await avatar.start(session, room=ctx.room)

    # Start your agent session with the user
    await session.start(
        room=ctx.room,
        room_output_options=RoomOutputOptions(
            # Disable audio output to the room; the avatar plugin publishes audio separately
            audio_enabled=False,
        ),
        # ... agent, room_input_options, etc.
    )


if __name__ == "__main__":
    # Run the agent worker, as in the voice AI quickstart
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

Avatar workers

To minimize latency, the avatar provider joins the LiveKit room directly as a secondary participant and publishes synchronized audio and video from there. In your frontend app, you must distinguish between the agent (your Python program running the AgentSession) and the avatar worker.


You can identify an avatar worker as a participant of kind agent whose lk.publish_on_behalf attribute is set to the identity of the agent it publishes for. Check for these values in your frontend code to associate the worker's audio and video tracks with the agent.

import { ParticipantKind } from 'livekit-client';

// room.remoteParticipants is a Map, so convert it to an array first
const participants = Array.from(room.remoteParticipants.values());

// The agent has no lk.publish_on_behalf attribute
const agent = participants.find(
  (p) => p.kind === ParticipantKind.AGENT && !('lk.publish_on_behalf' in p.attributes)
);

// The avatar worker publishes on behalf of the agent
const avatarWorker = participants.find(
  (p) => p.kind === ParticipantKind.AGENT && p.attributes['lk.publish_on_behalf'] === agent?.identity
);
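
Once you have located the avatar worker, attach its tracks as you would for any remote participant. Here is a minimal sketch using the TrackSubscribed room event; the element handling is illustrative:

import { RoomEvent } from 'livekit-client';

// Attach the avatar worker's audio and video as they become available
room.on(RoomEvent.TrackSubscribed, (track, publication, participant) => {
  if (participant.identity === avatarWorker?.identity) {
    // attach() returns an <audio> or <video> element playing the track
    document.body.appendChild(track.attach());
  }
});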

In React apps, use the useVoiceAssistant hook to get the correct audio and video tracks automatically:

import { useVoiceAssistant } from '@livekit/components-react';

const {
  agent,      // the agent participant
  audioTrack, // the avatar worker's audio track
  videoTrack, // the avatar worker's video track
} = useVoiceAssistant();
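
To render the output, pass the returned video track to a track component and include an audio renderer. Here is a minimal sketch, assuming a component mounted inside LiveKitRoom from @livekit/components-react (the AvatarView name is illustrative):

import { RoomAudioRenderer, VideoTrack, useVoiceAssistant } from '@livekit/components-react';

// Illustrative component; render it inside <LiveKitRoom> so the hook has room context
function AvatarView() {
  const { videoTrack } = useVoiceAssistant();

  return (
    <>
      {/* Show the avatar worker's video once it has been published */}
      {videoTrack && <VideoTrack trackRef={videoTrack} />}
      {/* Play all subscribed audio, including the avatar worker's */}
      <RoomAudioRenderer />
    </>
  );
}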

Frontend starter apps

The following frontend starter apps include out-of-the-box support for virtual avatars.

Further reading