Overview
The LiveKit Agents SDK exposes detailed data about each session, which you can collect locally and integrate with other systems. For information about data collected in LiveKit Cloud, see the Insights in LiveKit Cloud overview.
Metrics and usage data
AgentSession emits a metrics_collected event whenever new metrics are available. You can log these events directly or forward them to external services.
Subscribe to metrics events
```python
from livekit.agents import metrics, MetricsCollectedEvent

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)
```
```typescript
import { voice, metrics } from '@livekit/agents';

session.on(voice.AgentSessionEventTypes.MetricsCollected, (ev) => {
  metrics.logMetrics(ev.metrics);
});
```
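To forward metrics to an external service rather than just logging them, serialize the event payload and post it off the handler's hot path. The following is a minimal sketch assuming a hypothetical HTTP collector endpoint and that the metric objects are Python dataclasses (so `dataclasses.asdict` applies); swap in whatever transport your pipeline uses.

```python
import asyncio
import dataclasses
import json

import aiohttp

from livekit.agents import metrics, MetricsCollectedEvent

# Hypothetical collector endpoint -- replace with your own ingestion URL.
METRICS_ENDPOINT = "https://example.com/ingest/agent-metrics"

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)

    async def _forward() -> None:
        # Assumes the metric objects are dataclasses; default=str covers
        # any non-JSON-serializable fields such as enums or timestamps.
        payload = json.dumps(dataclasses.asdict(ev.metrics), default=str)
        async with aiohttp.ClientSession() as http:
            await http.post(
                METRICS_ENDPOINT,
                data=payload,
                headers={"Content-Type": "application/json"},
            )

    # The handler itself must stay non-blocking; upload in the background.
    asyncio.create_task(_forward())
```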
Aggregate usage with UsageCollector
Use UsageCollector to accumulate LLM, TTS, and STT usage across a session for cost estimation or billing exports.
```python
import logging

from livekit.agents import metrics, MetricsCollectedEvent

logger = logging.getLogger("agent")

usage_collector = metrics.UsageCollector()

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    usage_collector.collect(ev.metrics)

async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

# Report aggregate usage once the job shuts down.
ctx.add_shutdown_callback(log_usage)
```
```typescript
import { voice, metrics } from '@livekit/agents';

const usageCollector = new metrics.UsageCollector();

session.on(voice.AgentSessionEventTypes.MetricsCollected, (ev) => {
  metrics.logMetrics(ev.metrics);
  usageCollector.collect(ev.metrics);
});

const logUsage = async () => {
  const summary = usageCollector.getSummary();
  console.log(`Usage: ${JSON.stringify(summary)}`);
};

ctx.addShutdownCallback(logUsage);
```
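Turning the summary into a cost estimate is then simple arithmetic. The sketch below assumes the summary exposes `llm_prompt_tokens`, `llm_completion_tokens`, `tts_characters_count`, and `stt_audio_duration`; the rates are placeholders, not real provider pricing.

```python
# Placeholder rates -- substitute your providers' actual pricing.
PRICE_PER_1M_PROMPT_TOKENS = 2.50       # USD per 1M LLM prompt tokens
PRICE_PER_1M_COMPLETION_TOKENS = 10.00  # USD per 1M LLM completion tokens
PRICE_PER_1M_TTS_CHARS = 15.00          # USD per 1M TTS characters
PRICE_PER_STT_MINUTE = 0.006            # USD per minute of STT audio

def estimate_cost(summary) -> float:
    # Field names assume the collected usage summary shape described above.
    llm_cost = (
        summary.llm_prompt_tokens * PRICE_PER_1M_PROMPT_TOKENS
        + summary.llm_completion_tokens * PRICE_PER_1M_COMPLETION_TOKENS
    ) / 1_000_000
    tts_cost = summary.tts_characters_count * PRICE_PER_1M_TTS_CHARS / 1_000_000
    stt_cost = summary.stt_audio_duration / 60 * PRICE_PER_STT_MINUTE
    return llm_cost + tts_cost + stt_cost
```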
Metrics reference
Each metrics event is included in the LiveKit Cloud trace spans and surfaced as JSON in the dashboard. Use the tables below when you emit the data elsewhere.
Speech-to-text (STT)
STTMetrics is emitted after the STT model processes the audio input. This metrics event is only available when an STT component is configured (Realtime APIs do not emit it).
| Metric | Description |
|---|---|
| audio_duration | The duration (seconds) of the audio input received by the STT model. |
| duration | For non-streaming STT, the amount of time (seconds) it took to create the transcript. Always 0 for streaming STT. |
| streamed | True if the STT is in streaming mode. |
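Because duration is only meaningful for non-streaming STT, branch on the streamed flag before interpreting it. A small sketch, using only the fields from the table above:

```python
from livekit.agents import metrics

def report_stt(m: metrics.STTMetrics) -> None:
    if m.streamed:
        # Streaming STT transcribes as audio arrives; duration is always 0,
        # so only the amount of audio processed is meaningful here.
        print(f"streamed {m.audio_duration:.1f}s of audio")
    else:
        # For batch STT, duration / audio_duration gives a real-time factor:
        # values below 1.0 mean transcription ran faster than real time.
        rtf = m.duration / m.audio_duration if m.audio_duration else 0.0
        print(f"transcribed {m.audio_duration:.1f}s in {m.duration:.1f}s (RTF {rtf:.2f})")
```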
LLM
LLMMetrics is emitted after each LLM inference completes. Tool calls that run after the initial completion emit their own LLMMetrics events.
| Metric | Description |
|---|---|
| duration | The amount of time (seconds) it took for the LLM to generate the entire completion. |
| completion_tokens | The number of tokens generated by the LLM in the completion. |
| prompt_tokens | The number of tokens provided in the prompt sent to the LLM. |
| prompt_cached_tokens | The number of cached tokens in the input prompt. |
| speech_id | A unique identifier representing a turn in the user input. |
| total_tokens | Total token usage for the completion. |
| tokens_per_second | The rate of token generation (tokens/second) by the LLM to generate the completion. |
| ttft | The amount of time (seconds) that it took for the LLM to generate the first token of the completion. |
Text-to-speech (TTS)
TTSMetrics is emitted after the TTS model generates speech from text input.
| Metric | Description |
|---|---|
| audio_duration | The duration (seconds) of the audio output generated by the TTS model. |
| characters_count | The number of characters in the text input to the TTS model. |
| duration | The amount of time (seconds) it took for the TTS model to generate the entire audio output. |
| ttfb | The amount of time (seconds) that it took for the TTS model to generate the first byte of its audio output. |
| speech_id | An identifier linking to a user's turn. |
| streamed | True if the TTS is in streaming mode. |
End-of-utterance (EOU)
EOUMetrics is emitted when the user is determined to have finished speaking. It includes metrics related to end-of-turn detection and transcription latency.
EOU metrics are available in Realtime APIs when turn_detection is set to VAD or LiveKit's turn detector plugin. When using server-side turn detection, EOUMetrics is not emitted.
| Metric | Description |
|---|---|
| end_of_utterance_delay | Time (seconds) from the end of speech (as detected by VAD) to the point when the user's turn is considered complete. This includes any transcription_delay. |
| transcription_delay | Time (seconds) between the end of speech and when the final transcript is available. |
| on_user_turn_completed_delay | Time (seconds) taken to execute the on_user_turn_completed callback. |
| speech_id | A unique identifier indicating the user's turn. |
Measure conversation latency
Total conversation latency is the time it takes for the agent to respond to a user's utterance. Approximate it with the following metrics:
```python
total_latency = eou.end_of_utterance_delay + llm.ttft + tts.ttfb
```

```typescript
const totalLatency = eou.endOfUtteranceDelay + llm.ttft + tts.ttfb;
```
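These three values arrive in separate metrics events, so measuring latency per turn means correlating them. The following is a minimal sketch that groups the components by speech_id, assuming each turn emits exactly one EOU, one LLM, and one TTS metrics event; adapt it if your pipeline produces multiple completions per turn.

```python
from livekit.agents import metrics, MetricsCollectedEvent

# Per-turn component timings, keyed by speech_id.
turn_timings: dict[str, dict[str, float]] = {}

def _record(speech_id: str, name: str, value: float) -> None:
    timings = turn_timings.setdefault(speech_id, {})
    timings[name] = value
    if len(timings) == 3:  # all three components have arrived
        total = timings["eou_delay"] + timings["ttft"] + timings["ttfb"]
        print(f"turn {speech_id}: total conversation latency {total:.3f}s")

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    m = ev.metrics
    if isinstance(m, metrics.EOUMetrics):
        _record(m.speech_id, "eou_delay", m.end_of_utterance_delay)
    elif isinstance(m, metrics.LLMMetrics):
        _record(m.speech_id, "ttft", m.ttft)
    elif isinstance(m, metrics.TTSMetrics):
        _record(m.speech_id, "ttfb", m.ttfb)
```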
Session transcripts and reports
The session.history object contains the full conversation, and the SDK raises events like conversation_item_added and user_input_transcribed as turns progress. Use these hooks to build live dashboards or persist transcripts once a session ends. When you need a structured post-session artifact, call ctx.make_session_report() inside on_session_end to gather identifiers, history, events, and recording metadata in one JSON payload.
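For the live case, a minimal sketch of the two event hooks follows. The payload attribute names used here (item with a role and text content, transcript and is_final) are assumptions based on current SDK event classes; verify them against your installed version.

```python
@session.on("conversation_item_added")
def _on_item_added(ev):
    # Assumed payload shape: ev.item carries the chat message for the turn.
    print(f"[{ev.item.role}] {ev.item.text_content}")

@session.on("user_input_transcribed")
def _on_user_transcribed(ev):
    # Interim transcripts stream in first; persist only the final one.
    if ev.is_final:
        print(f"final user transcript: {ev.transcript}")
```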
Save conversation history example
The following Python example augments the Voice AI quickstart to save the transcript as JSON when the session closes.
```python
import json
from datetime import datetime

async def entrypoint(ctx: JobContext):
    async def write_transcript():
        current_date = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"/tmp/transcript_{ctx.room.name}_{current_date}.json"

        with open(filename, "w") as f:
            json.dump(session.history.to_dict(), f, indent=2)

        print(f"Transcript for {ctx.room.name} saved to {filename}")

    ctx.add_shutdown_callback(write_transcript)

    # ... continue with ctx.connect(), agent setup, etc.
```
Capture a session report
Use the on_session_end callback to capture a structured SessionReport with identifiers, conversation history, events, recording metadata, and agent configuration.
```python
import json
from datetime import datetime

from livekit.agents import JobContext, AgentServer

server = AgentServer()

async def on_session_end(ctx: JobContext) -> None:
    report = ctx.make_session_report()
    report_dict = report.to_dict()

    current_date = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"/tmp/session_report_{ctx.room.name}_{current_date}.json"

    with open(filename, "w") as f:
        json.dump(report_dict, f, indent=2)

    print(f"Session report for {ctx.room.name} saved to {filename}")

@server.rtc_session(on_session_end=on_session_end)
async def entrypoint(ctx: JobContext):
    await ctx.connect()
    # ...
```
The report includes fields such as:
- Job, room, and participant identifiers
- Complete conversation history with timestamps
- All session events (transcription, speech detection, handoffs, etc.)
- Audio recording metadata and paths (when recording is enabled)
- Agent session options and configuration
Record audio or video
Use LiveKit Egress to capture audio and video directly to your storage provider. The simplest pattern is to start a room composite recorder when your agent joins the room.
```python
import os

from livekit import api

async def entrypoint(ctx: JobContext):
    req = api.RoomCompositeEgressRequest(
        room_name=ctx.room.name,
        audio_only=True,
        file_outputs=[
            api.EncodedFileOutput(
                file_type=api.EncodedFileType.OGG,
                filepath="livekit/my-room-test.ogg",
                s3=api.S3Upload(
                    bucket=os.getenv("AWS_BUCKET_NAME"),
                    region=os.getenv("AWS_REGION"),
                    access_key=os.getenv("AWS_ACCESS_KEY_ID"),
                    secret=os.getenv("AWS_SECRET_ACCESS_KEY"),
                ),
            )
        ],
    )

    lkapi = api.LiveKitAPI()
    await lkapi.egress.start_room_composite_egress(req)
    await lkapi.aclose()

    # ... continue with your agent logic
```
OpenTelemetry integration
Set a tracer provider to export the same spans used by LiveKit Cloud to any OpenTelemetry-compatible backend. The example below sends spans to LangFuse.
```python
import base64
import os

from livekit.agents.telemetry import set_tracer_provider

def setup_langfuse(host: str | None = None, public_key: str | None = None, secret_key: str | None = None):
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    public_key = public_key or os.getenv("LANGFUSE_PUBLIC_KEY")
    secret_key = secret_key or os.getenv("LANGFUSE_SECRET_KEY")
    host = host or os.getenv("LANGFUSE_HOST")

    if not public_key or not secret_key or not host:
        raise ValueError("LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST must be set")

    langfuse_auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{host.rstrip('/')}/api/public/otel"
    os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {langfuse_auth}"

    trace_provider = TracerProvider()
    trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    set_tracer_provider(trace_provider)

async def entrypoint(ctx: JobContext):
    setup_langfuse()
    # start your agent
```
For an end-to-end script, see the LangFuse trace example on GitHub.