# STT Metrics

> Shows how to log speech-to-text (STT) metrics to the console.

This example logs speech-to-text metrics each time the STT plugin reports them. The agent streams audio to the STT plugin, which emits metrics that you render as a Rich table.

> ℹ️ **Note**
> 
> This recipe uses the per-plugin `metrics_collected` event on the STT instance. This per-component surface is not deprecated. A separate session-level `metrics_collected` event (`session.on("metrics_collected", ...)`) is deprecated. For session-scoped cost and usage tracking, see [Session usage](https://docs.livekit.io/deploy/observability/data.md#session-usage).

## Prerequisites

- Add a `.env.local` file in this directory with your LiveKit credentials:

  ```
  LIVEKIT_URL=your_livekit_url
  LIVEKIT_API_KEY=your_api_key
  LIVEKIT_API_SECRET=your_api_secret
  ```
- Install dependencies:

  ```bash
  pip install python-dotenv rich "livekit-agents[silero]"
  ```
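
Before starting the agent, it can help to confirm that the three credentials are actually set. The following is a minimal sketch using only the standard library. `missing_livekit_env` is a hypothetical helper name introduced for illustration; it is not part of LiveKit or python-dotenv.

```python
import os

# The three variable names listed in .env.local above.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_env(env=None):
    """Return the required variable names that are unset or empty."""
    # Hypothetical helper for illustration; defaults to the process environment.
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_livekit_env()
    if missing:
        print("Missing LiveKit settings: " + ", ".join(missing))
    else:
        print("All LiveKit settings are present.")
```

If you use this check in the recipe, call it after `load_dotenv(".env.local")` so the values from the file are already in the environment.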

## Load configuration and logging

Load `.env.local` with dotenv, then create a logger, a Rich console for reporting, and the `AgentServer` instance.

```python
import logging
import asyncio
from dotenv import load_dotenv
from livekit.agents import JobContext, JobProcess, Agent, AgentSession, inference, AgentServer, cli
from livekit.agents.metrics import STTMetrics
from livekit.plugins import silero
from rich.console import Console
from rich.table import Table
from rich import box
from datetime import datetime

load_dotenv(".env.local")

logger = logging.getLogger("metrics-stt")
logger.setLevel(logging.INFO)

console = Console()

server = AgentServer()

```

## Prewarm VAD for faster connections

Preload the VAD model once per process to reduce connection latency.

```python
def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()

server.setup_fnc = prewarm

```

## Build the agent and subscribe to metrics

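Keep the agent lightweight. In `on_enter`, attach a `metrics_collected` listener to the STT plugin. The listener is a plain function that schedules the async handler with `asyncio.create_task`, so the handler can `await` without holding up the code that emits the event.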

```python
class STTMetricsAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""
                You are a helpful agent.
            """
        )

    async def on_enter(self):
        def stt_wrapper(metrics: STTMetrics):
            asyncio.create_task(self.on_stt_metrics_collected(metrics))

        self.session.stt.on("metrics_collected", stt_wrapper)
        self.session.generate_reply()

```
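
The wrapper pattern above can be tried in isolation. The following is a minimal sketch that uses only `asyncio`. `TinyEmitter` is a hypothetical stand-in for any object that calls listeners synchronously; it is not the LiveKit event emitter. The sketch also keeps a reference to each task in a set, which the `asyncio.create_task` documentation recommends so that a task is not garbage-collected before it finishes. The recipe itself does not do this.

```python
import asyncio


class TinyEmitter:
    """Hypothetical stand-in for an object that only accepts sync callbacks."""

    def __init__(self):
        self._listeners = {}

    def on(self, event, callback):
        self._listeners.setdefault(event, []).append(callback)

    def emit(self, event, payload):
        for callback in self._listeners.get(event, []):
            callback(payload)  # called synchronously, never awaited


async def demo():
    emitter = TinyEmitter()
    seen = []
    pending = set()  # strong references keep scheduled tasks alive

    async def handle(payload):
        await asyncio.sleep(0)  # placeholder for real async work
        seen.append(payload)

    def wrapper(payload):
        # Schedule the coroutine on the running loop and return immediately.
        task = asyncio.create_task(handle(payload))
        pending.add(task)
        task.add_done_callback(pending.discard)

    emitter.on("metrics_collected", wrapper)
    emitter.emit("metrics_collected", {"audio_duration": 1.5})
    emitter.emit("metrics_collected", {"audio_duration": 0.5})

    # Wait for the scheduled handlers before reading the results.
    await asyncio.gather(*pending)
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The same idea would apply inside `on_enter`: store the task returned by `asyncio.create_task` on the agent and discard it in a done callback.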

## Display STT stats

The handler renders a Rich table with the metric type, label, request ID, timestamp, processing duration, audio duration, and whether the request was streamed.

```python
    async def on_stt_metrics_collected(self, metrics: STTMetrics) -> None:
        table = Table(
            title="[bold blue]STT Metrics Report[/bold blue]",
            box=box.ROUNDED,
            highlight=True,
            show_header=True,
            header_style="bold cyan"
        )

        table.add_column("Metric", style="bold green")
        table.add_column("Value", style="yellow")

        timestamp = datetime.fromtimestamp(metrics.timestamp).strftime('%Y-%m-%d %H:%M:%S')

        table.add_row("Type", str(metrics.type))
        table.add_row("Label", str(metrics.label))
        table.add_row("Request ID", str(metrics.request_id))
        table.add_row("Timestamp", timestamp)
        table.add_row("Duration", f"[white]{metrics.duration:.4f}[/white]s")
        table.add_row("Audio Duration", f"[white]{metrics.audio_duration:.4f}[/white]s")
        table.add_row("Streamed", "✓" if metrics.streamed else "✗")

        console.print("\n")
        console.print(table)
        console.print("\n")

```
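
One way to make the formatting easier to test is to build the rows separately from the Rich rendering. The sketch below assumes only the fields the handler above reads. `SampleSTTMetrics` and `metrics_rows` are hypothetical names for illustration, and the dataclass is a stand-in for, not a copy of, the real `STTMetrics` type. The Rich color markup is omitted to keep the output plain.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SampleSTTMetrics:
    """Hypothetical stand-in with only the fields this recipe reads."""

    type: str
    label: str
    request_id: str
    timestamp: float
    duration: float
    audio_duration: float
    streamed: bool


def metrics_rows(metrics):
    """Return (name, value) pairs in the order the table displays them."""
    # Local-time formatting, matching the handler above.
    stamp = datetime.fromtimestamp(metrics.timestamp).strftime("%Y-%m-%d %H:%M:%S")
    return [
        ("Type", str(metrics.type)),
        ("Label", str(metrics.label)),
        ("Request ID", str(metrics.request_id)),
        ("Timestamp", stamp),
        ("Duration", f"{metrics.duration:.4f}s"),
        ("Audio Duration", f"{metrics.audio_duration:.4f}s"),
        ("Streamed", "✓" if metrics.streamed else "✗"),
    ]


if __name__ == "__main__":
    sample = SampleSTTMetrics(
        type="stt_metrics",  # illustrative values only
        label="example-stt",
        request_id="req-1",
        timestamp=0.0,
        duration=0.25,
        audio_duration=1.5,
        streamed=True,
    )
    for name, value in metrics_rows(sample):
        print(f"{name}: {value}")
```

In the handler, each pair could then be passed to `table.add_row(name, value)`.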

## Set up the session

Configure the `AgentSession` with STT, LLM, TTS, and the prewarmed VAD. The listener attached in `on_enter` captures the STT metrics events.

```python
@server.rtc_session(agent_name="my-agent")
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}

    session = AgentSession(
        stt=inference.STT(model="deepgram/nova-3-general"),
        llm=inference.LLM(model="openai/gpt-5.3-chat-latest"),
        tts=inference.TTS(model="cartesia/sonic-3", voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"),
        vad=ctx.proc.userdata["vad"],
        preemptive_generation=True,
    )
    agent = STTMetricsAgent()

    await session.start(agent=agent, room=ctx.room)
    await ctx.connect()

```

## Run the server

Start the agent server with the CLI.

```python
if __name__ == "__main__":
    cli.run_app(server)

```

## Run it

```console
python metrics_stt.py console

```

## How it works

1. The agent uses Deepgram streaming STT with Silero VAD.
2. The STT plugin emits `metrics_collected` after each recognition request.
3. An async handler formats and prints the data so you can watch latency and audio durations live.
4. Because the handler runs in a task, it does not block audio processing.
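
If you want running totals in addition to per-event tables, the handler could feed each event into a small accumulator. This is a minimal sketch for local inspection only. `STTUsageTotals` is a hypothetical class, not part of livekit-agents, and it assumes only the `audio_duration` and `duration` fields shown in this recipe.

```python
from dataclasses import dataclass


@dataclass
class STTUsageTotals:
    """Hypothetical accumulator for values taken from STT metrics events."""

    requests: int = 0
    audio_seconds: float = 0.0
    processing_seconds: float = 0.0

    def add(self, audio_duration, duration):
        # Record one metrics event.
        self.requests += 1
        self.audio_seconds += audio_duration
        self.processing_seconds += duration

    @property
    def mean_audio_seconds(self):
        # Average audio length per event; 0.0 before any event arrives.
        return self.audio_seconds / self.requests if self.requests else 0.0


if __name__ == "__main__":
    totals = STTUsageTotals()
    totals.add(audio_duration=1.5, duration=0.25)
    totals.add(audio_duration=0.5, duration=0.25)
    print(totals.requests, totals.audio_seconds, totals.mean_audio_seconds)
```

Inside `on_stt_metrics_collected`, the call would look like `totals.add(metrics.audio_duration, metrics.duration)`.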

## Full example

```python
import logging
import asyncio
from dotenv import load_dotenv
from livekit.agents import JobContext, JobProcess, Agent, AgentSession, inference, AgentServer, cli
from livekit.agents.metrics import STTMetrics
from livekit.plugins import silero
from rich.console import Console
from rich.table import Table
from rich import box
from datetime import datetime

load_dotenv(".env.local")

logger = logging.getLogger("metrics-stt")
logger.setLevel(logging.INFO)

console = Console()

class STTMetricsAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""
                You are a helpful agent.
            """
        )

    async def on_enter(self):
        def stt_wrapper(metrics: STTMetrics):
            asyncio.create_task(self.on_stt_metrics_collected(metrics))

        self.session.stt.on("metrics_collected", stt_wrapper)
        self.session.generate_reply()

    async def on_stt_metrics_collected(self, metrics: STTMetrics) -> None:
        table = Table(
            title="[bold blue]STT Metrics Report[/bold blue]",
            box=box.ROUNDED,
            highlight=True,
            show_header=True,
            header_style="bold cyan"
        )

        table.add_column("Metric", style="bold green")
        table.add_column("Value", style="yellow")

        timestamp = datetime.fromtimestamp(metrics.timestamp).strftime('%Y-%m-%d %H:%M:%S')

        table.add_row("Type", str(metrics.type))
        table.add_row("Label", str(metrics.label))
        table.add_row("Request ID", str(metrics.request_id))
        table.add_row("Timestamp", timestamp)
        table.add_row("Duration", f"[white]{metrics.duration:.4f}[/white]s")
        table.add_row("Audio Duration", f"[white]{metrics.audio_duration:.4f}[/white]s")
        table.add_row("Streamed", "✓" if metrics.streamed else "✗")

        console.print("\n")
        console.print(table)
        console.print("\n")

server = AgentServer()

def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()

server.setup_fnc = prewarm

@server.rtc_session(agent_name="my-agent")
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}

    session = AgentSession(
        stt=inference.STT(model="deepgram/nova-3-general"),
        llm=inference.LLM(model="openai/gpt-5.3-chat-latest"),
        tts=inference.TTS(model="cartesia/sonic-3", voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"),
        vad=ctx.proc.userdata["vad"],
        preemptive_generation=True,
    )
    agent = STTMetricsAgent()

    await session.start(agent=agent, room=ctx.room)
    await ctx.connect()

if __name__ == "__main__":
    cli.run_app(server)

```

---

This document was rendered at 2026-06-07T11:35:49.277Z.
For the latest version of this document, see [https://docs.livekit.io/reference/recipes/metrics_stt.md](https://docs.livekit.io/reference/recipes/metrics_stt.md).