# Answering machine detection

> Classify whether a real person, voicemail, or IVR system answered an outbound call.

## Overview

An outbound call can reach a person, voicemail, an IVR menu, or a number that can't accept messages. Answering machine detection (AMD) listens to the start of the call, classifies it with an LLM, and returns a result so your agent can respond appropriately.

## How AMD works

AMD runs once at the start of the call, on the first user utterance. It doesn't monitor continuously. While AMD is running, the agent's speech is paused so it doesn't talk over a voicemail greeting before classification completes.

AMD classifies the call into one of five categories. Your agent uses the result to decide the next step: continue the conversation, leave a voicemail, navigate an IVR, or hang up.

AMD runs two paths in parallel: a fast-path heuristic for short greetings followed by silence, and an LLM classifier for transcripts that need more reasoning. The first path to reach a conclusion produces the result.

![AMD classification flow: short speech and transcript inputs feed a fast-path heuristic and an LLM classifier, which together emit one of five categories: human, machine-ivr, machine-vm, machine-unavailable, or uncertain.](/images/sip/amd-pipeline.svg)
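The "first path to reach a conclusion" behavior can be pictured as a race between two tasks. The following is a minimal sketch using only `asyncio`, not AMD's actual implementation: `fast_path_heuristic`, `llm_classifier`, and `first_verdict` are hypothetical names, and the sleeps stand in for real audio and LLM latency.

```python
import asyncio


async def fast_path_heuristic() -> str:
    # Hypothetical stand-in: a short greeting followed by silence.
    await asyncio.sleep(0.01)
    return "human"


async def llm_classifier() -> str:
    # Hypothetical stand-in: a slower LLM call over the transcript.
    await asyncio.sleep(0.2)
    return "machine-vm"


async def first_verdict() -> str:
    tasks = [
        asyncio.create_task(fast_path_heuristic()),
        asyncio.create_task(llm_classifier()),
    ]
    # Wake up as soon as either path finishes.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the slower path is no longer needed
    # Let the cancelled task unwind before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return done.pop().result()


print(asyncio.run(first_verdict()))  # prints "human"
```

In this sketch the heuristic wins because it finishes first. With a longer greeting, the heuristic would not conclude and the LLM path would supply the result instead.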

| Category | Description |
| --- | --- |
| `human` | A real person answered. Proceed with normal conversation. |
| `machine-ivr` | An IVR or DTMF menu was detected. In Python, the session automatically starts [IVR navigation](https://docs.livekit.io/telephony/features/dtmf.md) when `ivr_detection` is enabled (the default). The Node.js SDK doesn't support IVR navigation, so the agent should handle `machine-ivr` the same as `human` and let the main agent respond. |
| `machine-vm` | A voicemail greeting where leaving a message is possible. |
| `machine-unavailable` | The mailbox is full, not set up, or the callee is unreachable. Leaving a message isn't possible. |
| `uncertain` | The greeting can't be classified with confidence. Treat as a human and proceed with normal conversation. |
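One way to read the table is as a mapping from category to next step. The sketch below is illustrative only: `next_step` and the action labels it returns (`"converse"`, `"navigate-ivr"`, `"leave-voicemail"`, `"hang-up"`) are hypothetical names, not part of the SDK.

```python
def next_step(category: str, ivr_navigation: bool = True) -> str:
    """Map an AMD category to the action described in the table above."""
    if category in ("human", "uncertain"):
        # Uncertain results are treated as a human.
        return "converse"
    if category == "machine-ivr":
        # Without IVR navigation (for example, in Node.js), let the agent talk.
        return "navigate-ivr" if ivr_navigation else "converse"
    if category == "machine-vm":
        return "leave-voicemail"
    if category == "machine-unavailable":
        return "hang-up"
    raise ValueError(f"unknown AMD category: {category!r}")


print(next_step("machine-ivr", ivr_navigation=False))  # prints "converse"
```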

## Usage

Initialize AMD before creating the SIP participant so detection is ready before audio starts arriving. The detector pauses agent speech until a result is available.

**Python**:

Open the async context manager, then create the SIP participant inside it. Pass `participant_identity` so AMD's timers wait for that specific participant's audio track:

```python
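import logging
import os

from livekit import api
from livekit.agents import AMD
from livekit.protocol.sip import SIPOutboundConfig

logger = logging.getLogger("amd-example")

# Runs inside your agent entrypoint, where ctx, session, phone_number,
# and participant_identity are already defined.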

async with AMD(session, participant_identity=participant_identity) as detector:
    await ctx.api.sip.create_sip_participant(
        api.CreateSIPParticipantRequest(
            trunk=SIPOutboundConfig(
                hostname=os.getenv("SIP_TRUNK_HOSTNAME"),
                auth_username=os.getenv("SIP_AUTH_USERNAME"),
                auth_password=os.getenv("SIP_AUTH_PASSWORD"),
            ),
            sip_number="<SIP provider number>",
            room_name=ctx.room.name,
            sip_call_to=phone_number,
            participant_identity=participant_identity,
            wait_until_answered=True,
        )
    )
    await ctx.wait_for_participant(identity=participant_identity)

    result = await detector.execute()

    if result.category == "human" or result.category == "uncertain":
        logger.info(
            "human answered the call or amd is uncertain, proceeding with normal conversation",
            extra={"transcript": result.transcript},
        )
    elif result.category == "machine-ivr":
        logger.info("ivr menu detected, starting navigation")
    elif result.category == "machine-vm":
        logger.info("voicemail detected, leaving a message")
        speech_handle = session.generate_reply(
            instructions=(
                "You've reached voicemail. Leave a brief message asking "
                "the customer to call back."
            ),
        )
        await speech_handle.wait_for_playout()
        ctx.shutdown("voicemail detected")
    elif result.category == "machine-unavailable":
        logger.info("mailbox unavailable, ending call")
        ctx.shutdown("mailbox unavailable")

```

---

**Node.js**:

Instantiate the detector before creating the SIP participant. Pass `participantIdentity` so AMD's timers wait for that participant's audio track. Wrap the run in `try`/`finally` so `detector.aclose()` runs even on error:

```typescript
import { voice } from '@livekit/agents';
import { SipClient } from 'livekit-server-sdk';

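// Runs inside your agent entry function, where ctx, session, logger,
// phoneNumber, and participantIdentity are already defined.
session._roomIO.setParticipant(participantIdentity);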
const detector = new voice.AMD(session, { participantIdentity });

try {
  const sip = new SipClient(
    process.env.LIVEKIT_URL,
    process.env.LIVEKIT_API_KEY,
    process.env.LIVEKIT_API_SECRET,
  );
  await sip.createSipParticipant(
    '', // Empty string when using inline trunk config
    phoneNumber,
    ctx.room.name,
    {
      participantIdentity,
      fromNumber: '<SIP provider number>',
      waitUntilAnswered: true,
    },
    { // Inline trunk configuration
      hostname: process.env.SIP_TRUNK_HOSTNAME,
      authUsername: process.env.SIP_AUTH_USERNAME,
      authPassword: process.env.SIP_AUTH_PASSWORD,
    },
  );
  await ctx.waitForParticipant(participantIdentity);

  const result = await detector.execute();

  if (
    result.category === voice.AMDCategory.HUMAN ||
    result.category === voice.AMDCategory.UNCERTAIN ||
    result.category === voice.AMDCategory.MACHINE_IVR
  ) {
    logger.info(
      { amd: result },
      'human or ivr menu detected, proceeding with normal conversation',
    );
  } else if (result.category === voice.AMDCategory.MACHINE_VM) {
    logger.info({ amd: result }, 'voicemail detected, leaving a message');
    const speechHandle = session.generateReply({
      instructions:
        "You've reached voicemail. Leave a brief message asking the customer to call back.",
    });
    await speechHandle.waitForPlayout();
    session.shutdown({ reason: 'amd:machine-vm' });
  } else if (result.category === voice.AMDCategory.MACHINE_UNAVAILABLE) {
    logger.info({ amd: result }, 'mailbox unavailable, ending call');
    session.shutdown({ reason: 'amd:machine-unavailable' });
  }
} finally {
  await detector.aclose();
}

```

> ℹ️ **Stored outbound trunk**
> 
> You can also use a stored outbound trunk by passing `sip_trunk_id` (Python) or `sipTrunkId` (Node.js) instead of [inline trunk configuration](https://docs.livekit.io/telephony/making-calls/outbound-calls.md#inline-trunk). For details, see [Outbound trunk](https://docs.livekit.io/telephony/making-calls/outbound-trunk.md).

## Recommended models

AMD has been evaluated against a small set of LLMs and STT models on [LiveKit Inference](https://docs.livekit.io/agents/models/inference.md).

Behavior on unevaluated models isn't guaranteed, so AMD logs a compatibility warning when you pass an unevaluated model. Once you've validated your own choice, set `suppress_compatibility_warning=True` (Python) or `suppressCompatibilityWarning: true` (Node.js) to silence the warning.

### Evaluated LLMs

- `google/gemini-3.1-flash-lite` (default)
- `google/gemini-3-flash-preview`
- `google/gemini-2.5-flash-lite`
- `openai/gpt-4o`
- `openai/gpt-4.1`
- `openai/gpt-4.1-mini`
- `openai/gpt-4.1-nano`
- `openai/gpt-5.1`
- `openai/gpt-5.1-chat-latest`
- `openai/gpt-5.2`
- `openai/gpt-5.2-chat-latest`
- `openai/gpt-5.4`

### Evaluated STT models

- `cartesia/ink-whisper` (default)
- `assemblyai/universal-streaming-multilingual`
- `deepgram/nova-3`
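If you pass model ID strings, you can check them against the lists above before deciding whether to suppress the warning. This is a minimal sketch, not how AMD itself performs the check: `EVALUATED_LLMS`, `EVALUATED_STT`, and `unevaluated_models` are hypothetical names, and the sets are copied from this page, so they can go stale.

```python
from typing import List, Optional

# Copied from the evaluated lists on this page; update when the page changes.
EVALUATED_LLMS = frozenset({
    "google/gemini-3.1-flash-lite",
    "google/gemini-3-flash-preview",
    "google/gemini-2.5-flash-lite",
    "openai/gpt-4o",
    "openai/gpt-4.1",
    "openai/gpt-4.1-mini",
    "openai/gpt-4.1-nano",
    "openai/gpt-5.1",
    "openai/gpt-5.1-chat-latest",
    "openai/gpt-5.2",
    "openai/gpt-5.2-chat-latest",
    "openai/gpt-5.4",
})
EVALUATED_STT = frozenset({
    "cartesia/ink-whisper",
    "assemblyai/universal-streaming-multilingual",
    "deepgram/nova-3",
})


def unevaluated_models(llm: Optional[str] = None, stt: Optional[str] = None) -> List[str]:
    """Return the model IDs that are not in the evaluated lists."""
    unknown = []
    if llm is not None and llm not in EVALUATED_LLMS:
        unknown.append(llm)
    if stt is not None and stt not in EVALUATED_STT:
        unknown.append(stt)
    return unknown


print(unevaluated_models(llm="openai/gpt-4.1", stt="deepgram/nova-3"))  # prints []
```

A non-empty result means AMD would log the compatibility warning for that configuration, so validate the model on your own calls before silencing it.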

## Parameters

Defaults are calibrated for typical outbound calls. Override them when you need different timing thresholds or a different classification prompt.

**Python**:

- **`llm`** _(LLM | str)_ (optional): LLM used for greeting classification. Accepts an `LLM` instance or a [LiveKit Inference](https://docs.livekit.io/agents/models/llm.md) model ID string. If not set, AMD uses `google/gemini-3.1-flash-lite` via LiveKit Inference when available, and otherwise falls back to the session's own LLM. See [recommended models](#models) for the evaluated set.

- **`stt`** _(STT | str)_ (optional): STT used to transcribe the greeting. Accepts an `STT` instance or a [LiveKit Inference](https://docs.livekit.io/agents/models/stt.md) model ID string. If not set, AMD uses `cartesia/ink-whisper` via LiveKit Inference when available, and otherwise reuses the session's existing STT transcripts. AMD runs its own STT pipeline so it can listen even when the session uses a realtime model with no separate STT.

- **`interrupt_on_machine`** _(bool)_ (optional) - Default: `True`: Interrupt any pending agent speech when a machine is detected.

- **`participant_identity`** _(str)_ (optional): Identity of the SIP participant whose audio AMD should listen to. When omitted, AMD attaches to the first remote audio track in the room. Set this when the room might have other participants so AMD timers don't start on the wrong track.

- **`ivr_detection`** _(bool)_ (optional) - Default: `True`: Automatically start [IVR navigation](https://docs.livekit.io/telephony/features/dtmf.md) when the result is `machine-ivr`. When `False`, AMD returns the `machine-ivr` result without starting navigation, and your agent decides how to handle it.

- **`detection_options`** _(DetectionOptions)_ (optional): Override the default timing thresholds and classification prompt. Pass a dict with any of the following keys: `human_speech_threshold` (default `2.5`), `human_silence_threshold` (default `0.5`), `machine_silence_threshold` (default `1.5`), `no_speech_threshold` (default `10.0`), `timeout` (default `20.0`), or `prompt`. All thresholds are in seconds. Values not provided fall back to library defaults.

- **`suppress_compatibility_warning`** _(bool)_ (optional) - Default: `False`: Silence the warning that fires when `llm` or `stt` isn't among the evaluated models. Has no effect on classification behavior.
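Because `detection_options` is a plain dict, a typo in a key is easy to miss. The helper below is a hedged sketch of validating overrides before passing them to AMD: `make_detection_options` and `ALLOWED_KEYS` are hypothetical names, and the key list is taken from the `detection_options` description above.

```python
# Keys documented for detection_options; thresholds are in seconds.
ALLOWED_KEYS = {
    "human_speech_threshold",
    "human_silence_threshold",
    "machine_silence_threshold",
    "no_speech_threshold",
    "timeout",
    "prompt",
}


def make_detection_options(**overrides):
    """Return only the overrides; omitted keys fall back to library defaults."""
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise KeyError(f"unknown detection option(s): {sorted(unknown)}")
    for key, value in overrides.items():
        if key == "prompt":
            continue
        if not isinstance(value, (int, float)) or value <= 0:
            raise ValueError(f"{key} must be a positive number of seconds")
    return dict(overrides)


# Commit to `human` faster and cap the whole detection at 15 seconds.
options = make_detection_options(human_silence_threshold=0.3, timeout=15.0)
print(options)

# Assumed usage, following the parameter list above:
# AMD(session, participant_identity=participant_identity, detection_options=options)
```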

---

**Node.js**:

The Node.js SDK doesn't support IVR navigation, so treat `machine-ivr` results as a human conversation and let the main agent respond.

- **`llm`** _(LLM | string)_ (optional): LLM used for greeting classification. Accepts an `LLM` instance or a [LiveKit Inference](https://docs.livekit.io/agents/models/llm.md) model ID string. If not set, AMD uses `google/gemini-3.1-flash-lite` via LiveKit Inference when available, and otherwise falls back to the session's own LLM. See [recommended models](#models) for the evaluated set.

- **`stt`** _(STT | string)_ (optional): STT used to transcribe the greeting. Accepts an `STT` instance or a [LiveKit Inference](https://docs.livekit.io/agents/models/stt.md) model ID string. If not set, AMD uses `cartesia/ink-whisper` via LiveKit Inference when available, and otherwise listens to session-level transcripts instead. AMD runs its own STT pipeline so it can listen even when the session uses a realtime model with no separate STT.

- **`interruptOnMachine`** _(boolean)_ (optional) - Default: `true`: Interrupt any pending agent speech when a machine is detected.

- **`participantIdentity`** _(string)_ (optional): Identity of the SIP participant whose audio AMD should listen to. When omitted, AMD attaches to the session's linked participant or the first remote audio track in the room. Set this when the room might have other participants so AMD timers don't start on the wrong track.

- **`humanSpeechThresholdMs`** _(number)_ (optional) - Default: `2500`: Maximum length in milliseconds of a "short greeting." Speech shorter than this triggers the fast-path human heuristic; speech longer is treated as machine-like and defers to the LLM classifier.

- **`humanSilenceThresholdMs`** _(number)_ (optional) - Default: `500`: Silence in milliseconds after a short greeting before AMD settles as `human`. Shorter values commit to `human` faster on quick "Hello?" greetings.

- **`machineSilenceThresholdMs`** _(number)_ (optional) - Default: `1500`: Silence in milliseconds after machine-like speech before AMD opens the silence gate and emits a verdict. Longer values give the LLM more time to review the transcript.

- **`noSpeechTimeoutMs`** _(number)_ (optional) - Default: `10000`: Maximum time in milliseconds to wait for a transcript before AMD gives up. When this elapses with no speech detected, AMD settles as `machine-unavailable`.

- **`detectionTimeoutMs`** _(number)_ (optional) - Default: `20000`: Maximum time in milliseconds for the entire detection. When this elapses, AMD settles with whatever evidence is available.

- **`prompt`** _(string)_ (optional): Override the default classification prompt passed to the LLM. Use this to bias detection toward your domain (for example, recognizing region-specific voicemail phrasing) or to translate the prompt into another language.

- **`suppressCompatibilityWarning`** _(boolean)_ (optional) - Default: `false`: Silence the warning that fires when `llm` or `stt` isn't among the evaluated models. Has no effect on classification behavior.
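To see how the timing thresholds interact, the following Python sketch encodes one reading of the parameter descriptions above. It is an interpretation for illustration, not the SDK's source: `Thresholds`, `timing_decision`, and the `"ask-llm"` label are hypothetical, and the real detector also weighs the transcript and the overall detection timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    # Defaults mirror the documented values, in milliseconds.
    human_speech_ms: int = 2500
    human_silence_ms: int = 500
    machine_silence_ms: int = 1500
    no_speech_timeout_ms: int = 10000


def timing_decision(
    speech_ms: int, silence_ms: int, t: Thresholds = Thresholds()
) -> Optional[str]:
    """Return a verdict, "ask-llm", or None to keep listening."""
    if speech_ms == 0:
        # Nothing heard yet: give up only after the no-speech timeout.
        return "machine-unavailable" if silence_ms >= t.no_speech_timeout_ms else None
    if speech_ms < t.human_speech_ms:
        # Short greeting: settle as human once enough silence follows.
        return "human" if silence_ms >= t.human_silence_ms else None
    # Longer, machine-like speech: wait for the silence gate, then defer to the LLM.
    return "ask-llm" if silence_ms >= t.machine_silence_ms else None


print(timing_decision(speech_ms=800, silence_ms=600))    # prints "human"
print(timing_decision(speech_ms=4000, silence_ms=1600))  # prints "ask-llm"
```

Under this reading, lowering `humanSilenceThresholdMs` makes a quick "Hello?" settle sooner, while raising `machineSilenceThresholdMs` delays the verdict on long greetings.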

## Additional resources

- **[AMD example (Python)](https://github.com/livekit/agents/blob/main/examples/telephony/amd.py)**: Outbound voice agent that runs AMD before responding and branches on the classification result.

- **[AMD example (Node.js)](https://github.com/livekit/agents-js/blob/main/examples/src/telephony_amd.ts)**: Outbound voice agent that runs AMD before responding and branches on the classification result.

- **[DTMF and IVR navigation](https://docs.livekit.io/telephony/features/dtmf.md)**: Send and receive DTMF tones, and navigate IVR systems after AMD detection.

- **[Outbound calls](https://docs.livekit.io/telephony/making-calls/outbound-calls.md)**: Create SIP participants and place outbound calls that AMD can classify.

---

This document was rendered at 2026-06-07T11:36:45.229Z.
For the latest version of this document, see [https://docs.livekit.io/telephony/features/answering-machine-detection.md](https://docs.livekit.io/telephony/features/answering-machine-detection.md).
