TurnHandlingOptions
The turn_handling parameter accepts a TurnHandlingOptions object (or plain object) that controls turn detection, endpointing, and interruption behavior in an agent session. Pass it to the AgentSession constructor.
Usage
The following example creates an AgentSession with turn detection set to VAD and custom endpointing and interruption handling settings:
```python
from livekit.agents import AgentSession, TurnHandlingOptions

session = AgentSession(
    turn_handling=TurnHandlingOptions(
        turn_detection="vad",
        endpointing={
            "mode": "fixed",
            "min_delay": 0.5,
            "max_delay": 3.0,
        },
        interruption={
            "mode": "adaptive",
            "min_duration": 0.5,
            "resume_false_interruption": True,
        },
    ),
    # ... other parameters
)
```
```typescript
const session = new voice.AgentSession({
  turnHandling: {
    turnDetection: 'vad',
    endpointing: {
      minDelay: 500,
      maxDelay: 3000,
    },
    interruption: {
      mode: 'adaptive',
      minDuration: 500,
      resumeFalseInterruption: true,
    },
  },
  // ... other parameters
});
```
Parameters
The following parameters are available in the TurnHandlingOptions object (the turn_handling argument):
turn_detection (TurnDetectionMode | None, optional): Strategy for deciding when the user has finished speaking.
Options:
"stt"- Rely on speech-to-text end-of-utterance cues."vad"- Rely on Voice Activity Detection (VAD) start and stop cues."realtime_llm"- Use server-side detection from a realtime LLM."manual"- Control turn boundaries explicitly.TurnDetectorinstance - Plug-in custom detector (for example,MultilingualModel()).
If this parameter is omitted, the session chooses the best available mode in priority order (realtime_llm → vad → stt → manual), automatically falling back to the next available mode if the necessary model is missing. See Turns overview for mode descriptions and fallback behavior.
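As an illustration of that fallback order (a sketch only, not the library's actual internals; resolve_turn_detection is a hypothetical name), the selection logic behaves like this:

```python
# Hypothetical sketch of the turn-detection fallback order described
# above; not the LiveKit SDK's actual implementation.
PRIORITY = ["realtime_llm", "vad", "stt", "manual"]

def resolve_turn_detection(available: set[str]) -> str:
    """Pick the highest-priority mode whose model is available."""
    for mode in PRIORITY:
        if mode in available:
            return mode
    return "manual"  # manual needs no model, so it always works

# A session with only an STT model skips realtime_llm and vad:
print(resolve_turn_detection({"stt"}))  # -> stt
```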
endpointing (EndpointingOptions, optional): Options for endpointing behavior. See EndpointingOptions for details.
interruption (InterruptionOptions, optional): Options for interruption handling. See InterruptionOptions for details.
EndpointingOptions
Options for endpointing behavior, which determines timing thresholds for turn completion. With fixed endpointing (the default), the agent always uses the configured min_delay and max_delay.
For context and configuration in a session, see Endpointing configuration in the turns overview.
Dynamic endpointing
When you use dynamic endpointing, the agent adapts the delay within the min_delay and max_delay range based on session pause statistics. This can result in a more responsive turn-taking experience over time.
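The exact adaptation logic is internal to the library, but as a rough sketch (adapted_delay and alpha are illustrative names, not LiveKit API), an exponential moving average of observed pauses clamped to the configured range behaves like this:

```python
# Hedged sketch of dynamic endpointing: an exponential moving average
# (EMA) of observed pauses, clamped to [min_delay, max_delay]. The real
# implementation may weight between-utterance and between-turn pauses
# differently.
def adapted_delay(pauses: list[float], min_delay: float = 0.5,
                  max_delay: float = 3.0, alpha: float = 0.3) -> float:
    ema = min_delay  # start from the configured floor
    for pause in pauses:
        ema = alpha * pause + (1 - alpha) * ema
    return max(min_delay, min(ema, max_delay))  # clamp to the range

# Fast talkers keep the delay at the floor; long pausers push it toward
# the ceiling, but never beyond it:
adapted_delay([0.2, 0.3, 0.2])  # -> 0.5 (clamped at min_delay)
adapted_delay([5.0, 6.0, 5.5])  # -> 3.0 (clamped at max_delay)
```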
Usage
The following examples create turn handling options with custom endpointing settings; pass the result to the turn_handling parameter of AgentSession. The Python example enables dynamic endpointing, while the Node.js example uses fixed endpointing:
turn_handling = {"endpointing": {"mode": "dynamic","min_delay": 0.5,"max_delay": 3.0,},}
```typescript
const turnHandling = {
  endpointing: {
    mode: 'fixed',
    minDelay: 500,
    maxDelay: 3000,
  },
};
```
Parameters
The following parameters are available in the EndpointingOptions object:
mode (Literal['dynamic', 'fixed'], optional, default: "fixed"): Endpointing timing behavior. The endpointing delay is the time the agent waits before ending the user's turn.
"fixed"- Use the configuredmin_delayandmax_delayvalues to determine the endpointing delay.- Dynamic endpointing only Available inPython
"dynamic"- Adapt the delay within themin_delayandmax_delayrange based on session pause statistics (exponential moving average of between-utterance and between-turn pauses). Suits most conversations.
min_delay (float, optional, default: 0.5 seconds): Minimum time (in seconds) to wait after the last detected speech before declaring the user's turn complete.
With dynamic endpointing (Python only), this is the lower bound. The agent might use a longer effective delay when session pause statistics suggest slower turn-taking.
- In VAD mode, this effectively behaves like max(VAD silence, min_delay).
- In STT mode, this is applied after the STT end-of-speech signal, and is therefore added on top of the STT provider's endpointing delay.
max_delay (float, optional, default: 3.0 seconds): Maximum time (in seconds) the agent waits before ending the turn. This prevents the agent from waiting indefinitely for the user to continue speaking.
With dynamic endpointing (Python only), this is the upper bound. The agent might use a shorter effective delay when session pause statistics suggest faster turn-taking.
In Node.js, minDelay and maxDelay are in milliseconds (for example, 500 and 3000); in Python, min_delay and max_delay are in seconds (for example, 0.5 and 3.0).
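The unit and naming mapping between the two SDKs can be sketched with a small helper (to_node_endpointing is a hypothetical name, not part of either SDK):

```python
# Hypothetical helper that converts a Python-style endpointing config
# (seconds, snake_case) into the Node.js shape (milliseconds, camelCase)
# for side-by-side comparison. Not part of the LiveKit SDK.
def to_node_endpointing(py: dict) -> dict:
    out = {}
    if "mode" in py:
        out["mode"] = py["mode"]
    for py_key, node_key in (("min_delay", "minDelay"), ("max_delay", "maxDelay")):
        if py_key in py:
            out[node_key] = int(py[py_key] * 1000)  # seconds -> milliseconds
    return out

to_node_endpointing({"mode": "fixed", "min_delay": 0.5, "max_delay": 3.0})
# -> {'mode': 'fixed', 'minDelay': 500, 'maxDelay': 3000}
```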
InterruptionOptions
Options for interruption handling, including adaptive detection and false interruption recovery.
Usage
The following example creates turn handling options with adaptive interruption handling. Pass it to the turn_handling parameter of AgentSession:
turn_handling = {"interruption": {"mode": "adaptive","min_duration": 0.5,"min_words": 0,"discard_audio_if_uninterruptible": True,"false_interruption_timeout": 2.0,"resume_false_interruption": True,},}
```typescript
const turnHandling = {
  interruption: {
    mode: 'adaptive',
    minDuration: 500,
    minWords: 0,
    discardAudioIfUninterruptible: true,
    falseInterruptionTimeout: 2000,
    resumeFalseInterruption: true,
  },
};
```
To disable interruptions entirely, set enabled to false in the interruption options:
turn_handling = {"interruption": {"enabled": False},}
```typescript
const turnHandling = {
  interruption: {
    enabled: false,
  },
};
```
Parameters
The following parameters are available in the InterruptionOptions object:
enabled (bool, optional, default: True): When True, the agent can be interrupted by user speech. When False, interruptions are disabled entirely. Use {"enabled": False} to disable; the previous bool shorthand is no longer supported.
mode (Literal['adaptive', 'vad'], optional): Interruption detection strategy. Only applies when enabled is True.
Options:
"adaptive"- Use context-aware interruption detection (barge-in model). To learn more, see Adaptive interruption handling."vad"- Use VAD for interruption detection. See Interruption mode for when each mode applies.
If this parameter is omitted, the session uses "adaptive" when a turn detector model is configured with an STT that supports aligned transcripts. Otherwise, it falls back to "vad".
discard_audio_if_uninterruptible (bool, optional, default: True): When True, drop buffered audio while the agent is speaking and cannot be interrupted. This prevents audio buildup during uninterruptible speech. See Interruptions for context.
min_duration (float, optional, default: 0.5 seconds): Minimum duration of speech to be considered an interruption. Helps filter out brief sounds or noise that shouldn't trigger interruptions. Python uses seconds (for example, 0.5); Node.js uses milliseconds (for example, 500).
min_words (int, optional, default: 0): Minimum number of words to be considered an interruption. Only used if STT is enabled. Set to a value greater than 0 to require actual speech content before triggering interruptions.
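Taken together, the two thresholds gate what counts as an interruption: speech must clear both before the agent is interrupted. A hedged sketch of that gating (is_interruption is an illustrative name, not LiveKit API):

```python
# Hypothetical sketch of how min_duration and min_words could gate an
# interruption; not the LiveKit SDK's actual implementation.
def is_interruption(speech_duration: float, word_count: int,
                    min_duration: float = 0.5, min_words: int = 0,
                    stt_enabled: bool = True) -> bool:
    if speech_duration < min_duration:
        return False  # too short: likely a cough or background noise
    if stt_enabled and word_count < min_words:
        return False  # not enough transcribed speech content
    return True

is_interruption(0.2, 3)               # False: under min_duration
is_interruption(0.8, 0, min_words=2)  # False: too few words
is_interruption(0.8, 3, min_words=2)  # True: clears both thresholds
```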
false_interruption_timeout (float | None, optional, default: 2.0 seconds): Amount of time (in seconds) to wait after an interruption before emitting an agent_false_interruption event if the user is silent and no user transcript is detected.
Set to None to disable false interruption detection. When disabled, all interruptions are treated as intentional. To learn more, see False interruptions. In Node.js, use milliseconds.
resume_false_interruption (bool, optional, default: True): Whether to resume the agent's speech after a false interruption is detected. When True, the agent continues speaking from where it left off after the false_interruption_timeout period has passed with no user transcription. To learn more, see False interruptions.