Amazon Nova Sonic integration guide

Available in: Python

Overview

Amazon Nova Sonic is a speech-to-speech model with a bidirectional audio streaming API. It processes and responds to speech in realtime, which enables natural, human-like conversational experiences. LiveKit's AWS plugin supports both Nova Sonic 1.0 and Nova Sonic 2.0 on Amazon Bedrock.

Installation

Install the AWS plugin from PyPI with the realtime extra:

uv add "livekit-plugins-aws[realtime]"

Authentication

The AWS plugin requires AWS credentials. Set the following environment variables in your .env file:

AWS_ACCESS_KEY_ID=<your-aws-access-key-id>
AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key>
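
A missing credential can be easier to diagnose if it is caught before the agent starts. The following is a minimal sketch of such a check, not part of the plugin: `missing_aws_credentials` is a hypothetical helper, and it assumes the variables from your .env file have already been loaded into the process environment (for example, by a dotenv loader).

```python
import os

# The two variables this guide asks you to set.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(env=None):
    """Hypothetical helper: return required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_credentials()
    if missing:
        print("Missing AWS credentials:", ", ".join(missing))
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to test.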

Usage

Use the Nova Sonic API within an AgentSession. For example, you can use it in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import aws

# Nova Sonic 2.0 (default) - supports text input and 16 voices
session = AgentSession(
    llm=aws.realtime.RealtimeModel(),
)

# Or explicitly specify version
session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
        voice="tiffany",
        turn_detection="MEDIUM"
    ),
)

# Nova Sonic 1.0 - audio-only, 11 voices
session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_1(
        voice="matthew"
    ),
)

Parameters

This section describes some of the available parameters. For a complete reference of all available parameters, see the plugin reference.

model (string, optional, default: "amazon.nova-2-sonic-v1:0")

Bedrock model ID for realtime inference. Use "amazon.nova-sonic-v1:0" for Nova Sonic 1.0.

modalities (string, optional, default: "mixed")

Input/output mode. Valid values are:

"mixed" enables audio and text input (Nova Sonic 2.0 only).
"audio" enables audio-only (Nova Sonic 1.0).
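
The model ID and the modality go together: each Nova Sonic version has its own pair. One way to keep them consistent is a small lookup table, sketched below. `NOVA_SONIC_OPTIONS` and `nova_sonic_options` are hypothetical names used for illustration only; the values are the ones listed in this guide.

```python
# Model ID and modality for each Nova Sonic version, as listed in this guide.
NOVA_SONIC_OPTIONS = {
    "2.0": {"model": "amazon.nova-2-sonic-v1:0", "modalities": "mixed"},
    "1.0": {"model": "amazon.nova-sonic-v1:0", "modalities": "audio"},
}


def nova_sonic_options(version: str = "2.0") -> dict:
    """Hypothetical helper: return a copy of the model/modalities pair for a version."""
    try:
        return dict(NOVA_SONIC_OPTIONS[version])
    except KeyError:
        raise ValueError(f"unsupported Nova Sonic version: {version!r}") from None
```

Assuming the constructor accepts both keywords as the parameter list suggests, the result could be unpacked as `aws.realtime.RealtimeModel(**nova_sonic_options("1.0"))`. The `with_nova_sonic_1()` and `with_nova_sonic_2()` helpers shown in the usage section cover the same pairing.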

voice (string, optional, default: "tiffany")

Name of the Nova Sonic API voice. Nova Sonic 2.0 supports 16 voices across 8 languages. Nova Sonic 1.0 supports 11 voices. For a full list, see Voices.

turn_detection (string, optional, default: "MEDIUM")

Turn-taking sensitivity. Valid values are "HIGH", "MEDIUM", and "LOW". To learn more, see Turn detection sensitivity.

region (string, optional, default: "us-east-1")

AWS region of the Bedrock runtime endpoint.

generate_reply_timeout (float, optional, default: 10.0)

Timeout in seconds for generate_reply() calls. This is the maximum time to wait before aborting the request.

Turn detection sensitivity

Nova Sonic includes built-in VAD-based turn detection with configurable sensitivity:

"HIGH": Fast responses, detects pauses quickly. Might interrupt slower speakers.
"MEDIUM": Balanced turn-taking. Recommended for most use cases.
"LOW": Patient, waits longer before responding. Best for thoughtful or hesitant speakers.

Configure turn detection when creating the model:

session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
        turn_detection="MEDIUM"
    ),
)
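
If the sensitivity comes from configuration or user input, it may help to validate it before building the model. The sketch below assumes only the three values documented above; `normalize_turn_detection` is a hypothetical helper, and whether the plugin itself accepts lowercase values is not covered here.

```python
# The three sensitivity values documented in this guide.
VALID_TURN_DETECTION = ("HIGH", "MEDIUM", "LOW")


def normalize_turn_detection(value: str, default: str = "MEDIUM") -> str:
    """Hypothetical helper: map free-form text to a documented sensitivity value."""
    # Fall back to the default when nothing was supplied, then normalize case.
    candidate = (value or default).strip().upper()
    if candidate not in VALID_TURN_DETECTION:
        raise ValueError(
            f"turn_detection must be one of {VALID_TURN_DETECTION}, got {value!r}"
        )
    return candidate
```

The returned string could then be passed as the `turn_detection` argument shown above.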

Text input with Nova Sonic 2.0

Nova Sonic 2.0 supports text input via the generate_reply() method, enabling programmatic agent responses. This is useful for having the agent speak first, providing instructions, or simulating user input. Text input requires modalities="mixed", which is the default.

Parameters

instructions (string, optional)

Instructions for the agent to use for the reply. This is sent as a system prompt to guide the model's response.

user_input (string, optional)

User input to respond to. This is sent as a user message to the model and added to Nova Sonic's conversation context.

Examples

Make the agent greet users when they join:

from livekit.agents import Agent

class Assistant(Agent):
    async def on_enter(self):
        await self.session.generate_reply(
            instructions="Greet the user warmly and introduce your capabilities"
        )

Send instructions to the agent to generate a response:

await session.generate_reply(
    instructions="Greet the user and ask how you can help"
)

Send user input to the agent to generate a response:

await session.generate_reply(
    user_input="Hello, I need help with my account"
)

Timeout configuration

Configure a timeout for generate_reply() calls:

session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
        generate_reply_timeout=15.0  # seconds
    ),
)
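
To illustrate what a timeout of this kind does, the sketch below applies a deadline to an arbitrary awaitable with `asyncio.wait_for`. It is an application-level analogy and not the plugin's implementation: `with_deadline` and `slow_reply` are hypothetical names, and `slow_reply` only simulates a slow reply with a sleep.

```python
import asyncio


async def with_deadline(awaitable, timeout_s: float, fallback=None):
    """Await `awaitable`; return `fallback` if it takes longer than `timeout_s` seconds."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the pending awaitable before raising.
        return fallback


async def slow_reply(delay_s: float) -> str:
    """Hypothetical stand-in for a reply that takes `delay_s` seconds to produce."""
    await asyncio.sleep(delay_s)
    return "reply"


if __name__ == "__main__":
    print(asyncio.run(with_deadline(slow_reply(0.01), timeout_s=1.0)))
    print(asyncio.run(with_deadline(slow_reply(1.0), timeout_s=0.01, fallback="timed out")))
```

A longer timeout tolerates slow responses at the cost of a longer wait before a stalled request is abandoned.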

Additional resources

The following resources provide more information about using Nova Sonic with LiveKit Agents.

Python package

The livekit-plugins-aws package on PyPI.

Plugin reference

Reference for the Nova Sonic integration.

GitHub repo

View the source or contribute to the LiveKit AWS plugin.

Nova Sonic docs

Nova Sonic API documentation.

Voice AI quickstart

Get started with LiveKit Agents and Amazon Nova Sonic.

Joke Teller example

Full example demonstrating Nova Sonic 2.0 features including text prompting, multilingual support, and function calling.

AWS AI ecosystem guide

Overview of the entire AWS AI and LiveKit Agents integration.