Amazon Nova Sonic integration guide

How to use the Amazon Nova Sonic model with LiveKit Agents.

Available in: Python

Overview

Amazon Nova Sonic is a state-of-the-art speech-to-speech model with a bidirectional audio streaming API. Nova Sonic processes and responds to speech in real time, enabling natural, human-like conversational experiences. LiveKit's AWS plugin supports both Nova Sonic 1.0 and Nova Sonic 2.0 on Amazon Bedrock.

Quick reference

This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.

Installation

Install the AWS plugin from PyPI with the realtime extra:

uv add "livekit-plugins-aws[realtime]"

Authentication

The AWS plugin requires AWS credentials. Set the following environment variables in your .env file:

AWS_ACCESS_KEY_ID=<your-aws-access-key-id>
AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key>
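
A fail-fast check can surface missing credentials before the agent starts. The following is a minimal sketch, assuming the two variables above are the only ones your setup requires and that the .env file has already been loaded into the process environment (for example with python-dotenv). The helper name missing_aws_credentials is hypothetical and not part of the plugin.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the Authentication section above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; it only inspects the mapping
    it is given (os.environ by default) and never contacts AWS.
    """
    source = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not source.get(name, "").strip()]
```

Calling missing_aws_credentials() at startup and exiting with a clear message when the list is non-empty tends to be easier to debug than an authentication failure raised mid-session.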

Usage

Use the Nova Sonic API within an AgentSession. For example, you can use it in the Voice AI quickstart.

from livekit.agents import AgentSession
from livekit.plugins import aws

# Nova Sonic 2.0 (default) - supports text input and 16 voices
session = AgentSession(
    llm=aws.realtime.RealtimeModel(),
)

# Or explicitly specify the version
session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
        voice="tiffany",
        turn_detection="MEDIUM",
    ),
)

# Nova Sonic 1.0 - audio-only, 11 voices
session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_1(
        voice="matthew",
    ),
)

Parameters

This section describes some of the available parameters. For a complete reference of all available parameters, see the plugin reference.

model (string, optional, default: "amazon.nova-2-sonic-v1:0")

Bedrock model ID for realtime inference. Use "amazon.nova-sonic-v1:0" for Nova Sonic 1.0.

modalities (string, optional, default: "mixed")

Input/output mode. Valid values are:

  • "mixed" enables audio and text input (Nova Sonic 2.0 only).
  • "audio" enables audio-only mode (Nova Sonic 1.0).

voice (string, optional, default: "tiffany")

Name of the Nova Sonic API voice. Nova Sonic 2.0 supports 16 voices across 8 languages. Nova Sonic 1.0 supports 11 voices. For a full list, see Voices.

turn_detection (string, optional, default: "MEDIUM")

Turn-taking sensitivity. Valid values are "HIGH", "MEDIUM", and "LOW". To learn more, see Turn detection sensitivity.

region (string, optional, default: "us-east-1")

AWS region of the Bedrock runtime endpoint.

generate_reply_timeout (float, optional, default: 10.0)

Timeout in seconds for generate_reply() calls. This is the maximum time to wait before aborting the request.
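
The defaults listed above can be collected into a plain dictionary and overridden per deployment. The sketch below only builds keyword arguments and does not import the plugin. nova_sonic_kwargs is a hypothetical helper, and passing its result to aws.realtime.RealtimeModel(**kwargs) assumes the constructor accepts the parameter names exactly as listed in this section.

```python
from typing import Any, Dict

# Defaults copied from the parameter list above.
DEFAULTS: Dict[str, Any] = {
    "model": "amazon.nova-2-sonic-v1:0",
    "modalities": "mixed",
    "voice": "tiffany",
    "turn_detection": "MEDIUM",
    "region": "us-east-1",
    "generate_reply_timeout": 10.0,
}

NOVA_SONIC_1_MODEL = "amazon.nova-sonic-v1:0"


def nova_sonic_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults (hypothetical helper)."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown parameters: {unknown}")

    kwargs = {**DEFAULTS, **overrides}

    # "mixed" is documented as Nova Sonic 2.0 only, so fall back to
    # "audio" for Nova Sonic 1.0 and reject an explicit "mixed".
    if kwargs["model"] == NOVA_SONIC_1_MODEL:
        if overrides.get("modalities") == "mixed":
            raise ValueError('"mixed" requires Nova Sonic 2.0')
        kwargs["modalities"] = "audio"
    return kwargs
```

For example, nova_sonic_kwargs(model=NOVA_SONIC_1_MODEL, voice="matthew") keeps the remaining defaults and switches modalities to "audio".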

Turn detection sensitivity

Nova Sonic includes built-in VAD-based turn detection with configurable sensitivity:

  • "HIGH": Fast responses, detects pauses quickly. Might interrupt slower speakers.
  • "MEDIUM": Balanced turn-taking. Recommended for most use cases.
  • "LOW": Patient, waits longer before responding. Best for thoughtful or hesitant speakers.

Configure turn detection when creating the model:

session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
        turn_detection="MEDIUM",
    ),
)
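
If the sensitivity comes from a config file or user setting, it may help to normalize and validate it before constructing the model. This is a minimal sketch, assuming the three values listed above are the only valid ones; normalize_turn_detection is a hypothetical helper, not a plugin function.

```python
from typing import Optional

# Valid values as listed in this section.
VALID_TURN_DETECTION = ("HIGH", "MEDIUM", "LOW")


def normalize_turn_detection(value: Optional[str], default: str = "MEDIUM") -> str:
    """Upper-case and validate a sensitivity value (hypothetical helper).

    Missing or blank input falls back to the default; anything else
    that is not a listed value raises ValueError.
    """
    if value is None or not value.strip():
        return default
    normalized = value.strip().upper()
    if normalized not in VALID_TURN_DETECTION:
        raise ValueError(
            f"turn_detection must be one of {VALID_TURN_DETECTION}, got {value!r}"
        )
    return normalized
```

The returned string can then be passed as the turn_detection argument shown above.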

Text input with Nova Sonic 2.0

Nova Sonic 2.0 supports text input via the generate_reply() method, enabling programmatic agent responses. This is useful for having the agent speak first, providing instructions, or simulating user input. Use modalities="mixed" to enable text input.

Parameters

instructions (string, optional)

Instructions for the agent to use for the reply. This is sent as a system prompt to guide the model's response.

user_input (string, optional)

User input to respond to. This is sent as a user message to the model and added to Nova's conversation context.

Examples

Make the agent greet users when they join:

from livekit.agents import Agent


class Assistant(Agent):
    async def on_enter(self):
        await self.session.generate_reply(
            instructions="Greet the user warmly and introduce your capabilities"
        )

Send instructions to the agent to generate a response:

await session.generate_reply(
    instructions="Greet the user and ask how you can help"
)

Send user input to the agent to generate a response:

await session.generate_reply(
    user_input="Hello, I need help with my account"
)

Timeout configuration

Configure a timeout for generate_reply() calls:

session = AgentSession(
    llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
        generate_reply_timeout=15.0,  # seconds
    ),
)
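
To illustrate what a timeout of this kind means for calling code, the sketch below wraps a stand-in coroutine with asyncio.wait_for from the standard library. It does not use the plugin and is not a description of how the plugin enforces generate_reply_timeout internally; slow_reply and reply_or_fallback are hypothetical names.

```python
import asyncio


async def slow_reply(delay: float) -> str:
    """Stand-in for an awaited reply; not a LiveKit API."""
    await asyncio.sleep(delay)
    return "reply"


async def reply_or_fallback(delay: float, timeout: float) -> str:
    """Wait up to `timeout` seconds, then give up and return a fallback."""
    try:
        return await asyncio.wait_for(slow_reply(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # The pending coroutine is cancelled by wait_for before this runs.
        return "fallback"
```

A larger timeout tolerates slower responses at the cost of a longer wait before the caller learns that a request has failed.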

Additional resources

The following resources provide more information about using Nova Sonic with LiveKit Agents.