# Modality-aware instructions

> Give your agent different system prompts for voice and text input.

## Overview

A single agent can serve both voice and text users in the same session, but the two input types often benefit from different instructions. Spoken input arrives as imperfect transcription and can contain relative expressions (for example, "next Tuesday"), self-corrections, and filler words, so the LLM might need additional guidance to interpret it correctly. Typed input is usually more precise and literal, so applying that voice-oriented guidance to text turns can degrade responses by adding spoken-style confirmations or stripping useful formatting.

The `Instructions` class holds two variants of your system prompt, one for `audio` and one for `text`. The framework applies the variant that matches each turn's input modality before calling the LLM, so voice turns get the audio prompt and text turns get the text prompt automatically.

> ℹ️ **Beta in Python**
> 
> In Python, `Instructions` is exported from `livekit.agents.beta` and is subject to change. In Node.js, it's a stable export of the main `llm` namespace (and also re-exported from `beta`).

## Define instructions per modality

Create an `Instructions` object with `audio` and `text` variants and pass it wherever you would pass an instructions string, such as the `Agent` constructor. The `text` variant is optional and falls back to the `audio` variant when omitted.

**Python**:

```python
from livekit.agents import Agent
from livekit.agents.beta import Instructions

instructions = Instructions(
    audio=(
        "You are a scheduling assistant. The user is speaking, so their input may be "
        "imperfect. Resolve spoken expressions like 'next Tuesday' to concrete dates, "
        "honor verbal self-corrections, and confirm the date and time out loud before booking."
    ),
    text=(
        "You are a scheduling assistant. The user is typing, so take their input literally. "
        "Accept exact dates and times in any common format and skip verbal confirmations."
    ),
)


class SchedulingAgent(Agent):
    def __init__(self) -> None:
        super().__init__(instructions=instructions)

```

---

**Node.js**:

```typescript
import { llm, voice } from '@livekit/agents';

const instructions = new llm.Instructions({
  audio:
    'You are a scheduling assistant. The user is speaking, so their input may be ' +
    "imperfect. Resolve spoken expressions like 'next Tuesday' to concrete dates, " +
    'honor verbal self-corrections, and confirm the date and time out loud before booking.',
  text:
    'You are a scheduling assistant. The user is typing, so take their input literally. ' +
    'Accept exact dates and times in any common format and skip verbal confirmations.',
});

class SchedulingAgent extends voice.Agent {
  constructor() {
    super({ instructions });
  }
}

```

## How variants are applied

During a session, the framework selects the variant that matches the input modality of each turn: the `audio` variant for spoken turns and the `text` variant for typed turns. Both variants are preserved across turns, so an agent that handles a voice turn followed by a text turn uses the correct prompt for each.
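The per-turn selection and the `text` fallback can be pictured with a small stand-alone model. This is only a sketch of the behavior described above, not the framework's code: `PromptVariants` and `resolve` are hypothetical names that do not exist in LiveKit.

```python
from dataclasses import dataclass
from typing import Literal, Optional

Modality = Literal["audio", "text"]


@dataclass(frozen=True)
class PromptVariants:
    """Hypothetical stand-in for an object that holds two prompt variants."""

    audio: str
    text: Optional[str] = None  # optional, mirroring the `text` variant

    def resolve(self, modality: Modality) -> str:
        # Text turns use the text variant when one was provided;
        # otherwise they fall back to the audio variant.
        if modality == "text" and self.text is not None:
            return self.text
        return self.audio


prompts = PromptVariants(
    audio="Confirm the date out loud before booking.",
    text="Take typed dates literally.",
)

# A voice turn followed by a text turn: each turn gets its own prompt.
selected = [prompts.resolve(m) for m in ("audio", "text")]

# With no text variant, a text turn falls back to the audio prompt.
audio_only = PromptVariants(audio="Confirm the date out loud before booking.")
fallback = audio_only.resolve("text")
```

Because the object is never mutated when a variant is selected, the same instance can serve any sequence of voice and text turns.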

## Select the active variant

When you [generate a reply manually](https://docs.livekit.io/agents/multimodality/audio.md#generate_reply), specify the variant with the `input_modality` (Python) or `inputModality` (Node.js) parameter:

**Python**:

```python
# Use the audio variant for this reply
session.generate_reply(input_modality="audio")

```

---

**Node.js**:

```typescript
// Use the audio variant for this reply
session.generateReply({ inputModality: 'audio' });

```

To explicitly set the active variant, use `as_modality` (Python) or `asModality` (Node.js). This returns a copy of the instructions with the selected variant active. Both variants are preserved, so you can switch between them as needed.

**Python**:

```python
# Return a copy whose active value is the text variant
text_first = instructions.as_modality("text")

```

---

**Node.js**:

```typescript
// Return a copy whose active value is the text variant
const textFirst = instructions.asModality('text');

```
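To make the "copy with both variants preserved" behavior concrete, here is a toy model built on a `str` subclass. It is an illustration under stated assumptions, not the LiveKit implementation: `ModalPrompt` is a hypothetical class, and its choice of `audio` as the initial active variant is arbitrary.

```python
from typing import Optional


class ModalPrompt(str):
    """Toy model only; not the LiveKit `Instructions` class."""

    audio: str
    text: str

    def __new__(cls, audio: str, text: Optional[str] = None, active: str = "audio"):
        text_value = audio if text is None else text
        # The string value of the object is whichever variant is active.
        obj = super().__new__(cls, audio if active == "audio" else text_value)
        obj.audio = audio
        obj.text = text_value
        return obj

    def as_modality(self, modality: str) -> "ModalPrompt":
        # Build a new object instead of mutating, so the original is untouched.
        return ModalPrompt(self.audio, self.text, active=modality)


voice_first = ModalPrompt(audio="Confirm out loud.", text="Skip confirmations.")
text_first = voice_first.as_modality("text")

# The copy reads as the text prompt but still carries the audio variant,
# so switching back loses nothing.
back_to_audio = text_first.as_modality("audio")
```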

## Compose instructions

You can build instructions from reusable pieces while keeping both variants intact. A shared base prompt can be combined with modality-specific guidance using concatenation and templating.

**Python**:

In Python, `Instructions` subclasses `str`. Use `+` to concatenate and `format` to substitute values. Both handle each variant separately:

```python
from livekit.agents.beta import Instructions

base = Instructions(
    audio="You are Alex, a scheduling assistant.\n{modality_specific}",
    text="You are Alex, a scheduling assistant.\n{modality_specific}",
)

modality_specific = Instructions(
    audio="Resolve spoken dates and confirm out loud.",
    text="Accept literal dates and skip confirmations.",
)

# `format` applies to both variants at once
instructions = base.format(modality_specific=modality_specific)

# `+` also works and preserves both variants
instructions = instructions + "\nThe current date is 2026-05-29."

```

---

**Node.js**:

In Node.js, use the `Instructions.tpl` tagged template to compose with template literals, or `concatInstructions` to join a mix of strings and `Instructions`. Both handle each variant separately:

```typescript
import { llm } from '@livekit/agents';

const modalitySpecific = new llm.Instructions({
  audio: 'Resolve spoken dates and confirm out loud.',
  text: 'Accept literal dates and skip confirmations.',
});

// `tpl` interpolates each variant from any embedded Instructions
const instructions = llm.Instructions.tpl`You are Alex, a scheduling assistant.
${modalitySpecific}
The current date is 2026-05-29.`;

// `concatInstructions` joins strings and Instructions, preserving both variants
const combined = llm.concatInstructions('Base prompt. ', modalitySpecific);

```
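The idea that composition "handles each variant separately" can be sketched without the SDK. The toy class below is a hypothetical model written for this illustration (`ComposablePrompt` is not a LiveKit name, and the real classes may work differently): it applies `+` and `format` once per variant, so a modality-aware value embedded in a template contributes its matching variant to each side.

```python
from typing import Any, Optional


class ComposablePrompt(str):
    """Toy model only; not the LiveKit `Instructions` class."""

    def __new__(cls, audio: str, text: Optional[str] = None):
        obj = super().__new__(cls, audio)
        obj.audio = audio
        obj.text = audio if text is None else text
        return obj

    def __add__(self, other: Any) -> "ComposablePrompt":
        # A plain string is appended to both variants; a modality-aware
        # value contributes its matching variant to each side.
        other_audio = getattr(other, "audio", str(other))
        other_text = getattr(other, "text", str(other))
        return ComposablePrompt(self.audio + other_audio, self.text + other_text)

    def format(self, *args: Any, **kwargs: Any) -> "ComposablePrompt":
        def pick(value: Any, modality: str) -> Any:
            # Use the matching variant when the value has one.
            return getattr(value, modality, value)

        def render(template: str, modality: str) -> str:
            return template.format(
                *[pick(a, modality) for a in args],
                **{k: pick(v, modality) for k, v in kwargs.items()},
            )

        return ComposablePrompt(render(self.audio, "audio"), render(self.text, "text"))


base = ComposablePrompt("You are Alex.\n{extra}")
extra = ComposablePrompt(audio="Confirm out loud.", text="Skip confirmations.")

combined = base.format(extra=extra) + "\nToday is 2026-05-29."
```

In this model, `combined.audio` ends up with the spoken guidance and `combined.text` with the typed guidance, while the shared prefix and the appended date line appear in both.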

## Customize built-in tasks

Available in Python only (beta). Not available in Node.js.

[Prebuilt tasks](https://docs.livekit.io/agents/prebuilt/tasks.md) ship with their own default prompts. The beta `InstructionParts` type lets you customize those prompts without rewriting them. Set `persona` to change the agent's identity and `extra` to append domain-specific context. Leave a field unset to keep the task's built-in default, or set it to an empty string to remove that section entirely. Each field accepts a plain string or an `Instructions` object, so customizations can themselves be modality-aware.

To apply a customization, pass an `InstructionParts` object as a task's `instructions` argument:

```python
from livekit.agents.beta import Instructions
from livekit.agents.beta.workflows import GetEmailTask, InstructionParts

task = GetEmailTask(
    instructions=InstructionParts(
        persona="You are Riley, a friendly intake assistant collecting a contact email.",
        # `extra` is itself modality-aware: confirm out loud for voice, stay quiet for text
        extra=Instructions(
            audio="Confirm the spelling out loud, letter by letter, for unusual domains.",
            text="Accept the email exactly as typed; only re-prompt if it's clearly malformed.",
        ),
    )
)

```

For a complete example that runs the task inside a function tool, see the [email registration example](https://github.com/livekit/agents/blob/main/examples/voice_agents/email_example.py).
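The difference between leaving a field unset and setting it to an empty string can be sketched with a small model. Everything here is hypothetical and exists only for illustration: `Parts`, `DEFAULTS`, and `build_prompt` are not LiveKit names, and the default texts are invented placeholders rather than any task's real prompt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parts:
    """Hypothetical stand-in for a set of optional prompt sections."""

    persona: Optional[str] = None  # None means "keep the built-in default"
    extra: Optional[str] = None


# Invented placeholder defaults, not taken from any real task.
DEFAULTS = {
    "persona": "You are a helpful assistant.",
    "extra": "Ask for one detail at a time.",
}


def build_prompt(parts: Parts) -> str:
    sections = []
    for name in ("persona", "extra"):
        override = getattr(parts, name)
        value = DEFAULTS[name] if override is None else override
        if value:  # an empty string drops the section entirely
            sections.append(value)
    return "\n".join(sections)


keep_defaults = build_prompt(Parts())
new_persona = build_prompt(Parts(persona="You are Riley."))
no_extra = build_prompt(Parts(extra=""))
```

The key design point is that `None` and `""` carry different meanings, so a caller can remove a section without having to restate the others.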

## Additional resources

The following complete, runnable example agents set different instructions for voice and text users:

- **[Per-modality instructions (Python)](https://github.com/livekit/agents/blob/main/examples/voice_agents/instructions_per_modality.py)**: A scheduling assistant with separate audio and text prompts, built with the Python SDK.

- **[Per-modality instructions (Node.js)](https://github.com/livekit/agents-js/blob/main/examples/src/instructions_per_modality.ts)**: A scheduling assistant with separate audio and text prompts, built with the Node.js SDK.

---

This document was rendered at 2026-06-07T11:34:13.593Z.
For the latest version of this document, see [https://docs.livekit.io/agents/multimodality/instructions.md](https://docs.livekit.io/agents/multimodality/instructions.md).