Modifying LLM output before TTS

How to modify LLM output before sending the text to TTS for vocalization.

In this recipe, you build an agent that speaks its chain-of-thought reasoning aloud while avoiding the vocalization of the <think> and </think> tokens themselves. The steps focus on cleaning up the text just before it's sent to the TTS engine so the agent sounds natural.

Prerequisites

To complete this guide, you need to create an agent using the Voice agent quickstart.

Modifying LLM output before TTS

The following callback removes the <think> and </think> tags from the agent's output so the TTS engine doesn't speak them; in the streaming case, it also swaps the closing tag for a short transition phrase. Here's how you can define it:

from typing import AsyncIterable

from livekit.agents.pipeline import VoicePipelineAgent


async def _before_tts_cb(agent: VoicePipelineAgent, text: str | AsyncIterable[str]):
    if isinstance(text, str):
        # Handle non-streaming text
        result = text.replace("<think>", "").replace("</think>", "")
        return result
    else:
        # Handle streaming text
        async def process_stream():
            async for chunk in text:
                processed = chunk.replace("<think>", "").replace(
                    "</think>", "Okay, I'm ready to respond."
                )
                yield processed

        return process_stream()
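
To sanity-check the callback outside of a live session, you can call it directly. This is just a quick sketch: the agent argument isn't used by the callback's logic, so None is passed purely for illustration.

import asyncio


async def main():
    raw = "<think>The user greeted me.</think>Hello! How can I help today?"
    cleaned = await _before_tts_cb(None, raw)
    print(cleaned)  # "The user greeted me.Hello! How can I help today?"


asyncio.run(main())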

The callback receives two parameters:

  • agent: The VoicePipelineAgent instance.
  • text: Either a string (for non-streaming) or an AsyncIterable of strings (for streaming), as exercised in the sketch below.
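
Because the LLM response is normally streamed, the AsyncIterable branch is the one exercised in practice. Here is a minimal sketch of how it behaves, using a hypothetical fake_llm_stream generator in place of real LLM output and None for the unused agent argument:

import asyncio
from typing import AsyncIterable


async def fake_llm_stream() -> AsyncIterable[str]:
    # Stand-in for streamed LLM output; the chunk boundaries are illustrative only.
    for chunk in ["<think>", "The user wants a greeting.", "</think>", "Hello there!"]:
        yield chunk


async def main():
    processed = await _before_tts_cb(None, fake_llm_stream())
    async for chunk in processed:
        print(repr(chunk))
    # Prints '', 'The user wants a greeting.', "Okay, I'm ready to respond.", 'Hello there!'


asyncio.run(main())

Note that this per-chunk replace assumes each tag arrives within a single chunk; if a tag can be split across streamed chunks, you would need to buffer text before replacing.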

Initializing the VoicePipelineAgent

Use the before_tts_cb parameter to pass in the callback when creating the VoicePipelineAgent instance. The following code also selects Groq's deepseek-r1-distill-llama-70b model via openai.LLM.with_groq.

from livekit.plugins import openai  # requires the livekit-plugins-openai package

agent = VoicePipelineAgent(
    vad=ctx.proc.userdata["vad"],
    stt=openai.STT(),
    llm=openai.LLM.with_groq(model="deepseek-r1-distill-llama-70b"),  # change the model here
    tts=openai.TTS(),
    before_tts_cb=_before_tts_cb,  # add the TTS callback here
    chat_ctx=initial_ctx,
)
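
The rest of the entrypoint is unchanged from the quickstart: connect to the room, wait for a participant, then start the agent. A minimal sketch of that last step, assuming ctx and participant come from the quickstart's entrypoint (participant via ctx.wait_for_participant()) and using an example greeting:

# Start the pipeline for the connected participant, then greet them.
agent.start(ctx.room, participant)

# allow_interruptions lets the user talk over the greeting.
await agent.say("Hey, how can I help you today?", allow_interruptions=True)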

How it works

  1. The LLM generates text, including tokens like <think>...</think>.
  2. Before the text goes to TTS, the before_tts_cb intercepts it.
  3. The callback strips or modifies unwanted tokens.
  4. The cleaned text is then spoken by the TTS engine.
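
For example, if the model streams <think>The user wants a friendly greeting.</think>Hello there!, the callback yields the reasoning text, then "Okay, I'm ready to respond.", and finally "Hello there!"; the tags themselves are never spoken.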