Use audio management to reduce voice latency and control how your agent sounds during key moments – greetings, transfer messages, and other frequently spoken phrases. Caching these responses means callers hear them faster and with consistent quality, instead of waiting for real-time TTS generation on every call. The Audio Management tab is found under Channels > Voice > Audio management.

Getting started

To manage your agent’s audio:
  1. Go to Channels > Voice > Audio management.
  2. Review all audio saved to the cache and monitor how often it has been used by the agent.
  3. Delete cached files, or upload new ones to overwrite existing audio.
When you edit a cached utterance, you can adjust the stability and clarity of the agent’s voice for that utterance only. For more information on these settings, visit the voice feature page. The edit tab also includes sync and play buttons so you can test changes to the utterance live in the edit panel.
Why am I only seeing a few cached audio files?
The audio cache stores a file only if the same TTS output is generated at least twice within a 24-hour window. This keeps cache size and performance manageable. An utterance that isn’t generated at least twice within that period is not kept in the cache, so it may appear to be missing. To keep key audio cached, either make sure the utterance is generated repeatedly or upload a static version manually.
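The caching rule above can be sketched in a few lines of Python. This is only an illustration of the "at least twice within 24 hours" condition, not the platform's implementation. The function name `is_cached` and the inclusive window boundary are assumptions.

```python
from __future__ import annotations

from datetime import datetime, timedelta

# The 24-hour window comes from the note above.
WINDOW = timedelta(hours=24)


def is_cached(generation_times: list[datetime]) -> bool:
    """Return True if the same utterance was generated at least twice within 24 hours.

    Hypothetical helper: the real cache logic is not documented here, and
    treating exactly 24 hours as "within the window" is an assumption.
    """
    times = sorted(generation_times)
    # Checking adjacent pairs is enough: if any two timestamps fall inside
    # the window, the closest adjacent pair between them does too.
    return any(later - earlier <= WINDOW for earlier, later in zip(times, times[1:]))
```

For example, a greeting generated at 09:00 and again at 12:00 on the same day would qualify, while one generated only once, or twice 30 hours apart, would not.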

Interaction style

Adjust response latency to balance speed and accuracy.

Interaction style settings

  1. Locate the Interaction style section on the Audio management page (Channels > Voice > Audio management).
  2. Choose from the available modes:
    • Turbo Mode – Fastest response time with high interruption tolerance
    • Balanced Mode – Moderate latency with good accuracy
    • Precise Mode – Longest latency for maximum accuracy
  3. Click on the bubble for your preferred mode. A brief description of the mode will appear.
  4. Save your settings to apply changes. Your agent will adjust its behavior immediately.

Performance characteristics

Each response mode is designed for specific performance needs:
Mode | Latency | Interruption tolerance | Best for
Turbo | 400 ms | High (enable barge-in) | Ultra-responsive agents; pair with barge-in to let callers reclaim control
Balanced | 1600 ms | Moderate | General use cases balancing responsiveness and accuracy
Precise | 2000 ms | Low | Accuracy-critical scenarios with minimal interruptions
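As a rough aid for choosing a mode, the latencies in the table above can be treated as a lookup. The sketch below picks the slowest, and therefore most accuracy-oriented, mode that still fits a latency budget. The `MODES` dictionary and the function `slowest_mode_within` are hypothetical names for illustration and are not part of the product.

```python
from typing import Optional

# Latency and interruption tolerance per mode, copied from the table above.
MODES = {
    "Turbo": {"latency_ms": 400, "interruption_tolerance": "High"},
    "Balanced": {"latency_ms": 1600, "interruption_tolerance": "Moderate"},
    "Precise": {"latency_ms": 2000, "interruption_tolerance": "Low"},
}


def slowest_mode_within(budget_ms: int) -> Optional[str]:
    """Return the highest-latency mode whose latency fits the budget.

    Assumption: a longer latency means more accuracy, so the slowest mode
    that fits is preferred. Returns None if no mode fits.
    """
    candidates = [
        (spec["latency_ms"], name)
        for name, spec in MODES.items()
        if spec["latency_ms"] <= budget_ms
    ]
    return max(candidates)[1] if candidates else None
```

With a budget of 1800 ms this would select Balanced, and with a budget below 400 ms it returns `None`, meaning none of the three modes fits.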

Barge-in

Allow callers to interrupt the agent mid-utterance. When enabled, the agent stops speaking as soon as it detects caller speech, which shortens voice activity detection (VAD) time and reduces response latency. To enable this feature, go to Channels > Voice > Audio management and toggle Enable barge-in.

How barge-in works

When barge-in is enabled:
  1. The agent begins speaking its response.
  2. If the caller starts talking, the agent stops its current utterance and begins listening.
  3. The agent processes the caller’s input and responds as a new turn.
This also applies to delay control responses – if the caller speaks during a filler phrase, the delay sequence is interrupted.
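The turn-taking steps above can be pictured as a small state machine. This is a conceptual sketch only: the state names and both functions are hypothetical, and the behavior with barge-in disabled (the agent keeps speaking) is an assumption.

```python
from enum import Enum


class AgentState(Enum):
    SPEAKING = "speaking"      # agent is playing a response or a filler phrase
    LISTENING = "listening"    # agent is capturing caller speech
    RESPONDING = "responding"  # agent is preparing a reply as a new turn


def on_caller_speech(state: AgentState, barge_in_enabled: bool) -> AgentState:
    """Return the next state when caller speech is detected."""
    if state is AgentState.SPEAKING and barge_in_enabled:
        # Step 2: stop the current utterance and begin listening.
        return AgentState.LISTENING
    # Assumption: with barge-in off, the agent finishes its utterance.
    return state


def on_caller_finished(state: AgentState) -> AgentState:
    """Step 3: once the caller stops talking, handle the input as a new turn."""
    return AgentState.RESPONDING if state is AgentState.LISTENING else state
```

Because a filler phrase is modeled as `SPEAKING`, the same transition covers the interrupted delay sequence described above.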

When to use barge-in

Barge-in works well for:
  • FAQ-heavy agents where callers may already know what they need
  • Long agent responses where the caller wants to redirect the conversation
  • Turbo interaction style, where fast responsiveness is a priority

When to disable barge-in

Consider disabling barge-in (globally or per flow/step) when:
  • The agent is executing a function with external side effects (bookings, payments, form submissions) – the caller may interrupt after the action completes but before hearing the confirmation
  • The agent must deliver a mandatory disclosure or disclaimer that cannot be skipped
  • The environment is noisy, causing false barge-in triggers from background sounds

Per-flow and per-step overrides

You can configure barge-in at a granular level using the experimental JSON config. This lets you enable barge-in globally while disabling it for specific flows or steps where interruption would be problematic. Overrides follow a precedence order: step > flow > global. For example, you can have barge-in off globally, enabled for a specific flow, and disabled again for a sensitive step within that flow.
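The precedence order (step > flow > global) can be illustrated with a short Python sketch. It does not reflect the actual schema of the experimental JSON config, which is not shown here. The function `resolve_barge_in` and its parameters are hypothetical, with `None` standing for "no override set at this level".

```python
from typing import Optional


def resolve_barge_in(
    global_setting: bool,
    flow_override: Optional[bool] = None,
    step_override: Optional[bool] = None,
) -> bool:
    """Return the effective barge-in setting for a step.

    Hypothetical helper: the most specific override that is set wins,
    following the documented order step > flow > global.
    """
    if step_override is not None:
        return step_override
    if flow_override is not None:
        return flow_override
    return global_setting


# The example from the text: off globally, on for one flow,
# off again for a sensitive step inside that flow.
effective_in_flow = resolve_barge_in(False, flow_override=True)
effective_in_step = resolve_barge_in(False, flow_override=True, step_override=False)
```

Here `effective_in_flow` is `True` and `effective_in_step` is `False`, matching the example in the paragraph above.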
Barge-in behavior cannot be fully tested in the chat panel. Always verify with a real phone call.

Voice configuration

Configure VAD, greeting audio, and call handling settings.

Agent Voice

Adjust voice stability and clarity for your TTS provider.

Response control

Block keywords and configure pronunciations.
Last modified on April 20, 2026