This page covers how to update voice settings, manage audio quality, and fix common audio issues.
Quick reference
| I need to… | Action |
|---|---|
| Change the agent’s voice | Channels > Voice > Agent Voice → Change |
| Adjust voice parameters | Channels > Voice > Agent Voice → settings gear |
| Fix a mispronunciation | Channels > Voice > Response Control → Pronunciations |
| Update cached audio | Channels > Voice > Audio Management → Edit → Sync |
| Change interaction style | Channels > Voice > Audio Management → Interaction style |
| Enable/disable barge-in | Channels > Voice > Audio Management → toggle |
| Upload custom audio | Channels > Voice > Audio Management → Upload |
Changing your agent’s voice
Consider updating when your brand refreshes, customers report clarity issues, you’re expanding to new languages, or newer voice models become available.
- Go to Channels > Voice > Agent Voice
- Click Change to open the Voice Library
- Filter by Language, Region, and Gender
- Preview voices with custom text
- Click Select to apply
- Test in Agent Chat before publishing
For non-English projects, use a multilingual_v2 model to ensure proper language support.
For programmatic voice configuration, see voice classes and Add a voice.
Interaction style and barge-in
Interaction style (response latency)
Control how quickly your agent responds in Channels > Voice > Audio Management:
| Mode | Delay | Best for |
|---|---|---|
| Turbo | 400ms | Ultra-fast responses; may interrupt callers more often |
| Swift | 1200ms | Simple queries |
| Balanced | 1600ms | Most use cases (default) |
| Precise | 2000ms | Complex queries needing accuracy |
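If you track these settings in a script, for example to audit projects or document a rollout, the table above can be expressed as a simple lookup. The following is a minimal Python sketch; `INTERACTION_STYLE_DELAY_MS` and `slowest_style_within` are hypothetical names for illustration and are not part of the platform. It assumes a longer delay is preferable whenever the latency budget allows it.

```python
from typing import Optional

# Delay values copied from the table above; the dictionary itself is a
# hypothetical structure, not a platform API.
INTERACTION_STYLE_DELAY_MS = {
    "Turbo": 400,
    "Swift": 1200,
    "Balanced": 1600,
    "Precise": 2000,
}


def slowest_style_within(budget_ms: int) -> Optional[str]:
    """Return the style with the longest delay that still fits the budget.

    Returns None when even the fastest style exceeds the budget.
    """
    candidates = {
        name: delay
        for name, delay in INTERACTION_STYLE_DELAY_MS.items()
        if delay <= budget_ms
    }
    if not candidates:
        return None
    # Pick the candidate with the largest delay.
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    print(slowest_style_within(1500))
```

With a 1500ms budget, the sketch selects Swift, since Balanced (1600ms) would exceed it.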
Barge-in
Barge-in lets callers interrupt the agent mid-sentence. Toggle it in Channels > Voice > Audio Management.
Enable when: you use Turbo mode, callers frequently interrupt, or you want more natural conversations.
Disable when: the agent must deliver complete information (such as legal disclaimers), or background noise causes false interruptions.
Managing audio quality
Cached audio
The Audio Management tab lets you cache and optimize frequently-used audio for reduced latency and consistent quality.
- Find utterances in Channels > Voice > Audio Management
- Click Edit to adjust stability/clarity settings or add IPA pronunciation corrections
- Click the sync icon to regenerate, then preview
Audio is cached only after the same TTS output has been generated at least twice within 24 hours. For critical phrases (greetings, transfers), generate them repeatedly or upload the audio manually.
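To reason about whether a phrase is likely to be cached, the rule above can be modeled as a check over generation timestamps. This is an illustrative Python sketch only: `is_cache_eligible` is a hypothetical helper, not a platform function, and it assumes that two generations exactly 24 hours apart still count as "within 24 hours".

```python
from datetime import datetime, timedelta

CACHE_WINDOW = timedelta(hours=24)  # window stated on this page
MIN_GENERATIONS = 2                 # "at least twice"


def is_cache_eligible(generated_at: list) -> bool:
    """Return True if MIN_GENERATIONS generations fall in one CACHE_WINDOW.

    `generated_at` is a list of datetime objects, one per time the same
    utterance was synthesized.
    """
    times = sorted(generated_at)
    # Slide over each run of MIN_GENERATIONS consecutive timestamps and
    # check whether the first and last of the run fit inside the window.
    for i in range(len(times) - MIN_GENERATIONS + 1):
        if times[i + MIN_GENERATIONS - 1] - times[i] <= CACHE_WINDOW:
            return True
    return False


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    print(is_cache_eligible([start, start + timedelta(hours=1)]))
    print(is_cache_eligible([start, start + timedelta(hours=25)]))
```

Under these assumptions, a greeting generated twice an hour apart qualifies, while one generated 25 hours apart does not, which is why repeated generation or a manual upload is suggested for critical phrases.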
Custom audio uploads
Upload pre-recorded audio (WAV or MP3) for maximum control over greetings, legal disclaimers, or brand-specific moments.
Fixing pronunciations
When the agent mispronounces words:
- Go to Channels > Voice > Response Control → Pronunciations tab
- Click Add pronunciation
- Enter the word as it appears in text
- Provide the IPA pronunciation (e.g., “PolyAI” → /ˈpɒli eɪ aɪ/)
- Test in Agent Chat
You can also use SSML for advanced control:
<break time="500ms"/>
<prosody rate="slow">Speak this slowly</prosody>
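If responses are assembled in code, the two SSML tags above can be generated with small helpers that also escape special characters in the spoken text. The sketch below is a minimal Python example; `ssml_break` and `ssml_prosody` are hypothetical helper names, and which `rate` values your TTS provider accepts is an assumption to verify.

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_break(ms: int) -> str:
    """Build a pause tag of the given length in milliseconds."""
    if ms < 0:
        raise ValueError("pause length must be non-negative")
    return f'<break time="{ms}ms"/>'


def ssml_prosody(text: str, rate: str = "slow") -> str:
    """Wrap text in a prosody tag, escaping &, < and > in the text."""
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    return f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"


if __name__ == "__main__":
    print(ssml_break(500) + ssml_prosody("Speak this slowly"))
    # prints: <break time="500ms"/><prosody rate="slow">Speak this slowly</prosody>
```

Escaping matters for phrases such as "Terms & conditions", where a raw ampersand would make the markup invalid.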
Troubleshooting
| Issue | Likely cause | Fix |
|---|---|---|
| Voice sounds robotic | Low-quality TTS | Switch to Cartesia or ElevenLabs |
| Agent speaks too fast | Rate set too high | Adjust via settings gear in Agent Voice |
| Agent interrupts frequently | Turbo mode without barge-in | Enable barge-in or switch to Balanced |
| Mispronunciations | TTS doesn’t recognize word | Add pronunciation in Response Control |
| High latency | Slow TTS provider | Switch to Cartesia or use cached audio |
| Background noise interruptions | Barge-in too sensitive | Disable barge-in or adjust interaction style |
Maintenance routine
- Monthly: Listen to recent calls and identify voice quality issues
- As needed: Add pronunciations for new terms
- After voice changes: Regenerate cached audio