Voice and audio updates

Update voice settings, fix mispronunciations, manage cached audio, and tune call behavior to maintain high-quality voice experiences for your callers.

Quick reference

I need to…	Action
Change the agent’s voice	Voice > Settings → Change
Adjust voice parameters	Voice > Settings → settings gear
Fix a mispronunciation	Voice > Advanced settings > Speech → Pronunciation
Update cached audio	Voice > Audio library → Edit → Sync
Enable/disable barge-in	Voice > Advanced settings > Call → toggle
Upload custom audio	Voice > Audio library → Upload

Changing your agent’s voice

Consider changing the voice when your brand refreshes, customers report clarity issues, you’re expanding to new languages, or newer voice models become available.

Go to Voice > Settings
Click Change to open the Voice Library
Filter by Language, Region, and Gender
Preview voices with custom text
Click Select to apply
Test in Agent Chat before publishing

For non-English projects, use a multilingual_v2 model to ensure proper language support.

For programmatic voice configuration, see voice classes and Add a voice.

Barge-in

Barge-in lets callers interrupt the agent mid-sentence. Toggle it in Voice > Advanced settings > Call.

Enable when: callers frequently interrupt, or you want more natural conversations.
Disable when: the agent must deliver complete information (such as legal disclaimers), or background noise causes false interruptions.

Managing audio quality

Cached audio

The Audio library tab lets you cache and optimize frequently used audio to reduce latency and keep quality consistent.

Open Voice > Audio library
Click Edit to adjust stability/clarity settings or add IPA pronunciation corrections
Click the sync icon to regenerate, then preview

Audio is cached only after the same TTS output has been generated at least twice within 24 hours. For critical phrases (greetings, transfers), either generate them repeatedly or upload the audio manually.
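To reason about which phrases are likely to be cached, the rule above can be modelled in a few lines. This is a sketch under stated assumptions, not the platform's implementation: the `CacheEligibilityTracker` class and its method are hypothetical, and it assumes "the same TTS" means identical text and that the 24 hours are a rolling window.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # window stated in the text above
MIN_GENERATIONS = 2           # "at least twice"


class CacheEligibilityTracker:
    """Hypothetical model of the caching rule; not the platform's code."""

    def __init__(self):
        # utterance text -> timestamps of recent generations
        self._history = defaultdict(list)

    def record_generation(self, text: str, when: datetime) -> bool:
        """Record one TTS generation; return True if the text now qualifies."""
        times = self._history[text]
        times.append(when)
        # Keep only generations inside the rolling window ending at `when`.
        times[:] = [t for t in times if when - t <= WINDOW]
        return len(times) >= MIN_GENERATIONS


tracker = CacheEligibilityTracker()
start = datetime(2024, 1, 1, 9, 0)
first = tracker.record_generation("Thanks for calling.", start)
second = tracker.record_generation("Thanks for calling.", start + timedelta(hours=3))
rare_1 = tracker.record_generation("Please hold.", start)
rare_2 = tracker.record_generation("Please hold.", start + timedelta(hours=30))
print(first, second, rare_1, rare_2)  # False True False False
```

Under these assumptions, a phrase spoken only once every 30 hours never qualifies, which is why rarely used but critical phrases are better uploaded manually.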

Custom audio uploads

Upload pre-recorded audio (WAV or MP3) for maximum control over greetings, legal disclaimers, or brand-specific moments.
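Before uploading a WAV file, it can help to confirm its basic properties locally. The sketch below uses only Python's standard `wave` module; the helper names `describe_wav` and `make_silence` are illustrative. This page does not state a required sample rate or channel count, so the code only reports values rather than enforcing any.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Return basic properties of a WAV file held in memory."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }


def make_silence(seconds: float, rate: int = 8000) -> bytes:
    """Build a mono 16-bit silent WAV so the example runs without a real file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


info = describe_wav(make_silence(1.5))
print(info)
```

For a real recording, read the file's bytes (for example with `open(path, "rb").read()`) and pass them to `describe_wav`.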

Fixing pronunciations

When the agent mispronounces words:

Go to Voice > Advanced settings > Speech and open the Pronunciation section
Add a pronunciation rule
Enter the regex pattern for the word as it appears in text
Provide the IPA replacement (e.g., “PolyAI” → /ˈpɒli eɪ aɪ/)
Test in Agent Chat
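A pronunciation rule pairs a regex pattern with an IPA replacement, so it is worth checking offline that the pattern matches exactly the words you intend. The following is a minimal sketch using Python's `re` module; the `RULES` list and `apply_rules` function are hypothetical and only approximate how such a rule might behave. The platform's regex dialect and matching order may differ.

```python
import re

# Hypothetical rule table: (compiled pattern, IPA replacement).
# The PolyAI entry reuses the example from the steps above.
RULES = [
    (re.compile(r"\bPolyAI\b"), "/ˈpɒli eɪ aɪ/"),
]


def apply_rules(text: str) -> str:
    """Replace each matched word with its IPA form, in rule order."""
    for pattern, ipa in RULES:
        text = pattern.sub(ipa, text)
    return text


matched = apply_rules("Welcome to PolyAI support.")
unmatched = apply_rules("PolyAIs are not matched.")
print(matched)    # Welcome to /ˈpɒli eɪ aɪ/ support.
print(unmatched)  # PolyAIs are not matched.
```

The `\b` word boundaries keep the rule from firing inside longer words; without them, a pattern can rewrite text you did not mean to change.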

You can also use SSML for advanced control:

<break time="500ms"/>
<prosody rate="slow">Speak this slowly</prosody>
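An unclosed or mistyped SSML tag is easy to miss by eye. As a rough pre-check, the sketch below tests whether a fragment parses as XML using Python's standard library. The `is_well_formed` helper is illustrative, and the `<speak>` wrapper is added only so the parser has a single root element. It does not verify that the TTS provider supports a given tag or attribute.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Check that an SSML fragment parses as XML when wrapped in a root element."""
    try:
        ET.fromstring(f"<speak>{fragment}</speak>")
        return True
    except ET.ParseError:
        return False


good = 'One moment.<break time="500ms"/><prosody rate="slow">Speak this slowly</prosody>'
bad = '<prosody rate="slow">Missing closing tag'
print(is_well_formed(good), is_well_formed(bad))  # True False
```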

Troubleshooting

Issue	Likely cause	Fix
Voice sounds robotic	Low-quality TTS	Switch to Cartesia or ElevenLabs
Agent speaks too fast	Rate set too high	Adjust with the settings gear in Voice > Settings
Agent is interrupted frequently	Barge-in too sensitive	Disable barge-in in Advanced settings > Call
Mispronunciations	TTS doesn’t recognize word	Add pronunciation rule in Advanced settings > Speech
High latency	Slow TTS provider	Switch to Cartesia or use cached audio
Background noise interruptions	Barge-in too sensitive	Disable barge-in or increase speech end delay

Maintenance routine

Monthly: Listen to recent calls and identify voice quality issues
As needed: Add pronunciations for new terms
After voice changes: Regenerate cached audio

Related pages

Audio library – audio caching and optimization
Advanced voice settings – model, barge-in, speech recognition, pronunciation
Voice library – browse and select voices
Voice settings – voice configuration options
