Voice choice strongly affects trust, clarity, and perceived competence. A good voice setup makes the agent feel calm, intentional, and predictable — especially in call scenarios where users cannot re-read responses. This step should be completed after agent behaviour is stable, but before serious call testing.

Where this lives

  • Configure → Agent voice
You will see two configurable sections:
  • Agent (the primary speaking voice)
  • Disclaimer (used only for legal or informational disclosures, if enabled)

What you are configuring

Each voice entry has three layers:
  1. Voice selection (who the agent sounds like)
  2. Weighting (if multiple voices are used)
  3. Voice tuning (stability, clarity, speed)
You can use a single voice or blend multiple voices together.

Step-by-step

1. Select the agent voice

In the Agent section:
  • Click Change
  • Browse voices by:
    • Language / locale
    • Gender
    • Voice type (for example, PolyVoice vs standard)
  • Preview each voice using the play button
Choose a voice that:
  • Matches your audience’s expectations
  • Sounds neutral and professional
  • Remains pleasant over repeated listening
Example (hotel concierge):
  • Language: English (United States)
  • Voice: Female, calm, mid-range pitch
  • Avoid novelty or character voices
Example (retail support):
  • Language: English (United Kingdom)
  • Voice: Neutral, friendly, slightly upbeat
Once selected, click Select.

2. (Optional) Add multiple agent voices

You can add more than one agent voice and split usage between them.
  • Click + Voice
  • Select an additional voice
  • Assign a percentage weight (for example, 70% / 30%)
This is useful when:
  • You want subtle variation across calls
  • You are testing two voices in parallel
Example:
  • Voice A (primary): 70%
  • Voice B (secondary): 30%
Avoid adding more than 2–3 voices unless you are deliberately testing variation.
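As a rough illustration of what a 70% / 30% split means in practice, the sketch below picks one voice per call with probability proportional to its weight. It is not how the platform itself is implemented. The voice names, the rule that weights must total 100, and the per-call selection are all assumptions made for the example.

```python
import random

# Hypothetical voice names; the weights mirror the 70% / 30% example above.
VOICE_WEIGHTS = {"voice_a": 70, "voice_b": 30}


def validate_weights(weights):
    """Raise ValueError unless every weight is positive and they total 100.

    The total-of-100 rule is an assumption for this sketch.
    """
    if not weights:
        raise ValueError("at least one voice is required")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("each weight must be greater than 0")
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights must total 100, got {total}")


def pick_voice(weights, rng=random):
    """Pick one voice for a call, with probability proportional to its weight."""
    validate_weights(weights)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same counts
    counts = {name: 0 for name in VOICE_WEIGHTS}
    for _ in range(10_000):
        counts[pick_voice(VOICE_WEIGHTS, rng)] += 1
    print(counts)
```

Over many simulated calls the counts should settle near the configured split, which is a quick way to reason about how often callers would hear the secondary voice.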

3. Configure voice settings (gear icon)

Next to each voice, click the Settings (gear) icon to open advanced tuning. You will see:
Voice model
  • Leave as default unless advised otherwise.
Stability (%)
  • Controls how consistent the voice sounds.
  • Higher = more predictable, less expressive.
Recommended ranges:
  • Call agents: 70–90
  • Informational agents: 60–80
Example:
  • Stability: 80
  • Result: calm, consistent delivery across long calls
Clarity and similarity (%)
  • Controls how close the generated voice is to the original recording.
  • Higher = more realistic, but slightly less forgiving.
Recommended ranges:
  • Dynamic conversations: 80–95
  • Scripted or repetitive content: 95–100
Example:
  • Clarity: 100
  • Result: very natural, human-like voice
Voice speed
  • Controls pacing.
  • Default is usually correct.
Recommended ranges:
  • Calls: 0.95–1.05
  • Long explanations: slightly slower (0.95)
Example:
  • Speed: 1.05
  • Result: efficient without sounding rushed
Click Done to save settings.
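If you keep a record of your tuning values outside the UI, a small check against the recommended ranges in this step can catch drift between revisions. The sketch below is one possible way to do that. The `VoiceSettings` class, its field names, and the use-case labels are hypothetical and are not part of the product; only the numeric ranges come from this guide.

```python
from dataclasses import dataclass

# Recommended ranges copied from this guide, keyed by use case.
RECOMMENDED = {
    "stability": {"call": (70, 90), "informational": (60, 80)},
    "clarity": {"dynamic": (80, 95), "scripted": (95, 100)},
    "speed": {"call": (0.95, 1.05)},
}


@dataclass
class VoiceSettings:
    """Hypothetical container for the three tuning values."""
    stability: int
    clarity: int
    speed: float = 1.0


def check_settings(settings, stability_use="call", clarity_use="dynamic"):
    """Return a warning for each value outside its recommended range."""
    checks = [
        ("stability", settings.stability, RECOMMENDED["stability"][stability_use]),
        ("clarity", settings.clarity, RECOMMENDED["clarity"][clarity_use]),
        ("speed", settings.speed, RECOMMENDED["speed"]["call"]),
    ]
    warnings = []
    for name, value, (low, high) in checks:
        if not low <= value <= high:
            warnings.append(f"{name}={value} is outside the recommended {low}-{high}")
    return warnings


if __name__ == "__main__":
    for warning in check_settings(VoiceSettings(stability=50, clarity=90, speed=1.2)):
        print(warning)
```

The function returns warnings rather than raising errors because the ranges are recommendations, and a deliberate choice outside them may still be valid for your agent.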

4. Configure disclaimer voice (if enabled)

If your agent uses a disclaimer:
  • Expand the Disclaimer section
  • Select a voice (it can be the same voice as the agent or a different one)
Best practices:
  • Use a clear, neutral voice
  • Avoid expressive or casual tones
  • Keep speed slightly slower than the main agent
Example disclaimer text: “This call may be recorded for quality and training purposes.”
Example settings:
  • Stability: 85
  • Clarity: 100
  • Speed: 0.95
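The best practices above can be read as a comparison between the disclaimer and the main agent voice. The sketch below encodes that comparison using the example values from this guide. The dictionary layout and key names are hypothetical, and treating "less expressive" as "stability at least as high as the agent's" is an interpretation made for the example, not a documented rule.

```python
# Hypothetical settings layout; the key names are illustrative only.
voices = {
    "agent": {"stability": 80, "clarity": 100, "speed": 1.05},
    "disclaimer": {"stability": 85, "clarity": 100, "speed": 0.95},
}


def disclaimer_warnings(voices):
    """Compare the disclaimer tuning with the agent tuning."""
    agent, disclaimer = voices["agent"], voices["disclaimer"]
    warnings = []
    # Best practice: keep the disclaimer slightly slower than the main agent.
    if disclaimer["speed"] >= agent["speed"]:
        warnings.append("disclaimer speed should be slightly slower than the agent's")
    # Assumption: lower stability means a more expressive delivery.
    if disclaimer["stability"] < agent["stability"]:
        warnings.append("disclaimer is less stable (more expressive) than the agent")
    return warnings


if __name__ == "__main__":
    print(disclaimer_warnings(voices))
```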

5. Preview real utterances

Before publishing, preview the voice using real agent lines, not generic samples. Test:
  • Greeting
  • Clarifying question
  • SMS offer
  • Handoff offer
  • Closing line
Example preview phrases:
  • “How can I help you today?”
  • “I can send you a text with the link. Would you like me to do that now?”
  • “I can connect you to a colleague if you’d prefer.”
Listen for:
  • Natural pauses
  • Clear pronunciation
  • No rushing or slurring

6. Publish changes

Once satisfied:
  • Click Publish
  • Changes will apply to the selected environment
Remember: voice settings are environment-aware. Always test in Sandbox before Live.

Common mistakes to avoid

  • Using overly expressive or novelty voices
  • Setting stability too low (causes inconsistency)
  • Speaking too fast for phone audio
  • Forgetting to preview handoff language
  • Changing voices without retesting call flows

Verify

After publishing:
  • Make at least one test call
  • Listen to:
    • Greeting
    • Mid-call responses
    • Handoff or SMS offers
  • Confirm:
    • Voice sounds consistent
    • No awkward pacing
    • Audio matches expectations across turns
If something feels off, adjust stability first, then speed. Voice tuning is iterative, and small adjustments often make a big difference.