Add a voice - PolyAI Platform

This page requires Python familiarity. It covers programmatic voice configuration using provider classes. For no-code voice selection, see the Voice library. The PolyAI platform supports flexible voice selection for external providers including Cartesia, ElevenLabs, Hume, Rime, Minimax, PlayHT, and Google TTS.

Provider classes

To pick a model, adjust stability, or access a third-party provider, use the provider-specific TTSVoice classes. See the Voice classes reference for the full list of providers and parameters.

Example: Cartesia

Cartesia is the recommended provider for new projects due to its low latency and natural output.

from polyai.voice import CartesiaVoice, Emotion, EmotionKind, EmotionIntensity

conv.set_voice(
    CartesiaVoice(
        provider_voice_id="a1b2c3d4",
        speed=0.0,  # -1.0 (slowest) to 1.0 (fastest)
        emotions=[
            Emotion(EmotionKind.POSITIVITY, EmotionIntensity.HIGH)
        ],
        model_id="sonic-3"  # or "sonic-preview"
    )
)

Example: ElevenLabs

from polyai.voice import ElevenLabsVoice

conv.set_voice(
    ElevenLabsVoice(
        provider_voice_id="gDnGxUcsitTxRiGHr904",
        model_id="eleven_turbo_v2_5",  # Recommended default
        stability=1.0,
        similarity_boost=0.7,
        speed=1.0,              # Optional: 0.7–1.2, adjusts speech rate
    )
)
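
The two examples above use different speed scales: the Cartesia comment gives a range of -1.0 to 1.0, while the ElevenLabs comment gives 0.7 to 1.2. As a minimal sketch of guarding against out-of-range values before building a voice, the helper below clamps a requested speed to the range for each provider. `SPEED_RANGES` and `clamp_speed` are hypothetical names for illustration and are not part of `polyai.voice`; the ranges are copied from the comments in the examples.

```python
from typing import Dict, Tuple

# Speed ranges copied from the comments in the examples on this page.
# This table and the helper below are illustrative, not part of polyai.voice.
SPEED_RANGES: Dict[str, Tuple[float, float]] = {
    "cartesia": (-1.0, 1.0),    # -1.0 (slowest) to 1.0 (fastest)
    "elevenlabs": (0.7, 1.2),   # 0.7 to 1.2
}


def clamp_speed(provider: str, speed: float) -> float:
    """Return speed limited to the documented range for the given provider."""
    if provider not in SPEED_RANGES:
        raise ValueError(f"No speed range recorded for provider: {provider!r}")
    low, high = SPEED_RANGES[provider]
    return max(low, min(high, speed))
```

The result could then be passed as the `speed` argument, for example `speed=clamp_speed("elevenlabs", requested_speed)`. Clamping is one possible choice; raising an error on out-of-range input may be preferable if silent adjustment would hide a configuration mistake.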

Available model IDs: eleven_monolingual_v1, eleven_multilingual_v1, eleven_turbo_v2, eleven_turbo_v2_5, eleven_flash_v2_5, and eleven_v3. The default is eleven_turbo_v2_5.

eleven_v3 limitations:

- Stability: only the discrete values 0.0 (Creative), 0.5 (Natural), and 1.0 (Robust) are supported. Values in between may produce unexpected results.
- Streaming latency: do not set optimize_streaming_latency when using eleven_v3. The parameter is not supported and will cause an error.
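
One way to avoid tripping over these limits is to normalize settings before constructing the voice. The sketch below is a hypothetical helper (`elevenlabs_settings` is not part of `polyai.voice`): it checks the model ID against the list on this page, snaps stability to the nearest supported step for eleven_v3, and leaves out optimize_streaming_latency for that model. It returns a plain dict; passing that dict on as keyword arguments to `ElevenLabsVoice` is an assumption about the constructor, not something this page confirms.

```python
from typing import Any, Dict, Optional

# Model IDs listed on this page.
ELEVENLABS_MODELS = (
    "eleven_monolingual_v1",
    "eleven_multilingual_v1",
    "eleven_turbo_v2",
    "eleven_turbo_v2_5",
    "eleven_flash_v2_5",
    "eleven_v3",
)

# Discrete stability values supported by eleven_v3: Creative, Natural, Robust.
V3_STABILITY_STEPS = (0.0, 0.5, 1.0)


def elevenlabs_settings(
    stability: float,
    model_id: str = "eleven_turbo_v2_5",
    optimize_streaming_latency: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build settings that respect the eleven_v3 limits."""
    if model_id not in ELEVENLABS_MODELS:
        raise ValueError(f"Unknown ElevenLabs model ID: {model_id!r}")

    settings: Dict[str, Any] = {"model_id": model_id, "stability": stability}
    if model_id == "eleven_v3":
        # Snap to the closest supported step (ties resolve to the lower step).
        settings["stability"] = min(
            V3_STABILITY_STEPS, key=lambda step: abs(step - stability)
        )
        # optimize_streaming_latency is deliberately omitted for eleven_v3.
    elif optimize_streaming_latency is not None:
        settings["optimize_streaming_latency"] = optimize_streaming_latency
    return settings
```

For example, `elevenlabs_settings(0.6, "eleven_v3")` yields a stability of 0.5, while the same call with the default model keeps 0.6 unchanged.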

Example: Hume

from polyai.voice import HumeVoice

conv.set_voice(
    HumeVoice(
        provider_voice_id="voice_uuid_or_name",
        voice_description="patient, empathetic counselor",  # Optional
        version="2",        # "1" for octave-1, "2" for octave-2
        instant_mode=False,  # Ultra-low latency mode
    )
)

Cache behavior

Changing model_id does not automatically invalidate cached audio.
To reset cached audio after changing models:
- Go to Channels > Voice > Audio management and delete existing cache entries.
- Or, create a new voice entry with a different voice ID.

Prepend the model ID to the voice ID (e.g. eleven_turbo_v2_5/a1b2c3...) to isolate cache entries per model. This is the most reliable way to ensure the correct model is used after a switch.
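
The prefixing convention above can be sketched as a small string helper. `cache_scoped_voice_id` is a hypothetical name used for illustration and is not part of `polyai.voice`; the sketch only builds the `model_id/voice_id` string and makes no assumption about where on the platform that string is entered.

```python
def cache_scoped_voice_id(model_id: str, voice_id: str) -> str:
    """Hypothetical helper: prefix voice_id with model_id, as in 'model/voice'."""
    if not model_id or not voice_id:
        raise ValueError("model_id and voice_id must both be non-empty")
    prefix = f"{model_id}/"
    if voice_id.startswith(prefix):
        # Already scoped to this model; avoid a double prefix.
        return voice_id
    return prefix + voice_id
```

Because the model ID becomes part of the voice ID, switching models produces a different ID, so audio cached under the old model is not picked up.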

Related pages

- Voice library: Browse and select voices
- Agent Voice: Configure voice settings and fine-tuning
- Multi-voice: Use multiple voices in a single agent
- Voice configuration: Configure model selection and call settings
