This page requires Python familiarity. It covers programmatic voice configuration using provider classes. For no-code voice selection, see the Voice library.
The PolyAI platform supports flexible voice selection for external providers including Cartesia, ElevenLabs, Hume, Rime, Minimax, PlayHT, and Google TTS.
Provider classes
To pick models, adjust stability, or access third-party providers, use the provider-specific TTSVoice classes. See the Voice classes reference for the full list of providers and parameters.
Example: Cartesia
Cartesia is the recommended provider for new projects due to its low latency and natural output.
from polyai.voice import CartesiaVoice, Emotion, EmotionKind, EmotionIntensity

conv.set_voice(
    CartesiaVoice(
        provider_voice_id="a1b2c3d4",
        speed=0.0,  # -1.0 (slowest) to 1.0 (fastest)
        emotions=[
            Emotion(EmotionKind.POSITIVITY, EmotionIntensity.HIGH)
        ],
        model_id="sonic",  # or "sonic-preview"
    )
)
Example: ElevenLabs
from polyai.voice import ElevenLabsVoice

conv.set_voice(
    ElevenLabsVoice(
        provider_voice_id="gDnGxUcsitTxRiGHr904",
        model_id="eleven_turbo_v2_5",  # Recommended default
        stability=1.0,
        similarity_boost=0.7,
        speed=1.0,  # Optional: 0.7–1.2, adjusts speech rate
    )
)
Available model IDs: eleven_monolingual_v1, eleven_multilingual_v1, eleven_turbo_v2, eleven_turbo_v2_5, eleven_flash_v2_5, and eleven_v3. The default is eleven_turbo_v2_5.
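A typo in a model ID is easier to catch before the voice is configured. The following is a minimal sketch, not part of the PolyAI SDK: the constant names and the helper resolve_elevenlabs_model_id are hypothetical, and the set simply mirrors the IDs listed above, so it needs updating if that list changes.

```python
from typing import Optional

# Model IDs copied from the list on this page (assumption: the list is complete).
ELEVENLABS_MODEL_IDS = frozenset({
    "eleven_monolingual_v1",
    "eleven_multilingual_v1",
    "eleven_turbo_v2",
    "eleven_turbo_v2_5",
    "eleven_flash_v2_5",
    "eleven_v3",
})
DEFAULT_ELEVENLABS_MODEL_ID = "eleven_turbo_v2_5"


def resolve_elevenlabs_model_id(model_id: Optional[str] = None) -> str:
    """Return a listed model ID, using the documented default when none is given."""
    if model_id is None:
        return DEFAULT_ELEVENLABS_MODEL_ID
    if model_id not in ELEVENLABS_MODEL_IDS:
        raise ValueError(f"Unknown ElevenLabs model ID: {model_id!r}")
    return model_id
```

The returned string can then be passed as model_id when constructing the voice.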
eleven_v3 limitations:
- Stability: Only supports discrete values: 0.0 (Creative), 0.5 (Natural), and 1.0 (Robust). Values between these are not supported and may produce unexpected results.
- Streaming latency: Do not set optimize_streaming_latency when using eleven_v3. This parameter is not supported and will cause an error.
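If voice settings are built dynamically, both eleven_v3 constraints can be enforced before the voice is created. The sketch below works on a plain dictionary of keyword arguments. The helper names snap_v3_stability and v3_safe_params are hypothetical, and passing the resulting dictionary to ElevenLabsVoice as keyword arguments is an assumption rather than documented behavior.

```python
# Discrete stability values supported by eleven_v3, per the list above.
V3_STABILITY_LEVELS = (0.0, 0.5, 1.0)


def snap_v3_stability(stability: float) -> float:
    """Return the supported level closest to the requested stability.

    Exact ties (0.25, 0.75) resolve to the lower level.
    """
    return min(V3_STABILITY_LEVELS, key=lambda level: abs(level - stability))


def v3_safe_params(model_id: str, params: dict) -> dict:
    """Return a copy of params adjusted for eleven_v3; other models pass through."""
    cleaned = dict(params)
    if model_id == "eleven_v3":
        # Unsupported on eleven_v3 and documented to cause an error.
        cleaned.pop("optimize_streaming_latency", None)
        if "stability" in cleaned:
            cleaned["stability"] = snap_v3_stability(cleaned["stability"])
    return cleaned
```

Snapping silently changes the requested value, so raising an error instead may be preferable when the exact stability matters.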
Example: Hume
from polyai.voice import HumeVoice

conv.set_voice(
    HumeVoice(
        provider_voice_id="voice_uuid_or_name",
        voice_description="patient, empathetic counselor",  # Optional
        version="2",  # "1" for octave-1, "2" for octave-2
        instant_mode=False,  # Ultra-low latency mode
    )
)
Cache behavior
- Changing model_id does not automatically invalidate cached audio.
- To reset cached audio after changing models:
  - Go to Channels > Voice > Audio management and delete existing cache entries.
  - Or, create a new voice entry with a different voice ID.
Prepend the model ID to the voice ID (e.g. eleven_turbo_v2_5/a1b2c3...) to isolate cache entries per model. This is the most reliable way to ensure the correct model is used after a switch.
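The prefixing convention is plain string construction, so a small helper keeps it consistent across a project. This is a minimal sketch: the function name model_scoped_voice_id is hypothetical, and it only builds the model_id/voice_id string described above. Where that string is then entered or passed is not covered here.

```python
def model_scoped_voice_id(model_id: str, voice_id: str) -> str:
    """Build a "<model_id>/<voice_id>" string to isolate cache entries per model."""
    if not model_id or not voice_id:
        raise ValueError("model_id and voice_id must both be non-empty")
    prefix = f"{model_id}/"
    # Avoid double-prefixing an ID that is already scoped to this model.
    if voice_id.startswith(prefix):
        return voice_id
    return prefix + voice_id
```

For example, model_scoped_voice_id("eleven_turbo_v2_5", "gDnGxUcsitTxRiGHr904") yields "eleven_turbo_v2_5/gDnGxUcsitTxRiGHr904", and calling it again on that result leaves it unchanged.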