GPT-4o Mini
The lightweight version of GPT-4o, offering faster performance and cost-efficiency. Best for simple queries and use-cases that prioritise speed and affordability.

GPT-4o
An optimised version of GPT-4 with high-quality responses and reduced latency. Ideal when you need both accuracy and responsiveness.

GPT-4
The most advanced model for generating detailed, high-quality responses. Recommended for complex tasks requiring precision and context.

GPT-3.5 (0125)
An enhanced build of GPT-3.5 with stability improvements for specific workloads. Balances performance and cost.

GPT-3.5
A reliable, cost-efficient option when speed and affordability are the main concerns. Good for straightforward interactions and real-time responses.

Bedrock Claude 3.5 Haiku
A lightweight version of Anthropic’s Claude model, hosted on AWS Bedrock. Suitable for simple, predictable tasks.

Raven
PolyAI’s proprietary model, optimised for real-time voice interactions.

Gemini 1.5 (coming soon)
Google’s next-generation LLM focused on reasoning and long context windows. Currently being integrated.

Mistral (coming soon)
An open-weight model designed for high-performance reasoning and coding tasks. Integration planned for a future release.

Configuring the model

- Open Agent Settings → Large Language Model.
- Select the desired model from the dropdown.
- Click Save to apply your changes.
 
Available providers:

- OpenAI Models
- Anthropic (Claude)
- Google DeepMind (Gemini)
- Mistral
- Amazon Nova Micro
- Contact PolyAI for information about Raven, PolyAI’s proprietary LLM.
 
Bring Your Own Model (BYOM)
PolyAI supports bring-your-own-model (BYOM) via a simple API integration. If you run your own LLM, expose an endpoint that follows the OpenAI `chat/completions` schema and PolyAI will treat it like any other provider.
Overview
- Expose an API endpoint that accepts and returns data in the OpenAI `chat/completions` format.
- Provide authentication: PolyAI can send either an `x-api-key` header or a Bearer token.
- (Optional) Support streaming responses using `stream: true`.
API endpoint
Request format
Optional sampling parameters such as `frequency_penalty`, `presence_penalty`, etc. may also be included in the request; your endpoint can honour or ignore them.
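As an illustration (the model name and field values below are placeholders, not values PolyAI guarantees to send), a minimal request body in the `chat/completions` format looks like:

```json
{
  "model": "my-custom-model",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "stream": false
}
```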
Response format
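Your endpoint should reply with a body shaped like an OpenAI `chat/completions` response. A minimal sketch, with illustrative IDs and token counts:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "my-custom-model",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16}
}
```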
Streaming support (optional)
If `stream` is `true`, send Server-Sent Events (SSE) mirroring OpenAI’s format:
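For example, a short streamed reply could be delivered as the following `text/event-stream` chunks (IDs and content are illustrative), ending with a `[DONE]` sentinel:

```text
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```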
Authentication
| Method | Header sent by PolyAI | 
|---|---|
| API Key | x-api-key: YOUR_API_KEY | 
| Bearer | Authorization: Bearer YOUR_TOKEN | 
Sample implementation (Python / Flask)
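A minimal sketch of a BYOM endpoint, assuming Flask is installed and using the `x-api-key` authentication method. The `generate_reply` helper is a hypothetical placeholder for a call to your own model:

```python
import time
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)

API_KEY = "YOUR_API_KEY"  # placeholder: the credential you share with PolyAI


def generate_reply(messages):
    # Placeholder: call your own LLM here using the incoming messages.
    return "Hello from my custom model!"


@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    # Verify the header PolyAI is configured to send (API Key method shown).
    if request.headers.get("x-api-key") != API_KEY:
        return jsonify({"error": "unauthorized"}), 401

    body = request.get_json(force=True)
    reply = generate_reply(body.get("messages", []))

    # Respond in the OpenAI chat/completions format.
    return jsonify({
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": body.get("model", "my-custom-model"),
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": reply},
            "finish_reason": "stop",
        }],
    })


if __name__ == "__main__":
    app.run(port=8000)
```

For Bearer authentication, check `request.headers.get("Authorization")` against `"Bearer YOUR_TOKEN"` instead.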
Final checklist
- Endpoint reachable via POST.
- Request/response match the OpenAI `chat/completions` schema.
- Authentication header configured (API Key or Bearer token).
- (Optional) Streaming supported if needed.
 
Share the following with PolyAI:

- Endpoint URL
- Model ID
- Auth method & credential
 

