Import existing content – help articles, PDFs, internal docs – so your agent can reference it without rewriting everything as individual topics. Connected Knowledge aggregates sources and re-syncs automatically. The Connected tab is found under Knowledge > Sources in Agent Studio. Raven is the recommended model: it paraphrases unstructured content more naturally than other models.
Use Connected Knowledge when you want to expose large volumes of external content quickly without curating individual topics. Use FAQs instead when you need actions, flows, or precise control over what the agent says and does. Both use RAG (retrieval-augmented generation) to match user queries.

Supported sources

  • Websites
  • Documents (PDF, CSV, JSON)
  • Help desk systems (Zendesk, Gladly)
Sources sync automatically and can be reused across projects.

How Sources differs from FAQs

Both tabs expose information to your agent. Key differences:
| Capability | Connected tab | FAQs tab |
|---|---|---|
| Trigger actions, functions, flows, SMS | No | Yes |
| Precise control over agent responses | No | Yes |
| Auto-sync from external sources | Yes | No |
| Best for frequently updated FAQ content | Yes | |
| Best for stable, structured info | | Yes |
| Fine-grained behavior control | No | Yes |
| Setup complexity | Low – no prompting skill required | Higher – requires more expertise and maintenance |
Connected = fast import of external content. FAQs = precise control with actions and flows. If both tabs contain conflicting information, FAQs always takes priority.

Add a new source

  1. Go to Knowledge > Sources tab
  2. Select New source
  3. Choose one of:
    • Upload files
    • Add URL
    • Zendesk
    • Gladly
    • Additional integrations are in development – contact your PolyAI representative for the latest availability
  4. Complete the required details and click Add
Your agent will begin syncing the content. Once ready, the source appears in the list.

Supported source types

| Source type | Details |
|---|---|
| Upload files – Text & structured data | .txt, .csv, .json, .xml, .md, .html, .rtf |
| Upload files – PDF | .pdf |
| Upload files – Microsoft Office | .docx, .doc, .docm, .xlsx, .xls, .xlsm, .pptx, .ppt, .pptm, .msg |
| Upload files – OpenDocument | .odt, .ods, .odp |
| Upload files – Email files | .eml |
| Upload files – E-books | .epub |
| URL scraping | Public documentation pages and help center articles |
| Zendesk (beta) | Help Center content with API sync |
| Gladly (beta) | Knowledge source sync |
| Additional integrations | In development – contact your PolyAI representative for the latest availability |
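If you prepare files in bulk, it can help to check extensions before uploading. The sketch below is a local pre-check only: the extension groups are copied from the table above, while `SUPPORTED_UPLOADS` and `upload_category` are hypothetical names for illustration and are not part of Agent Studio.

```python
from pathlib import Path

# Extension groups copied from the supported source types table.
# The dictionary and function names are hypothetical helpers.
SUPPORTED_UPLOADS = {
    "Text & structured data": {".txt", ".csv", ".json", ".xml", ".md", ".html", ".rtf"},
    "PDF": {".pdf"},
    "Microsoft Office": {".docx", ".doc", ".docm", ".xlsx", ".xls", ".xlsm",
                         ".pptx", ".ppt", ".pptm", ".msg"},
    "OpenDocument": {".odt", ".ods", ".odp"},
    "Email files": {".eml"},
    "E-books": {".epub"},
}


def upload_category(filename):
    """Return the table category for a file name, or None if unsupported."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for category, extensions in SUPPORTED_UPLOADS.items():
        if suffix in extensions:
            return category
    return None
```

For example, `upload_category("pricing.xlsx")` returns the Microsoft Office group, while a video file such as `demo.mp4` returns `None` and would need converting to a supported format first.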

What exactly gets scraped when I upload a URL?

URL scraping traverses linked pages from the provided URL, with the following limits:
  1. Depth → Only one level below the initial URL.
  2. Breadth → A maximum of 10 embedded pages.
If your page contains more than 10 links, not all will be scraped. In that case, upload additional URLs individually or use integrations like Zendesk/Gladly for complete coverage. Where possible, connect applications such as Zendesk rather than relying on website scraping.
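To see how the two limits interact, here is a minimal sketch of a depth- and breadth-limited traversal. It is not PolyAI's scraper: `pages_to_scrape` and the `links_on_page` mapping are hypothetical stand-ins for fetching and parsing real pages, and taking links in page order is an assumption, since the order in which links are selected is not documented here.

```python
from collections import deque


def pages_to_scrape(start_url, links_on_page, max_depth=1, max_linked=10):
    """Return the start URL plus the linked pages that fit within the limits.

    links_on_page maps a URL to the list of URLs found on that page
    (a hypothetical stand-in for downloading and parsing the page).
    """
    selected = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth limit: do not follow links from this page
        for link in links_on_page.get(url, []):
            if len(selected) - 1 >= max_linked:
                return selected  # breadth limit: linked-page budget used up
            if link not in seen:
                seen.add(link)
                selected.append(link)
                queue.append((link, depth + 1))
    return selected
```

With a start page that links to 12 articles, this sketch returns the start page and the first 10 articles; the remaining two, and anything linked only from those articles, would need to be added as separate URLs.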

Keeping content fresh

After external content changes:
  • Click Update to re-scrape files or URLs
  • Or use the Sync icon on an individual source
If a URL requires a login, or its credentials change, syncing may fail. Update the access details and retry.

Group and manage sources

Group sources by product line, team, region, or document type. Sort by newest, oldest, type, or name. Each source offers:
  • Sync
  • Rename
  • Move to group
  • Remove

Why isn’t my agent using the sources I connected?

Several factors affect retrieval:

Data structure

Sources splits content into 2000-character chunks with a 500-character overlap. In very large documents, or where related sections sit far apart, related content can land in separate chunks, which makes relevant retrieval harder. What to do:
  • Restructure documents into smaller, tighter pieces.
  • Repeat key headings or terms.
  • Or curate the material as a managed topic for guaranteed usage.
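The chunking described above can be approximated with a sliding character window. This is a sketch under stated assumptions: only the 2000 and 500 figures come from this page, `chunk_text` is a hypothetical helper, and the real splitter may treat word, sentence, or document boundaries differently.

```python
CHUNK_SIZE = 2000  # characters per chunk, as stated above
OVERLAP = 500      # characters shared by neighbouring chunks


def chunk_text(text, size=CHUNK_SIZE, overlap=OVERLAP):
    """Split text into fixed-size character windows that overlap."""
    step = size - overlap  # each new chunk starts 1500 characters later
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks
```

Under these assumptions, two passages more than 2000 characters apart can never share a chunk, which is why tighter documents and repeated headings or key terms help each chunk carry enough context on its own.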

Update state

Two updates must be current:
  • Source Update → keeps the data in each source fresh
  • Agent Update → applies knowledge connection changes to the agent
Both can be triggered manually. Agent updates also run automatically every few minutes.

Environments, variants, saved changes

Each source must be enabled in the correct environment and variant. Any edits must be saved before leaving the page.

Conflicting information?

If the FAQs and Sources tabs contain conflicting information, content from the FAQs tab is always prioritized.

Viewing Connected Knowledge in Conversation Review

When your agent retrieves content from Connected Knowledge during a conversation, you can see exactly which sources were used in Conversation Review.
  1. Open a conversation in Analytics > Conversations > Voice.
  2. In the Diagnosis dropdown, toggle Sources on.
  3. Each turn where Connected Knowledge was retrieved shows a Sources tag beneath the agent’s response, alongside any matched FAQs.
  4. Click a source name to open an inline preview panel showing the exact text chunks the agent used.
  5. Use Open in Knowledge in the panel to navigate directly to the source in the Knowledge area.
This is useful for:
  • Verifying the agent retrieved the correct content for a given question
  • Debugging cases where the agent’s response seems inaccurate or incomplete
  • Confirming that newly added or updated sources are being picked up
Combine the Sources and Topic citations diagnosis layers to see both Connected Knowledge and FAQs side by side for each turn.

Behavior and configuration notes

  • Use PolyAI’s Raven LLM for best results – it paraphrases structured and unstructured content more naturally.
  • Sources results are given ranking priority to ensure they surface alongside FAQs.
  • Sources and FAQs data are merged at runtime.

Last modified on June 18, 2026