Feeding document knowledge to Zoe RAG

Zoe uses information from your documents to improve the accuracy of its answers. You can add document-based knowledge in two ways, depending on what the documents describe:

  • AI Knowledge Index – best for general information that applies to all products or your brand.
  • Product facts – best for product-specific documents, such as user manuals or data sheets linked to individual items.

Both paths can be used together.

Crawl documents into the AI Knowledge Index

Use this option when the documents contain general product or brand information that should be available to Zoe across the catalog.

  • In Data Platform, go to Pipelines and create a crawler.
  • Under Data sources, select URL list and enter the links to your content pages.

  • In Data preview & export, select AI Knowledge Index as the use case.

Zoe will learn from these documents to provide general answers about your products and categories.

Crawl documents as product facts

Use this method for product‑specific documents, such as user manuals or specification sheets. Zoe connects the content of each document to the corresponding product.

  • In Data Platform, create a crawler pipeline using your product feed (with valid product detail page URLs) as the data source, then select Product Index.

  • Optional: You can limit the crawl to specific catalogs.
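Because the crawler relies on valid product detail page URLs in the feed, it can help to check the feed before creating the pipeline. The following is a minimal sketch, not part of the platform: it assumes a CSV feed with hypothetical `id` and `url` columns and flags rows whose URL is empty or not an absolute http(s) address.

```python
import csv
import io
from urllib.parse import urlparse


def invalid_feed_urls(feed_csv: str, url_column: str = "url") -> list[tuple[str, str]]:
    """Return (product id, url) pairs whose URL is not an absolute http(s) URL.

    The column names "id" and "url" are assumptions for this sketch;
    adjust them to match your own feed.
    """
    problems = []
    for row in csv.DictReader(io.StringIO(feed_csv)):
        url = (row.get(url_column) or "").strip()
        parsed = urlparse(url)
        # A crawlable product page needs a scheme and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append((row.get("id", ""), url))
    return problems


# Hypothetical sample feed: one valid row, one relative URL, one empty URL.
SAMPLE_FEED = """id,name,url
100,Drill X,https://shop.example.com/p/drill-x
101,Saw Y,/p/saw-y
102,Lamp Z,
"""

if __name__ == "__main__":
    for product_id, url in invalid_feed_urls(SAMPLE_FEED):
        print(f"Product {product_id}: invalid URL {url!r}")
```

Rows reported by the check would need fixing in the feed first, since a product without a reachable page cannot have documents collected for it.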

Defining document data points for RAG

For the RAG use case, you only need to define document data points. In most cases, you don’t have to change your crawler setup. Document data points work like regular ones but include an option to identify product documents.

  • In the crawler configuration, go to Step 3: Configure how content is indexed.
  • Create a data point and edit it.

  • Under Set as, choose Product document URLs to tell the system that this data point contains links to product documents.

  • Select how the crawler should extract the document links: XPath, URL pattern, Linked Data, PDF Meta Data, or Content Regex.
  • If you use XPath, enter a rule that points to the correct document link on the product page.
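To illustrate what such an extraction rule selects, the sketch below applies an XPath-style rule and a regular expression to a small product page. It is an illustration only, not the crawler's implementation: the page markup, the `downloads` id, the `doc-link` class, and the URLs are all hypothetical, and Python's `xml.etree.ElementTree` supports only a subset of XPath, so the rule syntax the crawler accepts may differ.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical product page; real pages will have different markup.
PAGE_URL = "https://shop.example.com/p/drill-x"
PAGE_HTML = """<html><body>
  <div id="downloads">
    <a class="doc-link" href="/docs/drill-x-manual.pdf">User manual</a>
    <a class="doc-link" href="https://cdn.example.com/drill-x-datasheet.pdf">Data sheet</a>
    <a href="/p/drill-y">Related product</a>
  </div>
</body></html>"""


def links_by_xpath(html: str, base_url: str) -> list[str]:
    """Select document links with an XPath-style rule.

    In full XPath the equivalent rule would be
    //div[@id='downloads']/a[@class='doc-link']/@href
    """
    root = ET.fromstring(html)  # requires well-formed markup
    anchors = root.findall(".//div[@id='downloads']/a[@class='doc-link']")
    # Resolve relative hrefs against the product page URL.
    return [urljoin(base_url, a.get("href")) for a in anchors]


def links_by_regex(html: str, base_url: str) -> list[str]:
    """Select document links with a content regex that matches PDF hrefs."""
    hrefs = re.findall(r'href="([^"]+\.pdf)"', html)
    return [urljoin(base_url, h) for h in hrefs]


if __name__ == "__main__":
    print(links_by_xpath(PAGE_HTML, PAGE_URL))
    print(links_by_regex(PAGE_HTML, PAGE_URL))
```

Both rules pick up the manual and the data sheet and skip the related-product link. The XPath rule depends on page structure, while the regex depends on the link text pattern, so the better choice depends on which of the two is more stable on your product pages.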

After you define the data point, the crawler collects document URLs for each product. When processing finishes, the documents appear in Assigned Facts and Zoe can use their content in conversations.

Enable document‑based answers in Zoe

To allow Zoe to use information from product documents:

  • Enable the following feature flags:

    • Advisor Studio
    • Zoovu Ontology Expert
    • Zoovu Ontology Expert (alpha)
  • In Advanced configuration, add this parameter (booleans, not strings):

{"AI_CONTENT_RETRIEVER":{"enabled":true, "enableCitations":true}}
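The note about booleans matters because JSON treats `true` and `"true"` as different types. As a minimal sketch, the hypothetical helper below (not a Zoovu tool) parses the parameter and reports any flag whose value is not a real boolean, which can catch a quoted value before it is pasted into Advanced configuration.

```python
import json

# The parameter as it should be entered, with unquoted boolean values.
CONFIG_TEXT = '{"AI_CONTENT_RETRIEVER":{"enabled":true,"enableCitations":true}}'


def non_boolean_flags(config_text: str) -> list[str]:
    """Return the names of retriever flags that are missing or not JSON booleans."""
    config = json.loads(config_text)
    retriever = config.get("AI_CONTENT_RETRIEVER", {})
    return [
        key
        for key in ("enabled", "enableCitations")
        if not isinstance(retriever.get(key), bool)
    ]


if __name__ == "__main__":
    bad = non_boolean_flags(CONFIG_TEXT)
    print("OK" if not bad else f"Not booleans: {bad}")
```

A configuration containing `"enabled":"true"` would be reported with `enabled` in the returned list, because the quoted value parses as a string.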

Zoe can now access and cite relevant document content during conversations.

Processing time

Once the crawler setup is complete, Zoe processes and indexes the documents. Processing time varies depending on the number and size of the files. The documents appear in Zoe after indexing is finished.