A knowledge base is a document collection an agent can query as part of its reasoning. The Workbench has a drag-and-drop UI for managing documents, but for anything programmatic (syncing from a CMS, batch-uploading from S3, refreshing a doc set on a schedule) you’ll want the SDK.
Knowledge base retrieval is wired in automatically only for Agno. For LangChain, OpenAI Agents SDK, and AWS Strands, you need to add the retriever as a tool manually. See the framework pages for details.
Prerequisites
- Complete the Quickstart so the CLI, SDK, and `xpander login` are already set up.
- Python 3.12+ for the local handler.
1. List your knowledge bases
Knowledge bases live at the organization level, not on a specific agent. You attach one or more KBs to an agent in the Workbench, and the agent’s framework gets a retriever wired in automatically (Agno only; other frameworks need the retriever added manually as a tool).
Each `KnowledgeBase` object has these fields:
| Field | Type | What it’s for |
|---|---|---|
| `id` | str | Stable identifier. Use this to attach the KB to an agent or look it up later. |
| `name` | str | Human-readable label. |
| `description` | str | Shown to the agent as context for when to query this KB. |
| `type` | str | `managed` (xpander handles embeddings and storage) or `external` (your own vector store, enterprise-only). |
| `total_documents` | int | Count of documents currently indexed. |
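The fields above are enough to pick a KB out of a listing. Here is a minimal sketch, assuming you have already fetched the list. `KnowledgeBaseInfo` is a hypothetical stand-in that only mirrors the table (it is not the SDK class), and the sample IDs and names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBaseInfo:
    """Hypothetical stand-in mirroring the fields in the table (not the SDK class)."""
    id: str
    name: str
    description: str
    type: str            # "managed" or "external"
    total_documents: int


def find_by_name(kbs, name):
    """Return the first KB whose name matches exactly, or None."""
    return next((kb for kb in kbs if kb.name == name), None)


def empty_managed(kbs):
    """Managed KBs with nothing indexed yet, e.g. candidates for a first sync."""
    return [kb for kb in kbs if kb.type == "managed" and kb.total_documents == 0]


# Made-up sample data standing in for a real listing.
sample = [
    KnowledgeBaseInfo("kb-1", "Product docs", "Public product documentation.", "managed", 42),
    KnowledgeBaseInfo("kb-2", "Runbooks", "Internal on-call runbooks.", "managed", 0),
]
```

Names are human-readable labels and may not be unique, so store the `id` once you have found the KB you want and use that for later lookups.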
2. Create a KB
- Knowledge bases are xpander-managed, which means chunking, embedding, vector storage, and search are configured automatically.
- The agent reads `description` to decide when to query this knowledge base.
To bring your own self-managed knowledge base, contact sales.
3. Add documents
Documents are referenced by URL, not uploaded as bytes. The KB fetches each URL, parses it, chunks it, embeds the chunks, and stores them.
- `sync=True` waits for processing before returning. Use it when you need to search the documents in the same script that uploads them. Use `sync=False` (the default) for batch jobs where you don’t need to block on completion.
- URLs must be reachable from xpander’s infrastructure. For internal documents not on the public web, host them somewhere xpander can reach: S3 with a presigned URL, an internal HTTPS endpoint with IP whitelisting, or a public bucket. There’s no in-memory-blob entry point; if a document only exists in memory (a generated report, a transient export), upload it to storage first.
- Documents behind auth need a credentialless URL. The platform’s fetcher can’t carry your session. Use S3 presigned URLs that include a short-lived token, or make a temporary public link.
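Because the KB fetches each URL itself, it can help to filter out URLs that are obviously unreachable before you add them. The sketch below is a local pre-check only, using the standard library. The function names are hypothetical and not part of the SDK, and passing the check does not guarantee xpander can fetch the URL (a DNS name behind a firewall still looks fine offline).

```python
import ipaddress
from urllib.parse import urlsplit


def unreachable_reason(url):
    """Return why a URL is obviously unfetchable from outside, or None if it looks fine."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return "scheme must be http or https"
    host = parts.hostname
    if not host:
        return "missing host"
    if host == "localhost":
        return "localhost is only reachable from your machine"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None  # a DNS name; reachability cannot be judged offline
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "private or loopback address"
    return None


def split_urls(urls):
    """Split URLs into (ok, rejected), where rejected maps each URL to a reason."""
    ok, rejected = [], {}
    for url in urls:
        reason = unreachable_reason(url)
        if reason is None:
            ok.append(url)
        else:
            rejected[url] = reason
    return ok, rejected
```

Pass only the `ok` list to the add call and log `rejected`, so bad URLs are caught before they show up as failed documents.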
4. List documents in a KB
`status` is most useful when you’ve added documents asynchronously and want to know which ones finished processing. Failed documents stay in the list with their error captured, so you can find and re-add the URLs after fixing whatever was wrong.
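One way to act on the listing is to tally statuses and collect the failed URLs for a retry. This is a sketch under assumptions: `Doc` is a hypothetical stand-in for the document object, and the `url` field name is assumed (only `status` and the values `ready` and `failed` come from this page).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a listed document."""
    url: str      # assumed field name
    status: str   # e.g. "ready" or "failed"


def summarize(docs):
    """Count documents per status."""
    return Counter(d.status for d in docs)


def failed_urls(docs):
    """URLs to fix and re-add, in listing order."""
    return [d.url for d in docs if d.status == "failed"]
```

Feeding `failed_urls(...)` back into the add step after fixing the underlying problem gives you a simple retry loop.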
5. Remove documents
6. Delete a KB
7. Attach a KB to an agent
Sync patterns
Two patterns come up repeatedly in production:
Keeping a KB in sync with a source-of-truth elsewhere. Run a scheduled job that fetches the latest URL list from your CMS, diffs it against `kb.list_documents()`, removes documents no longer in the source, and adds new ones. With `sync=True` on the add, the job is idempotent.
Per-tenant KBs in a multi-tenant system. One KB per customer, attached to a customer-specific agent. The setup overhead is one `kbs.create(name=...)` call when you onboard a customer, plus an `agent.attach_knowledge_base(...)` call to link it.
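The diff step of the first pattern can be sketched as plain set arithmetic. `plan_sync` is a hypothetical helper, not an SDK function. It assumes you have already reduced both sides to lists of URL strings, and it leaves the actual add and remove calls to you.

```python
def plan_sync(source_urls, indexed_urls):
    """Compare the source of truth with what the KB currently holds.

    Returns (to_add, to_remove) as sorted lists so runs are reproducible.
    """
    source, indexed = set(source_urls), set(indexed_urls)
    to_add = sorted(source - indexed)      # in the CMS, missing from the KB
    to_remove = sorted(indexed - source)   # in the KB, gone from the CMS
    return to_add, to_remove
```

Running the plan a second time against the updated KB should yield two empty lists, which is a cheap check that the job is idempotent. Note that this compares URLs literally: if you hand the KB presigned URLs, the signature changes on every run, so you would need to diff on a stable key (for example the object path) instead.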
Troubleshooting
Document status is `failed` after `add_documents`
The most common causes are: the URL isn’t reachable from xpander’s infrastructure (private network, missing auth), the file format isn’t supported, or the file is malformed. Check the `status` field on the document object; it carries the error message. Fix the URL or file and re-add.
`sync=True` times out on large files
Large PDFs and Office documents can take a while to chunk and embed. Switch to `sync=False` and poll `kb.list_documents()` until the document’s status is `ready`. Or break the upload into smaller batches.
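A polling loop for this could look like the sketch below. It is deliberately SDK-agnostic: `fetch_statuses` is a hypothetical callable you supply that wraps the listing call and returns a `{url: status}` dict. The timeout and interval defaults are arbitrary, and any status other than `ready` or `failed` is treated as still pending, which is an assumption.

```python
import time


def wait_until_settled(fetch_statuses, timeout_s=600.0, interval_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until no document is pending, or raise TimeoutError.

    fetch_statuses: zero-argument callable returning {url: status}.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        statuses = fetch_statuses()
        # Anything that is neither finished nor failed counts as pending.
        pending = [u for u, s in statuses.items() if s not in ("ready", "failed")]
        if not pending:
            return statuses
        if clock() >= deadline:
            raise TimeoutError(f"{len(pending)} document(s) still processing")
        sleep(interval_s)
```

The function returns as soon as every document is either `ready` or `failed`, so check the returned dict for failures rather than assuming success.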
Agent doesn't use the KB in its reasoning
Check that the KB is attached to the agent (visible in the Workbench under the agent’s KB tab) and that the agent has been published after attachment. For frameworks other than Agno, you may need to wire the retriever in manually. See the framework pages.
Deleting a KB breaks an attached agent
Detach the KB from all agents before deleting it. In the Workbench, open each agent’s KB tab and remove the reference. Then delete the KB. An agent referencing a deleted KB will silently get no results from KB queries.
Next steps
Semantic search
Query a KB directly from code, outside the agent’s reasoning loop.
Agent KB integration
How attached KBs reach the agent’s reasoning loop.
KB SDK reference
Full method-level docs.
Output Response Filtering
Trim large KB responses before they reach the LLM.

