
# Semantic Search

> Query a knowledge base from code

When a knowledge base is attached to an agent, the framework calls it automatically as part of the agent's reasoning loop, so you never see the search call. Sometimes you want to search directly: building a search box, enriching a record before it goes into a workflow, or testing how a query ranks against your corpus. That's what `kb.search` and `agent.knowledge_bases_retriever()` are for.

<Note>
  Knowledge base retrieval is wired in automatically only for **Agno**. For LangChain, OpenAI Agents SDK, and AWS Strands, you need to add the retriever as a tool manually. See the [framework pages](/developers/frameworks/agno) for details.
</Note>

## Prerequisites

* **Complete the [Quickstart](/developers/quickstart)** so the CLI, SDK, and `xpander login` are already set up.
* **At least one knowledge base** created and populated via the Workbench or the [Document Management SDK](/developers/knowledge/document-management).

## 1. Search a knowledge base

```python theme={"dark"}
from xpander_sdk import KnowledgeBases

kb = KnowledgeBases().get(knowledge_base_id="kb_01H...")

results = kb.search(
    search_query="how do we handle refund requests over $500?",
)

for r in results:
    print(f"score={r.score:.3f}  source={r.document_name}")
    print(r.content[:200])
```

Each result object has these fields:

| Field           | What it's for                                     |
| --------------- | ------------------------------------------------- |
| `content`       | The matched chunk of text, not the full document. |
| `score`         | Relevance score, higher is better.                |
| `document_name` | Name of the source document.                      |
| `document_id`   | Stable ID of the source document.                 |

Results are ordered by score descending.
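
Because each result is a chunk, one document can appear several times in a result list. For a search box you may prefer one hit per document. The sketch below is a minimal illustration, not part of the SDK: `Result` is a hypothetical stand-in that mirrors the four fields in the table, `best_chunk_per_document` is an illustrative helper, and the sample data is made up.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical stand-in for a search result (same four fields as the table)."""
    content: str
    score: float
    document_name: str
    document_id: str


def best_chunk_per_document(results):
    """Keep only the highest-scoring chunk for each document_id."""
    best = {}
    for r in results:
        current = best.get(r.document_id)
        if current is None or r.score > current.score:
            best[r.document_id] = r
    # Re-sort so the output stays in score-descending order.
    return sorted(best.values(), key=lambda r: r.score, reverse=True)


# Made-up sample data, not real search output.
sample = [
    Result("Refunds over $500 need manager approval.", 0.91, "refund-policy.md", "doc_a"),
    Result("Contact support to start a return.", 0.74, "returns-faq.md", "doc_b"),
    Result("Refunds are issued within 5 business days.", 0.69, "refund-policy.md", "doc_a"),
]

top_docs = best_chunk_per_document(sample)
for r in top_docs:
    print(f"score={r.score:.3f}  source={r.document_name}")
```

With real results, you would pass the return value of `kb.search(...)` in place of `sample`, assuming the result objects expose the fields as attributes, as in the example above.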

## 2. Tune the query

Two parameters control what a search returns:

* **`top_k`** controls how many results are returned. It defaults to 10. Lower it for tightly scoped queries where you only want the best match; raise it when feeding results into a downstream LLM that can re-rank or summarize.

* **`use_bubble`** enables bubble search, which returns the matched chunk *plus surrounding context* rather than the chunk in isolation. Use it when chunks are small (e.g., per-paragraph) and the agent needs surrounding text to make sense of a match. Skip it when chunks are already large enough to be self-contained. The companion `bubble_size` parameter sets the character window around each match and defaults to 1000.

```python highlight={3-5} theme={"dark"}
results = kb.search(
    search_query="incident escalation policy",
    top_k=3,
    use_bubble=True,
    bubble_size=2000,   # character window around each match, default 1000
)
```
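
A higher `top_k` or a larger bubble means more text reaches whatever consumes the results. If that consumer is an LLM prompt, one option is to pack chunks best-first until a character budget is used up. The sketch below assumes results expose `document_name` and `content` as attributes; `build_context`, the `max_chars` budget, and the sample data are illustrative, not SDK features.

```python
from types import SimpleNamespace


def build_context(results, max_chars=1500):
    """Join result chunks, best first, until the character budget is reached.

    The budget counts the labeled chunks only, not the blank-line separators.
    """
    parts, used = [], 0
    for r in results:
        block = f"[{r.document_name}] {r.content}"
        if used + len(block) > max_chars:
            break  # results are score-ordered, so the rest is less relevant
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


# Made-up sample data standing in for kb.search(...) output.
sample = [
    SimpleNamespace(document_name="escalation.md",
                    content="Page the on-call lead within 15 minutes.", score=0.90),
    SimpleNamespace(document_name="oncall.md",
                    content="Sev1 incidents escalate to the director.", score=0.80),
    SimpleNamespace(document_name="pager.md",
                    content="Pager rotations change every Monday.", score=0.70),
]

context = build_context(sample, max_chars=120)
print(context)
```

With a 120-character budget, only the first two chunks fit, so the lowest-scoring one is dropped.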

## 3. Search across multiple KBs from an agent

If your agent has more than one knowledge base attached, `agent.knowledge_bases_retriever()` returns a callable that searches all of them and merges results by score:

```python highlight={3-4} theme={"dark"}
from xpander_sdk import Agents

agent = Agents().get(agent_id="agt_01H...")
search = agent.knowledge_bases_retriever()

results = search(query="quarterly metrics", num_documents=10)
for r in results:
    print(r["score"], r["document_name"], r["content"][:100])
```

Use this when you want the agent's knowledge base context without its reasoning loop.

`num_documents` controls the merged result count. It's the equivalent of `top_k` across the combined corpus.
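
To make "merges results by score" concrete, the sketch below merges two score-descending lists and truncates to `num_documents`. It illustrates the idea only and is not the SDK's implementation: `merge_by_score` is a hypothetical helper, the sample data is made up, and it assumes scores from different knowledge bases are directly comparable.

```python
import heapq
from itertools import islice


def merge_by_score(result_lists, num_documents=10):
    """Merge score-descending lists into one score-descending list, then truncate."""
    # heapq.merge expects every input list to already be sorted in the same order.
    merged = heapq.merge(*result_lists, key=lambda r: r["score"], reverse=True)
    return list(islice(merged, num_documents))


# Made-up per-KB results, shaped like the dicts in the example above.
kb_a = [
    {"score": 0.92, "document_name": "q3-report.pdf", "content": "Revenue grew..."},
    {"score": 0.55, "document_name": "q2-report.pdf", "content": "Churn fell..."},
]
kb_b = [
    {"score": 0.81, "document_name": "metrics-glossary.md", "content": "ARR means..."},
    {"score": 0.40, "document_name": "okrs.md", "content": "Objective 1..."},
]

merged = merge_by_score([kb_a, kb_b], num_documents=3)
for r in merged:
    print(r["score"], r["document_name"])
```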

## Troubleshooting

<AccordionGroup>
  <Accordion title="Search returns no results">
    Check that the KB has documents with `status=ready` (see [Document Management](/developers/knowledge/document-management)). A KB with all documents still processing or failed will return empty results. Also confirm the query isn't empty and `top_k` is greater than 0.
  </Accordion>

  <Accordion title="Scores are all very low">
    Low scores usually mean the query's wording doesn't match the wording used in the documents, or the content is too sparse. Try rephrasing the query to match how the documents describe the topic. If the corpus is large and diverse, raising `top_k` and letting a downstream LLM re-rank often helps more than query reformulation.
  </Accordion>

  <Accordion title="`knowledge_bases_retriever()` returns an empty list">
    No KBs are attached to the agent. Attach at least one via the Workbench (agent's KB tab) or via `agent.attach_knowledge_base(knowledge_base_id=...)` in code. Then reload the agent before calling the retriever.
  </Accordion>

  <Accordion title="Agent doesn't call the KB during reasoning">
    For frameworks other than Agno, the retriever isn't wired in automatically; you need to add it as a tool explicitly. See the [framework pages](/developers/frameworks/agno) for the per-framework setup. Also check that the agent is published after attaching the KB.
  </Accordion>
</AccordionGroup>

## Next steps

<CardGroup cols={2}>
  <Card title="Document management" icon="file-arrow-up" href="/developers/knowledge/document-management">
    Add, list, and remove documents from a KB programmatically.
  </Card>

  <Card title="Agent KB integration" icon="plug" href="/developers/frameworks/agno">
    How attached KBs reach the agent's reasoning loop in each framework.
  </Card>

  <Card title="Output Response Filtering" icon="filter" href="/developers/tools/output-response-filtering">
    Trim large KB responses before they reach the LLM.
  </Card>

  <Card title="KB SDK reference" icon="book" href="/developers/sdk-reference/knowledge-bases/overview">
    Full method-level docs.
  </Card>
</CardGroup>
