Agno is the framework xpander.ai has the deepest integration with. In this guide, we’ll build a production-ready agent with credentials, instructions, tools, knowledge-base access, session DB, memory, guardrails, and a context-optimization pipeline.

Prerequisites

  • Complete the Quickstart so the CLI, SDK, and xpander login are already set up.
  • Python 3.12+ for the local handler.
  • An LLM provider key in your shell, such as OPENAI_API_KEY or ANTHROPIC_API_KEY. (These keys are only used locally.)
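
The last two prerequisites can be checked before you scaffold anything. The snippet below is a minimal sketch, not part of the xpander CLI: preflight and PROVIDER_KEYS are hypothetical names, and the key list only covers the two providers named above.

```python
import os
import sys

# Provider variables named in the prerequisites; extend for other providers.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def preflight(version_info=sys.version_info, env=os.environ):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version_info[:2]) < (3, 12):
        problems.append("Python 3.12+ is required for the local handler")
    if not any(env.get(key) for key in PROVIDER_KEYS):
        problems.append("no LLM provider key found in the environment")
    return problems


# Report anything missing in the current shell.
for problem in preflight():
    print("preflight:", problem)
```

Both arguments default to the live interpreter and environment, so you can also pass fake values to exercise the check.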

1. Install

# Quote the package name so zsh doesn't expand the brackets.
pip install "xpander-sdk[agno]"

2. Set up scaffolding

# Create an agent named "my-first-agent"
xpander agent new \
    --name "my-first-agent" \
    --framework "agno" \
    --folder "."
These files get created:

./
├── xpander_handler.py        # Your @on_task entry point. The file you'll edit most.
├── xpander_config.json       # Agent ID, organization ID, API key, framework selection.
├── agent_instructions.json   # role / goal / general (the agent's system prompt).
├── requirements.txt          # Python dependencies (xpander-sdk[agno] is pinned here).
├── Dockerfile                # Used by xpander agent deploy.
└── .env                      # XPANDER_API_KEY, XPANDER_ORGANIZATION_ID, XPANDER_AGENT_ID.

xpander_config.json reference

xpander_config.json
{
  "agent_id": "agt_01H...",
  "organization_id": "org_01H...",
  "api_key": "xpd_...",
  "framework": "agno"
}
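
If you edit this file by hand, a quick structural check can catch a missing field before the handler starts. This is a sketch under the assumption that the four keys shown above are all required; load_config and missing_config_keys are hypothetical helpers, not SDK functions.

```python
import json
from pathlib import Path

# The four keys shown in the reference above.
REQUIRED_KEYS = ("agent_id", "organization_id", "api_key", "framework")


def missing_config_keys(config: dict) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def load_config(path: str = "xpander_config.json") -> dict:
    """Read the config file and fail loudly if a required key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = missing_config_keys(config)
    if missing:
        raise ValueError(f"{path} is missing: {', '.join(missing)}")
    return config
```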

3. Create task handler

The full pattern, wrapped in @on_task so the platform routes tasks to it:
xpander_handler.py
from dotenv import load_dotenv
load_dotenv()  # loads XPANDER_API_KEY and friends before any sdk import

from xpander_sdk import on_task, Task, Backend, Tokens
from agno.agent import Agent

@on_task
async def handler(task: Task) -> Task:
    # 1. Fetch the agent's full configuration from xpander's control plane.
    backend = Backend(configuration=task.configuration)
    agno_args = await backend.aget_args(task=task)

    # 2. Build the framework's own Agent with that config.
    agno_agent = Agent(**agno_args, debug_mode=True)

    # 3. Run the LLM loop with the task's input, files, and images.
    result = await agno_agent.arun(
        input=task.to_message(),
        files=task.get_files(),
        images=task.get_images(),
    )

    # 4. Write the result back so the platform can store and display it.
    task.result = result.content
    task.tokens = Tokens(
        prompt_tokens=result.metrics.input_tokens,
        completion_tokens=result.metrics.output_tokens,
    )
    task.used_tools = [t.tool_name for t in (result.tools or [])]
    return task
Here’s what’s happening:
  1. Backend(configuration=task.configuration) picks up the API key, organization ID, and base URL from the active task. No need to read .env directly.
  2. await backend.aget_args(task=task) calls the control plane and returns a dict with the full agent configuration (instructions, tools, model, knowledge bases, session storage, memory, guardrails). Always pass task=task inside an @on_task handler so task-level overrides (instructions_override, expected_output, output_schema) are merged in.
  3. Agent(**agno_args, debug_mode=True) splats that dict into Agno’s own Agent class. debug_mode prints tool calls and token usage; remove it for production.
  4. task.to_message() flattens the prompt text, file URLs, and any inline-readable file content into a single string ready for Agno. task.get_files() and task.get_images() return Agno-typed agno.media.File and agno.media.Image objects.
  5. Reporting task.tokens and task.used_tools is optional. Skipping them just means the metrics view in Agent Studio shows “no usage data” for that run.
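
Step 4 of the handler reads result.metrics and result.tools directly. If you would rather not assume both are populated on every run, the reporting values can be collected defensively first. The sketch below uses a stand-in object instead of a real Agno result; summarize_run is a hypothetical helper, and whether metrics can actually be missing is an assumption, not documented behavior.

```python
from types import SimpleNamespace


def summarize_run(result) -> dict:
    """Collect token counts and tool names, tolerating missing attributes."""
    metrics = getattr(result, "metrics", None)
    tools = getattr(result, "tools", None) or []
    return {
        "prompt_tokens": getattr(metrics, "input_tokens", 0) or 0,
        "completion_tokens": getattr(metrics, "output_tokens", 0) or 0,
        "used_tools": [tool.tool_name for tool in tools],
    }


# Stand-in for an Agno run result, for illustration only.
fake_result = SimpleNamespace(
    metrics=SimpleNamespace(input_tokens=120, output_tokens=45),
    tools=[SimpleNamespace(tool_name="search_tickets")],
)
print(summarize_run(fake_result))
```

In the handler, the returned values would feed Tokens(prompt_tokens=..., completion_tokens=...) and task.used_tools exactly as in the example above.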

backend.aget_args reference

Input parameters

backend.aget_args accepts these arguments:
Key | Type | Notes
--- | --- | ---
task | Task | The active task inside an @on_task handler. Pass it so task-level overrides (instructions_override, expected_output, output_schema) are merged into the resolved args. Required inside @on_task; optional in scripts and notebooks where you can pass agent_id= instead.
override | dict | dict.update()-merged onto the resolved args after everything else is built. Accepts any key Agno’s Agent.__init__ takes (see below). Optional. Defaults to None.
tools | list[Callable] | Appended to the resolved tools list (connectors plus custom plus MCP). Use it to inject ephemeral tools without touching the agent’s graph. Optional. Defaults to [].
What can override change?

override accepts any key Agno’s Agent.__init__ takes. Two ways to use it:
  1. Replace any value the SDK already resolves: any key from the output-params table below (model, instructions, tools, db, knowledge_retriever, pre_hooks, output_schema, and so on).
  2. Add Agno-native kwargs the SDK doesn’t set itself. Common ones:

Key | What it controls
--- | ---
temperature | Sampling temperature for the model.
max_tokens | Cap on tokens generated per response.
show_tool_calls | Print tool calls during the run.
debug_mode | Print full LLM payloads and timings.
use_json_mode | Force JSON-mode responses.
markdown | Render responses as Markdown.

See the Agno Agent reference for the full surface.

Setting override["model"] skips the SDK’s own model resolution entirely, so use it whenever you want a different model client without re-implementing credential handling.

Example using override to A/B-test two models against the same agent definition:
# Run the same agent against a different model for one task.
from agno.models.anthropic import Claude

args = await backend.aget_args(
    task=task,
    override={"model": Claude(id="claude-sonnet-4-5", api_key="...")},
)
Example using override to tune Agno-native sampling parameters:
# Lower temperature and bound max tokens for a deterministic run.
args = await backend.aget_args(
    task=task,
    override={"temperature": 0.2, "max_tokens": 800},
)
Example using tools to inject an ephemeral test tool:
def _local_clock() -> str:
    """Return the developer's local clock for debugging."""
    from datetime import datetime
    return datetime.now().isoformat()

args = await backend.aget_args(task=task, tools=[_local_clock])
For anything more invasive (a new pre-hook, a different DB), grab the args dict and mutate it directly before splatting into Agent(...). The dict is yours; the SDK won’t reach back in.

Output parameters

Calling backend.aget_args returns these fields:
Key | Type | Notes
--- | --- | ---
id | str | Agent ID from xpander_config.json. Always set.
name | str | Agent name from the control plane. Always set.
description | str | Sourced from agent_instructions.json general. Always set.
model | agno.models.base.Model | Instantiated with credentials resolved. The agent’s custom LLM key wins on cloud, your shell env var wins locally. Always set.
instructions | str | The Agent Studio instructions (or task.instructions_override), with context-optimization, workspace-output, and compact-tool guidance appended. Always set.
tools | list[Callable] | Connectors, custom @register_tool functions, MCP servers (with OAuth handled), xpcompact_context, plus Agno’s think / analyze when reasoning_tools_enabled=True. Always set.
tool_hooks | list[Callable] | One internal hook for retries on transient errors, stuck-loop detection, activity reporting, and Layer 1 microcompaction. Append to it, don’t replace. Always set.
compression_manager | XPanderContextOptimizer | Layered context-optimization pipeline (microcompaction, auto-compaction, manual via xpcompact_context, emergency, pre-retry). Always set.
add_datetime_to_context | bool | Inject the current datetime into context on every run. Always set. Defaults to True.
store_events | bool | Persist Agno run events. Always set. Defaults to True.
db | AsyncPostgresDb or PostgresDb | Scoped to a per-agent schema. Set when session_storage, user_memories, or agent_memories is on.
add_history_to_context, session_id, user_id, num_history_runs, max_tool_calls_from_history, enable_session_summaries | various | Driven by agno_settings.session_storage (see settings table below). Set when session_storage=True.
enable_user_memories, memory_manager, enable_agentic_memory | various | Per-user facts that persist across sessions. Set when user_memories=True.
add_culture_to_context, update_cultural_knowledge, enable_agentic_culture | bool | Org-wide facts. Single-agent only (skipped on Teams). Set when agent_memories=True.
learning | bool | Set when agno_settings.learning=True.
tool_call_limit | int | Cap on tool calls per run. Set when configured on the agent.
knowledge_retriever, search_knowledge | callable, bool | Agno calls the retriever automatically during the loop. Set when KBs are attached.
pre_hooks | list | PIIDetectionGuardrail, PromptInjectionGuardrail, OpenAIModerationGuardrail per agno_settings. Set when guardrails are on.
output_schema, use_json_mode, markdown | various | Structured output and Markdown formatting. Task-level output format and schema overrides applied. Set per output settings.
expected_output, additional_context | str | Forwarded from the agent definition and task. Set when present.
members, add_member_tools_to_context, share_member_interactions, show_members_responses | various | The SDK recursively builds each sub-agent as AgnoAgent or AgnoTeam. Set when agent.is_a_team.
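
The tool_hooks row above says to append rather than replace. A minimal sketch of doing that on the args dict follows. It assumes Agno's tool-hook signature of (function_name, function_call, arguments), so check the Agno reference for the version you have installed; timing_hook and add_tool_hook are hypothetical names, and the hook shown is synchronous.

```python
import time
from typing import Any, Callable


def timing_hook(function_name: str, function_call: Callable, arguments: dict[str, Any]):
    """Run the wrapped tool call and print how long it took."""
    started = time.perf_counter()
    try:
        return function_call(**arguments)
    finally:
        elapsed_ms = (time.perf_counter() - started) * 1000
        print(f"{function_name} took {elapsed_ms:.1f} ms")


def add_tool_hook(args: dict, hook: Callable) -> dict:
    """Append a hook while keeping whatever the SDK already put in tool_hooks."""
    args["tool_hooks"] = [*(args.get("tool_hooks") or []), hook]
    return args
```

Inside a handler this would sit between the two calls you already have: agno_args = await backend.aget_args(task=task), then add_tool_hook(agno_args, timing_hook), then Agent(**agno_args).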

4. Edit the agent’s system prompt

agent_instructions.json contains the agent’s system prompt and has exactly three fields:
agent_instructions.json
{
  "role": [
    "You are a customer support assistant for Acme.",
    "Always confirm the customer's account ID before taking any action."
  ],
  "goal": [
    "Resolve the customer's issue in as few turns as possible.",
    "Escalate to a human if the request involves a refund over $500."
  ],
  "general": "Be concise, professional, and friendly. Never invent policy details; if you don't know something, say so and offer to escalate."
}
Save the file and the next xpander agent dev or xpander agent deploy syncs it to the control plane.
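
Because the file has exactly three fields, a small check before syncing can catch typos such as a misspelled key. The sketch below infers the field types from the example above (role and goal as lists of strings, general as a string); that typing is an assumption, and validate_instructions is a hypothetical helper rather than something the CLI provides.

```python
import json

# Field names taken from the example above.
EXPECTED_FIELDS = {"role", "goal", "general"}


def validate_instructions(doc: dict) -> list[str]:
    """Return a list of problems found in an agent_instructions.json payload."""
    errors = []
    for key in ("role", "goal"):
        value = doc.get(key)
        if not (isinstance(value, list) and all(isinstance(item, str) for item in value)):
            errors.append(f"{key} must be a list of strings")
    if not isinstance(doc.get("general"), str):
        errors.append("general must be a string")
    extra = sorted(set(doc) - EXPECTED_FIELDS)
    if extra:
        errors.append(f"unexpected fields: {', '.join(extra)}")
    return errors


sample = json.loads('{"role": ["You are a support assistant."], "goal": ["Resolve issues."], "general": "Be concise."}')
print(validate_instructions(sample))
```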

5. Set up streaming (optional)

For token-by-token output, decorate an async def that yields TaskUpdateEvent objects instead of returning a Task. The decorator detects the difference automatically.
streaming_handler.py
from datetime import datetime, timezone
from xpander_sdk import on_task, Task, Backend, TaskUpdateEvent, TaskUpdateEventType
from agno.agent import Agent
from agno.run.agent import RunEvent, RunOutput

@on_task
async def handler(task: Task):
    backend = Backend(configuration=task.configuration)
    agno_agent = Agent(**(await backend.aget_args(task=task)))

    final_output = None
    # Agno emits a stream of events: chunks, tool calls, final RunOutput.
    async for event in await agno_agent.arun(
        input=task.to_message(),
        stream=True,
        stream_events=True,
        yield_run_output=True,
    ):
        if isinstance(event, RunOutput):
            final_output = event
        elif hasattr(event, "event") and event.event == RunEvent.run_content and event.content:
            # Forward each token chunk to the platform's SSE stream.
            yield TaskUpdateEvent(
                type=TaskUpdateEventType.Chunk,
                task_id=task.id,
                organization_id=task.organization_id,
                time=datetime.now(timezone.utc),
                data=event.content,
            )

    task.result = final_output.content if final_output else ""
    yield TaskUpdateEvent(
        type=TaskUpdateEventType.TaskFinished,
        task_id=task.id,
        organization_id=task.organization_id,
        time=datetime.now(timezone.utc),
        data=task,
    )
Here’s what’s happening:
  1. stream=True, stream_events=True, yield_run_output=True tell Agno to emit events instead of buffering. The handler receives chunks, tool-call events, and a final RunOutput.
  2. The Chunk event forwards each token to the platform’s SSE stream so clients render output as it arrives.
  3. The TaskFinished event signals the end of the stream and carries the final task back to the platform.
A streaming handler exposes itself only through POST /invoke, returning Server-Sent Events. The platform’s SSE listener for cloud-deployed agents expects a regular handler that returns a Task. So if you need both an interactive streaming experience and platform-routed tasks, run two handlers, or have your streaming endpoint proxy through a regular handler.
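
On the client side, a Server-Sent Events response is a sequence of text lines in which a blank line ends each event. The sketch below is a generic parser for the data field following the SSE format; it is not xpander-specific. This guide does not describe the payload each event carries on /invoke, so treat the strings it yields as opaque until you have inspected a real response. iter_sse_data is a hypothetical name.

```python
def iter_sse_data(lines):
    """Yield the data payload of each complete event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The format strips one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event without a closing blank line is incomplete and is dropped.


stream = ["data: Hel\n", "\n", ": ping\n", "data: lo\n", "\n"]
print(list(iter_sse_data(stream)))
```

Any HTTP client that can iterate over response lines can feed this generator.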

6. Test local development

Run the handler with the dev server. Tasks created from any channel (REST, Slack, Agent Studio) route to your laptop:
# Starts the @on_task HTTP server and subscribes to the platform event stream.
xpander agent dev
Routing cloud traffic to a local instance is a preview feature.

Inbound traffic goes to your deployed container by default. When a local instance is running via xpander agent dev, it takes over and all tasks route to your locally running agent instead. Only one can be active at a time.

If a container is already deployed, run xpander agent stop first, then start dev. When you stop the local server, the cloud-based container automatically reclaims traffic.
For one-shot testing without a server:
# Calls your handler exactly once with the given prompt and exits.
python3 xpander_handler.py \
    --invoke \
    --prompt "Quick test" \
    --output_format json \
    --output_schema '{"answer":"string"}'
--output_format and --output_schema are useful for testing structured output without changing the agent’s settings in the control plane.

7. Deploy to xpander cloud

When the local handler works, push it as a managed container:
# Bundles the project, builds a Docker image, rolls out the new version.
xpander agent deploy
What happens:
  1. The CLI bundles xpander_handler.py, requirements.txt, the Dockerfile, and the rest of the project.
  2. xpander builds a Docker image, pushes it, and rolls out a new immutable version. The previous version stays available for instant rollback.
  3. Once the rollout finishes, the platform routes inbound tasks to the new container. The first deploy takes a couple of minutes; subsequent deploys are faster thanks to layer caching.
Stream logs from the running container while the rollout settles:
xpander agent logs

Secrets and environment variables

.env ships with the deploy by default. For values you don’t want bundled into the image (production keys, rotating secrets), upload them to xpander’s secret store instead:
# Pushes the variables in your local .env to the agent's secret store
# and injects them into the container at runtime.
xpander secrets-sync
Re-run xpander secrets-sync whenever you rotate a secret. Don’t commit .env to source control either way.

Lifecycle hooks

Containers support @on_boot and @on_shutdown for one-time resource setup and teardown. Use them for caches you want to warm before the first task lands, or open connections you want to close cleanly when the container is replaced:
from xpander_sdk import on_boot, on_shutdown

@on_boot
async def warmup():
    # Pre-load a model, open a DB pool, fetch config, etc.
    ...

@on_shutdown
async def cleanup():
    # Flush queues, close connections, etc.
    ...

When to redeploy

Anything that changes Python code, dependencies, or the Dockerfile needs a redeploy. The control-plane bits stay live without one:
  • Live (no redeploy): instructions, model selection, memory settings, attached agents, attached knowledge bases, tool selection from the catalog.
  • Needs xpander agent deploy: any change to xpander_handler.py, requirements.txt, Dockerfile, or other files in the container.
Full deployment reference, including rollback and lifecycle controls, is on the Containers page.

Inspect the deployment settings

After every xpander agent dev or xpander agent deploy, the live Agno settings for the cloud version are saved on the agent. Read them back from the Agents() class to confirm what’s actually running:
from xpander_sdk import Agents

# Top-level await works in a notebook; in a script, run this inside an async function.
xpander_agent = await Agents().aget(agent_id="agt_01H...")
print(xpander_agent.agno_settings.session_storage)

Setting | Default | What it does
--- | --- | ---
session_storage | True | Postgres-backed conversation history within a thread. Sets add_history_to_context, session_id, and user_id on the args.
num_history_runs | 10 | How many prior runs to load into context.
max_tool_calls_from_history | 0 | Cap on tool calls replayed from history. 0 means no cap.
session_summaries | False | Generate summaries of completed sessions (enable_session_summaries).
user_memories | False | Per-user facts that persist across sessions. Manual mode. Adds enable_user_memories and a MemoryManager.
agentic_memory | False | Per-user facts, agentic-managed (the agent decides when to remember). Sets enable_agentic_memory. Requires user_memories.
agent_memories | False | Org-wide facts the agent carries into every conversation. Adds add_culture_to_context and update_cultural_knowledge. Single-agent only (skipped on Teams).
agentic_culture | False | Org-wide facts, agentic-managed. Sets enable_agentic_culture instead of update_cultural_knowledge.
learning | False | The agent learns and improves with every interaction.
tool_call_limit | None | Max tool calls per run. None means unlimited.
coordinate_mode | False | Force Team mode (also auto-detected when the agent has sub-agents attached).
pii_detection_enabled | False | Adds a PIIDetectionGuardrail pre-hook.
pii_detection_mask | True | Mask detected PII rather than blocking the request.
prompt_injection_detection_enabled | False | Adds a PromptInjectionGuardrail pre-hook.
openai_moderation_enabled | False | Adds an OpenAIModerationGuardrail pre-hook.
openai_moderation_categories | None | Restrict moderation to specific categories. None means all.
reasoning_tools_enabled | False | Add Agno’s think and analyze reasoning tools. Skipped automatically for Teams.

You change agno_settings in Agent Studio, not in code. By design, there is no SDK call to flip them: changing memory settings affects billing and persistence semantics, so they live in the control plane.
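
The settings table and the output-parameters table together imply which optional keys should appear in the args dict. The sketch below encodes three of those rules for a quick sanity check. It takes a plain dict rather than the real agno_settings object, and expectations is a hypothetical helper that only mirrors what the tables state.

```python
# Guardrail flags from the settings table; each one adds a pre-hook.
GUARDRAIL_FLAGS = (
    "pii_detection_enabled",
    "prompt_injection_detection_enabled",
    "openai_moderation_enabled",
)


def expectations(settings: dict) -> dict:
    """Predict which optional args keys should appear for these settings."""
    session_storage = settings.get("session_storage", True)  # documented default
    user_memories = settings.get("user_memories", False)
    agent_memories = settings.get("agent_memories", False)

    warnings = []
    if settings.get("agentic_memory") and not user_memories:
        warnings.append("agentic_memory requires user_memories")

    return {
        # db is set when any of the three storage-backed features is on.
        "db": bool(session_storage or user_memories or agent_memories),
        # pre_hooks is set when at least one guardrail is enabled.
        "pre_hooks": any(settings.get(flag) for flag in GUARDRAIL_FLAGS),
        "warnings": warnings,
    }


print(expectations({"session_storage": False, "pii_detection_enabled": True}))
```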

Troubleshooting

Backend.aget_args() reads task.instructions_override while building the args. Inside an @on_task handler, always pass the active task: await backend.aget_args(task=task). The agent_id-only form is supported outside a handler (in scripts and notebooks), but inside one the active task is the source of truth for instruction overrides.
Session storage is on by default, so the args dict includes a db wired to xpander’s Postgres. For cloud-hosted agents this is automatic. For self-hosted or air-gapped deployments, the database needs to be reachable from your container. Check the connection string with await agent.aget_connection_string() and confirm the host is reachable. To turn session storage off, flip agno_settings.session_storage to False in Agent Studio.
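
To confirm the host is reachable from inside the container, a plain TCP check is usually enough. The sketch below assumes the connection string is URL-style (for example postgresql://user:pass@host:5432/dbname); the format returned by aget_connection_string() is not specified in this guide, so adjust the parsing if yours differs. host_port and is_reachable are hypothetical helpers.

```python
import socket
from urllib.parse import urlsplit


def host_port(dsn: str, default_port: int = 5432) -> tuple[str, int]:
    """Extract host and port from a URL-style connection string."""
    parts = urlsplit(dsn)
    if not parts.hostname:
        raise ValueError("connection string has no host")
    return parts.hostname, parts.port or default_port


def is_reachable(dsn: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the database host can be opened."""
    try:
        with socket.create_connection(host_port(dsn), timeout=timeout):
            return True
    except OSError:
        return False
```

A successful TCP connection only shows that the network path is open; authentication and schema access are still checked by Postgres itself.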
Custom LLM keys configured on the agent take precedence on cloud deployments. Locally, your shell’s OPENAI_API_KEY (or the equivalent for your provider) wins. If you want the cloud-side custom key locally too, mirror it into your .env.
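If pip install xpander-sdk[agno] fails in zsh with a "no matches found" error, zsh is treating the square brackets as a glob pattern. Quote the package name: pip install "xpander-sdk[agno]".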

Next steps

Quickstart

The 10-minute scaffold-to-deploy walkthrough that produced the handler shown above.

Custom Tools

Wrap private APIs as tools with @register_tool and pass them through the args dict.

Memory & State

The deep dive on session_storage, user memories, and agent memories.

Containers

Ship the handler as a container managed by xpander.

Core Concepts

The SDK class names mapped onto agents, tasks, threads, and memory.

Frameworks overview

What’s auto-wired vs. manual for Agno, OpenAI Agents SDK, LangChain, and AWS Strands.