

Every Xpander agent is powered by a large language model. The model determines how your agent thinks, how it writes, how much it costs, and how fast it responds. This page covers model selection, supported providers and gateways, model settings, extra headers, Planning and Reasoning Modes, multi-agent orchestration, and LLM guardrails.

Models

To change what model your agent uses:
  1. Open the General tab: In the Agent Studio, click the gear icon and go to General → LLM Settings.
  2. Choose a provider: Select your provider from the dropdown. If using built-in keys, you're done; Xpander handles billing. If using your own keys, enter them in the API key field.
  3. Select a model: Pick from the Featured models list, or toggle to Custom and enter any model ID your provider supports. Custom mode is useful for fine-tuned models, newly released models, or provider-specific variants.
  4. Publish: Click Publish to apply the new model. Your agent will use it for all subsequent conversations.

Supported Providers

OpenAI
Built-in access (pay-as-you-go) or BYOK
  • GPT-5.4, GPT-5.3 Chat, GPT-5.2, GPT-5.1, GPT-5 Nano, GPT-5, GPT-5 Mini
  • GPT-4.1, GPT-4.1-mini
  • GPT-4o, GPT-4o Mini, GPT-4 Turbo
  • GPT-3.5 Turbo
Anthropic
Built-in access (pay-as-you-go) or BYOK
  • Claude Sonnet 4.6, Claude Opus 4.6
  • Claude Sonnet 4.5, Claude Opus 4.5
  • Claude Opus 4, Claude Sonnet 4
  • Claude Sonnet 3.7, Claude Sonnet 3.5
Amazon Bedrock
BYOK: requires AWS credentials with Bedrock access
  • Claude Sonnet 4.6, Claude Opus 4.6
  • Claude Sonnet 4.5, Claude Opus 4
  • Claude Sonnet 4, Claude Sonnet 3.7, Claude Sonnet 3.5
  • Claude Haiku 3.5
  • Amazon Titan Text Express
Azure OpenAI
BYOK: requires Azure AI Foundry credentials. Azure AI Foundry provides access to OpenAI models through Azure's infrastructure.
  • GPT-5.2, GPT-5.1, GPT-5 Nano, GPT-5, GPT-5 Mini
  • GPT-4.1, GPT-4.1-mini
  • GPT-4o, GPT-4o Mini
  • GPT-4 Turbo, GPT-3.5 Turbo
Set the API Base URL to include your deployment name without the completions path: https://your-resource.openai.azure.com/openai/deployments/gpt-4o
Do NOT include /completions?api-version=... at the end.
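A small sketch of the Azure base-URL rule above (the `resource` and `deployment` names are placeholders for your own values; the helper itself is illustrative, not an xpander API):

```python
def azure_base_url(resource: str, deployment: str) -> str:
    """Build the API Base URL format described above for Azure deployments.

    The deployment name is included, but the URL stops before the
    completions path and api-version query string.
    """
    url = f"https://{resource}.openai.azure.com/openai/deployments/{deployment}"
    # Guard against the common mistake of appending the completions path.
    assert "/completions" not in url and "api-version" not in url
    return url

print(azure_base_url("your-resource", "gpt-4o"))
# https://your-resource.openai.azure.com/openai/deployments/gpt-4o
```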
ByteDance ModelArk
BYOK: requires ByteDance ModelArk API key. No featured models; custom model identifier only. Access ByteDance's model inference platform by entering your model ID in Custom mode.
Fireworks AI
BYOK: requires Fireworks AI API key
  • GLM-4.6
  • Kimi K2 Instruct 0905
  • DeepSeek V3.1
  • OpenAI gpt-oss-120b, OpenAI gpt-oss-20b
  • Qwen3 235B A22B Thinking 2507, Qwen3 235B A22B Instruct 2507
Google AI Studio
BYOK: requires Google AI Studio API key
  • Gemini 2.0 Flash, Gemini 2.0 Flash Lite
  • Gemini 2.5 Pro, Gemini 3 Pro
NVIDIA NIM
BYOK: requires NVIDIA API key.
Meta Llama Models:
  • Llama 3.1 8B Instruct, Llama 3.1 70B Instruct, Llama 3.1 405B Instruct
  • Llama 3.2 1B Instruct, Llama 3.2 3B Instruct
  • Llama 3.3 70B Instruct
  • Llama 4 Scout 17B, Llama 4 Maverick 17B
Mistral Models:
  • Mistral 7B Instruct v0.3, Mistral Small 3.2 24B Instruct
NVIDIA Nemotron:
  • Nemotron Nano 4B v1.1, Nemotron Nano 8B v1, Nemotron Ultra 253B v1

Supported Gateways

Cloudflare AI Gateway
BYOK: requires Cloudflare API key + base URL. No featured models; requires a custom model identifier, API key, and base URL. Cloudflare AI Gateway provides caching, rate limiting, and observability for LLM requests.
Nebius
BYOK: requires Nebius API key. Nebius provides inference for open-source models.
Meta Llama:
  • Meta-Llama-3.1-8B-Instruct-fast, Meta-Llama-3.1-8B-Instruct, Llama-Guard-3-8B
NVIDIA:
  • Llama-3.1-Nemotron-Ultra-253B-v1, Nemotron-Nano-V2-12b
Google Gemma:
  • gemma-2-2b-it, gemma-2-9b-it-fast, gemma-3-27b-it, gemma-3-27b-it-fast
Qwen:
  • Qwen2.5-Coder-7B-fast, Qwen3-235B-A22B-Instruct-2507, Qwen3-235B-A22B-Thinking-2507
  • Qwen3-32B, Qwen3-32B-fast, Qwen2.5-VL-72B-Instruct
  • Qwen3-Coder-30B-A3B-Instruct, Qwen3-Coder-480B-A35B-Instruct
DeepSeek:
  • DeepSeek-R1-0528
Nous:
  • Hermes-4-70B, Hermes-4-405B
Others:
  • INTELLECT-3, Kimi-K2-Thinking
Image Generation:
  • flux-dev, flux-schnell
OpenRouter
BYOK: requires OpenRouter API key. OpenRouter provides unified access to 200+ models with automatic fallback, load balancing, and unified pricing.
Featured:
  • Prime Intellect: INTELLECT-3
  • TNG: R1T Chimera (free), TNG: R1T Chimera
Anthropic:
  • Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, Claude Opus 4.1
AllenAI:
  • OLMo 3 32B Think, OLMo 3 7B Instruct, OLMo 3 7B Think
LiquidAI:
  • LFM2-8B-A1B, LFM2-2.6B
IBM:
  • Granite 4.0 Micro
Deep Cogito:
  • Cogito V2 Preview Llama 405B
OpenAI:
  • GPT-5 Image Mini, GPT-5 Pro, GPT-4o Audio, gpt-oss-120b, gpt-oss-20b (free), gpt-oss-20b
Google:
  • Gemini 2.5 Flash Image (Nano Banana), Gemini 2.5 Flash Image Preview (Nano Banana)
Qwen:
  • Qwen3 VL 8B Thinking, Qwen3 VL 8B Instruct, Qwen3 VL 30B A3B Thinking, Qwen3 VL 30B A3B Instruct
  • Qwen3 Coder 30B A3B Instruct, Qwen3 30B A3B Instruct 2507
Z.AI:
  • GLM 4.6, GLM 4.6 (exacto)
DeepSeek:
  • DeepSeek V3.2 Exp, DeepSeek V3.1
Nous:
  • Hermes 4 70B, Hermes 4 405B
Mistral:
  • Mistral Medium 3.1, Codestral 2508
Baidu:
  • ERNIE 4.5 21B A3B, ERNIE 4.5 VL 28B A3B
Bert:
  • Nebulon Alpha
Tzafon LightCone
BYOK: requires Tzafon LightCone API key. No featured models; custom model identifier only. Access models through Tzafon's LightCone inference platform by entering your model ID in Custom mode.

Bring Your Own Keys

Contact Sales to enable Bring Your Own LLM Keys for your organization.
When you bring your own keys, you get full control over billing and model access. Enter your provider’s API key in the LLM Settings panel. For providers like Azure or AWS Bedrock that need additional configuration (deployment IDs, regions), fill in the extra fields that appear. You can also set a custom API Base URL to route requests through your own AI gateway or proxy.
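To see why a custom API Base URL is enough to reroute traffic, note that OpenAI-style clients append the same relative path to whichever base is configured. A stdlib-only illustration (both URLs are placeholders; `ai-gateway.internal.example.com` is a hypothetical gateway, not a real endpoint):

```python
from urllib.parse import urljoin

# Default provider endpoint vs. a gateway set as the API Base URL in LLM Settings.
DEFAULT_BASE = "https://api.openai.com/v1/"
GATEWAY_BASE = "https://ai-gateway.internal.example.com/v1/"

def endpoint(base_url: str, path: str = "chat/completions") -> str:
    # The same relative path is appended to whichever base is configured,
    # so swapping the base URL transparently reroutes every request.
    return urljoin(base_url, path)

print(endpoint(DEFAULT_BASE))  # https://api.openai.com/v1/chat/completions
print(endpoint(GATEWAY_BASE))  # same path, routed through your gateway
```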
If your AI Gateway is behind a private subnet or firewall, make sure to run Xpander in the same network with access to those models.

Use Custom Models

The models listed above are featured models that have been tested with Xpander. To use a different model from your provider, select “Custom” in the model dropdown and enter the model name exactly as specified by your provider (e.g., custom-model-id). This is particularly useful for:
  • Private or fine-tuned models only you have access to
  • Newly released models not yet in the dropdown
  • Provider-specific model variants with custom endpoints

Model Settings

Two settings fine-tune how the model generates responses. Configure these in the General tab → LLM Settings panel.
  • Temperature (range 0.0–1.0): Controls randomness. Lower values (0.0–0.3) give more focused, deterministic responses. Higher values (0.7–1.0) give more creative, varied responses.
  • Reasoning Effort (Low / Medium / High): Controls how much effort the model puts into reasoning before responding. Higher effort means slower but more thorough answers. Currently only visible when using OpenAI models.
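As a rough illustration, here is how the two settings map onto an OpenAI-style chat completions payload (field names follow OpenAI's API shape; the exact request xpander builds from the panel is not shown here, and the model IDs are placeholders):

```python
# A low-temperature payload for focused, deterministic answers.
focused = {
    "model": "gpt-5",
    "temperature": 0.2,  # 0.0-0.3: focused and repeatable
    "messages": [{"role": "user", "content": "Summarize this ticket."}],
}

# The same request with a high temperature for more varied phrasing.
creative = dict(focused, temperature=0.9)  # 0.7-1.0: creative, varied

# Reasoning effort applies to OpenAI reasoning models instead of temperature.
reasoning = {
    "model": "gpt-5",
    "reasoning_effort": "high",  # slower but more thorough answers
    "messages": focused["messages"],
}
```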

LLM Extra Headers

You can configure custom HTTP headers to be sent with every LLM request. This is useful for:
  • Sending custom authentication tokens to AI Gateways
  • Adding tracking or metadata headers (e.g., X-Request-ID, X-Organization-ID)
  • Passing compliance or security headers required by your infrastructure
  • Integration with observability platforms like Helicone, LangSmith, or custom proxies
Configuration levels:
  1. Organization Default Headers: Set default headers at the organization level in Admin Settings → LLM Settings that apply to all agents
  2. Agent-Specific Override: Override organization defaults with agent-specific headers in the Agent Studio General → LLM Settings panel
Agent-level headers take precedence and merge with organization defaults (agent headers override matching keys).
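The merge rule above can be sketched as a plain dictionary update (an illustrative model only; the actual merge happens inside the platform):

```python
def effective_headers(org_defaults: dict, agent_overrides: dict) -> dict:
    """Agent headers merge with organization defaults; matching keys are
    overridden by the agent-level value."""
    merged = dict(org_defaults)      # start from organization defaults
    merged.update(agent_overrides)   # agent-specific keys win on conflicts
    return merged

org = {"Helicone-Auth": "Bearer sk-helicone-xxx", "X-Environment": "production"}
agent = {"X-Environment": "development", "X-Debug-Mode": "enabled"}
print(effective_headers(org, agent))
# {'Helicone-Auth': 'Bearer sk-helicone-xxx', 'X-Environment': 'development', 'X-Debug-Mode': 'enabled'}
```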
Supported Providers: Extra headers are currently supported for OpenAI, Helicone, Nebius, OpenRouter, Fireworks, and NVIDIA NIM providers. For Anthropic, headers are sent as default_headers. Google AI Studio and Amazon Bedrock don’t support custom headers.
Track and monitor all LLM requests through Helicone:
Organization Headers (Admin Settings)
{
  "Helicone-Auth": "Bearer sk-helicone-xxx",
  "Helicone-Property-Environment": "production"
}
Agent Headers (Agent Studio)
{
  "Helicone-Property-Agent": "support-agent",
  "Helicone-User-Id": "team-support"
}
Route requests through your internal AI Gateway with authentication:
Organization Headers
{
  "X-Gateway-Token": "your-gateway-token",
  "X-Tenant-ID": "company-prod"
}
Add tracking and compliance headers for audit logs:
Organization Headers
{
  "X-Request-Source": "xpander-platform",
  "X-Compliance-Level": "gdpr-compliant",
  "X-Data-Region": "eu-west-1"
}
Differentiate between development and production environments:
Production Org Headers:
{
  "X-Environment": "production",
  "X-Rate-Limit-Tier": "premium"
}
Dev Agent Override:
{
  "X-Environment": "development",
  "X-Debug-Mode": "enabled"
}

Planning Mode

Planning Mode gives your agent the ability to break complex tasks into structured, trackable steps. When enabled, the agent receives a set of to-do list tools and is nudged to create a checklist before starting work. It then works through the items one by one, marking each complete and reporting progress along the way.

How to Enable

In the Agent Studio, go to the Tools tab → Agent Thinking & Planning section:
  • Checklist Toolkit: Toggle this on to give the agent planning tools. The agent will be nudged to create a plan and work through it.
  • Agent Checklist Enforcement (Beta): When enabled, execution blocks entirely until the agent creates a plan. Without this, the agent is encouraged to plan but not forced to.
[Screenshot: Agent Thinking & Planning section showing Checklist Toolkit and Reasoning Toolkit toggles]

How It Works

Planning Mode gives the agent tools to create, update, and manage a to-do list. The typical flow looks like this:
  1. The agent receives a prompt and creates a checklist of steps it needs to complete
  2. It works through each item, using its tools and marking tasks done as it goes
  3. If it discovers new work along the way, it adds items to the list dynamically
  4. The checklist updates in real-time in both the chat interface and the Monitor tab
  5. If tasks remain incomplete after a run, the system retries automatically based on the configured retry strategy
The agent can also remove tasks that turn out to be unnecessary, or update task descriptions as requirements become clearer during execution.
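The flow above can be modeled as a small mutable checklist. This is a toy illustration of the state the planning tools manage; the method names and fields are assumptions, not the platform's actual tool schema:

```python
from dataclasses import dataclass, field

@dataclass
class Checklist:
    """Toy model of the agent's to-do list: task name -> done?"""
    items: dict[str, bool] = field(default_factory=dict)

    def add(self, task: str) -> None:
        self.items.setdefault(task, False)   # new work discovered mid-run

    def complete(self, task: str) -> None:
        self.items[task] = True              # mark a step done

    def remove(self, task: str) -> None:
        self.items.pop(task, None)           # drop tasks that became unnecessary

    def remaining(self) -> list[str]:
        return [t for t, done in self.items.items() if not done]

plan = Checklist()
for step in ["research sources", "draft report", "review draft"]:
    plan.add(step)
plan.complete("research sources")
plan.add("verify citations")       # added dynamically during execution
print(plan.remaining())            # ['draft report', 'review draft', 'verify citations']
```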

Retry Strategies

Planning Mode also nudges the agent to complete its checklist before finishing its response. If some tasks remain incomplete, it prompts the agent to retry.
  • Tiered: Lightweight nudge first (retries 1–3), then full context compaction on retry 4+. This is the default.
  • Aggressive: Always compacts context on every retry. Use when the agent consistently fails due to long context.
  • Disabled: No automatic retries. The execution ends after the first attempt, even if tasks are incomplete.
Tiered is the best default for most agents. It gives the agent a chance to finish naturally before resorting to context compaction, which can lose some detail from earlier in the conversation.
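The decision table above can be sketched as a small function (assumed logic reconstructed from the descriptions, not the platform's source):

```python
def next_action(strategy: str, attempt: int, incomplete: int) -> str:
    """What happens after a run, given the retry strategy, the 1-based
    retry attempt number, and the count of incomplete checklist tasks."""
    if incomplete == 0 or strategy == "disabled":
        return "finish"  # done, or retries are turned off
    if strategy == "aggressive":
        return "retry with context compaction"  # compact on every retry
    # tiered (default): nudge on retries 1-3, compact from retry 4 onward
    return "retry with nudge" if attempt <= 3 else "retry with context compaction"

assert next_action("tiered", 1, incomplete=2) == "retry with nudge"
assert next_action("tiered", 4, incomplete=2) == "retry with context compaction"
assert next_action("aggressive", 1, incomplete=2) == "retry with context compaction"
assert next_action("disabled", 1, incomplete=2) == "finish"
```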

When to Use Planning Mode

Good fit:
  • Multi-step workflows (research → draft → review)
  • Tasks requiring coordinated actions across tools
  • Audit trails where you need to see exactly what the agent did
  • Complex implementations or migrations
Not a good fit:
  • Simple Q&A
  • Single tool calls
  • Real-time chat where speed matters
  • Quick lookups

Reasoning Mode

Reasoning Mode exposes a think tool that gives the agent a private scratchpad for extended thinking. Instead of responding immediately, the agent can reason through the problem step-by-step, considering multiple approaches, evaluating tradeoffs, and self-correcting before delivering a final answer. Reasoning steps are visible in the activity log and the chat interface, and are delivered in real time as think events when streaming.
To enable it, go to the Tools tab → Agent Thinking & Planning section and toggle on Reasoning Toolkit. No additional configuration is needed.
[Screenshot: Reasoning Toolkit toggle in the Agent Thinking & Planning section]
Reasoning Mode produces higher-quality answers at the cost of speed and tokens. Use it selectively for tasks where accuracy matters more than speed.
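Conceptually, a think tool is just a tool call whose only effect is recording a private thought. A minimal sketch under that assumption (the function name and return value are illustrative, not xpander's actual tool definition):

```python
def think(thought: str, scratchpad: list[str]) -> str:
    """Record a private reasoning step without taking any action.

    The tool's return value is unimportant; the point is that calling it
    gives the model a turn to reason before producing a final answer.
    """
    scratchpad.append(thought)
    return "ok"

notes: list[str] = []
think("Option A is faster but loses precision; check constraints first.", notes)
think("Constraints require exact totals, so choose option B.", notes)
print(len(notes))  # 2 recorded reasoning steps
```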

Multi-Agent Orchestration

Your agent can call other agents in your workspace to handle specialized tasks. Instead of building one agent that does everything, you can create focused agents and connect them together. For example, create three specialized agents: a research agent, a writing agent, and a data analysis agent. Each has its own knowledge bases, tool connections, and system prompts. Then create a primary agent that delegates work to whichever specialist is best suited.
  1. Open the General tab: In the Agent Studio, scroll down to the Multi-Agent section and expand it.
  2. Attach agents: Click + Attach agents. A panel opens showing all agents in your workspace. Select the agents you want this agent to be able to call.
  3. Add to agent: Click + Add to agent to confirm. The attached agents now appear as available tools the primary agent can invoke during execution.
[Screenshot: Multi-Agent section showing attached agents panel]
Once attached, the primary agent can decide when to delegate a task to another agent based on its instructions and the nature of the request. Each sub-agent runs independently with its own tools, memory, and configuration, and can even have its own planning checklist if Planning Mode is enabled. Sub-agent executions are tracked separately in the Monitor tab, so you can drill into any agent's run to see exactly what it did.

LLM Guardrails

Safety checks applied to user input and model output before and during each run. Configure them under the LLM Guardrails section in the Agent Studio General tab.
[Screenshot: LLM Guardrails section with PII Detection, prompt injection detection, and OpenAI moderation toggles]
  • PII Detection: Block runs when user input contains credit cards, emails, SSNs, or phone numbers.
  • Mask detected PII: Replace PII with **** instead of blocking.
  • Prompt injection detection: Block attempts to manipulate the agent's behavior through malicious or unauthorized instructions.
  • OpenAI moderation: Block content that violates OpenAI's content policy.
Turn on Prompt injection detection for any agent exposed to public or untrusted users. Use Mask detected PII instead of blocking when you want the agent to keep working without ever seeing the raw sensitive data.
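The masking behavior can be pictured with a few regex substitutions. This is only an illustration of what the agent sees when masking is on; the simplified patterns below are not the detector xpander actually uses:

```python
import re

# Deliberately simplified example patterns for three PII categories.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # SSN-like numbers
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),     # email addresses
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),          # credit-card-like digit runs
]

def mask_pii(text: str) -> str:
    """Replace each detected PII span with **** instead of blocking the run."""
    for pattern in PII_PATTERNS:
        text = pattern.sub("****", text)
    return text

print(mask_pii("Reach me at jane@example.com, SSN 123-45-6789."))
# Reach me at ****, SSN ****.
```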

Next Steps

  • Testing & Chat: Test model behavior and intelligence features
  • Tools & Connectors: Give your agent actions to take
  • Memory & State: Configure what your agent remembers
  • Deploy an Agent: Publish and connect to channels