Air-gapped deployments run every xpander component inside your environment with zero outbound network access. The same Helm chart that runs in xpander Cloud runs in your isolated environment, with the deployment manager link disabled and the AI Gateway pointed at locally-hosted LLMs. There is no separate “air-gapped product” to maintain. Use this configuration for classified, no-egress, or residency-constrained environments: defense, government, financial services with strict residency rules, healthcare under tight PHI controls, critical infrastructure. For the underlying self-hosted base, see Self-Hosted Kubernetes. Air-gapped builds on that foundation.

Differences from self-hosted

All platform services (Agent Controller, AI Gateway, Workers, MCP, Chat, PostgreSQL, Redis) run inside your environment as usual. What’s different:
  • LLM inference runs against a model on your hardware, not OpenAI, Anthropic, or Bedrock
  • Connector operations only target systems inside your network
  • The deployment manager link to xpander Cloud is disabled, so there’s no metadata sync, no heartbeats, no telemetry
  • All platform data (agent configurations, threads, memory, knowledge base contents, audit logs) lives in your storage
Updates are explicit. Nothing auto-updates. See Updates and patching for the operator flow.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Your Environment (on-prem / classified / isolated network)  │
│                                                              │
│  ┌──────────────┐   ┌──────────────┐   ┌─────────────────┐   │
│  │ Agent        │──►│ Agent        │──►│ AI Gateway      │   │
│  │ Controller   │   │ Workers      │   │ (local LLMs)    │   │
│  └──────────────┘   └──────────────┘   └─────────────────┘   │
│         │                  │                    │            │
│         ▼                  ▼                    ▼            │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  PostgreSQL (state) · Redis (cache)                 │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                              │
│  ┌──────────────────────────────────────────────────┐       │
│  │ Connectors → internal Salesforce / Jira / DBs    │       │
│  └──────────────────────────────────────────────────┘       │
│                                                              │
│  Egress: ❌ blocked   Inbound from xpander Cloud: ❌ blocked │
└──────────────────────────────────────────────────────────────┘
The standard self-hosted control plane / data plane split collapses into a single plane: everything runs inside your boundary, including what would normally be the control plane.

Deployment path

1. Prepare your environment

You’ll need:
  • A Kubernetes cluster (1.20+) and Helm 3.12+
  • Storage classes for the persistent volumes Redis and PostgreSQL will claim
  • An ingress controller (NGINX or equivalent) and a TLS strategy (typically cert-manager backed by your internal CA)
  • An internal container registry your cluster can pull from
The full prerequisite list is in the Self-Hosted Kubernetes prerequisites.
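
A quick preflight from a workstation with cluster access confirms the basics before you start; these are standard kubectl and Helm commands, nothing xpander-specific:
kubectl version          # server version should be 1.20+
helm version             # 3.12+
kubectl get storageclass # classes for the Redis and PostgreSQL PVCs
kubectl get ingressclass # ingress controller should have registered one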

2. Mirror images and chart

From a workstation with internet access, pull the xpander Helm chart and every container image it references, then push them into your internal registry:
helm repo add xpander https://charts.xpander.ai
helm pull xpander/xpander --untar
# image references live in xpander/values.yaml
Once images are mirrored, your cluster never needs to reach an external registry again.
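
One way to enumerate and mirror the images is to render the chart locally and loop over the unique image references. A sketch, assuming a hypothetical internal registry at registry.internal.example (skopeo or your registry's own mirroring tools work equally well; rendering may need a values file if the chart has required values):
REGISTRY=registry.internal.example   # hypothetical internal registry
for image in $(helm template xpander ./xpander | grep 'image:' | awk '{print $2}' | tr -d '"' | sort -u); do
  docker pull "$image"
  docker tag  "$image" "$REGISTRY/${image#*/}"   # strip the source registry prefix
  docker push "$REGISTRY/${image#*/}"
done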

3. Disable external connectivity

Configure firewall and network policies to block:
  • Outbound traffic from agent pods to the public internet
  • Outbound traffic to the xpander Cloud IPs (15.197.85.80, 166.117.85.46)
  • DNS resolution for any external domain used by hosted LLM providers
NTP for time sync is typically the only outbound connection that needs to remain open.
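
The firewall layer is yours, but a default-deny egress NetworkPolicy inside the cluster is a common second line of defense. A sketch, assuming the platform runs in a hypothetical xpander namespace and your internal systems (including NTP) live in RFC 1918 space:
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-external-egress
  namespace: xpander
spec:
  podSelector: {}            # all pods in the namespace
  policyTypes: [Egress]
  egress:
    - to:                    # allow in-cluster DNS
        - namespaceSelector: {}
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    - to:                    # allow private address space only
        - ipBlock: { cidr: 10.0.0.0/8 }
        - ipBlock: { cidr: 172.16.0.0/12 }
        - ipBlock: { cidr: 192.168.0.0/16 }
EOF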

4. Configure local LLM providers

Point the AI Gateway at LLMs running inside your environment. The gateway speaks the OpenAI-compatible API, so any local runtime that exposes that interface plugs in directly.
Runtime                 Models                           Notes
vLLM                    Llama, Mistral, Qwen, DeepSeek   High-throughput inference; OpenAI-compatible API
Ollama                  Llama, Mistral, Phi, Qwen        Easier to operate; good for smaller deployments
NVIDIA NIM              Llama, Mistral, others           Optimized GPU runtime; enterprise support
AWS Bedrock (private)   Anthropic, Mistral, Llama        Only valid in VPC deployments with PrivateLink to Bedrock; not fully air-gapped
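
Since everything downstream assumes the OpenAI-compatible surface, it's worth smoke-testing the local runtime before wiring it into the gateway. For example, against a vLLM server (host and model name here are hypothetical):
curl http://vllm.internal.example:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama-3.1-8b-instruct", "messages": [{"role": "user", "content": "ping"}]}'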

5. Install the Helm chart

Use a values file that disables the deployment manager connection and points at your internal LLMs. The base Helm syntax is the same as a standard self-hosted install (see Self-Hosted Kubernetes), with air-gapped overrides layered on top: internal registry references, local LLM endpoints, disabled outbound URLs. Air-gapped deployments are typically delivered with hands-on support. Contact us for the air-gapped values reference and license.
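
The install itself is plain Helm; only the values change. Assuming the mirrored chart is on local disk and values-airgap.yaml holds the overrides from that reference:
helm install xpander ./xpander \
  --namespace xpander --create-namespace \
  --values values-airgap.yaml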

Connectors

Connectors execute against internal endpoints: your private Salesforce instance, internal GitLab, on-prem Jira, databases, or self-hosted Slack/Teams. Custom connectors built from your own OpenAPI specs work the same way they do in any deployment. Anything requiring an external service is blocked: public SaaS APIs, hosted LLMs, public webhooks, anything resolving a public DNS name. If you depend on a system that’s normally external but you maintain an internal proxy or mirror, route the connector to the internal endpoint via configuration.

Data and observability

Knowledge bases use vector search backed by storage inside your cluster. Embedding generation runs against your local LLM provider, so document ingestion doesn’t require any external embedding API. Documents are uploaded via SDK, REST API, or Workbench; embeddings, document storage, and retrieval all stay local. Observability data is similarly contained:
  • Threads, tasks, and tool call history live in PostgreSQL inside your environment
  • Metrics are scraped from the platform’s /metrics endpoints by your Prometheus (a scrape job sketch follows this list)
  • Logs stream from kubectl logs to your existing aggregation stack (Loki, ELK, Splunk)
  • Audit logs are available through the API against your local Agent Controller
Nothing is sent to xpander Cloud.
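
For the metrics bullet, a hedged scrape job using pod discovery; the namespace is the hypothetical xpander from earlier, and the exact ports and labels depend on your install:
cat > scrape-xpander.yaml <<'EOF'   # merge into your prometheus.yml scrape_configs
- job_name: xpander
  metrics_path: /metrics
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: [xpander]
EOF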

Updates and patching

Air-gapped deployments don’t auto-update. The flow:
  1. Pull a new chart version and the corresponding images on a connected workstation
  2. Mirror them into your internal registry
  3. Test the upgrade in a staging cluster
  4. Run helm upgrade against production using your values file
For security-critical patches, xpander publishes advisories with the patched chart version and image tags. The mechanics of helm upgrade are the same as a standard self-hosted deployment; see the upgrade flow.
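
Step 4 is ordinary Helm against the mirrored chart, reusing the same values file from install time:
helm upgrade xpander ./xpander \
  --namespace xpander \
  --values values-airgap.yaml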

Air-gapped vs. VPC

Choose air-gapped when your environment policy forbids outbound traffic, you operate in classified or no-egress networks, or data sovereignty rules out hosted LLM providers entirely. Tradeoff: operational weight. You maintain local model infrastructure, mirror image and chart updates, and own the stack end-to-end. Choose VPC or private cloud when the goal is data residency and audit rather than zero egress. Operationally lighter, same data-plane isolation, hosted LLMs still reachable through PrivateLink or VPC endpoints. Decision check: can your workload tolerate calls to hosted LLM providers over a private network? If yes, VPC. If no, air-gapped.