Air-gapped deployments run every xpander component inside your environment with zero outbound network access. The same Helm chart that runs in xpander Cloud runs in your isolated environment, with the deployment manager link disabled and the AI Gateway pointed at locally-hosted LLMs. There is no separate “air-gapped product” to maintain. Use this configuration for classified, no-egress, or residency-constrained environments: defense, government, financial services with strict residency rules, healthcare under tight PHI controls, critical infrastructure. For the underlying self-hosted base, see Self-Hosted Kubernetes. Air-gapped builds on that foundation.

Differences from self-hosted

All platform services (Agent Controller, AI Gateway, Workers, MCP, Chat, PostgreSQL, Redis) run inside your environment as usual. What’s different:
  • LLM inference runs against a model on your hardware, not OpenAI, Anthropic, or Bedrock
  • Connector operations only target systems inside your network
  • The deployment manager link to xpander Cloud is disabled, so there’s no metadata sync, no heartbeats, no telemetry
  • All platform data (agent configurations, threads, memory, knowledge base contents, audit logs) lives in your storage
Updates are explicit. Nothing auto-updates. See Updates and patching for the operator flow.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Your Environment (on-prem / classified / isolated network)  │
│                                                              │
│  ┌──────────────┐   ┌──────────────┐   ┌─────────────────┐   │
│  │ Agent        │──►│ Agent        │──►│ AI Gateway      │   │
│  │ Controller   │   │ Workers      │   │ (local LLMs)    │   │
│  └──────────────┘   └──────────────┘   └─────────────────┘   │
│         │                  │                    │            │
│         ▼                  ▼                    ▼            │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  PostgreSQL (state) · Redis (cache)                 │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                              │
│  ┌──────────────────────────────────────────────────┐       │
│  │ Connectors → internal Salesforce / Jira / DBs    │       │
│  └──────────────────────────────────────────────────┘       │
│                                                              │
│  Egress: ❌ blocked   Inbound from xpander Cloud: ❌ blocked │
└──────────────────────────────────────────────────────────────┘
The standard self-hosted control plane / data plane split collapses into a single plane: everything runs inside your boundary, including what would normally be the control plane.

Deployment path

1. Prepare your environment

You’ll need:
  • A Kubernetes cluster (1.20+) and Helm 3.12+
  • Storage classes for the persistent volumes Redis and PostgreSQL will claim
  • An ingress controller (NGINX or equivalent) and a TLS strategy (typically cert-manager backed by your internal CA)
  • An internal container registry your cluster can pull from
The full prerequisite list is in the Self-Hosted Kubernetes prerequisites.
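
A quick preflight from a workstation with cluster access confirms the basics before you start; these are standard kubectl and Helm commands, nothing xpander-specific:
kubectl version          # server version should be 1.20+
helm version             # 3.12+
kubectl get storageclass # classes for the Redis and PostgreSQL PVCs
kubectl get ingressclass # ingress controller should have registered one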

2. Mirror images and chart

From a workstation with internet access, pull the xpander Helm chart and every container image it references, then push them into your internal registry:
helm repo add xpander https://charts.xpander.ai
helm pull xpander/xpander --untar
# image references live in xpander/values.yaml
Once images are mirrored, your cluster never needs to reach an external registry again.
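
One way to enumerate and mirror the images is to render the chart locally and loop over the unique image references. A sketch, assuming a hypothetical internal registry at registry.internal.example (skopeo or your registry's own mirroring tools work equally well; rendering may need a values file if the chart has required values):
REGISTRY=registry.internal.example   # hypothetical internal registry
for image in $(helm template xpander ./xpander | grep 'image:' | awk '{print $2}' | tr -d '"' | sort -u); do
  docker pull "$image"
  docker tag  "$image" "$REGISTRY/${image#*/}"   # strip the source registry prefix
  docker push "$REGISTRY/${image#*/}"
done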

3. Disable external connectivity

Configure firewall and network policies to block:
  • Outbound traffic from agent pods to the public internet
  • Outbound traffic to the xpander Cloud IPs (15.197.85.80, 166.117.85.46)
  • DNS resolution for any external domain used by hosted LLM providers
NTP for time sync is typically the only outbound connection that needs to remain open.
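
The firewall layer is yours, but a default-deny egress NetworkPolicy inside the cluster is a common second line of defense. A sketch, assuming the platform runs in a hypothetical xpander namespace and your internal systems (including NTP) live in RFC 1918 space:
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-external-egress
  namespace: xpander
spec:
  podSelector: {}            # all pods in the namespace
  policyTypes: [Egress]
  egress:
    - to:                    # allow in-cluster DNS
        - namespaceSelector: {}
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    - to:                    # allow private address space only
        - ipBlock: { cidr: 10.0.0.0/8 }
        - ipBlock: { cidr: 172.16.0.0/12 }
        - ipBlock: { cidr: 192.168.0.0/16 }
EOF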

4. Configure local LLM providers

Point the AI Gateway at LLMs running inside your environment. The gateway speaks the OpenAI-compatible API, so any local runtime that exposes that interface plugs in directly.
Runtime                 Models                           Notes
vLLM                    Llama, Mistral, Qwen, DeepSeek   High-throughput inference; OpenAI-compatible API
Ollama                  Llama, Mistral, Phi, Qwen        Easier to operate; good for smaller deployments
NVIDIA NIM              Llama, Mistral, others           Optimized GPU runtime; enterprise support
AWS Bedrock (private)   Anthropic, Mistral, Llama        Only valid in VPC deployments with PrivateLink to Bedrock; not fully air-gapped
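
Since everything downstream assumes the OpenAI-compatible surface, it's worth smoke-testing the local runtime before wiring it into the gateway. For example, against a vLLM server (host and model name here are hypothetical):
curl http://vllm.internal.example:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama-3.1-8b-instruct", "messages": [{"role": "user", "content": "ping"}]}'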

5. Install the Helm chart

Use a values file that disables the deployment manager connection and points at your internal LLMs. The base Helm syntax is the same as a standard self-hosted install (see Self-Hosted Kubernetes), with air-gapped overrides layered on top: internal registry references, local LLM endpoints, disabled outbound URLs. Air-gapped deployments are typically delivered with hands-on support. Contact us for the air-gapped values reference and license.
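
The install itself is plain Helm; only the values change. Assuming the mirrored chart is on local disk and values-airgap.yaml holds the overrides from that reference:
helm install xpander ./xpander \
  --namespace xpander --create-namespace \
  --values values-airgap.yaml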

Connectors

Connectors execute against internal endpoints: your private Salesforce instance, internal GitLab, on-prem Jira, databases, or self-hosted Slack/Teams. Custom connectors built from your own OpenAPI specs work the same way they do in any deployment. Anything requiring an external service is blocked: public SaaS APIs, hosted LLMs, public webhooks, anything resolving a public DNS name. If you depend on a system that’s normally external but you maintain an internal proxy or mirror, route the connector to the internal endpoint via configuration.

Data and observability

Knowledge bases use vector search backed by storage inside your cluster. Embedding generation runs against your local LLM provider, so document ingestion doesn’t require any external embedding API. Documents are uploaded via SDK, REST API, or Workbench; embeddings, document storage, and retrieval all stay local. Observability data is similarly contained:
  • Threads, tasks, and tool call history live in PostgreSQL inside your environment
  • Metrics are scraped from the platform’s /metrics endpoints by your Prometheus (a scrape job sketch follows this list)
  • Logs stream from kubectl logs to your existing aggregation stack (Loki, ELK, Splunk)
  • Audit logs are available through the API against your local Agent Controller
Nothing is sent to xpander Cloud.
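
For the metrics bullet, a hedged scrape job using pod discovery; the namespace is the hypothetical xpander from earlier, and the exact ports and labels depend on your install:
cat > scrape-xpander.yaml <<'EOF'   # merge into your prometheus.yml scrape_configs
- job_name: xpander
  metrics_path: /metrics
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: [xpander]
EOF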

Updates and patching

Air-gapped deployments don’t auto-update. The flow:
  1. Pull a new chart version and the corresponding images on a connected workstation
  2. Mirror them into your internal registry
  3. Test the upgrade in a staging cluster
  4. Run helm upgrade against production using your values file
For security-critical patches, xpander publishes advisories with the patched chart version and image tags. The mechanics of helm upgrade are the same as a standard self-hosted deployment; see the upgrade flow.
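
Step 4 is ordinary Helm against the mirrored chart, reusing the same values file from install time:
helm upgrade xpander ./xpander \
  --namespace xpander \
  --values values-airgap.yaml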

Air-gapped vs. VPC

Choose air-gapped when your environment policy forbids outbound traffic, you operate in classified or no-egress networks, or data sovereignty rules out hosted LLM providers entirely. Tradeoff: operational weight. You maintain local model infrastructure, mirror image and chart updates, and own the stack end-to-end. Choose VPC or private cloud when the goal is data residency and audit rather than zero egress. Operationally lighter, same data-plane isolation, hosted LLMs still reachable through PrivateLink or VPC endpoints. Decision check: can your workload tolerate calls to hosted LLM providers over a private network? If yes, VPC. If no, air-gapped.