# Observability with Tool Hooks

> Observe and instrument every tool call without modifying the tools themselves

Tool hooks are decorators that fire around every tool invocation an agent makes. They give you a single place to plug in logging, metrics, alerting, payload redaction, custom guardrails, and per-tool observability without touching the tools themselves. Hooks are framework-agnostic: they run at the SDK level, so the same hook fires whether the agent is built on Agno, OpenAI Agents SDK, LangChain, or AWS Strands, and whether the tool is a connector, a custom `@register_tool` function, or an MCP-server tool.

There are three decorators:

* `@on_tool_before` runs before each tool invocation.
* `@on_tool_after` runs after a successful invocation, with the result.
* `@on_tool_error` runs when a tool invocation raises.

## Prerequisites

* **Complete the [Quickstart](/developers/quickstart)** so the CLI, SDK, and `xpander login` are already set up.
* **An agent with at least one tool attached.** Connectors selected in [Agent Studio](https://app.xpander.ai), `@register_tool` functions, or MCP-server tools all work.
* **Python 3.12+** for the local handler.

## 1. Log every tool call

The smallest useful hook is a logger that records each tool the agent reaches for. Drop the three decorators in a module that's imported from your handler and they auto-register at import time:

```python hooks.py highlight={5,12,20} theme={"dark"}
from typing import Any, Dict, Optional
from loguru import logger
from xpander_sdk import on_tool_before, on_tool_after, on_tool_error, Tool

@on_tool_before
def log_invocation(tool: Tool, payload: Any,
                   payload_extension: Optional[Dict[str, Any]] = None,
                   tool_call_id: Optional[str] = None,
                   agent_version: Optional[str] = None):
    logger.info(f"-> {tool.name} called with payload {payload}")

@on_tool_after
def log_success(tool: Tool, payload: Any,
                payload_extension: Optional[Dict[str, Any]] = None,
                tool_call_id: Optional[str] = None,
                agent_version: Optional[str] = None,
                result: Any = None):
    logger.info(f"<- {tool.name} returned {type(result).__name__}")

@on_tool_error
def log_failure(tool: Tool, payload: Any,
                payload_extension: Optional[Dict[str, Any]] = None,
                tool_call_id: Optional[str] = None,
                agent_version: Optional[str] = None,
                error: Optional[Exception] = None):
    logger.error(f"x {tool.name} failed: {error}")
```

What this means in practice:

1. **`@on_tool_before`** runs immediately before the tool body executes. The `Tool` object exposes `tool.name`, `tool.id`, `tool.is_local` (true for `@register_tool` functions, false for connectors), and `tool.description`.
2. **`@on_tool_after`** only runs on success and adds a `result` parameter carrying whatever the tool returned. For connector tools, that's the raw response body. For local tools, it's whatever your function returned.
3. **`@on_tool_error`** runs in place of the after-hook when the tool raises. The `error` parameter is the original exception. The agent's framework still sees the failure; the hook is for your side effects (logs, alerts, traces).
4. **`tool_call_id`** is unique per invocation. Use it as the correlation key to pair before-hooks with their matching after-hooks or error-hooks.
5. **Both sync and async hooks work.** The SDK detects coroutine functions and awaits them automatically, so you can `await` an HTTP client or a DB write inside an async hook without extra wiring.

Every hook signature draws on the same parameters, regardless of which decorator you use:

| Parameter             | Type                  | Notes                                                                                                                                                                                                      |
| --------------------- | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `tool`                | `Tool`                | The tool being invoked. Read `tool.name`, `tool.id`, `tool.is_local`, `tool.description`. Always set.                                                                                                      |
| `payload`             | `Any`                 | The arguments the LLM produced for the call. For connectors, typically `{"body_params": {...}, "path_params": {...}, "query_params": {...}}`. For local tools, whatever your function expects. Always set. |
| `payload_extension`   | `Optional[Dict]`      | The deep-merged extension you passed via `tool_call_payload_extension` on the task or `payload_extension=` on `agent.ainvoke_tool`. `None` if you didn't set one.                                          |
| `tool_call_id`        | `Optional[str]`       | Stable identifier for one invocation. Pair before/after hooks with this key.                                                                                                                               |
| `agent_version`       | `Optional[str]`       | The deployed version of the agent that issued the call. Useful for filtering metrics by rollout.                                                                                                           |
| `result` (after only) | `Any`                 | The value the tool returned on success.                                                                                                                                                                    |
| `error` (error only)  | `Optional[Exception]` | The exception raised by the tool body.                                                                                                                                                                     |
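
One way to use these parameters together is to flatten each hook call into a single structured record. The sketch below is plain Python with no SDK import: `hook_record` is a hypothetical helper, and the `SimpleNamespace` is only a stand-in for the `Tool` object, carrying the four attributes listed in the table.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def hook_record(phase: str, tool: Any, payload: Any,
                payload_extension: Optional[Dict[str, Any]] = None,
                tool_call_id: Optional[str] = None,
                agent_version: Optional[str] = None,
                result: Any = None,
                error: Optional[BaseException] = None) -> Dict[str, Any]:
    """Flatten one hook call into a JSON-friendly dict (hypothetical helper)."""
    record: Dict[str, Any] = {
        "phase": phase,                  # "before", "after", or "error"
        "tool": tool.name,
        "tool_id": tool.id,
        "is_local": tool.is_local,
        "tool_call_id": tool_call_id,    # correlation key across phases
        "agent_version": agent_version,
        "has_extension": payload_extension is not None,
        "payload_keys": sorted(payload) if isinstance(payload, dict) else None,
    }
    if phase == "after":
        record["result_type"] = type(result).__name__
    if phase == "error":
        record["error_type"] = type(error).__name__
        record["error"] = str(error)
    return record


# Stand-in for the SDK's Tool object, for illustration only.
fake_tool = SimpleNamespace(name="send_email", id="tool-1", is_local=False,
                            description="Send an email")

before = hook_record("before", fake_tool, {"body_params": {"to": "a@b.c"}},
                     tool_call_id="call-1", agent_version="3")
failed = hook_record("error", fake_tool, {"body_params": {}},
                     tool_call_id="call-1", error=TimeoutError("upstream timed out"))
```

Each of the three hooks could call a helper like this with its own `phase` and pass the result to the logger, so every record for one invocation shares the same `tool_call_id`.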

## 2. Time and instrument every tool call

Once you have logging, the next thing most teams want is timing and counter metrics per tool. The before/after pair is the natural fit, with `tool_call_id` as the correlation key:

```python hooks.py highlight={5,7,15} theme={"dark"}
import time
from typing import Any, Dict, Optional
from xpander_sdk import on_tool_before, on_tool_after, Tool

starts: dict[str, float] = {}

@on_tool_before
def record_start(tool: Tool, payload: Any,
                 payload_extension: Optional[Dict[str, Any]] = None,
                 tool_call_id: Optional[str] = None,
                 agent_version: Optional[str] = None):
    if tool_call_id:
        starts[tool_call_id] = time.perf_counter()  # monotonic; safe for measuring durations

@on_tool_after
def record_duration(tool: Tool, payload: Any,
                    payload_extension: Optional[Dict[str, Any]] = None,
                    tool_call_id: Optional[str] = None,
                    agent_version: Optional[str] = None,
                    result: Any = None):
    started = starts.pop(tool_call_id, None) if tool_call_id else None
    if started is not None:
        # metrics_client is a stand-in for your StatsD/Datadog/OpenTelemetry client.
        elapsed_ms = (time.perf_counter() - started) * 1000
        metrics_client.timing(f"tool.{tool.name}.duration_ms", elapsed_ms)
        metrics_client.increment(f"tool.{tool.name}.calls")
```

What this means in practice:

1. **`tool_call_id` is the correlation key.** Tool calls can run concurrently on the same agent, so a single global timestamp would be overwritten by whichever call started last. The id stays stable from the before hook to the matching after or error hook.
2. **Pop, don't peek.** Removing the entry on the after hook keeps memory bounded across long-running containers.
3. **Embed `tool.name` in the metric name.** Per-tool dashboards drop out of this naming scheme without per-tool boilerplate.

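Mirror the increment in `@on_tool_error` so success and failure counters add up to the total call count. Pop the `starts` entry there as well; otherwise every failed call leaves its start time behind in the dict.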
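
A sketch of that mirrored error path, using only the standard library. `InMemoryMetrics` is a hypothetical stand-in for `metrics_client`, and the functions are written undecorated with a trimmed signature so the block runs without the SDK; in a real `hooks.py` they would carry `@on_tool_before`, `@on_tool_after`, and `@on_tool_error` and take the full hook signature.

```python
import time
from collections import Counter
from typing import Dict, List, Optional


class InMemoryMetrics:
    """Hypothetical stand-in for a StatsD-style metrics client."""

    def __init__(self) -> None:
        self.counters: Counter = Counter()
        self.timings: Dict[str, List[float]] = {}

    def increment(self, name: str) -> None:
        self.counters[name] += 1

    def timing(self, name: str, ms: float) -> None:
        self.timings.setdefault(name, []).append(ms)


metrics_client = InMemoryMetrics()
starts: Dict[str, float] = {}


def record_start(tool_call_id: Optional[str]) -> None:
    if tool_call_id:
        starts[tool_call_id] = time.perf_counter()


def _finish(tool_name: str, tool_call_id: Optional[str], outcome: str) -> None:
    # Pop on both outcomes so failed calls do not leave entries behind.
    started = starts.pop(tool_call_id, None) if tool_call_id else None
    if started is not None:
        elapsed_ms = (time.perf_counter() - started) * 1000
        metrics_client.timing(f"tool.{tool_name}.duration_ms", elapsed_ms)
    metrics_client.increment(f"tool.{tool_name}.{outcome}")


def record_success(tool_name: str, tool_call_id: Optional[str]) -> None:
    _finish(tool_name, tool_call_id, "calls")


def record_failure(tool_name: str, tool_call_id: Optional[str]) -> None:
    _finish(tool_name, tool_call_id, "errors")


# Simulate one successful and one failed call.
record_start("call-1")
record_success("search", "call-1")
record_start("call-2")
record_failure("search", "call-2")
```

Routing both outcomes through one `_finish` helper keeps the pop and the timing logic in a single place, so the success and error paths cannot drift apart.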

## 3. Redact payloads and add custom guardrails

Hooks are observe-only by design. The SDK calls them, but ignores any return value, so you cannot mutate the payload or rewrite the result from a hook. What you can do is:

* **Redact at the sink.** Strip secrets from the copy of the payload you log or send to a tracing backend.
* **Detect and alert.** Match the payload against a guardrail policy and emit an alert or a metric when it trips.
* **Raise to fail loud.** A hook that raises has its exception logged by the SDK; the tool itself still runs, and the failure surfaces wherever those logs are shipped.

```python hooks.py highlight={5,12-13,15-17} theme={"dark"}
import copy
from typing import Any, Dict, Optional
from xpander_sdk import on_tool_before, Tool

SENSITIVE_KEYS = {"api_key", "password", "ssn", "credit_card"}

@on_tool_before
def redact_and_log(tool: Tool, payload: Any,
                   payload_extension: Optional[Dict[str, Any]] = None,
                   tool_call_id: Optional[str] = None,
                   agent_version: Optional[str] = None):
    # Deep copy so we never touch the live payload the tool will receive.
    safe = copy.deepcopy(payload) if isinstance(payload, dict) else payload
    if isinstance(safe, dict):
        for key in list(safe.get("body_params", {})):
            if key.lower() in SENSITIVE_KEYS:
                safe["body_params"][key] = "***"
    audit_log.write({"tool": tool.name, "tool_call_id": tool_call_id, "payload": safe})
```

What this means in practice:

1. **`copy.deepcopy(payload)`** is the safety net. Even though hook return values are ignored, mutating a shared dict in place could affect other observers reading the same object. Copy first, redact the copy.
2. **`SENSITIVE_KEYS`** is your project's policy. Extend it with whatever your security team flags.
3. **`audit_log.write(...)`** is a stand-in for whatever sink you ship to (S3, Datadog, OpenTelemetry). Hooks are the right place for this work because they fire on every tool, not just the ones you remember to instrument.
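
The hook above only inspects the top level of `body_params`. If secrets can also appear in `query_params`, `path_params`, or nested objects, a recursive variant is one option. This is a sketch, not part of the SDK: `redact` is a hypothetical helper that builds a new structure as it walks, so it never touches the live payload and needs no `deepcopy`.

```python
from typing import Any

SENSITIVE_KEYS = {"api_key", "password", "ssn", "credit_card"}


def redact(value: Any, sensitive=frozenset(SENSITIVE_KEYS)) -> Any:
    """Return a redacted copy of value; the input is never modified."""
    if isinstance(value, dict):
        return {
            key: ("***" if isinstance(key, str) and key.lower() in sensitive
                  else redact(item, sensitive))
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item, sensitive) for item in value]
    return value  # scalars pass through unchanged


payload = {
    "body_params": {"user": {"name": "Ada", "Password": "hunter2"}},
    "query_params": {"api_key": "sk-123", "page": 2},
    "path_params": {"ids": [{"ssn": "000-00-0000"}, {"id": 7}]},
}
safe = redact(payload)
```

Matching is by key name only, so a secret stored under an unlisted key, or embedded inside a string value, still passes through.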

To enforce a policy that should *block* a call, do it inside the tool function itself. Hooks fire before the tool body runs, but raising from a hook only logs the exception; it doesn't cancel the invocation.
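
As a sketch of blocking inside the tool body: the function below is a plain Python function standing in for a `@register_tool` function (the decorator is omitted so the block runs without the SDK). `issue_refund`, `PolicyViolation`, and the 500 limit are hypothetical.

```python
class PolicyViolation(Exception):
    """Raised when a call breaks the (hypothetical) refund policy."""


MAX_REFUND = 500.0  # illustrative limit, not an SDK setting


def issue_refund(order_id: str, amount: float) -> dict:
    # In a real project this function would be decorated with @register_tool.
    # The check runs inside the tool body, so raising here stops the call;
    # raising from a hook would only be logged.
    if amount <= 0 or amount > MAX_REFUND:
        raise PolicyViolation(f"refund of {amount} for {order_id} is outside policy")
    return {"order_id": order_id, "refunded": amount}


ok = issue_refund("ord-1", 40.0)

try:
    issue_refund("ord-2", 9000.0)
    blocked = False
except PolicyViolation:
    blocked = True
```

Because the exception comes from the tool body, it counts as a tool failure, so any `@on_tool_error` hook still observes it.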

## 4. Alert on failures of business-critical tools

Most tool errors are noise: an LLM produced an invalid payload, a connector returned a 4xx, the agent retries. The few that should page someone (a charge that didn't go through, an auth check that broke) deserve their own hook with a name allowlist:

```python hooks.py highlight={3,8-9} theme={"dark"}
from xpander_sdk import on_tool_error, Tool

CRITICAL = {"payment_processor", "auth_service", "fraud_check"}

@on_tool_error
async def alert_on_failure(tool: Tool, payload, payload_extension=None,
                           tool_call_id=None, agent_version=None, error=None):
    if tool.name not in CRITICAL:
        return
    await alert_service.send(
        title=f"Critical tool failure: {tool.name}",
        message=f"Error: {error}\nCall ID: {tool_call_id}\nAgent: {agent_version}",
        severity="critical",
    )
```

What this means in practice:

1. **The name allowlist** is what keeps alert volume sane. Without it, every transient connector failure pages you.
2. **`agent_version`** is included in the alert so you can correlate a spike of errors with the rollout that introduced it.
3. **The hook is async**, so it can `await` an HTTP call to PagerDuty or Slack without spinning up a background thread.

## 5. Attribute cost and usage per tenant

Tool hooks are how you build per-customer billing or per-team cost dashboards on top of agent activity. Combine [`tool_call_payload_extension`](/developers/tools/pre-built#5-pass-per-request-context-to-every-tool-call) with an `@on_tool_after` hook that reads the tenant ID off the extension and increments a counter:

```python hooks.py theme={"dark"}
from xpander_sdk import on_tool_after, Tool

@on_tool_after
async def attribute_cost(tool: Tool, payload, payload_extension=None,
                         tool_call_id=None, agent_version=None, result=None):
    tenant_id = (payload_extension or {}).get("body_params", {}).get("tenant_id")
    if tenant_id:
        await billing.increment(tenant_id, tool=tool.name, count=1)
```

What this means in practice:

1. **`payload_extension`** is the same dict you set when creating the task with `tool_call_payload_extension={"body_params": {"tenant_id": "acme-corp"}}`. Every tool call inside that task carries it through to the hook.
2. **The hook fires on every successful invocation**, so the counter reflects real usage, not LLM intentions.
3. **It works uniformly across tool types.** Connector calls, custom `@register_tool` calls, and MCP tools all hit this hook with the same extension.
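
The one-line lookup in the hook raises `AttributeError` if `body_params` is present but set to `None`. A slightly more defensive sketch follows. `tenant_from_extension` and `InMemoryBilling` are hypothetical (the latter stands in for the `billing` client), and the hook is shown undecorated with a trimmed signature so the block runs without the SDK.

```python
import asyncio
from collections import Counter
from typing import Any, Dict, Optional


def tenant_from_extension(payload_extension: Optional[Dict[str, Any]]) -> Optional[str]:
    """Read body_params.tenant_id, tolerating missing or non-dict values."""
    if not isinstance(payload_extension, dict):
        return None
    body = payload_extension.get("body_params")
    if not isinstance(body, dict):
        return None
    tenant_id = body.get("tenant_id")
    return tenant_id if isinstance(tenant_id, str) and tenant_id else None


class InMemoryBilling:
    """Hypothetical stand-in for the `billing` client used above."""

    def __init__(self) -> None:
        self.usage: Counter = Counter()

    async def increment(self, tenant_id: str, tool: str, count: int = 1) -> None:
        self.usage[(tenant_id, tool)] += count


billing = InMemoryBilling()


async def attribute_cost(tool_name: str, payload_extension=None) -> None:
    # Would be decorated with @on_tool_after and take the full hook signature.
    tenant_id = tenant_from_extension(payload_extension)
    if tenant_id:
        await billing.increment(tenant_id, tool=tool_name, count=1)


async def demo() -> None:
    ext = {"body_params": {"tenant_id": "acme-corp"}}
    await attribute_cost("send_email", ext)
    await attribute_cost("send_email", ext)
    await attribute_cost("send_email", {"body_params": None})  # ignored
    await attribute_cost("send_email", None)                   # ignored


asyncio.run(demo())
```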

## 6. Where hooks fit in your project

Register hooks at module level so they're set up before any task is processed. The cleanest pattern is a `hooks.py` imported from your handler:

```python xpander_handler.py highlight={1} theme={"dark"}
import hooks  # registers logging, metrics, and audit hooks at import time

from xpander_sdk import on_task, Task, Backend
from agno.agent import Agent

@on_task
async def handler(task: Task) -> Task:
    backend = Backend(configuration=task.configuration)
    agno_agent = Agent(**(await backend.aget_args(task=task)))
    result = await agno_agent.arun(input=task.to_message())
    task.result = result.content
    return task
```

What this means in practice:

1. **The `import hooks` line is enough.** Each `@on_tool_before` / `@on_tool_after` / `@on_tool_error` decorator registers itself in a process-global registry on import. There's no `register_hooks(...)` call.
2. **Hooks compose with `@on_boot`.** Use a boot handler to construct the metrics client, alerting client, or audit-log writer that your hooks reach for, so they exist before the first tool fires.
3. **Hooks coexist with framework-level callbacks.** Agno's `tool_hooks` arg, OpenAI Agents SDK's run hooks, and LangChain callbacks all keep working. xpander's hooks fire at the SDK's tool-invocation layer, so they run alongside (not instead of) any framework callback you've already wired up.
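
One way to follow point 2 is a small module that owns the clients and exposes an accessor for hooks to call. This is a sketch under assumptions: `MetricsClient`, `init_clients`, and `metrics` are hypothetical names, and the boot handler that would call `init_clients()` is not shown.

```python
from typing import List, Optional


class MetricsClient:
    """Hypothetical client; swap in your StatsD, Datadog, or OpenTelemetry client."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.sent: List[str] = []

    def increment(self, name: str) -> None:
        self.sent.append(f"{self.prefix}.{name}")


_metrics: Optional[MetricsClient] = None


def init_clients() -> None:
    """Build the clients once; intended to be called from a boot handler."""
    global _metrics
    if _metrics is None:
        _metrics = MetricsClient(prefix="agent")


def metrics() -> MetricsClient:
    """Accessor for hooks; fails clearly if boot has not run yet."""
    if _metrics is None:
        raise RuntimeError("init_clients() has not run yet")
    return _metrics


init_clients()  # normally called from the boot handler
init_clients()  # idempotent: a second call keeps the same client
metrics().increment("tool.search.calls")
```

Hooks then call `metrics()` at invocation time instead of capturing a client at import time, which keeps `import hooks` free of network or credential setup.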

## How hooks fire

The SDK runs hooks in sequence around the tool body, awaiting any that are coroutine functions. The order is fixed:

1. Schema validation runs first if the tool has a Pydantic schema.
2. All `@on_tool_before` hooks run, in registration order.
3. The tool body executes (the connector HTTP call, the local `@register_tool` function, or the MCP server call).
4. **On success**, every `@on_tool_after` hook runs, in registration order, with the result.
5. **On failure**, every `@on_tool_error` hook runs, in registration order, with the exception.
6. Activity reporting to Agent Studio happens after hooks return, so your hooks see the call before the platform's metrics view does.

A few non-obvious properties:

* **Exceptions inside a hook are caught by the SDK and logged.** They don't prevent the tool from running, don't cancel sibling hooks, and don't propagate to the agent loop. This makes hooks safe for instrumentation, but it means you can't use them to block a call.
* **Hooks observe; they don't mutate.** The SDK calls each hook and ignores its return value. Mutate the local copy you log, but don't expect hook returns to alter the live payload or rewrite the result.
* **Order matters when hooks share state.** If two `@on_tool_after` hooks read a dict populated by an `@on_tool_before` hook and one of them removes the entry (as the timing example does with `pop`), register the hook that only reads first.
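
The ordering and isolation rules above can be modelled in a few lines. This is a toy dispatcher written for illustration, not the SDK's implementation: `invoke`, `_run_hooks`, and the three lists are hypothetical, and schema validation and activity reporting are left out.

```python
import asyncio
import inspect
import logging
from typing import Any, Callable, List

log = logging.getLogger("toy_hooks")

before_hooks: List[Callable] = []
after_hooks: List[Callable] = []
error_hooks: List[Callable] = []


async def _run_hooks(hooks: List[Callable], **kwargs: Any) -> None:
    for hook in hooks:  # registration order
        try:
            outcome = hook(**kwargs)
            if inspect.isawaitable(outcome):  # async hooks are awaited
                await outcome
        except Exception:
            # A failing hook is logged and skipped; sibling hooks still run.
            log.exception("hook %s failed", getattr(hook, "__name__", hook))


async def invoke(tool: Callable[[Any], Any], payload: Any) -> Any:
    await _run_hooks(before_hooks, payload=payload)
    try:
        result = tool(payload)
    except Exception as exc:
        await _run_hooks(error_hooks, payload=payload, error=exc)
        raise  # the caller still sees the tool failure
    await _run_hooks(after_hooks, payload=payload, result=result)
    return result  # hook return values never replace the tool's result


events: List[str] = []


def broken_before(payload):
    events.append("before:broken")
    raise RuntimeError("boom")


async def async_before(payload):
    events.append("before:async")


def after(payload, result):
    events.append(f"after:{result}")
    return "ignored"


before_hooks.extend([broken_before, async_before])
after_hooks.append(after)

result = asyncio.run(invoke(lambda p: p * 2, 21))
```

In this model the exception from `broken_before` is logged, `async_before` still runs, the tool still executes, and the string returned by `after` is discarded.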

## Troubleshooting

<AccordionGroup>
  <Accordion title="My hook never fires">
    The decorator only registers the hook when the module that defines it is imported. If `hooks.py` lives next to `xpander_handler.py` but nothing ever imports it, the decorators never run. Add `import hooks` at the top of `xpander_handler.py` (or wherever your `@on_task` lives) so registration happens at boot.
  </Accordion>

  <Accordion title="My hook fires twice for one tool call">
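    Hooks register globally, so each time the decorators execute they add another entry. Python caches modules, so a plain `import hooks` from several files runs them only once; duplicates appear when the same file is loaded under two module names (for example `hooks` and `myapp.hooks`), reloaded with `importlib.reload`, or run as the entry-point script and also imported. Import it through one path (from the handler) and remove the others. Re-running `xpander agent dev` reloads the registry from a fresh process, which is the easiest way to confirm.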
  </Accordion>

  <Accordion title="Async hook seems to block or never complete">
    The SDK detects coroutine functions and awaits them; sync hooks run inline. A sync hook that blocks on a synchronous HTTP client inside an async handler stalls the event loop for the duration of the request, and calling `asyncio.run(...)` from a sync hook raises `RuntimeError` when a loop is already running, which the SDK logs and swallows. Either declare the hook `async def` and `await` an async client, or keep it sync and limit it to fast, non-blocking work.
  </Accordion>

  <Accordion title="An exception in my hook crashed... nothing">
    Hook exceptions are caught and logged by the SDK; the tool still runs. If you need a hook failure to be loud, push the exception to your error tracker yourself (`sentry_sdk.capture_exception(e)`) inside a `try/except`. Don't rely on the exception bubbling up to the agent loop, because it won't.
  </Accordion>

  <Accordion title="`payload_extension` is `None` even though I set `tool_call_payload_extension`">
    `tool_call_payload_extension` is a per-task setting passed to `agent.acreate_task(...)`. If you're invoking a tool by hand with `agent.ainvoke_tool(...)` and didn't pass `payload_extension=...`, the hook receives `None`. Either set the extension on the task, or pass it to `ainvoke_tool` directly.
  </Accordion>

  <Accordion title="Hook return value seems to be ignored">
    It is. Hooks observe the call; they don't mutate it. The SDK ignores whatever a hook returns. To shape the payload that reaches a tool, use input schema overrides on the tool's Advanced tab in [Agent Studio](https://app.xpander.ai). To shape the result the LLM sees, use [Output Response Filtering](/developers/tools/output-response-filtering) or filter inside your `@register_tool` function before returning.
  </Accordion>
</AccordionGroup>

## Next steps

<CardGroup cols={2}>
  <Card title="Pre-built connectors" icon="plug" href="/developers/tools/pre-built">
    The other tool surface that hooks observe, including `tool_call_payload_extension` for per-tenant context.
  </Card>

  <Card title="Custom tools" icon="wrench" href="/developers/tools/custom-tools">
    Wrap your own Python functions with `@register_tool`. Hooks fire for these too.
  </Card>

  <Card title="Output Response Filtering" icon="filter" href="/developers/tools/output-response-filtering">
    How large tool responses get filtered before reaching the LLM.
  </Card>

  <Card title="Lifecycle hooks" icon="play" href="/developers/sdk-reference/decorators/on-boot-shutdown">
    `@on_boot` and `@on_shutdown` for setting up the clients your tool hooks reach for.
  </Card>

  <Card title="Frameworks" icon="cubes" href="/developers/frameworks">
    How tool calls flow through Agno, OpenAI Agents SDK, LangChain, and AWS Strands.
  </Card>

  <Card title="Core Concepts" icon="lightbulb" href="/developers/core-concepts">
    The SDK class names mapped onto agents, tasks, threads, and tools.
  </Card>
</CardGroup>
