docs(http): add ADR-041 MCP tool-gateway pattern for to_mcp

The to_mcp spec was describing one MCP tool per alknet operation — the
tool-bloat problem. An LLM connecting to a node with 200 operations gets
200 MCP tools dumped into its context, degrading reasoning and wasting
context budget.

ADR-041 replaces this with the tool-gateway pattern (same pattern as
opencode's memory and worktree tools): to_mcp exposes 4 fixed meta-tools
(search, schema, call, batch) that gate access to the full operation
registry. The LLM has a few tools in context, discovers operations on
demand through search + schema, then calls. Same principle as Linux's
man command — don't preload all documentation; query on demand.

Gateway tool set:
- search -> services/list (names + descriptions, AccessControl-filtered)
- schema -> services/schema (full OperationSpec for a specific op)
- call -> call.requested (Query/Mutation only, request/response)
- batch -> multiple call.requested (correlated IDs, OQ-14)

Subscription operations are excluded — MCP tool calls are
request/response by protocol design (the client blocks until
CallToolResult returns); streaming subscriptions don't fit. Subscriptions
are filtered out of search results and cannot be invoked via call.

http-mcp.md to_mcp section rewritten: the gateway tool set, Subscription
exclusion, and the service behavior (tools/list returns 4 fixed tools,
tools/call dispatches through the gateway). The 'Why' section adds the
tool-bloat rationale and the memory/worktree tool pattern that informed
the design.

README/overview ADR tables and the top-level README current-state note
updated for ADR-041.
2026-06-29 08:34:44 +00:00
parent 398e3d512d
commit 5fc074713c
5 changed files with 288 additions and 18 deletions


@@ -18,7 +18,7 @@ The storage and auth strategy research (`docs/research/alknet-storage-strategy/f
The alknet-call crate is **implemented and reviewed** — both the server-side core and the client/adapter surface (207 lib + 2 integration tests passing). The alknet-core and alknet-call crate specs are in draft; the alknet-vault crate specs are stable.
**alknet-http specs drafted.** The alknet-http crate (HTTP interface: `h2`/`http/1.1`/`h3` server + `from_openapi`/`to_openapi`/`from_mcp`/`to_mcp` adapters) now has architecture specs: [crates/http/](crates/http/) (overview, http-server, http-adapters, http-mcp, webtransport) and five new ADRs: [ADR-036](decisions/036-http-to-call-operation-mapping.md) (HTTP-to-call mapping), [ADR-037](decisions/037-mcp-stdio-transport-exclusion.md) (MCP stdio exclusion), [ADR-038](decisions/038-http3-and-webtransport-as-first-class.md) (HTTP/3 + WebTransport as first-class, correcting the Phase 0 deferral framing), [ADR-039](decisions/039-http-server-and-client-host-colocated.md) (HTTP server + client host colocated in one crate), [ADR-040](decisions/040-webtransport-alpn-stream-proxy.md) (WebTransport ALPN-stream-proxy: browser → WebTransport stream → any ALPN handler via WASM parser; the "VPN-like without being a VPN" use case). ADR-003 Amendment 1 clarifies that `alknet-call` is a protocol-foundation crate (the `alknet-http` → `alknet-call` dependency edge). The specs are in draft; implementation has not started. Three open questions carried: OQ-38 (WebTransport standalone relay service scope, distinct from the in-process ALPN-stream-proxy resolved by ADR-040), OQ-39 (`to_openapi` published-spec versioning), OQ-40 (reqwest client config).
**alknet-http specs drafted.** The alknet-http crate (HTTP interface: `h2`/`http/1.1`/`h3` server + `from_openapi`/`to_openapi`/`from_mcp`/`to_mcp` adapters) now has architecture specs: [crates/http/](crates/http/) (overview, http-server, http-adapters, http-mcp, webtransport) and six new ADRs: [ADR-036](decisions/036-http-to-call-operation-mapping.md) (HTTP-to-call mapping), [ADR-037](decisions/037-mcp-stdio-transport-exclusion.md) (MCP stdio exclusion), [ADR-038](decisions/038-http3-and-webtransport-as-first-class.md) (HTTP/3 + WebTransport as first-class, correcting the Phase 0 deferral framing), [ADR-039](decisions/039-http-server-and-client-host-colocated.md) (HTTP server + client host colocated in one crate), [ADR-040](decisions/040-webtransport-alpn-stream-proxy.md) (WebTransport ALPN-stream-proxy: browser → WebTransport stream → any ALPN handler via WASM parser; the "VPN-like without being a VPN" use case), [ADR-041](decisions/041-mcp-tool-gateway-pattern.md) (`to_mcp` tool-gateway pattern: 4 fixed gateway tools instead of one tool per operation, addressing LLM context tool-bloat). ADR-003 Amendment 1 clarifies that `alknet-call` is a protocol-foundation crate (the `alknet-http` → `alknet-call` dependency edge). The specs are in draft; implementation has not started. Three open questions carried: OQ-38 (WebTransport standalone relay service scope, distinct from the in-process ALPN-stream-proxy resolved by ADR-040), OQ-39 (`to_openapi` published-spec versioning), OQ-40 (reqwest client config).
**Next step**: The storage/repo-pattern ADRs (030–033) are accepted and amend the core and call specs. The next implementation phase is the ADR-029 migration (peer-keyed overlays, `PeerRef` routing, retire `remote_safe`/`trusted_peer`) with the ADR-030 `PeerEntry` change and the ADR-032 `forwarded_for` field folded in; the `OperationContext`, `from_call` handler, and `AuthPolicy` are all under edit, making this the cheapest window. After that: alknet-http implementation (specs drafted, ADRs 036–038 proposed), which consumes the `CredentialStore` trait and the `OperationAdapter` contract. The alknet-ssh crate (the other post-core crate, specced in parallel) proceeds independently; it depends on `alknet-core`, not `alknet-call`.
@@ -93,6 +93,7 @@ The alknet-call crate is **implemented and reviewed** — both the server-side c
| [038](decisions/038-http3-and-webtransport-as-first-class.md) | HTTP/3 and WebTransport as First-Class HTTP Transports | Proposed |
| [039](decisions/039-http-server-and-client-host-colocated.md) | HTTP Server and Client Host Colocated in alknet-http | Proposed |
| [040](decisions/040-webtransport-alpn-stream-proxy.md) | WebTransport ALPN-Stream-Proxy | Proposed |
| [041](decisions/041-mcp-tool-gateway-pattern.md) | MCP Tool-Gateway Pattern for to_mcp | Proposed |
## Open Questions


@@ -41,6 +41,7 @@ on standard ALPNs, and hosts the HTTP-backed call-protocol adapters
| [038](../../decisions/038-http3-and-webtransport-as-first-class.md) | HTTP/3 and WebTransport as First-Class HTTP Transports | `h3` in scope, not deferred |
| [039](../../decisions/039-http-server-and-client-host-colocated.md) | HTTP Server and Client Host Colocated in alknet-http | One crate for server + client host (shared HTTP deps, shared mapping) |
| [040](../../decisions/040-webtransport-alpn-stream-proxy.md) | WebTransport ALPN-Stream-Proxy | Browser → WebTransport stream → any ALPN handler (SSH, git, SFTP) via WASM parser |
| [041](../../decisions/041-mcp-tool-gateway-pattern.md) | MCP Tool-Gateway Pattern for to_mcp | 4 fixed gateway tools (search/schema/call/batch), not one tool per operation; Subscription excluded |
## Relevant Open Questions


@@ -129,10 +129,17 @@ pub fn to_mcp_service(
) -> StreamableHttpService<...>;
```
`to_mcp` exposes the local registry's `External` operations as MCP tools
over streamable HTTP, using rmcp's `StreamableHttpService` (an
axum-compatible tower service). The rmcp
`simple_auth_streamhttp.rs` server example shows the pattern:
`to_mcp` exposes the local registry's operations as a **fixed gateway
tool set** over streamable HTTP — not one MCP tool per operation. This
is the tool-gateway pattern (ADR-041): the LLM has a few tools in
context (search, schema, call, batch), not hundreds, and discovers
operations on demand through the gateway. See
[ADR-041](../../decisions/041-mcp-tool-gateway-pattern.md) for the
rationale (the tool-bloat problem, the `memory`/`worktree` tool pattern
that informed the design).
The rmcp `simple_auth_streamhttp.rs` server example shows the
streamable-HTTP-service-into-axum-`Router` pattern:
```rust
// From the rmcp example:
@@ -148,25 +155,59 @@ let protected_mcp_router = Router::new()
.layer(middleware::from_fn_with_state(token_store, auth_middleware));
```
`alknet-http`'s `to_mcp` follows the same pattern: the local operations
are exposed as an MCP server (an rmcp `Service` impl that wraps the
`OperationRegistry`), the `StreamableHttpService` nests into the axum
`Router` at `/mcp`, and a Bearer auth middleware gates access (the
`simple_auth_streamhttp.rs` `auth_middleware` + `extract_token` pattern).
`alknet-http`'s `to_mcp` follows the same axum integration pattern,
but the rmcp `Service` impl is a gateway service (4 fixed tools) rather
than a per-operation tool registry.
The `to_mcp` service:
#### The gateway tool set
1. On MCP `tools/list`: returns the local registry's `External`
operations as MCP tools (name, description, `inputSchema`).
2. On MCP `tools/call`: dispatches to the `OperationRegistry::invoke()`
— the same dispatch path the HTTP server uses for HTTP requests
(ADR-036). The MCP tool call becomes a `call.requested` internally.
The result is mapped back to the MCP `tools/call` response shape
(`structuredContent` or `content` blocks).
`to_mcp` exposes four MCP tools that gate access to the full operation
registry:
| MCP tool | Call protocol operation | Purpose |
|----------|------------------------|---------|
| `search` | `services/list` | List/search available operations (filtered by the caller's `AccessControl`). Returns names + descriptions, not full schemas. |
| `schema` | `services/schema` | Get an operation's full `OperationSpec` (input/output JSON Schemas, error schemas). |
| `call` | `call.requested` (Query/Mutation) | Invoke an operation by name with a JSON input. Returns the output or a typed error (ADR-023). |
| `batch` | multiple `call.requested` | Invoke multiple operations in one tool call (correlated request IDs, OQ-14). |
The LLM calls `search` to discover operations, `schema` to learn an
operation's input shape, `call` to invoke. Same pattern as `man
<command>` — discover on demand, don't preload. See ADR-041 for the
rationale.
#### `Subscription` exclusion
The gateway exposes only `Query` and `Mutation` operations
(request/response). `Subscription` operations (streaming) are filtered
out of `search` results and cannot be invoked via `call` — MCP tool
calls are request/response by protocol design; streaming subscriptions
don't fit the LLM tool-call pattern. See ADR-041 §2.
#### `to_mcp` service behavior
1. On MCP `tools/list`: returns the fixed gateway tool set (4 tools:
`search`, `schema`, `call`, `batch`), not the registry's
operations. The gateway tools have stable names and schemas; the
registry's operations are discovered through `search`.
2. On MCP `tools/call`:
- `search` → dispatches `services/list` (filtered by the caller's
`AccessControl`), returns operation names + descriptions.
- `schema` → dispatches `services/schema`, returns the
`OperationSpec`.
- `call` → dispatches `OperationRegistry::invoke()` (the same
dispatch path the HTTP server uses, ADR-036). The result is
mapped to an MCP `CallToolResult` (`structuredContent` for the
output, or `isError: true` for a `CallError` with typed
`details` per ADR-023).
- `batch` → dispatches multiple `call.requested` events, returns
an array of results.
3. Auth: the Bearer middleware resolves the token via
`IdentityProvider::resolve_from_token()`, same as the HTTP server's
auth (ADR-004). The MCP client authenticates by bearer token and has no
`PeerId` (browsers and MCP clients are not alknet peers; see ADR-034 §4).
`AccessControl` gates `search` results and `call` dispatch — the
LLM sees only what it's authorized to call.
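
To make the auth step concrete, here is a minimal sketch of bearer-token extraction and lookup. It is not the rmcp example's `extract_token`, nor alknet's `IdentityProvider::resolve_from_token()`: `extract_bearer`, `TokenTable`, and its in-memory map are hypothetical stand-ins, and the identity is modeled as a plain string for illustration.

```rust
use std::collections::HashMap;

/// Pull the token out of an `Authorization` header value.
/// Returns `None` for other schemes or an empty token.
pub fn extract_bearer(header_value: &str) -> Option<&str> {
    let (scheme, token) = header_value.trim().split_once(' ')?;
    let token = token.trim();
    // Auth scheme names are matched case-insensitively.
    if scheme.eq_ignore_ascii_case("Bearer") && !token.is_empty() {
        Some(token)
    } else {
        None
    }
}

/// Hypothetical stand-in for an identity provider: token -> identity name.
pub struct TokenTable {
    identities: HashMap<String, String>,
}

impl TokenTable {
    pub fn new(pairs: &[(&str, &str)]) -> Self {
        let identities = pairs
            .iter()
            .map(|(token, identity)| (token.to_string(), identity.to_string()))
            .collect();
        TokenTable { identities }
    }

    /// Resolve a header value to an identity, or `None` if the request
    /// is unauthenticated or the token is unknown.
    pub fn resolve(&self, header_value: &str) -> Option<&str> {
        let token = extract_bearer(header_value)?;
        self.identities.get(token).map(String::as_str)
    }
}
```

In this sketch a request that resolves to `None` would be rejected before any gateway tool runs, so `search` and `call` only ever see an authenticated identity.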
### No-Env-Vars
@@ -186,6 +227,16 @@ external MCP clients (an editor, an AI tool) discover and call alknet
operations through the MCP protocol, without those clients needing to
speak EventEnvelope.
`to_mcp` uses the **tool-gateway pattern** (ADR-041): a fixed set of
meta-tools (`search`, `schema`, `call`, `batch`) gates access to the
full operation registry, so the LLM has a few tools in context instead
of hundreds. This addresses the tool-bloat problem — an LLM connecting
to a node with 200 operations gets 4 MCP tools, not 200, and discovers
operations on demand through `search` + `schema`. Same pattern as the
`memory` and `worktree` tools (one entry point, large dataset behind
it), and the same principle as Linux's `man` command (don't preload all
documentation; query on demand).
The streamable-HTTP-only constraint (ADR-037) is a security position:
alknet does not import the MCP stdio RCE vector. The streamable HTTP
path is network-isolated, auth-gatable, and runs under alknet's
@@ -213,6 +264,7 @@ every other HTTP request.
| Decision | ADR | Summary |
|----------|-----|---------|
| MCP stdio transport excluded | [ADR-037](../../decisions/037-mcp-stdio-transport-exclusion.md) | Streamable HTTP only; stdio is not built |
| `to_mcp` tool-gateway pattern | [ADR-041](../../decisions/041-mcp-tool-gateway-pattern.md) | 4 fixed gateway tools (search/schema/call/batch), not one tool per operation; Subscription excluded |
| `from_mcp` is an `OperationAdapter` | [ADR-017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | Async trait; produces `HandlerRegistration` bundles |
| `to_mcp` is a projection | [ADR-017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | Consumes the registry, doesn't produce entries |
| Adapter-registered ops are `Internal` | [ADR-015](../../decisions/015-privilege-model-and-authority-context.md) | `from_mcp` ops are composition material |


@@ -206,6 +206,7 @@ verified against this invariant. See ADR-014 and
| HTTP/3 + WebTransport first-class | [ADR-038](../../decisions/038-http3-and-webtransport-as-first-class.md) | `h3` in scope, not deferred; browser streaming uses QUIC streams |
| HTTP server + client host colocated | [ADR-039](../../decisions/039-http-server-and-client-host-colocated.md) | One crate for server + adapters (shared HTTP deps, shared mapping) |
| WebTransport ALPN-stream-proxy | [ADR-040](../../decisions/040-webtransport-alpn-stream-proxy.md) | Browser → WebTransport stream → any ALPN handler (SSH, git, SFTP) via WASM parser |
| `to_mcp` tool-gateway pattern | [ADR-041](../../decisions/041-mcp-tool-gateway-pattern.md) | 4 fixed gateway tools (search/schema/call/batch), not one tool per operation |
| `alknet-call` is protocol-foundation | [ADR-003](../../decisions/003-crate-decomposition.md) Am. 1 | `alknet-http` depends on `alknet-call` (types, not peer handler) |
| Bearer auth via `resolve_from_token` | [ADR-004](../../decisions/004-auth-as-shared-core.md) | HTTP handler credential source + resolution (settled) |
| Stealth mode = HTTP handler on standard ALPNs | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Decoy for unknown paths (settled) |


@@ -0,0 +1,215 @@
# ADR-041: MCP Tool-Gateway Pattern for to_mcp
## Status
Proposed
## Context
The current `to_mcp` spec (`crates/http/http-mcp.md`) describes
`to_mcp` as "exposes the local registry's `External` operations as MCP
tools" — one MCP tool per alknet operation. An LLM connecting to an
alknet node with 200 registered operations gets 200 MCP tools dumped
into its context. This is the **tool-bloat problem**: the LLM's context
is bloated with tools that are irrelevant to the current task, degrading
its reasoning and wasting context budget.
### The problem in concrete terms
The MCP `tools/list` response returns every tool the server exposes.
An MCP client (an editor, an AI tool) loads all of them into the LLM's
context as tool definitions. An alknet node exposing 200 operations
produces a `tools/list` response with 200 `Tool` structs, each with a
name, description, and `inputSchema` (JSON Schema). The LLM sees 200
tool definitions whether it needs them or not. This is the same
anti-pattern as loading every man page into a shell's environment:
absurd, but it's what the naive one-tool-per-operation mapping produces.
### The pattern that works
The project already has two examples of a better pattern:
1. **The `memory` tool** (opencode): read-only access to the underlying
session database. The LLM doesn't load all past sessions into
context — it calls `memory` with a search query when it needs to
recall something. One tool, access to a large dataset on demand.
2. **The `worktree` tool** (opencode): gates 8-10 sub-tools behind a
single `worktree` entry point. The LLM has one tool in context; the
sub-tools are discovered and invoked through it.
The general principle (same as Linux's `man` command): **don't load all
documentation/tools into context 24/7; expose a small fixed set of
meta-tools that gate access to the full set on demand.**
### The call protocol's discovery surface
The call protocol already has the discovery primitives that make this
work:
- `services/list` — lists registered operations (filtered by
`AccessControl`).
- `services/schema` — returns an operation's `OperationSpec`
(input/output JSON Schemas, error schemas).
The `to_mcp` gateway exposes these primitives (plus invocation) as a
small fixed set of MCP tools. The LLM searches for what it needs, learns
the schema, then calls — instead of having every operation pre-loaded.
## Decision
### 1. `to_mcp` exposes a fixed gateway tool set, not one tool per operation
`to_mcp` exposes a small fixed set of MCP tools that gate access to the
full operation registry. The LLM has a few tools in context (not
hundreds); it discovers and invokes operations through the gateway.
The gateway tool set (initial, two-way-door extensible):
| MCP tool | Call protocol operation | Purpose |
|----------|------------------------|---------|
| `search` | `services/list` | List/search available operations (filtered by the caller's `AccessControl`). The LLM discovers what it can call. |
| `schema` | `services/schema` | Get an operation's `OperationSpec` (input/output JSON Schemas, error schemas). The LLM learns how to call a specific operation. |
| `call` | `call.requested` (Query/Mutation) | Invoke an operation by name with a JSON input. Returns the operation's output (or a typed error per ADR-023). |
| `batch` | multiple `call.requested` | Invoke multiple operations in one tool call (correlated request IDs, OQ-14). The LLM batches independent calls. |
Four tools. The LLM calls `search` to find operations relevant to its
task, `schema` to learn the input shape, `call` to invoke. Same pattern
as `man <command>` — discover on demand, don't preload.
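
As an illustration of the fixed tool surface, here is a minimal Rust sketch. It is not the rmcp server trait implementation: `GatewayTool`, `from_name`, `backing`, and `list_tool_names` are hypothetical names introduced here, and only the four tool names and their backing call-protocol operations come from the table above.

```rust
/// The four gateway tools from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GatewayTool {
    Search,
    Schema,
    Call,
    Batch,
}

impl GatewayTool {
    /// Every gateway tool, in a stable order.
    pub const ALL: [GatewayTool; 4] = [
        GatewayTool::Search,
        GatewayTool::Schema,
        GatewayTool::Call,
        GatewayTool::Batch,
    ];

    /// The MCP tool name.
    pub fn name(self) -> &'static str {
        match self {
            GatewayTool::Search => "search",
            GatewayTool::Schema => "schema",
            GatewayTool::Call => "call",
            GatewayTool::Batch => "batch",
        }
    }

    /// Parse the tool name of an incoming tool call; unknown names are rejected.
    pub fn from_name(name: &str) -> Option<GatewayTool> {
        GatewayTool::ALL.iter().copied().find(|tool| tool.name() == name)
    }

    /// The call-protocol operation or event behind each tool.
    pub fn backing(self) -> &'static str {
        match self {
            GatewayTool::Search => "services/list",
            GatewayTool::Schema => "services/schema",
            GatewayTool::Call | GatewayTool::Batch => "call.requested",
        }
    }
}

/// The tool listing does not depend on how many operations are registered.
pub fn list_tool_names(_registered_operations: usize) -> Vec<&'static str> {
    GatewayTool::ALL.iter().map(|tool| tool.name()).collect()
}
```

Under these assumptions, `list_tool_names(200)` and `list_tool_names(0)` return the same four names, and a tool call naming anything else (for example a per-operation name such as `git_clone`) fails to parse.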
### 2. `Subscription` operations are excluded from the MCP gateway
MCP tool calls are request/response — an LLM invokes a tool and
receives a result. The call protocol's `Subscription` type
(streaming, many `call.responded` events) does not map onto the MCP
tool-call model. The gateway exposes only `Query` and `Mutation`
operations (request/response). `Subscription` operations are filtered
out of `search` results and cannot be invoked via `call`.
This is a deliberate scoping decision, not a deferral: MCP tool calls
are request/response by protocol design; streaming subscriptions are a
different interaction model that doesn't fit the LLM tool-call pattern.
If a future MCP extension adds streaming tool calls, the gateway could
expose `Subscription` operations through it — but that's a future MCP
spec question, not an alknet decision.
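
The exclusion can be sketched as one predicate applied in two places. This is an illustrative model under stated assumptions: `OperationEntry`, `CallRejection`, `search_names`, and `check_callable` are hypothetical, and the real registry types and error codes may differ.

```rust
/// The three operation types named in this ADR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OperationType {
    Query,
    Mutation,
    Subscription,
}

/// Hypothetical, simplified registry entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OperationEntry {
    pub name: String,
    pub op_type: OperationType,
}

/// Hypothetical rejection reasons for the `call` tool.
#[derive(Debug, PartialEq, Eq)]
pub enum CallRejection {
    UnknownOperation,
    SubscriptionExcluded,
}

/// Only request/response operations are reachable through the gateway.
pub fn is_gateway_callable(op_type: OperationType) -> bool {
    matches!(op_type, OperationType::Query | OperationType::Mutation)
}

/// `search` side: Subscription operations never appear in results.
pub fn search_names(ops: &[OperationEntry]) -> Vec<&str> {
    ops.iter()
        .filter(|op| is_gateway_callable(op.op_type))
        .map(|op| op.name.as_str())
        .collect()
}

/// `call` side: the same predicate guards invocation by name.
pub fn check_callable(ops: &[OperationEntry], name: &str) -> Result<(), CallRejection> {
    match ops.iter().find(|op| op.name == name) {
        None => Err(CallRejection::UnknownOperation),
        Some(op) if !is_gateway_callable(op.op_type) => Err(CallRejection::SubscriptionExcluded),
        Some(_) => Ok(()),
    }
}
```

Using the same predicate for both paths keeps `search` and `call` consistent: an operation hidden from discovery cannot be invoked by guessing its name.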
### 3. `search` returns names + descriptions, not full schemas
The `search` tool (backed by `services/list`) returns operation names,
namespaces, types, and short descriptions — not the full input/output
JSON Schemas. This keeps the search result small (the LLM is choosing
what to call, not how to call it yet). The LLM calls `schema` for the
specific operation it wants to invoke, getting the full `OperationSpec`
only when needed. Two-step discovery: search (cheap, list) → schema
(targeted, full spec).
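
A minimal sketch of the two-step discovery, assuming an in-memory registry. The `OperationSpec` here is a simplified stand-in for the real type (schemas are kept as JSON text), `SearchHit` and `Registry` are hypothetical, and namespaces and operation types are left out of the hit for brevity.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the full spec returned by `schema`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OperationSpec {
    pub description: String,
    pub input_schema: String,
    pub output_schema: String,
}

/// What `search` returns: enough to choose an operation, not to call it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SearchHit {
    pub name: String,
    pub description: String,
}

#[derive(Debug, Default)]
pub struct Registry {
    specs: BTreeMap<String, OperationSpec>,
}

impl Registry {
    pub fn register(&mut self, name: &str, spec: OperationSpec) {
        self.specs.insert(name.to_string(), spec);
    }

    /// Step 1: cheap listing, filtered by a keyword, with no schemas attached.
    pub fn search(&self, keyword: &str) -> Vec<SearchHit> {
        self.specs
            .iter()
            .filter(|(name, spec)| name.contains(keyword) || spec.description.contains(keyword))
            .map(|(name, spec)| SearchHit {
                name: name.clone(),
                description: spec.description.clone(),
            })
            .collect()
    }

    /// Step 2: the full spec for one chosen operation.
    pub fn schema(&self, name: &str) -> Option<&OperationSpec> {
        self.specs.get(name)
    }
}
```

The point of the split is that the size of a `search` result grows with the number of matching operations but not with the size of their schemas; the schema cost is paid once, for the operation actually chosen.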
### 4. `call` maps to the call protocol's request/response dispatch
The `call` tool takes `{ operation: "/fs/readFile", input: { ... } }`
and dispatches through `OperationRegistry::invoke()`, the same
dispatch path the HTTP server uses (ADR-036). The result is mapped to
an MCP `CallToolResult` (`structuredContent` for the output, or
`isError: true` for a `CallError` with the typed `details` payload per
ADR-023). The `batch` tool takes an array of `{ operation, input }`
pairs and returns an array of results.
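
The result mapping and the `batch` shape can be sketched as follows. These are stand-in types, not rmcp's `CallToolResult` or alknet's `CallError`: `ToolResult`, `BatchItem`, `to_tool_result`, and `run_batch` are hypothetical, JSON payloads are kept as plain strings, and the batch here runs sequentially, which is an assumption rather than something the ADR specifies.

```rust
/// Stand-in for a call-protocol error with a typed details payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CallError {
    pub code: String,
    pub details: String,
}

/// Stand-in for an MCP tool result: output on success, details on error.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolResult {
    pub is_error: bool,
    pub error_code: Option<String>,
    pub payload: String,
}

/// One `{ operation, input }` pair of a batch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BatchItem {
    pub operation: String,
    pub input: String,
}

/// Map a dispatch outcome onto the tool-result shape.
pub fn to_tool_result(outcome: Result<String, CallError>) -> ToolResult {
    match outcome {
        Ok(output) => ToolResult { is_error: false, error_code: None, payload: output },
        Err(err) => ToolResult {
            is_error: true,
            error_code: Some(err.code),
            payload: err.details,
        },
    }
}

/// Run each item through `invoke` and return results in input order.
/// One failing item does not abort the rest of the batch.
pub fn run_batch<F>(items: &[BatchItem], invoke: F) -> Vec<ToolResult>
where
    F: Fn(&str, &str) -> Result<String, CallError>,
{
    items
        .iter()
        .map(|item| to_tool_result(invoke(item.operation.as_str(), item.input.as_str())))
        .collect()
}
```

Returning results positionally lets the caller match each result to its request by index; the correlated request IDs mentioned for `batch` (OQ-14) are not modeled in this sketch.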
### 5. `AccessControl` gates the gateway
The `search` tool's results are filtered by the caller's
`AccessControl::check(identity)` — the LLM (authenticated by bearer
token, ADR-034 §4) sees only the operations it is authorized to call.
The `call` tool's dispatch runs the same `AccessControl` check. An
LLM that calls `call` with an operation it isn't authorized for gets
`FORBIDDEN` (mapped to an MCP error result). The gateway does not
bypass the call protocol's authorization — it's the same dispatch
path, just reached through an MCP tool call instead of an HTTP request.
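
A sketch of the single authorization check shared by `search` and `call`. The real `AccessControl::check(identity)` signature is not shown in this ADR, so `Identity`, `is_authorized`, and the per-operation allow-list below are hypothetical simplifications.

```rust
use std::collections::HashSet;

/// Hypothetical identity resolved from the bearer token.
pub struct Identity {
    allowed_operations: HashSet<String>,
}

impl Identity {
    pub fn allowing(operations: &[&str]) -> Self {
        Identity {
            allowed_operations: operations.iter().map(|op| op.to_string()).collect(),
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum GatewayError {
    Forbidden,
}

/// The one check both tools go through.
fn is_authorized(identity: &Identity, operation: &str) -> bool {
    identity.allowed_operations.contains(operation)
}

/// `search`: unauthorized operations are simply absent from the result.
pub fn search<'a>(registered: &[&'a str], identity: &Identity) -> Vec<&'a str> {
    registered
        .iter()
        .copied()
        .filter(|op| is_authorized(identity, op))
        .collect()
}

/// `call`: the same check runs again at dispatch time, so guessing a
/// name that `search` did not return still yields FORBIDDEN.
pub fn call(operation: &str, identity: &Identity) -> Result<String, GatewayError> {
    if !is_authorized(identity, operation) {
        return Err(GatewayError::Forbidden);
    }
    // Placeholder for the real dispatch through the registry.
    Ok(format!("invoked {operation}"))
}
```

Checking at both points matters because `search` filtering alone is only a discovery convenience; the dispatch-time check is what enforces the policy.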
## Consequences
**Positive:**
- The LLM has 4 tools in context, not hundreds. Context budget is
preserved for the actual task; the LLM discovers operations on
demand through `search` + `schema`. This is the same pattern that
makes the `memory` and `worktree` tools effective.
- The gateway maps onto the call protocol's existing discovery
primitives (`services/list`, `services/schema`) and dispatch
(`OperationRegistry::invoke`). No new call-protocol mechanisms
needed — `to_mcp` is a thin wrapper around the existing surface.
- `AccessControl` gates the gateway. An LLM sees only what it's
authorized to call; the gateway doesn't leak operation existence or
schemas to unauthorized callers.
- `Subscription` exclusion is explicit. The LLM tool-call model is
request/response; streaming doesn't fit, and pretending it does
would produce a broken mapping.
**Negative:**
- The LLM needs two discovery round-trips before it can call an
operation it hasn't seen before (`search` → `schema` → `call`). A
one-tool-per-operation mapping would let it call directly. The
tradeoff: 4 tools in context plus 2 discovery round-trips vs. 200
tools in context and no discovery round-trips. The context budget is
the scarcer resource; the round-trips are cheap (the MCP server is
local or nearby).
- The `search` tool's result format (names + descriptions, not full
schemas) means the LLM may need to call `schema` for multiple
operations before finding the right one. Mitigated: `search` can
accept a query/filter (namespace, keyword) to narrow results.
- The gateway tool set is fixed (4 tools). An operation that wants a
custom MCP tool (e.g., a specialized `git_clone` tool with a curated
input schema, not the generic `call` wrapper) is not exposed through
the gateway. A future "custom tool" extension could allow operations
to declare an MCP tool projection — but the gateway pattern is the
default, and the custom-tool path is additive (not a replacement).
## Assumptions
1. **The LLM context budget is the scarcer resource.** The tradeoff
favoring 4 tools + discovery round-trips over 200 preloaded tools
assumes the LLM's context window is more valuable than the network
round-trips. This holds for current LLMs (context windows are
large but not unlimited; tool definitions consume context
proportionally to their schemas).
2. **`Query` and `Mutation` cover the LLM tool-call use case.**
LLMs invoke tools in a request/response pattern: call a tool,
receive a result, reason about it. Streaming subscriptions
(`call.responded` events over time) don't fit this pattern — the
LLM expects one result per tool call. The assumption is that the
operations an LLM wants to call are `Query`/`Mutation`, not
`Subscription`.
3. **The gateway tool set is stable.** Once LLM clients build
prompts/workflows against the `search`/`schema`/`call`/`batch`
tool set, changing the tool surface (renaming, removing) breaks
them. Adding tools is additive (non-breaking); removing or renaming
is a one-way door. The initial 4-tool set is the published contract.
4. **`AccessControl` filtering is sufficient for `search`.** The LLM
sees the operations it's authorized to call. If an operation's
existence is itself sensitive (the LLM shouldn't learn that it
exists, let alone call it), `Visibility::Internal` (ADR-015) is the
mechanism: Internal ops are excluded from `services/list` and
therefore from `search` results. The gateway does not add a
separate visibility layer.
## References
- [ADR-015](015-privilege-model-and-authority-context.md) —
External/Internal visibility (Internal ops excluded from
`services/list`, therefore from `search`)
- [ADR-017](017-call-protocol-client-and-adapter-contract.md) —
`to_*` adapters are projections (consume the registry, don't
produce entries)
- [ADR-023](023-operation-error-schemas.md) — typed error `details`
mapped to MCP error results
- [ADR-034](034-outgoing-only-x509-and-three-peer-roles.md) §4 —
browsers/MCP clients are not alknet peers (bearer token, no
`PeerId`)
- [ADR-036](036-http-to-call-operation-mapping.md) — the HTTP-to-call
dispatch path the `call` tool reuses
- [ADR-037](037-mcp-stdio-transport-exclusion.md) — streamable HTTP
only (the transport `to_mcp` uses)
- `crates/http/http-mcp.md` — the spec that implements the gateway
- `/workspace/rust-sdk/crates/rmcp/src/model/tool.rs` — the MCP
`Tool` struct (name, description, input_schema, output_schema)
- `/workspace/rust-sdk/crates/rmcp/src/handler/server.rs`: the
`list_tools` / `call_tool` server trait (the interface `to_mcp`
implements)