diff --git a/docs/architecture/README.md b/docs/architecture/README.md
index 23c0f38..d6b8a70 100644
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -18,7 +18,7 @@ The storage and auth strategy research (`docs/research/alknet-storage-strategy/f
 
 The alknet-call crate is **implemented and reviewed** — both the server-side core and the client/adapter surface (207 lib + 2 integration tests passing). The alknet-core and alknet-call crate specs are in draft; the alknet-vault crate specs are stable.
 
-**alknet-http specs drafted.** The alknet-http crate (HTTP interface — `h2`/`http/1.1`/`h3` server + `from_openapi`/`to_openapi`/`from_mcp`/`to_mcp` adapters) now has architecture specs: [crates/http/](crates/http/) (overview, http-server, http-adapters, http-mcp, webtransport) and five new ADRs — [ADR-036](decisions/036-http-to-call-operation-mapping.md) (HTTP-to-call mapping), [ADR-037](decisions/037-mcp-stdio-transport-exclusion.md) (MCP stdio exclusion), [ADR-038](decisions/038-http3-and-webtransport-as-first-class.md) (HTTP/3 + WebTransport as first-class, correcting the Phase 0 deferral framing), [ADR-039](decisions/039-http-server-and-client-host-colocated.md) (HTTP server + client host colocated in one crate), [ADR-040](decisions/040-webtransport-alpn-stream-proxy.md) (WebTransport ALPN-stream-proxy — browser → WebTransport stream → any ALPN handler via WASM parser; the "VPN-like without being a VPN" use case). ADR-003 Amendment 1 clarifies that `alknet-call` is a protocol-foundation crate (the `alknet-http` → `alknet-call` dependency edge). The specs are in draft; implementation has not started. Three open questions carried: OQ-38 (WebTransport standalone relay service scope — distinct from the in-process ALPN-stream-proxy resolved by ADR-040), OQ-39 (`to_openapi` published-spec versioning), OQ-40 (reqwest client config).
+**alknet-http specs drafted.** The alknet-http crate (HTTP interface — `h2`/`http/1.1`/`h3` server + `from_openapi`/`to_openapi`/`from_mcp`/`to_mcp` adapters) now has architecture specs: [crates/http/](crates/http/) (overview, http-server, http-adapters, http-mcp, webtransport) and six new ADRs — [ADR-036](decisions/036-http-to-call-operation-mapping.md) (HTTP-to-call mapping), [ADR-037](decisions/037-mcp-stdio-transport-exclusion.md) (MCP stdio exclusion), [ADR-038](decisions/038-http3-and-webtransport-as-first-class.md) (HTTP/3 + WebTransport as first-class, correcting the Phase 0 deferral framing), [ADR-039](decisions/039-http-server-and-client-host-colocated.md) (HTTP server + client host colocated in one crate), [ADR-040](decisions/040-webtransport-alpn-stream-proxy.md) (WebTransport ALPN-stream-proxy — browser → WebTransport stream → any ALPN handler via WASM parser; the "VPN-like without being a VPN" use case), [ADR-041](decisions/041-mcp-tool-gateway-pattern.md) (`to_mcp` tool-gateway pattern — 4 fixed gateway tools instead of one tool per operation, addressing LLM context tool-bloat). ADR-003 Amendment 1 clarifies that `alknet-call` is a protocol-foundation crate (the `alknet-http` → `alknet-call` dependency edge). The specs are in draft; implementation has not started. Three open questions carried: OQ-38 (WebTransport standalone relay service scope — distinct from the in-process ALPN-stream-proxy resolved by ADR-040), OQ-39 (`to_openapi` published-spec versioning), OQ-40 (reqwest client config).
 
 **Next step**: The storage/repo-pattern ADRs (030–033) are accepted and amend the core and call specs. The next implementation phase is the ADR-029 migration (peer-keyed overlays, `PeerRef` routing, retire `remote_safe`/`trusted_peer`) with the ADR-030 `PeerEntry` change and the ADR-032 `forwarded_for` field folded in — the `OperationContext`, `from_call` handler, and `AuthPolicy` are all under edit, making this the cheapest window. After that: alknet-http implementation (specs drafted, ADRs 036–038 proposed), which consumes the `CredentialStore` trait and the `OperationAdapter` contract. The alknet-ssh crate (the other post-core crate, specced in parallel) proceeds independently — it depends on `alknet-core`, not `alknet-call`.
 
@@ -93,6 +93,7 @@ The alknet-call crate is **implemented and reviewed** — both the server-side c
 | [038](decisions/038-http3-and-webtransport-as-first-class.md) | HTTP/3 and WebTransport as First-Class HTTP Transports | Proposed |
 | [039](decisions/039-http-server-and-client-host-colocated.md) | HTTP Server and Client Host Colocated in alknet-http | Proposed |
 | [040](decisions/040-webtransport-alpn-stream-proxy.md) | WebTransport ALPN-Stream-Proxy | Proposed |
+| [041](decisions/041-mcp-tool-gateway-pattern.md) | MCP Tool-Gateway Pattern for to_mcp | Proposed |
 
 ## Open Questions
 
diff --git a/docs/architecture/crates/http/README.md b/docs/architecture/crates/http/README.md
index 84b7012..538de41 100644
--- a/docs/architecture/crates/http/README.md
+++ b/docs/architecture/crates/http/README.md
@@ -41,6 +41,7 @@ on standard ALPNs, and hosts the HTTP-backed call-protocol adapters
 | [038](../../decisions/038-http3-and-webtransport-as-first-class.md) | HTTP/3 and WebTransport as First-Class HTTP Transports | `h3` in scope, not deferred |
 | [039](../../decisions/039-http-server-and-client-host-colocated.md) | HTTP Server and Client Host Colocated in alknet-http | One crate for server + client host (shared HTTP deps, shared mapping) |
 | [040](../../decisions/040-webtransport-alpn-stream-proxy.md) | WebTransport ALPN-Stream-Proxy | Browser → WebTransport stream → any ALPN handler (SSH, git, SFTP) via WASM parser |
+| [041](../../decisions/041-mcp-tool-gateway-pattern.md) | MCP Tool-Gateway Pattern for to_mcp | 4 fixed gateway tools (search/schema/call/batch), not one tool per operation; Subscription excluded |
 
 ## Relevant Open Questions
 
diff --git a/docs/architecture/crates/http/http-mcp.md b/docs/architecture/crates/http/http-mcp.md
index 5349acd..62e2132 100644
--- a/docs/architecture/crates/http/http-mcp.md
+++ b/docs/architecture/crates/http/http-mcp.md
@@ -129,10 +129,17 @@ pub fn to_mcp_service(
 ) -> StreamableHttpService<...>;
 ```
 
-`to_mcp` exposes the local registry's `External` operations as MCP tools
-over streamable HTTP, using rmcp's `StreamableHttpService` (an
-axum-compatible tower service). The rmcp
-`simple_auth_streamhttp.rs` server example shows the pattern:
+`to_mcp` exposes the local registry's operations as a **fixed gateway
+tool set** over streamable HTTP — not one MCP tool per operation. This
+is the tool-gateway pattern (ADR-041): the LLM has a few tools in
+context (search, schema, call, batch), not hundreds, and discovers
+operations on demand through the gateway. See
+[ADR-041](../../decisions/041-mcp-tool-gateway-pattern.md) for the
+rationale (the tool-bloat problem, the `memory`/`worktree` tool pattern
+that informed the design).
+
+The rmcp `simple_auth_streamhttp.rs` server example shows the
+streamable-HTTP-service-into-axum-`Router` pattern:
 
 ```rust
 // From the rmcp example:
@@ -148,25 +155,59 @@ let protected_mcp_router = Router::new()
     .layer(middleware::from_fn_with_state(token_store, auth_middleware));
 ```
 
-`alknet-http`'s `to_mcp` follows the same pattern: the local operations
-are exposed as an MCP server (an rmcp `Service` impl that wraps the
-`OperationRegistry`), the `StreamableHttpService` nests into the axum
-`Router` at `/mcp`, and a Bearer auth middleware gates access (the
-`simple_auth_streamhttp.rs` `auth_middleware` + `extract_token` pattern).
+`alknet-http`'s `to_mcp` follows the same axum integration pattern,
+but the rmcp `Service` impl is a gateway service (4 fixed tools) rather
+than a per-operation tool registry.
 
-The `to_mcp` service:
+#### The gateway tool set
 
-1. On MCP `tools/list`: returns the local registry's `External`
-   operations as MCP tools (name, description, `inputSchema`).
-2. On MCP `tools/call`: dispatches to the `OperationRegistry::invoke()`
-   — the same dispatch path the HTTP server uses for HTTP requests
-   (ADR-036). The MCP tool call becomes a `call.requested` internally.
-   The result is mapped back to the MCP `tools/call` response shape
-   (`structuredContent` or `content` blocks).
+`to_mcp` exposes four MCP tools that gate access to the full operation
+registry:
+
+| MCP tool | Call protocol operation | Purpose |
+|----------|------------------------|---------|
+| `search` | `services/list` | List/search available operations (filtered by the caller's `AccessControl`). Returns names + descriptions, not full schemas. |
+| `schema` | `services/schema` | Get an operation's full `OperationSpec` (input/output JSON Schemas, error schemas). |
+| `call` | `call.requested` (Query/Mutation) | Invoke an operation by name with a JSON input. Returns the output or a typed error (ADR-023). |
+| `batch` | multiple `call.requested` | Invoke multiple operations in one tool call (correlated request IDs, OQ-14). |
+
+The LLM calls `search` to discover operations, `schema` to learn an
+operation's input shape, `call` to invoke. Same pattern as
+`man` — discover on demand, don't preload. See ADR-041 for the
+rationale.
+
+#### `Subscription` exclusion
+
+The gateway exposes only `Query` and `Mutation` operations
+(request/response). `Subscription` operations (streaming) are filtered
+out of `search` results and cannot be invoked via `call` — MCP tool
+calls are request/response by protocol design; streaming subscriptions
+don't fit the LLM tool-call pattern. See ADR-041 §2.
+
+#### `to_mcp` service behavior
+
+1. On MCP `tools/list`: returns the fixed gateway tool set (4 tools:
+   `search`, `schema`, `call`, `batch`), not the registry's
+   operations. The gateway tools have stable names and schemas; the
+   registry's operations are discovered through `search`.
+2. On MCP `tools/call`:
+   - `search` → dispatches `services/list` (filtered by the caller's
+     `AccessControl`), returns operation names + descriptions.
+   - `schema` → dispatches `services/schema`, returns the
+     `OperationSpec`.
+   - `call` → dispatches `OperationRegistry::invoke()` (the same
+     dispatch path the HTTP server uses, ADR-036). The result is
+     mapped to an MCP `CallToolResult` (`structuredContent` for the
+     output, or `isError: true` for a `CallError` with typed
+     `details` per ADR-023).
+   - `batch` → dispatches multiple `call.requested` events, returns
+     an array of results.
 3. Auth: the Bearer middleware resolves the token via
    `IdentityProvider::resolve_from_token()`, same as the HTTP server's
    auth (ADR-004). The MCP client authenticates by bearer token; no
    `PeerId` (browsers and MCP clients are not alknet peers — ADR-034 §4).
+   `AccessControl` gates `search` results and `call` dispatch — the
+   LLM sees only what it's authorized to call.
 
 ### No-Env-Vars
 
@@ -186,6 +227,16 @@ external MCP clients (an editor, an AI tool) discover and call alknet
 operations through the MCP protocol, without those clients needing to
 speak EventEnvelope.
 
+`to_mcp` uses the **tool-gateway pattern** (ADR-041): a fixed set of
+meta-tools (`search`, `schema`, `call`, `batch`) gates access to the
+full operation registry, so the LLM has a few tools in context instead
+of hundreds. This addresses the tool-bloat problem — an LLM connecting
+to a node with 200 operations gets 4 MCP tools, not 200, and discovers
+operations on demand through `search` + `schema`. Same pattern as the
+`memory` and `worktree` tools (one entry point, large dataset behind
+it), and the same principle as Linux's `man` command (don't preload all
+documentation; query on demand).
+
 The streamable-HTTP-only constraint (ADR-037) is a security position:
 alknet does not import the MCP stdio RCE vector. The streamable HTTP
 path is network-isolated, auth-gatable, and runs under alknet's
@@ -213,6 +264,7 @@ every other HTTP request.
 | Decision | ADR | Summary |
 |----------|-----|---------|
 | MCP stdio transport excluded | [ADR-037](../../decisions/037-mcp-stdio-transport-exclusion.md) | Streamable HTTP only; stdio is not built |
+| `to_mcp` tool-gateway pattern | [ADR-041](../../decisions/041-mcp-tool-gateway-pattern.md) | 4 fixed gateway tools (search/schema/call/batch), not one tool per operation; Subscription excluded |
 | `from_mcp` is an `OperationAdapter` | [ADR-017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | Async trait; produces `HandlerRegistration` bundles |
 | `to_mcp` is a projection | [ADR-017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | Consumes the registry, doesn't produce entries |
 | Adapter-registered ops are `Internal` | [ADR-015](../../decisions/015-privilege-model-and-authority-context.md) | `from_mcp` ops are composition material |
diff --git a/docs/architecture/crates/http/overview.md b/docs/architecture/crates/http/overview.md
index 6c19bf4..b94dd83 100644
--- a/docs/architecture/crates/http/overview.md
+++ b/docs/architecture/crates/http/overview.md
@@ -206,6 +206,7 @@ verified against this invariant. See ADR-014 and
 | HTTP/3 + WebTransport first-class | [ADR-038](../../decisions/038-http3-and-webtransport-as-first-class.md) | `h3` in scope, not deferred; browser streaming uses QUIC streams |
 | HTTP server + client host colocated | [ADR-039](../../decisions/039-http-server-and-client-host-colocated.md) | One crate for server + adapters (shared HTTP deps, shared mapping) |
 | WebTransport ALPN-stream-proxy | [ADR-040](../../decisions/040-webtransport-alpn-stream-proxy.md) | Browser → WebTransport stream → any ALPN handler (SSH, git, SFTP) via WASM parser |
+| `to_mcp` tool-gateway pattern | [ADR-041](../../decisions/041-mcp-tool-gateway-pattern.md) | 4 fixed gateway tools (search/schema/call/batch), not one tool per operation |
 | `alknet-call` is protocol-foundation | [ADR-003](../../decisions/003-crate-decomposition.md) Am. 1 | `alknet-http` depends on `alknet-call` (types, not peer handler) |
 | Bearer auth via `resolve_from_token` | [ADR-004](../../decisions/004-auth-as-shared-core.md) | HTTP handler credential source + resolution (settled) |
 | Stealth mode = HTTP handler on standard ALPNs | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Decoy for unknown paths (settled) |
diff --git a/docs/architecture/decisions/041-mcp-tool-gateway-pattern.md b/docs/architecture/decisions/041-mcp-tool-gateway-pattern.md
new file mode 100644
index 0000000..939936c
--- /dev/null
+++ b/docs/architecture/decisions/041-mcp-tool-gateway-pattern.md
@@ -0,0 +1,215 @@
+# ADR-041: MCP Tool-Gateway Pattern for to_mcp
+
+## Status
+
+Proposed
+
+## Context
+
+The current `to_mcp` spec (`crates/http/http-mcp.md`) describes
+`to_mcp` as "exposes the local registry's `External` operations as MCP
+tools" — one MCP tool per alknet operation. An LLM connecting to an
+alknet node with 200 registered operations gets 200 MCP tools dumped
+into its context. This is the **tool-bloat problem**: the LLM's context
+is bloated with tools that are irrelevant to the current task, degrading
+its reasoning and wasting context budget.
+
+### The problem in concrete terms
+
+The MCP `tools/list` response returns every tool the server exposes.
+An MCP client (an editor, an AI tool) loads all of them into the LLM's
+context as tool definitions. An alknet node exposing 200 operations
+produces a `tools/list` response with 200 `Tool` structs, each with a
+name, description, and `inputSchema` (JSON Schema). The LLM sees 200
+tool definitions whether it needs them or not. This is the same
+anti-pattern as loading every man page into a shell's environment —
+absurd, but it's what the naive one-tool-per-operation mapping produces.
+
+### The pattern that works
+
+The project already has two examples of a better pattern:
+
+1. **The `memory` tool** (opencode): read-only access to the underlying
+   session database. The LLM doesn't load all past sessions into
+   context — it calls `memory` with a search query when it needs to
+   recall something. One tool, access to a large dataset on demand.
+2. **The `worktree` tool** (opencode): gates 8-10 sub-tools behind a
+   single `worktree` entry point. The LLM has one tool in context; the
+   sub-tools are discovered and invoked through it.
+
+The general principle (same as Linux's `man` command): **don't load all
+documentation/tools into context 24/7; expose a small fixed set of
+meta-tools that gate access to the full set on demand.**
+
+### The call protocol's discovery surface
+
+The call protocol already has the discovery primitives that make this
+work:
+
+- `services/list` — lists registered operations (filtered by
+  `AccessControl`).
+- `services/schema` — returns an operation's `OperationSpec`
+  (input/output JSON Schemas, error schemas).
+
+The `to_mcp` gateway exposes these primitives (plus invocation) as a
+small fixed set of MCP tools. The LLM searches for what it needs, learns
+the schema, then calls — instead of having every operation pre-loaded.
+
+## Decision
+
+### 1. `to_mcp` exposes a fixed gateway tool set, not one tool per operation
+
+`to_mcp` exposes a small fixed set of MCP tools that gate access to the
+full operation registry. The LLM has a few tools in context (not
+hundreds); it discovers and invokes operations through the gateway.
+
+The gateway tool set (initial, two-way-door extensible):
+
+| MCP tool | Call protocol operation | Purpose |
+|----------|------------------------|---------|
+| `search` | `services/list` | List/search available operations (filtered by the caller's `AccessControl`). The LLM discovers what it can call. |
+| `schema` | `services/schema` | Get an operation's `OperationSpec` (input/output JSON Schemas, error schemas). The LLM learns how to call a specific operation. |
+| `call` | `call.requested` (Query/Mutation) | Invoke an operation by name with a JSON input. Returns the operation's output (or a typed error per ADR-023). |
+| `batch` | multiple `call.requested` | Invoke multiple operations in one tool call (correlated request IDs, OQ-14). The LLM batches independent calls. |
+
+Four tools. The LLM calls `search` to find operations relevant to its
+task, `schema` to learn the input shape, `call` to invoke. Same pattern
+as `man` — discover on demand, don't preload.
+
+### 2. `Subscription` operations are excluded from the MCP gateway
+
+MCP tool calls are request/response — an LLM invokes a tool and
+receives a result. The call protocol's `Subscription` type
+(streaming, many `call.responded` events) does not map onto the MCP
+tool-call model. The gateway exposes only `Query` and `Mutation`
+operations (request/response). `Subscription` operations are filtered
+out of `search` results and cannot be invoked via `call`.
+
+This is a deliberate scoping decision, not a deferral: MCP tool calls
+are request/response by protocol design; streaming subscriptions are a
+different interaction model that doesn't fit the LLM tool-call pattern.
+If a future MCP extension adds streaming tool calls, the gateway could
+expose `Subscription` operations through it — but that's a future MCP
+spec question, not an alknet decision.
+
+### 3. `search` returns names + descriptions, not full schemas
+
+The `search` tool (backed by `services/list`) returns operation names,
+namespaces, types, and short descriptions — not the full input/output
+JSON Schemas. This keeps the search result small (the LLM is choosing
+what to call, not how to call it yet). The LLM calls `schema` for the
+specific operation it wants to invoke, getting the full `OperationSpec`
+only when needed. Two-step discovery: search (cheap, list) → schema
+(targeted, full spec).
+
+### 4. `call` maps to the call protocol's request/response dispatch
+
+The `call` tool takes `{ operation: "/fs/readFile", input: { ... } }`
+and dispatches through the `OperationRegistry::invoke()` — the same
+dispatch path the HTTP server uses (ADR-036). The result is mapped to
+an MCP `CallToolResult` (`structuredContent` for the output, or
+`isError: true` for a `CallError` with the typed `details` payload per
+ADR-023). The `batch` tool takes an array of `{ operation, input }`
+pairs and returns an array of results.
+
+### 5. `AccessControl` gates the gateway
+
+The `search` tool's results are filtered by the caller's
+`AccessControl::check(identity)` — the LLM (authenticated by bearer
+token, ADR-034 §4) sees only the operations it is authorized to call.
+The `call` tool's dispatch runs the same `AccessControl` check. An
+LLM that calls `call` with an operation it isn't authorized for gets
+`FORBIDDEN` (mapped to an MCP error result). The gateway does not
+bypass the call protocol's authorization — it's the same dispatch
+path, just reached through an MCP tool call instead of an HTTP request.
+
+## Consequences
+
+**Positive:**
+- The LLM has 4 tools in context, not hundreds. Context budget is
+  preserved for the actual task; the LLM discovers operations on
+  demand through `search` + `schema`. This is the same pattern that
+  makes the `memory` and `worktree` tools effective.
+- The gateway maps onto the call protocol's existing discovery
+  primitives (`services/list`, `services/schema`) and dispatch
+  (`OperationRegistry::invoke`). No new call-protocol mechanisms
+  needed — `to_mcp` is a thin wrapper around the existing surface.
+- `AccessControl` gates the gateway. An LLM sees only what it's
+  authorized to call; the gateway doesn't leak operation existence or
+  schemas to unauthorized callers.
+- `Subscription` exclusion is explicit. The LLM tool-call model is
+  request/response; streaming doesn't fit, and pretending it does
+  would produce a broken mapping.
+
+**Negative:**
+- The LLM needs two discovery round-trips before it can call an
+  operation it hasn't seen before (`search` → `schema`, then `call`).
+  A one-tool-per-operation mapping would let it call directly. The
+  tradeoff: 4 tools in context plus 2 discovery round-trips vs. 200
+  tools in context and 0 round-trips. The context budget is the scarcer
+  resource; the round-trips are cheap (the MCP server is local or nearby).
+- The `search` tool's result format (names + descriptions, not full
+  schemas) means the LLM may need to call `schema` for multiple
+  operations before finding the right one. Mitigated: `search` can
+  accept a query/filter (namespace, keyword) to narrow results.
+- The gateway tool set is fixed (4 tools). An operation that wants a
+  custom MCP tool (e.g., a specialized `git_clone` tool with a curated
+  input schema, not the generic `call` wrapper) is not exposed through
+  the gateway. A future "custom tool" extension could allow operations
+  to declare an MCP tool projection — but the gateway pattern is the
+  default, and the custom-tool path is additive (not a replacement).
+
+## Assumptions
+
+1. **The LLM context budget is the scarcer resource.** The tradeoff
+   favoring 4 tools + discovery round-trips over 200 preloaded tools
+   assumes the LLM's context window is more valuable than the network
+   round-trips. This holds for current LLMs (context windows are
+   large but not unlimited; tool definitions consume context
+   proportionally to their schemas).
+
+2. **`Query` and `Mutation` cover the LLM tool-call use case.**
+   LLMs invoke tools in a request/response pattern: call a tool,
+   receive a result, reason about it. Streaming subscriptions
+   (`call.responded` events over time) don't fit this pattern — the
+   LLM expects one result per tool call. The assumption is that the
+   operations an LLM wants to call are `Query`/`Mutation`, not
+   `Subscription`.
+
+3. **The gateway tool set is stable.** Once LLM clients build
+   prompts/workflows against the `search`/`schema`/`call`/`batch`
+   tool set, changing the tool surface (renaming, removing) breaks
+   them. Adding tools is additive (non-breaking); removing or renaming
+   is a one-way door. The initial 4-tool set is the published contract.
+
+4. **`AccessControl` filtering is sufficient for `search`.** The LLM
+   sees the operations it's authorized to call. If an operation's
+   existence is itself sensitive (the LLM shouldn't know it exists
+   even if it can't call it), `Visibility::Internal` (ADR-015) is the
+   mechanism — Internal ops are excluded from `services/list` and
+   therefore from `search` results. The gateway does not add a
+   separate visibility layer.
+
+## References
+
+- [ADR-015](015-privilege-model-and-authority-context.md) —
+  External/Internal visibility (Internal ops excluded from
+  `services/list`, therefore from `search`)
+- [ADR-017](017-call-protocol-client-and-adapter-contract.md) —
+  `to_*` adapters are projections (consume the registry, don't
+  produce entries)
+- [ADR-023](023-operation-error-schemas.md) — typed error `details`
+  mapped to MCP error results
+- [ADR-034](034-outgoing-only-x509-and-three-peer-roles.md) §4 —
+  browsers/MCP clients are not alknet peers (bearer token, no
+  `PeerId`)
+- [ADR-036](036-http-to-call-operation-mapping.md) — the HTTP-to-call
+  dispatch path the `call` tool reuses
+- [ADR-037](037-mcp-stdio-transport-exclusion.md) — streamable HTTP
+  only (the transport `to_mcp` uses)
+- `crates/http/http-mcp.md` — the spec that implements the gateway
+- `/workspace/rust-sdk/crates/rmcp/src/model/tool.rs` — the MCP
+  `Tool` struct (name, description, input_schema, output_schema)
+- `/workspace/rust-sdk/crates/rmcp/src/handler/server.rs` —
+  `list_tools` / `call_tool` server trait (the interface `to_mcp`
+  implements)
\ No newline at end of file
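
Illustrative sketch (not part of the patch): the gateway rules in ADR-041 §1, §2 and §5 can be modelled in a few lines of std-only Rust. Everything here is hypothetical and simplified for illustration: `Gateway`, `OperationEntry`, `Caller`, the scope-string check standing in for `AccessControl`, and the echo result standing in for `OperationRegistry::invoke()`. None of it is the alknet or rmcp API, and treating `Internal` operations as `NotFound` on `call` is an assumption, not something the ADR states.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OpKind { Query, Mutation, Subscription }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Visibility { External, Internal }

/// Hypothetical registry entry; the real `OperationSpec` carries schemas too.
#[derive(Clone, Debug)]
pub struct OperationEntry {
    pub description: String,
    pub kind: OpKind,
    pub visibility: Visibility,
    /// Assumption: one required scope stands in for `AccessControl`.
    pub required_scope: String,
}

/// Hypothetical caller identity resolved from a bearer token.
pub struct Caller { pub scopes: Vec<String> }

#[derive(Debug, PartialEq, Eq)]
pub enum GatewayError { NotFound, Forbidden, SubscriptionExcluded }

/// The fixed tool set returned for `tools/list`, whatever the registry holds.
pub static GATEWAY_TOOLS: [&str; 4] = ["search", "schema", "call", "batch"];

#[derive(Default)]
pub struct Gateway { ops: BTreeMap<String, OperationEntry> }

impl Gateway {
    pub fn register(&mut self, name: &str, description: &str, kind: OpKind,
                    visibility: Visibility, required_scope: &str) {
        self.ops.insert(name.to_string(), OperationEntry {
            description: description.to_string(),
            kind,
            visibility,
            required_scope: required_scope.to_string(),
        });
    }

    pub fn list_tools(&self) -> &'static [&'static str] { &GATEWAY_TOOLS }

    fn authorized(caller: &Caller, entry: &OperationEntry) -> bool {
        caller.scopes.iter().any(|s| s == &entry.required_scope)
    }

    /// `search`: names + descriptions only; hides Internal ops,
    /// Subscriptions, and anything the caller is not authorized for.
    pub fn search(&self, caller: &Caller, query: &str) -> Vec<(String, String)> {
        self.ops.iter()
            .filter(|(_, e)| e.visibility == Visibility::External)
            .filter(|(_, e)| e.kind != OpKind::Subscription)
            .filter(|(_, e)| Self::authorized(caller, e))
            .filter(|(name, e)| name.contains(query) || e.description.contains(query))
            .map(|(name, e)| (name.clone(), e.description.clone()))
            .collect()
    }

    /// `call`: same authorization check as `search`; rejects Subscriptions.
    /// The returned string is a placeholder for the real dispatch result.
    pub fn call(&self, caller: &Caller, operation: &str, input: &str)
        -> Result<String, GatewayError> {
        let entry = self.ops.get(operation)
            .filter(|e| e.visibility == Visibility::External)
            .ok_or(GatewayError::NotFound)?;
        if !Self::authorized(caller, entry) {
            return Err(GatewayError::Forbidden);
        }
        if entry.kind == OpKind::Subscription {
            return Err(GatewayError::SubscriptionExcluded);
        }
        Ok(format!("{operation}({input})"))
    }

    /// `batch`: one result per requested call, in request order.
    pub fn batch(&self, caller: &Caller, calls: &[(&str, &str)])
        -> Vec<Result<String, GatewayError>> {
        calls.iter().map(|(op, input)| self.call(caller, op, input)).collect()
    }
}
```

In this sketch the authorization check runs before the `Subscription` check, so an unauthorized caller sees `Forbidden` rather than learning an operation's kind; whether the real gateway orders the checks this way is not specified in the ADR.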