From e855c8c7eb5ca3625036134d6dd43c87327edd06 Mon Sep 17 00:00:00 2001 From: "glm-5.2" Date: Wed, 1 Jul 2026 07:11:17 +0000 Subject: [PATCH] docs(http): decompose alknet-http spec into 19 implementation tasks MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Break the alknet-http architecture spec into atomic, dependency-ordered tasks in tasks/http/, following the taskgraph frontmatter conventions used by the call/core/vault crates. Tasks span 8 phases (Phase 0 through Phase 7) across 5 module subdirectories (server/, gateway/, client/, adapters/, websocket/): - Phase 0: crate-init (foundation) - Phase 1: gateway-dispatch-spine, error-mapping, shared-http-client (shared infrastructure) - Phase 2: http-adapter, bearer-auth-middleware, gateway-endpoints, healthz-decoy (HTTP server surface) - Phase 3: to-openapi (OpenAPI gateway projection) - Phase 4: from-openapi (OpenAPI adapter, reqwest forwarding) - Phase 5: dispatcher-transport-abstraction, upgrade-handler, connection-overlay (WebSocket browser bidirectional path) - Phase 6: from-mcp, to-mcp (MCP adapters, feature-gated) - Phase 7: review-http, review-websocket, review-mcp, review-http-final (quality checkpoints) The gateway-dispatch-spine task implements the thin shared core recommended by the gateway-factoring research (concrete struct, not a trait). The dispatcher-transport-abstraction task is a cross-crate change to alknet-call (exposes EventEnvelope-level dispatch API for non-QUIC transports) — the highest-risk task. WebTransport/h3 is deferred per ADR-044 and has no tasks; from_wss is out of scope. Validated: 19 tasks, no cycles, 8 parallel generations, critical path length 8 (through the WebSocket strand). 
--- tasks/http/adapters/from-mcp.md | 226 ++++++++++++++++ tasks/http/adapters/from-openapi.md | 242 ++++++++++++++++++ tasks/http/adapters/to-mcp.md | 208 +++++++++++++++ tasks/http/adapters/to-openapi.md | 188 ++++++++++++++ tasks/http/client/shared-http-client.md | 173 +++++++++++++ tasks/http/crate-init.md | 146 +++++++++++ tasks/http/gateway/error-mapping.md | 139 ++++++++++ tasks/http/gateway/gateway-dispatch-spine.md | 183 +++++++++++++ tasks/http/review-http-final.md | 179 +++++++++++++ tasks/http/review-http.md | 166 ++++++++++++ tasks/http/review-mcp.md | 161 ++++++++++++ tasks/http/review-websocket.md | 154 +++++++++++ tasks/http/server/bearer-auth-middleware.md | 179 +++++++++++++ tasks/http/server/gateway-endpoints.md | 194 ++++++++++++++ tasks/http/server/healthz-decoy.md | 146 +++++++++++ tasks/http/server/http-adapter.md | 217 ++++++++++++++++ tasks/http/websocket/connection-overlay.md | 182 +++++++++++++ .../dispatcher-transport-abstraction.md | 180 +++++++++++++ tasks/http/websocket/upgrade-handler.md | 230 +++++++++++++++++ 19 files changed, 3493 insertions(+) create mode 100644 tasks/http/adapters/from-mcp.md create mode 100644 tasks/http/adapters/from-openapi.md create mode 100644 tasks/http/adapters/to-mcp.md create mode 100644 tasks/http/adapters/to-openapi.md create mode 100644 tasks/http/client/shared-http-client.md create mode 100644 tasks/http/crate-init.md create mode 100644 tasks/http/gateway/error-mapping.md create mode 100644 tasks/http/gateway/gateway-dispatch-spine.md create mode 100644 tasks/http/review-http-final.md create mode 100644 tasks/http/review-http.md create mode 100644 tasks/http/review-mcp.md create mode 100644 tasks/http/review-websocket.md create mode 100644 tasks/http/server/bearer-auth-middleware.md create mode 100644 tasks/http/server/gateway-endpoints.md create mode 100644 tasks/http/server/healthz-decoy.md create mode 100644 tasks/http/server/http-adapter.md create mode 100644 
tasks/http/websocket/connection-overlay.md create mode 100644 tasks/http/websocket/dispatcher-transport-abstraction.md create mode 100644 tasks/http/websocket/upgrade-handler.md diff --git a/tasks/http/adapters/from-mcp.md b/tasks/http/adapters/from-mcp.md new file mode 100644 index 0000000..644cdc2 --- /dev/null +++ b/tasks/http/adapters/from-mcp.md @@ -0,0 +1,226 @@ +--- +id: http/adapters/from-mcp +name: Implement from_mcp adapter (rmcp streamable HTTP client, tools/list discovery, structuredContent handling) +status: pending +depends_on: [http/client/shared-http-client, http/gateway/error-mapping] +scope: broad +risk: medium +impact: component +level: implementation +--- + +## Description + +Implement `from_mcp` in `src/adapters/from_mcp.rs` (feature-gated behind +`mcp`). This is the MCP-direction adapter: it discovers remote MCP tools +via the MCP `tools/list` call over streamable HTTP, and registers each +as a `HandlerRegistration` bundle with a forwarding handler that calls +the remote tool via `tools/call`. Uses rmcp's +`StreamableHttpClientTransport` (reqwest-based). Implements +`OperationAdapter` (ADR-017 §5). + +### Streamable HTTP only (ADR-037) + +MCP defines two transports: streamable HTTP and stdio. **alknet-http +supports only streamable HTTP.** Stdio is not built — it is the spawn- +arbitrary-executable RCE vector that the rest of the architecture is +designed to avoid (ADR-037). The `mcp` feature gate pulls in rmcp with +the streamable HTTP transport features only; the stdio transport +(`transport-child-process`) is not a dependency, not optional, not +behind a separate feature. + +### The adapter (http-mcp.md §"from_mcp") + +```rust +pub struct FromMCP { + /// The MCP server's streamable HTTP endpoint URL. + endpoint: String, + /// Bearer token for the MCP server (from Capabilities at registration). + auth_token: Option, + /// The importing deployment's name for this MCP server (becomes the + /// operation namespace). 
+ namespace: String, +} + +#[async_trait] +impl OperationAdapter for FromMCP { + async fn import(&self) -> Result<Vec<HandlerRegistration>, AdapterError>; +} +``` + +The adapter: + +1. Connects to the MCP server's streamable HTTP endpoint using rmcp's + `StreamableHttpClientTransport::from_uri(endpoint)` (the rmcp + `streamable_http.rs` client example shows the pattern: `client_info + .serve(transport).await`, then `client.list_tools()`, + `client.call_tool()`). On connection failure, returns + `AdapterError::DiscoveryFailed`; on 401, `AdapterError::Unauthorized`. + +2. Calls `tools/list` → the list of MCP tools (name, description, + `inputSchema`, optional `outputSchema`). + +3. For each tool, constructs a `HandlerRegistration`: + - `spec.name` = the tool name (or `namespace/tool_name` if a + namespace prefix is configured — same local-naming sugar as + `from_call`'s `FromCallConfig::namespace_prefix`, ADR-029 §5). + - `spec.namespace` = the configured `namespace`. + - `spec.op_type` = `Mutation` (MCP tools are call/response; the MCP + spec doesn't have a native streaming/tool-subscription distinction + — `tools/call` returns a result. If MCP adds a streaming-tool + extension, a `Subscription` mapping would be added.) + - `spec.visibility` = `Internal` (adapter-registered, ADR-015). + - `spec.input_schema` = the tool's `inputSchema` (JSON Schema). + - `spec.output_schema` = depends on whether the tool declares + `outputSchema` (MCP 2025-06-18+): + - **`outputSchema` present** → `output_schema` = the declared + schema (converted from JSON Schema). The result arrives in + `CallToolResult.structured_content` and is composable with + local operations. + - **`outputSchema` absent** (older MCP servers) → `output_schema` + = the MCP `ContentBlock` union (`text | image | audio | resource + | resource_link` — a well-defined MCP type, *not* + `Type.Unknown()`). The result arrives in + `CallToolResult.content` as a `Vec<ContentBlock>`. See "Output + handling" below. 
+ - `spec.error_schemas` = the MCP tool's error description mapped to + `ErrorDefinition` (ADR-023 — MCP tool definitions carry error + descriptions; the adapter maps them). + - `spec.access_control` = `AccessControl::default()`. + - `handler` = a forwarding handler (see Forwarding Handler below). + - `provenance` = `FromMCP`, `composition_authority: None`, + `scoped_env: None` (leaf — ADR-022). + - `capabilities` = the bearer token for the MCP server (injected by + the assembly layer at registration — see No-Env-Vars below). + +4. Returns the bundles. The caller (the assembly layer) registers them + in the `OperationRegistry`. + +### Forwarding handler (http-mcp.md §"Forwarding handler") + +At call time, the `from_mcp` forwarding handler: + +1. Reads the call input (`serde_json::Value` — the tool arguments). +2. Calls `client.call_tool({ name: tool_name, arguments: input })` via + the rmcp client (the `streamable_http.rs` example shows + `client.call_tool(CallToolRequestParams::new(name).with_arguments(...))`). +3. On success: extracts the result from the `CallToolResult`, following + the `structuredContent`-preferred-over-content-blocks rule (see + "Output handling" below), wraps in a `ResponseEnvelope`, returns. +4. On `result.isError`: maps to a `CallError` with the MCP error content + (the TS `from_mcp.ts` handler shows the error mapping), returns. +5. The rmcp client connection is maintained for the lifetime of the + registration (the MCP server is a persistent streamable HTTP + endpoint, not a per-call connection). + +The handler is opaque to the `CallAdapter` — `Arc` the +registry dispatches. `alknet-call` never sees rmcp. + +### Output handling: structuredContent vs content blocks (http-mcp.md §"Output handling") + +MCP `CallToolResult` (rmcp `model.rs`) carries two result fields: +`content: Vec<ContentBlock>` (always present, defaults to `[]`) and +`structured_content: Option<Value>` (present when the tool declared +`outputSchema`). 
The `from_mcp` handler follows the same rule the TS +adapter (`@alkdev/operations/src/from_mcp.ts`) and the rmcp SDK +(`CallToolResult::into_typed`) use: + +- **`structured_content` present** (tool declared `outputSchema`): the + handler uses `structured_content` as the result, validated/cast + against the declared `output_schema`. This is the composable case — + the data matches the declared type, so a composing handler can use it + as a typed value. +- **`structured_content` absent** (older server, no `outputSchema`): + the handler maps `content: Vec<ContentBlock>` to the + `ContentBlock`-union `output_schema` (text/image/audio/resource/ + resource_link). The TS `mapMCPContentBlocks` shows the mapping; the + Rust `ContentBlock` enum (`rmcp/src/model/content.rs`) is the same + shape. The common sub-case is a single `Text` block — older servers + often JSON-stringify structured data into the `text` field. The + adapter does *not* attempt to `JSON.parse` the text heuristically + (fragile, not the adapter's concern); it carries the `ContentBlock` + union as the typed result. A consumer that knows the text is JSON can + parse it downstream. + +The `isError: true` case is handled separately (step 4 above) — it +maps to a `CallError`, not to the output handling path. + +### No-Env-Vars (http-mcp.md §"No-Env-Vars") + +The `from_mcp` forwarding handler reads the MCP server's bearer token +from `context.capabilities` (the same injection path as `from_openapi`), +not from `std::env::var`. The assembly layer injects the token at +registration; the handler reads it per-call. This is the no-env-vars +invariant (ADR-014). 
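
A minimal sketch of the no-env-vars rule described above, using only the standard library. `CapabilitiesSketch`, `bearer_header_from_capabilities`, and the `"mcp-server"` key are hypothetical names chosen for illustration; the real `Capabilities` type in `alknet-call` may expose a different API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `Capabilities` type. Assumption:
/// a string-keyed map of credentials populated at registration time.
#[derive(Default)]
pub struct CapabilitiesSketch {
    entries: HashMap<String, String>,
}

impl CapabilitiesSketch {
    /// Called by the assembly layer when the handler is registered.
    pub fn insert(&mut self, key: &str, secret: &str) {
        self.entries.insert(key.to_string(), secret.to_string());
    }

    /// Called by the forwarding handler on every call.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).map(String::as_str)
    }
}

/// Builds the outbound `Authorization` header from capabilities only.
/// Returns `None` when no credential was injected; there is no fallback
/// to `std::env::var`.
pub fn bearer_header_from_capabilities(
    caps: &CapabilitiesSketch,
    key: &str,
) -> Option<(&'static str, String)> {
    caps.get(key)
        .map(|token| ("Authorization", format!("Bearer {}", token)))
}
```

A missing credential surfaces as `None` here, so the caller can turn it into an error instead of silently reading the process environment.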
+ +## Acceptance Criteria + +- [ ] `FromMCP` struct with `endpoint`, `auth_token`, `namespace` +- [ ] `OperationAdapter` impl: `async fn import(&self) -> Result<Vec<HandlerRegistration>, AdapterError>` +- [ ] Connects via rmcp `StreamableHttpClientTransport::from_uri(endpoint)` +- [ ] Connection failure → `AdapterError::DiscoveryFailed` +- [ ] 401 → `AdapterError::Unauthorized` +- [ ] Calls `tools/list` → MCP tools (name, description, inputSchema, outputSchema) +- [ ] For each tool: constructs `HandlerRegistration` +- [ ] `spec.name` = tool name (or `namespace/tool_name` with prefix) +- [ ] `spec.namespace` = configured `namespace` +- [ ] `spec.op_type` = `Mutation` (MCP tools are call/response) +- [ ] `spec.visibility` = `Internal` (ADR-015) +- [ ] `spec.input_schema` = tool's `inputSchema` +- [ ] `spec.output_schema` = declared `outputSchema` if present, else `ContentBlock` union +- [ ] `spec.error_schemas` from MCP tool error descriptions (ADR-023) +- [ ] `spec.access_control` = `AccessControl::default()` +- [ ] `provenance` = `FromMCP`, `composition_authority: None`, `scoped_env: None` (ADR-022) +- [ ] `capabilities` = bearer token for MCP server (injected at registration) +- [ ] Forwarding handler calls `client.call_tool({ name, arguments })` +- [ ] `structured_content` present → use as result (validated against `output_schema`) +- [ ] `structured_content` absent → map `content: Vec<ContentBlock>` to `ContentBlock` union +- [ ] No heuristic `JSON.parse` of text blocks (carry as `ContentBlock`) +- [ ] `isError: true` → `CallError` with MCP error content +- [ ] rmcp client connection maintained for registration lifetime +- [ ] No-env-vars: handler reads `context.capabilities`, never `std::env::var` (ADR-014) +- [ ] Feature-gated behind `mcp` (no compile without `mcp` feature) +- [ ] stdio transport NOT built (ADR-037 — streamable HTTP only) +- [ ] Unit test: `import()` with mock MCP server → `HandlerRegistration` bundles +- [ ] Unit test: `outputSchema` present → `output_schema` = declared schema +- 
[ ] Unit test: `outputSchema` absent → `output_schema` = `ContentBlock` union +- [ ] Unit test: `structured_content` present → used as result +- [ ] Unit test: `structured_content` absent → `content` blocks mapped to union +- [ ] Unit test: `isError: true` → `CallError` +- [ ] Integration test: forwarding handler calls remote MCP tool via rmcp +- [ ] Integration test: no `std::env::var` reads in the forwarding handler +- [ ] `cargo test -p alknet-http --features mcp` succeeds +- [ ] `cargo clippy -p alknet-http --features mcp --all-targets` succeeds with no warnings +- [ ] `cargo check -p alknet-http` (no `mcp` feature) succeeds — from_mcp not compiled + +## References + +- docs/architecture/crates/http/http-mcp.md — from_mcp (full spec) +- docs/architecture/decisions/037-mcp-stdio-transport-exclusion.md — ADR-037 (streamable HTTP only) +- docs/architecture/decisions/017-call-protocol-client-and-adapter-contract.md — ADR-017 §5 (OperationAdapter) +- docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (Internal) +- docs/architecture/decisions/022-handler-registration-provenance-and-composition-authority.md — ADR-022 (leaf) +- docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (error fidelity) +- docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 (no env vars) +- /workspace/rust-sdk/crates/rmcp/src/model.rs — CallToolResult (content + structured_content) +- /workspace/rust-sdk/crates/rmcp/src/model/content.rs — ContentBlock enum +- /workspace/rust-sdk/examples/clients/src/streamable_http.rs — streamable HTTP MCP client pattern +- /workspace/@alkdev/operations/src/from_mcp.ts — TypeScript prior art (mapMCPContentBlocks, structuredContent logic) + +## Notes + +> from_mcp is feature-gated behind mcp (rmcp dependency). Streamable HTTP +> only — stdio is NOT built (ADR-037). 
The output handling follows the +> structuredContent-preferred-over-content-blocks rule (same as the TS +> adapter and rmcp's into_typed). The adapter does NOT heuristically +> JSON.parse text blocks — it carries the ContentBlock union as the +> typed result; a downstream consumer that knows the text is JSON can +> parse it. The no-env-vars invariant applies (handler reads +> context.capabilities, not std::env::var). The rmcp client connection +> is maintained for the registration lifetime (persistent streamable HTTP +> endpoint, not per-call). The handler is opaque to CallAdapter +> (Arc); alknet-call never sees rmcp. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/adapters/from-openapi.md b/tasks/http/adapters/from-openapi.md new file mode 100644 index 0000000..fd3d896 --- /dev/null +++ b/tasks/http/adapters/from-openapi.md @@ -0,0 +1,242 @@ +--- +id: http/adapters/from-openapi +name: Implement from_openapi adapter (parse OpenAPI, reqwest forwarding handlers, no-env-vars injection) +status: pending +depends_on: [http/client/shared-http-client, http/gateway/error-mapping] +scope: broad +risk: medium +impact: component +level: implementation +--- + +## Description + +Implement `from_openapi` in `src/adapters/from_openapi.rs`. This is the +OpenAPI-direction adapter: it parses an OpenAPI document, constructs a +`HandlerRegistration` bundle per OpenAPI operation with a forwarding +handler that calls the external HTTP endpoint via `reqwest`, and returns +the bundles for registration in the `OperationRegistry`. The adapter +implements `OperationAdapter` (the async trait from `alknet-call`, +ADR-017 §5). 
+ +### The adapter (http-adapters.md §"from_openapi") + +```rust +pub struct FromOpenAPI { + spec: OpenAPISpec, + config: HttpServiceConfig, +} + +#[async_trait] +impl OperationAdapter for FromOpenAPI { + async fn import(&self) -> Result<Vec<HandlerRegistration>, AdapterError>; +} +``` + +### Type definitions (http-adapters.md §"Type definitions") + +```rust +/// A parsed OpenAPI document. The concrete type is a two-way-door +/// implementation detail (openapiv3::OpenApi, a local alknet-http type, +/// or a serde_json::Value-based parse); the one-way constraint is that +/// from_openapi accepts a standard OpenAPI 3.x JSON/YAML doc and to_openapi +/// produces one. Both directions share the same Rust type, but not the +/// same document shape. Coordinate with the to-openapi task on the type. +pub struct OpenAPISpec { + pub info: OpenAPIInfo, + pub paths: BTreeMap, + pub components: Option, + // ... OpenAPI 3.x fields as needed +} + +/// Configuration for an HTTP-backed adapter (from_openapi). Carries the +/// base URL, auth scheme (from Capabilities at registration, not env vars), +/// and optional headers. The auth field is the scheme the external API +/// expects; the credential itself is read from +/// OperationContext.capabilities at call time, not stored here. +pub struct HttpServiceConfig { + pub namespace: String, + pub base_url: String, + pub auth: Option<HttpAuthScheme>, + pub default_headers: HashMap, +} + +pub enum HttpAuthScheme { + Bearer, // Authorization: Bearer + ApiKey { header_name: String }, // e.g., X-API-Key: + Basic, // Authorization: Basic +} +``` + +### The import flow (http-adapters.md §"from_openapi") + +The adapter: + +1. Parses the OpenAPI document (`OpenAPISpec` — `paths`, `components`, + `$ref` resolution). On parse failure, returns + `AdapterError::SchemaParse`. 
The TS prior art + (`@alkdev/operations/src/from_openapi.ts`) shows the parsing patterns: + `resolveRef` for `$ref`, `resolveRefsRecursive` for nested refs, + `buildInputSchema` (parameters + request body → input JSON Schema), + `buildOutputSchema` (200/201 response → output JSON Schema), + `detectOperationType` (SSE response → `Subscription`, GET → `Query`, + else `Mutation`). + +2. For each `(path, method, operation)` in `spec.paths`, constructs a + `HandlerRegistration`: + - `spec.name` = the `operationId` (or a generated + `${method}_${path_parts}` name if `operationId` is absent — same + normalization as the TS `normalizeOperationId`). + - `spec.namespace` = the `config.namespace` (the importing + deployment's name for the service, not the OpenAPI doc's `info.title`). + - `spec.op_type` = `Query` / `Mutation` / `Subscription` (detected + from the method + response content type, same as TS). + - `spec.visibility` = `Internal` (adapter-registered ops are + composition material, not directly callable from the wire — ADR-015). + - `spec.input_schema` / `output_schema` = the JSON Schemas built + from the OpenAPI parameters/responses. + - `spec.error_schemas` = the `ErrorDefinition`s built from the + non-2xx OpenAPI responses (ADR-023 §5 — see Error Fidelity below). + - `spec.access_control` = `AccessControl::default()` (the adapter + doesn't declare scopes; the composing handler that reaches the + imported op gates access). + - `handler` = a forwarding handler (see Forwarding Handler below). + - `provenance` = `FromOpenAPI`, `composition_authority: None`, + `scoped_env: None` (leaf — ADR-022). + - `capabilities` = the credentials the forwarding handler needs (the + bearer token / API key for the external HTTP endpoint, injected by + the assembly layer at registration — see No-Env-Vars below). + +3. Returns the bundles. The caller (the assembly layer) registers them + in the `OperationRegistry`. 
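
The operation-type detection and fallback naming from the import flow above can be sketched as follows. This is an illustrative sketch, not the adapter's actual code: `OpType`, `detect_op_type`, and `fallback_operation_name` are hypothetical names, and the exact rules of the TS `normalizeOperationId` are not reproduced here. The separator and brace handling below are assumptions.

```rust
/// Hypothetical stand-in for the operation type carried in `spec.op_type`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpType {
    Query,
    Mutation,
    Subscription,
}

/// SSE response -> Subscription, GET -> Query, anything else -> Mutation.
/// `response_content_types` holds the content types declared for the
/// operation's success response.
pub fn detect_op_type(method: &str, response_content_types: &[&str]) -> OpType {
    let is_sse = response_content_types
        .iter()
        .any(|ct| ct.trim().to_ascii_lowercase().starts_with("text/event-stream"));
    if is_sse {
        OpType::Subscription
    } else if method.eq_ignore_ascii_case("GET") {
        OpType::Query
    } else {
        OpType::Mutation
    }
}

/// Generates a `method_path_parts` name for an operation that has no
/// `operationId`. Assumed scheme: lowercase method, path segments joined
/// with `_`, braces stripped from path parameters.
pub fn fallback_operation_name(method: &str, path: &str) -> String {
    let mut name = method.to_ascii_lowercase();
    for segment in path.split('/').filter(|s| !s.is_empty()) {
        name.push('_');
        name.push_str(segment.trim_matches(|c| c == '{' || c == '}'));
    }
    name
}
```

Checking the SSE content type before the method matters: a `GET` that returns `text/event-stream` should register as a `Subscription`, not a `Query`.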
+ +### Forwarding handler (http-adapters.md §"Forwarding handler") + +The forwarding handler is the `Arc` stored in the +`HandlerRegistration`. At call time, it: + +1. Reads the call input (`serde_json::Value`). +2. Builds the outbound HTTP request: + - URL path: substitutes path parameters (`{id}` → input value), + appends query parameters from input fields not in the path. + - Method: the OpenAPI operation's method. + - Headers: `Content-Type: application/json` + the auth header built + from `context.capabilities` (see No-Env-Vars below). + - Body: the `body` field of the input (for `Mutation`/`Subscription`). +3. Sends the request via the shared HTTP client (`SharedHttpClient` — + the `shared-http-client` task). +4. For a `Query`/`Mutation`: parses the response body (JSON, text, or + binary — same content-type branching as the TS `createHTTPOperation`), + wraps it in a `ResponseEnvelope`, returns. +5. For a `Subscription` (`text/event-stream` response): streams + `call.responded` events as the SSE chunks arrive (same SSE parsing as + the TS `parseSSEFrames`), then `call.completed` on stream end. +6. On HTTP error (non-2xx): maps to the declared `ErrorDefinition` by + HTTP status code (see Error Fidelity below), returns a `CallError`. + +The handler is opaque to the `CallAdapter` — it's an `Arc` +the registry dispatches. `alknet-call` never sees `reqwest`. + +### No-Env-Vars credential injection (http-adapters.md §"No-Env-Vars credential injection") + +The forwarding handler is the **credential injection point** for the +no-env-vars architecture. The handler reads +`context.capabilities.get("")` (e.g., `"openai"`, `"vastai"`, +`"github"`), extracts the credential, and injects it into the outbound +HTTP request: + +- Bearer token → `Authorization: Bearer `. 
+- API key → the header the OpenAPI spec declares (e.g., `X-API-Key: + `, or `Authorization: ApiKey ` — the `HTTPServiceConfig.auth` + in the TS prior art shows the three auth types: `bearer`, `apiKey`, + `basic`). +- Basic auth → `Authorization: Basic `. + +The credential comes from `Capabilities`, which was populated by the +dispatch path from the `HandlerRegistration.capabilities` bundle +(ADR-022 §6), which was populated by the assembly layer from the vault +(ADR-014). The handler never reads `std::env::var`. This is the +spec-level invariant: no handler reads outbound credentials from any +source other than `OperationContext.capabilities`. + +### Error Fidelity (ADR-023, http-adapters.md §"Error Fidelity") + +`from_openapi` maps OpenAPI non-2xx response status codes to +`ErrorDefinition`s (ADR-023 §5). The normative rule (review #002 W20): +`from_openapi` must not produce error codes that collide with the five +protocol-level codes (`NOT_FOUND`, `FORBIDDEN`, `INVALID_INPUT`, +`INTERNAL`, `TIMEOUT`). 
The adapter prefixes imported error codes with +`HTTP_` and the status number: + +```rust +// OpenAPI: 404: { schema: NotFoundError } +// → ErrorDefinition { code: "HTTP_404", http_status: Some(404), schema: NotFoundError } +``` + +## Acceptance Criteria + +- [ ] `FromOpenAPI` struct with `spec: OpenAPISpec`, `config: HttpServiceConfig` +- [ ] `OperationAdapter` impl: `async fn import(&self) -> Result<Vec<HandlerRegistration>, AdapterError>` +- [ ] Parses OpenAPI doc (`paths`, `components`, `$ref` resolution) +- [ ] Parse failure → `AdapterError::SchemaParse` +- [ ] For each `(path, method, operation)`: constructs `HandlerRegistration` +- [ ] `spec.name` = `operationId` (or generated `${method}_${path_parts}`) +- [ ] `spec.namespace` = `config.namespace` +- [ ] `spec.op_type` = `Query`/`Mutation`/`Subscription` (detected from method + response content type) +- [ ] `spec.visibility` = `Internal` (ADR-015) +- [ ] `spec.input_schema` / `output_schema` from OpenAPI parameters/responses +- [ ] `spec.error_schemas` from non-2xx OpenAPI responses with `HTTP_` prefix (ADR-023) +- [ ] `spec.access_control` = `AccessControl::default()` +- [ ] `provenance` = `FromOpenAPI`, `composition_authority: None`, `scoped_env: None` (ADR-022) +- [ ] `capabilities` = credentials for the external endpoint (injected at registration) +- [ ] Forwarding handler builds outbound HTTP request (path params, query, headers, body) +- [ ] Forwarding handler sends via `SharedHttpClient` (the shared client) +- [ ] `Query`/`Mutation`: parses response body (JSON/text/binary), wraps in `ResponseEnvelope` +- [ ] `Subscription` (`text/event-stream`): streams `call.responded` from SSE chunks, then `call.completed` +- [ ] HTTP error (non-2xx): maps to declared `ErrorDefinition` by status code, returns `CallError` +- [ ] No-env-vars: handler reads `context.capabilities.get("")`, never `std::env::var` (ADR-014) +- [ ] Bearer/ApiKey/Basic auth injection from `Capabilities` +- [ ] `HttpAuthScheme` enum with `Bearer`, `ApiKey { 
header_name }`, `Basic` +- [ ] `HttpServiceConfig` with `namespace`, `base_url`, `auth`, `default_headers` +- [ ] Unit test: parse a minimal OpenAPI doc → one `HandlerRegistration` +- [ ] Unit test: parse failure → `AdapterError::SchemaParse` +- [ ] Unit test: `operationId` absent → generated name `${method}_${path_parts}` +- [ ] Unit test: GET → `Query`, POST → `Mutation`, SSE response → `Subscription` +- [ ] Unit test: error response 404 → `ErrorDefinition { code: "HTTP_404", http_status: Some(404) }` +- [ ] Unit test: forwarding handler injects Bearer token from `context.capabilities` +- [ ] Integration test: forwarding handler calls external endpoint via `SharedHttpClient` +- [ ] Integration test: SSE response streams `call.responded` events +- [ ] Integration test: no `std::env::var` reads in the forwarding handler +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/http-adapters.md — from_openapi (full spec) +- docs/architecture/decisions/017-call-protocol-client-and-adapter-contract.md — ADR-017 §5 (OperationAdapter trait) +- docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (Internal visibility) +- docs/architecture/decisions/022-handler-registration-provenance-and-composition-authority.md — ADR-022 (leaf provenance) +- docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (HTTP_ prefix, error fidelity) +- docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 (no env vars) +- /workspace/@alkdev/operations/src/from_openapi.ts — TypeScript prior art (parsing, SSE, auth headers, createHTTPOperation, parseSSEFrames) + +## Notes + +> from_openapi is the no-env-vars credential injection point. The +> forwarding handler reads context.capabilities, not std::env::var — +> this is the spec-level invariant (ADR-014). 
The handler is opaque to +> CallAdapter (Arc); alknet-call never sees reqwest. The +> error codes are prefixed HTTP_ to avoid collision with +> protocol-level codes (ADR-023, review #002 W20). The OpenAPISpec type +> is shared with to_openapi (coordinate on the type); the shape is not +> (from_openapi consumes per-operation-paths, to_openapi produces the +> 5-endpoint gateway doc). The TS prior art (@alkdev/operations/src/ +> from_openapi.ts) shows the parsing patterns (resolveRef, +> buildInputSchema, buildOutputSchema, detectOperationType, +> createHTTPOperation, parseSSEFrames) — the SSE normalization patterns +> stay referenced, the client construction anti-patterns (env-var +> config, hand-rolled retry) are discarded. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/adapters/to-mcp.md b/tasks/http/adapters/to-mcp.md new file mode 100644 index 0000000..12e01af --- /dev/null +++ b/tasks/http/adapters/to-mcp.md @@ -0,0 +1,208 @@ +--- +id: http/adapters/to-mcp +name: Implement to_mcp gateway projection (4-tool gateway, rmcp StreamableHttpService, ADR-041) +status: pending +depends_on: [http/gateway/gateway-dispatch-spine, http/server/bearer-auth-middleware] +scope: broad +risk: medium +impact: component +level: implementation +--- + +## Description + +Implement `to_mcp` in `src/adapters/to_mcp.rs` (feature-gated behind +`mcp`). This is the MCP-direction gateway projection: it exposes the +local registry's `External` operations as a **fixed gateway tool set** +over streamable HTTP — not one MCP tool per operation. This is the +tool-gateway pattern (ADR-041): the LLM has a few tools in context +(search, schema, call, batch), not hundreds, and discovers operations +on demand through the gateway. + +### Pure projection (ADR-017 §5) + +`to_mcp` is a pure projection — it consumes the registry and does not +produce entries for it. It is not an `OperationAdapter`. 
An external MCP +client (an editor, an AI tool) discovers and calls alknet operations +through the MCP protocol. + +### The gateway tool set (http-mcp.md §"The gateway tool set") + +`to_mcp` exposes four MCP tools that gate access to the full operation +registry: + +| MCP tool | Call protocol operation | Purpose | +|----------|------------------------|---------| +| `search` | `services/list` | List/search available operations (filtered by the caller's `AccessControl`). Returns names + descriptions, not full schemas. | +| `schema` | `services/schema` | Get an operation's full `OperationSpec` (input/output JSON Schemas, error schemas). | +| `call` | `call.requested` (Query/Mutation) | Invoke an operation by name with a JSON input. Returns the output or a typed error (ADR-023). | +| `batch` | multiple `call.requested` | Invoke multiple operations in one tool call (correlated request IDs, OQ-14). | + +The LLM calls `search` to discover operations, `schema` to learn an +operation's input shape, `call` to invoke. Same pattern as +`man ` — discover on demand, don't preload. See ADR-041 for +the rationale (the tool-bloat problem). + +### `Subscription` exclusion (http-mcp.md §"Subscription exclusion") + +The gateway exposes only `Query` and `Mutation` operations +(request/response). `Subscription` operations (streaming) are filtered +out of `search` results and cannot be invoked via `call` — MCP tool +calls are request/response by protocol design; streaming subscriptions +don't fit the LLM tool-call pattern. See ADR-041 §2. + +### `to_mcp` service behavior (http-mcp.md §"to_mcp service behavior") + +1. On MCP `tools/list`: returns the fixed gateway tool set (4 tools: + `search`, `schema`, `call`, `batch`), not the registry's + operations. The gateway tools have stable names and schemas; the + registry's operations are discovered through `search`. + +2. 
On MCP `tools/call`: + - `search` → dispatches `services/list` (filtered by the caller's + `AccessControl`), returns operation names + descriptions. + - `schema` → dispatches `services/schema`, returns the + `OperationSpec`. + - `call` → dispatches `OperationRegistry::invoke()` (via the shared + `GatewayDispatch::invoke()` — the dispatch spine). The result is + mapped to an MCP `CallToolResult` (`structuredContent` for the + output, or `isError: true` for a `CallError` with typed + `details` per ADR-023). + - `batch` → dispatches multiple `call.requested` events, returns + an array of results. + +3. Auth: the Bearer middleware resolves the token via + `IdentityProvider::resolve_from_token()`, same as the HTTP server's + auth (ADR-004). The MCP client authenticates by bearer token; no + `PeerId` (browsers and MCP clients are not alknet peers — ADR-034 §4). + `AccessControl` gates `search` results and `call` dispatch — the + LLM sees only what it's authorized to call. + +### rmcp integration (http-mcp.md §"to_mcp", research §4) + +The rmcp `simple_auth_streamhttp.rs` server example shows the +streamable-HTTP-service-into-axum-`Router` pattern: + +```rust +// From the rmcp example: +let mcp_service: StreamableHttpService = + StreamableHttpService::new( + || Ok(Counter::new()), + LocalSessionManager::default().into(), + StreamableHttpServerConfig::default(), + ); + +let protected_mcp_router = Router::new() + .nest_service("/mcp", mcp_service) + .layer(middleware::from_fn_with_state(token_store, auth_middleware)); +``` + +`alknet-http`'s `to_mcp` follows the same axum integration pattern, but +the rmcp `Service` impl is a gateway service (4 fixed tools) rather than +a per-operation tool registry. The `to_mcp` gateway implements rmcp's +`ServerHandler` trait (`call_tool` / `list_tools`) and is wrapped by +`StreamableHttpService` (a `tower::Service>`). 
+ +### Shared dispatch spine with `to_openapi` (http-mcp.md §"Shared dispatch spine") + +`to_mcp`'s `call` tool and `to_openapi`'s `/call` endpoint share the +same dispatch spine: resolve caller identity (Bearer → +`IdentityProvider::resolve_from_token`) → build a root +`OperationContext` → `OperationRegistry::invoke()` → map the +`ResponseEnvelope` to the gateway's wire shape (`CallToolResult` for +MCP, HTTP JSON for OpenAPI). The wire framing, discovery listing +(`tools/list` vs `/search`), streaming (excluded vs `/subscribe` SSE), +and server integration (rmcp `StreamableHttpService` tower service vs +axum route handlers) are genuinely per-gateway and are not shared. + +The shared spine is the `GatewayDispatch` struct (the +`gateway-dispatch-spine` task). `to_mcp` holds an `Arc<GatewayDispatch>` +(or it lives in the rmcp service state) and calls +`GatewayDispatch::invoke()` for the `call` tool. The +`ResponseEnvelope` → `CallToolResult` mapping is `to_mcp`-specific. + +### Auth: shared middleware (research §4.4) + +The Bearer auth middleware (the `bearer-auth-middleware` task) is applied +as an axum layer *around* the nested `StreamableHttpService` (the rmcp +example shows the pattern: `middleware::from_fn_with_state` around +`Router::nest_service`). The `to_mcp` `call_tool` handler reads the +`Identity` from `RequestContext.extensions` (rmcp injects +`http::request::Parts` into extensions — `tower.rs:487-521, 1086-1097`). + +### `CallToolResult` mapping + +The `ResponseEnvelope` → `CallToolResult` mapping uses rmcp's +`IntoCallToolResult` trait (`tool.rs:78-113`): + +- `Ok(value)` → `CallToolResult::structured(value)` (`model.rs:3006`). +- `Err(call_error)` → `CallToolResult::structured_error(call_error.details)` + (`model.rs:3032`) or `CallToolResult::error(error_data)`. 
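The two-branch mapping above can be pictured with stand-in types. This is a minimal sketch, not the rmcp API: `ToolResult` only mimics the shape of `CallToolResult`, `CallError` carries just the fields the sketch needs, and JSON payloads are plain strings so the block stays free of external crates. All names here are assumptions.

```rust
/// Stand-in for alknet-call's `CallError` (assumed fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct CallError {
    pub code: String,
    pub message: String,
    /// Typed error details as a JSON string (ADR-023), if the operation
    /// declared an error schema.
    pub details: Option<String>,
}

/// Stand-in for rmcp's `CallToolResult`; the real type has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    /// Structured JSON payload (output, or typed error details).
    pub structured_content: Option<String>,
    /// Plain-text fallback when no typed details exist.
    pub text: Option<String>,
    pub is_error: bool,
}

/// Map an invoke outcome to a tool result:
/// - success        -> structured output, `is_error = false`
/// - typed error    -> structured details, `is_error = true`
/// - untyped error  -> text message, `is_error = true`
pub fn to_tool_result(outcome: Result<String, CallError>) -> ToolResult {
    match outcome {
        Ok(output) => ToolResult {
            structured_content: Some(output),
            text: None,
            is_error: false,
        },
        Err(err) => match err.details {
            Some(details) => ToolResult {
                structured_content: Some(details),
                text: None,
                is_error: true,
            },
            None => ToolResult {
                structured_content: None,
                text: Some(format!("{}: {}", err.code, err.message)),
                is_error: true,
            },
        },
    }
}
```

A `batch` result would then be a `Vec<ToolResult>` built by mapping this function over each invocation outcome, one entry per requested call.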
+ +## Acceptance Criteria + +- [ ] `to_mcp` implements rmcp `ServerHandler` trait (`call_tool`, `list_tools`) +- [ ] `tools/list` returns 4 fixed gateway tools (`search`, `schema`, `call`, `batch`) +- [ ] `tools/list` does NOT return the registry's operations (discovered via `search`) +- [ ] `search` tool → dispatches `services/list` via `GatewayDispatch::invoke` +- [ ] `search` results are `AccessControl::check(identity)`-filtered +- [ ] `search` results are names + descriptions (not full schemas) +- [ ] `Subscription` ops filtered out of `search` results (ADR-041 §2) +- [ ] `schema` tool → dispatches `services/schema` via `GatewayDispatch::invoke` +- [ ] `call` tool → dispatches via `GatewayDispatch::invoke` (shared spine) +- [ ] `call` result → `CallToolResult::structured(value)` for `Ok` +- [ ] `call` error → `CallToolResult::structured_error(details)` for `Err(CallError)` +- [ ] `batch` tool → loop over `GatewayDispatch::invoke`, returns array +- [ ] Bearer auth via shared `bearer_auth_middleware` (applied around `nest_service`) +- [ ] `Identity` read from `RequestContext.extensions` inside `call_tool` +- [ ] MCP client has no `PeerId` (not an alknet peer, ADR-034 §4) +- [ ] `AccessControl` gates `search` results and `call` dispatch +- [ ] `to_mcp` is a pure projection (consumes registry, does not produce entries) +- [ ] `StreamableHttpService` nested into axum `Router` at `/mcp` +- [ ] Feature-gated behind `mcp` (no compile without `mcp` feature) +- [ ] stdio transport NOT built (ADR-037) +- [ ] Unit test: `tools/list` returns exactly 4 gateway tools +- [ ] Unit test: `search` returns AccessControl-filtered ops (no Subscriptions) +- [ ] Unit test: `schema` returns full OperationSpec +- [ ] Unit test: `call` → `CallToolResult::structured` for success +- [ ] Unit test: `call` → `CallToolResult::structured_error` for CallError +- [ ] Unit test: `batch` returns array of results +- [ ] Integration test: MCP client calls `search` → `schema` → `call` round-trip +- [ 
] Integration test: Bearer auth middleware gates `to_mcp` service +- [ ] Integration test: `Identity` survives rmcp framing (research §6 #2) +- [ ] `cargo test -p alknet-http --features mcp` succeeds +- [ ] `cargo clippy -p alknet-http --features mcp --all-targets` succeeds with no warnings +- [ ] `cargo check -p alknet-http` (no `mcp` feature) succeeds — to_mcp not compiled + +## References + +- docs/architecture/crates/http/http-mcp.md — to_mcp (full spec) +- docs/research/alknet-http-gateway-factoring/findings.md — §4 (rmcp StreamableHttpService constraints), §4.4 (auth middleware sharing) +- docs/architecture/decisions/041-mcp-tool-gateway-pattern.md — ADR-041 (4-tool gateway, Subscription exclusion) +- docs/architecture/decisions/037-mcp-stdio-transport-exclusion.md — ADR-037 (streamable HTTP only) +- docs/architecture/decisions/017-call-protocol-client-and-adapter-contract.md — ADR-017 §5 (to_* are projections) +- docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (error fidelity, CallToolResult mapping) +- docs/architecture/decisions/034-outgoing-only-x509-and-three-peer-roles.md — ADR-034 §4 (MCP clients are not peers) +- /workspace/rust-sdk/examples/servers/src/simple_auth_streamhttp.rs — axum middleware around nested StreamableHttpService +- /workspace/rust-sdk/crates/rmcp/src/handler/server.rs — ServerHandler trait (call_tool, list_tools) +- /workspace/rust-sdk/crates/rmcp/src/handler/server/tool.rs — IntoCallToolResult trait +- /workspace/rust-sdk/crates/rmcp/src/model.rs — CallToolResult (structured, structured_error) + +## Notes + +> to_mcp is the 4-tool gateway (ADR-041): search/schema/call/batch, not +> one tool per operation. The LLM discovers operations on demand through +> search + schema, same as man . Subscription ops are excluded +> (MCP tool calls are request/response). The shared dispatch spine +> (GatewayDispatch) is used for the call tool; the ResponseEnvelope → +> CallToolResult mapping is to_mcp-specific. 
The Bearer auth middleware +> is shared with the HTTP routes (research §4.4 — applied around +> nest_service). The load-bearing assumption is that Identity survives +> the rmcp framing (research §6 #2 — confirm with a spike that +> ctx.extensions.get::<Identity>() works inside call_tool). to_mcp is a +> pure projection (consumes registry, does not produce entries). The mcp +> feature gate is optional; stdio is NOT built (ADR-037). + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/adapters/to-openapi.md b/tasks/http/adapters/to-openapi.md new file mode 100644 index 0000000..1e6d713 --- /dev/null +++ b/tasks/http/adapters/to-openapi.md @@ -0,0 +1,188 @@ +--- +id: http/adapters/to-openapi +name: Implement to_openapi gateway projection (5-endpoint OpenAPI doc, info.version semver, ADR-042/045) +status: pending +depends_on: [http/server/gateway-endpoints, http/gateway/gateway-dispatch-spine] +scope: moderate +risk: medium +impact: component +level: implementation +--- + +## Description + +Implement `to_openapi` in `src/adapters/to_openapi.rs`. This is the +OpenAPI gateway projection: it generates an OpenAPI document with a +**fixed 5-endpoint gateway set** that gates access to the full operation +registry — not one path per operation (ADR-042). The external client (a +code generator, a human developer, a `fetch`-based client) calls +`/search` to discover operations, `/schema` to learn an operation's +input shape, `/call` to invoke. Served at `GET /openapi.json` by the HTTP +server. + +### Pure projection (ADR-017 §5) + +`to_openapi` is a pure projection — it consumes the registry and produces +a spec. It does not modify the registry; it does not register +operations; it is not an `OperationAdapter`. The HTTP server serves the +generated spec at `GET /openapi.json`. + +```rust +/// Generate an OpenAPI document describing the 5 gateway endpoints. +/// Pure projection: consumes the registry, does not produce entries. 
+/// The per-caller operation surface is discovered via /search, not +/// preloaded into the doc (ADR-042 §3). +pub fn to_openapi(registry: &OperationRegistry) -> OpenAPISpec; +``` + +### The gateway endpoint set (http-adapters.md §"The gateway endpoint set") + +`to_openapi` generates 5 fixed endpoints: + +| OpenAPI path | Call protocol | HTTP method | Purpose | +|--------------|--------------|-------------|---------| +| `/search` | `services/list` | `GET` | List/search operations (AccessControl-filtered). Names + descriptions. | +| `/schema` | `services/schema` | `GET` | Get an operation's full `OperationSpec`. | +| `/call` | `call.requested` (Query/Mutation) | `POST` | Invoke an operation. Flat JSON body `{ operation, input }`. | +| `/batch` | multiple `call.requested` | `POST` | Invoke multiple operations. Array of `{ operation, input }`. | +| `/subscribe` | `call.requested` (Subscription) | `POST` (SSE) | Invoke a streaming operation. Body `{ operation, input }`; response `text/event-stream`. | + +The input is always a flat JSON body — no path/query/body split to +reverse-engineer. JSON Schema for the input/output is already in the +`OperationSpec`; the gateway wraps it in OpenAPI's schema format without +splitting parameters. + +`/subscribe` is the one endpoint the MCP gateway excludes (ADR-041 — +MCP tool calls are request/response). OpenAPI/SSE supports streaming; +the gateway's `/subscribe` uses the SSE projection — `call.responded` → +SSE `data:` frames, `call.completed` → stream close. + +### Per-caller API surface (http-adapters.md §"Per-caller API surface") + +The `/search` endpoint's results are `AccessControl::check(identity)`- +filtered — the client sees only the operations it is authorized to call. +The generated OpenAPI doc describes the 5 gateway endpoints (stable, +same for every caller); the per-caller operation surface is discovered +through `/search`, not preloaded into the doc. 
This is the key +advantage over a traditional per-operation-paths OpenAPI doc: the +per-caller API surface is the default (the Gitea failure mode — dumping +admin ops to every caller — is structurally impossible). + +### `info.version` semver (ADR-045, OQ-39 resolved) + +The generated gateway doc carries `info.version` (semver) tracking the +**gateway endpoint contract**, not the operation set — per-caller +operation changes (add/remove/modify, schema changes) do not bump the +version (the operation set is discovered via `/search`, not preloaded +into the doc). Consumers detect breaking changes via the major version. + +- **Major** = breaking gateway change (an endpoint removed, a request + field removed, a status code changed meaning). +- **Minor** = additive (a new endpoint, a new optional request field). +- **Patch** = wording (doc clarifications, description tweaks). + +The version is a constant in `to_openapi` (bumped manually when the +gateway contract changes), not derived from the registry's operation +set. The initial version is `1.0.0`. + +### Error fidelity (ADR-023, http-adapters.md §"Error Fidelity") + +`to_openapi` projects `error_schemas` to the gateway endpoint's response +definitions. The `/call` endpoint's responses include the +operation-level errors (mapped by `http_status`), plus the protocol- +level errors: + +```yaml +# /call endpoint responses +responses: + '200': { schema: } + '400': { schema: } + '401': { schema: } + '403': { schema: } + '404': { schema: } + '422': { schema: } + '429': { schema: } + '500': { schema: } + '504': { schema: } +``` + +The operation-level errors (with `http_status`) are surfaced on the +`/call` endpoint's response — the gateway propagates the called +operation's `error_schemas` as response definitions. This makes the +adapter contract from ADR-017 faithful on the error axis — no silent +dropping of error contracts. 
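To make the error-fidelity projection concrete, here is a small std-only sketch that merges the protocol-level statuses with an operation's declared errors into one status-keyed response table. `ErrorDefinition` is a stand-in with assumed fields, `call_responses` is a hypothetical helper, and the protocol status list is the one named in this task's acceptance criteria (400, 401, 403, 404, 500, 504). Falling back to `500` for errors without `http_status` mirrors the runtime mapping rule and is an assumption about how the doc projection would treat them.

```rust
use std::collections::BTreeMap;

/// Stand-in for an operation's `ErrorDefinition` (ADR-023); assumed fields.
pub struct ErrorDefinition {
    pub code: String,
    pub http_status: Option<u16>,
}

/// Protocol-level statuses documented on `/call` for every caller.
pub const PROTOCOL_STATUSES: [u16; 6] = [400, 401, 403, 404, 500, 504];

/// Build a status -> error-code list for the `/call` `responses` block.
/// Protocol-level entries are tagged "protocol"; operation-level entries
/// carry their own code so no declared error is silently dropped.
pub fn call_responses(errors: &[ErrorDefinition]) -> BTreeMap<u16, Vec<String>> {
    let mut responses: BTreeMap<u16, Vec<String>> = BTreeMap::new();
    for &status in PROTOCOL_STATUSES.iter() {
        responses
            .entry(status)
            .or_default()
            .push("protocol".to_string());
    }
    for def in errors {
        // Assumption: an error without a declared status is listed under 500.
        let status = def.http_status.unwrap_or(500);
        responses.entry(status).or_default().push(def.code.clone());
    }
    responses
}
```

A `BTreeMap` keeps the statuses in ascending order, which gives a deterministic `responses` block when the table is later serialized.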
+ +### The `OpenAPISpec` type + +The concrete type is a two-way-door implementation detail +(`openapiv3::OpenApi`, a local alknet-http type, or a +`serde_json::Value`-based parse); the one-way constraint is that +`from_openapi` accepts a standard OpenAPI 3.x JSON/YAML doc and +`to_openapi` produces one. Both directions share the same Rust type, but +not the same document shape: `from_openapi` consumes traditional +per-operation-paths docs (one path per operation), while `to_openapi` +produces the 5-endpoint gateway doc (ADR-042). The type is shared; the +shape is not. Coordinate with the `from-openapi` task on the shared type. + +### Traditional per-operation-paths projection (additive, out of scope) + +A deployment that wants a traditional REST OpenAPI doc (per-operation +paths with split parameters) can build it as a separate projection with +HTTP-specific metadata (which fields are path params, etc.). The gateway +pattern is the default `to_openapi` projection; the traditional +projection is additive, not a replacement (ADR-042 §5). This task +implements the gateway projection only; the traditional projection is +out of scope. 
+ +## Acceptance Criteria + +- [ ] `to_openapi(registry: &OperationRegistry) -> OpenAPISpec` implemented +- [ ] Generates 5 fixed gateway endpoints (`/search`, `/schema`, `/call`, `/batch`, `/subscribe`) +- [ ] No per-operation paths (the gateway is the surface, ADR-042) +- [ ] `/call` request body is flat JSON `{ operation, input }` (no path/query/body split) +- [ ] `/subscribe` response is `text/event-stream` +- [ ] `info.version` is semver tracking the gateway contract (initial `1.0.0`, ADR-045) +- [ ] Per-caller operation surface NOT preloaded into the doc (discovered via `/search`) +- [ ] `/call` responses include protocol-level errors (400, 401, 403, 404, 500, 504) +- [ ] `/call` responses include operation-level errors (mapped by `http_status`, ADR-023) +- [ ] `HTTP_`-prefixed error codes projected correctly (no collision with protocol codes) +- [ ] `to_openapi` is a pure projection (does not modify registry, not an OperationAdapter) +- [ ] `GET /openapi.json` route serves the generated spec (wired by http-adapter task) +- [ ] Unit test: generated doc has exactly 5 paths (the gateway endpoints) +- [ ] Unit test: `/call` request schema is `{ operation: string, input: object }` +- [ ] Unit test: `/subscribe` response content type is `text/event-stream` +- [ ] Unit test: `info.version` is `1.0.0` +- [ ] Unit test: `/call` responses include all protocol-level error statuses +- [ ] Unit test: operation with `error_schemas` → those errors projected on `/call` +- [ ] Unit test: operation with `HTTP_404` error code → projected as 404 response +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/http-adapters.md — to_openapi (§"to_openapi", §"The gateway endpoint set", §"Per-caller API surface", §"Error Fidelity") +- docs/architecture/decisions/042-openapi-gateway-pattern.md — ADR-042 (5 fixed gateway endpoints) +- 
docs/architecture/decisions/045-to-openapi-gateway-spec-versioning.md — ADR-045 (info.version semver) +- docs/architecture/decisions/047-remove-direct-call-http-surface.md — ADR-047 (gateway is sole invoke path) +- docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (error fidelity, HTTP_ prefix) +- docs/architecture/decisions/017-call-protocol-client-and-adapter-contract.md — ADR-017 §5 (to_* are projections) + +## Notes + +> to_openapi is a pure projection — it consumes the registry, does not +> produce entries. The generated doc describes the 5 fixed gateway +> endpoints (stable, same for every caller); the per-caller operation +> surface is discovered via /search, not preloaded. The info.version +> semver tracks the gateway endpoint contract, not the operation set +> (ADR-045) — per-caller operation changes do not bump the version. The +> error fidelity (ADR-023) projects operation-level errors (with +> http_status) onto /call's responses, plus the protocol-level errors. +> The OpenAPISpec type is shared with from_openapi (coordinate on the +> type); the shape is not (from_openapi consumes per-operation-paths, +> to_openapi produces the 5-endpoint gateway doc). The traditional +> per-operation-paths projection is additive (ADR-042 §5) and out of +> scope. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/client/shared-http-client.md b/tasks/http/client/shared-http-client.md new file mode 100644 index 0000000..3cc7641 --- /dev/null +++ b/tasks/http/client/shared-http-client.md @@ -0,0 +1,173 @@ +--- +id: http/client/shared-http-client +name: Implement shared HTTP client (ClientWithMiddleware + retry + Retry-After, OQ-40) +status: pending +depends_on: [http/crate-init] +scope: narrow +risk: low +impact: component +level: implementation +--- + +## Description + +Implement the shared HTTP client in `src/client/http_client.rs`. 
This is +the `reqwest_middleware::ClientWithMiddleware` used by all +`from_openapi`/`from_mcp` forwarding handlers. The client owns connection +pooling, keep-alive, TLS, and a retry stack. It is constructed once and +reused across all forwarding handlers; credential injection happens +per-request (from `OperationContext.capabilities`), not at client +construction — the client is shared across all operations, the +credentials are per-call. + +### The middleware stack (OQ-40 resolved) + +The shared type is `reqwest_middleware::ClientWithMiddleware`, not a +bare `reqwest::Client` — both retry and Retry-After are middleware on the +stack, and middleware requires the `ClientWithMiddleware` wrapper. The +stack has two layers: + +1. **`RetryTransientMiddleware`** (from `reqwest-retry`) — exponential + backoff on transient failures (connection errors, 5xx). The "retry N + times with increasing intervals" part. Configured via an + `ExponentialBackoff` policy at client construction. + +2. **Inlined `RetryAfterMiddleware`** — parses the `Retry-After` header + on 429/503 and sleeps before the next request to that URL. The + "respect what the server told you" part. **Inlined** (MIT, ~50 lines + of real logic) from `melotic/reqwest-retry-after`, not pulled as a + dependency: the crate is complementary to `reqwest-retry` (whose + default strategy does not honor `Retry-After`), and inlining lets the + upstream's unbounded `HashMap` storage be bounded + for a long-running process. + +### Pooling, keep-alive, TLS + +Pooling, keep-alive, and TLS come from `reqwest::ClientBuilder` defaults; +outbound TLS uses the system trust store (standard HTTPS to external +APIs like OpenAI, Anthropic). Custom CA bundle + client certs are an +optional config for self-hosted API gateways (two-way-door +implementation detail; the credential comes from `Capabilities`, the TLS +trust comes from the system). 
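The "retry N times with increasing intervals" behaviour of the first middleware layer can be pictured with a short std-only sketch. This is not `reqwest-retry`'s implementation (its `ExponentialBackoff` policy has its own bounds and jitter); `backoff_delay` and `is_transient` are hypothetical helpers that only illustrate the shape of the policy, and the 5xx classification follows the wording of this task rather than the crate's exact rules.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow saturates to `max` instead of panicking.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    match base.checked_mul(factor) {
        Some(delay) => delay.min(max),
        None => max,
    }
}

/// Whether a response status counts as a transient failure (5xx) in this
/// sketch. Connection errors never produce a status and are handled
/// separately by the real middleware.
pub fn is_transient(status: u16) -> bool {
    (500..=599).contains(&status)
}
```

The `Retry-After` layer complements this: backoff chooses a delay when the server says nothing, while a `Retry-After` header is the server stating the delay explicitly.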
+ +### Hot-reload (rebuild-and-swap) + +Hot-reload of the pooling/retry config is **rebuild-and-swap**: a config +change rebuilds the `ClientWithMiddleware` and swaps it via `ArcSwap` +(the same pattern `ConfigIdentityProvider` uses, ADR-035). A rebuild +drops the connection pool / keep-alive state, which is acceptable — a +config change wanting a fresh pool is the case that triggers it. The +retry policy is baked into the middleware at `ClientBuilder::build()` +time; live policy mutation is not supported by `reqwest-retry`, so cheap +per-policy updates are not part of the model. + +### API + +```rust +/// Configuration for the shared HTTP client (two-way-door, OQ-40 resolved). +pub struct HttpClientConfig { + /// Pool max idle connections per host (reqwest default if None). + pub pool_max_idle_per_host: Option<usize>, + /// Request timeout (reqwest default if None). + pub request_timeout: Option<Duration>, + /// Retry policy for transient failures (ExponentialBackoff). + pub retry_policy: ExponentialBackoff, + /// Custom CA bundle path for self-hosted API gateways (optional). + pub ca_bundle: Option<PathBuf>, + /// Client cert for mutual TLS (optional, from Capabilities at call time, + /// not here — this is the trust config, not the credential). + pub client_cert: Option, +} + +/// The shared HTTP client. Constructed once, reused across all forwarding +/// handlers. Wrapped in `ArcSwap` for rebuild-and-swap hot-reload. +pub struct SharedHttpClient { + inner: ArcSwap<ClientWithMiddleware>, + config: ArcSwap<HttpClientConfig>, +} + +impl SharedHttpClient { + pub fn new(config: HttpClientConfig) -> Self { ... } + + /// Get the current client (for forwarding handlers to use). + pub fn client(&self) -> Arc<ClientWithMiddleware> { + self.inner.load_full() + } + + /// Rebuild the client with new config (rebuild-and-swap hot-reload). + pub fn reload(&self, config: HttpClientConfig) { ... 
} +} +``` + +### The inlined `RetryAfterMiddleware` + +The inlined middleware is ~50 lines: a bounded `HashMap` +holding per-URL retry-after deadlines, a `before` hook that sleeps if the +deadline is in the future, and an `after` hook that parses the +`Retry-After` header on 429/503 and records the deadline. The bound +(e.g., LRU eviction at N entries) prevents unbounded growth in a +long-running process — the upstream's unbounded `HashMap` +is the reason it's inlined rather than depended on. + +### Downstream layering boundary + +The agent crate's provider SSE normalization sits on top of this +`ClientWithMiddleware`: it consumes the `reqwest::Response` stream the +forwarding handler produces and emits `call.responded` events. It does +not replace the client or own transport/pooling/retry. `alknet-http` +owns transport; the agent crate owns provider-specific SSE → +Vercel-UI-message mapping. The aisdk `core/client.rs` reference for HTTP +client construction is *not* carried forward — its env-var config and +hand-rolled retry are the anti-patterns discarded in favor of the +middleware stack above. 
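The bounded storage described under "The inlined `RetryAfterMiddleware`" can be sketched with the standard library alone. This is an illustration under stated assumptions, not the upstream crate's code: `RetryAfterStore` is a hypothetical name, URLs are plain strings, eviction is FIFO as a simple stand-in for LRU, and only the delay-seconds form of `Retry-After` is parsed (the HTTP-date form is left out). The real middleware would call something like `record` from its response hook on 429/503 and `wait_for` before sending.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Bounded per-URL store of retry-after deadlines.
pub struct RetryAfterStore {
    capacity: usize,
    deadlines: HashMap<String, Instant>,
    /// Insertion order, used to evict the oldest URL when full.
    order: VecDeque<String>,
}

impl RetryAfterStore {
    pub fn new(capacity: usize) -> Self {
        RetryAfterStore {
            capacity,
            deadlines: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Record a deadline from a `Retry-After` value in delay-seconds form.
    /// Returns false if the value is not a plain number of seconds.
    pub fn record(&mut self, url: &str, retry_after: &str, now: Instant) -> bool {
        let secs: u64 = match retry_after.trim().parse() {
            Ok(secs) => secs,
            Err(_) => return false,
        };
        let deadline = match now.checked_add(Duration::from_secs(secs)) {
            Some(deadline) => deadline,
            None => return false,
        };
        if !self.deadlines.contains_key(url) {
            if self.capacity == 0 {
                return false;
            }
            // Evict oldest entries until there is room for the new URL.
            while self.deadlines.len() >= self.capacity {
                match self.order.pop_front() {
                    Some(oldest) => {
                        self.deadlines.remove(&oldest);
                    }
                    None => break,
                }
            }
            self.order.push_back(url.to_string());
        }
        self.deadlines.insert(url.to_string(), deadline);
        true
    }

    /// How long a request to `url` should sleep, if a deadline is active.
    pub fn wait_for(&self, url: &str, now: Instant) -> Option<Duration> {
        let deadline = *self.deadlines.get(url)?;
        if deadline > now {
            Some(deadline - now)
        } else {
            None
        }
    }

    /// Number of URLs currently tracked (never exceeds `capacity`).
    pub fn len(&self) -> usize {
        self.deadlines.len()
    }
}
```

Passing `now` in explicitly keeps the store deterministic under test; in the middleware it would be `Instant::now()` at the hook call site, and the store would sit behind a mutex since the client is shared.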
+ +## Acceptance Criteria + +- [ ] `SharedHttpClient` struct in `src/client/http_client.rs` +- [ ] Wraps `ArcSwap<ClientWithMiddleware>` for rebuild-and-swap +- [ ] `HttpClientConfig` with pool, timeout, retry policy, optional CA bundle +- [ ] `RetryTransientMiddleware` (from `reqwest-retry`) in the middleware stack +- [ ] Inlined `RetryAfterMiddleware` (~50 lines, bounded `HashMap`) +- [ ] `RetryAfterMiddleware` parses `Retry-After` header on 429/503 +- [ ] `RetryAfterMiddleware` sleeps before next request to a URL with an active deadline +- [ ] `RetryAfterMiddleware` storage is bounded (LRU eviction or equivalent) +- [ ] Pooling/keep-alive/TLS from `reqwest::ClientBuilder` defaults +- [ ] System trust store for outbound TLS by default +- [ ] Custom CA bundle optional (self-hosted API gateways) +- [ ] `reload(config)` rebuilds and swaps via `ArcSwap` +- [ ] Credential injection is per-request (NOT at client construction) +- [ ] No `std::env::var` reads (no-env-vars invariant — ADR-014) +- [ ] No shared global client (alknet-http owns its client) +- [ ] Unit test: `client()` returns a usable `ClientWithMiddleware` +- [ ] Unit test: `reload()` swaps the client (new client returned by `client()`) +- [ ] Unit test: `RetryAfterMiddleware` records deadline from `Retry-After: <seconds>` +- [ ] Unit test: `RetryAfterMiddleware` records deadline from `Retry-After: <HTTP-date>` +- [ ] Unit test: bounded storage evicts old entries (no unbounded growth) +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/http-adapters.md — HTTP client (§"HTTP client (reqwest)") +- docs/architecture/crates/http/overview.md — OQ-40 resolved (ClientWithMiddleware + middleware stack) +- docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 (no env vars) +- https://docs.rs/reqwest-retry/ — RetryTransientMiddleware / ExponentialBackoff +- 
https://github.com/melotic/reqwest-retry-after 
— RetryAfterMiddleware source (MIT, inlined, not a dependency) + +## Notes + +> The shared HTTP client is the no-env-vars-compliant outbound transport. +> The middleware stack (RetryTransientMiddleware + inlined +> RetryAfterMiddleware) is the resolved shape (OQ-40). The +> RetryAfterMiddleware is inlined (not a dependency) so its storage can +> be bounded for a long-running process — the upstream's unbounded +> HashMap is the reason. Hot-reload is rebuild-and-swap via ArcSwap (same +> pattern as ConfigIdentityProvider, ADR-035). Credential injection is +> per-request (from OperationContext.capabilities), not at client +> construction — the client is shared, the credentials are per-call. The +> agent crate's SSE normalization sits on top of this client; it does not +> replace it. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/crate-init.md b/tasks/http/crate-init.md new file mode 100644 index 0000000..6a07913 --- /dev/null +++ b/tasks/http/crate-init.md @@ -0,0 +1,146 @@ +--- +id: http/crate-init +name: Initialize alknet-http crate with Cargo.toml, dependencies, and module skeleton +status: pending +depends_on: [] +scope: moderate +risk: low +impact: project +level: implementation +--- + +## Description + +Initialize the `alknet-http` crate from scratch. This crate is the HTTP +interface for alknet: it serves inbound HTTP on standard ALPNs (`h2`, +`http/1.1`, with WebSocket upgrade for browser bidirectional access to the +call protocol) and hosts the HTTP-backed call-protocol adapters +(`from_openapi`, `to_openapi`, `from_mcp`, `to_mcp`). HTTP/3 + WebTransport +(`h3`) is deferred per ADR-044 and is **not** part of this crate's initial +release. + +The crate has two roles in one (ADR-039): HTTP server (`HttpAdapter`, a +`ProtocolHandler` for `h2`/`http/1.1` + WS upgrade) and HTTP client host +(`from_openapi`/`from_mcp` forwarding via `reqwest`, `to_openapi`/`to_mcp` +projections via `axum`). 
Both directions share the same HTTP dependencies, +which is why they live in one crate. + +### Crate setup + +Create `crates/alknet-http/` with: + +- `Cargo.toml` — package metadata, dependencies, feature gates +- `src/lib.rs` — crate root with module declarations and re-exports +- Module skeleton files for: + - `src/server/mod.rs` — `HttpAdapter`, axum-over-QUIC, gateway routes, + `/healthz`, decoy, custom routes + - `src/gateway/mod.rs` — shared dispatch spine (`GatewayDispatch`), + error mapping + - `src/client/mod.rs` — shared HTTP client (`ClientWithMiddleware`) + - `src/adapters/mod.rs` — `from_openapi`, `to_openapi`, `from_mcp`, + `to_mcp` + - `src/websocket/mod.rs` — WS upgrade handler, framing, dispatch handoff + +### Dependencies + +| Crate | Purpose | Feature gate | +|-------|---------|--------------| +| `alknet-core` | `ProtocolHandler`, `Connection`, `AuthContext`, `Capabilities`, `IdentityProvider`, `AuthToken`, `Identity`, `HandlerError` (workspace path) | default | +| `alknet-call` | `OperationAdapter`, `AdapterError`, `OperationSpec`, `Handler`, `HandlerRegistration`, `OperationRegistry`, `OperationProvenance`, `OperationContext`, `ResponseEnvelope`, `CallError`, `EventEnvelope`, `Dispatcher`, `CallConnection` (workspace path) | default | +| `axum` | HTTP server — `Router`, extractors, middleware, WebSocket upgrade | default | +| `hyper` | HTTP/1.1 + HTTP/2 framing (axum is built on hyper) | default | +| `reqwest` | HTTP client — `from_openapi`/`from_mcp` forwarding | default | +| `reqwest-middleware` | `ClientWithMiddleware` wrapper for middleware stack | default | +| `reqwest-retry` | `RetryTransientMiddleware` / `ExponentialBackoff` | default | +| `tokio` 1 (full) | Async runtime | default | +| `serde` 1 | Serialization | default | +| `serde_json` 1 | JSON wire format, JSON Schema values | default | +| `async-trait` 0.1 | Async traits | default | +| `tracing` 0.1 | Structured logging | default | +| `thiserror` 2 | Error enums | default | +| 
`uuid` 1 | Request ID generation (UUID v4) | default | +| `futures` | Stream trait for SSE / subscriptions | default | +| `rmcp` | MCP streamable HTTP — `from_mcp`/`to_mcp` | `mcp` feature | +| `openapiv3` | OpenAPI document types (or a local type — two-way door) | default | + +### Feature gates + +```toml +[features] +default = ["h2", "http1"] +mcp = ["dep:rmcp"] +# h3 (HTTP/3 + WebTransport) is deferred per ADR-044 — not in the v1 +# feature set. The browser bidirectional path uses WebSocket (native to +# axum, no feature gate). When WebTransport revives, the `h3` feature +# gate returns; see ADR-044 and webtransport.md. +``` + +- `h2` + `http1` (default): the `axum` + `hyper` HTTP/1.1 + HTTP/2 server, + including WebSocket upgrade for browser bidirectional access (ADR-044). +- `mcp`: the `rmcp` dependency with streamable HTTP transport features + only (ADR-037 — stdio is excluded). Adds `from_mcp`/`to_mcp`. + +### Workspace Cargo.toml + +Add `crates/alknet-http` to the workspace `members` list in the root +`Cargo.toml`. + +### Module skeleton + +```rust +// src/lib.rs +//! alknet-http: HTTP interface for alknet — serves HTTP/1.1 + HTTP/2 on +//! standard ALPNs (with WebSocket upgrade for browser bidirectional access) +//! and hosts the HTTP-backed call-protocol adapters. +//! +//! Two roles in one crate (ADR-039): HTTP server (HttpAdapter, a +//! ProtocolHandler for h2/http1.1 + WS upgrade) and HTTP client host +//! (from_openapi/from_mcp forwarding, to_openapi/to_mcp projections). + +pub mod server; +pub mod gateway; +pub mod client; +pub mod adapters; +pub mod websocket; +``` + +Each module file gets a doc comment and `// TODO: implement` marker. 
+ +## Acceptance Criteria + +- [ ] `crates/alknet-http/Cargo.toml` exists with all dependencies and feature gates +- [ ] `crates/alknet-http/src/lib.rs` exists with module declarations +- [ ] All module skeleton files exist (server/, gateway/, client/, adapters/, websocket/) +- [ ] Root `Cargo.toml` `members` list includes `crates/alknet-http` +- [ ] `cargo check -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http` succeeds with no warnings +- [ ] Dual licensing: `MIT OR Apache-2.0` (workspace-inherited) +- [ ] `alknet-core` dependency uses workspace path (`path = "../alknet-core"`) +- [ ] `alknet-call` dependency uses workspace path (`path = "../alknet-call"`) +- [ ] `mcp` feature gate pulls in `rmcp` with streamable HTTP transport features only (no stdio) +- [ ] `h3`/WebTransport dependency is absent (deferred per ADR-044) + +## References + +- docs/architecture/crates/http/README.md — crate index +- docs/architecture/crates/http/overview.md — crate overview, dependencies, feature gates +- docs/architecture/decisions/003-crate-decomposition.md — ADR-003 (Amendment 1: alknet-call is protocol-foundation) +- docs/architecture/decisions/039-http-server-and-client-host-colocated.md — ADR-039 (one crate for server + client host) +- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 (h3 deferred) +- docs/architecture/decisions/037-mcp-stdio-transport-exclusion.md — ADR-037 (streamable HTTP only) + +## Notes + +> alknet-http depends on alknet-core (ProtocolHandler, Connection, +> IdentityProvider, Capabilities) and alknet-call (OperationAdapter, +> OperationRegistry, HandlerRegistration, EventEnvelope, Dispatcher). The +> crate has five subsystems: server (HttpAdapter, axum over QUIC), gateway +> (shared dispatch spine, error mapping), client (shared HTTP client), +> adapters (from_openapi/to_openapi/from_mcp/to_mcp), and websocket (WS +> upgrade, native session). The module structure reflects this split. 
The +> `mcp` feature gate is optional; the default feature set is `h2` + `http1`. +> The `h3`/WebTransport dependency is absent (deferred per ADR-044). + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/gateway/error-mapping.md b/tasks/http/gateway/error-mapping.md new file mode 100644 index 0000000..bcf64cc --- /dev/null +++ b/tasks/http/gateway/error-mapping.md @@ -0,0 +1,139 @@ +--- +id: http/gateway/error-mapping +name: Implement CallError-to-HTTP-status error mapping (ADR-023) +status: pending +depends_on: [http/crate-init] +scope: narrow +risk: low +impact: component +level: implementation +--- + +## Description + +Implement the `CallError` code → HTTP status code mapping in +`src/gateway/error.rs`. This is the error-mapping table the HTTP server's +gateway endpoints use to translate call-protocol `CallError` codes into +HTTP response status codes (ADR-023). The mapping is a two-way-door +default (the exact status for ambiguous codes can be refined +additively); the one-way constraint is that protocol-level and +operation-level codes are distinct (ADR-023) and `from_openapi`-imported +codes are prefixed `HTTP_` to avoid collision with protocol +codes. 
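The collision-avoidance constraint in the description can be checked mechanically. The sketch below is a minimal illustration, assuming the five protocol-level codes this task names (`NOT_FOUND`, `FORBIDDEN`, `INVALID_INPUT`, `INTERNAL`, `TIMEOUT`); `imported_error_code` and `is_protocol_code` are hypothetical helper names, not part of any alknet crate.

```rust
/// The five protocol-level codes reserved by the call protocol (ADR-023).
pub const PROTOCOL_CODES: [&str; 5] = [
    "NOT_FOUND",
    "FORBIDDEN",
    "INVALID_INPUT",
    "INTERNAL",
    "TIMEOUT",
];

/// Build the error code `from_openapi` would assign to an imported
/// non-2xx response: the `HTTP_` prefix followed by the status number.
pub fn imported_error_code(status: u16) -> String {
    format!("HTTP_{}", status)
}

/// True if `code` is one of the reserved protocol-level codes.
pub fn is_protocol_code(code: &str) -> bool {
    PROTOCOL_CODES.iter().any(|reserved| *reserved == code)
}
```

Because every imported code starts with `HTTP_` and no reserved code does, the two namespaces cannot overlap regardless of which status numbers an imported document declares.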
+
+### The mapping table (http-server.md §"Error Mapping")
+
+| Call `code` | HTTP status | Notes |
+|-------------|-------------|-------|
+| `NOT_FOUND` (operation not registered, or Internal op) | `404` | |
+| `FORBIDDEN` (insufficient scopes, or unauthenticated) | `401` (no token) / `403` (token present) | |
+| `INVALID_INPUT` (schema mismatch) | `422` | |
+| `TIMEOUT` | `504` | `retryable: true` |
+| `INTERNAL` | `500` | |
+| Operation-level domain code with `http_status` (ADR-023) | the declared `http_status` | `from_openapi`-imported ops carry the original status |
+| Operation-level domain code without `http_status` | `500` | |
+
+### The `HTTP_` prefix rule (ADR-023 §5)
+
+`from_openapi` maps OpenAPI non-2xx response status codes to
+`ErrorDefinition`s with codes prefixed `HTTP_` + the status number:
+
+```rust
+// OpenAPI: 404: { schema: NotFoundError }
+// → ErrorDefinition { code: "HTTP_404", http_status: Some(404), schema: NotFoundError }
+```
+
+The normative rule (review #002 W20): `from_openapi` must not produce
+error codes that collide with the five protocol-level codes (`NOT_FOUND`,
+`FORBIDDEN`, `INVALID_INPUT`, `INTERNAL`, `TIMEOUT`). The `HTTP_`
+prefix enforces this.
+
+### `retryable` → `Retry-After` hint
+
+The `retryable` field from `CallError` maps to an HTTP `Retry-After` hint
+for `503`/`429`-class errors (operation-level codes with `http_status` in
+that range). The hint is optional; if the operation-level error does not
+carry a retry-after value, no header is added.
+
+### API
+
+```rust
+/// Map a CallError to an HTTP status code (ADR-023).
+pub fn call_error_to_http_status(error: &CallError) -> u16;
+
+/// Map a CallError to an HTTP response, including the Retry-After hint
+/// when applicable. The body is the serialized CallError (or its
+/// `details` field).
+pub fn call_error_to_http_response(error: &CallError) -> axum::response::Response;
+```
+
+The `FORBIDDEN` case needs the caller's identity state to distinguish
+`401` (no token) from `403` (token present but insufficient scopes). The
+mapping function takes an `Option<Identity>` (or a flag) so the gateway
+endpoint can pass the resolved identity through:
+
+```rust
+/// Map a CallError to an HTTP status code, considering whether the caller
+/// was authenticated (FORBIDDEN → 401 if no identity, 403 if identity
+/// present but insufficient scopes).
+pub fn call_error_to_http_status_with_identity(
+    error: &CallError,
+    identity: Option<&Identity>,
+) -> u16;
+```
+
+### What this task does NOT do
+
+- **No `to_openapi` error projection.** `to_openapi` projects
+  `error_schemas` to the gateway endpoint's response definitions (the
+  OpenAPI doc's `responses` block). That is the `to-openapi` task, not
+  this one. This task is the runtime HTTP response mapping.
+- **No `from_openapi` error import.** `from_openapi` builds
+  `ErrorDefinition`s from OpenAPI non-2xx responses with the `HTTP_`
+  prefix. That is the `from-openapi` task. This task consumes the
+  resulting `CallError` codes at runtime.
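The mapping table, the 401/403 split, and the `Retry-After` rule in this task can be sketched with the standard library alone. Everything here is a stand-in under stated assumptions: the `CallError` struct below is hypothetical (the real type lives in alknet-call), carrying `http_status` and `retry_after_secs` on the error itself is an assumption, the `authenticated` flag is the "or a flag" variant of the identity parameter, and "503/429-class" is read narrowly as exactly those two statuses.

```rust
/// Hypothetical stand-in for the call-protocol error.
pub struct CallError {
    pub code: String,
    /// Declared status for operation-level codes (ADR-023), if any.
    pub http_status: Option<u16>,
    pub retryable: bool,
    /// Assumed field: seconds to wait, when the operation supplies a value.
    pub retry_after_secs: Option<u64>,
}

impl CallError {
    pub fn new(code: &str, http_status: Option<u16>) -> Self {
        Self { code: code.to_owned(), http_status, retryable: false, retry_after_secs: None }
    }

    /// Mark the error retryable with a concrete wait time.
    pub fn retry_in(mut self, secs: u64) -> Self {
        self.retryable = true;
        self.retry_after_secs = Some(secs);
        self
    }
}

/// Map an error to a status. `authenticated` is true when a bearer token
/// resolved to an identity.
pub fn status_for(error: &CallError, authenticated: bool) -> u16 {
    match error.code.as_str() {
        "NOT_FOUND" => 404,
        "FORBIDDEN" if authenticated => 403,
        "FORBIDDEN" => 401,
        "INVALID_INPUT" => 422,
        "TIMEOUT" => 504,
        "INTERNAL" => 500,
        // Operation-level code: the declared status, else 500.
        _ => error.http_status.unwrap_or(500),
    }
}

/// Retry-After value in seconds: only for retryable 429/503 errors that
/// actually carry a value; otherwise no header is produced.
pub fn retry_after(error: &CallError, status: u16) -> Option<u64> {
    if error.retryable && matches!(status, 429 | 503) {
        error.retry_after_secs
    } else {
        None
    }
}
```

An imported `HTTP_404` code falls through to the operation-level arm and resolves via its declared `http_status`, so it never touches the protocol-level `NOT_FOUND` arm even though both end at 404.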
+
+## Acceptance Criteria
+
+- [ ] `call_error_to_http_status(error: &CallError) -> u16` implemented
+- [ ] `NOT_FOUND` → 404
+- [ ] `FORBIDDEN` → 401 (no identity) / 403 (identity present)
+- [ ] `INVALID_INPUT` → 422
+- [ ] `TIMEOUT` → 504
+- [ ] `INTERNAL` → 500
+- [ ] Operation-level code with `http_status` → declared status
+- [ ] Operation-level code without `http_status` → 500
+- [ ] `HTTP_`-prefixed codes (from `from_openapi`) → the status number
+- [ ] `call_error_to_http_response(error)` builds an `axum::response::Response` with the status + JSON body
+- [ ] `retryable: true` on `503`/`429`-class errors → `Retry-After` header (when value present)
+- [ ] `call_error_to_http_status_with_identity(error, identity)` for the 401/403 split
+- [ ] Unit test: each protocol code maps to the correct status
+- [ ] Unit test: operation-level code with `http_status` maps to declared status
+- [ ] Unit test: operation-level code without `http_status` maps to 500
+- [ ] Unit test: `HTTP_404` code maps to 404 (not collided with protocol `NOT_FOUND`)
+- [ ] Unit test: `FORBIDDEN` with `None` identity → 401
+- [ ] Unit test: `FORBIDDEN` with `Some(identity)` → 403
+- [ ] `cargo test -p alknet-http` succeeds
+- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings
+
+## References
+
+- docs/architecture/crates/http/http-server.md — Error Mapping table (§"Error Mapping")
+- docs/architecture/crates/http/http-adapters.md — Error Fidelity (§"Error Fidelity (ADR-023)")
+- docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (protocol/operation codes distinct, HTTP_ prefix)
+
+## Notes
+
+> The mapping is a two-way-door default (the exact status for ambiguous
+> codes can be refined additively); the one-way constraint is that
+> protocol-level and operation-level codes are distinct (ADR-023) and
+> from_openapi-imported codes are prefixed HTTP_. The FORBIDDEN
+> case needs the caller's identity state to distinguish 401 (no token)
+> from 403 (token present but insufficient scopes). This task is the
+> runtime HTTP response mapping; the to_openapi doc-level error
+> projection is the to-openapi task, and the from_openapi error import
+> is the from-openapi task.
+
+## Summary
+
+> To be filled on completion
\ No newline at end of file
diff --git a/tasks/http/gateway/gateway-dispatch-spine.md b/tasks/http/gateway/gateway-dispatch-spine.md
new file mode 100644
index 0000000..a8a0085
--- /dev/null
+++ b/tasks/http/gateway/gateway-dispatch-spine.md
@@ -0,0 +1,183 @@
+---
+id: http/gateway/gateway-dispatch-spine
+name: Implement GatewayDispatch shared dispatch spine (thin concrete struct, not a trait)
+status: pending
+depends_on: [http/crate-init]
+scope: narrow
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the shared dispatch spine for the `to_*` gateway projections
+(`to_openapi`, `to_mcp`) in `src/gateway/dispatch.rs`. This is the thin
+shared core recommended by the gateway-factoring research: a **concrete
+struct, not a trait**. It holds `Arc<OperationRegistry>` +
+`Arc<dyn IdentityProvider>` and exposes a `resolve_bearer()` +
+`invoke()` method pair returning the neutral `ResponseEnvelope`. Both
+gateways call it as a library function; each gateway then maps the
+`ResponseEnvelope` to its own wire shape.
+
+This is the security-relevant shared piece: identity resolution, root
+`OperationContext` construction, and the `OperationRegistry::invoke()`
+call. A divergence here (one gateway resolving identity differently, or
+building `OperationContext` with a different `internal` flag, or mapping
+`CallError` inconsistently) would be a real security/correctness bug.
+Extracting the spine now makes the two gateways *provably* identical on
+the security-relevant axis (auth, authority, ACL) and lets them diverge
+only on the wire-framing axis (where divergence is correct).
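One way to read the "provably identical" claim above: if every wire-ingress caller builds its root context through a single constructor, the security-relevant fields cannot drift between gateways. The sketch below is illustrative only. `Identity` and `OperationContext` are trimmed, hypothetical stand-ins for the real alknet-core / alknet-call types, the `request_id` is a caller-supplied `u64` rather than a UUID v4 so the example stays std-only, and `now` is passed in to keep the deadline deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, trimmed-down identity; the real type carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub subject: String,
}

/// Hypothetical, trimmed-down context with only the fields discussed here.
#[derive(Debug)]
pub struct OperationContext {
    pub identity: Option<Identity>,
    pub internal: bool,
    pub forwarded_for: Option<Identity>,
    pub request_id: u64,
    pub deadline: Instant,
}

/// The 30s default timeout named in this task.
pub const DEFAULT_TIMEOUT: Duration = Duration::from_secs(30);

/// Single constructor for wire-ingress root contexts. The two invariants
/// (`internal: false`, `forwarded_for: None`) are hard-coded, so a caller
/// cannot construct a root context that violates them.
pub fn build_root_context(
    identity: Option<Identity>,
    request_id: u64,
    now: Instant,
) -> OperationContext {
    OperationContext {
        identity,
        internal: false,
        forwarded_for: None,
        request_id,
        deadline: now + DEFAULT_TIMEOUT,
    }
}
```

Keeping `internal` and `forwarded_for` out of the parameter list is the design point: the fields that must not vary are not expressible as arguments.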
+
+### The struct (research §5.1)
+
+```rust
+/// Shared dispatch spine for the `to_*` gateway projections.
+/// Resolves identity, builds a root OperationContext, invokes the registry,
+/// returns the neutral ResponseEnvelope. Each gateway maps the envelope to
+/// its own wire shape.
+pub struct GatewayDispatch {
+    registry: Arc<OperationRegistry>,
+    identity_provider: Arc<dyn IdentityProvider>,
+}
+
+impl GatewayDispatch {
+    pub fn new(
+        registry: Arc<OperationRegistry>,
+        identity_provider: Arc<dyn IdentityProvider>,
+    ) -> Self { ... }
+
+    /// Resolve a bearer token to an Identity (shared by both gateways'
+    /// axum auth middleware).
+    pub fn resolve_bearer(&self, token: &AuthToken) -> Option<Identity> {
+        self.identity_provider.resolve_from_token(token)
+    }
+
+    /// Invoke an operation as a wire-ingress caller. `internal: false`,
+    /// `forwarded_for: None`, fresh request_id. Returns the neutral
+    /// ResponseEnvelope; the gateway maps it to its wire shape.
+    pub async fn invoke(
+        &self,
+        identity: Option<Identity>,
+        op: &str,
+        input: Value,
+    ) -> ResponseEnvelope {
+        // build root OperationContext, call self.registry.invoke(op, input, ctx)
+        // ...
+    }
+}
+```
+
+### Root OperationContext construction
+
+The `invoke()` method builds a root `OperationContext` for a wire-ingress
+call (the same shape `CallAdapter::build_root_context` builds for
+`alknet/call` wire requests):
+
+- `internal: false` — ACL runs against the caller's `identity`, not a
+  handler's composition authority (ADR-015).
+- `forwarded_for: None` — wire-ingress only (ADR-032).
+- `identity` = the resolved bearer identity (from `resolve_bearer`).
+- `handler_identity` = the registration bundle's `composition_authority`.
+- `capabilities` = the registration bundle's capabilities.
+- `scoped_env` = the registration bundle's `scoped_env` (or empty).
+- `request_id` = fresh UUID v4 (`generate_request_id()`).
+- `deadline` = `now + default_timeout` (30s default).
+- `env` = a `LocalOperationEnv` over the registry (the gateway dispatch
+  path does not compose peer/session overlays — it is a flat invoke).
+
+Coordinate with the existing `Dispatcher::build_root_context` in
+alknet-call (`protocol/dispatch.rs`): if that logic can be extracted as a
+shared free function (it should be — it takes `identity`, `capabilities`,
+`env`, `deadline` and returns an `OperationContext`), call it from both
+`Dispatcher` and `GatewayDispatch`. If it is tangled with
+`CallAdapter`-specific state, duplicate the construction logic here (the
+invariants — `internal: false`, `forwarded_for: None` — are the
+load-bearing part; the construction itself is mechanical). See research
+§6 open question #1.
+
+### What this task does NOT do
+
+- **No `GatewayDispatch` trait.** A concrete struct, not a polymorphic
+  trait. The research (§5.2) rules this out: a trait would need an
+  associated output type (HTTP `Response` vs `CallToolResult`), at which
+  point it has no shared method bodies.
+- **No `into_wire()` method.** The `ResponseEnvelope` → wire mapping is
+  per-gateway; do not parameterize the core over it.
+- **No streaming abstraction.** `/subscribe` SSE is `to_openapi`-only; do
+  not build a `GatewayStream` trait for one implementation.
+- **No discovery abstraction.** `services/list` is the shared backend
+  (already in `OperationRegistry`); the discovery *framing* (OpenAPI
+  `/search` vs MCP `tools/list` + `search` tool) is per-gateway.
+- **No versioning.** `info.version` is `to_openapi`-only.
+- **No `batch` method.** `batch` is a loop over `invoke()` in each
+  gateway (research §6 open question #3 — confirm `batch` is genuinely
+  just a loop, no shared batch-specific state).
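The exclusions above (no `into_wire()`, no `batch` method) can be illustrated with a small std-only sketch. It is a simplification under stated assumptions: `ResponseEnvelope` here is a hypothetical two-variant enum, `invoke` is synchronous and has a hard-coded `echo` operation instead of a registry, and `to_http_status` / `to_mcp_is_error` are hypothetical names for the per-gateway mappings.

```rust
/// Hypothetical neutral result; the real `ResponseEnvelope` is defined in
/// alknet-call and is richer than this.
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseEnvelope {
    Ok(String),
    Err { code: String },
}

/// Stand-in for the shared spine: a concrete struct with one `invoke`.
pub struct GatewayDispatch;

impl GatewayDispatch {
    /// Synchronous and registry-free to keep the sketch self-contained.
    pub fn invoke(&self, op: &str, input: &str) -> ResponseEnvelope {
        match op {
            "echo" => ResponseEnvelope::Ok(input.to_owned()),
            _ => ResponseEnvelope::Err { code: "NOT_FOUND".to_owned() },
        }
    }
}

/// `batch` lives in each gateway: a plain loop over `invoke`, with results
/// returned in request order and no batch-specific shared state.
pub fn batch(spine: &GatewayDispatch, calls: &[(&str, &str)]) -> Vec<ResponseEnvelope> {
    calls.iter().map(|(op, input)| spine.invoke(op, input)).collect()
}

/// HTTP-side mapping, kept as a free function outside the spine.
pub fn to_http_status(envelope: &ResponseEnvelope) -> u16 {
    match envelope {
        ResponseEnvelope::Ok(_) => 200,
        ResponseEnvelope::Err { code } if code == "NOT_FOUND" => 404,
        ResponseEnvelope::Err { .. } => 500,
    }
}

/// MCP-side mapping, also outside the spine: only the error flag is modeled.
pub fn to_mcp_is_error(envelope: &ResponseEnvelope) -> bool {
    matches!(envelope, ResponseEnvelope::Err { .. })
}
```

Both mappings consume the same envelope but return unrelated types, which is the reason a shared `into_wire()` on the spine would have nothing to share.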
+
+### `services/list` / `services/schema` dispatch
+
+The gateway's `search`/`schema` endpoints/tools dispatch `services/list`
+and `services/schema` — these are registered operations in the
+`OperationRegistry`, so `GatewayDispatch::invoke()` handles them
+unchanged (it calls `OperationRegistry::invoke()`, which works for
+`services/list` and `services/schema`). The `AccessControl`-filtered
+listing lives in the `services/list` handler (already in the registry),
+not in the gateway. Confirm via a spike that the filtering sees the
+*caller's* identity when invoked through `GatewayDispatch::invoke` (it
+should — `services/list` is `AccessControl::check(identity)`-filtered,
+and `GatewayDispatch` passes the resolved identity as the caller). See
+research §6 open question #4.
+
+## Acceptance Criteria
+
+- [ ] `GatewayDispatch` struct defined in `src/gateway/dispatch.rs`
+- [ ] Holds `Arc<OperationRegistry>` + `Arc<dyn IdentityProvider>`
+- [ ] `resolve_bearer(&self, token: &AuthToken) -> Option<Identity>` delegates to `identity_provider.resolve_from_token`
+- [ ] `invoke(&self, identity, op, input) -> ResponseEnvelope` builds root context and dispatches
+- [ ] Root `OperationContext` has `internal: false`, `forwarded_for: None`, fresh `request_id`
+- [ ] `handler_identity` from registration bundle's `composition_authority`
+- [ ] `capabilities` from registration bundle
+- [ ] `scoped_env` from registration bundle (or empty)
+- [ ] `deadline` = `now + 30s` default
+- [ ] `invoke()` calls `OperationRegistry::invoke(op, input, ctx)`
+- [ ] `invoke()` works for `services/list` and `services/schema` (registered ops)
+- [ ] `AccessControl`-filtering in `services/list` sees the caller's resolved identity
+- [ ] No `GatewayDispatch` trait (concrete struct only)
+- [ ] No `into_wire()` method (per-gateway mapping stays out of the core)
+- [ ] No streaming abstraction (per-gateway)
+- [ ] `GatewayDispatch` is `pub` and re-exported from `lib.rs`
+- [ ] Unit test: `invoke()` dispatches a registered op and returns
`ResponseEnvelope` +- [ ] Unit test: `invoke()` for `services/list` returns AccessControl-filtered list +- [ ] Unit test: `invoke()` for unregistered op returns `CallError { code: NOT_FOUND }` +- [ ] Unit test: `invoke()` for Internal op returns `CallError { code: NOT_FOUND }` (not leaked) +- [ ] Unit test: `invoke()` with `None` identity + restricted op → `FORBIDDEN` +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/research/alknet-http-gateway-factoring/findings.md — research recommendation (thin shared core, not a trait) +- docs/architecture/crates/http/http-adapters.md — to_openapi dispatch (§"Shared dispatch spine with to_mcp") +- docs/architecture/crates/http/http-mcp.md — to_mcp dispatch (§"Shared dispatch spine with to_openapi") +- docs/architecture/crates/call/operation-registry.md — OperationRegistry::invoke(), OperationContext construction +- docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (internal: false for wire) +- docs/architecture/decisions/032-forwarded-for-identity.md — ADR-032 (forwarded_for: None for wire-ingress) + +## Notes + +> The dispatch spine is the security-relevant shared piece. A divergence +> here (identity resolution, context construction, invoke shape) would be +> a security bug; extracting the spine now makes the two gateways provably +> identical on the security axis. The research recommends a concrete +> struct, not a trait — a trait would need an associated output type +> (HTTP Response vs CallToolResult), at which point it has no shared +> method bodies. Coordinate with the existing +> `Dispatcher::build_root_context` in alknet-call: if it can be extracted +> as a shared free function, call it from both Dispatcher and +> GatewayDispatch; otherwise duplicate the construction logic (the +> invariants are the load-bearing part). 
The `batch` endpoint is a loop +> over `invoke()` in each gateway, not a shared method. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/review-http-final.md b/tasks/http/review-http-final.md new file mode 100644 index 0000000..9c5f16b --- /dev/null +++ b/tasks/http/review-http-final.md @@ -0,0 +1,179 @@ +--- +id: http/review-http-final +name: Final review of alknet-http crate — all components, feature gates, pattern consistency +status: pending +depends_on: [http/review-http, http/review-websocket, http/review-mcp] +scope: broad +risk: low +impact: project +level: review +--- + +## Description + +Final review of the entire `alknet-http` crate — all components together, +feature gate isolation, pattern consistency, and cross-cutting concerns. +This is the last quality checkpoint before the crate is considered +complete. It runs after the three phase reviews (server surface, +WebSocket, MCP) and verifies the crate as a whole. + +### Review Checklist + +1. **Crate structure**: + - `crates/alknet-http/Cargo.toml` with all dependencies and feature gates + - `src/lib.rs` with module declarations and re-exports + - Module structure: `server/`, `gateway/`, `client/`, `adapters/`, `websocket/` + - Root `Cargo.toml` `members` includes `crates/alknet-http` + - Workspace path deps for `alknet-core`, `alknet-call` + +2. **Feature gate isolation**: + - `default = ["h2", "http1"]` — the HTTP surface (incl. WebSocket upgrade) + - `mcp = ["dep:rmcp"]` — from_mcp/to_mcp (streamable HTTP only, ADR-037) + - `h3`/WebTransport feature gate is ABSENT (deferred per ADR-044) + - `cargo check -p alknet-http` (default features) succeeds + - `cargo check -p alknet-http --features mcp` succeeds + - `cargo check -p alknet-http --no-default-features` succeeds (if meaningful) + - MCP code (from_mcp, to_mcp) not compiled without `mcp` feature + - stdio transport (`transport-child-process`) is NOT a dependency + +3. 
**Dependency correctness**: + - `alknet-core` (ProtocolHandler, Connection, IdentityProvider, Capabilities) + - `alknet-call` (OperationAdapter, AdapterError, OperationRegistry, HandlerRegistration, Dispatcher, CallConnection, EventEnvelope, ResponseEnvelope, CallError) + - `axum` (HTTP server, WebSocket upgrade) + - `reqwest` + `reqwest-middleware` + `reqwest-retry` (HTTP client) + - `hyper` (HTTP/1.1 + HTTP/2 framing) + - `rmcp` (MCP, feature-gated, streamable HTTP only) + - No `wtransport` / `h3` dependency (deferred per ADR-044) + - No env-var-based config (no-env-vars invariant, ADR-014) + +4. **Cross-cutting conformance**: + - **No-env-vars invariant** (ADR-014): no `std::env::var` in any handler + (from_openapi, from_mcp forwarding handlers read context.capabilities) + - **No secret material in responses** (ADR-014): Capabilities not serialized + into HTTP response bodies or CallToolResult + - **AccessControl is the sole gate** (ADR-029 §3): no remote_safe/trusted_peer + - **Internal ops → NOT_FOUND** (ADR-015): don't leak existence from wire + - **Error fidelity** (ADR-023): HTTP_ prefix for imported codes, + no collision with protocol codes + - **Browsers/MCP clients are not peers** (ADR-034 §4, ADR-044 §5): + no PeerId, connection-local overlay + +5. **Pattern consistency across the crate**: + - `GatewayDispatch` is a concrete struct (not a trait) — shared by to_openapi, to_mcp + - Auth middleware is shared (HTTP routes + to_mcp rmcp service) + - `SharedHttpClient` is ArcSwap-wrapped (rebuild-and-swap, same pattern as ConfigIdentityProvider) + - Error mapping is a free function (not a trait method) + - `OperationAdapter` implementations (from_openapi, from_mcp) follow the same shape: + parse/discover → construct HandlerRegistration bundles → return + - `to_*` projections (to_openapi, to_mcp) are pure (consume registry, don't produce entries) + - `from_*` adapters (from_openapi, from_mcp) are OperationAdapter impls (produce entries) + +6. 
**ADR conformance (full list)**: + - ADR-003 Am. 1: alknet-http depends on alknet-call (protocol-foundation) + - ADR-004: Bearer auth via resolve_from_token + - ADR-010: stealth mode (HTTP handler on standard ALPNs, decoy) + - ADR-012: stream-agnostic correlation (Dispatcher over WS) + - ADR-014: no-env-vars, no secret material in responses + - ADR-015: Internal ops → 404 from wire + - ADR-016: abort cascade on disconnect + - ADR-017: OperationAdapter trait, to_* are projections + - ADR-022: leaf provenance (from_openapi, from_mcp) + - ADR-023: error fidelity, HTTP_ prefix + - ADR-024: Layer 2 connection-local overlay (WS browser ops) + - ADR-029 §3: AccessControl is sole gate (no remote_safe) + - ADR-032: forwarded_for: None for wire-ingress + - ADR-034 §4: browsers/MCP clients are not peers + - ADR-036: /healthz raw route, stealth, error mapping (non-routing clauses survive) + - ADR-037: MCP streamable HTTP only (no stdio) + - ADR-039: one crate for server + client host + - ADR-041: to_mcp 4-tool gateway + - ADR-042: to_openapi 5-endpoint gateway + - ADR-044: h3 deferred, WS is v1 browser path, no length prefix + - ADR-045: to_openapi info.version semver + - ADR-046: extra_routes custom routes + - ADR-047: gateway is sole invoke path (no direct-call surface) + - ADR-048: WS carries native session, not gateway shape + +7. **What's NOT in the crate** (verify absence): + - No `h3`/WebTransport handler (deferred per ADR-044) + - No `from_wss` adapter (out of scope, websocket.md §"Future") + - No stdio MCP transport (ADR-037) + - No per-operation `POST /{service}/{op}` direct-call surface (ADR-047) + - No traditional per-operation-paths OpenAPI projection (additive, ADR-042 §5, out of scope) + - No env-var-based client config (ADR-014) + +8. 
**Test coverage (full crate)**: + - All unit tests pass + - All integration tests pass + - `cargo test -p alknet-http` (default features) succeeds + - `cargo test -p alknet-http --features mcp` succeeds + - `cargo test -p alknet-call` succeeds (no regressions from transport abstraction) + +9. **Build cleanliness**: + - `cargo fmt --check -p alknet-http` passes + - `cargo clippy -p alknet-http` passes with no warnings + - `cargo clippy -p alknet-http --features mcp --all-targets` passes with no warnings + - `cargo build -p alknet-http` succeeds + - `cargo build -p alknet-http --features mcp` succeeds + +## Acceptance Criteria + +- [ ] Crate structure complete (Cargo.toml, lib.rs, all modules) +- [ ] Feature gates correct (default h2+http1, mcp optional, h3 absent) +- [ ] Feature gate isolation verified (MCP code not compiled without mcp) +- [ ] All dependencies correct (alknet-core, alknet-call, axum, reqwest, hyper, rmcp) +- [ ] No wtransport/h3 dependency (deferred per ADR-044) +- [ ] No env-var-based config (ADR-014) +- [ ] No-env-vars invariant holds across all handlers +- [ ] No secret material in responses (ADR-014) +- [ ] AccessControl is the sole gate (ADR-029 §3) +- [ ] Internal ops → NOT_FOUND from wire (ADR-015) +- [ ] Error fidelity (ADR-023 — HTTP_ prefix, no collision) +- [ ] Browsers/MCP clients are not peers (ADR-034 §4, ADR-044 §5) +- [ ] GatewayDispatch is a concrete struct (not a trait) +- [ ] Auth middleware shared (HTTP routes + to_mcp) +- [ ] SharedHttpClient is ArcSwap-wrapped +- [ ] All ADRs conformed to (003-048, full list in checklist) +- [ ] No h3/WebTransport handler (deferred) +- [ ] No from_wss adapter (out of scope) +- [ ] No stdio MCP transport (ADR-037) +- [ ] No direct-call surface (ADR-047) +- [ ] No traditional per-operation-paths OpenAPI (out of scope) +- [ ] `cargo fmt --check -p alknet-http` passes +- [ ] `cargo clippy -p alknet-http` passes with no warnings +- [ ] `cargo clippy -p alknet-http --features mcp --all-targets` 
passes with no warnings +- [ ] `cargo test -p alknet-http` passes +- [ ] `cargo test -p alknet-http --features mcp` passes +- [ ] `cargo test -p alknet-call` passes (no regressions) + +## References + +- docs/architecture/crates/http/README.md — crate index +- docs/architecture/crates/http/overview.md — overview, dependencies, feature gates +- docs/architecture/crates/http/http-server.md — HttpAdapter +- docs/architecture/crates/http/http-adapters.md — from_openapi, to_openapi +- docs/architecture/crates/http/http-mcp.md — from_mcp, to_mcp +- docs/architecture/crates/http/websocket.md — WebSocket path +- docs/architecture/crates/http/webtransport.md — deferred (verify absence) +- docs/research/alknet-http-gateway-factoring/findings.md — gateway factoring +- docs/architecture/decisions/ (all relevant ADRs: 003-048) + +## Notes + +> This is the final quality checkpoint. It runs after the three phase +> reviews (review-http, review-websocket, review-mcp) and verifies the +> crate as a whole. The key cross-cutting concerns: (1) the no-env-vars +> invariant holds across ALL handlers (from_openapi, from_mcp), (2) the +> GatewayDispatch shared spine is a concrete struct used by both +> to_openapi and to_mcp, (3) the auth middleware is shared between HTTP +> routes and the to_mcp rmcp service, (4) feature gate isolation is +> correct (MCP code not compiled without mcp, h3 absent), (5) no +> regressions in alknet-call from the transport abstraction. Verify the +> absence of the deferred/out-of-scope items (h3, from_wss, stdio, +> direct-call surface, traditional per-operation-paths OpenAPI). If +> deviations are found, document and fix before considering the crate +> complete. 
+ +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/review-http.md b/tasks/http/review-http.md new file mode 100644 index 0000000..da1ca85 --- /dev/null +++ b/tasks/http/review-http.md @@ -0,0 +1,166 @@ +--- +id: http/review-http +name: Review alknet-http server surface + OpenAPI adapters for spec conformance +status: pending +depends_on: [http/server/gateway-endpoints, http/adapters/to-openapi, http/adapters/from-openapi, http/server/healthz-decoy] +scope: broad +risk: low +impact: phase +level: review +--- + +## Description + +Review the alknet-http server surface and OpenAPI adapters for spec +conformance, pattern consistency, and correctness. This is the quality +checkpoint for the HTTP server + gateway + OpenAPI adapter work — the +core of the crate. + +### Review Checklist + +1. **HttpAdapter conformance** (http-server.md): + - `HttpAdapter` struct with `identity_provider`, `registry`, `decoy`, `extra_routes` + - `DecoyConfig` enum with `NotFound` (default), `StaticSite`, `Redirect` + - `ProtocolHandler::alpn()` returns `http/1.1` or `h2` + - `handle()` branches on `connection.remote_alpn()` for HTTP framing + - axum over QUIC bidirectional stream (`accept_bi` → hyper `TokioIo` → axum router) + - Router built once at construction, cloned per connection (Arc clone) + - `h3` ALPN not registered (deferred per ADR-044) + - Custom routes (`extra_routes`) merged via `Router::merge` (ADR-046) + - Default surface reserved paths take precedence on collision + +2. 
**Gateway endpoints conformance** (http-server.md, http-adapters.md): + - 5 fixed gateway endpoints (`/search`, `/schema`, `/call`, `/batch`, `/subscribe`) + - No per-operation `POST /{service}/{op}` direct-call surface (ADR-047) + - `/call` body is flat JSON `{ operation, input }` + - `/call` dispatches via `GatewayDispatch::invoke` (shared spine) + - `Internal` ops → 404 on `/call` (ADR-015) + - `External` op + unauthorized → 403; + no identity → 401 + - `/search` dispatches `services/list` (AccessControl-filtered) + - `/schema` dispatches `services/schema` + - `/batch` is a loop over `invoke` (array of results in order) + - `/subscribe` is SSE (`text/event-stream`, `call.responded` → `data:` frames) + - `/subscribe` disconnect → `call.aborted` cascade (ADR-016) + +3. **Error mapping conformance** (http-server.md, ADR-023): + - `NOT_FOUND` → 404, `FORBIDDEN` → 401/403, `INVALID_INPUT` → 422, `TIMEOUT` → 504, `INTERNAL` → 500 + - Operation-level code with `http_status` → declared status + - Operation-level code without `http_status` → 500 + - `HTTP_` prefix for imported codes (no collision with protocol codes) + - `retryable` → `Retry-After` hint for 503/429-class + +4. **Auth conformance** (http-server.md, ADR-004): + - Bearer-only (`Authorization: Bearer` → `resolve_from_token`) + - Shared middleware stashes `Option` in request extensions + - `ResolvedIdentity` extractor reads from extensions + - `connection.set_identity(identity)` for observability (OQ-11) + - No `std::env::var` reads (no-env-vars invariant) + +5. **`/healthz` and decoy conformance** (http-server.md): + - `/healthz` is raw (no auth, no call protocol, no OperationContext) + - `/healthz` returns 200 + "ok" + - Decoy fallback for unknown paths + - `DecoyConfig::NotFound` default (fake nginx 404, no alknet leak) + - Custom routes take precedence over decoy + +6. 
**`to_openapi` conformance** (http-adapters.md, ADR-042/045): + - 5 fixed gateway endpoints in the doc (not per-operation paths) + - `info.version` semver tracks gateway contract (initial 1.0.0) + - Per-caller operation surface NOT preloaded (discovered via `/search`) + - `/call` responses include protocol-level + operation-level errors + - `HTTP_`-prefixed codes projected correctly + - Pure projection (consumes registry, does not produce entries) + +7. **`from_openapi` conformance** (http-adapters.md, ADR-017/023): + - `OperationAdapter` impl (`async fn import`) + - Parse OpenAPI doc (`$ref` resolution, buildInputSchema/buildOutputSchema) + - `operationId` (or generated name) → `spec.name` + - `op_type` detected from method + response content type + - `visibility` = `Internal` (ADR-015) + - `provenance` = `FromOpenAPI`, leaf (ADR-022) + - Error codes prefixed `HTTP_` (ADR-023) + - Forwarding handler: reqwest via `SharedHttpClient` + - No-env-vars: reads `context.capabilities`, never `std::env::var` (ADR-014) + - SSE parsing for `Subscription` forwarding + - `HttpAuthScheme` (Bearer, ApiKey, Basic) + +8. **Shared HTTP client conformance** (http-adapters.md, OQ-40): + - `ClientWithMiddleware` (not bare `reqwest::Client`) + - `RetryTransientMiddleware` + inlined `RetryAfterMiddleware` + - Bounded `RetryAfterMiddleware` storage (no unbounded growth) + - ArcSwap hot-reload (rebuild-and-swap) + - Per-request credential injection (not at construction) + - No env-var-based client config + +9. **GatewayDispatch conformance** (research §5.1): + - Concrete struct (not a trait) + - `resolve_bearer` + `invoke` → `ResponseEnvelope` + - Root `OperationContext`: `internal: false`, `forwarded_for: None`, fresh `request_id` + - `handler_identity` from registration bundle + - No `into_wire()` method (per-gateway mapping stays out) + - No streaming abstraction (per-gateway) + +10. 
**Security constraints**: + - No secret material in HTTP response bodies (ADR-014) + - Capabilities not serialized into responses + - No-env-vars invariant (from_openapi reads context.capabilities) + - Internal ops → 404 (don't leak existence) + - AccessControl is the sole authorization gate + +11. **Pattern consistency**: + - GatewayDispatch is a struct, not a trait (research recommendation) + - Auth middleware shared between HTTP routes and to_mcp (research §4.4) + - Error mapping is a free function (not a trait method) + - SharedHttpClient is ArcSwap-wrapped (same pattern as ConfigIdentityProvider) + +12. **Test coverage**: + - Unit tests for error mapping (all codes, 401/403 split) + - Unit tests for auth middleware (valid/absent/malformed Bearer) + - Unit tests for GatewayDispatch (invoke, services/list filtering) + - Unit tests for to_openapi (5 paths, info.version, error projection) + - Unit tests for from_openapi (parse, operationId, op_type, error codes) + - Integration tests for gateway endpoints (call, search, schema, batch, subscribe) + - Integration tests for from_openapi forwarding (no-env-vars, SSE) + +## Acceptance Criteria + +- [ ] HttpAdapter matches http-server.md (struct, DecoyConfig, ProtocolHandler, axum over QUIC) +- [ ] Gateway endpoints match http-server.md (5 endpoints, no direct-call surface, ADR-047) +- [ ] Error mapping matches ADR-023 (all codes, HTTP_ prefix, 401/403 split) +- [ ] Auth matches ADR-004 (Bearer-only, shared middleware, set_identity) +- [ ] /healthz is raw; decoy fallback works; custom routes take precedence +- [ ] to_openapi matches ADR-042/045 (5 endpoints, info.version, per-caller via /search) +- [ ] from_openapi matches http-adapters.md (OperationAdapter, no-env-vars, HTTP_) +- [ ] SharedHttpClient matches OQ-40 (ClientWithMiddleware, retry, RetryAfter, ArcSwap) +- [ ] GatewayDispatch is a concrete struct (not a trait), shared spine correct +- [ ] No secret material in HTTP responses (ADR-014) +- [ ] No-env-vars 
invariant verified (no std::env::var in from_openapi) +- [ ] Internal ops → 404 (don't leak existence) +- [ ] AccessControl is the sole authorization gate +- [ ] Test coverage adequate for all functionality +- [ ] `cargo fmt --check -p alknet-http` passes +- [ ] `cargo clippy -p alknet-http` passes with no warnings +- [ ] All tests pass + +## References + +- docs/architecture/crates/http/README.md +- docs/architecture/crates/http/overview.md +- docs/architecture/crates/http/http-server.md +- docs/architecture/crates/http/http-adapters.md +- docs/research/alknet-http-gateway-factoring/findings.md +- docs/architecture/decisions/ (relevant ADRs: 004, 010, 014, 015, 017, 022, 023, 036, 039, 042, 045, 046, 047) + +## Notes + +> This is the quality checkpoint for the HTTP server + gateway + OpenAPI +> adapter work — the core of the crate. The review should verify that +> the gateway is the sole invoke path (ADR-047), the error mapping is +> faithful (ADR-023), the no-env-vars invariant holds (ADR-014), and the +> GatewayDispatch shared spine is a concrete struct (not a trait, per +> the research recommendation). If deviations are found, document and +> fix before considering the server surface complete. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/review-mcp.md b/tasks/http/review-mcp.md new file mode 100644 index 0000000..0926d89 --- /dev/null +++ b/tasks/http/review-mcp.md @@ -0,0 +1,161 @@ +--- +id: http/review-mcp +name: Review MCP adapters for ADR-037/041 conformance (streamable HTTP, 4-tool gateway, output handling) +status: pending +depends_on: [http/adapters/from-mcp, http/adapters/to-mcp] +scope: moderate +risk: low +impact: phase +level: review +--- + +## Description + +Review the MCP adapters (`from_mcp`, `to_mcp`) for spec conformance, +pattern consistency, and correctness. This is the quality checkpoint +for the feature-gated MCP work. + +### Review Checklist + +1. 
**`from_mcp` conformance** (http-mcp.md): + - `FromMCP` struct with `endpoint`, `auth_token`, `namespace` + - `OperationAdapter` impl (`async fn import`) + - Connects via rmcp `StreamableHttpClientTransport::from_uri(endpoint)` + - Connection failure → `AdapterError::DiscoveryFailed` + - 401 → `AdapterError::Unauthorized` + - Calls `tools/list` → MCP tools + - For each tool: `HandlerRegistration` with `FromMCP` provenance + - `spec.name` = tool name (or `namespace/tool_name`) + - `spec.op_type` = `Mutation` (MCP tools are call/response) + - `spec.visibility` = `Internal` (ADR-015) + - `spec.input_schema` = tool's `inputSchema` + - `spec.output_schema` = declared `outputSchema` if present, else `ContentBlock` union + - `provenance` = `FromMCP`, `composition_authority: None`, `scoped_env: None` (ADR-022) + - `capabilities` = bearer token (injected at registration) + +2. **`from_mcp` output handling** (http-mcp.md §"Output handling"): + - `structured_content` present → use as result (validated against `output_schema`) + - `structured_content` absent → map `content: Vec` to `ContentBlock` union + - No heuristic `JSON.parse` of text blocks (carry as `ContentBlock`) + - `isError: true` → `CallError` (not the output handling path) + - rmcp client connection maintained for registration lifetime + +3. **`from_mcp` no-env-vars** (ADR-014): + - Handler reads `context.capabilities`, never `std::env::var` + - Bearer token from `Capabilities`, not env vars + +4. 
**`to_mcp` conformance** (http-mcp.md, ADR-041): + - Implements rmcp `ServerHandler` trait (`call_tool`, `list_tools`) + - `tools/list` returns 4 fixed gateway tools (`search`, `schema`, `call`, `batch`) + - `tools/list` does NOT return registry's operations (discovered via `search`) + - `search` → dispatches `services/list` via `GatewayDispatch::invoke` + - `search` results are `AccessControl::check(identity)`-filtered + - `Subscription` ops filtered out of `search` results (ADR-041 §2) + - `schema` → dispatches `services/schema` via `GatewayDispatch::invoke` + - `call` → dispatches via `GatewayDispatch::invoke` (shared spine) + - `call` result → `CallToolResult::structured(value)` for `Ok` + - `call` error → `CallToolResult::structured_error(details)` for `Err(CallError)` + - `batch` → loop over `invoke`, returns array + - `to_mcp` is a pure projection (consumes registry, does not produce entries) + +5. **`to_mcp` auth** (http-mcp.md, research §4.4): + - Bearer auth via shared `bearer_auth_middleware` (applied around `nest_service`) + - `Identity` read from `RequestContext.extensions` inside `call_tool` + - Identity survives rmcp framing (research §6 #2 — confirmed by spike) + - MCP client has no `PeerId` (not an alknet peer, ADR-034 §4) + - `AccessControl` gates `search` results and `call` dispatch + +6. **`to_mcp` rmcp integration** (http-mcp.md, research §4): + - `StreamableHttpService` nested into axum `Router` at `/mcp` + - `ServerHandler` impl is a gateway service (4 fixed tools) + - rmcp owns JSON-RPC framing, session management, SSE priming + - `to_mcp` owns `call_tool`/`list_tools` dispatch + `CallToolResult` mapping + +7. **Streamable HTTP only** (ADR-037): + - stdio transport NOT built (not a dependency, not optional, not behind a feature) + - `mcp` feature pulls in rmcp with streamable HTTP transport features only + - `transport-child-process` is not a dependency + +8. 
**Feature gate isolation**: + - `from_mcp`/`to_mcp` compile only with `mcp` feature + - `cargo check -p alknet-http` (no `mcp`) succeeds — MCP code not compiled + - `cargo check -p alknet-http --features mcp` succeeds + +9. **Shared dispatch spine** (research §5.1): + - `to_mcp` `call` uses `GatewayDispatch::invoke` (same spine as `to_openapi`) + - `ResponseEnvelope` → `CallToolResult` mapping is `to_mcp`-specific + - No `GatewayDispatch` trait (concrete struct) + - Identity resolution, context construction, invoke are shared (security axis) + - Wire framing, discovery, streaming are per-gateway (presentation axis) + +10. **Error fidelity** (ADR-023): + - `from_mcp` error_schemas from MCP tool error descriptions + - `to_mcp` `call` error → `CallToolResult::structured_error` with typed `details` + - No collision with protocol-level codes + +11. **Security constraints**: + - No-env-vars invariant (from_mcp reads context.capabilities, ADR-014) + - MCP clients are not peers (no PeerId, ADR-034 §4) + - AccessControl gates search results and call dispatch + - No secret material in CallToolResult + +12. 
**Test coverage**: + - Unit tests for from_mcp (import, outputSchema present/absent, isError) + - Unit tests for to_mcp (tools/list returns 4, search/schema/call/batch) + - Integration test: MCP client → search → schema → call round-trip + - Integration test: Bearer auth gates to_mcp service + - Integration test: no-env-vars (no std::env::var in from_mcp) + - Feature gate: cargo check without mcp succeeds + +## Acceptance Criteria + +- [ ] from_mcp matches http-mcp.md (OperationAdapter, tools/list, outputSchema handling) +- [ ] from_mcp output handling: structuredContent-preferred, ContentBlock fallback, no JSON.parse +- [ ] from_mcp no-env-vars: reads context.capabilities, never std::env::var +- [ ] to_mcp matches ADR-041 (4-tool gateway, not one tool per operation) +- [ ] to_mcp Subscription excluded from search results +- [ ] to_mcp uses GatewayDispatch::invoke (shared spine) +- [ ] to_mcp CallToolResult mapping correct (structured/structured_error) +- [ ] to_mcp auth: shared middleware, Identity survives rmcp framing +- [ ] Streamable HTTP only (ADR-037 — stdio NOT built) +- [ ] Feature gate isolation (mcp code not compiled without mcp feature) +- [ ] No GatewayDispatch trait (concrete struct, research recommendation) +- [ ] Error fidelity (ADR-023 — no collision with protocol codes) +- [ ] MCP clients are not peers (no PeerId, ADR-034 §4) +- [ ] AccessControl gates search and call +- [ ] No secret material in CallToolResult +- [ ] Test coverage adequate for all MCP functionality +- [ ] `cargo fmt --check -p alknet-http --features mcp` passes +- [ ] `cargo clippy -p alknet-http --features mcp` passes with no warnings +- [ ] `cargo test -p alknet-http --features mcp` passes +- [ ] `cargo check -p alknet-http` (no mcp) succeeds + +## References + +- docs/architecture/crates/http/http-mcp.md — full MCP spec +- docs/research/alknet-http-gateway-factoring/findings.md — §4 (rmcp constraints), §4.4 (auth sharing) +- 
docs/architecture/decisions/037-mcp-stdio-transport-exclusion.md — ADR-037 +- docs/architecture/decisions/041-mcp-tool-gateway-pattern.md — ADR-041 +- docs/architecture/decisions/017-call-protocol-client-and-adapter-contract.md — ADR-017 §5 +- docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 +- docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 +- docs/architecture/decisions/034-outgoing-only-x509-and-three-peer-roles.md — ADR-034 §4 +- /workspace/rust-sdk/ — rmcp SDK (ServerHandler, StreamableHttpService, CallToolResult) + +## Notes + +> The MCP adapters are feature-gated and the most SDK-coupled part of +> the crate (rmcp integration). The review should verify: (1) streamable +> HTTP only (no stdio, ADR-037), (2) the 4-tool gateway pattern (ADR-041, +> not one tool per operation), (3) the output handling +> (structuredContent-preferred, ContentBlock fallback, no JSON.parse), +> (4) the shared dispatch spine is used for to_mcp's call (same spine as +> to_openapi), (5) the auth middleware is shared (Identity survives rmcp +> framing — research §6 #2), (6) the no-env-vars invariant holds, (7) +> feature gate isolation (MCP code not compiled without mcp). If +> deviations are found, document and fix before considering the MCP +> adapters complete. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/review-websocket.md b/tasks/http/review-websocket.md new file mode 100644 index 0000000..40aa934 --- /dev/null +++ b/tasks/http/review-websocket.md @@ -0,0 +1,154 @@ +--- +id: http/review-websocket +name: Review WebSocket path for ADR-044/048 conformance (native session, no length prefix, browsers-not-peers) +status: pending +depends_on: [http/websocket/connection-overlay] +scope: moderate +risk: low +impact: phase +level: review +--- + +## Description + +Review the WebSocket path for spec conformance, pattern consistency, +and correctness. 
This is the quality checkpoint for the browser +bidirectional path — the most architecturally subtle part of the crate +(browsers are not peers, the WS path carries the native session not the +gateway shape, no length prefix). + +### Review Checklist + +1. **Dispatcher transport abstraction conformance** (ADR-012): + - `Dispatcher::dispatch_requested` is `pub` (was `pub(crate)`) + - Abort-handling method is `pub` (extracted from `handle_stream`) + - `CallConnection` constructible from non-QUIC source + - Non-QUIC `CallConnection` holds overlay + pending + identity + - Existing QUIC path (`CallAdapter`, `CallClient`) unchanged — no regressions + - Full `alknet-call` test suite passes (no regressions in the cross-crate change) + +2. **WS upgrade handler conformance** (websocket.md, ADR-044/048): + - Upgrade route at `/alknet/call` (default, ADR-046 collision rule) + - Bearer auth on upgrade request (shared `bearer_auth_middleware`) + - No token → 401 (upgrade rejected) + - Insufficient scopes → 403 at call time (not upgrade time) + - Resolved identity stored on `CallConnection` + - Upgrade works over HTTP/1.1 (RFC 6455) and HTTP/2 (RFC 8441) + - Handler does not branch on HTTP version (WS frame stream same post-upgrade) + +3. **Framing conformance** (websocket.md §"Framing", ADR-044 Assumption 1): + - Binary WS message = one `EventEnvelope` (JSON serde) + - **No length prefix** (WS message boundary is delimiter, unlike QUIC's 4-byte prefix) + - No `FrameFramedReader`/`FrameFramedWriter` on the WS path + - Text WS messages rejected (protocol-level close) + - Binary payloads follow base64-as-JSON-string convention (same as QUIC) + +4. 
**Dispatch conformance** (websocket.md §"Dispatch", ADR-012): + - `call.requested` → `Dispatcher::dispatch_requested` (the pub API) + - `AccessControl::check(identity)` gates every `call.requested` + - `FORBIDDEN` → `call.error` (before handler runs) + - `call.responded`/`call.completed`/`call.aborted` correlated by `id` via `PendingRequestMap` + - Response `EventEnvelope` frames written as binary WS messages + - `call.aborted` → the pub abort-handling method + - No `RemoteFilter`/`remote_safe` (retired by ADR-029 §3) + +5. **Bidirectionality conformance** (websocket.md §"Bidirectionality", ADR-043 §2): + - Both sides can send `call.requested` (native call-protocol bidirectionality) + - Hub can call browser-registered ops via overlay + - Browser with no registered ops → server→client unused (use-case scoping) + +6. **Connection-local overlay conformance** (websocket.md §"Connection-local overlay", ADR-024/034/044): + - Browser-registered ops land in `CallConnection`'s Layer 2 overlay (not `PeerCompositeEnv`) + - No `PeerId` for browser (no `PeerEntry`, no peer-graph membership) + - Hub's outgoing `call.requested` routes through `overlay_env()` (not `PeerRef::Specific`) + - `PeerRef::Specific("browser-X")` → routes to nothing (no `PeerEntry`) + - Overlay dropped on WS close (no explicit deregistration) + - `AccessControl` on browser ops gates hub's calls + +7. **Browsers-are-not-peers rationale** (websocket.md §"Browsers are not alknet peers", ADR-044 §5): + - No stable cryptographic identity (bearer token, not fingerprints) + - Ephemeral (close tab → overlay dies) + - Not addressable from other nodes (no `PeerEntry`) + - "Peer" = addressable peer-graph node, not "any endpoint that exchanges calls" + +8. 
**Streaming conformance** (websocket.md §"Streaming"): + - `Subscription` over WS → `call.responded` as binary WS messages (no SSE) + - `call.completed` closes subscription; `call.aborted` closes with error + - WS disconnect mid-subscription → `call.aborted` cascade (ADR-016) + +9. **ADR conformance**: + - ADR-012: stream-agnostic correlation (Dispatcher runs unchanged) + - ADR-016: abort cascade on disconnect + - ADR-024: Layer 2 connection-local overlay + - ADR-029 §3: AccessControl::check is sole gate (no remote_safe) + - ADR-034 §4: browsers are not peers (amended by ADR-044 §5) + - ADR-044: WS is v1 browser path, no length prefix, no h3 + - ADR-048: WS carries native session, not gateway shape + +10. **Security constraints**: + - AccessControl::check(identity) is the sole authorization gate + - No secret material on the WS path (ADR-014) + - Internal ops → NOT_FOUND (don't leak existence) + - Abort cascade on disconnect (ADR-016) + +11. **Test coverage**: + - Integration test: WS upgrade → call.requested → call.responded round-trip + - Integration test: no Bearer token → 401 + - Integration test: AccessControl denied → call.error FORBIDDEN + - Integration test: Subscription over WS → call.responded + call.completed + - Integration test: WS disconnect mid-subscription → call.aborted cascade + - Integration test: text WS message → protocol close + - Integration test: bidirectional (hub calls browser-registered op) + - Integration test: PeerRef::Specific("browser-X") → NOT_FOUND + - alknet-call tests pass (no regressions from the transport abstraction change) + +## Acceptance Criteria + +- [ ] Dispatcher transport abstraction: pub API, non-QUIC CallConnection, no regressions +- [ ] WS upgrade: /alknet/call, Bearer auth, 401 on no token, HTTP/1.1 + HTTP/2 +- [ ] Framing: binary WS = EventEnvelope, no length prefix, text rejected +- [ ] Dispatch: call.requested → dispatch_requested, AccessControl gates, correlation by id +- [ ] Bidirectionality: both sides can 
call.requested, hub calls browser ops via overlay +- [ ] Connection-local overlay: no PeerId, no PeerEntry, overlay dies on close +- [ ] Browsers-not-peers: no stable identity, ephemeral, not addressable +- [ ] Streaming: native call.responded (no SSE), abort cascade on disconnect +- [ ] All ADRs conformed to (012, 016, 024, 029, 034, 044, 048) +- [ ] AccessControl is the sole authorization gate +- [ ] No secret material on WS path +- [ ] Internal ops → NOT_FOUND +- [ ] Test coverage adequate for all WS functionality +- [ ] `cargo fmt --check -p alknet-http` passes +- [ ] `cargo clippy -p alknet-http` passes with no warnings +- [ ] `cargo test -p alknet-call` passes (no regressions) +- [ ] All tests pass + +## References + +- docs/architecture/crates/http/websocket.md — full WS spec +- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 +- docs/architecture/decisions/048-websocket-native-session-not-gateway.md — ADR-048 +- docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012 +- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 +- docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 +- docs/architecture/decisions/029-peer-graph-routing-model.md — ADR-029 §3 +- docs/architecture/decisions/034-outgoing-only-x509-and-three-peer-roles.md — ADR-034 §4 +- /workspace/@alkdev/pubsub/src/event-target-websocket-client.ts — prior art + +## Notes + +> The WebSocket path is the most architecturally subtle part of the +> crate. The review should verify: (1) the Dispatcher transport +> abstraction didn't regress the QUIC path (run alknet-call tests), (2) +> the WS path carries the native EventEnvelope session not the gateway +> shape (ADR-048), (3) no length prefix (ADR-044 Assumption 1), (4) +> browsers are not peers (no PeerId, connection-local overlay, ADR-034 §4 +> + ADR-044 §5), (5) AccessControl is the sole gate (ADR-029 §3), (6) +> abort cascade on disconnect (ADR-016). 
The "browsers are not peers"
+> rationale is load-bearing — verify the three grounds (no stable
+> identity, ephemeral, not addressable) are reflected in the
+> implementation. If deviations are found, document and fix before
+> considering the WS path complete.
+
+## Summary
+
+> To be filled on completion
\ No newline at end of file
diff --git a/tasks/http/server/bearer-auth-middleware.md b/tasks/http/server/bearer-auth-middleware.md
new file mode 100644
index 0000000..5d6a124
--- /dev/null
+++ b/tasks/http/server/bearer-auth-middleware.md
@@ -0,0 +1,179 @@
+---
+id: http/server/bearer-auth-middleware
+name: Implement shared Bearer auth middleware (resolve_from_token, stash Identity in request extensions)
+status: pending
+depends_on: [http/server/http-adapter]
+scope: narrow
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the shared Bearer auth axum middleware in
+`src/server/auth.rs`. This is the auth layer shared by the HTTP gateway
+endpoints AND the `to_mcp` rmcp service (research §4.4: "the auth
+middleware is shareable now"). One axum layer resolves the bearer token
+and stashes `Option<Identity>` in request extensions; the `to_openapi`
+route handlers read it from axum state/extractors, and the `to_mcp`
+`call_tool` handler reads it from rmcp's `RequestContext.extensions`
+(rmcp injects `http::request::Parts` into extensions — research §4.4,
+`tower.rs:487-521, 1086-1097`).
+
+### The middleware (http-server.md §"Auth")
+
+Inbound HTTP auth is `Authorization: Bearer <token>`, resolved via
+`IdentityProvider::resolve_from_token()` (the auth.md handler table:
+`HttpAdapter`, Bearer header, `resolve_from_token`). Bearer-only is the
+auth mechanism for the default surface; other HTTP auth schemes (Basic,
+API key in query param) are not implemented and would be added as axum
+middleware (two-way door).
+
+```rust
+/// Axum middleware that resolves the `Authorization: Bearer` header via
+/// `IdentityProvider::resolve_from_token()` and stashes the resolved
+/// `Option<Identity>` in request extensions. Shared by the HTTP gateway
+/// endpoints and the to_mcp rmcp service (research §4.4).
+pub async fn bearer_auth_middleware(
+    State(identity_provider): State<Arc<dyn IdentityProvider>>,
+    mut request: Request,
+    next: Next,
+) -> Response {
+    let identity = extract_bearer_identity(&request, &identity_provider);
+    request.extensions_mut().insert(identity);
+    next.run(request).await
+}
+
+/// Extract the `Authorization: Bearer <token>` header and resolve it to
+/// an `Option<Identity>`. Returns `None` if no token is present (the
+/// request proceeds unauthenticated; the route handler / AccessControl
+/// decides whether to reject). Returns `None` if the token is present
+/// but resolution fails (treat as unauthenticated, not as an error —
+/// matches the CallAdapter's per-request identity resolution behavior).
+pub fn extract_bearer_identity(
+    request: &Request,
+    identity_provider: &dyn IdentityProvider,
+) -> Option<Identity> {
+    let header = request.headers().get(AUTHORIZATION)?;
+    let token_str = header.to_str().ok()?.strip_prefix("Bearer ")?;
+    let token = AuthToken { raw: token_str.as_bytes().to_vec() };
+    identity_provider.resolve_from_token(&token)
+}
+```
+
+### Auth resolution behavior
+
+- A request to an operation with `AccessControl` restrictions returns
+  `401` when no token is present, or `403` when a token is present but
+  its scopes are insufficient. The call protocol's `FORBIDDEN` protocol code
+  maps to `403`; `NOT_FOUND` (Internal op) maps to `404`. (The
+  `error-mapping` task owns the status mapping; this task resolves the
+  identity and stashes it.)
+- The HTTP handler stores the resolved identity on the `Connection` for
+  observability (`connection.set_identity(identity)`), same as the call
+  protocol handler (OQ-11 resolved).
+- Bearer-only is the auth mechanism.
Basic auth, API keys in query
+  params, and other HTTP auth schemes are not implemented. A deployment
+  that needs a different auth scheme adds it as axum middleware
+  (two-way door), but the default surface is Bearer-only.
+
+### The `Identity` extractor
+
+Provide an axum extractor so route handlers can declare a
+`ResolvedIdentity` parameter and get the resolved `Option<Identity>`
+from extensions:
+
+```rust
+/// Axum extractor: the resolved bearer identity (or None if
+/// unauthenticated). Read from request extensions (stashed by
+/// `bearer_auth_middleware`).
+#[derive(Clone, Debug)]
+pub struct ResolvedIdentity(pub Option<Identity>);
+
+#[async_trait]
+impl FromRequestParts<AppState> for ResolvedIdentity {
+    type Rejection = Infallible;
+    async fn from_request_parts(parts: &mut Parts, _state: &AppState) -> Result<Self, Self::Rejection> {
+        Ok(ResolvedIdentity(parts.extensions.get::<Option<Identity>>().cloned().flatten()))
+    }
+}
+```
+
+### Shared with `to_mcp` (research §4.4)
+
+The `to_mcp` rmcp service is nested into the axum router via
+`Router::nest_service("/mcp", mcp_service)` (the `to-mcp` task). The
+Bearer auth middleware is applied as an axum layer *around* the nested
+service (the rmcp `simple_auth_streamhttp.rs` example shows the pattern:
+`middleware::from_fn_with_state` around `Router::nest_service`). The
+`to_mcp` `call_tool` handler reads the `Identity` from
+`RequestContext.extensions` (rmcp injects
+`http::request::Parts` into extensions — `tower.rs:487-521, 1086-1097`).
+
+A spike should confirm this extension-survives-the-rmcp-framing path
+works end-to-end — it is the load-bearing assumption for sharing the auth
+middleware (research §6 open question #2). The `Identity` stashed by the
+axum middleware into `Parts.extensions` should be retrievable via
+`ctx.extensions.get::()` inside `call_tool`.
+
+### What this task does NOT do
+
+- **No AccessControl enforcement.** The middleware resolves identity;
+  the route handlers / `GatewayDispatch::invoke()` enforce
+  `AccessControl::check(identity)`.
This task stashes the identity; it
+  does not reject requests (malformed `Authorization` headers are
+  treated as no-token, not as errors).
+- **No error response mapping.** The `401`/`403`/`404` status mapping is
+  the `error-mapping` task. This task resolves identity; the route
+  handler produces the `CallError`, and the error-mapping task maps it.
+- **No `to_mcp` service.** The rmcp service is the `to-mcp` task. This
+  task provides the middleware that wraps it.
+
+## Acceptance Criteria
+
+- [ ] `bearer_auth_middleware` axum middleware in `src/server/auth.rs`
+- [ ] Extracts `Authorization: Bearer <token>` header
+- [ ] Resolves via `identity_provider.resolve_from_token(&AuthToken { raw })`
+- [ ] Stashes `Option<Identity>` in request extensions
+- [ ] No token present → `None` identity (request proceeds, route handler decides)
+- [ ] Malformed `Authorization` header → `None` identity (not an error)
+- [ ] Token present but resolution fails → `None` identity (treat as unauthenticated)
+- [ ] `ResolvedIdentity` axum extractor reads from extensions
+- [ ] Middleware is `pub` and re-exported from `lib.rs`
+- [ ] Middleware applicable to both HTTP routes and nested rmcp service (research §4.4)
+- [ ] `connection.set_identity(identity)` called for observability (OQ-11)
+- [ ] No `std::env::var` reads (no-env-vars invariant)
+- [ ] Unit test: request with valid Bearer token → `Some(identity)` in extensions
+- [ ] Unit test: request with no `Authorization` header → `None` in extensions
+- [ ] Unit test: request with malformed `Authorization` → `None` in extensions
+- [ ] Unit test: request with `Basic` auth → `None` (Bearer-only, not an error)
+- [ ] Unit test: `ResolvedIdentity` extractor retrieves stashed identity
+- [ ] `cargo test -p alknet-http` succeeds
+- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings
+
+## References
+
+- docs/architecture/crates/http/http-server.md — Auth (§"Auth")
+- 
docs/research/alknet-http-gateway-factoring/findings.md — §4.4 (auth-extraction convergence, shareable now) +- docs/architecture/crates/core/auth.md — IdentityProvider, resolve_from_token +- docs/architecture/decisions/004-auth-as-shared-core.md — ADR-004 (Bearer → resolve_from_token) +- /workspace/rust-sdk/examples/servers/src/simple_auth_streamhttp.rs — rmcp axum middleware pattern + +## Notes + +> The auth middleware is the second small shared piece (alongside the +> dispatch spine). It is shareable between the HTTP gateway routes and +> the to_mcp rmcp service because both use axum middleware — the rmcp +> service is nested via Router::nest_service, and the middleware is +> applied around it. The load-bearing assumption is that the Identity +> stashed in Parts.extensions survives the rmcp framing and is +> retrievable via ctx.extensions.get::() inside call_tool +> (research §6 open question #2 — confirm with a spike). This task +> resolves identity and stashes it; it does not enforce AccessControl +> (that's the route handler / GatewayDispatch's job) or map errors +> (that's the error-mapping task). + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/server/gateway-endpoints.md b/tasks/http/server/gateway-endpoints.md new file mode 100644 index 0000000..0c26a23 --- /dev/null +++ b/tasks/http/server/gateway-endpoints.md @@ -0,0 +1,194 @@ +--- +id: http/server/gateway-endpoints +name: Implement 5 gateway endpoints (search/schema/call/batch/subscribe) — axum route handlers +status: pending +depends_on: [http/server/http-adapter, http/gateway/gateway-dispatch-spine, http/gateway/error-mapping, http/server/bearer-auth-middleware] +scope: broad +risk: medium +impact: component +level: implementation +--- + +## Description + +Implement the 5 fixed gateway endpoints in `src/server/gateway_routes.rs`. 
+These are the sole invoke path over HTTP (ADR-042, ADR-047): an HTTP +client invokes an operation via `POST /call` with +`{ "operation": "/fs/readFile", "input": {...} }`, discovers what it can +call via the `AccessControl`-filtered `GET /search`, and learns an +operation's shape via `GET /schema`. There is no per-operation +`POST /{service}/{op}` direct-call surface — the gateway is the invoke +path (ADR-047 supersedes ADR-036's direct-call surface). + +### The 5 endpoints (http-server.md §"HTTP-to-call dispatch", http-adapters.md §"The gateway endpoint set") + +| Endpoint | Call protocol | HTTP method | Purpose | +|----------|--------------|-------------|---------| +| `/search` | `services/list` | `GET` | List/search operations (AccessControl-filtered). Names + descriptions. | +| `/schema` | `services/schema` | `GET` | Get an operation's full `OperationSpec`. | +| `/call` | `call.requested` (Query/Mutation) | `POST` | Invoke an operation. Flat JSON body `{ operation, input }`. | +| `/batch` | multiple `call.requested` | `POST` | Invoke multiple operations. Array of `{ operation, input }`. | +| `/subscribe` | `call.requested` (Subscription) | `POST` (SSE) | Invoke a streaming operation. Body `{ operation, input }`; response `text/event-stream`. | + +### `POST /call` dispatch (http-server.md §"HTTP-to-call dispatch") + +1. The axum route handler reads the JSON body + `{ "operation": "/fs/readFile", "input": {...} }`. +2. Resolves the caller's identity from the `Authorization: Bearer` header + (via the shared `bearer_auth_middleware` — stashed in extensions as + `ResolvedIdentity`). +3. Calls `GatewayDispatch::invoke(identity, operation, input)` — the + shared dispatch spine (the `gateway-dispatch-spine` task). This builds + the root `OperationContext` (`internal: false`, `forwarded_for: None`) + and dispatches through `OperationRegistry::invoke()`. +4. The response (`ResponseEnvelope`) is serialized as the HTTP response + body (JSON). 
Errors map to HTTP status codes via the `error-mapping` + task (`call_error_to_http_response`). + +`Internal` operations (ADR-015) return `404` (`NOT_FOUND`) — the gateway +dispatches only `External` operations, and the caller discovers which +`External` operations it can call via the `AccessControl`-filtered +`/search` endpoint. This is the per-caller API surface property: an HTTP +client cannot stub its toe on a path for an operation it can't call, +because there is no per-operation path — `/search` tells it what it can +call, `/call` invokes it, and the `AccessControl` check runs on `/call` +regardless. + +### `GET /search` (AccessControl-filtered discovery) + +Dispatches `services/list` through `GatewayDispatch::invoke()` with the +resolved caller identity. The `services/list` handler (already in +`OperationRegistry`) filters by `AccessControl::check(identity)` — the +client sees only the operations it is authorized to call. Returns +operation names + descriptions (not full schemas). Query parameters for +filtering/searching are a two-way-door extension (the v1 shape is "list +all I can call"; search/filter sugar is additive). + +### `GET /schema` + +Dispatches `services/schema` through `GatewayDispatch::invoke()` with +the resolved caller identity. Returns the operation's full +`OperationSpec` (input/output JSON Schemas, error schemas). The +`AccessControl` check runs (an unauthorized caller gets `FORBIDDEN`, not +the schema). + +### `POST /batch` + +Follows the same dispatch path as `/call` with an array of +`{ operation, input }` pairs (OQ-14). `batch` is a loop over +`GatewayDispatch::invoke()` in the gateway (research §6 open question +#3 — confirm `batch` is genuinely just a loop, no shared batch-specific +state, no transactional semantics). Returns an array of results (or +errors), one per entry, in order. 
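The `/batch` behaviour described above (a plain loop over the invoke call, returning one result or error per entry, in order) can be sketched as follows. This is a minimal, dependency-free illustration under stated assumptions, not the crate's implementation: `BatchEntry`, `SketchError`, `run_batch`, and `demo_invoke` are hypothetical names, the closure stands in for `GatewayDispatch::invoke`, and inputs and outputs are plain strings rather than JSON values. It also assumes a failing entry does not stop the loop, which "one per entry" suggests but which the research's open question still has to confirm.

```rust
/// One entry of a `/batch` request body: `{ operation, input }`.
/// `input` is kept as a raw string so the sketch needs no JSON crate.
#[derive(Debug, Clone, PartialEq)]
pub struct BatchEntry {
    pub operation: String,
    pub input: String,
}

/// Hypothetical stand-in for `CallError`: a protocol code plus a message.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchError {
    pub code: &'static str,
    pub message: String,
}

/// Run every entry through `invoke`, in order. A failing entry yields an
/// `Err` in its own slot and does not stop the loop (an assumption), so
/// the output always has exactly one result per entry.
pub fn run_batch<F>(entries: &[BatchEntry], mut invoke: F) -> Vec<Result<String, SketchError>>
where
    F: FnMut(&str, &str) -> Result<String, SketchError>,
{
    entries
        .iter()
        .map(|entry| invoke(&entry.operation, &entry.input))
        .collect()
}

/// Toy dispatcher used only to exercise `run_batch`: it knows one
/// operation, `/echo`, and reports everything else as `NOT_FOUND`.
pub fn demo_invoke(operation: &str, input: &str) -> Result<String, SketchError> {
    match operation {
        "/echo" => Ok(input.to_string()),
        _ => Err(SketchError {
            code: "NOT_FOUND",
            message: format!("unknown operation {}", operation),
        }),
    }
}
```

Because the loop holds no state between entries, there is nothing batch-specific to share with other gateways; if the spike finds that batches need transactional semantics, this shape would no longer apply.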
+ +### `POST /subscribe` (SSE streaming projection) + +A `Subscription` operation invoked via the gateway's `POST /subscribe` +endpoint projects its `call.responded` stream as Server-Sent Events. The +request body is `{ operation, input }` (the same flat JSON shape as +`/call`); the response is `text/event-stream` (negotiated via +`Accept: text/event-stream` on the `POST`). The axum route handler: + +- Sets `Content-Type: text/event-stream`. +- For each `call.responded` event, writes an SSE `data:` frame (the + event's `output` serialized as JSON). +- On `call.completed`, closes the SSE stream (normal end). +- On `call.aborted`, closes the stream with an SSE error event. +- On HTTP client disconnect (detected as the response writer closing), + sends `call.aborted` for the in-flight subscription, which cascades + to descendants per ADR-016. + +This is the HTTP/1.1 + HTTP/2 streaming projection. Over WebSocket +(websocket.md), the subscription projects directly onto the WS +connection — `call.responded` events as binary WS messages, no SSE +framing. WebTransport (`h3`, deferred per ADR-044) would project onto +WebTransport bidirectional streams. + +### One-directional projection (http-server.md §"One-directional projection") + +The HTTP/1.1 + HTTP/2 surface is a **lossy, one-directional projection** +of the call protocol. HTTP is request/response: the client initiates, the +server responds. The call protocol is bidirectional — both sides can +initiate calls. The HTTP projection carries only the client→server call +direction; the server→client call direction has no HTTP expression. +`Subscription` streaming is the one partial exception — the server +streams `call.responded` frames back over the SSE response — but even +there, the *call* is client-initiated; only the *results* flow +server→client. WebSocket restores the bidirectional call model for +browsers (the `websocket/` tasks). 
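The SSE rules for `/subscribe` listed earlier in this task (a `data:` frame per `call.responded`, a plain close on `call.completed`, an error event on `call.aborted`) can be sketched as a pure function over events. This is an illustrative sketch only: `SubscriptionEvent`, `SseAction`, `sse_data`, and `project_event` are hypothetical names, payloads are assumed to be already-serialized JSON strings, and the `error` event name for the abort frame is an assumption, since the spec text only says "an SSE error event".

```rust
/// Hypothetical, simplified view of what a subscription emits.
/// Payloads are pre-serialized JSON strings (an assumption of this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum SubscriptionEvent {
    Responded(String),
    Completed,
    Aborted(String),
}

/// What the handler does with the SSE response body for one event.
#[derive(Debug, Clone, PartialEq)]
pub enum SseAction {
    /// Write this frame and keep the stream open.
    Write(String),
    /// Write this final frame, then close the stream.
    WriteAndClose(String),
    /// Close the stream without writing anything.
    Close,
}

/// Encode a payload as SSE `data:` lines. The SSE wire format requires
/// one `data:` line per payload line, so embedded newlines are split.
fn sse_data(payload: &str) -> String {
    let mut out = String::new();
    for line in payload.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    out
}

/// Map one subscription event to the action on the SSE stream.
/// Every frame ends with a blank line, which terminates the SSE event.
pub fn project_event(event: &SubscriptionEvent) -> SseAction {
    match event {
        SubscriptionEvent::Responded(output_json) => {
            SseAction::Write(format!("{}\n", sse_data(output_json)))
        }
        SubscriptionEvent::Completed => SseAction::Close,
        SubscriptionEvent::Aborted(error_json) => {
            // "error" as the event name is an assumption, not from the spec.
            SseAction::WriteAndClose(format!("event: error\n{}\n", sse_data(error_json)))
        }
    }
}
```

Keeping the mapping free of I/O makes it unit-testable without an HTTP server; the client-disconnect path (sending `call.aborted` upstream) is a separate concern and is not modelled here.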
+ +### Constraints + +- **The gateway is the sole invoke path over HTTP (ADR-042, ADR-047).** + No per-operation `POST /{service}/{op}` direct-call surface. +- **`External` operations only.** `Internal` operations return `404` on + the gateway's `/call`, matching the call protocol's `NOT_FOUND`. +- **Bearer-only auth.** Via the shared `bearer_auth_middleware`. +- **No secret material in HTTP responses.** Capabilities are used for + outbound calls (`from_openapi`), never serialized into HTTP response + bodies (ADR-014). + +## Acceptance Criteria + +- [ ] `POST /call` route handler reads `{ operation, input }` JSON body +- [ ] `/call` resolves identity via `ResolvedIdentity` extractor (shared middleware) +- [ ] `/call` dispatches via `GatewayDispatch::invoke(identity, operation, input)` +- [ ] `/call` response is `ResponseEnvelope` serialized as JSON +- [ ] `/call` errors mapped via `call_error_to_http_response` (error-mapping task) +- [ ] `Internal` op on `/call` → `404 NOT_FOUND` +- [ ] `External` op with `AccessControl` restrictions + unauthorized → `403 FORBIDDEN` +- [ ] `External` op with `AccessControl` restrictions + no identity → `401` +- [ ] `GET /search` dispatches `services/list` via `GatewayDispatch::invoke` +- [ ] `/search` results are `AccessControl::check(identity)`-filtered +- [ ] `/search` returns operation names + descriptions (not full schemas) +- [ ] `GET /schema` dispatches `services/schema` via `GatewayDispatch::invoke` +- [ ] `/schema` returns the operation's full `OperationSpec` +- [ ] `/schema` for unauthorized op → `403 FORBIDDEN` +- [ ] `POST /batch` dispatches an array of `{ operation, input }` via loop over `invoke` +- [ ] `/batch` returns an array of results (or errors), one per entry, in order +- [ ] `POST /subscribe` sets `Content-Type: text/event-stream` +- [ ] `/subscribe` writes `call.responded` events as SSE `data:` frames +- [ ] `/subscribe` closes stream on `call.completed` +- [ ] `/subscribe` writes SSE error event on 
`call.aborted` +- [ ] `/subscribe` sends `call.aborted` on HTTP client disconnect (ADR-016 cascade) +- [ ] No per-operation `POST /{service}/{op}` direct-call surface (ADR-047) +- [ ] No secret material in HTTP response bodies (ADR-014) +- [ ] Integration test: `/call` round-trip (External op → 200 + JSON body) +- [ ] Integration test: `/call` Internal op → 404 +- [ ] Integration test: `/call` unauthorized → 403 +- [ ] Integration test: `/call` unauthenticated + restricted op → 401 +- [ ] Integration test: `/search` returns only AccessControl-allowed ops +- [ ] Integration test: `/schema` returns full spec for authorized op +- [ ] Integration test: `/batch` returns array of results in order +- [ ] Integration test: `/subscribe` streams SSE events until completed +- [ ] Integration test: `/subscribe` client disconnect → abort cascade +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/http-server.md — HTTP-to-call dispatch, SSE projection, one-directional projection +- docs/architecture/crates/http/http-adapters.md — The gateway endpoint set, per-caller API surface +- docs/architecture/decisions/042-openapi-gateway-pattern.md — ADR-042 (5 fixed gateway endpoints) +- docs/architecture/decisions/047-remove-direct-call-http-surface.md — ADR-047 (gateway is sole invoke path) +- docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (Internal → 404) +- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (disconnect → abort cascade) +- docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 (no secrets in responses) + +## Notes + +> The 5 gateway endpoints are the sole HTTP invoke path (ADR-047). 
The +> /call handler delegates to GatewayDispatch::invoke (the shared spine); +> the error mapping is the error-mapping task; the auth is the shared +> bearer-auth-middleware. /subscribe is the SSE streaming projection — +> the one to_openapi-specific piece that does not go through the shared +> spine's request/response invoke (research §6 open question #5 — +> /subscribe is to_openapi-owned, not in the shared core). /batch is a +> loop over invoke (research §6 open question #3 — confirm no +> batch-specific shared state). The one-directional projection is a +> structural property of HTTP; WebSocket restores bidirectionality for +> browsers. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/server/healthz-decoy.md b/tasks/http/server/healthz-decoy.md new file mode 100644 index 0000000..110e3b5 --- /dev/null +++ b/tasks/http/server/healthz-decoy.md @@ -0,0 +1,146 @@ +--- +id: http/server/healthz-decoy +name: Implement /healthz raw route and stealth decoy fallback (DecoyConfig) +status: pending +depends_on: [http/server/http-adapter] +scope: narrow +risk: low +impact: component +level: implementation +--- + +## Description + +Implement the `/healthz` raw route and the stealth decoy fallback in +`src/server/healthz.rs` and `src/server/decoy.rs`. These are the two +non-gateway HTTP surfaces on the `HttpAdapter` router: the one raw +operational endpoint (`/healthz`) and the stealth fallback for unknown +paths (the decoy). + +### `GET /healthz` (raw route, http-server.md §"/healthz (raw route)") + +`GET /healthz` is a raw HTTP route outside the call protocol — no auth, +no operation registration, no `OperationContext`. It returns `200 OK` +with a plain-text body (e.g., `"ok"`) if the endpoint is healthy. This is +the infrastructure endpoint load balancers and orchestrators call; it +must work before identity is resolvable. + +```rust +/// GET /healthz — raw health check. No auth, no call protocol. 
+/// Returns 200 OK with plain-text body "ok" if the endpoint is healthy. +async fn healthz() -> impl IntoResponse { + (StatusCode::OK, [("content-type", "text/plain")], "ok") +} +``` + +Other operational endpoints (metrics, dashboard) are call-protocol +operations if built (`/metrics/list`, `/dashboard/view`), not raw HTTP +routes. `healthz` is the one exception (ADR-036). + +### Stealth decoy (http-server.md §"Stealth decoy") + +For paths that are not the gateway endpoints (`/search`, `/schema`, +`/call`, `/batch`, `/subscribe`), `/healthz`, `/openapi.json`, the MCP +route, the WS upgrade route, or a custom route per ADR-046, the HTTP +handler serves a decoy. The decoy is configurable (`DecoyConfig`): + +- `NotFound` — A fake `404 Not Found` (the default — matches the + reference implementation's "fake nginx 404"). +- `StaticSite { root }` — Serve a static site from a configured + directory. For deployments that want a real decoy website. +- `Redirect { to }` — Redirect to a configured URL. + +The decoy is the stealth surface: a port scanner or a client that +doesn't offer alknet ALPNs connects on `h2`/`http/1.1` and sees the +decoy. Real services use `alknet/ssh`, `alknet/call`, etc. The decoy +config is a two-way-door default (an operator picks what to serve); the +*existence* of the stealth path is fixed by ADR-010. + +Custom routes (ADR-046) take precedence over the decoy — a path matched +by a custom route is served by it, not the decoy; the decoy is the +fallback for paths matched by neither the default surface nor a custom +route. + +### The fallback handler + +```rust +/// Fallback handler for unknown paths (stealth decoy). Serves the +/// configured DecoyConfig: fake 404 (default), static site, or redirect. 
+async fn decoy_fallback( + State(decoy): State<DecoyConfig>, + request: Request, +) -> Response { + match decoy { + DecoyConfig::NotFound => fake_nginx_404(), + DecoyConfig::StaticSite { root } => serve_static(root, request).await, + DecoyConfig::Redirect { to } => redirect(to), + } +} +``` + +The `NotFound` variant should match the reference implementation's +"fake nginx 404" — a realistic 404 page that looks like a generic web +server, not an alknet-specific error. The exact body is a two-way-door +implementation detail; the one-way constraint is that it does not leak +alknet's presence (no alknet headers, no alknet error format). + +### Wiring into the router + +The `healthz` route and the `decoy_fallback` are wired into the axum +`Router` by the `http-adapter` task. This task implements the handlers; +the `http-adapter` task's router construction calls them: + +```rust +// In the http-adapter task's router construction: +let router = Router::new() + .route("/healthz", get(healthz)) // this task + .fallback(decoy_fallback) // this task + // ... gateway endpoints, /openapi.json, MCP, WS upgrade ... 
+``` + +## Acceptance Criteria + +- [ ] `GET /healthz` handler returns `200 OK` with plain-text body `"ok"` +- [ ] `/healthz` requires no auth (no Bearer token check) +- [ ] `/healthz` does not construct an `OperationContext` (raw route) +- [ ] `DecoyConfig::NotFound` serves a fake 404 (no alknet-specific headers/format) +- [ ] `DecoyConfig::StaticSite { root }` serves static files from `root` +- [ ] `DecoyConfig::Redirect { to }` returns an HTTP redirect to `to` +- [ ] `DecoyConfig::default()` returns `NotFound` +- [ ] Decoy fallback serves for paths not matched by any other route +- [ ] Custom routes (ADR-046) take precedence over decoy (decoy is fallback only) +- [ ] Gateway endpoints, `/healthz`, `/openapi.json`, MCP route, WS upgrade take precedence over decoy +- [ ] Decoy does not leak alknet presence (no alknet headers, no alknet error format) +- [ ] Unit test: `/healthz` returns 200 + "ok" +- [ ] Unit test: unknown path with `NotFound` decoy → 404 +- [ ] Unit test: unknown path with `StaticSite` decoy → static file +- [ ] Unit test: unknown path with `Redirect` decoy → redirect +- [ ] Unit test: `/healthz` works with no `Authorization` header +- [ ] Integration test: custom route matched → custom handler (not decoy) +- [ ] Integration test: unknown path not matched by custom route → decoy +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/http-server.md — /healthz (§"/healthz (raw route)"), Stealth decoy (§"Stealth decoy") +- docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010 (stealth, decoy existence) +- docs/architecture/decisions/036-http-to-call-operation-mapping.md — ADR-036 (/healthz is the one raw route) +- docs/architecture/decisions/046-assembly-layer-custom-http-routes.md — ADR-046 (custom routes take precedence over decoy) + +## Notes + +> /healthz is the one raw route — no auth, no call protocol, no +> 
OperationContext. It must work before identity is resolvable (load +> balancers call it). The decoy is the stealth surface: a port scanner +> sees the decoy, not alknet. The decoy config is a two-way-door +> (operator picks NotFound/StaticSite/Redirect); the existence of the +> stealth path is fixed by ADR-010. The NotFound variant should look +> like a generic web server's 404, not an alknet error — no alknet +> headers, no alknet format. Custom routes take precedence over the +> decoy; the decoy is the fallback for paths matched by neither the +> default surface nor a custom route. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/server/http-adapter.md b/tasks/http/server/http-adapter.md new file mode 100644 index 0000000..0ce647e --- /dev/null +++ b/tasks/http/server/http-adapter.md @@ -0,0 +1,217 @@ +--- +id: http/server/http-adapter +name: Implement HttpAdapter (ProtocolHandler for h2/http1.1) — axum over QUIC stream, ALPN branching, custom routes +status: pending +depends_on: [http/crate-init, http/gateway/gateway-dispatch-spine] +scope: broad +risk: high +impact: component +level: implementation +--- + +## Description + +Implement `HttpAdapter` in `src/server/adapter.rs`. This is the +`ProtocolHandler` implementation for the standard HTTP ALPNs (`h2`, +`http/1.1`) — the highest-risk task in the http crate. It ties together +the axum-over-QUIC-stream integration, ALPN branching, the router +construction (gateway endpoints + `/healthz` + `/openapi.json` + MCP + +custom routes + decoy), and the `extra_routes: Option` extension +point (ADR-046). + +### The struct (http-server.md §"What") + +```rust +pub struct HttpAdapter { + identity_provider: Arc, + registry: Arc, + /// The default handler for paths that are not registered operations + /// (stealth decoy). Configurable: a static site, a fake 404, a + /// redirect. Two-way-door default (ADR-010). 
+ decoy: DecoyConfig, + /// Deployment-specific routes added by the assembly layer (ADR-046). + /// None = the default surface only. Custom routes are raw HTTP, not + /// call-protocol operations; they coexist with the default surface and + /// are not described by to_openapi. + extra_routes: Option<Router>, +} + +pub enum DecoyConfig { + /// Serve a fake `404 Not Found` (the default — "fake nginx 404"). + NotFound, + /// Serve a static site from a configured directory. + StaticSite { root: PathBuf }, + /// Redirect to a configured URL. + Redirect { to: String }, +} +``` + +### ProtocolHandler impl + +```rust +#[async_trait] +impl ProtocolHandler for HttpAdapter { + fn alpn(&self) -> &'static [u8]; // returns the configured ALPN + async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError>; +} +``` + +The `HttpAdapter` registers for multiple ALPNs (`http/1.1`, `h2`). The +endpoint's `HandlerRegistry` maps each ALPN byte string to the same +adapter instance; `handle()` branches on `connection.remote_alpn()` to +pick the HTTP framing. For `http/1.1` and `h2`, the framing is hyper's +HTTP/1.1 or HTTP/2 over a QUIC bidirectional stream. WebSocket upgrade +layers on top of the same hyper connection driver — a WS upgrade is an +HTTP/1.1 or HTTP/2 request that switches protocols (the WS handler is +the `websocket/upgrade-handler` task; this task hosts the route). + +### Running axum over a QUIC stream (http-server.md §"Running axum over a QUIC stream") + +The `HttpAdapter::handle()` method for `h2`/`http/1.1`: + +1. Accepts one bidirectional stream from the QUIC connection + (`connection.accept_bi()` → `(SendStream, RecvStream)`). +2. Wraps the `(SendStream, RecvStream)` pair as a hyper + `TokioIo`-compatible duplex stream — the same byte stream hyper + expects for an HTTP connection. +3. Constructs the axum `Router` (built once at adapter construction, + cloned per connection — axum `Router` is `Clone` and cheap to clone). +4. 
Hands the duplex stream + the axum router to hyper's connection driver + (`hyper::server::conn::http1::Builder` or + `http2::Builder::serve_connection`), which reads HTTP frames, parses + them, dispatches to axum routes, and writes HTTP responses. +5. Returns when the HTTP connection closes (the client disconnects or + the stream ends). + +The axum `Router` is built once at adapter construction with the +`Arc` and `Arc` embedded in its +state; cloning the `Router` per connection clones the `Arc`s (cheap, +shared state), so every request handler has access to the registry and +identity provider through the router's state. + +### The router surface (http-server.md §"Architecture") + +The axum `Router` is the single routing surface for HTTP requests. It +contains: + +- **The `to_openapi` gateway endpoints** (`/search`, `/schema`, `/call`, + `/batch`, `/subscribe` — ADR-042). These 5 fixed endpoints are the sole + invoke path over HTTP. (Route handlers are the `gateway-endpoints` + task; this task wires the router.) +- `GET /healthz` (raw route, no auth, no call protocol). (The + `healthz-decoy` task; this task wires the route.) +- `GET /openapi.json` (serves the `to_openapi` projection). (The + `to-openapi` task; this task wires the route.) +- The stealth decoy fallback (unknown paths). (The `healthz-decoy` + task; this task wires the fallback.) +- (Feature-gated) `POST /mcp` (the `to_mcp` streamable HTTP service). (The + `to-mcp` task; this task wires the route behind the `mcp` feature gate.) +- **Deployment-specific custom routes** (ADR-046). The assembly layer + may inject an `axum::Router` of extra routes at `HttpAdapter` + construction. (This task implements the `extra_routes` merge.) + +### Custom routes (ADR-046) + +Custom routes: + +- Are **raw HTTP**, not call-protocol operations — not registered in the + `OperationRegistry`, not discoverable via `/search`, not in the + `to_openapi` gateway doc. 
+- **May** dispatch into the registry via + `OperationRegistry::invoke()` with a proper `OperationContext` + (caller identity from the resolved bearer token) — the OAI proxy + pattern. Or they may be pure HTTP (a webhook receiver, a static asset + server) that never touches the registry. +- Run under the **default Bearer-auth middleware**; a route that wants + different auth applies its own axum middleware (the deployment owns + its custom routes' middleware stack). +- **Do not collide** with the reserved default-surface paths + (`/search`, `/schema`, `/call`, `/batch`, `/subscribe`, `/healthz`, + `/openapi.json`, the MCP route) — the default surface wins on + collision; custom routes namespace away naturally (`/v1/...`). +- Are **not versioned** by `to_openapi` (ADR-045 versions the gateway + contract, not custom routes). +- Are **immutable after construction** (matches OQ-04 / ADR-010's + static-registration constraint; the `HttpAdapter` router is built once + at startup). + +The extension point is additive: a deployment that passes `None` gets +exactly the default surface. The mechanism (the constructor parameter) +is the one-way door — once downstream deployments build against it, it's +a contract (ADR-046). The specific routes a deployment adds are a +two-way door (add/remove freely). + +### ALPN branching + +The `HttpAdapter` registers for `http/1.1` and `h2`. The endpoint's +`HandlerRegistry` maps each ALPN to the same `HttpAdapter` instance; the +handler branches on `connection.remote_alpn()` to pick the right +framing. The `h3` ALPN is not registered in v1 (deferred per ADR-044). + +### What this task does NOT do + +- **No gateway route handlers.** The 5 gateway endpoints' handler logic + is the `gateway-endpoints` task. This task wires the routes into the + router and provides the router state. +- **No `/healthz` or decoy logic.** The `healthz-decoy` task implements + the healthz handler and the decoy fallback. This task wires the + routes. 
+- **No `/openapi.json` generation.** The `to-openapi` task implements + the OpenAPI doc generation. This task wires the route. +- **No MCP route.** The `to-mcp` task implements the rmcp service. This + task wires the route behind the `mcp` feature gate. +- **No WebSocket upgrade handler.** The `websocket/upgrade-handler` task + implements the WS upgrade. This task hosts the WS upgrade route on the + router (the WS task depends on this task's router). + +## Acceptance Criteria + +- [ ] `HttpAdapter` struct with `identity_provider`, `registry`, `decoy`, `extra_routes` +- [ ] `DecoyConfig` enum with `NotFound`, `StaticSite { root }`, `Redirect { to }` +- [ ] `HttpAdapter::new(identity_provider, registry)` constructor +- [ ] `HttpAdapter::with_decoy(self, decoy)` builder +- [ ] `HttpAdapter::with_extra_routes(self, routes: Router)` builder (ADR-046) +- [ ] `ProtocolHandler::alpn()` returns the configured ALPN (`http/1.1` or `h2`) +- [ ] `handle()` branches on `connection.remote_alpn()` for HTTP framing +- [ ] `handle()` accepts a QUIC bidirectional stream via `connection.accept_bi()` +- [ ] `handle()` wraps the stream as a hyper `TokioIo`-compatible duplex +- [ ] `handle()` drives hyper's `http1::Builder` or `http2::Builder::serve_connection` +- [ ] axum `Router` built once at construction, cloned per connection +- [ ] Router state holds `Arc` + `Arc` +- [ ] Custom routes (`extra_routes`) merged via `Router::merge` (ADR-046) +- [ ] Default surface reserved paths take precedence on collision with custom routes +- [ ] `h3` ALPN is not registered (deferred per ADR-044) +- [ ] `handle()` returns when the HTTP connection closes +- [ ] Unit test: `alpn()` returns `http/1.1` or `h2` per config +- [ ] Unit test: `DecoyConfig::default()` is `NotFound` +- [ ] Unit test: `with_extra_routes` merges routes without collision on reserved paths +- [ ] Integration test: `handle()` serves an HTTP request over a mock QUIC stream +- [ ] Integration test: custom route (`/v1/foo`) coexists 
with default surface +- [ ] Integration test: reserved path (`/healthz`) wins over a custom route collision +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/http-server.md — HttpAdapter, axum over QUIC, custom routes +- docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010 (stealth, ALPN dispatch) +- docs/architecture/decisions/046-assembly-layer-custom-http-routes.md — ADR-046 (extra_routes) +- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 (h3 deferred) +- docs/architecture/decisions/039-http-server-and-client-host-colocated.md — ADR-039 (one crate) + +## Notes + +> This is the highest-risk task in the http crate — the axum-over-QUIC +> integration is the merge point. The router is built once at +> construction with the registry and identity provider in its state; +> cloning per connection is cheap (Arc clone). The extra_routes +> extension point (ADR-046) is additive: None = default surface only; +> Some(routes) = default + custom, with default winning on collision. +> The h3 ALPN is not registered (deferred per ADR-044). This task wires +> the router and provides the QUIC-to-hyper bridge; the gateway +> endpoints, healthz, decoy, openapi.json, MCP, and WS upgrade route are +> wired by their respective tasks (which depend on this task's router). 
+ +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/websocket/connection-overlay.md b/tasks/http/websocket/connection-overlay.md new file mode 100644 index 0000000..ee0da6f --- /dev/null +++ b/tasks/http/websocket/connection-overlay.md @@ -0,0 +1,182 @@ +--- +id: http/websocket/connection-overlay +name: Implement connection-local Layer 2 overlay for browser-registered ops (no PeerId, ADR-024/034/044) +status: pending +depends_on: [http/websocket/upgrade-handler] +scope: moderate +risk: medium +impact: component +level: implementation +--- + +## Description + +Implement the connection-local Layer 2 overlay for browser-registered +ops in `src/websocket/overlay.rs`. This is the mechanism that gives a +browser bidirectional-call capability *without* peer-graph membership +(ADR-024, ADR-034 §4, ADR-044 §5). A browser over WebSocket has no +`PeerId`, does not enter `PeerCompositeEnv`, and any ops it registers +land in a per-`CallConnection` overlay that dies when the connection +drops. + +### Browsers are not alknet peers (websocket.md §"Browsers are not alknet peers") + +A browser over WebSocket authenticates by bearer token, gets no +`PeerId`, does not enter `PeerCompositeEnv`, and its registered ops (if +any) land in the connection-local Layer 2 overlay. The rationale, stated +in ADR-044 §5 and amending ADR-034 §4 by reference, is a load-bearing +distinction: + +**"Peer" in alknet means an addressable node in the call-protocol peer +graph** — a stable `PeerId`, reachable via `PeerRef::Specific`, whose ops +land in `PeerCompositeEnv`, whose identity is stable across reconnects. +It does *not* mean "any endpoint that exchanges calls during a live +session." A browser is the second thing but not the first, on three +concrete grounds: + +1. **No stable cryptographic identity of its own.** A `PeerEntry` is + anchored to fingerprints (Ed25519, X.509) that *the peer* presents + and the local node pins. 
A browser presents a bearer token the *hub* + issued; the "identity" is the hub's bookkeeping for that token, not + something the browser owns or that could be pinned by another node. + There is nothing to put in `PeerEntry.fingerprints`. + +2. **Ephemeral.** Close the tab → connection dies → the connection-local + Layer 2 overlay dies with it. A `PeerEntry` keyed to a browser would + be a permanently-dead entry within seconds. `PeerRef::Specific("browser-X")` + from another node would route to nothing. + +3. **Not addressable from other nodes.** `PeerRef::Specific` resolves + through `PeerEntry` → `PeerId`. Another alknet node has no way to + reach "the browser currently connected to hub-A"; the hub holds that + connection as a live `CallConnection` handle, not as a peer-graph + entry. The connection-local overlay is precisely the mechanism that + gives the browser bidirectional-call capability *without* peer-graph + membership. + +### The overlay (websocket.md §"Connection-local overlay") + +A browser over WebSocket has no `PeerId` on the hub's side. Any ops the +browser registers land in a **connection-local Layer 2 overlay** +(ADR-024) — a per-`CallConnection` overlay that dies when the connection +drops. This is the same mechanism ADR-034 §2 describes for the inbound +browser case: the browser is a bidirectional call target during a live +session, not a peer-graph member, and the connection-local overlay is +what gives it bidirectional-call capability *without* peer-graph +membership. + +When the WS connection closes (browser closes the tab, network drops), +the overlay and all its registered ops are dropped — no explicit +deregistration needed. A `PeerRef::Specific("browser-X")` from another +node would route to nothing, because there is no `PeerEntry` for the +browser. + +### Bidirectionality (websocket.md §"Bidirectionality") + +The WS call-protocol session inherits the call protocol's native +bidirectionality: both sides can send `call.requested` frames. 
The +browser calls operations on the hub; the hub can call operations +registered on the browser's side, over the same session, using the same +`PendingRequestMap` and `EventEnvelope` framing as `alknet/call`. + +The browser case where the client registers no operations of its own +is the common case — the server→client call direction is unused because +the browser exposes nothing for the hub to call. That is use-case scoping, not an +architectural limitation. A browser that *does* expose ops (e.g., a UI +that registers a `ui/dragged` op the hub can call to push live updates) +registers them in the connection-local Layer 2 overlay, and the hub +reaches them through the live `CallConnection` handle — not through +`PeerRef::Specific` (the browser is not a peer). + +### Implementation + +The `CallConnection` constructed by the upgrade handler (the +`upgrade-handler` task, via the `dispatcher-transport-abstraction` +task's non-QUIC constructor) already holds a Layer 2 overlay +(the `imported_operations` field) +and exposes `register_imported()` / `register_imported_all()` / +`overlay_env()`. The browser registers ops via these methods; the +overlay is per-connection and dies when the `CallConnection` is dropped +(WS close). + +This task ensures: + +1. The overlay is correctly scoped to the WS connection (not the + `PeerCompositeEnv` — no `PeerId`, no `PeerEntry`). +2. The hub's outgoing `call.requested` to browser-registered ops routes + through the `CallConnection`'s overlay (via `overlay_env()`), not + through `PeerRef::Specific`. +3. The overlay is dropped on WS close (no explicit deregistration; the + overlay's shared `Arc` is dropped when the `CallConnection` is + dropped). +4. `AccessControl::check(identity)` gates the hub's calls to + browser-registered ops. The *hub* is the caller when it calls a + browser op: the hub's `call.requested` runs with the hub's identity + as caller and the browser's registration bundle's + `composition_authority` as handler identity. The `AccessControl` on + the browser's registered ops gates whether the hub is allowed to + call them.
+5. Abort cascade on WS disconnect (ADR-016): when the WS connection + closes, all in-flight subscriptions and calls to browser ops are + aborted, cascading to descendants. + +### What this task does NOT do + +- **No `PeerEntry` for the browser.** The browser is not in the peer + graph. This task ensures the overlay is connection-local, not + peer-graph. +- **No `from_wss` adapter.** Out of scope (websocket.md §"Future" — + scope decision). This task is about the browser *registering* ops on + its connection, not about importing a remote node's ops over WS. + +## Acceptance Criteria + +- [ ] Browser-registered ops land in the `CallConnection`'s Layer 2 overlay (not `PeerCompositeEnv`) +- [ ] No `PeerId` created for the browser (no `PeerEntry`, no peer-graph membership) +- [ ] `register_imported()` / `register_imported_all()` work for browser ops +- [ ] Hub's outgoing `call.requested` to browser ops routes through `overlay_env()` +- [ ] Hub's outgoing calls do NOT route through `PeerRef::Specific` (browser is not a peer) +- [ ] `AccessControl` on browser-registered ops gates the hub's calls +- [ ] Overlay dropped on WS close (no explicit deregistration; the overlay's `Arc` is dropped) +- [ ] `PeerRef::Specific("browser-X")` from another node → routes to nothing (no `PeerEntry`) +- [ ] WS close → all in-flight subscriptions/calls to browser ops aborted (ADR-016 cascade) +- [ ] WS close → overlay and all registered ops dropped +- [ ] Bidirectionality: hub can `call.requested` to browser-registered ops +- [ ] Browser with no registered ops → server→client direction unused (use-case scoping, not a limitation) +- [ ] Integration test: browser registers op → hub calls it via overlay +- [ ] Integration test: WS 
close → overlay dropped (op no longer reachable) +- [ ] Integration test: `PeerRef::Specific("browser-X")` → NOT_FOUND (no PeerEntry) +- [ ] Integration test: WS close mid-call to browser op → `call.aborted` cascade +- [ ] Integration test: `AccessControl` on browser op gates hub's call +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/websocket.md — Connection-local overlay (§"Connection-local overlay"), Bidirectionality (§"Bidirectionality"), Browsers are not peers (§"Browsers are not alknet peers") +- docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (Layer 2 connection-local overlay) +- docs/architecture/decisions/034-outgoing-only-x509-and-three-peer-roles.md — ADR-034 §4 (browsers are not peers) +- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 §5 (addressability vs bidirectionality rationale) +- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (abort cascade on disconnect) +- docs/architecture/decisions/029-peer-graph-routing-model.md — ADR-029 (PeerRef::Specific routes through PeerEntry → PeerId) + +## Notes + +> The connection-local overlay is the mechanism that gives a browser +> bidirectional-call capability without peer-graph membership. The +> browser has no PeerId, no PeerEntry, no PeerCompositeEnv entry — it is +> a bidirectional call target during a live session, not a peer-graph +> member. The overlay dies with the WS connection (no explicit +> deregistration). The hub reaches browser ops through the live +> CallConnection handle's overlay_env(), not through PeerRef::Specific. +> The "browsers are not peers" rationale (ADR-044 §5) is load-bearing: +> "peer" means addressable peer-graph node, not "any endpoint that +> exchanges calls during a live session." 
A browser has no stable +> cryptographic identity, is ephemeral, and is not addressable from +> other nodes — three concrete grounds for not being a peer. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/websocket/dispatcher-transport-abstraction.md b/tasks/http/websocket/dispatcher-transport-abstraction.md new file mode 100644 index 0000000..4363a26 --- /dev/null +++ b/tasks/http/websocket/dispatcher-transport-abstraction.md @@ -0,0 +1,180 @@ +--- +id: http/websocket/dispatcher-transport-abstraction +name: Expose EventEnvelope-level dispatch API in alknet-call for non-QUIC transports (WebSocket) +status: pending +depends_on: [http/crate-init] +scope: moderate +risk: high +impact: project +level: implementation +--- + +## Description + +Expose an `EventEnvelope`-level dispatch API in `alknet-call` so the +WebSocket handler can feed deserialized envelopes directly to the shared +`Dispatcher`, without requiring a QUIC `Connection`. This is a +**cross-crate task** (modifies `alknet-call`) and the **highest-risk +task** in the http phase: the spec says "the `Dispatcher` runs unchanged" +over WS (ADR-012, ADR-048), but the current implementation is +QUIC-specific in two places that need loosening. + +### The problem + +The current `Dispatcher` (in `crates/alknet-call/src/protocol/dispatch.rs`) +is transport-agnostic in *intent* (ADR-012 — stream-agnostic +correlation) but QUIC-specific in *two* integration points: + +1. **`Dispatcher::handle_stream`** takes raw `SendStream` / `RecvStream` + (QUIC-backed `alknet_core::types::SendStream` / `RecvStream`) and uses + `FrameFramedReader` (4-byte length-prefixed framing). The WebSocket + path does NOT use length-prefix framing — a WS binary message is + already length-delimited by the WS frame boundary (ADR-044 Assumption + 1). 
The WS handler deserializes `EventEnvelope` from each binary WS + message directly (no `FrameFramedReader`), and needs to feed the + envelope to the dispatch logic. + +2. **`CallConnection`** wraps an `alknet_core::types::Connection` (which + wraps a QUIC `quinn::Connection` or `iroh::endpoint::Connection`). + The WS path has no QUIC connection — it has a WS message stream. The + `CallConnection` is needed for: the Layer 2 overlay + (`imported_operations`), the `PendingRequestMap` (correlation), and + the `connection.identity()` (the resolved bearer identity). The WS + path needs a `CallConnection`-equivalent that holds these without a + QUIC `Connection`. + +### The fix: expose `dispatch_requested` as `pub` + +The core dispatch logic — `Dispatcher::dispatch_requested` — is already +transport-agnostic: it takes a `request_id: String`, a `payload: Value` +(the `EventEnvelope` payload), and a `&Arc`, and returns +a `ResponseEnvelope`. It is currently `pub(crate)`. **Expose it as +`pub`** so the WS handler can call it directly with a deserialized +`EventEnvelope` payload. + +Similarly, the abort-cascade handling (`call.aborted` events) is in +`Dispatcher::handle_stream` — extract the abort-handling logic into a +`pub` method so the WS handler can call it for `call.aborted` events. + +### The fix: `CallConnection` from a non-QUIC transport + +The `CallConnection` needs to be constructible from a non-QUIC source. +Two options (pick the cleaner one during implementation): + +**Option A: A `CallConnection::new_overlay_only(identity)` constructor.** +Construct a `CallConnection` that holds the Layer 2 overlay + +`PendingRequestMap` + the resolved bearer `Identity`, but no QUIC +`Connection`. The `connection()` accessor returns a stub or the +`identity()` is stored directly. This is the minimal change — +`CallConnection` gains a constructor that doesn't require a QUIC +`Connection`, and the `identity()` is read from a stored field rather +than `connection.identity()`. 
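To make Option A concrete, here is a minimal, std-only sketch of a `CallConnection` that can be built with or without a QUIC connection. Everything in it is a labeled assumption: `Identity`, `QuicConnection`, `from_quic`, `overlay_len`, `pending_len`, and the `Mutex<HashMap<..>>` containers are placeholders, not the real alknet-core / alknet-call types or lock choices.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-ins for the real alknet types (assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct Identity(pub String);

pub struct QuicConnection {
    pub identity: Identity,
}

pub struct CallConnection {
    /// `Some` on the QUIC path, `None` on the WebSocket path.
    connection: Option<Arc<QuicConnection>>,
    /// Stored directly, so `identity()` never reads through `connection`.
    identity: Identity,
    /// Layer 2 overlay placeholder: op name -> handler label.
    imported_operations: Arc<Mutex<HashMap<String, String>>>,
    /// Placeholder for the pending-request map: request id -> marker.
    pending: Arc<Mutex<HashMap<String, ()>>>,
}

impl CallConnection {
    /// Existing-style constructor: identity is copied out of the QUIC connection.
    pub fn from_quic(connection: Arc<QuicConnection>) -> Self {
        let identity = connection.identity.clone();
        Self::build(Some(connection), identity)
    }

    /// Option A: overlay + pending + identity, with no QUIC connection at all.
    pub fn new_overlay_only(identity: Identity) -> Self {
        Self::build(None, identity)
    }

    fn build(connection: Option<Arc<QuicConnection>>, identity: Identity) -> Self {
        Self {
            connection,
            identity,
            imported_operations: Arc::new(Mutex::new(HashMap::new())),
            pending: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    pub fn identity(&self) -> &Identity {
        &self.identity
    }

    /// Returns `None` for connections built by `new_overlay_only`.
    pub fn connection(&self) -> Option<&Arc<QuicConnection>> {
        self.connection.as_ref()
    }

    pub fn register_imported(&self, name: &str, handler: &str) {
        self.imported_operations
            .lock()
            .unwrap()
            .insert(name.to_string(), handler.to_string());
    }

    pub fn overlay_len(&self) -> usize {
        self.imported_operations.lock().unwrap().len()
    }

    pub fn pending_len(&self) -> usize {
        self.pending.lock().unwrap().len()
    }
}
```

Both constructors funnel through one private `build`, so the QUIC path and the WS path get identically initialized overlay and pending state, and only the transport handle differs.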
+ +**Option B: Extract a `CallSession` trait.** Define a trait that +`CallConnection` and a new `WsCallSession` both implement, with +`identity()`, `overlay_env()`, `pending()`, `register_imported()`. The +`Dispatcher` takes `&Arc`. This is more invasive but +cleaner; it's the right choice if the QUIC/WS divergence is large. + +**Recommendation: Option A** unless the divergence is larger than it +appears. The `CallConnection` already holds the overlay + pending as +`Arc>` / `Arc>` (independent of the QUIC +`Connection`); the only QUIC-coupled pieces are the `connection: Arc` +field and the `connection.identity()` call. A constructor that stores +the `Identity` directly (and returns `None` from `connection()` or +provides an `identity()` accessor that reads the stored field) is the +minimal change. + +### The WS dispatch loop (how the WS handler uses this) + +The WS upgrade handler (the `websocket/upgrade-handler` task) will: + +1. Resolve the bearer identity at upgrade time. +2. Construct a `CallConnection` (via the new constructor — Option A) or + equivalent (Option B) holding the identity, a fresh Layer 2 overlay, + and a fresh `PendingRequestMap`. +3. Construct a `Dispatcher` (already `pub`). +4. For each binary WS message: deserialize `EventEnvelope`, match on + `envelope.r#type`: + - `call.requested` → call `Dispatcher::dispatch_requested(connection, + request_id, payload)` (now `pub`), get `ResponseEnvelope`, convert + to `EventEnvelope`, write back as binary WS message. + - `call.aborted` → call the extracted `pub` abort-handling method. + - `call.responded` / `call.completed` → correlate via + `PendingRequestMap` (the WS handler's outgoing calls — + bidirectionality, ADR-043 §2). +5. On WS close: fail all pending, drop the overlay (connection-local, + dies with the WS connection). + +### What this task does NOT do + +- **No WS upgrade handler.** The upgrade handler is the + `websocket/upgrade-handler` task. This task exposes the API it calls.
+- **No WS framing.** The WS message → `EventEnvelope` deserialization is + the `websocket/upgrade-handler` task. This task takes deserialized + envelopes. +- **No `from_wss` adapter.** Out of scope (websocket.md §"Future" — + scope decision, not a two-way-door deferral). + +### Why this is the highest-risk task + +This task modifies `alknet-call`'s security-relevant dispatch code. The +`dispatch_requested` method runs `AccessControl::check(identity)` — the +sole authorization gate (ADR-029 §3). Exposing it as `pub` is safe (the +WS handler is in `alknet-http`, a trusted crate), but the change must +not alter the dispatch logic itself. The `CallConnection` change must +not break the existing QUIC path (the `CallAdapter` and `CallClient` +construct `CallConnection` from a QUIC `Connection` — that path must +continue to work unchanged). Run the full `alknet-call` test suite after +the change. + +## Acceptance Criteria + +- [ ] `Dispatcher::dispatch_requested` is `pub` (was `pub(crate)`) +- [ ] Abort-cascade handling extracted to a `pub` method (was inline in `handle_stream`) +- [ ] `CallConnection` constructible from a non-QUIC source (Option A or B) +- [ ] New `CallConnection` constructor stores `Identity` directly (or equivalent) +- [ ] `CallConnection::identity()` works for the non-QUIC case +- [ ] `CallConnection::overlay_env()`, `pending()`, `register_imported()` work for non-QUIC +- [ ] Existing QUIC path (`CallAdapter`, `CallClient`) unchanged — no regressions +- [ ] `Dispatcher::handle_stream` (QUIC path) still works unchanged +- [ ] `Dispatcher::run_loop` (QUIC path) still works unchanged +- [ ] `cargo test -p alknet-call` — all existing tests pass (no regressions) +- [ ] `cargo clippy -p alknet-call --all-targets` — no warnings +- [ ] Unit test: `dispatch_requested` callable with a non-QUIC `CallConnection` +- [ ] Unit test: abort-handling method callable with a non-QUIC `CallConnection` +- [ ] Unit test: `CallConnection` from non-QUIC source holds overlay 
+ pending + identity +- [ ] Integration test: dispatch a `call.requested` via the `pub` API → `ResponseEnvelope` +- [ ] Integration test: abort cascade via the `pub` API +- [ ] `cargo test -p alknet-http` succeeds (the WS handler can use the API) +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/websocket.md — Dispatch (§"Dispatch: the shared Dispatcher, unchanged"), Framing (§"Framing") +- docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012 (stream-agnostic correlation) +- docs/architecture/decisions/048-websocket-native-session-not-gateway.md — ADR-048 (WS carries native session) +- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 (WS message boundary is delimiter, no length prefix) +- docs/architecture/crates/call/call-protocol.md — Dispatcher, EventEnvelope wire format +- docs/architecture/crates/call/client-and-adapters.md — Shared Dispatcher (§"Shared Dispatcher") +- crates/alknet-call/src/protocol/dispatch.rs — current Dispatcher implementation +- crates/alknet-call/src/protocol/connection.rs — current CallConnection implementation + +## Notes + +> This is the highest-risk task in the http phase. It modifies +> alknet-call's security-relevant dispatch code to expose an +> EventEnvelope-level API for non-QUIC transports. The spec says "the +> Dispatcher runs unchanged" (ADR-012), but the current implementation is +> QUIC-specific in two places: handle_stream takes raw SendStream/RecvStream +> (length-prefixed framing), and CallConnection wraps a QUIC Connection. +> The fix is to expose dispatch_requested as pub and make CallConnection +> constructible from a non-QUIC source. The existing QUIC path (CallAdapter, +> CallClient) must not regress — run the full alknet-call test suite. The +> WS handler (websocket/upgrade-handler task) is the consumer of this API. 
+> This task is tracked in tasks/http/ because it unblocks the WS path, but +> it modifies alknet-call — coordinate with the call crate's conventions. + +## Summary + +> To be filled on completion \ No newline at end of file diff --git a/tasks/http/websocket/upgrade-handler.md b/tasks/http/websocket/upgrade-handler.md new file mode 100644 index 0000000..d2cc606 --- /dev/null +++ b/tasks/http/websocket/upgrade-handler.md @@ -0,0 +1,230 @@ +--- +id: http/websocket/upgrade-handler +name: Implement WebSocket upgrade handler (native EventEnvelope session, no length prefix, bearer auth) +status: pending +depends_on: [http/server/http-adapter, http/websocket/dispatcher-transport-abstraction, http/server/bearer-auth-middleware] +scope: broad +risk: high +impact: component +level: implementation +--- + +## Description + +Implement the WebSocket upgrade handler in `src/websocket/upgrade.rs`. +This is the v1 browser bidirectional path (ADR-044): a browser (or any +WS client) upgrades an HTTP/1.1 or HTTP/2 request to WebSocket and +speaks the call protocol over binary WS messages — full-duplex, both +sides can initiate calls (the call protocol's native bidirectionality, +ADR-012). The WS path carries the **native `EventEnvelope` session, not +the HTTP gateway shape** (ADR-048): the gateway endpoints +(`/search`/`/schema`/`/call`/`/batch`/`/subscribe`) are HTTP-only and do +not appear on WS; discovery is via `services/list`/`services/schema` as +ordinary call-protocol ops. + +### The upgrade handler (websocket.md §"The WS upgrade handler") + +The WS upgrade is an HTTP/1.1 or HTTP/2 request handled by an axum route +on `HttpAdapter`'s router. The handler: + +1. Receives the HTTP upgrade request (axum's `WebSocketUpgrade` extractor). +2. Resolves the caller's identity from the `Authorization: Bearer` header + via `identity_provider.resolve_from_token(&AuthToken { raw: + token_bytes })` (the shared `bearer_auth_middleware` — same auth path + as any HTTP request). 
The upgrade is rejected (`401`) if no token is + present; insufficient scopes for any op the browser later calls + surface as `403`/`FORBIDDEN` at call time, not at upgrade time (the + upgrade doesn't know which ops the browser will call). +3. Upgrades to WebSocket (axum's `WebSocketUpgrade::on_upgrade`), + producing a full-duplex `WebSocket` stream. +4. Wraps the `WebSocket` stream as a `BiStream`-satisfying transport — a + WS binary message in either direction is one `EventEnvelope` frame. +5. Constructs a `Dispatcher` (the shared dispatch loop) with the + `Arc` and `Arc` the + `HttpAdapter` holds, plus a connection-local Layer 2 overlay for any + ops the browser registers (the `connection-overlay` task). +6. Spawns the dispatch task on a tokio task; the WS connection is live + until either side closes it or the browser drops the handle (closes + the tab). + +### The upgrade path + +The **default upgrade path is `/alknet/call`** (the deployment may +override it via the `extra_routes` mechanism of ADR-046, but a +deployment that passes no custom routes gets `/alknet/call`). The path +must not collide with the reserved gateway/`/healthz`/`/openapi.json`/ +MCP/custom-route paths per ADR-046's collision rule; `/alknet/call` +namespaces away from the reserved set naturally. A deployment that +builds a custom REST projection with `POST /{service}/{op}` routes +(ADR-047 §4) coexists with the WS upgrade at `/alknet/call` — axum's +router matches static path segments ahead of path parameters, so the WS +upgrade's exact `/alknet/call` path wins over any `/{service}/{op}` +parameterized route. + +The upgrade runs over HTTP/1.1 (the standard `Upgrade: websocket` header, +RFC 6455) or HTTP/2 (the extended CONNECT protocol, RFC 8441); +axum and hyper support both, and the handler does not branch on which — +the WS frame stream is the same once the upgrade completes.
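The token-presence check in step 2 of the upgrade handler can be sketched as below. This illustrates the `401` decision only and is not the shared `bearer_auth_middleware` itself; `UpgradeAuth` and `bearer_from_header` are hypothetical names, and whether the token is actually valid is left to the identity provider.

```rust
/// Outcome of inspecting the upgrade request's `Authorization` header.
#[derive(Debug, PartialEq)]
pub enum UpgradeAuth<'a> {
    /// Raw token to hand to the identity provider for resolution.
    Token(&'a str),
    /// No usable bearer token: the upgrade is answered with `401`.
    MissingToken,
}

/// Extracts the bearer token from an optional `Authorization` header value.
pub fn bearer_from_header(header: Option<&str>) -> UpgradeAuth<'_> {
    let Some(value) = header else {
        return UpgradeAuth::MissingToken;
    };
    // Split "<scheme> <credentials>" on the first space only.
    let mut parts = value.trim().splitn(2, ' ');
    let scheme = parts.next().unwrap_or("");
    let token = parts.next().unwrap_or("").trim();
    // The auth scheme name is compared case-insensitively.
    if scheme.eq_ignore_ascii_case("bearer") && !token.is_empty() {
        UpgradeAuth::Token(token)
    } else {
        UpgradeAuth::MissingToken
    }
}
```

Scope checks do not appear here because they happen per `call.requested`, after the upgrade has completed.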
+ +### Framing: `EventEnvelope` over binary WS messages (websocket.md §"Framing") + +Every message on the WS connection is a binary WebSocket message +containing one `EventEnvelope`: + +```rust +pub struct EventEnvelope { + pub r#type: String, // "call.requested" | "call.responded" | "call.completed" | "call.aborted" | "call.error" + pub id: String, // Correlation key (request ID, subscription ID) + pub payload: Value, // serde_json::Value — schema depends on event type +} +``` + +This is the call protocol's wire format verbatim. **The WS path carries +no length prefix**: one `EventEnvelope` JSON object = one binary WS +message, and the WS message boundary is the delimiter. The +implementation must not prepend the QUIC length prefix on outbound WS +messages or expect it on inbound ones — the two framings are +deliberately different, matching each transport's native boundary +semantics. (The `FrameFramedReader`/`FrameFramedWriter` types the QUIC +dispatch loop uses are replaced on the WS path by direct JSON serde +over the WS message type; the `Dispatcher` itself is transport-agnostic +and consumes `EventEnvelope` values, not raw bytes.) + +Binary payloads within `EventEnvelope.payload` follow the same +base64-as-JSON-string convention the QUIC path uses — the envelope +carries `serde_json::Value` and does not interpret binary fields; that's +a handler-level concern, transport-agnostic. + +Text WS messages are not used; all call-protocol frames are binary. A +client that sends a text message gets a protocol-level close (the WS +handler validates message type). + +### Dispatch: the shared `Dispatcher` (websocket.md §"Dispatch") + +The WS message stream is handed to the `Dispatcher` — the same dispatch +loop the `CallAdapter` uses for `alknet/call` QUIC connections. 
The +dispatch half is one implementation; the connection-establishment half +differs (WS upgrade handler vs QUIC accept/dial), but after +establishment the `Dispatcher` runs identically: + +- Reads `EventEnvelope` frames from the WS message stream (deserialized + from binary WS messages — no `FrameFramedReader`). +- For `call.requested`: resolves the peer's identity (the bearer-token + identity resolved at upgrade time, stored on the connection), runs + `AccessControl::check(identity)` against the op's `AccessControl`, + dispatches via `OperationRegistry::invoke()` if allowed, returns + `FORBIDDEN` (→ `call.error`) before the handler runs if not. +- For `call.responded`/`call.completed`/`call.aborted`: correlates by + `id` via `PendingRequestMap` (keyed by request ID, not by transport — + ADR-012). +- Writes response `EventEnvelope` frames back as binary WS messages. + +Peer authorization flows through the existing `AccessControl::check` +against the resolved identity — no `RemoteFilter`, no `remote_safe` +gate (retired by ADR-029 §3). + +### Using the exposed dispatch API + +This task uses the `pub` dispatch API exposed by the +`dispatcher-transport-abstraction` task: + +- `Dispatcher::dispatch_requested(connection, request_id, payload)` — + for `call.requested` events. +- The `pub` abort-handling method — for `call.aborted` events. +- `CallConnection` constructed from the non-QUIC source (holding the + resolved bearer identity, a fresh Layer 2 overlay, a fresh + `PendingRequestMap`). + +### Bidirectionality (websocket.md §"Bidirectionality") + +The WS call-protocol session inherits the call protocol's native +bidirectionality: both sides can send `call.requested` frames. The +browser calls operations on the hub; the hub can call operations +registered on the browser's side, over the same session, using the same +`PendingRequestMap` and `EventEnvelope` framing as `alknet/call`. 
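The per-message routing described above can be sketched as a small classifier. `Action` and `next_action` are hypothetical names, and two branches are assumptions rather than spec text: routing `call.error` to correlation, and closing the connection on an unrecognized event type.

```rust
/// What the WS handler does with one incoming WebSocket message.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// `call.requested`: hand to `Dispatcher::dispatch_requested`.
    Dispatch,
    /// `call.aborted`: hand to the extracted abort-handling method.
    Abort,
    /// Reply to one of our own outgoing calls: look up by `id`.
    Correlate,
    /// Text message or unusable frame: protocol-level close.
    CloseWithProtocolError,
}

/// Classifies a message by WS message kind and envelope `type`.
pub fn next_action(is_binary: bool, event_type: &str) -> Action {
    if !is_binary {
        // All call-protocol frames are binary; text is rejected.
        return Action::CloseWithProtocolError;
    }
    match event_type {
        "call.requested" => Action::Dispatch,
        "call.aborted" => Action::Abort,
        // Assumption: `call.error` is correlated like the other replies.
        "call.responded" | "call.completed" | "call.error" => Action::Correlate,
        // Assumption: unknown event types close the connection.
        _ => Action::CloseWithProtocolError,
    }
}
```

Keeping the classification separate from the transport makes it easy to compare the WS loop against the QUIC loop on the security axis.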
+ +The browser case where the client registers no operations of its own +is the common case — the server→client call direction is unused +because the browser exposes nothing to call. That is a matter of use-case scoping, +not an architectural limitation. A browser that *does* expose ops +registers them in the connection-local Layer 2 overlay (the +`connection-overlay` task). + +### Streaming: native `call.responded` events, no SSE (websocket.md §"Streaming") + +A `Subscription` operation invoked over WS streams `call.responded` +events as binary WS messages directly — **no SSE `data:` framing**. SSE +is the `h2`/`http/1.1` streaming projection; on WS it is unnecessary +because WS is already a framed full-duplex channel. The browser receives +`call.responded` events one per WS binary message, with the same `id` +correlating them to the original `call.requested`; `call.completed` +closes the subscription; `call.aborted` closes it with an error frame. + +On WS client disconnect (the browser closes the tab mid-subscription), +the WS handler detects the stream close and sends `call.aborted` for +the in-flight subscription, which cascades to descendants per ADR-016.
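The subscription lifecycle above can be sketched as a small tracker keyed by `id`. `Subscriptions` and `Outcome` are hypothetical names and not part of `alknet-call`. The sketch assumes the handler only needs to know which subscriptions are still open, so that a disconnect can abort each of them.

```rust
use std::collections::BTreeMap;

/// Result of applying one event to the tracker.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Item,      // a `call.responded` for an open subscription
    Completed, // `call.completed` closed it normally
    Aborted,   // `call.aborted` closed it with an error
    UnknownId, // no open subscription with this id
    Ignored,   // known id, but an event type this sketch does not model
}

/// Open subscriptions, mapped to the number of items streamed so far.
#[derive(Default)]
pub struct Subscriptions {
    open: BTreeMap<String, usize>,
}

impl Subscriptions {
    /// Records a subscription opened by a `call.requested` with this id.
    pub fn start(&mut self, id: &str) {
        self.open.insert(id.to_string(), 0);
    }

    /// Applies one event, correlated by `id` only.
    pub fn on_event(&mut self, id: &str, event_type: &str) -> Outcome {
        let Some(count) = self.open.get_mut(id) else {
            return Outcome::UnknownId;
        };
        match event_type {
            "call.responded" => {
                *count += 1;
                Outcome::Item
            }
            "call.completed" => {
                self.open.remove(id);
                Outcome::Completed
            }
            "call.aborted" => {
                self.open.remove(id);
                Outcome::Aborted
            }
            _ => Outcome::Ignored,
        }
    }

    /// On disconnect, returns every id that still needs a `call.aborted`.
    pub fn on_disconnect(&mut self) -> Vec<String> {
        std::mem::take(&mut self.open).into_keys().collect()
    }
}
```

A `BTreeMap` is used only so that `on_disconnect` returns ids in a stable order, which keeps tests deterministic.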
+ +## Acceptance Criteria + +- [ ] WS upgrade route at `/alknet/call` (default, ADR-046 collision rule) +- [ ] Upgrade handler uses axum's `WebSocketUpgrade` extractor +- [ ] Bearer auth on upgrade request via shared `bearer_auth_middleware` +- [ ] No token → `401` (upgrade rejected) +- [ ] Token present but insufficient scopes → `403` at call time (not upgrade time) +- [ ] Resolved identity stored on the `CallConnection` (for observability + AccessControl) +- [ ] WS binary message = one `EventEnvelope` (JSON serde, no length prefix) +- [ ] No `FrameFramedReader`/`FrameFramedWriter` on the WS path (WS message boundary is delimiter) +- [ ] Text WS messages rejected (protocol-level close) +- [ ] `call.requested` → `Dispatcher::dispatch_requested` (the pub API) +- [ ] `AccessControl::check(identity)` gates every `call.requested` +- [ ] `FORBIDDEN` → `call.error` event (before handler runs) +- [ ] `call.responded`/`call.completed`/`call.aborted` correlated by `id` via `PendingRequestMap` +- [ ] Response `EventEnvelope` frames written as binary WS messages +- [ ] `call.aborted` → the pub abort-handling method +- [ ] Bidirectionality: hub can `call.requested` to browser-registered ops +- [ ] `Subscription` streams `call.responded` as binary WS messages (no SSE) +- [ ] `call.completed` closes subscription; `call.aborted` closes with error +- [ ] WS client disconnect mid-subscription → `call.aborted` (ADR-016 cascade) +- [ ] WS close → fail all pending, drop overlay (connection-local) +- [ ] Upgrade works over HTTP/1.1 (RFC 6455) and HTTP/2 (RFC 8441) +- [ ] Handler does not branch on HTTP version (WS frame stream is same post-upgrade) +- [ ] Integration test: WS upgrade → `call.requested` → `call.responded` round-trip +- [ ] Integration test: no Bearer token → 401 +- [ ] Integration test: `AccessControl` denied → `call.error` FORBIDDEN +- [ ] Integration test: `Subscription` over WS → multiple `call.responded` + `call.completed` +- [ ] Integration test: WS disconnect 
mid-subscription → `call.aborted` cascade +- [ ] Integration test: text WS message → protocol close +- [ ] Integration test: bidirectional (hub calls browser-registered op) +- [ ] `cargo test -p alknet-http` succeeds +- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings + +## References + +- docs/architecture/crates/http/websocket.md — full WS spec (upgrade handler, framing, dispatch, bidirectionality, streaming) +- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 (WS is v1 browser path, no length prefix) +- docs/architecture/decisions/048-websocket-native-session-not-gateway.md — ADR-048 (native session, not gateway shape) +- docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012 (stream-agnostic correlation) +- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (disconnect → abort cascade) +- docs/architecture/decisions/029-peer-graph-routing-model.md — ADR-029 §3 (AccessControl::check is sole gate) +- docs/architecture/decisions/046-assembly-layer-custom-http-routes.md — ADR-046 (collision rule for /alknet/call) +- /workspace/@alkdev/pubsub/src/event-target-websocket-client.ts — TypeScript prior art (EventEnvelope over WS binary messages) + +## Notes + +> The WS path is the native EventEnvelope session, not the gateway shape +> (ADR-048). The gateway endpoints are HTTP-only; discovery is via +> services/list/services/schema as call-protocol ops. The WS path carries +> no length prefix (ADR-044 Assumption 1 — the WS message boundary is the +> delimiter, unlike QUIC's 4-byte prefix). Text messages are rejected. The +> dispatch uses the pub API exposed by the dispatcher-transport-abstraction +> task (dispatch_requested + abort-handling + non-QUIC CallConnection). +> Bidirectionality: both sides can call.requested (ADR-043 §2 transferred +> per ADR-044 §3). Streaming is native call.responded events, no SSE. 
The +> default upgrade path is /alknet/call (namespaces away from reserved paths +> per ADR-046). This is the second-highest-risk task (after the transport +> abstraction) — the WS dispatch loop must be identical to the QUIC dispatch +> loop on the security axis (AccessControl, identity, abort cascade). + +## Summary + +> To be filled on completion \ No newline at end of file