Break the alknet-http architecture spec into atomic, dependency-ordered tasks in tasks/http/, following the taskgraph frontmatter conventions used by the call/core/vault crates.

Tasks span 7 phases across 5 module subdirectories (server/, gateway/, client/, adapters/, websocket/):

- Phase 0: crate-init (foundation)
- Phase 1: gateway-dispatch-spine, error-mapping, shared-http-client (shared infrastructure)
- Phase 2: http-adapter, bearer-auth-middleware, gateway-endpoints, healthz-decoy (HTTP server surface)
- Phase 3: to-openapi (OpenAPI gateway projection)
- Phase 4: from-openapi (OpenAPI adapter, reqwest forwarding)
- Phase 5: dispatcher-transport-abstraction, upgrade-handler, connection-overlay (WebSocket browser bidirectional path)
- Phase 6: from-mcp, to-mcp (MCP adapters, feature-gated)
- Phase 7: review-http, review-websocket, review-mcp, review-http-final (quality checkpoints)

The gateway-dispatch-spine task implements the thin shared core recommended by the gateway-factoring research (concrete struct, not a trait). The dispatcher-transport-abstraction task is a cross-crate change to alknet-call (it exposes an EventEnvelope-level dispatch API for non-QUIC transports) and is the highest-risk task. WebTransport/h3 is deferred per ADR-044 and has no tasks; from_wss is out of scope.

Validated: 19 tasks, no cycles, 8 parallel generations, critical path length 8 (through the WebSocket strand).

---
id: http/websocket/upgrade-handler
name: Implement WebSocket upgrade handler (native EventEnvelope session, no length prefix, bearer auth)
status: pending
depends_on: [http/server/http-adapter, http/websocket/dispatcher-transport-abstraction, http/server/bearer-auth-middleware]
scope: broad
risk: high
impact: component
level: implementation
---

## Description

Implement the WebSocket upgrade handler in `src/websocket/upgrade.rs`.
This is the v1 browser bidirectional path (ADR-044): a browser (or any
WS client) upgrades an HTTP/1.1 or HTTP/2 request to WebSocket and
speaks the call protocol over binary WS messages. The session is
full-duplex, and both sides can initiate calls (the call protocol's
native bidirectionality, ADR-012). The WS path carries the **native
`EventEnvelope` session, not the HTTP gateway shape** (ADR-048): the
gateway endpoints (`/search`/`/schema`/`/call`/`/batch`/`/subscribe`)
are HTTP-only and do not appear on WS; discovery is via
`services/list`/`services/schema` as ordinary call-protocol ops.

### The upgrade handler (websocket.md §"The WS upgrade handler")

The WS upgrade is an HTTP/1.1 or HTTP/2 request handled by an axum route
on `HttpAdapter`'s router. The handler:

1. Receives the HTTP upgrade request (axum's `WebSocketUpgrade` extractor).
2. Resolves the caller's identity from the `Authorization: Bearer` header
   via `identity_provider.resolve_from_token(&AuthToken { raw:
   token_bytes })` (the shared `bearer_auth_middleware`, the same auth
   path as any HTTP request). The upgrade is rejected (`401`) if no token
   is present; insufficient scopes for any op the browser later calls
   surface as `403`/`FORBIDDEN` at call time, not at upgrade time (the
   upgrade doesn't know which ops the browser will call).
3. Upgrades to WebSocket (axum's `WebSocketUpgrade::on_upgrade`),
   producing a full-duplex `WebSocket` stream.
4. Wraps the `WebSocket` stream as a `BiStream`-satisfying transport: a
   WS binary message in either direction is one `EventEnvelope` frame.
5. Constructs a `Dispatcher` (the shared dispatch loop) with the
   `Arc<OperationRegistry>` and `Arc<dyn IdentityProvider>` the
   `HttpAdapter` holds, plus a connection-local Layer 2 overlay for any
   ops the browser registers (the `connection-overlay` task).
6. Spawns the dispatch task on a tokio task; the WS connection is live
   until either side closes it or the browser drops the handle (closes
   the tab).

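The token check in step 2 can be sketched as follows. This is a minimal, std-only illustration and not the crate's code: `bearer_token` and `UpgradeRejection` are hypothetical names, and the real handler goes through the shared `bearer_auth_middleware` and `identity_provider.resolve_from_token`. Treating a non-`Bearer` scheme or an empty token as "no token present" is an assumption of this sketch.

```rust
/// Hypothetical rejection type: the HTTP status returned instead of upgrading.
#[derive(Debug, PartialEq)]
pub struct UpgradeRejection {
    pub status: u16,
}

/// Extracts the raw bearer token from an `Authorization` header value.
/// Returns a 401 rejection when no bearer token is present.
pub fn bearer_token(authorization: Option<&str>) -> Result<&str, UpgradeRejection> {
    let unauthorized = || UpgradeRejection { status: 401 };

    // Missing header: reject before the upgrade happens.
    let value = authorization.ok_or_else(unauthorized)?;

    // Split "<scheme> <token>"; a value without a space carries no token.
    let (scheme, token) = value.trim().split_once(' ').ok_or_else(unauthorized)?;

    // Authentication scheme names are case-insensitive.
    if !scheme.eq_ignore_ascii_case("bearer") {
        return Err(unauthorized());
    }

    let token = token.trim();
    if token.is_empty() {
        return Err(unauthorized());
    }
    Ok(token)
}
```

Scope checks are deliberately absent here: per step 2 they happen at call time, once the op is known.
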
### The upgrade path

The **default upgrade path is `/alknet/call`** (the deployment may
override it via the `extra_routes` mechanism of ADR-046, but a
deployment that passes no custom routes gets `/alknet/call`). The path
must not collide with the reserved gateway/`/healthz`/`/openapi.json`/
MCP/custom-route paths per ADR-046's collision rule; `/alknet/call`
namespaces away from the reserved set naturally. A deployment that
builds a custom REST projection with `POST /{service}/{op}` routes
(ADR-047 §4) coexists with the WS upgrade at `/alknet/call`: axum's
router gives exact (static) path segments priority over `{param}`
captures, so after `Router::merge` the WS upgrade's exact `/alknet/call`
path wins over any `/{service}/{op}` pattern.

The upgrade runs over HTTP/1.1 (the standard `Upgrade: websocket` header,
RFC 6455) or HTTP/2 (the extended CONNECT protocol, RFC 8441);
axum/hyper supports both, and the handler does not branch on which,
because the WS frame stream is the same once the upgrade completes.

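A collision check for the upgrade path might look like the sketch below. It is an illustration only: `collides` and `RESERVED` are hypothetical names, the reserved list contains just the paths named in this task (MCP and custom-route paths are deployment-specific and passed in separately), and treating a collision as an exact string match is an assumption about ADR-046's rule.

```rust
/// Reserved paths named in this task. The MCP path is not listed here
/// because this task does not name it (assumption: it is supplied by the caller).
const RESERVED: [&str; 7] = [
    "/search",
    "/schema",
    "/call",
    "/batch",
    "/subscribe",
    "/healthz",
    "/openapi.json",
];

/// Default WS upgrade path from this task.
pub const DEFAULT_UPGRADE_PATH: &str = "/alknet/call";

/// Returns true if `upgrade_path` exactly matches a reserved path or one of
/// the deployment's other routes (`other_routes` is a hypothetical parameter).
pub fn collides(upgrade_path: &str, other_routes: &[&str]) -> bool {
    RESERVED.iter().any(|p| *p == upgrade_path)
        || other_routes.iter().any(|p| *p == upgrade_path)
}
```

With no extra routes, `collides(DEFAULT_UPGRADE_PATH, &[])` is false, which is the "namespaces away naturally" property described above.
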
### Framing: `EventEnvelope` over binary WS messages (websocket.md §"Framing")

Every message on the WS connection is a binary WebSocket message
containing one `EventEnvelope`:

```rust
use serde_json::Value;

pub struct EventEnvelope {
    pub r#type: String, // "call.requested" | "call.responded" | "call.completed" | "call.aborted" | "call.error"
    pub id: String,     // Correlation key (request ID, subscription ID)
    pub payload: Value, // serde_json::Value; schema depends on event type
}
```

This is the call protocol's wire format verbatim. **The WS path carries
no length prefix**: one `EventEnvelope` JSON object = one binary WS
message, and the WS message boundary is the delimiter. The
implementation must not prepend the QUIC length prefix on outbound WS
messages or expect it on inbound ones. The two framings are
deliberately different, matching each transport's native boundary
semantics. (The `FrameFramedReader`/`FrameFramedWriter` types the QUIC
dispatch loop uses are replaced on the WS path by direct JSON serde
over the WS message type; the `Dispatcher` itself is transport-agnostic
and consumes `EventEnvelope` values, not raw bytes.)

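The difference between the two framings can be sketched with plain byte buffers. This is a std-only illustration with hypothetical function names; the 4-byte prefix width comes from the Notes section of this task, while the big-endian byte order is an assumption made only for the example.

```rust
/// QUIC-style framing: a length prefix followed by the envelope's JSON bytes.
/// Assumption: 4-byte, big-endian prefix (the byte order is not stated here).
pub fn frame_for_quic(envelope_json: &[u8]) -> Vec<u8> {
    let len = u32::try_from(envelope_json.len()).expect("envelope larger than u32::MAX bytes");
    let mut out = Vec::with_capacity(4 + envelope_json.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(envelope_json);
    out
}

/// WS framing: the binary message payload is exactly the JSON bytes.
/// The WebSocket message boundary is the delimiter, so nothing is prepended.
pub fn frame_for_ws(envelope_json: &[u8]) -> Vec<u8> {
    envelope_json.to_vec()
}
```

A WS reader therefore parses the whole message payload as one JSON object, and never strips leading bytes before doing so.
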
Binary payloads within `EventEnvelope.payload` follow the same
base64-as-JSON-string convention the QUIC path uses: the envelope
carries `serde_json::Value` and does not interpret binary fields; that's
a handler-level concern, transport-agnostic.

Text WS messages are not used; all call-protocol frames are binary. A
client that sends a text message gets a protocol-level close (the WS
handler validates message type).

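The message-type validation can be sketched as a small classifier. `WsMessage` below is a hypothetical stand-in for the WebSocket library's message enum, and `Inbound` is an illustrative name; ignoring ping/pong at this layer is an assumption, since this task only specifies the binary and text cases.

```rust
/// Hypothetical stand-in for a WebSocket library's message type.
#[derive(Debug, PartialEq)]
pub enum WsMessage {
    Binary(Vec<u8>),
    Text(String),
    Ping(Vec<u8>),
    Pong(Vec<u8>),
    Close,
}

/// What the WS read loop does with one inbound message.
#[derive(Debug, PartialEq)]
pub enum Inbound {
    /// Bytes of one `EventEnvelope` JSON object, ready to deserialize.
    Frame(Vec<u8>),
    /// Control traffic that carries no call-protocol frame (assumption).
    Ignore,
    /// The peer violated the framing rules; close the connection.
    ProtocolClose,
    /// The peer closed the connection.
    PeerClosed,
}

pub fn classify(msg: WsMessage) -> Inbound {
    match msg {
        WsMessage::Binary(bytes) => Inbound::Frame(bytes),
        // Text messages are never valid call-protocol frames.
        WsMessage::Text(_) => Inbound::ProtocolClose,
        WsMessage::Ping(_) | WsMessage::Pong(_) => Inbound::Ignore,
        WsMessage::Close => Inbound::PeerClosed,
    }
}
```
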
### Dispatch: the shared `Dispatcher` (websocket.md §"Dispatch")

The WS message stream is handed to the `Dispatcher`, the same dispatch
loop the `CallAdapter` uses for `alknet/call` QUIC connections. The
dispatch half is one implementation; the connection-establishment half
differs (WS upgrade handler vs QUIC accept/dial), but after
establishment the `Dispatcher` runs identically:

- Reads `EventEnvelope` frames from the WS message stream (deserialized
  from binary WS messages, with no `FrameFramedReader`).
- For `call.requested`: resolves the peer's identity (the bearer-token
  identity resolved at upgrade time, stored on the connection), runs
  `AccessControl::check(identity)` against the op's `AccessControl`,
  dispatches via `OperationRegistry::invoke()` if allowed, returns
  `FORBIDDEN` (→ `call.error`) before the handler runs if not.
- For `call.responded`/`call.completed`/`call.aborted`: correlates by
  `id` via `PendingRequestMap` (keyed by request ID, not by transport;
  ADR-012).
- Writes response `EventEnvelope` frames back as binary WS messages.

Peer authorization flows through the existing `AccessControl::check`
against the resolved identity; there is no `RemoteFilter` and no
`remote_safe` gate (retired by ADR-029 §3).

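The ordering guarantee (the access check runs before the handler, and a denial never reaches the handler) can be sketched as below. All types here are hypothetical, simplified stand-ins: the real `AccessControl`, identity, and registry types live in the alknet crates, and modelling access as a required-scope list is an assumption made for the example.

```rust
/// Hypothetical identity resolved at upgrade time and stored on the connection.
pub struct Identity {
    pub scopes: Vec<String>,
}

/// Hypothetical per-operation access rule: every listed scope is required.
pub struct AccessControl {
    pub required_scopes: Vec<String>,
}

impl AccessControl {
    pub fn check(&self, identity: &Identity) -> bool {
        self.required_scopes
            .iter()
            .all(|scope| identity.scopes.contains(scope))
    }
}

/// Result of handling one `call.requested` frame in this sketch.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// The handler ran and produced a (stringified) response payload.
    Invoked(String),
    /// Maps to a `call.error` event carrying this code.
    Error { code: &'static str },
}

/// Gate first, invoke second: the handler closure is only called if allowed.
pub fn gate_then_invoke<F>(identity: &Identity, acl: &AccessControl, handler: F) -> Outcome
where
    F: FnOnce() -> String,
{
    if !acl.check(identity) {
        return Outcome::Error { code: "FORBIDDEN" };
    }
    Outcome::Invoked(handler())
}
```

Because the identity is fixed at upgrade time, the same `Identity` value is reused for every `call.requested` on the connection.
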
### Using the exposed dispatch API

This task uses the `pub` dispatch API exposed by the
`dispatcher-transport-abstraction` task:

- `Dispatcher::dispatch_requested(connection, request_id, payload)`:
  for `call.requested` events.
- The `pub` abort-handling method: for `call.aborted` events.
- `CallConnection` constructed from the non-QUIC source (holding the
  resolved bearer identity, a fresh Layer 2 overlay, a fresh
  `PendingRequestMap`).

### Bidirectionality (websocket.md §"Bidirectionality")

The WS call-protocol session inherits the call protocol's native
bidirectionality: both sides can send `call.requested` frames. The
browser calls operations on the hub; the hub can call operations
registered on the browser's side, over the same session, using the same
`PendingRequestMap` and `EventEnvelope` framing as `alknet/call`.

The browser case where the client registers no operations of its own
is the common case: the server→client call direction is unused
because the hub has nothing to call on the browser. That is a use-case
scoping, not an architectural limitation. A browser that *does* expose
ops registers them in the connection-local Layer 2 overlay (the
`connection-overlay` task).

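Correlation by request ID, independent of which side initiated the call, can be sketched with a plain map. `PendingRequests` is a hypothetical, simplified stand-in for `PendingRequestMap`; storing the op name as the value and treating the first reply as terminal are simplifications for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `PendingRequestMap`: outbound calls awaiting a
/// reply, keyed by request ID only (not by transport or direction).
#[derive(Default)]
pub struct PendingRequests {
    waiting: HashMap<String, String>, // request ID -> op name (illustrative)
}

impl PendingRequests {
    /// Records an outbound `call.requested`, e.g. the hub calling an op the
    /// browser registered in its connection-local overlay.
    pub fn register(&mut self, id: &str, op: &str) {
        self.waiting.insert(id.to_string(), op.to_string());
    }

    /// Correlates an inbound reply by `id`. Returns the op that was waiting,
    /// or `None` if the ID is unknown or was already resolved.
    pub fn resolve(&mut self, id: &str) -> Option<String> {
        self.waiting.remove(id)
    }

    pub fn len(&self) -> usize {
        self.waiting.len()
    }
}
```

The same structure serves the hub-to-browser direction and the browser-to-hub direction, since nothing in the key identifies the transport.
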
### Streaming: native `call.responded` events, no SSE (websocket.md §"Streaming")

A `Subscription` operation invoked over WS streams `call.responded`
events as binary WS messages directly, with **no SSE `data:` framing**.
SSE is the `h2`/`http/1.1` streaming projection; on WS it is unnecessary
because WS is already a framed full-duplex channel. The browser receives
`call.responded` events one per WS binary message, with the same `id`
correlating them to the original `call.requested`; `call.completed`
closes the subscription; `call.aborted` closes it with an error frame.

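From the receiver's point of view, the subscription lifecycle is a small state machine, sketched below. `SubscriptionView` and `SubscriptionState` are hypothetical names, and payloads are kept as raw JSON text so the example needs only the standard library.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SubscriptionState {
    Open,
    Completed,
    Aborted,
}

/// Hypothetical client-side view of one subscription, keyed by request ID.
pub struct SubscriptionView {
    pub id: String,
    pub state: SubscriptionState,
    pub items: Vec<String>,
}

impl SubscriptionView {
    pub fn new(id: &str) -> Self {
        Self {
            id: id.to_string(),
            state: SubscriptionState::Open,
            items: Vec::new(),
        }
    }

    /// Applies one inbound event. Returns false when the event is ignored
    /// (different ID, subscription already closed, or unrelated event type).
    pub fn apply(&mut self, event_type: &str, id: &str, payload: &str) -> bool {
        if id != self.id || self.state != SubscriptionState::Open {
            return false;
        }
        match event_type {
            // One streamed item per binary WS message.
            "call.responded" => self.items.push(payload.to_string()),
            "call.completed" => self.state = SubscriptionState::Completed,
            "call.aborted" => self.state = SubscriptionState::Aborted,
            _ => return false,
        }
        true
    }
}
```
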
On WS client disconnect (the browser closes the tab mid-subscription),
the WS handler detects the stream close and issues `call.aborted` for
each in-flight subscription, which cascades to descendants per ADR-016.

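The disconnect path can be sketched as draining the set of in-flight subscription IDs into abort events. `on_ws_closed` and `AbortEvent` are hypothetical names; the descendant cascade of ADR-016 is not modelled here, and the real handler would route these through the `pub` abort-handling method.

```rust
/// Hypothetical, payload-free abort event for illustration.
#[derive(Debug, PartialEq)]
pub struct AbortEvent {
    pub r#type: &'static str,
    pub id: String,
}

/// Called when the WS stream closes. Empties the in-flight list and returns
/// one `call.aborted` event per subscription that was still open.
pub fn on_ws_closed(in_flight: &mut Vec<String>) -> Vec<AbortEvent> {
    in_flight
        .drain(..)
        .map(|id| AbortEvent {
            r#type: "call.aborted",
            id,
        })
        .collect()
}
```

Draining rather than iterating makes a second close notification a no-op, since the list is already empty.
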
## Acceptance Criteria

- [ ] WS upgrade route at `/alknet/call` (default, ADR-046 collision rule)
- [ ] Upgrade handler uses axum's `WebSocketUpgrade` extractor
- [ ] Bearer auth on upgrade request via shared `bearer_auth_middleware`
- [ ] No token → `401` (upgrade rejected)
- [ ] Token present but insufficient scopes → `403` at call time (not upgrade time)
- [ ] Resolved identity stored on the `CallConnection` (for observability + AccessControl)
- [ ] WS binary message = one `EventEnvelope` (JSON serde, no length prefix)
- [ ] No `FrameFramedReader`/`FrameFramedWriter` on the WS path (WS message boundary is delimiter)
- [ ] Text WS messages rejected (protocol-level close)
- [ ] `call.requested` → `Dispatcher::dispatch_requested` (the pub API)
- [ ] `AccessControl::check(identity)` gates every `call.requested`
- [ ] `FORBIDDEN` → `call.error` event (before handler runs)
- [ ] `call.responded`/`call.completed`/`call.aborted` correlated by `id` via `PendingRequestMap`
- [ ] Response `EventEnvelope` frames written as binary WS messages
- [ ] `call.aborted` → the pub abort-handling method
- [ ] Bidirectionality: hub can `call.requested` to browser-registered ops
- [ ] `Subscription` streams `call.responded` as binary WS messages (no SSE)
- [ ] `call.completed` closes subscription; `call.aborted` closes with error
- [ ] WS client disconnect mid-subscription → `call.aborted` (ADR-016 cascade)
- [ ] WS close → fail all pending, drop overlay (connection-local)
- [ ] Upgrade works over HTTP/1.1 (RFC 6455) and HTTP/2 (RFC 8441)
- [ ] Handler does not branch on HTTP version (WS frame stream is same post-upgrade)
- [ ] Integration test: WS upgrade → `call.requested` → `call.responded` round-trip
- [ ] Integration test: no Bearer token → 401
- [ ] Integration test: `AccessControl` denied → `call.error` FORBIDDEN
- [ ] Integration test: `Subscription` over WS → multiple `call.responded` + `call.completed`
- [ ] Integration test: WS disconnect mid-subscription → `call.aborted` cascade
- [ ] Integration test: text WS message → protocol close
- [ ] Integration test: bidirectional (hub calls browser-registered op)
- [ ] `cargo test -p alknet-http` succeeds
- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings

## References

- docs/architecture/crates/http/websocket.md: full WS spec (upgrade handler, framing, dispatch, bidirectionality, streaming)
- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md: ADR-044 (WS is v1 browser path, no length prefix)
- docs/architecture/decisions/048-websocket-native-session-not-gateway.md: ADR-048 (native session, not gateway shape)
- docs/architecture/decisions/012-call-protocol-stream-model.md: ADR-012 (stream-agnostic correlation)
- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md: ADR-016 (disconnect → abort cascade)
- docs/architecture/decisions/029-peer-graph-routing-model.md: ADR-029 §3 (AccessControl::check is sole gate)
- docs/architecture/decisions/046-assembly-layer-custom-http-routes.md: ADR-046 (collision rule for /alknet/call)
- /workspace/@alkdev/pubsub/src/event-target-websocket-client.ts: TypeScript prior art (EventEnvelope over WS binary messages)

## Notes

> The WS path is the native EventEnvelope session, not the gateway shape
> (ADR-048). The gateway endpoints are HTTP-only; discovery is via
> services/list and services/schema as call-protocol ops. The WS path
> carries no length prefix (ADR-044 Assumption 1: the WS message boundary
> is the delimiter, unlike QUIC's 4-byte prefix). Text messages are
> rejected. The dispatch uses the pub API exposed by the
> dispatcher-transport-abstraction task (dispatch_requested +
> abort-handling + non-QUIC CallConnection). Bidirectionality: both sides
> can send call.requested (ADR-043 §2 transferred per ADR-044 §3).
> Streaming is native call.responded events, no SSE. The default upgrade
> path is /alknet/call (namespaces away from reserved paths per ADR-046).
> This is the second-highest-risk task (after the transport abstraction):
> the WS dispatch loop must be identical to the QUIC dispatch loop on the
> security axis (AccessControl, identity, abort cascade).

## Summary

> To be filled on completion