docs(http): decompose alknet-http spec into 19 implementation tasks

Break the alknet-http architecture spec into atomic, dependency-ordered
tasks in tasks/http/, following the taskgraph frontmatter conventions
used by the call/core/vault crates.

Tasks span 8 phases (0 through 7) across 5 module subdirectories
(server/, gateway/, client/, adapters/, websocket/):
- Phase 0: crate-init (foundation)
- Phase 1: gateway-dispatch-spine, error-mapping, shared-http-client
  (shared infrastructure)
- Phase 2: http-adapter, bearer-auth-middleware, gateway-endpoints,
  healthz-decoy (HTTP server surface)
- Phase 3: to-openapi (OpenAPI gateway projection)
- Phase 4: from-openapi (OpenAPI adapter, reqwest forwarding)
- Phase 5: dispatcher-transport-abstraction, upgrade-handler,
  connection-overlay (WebSocket browser bidirectional path)
- Phase 6: from-mcp, to-mcp (MCP adapters, feature-gated)
- Phase 7: review-http, review-websocket, review-mcp, review-http-final
  (quality checkpoints)

The gateway-dispatch-spine task implements the thin shared core
recommended by the gateway-factoring research (concrete struct, not a
trait). The dispatcher-transport-abstraction task is a cross-crate
change to alknet-call (exposes EventEnvelope-level dispatch API for
non-QUIC transports) — the highest-risk task. WebTransport/h3 is
deferred per ADR-044 and has no tasks; from_wss is out of scope.

Validated: 19 tasks, no cycles, 8 parallel generations, critical path
length 8 (through the WebSocket strand).
2026-07-01 07:11:17 +00:00
parent e0c6f61e6a
commit e855c8c7eb
19 changed files with 3493 additions and 0 deletions


@@ -0,0 +1,182 @@
---
id: http/websocket/connection-overlay
name: Implement connection-local Layer 2 overlay for browser-registered ops (no PeerId, ADR-024/034/044)
status: pending
depends_on: [http/websocket/upgrade-handler]
scope: moderate
risk: medium
impact: component
level: implementation
---
## Description
Implement the connection-local Layer 2 overlay for browser-registered
ops in `src/websocket/overlay.rs`. This is the mechanism that gives a
browser bidirectional-call capability *without* peer-graph membership
(ADR-024, ADR-034 §4, ADR-044 §5). A browser over WebSocket has no
`PeerId`, does not enter `PeerCompositeEnv`, and any ops it registers
land in a per-`CallConnection` overlay that dies when the connection
drops.
### Browsers are not alknet peers (websocket.md §"Browsers are not alknet peers")
A browser over WebSocket authenticates by bearer token, gets no
`PeerId`, does not enter `PeerCompositeEnv`, and its registered ops (if
any) land in the connection-local Layer 2 overlay. The rationale, stated
in ADR-044 §5 and amending ADR-034 §4 by reference, is a load-bearing
distinction:
**"Peer" in alknet means an addressable node in the call-protocol peer
graph** — a stable `PeerId`, reachable via `PeerRef::Specific`, whose ops
land in `PeerCompositeEnv`, whose identity is stable across reconnects.
It does *not* mean "any endpoint that exchanges calls during a live
session." A browser is the second thing but not the first, on three
concrete grounds:
1. **No stable cryptographic identity of its own.** A `PeerEntry` is
anchored to fingerprints (Ed25519, X.509) that *the peer* presents
and the local node pins. A browser presents a bearer token the *hub*
issued; the "identity" is the hub's bookkeeping for that token, not
something the browser owns or that could be pinned by another node.
There is nothing to put in `PeerEntry.fingerprints`.
2. **Ephemeral.** Close the tab → connection dies → the connection-local
Layer 2 overlay dies with it. A `PeerEntry` keyed to a browser would
be a permanently-dead entry within seconds. `PeerRef::Specific("browser-X")`
from another node would route to nothing.
3. **Not addressable from other nodes.** `PeerRef::Specific` resolves
through `PeerEntry` → `PeerId`. Another alknet node has no way to
reach "the browser currently connected to hub-A"; the hub holds that
connection as a live `CallConnection` handle, not as a peer-graph
entry. The connection-local overlay is precisely the mechanism that
gives the browser bidirectional-call capability *without* peer-graph
membership.
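The addressability ground can be illustrated with a toy resolver. This is a minimal sketch under stated assumptions: `PeerTable`, `insert`, and `resolve_specific` are hypothetical stand-ins, and the `PeerId` / `PeerEntry` shapes here are simplified placeholders rather than the alknet-call types. Only the resolution order (name, then `PeerEntry`, then `PeerId`) is taken from the text above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a peer-graph identifier.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PeerId(pub String);

/// Hypothetical stand-in for a peer-graph entry, anchored to pinned fingerprints.
#[derive(Debug, Clone)]
pub struct PeerEntry {
    pub peer_id: PeerId,
    pub fingerprints: Vec<String>,
}

/// Toy peer table keyed by the name a `PeerRef::Specific` would carry.
#[derive(Default)]
pub struct PeerTable {
    entries: HashMap<String, PeerEntry>,
}

impl PeerTable {
    /// Adds an entry; only nodes with pinnable fingerprints ever get one.
    pub fn insert(&mut self, name: &str, fingerprints: Vec<String>) {
        let entry = PeerEntry {
            peer_id: PeerId(name.to_string()),
            fingerprints,
        };
        self.entries.insert(name.to_string(), entry);
    }

    /// Resolution goes name -> PeerEntry -> PeerId.
    /// A name with no entry (such as a browser) resolves to nothing.
    pub fn resolve_specific(&self, name: &str) -> Option<&PeerId> {
        self.entries.get(name).map(|entry| &entry.peer_id)
    }
}
```

Because a browser never gets an entry in the table, a lookup for it returns `None` regardless of whether a live WebSocket session exists on some hub.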
### The overlay (websocket.md §"Connection-local overlay")
A browser over WebSocket has no `PeerId` on the hub's side. Any ops the
browser registers land in a **connection-local Layer 2 overlay**
(ADR-024) — a per-`CallConnection` overlay that dies when the connection
drops. This is the same mechanism ADR-034 §2 describes for the inbound
browser case: the browser is a bidirectional call target during a live
session, not a peer-graph member, and the connection-local overlay is
what gives it bidirectional-call capability *without* peer-graph
membership.
When the WS connection closes (browser closes the tab, network drops),
the overlay and all its registered ops are dropped — no explicit
deregistration needed. A `PeerRef::Specific("browser-X")` from another
node would route to nothing, because there is no `PeerEntry` for the
browser.
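The "dies with the connection" behaviour falls out of ordinary ownership. The sketch below is an illustration only: `ToyConnection`, `overlay_probe`, and `op_reachable` are hypothetical names, and the `HandlerRegistration` here is a placeholder struct. It assumes the connection is the sole strong owner of the overlay `Arc`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

/// Placeholder for a handler registration (hypothetical shape).
pub struct HandlerRegistration {
    pub op_name: String,
}

type OverlayMap = RwLock<HashMap<String, HandlerRegistration>>;

/// Toy connection that owns its overlay; not the real `CallConnection`.
pub struct ToyConnection {
    imported_operations: Arc<OverlayMap>,
}

impl ToyConnection {
    pub fn new() -> Self {
        Self {
            imported_operations: Arc::new(RwLock::new(HashMap::new())),
        }
    }

    /// Registers an op in the connection-local overlay.
    pub fn register_imported(&self, op_name: &str) {
        let registration = HandlerRegistration {
            op_name: op_name.to_string(),
        };
        self.imported_operations
            .write()
            .unwrap()
            .insert(op_name.to_string(), registration);
    }

    /// Non-owning view of the overlay, used here only to observe its lifetime.
    pub fn overlay_probe(&self) -> Weak<OverlayMap> {
        Arc::downgrade(&self.imported_operations)
    }
}

/// Returns true if `op_name` is still reachable through the probe.
pub fn op_reachable(probe: &Weak<OverlayMap>, op_name: &str) -> bool {
    match probe.upgrade() {
        Some(overlay) => {
            let found = overlay.read().unwrap().contains_key(op_name);
            found
        }
        // The connection was dropped, so the overlay went with it.
        None => false,
    }
}
```

Dropping the `ToyConnection` releases the only strong reference, so every registered op disappears without a deregistration call.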
### Bidirectionality (websocket.md §"Bidirectionality")
The WS call-protocol session inherits the call protocol's native
bidirectionality: both sides can send `call.requested` frames. The
browser calls operations on the hub; the hub can call operations
registered on the browser's side, over the same session, using the same
`PendingRequestMap` and `EventEnvelope` framing as `alknet/call`.
The browser case where the client registers no operations of its own
is the common case: the server→client call direction is unused because
the hub has nothing to call on the browser. That is a use-case scoping, not an
architectural limitation. A browser that *does* expose ops (e.g., a UI
that registers a `ui/dragged` op the hub can call to push live updates)
registers them in the connection-local Layer 2 overlay, and the hub
reaches them through the live `CallConnection` handle — not through
`PeerRef::Specific` (the browser is not a peer).
### Implementation
The `CallConnection` constructed by the upgrade handler (the
`upgrade-handler` task, via the `dispatcher-transport-abstraction`
task's non-QUIC constructor) already holds a Layer 2 overlay
(`imported_operations: Arc<RwLock<HashMap<String, HandlerRegistration>>>`)
and exposes `register_imported()` / `register_imported_all()` /
`overlay_env()`. The browser registers ops via these methods; the
overlay is per-connection and dies when the `CallConnection` is dropped
(WS close).
This task ensures:
1. The overlay is correctly scoped to the WS connection (not the
`PeerCompositeEnv` — no `PeerId`, no `PeerEntry`).
2. The hub's outgoing `call.requested` to browser-registered ops routes
through the `CallConnection`'s overlay (via `overlay_env()`), not
through `PeerRef::Specific`.
3. The overlay is dropped on WS close (no explicit deregistration; the
`Arc<RwLock<HashMap>>` is dropped when the `CallConnection` is
dropped).
4. `AccessControl` on browser-registered ops gates the hub's calls. When
the hub calls a browser op, the hub is the caller and the browser is the
handler: the hub's `call.requested` runs with the hub's identity as
caller identity and the browser's registration bundle's
`composition_authority` as handler identity. The `AccessControl` the
browser attaches to its registered ops decides whether the hub is
allowed to call them.
5. Abort cascade on WS disconnect (ADR-016): when the WS connection
closes, all in-flight subscriptions and calls to browser ops are
aborted, cascading to descendants.
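The caller/handler roles in item 4 can be sketched as follows. Everything here is hypothetical: the allow-list policy, `BrowserOp`, and `hub_calls_browser_op` are invented for illustration, and the real `AccessControl` may well be scope-based rather than an identity allow-list. The point is only the direction of the check: the hub's identity is tested against the policy attached to the browser's op.

```rust
use std::collections::HashSet;

/// Hypothetical identity type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Identity(pub String);

/// Hypothetical policy: an allow-list of caller identities.
pub struct AccessControl {
    allowed_callers: HashSet<Identity>,
}

impl AccessControl {
    pub fn allow(callers: &[&str]) -> Self {
        Self {
            allowed_callers: callers.iter().map(|c| Identity(c.to_string())).collect(),
        }
    }

    /// Returns true if `caller` may invoke the op carrying this policy.
    pub fn check(&self, caller: &Identity) -> bool {
        self.allowed_callers.contains(caller)
    }
}

/// An op the browser registered in its connection-local overlay.
pub struct BrowserOp {
    pub access: AccessControl,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CallOutcome {
    Invoked,
    Forbidden,
}

/// The hub is the caller; the browser op's own policy decides the outcome.
pub fn hub_calls_browser_op(hub: &Identity, op: &BrowserOp) -> CallOutcome {
    if op.access.check(hub) {
        CallOutcome::Invoked
    } else {
        CallOutcome::Forbidden
    }
}
```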
### What this task does NOT do
- **No `PeerEntry` for the browser.** The browser is not in the peer
graph. This task ensures the overlay is connection-local, not
peer-graph.
- **No `from_wss` adapter.** Out of scope (websocket.md §"Future" —
scope decision). This task is about the browser *registering* ops on
its connection, not about importing a remote node's ops over WS.
## Acceptance Criteria
- [ ] Browser-registered ops land in the `CallConnection`'s Layer 2 overlay (not `PeerCompositeEnv`)
- [ ] No `PeerId` created for the browser (no `PeerEntry`, no peer-graph membership)
- [ ] `register_imported()` / `register_imported_all()` work for browser ops
- [ ] Hub's outgoing `call.requested` to browser ops routes through `overlay_env()`
- [ ] Hub's outgoing calls do NOT route through `PeerRef::Specific` (browser is not a peer)
- [ ] `AccessControl` on browser-registered ops gates the hub's calls
- [ ] Overlay dropped on WS close (no explicit deregistration; `Arc<RwLock<HashMap>>` dropped)
- [ ] `PeerRef::Specific("browser-X")` from another node → routes to nothing (no `PeerEntry`)
- [ ] WS close → all in-flight subscriptions/calls to browser ops aborted (ADR-016 cascade)
- [ ] WS close → overlay and all registered ops dropped
- [ ] Bidirectionality: hub can `call.requested` to browser-registered ops
- [ ] Browser with no registered ops → server→client direction unused (use-case scoping, not a limitation)
- [ ] Integration test: browser registers op → hub calls it via overlay
- [ ] Integration test: WS close → overlay dropped (op no longer reachable)
- [ ] Integration test: `PeerRef::Specific("browser-X")` → NOT_FOUND (no PeerEntry)
- [ ] Integration test: WS close mid-call to browser op → `call.aborted` cascade
- [ ] Integration test: `AccessControl` on browser op gates hub's call
- [ ] `cargo test -p alknet-http` succeeds
- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings
## References
- docs/architecture/crates/http/websocket.md — Connection-local overlay (§"Connection-local overlay"), Bidirectionality (§"Bidirectionality"), Browsers are not peers (§"Browsers are not alknet peers")
- docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (Layer 2 connection-local overlay)
- docs/architecture/decisions/034-outgoing-only-x509-and-three-peer-roles.md — ADR-034 §4 (browsers are not peers)
- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 §5 (addressability vs bidirectionality rationale)
- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (abort cascade on disconnect)
- docs/architecture/decisions/029-peer-graph-routing-model.md — ADR-029 (PeerRef::Specific routes through PeerEntry → PeerId)
## Notes
> The connection-local overlay is the mechanism that gives a browser
> bidirectional-call capability without peer-graph membership. The
> browser has no PeerId, no PeerEntry, no PeerCompositeEnv entry — it is
> a bidirectional call target during a live session, not a peer-graph
> member. The overlay dies with the WS connection (no explicit
> deregistration). The hub reaches browser ops through the live
> CallConnection handle's overlay_env(), not through PeerRef::Specific.
> The "browsers are not peers" rationale (ADR-044 §5) is load-bearing:
> "peer" means addressable peer-graph node, not "any endpoint that
> exchanges calls during a live session." A browser has no stable
> cryptographic identity, is ephemeral, and is not addressable from
> other nodes — three concrete grounds for not being a peer.
## Summary
> To be filled on completion


@@ -0,0 +1,180 @@
---
id: http/websocket/dispatcher-transport-abstraction
name: Expose EventEnvelope-level dispatch API in alknet-call for non-QUIC transports (WebSocket)
status: pending
depends_on: [http/crate-init]
scope: moderate
risk: high
impact: project
level: implementation
---
## Description
Expose an `EventEnvelope`-level dispatch API in `alknet-call` so the
WebSocket handler can feed deserialized envelopes directly to the shared
`Dispatcher`, without requiring a QUIC `Connection`. This is a
**cross-crate task** (modifies `alknet-call`) and the **highest-risk
task** in the http phase: the spec says "the `Dispatcher` runs unchanged"
over WS (ADR-012, ADR-048), but the current implementation is
QUIC-specific in two places that need loosening.
### The problem
The current `Dispatcher` (in `crates/alknet-call/src/protocol/dispatch.rs`)
is transport-agnostic in *intent* (ADR-012 — stream-agnostic
correlation) but QUIC-specific in *two* integration points:
1. **`Dispatcher::handle_stream`** takes raw `SendStream` / `RecvStream`
(QUIC-backed `alknet_core::types::SendStream` / `RecvStream`) and uses
`FrameFramedReader` (4-byte length-prefixed framing). The WebSocket
path does NOT use length-prefix framing — a WS binary message is
already length-delimited by the WS frame boundary (ADR-044 Assumption
1). The WS handler deserializes `EventEnvelope` from each binary WS
message directly (no `FrameFramedReader`), and needs to feed the
envelope to the dispatch logic.
2. **`CallConnection`** wraps an `alknet_core::types::Connection` (which
wraps a QUIC `quinn::Connection` or `iroh::endpoint::Connection`).
The WS path has no QUIC connection — it has a WS message stream. The
`CallConnection` is needed for: the Layer 2 overlay
(`imported_operations`), the `PendingRequestMap` (correlation), and
the `connection.identity()` (the resolved bearer identity). The WS
path needs a `CallConnection`-equivalent that holds these without a
QUIC `Connection`.
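The framing difference in point 1 can be made concrete with a small sketch. The function names are hypothetical and this is not the `FrameFramedReader` implementation; big-endian byte order for the 4-byte prefix is an assumption of the sketch, since the text above only states the prefix width.

```rust
/// Stream-style framing as described above: a 4-byte length prefix, then the payload.
/// Big-endian order is an assumption of this sketch.
pub fn encode_length_prefixed(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize, "payload too large for a u32 prefix");
    let len = payload.len() as u32;
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(payload);
    frame
}

/// Reads one frame from the front of `buf`.
/// Returns (payload, remaining bytes), or None if the frame is incomplete.
pub fn decode_length_prefixed(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let body = &buf[4..];
    if body.len() < len {
        return None;
    }
    Some(body.split_at(len))
}

/// WS-style framing: the message boundary is the delimiter,
/// so the serialized envelope is sent as-is with no prefix.
pub fn encode_ws_binary(payload: &[u8]) -> Vec<u8> {
    payload.to_vec()
}
```

A byte stream needs the prefix to find where one envelope ends; a WS binary message already carries that boundary, which is why the WS path can hand the bytes straight to the deserializer.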
### The fix: expose `dispatch_requested` as `pub`
The core dispatch logic — `Dispatcher::dispatch_requested` — is already
transport-agnostic: it takes a `request_id: String`, a `payload: Value`
(the `EventEnvelope` payload), and a `&Arc<CallConnection>`, and returns
a `ResponseEnvelope`. It is currently `pub(crate)`. **Expose it as
`pub`** so the WS handler can call it directly with a deserialized
`EventEnvelope` payload.
Similarly, the abort-cascade handling (`call.aborted` events) is in
`Dispatcher::handle_stream` — extract the abort-handling logic into a
`pub` method so the WS handler can call it for `call.aborted` events.
### The fix: `CallConnection` from a non-QUIC transport
The `CallConnection` needs to be constructible from a non-QUIC source.
Two options (pick the cleaner one during implementation):
**Option A: A `CallConnection::new_overlay_only(identity)` constructor.**
Construct a `CallConnection` that holds the Layer 2 overlay +
`PendingRequestMap` + the resolved bearer `Identity`, but no QUIC
`Connection`. Either the `connection()` accessor returns a stub, or the
`Identity` is stored directly on the struct. This is the minimal change:
`CallConnection` gains a constructor that doesn't require a QUIC
`Connection`, and the `identity()` is read from a stored field rather
than `connection.identity()`.
**Option B: Extract a `CallSession` trait.** Define a trait that
`CallConnection` and a new `WsCallSession` both implement, with
`identity()`, `overlay_env()`, `pending()`, `register_imported()`. The
`Dispatcher` takes `&Arc<dyn CallSession>`. This is more invasive but
cleaner; it's the right choice if the QUIC/WS divergence is large.
**Recommendation: Option A** unless the divergence is larger than it
appears. The `CallConnection` already holds the overlay + pending as
`Arc<RwLock<...>>` / `Arc<Mutex<...>>` (independent of the QUIC
`Connection`); the only QUIC-coupled piece is the `connection: Arc<Connection>`
field and the `connection.identity()` call. A constructor that stores
the `Identity` directly (and returns `None` from `connection()` or
provides an `identity()` accessor that reads the stored field) is the
minimal change.
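A rough shape for Option A, as a sketch only. `CallConnectionSketch`, `QuicConnection`, `from_quic`, `has_imported`, and `pending_len` are hypothetical names, the overlay and pending maps use `String` placeholders instead of the real value types, and the actual field layout of `CallConnection` may differ. What the sketch takes from the text is the idea of an optional QUIC handle plus an `Identity` stored directly.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical identity type (stands in for the resolved bearer identity).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Identity(pub String);

/// Placeholder for the QUIC-backed connection; not the alknet-core type.
pub struct QuicConnection {
    identity: Identity,
}

impl QuicConnection {
    pub fn new(identity: Identity) -> Self {
        Self { identity }
    }
    pub fn identity(&self) -> &Identity {
        &self.identity
    }
}

/// Sketch of a connection whose QUIC handle is optional.
pub struct CallConnectionSketch {
    connection: Option<Arc<QuicConnection>>, // None on the WS path
    identity: Identity,                      // stored directly, not read through `connection`
    imported_operations: Arc<RwLock<HashMap<String, String>>>,
    pending: Arc<Mutex<HashMap<String, String>>>,
}

impl CallConnectionSketch {
    /// Existing path: the identity is copied out of the QUIC connection once.
    pub fn from_quic(connection: Arc<QuicConnection>) -> Self {
        let identity = connection.identity().clone();
        Self {
            connection: Some(connection),
            identity,
            imported_operations: Arc::default(),
            pending: Arc::default(),
        }
    }

    /// New path (Option A): no QUIC handle, identity supplied by the caller.
    pub fn new_overlay_only(identity: Identity) -> Self {
        Self {
            connection: None,
            identity,
            imported_operations: Arc::default(),
            pending: Arc::default(),
        }
    }

    pub fn identity(&self) -> &Identity {
        &self.identity
    }

    pub fn connection(&self) -> Option<&Arc<QuicConnection>> {
        self.connection.as_ref()
    }

    pub fn register_imported(&self, op: &str, handler: &str) {
        self.imported_operations
            .write()
            .unwrap()
            .insert(op.to_string(), handler.to_string());
    }

    pub fn has_imported(&self, op: &str) -> bool {
        self.imported_operations.read().unwrap().contains_key(op)
    }

    pub fn pending_len(&self) -> usize {
        self.pending.lock().unwrap().len()
    }
}
```

In this shape the QUIC constructor keeps its behaviour, and callers that previously went through `connection.identity()` read the stored field instead, which is the one call-site change Option A implies.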
### The WS dispatch loop (how the WS handler uses this)
The WS upgrade handler (the `websocket/upgrade-handler` task) will:
1. Resolve the bearer identity at upgrade time.
2. Construct a `CallConnection` (via the new constructor — Option A) or
equivalent (Option B) holding the identity, a fresh Layer 2 overlay,
and a fresh `PendingRequestMap`.
3. Construct a `Dispatcher` (already `pub`).
4. For each binary WS message: deserialize `EventEnvelope`, match on
`envelope.r#type`:
- `call.requested` → call `Dispatcher::dispatch_requested(connection,
request_id, payload)` (now `pub`), get `ResponseEnvelope`, convert
to `EventEnvelope`, write back as binary WS message.
- `call.aborted` → call the extracted `pub` abort-handling method.
- `call.responded` / `call.completed` → correlate via
`PendingRequestMap` (the WS handler's outgoing calls —
bidirectionality, ADR-043 §2).
5. On WS close: fail all pending, drop the overlay (connection-local,
dies with the WS connection).
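Step 4 amounts to a match on the envelope type. The sketch below shows only that routing decision: `Envelope`, `Action`, and `route` are hypothetical names, the payload is a plain `String` instead of `serde_json::Value`, and the catch-all arm for event types the steps do not mention is a choice made for this sketch, not something the spec states.

```rust
/// Simplified envelope; the real payload is a `serde_json::Value`.
#[derive(Debug, Clone)]
pub struct Envelope {
    pub r#type: String,
    pub id: String,
    pub payload: String,
}

/// What the WS loop should do with an inbound envelope (hypothetical enum).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Hand to the dispatch entry point for inbound calls.
    DispatchRequested { request_id: String },
    /// Hand to the extracted abort-handling method.
    HandleAbort { request_id: String },
    /// A reply to one of the hub's own outgoing calls: correlate by id.
    CorrelatePending { request_id: String },
    /// Event types the steps above do not cover.
    Unhandled { event_type: String },
}

/// Routes one deserialized envelope according to its type string.
pub fn route(envelope: &Envelope) -> Action {
    let request_id = envelope.id.clone();
    match envelope.r#type.as_str() {
        "call.requested" => Action::DispatchRequested { request_id },
        "call.aborted" => Action::HandleAbort { request_id },
        "call.responded" | "call.completed" => Action::CorrelatePending { request_id },
        other => Action::Unhandled {
            event_type: other.to_string(),
        },
    }
}
```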
### What this task does NOT do
- **No WS upgrade handler.** The upgrade handler is the
`websocket/upgrade-handler` task. This task exposes the API it calls.
- **No WS framing.** The WS message → `EventEnvelope` deserialization is
the `websocket/upgrade-handler` task. This task takes deserialized
envelopes.
- **No `from_wss` adapter.** Out of scope (websocket.md §"Future" —
scope decision, not a two-way-door deferral).
### Why this is the highest-risk task
This task modifies `alknet-call`'s security-relevant dispatch code. The
`dispatch_requested` method runs `AccessControl::check(identity)` — the
sole authorization gate (ADR-029 §3). Exposing it as `pub` is safe (the
WS handler is in `alknet-http`, a trusted crate), but the change must
not alter the dispatch logic itself. The `CallConnection` change must
not break the existing QUIC path (the `CallAdapter` and `CallClient`
construct `CallConnection` from a QUIC `Connection` — that path must
continue to work unchanged). Run the full `alknet-call` test suite after
the change.
## Acceptance Criteria
- [ ] `Dispatcher::dispatch_requested` is `pub` (was `pub(crate)`)
- [ ] Abort-cascade handling extracted to a `pub` method (was inline in `handle_stream`)
- [ ] `CallConnection` constructible from a non-QUIC source (Option A or B)
- [ ] New `CallConnection` constructor stores `Identity` directly (or equivalent)
- [ ] `CallConnection::identity()` works for the non-QUIC case
- [ ] `CallConnection::overlay_env()`, `pending()`, `register_imported()` work for non-QUIC
- [ ] Existing QUIC path (`CallAdapter`, `CallClient`) unchanged — no regressions
- [ ] `Dispatcher::handle_stream` (QUIC path) still works unchanged
- [ ] `Dispatcher::run_loop` (QUIC path) still works unchanged
- [ ] `cargo test -p alknet-call` — all existing tests pass (no regressions)
- [ ] `cargo clippy -p alknet-call --all-targets` — no warnings
- [ ] Unit test: `dispatch_requested` callable with a non-QUIC `CallConnection`
- [ ] Unit test: abort-handling method callable with a non-QUIC `CallConnection`
- [ ] Unit test: `CallConnection` from non-QUIC source holds overlay + pending + identity
- [ ] Integration test: dispatch a `call.requested` via the `pub` API → `ResponseEnvelope`
- [ ] Integration test: abort cascade via the `pub` API
- [ ] `cargo test -p alknet-http` succeeds (the WS handler can use the API)
- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings
## References
- docs/architecture/crates/http/websocket.md — Dispatch (§"Dispatch: the shared Dispatcher, unchanged"), Framing (§"Framing")
- docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012 (stream-agnostic correlation)
- docs/architecture/decisions/048-websocket-native-session-not-gateway.md — ADR-048 (WS carries native session)
- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 (WS message boundary is delimiter, no length prefix)
- docs/architecture/crates/call/call-protocol.md — Dispatcher, EventEnvelope wire format
- docs/architecture/crates/call/client-and-adapters.md — Shared Dispatcher (§"Shared Dispatcher")
- crates/alknet-call/src/protocol/dispatch.rs — current Dispatcher implementation
- crates/alknet-call/src/protocol/connection.rs — current CallConnection implementation
## Notes
> This is the highest-risk task in the http phase. It modifies
> alknet-call's security-relevant dispatch code to expose an
> EventEnvelope-level API for non-QUIC transports. The spec says "the
> Dispatcher runs unchanged" (ADR-012), but the current implementation is
> QUIC-specific in two places: handle_stream takes raw SendStream/RecvStream
> (length-prefixed framing), and CallConnection wraps a QUIC Connection.
> The fix is to expose dispatch_requested as pub and make CallConnection
> constructible from a non-QUIC source. The existing QUIC path (CallAdapter,
> CallClient) must not regress — run the full alknet-call test suite. The
> WS handler (websocket/upgrade-handler task) is the consumer of this API.
> This task is tracked in tasks/http/ because it unblocks the WS path, but
> it modifies alknet-call — coordinate with the call crate's conventions.
## Summary
> To be filled on completion


@@ -0,0 +1,230 @@
---
id: http/websocket/upgrade-handler
name: Implement WebSocket upgrade handler (native EventEnvelope session, no length prefix, bearer auth)
status: pending
depends_on: [http/server/http-adapter, http/websocket/dispatcher-transport-abstraction, http/server/bearer-auth-middleware]
scope: broad
risk: high
impact: component
level: implementation
---
## Description
Implement the WebSocket upgrade handler in `src/websocket/upgrade.rs`.
This is the v1 browser bidirectional path (ADR-044): a browser (or any
WS client) upgrades an HTTP/1.1 or HTTP/2 request to WebSocket and
speaks the call protocol over binary WS messages — full-duplex, both
sides can initiate calls (the call protocol's native bidirectionality,
ADR-012). The WS path carries the **native `EventEnvelope` session, not
the HTTP gateway shape** (ADR-048): the gateway endpoints
(`/search`/`/schema`/`/call`/`/batch`/`/subscribe`) are HTTP-only and do
not appear on WS; discovery is via `services/list`/`services/schema` as
ordinary call-protocol ops.
### The upgrade handler (websocket.md §"The WS upgrade handler")
The WS upgrade is an HTTP/1.1 or HTTP/2 request handled by an axum route
on `HttpAdapter`'s router. The handler:
1. Receives the HTTP upgrade request (axum's `WebSocketUpgrade` extractor).
2. Resolves the caller's identity from the `Authorization: Bearer` header
via `identity_provider.resolve_from_token(&AuthToken { raw:
token_bytes })` (the shared `bearer_auth_middleware` — same auth path
as any HTTP request). The upgrade is rejected (`401`) if no token is
present; insufficient scopes for any op the browser later calls
surface as `403`/`FORBIDDEN` at call time, not at upgrade time (the
upgrade doesn't know which ops the browser will call).
3. Upgrades to WebSocket (axum's `WebSocketUpgrade::on_upgrade`),
producing a full-duplex `WebSocket` stream.
4. Wraps the `WebSocket` stream as a `BiStream`-satisfying transport — a
WS binary message in either direction is one `EventEnvelope` frame.
5. Constructs a `Dispatcher` (the shared dispatch loop) with the
`Arc<OperationRegistry>` and `Arc<dyn IdentityProvider>` the
`HttpAdapter` holds, plus a connection-local Layer 2 overlay for any
ops the browser registers (the `connection-overlay` task).
6. Spawns the dispatch task on a tokio task; the WS connection is live
until either side closes it or the browser drops the handle (closes
the tab).
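The "no token means `401`" rule in step 2 can be sketched without any HTTP framework. `bearer_token`, `decide_upgrade`, and `UpgradeDecision` are hypothetical helpers, not part of axum or the shared `bearer_auth_middleware`; the sketch only covers extracting the token from the header value, and leaves token resolution to the identity provider.

```rust
/// Outcome of inspecting the upgrade request's Authorization header (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum UpgradeDecision<'a> {
    /// A token is present; pass it to the identity provider next.
    Proceed { token: &'a str },
    /// No usable bearer token; reject the upgrade with 401.
    Reject401,
}

/// Extracts the token from an `Authorization: Bearer <token>` header value.
/// The scheme comparison is case-insensitive.
pub fn bearer_token(header_value: Option<&str>) -> Option<&str> {
    let value = header_value?.trim();
    let (scheme, token) = value.split_once(' ')?;
    if !scheme.eq_ignore_ascii_case("Bearer") {
        return None;
    }
    let token = token.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Decides whether the upgrade may proceed to identity resolution.
pub fn decide_upgrade(header_value: Option<&str>) -> UpgradeDecision<'_> {
    match bearer_token(header_value) {
        Some(token) => UpgradeDecision::Proceed { token },
        None => UpgradeDecision::Reject401,
    }
}
```

Scope checks are deliberately absent here, matching the text: they surface as `FORBIDDEN` per call, after the session is established.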
### The upgrade path
The **default upgrade path is `/alknet/call`** (the deployment may
override it via the `extra_routes` mechanism of ADR-046, but a
deployment that passes no custom routes gets `/alknet/call`). The path
must not collide with the reserved gateway/`/healthz`/`/openapi.json`/
MCP/custom-route paths per ADR-046's collision rule; `/alknet/call`
namespaces away from the reserved set naturally. A deployment that
builds a custom REST projection with `POST /{service}/{op}` routes
(ADR-047 §4) coexists with the WS upgrade at `/alknet/call`: axum's
router matches static path segments ahead of dynamic ones, so the WS
upgrade's exact `/alknet/call` path wins over any `/{service}/{op}`
pattern.
The upgrade runs over HTTP/1.1 (the standard `Upgrade: websocket` header,
RFC 6455) or HTTP/2 (the extended CONNECT protocol, RFC 8441);
axum/hyper supports both, and the handler does not branch on which —
the WS frame stream is the same once the upgrade completes.
### Framing: `EventEnvelope` over binary WS messages (websocket.md §"Framing")
Every message on the WS connection is a binary WebSocket message
containing one `EventEnvelope`:
```rust
pub struct EventEnvelope {
    pub r#type: String, // "call.requested" | "call.responded" | "call.completed" | "call.aborted" | "call.error"
    pub id: String,     // Correlation key (request ID, subscription ID)
    pub payload: Value, // serde_json::Value; schema depends on event type
}
```
This is the call protocol's wire format verbatim. **The WS path carries
no length prefix**: one `EventEnvelope` JSON object = one binary WS
message, and the WS message boundary is the delimiter. The
implementation must not prepend the QUIC length prefix on outbound WS
messages or expect it on inbound ones — the two framings are
deliberately different, matching each transport's native boundary
semantics. (The `FrameFramedReader`/`FrameFramedWriter` types the QUIC
dispatch loop uses are replaced on the WS path by direct JSON serde
over the WS message type; the `Dispatcher` itself is transport-agnostic
and consumes `EventEnvelope` values, not raw bytes.)
Binary payloads within `EventEnvelope.payload` follow the same
base64-as-JSON-string convention the QUIC path uses — the envelope
carries `serde_json::Value` and does not interpret binary fields; that's
a handler-level concern, transport-agnostic.
Text WS messages are not used; all call-protocol frames are binary. A
client that sends a text message gets a protocol-level close (the WS
handler validates message type).
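The message-type validation can be sketched as a classifier over an enum shaped like the message types most WebSocket libraries expose. `WsMessage`, `Step`, and `classify` are hypothetical names, and ignoring ping/pong at this layer is an assumption of the sketch (libraries commonly answer pings themselves); the text above only specifies the binary and text cases.

```rust
/// Hypothetical message enum, mirroring the usual WebSocket message kinds.
#[derive(Debug, PartialEq, Eq)]
pub enum WsMessage {
    Binary(Vec<u8>),
    Text(String),
    Ping(Vec<u8>),
    Pong(Vec<u8>),
    Close,
}

/// What the handler does next with an inbound message (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    /// Deserialize exactly one envelope from these bytes.
    Decode(Vec<u8>),
    /// Control frames carry no call-protocol content.
    Ignore,
    /// Text frames are not part of the protocol: close the connection.
    CloseProtocolViolation,
    /// Peer closed: fail pending requests and drop the overlay.
    Shutdown,
}

/// Classifies one inbound WebSocket message.
pub fn classify(message: WsMessage) -> Step {
    match message {
        WsMessage::Binary(bytes) => Step::Decode(bytes),
        WsMessage::Text(_) => Step::CloseProtocolViolation,
        WsMessage::Ping(_) | WsMessage::Pong(_) => Step::Ignore,
        WsMessage::Close => Step::Shutdown,
    }
}
```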
### Dispatch: the shared `Dispatcher` (websocket.md §"Dispatch")
The WS message stream is handed to the `Dispatcher` — the same dispatch
loop the `CallAdapter` uses for `alknet/call` QUIC connections. The
dispatch half is one implementation; the connection-establishment half
differs (WS upgrade handler vs QUIC accept/dial), but after
establishment the `Dispatcher` runs identically:
- Reads `EventEnvelope` frames from the WS message stream (deserialized
from binary WS messages — no `FrameFramedReader`).
- For `call.requested`: resolves the peer's identity (the bearer-token
identity resolved at upgrade time, stored on the connection), runs
`AccessControl::check(identity)` against the op's `AccessControl`,
dispatches via `OperationRegistry::invoke()` if allowed, returns
`FORBIDDEN` (→ `call.error`) before the handler runs if not.
- For `call.responded`/`call.completed`/`call.aborted`: correlates by
`id` via `PendingRequestMap` (keyed by request ID, not by transport —
ADR-012).
- Writes response `EventEnvelope` frames back as binary WS messages.
Peer authorization flows through the existing `AccessControl::check`
against the resolved identity — no `RemoteFilter`, no `remote_safe`
gate (retired by ADR-029 §3).
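Correlation by request ID, independent of transport, can be sketched with a map from ID to a one-way channel. `PendingRequests` and its methods are hypothetical and use `std::sync::mpsc` with `String` payloads for self-containment; the real `PendingRequestMap` is presumably async and carries envelopes.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Toy correlation table keyed by request ID (not by transport).
pub struct PendingRequests {
    waiting: HashMap<String, Sender<String>>,
}

impl PendingRequests {
    pub fn new() -> Self {
        Self {
            waiting: HashMap::new(),
        }
    }

    /// Registers an outgoing request; the caller waits on the returned receiver.
    pub fn register(&mut self, id: &str) -> Receiver<String> {
        let (tx, rx) = channel();
        self.waiting.insert(id.to_string(), tx);
        rx
    }

    /// Delivers a response to the waiter for `id`.
    /// Returns false if no request with that ID is pending.
    pub fn resolve(&mut self, id: &str, payload: String) -> bool {
        match self.waiting.remove(id) {
            Some(tx) => tx.send(payload).is_ok(),
            None => false,
        }
    }

    /// Drops every waiter (connection closed); their receivers observe an error.
    /// Returns how many requests were failed.
    pub fn fail_all(&mut self) -> usize {
        let count = self.waiting.len();
        self.waiting.clear();
        count
    }
}
```

The same table serves a QUIC stream or a WS message stream unchanged, because nothing in it refers to how the envelope arrived.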
### Using the exposed dispatch API
This task uses the `pub` dispatch API exposed by the
`dispatcher-transport-abstraction` task:
- `Dispatcher::dispatch_requested(connection, request_id, payload)`
for `call.requested` events.
- The `pub` abort-handling method — for `call.aborted` events.
- `CallConnection` constructed from the non-QUIC source (holding the
resolved bearer identity, a fresh Layer 2 overlay, a fresh
`PendingRequestMap`).
### Bidirectionality (websocket.md §"Bidirectionality")
The WS call-protocol session inherits the call protocol's native
bidirectionality: both sides can send `call.requested` frames. The
browser calls operations on the hub; the hub can call operations
registered on the browser's side, over the same session, using the same
`PendingRequestMap` and `EventEnvelope` framing as `alknet/call`.
The browser case where the client registers no operations of its own
is the common case: the server→client call direction is unused
because the hub has nothing to call on the browser. That is a use-case scoping,
not an architectural limitation. A browser that *does* expose ops
registers them in the connection-local Layer 2 overlay (the
`connection-overlay` task).
### Streaming: native `call.responded` events, no SSE (websocket.md §"Streaming")
A `Subscription` operation invoked over WS streams `call.responded`
events as binary WS messages directly — **no SSE `data:` framing**. SSE
is the `h2`/`http/1.1` streaming projection; on WS it is unnecessary
because WS is already a framed full-duplex channel. The browser receives
`call.responded` events one per WS binary message, with the same `id`
correlating them to the original `call.requested`; `call.completed`
closes the subscription; `call.aborted` closes it with an error frame.
On WS client disconnect (the browser closes the tab mid-subscription),
the WS handler detects the stream close and sends `call.aborted` for
the in-flight subscription, which cascades to descendants per ADR-016.
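From the receiving side, a subscription is a run of `call.responded` events sharing one `id`, ended by `call.completed` or `call.aborted`. The sketch below folds such a run; `Event`, `StreamEnd`, and `collect_subscription` are hypothetical names and payloads are plain strings rather than JSON values.

```rust
/// How a subscription ended (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum StreamEnd {
    Completed,
    Aborted,
}

/// Simplified inbound event: one per WS binary message.
pub struct Event<'a> {
    pub kind: &'a str,
    pub id: &'a str,
    pub payload: &'a str,
}

/// Collects the items of the subscription identified by `id`.
/// Returns the items seen so far and how the stream ended,
/// or `None` for the ending if no terminal event has arrived yet.
pub fn collect_subscription(events: &[Event<'_>], id: &str) -> (Vec<String>, Option<StreamEnd>) {
    let mut items = Vec::new();
    for event in events {
        // Events for other requests share the connection; skip them.
        if event.id != id {
            continue;
        }
        match event.kind {
            "call.responded" => items.push(event.payload.to_string()),
            "call.completed" => return (items, Some(StreamEnd::Completed)),
            "call.aborted" => return (items, Some(StreamEnd::Aborted)),
            _ => {}
        }
    }
    (items, None)
}
```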
## Acceptance Criteria
- [ ] WS upgrade route at `/alknet/call` (default, ADR-046 collision rule)
- [ ] Upgrade handler uses axum's `WebSocketUpgrade` extractor
- [ ] Bearer auth on upgrade request via shared `bearer_auth_middleware`
- [ ] No token → `401` (upgrade rejected)
- [ ] Token present but insufficient scopes → `403` at call time (not upgrade time)
- [ ] Resolved identity stored on the `CallConnection` (for observability + AccessControl)
- [ ] WS binary message = one `EventEnvelope` (JSON serde, no length prefix)
- [ ] No `FrameFramedReader`/`FrameFramedWriter` on the WS path (WS message boundary is delimiter)
- [ ] Text WS messages rejected (protocol-level close)
- [ ] `call.requested` → `Dispatcher::dispatch_requested` (the pub API)
- [ ] `AccessControl::check(identity)` gates every `call.requested`
- [ ] `FORBIDDEN` → `call.error` event (before handler runs)
- [ ] `call.responded`/`call.completed`/`call.aborted` correlated by `id` via `PendingRequestMap`
- [ ] Response `EventEnvelope` frames written as binary WS messages
- [ ] `call.aborted` → the pub abort-handling method
- [ ] Bidirectionality: hub can `call.requested` to browser-registered ops
- [ ] `Subscription` streams `call.responded` as binary WS messages (no SSE)
- [ ] `call.completed` closes subscription; `call.aborted` closes with error
- [ ] WS client disconnect mid-subscription → `call.aborted` (ADR-016 cascade)
- [ ] WS close → fail all pending, drop overlay (connection-local)
- [ ] Upgrade works over HTTP/1.1 (RFC 6455) and HTTP/2 (RFC 8441)
- [ ] Handler does not branch on HTTP version (WS frame stream is same post-upgrade)
- [ ] Integration test: WS upgrade → `call.requested` → `call.responded` round-trip
- [ ] Integration test: no Bearer token → 401
- [ ] Integration test: `AccessControl` denied → `call.error` FORBIDDEN
- [ ] Integration test: `Subscription` over WS → multiple `call.responded` + `call.completed`
- [ ] Integration test: WS disconnect mid-subscription → `call.aborted` cascade
- [ ] Integration test: text WS message → protocol close
- [ ] Integration test: bidirectional (hub calls browser-registered op)
- [ ] `cargo test -p alknet-http` succeeds
- [ ] `cargo clippy -p alknet-http --all-targets` succeeds with no warnings
## References
- docs/architecture/crates/http/websocket.md — full WS spec (upgrade handler, framing, dispatch, bidirectionality, streaming)
- docs/architecture/decisions/044-defer-webtransport-browsers-use-websocket.md — ADR-044 (WS is v1 browser path, no length prefix)
- docs/architecture/decisions/048-websocket-native-session-not-gateway.md — ADR-048 (native session, not gateway shape)
- docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012 (stream-agnostic correlation)
- docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (disconnect → abort cascade)
- docs/architecture/decisions/029-peer-graph-routing-model.md — ADR-029 §3 (AccessControl::check is sole gate)
- docs/architecture/decisions/046-assembly-layer-custom-http-routes.md — ADR-046 (collision rule for /alknet/call)
- /workspace/@alkdev/pubsub/src/event-target-websocket-client.ts — TypeScript prior art (EventEnvelope over WS binary messages)
## Notes
> The WS path is the native EventEnvelope session, not the gateway shape
> (ADR-048). The gateway endpoints are HTTP-only; discovery is via
> services/list/services/schema as call-protocol ops. The WS path carries
> no length prefix (ADR-044 Assumption 1 — the WS message boundary is the
> delimiter, unlike QUIC's 4-byte prefix). Text messages are rejected. The
> dispatch uses the pub API exposed by the dispatcher-transport-abstraction
> task (dispatch_requested + abort-handling + non-QUIC CallConnection).
> Bidirectionality: both sides can call.requested (ADR-043 §2 transferred
> per ADR-044 §3). Streaming is native call.responded events, no SSE. The
> default upgrade path is /alknet/call (namespaces away from reserved paths
> per ADR-046). This is the second-highest-risk task (after the transport
> abstraction) — the WS dispatch loop must be identical to the QUIC dispatch
> loop on the security axis (AccessControl, identity, abort cascade).
## Summary
> To be filled on completion