docs: add auth, call protocol architecture specs and ADRs 023-025

Unified authentication (ADR-023): SSH and WebTransport auth share the same
Ed25519 key material. Token auth uses signed timestamps verified against the
same authorized_keys set. IdentityProvider trait decouples core from identity
storage.

Bidirectional call protocol (ADR-024): Generalizes control channel (ADR-018)
to support hub→spoke and spoke→hub calls. Operation paths use /{spoke}/{service}/{op}
format for three-level routing. EventEnvelope wire format, five call events,
PendingRequestMap for correlation.

Handler/spec separation (ADR-025): Downstream consumers register operations
without modifying core. OperationRegistry maps paths to specs + handlers.
Service discovery via /services/list and /services/schema.

Resolves OQ-17 (transport-aware auth), OQ-21 (spoke routing), OQ-CFG-04 and
OQ-CFG-06 (WebTransport auth and transport-aware auth layer). Adds OQ-18
through OQ-22 for remaining open questions.

@@ -0,0 +1,85 @@
# ADR-023: Unified Authentication with Shared Key Material

## Status

Accepted

## Context

Wraith currently authenticates connections exclusively through SSH public key
auth in the SSH handshake. This works for SSH-over-any-transport (TCP, TLS,
iroh) because SSH carries its own auth protocol. But WebTransport and other
HTTP-level transports cannot perform SSH key exchange: browsers speak HTTP/3,
not SSH.

Without unification, non-SSH transports would need a completely separate
identity system (API keys, JWTs, session tokens). This creates two problems:
(1) operators manage two key sets with two rotation mechanisms, and (2) the
same person connecting via SSH and WebTransport appears as two different
identities.

The `IdentityProvider` trait is needed to decouple wraith-core from any
specific identity storage (config file vs. database). Without it, wraith-core
would either hardcode config-file-based auth or take a database dependency.
Neither is acceptable for a library crate.

## Decision

**Unified authentication**: The same Ed25519 key material (`authorized_keys`
and `cert_authorities`) is shared across both SSH auth and token auth. The
presentation differs per transport, but the verification result (an
`Identity` with scopes) is the same.

**Token auth for non-SSH transports**: WebTransport clients present a signed
timestamp token in the CONNECT request URL:

```
AuthToken = base64url(key_id || timestamp || signature)
key_id    = SHA-256 fingerprint of the Ed25519 public key (32 bytes)
timestamp = Unix seconds, big-endian u64 (8 bytes)
signature = Ed25519 sign(key_id || timestamp_bytes, private_key) (64 bytes)
```

The server extracts the fingerprint, looks it up in the same `authorized_keys`
set, verifies the signature, and checks the timestamp window (default ±300s).

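The verification steps above can be sketched as follows. This is a minimal illustration, not wraith's implementation: it works on the already base64url-decoded bytes, and the names `ParsedToken`, `TokenError`, `parse_token`, `check_window`, and `verify_token` are hypothetical. Key lookup and Ed25519 verification are passed in as closures so the sketch needs only the standard library; a real server would back them with the `authorized_keys` set and an Ed25519 library.

```rust
// Field sizes taken from the token layout: key_id || timestamp || signature.
const KEY_ID_LEN: usize = 32;
const TIMESTAMP_LEN: usize = 8;
const SIGNATURE_LEN: usize = 64;
const SIGNED_LEN: usize = KEY_ID_LEN + TIMESTAMP_LEN;
const TOKEN_LEN: usize = SIGNED_LEN + SIGNATURE_LEN;

/// Hypothetical parsed form of a decoded token.
#[derive(Debug, PartialEq)]
pub struct ParsedToken {
    pub key_id: [u8; KEY_ID_LEN],
    pub timestamp: u64,
    pub signature: [u8; SIGNATURE_LEN],
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum TokenError {
    BadLength(usize),
    UnknownKey,
    BadSignature,
    OutsideWindow { skew_secs: u64 },
}

/// Split the raw (already base64url-decoded) bytes into the three fields.
pub fn parse_token(raw: &[u8]) -> Result<ParsedToken, TokenError> {
    if raw.len() != TOKEN_LEN {
        return Err(TokenError::BadLength(raw.len()));
    }
    let mut key_id = [0u8; KEY_ID_LEN];
    key_id.copy_from_slice(&raw[..KEY_ID_LEN]);
    let mut ts = [0u8; TIMESTAMP_LEN];
    ts.copy_from_slice(&raw[KEY_ID_LEN..SIGNED_LEN]);
    let mut signature = [0u8; SIGNATURE_LEN];
    signature.copy_from_slice(&raw[SIGNED_LEN..]);
    Ok(ParsedToken { key_id, timestamp: u64::from_be_bytes(ts), signature })
}

/// Accept a timestamp only if it is within `window_secs` of `now`, in either direction.
pub fn check_window(timestamp: u64, now: u64, window_secs: u64) -> Result<(), TokenError> {
    let skew_secs = now.abs_diff(timestamp);
    if skew_secs > window_secs {
        Err(TokenError::OutsideWindow { skew_secs })
    } else {
        Ok(())
    }
}

/// Look up the key, verify the signature over key_id || timestamp_bytes,
/// then check the window. The two closures are stand-ins (assumptions).
pub fn verify_token<L, V>(
    raw: &[u8],
    now: u64,
    window_secs: u64,
    is_authorized: L,
    verify_sig: V,
) -> Result<ParsedToken, TokenError>
where
    L: Fn(&[u8; KEY_ID_LEN]) -> bool,
    V: Fn(&[u8; KEY_ID_LEN], &[u8], &[u8; SIGNATURE_LEN]) -> bool,
{
    let token = parse_token(raw)?;
    if !is_authorized(&token.key_id) {
        return Err(TokenError::UnknownKey);
    }
    if !verify_sig(&token.key_id, &raw[..SIGNED_LEN], &token.signature) {
        return Err(TokenError::BadSignature);
    }
    check_window(token.timestamp, now, window_secs)?;
    Ok(token)
}
```

With the default window, a caller would pass `300` as `window_secs` and the current Unix time as `now`; a token stamped 301 seconds away from `now` is rejected.
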
**`IdentityProvider` trait**: Decouples wraith-core from identity storage. The
trait resolves a fingerprint or token to an `Identity`. The default
implementation loads from `DynamicConfig.auth` (no database). A hub
implementation can back it with `@alkdev/storage`.

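One possible shape for this trait is sketched below. The method name `resolve`, the `Identity` fields, and `StaticIdentityProvider` are assumptions made for illustration; the ADR only states that the trait maps a fingerprint or token to an `Identity` with scopes. The sketch covers the fingerprint case only, with an in-memory map standing in for the config-backed default.

```rust
use std::collections::HashMap;

/// Hypothetical identity record: a name plus the scopes it is granted.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub name: String,
    pub scopes: Vec<String>,
}

/// Assumed trait shape: resolve a 32-byte key fingerprint to an identity.
pub trait IdentityProvider {
    fn resolve(&self, fingerprint: &[u8; 32]) -> Option<Identity>;
}

/// In-memory stand-in for a config-backed provider (no database).
pub struct StaticIdentityProvider {
    by_fingerprint: HashMap<[u8; 32], Identity>,
}

impl StaticIdentityProvider {
    pub fn new() -> Self {
        StaticIdentityProvider { by_fingerprint: HashMap::new() }
    }

    pub fn insert(&mut self, fingerprint: [u8; 32], identity: Identity) {
        self.by_fingerprint.insert(fingerprint, identity);
    }
}

impl IdentityProvider for StaticIdentityProvider {
    fn resolve(&self, fingerprint: &[u8; 32]) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }
}

/// Core-side code depends only on the trait, not on the storage behind it.
pub fn scopes_for(provider: &dyn IdentityProvider, fingerprint: &[u8; 32]) -> Vec<String> {
    provider
        .resolve(fingerprint)
        .map(|identity| identity.scopes)
        .unwrap_or_default()
}
```

A storage-backed provider would implement the same trait, so callers such as `scopes_for` would not change.
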
**`TokenKeySource::Shared`**: By default, token auth uses the same authorized
keys set as SSH auth. Deployments that want separate access control can use
`TokenKeySource::Separate` with a distinct key set.

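As a rough sketch of how the two variants might be consumed: the variant names come from the text above, while the payload of `Separate`, the `Fingerprint` alias, and `token_key_allowed` are assumptions for illustration.

```rust
use std::collections::HashSet;

/// 32-byte SHA-256 key fingerprint (alias introduced for this sketch).
pub type Fingerprint = [u8; 32];

pub enum TokenKeySource {
    /// Token auth reuses the SSH authorized keys set (the default).
    Shared,
    /// Token auth checks its own key set. The payload type is an assumption.
    Separate(HashSet<Fingerprint>),
}

/// Decide whether a fingerprint may authenticate via token auth.
pub fn token_key_allowed(
    source: &TokenKeySource,
    ssh_authorized: &HashSet<Fingerprint>,
    fingerprint: &Fingerprint,
) -> bool {
    match source {
        TokenKeySource::Shared => ssh_authorized.contains(fingerprint),
        TokenKeySource::Separate(token_keys) => token_keys.contains(fingerprint),
    }
}
```
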
**Replay protection via timestamps**: V1 relies on the timestamp window alone
(no server state). Zero-replay can be added later via a nonce
challenge-response without changing the key material.

## Consequences

- **Positive**: One key set, one rotation, one `reloadAuth()` call. Adding a
  key to `authorized_keys` immediately grants access via both SSH and
  WebTransport.
- **Positive**: `IdentityProvider` trait makes wraith-core independent of any
  specific database. Default: config file. Hub: `@alkdev/storage`.
- **Positive**: Browser clients can authenticate using Ed25519 keys via
  SubtleCrypto (Chrome 105+, Firefox 130+, Safari 17+). Deno supports it
  natively.
- **Positive**: No JWT library dependency. The token is a simple Ed25519
  signature over a fixed structure, built from the same primitives SSH
  already uses.
- **Negative**: V1 has a replay window (±300s). An attacker who intercepts a
  QUIC packet can replay the token within the window. This is acceptable
  because QUIC interception is the same threat level as connection hijacking.
- **Negative**: Certificate authority tokens are not supported in v1. CA
  verification requires the full OpenSSH certificate structure, which doesn't
  fit in a signed timestamp.
- **Negative**: Browser-side key management is less ergonomic than SSH key
  files. The private key must be imported into SubtleCrypto. This is a UI/UX
  concern, not a protocol concern.

## References

- [auth.md](../auth.md): Full auth architecture spec
- [ADR-012](012-auth-ed25519-and-cert-authority.md): Ed25519 + cert-authority auth
- [OQ-17](../open-questions.md): Transport-aware auth (resolved by this ADR)
- [configuration.md](../../research/configuration.md): OQ-CFG-04, OQ-CFG-06 (resolved)

@@ -0,0 +1,63 @@
# ADR-024: Bidirectional Call Protocol

## Status

Accepted

## Context

The wraith control channel (ADR-018) routes events from the client to the
server's event bus. This is unidirectional: clients can send events to the
server, but the server cannot call operations on the client. In the hub/spoke
model, spokes (dev env containers) connect to a hub and expose operations (fs,
bash, search) that the hub invokes. The hub needs to call *spoke* operations.

Additionally, the current control channel provides no request/response semantics.
Every consumer that needs call/response reinvents the pending-request correlation.

## Decision

The call protocol is bidirectional. Both sides can send `call.requested` and
receive `call.responded`. The protocol uses the `EventEnvelope` wire format
(4-byte big-endian length prefix + JSON), the same as `@alkdev/pubsub`.

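The framing described above (a 4-byte big-endian length followed by the JSON bytes) can be sketched with the standard library alone. The function names and the `max_len` guard are assumptions for illustration, and the JSON is treated as opaque bytes; the actual `EventEnvelope` fields are defined in the call protocol spec, not here.

```rust
use std::convert::TryFrom;
use std::io::{self, Read, Write};

/// Write one frame: 4-byte big-endian payload length, then the JSON bytes.
pub fn write_frame<W: Write>(writer: &mut W, json: &[u8]) -> io::Result<()> {
    let len = u32::try_from(json.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "frame too large"))?;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(json)
}

/// Read one frame, rejecting payloads longer than `max_len` bytes.
/// The size cap is an assumption of this sketch, not part of the ADR.
pub fn read_frame<R: Read>(reader: &mut R, max_len: u32) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf);
    if len > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds max_len"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because the length prefix delimits each envelope, several frames can be written back to back on one stream and read out one at a time.
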
Five event types: `call.requested`, `call.responded`, `call.completed`,
`call.aborted`, `call.error`.

A call is a subscription that resolves after one event. Calls and
subscriptions both use `call.requested` with a correlating `requestId`.
`PendingRequestMap` in core provides the correlation.

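A correlation map of this kind can be sketched as a `HashMap` from `requestId` to a channel sender. This is an illustration under assumptions: the method names (`register`, `resolve`, `cancel`) are hypothetical, it handles only the single-response call case, and it uses a blocking `std::sync::mpsc` channel where an async runtime would likely use a oneshot channel.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Tracks in-flight calls by request id (sketch; the real type may differ).
pub struct PendingRequestMap<T> {
    pending: HashMap<String, Sender<T>>,
}

impl<T> PendingRequestMap<T> {
    pub fn new() -> Self {
        PendingRequestMap { pending: HashMap::new() }
    }

    /// Record an outgoing call; the returned receiver yields its response.
    pub fn register(&mut self, request_id: &str) -> Receiver<T> {
        let (tx, rx) = channel();
        self.pending.insert(request_id.to_string(), tx);
        rx
    }

    /// Deliver a response. Returns false if nothing is waiting under that id
    /// or the caller has already dropped its receiver.
    pub fn resolve(&mut self, request_id: &str, response: T) -> bool {
        match self.pending.remove(request_id) {
            Some(tx) => tx.send(response).is_ok(),
            None => false,
        }
    }

    /// Drop an entry, for example on timeout or connection close.
    pub fn cancel(&mut self, request_id: &str) -> bool {
        self.pending.remove(request_id).is_some()
    }

    /// Number of calls still awaiting a response.
    pub fn len(&self) -> usize {
        self.pending.len()
    }
}
```

Removing the entry in `resolve` means a duplicate response for the same `requestId` is ignored rather than delivered twice.
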
Operation names use slash-based paths: `/{spoke}/{service}/{op}`. The first
path segment routes the call to the correct connected node. The hub's registry
maps spoke prefixes to connections. This mirrors iroh's ALPN dispatch: the
first segment is the routing key, remaining path dispatches within the node.

Core-provided operations use short paths without a spoke prefix
(`/services/list`, `/services/schema`). Spoke operations are prefixed
(`/dev1/fs/readFile`).

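The path scheme above can be illustrated with a small parser. It assumes that a two-segment path is a core operation and a three-segment path is a spoke operation, which matches the examples given but is a simplification; the `Route` type and `parse_path` are hypothetical names.

```rust
/// Where a call should be dispatched (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Route<'a> {
    /// Core-provided operation with no spoke prefix, e.g. `/services/list`.
    Core { service: &'a str, op: &'a str },
    /// Spoke operation, e.g. `/dev1/fs/readFile`; `spoke` is the routing key.
    Spoke { spoke: &'a str, service: &'a str, op: &'a str },
}

/// Parse an operation path. Returns None for a missing leading slash,
/// empty segments, or an unexpected number of segments.
pub fn parse_path<'a>(path: &'a str) -> Option<Route<'a>> {
    let rest = path.strip_prefix('/')?;
    let segments: Vec<&'a str> = rest.split('/').collect();
    if segments.iter().any(|segment| segment.is_empty()) {
        return None;
    }
    match segments.as_slice() {
        [service, op] => Some(Route::Core { service: *service, op: *op }),
        [spoke, service, op] => Some(Route::Spoke {
            spoke: *spoke,
            service: *service,
            op: *op,
        }),
        _ => None,
    }
}
```

A hub would use the `spoke` field to pick the connection and forward the remaining `/{service}/{op}` part to that node.
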
This generalizes ADR-018's control channel: the `wraith-*` destination becomes
a transport for `EventEnvelope` frames with call protocol semantics, instead of
raw pubsub dispatch.

## Consequences

- **Positive**: Hub can invoke operations on spokes. Dev env containers
  expose fs, bash, search; the hub calls them as needed.
- **Positive**: Browser clients can expose custom UDFs. Any connected participant
  can both call and serve operations.
- **Positive**: Built-in request/response correlation. One `PendingRequestMap`
  in core serves all consumers.
- **Positive**: Slash-based paths align with URL routing, OpenAPI, MCP, and
  iroh's ALPN dispatch. First segment = routing key.
- **Positive**: Multiple spokes exposing the same service (two dev envs both
  exposing `/fs/*`) are naturally differentiated by the spoke prefix.
- **Negative**: The `PendingRequestMap` adds in-memory state. Entries must be
  cleaned up on timeout or connection close.
- **Negative**: The hub must maintain a routing table mapping spoke identities
  to connections, with registration on connect and cleanup on disconnect.

## References

- [call-protocol.md](../call-protocol.md): Full call protocol spec
- [ADR-018](018-control-channel-for-pubsub.md): Control channel (generalized)
- [napi-and-pubsub.md](../napi-and-pubsub.md): NAPI wrapper and pubsub adapter

73
docs/architecture/decisions/025-handler-spec-separation.md
Normal file
@@ -0,0 +1,73 @@
# ADR-025: Handler/Spec Separation for Downstream Service Registration

## Status

Accepted

## Context

The current control channel (ADR-018) is hardcoded: `wraith-control:0` bridges
to the local pubsub event bus. If NAPI wants to expose `fs.readFile` or
`bash.exec` as callable operations, it has no way to register these with core's
channel routing. The NAPI handler would need to intercept channel data outside
of core.

For the hub/spoke model, spokes register their operations with the hub when
they connect. The hub's registry must include both hub-local operations and
remote operations exposed by spokes.

## Decision

Operation specs and handlers are supplied by downstream consumers rather than
hardcoded in core. Core provides the types and the registry:

1. `OperationSpec`: describes what an operation does (name, type, input/output
   schemas, access control)
2. `OperationHandler`: implements the operation logic
3. `OperationRegistry`: maps paths to specs + handlers
4. Built-in operations: `/services/list`, `/services/schema`

Downstream consumers register their own operations:

```rust
// NAPI layer registers dev env tools
registry.register(OperationSpec { name: "/fs/readFile", ... }, fs_read_handler);
registry.register(OperationSpec { name: "/bash/exec", ... }, bash_exec_handler);

// Browser client registers a custom UDF
registry.register(OperationSpec { name: "/notify/alert", ... }, notify_handler);
```

Operation names use slash-based paths: `/{spoke}/{service}/{op}`. The first
segment routes to the node. The `namespace` field on `OperationSpec` is
derived from the second path segment (`service`).

When spoke operations are registered with the hub, the hub adds the spoke
prefix: an operation that spoke "dev1" registers as `/fs/readFile` becomes
addressable as `/dev1/fs/readFile` in the hub's routing table.

The `/services/list` operation returns all registered specs. The
`/services/schema` operation returns the spec for a specific operation. These
are read-only; there are no admin operations.

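To make the registration and prefixing rules concrete, here is a reduced sketch of a registry. It is not the core API: `OperationSpec` is cut down to `name` and `namespace`, the handler is a plain string-to-string closure, and `OperationSpec::local`, `register_from_spoke`, `list`, and `call` are hypothetical names chosen for illustration.

```rust
use std::collections::BTreeMap;

/// Reduced spec for the sketch: only the path and the derived namespace.
#[derive(Debug, Clone, PartialEq)]
pub struct OperationSpec {
    pub name: String,
    pub namespace: String,
}

impl OperationSpec {
    /// Build a spec from a node-local path such as `/fs/readFile`;
    /// the namespace is the service segment (`fs`).
    pub fn local(path: &str) -> Self {
        let namespace = path
            .trim_start_matches('/')
            .split('/')
            .next()
            .unwrap_or("")
            .to_string();
        OperationSpec { name: path.to_string(), namespace }
    }
}

/// Handler signature is an assumption: string input, string output.
pub type OperationHandler = Box<dyn Fn(&str) -> String>;

pub struct OperationRegistry {
    ops: BTreeMap<String, (OperationSpec, OperationHandler)>,
}

impl OperationRegistry {
    pub fn new() -> Self {
        OperationRegistry { ops: BTreeMap::new() }
    }

    /// Register a spec and its handler under the spec's path.
    pub fn register(&mut self, spec: OperationSpec, handler: OperationHandler) {
        self.ops.insert(spec.name.clone(), (spec, handler));
    }

    /// Hub side: expose a spoke's operation under `/{spoke}` + its local path.
    pub fn register_from_spoke(
        &mut self,
        spoke: &str,
        mut spec: OperationSpec,
        handler: OperationHandler,
    ) {
        spec.name = format!("/{}{}", spoke, spec.name);
        self.register(spec, handler);
    }

    /// What a `/services/list` handler could return: every registered spec.
    pub fn list(&self) -> Vec<&OperationSpec> {
        self.ops.values().map(|(spec, _)| spec).collect()
    }

    /// Dispatch a call by full path; None if the path is not registered.
    pub fn call(&self, path: &str, input: &str) -> Option<String> {
        self.ops.get(path).map(|(_, handler)| handler(input))
    }
}
```

In this sketch the prefix is applied only on the hub, so a spoke keeps registering `/fs/readFile` locally and never needs to know the name the hub assigned to it.
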
## Consequences

- **Positive**: NAPI, Python, and any downstream consumer can register
  operations without modifying core.
- **Positive**: Service discovery is built in. Clients query `/services/list`
  to learn what operations a hub offers.
- **Positive**: Spoke prefix naturally differentiates multiple spokes exposing
  the same service (dev1 vs dev2).
- **Positive**: `AccessControl` on each `OperationSpec` enables per-operation
  authorization. Higher-risk operations (shell, filesystem write) can require
  tighter scopes.
- **Positive**: Schema exposure enables MCP adapter generation. `OperationSpec`
  maps directly to MCP tool definitions.
- **Negative**: The registry adds complexity. Core now owns `OperationSpec`,
  `OperationRegistry`, and `PendingRequestMap`.
- **Negative**: Namespace collisions between downstream consumers are possible.
  The spoke prefix mitigates this: `/dev1/fs/readFile` vs `/dev2/fs/readFile`.

## References

- [call-protocol.md](../call-protocol.md): Full call protocol spec
- [ADR-018](018-control-channel-for-pubsub.md): Control channel (generalized)
- `@alkdev/operations`: TypeScript `OperationSpec`, `CallHandler`, registry