Third POC iteration (alknet-fs-sync-poc, 9/9 tests) proves multi-node
path-tree sync:
- Path tree modeled as automerge CRDT document, synced via automerge's
sync protocol over iroh QUIC connections
- Each node has a local replica; writes are local + immediate (no
network latency); sync is async, gossip-style, eventually consistent
- Concurrent writes to different paths converge cleanly; concurrent
writes to same path resolve via LWW (NFS-equivalent semantics)
- Content (blobs) and metadata (path tree) sync separately — automerge
for path edges, iroh-blobs for file bytes
- Branch inheritance works through automerge sync
Key finding: automerge concurrent put_object on same key creates a
conflict, not a merge. Root structures must be created by one node and
synced before other nodes write. This is a design constraint for the
spec.
24 total tests pass across both POC crates. All remaining unknowns are
implementation-scope, not feasibility blockers.
Validates the three-layer architecture for a content-addressed, branch-aware,
mountable filesystem:
- SQLite path tree over iroh-blobs MemStore (15/15 tests pass)
- Fossil-style branching with free content dedup via BLAKE3 content addressing
- honker-core for notify-on-commit inside the same transaction as path-tree
mutations (transactional outbox pattern)
- Write path: "branch on write, merge on close" reconciles BLAKE3-must-hash-
complete-file with chunked filesystem writes; concurrent readers see old
version until close commits atomically; crash/abort leaves old version intact
- Multi-tenancy via bucket_id column (free isolation, auth is an adapter problem)
Remaining unknowns (FsStore/redb coexistence, distributed incomplete-blob reads,
SFTP wiring, GC/tag management, branch chain depth) are implementation-scope,
not feasibility blockers.
Documents the metatensor format: a binary data format where a TypeBox/jsonschema
schema describes the layout of binary data at schema-computed offsets. Extends
safetensors (fixed TensorRef schema) to arbitrary schemas, enabling struct tensors
(records), blob tensors (variable-length via indirection), and nested layouts.
Key points:
- TypeBox schemas render to standard JSON Schema; the jsonschema Rust crate
validates them with zero translation. Custom typedef.ts kinds (TFloat32,
TInt32, TStruct) map to jsonschema custom keywords via with_keyword().
- This eliminates typebox-rs as a schema engine — replaced by jsonschema +
a small offset-computation module + ~50 lines of custom keyword impls.
- Three tensor kinds: flat (safetensor today), struct (record of typed fields),
blob (struct tensor as index + flat tensor as data store, for variable-length)
- Memory-mappable: parse header, compute offsets, mmap data, typed views per
schema. No copy, no deserialization.
- QUIC-streamable: header is one small JSON message, each tensor is a separate
stream. Lazy loading, parallel transfer, incremental compute.
- ujsx-authorable: <Tensor>, <Struct>, <Field> as layout components, same
reconciler that diffs UI trees diffs model schemas. Model versioning is
tree diffing.
- Category-theory foundation: ujsx as universal typed-tree IR, HostConfig as
interpreter. <Tensor> is no stranger than <div>.
Adds a major section documenting how @alkdev/flowgraph (already npm-published,
uses ujsx) becomes the compute graph authoring and execution layer for
alknet-tensor, replacing webgpu-torch's imperative nn.Module hierarchy and
autograd recording with declarative ujsx templates and reactive DAG execution.
Key points documented:
- The ujsx tree IS the compute graph (CUDA-graphs-shaped but declarative)
- flowgraph's two HostConfigs: GraphologyHostConfig (compile/validate) and
ReactiveHostConfig (execute with signal-driven status propagation)
- nn modules become ujsx components, autograd becomes reverse tree walk
- Conditional/Map components enable dynamic structure CUDA graphs can't express
- Network-callable compute graphs (mix local + remote ops in one template)
- TSX authoring via standard JSX→h transform (ujsx jsx-runtime as target)
- graphology → petgraph port: ~15 API methods map 1:1, removes ~5400 lines of JS
- Updated POC priorities: end-to-end skeleton now includes flowgraph integration,
petgraph host port as a separate POC
Documents the architectural direction for a PyTorch-shaped tensor computation
library built on Rust + wgpu, where QuickJS is a thin API/composition layer
and Rust owns memory, dispatch, and WGSL codegen. Derived from webgpu-torch
as the reference design (op_spec → opgen → WGSL shader pipeline) but not a
port of its code — webgpu-torch is the reference, alknet-tensor is the
production architecture.
Key decisions: JS holds handles (BufferId), Rust owns wgpu::Buffers; ~4-5
high-level Rust ops (create_tensor/dispatch_kernel/register_kernel/read/write)
not ~20 low-level GPU API calls; WgslGenerator as a third handlebars backend
in typebox-rs codegen alongside RustGenerator and TypeScriptGenerator; tensor
ops as OperationSpecs on the registry (network-callable over irpc, verified
protocol-compatible on quickjs by POC 2).
Documents the downstream problems this solves as a side effect: distributed
compute over irpc, LLM-authored model code (toolEnv pattern), edge/embedded
tensor compute, the compositing problem sidestepped (compute has no surface),
and cross-platform by construction (wgpu's many backends).
The quickjs-reactive-probe was extended to load @alkdev/operations (registry,
call protocol, response envelopes, ACL, buildCallHandler) alongside the
reactive core. All five operations assertions pass on QuickJS-NG via rquickjs:
registry/execute/envelope/acl/callHandler. 271 modules loaded total.
This closes the third highest-leverage unknown: the operations protocol is
runtime-agnostic in practice, not just in theory. Adds a new section on the
QuickJS UDF host convergence — a minimal isolate speaking the same bidirectional
operations protocol as the TypeScript reference, the Rust alknet-call port,
and the planned NAPI/Python adapters, without needing Node/Deno/Bun. Connects
to the toolEnv WASM-QuickJS sandbox precedent at /workspace/toolEnv.
Captures the two completed POCs that resolve the highest-leverage unknowns
around the alknet-desktop direction (Rust + wgpu + rquickjs + ujsx over three.js):
- ui-spoke-poc: headless WebGPU rendering in Deno, three.js WebGPURenderer via
device-capture, MSDF text (the '2D UI is rocket surgery' subproblem)
- quickjs-reactive-probe: @preact/signals-core + @alkdev/typebox + @alkdev/ujsx
reconciler verified compatible with QuickJS-NG via rquickjs
Documents the rejected deno-desktop alternative, the established architectural
direction (head-worker over irpc/ALPN, two HostConfigs over one wgpu surface),
headless/headed parity via llvmpipe, the supply-chain surface reduction, and
the open unknowns that remain before SDD can begin.
Phase 5 now references the architect role and SDD process from
docs/sdd_process.md instead of creating ad-hoc spec stubs. Added
key new ADRs and architecture docs the architect will need to produce.
Updated gitserver reference note (MPL concern, archive it).
Kept rudolfs reference (MIT/Apache, fork candidate).
Also removed 'needs-update' status from the lifecycle states since
it's not part of the SDD process — stale docs get annotated with a
note and existing status, not a new status.
- 2.1: Add prerequisites note (verify call::frame module, ControlChannelRouter
wiring) before decomposition
- 2.2: Add raw framing auth design decision (first-frame auth event pattern
instead of per-frame auth) — simpler, more secure, matches InterfaceEvent model
- 2.3: Add InterfaceConfig restructuring note, TransportKind::WebTransport
tag addition (missed in Phase 1), note that TransportKind::Dns removal
is a no-op (never added). Add scheduling note: do 2.3 early since
subsequent tasks reference new trait names. Update ADR reference to 035.
- 2.4: Split into 2.4a (trait+enum+ConfigCredentialProvider) and 2.4b
(SecretStoreCredentialProvider, Phase 3). Clarify that the Phase 2 impl
is config-backed, not secret-backed.
- 2.5: Mark TransportKind::Dns removal as no-op since it was never added.
- 4.5: Note that doc sync round 1 is already done (commit cfc4400).
Second sync needed after implementation to capture any deviations.
- Open questions: Mark OQ-IF-01 and OQ-IF-02 as resolved with ADR-035
and ADR-031 references. Update OQ-P2-01 through P2-04 with ADR-036
and resolution status.
The axum router scaffold now only includes auth middleware and stealth
handoff — no operational routes or path conventions. External HTTP path
routing (from_openapi inverse, custom S3/git/OpenAI paths) is deferred
to Phase 5 since it depends on the spec-generation work.
- rustfs-events-select.md: deep dive into rustfs S3 event notification
system (9 target types, 30+ event types, rule engine, queue store)
and S3 Select (DataFusion-based SQL, CSV/JSON/Parquet input)
- honker-reference.md: deep dive into honker SQLite extension for
pub/sub, queue, and notification — core primitives, SQL API,
wake mechanism, single-machine design, and mapping to alknet
storage patterns
- definitions.md: formal term disambiguation for overloaded concepts
(service, interface, token, identity, domain) with cross-domain mapping
tables (alknet ↔ Keystone, distributed git, rustfs) and 8 open questions
- references/rustfs/: research on rustfs S3 store, Keystone/OIDC integration,
and credential mapping to CredentialSet
- references/gitserver/: research on gitserver library architecture and
integration paths as HTTP MessageInterface and SSH adapter
- references/openstack-keystone/: research on Keystone identity concepts
(tokens, scoping, service catalog, RBAC, trust delegation, federation)
and what alknet should adopt vs skip
- references/distributed-identity/: research on decentralized git, smart
contract ACL, on-chain identity, and Radicle comparison
Three research documents for Phase 2 planning:
- credential-provider.md: Outbound auth (CredentialProvider trait, CredentialSet enum),
account model as storage-layer concern (Identity.id as account UUID), SecretStoreCredentialProvider,
ManagedCredentialProvider, self-hosted service auth analysis (rustfs S3/OIDC, gitea OAuth2),
implementation phases A-D.
- interface-model.md: StreamInterface vs MessageInterface trait design, HTTP interface
as axum handler, DNS as MessageInterface, unified auth across all interfaces
(AuthToken + API keys via resolve_from_token), removal of TransportKind::Dns.
- tls-transport.md: Unified multi-interface architecture on port 443. Byte-peek protocol
detection (existing stealth mode) routes SSH vs axum. Axum multiplexes REST, WebSocket,
SSE, gRPC. QUIC/UDP with ALPN routing for WebTransport and iroh P2P. Single AuthToken
mechanism for all non-SSH interfaces. Four primitive operations (call/batch/schema/subscribe)
map to HTTP, MCP, and DNS.
Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to
reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy,
OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034)
for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05
in open-questions.md, deprecated hub terminology, undefined AuthService and noq
terms. Replace inline OQ text with cross-references per format rules. Add
ConfigServiceImpl definition to configuration.md. Port absolute workspace paths
to project-relative links by copying referenced docs (feasibility, certbot,
fail2ban, event_source_types) into docs/research/.
- Replace hub/spoke with head/worker terminology in call-protocol.md,
auth.md, open-questions.md, napi-and-pubsub.md
- Update operation paths from /{spoke}/{service}/{op} to
/{node}/{service}/{op} throughout call-protocol.md
- Unify Identity struct: auth.md already had {id, scopes, resources},
add note clarifying this is canonical (vs research/services.md which
used {node_id, fingerprint, scopes})
- Update integration-plan.md inconsistencies section to track what's
been fixed (hub/spoke, identity model) and expand service naming
to include external services
- Update call-protocol.md last_updated date
ADRs are intentionally left unchanged as historical records.
Organizes findings from the research phase (core, services, configuration,
storage, flow) into an actionable phased plan covering:
- Transport/Interface/Protocol three-layer model
- OperationEnv as universal composition mechanism (not replaced by irpc)
- Phase 0: Architecture foundation (9 ADRs, ~10 spec docs)
- Phase 1: Core modifications (config split, identity, forwarding, auth,
OperationEnv, interface abstraction)
- Phase 2: External crates (alknet-secret, alknet-storage, alknet-flowgraph)
- Phase 3: Integration and wiring
- Phase 4: Advanced features (DNS, WebTransport, app services)
Key clarifications: irpc services are one dispatch backend for OperationEnv,
not a replacement for it. DNS control channel is a (DNS transport, raw framing
interface) pair, not SSH-over-DNS. Call protocol and irpc operate at different
scope boundaries within Layer 3.
Document the OperationContext (request_id, identity, metadata, env, trusted),
OperationEnv (namespaced callables for handler composition), ResponseEnvelope
pattern, and how MCP/OpenAPI adapters map to the irpc service model.
- Replace hub/spoke terminology with head/worker throughout all research docs
- Add irpc service layer architecture (AuthProtocol, SecretProtocol,
ConfigProtocol, StorageProtocol)
- Add BIP39/SLIP-0010 HD key derivation for secrets management
- Add event boundary discipline (domain events vs integration events)
- Add application services layer (Docker, Node, Wallet, Proxy, Compute)
- New docs/research/services.md defining irpc service protocols
- Update core.md with service layer section and head/worker model
- Update configuration.md to delegate auth to AuthService (irpc)
- Update storage.md with secrets/key derivation and event boundaries
- Update flow.md with event boundary decision and cross-references
Unified authentication (ADR-023): SSH and WebTransport auth share the same
Ed25519 key material. Token auth uses signed timestamps verified against the
same authorized_keys set. IdentityProvider trait decouples core from identity
storage.
Bidirectional call protocol (ADR-024): Generalizes control channel (ADR-018)
to support hub→spoke and spoke→hub calls. Operation paths use /{spoke}/{service}/{op}
format for three-level routing. EventEnvelope wire format, five call events,
PendingRequestMap for correlation.
Handler/spec separation (ADR-025): Downstream consumers register operations
without modifying core. OperationRegistry maps paths to specs + handlers.
Service discovery via /services/list and /services/schema.
Resolves OQ-17 (transport-aware auth), OQ-21 (spoke routing), OQ-CFG-04 and
OQ-CFG-06 (WebTransport auth and transport-aware auth layer). Adds OQ-18
through OQ-22 for remaining open questions.