alknet-filesystem: POC Research Summary
Status: Research complete on the three highest-leverage unknowns (path-tree layer, write path, distributed sync); the approach is viable enough to spec. Remaining unknowns are implementation-scope, not feasibility.
Date: 2026-06-20
Scope: Captures what the three POC iterations proved, what unknowns they closed, what remains open, and the architectural direction they establish. Source material for the eventual alknet-filesystem crate spec.
Executive Summary
Two POC crates (alknet-filesystem-poc at /workspace/alknet-filesystem-poc and alknet-fs-sync-poc at /workspace/alknet-fs-sync-poc) were completed. Together they resolve the largest sources of feasibility uncertainty around building a content-addressed, branch-aware, mountable filesystem from three orthogonal layers: SQLite (path tree + application file format), iroh-blobs (content-addressed blob store), and honker (durable pub/sub + queue + locks inside the same SQLite file).
The POC was built in three iterations:
- Path-tree layer (Tier 1): proved that a SQLite-backed path tree over an iroh-blobs MemStore gives Fossil-style branching with free content dedup, honker notify-on-commit inside the same transaction as path-tree mutations, and free multi-tenant isolation via a bucket_id column. 8 tests.
- Write path: proved that "branch on write, merge on close" reconciles the fundamental mismatch between content-addressed storage (BLAKE3 must hash the complete file) and filesystem write semantics (chunks arrive incrementally, possibly out of order). A concurrent reader sees the old version until close() commits atomically; crash/abort leaves the old version intact. 7 tests.
- Distributed sync: proved that modeling the path tree as an automerge CRDT document, synced via automerge's sync protocol over iroh QUIC connections, gives multi-node convergence with local-latency writes. Concurrent writes to different paths converge cleanly; concurrent writes to the same path resolve via LWW (NFS-equivalent semantics). Content (blobs) and metadata (path tree) sync separately. 9 tests.
15 + 9 = 24 tests pass across two POC crates. All deps are published crates.io versions (no workspace path deps), so the POCs are portable.
The three layers compose cleanly for both the read path and the write path. The remaining unknowns (FsStore/redb vs SQLite, actual SFTP wiring, continuous sync and content fetch on read, GC/tag management) are implementation details rather than architectural risks.
Background: The Insight
The POC originated from a correction on X: SQLite is not just a database, it's a legitimate application file format. The key observations from the SQLite appfileformat page (https://sqlite.org/appfileformat.html):
- BLOBs < ~100KB are faster inline in SQLite than as filesystem files. This inverts the usual "databases are slow for big binaries" assumption at exactly the granularity iroh-blobs already cares about (16 KiB chunk groups).
- Atomic transactions over path-tree metadata, independent of content blobs.
- The schema is the documentation. An SQL schema defines the file format more concisely than a custom format spec.
iroh-blobs had already discovered this independently: its FsStore uses a hybrid approach (DESIGN.md, /workspace/iroh-blobs/DESIGN.md) with redb (an embedded KV DB, SQLite-shaped philosophy) for small blobs and filesystem files for large blobs, with four metadata tables (blobs, tags, inline_data, inline_outboard — see src/store/fs/meta/tables.rs). The path-tree layer is the missing piece iroh-blobs deliberately doesn't provide: iroh answers "given a BLAKE3 hash, where are the bytes?" but not "given a path, which hash does it resolve to?"
The architectural insight: SQLite for path edges, iroh for content, honker for coordination. Each layer does what it's best at, and the boundaries are clean.
The Three Layers
Layer 1: iroh-blobs — content-addressed blob storage
Crate: iroh-blobs 0.103 from crates.io, MemStore backend (no redb, no fsync rabbit hole — deliberate POC scope choice).
Source reviewed: /workspace/iroh-blobs (v0.100 local checkout) + published 0.103 source in cargo cache.
iroh-blobs provides content-addressed blob storage with BLAKE3 verified streaming. A blob is a sequence of bytes of arbitrary size, identified by its 32-byte BLAKE3 hash. Content is dedup'd by definition: same hash = same bytes. This is the layer that gives us the "many agents / many forks share most content" property — content is shared across branches because it lives under its hash, not under its path.
Key API surface used by the POC:
- MemStore::new(): in-memory store, no persistence
- store.blobs().add_slice(bytes).with_tag().await → TagInfo (returns the hash + creates a persistent named tag protecting the blob from GC)
- store.blobs().reader(hash) → BlobReader (implements AsyncRead + AsyncSeek)
- store.blobs().has(hash) → bool
One sharp edge worth recording: Hash::new(buf) computes a BLAKE3 hash of the input; Hash::from_bytes(bytes) wraps 32 raw bytes as a Hash. Round-tripping a hash through hex requires from_bytes, not new. This cost ~20 minutes of debugging and is worth documenting for anyone touching the iroh API.
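The distinction is easy to see without the iroh API. The following is a minimal sketch in plain Rust, where to_hex and from_hex are hypothetical helpers (not iroh functions): decoding the hex string gives back the original 32 bytes, and those bytes must be wrapped as-is rather than hashed a second time.

```rust
/// Encode 32 raw hash bytes as a 64-character lowercase hex string.
/// Hypothetical helper for illustration; not part of iroh-blobs.
fn to_hex(bytes: &[u8; 32]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decode a 64-character hex string back into the same 32 raw bytes.
/// Returns None on wrong length or non-hex characters.
fn from_hex(s: &str) -> Option<[u8; 32]> {
    if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        // Each output byte comes from two hex digits.
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}
```

With iroh-blobs, the array returned by a decoder like this is what gets passed to Hash::from_bytes. Passing it to Hash::new would instead produce the hash of the hash, which matches no stored blob.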
Layer 2: SQLite — path tree + application file format
Crate: rusqlite 0.39 (bundled), pinned to match honker-core's rusqlite version.
Source: src/schema.rs, src/fs.rs
SQLite stores the path tree: a mapping from (bucket, branch, path) to an iroh Link (64-hex BLAKE3 hash). The schema has four core tables:
- buckets: top-level isolation unit (multi-tenant). One row per tenant.
- branches: Fossil-style named snapshots with parent pointers. A branch starts empty and inherits everything from its parent.
- paths: one row per (bucket, branch, path) entry that has been overridden or created on this branch (only the delta from the parent). link is the BLAKE3 hash; size is cached so fstat is a single indexed lookup.
- tombstones: deletion markers. A path may exist on a parent branch but be deleted on a child; the tombstone stops the chain walk from inheriting.
Plus two write-path tables (see Write Path below):
- write_sessions: active write sessions (one per open file handle)
- write_chunks: chunks for active write sessions (one row per write() call, stored inline as BLOBs)
The core read operation is a recursive CTE that walks the branch parent chain:
-- Walk from the requested branch up through its ancestors, recording depth.
WITH RECURSIVE chain(branch_id, parent_id, depth) AS (
    SELECT b.id, b.parent_id, 0 FROM branches b
    JOIN buckets bk ON bk.id = b.bucket_id
    WHERE bk.name = ?1 AND b.name = ?2
    UNION ALL
    SELECT b.id, b.parent_id, c.depth + 1 FROM chain c
    JOIN branches b ON b.id = c.parent_id
)
SELECT p.kind, p.link, p.symlink_to, p.size
FROM chain c JOIN paths p ON p.branch_id = c.branch_id
-- A tombstone on this branch or on any closer branch hides the row.
LEFT JOIN tombstones t ON t.branch_id IN (SELECT branch_id FROM chain WHERE depth <= c.depth)
    AND t.path = p.path
WHERE p.path = ?3 AND t.id IS NULL
ORDER BY c.depth ASC LIMIT 1  -- closest branch wins
This returns the first matching path row that isn't tombstoned on a closer branch. The ORDER BY c.depth ASC LIMIT 1 correctly picks the closest branch's override. Even on the POC's naive schema, resolves are sub-millisecond in-memory.
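The semantics of the query can be restated as a plain loop. The sketch below is an in-memory model, not the POC's code: the Tree struct and its fields are hypothetical stand-ins for the branches, paths, and tombstones tables, and it assumes the parent chain has no cycles.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory stand-in for the branches, paths, and tombstones tables.
#[derive(Default)]
struct Tree {
    parent: HashMap<String, Option<String>>,  // branch -> parent branch
    paths: HashMap<(String, String), String>, // (branch, path) -> link
    tombstones: HashSet<(String, String)>,    // (branch, path)
}

impl Tree {
    /// Walk from `branch` toward the root; the closest branch wins.
    /// A tombstone met at or before a path row hides that row.
    fn resolve(&self, branch: &str, path: &str) -> Option<&str> {
        let mut current = Some(branch.to_string());
        while let Some(b) = current {
            let key = (b.clone(), path.to_string());
            if self.tombstones.contains(&key) {
                return None; // deleted on this branch: stop inheriting
            }
            if let Some(link) = self.paths.get(&key) {
                return Some(link.as_str());
            }
            // Move one step up the parent chain (None at the root).
            current = self.parent.get(&b).cloned().flatten();
        }
        None
    }
}
```

Checking the tombstone before the path row at each depth corresponds to the `depth <= c.depth` condition in the SQL: a tombstone at the same depth or closer excludes the row.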
Layer 3: honker — durable pub/sub + queue + locks inside the SQLite file
Crate: honker-core 0.2.4 (not honker — see below), bundled-sqlite feature.
Source: src/schema.rs (bootstrap), src/watch.rs (listener)
honker provides pub/sub, durable queues, named locks, rate limits, and a scheduler — all as SQL functions registered on your own rusqlite connection. The key integration point:
honker_core::apply_default_pragmas(conn)?; // WAL, synchronous=NORMAL, ...
honker_core::attach_notify(conn)?; // _honker_notifications table + notify() SQL function
honker_core::attach_honker_functions(conn)?; // enqueue, claim, lock_acquire, stream_publish, cron, ...
honker_core::bootstrap_honker_schema(conn)?; // queue/stream/scheduler tables
This registers notify(channel, payload) as a SQL scalar function on the same connection that owns the path-tree tables. The critical property: calling SELECT notify(...) inside the same transaction as a path-tree mutation means the event is atomic with the data change. A watcher wakes on commit, not on poll. A rolled-back mutation produces no event. This is the transactional-outbox pattern, built in.
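The commit/rollback behavior described above can be modeled in a few lines. This is a toy in-memory sketch of the transactional-outbox idea, not honker's implementation; Db, Txn, and their methods are hypothetical names.

```rust
/// Hypothetical in-memory model: committed rows plus the events that
/// watchers are allowed to see.
#[derive(Default)]
struct Db {
    rows: Vec<(String, String)>, // committed (path, link) rows
    delivered: Vec<String>,      // events visible to watchers
}

/// A transaction buffers both its writes and its notifications.
struct Txn<'a> {
    db: &'a mut Db,
    pending_rows: Vec<(String, String)>,
    pending_events: Vec<String>,
}

impl Db {
    fn begin(&mut self) -> Txn<'_> {
        Txn { db: self, pending_rows: Vec::new(), pending_events: Vec::new() }
    }
}

impl Txn<'_> {
    /// The mutation and its notification are staged together.
    fn put(&mut self, path: &str, link: &str) {
        self.pending_rows.push((path.to_string(), link.to_string()));
        self.pending_events.push(format!("put:{}", path));
    }

    /// Commit publishes rows and events as one unit.
    fn commit(self) {
        self.db.rows.extend(self.pending_rows);
        self.db.delivered.extend(self.pending_events);
    }

    /// Rollback drops both; no watcher ever sees the event.
    fn rollback(self) {}
}
```

In the POC the same guarantee comes from SQLite itself: the notification row is written by the same transaction as the path-tree rows, so there is no separate buffer to keep in sync.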
Why honker-core not honker: The honker crate opens its own SQLite connection and manages its own database handle. To get the SQL functions on your connection — the whole point of the transactional-outbox property — you need honker-core, which exposes attach_honker_functions(conn) for any rusqlite connection. The honker crate is the ergonomic Rust wrapper for the "honker manages everything" use case; honker-core is the foundation for the "I own the connection" use case. The POC is the latter.
honker is single-machine, file-backed (explicitly: "two servers writing the same .db over NFS is not a Honker deployment strategy"). That's fine for the local-VFS layer; the distributed part is iroh's job at the blob layer. The split is clean: honker coordinates local state + local workers; iroh coordinates cross-node content. They don't compete.
POC Iteration 1: Path-Tree Layer (Tier 1)
Scope: SQLite path-tree + iroh-blobs MemStore + honker notify, no SFTP, no FsStore, no network. The goal was to validate the SQLite+iroh seam and the branching model cheaply.
What it proved
1. SQLite is a workable path-tree layer over content-addressed blobs. Path rows store BLAKE3 hashes; bytes live in iroh. put/get roundtrips cleanly. Atomic path-tree mutations (rename, mkdir, unlink) compose with iroh blob add/delete. rename is O(1) on path edges regardless of file size — content stays in iroh under its hash, not under its path. This is the property git also gets right. (Test: put_get_roundtrip, rename_is_o1_on_path_edges)
2. Fossil-style branching gives free content sharing across forks. A child branch inherits parent files via the recursive-CTE chain walk. Same content hashes to the same link (free dedup via content addressing). Writes on a child are invisible on the parent. Tombstones hide parent files on child branches. (Tests: branch_inherits_parent_content, branch_modifications_do_not_leak_to_parent, tombstone_hides_parent_file_on_child, content_is_deduped_across_branches)
3. honker notify-on-commit works inside the same transaction. notify() is called inside the path-tree mutation; watcher receives the event on commit. (Test: watch_fires_on_commit)
4. Buckets (multi-tenancy) are free. A bucket_id column on every row = isolation. Alpha files don't leak into Beta. Auth is an adapter problem (which connection sees which buckets). (Test: multi_bucket_isolation)
The share-check demo
Running cargo run -- share-check demonstrates the content-sharing property end-to-end:
== content sharing across branches ==
main shared.txt = "hello from main\n"
agent-a shared.txt = "hello from main\n" ← inherited from parent, same hash
agent-a agent-a.txt = "agent a working here\n"
agent-a main-only.txt resolved = false ← tombstoned on agent-a
== dir listing on agent-a (parent chain walk) ==
agent-a.txt kind=File link=f7a5aa575a40…
shared.txt kind=File link=b9d5a428d102…
shared.txt is byte-identical across main and agent-a (content shared by hash). main-only.txt is hidden on agent-a by the tombstone. agent-a.txt is only on the branch.
POC Iteration 2: Write Path — "Branch on Write, Merge on Close"
Scope: Chunked writes (SFTP-style open → write at offset × N → close), crash/abort semantics, concurrent-reader isolation. The deal-breaker question: can content-addressed storage serve as the backend for a mountable filesystem's write path, or is the BLAKE3-must-hash-the-whole-thing constraint a fundamental mismatch?
The problem
BLAKE3 must hash the complete file to produce the content address. But filesystem writes arrive as chunks — SFTP writes in ~32KB chunks, possibly pipelined and out of order. You can't hash until you have all the bytes. Where do partial writes live? What does a concurrent reader see? What happens on crash?
The solution: "branch on write, merge on close"
A write session is a short-lived branch:
1. open(path, WRITE) creates a temp child branch of the target branch and inserts a write_sessions row.
2. write(offset, chunk) inserts a row in write_chunks: one transaction per chunk, crash-safe. Chunks may arrive out of order (SFTP pipelines writes). Chunk-sized BLOBs (~32KB) are SQLite's sweet spot per the appfileformat page's "BLOBs < 100KB faster inline" finding. The offset is the unique key, so two writes to the same offset overwrite (last-write-wins within a session, matching POSIX overlapping-write semantics).
3. Reads on the target branch during the session see the old version via the parent chain walk. This is POSIX "concurrent readers see old version until close commits", for free, from the branching model; no separate snapshot mechanism is needed.
4. close() assembles chunks in offset order → BLAKE3 hash → iroh add_bytes → updates the path row on the target branch → notify() → marks session closed. All the SQLite parts run in one transaction. The new version becomes visible atomically.
5. abort() or crash: the session row stays open/aborted, the target branch is untouched, and the old version is still readable. Orphaned sessions can be found and cleaned up later.
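The chunk handling in the steps above can be sketched in memory. This is a simplified model under stated assumptions, not the POC's WriteSession: chunks live in a BTreeMap instead of the write_chunks table, gaps are zero-filled (an assumption of this sketch), and hashing plus the path-row update are left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory write session: chunks keyed by byte offset.
struct WriteSession {
    chunks: BTreeMap<u64, Vec<u8>>,
}

impl WriteSession {
    fn open() -> Self {
        WriteSession { chunks: BTreeMap::new() }
    }

    /// Arrival order does not matter; a second write at the same offset
    /// replaces the first (last-write-wins within the session).
    fn write_chunk(&mut self, offset: u64, data: &[u8]) {
        self.chunks.insert(offset, data.to_vec());
    }

    /// Assemble the file in offset order. The real close() would hash these
    /// bytes with BLAKE3 and update the path row in one transaction.
    fn close(self) -> Vec<u8> {
        let mut file = Vec::new();
        for (offset, data) in self.chunks {
            let start = offset as usize;
            if file.len() < start + data.len() {
                file.resize(start + data.len(), 0); // zero-fill any gap
            }
            file[start..start + data.len()].copy_from_slice(&data);
        }
        file
    }
}
```

Note one simplification: when chunks at different offsets overlap, this sketch lets the higher offset win regardless of arrival order. Dropping the session without calling close() models abort: nothing outside the session has been touched.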
What it proved
| Test | What it proves |
|---|---|
| chunked_write_close_produces_correct_hash | Assembling chunks + BLAKE3 hashing produces the same hash as a whole-file write. Content addressing works through the chunk boundary. |
| chunked_write_out_of_order | SFTP-style pipelined writes (offset 6 before offset 0) assemble correctly. The offset is the key, not arrival order. |
| concurrent_reader_sees_old_version_during_write | The key POSIX property. During a write session, reads return the old version. After close(), reads return the new version. For free, from the branching model. |
| abort_leaves_old_version_untouched | Simulated crash/abort: old version survives, chunks discarded. |
| abort_on_new_file_leaves_no_trace | Aborting a write to a new path doesn't create a phantom entry. |
| large_chunked_file_writes_correctly | 1MB file in 32KB chunks assembles, hashes, and reads back correctly. Matches whole-file hash. |
| multiple_concurrent_write_sessions_same_path | Two sessions on the same path don't corrupt each other's chunks (session-scoped). Last close wins. |
The bug that was found and fixed
The initial chunk_idx = offset / 32768 computation was wrong — two writes at offsets 0 and 6 (both < 32768) collided on chunk_idx = 0, causing the INSERT OR REPLACE to overwrite the first chunk with the second. The out-of-order test caught this immediately. Fix: use the offset itself as the unique key. This is the kind of thing the POC exists to catch — a spec written without the POC would have shipped this bug.
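The collision is easy to reproduce in isolation. The sketch below uses a HashMap as a stand-in for the write_chunks table with INSERT OR REPLACE semantics; both function names are hypothetical.

```rust
use std::collections::HashMap;

/// Buggy keying from the first draft: offsets 0 and 6 both map to index 0,
/// so the second insert replaces the first.
fn store_by_chunk_idx(writes: &[(u64, &[u8])]) -> HashMap<u64, Vec<u8>> {
    let mut table = HashMap::new();
    for &(offset, data) in writes {
        table.insert(offset / 32768, data.to_vec());
    }
    table
}

/// Fixed keying: the offset itself is the unique key.
fn store_by_offset(writes: &[(u64, &[u8])]) -> HashMap<u64, Vec<u8>> {
    let mut table = HashMap::new();
    for &(offset, data) in writes {
        table.insert(offset, data.to_vec());
    }
    table
}
```

Feeding both functions the two writes from the out-of-order test (offset 6, then offset 0) leaves one row in the buggy table and two rows in the fixed one.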
Why SQLite + honker wins for the write path specifically
The write path is where the SQLite-vs-redb-vs-filesystem decision matters most, and SQLite wins for three reasons that all show up here:
- Chunk-sized BLOBs are SQLite's sweet spot. SFTP writes in ~32KB chunks. The appfileformat page's "BLOBs < 100KB are faster inline in SQLite than as filesystem files" finding is exactly this case. redb could do it too, but now you have two databases for one transaction.
- The chunks, the path tree, and the honker notification are one transaction. No dual-write between "where chunks live" and "where the path tree lives." No fsync ordering problem (the thing iroh's DESIGN.md spent the most words on: "files are hard," the bitfield/data/outboard write-ordering problem). One WAL, one commit boundary.
- honker coordinates the session. Named locks on the path prevent concurrent writers from stomping each other (not yet wired in the POC; see Open Unknowns). The write session can be a honker job: enqueue on open, track progress, close completes the job, notify fires on merge. Crash leaves an orphaned job you can see and clean up.
POC Iteration 3: Distributed Sync — Automerge CRDT over iroh QUIC
POC: alknet-fs-sync-poc (/workspace/alknet-fs-sync-poc)
Crates: automerge 0.10, iroh 1.0, iroh-blobs 0.103
Tests: 9/9 pass (4 local + 5 cross-node)
The question
The first two POCs proved the local architecture. The open question was: how do multiple nodes see each other's path-tree changes? The cache-invalidation/staleness problem — Node B has a local copy of hash X (correct, content-addressed), Node A updates the path to point to hash Y, B's cache is correct but stale.
The solution: path tree as automerge CRDT
Model the path tree as an automerge document. Each node has a local replica. Writes are local + immediate (no network latency). Sync happens via automerge's sync protocol over iroh's QUIC connections — gossip-style, eventually consistent. Conflicts on the same path merge via last-write-wins (LWW) on scalar values, which is what NFS does too.
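What "converge via LWW" means for a single path can be shown with a small register model. This is a simplified sketch, not automerge's data structure: the Lww struct is hypothetical, and the (counter, actor) tie-break is an assumption chosen for illustration. The point is only that a symmetric, deterministic rule makes both replicas pick the same winner.

```rust
/// A scalar register tagged with a logical clock and the writing node's id.
/// Hypothetical model; automerge resolves conflicts on operation ids inside
/// the document, not on a struct like this.
#[derive(Clone, Debug, PartialEq)]
struct Lww {
    counter: u64,
    actor: String,
    link: String,
}

impl Lww {
    /// Merge keeps the write with the greater (counter, actor) pair.
    /// The rule is symmetric, so merge order does not matter.
    fn merge(&self, other: &Lww) -> Lww {
        if (other.counter, &other.actor) > (self.counter, &self.actor) {
            other.clone()
        } else {
            self.clone()
        }
    }
}
```

Two nodes that each merge the other's write end up holding the same link, which is the property the same-path test checks.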
The architecture splits cleanly:
| Layer | What | How |
|---|---|---|
| Content (blobs) | File bytes, content-addressed | iroh-blobs — BLAKE3 hash → blob. Network-transferable via iroh's blob protocol. |
| Path tree (metadata) | Path → hash mapping, branches, tombstones | Automerge document, synced via iroh QUIC. Local replica per node. |
| Local write path | Chunked writes, crash safety | SQLite write_chunks + honker (from first POC). |
| Cross-node sync | "Node A changed path X to hash Y" | Automerge sync messages over iroh QUIC. Gossip-style, eventually consistent. |
Content and metadata sync are separate: automerge syncs the path tree (small — path edges only, not file bytes); iroh-blobs syncs the content (large — actual file bytes, content-addressed). A node that learns a new path→hash mapping via automerge sync can fetch the blob from a peer that has it via iroh-blobs' network protocol. This keeps the automerge doc small and the sync fast.
Document structure
{
  "branches": {            // branch_name → { parent: "main" | null }
    "main": { "parent": null },
    "agent-a": { "parent": "main" }
  },
  "paths": {               // "{branch}:{path}" → { kind, link, size }
    "main:hello.txt": { "kind": "file", "link": "abc123...", "size": 42 },
    "agent-a:agent.txt": { "kind": "file", "link": "def456...", "size": 20 }
  },
  "tombstones": {          // "{branch}:{path}" → true
    "agent-a:old.txt": true
  }
}
The branch chain walk (resolve) is done in Rust by reading the branches map and following parent pointers — same recursive logic as the SQLite CTE, but operating on the CRDT's local replica.
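As an illustration of working against the "{branch}:{path}" key layout, here is a sketch of a directory listing over a local replica. It is not the POC's AutomergePathTree: Replica is a hypothetical struct using plain std maps in place of the automerge document, values are reduced to the link string, and branch names are assumed not to contain a colon.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Hypothetical mirror of the document's three root maps.
#[derive(Default)]
struct Replica {
    branches: HashMap<String, Option<String>>, // branch -> parent
    paths: BTreeMap<String, String>,           // "{branch}:{path}" -> link
    tombstones: BTreeSet<String>,              // "{branch}:{path}"
}

impl Replica {
    /// List every path visible on `branch`: the closest branch wins, and a
    /// tombstone on a closer branch hides entries inherited from ancestors.
    fn list(&self, branch: &str) -> BTreeMap<String, String> {
        let mut visible = BTreeMap::new();
        let mut hidden = BTreeSet::new();
        let mut current = Some(branch.to_string());
        while let Some(b) = current {
            let prefix = format!("{}:", b);
            // Record this branch's tombstones before looking at its paths.
            for key in self.tombstones.iter().filter(|k| k.starts_with(&prefix)) {
                hidden.insert(key[prefix.len()..].to_string());
            }
            for (key, link) in self.paths.iter().filter(|(k, _)| k.starts_with(&prefix)) {
                let path = key[prefix.len()..].to_string();
                if !hidden.contains(&path) && !visible.contains_key(&path) {
                    visible.insert(path, link.clone());
                }
            }
            current = self.branches.get(&b).cloned().flatten();
        }
        visible
    }
}
```

Because the keys are flat strings, listing a branch is a prefix scan per chain level rather than a join, which is one consequence of flattening (branch, path) into a single map key.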
Sync protocol
Same pattern as the iroh-automerge example (/workspace/iroh-examples/iroh-automerge): length-prefixed automerge::sync::Message exchanged over a QUIC bidi stream. The PathTreeSync struct implements iroh::protocol::ProtocolHandler, accepting incoming sync connections and running the sync loop until both sides converge.
For full convergence with concurrent writes on both sides, 2-3 sync rounds are needed (A→B, B→A, A→B). In a production system, sync would be continuous (like the iroh-automerge-repo example with samod), not one-shot.
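The length-prefixed framing can be sketched over any byte stream. The functions below are hypothetical helpers, and the 4-byte big-endian prefix is an assumption of this sketch; the prefix width and byte order actually used by the iroh-automerge example are not stated here. The payload stands in for an encoded sync message.

```rust
use std::io::{self, Read, Write};

/// Write one frame: a big-endian u32 length followed by the payload.
/// The prefix format is an assumption of this sketch.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame written by write_frame.
/// Fails with UnexpectedEof if the stream ends mid-frame.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let mut payload = vec![0u8; u32::from_be_bytes(len_buf) as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

A sync loop built on this would alternate write_frame and read_frame on the two halves of the bidi stream until neither side has a further message to send.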
What it proved
| Test | What it proves |
|---|---|
| sync_node_a_write_visible_on_node_b | The core test. Node A writes a file, syncs to Node B, Node B resolves the same path to the same hash. CRDT sync works. |
| sync_concurrent_writes_different_paths_converge | Both nodes write different files concurrently, sync, both see both files. No conflict on different paths. |
| sync_concurrent_writes_same_path_lww | Both nodes write the SAME path with different values, sync, both converge to the same value (LWW). NFS-equivalent semantics. |
| sync_branch_inheritance_across_nodes | Node A creates a branch with a parent and a file. After sync, Node B resolves a file on the child branch by walking the parent chain, even though the branch was created on Node A. |
| content_separate_from_metadata_sync | Node A writes a file (path tree + blob), syncs path tree to Node B. B can resolve path→hash (metadata synced), but B's blob store is empty (content not synced). This validates the metadata/content separation: B knows the file exists but needs to fetch the content separately. |
The bug that was found and fixed: concurrent root-map initialization
The most important finding from this POC iteration: automerge concurrent put_object on the same key creates a conflict, not a merge. If two nodes independently call put_object(ROOT, "paths", Map), automerge sees two concurrent writes to the same key and keeps both as conflicting values. When merged, one map wins and the other's contents are invisible.
This affected two places:
- Root maps (branches, paths, tombstones): if both nodes lazily create them on first write, the maps conflict.
- Branch creation: if both nodes independently ensure_branch("main"), the branch objects conflict.
The fix: one node initializes the root maps and branches, syncs to the other node, then both write independently. The ensure_branch method creates all root maps eagerly (idempotent — only creates if absent). This mirrors the real use case: a node joins the network, syncs the current state, then writes locally. A spec must call this out as a design constraint: root structures must be created by one node and synced before other nodes write.
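The failure mode and the fix can both be reproduced in a toy model. The sketch below is not automerge: Doc, ObjId, and every method are hypothetical, and the winner rule (greatest id) is an assumption. It only captures the behavior described above, namely that concurrently created objects under one key stay separate objects, and readers see just one of them.

```rust
use std::collections::{BTreeMap, HashMap};

/// Object ids double as operation ids here: (counter, actor).
type ObjId = (u64, &'static str);

/// Toy document: a root key may hold several concurrent map objects.
#[derive(Clone, Default)]
struct Doc {
    root: HashMap<String, Vec<ObjId>>,
    objects: HashMap<ObjId, BTreeMap<String, String>>,
}

impl Doc {
    /// Create a fresh map object under `key` (what each node did lazily in the bug).
    fn put_object(&mut self, key: &str, id: ObjId) {
        self.root.entry(key.to_string()).or_default().push(id);
        self.objects.insert(id, BTreeMap::new());
    }

    /// Readers see only one winner among the conflicting objects.
    fn winner(&self, key: &str) -> Option<&BTreeMap<String, String>> {
        let id = self.root.get(key)?.iter().max()?;
        self.objects.get(id)
    }

    /// Write into whichever map this replica currently sees under `key`.
    fn put(&mut self, key: &str, k: &str, v: &str) {
        let id = *self.root[key].iter().max().expect("map exists");
        self.objects.get_mut(&id).unwrap().insert(k.into(), v.into());
    }

    /// Entries merge only within the same object; distinct objects under
    /// one key are kept side by side as a conflict.
    fn merge(&mut self, other: &Doc) {
        for (key, ids) in &other.root {
            let mine = self.root.entry(key.clone()).or_default();
            for id in ids {
                if !mine.contains(id) {
                    mine.push(*id);
                }
            }
        }
        for (id, map) in &other.objects {
            self.objects.entry(*id).or_default().extend(map.clone());
        }
    }
}
```

If two replicas each call put_object for "paths" and then write, a merge leaves one replica's entries invisible. If one replica creates the map, the other starts from a clone of it, and both then write, the merge contains both entries, which is the initialize-then-sync ordering the constraint requires.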
What this means for the spec
The three-layer architecture (SQLite path-tree, iroh content store, honker coordination) extends cleanly to the distributed case by making the path tree an automerge CRDT. The remaining unknowns are now all implementation-scope:
- SFTP wiring: mechanical, the Handler trait maps 1:1 to the PathTree + WriteSession API.
- FsStore/redb: needed for production persistence, coexists with automerge for the path tree.
- Continuous sync: the POC does one-shot sync; production needs continuous sync (like samod's Repo with sync_with).
- Content fetch on read: when a node learns a new path→hash but doesn't have the blob, it needs to fetch via iroh-blobs' network protocol. The content_separate_from_metadata_sync test validates this is the correct behavior.
- GC and tag management: same as before, plus cross-node GC coordination.
The stack
┌─────────────────────────────────────────────────────────────────┐
│ SFTP / SSH (russh-sftp Handler trait) — not yet wired │
│ open / read / write / close / readdir / rename / unlink / ... │
│ maps 1:1 to PathTree + WriteSession API │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ PathTree (SQLite) │
│ buckets, branches, paths, tombstones │
│ recursive-CTE chain walk for branch-aware resolve │
│ WriteSession: branch-on-write, merge-on-close │
│ honker notify() inside every mutation txn │
└────────────────────────────┬────────────────────────────────────┘
                             │
              ┌──────────────┴──────────────┐
              ▼                             ▼
┌─────────────────────────────┐ ┌────────────────────────────────┐
│ iroh-blobs (content store) │ │ honker (in the SQLite file) │
│ BLAKE3 content addressing │ │ notify/listen (watch) │
│ MemStore (POC) / │ │ durable queues (sync/replicate)│
│ FsStore (production, redb) │ │ named locks (writer coord) │
│ Tags + GC (mark-sweep) │ │ scheduler (maintenance cron) │
└─────────────────────────────┘ └────────────────────────────────┘
Layer separation
| Concern | Layer | Why |
|---|---|---|
| Path → content hash mapping | SQLite path tree | Hierarchical, indexed, transactional, schema-as-doc |
| Content bytes | iroh-blobs | Content-addressed, dedup, verified streaming, network transfer |
| Path-tree mutations | SQLite txn | Atomic, crash-safe, single WAL |
| Filesystem events (watch/inotify) | honker notify() | Atomic with the mutation, wake-on-commit not poll |
| Background sync/replication kicks | honker queues | Transactional outbox: enqueue in same txn as mutation |
| Writer coordination | honker named locks | Prevent concurrent writers on same path |
| Branching / forking | branches table + chain walk | Fossil-style, content shared by hash, only path edges diverge |
| Multi-tenancy | bucket_id column | Free — just a where-clause, auth is an adapter problem |
| Chunked writes | write_sessions + write_chunks | Branch-on-write, merge-on-close, crash-safe per-chunk |
SFTP mapping (conceptual, not yet wired)
The russh-sftp Handler trait (/workspace/russh-sftp/src/server/handler.rs) is a near 1:1 mirror of POSIX FS syscalls translated to SFTP packets. Each maps onto SQL operations against the path-tree tables:
| SFTP op | PathTree operation |
|---|---|
| open(path, O_RDONLY) | resolve(bucket, branch, path) → Link → iroh BlobReader |
| open(path, O_WRONLY) | WriteSession::open(bucket, branch, path) |
| write(handle, offset, data) | WriteSession::write_chunk(offset, data) |
| close(handle) | WriteSession::close() → hash → merge → notify |
| readdir(path) | list_dir(bucket, branch, path) |
| stat/lstat/fstat | resolve(...) → entry size/kind (indexed lookup, cheaper than real FS fstat) |
| rename(from, to) | PathTree::rename(...): O(1) on edges, content stays in iroh |
| remove(path) | PathTree::unlink(...): tombstone, content GC'd later |
| mkdir/rmdir | PathTree::mkdir/unlink with kind=dir |
| symlink/readlink | PathTree::symlink/resolve with kind=symlink |
| extended | SFTP escape hatch; could expose iroh-specific ops (get ticket, resolve to hash) |
The russh-sftp client's File already implements AsyncRead + AsyncSeek + AsyncWrite with pipelined writes (write_nowait + ack window), and the iroh BlobReader already supports range reads. The two trait surfaces line up. SeekFrom::End (round-trip-costly in real SFTP — calls fstat) becomes a single indexed SQLite lookup, so it's actually cheaper than a real FS fstat.
Multi-tenancy / buckets
The "bucket" concept maps to S3's bucket format (from the rudolfs reference, /workspace/@alkdev/alknet/docs/research/references/gitlfs/rudolfs-reference.md). In rudolfs, StorageKey = (Namespace, Oid) where Namespace = (org, project) — tenant isolation by URL path. In the POC, bucket_id on every row achieves the same isolation with a single where-clause. Auth from the SSH/SFTP connection determines which buckets are visible — it's an adapter problem, not a storage problem.
The rudolfs caching layer (LRU + disk cache → permanent storage, with fanout() to stream to both client and cache simultaneously) is a useful pattern for the production version: a local iroh-blobs FsStore as cache, a remote iroh node as permanent storage. The decorator composition pattern (Verify ↔ Encrypted ↔ Cached ↔ Retrying(Disk → S3)) translates directly: Verify becomes BLAKE3 verification (built into iroh), Cached becomes the local iroh store, S3 becomes the remote iroh node.
Open Unknowns (For Future POCs)
These are the unknowns that remain after the POC. None are feasibility blockers (the basic mechanics work); they are scope/work-quantity questions that affect spec sizing.
1. FsStore (redb + filesystem) vs SQLite — the two-database question (scoping, not feasibility)
The POC used MemStore deliberately — no redb, no fsync rabbit hole, no partial-file lifecycle. The production version needs FsStore for persistence of large blobs. The open question: is having both redb (iroh's metadata) and SQLite (our path tree) in one process a problem?
Two embedded databases means two WAL files, two fsync paths, two crash-recovery stories. The likely answer is "fine, they serve different purposes" — redb stores blob metadata + inline data, SQLite stores path trees + write chunks + honker tables. But it needs validation.
The alternative — forking iroh-blobs to use SQLite instead of redb — is a big maintenance commitment. The POC's write_chunks table proves SQLite can handle chunk-sized inline BLOBs at iroh's granularity, so the swap is mechanically possible. But it should only be done if the two-database coexistence proves problematic, not for aesthetics. A scoping probe would run FsStore alongside the SQLite path tree and measure: double-fsync overhead, WAL contention, operational confusion.
2. Incomplete blobs in a distributed context (partially solved)
The "many agents" scenario's incomplete-blob problem is now partially solved. The content_separate_from_metadata_sync test in the sync POC proves that a node can learn a new path→hash mapping via automerge sync without having the blob locally. The remaining design question: what does the read return when the blob isn't local? Does it block? Does it trigger a fetch? Does it return EIO? A design probe would wire iroh-blobs' network fetch into the read path: resolve → miss → async fetch from a peer that has it → block or return EIO temporarily. The honker queue is the coordination mechanism for background fetching.
3. SFTP wiring (mechanical, not design)
The Handler trait maps 1:1 to the PathTree + WriteSession API (see SFTP mapping table above). Wiring it is straightforward but non-trivial: handle management (open files, open directories), error code mapping (SQLite errors → SFTP status codes), fsync@openssh.com extension negotiation, and the extended channel for iroh-specific ops. A POC would implement the Handler trait and test with an actual sshfs mount.
4. honker named locks for writer coordination (mechanical)
The POC's multiple_concurrent_write_sessions_same_path test shows that two sessions on the same path don't corrupt each other's chunks (session-scoped), and last-close wins. But a real filesystem needs explicit locking: honker_lock_acquire('path:<bucket>:<branch>:<path>', writer_id, timeout) to prevent concurrent writers from stomping each other. The lock is a SQL function already registered on the connection — it just needs to be called in WriteSession::open. Not wired in the POC; straightforward to add.
5. GC and tag management (design)
iroh-blobs uses tags + mark-sweep GC: blobs are protected from deletion by tags (persistent or temp), and GC walks all tags to find reachable hashes, then sweeps everything else. The path tree needs to manage tags: when a path row points to a Link, that blob needs a persistent tag so it survives GC. When a path is tombstoned (unlinked), the tag can be removed and the blob becomes eligible for GC. The mapping from path rows to tags is the design question. A POC would wire tags.create() / tags.delete() into upsert_path / unlink and verify GC reclaims orphaned content.
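The mark-sweep idea can be sketched with live path rows acting as the GC roots. This is a toy model, not iroh-blobs' GC: the gc function is hypothetical, path rows stand in for tags (an assumption, since the row-to-tag mapping is exactly the open design question), and both stores are plain maps.

```rust
use std::collections::{HashMap, HashSet};

/// Mark: every link referenced by a live path row is reachable.
/// Sweep: drop every blob whose hash was not marked.
/// Returns the reclaimed hashes in sorted order.
fn gc(paths: &HashMap<String, String>, blobs: &mut HashMap<String, Vec<u8>>) -> Vec<String> {
    let marked: HashSet<&String> = paths.values().collect();
    let mut reclaimed: Vec<String> = blobs
        .keys()
        .filter(|hash| !marked.contains(hash))
        .cloned()
        .collect();
    reclaimed.sort();
    for hash in &reclaimed {
        blobs.remove(hash);
    }
    reclaimed
}
```

The model also shows why a per-path tag cannot simply be deleted on unlink: a blob shared by two branches must survive until the last path row pointing at it is gone.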
6. Branch chain depth and performance (perf, deferred)
The recursive CTE walks the full parent chain on every resolve. For shallow chains (2-3 levels, as in "main → agent-a → working-session") this is sub-millisecond. For deep chains (many nested forks, or long-lived agent workspaces with many snapshot branches), performance could degrade. A materialized view or a "resolved paths" cache table (updated on commit) would solve this if it becomes an issue. Worth a perf probe with realistic branch depths before spec.
7. Snapshot / commit semantics (design)
The POC has branches but no explicit "commit" or "snapshot" operation — a branch is just a name, and writes to it are immediate. A real filesystem (especially one backing git) needs snapshot points: "this branch was at this state at this time." The branches table has created_at but no snapshot history. The design question: is a snapshot a new branch (Fossil's model), or is it a recorded point-in-time within a branch (git's model)? Fossil's model maps more naturally to the existing recursive CTE.
Test Coverage
running 15 tests
test abort_leaves_old_version_untouched ... ok
test abort_on_new_file_leaves_no_trace ... ok
test branch_inherits_parent_content ... ok
test branch_modifications_do_not_leak_to_parent ... ok
test chunked_write_close_produces_correct_hash ... ok
test chunked_write_out_of_order ... ok
test concurrent_reader_sees_old_version_during_write ... ok
test content_is_deduped_across_branches ... ok
test large_chunked_file_writes_correctly ... ok
test multi_bucket_isolation ... ok
test multiple_concurrent_write_sessions_same_path ... ok
test put_get_roundtrip ... ok
test rename_is_o1_on_path_edges ... ok
test tombstone_hides_parent_file_on_child ... ok
test watch_fires_on_commit ... ok
POCs
alknet-filesystem-poc (local architecture)
src/
  main.rs           # CLI: mkfs, ls, put, get, rm, mv, branch, watch, share-check
  schema.rs         # SQLite schema (paths, branches, tombstones, write_sessions,
                    #   write_chunks) + honker bootstrap
  fs.rs             # PathTree: resolve, put_file, mkdir, symlink, unlink, rename,
                    #   list_dir, read_file, open_write
                    #   recursive-CTE chain walk for branch-aware resolve
  blob_bridge.rs    # iroh MemStore adapter: put_bytes → Link, get_bytes → bytes
  branch.rs         # Branch listing / creation
  write_session.rs  # Chunked write: open, write_chunk, close, abort
                    #   branch-on-write, merge-on-close
  watch.rs          # honker notify/listen wrapper for fs events
  lib.rs            # re-exports
tests/
  integration.rs    # 15 tests: 8 path-tree + 7 write-path
alknet-fs-sync-poc (distributed sync)
src/
  path_tree.rs      # AutomergePathTree: automerge-backed path tree with
                    #   branch chain walk, put_file, resolve, list_dir, unlink
  sync.rs           # PathTreeSync: iroh ProtocolHandler that exchanges
                    #   automerge sync::Messages over QUIC bidi streams
  blob_bridge.rs    # iroh MemStore adapter (same as first POC)
  lib.rs            # re-exports
tests/
  integration.rs    # 9 tests: 4 local + 5 cross-node sync
References
- POC 1 (local): /workspace/alknet-filesystem-poc - Cargo.toml, src/, tests/integration.rs (15 tests)
- POC 2 (sync): /workspace/alknet-fs-sync-poc - Cargo.toml, src/, tests/integration.rs (9 tests)
- SQLite appfileformat: https://sqlite.org/appfileformat.html
- iroh-blobs source (v0.100): /workspace/iroh-blobs - DESIGN.md (blob store tradeoffs, hybrid approach, files-are-hard), src/store/fs/ (FsStore, redb tables, EntryState), src/api/blobs.rs (Blobs API)
- iroh-blobs published (v0.103): cargo cache - store/mem.rs (MemStore), api/blobs.rs (AddProgress, BlobReader)
- honker: https://honker.dev/docs/ - honker-core API: attach_honker_functions, attach_notify, bootstrap_honker_schema, apply_default_pragmas
- honker source: ~/.cargo/registry/src/*/honker-core-0.2.4/src/ - lib.rs, honker_ops.rs
- russh-sftp source: /workspace/russh-sftp/src/ - server/handler.rs (Handler trait), client/fs/file.rs (pipelined AsyncWrite), client/fs/dir.rs
- rudolfs reference: /workspace/@alkdev/alknet/docs/research/references/gitlfs/rudolfs-reference.md - decorator pattern, LRU cache, namespace/bucket isolation, fanout streaming
- iroh-blobs research docs: /workspace/@alkdev/alknet/docs/research/references/iroh/iroh-blobs/ - overview, storage, key types, transfer protocol
- russh-sftp research docs: /workspace/@alkdev/alknet/docs/research/references/ssh/russh-sftp/ - overview, client API, server API, wire protocol
- iroh-automerge example: /workspace/iroh-examples/iroh-automerge/ - src/protocol.rs (SyncDoc over QUIC bidi stream pattern)
- iroh-automerge-repo example: /workspace/iroh-examples/iroh-automerge-repo/ - src/lib.rs (samod Repo with continuous sync, TokioFilesystemStorage)
- automerge source: ~/.cargo/registry/src/*/automerge-0.10.0/ - src/automerge.rs (AutoCommit, fork, merge), src/sync.rs (SyncDoc trait, Message), src/iter/map_range.rs (MapRangeItem)