docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs

2026-06-10 12:34:30 +00:00
parent 6e71d1f306
commit 5bb5e1064c
49 changed files with 9923 additions and 0 deletions


@@ -0,0 +1,98 @@
# iroh-docs: Overview and Architecture
> Reference document for the `iroh-docs` crate (v0.98.0).
> Source: `/workspace/iroh-docs`
## What Is iroh-docs?
`iroh-docs` is a Rust crate implementing **multi-dimensional key-value documents with an efficient synchronization protocol**. It provides:
1. **A CRDT-based document model** — Replicas (documents) hold entries identified by namespace + author + key, with content-addressed values (BLAKE3 hashes).
2. **Range-based set reconciliation** — An efficient sync protocol based on [Aljoscha Meyer's paper](https://arxiv.org/abs/2212.13567) for reconciling sets between peers.
3. **Live sync via gossip** — Real-time document updates propagated through an iroh-gossip swarm.
4. **Persistent storage** — A `redb`-backed store supporting both in-memory and file-based modes.
## High-Level Architecture
```
┌──────────────────────────────────────────────────────────────┐
│ Docs (Protocol) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Engine │ │
│ │ ┌──────────┐ ┌──────────────┐ ┌───────────────────┐ │ │
│ │ │ LiveActor│ │ GossipState │ │ SyncHandle/Actor │ │ │
│ │ │ (events) │ │ (iroh-gossip)│ │ (store + sync) │ │ │
│ │ └──────────┘ └──────────────┘ └───────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Replica │ │ SignedEntry │ │ Author/ │ │
│ │ (sync.rs) │ │ Entry/Record │ │ Namespace keys │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Store (redb) │ │
│ │ Authors │ Namespaces │ Records │ RecordsByKey │ ... │ │
│ └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
```
### Module Layout
| Module | Purpose |
|--------|---------|
| `sync.rs` | Core types: `Replica`, `Entry`, `SignedEntry`, `Record`, `RecordIdentifier`, `Capability`, events |
| `keys.rs` | Cryptographic key types: `Author`, `NamespaceSecret`, `AuthorId`, `NamespaceId` |
| `ranger.rs` | Range-based set reconciliation algorithm implementation |
| `heads.rs` | `AuthorHeads` — latest timestamps per author for efficient sync decisions |
| `store/` | Storage abstraction and `redb`-backed persistent store |
| `store/fs.rs` | File-based `Store` implementation with redb tables |
| `store/pubkeys.rs` | `PublicKeyStore` trait for caching expanded ed25519 public keys |
| `actor.rs` | `SyncHandle` / Actor — single-threaded executor for store and replica operations |
| `engine/` | Live sync coordination: `Engine`, `LiveActor`, `GossipState`, `NamespaceStates` |
| `engine/live.rs` | The `LiveActor` event loop: handles sync, gossip, content download |
| `engine/gossip.rs` | Integration with `iroh-gossip` for broadcasting document operations |
| `engine/state.rs` | `NamespaceStates` — tracks per-namespace, per-peer sync state |
| `net/` | Network protocol: ALPN `/iroh-sync/1`, connection handling |
| `net/codec.rs` | Wire codec: length-prefixed postcard-serialized `Message` frames |
| `protocol.rs` | `Docs` struct (the `ProtocolHandler`) and `Builder` |
| `api/` | irpc-based RPC API for external access |
| `ticket.rs` | `DocTicket` — shareable document capability + peer addresses |
## Key Design Principles
1. **Two-key identity model**: Every entry is uniquely identified by (namespace, author, key). The namespace key provides write authorization; the author key provides attribution.
2. **Content-addressed values**: Entries store a BLAKE3 hash + length, not the actual content. Content blobs are handled separately by `iroh-blobs`.
3. **Prefix deletion**: An entry with key "foo" supersedes all older entries by the same author whose keys start with the bytes "foo", such as "foo/bar" (prefix deletion semantics). An empty entry used this way acts as a tombstone. This enables hierarchical key structures.
4. **Last-writer-wins with per-author timestamps**: Entries are ordered by (timestamp, hash). Newer entries dominate older ones. Different authors can have entries for the same key simultaneously (multi-dimensional).
5. **Actor-based concurrency**: All store and replica mutations go through a single `SyncHandle` actor thread, eliminating the need for locks on the store.
6. **Event-driven live sync**: The `LiveActor` coordinates gossip, direct sync, and content downloads through a `tokio::select!` event loop.
## Dependencies
Key dependencies from `Cargo.toml`:
| Crate | Purpose |
|-------|---------|
| `iroh` | Networking: endpoints, connections, protocol routing |
| `iroh-blobs` | Content-addressed blob storage and transfer |
| `iroh-gossip` | Gossip protocol for broadcasting updates |
| `iroh-tickets` | Ticket-based sharing mechanism |
| `redb` | Embedded key-value store for persistence |
| `ed25519-dalek` | Ed25519 signatures for entries |
| `blake3` | Hashing (fingerprints + content hashes) |
| `postcard` | Serialization (wire format for sync protocol) |
| `irpc` / `noq` | RPC framework for API |
## Feature Flags
| Feature | Default | Description |
|---------|---------|-------------|
| `metrics` | Yes | Enables iroh-metrics instrumentation |
| `rpc` | Yes | Enables irpc-based RPC API (depends on `noq`) |
| `fs-store` | Yes | Enables persistent file-based store |


@@ -0,0 +1,201 @@
# iroh-docs: Document Model and CRDT Details
## Core Data Model
### Namespace (Document Identity)
A **Namespace** is the identity of a document. It consists of:
- **`NamespaceSecret`** — An Ed25519 signing key (32 bytes) that grants write capability
- **`NamespacePublicKey`** — The corresponding verifying key (32 bytes)
- **`NamespaceId`** — A `[u8; 32]` that is the byte representation of the public key; this serves as the unique identifier for a document/replica
```
NamespaceSecret (signing key) ──derives──▶ NamespacePublicKey (verifying key)
                                           ──into─────▶ NamespaceId ([u8; 32])
```
### Author (Writer Identity)
An **Author** represents a writer identity within a document. Multiple authors can write to the same namespace.
- **`Author`** — An Ed25519 signing key (32 bytes)
- **`AuthorPublicKey`** — The corresponding verifying key (32 bytes)
- **`AuthorId`** — A `[u8; 32]` byte representation of the public key
Authors are application-defined: an application might create one author per device, per user, or per session.
### Capability
Access to a document is controlled through a `Capability`:
```rust
pub enum Capability {
    Write(NamespaceSecret), // Full read-write access
    Read(NamespaceId),      // Read-only access (can sync but not insert)
}
```
Capabilities can be **merged** — a `Read` capability can be upgraded to `Write` if a matching `Write` is presented:
```rust
capability.merge(other_capability) // Read + Write → Write
```
The raw representation is `(u8, [u8; 32])` — a kind byte followed by 32 bytes of key material.
### Entry (The Fundamental Record)
An **`Entry`** is the core data unit, consisting of:
```rust
pub struct Entry {
    id: RecordIdentifier, // (namespace, author, key)
    record: Record,       // (hash, len, timestamp)
}
```
#### RecordIdentifier
```rust
pub struct RecordIdentifier(Bytes); // namespace[0..32] || author[32..64] || key[64..]
```
The key is a variable-length byte sequence. `RecordIdentifier` implements `Ord` by comparing namespace first, then author, then key — this ordering is critical for the range-based sync algorithm.
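
The layout can be sketched with a plain `Vec<u8>` standing in for `Bytes`. The helper names `record_id` and `split_id` are hypothetical and not part of the crate; the sketch only illustrates that lexicographic comparison of the concatenated bytes yields the (namespace, author, key) ordering, because the first two parts have a fixed width.

```rust
/// Hypothetical helper: concatenate namespace || author || key into one buffer.
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(64 + key.len());
    out.extend_from_slice(namespace);
    out.extend_from_slice(author);
    out.extend_from_slice(key);
    out
}

/// Hypothetical helper: split an identifier back into its three parts.
/// Returns `None` if the buffer is shorter than the 64-byte fixed header.
fn split_id(id: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    if id.len() < 64 {
        return None;
    }
    Some((&id[..32], &id[32..64], &id[64..]))
}
```

Comparing two such buffers with `<` compares the namespace bytes first, then the author bytes, and only then the key bytes.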
#### Record
```rust
pub struct Record {
    len: u64,       // byte length of the content
    hash: Hash,     // BLAKE3 hash of the content (32 bytes)
    timestamp: u64, // microseconds since Unix epoch
}
```
The `Record` comparison uses `(timestamp, hash)` ordering — this is the **Last-Writer-Wins** rule for same-key entries. When two records for the same key exist, the one with the higher timestamp wins; if timestamps are equal, the higher hash wins as a tiebreaker.
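
A minimal sketch of this ordering, assuming a simplified stand-in type (`RecordSketch` and `lww_winner` are illustrative names, and `len` is omitted because it does not take part in the comparison described above):

```rust
/// Simplified stand-in for `Record`. Field order matters for the derived
/// `Ord`: `timestamp` is compared first, `hash` second.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct RecordSketch {
    timestamp: u64, // microseconds since Unix epoch
    hash: [u8; 32], // content hash, used only as a tiebreaker
}

/// Pick the winning record under last-writer-wins.
fn lww_winner(a: RecordSketch, b: RecordSketch) -> RecordSketch {
    std::cmp::max(a, b)
}
```

A record with a later timestamp wins regardless of its hash; the hash only decides between records with identical timestamps.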
### SignedEntry (Entry with Proofs)
```rust
pub struct SignedEntry {
    signature: EntrySignature, // dual Ed25519 signatures
    entry: Entry,
}
```
#### EntrySignature
```rust
pub struct EntrySignature {
    author_signature: Signature,    // 64-byte Ed25519 signature
    namespace_signature: Signature, // 64-byte Ed25519 signature
}
```
Both signatures cover the canonical byte encoding of the `Entry` (id + record). This means:
- The **namespace signature** proves write authorization (only holders of `NamespaceSecret` can produce valid entries)
- The **author signature** proves authorship (provides attribution and non-repudiation)
#### Verification
```rust
fn verify<S: PublicKeyStore>(&self, store: &S) -> Result<(), SignatureError>
```
Verification requires both the `NamespacePublicKey` and `AuthorPublicKey`, which are derived from the entry's namespace and author IDs. The `PublicKeyStore` trait provides caching for these expanded keys.
### Empty Entries (Tombstones / Prefix Deletion)
An entry is **empty** when `hash == Hash::EMPTY && len == 0`. Empty entries serve as **deletion markers**:
- **Key deletion**: Inserting an empty entry with the exact key removes the previous entry for that key
- **Prefix deletion**: Inserting an empty entry with key "foo" removes all entries whose keys start with "foo" (prefix deletion)
```rust
pub async fn delete_prefix(&mut self, prefix: impl AsRef<[u8]>, author: &Author) -> Result<usize, InsertError>
```
### Insert Semantics (CRDT Rules)
When a `SignedEntry` is inserted into a replica via `Store::put()` (the ranger store trait):
1. **Check prefixes**: Look up all existing entries whose key is a **prefix** of the new entry's key. If any prefix entry has a value `>=` the new entry's value, the new entry is **rejected** (`InsertOutcome::NotInserted`).
2. **Remove dominated entries**: Remove all existing entries whose key **starts with** the new entry's key (i.e., the new key is a prefix of theirs) AND whose value is `<=` the new entry's value.
3. **Insert**: If not rejected, the new entry is stored.
This implements a **prefix-aware last-writer-wins** CRDT:
- Newer entries for the same (namespace, author, key) tuple replace older ones
- A new entry at key "/foo" can delete all entries under "/foo/*" if it's newer
- Different authors can coexist on the same key — each author's latest entry is kept
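
The three steps above can be sketched with a toy store scoped to a single (namespace, author) pair. `ToyStore`, `Outcome`, and the `(timestamp, tiebreak)` value tuple are stand-ins invented for this illustration; the real store works on `SignedEntry` values and redb tables.

```rust
use std::collections::BTreeMap;

/// Stand-in for a record's ordering value: (timestamp, tiebreak hash).
type Value = (u64, u64);

#[derive(Debug, PartialEq)]
enum Outcome {
    NotInserted,
    Inserted { removed: usize },
}

/// Toy store for one (namespace, author) pair: key bytes -> value.
#[derive(Default)]
struct ToyStore {
    entries: BTreeMap<Vec<u8>, Value>,
}

impl ToyStore {
    fn put(&mut self, key: &[u8], value: Value) -> Outcome {
        // 1. Reject if an existing entry whose key is a prefix of `key`
        //    (including `key` itself) has a value >= the new one.
        let dominated = self
            .entries
            .iter()
            .any(|(k, v)| key.starts_with(k) && value <= *v);
        if dominated {
            return Outcome::NotInserted;
        }
        // 2. Remove existing entries whose key starts with `key`
        //    and whose value is <= the new one.
        let before = self.entries.len();
        self.entries
            .retain(|k, v| !(k.starts_with(key) && *v <= value));
        let removed = before - self.entries.len();
        // 3. Insert the new entry.
        self.entries.insert(key.to_vec(), value);
        Outcome::Inserted { removed }
    }
}
```

Inserting `foo` with a newer timestamp than `foo/a` and `foo/b` removes both, and a later attempt to insert `foo/c` with an older timestamp than `foo` is rejected.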
### Timestamp and Future Shift
Timestamps are in **microseconds since Unix epoch**. There is a maximum allowed future shift:
```rust
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;
```
Entries with timestamps more than 10 minutes in the future of the local clock are rejected during validation.
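
The arithmetic behind the constant can be checked directly. The sketch below reuses the quoted expression and adds a hypothetical `too_far_in_future` helper; the comparison against `now + shift` is an assumption about how validation applies the limit, not code taken from the crate. The expression evaluates to 600,000, which is ten minutes when counted in milliseconds. If that value were compared directly against microsecond timestamps, the window would be 0.6 seconds, so the unit is worth confirming against the crate source.

```rust
use std::time::Duration;

/// Same expression as quoted above; it is evaluated at compile time.
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;

/// Assumed check: reject when the entry timestamp exceeds `now + max_shift`.
/// All three arguments must be expressed in the same unit.
fn too_far_in_future(entry_ts: u64, now: u64, max_shift: u64) -> bool {
    // `saturating_add` avoids overflow when `now` is close to `u64::MAX`.
    entry_ts > now.saturating_add(max_shift)
}
```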
### Content Status
Each entry's content has an availability status:
```rust
pub enum ContentStatus {
    Complete,   // Content blob is fully available locally
    Incomplete, // Partially available
    Missing,    // Not available
}
```
This status is communicated during sync to help peers decide whether to download content.
### AuthorHeads (Efficient Sync Optimization)
`AuthorHeads` tracks the latest timestamp for each author in a document:
```rust
pub struct AuthorHeads {
    heads: BTreeMap<AuthorId, Timestamp>,
}
```
This enables a quick check: `has_news_for(other)` — comparing local and remote heads to determine whether sync would yield any new entries. If all timestamps are at least as recent locally, no sync is needed.
`AuthorHeads` can be serialized with a size limit, dropping the oldest entries when the limit is exceeded.
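
The heads comparison can be sketched as follows. `HeadsSketch` is a stand-in type, and the return type (a count of authors with newer data) is an assumption modeled on the `Option<NonZeroU64>` that appears in the store's `has_news_for_us` signature.

```rust
use std::collections::BTreeMap;
use std::num::NonZeroU64;

type AuthorId = [u8; 32];

#[derive(Default)]
struct HeadsSketch {
    heads: BTreeMap<AuthorId, u64>,
}

impl HeadsSketch {
    /// Record a timestamp for an author, keeping only the latest one.
    fn insert(&mut self, author: AuthorId, timestamp: u64) {
        let head = self.heads.entry(author).or_insert(timestamp);
        *head = (*head).max(timestamp);
    }

    /// Count the authors for which `self` has something newer than `other`.
    /// `None` means `other` would learn nothing from syncing with `self`.
    fn has_news_for(&self, other: &Self) -> Option<NonZeroU64> {
        let updates = self
            .heads
            .iter()
            .filter(|(author, ts)| match other.heads.get(*author) {
                None => true,
                Some(other_ts) => *ts > other_ts,
            })
            .count() as u64;
        NonZeroU64::new(updates)
    }
}
```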
## Event System
Replicas emit events through a subscription system:
```rust
pub enum Event {
    LocalInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
    },
    RemoteInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
        from: PeerIdBytes,
        should_download: bool, // based on download policy
        remote_content_status: ContentStatus,
    },
}
```
Subscribers use `async_channel` for non-blocking notification delivery. The `ReplicaInfo::subscribe()` method registers a sender, and events are fanned out to all subscribers.
## Validation
Entry validation during insertion checks:
1. **Namespace match**: The entry's namespace must match the replica's namespace
2. **Signature verification**: For non-local entries, both namespace and author signatures are verified
3. **Timestamp check**: The entry must not be more than `MAX_TIMESTAMP_FUTURE_SHIFT` in the future
4. **Empty entry check**: An empty entry must have `hash == EMPTY && len == 0`, and a non-empty entry must have `len != 0`


@@ -0,0 +1,272 @@
# iroh-docs: Range-Based Set Reconciliation (Ranger)
## Overview
The sync protocol in iroh-docs is based on **Range-Based Set Reconciliation**, implementing the algorithm described in [Aljoscha Meyer's paper (arXiv:2212.13567)](https://arxiv.org/abs/2212.13567).
The core idea: two peers can efficiently compute the union of their entry sets by recursively partitioning the sets and comparing **fingerprints** (hashes) of partitions. When fingerprints match, no further work is needed. When they differ, the partition is subdivided until the difference can be resolved by sending the actual entries.
## Key Abstractions
### RangeEntry Trait
```rust
pub trait RangeEntry: Debug + Clone {
    type Key: RangeKey;
    type Value: RangeValue;
    fn key(&self) -> &Self::Key;
    fn value(&self) -> &Self::Value;
    fn as_fingerprint(&self) -> Fingerprint;
}
```
`SignedEntry` implements `RangeEntry`:
- `Key` = `RecordIdentifier` (namespace || author || key bytes)
- `Value` = `Record` (timestamp, hash, len)
- Fingerprint = BLAKE3 hash of (namespace || author || key || timestamp || content_hash)
### RangeKey Trait
```rust
pub trait RangeKey: Sized + Debug + Ord + PartialEq + Clone + 'static {
    fn is_prefix_of(&self, other: &Self) -> bool; // test-only
}
```
`RecordIdentifier` implements this via byte-level prefix matching: `(namespace, author, key)` where key prefix matching supports the hierarchical deletion semantics.
### RangeValue Trait
```rust
pub trait RangeValue: Sized + Debug + Ord + PartialEq + Clone + 'static {}
```
`Record` implements `RangeValue` with ordering by `(timestamp, hash)` — the Last-Writer-Wins ordering.
### Fingerprint
```rust
pub struct Fingerprint(pub [u8; 32]); // BLAKE3 hash
```
Fingerprints are computed by XOR-ing the individual entry fingerprints within a range. This means:
- The fingerprint of the empty set is `BLAKE3([])` (the hash of nothing)
- Adding/removing an entry toggles its contribution via XOR
- Equal sets produce equal fingerprints
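
A minimal sketch of the XOR combination, using a stand-in `Fp` type and caller-supplied per-entry fingerprints (computing them with BLAKE3 is left out so the example needs only the standard library). The `seed` argument plays the role of the empty-set fingerprint.

```rust
/// Stand-in for `Fingerprint`: a 32-byte value combined by XOR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fp([u8; 32]);

impl Fp {
    /// XOR another fingerprint into this one, byte by byte.
    fn xor_in(&mut self, other: &Fp) {
        for (a, b) in self.0.iter_mut().zip(other.0.iter()) {
            *a ^= *b;
        }
    }
}

/// Fold per-entry fingerprints into a range fingerprint, starting from `seed`.
fn range_fingerprint(seed: Fp, entries: &[Fp]) -> Fp {
    let mut acc = seed;
    for e in entries {
        acc.xor_in(e);
    }
    acc
}
```

Because XOR is commutative and self-inverse, the result does not depend on iteration order, and XOR-ing the same entry twice cancels it out. The converse of the last bullet holds only probabilistically: different sets can in principle share a fingerprint.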
## Range Concept
A `Range<K>` represents a half-open interval `[x, y)` in the key space, with special semantics:
```rust
pub(crate) struct Range<K> {
    x: K,
    y: K,
}
```
- `x == y`: The entire set (all elements)
- `x < y`: Standard half-open interval `[x, y)` — includes `x`, excludes `y`
- `x > y`: Wrapping range — elements from `x` to end + beginning to `y`
This wrapping range concept allows the algorithm to work with circular key spaces where the "first" element might be anywhere.
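
The three cases translate into a membership test along these lines. `RangeSketch` and its `contains` method are illustrative names; the sketch only assumes that keys implement `Ord`.

```rust
use std::cmp::Ordering;

/// Stand-in for `Range<K>` with the three cases described above.
struct RangeSketch<K> {
    x: K,
    y: K,
}

impl<K: Ord> RangeSketch<K> {
    fn contains(&self, t: &K) -> bool {
        match self.x.cmp(&self.y) {
            // x == y: the range covers the entire set.
            Ordering::Equal => true,
            // x < y: ordinary half-open interval [x, y).
            Ordering::Less => &self.x <= t && t < &self.y,
            // x > y: wraps around, so [x, end] plus [start, y).
            Ordering::Greater => &self.x <= t || t < &self.y,
        }
    }
}
```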
## Protocol Messages
```rust
pub type ProtocolMessage = crate::ranger::Message<SignedEntry>;
```
### Message Structure
```rust
pub struct Message<E: RangeEntry> {
    parts: Vec<MessagePart<E>>,
}

pub enum MessagePart<E: RangeEntry> {
    RangeFingerprint(RangeFingerprint<E::Key>), // "Here's a fingerprint for this range"
    RangeItem(RangeItem<E>),                    // "Here are the entries in this range"
}

pub struct RangeFingerprint<K> {
    range: Range<K>,
    fingerprint: Fingerprint,
}

pub struct RangeItem<E: RangeEntry> {
    range: Range<E::Key>,
    values: Vec<(E, ContentStatus)>,
    have_local: bool, // If true, sender already has these entries
}
```
The `have_local` flag is an optimization: when a peer sends entries AND indicates it already has them locally, the receiver doesn't need to send its own entries in that range back.
### Wire Format
Messages are serialized using `postcard` (a compact serde format) and framed with a 4-byte big-endian length prefix via `SyncCodec`:
```
┌─────────────────┬──────────────────────────────┐
│ u32 BE length │ postcard-encoded Message │
└─────────────────┴──────────────────────────────┘
```
Max message size: 1 GiB (`MAX_MESSAGE_SIZE = 1024 * 1024 * 1024`).
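
The framing can be sketched independently of `postcard` by treating the payload as opaque bytes. `encode_frame`, `decode_frame`, and `FrameError` are hypothetical names for this illustration and are not the `SyncCodec` API.

```rust
/// Upper bound stated in the document: 1 GiB.
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge(usize),
}

/// Prepend a 4-byte big-endian length to an already-serialized payload.
fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, FrameError> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(FrameError::TooLarge(payload.len()));
    }
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

/// Try to split one frame off the front of `buf`.
/// Returns `Ok(None)` when more bytes are needed, otherwise the payload
/// and the number of bytes consumed.
fn decode_frame(buf: &[u8]) -> Result<Option<(&[u8], usize)>, FrameError> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(FrameError::TooLarge(len));
    }
    if buf.len() < 4 + len {
        return Ok(None);
    }
    Ok(Some((&buf[4..4 + len], 4 + len)))
}
```

Checking the declared length before waiting for the body lets a reader reject an oversized frame without buffering it.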
## Sync Algorithm Walkthrough
### 1. Initiation (Alice → Bob)
Alice generates the initial message:
```rust
fn init<S: Store<E>>(store: &mut S) -> Result<Self, S::Error> {
    let x = store.get_first()?;           // First key, or default
    let range = Range::new(x.clone(), x); // "All elements" range
    let fingerprint = store.get_fingerprint(&range)?;
    // `parts` holds `MessagePart` values, so wrap the fingerprint in the enum variant
    let part = MessagePart::RangeFingerprint(RangeFingerprint { range, fingerprint });
    Ok(Message { parts: vec![part] })
}
```
This sends a single fingerprint covering the entire set.
### 2. Processing (Bob processes Alice's message)
For each part in the message:
**Case 1: RangeFingerprint matches local fingerprint** → Nothing to do, sets are equal in this range.
**Case 2: RangeFingerprint is empty OR range has ≤ 1 local entry** → Send all entries in the range as a `RangeItem`.
**Case 3: Recurse** → Split the range into `split_factor` partitions, compute fingerprints, and send either `RangeFingerprint` (if partition is large) or `RangeItem` (if partition is small enough, ≤ `max_set_size`).
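
The case analysis can be sketched on toy data. Everything here is a stand-in: entries are small `u8` values, `fingerprint` is a toy XOR fold rather than BLAKE3, `EMPTY_FP` is an arbitrary seed, and `respond` is a hypothetical name. The real implementation also chooses between fingerprints and items per partition using `max_set_size`, which this sketch leaves out.

```rust
/// Arbitrary stand-in for the empty-set fingerprint.
const EMPTY_FP: u64 = 0x9E37_79B9_7F4A_7C15;

/// Toy fingerprint: XOR one bit per entry into the seed.
/// Collision-free only for distinct entries below 64, enough for a demo.
fn fingerprint(entries: &[u8]) -> u64 {
    entries
        .iter()
        .fold(EMPTY_FP, |acc, &e| acc ^ (1u64 << (e % 64)))
}

#[derive(Debug, PartialEq)]
enum Step {
    /// Case 1: fingerprints match, nothing to send.
    Done,
    /// Case 2: send all local entries in the range.
    SendItems(Vec<u8>),
    /// Case 3: split into sub-ranges and continue per partition.
    Split(Vec<Vec<u8>>),
}

/// Decide how to answer a remote fingerprint for a range, given the
/// sorted local entries that fall into that range.
fn respond(local: &[u8], remote_fp: u64, split_factor: usize) -> Step {
    if fingerprint(local) == remote_fp {
        return Step::Done;
    }
    if remote_fp == EMPTY_FP || local.len() <= 1 {
        return Step::SendItems(local.to_vec());
    }
    // Ceiling division; yields at most `split_factor` partitions.
    let parts = split_factor.max(1);
    let chunk = (local.len() + parts - 1) / parts;
    Step::Split(local.chunks(chunk).map(|c| c.to_vec()).collect())
}
```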
### 3. Processing RangeItem
When a peer receives a `RangeItem`:
1. **Validate** each incoming entry using `validate_cb`
2. **Insert** valid entries via `Store::put()` (which handles prefix deletion)
3. **Notify** via `on_insert_cb` for actually-inserted entries
4. If `have_local` is false, compute the **diff** — entries in the local range not present in the received set — and send them back
### Configuration
```rust
struct SyncConfig {
    max_set_size: usize, // Default: 1 (entries to send before using fingerprints)
    split_factor: usize, // Default: 2 (number of partitions per recursion step)
}
```
With `max_set_size = 1` and `split_factor = 2`, the algorithm behaves like a binary search: each fingerprint mismatch splits the range in two and sends fingerprints for both halves.
## Store Trait
The `Store` trait provides the interface that the reconciliation algorithm needs:
```rust
pub trait Store<E: RangeEntry>: Sized {
    type Error: Debug + Send + Sync + Into<anyhow::Error> + 'static;
    type RangeIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
    type ParentIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
    fn get_first(&mut self) -> Result<E::Key, Self::Error>;
    fn get_fingerprint(&mut self, range: &Range<E::Key>) -> Result<Fingerprint, Self::Error>;
    fn entry_put(&mut self, entry: E) -> Result<(), Self::Error>;
    fn get_range(&mut self, range: Range<E::Key>) -> Result<Self::RangeIterator<'_>, Self::Error>;
    fn prefixes_of(&mut self, key: &E::Key) -> Result<Self::ParentIterator<'_>, Self::Error>;
    fn remove_prefix_filtered(&mut self, prefix: &E::Key, predicate: impl Fn(&E::Value) -> bool) -> Result<usize, Self::Error>;
    fn initial_message(&mut self) -> Result<Message<E>, Self::Error>;
    async fn process_message<F, F2, F3>(...) -> Result<Option<Message<E>>, Self::Error>;
    fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error>;
}
```
### Insert Semantics in `Store::put()`
The `put` method implements the CRDT insert logic:
```rust
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error> {
    // 1. Check prefix entries: if any parent entry has value >= new entry, reject
    for prefix_entry in self.prefixes_of(entry.key())? {
        // The iterator yields `Result<E, Self::Error>`, so unwrap each item first
        let prefix_entry = prefix_entry?;
        if entry.value() <= prefix_entry.value() {
            return Ok(InsertOutcome::NotInserted);
        }
    }
    // 2. Remove entries whose key is prefixed by new entry's key AND whose value is <=
    let removed = self.remove_prefix_filtered(entry.key(), |v| entry.value() >= v)?;
    // 3. Insert the new entry
    self.entry_put(entry)?;
    Ok(InsertOutcome::Inserted { removed })
}
```
### InsertOutcome
```rust
enum InsertOutcome {
    NotInserted,                 // A newer or equal entry already exists
    Inserted { removed: usize }, // Successfully inserted; reports removed entries
}
```
## Sync Flow at the Protocol Level
The `Replica` type provides the sync interface:
```rust
// Create initial message for sync
fn sync_initial_message(&mut self) -> anyhow::Result<ProtocolMessage>
// Process an incoming message and produce optional reply
async fn sync_process_message(
    &mut self,
    message: ProtocolMessage,
    from_peer: PeerIdBytes,
    state: &mut SyncOutcome,
) -> Result<Option<ProtocolMessage>, anyhow::Error>
```
### SyncOutcome
Tracks the result of a sync session:
```rust
pub struct SyncOutcome {
    pub heads_received: AuthorHeads, // Latest timestamps per author from remote
    pub num_recv: usize,             // Number of entries received
    pub num_sent: usize,             // Number of entries sent
}
```
## Network Protocol (Codec)
The sync protocol operates over a QUIC bidirectional stream:
1. **Alice** (initiator) sends `Message::Init { namespace, message }`
2. **Bob** (responder) validates the namespace and either:
   - Accepts and processes the initial message
   - Rejects with `Message::Abort { reason }`
3. Both peers exchange `Message::Sync(message)` rounds until one side has no reply (convergence reached)
The `BobState` manages the responder side, tracking namespace and `SyncOutcome` progress across message rounds.
### Abort Reasons
```rust
pub enum AbortReason {
    NotFound,       // Namespace not available
    AlreadySyncing, // Already syncing this namespace
    InternalServerError,
}
```
### Concurrent Sync Prevention
When both peers try to sync with each other simultaneously, the system uses a deterministic tiebreaker based on comparing `EndpointId` bytes — the peer with the larger ID accepts, the other connects.
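
A minimal sketch of such a tiebreaker, assuming 32-byte IDs compared lexicographically. The names `SyncDirection` and `expected_sync_direction` are chosen for illustration and should not be read as the crate's API.

```rust
#[derive(Debug, PartialEq)]
enum SyncDirection {
    /// We keep the incoming connection and act as responder.
    Accept,
    /// We keep our outgoing connection and act as initiator.
    Dial,
}

/// Deterministic tiebreaker: the peer with the larger ID accepts.
fn expected_sync_direction(our_id: &[u8; 32], their_id: &[u8; 32]) -> SyncDirection {
    if our_id > their_id {
        SyncDirection::Accept
    } else {
        SyncDirection::Dial
    }
}
```

Because both peers evaluate the same comparison with the arguments swapped, two peers with distinct IDs always arrive at opposite roles, so exactly one of the two concurrent connections survives.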


@@ -0,0 +1,257 @@
# iroh-docs: Store and Persistence
## Store Architecture
The store is implemented in `store::fs::Store` using `redb`, an embedded key-value database. It supports two modes:
- **In-memory**: `Store::memory()` — backed by a `Vec<u8>` via `redb::backends::InMemoryBackend`
- **Persistent**: `Store::persistent(path)` — backed by a single file on disk
Both modes use the same `redb` table structure.
## redb Table Schema
### Authors Table
```
Table: "authors-1"
Key: [u8; 32] (AuthorId)
Value: [u8; 32] (Author secret key bytes)
```
### Namespaces Table
```
Table: "namespaces-2"
Key: [u8; 32] (NamespaceId)
Value: (u8, [u8; 32]) (CapabilityKind, key bytes)
```
The `CapabilityKind` discriminates between `Write = 1` (full key stored) and `Read = 2` (only the public key / namespace ID stored).
### Records Table (Primary)
```
Table: "records-1"
Key: (NamespaceId, AuthorId, key_bytes) = ([u8; 32], [u8; 32], &[u8])
Value: (timestamp, namespace_sig, author_sig, len, hash) = (u64, &[u8; 64], &[u8; 64], u64, &[u8; 32])
```
This is the main table storing all document entries. The key layout `(namespace, author, key)` enables efficient range queries for the sync algorithm.
### Latest-Per-Author Table
```
Table: "latest-by-author-1"
Key: (NamespaceId, AuthorId) = (&[u8; 32], &[u8; 32])
Value: (timestamp, key_bytes) = (u64, &[u8])
```
Used to quickly determine the latest entry timestamp for each author, supporting `AuthorHeads` computation and `has_news_for_us()` checks.
### Records-By-Key Table (Index)
```
Table: "records-by-key-1"
Key: (NamespaceId, key_bytes, AuthorId) = (&[u8; 32], &[u8], &[u8; 32])
Value: ()
```
An index table that enables efficient queries by key prefix, supporting `Query::key_prefix()` and `Query::key_exact()` lookups.
### Namespace Peers Table (Multimap)
```
MultimapTable: "sync-peers-1"
Key: &[u8; 32] (NamespaceId)
Value: (Nanos, &PeerIdBytes) (timestamp_nanos, peer_id)
```
Stores up to 5 (`PEERS_PER_DOC_CACHE_SIZE`) recently-useful peers per namespace. This is an LRU cache: once it is full, registering a new peer evicts the oldest one.
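
The eviction behavior can be sketched with an in-memory list instead of a redb multimap. `PeerCache` and its methods are hypothetical; treating re-registration as a timestamp refresh is an assumption based on the cache being described as LRU.

```rust
const PEERS_PER_DOC_CACHE_SIZE: usize = 5;

/// In-memory stand-in for the per-namespace peer list.
#[derive(Default)]
struct PeerCache {
    peers: Vec<(u64, [u8; 32])>, // (timestamp_nanos, peer_id)
}

impl PeerCache {
    /// Register a peer as useful at `now_nanos`, evicting the oldest if full.
    fn register(&mut self, peer: [u8; 32], now_nanos: u64) {
        // Assumed: re-registering an existing peer refreshes its timestamp.
        self.peers.retain(|(_, p)| *p != peer);
        self.peers.push((now_nanos, peer));
        if self.peers.len() > PEERS_PER_DOC_CACHE_SIZE {
            // Find and drop the entry with the smallest timestamp.
            let oldest = self
                .peers
                .iter()
                .enumerate()
                .min_by_key(|(_, (ts, _))| *ts)
                .map(|(i, _)| i);
            if let Some(i) = oldest {
                self.peers.remove(i);
            }
        }
    }

    fn contains(&self, peer: &[u8; 32]) -> bool {
        self.peers.iter().any(|(_, p)| p == peer)
    }
}
```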
### Download Policy Table
```
Table: "download-policy-1"
Key: &[u8; 32] (NamespaceId)
Value: &[u8] (postcard-encoded DownloadPolicy)
```
Per-namespace download policies controlling which content blobs to automatically download.
## Store Operations
### Transaction Model
The `Store` uses a "current transaction" approach:
```rust
enum CurrentTransaction {
    None,
    Read(ReadOnlyTables),
    Write(TransactionAndTables),
}
```
- Read operations obtain a read snapshot
- Write operations batch into a write transaction
- Transactions older than `MAX_COMMIT_DELAY` (500ms) are automatically committed
- `flush()` commits any pending write transaction
### Core Methods
```rust
// Create/open/close replicas
fn new_replica(&mut self, namespace: NamespaceSecret) -> Result<Replica<'_>>;
fn open_replica(&mut self, namespace_id: &NamespaceId) -> Result<Replica<'_>>;
fn close_replica(&mut self, id: NamespaceId);
fn import_namespace(&mut self, capability: Capability) -> Result<ImportNamespaceOutcome>;
// Author management
fn new_author<R: CryptoRng>(&mut self, rng: &mut R) -> Result<Author>;
fn import_author(&mut self, author: Author) -> Result<()>;
fn get_author(&mut self, author_id: &AuthorId) -> Result<Option<Author>>;
fn delete_author(&mut self, author: AuthorId) -> Result<()>;
// Queries
fn get_many(&mut self, namespace: NamespaceId, query: impl Into<Query>) -> Result<QueryIterator>;
fn get_exact(&mut self, namespace: NamespaceId, author: AuthorId, key: impl AsRef<[u8]>, include_empty: bool) -> Result<Option<SignedEntry>>;
fn get_latest_for_each_author(&mut self, namespace: NamespaceId) -> Result<LatestIterator<'_>>;
// Sync support
fn has_news_for_us(&mut self, namespace: NamespaceId, heads: &AuthorHeads) -> Result<Option<NonZeroU64>>;
fn get_sync_peers(&mut self, namespace: &NamespaceId) -> Result<Option<PeersIter>>;
fn register_useful_peer(&mut self, namespace: NamespaceId, peer: PeerIdBytes) -> Result<()>;
// Content
fn content_hashes(&mut self) -> Result<ContentHashesIterator>;
```
### ImportNamespaceOutcome
```rust
pub enum ImportNamespaceOutcome {
    Inserted, // New namespace created
    Upgraded, // Existing namespace upgraded from Read to Write
    NoChange, // Namespace already existed with same or higher capability
}
```
## Query System
The `Query` type supports flexible entry lookups:
```rust
pub struct Query {
    kind: QueryKind,
    filter_author: AuthorFilter,
    filter_key: KeyFilter,
    limit: Option<u64>,
    offset: u64,
    include_empty: bool,
    sort_direction: SortDirection,
}
```
### Query Kinds
```rust
enum QueryKind {
    Flat(FlatQuery),                             // Returns all matching entries
    SingleLatestPerKey(SingleLatestPerKeyQuery), // Returns only latest entry per key
}
```
- **Flat**: Returns all entries matching the filters, sorted by `(namespace, author, key)` or `(namespace, key, author)` depending on `SortBy`
- **SingleLatestPerKey**: Groups by key and returns only the latest entry (by record value ordering) per key
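
The grouping behind `SingleLatestPerKey` can be sketched over an in-memory slice. `Row` is a simplified stand-in for an entry (one-byte author and hash), and the function is illustrative rather than the store's query implementation.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an entry.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: Vec<u8>,
    author: u8,
    timestamp: u64,
    hash: u8,
}

/// Keep, for each key, the row with the greatest (timestamp, hash).
/// The result is sorted by key because a `BTreeMap` is used for grouping.
fn single_latest_per_key(rows: &[Row]) -> Vec<Row> {
    let mut best: BTreeMap<&[u8], &Row> = BTreeMap::new();
    for row in rows {
        best.entry(row.key.as_slice())
            .and_modify(|cur| {
                if (row.timestamp, row.hash) > (cur.timestamp, cur.hash) {
                    *cur = row;
                }
            })
            .or_insert(row);
    }
    best.into_values().cloned().collect()
}
```

A flat query over the same rows would return every row; this variant collapses the author dimension and keeps one winner per key.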
### Filters
```rust
enum KeyFilter {
    Any,           // Match all keys
    Exact(Bytes),  // Exact key match
    Prefix(Bytes), // Key starts with prefix
}
enum AuthorFilter {
    Any,             // Match all authors
    Exact(AuthorId), // Match specific author
}
```
### Builder Pattern
```rust
// Get all entries
Query::all()
// Get entries by author
Query::author(author_id)
// Get entries by key prefix
Query::key_prefix(b"/path/")
// Get single latest entry per key
Query::single_latest_per_key()
    .key_prefix(b"/path/")
    .author(author_id)
```
## Download Policy
Controls which content blobs to automatically download after sync:
```rust
pub enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),    // Only download matching entries
    EverythingExcept(Vec<FilterKind>), // Download all except matching (default)
}
pub enum FilterKind {
    Prefix(Bytes), // Matches keys starting with bytes
    Exact(Bytes),  // Matches exact key
}
```
Default: `EverythingExcept(Vec::new())` — download everything.
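
How a policy decides on a given key can be sketched as below. The enums mirror the shapes shown above with `Vec<u8>` in place of `Bytes`; the method names `matches` and `should_download` are assumptions made for this illustration.

```rust
enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(p) => key.starts_with(p),
            FilterKind::Exact(e) => key == e.as_slice(),
        }
    }
}

enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    /// Hypothetical helper: should content for `key` be downloaded?
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            // Allow-list: download only if some filter matches.
            DownloadPolicy::NothingExcept(f) => f.iter().any(|x| x.matches(key)),
            // Deny-list: download unless some filter matches.
            DownloadPolicy::EverythingExcept(f) => !f.iter().any(|x| x.matches(key)),
        }
    }
}
```

With an empty filter list, `EverythingExcept` accepts every key, which matches the stated default.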
## PublicKeyStore
The `PublicKeyStore` trait caches expanded `ed25519_dalek::VerifyingKey` objects to avoid repeated curve point decompression:
```rust
pub trait PublicKeyStore {
    fn public_key(&self, id: &[u8; 32]) -> Result<VerifyingKey, SignatureError>;
    fn namespace_key(&self, bytes: &NamespaceId) -> Result<NamespacePublicKey, SignatureError>;
    fn author_key(&self, bytes: &AuthorId) -> Result<AuthorPublicKey, SignatureError>;
}
```
The `MemPublicKeyStore` implementation uses `Arc<RwLock<HashMap<[u8; 32], VerifyingKey>>>` for thread-safe caching.
The `Store` itself implements `PublicKeyStore`, leveraging its redb tables for author storage and the in-memory cache for fast verification.
## StoreInstance
```rust
pub struct StoreInstance<'a> {
    namespace: NamespaceId,
    store: &'a mut Store,
}
```
A `StoreInstance` bundles a namespace ID with a mutable reference to the store, providing the `ranger::Store<SignedEntry>` implementation for the sync algorithm. This is what `Replica` uses internally to perform sync operations.
## Replica
```rust
pub struct Replica<'a, I = Box<ReplicaInfo>> {
    store: StoreInstance<'a>,
    info: I,
}
```
`Replica` is the primary user-facing type for document operations. It combines:
- A `StoreInstance` for data access
- `ReplicaInfo` for metadata (capability, subscribers, content status callback)
Key methods:
- `insert(key, author, hash, len)` — Insert a new entry
- `delete_prefix(prefix, author)` — Delete entries by key prefix
- `insert_remote_entry(entry, from, content_status)` — Insert from sync
- `hash_and_insert(key, author, data)` — Hash data and insert
- `sync_initial_message()` / `sync_process_message()` — Sync protocol operations


@@ -0,0 +1,343 @@
# iroh-docs: Engine and Live Sync
## Overview
The `Engine` is the top-level coordinator for live document synchronization. It brings together:
1. **SyncHandle/Actor** — Single-threaded actor for all store and replica operations
2. **LiveActor** — Async event loop coordinating sync, gossip, and content downloads
3. **GossipState** — Integration with `iroh-gossip` for broadcasting updates
4. **Blobs/Downloader** — Integration with `iroh-blobs` for content transfer
## Engine
```rust
pub struct Engine {
    pub endpoint: Endpoint,
    pub sync: SyncHandle,
    pub default_author: DefaultAuthor,
    to_live_actor: mpsc::Sender<ToLiveActor>,
    actor_handle: AbortOnDropHandle<()>,
    content_status_cb: ContentStatusCallback,
    blob_store: iroh_blobs::api::Store,
    _gc_protect_task: AbortOnDropHandle<()>,
}
```
### Initialization
```rust
Engine::spawn(
    endpoint,               // iroh Endpoint for QUIC connections
    gossip,                 // iroh-gossip instance
    replica_store,          // Store for document data
    bao_store,              // iroh-blobs Store for content blobs
    downloader,             // Downloader for fetching blobs
    default_author_storage, // Where to persist the default author
    protect_cb,             // Optional GC protection callback
) -> Result<Self>
```
During spawn:
1. A `ContentStatusCallback` is created that checks blob availability in `iroh-blobs`
2. A `SyncHandle` actor is spawned on a dedicated thread
3. A `LiveActor` is spawned as a tokio task
4. The default author is loaded or created
5. A GC protection task is started (if callback provided)
### Key Engine Methods
```rust
// Start syncing a document with given peers
async fn start_sync(&self, namespace: NamespaceId, peers: Vec<EndpointAddr>) -> Result<()>
// Stop syncing and leave gossip swarm
async fn leave(&self, namespace: NamespaceId, kill_subscribers: bool) -> Result<()>
// Subscribe to document events
async fn subscribe(&self, namespace: NamespaceId) -> Result<impl Stream<Item = Result<LiveEvent>>>
// Handle incoming QUIC connections
async fn handle_connection(&self, conn: Connection) -> Result<()>
// Shutdown the engine
async fn shutdown(&self) -> Result<()>
```
### GC Protection
The `ProtectCallbackHandler` bridges iroh-docs with iroh-blobs' garbage collection:
```rust
let (handler, protect_cb) = ProtectCallbackHandler::new();
// protect_cb goes into iroh-blobs GC config
// handler goes into Engine::spawn
```
When iroh-blobs runs GC, it calls `protect_cb` which queries the docs store for all content hashes, ensuring blobs referenced by document entries are not garbage-collected.
## SyncHandle / Actor
The `SyncHandle` is a handle to a single-threaded actor that processes all store and replica operations sequentially:
```rust
pub struct SyncHandle {
    tx: async_channel::Sender<Action>,
    join_handle: Arc<Option<std::thread::JoinHandle<()>>>,
    metrics: Arc<Metrics>,
}
```
### Actor Architecture
```
External Code ──async──▶ SyncHandle ──channel──▶ Actor Thread
                                                 ├─ Store (redb)
                                                 ├─ Replica operations
                                                 └─ Flush on timeout (500ms)
```
The actor runs on a **dedicated OS thread** (not a tokio task), using `tokio::runtime::Builder::new_current_thread()` internally. This ensures store operations are never concurrent.
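The handle-plus-actor pattern described here can be sketched with the standard library alone. This is an illustration under simplifying assumptions: it uses `std::sync::mpsc` and blocking calls instead of `async_channel` and a current-thread tokio runtime, and the `Handle`, `Action`, and store types are hypothetical stand-ins for `SyncHandle`, its `Action` enum, and the redb store.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Messages understood by the actor; requests carry a channel for the reply.
enum Action {
    Insert { key: String, value: u64, reply: mpsc::Sender<Option<u64>> },
    Get { key: String, reply: mpsc::Sender<Option<u64>> },
    Shutdown,
}

/// Cloneable handle; the store can only be reached through the channel.
#[derive(Clone)]
struct Handle {
    tx: mpsc::Sender<Action>,
}

impl Handle {
    /// Spawn the actor on its own OS thread.
    fn spawn() -> (Handle, thread::JoinHandle<()>) {
        let (tx, rx) = mpsc::channel::<Action>();
        let join = thread::spawn(move || {
            // The store lives on this thread only, so no locking is needed
            // and actions are applied strictly one after another.
            let mut store: HashMap<String, u64> = HashMap::new();
            for action in rx {
                match action {
                    Action::Insert { key, value, reply } => {
                        let _ = reply.send(store.insert(key, value));
                    }
                    Action::Get { key, reply } => {
                        let _ = reply.send(store.get(&key).copied());
                    }
                    Action::Shutdown => break,
                }
            }
        });
        (Handle { tx }, join)
    }

    /// Insert a value and return the previous one, if any.
    fn insert(&self, key: &str, value: u64) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Insert { key: key.to_string(), value, reply }).ok()?;
        rx.recv().ok().flatten()
    }

    /// Look up a value.
    fn get(&self, key: &str) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Get { key: key.to_string(), reply }).ok()?;
        rx.recv().ok().flatten()
    }

    /// Ask the actor thread to stop.
    fn shutdown(&self) {
        let _ = self.tx.send(Action::Shutdown);
    }
}
```

Because every operation is a message processed by one thread, callers never observe a half-applied write, which is the property the dedicated thread is meant to provide.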
### Action Types
```rust
enum Action {
ImportAuthor { author, reply },
ExportAuthor { author, reply },
DeleteAuthor { author, reply },
ImportNamespace { capability, reply },
ListAuthors { reply },
ListReplicas { reply },
ContentHashes { reply },
FlushStore { reply },
Replica(NamespaceId, ReplicaAction),
Shutdown { reply },
}
enum ReplicaAction {
Open { reply, opts },
Close { reply },
GetState { reply },
SetSync { sync, reply },
Subscribe { sender, reply },
Unsubscribe { sender, reply },
InsertLocal { author, key, hash, len, reply },
DeletePrefix { author, key, reply },
InsertRemote { entry, from, content_status, reply },
SyncInitialMessage { reply },
SyncProcessMessage { message, from, state, reply },
GetSyncPeers { reply },
RegisterUsefulPeer { peer, reply },
GetExact { author, key, include_empty, reply },
GetMany { query, reply },
DropReplica { reply },
ExportSecretKey { reply },
HasNewsForUs { heads, reply },
SetDownloadPolicy { policy, reply },
GetDownloadPolicy { reply },
}
```
### Replica Opening
When a replica is opened via the actor, an `OpenReplica` struct is created:
```rust
struct OpenReplica {
info: ReplicaInfo, // Capability, subscribers, content status callback
sync: bool, // Whether to accept sync requests
handles: usize, // Reference count for open handles
}
```
Multiple handles to the same replica are supported via reference counting.
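A minimal sketch of this reference counting, assuming a plain map from namespace to handle count. The `OpenReplicas` type and its methods are hypothetical and only illustrate the bookkeeping; the real actor also tracks the replica's capability and subscribers.

```rust
use std::collections::HashMap;

/// Stand-in for the 32-byte namespace identifier.
type NamespaceId = [u8; 32];

#[derive(Default)]
struct OpenReplicas {
    /// Number of open handles per namespace.
    handles: HashMap<NamespaceId, usize>,
}

impl OpenReplicas {
    /// Register one more handle.
    /// Returns true if this call actually opened the replica.
    fn open(&mut self, id: NamespaceId) -> bool {
        let count = self.handles.entry(id).or_insert(0);
        *count += 1;
        *count == 1
    }

    /// Release one handle.
    /// Returns true if the replica was closed because no handles remain.
    fn close(&mut self, id: &NamespaceId) -> bool {
        let Some(count) = self.handles.get_mut(id) else {
            return false; // not open at all
        };
        *count -= 1;
        if *count == 0 {
            self.handles.remove(id);
            true
        } else {
            false
        }
    }
}
```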
## LiveActor
The `LiveActor` is the central async coordinator:
```rust
pub struct LiveActor {
inbox: mpsc::Receiver<ToLiveActor>,
sync: SyncHandle,
endpoint: Endpoint,
bao_store: Store,
downloader: Downloader,
memory_lookup: MemoryLookup,
replica_events_tx: async_channel::Sender<Event>,
replica_events_rx: async_channel::Receiver<Event>,
sync_actor_tx: mpsc::Sender<ToLiveActor>,
gossip: GossipState,
running_sync_connect: JoinSet<SyncConnectRes>,
running_sync_accept: JoinSet<SyncAcceptRes>,
download_tasks: JoinSet<DownloadRes>,
missing_hashes: HashSet<Hash>,
queued_hashes: QueuedHashes,
hash_providers: ProviderNodes,
subscribers: SubscribersMap,
state: NamespaceStates,
metrics: Arc<Metrics>,
}
```
### Event Loop
The `LiveActor::run_inner()` loop uses `tokio::select!` with biased polling:
```rust
tokio::select! {
biased;
msg = self.inbox.recv() => { /* handle actor messages */ }
event = self.replica_events_rx.recv() => { /* handle replica insert events */ }
res = self.running_sync_connect.join_next() => { /* sync connect finished */ }
res = self.running_sync_accept.join_next() => { /* sync accept finished */ }
res = self.download_tasks.join_next() => { /* download completed */ }
res = self.gossip.progress() => { /* gossip task progress */ }
}
```
### ToLiveActor Messages
```rust
pub enum ToLiveActor {
StartSync { namespace, peers, reply },
Leave { namespace, kill_subscribers, reply },
Shutdown { reply },
Subscribe { namespace, sender, reply },
HandleConnection { conn },
AcceptSyncRequest { namespace, peer, reply },
IncomingSyncReport { from, report },
NeighborContentReady { namespace, node, hash },
NeighborUp { namespace, peer },
NeighborDown { namespace, peer },
}
```
### Gossip Operations (Op)
```rust
pub enum Op {
Put(SignedEntry), // New entry inserted
ContentReady(Hash), // Content blob now available
SyncReport(SyncReport), // Heads summary after sync
}
```
Gossip broadcasts `Op` messages to all swarm participants. When a `Put` is received, the entry is inserted into the local replica. When a `ContentReady` is received, peers know they can download the blob. When a `SyncReport` is received, peers check `has_news_for_us()` to decide if they should sync.
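The `has_news_for_us()` decision can be sketched as a comparison of per-author head timestamps. This is an assumption-laden illustration: `AuthorHeads` is modeled as a plain `BTreeMap`, and the function below is a free-standing stand-in rather than the crate's actual method.

```rust
use std::collections::BTreeMap;

/// Stand-in for the 32-byte author identifier.
type AuthorId = [u8; 32];

/// Latest entry timestamp (microseconds) seen per author.
type AuthorHeads = BTreeMap<AuthorId, u64>;

/// True if `remote` lists an author we have never seen,
/// or a newer timestamp for an author we already know.
fn has_news_for_us(ours: &AuthorHeads, remote: &AuthorHeads) -> bool {
    remote.iter().any(|(author, remote_ts)| match ours.get(author) {
        Some(our_ts) => remote_ts > our_ts,
        None => true,
    })
}
```

A report that only contains equal or older timestamps gives no reason to sync, so the peer can skip opening a connection.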
### Content Download Flow
1. When a `RemoteInsert` event occurs with `should_download: true`, the entry's content hash is queued for download
2. The `LiveActor` uses `iroh_blobs::downloader::Downloader` to fetch the blob
3. Known providers (peers who had `ContentStatus::Complete`) are used as download sources
4. On download completion, a `LiveEvent::ContentReady` event is emitted
### LiveEvent (Public API)
```rust
pub enum LiveEvent {
InsertLocal { entry: Entry },
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
ContentReady { hash: Hash },
PendingContentReady,
NeighborUp(PublicKey),
NeighborDown(PublicKey),
SyncFinished(SyncEvent),
}
```
`SyncEvent` wraps `SyncFinished`:
```rust
pub struct SyncFinished {
pub namespace: NamespaceId,
pub peer: PublicKey,
pub outcome: SyncOutcome,
pub timings: Timings,
}
```
## NamespaceStates
```rust
pub struct NamespaceStates(BTreeMap<NamespaceId, NamespaceState>);
struct NamespaceState {
nodes: BTreeMap<EndpointId, PeerState>,
may_emit_ready: bool,
}
```
Each peer has a `PeerState` tracking sync progress:
```rust
struct PeerState {
state: SyncState, // Idle or Running
resync_requested: bool, // Whether a resync was requested during active sync
last_sync: Option<(Instant, Result<SyncFinished>)>,
}
```
This state machine prevents concurrent syncs with the same peer for the same namespace and queues resync requests when needed.
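The transitions can be sketched as follows. This is a simplified model, assuming only the two fields that drive the state machine; the method names `request_sync` and `finish` are hypothetical and `last_sync` is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum SyncState {
    #[default]
    Idle,
    Running,
}

#[derive(Debug, Default)]
struct PeerState {
    state: SyncState,
    resync_requested: bool,
}

impl PeerState {
    /// Request a sync with this peer.
    /// Returns true if the caller should start one now; if a sync is
    /// already running, the request is remembered instead.
    fn request_sync(&mut self) -> bool {
        match self.state {
            SyncState::Idle => {
                self.state = SyncState::Running;
                true
            }
            SyncState::Running => {
                self.resync_requested = true;
                false
            }
        }
    }

    /// Mark the running sync as finished.
    /// Returns true if a queued resync should start immediately.
    fn finish(&mut self) -> bool {
        let resync = std::mem::take(&mut self.resync_requested);
        self.state = if resync { SyncState::Running } else { SyncState::Idle };
        resync
    }
}
```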
## DefaultAuthor
```rust
pub struct DefaultAuthor {
value: RwLock<AuthorId>,
storage: DefaultAuthorStorage,
}
```
- `DefaultAuthorStorage::Mem` — Ephemeral, creates a new author each time
- `DefaultAuthorStorage::Persistent(path)` — Stores the author ID as hex in a file, loads it on startup
The default author provides a convenient "current user" identity for applications.
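For the persistent variant, the stored value is just the hex form of a 32-byte ID. A minimal sketch of that round trip, with hypothetical helper names `to_hex` and `from_hex` and no file I/O:

```rust
/// Encode a 32-byte author ID as 64 lowercase hex characters.
fn to_hex(id: &[u8; 32]) -> String {
    id.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decode a hex string (surrounding whitespace ignored) back into an ID.
/// Returns None if the input is not exactly 64 hex digits.
fn from_hex(s: &str) -> Option<[u8; 32]> {
    let s = s.trim();
    if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}
```

Trimming before decoding is a choice made in this sketch so that a trailing newline in the file does not invalidate the stored ID.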
## Docs Protocol Handler
```rust
pub struct Docs {
engine: Arc<Engine>,
api: DocsApi,
}
```
`Docs` implements `ProtocolHandler` for integration with iroh's `Router`:
```rust
impl ProtocolHandler for Docs {
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> { ... }
async fn shutdown(&self) { ... }
}
```
The `Builder` pattern configures storage:
```rust
let docs = Docs::memory()
.spawn(endpoint, blobs, gossip)
.await?;
// or
let docs = Docs::persistent(path)
.protect_handler(handler)
.spawn(endpoint, blobs, gossip)
.await?;
```
## DocTicket
```rust
pub struct DocTicket {
pub capability: Capability,
pub nodes: Vec<EndpointAddr>,
}
```
A `DocTicket` encapsulates everything needed to join a document:
- A `Capability` (Read or Write) — provides the namespace key
- A list of `EndpointAddr` — bootstrap peers to connect to
Tickets are serialized as base32-encoded postcard data with a `"doc"` prefix, using the `iroh_tickets::Ticket` trait.

View File

@@ -0,0 +1,189 @@
# iroh-docs: Network Protocol and Wire Format
## ALPN
The docs protocol uses ALPN `/iroh-sync/1` for QUIC connection identification.
```rust
pub const ALPN: &[u8] = b"/iroh-sync/1";
```
## Connection Flow
### Outgoing Sync (Alice — Initiator)
```rust
pub async fn connect_and_sync(
endpoint: &Endpoint,
sync: &SyncHandle,
namespace: NamespaceId,
peer: EndpointAddr,
metrics: Option<&Metrics>,
) -> Result<SyncFinished, ConnectError>
```
1. Open a QUIC connection to the peer with ALPN `/iroh-sync/1`
2. Open a bidirectional QUIC stream
3. Run the Alice (initiator) protocol via `run_alice()`
4. Close the stream and return `SyncFinished`
### Incoming Sync (Bob — Responder)
```rust
pub async fn handle_connection<F, Fut>(
sync: SyncHandle,
connection: Connection,
accept_cb: F,
metrics: Option<&Metrics>,
) -> Result<SyncFinished, AcceptError>
```
1. Accept a bidirectional QUIC stream from the connection
2. Run the Bob (responder) protocol via `BobState::run()`
3. The `accept_cb` determines whether to accept or reject each namespace
4. Close the stream and return `SyncFinished`
## Wire Format
### Frame Codec
All messages are length-prefixed:
```
┌──────────────────────┬──────────────────────────────┐
│ u32 big-endian len │ postcard-serialized Message │
└──────────────────────┴──────────────────────────────┘
```
Maximum message size: 1 GiB.
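The framing can be sketched without tokio. The functions below are a hypothetical, synchronous stand-in for what a length-delimited codec does: `encode_frame` prepends the big-endian `u32` length, and `decode_frame` splits one frame off a buffer or reports that more bytes are needed. The payload is treated as opaque bytes; postcard serialization is out of scope here.

```rust
/// 1 GiB, matching the limit stated above.
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

/// Prefix `payload` with its length as a big-endian u32.
/// Returns None if the payload exceeds the size limit.
fn encode_frame(payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return None;
    }
    let len = u32::try_from(payload.len()).ok()?;
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    Some(out)
}

/// Try to split one frame off the front of `buf`.
/// Ok(None) means more bytes are needed; on success the payload and the
/// number of consumed bytes are returned.
fn decode_frame(buf: &[u8]) -> Result<Option<(&[u8], usize)>, String> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(format!("frame of {} bytes exceeds the limit", len));
    }
    let end = 4 + len;
    if buf.len() < end {
        return Ok(None);
    }
    Ok(Some((&buf[4..end], end)))
}
```

Checking the declared length before waiting for the body lets a receiver reject an oversized frame without buffering it.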
### Message Types
```rust
enum Message {
Init {
namespace: NamespaceId, // Which document to sync
message: ProtocolMessage, // Initial sync message (ranger::Message<SignedEntry>)
},
Sync(ProtocolMessage), // Subsequent sync round-trip messages
Abort { reason: AbortReason }, // Responder rejects the request
}
```
### Serialization
Messages use `postcard` (a compact `serde` format optimized for embedded/no-std use). The `SyncCodec` implements `tokio_util::codec::Encoder` and `Decoder` for async stream framing.
## Protocol Sequence
```
Alice (Initiator) Bob (Responder)
│ │
│──── Init { namespace, initial_msg } ───────▶│
│ │
│◀─── Sync(reply_msg) ────────────────────── │ (or Abort)
│ │
│──── Sync(next_msg) ──────────────────────▶│
│ │
│◀─── Sync(reply_msg) ────────────────────── │
│ │
│──── Sync(next_msg) ──────────────────────▶│
│ │
│ ... until convergence ... │
│ │
│──── (stream closed) ─────────────────────▶│
│ │
```
The protocol terminates when one side has no more messages to send (convergence reached). Each `Sync` message carries a `ProtocolMessage` which is a `ranger::Message<SignedEntry>` containing `MessagePart`s (either `RangeFingerprint` or `RangeItem`).
## SyncFinished Result
```rust
pub struct SyncFinished {
pub namespace: NamespaceId,
pub peer: PublicKey,
pub outcome: SyncOutcome, // heads_received, num_recv, num_sent
pub timings: Timings, // connect duration, process duration
}
```
## Error Types
### ConnectError
```rust
pub enum ConnectError {
Connect { error: anyhow::Error }, // Connection failed
RemoteAbort(AbortReason), // Remote rejected our request
Sync { error: anyhow::Error }, // Sync protocol error
Close { error: anyhow::Error }, // Stream close error
}
```
### AcceptError
```rust
pub enum AcceptError {
Connect { error: anyhow::Error }, // Connection failed
Open { peer: PublicKey, error }, // Failed to open replica
Abort { peer, namespace, reason }, // We aborted
Sync { peer, namespace, error }, // Sync protocol error
Close { peer, namespace, error }, // Stream close error
}
```
## Gossip Integration
The `GossipState` manages iroh-gossip subscriptions per namespace:
```rust
pub struct GossipState {
gossip: Gossip,
sync: SyncHandle,
to_live_actor: mpsc::Sender<ToLiveActor>,
active: HashMap<NamespaceId, ActiveState>,
active_tasks: JoinSet<(NamespaceId, Result<()>)>,
}
```
When a document starts syncing:
1. The engine joins a gossip topic for that namespace
2. `GossipState::join()` subscribes with bootstrap peers
3. A receive loop task is spawned to process incoming gossip messages
4. `Op` messages (Put, ContentReady, SyncReport) are deserialized and forwarded to `LiveActor`
When receiving an `Op::Put`:
```rust
// In the gossip receive loop, after decoding the message as Op::Put(entry).
// The payload already is a SignedEntry, so nothing is re-signed here.
sync.insert_remote(namespace, entry, from, content_status).await?;
```
When receiving an `Op::SyncReport`:
```rust
// Forward to LiveActor which checks has_news_for_us()
to_live_actor.send(ToLiveActor::IncomingSyncReport { from, report }).await?;
```
Broadcasting:
```rust
// Simplified; postcard::to_stdvec returns a Result, hence the `?`.
// When a local insert occurs:
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::Put(entry))?).await;
// When content becomes ready:
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::ContentReady(hash))?).await;
```
## Sync Report Compression
`SyncReport` encodes `AuthorHeads` with an optional size limit:
```rust
pub struct SyncReport {
namespace: NamespaceId,
heads: Vec<u8>, // postcard-encoded AuthorHeads with size limit
}
```
The size limit keeps gossip messages small: when the encoded heads would exceed it, the author heads with the oldest timestamps are dropped first.
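A minimal sketch of that truncation, assuming a fixed cost of 40 bytes per head (32-byte author ID plus 8-byte timestamp). The real encoding is postcard, whose integers are variable-length, so the constant and the `limit_heads` function are illustrative assumptions only.

```rust
/// Stand-in for the 32-byte author identifier.
type AuthorId = [u8; 32];

/// Per-head cost assumed in this sketch: 32-byte ID + 8-byte timestamp.
const HEAD_SIZE: usize = 40;

/// Keep the most recent heads that fit into `max_bytes`,
/// dropping the oldest ones first.
fn limit_heads(mut heads: Vec<(AuthorId, u64)>, max_bytes: usize) -> Vec<(AuthorId, u64)> {
    // Newest first; ties are broken by author ID for a deterministic result.
    heads.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    heads.truncate(max_bytes / HEAD_SIZE);
    heads
}
```

Dropping old heads is safe in the sense that a truncated report can only under-report news; a receiver may miss a reason to sync but is never told about entries that do not exist.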

View File

@@ -0,0 +1,188 @@
# iroh-docs: API and RPC
## DocsApi
The `DocsApi` provides an RPC-based interface to the docs engine, implemented via `irpc`:
```rust
#[derive(Debug, Clone)]
pub struct DocsApi {
inner: Client<DocsProtocol>,
}
```
### Methods (via irpc)
The API exposes document operations through an RPC protocol defined in `api/protocol.rs`:
| Method | Request | Response | Description |
|--------|---------|----------|-------------|
| `Open` | `OpenRequest { doc_id }` | `OpenResponse` | Open a document for operations |
| `Close` | `CloseRequest { doc_id }` | `CloseResponse` | Close a document |
| `Status` | `StatusRequest { doc_id }` | `StatusResponse { status: OpenState }` | Get document open state |
| `List` | `ListRequest` | Stream of `ListResponse { id, capability }` | List all documents |
| `Create` | `CreateRequest` | `CreateResponse { id }` | Create a new document |
| `Drop` | `DropRequest { doc_id }` | `DropResponse` | Remove a document |
| `Import` | `ImportRequest { capability }` | `ImportResponse { doc_id }` | Import a document by capability |
| `Set` | `SetRequest { doc_id, author_id, key, value }` | `SetResponse { entry }` | Set a key-value pair |
| `SetHash` | `SetHashRequest { doc_id, author_id, key, hash, size }` | `SetHashResponse` | Set a key with pre-hashed content |
| `GetMany` | `GetManyRequest { doc_id, query }` | Stream of entries | Query entries |
| `GetExact` | `GetExactRequest { doc_id, key, author, include_empty }` | `GetExactResponse { entry }` | Get single entry |
| `Del` | `DelRequest { doc_id, author_id, key }` | `DelResponse { removed }` | Delete by key prefix |
| `Subscribe` | `SubscribeRequest { doc_id }` | Stream of `LiveEvent` | Subscribe to document events |
| `Share` | `ShareRequest { doc_id, mode, peers }` | `ShareResponse { ticket }` | Create a sharing ticket |
| `StartSync` | `StartSyncRequest { doc_id, peers }` | `StartSyncResponse` | Start live sync |
| `Leave` | `LeaveRequest { doc_id }` | `LeaveResponse` | Leave gossip swarm |
| `ImportFile` | `ImportFileRequest { ... }` | Stream of `ImportProgress` | Import file content and set key |
| `ExportFile` | `ExportFileRequest { ... }` | Stream of `ExportProgress` | Export content to file |
| `AuthorList` | `AuthorListRequest` | Stream of `AuthorListResponse` | List authors |
| `AuthorCreate` | `AuthorCreateRequest` | `AuthorCreateResponse { author_id }` | Create new author |
| `AuthorImport` | `AuthorImportRequest { author }` | `AuthorImportResponse { author_id }` | Import author key |
| `AuthorExport` | `AuthorExportRequest { author_id }` | `AuthorExportResponse { author }` | Export author key |
| `AuthorDelete` | `AuthorDeleteRequest { author_id }` | `AuthorDeleteResponse` | Delete author |
| `AuthorGetDefault` | `AuthorGetDefaultRequest` | `AuthorGetDefaultResponse { author_id }` | Get default author |
| `AuthorSetDefault` | `AuthorSetDefaultRequest { author_id }` | `AuthorSetDefaultResponse` | Set default author |
| `SetDownloadPolicy` | `SetDownloadPolicyRequest { doc_id, policy }` | `SetDownloadPolicyResponse` | Set download policy |
| `GetDownloadPolicy` | `GetDownloadPolicyRequest { doc_id }` | `GetDownloadPolicyResponse { policy }` | Get download policy |
| `GetSyncPeers` | `GetSyncPeersRequest { doc_id }` | `GetSyncPeersResponse { peers }` | Get known sync peers |
## RPC Implementation
The RPC layer is implemented with `irpc` (local and remote procedure calls) and `noq` (remote network access).
### Local API
`DocsApi::spawn(engine)` creates an `RpcActor` that processes requests against the engine directly:
```rust
impl DocsApi {
pub fn spawn(engine: Arc<Engine>) -> Self {
RpcActor::spawn(engine)
}
}
```
### Remote API
When the `rpc` feature is enabled, `DocsApi::connect(endpoint, addr)` creates a remote client that sends requests over the network via `noq`.
### Protocol Dispatch
```rust
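// Pseudocode: the irpc::rpc::Handler<DocsProtocol> forwards each request
// variant, together with its reply channel `tx`, to the local actor.
match msg {
    DocsProtocol::Open(msg) => local.send((msg, tx)).await,
    DocsProtocol::Set(msg) => local.send((msg, tx)).await,
    // ... etc
}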
```
## RpcActor
The `RpcActor` (in `api/actor.rs`) bridges the RPC protocol to the `Engine`:
```rust
struct RpcActor {
engine: Arc<Engine>,
}
```
It handles each request type by calling the corresponding `Engine`/`SyncHandle` method and returning the result through the RPC channel.
For streaming responses (like `GetMany`, `Subscribe`, `AuthorList`), the actor sends results through an `mpsc` channel that the RPC framework streams back to the client.
## Share Mode and Tickets
When sharing a document:
```rust
pub enum ShareMode {
Read, // Share with read-only capability
Write, // Share with full write capability
}
```
The `Share` RPC method:
1. Gets or creates the namespace capability
2. Creates a `DocTicket` with the capability and provided peer addresses
3. Starts sync with the provided peers
4. Returns the ticket for distribution
## Example: Basic Setup
```rust
use iroh::{endpoint::presets, protocol::Router, Endpoint};
use iroh_blobs::{BlobsProtocol, store::mem::MemStore, ALPN as BLOBS_ALPN};
use iroh_docs::{protocol::Docs, ALPN as DOCS_ALPN};
use iroh_gossip::{net::Gossip, ALPN as GOSSIP_ALPN};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Bind an endpoint, then create the blob store and gossip instance that docs depends on.
    let endpoint = Endpoint::bind(presets::N0).await?;
    let blobs = MemStore::default();
    let gossip = Gossip::builder().spawn(endpoint.clone());

    // In-memory docs engine; `(*blobs).clone()` passes the underlying blobs store.
    let docs = Docs::memory()
        .spawn(endpoint.clone(), (*blobs).clone(), gossip.clone())
        .await?;

    // Register all three protocols; keep the router alive while the node should accept connections.
    let _router = Router::builder(endpoint.clone())
        .accept(BLOBS_ALPN, BlobsProtocol::new(&blobs, None))
        .accept(GOSSIP_ALPN, gossip)
        .accept(DOCS_ALPN, docs)
        .spawn();

    Ok(())
}
```
## Data Flow Summary
```
┌─────────────────────────────────────────────────────────────────┐
│ Application / RPC │
│ DocsApi ──irpc──▶ RpcActor ──▶ Engine / SyncHandle │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Live Sync (per document) │
│ │
│ LiveActor event loop: │
│ ┌────────────────┐ ┌─────────────────┐ ┌──────────────────┐ │
│ │ Actor Messages │ │ Replica Events │ │ Gossip Events │ │
│ │ (StartSync, │ │ (LocalInsert, │ │ (Put, │ │
│ │ Subscribe, │ │ RemoteInsert) │ │ ContentReady, │ │
│ │ Leave, ...) │ │ │ │ SyncReport) │ │
│ └──────┬─────────┘ └───────┬────────┘ └──────┬──────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LiveActor::run_inner() │ │
│ │ tokio::select! { ... } │ │
│ │ │ │
│ │ - Start/stop gossip subscriptions │ │
│ │ - Initiate outgoing syncs (connect_and_sync) │ │
│ │ - Accept incoming syncs (handle_connection) │ │
│ │ - Queue content downloads │ │
│ │ - Broadcast local inserts via gossip │ │
│ │ - Emit LiveEvent to subscribers │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ Running Tasks: │
│ ┌───────────────────┐ ┌───────────────────┐ │
│ │ sync_connect tasks│ │ sync_accept tasks │ │
│ └───────────────────┘ └───────────────────┘ │
│ ┌───────────────────┐ ┌───────────────────┐ │
│ │ download tasks │ │ gossip receive loop│ │
│ └───────────────────┘ └───────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Sync Actor (dedicated thread) │
│ │
│ ┌────────────┐ ┌─────────────────────────────────────────┐ │
│ │ Action │ │ Replica Operations: │ │
│ │ Channel │──▶│ Insert, Delete, Get, Query, │ │
│ │ (bounded) │ │ SyncInit, SyncProcess, Open, Close, ...│ │
│ └────────────┘ └─────────────────────────────────────────┘ │
│ │
│ Store (redb) ──▶ All reads/writes on this thread │
└─────────────────────────────────────────────────────────────────┘
```

View File

@@ -0,0 +1,318 @@
# iroh-docs: Key Types Reference
## Cryptographic Keys
### NamespaceSecret
```rust
pub struct NamespaceSecret {
signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
}
```
- The write capability for a document
- Can sign entries (namespace signature)
- Derives `NamespacePublicKey` and `NamespaceId`
- Serialized as 32 bytes
### NamespacePublicKey
```rust
pub struct NamespacePublicKey(VerifyingKey); // ed25519_dalek::VerifyingKey
```
- The verifying key corresponding to `NamespaceSecret`
- Can verify namespace signatures on entries
- Serialized as 32 bytes
### NamespaceId
```rust
pub struct NamespaceId([u8; 32]);
```
- The byte representation of `NamespacePublicKey`
- Serves as the unique identifier for a document
- Can be converted back to `NamespacePublicKey` via `PublicKeyStore` (handles invalid curve points)
### Author
```rust
pub struct Author {
signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
}
```
- A writer identity within a document
- Can sign entries (author signature)
- Derives `AuthorPublicKey` and `AuthorId`
- Created randomly with `Author::new(&mut rng)`
- Stored persistently in the redb authors table
### AuthorPublicKey
```rust
pub struct AuthorPublicKey(VerifyingKey);
```
- The verifying key corresponding to an `Author`
- Can verify author signatures on entries
- Serialized as 32 bytes
### AuthorId
```rust
pub struct AuthorId([u8; 32]);
```
- Byte representation of `AuthorPublicKey`
- Used as a component of `RecordIdentifier`
- Has `fmt_short()` for human-readable display (first 10 hex chars)
## Entry Types
### RecordIdentifier
```rust
pub struct RecordIdentifier(Bytes);
// Layout: [NamespaceId(32) | AuthorId(32) | Key(variable)]
```
- The composite key for an entry
- Byte layout: 32 bytes namespace + 32 bytes author + variable-length key
- Ordering: namespace → author → key (lexicographic)
- This ordering is critical for the range-based sync algorithm
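The layout and its ordering consequence can be checked with plain byte vectors. This sketch uses hypothetical helpers `record_id` and `split_record_id` on `Vec<u8>`; the real type wraps `Bytes` and has its own accessors.

```rust
/// Concatenate namespace (32 bytes), author (32 bytes), and key.
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(64 + key.len());
    out.extend_from_slice(namespace);
    out.extend_from_slice(author);
    out.extend_from_slice(key);
    out
}

/// Split an identifier back into (namespace, author, key).
/// Returns None if the input is shorter than the 64-byte fixed part.
fn split_record_id(id: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    if id.len() < 64 {
        return None;
    }
    Some((&id[..32], &id[32..64], &id[64..]))
}
```

Because the fixed-width fields come first, plain lexicographic comparison of the bytes sorts by namespace, then author, then key, so contiguous byte ranges correspond to meaningful groups of entries.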
### Record
```rust
pub struct Record {
len: u64, // Byte length of content
hash: Hash, // BLAKE3 hash of content (32 bytes)
timestamp: u64, // Microseconds since Unix epoch
}
```
- The value portion of an entry
- Ordering: timestamp first, then hash (Last-Writer-Wins)
- `Record::empty(timestamp)` creates a tombstone (hash=EMPTY, len=0)
- `Record::new_current(hash, len)` uses current system time
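The last-writer-wins rule can be sketched by comparing `(timestamp, hash)` pairs. The struct below mirrors the fields listed above, but `sort_key` and `merge` are hypothetical helpers written for this illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Record {
    len: u64,
    hash: [u8; 32],
    timestamp: u64,
}

impl Record {
    /// Comparison key: timestamp first, hash as the tie-breaker.
    fn sort_key(&self) -> (u64, [u8; 32]) {
        (self.timestamp, self.hash)
    }
}

/// Keep the incoming record only if it is strictly greater than the
/// current one; otherwise the current record stays.
fn merge(current: Record, incoming: Record) -> Record {
    if incoming.sort_key() > current.sort_key() {
        incoming
    } else {
        current
    }
}
```

Using the hash as a tie-breaker makes the outcome independent of arrival order, so two peers that see the same pair of records in opposite order still converge on the same winner.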
### Entry
```rust
pub struct Entry {
id: RecordIdentifier,
record: Record,
}
```
- Combines key and value
- `Entry::new(id, record)` constructor
- `Entry::new_empty(id)` creates a tombstone with current timestamp
- `entry.sign(namespace, author)` produces a `SignedEntry`
### SignedEntry
```rust
pub struct SignedEntry {
signature: EntrySignature, // Dual Ed25519 signatures
entry: Entry,
}
```
- An entry with cryptographic proof of authorization and authorship
- `SignedEntry::from_entry(entry, namespace, author)` — create from entry
- `signed_entry.verify(store)` — verify both signatures using a `PublicKeyStore`
- Implements `RangeEntry` for the sync algorithm
### EntrySignature
```rust
pub struct EntrySignature {
author_signature: Signature, // 64-byte Ed25519 signature
namespace_signature: Signature, // 64-byte Ed25519 signature
}
```
- Created by signing the canonical byte encoding of the `Entry`
- Both signatures cover the same message bytes
- Verification requires both `NamespacePublicKey` and `AuthorPublicKey`
## Sync Types
### SyncOutcome
```rust
pub struct SyncOutcome {
pub heads_received: AuthorHeads,
pub num_recv: usize,
pub num_sent: usize,
}
```
- Tracks the result of a sync session
- `heads_received` accumulates the latest timestamp seen from each author on the remote side
### ProtocolMessage
```rust
pub type ProtocolMessage = ranger::Message<SignedEntry>;
```
- The wire type for sync protocol messages
- Contains `Vec<MessagePart<SignedEntry>>`
### ContentStatus
```rust
pub enum ContentStatus {
Complete, // Content blob fully available
Incomplete, // Partially available
Missing, // Not available
}
```
- Communicated alongside entries during sync
- Helps peers decide whether to download content
### InsertOrigin
```rust
pub enum InsertOrigin {
Local,
Sync {
from: PeerIdBytes, // [u8; 32] — the remote peer
remote_content_status: ContentStatus,
},
}
```
## Event Types
### Event (Internal)
```rust
pub enum Event {
LocalInsert {
namespace: NamespaceId,
entry: SignedEntry,
},
RemoteInsert {
namespace: NamespaceId,
entry: SignedEntry,
from: PeerIdBytes,
should_download: bool,
remote_content_status: ContentStatus,
},
}
```
- Emitted by `Replica` via `ReplicaInfo` subscribers
- `should_download` is determined by the `DownloadPolicy`
### LiveEvent (Public)
```rust
pub enum LiveEvent {
InsertLocal { entry: Entry },
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
ContentReady { hash: Hash },
PendingContentReady,
NeighborUp(PublicKey),
NeighborDown(PublicKey),
SyncFinished(SyncEvent),
}
```
- Emitted by the `Engine` through `subscribe()`
- `InsertLocal` / `InsertRemote` are derived from `Event` by stripping the signature (`SignedEntry` → `Entry`)
- `ContentReady` is emitted when a blob download completes
- The `SyncFinished` variant carries a `SyncEvent`, which reports the network layer's `SyncFinished` result
## Store Types
### Store (store::fs::Store)
```rust
pub struct Store {
db: Database, // redb database
transaction: CurrentTransaction, // Current read/write transaction
open_replicas: HashSet<NamespaceId>, // Track which replicas are open
pubkeys: MemPublicKeyStore, // Cache for expanded public keys
}
```
### Query
```rust
pub struct Query {
kind: QueryKind, // Flat or SingleLatestPerKey
filter_author: AuthorFilter, // Any or Exact
filter_key: KeyFilter, // Any, Exact, or Prefix
limit: Option<u64>,
offset: u64,
include_empty: bool,
sort_direction: SortDirection,
}
```
### Capability
```rust
pub enum Capability {
Write(NamespaceSecret),
Read(NamespaceId),
}
```
- `Write` allows inserting entries and signing them
- `Read` allows syncing and reading but not inserting
- Can be serialized as `(u8, [u8; 32])` — kind byte + key bytes
- `merge()` can upgrade `Read` to `Write`
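A sketch of the upgrade rule, under stated assumptions: the `Write` variant here carries an explicit `id` plus opaque secret bytes instead of a `NamespaceSecret`, and the error is a plain string. Only the shape of the logic is meant to carry over.

```rust
/// Stand-in for the 32-byte namespace identifier.
type NamespaceId = [u8; 32];

#[derive(Debug, Clone, PartialEq, Eq)]
enum Capability {
    /// Stand-in: the real variant holds a NamespaceSecret, from which the ID is derived.
    Write { id: NamespaceId, secret: [u8; 32] },
    Read(NamespaceId),
}

impl Capability {
    /// The namespace this capability refers to.
    fn id(&self) -> NamespaceId {
        match self {
            Capability::Write { id, .. } => *id,
            Capability::Read(id) => *id,
        }
    }

    /// Merge `other` into `self`.
    /// Returns Ok(true) if `self` was upgraded from Read to Write,
    /// Ok(false) if nothing changed, and Err on a namespace mismatch.
    fn merge(&mut self, other: Capability) -> Result<bool, &'static str> {
        if self.id() != other.id() {
            return Err("namespace mismatch");
        }
        if matches!(*self, Capability::Read(_)) && matches!(other, Capability::Write { .. }) {
            *self = other;
            Ok(true)
        } else {
            Ok(false)
        }
    }
}
```

Note that the merge never downgrades: merging a `Read` capability into a `Write` one leaves the write access in place.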
### DownloadPolicy
```rust
pub enum DownloadPolicy {
NothingExcept(Vec<FilterKind>), // Whitelist mode
EverythingExcept(Vec<FilterKind>), // Blacklist mode (default)
}
```
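How a policy decides whether to fetch an entry's content can be sketched as a key match. The `FilterKind` variants `Prefix` and `Exact` and the `should_download` method are assumptions made for this illustration; filters here hold `Vec<u8>` rather than the crate's own byte type.

```rust
/// A key filter (variants assumed for this sketch).
enum FilterKind {
    /// Matches keys that start with the given bytes.
    Prefix(Vec<u8>),
    /// Matches only the exact key.
    Exact(Vec<u8>),
}

impl FilterKind {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(prefix) => key.starts_with(prefix),
            FilterKind::Exact(exact) => key == exact.as_slice(),
        }
    }
}

enum DownloadPolicy {
    /// Download only content whose key matches one of the filters.
    NothingExcept(Vec<FilterKind>),
    /// Download everything except content whose key matches a filter.
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            DownloadPolicy::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            DownloadPolicy::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}
```

With an empty filter list, `EverythingExcept` downloads all content and `NothingExcept` downloads none, which makes the former a natural default.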
### DocTicket
```rust
pub struct DocTicket {
pub capability: Capability,
pub nodes: Vec<EndpointAddr>,
}
```
- Serializable as a base32 string with "doc" prefix
- Contains everything needed to join a document
- The wire format uses a versioned enum: `TicketWireFormat::Variant0(DocTicket)`
## OpenState
```rust
pub struct OpenState {
pub sync: bool, // Whether sync is enabled
pub subscribers: usize, // Number of event subscribers
pub handles: usize, // Number of open handles
}
```
Returned by the `Status` RPC method to report the state of an open document.
## Utility Constants
| Constant | Value | Purpose |
|----------|-------|---------|
| `MAX_TIMESTAMP_FUTURE_SHIFT` | 10 min in μs | Max future drift for entry timestamps |
| `MAX_COMMIT_DELAY` | 500ms | Auto-commit interval for store transactions |
| `ACTION_CAP` | 1024 | Bounded channel capacity for SyncHandle actions |
| `ACTOR_CHANNEL_CAP` | 64 | Channel capacity for LiveActor messages |
| `SUBSCRIBE_CHANNEL_CAP` | 256 | Channel capacity for event subscriptions |
| `PEERS_PER_DOC_CACHE_SIZE` | 5 | LRU cache size for sync peers per document |
| `MAX_MESSAGE_SIZE` | 1 GiB | Max wire message size |

View File

@@ -0,0 +1,59 @@
# iroh-docs Reference Documentation
> Version: 0.98.0
> Repository: https://github.com/n0-computer/iroh-docs
> License: MIT/Apache-2.0
> Based on: [Range-Based Set Reconciliation (Meyer, 2022)](https://arxiv.org/abs/2212.13567)
## Document Index
| # | File | Topic |
|---|------|-------|
| 01 | [Overview and Architecture](01-overview-and-architecture.md) | High-level architecture, module layout, dependencies, feature flags |
| 02 | [Document Model](02-document-model.md) | CRDT data model: namespaces, authors, entries, signatures, prefix deletion, timestamps |
| 03 | [Sync Protocol](03-sync-protocol.md) | Range-based set reconciliation algorithm, fingerprints, message format, Store trait |
| 04 | [Store and Persistence](04-store-and-persistence.md) | redb table schema, transaction model, queries, download policies, PublicKeyStore |
| 05 | [Engine and Live Sync](05-engine-and-live-sync.md) | Engine, LiveActor, GossipState, content download, event system, DefaultAuthor |
| 06 | [Network Protocol](06-network-protocol.md) | ALPN, wire format, Alice/Bob protocol flow, error types, gossip integration |
| 07 | [API and Data Flow](07-api-and-data-flow.md) | RPC API, DocsApi, protocol messages, data flow diagrams |
| 08 | [Key Types Reference](08-key-types-reference.md) | All public types, constants, and their relationships |
## Quick Reference
### Core Concepts
- **Namespace**: A document identity. Identified by `NamespaceId` (32 bytes), backed by an Ed25519 keypair (`NamespaceSecret`).
- **Author**: A writer identity. Identified by `AuthorId` (32 bytes), backed by an Ed25519 keypair (`Author`).
- **Entry**: A record identified by (namespace, author, key) with a value of (hash, len, timestamp).
- **SignedEntry**: An entry with dual Ed25519 signatures (namespace + author) proving authorization and authorship.
- **Replica**: A local instance of a document, holding entries in a store.
- **Capability**: Either `Write(NamespaceSecret)` or `Read(NamespaceId)` — controls whether entries can be inserted.
- **Store**: A `redb`-backed persistent store managing authors, namespaces, entries, and peer caches.
- **Engine**: Coordinates sync actors, gossip, and content downloads for live synchronization.
### Key Algorithms
1. **Range-based set reconciliation**: Efficiently compute the union of two entry sets over a network by comparing fingerprints of partitions, subdividing when fingerprints differ.
2. **Prefix deletion**: An entry at key "foo" supersedes all older entries by the same author whose key starts with "foo"; an empty entry at that key therefore acts as a tombstone for the whole prefix.
3. **Last-writer-wins**: When entries conflict on the same (namespace, author, key), the one with the higher (timestamp, hash) wins.
4. **XOR fingerprints**: Fingerprint of a set is the XOR of individual entry fingerprints (BLAKE3 hashes of key data).
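The XOR construction from the list above is what makes range fingerprints cheap to combine. The sketch below demonstrates the algebra only: it substitutes the standard library's `DefaultHasher` (64-bit, not cryptographic) for BLAKE3, and `entry_fingerprint` and `set_fingerprint` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in fingerprint of a single entry.
/// Assumption: a 64-bit std hash replaces the 32-byte BLAKE3 hash.
fn entry_fingerprint(entry: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    entry.hash(&mut hasher);
    hasher.finish()
}

/// Fingerprint of a set: XOR of all entry fingerprints.
/// The empty set has fingerprint 0, the identity element of XOR.
fn set_fingerprint<'a>(entries: impl IntoIterator<Item = &'a [u8]>) -> u64 {
    entries
        .into_iter()
        .fold(0, |acc, entry| acc ^ entry_fingerprint(entry))
}
```

Since XOR is commutative and associative, the fingerprint does not depend on iteration order, and the fingerprint of a range equals the XOR of the fingerprints of its sub-ranges. That is the property that lets two peers split a mismatching range and compare the halves independently.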
### Data Flow
```
Application → DocsApi → Engine → LiveActor → GossipState → iroh-gossip
↓ ↓
SyncHandle → Actor → Store (redb) ← QUIC streams (iroh)
iroh-blobs (content transfer)
```
### Dependencies
- `iroh` — QUIC networking
- `iroh-blobs` — Content-addressed blob storage and transfer
- `iroh-gossip` — Gossip protocol for live updates
- `redb` — Embedded key-value store
- `ed25519-dalek` — Ed25519 signatures
- `blake3` — Hashing
- `postcard` — Serialization