docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs
@@ -0,0 +1,98 @@
# iroh-docs: Overview and Architecture

> Reference document for the `iroh-docs` crate (v0.98.0).
> Source: `/workspace/iroh-docs`

## What Is iroh-docs?

`iroh-docs` is a Rust crate implementing **multi-dimensional key-value documents with an efficient synchronization protocol**. It provides:

1. **A CRDT-based document model**: Replicas (documents) hold entries identified by namespace + author + key, with content-addressed values (BLAKE3 hashes).
2. **Range-based set reconciliation**: An efficient sync protocol based on [Aljoscha Meyer's paper](https://arxiv.org/abs/2212.13567) for reconciling sets between peers.
3. **Live sync via gossip**: Real-time document updates propagated through an iroh-gossip swarm.
4. **Persistent storage**: A `redb`-backed store supporting both in-memory and file-based modes.

## High-Level Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                        Docs (Protocol)                       │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                      Engine                             │ │
│  │  ┌──────────┐  ┌──────────────┐  ┌───────────────────┐ │ │
│  │  │ LiveActor│  │ GossipState  │  │ SyncHandle/Actor  │ │ │
│  │  │ (events) │  │ (iroh-gossip)│  │ (store + sync)    │ │ │
│  │  └──────────┘  └──────────────┘  └───────────────────┘ │ │
│  └─────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐ │
│  │   Replica      │  │  SignedEntry   │  │   Author/      │ │
│  │   (sync.rs)    │  │  Entry/Record  │  │ Namespace keys │ │
│  └────────────────┘  └────────────────┘  └────────────────┘ │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                    Store (redb)                          │ │
│  │  Authors │ Namespaces │ Records │ RecordsByKey │ ...     │ │
│  └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
```

### Module Layout

| Module | Purpose |
|--------|---------|
| `sync.rs` | Core types: `Replica`, `Entry`, `SignedEntry`, `Record`, `RecordIdentifier`, `Capability`, events |
| `keys.rs` | Cryptographic key types: `Author`, `NamespaceSecret`, `AuthorId`, `NamespaceId` |
| `ranger.rs` | Range-based set reconciliation algorithm implementation |
| `heads.rs` | `AuthorHeads`: latest timestamps per author for efficient sync decisions |
| `store/` | Storage abstraction and `redb`-backed persistent store |
| `store/fs.rs` | File-based `Store` implementation with redb tables |
| `store/pubkeys.rs` | `PublicKeyStore` trait for caching expanded ed25519 public keys |
| `actor.rs` | `SyncHandle` / Actor: single-threaded executor for store and replica operations |
| `engine/` | Live sync coordination: `Engine`, `LiveActor`, `GossipState`, `NamespaceStates` |
| `engine/live.rs` | The `LiveActor` event loop: handles sync, gossip, content download |
| `engine/gossip.rs` | Integration with `iroh-gossip` for broadcasting document operations |
| `engine/state.rs` | `NamespaceStates`: tracks per-namespace, per-peer sync state |
| `net/` | Network protocol: ALPN `/iroh-sync/1`, connection handling |
| `net/codec.rs` | Wire codec: length-prefixed postcard-serialized `Message` frames |
| `protocol.rs` | `Docs` struct (the `ProtocolHandler`) and `Builder` |
| `api/` | irpc-based RPC API for external access |
| `ticket.rs` | `DocTicket`: shareable document capability + peer addresses |

## Key Design Principles

1. **Two-key identity model**: Every entry is uniquely identified by (namespace, author, key). The namespace key provides write authorization; the author key provides attribution.

2. **Content-addressed values**: Entries store a BLAKE3 hash + length, not the actual content. Content blobs are handled separately by `iroh-blobs`.

3. **Prefix deletion**: An entry with key `foo` supersedes all older entries whose keys start with `foo` (byte-wise prefix match); an empty entry with that key acts as a tombstone for the whole prefix. This enables hierarchical key structures.

4. **Last-writer-wins with per-author timestamps**: Entries are ordered by (timestamp, hash). Newer entries dominate older ones. Different authors can have entries for the same key simultaneously (multi-dimensional).

5. **Actor-based concurrency**: All store and replica mutations go through a single `SyncHandle` actor thread, eliminating the need for locks on the store.

6. **Event-driven live sync**: The `LiveActor` coordinates gossip, direct sync, and content downloads through a `tokio::select!` event loop.

## Dependencies

Key dependencies from `Cargo.toml`:

| Crate | Purpose |
|-------|---------|
| `iroh` | Networking: endpoints, connections, protocol routing |
| `iroh-blobs` | Content-addressed blob storage and transfer |
| `iroh-gossip` | Gossip protocol for broadcasting updates |
| `iroh-tickets` | Ticket-based sharing mechanism |
| `redb` | Embedded key-value store for persistence |
| `ed25519-dalek` | Ed25519 signatures for entries |
| `blake3` | Hashing (fingerprints + content hashes) |
| `postcard` | Serialization (wire format for sync protocol) |
| `irpc` / `noq` | RPC framework for API |

## Feature Flags

| Feature | Default | Description |
|---------|---------|-------------|
| `metrics` | Yes | Enables iroh-metrics instrumentation |
| `rpc` | Yes | Enables irpc-based RPC API (depends on `noq`) |
| `fs-store` | Yes | Enables persistent file-based store |
201
docs/research/references/iroh/iroh-docs/02-document-model.md
Normal file
@@ -0,0 +1,201 @@
# iroh-docs: Document Model and CRDT Details

## Core Data Model

### Namespace (Document Identity)

A **Namespace** is the identity of a document. It consists of:

- **`NamespaceSecret`**: An Ed25519 signing key (32 bytes) that grants write capability
- **`NamespacePublicKey`**: The corresponding verifying key (32 bytes)
- **`NamespaceId`**: A `[u8; 32]` that is the byte representation of the public key; this serves as the unique identifier for a document/replica

```
NamespaceSecret (signing key) ──derives──▶ NamespacePublicKey (verifying key)
                                    ──into─────▶ NamespaceId ([u8; 32])
```

### Author (Writer Identity)

An **Author** represents a writer identity within a document. Multiple authors can write to the same namespace.

- **`Author`**: An Ed25519 signing key (32 bytes)
- **`AuthorPublicKey`**: The corresponding verifying key (32 bytes)
- **`AuthorId`**: A `[u8; 32]` byte representation of the public key

Authors are application-defined: an application might create one author per device, per user, or per session.

### Capability

Access to a document is controlled through a `Capability`:

```rust
pub enum Capability {
    Write(NamespaceSecret),  // Full read-write access
    Read(NamespaceId),       // Read-only access (can sync but not insert)
}
```

Capabilities can be **merged**: a `Read` capability can be upgraded to `Write` if a matching `Write` is presented:

```rust
capability.merge(other_capability)  // Read + Write → Write
```

The raw representation is `(u8, [u8; 32])`: a kind byte followed by 32 bytes of key material.

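The merge rule can be sketched with stand-in types. This is a minimal illustration, not the crate's implementation: `NamespaceSecret` here is a hypothetical struct that carries its id directly (the real one derives it from the signing key), and the return type `Result<bool, NamespaceMismatch>` is an assumption made for the example.

```rust
type NamespaceId = [u8; 32];

/// Hypothetical stand-in for the real signing key type.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
struct NamespaceSecret {
    secret: [u8; 32],
    id: NamespaceId, // assumption: stored directly instead of derived
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Capability {
    Write(NamespaceSecret),
    Read(NamespaceId),
}

/// Hypothetical error for merging capabilities of different documents.
#[derive(Debug, PartialEq, Eq)]
struct NamespaceMismatch;

impl Capability {
    fn id(&self) -> NamespaceId {
        match self {
            Capability::Write(secret) => secret.id,
            Capability::Read(id) => *id,
        }
    }

    /// Merge `other` into `self`. Returns `Ok(true)` if `self` was upgraded.
    fn merge(&mut self, other: Capability) -> Result<bool, NamespaceMismatch> {
        if self.id() != other.id() {
            return Err(NamespaceMismatch);
        }
        // Only Read + Write changes anything; every other combination is a no-op.
        if matches!(*self, Capability::Read(_)) && matches!(other, Capability::Write(_)) {
            *self = other;
            Ok(true)
        } else {
            Ok(false)
        }
    }
}
```

In this sketch a `Write` capability is never downgraded, and capabilities for different namespaces never merge.
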
### Entry (The Fundamental Record)

An **`Entry`** is the core data unit, consisting of:

```rust
pub struct Entry {
    id: RecordIdentifier,  // (namespace, author, key)
    record: Record,        // (hash, len, timestamp)
}
```

#### RecordIdentifier

```rust
pub struct RecordIdentifier(Bytes);  // namespace[0..32] || author[32..64] || key[64..]
```

The key is a variable-length byte sequence. `RecordIdentifier` implements `Ord` by comparing namespace first, then author, then key. This ordering is critical for the range-based sync algorithm.

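The layout above can be illustrated with plain byte vectors. The helper names `record_id` and `split_id` are hypothetical and exist only for this sketch; it shows that, because namespace and author are fixed-width, comparing the concatenated bytes lexicographically gives the same result as comparing (namespace, author, key) field by field.

```rust
/// Build the concatenated identifier: namespace (32 bytes) || author (32 bytes) || key.
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(64 + key.len());
    out.extend_from_slice(namespace);
    out.extend_from_slice(author);
    out.extend_from_slice(key);
    out
}

/// Split an identifier back into (namespace, author, key).
/// Returns `None` if the input is shorter than the two fixed-width fields.
fn split_id(id: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    if id.len() < 64 {
        return None;
    }
    Some((&id[..32], &id[32..64], &id[64..]))
}
```

For example, an identifier with a smaller author id sorts before one with a larger author id in the same namespace, regardless of the key bytes that follow.
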
#### Record

```rust
pub struct Record {
    len: u64,        // byte length of the content
    hash: Hash,      // BLAKE3 hash of the content (32 bytes)
    timestamp: u64,  // microseconds since Unix epoch
}
```

The `Record` comparison uses `(timestamp, hash)` ordering. This is the **Last-Writer-Wins** rule for same-key entries: when two records for the same key exist, the one with the higher timestamp wins; if timestamps are equal, the higher hash wins as a tiebreaker.

### SignedEntry (Entry with Proofs)

```rust
pub struct SignedEntry {
    signature: EntrySignature,  // dual Ed25519 signatures
    entry: Entry,
}
```

#### EntrySignature

```rust
pub struct EntrySignature {
    author_signature: Signature,     // 64-byte Ed25519 signature
    namespace_signature: Signature,  // 64-byte Ed25519 signature
}
```

Both signatures cover the canonical byte encoding of the `Entry` (id + record). This means:
- The **namespace signature** proves write authorization (only holders of `NamespaceSecret` can produce valid entries)
- The **author signature** proves authorship (provides attribution and non-repudiation)

#### Verification

```rust
fn verify<S: PublicKeyStore>(&self, store: &S) -> Result<(), SignatureError>
```

Verification requires both the `NamespacePublicKey` and `AuthorPublicKey`, which are derived from the entry's namespace and author IDs. The `PublicKeyStore` trait provides caching for these expanded keys.

### Empty Entries (Tombstones / Prefix Deletion)

An entry is **empty** when `hash == Hash::EMPTY && len == 0`. Empty entries serve as **deletion markers**:

- **Key deletion**: Inserting an empty entry with the exact key removes the previous entry for that key
- **Prefix deletion**: Inserting an empty entry with key "foo" removes all entries whose keys start with "foo" (prefix deletion)

```rust
pub async fn delete_prefix(&mut self, prefix: impl AsRef<[u8]>, author: &Author) -> Result<usize, InsertError>
```

### Insert Semantics (CRDT Rules)

When a `SignedEntry` is inserted into a replica via `Store::put()` (the ranger store trait):

1. **Check prefixes**: Look up all existing entries whose key is a **prefix** of the new entry's key. If any prefix entry has a value `>=` the new entry's value, the new entry is **rejected** (`InsertOutcome::NotInserted`).

2. **Remove dominated entries**: Remove all existing entries whose key **starts with** the new entry's key (i.e., the new key is a prefix of theirs) AND whose value is `<=` the new entry's value.

3. **Insert**: If not rejected, the new entry is stored.

This implements a **prefix-aware last-writer-wins** CRDT:
- Newer entries for the same (namespace, author, key) tuple replace older ones
- A new entry at key "/foo" can delete all entries under "/foo/*" if it's newer
- Different authors can coexist on the same key: each author's latest entry is kept

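The three steps can be sketched with a toy single-author store. This is an illustration under simplifying assumptions, not the crate's code: `ToyStore` is hypothetical, keys are plain byte vectors without the namespace and author parts, the value is a `(timestamp, hash)` tuple with a `u64` standing in for the BLAKE3 hash, and a key counts as a prefix of itself.

```rust
use std::collections::BTreeMap;

/// Toy record value, ordered by (timestamp, hash). `hash` is a stand-in u64.
type Value = (u64, u64);

#[derive(Debug, PartialEq, Eq)]
enum InsertOutcome {
    NotInserted,
    Inserted { removed: usize },
}

/// Hypothetical in-memory store: key bytes -> value.
#[derive(Default)]
struct ToyStore {
    entries: BTreeMap<Vec<u8>, Value>,
}

impl ToyStore {
    fn put(&mut self, key: &[u8], value: Value) -> InsertOutcome {
        // 1. Reject if an existing entry at a prefix of `key` (or at `key` itself)
        //    is at least as new as the incoming value.
        let dominated = self
            .entries
            .iter()
            .any(|(k, v)| key.starts_with(k) && value <= *v);
        if dominated {
            return InsertOutcome::NotInserted;
        }
        // 2. Remove entries whose key starts with `key` and that the new value dominates.
        let before = self.entries.len();
        self.entries
            .retain(|k, v| !(k.starts_with(key) && *v <= value));
        let removed = before - self.entries.len();
        // 3. Store the new entry.
        self.entries.insert(key.to_vec(), value);
        InsertOutcome::Inserted { removed }
    }
}
```

Note that step 2 only removes entries the new value dominates, so a child entry with a newer timestamp survives an insert at its parent key.
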
### Timestamp and Future Shift

Timestamps are in **microseconds since Unix epoch**. There is a maximum allowed future shift:

```rust
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;
```

Entries with timestamps more than 10 minutes ahead of the local clock are meant to be rejected during validation. The expression evaluates to 600,000 (10 × 60 × 1,000), which is 10 minutes expressed in milliseconds; because entry timestamps are in microseconds, the unit used in the actual comparison should be checked against the source.

### Content Status

Each entry's content has an availability status:

```rust
pub enum ContentStatus {
    Complete,    // Content blob is fully available locally
    Incomplete,  // Partially available
    Missing,     // Not available
}
```

This status is communicated during sync to help peers decide whether to download content.

### AuthorHeads (Efficient Sync Optimization)

`AuthorHeads` tracks the latest timestamp for each author in a document:

```rust
pub struct AuthorHeads {
    heads: BTreeMap<AuthorId, Timestamp>,
}
```

This enables a quick check, `has_news_for(other)`, which compares local and remote heads to determine whether sync would yield any new entries. If all timestamps are at least as recent locally, no sync is needed.

`AuthorHeads` can be serialized with a size limit, dropping the oldest entries when the limit is exceeded.

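A minimal sketch of the heads comparison follows. The struct mirrors the definition above, but the method bodies are assumptions written for illustration (the `insert` helper in particular is hypothetical), and `Timestamp` is simplified to a `u64`.

```rust
use std::collections::BTreeMap;

type AuthorId = [u8; 32];
type Timestamp = u64;

#[derive(Debug, Default, Clone)]
struct AuthorHeads {
    heads: BTreeMap<AuthorId, Timestamp>,
}

impl AuthorHeads {
    /// Record `timestamp` for `author`, keeping the maximum seen so far.
    fn insert(&mut self, author: AuthorId, timestamp: Timestamp) {
        let head = self.heads.entry(author).or_insert(timestamp);
        *head = (*head).max(timestamp);
    }

    /// True if `self` knows an author that `other` lacks, or has a newer
    /// timestamp for an author both know.
    fn has_news_for(&self, other: &AuthorHeads) -> bool {
        self.heads
            .iter()
            .any(|(author, ts)| match other.heads.get(author) {
                None => true,
                Some(other_ts) => ts > other_ts,
            })
    }
}
```

When `remote.has_news_for(&local)` is false, every remote head is already covered locally and a sync round can be skipped.
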
## Event System

Replicas emit events through a subscription system:

```rust
pub enum Event {
    LocalInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
    },
    RemoteInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
        from: PeerIdBytes,
        should_download: bool,           // based on download policy
        remote_content_status: ContentStatus,
    },
}
```

Subscribers use `async_channel` for non-blocking notification delivery. The `ReplicaInfo::subscribe()` method registers a sender, and events are fanned out to all subscribers.

## Validation

Entry validation during insertion checks:

1. **Namespace match**: The entry's namespace must match the replica's namespace
2. **Signature verification**: For non-local entries, both namespace and author signatures are verified
3. **Timestamp check**: The entry must not be more than `MAX_TIMESTAMP_FUTURE_SHIFT` in the future
4. **Empty entry check**: An empty entry must have `hash == EMPTY && len == 0`, and a non-empty entry must have `len != 0`
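
The non-cryptographic checks in this list can be sketched as follows. Everything here is a simplified assumption: `ToyEntry`, `ValidationFailure` and `validate` are hypothetical names, signature verification (check 2) is omitted because it needs a crypto library, `EMPTY_HASH` is a placeholder rather than the real `Hash::EMPTY` value, and the sketch treats any mismatch between "hash is empty" and "length is zero" as invalid.

```rust
type NamespaceId = [u8; 32];

/// Placeholder only; not the real value of `Hash::EMPTY`.
const EMPTY_HASH: [u8; 32] = [0u8; 32];

#[derive(Debug, Clone, Copy)]
struct ToyEntry {
    namespace: NamespaceId,
    hash: [u8; 32],
    len: u64,
    timestamp: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum ValidationFailure {
    WrongNamespace,
    TooFarInTheFuture,
    InvalidEmptyEntry,
}

/// `now` and `max_future_shift` must use the same unit as `entry.timestamp`.
fn validate(
    entry: &ToyEntry,
    replica: &NamespaceId,
    now: u64,
    max_future_shift: u64,
) -> Result<(), ValidationFailure> {
    // 1. Namespace match.
    if &entry.namespace != replica {
        return Err(ValidationFailure::WrongNamespace);
    }
    // 3. Timestamp must not exceed now + allowed shift.
    if entry.timestamp > now.saturating_add(max_future_shift) {
        return Err(ValidationFailure::TooFarInTheFuture);
    }
    // 4. Empty hash and zero length must go together.
    let hash_is_empty = entry.hash == EMPTY_HASH;
    let len_is_zero = entry.len == 0;
    if hash_is_empty != len_is_zero {
        return Err(ValidationFailure::InvalidEmptyEntry);
    }
    Ok(())
}
```

Passing the shift as a parameter keeps the sketch independent of the unit question around `MAX_TIMESTAMP_FUTURE_SHIFT`.
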
272
docs/research/references/iroh/iroh-docs/03-sync-protocol.md
Normal file
@@ -0,0 +1,272 @@
# iroh-docs: Range-Based Set Reconciliation (Ranger)

## Overview

The sync protocol in iroh-docs is based on **Range-Based Set Reconciliation**, implementing the algorithm described in [Aljoscha Meyer's paper (arXiv:2212.13567)](https://arxiv.org/abs/2212.13567).

The core idea: two peers can efficiently compute the union of their entry sets by recursively partitioning the sets and comparing **fingerprints** (hashes) of partitions. When fingerprints match, no further work is needed. When they differ, the partition is subdivided until the difference can be resolved by sending the actual entries.

## Key Abstractions

### RangeEntry Trait

```rust
pub trait RangeEntry: Debug + Clone {
    type Key: RangeKey;
    type Value: RangeValue;

    fn key(&self) -> &Self::Key;
    fn value(&self) -> &Self::Value;
    fn as_fingerprint(&self) -> Fingerprint;
}
```

`SignedEntry` implements `RangeEntry`:
- `Key` = `RecordIdentifier` (namespace || author || key bytes)
- `Value` = `Record` (timestamp, hash, len)
- Fingerprint = BLAKE3 hash of (namespace || author || key || timestamp || content_hash)

### RangeKey Trait

```rust
pub trait RangeKey: Sized + Debug + Ord + PartialEq + Clone + 'static {
    fn is_prefix_of(&self, other: &Self) -> bool;  // test-only
}
```

`RecordIdentifier` implements this via byte-level prefix matching: `(namespace, author, key)` where key prefix matching supports the hierarchical deletion semantics.

### RangeValue Trait

```rust
pub trait RangeValue: Sized + Debug + Ord + PartialEq + Clone + 'static {}
```

`Record` implements `RangeValue` with ordering by `(timestamp, hash)`, the Last-Writer-Wins ordering.

### Fingerprint

```rust
pub struct Fingerprint(pub [u8; 32]);  // BLAKE3 hash
```

The fingerprint of a range is computed by XOR-ing the fingerprints of the individual entries within it, starting from the empty-set fingerprint. This means:
- The fingerprint of the empty set is `BLAKE3([])` (the hash of nothing)
- Adding/removing an entry toggles its contribution via XOR
- Equal sets produce equal fingerprints, independent of entry order, so a mismatch proves that two ranges differ

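These XOR properties can be demonstrated with a toy fingerprint. The sketch below is only an illustration: it uses a 64-bit FNV-1a hash (`toy_hash`, a hypothetical helper) in place of the 32-byte BLAKE3 digest, and raw byte strings in place of signed entries.

```rust
/// Stand-in 64-bit hash (FNV-1a). The real crate uses 32-byte BLAKE3 digests.
fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Fingerprint of a set of entries: start from the empty-set value
/// (the hash of no input) and XOR in each entry's own fingerprint.
fn fingerprint<'a>(entries: impl IntoIterator<Item = &'a [u8]>) -> u64 {
    let empty = toy_hash(&[]);
    entries.into_iter().fold(empty, |acc, e| acc ^ toy_hash(e))
}
```

Because XOR is commutative and self-inverse, the result does not depend on iteration order, and XOR-ing an entry's fingerprint into a range fingerprint a second time removes its contribution again.
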
## Range Concept

A `Range<K>` represents a half-open interval `[x, y)` in the key space, with special semantics:

```rust
pub(crate) struct Range<K> {
    x: K,
    y: K,
}
```

- `x == y`: The entire set (all elements)
- `x < y`: Standard half-open interval `[x, y)`, which includes `x` and excludes `y`
- `x > y`: Wrapping range, covering elements from `x` to the end plus the beginning up to `y`

This wrapping range concept allows the algorithm to work with circular key spaces where the "first" element might be anywhere.

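The three cases translate directly into a membership test. The `contains` method below is a sketch written for this document; its name and signature are assumptions and may not match the crate's own helper.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone)]
struct Range<K> {
    x: K,
    y: K,
}

impl<K: Ord> Range<K> {
    /// Membership test covering the three range shapes described above.
    fn contains(&self, t: &K) -> bool {
        match self.x.cmp(&self.y) {
            // x == y: the whole key space
            Ordering::Equal => true,
            // x < y: ordinary half-open interval [x, y)
            Ordering::Less => &self.x <= t && t < &self.y,
            // x > y: wraps around, [x, end] plus [start, y)
            Ordering::Greater => &self.x <= t || t < &self.y,
        }
    }
}
```

With integer keys, `Range { x: 7, y: 3 }` contains 8 and 1 but not 3 or 5.
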
## Protocol Messages

```rust
pub type ProtocolMessage = crate::ranger::Message<SignedEntry>;
```

### Message Structure

```rust
pub struct Message<E: RangeEntry> {
    parts: Vec<MessagePart<E>>,
}

pub enum MessagePart<E: RangeEntry> {
    RangeFingerprint(RangeFingerprint<E::Key>),  // "Here's a fingerprint for this range"
    RangeItem(RangeItem<E>),                      // "Here are the entries in this range"
}

pub struct RangeFingerprint<K> {
    range: Range<K>,
    fingerprint: Fingerprint,
}

pub struct RangeItem<E: RangeEntry> {
    range: Range<E::Key>,
    values: Vec<(E, ContentStatus)>,
    have_local: bool,  // If true, sender already has these entries
}
```

The `have_local` flag is an optimization: when a peer sends entries AND indicates it already has them locally, the receiver doesn't need to send its own entries in that range back.

### Wire Format

Messages are serialized using `postcard` (a compact serde format) and framed with a 4-byte big-endian length prefix via `SyncCodec`:

```
┌─────────────────┬──────────────────────────────┐
│  u32 BE length  │  postcard-encoded Message     │
└─────────────────┴──────────────────────────────┘
```

Max message size: 1 GiB (`MAX_MESSAGE_SIZE = 1024 * 1024 * 1024`).

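The framing can be sketched with the standard library alone. This is not the `SyncCodec` implementation: `write_frame` and `read_frame` are hypothetical helpers, the payload is treated as opaque bytes (the postcard encoding step is left out), and blocking `std::io` traits stand in for the async streams a real codec would use.

```rust
use std::io::{self, Read, Write};

const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

/// Write one frame: 4-byte big-endian length, then the payload bytes.
fn write_frame<W: Write>(mut w: W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "message too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting length prefixes above the maximum size
/// before allocating the payload buffer.
fn read_frame<R: Read>(mut r: R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "message too large"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the length prefix before allocating keeps a malformed or hostile prefix from triggering an oversized allocation.
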
## Sync Algorithm Walkthrough

### 1. Initiation (Alice → Bob)

Alice generates the initial message:

```rust
fn init<S: Store<E>>(store: &mut S) -> Result<Self, S::Error> {
    let x = store.get_first()?;           // First key, or default
    let range = Range::new(x.clone(), x); // "All elements" range
    let fingerprint = store.get_fingerprint(&range)?;
    Ok(Message { parts: vec![RangeFingerprint { range, fingerprint }] })
}
```

This sends a single fingerprint covering the entire set.

### 2. Processing (Bob processes Alice's message)

For each part in the message:

**Case 1: RangeFingerprint matches local fingerprint** → Nothing to do, sets are equal in this range.

**Case 2: RangeFingerprint is empty OR range has ≤ 1 local entry** → Send all entries in the range as a `RangeItem`.

**Case 3: Recurse** → Split the range into `split_factor` partitions, compute fingerprints, and send either `RangeFingerprint` (if partition is large) or `RangeItem` (if partition is small enough, ≤ `max_set_size`).

### 3. Processing RangeItem

When a peer receives a `RangeItem`:

1. **Validate** each incoming entry using `validate_cb`
2. **Insert** valid entries via `Store::put()` (which handles prefix deletion)
3. **Notify** via `on_insert_cb` for actually-inserted entries
4. If `have_local` is false, compute the **diff** (entries in the local range not present in the received set) and send them back

### Configuration

```rust
struct SyncConfig {
    max_set_size: usize,   // Default: 1 (entries to send before using fingerprints)
    split_factor: usize,   // Default: 2 (number of partitions per recursion step)
}
```

With `max_set_size = 1` and `split_factor = 2`, the algorithm behaves like a binary search: each fingerprint mismatch splits the range in two and sends fingerprints for both halves.

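The binary-search behaviour can be seen in a heavily simplified local simulation. The sketch below is not the ranger implementation: both sets are visible to one function instead of being exchanged in messages, keys are `u64` values, ranges are split at the key midpoint rather than by entry count, the per-key hash `mix` is a splitmix64-style stand-in for BLAKE3, and the empty fingerprint is `0` for brevity. All names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Stand-in per-key hash (splitmix64 finalizer); the real crate uses BLAKE3.
fn mix(mut z: u64) -> u64 {
    z = z.wrapping_add(0x9e3779b97f4a7c15);
    z = (z ^ (z >> 30)).wrapping_mul(0xbf58476d1ce4e5b9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94d049bb133111eb);
    z ^ (z >> 31)
}

/// XOR fingerprint of all keys in the half-open range [lo, hi).
fn fingerprint(set: &BTreeSet<u64>, lo: u64, hi: u64) -> u64 {
    set.range(lo..hi).fold(0, |acc, k| acc ^ mix(*k))
}

/// Collect the keys in [lo, hi) that only one of the two sets contains.
fn reconcile(a: &BTreeSet<u64>, b: &BTreeSet<u64>, lo: u64, hi: u64, diff: &mut Vec<u64>) {
    if fingerprint(a, lo, hi) == fingerprint(b, lo, hi) {
        return; // fingerprints match: nothing to exchange for this range
    }
    let small = a.range(lo..hi).count() <= 1 || b.range(lo..hi).count() <= 1;
    if small {
        // Base case: one side holds at most one entry, so compare items directly.
        let ra: BTreeSet<u64> = a.range(lo..hi).copied().collect();
        let rb: BTreeSet<u64> = b.range(lo..hi).copied().collect();
        diff.extend(ra.symmetric_difference(&rb).copied());
        return;
    }
    // Recursive case: split in two (split_factor = 2) and compare each half.
    let mid = lo + (hi - lo) / 2;
    reconcile(a, b, lo, mid, diff);
    reconcile(a, b, mid, hi, diff);
}
```

Only the halves whose fingerprints differ are explored further, so the work concentrates on the parts of the key space where the two sets actually disagree.
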
## Store Trait

The `Store` trait provides the interface that the reconciliation algorithm needs:

```rust
pub trait Store<E: RangeEntry>: Sized {
    type Error: Debug + Send + Sync + Into<anyhow::Error> + 'static;
    type RangeIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
    type ParentIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;

    fn get_first(&mut self) -> Result<E::Key, Self::Error>;
    fn get_fingerprint(&mut self, range: &Range<E::Key>) -> Result<Fingerprint, Self::Error>;
    fn entry_put(&mut self, entry: E) -> Result<(), Self::Error>;
    fn get_range(&mut self, range: Range<E::Key>) -> Result<Self::RangeIterator<'_>, Self::Error>;
    fn prefixes_of(&mut self, key: &E::Key) -> Result<Self::ParentIterator<'_>, Self::Error>;
    fn remove_prefix_filtered(&mut self, prefix: &E::Key, predicate: impl Fn(&E::Value) -> bool) -> Result<usize, Self::Error>;
    fn initial_message(&mut self) -> Result<Message<E>, Self::Error>;
    async fn process_message<F, F2, F3>(...) -> Result<Option<Message<E>>, Self::Error>;
    fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error>;
}
```

### Insert Semantics in `Store::put()`

The `put` method implements the CRDT insert logic:

```rust
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error> {
    // 1. Check prefix entries: if any parent entry has value >= new entry, reject
    for prefix_entry in self.prefixes_of(entry.key())? {
        if entry.value() <= prefix_entry.value() {
            return Ok(InsertOutcome::NotInserted);
        }
    }

    // 2. Remove entries whose key is prefixed by new entry's key AND whose value is <=
    let removed = self.remove_prefix_filtered(entry.key(), |v| entry.value() >= v)?;

    // 3. Insert the new entry
    self.entry_put(entry)?;
    Ok(InsertOutcome::Inserted { removed })
}
```

### InsertOutcome

```rust
enum InsertOutcome {
    NotInserted,                    // A newer or equal entry already exists
    Inserted { removed: usize },    // Successfully inserted; reports removed entries
}
```

## Sync Flow at the Protocol Level

The `Replica` type provides the sync interface:

```rust
// Create initial message for sync
fn sync_initial_message(&mut self) -> anyhow::Result<ProtocolMessage>

// Process an incoming message and produce optional reply
async fn sync_process_message(
    &mut self,
    message: ProtocolMessage,
    from_peer: PeerIdBytes,
    state: &mut SyncOutcome,
) -> Result<Option<ProtocolMessage>, anyhow::Error>
```

### SyncOutcome

Tracks the result of a sync session:

```rust
pub struct SyncOutcome {
    pub heads_received: AuthorHeads,  // Latest timestamps per author from remote
    pub num_recv: usize,              // Number of entries received
    pub num_sent: usize,              // Number of entries sent
}
```

## Network Protocol (Codec)

The sync protocol operates over a QUIC bidirectional stream:

1. **Alice** (initiator) sends `Message::Init { namespace, message }`
2. **Bob** (responder) validates the namespace and either:
   - Accepts and processes the initial message
   - Rejects with `Message::Abort { reason }`
3. Both peers exchange `Message::Sync(message)` rounds until one side has no reply (convergence reached)

`BobState` manages the responder side, tracking namespace and `SyncOutcome` progress across message rounds.

### Abort Reasons

```rust
pub enum AbortReason {
    NotFound,            // Namespace not available
    AlreadySyncing,      // Already syncing this namespace
    InternalServerError,
}
```

### Concurrent Sync Prevention

When both peers try to sync with each other simultaneously, the system uses a deterministic tiebreaker based on comparing `EndpointId` bytes: the peer with the larger ID accepts, the other connects.
@@ -0,0 +1,257 @@
# iroh-docs: Store and Persistence

## Store Architecture

The store is implemented in `store::fs::Store` using `redb`, an embedded key-value database. It supports two modes:

- **In-memory**: `Store::memory()`, backed by a `Vec<u8>` via `redb::backends::InMemoryBackend`
- **Persistent**: `Store::persistent(path)`, backed by a single file on disk

Both modes use the same `redb` table structure.

## redb Table Schema

### Authors Table

```
Table: "authors-1"
Key:   [u8; 32]  (AuthorId)
Value: [u8; 32]  (Author secret key bytes)
```

### Namespaces Table

```
Table: "namespaces-2"
Key:   [u8; 32]           (NamespaceId)
Value: (u8, [u8; 32])     (CapabilityKind, key bytes)
```

The `CapabilityKind` discriminates between `Write = 1` (full key stored) and `Read = 2` (only the public key / namespace ID stored).

### Records Table (Primary)

```
Table: "records-1"
Key:   (NamespaceId, AuthorId, key_bytes)  = ([u8; 32], [u8; 32], &[u8])
Value: (timestamp, namespace_sig, author_sig, len, hash) = (u64, &[u8; 64], &[u8; 64], u64, &[u8; 32])
```

This is the main table storing all document entries. The key layout `(namespace, author, key)` enables efficient range queries for the sync algorithm.

### Latest-Per-Author Table

```
Table: "latest-by-author-1"
Key:   (NamespaceId, AuthorId) = (&[u8; 32], &[u8; 32])
Value: (timestamp, key_bytes)  = (u64, &[u8])
```

Used to quickly determine the latest entry timestamp for each author, supporting `AuthorHeads` computation and `has_news_for_us()` checks.

### Records-By-Key Table (Index)

```
Table: "records-by-key-1"
Key:   (NamespaceId, key_bytes, AuthorId) = (&[u8; 32], &[u8], &[u8; 32])
Value: ()
```

An index table that enables efficient queries by key prefix, supporting `Query::key_prefix()` and `Query::key_exact()` lookups.

### Namespace Peers Table (Multimap)

```
MultimapTable: "sync-peers-1"
Key:   &[u8; 32]                    (NamespaceId)
Value: (Nanos, &PeerIdBytes)        (timestamp_nanos, peer_id)
```

Stores up to 5 (`PEERS_PER_DOC_CACHE_SIZE`) recently-useful peers per namespace. This is an LRU cache: when full, the oldest peer is evicted when a new one is registered.

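The eviction behaviour can be sketched in memory. This is an assumption-laden illustration rather than the redb-backed implementation: `PeerCache` is a hypothetical type holding the peers of a single namespace in a `Vec`, and the caller supplies the timestamp explicitly.

```rust
const PEERS_PER_DOC_CACHE_SIZE: usize = 5;

type PeerIdBytes = [u8; 32];

/// Hypothetical per-namespace cache of (timestamp_nanos, peer_id) pairs.
#[derive(Default)]
struct PeerCache {
    peers: Vec<(u64, PeerIdBytes)>,
}

impl PeerCache {
    fn register_useful_peer(&mut self, peer: PeerIdBytes, now_nanos: u64) {
        // Re-registering a known peer only refreshes its timestamp.
        self.peers.retain(|(_, p)| *p != peer);
        if self.peers.len() >= PEERS_PER_DOC_CACHE_SIZE {
            // Evict the entry with the oldest timestamp.
            let oldest = self
                .peers
                .iter()
                .enumerate()
                .min_by_key(|(_, (ts, _))| *ts)
                .map(|(i, _)| i);
            if let Some(i) = oldest {
                self.peers.remove(i);
            }
        }
        self.peers.push((now_nanos, peer));
    }
}
```

Refreshing the timestamp on re-registration is what makes this least-recently-used rather than first-in-first-out.
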
### Download Policy Table

```
Table: "download-policy-1"
Key:   &[u8; 32]    (NamespaceId)
Value: &[u8]        (postcard-encoded DownloadPolicy)
```

Per-namespace download policies controlling which content blobs to automatically download.

## Store Operations

### Transaction Model

The `Store` uses a "current transaction" approach:

```rust
enum CurrentTransaction {
    None,
    Read(ReadOnlyTables),
    Write(TransactionAndTables),
}
```

- Read operations obtain a read snapshot
- Write operations batch into a write transaction
- Transactions older than `MAX_COMMIT_DELAY` (500ms) are automatically committed
- `flush()` commits any pending write transaction

### Core Methods

```rust
// Create/open/close replicas
fn new_replica(&mut self, namespace: NamespaceSecret) -> Result<Replica<'_>>;
fn open_replica(&mut self, namespace_id: &NamespaceId) -> Result<Replica<'_>>;
fn close_replica(&mut self, id: NamespaceId);
fn import_namespace(&mut self, capability: Capability) -> Result<ImportNamespaceOutcome>;

// Author management
fn new_author<R: CryptoRng>(&mut self, rng: &mut R) -> Result<Author>;
fn import_author(&mut self, author: Author) -> Result<()>;
fn get_author(&mut self, author_id: &AuthorId) -> Result<Option<Author>>;
fn delete_author(&mut self, author: AuthorId) -> Result<()>;

// Queries
fn get_many(&mut self, namespace: NamespaceId, query: impl Into<Query>) -> Result<QueryIterator>;
fn get_exact(&mut self, namespace: NamespaceId, author: AuthorId, key: impl AsRef<[u8]>, include_empty: bool) -> Result<Option<SignedEntry>>;
fn get_latest_for_each_author(&mut self, namespace: NamespaceId) -> Result<LatestIterator<'_>>;

// Sync support
fn has_news_for_us(&mut self, namespace: NamespaceId, heads: &AuthorHeads) -> Result<Option<NonZeroU64>>;
fn get_sync_peers(&mut self, namespace: &NamespaceId) -> Result<Option<PeersIter>>;
fn register_useful_peer(&mut self, namespace: NamespaceId, peer: PeerIdBytes) -> Result<()>;

// Content
fn content_hashes(&mut self) -> Result<ContentHashesIterator>;
```

### ImportNamespaceOutcome

```rust
pub enum ImportNamespaceOutcome {
    Inserted,   // New namespace created
    Upgraded,   // Existing namespace upgraded from Read to Write
    NoChange,   // Namespace already existed with same or higher capability
}
```

## Query System
|
||||
|
||||
The `Query` type supports flexible entry lookups:
|
||||
|
||||
```rust
|
||||
pub struct Query {
|
||||
kind: QueryKind,
|
||||
filter_author: AuthorFilter,
|
||||
filter_key: KeyFilter,
|
||||
limit: Option<u64>,
|
||||
offset: u64,
|
||||
include_empty: bool,
|
||||
sort_direction: SortDirection,
|
||||
}
|
||||
```
|
||||
|
||||
### Query Kinds
|
||||
|
||||
```rust
|
||||
enum QueryKind {
|
||||
Flat(FlatQuery), // Returns all matching entries
|
||||
SingleLatestPerKey(SingleLatestPerKeyQuery), // Returns only latest entry per key
|
||||
}
|
||||
```
|
||||
|
||||
- **Flat**: Returns all entries matching the filters, sorted by `(namespace, author, key)` or `(namespace, key, author)` depending on `SortBy`
|
||||
- **SingleLatestPerKey**: Groups by key and returns only the latest entry (by record value ordering) per key
|
||||
|
||||
### Filters
|
||||
|
||||
```rust
|
||||
enum KeyFilter {
|
||||
Any, // Match all keys
|
||||
Exact(Bytes), // Exact key match
|
||||
Prefix(Bytes), // Key starts with prefix
|
||||
}
|
||||
|
||||
enum AuthorFilter {
|
||||
Any, // Match all authors
|
||||
Exact(AuthorId), // Match specific author
|
||||
}
|
||||
```
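
How these filters are evaluated against an entry can be sketched as follows. This is an illustration only: it uses `Vec<u8>` in place of `Bytes` and a raw 32-byte array in place of `AuthorId`, and the `matches` methods are assumed names.

```rust
/// Stand-in for the key filter, with Vec<u8> instead of Bytes.
pub enum KeyFilter {
    Any,
    Exact(Vec<u8>),
    Prefix(Vec<u8>),
}

/// Stand-in for the author filter, with a raw 32-byte id.
pub enum AuthorFilter {
    Any,
    Exact([u8; 32]),
}

impl KeyFilter {
    /// True if `key` passes this filter.
    pub fn matches(&self, key: &[u8]) -> bool {
        match self {
            KeyFilter::Any => true,
            KeyFilter::Exact(k) => k.as_slice() == key,
            KeyFilter::Prefix(p) => key.starts_with(p),
        }
    }
}

impl AuthorFilter {
    /// True if `author` passes this filter.
    pub fn matches(&self, author: &[u8; 32]) -> bool {
        match self {
            AuthorFilter::Any => true,
            AuthorFilter::Exact(a) => a == author,
        }
    }
}
```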
|
||||
|
||||
### Builder Pattern
|
||||
|
||||
```rust
|
||||
// Get all entries
|
||||
Query::all()
|
||||
|
||||
// Get entries by author
|
||||
Query::author(author_id)
|
||||
|
||||
// Get entries by key prefix
|
||||
Query::key_prefix(b"/path/")
|
||||
|
||||
// Get single latest entry per key
|
||||
Query::single_latest_per_key()
|
||||
.key_prefix(b"/path/")
|
||||
.author(author_id)
|
||||
```
|
||||
|
||||
## Download Policy
|
||||
|
||||
Controls which content blobs to automatically download after sync:
|
||||
|
||||
```rust
|
||||
pub enum DownloadPolicy {
|
||||
NothingExcept(Vec<FilterKind>), // Only download matching entries
|
||||
EverythingExcept(Vec<FilterKind>), // Download all except matching (default)
|
||||
}
|
||||
|
||||
pub enum FilterKind {
|
||||
Prefix(Bytes), // Matches keys starting with bytes
|
||||
Exact(Bytes), // Matches exact key
|
||||
}
|
||||
```
|
||||
|
||||
Default: `EverythingExcept(Vec::new())` — download everything.
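
The two policy variants are mirror images of each other, which a short sketch makes explicit. The types below are simplified copies of the enums above (with `Vec<u8>` instead of `Bytes`), and `should_download` is an assumed method name used for illustration.

```rust
pub enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    pub fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(p) => key.starts_with(p),
            FilterKind::Exact(k) => k.as_slice() == key,
        }
    }
}

pub enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl Default for DownloadPolicy {
    /// An empty exclusion list: download everything.
    fn default() -> Self {
        DownloadPolicy::EverythingExcept(Vec::new())
    }
}

impl DownloadPolicy {
    /// Decide whether the content for an entry with this key should be fetched.
    pub fn should_download(&self, key: &[u8]) -> bool {
        match self {
            // Allow-list: at least one filter must match.
            DownloadPolicy::NothingExcept(f) => f.iter().any(|x| x.matches(key)),
            // Deny-list: no filter may match.
            DownloadPolicy::EverythingExcept(f) => !f.iter().any(|x| x.matches(key)),
        }
    }
}
```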
|
||||
|
||||
## PublicKeyStore
|
||||
|
||||
The `PublicKeyStore` trait caches expanded `ed25519_dalek::VerifyingKey` objects to avoid repeated curve point decompression:
|
||||
|
||||
```rust
|
||||
pub trait PublicKeyStore {
|
||||
fn public_key(&self, id: &[u8; 32]) -> Result<VerifyingKey, SignatureError>;
|
||||
fn namespace_key(&self, bytes: &NamespaceId) -> Result<NamespacePublicKey, SignatureError>;
|
||||
fn author_key(&self, bytes: &AuthorId) -> Result<AuthorPublicKey, SignatureError>;
|
||||
}
|
||||
```
|
||||
|
||||
The `MemPublicKeyStore` implementation uses `Arc<RwLock<HashMap<[u8; 32], VerifyingKey>>>` for thread-safe caching.
|
||||
|
||||
The `Store` itself implements `PublicKeyStore`, leveraging its redb tables for author storage and the in-memory cache for fast verification.
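
The caching idea can be sketched with the standard library alone. In this illustration `ExpandedKey` is a placeholder for an expanded verifying key, `expand` stands in for the costly decompression step, and the `expansions` counter exists only to show that the slow path runs once per id; none of these names come from the crate.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Placeholder for an expanded (decompressed) verifying key.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExpandedKey([u8; 32]);

/// Cheaply cloneable cache shared between threads.
#[derive(Default, Clone)]
pub struct MemKeyCache {
    keys: Arc<RwLock<HashMap<[u8; 32], ExpandedKey>>>,
    expansions: Arc<AtomicUsize>,
}

impl MemKeyCache {
    pub fn public_key(&self, id: &[u8; 32]) -> ExpandedKey {
        // Fast path: shared read lock, released before the slow path runs.
        let cached = self.keys.read().unwrap().get(id).copied();
        if let Some(key) = cached {
            return key;
        }
        // Slow path: expand once, then remember the result.
        let key = self.expand(id);
        self.keys.write().unwrap().insert(*id, key);
        key
    }

    /// Stand-in for curve point decompression.
    fn expand(&self, id: &[u8; 32]) -> ExpandedKey {
        self.expansions.fetch_add(1, Ordering::Relaxed);
        ExpandedKey(*id)
    }

    /// Number of times the slow path ran.
    pub fn expansions(&self) -> usize {
        self.expansions.load(Ordering::Relaxed)
    }
}
```

Two threads racing on the same uncached id may both expand it in this sketch; that costs a little duplicate work but stays correct, since both produce the same value.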
|
||||
|
||||
## StoreInstance
|
||||
|
||||
```rust
|
||||
pub struct StoreInstance<'a> {
|
||||
namespace: NamespaceId,
|
||||
store: &'a mut Store,
|
||||
}
|
||||
```
|
||||
|
||||
A `StoreInstance` bundles a namespace ID with a mutable reference to the store, providing the `ranger::Store<SignedEntry>` implementation for the sync algorithm. This is what `Replica` uses internally to perform sync operations.
|
||||
|
||||
## Replica
|
||||
|
||||
```rust
|
||||
pub struct Replica<'a, I = Box<ReplicaInfo>> {
|
||||
store: StoreInstance<'a>,
|
||||
info: I,
|
||||
}
|
||||
```
|
||||
|
||||
`Replica` is the primary user-facing type for document operations. It combines:
|
||||
- A `StoreInstance` for data access
|
||||
- `ReplicaInfo` for metadata (capability, subscribers, content status callback)
|
||||
|
||||
Key methods:
|
||||
- `insert(key, author, hash, len)` — Insert a new entry
|
||||
- `delete_prefix(prefix, author)` — Delete entries by key prefix
|
||||
- `insert_remote_entry(entry, from, content_status)` — Insert from sync
|
||||
- `hash_and_insert(key, author, data)` — Hash data and insert
|
||||
- `sync_initial_message()` / `sync_process_message()` — Sync protocol operations
|
||||
@@ -0,0 +1,343 @@
|
||||
# iroh-docs: Engine and Live Sync
|
||||
|
||||
## Overview
|
||||
|
||||
The `Engine` is the top-level coordinator for live document synchronization. It brings together:
|
||||
|
||||
1. **SyncHandle/Actor** — Single-threaded actor for all store and replica operations
|
||||
2. **LiveActor** — Async event loop coordinating sync, gossip, and content downloads
|
||||
3. **GossipState** — Integration with `iroh-gossip` for broadcasting updates
|
||||
4. **Blobs/Downloader** — Integration with `iroh-blobs` for content transfer
|
||||
|
||||
## Engine
|
||||
|
||||
```rust
|
||||
pub struct Engine {
|
||||
pub endpoint: Endpoint,
|
||||
pub sync: SyncHandle,
|
||||
pub default_author: DefaultAuthor,
|
||||
to_live_actor: mpsc::Sender<ToLiveActor>,
|
||||
actor_handle: AbortOnDropHandle<()>,
|
||||
content_status_cb: ContentStatusCallback,
|
||||
blob_store: iroh_blobs::api::Store,
|
||||
_gc_protect_task: AbortOnDropHandle<()>,
|
||||
}
|
||||
```
|
||||
|
||||
### Initialization
|
||||
|
||||
```rust
|
||||
Engine::spawn(
|
||||
endpoint, // iroh Endpoint for QUIC connections
|
||||
gossip, // iroh-gossip instance
|
||||
replica_store, // Store for document data
|
||||
bao_store, // iroh-blobs Store for content blobs
|
||||
downloader, // Downloader for fetching blobs
|
||||
default_author_storage, // Where to persist the default author
|
||||
protect_cb, // Optional GC protection callback
|
||||
) -> Result<Self>
|
||||
```
|
||||
|
||||
During spawn:
|
||||
1. A `ContentStatusCallback` is created that checks blob availability in `iroh-blobs`
|
||||
2. A `SyncHandle` actor is spawned on a dedicated thread
|
||||
3. A `LiveActor` is spawned as a tokio task
|
||||
4. The default author is loaded or created
|
||||
5. A GC protection task is started (if callback provided)
|
||||
|
||||
### Key Engine Methods
|
||||
|
||||
```rust
|
||||
// Start syncing a document with given peers
|
||||
async fn start_sync(&self, namespace: NamespaceId, peers: Vec<EndpointAddr>) -> Result<()>
|
||||
|
||||
// Stop syncing and leave gossip swarm
|
||||
async fn leave(&self, namespace: NamespaceId, kill_subscribers: bool) -> Result<()>
|
||||
|
||||
// Subscribe to document events
|
||||
async fn subscribe(&self, namespace: NamespaceId) -> Result<impl Stream<Item = Result<LiveEvent>>>
|
||||
|
||||
// Handle incoming QUIC connections
|
||||
async fn handle_connection(&self, conn: Connection) -> Result<()>
|
||||
|
||||
// Shutdown the engine
|
||||
async fn shutdown(&self) -> Result<()>
|
||||
```
|
||||
|
||||
### GC Protection
|
||||
|
||||
The `ProtectCallbackHandler` bridges iroh-docs with iroh-blobs' garbage collection:
|
||||
|
||||
```rust
|
||||
let (handler, protect_cb) = ProtectCallbackHandler::new();
|
||||
// protect_cb goes into iroh-blobs GC config
|
||||
// handler goes into Engine::spawn
|
||||
```
|
||||
|
||||
When iroh-blobs runs GC, it calls `protect_cb`, which queries the docs store for all content hashes so that blobs referenced by document entries are not garbage-collected.
|
||||
|
||||
## SyncHandle / Actor
|
||||
|
||||
The `SyncHandle` is a handle to a single-threaded actor that processes all store and replica operations sequentially:
|
||||
|
||||
```rust
|
||||
pub struct SyncHandle {
|
||||
tx: async_channel::Sender<Action>,
|
||||
join_handle: Arc<Option<std::thread::JoinHandle<()>>>,
|
||||
metrics: Arc<Metrics>,
|
||||
}
|
||||
```
|
||||
|
||||
### Actor Architecture
|
||||
|
||||
```
|
||||
External Code ──async──▶ SyncHandle ──channel──▶ Actor Thread
|
||||
│
|
||||
Store (redb)
|
||||
Replica operations
|
||||
Flush on timeout (500ms)
|
||||
```
|
||||
|
||||
The actor runs on a **dedicated OS thread** (not a tokio task), using `tokio::runtime::Builder::new_current_thread()` internally. This ensures store operations are never concurrent.
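
The pattern of one thread owning the store and answering requests over a channel can be sketched with plain `std` primitives. This is an illustration of the idea only: the real actor uses `async_channel` and a current-thread tokio runtime, and the `Action` variants, `spawn_actor`, and the `HashMap` store below are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Much-reduced action set; requests carry a reply channel.
pub enum Action {
    Insert { key: String, value: u64, reply: Sender<()> },
    Get { key: String, reply: Sender<Option<u64>> },
    Shutdown,
}

/// Spawn the actor on its own OS thread. The join handle yields the
/// number of idle timeouts, i.e. the points where a flush would happen.
pub fn spawn_actor() -> (Sender<Action>, JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel::<Action>();
    let join = thread::spawn(move || {
        // The store lives on this thread only, so access is never concurrent.
        let mut store: HashMap<String, u64> = HashMap::new();
        let mut flushes = 0usize;
        loop {
            match rx.recv_timeout(Duration::from_millis(500)) {
                Ok(Action::Insert { key, value, reply }) => {
                    store.insert(key, value);
                    let _ = reply.send(());
                }
                Ok(Action::Get { key, reply }) => {
                    let _ = reply.send(store.get(&key).copied());
                }
                Ok(Action::Shutdown) | Err(RecvTimeoutError::Disconnected) => break,
                // Idle for 500 ms: a real store would flush pending writes here.
                Err(RecvTimeoutError::Timeout) => flushes += 1,
            }
        }
        flushes
    });
    (tx, join)
}
```

Because every action is processed in arrival order on one thread, callers get sequential consistency without any locking around the store.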
|
||||
|
||||
### Action Types
|
||||
|
||||
```rust
|
||||
enum Action {
|
||||
ImportAuthor { author, reply },
|
||||
ExportAuthor { author, reply },
|
||||
DeleteAuthor { author, reply },
|
||||
ImportNamespace { capability, reply },
|
||||
ListAuthors { reply },
|
||||
ListReplicas { reply },
|
||||
ContentHashes { reply },
|
||||
FlushStore { reply },
|
||||
Replica(NamespaceId, ReplicaAction),
|
||||
Shutdown { reply },
|
||||
}
|
||||
|
||||
enum ReplicaAction {
|
||||
Open { reply, opts },
|
||||
Close { reply },
|
||||
GetState { reply },
|
||||
SetSync { sync, reply },
|
||||
Subscribe { sender, reply },
|
||||
Unsubscribe { sender, reply },
|
||||
InsertLocal { author, key, hash, len, reply },
|
||||
DeletePrefix { author, key, reply },
|
||||
InsertRemote { entry, from, content_status, reply },
|
||||
SyncInitialMessage { reply },
|
||||
SyncProcessMessage { message, from, state, reply },
|
||||
GetSyncPeers { reply },
|
||||
RegisterUsefulPeer { peer, reply },
|
||||
GetExact { author, key, include_empty, reply },
|
||||
GetMany { query, reply },
|
||||
DropReplica { reply },
|
||||
ExportSecretKey { reply },
|
||||
HasNewsForUs { heads, reply },
|
||||
SetDownloadPolicy { policy, reply },
|
||||
GetDownloadPolicy { reply },
|
||||
}
|
||||
```
|
||||
|
||||
### Replica Opening
|
||||
|
||||
When a replica is opened via the actor, an `OpenReplica` struct is created:
|
||||
|
||||
```rust
|
||||
struct OpenReplica {
|
||||
info: ReplicaInfo, // Capability, subscribers, content status callback
|
||||
sync: bool, // Whether to accept sync requests
|
||||
handles: usize, // Reference count for open handles
|
||||
}
|
||||
```
|
||||
|
||||
Multiple handles to the same replica are supported via reference counting.
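
A minimal sketch of the reference counting, assuming the replica is opened on the first handle and closed when the last one goes away. `OpenReplicas` and its methods are hypothetical names, and the namespace is reduced to a raw 32-byte id.

```rust
use std::collections::HashMap;

/// Tracks how many handles are open per namespace.
#[derive(Default)]
pub struct OpenReplicas {
    open: HashMap<[u8; 32], usize>,
}

impl OpenReplicas {
    /// Register a handle. Returns true if this call opened the replica.
    pub fn open(&mut self, ns: [u8; 32]) -> bool {
        let handles = self.open.entry(ns).or_insert(0);
        *handles += 1;
        *handles == 1
    }

    /// Release a handle. Returns true if this was the last one,
    /// meaning the replica is now closed.
    pub fn close(&mut self, ns: &[u8; 32]) -> bool {
        let Some(handles) = self.open.get_mut(ns) else {
            return false; // not open at all
        };
        *handles -= 1;
        if *handles == 0 {
            self.open.remove(ns);
            true
        } else {
            false
        }
    }
}
```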
|
||||
|
||||
## LiveActor
|
||||
|
||||
The `LiveActor` is the central async coordinator:
|
||||
|
||||
```rust
|
||||
pub struct LiveActor {
|
||||
inbox: mpsc::Receiver<ToLiveActor>,
|
||||
sync: SyncHandle,
|
||||
endpoint: Endpoint,
|
||||
bao_store: Store,
|
||||
downloader: Downloader,
|
||||
memory_lookup: MemoryLookup,
|
||||
replica_events_tx: async_channel::Sender<Event>,
|
||||
replica_events_rx: async_channel::Receiver<Event>,
|
||||
sync_actor_tx: mpsc::Sender<ToLiveActor>,
|
||||
gossip: GossipState,
|
||||
running_sync_connect: JoinSet<SyncConnectRes>,
|
||||
running_sync_accept: JoinSet<SyncAcceptRes>,
|
||||
download_tasks: JoinSet<DownloadRes>,
|
||||
missing_hashes: HashSet<Hash>,
|
||||
queued_hashes: QueuedHashes,
|
||||
hash_providers: ProviderNodes,
|
||||
subscribers: SubscribersMap,
|
||||
state: NamespaceStates,
|
||||
metrics: Arc<Metrics>,
|
||||
}
|
||||
```
|
||||
|
||||
### Event Loop
|
||||
|
||||
The `LiveActor::run_inner()` loop uses `tokio::select!` with biased polling:
|
||||
|
||||
```rust
|
||||
tokio::select! {
|
||||
biased;
|
||||
msg = self.inbox.recv() => { /* handle actor messages */ }
|
||||
event = self.replica_events_rx.recv() => { /* handle replica insert events */ }
|
||||
res = self.running_sync_connect.join_next() => { /* sync connect finished */ }
|
||||
res = self.running_sync_accept.join_next() => { /* sync accept finished */ }
|
||||
res = self.download_tasks.join_next() => { /* download completed */ }
|
||||
res = self.gossip.progress() => { /* gossip task progress */ }
|
||||
}
|
||||
```
|
||||
|
||||
### ToLiveActor Messages
|
||||
|
||||
```rust
|
||||
pub enum ToLiveActor {
|
||||
StartSync { namespace, peers, reply },
|
||||
Leave { namespace, kill_subscribers, reply },
|
||||
Shutdown { reply },
|
||||
Subscribe { namespace, sender, reply },
|
||||
HandleConnection { conn },
|
||||
AcceptSyncRequest { namespace, peer, reply },
|
||||
IncomingSyncReport { from, report },
|
||||
NeighborContentReady { namespace, node, hash },
|
||||
NeighborUp { namespace, peer },
|
||||
NeighborDown { namespace, peer },
|
||||
}
|
||||
```
|
||||
|
||||
### Gossip Operations (Op)
|
||||
|
||||
```rust
|
||||
pub enum Op {
|
||||
Put(SignedEntry), // New entry inserted
|
||||
ContentReady(Hash), // Content blob now available
|
||||
SyncReport(SyncReport), // Heads summary after sync
|
||||
}
|
||||
```
|
||||
|
||||
Gossip broadcasts `Op` messages to all swarm participants. When a `Put` is received, the entry is inserted into the local replica. When a `ContentReady` is received, peers know they can download the blob. When a `SyncReport` is received, peers check `has_news_for_us()` to decide if they should sync.
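
The `has_news_for_us()` check on a received `SyncReport` can be pictured as comparing per-author heads. The sketch below assumes heads are a map from author id to the latest timestamp seen; the `Heads` alias and the free function are simplifications for illustration, not the store's actual signature.

```rust
use std::collections::BTreeMap;
use std::num::NonZeroU64;

/// Simplified heads: author id -> latest timestamp seen for that author.
pub type Heads = BTreeMap<[u8; 32], u64>;

/// Count the authors for which the remote heads are ahead of ours.
/// `None` means the report contains nothing new, so no sync is needed.
pub fn has_news_for_us(ours: &Heads, theirs: &Heads) -> Option<NonZeroU64> {
    let news = theirs
        .iter()
        .filter(|(author, ts)| match ours.get(*author) {
            // Known author: news only if their timestamp is newer.
            Some(our_ts) => *ts > our_ts,
            // Unknown author: always news.
            None => true,
        })
        .count() as u64;
    NonZeroU64::new(news)
}
```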
|
||||
|
||||
### Content Download Flow
|
||||
|
||||
1. When a `RemoteInsert` event occurs with `should_download: true`, the entry's content hash is queued for download
|
||||
2. The `LiveActor` uses `iroh_blobs::downloader::Downloader` to fetch the blob
|
||||
3. Known providers (peers who had `ContentStatus::Complete`) are used as download sources
|
||||
4. On download completion, a `LiveEvent::ContentReady` event is emitted
|
||||
|
||||
### LiveEvent (Public API)
|
||||
|
||||
```rust
|
||||
pub enum LiveEvent {
|
||||
InsertLocal { entry: Entry },
|
||||
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
|
||||
ContentReady { hash: Hash },
|
||||
PendingContentReady,
|
||||
NeighborUp(PublicKey),
|
||||
NeighborDown(PublicKey),
|
||||
SyncFinished(SyncEvent),
|
||||
}
|
||||
```
|
||||
|
||||
`SyncEvent` wraps `SyncFinished`:
|
||||
|
||||
```rust
|
||||
pub struct SyncFinished {
|
||||
pub namespace: NamespaceId,
|
||||
pub peer: PublicKey,
|
||||
pub outcome: SyncOutcome,
|
||||
pub timings: Timings,
|
||||
}
|
||||
```
|
||||
|
||||
## NamespaceStates
|
||||
|
||||
```rust
|
||||
pub struct NamespaceStates(BTreeMap<NamespaceId, NamespaceState>);
|
||||
|
||||
struct NamespaceState {
|
||||
nodes: BTreeMap<EndpointId, PeerState>,
|
||||
may_emit_ready: bool,
|
||||
}
|
||||
```
|
||||
|
||||
Each peer has a `PeerState` tracking sync progress:
|
||||
|
||||
```rust
|
||||
struct PeerState {
|
||||
state: SyncState, // Idle or Running
|
||||
resync_requested: bool, // Whether a resync was requested during active sync
|
||||
last_sync: Option<(Instant, Result<SyncFinished>)>,
|
||||
}
|
||||
```
|
||||
|
||||
This state machine prevents concurrent syncs with the same peer for the same namespace and queues resync requests when needed.
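
A reduced version of that state machine, keeping only the two fields that drive the behaviour, might look like the sketch below. The method names `request_sync` and `finish` are assumptions made for this example, and `last_sync` is omitted.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    #[default]
    Idle,
    Running,
}

#[derive(Debug, Default)]
pub struct PeerState {
    state: SyncState,
    resync_requested: bool,
}

impl PeerState {
    /// Ask for a sync. Returns true if the caller should start one now;
    /// if one is already running, the request is remembered instead.
    pub fn request_sync(&mut self) -> bool {
        match self.state {
            SyncState::Idle => {
                self.state = SyncState::Running;
                true
            }
            SyncState::Running => {
                self.resync_requested = true;
                false
            }
        }
    }

    /// Mark the running sync as finished. Returns true if a resync was
    /// queued in the meantime and should be started by the caller.
    pub fn finish(&mut self) -> bool {
        self.state = SyncState::Idle;
        std::mem::take(&mut self.resync_requested)
    }
}
```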
|
||||
|
||||
## DefaultAuthor
|
||||
|
||||
```rust
|
||||
pub struct DefaultAuthor {
|
||||
value: RwLock<AuthorId>,
|
||||
storage: DefaultAuthorStorage,
|
||||
}
|
||||
```
|
||||
|
||||
- `DefaultAuthorStorage::Mem` — Ephemeral, creates a new author each time
|
||||
- `DefaultAuthorStorage::Persistent(path)` — Stores the author ID as hex in a file, loads it on startup
|
||||
|
||||
The default author provides a convenient "current user" identity for applications.
|
||||
|
||||
## Docs Protocol Handler
|
||||
|
||||
```rust
|
||||
pub struct Docs {
|
||||
engine: Arc<Engine>,
|
||||
api: DocsApi,
|
||||
}
|
||||
```
|
||||
|
||||
`Docs` implements `ProtocolHandler` for integration with iroh's `Router`:
|
||||
|
||||
```rust
|
||||
impl ProtocolHandler for Docs {
|
||||
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> { ... }
|
||||
async fn shutdown(&self) { ... }
|
||||
}
|
||||
```
|
||||
|
||||
The `Builder` pattern configures storage:
|
||||
|
||||
```rust
|
||||
let docs = Docs::memory()
|
||||
.spawn(endpoint, blobs, gossip)
|
||||
.await?;
|
||||
// or
|
||||
let docs = Docs::persistent(path)
|
||||
.protect_handler(handler)
|
||||
.spawn(endpoint, blobs, gossip)
|
||||
.await?;
|
||||
```
|
||||
|
||||
## DocTicket
|
||||
|
||||
```rust
|
||||
pub struct DocTicket {
|
||||
pub capability: Capability,
|
||||
pub nodes: Vec<EndpointAddr>,
|
||||
}
|
||||
```
|
||||
|
||||
A `DocTicket` encapsulates everything needed to join a document:
|
||||
- A `Capability` (Read or Write) — provides the namespace key
|
||||
- A list of `EndpointAddr` — bootstrap peers to connect to
|
||||
|
||||
Tickets are serialized as base32-encoded postcard data with a `"doc"` prefix, using the `iroh_tickets::Ticket` trait.
|
||||
189
docs/research/references/iroh/iroh-docs/06-network-protocol.md
Normal file
@@ -0,0 +1,189 @@
|
||||
# iroh-docs: Network Protocol and Wire Format
|
||||
|
||||
## ALPN
|
||||
|
||||
The docs protocol uses ALPN `/iroh-sync/1` for QUIC connection identification.
|
||||
|
||||
```rust
|
||||
pub const ALPN: &[u8] = b"/iroh-sync/1";
|
||||
```
|
||||
|
||||
## Connection Flow
|
||||
|
||||
### Outgoing Sync (Alice — Initiator)
|
||||
|
||||
```rust
|
||||
pub async fn connect_and_sync(
|
||||
endpoint: &Endpoint,
|
||||
sync: &SyncHandle,
|
||||
namespace: NamespaceId,
|
||||
peer: EndpointAddr,
|
||||
metrics: Option<&Metrics>,
|
||||
) -> Result<SyncFinished, ConnectError>
|
||||
```
|
||||
|
||||
1. Open a QUIC connection to the peer with ALPN `/iroh-sync/1`
|
||||
2. Open a bidirectional QUIC stream
|
||||
3. Run the Alice (initiator) protocol via `run_alice()`
|
||||
4. Close the stream and return `SyncFinished`
|
||||
|
||||
### Incoming Sync (Bob — Responder)
|
||||
|
||||
```rust
|
||||
pub async fn handle_connection<F, Fut>(
|
||||
sync: SyncHandle,
|
||||
connection: Connection,
|
||||
accept_cb: F,
|
||||
metrics: Option<&Metrics>,
|
||||
) -> Result<SyncFinished, AcceptError>
|
||||
```
|
||||
|
||||
1. Accept a bidirectional QUIC stream from the connection
|
||||
2. Run the Bob (responder) protocol via `BobState::run()`
|
||||
3. The `accept_cb` determines whether to accept or reject each namespace
|
||||
4. Close the stream and return `SyncFinished`
|
||||
|
||||
## Wire Format
|
||||
|
||||
### Frame Codec
|
||||
|
||||
All messages are length-prefixed:
|
||||
|
||||
```
|
||||
┌──────────────────────┬──────────────────────────────┐
|
||||
│ u32 big-endian len │ postcard-serialized Message │
|
||||
└──────────────────────┴──────────────────────────────┘
|
||||
```
|
||||
|
||||
Maximum message size: 1 GiB.
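
The framing itself is simple enough to sketch without the codec traits. The functions below operate on plain byte vectors and treat the payload as opaque; `encode_frame`, `decode_frame`, and `FrameError` are hypothetical names, and the real `SyncCodec` works on tokio buffers with postcard-serialized messages as payload.

```rust
/// Upper bound on a single message, as stated above (1 GiB).
pub const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    TooLarge(usize),
}

/// Append one frame (u32 big-endian length, then payload) to `dst`.
pub fn encode_frame(payload: &[u8], dst: &mut Vec<u8>) -> Result<(), FrameError> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(FrameError::TooLarge(payload.len()));
    }
    dst.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    dst.extend_from_slice(payload);
    Ok(())
}

/// Try to split one frame off the front of `src`.
/// `Ok(None)` means more bytes are needed before a frame is complete.
pub fn decode_frame(src: &mut Vec<u8>) -> Result<Option<Vec<u8>>, FrameError> {
    if src.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([src[0], src[1], src[2], src[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(FrameError::TooLarge(len));
    }
    if src.len() < 4 + len {
        return Ok(None);
    }
    let payload = src[4..4 + len].to_vec();
    src.drain(..4 + len);
    Ok(Some(payload))
}
```

Checking the declared length before waiting for the payload lets a receiver reject an oversized frame after reading only four bytes.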
|
||||
|
||||
### Message Types
|
||||
|
||||
```rust
|
||||
enum Message {
|
||||
Init {
|
||||
namespace: NamespaceId, // Which document to sync
|
||||
message: ProtocolMessage, // Initial sync message (ranger::Message<SignedEntry>)
|
||||
},
|
||||
Sync(ProtocolMessage), // Subsequent sync round-trip messages
|
||||
Abort { reason: AbortReason }, // Responder rejects the request
|
||||
}
|
||||
```
|
||||
|
||||
### Serialization
|
||||
|
||||
Messages use `postcard` (a compact `serde` format optimized for embedded/no-std use). The `SyncCodec` implements `tokio_util::codec::Encoder` and `Decoder` for async stream framing.
|
||||
|
||||
## Protocol Sequence
|
||||
|
||||
```
|
||||
Alice (Initiator) Bob (Responder)
|
||||
│ │
|
||||
│──── Init { namespace, initial_msg } ───────▶│
|
||||
│ │
|
||||
│◀─── Sync(reply_msg) ────────────────────── │ (or Abort)
|
||||
│ │
|
||||
│──── Sync(next_msg) ──────────────────────▶│
|
||||
│ │
|
||||
│◀─── Sync(reply_msg) ────────────────────── │
|
||||
│ │
|
||||
│──── Sync(next_msg) ──────────────────────▶│
|
||||
│ │
|
||||
│ ... until convergence ... │
|
||||
│ │
|
||||
│──── (stream closed) ─────────────────────▶│
|
||||
│ │
|
||||
```
|
||||
|
||||
The protocol terminates when one side has no more messages to send, which means the two sets have converged. Each `Sync` message carries a `ProtocolMessage`, which is a `ranger::Message<SignedEntry>` made up of `MessagePart`s (each either a `RangeFingerprint` or a `RangeItem`).
|
||||
|
||||
## SyncFinished Result
|
||||
|
||||
```rust
|
||||
pub struct SyncFinished {
|
||||
pub namespace: NamespaceId,
|
||||
pub peer: PublicKey,
|
||||
pub outcome: SyncOutcome, // heads_received, num_recv, num_sent
|
||||
pub timings: Timings, // connect duration, process duration
|
||||
}
|
||||
```
|
||||
|
||||
## Error Types
|
||||
|
||||
### ConnectError
|
||||
|
||||
```rust
|
||||
pub enum ConnectError {
|
||||
Connect { error: anyhow::Error }, // Connection failed
|
||||
RemoteAbort(AbortReason), // Remote rejected our request
|
||||
Sync { error: anyhow::Error }, // Sync protocol error
|
||||
Close { error: anyhow::Error }, // Stream close error
|
||||
}
|
||||
```
|
||||
|
||||
### AcceptError
|
||||
|
||||
```rust
|
||||
pub enum AcceptError {
|
||||
Connect { error: anyhow::Error }, // Connection failed
|
||||
Open { peer: PublicKey, error }, // Failed to open replica
|
||||
Abort { peer, namespace, reason }, // We aborted
|
||||
Sync { peer, namespace, error }, // Sync protocol error
|
||||
Close { peer, namespace, error }, // Stream close error
|
||||
}
|
||||
```
|
||||
|
||||
## Gossip Integration
|
||||
|
||||
The `GossipState` manages iroh-gossip subscriptions per namespace:
|
||||
|
||||
```rust
|
||||
pub struct GossipState {
|
||||
gossip: Gossip,
|
||||
sync: SyncHandle,
|
||||
to_live_actor: mpsc::Sender<ToLiveActor>,
|
||||
active: HashMap<NamespaceId, ActiveState>,
|
||||
active_tasks: JoinSet<(NamespaceId, Result<()>)>,
|
||||
}
|
||||
```
|
||||
|
||||
When a document starts syncing:
|
||||
1. The engine joins a gossip topic for that namespace
|
||||
2. `GossipState::join()` subscribes with bootstrap peers
|
||||
3. A receive loop task is spawned to process incoming gossip messages
|
||||
4. `Op` messages (Put, ContentReady, SyncReport) are deserialized and forwarded to `LiveActor`
|
||||
|
||||
When receiving an `Op::Put`:
|
||||
```rust
|
||||
// In the gossip receive loop: the payload is a postcard-encoded `Op`,
// so a `Put` already carries the `SignedEntry`.
let op: Op = postcard::from_bytes(&msg.content)?;
if let Op::Put(entry) = op {
    sync.insert_remote(namespace, entry, from, content_status).await?;
}
|
||||
```
|
||||
|
||||
When receiving an `Op::SyncReport`:
|
||||
```rust
|
||||
// Forward to LiveActor which checks has_news_for_us()
|
||||
to_live_actor.send(ToLiveActor::IncomingSyncReport { from, report }).await?;
|
||||
```
|
||||
|
||||
Broadcasting:
|
||||
```rust
|
||||
// When a local insert occurs:
|
||||
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::Put(entry))).await;
|
||||
|
||||
// When content becomes ready:
|
||||
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::ContentReady(hash))).await;
|
||||
```
|
||||
|
||||
## Sync Report Compression
|
||||
|
||||
`SyncReport` encodes `AuthorHeads` with an optional size limit:
|
||||
|
||||
```rust
|
||||
pub struct SyncReport {
|
||||
namespace: NamespaceId,
|
||||
heads: Vec<u8>, // postcard-encoded AuthorHeads with size limit
|
||||
}
|
||||
```
|
||||
|
||||
The size limit keeps gossip messages small: when the encoded heads would exceed it, the authors with the oldest timestamps are dropped first.
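
A sketch of the truncation, under a deliberately simplified encoding: each head is written as 32 bytes of author id followed by an 8-byte big-endian timestamp, so the number of heads that fit is easy to compute. The real encoding is postcard, whose variable-length integers make the sizes differ; `encode_heads` and `HEAD_SIZE` are names made up for this example.

```rust
/// Bytes per head in this simplified encoding:
/// 32-byte author id + 8-byte timestamp.
const HEAD_SIZE: usize = 32 + 8;

/// Encode heads, keeping only as many as fit in `size_limit` bytes.
pub fn encode_heads(mut heads: Vec<([u8; 32], u64)>, size_limit: Option<usize>) -> Vec<u8> {
    // Newest first, so truncation drops the oldest heads.
    heads.sort_by(|a, b| b.1.cmp(&a.1));
    if let Some(limit) = size_limit {
        heads.truncate(limit / HEAD_SIZE);
    }
    let mut out = Vec::with_capacity(heads.len() * HEAD_SIZE);
    for (author, ts) in &heads {
        out.extend_from_slice(author);
        out.extend_from_slice(&ts.to_be_bytes());
    }
    out
}
```

Dropping old heads is safe for the receiver's purposes: a missing author can only cause a missed hint, and a later report or a full sync still reconciles the difference.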
|
||||
188
docs/research/references/iroh/iroh-docs/07-api-and-data-flow.md
Normal file
@@ -0,0 +1,188 @@
|
||||
# iroh-docs: API and RPC
|
||||
|
||||
## DocsApi
|
||||
|
||||
The `DocsApi` provides an RPC-based interface to the docs engine, implemented via `irpc`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct DocsApi {
|
||||
inner: Client<DocsProtocol>,
|
||||
}
|
||||
```
|
||||
|
||||
### Methods (via irpc)
|
||||
|
||||
The API exposes document operations through an RPC protocol defined in `api/protocol.rs`:
|
||||
|
||||
| Method | Request | Response | Description |
|
||||
|--------|---------|----------|-------------|
|
||||
| `Open` | `OpenRequest { doc_id }` | `OpenResponse` | Open a document for operations |
|
||||
| `Close` | `CloseRequest { doc_id }` | `CloseResponse` | Close a document |
|
||||
| `Status` | `StatusRequest { doc_id }` | `StatusResponse { status: OpenState }` | Get document open state |
|
||||
| `List` | `ListRequest` | Stream of `ListResponse { id, capability }` | List all documents |
|
||||
| `Create` | `CreateRequest` | `CreateResponse { id }` | Create a new document |
|
||||
| `Drop` | `DropRequest { doc_id }` | `DropResponse` | Remove a document |
|
||||
| `Import` | `ImportRequest { capability }` | `ImportResponse { doc_id }` | Import a document by capability |
|
||||
| `Set` | `SetRequest { doc_id, author_id, key, value }` | `SetResponse { entry }` | Set a key-value pair |
|
||||
| `SetHash` | `SetHashRequest { doc_id, author_id, key, hash, size }` | `SetHashResponse` | Set a key with pre-hashed content |
|
||||
| `GetMany` | `GetManyRequest { doc_id, query }` | Stream of entries | Query entries |
|
||||
| `GetExact` | `GetExactRequest { doc_id, key, author, include_empty }` | `GetExactResponse { entry }` | Get single entry |
|
||||
| `Del` | `DelRequest { doc_id, author_id, key }` | `DelResponse { removed }` | Delete by key prefix |
|
||||
| `Subscribe` | `SubscribeRequest { doc_id }` | Stream of `LiveEvent` | Subscribe to document events |
|
||||
| `Share` | `ShareRequest { doc_id, mode, peers }` | `ShareResponse { ticket }` | Create a sharing ticket |
|
||||
| `StartSync` | `StartSyncRequest { doc_id, peers }` | `StartSyncResponse` | Start live sync |
|
||||
| `Leave` | `LeaveRequest { doc_id }` | `LeaveResponse` | Leave gossip swarm |
|
||||
| `ImportFile` | `ImportFileRequest { ... }` | Stream of `ImportProgress` | Import file content and set key |
|
||||
| `ExportFile` | `ExportFileRequest { ... }` | Stream of `ExportProgress` | Export content to file |
|
||||
| `AuthorList` | `AuthorListRequest` | Stream of `AuthorListResponse` | List authors |
|
||||
| `AuthorCreate` | `AuthorCreateRequest` | `AuthorCreateResponse { author_id }` | Create new author |
|
||||
| `AuthorImport` | `AuthorImportRequest { author }` | `AuthorImportResponse { author_id }` | Import author key |
|
||||
| `AuthorExport` | `AuthorExportRequest { author_id }` | `AuthorExportResponse { author }` | Export author key |
|
||||
| `AuthorDelete` | `AuthorDeleteRequest { author_id }` | `AuthorDeleteResponse` | Delete author |
|
||||
| `AuthorGetDefault` | `AuthorGetDefaultRequest` | `AuthorGetDefaultResponse { author_id }` | Get default author |
|
||||
| `AuthorSetDefault` | `AuthorSetDefaultRequest { author_id }` | `AuthorSetDefaultResponse` | Set default author |
|
||||
| `SetDownloadPolicy` | `SetDownloadPolicyRequest { doc_id, policy }` | `SetDownloadPolicyResponse` | Set download policy |
|
||||
| `GetDownloadPolicy` | `GetDownloadPolicyRequest { doc_id }` | `GetDownloadPolicyResponse { policy }` | Get download policy |
|
||||
| `GetSyncPeers` | `GetSyncPeersRequest { doc_id }` | `GetSyncPeersResponse { peers }` | Get known sync peers |
|
||||
|
||||
## RPC Implementation
|
||||
|
||||
The RPC is implemented via `irpc` (for local/remote procedure calls) and `noq` (for remote network access):
|
||||
|
||||
### Local API
|
||||
|
||||
`DocsApi::spawn(engine)` creates an `RpcActor` that processes requests against the engine directly:
|
||||
|
||||
```rust
|
||||
impl DocsApi {
|
||||
pub fn spawn(engine: Arc<Engine>) -> Self {
|
||||
RpcActor::spawn(engine)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Remote API
|
||||
|
||||
When the `rpc` feature is enabled, `DocsApi::connect(endpoint, addr)` creates a remote client that sends requests over the network via `noq`.
|
||||
|
||||
### Protocol Dispatch
|
||||
|
||||
```rust
|
||||
// Sketch of what the irpc::rpc::Handler<DocsProtocol> dispatches:
match msg {
    DocsProtocol::Open(msg) => local.send((msg, tx)).await,
    DocsProtocol::Set(msg) => local.send((msg, tx)).await,
    // ... etc
}
|
||||
```
|
||||
|
||||
## RpcActor
|
||||
|
||||
The `RpcActor` (in `api/actor.rs`) bridges the RPC protocol to the `Engine`:
|
||||
|
||||
```rust
|
||||
struct RpcActor {
|
||||
engine: Arc<Engine>,
|
||||
}
|
||||
```
|
||||
|
||||
It handles each request type by calling the corresponding `Engine`/`SyncHandle` method and returning the result through the RPC channel.
|
||||
|
||||
For streaming responses (like `GetMany`, `Subscribe`, `AuthorList`), the actor sends results through an `mpsc` channel that the RPC framework streams back to the client.
|
||||
|
||||
## Share Mode and Tickets
|
||||
|
||||
When sharing a document:
|
||||
|
||||
```rust
|
||||
pub enum ShareMode {
|
||||
Read, // Share with read-only capability
|
||||
Write, // Share with full write capability
|
||||
}
|
||||
```
|
||||
|
||||
The `Share` RPC method:
|
||||
1. Gets or creates the namespace capability
|
||||
2. Creates a `DocTicket` with the capability and provided peer addresses
|
||||
3. Starts sync with the provided peers
|
||||
4. Returns the ticket for distribution
|
||||
|
||||
## Example: Basic Setup
|
||||
|
||||
```rust
|
||||
use iroh::{endpoint::presets, protocol::Router, Endpoint};
|
||||
use iroh_blobs::{BlobsProtocol, store::mem::MemStore, ALPN as BLOBS_ALPN};
|
||||
use iroh_docs::{protocol::Docs, ALPN as DOCS_ALPN};
|
||||
use iroh_gossip::{net::Gossip, ALPN as GOSSIP_ALPN};
|
||||
|
||||
#[tokio::main]
|
||||
async fn main() -> anyhow::Result<()> {
|
||||
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||
let blobs = MemStore::default();
|
||||
let gossip = Gossip::builder().spawn(endpoint.clone());
|
||||
let docs = Docs::memory()
|
||||
.spawn(endpoint.clone(), (*blobs).clone(), gossip.clone())
|
||||
.await?;
|
||||
|
||||
let router = Router::builder(endpoint.clone())
|
||||
.accept(BLOBS_ALPN, BlobsProtocol::new(&blobs, None))
|
||||
.accept(GOSSIP_ALPN, gossip)
|
||||
.accept(DOCS_ALPN, docs)
|
||||
.spawn();
|
||||
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
## Data Flow Summary
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Application / RPC │
|
||||
│ DocsApi ──irpc──▶ RpcActor ──▶ Engine / SyncHandle │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Live Sync (per document) │
|
||||
│ │
|
||||
│ LiveActor event loop: │
|
||||
│ ┌────────────────┐ ┌─────────────────┐ ┌──────────────────┐ │
|
||||
│ │ Actor Messages │ │ Replica Events │ │ Gossip Events │ │
|
||||
│ │ (StartSync, │ │ (LocalInsert, │ │ (Put, │ │
|
||||
│ │ Subscribe, │ │ RemoteInsert) │ │ ContentReady, │ │
|
||||
│ │ Leave, ...) │ │ │ │ SyncReport) │ │
|
||||
│ └──────┬─────────┘ └───────┬────────┘ └──────┬──────────┘ │
|
||||
│ │ │ │ │
|
||||
│ ▼ ▼ ▼ │
|
||||
│ ┌──────────────────────────────────────────────────────────┐ │
|
||||
│ │ LiveActor::run_inner() │ │
|
||||
│ │ tokio::select! { ... } │ │
|
||||
│ │ │ │
|
||||
│ │ - Start/stop gossip subscriptions │ │
|
||||
│ │ - Initiate outgoing syncs (connect_and_sync) │ │
|
||||
│ │ - Accept incoming syncs (handle_connection) │ │
|
||||
│ │ - Queue content downloads │ │
|
||||
│ │ - Broadcast local inserts via gossip │ │
|
||||
│ │ - Emit LiveEvent to subscribers │ │
|
||||
│ └──────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Running Tasks: │
|
||||
│ ┌───────────────────┐ ┌───────────────────┐ │
|
||||
│ │ sync_connect tasks│ │ sync_accept tasks │ │
|
||||
│ └───────────────────┘ └───────────────────┘ │
|
||||
│ ┌───────────────────┐ ┌───────────────────┐ │
|
||||
│ │ download tasks │ │ gossip receive loop│ │
|
||||
│ └───────────────────┘ └───────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Sync Actor (dedicated thread) │
|
||||
│ │
|
||||
│ ┌────────────┐ ┌─────────────────────────────────────────┐ │
|
||||
│ │ Action │ │ Replica Operations: │ │
|
||||
│ │ Channel │──▶│ Insert, Delete, Get, Query, │ │
|
||||
│ │ (bounded) │ │ SyncInit, SyncProcess, Open, Close, ...│ │
|
||||
│ └────────────┘ └─────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Store (redb) ──▶ All reads/writes on this thread │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,318 @@
|
||||
# iroh-docs: Key Types Reference
|
||||
|
||||
## Cryptographic Keys
|
||||
|
||||
### NamespaceSecret
|
||||
|
||||
```rust
|
||||
pub struct NamespaceSecret {
|
||||
signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
|
||||
}
|
||||
```
|
||||
|
||||
- The write capability for a document
|
||||
- Can sign entries (namespace signature)
|
||||
- Derives `NamespacePublicKey` and `NamespaceId`
|
||||
- Serialized as 32 bytes
|
||||
|
||||
### NamespacePublicKey
|
||||
|
||||
```rust
|
||||
pub struct NamespacePublicKey(VerifyingKey); // ed25519_dalek::VerifyingKey
|
||||
```
|
||||
|
||||
- The verifying key corresponding to `NamespaceSecret`
|
||||
- Can verify namespace signatures on entries
|
||||
- Serialized as 32 bytes
|
||||
|
||||
### NamespaceId
|
||||
|
||||
```rust
|
||||
pub struct NamespaceId([u8; 32]);
|
||||
```
|
||||
|
||||
- The byte representation of `NamespacePublicKey`
|
||||
- Serves as the unique identifier for a document
|
||||
- Can be converted back to `NamespacePublicKey` via `PublicKeyStore` (handles invalid curve points)
|
||||
|
||||
### Author
|
||||
|
||||
```rust
|
||||
pub struct Author {
|
||||
signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
|
||||
}
|
||||
```
|
||||
|
||||
- A writer identity within a document
|
||||
- Can sign entries (author signature)
|
||||
- Derives `AuthorPublicKey` and `AuthorId`
|
||||
- Created randomly with `Author::new(&mut rng)`
|
||||
- Stored persistently in the redb authors table
|
||||
|
||||
### AuthorPublicKey
|
||||
|
||||
```rust
|
||||
pub struct AuthorPublicKey(VerifyingKey);
|
||||
```
|
||||
|
||||
- The verifying key corresponding to an `Author`
|
||||
- Can verify author signatures on entries
|
||||
- Serialized as 32 bytes
|
||||
|
||||
### AuthorId
|
||||
|
||||
```rust
|
||||
pub struct AuthorId([u8; 32]);
|
||||
```
|
||||
|
||||
- Byte representation of `AuthorPublicKey`
|
||||
- Used as a component of `RecordIdentifier`
|
||||
- Has `fmt_short()` for human-readable display (first 10 hex chars)
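As a rough illustration of the short display form, the sketch below hex-encodes the first 5 bytes of the identifier, which yields 10 hex characters. The struct here is a simplified stand-in, and the exact formatting in the crate may differ.

```rust
// Simplified stand-in for the crate's AuthorId.
struct AuthorId([u8; 32]);

impl AuthorId {
    // Hex-encode the first 5 bytes, giving 10 hex characters.
    fn fmt_short(&self) -> String {
        self.0[..5].iter().map(|b| format!("{b:02x}")).collect()
    }
}

fn main() {
    let id = AuthorId([0xab; 32]);
    println!("{}", id.fmt_short());
}
```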
|
||||
|
||||
## Entry Types
|
||||
|
||||
### RecordIdentifier
|
||||
|
||||
```rust
|
||||
pub struct RecordIdentifier(Bytes);
|
||||
// Layout: [NamespaceId(32) | AuthorId(32) | Key(variable)]
|
||||
```
|
||||
|
||||
- The composite key for an entry
|
||||
- Byte layout: 32 bytes namespace + 32 bytes author + variable-length key
|
||||
- Ordering: namespace → author → key (lexicographic)
|
||||
- This ordering is critical for the range-based sync algorithm
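The layout and its ordering consequence can be sketched as follows. `record_id` and `split_id` are hypothetical helpers over plain byte vectors, not the crate's API; they only illustrate why comparing the raw bytes orders entries by namespace, then author, then key.

```rust
// Build identifier bytes: 32-byte namespace, 32-byte author, then the key.
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(64 + key.len());
    out.extend_from_slice(namespace);
    out.extend_from_slice(author);
    out.extend_from_slice(key);
    out
}

// Split an identifier into (namespace, author, key); None if under 64 bytes.
fn split_id(id: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    if id.len() < 64 {
        return None;
    }
    Some((&id[..32], &id[32..64], &id[64..]))
}

fn main() {
    let ns = [1u8; 32];
    let a = record_id(&ns, &[2u8; 32], b"zzz");
    let b = record_id(&ns, &[3u8; 32], b"aaa");
    // Author bytes precede the key, so they dominate the byte comparison.
    assert!(a < b);
    let (_, _, key) = split_id(&a).unwrap();
    assert_eq!(key, &b"zzz"[..]);
}
```

Since all entries of one namespace and author are contiguous under this ordering, a range of identifiers maps to a contiguous slice of the sorted store.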
|
||||
|
||||
### Record
|
||||
|
||||
```rust
|
||||
pub struct Record {
|
||||
len: u64, // Byte length of content
|
||||
hash: Hash, // BLAKE3 hash of content (32 bytes)
|
||||
timestamp: u64, // Microseconds since Unix epoch
|
||||
}
|
||||
```
|
||||
|
||||
- The value portion of an entry
|
||||
- Ordering: timestamp first, then hash (Last-Writer-Wins)
|
||||
- `Record::empty(timestamp)` creates a tombstone (hash=EMPTY, len=0)
|
||||
- `Record::new_current(hash, len)` uses current system time
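A minimal sketch of the last-writer-wins comparison, assuming only the ordering described above (timestamp first, then hash). `RecordKey` and `lww` are hypothetical names; the real `Record` also carries `len`.

```rust
// Field order matters: the derived Ord compares timestamp first, then hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct RecordKey {
    timestamp: u64,  // microseconds since Unix epoch
    hash: [u8; 32],  // stand-in for a BLAKE3 hash
}

// Keep whichever record compares greater.
fn lww(a: RecordKey, b: RecordKey) -> RecordKey {
    a.max(b)
}

fn main() {
    let old = RecordKey { timestamp: 10, hash: [9; 32] };
    let new = RecordKey { timestamp: 11, hash: [0; 32] };
    // A later timestamp wins regardless of the hash.
    assert_eq!(lww(old, new), new);
    // With equal timestamps, the larger hash breaks the tie.
    let tie = RecordKey { timestamp: 11, hash: [1; 32] };
    assert_eq!(lww(new, tie), tie);
}
```

The hash tiebreak makes the outcome deterministic on every peer, even when two writes carry the same timestamp.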
|
||||
|
||||
### Entry
|
||||
|
||||
```rust
|
||||
pub struct Entry {
|
||||
id: RecordIdentifier,
|
||||
record: Record,
|
||||
}
|
||||
```
|
||||
|
||||
- Combines key and value
|
||||
- `Entry::new(id, record)` constructor
|
||||
- `Entry::new_empty(id)` creates a tombstone with current timestamp
|
||||
- `entry.sign(namespace, author)` produces a `SignedEntry`
|
||||
|
||||
### SignedEntry
|
||||
|
||||
```rust
|
||||
pub struct SignedEntry {
|
||||
signature: EntrySignature, // Dual Ed25519 signatures
|
||||
entry: Entry,
|
||||
}
|
||||
```
|
||||
|
||||
- An entry with cryptographic proof of authorization and authorship
|
||||
- `SignedEntry::from_entry(entry, namespace, author)` — create from entry
|
||||
- `signed_entry.verify(store)` — verify both signatures using a `PublicKeyStore`
|
||||
- Implements `RangeEntry` for the sync algorithm
|
||||
|
||||
### EntrySignature
|
||||
|
||||
```rust
|
||||
pub struct EntrySignature {
|
||||
author_signature: Signature, // 64-byte Ed25519 signature
|
||||
namespace_signature: Signature, // 64-byte Ed25519 signature
|
||||
}
|
||||
```
|
||||
|
||||
- Created by signing the canonical byte encoding of the `Entry`
|
||||
- Both signatures cover the same message bytes
|
||||
- Verification requires both `NamespacePublicKey` and `AuthorPublicKey`
|
||||
|
||||
## Sync Types
|
||||
|
||||
### SyncOutcome
|
||||
|
||||
```rust
|
||||
pub struct SyncOutcome {
|
||||
pub heads_received: AuthorHeads,
|
||||
pub num_recv: usize,
|
||||
pub num_sent: usize,
|
||||
}
|
||||
```
|
||||
|
||||
- Tracks the result of a sync session
|
||||
- `heads_received` accumulates the latest timestamp seen from each author on the remote side
|
||||
|
||||
### ProtocolMessage
|
||||
|
||||
```rust
|
||||
pub type ProtocolMessage = ranger::Message<SignedEntry>;
|
||||
```
|
||||
|
||||
- The wire type for sync protocol messages
|
||||
- Contains `Vec<MessagePart<SignedEntry>>`
|
||||
|
||||
### ContentStatus
|
||||
|
||||
```rust
|
||||
pub enum ContentStatus {
|
||||
Complete, // Content blob fully available
|
||||
Incomplete, // Partially available
|
||||
Missing, // Not available
|
||||
}
|
||||
```
|
||||
|
||||
- Communicated alongside entries during sync
|
||||
- Helps peers decide whether to download content
|
||||
|
||||
### InsertOrigin
|
||||
|
||||
```rust
|
||||
pub enum InsertOrigin {
|
||||
Local,
|
||||
Sync {
|
||||
from: PeerIdBytes, // [u8; 32] — the remote peer
|
||||
remote_content_status: ContentStatus,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
## Event Types
|
||||
|
||||
### Event (Internal)
|
||||
|
||||
```rust
|
||||
pub enum Event {
|
||||
LocalInsert {
|
||||
namespace: NamespaceId,
|
||||
entry: SignedEntry,
|
||||
},
|
||||
RemoteInsert {
|
||||
namespace: NamespaceId,
|
||||
entry: SignedEntry,
|
||||
from: PeerIdBytes,
|
||||
should_download: bool,
|
||||
remote_content_status: ContentStatus,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
- Emitted by `Replica` via `ReplicaInfo` subscribers
|
||||
- `should_download` is determined by the `DownloadPolicy`
|
||||
|
||||
### LiveEvent (Public)
|
||||
|
||||
```rust
|
||||
pub enum LiveEvent {
|
||||
InsertLocal { entry: Entry },
|
||||
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
|
||||
ContentReady { hash: Hash },
|
||||
PendingContentReady,
|
||||
NeighborUp(PublicKey),
|
||||
NeighborDown(PublicKey),
|
||||
SyncFinished(SyncEvent),
|
||||
}
|
||||
```
|
||||
|
||||
- Emitted by the `Engine` through `subscribe()`
|
||||
- `InsertLocal` / `InsertRemote` are derived from `Event` by stripping `SignedEntry` → `Entry`
|
||||
- `ContentReady` is emitted when a blob download completes
|
||||
- `SyncFinished` carries a `SyncEvent` describing the outcome of a sync session reported by the network layer
|
||||
|
||||
## Store Types
|
||||
|
||||
### Store (store::fs::Store)
|
||||
|
||||
```rust
|
||||
pub struct Store {
|
||||
db: Database, // redb database
|
||||
transaction: CurrentTransaction, // Current read/write transaction
|
||||
open_replicas: HashSet<NamespaceId>, // Track which replicas are open
|
||||
pubkeys: MemPublicKeyStore, // Cache for expanded public keys
|
||||
}
|
||||
```
|
||||
|
||||
### Query
|
||||
|
||||
```rust
|
||||
pub struct Query {
|
||||
kind: QueryKind, // Flat or SingleLatestPerKey
|
||||
filter_author: AuthorFilter, // Any or Exact
|
||||
filter_key: KeyFilter, // Any, Exact, or Prefix
|
||||
limit: Option<u64>,
|
||||
offset: u64,
|
||||
include_empty: bool,
|
||||
sort_direction: SortDirection,
|
||||
}
|
||||
```
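To make the filter fields concrete, here is a small sketch of how a key filter with the three modes named in the comment (`Any`, `Exact`, `Prefix`) might be applied together with `offset` and `limit`. The enum payloads and `run_query` are assumptions for illustration; the crate's query execution is not shown in this document.

```rust
// Simplified mirror of the three key-filter modes; payload types are assumed.
enum KeyFilter {
    Any,
    Exact(Vec<u8>),
    Prefix(Vec<u8>),
}

impl KeyFilter {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            KeyFilter::Any => true,
            KeyFilter::Exact(k) => key == k.as_slice(),
            KeyFilter::Prefix(p) => key.starts_with(p),
        }
    }
}

// Hypothetical helper: filter first, then skip `offset` and take `limit`.
fn run_query<'a>(
    keys: &[&'a [u8]],
    filter: &KeyFilter,
    offset: usize,
    limit: Option<usize>,
) -> Vec<&'a [u8]> {
    keys.iter()
        .copied()
        .filter(|k| filter.matches(k))
        .skip(offset)
        .take(limit.unwrap_or(usize::MAX))
        .collect()
}

fn main() {
    let keys: [&[u8]; 3] = [b"a/1", b"a/2", b"b/1"];
    let hits = run_query(&keys, &KeyFilter::Prefix(b"a/".to_vec()), 0, None);
    assert_eq!(hits.len(), 2);
}
```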
|
||||
|
||||
### Capability
|
||||
|
||||
```rust
|
||||
pub enum Capability {
|
||||
Write(NamespaceSecret),
|
||||
Read(NamespaceId),
|
||||
}
|
||||
```
|
||||
|
||||
- `Write` allows inserting entries and signing them
|
||||
- `Read` allows syncing and reading but not inserting
|
||||
- Can be serialized as `(u8, [u8; 32])` — kind byte + key bytes
|
||||
- `merge()` can upgrade `Read` to `Write`
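The `(u8, [u8; 32])` form and the upgrade rule can be sketched as below. This is a simplified model: the key material is plain bytes, the kind values `1` and `2` are assumed for illustration, and the namespace match that a real merge would need to check is omitted because it requires deriving the public key from the secret.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Capability {
    Write([u8; 32]), // stand-in for NamespaceSecret bytes
    Read([u8; 32]),  // stand-in for NamespaceId bytes
}

// Hypothetical kind tags; the crate's actual byte values are not stated here.
const KIND_WRITE: u8 = 1;
const KIND_READ: u8 = 2;

impl Capability {
    // Encode as (kind byte, key bytes).
    fn to_raw(&self) -> (u8, [u8; 32]) {
        match self {
            Capability::Write(secret) => (KIND_WRITE, *secret),
            Capability::Read(id) => (KIND_READ, *id),
        }
    }

    // Decode; unknown kind bytes are rejected.
    fn from_raw(kind: u8, bytes: [u8; 32]) -> Option<Self> {
        match kind {
            KIND_WRITE => Some(Capability::Write(bytes)),
            KIND_READ => Some(Capability::Read(bytes)),
            _ => None,
        }
    }

    // Upgrade Read to Write in place; returns true if an upgrade happened.
    // The namespace match check is omitted in this sketch.
    fn merge(&mut self, other: Capability) -> bool {
        let upgrade = matches!(
            (&*self, &other),
            (Capability::Read(_), Capability::Write(_))
        );
        if upgrade {
            *self = other;
        }
        upgrade
    }
}

fn main() {
    let mut cap = Capability::Read([7; 32]);
    assert!(cap.merge(Capability::Write([8; 32])));
    // A Write capability is never downgraded.
    assert!(!cap.merge(Capability::Read([7; 32])));
    let (kind, bytes) = cap.to_raw();
    assert_eq!(Capability::from_raw(kind, bytes), Some(cap));
}
```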
|
||||
|
||||
### DownloadPolicy
|
||||
|
||||
```rust
|
||||
pub enum DownloadPolicy {
|
||||
NothingExcept(Vec<FilterKind>), // Whitelist mode
|
||||
EverythingExcept(Vec<FilterKind>), // Blacklist mode (default)
|
||||
}
|
||||
```
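A sketch of how the two policy modes could decide whether to fetch an entry's content. The `FilterKind` variants (`Prefix`, `Exact`) and the `should_download` method are assumptions made for this example; only the two `DownloadPolicy` variants come from the definition above.

```rust
// Assumed filter shapes; the variants of FilterKind are not listed in this document.
enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(p) => key.starts_with(p),
            FilterKind::Exact(k) => key == k.as_slice(),
        }
    }
}

enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    // Hypothetical decision function keyed on the entry's key bytes.
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            // Download only keys that match at least one filter.
            DownloadPolicy::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            // Download everything unless a filter matches.
            DownloadPolicy::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}

fn main() {
    // An empty exclusion list downloads everything.
    let default = DownloadPolicy::EverythingExcept(Vec::new());
    assert!(default.should_download(b"any/key"));
    let only_docs = DownloadPolicy::NothingExcept(vec![FilterKind::Prefix(b"docs/".to_vec())]);
    assert!(only_docs.should_download(b"docs/readme"));
    assert!(!only_docs.should_download(b"media/video"));
}
```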
|
||||
|
||||
### DocTicket
|
||||
|
||||
```rust
|
||||
pub struct DocTicket {
|
||||
pub capability: Capability,
|
||||
pub nodes: Vec<EndpointAddr>,
|
||||
}
|
||||
```
|
||||
|
||||
- Serializable as a base32 string with "doc" prefix
|
||||
- Contains everything needed to join a document
|
||||
- The wire format uses a versioned enum: `TicketWireFormat::Variant0(DocTicket)`
|
||||
|
||||
### OpenState
|
||||
|
||||
```rust
|
||||
pub struct OpenState {
|
||||
pub sync: bool, // Whether sync is enabled
|
||||
pub subscribers: usize, // Number of event subscribers
|
||||
pub handles: usize, // Number of open handles
|
||||
}
|
||||
```
|
||||
|
||||
Returned by the `Status` RPC method to report the state of an open document.
|
||||
|
||||
## Utility Constants
|
||||
|
||||
| Constant | Value | Purpose |
|
||||
|----------|-------|---------|
|
||||
| `MAX_TIMESTAMP_FUTURE_SHIFT` | 10 min in μs | Max future drift for entry timestamps |
|
||||
| `MAX_COMMIT_DELAY` | 500ms | Auto-commit interval for store transactions |
|
||||
| `ACTION_CAP` | 1024 | Bounded channel capacity for SyncHandle actions |
|
||||
| `ACTOR_CHANNEL_CAP` | 64 | Channel capacity for LiveActor messages |
|
||||
| `SUBSCRIBE_CHANNEL_CAP` | 256 | Channel capacity for event subscriptions |
|
||||
| `PEERS_PER_DOC_CACHE_SIZE` | 5 | LRU cache size for sync peers per document |
|
||||
| `MAX_MESSAGE_SIZE` | 1 GiB | Max wire message size |
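
As an example of how one of these constants might be used, the sketch below rejects entry timestamps that lie too far in the future. The constant value follows the table (10 minutes in microseconds); `now_micros` and `timestamp_ok` are hypothetical helpers, and whether the crate's check is inclusive at the boundary is an assumption.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// 10 minutes expressed in microseconds, per the table above.
const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * 1_000_000;

// Current wall-clock time in microseconds since the Unix epoch.
fn now_micros() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before 1970")
        .as_micros() as u64
}

// Accept a timestamp unless it exceeds `now` by more than the allowed shift.
fn timestamp_ok(entry_ts: u64, now: u64) -> bool {
    entry_ts <= now.saturating_add(MAX_TIMESTAMP_FUTURE_SHIFT)
}

fn main() {
    let now = now_micros();
    assert!(timestamp_ok(now, now));
    // One hour ahead is outside the allowed drift.
    assert!(!timestamp_ok(now + 3_600 * 1_000_000, now));
}
```

Bounding future drift limits how long a peer with a skewed clock can win last-writer-wins conflicts.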
|
||||
59
docs/research/references/iroh/iroh-docs/README.md
Normal file
@@ -0,0 +1,59 @@
|
||||
# iroh-docs Reference Documentation
|
||||
|
||||
> Version: 0.98.0
|
||||
> Repository: https://github.com/n0-computer/iroh-docs
|
||||
> License: MIT/Apache-2.0
|
||||
> Based on: [Range-Based Set Reconciliation (Meyer, 2022)](https://arxiv.org/abs/2212.13567)
|
||||
|
||||
## Document Index
|
||||
|
||||
| # | File | Topic |
|
||||
|---|------|-------|
|
||||
| 01 | [Overview and Architecture](01-overview-and-architecture.md) | High-level architecture, module layout, dependencies, feature flags |
|
||||
| 02 | [Document Model](02-document-model.md) | CRDT data model: namespaces, authors, entries, signatures, prefix deletion, timestamps |
|
||||
| 03 | [Sync Protocol](03-sync-protocol.md) | Range-based set reconciliation algorithm, fingerprints, message format, Store trait |
|
||||
| 04 | [Store and Persistence](04-store-and-persistence.md) | redb table schema, transaction model, queries, download policies, PublicKeyStore |
|
||||
| 05 | [Engine and Live Sync](05-engine-and-live-sync.md) | Engine, LiveActor, GossipState, content download, event system, DefaultAuthor |
|
||||
| 06 | [Network Protocol](06-network-protocol.md) | ALPN, wire format, Alice/Bob protocol flow, error types, gossip integration |
|
||||
| 07 | [API and Data Flow](07-api-and-data-flow.md) | RPC API, DocsApi, protocol messages, data flow diagrams |
|
||||
| 08 | [Key Types Reference](08-key-types-reference.md) | All public types, constants, and their relationships |
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Core Concepts
|
||||
|
||||
- **Namespace**: A document identity. Identified by `NamespaceId` (32 bytes), backed by an Ed25519 keypair (`NamespaceSecret`).
|
||||
- **Author**: A writer identity. Identified by `AuthorId` (32 bytes), backed by an Ed25519 keypair (`Author`).
|
||||
- **Entry**: A record identified by (namespace, author, key) with a value of (hash, len, timestamp).
|
||||
- **SignedEntry**: An entry with dual Ed25519 signatures (namespace + author) proving authorization and authorship.
|
||||
- **Replica**: A local instance of a document, holding entries in a store.
|
||||
- **Capability**: Either `Write(NamespaceSecret)` or `Read(NamespaceId)` — controls whether entries can be inserted.
|
||||
- **Store**: A `redb`-backed persistent store managing authors, namespaces, entries, and peer caches.
|
||||
- **Engine**: Coordinates sync actors, gossip, and content downloads for live synchronization.
|
||||
|
||||
### Key Algorithms
|
||||
|
||||
1. **Range-based set reconciliation**: Efficiently compute the union of two entry sets over a network by comparing fingerprints of partitions, subdividing when fingerprints differ.
|
||||
2. **Prefix deletion**: Inserting an entry at key "foo" removes older entries by the same author whose keys start with "foo" (a byte prefix, so both "foo/bar" and "foobar" match).
|
||||
3. **Last-writer-wins**: When entries conflict on the same (namespace, author, key), the one with the higher (timestamp, hash) wins.
|
||||
4. **XOR fingerprints**: Fingerprint of a set is the XOR of individual entry fingerprints (BLAKE3 hashes of key data).
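
The XOR fingerprint from item 4 can be sketched with the standard library. Note the substitution: `DefaultHasher` (a 64-bit std hash) stands in for BLAKE3 here, and `entry_fingerprint` / `set_fingerprint` are hypothetical names. The point is only the algebra: XOR is commutative and associative, so a set's fingerprint does not depend on iteration order and can be combined from the fingerprints of disjoint sub-ranges.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a BLAKE3 entry fingerprint: a 64-bit std hash of the entry bytes.
fn entry_fingerprint(entry: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    entry.hash(&mut hasher);
    hasher.finish()
}

// XOR of the individual fingerprints; the empty set yields 0.
fn set_fingerprint<'a>(entries: impl IntoIterator<Item = &'a [u8]>) -> u64 {
    entries
        .into_iter()
        .map(entry_fingerprint)
        .fold(0, |acc, fp| acc ^ fp)
}

fn main() {
    let a: [&[u8]; 3] = [b"k1", b"k2", b"k3"];
    let b: [&[u8]; 3] = [b"k3", b"k1", b"k2"];
    // Order does not matter.
    assert_eq!(set_fingerprint(a), set_fingerprint(b));
    // The fingerprint of a range is the XOR of its sub-range fingerprints.
    let left = set_fingerprint([a[0]]);
    let right = set_fingerprint([a[1], a[2]]);
    assert_eq!(set_fingerprint(a), left ^ right);
}
```

When two peers compute different fingerprints for the same range, they split the range and compare the halves, which is what drives the subdivision step in item 1.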
|
||||
|
||||
### Data Flow
|
||||
|
||||
```
|
||||
Application → DocsApi → Engine → LiveActor → GossipState → iroh-gossip
|
||||
↓ ↓
|
||||
SyncHandle → Actor → Store (redb) ← QUIC streams (iroh)
|
||||
↓
|
||||
iroh-blobs (content transfer)
|
||||
```
|
||||
|
||||
### Dependencies
|
||||
|
||||
- `iroh` — QUIC networking
|
||||
- `iroh-blobs` — Content-addressed blob storage and transfer
|
||||
- `iroh-gossip` — Gossip protocol for live updates
|
||||
- `redb` — Embedded key-value store
|
||||
- `ed25519-dalek` — Ed25519 signatures
|
||||
- `blake3` — Hashing
|
||||
- `postcard` — Serialization
|
||||