docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs

2026-06-10 12:34:30 +00:00
parent 6e71d1f306
commit 5bb5e1064c
49 changed files with 9923 additions and 0 deletions


@@ -0,0 +1,138 @@
# iroh-blobs: Overview and Architecture
**Version**: 0.100.0
**Repository**: https://github.com/n0-computer/iroh-blobs
**License**: MIT OR Apache-2.0
**Rust Edition**: 2021
**MSRV**: 1.89
## What It Is
`iroh-blobs` is a Rust crate for content-addressed blob transfer over QUIC connections, built on top of [iroh](https://docs.rs/iroh). It implements a request-response protocol for streaming BLAKE3-verified data between peers, along with store implementations for persisting blobs locally.
The core value proposition: transfer arbitrary-sized data with **cryptographic integrity guaranteed in-stream** — every 16 KiB chunk group can be verified against the BLAKE3 hash tree as it arrives, without waiting for the complete transfer.
## Core Concepts
| Concept | Description |
|---------|-------------|
| **Blob** | A sequence of bytes of arbitrary size, identified by its BLAKE3 hash. No metadata. |
| **Link** | A 32-byte BLAKE3 hash of a blob — the content address. |
| **HashSeq** | A blob whose content is a sequence of BLAKE3 hashes (each 32 bytes). Length must be a multiple of 32. |
| **Provider** | The side serving data. Waits for incoming requests and responds. |
| **Requester** | The side requesting data. Initiates connections and sends requests. |
| **Tag** | A persistent named reference to a `HashAndFormat`, protecting blobs from garbage collection. |
| **TempTag** | An ephemeral in-memory reference that protects content while the process runs. |
| **Chunk** | The fundamental BLAKE3 unit: 1024 bytes. |
| **Chunk Group** | Iroh's grouping of 16 chunks (16 KiB), the minimum granularity for range requests and verification. |
## Architecture Diagram
```
┌─────────────────────────────────────────────────────┐
│                     Application                     │
│                                                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────┐   │
│  │  Blobs   │  │   Tags   │  │    Downloader    │   │
│  │   API    │  │   API    │  │       API        │   │
│  └────┬─────┘  └────┬─────┘  └───────┬──────────┘   │
│       │             │                │              │
│       └─────────────┴────────────────┘              │
│                     │                               │
│             ┌───────┴───────┐                       │
│             │  Store (API)  │  ← Actor-based, RPC   │
│             │  Commands     │    message passing    │
│             └───────┬───────┘                       │
│                     │                               │
│       ┌─────────────┼─────────────┐                 │
│       │             │             │                 │
│ ┌─────┴─────┐  ┌────┴────┐  ┌─────┴─────┐           │
│ │ MemStore  │  │ FsStore │  │ Readonly  │           │
│ │           │  │ (redb + │  │ MemStore  │           │
│ │           │  │   fs)   │  │           │           │
│ └───────────┘  └─────────┘  └───────────┘           │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│                    Network Layer                    │
│                                                     │
│  ┌──────────────────┐    ┌──────────────────────┐   │
│  │  BlobsProtocol   │    │  Remote (Client)     │   │
│  │  (Provider side) │    │  (Requester side)    │   │
│  │                  │    │                      │   │
│  │  handle_conn()   │    │  Remote::fetch()     │   │
│  │  handle_stream() │    │  Remote::local()     │   │
│  └────────┬─────────┘    └──────────┬───────────┘   │
│           │                         │               │
│           └─────── iroh QUIC ───────┘               │
│                 ALPN: /iroh-bytes/4                 │
└─────────────────────────────────────────────────────┘
```
## Module Structure
```
iroh-blobs/src/
├── lib.rs  # Crate root, re-exports
├── hash.rs  # Hash, BlobFormat, HashAndFormat
├── hashseq.rs  # HashSeq type
├── format.rs  # Format module (Collection)
│   └── collection.rs  # Collection type with metadata
├── protocol.rs  # Wire protocol types (GetRequest, etc.)
│   └── range_spec.rs  # ChunkRangesSeq, RangeSpec wire encoding
├── net_protocol.rs  # BlobsProtocol (iroh ProtocolHandler)
├── provider.rs  # Server-side request handling
│   └── events.rs  # Event system (connect/disconnect/progress)
├── get.rs  # Client-side FSM for getting data
│   ├── error.rs  # GetError, GetResult types
│   └── request.rs  # Request execution helpers
├── api/  # High-level store API
│   ├── blobs.rs  # Blob operations (add, export, read, etc.)
│   │   └── reader.rs  # BlobReader (AsyncRead + AsyncSeek)
│   ├── downloader.rs  # Multi-source download coordinator
│   ├── remote.rs  # Remote peer interaction (fetch, observe)
│   ├── tags.rs  # Tag management API
│   ├── proto.rs  # Store command protocol (RPC messages)
│   └── proto/  # Proto sub-modules
│       └── bitfield.rs  # Bitfield type for chunk tracking
├── store/  # Storage implementations
│   ├── mod.rs  # IROH_BLOCK_SIZE, GcConfig
│   ├── mem.rs  # MemStore (in-memory, mutable)
│   ├── fs.rs  # FsStore (filesystem + redb hybrid)
│   ├── readonly_mem.rs  # Read-only memory store
│   ├── gc.rs  # Garbage collection
│   ├── util.rs  # Shared utilities (Tag, SparseMemFile, etc.)
│   └── test.rs  # Test utilities
├── ticket.rs  # BlobTicket (shareable connection info)
├── metrics.rs  # Prometheus metrics definitions
└── util/  # Utilities
    ├── channel.rs  # Channel helpers
    ├── connection_pool.rs  # Connection pooling
    ├── stream.rs  # Stream abstractions
    └── temp_tag.rs  # TempTag, TagCounter, TempTags scope management
```
## Key Dependencies
| Dependency | Purpose |
|------------|---------|
| `bao-tree` | BLAKE3 verified streaming, outboard storage, BaoTree encoding/decoding |
| `iroh` | QUIC networking, endpoint, router |
| `irpc` | RPC framework for store commands |
| `postcard` | Wire serialization (compact, no-schema) |
| `redb` | Embedded key-value database (fs-store feature) |
| `range-collections` | RangeSet2 / ChunkRanges for chunk tracking |
| `bytes` | Efficient byte buffer handling |
## Feature Flags
| Feature | Default | Description |
|---------|---------|-------------|
| `fs-store` | ✅ | Filesystem-based store with redb + file hybrid |
| `rpc` | ✅ | RPC support via `noq` / `irpc` |
| `metrics` | ❌ | Prometheus metrics |
| `hide-proto-docs` | ✅ | Hides protocol docs from rustdocs |
## BLAKE3 Block Size
The crate uses a fixed block size of `IROH_BLOCK_SIZE = BlockSize::from_chunk_log(4)`, which means each chunk group is 2^4 = 16 chunks = 16 × 1024 = 16,384 bytes (16 KiB). This is the minimum granularity for range requests and verification.
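The arithmetic above can be checked with a few lines of plain Rust. This is a minimal sketch that does not use `bao-tree`; the constant and function names (`CHUNK_LOG`, `chunk_groups`, `group_of`) are illustrative and not part of the crate.

```rust
/// Size of one BLAKE3 chunk in bytes.
const CHUNK_SIZE: u64 = 1024;
/// log2 of the chunks per chunk group (the `4` passed to `from_chunk_log`).
const CHUNK_LOG: u32 = 4;
/// Bytes per chunk group: 1024 << 4 = 16_384.
const GROUP_SIZE: u64 = CHUNK_SIZE << CHUNK_LOG;

/// Number of chunk groups needed to cover `size` bytes, rounding up.
fn chunk_groups(size: u64) -> u64 {
    size.div_ceil(GROUP_SIZE)
}

/// Index of the chunk group that contains the byte at `offset`.
fn group_of(offset: u64) -> u64 {
    offset / GROUP_SIZE
}
```

A blob of 16,385 bytes therefore spans two chunk groups, and a request for its last byte touches group 1.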


@@ -0,0 +1,195 @@
# iroh-blobs: Key Types and Data Structures
## Hash
```rust
// src/hash.rs
pub struct Hash(blake3::Hash); // 32-byte BLAKE3 hash, wraps blake3::Hash
```
The fundamental content-address. Created via `Hash::new(data)` or `Hash::from_bytes([u8; 32])`. Has a constant `Hash::EMPTY` for the empty blob. Supports hex display, serde (compact binary for non-human-readable), and is stored as a 32-byte fixed array in redb.
Wire format: 32 raw bytes (postcard serialization). No framing overhead.
## BlobFormat
```rust
pub enum BlobFormat {
Raw, // A single blob
HashSeq, // A sequence of BLAKE3 hashes
}
```
Distinguishes between a raw binary blob and a hash sequence. Wire format: single byte (0 = Raw, 1 = HashSeq).
## HashAndFormat
```rust
pub struct HashAndFormat {
pub hash: Hash,
pub format: BlobFormat,
}
```
Pairs a hash with its format. Wire format: 33 bytes (32 for hash + 1 for format). Display format: hex string, optionally prefixed with 's' for HashSeq.
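The 33-byte layout follows from the two wire formats described above: 32 raw hash bytes, then one format byte. The sketch below builds that layout by hand with the standard library only; `encode_hash_and_format` is a hypothetical helper, and the field order (hash first) is assumed from the struct definition.

```rust
/// Hand-rolled encoding of the layout described above:
/// 32 raw hash bytes followed by one format byte (0 = Raw, 1 = HashSeq).
/// Hypothetical helper; the crate itself serializes via postcard.
fn encode_hash_and_format(hash: &[u8; 32], is_hash_seq: bool) -> [u8; 33] {
    let mut out = [0u8; 33];
    out[..32].copy_from_slice(hash);
    out[32] = if is_hash_seq { 1 } else { 0 };
    out
}
```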
## HashSeq
```rust
// src/hashseq.rs
pub struct HashSeq(Bytes); // Wrapper around Bytes, length must be multiple of 32
```
A blob interpreted as a sequence of 32-byte BLAKE3 hashes. Created from `Bytes` via `HashSeq::new(bytes)` (returns `None` if length is not a multiple of 32). Iterable, supports `get(index)`, `pop_front()`.
Used extensively: collections are stored as a HashSeq where the first child is metadata and subsequent children are data blobs.
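To make the length rule concrete, here is a minimal stand-in for the type using only the standard library. `HashSeqSketch` is a hypothetical name; the real `HashSeq` wraps `Bytes` and returns `Hash` values, which this sketch replaces with `Vec<u8>` and `[u8; 32]`.

```rust
/// Hypothetical stand-in for `HashSeq`: raw bytes whose length is a multiple of 32.
struct HashSeqSketch(Vec<u8>);

impl HashSeqSketch {
    /// Returns `None` when the length is not a multiple of 32.
    fn new(bytes: Vec<u8>) -> Option<Self> {
        if bytes.len() % 32 == 0 {
            Some(Self(bytes))
        } else {
            None
        }
    }

    /// Number of hashes in the sequence.
    fn len(&self) -> usize {
        self.0.len() / 32
    }

    /// The hash at `index`, or `None` when out of bounds.
    fn get(&self, index: usize) -> Option<[u8; 32]> {
        let start = index.checked_mul(32)?;
        let end = start.checked_add(32)?;
        let slice = self.0.get(start..end)?;
        let mut hash = [0u8; 32];
        hash.copy_from_slice(slice);
        Some(hash)
    }
}
```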
## Bitfield
```rust
// src/api/proto/bitfield.rs
pub struct Bitfield {
pub size: u64, // Total size of the blob in bytes
pub ranges: ChunkRanges, // Which chunks are verified/present
}
```
Tracks which chunks of a blob are present and verified. Key methods:
- `is_complete()` — all chunks present
- `validated_size()` — how many bytes are verified
- `diff(&other)` — compute the delta between two bitfields
Used by the observe protocol and internal state tracking.
## Tag
```rust
// src/store/util.rs
pub struct Tag(pub Bytes); // Named reference, arbitrary bytes, typically UTF-8
```
A persistent named reference to content in the store. Tags protect content from garbage collection. Auto-generated tags use the format `"auto-2026-01-15T12:34:56.789Z"`. Tags are stored in the store's database and can be listed, created, renamed, and deleted.
## TempTag
```rust
// src/util/temp_tag.rs
pub struct TempTag {
inner: HashAndFormat,
on_drop: Option<Weak<dyn TagDrop>>, // Callback when dropped
}
```
An ephemeral, in-memory tag. While a `TempTag` exists, its referenced content is protected from garbage collection. When dropped, the `TagDrop` callback notifies the store to unprotect. Can be `leak()`ed to make the protection permanent for the process lifetime.
Scopes: `TempTagScope` manages groups of temp tags. `Scope::GLOBAL` is the default scope. Batches of operations can create scoped temp tags that are cleaned up together.
## BlobTicket
```rust
// src/ticket.rs
pub struct BlobTicket {
addr: EndpointAddr, // How to reach the provider (includes EndpointId, relay URL, direct addresses)
format: BlobFormat, // Raw or HashSeq
hash: Hash, // What to retrieve
}
```
A shareable token containing everything needed to retrieve a blob from a provider. Serialized via `iroh_tickets::Ticket` trait (base32-encoded with "blob" prefix). Wire format uses postcard with a variant discriminator.
```rust
// Creating a ticket
let ticket = BlobTicket::new(addr, hash, BlobFormat::Raw);
// From a ticket string
let ticket: BlobTicket = ticket_str.parse()?;
```
## ChunkRanges and ChunkRangesSeq
### ChunkRanges
```rust
pub type ChunkRanges = RangeSet2<ChunkNum>; // From range_collections crate
```
A set of non-overlapping chunk ranges. Supports boolean operations (union, intersection, difference). The fundamental unit is `ChunkNum` (a u64 newtype representing a 1024-byte BLAKE3 chunk).
Helper trait `ChunkRangesExt` provides:
- `ChunkRanges::all()` — all chunks
- `ChunkRanges::bytes(range)` — byte range rounded up to chunk boundaries
- `ChunkRanges::chunks(range)` — chunk range from u64 bounds
- `ChunkRanges::last_chunk()` — the very last chunk (for size verification)
- `ChunkRanges::chunk(n)` — a single chunk
- `ChunkRanges::offset(n)` — a single byte offset rounded to chunk
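The rounding performed by a byte-to-chunk conversion can be sketched without `range-collections`. The function below is an illustration of the idea, not the crate's implementation; `chunks_covering` is a hypothetical name and plain `Range<u64>` stands in for `ChunkRanges`.

```rust
use std::ops::Range;

const CHUNK_SIZE: u64 = 1024;

/// Smallest half-open chunk range that covers the half-open byte range.
/// The start is rounded down and the end is rounded up to a chunk boundary.
fn chunks_covering(bytes: Range<u64>) -> Range<u64> {
    if bytes.start >= bytes.end {
        return 0..0; // empty byte range selects no chunks
    }
    let start = bytes.start / CHUNK_SIZE;
    let end = bytes.end.div_ceil(CHUNK_SIZE);
    start..end
}
```

For example, bytes `1000..1025` straddle a chunk boundary, so two chunks (`0..2`) are needed.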
### ChunkRangesSeq
```rust
// src/protocol/range_spec.rs
pub struct ChunkRangesSeq(SmallVec<[(u64, ChunkRanges); 2]>);
```
A sequence of `ChunkRanges`, one per blob in a HashSeq. Uses run-length encoding: stores `(offset, ranges)` pairs, where offset is the first blob index with that range spec. Unspecified indices default to the most recent range (or empty for finite sequences).
Key methods:
- `ChunkRangesSeq::all()` — request everything (root + all children, forever)
- `ChunkRangesSeq::root()` — request only the root blob
- `ChunkRangesSeq::empty()` — request nothing
- `ChunkRangesSeq::from_ranges(ranges)` — from explicit iterator
- `ChunkRangesSeq::from_ranges_infinite(ranges)` — last range repeats forever
- `.iter_non_empty_infinite()` — iterate only non-empty ranges
- `.is_blob()` — true if requesting a single blob (offset 0 with one entry)
### RangeSpec (Wire Format)
```rust
pub struct RangeSpec(SmallVec<[u64; 2]>);
```
The on-wire encoding of `ChunkRanges`. Uses alternating spans: first span is deselected, second is selected, etc. SmallVec avoids allocation for the common case of a single range.
Examples:
- `[]` — empty (nothing selected)
- `[0]` — everything from chunk 0 selected (entire blob)
- `[2, 5, 3, 1]` — chunks 2..7 and 10..11 selected (half-open ranges: chunks 2 through 6, plus chunk 10)
- `[u64::MAX]` — only the last chunk (size proof)
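The alternating-span rule can be decoded mechanically. The sketch below is a standard-library illustration of that rule, assuming each value is a span length and the first span is deselected; `decode_spans` is a hypothetical name, and `(start, Option<end>)` pairs stand in for `ChunkRanges`, with `None` marking an open-ended range.

```rust
/// Decode alternating span lengths into half-open selected chunk ranges.
/// Each entry is `(start, end)`; `end == None` means "until the end of the blob".
fn decode_spans(spans: &[u64]) -> Vec<(u64, Option<u64>)> {
    let mut out = Vec::new();
    let mut pos: u64 = 0;
    let mut selected = false; // the first span is always deselected
    let mut start: u64 = 0;
    for &len in spans {
        pos = pos.saturating_add(len);
        selected = !selected;
        if selected {
            start = pos; // a selected span begins here
        } else {
            out.push((start, Some(pos))); // a selected span just ended
        }
    }
    if selected {
        out.push((start, None)); // odd number of spans: last one stays open
    }
    out
}
```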
### ChunkRangesSeq Wire Format
Serialized as `(SmallVec<[(u64, RangeSpec); 2]>)` where each element is `(delta_offset, rangespec)`. The `delta_offset` is the distance from the previous entry. Uses postcard varint encoding for compact transmission.
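Two mechanics are involved here: varint-encoded integers and delta-encoded offsets. The following sketch shows both with the standard library only. The varint follows the usual LEB128-style scheme (7 payload bits per byte, high bit as continuation flag), which is how postcard encodes `u64`; the function names are illustrative.

```rust
/// Encode a u64 as a little-endian base-128 varint.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            return out;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Decode a varint, returning the value and the number of bytes consumed.
fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate().take(10) {
        value |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated or longer than 10 bytes
}

/// Turn delta offsets into absolute blob indices with a running sum.
fn absolute_offsets(deltas: &[u64]) -> Vec<u64> {
    let mut acc = 0u64;
    deltas
        .iter()
        .map(|d| {
            acc = acc.saturating_add(*d);
            acc
        })
        .collect()
}
```

Small offsets and span lengths, which are the common case, fit in a single byte each.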
## Store Command Protocol
The store API uses an RPC-style command pattern via `irpc`. Each command has a `Command` enum variant with typed request/response channels:
```rust
#[rpc_requests(message = Command, alias = "Msg", rpc_feature = "rpc")]
pub enum Request {
ListBlobs(ListRequest),
Batch(BatchRequest),
DeleteBlobs(BlobDeleteRequest),
ImportBao(ImportBaoRequest), // streaming: rx bao items, tx result
ExportBao(ExportBaoRequest), // streaming: tx encoded items
ExportRanges(ExportRangesRequest), // streaming: tx range data
Observe(ObserveRequest), // streaming: tx bitfield updates
BlobStatus(BlobStatusRequest),
ImportBytes(ImportBytesRequest),
ImportByteStream(ImportByteStreamRequest), // duplex streaming
ImportPath(ImportPathRequest),
ExportPath(ExportPathRequest),
ListTags(ListTagsRequest),
SetTag(SetTagRequest),
DeleteTags(DeleteTagsRequest),
RenameTag(RenameTagRequest),
CreateTag(CreateTagRequest),
CreateTempTag(CreateTempTagRequest),
ListTempTags(ListTempTagsRequest),
SyncDb(SyncDbRequest),
WaitIdle(WaitIdleRequest),
Shutdown(ShutdownRequest),
ClearProtected(ClearProtectedRequest),
}
```
This allows both local (in-process) and remote (RPC) store access through the same API surface.


@@ -0,0 +1,249 @@
# iroh-blobs: Transfer Protocol
## Overview
The transfer protocol is a **request-response** protocol operating over QUIC streams (via iroh). The ALPN is `b"/iroh-bytes/4"`.
The requester opens a bidirectional QUIC stream, sends a request, and the provider responds with BLAKE3-verified streaming data on the same stream.
**Key properties**:
- Data integrity is verified in-stream — every 16 KiB chunk group can be independently verified against the BLAKE3 hash tree
- No upper limit on blob or collection size — streaming design avoids buffering entire transfers
- Zero round-trip overhead for multiple small blobs (via HashSeq/GetManyRequest)
- Range requests are expressed in 1 KiB chunks and served at chunk-group (16 KiB) granularity
## Request Types
```rust
pub enum Request {
Get(GetRequest),
Observe(ObserveRequest),
Slot2, Slot3, Slot4, Slot5, Slot6, Slot7, // Reserved
Push(PushRequest),
GetMany(GetManyRequest),
}
```
Wire format: 1-byte discriminator (postcard-encoded `RequestType` enum), followed by postcard-serialized request body.
### GetRequest
```rust
pub struct GetRequest {
pub hash: Hash, // BLAKE3 hash of the root blob
pub ranges: ChunkRangesSeq, // What ranges to request
}
```
The most common request type. The `ranges` field uses `ChunkRangesSeq` to express which parts of the root blob and its children to request.
**Common patterns**:
```rust
// Request an entire single blob
let req = GetRequest::blob(hash);
// -> ChunkRangesSeq with a single element: all chunks of the root
// Request a HashSeq (root + all children)
let req = GetRequest::all(hash);
// -> ChunkRangesSeq::all() - infinite sequence of "all chunks"
// Request parts of a single blob
let req = GetRequest::builder()
.root(ChunkRanges::bytes(0..1000))
.build(hash);
// Request a HashSeq with specific child ranges
let req = GetRequest::builder()
.root(ChunkRanges::all()) // full root (the hash seq)
.child(1, ChunkRanges::bytes(0..100)) // partial child 1
.next(ChunkRanges::all()) // full remaining children
.build_open(hash); // build_open = last range repeats forever
```
### GetManyRequest
```rust
pub struct GetManyRequest {
pub hashes: Vec<Hash>, // Sorted, deduplicated list of hashes
pub ranges: ChunkRangesSeq, // Ranges for each hash (no root entry)
}
```
Like a `GetRequest` for a HashSeq, but the hashes are provided by the requester instead of looked up from the provider. This avoids the provider needing to have a pre-existing HashSeq blob.
```rust
let req = GetManyRequest::builder()
.hash(hash1, ChunkRanges::all())
.hash(hash2, ChunkRanges::all())
.build();
// Deduplicates and sorts hashes automatically
```
### PushRequest
```rust
pub struct PushRequest(GetRequest); // Wraps a GetRequest
```
The inverse of a GetRequest — the requester pushes data to the provider. The request describes what will be sent, followed by the actual data stream. Providers may reject push requests (disabled by default via `EventMask`).
### ObserveRequest
```rust
pub struct ObserveRequest {
pub hash: Hash,
pub ranges: RangeSpec, // Which ranges to observe
}
```
Subscribes to availability changes for a blob's bitfield. The provider sends `ObserveItem` updates as chunks become available.
## Response Format
### For Get/GetMany/Push
The response is BLAKE3-verified streaming data (bao-tree format). For each blob in the request:
1. **8-byte size header** (little-endian u64) — the total size of the blob
2. **BLAKE3 verified stream** — encoded data for the requested ranges, using bao-tree's mixed encoding:
- `BaoContentItem::Parent(node, (left_hash, right_hash))` — internal hash tree nodes (64 bytes each)
- `BaoContentItem::Leaf(Leaf { offset, data })` — actual data chunks
The data is sent in order: ascending chunks for each blob, blobs in HashSeq order.
**Verification**: The requester validates each chunk group against the expected BLAKE3 hash tree. Invalid data is detected within at most 16 KiB of reception. Missing data (provider doesn't have a chunk) causes the provider to close the stream at the point where data becomes unavailable.
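Reading the size header is the first step a requester performs for each blob. A minimal sketch with the standard library, assuming the response bytes are already in a buffer (the real client reads them from a QUIC stream); `read_size_header` is a hypothetical helper.

```rust
/// Split a response prefix into the blob size (8-byte little-endian u64)
/// and the remaining encoded stream. Returns `None` if fewer than 8 bytes
/// are available.
fn read_size_header(buf: &[u8]) -> Option<(u64, &[u8])> {
    if buf.len() < 8 {
        return None;
    }
    let (head, rest) = buf.split_at(8);
    let mut raw = [0u8; 8];
    raw.copy_from_slice(head);
    Some((u64::from_le_bytes(raw), rest))
}
```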
### For Observe
The provider sends length-prefixed `ObserveItem` messages:
```rust
pub struct ObserveItem {
pub size: u64, // Blob size
pub ranges: ChunkRanges, // Available chunks
}
```
Updates are sent as deltas — only the new chunks that have become available since the last update.
## Error Handling
Error codes for stream/connection closure:
| Code | Name | Meaning |
|------|------|---------|
| 0 | StreamDropped | RecvStream was dropped |
| 1 | ProviderTerminating | Provider is shutting down |
| 2 | RequestReceived | Only one request per stream allowed |
| 1 (application) | ERR_PERMISSION | Permission denied |
| 2 (application) | ERR_LIMIT | Rate limited |
| 3 (application) | ERR_INTERNAL | Internal error |
## Client-Side FSM (Get)
The `get::fsm` module implements the get request as a **finite state machine** for maximum control:
```
AtInitial
  │ (open QUIC stream)
AtConnected
  │ (send request, drop writer)
ConnectedNext ─┬─ StartRoot(hash, ranges)     // offset 0 = root blob
               ├─ StartChild(offset, ranges)  // offset > 0 = child blob
               └─ Closing                     // empty request
AtStartRoot / AtStartChild
  │ (determine hash for child)
AtBlobHeader
  │ (read 8-byte size)
AtBlobContent
  │ (stream BLAKE3-verified items)
  ├─ More(content_item) → AtBlobContent  // loop
  └─ Done → AtEndBlob
AtEndBlob
  │ (iterate to next blob in sequence)
  ├─ MoreChildren(AtStartChild)
  └─ Closing
       │ (drain remaining bytes)
     Stats (transfer statistics)
```
Each state transition is explicit. The FSM gives the consumer full control:
- `AtBlobContent::next()` returns `BlobContentNext::More((content, item))` or `BlobContentNext::Done(end)`
- `AtBlobHeader::next()` reads the size header and creates a `ResponseDecoder`
- `AtStartChild::next(hash)` requires the caller to supply the hash (from the HashSeq)
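The order of transitions can be modelled as a small pure function. This is a deliberately simplified sketch: the `State` enum, `step`, and `trace` are hypothetical, every blob is assumed to yield the same number of content items, and the real states own stream readers rather than counters.

```rust
/// Simplified stand-ins for the FSM states described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Initial,
    Connected,
    StartBlob { index: u64 }, // index 0 = root, index > 0 = child
    BlobHeader { index: u64 },
    BlobContent { index: u64, items_left: u32 },
    EndBlob { index: u64 },
    Closing,
    Finished,
}

/// Advance by one transition for a request covering `blobs` blobs.
fn step(state: State, blobs: u64, items_per_blob: u32) -> State {
    match state {
        State::Initial => State::Connected,
        State::Connected if blobs == 0 => State::Closing, // empty request
        State::Connected => State::StartBlob { index: 0 },
        State::StartBlob { index } => State::BlobHeader { index },
        State::BlobHeader { index } => State::BlobContent { index, items_left: items_per_blob },
        State::BlobContent { index, items_left: 0 } => State::EndBlob { index },
        State::BlobContent { index, items_left } => {
            State::BlobContent { index, items_left: items_left - 1 }
        }
        State::EndBlob { index } if index + 1 < blobs => State::StartBlob { index: index + 1 },
        State::EndBlob { .. } => State::Closing,
        State::Closing | State::Finished => State::Finished,
    }
}

/// Record every state visited from `Initial` to `Finished`.
fn trace(blobs: u64, items_per_blob: u32) -> Vec<State> {
    let mut state = State::Initial;
    let mut seen = vec![state];
    while state != State::Finished {
        state = step(state, blobs, items_per_blob);
        seen.push(state);
    }
    seen
}
```

Modelling each state as a value that is consumed to produce the next one is what lets the caller decide, at every step, whether to continue, skip, or abort.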
### Stats Tracking
```rust
pub struct Stats {
pub payload_bytes_read: u64, // Actual data bytes
pub other_bytes_read: u64, // Hash pairs, headers
pub payload_bytes_written: u64, // For push
pub other_bytes_written: u64, // For push
pub elapsed: Duration,
}
```
## Provider-Side Handling
```rust
pub async fn handle_connection(connection: Connection, store: Store, events: EventSender);
```
The provider accepts QUIC streams on a connection. For each stream:
1. Read the request type byte
2. Deserialize the request
3. Dispatch to `handle_get`, `handle_get_many`, `handle_observe`, or `handle_push`
4. For `handle_get`: iterate over the `ChunkRangesSeq`, streaming each blob via `store.export_bao(hash, ranges)`
5. For HashSeq requests: load the root blob, parse it as `HashSeq`, then stream each requested child
### Event System
The provider can emit events for monitoring and access control:
```rust
pub struct EventMask {
pub connected: ConnectMode, // None, Notify, Intercept
pub get: RequestMode, // None, Notify, Intercept, NotifyLog, InterceptLog, Disabled
pub get_many: RequestMode,
pub push: RequestMode, // Disabled by default!
pub observe: ObserveMode,
pub throttle: ThrottleMode, // None, Intercept
}
```
- **None**: No events, requests processed normally
- **Notify**: Events sent but cannot block requests
- **Intercept**: Events sent as RPC requests; handler can reject with `AbortReason`
- **Disabled**: All requests of this type rejected
Progress events: `TransferStarted`, `TransferProgress`, `TransferCompleted`, `TransferAborted`.
## Collection Format
```rust
pub struct Collection {
blobs: Vec<(String, Hash)>, // Named references to child blobs
}
```
Wire format (as a HashSeq blob):
1. First child blob: `CollectionMeta` serialized with postcard
2. Remaining children: the actual data blobs
```rust
pub struct CollectionMeta {
header: [u8; 13], // Must be b"CollectionV0."
names: Vec<String>, // Names for each child blob
}
```
The header `b"CollectionV0."` is a magic number for format identification. The meta blob's hash becomes the first entry in the HashSeq, followed by the hashes of each data blob. Names correspond 1:1 with data blobs (excluding the meta entry).
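The 1:1 relationship between names and data hashes can be sketched as a small validation step. This example uses only the standard library; `pair_names` is a hypothetical helper, and `[u8; 32]` stands in for `Hash`.

```rust
/// Magic header expected at the start of the collection metadata.
const HEADER: &[u8; 13] = b"CollectionV0.";

/// Pair each name with its data hash. `hashes[0]` is the meta blob's own
/// hash and is skipped; the remaining hashes must match `names` 1:1.
fn pair_names(
    header: &[u8],
    names: &[String],
    hashes: &[[u8; 32]],
) -> Option<Vec<(String, [u8; 32])>> {
    if header != &HEADER[..] {
        return None; // wrong magic number
    }
    let data = hashes.get(1..)?; // None when even the meta entry is missing
    if data.len() != names.len() {
        return None; // names and data blobs must correspond 1:1
    }
    Some(names.iter().cloned().zip(data.iter().copied()).collect())
}
```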


@@ -0,0 +1,250 @@
# iroh-blobs: Storage Architecture
## Overview
iroh-blobs provides three store implementations sharing a common `Store` API surface:
| Store | Location | Mutable | Use Case |
|-------|----------|---------|----------|
| `MemStore` | In-memory | ✅ | Small data, testing, WASM |
| `FsStore` | Filesystem + redb | ✅ | Production, large data |
| `ReadonlyMemStore` | In-memory | ❌ | Static data serving |
All stores implement the same RPC-based command protocol (`Command` enum), allowing both local in-process and remote RPC access through the same `Store` type.
## Store API Surface
The `Store` type (from `api::Store`) is the primary interface. It's accessed via typed sub-APIs:
```rust
let store: Store = /* ... */;
// Blob operations
store.blobs() // → Blobs API (add, export, read, delete, observe, etc.)
store.tags() // → Tags API (create, list, set, delete, rename)
// Direct operations
store.add_bytes(data) // → AddProgress
store.add_slice(data) // → AddProgress (convenience; same result type as `blobs.add_slice`)
store.get_bytes(hash) // → Result<Bytes>
store.has(hash) // → bool
store.shutdown() // Clean shutdown
store.wait_idle() // Wait for all tasks to complete
store.sync_db() // Sync database to disk (FsStore)
```
## Blobs API
```rust
let blobs = store.blobs();
// Import
blobs.add_slice(data) // → AddProgress (raw format)
blobs.add_bytes(data) // → AddProgress (raw format)
blobs.add_bytes_with_opts(AddBytesOptions{..}) // → AddProgress (with format)
blobs.import_byte_stream(format) // → streaming import
// Export
blobs.reader(hash) // → BlobReader (AsyncRead + AsyncSeek)
blobs.export(hash, path) // → export to filesystem
blobs.export_bao(hash, ranges) // → ExportBao (BLAKE3 verified stream)
blobs.export_ranges(hash, ranges) // → ExportRanges (raw data ranges)
// Observe (subscribe to chunk availability)
blobs.observe(hash) // → ObserveAt (bitfield stream)
// Status
blobs.status(hash) // → BlobStatus (NotFound/Partial/Complete)
// Import BAO-encoded data
blobs.import_bao_bytes(hash, ranges, data) // → import verified BAO stream
blobs.import_bao_reader(hash, ranges, reader) // → import from async reader
// Batch operations (scoped temp tags)
blobs.batch() // → Batch (auto-cleanup scope)
// Delete
blobs.delete(hashes) // → force delete (use GC normally)
```
## Tags API
```rust
let tags = store.tags();
tags.set(name, value) // Set a persistent tag
tags.create(value) // Auto-generate a tag name, return Tag
tags.get(name) // → Option<TagInfo>
tags.list() // → Stream<TagInfo>
tags.list_hash_seq() // → Stream<TagInfo> (only HashSeq format)
tags.delete(name) // Delete a tag
tags.delete_range(range) // Delete tags in range
tags.delete_prefix(prefix) // Delete tags with prefix
tags.rename(from, to) // Atomically rename a tag
tags.temp_tag(value) // → TempTag (ephemeral protection)
```
## MemStore Architecture
The in-memory store uses a simple actor pattern:
```
MemStore (ApiClient)
└── Actor (tokio task)
    ├── State
    │   ├── data: HashMap<Hash, BaoFileHandle>  // All blob data
    │   ├── tags: BTreeMap<Tag, HashAndFormat>  // Persistent tags
    │   └── empty_hash: BaoFileHandle  // Special entry for empty blob
    ├── tasks: JoinSet<TaskResult>  // Spawned import/export tasks
    ├── temp_tags: TempTags  // Ephemeral protection
    ├── protected: HashSet<Hash>  // GC-protected hashes
    └── idle_waiters: Vec<oneshot::Sender<()>>  // Wait-idle notifications
```
### BaoFileHandle / BaoFileStorage
```rust
pub enum BaoFileStorage {
Partial(PartialMemStorage), // Still downloading
Complete(CompleteStorage), // Fully available
}
pub struct PartialMemStorage {
data: SparseMemFile, // Sparse byte array for data
outboard: SparseMemFile, // Sparse byte array for BLAKE3 hash tree
size: SizeInfo, // Known/estimated size
bitfield: Bitfield, // Which chunks are verified
}
pub struct CompleteStorage {
data: Bytes, // Complete data
outboard: Bytes, // Complete outboard (hash tree)
}
```
The `watch::Sender<BaoFileStorage>` pattern allows subscribers to observe state changes (for the `observe` API).
### Data Flow (Import)
1. `add_bytes(data)` → compute outboard via `PreOrderMemOutboard::create()` → transition `Partial → Complete`
2. `import_bao(hash, size, stream)` → receive `BaoContentItem` stream → write to `PartialMemStorage` → update bitfield → transition to `Complete` when all chunks present
### Data Flow (Export)
1. `export_bao(hash, ranges)` → look up `BaoFileHandle` → `traverse_ranges_validated(data, outboard, &ranges, tx)` — streams validated BAO data
## FsStore Architecture (Hybrid Store)
The filesystem store uses a **hybrid approach** that stores small data inline in redb and large data as files on disk.
### Design Rationale (from DESIGN.md)
- **Databases** are good for small blobs (low per-entry overhead, fast random access)
- **Filesystems** are good for large blobs (OS-level caching, direct file access)
- **Neither alone** works well for both cases
### Layout
```
<data_dir>/
├── db/ # redb database
│   ├── metadata table  # Hash → EntryState
│   ├── inline_data table  # Hash → Bytes (for small blobs)
│   ├── inline_outboard table  # Hash → Bytes (for small outboards)
│   └── tags table  # Tag → HashAndFormat
├── data/<hash>.data # Large blob data files
├── data/<hash>.outboard # Large outboard files
├── data/<hash>.sizes # Size tracking for partial files
└── data/<hash>.bitfield # Validated chunk tracking for partial files
```
### EntryState
```rust
// Simplified from src/store/fs/entry_state.rs
pub enum EntryState {
Complete(CompleteEntryState),
Partial(PartialEntryState),
}
pub struct CompleteEntryState {
pub data: DataLocation, // Inline, Owned (canonical path), or External (user path)
pub outboard: OutboardLocation, // Inline, Owned, or NotNeeded
pub size: u64,
}
pub enum DataLocation {
Inline, // Stored in redb inline_data table
Owned, // File at canonical path <hash>.data
External(Vec<PathBuf>), // User-owned file paths
}
pub enum OutboardLocation {
Inline, // Stored in redb inline_outboard table
Owned, // File at canonical path <hash>.outboard
NotNeeded, // Data ≤ 16 KiB, no outboard needed
}
pub struct PartialEntryState {
// Either we know the verified size, or we don't yet
pub verified_size: Option<NonZeroU64>,
}
```
### Thresholds
- **Data inline threshold**: 16 KiB (default) — blobs up to this size are stored entirely in redb
- **Outboard inline threshold**: 16 KiB (default) — outboards up to this size are stored in redb
- Data ≤ 16 KiB has no outboard (not needed for verification of a single chunk group)
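The inlining rules can be expressed as a small decision function. The sketch below rests on two assumptions that the text does not confirm: both thresholds are treated as inclusive, and the outboard is taken to hold one 64-byte hash pair per internal tree node with no extra header. All names are hypothetical.

```rust
const GROUP_SIZE: u64 = 16 * 1024;
const MAX_DATA_INLINED: u64 = 16 * 1024; // assumed inclusive
const MAX_OUTBOARD_INLINED: u64 = 16 * 1024; // assumed inclusive
const HASH_PAIR: u64 = 64;

#[derive(Debug, PartialEq)]
enum Location {
    Inline,
    File,
    NotNeeded,
}

/// Outboard size under the assumption of one hash pair per internal node:
/// a tree with `n` chunk groups has `n - 1` internal nodes.
fn outboard_size(data_size: u64) -> u64 {
    let groups = data_size.div_ceil(GROUP_SIZE).max(1);
    (groups - 1) * HASH_PAIR
}

fn data_location(size: u64) -> Location {
    if size <= MAX_DATA_INLINED {
        Location::Inline
    } else {
        Location::File
    }
}

fn outboard_location(data_size: u64) -> Location {
    match outboard_size(data_size) {
        0 => Location::NotNeeded, // a single chunk group needs no tree
        n if n <= MAX_OUTBOARD_INLINED => Location::Inline,
        _ => Location::File,
    }
}
```

Under these assumptions a 1 MiB blob keeps its data in a file but its roughly 4 KiB outboard in redb, while a 1 GiB blob stores both on the filesystem.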
### Blob Lifecycle
**Adding a local file (known data, unknown hash)**:
1. Compute the full BLAKE3 hash and outboard
2. Atomically move the file into the store under the hash name
3. Apply inlining rules: small files → redb, large files → filesystem
**Syncing from remote (known hash, unknown data)**:
1. Start with no data — keep state in memory (not in database)
2. As chunks arrive, write incrementally to partial files
3. Once size is known to exceed the inline threshold, create database entry + filesystem files
4. On completion, transition to `Complete` state and apply inlining rules
**Deletion**:
- Tags protect content from GC
- `TempTag` provides ephemeral (process-lifetime) protection
- HashSeq tags protect the root blob AND all referenced child blobs
- GC is mark-and-sweep: mark all reachable content via tags → sweep (delete) everything else
- Explicit `force` deletion bypasses protection (emergency use only)
### FsStore Actor Architecture
```
FsStore (ApiClient)
└── MainActor (tokio task)
    ├── TaskContext { config, db_actor_sender }
    ├── EntityMap: HashMap<Hash, ActiveEntityState>  // Currently active entities
    ├── JoinSet<TaskResult>  // Running tasks
    ├── TempTags  // Ephemeral protection
    ├── ProtectedSet  // GC protection
    └── idle_waiters
```
The FsStore uses an **entity manager** pattern where each hash gets a `BaoFileHandle` (like MemStore) when active, and entries are cleaned up when tasks complete.
## Garbage Collection
```rust
pub struct GcConfig {
pub interval: Duration,
pub add_protected: Option<ProtectCb>, // Optional callback to add more protected hashes
}
```
GC is a two-phase process:
1. **Mark**: Walk all tags (persistent + temp), collect reachable hashes. For HashSeq format, traverse the hash sequence to find all child hashes.
2. **Sweep**: Delete all blobs not in the reachable set, in batches of 100.
GC runs automatically at a configurable interval via `run_gc(store, config)`, or manually via `gc_run_once(store, live)`.
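The two phases can be illustrated on an in-memory model. This sketch uses only standard collections: `Id` is a `u64` stand-in for a hash, the `children` map stands in for parsing a HashSeq blob, and `mark` and `sweep` are hypothetical names rather than the crate's functions.

```rust
use std::collections::{HashMap, HashSet};

type Id = u64; // stand-in for a 32-byte hash

/// Mark phase: every tagged root is live; for HashSeq-format tags the
/// children listed in the sequence are live too.
fn mark(tags: &[(Id, bool)], children: &HashMap<Id, Vec<Id>>) -> HashSet<Id> {
    let mut live = HashSet::new();
    for &(root, is_hash_seq) in tags {
        live.insert(root);
        if is_hash_seq {
            if let Some(kids) = children.get(&root) {
                live.extend(kids.iter().copied());
            }
        }
    }
    live
}

/// Sweep phase: delete everything not marked live, in batches of 100.
/// Returns the number of deleted entries.
fn sweep(store: &mut HashSet<Id>, live: &HashSet<Id>) -> usize {
    let dead: Vec<Id> = store.iter().copied().filter(|id| !live.contains(id)).collect();
    for batch in dead.chunks(100) {
        for id in batch {
            store.remove(id);
        }
    }
    dead.len()
}
```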


@@ -0,0 +1,202 @@
# iroh-blobs: Remote API and Downloader
## Remote API
The `Remote` type (`api::remote::Remote`) provides the client-side interface for interacting with remote iroh-blobs providers. It's a thin wrapper around `ApiClient` that exposes fetch, observe, and push operations.
```rust
let remote = store.remote(); // or Remote::from_sender(client)
// Get local info about what we already have
let local = remote.local(hash_and_format).await?;
// Compute what we need
let missing = local.missing();
// Execute a download
let stats = remote.execute_get(connection, request).await?;
// Or use the simpler fetch API
let progress = remote.fetch(connection, hash, format, store);
```
### LocalInfo
```rust
pub struct LocalInfo {
pub size: Option<u64>, // Total size if known
pub present: ChunkRanges, // Chunks we already have
pub missing: ChunkRanges, // Chunks we still need
pub hash_and_format: HashAndFormat,
}
```
`LocalInfo` is computed by querying the local store's bitfield for a given hash and comparing it against what a full download would require.
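The comparison amounts to subtracting the present ranges from the full range of the blob. A minimal sketch with plain `Range<u64>` values in place of `ChunkRanges`; `missing` here is a hypothetical free function, and `present` is assumed to be sorted and non-overlapping.

```rust
use std::ops::Range;

/// Chunk ranges still needed for a blob of `total_chunks` chunks, given the
/// sorted, non-overlapping ranges already present locally.
fn missing(total_chunks: u64, present: &[Range<u64>]) -> Vec<Range<u64>> {
    let mut out = Vec::new();
    let mut cursor = 0u64; // first chunk not yet accounted for
    for r in present {
        let start = r.start.min(total_chunks);
        let end = r.end.min(total_chunks);
        if start > cursor {
            out.push(cursor..start); // gap before this present range
        }
        cursor = cursor.max(end);
    }
    if cursor < total_chunks {
        out.push(cursor..total_chunks); // tail after the last present range
    }
    out
}
```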
### Fetch Process
The `fetch` method handles the complete lifecycle:
1. **Local check**: Query the store for what we already have
2. **Request computation**: If format is HashSeq, read the local HashSeq to compute precise missing ranges
3. **Connection**: Open a QUIC stream to the provider
4. **Transfer**: Use the get FSM to stream data into the store
5. **Verification**: BLAKE3 verification happens in-stream during the transfer
For HashSeq format:
- First fetch the root blob (the HashSeq)
- Parse it to get child hashes
- For each child, check local availability and compute missing ranges
- Fetch only what's missing
### Observe
```rust
// Subscribe to bitfield updates from a remote provider
let mut stream = remote.observe(connection, hash).stream().await?;
while let Some(bitfield) = stream.next().await {
// Process availability updates
}
```
The observe protocol sends `ObserveItem` messages (size + available ranges) whenever new chunks become available on the provider. The initial message contains the full current state; subsequent messages contain only deltas.
### Push
```rust
// Push local data to a remote provider
let progress = remote.push(connection, request, store);
```
Push uses the same FSM-style approach but in reverse — the local side reads from the store and writes BLAKE3-verified data to the QUIC stream.
## Downloader API
The `Downloader` (`api::downloader::Downloader`) coordinates downloads from multiple sources:
```rust
let downloader = Downloader::new(store, endpoint);
// Download from specific providers
let progress = downloader.download(DownloadRequest {
request: FiniteRequest::Get(get_request),
providers: vec![endpoint_id_1, endpoint_id_2],
strategy: SplitStrategy::Split,
}).stream();
```
### SplitStrategy
```rust
pub enum SplitStrategy {
Split, // Split the request across multiple providers
None, // Use a single provider
}
```
When `SplitStrategy::Split` is used, the downloader:
1. Splits the `GetRequest` into per-child requests
2. Distributes children across available providers
3. Downloads in parallel from multiple sources
4. Stores each completed child into the local store
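
How children are distributed across providers is not specified here. The sketch below shows one possible policy, plain round-robin, purely as an illustration; `assign_children` and the string provider ids are hypothetical and do not reflect the crate's actual scheduling or fallback logic.

```rust
use std::collections::BTreeMap;

/// Assign child indices `0..children` to providers in round-robin order.
/// Hypothetical policy for illustration only; the real downloader may
/// retry failed providers or balance work differently.
fn assign_children<'a>(
    children: usize,
    providers: &[&'a str],
) -> BTreeMap<&'a str, Vec<usize>> {
    let mut plan: BTreeMap<&'a str, Vec<usize>> = BTreeMap::new();
    if providers.is_empty() {
        return plan; // nothing can be scheduled without a provider
    }
    for child in 0..children {
        let provider = providers[child % providers.len()];
        plan.entry(provider).or_default().push(child);
    }
    plan
}
```

With five children and two providers, the first provider receives children 0, 2, and 4, and the second receives 1 and 3.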
### DownloadRequest
```rust
pub struct DownloadRequest {
pub request: FiniteRequest, // What to download
pub providers: Vec<EndpointId>, // Who to download from
pub strategy: SplitStrategy, // How to split work
}
pub enum FiniteRequest {
Get(GetRequest),
GetMany(GetManyRequest),
}
```
### Download Progress
```rust
pub enum DownloadProgressItem {
TryProvider { id: EndpointId, request: Arc<GetRequest> },
ProviderFailed { id: EndpointId, request: Arc<GetRequest> },
PartComplete { request: Arc<GetRequest> },
Progress(u64),
DownloadError,
}
```
## Connection Pooling
The `util::connection_pool::ConnectionPool` manages reusable QUIC connections:
```rust
let pool = ConnectionPool::new(endpoint, ALPN, options);
let connection = pool.connect(endpoint_id).await?;
```
Options include connection timeout, idle timeout, and maximum connections per peer.
## Integration with iroh
### BlobsProtocol
```rust
// src/net_protocol.rs
pub struct BlobsProtocol {
    inner: Arc<BlobsInner>, // (Store, EventSender)
}

impl ProtocolHandler for BlobsProtocol {
    async fn accept(&self, conn: Connection) -> Result<(), AcceptError> {
        // `store` and `events` are taken from `self.inner`; the field access is elided here.
        crate::provider::handle_connection(conn, store, events).await;
        Ok(())
    }

    async fn shutdown(&self) { /* shutdown store */ }
}
```
Usage with iroh Router:
```rust
let endpoint = Endpoint::bind(presets::N0).await?;
let store = MemStore::new(); // or FsStore::load(path).await?
let blobs = BlobsProtocol::new(&store, None);
let router = Router::builder(endpoint)
.accept(iroh_blobs::ALPN, blobs)
.spawn();
```
### Creating a BlobTicket
```rust
let endpoint = Endpoint::bind(presets::N0).await?;
endpoint.online().await;
let addr = endpoint.addr();
let tag = store.add_slice(b"hello world").await?;
let ticket = BlobTicket::new(addr, tag.hash, tag.format);
println!("Share this: {ticket}");
```
### Fetching from a Ticket
```rust
// On the requester side
let ticket: BlobTicket = ticket_str.parse()?;
let (addr, hash, format) = ticket.into_parts();
let endpoint = Endpoint::bind(presets::N0).await?;
let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;
let request = match format {
BlobFormat::Raw => GetRequest::blob(hash),
BlobFormat::HashSeq => GetRequest::all(hash),
};
// Use the get FSM
let fsm = get::fsm::start(conn, request, RequestCounters::default());
let connected = fsm.next().await?;
// ... drive the FSM to completion
```

@@ -0,0 +1,312 @@
# iroh-blobs: Data Flow and Complete Example
## Complete Data Flow: Provider Side
```
                    QUIC Connection Arrives
                               │
                               ▼
            handle_connection(conn, store, events)
                               │
                    ┌──────────┴──────────┐
                    │  Accept QUIC BIDI   │
                    │  streams in loop    │
                    └──────────┬──────────┘
                               ▼
                  handle_stream(pair, store)
                               │
                    ┌──────────┴──────────┐
                    │  Read Request type  │
                    │  byte + deserialize │
                    └──────────┬──────────┘
     ┌────────────┬────────────┼────────────┬────────────┐
     │            │            │            │            │
handle_get   handle_get     handle       handle     (reserved)
                _many      _observe       _push
     │            │            │            │
     ▼            ▼            ▼            ▼
┌──────────────────────────────────────────────────────────┐
│ For each (offset, ranges) in request.ranges:             │
│                                                          │
│   if offset == 0:                                        │
│     send_blob(store, 0, hash, ranges, writer)            │
│   else:                                                  │
│     lookup hash in HashSeq[offset-1]                     │
│     send_blob(store, offset, child_hash, ranges, writer) │
│                                                          │
│ send_blob:                                               │
│   store.export_bao(hash, ranges)                         │
│     .write_with_progress(writer, ctx, &hash, idx)        │
└──────────────────────────────────────────────────────────┘
```
## Complete Data Flow: Requester Side (Get FSM)
```
              Create GetRequest
                      │
                      ▼
  fsm::start(connection, request, counters)
                      │
                      ▼
              AtInitial.next()
                      │ (open_bi, send request)
                      ▼
             AtConnected.next()
        ┌─────────────┼─────────────┐
        │             │             │
    StartRoot    StartChild      Closing
   (offset=0)    (offset>0)      (empty)
        │             │             │
        ▼             ▼             ▼
  AtBlobHeader  AtBlobHeader    AtClosing
     .next()     .next(hash)     .next()
        │             │             │
        ▼             ▼             ▼
     (size, AtBlobContent)        Stats
               │
      ┌────────┴────────┐
      │                 │
 More(item)           Done
(loop back to      (AtEndBlob)
 AtBlobContent)         │
                ┌───────┴───────┐
                │               │
          MoreChildren       Closing
         (AtStartChild)    (AtClosing)
                │               │
                └───────────────┘
```
### Blob Content Items
During `AtBlobContent`, items arrive as `BaoContentItem`:
```rust
pub enum BaoContentItem {
Parent(ParentNode), // (node, (left_hash, right_hash)) — 64 bytes
Leaf(Leaf), // { offset: u64, data: Bytes } — actual data
}
```
- **Parent nodes** contain BLAKE3 hash pairs for tree verification. They are pure verification overhead: 64 bytes (two 32-byte hashes) per internal node.
- **Leaf nodes** contain actual data chunks. Each leaf's data is at most `IROH_BLOCK_SIZE` bytes (16 KiB).
Verification is automatic: the `ResponseDecoder` from `bao-tree` validates each chunk against the expected hash tree rooted at the request hash.
## Blob Verification and BaoTree Encoding
### How BLAKE3 Verified Streaming Works
1. **The hash is the root** of a binary Merkle tree
2. **Internal nodes** store `(left_child_hash, right_child_hash)` — 64 bytes each
3. **Leaf nodes** store the actual data chunks (up to 1024 bytes each in standard BLAKE3, or 16 KiB in iroh's block size)
4. **Chunk groups** (16 chunks = 16 KiB) are the minimum verification unit in iroh-blobs
For a request with specific ranges:
- The provider traverses the tree, yielding only nodes needed to verify the requested ranges
- The requester can verify each chunk group independently after receiving its parent hash pair
- Corruption is detected at chunk-group granularity: at most 16 KiB (one chunk group) is received before a mismatch is caught
### Outboard Storage
The **outboard** is the BLAKE3 hash tree stored separately from the data. For the provider:
- Small blobs (≤16 KiB): outboard is empty (not needed, single chunk group)
- Large blobs: outboard stored as `PreOrderMemOutboard` (in-memory) or as a file (filesystem store)
For the requester, the outboard is built incrementally as data arrives.
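
The sizes involved can be estimated with a small sketch. It relies only on the figures stated above (1024-byte chunks, 16 chunks per group, 64 bytes per hash pair) and on the fact that a binary tree with `n` leaves has `n - 1` internal nodes. The function names are hypothetical, and any header the stored outboard format may add is ignored.

```rust
const CHUNK_SIZE: u64 = 1024;
const CHUNKS_PER_GROUP: u64 = 16;
const GROUP_SIZE: u64 = CHUNK_SIZE * CHUNKS_PER_GROUP; // 16 KiB

/// Number of chunk groups (tree leaves). An empty blob still has one leaf.
fn chunk_groups(size: u64) -> u64 {
    size.div_ceil(GROUP_SIZE).max(1)
}

/// Internal (parent) nodes of a binary tree with that many leaves.
fn parent_nodes(size: u64) -> u64 {
    chunk_groups(size) - 1
}

/// Bytes of hash pairs needed to verify the whole blob (64 bytes per parent).
fn hash_pair_bytes(size: u64) -> u64 {
    parent_nodes(size) * 64
}
```

Under these assumptions a 1 MiB blob has 64 chunk groups and 63 parent nodes, so roughly 4 KiB of hash pairs, while any blob of at most 16 KiB needs none.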
## Import and Export Flows
### Import Bytes (Local Data)
```
add_bytes(data) / add_slice(data)
ImportBytesRequest { data, format, scope }
Actor::import_bytes()
│ 1. Send AddProgressItem::Size(len)
│ 2. Send AddProgressItem::CopyDone
│ 3. Compute outboard: PreOrderMemOutboard::create(&data, IROH_BLOCK_SIZE)
│ 4. Return ImportEntry { data, outboard, scope, format, tx }
Actor::finish_import()
│ 1. Get hash from outboard.root()
│ 2. Get or create BaoFileHandle for hash
│ 3. Transition BaoFileStorage::Partial → Complete
│ 4. Create TempTag for the hash_and_format
│ 5. Send AddProgressItem::Done(temp_tag)
```
### Import BAO Stream (Remote Data)
```
import_bao_bytes(hash, ranges, data) / import_bao_reader(hash, ranges, reader)
ImportBaoRequest { hash, size }
Actor::import_bao()
│ 1. Set size on partial entry
│ 2. Create BaoTree for the size
│ 3. For each BaoContentItem from stream:
│ - Parent: write hash pair to outboard
│ - Leaf: write data to storage, update bitfield
│ - If bitfield becomes complete: transition Partial → Complete
│ 4. Send result
```
### Export BAO
```
export_bao(hash, ranges) → ExportBao
Actor::export_bao()
│ 1. Look up BaoFileHandle for hash
│ 2. If not found: send EncodeError::NotFound and return
│ 3. Create BaoTreeSender from data + outboard readers
│ 4. Call traverse_ranges_validated(data, outboard, &ranges, tx)
│ → streams validated BAO items to the sender
```
### Export Path (To Filesystem)
```
export(hash, target_path) → ExportPath
Actor::export_path()
│ 1. Look up BaoFileHandle for hash
│ 2. Create parent directories if needed
│ 3. Create file at target_path
│ 4. Send ExportProgressItem::Size(total_size)
│ 5. Read data from store in 64 KiB chunks
│ 6. Write to file, yielding ExportProgressItem::CopyProgress(offset)
│ 7. Send ExportProgressItem::Done
```
## Observe Protocol Detail
```
Requester Provider
│ │
│ ObserveRequest {hash, ranges} │
│─────────────────────────────────►│
│ │
│ ObserveItem {size, ranges} │ (initial state)
│◄─────────────────────────────────│
│ │
│ ... (time passes, more data │
│ becomes available) │
│ │
│ ObserveItem {size, ranges} │ (delta update)
│◄─────────────────────────────────│
│ │
│ ... (continue until │
│ requester stops │
│ or connection closes) │
│ │
│ STOP_STREAM │
│─────────────────────────────────►│
```
The observe protocol uses `Bitfield::diff()` to send only the new chunks since the last update, minimizing bandwidth.
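
The delta idea can be sketched as follows, modelling a bitfield as a set of available chunk-group indices. This is an assumption for illustration: the real `Bitfield` carries chunk ranges plus a size, and the exact semantics of `Bitfield::diff()` are not reproduced here. `delta` and `apply_delta` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Chunk groups that are available now but were not in the last update.
fn delta(previous: &BTreeSet<u64>, current: &BTreeSet<u64>) -> BTreeSet<u64> {
    current.difference(previous).copied().collect()
}

/// Requester side: fold a received delta into the locally tracked state.
fn apply_delta(state: &mut BTreeSet<u64>, update: &BTreeSet<u64>) {
    state.extend(update.iter().copied());
}
```

Starting from the initial full state and applying each delta in order reproduces the provider's current availability without resending ranges the requester already knows about.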
## Full Working Example
```rust
use iroh::{protocol::Router, Endpoint, endpoint::presets};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol, ticket::BlobTicket, BlobFormat};

// === Provider Side ===
async fn provider() -> anyhow::Result<()> {
    let endpoint = Endpoint::bind(presets::N0).await?;
    let store = MemStore::new();
    // Add some data
    let tag = store.add_slice(b"Hello, iroh-blobs!").await?;
    let _ = endpoint.online().await;
    let addr = endpoint.addr();
    // Create ticket for sharing
    let ticket = BlobTicket::new(addr, tag.hash, BlobFormat::Raw);
    println!("Ticket: {ticket}");
    // Start serving
    let blobs = BlobsProtocol::new(&store, None);
    let router = Router::builder(endpoint)
        .accept(iroh_blobs::ALPN, blobs)
        .spawn();
    tokio::signal::ctrl_c().await?;
    router.shutdown().await?;
    Ok(())
}

// === Requester Side ===
async fn requester(ticket: BlobTicket) -> anyhow::Result<()> {
    let (addr, hash, format) = ticket.into_parts();
    let endpoint = Endpoint::bind(presets::N0).await?;
    let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;
    // Build request based on format
    let request = match format {
        BlobFormat::Raw => iroh_blobs::protocol::GetRequest::blob(hash),
        BlobFormat::HashSeq => iroh_blobs::protocol::GetRequest::all(hash),
    };
    // Use the get FSM: AtInitial -> AtConnected -> ConnectedNext
    let at_initial = iroh_blobs::get::fsm::start(conn, request, Default::default());
    let at_connected = at_initial.next().await?;
    match at_connected.next().await? {
        iroh_blobs::get::fsm::ConnectedNext::StartRoot(at_root) => {
            // Read the blob header (size), then collect all leaf data
            let (at_content, size) = at_root.next().next().await?;
            let (_at_end, data) = at_content.concatenate_into_vec().await?;
            println!("Got {} bytes: {:?}", size, data);
            // ...
        }
        iroh_blobs::get::fsm::ConnectedNext::StartChild(_at_child) => {
            // Need to know the child hash
        }
        iroh_blobs::get::fsm::ConnectedNext::Closing(_at_closing) => {
            println!("Empty response");
        }
    }
    Ok(())
}
```
## Simplified Fetch (Using Store + Remote)
```rust
// The simplest way to download data
let store = MemStore::new();
let remote = store.remote();
// Fetch with automatic local availability checking
let result = remote.fetch(connection, hash, format, &store).await?;
// Result includes Stats with transfer metrics
```
## Key Error Types
| Error Type | Location | Purpose |
|------------|----------|---------|
| `GetError` | `get::error` | Errors during get FSM |
| `ExportBaoError` | `api` | Errors during BAO export |
| `RequestError` | `api` | Store command errors |
| `DecodeError` | `get::fsm` | BAO stream decode errors |
| `ProgressError` | `provider::events` | Provider event errors |

@@ -0,0 +1,60 @@
# iroh-blobs Reference Documentation
This directory contains a comprehensive reference for the `iroh-blobs` crate (v0.100.0), a Rust library for content-addressed blob transfer over QUIC connections using BLAKE3 verified streaming.
## Documents
1. **[Overview and Architecture](01-overview-and-architecture.md)** — Core concepts, module structure, feature flags, and architecture diagram. Start here.
2. **[Key Types and Data Structures](02-key-types.md)** — Detailed reference for `Hash`, `BlobFormat`, `HashAndFormat`, `HashSeq`, `Bitfield`, `Tag`, `TempTag`, `BlobTicket`, `ChunkRanges`/`ChunkRangesSeq`/`RangeSpec`, and the store command protocol.
3. **[Transfer Protocol](03-transfer-protocol.md)** — Wire protocol specification: request types (`GetRequest`, `GetManyRequest`, `PushRequest`, `ObserveRequest`), response format (BLAKE3 verified streaming), the client-side FSM, provider handling, event system, and the Collection format.
4. **[Storage Architecture](04-storage.md)** — Store implementations: `MemStore` (in-memory), `FsStore` (hybrid redb + filesystem), `ReadonlyMemStore`. Covers the actor pattern, `BaoFileHandle`/`BaoFileStorage`, partial/complete states, the hybrid inline/file approach, entry states, blob lifecycle, and garbage collection.
5. **[Remote API and Downloader](05-remote-and-downloader.md)** — `Remote` API for fetching from/observing/pushing to peers, `Downloader` for multi-source downloads, connection pooling, and iroh integration via `BlobsProtocol`.
6. **[Data Flow and Examples](06-data-flow-and-examples.md)** — End-to-end data flow diagrams for provider and requester sides, BLAKE3 verification mechanics, import/export flows, observe protocol detail, and complete working examples.
## Quick Reference
### Creating a Provider
```rust
use iroh::{protocol::Router, Endpoint, endpoint::presets};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol};
let endpoint = Endpoint::bind(presets::N0).await?;
let store = MemStore::new();
let tag = store.add_slice(b"data").await?;
let blobs = BlobsProtocol::new(&store, None);
let router = Router::builder(endpoint)
.accept(iroh_blobs::ALPN, blobs)
.spawn();
```
### Key Constants
| Constant | Value | Meaning |
|----------|-------|---------|
| `ALPN` | `b"/iroh-bytes/4"` | QUIC ALPN protocol identifier |
| `IROH_BLOCK_SIZE` | `BlockSize::from_chunk_log(4)` | 16 KiB chunk groups |
| `MAX_MESSAGE_SIZE` | `1 MiB` | Maximum request message size |
| `Hash::EMPTY` | BLAKE3 of `b""` | Hash of the empty blob |
### Core Crate Exports
```rust
pub use hash::{BlobFormat, Hash, HashAndFormat};
pub use hashseq::HashSeq;
pub use net_protocol::BlobsProtocol;
pub use protocol::ALPN;
pub mod api; // Store API, Blobs, Tags, Downloader, Remote
pub mod format; // Collection type
pub mod get; // Client-side FSM
pub mod protocol; // Wire protocol types (GetRequest, etc.)
pub mod provider; // Server-side handling
pub mod store; // Storage implementations
pub mod ticket; // BlobTicket
pub mod util; // Connection pool, temp tags, stream helpers
```

@@ -0,0 +1,98 @@
# iroh-docs: Overview and Architecture
> Reference document for the `iroh-docs` crate (v0.98.0).
> Source: `/workspace/iroh-docs`
## What Is iroh-docs?
`iroh-docs` is a Rust crate implementing **multi-dimensional key-value documents with an efficient synchronization protocol**. It provides:
1. **A CRDT-based document model** — Replicas (documents) hold entries identified by namespace + author + key, with content-addressed values (BLAKE3 hashes).
2. **Range-based set reconciliation** — An efficient sync protocol based on [Aljoscha Meyer's paper](https://arxiv.org/abs/2212.13567) for reconciling sets between peers.
3. **Live sync via gossip** — Real-time document updates propagated through an iroh-gossip swarm.
4. **Persistent storage** — A `redb`-backed store supporting both in-memory and file-based modes.
## High-Level Architecture
```
┌──────────────────────────────────────────────────────────────┐
│ Docs (Protocol) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Engine │ │
│ │ ┌──────────┐ ┌──────────────┐ ┌───────────────────┐ │ │
│ │ │ LiveActor│ │ GossipState │ │ SyncHandle/Actor │ │ │
│ │ │ (events) │ │ (iroh-gossip)│ │ (store + sync) │ │ │
│ │ └──────────┘ └──────────────┘ └───────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Replica │ │ SignedEntry │ │ Author/ │ │
│ │ (sync.rs) │ │ Entry/Record │ │ Namespace keys │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Store (redb) │ │
│ │ Authors │ Namespaces │ Records │ RecordsByKey │ ... │ │
│ └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
```
### Module Layout
| Module | Purpose |
|--------|---------|
| `sync.rs` | Core types: `Replica`, `Entry`, `SignedEntry`, `Record`, `RecordIdentifier`, `Capability`, events |
| `keys.rs` | Cryptographic key types: `Author`, `NamespaceSecret`, `AuthorId`, `NamespaceId` |
| `ranger.rs` | Range-based set reconciliation algorithm implementation |
| `heads.rs` | `AuthorHeads` — latest timestamps per author for efficient sync decisions |
| `store/` | Storage abstraction and `redb`-backed persistent store |
| `store/fs.rs` | File-based `Store` implementation with redb tables |
| `store/pubkeys.rs` | `PublicKeyStore` trait for caching expanded ed25519 public keys |
| `actor.rs` | `SyncHandle` / Actor — single-threaded executor for store and replica operations |
| `engine/` | Live sync coordination: `Engine`, `LiveActor`, `GossipState`, `NamespaceStates` |
| `engine/live.rs` | The `LiveActor` event loop: handles sync, gossip, content download |
| `engine/gossip.rs` | Integration with `iroh-gossip` for broadcasting document operations |
| `engine/state.rs` | `NamespaceStates` — tracks per-namespace, per-peer sync state |
| `net/` | Network protocol: ALPN `/iroh-sync/1`, connection handling |
| `net/codec.rs` | Wire codec: length-prefixed postcard-serialized `Message` frames |
| `protocol.rs` | `Docs` struct (the `ProtocolHandler`) and `Builder` |
| `api/` | irpc-based RPC API for external access |
| `ticket.rs` | `DocTicket` — shareable document capability + peer addresses |
## Key Design Principles
1. **Two-key identity model**: Every entry is uniquely identified by (namespace, author, key). The namespace key provides write authorization; the author key provides attribution.
2. **Content-addressed values**: Entries store a BLAKE3 hash + length, not the actual content. Content blobs are handled separately by `iroh-blobs`.
3. **Prefix deletion**: Inserting an entry with key "foo" removes older entries by the same author whose keys start with "foo" (a byte-wise prefix match, so "foo/bar" is covered). An empty entry inserted this way acts as a tombstone. This enables hierarchical key structures.
4. **Last-writer-wins with per-author timestamps**: Entries are ordered by (timestamp, hash). Newer entries dominate older ones. Different authors can have entries for the same key simultaneously (multi-dimensional).
5. **Actor-based concurrency**: All store and replica mutations go through a single `SyncHandle` actor thread, eliminating the need for locks on the store.
6. **Event-driven live sync**: The `LiveActor` coordinates gossip, direct sync, and content downloads through a `tokio::select!` event loop.
## Dependencies
Key dependencies from `Cargo.toml`:
| Crate | Purpose |
|-------|---------|
| `iroh` | Networking: endpoints, connections, protocol routing |
| `iroh-blobs` | Content-addressed blob storage and transfer |
| `iroh-gossip` | Gossip protocol for broadcasting updates |
| `iroh-tickets` | Ticket-based sharing mechanism |
| `redb` | Embedded key-value store for persistence |
| `ed25519-dalek` | Ed25519 signatures for entries |
| `blake3` | Hashing (fingerprints + content hashes) |
| `postcard` | Serialization (wire format for sync protocol) |
| `irpc` / `noq` | RPC framework for API |
## Feature Flags
| Feature | Default | Description |
|---------|---------|-------------|
| `metrics` | Yes | Enables iroh-metrics instrumentation |
| `rpc` | Yes | Enables irpc-based RPC API (depends on `noq`) |
| `fs-store` | Yes | Enables persistent file-based store |

@@ -0,0 +1,201 @@
# iroh-docs: Document Model and CRDT Details
## Core Data Model
### Namespace (Document Identity)
A **Namespace** is the identity of a document. It consists of:
- **`NamespaceSecret`** — An Ed25519 signing key (32 bytes) that grants write capability
- **`NamespacePublicKey`** — The corresponding verifying key (32 bytes)
- **`NamespaceId`** — A `[u8; 32]` that is the byte representation of the public key; this serves as the unique identifier for a document/replica
```
NamespaceSecret (signing key) ──derives──▶ NamespacePublicKey (verifying key)
──into─────▶ NamespaceId ([u8; 32])
```
### Author (Writer Identity)
An **Author** represents a writer identity within a document. Multiple authors can write to the same namespace.
- **`Author`** — An Ed25519 signing key (32 bytes)
- **`AuthorPublicKey`** — The corresponding verifying key (32 bytes)
- **`AuthorId`** — A `[u8; 32]` byte representation of the public key
Authors are application-defined: an application might create one author per device, per user, or per session.
### Capability
Access to a document is controlled through a `Capability`:
```rust
pub enum Capability {
Write(NamespaceSecret), // Full read-write access
Read(NamespaceId), // Read-only access (can sync but not insert)
}
```
Capabilities can be **merged** — a `Read` capability can be upgraded to `Write` if a matching `Write` is presented:
```rust
capability.merge(other_capability) // Read + Write → Write
```
The raw representation is `(u8, [u8; 32])` — a kind byte followed by 32 bytes of key material.
### Entry (The Fundamental Record)
An **`Entry`** is the core data unit, consisting of:
```rust
pub struct Entry {
id: RecordIdentifier, // (namespace, author, key)
record: Record, // (hash, len, timestamp)
}
```
#### RecordIdentifier
```rust
pub struct RecordIdentifier(Bytes); // namespace[0..32] || author[32..64] || key[64..]
```
The key is a variable-length byte sequence. `RecordIdentifier` implements `Ord` by comparing namespace first, then author, then key — this ordering is critical for the range-based sync algorithm.
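
A minimal sketch of that layout, using a plain `Vec<u8>` instead of `Bytes`. `record_id` and `key_of` are hypothetical helpers. Because the namespace and author fields have a fixed width, ordinary lexicographic byte comparison yields the namespace, then author, then key ordering described above.

```rust
/// Build the byte layout namespace[0..32] || author[32..64] || key[64..].
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut id = Vec::with_capacity(64 + key.len());
    id.extend_from_slice(namespace);
    id.extend_from_slice(author);
    id.extend_from_slice(key);
    id
}

/// Slice out the key portion. Panics if `id` is shorter than 64 bytes.
fn key_of(id: &[u8]) -> &[u8] {
    &id[64..]
}
```

Two identifiers in the same namespace sort by author first, so an entry with key `zzz` from a "smaller" author still sorts before key `aaa` from a "larger" one.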
#### Record
```rust
pub struct Record {
len: u64, // byte length of the content
hash: Hash, // BLAKE3 hash of the content (32 bytes)
timestamp: u64, // microseconds since Unix epoch
}
```
The `Record` comparison uses `(timestamp, hash)` ordering — this is the **Last-Writer-Wins** rule for same-key entries. When two records for the same key exist, the one with the higher timestamp wins; if timestamps are equal, the higher hash wins as a tiebreaker.
### SignedEntry (Entry with Proofs)
```rust
pub struct SignedEntry {
signature: EntrySignature, // dual Ed25519 signatures
entry: Entry,
}
```
#### EntrySignature
```rust
pub struct EntrySignature {
author_signature: Signature, // 64-byte Ed25519 signature
namespace_signature: Signature, // 64-byte Ed25519 signature
}
```
Both signatures cover the canonical byte encoding of the `Entry` (id + record). This means:
- The **namespace signature** proves write authorization (only holders of `NamespaceSecret` can produce valid entries)
- The **author signature** proves authorship (provides attribution and non-repudiation)
#### Verification
```rust
fn verify<S: PublicKeyStore>(&self, store: &S) -> Result<(), SignatureError>
```
Verification requires both the `NamespacePublicKey` and `AuthorPublicKey`, which are derived from the entry's namespace and author IDs. The `PublicKeyStore` trait provides caching for these expanded keys.
### Empty Entries (Tombstones / Prefix Deletion)
An entry is **empty** when `hash == Hash::EMPTY && len == 0`. Empty entries serve as **deletion markers**:
- **Key deletion**: Inserting an empty entry with the exact key removes the previous entry for that key
- **Prefix deletion**: Inserting an empty entry with key "foo" removes all older entries by the same author whose keys start with "foo"
```rust
pub async fn delete_prefix(&mut self, prefix: impl AsRef<[u8]>, author: &Author) -> Result<usize, InsertError>
```
### Insert Semantics (CRDT Rules)
When a `SignedEntry` is inserted into a replica via `Store::put()` (the ranger store trait):
1. **Check prefixes**: Look up all existing entries whose key is a **prefix** of the new entry's key. If any prefix entry has a value `>=` the new entry's value, the new entry is **rejected** (`InsertOutcome::NotInserted`).
2. **Remove dominated entries**: Remove all existing entries whose key **starts with** the new entry's key (i.e., the new key is a prefix of theirs) AND whose value is `<=` the new entry's value.
3. **Insert**: If not rejected, the new entry is stored.
This implements a **prefix-aware last-writer-wins** CRDT:
- Newer entries for the same (namespace, author, key) tuple replace older ones
- A new entry at key "/foo" can delete all entries under "/foo/*" if it's newer
- Different authors can coexist on the same key — each author's latest entry is kept
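
The three rules can be modelled in a few lines. This is a simplified sketch, not the crate's store: it covers a single (namespace, author) pair, scans the whole map instead of using an index, and uses a `(timestamp, u64)` tuple as a stand-in for `(timestamp, hash)`. `Outcome` and this `put` are hypothetical, and the `removed` count of the real store may differ.

```rust
use std::collections::BTreeMap;

/// (timestamp, stand-in for the content hash); tuples compare in that order.
type Value = (u64, u64);

#[derive(Debug, PartialEq)]
enum Outcome {
    NotInserted,
    Inserted { removed: usize },
}

fn put(store: &mut BTreeMap<Vec<u8>, Value>, key: &[u8], value: Value) -> Outcome {
    // 1. Reject if an existing entry whose key is a prefix of `key`
    //    (including `key` itself) has a value >= the new value.
    if store.iter().any(|(k, v)| key.starts_with(k) && *v >= value) {
        return Outcome::NotInserted;
    }
    // 2. Remove entries whose key starts with `key` and whose value is <= the new value.
    let before = store.len();
    store.retain(|k, v| !(k.starts_with(key) && *v <= value));
    let removed = before - store.len();
    // 3. Insert the new entry.
    store.insert(key.to_vec(), value);
    Outcome::Inserted { removed }
}
```

In this model, writing `foo` at timestamp 3 removes `foo/a` and `foo/b` written at timestamps 1 and 2, and a later attempt to write `foo/c` at timestamp 2 is rejected because the `foo` entry is newer.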
### Timestamp and Future Shift
Timestamps are in **microseconds since Unix epoch**. There is a maximum allowed future shift:
```rust
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;
```
Entries with timestamps more than 10 minutes in the future of the local clock are rejected during validation.
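
A quick check of the arithmetic is worthwhile, because the constant is built from `as_millis()` while timestamps are described as microseconds. The sketch below evaluates the constant exactly as quoted; `TEN_MINUTES_MICROS` and `too_far_in_future` are hypothetical additions for comparison. The comparison code itself is not shown in this document, so whether a unit conversion happens elsewhere is something to verify against the source.

```rust
use std::time::Duration;

// Constant exactly as quoted above.
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;

/// Ten minutes expressed in microseconds, for comparison (hypothetical).
pub const TEN_MINUTES_MICROS: u64 = 10 * 60 * 1_000_000;

/// Hypothetical check: reject entries stamped later than `now + shift`,
/// with both timestamps in the same unit.
fn too_far_in_future(entry_ts: u64, now: u64) -> bool {
    entry_ts > now.saturating_add(MAX_TIMESTAMP_FUTURE_SHIFT)
}
```

The constant evaluates to 600,000, which is ten minutes in milliseconds. If it were added directly to a microsecond timestamp, the effective window would be 0.6 seconds rather than ten minutes.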
### Content Status
Each entry's content has an availability status:
```rust
pub enum ContentStatus {
Complete, // Content blob is fully available locally
Incomplete, // Partially available
Missing, // Not available
}
```
This status is communicated during sync to help peers decide whether to download content.
### AuthorHeads (Efficient Sync Optimization)
`AuthorHeads` tracks the latest timestamp for each author in a document:
```rust
pub struct AuthorHeads {
heads: BTreeMap<AuthorId, Timestamp>,
}
```
This enables a quick check: `has_news_for(other)` — comparing local and remote heads to determine whether sync would yield any new entries. If all timestamps are at least as recent locally, no sync is needed.
`AuthorHeads` can be serialized with a size limit, dropping the oldest entries when the limit is exceeded.
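
The head comparison can be sketched as a free function over two maps. The direction is an assumption: here `has_news_for(ours, theirs)` returns true when `ours` contains an author that `theirs` lacks, or a strictly newer timestamp for a shared author. The type aliases are stand-ins for the crate's `AuthorId` and `Timestamp`.

```rust
use std::collections::BTreeMap;

type AuthorId = [u8; 32];
type Timestamp = u64;

/// True if `ours` knows something `theirs` does not:
/// an unknown author, or a newer head for a known author.
fn has_news_for(
    ours: &BTreeMap<AuthorId, Timestamp>,
    theirs: &BTreeMap<AuthorId, Timestamp>,
) -> bool {
    ours.iter().any(|(author, ts)| match theirs.get(author) {
        None => true,
        Some(their_ts) => their_ts < ts,
    })
}
```

When the function returns false in both directions, neither side has anything the other is missing according to the heads, and a sync round can be skipped.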
## Event System
Replicas emit events through a subscription system:
```rust
pub enum Event {
LocalInsert {
namespace: NamespaceId,
entry: SignedEntry,
},
RemoteInsert {
namespace: NamespaceId,
entry: SignedEntry,
from: PeerIdBytes,
should_download: bool, // based on download policy
remote_content_status: ContentStatus,
},
}
```
Subscribers use `async_channel` for non-blocking notification delivery. The `ReplicaInfo::subscribe()` method registers a sender, and events are fanned out to all subscribers.
## Validation
Entry validation during insertion checks:
1. **Namespace match**: The entry's namespace must match the replica's namespace
2. **Signature verification**: For non-local entries, both namespace and author signatures are verified
3. **Timestamp check**: The entry must not be more than `MAX_TIMESTAMP_FUTURE_SHIFT` in the future
4. **Empty entry check**: An empty entry must have `hash == EMPTY && len == 0`, and a non-empty entry must have `len != 0`

@@ -0,0 +1,272 @@
# iroh-docs: Range-Based Set Reconciliation (Ranger)
## Overview
The sync protocol in iroh-docs is based on **Range-Based Set Reconciliation**, implementing the algorithm described in [Aljoscha Meyer's paper (arXiv:2212.13567)](https://arxiv.org/abs/2212.13567).
The core idea: two peers can efficiently compute the union of their entry sets by recursively partitioning the sets and comparing **fingerprints** (hashes) of partitions. When fingerprints match, no further work is needed. When they differ, the partition is subdivided until the difference can be resolved by sending the actual entries.
## Key Abstractions
### RangeEntry Trait
```rust
pub trait RangeEntry: Debug + Clone {
type Key: RangeKey;
type Value: RangeValue;
fn key(&self) -> &Self::Key;
fn value(&self) -> &Self::Value;
fn as_fingerprint(&self) -> Fingerprint;
}
```
`SignedEntry` implements `RangeEntry`:
- `Key` = `RecordIdentifier` (namespace || author || key bytes)
- `Value` = `Record` (timestamp, hash, len)
- Fingerprint = BLAKE3 hash of (namespace || author || key || timestamp || content_hash)
### RangeKey Trait
```rust
pub trait RangeKey: Sized + Debug + Ord + PartialEq + Clone + 'static {
fn is_prefix_of(&self, other: &Self) -> bool; // test-only
}
```
`RecordIdentifier` implements this with byte-level prefix matching over its `namespace || author || key` encoding. Matching on the key portion is what supports the hierarchical deletion semantics.
### RangeValue Trait
```rust
pub trait RangeValue: Sized + Debug + Ord + PartialEq + Clone + 'static {}
```
`Record` implements `RangeValue` with ordering by `(timestamp, hash)` — the Last-Writer-Wins ordering.
### Fingerprint
```rust
pub struct Fingerprint(pub [u8; 32]); // BLAKE3 hash
```
Fingerprints are computed by XOR-ing the individual entry fingerprints within a range. This means:
- The fingerprint of the empty set is `BLAKE3([])` (the hash of nothing)
- Adding/removing an entry toggles its contribution via XOR
- Equal sets produce equal fingerprints
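
The XOR construction can be demonstrated with a small sketch. To stay within the standard library it uses a stand-in digest built from `DefaultHasher`, which is NOT BLAKE3 and not cryptographically secure; only the combining step mirrors the description above. `digest` and `range_fingerprint` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Fingerprint = [u8; 32];

/// Stand-in 32-byte digest (NOT BLAKE3): four 8-byte lanes of a std hasher.
fn digest(data: &[u8]) -> Fingerprint {
    let mut out = [0u8; 32];
    for (lane, chunk) in out.chunks_mut(8).enumerate() {
        let mut hasher = DefaultHasher::new();
        (lane as u64).hash(&mut hasher);
        data.hash(&mut hasher);
        chunk.copy_from_slice(&hasher.finish().to_be_bytes());
    }
    out
}

/// Start from the digest of the empty input and XOR in each entry's digest.
fn range_fingerprint<'a>(entries: impl IntoIterator<Item = &'a [u8]>) -> Fingerprint {
    let mut fp = digest(&[]);
    for entry in entries {
        let d = digest(entry);
        for (acc, byte) in fp.iter_mut().zip(d) {
            *acc ^= byte;
        }
    }
    fp
}
```

Because XOR is commutative and self-inverse, the result does not depend on iteration order, and XOR-ing the same entry in twice cancels out. Note that equal fingerprints only indicate equal sets with high probability, not with certainty.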
## Range Concept
A `Range<K>` represents a half-open interval `[x, y)` in the key space, with special semantics:
```rust
pub(crate) struct Range<K> {
x: K,
y: K,
}
```
- `x == y`: The entire set (all elements)
- `x < y`: Standard half-open interval `[x, y)` — includes `x`, excludes `y`
- `x > y`: Wrapping range — elements from `x` to end + beginning to `y`
This wrapping range concept allows the algorithm to work with circular key spaces where the "first" element might be anywhere.
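
The three cases translate directly into a membership test. The sketch below uses a hypothetical `KeyRange` stand-in, since the crate's `Range<K>` is crate-private; the method name `contains` is likewise an assumption.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate-private `Range<K>`.
struct KeyRange<K> {
    x: K,
    y: K,
}

impl<K: Ord> KeyRange<K> {
    fn contains(&self, key: &K) -> bool {
        match self.x.cmp(&self.y) {
            // x == y: the range covers the entire set.
            Ordering::Equal => true,
            // x < y: ordinary half-open interval [x, y).
            Ordering::Less => &self.x <= key && key < &self.y,
            // x > y: wraps around, covering [x, end] plus [start, y).
            Ordering::Greater => &self.x <= key || key < &self.y,
        }
    }
}
```

For integer keys, the wrapping range with `x = 5` and `y = 2` contains 5, 7, and 1, but not 2 or 3.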
## Protocol Messages
```rust
pub type ProtocolMessage = crate::ranger::Message<SignedEntry>;
```
### Message Structure
```rust
pub struct Message<E: RangeEntry> {
parts: Vec<MessagePart<E>>,
}
pub enum MessagePart<E: RangeEntry> {
RangeFingerprint(RangeFingerprint<E::Key>), // "Here's a fingerprint for this range"
RangeItem(RangeItem<E>), // "Here are the entries in this range"
}
pub struct RangeFingerprint<K> {
range: Range<K>,
fingerprint: Fingerprint,
}
pub struct RangeItem<E: RangeEntry> {
range: Range<E::Key>,
values: Vec<(E, ContentStatus)>,
have_local: bool, // If true, sender already has these entries
}
```
The `have_local` flag is an optimization: when a peer sends entries AND indicates it already has them locally, the receiver doesn't need to send its own entries in that range back.
### Wire Format
Messages are serialized using `postcard` (a compact serde format) and framed with a 4-byte big-endian length prefix via `SyncCodec`:
```
┌─────────────────┬──────────────────────────────┐
│ u32 BE length │ postcard-encoded Message │
└─────────────────┴──────────────────────────────┘
```
Max message size: 1 GiB (`MAX_MESSAGE_SIZE = 1024 * 1024 * 1024`).
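
The framing can be sketched with the standard library alone. The payload is treated as opaque bytes (the postcard encoding step is not shown), and `encode_frame` and `decode_frame` are hypothetical helpers rather than the actual `SyncCodec` implementation.

```rust
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024; // 1 GiB, as stated above

/// Prefix `payload` with its length as a 4-byte big-endian integer.
fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, String> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(format!("payload of {} bytes exceeds limit", payload.len()));
    }
    let len = payload.len() as u32; // cannot truncate: at most 1 GiB
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(payload);
    Ok(frame)
}

/// Try to decode one frame from the start of `buf`.
/// Returns `Ok(None)` when more bytes are needed, otherwise the payload
/// and the total number of bytes consumed.
fn decode_frame(buf: &[u8]) -> Result<Option<(&[u8], usize)>, String> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(format!("frame of {len} bytes exceeds limit"));
    }
    if buf.len() < 4 + len {
        return Ok(None);
    }
    Ok(Some((&buf[4..4 + len], 4 + len)))
}
```

Checking the declared length before waiting for the body lets a receiver reject an oversized frame after reading only four bytes.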
## Sync Algorithm Walkthrough
### 1. Initiation (Alice → Bob)
Alice generates the initial message:
```rust
fn init<S: Store<E>>(store: &mut S) -> Result<Self, S::Error> {
    let x = store.get_first()?;             // First key, or default
    let range = Range::new(x.clone(), x);   // x == y: the "all elements" range
    let fingerprint = store.get_fingerprint(&range)?;
    // `parts` holds `MessagePart`s, so the fingerprint is wrapped in its variant
    let part = MessagePart::RangeFingerprint(RangeFingerprint { range, fingerprint });
    Ok(Message { parts: vec![part] })
}
```
This sends a single fingerprint covering the entire set.
### 2. Processing (Bob processes Alice's message)
For each part in the message:
**Case 1: RangeFingerprint matches local fingerprint** → Nothing to do, sets are equal in this range.
**Case 2: The received fingerprint is the empty-set fingerprint OR the range has ≤ 1 local entry** → Send all local entries in the range as a `RangeItem`.
**Case 3: Recurse** → Split the range into `split_factor` partitions, compute fingerprints, and send either `RangeFingerprint` (if partition is large) or `RangeItem` (if partition is small enough, ≤ `max_set_size`).
### 3. Processing RangeItem
When a peer receives a `RangeItem`:
1. **Validate** each incoming entry using `validate_cb`
2. **Insert** valid entries via `Store::put()` (which handles prefix deletion)
3. **Notify** via `on_insert_cb` for actually-inserted entries
4. If `have_local` is false, compute the **diff** — entries in the local range not present in the received set — and send them back
### Configuration
```rust
struct SyncConfig {
max_set_size: usize, // Default: 1 — entries to send before using fingerprints
split_factor: usize, // Default: 2 — number of partitions per recursion step
}
```
With `max_set_size = 1` and `split_factor = 2`, the algorithm behaves like a binary search: each fingerprint mismatch splits the range in two and sends fingerprints for both halves.
## Store Trait
The `Store` trait provides the interface that the reconciliation algorithm needs:
```rust
pub trait Store<E: RangeEntry>: Sized {
type Error: Debug + Send + Sync + Into<anyhow::Error> + 'static;
type RangeIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
type ParentIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
fn get_first(&mut self) -> Result<E::Key, Self::Error>;
fn get_fingerprint(&mut self, range: &Range<E::Key>) -> Result<Fingerprint, Self::Error>;
fn entry_put(&mut self, entry: E) -> Result<(), Self::Error>;
fn get_range(&mut self, range: Range<E::Key>) -> Result<Self::RangeIterator<'_>, Self::Error>;
fn prefixes_of(&mut self, key: &E::Key) -> Result<Self::ParentIterator<'_>, Self::Error>;
fn remove_prefix_filtered(&mut self, prefix: &E::Key, predicate: impl Fn(&E::Value) -> bool) -> Result<usize, Self::Error>;
fn initial_message(&mut self) -> Result<Message<E>, Self::Error>;
async fn process_message<F, F2, F3>(...) -> Result<Option<Message<E>>, Self::Error>;
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error>;
}
```
### Insert Semantics in `Store::put()`
The `put` method implements the CRDT insert logic:
```rust
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error> {
    // 1. Check prefix entries: if any entry whose key is a prefix of the new key
    //    has a value >= the new entry's value, reject the insert.
    for prefix_entry in self.prefixes_of(entry.key())? {
        let prefix_entry = prefix_entry?; // the iterator yields Result<E, Self::Error>
        if entry.value() <= prefix_entry.value() {
            return Ok(InsertOutcome::NotInserted);
        }
    }
    // 2. Remove entries whose key has the new entry's key as a prefix
    //    and whose value is <= the new entry's value.
    let removed = self.remove_prefix_filtered(entry.key(), |v| entry.value() >= v)?;
    // 3. Insert the new entry.
    self.entry_put(entry)?;
    Ok(InsertOutcome::Inserted { removed })
}
```
### InsertOutcome
```rust
enum InsertOutcome {
NotInserted, // A newer or equal entry already exists
Inserted { removed: usize }, // Successfully inserted; reports removed entries
}
```
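To make the prefix rules concrete, here is a small self-contained model of the insert logic. It is only a sketch: `ToyStore` is a hypothetical type, keys are byte strings in a `BTreeMap`, and a plain `u64` stands in for the record value ordering.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum InsertOutcome {
    NotInserted,
    Inserted { removed: usize },
}

/// Toy store: byte-string keys mapped to a comparable value (think timestamp).
#[derive(Default)]
struct ToyStore {
    entries: BTreeMap<Vec<u8>, u64>,
}

impl ToyStore {
    fn put(&mut self, key: &[u8], value: u64) -> InsertOutcome {
        // 1. Reject if any stored key that is a prefix of `key` (including
        //    `key` itself) already holds an equal or greater value.
        let blocked = self
            .entries
            .iter()
            .any(|(k, v)| key.starts_with(k) && value <= *v);
        if blocked {
            return InsertOutcome::NotInserted;
        }
        // 2. Remove stored entries that have `key` as a prefix and whose
        //    value is not greater than the new value.
        let before = self.entries.len();
        self.entries.retain(|k, v| !(k.starts_with(key) && *v <= value));
        let removed = before - self.entries.len();
        // 3. Insert the new entry.
        self.entries.insert(key.to_vec(), value);
        InsertOutcome::Inserted { removed }
    }
}
```

In this model, writing `/a` with a newer value removes older entries such as `/a/x`, and a later write to `/a/z` is accepted only if its value is greater than the one stored at `/a`.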
## Sync Flow at the Protocol Level
The `Replica` type provides the sync interface:
```rust
// Create initial message for sync
fn sync_initial_message(&mut self) -> anyhow::Result<ProtocolMessage>
// Process an incoming message and produce optional reply
async fn sync_process_message(
    &mut self,
    message: ProtocolMessage,
    from_peer: PeerIdBytes,
    state: &mut SyncOutcome,
) -> Result<Option<ProtocolMessage>, anyhow::Error>
```
### SyncOutcome
Tracks the result of a sync session:
```rust
pub struct SyncOutcome {
pub heads_received: AuthorHeads, // Latest timestamps per author from remote
pub num_recv: usize, // Number of entries received
pub num_sent: usize, // Number of entries sent
}
```
## Network Protocol (Codec)
The sync protocol operates over a QUIC bidirectional stream:
1. **Alice** (initiator) sends `Message::Init { namespace, message }`
2. **Bob** (responder) validates the namespace and either:
- Accepts and processes the initial message
- Rejects with `Message::Abort { reason }`
3. Both peers exchange `Message::Sync(message)` rounds until one side has no reply (convergence reached)
The `BobState` manages the responder side, tracking namespace and `SyncOutcome` progress across message rounds.
### Abort Reasons
```rust
pub enum AbortReason {
NotFound, // Namespace not available
AlreadySyncing, // Already syncing this namespace
InternalServerError,
}
```
### Concurrent Sync Prevention
When both peers try to sync with each other simultaneously, the system applies a deterministic tiebreaker that compares `EndpointId` bytes: the peer with the larger ID accepts, and the other peer connects.
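The tiebreaker amounts to a byte-wise comparison of the two IDs. A minimal sketch, assuming 32-byte IDs and a hypothetical helper name `should_connect` (the crate's actual function may look different):

```rust
/// Hypothetical helper: decide which side dials when both peers want to sync.
/// Following the rule in the text, the peer with the larger ID accepts, so we
/// only connect when our own ID is the smaller one.
fn should_connect(our_id: &[u8; 32], their_id: &[u8; 32]) -> bool {
    // Arrays compare lexicographically, byte by byte.
    our_id < their_id
}
```

Because both peers evaluate the same comparison with the arguments swapped, exactly one of them dials whenever the IDs differ.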


@@ -0,0 +1,257 @@
# iroh-docs: Store and Persistence
## Store Architecture
The store is implemented in `store::fs::Store` using `redb`, an embedded key-value database. It supports two modes:
- **In-memory**: `Store::memory()` — backed by a `Vec<u8>` via `redb::backends::InMemoryBackend`
- **Persistent**: `Store::persistent(path)` — backed by a single file on disk
Both modes use the same `redb` table structure.
## redb Table Schema
### Authors Table
```
Table: "authors-1"
Key: [u8; 32] (AuthorId)
Value: [u8; 32] (Author secret key bytes)
```
### Namespaces Table
```
Table: "namespaces-2"
Key: [u8; 32] (NamespaceId)
Value: (u8, [u8; 32]) (CapabilityKind, key bytes)
```
The `CapabilityKind` discriminates between `Write = 1` (full key stored) and `Read = 2` (only the public key / namespace ID stored).
### Records Table (Primary)
```
Table: "records-1"
Key: (NamespaceId, AuthorId, key_bytes) = ([u8; 32], [u8; 32], &[u8])
Value: (timestamp, namespace_sig, author_sig, len, hash) = (u64, &[u8; 64], &[u8; 64], u64, &[u8; 32])
```
This is the main table storing all document entries. The key layout `(namespace, author, key)` enables efficient range queries for the sync algorithm.
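Why the `(namespace, author, key)` ordering helps range queries can be illustrated with an ordered map from the standard library. This is a sketch only: a `BTreeMap` stands in for the redb table, and `RecordKey` and `keys_for_author` are hypothetical names.

```rust
use std::collections::BTreeMap;

type Id = [u8; 32];
/// Mirrors the `(namespace, author, key)` layout of the records table.
type RecordKey = (Id, Id, Vec<u8>);

/// Collect the keys of every record in `namespace` written by `author`.
fn keys_for_author(records: &BTreeMap<RecordKey, u64>, namespace: Id, author: Id) -> Vec<Vec<u8>> {
    // The empty key sorts before every other key, so this is the smallest
    // possible composite key for the (namespace, author) pair.
    let start = (namespace, author, Vec::new());
    records
        .range(start..)
        // All matching records are contiguous, so stop at the first mismatch.
        .take_while(|((ns, au, _), _)| *ns == namespace && *au == author)
        .map(|((_, _, key), _)| key.clone())
        .collect()
}
```

Because tuples sort lexicographically, all records of one author within one namespace sit next to each other, so the scan touches only the matching rows.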
### Latest-Per-Author Table
```
Table: "latest-by-author-1"
Key: (NamespaceId, AuthorId) = (&[u8; 32], &[u8; 32])
Value: (timestamp, key_bytes) = (u64, &[u8])
```
Used to quickly determine the latest entry timestamp for each author, supporting `AuthorHeads` computation and `has_news_for_us()` checks.
### Records-By-Key Table (Index)
```
Table: "records-by-key-1"
Key: (NamespaceId, key_bytes, AuthorId) = (&[u8; 32], &[u8], &[u8; 32])
Value: ()
```
An index table that enables efficient queries by key prefix, supporting `Query::key_prefix()` and `Query::key_exact()` lookups.
### Namespace Peers Table (Multimap)
```
MultimapTable: "sync-peers-1"
Key: &[u8; 32] (NamespaceId)
Value: (Nanos, &PeerIdBytes) (timestamp_nanos, peer_id)
```
Stores up to 5 (`PEERS_PER_DOC_CACHE_SIZE`) recently-useful peers per namespace. This is an LRU cache: when full, the oldest peer is evicted when a new one is registered.
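The eviction rule can be modelled in a few lines. The sketch below assumes a plain `Vec` per namespace and integer timestamps; `PeerCache` and its methods are hypothetical and do not reflect how the multimap table is actually updated.

```rust
/// Same limit as `PEERS_PER_DOC_CACHE_SIZE` in the text.
const PEERS_PER_DOC_CACHE_SIZE: usize = 5;

type PeerId = [u8; 32];

/// Toy per-namespace peer cache holding `(timestamp_nanos, peer)` pairs.
#[derive(Default)]
struct PeerCache {
    peers: Vec<(u64, PeerId)>,
}

impl PeerCache {
    /// Record that `peer` was useful at time `now`.
    fn register_useful_peer(&mut self, peer: PeerId, now: u64) {
        // Re-registering a known peer only refreshes its timestamp.
        if let Some(slot) = self.peers.iter_mut().find(|slot| slot.1 == peer) {
            slot.0 = now;
            return;
        }
        // When full, evict the entry with the oldest timestamp.
        if self.peers.len() >= PEERS_PER_DOC_CACHE_SIZE {
            if let Some(oldest) = (0..self.peers.len()).min_by_key(|&i| self.peers[i].0) {
                self.peers.swap_remove(oldest);
            }
        }
        self.peers.push((now, peer));
    }

    /// Peers ordered from most to least recently useful.
    fn peers_by_recency(&self) -> Vec<PeerId> {
        let mut sorted = self.peers.clone();
        sorted.sort_by(|a, b| b.0.cmp(&a.0));
        sorted.into_iter().map(|(_, p)| p).collect()
    }
}
```

Registering a sixth peer drops the one with the smallest timestamp, while re-registering an existing peer moves it to the front of the recency order.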
### Download Policy Table
```
Table: "download-policy-1"
Key: &[u8; 32] (NamespaceId)
Value: &[u8] (postcard-encoded DownloadPolicy)
```
Per-namespace download policies controlling which content blobs to automatically download.
## Store Operations
### Transaction Model
The `Store` uses a "current transaction" approach:
```rust
enum CurrentTransaction {
None,
Read(ReadOnlyTables),
Write(TransactionAndTables),
}
```
- Read operations obtain a read snapshot
- Write operations batch into a write transaction
- Transactions older than `MAX_COMMIT_DELAY` (500ms) are automatically committed
- `flush()` commits any pending write transaction
### Core Methods
```rust
// Create/open/close replicas
fn new_replica(&mut self, namespace: NamespaceSecret) -> Result<Replica<'_>>;
fn open_replica(&mut self, namespace_id: &NamespaceId) -> Result<Replica<'_>>;
fn close_replica(&mut self, id: NamespaceId);
fn import_namespace(&mut self, capability: Capability) -> Result<ImportNamespaceOutcome>;
// Author management
fn new_author<R: CryptoRng>(&mut self, rng: &mut R) -> Result<Author>;
fn import_author(&mut self, author: Author) -> Result<()>;
fn get_author(&mut self, author_id: &AuthorId) -> Result<Option<Author>>;
fn delete_author(&mut self, author: AuthorId) -> Result<()>;
// Queries
fn get_many(&mut self, namespace: NamespaceId, query: impl Into<Query>) -> Result<QueryIterator>;
fn get_exact(&mut self, namespace: NamespaceId, author: AuthorId, key: impl AsRef<[u8]>, include_empty: bool) -> Result<Option<SignedEntry>>;
fn get_latest_for_each_author(&mut self, namespace: NamespaceId) -> Result<LatestIterator<'_>>;
// Sync support
fn has_news_for_us(&mut self, namespace: NamespaceId, heads: &AuthorHeads) -> Result<Option<NonZeroU64>>;
fn get_sync_peers(&mut self, namespace: &NamespaceId) -> Result<Option<PeersIter>>;
fn register_useful_peer(&mut self, namespace: NamespaceId, peer: PeerIdBytes) -> Result<()>;
// Content
fn content_hashes(&mut self) -> Result<ContentHashesIterator>;
```
### ImportNamespaceOutcome
```rust
pub enum ImportNamespaceOutcome {
Inserted, // New namespace created
Upgraded, // Existing namespace upgraded from Read to Write
NoChange, // Namespace already existed with same or higher capability
}
```
## Query System
The `Query` type supports flexible entry lookups:
```rust
pub struct Query {
kind: QueryKind,
filter_author: AuthorFilter,
filter_key: KeyFilter,
limit: Option<u64>,
offset: u64,
include_empty: bool,
sort_direction: SortDirection,
}
```
### Query Kinds
```rust
enum QueryKind {
Flat(FlatQuery), // Returns all matching entries
SingleLatestPerKey(SingleLatestPerKeyQuery), // Returns only latest entry per key
}
```
- **Flat**: Returns all entries matching the filters, sorted by `(namespace, author, key)` or `(namespace, key, author)` depending on `SortBy`
- **SingleLatestPerKey**: Groups by key and returns only the latest entry (by record value ordering) per key
### Filters
```rust
enum KeyFilter {
Any, // Match all keys
Exact(Bytes), // Exact key match
Prefix(Bytes), // Key starts with prefix
}
enum AuthorFilter {
Any, // Match all authors
Exact(AuthorId), // Match specific author
}
```
### Builder Pattern
```rust
// Get all entries
Query::all()
// Get entries by author
Query::author(author_id)
// Get entries by key prefix
Query::key_prefix(b"/path/")
// Get single latest entry per key
Query::single_latest_per_key()
.key_prefix(b"/path/")
.author(author_id)
```
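The difference between the two query kinds is easiest to see on a small in-memory model. The following is a sketch under stated assumptions: `ToyEntry`, `flat`, and `single_latest_per_key` are hypothetical, and "latest" is decided by timestamp with the author as a tie-breaker rather than by the real record value ordering.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct ToyEntry {
    key: Vec<u8>,
    author: u8,
    timestamp: u64,
}

/// Flat query: every entry whose key starts with `prefix`.
fn flat<'a>(entries: &'a [ToyEntry], prefix: &[u8]) -> Vec<&'a ToyEntry> {
    entries.iter().filter(|e| e.key.starts_with(prefix)).collect()
}

/// Single-latest-per-key query: group by key and keep only the newest entry.
fn single_latest_per_key<'a>(entries: &'a [ToyEntry], prefix: &[u8]) -> Vec<&'a ToyEntry> {
    let mut latest: BTreeMap<&[u8], &ToyEntry> = BTreeMap::new();
    for e in flat(entries, prefix) {
        let slot = latest.entry(e.key.as_slice()).or_insert(e);
        // Toy ordering: timestamp first, author as tie-breaker.
        if (e.timestamp, e.author) > (slot.timestamp, slot.author) {
            *slot = e;
        }
    }
    // BTreeMap iteration yields the results sorted by key.
    latest.into_values().collect()
}
```

With two authors writing the same key, the flat query returns both entries, whereas the grouped query returns only the one with the newer timestamp.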
## Download Policy
Controls which content blobs to automatically download after sync:
```rust
pub enum DownloadPolicy {
NothingExcept(Vec<FilterKind>), // Only download matching entries
EverythingExcept(Vec<FilterKind>), // Download all except matching (default)
}
pub enum FilterKind {
Prefix(Bytes), // Matches keys starting with bytes
Exact(Bytes), // Matches exact key
}
```
Default: `EverythingExcept(Vec::new())` — download everything.
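How a policy decides whether to fetch an entry's content can be sketched as follows. The enums are simplified mirrors of the ones above (using `Vec<u8>` instead of `Bytes`), and the method names `matches` and `should_download` are assumptions made for illustration.

```rust
/// Simplified mirror of `FilterKind`, using `Vec<u8>` instead of `Bytes`.
enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    /// True when `key` satisfies this filter.
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(prefix) => key.starts_with(prefix),
            FilterKind::Exact(exact) => key == exact.as_slice(),
        }
    }
}

/// Simplified mirror of `DownloadPolicy`.
enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    /// Returns true when content for `key` should be downloaded.
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            // Allow-list: download only if some filter matches.
            DownloadPolicy::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            // Deny-list: download unless some filter matches.
            DownloadPolicy::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}
```

An empty `EverythingExcept` list therefore accepts every key, which matches the documented default.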
## PublicKeyStore
The `PublicKeyStore` trait caches expanded `ed25519_dalek::VerifyingKey` objects to avoid repeated curve point decompression:
```rust
pub trait PublicKeyStore {
fn public_key(&self, id: &[u8; 32]) -> Result<VerifyingKey, SignatureError>;
fn namespace_key(&self, bytes: &NamespaceId) -> Result<NamespacePublicKey, SignatureError>;
fn author_key(&self, bytes: &AuthorId) -> Result<AuthorPublicKey, SignatureError>;
}
```
The `MemPublicKeyStore` implementation uses `Arc<RwLock<HashMap<[u8; 32], VerifyingKey>>>` for thread-safe caching.
The `Store` itself implements `PublicKeyStore`, leveraging its redb tables for author storage and the in-memory cache for fast verification.
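The caching idea behind `MemPublicKeyStore` is ordinary memoization behind a shared lock. The sketch below is illustrative only: `ExpandedKey` and `expand` are hypothetical placeholders for `ed25519_dalek::VerifyingKey` and its decompression step, and error handling is omitted.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for an expanded key (hypothetical; not a real verifying key).
#[derive(Debug, Clone, PartialEq)]
struct ExpandedKey(Vec<u8>);

/// Placeholder for the costly curve point decompression (hypothetical).
fn expand(id: &[u8; 32]) -> ExpandedKey {
    ExpandedKey(id.iter().rev().copied().collect())
}

/// Cheaply cloneable cache; all clones share the same map.
#[derive(Clone, Default)]
struct MemKeyCache {
    keys: Arc<RwLock<HashMap<[u8; 32], ExpandedKey>>>,
}

impl MemKeyCache {
    fn public_key(&self, id: &[u8; 32]) -> ExpandedKey {
        // Fast path: look the key up under a shared read lock.
        if let Some(key) = self.keys.read().unwrap().get(id) {
            return key.clone();
        }
        // Slow path: expand once, then store the result under the write lock.
        let key = expand(id);
        self.keys.write().unwrap().insert(*id, key.clone());
        key
    }

    /// Number of cached keys.
    fn len(&self) -> usize {
        self.keys.read().unwrap().len()
    }
}
```

Two threads may occasionally expand the same key at the same time in this sketch; the second insert simply overwrites the first with an identical value.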
## StoreInstance
```rust
pub struct StoreInstance<'a> {
namespace: NamespaceId,
store: &'a mut Store,
}
```
A `StoreInstance` bundles a namespace ID with a mutable reference to the store, providing the `ranger::Store<SignedEntry>` implementation for the sync algorithm. This is what `Replica` uses internally to perform sync operations.
## Replica
```rust
pub struct Replica<'a, I = Box<ReplicaInfo>> {
store: StoreInstance<'a>,
info: I,
}
```
`Replica` is the primary user-facing type for document operations. It combines:
- A `StoreInstance` for data access
- `ReplicaInfo` for metadata (capability, subscribers, content status callback)
Key methods:
- `insert(key, author, hash, len)` — Insert a new entry
- `delete_prefix(prefix, author)` — Delete entries by key prefix
- `insert_remote_entry(entry, from, content_status)` — Insert from sync
- `hash_and_insert(key, author, data)` — Hash data and insert
- `sync_initial_message()` / `sync_process_message()` — Sync protocol operations


@@ -0,0 +1,343 @@
# iroh-docs: Engine and Live Sync
## Overview
The `Engine` is the top-level coordinator for live document synchronization. It brings together:
1. **SyncHandle/Actor** — Single-threaded actor for all store and replica operations
2. **LiveActor** — Async event loop coordinating sync, gossip, and content downloads
3. **GossipState** — Integration with `iroh-gossip` for broadcasting updates
4. **Blobs/Downloader** — Integration with `iroh-blobs` for content transfer
## Engine
```rust
pub struct Engine {
pub endpoint: Endpoint,
pub sync: SyncHandle,
pub default_author: DefaultAuthor,
to_live_actor: mpsc::Sender<ToLiveActor>,
actor_handle: AbortOnDropHandle<()>,
content_status_cb: ContentStatusCallback,
blob_store: iroh_blobs::api::Store,
_gc_protect_task: AbortOnDropHandle<()>,
}
```
### Initialization
```rust
Engine::spawn(
    endpoint,               // iroh Endpoint for QUIC connections
    gossip,                 // iroh-gossip instance
    replica_store,          // Store for document data
    bao_store,              // iroh-blobs Store for content blobs
    downloader,             // Downloader for fetching blobs
    default_author_storage, // Where to persist the default author
    protect_cb,             // Optional GC protection callback
) -> Result<Self>
```
During spawn:
1. A `ContentStatusCallback` is created that checks blob availability in `iroh-blobs`
2. A `SyncHandle` actor is spawned on a dedicated thread
3. A `LiveActor` is spawned as a tokio task
4. The default author is loaded or created
5. A GC protection task is started (if callback provided)
### Key Engine Methods
```rust
// Start syncing a document with given peers
async fn start_sync(&self, namespace: NamespaceId, peers: Vec<EndpointAddr>) -> Result<()>
// Stop syncing and leave gossip swarm
async fn leave(&self, namespace: NamespaceId, kill_subscribers: bool) -> Result<()>
// Subscribe to document events
async fn subscribe(&self, namespace: NamespaceId) -> Result<impl Stream<Item = Result<LiveEvent>>>
// Handle incoming QUIC connections
async fn handle_connection(&self, conn: Connection) -> Result<()>
// Shutdown the engine
async fn shutdown(&self) -> Result<()>
```
### GC Protection
The `ProtectCallbackHandler` bridges iroh-docs with iroh-blobs' garbage collection:
```rust
let (handler, protect_cb) = ProtectCallbackHandler::new();
// protect_cb goes into iroh-blobs GC config
// handler goes into Engine::spawn
```
When iroh-blobs runs GC, it calls `protect_cb` which queries the docs store for all content hashes, ensuring blobs referenced by document entries are not garbage-collected.
## SyncHandle / Actor
The `SyncHandle` is a handle to a single-threaded actor that processes all store and replica operations sequentially:
```rust
pub struct SyncHandle {
tx: async_channel::Sender<Action>,
join_handle: Arc<Option<std::thread::JoinHandle<()>>>,
metrics: Arc<Metrics>,
}
```
### Actor Architecture
```
External Code ──async──▶ SyncHandle ──channel──▶ Actor Thread
                                                   ├── Store (redb)
                                                   ├── Replica operations
                                                   └── Flush on timeout (500ms)
```
The actor runs on a **dedicated OS thread** (not a tokio task), using `tokio::runtime::Builder::new_current_thread()` internally. This ensures store operations are never concurrent.
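The handle-plus-dedicated-thread pattern can be reproduced with the standard library alone. This sketch makes several assumptions: it uses `std::sync::mpsc` instead of `async_channel`, runs no tokio runtime on the thread, and `Handle` and its toy `Action` enum are hypothetical stand-ins for `SyncHandle` and the real action set.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

/// Requests understood by the toy actor; each carries its own reply channel.
enum Action {
    Insert { key: String, value: u64, reply: mpsc::Sender<Option<u64>> },
    Get { key: String, reply: mpsc::Sender<Option<u64>> },
    Shutdown,
}

/// Cloneable handle, playing the role of `SyncHandle`.
#[derive(Clone)]
struct Handle {
    tx: mpsc::Sender<Action>,
}

impl Handle {
    /// Spawn the actor on its own OS thread; the thread owns the store exclusively.
    fn spawn() -> (Self, thread::JoinHandle<()>) {
        let (tx, rx) = mpsc::channel();
        let join = thread::spawn(move || {
            let mut store: BTreeMap<String, u64> = BTreeMap::new();
            // Actions are handled strictly one at a time, in arrival order.
            for action in rx {
                match action {
                    Action::Insert { key, value, reply } => {
                        let _ = reply.send(store.insert(key, value));
                    }
                    Action::Get { key, reply } => {
                        let _ = reply.send(store.get(&key).copied());
                    }
                    Action::Shutdown => break,
                }
            }
        });
        (Handle { tx }, join)
    }

    /// Insert a value and return the previous one, if any.
    fn insert(&self, key: &str, value: u64) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Insert { key: key.to_string(), value, reply }).ok()?;
        rx.recv().ok()?
    }

    /// Look up a value through the actor.
    fn get(&self, key: &str) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Get { key: key.to_string(), reply }).ok()?;
        rx.recv().ok()?
    }

    /// Ask the actor thread to stop.
    fn shutdown(&self) {
        let _ = self.tx.send(Action::Shutdown);
    }
}
```

Since only the actor thread ever touches `store`, no lock is needed around it; callers on any thread serialize their requests through the channel.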
### Action Types
```rust
enum Action {
ImportAuthor { author, reply },
ExportAuthor { author, reply },
DeleteAuthor { author, reply },
ImportNamespace { capability, reply },
ListAuthors { reply },
ListReplicas { reply },
ContentHashes { reply },
FlushStore { reply },
Replica(NamespaceId, ReplicaAction),
Shutdown { reply },
}
enum ReplicaAction {
Open { reply, opts },
Close { reply },
GetState { reply },
SetSync { sync, reply },
Subscribe { sender, reply },
Unsubscribe { sender, reply },
InsertLocal { author, key, hash, len, reply },
DeletePrefix { author, key, reply },
InsertRemote { entry, from, content_status, reply },
SyncInitialMessage { reply },
SyncProcessMessage { message, from, state, reply },
GetSyncPeers { reply },
RegisterUsefulPeer { peer, reply },
GetExact { author, key, include_empty, reply },
GetMany { query, reply },
DropReplica { reply },
ExportSecretKey { reply },
HasNewsForUs { heads, reply },
SetDownloadPolicy { policy, reply },
GetDownloadPolicy { reply },
}
```
### Replica Opening
When a replica is opened via the actor, an `OpenReplica` struct is created:
```rust
struct OpenReplica {
info: ReplicaInfo, // Capability, subscribers, content status callback
sync: bool, // Whether to accept sync requests
handles: usize, // Reference count for open handles
}
```
Multiple handles to the same replica are supported via reference counting.
## LiveActor
The `LiveActor` is the central async coordinator:
```rust
pub struct LiveActor {
inbox: mpsc::Receiver<ToLiveActor>,
sync: SyncHandle,
endpoint: Endpoint,
bao_store: Store,
downloader: Downloader,
memory_lookup: MemoryLookup,
replica_events_tx: async_channel::Sender<Event>,
replica_events_rx: async_channel::Receiver<Event>,
sync_actor_tx: mpsc::Sender<ToLiveActor>,
gossip: GossipState,
running_sync_connect: JoinSet<SyncConnectRes>,
running_sync_accept: JoinSet<SyncAcceptRes>,
download_tasks: JoinSet<DownloadRes>,
missing_hashes: HashSet<Hash>,
queued_hashes: QueuedHashes,
hash_providers: ProviderNodes,
subscribers: SubscribersMap,
state: NamespaceStates,
metrics: Arc<Metrics>,
}
```
### Event Loop
The `LiveActor::run_inner()` loop uses `tokio::select!` with biased polling:
```rust
tokio::select! {
    biased;
    msg = self.inbox.recv() => { /* handle actor messages */ }
    event = self.replica_events_rx.recv() => { /* handle replica insert events */ }
    res = self.running_sync_connect.join_next() => { /* sync connect finished */ }
    res = self.running_sync_accept.join_next() => { /* sync accept finished */ }
    res = self.download_tasks.join_next() => { /* download completed */ }
    res = self.gossip.progress() => { /* gossip task progress */ }
}
```
### ToLiveActor Messages
```rust
pub enum ToLiveActor {
StartSync { namespace, peers, reply },
Leave { namespace, kill_subscribers, reply },
Shutdown { reply },
Subscribe { namespace, sender, reply },
HandleConnection { conn },
AcceptSyncRequest { namespace, peer, reply },
IncomingSyncReport { from, report },
NeighborContentReady { namespace, node, hash },
NeighborUp { namespace, peer },
NeighborDown { namespace, peer },
}
```
### Gossip Operations (Op)
```rust
pub enum Op {
Put(SignedEntry), // New entry inserted
ContentReady(Hash), // Content blob now available
SyncReport(SyncReport), // Heads summary after sync
}
```
Gossip broadcasts `Op` messages to all swarm participants. When a `Put` is received, the entry is inserted into the local replica. When a `ContentReady` is received, peers know they can download the blob. When a `SyncReport` is received, peers check `has_news_for_us()` to decide if they should sync.
### Content Download Flow
1. When a `RemoteInsert` event occurs with `should_download: true`, the entry's content hash is queued for download
2. The `LiveActor` uses `iroh_blobs::downloader::Downloader` to fetch the blob
3. Known providers (peers who had `ContentStatus::Complete`) are used as download sources
4. On download completion, a `LiveEvent::ContentReady` event is emitted
### LiveEvent (Public API)
```rust
pub enum LiveEvent {
InsertLocal { entry: Entry },
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
ContentReady { hash: Hash },
PendingContentReady,
NeighborUp(PublicKey),
NeighborDown(PublicKey),
SyncFinished(SyncEvent),
}
```
`SyncEvent` wraps `SyncFinished`:
```rust
pub struct SyncFinished {
pub namespace: NamespaceId,
pub peer: PublicKey,
pub outcome: SyncOutcome,
pub timings: Timings,
}
```
## NamespaceStates
```rust
pub struct NamespaceStates(BTreeMap<NamespaceId, NamespaceState>);
struct NamespaceState {
nodes: BTreeMap<EndpointId, PeerState>,
may_emit_ready: bool,
}
```
Each peer has a `PeerState` tracking sync progress:
```rust
struct PeerState {
state: SyncState, // Idle or Running
resync_requested: bool, // Whether a resync was requested during active sync
last_sync: Option<(Instant, Result<SyncFinished>)>,
}
```
This state machine prevents concurrent syncs with the same peer for the same namespace and queues resync requests when needed.
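The "at most one sync per peer, with a queued resync" rule can be expressed as a two-state machine. A minimal sketch, assuming hypothetical method names `request_sync` and `finish` and omitting the `last_sync` bookkeeping:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum SyncState {
    Idle,
    Running,
}

#[derive(Debug)]
struct PeerState {
    state: SyncState,
    resync_requested: bool,
}

impl PeerState {
    fn new() -> Self {
        PeerState { state: SyncState::Idle, resync_requested: false }
    }

    /// Ask for a sync. Returns true if the caller should start one now.
    fn request_sync(&mut self) -> bool {
        match self.state {
            SyncState::Idle => {
                self.state = SyncState::Running;
                true
            }
            SyncState::Running => {
                // A sync is already in flight: remember to run another afterwards.
                self.resync_requested = true;
                false
            }
        }
    }

    /// Mark the running sync as finished. Returns true if a queued resync
    /// should start immediately.
    fn finish(&mut self) -> bool {
        self.state = SyncState::Idle;
        if std::mem::take(&mut self.resync_requested) {
            self.request_sync()
        } else {
            false
        }
    }
}
```

Requests that arrive during a running sync collapse into a single follow-up sync, so a burst of updates never starts more than one extra round.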
## DefaultAuthor
```rust
pub struct DefaultAuthor {
value: RwLock<AuthorId>,
storage: DefaultAuthorStorage,
}
```
- `DefaultAuthorStorage::Mem` — Ephemeral, creates a new author each time
- `DefaultAuthorStorage::Persistent(path)` — Stores the author ID as hex in a file, loads it on startup
The default author provides a convenient "current user" identity for applications.
## Docs Protocol Handler
```rust
pub struct Docs {
engine: Arc<Engine>,
api: DocsApi,
}
```
`Docs` implements `ProtocolHandler` for integration with iroh's `Router`:
```rust
impl ProtocolHandler for Docs {
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> { ... }
async fn shutdown(&self) { ... }
}
```
The `Builder` pattern configures storage:
```rust
let docs = Docs::memory()
    .spawn(endpoint, blobs, gossip)
    .await?;
// or
let docs = Docs::persistent(path)
    .protect_handler(handler)
    .spawn(endpoint, blobs, gossip)
    .await?;
```
## DocTicket
```rust
pub struct DocTicket {
pub capability: Capability,
pub nodes: Vec<EndpointAddr>,
}
```
A `DocTicket` encapsulates everything needed to join a document:
- A `Capability` (Read or Write) — provides the namespace key
- A list of `EndpointAddr` — bootstrap peers to connect to
Tickets are serialized as base32-encoded postcard data with a `"doc"` prefix, using the `iroh_tickets::Ticket` trait.


@@ -0,0 +1,189 @@
# iroh-docs: Network Protocol and Wire Format
## ALPN
The docs protocol uses ALPN `/iroh-sync/1` for QUIC connection identification.
```rust
pub const ALPN: &[u8] = b"/iroh-sync/1";
```
## Connection Flow
### Outgoing Sync (Alice — Initiator)
```rust
pub async fn connect_and_sync(
    endpoint: &Endpoint,
    sync: &SyncHandle,
    namespace: NamespaceId,
    peer: EndpointAddr,
    metrics: Option<&Metrics>,
) -> Result<SyncFinished, ConnectError>
```
1. Open a QUIC connection to the peer with ALPN `/iroh-sync/1`
2. Open a bidirectional QUIC stream
3. Run the Alice (initiator) protocol via `run_alice()`
4. Close the stream and return `SyncFinished`
### Incoming Sync (Bob — Responder)
```rust
pub async fn handle_connection<F, Fut>(
    sync: SyncHandle,
    connection: Connection,
    accept_cb: F,
    metrics: Option<&Metrics>,
) -> Result<SyncFinished, AcceptError>
```
1. Accept a bidirectional QUIC stream from the connection
2. Run the Bob (responder) protocol via `BobState::run()`
3. The `accept_cb` determines whether to accept or reject each namespace
4. Close the stream and return `SyncFinished`
## Wire Format
### Frame Codec
All messages are length-prefixed:
```
┌──────────────────────┬──────────────────────────────┐
│ u32 big-endian len │ postcard-serialized Message │
└──────────────────────┴──────────────────────────────┘
```
Maximum message size: 1 GiB.
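The framing itself is simple enough to sketch with blocking I/O from the standard library. This is not the `SyncCodec` implementation: `write_frame` and `read_frame` are hypothetical helpers, and the payload is treated as opaque bytes standing in for an already postcard-encoded `Message`.

```rust
use std::io::{self, Read, Write};

/// Upper bound taken from the text above (1 GiB).
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

/// Write one frame: a u32 big-endian length followed by the payload bytes.
fn write_frame<W: Write>(mut out: W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "message too large"));
    }
    // Safe cast: the limit above fits comfortably in a u32.
    let len = payload.len() as u32;
    out.write_all(&len.to_be_bytes())?;
    out.write_all(payload)
}

/// Read one frame, rejecting oversized lengths before allocating a buffer.
fn read_frame<R: Read>(mut input: R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    input.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "message too large"));
    }
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the length prefix before allocating is the point of the size limit: a peer cannot make the reader reserve more than 1 GiB by sending a large header.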
### Message Types
```rust
enum Message {
Init {
namespace: NamespaceId, // Which document to sync
message: ProtocolMessage, // Initial sync message (ranger::Message<SignedEntry>)
},
Sync(ProtocolMessage), // Subsequent sync round-trip messages
Abort { reason: AbortReason }, // Responder rejects the request
}
```
### Serialization
Messages use `postcard` (a compact `serde` format optimized for embedded/no-std use). The `SyncCodec` implements `tokio_util::codec::Encoder` and `Decoder` for async stream framing.
## Protocol Sequence
```
Alice (Initiator)                             Bob (Responder)
      │                                              │
      │──── Init { namespace, initial_msg } ────────▶│
      │                                              │
      │◀─── Sync(reply_msg) ─────────────────────────│  (or Abort)
      │                                              │
      │──── Sync(next_msg) ─────────────────────────▶│
      │                                              │
      │◀─── Sync(reply_msg) ─────────────────────────│
      │                                              │
      │──── Sync(next_msg) ─────────────────────────▶│
      │                                              │
      │          ... until convergence ...           │
      │                                              │
      │──── (stream closed) ────────────────────────▶│
      │                                              │
```
The protocol terminates when one side has no more messages to send (convergence reached). Each `Sync` message carries a `ProtocolMessage` which is a `ranger::Message<SignedEntry>` containing `MessagePart`s (either `RangeFingerprint` or `RangeItem`).
## SyncFinished Result
```rust
pub struct SyncFinished {
pub namespace: NamespaceId,
pub peer: PublicKey,
pub outcome: SyncOutcome, // heads_received, num_recv, num_sent
pub timings: Timings, // connect duration, process duration
}
```
## Error Types
### ConnectError
```rust
pub enum ConnectError {
Connect { error: anyhow::Error }, // Connection failed
RemoteAbort(AbortReason), // Remote rejected our request
Sync { error: anyhow::Error }, // Sync protocol error
Close { error: anyhow::Error }, // Stream close error
}
```
### AcceptError
```rust
pub enum AcceptError {
Connect { error: anyhow::Error }, // Connection failed
Open { peer: PublicKey, error }, // Failed to open replica
Abort { peer, namespace, reason }, // We aborted
Sync { peer, namespace, error }, // Sync protocol error
Close { peer, namespace, error }, // Stream close error
}
```
## Gossip Integration
The `GossipState` manages iroh-gossip subscriptions per namespace:
```rust
pub struct GossipState {
gossip: Gossip,
sync: SyncHandle,
to_live_actor: mpsc::Sender<ToLiveActor>,
active: HashMap<NamespaceId, ActiveState>,
active_tasks: JoinSet<(NamespaceId, Result<()>)>,
}
```
When a document starts syncing:
1. The engine joins a gossip topic for that namespace
2. `GossipState::join()` subscribes with bootstrap peers
3. A receive loop task is spawned to process incoming gossip messages
4. `Op` messages (Put, ContentReady, SyncReport) are deserialized and forwarded to `LiveActor`
When receiving an `Op::Put`:
```rust
// In the gossip receive loop:
let entry = SignedEntry::from_entry(...); // deserialize
sync.insert_remote(namespace, entry, from, content_status).await?;
```
When receiving an `Op::SyncReport`:
```rust
// Forward to LiveActor which checks has_news_for_us()
to_live_actor.send(ToLiveActor::IncomingSyncReport { from, report }).await?;
```
Broadcasting:
```rust
// When a local insert occurs:
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::Put(entry))).await;
// When content becomes ready:
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::ContentReady(hash))).await;
```
## Sync Report Compression
`SyncReport` encodes `AuthorHeads` with an optional size limit:
```rust
pub struct SyncReport {
namespace: NamespaceId,
heads: Vec<u8>, // postcard-encoded AuthorHeads with size limit
}
```
The size limit ensures gossip messages stay small, dropping the oldest (least recent) author timestamps when necessary.
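The trimming strategy can be illustrated with a fixed-width toy encoding. The real report uses postcard, so the byte sizes here are assumptions; `encode_heads` and `decode_heads` are hypothetical helpers that use 40 bytes per head (32-byte author plus a big-endian `u64` timestamp).

```rust
type AuthorId = [u8; 32];

/// Bytes per head in this toy encoding: 32-byte author + 8-byte timestamp.
const HEAD_SIZE: usize = 40;

/// Encode author heads into at most `size_limit` bytes, newest first.
fn encode_heads(heads: &[(AuthorId, u64)], size_limit: usize) -> Vec<u8> {
    let mut sorted = heads.to_vec();
    // Newest timestamps first, so the oldest heads are the ones dropped.
    sorted.sort_by(|a, b| b.1.cmp(&a.1));
    let mut out = Vec::new();
    for (author, timestamp) in sorted {
        if out.len() + HEAD_SIZE > size_limit {
            break;
        }
        out.extend_from_slice(&author);
        out.extend_from_slice(&timestamp.to_be_bytes());
    }
    out
}

/// Decode the toy encoding back into `(author, timestamp)` pairs.
fn decode_heads(bytes: &[u8]) -> Vec<(AuthorId, u64)> {
    bytes
        .chunks_exact(HEAD_SIZE)
        .map(|chunk| {
            let mut author = [0u8; 32];
            author.copy_from_slice(&chunk[..32]);
            let mut ts = [0u8; 8];
            ts.copy_from_slice(&chunk[32..]);
            (author, u64::from_be_bytes(ts))
        })
        .collect()
}
```

With three heads and a 100-byte limit, only the two most recent heads survive in this model, which keeps the gossip payload bounded while preserving the most useful information.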


@@ -0,0 +1,188 @@
# iroh-docs: API and RPC
## DocsApi
The `DocsApi` provides an RPC-based interface to the docs engine, implemented via `irpc`:
```rust
#[derive(Debug, Clone)]
pub struct DocsApi {
inner: Client<DocsProtocol>,
}
```
### Methods (via irpc)
The API exposes document operations through an RPC protocol defined in `api/protocol.rs`:
| Method | Request | Response | Description |
|--------|---------|----------|-------------|
| `Open` | `OpenRequest { doc_id }` | `OpenResponse` | Open a document for operations |
| `Close` | `CloseRequest { doc_id }` | `CloseResponse` | Close a document |
| `Status` | `StatusRequest { doc_id }` | `StatusResponse { status: OpenState }` | Get document open state |
| `List` | `ListRequest` | Stream of `ListResponse { id, capability }` | List all documents |
| `Create` | `CreateRequest` | `CreateResponse { id }` | Create a new document |
| `Drop` | `DropRequest { doc_id }` | `DropResponse` | Remove a document |
| `Import` | `ImportRequest { capability }` | `ImportResponse { doc_id }` | Import a document by capability |
| `Set` | `SetRequest { doc_id, author_id, key, value }` | `SetResponse { entry }` | Set a key-value pair |
| `SetHash` | `SetHashRequest { doc_id, author_id, key, hash, size }` | `SetHashResponse` | Set a key with pre-hashed content |
| `GetMany` | `GetManyRequest { doc_id, query }` | Stream of entries | Query entries |
| `GetExact` | `GetExactRequest { doc_id, key, author, include_empty }` | `GetExactResponse { entry }` | Get single entry |
| `Del` | `DelRequest { doc_id, author_id, key }` | `DelResponse { removed }` | Delete by key prefix |
| `Subscribe` | `SubscribeRequest { doc_id }` | Stream of `LiveEvent` | Subscribe to document events |
| `Share` | `ShareRequest { doc_id, mode, peers }` | `ShareResponse { ticket }` | Create a sharing ticket |
| `StartSync` | `StartSyncRequest { doc_id, peers }` | `StartSyncResponse` | Start live sync |
| `Leave` | `LeaveRequest { doc_id }` | `LeaveResponse` | Leave gossip swarm |
| `ImportFile` | `ImportFileRequest { ... }` | Stream of `ImportProgress` | Import file content and set key |
| `ExportFile` | `ExportFileRequest { ... }` | Stream of `ExportProgress` | Export content to file |
| `AuthorList` | `AuthorListRequest` | Stream of `AuthorListResponse` | List authors |
| `AuthorCreate` | `AuthorCreateRequest` | `AuthorCreateResponse { author_id }` | Create new author |
| `AuthorImport` | `AuthorImportRequest { author }` | `AuthorImportResponse { author_id }` | Import author key |
| `AuthorExport` | `AuthorExportRequest { author_id }` | `AuthorExportResponse { author }` | Export author key |
| `AuthorDelete` | `AuthorDeleteRequest { author_id }` | `AuthorDeleteResponse` | Delete author |
| `AuthorGetDefault` | `AuthorGetDefaultRequest` | `AuthorGetDefaultResponse { author_id }` | Get default author |
| `AuthorSetDefault` | `AuthorSetDefaultRequest { author_id }` | `AuthorSetDefaultResponse` | Set default author |
| `SetDownloadPolicy` | `SetDownloadPolicyRequest { doc_id, policy }` | `SetDownloadPolicyResponse` | Set download policy |
| `GetDownloadPolicy` | `GetDownloadPolicyRequest { doc_id }` | `GetDownloadPolicyResponse { policy }` | Get download policy |
| `GetSyncPeers` | `GetSyncPeersRequest { doc_id }` | `GetSyncPeersResponse { peers }` | Get known sync peers |
## RPC Implementation
The RPC is implemented via `irpc` (for local/remote procedure calls) and `noq` (for remote network access):
### Local API
`DocsApi::spawn(engine)` creates an `RpcActor` that processes requests against the engine directly:
```rust
impl DocsApi {
    pub fn spawn(engine: Arc<Engine>) -> Self {
        RpcActor::spawn(engine)
    }
}
```
### Remote API
When the `rpc` feature is enabled, `DocsApi::connect(endpoint, addr)` creates a remote client that sends requests over the network via `noq`.
### Protocol Dispatch
```rust
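// Sketch of the dispatch performed by the `irpc::rpc::Handler<DocsProtocol>`:
// each request variant is forwarded to the local actor with its reply channel.
match msg {
    DocsProtocol::Open(msg) => local.send((msg, tx)).await,
    DocsProtocol::Set(msg) => local.send((msg, tx)).await,
    // ... etc
}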
```
## RpcActor
The `RpcActor` (in `api/actor.rs`) bridges the RPC protocol to the `Engine`:
```rust
struct RpcActor {
engine: Arc<Engine>,
}
```
It handles each request type by calling the corresponding `Engine`/`SyncHandle` method and returning the result through the RPC channel.
For streaming responses (like `GetMany`, `Subscribe`, `AuthorList`), the actor sends results through an `mpsc` channel that the RPC framework streams back to the client.
## Share Mode and Tickets
When sharing a document:
```rust
pub enum ShareMode {
Read, // Share with read-only capability
Write, // Share with full write capability
}
```
The `Share` RPC method:
1. Gets or creates the namespace capability
2. Creates a `DocTicket` with the capability and provided peer addresses
3. Starts sync with the provided peers
4. Returns the ticket for distribution
## Example: Basic Setup
```rust
use iroh::{endpoint::presets, protocol::Router, Endpoint};
use iroh_blobs::{BlobsProtocol, store::mem::MemStore, ALPN as BLOBS_ALPN};
use iroh_docs::{protocol::Docs, ALPN as DOCS_ALPN};
use iroh_gossip::{net::Gossip, ALPN as GOSSIP_ALPN};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Bind an iroh endpoint for QUIC connections
    let endpoint = Endpoint::bind(presets::N0).await?;
    // In-memory blob store for entry content
    let blobs = MemStore::default();
    // Gossip instance used to broadcast document updates
    let gossip = Gossip::builder().spawn(endpoint.clone());
    // Docs protocol backed by an in-memory replica store
    let docs = Docs::memory()
        .spawn(endpoint.clone(), (*blobs).clone(), gossip.clone())
        .await?;
    // Route incoming connections to each protocol by ALPN
    let router = Router::builder(endpoint.clone())
        .accept(BLOBS_ALPN, BlobsProtocol::new(&blobs, None))
        .accept(GOSSIP_ALPN, gossip)
        .accept(DOCS_ALPN, docs)
        .spawn();
    Ok(())
}
```
## Data Flow Summary
```
┌─────────────────────────────────────────────────────────────────┐
│  Application / RPC                                              │
│  DocsApi ──irpc──▶ RpcActor ──▶ Engine / SyncHandle             │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│  Live Sync (per document)                                       │
│                                                                 │
│  LiveActor event loop:                                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │ Actor Messages  │  │ Replica Events  │  │ Gossip Events   │  │
│  │ (StartSync,     │  │ (LocalInsert,   │  │ (Put,           │  │
│  │  Subscribe,     │  │  RemoteInsert)  │  │  ContentReady,  │  │
│  │  Leave, ...)    │  │                 │  │  SyncReport)    │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘  │
│           │                    │                    │           │
│           ▼                    ▼                    ▼           │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  LiveActor::run_inner()                                   │  │
│  │  tokio::select! { ... }                                   │  │
│  │                                                           │  │
│  │  - Start/stop gossip subscriptions                        │  │
│  │  - Initiate outgoing syncs (connect_and_sync)             │  │
│  │  - Accept incoming syncs (handle_connection)              │  │
│  │  - Queue content downloads                                │  │
│  │  - Broadcast local inserts via gossip                     │  │
│  │  - Emit LiveEvent to subscribers                          │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Running Tasks:                                                 │
│  ┌─────────────────────┐  ┌─────────────────────┐               │
│  │ sync_connect tasks  │  │ sync_accept tasks   │               │
│  └─────────────────────┘  └─────────────────────┘               │
│  ┌─────────────────────┐  ┌─────────────────────┐               │
│  │ download tasks      │  │ gossip receive loop │               │
│  └─────────────────────┘  └─────────────────────┘               │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│  Sync Actor (dedicated thread)                                  │
│                                                                 │
│  ┌────────────┐   ┌─────────────────────────────────────────┐   │
│  │ Action     │   │ Replica Operations:                     │   │
│  │ Channel    │──▶│ Insert, Delete, Get, Query,             │   │
│  │ (bounded)  │   │ SyncInit, SyncProcess, Open, Close, ... │   │
│  └────────────┘   └─────────────────────────────────────────┘   │
│                                                                 │
│  Store (redb) ──▶ All reads/writes on this thread               │
└─────────────────────────────────────────────────────────────────┘
```

View File

@@ -0,0 +1,318 @@
# iroh-docs: Key Types Reference
## Cryptographic Keys
### NamespaceSecret
```rust
pub struct NamespaceSecret {
    signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
}
```
- The write capability for a document
- Can sign entries (namespace signature)
- Derives `NamespacePublicKey` and `NamespaceId`
- Serialized as 32 bytes
### NamespacePublicKey
```rust
pub struct NamespacePublicKey(VerifyingKey); // ed25519_dalek::VerifyingKey
```
- The verifying key corresponding to `NamespaceSecret`
- Can verify namespace signatures on entries
- Serialized as 32 bytes
### NamespaceId
```rust
pub struct NamespaceId([u8; 32]);
```
- The byte representation of `NamespacePublicKey`
- Serves as the unique identifier for a document
- Can be converted back to `NamespacePublicKey` via `PublicKeyStore` (handles invalid curve points)
### Author
```rust
pub struct Author {
    signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
}
```
- A writer identity within a document
- Can sign entries (author signature)
- Derives `AuthorPublicKey` and `AuthorId`
- Created randomly with `Author::new(&mut rng)`
- Stored persistently in the redb authors table
### AuthorPublicKey
```rust
pub struct AuthorPublicKey(VerifyingKey);
```
- The verifying key corresponding to an `Author`
- Can verify author signatures on entries
- Serialized as 32 bytes
### AuthorId
```rust
pub struct AuthorId([u8; 32]);
```
- Byte representation of `AuthorPublicKey`
- Used as a component of `RecordIdentifier`
- Has `fmt_short()` for human-readable display (first 10 hex chars)
## Entry Types
### RecordIdentifier
```rust
pub struct RecordIdentifier(Bytes);
// Layout: [NamespaceId(32) | AuthorId(32) | Key(variable)]
```
- The composite key for an entry
- Byte layout: 32 bytes namespace + 32 bytes author + variable-length key
- Ordering: namespace → author → key (lexicographic)
- This ordering is critical for the range-based sync algorithm
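
The layout and its ordering can be illustrated with plain byte vectors. This is a minimal sketch, not the crate's implementation: `record_id` is a hypothetical helper, and the ids are arbitrary test values.

```rust
/// Hypothetical helper: concatenates namespace, author and key bytes
/// in the layout described above.
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut id = Vec::with_capacity(64 + key.len());
    id.extend_from_slice(namespace);
    id.extend_from_slice(author);
    id.extend_from_slice(key);
    id
}

fn main() {
    let ns = [1u8; 32];
    let alice = [2u8; 32];
    let bob = [3u8; 32];

    let mut ids = vec![
        record_id(&ns, &bob, b"a"),
        record_id(&ns, &alice, b"b"),
        record_id(&ns, &alice, b"a"),
    ];
    // Vec<u8> compares lexicographically, byte by byte.
    ids.sort();

    // All of alice's entries sort before bob's; within one author the
    // entries are ordered by key.
    assert_eq!(ids[0], record_id(&ns, &alice, b"a"));
    assert_eq!(ids[1], record_id(&ns, &alice, b"b"));
    assert_eq!(ids[2], record_id(&ns, &bob, b"a"));
    println!("key of the first id: {:?}", &ids[0][64..]);
}
```

Because the author bytes sit in front of the key, all entries of one author within a namespace form one contiguous range of the sorted set.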
### Record
```rust
pub struct Record {
    len: u64,       // Byte length of content
    hash: Hash,     // BLAKE3 hash of content (32 bytes)
    timestamp: u64, // Microseconds since Unix epoch
}
```
- The value portion of an entry
- Ordering: timestamp first, then hash (Last-Writer-Wins)
- `Record::empty(timestamp)` creates a tombstone (hash=EMPTY, len=0)
- `Record::new_current(hash, len)` uses current system time
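
A minimal sketch of this ordering, assuming a simplified `Record` with a raw 32-byte hash. The functions `lww_cmp` and `lww_winner` are illustrative names, not part of the crate's API.

```rust
use std::cmp::Ordering;

/// Simplified stand-in for `Record`, with the same three fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Record {
    len: u64,
    hash: [u8; 32],
    timestamp: u64,
}

/// Compares by timestamp first, then by hash as a deterministic tie-breaker.
fn lww_cmp(a: &Record, b: &Record) -> Ordering {
    a.timestamp
        .cmp(&b.timestamp)
        .then_with(|| a.hash.cmp(&b.hash))
}

/// Last-writer-wins: returns whichever record orders higher.
fn lww_winner(a: Record, b: Record) -> Record {
    if lww_cmp(&b, &a) == Ordering::Greater { b } else { a }
}

fn main() {
    let older = Record { len: 3, hash: [9; 32], timestamp: 100 };
    let newer = Record { len: 5, hash: [1; 32], timestamp: 200 };
    // The newer timestamp wins even though its hash is smaller.
    assert_eq!(lww_winner(older, newer.clone()), newer);

    // Equal timestamps: the larger hash wins, regardless of argument order.
    let tie = Record { len: 7, hash: [2; 32], timestamp: 200 };
    assert_eq!(lww_winner(newer.clone(), tie.clone()), tie);
    assert_eq!(lww_winner(tie.clone(), newer), tie);
    println!("winner len: {}", tie.len);
}
```

The hash tie-breaker matters because two peers that see the same pair of records in a different order must still pick the same winner.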
### Entry
```rust
pub struct Entry {
    id: RecordIdentifier,
    record: Record,
}
```
- Combines key and value
- `Entry::new(id, record)` constructor
- `Entry::new_empty(id)` creates a tombstone with current timestamp
- `entry.sign(namespace, author)` produces a `SignedEntry`
### SignedEntry
```rust
pub struct SignedEntry {
    signature: EntrySignature, // Dual Ed25519 signatures
    entry: Entry,
}
```
- An entry with cryptographic proof of authorization and authorship
- `SignedEntry::from_entry(entry, namespace, author)` — create from entry
- `signed_entry.verify(store)` — verify both signatures using a `PublicKeyStore`
- Implements `RangeEntry` for the sync algorithm
### EntrySignature
```rust
pub struct EntrySignature {
    author_signature: Signature,    // 64-byte Ed25519 signature
    namespace_signature: Signature, // 64-byte Ed25519 signature
}
```
- Created by signing the canonical byte encoding of the `Entry`
- Both signatures cover the same message bytes
- Verification requires both `NamespacePublicKey` and `AuthorPublicKey`
## Sync Types
### SyncOutcome
```rust
pub struct SyncOutcome {
    pub heads_received: AuthorHeads,
    pub num_recv: usize,
    pub num_sent: usize,
}
```
- Tracks the result of a sync session
- `heads_received` accumulates the latest timestamp seen from each author on the remote side
### ProtocolMessage
```rust
pub type ProtocolMessage = ranger::Message<SignedEntry>;
```
- The wire type for sync protocol messages
- Contains `Vec<MessagePart<SignedEntry>>`
### ContentStatus
```rust
pub enum ContentStatus {
    Complete,   // Content blob fully available
    Incomplete, // Partially available
    Missing,    // Not available
}
```
- Communicated alongside entries during sync
- Helps peers decide whether to download content
### InsertOrigin
```rust
pub enum InsertOrigin {
    Local,
    Sync {
        from: PeerIdBytes, // [u8; 32], the remote peer
        remote_content_status: ContentStatus,
    },
}
```
## Event Types
### Event (Internal)
```rust
pub enum Event {
    LocalInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
    },
    RemoteInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
        from: PeerIdBytes,
        should_download: bool,
        remote_content_status: ContentStatus,
    },
}
```
- Emitted by `Replica` via `ReplicaInfo` subscribers
- `should_download` is determined by the `DownloadPolicy`
### LiveEvent (Public)
```rust
pub enum LiveEvent {
    InsertLocal { entry: Entry },
    InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
    ContentReady { hash: Hash },
    PendingContentReady,
    NeighborUp(PublicKey),
    NeighborDown(PublicKey),
    SyncFinished(SyncEvent),
}
```
- Emitted by the `Engine` through `subscribe()`
- `InsertLocal` / `InsertRemote` are derived from `Event` by stripping the signatures (`SignedEntry` → `Entry`)
- `ContentReady` is emitted when a blob download completes
- `SyncFinished` wraps a `SyncEvent` from the network layer
## Store Types
### Store (store::fs::Store)
```rust
pub struct Store {
    db: Database,                        // redb database
    transaction: CurrentTransaction,     // Current read/write transaction
    open_replicas: HashSet<NamespaceId>, // Track which replicas are open
    pubkeys: MemPublicKeyStore,          // Cache for expanded public keys
}
```
### Query
```rust
pub struct Query {
    kind: QueryKind,             // Flat or SingleLatestPerKey
    filter_author: AuthorFilter, // Any or Exact
    filter_key: KeyFilter,       // Any, Exact, or Prefix
    limit: Option<u64>,
    offset: u64,
    include_empty: bool,
    sort_direction: SortDirection,
}
```
### Capability
```rust
pub enum Capability {
    Write(NamespaceSecret),
    Read(NamespaceId),
}
```
- `Write` allows inserting entries and signing them
- `Read` allows syncing and reading but not inserting
- Can be serialized as `(u8, [u8; 32])` — kind byte + key bytes
- `merge()` can upgrade `Read` to `Write`
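
The upgrade rule can be sketched as follows. Everything here is an assumption made for illustration: `NamespaceSecret` is a stand-in that stores its id directly (the real type derives it from an Ed25519 key), and the return type and error value of `merge` are hypothetical.

```rust
type NamespaceId = [u8; 32];

/// Stand-in for `NamespaceSecret`; the id is stored directly to keep the
/// sketch free of cryptography.
#[derive(Debug, Clone, PartialEq)]
struct NamespaceSecret {
    id: NamespaceId,
}

#[derive(Debug, Clone, PartialEq)]
enum Capability {
    Write(NamespaceSecret),
    Read(NamespaceId),
}

impl Capability {
    fn id(&self) -> NamespaceId {
        match self {
            Capability::Write(secret) => secret.id,
            Capability::Read(id) => *id,
        }
    }

    /// Returns `Ok(true)` if `self` was upgraded from `Read` to `Write`.
    /// Capabilities for different namespaces cannot be merged.
    fn merge(&mut self, other: Capability) -> Result<bool, &'static str> {
        if self.id() != other.id() {
            return Err("namespace mismatch");
        }
        if matches!(self, Capability::Read(_)) && matches!(other, Capability::Write(_)) {
            *self = other;
            Ok(true)
        } else {
            Ok(false)
        }
    }
}

fn main() {
    let ns = [7u8; 32];
    let mut cap = Capability::Read(ns);

    // Read + Write for the same namespace upgrades to Write.
    let upgraded = cap.merge(Capability::Write(NamespaceSecret { id: ns }));
    assert_eq!(upgraded, Ok(true));
    assert!(matches!(cap, Capability::Write(_)));

    // Merging a Read into a Write never downgrades.
    assert_eq!(cap.merge(Capability::Read(ns)), Ok(false));
    assert!(matches!(cap, Capability::Write(_)));

    // A capability for another namespace is rejected.
    assert!(cap.merge(Capability::Read([8u8; 32])).is_err());
}
```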
### DownloadPolicy
```rust
pub enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),    // Whitelist mode
    EverythingExcept(Vec<FilterKind>), // Blacklist mode (default)
}
```
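
How the two modes decide whether to fetch an entry's content can be sketched like this. The variants of `FilterKind` (`Prefix`, `Exact`) and the method name `should_download` are assumptions for illustration; keys are plain byte vectors.

```rust
/// Assumed shape of `FilterKind`: match a key by prefix or exactly.
enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(prefix) => key.starts_with(prefix),
            FilterKind::Exact(exact) => key == exact.as_slice(),
        }
    }
}

enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    /// True if content for an entry with this key should be downloaded.
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            // Whitelist: download only keys that match a filter.
            DownloadPolicy::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            // Blacklist: download unless a filter matches.
            DownloadPolicy::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}

fn main() {
    // An empty blacklist downloads everything.
    let default_policy = DownloadPolicy::EverythingExcept(vec![]);
    assert!(default_policy.should_download(b"any/key"));

    let only_docs = DownloadPolicy::NothingExcept(vec![FilterKind::Prefix(b"docs/".to_vec())]);
    assert!(only_docs.should_download(b"docs/readme"));
    assert!(!only_docs.should_download(b"tmp/readme"));

    let skip_big = DownloadPolicy::EverythingExcept(vec![FilterKind::Exact(b"big".to_vec())]);
    assert!(!skip_big.should_download(b"big"));
    assert!(skip_big.should_download(b"bigger"));
}
```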
### DocTicket
```rust
pub struct DocTicket {
    pub capability: Capability,
    pub nodes: Vec<EndpointAddr>,
}
```
- Serializable as a base32 string with "doc" prefix
- Contains everything needed to join a document
- The wire format uses a versioned enum: `TicketWireFormat::Variant0(DocTicket)`
## OpenState
```rust
pub struct OpenState {
    pub sync: bool,         // Whether sync is enabled
    pub subscribers: usize, // Number of event subscribers
    pub handles: usize,     // Number of open handles
}
```
Returned by the `Status` RPC method to report the state of an open document.
## Utility Constants
| Constant | Value | Purpose |
|----------|-------|---------|
| `MAX_TIMESTAMP_FUTURE_SHIFT` | 10 min in μs | Max future drift for entry timestamps |
| `MAX_COMMIT_DELAY` | 500ms | Auto-commit interval for store transactions |
| `ACTION_CAP` | 1024 | Bounded channel capacity for SyncHandle actions |
| `ACTOR_CHANNEL_CAP` | 64 | Channel capacity for LiveActor messages |
| `SUBSCRIBE_CHANNEL_CAP` | 256 | Channel capacity for event subscriptions |
| `PEERS_PER_DOC_CACHE_SIZE` | 5 | LRU cache size for sync peers per document |
| `MAX_MESSAGE_SIZE` | 1 GiB | Max wire message size |
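
As an illustration of how the first constant is meant to be used, the sketch below rejects entries whose timestamp lies too far ahead of the local clock. The function names are hypothetical, and the constant is written out as 10 minutes in microseconds to match the table.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// 10 minutes, in microseconds (the unit of entry timestamps).
const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * 1_000_000;

/// Hypothetical check: accept an entry only if its timestamp is at most
/// `MAX_TIMESTAMP_FUTURE_SHIFT` ahead of `now` (both in microseconds).
fn validate_timestamp(entry_ts: u64, now: u64) -> Result<(), &'static str> {
    if entry_ts > now.saturating_add(MAX_TIMESTAMP_FUTURE_SHIFT) {
        Err("entry timestamp too far in the future")
    } else {
        Ok(())
    }
}

/// Current wall-clock time in microseconds since the Unix epoch.
fn now_micros() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_micros() as u64
}

fn main() {
    let now = now_micros();
    // Five minutes ahead is tolerated, eleven minutes is not.
    assert!(validate_timestamp(now + 5 * 60 * 1_000_000, now).is_ok());
    assert!(validate_timestamp(now + 11 * 60 * 1_000_000, now).is_err());
    println!("max future shift: {} µs", MAX_TIMESTAMP_FUTURE_SHIFT);
}
```

Bounding future timestamps limits how far a peer with a skewed or malicious clock can get ahead in a last-writer-wins ordering.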

View File

@@ -0,0 +1,59 @@
# iroh-docs Reference Documentation
> Version: 0.98.0
> Repository: https://github.com/n0-computer/iroh-docs
> License: MIT/Apache-2.0
> Based on: [Range-Based Set Reconciliation (Meyer, 2022)](https://arxiv.org/abs/2212.13567)
## Document Index
| # | File | Topic |
|---|------|-------|
| 01 | [Overview and Architecture](01-overview-and-architecture.md) | High-level architecture, module layout, dependencies, feature flags |
| 02 | [Document Model](02-document-model.md) | CRDT data model: namespaces, authors, entries, signatures, prefix deletion, timestamps |
| 03 | [Sync Protocol](03-sync-protocol.md) | Range-based set reconciliation algorithm, fingerprints, message format, Store trait |
| 04 | [Store and Persistence](04-store-and-persistence.md) | redb table schema, transaction model, queries, download policies, PublicKeyStore |
| 05 | [Engine and Live Sync](05-engine-and-live-sync.md) | Engine, LiveActor, GossipState, content download, event system, DefaultAuthor |
| 06 | [Network Protocol](06-network-protocol.md) | ALPN, wire format, Alice/Bob protocol flow, error types, gossip integration |
| 07 | [API and Data Flow](07-api-and-data-flow.md) | RPC API, DocsApi, protocol messages, data flow diagrams |
| 08 | [Key Types Reference](08-key-types-reference.md) | All public types, constants, and their relationships |
## Quick Reference
### Core Concepts
- **Namespace**: A document identity. Identified by `NamespaceId` (32 bytes), backed by an Ed25519 keypair (`NamespaceSecret`).
- **Author**: A writer identity. Identified by `AuthorId` (32 bytes), backed by an Ed25519 keypair (`Author`).
- **Entry**: A record identified by (namespace, author, key) with a value of (hash, len, timestamp).
- **SignedEntry**: An entry with dual Ed25519 signatures (namespace + author) proving authorization and authorship.
- **Replica**: A local instance of a document, holding entries in a store.
- **Capability**: Either `Write(NamespaceSecret)` or `Read(NamespaceId)` — controls whether entries can be inserted.
- **Store**: A `redb`-backed persistent store managing authors, namespaces, entries, and peer caches.
- **Engine**: Coordinates sync actors, gossip, and content downloads for live synchronization.
### Key Algorithms
1. **Range-based set reconciliation**: Efficiently compute the union of two entry sets over a network by comparing fingerprints of partitions, subdividing when fingerprints differ.
2. **Prefix deletion**: An entry at key "foo" acts as a tombstone for all entries whose key starts with "foo/".
3. **Last-writer-wins**: When entries conflict on the same (namespace, author, key), the one with the higher (timestamp, hash) wins.
4. **XOR fingerprints**: Fingerprint of a set is the XOR of individual entry fingerprints (BLAKE3 hashes of key data).
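
Items 1 and 4 can be illustrated together with a single-process simulation. This is a sketch under strong simplifications, not the crate's algorithm: entries are plain `u64` keys, the per-entry fingerprint uses the standard library hasher as a stand-in for BLAKE3, and both sets are mutated directly instead of exchanging messages over the network. All function names are illustrative.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

/// Per-entry fingerprint. Stand-in for a BLAKE3 digest of the entry's key data.
fn entry_fp(key: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Fingerprint of all entries in `[lo, hi)`: the XOR of their fingerprints.
/// XOR makes the result independent of iteration order.
fn range_fp(set: &BTreeSet<u64>, lo: u64, hi: u64) -> u64 {
    set.range(lo..hi).fold(0, |acc, key| acc ^ entry_fp(*key))
}

/// Brings `a` and `b` to their union within `[lo, hi)`.
fn reconcile(a: &mut BTreeSet<u64>, b: &mut BTreeSet<u64>, lo: u64, hi: u64) {
    if range_fp(a, lo, hi) == range_fp(b, lo, hi) {
        return; // Fingerprints match: treat the range as already in sync.
    }
    let larger = a.range(lo..hi).count().max(b.range(lo..hi).count());
    if larger <= 2 || hi - lo <= 1 {
        // Small range: exchange the entries themselves.
        let from_a: Vec<u64> = a.range(lo..hi).copied().collect();
        let from_b: Vec<u64> = b.range(lo..hi).copied().collect();
        a.extend(from_b);
        b.extend(from_a);
        return;
    }
    // Fingerprints differ on a larger range: split it and recurse.
    let mid = lo + (hi - lo) / 2;
    reconcile(a, b, lo, mid);
    reconcile(a, b, mid, hi);
}

fn main() {
    let mut a = BTreeSet::from([1u64, 5, 9, 20]);
    let mut b = BTreeSet::from([5u64, 9, 30, 40, 41]);
    reconcile(&mut a, &mut b, 0, 100);
    assert_eq!(a, b);
    println!("both sides now hold {:?}", a);
}
```

Ranges whose fingerprints already agree are never descended into, which is where the savings come from when two replicas differ in only a few entries.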
### Data Flow
```
Application → DocsApi → Engine → LiveActor → GossipState → iroh-gossip
↓ ↓
SyncHandle → Actor → Store (redb) ← QUIC streams (iroh)
iroh-blobs (content transfer)
```
### Dependencies
- `iroh` — QUIC networking
- `iroh-blobs` — Content-addressed blob storage and transfer
- `iroh-gossip` — Gossip protocol for live updates
- `redb` — Embedded key-value store
- `ed25519-dalek` — Ed25519 signatures
- `blake3` — Hashing
- `postcard` — Serialization

View File

@@ -0,0 +1,79 @@
# iroh-gossip: Overview & Architecture
## What Is iroh-gossip?
`iroh-gossip` is a Rust crate that implements an **epidemic broadcast tree** protocol for disseminating messages among a swarm of peers interested in a common **topic**. It is based on two academic papers:
- **HyParView** — A hybrid partial view membership protocol for reliable swarm management ([paper](https://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf))
- **PlumTree** — An epidemic broadcast tree protocol for efficient message dissemination ([paper](https://asc.di.fct.unl.pt/~jleitao/pdf/srds07-leitao.pdf))
The crate is designed as a protocol layer for the [iroh](https://docs.rs/iroh) networking library, but the core protocol logic is **IO-free** and can be used independently.
## High-Level Architecture
The crate is organized into two primary modules:
| Module | Purpose | IO-aware? |
|--------|---------|-----------|
| `proto` | Pure state-machine implementation of the gossip protocol | No — completely IO-free |
| `net` | Networking layer that runs the protocol over iroh connections | Yes — depends on `iroh` and tokio |
The `net` module is behind the `net` feature flag (enabled by default). An optional `rpc` feature adds remote procedure call support via the `irpc`/`noq` crates.
### Module Dependency Graph
```
┌──────────────┐
│     api      │  ← Public API (Gossip, GossipTopic, GossipSender, GossipReceiver)
└──────┬───────┘
┌──────▼───────┐
│     net      │  ← Networking actor, connection loops, dialer
└──────┬───────┘
┌──────▼───────┐
│    proto     │  ← Pure protocol state machines
│ ┌─────────┐  │
│ │hyparview│  │  ← Membership layer
│ ├─────────┤  │
│ │ plumtree│  │  ← Broadcast layer
│ ├─────────┤  │
│ │  topic  │  │  ← Per-topic coordinator
│ ├─────────┤  │
│ │  state  │  │  ← Multi-topic state manager
│ ├─────────┤  │
│ │  util   │  │  ← Shared data structures (IndexSet, TimeBoundCache, TimerMap)
│ └─────────┘  │
└──────────────┘
```
### Key Design Principles
1. **IO-free protocol core**: The `proto` module is a pure state machine. It takes `InEvent`s, produces `OutEvent`s, and has no knowledge of sockets, async runtimes, or network IO.
2. **Topic-based isolation**: Each topic (`TopicId` = 32-byte identifier) has completely independent state. Topics are separate swarms and broadcast scopes. Joining multiple topics increases connections and routing table size proportionally.
3. **Actor model for networking**: The `net` module runs a single async `Actor` that manages all topics, connections, and timers. It bridges between the protocol state machine and real network IO.
4. **Wire protocol**: Messages are serialized with `postcard` (a `no_std`-friendly serde format) and sent over QUIC streams via iroh connections. Each stream is prefixed with a `StreamHeader` containing the topic ID.
## Crate Features
| Feature | Default? | Description |
|---------|----------|-------------|
| `net` | Yes | Networking layer (requires `iroh`, `tokio`, etc.) |
| `rpc` | No | RPC support via `irpc`/`noq` for remote control |
| `metrics` | Yes | Prometheus-style metrics via `iroh-metrics` |
| `test-utils` | No | Test utilities (seeded RNG, etc.) |
| `simulator` | No | CLI simulator for testing |
| `examples` | No | Example binaries (chat, setup) |
## Cargo Dependencies (Key Ones)
- `iroh` / `iroh-base` — Networking primitives (Endpoint, EndpointId, PublicKey, etc.)
- `postcard` — Wire serialization (serde-based, `no_std` compatible)
- `blake3` — Message ID hashing
- `ed25519-dalek` — Cryptographic signatures
- `n0-future` / `n0-error` — Async utilities and error handling
- `irpc` / `noq` — RPC infrastructure (optional)
- `indexmap` — Order-preserving hash collections used in `IndexSet`

View File

@@ -0,0 +1,169 @@
# iroh-gossip: HyParView Membership Protocol
## Overview
The HyParView protocol provides **swarm membership management** — it maintains which peers are currently part of the swarm for a given topic and ensures the overlay network remains connected even as nodes join, leave, or fail.
It is implemented in `src/proto/hyparview.rs`.
## Core Concept: Two Views
Each peer maintains two sets of peers:
| View | Description | Default Size | Connection? |
|------|-------------|--------------|-------------|
| **Active View** | Peers we maintain active bidirectional connections to | 5 | Yes — TCP/QUIC connection is kept open |
| **Passive View** | An address book of peers we know about but are not connected to | 30 | No — just contact information |
Key invariants:
- **Active connections are always bidirectional**: If peer A has peer B in its active view, peer B also has peer A in its active view.
- The passive view serves as a **failover pool**: When an active peer disconnects, a random peer from the passive view is promoted to fill the slot.
## Configuration (`hyparview::Config`)
```rust
pub struct Config {
    pub active_view_capacity: usize,        // Default: 5
    pub passive_view_capacity: usize,       // Default: 30
    pub active_random_walk_length: Ttl,     // Default: Ttl(6)
    pub passive_random_walk_length: Ttl,    // Default: Ttl(3)
    pub shuffle_random_walk_length: Ttl,    // Default: Ttl(6)
    pub shuffle_active_view_count: usize,   // Default: 3
    pub shuffle_passive_view_count: usize,  // Default: 4
    pub shuffle_interval: Duration,         // Default: 60s
    pub neighbor_request_timeout: Duration, // Default: 500ms
}
```
These defaults come directly from the HyParView paper (p9), except for `shuffle_interval` and `neighbor_request_timeout` which are "wild guesses" in the code.
## State Structure
```rust
pub struct State<PI, RG = ThreadRng> {
    me: PI,                                 // Our peer identity
    me_data: Option<PeerData>,              // Opaque data we share with peers
    pub active_view: IndexSet<PI>,          // Connected peers
    pub passive_view: IndexSet<PI>,         // Known but disconnected peers
    config: Config,
    shuffle_scheduled: bool,                // Whether shuffle timer is active
    rng: RG,                                // Random number generator
    stats: Stats,
    pending_neighbor_requests: HashSet<PI>, // Peers we've sent Neighbor to but no reply yet
    peer_data: HashMap<PI, PeerData>,       // Opaque data received from other peers
    alive_disconnect_peers: HashSet<PI>,    // Peers disconnecting but to keep in passive view
}
```
## Messages (`hyparview::Message`)
| Message | Direction | Purpose |
|---------|-----------|---------|
| `Join(Option<PeerData>)` | New node → Contact | Sent to a known peer to join the swarm |
| `ForwardJoin(ForwardJoin)` | Propagated | Forwarded to active view to introduce a new member |
| `Neighbor(Neighbor)` | Bidirectional | Request to add sender to active view (with priority) |
| `Disconnect(Disconnect)` | Bidirectional | Notification that a peer is leaving or being demoted |
| `Shuffle(Shuffle)` | Initiated periodically | Sent to random peer to exchange passive view contacts |
| `ShuffleReply(ShuffleReply)` | Reply to Shuffle | Returns a random subset of our views to the origin |
### Message Details
```rust
pub struct ForwardJoin<PI> {
    peer: PeerInfo<PI>, // The new peer's identity + optional data
    ttl: Ttl,           // Time-to-live, decremented per hop
}

pub struct Shuffle<PI> {
    origin: PI,               // Who initiated the shuffle
    nodes: Vec<PeerInfo<PI>>, // Random subset of our views
    ttl: Ttl,                 // Time-to-live for the random walk
}

pub struct Neighbor {
    priority: Priority, // High (cannot be denied) or Low (can be denied)
    data: Option<PeerData>,
}

pub struct Disconnect {
    alive: bool,    // If true, peer is still alive (just demoting)
    _respond: bool, // Obsolete, kept for wire compat
}
```
## Join Procedure (Step by Step)
1. A new node sends `Join(me_data)` to a known contact peer.
2. The contact peer adds the new node to its active view (even evicting a random peer if necessary).
3. The contact peer forwards `ForwardJoin` to all other peers in its active view with `TTL = active_random_walk_length`.
4. Each peer receiving `ForwardJoin`:
   - If `TTL == 0` or the active view has ≤1 peer: sends `Neighbor(High)` to the new node to establish an active connection, and stops forwarding.
   - Otherwise, if `TTL == passive_random_walk_length`: adds the new node to the passive view.
   - Unless the walk ended in the first case: decrements the TTL and forwards the message to a random active peer (different from the sender).
5. The `Neighbor` message establishes the bidirectional active connection. A `Priority::High` neighbor request **must** be accepted (potentially evicting a random active peer). A `Priority::Low` request is only accepted if there is room.
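
The per-hop decision in step 4 can be written as a small pure function. This is a sketch that reads the three sub-cases as "accept and stop" versus "forward, optionally remembering the peer passively"; the `Action` enum, the function name, and the use of `u8` for the TTL are assumptions for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    /// Send `Neighbor(High)` to the joining peer; the random walk ends here.
    SendNeighborHigh,
    /// Forward to a random active peer with the decremented TTL, optionally
    /// adding the joining peer to the passive view first.
    Forward { ttl: u8, add_to_passive: bool },
}

fn on_forward_join(ttl: u8, active_view_len: usize, passive_random_walk_length: u8) -> Action {
    if ttl == 0 || active_view_len <= 1 {
        Action::SendNeighborHigh
    } else {
        Action::Forward {
            ttl: ttl - 1,
            add_to_passive: ttl == passive_random_walk_length,
        }
    }
}

fn main() {
    // Defaults from the configuration above: walk lengths 6 (active), 3 (passive).
    let passive_len = 3;

    // Fresh ForwardJoin on a well-connected peer: just keep walking.
    assert_eq!(
        on_forward_join(6, 5, passive_len),
        Action::Forward { ttl: 5, add_to_passive: false }
    );
    // At the passive walk length the peer is remembered, and the walk continues.
    assert_eq!(
        on_forward_join(3, 5, passive_len),
        Action::Forward { ttl: 2, add_to_passive: true }
    );
    // Expired TTL or a nearly empty active view: accept the peer.
    assert_eq!(on_forward_join(0, 5, passive_len), Action::SendNeighborHigh);
    assert_eq!(on_forward_join(6, 1, passive_len), Action::SendNeighborHigh);
}
```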
## Shuffle Mechanism
Periodically (every `shuffle_interval`), each node:
1. Picks a random active peer.
2. Sends `Shuffle` containing a random subset of active + passive views plus the origin's info, with a TTL.
3. The shuffle message does a random walk (each hop decrements TTL).
4. When TTL reaches 0 or the active view is ≤1, the peer accepts the shuffle and replies with `ShuffleReply` containing its own random peers.
5. The origin receives `ShuffleReply` and adds new peers to its passive view.
This ensures the passive view remains fresh and provides good connectivity even in dynamic networks.
## Failure Recovery
When a peer in the active view disconnects (detected via `PeerDisconnected`):
1. The peer is removed from the active view.
2. A `NeighborDown` event is emitted.
3. A random peer from the passive view is selected and sent a `Neighbor` request: `Low` priority normally, `High` if the active view is now empty.
4. If that peer doesn't respond within `neighbor_request_timeout`, it's removed from the passive view and another peer is tried.
5. This continues until a connection is established or the passive view is exhausted.
If a `Disconnect(alive=true)` message is received:
- The peer is moved to the passive view (not just dropped), because it's still alive.
- The `alive_disconnect_peers` set tracks which disconnected peers should be retained in passive view when their connection eventually closes.
## PeerData
`PeerData` is an opaque `Bytes` type that peers exchange when joining. In the `net` module, it is used to serialize and transmit addressing information (`AddrInfo`):
```rust
struct AddrInfo {
    relay_url: Option<RelayUrl>,
    direct_addresses: BTreeSet<SocketAddr>,
}
```
This allows the gossip protocol itself to help propagate connectivity information, enabling the `GossipAddressLookup` service to feed addresses back into iroh's endpoint discovery system.
## Events (`hyparview::Event`)
| Event | Meaning |
|-------|---------|
| `NeighborUp(PI)` | A peer was added to our active view |
| `NeighborDown(PI)` | A peer was removed from our active view |
These events are forwarded up to the PlumTree layer and to the application.
## Timers
| Timer | Purpose |
|-------|---------|
| `DoShuffle` | Periodically trigger a shuffle operation |
| `PendingNeighborRequest(PI)` | Timeout for a pending neighbor request |
## IO Trait Pattern
The HyParView state machine is generic over an `IO` trait:
```rust
pub trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}
```
This allows the protocol to emit output events without knowing about the networking layer. The upper layers supply a `VecDeque<OutEvent>` or similar container.
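
A minimal sketch of the pattern, assuming a reduced `OutEvent` with two variants and a toy `Membership` state machine. Only the `IO` trait signature is taken from the text above; everything else is illustrative.

```rust
use std::collections::VecDeque;

/// Reduced stand-in for the protocol's output events.
#[derive(Debug, PartialEq)]
enum OutEvent<PI> {
    SendMessage(PI, String),
    NeighborUp(PI),
}

trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}

/// A plain queue is enough to collect output events.
impl<PI: Clone> IO<PI> for VecDeque<OutEvent<PI>> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>) {
        self.push_back(event.into());
    }
}

/// Toy state machine: it mutates its own state and reports what should
/// happen next, but performs no IO itself.
struct Membership {
    active: Vec<u32>,
}

impl Membership {
    fn handle_join(&mut self, from: u32, io: &mut impl IO<u32>) {
        self.active.push(from);
        io.push(OutEvent::SendMessage(from, "welcome".to_string()));
        io.push(OutEvent::NeighborUp(from));
    }
}

fn main() {
    let mut out: VecDeque<OutEvent<u32>> = VecDeque::new();
    let mut state = Membership { active: Vec::new() };
    state.handle_join(7, &mut out);

    // The caller (here: main, in the crate: the networking layer) drains
    // the queue and decides how to act on each event.
    while let Some(event) = out.pop_front() {
        println!("{event:?}");
    }
    assert_eq!(state.active, vec![7]);
}
```

Because the state machine only appends to a queue, it can be driven deterministically in tests or a simulator without sockets or an async runtime.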

View File

@@ -0,0 +1,256 @@
# iroh-gossip: PlumTree Broadcast Protocol
## Overview
The PlumTree (Epidemic Broadcast Trees) protocol provides **efficient message broadcasting** across all peers in a topic's swarm. It builds on top of HyParView's membership layer, using the active view as its peer set.
It is implemented in `src/proto/plumtree.rs`.
## Core Concept: Eager vs Lazy Push
Each peer maintains two subsets of its HyParView active view:
| Set | Description | Behavior |
|-----|-------------|----------|
| **Eager push peers** | Peers to whom full messages are sent immediately | Messages are pushed eagerly (full content) |
| **Lazy push peers** | Peers to whom only message IDs (hashes) are sent | `IHave` announcements are sent; the receiver requests the content only if it is missing |
When a peer broadcasts a message:
1. The **full message** is pushed to all **eager** peers.
2. The **message ID** (a blake3 hash) is pushed to all **lazy** peers (after a short delay for batching).
This creates an **optimized broadcast tree**: eager peers form a spanning tree for low-latency delivery, while lazy peers provide redundancy through timeout-based recovery.
## Configuration (`plumtree::Config`)
```rust
pub struct Config {
    pub graft_timeout_1: Duration,         // Default: 80ms
    pub graft_timeout_2: Duration,         // Default: 40ms
    pub dispatch_timeout: Duration,        // Default: 5ms
    pub optimization_threshold: Round,     // Default: Round(7)
    pub message_cache_retention: Duration, // Default: 30s
    pub message_id_retention: Duration,    // Default: 90s
    pub cache_evict_interval: Duration,    // Default: 1s
}
```
### Timeout Semantics
- **`graft_timeout_1`**: After receiving an `IHave`, wait this long for the full message from an eager peer. If it doesn't arrive, send a `Graft` to the `IHave` sender.
- **`graft_timeout_2`**: After sending a `Graft`, wait this shorter timeout for the reply. If no reply, try the next `IHave` sender.
- **`dispatch_timeout`**: Delay before batching and sending `IHave` messages. This allows multiple announcements to be aggregated into a single message.
- **`optimization_threshold`**: Number of hops difference required to trigger tree optimization (see below).
### Cache Settings
- **`message_cache_retention`**: How long to keep full message payloads in cache. This enables replying to `Graft` requests from peers who missed the eager push.
- **`message_id_retention`**: How long to remember that we've already seen a message ID. This prevents re-delivering duplicate messages.
- **`cache_evict_interval`**: How often to check and evict expired entries.
## State Structure
```rust
pub struct State<PI> {
    me: PI,                                    // Our peer identity
    config: Config,                            // Protocol configuration
    pub eager_push_peers: BTreeSet<PI>,        // Full message delivery peers
    pub lazy_push_peers: BTreeSet<PI>,         // Message-ID-only delivery peers
    lazy_push_queue: BTreeMap<PI, Vec<IHave>>, // Pending IHave announcements (batched)
    missing_messages: HashMap<MessageId, VecDeque<(PI, Round)>>, // IHave senders awaiting delivery
    received_messages: TimeBoundCache<MessageId, ()>, // Seen message IDs
    cache: TimeBoundCache<MessageId, Gossip>,  // Full message payloads
    graft_timer_scheduled: HashSet<MessageId>, // Active graft timers
    dispatch_timer_scheduled: bool,            // Whether IHave dispatch is pending
    init: bool,                                // Whether first event was processed
    stats: Stats,                              // Message counters
    max_message_size: usize,                   // Maximum allowed message size
}
```
## Message Types (`plumtree::Message`)
| Message | Direction | Purpose |
|---------|-----------|---------|
| `Gossip(Gossip)` | Eager push | Full message content, broadcast to eager peers |
| `Prune` | Bidirectional | Sent when moving a peer from eager to lazy set |
| `Graft(Graft)` | Lazy → Eager upgrade | Request to become an eager peer; may include a message ID to request re-delivery |
| `IHave(Vec<IHave>)` | Lazy push | Announcement: "I have these messages" (batched, sent after `dispatch_timeout`) |
### Gossip Message Structure
```rust
pub struct Gossip {
    id: MessageId,        // blake3 hash of content
    content: Bytes,       // The actual message payload
    scope: DeliveryScope, // Swarm(round) or Neighbors
}
```
The `DeliveryScope` tracks how many hops the message has traveled:
```rust
pub enum DeliveryScope {
    Swarm(Round), // Delivered via the swarm; Round = hop count from origin
    Neighbors,    // Delivered only to direct neighbors (not forwarded further)
}
```
Each time a `Gossip` message is forwarded, its `Round` is incremented via `next_round()`. `Neighbors`-scope messages are not forwarded at all.
### IHave Structure
```rust
pub struct IHave {
    id: MessageId, // The blake3 hash of the message content
    round: Round,  // The hop count at which the sender received this message
}
```
### Graft Structure
```rust
pub struct Graft {
    id: Option<MessageId>, // If set, also reply with full message content
    round: Round,          // The round from the IHave that triggered this graft
}
```
### Message ID
```rust
pub struct MessageId([u8; 32]); // blake3 hash of message content

impl MessageId {
    pub fn from_content(message: &[u8]) -> Self {
        Self::from(blake3::hash(message))
    }
}
```
Messages are validated: when receiving a `Gossip`, the receiver checks that `MessageId::from_content(&content) == id`. Spoofed messages (where the hash doesn't match the content) are silently discarded.
## Broadcast Flow
### Sending a Message
```
1. Compute MessageId = blake3(content)
2. Create Gossip { id, content, scope: Swarm(Round(0)) or Neighbors }
3. If Swarm scope:
   a. Add to received_messages and cache
   b. Queue IHave for lazy peers (dispatched after dispatch_timeout)
4. Eager-push Gossip to all eager peers (except self and sender)
```
### Receiving a Gossip Message
```
1. Validate: message.id == blake3(message.content) → discard if invalid
2. If already received (in received_messages):
   → Send Prune to sender (move sender to lazy set)
   → Return (don't re-broadcast)
3. If Swarm scope:
   a. Add to received_messages
   b. Increment round (next_round)
   c. Add to cache (for Graft replies)
   d. Eager-push to all eager peers (except sender)
   e. Lazy-push IHave to all lazy peers (except sender)
   f. Check if any prior IHave senders had a shorter path → optimize tree
4. Emit Received event to application
```
### Receiving an IHave
```
For each IHave entry:
    If message ID not in received_messages:
        Add (sender, round) to missing_messages[message_id]
        If no graft timer scheduled for this message:
            Schedule SendGraft timer (graft_timeout_1)
```
### Graft Timer Expiry (Two-Phase)
**Phase 1 (`graft_timeout_1`):**
```
If message already received → no-op (cancel)
Otherwise:
    Pop first (peer, round) from missing_messages[message_id]
    Move peer to eager set
    Send Graft { id: Some(message_id), round } to that peer
    Schedule another SendGraft timer (graft_timeout_2) for fallback
```
**Phase 2 (`graft_timeout_2`):**
```
If message already received → no-op
Otherwise:
    Pop next (peer, round) from missing_messages[message_id]
    Move that peer to eager set
    Send Graft { id: Some(message_id), round }
    Schedule another SendGraft timer (graft_timeout_2)
    (continues until the message is received or senders are exhausted)
```
### Receiving a Graft
```
1. Move sender to eager set
2. If Graft contains a message ID:
   Look up message in cache
   If found: send Gossip(message) to the requesting peer
```
### Receiving a Prune
```
Move sender from eager set to lazy set
```
## Tree Optimization
The PlumTree self-optimizes based on path length, measured in hops (`Round`). When a `Gossip` message is received, if we previously received an `IHave` for the same message from a different peer, we check whether the `IHave` path was significantly shorter:
```
if (ihave_round < gossip_round) && (gossip_round - ihave_round) >= optimization_threshold:
    Graft the IHave sender (move to eager)
    Prune the Gossip sender (move to lazy)
```
This means that when a lazy peer offers a significantly shorter path to the message origin, it is promoted to eager and the longer-path peer is demoted. The `optimization_threshold` (default: 7 hops) prevents thrashing from minor path-length differences.
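
The condition translates directly into a predicate. In this sketch rounds are plain `u8` values rather than the crate's `Round` type, and the function name is illustrative.

```rust
/// True if the IHave sender's path is shorter than the Gossip sender's path
/// by at least `optimization_threshold` hops.
fn should_optimize(ihave_round: u8, gossip_round: u8, optimization_threshold: u8) -> bool {
    // The first comparison also guards the subtraction against underflow.
    ihave_round < gossip_round && (gossip_round - ihave_round) >= optimization_threshold
}

fn main() {
    let threshold = 7; // default from the configuration above

    // IHave arrived after 2 hops, Gossip after 10: 8 hops shorter, so swap.
    assert!(should_optimize(2, 10, threshold));
    // Exactly at the threshold still triggers.
    assert!(should_optimize(3, 10, threshold));
    // 6 hops shorter is not enough.
    assert!(!should_optimize(4, 10, threshold));
    // The eager path is already the shorter one: nothing to do.
    assert!(!should_optimize(10, 3, threshold));
}
```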
## Neighbor Events
PlumTree receives neighbor events from HyParView:
- **`NeighborUp(peer)`**: Add peer to eager set (all new neighbors start as eager)
- **`NeighborDown(peer)`**: Remove from both eager and lazy sets; clean up any `IHave` entries from this peer in `missing_messages`
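
The set transitions driven by `Prune`, `Graft`, and the neighbor events can be sketched as below. Peers are `u32`, message IDs are `u64`, and the struct and method names are assumptions for illustration; timers and message sending are left out.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

#[derive(Default)]
struct PeerSets {
    eager: BTreeSet<u32>,
    lazy: BTreeSet<u32>,
    /// message id -> IHave senders (peer, round) still waiting to be grafted
    missing_messages: HashMap<u64, VecDeque<(u32, u8)>>,
}

impl PeerSets {
    /// New neighbors start out as eager peers.
    fn neighbor_up(&mut self, peer: u32) {
        self.lazy.remove(&peer);
        self.eager.insert(peer);
    }

    /// A lost neighbor leaves both sets and can no longer be grafted.
    fn neighbor_down(&mut self, peer: u32) {
        self.eager.remove(&peer);
        self.lazy.remove(&peer);
        for senders in self.missing_messages.values_mut() {
            senders.retain(|(sender, _)| *sender != peer);
        }
    }

    /// Received `Prune`: the sender only wants announcements from now on.
    fn on_prune(&mut self, peer: u32) {
        self.eager.remove(&peer);
        self.lazy.insert(peer);
    }

    /// Received `Graft`: the sender wants full messages again.
    fn on_graft(&mut self, peer: u32) {
        self.lazy.remove(&peer);
        self.eager.insert(peer);
    }
}

fn main() {
    let mut sets = PeerSets::default();
    sets.neighbor_up(1);
    sets.neighbor_up(2);

    sets.on_prune(1);
    assert!(sets.lazy.contains(&1) && !sets.eager.contains(&1));

    sets.on_graft(1);
    assert!(sets.eager.contains(&1) && !sets.lazy.contains(&1));

    // Peer 2 announced message 42 and then disappeared.
    sets.missing_messages.entry(42).or_default().push_back((2, 3));
    sets.neighbor_down(2);
    assert!(sets.missing_messages[&42].is_empty());
    println!("eager: {:?}, lazy: {:?}", sets.eager, sets.lazy);
}
```

A peer is always in at most one of the two sets, which is why every transition removes from one set before inserting into the other.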
## Neighbor-Only Broadcast
The `Scope::Neighbors` broadcast scope sends a message only to directly connected peers (the active view), without any forwarding:
```rust
pub enum Scope {
    Swarm,     // Broadcast to all peers in the swarm
    Neighbors, // Broadcast only to immediate neighbors
}
```
Neighbor-scoped messages are useful for localized communication and are not cached or re-broadcast.
## Cache Management
The PlumTree maintains two time-bounded caches:
1. **`cache`** (`TimeBoundCache<MessageId, Gossip>`): Stores full message payloads for `message_cache_retention` (default 30s). This enables replying to `Graft` requests for recently-broadcast messages.
2. **`received_messages`** (`TimeBoundCache<MessageId, ()>`): Tracks which messages have been seen for `message_id_retention` (default 90s). This prevents duplicate delivery.
Both caches are periodically evicted (every `cache_evict_interval`, default 1s) via the `EvictCache` timer.
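
A time-bounded cache of this kind can be sketched with a hash map that stores an expiry instant next to each value. This is a simplified stand-in, not the crate's `TimeBoundCache`: eviction here scans all entries, and the method names are assumptions.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

struct TimeBoundCache<K, V> {
    /// Each value is stored together with the instant at which it expires.
    entries: HashMap<K, (Instant, V)>,
}

impl<K: Hash + Eq, V> TimeBoundCache<K, V> {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    fn insert(&mut self, key: K, value: V, expires_at: Instant) {
        self.entries.insert(key, (expires_at, value));
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.entries.get(key).map(|(_, value)| value)
    }

    /// Drops every entry that expires at or before `now`.
    /// Returns the number of removed entries.
    fn expire_until(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, (expiry, _)| *expiry > now);
        before - self.entries.len()
    }
}

fn main() {
    let start = Instant::now();
    let mut payloads = TimeBoundCache::new();
    let mut seen_ids = TimeBoundCache::new();

    // Same message, two retention periods (30s payload, 90s id).
    payloads.insert(1u64, "hello", start + Duration::from_secs(30));
    seen_ids.insert(1u64, (), start + Duration::from_secs(90));

    // 31 seconds later the payload is gone, but the id is still remembered,
    // so a late duplicate is recognized even though it can no longer be
    // served in reply to a Graft.
    let later = start + Duration::from_secs(31);
    assert_eq!(payloads.expire_until(later), 1);
    assert_eq!(seen_ids.expire_until(later), 0);
    assert!(payloads.get(&1).is_none());
    assert!(seen_ids.get(&1).is_some());
}
```

Passing `now` in from the caller, rather than reading the clock inside the cache, keeps the structure usable from an IO-free state machine.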

@@ -0,0 +1,187 @@
# iroh-gossip: Protocol State & Topic Coordination
## Overview
The `state` module (`src/proto/state.rs`) provides the **top-level protocol state machine** that manages multiple topics. The `topic` module (`src/proto/topic.rs`) coordinates the HyParView and PlumTree state machines for a single topic.
## Multi-Topic State (`state::State`)
```rust
pub struct State<PI, R> {
me: PI, // Our peer identity
me_data: PeerData, // Our opaque peer data
config: Config, // Protocol configuration
rng: R, // Random number generator
states: HashMap<TopicId, topic::State<PI, R>>, // Per-topic state
outbox: Outbox<PI>, // Buffered output events
peer_topics: ConnsMap<PI>, // Maps peer → set of shared topics
}
```
The `State` acts as a **multiplexer** — it routes events to the correct topic's state and collects output events. It also tracks which topics are shared with each peer (in `peer_topics`), which is used to determine when a peer connection can safely be closed (only when no topic still needs it).
### TopicId
```rust
#[derive(Clone, Copy, Eq, PartialEq, Hash, Serialize, Ord, PartialOrd, Deserialize)]
pub struct TopicId([u8; 32]);
```
A 32-byte identifier for a topic. Typically created as `blake3::hash(topic_name)` or from raw bytes. Each topic is an independent swarm and broadcast scope.
### Wire Message Format
```rust
pub struct Message<PI> {
pub topic: TopicId,
pub message: topic::Message<PI>,
}
```
Every wire message carries the `TopicId` prefix, allowing multiplexing of multiple topics over a single connection.
### Event Routing
`InEvent` is mapped to either a topic-specific event or a global event:
| InEvent | Routing |
|---------|---------|
| `RecvMessage(from, Message{topic, message})` | → Topic-specific: `topic::InEvent::RecvMessage` |
| `Command(topic, command)` | → Topic-specific: `topic::InEvent::Command` |
| `TimerExpired(Timer{topic, timer})` | → Topic-specific: `topic::InEvent::TimerExpired` |
| `PeerDisconnected(peer)` | → Broadcast to ALL topics |
| `UpdatePeerData(data)` | → Broadcast to ALL topics |
### Topic Lifecycle
When a `Command::Join(peers)` is received for a topic that doesn't yet have state, a new `topic::State` is automatically created. When `Command::Quit` is received, the topic's state is removed after processing the quit event.
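The lifecycle described above can be sketched with a plain `HashMap`. Everything here is a simplified stand-in: `Topics`, the one-byte topic id, and the integer peer ids are hypothetical, and ignoring commands for unknown topics is an assumption rather than documented behavior.

```rust
use std::collections::HashMap;

/// Simplified commands; peers are plain integers in this sketch.
enum Command {
    Join(Vec<u32>),
    Broadcast(Vec<u8>),
    Quit,
}

/// Placeholder for per-topic state; it only counts handled commands.
#[derive(Default)]
struct TopicState {
    handled: usize,
}

/// Hypothetical multiplexer keyed by a one-byte topic id.
#[derive(Default)]
struct Topics {
    states: HashMap<u8, TopicState>,
}

impl Topics {
    /// Returns `true` if the command reached a topic state.
    fn handle(&mut self, topic: u8, command: Command) -> bool {
        // A Join creates the topic state on demand.
        if let Command::Join(_) = command {
            self.states.entry(topic).or_default();
        }
        let is_quit = matches!(command, Command::Quit);
        let handled = match self.states.get_mut(&topic) {
            Some(state) => {
                state.handled += 1;
                true
            }
            // Assumption: commands for unknown topics are ignored.
            None => false,
        };
        // The state is dropped only after the quit itself was processed.
        if is_quit {
            self.states.remove(&topic);
        }
        handled
    }
}
```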
### Connection Management
When a `topic::OutEvent::DisconnectPeer(peer)` is emitted, the state module checks `peer_topics` to see if any other topic still needs a connection to that peer. Only when no topic needs the peer anymore is `OutEvent::DisconnectPeer(peer)` emitted at the top level.
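A minimal sketch of this bookkeeping follows. `PeerTopics` is a hypothetical stand-in for `ConnsMap`, with `u32` peers and `u8` topics replacing the real identifier types.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the `peer_topics` map: for each peer, the set of
/// topics that still need a connection to it.
#[derive(Default)]
struct PeerTopics {
    inner: HashMap<u32, HashSet<u8>>,
}

impl PeerTopics {
    /// Record that `topic` uses a connection to `peer`.
    fn insert(&mut self, peer: u32, topic: u8) {
        self.inner.entry(peer).or_default().insert(topic);
    }

    /// Called when `topic` no longer needs `peer`. Returns `true` only if no
    /// other topic still needs the peer, i.e. the connection may be closed.
    fn remove(&mut self, peer: u32, topic: u8) -> bool {
        match self.inner.get_mut(&peer) {
            Some(topics) => {
                topics.remove(&topic);
                if topics.is_empty() {
                    self.inner.remove(&peer);
                    true
                } else {
                    false
                }
            }
            // Unknown peer: nothing is holding the connection open.
            None => true,
        }
    }
}
```

In this sketch the top-level disconnect would be emitted only when `remove` returns `true`.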
## Topic State (`topic::State`)
```rust
pub struct State<PI, R> {
me: PI,
pub swarm: hyparview::State<PI, R>, // HyParView membership
pub gossip: plumtree::State<PI>, // PlumTree broadcast
outbox: VecDeque<OutEvent<PI>>,
stats: Stats,
}
```
The topic state **composes** HyParView and PlumTree, bridging them together:
### Event Forwarding
When `topic::State::handle()` is called:
1. **HyParView events** are processed first (membership layer).
2. **NeighborUp/NeighborDown events** emitted by HyParView are forwarded to PlumTree:
- `NeighborUp(peer)``plumtree::InEvent::NeighborUp(peer)` — adds peer to eager set
- `NeighborDown(peer)``plumtree::InEvent::NeighborDown(peer)` — removes peer from both sets
3. All output events from both layers are collected and returned.
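The forwarding step can be illustrated with two toy layers. This is a sketch under simplifying assumptions: `SwarmEvent`, `Broadcast`, and `forward` are hypothetical names, and peers are plain integers.

```rust
use std::collections::BTreeSet;

/// Events the membership layer reports upward (simplified).
enum SwarmEvent {
    NeighborUp(u32),
    NeighborDown(u32),
}

/// Simplified broadcast-layer state: just the eager and lazy peer sets.
#[derive(Default)]
struct Broadcast {
    eager: BTreeSet<u32>,
    lazy: BTreeSet<u32>,
}

impl Broadcast {
    fn handle(&mut self, event: &SwarmEvent) {
        match event {
            // New neighbors always start out as eager peers.
            SwarmEvent::NeighborUp(peer) => {
                self.eager.insert(*peer);
            }
            // A lost neighbor is dropped from both sets.
            SwarmEvent::NeighborDown(peer) => {
                self.eager.remove(peer);
                self.lazy.remove(peer);
            }
        }
    }
}

/// Forward every membership event to the broadcast layer, then return the
/// events so the caller can also surface them to the application.
fn forward(events: Vec<SwarmEvent>, broadcast: &mut Broadcast) -> Vec<SwarmEvent> {
    for event in &events {
        broadcast.handle(event);
    }
    events
}
```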
### Command Handling
| Command | Action |
|---------|--------|
| `Join(peers)` | Sends `RequestJoin(peer)` to HyParView for each peer in the list |
| `Broadcast(data, scope)` | Sends `Broadcast(data, scope)` to PlumTree |
| `Quit` | Sends `Quit` to HyParView (which sends `Disconnect` to all active peers) |
### Message Routing
When a topic message is received:
```rust
match message {
Message::Swarm(message) => hyparview.handle(RecvMessage(from, message)),
Message::Gossip(message) => plumtree.handle(RecvMessage(from, message)),
}
```
### Timer Routing
```rust
match timer {
Timer::Swarm(timer) => hyparview.handle(TimerExpired(timer)),
Timer::Gossip(timer) => plumtree.handle(TimerExpired(timer)),
}
```
## Topic Messages (`topic::Message`)
```rust
pub enum Message<PI> {
Swarm(hyparview::Message<PI>), // Membership messages
Gossip(plumtree::Message), // Broadcast messages
}
```
The message kind is used for metrics tracking:
```rust
pub fn kind(&self) -> MessageKind {
    match self {
        Message::Swarm(_) => MessageKind::Control,
        Message::Gossip(message) => match message {
            plumtree::Message::Gossip(_) => MessageKind::Data,
            _ => MessageKind::Control,
        },
    }
}
```
## Topic Events (`topic::Event`)
```rust
pub enum Event<PI> {
NeighborUp(PI), // From HyParView: new active neighbor
NeighborDown(PI), // From HyParView: lost active neighbor
Received(GossipEvent<PI>), // From PlumTree: received a gossip message
}
```
The `Received` event contains:
```rust
pub struct GossipEvent<PI> {
pub content: Bytes, // Message payload
pub delivered_from: PI, // Peer that delivered the message to us
pub scope: DeliveryScope, // Swarm(round) or Neighbors
}
```
## Topic Configuration
```rust
pub struct Config {
pub membership: hyparview::Config, // HyParView configuration
pub broadcast: plumtree::Config, // PlumTree configuration
pub max_message_size: usize, // Maximum wire message size (default: 4096)
}
```
The `max_message_size` is the total wire-level message size including headers. The actual payload capacity is computed as `max_message_size - postcard_header_size`, where the header size accounts for the topic ID and message envelope overhead.
## Statistics
Each topic tracks:
```rust
pub struct Stats {
pub messages_sent: usize,
pub messages_received: usize,
}
```
The PlumTree layer also tracks:
```rust
pub struct Stats {
pub payload_messages_received: u64,
pub control_messages_received: u64,
pub max_last_delivery_hop: u16,
}
```

@@ -0,0 +1,244 @@
# iroh-gossip: Networking Layer & Actor Model
## Overview
The `net` module (`src/net.rs` and submodules) provides the async runtime layer that connects the IO-free protocol state machine to real network IO via iroh QUIC connections. It is built around a **single Actor** that manages all topics and connections.
## ALPN Protocol
```rust
pub const GOSSIP_ALPN: &[u8] = b"/iroh-gossip/1";
```
This ALPN identifier is used when establishing QUIC connections through iroh.
## Gossip Handle (`net::Gossip`)
```rust
#[derive(Debug, Clone)]
pub struct Gossip {
pub(crate) inner: Arc<Inner>,
}
```
`Gossip` is the primary public handle. It derefs to `GossipApi`, providing the user-facing interface:
```rust
// Subscribe to a topic
let (sender, receiver) = gossip.subscribe(topic_id, bootstrap_peers).await?.split();
// Subscribe and wait for at least one connection
let topic = gossip.subscribe_and_join(topic_id, bootstrap_peers).await?;
// Broadcast a message
sender.broadcast(b"hello world".to_vec().into()).await?;
// Broadcast to neighbors only
sender.broadcast_neighbors(b"local announcement".to_vec().into()).await?;
// Join additional peers
sender.join_peers(vec![peer_id]).await?;
```
### Builder Pattern
```rust
let gossip = Gossip::builder()
.max_message_size(8192) // Default: 4096
.membership_config(hyparview_config) // HyParView settings
.broadcast_config(plumtree_config) // PlumTree settings
.alpn(b"/custom-alpn") // Custom ALPN (must match across network)
.spawn(endpoint);
```
## Architecture: The Actor
The core of the networking layer is the `Actor` struct, which runs as a single async task:
```rust
struct Actor {
alpn: Bytes,
state: proto::State<PublicKey, StdRng>, // Protocol state machine
endpoint: Endpoint, // iroh endpoint for connections
dialer: Dialer, // Manages outgoing connections
rpc_rx: mpsc::Receiver<RpcMessage>, // API commands
local_rx: mpsc::Receiver<LocalActorMessage>, // Local commands (connections, shutdown)
in_event_tx: mpsc::Sender<InEvent>, // Protocol input channel
in_event_rx: mpsc::Receiver<InEvent>, // Protocol input channel (receiver)
timers: Timers<Timer>, // Scheduled timers
topics: HashMap<TopicId, TopicState>, // Per-topic subscription state
peers: HashMap<EndpointId, PeerState>, // Per-peer connection state
command_rx: stream_group::Keyed<TopicCommandStream>, // Per-topic command streams
quit_queue: VecDeque<TopicId>, // Topics pending unsubscription
connection_tasks: JoinSet<...>, // Running connection loop tasks
metrics: Arc<Metrics>,
topic_event_forwarders: JoinSet<TopicId>, // Tasks forwarding events to subscribers
address_lookup: GossipAddressLookup, // Address discovery integration
}
```
### Event Loop
The actor's `run()` method calls `event_loop()` in a loop. Each iteration uses `tokio::select!` to handle:
| Source | Action |
|--------|--------|
| `local_rx` (local messages) | Handle incoming connections or shutdown |
| `rpc_rx` (RPC messages) | Process `Join` requests from the API |
| `command_rx` (per-topic commands) | Process `Broadcast`, `BroadcastNeighbors`, `JoinPeers`, or stream closure |
| `addr_updates` (endpoint addr changes) | Update our `PeerData` in the protocol state |
| `dialer` (connection establishment) | Handle successful/failed outgoing connections |
| `in_event_rx` (protocol events from connections) | Feed events to the protocol state machine |
| `timers` (scheduled timers) | Feed timer expirations to the protocol state machine |
| `connection_tasks` (connection task completions) | Handle peer disconnections |
| `topic_event_forwarders` (subscription tasks) | Handle topic cleanup when all subscribers drop |
### Processing InEvents
When an `InEvent` is processed, the actor calls `self.state.handle(event, now, metrics)`, which returns `Vec<OutEvent>`. For each `OutEvent`:
| OutEvent | Action |
|----------|--------|
| `SendMessage(peer, message)` | Send via peer's active connection or queue for pending connection |
| `EmitEvent(topic, event)` | Forward to topic's `broadcast::Sender` → subscribers |
| `ScheduleTimer(delay, timer)` | Schedule timer via `Timers` data structure |
| `DisconnectPeer(peer)` | Drop the peer's send channel, removing from `peers` map |
| `PeerData(endpoint_id, data)` | Decode `AddrInfo` from `PeerData`, add to `GossipAddressLookup` |
## Connection Management
### Peer States
```rust
enum PeerState {
    Pending {
        queue: Vec<ProtoMessage>, // Messages queued while connecting
    },
    Active {
        active_send_tx: mpsc::Sender<ProtoMessage>, // Current active send channel
        active_conn_id: ConnId, // Stable ID of active connection
        other_conns: Vec<ConnId>, // Older connections still closing
    },
}
```
When a message needs to be sent to a peer:
- **Active**: Send immediately via `active_send_tx`
- **Pending**: Queue the message and initiate a dial
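The queue-then-flush behavior can be sketched as follows. This is an illustration only: `String` stands in for `ProtoMessage`, a `Vec` stands in for the channel feeding the send loop, and the `activate` method is a hypothetical name for "the dial completed".

```rust
use std::mem;

/// Simplified peer state for illustration.
enum PeerState {
    /// Still connecting: messages are buffered.
    Pending { queue: Vec<String> },
    /// Connected: `sent` models the outgoing channel.
    Active { sent: Vec<String> },
}

impl PeerState {
    /// Send now if connected, otherwise buffer until the dial completes.
    fn send(&mut self, msg: String) {
        match self {
            PeerState::Pending { queue } => queue.push(msg),
            PeerState::Active { sent } => sent.push(msg),
        }
    }

    /// Called once the connection is established: the queued messages go out
    /// first, in their original order.
    fn activate(&mut self) {
        if let PeerState::Pending { queue } = self {
            let sent = mem::take(queue);
            *self = PeerState::Active { sent };
        }
    }
}
```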
### Dialer
```rust
struct Dialer {
endpoint: Endpoint,
pending: JoinSet<(EndpointId, Option<Result<Connection, ConnectError>>)>,
pending_dials: HashMap<EndpointId, CancellationToken>,
}
```
The `Dialer` manages outgoing connections. It:
1. Checks if a dial is already pending for a peer
2. Spawns an async connection task with cancellation support
3. Returns completed connections via `next_conn()`
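The duplicate-dial check in step 1 amounts to a set-membership test. The sketch below shows only that part, with a hypothetical `PendingDials` type and integer peer ids; the real `Dialer` additionally tracks cancellation tokens and the spawned tasks.

```rust
use std::collections::HashSet;

/// Hypothetical tracker for in-flight dials.
#[derive(Default)]
struct PendingDials {
    pending: HashSet<u32>,
}

impl PendingDials {
    /// Returns `true` if a new dial should be started, `false` if one is
    /// already in flight for this peer.
    fn queue_dial(&mut self, peer: u32) -> bool {
        self.pending.insert(peer)
    }

    /// Called when a dial finishes or is cancelled. Returns whether a dial
    /// was actually pending.
    fn finish(&mut self, peer: u32) -> bool {
        self.pending.remove(&peer)
    }
}
```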
### Connection Loop
Each peer connection runs a `connection_loop` task:
```rust
async fn connection_loop(
from: PublicKey, // Remote peer's public key
conn: Connection, // QUIC connection
origin: ConnOrigin, // Accept (incoming) or Dial (outgoing)
send_rx: mpsc::Receiver<ProtoMessage>, // Messages to send
in_event_tx: mpsc::Sender<InEvent>, // Channel to protocol
max_message_size: usize, // Maximum message size
queue: Vec<ProtoMessage>, // Queued messages to send first
) -> Result<(), ConnectionLoopError>
```
The connection loop:
1. First sends any queued messages
2. Runs a send loop and receive loop concurrently (`tokio::join!`)
3. Uses iroh QUIC unidirectional streams for communication (one per topic, as described under Wire Protocol)
### Wire Protocol
Messages are serialized with `postcard` and sent as **length-prefixed frames** over QUIC unidirectional streams:
```
┌───────────────┐
│ Stream Header │ ── Contains TopicId (sent once per stream)
├───────────────┤
│ Frame (len)   │ ── u32 length prefix
│ Frame (data)  │ ── postcard-encoded topic::Message<PublicKey>
├───────────────┤
│ Frame (len)   │ ── next message...
│ Frame (data)  │
└───────────────┘
```
Each topic gets its own unidirectional stream. The stream header is sent once when the stream is opened. Disconnect messages close the stream after being sent.
The `SendLoop` manages per-topic streams within a connection:
```rust
struct SendLoop {
conn: Connection,
streams: HashMap<TopicId, SendStream>, // One stream per topic
buffer: Vec<u8>,
max_message_size: usize,
send_rx: mpsc::Receiver<ProtoMessage>,
}
```
When a disconnect message is sent for a topic, the stream for that topic is closed (via `finish()`).
## Topic State (Net Layer)
```rust
struct TopicState {
neighbors: BTreeSet<EndpointId>, // Current active neighbors (from protocol)
event_sender: broadcast::Sender<ProtoEvent>, // Broadcast channel to subscribers
command_rx_keys: HashSet<stream_group::Key>, // Active command stream keys
}
```
A topic is considered "still needed" if it has either:
- Active command receivers (publishers), or
- Active event subscribers (subscribers)
When neither exists, the topic is queued for quit/unsubscription.
## Address Lookup Integration
The `GossipAddressLookup` integrates with iroh's address discovery system:
```rust
pub(crate) struct GossipAddressLookup {
endpoints: NodeMap, // BTreeMap<EndpointId, StoredEndpointInfo>
_task_handle: Arc<AbortOnDropHandle<()>>, // Background eviction task
}
```
It implements iroh's `AddressLookup` trait, allowing gossip-discovered peer addresses to feed back into iroh's connection establishment. This means that when a peer shares its address information in `Join` or `ForwardJoin` messages, that information is used to help iroh connect to that peer.
Entries expire after 5 minutes (configurable via `RetentionOpts`), with eviction checks every 30 seconds.
## Metrics
The `Metrics` struct tracks various counters:
| Metric | Description |
|--------|-------------|
| `msgs_ctrl_sent` | Control messages sent |
| `msgs_ctrl_recv` | Control messages received |
| `msgs_data_sent` | Data messages sent |
| `msgs_data_recv` | Data messages received |
| `msgs_data_sent_size` | Total size of data messages sent |
| `msgs_data_recv_size` | Total size of data messages received |
| `msgs_ctrl_sent_size` | Total size of control messages sent |
| `msgs_ctrl_recv_size` | Total size of control messages received |
| `neighbor_up` | Neighbor connections established |
| `neighbor_down` | Neighbor connections lost |
| `actor_tick_*` | Various event loop tick counters |

@@ -0,0 +1,290 @@
# iroh-gossip: Public API & Data Flow
## Public API Types
### Gossip (Main Handle)
The `Gossip` struct is the main entry point, created via a `Builder`:
```rust
let gossip = Gossip::builder()
.max_message_size(8192)
.membership_config(HyparviewConfig { ... })
.broadcast_config(PlumtreeConfig { ... })
.alpn(b"/custom-alpn")
.spawn(endpoint);
```
It derefs to `GossipApi`, which provides:
| Method | Description |
|--------|-------------|
| `subscribe(topic_id, bootstrap)` | Join a topic with default options |
| `subscribe_and_join(topic_id, bootstrap)` | Join and wait for at least one connection |
| `subscribe_with_opts(topic_id, opts)` | Join with custom `JoinOptions` |
| `handle_connection(conn)` | Handle an incoming QUIC connection |
| `shutdown()` | Gracefully leave all topics and stop |
| `max_message_size()` | Get configured max message size |
| `metrics()` | Get metrics handle |
### GossipTopic (Subscription Handle)
Returned by `subscribe()`, it is a `Stream<Item = Result<Event, ApiError>>`:
```rust
let topic: GossipTopic = gossip.subscribe(topic_id, peers).await?;
topic.broadcast(b"hello".to_vec().into()).await?;
topic.broadcast_neighbors(b"local".to_vec().into()).await?;
topic.joined().await?; // Wait for first connection
```
Can be split into sender and receiver:
```rust
let (sender, receiver) = topic.split();
// sender: GossipSender - can broadcast and join peers
// receiver: GossipReceiver - can receive events and check neighbors
```
### GossipSender
```rust
pub struct GossipSender(mpsc::Sender<Command>);

impl GossipSender {
    pub async fn broadcast(&self, message: Bytes) -> Result<(), ApiError>;
    pub async fn broadcast_neighbors(&self, message: Bytes) -> Result<(), ApiError>;
    pub async fn join_peers(&self, peers: Vec<EndpointId>) -> Result<(), ApiError>;
}
```
### GossipReceiver
```rust
pub struct GossipReceiver {
    stream: Pin<Box<dyn Stream<Item = Result<Event, ApiError>> + Send + Sync + 'static>>,
    neighbors: HashSet<EndpointId>,
}

impl GossipReceiver {
    pub fn neighbors(&self) -> impl Iterator<Item = EndpointId> + '_;
    pub async fn joined(&mut self) -> Result<(), ApiError>;
    pub fn is_joined(&self) -> bool;
}
```
The `GossipReceiver` tracks the neighbor set internally by processing `NeighborUp` and `NeighborDown` events.
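The neighbor tracking can be sketched as a small fold over the event stream. The types below are simplified stand-ins, and treating "joined" as "has at least one neighbor" is an assumption made for this illustration.

```rust
use std::collections::HashSet;

/// Simplified event type; peers are plain integers here.
enum Event {
    NeighborUp(u32),
    NeighborDown(u32),
    Received(Vec<u8>),
}

/// Hypothetical tracker mirroring the receiver's internal neighbor set.
#[derive(Default)]
struct NeighborTracker {
    neighbors: HashSet<u32>,
}

impl NeighborTracker {
    /// Update the neighbor set from one event; other events are ignored.
    fn observe(&mut self, event: &Event) {
        match event {
            Event::NeighborUp(peer) => {
                self.neighbors.insert(*peer);
            }
            Event::NeighborDown(peer) => {
                self.neighbors.remove(peer);
            }
            Event::Received(_) => {}
        }
    }

    /// Assumption: a topic counts as joined while any neighbor is connected.
    fn is_joined(&self) -> bool {
        !self.neighbors.is_empty()
    }
}
```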
### Event Types
```rust
pub enum Event {
NeighborUp(EndpointId), // New direct neighbor connected
NeighborDown(EndpointId), // Direct neighbor disconnected
Received(Message), // Gossip message received
Lagged, // Internal channel lagged (messages dropped)
}
pub struct Message {
pub content: Bytes, // Message content
pub scope: DeliveryScope, // Swarm(round) or Neighbors
pub delivered_from: EndpointId, // Peer that delivered the message to us
}
```
### Command Types
```rust
pub enum Command {
Broadcast(Bytes), // Broadcast to all in swarm
BroadcastNeighbors(Bytes), // Broadcast to direct neighbors only
JoinPeers(Vec<EndpointId>), // Join additional peers
}
```
### JoinOptions
```rust
pub struct JoinOptions {
pub bootstrap: BTreeSet<EndpointId>, // Initial peers to connect to
pub subscription_capacity: usize, // Event channel capacity (default: 2048)
}
```
### DeliveryScope
```rust
pub enum DeliveryScope {
Swarm(Round), // Message traveled `Round` hops from origin
Neighbors, // Direct neighbor message (not forwarded)
}
```
`DeliveryScope::Swarm(Round(0))` means the peer that delivered the message is also its originator, so it arrived without being forwarded. `Round(n)` means the message was forwarded n times along the way.
## Data Flow Diagrams
### Joining a Topic
```
User Code                     GossipApi               Actor                     Proto State
|                             |                       |                         |
|-- subscribe(topic, peers) ->|                       |                         |
|                             |-- JoinRequest ------->|                         |
|                             |                       |-- Command::Join ------->|
|                             |                       |                         |-- RequestJoin(peers)
|                             |                       |                         |-- SendMessage(peer, Join)
|                             |                       |                         |-- ...
|                             |<-- NeighborUp events -|<-- EmitEvent(NeighborUp)|
|<-- Event::NeighborUp -------|                       |                         |
```
### Broadcasting a Message
```
User Code             GossipSender        Actor               Proto State         Network
|                     |                   |                   |                   |
|-- broadcast(msg) -->|                   |                   |                   |
|                     |-- Command:: ----->|                   |                   |
|                     |   Broadcast       |                   |                   |
|                     |                   |-- Broadcast ----->|                   |
|                     |                   |                   |-- eager_push ---->|
|                     |                   |                   |   (Gossip msgs)   |
|                     |                   |                   |-- lazy_push ----->|
|                     |                   |                   |   (IHave msgs)    |
|                     |                   |                   |                   |
|   (other peer receives Gossip)          |                   |                   |
|                     |                   |                   |<-- RecvMessage ---|
|                     |                   |<-- InEvent -------|                   |
|                     |                   |                   |   (validates ID)  |
|                     |                   |                   |   (forwards)      |
|<-- Received(msg) ---|<-- EmitEvent -----|                   |                   |
```
### Receiving and Processing IHave/Graft
```
Time →
Peer A                      Our Node                    Peer B
|                           |                           |
|-- IHave(id, round) ------>|                           |
|                           | Schedule graft_timeout_1  |
|                           | (wait for eager push)     |
|                           |                           |
|     [timeout expires]     |                           |
|<-- Graft(id, round) ------|                           |  (Peer A sent the IHave)
|                           |                           |
|-- Gossip(content) ------->|                           |  (Peer A replies)
|                           |                           |
|                           |-- Prune ----------------->|  (maybe, if optimization)
```
### HyParView Join Flow
```
New Node                Contact Node                Active Peers of Contact
|                       |                           |
|-- Join(me_data) ----->|                           |
|                       |-- add_active(new)         |
|                       |-- Neighbor(High) -------->|  (to new node)
|                       |-- ForwardJoin ----------->|  (to each active peer)
|                       |                           |-- add_active or add_passive
|                       |                           |-- Neighbor(Low/High) -> (to new node)
|                       |                           |-- ForwardJoin -> (random peer)
|                       |                           |
|<-- Neighbor(High) ----|                           |
|<-- Neighbor(Low/High) ----------------------------|
|                       |                           |
```
### Shuffle Periodic Operation
```
Node A                  Node B                    Random Node
|                       |                         |
|-- Shuffle ----------->|                         |
|   (origin=A, nodes,   |                         |
|    TTL=6)             |                         |
|                       |-- Shuffle ------------->|
|                       |   (origin=A, nodes,     |
|                       |    TTL=5)               |
|                       |                         |-- ...
|                       |                         |-- (TTL reaches 0)
|                       |                         |
|<-- ShuffleReply ------|<-- ShuffleReply --------|
|   (random nodes)      |   (random nodes)        |
|                       |                         |
|-- add_passive(nodes from reply)                 |
```
## RPC Support (Optional Feature)
When the `rpc` feature is enabled, `GossipApi` can also operate remotely:
```rust
// Server side
gossip.listen(rpc_endpoint).await;
// Client side
let api = GossipApi::connect(rpc_endpoint, addr);
let topic = api.subscribe_and_join(topic_id, bootstrap).await?;
```
This uses the `irpc`/`noq` crates for bidirectional streaming RPC. The `Join` request establishes a bidirectional stream:
- Client → Server: `Command` messages (Broadcast, BroadcastNeighbors, JoinPeers)
- Server → Client: `Event` messages (NeighborUp, NeighborDown, Received, Lagged)
## Channel Architecture
```
                 ┌─────────────────────────────────────────────────┐
                 │                      Actor                      │
                 │                                                 │
RPC/Local ──────►│ rpc_rx ◄────────────────────────────────────────│
Commands         │ local_rx ◄── HandleConnection, Shutdown         │
                 │                                                 │
                 │ in_event_tx ──► in_event_rx ────────────────────│──► proto::State::handle()
                 │                                                 │          │
                 │ ◄── OutEvent ───────────────────────────────────│◄────     │
                 │     │                                           │
                 │     ├──► SendMessage    ──► peer.send_tx        │
                 │     ├──► EmitEvent      ──► topic.event_sender  │
                 │     ├──► ScheduleTimer  ──► timers              │
                 │     ├──► DisconnectPeer ──► drop peer           │
                 │     └──► PeerData       ──► address_lookup      │
                 │                                                 │
                 │ topic.event_sender ──► broadcast channel ───────│──► GossipReceiver
                 │                                                 │
                 │ command_rx ◄─── per-topic command streams ──────│◄── GossipSender
                 │                                                 │
                 └─────────────────────────────────────────────────┘
```
## Configuration Defaults Summary
| Parameter | Default | Source |
|-----------|---------|--------|
| Active view capacity | 5 | HyParView paper (p9) |
| Passive view capacity | 30 | HyParView paper (p9) |
| Active random walk length | 6 | HyParView paper (p9) |
| Passive random walk length | 3 | HyParView paper (p9) |
| Shuffle random walk length | 6 | HyParView paper (p9) |
| Shuffle active view count | 3 | HyParView paper (p9) |
| Shuffle passive view count | 4 | HyParView paper (p9) |
| Shuffle interval | 60s | Implementation choice |
| Neighbor request timeout | 500ms | Implementation choice |
| Graft timeout 1 | 80ms | Implementation choice |
| Graft timeout 2 | 40ms | Implementation choice |
| Dispatch timeout | 5ms | Implementation choice |
| Optimization threshold | 7 hops | PlumTree paper (p12) |
| Message cache retention | 30s | Implementation choice |
| Message ID retention | 90s | Implementation choice |
| Cache evict interval | 1s | Implementation choice |
| Max message size | 4096 bytes | Implementation choice |
| Send queue capacity | 64 messages | Implementation choice |
| To-actor channel capacity | 64 messages | Implementation choice |
| In-event channel capacity | 1024 messages | Implementation choice |
| Topic event channel capacity | 256 events | Implementation choice |
| Topic events default capacity | 2048 events | Implementation choice |
| Topic commands channel capacity | 64 commands | Implementation choice |

@@ -0,0 +1,176 @@
# iroh-gossip: Utility Data Structures & Wire Format
## IndexSet (`proto::util::IndexSet`)
A wrapper around `indexmap::IndexSet` that provides random selection capabilities needed by HyParView:
```rust
pub(crate) struct IndexSet<T> {
inner: indexmap::IndexSet<T>,
}
```
### Key Operations
| Method | Purpose |
|--------|---------|
| `insert(value)` | Add element (returns false if already present) |
| `remove(value)` | Remove by value (swap-remove, O(1)) |
| `remove_index(index)` | Remove by index (swap-remove) |
| `remove_random(rng)` | Remove a random element |
| `pick_random(rng)` | Get reference to random element |
| `pick_random_without(exclude, rng)` | Random element excluding certain elements |
| `pick_random_index(rng)` | Random index |
| `shuffled(rng)` | All elements in random order |
| `shuffled_and_capped(len, rng)` | First `len` elements after shuffle |
| `shuffled_without(exclude, rng)` | Random order excluding certain elements |
| `shuffled_without_and_capped(exclude, len, rng)` | Capped shuffle excluding elements |
| `iter_without(value)` | Iterator skipping a specific element |
These operations are critical for HyParView's random walks, shuffle exchanges, and passive view management.
## TimerMap (`proto::util::TimerMap`)
A priority queue of timer entries sorted by `Instant`, with stable ordering via a sequence counter:
```rust
pub struct TimerMap<T> {
heap: BinaryHeap<TimerMapEntry<T>>,
seq: u64,
}
```
Used by the protocol state machine for scheduling future events (shuffles, graft timeouts, cache eviction). The networking layer wraps this in an async-friendly `Timers` type that can `wait_next()`.
### Key Operations
| Method | Purpose |
|--------|---------|
| `insert(instant, item)` | Schedule a timer |
| `pop_before(limit)` | Pop the earliest entry if it's before `limit` |
| `drain_until(from)` | Drain all entries up to a time |
| `first()` | Get reference to earliest entry |
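A timer queue of this shape can be sketched with the standard library's `BinaryHeap`. This is an illustration, not the crate's implementation: `TimerQueue` is a hypothetical name, `u64` milliseconds stand in for `Instant`, and treating the limit as inclusive is an assumption.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Min-heap of timers ordered by (deadline, insertion sequence). The sequence
/// number keeps timers with equal deadlines in insertion order.
struct TimerQueue<T> {
    heap: BinaryHeap<Reverse<(u64, u64, T)>>,
    seq: u64,
}

// `T: Ord` is only needed to satisfy the tuple's ordering bound; because the
// sequence numbers are unique, items themselves are never compared.
impl<T: Ord> TimerQueue<T> {
    fn new() -> Self {
        TimerQueue { heap: BinaryHeap::new(), seq: 0 }
    }

    /// Schedule `item` to fire at `deadline`.
    fn insert(&mut self, deadline: u64, item: T) {
        self.heap.push(Reverse((deadline, self.seq, item)));
        self.seq += 1;
    }

    /// Pop the earliest timer if its deadline is at or before `limit`.
    fn pop_before(&mut self, limit: u64) -> Option<(u64, T)> {
        let due = matches!(
            self.heap.peek(),
            Some(Reverse((deadline, _, _))) if *deadline <= limit
        );
        if !due {
            return None;
        }
        let Reverse((deadline, _, item)) = self.heap.pop()?;
        Some((deadline, item))
    }
}
```

Wrapping entries in `Reverse` turns the max-heap into a min-heap, so the earliest deadline is always at the top.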
## TimeBoundCache (`proto::util::TimeBoundCache`)
A `HashMap` where entries expire after a specified `Instant`:
```rust
pub struct TimeBoundCache<K, V> {
map: HashMap<K, (Instant, V)>,
expiry: TimerMap<K>,
}
```
Used by PlumTree for:
- `received_messages: TimeBoundCache<MessageId, ()>` — deduplication
- `cache: TimeBoundCache<MessageId, Gossip>` — message payload storage for Graft replies
### Key Operations
| Method | Purpose |
|--------|---------|
| `insert(key, value, expires)` | Insert with expiration |
| `contains_key(key)` | Check existence |
| `get(key)` | Get value |
| `expires(key)` | Get expiration time |
| `expire_until(instant)` | Remove all expired entries, returns count |
| `len()` / `is_empty()` | Size queries |
The `expire_until` method handles re-insertions correctly: if a key is re-inserted with a later expiration time while an older entry for it is still in the expiry queue, that stale expiry entry is ignored when it comes due, and the key stays in the map until its new expiration time.
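The re-insertion rule is easiest to see in a small model. The sketch below is not the crate's code: `ExpiringMap` is a hypothetical name, keys are `u32`, `u64` timestamps stand in for `Instant`, and a `BTreeMap` replaces the `TimerMap` expiry queue.

```rust
use std::collections::{BTreeMap, HashMap};

/// Map whose entries expire at a given time.
struct ExpiringMap<V> {
    /// key -> (current expiry, value)
    map: HashMap<u32, (u64, V)>,
    /// expiry time -> keys scheduled for that time (may contain stale keys)
    expiry: BTreeMap<u64, Vec<u32>>,
}

impl<V> ExpiringMap<V> {
    fn new() -> Self {
        ExpiringMap { map: HashMap::new(), expiry: BTreeMap::new() }
    }

    /// Insert or overwrite `key`. Older expiry entries are left in place.
    fn insert(&mut self, key: u32, value: V, expires: u64) {
        self.map.insert(key, (expires, value));
        self.expiry.entry(expires).or_default().push(key);
    }

    fn contains_key(&self, key: u32) -> bool {
        self.map.contains_key(&key)
    }

    /// Remove every entry whose expiry is at or before `now`; returns how
    /// many entries were actually removed from the map.
    fn expire_until(&mut self, now: u64) -> usize {
        let mut removed = 0;
        loop {
            // Earliest scheduled expiry, if it is already due.
            let scheduled = match self.expiry.keys().next() {
                Some(&t) if t <= now => t,
                _ => break,
            };
            let keys = self.expiry.remove(&scheduled).unwrap_or_default();
            for key in keys {
                // A stale schedule entry no longer matches the live expiry
                // (the key was re-inserted), so it is skipped.
                let still_due = match self.map.get(&key) {
                    Some((expires, _)) => *expires == scheduled,
                    None => false,
                };
                if still_due {
                    self.map.remove(&key);
                    removed += 1;
                }
            }
        }
        removed
    }
}
```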
## Wire Format
### Frame Encoding
Messages are encoded using `postcard` (a `no_std`-friendly, `serde`-compatible format) and sent as length-prefixed frames:
```
┌──────────────┬──────────────┬─────────────────┐
│ Length (u32) │ TopicHeader  │ Message Payload │
│ big-endian   │ postcard     │ postcard        │
└──────────────┴──────────────┴─────────────────┘
```
### Stream Protocol
Each QUIC unidirectional stream is dedicated to a single topic. The stream begins with a `StreamHeader`:
```rust
pub(crate) struct StreamHeader {
pub(crate) topic_id: TopicId,
}
```
All subsequent frames on that stream carry messages for that topic. When a `Disconnect` message is sent, the stream is closed (via `finish()`).
### Message Types on Wire
```rust
pub enum Message<PI> {
Swarm(hyparview::Message<PI>), // Membership messages
Gossip(plumtree::Message), // Broadcast messages
}
```
Where `PI` is `PublicKey` (32-byte ed25519 public key) in the networking layer.
The `MessageKind` classification is used for metrics:
| Kind | Message Types |
|------|--------------|
| `Data` | `Gossip` messages (actual content) |
| `Control` | All Swarm messages, plus `Prune`, `Graft`, `IHave` |
### Message Size Limits
- Default max message size: 4096 bytes (minimum: 512)
- The header size is computed via `postcard::experimental::serialized_size`
- Actual payload capacity = `max_message_size - header_size`
The `SendLoop` checks message size before writing and returns `WriteError::TooLarge` if exceeded.
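Length-prefixed framing with a size check can be sketched with the standard library alone. This is an illustration under assumptions: the payload is treated as already-encoded bytes (the `postcard` step is omitted), `FrameError`, `write_frame`, and `read_frame` are hypothetical names, and applying the limit to the payload alone rather than to payload plus header is a simplification.

```rust
use std::io::{self, Read, Write};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge,
    Io(io::ErrorKind),
}

/// Write one frame: a big-endian u32 length followed by the payload.
fn write_frame<W: Write>(w: &mut W, payload: &[u8], max_message_size: usize) -> Result<(), FrameError> {
    // Reject oversized messages before anything is written.
    if payload.len() > max_message_size || payload.len() > u32::MAX as usize {
        return Err(FrameError::TooLarge);
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_be_bytes()).map_err(|e| FrameError::Io(e.kind()))?;
    w.write_all(payload).map_err(|e| FrameError::Io(e.kind()))?;
    Ok(())
}

/// Read one frame, refusing lengths above the configured maximum so that a
/// misbehaving peer cannot force a large allocation.
fn read_frame<R: Read>(r: &mut R, max_message_size: usize) -> Result<Vec<u8>, FrameError> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf).map_err(|e| FrameError::Io(e.kind()))?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > max_message_size {
        return Err(FrameError::TooLarge);
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload).map_err(|e| FrameError::Io(e.kind()))?;
    Ok(payload)
}
```

Checking the length on both the write and the read side keeps the two ends symmetric, which is why the maximum size should match across a network.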
## PeerData & Address Propagation
The `PeerData` type is an opaque `Bytes` wrapper used in HyParView messages. In the `net` layer, it carries addressing information:
```rust
struct AddrInfo {
relay_url: Option<RelayUrl>,
direct_addresses: BTreeSet<SocketAddr>,
}
```
This is serialized with `postcard` and passed as `PeerData` in `Join`, `ForwardJoin`, and `Neighbor` messages. When received, the `AddrInfo` is decoded and fed into `GossipAddressLookup`, which implements iroh's `AddressLookup` trait, allowing gossip-discovered addresses to be used for future connections.
## GossipAddressLookup
```rust
pub(crate) struct GossipAddressLookup {
endpoints: NodeMap, // Arc<RwLock<BTreeMap<EndpointId, StoredEndpointInfo>>>
_task_handle: Arc<AbortOnDropHandle<()>>, // Background eviction task
}
```
Key behaviors:
- **Merging**: When adding addresses for an already-known endpoint, new addresses are merged (union of direct addresses, relay URL is overwritten)
- **Expiration**: Entries expire after 5 minutes, with eviction checks every 30 seconds
- **Integration**: Implements `iroh::address_lookup::AddressLookup`, returning data with provenance "gossip"
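The merging rule can be sketched as follows. The struct mirrors the `AddrInfo` shape shown earlier, but `String` stands in for `RelayUrl`, the `merge` method is a hypothetical name, and overwriting the relay URL unconditionally (even with `None`) is an assumption of this sketch.

```rust
use std::collections::BTreeSet;
use std::net::SocketAddr;

/// Simplified address record for illustration.
#[derive(Debug, Clone, Default, PartialEq)]
struct AddrInfo {
    relay_url: Option<String>,
    direct_addresses: BTreeSet<SocketAddr>,
}

impl AddrInfo {
    /// Merge newer information into an existing record.
    fn merge(&mut self, newer: AddrInfo) {
        // Direct addresses accumulate: the result is the union of both sets.
        self.direct_addresses.extend(newer.direct_addresses);
        // The relay URL is replaced by the most recent value.
        self.relay_url = newer.relay_url;
    }
}
```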
## Dialer
```rust
struct Dialer {
endpoint: Endpoint,
pending: JoinSet<(EndpointId, Option<Result<Connection, ConnectError>>)>,
pending_dials: HashMap<EndpointId, CancellationToken>,
}
```
The `Dialer` manages outgoing connection attempts:
- Queues a dial via `queue_dial(endpoint_id, alpn)`
- Checks for pending dials to avoid duplicate connections
- Supports cancellation of in-progress dials
- Returns completed connections via `next_conn()`
When a dial succeeds, the connection is passed to `handle_connection()`. When a dial fails and the peer is not already active, a `PeerDisconnected` event is injected into the protocol state.

@@ -0,0 +1,169 @@
# iroh-gossip: Testing & Simulation
## Test Infrastructure
The crate includes four layers of testing:
### 1. Unit Tests (in source files)
Unit tests are embedded in each module file behind `#[cfg(test)]`:
| Module | Tests |
|--------|-------|
| `proto/hyparview.rs` | Not listed in this reference |
| `proto/plumtree.rs` | `optimize_tree`, `spoofed_messages_are_ignored`, `cache_is_evicted` |
| `proto.rs` | `hyparview_smoke`, `plumtree_smoke`, `quit` |
| `net.rs` | `gossip_net_smoke`, `subscription_cleanup` |
| `api.rs` | `test_rpc`, `ensure_gossip_topic_is_sync` |
| `proto/util.rs` | `indexset`, `timer_map`, `hex`, `time_bound_cache` |
### 2. Protocol Simulator (`proto::sim`)
The `sim` module (behind `test-utils` feature) provides a deterministic network simulator:
```rust
// Available when feature = "test-utils"
pub mod sim;
```
This allows testing the protocol logic without any real networking, using seeded RNG for reproducibility.
The simulator creates a `Network` of virtual nodes, each running their own `proto::State`. Events are processed in discrete "trips" (round-trips), allowing controlled testing of protocol behavior.
### 3. Simulation Binary (`simulator` feature)
The crate includes a CLI simulator (behind `simulator` feature) that can run large-scale simulations:
```
cargo run --bin sim --features simulator
```
This uses `rayon` for parallel execution and `comfy-table` for result output.
### 4. Integration Tests (`tests/sim.rs`)
These tests are also behind the `test-utils` feature and provide end-to-end protocol testing.
## Key Test Patterns
### Protocol-Level Smoke Test
From `proto.rs`:
```rust
#[test]
fn hyparview_smoke() {
    // A seeded RNG keeps the simulation deterministic.
    let rng = ChaCha12Rng::seed_from_u64(0);
    let mut config = Config::default();
    config.membership.active_view_capacity = 2;
    let mut network = Network::new(config.into(), rng);
    for i in 0..4 {
        network.insert(i);
    }
    let t: TopicId = [0u8; 32].into();
    // Join nodes
    network.command(0, t, Command::Join(vec![1, 2]));
    network.command(1, t, Command::Join(vec![2]));
    network.command(2, t, Command::Join(vec![]));
    network.run_trips(3);
    // Verify events and connections (`expected` is elided in this excerpt)
    assert_eq!(network.events_sorted(), expected);
    assert_eq!(network.conns(), vec![(0, 1), (0, 2), (1, 2)]);
    assert!(network.check_synchronicity());
}
```
### PlumTree Optimization Test
From `plumtree.rs`:
```rust
#[test]
fn optimize_tree() {
// When an IHave message arrives with fewer hops than the Gossip message,
// and the difference exceeds optimization_threshold, the tree is restructured:
// - The IHave sender is promoted to eager (Graft)
// - The Gossip sender is demoted to lazy (Prune)
}
```
### Spoofed Message Test
```rust
#[test]
fn spoofed_messages_are_ignored() {
    // Messages where MessageId != blake3(content) are silently discarded
    let message = Message::Gossip(Gossip {
        content: content.clone(),
        id: MessageId::from_content(b"wrong_content"), // Spoofed!
        scope: DeliveryScope::Swarm(Round(1)),
    });
    state.handle(InEvent::RecvMessage(2, message), now, &mut io);
    // No events are emitted
}
```
### Networking Smoke Test
From `net.rs`:
```rust
#[tokio::test]
async fn gossip_net_smoke() {
    // Creates 3 endpoints with a relay server
    // Subscribes and joins a topic
    // Broadcasts messages and verifies reception
    // Uses real QUIC connections via iroh
}
```
## Metrics
The `Metrics` struct (in `src/metrics.rs`) uses `iroh_metrics::MetricsGroup`:
```rust
#[derive(Debug, Default, MetricsGroup)]
#[metrics(name = "gossip")]
pub struct Metrics {
    pub msgs_ctrl_sent: Counter,
    pub msgs_ctrl_recv: Counter,
    pub msgs_data_sent: Counter,
    pub msgs_data_recv: Counter,
    pub msgs_data_sent_size: Counter,
    pub msgs_data_recv_size: Counter,
    pub msgs_ctrl_sent_size: Counter,
    pub msgs_ctrl_recv_size: Counter,
    pub neighbor_up: Counter,
    pub neighbor_down: Counter,
    pub actor_tick_main: Counter,
    pub actor_tick_rx: Counter,
    pub actor_tick_endpoint: Counter,
    pub actor_tick_dialer: Counter,
    pub actor_tick_dialer_success: Counter,
    pub actor_tick_dialer_failure: Counter,
    pub actor_tick_in_event_rx: Counter,
    pub actor_tick_timers: Counter,
}
```
These are tracked both in the protocol state machine (for message counts) and in the actor event loop (for tick-level diagnostics). When the `metrics` feature is enabled, they are exported via Prometheus-compatible endpoints.
## References
### Academic Papers
- **HyParView**: Leitao, J., Pereira, J., & Rodrigues, L. (2007). "HyParView: A Membership Protocol for Reliable Gossip Multicast with Dense Coverage." [PDF](https://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf)
- **PlumTree**: Leitao, J., Pereira, J., & Rodrigues, L. (2007). "Epidemic Broadcast Trees." [PDF](https://asc.di.fct.unl.pt/~jleitao/pdf/srds07-leitao.pdf)
### Implementation Reference
- Bartosz Sypytkowski's example implementation: [gist](https://gist.github.com/Horusiath/84fac596101b197da0546d1697580d99)
### Related Projects
- [iroh](https://docs.rs/iroh) — The networking library that iroh-gossip integrates with
- [Earthstar](https://github.com/earthstar-project/earthstar) — Another PlumTree implementation referenced in code comments
### Crate Repository
- [github.com/n0-computer/iroh-gossip](https://github.com/n0-computer/iroh-gossip)

@@ -0,0 +1,40 @@
# iroh-gossip Reference Documentation
This directory contains a deep-dive reference on how the `iroh-gossip` crate works, based on source code analysis of the repository at `/workspace/iroh-gossip`.
## Documents
| # | File | Topic |
|---|------|-------|
| 01 | [Overview & Architecture](01-overview-architecture.md) | Crate structure, module organization, design principles, features, dependencies |
| 02 | [HyParView Membership](02-hyparview-membership.md) | Swarm membership protocol: active/passive views, join procedure, shuffle mechanism, failure recovery, PeerData |
| 03 | [PlumTree Broadcast](03-plumtree-broadcast.md) | Epidemic broadcast trees: eager/lazy push, Graft/IHave/Prune, tree optimization, message deduplication, cache management |
| 04 | [State & Topic Coordination](04-state-and-topic.md) | Multi-topic state management, topic lifecycle, event routing between HyParView and PlumTree |
| 05 | [Net Actor & Networking](05-net-actor.md) | Actor model, event loop, connection management, Dialer, wire protocol, address lookup, topic state in the net layer |
| 06 | [API & Data Flow](06-api-data-flow.md) | Public API types, subscription model, event/command flow, channel architecture, configuration defaults |
| 07 | [Utilities & Wire Format](07-utilities-wire-format.md) | IndexSet, TimerMap, TimeBoundCache, serialization, PeerData/AddrInfo, Dialer internals |
| 08 | [Testing & Metrics](08-testing-metrics-refs.md) | Test infrastructure, simulation, key test patterns, metrics, references |
## Quick Reference
### Version
`iroh-gossip` v0.97.0
### ALPN
`/iroh-gossip/1`
### Core Protocols
- **HyParView**: Hybrid partial view membership (active view = 5, passive view = 30 by default)
- **PlumTree**: Epidemic broadcast trees (eager + lazy push with Graft/IHave optimization)
### Key Abstractions
- **TopicId**: 32-byte identifier for a topic/swarm
- **PeerIdentity**: Generic trait (instantiated as `PublicKey` in the net layer)
- **PeerData**: Opaque bytes exchanged on join (carries `AddrInfo` in net layer)
- **IO trait**: Interface for protocol output events (pure state machine, no IO)
### Wire Format
- Postcard (serde) encoding over QUIC unidirectional streams
- Length-prefixed frames (u32 length + postcard payload)
- Stream header with TopicId
- Max message size: 4096 bytes (configurable, minimum 512)
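
The length-prefixed framing listed above can be sketched with the standard library alone. This is a minimal illustration, not the crate's code: the function names are hypothetical, the big-endian byte order of the `u32` prefix is an assumption, and the payload is treated as opaque bytes rather than a postcard-encoded message.

```rust
use std::io::{self, Read, Write};

/// Default limit taken from the Quick Reference above.
const MAX_MESSAGE_SIZE: usize = 4096;

/// Write one frame: a u32 length prefix followed by the payload.
/// Big-endian byte order is an assumption made for this sketch.
fn write_frame<W: Write>(mut w: W, payload: &[u8], max_size: usize) -> io::Result<()> {
    if payload.len() > max_size {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "message too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting oversized lengths before allocating the buffer.
fn read_frame<R: Read>(mut r: R, max_size: usize) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > max_size {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds max size"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the length against the limit before allocating means a peer cannot force a large allocation just by sending a large prefix.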

@@ -0,0 +1,104 @@
# iroh-live: Overview and Architecture
## What It Is
iroh-live is a real-time audio/video streaming system built on top of [iroh](https://github.com/n0-computer/iroh) (QUIC-based P2P networking) and [Media over QUIC (MoQ)](https://moq.dev/). It handles the full pipeline: camera/mic capture → encoding → transport → decoding → rendering. Connections are peer-to-peer by default, with an optional relay server for browser access via WebTransport.
**Status:** Early tech preview. APIs are unstable. Windows support is missing. Audio-video sync is basic.
## Workspace Crates
| Crate | Description |
|-------|-------------|
| `iroh-live` | High-level API: `Live`, `Call`, `Room`, tickets, subscriptions |
| `iroh-moq` | MoQ transport layer over iroh/QUIC via `web-transport-iroh` |
| `iroh-live-relay` | Relay server bridging iroh P2P to browser WebTransport |
| `moq-media` | Media pipelines: capture, encode, decode, publish, subscribe, adaptive bitrate. No iroh dependency |
| `rusty-codecs` | Codec implementations (H264/openh264, AV1/rav1e + rav1d, Opus), hardware accel (VAAPI, V4L2, VideoToolbox) |
| `rusty-capture` | Cross-platform capture: PipeWire, V4L2, X11, ScreenCaptureKit, AVFoundation |
| `moq-media-egui` | egui integration for video rendering |
| `moq-media-dioxus` | dioxus-native integration for video rendering |
| `moq-media-android` | Android camera, EGL rendering, JNI bridge |
| `iroh-live-cli` | CLI tool (`irl`) for publishing, playing, calls, rooms, relay |
## Layer Architecture
Three distinct layers, each usable independently:
```
┌──────────────────────────────────────────────────────────┐
│ iroh-live │
│ Session management, tickets, rooms, calls │
│ Re-exports: moq-media, iroh-moq │
├──────────────────────────────────────────────────────────┤
│ moq-media │
│ Media pipelines: LocalBroadcast, RemoteBroadcast, │
│ codecs, adaptive bitrate, playout │
│ NO iroh dependency (transport-agnostic) │
├──────────────────────────────────────────────────────────┤
│ iroh-moq │
│ MoQ session management, publish/subscribe over QUIC │
│ Uses web-transport-iroh + moq-lite │
└──────────────────────────────────────────────────────────┘
Below moq-media:
rusty-codecs ─ codec implementations, hardware accel, wgpu rendering
rusty-capture ─ platform-specific screen/camera capture
```
## Design Principles
1. **`&self` everywhere** — All public types use interior mutability. Safe to share across async tasks/threads without wrappers.
2. **Drop-based cleanup** — Dropping a `Call` closes it. Dropping `LocalBroadcast` tears down encoders. Dropping `VideoTrack` stops its decoder thread.
3. **Watcher for continuous state, Stream for discrete events** — Connection quality and catalog contents use `n0_watcher::Direct<T>`. Participant joins use `impl Stream`.
4. **Declarative intent, not mechanism**: `VideoTarget::default().max_pixels(1280*720)` describes what quality you need. The catalog selects the best rendition.
5. **moq-media is standalone** — A recording pipeline can use `LocalBroadcast`/`RemoteBroadcast` without iroh-live. The transport boundary is the `PacketSink`/`PacketSource` trait pair.
## Data Flow (End-to-End)
```
Publisher Side:
capture source (rusty-capture, VideoSource trait)
encoder pipeline (moq-media, dedicated OS thread)
▼ EncodedFrame
PacketSink (MoqPacketSink — starts new MoQ group on keyframe)
▼ MoQ transport (iroh-moq, QUIC streams)
Subscriber Side:
PacketSource (MoqPacketSource — reads ordered frames from MoQ)
▼ MediaPacket
decoder pipeline (moq-media, dedicated OS thread)
▼ VideoFrame
FramePacer (PTS-based sleep) or Sync (shared playout clock)
renderer (wgpu texture upload or egui widget)
```
Encoder and decoder pipelines run on **dedicated OS threads**, not tokio tasks, so slow codec operations never block the async runtime. The `forward_packets` async task bridges the network-side `PacketSource` into an mpsc channel that the decoder thread reads synchronously.
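
The thread-per-pipeline pattern can be illustrated with a small standard-library sketch. Everything here is a stand-in: `Packet`, `Frame`, and `spawn_decoder` are hypothetical names, the "decode" step is a placeholder, and `std::sync::mpsc` replaces whatever channel type the real `forward_packets` task feeds.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for an encoded media packet.
struct Packet {
    pts_ms: u64,
    payload: Vec<u8>,
}

/// Hypothetical stand-in for a decoded frame.
#[derive(Debug, PartialEq)]
struct Frame {
    pts_ms: u64,
    size: usize,
}

/// Spawn a decoder on its own OS thread. The thread blocks on `recv()`
/// and exits once every sender has been dropped.
fn spawn_decoder(packets: mpsc::Receiver<Packet>) -> thread::JoinHandle<Vec<Frame>> {
    thread::spawn(move || {
        let mut frames = Vec::new();
        while let Ok(packet) = packets.recv() {
            // Placeholder for the real (possibly slow) codec call.
            frames.push(Frame {
                pts_ms: packet.pts_ms,
                size: packet.payload.len(),
            });
        }
        frames
    })
}
```

Because the blocking `recv()` happens on a dedicated thread, a slow decode step only delays that thread; the side feeding the channel keeps running.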
## Key Dependencies
| Dependency | Purpose |
|------------|---------|
| `iroh` | QUIC endpoint, connection management, P2P connectivity |
| `iroh-gossip` | Gossip protocol for room participant discovery |
| `iroh-tickets` | Ticket serialization for `RoomTicket` |
| `iroh-smol-kv` | Distributed KV store for room state (gossip-backed) |
| `moq-lite` | Core MoQ protocol: BroadcastProducer, BroadcastConsumer, Track, Group |
| `hang` | Catalog management for broadcast metadata |
| `moq-mux` | MoQ multiplexing |
| `moq-relay` | Relay server implementation (used by iroh-live-relay) |
| `web-transport-iroh` | WebTransport over iroh QUIC connections |
| `n0-future` | Async utilities (FuturesUnordered, AbortOnDropHandle) |
| `n0-watcher` | Watchable/Direct reactive state |
## License
Dual-licensed: MIT OR Apache-2.0. Copyright 2025 N0, INC.

@@ -0,0 +1,167 @@
# iroh-live: Core API — Live, Call, Subscription, Ticket
## `Live` — Entry Point
The primary entry point for all iroh-live operations. Manages an iroh `Endpoint`, the MoQ transport (`Moq`), and optionally a `Gossip` instance for rooms.
### Construction
```rust
// Simple: from environment, accept incoming connections
let live = Live::from_env().await?.with_router().spawn();
// With gossip for rooms
let live = Live::from_env().await?.with_router().with_gossip().spawn();
// From an existing endpoint
let live = Live::builder(endpoint).with_router().with_gossip().spawn();
// Manual router mounting (when you have other protocols)
let router = live.register_protocols(Router::builder(endpoint));
let router = router.accept(other_protocol, other_handler);
let router = router.spawn();
```
### Key Methods
| Method | Description |
|--------|-------------|
| `publish(name, &LocalBroadcast)` | Register a broadcast for all connected peers |
| `subscribe(remote, name)` | Connect to a peer and subscribe to a broadcast → `Subscription` |
| `subscribe_media(remote, name, audio, config)` | Connect, subscribe, decode → `(MoqSession, MediaTracks)` |
| `join_room(ticket)` | Join a gossip-based multi-party room → `Room` |
| `endpoint()` | Access the underlying iroh `Endpoint` |
| `transport()` | Access the `Moq` transport for advanced operations |
| `gossip()` | Access the `Gossip` instance (if enabled) |
| `shutdown()` | Close all sessions, stop router, close endpoint |
### Builder Options
- **`with_router()`** — Spawns an internal `Router` so the endpoint accepts incoming MoQ sessions. Without this, only outbound connections work.
- **`with_gossip()`** — Creates a `Gossip` instance (required for rooms). Internally mounts on the Router if `with_router` is also set.
- **`gossip(gossip)`** — Use an externally-managed `Gossip` instance.
### Internal Architecture
`Live` holds:
- `endpoint: Endpoint` — iroh QUIC endpoint
- `moq: Moq` — Internal actor for session/broadcast management
- `gossip: Option<Gossip>` — For room coordination
- `router: Option<Router>` — For accepting incoming connections
The `from_env()` method reads `IROH_SECRET` for the secret key and generates one if not set. It uses the `N0` preset for relay and DNS discovery.
## `LiveTicket` — Connection Sharing
A serializable ticket that contains everything needed to connect to a publisher.
```rust
// Create a ticket
let ticket = LiveTicket::new(endpoint.addr(), "my-stream");
// Serialize to URI string (fits in QR codes)
let s = ticket.to_string();
// → "iroh-live:<base64url(postcard(EndpointAddr))>/my-stream"
// Deserialize
let parsed: LiveTicket = s.parse()?;
// With relay URLs for indirect connectivity
let ticket = LiveTicket::new(addr, "stream").with_relay_urls(vec![
    "https://relay.example.com".to_string(),
]);
```
**Format:** `iroh-live:<base64url(postcard(EndpointAddr))>/<name>`
Also supports legacy `name@base32` format for backward compatibility.
The ticket string is kept short enough for QR codes (< 2000 bytes). It uses postcard (binary) serialization with base64url encoding.
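
To make the URI shape concrete, the following sketch splits a ticket string into its encoded-address and name parts. It is only an illustration of the documented format: `split_ticket` is a hypothetical helper, it does not decode the base64url/postcard payload, and it does not handle the legacy `name@base32` form.

```rust
/// Split `iroh-live:<encoded-addr>/<name>` into `(encoded_addr, name)`.
/// The encoded address is returned as-is, without decoding.
fn split_ticket(s: &str) -> Option<(&str, &str)> {
    let rest = s.strip_prefix("iroh-live:")?;
    // The base64url alphabet contains no '/', so the first '/' ends the address.
    let (addr, name) = rest.split_once('/')?;
    if addr.is_empty() || name.is_empty() {
        return None;
    }
    Some((addr, name))
}
```

In real code, `s.parse::<LiveTicket>()` shown above is the way to read a ticket; the helper only shows where the two components sit in the string.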
## `Call` — 1:1 Video Call
A convenience wrapper over MoQ primitives for bidirectional calls.
### Flow
1. Each side creates a `LocalBroadcast` with video/audio configured
2. **Dialer:** `Call::dial(live, remote_addr, local_broadcast)` — connects, publishes "call" broadcast, subscribes to remote's "call" broadcast
3. **Acceptor:** `Call::accept(session, local_broadcast)` — accepts an incoming session, publishes and subscribes
The broadcast name is always `"call"` — this is hardcoded (`CALL_BROADCAST_NAME`).
```rust
// Dialer side
let local = LocalBroadcast::new();
local.video().set_source(camera, VideoCodec::H264, [VideoPreset::P720])?;
let call = Call::dial(&live, remote_addr, local).await?;
// Access remote media
let remote_broadcast = call.remote();
let video = remote_broadcast.video()?;
// Wait for call to end
let reason = call.closed().await;
```
### Key Properties
- `call.local()` → `&LocalBroadcast` (your media)
- `call.remote()` → `&RemoteBroadcast` (peer's media)
- `call.signals()` → `watch::Receiver<NetworkSignals>` (for adaptive bitrate)
- `call.close()` — closes with error code 0 and reason "call ended"
- `call.closed()` → waits for close, returns `DisconnectReason` (LocalClose, RemoteClose, TransportError)
`Call` auto-wires stats recording and network signal production on the connection.
## `Subscription` — Subscribe Handle
Created by `Live::subscribe()`. Wraps the MoQ session, remote broadcast, and network signals into a single handle. The constructor auto-wires stats recording and signal production.
```rust
let sub = live.subscribe(remote_addr, "stream").await?;
// Access components
sub.session() // &MoqSession
sub.broadcast() // &RemoteBroadcast
sub.signals() // &watch::Receiver<NetworkSignals>
// Convenience methods
let tracks = sub.media(&audio_backend, Default::default()).await?;
let tracks = sub.media_with_decoders::<DefaultDecoders>(&audio_backend, config).await?;
// Decompose
let (session, broadcast, signals) = sub.into_parts();
```
## `DisconnectReason`
```rust
pub enum DisconnectReason {
    LocalClose,
    RemoteClose,
    TransportError,
}
```
Derived from the QUIC connection's close reason. Used by `Call::closed()`.
## `util` Module
### `secret_key_from_env()`
Loads the iroh secret key from the `IROH_SECRET` environment variable. Generates a new key if not set, printing the hex-encoded key for reuse.
### `spawn_signal_producer(conn, shutdown)`
Spawns a background task that polls QUIC connection path stats every 200ms and produces `NetworkSignals` for adaptive rendition selection. Returns a `watch::Receiver<NetworkSignals>`.
Computes:
- **RTT** — from `selected_path.rtt()`
- **Loss rate** — delta-based (lost packets / (sent + lost) over the interval)
- **Available bandwidth** — estimated from congestion window: `cwnd * 8 / rtt`
- **Congestion events** — monotonically increasing counter
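
The two derived quantities can be written out as plain arithmetic. This is a sketch under stated assumptions: `PathSnapshot` and both function names are hypothetical, the counters are assumed to be cumulative values read at each poll, and `cwnd` is assumed to be in bytes so that `cwnd * 8 / rtt` yields bits per second.

```rust
use std::time::Duration;

/// Hypothetical snapshot of cumulative path counters taken at one poll.
#[derive(Clone, Copy)]
struct PathSnapshot {
    sent_packets: u64,
    lost_packets: u64,
    cwnd_bytes: u64,
    rtt: Duration,
}

/// Loss rate over one polling interval: lost / (sent + lost), using deltas.
fn loss_rate(prev: &PathSnapshot, cur: &PathSnapshot) -> f64 {
    let sent = cur.sent_packets.saturating_sub(prev.sent_packets);
    let lost = cur.lost_packets.saturating_sub(prev.lost_packets);
    let total = sent + lost;
    if total == 0 {
        0.0
    } else {
        lost as f64 / total as f64
    }
}

/// Bandwidth estimate in bits per second: cwnd * 8 / rtt.
/// Returns `None` when the RTT is zero and the ratio is undefined.
fn available_bandwidth_bps(cur: &PathSnapshot) -> Option<f64> {
    let rtt = cur.rtt.as_secs_f64();
    if rtt <= 0.0 {
        return None;
    }
    Some(cur.cwnd_bytes as f64 * 8.0 / rtt)
}
```

For example, a 125,000-byte congestion window with a 100 ms RTT gives roughly 10 Mbit/s, and 10 lost packets against 90 sent in one interval gives a loss rate of 0.1.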
### `spawn_stats_recorder(conn, net_stats, shutdown)`
Records connection stats (RTT, loss rate, bandwidth, path type) into `NetStats` for debug overlay display. Runs every 200ms.

@@ -0,0 +1,164 @@
# iroh-moq: MoQ Transport Layer
## Overview
`iroh-moq` is the transport bridge between iroh's QUIC endpoint and the moq-lite broadcast protocol. It manages connections, session lifecycle, broadcast routing, and subscription handling. This is the only crate in the workspace that directly interacts with QUIC transport — everything above uses `Moq`/`MoqSession` as the interface.
**ALPN:** `moq-lite-03`
## Core Types
### `Moq` — Transport Manager
The top-level transport entry point. Wraps an iroh `Endpoint` and runs an internal actor (`Actor`) that handles all connection and broadcast management.
```rust
let moq = Moq::new(endpoint);
```
**Internal architecture:**
`Moq` holds an `mpsc::Sender<ActorMessage>` to communicate with a spawned actor task. The actor manages:
- A `HashMap<EndpointId, MoqSession>` of active sessions
- A `HashMap<BroadcastName, BroadcastProducer>` of locally published broadcasts
- A `JoinSet` of session tasks (each tracks session lifetime)
- A `FuturesUnordered` of pending connect tasks
- A `broadcast::Sender<MoqSession>` for incoming session notifications
**Key methods:**
| Method | Description |
|--------|-------------|
| `new(endpoint)` | Creates transport and spawns the actor |
| `protocol_handler()` | Returns `MoqProtocolHandler` for Router registration |
| `publish(name, producer)` | Register a broadcast for all current and future sessions |
| `connect(remote)` | Connect to remote peer, deduplicating existing connections |
| `incoming_sessions()` | Get stream of incoming sessions |
| `published_broadcasts()` | List currently published broadcast names |
| `shutdown()` | Cancel the shutdown token, closing all sessions |
### `MoqProtocolHandler`
Implements iroh's `ProtocolHandler` trait. When the Router receives an incoming connection with the `moq-lite-03` ALPN:
1. Accepts the raw QUIC `Connection`
2. Wraps it in a `web_transport_iroh::Session::raw(connection)`
3. Completes the moq-lite server handshake: `MoqSession::session_accept(wt_session)`
4. Sends the session to the actor via `ActorMessage::HandleSession`
### `MoqSession` — Single Peer Connection
Represents a MoQ connection with one remote peer. Created via:
- `Moq::connect()` (outbound, client role)
- `IncomingSession::accept()` (inbound, server role)
```rust
// Outbound
let session = moq.connect(remote_addr).await?;
// Inbound
let incoming = incoming_session.next().await?;
let session = incoming.accept(); // or incoming.reject()
```
**Internal structure:**
```rust
pub struct MoqSession {
    wt_session: web_transport_iroh::Session,
    _moq_session: Arc<moq_lite::Session>,
    publish: OriginProducer,   // For announcing local broadcasts
    subscribe: OriginConsumer, // For consuming remote broadcasts
}
```
The `OriginProducer`/`OriginConsumer` pair comes from moq-lite. The session creates them before the handshake:
- **Client (connect):** Creates `OriginProducer` for publish and `OriginConsumer` for subscribe, then `Client::new().with_publish(...).with_consume(...).connect(session)`
- **Server (accept):** Same pattern with `Server::new().with_publish(...).with_consume(...).accept(session)`
**Key methods:**
| Method | Description |
|--------|-------------|
| `subscribe(name)` | Wait for remote to announce broadcast, return `BroadcastConsumer` |
| `publish(name, consumer)` | Make a broadcast available to remote peer |
| `conn()` | Reference to underlying QUIC `Connection` (for stats) |
| `remote_id()` | Remote peer's `EndpointId` |
| `close(code, reason)` | Close the session |
| `closed()` | Wait for session to close, returns `SessionError` |
| `origin_producer()` | Direct access to moq-lite publish origin |
| `origin_consumer()` | Direct access to moq-lite subscribe origin |
### `IncomingSession` / `IncomingSessionStream`
`IncomingSession` wraps a `MoqSession` that has completed the handshake. Provides:
- `remote_id()` — the connecting peer's identity
- `accept()` — returns the `MoqSession`
- `reject()` — closes with error code 1
`IncomingSessionStream` is an async stream that yields `IncomingSession` values. It uses a `broadcast::Receiver<MoqSession>` internally and handles lag by skipping missed sessions.
## Actor Internals
The `Actor` is the core event loop for the `Moq` transport:
```
loop {
    select! {
        msg = inbox.recv()  → handle_message(msg)
        session_closed      → remove session, log
        broadcast_closed    → remove from publishing map
        connect_completed   → handle_session or reply to caller
    }
}
```
### Message Types
```rust
enum ActorMessage {
    HandleSession { session: Box<MoqSession> },
    LocalBroadcast { broadcast_name: String, producer: BroadcastProducer },
    Connect { remote: EndpointAddr, reply: oneshot::Sender<...> },
    GetPublished { reply: oneshot::Sender<Vec<String>> },
}
```
### Connection Deduplication
When `Connect` is received for a peer that already has an active session, the existing session is returned immediately. If a connection attempt is already in progress, the oneshot reply is queued and notified when the connection completes.
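
The deduplication rule can be sketched as a small state machine. All names here (`Dedup`, `ConnectOutcome`, the `u32` peer ID, the `String` session, the `u64` waiter IDs) are hypothetical stand-ins for `EndpointId`, `MoqSession`, and the oneshot reply senders; the sketch shows only the bookkeeping, not the actual dialing.

```rust
use std::collections::HashMap;

type PeerId = u32; // hypothetical stand-in for EndpointId
type Session = String; // hypothetical stand-in for MoqSession

/// What happens to a connect request.
#[derive(Debug, PartialEq)]
enum ConnectOutcome {
    /// A session already exists and is returned immediately.
    Existing(Session),
    /// A dial is already in flight; the caller was queued behind it.
    Queued,
    /// No session and no dial; the caller was queued and a dial should start.
    StartDial,
}

#[derive(Default)]
struct Dedup {
    sessions: HashMap<PeerId, Session>,
    /// Waiter IDs per peer that has a dial in flight.
    pending: HashMap<PeerId, Vec<u64>>,
}

impl Dedup {
    fn connect(&mut self, peer: PeerId, waiter: u64) -> ConnectOutcome {
        if let Some(session) = self.sessions.get(&peer) {
            return ConnectOutcome::Existing(session.clone());
        }
        let waiters = self.pending.entry(peer).or_default();
        waiters.push(waiter);
        if waiters.len() == 1 {
            ConnectOutcome::StartDial
        } else {
            ConnectOutcome::Queued
        }
    }

    /// Record a finished dial and return the waiters that must be notified.
    fn dial_completed(&mut self, peer: PeerId, session: Session) -> Vec<u64> {
        self.sessions.insert(peer, session);
        self.pending.remove(&peer).unwrap_or_default()
    }
}
```

Only the first request for a peer triggers a dial; later requests either join the waiter list or get the stored session back.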
### Broadcast Fan-out
When a `LocalBroadcast` is published via `Moq::publish()`:
1. The actor stores the `BroadcastProducer` in its `publishing` map
2. It immediately announces the broadcast to all existing sessions by calling `session.publish(name, producer.consume())` on each
3. For future sessions, the actor iterates `publishing` entries and announces each one
4. A `FuturesUnordered` tracks when each broadcast closes, removing it from the map
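
The fan-out steps above can be sketched with plain collections. `FanOut` and its methods are hypothetical names, sessions are reduced to the list of broadcast names announced to them, and re-publishing under an existing name is ignored for simplicity (an assumption, not documented behaviour).

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Default)]
struct FanOut {
    /// Names of locally published broadcasts.
    publishing: BTreeSet<String>,
    /// For each session ID, the broadcast names announced to it so far.
    sessions: BTreeMap<u32, Vec<String>>,
}

impl FanOut {
    /// Store the broadcast and announce it to every existing session.
    fn publish(&mut self, name: &str) {
        if !self.publishing.insert(name.to_string()) {
            return; // already published; this sketch does not model replacement
        }
        for announced in self.sessions.values_mut() {
            announced.push(name.to_string());
        }
    }

    /// A new session receives every currently published broadcast.
    fn add_session(&mut self, id: u32) {
        let announced = self.publishing.iter().cloned().collect();
        self.sessions.insert(id, announced);
    }

    /// A closed broadcast is no longer announced to future sessions.
    fn broadcast_closed(&mut self, name: &str) {
        self.publishing.remove(name);
    }
}
```

The two paths (announce on publish, announce on new session) together ensure every session sees every live broadcast regardless of arrival order.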
### Session Lifecycle
When a session is established (either incoming or outgoing):
1. All currently published broadcasts are announced to it
2. It's stored in `sessions` by `EndpointId`
3. A session task is spawned that waits for the session to close
4. If there were pending connect requests for this peer, they're fulfilled
## Error Types
```rust
enum Error {
    Connect(ConnectError),                        // iroh connection failure
    Moq(moq_lite::Error),                         // MoQ protocol error
    Server(web_transport_iroh::ServerError),      // WebTransport server error
    InternalConsistencyError(LiveActorDiedError), // Actor died
    Request(WriteError),                          // QUIC write error
}
enum SubscribeError {
    NotAnnounced,                // Track was not announced
    Closed,                      // Track was closed
    SessionClosed(SessionError), // Session closed
}
```

@@ -0,0 +1,185 @@
# iroh-live: Rooms — Multi-Party Coordination
## Overview
The `rooms` module provides multi-party room coordination over iroh-gossip. Participants discover each other via a gossip topic, automatically connect and subscribe to each other's broadcasts, and receive `RoomEvent` notifications as peers join, publish, and leave.
## Core Types
### `Room`
The main room handle. Created via `Room::new(live, ticket)`. Spawns an internal actor that manages all peer coordination.
```rust
// Create a room (generates a random topic)
let ticket = RoomTicket::generate();
let room = Room::new(&live, ticket.clone()).await?;
// Or join an existing room
let room = Room::new(&live, existing_ticket).await?;
```
**Methods:**
- `recv()` — Wait for next `RoomEvent`
- `try_recv()` — Non-blocking event check
- `ticket()` — Get a ticket that includes this peer as a bootstrap node
- `split()` — Decompose into `(RoomEvents, RoomHandle)` for use in separate tasks
- `publish(name, &LocalBroadcast)` — Publish a broadcast to the room
- `set_chat_publisher(publisher)` — Register a chat publisher
- `send_chat(text)` — Send a chat message
### `RoomHandle`
Cloneable handle for publishing into a room. Obtained from `Room::split()`. Can be shared across tasks.
```rust
let (events, handle) = room.split();
// In one task: receive events
while let Some(event) = events.recv().await {
    match event { ... }
}
// In another task: publish
handle.publish("camera", &broadcast).await?;
handle.send_chat("Hello!").await?;
handle.set_display_name("Alice").await?;
```
### `RoomTicket`
```rust
pub struct RoomTicket {
    pub bootstrap: Vec<EndpointId>, // Bootstrap peer IDs for gossip
    pub topic_id: TopicId,          // Gossip topic identifier
}
```
Serialized via `iroh_tickets` (binary format). Can be created from:
- `RoomTicket::generate()` — Random topic, no bootstrap
- `RoomTicket::new(topic_id, bootstrap)` — Specific topic and peers
- `RoomTicket::new_from_env()` — From `IROH_LIVE_ROOM` or `IROH_LIVE_TOPIC` env vars
### `RoomEvent`
```rust
pub enum RoomEvent {
    RemoteAnnounced {
        remote: EndpointId,
        broadcasts: Vec<String>,
    },
    BroadcastSubscribed {
        session: Box<MoqSession>,
        broadcast: Box<RemoteBroadcast>,
    },
    PeerJoined {
        remote: EndpointId,
        display_name: Option<String>,
    },
    PeerLeft {
        remote: EndpointId,
    },
    ChatReceived {
        remote: EndpointId,
        message: ChatMessage,
    },
}
```
## Room Actor — Internal Architecture
The room actor is a spawned task that manages the gossip KV subscription and coordinates all peer connections.
### State
```rust
struct Actor {
    me: EndpointId,
    _gossip: Gossip,
    live: Live,
    active_subscribe: HashSet<BroadcastId>,           // (EndpointId, name) pairs
    active_publish: HashSet<String>,                  // Locally published broadcast names
    known_peers: HashMap<EndpointId, Option<String>>, // display names
    connecting: ConnectingFutures,                    // In-flight subscribe attempts
    subscribe_closed: FuturesUnordered,               // Track subscription lifetimes
    publish_closed: FuturesUnordered,                 // Track publish lifetimes
    chat_messages: FuturesUnordered,                  // Active chat subscribers
    chat_publisher: Option<ChatPublisher>,
    display_name: Option<String>,
    event_tx: mpsc::Sender<RoomEvent>,
    kv: iroh_smol_kv::Client,                         // Distributed KV for peer state
    kv_writer: WriteScope,                            // KV write access
}
```
### Gossip KV for Peer Discovery
The room uses `iroh-smol-kv` over gossip for peer state coordination. Each peer writes their `PeerState` to key `b"s"`:
```rust
struct PeerState {
    broadcasts: Vec<String>,
    display_name: Option<String>,
}
```
Serialized with postcard (binary format — **no `skip_serializing_if`** allowed since postcard is positional).
### Event Loop
```
loop {
    select! {
        update = gossip_kv_stream.next() → handle_gossip_update
        msg = inbox.recv()               → handle_api_message
        result = connecting.next()       → subscribe succeeded/failed
        broadcast_closed                 → remove from active, maybe emit PeerLeft
        publish_closed                   → remove from active_publish, update KV
        chat_message                     → emit ChatReceived
    }
}
```
### Peer Discovery Flow
1. Peer A publishes a broadcast via `handle.publish("camera", &broadcast)`
2. Actor publishes to MoQ AND updates gossip KV with `PeerState { broadcasts: ["camera"], display_name: ... }`
3. Peer B's gossip KV stream receives the update
4. Peer B's actor checks `known_peers` — if new, emits `PeerJoined`
5. Peer B's actor checks `active_subscribe` — if new broadcast, initiates `live.subscribe(remote, name)`
6. When subscription succeeds, Peer B emits `BroadcastSubscribed`
7. If the broadcast has a chat track, a chat subscriber is spawned
### Chat
Chat uses a dedicated MoQ track within each broadcast. Each message is a single MoQ group containing one frame of UTF-8 text. The sender identity comes from the broadcast context (peer ID), not the message payload.
### Connection Lifecycle
- When a broadcast closes (`subscribe_closed`), it's removed from `active_subscribe`
- If this was the last broadcast from that peer, `PeerLeft` is emitted
- When a publish closes (`publish_closed`), the KV is updated to remove that broadcast
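
The "last broadcast from a peer" rule can be sketched as follows. `Subscriptions` and its methods are hypothetical, with a `u32` standing in for `EndpointId`; the return value of `closed` indicates whether a `PeerLeft` event would be emitted under the rule described above.

```rust
use std::collections::HashSet;

type PeerId = u32; // hypothetical stand-in for EndpointId

#[derive(Default)]
struct Subscriptions {
    /// Active (peer, broadcast name) pairs, mirroring `active_subscribe`.
    active: HashSet<(PeerId, String)>,
}

impl Subscriptions {
    fn subscribed(&mut self, peer: PeerId, name: &str) {
        self.active.insert((peer, name.to_string()));
    }

    /// Remove a closed broadcast. Returns `true` when it was the peer's
    /// last active broadcast, which is when `PeerLeft` should be emitted.
    fn closed(&mut self, peer: PeerId, name: &str) -> bool {
        let removed = self.active.remove(&(peer, name.to_string()));
        removed && !self.active.iter().any(|(p, _)| *p == peer)
    }
}
```

Requiring `removed` to be true keeps a duplicate close notification from emitting `PeerLeft` twice.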
### `RoomPublisherSync`
A convenience wrapper for the common pattern of publishing camera+audio and optionally screen share into a room:
```rust
let publisher = RoomPublisherSync::new(room_handle, audio_backend);
publisher.set_state(&PublishOpts::default())?;
```
Automatically publishes a "camera" broadcast and manages a "screen" broadcast when screen sharing is toggled on.
## API Messages
```rust
enum ApiMessage {
    Publish { name: String, producer: BroadcastProducer },
    SendChat { text: String },
    SetChatPublisher { publisher: ChatPublisher },
    SetDisplayName { name: String },
}
```
These are sent from `RoomHandle` to the actor via an mpsc channel.

@@ -0,0 +1,105 @@
# iroh-live-relay: Browser Bridging
## Overview
The relay server bridges iroh P2P streams to browser clients via WebTransport. Browsers cannot speak iroh's QUIC protocol directly, so the relay accepts WebTransport connections and either serves locally-published broadcasts or pulls them from remote iroh publishers on demand.
**Architecture:**
```
iroh-live publish --(iroh P2P)--> iroh-live-relay <--(WebTransport)-- browser
browser --(WebTransport)--> iroh-live-relay --(iroh P2P)--> iroh-live subscribe
```
## Components
### `RelayConfig` (CLI Configuration)
```rust
pub struct RelayConfig {
    pub bind: SocketAddr,      // QUIC/WebTransport bind (default: [::]:4443)
    pub http_bind: SocketAddr, // HTTP static files bind (default: same as bind)
}
```
Flattenable into a clap CLI via `#[command(flatten)]`.
### `run(config)` — Main Server Loop
The main entry point. Sets up:
1. **QUIC/WebTransport server** — Uses `moq-native::ServerConfig` with:
- QUIC backend: `noq` (a custom QUIC implementation)
- iroh endpoint integration
- Self-signed TLS certificates (dev mode) for `localhost`
- Max streams: `moq_relay::DEFAULT_MAX_STREAMS`
2. **iroh endpoint** — Binds an iroh endpoint for P2P connectivity, prints its ID
3. **moq-relay Cluster** — The broadcast routing engine. Manages broadcast lifecycle: when all subscribers disconnect, the broadcast is removed.
4. **HTTP server** — Axum router serving:
- `GET /certificate.sha256` — TLS fingerprint for dev mode
- `GET /` — Web viewer landing page
- `GET /{path}` — Static file serving with CORS
- Embedded via `include_dir!` from `web/dist/`
5. **Pull mode** — If iroh endpoint is available, creates a `PullState` for on-demand remote broadcast fetching
6. **Connection loop** — Accepts incoming connections, parses the URL path as a `LiveTicket`, and if valid, triggers a pull before running the connection
### `PullState` — On-Demand Remote Fetching
When a browser connects with a broadcast name that is a valid `LiveTicket`, the relay:
1. Checks if the broadcast already exists in the cluster (fast path)
2. If not, connects to the remote publisher via iroh-live's `Moq::connect()`
3. Subscribes to the remote broadcast
4. Publishes the consumer into the local cluster under the ticket string as the name
5. Spawns a keepalive task that holds the session until it closes
**Concurrency:** Duplicate concurrent pulls for the same ticket are deduplicated using a `HashMap<String, Arc<Notify>>`. Waiters block on the `Notify` until the first connector finishes.
```rust
pub(crate) struct PullState {
    live: iroh_live::Live,
    cluster: Cluster,
    connecting: Arc<Mutex<HashMap<String, Arc<Notify>>>>,
}
```
### Web Viewer
The relay embeds a SolidJS + TypeScript web application compiled by Vite. It uses:
- `@moq/watch` — Web component for watching streams via WebCodecs
- `@moq/publish` — Web component for publishing from browser camera/mic
- WebTransport — For QUIC connectivity from the browser
Watch URLs: `https://relay:4443/?name=<BROADCAST_OR_TICKET>`
### Data Directory
The relay persists data to `$IROH_LIVE_RELAY_DATA` (or the platform default). This includes:
- iroh secret key (`iroh_secret_key`) — ensures endpoint ID stability across restarts
- TLS certificates
### TLS and Certificates
Currently **self-signed only**. ACME/Let's Encrypt is planned but not implemented. In dev mode, browsers need `--ignore-certificate-errors` or the relay's fingerprint (served at `/certificate.sha256`) for WebTransport to work.
## Authentication
No authentication is implemented yet. The relay accepts all connections. MoQ supports token-based authentication which could be added.
## CLI Binary
```rust
// iroh-live-relay/src/main.rs
#[derive(Parser)]
struct Cli {
    #[command(flatten)]
    relay: RelayConfig,
}
```
The binary must call `rustls::crypto::aws_lc_rs::default_provider().install_default()` before `run()`.

@@ -0,0 +1,304 @@
# moq-media: Media Pipelines
## Overview
`moq-media` owns the media pipeline: broadcast management, codec orchestration, playout timing, adaptive bitrate, and audio backend. **It has no dependency on iroh** — it works with any transport that implements `PacketSource` and `PacketSink`. This makes it usable for recording pipelines, studio links, and camera dashboards without RTC.
## Module Structure
```
moq-media/
├── lib.rs — Re-exports and feature-gated modules
├── publish.rs — LocalBroadcast, VideoPublisher, AudioPublisher
├── subscribe.rs — RemoteBroadcast, VideoTrack, AudioTrack, MediaTracks
├── transport.rs — PacketSource/PacketSink traits, MoqPacketSource, MoqPacketSink
├── net.rs — NetworkSignals (RTT, loss rate, available bandwidth)
├── adaptive.rs — Adaptive rendition switching algorithm
├── playout.rs — PlaybackPolicy, SyncMode
├── chat.rs — ChatPublisher, ChatSubscriber (MoQ track-based)
├── frame_channel.rs — Single-frame channel (last-writer-wins for video)
├── sync.rs — Shared playout clock (Sync) for A/V sync
├── stats.rs — Metric, Label, NetStats, EncodeStats, RenderStats, etc.
├── pipeline.rs — Pipeline orchestration
├── pipeline/ — VideoEncoderPipeline, AudioEncoderPipeline, VideoDecoderPipeline, etc.
├── audio_backend.rs — AudioBackend trait and device enumeration
├── audio_backend/ — Platform-specific audio backends (cpal, etc.)
├── capture.rs — Camera/screen capture integration
├── source_spec.rs — VideoInput, PreEncodedTrack
├── test_util.rs — Test utilities (feature-gated)
└── processing/ — Scale, color conversion, etc.
```
## Publish Pipeline — `LocalBroadcast`
`LocalBroadcast` manages encoder pipelines and publishes a catalog that subscribers use to discover available renditions. It owns a `BroadcastProducer` (from moq-lite) and coordinates video and audio track lifecycles.
### Construction
```rust
let broadcast = LocalBroadcast::new();
broadcast.video().set_source(camera, VideoCodec::H264, [VideoPreset::P720])?;
broadcast.audio().set(mic, AudioCodec::Opus, [AudioPreset::Hq])?;
// Or pre-encoded sources
broadcast.video().set(VideoInput::pre_encoded("video/h264-pi", config, factory))?;
```
### Slot Handles
- `broadcast.video()` → `VideoPublisher` (borrows `&self`)
- `broadcast.audio()` → `AudioPublisher` (borrows `&self`)
Both use interior mutability. Calling `set()` tears down any existing pipeline and installs the new one.
### Video Input Modes
```rust
pub enum VideoInput {
Renditions(VideoRenditions), // Raw source → multiple encoded renditions (simulcast)
PreEncoded(Vec<PreEncodedTrack>), // Already-encoded tracks pass through
}
```
**`VideoRenditions`** holds a `SharedVideoSource` and a map of rendition names to encoder factories. Multiple renditions share the same source via `watch::Receiver<Option<VideoFrame>>`. Slow encoders never cause backpressure on the source — intermediate frames are silently skipped.
**`PreEncodedTrack`** is for hardware encoders that produce compressed output directly (e.g., rpicam-vid on Raspberry Pi). Each track carries a name, `VideoConfig`, and a factory closure that creates a fresh source per subscriber.
### SharedVideoSource
Runs the capture source on a dedicated OS thread. Parks when no subscribers are connected (releasing camera/screen resources) and unparks when the first subscriber arrives. Uses `AtomicU32` subscriber counting with proper memory ordering (`AcqRel`/`Acquire`).
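The subscriber counting described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's code: `SubscriberCount` and its method names are hypothetical, and only the `AtomicU32` counter with `AcqRel`/`Acquire` ordering comes from the description.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical counter that tells the capture thread when to park or unpark.
#[derive(Default)]
pub struct SubscriberCount(AtomicU32);

impl SubscriberCount {
    /// Registers a subscriber. Returns true for the first one,
    /// which is the moment the capture thread would be unparked.
    pub fn subscribe(&self) -> bool {
        self.0.fetch_add(1, Ordering::AcqRel) == 0
    }

    /// Removes a subscriber. Returns true when the last one left,
    /// which is the moment the capture thread may park again.
    pub fn unsubscribe(&self) -> bool {
        self.0.fetch_sub(1, Ordering::AcqRel) == 1
    }

    /// Read-only check, as the capture loop might do before grabbing a frame.
    pub fn is_idle(&self) -> bool {
        self.0.load(Ordering::Acquire) == 0
    }
}
```

Returning the transition (first in, last out) from the same atomic operation avoids a separate load that could race with another subscriber.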
Frames are distributed via `watch::Sender<Option<VideoFrame>>` — always contains the latest frame, so slow encoders never block the source.
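To illustrate why a latest-value channel never blocks the producer, here is a small std-only sketch. It swaps tokio's `watch` channel for a `Mutex` plus `Condvar`, and `LatestSlot` with its methods is a hypothetical name, so treat it as a model of the behaviour rather than the actual implementation.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Holds only the most recent value together with a version counter.
pub struct LatestSlot<T> {
    state: Mutex<(u64, Option<T>)>,
    changed: Condvar,
}

impl<T: Clone> LatestSlot<T> {
    pub fn new() -> Arc<Self> {
        Arc::new(Self { state: Mutex::new((0, None)), changed: Condvar::new() })
    }

    /// Overwrites the stored value; never waits for readers.
    pub fn send(&self, value: T) {
        let mut state = self.state.lock().unwrap();
        state.0 += 1;
        state.1 = Some(value);
        self.changed.notify_all();
    }

    /// Blocks until a version newer than `seen` exists, then returns it.
    /// Versions written in between are skipped, never queued.
    pub fn recv_newer(&self, seen: u64) -> (u64, T) {
        let mut state = self.state.lock().unwrap();
        while state.0 <= seen {
            state = self.changed.wait(state).unwrap();
        }
        let value = state.1.clone().expect("a value exists once version > 0");
        (state.0, value)
    }
}
```

A slow encoder that calls `recv_newer` with the last version it processed simply receives the newest frame, which matches the "intermediate frames are skipped" behaviour described for renditions.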
### Demand-Driven Track Startup
The broadcast's run loop (`LocalBroadcast::run_dynamic`) calls `producer.requested_track().await` to wait for subscriber demand. When a subscriber requests a specific rendition:
1. The loop looks up the rendition in the current `VideoInput` or `AudioRenditions`
2. It starts the corresponding encoder pipeline on a dedicated OS thread
3. When all subscribers disconnect (tracked via `track.unused().await`), the pipeline is stopped
This means encoder threads only run when someone is actually consuming.
### Catalog
`LocalBroadcast` maintains a catalog track (hang's built-in catalog mechanism) listing all available video and audio renditions with codec configuration, dimensions, and bitrate. Updated whenever video or audio is set/cleared.
Catalog format follows the `hang::catalog::Catalog` structure with `Video` and `Audio` entries, each containing a `BTreeMap<String, Config>` of rendition names to configurations.
### Encoder Pipeline Architecture
All encoder pipelines run on **dedicated OS threads** (`spawn_thread`), not tokio tasks. Codec operations are CPU-intensive and sometimes block on hardware (VAAPI, V4L2), so running on tokio tasks would starve other async work.
Communication with the async runtime:
- **VideoEncoderPipeline**: reads `SharedVideoSource` via `watch::Receiver`, writes encoded frames to `MoqPacketSink`
- **AudioEncoderPipeline**: reads from `AudioSource`, writes to `MoqPacketSink`
- **PreEncodedVideoPipeline**: reads from `PreEncodedVideoSource`, writes to `MoqPacketSink`
### Chat
```rust
let chat_publisher = broadcast.enable_chat()?;
chat_publisher.send("Hello!")?;
// Subscriber side
if let Some(chat_sub) = remote_broadcast.chat() {
let msg = chat_sub.recv().await;
}
```
Each chat message is a single MoQ group with one frame of UTF-8 text. The track name is `"chat"` with priority 10.
## Subscribe Pipeline — `RemoteBroadcast`
`RemoteBroadcast` wraps a `BroadcastConsumer` and watches its catalog for available video and audio renditions. Created with a `BroadcastConsumer` and a `PlaybackPolicy`.
### Construction
```rust
let broadcast = RemoteBroadcast::new("stream-name", consumer).await?;
// Or with explicit policy
let broadcast = RemoteBroadcast::with_playback_policy("stream", consumer, policy).await?;
```
On construction, spawns a catalog-watching task that publishes snapshots via `Watchable<CatalogSnapshot>`.
### `CatalogSnapshot`
Point-in-time view of the broadcast's catalog. Derefs to `hang::Catalog`. Carries a sequence number for change detection.
```rust
let catalog = broadcast.catalog();
catalog.video_renditions() // Iterator of rendition names sorted by width
catalog.audio_renditions() // Iterator of audio rendition names
catalog.select_video_rendition(Quality::High)? // Best match for quality
catalog.has_video()
catalog.has_audio()
catalog.has_chat()
catalog.user() // User metadata from publisher
```
### Rendition Selection
```rust
pub enum Quality { Highest, High, Mid, Low }
pub struct VideoTarget {
pub max_pixels: Option<u32>,
pub max_bitrate_kbps: Option<u32>,
pub rendition: Option<String>, // Pin to specific rendition
}
```
Each `Quality` level maps to a `VideoTarget`; for example, `Quality::High` → `max_pixels(1280*720)`. If `rendition` is set, it takes priority over the pixel and bitrate limits.
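The selection rule can be sketched as follows. This is a hedged illustration: `Rendition`, `VideoTarget::high()`, and `select` are hypothetical stand-ins, the fallback to the smallest rendition when nothing fits is an assumption, and only the three `VideoTarget` fields and the 720p cap for `Quality::High` come from the text.

```rust
/// Hypothetical stand-in for one catalog entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Rendition {
    pub name: String,
    pub pixels: u32,
    pub bitrate_kbps: u32,
}

/// Mirrors the `VideoTarget` fields listed above.
#[derive(Debug, Clone, Default)]
pub struct VideoTarget {
    pub max_pixels: Option<u32>,
    pub max_bitrate_kbps: Option<u32>,
    pub rendition: Option<String>,
}

impl VideoTarget {
    /// The one mapping the text states: `Quality::High` caps pixels at 720p.
    pub fn high() -> Self {
        Self { max_pixels: Some(1280 * 720), ..Self::default() }
    }
}

pub fn select<'a>(renditions: &'a [Rendition], target: &VideoTarget) -> Option<&'a Rendition> {
    // A pinned rendition wins whenever it exists in the catalog.
    if let Some(name) = &target.rendition {
        if let Some(found) = renditions.iter().find(|r| &r.name == name) {
            return Some(found);
        }
    }
    renditions
        .iter()
        .filter(|r| target.max_pixels.map_or(true, |max| r.pixels <= max))
        .filter(|r| target.max_bitrate_kbps.map_or(true, |max| r.bitrate_kbps <= max))
        // Largest rendition that satisfies both limits.
        .max_by_key(|r| r.pixels)
        // Assumption: if nothing fits, fall back to the smallest rendition.
        .or_else(|| renditions.iter().min_by_key(|r| r.pixels))
}
```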
### VideoTrack
Represents a decoded video stream from a remote broadcast. The decoder runs on a dedicated OS thread.
**Creation flow:**
1. Pick a rendition (via `VideoTarget` or explicit name)
2. Create `TrackConsumer` from `BroadcastConsumer`, wrap in `OrderedConsumer` with `PlaybackPolicy::max_latency`
3. Wrap in `MoqPacketSource`
4. A `forward_packets` async task reads from `MoqPacketSource` → `mpsc` channel
5. Decoder thread reads `mpsc` → decoder → output via `Sync` playout clock (or `FramePacer`)
6. Output channel: `FrameReceiver<VideoFrame>` (latest-frame wins, suitable for rendering)
**Frame access:**
- `track.try_recv()` — Returns latest frame, draining older buffered frames (for game loops)
- `track.next_frame().await` — Async wait for next frame
- `track.has_frame()` — Check without consuming
**Adaptive rendition switching:**
```rust
track.enable_adaptation(broadcast, signals, config, decode_config)?;
track.disable_adaptation();
track.is_adaptive();
track.selected_rendition();
track.set_rendition_mode(RenditionMode::Fixed("video/h264-360p".into()));
track.set_rendition_mode(RenditionMode::Auto);
track.rendition_watcher(); // Direct<String> watcher for rendition changes
```
### AudioTrack
Same pattern as `VideoTrack` but sends decoded samples to an `AudioSink` (typically cpal + sonora). The audio decoder thread runs a 10ms tick loop.
### MediaTracks
Convenience struct combining `RemoteBroadcast` with optional `VideoTrack` and `AudioTrack`:
```rust
pub struct MediaTracks {
pub broadcast: RemoteBroadcast,
pub video: Option<VideoTrack>,
pub audio: Option<AudioTrack>,
}
```
### Lifecycle
Both `VideoTrack` and `AudioTrack` use drop-based cleanup. Dropping cancels the decoder thread (via `CancellationToken`) and the `forward_packets` task (via `AbortOnDropHandle`). The `OrderedConsumer` is dropped, signaling the transport that the track is no longer needed.
## Transport Abstraction — `PacketSource` / `PacketSink`
The transport boundary between moq-media and the network:
```rust
pub trait PacketSource: Send + 'static {
fn read(&mut self) -> impl Future<Output = Result<Option<MediaPacket>>> + Send;
}
pub trait PacketSink: Send + 'static {
fn write(&mut self, packet: EncodedFrame) -> Result<()>;
fn finish(&mut self) -> Result<()>;
}
```
**`MoqPacketSink`** wraps an `OrderedProducer`. When it receives an `EncodedFrame` with `is_keyframe = true`, it calls `keyframe()` on the producer to start a new MoQ group. This keyframe-to-group mapping is how subscribers can join at any group boundary.
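The keyframe-to-group mapping can be modelled in a few lines. In this sketch `GroupingSink` is a hypothetical in-memory stand-in for the sink, `EncodedFrame` is reduced to two fields, and rejecting a delta frame that arrives before any keyframe is an assumption rather than documented behaviour.

```rust
/// Reduced stand-in for the encoded frame type.
pub struct EncodedFrame {
    pub is_keyframe: bool,
    pub payload: Vec<u8>,
}

/// Collects frames into groups, opening a new group at every keyframe.
#[derive(Default)]
pub struct GroupingSink {
    groups: Vec<Vec<Vec<u8>>>,
}

impl GroupingSink {
    pub fn write(&mut self, frame: EncodedFrame) -> Result<(), &'static str> {
        if frame.is_keyframe {
            // Equivalent of calling `keyframe()` on the producer.
            self.groups.push(Vec::new());
        }
        match self.groups.last_mut() {
            Some(group) => {
                group.push(frame.payload);
                Ok(())
            }
            // Assumption: a stream cannot start with a delta frame.
            None => Err("first frame must be a keyframe"),
        }
    }

    pub fn groups(&self) -> &[Vec<Vec<u8>>] {
        &self.groups
    }
}
```

Because every group starts with a keyframe, a subscriber that joins at a group boundary can decode from the first frame it receives.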
**`MoqPacketSource`** wraps an `OrderedConsumer` and reads frames, converting them to `MediaPacket`.
**`PipeSink` / `PipeSource`** — In-memory pipe for local encode→decode without network (testing, local preview).
## Adaptive Rendition Switching
The adaptation algorithm runs in a background task that monitors `NetworkSignals` and decides whether to switch to a different video rendition.
### Algorithm
Renditions are ranked by pixel count (highest first). The algorithm maintains state across ticks:
```rust
pub enum Decision {
Hold, // Stay on current rendition
Downgrade(usize), // Switch to lower at index
Emergency, // Drop to lowest immediately
StartProbe(usize), // Try upgrading to index
}
```
- **Emergency** (immediate): Loss rate ≥ 20% → drop to lowest rendition
- **Downgrade** (sustained 500ms): Loss rate ≥ 10% OR available bandwidth < 85% of current rendition's bitrate
- **Upgrade probe** (sustained 4s good conditions): Loss ≤ 2%, bandwidth ≥ 120% of next-higher rendition's bitrate → start 3-second probe on the higher rendition
- **Probe abort**: Loss ≥ 5% or new congestion events during probe → abort, 8s cooldown
- **Post-downgrade cooldown**: 4s after any downgrade before probes are allowed
### Implementation
The adaptation task (`adaptation_task_v2`) creates new `VideoDecoderPipeline`s that write to the same `FrameSender` via `with_sender()`. The frame channel stays the same while the underlying decoder pipeline gets swapped. When switching:
1. Create a new decoder pipeline for the target rendition
2. Drop the old pipeline handle
3. Update `selected_rendition` Watchable
## Playback and Sync
### PlaybackPolicy
```rust
pub struct PlaybackPolicy {
pub sync: SyncMode, // Synced (shared clock) or Unmanaged (PTS pacing)
pub max_latency: Duration, // Default: 150ms — how much buffering before skipping forward
}
```
### SyncMode
- **`Synced`** (default): Shared playout clock (`Sync`). Video frames are gated by `Sync::wait(pts)`, which blocks until `reference + pts + latency` arrives. Audio paces itself through its ring buffer (~80ms).
- **`Unmanaged`**: No synchronization. `FramePacer` sleeps between frames based on PTS deltas, clamped to 2× frame period.
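The `Unmanaged` pacing rule reduces to a clamp. The function below is a sketch with a hypothetical name and signature; treating a backwards PTS jump as zero delay is an assumption.

```rust
use std::time::Duration;

/// Delay before showing the frame at `pts`, given the previous frame's PTS.
/// The PTS delta is clamped to twice the nominal frame period.
pub fn pacing_delay(prev_pts: Duration, pts: Duration, frame_period: Duration) -> Duration {
    // Assumption: a PTS that moves backwards produces no delay.
    let delta = pts.saturating_sub(prev_pts);
    delta.min(frame_period * 2)
}
```

The clamp keeps a gap in the stream (for example after packet loss) from stalling playback for the full length of the gap.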
### Sync
The `Sync` type records arrival offsets via `received(pts)` and blocks on `wait(pts)` until `reference + pts + latency`. This keeps audio and video aligned without cross-path gating or signaling. Ported from the moq/js implementation.
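A rough model of the deadline computation is shown below. `PlayoutClock` is a hypothetical name, times are plain microsecond integers instead of `Instant`, and taking the smallest observed arrival offset as the `reference` is an assumption; the text only states that offsets are recorded and that the deadline is `reference + pts + latency`.

```rust
use std::time::Duration;

/// Simplified shared playout clock working in microseconds.
pub struct PlayoutClock {
    reference_us: Option<i64>,
    latency_us: i64,
}

impl PlayoutClock {
    pub fn new(latency: Duration) -> Self {
        Self { reference_us: None, latency_us: latency.as_micros() as i64 }
    }

    /// Records the arrival offset (arrival time minus PTS) of a frame.
    /// Assumption: the reference is the smallest offset seen so far.
    pub fn received(&mut self, arrival_us: i64, pts_us: i64) {
        let offset = arrival_us - pts_us;
        self.reference_us = Some(self.reference_us.map_or(offset, |r| r.min(offset)));
    }

    /// Time at which a frame with `pts_us` becomes due, if any frame was seen.
    pub fn deadline_us(&self, pts_us: i64) -> Option<i64> {
        self.reference_us.map(|r| r + pts_us + self.latency_us)
    }
}
```

Because audio and video would consult the same `reference`, frames with equal PTS become due at the same instant without either path signalling the other.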
## Stats
moq-media has a structured stats system for debug overlays:
- **`NetStats`** — RTT, loss%, bandwidth, path type (written by iroh-live transport bridge)
- **`EncodeStats`** — FPS, encode time, bitrate, codec, encoder, resolution, capture path
- **`RenderStats`** — FPS, decode time, decoder, renderer, rendition
- **`TimingStats`** — Audio buffer level, video/audio lag, A/V delta, video buffer depth
- **`Timeline`** — Ring buffer of `FrameMeta` entries for timeline visualization
Each `Metric` has EMA smoothing, a history ring buffer, and optional color thresholds. `Label` provides atomic string values.
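As an illustration of EMA smoothing combined with a bounded history, consider this sketch. The constructor parameters, the choice to store smoothed values in the history, and the type layout are all assumptions; the real `Metric` may differ.

```rust
use std::collections::VecDeque;

/// Hypothetical metric with exponential moving average and a ring history.
pub struct Metric {
    alpha: f64,
    smoothed: Option<f64>,
    history: VecDeque<f64>,
    capacity: usize,
}

impl Metric {
    /// `alpha` in (0, 1]: higher values follow new samples more closely.
    pub fn new(alpha: f64, capacity: usize) -> Self {
        let capacity = capacity.max(1);
        Self { alpha, smoothed: None, history: VecDeque::with_capacity(capacity), capacity }
    }

    pub fn record(&mut self, sample: f64) {
        // The first sample seeds the average; later ones are blended in.
        let next = match self.smoothed {
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
            None => sample,
        };
        self.smoothed = Some(next);
        // Drop the oldest entry once the ring is full.
        if self.history.len() == self.capacity {
            self.history.pop_front();
        }
        self.history.push_back(next);
    }

    pub fn value(&self) -> Option<f64> {
        self.smoothed
    }

    pub fn history(&self) -> impl Iterator<Item = f64> + '_ {
        self.history.iter().copied()
    }
}
```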
## Codec Support
Feature-gated codec support:
| Feature | Codec | Backend |
|---------|-------|---------|
| `h264` | H.264 | openh264 (software) |
| `av1` | AV1 | rav1e encoder, rav1d decoder |
| `opus` | Opus | opus crate |
| `vaapi` | VAAPI | Linux hardware encode/decode |
| `videotoolbox` | VideoToolbox | macOS hardware |
| `v4l2` | V4L2 | Raspberry Pi hardware |
| `pcm` | Raw PCM | No encoding |

@@ -0,0 +1,95 @@
# iroh-live: Network Signals and Adaptive Bitrate
## NetworkSignals
Produced by polling iroh QUIC connection stats. Consumed by `VideoTrack::enable_adaptation()` to decide when to switch video renditions.
```rust
pub struct NetworkSignals {
pub rtt: Duration, // Round-trip time to remote peer
pub loss_rate: f64, // Recent packet loss rate (0.0..=1.0), 200ms delta window
pub available_bps: u64, // Estimated available bandwidth (cwnd * 8 / rtt)
pub congestion_events: u64, // Monotonically increasing congestion counter
}
```
### Production
`spawn_signal_producer()` in `iroh-live/src/util.rs` polls every 200ms:
1. Gets connection paths via `conn.paths().get()`
2. Finds the selected path (`is_selected()`)
3. Reads path stats (`lost_packets`, `udp_tx.datagrams`, `cwnd`) and RTT
4. Computes delta-based loss rate: `delta_lost / (delta_sent + delta_lost)`
5. Estimates bandwidth: `cwnd * 8 * 1e9 / rtt_ns`
6. Writes to `watch::Sender<NetworkSignals>`
Also: `spawn_stats_recorder()` records into `NetStats` for the debug overlay (RTT, loss%, bandwidth in/out, path type).
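Steps 4 and 5 of the polling loop are plain arithmetic and can be sketched without any iroh types. The function names are hypothetical, and the handling of the zero cases (no packets in the window, zero RTT) is an assumption.

```rust
use std::time::Duration;

/// Loss rate over one polling window, from cumulative counters.
pub fn loss_rate(prev_lost: u64, lost: u64, prev_sent: u64, sent: u64) -> f64 {
    let delta_lost = lost.saturating_sub(prev_lost);
    let delta_sent = sent.saturating_sub(prev_sent);
    let total = delta_sent + delta_lost;
    // Assumption: an idle window counts as zero loss.
    if total == 0 {
        0.0
    } else {
        delta_lost as f64 / total as f64
    }
}

/// Bandwidth estimate in bits per second: cwnd * 8 * 1e9 / rtt_ns.
pub fn available_bps(cwnd_bytes: u64, rtt: Duration) -> u64 {
    let rtt_ns = rtt.as_nanos();
    // Assumption: report zero when no RTT sample exists yet.
    if rtt_ns == 0 {
        return 0;
    }
    (cwnd_bytes as u128 * 8 * 1_000_000_000 / rtt_ns) as u64
}
```

For example, a congestion window of 125,000 bytes at 100 ms RTT yields an estimate of 10 Mbit/s.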
## Adaptive Rendition Algorithm
Located in `moq-media/src/adaptive.rs`. The algorithm evaluates `NetworkSignals` against configured thresholds and produces `Decision` values.
### Configuration (`AdaptiveConfig`)
| Parameter | Default | Description |
|-----------|---------|-------------|
| `upgrade_hold` | 4s | Sustained good conditions before upgrade probe |
| `downgrade_hold` | 500ms | Sustained bad conditions before downgrade |
| `probe_duration` | 3s | How long a probe runs before committing |
| `probe_cooldown` | 8s | Cooldown after a failed probe |
| `post_downgrade_cooldown` | 4s | Cooldown after any downgrade |
| `loss_downgrade` | 10% | Loss rate threshold for downgrade |
| `loss_emergency` | 20% | Loss rate for immediate drop to lowest |
| `loss_good` | 2% | Loss rate considered "good" |
| `loss_probe_abort` | 5% | Loss rate that aborts an active probe |
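| `bw_downgrade_ratio` | 85% | Downgrade when available bandwidth falls below this fraction of the current rendition's bitrate |
| `bw_probe_headroom` | 120% | Probe only when available bandwidth is at least this fraction of the next-higher rendition's bitrate |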
| `check_interval` | 200ms | How often adaptation task checks signals |
### Decision Logic
```
1. Emergency: loss >= 20% AND not already lowest → Drop to lowest immediately
2. Downgrade check:
- bandwidth_stressed (available < current_bitrate * 85%) OR loss >= 10%
- sustained for downgrade_hold (500ms) → Downgrade(next_lower)
3. Upgrade check:
- Already at highest → Hold
- Within post_downgrade_cooldown (4s) → Hold
- Within probe_cooldown (8s) → Hold
- bandwidth_headroom (available >= next_higher_bitrate * 120%) AND loss <= 2%
- sustained for upgrade_hold (4s) → StartProbe(next_higher)
4. Otherwise: Hold
```
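The decision logic above can be expressed as one function over explicit state. This is a sketch under assumptions: `Config` is a subset of `AdaptiveConfig` with the documented defaults, `State` and its field names are hypothetical, and the exact moments at which the hold timers reset are a guess rather than the code in `adaptive.rs`.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Decision { Hold, Downgrade(usize), Emergency, StartProbe(usize) }

pub struct Config {
    pub upgrade_hold: Duration,
    pub downgrade_hold: Duration,
    pub probe_cooldown: Duration,
    pub post_downgrade_cooldown: Duration,
    pub loss_downgrade: f64,
    pub loss_emergency: f64,
    pub loss_good: f64,
    pub bw_downgrade_ratio: f64,
    pub bw_probe_headroom: f64,
}

impl Default for Config {
    fn default() -> Self {
        Self {
            upgrade_hold: Duration::from_secs(4),
            downgrade_hold: Duration::from_millis(500),
            probe_cooldown: Duration::from_secs(8),
            post_downgrade_cooldown: Duration::from_secs(4),
            loss_downgrade: 0.10,
            loss_emergency: 0.20,
            loss_good: 0.02,
            bw_downgrade_ratio: 0.85,
            bw_probe_headroom: 1.20,
        }
    }
}

/// Hypothetical state carried between ticks.
#[derive(Default)]
pub struct State {
    pub bad_since: Option<Instant>,
    pub good_since: Option<Instant>,
    pub last_downgrade: Option<Instant>,
    pub last_failed_probe: Option<Instant>,
}

/// `bitrates[0]` is the highest rendition; `current` indexes into `bitrates`.
pub fn decide(cfg: &Config, st: &mut State, bitrates: &[u64], current: usize,
              loss: f64, available_bps: u64, now: Instant) -> Decision {
    if current >= bitrates.len() {
        return Decision::Hold;
    }
    let lowest = bitrates.len() - 1;
    // 1. Emergency: heavy loss drops straight to the lowest rendition.
    if loss >= cfg.loss_emergency && current != lowest {
        st.bad_since = None;
        st.good_since = None;
        st.last_downgrade = Some(now);
        return Decision::Emergency;
    }
    // 2. Downgrade: bad conditions must persist for `downgrade_hold`.
    let stressed = (available_bps as f64) < bitrates[current] as f64 * cfg.bw_downgrade_ratio;
    if (stressed || loss >= cfg.loss_downgrade) && current != lowest {
        st.good_since = None;
        let since = *st.bad_since.get_or_insert(now);
        if now.duration_since(since) < cfg.downgrade_hold {
            return Decision::Hold;
        }
        st.bad_since = None;
        st.last_downgrade = Some(now);
        return Decision::Downgrade(current + 1);
    }
    st.bad_since = None;
    // 3. Upgrade: blocked at the top rendition and during either cooldown.
    let cooling = |since: Option<Instant>, cooldown: Duration| {
        since.map_or(false, |t| now.duration_since(t) < cooldown)
    };
    if current == 0
        || cooling(st.last_downgrade, cfg.post_downgrade_cooldown)
        || cooling(st.last_failed_probe, cfg.probe_cooldown)
    {
        st.good_since = None;
        return Decision::Hold;
    }
    let headroom = available_bps as f64 >= bitrates[current - 1] as f64 * cfg.bw_probe_headroom;
    if !(headroom && loss <= cfg.loss_good) {
        st.good_since = None;
        return Decision::Hold;
    }
    // Good conditions must persist for `upgrade_hold` before probing.
    let since = *st.good_since.get_or_insert(now);
    if now.duration_since(since) < cfg.upgrade_hold {
        return Decision::Hold;
    }
    st.good_since = None;
    Decision::StartProbe(current - 1)
}
```

Passing `now` explicitly keeps the function deterministic, so the hold and cooldown timers can be exercised in tests without sleeping.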
### Probe Lifecycle
When `StartProbe(idx)` is decided:
1. Create a new decoder pipeline for the higher rendition
2. Write frames to the same `FrameSender` (seamless switch for the consumer)
3. Monitor signals during the probe period
4. If `should_abort_probe()` (loss ≥ 5% or new congestion events) → abort, drop probe pipeline, cooldown 8s
5. If probe duration (3s) passes without abort → commit, replace current pipeline
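The abort and commit conditions in steps 4 and 5 can be captured in a small pure function. `ProbeOutcome` and `probe_step` are hypothetical names; the thresholds are the documented defaults, and interpreting "new congestion events" as the counter exceeding its value at probe start is an assumption.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum ProbeOutcome {
    Continue,
    Abort,
    Commit,
}

const LOSS_PROBE_ABORT: f64 = 0.05;
const PROBE_DURATION: Duration = Duration::from_secs(3);

/// Evaluates one tick of an active probe.
/// `baseline_events` is the congestion counter sampled when the probe began.
pub fn probe_step(
    elapsed: Duration,
    loss_rate: f64,
    congestion_events: u64,
    baseline_events: u64,
) -> ProbeOutcome {
    if loss_rate >= LOSS_PROBE_ABORT || congestion_events > baseline_events {
        // The caller would drop the probe pipeline and start the 8s cooldown.
        ProbeOutcome::Abort
    } else if elapsed >= PROBE_DURATION {
        // The caller would replace the current pipeline with the probe.
        ProbeOutcome::Commit
    } else {
        ProbeOutcome::Continue
    }
}
```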
### Rendition Ranking
```rust
pub fn rank_renditions(renditions: &BTreeMap<String, VideoConfig>) -> Vec<RankedRendition>
```
Sorts by pixel count descending (highest quality = index 0). Each `RankedRendition` carries name, pixels, bitrate_bps, width, height.
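A possible shape for this function is sketched below. The `VideoConfig` here is a simplified stand-in with three required fields; the real catalog type may use optional fields, in which case missing values would need a default.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the catalog's video config.
#[derive(Debug, Clone)]
pub struct VideoConfig {
    pub width: u32,
    pub height: u32,
    pub bitrate_bps: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct RankedRendition {
    pub name: String,
    pub pixels: u64,
    pub bitrate_bps: u64,
    pub width: u32,
    pub height: u32,
}

/// Returns renditions ordered by pixel count, highest quality at index 0.
pub fn rank_renditions(renditions: &BTreeMap<String, VideoConfig>) -> Vec<RankedRendition> {
    let mut ranked: Vec<RankedRendition> = renditions
        .iter()
        .map(|(name, cfg)| RankedRendition {
            name: name.clone(),
            pixels: u64::from(cfg.width) * u64::from(cfg.height),
            bitrate_bps: cfg.bitrate_bps,
            width: cfg.width,
            height: cfg.height,
        })
        .collect();
    // Descending by pixels; the stable sort keeps name order for ties.
    ranked.sort_by(|a, b| b.pixels.cmp(&a.pixels));
    ranked
}
```

With this ordering, "next lower" is `index + 1` and "next higher" is `index - 1`, which is how the `Decision` indices are meant to be read.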
### RenditionMode
```rust
pub enum RenditionMode {
Auto, // Algorithm-driven switching
Fixed(String), // Pin to a specific rendition
}
```
Controlled via `VideoTrack::set_rendition_mode()`. In Fixed mode, the algorithm switches directly to the named rendition without probing.

@@ -0,0 +1,85 @@
# iroh-live: P2P Connectivity and Relay Architecture
## Direct Connectivity
iroh connects peers directly when possible:
- **Same LAN:** Communicates over the local network without traffic leaving the subnet
- **Public IP / simple NAT:** iroh's hole-punching establishes a direct UDP path
- **Symmetric NAT / corporate firewalls / CGNAT:** Falls back to iroh relay network
The iroh endpoint exposes path statistics via `conn.paths()`, which returns a `Watcher<PathInfoList>`. Each `PathInfo` reports RTT, whether the path is selected, and the remote address. The selected path is the one actively carrying traffic; iroh may maintain multiple candidate paths and switch between them.
The transition between direct and relayed paths is transparent to the application. The media pipeline sees only changes in RTT and bandwidth, which adaptive rendition switching handles automatically.
## iroh-live-relay: Architecture
The relay serves two transport protocols simultaneously:
```
iroh P2P publisher ──(QUIC, moq-lite-03)──> iroh-live-relay <──(WebTransport/H3, noq)── browser
```
Both protocols feed into `moq-relay`'s shared `Origin`, which manages broadcast routing. A broadcast published via iroh is automatically available to WebTransport subscribers, and vice versa.
### Pull Model
The relay operates in **pull mode**: it connects to iroh publishers on demand when a browser client requests a broadcast. The broadcast name in the URL can be a `LiveTicket` URI. Multiple browser clients watching the same broadcast share a single upstream iroh connection.
Pull flow:
1. Browser connects via WebTransport, requests broadcast by name (or ticket)
2. Relay checks if broadcast already exists in local cluster → fast path
3. If not, relay uses iroh-live `Moq::connect()` to connect to the remote publisher
4. Subscribes to the broadcast via `session.subscribe(broadcast_name)`
5. Publishes the consumer into the local cluster under the ticket string as the name
6. Spawns a keepalive task holding the session until it closes
7. Browser receives the stream through the relay's WebTransport frontend
### Connection Deduplication
`PullState` uses a `HashMap<String, Arc<Notify>>` to prevent duplicate concurrent connections to the same remote. If a pull is already in progress for a given ticket, subsequent requests wait on the `Notify` and then check if the broadcast appeared in the cluster.
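The deduplication pattern can be modelled with blocking primitives. This sketch replaces tokio's `Notify` with a `Mutex<bool>` plus `Condvar`, and `PullRegistry`, `Pending`, and `Role` are hypothetical names, so it shows the leader/waiter idea rather than the relay's async code.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Completion flag shared between the leader and any waiters.
#[derive(Default)]
pub struct Pending {
    done: Mutex<bool>,
    cond: Condvar,
}

impl Pending {
    /// Blocks until the leader has finished its pull attempt.
    pub fn wait(&self) {
        let mut done = self.done.lock().unwrap();
        while !*done {
            done = self.cond.wait(done).unwrap();
        }
    }
}

pub enum Role {
    /// The caller should perform the pull, then call `finish`.
    Leader,
    /// Another pull is in flight; wait, then re-check the cluster.
    Waiter(Arc<Pending>),
}

#[derive(Default)]
pub struct PullRegistry {
    connecting: Mutex<HashMap<String, Arc<Pending>>>,
}

impl PullRegistry {
    pub fn begin(&self, ticket: &str) -> Role {
        let mut map = self.connecting.lock().unwrap();
        if let Some(pending) = map.get(ticket) {
            return Role::Waiter(Arc::clone(pending));
        }
        map.insert(ticket.to_string(), Arc::new(Pending::default()));
        Role::Leader
    }

    /// Called by the leader on success or failure; wakes all waiters.
    pub fn finish(&self, ticket: &str) {
        let pending = self.connecting.lock().unwrap().remove(ticket);
        if let Some(pending) = pending {
            *pending.done.lock().unwrap() = true;
            pending.cond.notify_all();
        }
    }
}
```

Waiters do not receive a result from the leader; as in the description, they re-check whether the broadcast appeared and may become the next leader if it did not.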
### QUIC Backend: noq
The relay uses `noq` as its QUIC backend (not quinn). This is configured via:
```rust
server_config.backend = Some(moq_native::QuicBackend::Noq);
```
### iroh Endpoint Integration
The relay also binds an iroh endpoint:
```rust
let mut iroh_config = moq_native::IrohEndpointConfig::default();
iroh_config.enabled = Some(true);
iroh_config.secret = Some(relay.iroh_secret_path_str());
let iroh = iroh_config.bind().await?;
```
This enables the relay to participate in the iroh P2P network directly.
## Ticket Format
`LiveTicket` serves as the connection mechanism for both P2P and relay scenarios:
- **P2P:** Subscriber uses the `EndpointAddr` (node ID + relay URLs) to connect directly
- **Relay:** The full ticket string becomes the broadcast name in the URL: `https://relay:4443/?name=iroh-live:...`
The ticket format: `iroh-live:<base64url(postcard(EndpointAddr))>/<broadcast_name>`
It also supports a legacy format: `<name>@<base32(postcard(EndpointAddr))>`
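Both layouts can be told apart by their separators before any decoding happens. The sketch below only splits the string; it does not decode base64url, base32, or postcard, and `TicketParts` and `split_ticket` are hypothetical names rather than the `LiveTicket` API.

```rust
#[derive(Debug, PartialEq)]
pub enum TicketParts<'a> {
    /// `iroh-live:<addr>/<broadcast_name>`
    Current { addr_base64url: &'a str, broadcast: &'a str },
    /// `<name>@<addr>`
    Legacy { broadcast: &'a str, addr_base32: &'a str },
}

pub fn split_ticket(ticket: &str) -> Option<TicketParts<'_>> {
    if let Some(rest) = ticket.strip_prefix("iroh-live:") {
        // base64url never contains '/', so the first '/' ends the address
        // and the broadcast name may itself contain further slashes.
        let (addr, broadcast) = rest.split_once('/')?;
        if addr.is_empty() || broadcast.is_empty() {
            return None;
        }
        return Some(TicketParts::Current { addr_base64url: addr, broadcast });
    }
    // base32 never contains '@', so the last '@' starts the address.
    let (broadcast, addr) = ticket.rsplit_once('@')?;
    if addr.is_empty() || broadcast.is_empty() {
        return None;
    }
    Some(TicketParts::Legacy { broadcast, addr_base32: addr })
}
```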
## Connection Access in iroh-moq
`MoqSession::conn()` returns a reference to the underlying iroh `Connection`. This is used by:
1. **Signal producer** — Polls path stats for `NetworkSignals`
2. **Stats recorder** — Records into `NetStats` for debug overlays
3. **Call::closed()** — Inspects QUIC close reason to determine `DisconnectReason`
The connection provides:
- `paths().get()` — List of active network paths with RTT, stats, relay status
- `close_reason()` — Why the connection closed (LocallyClosed, ApplicationClosed, ConnectionClosed, Reset)
- `remote_id()` — Remote peer's endpoint ID

@@ -0,0 +1,42 @@
# iroh-live Reference Documentation
> **Status:** Early tech preview. APIs are unstable. Based on source code analysis of the iroh-live workspace.
## Files
| File | Topic |
|------|-------|
| [01-overview-and-architecture](01-overview-and-architecture.md) | Workspace structure, crate layers, design principles, data flow, dependencies |
| [02-core-api](02-core-api.md) | `Live`, `LiveTicket`, `Call`, `Subscription`, `DisconnectReason`, `util` module |
| [03-iroh-moq-transport](03-iroh-moq-transport.md) | `Moq`, `MoqSession`, `MoqProtocolHandler`, actor internals, session lifecycle, error types |
| [04-rooms](04-rooms.md) | `Room`, `RoomHandle`, `RoomTicket`, `RoomEvent`, gossip KV coordination, actor architecture |
| [05-relay](05-relay.md) | `iroh-live-relay`: browser bridging, pull model, `RelayConfig`, `PullState`, web viewer |
| [06-moq-media-pipelines](06-moq-media-pipelines.md) | `LocalBroadcast`, `RemoteBroadcast`, `VideoTrack`, `AudioTrack`, transport abstraction, codec support |
| [07-network-signals-and-adaptive-bitrate](07-network-signals-and-adaptive-bitrate.md) | `NetworkSignals`, adaptation algorithm, `AdaptiveConfig`, `Decision`, probe lifecycle |
| [08-p2p-and-relay](08-p2p-and-relay.md) | iroh P2P connectivity, relay architecture, pull model, ticket format, connection access |
## Quick Navigation
### "How do I..."
- **Publish a stream?** → [02-core-api](02-core-api.md) (`Live::publish`) + [06-moq-media-pipelines](06-moq-media-pipelines.md) (`LocalBroadcast`)
- **Subscribe to a stream?** → [02-core-api](02-core-api.md) (`Live::subscribe`) + [06-moq-media-pipelines](06-moq-media-pipelines.md) (`RemoteBroadcast`)
- **Make a 1:1 call?** → [02-core-api](02-core-api.md) (`Call::dial` / `Call::accept`)
- **Create a multi-party room?** → [04-rooms](04-rooms.md) (`Room::new`, `RoomTicket`)
- **Bridge to browsers?** → [05-relay](05-relay.md) (`iroh-live-relay`)
- **Adapt quality to network conditions?** → [07-network-signals-and-adaptive-bitrate](07-network-signals-and-adaptive-bitrate.md)
- **Understand the MoQ transport?** → [03-iroh-moq-transport](03-iroh-moq-transport.md)
- **Understand the media pipeline?** → [06-moq-media-pipelines](06-moq-media-pipelines.md)
### Key Source Files
| Component | Path |
|-----------|------|
| iroh-live crate | `iroh-live/src/{lib, live, call, subscription, ticket, types, util, rooms}.rs` |
| iroh-moq crate | `iroh-moq/src/lib.rs` |
| iroh-live-relay | `iroh-live-relay/src/{lib, main, pull}.rs` |
| moq-media publish | `moq-media/src/publish.rs` |
| moq-media subscribe | `moq-media/src/subscribe.rs` |
| moq-media adaptive | `moq-media/src/adaptive.rs` |
| moq-media transport | `moq-media/src/transport.rs` |
| moq-media network signals | `moq-media/src/net.rs` |

@@ -0,0 +1,160 @@
# Iroh: Overview & Architecture
**Version**: 0.98.1
**Repository**: https://github.com/n0-computer/iroh
**License**: MIT OR Apache-2.0
**Rust Edition**: 2024
**MSRV**: 1.89
## What is Iroh?
Iroh is a Rust library for establishing **peer-to-peer QUIC connections dialed by public key**. You provide an `EndpointAddr` (which identifies a peer), and iroh finds and maintains the fastest connection route — whether direct (hole-punched) or relayed through a server.
Core value propositions:
- **Dial by public key** — no IP addresses or hostnames needed at the application layer
- **Hole-punching** — automatically attempts direct P2P connectivity
- **Relay fallback** — encrypted relay servers ensure connectivity even behind NATs
- **Built on QUIC** — uses the `noq` QUIC implementation for multiplexed, encrypted streams
- **Address Lookup** — pluggable discovery system to resolve `EndpointId → addressing info`
## Workspace Structure
```
iroh/ # Core library (p2p QUIC connections)
├── iroh-base/ # Fundamental types: SecretKey, PublicKey, EndpointId, RelayUrl, EndpointAddr
├── iroh-dns/ # DNS resolver + endpoint info serialization (pkarr)
├── iroh-dns-server/ # DNS server implementation (powers dns.iroh.link)
├── iroh-relay/ # Relay server + client implementation
└── iroh/bench/ # Benchmarks
```
### Dependency Graph
```
iroh depends on:
├── iroh-base (key types, EndpointAddr, RelayUrl)
├── iroh-dns (DNS resolution, EndpointInfo serialization)
├── iroh-relay (RelayMap, RelayConfig, relay client/server, QUIC client)
├── noq (QUIC implementation)
├── noq-proto (QUIC protocol types)
├── noq-udp (UDP socket abstraction)
├── netwatch (network interface monitoring)
├── portmapper (UPnP/PCP/NAT-PMP port mapping, optional)
├── n0-future (async utilities)
├── n0-watcher (watch/subscribe primitives)
└── iroh-metrics (metrics collection)
```
## Key Concepts
### EndpointId / PublicKey
Every iroh endpoint has a unique Ed25519 cryptographic key pair. The public key doubles as the endpoint identifier (`EndpointId`). It's used for both:
- **Identity** — unique addressing in the network
- **Encryption** — TLS authentication (via RFC 7250 Raw Public Keys, no X.509 certificates)
### EndpointAddr
The addressing structure that combines identity with network paths:
```rust
pub struct EndpointAddr {
pub id: EndpointId, // Who to connect to
pub addrs: BTreeSet<TransportAddr>, // How to reach them
}
pub enum TransportAddr {
Relay(RelayUrl), // Via relay server
Ip(SocketAddr), // Direct IP address
Custom(CustomAddr), // Via custom transport
}
```
### Relay Servers
Relay servers provide:
1. **Reliable connectivity** — always reachable, forward encrypted traffic to the correct endpoint by `EndpointId`
2. **Hole-punching assistance** — QUIC Address Discovery (QAD), STUN-like services
3. **Traffic relay** — fallback when direct connections are impossible
Connections to relays use HTTP/1.1 with TLS, then upgrade to a custom protocol. The relay only sees encrypted traffic.
### Connection Flow
1. Endpoint binds, connects to a "home relay"
2. To connect to peer: resolve `EndpointId` → `EndpointAddr` via Address Lookup
3. Establish initial connection via relay
4. Attempt direct connection (hole-punching if needed)
5. Migrate to direct connection when available (relay becomes backup)
## Crate: `iroh` (Core Library)
### Main Types
| Type | Module | Purpose |
|------|--------|---------|
| `Endpoint` | `endpoint` | Central API — connect, accept, manage connections |
| `Builder` | `endpoint` | Configure and construct an `Endpoint` |
| `Router` | `protocol` | Accept loop that dispatches to `ProtocolHandler`s |
| `ProtocolHandler` | `protocol` | Trait for handling incoming connections by ALPN |
| `Connection` | `endpoint::connection` | QUIC connection wrapper |
| `Incoming` | `endpoint::connection` | Pre-handshake incoming connection |
| `Accepting` | `endpoint::connection` | Post-accept, pre-handshake state |
### Feature Flags
- `default` = `["metrics", "fast-apple-datapath", "portmapper", "tls-ring"]`
- `metrics` — Prometheus-style metrics collection
- `portmapper` — UPnP/PCP/NAT-PMP support
- `test-utils` — Testing utilities
- `platform-verifier` — Use OS TLS trust anchors
- `qlog` — QUIC event logging
- `fast-apple-datapath` — Private Apple APIs for batched sends
- `tls-ring` / `tls-aws-lc-rs` — Choose TLS crypto backend
- `unstable-custom-transports` — Custom transport API (unstable)
### WASM Support
The crate compiles to `wasm32-unknown-unknown` for browser targets. Browser builds:
- Use `PkarrResolver` instead of `DnsAddressLookup` (DNS-over-HTTPS)
- Cannot bind IP sockets (no direct connectivity)
- Use `wasm-bindgen-futures` for async runtime
## Presets
The `presets` module provides common configurations:
| Preset | Description |
|--------|-------------|
| `Empty` | No defaults — you must set all required options yourself |
| `Minimal` | Sets only the crypto provider (ring or aws-lc-rs) |
| `N0` | Full n0 defaults: crypto provider, Pkarr publisher, DNS resolver, n0 relay servers |
| `N0DisableRelay` | N0 defaults but with `RelayMode::Disabled` |
```rust
// Quick start with full n0 infrastructure
let endpoint = Endpoint::bind(presets::N0).await?;
// Minimal — just crypto, no relay or address lookup
let endpoint = Endpoint::bind(presets::Minimal).await?;
```
## Encryption & Authentication
Iroh uses **RFC 7250 Raw Public Keys** for TLS — no X.509 certificates. Each endpoint has:
- `SecretKey` (Ed25519) — used for TLS authentication and signing
- `PublicKey`/`EndpointId` — derived from `SecretKey`, used as identity
The TLS server name is encoded as `<base32-dnssec-encoded-public-key>.iroh.invalid` to ensure 0-RTT session ticket separation per endpoint.
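For illustration, the server name can be derived with a hand-written encoder. This sketch assumes that "base32-dnssec" means the RFC 4648 base32hex alphabet in lowercase without padding (the convention used by the `data-encoding` crate's `BASE32_DNSSEC`); the function names are hypothetical and this is not iroh's own code.

```rust
/// Lowercase base32hex alphabet, no padding (assumed "base32-dnssec").
const ALPHABET: &[u8; 32] = b"0123456789abcdefghijklmnopqrstuv";

pub fn base32_dnssec(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() * 8 + 4) / 5);
    let mut buffer: u32 = 0; // holds not-yet-emitted bits in its low end
    let mut bits: u32 = 0; // number of pending bits in `buffer`
    for &byte in bytes {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        // Emit one character per complete 5-bit group.
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
        }
    }
    if bits > 0 {
        // Left-align the remaining bits, padding with zero bits.
        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Builds `<encoded-key>.iroh.invalid` from a 32-byte public key.
pub fn tls_server_name(public_key: &[u8; 32]) -> String {
    format!("{}.iroh.invalid", base32_dnssec(public_key))
}
```

A 32-byte key encodes to 52 characters, which fits within the 63-character limit for a single DNS label.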
## 0-RTT Support
Iroh supports QUIC 0-RTT connections:
- `Connecting::into_0rtt()` on the client side
- `Accepting::into_0rtt()` on the server side
- TLS session tickets cached per remote endpoint (default 256 tickets = ~150 KiB)
- `max_tls_tickets()` builder option to tune cache size
## Default Infrastructure (n0)
Production relay servers (4 regions):
| Region | Hostname |
|--------|----------|
| NA East | `use1-1.relay.n0.iroh-canary.iroh.link` |
| NA West | `usw1-1.relay.n0.iroh-canary.iroh.link` |
| EU | `euc1-1.relay.n0.iroh-canary.iroh.link` |
| AP | `aps1-1.relay.n0.iroh-canary.iroh.link` |
DNS Address Lookup origin: `dns.iroh.link`

@@ -0,0 +1,392 @@
# Iroh: Key Types and Traits
## Core Identity Types (`iroh-base`)
### `SecretKey`
Ed25519 signing key (32 bytes). Used for:
- TLS authentication (RFC 7250 Raw Public Key)
- Signing pkarr packets for address discovery
- Generating the corresponding `PublicKey`/`EndpointId`
```rust
// Generation
let secret_key = SecretKey::generate();
// From bytes
let secret_key = SecretKey::from_bytes(&[0u8; 32]);
// Access public key
let public_key: PublicKey = secret_key.public();
```
### `PublicKey` / `EndpointId`
`EndpointId` is a type alias for `PublicKey`. Both are 32-byte Ed25519 compressed points.
```rust
pub type EndpointId = PublicKey;
impl PublicKey {
pub const LENGTH: usize = 32;
pub fn from_bytes(bytes: &[u8; 32]) -> Result<Self, KeyParsingError>;
pub fn as_bytes(&self) -> &[u8; 32];
pub fn verify(&self, message: &[u8], signature: &Signature) -> Result<(), SignatureError>;
pub fn fmt_short(&self) -> impl Display; // First 5 bytes hex
}
```
Serialization: Human-readable → base32 z-base-32 encoding; Binary → 32 raw bytes.
### `Signature`
Ed25519 signature (64 bytes). Used in pkarr for signing endpoint discovery records.
### `KeyParsingError`
Error type for key parsing failures.
## Addressing Types (`iroh-base`)
### `EndpointAddr`
The primary addressing type — combines identity with network paths:
```rust
pub struct EndpointAddr {
pub id: EndpointId,
pub addrs: BTreeSet<TransportAddr>,
}
impl EndpointAddr {
pub fn new(id: PublicKey) -> Self;
pub fn from_parts(id: PublicKey, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
pub fn with_relay_url(self, relay_url: RelayUrl) -> Self;
pub fn with_ip_addr(self, addr: SocketAddr) -> Self;
pub fn is_empty(&self) -> bool;
pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr>;
pub fn relay_urls(&self) -> impl Iterator<Item = &RelayUrl>;
}
```
Can be constructed from just an `EndpointId` (relies on Address Lookup), or with explicit paths:
```rust
// From just EndpointId — needs Address Lookup
let addr = EndpointAddr::new(endpoint_id);
// With relay URL
let addr = EndpointAddr::new(endpoint_id).with_relay_url(relay_url);
// With both
let addr = EndpointAddr::from_parts(endpoint_id, [
TransportAddr::Relay(relay_url),
TransportAddr::Ip(socket_addr),
]);
```
### `TransportAddr`
```rust
pub enum TransportAddr {
Relay(RelayUrl),
Ip(SocketAddr),
Custom(CustomAddr),
}
```
### `CustomAddr`
Opaque custom transport address (for `unstable-custom-transports` feature):
```rust
pub struct CustomAddr {
id: u32,
addr: Vec<u8>,
}
```
### `RelayUrl`
Arc-wrapped `Url` identifying a relay server. Cheaply clonable. Encourages fully-qualified DNS names (trailing dot).
```rust
let url: RelayUrl = "https://use1-1.relay.n0.iroh-canary.iroh.link.".parse()?;
```
## Endpoint Types (`iroh`)
### `Endpoint`
The central type — created via `Builder`, used for all connection operations:
```rust
impl Endpoint {
// Construction
pub fn builder(preset: impl Preset) -> Builder;
pub async fn bind(preset: impl Preset) -> Result<Self, BindError>;
// Connection
pub async fn connect(&self, addr: impl Into<EndpointAddr>, alpn: &[u8]) -> Result<Connection, ConnectError>;
pub async fn connect_with_opts(&self, addr: impl Into<EndpointAddr>, alpn: &[u8], opts: ConnectOptions) -> Result<Connecting, ConnectWithOptsError>;
pub fn accept(&self) -> Accept<'_>;
// Identity
pub fn id(&self) -> EndpointId;
pub fn secret_key(&self) -> &SecretKey;
pub fn addr(&self) -> EndpointAddr;
pub fn watch_addr(&self) -> impl Watcher<Value = EndpointAddr>;
// Lifecycle
pub async fn close(&self);
pub fn is_closed(&self) -> bool;
pub fn closed(&self) -> EndpointClosed;
pub async fn online(&self); // Wait for relay connection
// Configuration changes
pub fn set_alpns(&self, alpns: Vec<Vec<u8>>);
pub async fn insert_relay(&self, relay: RelayUrl, config: Arc<RelayConfig>) -> Option<Arc<RelayConfig>>;
pub async fn remove_relay(&self, relay: &RelayUrl) -> Option<Arc<RelayConfig>>;
pub async fn add_external_addr(&self, addr: SocketAddr);
pub async fn remove_external_addr(&self, addr: &SocketAddr) -> bool;
pub fn set_user_data_for_address_lookup(&self, user_data: Option<UserData>);
pub async fn network_change(&self);
// Observers
pub fn home_relay_status(&self) -> impl Watcher<Value = Vec<RelayStatus>>;
pub fn net_report(&self) -> impl Watcher<Value = Option<NetReport>>;
pub fn remote_info(&self, id: EndpointId) -> Option<RemoteInfo>;
pub fn metrics(&self) -> &EndpointMetrics;
pub fn bound_sockets(&self) -> Vec<SocketAddr>;
pub fn dns_resolver(&self) -> Result<&DnsResolver, EndpointError>;
pub fn tls_config(&self) -> &rustls::ClientConfig;
pub fn address_lookup(&self) -> Result<&AddressLookupServices, EndpointError>;
}
```
### `Builder`
Fluent builder for `Endpoint`:
```rust
let ep = Endpoint::builder(presets::N0)
.secret_key(secret_key) // Identity
.alpns(vec![b"my-alpn".to_vec()]) // Accepted protocols
.relay_mode(RelayMode::Default) // Relay configuration
.address_lookup(PkarrPublisher::n0_dns()) // Address discovery
.address_lookup(DnsAddressLookup::n0_dns()) // DNS resolution
.addr_filter(AddrFilter::relay_only()) // Filter published addresses
.user_data_for_address_lookup(user_data) // Custom discovery data
.transport_config(QuicTransportConfig::default()) // QUIC tuning
.dns_resolver(dns_resolver) // Custom DNS resolver
.proxy_url(proxy_url) // HTTP proxy
.ca_roots_config(CaRootsConfig::default()) // TLS CA roots
.keylog(true) // SSLKEYLOGFILE debug
.max_tls_tickets(256) // 0-RTT ticket cache
.hooks(my_hook) // Connection hooks
.portmapper_config(PortmapperConfig::Enabled {}) // UPnP/PCP/NAT-PMP
.external_addr(addr) // Advertised external addr
.bind_addr("0.0.0.0:0")? // Bind specific socket
.bind() // Build & bind
.await?;
```
### `RelayMode`
```rust
pub enum RelayMode {
Disabled, // No relay
Default, // n0 production relays
Staging, // n0 staging relays
Custom(RelayMap), // Custom relay configuration
}
```
## Protocol Handler (`iroh::protocol`)
### `ProtocolHandler`
Trait for handling incoming connections by ALPN:
```rust
pub trait ProtocolHandler: Send + Sync + Debug + 'static {
// Optional: intercept at Accepting stage (supports 0-RTT)
fn on_accepting(&self, accepting: Accepting) -> impl Future<Output = Result<Connection, AcceptError>> + Send;
// Required: handle the established connection
fn accept(&self, connection: Connection) -> impl Future<Output = Result<(), AcceptError>> + Send;
// Optional: called on graceful shutdown
fn shutdown(&self) -> impl Future<Output = ()> + Send;
}
```
### `Router`
Spawns an accept loop that dispatches incoming connections to registered handlers:
```rust
let router = Router::builder(endpoint)
.accept(b"/my-alpn", Arc::new(MyHandler))
.incoming_filter(|incoming| {
if !incoming.remote_addr_validated() {
IncomingFilterOutcome::Retry
} else {
IncomingFilterOutcome::Accept
}
})
.spawn();
// Later...
router.shutdown().await?;
```
### `IncomingFilterOutcome`
```rust
pub enum IncomingFilterOutcome {
Accept, // Allow the connection
Retry, // Send QUIC retry (address validation)
Reject, // Refuse with CONNECTION_REFUSED
Ignore, // Drop silently (remote times out)
}
```
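The filter closure in the `Router` example is essentially a pure decision function over what is known about an incoming attempt. The sketch below models that decision with stand-in types so it can be read and tested in isolation. `IncomingMeta`, `filter_incoming`, and the `blocked` set are hypothetical names for illustration and are not part of iroh's API.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Stand-in for the outcome enum described above (not the real iroh type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IncomingFilterOutcome {
    Accept,
    Retry,
    Reject,
    Ignore,
}

/// Hypothetical summary of what a filter can inspect before accepting.
pub struct IncomingMeta {
    /// Remote IP, if the attempt arrived over an IP path.
    pub remote_ip: Option<IpAddr>,
    /// Whether the remote address has already been validated.
    pub addr_validated: bool,
}

/// Decide what to do with an incoming connection attempt.
pub fn filter_incoming(meta: &IncomingMeta, blocked: &HashSet<IpAddr>) -> IncomingFilterOutcome {
    match meta.remote_ip {
        // Known-bad source: drop silently so the remote simply times out.
        Some(ip) if blocked.contains(&ip) => IncomingFilterOutcome::Ignore,
        // Unvalidated address: request a QUIC retry before spending resources.
        _ if !meta.addr_validated => IncomingFilterOutcome::Retry,
        // Everything else proceeds to the handshake.
        _ => IncomingFilterOutcome::Accept,
    }
}
```

Keeping the decision separate from the I/O makes it straightforward to unit test a policy before wiring it into a real filter closure.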
### `AccessLimit`
Wrapper that limits connections to allowed `EndpointId`s:
```rust
let handler = AccessLimit::new(MyHandler, |endpoint_id| allowed_set.contains(&endpoint_id));
```
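To illustrate the wrapper pattern behind `AccessLimit`, here is a minimal synchronous sketch. It is not iroh's implementation: the `Handler` trait, the `Echo` handler, and `allowlisted` are simplified stand-ins, and the real `ProtocolHandler` is async and receives a `Connection` rather than a bare ID.

```rust
use std::collections::HashSet;

/// Stand-in for `EndpointId` (a 32-byte public key in iroh).
pub type EndpointId = [u8; 32];

/// Simplified, synchronous stand-in for a protocol handler.
pub trait Handler {
    fn accept(&self, remote: EndpointId) -> Result<(), String>;
}

/// Wraps a handler and consults a predicate before delegating to it.
pub struct AccessLimit<H, F> {
    inner: H,
    allow: F,
}

impl<H: Handler, F: Fn(EndpointId) -> bool> AccessLimit<H, F> {
    pub fn new(inner: H, allow: F) -> Self {
        Self { inner, allow }
    }
}

impl<H: Handler, F: Fn(EndpointId) -> bool> Handler for AccessLimit<H, F> {
    fn accept(&self, remote: EndpointId) -> Result<(), String> {
        if (self.allow)(remote) {
            self.inner.accept(remote)
        } else {
            Err("remote endpoint not allowed".to_string())
        }
    }
}

/// Trivial inner handler that accepts every connection.
pub struct Echo;

impl Handler for Echo {
    fn accept(&self, _remote: EndpointId) -> Result<(), String> {
        Ok(())
    }
}

/// Build a limiter from a fixed allowlist of endpoint IDs.
pub fn allowlisted(ids: HashSet<EndpointId>) -> AccessLimit<Echo, impl Fn(EndpointId) -> bool> {
    AccessLimit::new(Echo, move |id| ids.contains(&id))
}
```

Because the wrapper itself implements the handler trait, it can be registered anywhere the inner handler could be.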
### `EndpointHooks`
Intercept connection establishment at two points:
```rust
pub trait EndpointHooks: Debug + Send + Sync {
// Before outgoing connection starts
fn before_connect<'a>(&'a self, remote_addr: &'a EndpointAddr, alpn: &'a [u8])
-> BoxFuture<'a, BeforeConnectOutcome>;
// After TLS handshake completes (on both sides)
fn after_handshake<'a>(&'a self, info: &'a ConnectionInfo)
-> BoxFuture<'a, AfterHandshakeOutcome>;
}
```
## Connection Types (`iroh::endpoint::connection`)
### `Connecting`
The state between initiating a connection and completing the handshake:
```rust
impl Connecting {
// Awaiting a `Connecting` completes the handshake and yields
// `Result<Connection, ConnectingError>`.
pub fn into_0rtt(self) -> Result<(OutgoingZeroRttConnection, Connection), Connecting>;
pub fn alpn(&self) -> Result<Vec<u8>, ConnectingError>;
pub fn remote_id(&self) -> Result<EndpointId, RemoteEndpointIdError>;
}
```
### `Connection`
Wraps a `noq::Connection` with iroh-specific metadata:
```rust
impl Connection {
// Stream operations
pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), OpenBi>;
pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), AcceptBi>;
pub async fn open_uni(&self) -> Result<SendStream, OpenUni>;
pub async fn accept_uni(&self) -> Result<RecvStream, AcceptUni>;
// Datagrams
pub fn send_datagram(&self, data: SendDatagram) -> Result<(), SendDatagramError>;
pub async fn read_datagram(&self) -> Result<Bytes, ReadDatagram>;
// Connection lifecycle
pub fn close(&self, error_code: VarInt, reason: &[u8]);
pub async fn closed(&self) -> ConnectionError;
// Identity
pub fn remote_id(&self) -> EndpointId;
pub fn alpn(&self) -> Vec<u8>;
// Path observation
pub fn paths(&self) -> PathWatcher;
// Keying material export
pub fn export_keying_material(&self, output: &mut [u8], label: &[u8], context: Option<&[u8]>) -> Result<(), ExportKeyingMaterialError>;
}
```
### `Incoming`
Pre-accept incoming connection:
```rust
impl Incoming {
pub fn accept(self) -> Result<Accepting, ConnectionError>;
pub fn accept_with(self, server_config: Arc<ServerConfig>) -> Result<Accepting, ConnectionError>;
pub fn refuse(self);
pub fn retry(self) -> Result<(), RetryError>;
pub fn ignore(self);
pub fn remote_addr(&self) -> IncomingAddr;
pub fn local_ip(&self) -> Option<IpAddr>;
pub fn remote_addr_validated(&self) -> bool;
pub fn decrypt(&self) -> Option<DecryptedInitial>;
}
```
### `IncomingAddr`
```rust
pub enum IncomingAddr {
Ip(SocketAddr),
Relay { url: RelayUrl, endpoint_id: EndpointId },
Custom(CustomAddr),
}
```
## `RelayMap` and `RelayConfig` (`iroh-relay`)
### `RelayMap`
Thread-safe map of relay servers:
```rust
let map = RelayMap::from_iter([
"https://relay1.example.org".parse()?,
"https://relay2.example.org".parse()?,
]);
```
### `RelayConfig`
```rust
pub struct RelayConfig {
pub url: RelayUrl,
pub quic: Option<RelayQuicConfig>, // QAD support
}
pub struct RelayQuicConfig {
pub port: u16, // Default: 7842
}
```
## `EndpointData` and `EndpointInfo` (`iroh-dns`)
### `EndpointData`
The data published about an endpoint:
```rust
pub struct EndpointData {
addrs: Vec<TransportAddr>,
user_data: Option<UserData>,
}
```
### `EndpointInfo`
Combines `EndpointId` with `EndpointData`:
```rust
pub struct EndpointInfo {
pub endpoint_id: EndpointId,
pub data: EndpointData,
}
```
### `UserData`
Application-defined string data published alongside addressing info:
```rust
pub struct UserData(String); // Max 245 bytes
```
### `AddrFilter`
Controls which addresses are published to address lookup services:
```rust
let filter = AddrFilter::relay_only(); // Only relay URLs
let filter = AddrFilter::unfiltered(); // All addresses
let filter = AddrFilter::custom(|addrs| { /* custom logic */ });
```

@@ -0,0 +1,401 @@
# Iroh: Networking & Protocol Details
## Connection Establishment
### Overview
The connection process follows this sequence:
```
Caller Callee
| |
|--- connect(EndpointAddr, alpn) -------->| (via relay first)
| |
|<------ TLS Handshake (Raw Public Key) ->|
| |
|<====== QUIC Connection Established ====|
| |
| (iroh attempts direct path migration) |
| |
|--- open_bi() / open_uni() ------------->|
|<--- accept_bi() / accept_uni() ----------|
```
### Step-by-Step
1. **Resolve addressing**: `resolve_remote(EndpointAddr)` starts a `RemoteStateActor` for the peer. If no direct addresses or relay URL are provided, Address Lookup services are queried.
2. **Map addresses**: `EndpointId` is mapped to a synthetic IPv6 address for the QUIC layer (`EndpointIdMappedAddr`). Relay and custom transport addresses are similarly mapped.
3. **TLS connection**: Uses RFC 7250 Raw Public Keys. The server name is encoded as `<z32-encoded-pubkey>.iroh.invalid`. Both sides authenticate by `EndpointId`.
4. **ALPN negotiation**: The Application-Layer Protocol Negotiation determines which protocol handler receives the connection.
5. **Path migration**: Once a QUIC connection is established (initially via relay), iroh continuously searches for better paths. Direct IP paths are preferred when available.
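Step 3 relies on encoding the 32-byte public key as z-base-32 to form the TLS server name. The following is a minimal sketch of that encoding using the standard z-base-32 alphabet. The function names `z32_encode` and `tls_server_name` are hypothetical; iroh uses its own helpers internally, so treat this only as an illustration of the format.

```rust
/// The z-base-32 alphabet (32 symbols, 5 bits per character).
const Z32_ALPHABET: &[u8; 32] = b"ybndrfg8ejkmcpqxot1uwisza345h769";

/// Encode bytes as z-base-32, most significant bits first, without padding.
pub fn z32_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() * 8 + 4) / 5);
    let mut buffer: u32 = 0; // bits that have not been emitted yet
    let mut bits: u32 = 0; // number of valid bits in `buffer`
    for &byte in data {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            let index = (buffer >> bits) & 0x1f;
            out.push(Z32_ALPHABET[index as usize] as char);
        }
        // Keep only the bits that are still pending.
        buffer &= (1 << bits) - 1;
    }
    if bits > 0 {
        // Left-align the final partial group to 5 bits.
        let index = (buffer << (5 - bits)) & 0x1f;
        out.push(Z32_ALPHABET[index as usize] as char);
    }
    out
}

/// Build the server name in the `<z32-encoded-pubkey>.iroh.invalid` format.
pub fn tls_server_name(public_key: &[u8; 32]) -> String {
    format!("{}.iroh.invalid", z32_encode(public_key))
}
```

A 32-byte key is 256 bits, so the encoded label is always 52 characters long, which fits within the 63-character DNS label limit.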
## Transport Layer Architecture
### The `Socket` — Core Connectivity Engine
The `Socket` struct is the heart of iroh's networking. It manages:
- Multiple transport paths (IPv4, IPv6, relay, custom)
- Address discovery and NAT traversal
- Path migration between relay and direct connections
```
            ┌────────────────────────┐
            │        Endpoint        │  (Public API)
            │  (Arc<EndpointInner>)  │
            └───────────┬────────────┘
            ┌───────────▼────────────┐
            │         Socket         │  (Connectivity engine)
            │     (Arc<Socket>)      │
            └───────────┬────────────┘
       ┌────────────────┼───────────────────┐
       │                │                   │
┌──────▼──────┐ ┌───────▼────────┐ ┌────────▼────────┐
│ IpTransport │ │ RelayTransport │ │ CustomTransport │
│ (IPv4/v6)   │ │                │ │ (unstable)      │
└──────┬──────┘ └───────┬────────┘ └─────────────────┘
       │                │
┌──────▼──────┐ ┌───────▼────────┐
│ UdpSocket   │ │ WebSocket      │
│ (netwatch)  │ │ Actor          │
└─────────────┘ └────────────────┘
```
### Transport Configuration
```rust
pub enum TransportConfig {
Ip {
config: IpConfig, // IPv4 or IPv6 socket config
is_user_defined: bool,
},
Relay {
relay_map: RelayMap, // Which relay servers to use
is_user_defined: bool,
},
#[cfg(feature = "unstable-custom-transports")]
Custom(Arc<dyn CustomTransport>),
}
pub enum IpConfig {
V4 { ip_net: Ipv4Net, port: u16, is_required: bool, is_default: bool },
V6 { ip_net: Ipv6Net, scope_id: u32, port: u16, is_required: bool, is_default: bool },
}
```
### Address Mapping
Iroh presents every path to the QUIC layer as an IP socket address; non-IP transports are given synthetic IPv6 addresses:
- **IPv4/IPv6 addresses** → used directly as QUIC path addresses
- **Relay addresses** → mapped to synthetic IPv6 addresses in a dedicated range
- **Custom addresses** → mapped to synthetic IPv6 addresses in another range
The `MappedAddrs` struct maintains these mappings:
```rust
pub(crate) struct MappedAddrs {
pub(super) endpoint_addrs: AddrMap<EndpointId, EndpointIdMappedAddr>,
pub(super) relay_addrs: AddrMap<(RelayUrl, EndpointId), RelayMappedAddr>,
pub(super) custom_addrs: AddrMap<CustomAddr, CustomMappedAddr>,
}
```
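The idea behind these maps can be sketched as a bidirectional table that hands out one synthetic IPv6 address per key. This is an assumption-laden illustration, not iroh's `AddrMap`: the generic `AddrMap<K>` below, its 8-byte prefix, and the counter-based allocation are all hypothetical choices made for the example.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::net::Ipv6Addr;

/// Bidirectional map between a transport-level key and a synthetic IPv6 address.
pub struct AddrMap<K> {
    prefix: [u8; 8], // hypothetical address range reserved for this key kind
    next: u64,       // counter used for the lower 64 bits
    by_key: HashMap<K, Ipv6Addr>,
    by_addr: HashMap<Ipv6Addr, K>,
}

impl<K: Clone + Eq + Hash> AddrMap<K> {
    pub fn new(prefix: [u8; 8]) -> Self {
        Self {
            prefix,
            next: 0,
            by_key: HashMap::new(),
            by_addr: HashMap::new(),
        }
    }

    /// Return the mapped address for `key`, allocating one on first use.
    pub fn get_or_insert(&mut self, key: K) -> Ipv6Addr {
        if let Some(addr) = self.by_key.get(&key) {
            return *addr;
        }
        let mut octets = [0u8; 16];
        octets[..8].copy_from_slice(&self.prefix);
        octets[8..].copy_from_slice(&self.next.to_be_bytes());
        self.next += 1;
        let addr = Ipv6Addr::from(octets);
        self.by_key.insert(key.clone(), addr);
        self.by_addr.insert(addr, key);
        addr
    }

    /// Reverse lookup: which key does a synthetic address stand for?
    pub fn lookup(&self, addr: &Ipv6Addr) -> Option<&K> {
        self.by_addr.get(addr)
    }
}
```

Using a separate prefix per key kind keeps endpoint, relay, and custom mappings from colliding, which matches the "dedicated range" wording above.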
### Transport Bias
Path selection uses a configurable bias system:
```rust
let endpoint = Endpoint::builder(presets::N0)
.transport_bias(AddrKind::Custom(42), TransportBias::primary())
.bind()
.await?;
```
Default biases:
- IPv4 and IPv6 are **primary** (IPv6 gets a small RTT advantage)
- Relay is **backup** (only used when no primary transport is available)
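The default biases can be read as a two-tier selection: pick the fastest primary path, and fall back to a backup path only when no primary exists. The sketch below models that rule. The types, the `select_path` function, and the 3 ms `IPV6_RTT_ADVANTAGE` value are assumptions for illustration; the actual constants and selection logic in iroh may differ.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bias {
    Primary,
    Backup,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Ipv4,
    Ipv6,
    Relay,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Path {
    pub kind: Kind,
    pub rtt: Duration,
}

/// Hypothetical head start given to IPv6 when comparing RTTs.
const IPV6_RTT_ADVANTAGE: Duration = Duration::from_millis(3);

fn bias(kind: Kind) -> Bias {
    match kind {
        Kind::Relay => Bias::Backup,
        Kind::Ipv4 | Kind::Ipv6 => Bias::Primary,
    }
}

/// RTT used for comparison, after applying the IPv6 advantage.
fn effective_rtt(path: &Path) -> Duration {
    match path.kind {
        Kind::Ipv6 => path.rtt.saturating_sub(IPV6_RTT_ADVANTAGE),
        _ => path.rtt,
    }
}

/// Prefer the fastest primary path; use a backup only if no primary exists.
pub fn select_path(paths: &[Path]) -> Option<Path> {
    let best = |wanted: Bias| {
        paths
            .iter()
            .copied()
            .filter(|p| bias(p.kind) == wanted)
            .min_by_key(effective_rtt)
    };
    best(Bias::Primary).or_else(|| best(Bias::Backup))
}
```

Note that a relay path is not chosen over a direct path even when its RTT is lower, which is what distinguishes a bias tier from a plain RTT comparison.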
## Relay Protocol
### Architecture
The relay system is based on a revised version of Tailscale's DERP (Designated Encrypted Relay for Packets) protocol.
```
Client A               Relay Server                Client B
   │                         │                         │
   │─── HTTP CONNECT ───────>│                         │
   │<── 200 OK ──────────────│                         │
   │                         │<─── HTTP CONNECT ───────│
   │                         │──── 200 OK ────────────>│
   │                         │                         │
   │─── Encrypted QUIC ─────>│─── Encrypted QUIC ─────>│
   │<── Encrypted QUIC ──────│<── Encrypted QUIC ──────│
```
### Relay Actor
The `RelayActor` manages the WebSocket connection to the relay:
- Connects to relay via HTTPS, upgrades to custom protocol
- Sends/receives encrypted datagrams on behalf of the local endpoint
- Manages reconnection on network changes or relay restarts
- Reports connection status via `HomeRelayWatch`
### Relay Data Flow
1. Outgoing packet → `RelayTransport::send()` → `RelayActor` → WebSocket → Relay server → WebSocket → remote `RelayActor` → remote `RelayTransport::recv()` → QUIC
2. The relay only sees encrypted QUIC packets — it cannot decode application data
### Home Relay Selection
The `net_report` module continuously probes relay servers and maintains latency statistics. The "home relay" is selected based on:
- Lowest recent latency (with hysteresis to avoid flapping)
- A 2/3 threshold: the home relay changes only when the best candidate's latency is at most 2/3 of the current relay's latency
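A hysteresis rule of this kind can be sketched as follows. This is a reading of the 2/3 threshold, assuming it means "stay on the current relay unless the best candidate's latency is at most two thirds of the current one"; the function name and signature are hypothetical.

```rust
use std::time::Duration;

/// Pick the home relay, keeping the current one unless the best candidate
/// is clearly better (latency at most 2/3 of the current relay's latency).
///
/// `current` is `None` when there is no usable current relay.
pub fn choose_home_relay<'a>(
    current: Option<(&'a str, Duration)>,
    best: (&'a str, Duration),
) -> &'a str {
    match current {
        // Candidate differs but is not enough of an improvement: stay put.
        Some((cur_url, cur_latency))
            if cur_url != best.0 && best.1 > cur_latency / 3 * 2 =>
        {
            cur_url
        }
        // No current relay, same relay, or a large enough improvement.
        _ => best.0,
    }
}
```

With a current relay at 90 ms, a 70 ms candidate is ignored (the threshold is 60 ms), while a 50 ms candidate triggers a switch. This avoids flapping between relays with similar latency.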
## Hole-Punching & NAT Traversal
### QUIC Address Discovery (QAD)
Iroh uses QUIC Address Discovery (based on [draft-ietf-quic-address-discovery](https://datatracker.ietf.org/doc/draft-ietf-quic-address-discovery/)) to discover external IP addresses. The relay servers expose QAD endpoints.
The `net_report` module:
1. Establishes QUIC connections to relay servers
2. Uses `observed_external_addr()` to learn external addresses
3. Reports NAT type, mapping behavior, and preferred relay
### NAT Traversal Strategy
```
┌─────────────────────────────────────┐
│            NAT Traversal            │
│                                     │
│ 1. Direct connection attempt        │
│    (simultaneous open)              │
│                                     │
│ 2. QAD-discovered addresses         │
│    (relay reports observed IP)      │
│                                     │
│ 3. Port mapping (UPnP/PCP/NAT-PMP)  │
│    (if supported by gateway)        │
│                                     │
│ 4. Relay fallback                   │
│    (always available)               │
└─────────────────────────────────────┘
```
### Port Mapper
```rust
pub enum PortmapperConfig {
Enabled {}, // Default: tries UPnP, PCP, NAT-PMP
Disabled, // No port mapping
}
```
When enabled, the port mapper:
- Discovers gateway devices
- Requests port mappings
- Provides external addresses to the endpoint
- Updates when mappings change
### Net Report
`NetReport` discovers network conditions:
- IPv4/IPv6 connectivity
- NAT mapping behavior (whether the external mapping varies by destination)
- Captive portal detection
- Preferred relay selection
- External IP addresses (via QAD)
Key timeouts:
- `NET_REPORT_TIMEOUT` = 10 seconds
- `FULL_REPORT_INTERVAL` = 5 minutes
- `HEARTBEAT_INTERVAL` = 5 seconds (keepalive)
- `PATH_MAX_IDLE_TIMEOUT` = 15 seconds (direct)
- `RELAY_PATH_MAX_IDLE_TIMEOUT` = 30 seconds (relay)
## Address Lookup System
### Trait Definition
```rust
pub trait AddressLookup: Debug + Send + Sync + 'static {
fn publish(&self, data: &EndpointData);
fn resolve(&self, endpoint_id: EndpointId) -> Option<BoxStream<Result<Item, Error>>>;
}
```
### `AddressLookupServices`
A composite that runs multiple lookup services concurrently:
```rust
let services = AddressLookupServices::default();
services.set_addr_filter(AddrFilter::relay_only());
services.add(publisher);
services.add(resolver);
```
Resolution merges results from all services. Individual service errors don't block other services.
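The merge behavior described here, where one failing service does not block the others, can be modeled in a few lines. This is a synchronous simplification: the real services resolve concurrently and yield streams, and the names `LookupResult` and `merge_results` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Result of asking one lookup service: addresses (as strings) or an error.
pub type LookupResult = Result<Vec<String>, String>;

/// Merge per-service results into one deduplicated address set.
/// Errors are collected for reporting instead of aborting the merge.
pub fn merge_results(results: Vec<(&str, LookupResult)>) -> (BTreeSet<String>, Vec<String>) {
    let mut addrs = BTreeSet::new();
    let mut errors = Vec::new();
    for (service, result) in results {
        match result {
            Ok(found) => addrs.extend(found),
            Err(err) => errors.push(format!("{service}: {err}")),
        }
    }
    (addrs, errors)
}
```

Using a set for the merged output means an address reported by several services appears only once.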
### Built-in Implementations
#### `PkarrPublisher`
Publishes endpoint info to a pkarr relay via HTTP PUT:
```rust
let publisher = PkarrPublisher::builder(pkarr_url)
.addr_filter(AddrFilter::relay_only()) // Default: relay-only
.build(secret_key, tls_config);
```
#### `PkarrResolver` (browser/WASM)
Resolves endpoint info from a pkarr relay via HTTP GET.
#### `DnsAddressLookup` (non-browser)
Resolves endpoint info via DNS TXT records:
```rust
// Default n0 DNS
let lookup = DnsAddressLookup::n0_dns();
// Custom DNS origin
let lookup = DnsAddressLookup::new(dns_resolver, origin);
```
#### `MemoryLookup`
In-memory address lookup for testing:
```rust
let lookup = MemoryLookup::new();
lookup.add_endpoint(endpoint_id, endpoint_data);
```
### DNS Record Format
```
_iroh.<z32-encoded-endpoint-id>.<origin-domain> TXT
```
Attributes:
- `relay=<url>` — Home relay URL
- `addr=<addr> <addr>` — Space-separated socket addresses
- `user_data=<base64-encoded-data>` — Application-specific data
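As a rough illustration of the record format, the sketch below builds the record name and parses the `key=value` strings of a TXT record into a small struct. It follows the attribute names listed above and leaves the `user_data` value undecoded. `ParsedAttrs`, `record_name`, and `parse_attrs` are hypothetical helpers, not functions from `iroh-dns`.

```rust
use std::net::SocketAddr;

/// Attributes recovered from the TXT strings of one record.
#[derive(Debug, Default, PartialEq)]
pub struct ParsedAttrs {
    pub relay: Option<String>,
    pub addrs: Vec<SocketAddr>,
    /// Raw attribute value, left undecoded in this sketch.
    pub user_data: Option<String>,
}

/// Build the DNS name queried for an endpoint.
pub fn record_name(z32_id: &str, origin: &str) -> String {
    format!("_iroh.{z32_id}.{origin}")
}

/// Parse `key=value` TXT strings, skipping unknown or malformed entries.
pub fn parse_attrs<'a>(txt_strings: impl IntoIterator<Item = &'a str>) -> ParsedAttrs {
    let mut parsed = ParsedAttrs::default();
    for entry in txt_strings {
        let Some((key, value)) = entry.split_once('=') else {
            continue;
        };
        match key {
            "relay" => parsed.relay = Some(value.to_string()),
            // Space-separated socket addresses; unparsable ones are dropped.
            "addr" => parsed.addrs.extend(
                value
                    .split_whitespace()
                    .filter_map(|a| a.parse::<SocketAddr>().ok()),
            ),
            "user_data" => parsed.user_data = Some(value.to_string()),
            _ => {}
        }
    }
    parsed
}
```

Ignoring unknown keys keeps the parser forward compatible if further attributes are added later.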
## TLS Configuration
### `TlsConfig`
Manages TLS state shared across sessions:
```rust
struct TlsConfig {
secret_key: SecretKey,
cert_resolver: Arc<ResolveRawPublicKeyCert>,
server_verifier: Arc<ServerCertificateVerifier>,
client_verifier: Arc<ClientCertificateVerifier>,
session_store: Arc<dyn ClientSessionStore>,
crypto_provider: Arc<CryptoProvider>,
}
```
### Raw Public Key Certificate
Uses RFC 7250 — no X.509 certificates. The `ResolveRawPublicKeyCert` resolver creates TLS certificates on-the-fly from the Ed25519 public key.
### Verification Flow
- **Client verifies server**: The `ServerCertificateVerifier` checks that the server's `EndpointId` matches the expected `EndpointId` encoded in the TLS server name.
- **Server verifies client**: The `ClientCertificateVerifier` ensures the client presents a valid raw public key.
### Crypto Providers
Two built-in options via feature flags:
- `tls-ring` — uses `ring` crypto (default)
- `tls-aws-lc-rs` — uses `aws-lc-rs` crypto
Custom providers can be set via `Builder::crypto_provider()`.
## Multipath & Path Migration
Iroh supports QUIC multipath connections. Multiple paths can be active simultaneously:
```rust
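// Watch path changes. Create the stream once, outside the loop, instead of
// calling `stream()` on every iteration. `next()` needs a `StreamExt` trait
// in scope.
let mut paths = connection.paths().stream();
while let Some(infos) = paths.next().await {
    for info in infos.iter() {
        if info.is_ip() { /* direct path */ }
        if info.is_relay() { /* relay path */ }
    }
}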
```
Maximum multipath paths per connection: 12 (`MAX_MULTIPATH_PATHS`).
### Path Types
```rust
pub struct PathInfo {
pub addr: TransportAddr,
pub usage: TransportAddrUsage,
}
pub enum TransportAddrUsage {
DefaultRoute,
SubnetRoute,
Backup,
}
```
## Connection Hooks
```rust
#[derive(Debug, Clone)]
struct MyHook;
impl EndpointHooks for MyHook {
fn before_connect<'a>(
&'a self,
remote_addr: &'a EndpointAddr,
alpn: &'a [u8],
) -> BoxFuture<'a, BeforeConnectOutcome> {
Box::pin(async move {
// `id` is a public field of `EndpointAddr`; `is_allowed` is an application-defined check.
if is_allowed(remote_addr.id) {
BeforeConnectOutcome::Accept
} else {
BeforeConnectOutcome::Reject
}
})
}
fn after_handshake<'a>(
&'a self,
info: &'a ConnectionInfo,
) -> BoxFuture<'a, AfterHandshakeOutcome> {
Box::pin(async move {
AfterHandshakeOutcome::Accept
})
}
}
```
## Custom Transports (Unstable)
```rust
pub trait CustomTransport: Send + Sync + Debug + 'static {
// Create an endpoint for this transport
fn create_endpoint(&self, config: CustomEndpointConfig) -> Result<Arc<dyn CustomEndpoint>, CustomTransportError>;
}
pub trait CustomEndpoint: Send + Sync + Debug + 'static {
fn send(&self, item: CustomSendItem) -> Result<(), CustomTransportError>;
fn recv(&self) -> Result<CustomRecvItem, CustomTransportError>;
}
// Register:
let ep = Endpoint::builder(presets::N0)
.add_custom_transport(Arc::new(MyTransport))
.bind()
.await?;
```
Transport IDs (from `TRANSPORTS.md`):
| ID | Transport | Address format |
|----|-----------|---------------|
| `0x00-0x1F` | Reserved | - |
| `0x20` | Test | Ed25519 public key (32 bytes) |
| `0x544F52` | Tor | Ed25519 public key (32 bytes) |
| `0x424C45` | BLE | Bluetooth MAC address (6 bytes) |

@@ -0,0 +1,294 @@
# Iroh: Sub-Crates
## `iroh-base`
**Purpose**: Fundamental types shared across all iroh crates.
**Features**: `key` (default), `relay` (default)
### Key Types
| Type | Description |
|------|-------------|
| `SecretKey` | Ed25519 signing key (32 bytes). Generated randomly or from bytes. |
| `PublicKey` | Ed25519 public key (32 bytes). Verifies signatures. |
| `EndpointId` | Type alias for `PublicKey` — used as network identity. |
| `Signature` | Ed25519 signature (64 bytes). |
| `RelayUrl` | Arc-wrapped `Url` identifying a relay server. |
| `EndpointAddr` | Combines `EndpointId` + `BTreeSet<TransportAddr>`. Primary addressing type. |
| `TransportAddr` | Enum: `Relay(RelayUrl)`, `Ip(SocketAddr)`, `Custom(CustomAddr)`. |
| `CustomAddr` | Opaque address for custom transports (id + bytes). |
| `KeyParsingError` | Error type for key parsing. |
| `RelayUrlParseError` | Error type for URL parsing. |
### `EndpointAddr` Methods
```rust
impl EndpointAddr {
pub fn new(id: PublicKey) -> Self;
pub fn from_parts(id: PublicKey, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
pub fn with_relay_url(self, relay_url: RelayUrl) -> Self;
pub fn with_ip_addr(self, addr: SocketAddr) -> Self;
pub fn with_addrs(self, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
pub fn is_empty(&self) -> bool;
pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr>;
pub fn relay_urls(&self) -> impl Iterator<Item = &RelayUrl>;
}
```
### Serialization
- `PublicKey`/`EndpointId`: Human-readable → z-base-32 string; Binary → 32 raw bytes
- `EndpointAddr`: Serialized as `{id, addrs}` with `TransportAddr` as tagged enum
- `RelayUrl`: Serialized as URL string
---
## `iroh-dns`
**Purpose**: DNS resolver and endpoint info serialization for address discovery.
**Key Features**: pkarr signed packet creation/verification, DNS TXT record parsing, configurable DNS resolver.
### Modules
| Module | Description |
|--------|-------------|
| `dns` | `DnsResolver` — configurable async DNS resolver with IPv4/IPv6 staggered lookup |
| `endpoint_info` | `EndpointInfo`, `EndpointData`, `AddrFilter`, `UserData` — serialization/deserialization |
| `pkarr` | Pkarr signed packet creation and verification |
| `attrs` | Low-level TXT record attribute parsing |
### `DnsResolver`
```rust
impl DnsResolver {
pub fn new() -> Self;
pub fn with_nameserver(addr: SocketAddr) -> Self;
pub fn with_nameservers(addrs: Vec<SocketAddr>) -> Self;
// Lookup methods
pub async fn lookup_ipv4(&self, host: String) -> Result<...>;
pub async fn lookup_ipv6(&self, host: String) -> Result<...>;
pub async fn lookup_ipv4_ipv6_staggered(&self, host: &str, timeout: Duration, delays: &[u64]) -> Result<...>;
pub async fn lookup_txt(&self, host: String) -> Result<...>;
pub async fn lookup_endpoint_by_id(&self, id: &EndpointId, origin: &str) -> Result<EndpointInfo>;
// Cache management
pub fn clear_cache(&self);
pub fn reset_resolver(&self);
}
```
### `EndpointInfo` & `EndpointData`
```rust
pub struct EndpointInfo {
pub endpoint_id: EndpointId,
pub data: EndpointData,
}
pub struct EndpointData {
addrs: Vec<TransportAddr>,
user_data: Option<UserData>,
}
impl EndpointData {
pub fn new(addrs: Vec<TransportAddr>) -> Self;
pub fn from_iter(addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
pub fn with_user_data(mut self, user_data: UserData) -> Self;
pub fn addrs(&self) -> impl Iterator<Item = &TransportAddr>;
pub fn user_data(&self) -> Option<&UserData>;
pub fn apply_filter(&self, filter: &AddrFilter) -> Cow<'_, EndpointData>;
}
```
### `AddrFilter`
Controls which addresses are published in address lookup:
```rust
pub enum AddrFilter {
RelayOnly, // Only relay URLs
Unfiltered, // All addresses
Custom(fn(&[TransportAddr]) -> Vec<TransportAddr>),
}
```
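The `Cow` return type of `apply_filter` suggests that the unfiltered case can borrow the original data while filtered cases allocate. Here is a minimal sketch of that pattern with simplified stand-in types; the `TransportAddr` below uses a `String` for the relay URL, and `apply` and `ip_only` are hypothetical names.

```rust
use std::borrow::Cow;
use std::net::SocketAddr;

/// Simplified stand-in for iroh's `TransportAddr`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TransportAddr {
    Relay(String),
    Ip(SocketAddr),
}

pub enum AddrFilter {
    RelayOnly,
    Unfiltered,
    Custom(fn(&[TransportAddr]) -> Vec<TransportAddr>),
}

impl AddrFilter {
    /// Apply the filter, borrowing the input when nothing is removed.
    pub fn apply<'a>(&self, addrs: &'a [TransportAddr]) -> Cow<'a, [TransportAddr]> {
        match self {
            AddrFilter::Unfiltered => Cow::Borrowed(addrs),
            AddrFilter::RelayOnly => Cow::Owned(
                addrs
                    .iter()
                    .filter(|a| matches!(a, TransportAddr::Relay(_)))
                    .cloned()
                    .collect(),
            ),
            AddrFilter::Custom(f) => Cow::Owned(f(addrs)),
        }
    }
}

/// Example custom filter: publish only direct IP addresses.
pub fn ip_only(addrs: &[TransportAddr]) -> Vec<TransportAddr> {
    addrs
        .iter()
        .filter(|a| matches!(a, TransportAddr::Ip(_)))
        .cloned()
        .collect()
}
```

The `Custom` variant holds a plain function pointer, so a named function such as `ip_only` or a non-capturing closure can be passed.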
### Pkarr Integration
```rust
// Creating signed packets
let info = EndpointInfo::new(secret_key.public())
.with_relay_url(relay_url);
let packet = info.to_pkarr_signed_packet(&secret_key, 30)?; // 30 second TTL
// Verifying and extracting
let info = EndpointInfo::from_pkarr_signed_packet(&packet)?;
```
---
## `iroh-relay`
**Purpose**: Relay server and client implementation. Provides DERP-like relay protocol, QAD support, and relay server binary.
### Key Exports
| Type | Description |
|------|-------------|
| `RelayMap` | Thread-safe map of `RelayUrl → RelayConfig` |
| `RelayConfig` | Configuration for a single relay server |
| `RelayQuicConfig` | QUIC address discovery configuration |
| `KeyCache` | Cache for relay server public keys |
| `PingTracker` | Ping/pong tracking for relay connections |
| `MAX_PACKET_SIZE` | Maximum relay packet size (64 KiB minus protocol overhead) |
### Modules
| Module | Description |
|--------|-------------|
| `client` | HTTP client for relay server connections |
| `http` | HTTP-related relay functionality |
| `protos` | Protocol definitions (handshake, relay, streams) |
| `quic` | QUIC client for QAD probing |
| `server` | Full relay server implementation (`feature = "server"`) |
| `tls` | TLS configuration utilities |
### `RelayConfig`
```rust
pub struct RelayConfig {
pub url: RelayUrl,
pub quic: Option<RelayQuicConfig>,
}
impl RelayConfig {
pub fn new(url: RelayUrl, quic: Option<RelayQuicConfig>) -> Self;
pub fn from(url: RelayUrl) -> Self; // No QAD
}
```
### `RelayMap`
```rust
impl RelayMap {
pub fn empty() -> Self;
pub fn from(relay: RelayConfig) -> Self;
pub fn from_iter(iter: impl IntoIterator<Item = impl Into<RelayConfig>>) -> Self;
pub fn try_from_iter(iter: impl IntoIterator<Item = &str>) -> Result<Self, RelayUrlParseError>;
pub fn insert(&self, url: RelayUrl, config: Arc<RelayConfig>) -> Option<Arc<RelayConfig>>;
pub fn remove(&self, url: &RelayUrl) -> Option<Arc<RelayConfig>>;
pub fn len(&self) -> usize;
pub fn is_empty(&self) -> bool;
pub fn urls<T: FromIterator<RelayUrl>>(&self) -> T;
pub fn relays<T: FromIterator<Arc<RelayConfig>>>(&self) -> T;
}
```
### Relay Protocol (DERP-like)
The relay protocol is based on Tailscale's DERP protocol, adapted for iroh:
1. Client connects via HTTPS, upgrades to custom protocol
2. Authentication via raw public key (Ed25519)
3. Encrypted datagram forwarding by `EndpointId`
4. QAD probes via QUIC for address discovery
5. Ping/pong keepalive mechanism
### TLS Utilities
```rust
pub use iroh_relay::tls::{CaRootsConfig, default_provider};
// Skip certificate verification (testing only)
let config = CaRootsConfig::insecure_skip_verify();
// Use system trust roots
let config = CaRootsConfig::platform_verifier();
// Use specific roots
let config = CaRootsConfig::from_pem(pem_bytes);
```
---
## `iroh-dns-server`
**Purpose**: DNS server that resolves iroh `EndpointId`s to addressing information. Powers `dns.iroh.link`.
### Key Features
- Serves DNS TXT records for `_iroh.<z32-endpoint-id>.<origin>` queries
- Integrates with pkarr for signed record verification
- Supports production (`dns.iroh.link`) and staging (`staging-dns.iroh.link`) origins
- Includes benchmarking support
### Configuration Files
- `config.dev.toml` — Development configuration
- `config.prod.toml` — Production configuration
---
## Internal Modules in `iroh` Crate
### `socket` Module
The connectivity layer — manages the `Socket` struct that orchestrates:
- Multiple transport paths
- Network change detection
- Address discovery and publication
- Remote state actors (per-peer state machines)
**Key sub-modules**:
| Sub-module | Description |
|-----------|-------------|
| `transports/` | Transport implementations (IP, relay, custom) |
| `transports/ip.rs` | IPv4/IPv6 UDP transport |
| `transports/relay.rs` | Relay WebSocket transport |
| `transports/relay/actor.rs` | Relay connection management actor |
| `transports/custom.rs` | Unstable custom transport API |
| `remote_map.rs` | Per-peer `RemoteStateActor` management |
| `remote_map/remote_state.rs` | State machine for connecting to a peer |
| `mapped_addrs.rs` | Address mapping for QUIC layer |
| `concurrent_read_map.rs` | Lock-free concurrent map for remote actors |
| `metrics.rs` | Socket-level metrics |
### `net_report` Module
Network condition reporter:
- Discovers external IP addresses (QAD)
- Measures relay latencies
- Detects NAT types
- Detects captive portals
- Selects preferred relay
### `portmapper` Module
UPnP/PCP/NAT-PMP port mapping:
- Gateway discovery
- Port mapping procurement
- External address monitoring
### `address_lookup` Module
Pluggable address discovery:
| Sub-module | Description |
|-----------|-------------|
| `dns.rs` | `DnsAddressLookup` — resolves via DNS TXT records |
| `pkarr.rs` | `PkarrPublisher` — publishes via HTTP PUT to pkarr relay; `PkarrResolver` — resolves from pkarr relay |
| `memory.rs` | `MemoryLookup` — in-memory lookup for testing |
### `runtime` Module
Tokio-based async runtime wrapper for `noq`:
- Task spawning with cancellation support
- Timer management
- Graceful and abrupt shutdown
- WASM browser support (delegates to `wasm-bindgen-futures`)
### `defaults` Module
Default configuration values:
- Production relay servers (4 regions)
- Staging relay servers (2 regions)
- Timeout constants
- Environment variable for forcing staging (`IROH_FORCE_STAGING_RELAYS`)
### `metrics` Module
`EndpointMetrics` collection:
- Socket metrics (datagrams sent/received, data by transport type)
- Net report metrics (reports generated, full vs incremental)
- Port mapper metrics

@@ -0,0 +1,261 @@
# Iroh: Data Flow & Internal Architecture
## Data Flow: Connecting to a Remote Endpoint
```
Endpoint::connect(endpoint_addr, alpn)
  │
  ▼
resolve_remote(endpoint_addr)
  ├─ If addr has direct IPs or relay URL → use those
  └─ If addr is just EndpointId → query AddressLookupServices
       ├─ PkarrPublisher/PkarrResolver (HTTP)
       ├─ DnsAddressLookup (DNS TXT)
       ├─ MemoryLookup (in-memory)
       └─ ...custom implementations
  │
  ▼
Map EndpointId → MappedAddr for QUIC layer
  │
  ▼
noq::Endpoint::connect(client_config, dest_addr, server_name)
  ├─ TLS handshake with Raw Public Key authentication
  │    server_name = "<z32-encoded-endpoint-id>.iroh.invalid"
  └─ QUIC connection established
  │
  ▼
Connecting → Connection
  ├─ Connection stays on relay path initially
  └─ RemoteStateActor discovers direct paths
       ├─ QAD-discovered addresses
       ├─ Addresses from Address Lookup
       ├─ Port mapper external addresses
       └─ Path migration: relay → direct (if possible)
```
## Data Flow: Accepting Connections
```
Endpoint::accept() → Accept<'_>
  │
  ▼  (incoming QUIC packet arrives on any transport)
noq::Endpoint::accept()
  │
  ▼
Incoming
  ├─ incoming.remote_addr() → IncomingAddr (Ip/Relay/Custom)
  ├─ incoming.remote_addr_validated() → bool
  ├─ incoming.accept() → Accepting
  ├─ incoming.refuse() → reject
  ├─ incoming.retry() → QUIC retry (address validation)
  └─ incoming.ignore() → drop silently
  │
  ▼
Accepting
  ├─ accepting.alpn().await → alpn bytes
  ├─ accepting.into_0rtt() → (OutgoingZeroRtt, Connection) [optional]
  └─ accepting.await → Connection
```
## Data Flow: Router Accept Loop
```
Router::spawn()
  ├─ endpoint.set_alpns(registered_alpns)
  └─ Loop:
       ├─ endpoint.accept().await → Incoming
       │    │
       │    ├─ Apply incoming_filter (optional)
       │    │    ├─ Accept → continue
       │    │    ├─ Retry → incoming.retry()
       │    │    ├─ Reject → incoming.refuse()
       │    │    └─ Ignore → incoming.ignore()
       │    │
       │    ├─ incoming.accept() → Accepting
       │    ├─ accepting.alpn().await → determine ALPN
       │    │
       │    └─ protocols.get(alpn) → handler
       │         │
       │         ├─ handler.on_accepting(accepting).await
       │         └─ handler.accept(connection).await
       └─ On shutdown:
            ├─ protocols.shutdown().await
            ├─ handler_cancel_token.cancel()
            └─ endpoint.close().await
```
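The dispatch step at the center of this loop, looking up a handler by negotiated ALPN, can be sketched without any networking. This is a simplified model under stated assumptions: `ProtocolMap` and its boxed-closure `Handler` type are hypothetical, handlers are synchronous, and a byte slice stands in for the connection.

```rust
use std::collections::HashMap;

/// Simplified handler: takes a payload (standing in for a connection)
/// and returns a description of what it did.
pub type Handler = Box<dyn Fn(&[u8]) -> String + Send + Sync>;

/// Registry of handlers keyed by ALPN bytes.
#[derive(Default)]
pub struct ProtocolMap {
    handlers: HashMap<Vec<u8>, Handler>,
}

impl ProtocolMap {
    /// Register a handler for an ALPN (builder style).
    pub fn accept(mut self, alpn: &[u8], handler: Handler) -> Self {
        self.handlers.insert(alpn.to_vec(), handler);
        self
    }

    /// ALPNs to advertise on the endpoint, in sorted order.
    pub fn alpns(&self) -> Vec<Vec<u8>> {
        let mut alpns: Vec<Vec<u8>> = self.handlers.keys().cloned().collect();
        alpns.sort();
        alpns
    }

    /// Route a payload to the handler registered for `alpn`, if any.
    pub fn dispatch(&self, alpn: &[u8], payload: &[u8]) -> Option<String> {
        self.handlers.get(alpn).map(|handler| handler(payload))
    }
}
```

Deriving the advertised ALPN list from the registered handlers mirrors the `set_alpns(registered_alpns)` step at the top of the diagram, so the endpoint never advertises a protocol that has no handler.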
## Actor Model: Per-Remote State
Each remote peer gets a `RemoteStateActor` that manages the connection state:
```
┌───────────────────────────────────────────────┐
│ RemoteStateActor │
│ │
│ ┌─────────────┐ ┌─────────────────┐ │
│ │ Address │ │ Connection │ │
│ │ Lookup │ │ Tracker │ │
│ │ Resolution │ │ │ │
│ └──────┬──────┘ └────────┬────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ Path Selection │ │
│ │ ┌────────┐ ┌────────┐ │ │
│ │ │ IPv4 │ │ IPv6 │ │ │
│ │ │primary │ │primary │ │ │
│ │ └────────┘ └────────┘ │ │
│ │ ┌────────┐ ┌────────┐ │ │
│ │ │ Relay │ │Custom │ │ │
│ │ │backup │ │primary │ │ │
│ │ └────────┘ └────────┘ │ │
│ └──────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Mapped Addresses │ │
│ │ EndpointId → MappedIPv6Addr │ │
│ │ (RelayUrl, EndpointId) → Addr │ │
│ │ CustomAddr → MappedIPv6Addr │ │
│ └──────────────────────────────────┘ │
│ │
│ Messages: │
│ ├─ ResolveRemote(EndpointAddr, reply) │
│ ├─ AddConnection(EndpointId, WeakConn, reply)│
│ └─ RemoteInfo(reply) │
└───────────────────────────────────────────────┘
```
## Data Flow: Socket Actor
The `Actor` in `Socket` runs as a background task handling network changes:
```
┌────────────────────────────────────────────────────────────┐
│ Socket Actor │
│ │
│ ┌──────────────────┐ ┌─────────────────┐ │
│ │ Network Monitor │ │ Direct Addr │ │
│ │ (netwatch) │ │ Update State │ │
│ │ │ │ │ │
│ │ Detects: │ │ Manages: │ │
│ │ - Interface up/down│ │ - NetReport runs │ │
│ │ - Address changes │ │ - Port mapper │ │
│ │ - Route changes │ │ - Direct addrs │ │
│ └────────┬─────────┘ └────────┬──────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Triggers │ │
│ │ - NetworkChange (major/minor) │ │
│ │ - PeriodicReStun (every 30s-5min) │ │
│ │ - PortmapUpdated │ │
│ │ - RelayMapChange │ │
│ │ - DirectAddrRefresh │ │
│ │ - ResolveRemote (from connect) │ │
│ │ - AddConnection (from new QUIC conn) │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ On address change: │
│ ┌──────────────────────────────────────────────┐ │
│ │ 1. Run net_report to discover external addrs │ │
│ │ 2. Update direct_addrs watchable │ │
│ │ 3. Publish new addresses to AddressLookup │ │
│ │ 4. Notify noq of network changes │ │
│ └──────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────┘
```
## Shutdown Sequence
```
Endpoint::close()
├─ Cancel at_close_start token
│ (stops net_reports, address lookups)
├─ Clear address_lookup services
├─ noq_endpoint.close(0, b"")
│ (refuses new connections, starts close for existing)
├─ noq_endpoint.wait_idle().await
│ (waits for close frames to be acknowledged)
├─ Cancel at_endpoint_closed token
├─ Wait for actor task (100ms timeout, then abort)
└─ runtime.shutdown().await
(waits for all spawned tasks)
```
## WASM/Browser Differences
When compiled to `wasm32-unknown-unknown`:
| Feature | Native | WASM/Browser |
|---------|--------|-------------|
| IP transports | Yes (IPv4 + IPv6) | No (no socket access) |
| DNS resolution | `DnsAddressLookup` (system DNS) | `PkarrResolver` (HTTP) |
| Network monitoring | `netwatch` (interface changes) | Not available |
| Port mapping | UPnP/PCP/NAT-PMP | Not available |
| Net report | Full (QAD, HTTPS probes) | Limited |
| Runtime | Tokio | `wasm-bindgen-futures` |
| Timer | Tokio timer | `web::Timer` wrapping `sleep_until` |
## Thread Safety & Concurrency
- `Endpoint` is `Clone` (wraps `Arc<EndpointInner>`)
- `Socket` is `Arc<Socket>` — shared across all connections
- `RemoteMap` uses `ConcurrentReadMap` — lock-free reads for hot path
- `AddressLookupServices` uses `RwLock` — infrequent writes, frequent reads
- `DirectAddrs` uses `Watchable` — publishes changes to watchers
- `HomeRelayWatch` uses `n0_watcher::Direct` — efficient change notification
## Error Handling Patterns
Iroh uses the `n0_error::stack_error` macro for rich error chains:
```rust
#[stack_error(derive, add_meta, from_sources)]
pub enum ConnectError {
#[error(transparent)]
Connect { source: ConnectWithOptsError },
#[error(transparent)]
Connecting { source: ConnectingError },
#[error(transparent)]
Connection { source: ConnectionError },
}
// Usage:
// ConnectError::Connect { source: ConnectWithOptsError::SelfConnect }
// ConnectError::Connecting { source: ConnectingError::AuthenticationError { .. } }
```
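To illustrate what a nested error chain looks like to a caller, here is a hand-written sketch using only `std::error::Error`. It does not use `n0_error` and does not reproduce what `stack_error` generates; the enum shapes are trimmed to one variant each, the messages are invented, and unlike `#[error(transparent)]` the outer `Display` does not forward to the source.

```rust
use std::error::Error;
use std::fmt;

/// Trimmed stand-in for the inner error type.
#[derive(Debug)]
enum ConnectWithOptsError {
    SelfConnect,
}

impl fmt::Display for ConnectWithOptsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::SelfConnect => write!(f, "cannot connect to ourselves"),
        }
    }
}

impl Error for ConnectWithOptsError {}

/// Trimmed stand-in for the outer error; it exposes the inner one as its source.
#[derive(Debug)]
enum ConnectError {
    Connect { source: ConnectWithOptsError },
}

impl fmt::Display for ConnectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Connect { .. } => write!(f, "connect failed"),
        }
    }
}

impl Error for ConnectError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::Connect { source } => Some(source),
        }
    }
}

/// Walks the `source()` chain and collects one message per level.
fn error_chain(err: &dyn Error) -> Vec<String> {
    let mut out = vec![err.to_string()];
    let mut current = err.source();
    while let Some(e) = current {
        out.push(e.to_string());
        current = e.source();
    }
    out
}
```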
## Key Constants & Timeouts
| Constant | Value | Purpose |
|----------|-------|---------|
| `HEARTBEAT_INTERVAL` | 5s | Keepalive PING interval |
| `PATH_MAX_IDLE_TIMEOUT` | 15s | Max idle before closing direct path |
| `RELAY_PATH_MAX_IDLE_TIMEOUT` | 30s | Max idle before closing relay path |
| `MAX_MULTIPATH_PATHS` | 12 | Max concurrent paths per connection |
| `DEFAULT_MAX_TLS_TICKETS` | 256 (8×32) | TLS session ticket cache size |
| `NET_REPORT_TIMEOUT` | 10s | Max time for net report |
| `FULL_REPORT_INTERVAL` | 5min | Time between full net reports |
| `DEFAULT_RELAY_QUIC_PORT` | 3478 | QAD port on relay servers |


@@ -0,0 +1,108 @@
# irpc: Overview and Architecture
## What is irpc?
`irpc` is a **streaming RPC system** built for [iroh](https://docs.rs/iroh) and [noq](https://docs.rs/noq) (QUIC-based transports). It provides a framework for defining RPC protocols in Rust that work identically whether the communication is **in-process** (via tokio channels) or **cross-process/cross-network** (via QUIC streams).
**Key design goals:**
1. **Zero-overhead local use** — When used in-process, irpc should be as lightweight as raw tokio channels, replacing the common pattern of a giant `enum` over an `mpsc` channel with typed backchannels.
2. **Transparent local/remote abstraction** — The same protocol definition and client API works for both in-process and remote communication.
3. **Streaming-first** — Full support for unary RPC, server streaming, client streaming, and bidirectional streaming interaction patterns.
4. **QUIC-native** — Does not abstract over stream types; directly uses noq/iroh QUIC streams, enabling per-request stream tuning (priorities, etc.).
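The "giant `enum` over an `mpsc` channel with typed backchannels" pattern named in the first goal can be sketched as follows. This version uses `std::sync::mpsc` and a thread so that it stays self-contained; irpc itself targets tokio channels, and the `StorageMessage` enum and `spawn_storage_actor` function here are illustrative names only.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// One variant per operation; each carries its own typed reply channel.
enum StorageMessage {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Set { key: String, value: String, reply: mpsc::Sender<()> },
}

/// Spawns the actor and returns the handle used to send it messages.
fn spawn_storage_actor() -> mpsc::Sender<StorageMessage> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The actor owns its state, so no locking is needed.
        let mut store: HashMap<String, String> = HashMap::new();
        // The loop ends once every sender handle has been dropped.
        for msg in rx {
            match msg {
                StorageMessage::Get { key, reply } => {
                    let _ = reply.send(store.get(&key).cloned());
                }
                StorageMessage::Set { key, value, reply } => {
                    store.insert(key, value);
                    let _ = reply.send(());
                }
            }
        }
    });
    tx
}
```

The boilerplate that irpc aims to remove is visible here: every variant has to declare and thread through its own reply channel by hand.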
**Non-goals:**
- Cross-language interop (Rust-to-Rust only)
- Versioning (users must handle this themselves)
- Making remote calls look like local async function calls
- Runtime agnosticism (tokio only)
## Crate Structure
```
irpc/
├── src/lib.rs # Core library: traits, channels, Client, RPC module
├── src/util.rs # Varint utilities, noq endpoint setup helpers
├── src/tests.rs # Channel filter/map tests
├── irpc-derive/ # Procedural macro crate (rpc_requests)
├── irpc-iroh/ # Iroh transport integration
├── examples/ # Working examples (storage, compute, derive, local)
└── tests/ # Integration tests (channels, derive)
```
### Features
| Feature | Default | Purpose |
|---|---|---|
| `rpc` | ✅ | Enables remote RPC (noq transport, postcard serialization) |
| `derive` | ✅ | Enables the `#[rpc_requests]` macro |
| `spans` | ✅ | Preserves tracing spans across message passing |
| `stream` | ✅ | Enables `into_stream()` on mpsc receivers |
| `noq_endpoint_setup` | ✅ | Utilities to create noq endpoints (testing, localhost) |
| `varint-util` | ❌ | Varint read/write utilities without full RPC |
## High-Level Architecture
```
┌─────────────────────────────────────────────────────────┐
│ Application │
│ │
│ ┌──────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Client │─────│ Protocol │─────│ Actor/ │ │
│ │<S> │ │ Enum (S) │ │ Handler │ │
│ └────┬─────┘ └───────────┘ └─────┬─────┘ │
│ │ │ │
│ ┌────▼─────────────────────────────────────▼─────┐ │
│ │ WithChannels<I, S> │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌─────┐ │ │
│ │ │ inner │ │ tx │ │ rx │ │span │ │ │
│ │ │ (I) │ │(Sender)│ │(Recv) │ │ │ │ │
│ │ └────────┘ └────────┘ └────────┘ └─────┘ │ │
│ └────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────┐ ┌─────────────────────────┐ │
│ │ Local Path │ │ Remote Path (rpc feat) │ │
│ │ tokio::mpsc │ │ noq QUIC streams │ │
│ │ tokio::oneshot │ │ postcard serialization │ │
│ └────────────────────┘ └─────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
### Core Flow
1. **Define a protocol** — An enum where each variant represents an RPC method, annotated with `#[rpc(tx=..., rx=...)]`.
2. **The `rpc_requests` macro** generates:
- `Channels<S>` impl for each request type
- A message enum wrapping each request in `WithChannels<I, S>`
- `Service` and `RemoteService` trait implementations
- `From` conversions between request types, protocol enum, and message enum
3. **Client sends messages**: `Client<S>` either sends over a local `mpsc` channel or serializes and sends over a QUIC stream.
4. **Actor/handler processes messages** — Matches on the message enum, extracts `WithChannels { inner, tx, rx, .. }`, and uses `tx`/`rx` to communicate back.
## Dependency Graph
```
irpc (core)
├── serde (always)
├── tokio (sync, macros)
├── tokio-util
├── n0-error
├── n0-future
├── postcard (rpc feature)
├── noq (rpc feature)
├── smallvec (rpc feature)
├── tracing (spans feature)
└── irpc-derive (derive feature)
irpc-iroh
├── irpc
├── iroh
├── iroh-base
├── postcard
└── n0-error, n0-future, tokio, tracing, serde
```
## License
Dual-licensed: Apache-2.0 OR MIT


@@ -0,0 +1,239 @@
# irpc: Key Types and Traits
## Core Traits
### `RpcMessage`
```rust
pub trait RpcMessage: Debug + Serialize + DeserializeOwned + Send + Sync + Unpin + 'static {}
```
A blanket trait implemented for all types that satisfy the bounds. Every message sent through irpc (both local and remote) must implement this. The `Serialize + DeserializeOwned` requirement exists even without the `rpc` feature because the same protocol definition should work in both modes.
### `Service`
```rust
pub trait Service: Serialize + DeserializeOwned + Send + Sync + Debug + 'static {
type Message: Send + Unpin + 'static;
}
```
Implemented on the **protocol enum** (e.g., `StorageProtocol`). The `Message` associated type is the **message enum** — an enum with identical variant names but whose single field is `WithChannels<InnerType, Self>`.
The `Service` trait acts as a **scope** for channel type definitions, allowing the same inner request type to be used with multiple services.
### `Channels<S>`
```rust
pub trait Channels<S: Service>: Send + 'static {
type Tx: Sender;
type Rx: Receiver;
}
```
Implemented on each **request type** (e.g., `Get`, `Set`). Specifies what kind of channels accompany that request when sent through service `S`. The `Tx` type is the response channel (server → client); the `Rx` type is the update channel (client → server).
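The scoping role of the service parameter can be modelled with a reduced version of the two traits. In this sketch the serde and sealed-trait bounds are dropped, `std::sync::mpsc::Sender` stands in for irpc's channel types, and `StorageService`, `MetricsService`, and the `answer_*` functions are hypothetical.

```rust
use std::sync::mpsc;

/// Reduced stand-ins for the irpc traits (no serde or sealed bounds).
trait Service {}

trait Channels<S: Service> {
    type Tx;
    type Rx;
}

/// Placeholder for "no update channel".
struct NoReceiver;

// Two hypothetical services sharing one request type.
struct StorageService;
impl Service for StorageService {}

struct MetricsService;
impl Service for MetricsService {}

struct Get {
    key: String,
}

/// Under `StorageService`, a `Get` is answered with a `String`.
impl Channels<StorageService> for Get {
    type Tx = mpsc::Sender<String>;
    type Rx = NoReceiver;
}

/// Under `MetricsService`, the same `Get` is answered with a number.
impl Channels<MetricsService> for Get {
    type Tx = mpsc::Sender<u64>;
    type Rx = NoReceiver;
}

/// The channel type is resolved from the (request, service) pair.
fn answer_storage(req: &Get, tx: <Get as Channels<StorageService>>::Tx) {
    let _ = tx.send(format!("value of {}", req.key));
}

fn answer_metrics(req: &Get, tx: <Get as Channels<MetricsService>>::Tx) {
    let _ = tx.send(req.key.len() as u64);
}
```

Because the channel types hang off the pair rather than the request alone, one request struct can participate in several protocols without conflicting impls.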
### `Sender` and `Receiver`
```rust
pub trait Sender: Debug + Sealed {}
pub trait Receiver: Debug + Sealed {}
```
Sealed marker traits. Only the types in `irpc::channel` implement these: `oneshot::Sender`, `oneshot::Receiver`, `mpsc::Sender`, `mpsc::Receiver`, `NoSender`, `NoReceiver`.
### `RemoteService` (rpc feature)
```rust
pub trait RemoteService: Service + Sized {
fn with_remote_channels(self, rx: noq::RecvStream, tx: noq::SendStream) -> Self::Message;
fn remote_handler(local_sender: LocalSender<Self>) -> Handler<Self> {
// Default: convert deserialized protocol enum + streams → Message, send to local sender
}
}
```
Implemented on the protocol enum. Maps a deserialized protocol variant + a pair of QUIC streams into a `WithChannels` message, which is then forwarded to the local actor.
### `RemoteConnection` (rpc feature)
```rust
pub trait RemoteConnection: Send + Sync + Debug + 'static {
fn clone_boxed(&self) -> Box<dyn RemoteConnection>;
fn open_bi(&self) -> BoxFuture<Result<(noq::SendStream, noq::RecvStream), RequestError>>;
fn zero_rtt_accepted(&self) -> BoxFuture<bool>;
}
```
Abstraction over how to open a bidirectional QUIC stream. Implemented for:
- `noq::Connection` — direct noq connection
- `NoqLazyRemoteConnection` — lazy connection that caches the underlying QUIC connection
- `IrohRemoteConnection` — iroh connection (in `irpc-iroh`)
- `IrohLazyRemoteConnection` — lazy iroh connection (in `irpc-iroh`)
- `IrohZrttRemoteConnection` — 0-RTT iroh connection (in `irpc-iroh`)
## Key Structs
### `WithChannels<I, S>`
```rust
pub struct WithChannels<I: Channels<S>, S: Service> {
pub inner: I,
pub tx: <I as Channels<S>>::Tx,
pub rx: <I as Channels<S>>::Rx,
#[cfg(feature = "spans")]
pub span: tracing::Span,
}
```
The central message wrapper. Wraps a request type `I` with its typed channels for service `S`. Implements `Deref` to `I` for convenient field access.
**Construction** via tuple conversions:
- `(inner, tx, rx)` → full channels
- `(inner, tx)` → when `Rx = NoReceiver` (most common for RPC/server-streaming)
- `(inner,)` → when `Tx = NoSender, Rx = NoReceiver` (notify)
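The tuple conversions and the `Deref` behaviour can be sketched with a simplified wrapper. Note the assumptions: this `WithChannels` takes `Tx` and `Rx` as plain generic parameters instead of resolving them through `Channels<S>`, it omits the span field, and it uses `std::sync::mpsc` in place of irpc's channels.

```rust
use std::sync::mpsc;

/// Stand-in for the placeholder receiver type.
#[derive(Debug, Default)]
struct NoReceiver;

/// Simplified wrapper; the real type derives `Tx`/`Rx` from `Channels<S>`.
struct WithChannels<I, Tx, Rx> {
    inner: I,
    tx: Tx,
    rx: Rx,
}

/// `(inner, tx, rx)`: all three parts supplied.
impl<I, Tx, Rx> From<(I, Tx, Rx)> for WithChannels<I, Tx, Rx> {
    fn from((inner, tx, rx): (I, Tx, Rx)) -> Self {
        Self { inner, tx, rx }
    }
}

/// `(inner, tx)`: the receiver slot is filled with the placeholder.
impl<I, Tx> From<(I, Tx)> for WithChannels<I, Tx, NoReceiver> {
    fn from((inner, tx): (I, Tx)) -> Self {
        Self { inner, tx, rx: NoReceiver }
    }
}

/// `Deref` to the inner request gives direct access to its fields.
impl<I, Tx, Rx> std::ops::Deref for WithChannels<I, Tx, Rx> {
    type Target = I;

    fn deref(&self) -> &I {
        &self.inner
    }
}

/// Hypothetical request type.
struct Get {
    key: String,
}
```

With these impls, `(Get { .. }, tx).into()` builds the wrapper and `msg.key` reads the request field through `Deref`.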
### `Client<S>`
```rust
#[derive(Debug)]
pub struct Client<S: Service>(ClientInner<S::Message>, PhantomData<S>);
```
The primary client type. Generic over a service `S`. Can be either local or remote.
**Construction:**
- `Client::local(mpsc_sender)` — from a tokio mpsc sender
- `Client::noq(endpoint, addr)` — from a noq endpoint + address (rpc feature)
- `Client::boxed(remote_connection)` — from any `RemoteConnection` impl
**Key methods** (all handle both local and remote transparently):
| Method | Pattern | Tx Type | Rx Type |
|---|---|---|---|
| `rpc()` | Unary RPC | `oneshot::Sender<Res>` | `NoReceiver` |
| `server_streaming()` | Server streaming | `mpsc::Sender<Res>` | `NoReceiver` |
| `client_streaming()` | Client streaming | `oneshot::Sender<Res>` | `mpsc::Receiver<Update>` |
| `bidi_streaming()` | Bidirectional | `mpsc::Sender<Res>` | `mpsc::Receiver<Update>` |
| `notify()` | Fire-and-forget | `NoSender` | `NoReceiver` |
| `rpc_0rtt()` | 0-RTT unary | `oneshot::Sender<Res>` | `NoReceiver` |
| `server_streaming_0rtt()` | 0-RTT server streaming | `mpsc::Sender<Res>` | `NoReceiver` |
| `notify_0rtt()` | 0-RTT fire-and-forget | `NoSender` | `NoReceiver` |
Each method creates the appropriate channel pair, wraps the message into `WithChannels`, and sends it.
### `LocalSender<S>`
```rust
#[repr(transparent)]
pub struct LocalSender<S: Service>(crate::channel::mpsc::Sender<S::Message>);
```
A thin wrapper around `mpsc::Sender<S::Message>` for sending messages to a local actor. Provides:
```rust
impl<S: Service> LocalSender<S> {
pub fn send<T>(&self, value: impl Into<WithChannels<T, S>>) -> impl Future<Output = Result<(), SendError>>
where
T: Channels<S>,
S::Message: From<WithChannels<T, S>>;
pub fn send_raw(&self, value: S::Message) -> impl Future<Output = Result<(), SendError>>;
}
```
### `Request<L, R>`
```rust
pub enum Request<L, R> {
Local(L),
Remote(R),
}
```
A generic enum distinguishing local vs remote requests. `Client::request()` returns `Request<LocalSender<S>, RemoteSender<S>>`.
### `RemoteSender<S>` (rpc feature)
```rust
pub struct RemoteSender<S>(noq::SendStream, noq::RecvStream, PhantomData<S>);
```
Holds a QUIC stream pair after opening a bidirectional stream. The `write()` method serializes the protocol message with postcard + varint length prefix and sends it over the send stream.
### `Handler<R>` (rpc feature)
```rust
pub type Handler<R> = Arc<
dyn Fn(R, noq::RecvStream, noq::SendStream) -> BoxFuture<Result<(), SendError>>
+ Send + Sync + 'static,
>;
```
A shared handler function that processes incoming remote requests. Typically created via `Protocol::remote_handler(local_sender)`.
## Error Types
### `RequestError`
```rust
pub enum RequestError {
Connect { source: noq::ConnectError }, // Connection establishment failed
Connection { source: noq::ConnectionError }, // Stream open failed
Other { source: AnyError }, // Generic error for non-noq transports
}
```
### `SendError` (in `channel` module)
```rust
pub enum SendError {
ReceiverClosed, // Local: receiver dropped
MaxMessageSizeExceeded, // Remote: message > 16 MiB
Io { source: io::Error }, // Remote: network/serialization error
}
```
### `RecvError` (oneshot and mpsc variants)
```rust
// oneshot::RecvError
pub enum RecvError {
SenderClosed, // Local: sender dropped
MaxMessageSizeExceeded, // Remote: message > 16 MiB
Io { source: io::Error }, // Remote: network/deserialization error
}
// mpsc::RecvError
pub enum RecvError {
MaxMessageSizeExceeded, // Remote: message > 16 MiB
Io { source: io::Error }, // Remote: network/deserialization error
}
```
Note: `mpsc::RecvError` does **not** have `SenderClosed` — mpsc receivers return `Ok(None)` when the sender is dropped.
### `WriteError` (rpc feature)
```rust
pub enum WriteError {
Noq { source: noq::WriteError }, // QUIC stream write error
MaxMessageSizeExceeded, // Message > 16 MiB
Io { source: io::Error }, // Serialization error
}
```
### `Error` (top-level umbrella)
```rust
pub enum Error {
Request { source: RequestError },
Send { source: SendError },
MpscRecv { source: mpsc::RecvError },
OneshotRecv { source: oneshot::RecvError },
Write { source: rpc::WriteError }, // rpc feature only
}
```
Each of these error types converts into `io::Error` (there is an `impl From<Error> for io::Error`, and likewise for the other types), so they can be propagated with `?` in functions that return `io::Result`.
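A minimal sketch of the conversion into `io::Error` follows. The `SendError` here is a one-variant stand-in, and the `ErrorKind` chosen in the `From` impl is an assumption made for illustration, not necessarily the kind irpc uses.

```rust
use std::fmt;
use std::io;

/// One-variant stand-in for an irpc error enum.
#[derive(Debug)]
enum SendError {
    ReceiverClosed,
}

impl fmt::Display for SendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ReceiverClosed => write!(f, "receiver closed"),
        }
    }
}

impl std::error::Error for SendError {}

/// The conversion that lets `?` work in `io::Result` functions.
/// The `BrokenPipe` kind is an illustrative choice.
impl From<SendError> for io::Error {
    fn from(err: SendError) -> Self {
        io::Error::new(io::ErrorKind::BrokenPipe, err)
    }
}

/// Hypothetical fallible operation returning the domain error.
fn send_value(closed: bool) -> Result<(), SendError> {
    if closed {
        Err(SendError::ReceiverClosed)
    } else {
        Ok(())
    }
}

/// `?` converts `SendError` into `io::Error` through the impl above.
fn forward(closed: bool) -> io::Result<()> {
    send_value(closed)?;
    Ok(())
}
```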


@@ -0,0 +1,168 @@
# irpc: Channel System
The channel system is the heart of irpc. It provides channel types that abstract over local (tokio) and remote (QUIC stream) communication, with the same API surface regardless of transport.
## Channel Kinds
irpc provides three kinds of channels, each with local and remote variants:
### Oneshot Channels (`channel::oneshot`)
Single-value, single-use channels for RPC responses.
| Type | Local Backend | Remote Backend |
|---|---|---|
| `oneshot::Sender<T>` | `tokio::sync::oneshot::Sender` | `BoxedSender<T>` (FnOnce over QUIC write) |
| `oneshot::Receiver<T>` | `FusedOneshotReceiver<T>` | `BoxedReceiver<T>` (boxed future over QUIC read) |
**Creation:** `oneshot::channel::<T>()` returns `(Sender<T>, Receiver<T>)`
**Sender behavior:**
- Local: `send(value)` completes without waiting on any I/O and fails only if the receiver was dropped
- Remote: `send(value)` is async — serializes with postcard, length-prefixes with varint, writes to QUIC stream
**Receiver behavior:**
- Implements `Future<Output = Result<T, RecvError>>`
- Local: resolves to the value or `SenderClosed` error
- Remote: reads varint length prefix, reads that many bytes, deserializes with postcard
**Filtering/Mapping** (on `Sender<T>` where `T: Send + Sync + 'static`):
```rust
sender.with_filter(|v| v > 0) // Drop messages failing predicate
sender.with_map(|v: U| v.into()) // Transform before sending
sender.with_filter_map(|v| ...) // Combined filter + map
```
### MPSC Channels (`channel::mpsc`)
Multi-producer, single-consumer streaming channels for server-streaming, client-streaming, and bidirectional patterns.
| Type | Local Backend | Remote Backend |
|---|---|---|
| `mpsc::Sender<T>` | `tokio::sync::mpsc::Sender` | `Arc<DynSender<T>>` (NoqSender) |
| `mpsc::Receiver<T>` | `tokio::sync::mpsc::Receiver` | `Box<dyn DynReceiver<T>>` (NoqReceiver) |
**Creation:** `mpsc::channel::<T>(buffer)` returns `(Sender<T>, Receiver<T>)`
**Sender behavior:**
- `send(value).await` — sends, yielding if full (remote: serializes + writes to stream)
- `try_send(value).await` — non-blocking attempt; returns `Ok(false)` if would block
- `closed().await` — waits until all receivers are dropped
- `is_rpc()` — returns `true` for remote senders
**Receiver behavior:**
- `recv().await` returns `Result<Option<T>, RecvError>`; `None` means the sender was closed or finished cleanly
- `filter(pred)`, `map(fn)`, `filter_map(fn)` — chainable transformations
- `into_stream()` (with `stream` feature) — converts to `Stream<Item = Result<T, RecvError>>`
**Cloning:** `mpsc::Sender<T>` implements `Clone`. Local senders clone the underlying tokio sender; remote senders clone the `Arc`.
### None Channels (`channel::none`)
Placeholder channels for when no communication is needed.
```rust
pub struct NoSender; // Implements Sender, does nothing
pub struct NoReceiver; // Implements Receiver, does nothing
```
Used as defaults when `#[rpc(tx=...)]` or `#[rpc(rx=...)]` are omitted.
## Remote Channel Internals
### `NoqSender<T>`
```rust
struct NoqSender<T>(tokio::sync::Mutex<NoqSenderState<T>>);
enum NoqSenderState<T> {
Open(NoqSenderInner<T>),
Closed,
}
struct NoqSenderInner<T> {
send: noq::SendStream,
buffer: SmallVec<[u8; 128]>, // Stack-allocated buffer for small messages
_marker: PhantomData<T>,
}
```
Key behaviors:
- **Mutex-protected state**: The inner state is `Mutex`-protected because `DynSender::send()` takes `&self`. When a send fails, the state transitions to `Closed` and all subsequent sends return `BrokenPipe`.
- **Buffer reuse**: Uses `SmallVec<[u8; 128]>` to avoid heap allocation for messages that serialize to ≤128 bytes.
- **Serialization**: Each message is postcard-serialized with a varint length prefix. If the serialized message exceeds `MAX_MESSAGE_SIZE` (16 MiB), the stream is reset with error code `ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED` (1).
- **Serialization errors**: If postcard serialization fails, the stream is reset with `ERROR_CODE_INVALID_POSTCARD` (2).
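The open/closed transition described in the first bullet can be sketched over any `std::io::Write`. This is a synchronous model with a `std::sync::Mutex`; the real sender is async, uses a tokio mutex, and wraps a QUIC send stream. `FramedSender` and `FailingStream` are hypothetical names.

```rust
use std::io::{self, Write};
use std::sync::Mutex;

/// Either an open stream or a permanently closed marker.
enum SenderState<W> {
    Open(W),
    Closed,
}

/// Simplified sender over any `Write`.
struct FramedSender<W>(Mutex<SenderState<W>>);

impl<W: Write> FramedSender<W> {
    fn new(stream: W) -> Self {
        Self(Mutex::new(SenderState::Open(stream)))
    }

    /// Takes `&self`, which is why the state sits behind a lock.
    fn send(&self, payload: &[u8]) -> io::Result<()> {
        let mut state = self.0.lock().unwrap();
        let result = match &mut *state {
            SenderState::Open(stream) => stream.write_all(payload),
            SenderState::Closed => return Err(io::ErrorKind::BrokenPipe.into()),
        };
        if result.is_err() {
            // The first failure closes the sender for all later calls.
            *state = SenderState::Closed;
        }
        result
    }
}

/// Test double whose writes always fail.
struct FailingStream;

impl Write for FailingStream {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::ConnectionReset, "reset"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

The first failed send reports the underlying error; every later send reports `BrokenPipe` without touching the stream again.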
### `NoqReceiver<T>`
```rust
struct NoqReceiver<T> {
recv: noq::RecvStream,
_marker: PhantomData<T>,
}
```
Reads a varint length prefix, allocates a buffer of that size, reads the data, and deserializes with postcard. If the length exceeds `MAX_MESSAGE_SIZE`, stops the stream with the appropriate error code.
### Oneshot Remote Sender
For `oneshot::Sender<T>` over QUIC, the sender is a `BoxedSender<T>` — a `Box<dyn FnOnce(T) -> BoxFuture<Result<(), SendError>>>`. This captures the `noq::SendStream` and on invocation:
1. Computes `postcard::experimental::serialized_size(&value)`
2. Checks against `MAX_MESSAGE_SIZE`
3. Writes length-prefixed postcard data to the stream
### Oneshot Remote Receiver
For `oneshot::Receiver<T>` over QUIC, the receiver is constructed from a `noq::RecvStream`:
1. Reads a varint length prefix
2. Reads that many bytes
3. Deserializes with postcard
4. Returns the value
## Channel Conversion Table
When a QUIC stream pair `(SendStream, RecvStream)` is received for a request:
| Channel Kind | `Tx` (SendStream →) | `Rx` (RecvStream →) |
|---|---|---|
| `oneshot::Sender<T>` | Serialize + write, then finish | N/A |
| `mpsc::Sender<T>` | Repeatedly serialize + write | N/A |
| `oneshot::Receiver<T>` | N/A | Read single length-prefixed value |
| `mpsc::Receiver<T>` | N/A | Repeatedly read length-prefixed values |
| `NoSender` | Drop the stream | N/A |
| `NoReceiver` | N/A | Drop the stream |
The `From<noq::RecvStream>` and `From<noq::SendStream>` impls handle these conversions automatically based on the target type.
## DynSender and DynReceiver Traits
The `mpsc` module exposes traits for dynamic dispatch:
```rust
pub trait DynSender<T>: Debug + Send + Sync + 'static {
fn send(&self, value: T) -> Pin<Box<dyn Future<Output = Result<(), SendError>> + Send + '_>>;
fn try_send(&self, value: T) -> Pin<Box<dyn Future<Output = Result<bool, SendError>> + Send + '_>>;
fn closed(&self) -> Pin<Box<dyn Future<Output = ()> + Send + Sync + '_>>;
fn is_rpc(&self) -> bool;
}
pub trait DynReceiver<T>: Debug + Send + Sync + 'static {
fn recv(&mut self) -> Pin<Box<dyn Future<Output = Result<Option<T>, RecvError>> + Send + Sync + '_>>;
}
```
These enable boxing of remote senders/receivers while keeping the local variants unboxed for zero overhead.
## FusedOneshotReceiver
A thin wrapper around `tokio::sync::oneshot::Receiver` that prevents panics when polling an already-completed receiver. It tracks completion state and returns `Poll::Pending` indefinitely after resolution, matching the `FusedFuture` pattern.
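The fused behaviour can be sketched generically over any future. This is not the irpc type: `Fused` below wraps an arbitrary `Future` rather than a tokio oneshot receiver, and `NoopWake` and `poll_once` exist only so the sketch can be polled by hand without a runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Drops the inner future once it resolves and reports `Pending` on every
/// later poll instead of polling a finished future.
struct Fused<F> {
    inner: Option<Pin<Box<F>>>,
}

impl<F: Future> Fused<F> {
    fn new(fut: F) -> Self {
        Self { inner: Some(Box::pin(fut)) }
    }
}

impl<F: Future> Future for Fused<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let Some(fut) = self.inner.as_mut() else {
            return Poll::Pending; // already completed
        };
        match fut.as_mut().poll(cx) {
            Poll::Ready(value) => {
                // Remember completion so the inner future is never polled again.
                self.inner = None;
                Poll::Ready(value)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that does nothing; sufficient for manual polling.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` exactly once with the no-op waker.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Polling a `Fused::new(std::future::ready(5))` yields `Ready(5)` once and `Pending` on each later poll.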
## Cancellation Safety
For remote `mpsc::Sender`:
- If a `send()` future is dropped before completion, the underlying QUIC stream is closed.
- All clones of the sender will receive `SendError::Io(BrokenPipe)` on subsequent send attempts.
- This is documented behavior: **always poll send futures to completion if you want to reuse the sender**.
For remote `oneshot::Sender`:
- Since it's `FnOnce`, dropping the future before sending simply means the value is never sent. The receiver will get `SenderClosed`.


@@ -0,0 +1,272 @@
# irpc: Protocol and Message Flow
## Wire Protocol
When the `rpc` feature is enabled, irpc uses the following wire format over QUIC streams:
### Message Framing
Every message on the wire is **length-prefixed using postcard varints** (LEB128 encoding):
```
┌─────────────────┬──────────────────────┐
│ varint length │ postcard-serialized │
│ (1-10 bytes) │ message data │
└─────────────────┴──────────────────────┘
```
- **Length prefix**: LEB128 varint encoding of `u64` length. Each byte uses 7 bits for the value and the MSB as a continuation bit. Maximum 10 bytes for a full `u64`.
- **Payload**: Postcard-encoded (compact, no-schema serde format) Rust message.
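The length prefix can be illustrated with a hand-rolled unsigned LEB128 codec. This is a sketch of the encoding only, not the code irpc or postcard ships; the function names are made up, and unlike a strict decoder it does not reject excess bits in a tenth byte.

```rust
/// Appends `value` to `out` as an unsigned LEB128 varint.
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8; // low 7 bits
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            return;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Decodes a varint from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends early or no terminating byte appears within 10 bytes.
fn read_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For example, a length of 300 encodes as the two bytes `0xAC 0x02`, and `u64::MAX` takes the full 10 bytes.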
### Maximum Message Size
`MAX_MESSAGE_SIZE = 16 MiB (16 * 1024 * 1024)`
Messages exceeding this limit are rejected:
- **Send side**: The sender checks `postcard::experimental::serialized_size()` before sending. If exceeded, the stream is reset with error code `1` (`ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED`).
- **Receive side**: After reading the varint length, if it exceeds `MAX_MESSAGE_SIZE`, the stream is stopped with error code `1`.
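The receive-side check can be sketched over any `std::io::Read`. The point of the ordering is that the declared length is validated before the payload buffer is allocated. The function names and error messages are illustrative, and returning an `io::Error` stands in for stopping the QUIC stream with error code `1`.

```rust
use std::io::{self, Read};

const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024;

/// Reads an unsigned LEB128 varint one byte at a time.
fn read_varint<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut value = 0u64;
    for i in 0..10 {
        let mut byte = [0u8; 1];
        reader.read_exact(&mut byte)?;
        value |= u64::from(byte[0] & 0x7f) << (7 * i);
        if byte[0] & 0x80 == 0 {
            return Ok(value);
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidData, "varint too long"))
}

/// Reads one length-prefixed frame, rejecting oversized lengths before
/// allocating the payload buffer.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let len = read_varint(reader)?;
    if len > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "max message size exceeded",
        ));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking first means a peer cannot force a large allocation just by sending a large length prefix.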
### Error Codes
| Code | Constant | Meaning |
|---|---|---|
| `1` | `ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED` | Message larger than 16 MiB |
| `2` | `ERROR_CODE_INVALID_POSTCARD` | Postcard serialization failed |
These are used as QUIC stream reset/stop error codes.
### Connection Closure
Error code `0` on the QUIC connection means "clean close" — the remote side intentionally shut down. This is distinguished from actual errors.
## Message Flow: Local Path
```
Client Actor
│ │
│ Client::rpc(Get { key: "x" }) │
│ │
│ 1. Create oneshot channel pair │
│ (tx, rx) = oneshot::channel() │
│ │
│ 2. Wrap into WithChannels │
│ WithChannels { │
│ inner: Get { key: "x" }, │
│ tx: oneshot::Sender<Res>, │
│ rx: NoReceiver, │
│ span: current_span, │
│ } │
│ │
│ 3. Convert to Message enum │
│ StorageMessage::Get(wc) │
│ │
│ 4. Send over mpsc channel ────────►│
│ │
│ 5. Await on oneshot receiver │
│ rx.await ◄─────────────────────│
│ tx.send(res)│
│ │
│ Result: res │
```
For bidirectional streaming:
```
Client Actor
│ │
│ Client::bidi_streaming(Sum, 4, 4) │
│ │
│ 1. Create channel pairs │
│ (update_tx, update_rx) │
│ (res_tx, res_rx) │
│ │
│ 2. WithChannels { │
│ inner: Sum, │
│ tx: mpsc::Sender<i64>, │
│ rx: mpsc::Receiver<i64>, │
│ } │
│ │
│ 3. Send message ──────────────────►│
│ │
│ 4. Use update_tx.send(val) ───────►│
│ Use res_rx.recv() ◄─────────│
│ res_tx.send(val)
│ │
```
## Message Flow: Remote Path
```
Client Server
│ │
│ Client::rpc(Get { key: "x" }) │
│ │
│ 1. open_bi() → (SendStream, RecvStream)
│ │
│ 2. Serialize StorageProtocol::Get(Get { key: "x" })
│ with postcard + varint prefix │
│ │
│ 3. Write to SendStream ───────────►│
│ │
│ │ 4. Accept bi stream
│ │ 5. Read varint + deserialize
│ │ 6. RemoteService::with_remote_channels()
│ │ → WithChannels { inner, tx, rx }
│ │ 7. Forward to local actor
│ │
│ │ Actor processes, sends response
│ │ on the SendStream (which is the
│ │ oneshot::Sender<T> backed by QUIC)
│ │
│ 8. Read from RecvStream ◄──────────│
│ 9. Deserialize response │
│ │
│ Result: res │
```
For bidirectional streaming over remote:
```
Client Server
│ │
│ Client::bidi_streaming(Sum, 4, 4) │
│ │
│ open_bi() → (SendStream, RecvStream)
│ │
│ SendStream → mpsc::Sender<Update> │ RecvStream → mpsc::Receiver<Update>
│ RecvStream → oneshot::Receiver<Res>│ SendStream → oneshot::Sender<Res>
│ (or mpsc::Receiver<Res> for │
│ server-streaming with mpsc tx) │
│ │
│ The initial message is sent on │
│ SendStream with varint prefix. │
│ │
│ Subsequent updates are sent on │
│ the same SendStream as varint- │
│ prefixed postcard messages. │
│ │
│ The response stream is read from │
│ the RecvStream as varint-prefixed │
│ postcard messages. │
```
## Stream Direction Convention
In irpc's QUIC stream model:
- **Client opens** a bidirectional stream (`open_bi()`)
- **SendStream** (client → server): carries the initial request message, plus any client-streaming updates
- **RecvStream** (server → client): carries the response(s) from the server
The `RemoteService::with_remote_channels()` method decides how to map streams to channels:
```rust
// For a simple RPC (tx=oneshot, rx=none):
fn with_remote_channels(self, rx: RecvStream, tx: SendStream) -> Self::Message {
// rx stream is unused (NoReceiver), tx carries response
WithChannels::from((msg, tx.into(), rx.into()))
// tx → oneshot::Sender<Res> (or mpsc::Sender<Res>)
// rx → NoReceiver
}
```
The simplified sketch above leaves the stream roles implicit. `RemoteService::with_remote_channels` takes `(self, rx: RecvStream, tx: SendStream)` and runs on the accepting side, after the request has been deserialized, so the `RecvStream` is what the server reads from (client updates) and the `SendStream` is what the server writes to (server responses).
In the `with_remote_channels` generated code:
```rust
// For rpc(tx=oneshot::Sender<Res>, rx=mpsc::Receiver<Update>):
WithChannels::from((msg, tx.into(), rx.into()))
// tx (SendStream) → oneshot::Sender<Res> — server writes response
// rx (RecvStream) → mpsc::Receiver<Update> — server reads client updates
```
So the naming in `with_remote_channels` is from the **server's perspective**:
- `rx` parameter = RecvStream = what server receives (client → server updates)
- `tx` parameter = SendStream = what server sends (server → client responses)
## Connection Management
### NoqLazyRemoteConnection
```rust
struct NoqLazyRemoteConnection(Arc<NoqLazyRemoteConnectionInner>);
struct NoqLazyRemoteConnectionInner {
endpoint: noq::Endpoint,
addr: SocketAddr,
connection: Mutex<Option<noq::Connection>>,
}
```
- Lazily establishes connection on first use
- Caches the `noq::Connection` inside a `Mutex<Option<...>>`
- On `open_bi()`: if cached connection exists, tries to reuse it; if it fails, clears cache and reconnects once
- Thread-safe via `Arc` + `Mutex`
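The cache-then-retry-once behaviour can be modelled synchronously. In this sketch `Conn`, its `alive` flag, and the `dial` closure are hypothetical test doubles; the real type is async and holds a `noq::Endpoint` and a `SocketAddr` instead of a closure.

```rust
use std::sync::Mutex;

/// Hypothetical connection handle; `alive` simulates whether streams can
/// still be opened on it.
struct Conn {
    id: u32,
    alive: bool,
}

impl Conn {
    fn open_stream(&self) -> Result<u32, &'static str> {
        if self.alive {
            Ok(self.id)
        } else {
            Err("connection lost")
        }
    }
}

/// Caches one connection and re-dials at most once per call.
struct LazyConnection<F> {
    dial: F,
    cached: Mutex<Option<Conn>>,
}

impl<F: Fn() -> Result<Conn, &'static str>> LazyConnection<F> {
    fn new(dial: F) -> Self {
        Self { dial, cached: Mutex::new(None) }
    }

    fn open_stream(&self) -> Result<u32, &'static str> {
        let mut cached = self.cached.lock().unwrap();
        if let Some(conn) = &*cached {
            if let Ok(stream) = conn.open_stream() {
                return Ok(stream); // fast path: reuse the cached connection
            }
            // Cached connection is dead: clear it and fall through.
            *cached = None;
        }
        // Single reconnect attempt; errors here are returned to the caller.
        let conn = (self.dial)()?;
        let stream = conn.open_stream()?;
        *cached = Some(conn);
        Ok(stream)
    }
}
```

Holding the lock across the reconnect keeps concurrent callers from dialing in parallel, at the cost of serializing them while a dial is in flight.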
### IrohLazyRemoteConnection (irpc-iroh)
Same pattern but for iroh endpoints, with an additional `alpn` field for protocol identification.
### 0-RTT Support
irpc supports QUIC 0-RTT for reduced latency on reconnections:
- `Client::rpc_0rtt()` — sends request immediately with 0-RTT data; if the server rejects 0-RTT, re-sends
- `Client::server_streaming_0rtt()` — same for server-streaming
- `Client::notify_0rtt()` — same for fire-and-forget
The 0-RTT flow:
1. Client serializes the message into a buffer (`prepare_write()`)
2. Sends the buffer over a 0-RTT connection
3. Awaits `zero_rtt_accepted()` to check if 0-RTT was accepted
4. If not accepted, opens a new connection and re-sends the same buffer
`RemoteConnection::zero_rtt_accepted()` returns `true` for regular connections and for lazy connections. For `IrohZrttRemoteConnection`, it checks the actual 0-RTT status via `handshake_completed()`.
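The four-step flow can be sketched with mock connections. Everything here is hypothetical: `MockConn` only records writes, the acceptance result is passed in as a flag instead of being awaited, and plain string bytes stand in for the postcard buffer.

```rust
/// Hypothetical transport that records what was written to it.
#[derive(Default)]
struct MockConn {
    written: Vec<Vec<u8>>,
}

impl MockConn {
    fn write(&mut self, buf: &[u8]) {
        self.written.push(buf.to_vec());
    }
}

/// Sends `request` optimistically on `early`; if the early data was not
/// accepted, re-sends the identical bytes on `full`.
/// Returns `true` when the 0-RTT attempt was sufficient.
fn send_with_0rtt(
    request: &str,
    early: &mut MockConn,
    full: &mut MockConn,
    zero_rtt_accepted: bool,
) -> bool {
    // Step 1: serialize once so both attempts share the same buffer.
    let buf = request.as_bytes().to_vec();
    // Step 2: send as 0-RTT data.
    early.write(&buf);
    // Step 3: check the handshake outcome.
    if zero_rtt_accepted {
        return true;
    }
    // Step 4: fall back to a regular connection with the same bytes.
    full.write(&buf);
    false
}
```

Since QUIC 0-RTT data can be replayed by an attacker, this path is generally best reserved for requests that are safe to process more than once.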
## Server-Side: Accepting Connections
### Using noq (direct QUIC)
```rust
irpc::rpc::listen(endpoint, handler)
```
This function:
1. Loops on `endpoint.accept()` to accept incoming connections
2. For each connection, spawns a task running `handle_connection()`
3. `handle_connection()` loops on `read_request_raw()` to read requests from bidirectional streams
4. Each request is deserialized and passed to the `Handler`
### Using iroh
```rust
IrohProtocol::with_sender(local_sender)
```
This creates a `ProtocolHandler` that can be registered with `iroh::protocol::Router`. When a connection arrives, it calls `handle_connection()` from irpc-iroh, which handles the protocol handshake and reads requests.
For 0-RTT support:
```rust
Iroh0RttProtocol::with_sender(local_sender)
```
This implements `ProtocolHandler::on_accepting()` to handle 0-RTT connections.
### Handler Function
```rust
type Handler<R> = Arc<
dyn Fn(R, noq::RecvStream, noq::SendStream) -> BoxFuture<Result<(), SendError>>
+ Send + Sync + 'static,
>;
```
The handler receives:
1. The deserialized protocol message (`R`)
2. The `RecvStream` (for client → server updates)
3. The `SendStream` (for server → client responses)
Typically created via `Protocol::remote_handler(local_sender)`, which converts streams to typed channels and forwards the `WithChannels` message to a local actor.


@@ -0,0 +1,278 @@
# irpc: The rpc_requests Macro
The `#[rpc_requests]` attribute macro is the primary way to define an irpc protocol. It generates the boilerplate for channel typing, message wrapping, and service trait implementations.
## Basic Usage
```rust
use irpc::{channel::{mpsc, oneshot}, rpc_requests, Client, WithChannels};
use serde::{Deserialize, Serialize};
#[rpc_requests(message = ComputeMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum ComputeProtocol {
/// Unary RPC: one request, one response
#[rpc(tx=oneshot::Sender<i64>)]
#[wrap(Multiply)]
Multiply(i64, i64),
/// Bidirectional streaming
#[rpc(tx=mpsc::Sender<i64>, rx=mpsc::Receiver<i64>)]
#[wrap(Sum)]
Sum,
}
```
This single macro invocation generates:
1. **Wrapper structs** (from `#[wrap]`): `Multiply` and `Sum` struct types
2. **`Channels<ComputeProtocol>` impls**: For each variant's inner type, specifying `Tx` and `Rx`
3. **`Service` impl**: `impl Service for ComputeProtocol { type Message = ComputeMessage; }`
4. **`RemoteService` impl** (rpc feature): Maps protocol variants + QUIC streams to messages
5. **`ComputeMessage` enum**: Wraps each request in `WithChannels`
6. **`From` conversions**: Between inner types, `ComputeProtocol`, and `ComputeMessage`
## Macro Arguments
### Top-level (on the enum)
| Argument | Required | Description |
|---|---|---|
| `message = Name` | Recommended | Name of the generated message enum. Also generates `Service` and `RemoteService` impls. |
| `alias = "Suffix"` | Optional | Generates type aliases like `MultiplyMsg = WithChannels<Multiply, ComputeProtocol>` |
| `rpc_feature = "feat"` | Optional | Feature-gates the `RemoteService` impl with `#[cfg(feature = "feat")]` |
| `no_rpc` | Optional | Skips generating `RemoteService` impl entirely |
| `no_spans` | Optional | Skips span-related code (for use without the `spans` feature) |
### Per-variant
#### `#[rpc(tx=Type, rx=Type)]`
Specifies channel types for each request:
- `tx` — response channel type (server → client). Defaults to `NoSender`.
- `rx` — update channel type (client → server). Defaults to `NoReceiver`.
Valid types:
- `oneshot::Sender<T>` — single response
- `mpsc::Sender<T>` — streaming response
- `oneshot::Receiver<T>`, valid only as `rx` (never as `tx`), for a single update (client → server)
- `mpsc::Receiver<T>` — streaming updates (client → server)
- `NoSender` / `NoReceiver` — no channel in that direction
#### `#[wrap(TypeName, derive(Traits))]`
Generates a struct from the variant's fields:
- `TypeName` — name of the generated struct
- Optional visibility prefix (e.g., `pub(crate) TypeName`)
- `derive(...)` — additional derive macros beyond the default `Serialize, Deserialize, Debug`
If `#[wrap]` is not used, each variant must have exactly one unnamed field, and that field's type (which must be a named type) is used as the request type.
## Generated Code Walkthrough
Given this input:
```rust
#[rpc_requests(message = StoreMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum StoreProtocol {
#[rpc(tx=oneshot::Sender<String>)]
#[wrap(GetRequest, derive(Clone))]
Get(String),
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(SetRequest)]
Set { key: String, value: String },
}
```
The macro generates:
### 1. Wrapper Structs
```rust
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct GetRequest(pub String);
#[derive(Debug, Serialize, Deserialize)]
pub struct SetRequest { pub key: String, pub value: String }
```
The variants are rewritten to use these:
```rust
enum StoreProtocol {
Get(GetRequest),
Set(SetRequest),
}
```
### 2. Channels Implementations
```rust
impl Channels<StoreProtocol> for GetRequest {
type Tx = oneshot::Sender<String>;
type Rx = NoReceiver;
}
impl Channels<StoreProtocol> for SetRequest {
type Tx = oneshot::Sender<()>;
type Rx = NoReceiver;
}
```
### 3. Message Enum
```rust
#[doc = "Message enum for [`StoreProtocol`]"]
#[allow(missing_docs)]
#[derive(Debug)]
pub enum StoreMessage {
Get(WithChannels<GetRequest, StoreProtocol>),
Set(WithChannels<SetRequest, StoreProtocol>),
}
```
### 4. Service Implementation
```rust
impl Service for StoreProtocol {
type Message = StoreMessage;
}
```
### 5. RemoteService Implementation (rpc feature)
```rust
impl RemoteService for StoreProtocol {
    fn with_remote_channels(
        self,
        rx: noq::RecvStream,
        tx: noq::SendStream,
    ) -> Self::Message {
        match self {
            StoreProtocol::Get(msg) => {
                StoreMessage::from(WithChannels::from((msg, tx, rx)))
            }
            StoreProtocol::Set(msg) => {
                StoreMessage::from(WithChannels::from((msg, tx, rx)))
            }
        }
    }
}
```
### 6. From Conversions
```rust
// Inner type → Protocol enum
impl From<GetRequest> for StoreProtocol { ... }
impl From<SetRequest> for StoreProtocol { ... }
// WithChannels → Message enum
impl From<WithChannels<GetRequest, StoreProtocol>> for StoreMessage { ... }
impl From<WithChannels<SetRequest, StoreProtocol>> for StoreMessage { ... }
```
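The elided `{ ... }` bodies above are mechanical. The following is a minimal sketch of what they plausibly expand to, not the macro's verbatim output. To compile without irpc it uses a simplified, hypothetical stand-in for `WithChannels` (the real type also carries `tx`, `rx`, and optionally a span), and the `new` constructor is an assumption made for this example only.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for irpc's `WithChannels`; channels and span omitted.
#[derive(Debug)]
pub struct WithChannels<I, S> {
    pub inner: I,
    _service: PhantomData<S>,
}

impl<I, S> WithChannels<I, S> {
    pub fn new(inner: I) -> Self {
        Self { inner, _service: PhantomData }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct GetRequest(pub String);

#[derive(Debug, PartialEq)]
pub struct SetRequest {
    pub key: String,
    pub value: String,
}

#[derive(Debug, PartialEq)]
pub enum StoreProtocol {
    Get(GetRequest),
    Set(SetRequest),
}

#[derive(Debug)]
pub enum StoreMessage {
    Get(WithChannels<GetRequest, StoreProtocol>),
    Set(WithChannels<SetRequest, StoreProtocol>),
}

// Inner type → protocol enum: wrap the request in its own variant.
impl From<GetRequest> for StoreProtocol {
    fn from(value: GetRequest) -> Self {
        StoreProtocol::Get(value)
    }
}
impl From<SetRequest> for StoreProtocol {
    fn from(value: SetRequest) -> Self {
        StoreProtocol::Set(value)
    }
}

// WithChannels → message enum: same idea, one level up.
impl From<WithChannels<GetRequest, StoreProtocol>> for StoreMessage {
    fn from(value: WithChannels<GetRequest, StoreProtocol>) -> Self {
        StoreMessage::Get(value)
    }
}
impl From<WithChannels<SetRequest, StoreProtocol>> for StoreMessage {
    fn from(value: WithChannels<SetRequest, StoreProtocol>) -> Self {
        StoreMessage::Set(value)
    }
}
```

With these impls a request type can be passed wherever `impl Into<StoreProtocol>` is expected, which is what lets `Client` methods accept the inner request structs directly.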
### 7. parent_span Method (spans feature)
```rust
impl StoreMessage {
    pub fn parent_span(&self) -> tracing::Span {
        let span = match self {
            StoreMessage::Get(inner) => inner.parent_span_opt(),
            StoreMessage::Set(inner) => inner.parent_span_opt(),
        };
        span.cloned().unwrap_or_else(|| tracing::Span::current())
    }
}
```
## Interaction Pattern Mapping
The `#[rpc]` attribute maps directly to gRPC-like patterns:
| Pattern | `tx` type | `rx` type | Example |
|---|---|---|---|
| **Unary RPC** | `oneshot::Sender<R>` | `NoReceiver` | Get by key, return value |
| **Server streaming** | `mpsc::Sender<R>` | `NoReceiver` | List all items |
| **Client streaming** | `oneshot::Sender<R>` | `mpsc::Receiver<U>` | Upload items, get count |
| **Bidirectional** | `mpsc::Sender<R>` | `mpsc::Receiver<U>` | Chat, live updates |
| **Notify (fire & forget)** | `NoSender` | `NoReceiver` | Log event |
## Client Methods Generated by Patterns
The `Client<S>` methods correspond to channel types:
```rust
// Unary RPC: tx=oneshot::Sender<Res>, rx=NoReceiver
client.rpc(Get { key: "x".into() }).await // → Result<Res>
// Server streaming: tx=mpsc::Sender<Res>, rx=NoReceiver
client.server_streaming(List, 16).await // → Result<mpsc::Receiver<Res>>
// Client streaming: tx=oneshot::Sender<Res>, rx=mpsc::Receiver<Update>
client.client_streaming(SetMany, 4).await // → Result<(mpsc::Sender<Update>, oneshot::Receiver<Res>)>
// Bidirectional: tx=mpsc::Sender<Res>, rx=mpsc::Receiver<Update>
client.bidi_streaming(Sum, 4, 4).await // → Result<(mpsc::Sender<Update>, mpsc::Receiver<Res>)>
// Notify: tx=NoSender, rx=NoReceiver
client.notify(Log { msg: "hi".into() }).await // → Result<()>
```
## Manual Protocol Definition (Without Macro)
You can define protocols manually instead of using the macro:
```rust
use irpc::{channel::{mpsc, none::NoReceiver, oneshot}, Channels, Service, WithChannels};
use irpc::rpc::RemoteService; // needs the `rpc` feature; used in step 6
use serde::{Deserialize, Serialize};
// 1. Define request types
#[derive(Debug, Serialize, Deserialize)]
struct Get { key: String }
#[derive(Debug, Serialize, Deserialize)]
struct Set { key: String, value: String }
// 2. Implement Channels for each type
impl Channels<StorageProtocol> for Get {
type Tx = oneshot::Sender<Option<String>>;
type Rx = NoReceiver;
}
impl Channels<StorageProtocol> for Set {
type Tx = oneshot::Sender<()>;
type Rx = NoReceiver;
}
// 3. Define protocol enum
#[derive(derive_more::From, Serialize, Deserialize, Debug)]
enum StorageProtocol {
Get(Get),
Set(Set),
}
// 4. Define message enum
#[derive(derive_more::From)]
enum StorageMessage {
Get(WithChannels<Get, StorageProtocol>),
Set(WithChannels<Set, StorageProtocol>),
}
// 5. Implement Service
impl Service for StorageProtocol {
type Message = StorageMessage;
}
// 6. Implement RemoteService (rpc feature)
impl RemoteService for StorageProtocol {
    fn with_remote_channels(self, rx: noq::RecvStream, tx: noq::SendStream) -> Self::Message {
        match self {
            StorageProtocol::Get(msg) => WithChannels::from((msg, tx, rx)).into(),
            StorageProtocol::Set(msg) => WithChannels::from((msg, tx, rx)).into(),
        }
    }
}
```
This manual approach gives full control but requires more boilerplate. The macro generates all of this automatically.

@@ -0,0 +1,274 @@
# irpc: RPC Module and Remote Transport
The `rpc` module (enabled by the `rpc` feature) contains all cross-process RPC functionality: QUIC stream handling, connection management, serialization, and server-side request processing.
## Module Structure
```rust
pub mod rpc {
pub const MAX_MESSAGE_SIZE: u64 = 1024 * 1024 * 16;
pub const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;
pub const ERROR_CODE_INVALID_POSTCARD: u32 = 2;
pub enum WriteError { Noq, MaxMessageSizeExceeded, Io }
pub trait RemoteConnection: Send + Sync + Debug + 'static { ... }
pub struct RemoteSender<S>(SendStream, RecvStream, PhantomData<S>);
pub type Handler<R> = Arc<dyn Fn(R, RecvStream, SendStream) -> BoxFuture<Result<(), SendError>> + Send + Sync>;
pub trait RemoteService: Service + Sized { ... }
pub async fn listen<R>(endpoint, handler);
pub async fn handle_connection<R>(connection, handler) -> io::Result<()>;
pub async fn read_request<S: RemoteService>(connection) -> io::Result<Option<S::Message>>;
pub async fn read_request_raw<R>(connection) -> io::Result<Option<(R, RecvStream, SendStream)>>;
}
```
## RemoteConnection Implementations
### NoqLazyRemoteConnection
The default remote connection for noq (QUIC-by-socket-address):
```rust
struct NoqLazyRemoteConnection(Arc<NoqLazyRemoteConnectionInner>);
struct NoqLazyRemoteConnectionInner {
endpoint: noq::Endpoint,
addr: SocketAddr,
connection: Mutex<Option<noq::Connection>>,
}
```
**Behavior:**
- `open_bi()`:
  1. Locks the `Mutex<Option<Connection>>`
  2. If a cached connection exists, tries `conn.open_bi()`
  3. If that fails, clears the cache and establishes a new connection
  4. If no cached connection exists, establishes a new one
  5. Returns the `(SendStream, RecvStream)` pair
- `zero_rtt_accepted()`: Always returns `true` (noq has no 0-RTT concept in this context)
- `clone_boxed()`: Clones the `Arc`, sharing the same connection cache
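The `open_bi()` steps above can be sketched synchronously with only the standard library. This is an illustration of the caching logic under stated assumptions, not irpc's implementation: `FakeConnection`, `LazyConnection`, `kill_cached`, and `demo` are hypothetical names, the "stream" is just the id of the connection that served the call, and the real code is async and holds a `noq::Connection`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a QUIC connection.
#[derive(Debug)]
pub struct FakeConnection {
    pub id: u32,
    pub alive: bool,
}

impl FakeConnection {
    /// Returns the connection id in place of a real stream pair.
    fn open_bi(&self) -> Result<u32, &'static str> {
        if self.alive { Ok(self.id) } else { Err("connection closed") }
    }
}

pub struct LazyConnection<F> {
    connect: F,
    cached: Mutex<Option<FakeConnection>>,
}

impl<F: Fn() -> FakeConnection> LazyConnection<F> {
    pub fn new(connect: F) -> Self {
        Self { connect, cached: Mutex::new(None) }
    }

    pub fn open_bi(&self) -> Result<u32, &'static str> {
        // Step 1: lock the cache so concurrent callers share one connection.
        let mut cached = self.cached.lock().unwrap();
        // Step 2: try the cached connection, if there is one.
        let attempt = cached.as_ref().map(|conn| conn.open_bi());
        match attempt {
            Some(Ok(stream)) => return Ok(stream),
            // Step 3: the cached connection is dead, so forget it.
            Some(Err(_)) => *cached = None,
            None => {}
        }
        // Steps 3 and 4: connect afresh and cache the connection on success.
        let conn = (self.connect)();
        let stream = conn.open_bi()?;
        *cached = Some(conn);
        // Step 5: hand the stream back to the caller.
        Ok(stream)
    }

    /// Test hook for this sketch: simulates the peer closing the connection.
    pub fn kill_cached(&self) {
        if let Some(conn) = self.cached.lock().unwrap().as_mut() {
            conn.alive = false;
        }
    }
}

/// Connect, reuse, then reconnect after the cached connection dies.
pub fn demo() -> (u32, u32, u32) {
    let next_id = AtomicU32::new(0);
    let lazy = LazyConnection::new(move || FakeConnection {
        id: next_id.fetch_add(1, Ordering::SeqCst),
        alive: true,
    });
    let first = lazy.open_bi().unwrap(); // no cache yet: connects
    let second = lazy.open_bi().unwrap(); // served by the cached connection
    lazy.kill_cached();
    let third = lazy.open_bi().unwrap(); // cache cleared: reconnects
    (first, second, third)
}
```

Note that the retry happens at most once per call: if the fresh connection also fails, the error is returned to the caller rather than looping.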
### Direct noq::Connection
```rust
impl RemoteConnection for noq::Connection {
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
// Directly opens a bidirectional stream on the connection
}
fn zero_rtt_accepted(&self) -> BoxFuture<bool> { Box::pin(async { true }) }
}
```
## RemoteSender
```rust
pub struct RemoteSender<S>(noq::SendStream, noq::RecvStream, PhantomData<S>);
```
Created by `Client::request()` when the client is remote. Holds both sides of a QUIC bidirectional stream.
### Key Methods
```rust
impl<S: Service> RemoteSender<S> {
pub fn new(send: SendStream, recv: RecvStream) -> Self;
pub async fn write(self, msg: impl Into<S>) -> Result<(SendStream, RecvStream), WriteError> {
let buf = prepare_write(msg)?;
self.write_raw(&buf).await
}
// Internal: writes pre-serialized buffer
pub(crate) async fn write_raw(self, buf: &[u8]) -> Result<(SendStream, RecvStream), WriteError>;
}
```
The `write()` method:
1. Converts `msg` into the protocol enum `S` via `Into`
2. Checks serialized size against `MAX_MESSAGE_SIZE`
3. Serializes the message with postcard and prefixes it with its varint-encoded length
4. Writes to the `SendStream`
5. Returns the stream pair (now usable for response channels)
The `write_raw()` method is used for 0-RTT where the message is pre-serialized to allow re-sending without re-serialization.
### prepare_write
```rust
fn prepare_write<S: Service>(msg: impl Into<S>) -> Result<SmallVec<[u8; 128]>, WriteError> {
    let msg = msg.into();
    if postcard::experimental::serialized_size(&msg)? as u64 > MAX_MESSAGE_SIZE {
        return Err(WriteError::MaxMessageSizeExceeded);
    }
    let mut buf = SmallVec::<[u8; 128]>::new();
    buf.write_length_prefixed(&msg)?;
    Ok(buf)
}
```
Uses `SmallVec<[u8; 128]>` to avoid heap allocation for small messages.
## Stream-to-Channel Conversions
When a QUIC stream pair is received on the server side, it needs to be converted into typed channels. The `From` implementations handle this:
### SendStream → Channel Tx
```rust
// NoSender: drop the stream
impl From<SendStream> for NoSender { ... }
// Oneshot: serialize and send single value, then done
impl<T: RpcMessage> From<SendStream> for oneshot::Sender<T> { ... }
// MPSC: repeatedly serialize and send values
impl<T: RpcMessage> From<SendStream> for mpsc::Sender<T> { ... }
```
### RecvStream → Channel Rx
```rust
// NoReceiver: drop the stream
impl From<RecvStream> for NoReceiver { ... }
// Oneshot: read single length-prefixed value
impl<T: DeserializeOwned> From<RecvStream> for oneshot::Receiver<T> { ... }
// MPSC: repeatedly read length-prefixed values
impl<T: RpcMessage> From<RecvStream> for mpsc::Receiver<T> { ... }
```
## Server-Side Request Processing
### read_request_raw
```rust
pub async fn read_request_raw<R: DeserializeOwned + 'static>(
connection: &noq::Connection,
) -> io::Result<Option<(R, RecvStream, SendStream)>>
```
1. Calls `connection.accept_bi()` to accept an incoming bidirectional stream
2. If the connection was closed with `ApplicationClosed` and error code 0, returns `Ok(None)` (clean shutdown)
3. Reads a varint length prefix from the `RecvStream`
4. Checks against `MAX_MESSAGE_SIZE`
5. Reads `length` bytes from the stream
6. Deserializes with `postcard::from_bytes::<R>()`
7. Returns `(deserialized_message, RecvStream, SendStream)`
### read_request (typed)
```rust
pub async fn read_request<S: RemoteService>(
connection: &noq::Connection,
) -> io::Result<Option<S::Message>>
```
Calls `read_request_raw()` and then applies `S::with_remote_channels()` to convert the raw protocol message + stream pair into a `WithChannels`-wrapped `Message`.
### handle_connection
```rust
pub async fn handle_connection<R: DeserializeOwned + 'static>(
connection: noq::Connection,
handler: Handler<R>,
) -> io::Result<()>
```
Loops:
1. Calls `read_request_raw()` to get the next request
2. If `None`, returns `Ok(())` (connection closed)
3. Invokes `handler(msg, rx, tx)` to process the request
4. Continues until the connection closes or an error occurs
Each connection is handled in a separate task (spawned by `listen()`).
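The termination conditions of this loop can be modelled without QUIC. In the sketch below, `handle_connection_sketch` is a hypothetical name, the `next_request` closure stands in for `read_request_raw()`, and the handler is synchronous and returns `io::Result<()>` instead of a boxed future. It also returns the number of handled requests, which the real function does not.

```rust
use std::io;

/// Drives `next_request` until it reports a clean close (`Ok(None)`) or an
/// error, passing every request to `handler` in arrival order.
pub fn handle_connection_sketch<R>(
    mut next_request: impl FnMut() -> io::Result<Option<R>>,
    mut handler: impl FnMut(R) -> io::Result<()>,
) -> io::Result<usize> {
    let mut handled = 0;
    loop {
        match next_request()? {
            // Clean shutdown: the peer closed the connection.
            None => return Ok(handled),
            // A request arrived: process it, then wait for the next one.
            Some(request) => {
                handler(request)?;
                handled += 1;
            }
        }
    }
}
```

Both a read error and a handler error end the loop early via `?`, matching the "until the connection closes or an error occurs" behaviour described above.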
### listen
```rust
pub async fn listen<R: DeserializeOwned + 'static>(
endpoint: noq::Endpoint,
handler: Handler<R>,
)
```
The top-level server loop:
1. Accepts incoming connections from the `noq::Endpoint`
2. Spawns a task for each connection
3. Each task calls `handle_connection()`
4. Uses a `JoinSet` to manage and clean up completed tasks
## The Handler and Local Forwarding
The typical handler is created by `Protocol::remote_handler(local_sender)`:
```rust
fn remote_handler(local_sender: LocalSender<Self>) -> Handler<Self> {
    Arc::new(move |msg, rx, tx| {
        let msg = Self::with_remote_channels(msg, rx, tx);
        Box::pin(local_sender.send_raw(msg))
    })
}
```
This converts the raw (deserialized protocol message, RecvStream, SendStream) tuple into a typed `WithChannels` message and forwards it to the local actor via the mpsc channel. The local actor can then use the typed channels without knowing whether they're local or remote.
## Full Request Lifecycle (Remote)
```
CLIENT                                      SERVER
  │                                           │
  │  1. Client::request()                     │
  │     → open_bi() on connection             │
  │                                           │
  │  2. RemoteSender::write(protocol_msg)     │
  │     → serialize + send on SendStream ────►│
  │                                           │  3. accept_bi()
  │                                           │  4. read_request_raw()
  │                                           │     → read varint + data
  │                                           │     → deserialize protocol_msg
  │                                           │
  │                                           │  5. RemoteService::with_remote_channels()
  │                                           │     → creates WithChannels
  │                                           │     → SendStream → tx channel
  │                                           │     → RecvStream → rx channel
  │                                           │
  │                                           │  6. handler(msg, rx, tx)
  │                                           │     → local_sender.send_raw(message)
  │                                           │     → message goes to actor
  │                                           │
  │                                           │  7. Actor processes:
  │                                           │     match message {
  │                                           │         Msg::Get(wc) => {
  │                                           │             let res = db.get(wc.inner.key);
  │                                           │             wc.tx.send(res).await;
  │                                           │             // tx.send() writes to SendStream
  │                                           │         }
  │                                           │     }
  │                                           │
  │  8. RecvStream reads response ◄───────────│
  │  9. Deserialize response                  │
  │ 10. Return to caller                      │
```
## 0-RTT Flow
```
CLIENT                                      SERVER
  │                                           │
  │  1. Serialize message into buffer         │
  │     (prepare_write)                       │
  │                                           │
  │  2. Open 0-RTT connection                 │
  │     → write buffer immediately ──────────►│
  │                                           │
  │  3. Check zero_rtt_accepted()             │
  │     → If true: done, read response        │
  │     → If false:                           │
  │        4. Open new (full) connection      │
  │        5. Re-send same buffer ───────────►│
  │                                           │
  │  6. Read response ◄───────────────────────│
```
The key insight: the message buffer is pre-serialized so it can be re-sent without re-serialization if 0-RTT is rejected.
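The serialize-once, send-twice idea can be sketched with a toy transport. Everything here is hypothetical and exists only for illustration: the `Transport` trait, `RecordingTransport`, and `send_with_fallback` are not irpc or iroh APIs, and the real flow is async and operates on QUIC streams.

```rust
/// Hypothetical transport for this sketch.
pub trait Transport {
    /// Writes `buf` as 0-RTT early data and reports whether the server accepted it.
    fn send_0rtt(&mut self, buf: &[u8]) -> bool;
    /// Writes `buf` on a fully established connection.
    fn send_full(&mut self, buf: &[u8]);
}

/// Records every transmission so the behaviour can be inspected.
pub struct RecordingTransport {
    pub accept_0rtt: bool,
    pub sent: Vec<(&'static str, Vec<u8>)>,
}

impl Transport for RecordingTransport {
    fn send_0rtt(&mut self, buf: &[u8]) -> bool {
        self.sent.push(("0rtt", buf.to_vec()));
        self.accept_0rtt
    }

    fn send_full(&mut self, buf: &[u8]) {
        self.sent.push(("full", buf.to_vec()));
    }
}

/// Sends a message with 0-RTT and falls back to a full connection if the
/// early data was rejected. Returns the number of transmissions.
pub fn send_with_fallback<T: Transport>(
    transport: &mut T,
    serialize: impl FnOnce() -> Vec<u8>,
) -> usize {
    // Step 1: serialize exactly once, before any network activity.
    let buf = serialize();
    // Steps 2 and 3: write the buffer as early data, then check acceptance.
    if transport.send_0rtt(&buf) {
        return 1;
    }
    // Steps 4 and 5: early data was rejected, so re-send the very same bytes.
    transport.send_full(&buf);
    2
}
```

Taking the serializer as `FnOnce` makes the "at most one serialization" property visible in the type: the fallback path has no way to call it again.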

@@ -0,0 +1,271 @@
# irpc: irpc-iroh — Iroh Transport Integration
The `irpc-iroh` crate provides transport integration for iroh, enabling irpc to work with iroh's QUIC connections that use endpoint IDs (rather than socket addresses) for routing.
## Crate Overview
```toml
[package]
name = "irpc-iroh"
version = "0.13.0"
description = "Iroh transport for irpc"
```
Dependencies: `iroh`, `irpc`, `tokio`, `tracing`, `serde`, `postcard`, `n0-error`, `n0-future`
## Key Types
### IrohRemoteConnection
```rust
#[derive(Debug, Clone)]
pub struct IrohRemoteConnection(Connection);
```
Wraps an existing iroh `Connection`. Simplest way to use irpc with iroh — create a connection externally and wrap it.
```rust
impl RemoteConnection for IrohRemoteConnection {
fn clone_boxed(&self) -> Box<dyn RemoteConnection> { ... }
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
// Delegates to connection.open_bi()
}
fn zero_rtt_accepted(&self) -> BoxFuture<bool> {
// Always true — fully authenticated connection
}
}
```
**Note:** This stops working when the underlying connection is closed. For automatic reconnection, use `IrohLazyRemoteConnection`.
### IrohZrttRemoteConnection
```rust
#[derive(Debug, Clone)]
pub struct IrohZrttRemoteConnection(OutgoingZeroRttConnection);
```
Wraps an iroh 0-RTT (Zero Round Trip Time) connection. This enables sending data before the full handshake completes for reduced latency on reconnections.
```rust
impl RemoteConnection for IrohZrttRemoteConnection {
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
// Delegates to the 0-RTT connection's open_bi()
}
fn zero_rtt_accepted(&self) -> BoxFuture<bool> {
// Actually checks handshake_completed() to determine
// if 0-RTT data was accepted
}
}
```
The `zero_rtt_accepted()` method:
- Returns `true` if `ZeroRttStatus::Accepted`
- Returns `false` if `ZeroRttStatus::Rejected` or on error
- This allows the `Client` to decide whether to re-send data
### IrohLazyRemoteConnection
```rust
#[derive(Debug, Clone)]
pub struct IrohLazyRemoteConnection(Arc<IrohRemoteConnectionInner>);
struct IrohRemoteConnectionInner {
endpoint: iroh::Endpoint,
addr: iroh::EndpointAddr,
connection: tokio::sync::Mutex<Option<Connection>>,
alpn: Vec<u8>,
}
```
The lazy connection caches the underlying iroh `Connection` and reconnects automatically:
1. On first `open_bi()`, establishes a connection via `endpoint.connect(addr, alpn)`
2. Caches the connection in a `Mutex<Option<Connection>>`
3. On subsequent `open_bi()`, tries to reuse the cached connection
4. If the cached connection fails, clears the cache and reconnects once
The `alpn` field is required because iroh connections need an ALPN protocol identifier.
### `client()` Function
```rust
pub fn client<S: irpc::Service>(
endpoint: iroh::Endpoint,
addr: impl Into<iroh::EndpointAddr>,
alpn: impl AsRef<[u8]>,
) -> irpc::Client<S>
```
Convenience function to create a `Client<S>` using iroh. Creates an `IrohLazyRemoteConnection` and wraps it with `Client::boxed()`.
## Server-Side: IrohProtocol
### IrohProtocol
```rust
pub struct IrohProtocol<R> {
handler: Handler<R>,
request_id: AtomicU64,
}
```
Implements `iroh::protocol::ProtocolHandler`, allowing it to be registered with iroh's `Router`:
```rust
impl<R: DeserializeOwned + Send + 'static> ProtocolHandler for IrohProtocol<R> {
    async fn accept(&self, connection: Connection) -> Result<(), AcceptError> {
        // Handle the connection using irpc's handle_connection
        let handler = self.handler.clone();
        let fut = handle_connection(&connection, handler).map_err(AcceptError::from_err);
        fut.instrument(span).await
    }
}
```
**Usage:**
```rust
let protocol = IrohProtocol::with_sender(local_sender);
// or
let protocol = IrohProtocol::new(handler);
let router = Router::builder(endpoint)
.accept(ALPN, protocol)
.spawn();
```
### Iroh0RttProtocol
```rust
pub struct Iroh0RttProtocol<R> { ... }
```
Supports 0-RTT connections by implementing `ProtocolHandler::on_accepting()`:
```rust
impl<R: DeserializeOwned + Send + 'static> ProtocolHandler for Iroh0RttProtocol<R> {
    async fn on_accepting(&self, accepting: Accepting) -> Result<Connection, AcceptError> {
        let zrtt_conn = accepting.into_0rtt();
        // Handle 0-RTT data immediately
        handle_connection(&zrtt_conn, handler).await?;
        // Wait for handshake completion
        let conn = zrtt_conn.handshake_completed().await?;
        Ok(conn)
    }
    async fn accept(&self, _connection: Connection) -> Result<(), AcceptError> {
        // Noop: handled in on_accepting
        Ok(())
    }
}
```
**Warning:** 0-RTT data is replayable. Only use for idempotent operations. See <https://www.iroh.computer/blog/0rtt-api>.
### IncomingRemoteConnection Trait
```rust
pub trait IncomingRemoteConnection {
fn accept_bi(&self) -> impl Future<Output = Result<(SendStream, RecvStream), ConnectionError>> + Send;
fn close(&self, error_code: VarInt, reason: &[u8]);
fn remote_id(&self) -> Result<EndpointId, RemoteEndpointIdError>;
}
```
Abstraction over `Connection` and `IncomingZeroRttConnection`, enabling `handle_connection` and `read_request` to work with both regular and 0-RTT connections.
Implemented for:
- `Connection` — regular iroh connection
- `IncomingZeroRttConnection` — 0-RTT connection
## handle_connection (iroh variant)
```rust
pub async fn handle_connection<R: DeserializeOwned + 'static>(
connection: &impl IncomingRemoteConnection,
handler: Handler<R>,
) -> io::Result<()>
```
Similar to the noq version but works with iroh's `IncomingRemoteConnection` trait. Records the remote endpoint ID in the tracing span.
## read_request and read_request_raw (iroh variants)
Same logic as the noq versions but using `IncomingRemoteConnection` instead of `noq::Connection`:
```rust
pub async fn read_request<S: RemoteService>(
connection: &impl IncomingRemoteConnection,
) -> io::Result<Option<S::Message>>
pub async fn read_request_raw<R: DeserializeOwned + 'static>(
connection: &impl IncomingRemoteConnection,
) -> io::Result<Option<(R, RecvStream, SendStream)>>
```
## listen (iroh variant)
```rust
pub async fn listen<R: DeserializeOwned + 'static>(endpoint: iroh::Endpoint, handler: Handler<R>)
```
Accepts connections from an iroh `Endpoint` and handles them with the provided handler. Uses `n0_future::task::JoinSet` for task management.
## Example Usage
### Server
```rust
use irpc::{rpc_requests, channel::oneshot, Client, WithChannels};
use irpc_iroh::IrohProtocol;
use iroh::{endpoint::presets, protocol::Router, Endpoint};
use serde::{Deserialize, Serialize};

#[rpc_requests(message = FooMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum FooProtocol {
    #[rpc(tx=oneshot::Sender<String>)]
    Get(String),
}

// `ALPN`, `actor`, and the `Result` alias are assumed to be defined elsewhere.
async fn server() -> Result<()> {
    let (tx, rx) = tokio::sync::mpsc::channel(16);
    tokio::task::spawn(actor(rx));
    let client = Client::<FooProtocol>::local(tx);
    let endpoint = Endpoint::bind(presets::N0).await?;
    let protocol = IrohProtocol::with_sender(client.as_local().unwrap());
    let router = Router::builder(endpoint).accept(ALPN, protocol).spawn();
    // ... keep running
    Ok(())
}
```
### Client
```rust
async fn connect(endpoint_id: EndpointId) -> Result<Client<FooProtocol>> {
let endpoint = Endpoint::bind(presets::N0).await?;
let client = irpc_iroh::client(endpoint, endpoint_id, ALPN);
Ok(client)
}
// Or with direct connection:
async fn connect_direct(endpoint: Endpoint, addr: EndpointAddr) -> Result<Client<FooProtocol>> {
let conn = endpoint.connect(addr, ALPN).await?;
Ok(Client::boxed(IrohRemoteConnection::new(conn)))
}
```
### 0-RTT Client
```rust
async fn connect_0rtt(endpoint: Endpoint, addr: EndpointAddr) -> Result<Client<EchoProtocol>> {
    let connecting = endpoint.connect_with_opts(addr, ALPN, Default::default()).await?;
    match connecting.into_0rtt() {
        Ok(conn) => Ok(Client::boxed(IrohZrttRemoteConnection::new(conn))),
        Err(connecting) => {
            let conn = connecting.await?;
            Ok(Client::boxed(IrohRemoteConnection::new(conn)))
        }
    }
}
```

@@ -0,0 +1,134 @@
# irpc: Serialization and Utility Modules
## Varint Utilities
The `varint-util` module (available with `rpc` or `varint-util` feature) provides LEB128 varint encoding/decoding compatible with postcard's format.
### Async Reading
```rust
pub async fn read_varint_u64<R: AsyncRead + Unpin>(reader: &mut R) -> io::Result<Option<u64>>
```
Reads a LEB128-encoded `u64` from an async reader. Returns `Ok(None)` on `UnexpectedEof` at the first byte position (clean stream end).
**Format:** Each byte carries 7 bits of the value; the most significant bit is a continuation flag. Groups are stored little-endian (least significant group first).
### Sync Writing
```rust
pub fn write_varint_u64_sync<W: io::Write>(writer: &mut W, value: u64) -> io::Result<usize>
```
Writes a `u64` as LEB128 to a synchronous writer.
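The encoding is small enough to show in full. The sketch below is a synchronous, standard-library-only illustration of LEB128 for `u64`, not irpc's code: the function names mirror the ones above for readability, but the reader here works on `std::io::Read` rather than an async reader, and the overflow check is an assumption about reasonable behaviour.

```rust
use std::io::{self, Read, Write};

/// Writes `value` as LEB128 and returns the number of bytes written.
pub fn write_varint_u64<W: Write>(writer: &mut W, mut value: u64) -> io::Result<usize> {
    let mut written = 0;
    loop {
        // Take the lowest 7 bits; set the MSB if more groups follow.
        let mut byte = (value & 0x7f) as u8;
        value >>= 7;
        if value != 0 {
            byte |= 0x80;
        }
        writer.write_all(&[byte])?;
        written += 1;
        if value == 0 {
            return Ok(written);
        }
    }
}

/// Reads a LEB128 `u64`. Returns `Ok(None)` if the stream ends before the
/// first byte, and an error if it ends in the middle of a value.
pub fn read_varint_u64<R: Read>(reader: &mut R) -> io::Result<Option<u64>> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let mut byte = [0u8; 1];
        if reader.read(&mut byte)? == 0 {
            return if shift == 0 {
                Ok(None)
            } else {
                Err(io::ErrorKind::UnexpectedEof.into())
            };
        }
        let low = u64::from(byte[0] & 0x7f);
        // A u64 needs at most 10 groups, and the 10th may only carry one bit.
        if shift > 63 || (shift == 63 && low > 1) {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "varint overflows u64"));
        }
        result |= low << shift;
        if (byte[0] & 0x80) == 0 {
            return Ok(Some(result));
        }
        shift += 7;
    }
}
```

For example, 300 is `0b1_0010_1100`: the low group `0101100` gets the continuation bit and becomes `0xAC`, and the remaining `10` becomes `0x02`.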
### Length-Prefixed Encoding
```rust
// Sync:
pub fn write_length_prefixed<T: Serialize>(write: impl io::Write, value: T) -> io::Result<()>
pub trait WriteVarintExt: io::Write {
    fn write_varint_u64(&mut self, value: u64) -> io::Result<usize>;
    fn write_length_prefixed<T: Serialize>(&mut self, value: T) -> io::Result<()>;
}
// Async:
pub trait AsyncReadVarintExt: AsyncRead + Unpin {
    fn read_varint_u64(&mut self) -> impl Future<Output = io::Result<Option<u64>>>;
    fn read_length_prefixed<T: DeserializeOwned>(&mut self, max_size: usize) -> impl Future<Output = io::Result<T>>;
}
pub trait AsyncWriteVarintExt: AsyncWrite + Unpin {
    fn write_varint_u64(&mut self, value: u64) -> impl Future<Output = io::Result<usize>>;
    fn write_length_prefixed<T: Serialize>(&mut self, value: T) -> impl Future<Output = io::Result<usize>>;
}
```
The length-prefix format is:
```
[varint-encoded-length][postcard-serialized-data]
```
Used internally by irpc for framing all messages on QUIC streams. The `max_size` parameter in `read_length_prefixed` prevents memory exhaustion from malicious length values.
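To make the role of `max_size` concrete, here is a minimal in-memory sketch of the framing. It is an assumption-laden illustration rather than irpc's implementation: `encode_frame`, `decode_frame`, and `FrameError` are hypothetical names, the payload is raw bytes instead of postcard output, and decoding works on a slice instead of a stream. The point is that the length is checked before any payload is read or allocated.

```rust
#[derive(Debug, PartialEq)]
pub enum FrameError {
    /// The input ends before the frame is complete.
    Incomplete,
    /// The declared length exceeds the caller's limit.
    TooLarge { len: u64, max: usize },
    /// The length prefix is not a valid u64 varint.
    BadVarint,
}

/// Builds `[varint length][payload]`.
pub fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 10);
    let mut len = payload.len() as u64;
    loop {
        let byte = (len & 0x7f) as u8;
        len >>= 7;
        if len == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
    out.extend_from_slice(payload);
    out
}

/// Splits one frame off the front of `input`, returning `(payload, rest)`.
pub fn decode_frame(input: &[u8], max_size: usize) -> Result<(&[u8], &[u8]), FrameError> {
    let mut len = 0u64;
    let mut consumed = 0usize;
    loop {
        let byte = *input.get(consumed).ok_or(FrameError::Incomplete)?;
        let shift = 7 * consumed as u32;
        let low = u64::from(byte & 0x7f);
        if shift > 63 || (shift == 63 && low > 1) {
            return Err(FrameError::BadVarint);
        }
        len |= low << shift;
        consumed += 1;
        if (byte & 0x80) == 0 {
            break;
        }
    }
    // Reject oversized frames before touching (or allocating for) the payload.
    if len > max_size as u64 {
        return Err(FrameError::TooLarge { len, max: max_size });
    }
    let rest = &input[consumed..];
    if rest.len() < len as usize {
        return Err(FrameError::Incomplete);
    }
    Ok(rest.split_at(len as usize))
}
```

A forged five-byte prefix claiming roughly 4 GiB is rejected immediately with `TooLarge`, no matter how little data actually follows it.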
## noq Endpoint Setup
The `noq_endpoint_setup` feature provides helpers for creating noq endpoints with TLS configuration:
```rust
pub fn configure_client(server_certs: &[&[u8]]) -> Result<ClientConfig>
pub fn configure_server() -> Result<(ServerConfig, Vec<u8>)>
pub fn configure_client_insecure() -> Result<ClientConfig>
// Non-WASM only:
pub fn make_client_endpoint(bind_addr: SocketAddr, server_certs: &[&[u8]]) -> Result<Endpoint>
pub fn make_insecure_client_endpoint(bind_addr: SocketAddr) -> Result<Endpoint>
pub fn make_server_endpoint(bind_addr: SocketAddr) -> Result<(Endpoint, Vec<u8>)>
```
- `configure_server()`: Creates a self-signed certificate with rcgen and configures the server with TLS 1.3. Returns the DER-encoded certificate for clients to trust.
- `configure_client()`: Configures a client to trust specific DER certificates.
- `configure_client_insecure()`: Skips certificate verification (for testing only).
- Server endpoints set `max_concurrent_uni_streams(0)` to disable unidirectional streams (only bidirectional streams are used).
- Keep-alive interval is set to 1 second on client configs.
## FusedOneshotReceiver
```rust
pub(crate) struct FusedOneshotReceiver<T>(pub tokio::sync::oneshot::Receiver<T>);
```
A wrapper that prevents panics when polling an already-completed oneshot receiver. After the inner receiver resolves, subsequent polls return `Poll::Pending` indefinitely instead of panicking.
This is important because irpc's `oneshot::Receiver` can be wrapped in `Receiver::Boxed` (a `BoxFuture`), and the inner future might be polled multiple times in certain select patterns.
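The fusing behaviour can be reproduced with a generic wrapper. The sketch below is not irpc's `FusedOneshotReceiver`: `Fused`, `NoopWake`, and `poll_twice` are hypothetical names, the wrapper is generic over any `Unpin` future rather than tied to tokio's oneshot receiver, and `std::future::ready` serves as the inner future because it also panics when polled after completion.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Wraps a future so that polling it after completion yields `Pending`
/// forever instead of reaching the (possibly panicking) inner future.
pub struct Fused<F>(Option<F>);

impl<F> Fused<F> {
    pub fn new(future: F) -> Self {
        Self(Some(future))
    }
}

impl<F: Future + Unpin> Future for Fused<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Already completed: never touch the inner future again.
        let Some(inner) = self.0.as_mut() else {
            return Poll::Pending;
        };
        let polled = Pin::new(inner).poll(cx);
        if polled.is_ready() {
            // Drop the inner future so later polls take the branch above.
            self.0 = None;
        }
        polled
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a fused future twice and returns both results.
pub fn poll_twice<F: Future + Unpin>(future: F) -> (Poll<F::Output>, Poll<F::Output>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fused = Fused::new(future);
    let first = Pin::new(&mut fused).poll(&mut cx);
    let second = Pin::new(&mut fused).poll(&mut cx);
    (first, second)
}
```

The second poll returns `Pending` without registering the waker, so a completed wrapper never wakes its task again. That is the intended "inert after completion" behaviour in select-style code.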
## now_or_never
```rust
pub(crate) fn now_or_never<F: Future>(future: F) -> Option<F::Output>
```
Attempts to complete a future immediately without blocking. If the future would block, returns `None`. Used internally by `NoqSenderInner::try_send()` to attempt an immediate write to the QUIC stream without yielding.
Implementation uses a no-op waker to poll the future once.
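A standard-library-only sketch of this technique follows. It shows the general "poll once with a no-op waker" approach; the `NoopWake` type is a hypothetical helper, and irpc's actual implementation may construct its waker differently.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that ignores wake-ups: nobody will poll the future a second time.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` exactly once. Returns its output if it completed
/// immediately, or `None` if it would have needed to wait.
pub fn now_or_never<F: Future>(future: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Pin the future on the stack so it can be polled.
    let mut future = pin!(future);
    match future.as_mut().poll(&mut cx) {
        Poll::Ready(output) => Some(output),
        Poll::Pending => None,
    }
}
```

Because the future is dropped when it returns `Pending`, this only suits operations that are safe to abandon, such as an opportunistic write attempt.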
## Spans Feature
When the `spans` feature is enabled (default), `WithChannels` includes a `span: tracing::Span` field:
```rust
pub struct WithChannels<I: Channels<S>, S: Service> {
pub inner: I,
pub tx: <I as Channels<S>>::Tx,
pub rx: <I as Channels<S>>::Rx,
#[cfg(feature = "spans")]
pub span: tracing::Span,
}
```
The span is captured from `tracing::Span::current()` at the time of `WithChannels` construction (via `From` implementations). This preserves tracing context across async message-passing boundaries.
The `rpc_requests` macro generates a `parent_span()` method on the message enum when `no_spans` is not set:
```rust
impl ComputeMessage {
    pub fn parent_span(&self) -> tracing::Span {
        let span = match self {
            ComputeMessage::Multiply(inner) => inner.parent_span_opt(),
            ComputeMessage::Sum(inner) => inner.parent_span_opt(),
        };
        span.cloned().unwrap_or_else(|| tracing::Span::current())
    }
}
```
This allows server-side handlers to enter the client's tracing span:
```rust
async fn handle(msg: ComputeMessage) {
let _entered = msg.parent_span().enter();
// ... processing happens in the client's tracing context
}
```
When `no_spans` is set in the macro, no span-related code is generated, making it compatible with builds that don't have the `spans` feature enabled.

@@ -0,0 +1,249 @@
# irpc: Design Patterns and Usage Examples
## Pattern 1: Actor Model (Most Common)
The primary usage pattern is an actor that receives messages and processes them sequentially:
```rust
struct StorageActor {
    recv: tokio::sync::mpsc::Receiver<StorageMessage>,
    state: BTreeMap<String, String>,
}
impl StorageActor {
    pub fn spawn() -> StorageApi {
        let (tx, rx) = tokio::sync::mpsc::channel(16);
        let actor = Self { recv: rx, state: BTreeMap::new() };
        tokio::task::spawn(actor.run());
        StorageApi { inner: Client::local(tx) }
    }
    async fn run(mut self) {
        while let Some(msg) = self.recv.recv().await {
            self.handle(msg).await;
        }
    }
    async fn handle(&mut self, msg: StorageMessage) {
        match msg {
            StorageMessage::Get(wc) => {
                let WithChannels { inner, tx, .. } = wc;
                tx.send(self.state.get(&inner.key).cloned()).await.ok();
            }
            StorageMessage::Set(wc) => {
                let WithChannels { inner, tx, .. } = wc;
                self.state.insert(inner.key, inner.value);
                tx.send(()).await.ok();
            }
        }
    }
}
```
**Key points:**
- The actor owns state and processes messages sequentially
- `Client::local(tx)` wraps the sender side of the mpsc channel
- `WithChannels` destructuring gives access to `inner` (the request data), `tx` (response channel), and `rx` (update channel)
- The `..` pattern skips the fields the handler does not need: `rx` (which is `NoReceiver` here) and, with the `spans` feature, `span`
## Pattern 2: Concurrent Task Per Request
For long-running or independent requests, spawn a task per message:
```rust
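async fn run(mut self) {
    // `recv` is the actor's `tokio::sync::mpsc::Receiver` (see Pattern 1),
    // whose `recv()` yields `Option<T>`.
    while let Some(msg) = self.recv.recv().await {
        tokio::task::spawn(async move {
            if let Err(cause) = Self::handle(msg).await {
                eprintln!("Error: {cause}");
            }
        });
    }
}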
```
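This is useful for slow or I/O-bound requests that shouldn't hold up other requests. Truly CPU-bound work is usually better moved to `tokio::task::spawn_blocking`, since a task that never yields still occupies a runtime worker thread.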
## Pattern 3: Local-Only Usage
irpc can be used without any RPC feature for pure in-process communication:
```rust
// Cargo.toml: default-features = false, features = ["derive"]
#[rpc_requests(message = StorageMessage, no_rpc, no_spans)]
#[derive(Serialize, Deserialize, Debug)]
enum StorageProtocol {
#[rpc(tx=oneshot::Sender<Option<String>>)]
Get(Get),
#[rpc(tx=oneshot::Sender<()>)]
Set(Set),
}
```
The `no_rpc` flag prevents `RemoteService` from being generated, and `no_spans` removes the tracing dependency. This leaves only the local channel mechanism, with minimal dependencies (serde, tokio, tokio-util).
## Pattern 4: API Type Wrapping Client
The recommended pattern is to wrap `Client<S>` in a higher-level API type:
```rust
struct StorageApi {
    inner: Client<StorageProtocol>,
}
impl StorageApi {
    // Local
    pub fn spawn() -> Self {
        let (tx, rx) = tokio::sync::mpsc::channel(16);
        tokio::task::spawn(StorageActor::new(rx).run());
        Self { inner: Client::local(tx) }
    }
    // Remote (noq)
    pub fn connect(endpoint: noq::Endpoint, addr: SocketAddr) -> Self {
        Self { inner: Client::noq(endpoint, addr) }
    }
    // Remote (iroh)
    pub fn connect_iroh(endpoint: iroh::Endpoint, addr: EndpointAddr) -> Self {
        Self { inner: irpc_iroh::client(endpoint, addr, ALPN) }
    }
    // Type-safe methods that work for both local and remote
    pub async fn get(&self, key: String) -> irpc::Result<Option<String>> {
        self.inner.rpc(Get { key }).await
    }
    pub async fn set(&self, key: String, value: String) -> irpc::Result<()> {
        self.inner.rpc(Set { key, value }).await
    }
    pub async fn list(&self) -> irpc::Result<mpsc::Receiver<String>> {
        self.inner.server_streaming(List, 16).await
    }
}
```
This encapsulates the protocol details and provides a clean, type-safe API. The same `StorageApi` works identically whether connected locally or remotely.
## Pattern 5: Server Setup
### With noq
```rust
fn serve(api: &StorageApi, endpoint: noq::Endpoint) -> Result<JoinHandle<()>> {
let local = api.inner.as_local().context("cannot listen on remote service")?;
let handler = StorageProtocol::remote_handler(local);
Ok(tokio::task::spawn(irpc::rpc::listen(endpoint, handler)))
}
```
### With iroh
```rust
fn serve(api: &StorageApi, endpoint: iroh::Endpoint) -> Result<Router> {
let local = api.inner.as_local().context("cannot listen on remote service")?;
let protocol = IrohProtocol::with_sender(local);
Ok(Router::builder(endpoint).accept(ALPN, protocol).spawn())
}
```
## Pattern 6: Low-Level Request Handling
For more control than the `Client` methods provide, use `request()` directly:
```rust
async fn custom_request(&self, msg: Get) -> anyhow::Result<oneshot::Receiver<Option<String>>> {
    match self.inner.request().await? {
        Request::Local(request) => {
            let (tx, rx) = oneshot::channel();
            request.send((msg, tx)).await?;
            Ok(rx)
        }
        Request::Remote(request) => {
            let (_tx, rx) = request.write(msg).await?;
            Ok(rx.into())
        }
    }
}
```
This allows custom channel creation logic, e.g., different buffer sizes for local vs remote.
## Pattern 7: Channel Filtering and Mapping
irpc channels support filtering and mapping, which work for both local and remote channels:
```rust
// Server-side: filter responses to only include values > 10
let filtered_tx = wc.tx.with_filter(|v: &i64| *v > 10);
// Server-side: transform responses
let mapped_tx = wc.tx.with_map(|v: i64| v * 2);
// Client-side: filter received updates
let filtered_rx = rx.filter(|update: &Update| update.is_relevant());
```
Filtering and mapping wrap the channel in a boxed adapter for both local and remote channels. For remote channels the boxing overhead is negligible because network latency dominates; for local channels it is an added cost on every message.
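To make the boxing cost concrete, here is a minimal std-only sketch of how a filter and a map adapter can be layered over a sender. It is not irpc's implementation: the `DynSender` trait, the `Filtered` and `Mapped` structs, and the free functions `with_filter` and `with_map` are hypothetical names chosen for illustration, and `std::sync::mpsc` stands in for the irpc channel types.

```rust
use std::sync::mpsc;

/// Hypothetical object-safe sender trait (an assumption, not an irpc type).
/// Returns false once the receiving side is gone.
trait DynSender<T> {
    fn send(&self, value: T) -> bool;
}

impl<T> DynSender<T> for mpsc::Sender<T> {
    fn send(&self, value: T) -> bool {
        // Calls the inherent `Sender::send`, not this trait method.
        mpsc::Sender::send(self, value).is_ok()
    }
}

/// Forwards only the values for which `keep` returns true.
struct Filtered<T> {
    inner: Box<dyn DynSender<T>>,
    keep: Box<dyn Fn(&T) -> bool>,
}

impl<T> DynSender<T> for Filtered<T> {
    fn send(&self, value: T) -> bool {
        if (self.keep)(&value) {
            self.inner.send(value)
        } else {
            true // silently skipped; the channel is still usable
        }
    }
}

/// Converts each `U` into a `T` before forwarding it.
struct Mapped<U, T> {
    inner: Box<dyn DynSender<T>>,
    f: Box<dyn Fn(U) -> T>,
}

impl<U, T> DynSender<U> for Mapped<U, T> {
    fn send(&self, value: U) -> bool {
        self.inner.send((self.f)(value))
    }
}

fn with_filter<T: 'static>(
    inner: Box<dyn DynSender<T>>,
    keep: impl Fn(&T) -> bool + 'static,
) -> Box<dyn DynSender<T>> {
    Box::new(Filtered { inner, keep: Box::new(keep) })
}

fn with_map<U: 'static, T: 'static>(
    inner: Box<dyn DynSender<T>>,
    f: impl Fn(U) -> T + 'static,
) -> Box<dyn DynSender<U>> {
    Box::new(Mapped { inner, f: Box::new(f) })
}

/// Doubles every input, then forwards only results greater than 10.
fn filter_and_map_demo() -> Vec<i64> {
    let (tx, rx) = mpsc::channel::<i64>();
    let base: Box<dyn DynSender<i64>> = Box::new(tx);
    let filtered = with_filter(base, |v: &i64| *v > 10);
    let sender = with_map(filtered, |v: i64| v * 2);
    for v in [1, 5, 6, 20] {
        sender.send(v);
    }
    drop(sender); // closes the channel so the iterator below terminates
    rx.iter().collect()
}
```

Each adapter adds one heap allocation at construction and one dynamic call per message, which is the kind of overhead the paragraph above describes for local channels.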
## Pattern 8: Using the `wrap` Attribute
The `#[wrap]` attribute generates named structs from variant fields:
```rust
#[rpc_requests(message = StoreMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum StoreProtocol {
    // Generates: pub struct GetRequest(pub String);
    #[rpc(tx=oneshot::Sender<Option<String>>)]
    #[wrap(GetRequest, derive(Clone))]
    Get(String),

    // Generates: pub struct SetRequest { pub key: String, pub value: String }
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(SetRequest)]
    Set { key: String, value: String },
}
```
Benefits:
- Named request types can be imported and constructed by name
- Additional derives (e.g., `Clone`) can be added
- Custom visibility can be specified: `#[wrap(pub(crate) GetRequest)]`
- The generated struct inherits the enum's visibility by default
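As a rough mental model, the following hand-written code approximates the shape that `#[wrap]` produces for the enum above. The two struct definitions follow the "Generates" comments in the example; everything else is an assumption made for illustration, namely that each variant ends up holding its generated struct and that `From` conversions into the enum exist. The serde derives are left out so the sketch compiles with the standard library alone, and `describe` is a hypothetical helper.

```rust
/// Mirrors `#[wrap(GetRequest, derive(Clone))]` on `Get(String)`.
#[derive(Debug, Clone)]
pub struct GetRequest(pub String);

/// Mirrors `#[wrap(SetRequest)]` on `Set { key, value }`.
#[derive(Debug)]
pub struct SetRequest {
    pub key: String,
    pub value: String,
}

/// Assumed post-expansion shape: each variant wraps its named struct.
#[derive(Debug)]
pub enum StoreProtocol {
    Get(GetRequest),
    Set(SetRequest),
}

// Assumed conversions, so a named request can be passed wherever the
// protocol enum is expected.
impl From<GetRequest> for StoreProtocol {
    fn from(req: GetRequest) -> Self {
        StoreProtocol::Get(req)
    }
}

impl From<SetRequest> for StoreProtocol {
    fn from(req: SetRequest) -> Self {
        StoreProtocol::Set(req)
    }
}

/// Hypothetical helper showing that callers can build requests by name.
pub fn describe(req: impl Into<StoreProtocol>) -> String {
    match req.into() {
        StoreProtocol::Get(GetRequest(key)) => format!("get {}", key),
        StoreProtocol::Set(SetRequest { key, value }) => format!("set {}={}", key, value),
    }
}
```

With named structs, a call site reads `describe(GetRequest("a".to_string()))` instead of constructing an enum variant directly, and the extra `Clone` derive applies only to `GetRequest`.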
## Pattern 9: 0-RTT Connections
For reduced latency on reconnections with iroh:
```rust
// Client side
let result = client.rpc_0rtt(Get { key: "x".into() }).await?;
// Server side (iroh)
let protocol = Iroh0RttProtocol::with_sender(local_sender);
let router = Router::builder(endpoint).accept(ALPN, protocol).spawn();
```
**Important:** Only use 0-RTT for idempotent operations, as the data may be replayed by an attacker.
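The idempotency requirement can be illustrated without any networking. The sketch below uses a hypothetical `Request` enum (not part of irpc) and applies the same request several times, as a replaying attacker could: the final state is unchanged for `Set` but drifts for `Increment`.

```rust
use std::collections::HashMap;

/// Hypothetical requests for illustration only.
#[derive(Clone, Debug)]
enum Request {
    /// Idempotent: applying it twice leaves the same state as applying it once.
    Set { key: String, value: i64 },
    /// Not idempotent: every replay changes the state again.
    Increment { key: String, by: i64 },
}

fn apply(db: &mut HashMap<String, i64>, req: &Request) {
    match req {
        Request::Set { key, value } => {
            db.insert(key.clone(), *value);
        }
        Request::Increment { key, by } => {
            *db.entry(key.clone()).or_insert(0) += *by;
        }
    }
}

/// Applies `req` once plus `replays` additional times to an empty store
/// and returns the final value stored under `key`.
fn value_after_replays(req: Request, key: &str, replays: usize) -> i64 {
    let mut db = HashMap::new();
    for _ in 0..=replays {
        apply(&mut db, &req);
    }
    db.get(key).copied().unwrap_or(0)
}
```

A request like `Get` or `Set` is therefore a reasonable candidate for `rpc_0rtt`, while a counter increment or a payment is not.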
## Pattern 10: Shared State in Actor
For actors that need shared state accessible from multiple handlers:
```rust
struct Actor {
    recv: tokio::sync::mpsc::Receiver<Message>,
    state: Arc<Mutex<SharedState>>,
}
```
Alternatively, let the actor own its state and mutate it directly:
```rust
struct Actor {
    recv: tokio::sync::mpsc::Receiver<Message>,
    db: HashMap<String, String>, // owned state
}
```
In the second variant the actor owns `db` and processes messages sequentially, so no internal synchronization is needed. The `Arc<Mutex<...>>` form is only required when the state must also be reachable from outside the actor's message loop.
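The owned-state variant can be sketched end to end with the standard library alone. This is a simplified model, not irpc code: `std::sync::mpsc` and a thread stand in for tokio channels and tasks, and the `Message` enum, `spawn_actor`, `set`, and `get` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical message type; each request carries its own reply channel.
enum Message {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Set { key: String, value: String, reply: mpsc::Sender<()> },
}

struct Actor {
    recv: mpsc::Receiver<Message>,
    db: HashMap<String, String>, // owned state, no Mutex
}

impl Actor {
    /// Handles one message at a time until every sender has been dropped.
    fn run(mut self) {
        while let Ok(msg) = self.recv.recv() {
            match msg {
                Message::Get { key, reply } => {
                    let _ = reply.send(self.db.get(&key).cloned());
                }
                Message::Set { key, value, reply } => {
                    self.db.insert(key, value);
                    let _ = reply.send(());
                }
            }
        }
    }
}

fn spawn_actor() -> mpsc::Sender<Message> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let actor = Actor { recv: rx, db: HashMap::new() };
        actor.run();
    });
    tx
}

fn set(actor: &mpsc::Sender<Message>, key: &str, value: &str) {
    let (reply, done) = mpsc::channel();
    actor
        .send(Message::Set { key: key.to_string(), value: value.to_string(), reply })
        .expect("actor stopped");
    done.recv().expect("actor dropped the reply channel");
}

fn get(actor: &mpsc::Sender<Message>, key: &str) -> Option<String> {
    let (reply, done) = mpsc::channel();
    actor
        .send(Message::Get { key: key.to_string(), reply })
        .expect("actor stopped");
    done.recv().expect("actor dropped the reply channel")
}
```

Because only the actor thread ever touches `db`, concurrent callers are serialized by the channel itself, which is why no lock appears anywhere in the sketch.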

@@ -0,0 +1,230 @@
# irpc: Quick Reference
## Crate Info
- **Name:** `irpc`
- **Version:** 0.13.0
- **License:** Apache-2.0 OR MIT
- **Repository:** https://github.com/n0-computer/irpc
- **MSRV:** 1.89
## Feature Flags
| Feature | Default | Dependencies Added |
|---|---|---|
| `rpc` | ✅ | noq, postcard, smallvec, tracing, tokio/io-util |
| `derive` | ✅ | irpc-derive |
| `spans` | ✅ | tracing |
| `stream` | ✅ | futures-util |
| `noq_endpoint_setup` | ✅ | rustls, rcgen, futures-buffered |
| `varint-util` | ❌ | postcard, smallvec, tokio/io-util |
## Type Quick Reference
### Core Types
```
Service trait — implemented on protocol enum, defines Message type
Channels<S> trait — implemented on request types, defines Tx/Rx types
RpcMessage trait — blanket impl for Debug+Serialize+DeserializeOwned+Send+Sync+Unpin+'static
Sender trait — sealed marker for sender types
Receiver trait — sealed marker for receiver types
WithChannels<I,S> struct — wraps request I with tx/rx/span for service S
Client<S> struct — client to service S (local or remote)
LocalSender<S> struct — local sender wrapping mpsc::Sender<S::Message>
Request<L,R> enum — Local(L) or Remote(R) request
RemoteSender<S> struct — holds QUIC stream pair for sending initial message
```
### Channel Types
```
oneshot::Sender<T> — Tokio or Boxed; single value; async send
oneshot::Receiver<T> — Tokio or Boxed; single value; Future impl
mpsc::Sender<T> — Tokio or Arc<DynSender>; stream; async send/try_send
mpsc::Receiver<T> — Tokio or Box<DynReceiver>; stream; async recv
NoSender — No-op sender
NoReceiver — No-op receiver
```
### Remote Types (rpc feature)
```
RemoteConnection trait — open_bi(), zero_rtt_accepted(), clone_boxed()
NoqLazyRemoteConnection — lazy noq connection with cache
Handler<R> type — Arc<dyn Fn(R, RecvStream, SendStream) -> ...>
```
### irpc-iroh Types
```
IrohRemoteConnection — wraps iroh::Connection
IrohZrttRemoteConnection — wraps iroh::OutgoingZeroRttConnection
IrohLazyRemoteConnection — lazy iroh connection with cache
IrohProtocol<R> — ProtocolHandler for iroh Router
Iroh0RttProtocol<R> — ProtocolHandler with 0-RTT support
IncomingRemoteConnection trait — abstraction over Connection and ZeroRttConnection
```
## Interaction Patterns Cheatsheet
```rust
// ═══════════════════════════════════════════
// Protocol Definition
// ═══════════════════════════════════════════
#[rpc_requests(message = MyMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum MyProtocol {
    // Unary RPC
    #[rpc(tx=oneshot::Sender<Response>)]
    #[wrap(GetReq)]
    Get(String),
    // Server streaming
    #[rpc(tx=mpsc::Sender<Item>)]
    #[wrap(ListReq)]
    List(ListParams),
    // Client streaming
    #[rpc(tx=oneshot::Sender<Count>, rx=mpsc::Receiver<Item>)]
    #[wrap(UploadReq)]
    Upload,
    // Bidirectional streaming
    #[rpc(tx=mpsc::Sender<Result>, rx=mpsc::Receiver<Update>)]
    #[wrap(ProcessReq)]
    Process(ProcessConfig),
    // Fire and forget
    #[rpc]
    #[wrap(LogReq)]
    Log(String),
}

// ═══════════════════════════════════════════
// Client Usage
// ═══════════════════════════════════════════
// Local
let (tx, rx) = tokio::sync::mpsc::channel(16);
tokio::task::spawn(actor(rx));
let client: Client<MyProtocol> = Client::local(tx);
// Remote (noq)
let client: Client<MyProtocol> = Client::noq(endpoint, addr);
// Remote (iroh)
let client: Client<MyProtocol> = irpc_iroh::client(endpoint, addr, alpn);

// ═══════════════════════════════════════════
// Making Requests
// ═══════════════════════════════════════════
// Unary
let result: Response = client.rpc(GetReq("key".into())).await?;
// Server streaming
let mut rx: mpsc::Receiver<Item> = client.server_streaming(ListReq(params), 16).await?;
while let Some(item) = rx.recv().await? { ... }
// Client streaming
let (update_tx, response_rx): (mpsc::Sender<Item>, oneshot::Receiver<Count>) =
    client.client_streaming(Upload, 4).await?;
update_tx.send(item).await?;
let count = response_rx.await?;
// Bidirectional
let (update_tx, mut result_rx): (mpsc::Sender<Update>, mpsc::Receiver<Result>) =
    client.bidi_streaming(ProcessReq(config), 4, 16).await?;
update_tx.send(update).await?;
while let Some(result) = result_rx.recv().await? { ... }
// Fire and forget
client.notify(LogReq("message".into())).await?;

// ═══════════════════════════════════════════
// Server Setup
// ═══════════════════════════════════════════
// noq
let handler = MyProtocol::remote_handler(local_sender);
irpc::rpc::listen(endpoint, handler).await;
// iroh
let protocol = IrohProtocol::with_sender(local_sender);
Router::builder(endpoint).accept(ALPN, protocol).spawn();

// ═══════════════════════════════════════════
// Actor Message Handling
// ═══════════════════════════════════════════
async fn handle(&mut self, msg: MyMessage) {
    match msg {
        MyMessage::Get(wc) => {
            let WithChannels { inner, tx, .. } = wc;
            let result = self.db.get(&inner.0).cloned();
            tx.send(result).await.ok();
        }
        MyMessage::List(wc) => {
            let WithChannels { tx, .. } = wc;
            for item in &self.items {
                if tx.send(item.clone()).await.is_err() { break; }
            }
        }
        MyMessage::Upload(wc) => {
            let WithChannels { tx, mut rx, .. } = wc;
            let mut count = 0;
            while let Ok(Some(item)) = rx.recv().await {
                self.process(item);
                count += 1;
            }
            tx.send(count).await.ok();
        }
        MyMessage::Process(wc) => {
            let WithChannels { tx, mut rx, inner, .. } = wc;
            tokio::task::spawn(async move {
                while let Ok(Some(update)) = rx.recv().await {
                    if let Some(result) = process(update, &inner) {
                        if tx.send(result).await.is_err() { break; }
                    }
                }
            });
        }
        MyMessage::Log(wc) => {
            let WithChannels { inner, .. } = wc;
            println!("{}", inner.0);
        }
    }
}
```
## Error Handling Quick Reference
```rust
// Client-side errors
use irpc::{Error, RequestError, Result};

// Request errors (connection/stream open failures)
match client.rpc(GetReq("key".into())).await {
    Ok(result) => { ... }
    Err(Error::Request { source }) => { ... }     // Connection failed
    Err(Error::OneshotRecv { source }) => { ... } // Response channel error
    Err(other) => { ... }                         // Catch-all for any other `Error` variant
}

// Channel errors. Both receive errors are named `RecvError`, so they are
// aliased here to allow importing them side by side.
use irpc::channel::{
    mpsc::RecvError as MpscRecvError,
    oneshot::RecvError as OneshotRecvError,
    SendError,
};
// SendError: ReceiverClosed | MaxMessageSizeExceeded | Io
// OneshotRecvError: SenderClosed | MaxMessageSizeExceeded | Io
// MpscRecvError: MaxMessageSizeExceeded | Io
```
## Constants
```rust
pub const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024; // 16 MiB
pub const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;
pub const ERROR_CODE_INVALID_POSTCARD: u32 = 2;
// Connection close code 0 = clean shutdown
```
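The sketch below shows how a size limit like `MAX_MESSAGE_SIZE` is typically enforced on a length-prefixed stream: the length is decoded first and rejected before any payload is read. It assumes each message is preceded by an unsigned LEB128 varint length, which the `varint-util` feature name suggests but this reference does not spell out. `FrameError`, `read_varint_u64`, and `split_frame` are hypothetical names, not irpc APIs.

```rust
const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024; // 16 MiB, as listed above

#[derive(Debug, PartialEq)]
enum FrameError {
    /// The buffer ended before the prefix or payload was complete.
    Truncated,
    /// The length prefix does not fit in a u64.
    VarintOverflow,
    /// The declared length is larger than MAX_MESSAGE_SIZE.
    MaxMessageSizeExceeded(u64),
}

/// Decodes an unsigned LEB128 varint and returns (value, bytes consumed).
fn read_varint_u64(buf: &[u8]) -> Result<(u64, usize), FrameError> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate() {
        let low = (byte & 0x7f) as u64;
        // A u64 needs at most 10 bytes, and the 10th may carry only one bit.
        if i >= 10 || (i == 9 && low > 1) {
            return Err(FrameError::VarintOverflow);
        }
        value |= low << (7 * i);
        if (byte & 0x80) == 0 {
            return Ok((value, i + 1));
        }
    }
    Err(FrameError::Truncated)
}

/// Splits one frame off the front of `buf`, returning (payload, rest).
/// The size check runs before the payload is touched.
fn split_frame(buf: &[u8]) -> Result<(&[u8], &[u8]), FrameError> {
    let (len, used) = read_varint_u64(buf)?;
    if len > MAX_MESSAGE_SIZE {
        return Err(FrameError::MaxMessageSizeExceeded(len));
    }
    let end = used + len as usize; // cannot overflow: len <= 16 MiB
    let payload = buf.get(used..end).ok_or(FrameError::Truncated)?;
    Ok((payload, &buf[end..]))
}
```

Checking the declared length up front means a peer cannot force a large allocation simply by announcing an oversized message.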