docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs
@@ -0,0 +1,138 @@
# iroh-blobs: Overview and Architecture

**Version**: 0.100.0
**Repository**: https://github.com/n0-computer/iroh-blobs
**License**: MIT OR Apache-2.0
**Rust Edition**: 2021
**MSRV**: 1.89

## What It Is

`iroh-blobs` is a Rust crate for content-addressed blob transfer over QUIC connections, built on top of [iroh](https://docs.rs/iroh). It implements a request-response protocol for streaming BLAKE3-verified data between peers, along with store implementations for persisting blobs locally.

The core value proposition: transfer arbitrary-sized data with **cryptographic integrity guaranteed in-stream**. Every 16 KiB chunk group can be verified against the BLAKE3 hash tree as it arrives, without waiting for the complete transfer.

## Core Concepts

| Concept | Description |
|---------|-------------|
| **Blob** | A sequence of bytes of arbitrary size, identified by its BLAKE3 hash. No metadata. |
| **Link** | A 32-byte BLAKE3 hash of a blob, i.e. its content address. |
| **HashSeq** | A blob whose content is a sequence of BLAKE3 hashes (each 32 bytes). Length must be a multiple of 32. |
| **Provider** | The side serving data. Waits for incoming requests and responds. |
| **Requester** | The side requesting data. Initiates connections and sends requests. |
| **Tag** | A persistent named reference to a `HashAndFormat`, protecting blobs from garbage collection. |
| **TempTag** | An ephemeral in-memory reference that protects content while the process runs. |
| **Chunk** | The fundamental BLAKE3 unit: 1024 bytes. |
| **Chunk Group** | Iroh's grouping of 16 chunks (16 KiB), the minimum granularity for range requests and verification. |

## Architecture Diagram

```
┌─────────────────────────────────────────────────────┐
│                     Application                     │
│                                                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────┐   │
│  │  Blobs   │  │   Tags   │  │    Downloader    │   │
│  │   API    │  │   API    │  │       API        │   │
│  └────┬─────┘  └────┬─────┘  └───────┬──────────┘   │
│       │             │                │              │
│       └─────────────┴────────────────┘              │
│                     │                               │
│             ┌───────┴───────┐                       │
│             │  Store (API)  │  ← Actor-based, RPC   │
│             │   Commands    │    message passing    │
│             └───────┬───────┘                       │
│                     │                               │
│       ┌─────────────┼─────────────┐                 │
│       │             │             │                 │
│ ┌─────┴─────┐  ┌────┴────┐  ┌─────┴─────┐           │
│ │ MemStore  │  │ FsStore │  │ Readonly  │           │
│ │           │  │ (redb + │  │ MemStore  │           │
│ │           │  │   fs)   │  │           │           │
│ └───────────┘  └─────────┘  └───────────┘           │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│                    Network Layer                    │
│                                                     │
│  ┌──────────────────┐  ┌──────────────────────┐     │
│  │  BlobsProtocol   │  │  Remote (Client)     │     │
│  │  (Provider side) │  │  (Requester side)    │     │
│  │                  │  │                      │     │
│  │  handle_conn()   │  │  Remote::fetch()     │     │
│  │  handle_stream() │  │  Remote::local()     │     │
│  └────────┬─────────┘  └──────────┬───────────┘     │
│           │                       │                 │
│           └────── iroh QUIC ──────┘                 │
│              ALPN: /iroh-bytes/4                    │
└─────────────────────────────────────────────────────┘
```

## Module Structure

```
iroh-blobs/src/
├── lib.rs                  # Crate root, re-exports
├── hash.rs                 # Hash, BlobFormat, HashAndFormat
├── hashseq.rs              # HashSeq type
├── format.rs               # Format module (Collection)
│   └── collection.rs       # Collection type with metadata
├── protocol.rs             # Wire protocol types (GetRequest, etc.)
│   └── range_spec.rs       # ChunkRangesSeq, RangeSpec wire encoding
├── net_protocol.rs         # BlobsProtocol (iroh ProtocolHandler)
├── provider.rs             # Server-side request handling
│   └── events.rs           # Event system (connect/disconnect/progress)
├── get.rs                  # Client-side FSM for getting data
│   ├── error.rs            # GetError, GetResult types
│   └── request.rs          # Request execution helpers
├── api/                    # High-level store API
│   ├── blobs.rs            # Blob operations (add, export, read, etc.)
│   │   └── reader.rs       # BlobReader (AsyncRead + AsyncSeek)
│   ├── downloader.rs       # Multi-source download coordinator
│   ├── remote.rs           # Remote peer interaction (fetch, observe)
│   ├── tags.rs             # Tag management API
│   ├── proto.rs            # Store command protocol (RPC messages)
│   └── proto/              # Proto sub-modules
│       └── bitfield.rs     # Bitfield type for chunk tracking
├── store/                  # Storage implementations
│   ├── mod.rs              # IROH_BLOCK_SIZE, GcConfig
│   ├── mem.rs              # MemStore (in-memory, mutable)
│   ├── fs.rs               # FsStore (filesystem + redb hybrid)
│   ├── readonly_mem.rs     # Read-only memory store
│   ├── gc.rs               # Garbage collection
│   ├── util.rs             # Shared utilities (Tag, SparseMemFile, etc.)
│   └── test.rs             # Test utilities
├── ticket.rs               # BlobTicket (shareable connection info)
├── metrics.rs              # Prometheus metrics definitions
└── util/                   # Utilities
    ├── channel.rs          # Channel helpers
    ├── connection_pool.rs  # Connection pooling
    ├── stream.rs           # Stream abstractions
    └── temp_tag.rs         # TempTag, TagCounter, TempTags scope management
```

## Key Dependencies

| Dependency | Purpose |
|------------|---------|
| `bao-tree` | BLAKE3 verified streaming, outboard storage, BaoTree encoding/decoding |
| `iroh` | QUIC networking, endpoint, router |
| `irpc` | RPC framework for store commands |
| `postcard` | Wire serialization (compact, no-schema) |
| `redb` | Embedded key-value database (fs-store feature) |
| `range-collections` | RangeSet2 / ChunkRanges for chunk tracking |
| `bytes` | Efficient byte buffer handling |

## Feature Flags

| Feature | Default | Description |
|---------|---------|-------------|
| `fs-store` | ✅ | Filesystem-based store with redb + file hybrid |
| `rpc` | ✅ | RPC support via `noq` / `irpc` |
| `metrics` | ❌ | Prometheus metrics |
| `hide-proto-docs` | ✅ | Hides protocol docs from rustdocs |

## BLAKE3 Block Size

The crate uses a fixed block size of `IROH_BLOCK_SIZE = BlockSize::from_chunk_log(4)`, which means each chunk group is 2^4 = 16 chunks = 16 × 1024 = 16,384 bytes (16 KiB). This is the minimum granularity for range requests and verification.

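The block-size arithmetic can be checked with a small standalone sketch. It uses only the standard library; `CHUNK_BYTES`, `chunk_group_bytes`, and `chunk_groups` are illustrative names, not part of the iroh-blobs or bao-tree API, and the handling of the empty blob is an assumption.

```rust
/// Size of one BLAKE3 chunk in bytes.
const CHUNK_BYTES: u64 = 1024;

/// Bytes in one chunk group for a given chunk log (hypothetical helper).
/// A chunk log of 4 means 2^4 = 16 chunks per group.
fn chunk_group_bytes(chunk_log: u8) -> u64 {
    CHUNK_BYTES * (1u64 << chunk_log)
}

/// Number of chunk groups needed to cover `size` bytes (hypothetical helper).
/// Assumption: an empty blob still counts as one (empty) group.
fn chunk_groups(size: u64, chunk_log: u8) -> u64 {
    size.div_ceil(chunk_group_bytes(chunk_log)).max(1)
}
```

With a chunk log of 4, a 1 MiB blob spans 64 chunk groups, so a range request against it can address 64 independently verifiable units.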
docs/research/references/iroh/iroh-blobs/02-key-types.md
@@ -0,0 +1,195 @@
# iroh-blobs: Key Types and Data Structures

## Hash

```rust
// src/hash.rs
pub struct Hash(blake3::Hash); // 32-byte BLAKE3 hash, wraps blake3::Hash
```

The fundamental content address. Created via `Hash::new(data)` or `Hash::from_bytes([u8; 32])`. Has a constant `Hash::EMPTY` for the empty blob. Supports hex display and serde (compact binary for non-human-readable formats), and is stored as a 32-byte fixed array in redb.

Wire format: 32 raw bytes (postcard serialization). No framing overhead.

## BlobFormat

```rust
pub enum BlobFormat {
    Raw,     // A single blob
    HashSeq, // A sequence of BLAKE3 hashes
}
```

Distinguishes between a raw binary blob and a hash sequence. Wire format: single byte (0 = Raw, 1 = HashSeq).

## HashAndFormat

```rust
pub struct HashAndFormat {
    pub hash: Hash,
    pub format: BlobFormat,
}
```

Pairs a hash with its format. Wire format: 33 bytes (32 for hash + 1 for format). Display format: hex string, optionally prefixed with 's' for HashSeq.

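The 33-byte layout described above can be illustrated with a std-only sketch. `HashBytes`, `Format`, `encode_hash_and_format`, and `decode_hash_and_format` are hypothetical stand-ins for this example; the real crate relies on postcard and its own types.

```rust
/// Stand-in for the crate's 32-byte hash (illustrative only).
type HashBytes = [u8; 32];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Raw,     // wire byte 0
    HashSeq, // wire byte 1
}

/// Encode as 32 hash bytes followed by one format byte.
fn encode_hash_and_format(hash: &HashBytes, format: Format) -> [u8; 33] {
    let mut out = [0u8; 33];
    out[..32].copy_from_slice(hash);
    out[32] = match format {
        Format::Raw => 0,
        Format::HashSeq => 1,
    };
    out
}

/// Decode the 33-byte layout; unknown format bytes are rejected.
fn decode_hash_and_format(bytes: &[u8; 33]) -> Option<(HashBytes, Format)> {
    let mut hash = [0u8; 32];
    hash.copy_from_slice(&bytes[..32]);
    let format = match bytes[32] {
        0 => Format::Raw,
        1 => Format::HashSeq,
        _ => return None,
    };
    Some((hash, format))
}
```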
## HashSeq

```rust
// src/hashseq.rs
pub struct HashSeq(Bytes); // Wrapper around Bytes, length must be multiple of 32
```

A blob interpreted as a sequence of 32-byte BLAKE3 hashes. Created from `Bytes` via `HashSeq::new(bytes)` (returns `None` if length is not a multiple of 32). Iterable, supports `get(index)`, `pop_front()`.

Used extensively: collections are stored as a HashSeq where the first child is metadata and subsequent children are data blobs.

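A minimal sketch of the length rule, using plain byte slices instead of `Bytes`. The function `parse_hash_seq` is a hypothetical helper written for this example, not the crate's constructor.

```rust
/// Split a byte buffer into 32-byte hashes (hypothetical helper).
/// Returns `None` when the length is not a multiple of 32.
fn parse_hash_seq(bytes: &[u8]) -> Option<Vec<[u8; 32]>> {
    if bytes.len() % 32 != 0 {
        return None;
    }
    let hashes = bytes
        .chunks_exact(32)
        .map(|chunk| {
            let mut hash = [0u8; 32];
            hash.copy_from_slice(chunk);
            hash
        })
        .collect();
    Some(hashes)
}
```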
## Bitfield

```rust
// src/api/proto/bitfield.rs
pub struct Bitfield {
    pub size: u64,           // Total size of the blob in bytes
    pub ranges: ChunkRanges, // Which chunks are verified/present
}
```

Tracks which chunks of a blob are present and verified. Key methods:
- `is_complete()`: all chunks present
- `validated_size()`: how many bytes are verified
- `diff(&other)`: compute the delta between two bitfields

Used by the observe protocol and internal state tracking.

## Tag

```rust
// src/store/util.rs
pub struct Tag(pub Bytes); // Named reference, arbitrary bytes, typically UTF-8
```

A persistent named reference to content in the store. Tags protect content from garbage collection. Auto-generated tags use the format `"auto-2026-01-15T12:34:56.789Z"`. Tags are stored in the store's database and can be listed, created, renamed, and deleted.

## TempTag

```rust
// src/util/temp_tag.rs
pub struct TempTag {
    inner: HashAndFormat,
    on_drop: Option<Weak<dyn TagDrop>>, // Callback when dropped
}
```

An ephemeral, in-memory tag. While a `TempTag` exists, its referenced content is protected from garbage collection. When dropped, the `TagDrop` callback notifies the store to unprotect. Can be `leak()`ed to make the protection permanent for the process lifetime.

Scopes: `TempTagScope` manages groups of temp tags. `Scope::GLOBAL` is the default scope. Batches of operations can create scoped temp tags that are cleaned up together.

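The drop-based protection can be sketched with a plain RAII guard. This is a simplified illustration under stated assumptions: it uses a shared reference-count map behind `Arc<Mutex<..>>` rather than the `Weak<dyn TagDrop>` callback shown above, and `Protected`, `Guard`, and `is_protected` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Reference counts of temporarily protected hashes (stand-in for store state).
type Protected = Arc<Mutex<HashMap<[u8; 32], usize>>>;

/// Hypothetical guard: protects `hash` for as long as it is alive.
struct Guard {
    hash: [u8; 32],
    protected: Protected,
}

impl Guard {
    fn new(hash: [u8; 32], protected: &Protected) -> Self {
        *protected.lock().unwrap().entry(hash).or_insert(0) += 1;
        Guard { hash, protected: Arc::clone(protected) }
    }
}

impl Drop for Guard {
    /// Dropping the guard releases one reference; the last drop unprotects.
    fn drop(&mut self) {
        let mut map = self.protected.lock().unwrap();
        if let Some(count) = map.get_mut(&self.hash) {
            *count -= 1;
            if *count == 0 {
                map.remove(&self.hash);
            }
        }
    }
}

/// True while at least one guard for `hash` is alive (or was leaked).
fn is_protected(hash: &[u8; 32], protected: &Protected) -> bool {
    protected.lock().unwrap().contains_key(hash)
}
```

Calling `std::mem::forget` on such a guard skips `drop`, which is the same effect the text attributes to `leak()`: the protection lasts until the process exits.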
## BlobTicket

```rust
// src/ticket.rs
pub struct BlobTicket {
    addr: EndpointAddr, // How to reach the provider (includes EndpointId, relay URL, direct addresses)
    format: BlobFormat, // Raw or HashSeq
    hash: Hash,         // What to retrieve
}
```

A shareable token containing everything needed to retrieve a blob from a provider. Serialized via `iroh_tickets::Ticket` trait (base32-encoded with "blob" prefix). Wire format uses postcard with a variant discriminator.

```rust
// Creating a ticket
let ticket = BlobTicket::new(addr, hash, BlobFormat::Raw);

// From a ticket string
let ticket: BlobTicket = ticket_str.parse()?;
```

## ChunkRanges and ChunkRangesSeq

### ChunkRanges

```rust
pub type ChunkRanges = RangeSet2<ChunkNum>; // From range_collections crate
```

A set of non-overlapping chunk ranges. Supports boolean operations (union, intersection, difference). The fundamental unit is `ChunkNum` (a u64 newtype representing a 1024-byte BLAKE3 chunk).

Helper trait `ChunkRangesExt` provides:
- `ChunkRanges::all()`: all chunks
- `ChunkRanges::bytes(range)`: byte range rounded up to chunk boundaries
- `ChunkRanges::chunks(range)`: chunk range from u64 bounds
- `ChunkRanges::last_chunk()`: the very last chunk (for size verification)
- `ChunkRanges::chunk(n)`: a single chunk
- `ChunkRanges::offset(n)`: a single byte offset rounded to chunk

### ChunkRangesSeq

```rust
// src/protocol/range_spec.rs
pub struct ChunkRangesSeq(SmallVec<[(u64, ChunkRanges); 2]>);
```

A sequence of `ChunkRanges`, one per blob in a HashSeq. Uses run-length encoding: stores `(offset, ranges)` pairs, where offset is the first blob index with that range spec. Unspecified indices default to the most recent range (or empty for finite sequences).

Key methods:
- `ChunkRangesSeq::all()`: request everything (root + all children, forever)
- `ChunkRangesSeq::root()`: request only the root blob
- `ChunkRangesSeq::empty()`: request nothing
- `ChunkRangesSeq::from_ranges(ranges)`: from explicit iterator
- `ChunkRangesSeq::from_ranges_infinite(ranges)`: last range repeats forever
- `.iter_non_empty_infinite()`: iterate only non-empty ranges
- `.is_blob()`: true if requesting a single blob (offset 0 with one entry)

### RangeSpec (Wire Format)

```rust
pub struct RangeSpec(SmallVec<[u64; 2]>);
```

The on-wire encoding of `ChunkRanges`. The values are span lengths in chunks: the first span is deselected, the second is selected, and so on, with the state reached after the last value extending to the end of the blob. SmallVec avoids allocation for the common case of a single range.

Examples:
- `[]`: empty (nothing selected)
- `[0]`: everything from chunk 0 selected (entire blob)
- `[2, 5, 3, 1]`: chunks 2..7 and 10..11 selected (half-open ranges, i.e. chunks 2 through 6 and chunk 10)
- `[u64::MAX]`: only the last chunk (size proof)

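The alternating-span encoding can be made concrete with a std-only decoder. This is a sketch of the scheme as described in the examples above, not the crate's implementation; `decode_spans` is a hypothetical name, and an open-ended selection is represented here as `None` for the end bound.

```rust
/// Decode alternating deselected/selected span lengths into half-open
/// chunk ranges `(start, end)`. `end == None` means "to the end of the blob".
fn decode_spans(spans: &[u64]) -> Vec<(u64, Option<u64>)> {
    let mut ranges = Vec::new();
    let mut pos: u64 = 0; // current chunk position
    let mut start: u64 = 0; // start of the current selected span
    let mut selected = false; // the first span is deselected
    for &span in spans {
        pos = pos.saturating_add(span);
        if selected {
            // A selected span just ended at `pos`; skip empty spans.
            if pos > start {
                ranges.push((start, Some(pos)));
            }
        } else {
            // A deselected span just ended; selection starts here.
            start = pos;
        }
        selected = !selected;
    }
    if selected {
        // The last toggle left us selected: the range is open-ended.
        ranges.push((start, None));
    }
    ranges
}
```

Under this reading, `[u64::MAX]` selects "everything from chunk `u64::MAX` onward", which in practice can only match the final chunk of a blob; that is how the size proof request works in the example list.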
### ChunkRangesSeq Wire Format

Serialized as `(SmallVec<[(u64, RangeSpec); 2]>)` where each element is `(delta_offset, rangespec)`. The `delta_offset` is the distance from the previous entry. Uses postcard varint encoding for compact transmission.

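The delta-offset idea can be sketched in isolation. The helpers `to_deltas` and `from_deltas` are hypothetical and operate on bare offsets; they assume the absolute offsets are sorted in ascending order and that the first delta is measured from zero.

```rust
/// Convert ascending absolute blob offsets into deltas (hypothetical helper).
/// Precondition: `offsets` is sorted ascending.
fn to_deltas(offsets: &[u64]) -> Vec<u64> {
    let mut prev = 0u64;
    offsets
        .iter()
        .map(|&offset| {
            let delta = offset - prev;
            prev = offset;
            delta
        })
        .collect()
}

/// Rebuild absolute offsets from deltas by keeping a running sum.
fn from_deltas(deltas: &[u64]) -> Vec<u64> {
    let mut acc = 0u64;
    deltas
        .iter()
        .map(|&delta| {
            acc += delta;
            acc
        })
        .collect()
}
```

Small deltas are what make the varint encoding pay off: an entry far into a long sequence still costs a byte or two if it is close to the previous entry.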
## Store Command Protocol

The store API uses an RPC-style command pattern via `irpc`. Each command has a `Command` enum variant with typed request/response channels:

```rust
#[rpc_requests(message = Command, alias = "Msg", rpc_feature = "rpc")]
pub enum Request {
    ListBlobs(ListRequest),
    Batch(BatchRequest),
    DeleteBlobs(BlobDeleteRequest),
    ImportBao(ImportBaoRequest),               // streaming: rx bao items, tx result
    ExportBao(ExportBaoRequest),               // streaming: tx encoded items
    ExportRanges(ExportRangesRequest),         // streaming: tx range data
    Observe(ObserveRequest),                   // streaming: tx bitfield updates
    BlobStatus(BlobStatusRequest),
    ImportBytes(ImportBytesRequest),
    ImportByteStream(ImportByteStreamRequest), // duplex streaming
    ImportPath(ImportPathRequest),
    ExportPath(ExportPathRequest),
    ListTags(ListTagsRequest),
    SetTag(SetTagRequest),
    DeleteTags(DeleteTagsRequest),
    RenameTag(RenameTagRequest),
    CreateTag(CreateTagRequest),
    CreateTempTag(CreateTempTagRequest),
    ListTempTags(ListTempTagsRequest),
    SyncDb(SyncDbRequest),
    WaitIdle(WaitIdleRequest),
    Shutdown(ShutdownRequest),
    ClearProtected(ClearProtectedRequest),
}
```

This allows both local (in-process) and remote (RPC) store access through the same API surface.

docs/research/references/iroh/iroh-blobs/03-transfer-protocol.md
@@ -0,0 +1,249 @@
# iroh-blobs: Transfer Protocol

## Overview

The transfer protocol is a **request-response** protocol operating over QUIC streams (via iroh). The ALPN is `b"/iroh-bytes/4"`.

The requester opens a bidirectional QUIC stream, sends a request, and the provider responds with BLAKE3-verified streaming data on the same stream.

**Key properties**:
- Data integrity is verified in-stream: every 16 KiB chunk group can be independently verified against the BLAKE3 hash tree
- No upper limit on blob or collection size: the streaming design avoids buffering entire transfers
- No extra round trips for multiple small blobs: a single request can cover many blobs (via HashSeq or `GetManyRequest`)
- Range requests supported at chunk granularity

## Request Types

```rust
pub enum Request {
    Get(GetRequest),
    Observe(ObserveRequest),
    Slot2, Slot3, Slot4, Slot5, Slot6, Slot7, // Reserved
    Push(PushRequest),
    GetMany(GetManyRequest),
}
```

Wire format: 1-byte discriminator (postcard-encoded `RequestType` enum), followed by postcard-serialized request body.

### GetRequest

```rust
pub struct GetRequest {
    pub hash: Hash,             // BLAKE3 hash of the root blob
    pub ranges: ChunkRangesSeq, // What ranges to request
}
```

The most common request type. The `ranges` field uses `ChunkRangesSeq` to express which parts of the root blob and its children to request.

**Common patterns**:

```rust
// Request an entire single blob
let req = GetRequest::blob(hash);
// -> ChunkRangesSeq with a single element: all chunks of the root

// Request a HashSeq (root + all children)
let req = GetRequest::all(hash);
// -> ChunkRangesSeq::all() - infinite sequence of "all chunks"

// Request parts of a single blob
let req = GetRequest::builder()
    .root(ChunkRanges::bytes(0..1000))
    .build(hash);

// Request a HashSeq with specific child ranges
let req = GetRequest::builder()
    .root(ChunkRanges::all())             // full root (the hash seq)
    .child(1, ChunkRanges::bytes(0..100)) // partial child 1
    .next(ChunkRanges::all())             // full remaining children
    .build_open(hash);                    // build_open = last range repeats forever
```

### GetManyRequest

```rust
pub struct GetManyRequest {
    pub hashes: Vec<Hash>,      // Sorted, deduplicated list of hashes
    pub ranges: ChunkRangesSeq, // Ranges for each hash (no root entry)
}
```

Like a `GetRequest` for a HashSeq, but the hashes are provided by the requester instead of looked up from the provider. This avoids the provider needing to have a pre-existing HashSeq blob.

```rust
let req = GetManyRequest::builder()
    .hash(hash1, ChunkRanges::all())
    .hash(hash2, ChunkRanges::all())
    .build();
// Deduplicates and sorts hashes automatically
```

### PushRequest

```rust
pub struct PushRequest(GetRequest); // Wraps a GetRequest
```

The inverse of a GetRequest: the requester pushes data to the provider. The request describes what will be sent, followed by the actual data stream. Providers may reject push requests (disabled by default via `EventMask`).

### ObserveRequest

```rust
pub struct ObserveRequest {
    pub hash: Hash,
    pub ranges: RangeSpec, // Which ranges to observe
}
```

Subscribes to availability changes for a blob's bitfield. The provider sends `ObserveItem` updates as chunks become available.

## Response Format

### For Get/GetMany/Push

The response is BLAKE3-verified streaming data (bao-tree format). For each blob in the request:

1. **8-byte size header** (little-endian u64): the total size of the blob
2. **BLAKE3 verified stream**: encoded data for the requested ranges, using bao-tree's mixed encoding:
   - `BaoContentItem::Parent(node, (left_hash, right_hash))`: internal hash tree nodes (64 bytes each)
   - `BaoContentItem::Leaf(Leaf { offset, data })`: actual data chunks

The data is sent in order: ascending chunks for each blob, blobs in HashSeq order.

**Verification**: The requester validates each chunk group against the expected BLAKE3 hash tree. Invalid data is detected within at most 16 KiB of reception. Missing data (provider doesn't have a chunk) causes the provider to close the stream at the point where data becomes unavailable.

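Reading the size header is the first step on the requester side. A minimal std-only sketch, assuming the response bytes are already buffered in memory; `split_size_header` is a hypothetical helper and performs no hash verification.

```rust
/// Split a buffered response into its 8-byte little-endian size header and
/// the remaining bytes (hypothetical helper).
/// Returns `None` if fewer than 8 bytes are available.
fn split_size_header(response: &[u8]) -> Option<(u64, &[u8])> {
    if response.len() < 8 {
        return None;
    }
    let mut header = [0u8; 8];
    header.copy_from_slice(&response[..8]);
    Some((u64::from_le_bytes(header), &response[8..]))
}
```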
### For Observe

The provider sends length-prefixed `ObserveItem` messages:

```rust
pub struct ObserveItem {
    pub size: u64,           // Blob size
    pub ranges: ChunkRanges, // Available chunks
}
```

Updates are sent as deltas: only the new chunks that have become available since the last update.

## Error Handling

Error codes for stream/connection closure:

| Code | Name | Meaning |
|------|------|---------|
| 0 | StreamDropped | RecvStream was dropped |
| 1 | ProviderTerminating | Provider is shutting down |
| 2 | RequestReceived | Only one request per stream allowed |
| 1 (application) | ERR_PERMISSION | Permission denied |
| 2 (application) | ERR_LIMIT | Rate limited |
| 3 (application) | ERR_INTERNAL | Internal error |

## Client-Side FSM (Get)

The `get::fsm` module implements the get request as a **finite state machine** for maximum control:

```
AtInitial
  │ (open QUIC stream)
  ▼
AtConnected
  │ (send request, drop writer)
  ▼
ConnectedNext ─┬─ StartRoot(hash, ranges)    // offset 0 = root blob
               ├─ StartChild(offset, ranges) // offset > 0 = child blob
               └─ Closing                    // empty request
  │
AtStartRoot / AtStartChild
  │ (determine hash for child)
  ▼
AtBlobHeader
  │ (read 8-byte size)
  ▼
AtBlobContent
  │ (stream BLAKE3-verified items)
  ├─ More(content_item) → AtBlobContent // loop
  └─ Done → AtEndBlob
  │
AtEndBlob
  │ (iterate to next blob in sequence)
  ├─ MoreChildren(AtStartChild)
  └─ Closing
       │ (drain remaining bytes)
       ▼
     Stats (transfer statistics)
```

Each state transition is explicit. The FSM gives the consumer full control:
- `AtBlobContent::next()` returns `BlobContentNext::More((content, item))` or `BlobContentNext::Done(end)`
- `AtBlobHeader::next()` reads the size header and creates a `ResponseDecoder`
- `AtStartChild::next(hash)` requires the caller to supply the hash (from the HashSeq)

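The consuming, one-state-at-a-time style can be illustrated with a much smaller typestate sketch. Everything here is a stand-in under stated assumptions: the input is an in-memory buffer holding a size header followed by raw payload bytes, no BLAKE3 verification happens, and `AtHeader`, `AtContent`, `AtEnd`, `ContentNext`, and `read_blob` are hypothetical names that only mirror the shape of the header, content, and end states above.

```rust
/// Positioned at the 8-byte size header.
struct AtHeader<'a> {
    input: &'a [u8],
}

/// Positioned inside the payload, with `remaining` bytes still to read.
struct AtContent<'a> {
    input: &'a [u8],
    remaining: u64,
}

/// Positioned after the last payload byte.
struct AtEnd<'a> {
    rest: &'a [u8],
}

/// Outcome of one content step, mirroring the More/Done split.
enum ContentNext<'a> {
    More(AtContent<'a>, &'a [u8]),
    Done(AtEnd<'a>),
}

impl<'a> AtHeader<'a> {
    /// Consume the header state; fail if the buffer is shorter than declared.
    fn next(self) -> Option<(u64, AtContent<'a>)> {
        if self.input.len() < 8 {
            return None;
        }
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&self.input[..8]);
        let size = u64::from_le_bytes(buf);
        let body = &self.input[8..];
        if (body.len() as u64) < size {
            return None;
        }
        Some((size, AtContent { input: body, remaining: size }))
    }
}

impl<'a> AtContent<'a> {
    /// Yield up to 1024 bytes per step, then transition to the end state.
    fn next(self) -> ContentNext<'a> {
        if self.remaining == 0 {
            return ContentNext::Done(AtEnd { rest: self.input });
        }
        let take = self.remaining.min(1024) as usize;
        let (chunk, tail) = self.input.split_at(take);
        let next = AtContent { input: tail, remaining: self.remaining - take as u64 };
        ContentNext::More(next, chunk)
    }
}

/// Drive the states to completion; returns the payload and the number of
/// unread trailing bytes.
fn read_blob(input: &[u8]) -> Option<(Vec<u8>, usize)> {
    let (size, mut content) = AtHeader { input }.next()?;
    let mut payload = Vec::with_capacity(size as usize);
    loop {
        match content.next() {
            ContentNext::More(next, chunk) => {
                payload.extend_from_slice(chunk);
                content = next;
            }
            ContentNext::Done(end) => return Some((payload, end.rest.len())),
        }
    }
}
```

Because each `next` takes `self` by value, a state cannot be reused after it has advanced, so the compiler rules out reading content before the header.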
### Stats Tracking

```rust
pub struct Stats {
    pub payload_bytes_read: u64,    // Actual data bytes
    pub other_bytes_read: u64,      // Hash pairs, headers
    pub payload_bytes_written: u64, // For push
    pub other_bytes_written: u64,   // For push
    pub elapsed: Duration,
}
```

## Provider-Side Handling

```rust
pub async fn handle_connection(connection: Connection, store: Store, events: EventSender);
```

The provider accepts QUIC streams on a connection. For each stream:
1. Read the request type byte
2. Deserialize the request
3. Dispatch to `handle_get`, `handle_get_many`, `handle_observe`, or `handle_push`
4. For `handle_get`: iterate over the `ChunkRangesSeq`, streaming each blob via `store.export_bao(hash, ranges)`
5. For HashSeq requests: load the root blob, parse it as `HashSeq`, then stream each requested child

### Event System

The provider can emit events for monitoring and access control:

```rust
pub struct EventMask {
    pub connected: ConnectMode, // None, Notify, Intercept
    pub get: RequestMode,       // None, Notify, Intercept, NotifyLog, InterceptLog, Disabled
    pub get_many: RequestMode,
    pub push: RequestMode,      // Disabled by default!
    pub observe: ObserveMode,
    pub throttle: ThrottleMode, // None, Intercept
}
```

- **None**: No events, requests processed normally
- **Notify**: Events sent but cannot block requests
- **Intercept**: Events sent as RPC requests; handler can reject with `AbortReason`
- **Disabled**: All requests of this type rejected

Progress events: `TransferStarted`, `TransferProgress`, `TransferCompleted`, `TransferAborted`.

## Collection Format

```rust
pub struct Collection {
    blobs: Vec<(String, Hash)>, // Named references to child blobs
}
```

Wire format (as a HashSeq blob):
1. First child blob: `CollectionMeta` serialized with postcard
2. Remaining children: the actual data blobs

```rust
pub struct CollectionMeta {
    header: [u8; 13],   // Must be b"CollectionV0."
    names: Vec<String>, // Names for each child blob
}
```

The header `b"CollectionV0."` is a magic number for format identification. The meta blob's hash becomes the first entry in the HashSeq, followed by the hashes of each data blob. Names correspond 1:1 with data blobs (excluding the meta entry).

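The ordering rule (meta hash first, then one hash per named blob) can be sketched without any hashing. `HashBytes` and `collection_layout` are hypothetical; `meta_hash` is assumed to be the hash of the already-serialized meta blob, which this sketch does not compute.

```rust
/// Stand-in for a 32-byte BLAKE3 hash.
type HashBytes = [u8; 32];

/// Split a collection into the names stored in the meta blob and the
/// hash order of the resulting hash sequence (hypothetical helper).
fn collection_layout(
    meta_hash: HashBytes,
    blobs: &[(String, HashBytes)],
) -> (Vec<String>, Vec<HashBytes>) {
    // Names go into the meta blob, in the same order as the data blobs.
    let names = blobs.iter().map(|(name, _)| name.clone()).collect();
    // The hash sequence starts with the meta blob, then the data blobs.
    let mut hashes = Vec::with_capacity(blobs.len() + 1);
    hashes.push(meta_hash);
    hashes.extend(blobs.iter().map(|(_, hash)| *hash));
    (names, hashes)
}
```

The hash sequence is always one entry longer than the name list, so name `i` pairs with hash `i + 1`.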
docs/research/references/iroh/iroh-blobs/04-storage.md
@@ -0,0 +1,250 @@
# iroh-blobs: Storage Architecture

## Overview

iroh-blobs provides three store implementations sharing a common `Store` API surface:

| Store | Location | Mutable | Use Case |
|-------|----------|---------|----------|
| `MemStore` | In-memory | ✅ | Small data, testing, WASM |
| `FsStore` | Filesystem + redb | ✅ | Production, large data |
| `ReadonlyMemStore` | In-memory | ❌ | Static data serving |

All stores implement the same RPC-based command protocol (`Command` enum), allowing both local in-process and remote RPC access through the same `Store` type.

## Store API Surface

The `Store` type (from `api::Store`) is the primary interface. It's accessed via typed sub-APIs:

```rust
let store: Store = /* ... */;

// Blob operations
store.blobs()         // → Blobs API (add, export, read, delete, observe, etc.)
store.tags()          // → Tags API (create, list, set, delete, rename)

// Direct operations
store.add_bytes(data) // → AddProgress
store.add_slice(data) // → TempTag (convenience)
store.get_bytes(hash) // → Result<Bytes>
store.has(hash)       // → bool
store.shutdown()      // Clean shutdown
store.wait_idle()     // Wait for all tasks to complete
store.sync_db()       // Sync database to disk (FsStore)
```

## Blobs API

```rust
let blobs = store.blobs();

// Import
blobs.add_slice(data)                          // → AddProgress (raw format)
blobs.add_bytes(data)                          // → AddProgress (raw format)
blobs.add_bytes_with_opts(AddBytesOptions{..}) // → AddProgress (with format)
blobs.import_byte_stream(format)               // → streaming import

// Export
blobs.reader(hash)                // → BlobReader (AsyncRead + AsyncSeek)
blobs.export(hash, path)          // → export to filesystem
blobs.export_bao(hash, ranges)    // → ExportBao (BLAKE3 verified stream)
blobs.export_ranges(hash, ranges) // → ExportRanges (raw data ranges)

// Observe (subscribe to chunk availability)
blobs.observe(hash) // → ObserveAt (bitfield stream)

// Status
blobs.status(hash) // → BlobStatus (NotFound/Partial/Complete)

// Import BAO-encoded data
blobs.import_bao_bytes(hash, ranges, data)    // → import verified BAO stream
blobs.import_bao_reader(hash, ranges, reader) // → import from async reader

// Batch operations (scoped temp tags)
blobs.batch() // → Batch (auto-cleanup scope)

// Delete
blobs.delete(hashes) // → force delete (use GC normally)
```

## Tags API

```rust
let tags = store.tags();

tags.set(name, value)      // Set a persistent tag
tags.create(value)         // Auto-generate a tag name, return Tag
tags.get(name)             // → Option<TagInfo>
tags.list()                // → Stream<TagInfo>
tags.list_hash_seq()       // → Stream<TagInfo> (only HashSeq format)
tags.delete(name)          // Delete a tag
tags.delete_range(range)   // Delete tags in range
tags.delete_prefix(prefix) // Delete tags with prefix
tags.rename(from, to)      // Atomically rename a tag
tags.temp_tag(value)       // → TempTag (ephemeral protection)
```

## MemStore Architecture

The in-memory store uses a simple actor pattern:

```
MemStore (ApiClient)
  │
  └── Actor (tokio task)
        ├── State
        │     ├── data: HashMap<Hash, BaoFileHandle> // All blob data
        │     ├── tags: BTreeMap<Tag, HashAndFormat> // Persistent tags
        │     └── empty_hash: BaoFileHandle          // Special entry for empty blob
        ├── tasks: JoinSet<TaskResult>               // Spawned import/export tasks
        ├── temp_tags: TempTags                      // Ephemeral protection
        ├── protected: HashSet<Hash>                 // GC-protected hashes
        └── idle_waiters: Vec<oneshot::Sender<()>>   // Wait-idle notifications
```

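The actor pattern itself (one task owns all state, clients talk to it through messages that carry a reply channel) can be shown with the standard library alone. This sketch swaps the tokio task for an OS thread and the hash key for a plain `u64`; `Command`, `spawn_store`, and `get` are hypothetical names and not the crate's command protocol.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Commands carry their own reply channel, in the spirit of RPC-style messages.
enum Command {
    Put { key: u64, data: Vec<u8>, reply: Sender<()> },
    Get { key: u64, reply: Sender<Option<Vec<u8>>> },
    Shutdown,
}

/// Spawn the actor; the map is owned exclusively by the actor thread.
fn spawn_store() -> Sender<Command> {
    let (tx, rx) = channel::<Command>();
    thread::spawn(move || {
        let mut blobs: HashMap<u64, Vec<u8>> = HashMap::new();
        for cmd in rx {
            match cmd {
                Command::Put { key, data, reply } => {
                    blobs.insert(key, data);
                    // Ignore send errors: the client may have gone away.
                    let _ = reply.send(());
                }
                Command::Get { key, reply } => {
                    let _ = reply.send(blobs.get(&key).cloned());
                }
                Command::Shutdown => break,
            }
        }
    });
    tx
}

/// Client-side helper: send a command and wait for the reply.
fn get(store: &Sender<Command>, key: u64) -> Option<Vec<u8>> {
    let (reply, rx) = channel();
    store.send(Command::Get { key, reply }).ok()?;
    rx.recv().ok()?
}
```

Since no lock is shared with clients, every mutation is serialized through the command queue, which is also what lets the same message types be sent over a remote transport.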
### BaoFileHandle / BaoFileStorage

```rust
pub enum BaoFileStorage {
    Partial(PartialMemStorage), // Still downloading
    Complete(CompleteStorage),  // Fully available
}

pub struct PartialMemStorage {
    data: SparseMemFile,     // Sparse byte array for data
    outboard: SparseMemFile, // Sparse byte array for BLAKE3 hash tree
    size: SizeInfo,          // Known/estimated size
    bitfield: Bitfield,      // Which chunks are verified
}

pub struct CompleteStorage {
    data: Bytes,     // Complete data
    outboard: Bytes, // Complete outboard (hash tree)
}
```

The `watch::Sender<BaoFileStorage>` pattern allows subscribers to observe state changes (for the `observe` API).

### Data Flow (Import)

1. `add_bytes(data)` → compute outboard via `PreOrderMemOutboard::create()` → transition `Partial → Complete`
2. `import_bao(hash, size, stream)` → receive `BaoContentItem` stream → write to `PartialMemStorage` → update bitfield → transition to `Complete` when all chunks present

### Data Flow (Export)

1. `export_bao(hash, ranges)` → look up `BaoFileHandle` → `traverse_ranges_validated(data, outboard, &ranges, tx)`, which streams validated BAO data

## FsStore Architecture (Hybrid Store)

The filesystem store uses a **hybrid approach** that stores small data inline in redb and large data as files on disk.

### Design Rationale (from DESIGN.md)

- **Databases** are good for small blobs (low per-entry overhead, fast random access)
- **Filesystems** are good for large blobs (OS-level caching, direct file access)
- **Neither alone** works well for both cases

### Layout

```
<data_dir>/
├── db/                       # redb database
│   ├── metadata table        # Hash → EntryState
│   ├── inline_data table     # Hash → Bytes (for small blobs)
│   ├── inline_outboard table # Hash → Bytes (for small outboards)
│   └── tags table            # Tag → HashAndFormat
├── data/<hash>.data          # Large blob data files
├── data/<hash>.outboard      # Large outboard files
├── data/<hash>.sizes         # Size tracking for partial files
└── data/<hash>.bitfield      # Validated chunk tracking for partial files
```

### EntryState

```rust
// Simplified from src/store/fs/entry_state.rs
pub enum EntryState {
    Complete(CompleteEntryState),
    Partial(PartialEntryState),
}

pub struct CompleteEntryState {
    pub data: DataLocation,         // Inline, Owned (canonical path), or External (user path)
    pub outboard: OutboardLocation, // Inline, Owned, or NotNeeded
    pub size: u64,
}

pub enum DataLocation {
    Inline,                 // Stored in redb inline_data table
    Owned,                  // File at canonical path <hash>.data
    External(Vec<PathBuf>), // User-owned file paths
}

pub enum OutboardLocation {
    Inline,    // Stored in redb inline_outboard table
    Owned,     // File at canonical path <hash>.outboard
    NotNeeded, // Data ≤ 16 KiB, no outboard needed
}

pub struct PartialEntryState {
    // Either we know the verified size, or we don't yet
    pub verified_size: Option<NonZeroU64>,
}
```

### Thresholds

- **Data inline threshold**: 16 KiB (default). Blobs smaller than this are stored entirely in redb.
- **Outboard inline threshold**: 16 KiB (default). Outboards smaller than this are stored in redb.
- Data ≤ 16 KiB has no outboard: a single chunk group is verified directly against the root hash, so there is no hash tree to store.
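
The inlining rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the crate's code: `Placement`, `InlineThresholds`, and `place` are hypothetical names, and treating a size equal to the threshold as inline is an assumption.

```rust
/// Where one part of a completed blob ends up. Hypothetical stand-in for the
/// `DataLocation` / `OutboardLocation` enums described above.
#[derive(Debug, PartialEq, Eq)]
enum Placement {
    Inline,    // stored in a redb table
    File,      // stored as a file under data/
    NotNeeded, // nothing to store
}

/// Hypothetical thresholds; 16 KiB is the default quoted above.
struct InlineThresholds {
    max_data_inlined: u64,
    max_outboard_inlined: u64,
}

impl Default for InlineThresholds {
    fn default() -> Self {
        Self { max_data_inlined: 16 * 1024, max_outboard_inlined: 16 * 1024 }
    }
}

/// Decide the placement of data and outboard from their byte lengths.
/// Assumption: a size equal to the threshold still counts as inline.
fn place(data_len: u64, outboard_len: u64, t: &InlineThresholds) -> (Placement, Placement) {
    let data = if data_len <= t.max_data_inlined { Placement::Inline } else { Placement::File };
    let outboard = if outboard_len == 0 {
        Placement::NotNeeded
    } else if outboard_len <= t.max_outboard_inlined {
        Placement::Inline
    } else {
        Placement::File
    };
    (data, outboard)
}

fn main() {
    let t = InlineThresholds::default();
    // A 10 KiB blob: one chunk group, so the outboard is empty.
    println!("{:?}", place(10 * 1024, 0, &t));
    // A 1 MiB blob with a 4032-byte outboard: data on disk, outboard in redb.
    println!("{:?}", place(1024 * 1024, 4032, &t));
}
```

At 64 bytes per parent node, a 16 KiB outboard covers roughly 4 MiB of data, so mid-sized blobs would keep their data on disk while the hash tree stays in the database.
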

### Blob Lifecycle

**Adding a local file (known data, unknown hash)**:
1. Compute the full BLAKE3 hash and outboard
2. Atomically move the file into the store under the hash name
3. Apply inlining rules: small files → redb, large files → filesystem

**Syncing from remote (known hash, unknown data)**:
1. Start with no data — keep state in memory (not in database)
2. As chunks arrive, write incrementally to partial files
3. Once size is known to exceed the inline threshold, create database entry + filesystem files
4. On completion, transition to `Complete` state and apply inlining rules

**Deletion**:
- Tags protect content from GC
- `TempTag` provides ephemeral (process-lifetime) protection
- HashSeq tags protect the root blob AND all referenced child blobs
- GC is mark-and-sweep: mark all reachable content via tags → sweep (delete) everything else
- Explicit `force` deletion bypasses protection (emergency use only)

### FsStore Actor Architecture

```
FsStore (ApiClient)
    │
    └── MainActor (tokio task)
            ├── TaskContext { config, db_actor_sender }
            ├── EntityMap: HashMap<Hash, ActiveEntityState>  // Currently active entities
            ├── JoinSet<TaskResult>                          // Running tasks
            ├── TempTags                                     // Ephemeral protection
            ├── ProtectedSet                                 // GC protection
            └── idle_waiters
```

The FsStore uses an **entity manager** pattern where each hash gets a `BaoFileHandle` (like MemStore) when active, and entries are cleaned up when tasks complete.

## Garbage Collection

```rust
pub struct GcConfig {
    pub interval: Duration,
    pub add_protected: Option<ProtectCb>,  // Optional callback to add more protected hashes
}
```

GC is a two-phase process:
1. **Mark**: Walk all tags (persistent + temp), collect reachable hashes. For HashSeq format, traverse the hash sequence to find all child hashes.
2. **Sweep**: Delete all blobs not in the reachable set, in batches of 100.

GC runs automatically at a configurable interval via `run_gc(store, config)`, or manually via `gc_run_once(store, live)`.
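
The two phases can be illustrated with a toy in-memory store. This is a minimal sketch under stated assumptions, not the crate's implementation: the `HashMap` store, the `Format` enum, and the `mark` and `sweep` functions are hypothetical, and the 32-byte "hashes" are placeholder values rather than real BLAKE3 digests.

```rust
use std::collections::{HashMap, HashSet};

type Hash = [u8; 32];

#[derive(Clone, Copy)]
enum Format {
    Raw,
    HashSeq,
}

/// Mark phase: collect every hash reachable from the tagged roots.
/// A HashSeq root also protects each 32-byte child hash it contains.
fn mark(store: &HashMap<Hash, Vec<u8>>, roots: &[(Hash, Format)]) -> HashSet<Hash> {
    let mut live: HashSet<Hash> = HashSet::new();
    for (hash, format) in roots {
        live.insert(*hash);
        if let Format::HashSeq = format {
            if let Some(bytes) = store.get(hash) {
                for child in bytes.chunks_exact(32) {
                    let child: Hash = child.try_into().expect("chunk is 32 bytes");
                    live.insert(child);
                }
            }
        }
    }
    live
}

/// Sweep phase: delete everything that was not marked, batch by batch.
/// Returns the number of deleted blobs.
fn sweep(store: &mut HashMap<Hash, Vec<u8>>, live: &HashSet<Hash>, batch_size: usize) -> usize {
    let dead: Vec<Hash> = store.keys().filter(|h| !live.contains(*h)).copied().collect();
    for batch in dead.chunks(batch_size) {
        for hash in batch {
            store.remove(hash);
        }
    }
    dead.len()
}

fn main() {
    let (child_a, child_b, root, pinned, orphan): (Hash, Hash, Hash, Hash, Hash) =
        ([1; 32], [2; 32], [3; 32], [4; 32], [5; 32]);

    let mut store = HashMap::new();
    store.insert(child_a, b"a".to_vec());
    store.insert(child_b, b"b".to_vec());
    store.insert(root, [child_a, child_b].concat()); // the hash sequence itself
    store.insert(pinned, b"tagged raw blob".to_vec());
    store.insert(orphan, b"unreferenced".to_vec());

    let live = mark(&store, &[(root, Format::HashSeq), (pinned, Format::Raw)]);
    let deleted = sweep(&mut store, &live, 100);
    println!("deleted {deleted} blob(s), {} remain", store.len());
}
```
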
@@ -0,0 +1,202 @@
# iroh-blobs: Remote API and Downloader

## Remote API

The `Remote` type (`api::remote::Remote`) provides the client-side interface for interacting with remote iroh-blobs providers. It's a thin wrapper around `ApiClient` that exposes fetch, observe, and push operations.

```rust
let remote = store.remote();  // or Remote::from_sender(client)

// Get local info about what we already have
let local = remote.local(hash_and_format).await?;

// Compute what we need
let missing = local.missing();

// Execute a download
let stats = remote.execute_get(connection, request).await?;

// Or use the simpler fetch API
let progress = remote.fetch(connection, hash, format, store);
```

### LocalInfo

```rust
pub struct LocalInfo {
    pub size: Option<u64>,     // Total size if known
    pub present: ChunkRanges,  // Chunks we already have
    pub missing: ChunkRanges,  // Chunks we still need
    pub hash_and_format: HashAndFormat,
}
```

`LocalInfo` is computed by querying the local store's bitfield for a given hash and comparing it against what a full download would require.
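
The comparison amounts to a range subtraction: everything a full download requires, minus what is already present. The sketch below illustrates that idea with plain `std::ops::Range` values; `ChunkRange` and `missing` are hypothetical names, and the crate's own `ChunkRanges` type is not used here.

```rust
/// Half-open range of chunk indices. Hypothetical stand-in for `ChunkRanges`.
type ChunkRange = std::ops::Range<u64>;

/// Subtract `present` from `required`.
/// Both inputs must be sorted and non-overlapping.
fn missing(required: &[ChunkRange], present: &[ChunkRange]) -> Vec<ChunkRange> {
    let mut out = Vec::new();
    for req in required {
        let mut start = req.start;
        for have in present {
            if have.end <= start {
                continue; // entirely before the part we still need
            }
            if have.start >= req.end {
                break; // entirely after this required range
            }
            if have.start > start {
                out.push(start..have.start); // gap before this present range
            }
            start = start.max(have.end);
            if start >= req.end {
                break;
            }
        }
        if start < req.end {
            out.push(start..req.end); // tail that nothing covered
        }
    }
    out
}

fn main() {
    let required: Vec<ChunkRange> = vec![0..64];
    let present: Vec<ChunkRange> = vec![0..16, 32..48];
    println!("{:?}", missing(&required, &present));
}
```
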

### Fetch Process

The `fetch` method handles the complete lifecycle:

1. **Local check**: Query the store for what we already have
2. **Request computation**: If format is HashSeq, read the local HashSeq to compute precise missing ranges
3. **Connection**: Open a QUIC stream to the provider
4. **Transfer**: Use the get FSM to stream data into the store
5. **Verification**: BLAKE3 verification happens in-stream during the transfer

For HashSeq format:
- First fetch the root blob (the HashSeq)
- Parse it to get child hashes
- For each child, check local availability and compute missing ranges
- Fetch only what's missing
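
The HashSeq steps above can be sketched as follows. This is an illustration only: `parse_hash_seq` and `children_to_fetch` are hypothetical helpers, local availability is reduced to a yes/no callback, and the offset convention (0 for the root, `i + 1` for child `i`) is taken from the provider-side lookup `HashSeq[offset-1]`.

```rust
type Hash = [u8; 32];

/// Split a HashSeq blob into its 32-byte child hashes.
/// Returns `None` if the length is not a multiple of 32.
fn parse_hash_seq(bytes: &[u8]) -> Option<Vec<Hash>> {
    if bytes.len() % 32 != 0 {
        return None;
    }
    let hashes = bytes
        .chunks_exact(32)
        .map(|chunk| {
            let hash: Hash = chunk.try_into().expect("chunk is 32 bytes");
            hash
        })
        .collect();
    Some(hashes)
}

/// Pair each child that is not yet complete locally with its request offset.
/// Offset 0 is the root, so child `i` lives at offset `i + 1`.
fn children_to_fetch(children: &[Hash], is_complete: impl Fn(&Hash) -> bool) -> Vec<(u64, Hash)> {
    children
        .iter()
        .enumerate()
        .filter(|(_, hash)| !is_complete(*hash))
        .map(|(i, hash)| (i as u64 + 1, *hash))
        .collect()
}

fn main() {
    // Placeholder hashes; real entries would be BLAKE3 digests.
    let root_bytes = [[1u8; 32], [2u8; 32], [3u8; 32]].concat();
    let children = parse_hash_seq(&root_bytes).expect("length is a multiple of 32");
    // Pretend the second child is already complete locally.
    let todo = children_to_fetch(&children, |hash| *hash == [2u8; 32]);
    println!("offsets to fetch: {:?}", todo.iter().map(|(o, _)| *o).collect::<Vec<_>>());
}
```
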

### Observe

```rust
// Subscribe to bitfield updates from a remote provider
let mut stream = remote.observe(connection, hash).stream().await?;
while let Some(bitfield) = stream.next().await {
    // Process availability updates
}
```

The observe protocol sends `ObserveItem` messages (size + available ranges) whenever new chunks become available on the provider. The initial message contains the full current state; subsequent messages contain only deltas.

### Push

```rust
// Push local data to a remote provider
let progress = remote.push(connection, request, store);
```

Push uses the same FSM-style approach but in reverse — the local side reads from the store and writes BLAKE3-verified data to the QUIC stream.

## Downloader API

The `Downloader` (`api::downloader::Downloader`) coordinates downloads from multiple sources:

```rust
let downloader = Downloader::new(store, endpoint);

// Download from specific providers
let progress = downloader.download(DownloadRequest {
    request: FiniteRequest::Get(get_request),
    providers: vec![endpoint_id_1, endpoint_id_2],
    strategy: SplitStrategy::Split,
}).stream();
```

### SplitStrategy

```rust
pub enum SplitStrategy {
    Split,  // Split the request across multiple providers
    None,   // Use a single provider
}
```

When `SplitStrategy::Split` is used, the downloader:
1. Splits the `GetRequest` into per-child requests
2. Distributes children across available providers
3. Downloads in parallel from multiple sources
4. Stores each completed child into the local store
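
Step 2 can be pictured as an assignment of child offsets to providers. The text does not say which policy the downloader uses, so the round-robin below is purely an assumption for illustration; `assign_round_robin` is a hypothetical function and providers are represented by plain string labels instead of `EndpointId` values.

```rust
use std::collections::BTreeMap;

/// Assign each child offset to a provider in round-robin order.
/// Assumption: the real downloader may use a different policy.
fn assign_round_robin<'a>(
    children: &[u64],
    providers: &[&'a str],
) -> BTreeMap<&'a str, Vec<u64>> {
    let mut plan: BTreeMap<&'a str, Vec<u64>> = BTreeMap::new();
    if providers.is_empty() {
        return plan; // nothing to assign to
    }
    for (i, child) in children.iter().enumerate() {
        let provider = providers[i % providers.len()];
        plan.entry(provider).or_default().push(*child);
    }
    plan
}

fn main() {
    let plan = assign_round_robin(&[1, 2, 3, 4, 5], &["provider-a", "provider-b"]);
    for (provider, children) in &plan {
        println!("{provider}: {children:?}");
    }
}
```

Each provider's list can then be fetched on its own connection; a failed part would be retried against another provider, which is what the `ProviderFailed` and `TryProvider` progress items suggest.
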

### DownloadRequest

```rust
pub struct DownloadRequest {
    pub request: FiniteRequest,      // What to download
    pub providers: Vec<EndpointId>,  // Who to download from
    pub strategy: SplitStrategy,     // How to split work
}

pub enum FiniteRequest {
    Get(GetRequest),
    GetMany(GetManyRequest),
}
```

### Download Progress

```rust
pub enum DownloadProgressItem {
    TryProvider { id: EndpointId, request: Arc<GetRequest> },
    ProviderFailed { id: EndpointId, request: Arc<GetRequest> },
    PartComplete { request: Arc<GetRequest> },
    Progress(u64),
    DownloadError,
}
```

## Connection Pooling

The `util::connection_pool::ConnectionPool` manages reusable QUIC connections:

```rust
let pool = ConnectionPool::new(endpoint, ALPN, options);
let connection = pool.connect(endpoint_id).await?;
```

Options include connection timeout, idle timeout, and maximum connections per peer.

## Integration with iroh

### BlobsProtocol

```rust
// src/net_protocol.rs
pub struct BlobsProtocol {
    inner: Arc<BlobsInner>,  // (Store, EventSender)
}

impl ProtocolHandler for BlobsProtocol {
    async fn accept(&self, conn: Connection) -> Result<(), AcceptError> {
        // Simplified: `store` and `events` are taken from `self.inner`.
        crate::provider::handle_connection(conn, store, events).await;
        Ok(())
    }
    async fn shutdown(&self) { /* shutdown store */ }
}
```

Usage with iroh Router:

```rust
let endpoint = Endpoint::bind(presets::N0).await?;
let store = MemStore::new();  // or FsStore::load(path).await?
let blobs = BlobsProtocol::new(&store, None);
let router = Router::builder(endpoint)
    .accept(iroh_blobs::ALPN, blobs)
    .spawn();
```

### Creating a BlobTicket

```rust
let endpoint = Endpoint::bind(presets::N0).await?;
endpoint.online().await;
let addr = endpoint.addr();

let tag = store.add_slice(b"hello world").await?;
let ticket = BlobTicket::new(addr, tag.hash, tag.format);
println!("Share this: {ticket}");
```

### Fetching from a Ticket

```rust
// On the requester side
let ticket: BlobTicket = ticket_str.parse()?;
let (addr, hash, format) = ticket.into_parts();

let endpoint = Endpoint::bind(presets::N0).await?;
let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;

let request = match format {
    BlobFormat::Raw => GetRequest::blob(hash),
    BlobFormat::HashSeq => GetRequest::all(hash),
};

// Use the get FSM
let fsm = get::fsm::start(conn, request, RequestCounters::default());
let connected = fsm.next().await?;
// ... drive the FSM to completion
```
@@ -0,0 +1,312 @@
# iroh-blobs: Data Flow and Complete Example

## Complete Data Flow: Provider Side

```
                QUIC Connection Arrives
                           │
                           ▼
        handle_connection(conn, store, events)
                           │
                ┌──────────┴──────────┐
                │  Accept QUIC BIDI   │
                │  streams in loop    │
                └──────────┬──────────┘
                           │
              handle_stream(pair, store)
                           │
                ┌──────────┴──────────┐
                │  Read Request type  │
                │  byte + deserialize │
                └──────────┬──────────┘
                           │
     ┌─────────────┬───────┼───────┬──────────────┐
     │             │       │       │              │
handle_get    handle_get handle  handle      (reserved)
                 _many  _observe _push
     │             │       │       │
     ▼             ▼       ▼       ▼
┌───────────────────────────────────────────────────────────┐
│ For each (offset, ranges) in request.ranges:              │
│                                                           │
│   if offset == 0:                                         │
│     send_blob(store, 0, hash, ranges, writer)             │
│   else:                                                   │
│     lookup hash in HashSeq[offset-1]                      │
│     send_blob(store, offset, child_hash, ranges, writer)  │
│                                                           │
│ send_blob:                                                │
│   store.export_bao(hash, ranges)                          │
│     .write_with_progress(writer, ctx, &hash, idx)         │
└───────────────────────────────────────────────────────────┘
```

## Complete Data Flow: Requester Side (Get FSM)

```
              Create GetRequest
                      │
                      ▼
  fsm::start(connection, request, counters)
                      │
                      ▼
              AtInitial.next()
                      │ (open_bi, send request)
                      ▼
             AtConnected.next()
                      │
          ┌───────────┼───────────┐
          │           │           │
      StartRoot  StartChild    Closing
     (offset=0)  (offset>0)    (empty)
          │           │           │
          ▼           ▼           ▼
    AtBlobHeader AtBlobHeader AtClosing
       .next()   .next(hash)   .next()
          │           │           │
          ▼           ▼           ▼
      (size, AtBlobContent)     Stats
                │
       ┌────────┴────────┐
       │                 │
  More(item)           Done
 (loop back to      (AtEndBlob)
 AtBlobContent)          │
                   ┌─────┼─────┐
                   │           │
             MoreChildren   Closing
            (AtStartChild) (AtClosing)
                   │           │
                   └───────────┘
```

### Blob Content Items

During `AtBlobContent`, items arrive as `BaoContentItem`:

```rust
pub enum BaoContentItem {
    Parent(ParentNode),  // (node, (left_hash, right_hash)) — 64 bytes
    Leaf(Leaf),          // { offset: u64, data: Bytes } — actual data
}
```

- **Parent nodes** contain BLAKE3 hash pairs for tree verification. They are pure protocol overhead: 64 bytes (two 32-byte hashes) per internal node.
- **Leaf nodes** contain actual data chunks. Each leaf's data is at most `IROH_BLOCK_SIZE` bytes (16 KiB).

Verification is automatic: the `ResponseDecoder` from `bao-tree` validates each chunk against the expected hash tree rooted at the request hash.

## Blob Verification and BaoTree Encoding

### How BLAKE3 Verified Streaming Works

1. **The hash is the root** of a binary Merkle tree
2. **Internal nodes** store `(left_child_hash, right_child_hash)` — 64 bytes each
3. **Leaf nodes** store the actual data chunks (up to 1024 bytes each in standard BLAKE3, or 16 KiB in iroh's block size)
4. **Chunk groups** (16 chunks = 16 KiB) are the minimum verification unit in iroh-blobs

For a request with specific ranges:
- The provider traverses the tree, yielding only nodes needed to verify the requested ranges
- The requester can verify each chunk group independently after receiving its parent hash pair
- Corruption is detected within one chunk group: at most 16 KiB of unverified data is held before it is checked against the tree

### Outboard Storage

The **outboard** is the BLAKE3 hash tree stored separately from the data. For the provider:
- Small blobs (≤16 KiB): outboard is empty (not needed, single chunk group)
- Large blobs: outboard stored as `PreOrderMemOutboard` (in-memory) or as a file (filesystem store)

For the requester, the outboard is built incrementally as data arrives.
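
The sizes involved follow from the tree shape. The sketch below works through the arithmetic under stated assumptions: a binary tree over 16 KiB chunk groups has one parent node fewer than it has leaves, each parent costs 64 bytes, and any size prefix a concrete outboard format might add is ignored. The function names are hypothetical.

```rust
const CHUNK_GROUP: u64 = 16 * 1024; // bytes per chunk group
const HASH_PAIR: u64 = 64; // bytes per parent node (two 32-byte hashes)

/// Number of chunk groups (tree leaves); an empty blob still has one leaf.
fn chunk_groups(len: u64) -> u64 {
    len.div_ceil(CHUNK_GROUP).max(1)
}

/// Outboard size in bytes: a binary tree with n leaves has n - 1 parents.
/// Assumption: no size prefix is counted.
fn outboard_len(len: u64) -> u64 {
    (chunk_groups(len) - 1) * HASH_PAIR
}

/// Upper bound on the parent-node bytes sent alongside a single chunk group:
/// one hash pair per tree level, and the tree has at most ceil(log2(n)) levels.
fn max_path_overhead(len: u64) -> u64 {
    let depth = chunk_groups(len).next_power_of_two().trailing_zeros();
    u64::from(depth) * HASH_PAIR
}

fn main() {
    for len in [16 * 1024u64, 1 << 20, 1 << 30] {
        println!(
            "{len:>10} bytes: {} groups, outboard {} bytes, single-group overhead <= {} bytes",
            chunk_groups(len),
            outboard_len(len),
            max_path_overhead(len),
        );
    }
}
```

For a 1 GiB blob this gives an outboard of about 4 MiB, while fetching any single chunk group costs at most 1 KiB of hash pairs on top of the 16 KiB of data.
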

## Import and Export Flows

### Import Bytes (Local Data)

```
add_bytes(data) / add_slice(data)
    │
    ▼
ImportBytesRequest { data, format, scope }
    │
    ▼
Actor::import_bytes()
    │  1. Send AddProgressItem::Size(len)
    │  2. Send AddProgressItem::CopyDone
    │  3. Compute outboard: PreOrderMemOutboard::create(&data, IROH_BLOCK_SIZE)
    │  4. Return ImportEntry { data, outboard, scope, format, tx }
    │
    ▼
Actor::finish_import()
    │  1. Get hash from outboard.root()
    │  2. Get or create BaoFileHandle for hash
    │  3. Transition BaoFileStorage::Partial → Complete
    │  4. Create TempTag for the hash_and_format
    │  5. Send AddProgressItem::Done(temp_tag)
```

### Import BAO Stream (Remote Data)

```
import_bao_bytes(hash, ranges, data) / import_bao_reader(hash, ranges, reader)
    │
    ▼
ImportBaoRequest { hash, size }
    │
    ▼
Actor::import_bao()
    │  1. Set size on partial entry
    │  2. Create BaoTree for the size
    │  3. For each BaoContentItem from stream:
    │     - Parent: write hash pair to outboard
    │     - Leaf: write data to storage, update bitfield
    │     - If bitfield becomes complete: transition Partial → Complete
    │  4. Send result
```

### Export BAO

```
export_bao(hash, ranges) → ExportBao
    │
    ▼
Actor::export_bao()
    │  1. Look up BaoFileHandle for hash
    │  2. If not found: send EncodeError::NotFound and return
    │  3. Create BaoTreeSender from data + outboard readers
    │  4. Call traverse_ranges_validated(data, outboard, &ranges, tx)
    │     → streams validated BAO items to the sender
```

### Export Path (To Filesystem)

```
export(hash, target_path) → ExportPath
    │
    ▼
Actor::export_path()
    │  1. Look up BaoFileHandle for hash
    │  2. Create parent directories if needed
    │  3. Create file at target_path
    │  4. Send ExportProgressItem::Size(total_size)
    │  5. Read data from store in 64 KiB chunks
    │  6. Write to file, yielding ExportProgressItem::CopyProgress(offset)
    │  7. Send ExportProgressItem::Done
```
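
Steps 4 to 7 of the export path boil down to a chunked copy loop that reports progress. The following sketch shows that loop with `std::io` only; `ExportProgress` and `export_with_progress` are hypothetical stand-ins for `ExportProgressItem` and the actor method, and progress is delivered through a callback rather than a channel.

```rust
use std::io::{self, Read, Write};

const COPY_CHUNK: usize = 64 * 1024; // 64 KiB, as in step 5 above

/// Hypothetical stand-in for `ExportProgressItem`.
#[derive(Debug, PartialEq)]
enum ExportProgress {
    Size(u64),
    CopyProgress(u64),
    Done,
}

/// Copy `src` to `dst` in 64 KiB chunks, reporting the offset after each write.
fn export_with_progress<R: Read, W: Write>(
    mut src: R,
    mut dst: W,
    size: u64,
    mut report: impl FnMut(ExportProgress),
) -> io::Result<()> {
    report(ExportProgress::Size(size));
    let mut buf = vec![0u8; COPY_CHUNK];
    let mut offset = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        dst.write_all(&buf[..n])?;
        offset += n as u64;
        report(ExportProgress::CopyProgress(offset));
    }
    dst.flush()?;
    report(ExportProgress::Done);
    Ok(())
}

fn main() -> io::Result<()> {
    let data = vec![7u8; 150_000];
    let mut out = Vec::new();
    export_with_progress(data.as_slice(), &mut out, data.len() as u64, |item| {
        println!("{item:?}");
    })?;
    assert_eq!(out, data);
    Ok(())
}
```
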

## Observe Protocol Detail

```
Requester                      Provider
│                                  │
│  ObserveRequest {hash, ranges}   │
│─────────────────────────────────►│
│                                  │
│  ObserveItem {size, ranges}      │  (initial state)
│◄─────────────────────────────────│
│                                  │
│  ... (time passes, more data     │
│      becomes available)          │
│                                  │
│  ObserveItem {size, ranges}      │  (delta update)
│◄─────────────────────────────────│
│                                  │
│  ... (continue until             │
│      requester stops             │
│      or connection closes)       │
│                                  │
│  STOP_STREAM                     │
│─────────────────────────────────►│
```

The observe protocol uses `Bitfield::diff()` to send only the new chunks since the last update, minimizing bandwidth.
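
The delta idea can be shown with a plain bitmask, one bit per chunk group. This is a sketch under stated assumptions, not the crate's `Bitfield`: `Availability`, `delta_since`, and `apply` are hypothetical, and the exact semantics of `Bitfield::diff()` may differ from the "newly set bits" shown here.

```rust
/// One bit per chunk group, 64 groups per word.
/// Hypothetical stand-in for the crate's `Bitfield`.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Availability(Vec<u64>);

impl Availability {
    /// Bits that are set in `self` but were not set in `previous`.
    fn delta_since(&self, previous: &Availability) -> Availability {
        let words = self
            .0
            .iter()
            .enumerate()
            .map(|(i, word)| *word & !previous.0.get(i).copied().unwrap_or(0))
            .collect();
        Availability(words)
    }

    /// Merge a received delta into the locally tracked state.
    fn apply(&mut self, delta: &Availability) {
        if self.0.len() < delta.0.len() {
            self.0.resize(delta.0.len(), 0);
        }
        for (word, d) in self.0.iter_mut().zip(&delta.0) {
            *word |= *d;
        }
    }
}

fn main() {
    // Provider state at two points in time.
    let before = Availability(vec![0b0011]);
    let after = Availability(vec![0b0111, 0b0001]);

    // The provider sends only what changed.
    let delta = after.delta_since(&before);
    println!("delta words: {:?}", delta.0);

    // The requester reconstructs the full state from the initial message plus the delta.
    let mut tracked = before.clone();
    tracked.apply(&delta);
    assert_eq!(tracked, after);
}
```
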

## Full Working Example

```rust
use iroh::{protocol::Router, Endpoint, endpoint::presets};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol, ticket::BlobTicket, BlobFormat};

// === Provider Side ===
async fn provider() -> anyhow::Result<()> {
    let endpoint = Endpoint::bind(presets::N0).await?;
    let store = MemStore::new();

    // Add some data
    let tag = store.add_slice(b"Hello, iroh-blobs!").await?;

    let _ = endpoint.online().await;
    let addr = endpoint.addr();

    // Create ticket for sharing
    let ticket = BlobTicket::new(addr, tag.hash, BlobFormat::Raw);
    println!("Ticket: {ticket}");

    // Start serving
    let blobs = BlobsProtocol::new(&store, None);
    let router = Router::builder(endpoint)
        .accept(iroh_blobs::ALPN, blobs)
        .spawn();

    tokio::signal::ctrl_c().await?;
    router.shutdown().await?;
    Ok(())
}

// === Requester Side ===
async fn requester(ticket: BlobTicket) -> anyhow::Result<()> {
    let (addr, hash, format) = ticket.into_parts();

    let endpoint = Endpoint::bind(presets::N0).await?;
    let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;

    // Build request based on format
    let request = match format {
        BlobFormat::Raw => iroh_blobs::protocol::GetRequest::blob(hash),
        BlobFormat::HashSeq => iroh_blobs::protocol::GetRequest::all(hash),
    };

    // Use the get FSM
    let start = iroh_blobs::get::fsm::start(conn, request, Default::default());
    let connected = start.next().await?;
    let connected = connected.next().await?;

    match connected {
        iroh_blobs::get::fsm::ConnectedNext::StartRoot(at_root) => {
            let (at_content, size) = at_root.next().next().await?;
            let (at_end, data) = at_content.concatenate_into_vec().await?;
            println!("Got {} bytes: {:?}", size, data);
            // ...
        }
        iroh_blobs::get::fsm::ConnectedNext::StartChild(at_child) => {
            // Need to know the child hash
        }
        iroh_blobs::get::fsm::ConnectedNext::Closing(at_closing) => {
            println!("Empty response");
        }
    }

    Ok(())
}
```

## Simplified Fetch (Using Store + Remote)

```rust
// The simplest way to download data
let store = MemStore::new();
let remote = store.remote();

// Fetch with automatic local availability checking
let result = remote.fetch(connection, hash, format, &store).await?;
// Result includes Stats with transfer metrics
```

## Key Error Types

| Error Type | Location | Purpose |
|------------|----------|---------|
| `GetError` | `get::error` | Errors during get FSM |
| `ExportBaoError` | `api` | Errors during BAO export |
| `RequestError` | `api` | Store command errors |
| `DecodeError` | `get::fsm` | BAO stream decode errors |
| `ProgressError` | `provider::events` | Provider event errors |
60
docs/research/references/iroh/iroh-blobs/README.md
Normal file
@@ -0,0 +1,60 @@
# iroh-blobs Reference Documentation

This directory contains a comprehensive reference for the `iroh-blobs` crate (v0.100.0), a Rust library for content-addressed blob transfer over QUIC connections using BLAKE3 verified streaming.

## Documents

1. **[Overview and Architecture](01-overview-and-architecture.md)** — Core concepts, module structure, feature flags, and architecture diagram. Start here.

2. **[Key Types and Data Structures](02-key-types.md)** — Detailed reference for `Hash`, `BlobFormat`, `HashAndFormat`, `HashSeq`, `Bitfield`, `Tag`, `TempTag`, `BlobTicket`, `ChunkRanges`/`ChunkRangesSeq`/`RangeSpec`, and the store command protocol.

3. **[Transfer Protocol](03-transfer-protocol.md)** — Wire protocol specification: request types (`GetRequest`, `GetManyRequest`, `PushRequest`, `ObserveRequest`), response format (BLAKE3 verified streaming), the client-side FSM, provider handling, event system, and the Collection format.

4. **[Storage Architecture](04-storage.md)** — Store implementations: `MemStore` (in-memory), `FsStore` (hybrid redb + filesystem), `ReadonlyMemStore`. Covers the actor pattern, `BaoFileHandle`/`BaoFileStorage`, partial/complete states, the hybrid inline/file approach, entry states, blob lifecycle, and garbage collection.

5. **[Remote API and Downloader](05-remote-and-downloader.md)** — `Remote` API for fetching from/observing/pushing to peers, `Downloader` for multi-source downloads, connection pooling, and iroh integration via `BlobsProtocol`.

6. **[Data Flow and Examples](06-data-flow-and-examples.md)** — End-to-end data flow diagrams for provider and requester sides, BLAKE3 verification mechanics, import/export flows, observe protocol detail, and complete working examples.

## Quick Reference

### Creating a Provider

```rust
use iroh::{protocol::Router, Endpoint, endpoint::presets};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol};

let endpoint = Endpoint::bind(presets::N0).await?;
let store = MemStore::new();
let tag = store.add_slice(b"data").await?;
let blobs = BlobsProtocol::new(&store, None);
let router = Router::builder(endpoint)
    .accept(iroh_blobs::ALPN, blobs)
    .spawn();
```

### Key Constants

| Constant | Value | Meaning |
|----------|-------|---------|
| `ALPN` | `b"/iroh-bytes/4"` | QUIC ALPN protocol identifier |
| `IROH_BLOCK_SIZE` | `BlockSize::from_chunk_log(4)` | 16 KiB chunk groups |
| `MAX_MESSAGE_SIZE` | `1 MiB` | Maximum request message size |
| `Hash::EMPTY` | BLAKE3 of `b""` | Hash of the empty blob |

### Core Crate Exports

```rust
pub use hash::{BlobFormat, Hash, HashAndFormat};
pub use hashseq::HashSeq;
pub use net_protocol::BlobsProtocol;
pub use protocol::ALPN;
pub mod api;       // Store API, Blobs, Tags, Downloader, Remote
pub mod format;    // Collection type
pub mod get;       // Client-side FSM
pub mod protocol;  // Wire protocol types (GetRequest, etc.)
pub mod provider;  // Server-side handling
pub mod store;     // Storage implementations
pub mod ticket;    // BlobTicket
pub mod util;      // Connection pool, temp tags, stream helpers
```
@@ -0,0 +1,98 @@
# iroh-docs: Overview and Architecture

> Reference document for the `iroh-docs` crate (v0.98.0).
> Source: `/workspace/iroh-docs`

## What Is iroh-docs?

`iroh-docs` is a Rust crate implementing **multi-dimensional key-value documents with an efficient synchronization protocol**. It provides:

1. **A CRDT-based document model** — Replicas (documents) hold entries identified by namespace + author + key, with content-addressed values (BLAKE3 hashes).
2. **Range-based set reconciliation** — An efficient sync protocol based on [Aljoscha Meyer's paper](https://arxiv.org/abs/2212.13567) for reconciling sets between peers.
3. **Live sync via gossip** — Real-time document updates propagated through an iroh-gossip swarm.
4. **Persistent storage** — A `redb`-backed store supporting both in-memory and file-based modes.

## High-Level Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                       Docs (Protocol)                        │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                         Engine                          │ │
│  │  ┌──────────┐  ┌──────────────┐  ┌───────────────────┐  │ │
│  │  │ LiveActor│  │ GossipState  │  │ SyncHandle/Actor  │  │ │
│  │  │ (events) │  │ (iroh-gossip)│  │ (store + sync)    │  │ │
│  │  └──────────┘  └──────────────┘  └───────────────────┘  │ │
│  └─────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │
│  │  Replica       │  │  SignedEntry   │  │  Author/       │  │
│  │  (sync.rs)     │  │  Entry/Record  │  │ Namespace keys │  │
│  └────────────────┘  └────────────────┘  └────────────────┘  │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                      Store (redb)                       │ │
│  │   Authors │ Namespaces │ Records │ RecordsByKey │ ...   │ │
│  └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
```

### Module Layout

| Module | Purpose |
|--------|---------|
| `sync.rs` | Core types: `Replica`, `Entry`, `SignedEntry`, `Record`, `RecordIdentifier`, `Capability`, events |
| `keys.rs` | Cryptographic key types: `Author`, `NamespaceSecret`, `AuthorId`, `NamespaceId` |
| `ranger.rs` | Range-based set reconciliation algorithm implementation |
| `heads.rs` | `AuthorHeads` — latest timestamps per author for efficient sync decisions |
| `store/` | Storage abstraction and `redb`-backed persistent store |
| `store/fs.rs` | File-based `Store` implementation with redb tables |
| `store/pubkeys.rs` | `PublicKeyStore` trait for caching expanded ed25519 public keys |
| `actor.rs` | `SyncHandle` / Actor — single-threaded executor for store and replica operations |
| `engine/` | Live sync coordination: `Engine`, `LiveActor`, `GossipState`, `NamespaceStates` |
| `engine/live.rs` | The `LiveActor` event loop: handles sync, gossip, content download |
| `engine/gossip.rs` | Integration with `iroh-gossip` for broadcasting document operations |
| `engine/state.rs` | `NamespaceStates` — tracks per-namespace, per-peer sync state |
| `net/` | Network protocol: ALPN `/iroh-sync/1`, connection handling |
| `net/codec.rs` | Wire codec: length-prefixed postcard-serialized `Message` frames |
| `protocol.rs` | `Docs` struct (the `ProtocolHandler`) and `Builder` |
| `api/` | irpc-based RPC API for external access |
| `ticket.rs` | `DocTicket` — shareable document capability + peer addresses |

## Key Design Principles

1. **Two-key identity model**: Every entry is uniquely identified by (namespace, author, key). The namespace key provides write authorization; the author key provides attribution.

2. **Content-addressed values**: Entries store a BLAKE3 hash + length, not the actual content. Content blobs are handled separately by `iroh-blobs`.

3. **Prefix deletion**: An entry with key "foo" acts as a tombstone for all entries whose keys start with "foo/" (prefix deletion semantics). This enables hierarchical key structures.

4. **Last-writer-wins with per-author timestamps**: Entries are ordered by (timestamp, hash). Newer entries dominate older ones. Different authors can have entries for the same key simultaneously (multi-dimensional).

5. **Actor-based concurrency**: All store and replica mutations go through a single `SyncHandle` actor thread, eliminating the need for locks on the store.

6. **Event-driven live sync**: The `LiveActor` coordinates gossip, direct sync, and content downloads through a `tokio::select!` event loop.

## Dependencies

Key dependencies from `Cargo.toml`:

| Crate | Purpose |
|-------|---------|
| `iroh` | Networking: endpoints, connections, protocol routing |
| `iroh-blobs` | Content-addressed blob storage and transfer |
| `iroh-gossip` | Gossip protocol for broadcasting updates |
| `iroh-tickets` | Ticket-based sharing mechanism |
| `redb` | Embedded key-value store for persistence |
| `ed25519-dalek` | Ed25519 signatures for entries |
| `blake3` | Hashing (fingerprints + content hashes) |
| `postcard` | Serialization (wire format for sync protocol) |
| `irpc` / `noq` | RPC framework for API |

## Feature Flags

| Feature | Default | Description |
|---------|---------|-------------|
| `metrics` | Yes | Enables iroh-metrics instrumentation |
| `rpc` | Yes | Enables irpc-based RPC API (depends on `noq`) |
| `fs-store` | Yes | Enables persistent file-based store |
201
docs/research/references/iroh/iroh-docs/02-document-model.md
Normal file
@@ -0,0 +1,201 @@
# iroh-docs: Document Model and CRDT Details

## Core Data Model

### Namespace (Document Identity)

A **Namespace** is the identity of a document. It consists of:

- **`NamespaceSecret`** — An Ed25519 signing key (32 bytes) that grants write capability
- **`NamespacePublicKey`** — The corresponding verifying key (32 bytes)
- **`NamespaceId`** — A `[u8; 32]` that is the byte representation of the public key; this serves as the unique identifier for a document/replica

```
NamespaceSecret (signing key) ──derives──▶ NamespacePublicKey (verifying key)
                              ──into─────▶ NamespaceId ([u8; 32])
```

### Author (Writer Identity)

An **Author** represents a writer identity within a document. Multiple authors can write to the same namespace.

- **`Author`** — An Ed25519 signing key (32 bytes)
- **`AuthorPublicKey`** — The corresponding verifying key (32 bytes)
- **`AuthorId`** — A `[u8; 32]` byte representation of the public key

Authors are application-defined: an application might create one author per device, per user, or per session.

### Capability

Access to a document is controlled through a `Capability`:

```rust
pub enum Capability {
    Write(NamespaceSecret),  // Full read-write access
    Read(NamespaceId),       // Read-only access (can sync but not insert)
}
```

Capabilities can be **merged** — a `Read` capability can be upgraded to `Write` if a matching `Write` is presented:

```rust
capability.merge(other_capability)  // Read + Write → Write
```

The raw representation is `(u8, [u8; 32])` — a kind byte followed by 32 bytes of key material.
|
||||
|
||||
### Entry (The Fundamental Record)
|
||||
|
||||
An **`Entry`** is the core data unit, consisting of:
|
||||
|
||||
```rust
|
||||
pub struct Entry {
|
||||
id: RecordIdentifier, // (namespace, author, key)
|
||||
record: Record, // (hash, len, timestamp)
|
||||
}
|
||||
```
|
||||
|
||||
#### RecordIdentifier
|
||||
|
||||
```rust
|
||||
pub struct RecordIdentifier(Bytes); // namespace[0..32] || author[32..64] || key[64..]
|
||||
```
|
||||
|
||||
The key is a variable-length byte sequence. `RecordIdentifier` implements `Ord` by comparing namespace first, then author, then key — this ordering is critical for the range-based sync algorithm.
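
The ordering claim can be checked with a small sketch. `RecordId` below is a hypothetical stand-in (not the crate's type) that stores the same `namespace || author || key` layout in a `Vec<u8>`. Because the namespace and author parts have a fixed width of 32 bytes each, plain lexicographic byte comparison is equivalent to comparing namespace first, then author, then key.

```rust
/// Hypothetical stand-in for `RecordIdentifier`:
/// namespace[0..32] || author[32..64] || key[64..] in one buffer.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct RecordId(Vec<u8>);

impl RecordId {
    fn new(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Self {
        let mut bytes = Vec::with_capacity(64 + key.len());
        bytes.extend_from_slice(namespace);
        bytes.extend_from_slice(author);
        bytes.extend_from_slice(key);
        RecordId(bytes)
    }

    fn namespace(&self) -> &[u8] {
        &self.0[..32]
    }

    fn author(&self) -> &[u8] {
        &self.0[32..64]
    }

    fn key(&self) -> &[u8] {
        &self.0[64..]
    }
}
```

The derived `Ord` on the inner `Vec<u8>` is lexicographic, so no hand-written comparison is needed in this sketch.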
|
||||
|
||||
#### Record
|
||||
|
||||
```rust
|
||||
pub struct Record {
|
||||
len: u64, // byte length of the content
|
||||
hash: Hash, // BLAKE3 hash of the content (32 bytes)
|
||||
timestamp: u64, // microseconds since Unix epoch
|
||||
}
|
||||
```
|
||||
|
||||
The `Record` comparison uses `(timestamp, hash)` ordering — this is the **Last-Writer-Wins** rule for same-key entries. When two records for the same key exist, the one with the higher timestamp wins; if timestamps are equal, the higher hash wins as a tiebreaker.
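
A minimal sketch of this comparison, using a simplified `Record` with a raw `[u8; 32]` in place of the crate's `Hash` type (an assumption for self-containment). The `winner` helper is hypothetical and only illustrates how the ordering resolves a conflict.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, PartialEq, Eq)]
struct Record {
    len: u64,
    hash: [u8; 32],
    timestamp: u64,
}

impl Ord for Record {
    fn cmp(&self, other: &Self) -> Ordering {
        // Timestamp first, hash as tiebreaker. `len` is not compared:
        // identical hashes imply identical content and therefore length.
        self.timestamp
            .cmp(&other.timestamp)
            .then_with(|| self.hash.cmp(&other.hash))
    }
}

impl PartialOrd for Record {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Hypothetical helper: pick the record that survives a same-key conflict.
fn winner(a: Record, b: Record) -> Record {
    if b > a { b } else { a }
}
```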
|
||||
|
||||
### SignedEntry (Entry with Proofs)
|
||||
|
||||
```rust
|
||||
pub struct SignedEntry {
|
||||
signature: EntrySignature, // dual Ed25519 signatures
|
||||
entry: Entry,
|
||||
}
|
||||
```
|
||||
|
||||
#### EntrySignature
|
||||
|
||||
```rust
|
||||
pub struct EntrySignature {
|
||||
author_signature: Signature, // 64-byte Ed25519 signature
|
||||
namespace_signature: Signature, // 64-byte Ed25519 signature
|
||||
}
|
||||
```
|
||||
|
||||
Both signatures cover the canonical byte encoding of the `Entry` (id + record). This means:
|
||||
- The **namespace signature** proves write authorization (only holders of `NamespaceSecret` can produce valid entries)
|
||||
- The **author signature** proves authorship (provides attribution and non-repudiation)
|
||||
|
||||
#### Verification
|
||||
|
||||
```rust
|
||||
fn verify<S: PublicKeyStore>(&self, store: &S) -> Result<(), SignatureError>
|
||||
```
|
||||
|
||||
Verification requires both the `NamespacePublicKey` and `AuthorPublicKey`, which are derived from the entry's namespace and author IDs. The `PublicKeyStore` trait provides caching for these expanded keys.
|
||||
|
||||
### Empty Entries (Tombstones / Prefix Deletion)
|
||||
|
||||
An entry is **empty** when `hash == Hash::EMPTY && len == 0`. Empty entries serve as **deletion markers**:
|
||||
|
||||
- **Key deletion**: Inserting an empty entry with the exact key replaces the author's previous entry for that key
- **Prefix deletion**: Inserting an empty entry with key "foo" removes all of that author's older entries whose keys start with "foo"
|
||||
|
||||
```rust
|
||||
pub async fn delete_prefix(&mut self, prefix: impl AsRef<[u8]>, author: &Author) -> Result<usize, InsertError>
|
||||
```
|
||||
|
||||
### Insert Semantics (CRDT Rules)
|
||||
|
||||
When a `SignedEntry` is inserted into a replica via `Store::put()` (the ranger store trait):
|
||||
|
||||
1. **Check prefixes**: Look up all existing entries whose key is a **prefix** of the new entry's key. If any prefix entry has a value `>=` the new entry's value, the new entry is **rejected** (`InsertOutcome::NotInserted`).
|
||||
|
||||
2. **Remove dominated entries**: Remove all existing entries whose key **starts with** the new entry's key (i.e., the new key is a prefix of theirs) AND whose value is `<=` the new entry's value.
|
||||
|
||||
3. **Insert**: If not rejected, the new entry is stored.
|
||||
|
||||
This implements a **prefix-aware last-writer-wins** CRDT:
|
||||
- Newer entries for the same (namespace, author, key) tuple replace older ones
|
||||
- A new entry at key "/foo" can delete all entries under "/foo/*" if it's newer
|
||||
- Different authors can coexist on the same key — each author's latest entry is kept
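
The three insert steps can be sketched over an in-memory `BTreeMap`. This is an illustration under stated assumptions, not the crate's store: keys are the raw key bytes of a single (namespace, author) pair, and a value is a `(timestamp, hash)` tuple so that tuple ordering matches the record ordering described earlier. The free function `put` and the local `InsertOutcome` are defined here for the sketch only.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum InsertOutcome {
    NotInserted,
    Inserted { removed: usize },
}

/// (timestamp, hash): tuple ordering gives last-writer-wins.
type Value = (u64, [u8; 32]);

fn put(store: &mut BTreeMap<Vec<u8>, Value>, key: Vec<u8>, value: Value) -> InsertOutcome {
    // 1. Reject if any existing entry whose key is a prefix of `key`
    //    (including `key` itself) has a value >= the new value.
    for len in 0..=key.len() {
        if let Some(existing) = store.get(&key[..len]) {
            if value <= *existing {
                return InsertOutcome::NotInserted;
            }
        }
    }

    // 2. Collect entries whose key starts with `key` and whose value is <= the
    //    new value. Such keys form one contiguous run beginning at `key`.
    let dominated: Vec<Vec<u8>> = store
        .range(key.clone()..)
        .take_while(|(k, _)| k.starts_with(&key))
        .filter(|(_, v)| **v <= value)
        .map(|(k, _)| k.clone())
        .collect();
    for k in &dominated {
        store.remove(k);
    }

    // 3. Store the new entry.
    store.insert(key, value);
    InsertOutcome::Inserted { removed: dominated.len() }
}
```

Inserting `/foo` with a newer timestamp than `/foo/a` and `/foo/b` removes both; a later attempt to insert `/foo/c` with an older timestamp is rejected by step 1.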
|
||||
|
||||
### Timestamp and Future Shift
|
||||
|
||||
Timestamps are in **microseconds since Unix epoch**. There is a maximum allowed future shift:
|
||||
|
||||
```rust
|
||||
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;
|
||||
```
|
||||
|
||||
Entries whose timestamp lies more than `MAX_TIMESTAMP_FUTURE_SHIFT` (intended as 10 minutes) ahead of the local clock are rejected during validation.
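
The arithmetic of the constant can be worked through directly. The sketch below copies the expression exactly as printed above; `within_future_shift` is a hypothetical helper, not a function of the crate. The expression evaluates to 600,000. Read as milliseconds that is 10 minutes, but read in the microsecond unit used for entry timestamps it is 0.6 seconds, so the effective window is worth confirming against the crate source.

```rust
use std::time::Duration;

// Expression taken verbatim from the text: 10 * 60 * 1000 = 600_000.
const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;

/// Hypothetical check: is `entry_ts` no further ahead of `now` than the allowed shift?
/// Both arguments are assumed to use the same unit as the constant.
fn within_future_shift(entry_ts: u64, now: u64) -> bool {
    entry_ts <= now.saturating_add(MAX_TIMESTAMP_FUTURE_SHIFT)
}
```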
|
||||
|
||||
### Content Status
|
||||
|
||||
Each entry's content has an availability status:
|
||||
|
||||
```rust
|
||||
pub enum ContentStatus {
|
||||
Complete, // Content blob is fully available locally
|
||||
Incomplete, // Partially available
|
||||
Missing, // Not available
|
||||
}
|
||||
```
|
||||
|
||||
This status is communicated during sync to help peers decide whether to download content.
|
||||
|
||||
### AuthorHeads (Efficient Sync Optimization)
|
||||
|
||||
`AuthorHeads` tracks the latest timestamp for each author in a document:
|
||||
|
||||
```rust
|
||||
pub struct AuthorHeads {
|
||||
heads: BTreeMap<AuthorId, Timestamp>,
|
||||
}
|
||||
```
|
||||
|
||||
This enables a quick check: `has_news_for(other)` — comparing local and remote heads to determine whether sync would yield any new entries. If all timestamps are at least as recent locally, no sync is needed.
|
||||
|
||||
`AuthorHeads` can be serialized with a size limit, dropping the oldest entries when the limit is exceeded.
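
A minimal sketch of the head comparison, assuming `AuthorId` is a raw `[u8; 32]` and `Timestamp` a `u64`. The `insert` helper and the exact comparison are illustrative and may differ from the crate: here one side "has news" for the other if it knows an author the other lacks, or has a strictly newer timestamp for a shared author.

```rust
use std::collections::BTreeMap;

type AuthorId = [u8; 32];
type Timestamp = u64;

#[derive(Debug, Default)]
struct AuthorHeads {
    heads: BTreeMap<AuthorId, Timestamp>,
}

impl AuthorHeads {
    /// Record a timestamp for an author, keeping only the newest one.
    fn insert(&mut self, author: AuthorId, ts: Timestamp) {
        let head = self.heads.entry(author).or_insert(ts);
        *head = (*head).max(ts);
    }

    /// True if `self` knows something that `other` does not.
    fn has_news_for(&self, other: &AuthorHeads) -> bool {
        self.heads.iter().any(|(author, ts)| match other.heads.get(author) {
            None => true,
            Some(other_ts) => ts > other_ts,
        })
    }
}
```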
|
||||
|
||||
## Event System
|
||||
|
||||
Replicas emit events through a subscription system:
|
||||
|
||||
```rust
|
||||
pub enum Event {
|
||||
LocalInsert {
|
||||
namespace: NamespaceId,
|
||||
entry: SignedEntry,
|
||||
},
|
||||
RemoteInsert {
|
||||
namespace: NamespaceId,
|
||||
entry: SignedEntry,
|
||||
from: PeerIdBytes,
|
||||
should_download: bool, // based on download policy
|
||||
remote_content_status: ContentStatus,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
Subscribers use `async_channel` for non-blocking notification delivery. The `ReplicaInfo::subscribe()` method registers a sender, and events are fanned out to all subscribers.
|
||||
|
||||
## Validation
|
||||
|
||||
Entry validation during insertion checks:
|
||||
|
||||
1. **Namespace match**: The entry's namespace must match the replica's namespace
|
||||
2. **Signature verification**: For non-local entries, both namespace and author signatures are verified
|
||||
3. **Timestamp check**: The entry must not be more than `MAX_TIMESTAMP_FUTURE_SHIFT` in the future
|
||||
4. **Empty entry check**: An empty entry must have `hash == EMPTY && len == 0`, and a non-empty entry must have `len != 0`
|
||||
272
docs/research/references/iroh/iroh-docs/03-sync-protocol.md
Normal file
@@ -0,0 +1,272 @@
|
||||
# iroh-docs: Range-Based Set Reconciliation (Ranger)
|
||||
|
||||
## Overview
|
||||
|
||||
The sync protocol in iroh-docs is based on **Range-Based Set Reconciliation**, implementing the algorithm described in [Aljoscha Meyer's paper (arXiv:2212.13567)](https://arxiv.org/abs/2212.13567).
|
||||
|
||||
The core idea: two peers can efficiently compute the union of their entry sets by recursively partitioning the sets and comparing **fingerprints** (hashes) of partitions. When fingerprints match, no further work is needed. When they differ, the partition is subdivided until the difference can be resolved by sending the actual entries.
|
||||
|
||||
## Key Abstractions
|
||||
|
||||
### RangeEntry Trait
|
||||
|
||||
```rust
|
||||
pub trait RangeEntry: Debug + Clone {
|
||||
type Key: RangeKey;
|
||||
type Value: RangeValue;
|
||||
|
||||
fn key(&self) -> &Self::Key;
|
||||
fn value(&self) -> &Self::Value;
|
||||
fn as_fingerprint(&self) -> Fingerprint;
|
||||
}
|
||||
```
|
||||
|
||||
`SignedEntry` implements `RangeEntry`:
|
||||
- `Key` = `RecordIdentifier` (namespace || author || key bytes)
|
||||
- `Value` = `Record` (timestamp, hash, len)
|
||||
- Fingerprint = BLAKE3 hash of (namespace || author || key || timestamp || content_hash)
|
||||
|
||||
### RangeKey Trait
|
||||
|
||||
```rust
|
||||
pub trait RangeKey: Sized + Debug + Ord + PartialEq + Clone + 'static {
|
||||
fn is_prefix_of(&self, other: &Self) -> bool; // test-only
|
||||
}
|
||||
```
|
||||
|
||||
`RecordIdentifier` implements this via byte-level prefix matching: `(namespace, author, key)` where key prefix matching supports the hierarchical deletion semantics.
|
||||
|
||||
### RangeValue Trait
|
||||
|
||||
```rust
|
||||
pub trait RangeValue: Sized + Debug + Ord + PartialEq + Clone + 'static {}
|
||||
```
|
||||
|
||||
`Record` implements `RangeValue` with ordering by `(timestamp, hash)` — the Last-Writer-Wins ordering.
|
||||
|
||||
### Fingerprint
|
||||
|
||||
```rust
|
||||
pub struct Fingerprint(pub [u8; 32]); // BLAKE3 hash
|
||||
```
|
||||
|
||||
Fingerprints are computed by XOR-ing the individual entry fingerprints within a range. This means:
|
||||
- The fingerprint of the empty set is `BLAKE3([])` (the hash of nothing)
|
||||
- Adding/removing an entry toggles its contribution via XOR
|
||||
- Equal sets produce equal fingerprints, so differing fingerprints prove that the sets differ
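
The XOR combination can be sketched without a hash function by treating the per-entry fingerprints as given inputs. In the sketch, `seed` stands in for the empty-set fingerprint and `range_fingerprint` is a hypothetical helper. It shows the two properties the reconciliation relies on: the result does not depend on entry order, and XOR-ing the same entry twice cancels it out.

```rust
type Fingerprint = [u8; 32];

/// XOR `other` into `acc`, byte by byte.
fn xor_into(acc: &mut Fingerprint, other: &Fingerprint) {
    for (a, b) in acc.iter_mut().zip(other.iter()) {
        *a ^= *b;
    }
}

/// Combine precomputed per-entry fingerprints, starting from `seed`
/// (a stand-in for the fingerprint of the empty set).
fn range_fingerprint(seed: Fingerprint, entries: &[Fingerprint]) -> Fingerprint {
    let mut acc = seed;
    for entry in entries {
        xor_into(&mut acc, entry);
    }
    acc
}
```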
|
||||
|
||||
## Range Concept
|
||||
|
||||
A `Range<K>` represents a half-open interval `[x, y)` in the key space, with special semantics:
|
||||
|
||||
```rust
|
||||
pub(crate) struct Range<K> {
|
||||
x: K,
|
||||
y: K,
|
||||
}
|
||||
```
|
||||
|
||||
- `x == y`: The entire set (all elements)
|
||||
- `x < y`: Standard half-open interval `[x, y)` — includes `x`, excludes `y`
|
||||
- `x > y`: Wrapping range — elements from `x` to end + beginning to `y`
|
||||
|
||||
This wrapping range concept allows the algorithm to work with circular key spaces where the "first" element might be anywhere.
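
The three cases translate into a short membership test. The `Range` below is a simplified local type with a hypothetical `contains` method, written to mirror the bullet points above rather than to reproduce the crate's code.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone)]
struct Range<K> {
    x: K,
    y: K,
}

impl<K: Ord> Range<K> {
    fn contains(&self, k: &K) -> bool {
        match self.x.cmp(&self.y) {
            // x == y: the whole key space
            Ordering::Equal => true,
            // x < y: ordinary half-open interval [x, y)
            Ordering::Less => &self.x <= k && k < &self.y,
            // x > y: wraps around, [x, end] plus [start, y)
            Ordering::Greater => &self.x <= k || k < &self.y,
        }
    }
}
```

With integer keys, `Range { x: 7, y: 3 }` contains 9 and 1 but not 5.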
|
||||
|
||||
## Protocol Messages
|
||||
|
||||
```rust
|
||||
pub type ProtocolMessage = crate::ranger::Message<SignedEntry>;
|
||||
```
|
||||
|
||||
### Message Structure
|
||||
|
||||
```rust
|
||||
pub struct Message<E: RangeEntry> {
|
||||
parts: Vec<MessagePart<E>>,
|
||||
}
|
||||
|
||||
pub enum MessagePart<E: RangeEntry> {
|
||||
RangeFingerprint(RangeFingerprint<E::Key>), // "Here's a fingerprint for this range"
|
||||
RangeItem(RangeItem<E>), // "Here are the entries in this range"
|
||||
}
|
||||
|
||||
pub struct RangeFingerprint<K> {
|
||||
range: Range<K>,
|
||||
fingerprint: Fingerprint,
|
||||
}
|
||||
|
||||
pub struct RangeItem<E: RangeEntry> {
|
||||
range: Range<E::Key>,
|
||||
values: Vec<(E, ContentStatus)>,
|
||||
have_local: bool, // If true, sender already has these entries
|
||||
}
|
||||
```
|
||||
|
||||
The `have_local` flag is an optimization: when a peer sends entries AND indicates it already has them locally, the receiver doesn't need to send its own entries in that range back.
|
||||
|
||||
### Wire Format
|
||||
|
||||
Messages are serialized using `postcard` (a compact serde format) and framed with a 4-byte big-endian length prefix via `SyncCodec`:
|
||||
|
||||
```
|
||||
┌─────────────────┬──────────────────────────────┐
|
||||
│ u32 BE length │ postcard-encoded Message │
|
||||
└─────────────────┴──────────────────────────────┘
|
||||
```
|
||||
|
||||
Max message size: 1 GiB (`MAX_MESSAGE_SIZE = 1024 * 1024 * 1024`).
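
A sketch of the length-prefixed framing, treating the postcard payload as opaque bytes. `encode_frame` and `decode_frame` are hypothetical helpers written for illustration; they are not the `SyncCodec` implementation, and the error type is a plain `String` to keep the example free of dependencies.

```rust
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

/// Prefix `payload` with its length as a 4-byte big-endian integer.
fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, String> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(format!("payload of {} bytes exceeds limit", payload.len()));
    }
    // Safe cast: the limit checked above is below u32::MAX.
    let len = payload.len() as u32;
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

/// Try to read one frame from the front of `buf`.
/// Returns the payload and the number of bytes consumed, or `None` if more bytes are needed.
fn decode_frame(buf: &[u8]) -> Result<Option<(&[u8], usize)>, String> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(format!("frame of {} bytes exceeds limit", len));
    }
    if buf.len() < 4 + len {
        return Ok(None);
    }
    Ok(Some((&buf[4..4 + len], 4 + len)))
}
```

Checking the declared length before waiting for the body lets a reader reject an oversized frame without buffering it.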
|
||||
|
||||
## Sync Algorithm Walkthrough
|
||||
|
||||
### 1. Initiation (Alice → Bob)
|
||||
|
||||
Alice generates the initial message:
|
||||
|
||||
```rust
fn init<S: Store<E>>(store: &mut S) -> Result<Self, S::Error> {
    let x = store.get_first()?;           // First key, or default
    let range = Range::new(x.clone(), x); // x == y: the "all elements" range
    let fingerprint = store.get_fingerprint(&range)?;
    // `parts` holds `MessagePart` values, so wrap the fingerprint in its variant
    let part = MessagePart::RangeFingerprint(RangeFingerprint { range, fingerprint });
    Ok(Message { parts: vec![part] })
}
```
|
||||
|
||||
This sends a single fingerprint covering the entire set.
|
||||
|
||||
### 2. Processing (Bob processes Alice's message)
|
||||
|
||||
For each part in the message:
|
||||
|
||||
**Case 1: RangeFingerprint matches local fingerprint** → Nothing to do, sets are equal in this range.
|
||||
|
||||
**Case 2: RangeFingerprint is empty OR range has ≤ 1 local entry** → Send all entries in the range as a `RangeItem`.
|
||||
|
||||
**Case 3: Recurse** → Split the range into `split_factor` partitions, compute fingerprints, and send either `RangeFingerprint` (if partition is large) or `RangeItem` (if partition is small enough, ≤ `max_set_size`).
|
||||
|
||||
### 3. Processing RangeItem
|
||||
|
||||
When a peer receives a `RangeItem`:
|
||||
|
||||
1. **Validate** each incoming entry using `validate_cb`
|
||||
2. **Insert** valid entries via `Store::put()` (which handles prefix deletion)
|
||||
3. **Notify** via `on_insert_cb` for actually-inserted entries
|
||||
4. If `have_local` is false, compute the **diff** — entries in the local range not present in the received set — and send them back
|
||||
|
||||
### Configuration
|
||||
|
||||
```rust
|
||||
struct SyncConfig {
|
||||
max_set_size: usize, // Default: 1 — entries to send before using fingerprints
|
||||
split_factor: usize, // Default: 2 — number of partitions per recursion step
|
||||
}
|
||||
```
|
||||
|
||||
With `max_set_size = 1` and `split_factor = 2`, the algorithm behaves like a binary search: each fingerprint mismatch splits the range in two and sends fingerprints for both halves.
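
The effect of the two parameters can be illustrated with a simplified partitioning step. This is a sketch under assumptions: keys are plain `u32` values, `split_range` and `Part` are hypothetical names, partitions are equal-sized chunks of the local entries, and the `Fingerprint` variant only records which keys it covers instead of a real hash. The crate may choose partition boundaries differently.

```rust
#[derive(Debug, PartialEq)]
enum Part {
    /// Partition small enough to send its entries directly.
    Items(Vec<u32>),
    /// Partition summarized by a fingerprint (placeholder fields only).
    Fingerprint { first: u32, last: u32, count: usize },
}

fn split_range(sorted_keys: &[u32], split_factor: usize, max_set_size: usize) -> Vec<Part> {
    if sorted_keys.is_empty() || split_factor == 0 {
        return Vec::new();
    }
    // ceil(len / split_factor), at least 1 because the slice is non-empty
    let chunk_len = (sorted_keys.len() + split_factor - 1) / split_factor;
    sorted_keys
        .chunks(chunk_len)
        .map(|chunk| {
            if chunk.len() <= max_set_size {
                Part::Items(chunk.to_vec())
            } else {
                Part::Fingerprint {
                    first: chunk[0],
                    last: chunk[chunk.len() - 1],
                    count: chunk.len(),
                }
            }
        })
        .collect()
}
```

With the defaults, four mismatching keys become two fingerprinted halves, and two keys become two single-entry item parts, which is where the recursion bottoms out.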
|
||||
|
||||
## Store Trait
|
||||
|
||||
The `Store` trait provides the interface that the reconciliation algorithm needs:
|
||||
|
||||
```rust
|
||||
pub trait Store<E: RangeEntry>: Sized {
|
||||
type Error: Debug + Send + Sync + Into<anyhow::Error> + 'static;
|
||||
type RangeIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
|
||||
type ParentIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
|
||||
|
||||
fn get_first(&mut self) -> Result<E::Key, Self::Error>;
|
||||
fn get_fingerprint(&mut self, range: &Range<E::Key>) -> Result<Fingerprint, Self::Error>;
|
||||
fn entry_put(&mut self, entry: E) -> Result<(), Self::Error>;
|
||||
fn get_range(&mut self, range: Range<E::Key>) -> Result<Self::RangeIterator<'_>, Self::Error>;
|
||||
fn prefixes_of(&mut self, key: &E::Key) -> Result<Self::ParentIterator<'_>, Self::Error>;
|
||||
fn remove_prefix_filtered(&mut self, prefix: &E::Key, predicate: impl Fn(&E::Value) -> bool) -> Result<usize, Self::Error>;
|
||||
fn initial_message(&mut self) -> Result<Message<E>, Self::Error>;
|
||||
async fn process_message<F, F2, F3>(...) -> Result<Option<Message<E>>, Self::Error>;
|
||||
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error>;
|
||||
}
|
||||
```
|
||||
|
||||
### Insert Semantics in `Store::put()`
|
||||
|
||||
The `put` method implements the CRDT insert logic:
|
||||
|
||||
```rust
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error> {
    // 1. Check prefix entries: if any parent entry has value >= new entry, reject
    for prefix_entry in self.prefixes_of(entry.key())? {
        let prefix_entry = prefix_entry?; // the iterator yields Result<E, Self::Error>
        if entry.value() <= prefix_entry.value() {
            return Ok(InsertOutcome::NotInserted);
        }
    }

    // 2. Remove entries whose key is prefixed by new entry's key AND whose value is <=
    let removed = self.remove_prefix_filtered(entry.key(), |v| entry.value() >= v)?;

    // 3. Insert the new entry
    self.entry_put(entry)?;
    Ok(InsertOutcome::Inserted { removed })
}
```
|
||||
|
||||
### InsertOutcome
|
||||
|
||||
```rust
|
||||
enum InsertOutcome {
|
||||
NotInserted, // A newer or equal entry already exists
|
||||
Inserted { removed: usize }, // Successfully inserted; reports removed entries
|
||||
}
|
||||
```
|
||||
|
||||
## Sync Flow at the Protocol Level
|
||||
|
||||
The `Replica` type provides the sync interface:
|
||||
|
||||
```rust
|
||||
// Create initial message for sync
|
||||
fn sync_initial_message(&mut self) -> anyhow::Result<ProtocolMessage>
|
||||
|
||||
// Process an incoming message and produce optional reply
|
||||
async fn sync_process_message(
|
||||
&mut self,
|
||||
message: ProtocolMessage,
|
||||
from_peer: PeerIdBytes,
|
||||
state: &mut SyncOutcome,
|
||||
) -> Result<Option<ProtocolMessage>, anyhow::Error>
|
||||
```
|
||||
|
||||
### SyncOutcome
|
||||
|
||||
Tracks the result of a sync session:
|
||||
|
||||
```rust
|
||||
pub struct SyncOutcome {
|
||||
pub heads_received: AuthorHeads, // Latest timestamps per author from remote
|
||||
pub num_recv: usize, // Number of entries received
|
||||
pub num_sent: usize, // Number of entries sent
|
||||
}
|
||||
```
|
||||
|
||||
## Network Protocol (Codec)
|
||||
|
||||
The sync protocol operates over a QUIC bidirectional stream:
|
||||
|
||||
1. **Alice** (initiator) sends `Message::Init { namespace, message }`
2. **Bob** (responder) validates the namespace and either:
   - Accepts and processes the initial message
   - Rejects with `Message::Abort { reason }`
3. Both peers exchange `Message::Sync(message)` rounds until one side has no reply (convergence reached)
|
||||
|
||||
The `BobState` manages the responder side, tracking namespace and `SyncOutcome` progress across message rounds.
|
||||
|
||||
### Abort Reasons
|
||||
|
||||
```rust
|
||||
pub enum AbortReason {
|
||||
NotFound, // Namespace not available
|
||||
AlreadySyncing, // Already syncing this namespace
|
||||
InternalServerError,
|
||||
}
|
||||
```
|
||||
|
||||
### Concurrent Sync Prevention
|
||||
|
||||
When both peers try to sync with each other simultaneously, the system uses a deterministic tiebreaker based on comparing `EndpointId` bytes — the peer with the larger ID accepts, the other connects.
|
||||
@@ -0,0 +1,257 @@
|
||||
# iroh-docs: Store and Persistence
|
||||
|
||||
## Store Architecture
|
||||
|
||||
The store is implemented in `store::fs::Store` using `redb`, an embedded key-value database. It supports two modes:
|
||||
|
||||
- **In-memory**: `Store::memory()` — backed by a `Vec<u8>` via `redb::backends::InMemoryBackend`
|
||||
- **Persistent**: `Store::persistent(path)` — backed by a single file on disk
|
||||
|
||||
Both modes use the same `redb` table structure.
|
||||
|
||||
## redb Table Schema
|
||||
|
||||
### Authors Table
|
||||
```
|
||||
Table: "authors-1"
|
||||
Key: [u8; 32] (AuthorId)
|
||||
Value: [u8; 32] (Author secret key bytes)
|
||||
```
|
||||
|
||||
### Namespaces Table
|
||||
```
|
||||
Table: "namespaces-2"
|
||||
Key: [u8; 32] (NamespaceId)
|
||||
Value: (u8, [u8; 32]) (CapabilityKind, key bytes)
|
||||
```
|
||||
|
||||
The `CapabilityKind` discriminates between `Write = 1` (full key stored) and `Read = 2` (only the public key / namespace ID stored).
|
||||
|
||||
### Records Table (Primary)
|
||||
```
|
||||
Table: "records-1"
|
||||
Key: (NamespaceId, AuthorId, key_bytes) = ([u8; 32], [u8; 32], &[u8])
|
||||
Value: (timestamp, namespace_sig, author_sig, len, hash) = (u64, &[u8; 64], &[u8; 64], u64, &[u8; 32])
|
||||
```
|
||||
|
||||
This is the main table storing all document entries. The key layout `(namespace, author, key)` enables efficient range queries for the sync algorithm.
|
||||
|
||||
### Latest-Per-Author Table
|
||||
```
|
||||
Table: "latest-by-author-1"
|
||||
Key: (NamespaceId, AuthorId) = (&[u8; 32], &[u8; 32])
|
||||
Value: (timestamp, key_bytes) = (u64, &[u8])
|
||||
```
|
||||
|
||||
Used to quickly determine the latest entry timestamp for each author, supporting `AuthorHeads` computation and `has_news_for_us()` checks.
|
||||
|
||||
### Records-By-Key Table (Index)
|
||||
```
|
||||
Table: "records-by-key-1"
|
||||
Key: (NamespaceId, key_bytes, AuthorId) = (&[u8; 32], &[u8], &[u8; 32])
|
||||
Value: ()
|
||||
```
|
||||
|
||||
An index table that enables efficient queries by key prefix, supporting `Query::key_prefix()` and `Query::key_exact()` lookups.
|
||||
|
||||
### Namespace Peers Table (Multimap)
|
||||
```
|
||||
MultimapTable: "sync-peers-1"
|
||||
Key: &[u8; 32] (NamespaceId)
|
||||
Value: (Nanos, &PeerIdBytes) (timestamp_nanos, peer_id)
|
||||
```
|
||||
|
||||
Stores up to 5 (`PEERS_PER_DOC_CACHE_SIZE`) recently-useful peers per namespace. It behaves as an LRU cache: registering a new peer while the cache is full evicts the oldest one.
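
The eviction behavior can be sketched in memory. `PeerCache` is a hypothetical type that keeps `(timestamp_nanos, peer_id)` pairs in a `Vec` rather than a redb multimap, and the caller supplies the timestamp; re-registering a known peer refreshes its timestamp, which is an assumption of this sketch.

```rust
const PEERS_PER_DOC_CACHE_SIZE: usize = 5;

type PeerIdBytes = [u8; 32];

#[derive(Default)]
struct PeerCache {
    /// (timestamp_nanos, peer_id) pairs for one namespace.
    peers: Vec<(u64, PeerIdBytes)>,
}

impl PeerCache {
    fn register_useful_peer(&mut self, now_nanos: u64, peer: PeerIdBytes) {
        // Drop any stale record of this peer so it is re-added with a fresh timestamp.
        self.peers.retain(|(_, p)| *p != peer);
        if self.peers.len() >= PEERS_PER_DOC_CACHE_SIZE {
            // Evict the entry with the smallest (oldest) timestamp.
            let oldest = self
                .peers
                .iter()
                .enumerate()
                .min_by_key(|(_, entry)| entry.0)
                .map(|(idx, _)| idx);
            if let Some(idx) = oldest {
                self.peers.remove(idx);
            }
        }
        self.peers.push((now_nanos, peer));
    }

    fn contains(&self, peer: &PeerIdBytes) -> bool {
        self.peers.iter().any(|(_, p)| p == peer)
    }
}
```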
|
||||
|
||||
### Download Policy Table
|
||||
```
|
||||
Table: "download-policy-1"
|
||||
Key: &[u8; 32] (NamespaceId)
|
||||
Value: &[u8] (postcard-encoded DownloadPolicy)
|
||||
```
|
||||
|
||||
Per-namespace download policies controlling which content blobs to automatically download.
|
||||
|
||||
## Store Operations
|
||||
|
||||
### Transaction Model
|
||||
|
||||
The `Store` uses a "current transaction" approach:
|
||||
|
||||
```rust
|
||||
enum CurrentTransaction {
|
||||
None,
|
||||
Read(ReadOnlyTables),
|
||||
Write(TransactionAndTables),
|
||||
}
|
||||
```
|
||||
|
||||
- Read operations obtain a read snapshot
|
||||
- Write operations batch into a write transaction
|
||||
- Transactions older than `MAX_COMMIT_DELAY` (500ms) are automatically committed
|
||||
- `flush()` commits any pending write transaction
|
||||
|
||||
### Core Methods
|
||||
|
||||
```rust
|
||||
// Create/open/close replicas
|
||||
fn new_replica(&mut self, namespace: NamespaceSecret) -> Result<Replica<'_>>;
|
||||
fn open_replica(&mut self, namespace_id: &NamespaceId) -> Result<Replica<'_>>;
|
||||
fn close_replica(&mut self, id: NamespaceId);
|
||||
fn import_namespace(&mut self, capability: Capability) -> Result<ImportNamespaceOutcome>;
|
||||
|
||||
// Author management
|
||||
fn new_author<R: CryptoRng>(&mut self, rng: &mut R) -> Result<Author>;
|
||||
fn import_author(&mut self, author: Author) -> Result<()>;
|
||||
fn get_author(&mut self, author_id: &AuthorId) -> Result<Option<Author>>;
|
||||
fn delete_author(&mut self, author: AuthorId) -> Result<()>;
|
||||
|
||||
// Queries
|
||||
fn get_many(&mut self, namespace: NamespaceId, query: impl Into<Query>) -> Result<QueryIterator>;
|
||||
fn get_exact(&mut self, namespace: NamespaceId, author: AuthorId, key: impl AsRef<[u8]>, include_empty: bool) -> Result<Option<SignedEntry>>;
|
||||
fn get_latest_for_each_author(&mut self, namespace: NamespaceId) -> Result<LatestIterator<'_>>;
|
||||
|
||||
// Sync support
|
||||
fn has_news_for_us(&mut self, namespace: NamespaceId, heads: &AuthorHeads) -> Result<Option<NonZeroU64>>;
|
||||
fn get_sync_peers(&mut self, namespace: &NamespaceId) -> Result<Option<PeersIter>>;
|
||||
fn register_useful_peer(&mut self, namespace: NamespaceId, peer: PeerIdBytes) -> Result<()>;
|
||||
|
||||
// Content
|
||||
fn content_hashes(&mut self) -> Result<ContentHashesIterator>;
|
||||
```
|
||||
|
||||
### ImportNamespaceOutcome
|
||||
|
||||
```rust
|
||||
pub enum ImportNamespaceOutcome {
|
||||
Inserted, // New namespace created
|
||||
Upgraded, // Existing namespace upgraded from Read to Write
|
||||
NoChange, // Namespace already existed with same or higher capability
|
||||
}
|
||||
```
|
||||
|
||||
## Query System
|
||||
|
||||
The `Query` type supports flexible entry lookups:
|
||||
|
||||
```rust
|
||||
pub struct Query {
|
||||
kind: QueryKind,
|
||||
filter_author: AuthorFilter,
|
||||
filter_key: KeyFilter,
|
||||
limit: Option<u64>,
|
||||
offset: u64,
|
||||
include_empty: bool,
|
||||
sort_direction: SortDirection,
|
||||
}
|
||||
```
|
||||
|
||||
### Query Kinds
|
||||
|
||||
```rust
|
||||
enum QueryKind {
|
||||
Flat(FlatQuery), // Returns all matching entries
|
||||
SingleLatestPerKey(SingleLatestPerKeyQuery), // Returns only latest entry per key
|
||||
}
|
||||
```
|
||||
|
||||
- **Flat**: Returns all entries matching the filters, sorted by `(namespace, author, key)` or `(namespace, key, author)` depending on `SortBy`
|
||||
- **SingleLatestPerKey**: Groups by key and returns only the latest entry (by record value ordering) per key
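
The grouping done by `SingleLatestPerKey` can be sketched over a slice of rows. `Row` and `single_latest_per_key` are hypothetical names, the author is reduced to a `u8`, and "latest" is decided by timestamp alone, whereas the text above describes the full record value ordering.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: Vec<u8>,
    author: u8,
    timestamp: u64,
}

/// Keep only the newest row for each key, returned in key order.
fn single_latest_per_key(rows: &[Row]) -> Vec<Row> {
    let mut latest: BTreeMap<&[u8], &Row> = BTreeMap::new();
    for row in rows {
        latest
            .entry(row.key.as_slice())
            .and_modify(|current| {
                if row.timestamp > current.timestamp {
                    *current = row;
                }
            })
            .or_insert(row);
    }
    latest.into_values().cloned().collect()
}
```

A flat query over the same input would return every row; this variant collapses the two authors' entries for a shared key into the newer one.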
|
||||
|
||||
### Filters
|
||||
|
||||
```rust
|
||||
enum KeyFilter {
|
||||
Any, // Match all keys
|
||||
Exact(Bytes), // Exact key match
|
||||
Prefix(Bytes), // Key starts with prefix
|
||||
}
|
||||
|
||||
enum AuthorFilter {
|
||||
Any, // Match all authors
|
||||
Exact(AuthorId), // Match specific author
|
||||
}
|
||||
```
|
||||
|
||||
### Builder Pattern
|
||||
|
||||
```rust
|
||||
// Get all entries
|
||||
Query::all()
|
||||
|
||||
// Get entries by author
|
||||
Query::author(author_id)
|
||||
|
||||
// Get entries by key prefix
|
||||
Query::key_prefix(b"/path/")
|
||||
|
||||
// Get single latest entry per key
|
||||
Query::single_latest_per_key()
|
||||
.key_prefix(b"/path/")
|
||||
.author(author_id)
|
||||
```
|
||||
|
||||
## Download Policy
|
||||
|
||||
Controls which content blobs to automatically download after sync:
|
||||
|
||||
```rust
|
||||
pub enum DownloadPolicy {
|
||||
NothingExcept(Vec<FilterKind>), // Only download matching entries
|
||||
EverythingExcept(Vec<FilterKind>), // Download all except matching (default)
|
||||
}
|
||||
|
||||
pub enum FilterKind {
|
||||
Prefix(Bytes), // Matches keys starting with bytes
|
||||
Exact(Bytes), // Matches exact key
|
||||
}
|
||||
```
|
||||
|
||||
Default: `EverythingExcept(Vec::new())` — download everything.
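
How a policy decides on a given key can be sketched as follows. The enums mirror the ones above but use `Vec<u8>` instead of `Bytes`, and the method names `matches` and `should_download` are assumptions chosen for this illustration.

```rust
enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(prefix) => key.starts_with(prefix),
            FilterKind::Exact(exact) => key == exact.as_slice(),
        }
    }
}

enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    /// Should the content for an entry with this key be downloaded?
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            DownloadPolicy::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            DownloadPolicy::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}
```

With an empty filter list, `EverythingExcept` accepts every key and `NothingExcept` accepts none.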
|
||||
|
||||
## PublicKeyStore
|
||||
|
||||
The `PublicKeyStore` trait caches expanded `ed25519_dalek::VerifyingKey` objects to avoid repeated curve point decompression:
|
||||
|
||||
```rust
|
||||
pub trait PublicKeyStore {
|
||||
fn public_key(&self, id: &[u8; 32]) -> Result<VerifyingKey, SignatureError>;
|
||||
fn namespace_key(&self, bytes: &NamespaceId) -> Result<NamespacePublicKey, SignatureError>;
|
||||
fn author_key(&self, bytes: &AuthorId) -> Result<AuthorPublicKey, SignatureError>;
|
||||
}
|
||||
```
|
||||
|
||||
The `MemPublicKeyStore` implementation uses `Arc<RwLock<HashMap<[u8; 32], VerifyingKey>>>` for thread-safe caching.
|
||||
|
||||
The `Store` itself implements `PublicKeyStore`, leveraging its redb tables for author storage and the in-memory cache for fast verification.
|
||||
|
||||
## StoreInstance
|
||||
|
||||
```rust
|
||||
pub struct StoreInstance<'a> {
|
||||
namespace: NamespaceId,
|
||||
store: &'a mut Store,
|
||||
}
|
||||
```
|
||||
|
||||
A `StoreInstance` bundles a namespace ID with a mutable reference to the store, providing the `ranger::Store<SignedEntry>` implementation for the sync algorithm. This is what `Replica` uses internally to perform sync operations.
|
||||
|
||||
## Replica
|
||||
|
||||
```rust
|
||||
pub struct Replica<'a, I = Box<ReplicaInfo>> {
|
||||
store: StoreInstance<'a>,
|
||||
info: I,
|
||||
}
|
||||
```
|
||||
|
||||
`Replica` is the primary user-facing type for document operations. It combines:
|
||||
- A `StoreInstance` for data access
|
||||
- `ReplicaInfo` for metadata (capability, subscribers, content status callback)
|
||||
|
||||
Key methods:
|
||||
- `insert(key, author, hash, len)` — Insert a new entry
|
||||
- `delete_prefix(prefix, author)` — Delete entries by key prefix
|
||||
- `insert_remote_entry(entry, from, content_status)` — Insert from sync
|
||||
- `hash_and_insert(key, author, data)` — Hash data and insert
|
||||
- `sync_initial_message()` / `sync_process_message()` — Sync protocol operations
|
||||
@@ -0,0 +1,343 @@
|
||||
# iroh-docs: Engine and Live Sync
|
||||
|
||||
## Overview
|
||||
|
||||
The `Engine` is the top-level coordinator for live document synchronization. It brings together:
|
||||
|
||||
1. **SyncHandle/Actor** — Single-threaded actor for all store and replica operations
|
||||
2. **LiveActor** — Async event loop coordinating sync, gossip, and content downloads
|
||||
3. **GossipState** — Integration with `iroh-gossip` for broadcasting updates
|
||||
4. **Blobs/Downloader** — Integration with `iroh-blobs` for content transfer
|
||||
|
||||
## Engine
|
||||
|
||||
```rust
|
||||
pub struct Engine {
|
||||
pub endpoint: Endpoint,
|
||||
pub sync: SyncHandle,
|
||||
pub default_author: DefaultAuthor,
|
||||
to_live_actor: mpsc::Sender<ToLiveActor>,
|
||||
actor_handle: AbortOnDropHandle<()>,
|
||||
content_status_cb: ContentStatusCallback,
|
||||
blob_store: iroh_blobs::api::Store,
|
||||
_gc_protect_task: AbortOnDropHandle<()>,
|
||||
}
|
||||
```
|
||||
|
||||
### Initialization
|
||||
|
||||
```rust
|
||||
Engine::spawn(
|
||||
endpoint, // iroh Endpoint for QUIC connections
|
||||
gossip, // iroh-gossip instance
|
||||
replica_store, // Store for document data
|
||||
bao_store, // iroh-blobs Store for content blobs
|
||||
downloader, // Downloader for fetching blobs
|
||||
default_author_storage, // Where to persist the default author
|
||||
protect_cb, // Optional GC protection callback
|
||||
) -> Result<Self>
|
||||
```
|
||||
|
||||
During spawn:
|
||||
1. A `ContentStatusCallback` is created that checks blob availability in `iroh-blobs`
|
||||
2. A `SyncHandle` actor is spawned on a dedicated thread
|
||||
3. A `LiveActor` is spawned as a tokio task
|
||||
4. The default author is loaded or created
|
||||
5. A GC protection task is started (if callback provided)
|
||||
|
||||
### Key Engine Methods
|
||||
|
||||
```rust
|
||||
// Start syncing a document with given peers
|
||||
async fn start_sync(&self, namespace: NamespaceId, peers: Vec<EndpointAddr>) -> Result<()>
|
||||
|
||||
// Stop syncing and leave gossip swarm
|
||||
async fn leave(&self, namespace: NamespaceId, kill_subscribers: bool) -> Result<()>
|
||||
|
||||
// Subscribe to document events
|
||||
async fn subscribe(&self, namespace: NamespaceId) -> Result<impl Stream<Item = Result<LiveEvent>>>
|
||||
|
||||
// Handle incoming QUIC connections
|
||||
async fn handle_connection(&self, conn: Connection) -> Result<()>
|
||||
|
||||
// Shutdown the engine
|
||||
async fn shutdown(&self) -> Result<()>
|
||||
```
|
||||
|
||||
### GC Protection
|
||||
|
||||
The `ProtectCallbackHandler` bridges iroh-docs with iroh-blobs' garbage collection:
|
||||
|
||||
```rust
|
||||
let (handler, protect_cb) = ProtectCallbackHandler::new();
|
||||
// protect_cb goes into iroh-blobs GC config
|
||||
// handler goes into Engine::spawn
|
||||
```
|
||||
|
||||
When iroh-blobs runs GC, it calls `protect_cb` which queries the docs store for all content hashes, ensuring blobs referenced by document entries are not garbage-collected.
|
||||
|
||||
## SyncHandle / Actor
|
||||
|
||||
The `SyncHandle` is a handle to a single-threaded actor that processes all store and replica operations sequentially:
|
||||
|
||||
```rust
|
||||
pub struct SyncHandle {
|
||||
tx: async_channel::Sender<Action>,
|
||||
join_handle: Arc<Option<std::thread::JoinHandle<()>>>,
|
||||
metrics: Arc<Metrics>,
|
||||
}
|
||||
```
|
||||
|
||||
### Actor Architecture
|
||||
|
||||
```
External Code ──async──▶ SyncHandle ──channel──▶ Actor Thread
                                                      │
                                                 Store (redb)
                                                 Replica operations
                                                 Flush on timeout (500ms)
```
|
||||
|
||||
The actor runs on a **dedicated OS thread** (not a tokio task), using `tokio::runtime::Builder::new_current_thread()` internally. This ensures store operations are never concurrent.
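
The handle-plus-actor pattern can be sketched with the standard library alone. This is not the crate's implementation: it swaps `async_channel` and the current-thread tokio runtime for `std::sync::mpsc` and a plain thread, uses a `BTreeMap` in place of the redb store, and the names `Handle`, `Action::Insert`, and `Action::Get` are hypothetical. What it keeps is the design point: one thread owns the store, and every request carries its own reply channel.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

enum Action {
    Insert { key: Vec<u8>, value: u64, reply: mpsc::Sender<Option<u64>> },
    Get { key: Vec<u8>, reply: mpsc::Sender<Option<u64>> },
    Shutdown,
}

struct Handle {
    tx: mpsc::Sender<Action>,
    join: Option<thread::JoinHandle<()>>,
}

impl Handle {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel::<Action>();
        let join = thread::spawn(move || {
            // The store lives on this thread only, so access is never concurrent.
            let mut store: BTreeMap<Vec<u8>, u64> = BTreeMap::new();
            for action in rx {
                match action {
                    Action::Insert { key, value, reply } => {
                        let _ = reply.send(store.insert(key, value));
                    }
                    Action::Get { key, reply } => {
                        let _ = reply.send(store.get(&key).copied());
                    }
                    Action::Shutdown => break,
                }
            }
        });
        Handle { tx, join: Some(join) }
    }

    /// Insert a value and return the previous one, if any.
    fn insert(&self, key: &[u8], value: u64) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Insert { key: key.to_vec(), value, reply }).ok()?;
        rx.recv().ok().flatten()
    }

    fn get(&self, key: &[u8]) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Get { key: key.to_vec(), reply }).ok()?;
        rx.recv().ok().flatten()
    }

    fn shutdown(mut self) {
        let _ = self.tx.send(Action::Shutdown);
        if let Some(join) = self.join.take() {
            let _ = join.join();
        }
    }
}
```

Because requests are processed strictly in arrival order, callers get sequential consistency without any locking around the store.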
|
||||
|
||||
### Action Types
|
||||
|
||||
```rust
|
||||
enum Action {
|
||||
ImportAuthor { author, reply },
|
||||
ExportAuthor { author, reply },
|
||||
DeleteAuthor { author, reply },
|
||||
ImportNamespace { capability, reply },
|
||||
ListAuthors { reply },
|
||||
ListReplicas { reply },
|
||||
ContentHashes { reply },
|
||||
FlushStore { reply },
|
||||
Replica(NamespaceId, ReplicaAction),
|
||||
Shutdown { reply },
|
||||
}
|
||||
|
||||
enum ReplicaAction {
|
||||
Open { reply, opts },
|
||||
Close { reply },
|
||||
GetState { reply },
|
||||
SetSync { sync, reply },
|
||||
Subscribe { sender, reply },
|
||||
Unsubscribe { sender, reply },
|
||||
InsertLocal { author, key, hash, len, reply },
|
||||
DeletePrefix { author, key, reply },
|
||||
InsertRemote { entry, from, content_status, reply },
|
||||
SyncInitialMessage { reply },
|
||||
SyncProcessMessage { message, from, state, reply },
|
||||
GetSyncPeers { reply },
|
||||
RegisterUsefulPeer { peer, reply },
|
||||
GetExact { author, key, include_empty, reply },
|
||||
GetMany { query, reply },
|
||||
DropReplica { reply },
|
||||
ExportSecretKey { reply },
|
||||
HasNewsForUs { heads, reply },
|
||||
SetDownloadPolicy { policy, reply },
|
||||
GetDownloadPolicy { reply },
|
||||
}
|
||||
```
|
||||
|
||||
### Replica Opening
|
||||
|
||||
When a replica is opened via the actor, an `OpenReplica` struct is created:
|
||||
|
||||
```rust
|
||||
struct OpenReplica {
|
||||
info: ReplicaInfo, // Capability, subscribers, content status callback
|
||||
sync: bool, // Whether to accept sync requests
|
||||
handles: usize, // Reference count for open handles
|
||||
}
|
||||
```
|
||||
|
||||
Multiple handles to the same replica are supported via reference counting.
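
A minimal sketch of the reference counting on `handles`, assuming the replica is dropped from the open set when the count reaches zero. The `Replicas` map and the use of `u64` in place of `NamespaceId` are hypothetical simplifications.

```rust
use std::collections::HashMap;

// Reduced to the one field that matters for this sketch.
struct OpenReplica {
    handles: usize,
}

// Hypothetical registry of open replicas; `u64` stands in for `NamespaceId`.
#[derive(Default)]
struct Replicas(HashMap<u64, OpenReplica>);

impl Replicas {
    // Opens the replica (or bumps its count) and returns the new handle count.
    fn open(&mut self, id: u64) -> usize {
        let replica = self.0.entry(id).or_insert(OpenReplica { handles: 0 });
        replica.handles += 1;
        replica.handles
    }

    // Returns true when the last handle was closed and the replica was removed.
    fn close(&mut self, id: u64) -> bool {
        let Some(replica) = self.0.get_mut(&id) else {
            return false;
        };
        replica.handles -= 1;
        if replica.handles == 0 {
            self.0.remove(&id);
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut replicas = Replicas::default();
    replicas.open(1);
    replicas.open(1);
    println!("fully closed: {}", replicas.close(1));
    println!("fully closed: {}", replicas.close(1));
}
```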
|
||||
|
||||
## LiveActor
|
||||
|
||||
The `LiveActor` is the central async coordinator:
|
||||
|
||||
```rust
|
||||
pub struct LiveActor {
|
||||
inbox: mpsc::Receiver<ToLiveActor>,
|
||||
sync: SyncHandle,
|
||||
endpoint: Endpoint,
|
||||
bao_store: Store,
|
||||
downloader: Downloader,
|
||||
memory_lookup: MemoryLookup,
|
||||
replica_events_tx: async_channel::Sender<Event>,
|
||||
replica_events_rx: async_channel::Receiver<Event>,
|
||||
sync_actor_tx: mpsc::Sender<ToLiveActor>,
|
||||
gossip: GossipState,
|
||||
running_sync_connect: JoinSet<SyncConnectRes>,
|
||||
running_sync_accept: JoinSet<SyncAcceptRes>,
|
||||
download_tasks: JoinSet<DownloadRes>,
|
||||
missing_hashes: HashSet<Hash>,
|
||||
queued_hashes: QueuedHashes,
|
||||
hash_providers: ProviderNodes,
|
||||
subscribers: SubscribersMap,
|
||||
state: NamespaceStates,
|
||||
metrics: Arc<Metrics>,
|
||||
}
|
||||
```
|
||||
|
||||
### Event Loop
|
||||
|
||||
The `LiveActor::run_inner()` loop uses `tokio::select!` with biased polling:
|
||||
|
||||
```rust
|
||||
tokio::select! {
|
||||
biased;
|
||||
msg = self.inbox.recv() => { /* handle actor messages */ }
|
||||
event = self.replica_events_rx.recv() => { /* handle replica insert events */ }
|
||||
res = self.running_sync_connect.join_next() => { /* sync connect finished */ }
|
||||
res = self.running_sync_accept.join_next() => { /* sync accept finished */ }
|
||||
res = self.download_tasks.join_next() => { /* download completed */ }
|
||||
res = self.gossip.progress() => { /* gossip task progress */ }
|
||||
}
|
||||
```
|
||||
|
||||
### ToLiveActor Messages
|
||||
|
||||
```rust
|
||||
pub enum ToLiveActor {
|
||||
StartSync { namespace, peers, reply },
|
||||
Leave { namespace, kill_subscribers, reply },
|
||||
Shutdown { reply },
|
||||
Subscribe { namespace, sender, reply },
|
||||
HandleConnection { conn },
|
||||
AcceptSyncRequest { namespace, peer, reply },
|
||||
IncomingSyncReport { from, report },
|
||||
NeighborContentReady { namespace, node, hash },
|
||||
NeighborUp { namespace, peer },
|
||||
NeighborDown { namespace, peer },
|
||||
}
|
||||
```
|
||||
|
||||
### Gossip Operations (Op)
|
||||
|
||||
```rust
|
||||
pub enum Op {
|
||||
Put(SignedEntry), // New entry inserted
|
||||
ContentReady(Hash), // Content blob now available
|
||||
SyncReport(SyncReport), // Heads summary after sync
|
||||
}
|
||||
```
|
||||
|
||||
Gossip broadcasts `Op` messages to all swarm participants. When a `Put` is received, the entry is inserted into the local replica. When a `ContentReady` is received, peers know they can download the blob. When a `SyncReport` is received, peers check `has_news_for_us()` to decide if they should sync.
|
||||
|
||||
### Content Download Flow
|
||||
|
||||
1. When a `RemoteInsert` event occurs with `should_download: true`, the entry's content hash is queued for download
|
||||
2. The `LiveActor` uses `iroh_blobs::downloader::Downloader` to fetch the blob
|
||||
3. Known providers (peers who had `ContentStatus::Complete`) are used as download sources
|
||||
4. On download completion, a `LiveEvent::ContentReady` event is emitted
|
||||
|
||||
### LiveEvent (Public API)
|
||||
|
||||
```rust
|
||||
pub enum LiveEvent {
|
||||
InsertLocal { entry: Entry },
|
||||
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
|
||||
ContentReady { hash: Hash },
|
||||
PendingContentReady,
|
||||
NeighborUp(PublicKey),
|
||||
NeighborDown(PublicKey),
|
||||
SyncFinished(SyncEvent),
|
||||
}
|
||||
```
|
||||
|
||||
`SyncEvent` wraps `SyncFinished`:
|
||||
|
||||
```rust
|
||||
pub struct SyncFinished {
|
||||
pub namespace: NamespaceId,
|
||||
pub peer: PublicKey,
|
||||
pub outcome: SyncOutcome,
|
||||
pub timings: Timings,
|
||||
}
|
||||
```
|
||||
|
||||
## NamespaceStates
|
||||
|
||||
```rust
|
||||
pub struct NamespaceStates(BTreeMap<NamespaceId, NamespaceState>);
|
||||
|
||||
struct NamespaceState {
|
||||
nodes: BTreeMap<EndpointId, PeerState>,
|
||||
may_emit_ready: bool,
|
||||
}
|
||||
```
|
||||
|
||||
Each peer has a `PeerState` tracking sync progress:
|
||||
|
||||
```rust
|
||||
struct PeerState {
|
||||
state: SyncState, // Idle or Running
|
||||
resync_requested: bool, // Whether a resync was requested during active sync
|
||||
last_sync: Option<(Instant, Result<SyncFinished>)>,
|
||||
}
|
||||
```
|
||||
|
||||
This state machine prevents concurrent syncs with the same peer for the same namespace and queues resync requests when needed.
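
The transitions can be sketched as follows. This is an illustration under assumptions: the method names `request_sync` and `finish_sync` are hypothetical, `last_sync` is omitted, and the real code may order its checks differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum SyncState {
    Idle,
    Running,
}

#[derive(Debug)]
struct PeerState {
    state: SyncState,
    resync_requested: bool,
}

impl PeerState {
    fn new() -> Self {
        PeerState { state: SyncState::Idle, resync_requested: false }
    }

    // Returns true if the caller should start a sync now. While a sync is
    // running, the request is only recorded.
    fn request_sync(&mut self) -> bool {
        match self.state {
            SyncState::Idle => {
                self.state = SyncState::Running;
                true
            }
            SyncState::Running => {
                self.resync_requested = true;
                false
            }
        }
    }

    // Marks the running sync as finished. Returns true if a queued resync
    // should start immediately (the state then stays `Running`).
    fn finish_sync(&mut self) -> bool {
        self.state = SyncState::Idle;
        if std::mem::take(&mut self.resync_requested) {
            self.state = SyncState::Running;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut peer = PeerState::new();
    println!("start: {}", peer.request_sync());
    println!("start while running: {}", peer.request_sync());
    println!("resync after finish: {}", peer.finish_sync());
    println!("{peer:?}");
}
```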
|
||||
|
||||
## DefaultAuthor
|
||||
|
||||
```rust
|
||||
pub struct DefaultAuthor {
|
||||
value: RwLock<AuthorId>,
|
||||
storage: DefaultAuthorStorage,
|
||||
}
|
||||
```
|
||||
|
||||
- `DefaultAuthorStorage::Mem` — Ephemeral, creates a new author each time
|
||||
- `DefaultAuthorStorage::Persistent(path)` — Stores the author ID as hex in a file, loads it on startup
|
||||
|
||||
The default author provides a convenient "current user" identity for applications.
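
For the persistent variant, the hex round trip of a 32-byte author ID might look like the sketch below. The helper names `to_hex` and `from_hex` are hypothetical, and lowercase hex without a prefix is an assumption about the file format.

```rust
// Encodes a 32-byte ID as 64 lowercase hex characters.
fn to_hex(bytes: &[u8; 32]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

// Parses 64 hex characters (surrounding whitespace ignored) into a 32-byte ID.
fn from_hex(s: &str) -> Option<[u8; 32]> {
    let s = s.trim();
    if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        // All characters are ASCII hex digits, so byte-offset slicing is safe.
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

fn main() {
    let author_id = [0x2au8; 32];
    let encoded = to_hex(&author_id);
    println!("{encoded}");
    println!("round trip ok: {}", from_hex(&encoded) == Some(author_id));
}
```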
|
||||
|
||||
## Docs Protocol Handler
|
||||
|
||||
```rust
|
||||
pub struct Docs {
|
||||
engine: Arc<Engine>,
|
||||
api: DocsApi,
|
||||
}
|
||||
```
|
||||
|
||||
`Docs` implements `ProtocolHandler` for integration with iroh's `Router`:
|
||||
|
||||
```rust
|
||||
impl ProtocolHandler for Docs {
|
||||
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> { ... }
|
||||
async fn shutdown(&self) { ... }
|
||||
}
|
||||
```
|
||||
|
||||
The `Builder` pattern configures storage:
|
||||
|
||||
```rust
|
||||
let docs = Docs::memory()
|
||||
.spawn(endpoint, blobs, gossip)
|
||||
.await?;
|
||||
// or
|
||||
let docs = Docs::persistent(path)
|
||||
.protect_handler(handler)
|
||||
.spawn(endpoint, blobs, gossip)
|
||||
.await?;
|
||||
```
|
||||
|
||||
## DocTicket
|
||||
|
||||
```rust
|
||||
pub struct DocTicket {
|
||||
pub capability: Capability,
|
||||
pub nodes: Vec<EndpointAddr>,
|
||||
}
|
||||
```
|
||||
|
||||
A `DocTicket` encapsulates everything needed to join a document:
|
||||
- A `Capability` (Read or Write) — provides the namespace key
|
||||
- A list of `EndpointAddr` — bootstrap peers to connect to
|
||||
|
||||
Tickets are serialized as base32-encoded postcard data with a `"doc"` prefix, using the `iroh_tickets::Ticket` trait.
|
||||
189
docs/research/references/iroh/iroh-docs/06-network-protocol.md
Normal file
@@ -0,0 +1,189 @@
|
||||
# iroh-docs: Network Protocol and Wire Format
|
||||
|
||||
## ALPN
|
||||
|
||||
The docs protocol uses ALPN `/iroh-sync/1` for QUIC connection identification.
|
||||
|
||||
```rust
|
||||
pub const ALPN: &[u8] = b"/iroh-sync/1";
|
||||
```
|
||||
|
||||
## Connection Flow
|
||||
|
||||
### Outgoing Sync (Alice — Initiator)
|
||||
|
||||
```rust
|
||||
pub async fn connect_and_sync(
|
||||
endpoint: &Endpoint,
|
||||
sync: &SyncHandle,
|
||||
namespace: NamespaceId,
|
||||
peer: EndpointAddr,
|
||||
metrics: Option<&Metrics>,
|
||||
) -> Result<SyncFinished, ConnectError>
|
||||
```
|
||||
|
||||
1. Open a QUIC connection to the peer with ALPN `/iroh-sync/1`
|
||||
2. Open a bidirectional QUIC stream
|
||||
3. Run the Alice (initiator) protocol via `run_alice()`
|
||||
4. Close the stream and return `SyncFinished`
|
||||
|
||||
### Incoming Sync (Bob — Responder)
|
||||
|
||||
```rust
|
||||
pub async fn handle_connection<F, Fut>(
|
||||
sync: SyncHandle,
|
||||
connection: Connection,
|
||||
accept_cb: F,
|
||||
metrics: Option<&Metrics>,
|
||||
) -> Result<SyncFinished, AcceptError>
|
||||
```
|
||||
|
||||
1. Accept a bidirectional QUIC stream from the connection
|
||||
2. Run the Bob (responder) protocol via `BobState::run()`
|
||||
3. The `accept_cb` determines whether to accept or reject each namespace
|
||||
4. Close the stream and return `SyncFinished`
|
||||
|
||||
## Wire Format
|
||||
|
||||
### Frame Codec
|
||||
|
||||
All messages are length-prefixed:
|
||||
|
||||
```
|
||||
┌──────────────────────┬──────────────────────────────┐
|
||||
│ u32 big-endian len │ postcard-serialized Message │
|
||||
└──────────────────────┴──────────────────────────────┘
|
||||
```
|
||||
|
||||
Maximum message size: 1 GiB.
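
A minimal sketch of the length-prefixed framing, using blocking `std::io` instead of the `tokio_util` codec. The payload is treated as opaque bytes (postcard serialization is not shown), and the function names `write_frame` and `read_frame` are hypothetical.

```rust
use std::io::{self, Cursor, Read, Write};

// 1 GiB, the maximum message size stated above.
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

// Writes a u32 big-endian length followed by the payload bytes.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "message too large"));
    }
    let len = payload.len() as u32; // cannot truncate: bounded by 1 GiB
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

// Reads one frame, rejecting declared lengths above the limit before allocating.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "message too large"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    let mut wire = Vec::new();
    write_frame(&mut wire, b"init")?;
    let mut reader = Cursor::new(wire);
    let payload = read_frame(&mut reader)?;
    println!("{} payload bytes", payload.len());
    Ok(())
}
```

Checking the declared length before allocating the buffer is what makes the size cap useful against a misbehaving peer. Note that this sketch still allocates the full declared length up front, which a production decoder may want to avoid.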
|
||||
|
||||
### Message Types
|
||||
|
||||
```rust
|
||||
enum Message {
|
||||
Init {
|
||||
namespace: NamespaceId, // Which document to sync
|
||||
message: ProtocolMessage, // Initial sync message (ranger::Message<SignedEntry>)
|
||||
},
|
||||
Sync(ProtocolMessage), // Subsequent sync round-trip messages
|
||||
Abort { reason: AbortReason }, // Responder rejects the request
|
||||
}
|
||||
```
|
||||
|
||||
### Serialization
|
||||
|
||||
Messages use `postcard` (a compact `serde` format optimized for embedded/no-std use). The `SyncCodec` implements `tokio_util::codec::Encoder` and `Decoder` for async stream framing.
|
||||
|
||||
## Protocol Sequence
|
||||
|
||||
```
|
||||
Alice (Initiator) Bob (Responder)
|
||||
│ │
|
||||
│──── Init { namespace, initial_msg } ───────▶│
|
||||
│ │
|
||||
│◀─── Sync(reply_msg) ────────────────────── │ (or Abort)
|
||||
│ │
|
||||
│──── Sync(next_msg) ──────────────────────▶│
|
||||
│ │
|
||||
│◀─── Sync(reply_msg) ────────────────────── │
|
||||
│ │
|
||||
│──── Sync(next_msg) ──────────────────────▶│
|
||||
│ │
|
||||
│ ... until convergence ... │
|
||||
│ │
|
||||
│──── (stream closed) ─────────────────────▶│
|
||||
│ │
|
||||
```
|
||||
|
||||
The protocol terminates when one side has no more messages to send (convergence reached). Each `Sync` message carries a `ProtocolMessage` which is a `ranger::Message<SignedEntry>` containing `MessagePart`s (either `RangeFingerprint` or `RangeItem`).
|
||||
|
||||
## SyncFinished Result
|
||||
|
||||
```rust
|
||||
pub struct SyncFinished {
|
||||
pub namespace: NamespaceId,
|
||||
pub peer: PublicKey,
|
||||
pub outcome: SyncOutcome, // heads_received, num_recv, num_sent
|
||||
pub timings: Timings, // connect duration, process duration
|
||||
}
|
||||
```
|
||||
|
||||
## Error Types
|
||||
|
||||
### ConnectError
|
||||
|
||||
```rust
|
||||
pub enum ConnectError {
|
||||
Connect { error: anyhow::Error }, // Connection failed
|
||||
RemoteAbort(AbortReason), // Remote rejected our request
|
||||
Sync { error: anyhow::Error }, // Sync protocol error
|
||||
Close { error: anyhow::Error }, // Stream close error
|
||||
}
|
||||
```
|
||||
|
||||
### AcceptError
|
||||
|
||||
```rust
|
||||
pub enum AcceptError {
|
||||
Connect { error: anyhow::Error }, // Connection failed
|
||||
Open { peer: PublicKey, error }, // Failed to open replica
|
||||
Abort { peer, namespace, reason }, // We aborted
|
||||
Sync { peer, namespace, error }, // Sync protocol error
|
||||
Close { peer, namespace, error }, // Stream close error
|
||||
}
|
||||
```
|
||||
|
||||
## Gossip Integration
|
||||
|
||||
The `GossipState` manages iroh-gossip subscriptions per namespace:
|
||||
|
||||
```rust
|
||||
pub struct GossipState {
|
||||
gossip: Gossip,
|
||||
sync: SyncHandle,
|
||||
to_live_actor: mpsc::Sender<ToLiveActor>,
|
||||
active: HashMap<NamespaceId, ActiveState>,
|
||||
active_tasks: JoinSet<(NamespaceId, Result<()>)>,
|
||||
}
|
||||
```
|
||||
|
||||
When a document starts syncing:
|
||||
1. The engine joins a gossip topic for that namespace
|
||||
2. `GossipState::join()` subscribes with bootstrap peers
|
||||
3. A receive loop task is spawned to process incoming gossip messages
|
||||
4. `Op` messages (Put, ContentReady, SyncReport) are deserialized and forwarded to `LiveActor`
|
||||
|
||||
When receiving an `Op::Put`:
|
||||
```rust
|
||||
// In the gossip receive loop:
|
||||
let entry = SignedEntry::from_entry(...); // deserialize
|
||||
sync.insert_remote(namespace, entry, from, content_status).await?;
|
||||
```
|
||||
|
||||
When receiving an `Op::SyncReport`:
|
||||
```rust
|
||||
// Forward to LiveActor which checks has_news_for_us()
|
||||
to_live_actor.send(ToLiveActor::IncomingSyncReport { from, report }).await?;
|
||||
```
|
||||
|
||||
Broadcasting:
|
||||
```rust
|
||||
// When a local insert occurs:
|
||||
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::Put(entry))).await;
|
||||
|
||||
// When content becomes ready:
|
||||
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::ContentReady(hash))).await;
|
||||
```
|
||||
|
||||
## Sync Report Compression
|
||||
|
||||
`SyncReport` encodes `AuthorHeads` with an optional size limit:
|
||||
|
||||
```rust
|
||||
pub struct SyncReport {
|
||||
namespace: NamespaceId,
|
||||
heads: Vec<u8>, // postcard-encoded AuthorHeads with size limit
|
||||
}
|
||||
```
|
||||
|
||||
The size limit keeps gossip messages small: when the encoded heads would exceed it, the author timestamps that are furthest in the past are dropped first.
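
The truncation can be sketched as below. This is an assumption-laden illustration: it models `AuthorHeads` as a map from a 32-byte author ID to a timestamp and assumes a fixed 40 bytes per entry, whereas the real encoding is postcard and therefore variable-width. The function name `truncate_heads` is hypothetical.

```rust
use std::collections::BTreeMap;

// Assumed fixed-width cost per entry: 32-byte author ID + 8-byte timestamp.
const ENTRY_SIZE: usize = 32 + 8;

// Keeps the most recent heads that fit in `max_bytes`, newest first.
fn truncate_heads(heads: &BTreeMap<[u8; 32], u64>, max_bytes: usize) -> Vec<([u8; 32], u64)> {
    let mut entries: Vec<([u8; 32], u64)> = heads.iter().map(|(a, t)| (*a, *t)).collect();
    // Sort by timestamp, descending, so the oldest heads end up at the tail.
    entries.sort_by(|a, b| b.1.cmp(&a.1));
    entries.truncate(max_bytes / ENTRY_SIZE);
    entries
}

fn main() {
    let mut heads = BTreeMap::new();
    heads.insert([1u8; 32], 10u64);
    heads.insert([2u8; 32], 30u64);
    heads.insert([3u8; 32], 20u64);
    let kept = truncate_heads(&heads, 80);
    println!("kept {} of {} heads", kept.len(), heads.len());
}
```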
|
||||
188
docs/research/references/iroh/iroh-docs/07-api-and-data-flow.md
Normal file
@@ -0,0 +1,188 @@
|
||||
# iroh-docs: API and RPC
|
||||
|
||||
## DocsApi
|
||||
|
||||
The `DocsApi` provides an RPC-based interface to the docs engine, implemented via `irpc`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct DocsApi {
|
||||
inner: Client<DocsProtocol>,
|
||||
}
|
||||
```
|
||||
|
||||
### Methods (via irpc)
|
||||
|
||||
The API exposes document operations through an RPC protocol defined in `api/protocol.rs`:
|
||||
|
||||
| Method | Request | Response | Description |
|
||||
|--------|---------|----------|-------------|
|
||||
| `Open` | `OpenRequest { doc_id }` | `OpenResponse` | Open a document for operations |
|
||||
| `Close` | `CloseRequest { doc_id }` | `CloseResponse` | Close a document |
|
||||
| `Status` | `StatusRequest { doc_id }` | `StatusResponse { status: OpenState }` | Get document open state |
|
||||
| `List` | `ListRequest` | Stream of `ListResponse { id, capability }` | List all documents |
|
||||
| `Create` | `CreateRequest` | `CreateResponse { id }` | Create a new document |
|
||||
| `Drop` | `DropRequest { doc_id }` | `DropResponse` | Remove a document |
|
||||
| `Import` | `ImportRequest { capability }` | `ImportResponse { doc_id }` | Import a document by capability |
|
||||
| `Set` | `SetRequest { doc_id, author_id, key, value }` | `SetResponse { entry }` | Set a key-value pair |
|
||||
| `SetHash` | `SetHashRequest { doc_id, author_id, key, hash, size }` | `SetHashResponse` | Set a key with pre-hashed content |
|
||||
| `GetMany` | `GetManyRequest { doc_id, query }` | Stream of entries | Query entries |
|
||||
| `GetExact` | `GetExactRequest { doc_id, key, author, include_empty }` | `GetExactResponse { entry }` | Get single entry |
|
||||
| `Del` | `DelRequest { doc_id, author_id, key }` | `DelResponse { removed }` | Delete by key prefix |
|
||||
| `Subscribe` | `SubscribeRequest { doc_id }` | Stream of `LiveEvent` | Subscribe to document events |
|
||||
| `Share` | `ShareRequest { doc_id, mode, peers }` | `ShareResponse { ticket }` | Create a sharing ticket |
|
||||
| `StartSync` | `StartSyncRequest { doc_id, peers }` | `StartSyncResponse` | Start live sync |
|
||||
| `Leave` | `LeaveRequest { doc_id }` | `LeaveResponse` | Leave gossip swarm |
|
||||
| `ImportFile` | `ImportFileRequest { ... }` | Stream of `ImportProgress` | Import file content and set key |
|
||||
| `ExportFile` | `ExportFileRequest { ... }` | Stream of `ExportProgress` | Export content to file |
|
||||
| `AuthorList` | `AuthorListRequest` | Stream of `AuthorListResponse` | List authors |
|
||||
| `AuthorCreate` | `AuthorCreateRequest` | `AuthorCreateResponse { author_id }` | Create new author |
|
||||
| `AuthorImport` | `AuthorImportRequest { author }` | `AuthorImportResponse { author_id }` | Import author key |
|
||||
| `AuthorExport` | `AuthorExportRequest { author_id }` | `AuthorExportResponse { author }` | Export author key |
|
||||
| `AuthorDelete` | `AuthorDeleteRequest { author_id }` | `AuthorDeleteResponse` | Delete author |
|
||||
| `AuthorGetDefault` | `AuthorGetDefaultRequest` | `AuthorGetDefaultResponse { author_id }` | Get default author |
|
||||
| `AuthorSetDefault` | `AuthorSetDefaultRequest { author_id }` | `AuthorSetDefaultResponse` | Set default author |
|
||||
| `SetDownloadPolicy` | `SetDownloadPolicyRequest { doc_id, policy }` | `SetDownloadPolicyResponse` | Set download policy |
|
||||
| `GetDownloadPolicy` | `GetDownloadPolicyRequest { doc_id }` | `GetDownloadPolicyResponse { policy }` | Get download policy |
|
||||
| `GetSyncPeers` | `GetSyncPeersRequest { doc_id }` | `GetSyncPeersResponse { peers }` | Get known sync peers |
|
||||
|
||||
## RPC Implementation
|
||||
|
||||
The RPC is implemented via `irpc` (for local/remote procedure calls) and `noq` (for remote network access):
|
||||
|
||||
### Local API
|
||||
|
||||
`DocsApi::spawn(engine)` creates an `RpcActor` that processes requests against the engine directly:
|
||||
|
||||
```rust
|
||||
impl DocsApi {
|
||||
pub fn spawn(engine: Arc<Engine>) -> Self {
|
||||
RpcActor::spawn(engine)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Remote API
|
||||
|
||||
When the `rpc` feature is enabled, `DocsApi::connect(endpoint, addr)` creates a remote client that sends requests over the network via `noq`.
|
||||
|
||||
### Protocol Dispatch
|
||||
|
||||
```rust
|
||||
// Pseudocode: irpc::rpc::Handler<DocsProtocol> dispatches each variant to the local actor:
|
||||
DocsProtocol::Open(msg) => local.send((msg, tx)).await
|
||||
DocsProtocol::Set(msg) => local.send((msg, tx)).await
|
||||
// ... etc
|
||||
```
|
||||
|
||||
## RpcActor
|
||||
|
||||
The `RpcActor` (in `api/actor.rs`) bridges the RPC protocol to the `Engine`:
|
||||
|
||||
```rust
|
||||
struct RpcActor {
|
||||
engine: Arc<Engine>,
|
||||
}
|
||||
```
|
||||
|
||||
It handles each request type by calling the corresponding `Engine`/`SyncHandle` method and returning the result through the RPC channel.
|
||||
|
||||
For streaming responses (like `GetMany`, `Subscribe`, `AuthorList`), the actor sends results through an `mpsc` channel that the RPC framework streams back to the client.
|
||||
|
||||
## Share Mode and Tickets
|
||||
|
||||
When sharing a document:
|
||||
|
||||
```rust
|
||||
pub enum ShareMode {
|
||||
Read, // Share with read-only capability
|
||||
Write, // Share with full write capability
|
||||
}
|
||||
```
|
||||
|
||||
The `Share` RPC method:
|
||||
1. Gets or creates the namespace capability
|
||||
2. Creates a `DocTicket` with the capability and provided peer addresses
|
||||
3. Starts sync with the provided peers
|
||||
4. Returns the ticket for distribution
|
||||
|
||||
## Example: Basic Setup
|
||||
|
||||
```rust
|
||||
use iroh::{endpoint::presets, protocol::Router, Endpoint};
|
||||
use iroh_blobs::{BlobsProtocol, store::mem::MemStore, ALPN as BLOBS_ALPN};
|
||||
use iroh_docs::{protocol::Docs, ALPN as DOCS_ALPN};
|
||||
use iroh_gossip::{net::Gossip, ALPN as GOSSIP_ALPN};
|
||||
|
||||
#[tokio::main]
|
||||
async fn main() -> anyhow::Result<()> {
|
||||
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||
let blobs = MemStore::default();
|
||||
let gossip = Gossip::builder().spawn(endpoint.clone());
|
||||
let docs = Docs::memory()
|
||||
.spawn(endpoint.clone(), (*blobs).clone(), gossip.clone())
|
||||
.await?;
|
||||
|
||||
let router = Router::builder(endpoint.clone())
|
||||
.accept(BLOBS_ALPN, BlobsProtocol::new(&blobs, None))
|
||||
.accept(GOSSIP_ALPN, gossip)
|
||||
.accept(DOCS_ALPN, docs)
|
||||
.spawn();
|
||||
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
## Data Flow Summary
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Application / RPC │
|
||||
│ DocsApi ──irpc──▶ RpcActor ──▶ Engine / SyncHandle │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Live Sync (per document) │
|
||||
│ │
|
||||
│ LiveActor event loop: │
|
||||
│ ┌────────────────┐ ┌─────────────────┐ ┌──────────────────┐ │
|
||||
│ │ Actor Messages │ │ Replica Events │ │ Gossip Events │ │
|
||||
│ │ (StartSync, │ │ (LocalInsert, │ │ (Put, │ │
|
||||
│ │ Subscribe, │ │ RemoteInsert) │ │ ContentReady, │ │
|
||||
│ │ Leave, ...) │ │ │ │ SyncReport) │ │
|
||||
│ └──────┬─────────┘ └───────┬────────┘ └──────┬──────────┘ │
|
||||
│ │ │ │ │
|
||||
│ ▼ ▼ ▼ │
|
||||
│ ┌──────────────────────────────────────────────────────────┐ │
|
||||
│ │ LiveActor::run_inner() │ │
|
||||
│ │ tokio::select! { ... } │ │
|
||||
│ │ │ │
|
||||
│ │ - Start/stop gossip subscriptions │ │
|
||||
│ │ - Initiate outgoing syncs (connect_and_sync) │ │
|
||||
│ │ - Accept incoming syncs (handle_connection) │ │
|
||||
│ │ - Queue content downloads │ │
|
||||
│ │ - Broadcast local inserts via gossip │ │
|
||||
│ │ - Emit LiveEvent to subscribers │ │
|
||||
│ └──────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Running Tasks: │
|
||||
│ ┌───────────────────┐ ┌───────────────────┐ │
|
||||
│ │ sync_connect tasks│ │ sync_accept tasks │ │
|
||||
│ └───────────────────┘ └───────────────────┘ │
|
||||
│ ┌───────────────────┐ ┌───────────────────┐ │
|
||||
│ │ download tasks │ │ gossip receive loop│ │
|
||||
│ └───────────────────┘ └───────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Sync Actor (dedicated thread) │
|
||||
│ │
|
||||
│ ┌────────────┐ ┌─────────────────────────────────────────┐ │
|
||||
│ │ Action │ │ Replica Operations: │ │
|
||||
│ │ Channel │──▶│ Insert, Delete, Get, Query, │ │
|
||||
│ │ (bounded) │ │ SyncInit, SyncProcess, Open, Close, ...│ │
|
||||
│ └────────────┘ └─────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Store (redb) ──▶ All reads/writes on this thread │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,318 @@
|
||||
# iroh-docs: Key Types Reference
|
||||
|
||||
## Cryptographic Keys
|
||||
|
||||
### NamespaceSecret
|
||||
|
||||
```rust
|
||||
pub struct NamespaceSecret {
|
||||
signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
|
||||
}
|
||||
```
|
||||
|
||||
- The write capability for a document
|
||||
- Can sign entries (namespace signature)
|
||||
- Derives `NamespacePublicKey` and `NamespaceId`
|
||||
- Serialized as 32 bytes
|
||||
|
||||
### NamespacePublicKey
|
||||
|
||||
```rust
|
||||
pub struct NamespacePublicKey(VerifyingKey); // ed25519_dalek::VerifyingKey
|
||||
```
|
||||
|
||||
- The verifying key corresponding to `NamespaceSecret`
|
||||
- Can verify namespace signatures on entries
|
||||
- Serialized as 32 bytes
|
||||
|
||||
### NamespaceId
|
||||
|
||||
```rust
|
||||
pub struct NamespaceId([u8; 32]);
|
||||
```
|
||||
|
||||
- The byte representation of `NamespacePublicKey`
|
||||
- Serves as the unique identifier for a document
|
||||
- Can be converted back to `NamespacePublicKey` via `PublicKeyStore` (handles invalid curve points)
|
||||
|
||||
### Author
|
||||
|
||||
```rust
|
||||
pub struct Author {
|
||||
signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
|
||||
}
|
||||
```
|
||||
|
||||
- A writer identity within a document
|
||||
- Can sign entries (author signature)
|
||||
- Derives `AuthorPublicKey` and `AuthorId`
|
||||
- Created randomly with `Author::new(&mut rng)`
|
||||
- Stored persistently in the redb authors table
|
||||
|
||||
### AuthorPublicKey
|
||||
|
||||
```rust
|
||||
pub struct AuthorPublicKey(VerifyingKey);
|
||||
```
|
||||
|
||||
- The verifying key corresponding to an `Author`
|
||||
- Can verify author signatures on entries
|
||||
- Serialized as 32 bytes
|
||||
|
||||
### AuthorId
|
||||
|
||||
```rust
|
||||
pub struct AuthorId([u8; 32]);
|
||||
```
|
||||
|
||||
- Byte representation of `AuthorPublicKey`
|
||||
- Used as a component of `RecordIdentifier`
|
||||
- Has `fmt_short()` for human-readable display (first 10 hex chars)
|
||||
|
||||
## Entry Types
|
||||
|
||||
### RecordIdentifier
|
||||
|
||||
```rust
|
||||
pub struct RecordIdentifier(Bytes);
|
||||
// Layout: [NamespaceId(32) | AuthorId(32) | Key(variable)]
|
||||
```
|
||||
|
||||
- The composite key for an entry
|
||||
- Byte layout: 32 bytes namespace + 32 bytes author + variable-length key
|
||||
- Ordering: namespace → author → key (lexicographic)
|
||||
- This ordering is critical for the range-based sync algorithm
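
A small sketch of the byte layout and the ordering it induces, using a plain `Vec<u8>` instead of `Bytes`. The helper names `record_id` and `split_record_id` are hypothetical.

```rust
// Concatenates namespace (32 bytes), author (32 bytes), and key into one identifier.
fn record_id(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(64 + key.len());
    out.extend_from_slice(namespace);
    out.extend_from_slice(author);
    out.extend_from_slice(key);
    out
}

// Splits an identifier back into (namespace, author, key) slices.
fn split_record_id(id: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    if id.len() < 64 {
        return None;
    }
    Some((&id[..32], &id[32..64], &id[64..]))
}

fn main() {
    let namespace = [7u8; 32];
    let a = record_id(&namespace, &[1; 32], b"zebra");
    let b = record_id(&namespace, &[2; 32], b"apple");
    // Byte-wise comparison reaches the author bytes before the key bytes,
    // so the author decides the order here, not the key.
    println!("a sorts before b: {}", a < b);
    println!("parts present: {}", split_record_id(&a).is_some());
}
```

Because the identifier is a single byte string, plain lexicographic comparison yields the namespace, author, key ordering without any custom comparator.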
|
||||
|
||||
### Record
|
||||
|
||||
```rust
|
||||
pub struct Record {
|
||||
len: u64, // Byte length of content
|
||||
hash: Hash, // BLAKE3 hash of content (32 bytes)
|
||||
timestamp: u64, // Microseconds since Unix epoch
|
||||
}
|
||||
```
|
||||
|
||||
- The value portion of an entry
|
||||
- Ordering: timestamp first, then hash (Last-Writer-Wins)
|
||||
- `Record::empty(timestamp)` creates a tombstone (hash=EMPTY, len=0)
|
||||
- `Record::new_current(hash, len)` uses current system time
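
The ordering can be sketched as a comparison of `(timestamp, hash)` pairs. This assumes the greater record in that order is the one that wins. The function name `supersedes` is hypothetical, and `Hash` is replaced by a raw 32-byte array.

```rust
#[derive(Debug, Clone)]
struct Record {
    len: u64,
    hash: [u8; 32],
    timestamp: u64,
}

// Returns true if `candidate` should replace `existing`:
// later timestamp first, then the greater hash as a tie-breaker.
fn supersedes(candidate: &Record, existing: &Record) -> bool {
    (candidate.timestamp, &candidate.hash) > (existing.timestamp, &existing.hash)
}

fn main() {
    let old = Record { len: 3, hash: [1; 32], timestamp: 100 };
    let new = Record { len: 5, hash: [0; 32], timestamp: 200 };
    println!("{} -> {} bytes, replaced: {}", old.len, new.len, supersedes(&new, &old));
}
```

Using the hash as a tie-breaker makes the result deterministic on every peer even when two writers produce the same timestamp.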
|
||||
|
||||
### Entry
|
||||
|
||||
```rust
|
||||
pub struct Entry {
|
||||
id: RecordIdentifier,
|
||||
record: Record,
|
||||
}
|
||||
```
|
||||
|
||||
- Combines key and value
|
||||
- `Entry::new(id, record)` constructor
|
||||
- `Entry::new_empty(id)` creates a tombstone with current timestamp
|
||||
- `entry.sign(namespace, author)` produces a `SignedEntry`
|
||||
|
||||
### SignedEntry
|
||||
|
||||
```rust
|
||||
pub struct SignedEntry {
|
||||
signature: EntrySignature, // Dual Ed25519 signatures
|
||||
entry: Entry,
|
||||
}
|
||||
```
|
||||
|
||||
- An entry with cryptographic proof of authorization and authorship
|
||||
- `SignedEntry::from_entry(entry, namespace, author)` — create from entry
|
||||
- `signed_entry.verify(store)` — verify both signatures using a `PublicKeyStore`
|
||||
- Implements `RangeEntry` for the sync algorithm
|
||||
|
||||
### EntrySignature
|
||||
|
||||
```rust
|
||||
pub struct EntrySignature {
|
||||
author_signature: Signature, // 64-byte Ed25519 signature
|
||||
namespace_signature: Signature, // 64-byte Ed25519 signature
|
||||
}
|
||||
```
|
||||
|
||||
- Created by signing the canonical byte encoding of the `Entry`
|
||||
- Both signatures cover the same message bytes
|
||||
- Verification requires both `NamespacePublicKey` and `AuthorPublicKey`
|
||||
|
||||
## Sync Types
|
||||
|
||||
### SyncOutcome
|
||||
|
||||
```rust
|
||||
pub struct SyncOutcome {
|
||||
pub heads_received: AuthorHeads,
|
||||
pub num_recv: usize,
|
||||
pub num_sent: usize,
|
||||
}
|
||||
```
|
||||
|
||||
- Tracks the result of a sync session
|
||||
- `heads_received` accumulates the latest timestamp seen from each author on the remote side
|
||||
|
||||
### ProtocolMessage
|
||||
|
||||
```rust
|
||||
pub type ProtocolMessage = ranger::Message<SignedEntry>;
|
||||
```
|
||||
|
||||
- The wire type for sync protocol messages
|
||||
- Contains `Vec<MessagePart<SignedEntry>>`
|
||||
|
||||
### ContentStatus
|
||||
|
||||
```rust
|
||||
pub enum ContentStatus {
|
||||
Complete, // Content blob fully available
|
||||
Incomplete, // Partially available
|
||||
Missing, // Not available
|
||||
}
|
||||
```
|
||||
|
||||
- Communicated alongside entries during sync
|
||||
- Helps peers decide whether to download content
|
||||
|
||||
### InsertOrigin
|
||||
|
||||
```rust
|
||||
pub enum InsertOrigin {
|
||||
Local,
|
||||
Sync {
|
||||
from: PeerIdBytes, // [u8; 32] — the remote peer
|
||||
remote_content_status: ContentStatus,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
## Event Types
|
||||
|
||||
### Event (Internal)
|
||||
|
||||
```rust
|
||||
pub enum Event {
|
||||
LocalInsert {
|
||||
namespace: NamespaceId,
|
||||
entry: SignedEntry,
|
||||
},
|
||||
RemoteInsert {
|
||||
namespace: NamespaceId,
|
||||
entry: SignedEntry,
|
||||
from: PeerIdBytes,
|
||||
should_download: bool,
|
||||
remote_content_status: ContentStatus,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
- Emitted by `Replica` via `ReplicaInfo` subscribers
|
||||
- `should_download` is determined by the `DownloadPolicy`
|
||||
|
||||
### LiveEvent (Public)
|
||||
|
||||
```rust
|
||||
pub enum LiveEvent {
|
||||
InsertLocal { entry: Entry },
|
||||
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
|
||||
ContentReady { hash: Hash },
|
||||
PendingContentReady,
|
||||
NeighborUp(PublicKey),
|
||||
NeighborDown(PublicKey),
|
||||
SyncFinished(SyncEvent),
|
||||
}
|
||||
```
|
||||
|
||||
- Emitted by the `Engine` through `subscribe()`
|
||||
- `InsertLocal` / `InsertRemote` are derived from `Event` by converting the `SignedEntry` into a plain `Entry` (the signatures are dropped)
|
||||
- `ContentReady` is emitted when a blob download completes
|
||||
- The `SyncFinished` variant carries a `SyncEvent`, which wraps the `SyncFinished` struct from the network layer
|
||||
|
||||
## Store Types
|
||||
|
||||
### Store (store::fs::Store)
|
||||
|
||||
```rust
|
||||
pub struct Store {
|
||||
db: Database, // redb database
|
||||
transaction: CurrentTransaction, // Current read/write transaction
|
||||
open_replicas: HashSet<NamespaceId>, // Track which replicas are open
|
||||
pubkeys: MemPublicKeyStore, // Cache for expanded public keys
|
||||
}
|
||||
```
|
||||
|
||||
### Query
|
||||
|
||||
```rust
|
||||
pub struct Query {
|
||||
kind: QueryKind, // Flat or SingleLatestPerKey
|
||||
filter_author: AuthorFilter, // Any or Exact
|
||||
filter_key: KeyFilter, // Any, Exact, or Prefix
|
||||
limit: Option<u64>,
|
||||
offset: u64,
|
||||
include_empty: bool,
|
||||
sort_direction: SortDirection,
|
||||
}
|
||||
```
|
||||
|
||||
### Capability
|
||||
|
||||
```rust
|
||||
pub enum Capability {
|
||||
Write(NamespaceSecret),
|
||||
Read(NamespaceId),
|
||||
}
|
||||
```
|
||||
|
||||
- `Write` allows inserting entries and signing them
|
||||
- `Read` allows syncing and reading but not inserting
|
||||
- Can be serialized as `(u8, [u8; 32])` — kind byte + key bytes
|
||||
- `merge()` can upgrade `Read` to `Write`
|
||||
|
||||
### DownloadPolicy
|
||||
|
||||
```rust
|
||||
pub enum DownloadPolicy {
|
||||
NothingExcept(Vec<FilterKind>), // Whitelist mode
|
||||
EverythingExcept(Vec<FilterKind>), // Blacklist mode (default)
|
||||
}
|
||||
```
|
||||
|
||||
### DocTicket
|
||||
|
||||
```rust
|
||||
pub struct DocTicket {
|
||||
pub capability: Capability,
|
||||
pub nodes: Vec<EndpointAddr>,
|
||||
}
|
||||
```
|
||||
|
||||
- Serializable as a base32 string with "doc" prefix
|
||||
- Contains everything needed to join a document
|
||||
- The wire format uses a versioned enum: `TicketWireFormat::Variant0(DocTicket)`
|
||||
|
||||
## OpenState
|
||||
|
||||
```rust
|
||||
pub struct OpenState {
|
||||
pub sync: bool, // Whether sync is enabled
|
||||
pub subscribers: usize, // Number of event subscribers
|
||||
pub handles: usize, // Number of open handles
|
||||
}
|
||||
```
|
||||
|
||||
Returned by the `Status` RPC method to report the state of an open document.
|
||||
|
||||
## Utility Constants
|
||||
|
||||
| Constant | Value | Purpose |
|----------|-------|---------|
| `MAX_TIMESTAMP_FUTURE_SHIFT` | 10 min in μs | Max future drift for entry timestamps |
| `MAX_COMMIT_DELAY` | 500ms | Auto-commit interval for store transactions |
| `ACTION_CAP` | 1024 | Bounded channel capacity for SyncHandle actions |
| `ACTOR_CHANNEL_CAP` | 64 | Channel capacity for LiveActor messages |
| `SUBSCRIBE_CHANNEL_CAP` | 256 | Channel capacity for event subscriptions |
| `PEERS_PER_DOC_CACHE_SIZE` | 5 | LRU cache size for sync peers per document |
| `MAX_MESSAGE_SIZE` | 1 GiB | Max wire message size |
|
||||
59
docs/research/references/iroh/iroh-docs/README.md
Normal file
@@ -0,0 +1,59 @@
|
||||
# iroh-docs Reference Documentation
|
||||
|
||||
> Version: 0.98.0
|
||||
> Repository: https://github.com/n0-computer/iroh-docs
|
||||
> License: MIT/Apache-2.0
|
||||
> Based on: [Range-Based Set Reconciliation (Meyer, 2022)](https://arxiv.org/abs/2212.13567)
|
||||
|
||||
## Document Index
|
||||
|
||||
| # | File | Topic |
|---|------|-------|
| 01 | [Overview and Architecture](01-overview-and-architecture.md) | High-level architecture, module layout, dependencies, feature flags |
| 02 | [Document Model](02-document-model.md) | CRDT data model: namespaces, authors, entries, signatures, prefix deletion, timestamps |
| 03 | [Sync Protocol](03-sync-protocol.md) | Range-based set reconciliation algorithm, fingerprints, message format, Store trait |
| 04 | [Store and Persistence](04-store-and-persistence.md) | redb table schema, transaction model, queries, download policies, PublicKeyStore |
| 05 | [Engine and Live Sync](05-engine-and-live-sync.md) | Engine, LiveActor, GossipState, content download, event system, DefaultAuthor |
| 06 | [Network Protocol](06-network-protocol.md) | ALPN, wire format, Alice/Bob protocol flow, error types, gossip integration |
| 07 | [API and Data Flow](07-api-and-data-flow.md) | RPC API, DocsApi, protocol messages, data flow diagrams |
| 08 | [Key Types Reference](08-key-types-reference.md) | All public types, constants, and their relationships |
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Core Concepts
|
||||
|
||||
- **Namespace**: A document identity. Identified by `NamespaceId` (32 bytes), backed by an Ed25519 keypair (`NamespaceSecret`).
|
||||
- **Author**: A writer identity. Identified by `AuthorId` (32 bytes), backed by an Ed25519 keypair (`Author`).
|
||||
- **Entry**: A record identified by (namespace, author, key) with a value of (hash, len, timestamp).
|
||||
- **SignedEntry**: An entry with dual Ed25519 signatures (namespace + author) proving authorization and authorship.
|
||||
- **Replica**: A local instance of a document, holding entries in a store.
|
||||
- **Capability**: Either `Write(NamespaceSecret)` or `Read(NamespaceId)` — controls whether entries can be inserted.
|
||||
- **Store**: A `redb`-backed persistent store managing authors, namespaces, entries, and peer caches.
|
||||
- **Engine**: Coordinates sync actors, gossip, and content downloads for live synchronization.
|
||||
|
||||
### Key Algorithms
|
||||
|
||||
1. **Range-based set reconciliation**: Efficiently compute the union of two entry sets over a network by comparing fingerprints of partitions, subdividing when fingerprints differ.
|
||||
2. **Prefix deletion**: An entry at key "foo" acts as a tombstone for all older entries by the same author whose key starts with "foo" (for example "foo/bar").
|
||||
3. **Last-writer-wins**: When entries conflict on the same (namespace, author, key), the one with the higher (timestamp, hash) wins.
|
||||
4. **XOR fingerprints**: Fingerprint of a set is the XOR of individual entry fingerprints (BLAKE3 hashes of key data).
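The XOR fingerprint and last-writer-wins rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Entry` struct is a simplified stand-in, and a 64-bit `DefaultHasher` value replaces the BLAKE3-based entry fingerprint so that the example needs only the standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for an entry; not the crate's `Entry` type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Entry {
    pub key: Vec<u8>,
    pub timestamp: u64,
    pub content_hash: [u8; 32],
}

/// 64-bit stand-in for the per-entry fingerprint (assumption: the real one is BLAKE3-based).
fn entry_fingerprint(entry: &Entry) -> u64 {
    let mut hasher = DefaultHasher::new();
    entry.hash(&mut hasher);
    hasher.finish()
}

/// Fingerprint of a set: XOR of the member fingerprints. The empty set maps to 0.
pub fn set_fingerprint<'a>(entries: impl IntoIterator<Item = &'a Entry>) -> u64 {
    entries
        .into_iter()
        .fold(0, |acc, entry| acc ^ entry_fingerprint(entry))
}

/// Last-writer-wins: true if `new` should replace `old` at the same (namespace, author, key).
pub fn supersedes(new: &Entry, old: &Entry) -> bool {
    (new.timestamp, new.content_hash) > (old.timestamp, old.content_hash)
}
```

Because XOR is commutative and associative, two peers holding the same entries compute the same fingerprint regardless of iteration order, which is what lets a range be skipped when fingerprints agree.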
|
||||
|
||||
### Data Flow
|
||||
|
||||
```
|
||||
Application → DocsApi → Engine → LiveActor → GossipState → iroh-gossip
|
||||
↓ ↓
|
||||
SyncHandle → Actor → Store (redb) ← QUIC streams (iroh)
|
||||
↓
|
||||
iroh-blobs (content transfer)
|
||||
```
|
||||
|
||||
### Dependencies
|
||||
|
||||
- `iroh` — QUIC networking
|
||||
- `iroh-blobs` — Content-addressed blob storage and transfer
|
||||
- `iroh-gossip` — Gossip protocol for live updates
|
||||
- `redb` — Embedded key-value store
|
||||
- `ed25519-dalek` — Ed25519 signatures
|
||||
- `blake3` — Hashing
|
||||
- `postcard` — Serialization
|
||||
@@ -0,0 +1,79 @@
|
||||
# iroh-gossip: Overview & Architecture
|
||||
|
||||
## What Is iroh-gossip?
|
||||
|
||||
`iroh-gossip` is a Rust crate that implements an **epidemic broadcast tree** protocol for disseminating messages among a swarm of peers interested in a common **topic**. It is based on two academic papers:
|
||||
|
||||
- **HyParView** — A hybrid partial view membership protocol for reliable swarm management ([paper](https://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf))
|
||||
- **PlumTree** — An epidemic broadcast tree protocol for efficient message dissemination ([paper](https://asc.di.fct.unl.pt/~jleitao/pdf/srds07-leitao.pdf))
|
||||
|
||||
The crate is designed as a protocol layer for the [iroh](https://docs.rs/iroh) networking library, but the core protocol logic is **IO-free** and can be used independently.
|
||||
|
||||
## High-Level Architecture
|
||||
|
||||
The crate is organized into two primary modules:
|
||||
|
||||
| Module | Purpose | IO-aware? |
|--------|---------|-----------|
| `proto` | Pure state-machine implementation of the gossip protocol | No, completely IO-free |
| `net` | Networking layer that runs the protocol over iroh connections | Yes, depends on `iroh` and tokio |
|
||||
|
||||
The `net` module is behind the `net` feature flag (enabled by default). An optional `rpc` feature adds remote procedure call support via the `irpc`/`noq` crates.
|
||||
|
||||
### Module Dependency Graph
|
||||
|
||||
```
┌──────────────┐
│     api      │  ← Public API (Gossip, GossipTopic, GossipSender, GossipReceiver)
└──────┬───────┘
       │
┌──────▼───────┐
│     net      │  ← Networking actor, connection loops, dialer
└──────┬───────┘
       │
┌──────▼───────┐
│    proto     │  ← Pure protocol state machines
│ ┌──────────┐ │
│ │hyparview │ │  ← Membership layer
│ ├──────────┤ │
│ │ plumtree │ │  ← Broadcast layer
│ ├──────────┤ │
│ │  topic   │ │  ← Per-topic coordinator
│ ├──────────┤ │
│ │  state   │ │  ← Multi-topic state manager
│ ├──────────┤ │
│ │   util   │ │  ← Shared data structures (IndexSet, TimeBoundCache, TimerMap)
│ └──────────┘ │
└──────────────┘
```
|
||||
|
||||
### Key Design Principles
|
||||
|
||||
1. **IO-free protocol core**: The `proto` module is a pure state machine. It takes `InEvent`s, produces `OutEvent`s, and has no knowledge of sockets, async runtimes, or network IO.
|
||||
|
||||
2. **Topic-based isolation**: Each topic (`TopicId` = 32-byte identifier) has completely independent state. Topics are separate swarms and broadcast scopes. Joining multiple topics increases connections and routing table size proportionally.
|
||||
|
||||
3. **Actor model for networking**: The `net` module runs a single async `Actor` that manages all topics, connections, and timers. It bridges between the protocol state machine and real network IO.
|
||||
|
||||
4. **Wire protocol**: Messages are serialized with `postcard` (a `no_std`-friendly serde format) and sent over QUIC streams via iroh connections. Each stream is prefixed with a `StreamHeader` containing the topic ID.
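To make the IO-free principle concrete, the following sketch shows the general shape of such a state machine: events go in, events come out, and nothing touches a socket or a clock. The event variants, the `u32` peer ids, and the naive flood-to-neighbors logic are all hypothetical simplifications, not the crate's actual `InEvent`/`OutEvent` definitions.

```rust
use std::collections::VecDeque;

/// Hypothetical, reduced input events (the real `InEvent` has more variants).
#[derive(Debug, PartialEq)]
pub enum InEvent {
    RecvMessage { from: u32, payload: Vec<u8> },
    PeerDisconnected(u32),
}

/// Hypothetical, reduced output events.
#[derive(Debug, PartialEq)]
pub enum OutEvent {
    SendMessage { to: u32, payload: Vec<u8> },
    EmitReceived(Vec<u8>),
}

pub struct State {
    neighbors: Vec<u32>,
}

impl State {
    pub fn new(neighbors: Vec<u32>) -> Self {
        Self { neighbors }
    }

    /// Pure transition: consumes one input event and returns the resulting outputs.
    pub fn handle(&mut self, event: InEvent) -> VecDeque<OutEvent> {
        let mut out = VecDeque::new();
        match event {
            InEvent::RecvMessage { from, payload } => {
                // Deliver locally, then forward to every neighbor except the sender.
                out.push_back(OutEvent::EmitReceived(payload.clone()));
                for &peer in self.neighbors.iter().filter(|&&p| p != from) {
                    out.push_back(OutEvent::SendMessage { to: peer, payload: payload.clone() });
                }
            }
            InEvent::PeerDisconnected(peer) => self.neighbors.retain(|&p| p != peer),
        }
        out
    }
}
```

A networking layer in the style of the `net` actor would decode wire messages into `InEvent`s, call `handle`, and then carry out each returned `OutEvent` on real connections; tests can drive the same state machine with plain values.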
|
||||
|
||||
## Crate Features
|
||||
|
||||
| Feature | Default? | Description |
|---------|----------|-------------|
| `net` | Yes | Networking layer (requires `iroh`, `tokio`, etc.) |
| `rpc` | No | RPC support via `irpc`/`noq` for remote control |
| `metrics` | Yes | Prometheus-style metrics via `iroh-metrics` |
| `test-utils` | No | Test utilities (seeded RNG, etc.) |
| `simulator` | No | CLI simulator for testing |
| `examples` | No | Example binaries (chat, setup) |
|
||||
|
||||
## Cargo Dependencies (Key Ones)
|
||||
|
||||
- `iroh` / `iroh-base` — Networking primitives (Endpoint, EndpointId, PublicKey, etc.)
|
||||
- `postcard` — Wire serialization (serde-based, `no_std` compatible)
|
||||
- `blake3` — Message ID hashing
|
||||
- `ed25519-dalek` — Cryptographic signatures
|
||||
- `n0-future` / `n0-error` — Async utilities and error handling
|
||||
- `irpc` / `noq` — RPC infrastructure (optional)
|
||||
- `indexmap` — Order-preserving hash collections used in `IndexSet`
|
||||
@@ -0,0 +1,169 @@
|
||||
# iroh-gossip: HyParView Membership Protocol
|
||||
|
||||
## Overview
|
||||
|
||||
The HyParView protocol provides **swarm membership management** — it maintains which peers are currently part of the swarm for a given topic and ensures the overlay network remains connected even as nodes join, leave, or fail.
|
||||
|
||||
It is implemented in `src/proto/hyparview.rs`.
|
||||
|
||||
## Core Concept: Two Views
|
||||
|
||||
Each peer maintains two sets of peers:
|
||||
|
||||
| View | Description | Default Size | Connection? |
|------|-------------|--------------|-------------|
| **Active View** | Peers we maintain active bidirectional connections to | 5 | Yes, TCP/QUIC connection is kept open |
| **Passive View** | An address book of peers we know about but are not connected to | 30 | No, just contact information |
|
||||
|
||||
Key invariants:
|
||||
- **Active connections are always bidirectional**: If peer A has peer B in its active view, peer B also has peer A in its active view.
|
||||
- The passive view serves as a **failover pool**: When an active peer disconnects, a random peer from the passive view is promoted to fill the slot.
|
||||
|
||||
## Configuration (`hyparview::Config`)
|
||||
|
||||
```rust
|
||||
pub struct Config {
|
||||
pub active_view_capacity: usize, // Default: 5
|
||||
pub passive_view_capacity: usize, // Default: 30
|
||||
pub active_random_walk_length: Ttl, // Default: Ttl(6)
|
||||
pub passive_random_walk_length: Ttl, // Default: Ttl(3)
|
||||
pub shuffle_random_walk_length: Ttl, // Default: Ttl(6)
|
||||
pub shuffle_active_view_count: usize, // Default: 3
|
||||
pub shuffle_passive_view_count: usize, // Default: 4
|
||||
pub shuffle_interval: Duration, // Default: 60s
|
||||
pub neighbor_request_timeout: Duration, // Default: 500ms
|
||||
}
|
||||
```
|
||||
|
||||
These defaults come directly from the HyParView paper (p9), except for `shuffle_interval` and `neighbor_request_timeout` which are "wild guesses" in the code.
|
||||
|
||||
## State Structure
|
||||
|
||||
```rust
|
||||
pub struct State<PI, RG = ThreadRng> {
|
||||
me: PI, // Our peer identity
|
||||
me_data: Option<PeerData>, // Opaque data we share with peers
|
||||
pub active_view: IndexSet<PI>, // Connected peers
|
||||
pub passive_view: IndexSet<PI>, // Known but disconnected peers
|
||||
config: Config,
|
||||
shuffle_scheduled: bool, // Whether shuffle timer is active
|
||||
rng: RG, // Random number generator
|
||||
stats: Stats,
|
||||
pending_neighbor_requests: HashSet<PI>, // Peers we've sent Neighbor to but no reply yet
|
||||
peer_data: HashMap<PI, PeerData>, // Opaque data received from other peers
|
||||
alive_disconnect_peers: HashSet<PI>, // Peers disconnecting but to keep in passive view
|
||||
}
|
||||
```
|
||||
|
||||
## Messages (`hyparview::Message`)
|
||||
|
||||
| Message | Direction | Purpose |
|---------|-----------|---------|
| `Join(Option<PeerData>)` | New node → Contact | Sent to a known peer to join the swarm |
| `ForwardJoin(ForwardJoin)` | Propagated | Forwarded to active view to introduce a new member |
| `Neighbor(Neighbor)` | Bidirectional | Request to add sender to active view (with priority) |
| `Disconnect(Disconnect)` | Bidirectional | Notification that a peer is leaving or being demoted |
| `Shuffle(Shuffle)` | Initiated periodically | Sent to random peer to exchange passive view contacts |
| `ShuffleReply(ShuffleReply)` | Reply to Shuffle | Returns a random subset of our views to the origin |
|
||||
|
||||
### Message Details
|
||||
|
||||
```rust
|
||||
pub struct ForwardJoin<PI> {
|
||||
peer: PeerInfo<PI>, // The new peer's identity + optional data
|
||||
ttl: Ttl, // Time-to-live, decremented per hop
|
||||
}
|
||||
|
||||
pub struct Shuffle<PI> {
|
||||
origin: PI, // Who initiated the shuffle
|
||||
nodes: Vec<PeerInfo<PI>>, // Random subset of our views
|
||||
ttl: Ttl, // Time-to-live for the random walk
|
||||
}
|
||||
|
||||
pub struct Neighbor {
|
||||
priority: Priority, // High (cannot be denied) or Low (can be denied)
|
||||
data: Option<PeerData>,
|
||||
}
|
||||
|
||||
pub struct Disconnect {
|
||||
alive: bool, // If true, peer is still alive (just demoting)
|
||||
_respond: bool, // Obsolete, kept for wire compat
|
||||
}
|
||||
```
|
||||
|
||||
## Join Procedure (Step by Step)
|
||||
|
||||
1. A new node sends `Join(me_data)` to a known contact peer.
|
||||
2. The contact peer adds the new node to its active view (even evicting a random peer if necessary).
|
||||
3. The contact peer forwards `ForwardJoin` to all other peers in its active view with `TTL = active_random_walk_length`.
|
||||
4. Each peer receiving `ForwardJoin`:
|
||||
   - If `TTL == 0` or the active view has ≤1 peer: sends `Neighbor(High)` to the new node (which adds it to the active view). The walk ends here.
   - Otherwise, if `TTL == passive_random_walk_length`: adds the new node to the passive view.
   - Otherwise (whenever the new node was not added to the active view): decrements the TTL and forwards to a random active peer (different from the sender).
|
||||
|
||||
5. The `Neighbor` message establishes the bidirectional active connection. A `Priority::High` neighbor request **must** be accepted (potentially evicting a random active peer). A `Priority::Low` request is only accepted if there is room.
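The per-hop decision in step 4 can be summarized as a small pure function. This is a sketch under assumptions: TTLs are plain `u8` values rather than the crate's `Ttl` type, the `Decision` struct and `on_forward_join` name are invented for illustration, and the walk is assumed to stop once the new node is taken into the active view.

```rust
/// What a peer does with an incoming `ForwardJoin` (hypothetical summary type).
#[derive(Debug, PartialEq)]
pub struct Decision {
    /// Send `Neighbor(High)` to the new node and add it to the active view.
    pub add_active: bool,
    /// Remember the new node in the passive view.
    pub add_passive: bool,
    /// `Some(ttl)` if the message is forwarded to a random active peer with this TTL.
    pub forward_ttl: Option<u8>,
}

pub fn on_forward_join(ttl: u8, active_view_len: usize, passive_random_walk_length: u8) -> Decision {
    // The walk ends when the TTL is exhausted or this peer is nearly isolated.
    let add_active = ttl == 0 || active_view_len <= 1;
    // Partway through the walk, the new node is also recorded as a passive contact.
    let add_passive = !add_active && ttl == passive_random_walk_length;
    // `ttl - 1` cannot underflow: when `add_active` is false, `ttl` is at least 1.
    let forward_ttl = if add_active { None } else { Some(ttl - 1) };
    Decision { add_active, add_passive, forward_ttl }
}
```

With the default configuration (`active_random_walk_length = 6`, `passive_random_walk_length = 3`), a walk that is never cut short visits seven peers: the fourth one stores the newcomer passively and the last one connects to it.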
|
||||
|
||||
## Shuffle Mechanism
|
||||
|
||||
Periodically (every `shuffle_interval`), each node:
|
||||
1. Picks a random active peer.
|
||||
2. Sends `Shuffle` containing a random subset of active + passive views plus the origin's info, with a TTL.
|
||||
3. The shuffle message does a random walk (each hop decrements TTL).
|
||||
4. When TTL reaches 0 or the active view is ≤1, the peer accepts the shuffle and replies with `ShuffleReply` containing its own random peers.
|
||||
5. The origin receives `ShuffleReply` and adds new peers to its passive view.
|
||||
|
||||
This ensures the passive view remains fresh and provides good connectivity even in dynamic networks.
|
||||
|
||||
## Failure Recovery
|
||||
|
||||
When a peer in the active view disconnects (detected via `PeerDisconnected`):
|
||||
1. The peer is removed from the active view.
|
||||
2. A `NeighborDown` event is emitted.
|
||||
3. A random peer from the passive view is selected and sent a `Neighbor(Low)` request.
|
||||
4. If that peer doesn't respond within `neighbor_request_timeout`, it's removed from the passive view and another peer is tried.
|
||||
5. This continues until a connection is established or the passive view is exhausted.
|
||||
|
||||
If a `Disconnect(alive=true)` message is received:
|
||||
- The peer is moved to the passive view (not just dropped), because it's still alive.
|
||||
- The `alive_disconnect_peers` set tracks which disconnected peers should be retained in passive view when their connection eventually closes.
|
||||
|
||||
## PeerData
|
||||
|
||||
`PeerData` is an opaque `Bytes` type that peers exchange when joining. In the `net` module, it is used to serialize and transmit addressing information (`AddrInfo`):
|
||||
|
||||
```rust
|
||||
struct AddrInfo {
|
||||
relay_url: Option<RelayUrl>,
|
||||
direct_addresses: BTreeSet<SocketAddr>,
|
||||
}
|
||||
```
|
||||
|
||||
This allows the gossip protocol itself to help propagate connectivity information, enabling the `GossipAddressLookup` service to feed addresses back into iroh's endpoint discovery system.
|
||||
|
||||
## Events (`hyparview::Event`)
|
||||
|
||||
| Event | Meaning |
|-------|---------|
| `NeighborUp(PI)` | A peer was added to our active view |
| `NeighborDown(PI)` | A peer was removed from our active view |
|
||||
|
||||
These events are forwarded up to the PlumTree layer and to the application.
|
||||
|
||||
## Timers
|
||||
|
||||
| Timer | Purpose |
|-------|---------|
| `DoShuffle` | Periodically trigger a shuffle operation |
| `PendingNeighborRequest(PI)` | Timeout for a pending neighbor request |
|
||||
|
||||
## IO Trait Pattern
|
||||
|
||||
The HyParView state machine is generic over an `IO` trait:
|
||||
|
||||
```rust
|
||||
pub trait IO<PI: Clone> {
|
||||
fn push(&mut self, event: impl Into<OutEvent<PI>>);
|
||||
}
|
||||
```
|
||||
|
||||
This allows the protocol to emit output events without knowing about the networking layer. The upper layers supply a `VecDeque<OutEvent>` or similar container.
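As an illustration of how an upper layer might satisfy this trait, the sketch below implements `IO` for a `VecDeque`. The `OutEvent` enum here is a reduced, hypothetical version with two variants, and `disconnect_all` is an invented helper that stands in for protocol logic.

```rust
use std::collections::VecDeque;

/// Hypothetical, reduced output event type used only for this example.
#[derive(Debug, PartialEq)]
pub enum OutEvent<PI> {
    SendMessage(PI, String),
    DisconnectPeer(PI),
}

pub trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}

/// A plain queue is enough to collect everything the state machine emits.
impl<PI: Clone> IO<PI> for VecDeque<OutEvent<PI>> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>) {
        self.push_back(event.into());
    }
}

/// Protocol-side code only sees `impl IO`, never a connection.
pub fn disconnect_all<PI: Clone>(peers: &[PI], io: &mut impl IO<PI>) {
    for peer in peers {
        io.push(OutEvent::DisconnectPeer(peer.clone()));
    }
}
```

Taking `impl Into<OutEvent<PI>>` lets each layer push its own message or event type as long as a conversion into the shared output type exists.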
|
||||
@@ -0,0 +1,256 @@
|
||||
# iroh-gossip: PlumTree Broadcast Protocol
|
||||
|
||||
## Overview
|
||||
|
||||
The PlumTree (Epidemic Broadcast Trees) protocol provides **efficient message broadcasting** across all peers in a topic's swarm. It builds on top of HyParView's membership layer, using the active view as its peer set.
|
||||
|
||||
It is implemented in `src/proto/plumtree.rs`.
|
||||
|
||||
## Core Concept: Eager vs Lazy Push
|
||||
|
||||
Each peer maintains two subsets of its HyParView active view:
|
||||
|
||||
| Set | Description | Behavior |
|-----|-------------|----------|
| **Eager push peers** | Peers to whom full messages are sent immediately | Messages are pushed eagerly (full content) |
| **Lazy push peers** | Peers to whom only message IDs (hashes) are sent | `IHave` announcements are sent, requesting content only if needed |
|
||||
|
||||
When a peer broadcasts a message:
|
||||
1. The **full message** is pushed to all **eager** peers.
|
||||
2. The **message ID** (a blake3 hash) is pushed to all **lazy** peers (after a short delay for batching).
|
||||
|
||||
This creates an **optimized broadcast tree**: eager peers form a spanning tree for low-latency delivery, while lazy peers provide redundancy through timeout-based recovery.
|
||||
|
||||
## Configuration (`plumtree::Config`)
|
||||
|
||||
```rust
|
||||
pub struct Config {
|
||||
pub graft_timeout_1: Duration, // Default: 80ms
|
||||
pub graft_timeout_2: Duration, // Default: 40ms
|
||||
pub dispatch_timeout: Duration, // Default: 5ms
|
||||
pub optimization_threshold: Round, // Default: Round(7)
|
||||
pub message_cache_retention: Duration, // Default: 30s
|
||||
pub message_id_retention: Duration, // Default: 90s
|
||||
pub cache_evict_interval: Duration, // Default: 1s
|
||||
}
|
||||
```
|
||||
|
||||
### Timeout Semantics
|
||||
|
||||
- **`graft_timeout_1`**: After receiving an `IHave`, wait this long for the full message from an eager peer. If it doesn't arrive, send a `Graft` to the `IHave` sender.
|
||||
- **`graft_timeout_2`**: After sending a `Graft`, wait this shorter timeout for the reply. If no reply, try the next `IHave` sender.
|
||||
- **`dispatch_timeout`**: Delay before batching and sending `IHave` messages. This allows multiple announcements to be aggregated into a single message.
|
||||
- **`optimization_threshold`**: Number of hops difference required to trigger tree optimization (see below).
|
||||
|
||||
### Cache Settings
|
||||
|
||||
- **`message_cache_retention`**: How long to keep full message payloads in cache. This enables replying to `Graft` requests from peers who missed the eager push.
|
||||
- **`message_id_retention`**: How long to remember that we've already seen a message ID. This prevents re-delivering duplicate messages.
|
||||
- **`cache_evict_interval`**: How often to check and evict expired entries.
|
||||
|
||||
## State Structure
|
||||
|
||||
```rust
|
||||
pub struct State<PI> {
|
||||
me: PI, // Our peer identity
|
||||
config: Config, // Protocol configuration
|
||||
|
||||
pub eager_push_peers: BTreeSet<PI>, // Full message delivery peers
|
||||
pub lazy_push_peers: BTreeSet<PI>, // Message-ID-only delivery peers
|
||||
|
||||
lazy_push_queue: BTreeMap<PI, Vec<IHave>>, // Pending IHave announcements (batched)
|
||||
|
||||
missing_messages: HashMap<MessageId, VecDeque<(PI, Round)>>, // IHave senders awaiting delivery
|
||||
received_messages: TimeBoundCache<MessageId, ()>, // Seen message IDs
|
||||
cache: TimeBoundCache<MessageId, Gossip>, // Full message payloads
|
||||
|
||||
graft_timer_scheduled: HashSet<MessageId>, // Active graft timers
|
||||
dispatch_timer_scheduled: bool, // Whether IHave dispatch is pending
|
||||
|
||||
init: bool, // Whether first event was processed
|
||||
stats: Stats, // Message counters
|
||||
max_message_size: usize, // Maximum allowed message size
|
||||
}
|
||||
```
|
||||
|
||||
## Message Types (`plumtree::Message`)
|
||||
|
||||
| Message | Direction | Purpose |
|---------|-----------|---------|
| `Gossip(Gossip)` | Eager push | Full message content, broadcast to eager peers |
| `Prune` | Bidirectional | Sent when moving a peer from eager to lazy set |
| `Graft(Graft)` | Lazy → Eager upgrade | Request to become an eager peer; may include a message ID to request re-delivery |
| `IHave(Vec<IHave>)` | Lazy push | Announcement: "I have these messages" (batched, sent after `dispatch_timeout`) |
|
||||
|
||||
### Gossip Message Structure
|
||||
|
||||
```rust
|
||||
pub struct Gossip {
|
||||
id: MessageId, // blake3 hash of content
|
||||
content: Bytes, // The actual message payload
|
||||
scope: DeliveryScope, // Swarm(round) or Neighbors
|
||||
}
|
||||
```
|
||||
|
||||
The `DeliveryScope` tracks how many hops the message has traveled:
|
||||
|
||||
```rust
|
||||
pub enum DeliveryScope {
|
||||
Swarm(Round), // Delivered via the swarm; Round = hop count from origin
|
||||
Neighbors, // Delivered only to direct neighbors (not forwarded further)
|
||||
}
|
||||
```
|
||||
|
||||
Each time a `Gossip` message is forwarded, its `Round` is incremented via `next_round()`. `Neighbors`-scope messages are not forwarded at all.
|
||||
|
||||
### IHave Structure
|
||||
|
||||
```rust
|
||||
pub struct IHave {
|
||||
id: MessageId, // The blake3 hash of the message content
|
||||
round: Round, // The hop count at which the sender received this message
|
||||
}
|
||||
```
|
||||
|
||||
### Graft Structure
|
||||
|
||||
```rust
|
||||
pub struct Graft {
|
||||
id: Option<MessageId>, // If set, also reply with full message content
|
||||
round: Round, // The round from the IHave that triggered this graft
|
||||
}
|
||||
```
|
||||
|
||||
### Message ID
|
||||
|
||||
```rust
|
||||
pub struct MessageId([u8; 32]); // blake3 hash of message content
|
||||
|
||||
impl MessageId {
|
||||
pub fn from_content(message: &[u8]) -> Self {
|
||||
Self::from(blake3::hash(message))
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Messages are validated: when receiving a `Gossip`, the receiver checks that `MessageId::from_content(&content) == id`. Spoofed messages (where the hash doesn't match the content) are silently discarded.
|
||||
|
||||
## Broadcast Flow
|
||||
|
||||
### Sending a Message
|
||||
|
||||
```
|
||||
1. Compute MessageId = blake3(content)
|
||||
2. Create Gossip { id, content, scope: Swarm(Round(0)) or Neighbors }
|
||||
3. If Swarm scope:
|
||||
a. Add to received_messages and cache
|
||||
b. Queue IHave for lazy peers (dispatched after dispatch_timeout)
|
||||
4. Eager-push Gossip to all eager peers (except self and sender)
|
||||
```
|
||||
|
||||
### Receiving a Gossip Message
|
||||
|
||||
```
|
||||
1. Validate: message.id == blake3(message.content) → discard if invalid
|
||||
2. If already received (in received_messages):
|
||||
→ Send Prune to sender (move sender to lazy set)
|
||||
→ Return (don't re-broadcast)
|
||||
3. If Swarm scope:
|
||||
a. Add to received_messages
|
||||
b. Increment round (next_round)
|
||||
c. Add to cache (for Graft replies)
|
||||
d. Eager-push to all eager peers (except sender)
|
||||
e. Lazy-push IHave to all lazy peers (except sender)
|
||||
f. Check if any prior IHave senders had a shorter path → optimize tree
|
||||
4. Emit Received event to application
|
||||
```
|
||||
|
||||
### Receiving an IHave
|
||||
|
||||
```
|
||||
For each IHave entry:
|
||||
If message ID not in received_messages:
|
||||
Add (sender, round) to missing_messages[message_id]
|
||||
If no graft timer scheduled for this message:
|
||||
Schedule SendGraft timer (graft_timeout_1)
|
||||
```
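The IHave bookkeeping above might look roughly like this in Rust. It is a sketch only: `LazyState` holds just the three fields involved, `u32` and `u16` stand in for the peer identity and `Round` types, and the boolean return value is an invented way of telling the caller to schedule the `SendGraft` timer.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

pub type MessageId = [u8; 32];
pub type PeerId = u32; // stand-in for the generic `PI`
pub type Round = u16; // assumption: plain integer instead of the crate's `Round`

/// Reduced state holding only the fields touched when an IHave arrives.
#[derive(Default)]
pub struct LazyState {
    pub received_messages: HashSet<MessageId>,
    pub missing_messages: HashMap<MessageId, VecDeque<(PeerId, Round)>>,
    pub graft_timer_scheduled: HashSet<MessageId>,
}

impl LazyState {
    /// Records one IHave entry. Returns true when the caller should schedule a
    /// `SendGraft` timer (with `graft_timeout_1`) for this message.
    pub fn on_ihave(&mut self, from: PeerId, id: MessageId, round: Round) -> bool {
        if self.received_messages.contains(&id) {
            return false; // already delivered, nothing is missing
        }
        // Remember who announced the message, in arrival order.
        self.missing_messages.entry(id).or_default().push_back((from, round));
        // `insert` returns true only if no timer was scheduled for this id yet.
        self.graft_timer_scheduled.insert(id)
    }
}
```

Keeping the announcers in a queue means later timer expiries can fall back to the next peer that advertised the same message.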
|
||||
|
||||
### Graft Timer Expiry (Two-Phase)
|
||||
|
||||
**Phase 1 (`graft_timeout_1`):**
|
||||
```
|
||||
If message already received → no-op (cancel)
|
||||
Otherwise:
|
||||
Pop first (peer, round) from missing_messages[message_id]
|
||||
Move peer to eager set
|
||||
Send Graft { id: Some(message_id), round } to that peer
|
||||
Schedule another SendGraft timer (graft_timeout_2) for fallback
|
||||
```
|
||||
|
||||
**Phase 2 (`graft_timeout_2`):**
|
||||
```
|
||||
If message already received → no-op
|
||||
Otherwise:
|
||||
Pop next (peer, round) from missing_messages[message_id]
|
||||
Move that peer to eager set
|
||||
Send Graft { id: Some(message_id), round }
|
||||
Schedule another SendGraft timer (graft_timeout_2)
|
||||
(continues until the message is received or senders are exhausted)
|
||||
```
|
||||
|
||||
### Receiving a Graft
|
||||
|
||||
```
|
||||
1. Move sender to eager set
|
||||
2. If Graft contains a message ID:
|
||||
Look up message in cache
|
||||
If found: send Gossip(message) to the requesting peer
|
||||
```
|
||||
|
||||
### Receiving a Prune
|
||||
|
||||
```
|
||||
Move sender from eager set to lazy set
|
||||
```
|
||||
|
||||
## Tree Optimization
|
||||
|
||||
The PlumTree self-optimizes based on hop count (the `Round` carried by each message), which serves as a proxy for latency. When a `Gossip` message is received, if we previously received an `IHave` for the same message from a different peer, we check whether the `IHave` path was significantly shorter:
|
||||
|
||||
```
|
||||
if (ihave_round < gossip_round) && (gossip_round - ihave_round) >= optimization_threshold:
|
||||
Graft the IHave sender (move to eager)
|
||||
Prune the Gossip sender (move to lazy)
|
||||
```
|
||||
|
||||
This means that if a lazy peer announces a message over a path at least `optimization_threshold` hops shorter than the path it took via an eager peer, the lazy peer is promoted to eager and the longer-path peer is demoted to lazy. The `optimization_threshold` (default: 7 hops) prevents thrashing from small differences in path length.
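The optimization condition reduces to a single predicate. In this sketch plain `u16` values stand in for the crate's `Round` type, and the function name `should_optimize` is hypothetical.

```rust
/// Returns true when the tree should be re-wired: graft the IHave sender and
/// prune the Gossip sender. Rounds are hop counts from the message origin.
pub fn should_optimize(ihave_round: u16, gossip_round: u16, threshold: u16) -> bool {
    // The first comparison guards the subtraction against underflow.
    ihave_round < gossip_round && (gossip_round - ihave_round) >= threshold
}
```

For example, with the default threshold of 7, an `IHave` seen at round 1 and a `Gossip` arriving at round 8 would trigger a swap, while a `Gossip` arriving at round 7 would not.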
|
||||
|
||||
## Neighbor Events
|
||||
|
||||
PlumTree receives neighbor events from HyParView:
|
||||
|
||||
- **`NeighborUp(peer)`**: Add peer to eager set (all new neighbors start as eager)
|
||||
- **`NeighborDown(peer)`**: Remove from both eager and lazy sets; clean up any `IHave` entries from this peer in `missing_messages`
|
||||
|
||||
## Neighbor-Only Broadcast
|
||||
|
||||
The `Scope::Neighbors` broadcast scope sends a message only to directly connected peers (the active view), without any forwarding:
|
||||
|
||||
```rust
|
||||
pub enum Scope {
|
||||
Swarm, // Broadcast to all peers in the swarm
|
||||
Neighbors, // Broadcast only to immediate neighbors
|
||||
}
|
||||
```
|
||||
|
||||
Neighbor-scoped messages are useful for localized communication and are not cached or re-broadcast.
|
||||
|
||||
## Cache Management
|
||||
|
||||
The PlumTree maintains two time-bounded caches:
|
||||
|
||||
1. **`cache`** (`TimeBoundCache<MessageId, Gossip>`): Stores full message payloads for `message_cache_retention` (default 30s). This enables replying to `Graft` requests for recently-broadcast messages.
|
||||
|
||||
2. **`received_messages`** (`TimeBoundCache<MessageId, ()>`): Tracks which messages have been seen for `message_id_retention` (default 90s). This prevents duplicate delivery.
|
||||
|
||||
Both caches are periodically evicted (every `cache_evict_interval`, default 1s) via the `EvictCache` timer.
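A time-bounded cache of this kind can be approximated with a map from key to (expiry, value). The sketch below is a simplified assumption about the shape of `TimeBoundCache`, not the implementation in `util`; the method names are illustrative, and the current time is passed in explicitly so the structure stays IO-free.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::Instant;

/// Simplified stand-in for `TimeBoundCache`: each value carries its expiry time.
pub struct TimeBoundCache<K, V> {
    entries: HashMap<K, (Instant, V)>,
}

impl<K: Hash + Eq, V> TimeBoundCache<K, V> {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Inserts `value`, to be kept until `expires`.
    pub fn insert(&mut self, key: K, value: V, expires: Instant) {
        self.entries.insert(key, (expires, value));
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        self.entries.get(key).map(|(_, value)| value)
    }

    /// Drops every entry whose expiry is at or before `now`; returns how many were removed.
    pub fn expire_until(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, (expires, _)| *expires > now);
        before - self.entries.len()
    }
}
```

An `EvictCache` timer handler would then call `expire_until(now)` on both caches each time it fires, with payloads inserted at `now + message_cache_retention` and IDs at `now + message_id_retention`.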
|
||||
187
docs/research/references/iroh/iroh-gossip/04-state-and-topic.md
Normal file
@@ -0,0 +1,187 @@
|
||||
# iroh-gossip: Protocol State & Topic Coordination
|
||||
|
||||
## Overview
|
||||
|
||||
The `state` module (`src/proto/state.rs`) provides the **top-level protocol state machine** that manages multiple topics. The `topic` module (`src/proto/topic.rs`) coordinates the HyParView and PlumTree state machines for a single topic.
|
||||
|
||||
## Multi-Topic State (`state::State`)
|
||||
|
||||
```rust
|
||||
pub struct State<PI, R> {
|
||||
me: PI, // Our peer identity
|
||||
me_data: PeerData, // Our opaque peer data
|
||||
config: Config, // Protocol configuration
|
||||
rng: R, // Random number generator
|
||||
states: HashMap<TopicId, topic::State<PI, R>>, // Per-topic state
|
||||
outbox: Outbox<PI>, // Buffered output events
|
||||
peer_topics: ConnsMap<PI>, // Maps peer → set of shared topics
|
||||
}
|
||||
```
|
||||
|
||||
The `State` acts as a **multiplexer** — it routes events to the correct topic's state and collects output events. It also tracks which topics are shared with each peer (in `peer_topics`), which is used to determine when a peer connection can safely be closed (only when no topic still needs it).
|
||||
|
||||
### TopicId
|
||||
|
||||
```rust
|
||||
#[derive(Clone, Copy, Eq, PartialEq, Hash, Serialize, Ord, PartialOrd, Deserialize)]
|
||||
pub struct TopicId([u8; 32]);
|
||||
```
|
||||
|
||||
A 32-byte identifier for a topic. Typically created as `blake3::hash(topic_name)` or from raw bytes. Each topic is an independent swarm and broadcast scope.
|
||||
|
||||
### Wire Message Format
|
||||
|
||||
```rust
|
||||
pub struct Message<PI> {
|
||||
pub topic: TopicId,
|
||||
pub message: topic::Message<PI>,
|
||||
}
|
||||
```
|
||||
|
||||
Every wire message carries the `TopicId` prefix, allowing multiplexing of multiple topics over a single connection.
|
||||
|
||||
### Event Routing
|
||||
|
||||
`InEvent` is mapped to either a topic-specific event or a global event:
|
||||
|
||||
| InEvent | Routing |
|---------|---------|
| `RecvMessage(from, Message{topic, message})` | → Topic-specific: `topic::InEvent::RecvMessage` |
| `Command(topic, command)` | → Topic-specific: `topic::InEvent::Command` |
| `TimerExpired(Timer{topic, timer})` | → Topic-specific: `topic::InEvent::TimerExpired` |
| `PeerDisconnected(peer)` | → Broadcast to ALL topics |
| `UpdatePeerData(data)` | → Broadcast to ALL topics |
|
||||
|
||||
### Topic Lifecycle
|
||||
|
||||
When a `Command::Join(peers)` is received for a topic that doesn't yet have state, a new `topic::State` is automatically created. When `Command::Quit` is received, the topic's state is removed after processing the quit event.
|
||||
|
||||
### Connection Management
|
||||
|
||||
When a `topic::OutEvent::DisconnectPeer(peer)` is emitted, the state module checks `peer_topics` to see if any other topic still needs a connection to that peer. Only when no topic needs the peer anymore is `OutEvent::DisconnectPeer(peer)` emitted at the top level.
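The reference-counting idea behind `peer_topics` can be sketched as a map from peer to the set of topics that still need it. `PeerTopics`, its method names, and the `u32` peer ids are hypothetical stand-ins for `ConnsMap<PI>` and the surrounding logic.

```rust
use std::collections::{HashMap, HashSet};

pub type TopicId = [u8; 32];

/// Hypothetical stand-in for `ConnsMap<PI>`, using `u32` peer ids.
#[derive(Default)]
pub struct PeerTopics {
    by_peer: HashMap<u32, HashSet<TopicId>>,
}

impl PeerTopics {
    /// Records that `topic` now uses the connection to `peer`.
    pub fn connected(&mut self, peer: u32, topic: TopicId) {
        self.by_peer.entry(peer).or_default().insert(topic);
    }

    /// Called when a topic emits `DisconnectPeer(peer)`. Returns true only if no
    /// topic needs the peer any more, i.e. the top-level disconnect should be emitted.
    pub fn topic_released(&mut self, peer: u32, topic: &TopicId) -> bool {
        let Some(topics) = self.by_peer.get_mut(&peer) else {
            return false; // unknown peer: nothing to close
        };
        topics.remove(topic);
        if topics.is_empty() {
            self.by_peer.remove(&peer);
            true
        } else {
            false
        }
    }
}
```

This keeps a single connection shared between topics alive until the last topic lets go of it.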
|
||||
|
||||
## Topic State (`topic::State`)

```rust
pub struct State<PI, R> {
    me: PI,
    pub swarm: hyparview::State<PI, R>, // HyParView membership
    pub gossip: plumtree::State<PI>,    // PlumTree broadcast
    outbox: VecDeque<OutEvent<PI>>,
    stats: Stats,
}
```

The topic state **composes** HyParView and PlumTree, bridging them together:

### Event Forwarding

When `topic::State::handle()` is called:

1. **HyParView events** are processed first (membership layer).
2. **NeighborUp/NeighborDown events** emitted by HyParView are forwarded to PlumTree:
   - `NeighborUp(peer)` → `plumtree::InEvent::NeighborUp(peer)`: adds peer to eager set
   - `NeighborDown(peer)` → `plumtree::InEvent::NeighborDown(peer)`: removes peer from both sets
3. All output events from both layers are collected and returned.

### Command Handling

| Command | Action |
|---------|--------|
| `Join(peers)` | Sends `RequestJoin(peer)` to HyParView for each peer in the list |
| `Broadcast(data, scope)` | Sends `Broadcast(data, scope)` to PlumTree |
| `Quit` | Sends `Quit` to HyParView (which sends `Disconnect` to all active peers) |

### Message Routing

When a topic message is received:

```rust
match message {
    Message::Swarm(message) => hyparview.handle(RecvMessage(from, message)),
    Message::Gossip(message) => plumtree.handle(RecvMessage(from, message)),
}
```

### Timer Routing

```rust
match timer {
    Timer::Swarm(timer) => hyparview.handle(TimerExpired(timer)),
    Timer::Gossip(timer) => plumtree.handle(TimerExpired(timer)),
}
```

## Topic Messages (`topic::Message`)

```rust
pub enum Message<PI> {
    Swarm(hyparview::Message<PI>), // Membership messages
    Gossip(plumtree::Message),     // Broadcast messages
}
```

The message kind is used for metrics tracking:

```rust
pub fn kind(&self) -> MessageKind {
    match self {
        Message::Swarm(_) => MessageKind::Control,
        Message::Gossip(message) => match message {
            plumtree::Message::Gossip(_) => MessageKind::Data,
            _ => MessageKind::Control,
        },
    }
}
```

## Topic Events (`topic::Event`)

```rust
pub enum Event<PI> {
    NeighborUp(PI),            // From HyParView: new active neighbor
    NeighborDown(PI),          // From HyParView: lost active neighbor
    Received(GossipEvent<PI>), // From PlumTree: received a gossip message
}
```

The `Received` event contains:

```rust
pub struct GossipEvent<PI> {
    pub content: Bytes,       // Message payload
    pub delivered_from: PI,   // Peer that delivered the message to us
    pub scope: DeliveryScope, // Swarm(round) or Neighbors
}
```

## Topic Configuration

```rust
pub struct Config {
    pub membership: hyparview::Config, // HyParView configuration
    pub broadcast: plumtree::Config,   // PlumTree configuration
    pub max_message_size: usize,       // Maximum wire message size (default: 4096)
}
```

The `max_message_size` is the total wire-level message size including headers. The actual payload capacity is computed as `max_message_size - postcard_header_size`, where the header size accounts for the topic ID and message envelope overhead.

## Statistics

Each topic tracks:

```rust
pub struct Stats {
    pub messages_sent: usize,
    pub messages_received: usize,
}
```

The PlumTree layer also tracks:

```rust
pub struct Stats {
    pub payload_messages_received: u64,
    pub control_messages_received: u64,
    pub max_last_delivery_hop: u16,
}
```

docs/research/references/iroh/iroh-gossip/05-net-actor.md
@@ -0,0 +1,244 @@
# iroh-gossip: Networking Layer & Actor Model

## Overview

The `net` module (`src/net.rs` and submodules) provides the async runtime layer that connects the IO-free protocol state machine to real network IO via iroh QUIC connections. It is built around a **single Actor** that manages all topics and connections.

## ALPN Protocol

```rust
pub const GOSSIP_ALPN: &[u8] = b"/iroh-gossip/1";
```

This ALPN identifier is used when establishing QUIC connections through iroh.

## Gossip Handle (`net::Gossip`)

```rust
#[derive(Debug, Clone)]
pub struct Gossip {
    pub(crate) inner: Arc<Inner>,
}
```

`Gossip` is the primary public handle. It derefs to `GossipApi`, providing the user-facing interface:

```rust
// Subscribe to a topic
let (sender, receiver) = gossip.subscribe(topic_id, bootstrap_peers).await?.split();

// Subscribe and wait for at least one connection
let topic = gossip.subscribe_and_join(topic_id, bootstrap_peers).await?;

// Broadcast a message
sender.broadcast(b"hello world".to_vec().into()).await?;

// Broadcast to neighbors only
sender.broadcast_neighbors(b"local announcement".to_vec().into()).await?;

// Join additional peers
sender.join_peers(vec![peer_id]).await?;
```

### Builder Pattern

```rust
let gossip = Gossip::builder()
    .max_message_size(8192)              // Default: 4096
    .membership_config(hyparview_config) // HyParView settings
    .broadcast_config(plumtree_config)   // PlumTree settings
    .alpn(b"/custom-alpn")               // Custom ALPN (must match across network)
    .spawn(endpoint);
```

## Architecture: The Actor

The core of the networking layer is the `Actor` struct, which runs as a single async task:

```rust
struct Actor {
    alpn: Bytes,
    state: proto::State<PublicKey, StdRng>,              // Protocol state machine
    endpoint: Endpoint,                                  // iroh endpoint for connections
    dialer: Dialer,                                      // Manages outgoing connections
    rpc_rx: mpsc::Receiver<RpcMessage>,                  // API commands
    local_rx: mpsc::Receiver<LocalActorMessage>,         // Local commands (connections, shutdown)
    in_event_tx: mpsc::Sender<InEvent>,                  // Protocol input channel
    in_event_rx: mpsc::Receiver<InEvent>,                // Protocol input channel (receiver)
    timers: Timers<Timer>,                               // Scheduled timers
    topics: HashMap<TopicId, TopicState>,                // Per-topic subscription state
    peers: HashMap<EndpointId, PeerState>,               // Per-peer connection state
    command_rx: stream_group::Keyed<TopicCommandStream>, // Per-topic command streams
    quit_queue: VecDeque<TopicId>,                       // Topics pending unsubscription
    connection_tasks: JoinSet<...>,                      // Running connection loop tasks
    metrics: Arc<Metrics>,
    topic_event_forwarders: JoinSet<TopicId>,            // Tasks forwarding events to subscribers
    address_lookup: GossipAddressLookup,                 // Address discovery integration
}
```

### Event Loop

The actor's `run()` method calls `event_loop()` in a loop. Each iteration uses `tokio::select!` to handle:

| Source | Action |
|--------|--------|
| `local_rx` (local messages) | Handle incoming connections or shutdown |
| `rpc_rx` (RPC messages) | Process `Join` requests from the API |
| `command_rx` (per-topic commands) | Process `Broadcast`, `BroadcastNeighbors`, `JoinPeers`, or stream closure |
| `addr_updates` (endpoint addr changes) | Update our `PeerData` in the protocol state |
| `dialer` (connection establishment) | Handle successful/failed outgoing connections |
| `in_event_rx` (protocol events from connections) | Feed events to the protocol state machine |
| `timers` (scheduled timers) | Feed timer expirations to the protocol state machine |
| `connection_tasks` (connection task completions) | Handle peer disconnections |
| `topic_event_forwarders` (subscription tasks) | Handle topic cleanup when all subscribers drop |

### Processing InEvents

When an `InEvent` is processed, the actor calls `self.state.handle(event, now, metrics)`, which returns `Vec<OutEvent>`. For each `OutEvent`:

| OutEvent | Action |
|----------|--------|
| `SendMessage(peer, message)` | Send via peer's active connection or queue for pending connection |
| `EmitEvent(topic, event)` | Forward to topic's `broadcast::Sender` → subscribers |
| `ScheduleTimer(delay, timer)` | Schedule timer via `Timers` data structure |
| `DisconnectPeer(peer)` | Drop the peer's send channel, removing from `peers` map |
| `PeerData(endpoint_id, data)` | Decode `AddrInfo` from `PeerData`, add to `GossipAddressLookup` |

## Connection Management

### Peer States

```rust
enum PeerState {
    Pending {
        queue: Vec<ProtoMessage>, // Messages queued while connecting
    },
    Active {
        active_send_tx: mpsc::Sender<ProtoMessage>, // Current active send channel
        active_conn_id: ConnId,                     // Stable ID of active connection
        other_conns: Vec<ConnId>,                   // Older connections still closing
    },
}
```

When a message needs to be sent to a peer:
- **Active**: Send immediately via `active_send_tx`
- **Pending**: Queue the message and initiate a dial

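The two send paths above can be sketched with a simplified state machine. This is a hedged illustration rather than the actor's real code: messages are plain `String`s, the active send channel is modelled as a `Vec`, peers are `u32` ids, and `send_to_peer` / `on_connected` are hypothetical helper names. Treating an unknown peer as "create a `Pending` entry and request a dial" is also an assumption of this sketch.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Simplified stand-in for `PeerState` (illustration only).
#[derive(Debug, PartialEq)]
pub enum PeerState {
    Pending { queue: Vec<String> },
    Active { sent: Vec<String> },
}

/// Hand `msg` to `peer`. Returns `true` when the caller should start a dial.
pub fn send_to_peer(peers: &mut HashMap<u32, PeerState>, peer: u32, msg: String) -> bool {
    match peers.entry(peer) {
        Entry::Occupied(mut entry) => {
            match entry.get_mut() {
                // Connected: deliver immediately.
                PeerState::Active { sent } => sent.push(msg),
                // A dial is already in flight: just queue the message.
                PeerState::Pending { queue } => queue.push(msg),
            }
            false
        }
        // Unknown peer: queue the message and ask for a dial.
        Entry::Vacant(entry) => {
            entry.insert(PeerState::Pending { queue: vec![msg] });
            true
        }
    }
}

/// Called once a connection is established. Returns the queued messages,
/// which a connection loop would send before anything else.
pub fn on_connected(peers: &mut HashMap<u32, PeerState>, peer: u32) -> Vec<String> {
    match peers.insert(peer, PeerState::Active { sent: Vec::new() }) {
        Some(PeerState::Pending { queue }) => queue,
        Some(active) => {
            // Already active: keep the existing state untouched.
            peers.insert(peer, active);
            Vec::new()
        }
        None => Vec::new(),
    }
}
```

Returning the queue from `on_connected` mirrors the `queue` parameter that the connection loop receives so that buffered messages go out first.
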
### Dialer

```rust
struct Dialer {
    endpoint: Endpoint,
    pending: JoinSet<(EndpointId, Option<Result<Connection, ConnectError>>)>,
    pending_dials: HashMap<EndpointId, CancellationToken>,
}
```

The `Dialer` manages outgoing connections. It:
1. Checks if a dial is already pending for a peer
2. Spawns an async connection task with cancellation support
3. Returns completed connections via `next_conn()`

### Connection Loop

Each peer connection runs a `connection_loop` task:

```rust
async fn connection_loop(
    from: PublicKey,                       // Remote peer's public key
    conn: Connection,                      // QUIC connection
    origin: ConnOrigin,                    // Accept (incoming) or Dial (outgoing)
    send_rx: mpsc::Receiver<ProtoMessage>, // Messages to send
    in_event_tx: mpsc::Sender<InEvent>,    // Channel to protocol
    max_message_size: usize,               // Maximum message size
    queue: Vec<ProtoMessage>,              // Queued messages to send first
) -> Result<(), ConnectionLoopError>
```

The connection loop:
1. First sends any queued messages
2. Runs a send loop and receive loop concurrently (`tokio::join!`)
3. Uses iroh QUIC unidirectional streams for communication

### Wire Protocol

Messages are serialized with `postcard` and sent as **length-prefixed frames** over QUIC unidirectional streams:

```
┌───────────────┐
│ Stream Header │ ── Contains TopicId (sent once per stream)
├───────────────┤
│ Frame (len)   │ ── u32 length prefix
│ Frame (data)  │ ── postcard-encoded topic::Message<PublicKey>
├───────────────┤
│ Frame (len)   │ ── next message...
│ Frame (data)  │
└───────────────┘
```

Each topic gets its own unidirectional stream. The stream header is sent once when the stream is opened. Disconnect messages close the stream after being sent.

The `SendLoop` manages per-topic streams within a connection:

```rust
struct SendLoop {
    conn: Connection,
    streams: HashMap<TopicId, SendStream>, // One stream per topic
    buffer: Vec<u8>,
    max_message_size: usize,
    send_rx: mpsc::Receiver<ProtoMessage>,
}
```

When a disconnect message is sent for a topic, the stream for that topic is closed (via `finish()`).

## Topic State (Net Layer)

```rust
struct TopicState {
    neighbors: BTreeSet<EndpointId>,             // Current active neighbors (from protocol)
    event_sender: broadcast::Sender<ProtoEvent>, // Broadcast channel to subscribers
    command_rx_keys: HashSet<stream_group::Key>, // Active command stream keys
}
```

A topic is considered "still needed" if it has either:
- Active command receivers (publishers), or
- Active event subscribers (subscribers)

When neither exists, the topic is queued for quit/unsubscription.

## Address Lookup Integration

The `GossipAddressLookup` integrates with iroh's address discovery system:

```rust
pub(crate) struct GossipAddressLookup {
    endpoints: NodeMap,                       // BTreeMap<EndpointId, StoredEndpointInfo>
    _task_handle: Arc<AbortOnDropHandle<()>>, // Background eviction task
}
```

It implements iroh's `AddressLookup` trait, allowing gossip-discovered peer addresses to feed back into iroh's connection establishment. This means that when a peer shares its address information in `Join` or `ForwardJoin` messages, that information is used to help iroh connect to that peer.

Entries expire after 5 minutes (configurable via `RetentionOpts`), with eviction checks every 30 seconds.

## Metrics

The `Metrics` struct tracks various counters:

| Metric | Description |
|--------|-------------|
| `msgs_ctrl_sent` | Control messages sent |
| `msgs_ctrl_recv` | Control messages received |
| `msgs_data_sent` | Data messages sent |
| `msgs_data_recv` | Data messages received |
| `msgs_data_sent_size` | Total size of data messages sent |
| `msgs_data_recv_size` | Total size of data messages received |
| `msgs_ctrl_sent_size` | Total size of control messages sent |
| `msgs_ctrl_recv_size` | Total size of control messages received |
| `neighbor_up` | Neighbor connections established |
| `neighbor_down` | Neighbor connections lost |
| `actor_tick_*` | Various event loop tick counters |

docs/research/references/iroh/iroh-gossip/06-api-data-flow.md
@@ -0,0 +1,290 @@
# iroh-gossip: Public API & Data Flow

## Public API Types

### Gossip (Main Handle)

The `Gossip` struct is the main entry point, created via a `Builder`:

```rust
let gossip = Gossip::builder()
    .max_message_size(8192)
    .membership_config(HyparviewConfig { ... })
    .broadcast_config(PlumtreeConfig { ... })
    .alpn(b"/custom-alpn")
    .spawn(endpoint);
```

It derefs to `GossipApi`, which provides:

| Method | Description |
|--------|-------------|
| `subscribe(topic_id, bootstrap)` | Join a topic with default options |
| `subscribe_and_join(topic_id, bootstrap)` | Join and wait for at least one connection |
| `subscribe_with_opts(topic_id, opts)` | Join with custom `JoinOptions` |
| `handle_connection(conn)` | Handle an incoming QUIC connection |
| `shutdown()` | Gracefully leave all topics and stop |
| `max_message_size()` | Get configured max message size |
| `metrics()` | Get metrics handle |

### GossipTopic (Subscription Handle)

Returned by `subscribe()`, it is a `Stream<Item = Result<Event, ApiError>>`:

```rust
let topic: GossipTopic = gossip.subscribe(topic_id, peers).await?;
topic.broadcast(b"hello".to_vec().into()).await?;
topic.broadcast_neighbors(b"local".to_vec().into()).await?;
topic.joined().await?; // Wait for first connection
```

Can be split into sender and receiver:

```rust
let (sender, receiver) = topic.split();
// sender: GossipSender - can broadcast and join peers
// receiver: GossipReceiver - can receive events and check neighbors
```

### GossipSender

```rust
pub struct GossipSender(mpsc::Sender<Command>);

impl GossipSender {
    pub async fn broadcast(&self, message: Bytes) -> Result<(), ApiError>;
    pub async fn broadcast_neighbors(&self, message: Bytes) -> Result<(), ApiError>;
    pub async fn join_peers(&self, peers: Vec<EndpointId>) -> Result<(), ApiError>;
}
```

### GossipReceiver

```rust
pub struct GossipReceiver {
    stream: Pin<Box<dyn Stream<Item = Result<Event, ApiError>> + Send + Sync + 'static>>,
    neighbors: HashSet<EndpointId>,
}

impl GossipReceiver {
    pub fn neighbors(&self) -> impl Iterator<Item = EndpointId> + '_;
    pub async fn joined(&mut self) -> Result<(), ApiError>;
    pub fn is_joined(&self) -> bool;
}
```

The `GossipReceiver` tracks the neighbor set internally by processing `NeighborUp` and `NeighborDown` events.

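The neighbor tracking described above amounts to folding the event stream into a set. The sketch below shows that idea in isolation. It is not the crate's implementation: `NeighborTracker` and `observe` are hypothetical names, endpoint ids are reduced to `u64`, and defining `is_joined` as "has at least one neighbor" is an assumption based on the "wait for first connection" wording.

```rust
use std::collections::HashSet;

/// Simplified event type; ids are `u64` here instead of `EndpointId`.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    NeighborUp(u64),
    NeighborDown(u64),
    Received(Vec<u8>),
    Lagged,
}

/// Hypothetical helper that mirrors the receiver's neighbor bookkeeping.
#[derive(Debug, Default)]
pub struct NeighborTracker {
    neighbors: HashSet<u64>,
}

impl NeighborTracker {
    /// Update the neighbor set from one event; other events are ignored.
    pub fn observe(&mut self, event: &Event) {
        match event {
            Event::NeighborUp(id) => {
                self.neighbors.insert(*id);
            }
            Event::NeighborDown(id) => {
                self.neighbors.remove(id);
            }
            Event::Received(_) | Event::Lagged => {}
        }
    }

    /// Assumed meaning of "joined": at least one direct neighbor.
    pub fn is_joined(&self) -> bool {
        !self.neighbors.is_empty()
    }

    pub fn neighbors(&self) -> impl Iterator<Item = u64> + '_ {
        self.neighbors.iter().copied()
    }
}
```

Because the set is derived purely from events, a consumer that skips events (for example after `Lagged`) may see a stale neighbor set until later events correct it.
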
### Event Types

```rust
pub enum Event {
    NeighborUp(EndpointId),   // New direct neighbor connected
    NeighborDown(EndpointId), // Direct neighbor disconnected
    Received(Message),        // Gossip message received
    Lagged,                   // Internal channel lagged (messages dropped)
}

pub struct Message {
    pub content: Bytes,             // Message content
    pub scope: DeliveryScope,       // Swarm(round) or Neighbors
    pub delivered_from: EndpointId, // Peer that delivered the message to us
}
```

### Command Types

```rust
pub enum Command {
    Broadcast(Bytes),           // Broadcast to all in swarm
    BroadcastNeighbors(Bytes),  // Broadcast to direct neighbors only
    JoinPeers(Vec<EndpointId>), // Join additional peers
}
```

### JoinOptions

```rust
pub struct JoinOptions {
    pub bootstrap: BTreeSet<EndpointId>, // Initial peers to connect to
    pub subscription_capacity: usize,    // Event channel capacity (default: 2048)
}
```

### DeliveryScope

```rust
pub enum DeliveryScope {
    Swarm(Round), // Message traveled `Round` hops from origin
    Neighbors,    // Direct neighbor message (not forwarded)
}
```

`DeliveryScope::Swarm(Round(0))` means the message arrived directly from the peer that originally broadcast it. `Round(n)` means it was forwarded by `n` intermediate peers before reaching us.

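As a small illustration of how an application might interpret the scope, the sketch below classifies a delivery as first-hand or relayed. It assumes that round 0 denotes direct delivery, as the text above describes; the `u16` width of `Round` and the helper names `is_direct` and `relays` are choices made for this example, not confirmed API.

```rust
/// Hop counter; the inner integer width is an assumption of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Round(pub u16);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryScope {
    Swarm(Round),
    Neighbors,
}

impl DeliveryScope {
    /// True when the message came straight from its publisher.
    pub fn is_direct(&self) -> bool {
        matches!(self, DeliveryScope::Neighbors | DeliveryScope::Swarm(Round(0)))
    }

    /// Number of intermediate peers that relayed the message.
    /// Neighbor-scoped messages are never forwarded, so they report 0.
    pub fn relays(&self) -> u16 {
        match self {
            DeliveryScope::Swarm(Round(n)) => *n,
            DeliveryScope::Neighbors => 0,
        }
    }
}
```

An application could use `is_direct` to decide whether `delivered_from` also identifies the author of a message, which only holds for direct deliveries.
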
## Data Flow Diagrams

### Joining a Topic

```
User Code                    GossipApi                 Actor                  Proto State
    |                            |                       |                         |
    |-- subscribe(topic, peers)->|                       |                         |
    |                            |-- JoinRequest ------->|                         |
    |                            |                       |-- Command::Join ------->|
    |                            |                       |                         |-- RequestJoin(peers)
    |                            |                       |                         |-- SendMessage(peer, Join)
    |                            |                       |                         |-- ...
    |                            |<-- NeighborUp events--|<-- EmitEvent(NeighborUp)|
    |<-- Event::NeighborUp ------|                       |                         |
```

### Broadcasting a Message

```
User Code           GossipSender         Actor            Proto State           Network
    |                    |                 |                   |                   |
    |-- broadcast(msg) ->|                 |                   |                   |
    |                    |-- Command:: --->|                   |                   |
    |                    |   Broadcast     |                   |                   |
    |                    |                 |-- Broadcast ----->|                   |
    |                    |                 |                   |-- eager_push ---->|
    |                    |                 |                   |   (Gossip msgs)   |
    |                    |                 |                   |-- lazy_push ----->|
    |                    |                 |                   |   (IHave msgs)    |
    |                    |                 |                   |                   |
    | (other peer receives Gossip)         |                   |                   |
    |                    |                 |                   |<-- RecvMessage ---|
    |                    |                 |<-- InEvent -------|                   |
    |                    |                 |                   |   (validates ID)  |
    |                    |                 |                   |   (forwards)      |
    |<-- Received(msg) --|<-- EmitEvent ---|                   |                   |
```

### Receiving and Processing IHave/Graft

```
Time →

 Peer A                  Our Node                     Peer B
    |                        |                           |
    |-- IHave(id, round) --->|                           |
    |                        | Schedule graft_timeout_1  |
    |                        | (wait for eager push)     |
    |                        |                           |
    | [timeout expires]      |                           |
    |                        |-- Graft(id, round) ------>| (Peer B sent IHave)
    |                        |                           |
    |                        |<-- Gossip(content) -------| (Peer B replies)
    |                        |                           |
    |                        |-- Prune ----------------->| (maybe, if optimization)
```

### HyParView Join Flow

```
New Node           Contact Node        Active Peers of Contact
    |                    |                        |
    |-- Join(me_data) -->|                        |
    |                    |-- add_active(new)      |
    |                    |-- Neighbor(High) ----->| (to new node)
    |                    |-- ForwardJoin -------->| (to each active peer)
    |                    |                        |-- add_active or add_passive
    |                    |                        |-- Neighbor(Low/High) -> (to new node)
    |                    |                        |-- ForwardJoin -> (random peer)
    |                    |                        |
    |<-- Neighbor(High) -|                        |
    |<-- Neighbor(Low/High) ----------------------|
    |                    |                        |
```

### Shuffle Periodic Operation

```
 Node A                 Node B                  Random Node
    |                      |                         |
    |-- Shuffle ---------->|                         |
    |   (origin=A, nodes,  |                         |
    |    TTL=6)            |                         |
    |                      |-- Shuffle ------------->|
    |                      |   (origin=A, nodes,     |
    |                      |    TTL=5)               |
    |                      |                         |-- ...
    |                      |                         |-- (TTL reaches 0)
    |                      |                         |
    |<-- ShuffleReply -----|<-- ShuffleReply --------|
    |   (random nodes)     |   (random nodes)        |
    |                      |                         |
    |-- add_passive(nodes from reply)                |
```

## RPC Support (Optional Feature)

When the `rpc` feature is enabled, `GossipApi` can also operate remotely:

```rust
// Server side
gossip.listen(rpc_endpoint).await;

// Client side
let api = GossipApi::connect(rpc_endpoint, addr);
let topic = api.subscribe_and_join(topic_id, bootstrap).await?;
```

This uses the `irpc`/`noq` crates for bidirectional streaming RPC. The `Join` request establishes a bidirectional stream:
- Client → Server: `Command` messages (Broadcast, BroadcastNeighbors, JoinPeers)
- Server → Client: `Event` messages (NeighborUp, NeighborDown, Received, Lagged)

## Channel Architecture

```
                 ┌─────────────────────────────────────────────────┐
                 │                      Actor                      │
                 │                                                 │
RPC/Local ──────►│  rpc_rx ◄───────────────────────────────────────│
Commands         │  local_rx ◄── HandleConnection, Shutdown        │
                 │                                                 │
                 │  in_event_tx ──► in_event_rx ───────────────────│──► proto::State::handle()
                 │                                                 │      │
                 │  ◄── OutEvent ──────────────────────────────────│◄──── │
                 │  │                                              │
                 │  ├──► SendMessage    ──► peer.send_tx           │
                 │  ├──► EmitEvent      ──► topic.event_sender     │
                 │  ├──► ScheduleTimer  ──► timers                 │
                 │  ├──► DisconnectPeer ──► drop peer              │
                 │  └──► PeerData       ──► address_lookup         │
                 │                                                 │
                 │  topic.event_sender ──► broadcast channel ──────│──► GossipReceiver
                 │                                                 │
                 │  command_rx ◄─── per-topic command streams ─────│◄── GossipSender
                 │                                                 │
                 └─────────────────────────────────────────────────┘
```

## Configuration Defaults Summary

| Parameter | Default | Source |
|-----------|---------|--------|
| Active view capacity | 5 | HyParView paper (p9) |
| Passive view capacity | 30 | HyParView paper (p9) |
| Active random walk length | 6 | HyParView paper (p9) |
| Passive random walk length | 3 | HyParView paper (p9) |
| Shuffle random walk length | 6 | HyParView paper (p9) |
| Shuffle active view count | 3 | HyParView paper (p9) |
| Shuffle passive view count | 4 | HyParView paper (p9) |
| Shuffle interval | 60s | Implementation choice |
| Neighbor request timeout | 500ms | Implementation choice |
| Graft timeout 1 | 80ms | Implementation choice |
| Graft timeout 2 | 40ms | Implementation choice |
| Dispatch timeout | 5ms | Implementation choice |
| Optimization threshold | 7 hops | PlumTree paper (p12) |
| Message cache retention | 30s | Implementation choice |
| Message ID retention | 90s | Implementation choice |
| Cache evict interval | 1s | Implementation choice |
| Max message size | 4096 bytes | Implementation choice |
| Send queue capacity | 64 messages | Implementation choice |
| To-actor channel capacity | 64 messages | Implementation choice |
| In-event channel capacity | 1024 messages | Implementation choice |
| Topic event channel capacity | 256 events | Implementation choice |
| Topic events default capacity | 2048 events | Implementation choice |
| Topic commands channel capacity | 64 commands | Implementation choice |

@@ -0,0 +1,176 @@
# iroh-gossip: Utility Data Structures & Wire Format

## IndexSet (`proto::util::IndexSet`)

A wrapper around `indexmap::IndexSet` that provides random selection capabilities needed by HyParView:

```rust
pub(crate) struct IndexSet<T> {
    inner: indexmap::IndexSet<T>,
}
```

### Key Operations

| Method | Purpose |
|--------|---------|
| `insert(value)` | Add element (returns false if already present) |
| `remove(value)` | Remove by value (swap-remove, O(1)) |
| `remove_index(index)` | Remove by index (swap-remove) |
| `remove_random(rng)` | Remove a random element |
| `pick_random(rng)` | Get reference to random element |
| `pick_random_without(exclude, rng)` | Random element excluding certain elements |
| `pick_random_index(rng)` | Random index |
| `shuffled(rng)` | All elements in random order |
| `shuffled_and_capped(len, rng)` | First `len` elements after shuffle |
| `shuffled_without(exclude, rng)` | Random order excluding certain elements |
| `shuffled_without_and_capped(exclude, len, rng)` | Capped shuffle excluding elements |
| `iter_without(value)` | Iterator skipping a specific element |

These operations are critical for HyParView's random walks, shuffle exchanges, and passive view management.

## TimerMap (`proto::util::TimerMap`)

A priority queue of timer entries sorted by `Instant`, with stable ordering via a sequence counter:

```rust
pub struct TimerMap<T> {
    heap: BinaryHeap<TimerMapEntry<T>>,
    seq: u64,
}
```

Used by the protocol state machine for scheduling future events (shuffles, graft timeouts, cache eviction). The networking layer wraps this in an async-friendly `Timers` type that can `wait_next()`.

### Key Operations

| Method | Purpose |
|--------|---------|
| `insert(instant, item)` | Schedule a timer |
| `pop_before(limit)` | Pop the earliest entry if it's before `limit` |
| `drain_until(from)` | Drain all entries up to a time |
| `first()` | Get reference to earliest entry |

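The combination of a binary heap and a sequence counter can be sketched with the standard library alone. The code below is an illustration of the technique, not the crate's `TimerMap`: `TimerQueue`, `Entry`, and `pop_due` are hypothetical names, and treating an entry due exactly at `limit` as expired is a choice made for this sketch.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;
use std::time::Instant;

/// Heap entry ordered by (deadline, insertion sequence); the payload is ignored.
struct Entry<T> {
    at: Instant,
    seq: u64,
    item: T,
}

impl<T> PartialEq for Entry<T> {
    fn eq(&self, other: &Self) -> bool {
        self.at == other.at && self.seq == other.seq
    }
}
impl<T> Eq for Entry<T> {}
impl<T> PartialOrd for Entry<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<T> Ord for Entry<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.at, self.seq).cmp(&(other.at, other.seq))
    }
}

/// Hypothetical timer queue: earliest deadline first, FIFO among equal deadlines.
pub struct TimerQueue<T> {
    // `Reverse` turns std's max-heap into a min-heap.
    heap: BinaryHeap<Reverse<Entry<T>>>,
    seq: u64,
}

impl<T> TimerQueue<T> {
    pub fn new() -> Self {
        Self { heap: BinaryHeap::new(), seq: 0 }
    }

    /// Schedule `item` to fire at `at`.
    pub fn insert(&mut self, at: Instant, item: T) {
        let seq = self.seq;
        self.seq += 1;
        self.heap.push(Reverse(Entry { at, seq, item }));
    }

    /// Pop the earliest entry if its deadline is at or before `limit`.
    pub fn pop_due(&mut self, limit: Instant) -> Option<(Instant, T)> {
        match self.heap.peek() {
            Some(Reverse(entry)) if entry.at <= limit => {}
            _ => return None,
        }
        self.heap.pop().map(|Reverse(e)| (e.at, e.item))
    }
}
```

Without the sequence counter, two timers with the same `Instant` could pop in either order, because `BinaryHeap` makes no stability guarantee.
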
## TimeBoundCache (`proto::util::TimeBoundCache`)

A `HashMap` where entries expire after a specified `Instant`:

```rust
pub struct TimeBoundCache<K, V> {
    map: HashMap<K, (Instant, V)>,
    expiry: TimerMap<K>,
}
```

Used by PlumTree for:
- `received_messages: TimeBoundCache<MessageId, ()>`: deduplication
- `cache: TimeBoundCache<MessageId, Gossip>`: message payload storage for Graft replies

### Key Operations

| Method | Purpose |
|--------|---------|
| `insert(key, value, expires)` | Insert with expiration |
| `contains_key(key)` | Check existence |
| `get(key)` | Get value |
| `expires(key)` | Get expiration time |
| `expire_until(instant)` | Remove all expired entries, returns count |
| `len()` / `is_empty()` | Size queries |

The `expire_until` method handles re-insertions correctly: if a key is re-inserted with a later expiration time while an older entry for it is still in the expiry queue, the stale queue entry is ignored when it comes due and the key stays in the map until its new expiration.

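The re-insertion behavior can be made concrete with a small sketch. This is not the crate's `TimeBoundCache`: `ExpiringMap` is a hypothetical name, and the expiry queue here is a plain `BinaryHeap` of `(Instant, K)` pairs (which requires `K: Ord`) instead of the `TimerMap` used in the struct above.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};
use std::hash::Hash;
use std::time::Instant;

/// Hypothetical expiring map illustrating the stale-entry check.
pub struct ExpiringMap<K, V> {
    map: HashMap<K, (Instant, V)>,
    // Min-heap of (deadline, key); may contain stale entries after re-insertion.
    expiry: BinaryHeap<Reverse<(Instant, K)>>,
}

impl<K: Hash + Eq + Ord + Clone, V> ExpiringMap<K, V> {
    pub fn new() -> Self {
        Self { map: HashMap::new(), expiry: BinaryHeap::new() }
    }

    /// Insert or overwrite `key`. Older queue entries for the key are left
    /// in place and filtered out lazily in `expire_until`.
    pub fn insert(&mut self, key: K, value: V, expires: Instant) {
        self.map.insert(key.clone(), (expires, value));
        self.expiry.push(Reverse((expires, key)));
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key).map(|(_, value)| value)
    }

    /// Remove every entry whose current deadline is at or before `now`.
    /// Returns the number of entries actually removed.
    pub fn expire_until(&mut self, now: Instant) -> usize {
        let mut removed = 0;
        loop {
            let due = matches!(self.expiry.peek(), Some(Reverse((at, _))) if *at <= now);
            if !due {
                break;
            }
            let Some(Reverse((_, key))) = self.expiry.pop() else {
                break;
            };
            // Check the live deadline: a later re-insert makes this queue entry stale.
            if matches!(self.map.get(&key), Some((expires, _)) if *expires <= now) {
                self.map.remove(&key);
                removed += 1;
            }
        }
        removed
    }
}
```

The lazy check keeps `insert` cheap, since the queue never has to be searched for an older entry; the cost is that stale entries occupy the queue until their original deadline passes.
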
## Wire Format

### Frame Encoding

Messages are encoded using `postcard` (a `no_std`-friendly, `serde`-compatible format) and sent as length-prefixed frames:

```
┌──────────────┬──────────────┬─────────────────┐
│ Length (u32) │ TopicHeader  │ Message Payload │
│ big-endian   │ postcard     │ postcard        │
└──────────────┴──────────────┴─────────────────┘
```

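The framing step on its own can be sketched with the standard library. The example below covers only the big-endian `u32` length prefix shown in the diagram and treats everything after it as opaque bytes, so no `postcard` encoding is involved. `write_frame`, `read_frame`, and `FrameError` are hypothetical names for this sketch; `FrameError::TooLarge` is modelled loosely on the size check this document attributes to the send loop.

```rust
#[derive(Debug, PartialEq)]
pub enum FrameError {
    /// Payload exceeds the configured maximum (or cannot fit a u32 length).
    TooLarge,
    /// Input ended before a complete frame was available.
    Truncated,
}

/// Append one length-prefixed frame to `out`.
pub fn write_frame(
    out: &mut Vec<u8>,
    payload: &[u8],
    max_message_size: usize,
) -> Result<(), FrameError> {
    if payload.len() > max_message_size || payload.len() > u32::MAX as usize {
        return Err(FrameError::TooLarge);
    }
    let len = payload.len() as u32;
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    Ok(())
}

/// Split one frame off the front of `input`.
/// Returns the frame payload and the remaining bytes.
pub fn read_frame(input: &[u8]) -> Result<(&[u8], &[u8]), FrameError> {
    if input.len() < 4 {
        return Err(FrameError::Truncated);
    }
    let len = u32::from_be_bytes([input[0], input[1], input[2], input[3]]) as usize;
    let rest = &input[4..];
    if rest.len() < len {
        return Err(FrameError::Truncated);
    }
    Ok(rest.split_at(len))
}
```

Checking the size before writing means an oversized message is rejected without leaving a partial frame in the buffer.
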
### Stream Protocol

Each QUIC unidirectional stream is dedicated to a single topic. The stream begins with a `StreamHeader`:

```rust
pub(crate) struct StreamHeader {
    pub(crate) topic_id: TopicId,
}
```

All subsequent frames on that stream carry messages for that topic. When a `Disconnect` message is sent, the stream is closed (via `finish()`).

### Message Types on Wire

```rust
pub enum Message<PI> {
    Swarm(hyparview::Message<PI>), // Membership messages
    Gossip(plumtree::Message),     // Broadcast messages
}
```

Where `PI` is `PublicKey` (32-byte ed25519 public key) in the networking layer.

The `MessageKind` classification is used for metrics:

| Kind | Message Types |
|------|--------------|
| `Data` | `Gossip` messages (actual content) |
| `Control` | All Swarm messages, plus `Prune`, `Graft`, `IHave` |

### Message Size Limits

- Default max message size: 4096 bytes (minimum: 512)
- The header size is computed via `postcard::experimental::serialized_size`
- Actual payload capacity = `max_message_size - header_size`

The `SendLoop` checks message size before writing and returns `WriteError::TooLarge` if exceeded.

## PeerData & Address Propagation

The `PeerData` type is an opaque `Bytes` wrapper used in HyParView messages. In the `net` layer, it carries addressing information:

```rust
struct AddrInfo {
    relay_url: Option<RelayUrl>,
    direct_addresses: BTreeSet<SocketAddr>,
}
```

This is serialized with `postcard` and passed as `PeerData` in `Join`, `ForwardJoin`, and `Neighbor` messages. When received, the `AddrInfo` is decoded and fed into `GossipAddressLookup`, which implements iroh's `AddressLookup` trait, allowing gossip-discovered addresses to be used for future connections.

## GossipAddressLookup

```rust
pub(crate) struct GossipAddressLookup {
    endpoints: NodeMap,                       // Arc<RwLock<BTreeMap<EndpointId, StoredEndpointInfo>>>
    _task_handle: Arc<AbortOnDropHandle<()>>, // Background eviction task
}
```

Key behaviors:
- **Merging**: When adding addresses for an already-known endpoint, new addresses are merged (union of direct addresses, relay URL is overwritten)
- **Expiration**: Entries expire after 5 minutes, with eviction checks every 30 seconds
- **Integration**: Implements `iroh::address_lookup::AddressLookup`, returning data with provenance "gossip"

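The merging rule can be illustrated with a simplified `AddrInfo`. This is a sketch under stated assumptions, not the crate's code: the relay URL is a plain `String` instead of `RelayUrl`, `merge` is a hypothetical method name, and keeping the existing relay URL when the incoming info carries none is an assumption, since the text only says the relay URL is overwritten.

```rust
use std::collections::BTreeSet;
use std::net::SocketAddr;

/// Simplified address record (relay URL as `String` for illustration).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct AddrInfo {
    pub relay_url: Option<String>,
    pub direct_addresses: BTreeSet<SocketAddr>,
}

impl AddrInfo {
    /// Merge newly received info into the stored record.
    pub fn merge(&mut self, incoming: AddrInfo) {
        // Direct addresses: set union, so previously learned addresses survive.
        self.direct_addresses.extend(incoming.direct_addresses);
        // Relay URL: the newest value wins.
        // Assumption: an absent incoming URL does not clear the stored one.
        if incoming.relay_url.is_some() {
            self.relay_url = incoming.relay_url;
        }
    }
}
```

Taking the union keeps older direct addresses available as dial candidates until the whole entry expires.
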
## Dialer

```rust
struct Dialer {
    endpoint: Endpoint,
    pending: JoinSet<(EndpointId, Option<Result<Connection, ConnectError>>)>,
    pending_dials: HashMap<EndpointId, CancellationToken>,
}
```

The `Dialer` manages outgoing connection attempts:
- Queues a dial via `queue_dial(endpoint_id, alpn)`
- Checks for pending dials to avoid duplicate connections
- Supports cancellation of in-progress dials
- Returns completed connections via `next_conn()`

When a dial succeeds, the connection is passed to `handle_connection()`. When a dial fails and the peer is not already active, a `PeerDisconnected` event is injected into the protocol state.

@@ -0,0 +1,169 @@
|
||||
# iroh-gossip: Testing & Simulation
|
||||
|
||||
## Test Infrastructure
|
||||
|
||||
The crate includes two layers of testing:
|
||||
|
||||
### 1. Unit Tests (in source files)
|
||||
|
||||
Unit tests are embedded in each module file behind `#[cfg(test)]`:
|
||||
|
||||
| Module | Tests |
|
||||
|--------|-------|
|
||||
| `proto/hyparview.rs` | Not listed in this reference (tests live in the file itself) |
|
||||
| `proto/plumtree.rs` | `optimize_tree`, `spoofed_messages_are_ignored`, `cache_is_evicted` |
|
||||
| `proto.rs` | `hyparview_smoke`, `plumtree_smoke`, `quit` |
|
||||
| `net.rs` | `gossip_net_smoke`, `subscription_cleanup` |
|
||||
| `api.rs` | `test_rpc`, `ensure_gossip_topic_is_sync` |
|
||||
| `proto/util.rs` | `indexset`, `timer_map`, `hex`, `time_bound_cache` |
|
||||
|
||||
### 2. Protocol Simulator (`proto::sim`)
|
||||
|
||||
The `sim` module (behind `test-utils` feature) provides a deterministic network simulator:
|
||||
|
||||
```rust
|
||||
// Available when feature = "test-utils"
|
||||
pub mod sim;
|
||||
```
|
||||
|
||||
This allows testing the protocol logic without any real networking, using seeded RNG for reproducibility.
|
||||
|
||||
The simulator creates a `Network` of virtual nodes, each running its own `proto::State`. Events are processed in discrete "trips" (round-trips), allowing controlled testing of protocol behavior.
|
||||
|
||||
### 3. Simulation Binary (`simulator` feature)
|
||||
|
||||
The crate includes a CLI simulator (behind `simulator` feature) that can run large-scale simulations:
|
||||
|
||||
```
|
||||
cargo run --bin sim --features simulator
|
||||
```
|
||||
|
||||
This uses `rayon` for parallel execution and `comfy-table` for result output.
|
||||
|
||||
### 4. Integration Tests (`tests/sim.rs`)
|
||||
|
||||
These tests sit behind the `test-utils` feature and provide end-to-end protocol testing.
|
||||
|
||||
## Key Test Patterns
|
||||
|
||||
### Protocol-Level Smoke Test
|
||||
|
||||
From `proto.rs`:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn hyparview_smoke() {
|
||||
let rng = ChaCha12Rng::seed_from_u64(0);
|
||||
let mut config = Config::default();
|
||||
config.membership.active_view_capacity = 2;
|
||||
let mut network = Network::new(config.into(), rng);
|
||||
for i in 0..4 { network.insert(i); }
|
||||
let t: TopicId = [0u8; 32].into();
|
||||
|
||||
// Join nodes
|
||||
network.command(0, t, Command::Join(vec![1, 2]));
|
||||
network.command(1, t, Command::Join(vec![2]));
|
||||
network.command(2, t, Command::Join(vec![]));
|
||||
network.run_trips(3);
|
||||
|
||||
// Verify events and connections
|
||||
assert_eq!(network.events_sorted(), expected);
|
||||
assert_eq!(network.conns(), vec![(0, 1), (0, 2), (1, 2)]);
|
||||
assert!(network.check_synchronicity());
|
||||
}
|
||||
```
|
||||
|
||||
### PlumTree Optimization Test
|
||||
|
||||
From `plumtree.rs`:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn optimize_tree() {
|
||||
// When an IHave message arrives with fewer hops than the Gossip message,
|
||||
// and the difference exceeds optimization_threshold, the tree is restructured:
|
||||
// - The IHave sender is promoted to eager (Graft)
|
||||
// - The Gossip sender is demoted to lazy (Prune)
|
||||
}
|
||||
```
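The decision described in those comments can be written as a small pure function. This is a sketch under assumptions: `TreeAction` and `optimize_decision` are illustrative names, hop counts are plain `u16` values, and "exceeds" is read as strictly greater than the threshold.

```rust
/// Actions a node might take after comparing hop counts (names are illustrative).
#[derive(Debug, PartialEq, Eq)]
enum TreeAction {
    /// Keep the current eager/lazy assignment.
    Keep,
    /// Promote the IHave sender to eager (Graft) and demote the Gossip sender to lazy (Prune).
    GraftAndPrune,
}

/// Compare the hop count of the eagerly received Gossip message with the
/// hop count advertised in the IHave for the same message.
fn optimize_decision(gossip_hops: u16, ihave_hops: u16, optimization_threshold: u16) -> TreeAction {
    // saturating_sub yields 0 when the IHave path is not shorter.
    let gain = gossip_hops.saturating_sub(ihave_hops);
    if gain > optimization_threshold {
        TreeAction::GraftAndPrune
    } else {
        TreeAction::Keep
    }
}
```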
|
||||
|
||||
### Spoofed Message Test
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn spoofed_messages_are_ignored() {
|
||||
// Messages where MessageId != blake3(content) are silently discarded
|
||||
let message = Message::Gossip(Gossip {
|
||||
content: content.clone(),
|
||||
id: MessageId::from_content(b"wrong_content"), // Spoofed!
|
||||
scope: DeliveryScope::Swarm(Round(1)),
|
||||
});
|
||||
state.handle(InEvent::RecvMessage(2, message), now, &mut io);
|
||||
// No events are emitted
|
||||
}
|
||||
```
|
||||
|
||||
### Networking Smoke Test
|
||||
|
||||
From `net.rs`:
|
||||
|
||||
```rust
|
||||
#[tokio::test]
|
||||
async fn gossip_net_smoke() {
|
||||
// Creates 3 endpoints with a relay server
|
||||
// Subscribes and joins a topic
|
||||
// Broadcasts messages and verifies reception
|
||||
// Uses real QUIC connections via iroh
|
||||
}
|
||||
```
|
||||
|
||||
## Metrics
|
||||
|
||||
The `Metrics` struct (in `src/metrics.rs`) uses `iroh_metrics::MetricsGroup`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Default, MetricsGroup)]
|
||||
#[metrics(name = "gossip")]
|
||||
pub struct Metrics {
|
||||
pub msgs_ctrl_sent: Counter,
|
||||
pub msgs_ctrl_recv: Counter,
|
||||
pub msgs_data_sent: Counter,
|
||||
pub msgs_data_recv: Counter,
|
||||
pub msgs_data_sent_size: Counter,
|
||||
pub msgs_data_recv_size: Counter,
|
||||
pub msgs_ctrl_sent_size: Counter,
|
||||
pub msgs_ctrl_recv_size: Counter,
|
||||
pub neighbor_up: Counter,
|
||||
pub neighbor_down: Counter,
|
||||
pub actor_tick_main: Counter,
|
||||
pub actor_tick_rx: Counter,
|
||||
pub actor_tick_endpoint: Counter,
|
||||
pub actor_tick_dialer: Counter,
|
||||
pub actor_tick_dialer_success: Counter,
|
||||
pub actor_tick_dialer_failure: Counter,
|
||||
pub actor_tick_in_event_rx: Counter,
|
||||
pub actor_tick_timers: Counter,
|
||||
}
|
||||
```
|
||||
|
||||
These are tracked both in the protocol state machine (for message counts) and in the actor event loop (for tick-level diagnostics). When the `metrics` feature is enabled, they are exported via Prometheus-compatible endpoints.
|
||||
|
||||
## References
|
||||
|
||||
### Academic Papers
|
||||
|
||||
- **HyParView**: Leitao, J., Pereira, J., & Rodrigues, L. (2007). "HyParView: A Membership Protocol for Reliable Gossip Multicast with Dense Coverage." [PDF](https://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf)
|
||||
- **PlumTree**: Leitao, J., Pereira, J., & Rodrigues, L. (2007). "Epidemic Broadcast Trees." [PDF](https://asc.di.fct.unl.pt/~jleitao/pdf/srds07-leitao.pdf)
|
||||
|
||||
### Implementation Reference
|
||||
|
||||
- Bartosz Sypytkowski's example implementation: [gist](https://gist.github.com/Horusiath/84fac596101b197da0546d1697580d99)
|
||||
|
||||
### Related Projects
|
||||
|
||||
- [iroh](https://docs.rs/iroh) — The networking library that iroh-gossip integrates with
|
||||
- [Earthstar](https://github.com/earthstar-project/earthstar) — Another PlumTree implementation referenced in code comments
|
||||
|
||||
### Crate Repository
|
||||
|
||||
- [github.com/n0-computer/iroh-gossip](https://github.com/n0-computer/iroh-gossip)
|
||||
40
docs/research/references/iroh/iroh-gossip/README.md
Normal file
@@ -0,0 +1,40 @@
|
||||
# iroh-gossip Reference Documentation
|
||||
|
||||
This directory contains a deep-dive reference on how the `iroh-gossip` crate works, based on source code analysis of the repository at `/workspace/iroh-gossip`.
|
||||
|
||||
## Documents
|
||||
|
||||
| # | File | Topic |
|
||||
|---|------|-------|
|
||||
| 01 | [Overview & Architecture](01-overview-architecture.md) | Crate structure, module organization, design principles, features, dependencies |
|
||||
| 02 | [HyParView Membership](02-hyparview-membership.md) | Swarm membership protocol: active/passive views, join procedure, shuffle mechanism, failure recovery, PeerData |
|
||||
| 03 | [PlumTree Broadcast](03-plumtree-broadcast.md) | Epidemic broadcast trees: eager/lazy push, Graft/IHave/Prune, tree optimization, message deduplication, cache management |
|
||||
| 04 | [State & Topic Coordination](04-state-and-topic.md) | Multi-topic state management, topic lifecycle, event routing between HyParView and PlumTree |
|
||||
| 05 | [Net Actor & Networking](05-net-actor.md) | Actor model, event loop, connection management, Dialer, wire protocol, address lookup, topic state in the net layer |
|
||||
| 06 | [API & Data Flow](06-api-data-flow.md) | Public API types, subscription model, event/command flow, channel architecture, configuration defaults |
|
||||
| 07 | [Utilities & Wire Format](07-utilities-wire-format.md) | IndexSet, TimerMap, TimeBoundCache, serialization, PeerData/AddrInfo, Dialer internals |
|
||||
| 08 | [Testing & Metrics](08-testing-metrics-refs.md) | Test infrastructure, simulation, key test patterns, metrics, references |
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Version
|
||||
`iroh-gossip` v0.97.0
|
||||
|
||||
### ALPN
|
||||
`/iroh-gossip/1`
|
||||
|
||||
### Core Protocols
|
||||
- **HyParView**: Hybrid partial view membership (active view = 5, passive view = 30 by default)
|
||||
- **PlumTree**: Epidemic broadcast trees (eager + lazy push with Graft/IHave optimization)
|
||||
|
||||
### Key Abstractions
|
||||
- **TopicId**: 32-byte identifier for a topic/swarm
|
||||
- **PeerIdentity**: Generic trait (instantiated as `PublicKey` in the net layer)
|
||||
- **PeerData**: Opaque bytes exchanged on join (carries `AddrInfo` in net layer)
|
||||
- **IO trait**: Interface for protocol output events (pure state machine, no IO)
|
||||
|
||||
### Wire Format
|
||||
- Postcard (serde) encoding over QUIC unidirectional streams
|
||||
- Length-prefixed frames (u32 length + postcard payload)
|
||||
- Stream header with TopicId
|
||||
- Max message size: 4096 bytes (configurable, minimum 512)
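A length-prefixed frame of this shape can be sketched with `std::io` alone. The byte order of the `u32` prefix (big-endian here), the helper names, and applying the 4096-byte limit to the payload are assumptions for illustration. The postcard encoding of the payload itself is left out.

```rust
use std::io::{self, Read, Write};

/// Default limit from the list above; treated here as a payload limit (assumption).
const MAX_MESSAGE_SIZE: usize = 4096;

/// Write one frame: u32 length prefix (big-endian, an assumption) followed by the payload.
fn write_frame<W: Write>(mut w: W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload exceeds max message size",
        ));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting oversized lengths before allocating.
fn read_frame<R: Read>(mut r: R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "frame exceeds max message size",
        ));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the length before allocating keeps a misbehaving peer from forcing a large allocation with a single four-byte header.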
|
||||
@@ -0,0 +1,104 @@
|
||||
# iroh-live: Overview and Architecture
|
||||
|
||||
## What It Is
|
||||
|
||||
iroh-live is a real-time audio/video streaming system built on top of [iroh](https://github.com/n0-computer/iroh) (QUIC-based P2P networking) and [Media over QUIC (MoQ)](https://moq.dev/). It handles the full pipeline: camera/mic capture → encoding → transport → decoding → rendering. Connections are peer-to-peer by default, with an optional relay server for browser access via WebTransport.
|
||||
|
||||
**Status:** Early tech preview. APIs are unstable. Windows support is missing. Audio-video sync is basic.
|
||||
|
||||
## Workspace Crates
|
||||
|
||||
| Crate | Description |
|
||||
|-------|-------------|
|
||||
| `iroh-live` | High-level API: `Live`, `Call`, `Room`, tickets, subscriptions |
|
||||
| `iroh-moq` | MoQ transport layer over iroh/QUIC via `web-transport-iroh` |
|
||||
| `iroh-live-relay` | Relay server bridging iroh P2P to browser WebTransport |
|
||||
| `moq-media` | Media pipelines: capture, encode, decode, publish, subscribe, adaptive bitrate. No iroh dependency |
|
||||
| `rusty-codecs` | Codec implementations (H264/openh264, AV1/rav1e + rav1d, Opus), hardware accel (VAAPI, V4L2, VideoToolbox) |
|
||||
| `rusty-capture` | Cross-platform capture: PipeWire, V4L2, X11, ScreenCaptureKit, AVFoundation |
|
||||
| `moq-media-egui` | egui integration for video rendering |
|
||||
| `moq-media-dioxus` | dioxus-native integration for video rendering |
|
||||
| `moq-media-android` | Android camera, EGL rendering, JNI bridge |
|
||||
| `iroh-live-cli` | CLI tool (`irl`) for publishing, playing, calls, rooms, relay |
|
||||
|
||||
## Layer Architecture
|
||||
|
||||
Three distinct layers, each usable independently:
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────┐
|
||||
│ iroh-live │
|
||||
│ Session management, tickets, rooms, calls │
|
||||
│ Re-exports: moq-media, iroh-moq │
|
||||
├──────────────────────────────────────────────────────────┤
|
||||
│ moq-media │
|
||||
│ Media pipelines: LocalBroadcast, RemoteBroadcast, │
|
||||
│ codecs, adaptive bitrate, playout │
|
||||
│ NO iroh dependency (transport-agnostic) │
|
||||
├──────────────────────────────────────────────────────────┤
|
||||
│ iroh-moq │
|
||||
│ MoQ session management, publish/subscribe over QUIC │
|
||||
│ Uses web-transport-iroh + moq-lite │
|
||||
└──────────────────────────────────────────────────────────┘
|
||||
|
||||
Below moq-media:
|
||||
rusty-codecs ─ codec implementations, hardware accel, wgpu rendering
|
||||
rusty-capture ─ platform-specific screen/camera capture
|
||||
```
|
||||
|
||||
## Design Principles
|
||||
|
||||
1. **`&self` everywhere** — All public types use interior mutability. Safe to share across async tasks/threads without wrappers.
|
||||
2. **Drop-based cleanup** — Dropping a `Call` closes it. Dropping `LocalBroadcast` tears down encoders. Dropping `VideoTrack` stops its decoder thread.
|
||||
3. **Watcher for continuous state, Stream for discrete events** — Connection quality and catalog contents use `n0_watcher::Direct<T>`. Participant joins use `impl Stream`.
|
||||
4. **Declarative intent, not mechanism** — `VideoTarget::default().max_pixels(1280*720)` describes what quality you need. The catalog selects the best rendition.
|
||||
5. **moq-media is standalone** — A recording pipeline can use `LocalBroadcast`/`RemoteBroadcast` without iroh-live. The transport boundary is the `PacketSink`/`PacketSource` trait pair.
|
||||
|
||||
## Data Flow (End-to-End)
|
||||
|
||||
```
|
||||
Publisher Side:
|
||||
capture source (rusty-capture, VideoSource trait)
|
||||
│
|
||||
▼
|
||||
encoder pipeline (moq-media, dedicated OS thread)
|
||||
│
|
||||
▼ EncodedFrame
|
||||
PacketSink (MoqPacketSink — starts new MoQ group on keyframe)
|
||||
│
|
||||
▼ MoQ transport (iroh-moq, QUIC streams)
|
||||
|
||||
Subscriber Side:
|
||||
PacketSource (MoqPacketSource — reads ordered frames from MoQ)
|
||||
│
|
||||
▼ MediaPacket
|
||||
decoder pipeline (moq-media, dedicated OS thread)
|
||||
│
|
||||
▼ VideoFrame
|
||||
FramePacer (PTS-based sleep) or Sync (shared playout clock)
|
||||
│
|
||||
▼
|
||||
renderer (wgpu texture upload or egui widget)
|
||||
```
|
||||
|
||||
Encoder and decoder pipelines run on **dedicated OS threads**, not tokio tasks, so slow codec operations never block the async runtime. The `forward_packets` async task bridges the network-side `PacketSource` into an mpsc channel that the decoder thread reads synchronously.
|
||||
|
||||
## Key Dependencies
|
||||
|
||||
| Dependency | Purpose |
|
||||
|------------|---------|
|
||||
| `iroh` | QUIC endpoint, connection management, P2P connectivity |
|
||||
| `iroh-gossip` | Gossip protocol for room participant discovery |
|
||||
| `iroh-tickets` | Ticket serialization for `RoomTicket` |
|
||||
| `iroh-smol-kv` | Distributed KV store for room state (gossip-backed) |
|
||||
| `moq-lite` | Core MoQ protocol: BroadcastProducer, BroadcastConsumer, Track, Group |
|
||||
| `hang` | Catalog management for broadcast metadata |
|
||||
| `moq-mux` | MoQ multiplexing |
|
||||
| `moq-relay` | Relay server implementation (used by iroh-live-relay) |
|
||||
| `web-transport-iroh` | WebTransport over iroh QUIC connections |
|
||||
| `n0-future` | Async utilities (FuturesUnordered, AbortOnDropHandle) |
|
||||
| `n0-watcher` | Watchable/Direct reactive state |
|
||||
|
||||
## License
|
||||
|
||||
Dual-licensed: MIT OR Apache-2.0. Copyright 2025 N0, INC.
|
||||
167
docs/research/references/iroh/iroh-live/02-core-api.md
Normal file
@@ -0,0 +1,167 @@
|
||||
# iroh-live: Core API — Live, Call, Subscription, Ticket
|
||||
|
||||
## `Live` — Entry Point
|
||||
|
||||
The primary entry point for all iroh-live operations. Manages an iroh `Endpoint`, the MoQ transport (`Moq`), and optionally a `Gossip` instance for rooms.
|
||||
|
||||
### Construction
|
||||
|
||||
```rust
|
||||
// Simple: from environment, accept incoming connections
|
||||
let live = Live::from_env().await?.with_router().spawn();
|
||||
|
||||
// With gossip for rooms
|
||||
let live = Live::from_env().await?.with_router().with_gossip().spawn();
|
||||
|
||||
// From an existing endpoint
|
||||
let live = Live::builder(endpoint).with_router().with_gossip().spawn();
|
||||
|
||||
// Manual router mounting (when you have other protocols)
|
||||
let router = live.register_protocols(Router::builder(endpoint));
|
||||
let router = router.accept(other_protocol, other_handler);
|
||||
let router = router.spawn();
|
||||
```
|
||||
|
||||
### Key Methods
|
||||
|
||||
| Method | Description |
|
||||
|--------|-------------|
|
||||
| `publish(name, &LocalBroadcast)` | Register a broadcast for all connected peers |
|
||||
| `subscribe(remote, name)` | Connect to a peer and subscribe to a broadcast → `Subscription` |
|
||||
| `subscribe_media(remote, name, audio, config)` | Connect, subscribe, decode → `(MoqSession, MediaTracks)` |
|
||||
| `join_room(ticket)` | Join a gossip-based multi-party room → `Room` |
|
||||
| `endpoint()` | Access the underlying iroh `Endpoint` |
|
||||
| `transport()` | Access the `Moq` transport for advanced operations |
|
||||
| `gossip()` | Access the `Gossip` instance (if enabled) |
|
||||
| `shutdown()` | Close all sessions, stop router, close endpoint |
|
||||
|
||||
### Builder Options
|
||||
|
||||
- **`with_router()`** — Spawns an internal `Router` so the endpoint accepts incoming MoQ sessions. Without this, only outbound connections work.
|
||||
- **`with_gossip()`** — Creates a `Gossip` instance (required for rooms). Internally mounts on the Router if `with_router` is also set.
|
||||
- **`gossip(gossip)`** — Use an externally-managed `Gossip` instance.
|
||||
|
||||
### Internal Architecture
|
||||
|
||||
`Live` holds:
|
||||
- `endpoint: Endpoint` — iroh QUIC endpoint
|
||||
- `moq: Moq` — Internal actor for session/broadcast management
|
||||
- `gossip: Option<Gossip>` — For room coordination
|
||||
- `router: Option<Router>` — For accepting incoming connections
|
||||
|
||||
The `from_env()` method reads `IROH_SECRET` for the secret key and generates one if not set. It uses the `N0` preset for relay and DNS discovery.
|
||||
|
||||
## `LiveTicket` — Connection Sharing
|
||||
|
||||
A serializable ticket that contains everything needed to connect to a publisher.
|
||||
|
||||
```rust
|
||||
// Create a ticket
|
||||
let ticket = LiveTicket::new(endpoint.addr(), "my-stream");
|
||||
|
||||
// Serialize to URI string (fits in QR codes)
|
||||
let s = ticket.to_string();
|
||||
// → "iroh-live:<base64url(postcard(EndpointAddr))>/my-stream"
|
||||
|
||||
// Deserialize
|
||||
let parsed: LiveTicket = s.parse()?;
|
||||
|
||||
// With relay URLs for indirect connectivity
|
||||
let ticket = LiveTicket::new(addr, "stream").with_relay_urls(vec![
|
||||
"https://relay.example.com".to_string(),
|
||||
]);
|
||||
```
|
||||
|
||||
**Format:** `iroh-live:<base64url(postcard(EndpointAddr))>/<name>`
|
||||
|
||||
The parser also supports the legacy `name@base32` format for backward compatibility.
|
||||
|
||||
The ticket string is kept short enough for QR codes (< 2000 bytes). It uses postcard (binary) serialization with base64url encoding.
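To make the URI layout concrete, the following sketch splits a ticket string into its payload and name parts. It does not decode the postcard payload, and `split_live_ticket` is a hypothetical helper, not part of the crate. It assumes unpadded base64url, which never contains `/`, so the first `/` ends the payload.

```rust
/// Split a ticket string of the form `iroh-live:<payload>/<name>`.
/// Returns `(payload, name)` or `None` if the shape does not match.
fn split_live_ticket(s: &str) -> Option<(&str, &str)> {
    let rest = s.strip_prefix("iroh-live:")?;
    // base64url never contains '/', so the first '/' ends the payload.
    let (payload, name) = rest.split_once('/')?;
    if payload.is_empty() || name.is_empty() {
        return None;
    }
    // Unpadded base64url alphabet: A-Z, a-z, 0-9, '-' and '_'.
    let is_base64url = payload
        .bytes()
        .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_');
    is_base64url.then_some((payload, name))
}
```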
|
||||
|
||||
## `Call` — 1:1 Video Call
|
||||
|
||||
A convenience wrapper over MoQ primitives for bidirectional calls.
|
||||
|
||||
### Flow
|
||||
|
||||
1. Each side creates a `LocalBroadcast` with video/audio configured
|
||||
2. **Dialer:** `Call::dial(live, remote_addr, local_broadcast)` — connects, publishes "call" broadcast, subscribes to remote's "call" broadcast
|
||||
3. **Acceptor:** `Call::accept(session, local_broadcast)` — accepts an incoming session, publishes and subscribes
|
||||
|
||||
The broadcast name is always `"call"` — this is hardcoded (`CALL_BROADCAST_NAME`).
|
||||
|
||||
```rust
|
||||
// Dialer side
|
||||
let local = LocalBroadcast::new();
|
||||
local.video().set_source(camera, VideoCodec::H264, [VideoPreset::P720])?;
|
||||
let call = Call::dial(&live, remote_addr, local).await?;
|
||||
|
||||
// Access remote media
|
||||
let remote_broadcast = call.remote();
|
||||
let video = remote_broadcast.video()?;
|
||||
|
||||
// Wait for call to end
|
||||
let reason = call.closed().await;
|
||||
```
|
||||
|
||||
### Key Properties
|
||||
|
||||
- `call.local()` → `&LocalBroadcast` (your media)
|
||||
- `call.remote()` → `&RemoteBroadcast` (peer's media)
|
||||
- `call.signals()` → `watch::Receiver<NetworkSignals>` (for adaptive bitrate)
|
||||
- `call.close()` — closes with error code 0 and reason "call ended"
|
||||
- `call.closed()` → waits for close, returns `DisconnectReason` (LocalClose, RemoteClose, TransportError)
|
||||
|
||||
`Call` auto-wires stats recording and network signal production on the connection.
|
||||
|
||||
## `Subscription` — Subscribe Handle
|
||||
|
||||
Created by `Live::subscribe()`. Wraps the MoQ session, remote broadcast, and network signals into a single handle. The constructor auto-wires stats recording and signal production.
|
||||
|
||||
```rust
|
||||
let sub = live.subscribe(remote_addr, "stream").await?;
|
||||
|
||||
// Access components
|
||||
sub.session() // &MoqSession
|
||||
sub.broadcast() // &RemoteBroadcast
|
||||
sub.signals() // &watch::Receiver<NetworkSignals>
|
||||
|
||||
// Convenience methods
|
||||
let tracks = sub.media(&audio_backend, Default::default()).await?;
|
||||
let tracks = sub.media_with_decoders::<DefaultDecoders>(&audio_backend, config).await?;
|
||||
|
||||
// Decompose
|
||||
let (session, broadcast, signals) = sub.into_parts();
|
||||
```
|
||||
|
||||
## `DisconnectReason`
|
||||
|
||||
```rust
|
||||
pub enum DisconnectReason {
|
||||
LocalClose,
|
||||
RemoteClose,
|
||||
TransportError,
|
||||
}
|
||||
```
|
||||
|
||||
Derived from the QUIC connection's close reason. Used by `Call::closed()`.
|
||||
|
||||
## `util` Module
|
||||
|
||||
### `secret_key_from_env()`
|
||||
|
||||
Loads the iroh secret key from the `IROH_SECRET` environment variable. Generates a new key if not set, printing the hex-encoded key for reuse.
|
||||
|
||||
### `spawn_signal_producer(conn, shutdown)`
|
||||
|
||||
Spawns a background task that polls QUIC connection path stats every 200ms and produces `NetworkSignals` for adaptive rendition selection. Returns a `watch::Receiver<NetworkSignals>`.
|
||||
|
||||
Computes:
|
||||
- **RTT** — from `selected_path.rtt()`
|
||||
- **Loss rate** — delta-based (lost packets / (sent + lost) over the interval)
|
||||
- **Available bandwidth** — estimated from congestion window: `cwnd * 8 / rtt`
|
||||
- **Congestion events** — monotonically increasing counter
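The loss-rate and bandwidth formulas above reduce to two small functions. The function names and the choice of raw counters as inputs are assumptions for this sketch. The real task reads these values from the QUIC connection's path stats.

```rust
use std::time::Duration;

/// Delta-based loss rate over one polling interval:
/// lost / (sent + lost), using counter differences between two samples.
fn loss_rate(prev_sent: u64, prev_lost: u64, cur_sent: u64, cur_lost: u64) -> f64 {
    let sent = cur_sent.saturating_sub(prev_sent);
    let lost = cur_lost.saturating_sub(prev_lost);
    let total = sent + lost;
    if total == 0 {
        0.0
    } else {
        lost as f64 / total as f64
    }
}

/// Bandwidth estimate in bits per second: cwnd (bytes) * 8 / rtt (seconds).
/// Returns `None` when the RTT is zero to avoid dividing by zero.
fn bandwidth_bps(cwnd_bytes: u64, rtt: Duration) -> Option<f64> {
    let secs = rtt.as_secs_f64();
    if secs > 0.0 {
        Some(cwnd_bytes as f64 * 8.0 / secs)
    } else {
        None
    }
}
```

For example, a congestion window of 125,000 bytes at a 100 ms RTT corresponds to roughly 10 Mbit/s.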
|
||||
|
||||
### `spawn_stats_recorder(conn, net_stats, shutdown)`
|
||||
|
||||
Records connection stats (RTT, loss rate, bandwidth, path type) into `NetStats` for debug overlay display. Runs every 200ms.
|
||||
164
docs/research/references/iroh/iroh-live/03-iroh-moq-transport.md
Normal file
@@ -0,0 +1,164 @@
|
||||
# iroh-moq: MoQ Transport Layer
|
||||
|
||||
## Overview
|
||||
|
||||
`iroh-moq` is the transport bridge between iroh's QUIC endpoint and the moq-lite broadcast protocol. It manages connections, session lifecycle, broadcast routing, and subscription handling. This is the only crate in the workspace that directly interacts with QUIC transport — everything above uses `Moq`/`MoqSession` as the interface.
|
||||
|
||||
**ALPN:** `moq-lite-03`
|
||||
|
||||
## Core Types
|
||||
|
||||
### `Moq` — Transport Manager
|
||||
|
||||
The top-level transport entry point. Wraps an iroh `Endpoint` and runs an internal actor (`Actor`) that handles all connection and broadcast management.
|
||||
|
||||
```rust
|
||||
let moq = Moq::new(endpoint);
|
||||
```
|
||||
|
||||
**Internal architecture:**
|
||||
|
||||
`Moq` holds an `mpsc::Sender<ActorMessage>` to communicate with a spawned actor task. The actor manages:
|
||||
- A `HashMap<EndpointId, MoqSession>` of active sessions
|
||||
- A `HashMap<BroadcastName, BroadcastProducer>` of locally published broadcasts
|
||||
- A `JoinSet` of session tasks (each tracks session lifetime)
|
||||
- A `FuturesUnordered` of pending connect tasks
|
||||
- A `broadcast::Sender<MoqSession>` for incoming session notifications
|
||||
|
||||
**Key methods:**
|
||||
|
||||
| Method | Description |
|
||||
|--------|-------------|
|
||||
| `new(endpoint)` | Creates transport and spawns the actor |
|
||||
| `protocol_handler()` | Returns `MoqProtocolHandler` for Router registration |
|
||||
| `publish(name, producer)` | Register a broadcast for all current and future sessions |
|
||||
| `connect(remote)` | Connect to remote peer, deduplicating existing connections |
|
||||
| `incoming_sessions()` | Get stream of incoming sessions |
|
||||
| `published_broadcasts()` | List currently published broadcast names |
|
||||
| `shutdown()` | Cancel the shutdown token, closing all sessions |
|
||||
|
||||
### `MoqProtocolHandler`
|
||||
|
||||
Implements iroh's `ProtocolHandler` trait. When the Router receives an incoming connection with the `moq-lite-03` ALPN:
|
||||
|
||||
1. Accepts the raw QUIC `Connection`
|
||||
2. Wraps it in a `web_transport_iroh::Session::raw(connection)`
|
||||
3. Completes the moq-lite server handshake: `MoqSession::session_accept(wt_session)`
|
||||
4. Sends the session to the actor via `ActorMessage::HandleSession`
|
||||
|
||||
### `MoqSession` — Single Peer Connection
|
||||
|
||||
Represents a MoQ connection with one remote peer. Created via:
|
||||
- `Moq::connect()` (outbound, client role)
|
||||
- `IncomingSession::accept()` (inbound, server role)
|
||||
|
||||
```rust
|
||||
// Outbound
|
||||
let session = moq.connect(remote_addr).await?;
|
||||
|
||||
// Inbound
|
||||
let incoming = incoming_session.next().await?;
|
||||
let session = incoming.accept(); // or incoming.reject()
|
||||
```
|
||||
|
||||
**Internal structure:**
|
||||
|
||||
```rust
|
||||
pub struct MoqSession {
|
||||
wt_session: web_transport_iroh::Session,
|
||||
_moq_session: Arc<moq_lite::Session>,
|
||||
publish: OriginProducer, // For announcing local broadcasts
|
||||
subscribe: OriginConsumer, // For consuming remote broadcasts
|
||||
}
|
||||
```
|
||||
|
||||
The `OriginProducer`/`OriginConsumer` pair comes from moq-lite. The session creates them before the handshake:
|
||||
|
||||
- **Client (connect):** Creates `OriginProducer` for publish and `OriginConsumer` for subscribe, then `Client::new().with_publish(...).with_consume(...).connect(session)`
|
||||
- **Server (accept):** Same pattern with `Server::new().with_publish(...).with_consume(...).accept(session)`
|
||||
|
||||
**Key methods:**
|
||||
|
||||
| Method | Description |
|
||||
|--------|-------------|
|
||||
| `subscribe(name)` | Wait for remote to announce broadcast, return `BroadcastConsumer` |
|
||||
| `publish(name, consumer)` | Make a broadcast available to remote peer |
|
||||
| `conn()` | Reference to underlying QUIC `Connection` (for stats) |
|
||||
| `remote_id()` | Remote peer's `EndpointId` |
|
||||
| `close(code, reason)` | Close the session |
|
||||
| `closed()` | Wait for session to close, returns `SessionError` |
|
||||
| `origin_producer()` | Direct access to moq-lite publish origin |
|
||||
| `origin_consumer()` | Direct access to moq-lite subscribe origin |
|
||||
|
||||
### `IncomingSession` / `IncomingSessionStream`
|
||||
|
||||
`IncomingSession` wraps a `MoqSession` that has completed the handshake. It provides:
|
||||
- `remote_id()` — the connecting peer's identity
|
||||
- `accept()` — returns the `MoqSession`
|
||||
- `reject()` — closes with error code 1
|
||||
|
||||
`IncomingSessionStream` is an async stream that yields `IncomingSession` values. It uses a `broadcast::Receiver<MoqSession>` internally and handles lag by skipping missed sessions.
|
||||
|
||||
## Actor Internals
|
||||
|
||||
The `Actor` is the core event loop for the `Moq` transport:
|
||||
|
||||
```
|
||||
loop {
|
||||
select! {
|
||||
msg = inbox.recv() → handle_message(msg)
|
||||
session_closed → remove session, log
|
||||
broadcast_closed → remove from publishing map
|
||||
connect_completed → handle_session or reply to caller
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Message Types
|
||||
|
||||
```rust
|
||||
enum ActorMessage {
|
||||
HandleSession { session: Box<MoqSession> },
|
||||
LocalBroadcast { broadcast_name: String, producer: BroadcastProducer },
|
||||
Connect { remote: EndpointAddr, reply: oneshot::Sender<...> },
|
||||
GetPublished { reply: oneshot::Sender<Vec<String>> },
|
||||
}
|
||||
```
|
||||
|
||||
### Connection Deduplication
|
||||
|
||||
When `Connect` is received for a peer that already has an active session, the existing session is returned immediately. If a connection attempt is already in progress, the oneshot reply is queued and notified when the connection completes.
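The deduplication logic can be modelled synchronously. In this sketch, `ConnectTable` is a hypothetical stand-in for the actor's state, `Session` is a plain `String`, and `std::sync::mpsc` channels stand in for the oneshot reply channels. None of these names come from the crate.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

type PeerId = u64; // simplified stand-in for EndpointId
type Session = String; // simplified stand-in for MoqSession

#[derive(Default)]
struct ConnectTable {
    sessions: HashMap<PeerId, Session>,
    pending: HashMap<PeerId, Vec<Sender<Session>>>,
}

impl ConnectTable {
    /// Returns the reply receiver and whether a new dial has to be started.
    fn connect(&mut self, peer: PeerId) -> (Receiver<Session>, bool) {
        let (tx, rx) = channel();
        // Existing session: reply immediately, no dial needed.
        if let Some(session) = self.sessions.get(&peer) {
            let _ = tx.send(session.clone());
            return (rx, false);
        }
        // Otherwise queue the reply; only the first waiter triggers a dial.
        let waiters = self.pending.entry(peer).or_default();
        let start_dial = waiters.is_empty();
        waiters.push(tx);
        (rx, start_dial)
    }

    /// Called when a dial completes: notify all queued callers and store the session.
    fn on_connected(&mut self, peer: PeerId, session: Session) {
        for waiter in self.pending.remove(&peer).unwrap_or_default() {
            let _ = waiter.send(session.clone());
        }
        self.sessions.insert(peer, session);
    }
}
```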
|
||||
|
||||
### Broadcast Fan-out
|
||||
|
||||
When a `LocalBroadcast` is published via `Moq::publish()`:
|
||||
1. The actor stores the `BroadcastProducer` in its `publishing` map
|
||||
2. It immediately announces the broadcast to all existing sessions by calling `session.publish(name, producer.consume())` on each
|
||||
3. For future sessions, the actor iterates `publishing` entries and announces each one
|
||||
4. A `FuturesUnordered` tracks when each broadcast closes, removing it from the map
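The fan-out steps above can be modelled with plain collections. `FanOut` is a hypothetical model for illustration: broadcasts are just names, each session records the names announced to it, and republishing an already-published name is ignored (an assumption, since the source does not say how duplicates are handled).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model of the actor's publishing state.
#[derive(Default)]
struct FanOut {
    /// Names of locally published broadcasts.
    publishing: BTreeSet<String>,
    /// Per-session list of names that were announced to that peer.
    sessions: BTreeMap<u64, Vec<String>>,
}

impl FanOut {
    /// Store the broadcast and announce it to every existing session.
    fn publish(&mut self, name: &str) {
        if self.publishing.insert(name.to_string()) {
            for announced in self.sessions.values_mut() {
                announced.push(name.to_string());
            }
        }
    }

    /// A new session is told about everything that is currently published.
    fn add_session(&mut self, peer: u64) {
        let announced = self.publishing.iter().cloned().collect();
        self.sessions.insert(peer, announced);
    }

    /// When a broadcast closes, it is no longer announced to future sessions.
    fn broadcast_closed(&mut self, name: &str) {
        self.publishing.remove(name);
    }
}
```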
|
||||
|
||||
### Session Lifecycle
|
||||
|
||||
When a session is established (either incoming or outgoing):
|
||||
1. All currently published broadcasts are announced to it
|
||||
2. It's stored in `sessions` by `EndpointId`
|
||||
3. A session task is spawned that waits for the session to close
|
||||
4. If there were pending connect requests for this peer, they're fulfilled
|
||||
|
||||
## Error Types
|
||||
|
||||
```rust
|
||||
enum Error {
|
||||
Connect(ConnectError), // iroh connection failure
|
||||
Moq(moq_lite::Error), // MoQ protocol error
|
||||
Server(web_transport_iroh::ServerError), // WebTransport server error
|
||||
InternalConsistencyError(LiveActorDiedError), // Actor died
|
||||
Request(WriteError), // QUIC write error
|
||||
}
|
||||
|
||||
enum SubscribeError {
|
||||
NotAnnounced, // Track was not announced
|
||||
Closed, // Track was closed
|
||||
SessionClosed(SessionError), // Session closed
|
||||
}
|
||||
```
|
||||
185
docs/research/references/iroh/iroh-live/04-rooms.md
Normal file
@@ -0,0 +1,185 @@
|
||||
# iroh-live: Rooms — Multi-Party Coordination
|
||||
|
||||
## Overview
|
||||
|
||||
The `rooms` module provides multi-party room coordination over iroh-gossip. Participants discover each other via a gossip topic, automatically connect and subscribe to each other's broadcasts, and receive `RoomEvent` notifications as peers join, publish, and leave.
|
||||
|
||||
## Core Types
|
||||
|
||||
### `Room`
|
||||
|
||||
The main room handle. Created via `Room::new(live, ticket)`. Spawns an internal actor that manages all peer coordination.
|
||||
|
||||
```rust
|
||||
// Create a room (generates a random topic)
|
||||
let ticket = RoomTicket::generate();
|
||||
let room = Room::new(&live, ticket.clone()).await?;
|
||||
|
||||
// Or join an existing room
|
||||
let room = Room::new(&live, existing_ticket).await?;
|
||||
```
|
||||
|
||||
**Methods:**
|
||||
- `recv()` — Wait for next `RoomEvent`
|
||||
- `try_recv()` — Non-blocking event check
|
||||
- `ticket()` — Get a ticket that includes this peer as a bootstrap node
|
||||
- `split()` — Decompose into `(RoomEvents, RoomHandle)` for use in separate tasks
|
||||
- `publish(name, &LocalBroadcast)` — Publish a broadcast to the room
|
||||
- `set_chat_publisher(publisher)` — Register a chat publisher
|
||||
- `send_chat(text)` — Send a chat message
|
||||
|
||||
### `RoomHandle`
|
||||
|
||||
Cloneable handle for publishing into a room. Obtained from `Room::split()`. Can be shared across tasks.
|
||||
|
||||
```rust
|
||||
let (events, handle) = room.split();
|
||||
|
||||
// In one task: receive events
|
||||
while let Some(event) = events.recv().await {
|
||||
match event { ... }
|
||||
}
|
||||
|
||||
// In another task: publish
|
||||
handle.publish("camera", &broadcast).await?;
|
||||
handle.send_chat("Hello!").await?;
|
||||
handle.set_display_name("Alice").await?;
|
||||
```
|
||||
|
||||
### `RoomTicket`
|
||||
|
||||
```rust
|
||||
pub struct RoomTicket {
|
||||
pub bootstrap: Vec<EndpointId>, // Bootstrap peer IDs for gossip
|
||||
pub topic_id: TopicId, // Gossip topic identifier
|
||||
}
|
||||
```
|
||||
|
||||
Serialized via `iroh_tickets` (binary format). Can be created from:
|
||||
- `RoomTicket::generate()` — Random topic, no bootstrap
|
||||
- `RoomTicket::new(topic_id, bootstrap)` — Specific topic and peers
|
||||
- `RoomTicket::new_from_env()` — From `IROH_LIVE_ROOM` or `IROH_LIVE_TOPIC` env vars
|
||||
|
||||
### `RoomEvent`
|
||||
|
||||
```rust
|
||||
pub enum RoomEvent {
|
||||
RemoteAnnounced {
|
||||
remote: EndpointId,
|
||||
broadcasts: Vec<String>,
|
||||
},
|
||||
BroadcastSubscribed {
|
||||
session: Box<MoqSession>,
|
||||
broadcast: Box<RemoteBroadcast>,
|
||||
},
|
||||
PeerJoined {
|
||||
remote: EndpointId,
|
||||
display_name: Option<String>,
|
||||
},
|
||||
PeerLeft {
|
||||
remote: EndpointId,
|
||||
},
|
||||
ChatReceived {
|
||||
remote: EndpointId,
|
||||
message: ChatMessage,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
## Room Actor — Internal Architecture
|
||||
|
||||
The room actor is a spawned task that manages the gossip KV subscription and coordinates all peer connections.
|
||||
|
||||
### State
|
||||
|
||||
```rust
|
||||
struct Actor {
|
||||
me: EndpointId,
|
||||
_gossip: Gossip,
|
||||
live: Live,
|
||||
active_subscribe: HashSet<BroadcastId>, // (EndpointId, name) pairs
|
||||
active_publish: HashSet<String>, // Locally published broadcast names
|
||||
known_peers: HashMap<EndpointId, Option<String>>, // display names
|
||||
connecting: ConnectingFutures, // In-flight subscribe attempts
|
||||
subscribe_closed: FuturesUnordered, // Track subscription lifetimes
|
||||
publish_closed: FuturesUnordered, // Track publish lifetimes
|
||||
chat_messages: FuturesUnordered, // Active chat subscribers
|
||||
chat_publisher: Option<ChatPublisher>,
|
||||
display_name: Option<String>,
|
||||
event_tx: mpsc::Sender<RoomEvent>,
|
||||
kv: iroh_smol_kv::Client, // Distributed KV for peer state
|
||||
kv_writer: WriteScope, // KV write access
|
||||
}
|
||||
```
|
||||
|
||||
### Gossip KV for Peer Discovery
|
||||
|
||||
The room coordinates peer state through `iroh-smol-kv`, a key-value store carried over gossip. Each peer writes its own `PeerState` under the key `b"s"`:
|
||||
|
||||
```rust
|
||||
struct PeerState {
|
||||
broadcasts: Vec<String>,
|
||||
display_name: Option<String>,
|
||||
}
|
||||
```
|
||||
|
||||
Serialized with postcard. Because postcard is a positional binary format with no field names, **`skip_serializing_if` must not be used**: omitting a field would shift every field after it and break decoding.
|
||||
|
||||
### Event Loop
|
||||
|
||||
```
|
||||
loop {
|
||||
select! {
|
||||
update = gossip_kv_stream.next() → handle_gossip_update
|
||||
msg = inbox.recv() → handle_api_message
|
||||
result = connecting.next() → subscribe succeeded/failed
|
||||
broadcast_closed → remove from active, maybe emit PeerLeft
|
||||
publish_closed → remove from active_publish, update KV
|
||||
chat_message → emit ChatReceived
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Peer Discovery Flow
|
||||
|
||||
1. Peer A publishes a broadcast via `handle.publish("camera", &broadcast)`
|
||||
2. Actor publishes to MoQ AND updates gossip KV with `PeerState { broadcasts: ["camera"], display_name: ... }`
|
||||
3. Peer B's gossip KV stream receives the update
|
||||
4. Peer B's actor checks `known_peers` — if new, emits `PeerJoined`
|
||||
5. Peer B's actor checks `active_subscribe` — if new broadcast, initiates `live.subscribe(remote, name)`
|
||||
6. When subscription succeeds, Peer B emits `BroadcastSubscribed`
|
||||
7. If the broadcast has a chat track, a chat subscriber is spawned
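
Steps 3 to 5 above can be sketched as a pure state update. This is a simplified model rather than the actor's actual code: plain `String`s stand in for `EndpointId`, `Discovery`, `Action`, and `handle_update` are hypothetical names, and a broadcast is marked active as soon as the subscribe is requested (the real actor tracks in-flight attempts in `connecting`).

```rust
use std::collections::{HashMap, HashSet};

/// Mirrors the `PeerState` value each peer writes to the gossip KV.
pub struct PeerState {
    pub broadcasts: Vec<String>,
    pub display_name: Option<String>,
}

/// What the actor should do in response to one KV update (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Action {
    EmitPeerJoined { remote: String },
    Subscribe { remote: String, name: String },
}

/// Simplified stand-in for the relevant actor state.
#[derive(Default)]
pub struct Discovery {
    known_peers: HashMap<String, Option<String>>,
    active_subscribe: HashSet<(String, String)>,
}

impl Discovery {
    pub fn handle_update(&mut self, remote: &str, state: &PeerState) -> Vec<Action> {
        let mut actions = Vec::new();
        // Step 4: a peer we have not seen before produces `PeerJoined`.
        if !self.known_peers.contains_key(remote) {
            actions.push(Action::EmitPeerJoined { remote: remote.to_string() });
        }
        self.known_peers
            .insert(remote.to_string(), state.display_name.clone());
        // Step 5: subscribe only to broadcasts that are not already active.
        for name in &state.broadcasts {
            let id = (remote.to_string(), name.clone());
            if self.active_subscribe.insert(id) {
                actions.push(Action::Subscribe {
                    remote: remote.to_string(),
                    name: name.clone(),
                });
            }
        }
        actions
    }
}
```

Feeding the same update twice yields no further actions, which is the property that keeps repeated gossip deliveries from triggering duplicate subscriptions.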
|
||||
|
||||
### Chat
|
||||
|
||||
Chat uses a dedicated MoQ track within each broadcast. Each message is a single MoQ group containing one frame of UTF-8 text. The sender identity comes from the broadcast context (peer ID), not the message payload.
|
||||
|
||||
### Connection Lifecycle
|
||||
|
||||
- When a broadcast closes (`subscribe_closed`), it's removed from `active_subscribe`
|
||||
- If this was the last broadcast from that peer, `PeerLeft` is emitted
|
||||
- When a publish closes (`publish_closed`), the KV is updated to remove that broadcast
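
The `PeerLeft` rule above reduces to a set operation. The following is a minimal sketch under the assumption that a `BroadcastId` is a `(peer, name)` pair; `on_subscribe_closed` is a hypothetical helper name and `String` stands in for `EndpointId`.

```rust
use std::collections::HashSet;

/// `(peer, broadcast name)`; `String` stands in for `EndpointId`.
pub type BroadcastId = (String, String);

/// Removes a closed subscription and reports whether `PeerLeft` should fire.
pub fn on_subscribe_closed(active: &mut HashSet<BroadcastId>, closed: &BroadcastId) -> bool {
    if !active.remove(closed) {
        // Unknown or already removed: nothing to report.
        return false;
    }
    // `PeerLeft` fires only when no other broadcast from that peer remains.
    !active.iter().any(|(peer, _)| peer == &closed.0)
}
```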
|
||||
|
||||
### `RoomPublisherSync`
|
||||
|
||||
A convenience wrapper for the common pattern of publishing camera+audio and optionally screen share into a room:
|
||||
|
||||
```rust
|
||||
let publisher = RoomPublisherSync::new(room_handle, audio_backend);
|
||||
publisher.set_state(&PublishOpts::default())?;
|
||||
```
|
||||
|
||||
Automatically publishes a "camera" broadcast and manages a "screen" broadcast when screen sharing is toggled on.
|
||||
|
||||
## API Messages
|
||||
|
||||
```rust
|
||||
enum ApiMessage {
|
||||
Publish { name: String, producer: BroadcastProducer },
|
||||
SendChat { text: String },
|
||||
SetChatPublisher { publisher: ChatPublisher },
|
||||
SetDisplayName { name: String },
|
||||
}
|
||||
```
|
||||
|
||||
These are sent from `RoomHandle` to the actor via an mpsc channel.
|
||||
105
docs/research/references/iroh/iroh-live/05-relay.md
Normal file
@@ -0,0 +1,105 @@
|
||||
# iroh-live-relay: Browser Bridging
|
||||
|
||||
## Overview
|
||||
|
||||
The relay server bridges iroh P2P streams to browser clients via WebTransport. Browsers cannot speak iroh's QUIC protocol directly, so the relay accepts WebTransport connections and either serves locally-published broadcasts or pulls them from remote iroh publishers on demand.
|
||||
|
||||
**Architecture:**
|
||||
|
||||
```
|
||||
iroh-live publish --(iroh P2P)--> iroh-live-relay <--(WebTransport)-- browser
|
||||
browser --(WebTransport)--> iroh-live-relay --(iroh P2P)--> iroh-live subscribe
|
||||
```
|
||||
|
||||
## Components
|
||||
|
||||
### `RelayConfig` (CLI Configuration)
|
||||
|
||||
```rust
|
||||
pub struct RelayConfig {
|
||||
pub bind: SocketAddr, // QUIC/WebTransport bind (default: [::]:4443)
|
||||
pub http_bind: SocketAddr, // HTTP static files bind (default: same as bind)
|
||||
}
|
||||
```
|
||||
|
||||
Flattenable into a clap CLI via `#[command(flatten)]`.
|
||||
|
||||
### `run(config)` — Main Server Loop
|
||||
|
||||
The main entry point. Sets up:
|
||||
|
||||
1. **QUIC/WebTransport server** — Uses `moq-native::ServerConfig` with:
|
||||
- QUIC backend: `noq` (a custom QUIC implementation)
|
||||
- iroh endpoint integration
|
||||
- Self-signed TLS certificates (dev mode) for `localhost`
|
||||
- Max streams: `moq_relay::DEFAULT_MAX_STREAMS`
|
||||
|
||||
2. **iroh endpoint** — Binds an iroh endpoint for P2P connectivity, prints its ID
|
||||
|
||||
3. **moq-relay Cluster** — The broadcast routing engine. Manages broadcast lifecycle: when all subscribers disconnect, the broadcast is removed.
|
||||
|
||||
4. **HTTP server** — Axum router serving:
|
||||
- `GET /certificate.sha256` — TLS fingerprint for dev mode
|
||||
- `GET /` — Web viewer landing page
|
||||
- `GET /{path}` — Static file serving with CORS
|
||||
- Embedded via `include_dir!` from `web/dist/`
|
||||
|
||||
5. **Pull mode** — If iroh endpoint is available, creates a `PullState` for on-demand remote broadcast fetching
|
||||
|
||||
6. **Connection loop** — Accepts incoming connections, parses the URL path as a `LiveTicket`, and if valid, triggers a pull before running the connection
|
||||
|
||||
### `PullState` — On-Demand Remote Fetching
|
||||
|
||||
When a browser connects with a broadcast name that is a valid `LiveTicket`, the relay:
|
||||
|
||||
1. Checks if the broadcast already exists in the cluster (fast path)
|
||||
2. If not, connects to the remote publisher via iroh-live's `Moq::connect()`
|
||||
3. Subscribes to the remote broadcast
|
||||
4. Publishes the consumer into the local cluster under the ticket string as the name
|
||||
5. Spawns a keepalive task that holds the session until it closes
|
||||
|
||||
**Concurrency:** Duplicate concurrent pulls for the same ticket are deduplicated using a `HashMap<String, Arc<Notify>>`. Waiters block on the `Notify` until the first connector finishes.
|
||||
|
||||
```rust
|
||||
pub(crate) struct PullState {
|
||||
live: iroh_live::Live,
|
||||
cluster: Cluster,
|
||||
connecting: Arc<Mutex<HashMap<String, Arc<Notify>>>>,
|
||||
}
|
||||
```
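
The deduplication pattern can be illustrated without an async runtime. The sketch below is an assumption-laden stand-in, not the relay's code: it swaps `tokio::sync::Notify` for a blocking `Mutex` + `Condvar` gate, and `Gate`, `Role`, `PullDedup`, `begin`, and `finish` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Blocking stand-in for `Notify`: a flag plus a condition variable.
#[derive(Default)]
pub struct Gate {
    done: Mutex<bool>,
    cv: Condvar,
}

impl Gate {
    /// Blocks until the leader has finished (returns at once if it already has).
    pub fn wait(&self) {
        let mut done = self.done.lock().unwrap();
        while !*done {
            done = self.cv.wait(done).unwrap();
        }
    }

    fn open(&self) {
        *self.done.lock().unwrap() = true;
        self.cv.notify_all();
    }
}

pub enum Role {
    /// First caller for this ticket: perform the pull, then call `finish`.
    Leader,
    /// A pull is already running: wait on the gate, then re-check the cluster.
    Waiter(Arc<Gate>),
}

#[derive(Default)]
pub struct PullDedup {
    connecting: Mutex<HashMap<String, Arc<Gate>>>,
}

impl PullDedup {
    pub fn begin(&self, ticket: &str) -> Role {
        let mut map = self.connecting.lock().unwrap();
        match map.get(ticket) {
            Some(gate) => Role::Waiter(gate.clone()),
            None => {
                map.insert(ticket.to_string(), Arc::new(Gate::default()));
                Role::Leader
            }
        }
    }

    /// Called by the leader on success or failure; wakes every waiter.
    pub fn finish(&self, ticket: &str) {
        let gate = self.connecting.lock().unwrap().remove(ticket);
        if let Some(gate) = gate {
            gate.open();
        }
    }
}
```

Removing the map entry in `finish` regardless of the pull's outcome means a failed pull does not poison the ticket: the next request becomes a new leader and retries.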
|
||||
|
||||
### Web Viewer
|
||||
|
||||
The relay embeds a SolidJS + TypeScript web application compiled by Vite. It uses:
|
||||
- `@moq/watch` — Web component for watching streams via WebCodecs
|
||||
- `@moq/publish` — Web component for publishing from browser camera/mic
|
||||
- WebTransport — For QUIC connectivity from the browser
|
||||
|
||||
Watch URLs: `https://relay:4443/?name=<BROADCAST_OR_TICKET>`
|
||||
|
||||
### Data Directory
|
||||
|
||||
The relay persists data to `$IROH_LIVE_RELAY_DATA` (or the platform default). This includes:
|
||||
- iroh secret key (`iroh_secret_key`) — ensures endpoint ID stability across restarts
|
||||
- TLS certificates
|
||||
|
||||
### TLS and Certificates
|
||||
|
||||
Currently **self-signed only**. ACME/Let's Encrypt is planned but not implemented. In dev mode, browsers need `--ignore-certificate-errors` or the relay's fingerprint (served at `/certificate.sha256`) for WebTransport to work.
|
||||
|
||||
## Authentication
|
||||
|
||||
No authentication is implemented yet: the relay accepts all connections. MoQ supports token-based authentication, which could be added later.
|
||||
|
||||
## CLI Binary
|
||||
|
||||
```rust
|
||||
// iroh-live-relay/src/main.rs
|
||||
#[derive(Parser)]
|
||||
struct Cli {
|
||||
#[command(flatten)]
|
||||
relay: RelayConfig,
|
||||
}
|
||||
```
|
||||
|
||||
The binary must call `rustls::crypto::aws_lc_rs::default_provider().install_default()` before `run()` so that rustls has a process-wide crypto provider installed.
|
||||
@@ -0,0 +1,304 @@
|
||||
# moq-media: Media Pipelines
|
||||
|
||||
## Overview
|
||||
|
||||
`moq-media` owns the media pipeline: broadcast management, codec orchestration, playout timing, adaptive bitrate, and audio backend. **It has no dependency on iroh** — it works with any transport that implements `PacketSource` and `PacketSink`. This makes it usable for recording pipelines, studio links, and camera dashboards without RTC.
|
||||
|
||||
## Module Structure
|
||||
|
||||
```
|
||||
moq-media/
|
||||
├── lib.rs — Re-exports and feature-gated modules
|
||||
├── publish.rs — LocalBroadcast, VideoPublisher, AudioPublisher
|
||||
├── subscribe.rs — RemoteBroadcast, VideoTrack, AudioTrack, MediaTracks
|
||||
├── transport.rs — PacketSource/PacketSink traits, MoqPacketSource, MoqPacketSink
|
||||
├── net.rs — NetworkSignals (RTT, loss rate, available bandwidth)
|
||||
├── adaptive.rs — Adaptive rendition switching algorithm
|
||||
├── playout.rs — PlaybackPolicy, SyncMode
|
||||
├── chat.rs — ChatPublisher, ChatSubscriber (MoQ track-based)
|
||||
├── frame_channel.rs — Single-frame channel (last-writer-wins for video)
|
||||
├── sync.rs — Shared playout clock (Sync) for A/V sync
|
||||
├── stats.rs — Metric, Label, NetStats, EncodeStats, RenderStats, etc.
|
||||
├── pipeline.rs — Pipeline orchestration
|
||||
├── pipeline/ — VideoEncoderPipeline, AudioEncoderPipeline, VideoDecoderPipeline, etc.
|
||||
├── audio_backend.rs — AudioBackend trait and device enumeration
|
||||
├── audio_backend/ — Platform-specific audio backends (cpal, etc.)
|
||||
├── capture.rs — Camera/screen capture integration
|
||||
├── source_spec.rs — VideoInput, PreEncodedTrack
|
||||
├── test_util.rs — Test utilities (feature-gated)
|
||||
└── processing/ — Scale, color conversion, etc.
|
||||
```
|
||||
|
||||
## Publish Pipeline — `LocalBroadcast`
|
||||
|
||||
`LocalBroadcast` manages encoder pipelines and publishes a catalog that subscribers use to discover available renditions. It owns a `BroadcastProducer` (from moq-lite) and coordinates video and audio track lifecycles.
|
||||
|
||||
### Construction
|
||||
|
||||
```rust
|
||||
let broadcast = LocalBroadcast::new();
|
||||
broadcast.video().set_source(camera, VideoCodec::H264, [VideoPreset::P720])?;
|
||||
broadcast.audio().set(mic, AudioCodec::Opus, [AudioPreset::Hq])?;
|
||||
|
||||
// Or pre-encoded sources
|
||||
broadcast.video().set(VideoInput::pre_encoded("video/h264-pi", config, factory))?;
|
||||
```
|
||||
|
||||
### Slot Handles
|
||||
|
||||
- `broadcast.video()` → `VideoPublisher` (borrows `&self`)
|
||||
- `broadcast.audio()` → `AudioPublisher` (borrows `&self`)
|
||||
|
||||
Both use interior mutability. Calling `set()` tears down any existing pipeline and installs the new one.
|
||||
|
||||
### Video Input Modes
|
||||
|
||||
```rust
|
||||
pub enum VideoInput {
|
||||
Renditions(VideoRenditions), // Raw source → multiple encoded renditions (simulcast)
|
||||
PreEncoded(Vec<PreEncodedTrack>), // Already-encoded tracks pass through
|
||||
}
|
||||
```
|
||||
|
||||
**`VideoRenditions`** holds a `SharedVideoSource` and a map of rendition names to encoder factories. Multiple renditions share the same source via `watch::Receiver<Option<VideoFrame>>`. Slow encoders never cause backpressure on the source — intermediate frames are silently skipped.
|
||||
|
||||
**`PreEncodedTrack`** is for hardware encoders that produce compressed output directly (e.g., rpicam-vid on Raspberry Pi). Each track carries a name, `VideoConfig`, and a factory closure that creates a fresh source per subscriber.
|
||||
|
||||
### SharedVideoSource
|
||||
|
||||
Runs the capture source on a dedicated OS thread. Parks when no subscribers are connected (releasing camera/screen resources) and unparks when the first subscriber arrives. Uses `AtomicU32` subscriber counting with proper memory ordering (`AcqRel`/`Acquire`).
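
A minimal sketch of the subscriber-counting idea, assuming an RAII guard per subscriber. The names `SubscriberCount`, `Subscription`, and `subscribe` are hypothetical, and the actual thread parking and unparking is left out; only the counter and its memory orderings are shown.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// Shared subscriber counter (hypothetical name).
#[derive(Default)]
pub struct SubscriberCount(AtomicU32);

impl SubscriberCount {
    /// Polled by the capture thread to decide whether it may park.
    pub fn is_idle(&self) -> bool {
        self.0.load(Ordering::Acquire) == 0
    }
}

/// RAII guard for one live subscriber. Dropping it decrements the count.
pub struct Subscription {
    count: Arc<SubscriberCount>,
    /// True if this subscriber took the count from 0 to 1,
    /// which is the moment the capture thread should be unparked.
    pub was_first: bool,
}

pub fn subscribe(count: &Arc<SubscriberCount>) -> Subscription {
    // `AcqRel` orders this increment against the capture thread's `Acquire` loads.
    let previous = count.0.fetch_add(1, Ordering::AcqRel);
    Subscription {
        count: count.clone(),
        was_first: previous == 0,
    }
}

impl Drop for Subscription {
    fn drop(&mut self) {
        self.count.0.fetch_sub(1, Ordering::AcqRel);
    }
}
```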
|
||||
|
||||
Frames are distributed via `watch::Sender<Option<VideoFrame>>` — always contains the latest frame, so slow encoders never block the source.
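
The "latest frame only" behaviour can be modelled with a versioned single slot. This is an illustrative stand-in for a watch channel built on `std::sync::Mutex`, not the crate's implementation; `LatestFrame`, `publish`, and `changed_since` are hypothetical names.

```rust
use std::sync::Mutex;

/// Single-slot, last-writer-wins cell with a version counter.
pub struct LatestFrame<T> {
    inner: Mutex<(u64, Option<T>)>,
}

impl<T: Clone> LatestFrame<T> {
    pub fn new() -> Self {
        Self { inner: Mutex::new((0, None)) }
    }

    /// Never waits for readers: an unread frame is simply replaced.
    pub fn publish(&self, frame: T) {
        let mut guard = self.inner.lock().unwrap();
        guard.0 += 1;
        guard.1 = Some(frame);
    }

    /// Returns the newest frame and its version if it is newer than `seen`.
    pub fn changed_since(&self, seen: u64) -> Option<(u64, T)> {
        let guard = self.inner.lock().unwrap();
        if guard.0 > seen {
            guard.1.clone().map(|frame| (guard.0, frame))
        } else {
            None
        }
    }
}
```

A slow encoder that polls after three frames were published sees only the third one; the version jump from 0 to 3 tells it that two frames were skipped.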
|
||||
|
||||
### Demand-Driven Track Startup
|
||||
|
||||
The broadcast's run loop (`LocalBroadcast::run_dynamic`) calls `producer.requested_track().await` to wait for subscriber demand. When a subscriber requests a specific rendition:
|
||||
|
||||
1. The loop looks up the rendition in the current `VideoInput` or `AudioRenditions`
|
||||
2. It starts the corresponding encoder pipeline on a dedicated OS thread
|
||||
3. When all subscribers disconnect (tracked via `track.unused().await`), the pipeline is stopped
|
||||
|
||||
This means encoder threads only run when someone is actually consuming.
|
||||
|
||||
### Catalog
|
||||
|
||||
`LocalBroadcast` maintains a catalog track (hang's built-in catalog mechanism) listing all available video and audio renditions with codec configuration, dimensions, and bitrate. Updated whenever video or audio is set/cleared.
|
||||
|
||||
Catalog format follows the `hang::catalog::Catalog` structure with `Video` and `Audio` entries, each containing a `BTreeMap<String, Config>` of rendition names to configurations.
|
||||
|
||||
### Encoder Pipeline Architecture
|
||||
|
||||
All encoder pipelines run on **dedicated OS threads** (`spawn_thread`), not tokio tasks. Codec operations are CPU-intensive and sometimes block on hardware (VAAPI, V4L2), so running on tokio tasks would starve other async work.
|
||||
|
||||
Communication with the async runtime:
|
||||
- **VideoEncoderPipeline**: reads `SharedVideoSource` via `watch::Receiver`, writes encoded frames to `MoqPacketSink`
|
||||
- **AudioEncoderPipeline**: reads from `AudioSource`, writes to `MoqPacketSink`
|
||||
- **PreEncodedVideoPipeline**: reads from `PreEncodedVideoSource`, writes to `MoqPacketSink`
|
||||
|
||||
### Chat
|
||||
|
||||
```rust
|
||||
let chat_publisher = broadcast.enable_chat()?;
|
||||
chat_publisher.send("Hello!")?;
|
||||
|
||||
// Subscriber side
|
||||
if let Some(chat_sub) = remote_broadcast.chat() {
|
||||
let msg = chat_sub.recv().await;
|
||||
}
|
||||
```
|
||||
|
||||
Each chat message is a single MoQ group with one frame of UTF-8 text. The track name is `"chat"` with priority 10.
|
||||
|
||||
## Subscribe Pipeline — `RemoteBroadcast`
|
||||
|
||||
`RemoteBroadcast` wraps a `BroadcastConsumer` and watches its catalog for available video and audio renditions. Created with a `BroadcastConsumer` and a `PlaybackPolicy`.
|
||||
|
||||
### Construction
|
||||
|
||||
```rust
|
||||
let broadcast = RemoteBroadcast::new("stream-name", consumer).await?;
|
||||
// Or with explicit policy
|
||||
let broadcast = RemoteBroadcast::with_playback_policy("stream", consumer, policy).await?;
|
||||
```
|
||||
|
||||
On construction, spawns a catalog-watching task that publishes snapshots via `Watchable<CatalogSnapshot>`.
|
||||
|
||||
### `CatalogSnapshot`
|
||||
|
||||
Point-in-time view of the broadcast's catalog. Derefs to `hang::Catalog`. Carries a sequence number for change detection.
|
||||
|
||||
```rust
|
||||
let catalog = broadcast.catalog();
|
||||
catalog.video_renditions() // Iterator of rendition names sorted by width
|
||||
catalog.audio_renditions() // Iterator of audio rendition names
|
||||
catalog.select_video_rendition(Quality::High)? // Best match for quality
|
||||
catalog.has_video()
|
||||
catalog.has_audio()
|
||||
catalog.has_chat()
|
||||
catalog.user() // User metadata from publisher
|
||||
```
|
||||
|
||||
### Rendition Selection
|
||||
|
||||
```rust
|
||||
pub enum Quality { Highest, High, Mid, Low }
|
||||
|
||||
pub struct VideoTarget {
|
||||
pub max_pixels: Option<u32>,
|
||||
pub max_bitrate_kbps: Option<u32>,
|
||||
pub rendition: Option<String>, // Pin to specific rendition
|
||||
}
|
||||
```
|
||||
|
||||
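Each `Quality` level maps to a `VideoTarget` pixel cap; for example, `Quality::High` maps to `max_pixels(1280*720)`. If `rendition` is set, it takes priority over the pixel and bitrate caps.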
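
One plausible reading of this selection rule is sketched below. It is an assumption, not the crate's code: `Rendition` and `select` are hypothetical names, the rule "pick the largest rendition that fits every cap" is a guess at the matching strategy, and a pinned name that is not in the catalog falls through to the caps.

```rust
pub struct VideoTarget {
    pub max_pixels: Option<u32>,
    pub max_bitrate_kbps: Option<u32>,
    pub rendition: Option<String>,
}

/// Simplified catalog entry (hypothetical type).
pub struct Rendition {
    pub name: String,
    pub width: u32,
    pub height: u32,
    pub bitrate_kbps: u32,
}

pub fn select<'a>(target: &VideoTarget, renditions: &'a [Rendition]) -> Option<&'a Rendition> {
    // A pinned rendition name takes priority over the caps.
    if let Some(name) = &target.rendition {
        if let Some(found) = renditions.iter().find(|r| &r.name == name) {
            return Some(found);
        }
    }
    // Otherwise keep renditions within both caps and take the largest.
    renditions
        .iter()
        .filter(|r| target.max_pixels.map_or(true, |cap| r.width * r.height <= cap))
        .filter(|r| target.max_bitrate_kbps.map_or(true, |cap| r.bitrate_kbps <= cap))
        .max_by_key(|r| r.width * r.height)
}
```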
|
||||
|
||||
### VideoTrack
|
||||
|
||||
Represents a decoded video stream from a remote broadcast. The decoder runs on a dedicated OS thread.
|
||||
|
||||
**Creation flow:**
|
||||
|
||||
1. Pick a rendition (via `VideoTarget` or explicit name)
|
||||
2. Create `TrackConsumer` from `BroadcastConsumer`, wrap in `OrderedConsumer` with `PlaybackPolicy::max_latency`
|
||||
3. Wrap in `MoqPacketSource`
|
||||
4. A `forward_packets` async task reads from `MoqPacketSource` → `mpsc` channel
|
||||
5. Decoder thread reads `mpsc` → decoder → output via `Sync` playout clock (or `FramePacer`)
|
||||
6. Output channel: `FrameReceiver<VideoFrame>` (latest-frame wins, suitable for rendering)
|
||||
|
||||
**Frame access:**
|
||||
- `track.try_recv()` — Returns latest frame, draining older buffered frames (for game loops)
|
||||
- `track.next_frame().await` — Async wait for next frame
|
||||
- `track.has_frame()` — Check without consuming
|
||||
|
||||
**Adaptive rendition switching:**
|
||||
```rust
|
||||
track.enable_adaptation(broadcast, signals, config, decode_config)?;
|
||||
track.disable_adaptation();
|
||||
track.is_adaptive();
|
||||
track.selected_rendition();
|
||||
track.set_rendition_mode(RenditionMode::Fixed("video/h264-360p".into()));
|
||||
track.set_rendition_mode(RenditionMode::Auto);
|
||||
track.rendition_watcher(); // Direct<String> watcher for rendition changes
|
||||
```
|
||||
|
||||
### AudioTrack
|
||||
|
||||
Same pattern as `VideoTrack` but sends decoded samples to an `AudioSink` (typically cpal + sonora). The audio decoder thread runs a 10ms tick loop.
|
||||
|
||||
### MediaTracks
|
||||
|
||||
Convenience struct combining `RemoteBroadcast` with optional `VideoTrack` and `AudioTrack`:
|
||||
|
||||
```rust
|
||||
pub struct MediaTracks {
|
||||
pub broadcast: RemoteBroadcast,
|
||||
pub video: Option<VideoTrack>,
|
||||
pub audio: Option<AudioTrack>,
|
||||
}
|
||||
```
|
||||
|
||||
### Lifecycle
|
||||
|
||||
Both `VideoTrack` and `AudioTrack` use drop-based cleanup. Dropping cancels the decoder thread (via `CancellationToken`) and the `forward_packets` task (via `AbortOnDropHandle`). The `OrderedConsumer` is dropped, signaling the transport that the track is no longer needed.
|
||||
|
||||
## Transport Abstraction — `PacketSource` / `PacketSink`
|
||||
|
||||
The transport boundary between moq-media and the network:
|
||||
|
||||
```rust
|
||||
pub trait PacketSource: Send + 'static {
|
||||
fn read(&mut self) -> impl Future<Output = Result<Option<MediaPacket>>> + Send;
|
||||
}
|
||||
|
||||
pub trait PacketSink: Send + 'static {
|
||||
fn write(&mut self, packet: EncodedFrame) -> Result<()>;
|
||||
fn finish(&mut self) -> Result<()>;
|
||||
}
|
||||
```
|
||||
|
||||
**`MoqPacketSink`** wraps an `OrderedProducer`. When it receives an `EncodedFrame` with `is_keyframe = true`, it calls `keyframe()` on the producer to start a new MoQ group. This keyframe-to-group mapping is how subscribers can join at any group boundary.
|
||||
|
||||
**`MoqPacketSource`** wraps an `OrderedConsumer` and reads frames, converting them to `MediaPacket`.
|
||||
|
||||
**`PipeSink` / `PipeSource`** — In-memory pipe for local encode→decode without network (testing, local preview).
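
The keyframe-to-group mapping performed by `MoqPacketSink` can be sketched with an in-memory sink. `GroupingSink` is a hypothetical stand-in for the real producer, `EncodedFrame` is reduced to two fields, and rejecting delta frames that arrive before the first keyframe is an assumption made here for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct EncodedFrame {
    pub is_keyframe: bool,
    pub payload: Vec<u8>,
}

/// In-memory stand-in for a group-oriented producer (hypothetical type).
#[derive(Default)]
pub struct GroupingSink {
    pub groups: Vec<Vec<EncodedFrame>>,
}

impl GroupingSink {
    pub fn write(&mut self, frame: EncodedFrame) -> Result<(), String> {
        // Every keyframe opens a new group, so each group is independently decodable.
        if frame.is_keyframe {
            self.groups.push(Vec::new());
        }
        match self.groups.last_mut() {
            Some(group) => {
                group.push(frame);
                Ok(())
            }
            None => Err("delta frame before first keyframe".to_string()),
        }
    }
}
```

Because every group starts with a keyframe, a late subscriber can begin decoding at the start of any group without needing earlier frames.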
|
||||
|
||||
## Adaptive Rendition Switching
|
||||
|
||||
The adaptation algorithm runs in a background task that monitors `NetworkSignals` and decides whether to switch to a different video rendition.
|
||||
|
||||
### Algorithm
|
||||
|
||||
Renditions are ranked by pixel count (highest first). The algorithm maintains state across ticks:
|
||||
|
||||
```rust
|
||||
pub enum Decision {
|
||||
Hold, // Stay on current rendition
|
||||
Downgrade(usize), // Switch to lower at index
|
||||
Emergency, // Drop to lowest immediately
|
||||
StartProbe(usize), // Try upgrading to index
|
||||
}
|
||||
```
|
||||
|
||||
**Emergency** (immediate): Loss rate ≥ 20% → drop to lowest rendition
|
||||
|
||||
**Downgrade** (sustained 500ms): Loss rate ≥ 10% OR available bandwidth < 85% of current rendition's bitrate
|
||||
|
||||
**Upgrade probe** (sustained 4s good conditions): Loss ≤ 2%, bandwidth ≥ 120% of next-higher rendition's bitrate → start 3-second probe on the higher rendition
|
||||
|
||||
**Probe abort**: Loss ≥ 5% or new congestion events during probe → abort, 8s cooldown
|
||||
|
||||
**Post-downgrade cooldown**: 4s after any downgrade before probes are allowed
|
||||
|
||||
### Implementation
|
||||
|
||||
The adaptation task (`adaptation_task_v2`) creates new `VideoDecoderPipeline`s that write to the same `FrameSender` via `with_sender()`. The frame channel stays the same while the underlying decoder pipeline gets swapped. When switching:
|
||||
|
||||
1. Create a new decoder pipeline for the target rendition
|
||||
2. Drop the old pipeline handle
|
||||
3. Update `selected_rendition` Watchable
|
||||
|
||||
## Playback and Sync
|
||||
|
||||
### PlaybackPolicy
|
||||
|
||||
```rust
|
||||
pub struct PlaybackPolicy {
|
||||
pub sync: SyncMode, // Synced (shared clock) or Unmanaged (PTS pacing)
|
||||
pub max_latency: Duration, // Default: 150ms — how much buffering before skipping forward
|
||||
}
|
||||
```
|
||||
|
||||
### SyncMode
|
||||
|
||||
- **`Synced`** (default): Shared playout clock (`Sync`). Video frames are gated by `Sync::wait(pts)`, which blocks until `reference + pts + latency` arrives. Audio paces itself through its ring buffer (~80ms).
|
||||
- **`Unmanaged`**: No synchronization. `FramePacer` sleeps between frames based on PTS deltas, clamped to 2× frame period.
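
The `Unmanaged` pacing rule amounts to a clamped difference. A minimal sketch follows; `pacing_delay` is a hypothetical helper, and treating a backwards PTS jump as zero delay is an assumption.

```rust
use std::time::Duration;

/// Delay before presenting the frame at `pts`, given the previous frame's PTS.
pub fn pacing_delay(prev_pts: Duration, pts: Duration, frame_period: Duration) -> Duration {
    // A PTS that goes backwards yields zero delay instead of underflowing.
    let delta = pts.saturating_sub(prev_pts);
    // Clamp to twice the frame period so a gap in the stream cannot stall playback.
    delta.min(frame_period * 2)
}
```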
|
||||
|
||||
### Sync
|
||||
|
||||
The `Sync` type records arrival offsets via `received(pts)` and blocks on `wait(pts)` until `reference + pts + latency`. This keeps audio and video aligned without cross-path gating or signaling. Ported from the moq/js implementation.
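
The `reference + pts + latency` rule can be made concrete with a small clock model. This is a sketch under stated assumptions, not the `Sync` implementation: `PlayoutClock` and `deadline` are hypothetical names, time is passed in as a `Duration` since an arbitrary epoch so the logic is testable, and choosing the smallest observed arrival offset as the reference is a guess at how the reference is derived.

```rust
use std::time::Duration;

pub struct PlayoutClock {
    latency: Duration,
    /// Smallest observed `arrival - pts`: the estimated wall-clock time of PTS zero.
    reference: Option<Duration>,
}

impl PlayoutClock {
    pub fn new(latency: Duration) -> Self {
        Self { latency, reference: None }
    }

    /// Records that a frame with timestamp `pts` arrived at time `now`.
    pub fn received(&mut self, pts: Duration, now: Duration) {
        let offset = now.saturating_sub(pts);
        self.reference = Some(match self.reference {
            Some(current) => current.min(offset),
            None => offset,
        });
    }

    /// Time at which the frame with `pts` should be presented,
    /// or `None` if nothing has been received yet.
    pub fn deadline(&self, pts: Duration) -> Option<Duration> {
        self.reference.map(|reference| reference + pts + self.latency)
    }
}
```

Since audio and video frames share one reference, equal PTS values map to equal deadlines on both paths, which is what keeps them aligned without explicit signaling between the two.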
|
||||
|
||||
## Stats
|
||||
|
||||
moq-media has a structured stats system for debug overlays:
|
||||
|
||||
- **`NetStats`** — RTT, loss%, bandwidth, path type (written by iroh-live transport bridge)
|
||||
- **`EncodeStats`** — FPS, encode time, bitrate, codec, encoder, resolution, capture path
|
||||
- **`RenderStats`** — FPS, decode time, decoder, renderer, rendition
|
||||
- **`TimingStats`** — Audio buffer level, video/audio lag, A/V delta, video buffer depth
|
||||
- **`Timeline`** — Ring buffer of `FrameMeta` entries for timeline visualization
|
||||
|
||||
Each `Metric` has EMA smoothing, a history ring buffer, and optional color thresholds. `Label` provides atomic string values.
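
EMA smoothing with a bounded history can be sketched as follows. The smoothing factor `alpha`, the constructor signature, and storing smoothed rather than raw values in the history are all assumptions made for illustration; the real `Metric` type may differ.

```rust
use std::collections::VecDeque;

pub struct Metric {
    alpha: f64,
    smoothed: Option<f64>,
    history: VecDeque<f64>,
    capacity: usize,
}

impl Metric {
    pub fn new(alpha: f64, capacity: usize) -> Self {
        Self {
            alpha,
            smoothed: None,
            history: VecDeque::new(),
            capacity: capacity.max(1),
        }
    }

    /// Folds one sample into the average and returns the new smoothed value.
    pub fn record(&mut self, sample: f64) -> f64 {
        let next = match self.smoothed {
            // EMA: new = alpha * sample + (1 - alpha) * previous.
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
            // The first sample seeds the average.
            None => sample,
        };
        self.smoothed = Some(next);
        // Ring-buffer behaviour: evict the oldest entry once full.
        if self.history.len() == self.capacity {
            self.history.pop_front();
        }
        self.history.push_back(next);
        next
    }

    pub fn history(&self) -> Vec<f64> {
        self.history.iter().copied().collect()
    }
}
```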
|
||||
|
||||
## Codec Support
|
||||
|
||||
Feature-gated codec support:
|
||||
|
||||
| Feature | Codec | Backend |
|
||||
|---------|-------|---------|
|
||||
| `h264` | H.264 | openh264 (software) |
|
||||
| `av1` | AV1 | rav1e encoder, rav1d decoder |
|
||||
| `opus` | Opus | opus crate |
|
||||
| `vaapi` | VAAPI | Linux hardware encode/decode |
|
||||
| `videotoolbox` | VideoToolbox | macOS hardware |
|
||||
| `v4l2` | V4L2 | Raspberry Pi hardware |
|
||||
| `pcm` | Raw PCM | No encoding |
|
||||
@@ -0,0 +1,95 @@
|
||||
# iroh-live: Network Signals and Adaptive Bitrate
|
||||
|
||||
## NetworkSignals
|
||||
|
||||
Produced by polling iroh QUIC connection stats. Consumed by `VideoTrack::enable_adaptation()` to decide when to switch video renditions.
|
||||
|
||||
```rust
|
||||
pub struct NetworkSignals {
|
||||
pub rtt: Duration, // Round-trip time to remote peer
|
||||
pub loss_rate: f64, // Recent packet loss rate (0.0..=1.0), 200ms delta window
|
||||
pub available_bps: u64, // Estimated available bandwidth (cwnd * 8 / rtt)
|
||||
pub congestion_events: u64, // Monotonically increasing congestion counter
|
||||
}
|
||||
```
|
||||
|
||||
### Production
|
||||
|
||||
`spawn_signal_producer()` in `iroh-live/src/util.rs` polls every 200ms:
|
||||
|
||||
1. Gets connection paths via `conn.paths().get()`
|
||||
2. Finds the selected path (`is_selected()`)
|
||||
3. Reads path stats (`lost_packets`, `udp_tx.datagrams`, `cwnd`) and RTT
|
||||
4. Computes delta-based loss rate: `delta_lost / (delta_sent + delta_lost)`
|
||||
5. Estimates bandwidth: `cwnd * 8 * 1e9 / rtt_ns`
|
||||
6. Writes to `watch::Sender<NetworkSignals>`
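
Steps 4 and 5 are plain arithmetic and can be checked in isolation. The helpers below are a sketch with hypothetical names (`loss_rate`, `available_bps`); returning zero for an empty window or a zero RTT is an assumption about edge-case handling.

```rust
use std::time::Duration;

/// Loss rate over one polling window: `delta_lost / (delta_sent + delta_lost)`.
pub fn loss_rate(delta_lost: u64, delta_sent: u64) -> f64 {
    let total = delta_lost.saturating_add(delta_sent);
    if total == 0 {
        0.0 // No traffic in this window.
    } else {
        delta_lost as f64 / total as f64
    }
}

/// Bandwidth estimate in bits per second: `cwnd * 8 * 1e9 / rtt_ns`.
pub fn available_bps(cwnd_bytes: u64, rtt: Duration) -> u64 {
    let rtt_ns = rtt.as_nanos();
    if rtt_ns == 0 {
        return 0;
    }
    // Use u128 so the intermediate product cannot overflow.
    let bps = u128::from(cwnd_bytes) * 8 * 1_000_000_000 / rtt_ns;
    bps.min(u128::from(u64::MAX)) as u64
}
```

For example, a 125,000-byte congestion window with a 100 ms RTT gives 125,000 × 8 / 0.1 s = 10 Mbit/s.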
|
||||
|
||||
Separately, `spawn_stats_recorder()` records RTT, loss percentage, inbound and outbound bandwidth, and path type into `NetStats` for the debug overlay.
|
||||
|
||||
## Adaptive Rendition Algorithm
|
||||
|
||||
Located in `moq-media/src/adaptive.rs`. The algorithm evaluates `NetworkSignals` against configured thresholds and produces `Decision` values.
|
||||
|
||||
### Configuration (`AdaptiveConfig`)
|
||||
|
||||
| Parameter | Default | Description |
|
||||
|-----------|---------|-------------|
|
||||
| `upgrade_hold` | 4s | Sustained good conditions before upgrade probe |
|
||||
| `downgrade_hold` | 500ms | Sustained bad conditions before downgrade |
|
||||
| `probe_duration` | 3s | How long a probe runs before committing |
|
||||
| `probe_cooldown` | 8s | Cooldown after a failed probe |
|
||||
| `post_downgrade_cooldown` | 4s | Cooldown after any downgrade |
|
||||
| `loss_downgrade` | 10% | Loss rate threshold for downgrade |
|
||||
| `loss_emergency` | 20% | Loss rate for immediate drop to lowest |
|
||||
| `loss_good` | 2% | Loss rate considered "good" |
|
||||
| `loss_probe_abort` | 5% | Loss rate that aborts an active probe |
|
||||
| `bw_downgrade_ratio` | 85% | Downgrade when available bandwidth falls below this fraction of the current rendition's bitrate |
|
||||
| `bw_probe_headroom` | 120% | Available bandwidth must reach this fraction of the next-higher rendition's bitrate before a probe starts |
|
||||
| `check_interval` | 200ms | How often adaptation task checks signals |
|
||||
|
||||
### Decision Logic
|
||||
|
||||
```
|
||||
1. Emergency: loss >= 20% AND not already lowest → Drop to lowest immediately
|
||||
|
||||
2. Downgrade check:
|
||||
- bandwidth_stressed (available < current_bitrate * 85%) OR loss >= 10%
|
||||
- sustained for downgrade_hold (500ms) → Downgrade(next_lower)
|
||||
|
||||
3. Upgrade check:
|
||||
- Already at highest → Hold
|
||||
- Within post_downgrade_cooldown (4s) → Hold
|
||||
- Within probe_cooldown (8s) → Hold
|
||||
- bandwidth_headroom (available >= next_higher_bitrate * 120%) AND loss <= 2%
|
||||
- sustained for upgrade_hold (4s) → StartProbe(next_higher)
|
||||
|
||||
4. Otherwise: Hold
|
||||
```
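
The decision logic above can be written as a pure function. This is a simplified sketch, not the code in `adaptive.rs`: the thresholds are hard-coded to the defaults from the table, the `Tick` struct is hypothetical, the caller is assumed to track how long each condition has held (`bad_for`, `good_for`) and whether a cooldown is running, and the "not already lowest" guard on downgrades is an assumption.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Decision {
    Hold,
    Downgrade(usize),
    Emergency,
    StartProbe(usize),
}

/// Inputs for one evaluation tick. Index 0 is the highest rendition.
pub struct Tick<'a> {
    pub bitrates_bps: &'a [u64],
    pub current: usize,
    pub loss_rate: f64,
    pub available_bps: u64,
    /// How long the downgrade condition has held so far.
    pub bad_for: Duration,
    /// How long the upgrade condition has held so far.
    pub good_for: Duration,
    /// True while the post-downgrade or failed-probe cooldown is running.
    pub in_cooldown: bool,
}

pub fn decide(t: &Tick) -> Decision {
    if t.current >= t.bitrates_bps.len() {
        return Decision::Hold; // Invalid index: nothing sensible to do.
    }
    let lowest = t.bitrates_bps.len() - 1;
    // 1. Emergency: heavy loss and somewhere lower to go.
    if t.loss_rate >= 0.20 && t.current < lowest {
        return Decision::Emergency;
    }
    // 2. Downgrade: bandwidth stress or moderate loss, sustained for 500 ms.
    let stressed = (t.available_bps as f64) < t.bitrates_bps[t.current] as f64 * 0.85;
    if (stressed || t.loss_rate >= 0.10)
        && t.current < lowest
        && t.bad_for >= Duration::from_millis(500)
    {
        return Decision::Downgrade(t.current + 1);
    }
    // 3. Upgrade: blocked at the top rendition or during any cooldown.
    if t.current == 0 || t.in_cooldown {
        return Decision::Hold;
    }
    let next_higher = t.current - 1;
    let headroom = t.available_bps as f64 >= t.bitrates_bps[next_higher] as f64 * 1.20;
    if headroom && t.loss_rate <= 0.02 && t.good_for >= Duration::from_secs(4) {
        return Decision::StartProbe(next_higher);
    }
    // 4. Otherwise keep the current rendition.
    Decision::Hold
}
```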
|
||||
|
||||
### Probe Lifecycle
|
||||
|
||||
When `StartProbe(idx)` is decided:
|
||||
1. Create a new decoder pipeline for the higher rendition
|
||||
2. Write frames to the same `FrameSender` (seamless switch for the consumer)
|
||||
3. Monitor signals during the probe period
|
||||
4. If `should_abort_probe()` (loss ≥ 5% or new congestion events) → abort, drop probe pipeline, cooldown 8s
|
||||
5. If probe duration (3s) passes without abort → commit, replace current pipeline
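
Steps 3 to 5 of the probe lifecycle can be sketched as a per-tick check. `ProbeState`, `ProbeOutcome`, and `probe_step` are hypothetical names, and the thresholds are hard-coded to the defaults (`loss_probe_abort` 5%, `probe_duration` 3s). "New congestion events" is modelled as the counter exceeding its value when the probe started.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum ProbeOutcome {
    Continue,
    Abort,
    Commit,
}

pub struct ProbeState {
    /// Time since the probe started.
    pub elapsed: Duration,
    /// Value of the congestion counter when the probe started.
    pub congestion_at_start: u64,
}

pub fn probe_step(state: &ProbeState, loss_rate: f64, congestion_events: u64) -> ProbeOutcome {
    // The abort check comes first so a bad final tick cannot commit the probe.
    if loss_rate >= 0.05 || congestion_events > state.congestion_at_start {
        return ProbeOutcome::Abort;
    }
    if state.elapsed >= Duration::from_secs(3) {
        return ProbeOutcome::Commit;
    }
    ProbeOutcome::Continue
}
```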
|
||||
|
||||
### Rendition Ranking
|
||||
|
||||
```rust
|
||||
pub fn rank_renditions(renditions: &BTreeMap<String, VideoConfig>) -> Vec<RankedRendition>
|
||||
```
|
||||
|
||||
Sorts by pixel count descending (highest quality = index 0). Each `RankedRendition` carries name, pixels, bitrate_bps, width, height.
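
A minimal sketch of this ranking, assuming a reduced `VideoConfig` with only the fields needed here (the real type likely carries more, and possibly optional, fields). The function body is illustrative rather than the crate's implementation.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the catalog's video configuration.
pub struct VideoConfig {
    pub width: u32,
    pub height: u32,
    pub bitrate_bps: u64,
}

#[derive(Debug, PartialEq)]
pub struct RankedRendition {
    pub name: String,
    pub pixels: u64,
    pub bitrate_bps: u64,
    pub width: u32,
    pub height: u32,
}

pub fn rank_renditions(renditions: &BTreeMap<String, VideoConfig>) -> Vec<RankedRendition> {
    let mut ranked: Vec<RankedRendition> = renditions
        .iter()
        .map(|(name, config)| RankedRendition {
            name: name.clone(),
            pixels: u64::from(config.width) * u64::from(config.height),
            bitrate_bps: config.bitrate_bps,
            width: config.width,
            height: config.height,
        })
        .collect();
    // Descending by pixel count, so index 0 is the highest quality.
    ranked.sort_by(|a, b| b.pixels.cmp(&a.pixels));
    ranked
}
```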
|
||||
|
||||
### RenditionMode
|
||||
|
||||
```rust
|
||||
pub enum RenditionMode {
|
||||
Auto, // Algorithm-driven switching
|
||||
Fixed(String), // Pin to a specific rendition
|
||||
}
|
||||
```
|
||||
|
||||
Controlled via `VideoTrack::set_rendition_mode()`. In Fixed mode, the algorithm switches directly to the named rendition without probing.
|
||||
85
docs/research/references/iroh/iroh-live/08-p2p-and-relay.md
Normal file
@@ -0,0 +1,85 @@
|
||||
# iroh-live: P2P Connectivity and Relay Architecture
|
||||
|
||||
## Direct Connectivity
|
||||
|
||||
iroh connects peers directly when possible:
|
||||
|
||||
- **Same LAN:** Communicates over the local network without traffic leaving the subnet
|
||||
- **Public IP / simple NAT:** iroh's hole-punching establishes a direct UDP path
|
||||
- **Symmetric NAT / corporate firewalls / CGNAT:** Falls back to iroh relay network
|
||||
|
||||
The iroh endpoint exposes path statistics via `conn.paths()`, which returns a `Watcher<PathInfoList>`. Each `PathInfo` reports RTT, whether the path is selected, and the remote address. The selected path is the one actively carrying traffic; iroh may maintain multiple candidate paths and switch between them.
|
||||
|
||||
The transition between direct and relayed paths is transparent to the application. The media pipeline sees only changes in RTT and bandwidth, which adaptive rendition switching handles automatically.
|
||||
|
||||
## iroh-live-relay: Architecture
|
||||
|
||||
The relay serves two transport protocols simultaneously:
|
||||
|
||||
```
|
||||
iroh P2P publisher ──(QUIC, moq-lite-03)──> iroh-live-relay <──(WebTransport/H3, noq)── browser
|
||||
```
|
||||
|
||||
Both protocols feed into `moq-relay`'s shared `Origin`, which manages broadcast routing. A broadcast published via iroh is automatically available to WebTransport subscribers, and vice versa.
|
||||
|
||||
### Pull Model
|
||||
|
||||
The relay operates in **pull mode**: it connects to iroh publishers on demand when a browser client requests a broadcast. The broadcast name in the URL can be a `LiveTicket` URI. Multiple browser clients watching the same broadcast share a single upstream iroh connection.
|
||||
|
||||
Pull flow:
|
||||
1. Browser connects via WebTransport, requests broadcast by name (or ticket)
|
||||
2. Relay checks if broadcast already exists in local cluster → fast path
|
||||
3. If not, relay uses iroh-live `Moq::connect()` to connect to the remote publisher
|
||||
4. Subscribes to the broadcast via `session.subscribe(broadcast_name)`
|
||||
5. Publishes the consumer into the local cluster under the ticket string as the name
|
||||
6. Spawns a keepalive task holding the session until it closes
|
||||
7. Browser receives the stream through the relay's WebTransport frontend
|
||||
|
||||
### Connection Deduplication
|
||||
|
||||
`PullState` uses a `HashMap<String, Arc<Notify>>` to prevent duplicate concurrent connections to the same remote. If a pull is already in progress for a given ticket, subsequent requests wait on the `Notify` and then check if the broadcast appeared in the cluster.
|
||||
|
||||
### QUIC Backend: noq

The relay uses `noq` as its QUIC backend (not quinn). This is configured via:

```rust
server_config.backend = Some(moq_native::QuicBackend::Noq);
```

### iroh Endpoint Integration

The relay also binds an iroh endpoint:

```rust
let mut iroh_config = moq_native::IrohEndpointConfig::default();
iroh_config.enabled = Some(true);
iroh_config.secret = Some(relay.iroh_secret_path_str());
let iroh = iroh_config.bind().await?;
```

This enables the relay to participate in the iroh P2P network directly.

## Ticket Format

`LiveTicket` serves as the connection mechanism for both P2P and relay scenarios:

- **P2P:** Subscriber uses the `EndpointAddr` (node ID + relay URLs) to connect directly
- **Relay:** The full ticket string becomes the broadcast name in the URL: `https://relay:4443/?name=iroh-live:...`

The ticket format: `iroh-live:<base64url(postcard(EndpointAddr))>/<broadcast_name>`

It also supports a legacy format: `<name>@<base32(postcard(EndpointAddr))>`

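
To make the two layouts concrete, here is a minimal sketch that only splits a ticket string into its textual parts. It does not decode the base64url/base32 payload or the postcard bytes, and `TicketParts` and `split_ticket` are hypothetical names, not part of iroh-live. It assumes the broadcast name may itself contain `/`, so the current format is split at the first `/` (base64url never contains one), and the legacy format at the last `@`.

```rust
/// The two textual layouts, with the encoded address left undecoded.
#[derive(Debug, PartialEq, Eq)]
pub enum TicketParts<'a> {
    /// `iroh-live:<base64url payload>/<broadcast_name>`
    Current { addr_b64: &'a str, broadcast: &'a str },
    /// `<name>@<base32 payload>`
    Legacy { broadcast: &'a str, addr_b32: &'a str },
}

/// Splits a ticket string into payload and broadcast name.
/// Returns `None` if the string matches neither layout.
pub fn split_ticket(s: &str) -> Option<TicketParts<'_>> {
    if let Some(rest) = s.strip_prefix("iroh-live:") {
        // The base64url alphabet has no '/', so the first '/' ends the payload.
        let (addr_b64, broadcast) = rest.split_once('/')?;
        if addr_b64.is_empty() || broadcast.is_empty() {
            return None;
        }
        return Some(TicketParts::Current { addr_b64, broadcast });
    }
    // The base32 alphabet has no '@', so the last '@' starts the payload.
    let (broadcast, addr_b32) = s.rsplit_once('@')?;
    if broadcast.is_empty() || addr_b32.is_empty() {
        return None;
    }
    Some(TicketParts::Legacy { broadcast, addr_b32 })
}
```

For example, `split_ticket("iroh-live:AAEC/cam/main")` yields the payload `AAEC` and the broadcast name `cam/main`; a real parser would go on to decode the payload into an `EndpointAddr`.
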
## Connection Access in iroh-moq

`MoqSession::conn()` returns a reference to the underlying iroh `Connection`. This is used by:

1. **Signal producer** — Polls path stats for `NetworkSignals`
2. **Stats recorder** — Records into `NetStats` for debug overlays
3. **Call::closed()** — Inspects QUIC close reason to determine `DisconnectReason`

The connection provides:
- `paths().get()` — List of active network paths with RTT, stats, relay status
- `close_reason()` — Why the connection closed (LocallyClosed, ApplicationClosed, ConnectionClosed, Reset)
- `remote_id()` — Remote peer's endpoint ID

42
docs/research/references/iroh/iroh-live/README.md
Normal file
@@ -0,0 +1,42 @@
# iroh-live Reference Documentation

> **Status:** Early tech preview. APIs are unstable. Based on source code analysis of the iroh-live workspace.

## Files

| File | Topic |
|------|-------|
| [01-overview-and-architecture](01-overview-and-architecture.md) | Workspace structure, crate layers, design principles, data flow, dependencies |
| [02-core-api](02-core-api.md) | `Live`, `LiveTicket`, `Call`, `Subscription`, `DisconnectReason`, `util` module |
| [03-iroh-moq-transport](03-iroh-moq-transport.md) | `Moq`, `MoqSession`, `MoqProtocolHandler`, actor internals, session lifecycle, error types |
| [04-rooms](04-rooms.md) | `Room`, `RoomHandle`, `RoomTicket`, `RoomEvent`, gossip KV coordination, actor architecture |
| [05-relay](05-relay.md) | `iroh-live-relay`: browser bridging, pull model, `RelayConfig`, `PullState`, web viewer |
| [06-moq-media-pipelines](06-moq-media-pipelines.md) | `LocalBroadcast`, `RemoteBroadcast`, `VideoTrack`, `AudioTrack`, transport abstraction, codec support |
| [07-network-signals-and-adaptive-bitrate](07-network-signals-and-adaptive-bitrate.md) | `NetworkSignals`, adaptation algorithm, `AdaptiveConfig`, `Decision`, probe lifecycle |
| [08-p2p-and-relay](08-p2p-and-relay.md) | iroh P2P connectivity, relay architecture, pull model, ticket format, connection access |

## Quick Navigation

### "How do I..."

- **Publish a stream?** → [02-core-api](02-core-api.md) (`Live::publish`) + [06-moq-media-pipelines](06-moq-media-pipelines.md) (`LocalBroadcast`)
- **Subscribe to a stream?** → [02-core-api](02-core-api.md) (`Live::subscribe`) + [06-moq-media-pipelines](06-moq-media-pipelines.md) (`RemoteBroadcast`)
- **Make a 1:1 call?** → [02-core-api](02-core-api.md) (`Call::dial` / `Call::accept`)
- **Create a multi-party room?** → [04-rooms](04-rooms.md) (`Room::new`, `RoomTicket`)
- **Bridge to browsers?** → [05-relay](05-relay.md) (`iroh-live-relay`)
- **Adapt quality to network conditions?** → [07-network-signals-and-adaptive-bitrate](07-network-signals-and-adaptive-bitrate.md)
- **Understand the MoQ transport?** → [03-iroh-moq-transport](03-iroh-moq-transport.md)
- **Understand the media pipeline?** → [06-moq-media-pipelines](06-moq-media-pipelines.md)

### Key Source Files

| Component | Path |
|-----------|------|
| iroh-live crate | `iroh-live/src/{lib, live, call, subscription, ticket, types, util, rooms}.rs` |
| iroh-moq crate | `iroh-moq/src/lib.rs` |
| iroh-live-relay | `iroh-live-relay/src/{lib, main, pull}.rs` |
| moq-media publish | `moq-media/src/publish.rs` |
| moq-media subscribe | `moq-media/src/subscribe.rs` |
| moq-media adaptive | `moq-media/src/adaptive.rs` |
| moq-media transport | `moq-media/src/transport.rs` |
| moq-media network signals | `moq-media/src/net.rs` |

160
docs/research/references/iroh/iroh/01-overview-architecture.md
Normal file
@@ -0,0 +1,160 @@
# Iroh: Overview & Architecture

**Version**: 0.98.1
**Repository**: https://github.com/n0-computer/iroh
**License**: MIT OR Apache-2.0
**Rust Edition**: 2024
**MSRV**: 1.89

## What is Iroh?

Iroh is a Rust library for establishing **peer-to-peer QUIC connections dialed by public key**. You provide an `EndpointAddr` (which identifies a peer), and iroh finds and maintains the fastest connection route — whether direct (hole-punched) or relayed through a server.

Core value propositions:
- **Dial by public key** — no IP addresses or hostnames needed at the application layer
- **Hole-punching** — automatically attempts direct P2P connectivity
- **Relay fallback** — encrypted relay servers ensure connectivity even behind NATs
- **Built on QUIC** — uses the `noq` QUIC implementation for multiplexed, encrypted streams
- **Address Lookup** — pluggable discovery system to resolve `EndpointId → addressing info`

## Workspace Structure

```
iroh/                    # Core library (p2p QUIC connections)
├── iroh-base/           # Fundamental types: SecretKey, PublicKey, EndpointId, RelayUrl, EndpointAddr
├── iroh-dns/            # DNS resolver + endpoint info serialization (pkarr)
├── iroh-dns-server/     # DNS server implementation (powers dns.iroh.link)
├── iroh-relay/          # Relay server + client implementation
└── iroh/bench/          # Benchmarks
```

### Dependency Graph

```
iroh depends on:
├── iroh-base       (key types, EndpointAddr, RelayUrl)
├── iroh-dns        (DNS resolution, EndpointInfo serialization)
├── iroh-relay      (RelayMap, RelayConfig, relay client/server, QUIC client)
├── noq             (QUIC implementation)
├── noq-proto       (QUIC protocol types)
├── noq-udp         (UDP socket abstraction)
├── netwatch        (network interface monitoring)
├── portmapper      (UPnP/PCP/NAT-PMP port mapping, optional)
├── n0-future       (async utilities)
├── n0-watcher      (watch/subscribe primitives)
└── iroh-metrics    (metrics collection)
```

## Key Concepts

### EndpointId / PublicKey
Every iroh endpoint has a unique Ed25519 cryptographic key pair. The public key doubles as the endpoint identifier (`EndpointId`). It's used for both:
- **Identity** — unique addressing in the network
- **Encryption** — TLS authentication (via RFC 7250 Raw Public Keys, no X.509 certificates)

### EndpointAddr
The addressing structure that combines identity with network paths:
```rust
pub struct EndpointAddr {
    pub id: EndpointId,                  // Who to connect to
    pub addrs: BTreeSet<TransportAddr>,  // How to reach them
}

pub enum TransportAddr {
    Relay(RelayUrl),     // Via relay server
    Ip(SocketAddr),      // Direct IP address
    Custom(CustomAddr),  // Via custom transport
}
```

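
The shape of `EndpointAddr` can be tried out with simplified stand-ins. The sketch below is not iroh's code: `EndpointId` is reduced to a raw 32-byte array, relay URLs to plain strings, and the `Custom` variant is left out. It only illustrates that the `BTreeSet` deduplicates paths and that relay and IP paths can be filtered back out.

```rust
use std::collections::BTreeSet;
use std::net::SocketAddr;

/// Stand-in for iroh's `EndpointId` (really an Ed25519 public key).
pub type EndpointId = [u8; 32];

/// Simplified `TransportAddr`: relay URLs are plain strings here.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum TransportAddr {
    Relay(String),
    Ip(SocketAddr),
}

#[derive(Debug, Clone)]
pub struct EndpointAddr {
    pub id: EndpointId,
    pub addrs: BTreeSet<TransportAddr>,
}

impl EndpointAddr {
    /// Identity only; reaching the peer then depends on address lookup.
    pub fn new(id: EndpointId) -> Self {
        Self { id, addrs: BTreeSet::new() }
    }

    pub fn with_relay_url(mut self, url: &str) -> Self {
        self.addrs.insert(TransportAddr::Relay(url.to_owned()));
        self
    }

    pub fn with_ip_addr(mut self, addr: SocketAddr) -> Self {
        // Inserting the same address twice is a no-op in a set.
        self.addrs.insert(TransportAddr::Ip(addr));
        self
    }

    pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr> {
        self.addrs.iter().filter_map(|a| match a {
            TransportAddr::Ip(ip) => Some(ip),
            _ => None,
        })
    }

    pub fn relay_urls(&self) -> impl Iterator<Item = &str> {
        self.addrs.iter().filter_map(|a| match a {
            TransportAddr::Relay(url) => Some(url.as_str()),
            _ => None,
        })
    }
}
```

Using a set rather than a list means callers can merge addresses learned from several sources without tracking duplicates themselves.
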
### Relay Servers
Relay servers provide:
1. **Reliable connectivity** — always reachable, forward encrypted traffic to the correct endpoint by `EndpointId`
2. **Hole-punching assistance** — QUIC Address Discovery (QAD), STUN-like services
3. **Traffic relay** — fallback when direct connections are impossible

Connections to relays use HTTP/1.1 with TLS, then upgrade to a custom protocol. The relay only sees encrypted traffic.

### Connection Flow
1. Endpoint binds, connects to a "home relay"
2. To connect to peer: resolve `EndpointId` → `EndpointAddr` via Address Lookup
3. Establish initial connection via relay
4. Attempt direct connection (hole-punching if needed)
5. Migrate to direct connection when available (relay becomes backup)

## Crate: `iroh` (Core Library)

### Main Types

| Type | Module | Purpose |
|------|--------|---------|
| `Endpoint` | `endpoint` | Central API — connect, accept, manage connections |
| `Builder` | `endpoint` | Configure and construct an `Endpoint` |
| `Router` | `protocol` | Accept loop that dispatches to `ProtocolHandler`s |
| `ProtocolHandler` | `protocol` | Trait for handling incoming connections by ALPN |
| `Connection` | `endpoint::connection` | QUIC connection wrapper |
| `Incoming` | `endpoint::connection` | Pre-handshake incoming connection |
| `Accepting` | `endpoint::connection` | Post-accept, pre-handshake state |

### Feature Flags
- `default` = `["metrics", "fast-apple-datapath", "portmapper", "tls-ring"]`
- `metrics` — Prometheus-style metrics collection
- `portmapper` — UPnP/PCP/NAT-PMP support
- `test-utils` — Testing utilities
- `platform-verifier` — Use OS TLS trust anchors
- `qlog` — QUIC event logging
- `fast-apple-datapath` — Private Apple APIs for batched sends
- `tls-ring` / `tls-aws-lc-rs` — Choose TLS crypto backend
- `unstable-custom-transports` — Custom transport API (unstable)

### WASM Support
The crate compiles to `wasm32-unknown-unknown` for browser targets. Browser builds:
- Use `PkarrResolver` instead of `DnsAddressLookup` (DNS-over-HTTPS)
- Cannot bind IP sockets (no direct connectivity)
- Use `wasm-bindgen-futures` for async runtime

## Presets

The `presets` module provides common configurations:

| Preset | Description |
|--------|-------------|
| `Empty` | No defaults — you must set all required options yourself |
| `Minimal` | Sets only the crypto provider (ring or aws-lc-rs) |
| `N0` | Full n0 defaults: crypto provider, Pkarr publisher, DNS resolver, n0 relay servers |
| `N0DisableRelay` | N0 defaults but with `RelayMode::Disabled` |

```rust
// Quick start with full n0 infrastructure
let endpoint = Endpoint::bind(presets::N0).await?;

// Minimal — just crypto, no relay or address lookup
let endpoint = Endpoint::bind(presets::Minimal).await?;
```

## Encryption & Authentication

Iroh uses **RFC 7250 Raw Public Keys** for TLS — no X.509 certificates. Each endpoint has:
- `SecretKey` (Ed25519) — used for TLS authentication and signing
- `PublicKey`/`EndpointId` — derived from `SecretKey`, used as identity

The TLS server name is encoded as `<base32-dnssec-encoded-public-key>.iroh.invalid` to ensure 0-RTT session ticket separation per endpoint.

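
The server-name encoding can be reproduced with the standard library alone. This sketch assumes "base32-dnssec" means the lowercase base32hex alphabet of RFC 4648 without padding; `base32_dnssec` and `tls_server_name` are hypothetical helper names, not iroh's API.

```rust
/// Lowercase base32hex alphabet (RFC 4648, section 7), used without padding.
const ALPHABET: &[u8; 32] = b"0123456789abcdefghijklmnopqrstuv";

/// Encodes bytes as unpadded lowercase base32hex.
pub fn base32_dnssec(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() * 8 + 4) / 5);
    let mut buf: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid low bits in `buf`
    for &byte in data {
        buf = (buf << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buf >> bits) & 0x1f) as usize] as char);
        }
        buf &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
    if bits > 0 {
        // Left-align the leftover bits in one final 5-bit group.
        out.push(ALPHABET[((buf << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Builds the per-endpoint TLS server name from a 32-byte public key.
pub fn tls_server_name(public_key: &[u8; 32]) -> String {
    format!("{}.iroh.invalid", base32_dnssec(public_key))
}
```

A 32-byte key encodes to 52 characters, which fits in a single DNS label (at most 63 characters). The `.invalid` top-level domain is reserved, so the name can never resolve to a real host.
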
## 0-RTT Support

Iroh supports QUIC 0-RTT connections:
- `Connecting::into_0rtt()` on the client side
- `Accepting::into_0rtt()` on the server side
- TLS session tickets cached per remote endpoint (default 256 tickets = ~150 KiB)
- `max_tls_tickets()` builder option to tune cache size

## Default Infrastructure (n0)

Production relay servers (4 regions):

| Region | Hostname |
|--------|----------|
| NA East | `use1-1.relay.n0.iroh-canary.iroh.link` |
| NA West | `usw1-1.relay.n0.iroh-canary.iroh.link` |
| EU | `euc1-1.relay.n0.iroh-canary.iroh.link` |
| AP | `aps1-1.relay.n0.iroh-canary.iroh.link` |

DNS Address Lookup origin: `dns.iroh.link`

392
docs/research/references/iroh/iroh/02-key-types-traits.md
Normal file
@@ -0,0 +1,392 @@
# Iroh: Key Types and Traits

## Core Identity Types (`iroh-base`)

### `SecretKey`
Ed25519 signing key (32 bytes). Used for:
- TLS authentication (RFC 7250 Raw Public Key)
- Signing pkarr packets for address discovery
- Generating the corresponding `PublicKey`/`EndpointId`

```rust
// Generation
let secret_key = SecretKey::generate();

// From bytes
let secret_key = SecretKey::from_bytes(&[0u8; 32]);

// Access public key
let public_key: PublicKey = secret_key.public();
```

### `PublicKey` / `EndpointId`
`EndpointId` is a type alias for `PublicKey`. Both are 32-byte Ed25519 compressed points.

```rust
pub type EndpointId = PublicKey;

impl PublicKey {
    pub const LENGTH: usize = 32;
    pub fn from_bytes(bytes: &[u8; 32]) -> Result<Self, KeyParsingError>;
    pub fn as_bytes(&self) -> &[u8; 32];
    pub fn verify(&self, message: &[u8], signature: &Signature) -> Result<(), SignatureError>;
    pub fn fmt_short(&self) -> impl Display;  // First 5 bytes hex
}
```

Serialization: Human-readable → base32 z-base-32 encoding; Binary → 32 raw bytes.

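
The short form noted beside `fmt_short` ("first 5 bytes hex") is easy to picture with a stand-in. The free function below is a hypothetical illustration operating on raw key bytes, not the method on `PublicKey`, and it assumes lowercase hex.

```rust
/// Renders the first five bytes of a 32-byte key as lowercase hex,
/// giving a 10-character abbreviation suitable for logs.
pub fn fmt_short(key: &[u8; 32]) -> String {
    key[..5].iter().map(|b| format!("{b:02x}")).collect()
}
```

Such an abbreviation is only for display; it is far too short to identify an endpoint securely.
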
### `Signature`
Ed25519 signature (64 bytes). Used in pkarr for signing endpoint discovery records.

### `KeyParsingError`
Error type for key parsing failures.

## Addressing Types (`iroh-base`)

### `EndpointAddr`
The primary addressing type — combines identity with network paths:

```rust
pub struct EndpointAddr {
    pub id: EndpointId,
    pub addrs: BTreeSet<TransportAddr>,
}

impl EndpointAddr {
    pub fn new(id: PublicKey) -> Self;
    pub fn from_parts(id: PublicKey, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
    pub fn with_relay_url(self, relay_url: RelayUrl) -> Self;
    pub fn with_ip_addr(self, addr: SocketAddr) -> Self;
    pub fn is_empty(&self) -> bool;
    pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr>;
    pub fn relay_urls(&self) -> impl Iterator<Item = &RelayUrl>;
}
```

Can be constructed from just an `EndpointId` (relies on Address Lookup), or with explicit paths:
```rust
// From just EndpointId — needs Address Lookup
let addr = EndpointAddr::new(endpoint_id);

// With relay URL
let addr = EndpointAddr::new(endpoint_id).with_relay_url(relay_url);

// With both
let addr = EndpointAddr::from_parts(endpoint_id, [
    TransportAddr::Relay(relay_url),
    TransportAddr::Ip(socket_addr),
]);
```

### `TransportAddr`
```rust
pub enum TransportAddr {
    Relay(RelayUrl),
    Ip(SocketAddr),
    Custom(CustomAddr),
}
```

### `CustomAddr`
Opaque custom transport address (for `unstable-custom-transports` feature):
```rust
pub struct CustomAddr {
    id: u32,
    addr: Vec<u8>,
}
```

### `RelayUrl`
Arc-wrapped `Url` identifying a relay server. Cheaply clonable. Encourages fully-qualified DNS names (trailing dot).

```rust
let url: RelayUrl = "https://use1-1.relay.n0.iroh-canary.iroh.link.".parse()?;
```

## Endpoint Types (`iroh`)

### `Endpoint`
The central type — created via `Builder`, used for all connection operations:

```rust
impl Endpoint {
    // Construction
    pub fn builder(preset: impl Preset) -> Builder;
    pub async fn bind(preset: impl Preset) -> Result<Self, BindError>;

    // Connection
    pub async fn connect(&self, addr: impl Into<EndpointAddr>, alpn: &[u8]) -> Result<Connection, ConnectError>;
    pub async fn connect_with_opts(&self, addr: impl Into<EndpointAddr>, alpn: &[u8], opts: ConnectOptions) -> Result<Connecting, ConnectWithOptsError>;
    pub fn accept(&self) -> Accept<'_>;

    // Identity
    pub fn id(&self) -> EndpointId;
    pub fn secret_key(&self) -> &SecretKey;
    pub fn addr(&self) -> EndpointAddr;
    pub fn watch_addr(&self) -> impl Watcher<Value = EndpointAddr>;

    // Lifecycle
    pub async fn close(&self);
    pub fn is_closed(&self) -> bool;
    pub fn closed(&self) -> EndpointClosed;
    pub async fn online(&self);  // Wait for relay connection

    // Configuration changes
    pub fn set_alpns(&self, alpns: Vec<Vec<u8>>);
    pub async fn insert_relay(&self, relay: RelayUrl, config: Arc<RelayConfig>) -> Option<Arc<RelayConfig>>;
    pub async fn remove_relay(&self, relay: &RelayUrl) -> Option<Arc<RelayConfig>>;
    pub async fn add_external_addr(&self, addr: SocketAddr);
    pub async fn remove_external_addr(&self, addr: &SocketAddr) -> bool;
    pub fn set_user_data_for_address_lookup(&self, user_data: Option<UserData>);
    pub async fn network_change(&self);

    // Observers
    pub fn home_relay_status(&self) -> impl Watcher<Value = Vec<RelayStatus>>;
    pub fn net_report(&self) -> impl Watcher<Value = Option<NetReport>>;
    pub fn remote_info(&self, id: EndpointId) -> Option<RemoteInfo>;
    pub fn metrics(&self) -> &EndpointMetrics;
    pub fn bound_sockets(&self) -> Vec<SocketAddr>;
    pub fn dns_resolver(&self) -> Result<&DnsResolver, EndpointError>;
    pub fn tls_config(&self) -> &rustls::ClientConfig;
    pub fn address_lookup(&self) -> Result<&AddressLookupServices, EndpointError>;
}
```

### `Builder`
Fluent builder for `Endpoint`:

```rust
let ep = Endpoint::builder(presets::N0)
    .secret_key(secret_key)                            // Identity
    .alpns(vec![b"my-alpn".to_vec()])                  // Accepted protocols
    .relay_mode(RelayMode::Default)                    // Relay configuration
    .address_lookup(PkarrPublisher::n0_dns())          // Address discovery
    .address_lookup(DnsAddressLookup::n0_dns())        // DNS resolution
    .addr_filter(AddrFilter::relay_only())             // Filter published addresses
    .user_data_for_address_lookup(user_data)           // Custom discovery data
    .transport_config(QuicTransportConfig::default())  // QUIC tuning
    .dns_resolver(dns_resolver)                        // Custom DNS resolver
    .proxy_url(proxy_url)                              // HTTP proxy
    .ca_roots_config(CaRootsConfig::default())         // TLS CA roots
    .keylog(true)                                      // SSLKEYLOGFILE debug
    .max_tls_tickets(256)                              // 0-RTT ticket cache
    .hooks(my_hook)                                    // Connection hooks
    .portmapper_config(PortmapperConfig::Enabled)      // UPnP/NAT-PMP
    .external_addr(addr)                               // Advertised external addr
    .bind_addr("0.0.0.0:0")?                           // Bind specific socket
    .bind()                                            // Build & bind
    .await?;
```

### `RelayMode`
```rust
pub enum RelayMode {
    Disabled,          // No relay
    Default,           // n0 production relays
    Staging,           // n0 staging relays
    Custom(RelayMap),  // Custom relay configuration
}
```

## Protocol Handler (`iroh::protocol`)

### `ProtocolHandler`
Trait for handling incoming connections by ALPN:

```rust
pub trait ProtocolHandler: Send + Sync + Debug + 'static {
    // Optional: intercept at Accepting stage (supports 0-RTT)
    fn on_accepting(&self, accepting: Accepting) -> impl Future<Output = Result<Connection, AcceptError>> + Send;

    // Required: handle the established connection
    fn accept(&self, connection: Connection) -> impl Future<Output = Result<(), AcceptError>> + Send;

    // Optional: called on graceful shutdown
    fn shutdown(&self) -> impl Future<Output = ()> + Send;
}
```

### `Router`
Spawns an accept loop that dispatches incoming connections to registered handlers:

```rust
let router = Router::builder(endpoint)
    .accept(b"/my-alpn", Arc::new(MyHandler))
    .incoming_filter(|incoming| {
        if !incoming.remote_addr_validated() {
            IncomingFilterOutcome::Retry
        } else {
            IncomingFilterOutcome::Accept
        }
    })
    .spawn();

// Later...
router.shutdown().await?;
```

### `IncomingFilterOutcome`
```rust
pub enum IncomingFilterOutcome {
    Accept,  // Allow the connection
    Retry,   // Send QUIC retry (address validation)
    Reject,  // Refuse with CONNECTION_REFUSED
    Ignore,  // Drop silently (remote times out)
}
```

### `AccessLimit`
Wrapper that limits connections to allowed `EndpointId`s:

```rust
let handler = AccessLimit::new(MyHandler, |endpoint_id| allowed_set.contains(&endpoint_id));
```

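
The incoming filter and the access limit act at different stages: the filter runs before the handshake, when only the source address is known, while the allowlist needs the authenticated `EndpointId`. The sketch below models both stages with stand-in types. The enum mirrors the variants listed above but is not the iroh type, and the capacity rule in `filter_incoming` is a hypothetical policy chosen for illustration.

```rust
use std::collections::HashSet;

/// Stand-in for iroh's `EndpointId` (a 32-byte Ed25519 public key).
pub type EndpointId = [u8; 32];

/// Mirrors the variants described above; not the real iroh type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IncomingFilterOutcome {
    Accept,
    Retry,
    Reject,
    Ignore,
}

/// Stage 1 (pre-handshake): decide from cheap facts only.
/// Hypothetical policy: refuse when full, otherwise require a validated
/// source address before spending handshake work on the peer.
pub fn filter_incoming(
    addr_validated: bool,
    active: usize,
    max_active: usize,
) -> IncomingFilterOutcome {
    if active >= max_active {
        IncomingFilterOutcome::Reject
    } else if !addr_validated {
        IncomingFilterOutcome::Retry
    } else {
        IncomingFilterOutcome::Accept
    }
}

/// Stage 2 (post-handshake): the peer's identity is now authenticated,
/// so an allowlist check like the `AccessLimit` predicate is meaningful.
pub fn is_allowed(allowed: &HashSet<EndpointId>, remote: &EndpointId) -> bool {
    allowed.contains(remote)
}
```

Splitting the checks this way keeps unauthenticated traffic cheap to turn away, and leaves identity-based decisions until the TLS handshake has proven who the peer is.
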
### `EndpointHooks`
Intercept connection establishment at two points:

```rust
pub trait EndpointHooks: Debug + Send + Sync {
    // Before outgoing connection starts
    fn before_connect<'a>(&'a self, remote_addr: &'a EndpointAddr, alpn: &'a [u8])
        -> BoxFuture<'a, BeforeConnectOutcome>;

    // After TLS handshake completes (on both sides)
    fn after_handshake<'a>(&'a self, info: &'a ConnectionInfo)
        -> BoxFuture<'a, AfterHandshakeOutcome>;
}
```

## Connection Types (`iroh::endpoint::connection`)

### `Connecting`
The state between initiating a connection and completing the handshake. `Connecting` is itself awaited to finish the handshake:

```rust
// `connecting.await` resolves to Result<Connection, ConnectingError>
// (`await` is a keyword, so this is not a method on the type).
impl Connecting {
    pub fn into_0rtt(self) -> Result<(OutgoingZeroRttConnection, Connection), Connecting>;
    pub fn alpn(&self) -> Result<Vec<u8>, ConnectingError>;
    pub fn remote_id(&self) -> Result<EndpointId, RemoteEndpointIdError>;
}
```

### `Connection`
Wraps a `noq::Connection` with iroh-specific metadata:

```rust
impl Connection {
    // Stream operations
    pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), OpenBi>;
    pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), AcceptBi>;
    pub async fn open_uni(&self) -> Result<SendStream, OpenUni>;
    pub async fn accept_uni(&self) -> Result<RecvStream, AcceptUni>;

    // Datagrams
    pub fn send_datagram(&self, data: SendDatagram) -> Result<(), SendDatagramError>;
    pub async fn read_datagram(&self) -> Result<Bytes, ReadDatagram>;

    // Connection lifecycle
    pub fn close(&self, error_code: VarInt, reason: &[u8]);
    pub async fn closed(&self) -> ConnectionError;

    // Identity
    pub fn remote_id(&self) -> EndpointId;
    pub fn alpn(&self) -> Vec<u8>;

    // Path observation
    pub fn paths(&self) -> PathWatcher;

    // Keying material export
    pub fn export_keying_material(&self, output: &mut [u8], label: &[u8], context: Option<&[u8]>) -> Result<(), ExportKeyingMaterialError>;
}
```

### `Incoming`
Pre-accept incoming connection:

```rust
impl Incoming {
    pub fn accept(self) -> Result<Accepting, ConnectionError>;
    pub fn accept_with(self, server_config: Arc<ServerConfig>) -> Result<Accepting, ConnectionError>;
    pub fn refuse(self);
    pub fn retry(self) -> Result<(), RetryError>;
    pub fn ignore(self);
    pub fn remote_addr(&self) -> IncomingAddr;
    pub fn local_ip(&self) -> Option<IpAddr>;
    pub fn remote_addr_validated(&self) -> bool;
    pub fn decrypt(&self) -> Option<DecryptedInitial>;
}
```

### `IncomingAddr`
```rust
pub enum IncomingAddr {
    Ip(SocketAddr),
    Relay { url: RelayUrl, endpoint_id: EndpointId },
    Custom(CustomAddr),
}
```

## `RelayMap` and `RelayConfig` (`iroh-relay`)

### `RelayMap`
Thread-safe map of relay servers:

```rust
let map = RelayMap::from_iter([
    "https://relay1.example.org".parse()?,
    "https://relay2.example.org".parse()?,
]);
```

### `RelayConfig`
```rust
pub struct RelayConfig {
    pub url: RelayUrl,
    pub quic: Option<RelayQuicConfig>,  // QAD support
}

pub struct RelayQuicConfig {
    pub port: u16,  // Default: 3478
}
```

## `EndpointData` and `EndpointInfo` (`iroh-dns`)

### `EndpointData`
The data published about an endpoint:

```rust
pub struct EndpointData {
    addrs: Vec<TransportAddr>,
    user_data: Option<UserData>,
}
```

### `EndpointInfo`
Combines `EndpointId` with `EndpointData`:

```rust
pub struct EndpointInfo {
    pub endpoint_id: EndpointId,
    pub data: EndpointData,
}
```

### `UserData`
Application-defined string data published alongside addressing info:

```rust
pub struct UserData(String);  // Max 256 bytes
```

### `AddrFilter`
Controls which addresses are published to address lookup services:

```rust
let filter = AddrFilter::relay_only();  // Only relay URLs
let filter = AddrFilter::unfiltered();  // All addresses
let filter = AddrFilter::custom(|addrs| { /* custom logic */ });
```

401
docs/research/references/iroh/iroh/03-networking-protocols.md
Normal file
@@ -0,0 +1,401 @@
# Iroh: Networking & Protocol Details

## Connection Establishment

### Overview
The connection process follows this sequence:

```
Caller                                     Callee
  |                                           |
  |--- connect(EndpointAddr, alpn) ---------->|  (via relay first)
  |                                           |
  |<----- TLS Handshake (Raw Public Key) ---->|
  |                                           |
  |<====== QUIC Connection Established ======>|
  |                                           |
  |   (iroh attempts direct path migration)   |
  |                                           |
  |--- open_bi() / open_uni() --------------->|
  |<-- accept_bi() / accept_uni() ------------|
```

### Step-by-Step

1. **Resolve addressing** — `resolve_remote(EndpointAddr)` starts a `RemoteStateActor` for the peer. If no direct addresses or relay URL are provided, Address Lookup services are queried.

2. **Map addresses** — `EndpointId` is mapped to a synthetic IPv6 address for the QUIC layer (`EndpointIdMappedAddr`). Relay and custom transport addresses are similarly mapped.

3. **TLS connection** — Uses RFC 7250 Raw Public Keys. The server name is encoded as `<base32-dnssec-encoded-public-key>.iroh.invalid`. Both sides authenticate by `EndpointId`.

4. **ALPN negotiation** — The Application-Layer Protocol Negotiation determines which protocol handler receives the connection.

5. **Path migration** — Once a QUIC connection is established (initially via relay), iroh continuously searches for better paths. Direct IP paths are preferred when available.

## Transport Layer Architecture

### The `Socket` — Core Connectivity Engine

The `Socket` struct is the heart of iroh's networking. It manages:
- Multiple transport paths (IPv4, IPv6, relay, custom)
- Address discovery and NAT traversal
- Path migration between relay and direct connections

```
                 ┌────────────────────────┐
                 │        Endpoint        │  (Public API)
                 │  (Arc<EndpointInner>)  │
                 └───────────┬────────────┘
                             │
                 ┌───────────▼────────────┐
                 │         Socket         │  (Connectivity engine)
                 │     (Arc<Socket>)      │
                 └───────────┬────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
   ┌─────▼─────┐       ┌─────▼─────┐     ┌───────▼───────┐
   │IpTransport│       │   Relay   │     │CustomTransport│
   │ (IPv4/v6) │       │ Transport │     │  (unstable)   │
   └─────┬─────┘       └─────┬─────┘     └───────────────┘
         │                   │
   ┌─────▼─────┐       ┌─────▼─────┐
   │ UdpSocket │       │ WebSocket │
   │ (netwatch)│       │   Actor   │
   └───────────┘       └───────────┘
```

### Transport Configuration

```rust
pub enum TransportConfig {
    Ip {
        config: IpConfig,  // IPv4 or IPv6 socket config
        is_user_defined: bool,
    },
    Relay {
        relay_map: RelayMap,  // Which relay servers to use
        is_user_defined: bool,
    },
    #[cfg(feature = "unstable-custom-transports")]
    Custom(Arc<dyn CustomTransport>),
}

pub enum IpConfig {
    V4 { ip_net: Ipv4Net, port: u16, is_required: bool, is_default: bool },
    V6 { ip_net: Ipv6Net, scope_id: u32, port: u16, is_required: bool, is_default: bool },
}
```

### Address Mapping

Iroh maps all transport addresses to IPv6 for the QUIC layer:

- **IPv4/IPv6 addresses** → used directly as QUIC path addresses
- **Relay addresses** → mapped to synthetic IPv6 addresses in a dedicated range
- **Custom addresses** → mapped to synthetic IPv6 addresses in another range

The `MappedAddrs` struct maintains these mappings:
```rust
pub(crate) struct MappedAddrs {
    pub(super) endpoint_addrs: AddrMap<EndpointId, EndpointIdMappedAddr>,
    pub(super) relay_addrs: AddrMap<(RelayUrl, EndpointId), RelayMappedAddr>,
    pub(super) custom_addrs: AddrMap<CustomAddr, CustomMappedAddr>,
}
```

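
One way to picture such a mapping is a bidirectional table that hands out one synthetic IPv6 address per key from a fixed prefix. The sketch below is an assumption-laden illustration, not iroh's `AddrMap`: the prefix value, the counter-based allocation, and the method names `get_or_insert` and `lookup` are all hypothetical.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::net::Ipv6Addr;

/// Bidirectional map from a key (for example a relay URL plus endpoint ID)
/// to a synthetic IPv6 address inside one dedicated range.
pub struct AddrMap<K> {
    prefix: [u16; 4], // upper 64 bits identifying the range
    next: u64,        // counter for the lower 64 bits
    by_key: HashMap<K, Ipv6Addr>,
    by_addr: HashMap<Ipv6Addr, K>,
}

impl<K: Clone + Eq + Hash> AddrMap<K> {
    pub fn new(prefix: [u16; 4]) -> Self {
        Self { prefix, next: 0, by_key: HashMap::new(), by_addr: HashMap::new() }
    }

    /// Returns the address already assigned to `key`, or allocates a new one.
    pub fn get_or_insert(&mut self, key: K) -> Ipv6Addr {
        if let Some(addr) = self.by_key.get(&key) {
            return *addr;
        }
        let n = self.next;
        self.next += 1;
        let [a, b, c, d] = self.prefix;
        // Spread the counter over the four low 16-bit segments.
        let addr = Ipv6Addr::new(
            a, b, c, d,
            (n >> 48) as u16, (n >> 32) as u16, (n >> 16) as u16, n as u16,
        );
        self.by_key.insert(key.clone(), addr);
        self.by_addr.insert(addr, key);
        addr
    }

    /// Reverse lookup: which key does a synthetic address stand for?
    pub fn lookup(&self, addr: &Ipv6Addr) -> Option<&K> {
        self.by_addr.get(addr)
    }
}
```

Giving each address family its own prefix lets the transport layer tell, from the destination address alone, whether a QUIC packet should leave via a relay, a custom transport, or a real socket.
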
### Transport Bias

Path selection uses a configurable bias system:

```rust
let endpoint = Endpoint::builder(presets::N0)
    .transport_bias(AddrKind::Custom(42), TransportBias::primary())
    .bind()
    .await?;
```

Default biases:
- IPv4 and IPv6 are **primary** (IPv6 gets small RTT advantage)
- Relay is **backup** (only used when no primary transport available)

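
The default biases can be read as a two-level ordering: tier first, then RTT. The sketch below models that reading with stand-in types. The 3 ms IPv6 advantage is a made-up figure for illustration, and `Kind`, `Tier`, `Path`, and `select_path` are hypothetical names rather than iroh's API.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Ipv4,
    Ipv6,
    Relay,
}

/// Declaration order matters: `Primary` sorts before `Backup`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Primary,
    Backup,
}

pub struct Path {
    pub kind: Kind,
    pub rtt: Duration,
}

/// Hypothetical size of the "small RTT advantage" given to IPv6.
const IPV6_RTT_ADVANTAGE: Duration = Duration::from_millis(3);

fn tier(kind: Kind) -> Tier {
    match kind {
        Kind::Ipv4 | Kind::Ipv6 => Tier::Primary,
        Kind::Relay => Tier::Backup,
    }
}

fn biased_rtt(path: &Path) -> Duration {
    match path.kind {
        Kind::Ipv6 => path.rtt.saturating_sub(IPV6_RTT_ADVANTAGE),
        _ => path.rtt,
    }
}

/// Picks the best path: any primary path beats every backup path,
/// and within a tier the lowest biased RTT wins.
pub fn select_path(paths: &[Path]) -> Option<&Path> {
    paths.iter().min_by_key(|p| (tier(p.kind), biased_rtt(p)))
}
```

Under this ordering a slow direct path still wins over a fast relay path, which matches the description of the relay as a backup rather than a competitor.
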
## Relay Protocol

### Architecture

The relay system is based on a revised version of Tailscale's DERP (Designated Encrypted Relay for Packets) protocol.

```
Client A             Relay Server              Client B
   │                       │                       │
   │──── HTTP CONNECT ────>│                       │
   │<─────── 200 OK ───────│                       │
   │                       │<──── HTTP CONNECT ────│
   │                       │─────── 200 OK ───────>│
   │                       │                       │
   │── Encrypted QUIC ────>│── Encrypted QUIC ────>│
   │<──── Encrypted QUIC ──│<──── Encrypted QUIC ──│
```

### Relay Actor

The `RelayActor` manages the WebSocket connection to the relay:
- Connects to relay via HTTPS, upgrades to custom protocol
- Sends/receives encrypted datagrams on behalf of the local endpoint
- Manages reconnection on network changes or relay restarts
- Reports connection status via `HomeRelayWatch`

### Relay Data Flow
1. Outgoing packet → `RelayTransport::send()` → `RelayActor` → WebSocket → Relay server → WebSocket → remote `RelayActor` → remote `RelayTransport::recv()` → QUIC
2. The relay only sees encrypted QUIC packets — it cannot decode application data

### Home Relay Selection

The `net_report` module continuously probes relay servers and maintains latency statistics. The "home relay" is selected based on:
- Lowest recent latency (with hysteresis to avoid flapping)
- A 2/3 switching threshold: the current relay is kept unless a candidate's latency is below roughly 2/3 of the current relay's latency

## Hole-Punching & NAT Traversal
|
||||
|
||||
### QUIC Address Discovery (QAD)
|
||||
|
||||
Iroh uses QUIC Address Discovery (based on [draft-ietf-quic-address-discovery](https://datatracker.ietf.org/doc/draft-ietf-quic-address-discovery/)) to discover external IP addresses. The relay servers expose QAD endpoints.
|
||||
|
||||
The `net_report` module:
|
||||
1. Establishes QUIC connections to relay servers
|
||||
2. Uses `observed_external_addr()` to learn external addresses
|
||||
3. Reports NAT type, mapping behavior, and preferred relay
|
||||
|
||||
### NAT Traversal Strategy
|
||||
|
||||
```
|
||||
┌──────────────────────────────┐
|
||||
│ NAT Traversal │
|
||||
│ │
|
||||
│ 1. Direct connection attempt │
|
||||
│ (simultaneous open) │
|
||||
│ │
|
||||
│ 2. QAD-discovered addresses │
|
||||
│ (relay reports observed IP)│
|
||||
│ │
|
||||
│ 3. Port mapping (UPnP/PCP/NAT-PMP)│
|
||||
│ (if supported by gateway) │
|
||||
│ │
|
||||
│ 4. Relay fallback │
|
||||
│ (always available) │
|
||||
└──────────────────────────────┘
|
||||
```
|
||||
|
||||
### Port Mapper
|
||||
|
||||
```rust
|
||||
pub enum PortmapperConfig {
|
||||
Enabled {}, // Default: tries UPnP, PCP, NAT-PMP
|
||||
Disabled, // No port mapping
|
||||
}
|
||||
```
|
||||
|
||||
When enabled, the port mapper:
|
||||
- Discovers gateway devices
|
||||
- Requests port mappings
|
||||
- Provides external addresses to the endpoint
|
||||
- Updates when mappings change
|
||||
|
||||
### Net Report
|
||||
|
||||
`NetReport` discovers network conditions:
|
||||
- IPv4/IPv6 connectivity
|
||||
- NAT mapping behavior (varies by destination or not)
|
||||
- Captive portal detection
|
||||
- Preferred relay selection
|
||||
- External IP addresses (via QAD)
|
||||
|
||||
Key timeouts:
|
||||
- `NET_REPORT_TIMEOUT` = 10 seconds
|
||||
- `FULL_REPORT_INTERVAL` = 5 minutes
|
||||
- `HEARTBEAT_INTERVAL` = 5 seconds (keepalive)
|
||||
- `PATH_MAX_IDLE_TIMEOUT` = 15 seconds (direct)
|
||||
- `RELAY_PATH_MAX_IDLE_TIMEOUT` = 30 seconds (relay)
|
||||
|
||||
## Address Lookup System
|
||||
|
||||
### Trait Definition
|
||||
|
||||
```rust
|
||||
pub trait AddressLookup: Debug + Send + Sync + 'static {
|
||||
fn publish(&self, data: &EndpointData);
|
||||
fn resolve(&self, endpoint_id: EndpointId) -> Option<BoxStream<Result<Item, Error>>>;
|
||||
}
|
||||
```
|
||||
|
||||
### `AddressLookupServices`
|
||||
A composite that runs multiple lookup services concurrently:
|
||||
|
||||
```rust
|
||||
let services = AddressLookupServices::default();
|
||||
services.set_addr_filter(AddrFilter::relay_only());
|
||||
services.add(publisher);
|
||||
services.add(resolver);
|
||||
```
|
||||
|
||||
Resolution merges results from all services. Individual service errors don't block other services.
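The merge behaviour described above can be sketched without any iroh types. Here each service's outcome is modelled as a `Result<Vec<String>, String>`; the real implementation works on streams of lookup items, so `LookupResult` and `merge_results` are illustrative names only.

```rust
// Hypothetical per-service outcome: addresses on success, a message on failure.
type LookupResult = Result<Vec<String>, String>;

/// Merge results from several lookup services.
/// Errors are collected separately and never discard addresses found elsewhere.
fn merge_results(results: Vec<LookupResult>) -> (Vec<String>, Vec<String>) {
    let mut addrs = Vec::new();
    let mut errors = Vec::new();
    for result in results {
        match result {
            Ok(found) => {
                for addr in found {
                    // Deduplicate while preserving first-seen order.
                    if !addrs.contains(&addr) {
                        addrs.push(addr);
                    }
                }
            }
            Err(err) => errors.push(err),
        }
    }
    (addrs, errors)
}

fn main() {
    let (addrs, errors) = merge_results(vec![
        Ok(vec!["192.0.2.1:4433".to_string()]),
        Err("dns timeout".to_string()),
    ]);
    println!("addrs={addrs:?} errors={errors:?}");
}
```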
|
||||
|
||||
### Built-in Implementations
|
||||
|
||||
#### `PkarrPublisher`
|
||||
Publishes endpoint info to a pkarr relay via HTTP PUT:
|
||||
```rust
|
||||
let publisher = PkarrPublisher::builder(pkarr_url)
|
||||
.addr_filter(AddrFilter::relay_only()) // Default: relay-only
|
||||
.build(secret_key, tls_config);
|
||||
```
|
||||
|
||||
#### `PkarrResolver` (browser/WASM)
|
||||
Resolves endpoint info from a pkarr relay via HTTP GET.
|
||||
|
||||
#### `DnsAddressLookup` (non-browser)
|
||||
Resolves endpoint info via DNS TXT records:
|
||||
```rust
|
||||
// Default n0 DNS
|
||||
let lookup = DnsAddressLookup::n0_dns();
|
||||
|
||||
// Custom DNS origin
|
||||
let lookup = DnsAddressLookup::new(dns_resolver, origin);
|
||||
```
|
||||
|
||||
#### `MemoryLookup`
|
||||
In-memory address lookup for testing:
|
||||
```rust
|
||||
let lookup = MemoryLookup::new();
|
||||
lookup.add_endpoint(endpoint_id, endpoint_data);
|
||||
```
|
||||
|
||||
### DNS Record Format
|
||||
```
|
||||
_iroh.<z32-encoded-endpoint-id>.<origin-domain> TXT
|
||||
```
|
||||
Attributes:
|
||||
- `relay=<url>` — Home relay URL
|
||||
- `addr=<addr> <addr>` — Space-separated socket addresses
|
||||
- `user_data=<base64-encoded-data>` — Application-specific data
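To make the attribute layout concrete, here is a minimal parser sketch for the `key=value` strings listed above. It uses only the standard library and is not the parser from `iroh-dns`; `ParsedRecord` and `parse_attrs` are hypothetical names, and malformed entries are simply skipped, which may differ from the real behaviour.

```rust
use std::net::SocketAddr;

// Illustrative container for the three attributes named in the record format.
#[derive(Debug, Default, PartialEq)]
struct ParsedRecord {
    relay: Option<String>,
    addrs: Vec<SocketAddr>,
    user_data: Option<String>,
}

/// Parse TXT strings of the form `key=value` into a `ParsedRecord`.
fn parse_attrs<'a>(txt: impl IntoIterator<Item = &'a str>) -> ParsedRecord {
    let mut record = ParsedRecord::default();
    for entry in txt {
        // Entries without '=' are ignored in this sketch.
        let Some((key, value)) = entry.split_once('=') else {
            continue;
        };
        match key {
            "relay" => record.relay = Some(value.to_string()),
            // `addr` holds space-separated socket addresses.
            "addr" => record.addrs.extend(
                value
                    .split_whitespace()
                    .filter_map(|a| a.parse::<SocketAddr>().ok()),
            ),
            // Kept as the raw encoded string; decoding is out of scope here.
            "user_data" => record.user_data = Some(value.to_string()),
            _ => {}
        }
    }
    record
}

fn main() {
    let record = parse_attrs([
        "relay=https://relay.example",
        "addr=192.0.2.1:4433 [2001:db8::1]:4433",
    ]);
    println!("{record:?}");
}
```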
|
||||
|
||||
## TLS Configuration
|
||||
|
||||
### `TlsConfig`
|
||||
Manages TLS state shared across sessions:
|
||||
```rust
|
||||
struct TlsConfig {
|
||||
secret_key: SecretKey,
|
||||
cert_resolver: Arc<ResolveRawPublicKeyCert>,
|
||||
server_verifier: Arc<ServerCertificateVerifier>,
|
||||
client_verifier: Arc<ClientCertificateVerifier>,
|
||||
session_store: Arc<dyn ClientSessionStore>,
|
||||
crypto_provider: Arc<CryptoProvider>,
|
||||
}
|
||||
```
|
||||
|
||||
### Raw Public Key Certificate
|
||||
Uses RFC 7250 — no X.509 certificates. The `ResolveRawPublicKeyCert` resolver creates TLS certificates on-the-fly from the Ed25519 public key.
|
||||
|
||||
### Verification Flow
|
||||
- **Client verifies server**: The `ServerCertificateVerifier` checks that the server's `EndpointId` matches the expected `EndpointId` encoded in the TLS server name.
|
||||
- **Server verifies client**: The `ClientCertificateVerifier` ensures the client presents a valid raw public key.
|
||||
|
||||
### Crypto Providers
|
||||
Two built-in options via feature flags:
|
||||
- `tls-ring` — uses `ring` crypto (default)
|
||||
- `tls-aws-lc-rs` — uses AWS LC-RS crypto
|
||||
|
||||
Custom providers can be set via `Builder::crypto_provider()`.
|
||||
|
||||
## Multipath & Path Migration
|
||||
|
||||
Iroh supports QUIC multipath connections. Multiple paths can be active simultaneously:
|
||||
|
||||
```rust
|
||||
// Watch path changes
|
||||
let paths = connection.paths();
|
||||
while let Some(infos) = paths.stream().next().await {
|
||||
for info in infos.iter() {
|
||||
if info.is_ip() { /* direct path */ }
|
||||
if info.is_relay() { /* relay path */ }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Maximum multipath paths per connection: 12 (`MAX_MULTIPATH_PATHS`).
|
||||
|
||||
### Path Types
|
||||
```rust
|
||||
pub struct PathInfo {
|
||||
pub addr: TransportAddr,
|
||||
pub usage: TransportAddrUsage,
|
||||
}
|
||||
|
||||
pub enum TransportAddrUsage {
|
||||
DefaultRoute,
|
||||
SubnetRoute,
|
||||
Backup,
|
||||
}
|
||||
```
|
||||
|
||||
## Connection Hooks
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
struct MyHook;
|
||||
|
||||
impl EndpointHooks for MyHook {
|
||||
fn before_connect<'a>(
|
||||
&'a self,
|
||||
remote_addr: &'a EndpointAddr,
|
||||
alpn: &'a [u8],
|
||||
) -> BoxFuture<'a, BeforeConnectOutcome> {
|
||||
Box::pin(async move {
|
||||
if is_allowed(remote_addr.id()) {
|
||||
BeforeConnectOutcome::Accept
|
||||
} else {
|
||||
BeforeConnectOutcome::Reject
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
fn after_handshake<'a>(
|
||||
&'a self,
|
||||
info: &'a ConnectionInfo,
|
||||
) -> BoxFuture<'a, AfterHandshakeOutcome> {
|
||||
Box::pin(async move {
|
||||
AfterHandshakeOutcome::Accept
|
||||
})
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Custom Transports (Unstable)
|
||||
|
||||
```rust
|
||||
pub trait CustomTransport: Send + Sync + Debug + 'static {
|
||||
// Create an endpoint for this transport
|
||||
fn create_endpoint(&self, config: CustomEndpointConfig) -> Result<Arc<dyn CustomEndpoint>, CustomTransportError>;
|
||||
}
|
||||
|
||||
pub trait CustomEndpoint: Send + Sync + Debug + 'static {
|
||||
fn send(&self, item: CustomSendItem) -> Result<(), CustomTransportError>;
|
||||
fn recv(&self) -> Result<CustomRecvItem, CustomTransportError>;
|
||||
}
|
||||
|
||||
// Register:
|
||||
let ep = Endpoint::builder(presets::N0)
|
||||
.add_custom_transport(Arc::new(MyTransport))
|
||||
.bind()
|
||||
.await?;
|
||||
```
|
||||
|
||||
Transport IDs (from `TRANSPORTS.md`):
|
||||
|
||||
| ID | Transport | Address format |
|
||||
|----|-----------|---------------|
|
||||
| `0x00-0x1F` | Reserved | - |
|
||||
| `0x20` | Test | Ed25519 public key (32 bytes) |
|
||||
| `0x544F52` | Tor | Ed25519 public key (32 bytes) |
|
||||
| `0x424C45` | BLE | Bluetooth MAC address (6 bytes) |
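Read as big-endian bytes, the Tor and BLE IDs in the table spell out ASCII tags. The sketch below checks that arithmetic; `id_tag` is a hypothetical helper, and the `u64` ID width is an assumption rather than something stated in `TRANSPORTS.md`.

```rust
/// Interpret a transport ID as an uppercase ASCII tag, if it looks like one.
/// Leading zero bytes are skipped before checking the remaining bytes.
fn id_tag(id: u64) -> Option<String> {
    let significant: Vec<u8> = id
        .to_be_bytes()
        .iter()
        .copied()
        .skip_while(|b| *b == 0)
        .collect();
    if !significant.is_empty() && significant.iter().all(|b| b.is_ascii_uppercase()) {
        String::from_utf8(significant).ok()
    } else {
        None
    }
}

fn main() {
    for id in [0x20u64, 0x544F52, 0x424C45] {
        println!("{id:#x} -> {:?}", id_tag(id));
    }
}
```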
|
||||
294
docs/research/references/iroh/iroh/04-sub-crates.md
Normal file
@@ -0,0 +1,294 @@
|
||||
# Iroh: Sub-Crates
|
||||
|
||||
## `iroh-base`
|
||||
|
||||
**Purpose**: Fundamental types shared across all iroh crates.
|
||||
**Features**: `key` (default), `relay` (default)
|
||||
|
||||
### Key Types
|
||||
|
||||
| Type | Description |
|
||||
|------|-------------|
|
||||
| `SecretKey` | Ed25519 signing key (32 bytes). Generated randomly or from bytes. |
|
||||
| `PublicKey` | Ed25519 public key (32 bytes). Verifies signatures. |
|
||||
| `EndpointId` | Type alias for `PublicKey` — used as network identity. |
|
||||
| `Signature` | Ed25519 signature (64 bytes). |
|
||||
| `RelayUrl` | Arc-wrapped `Url` identifying a relay server. |
|
||||
| `EndpointAddr` | Combines `EndpointId` + `BTreeSet<TransportAddr>`. Primary addressing type. |
|
||||
| `TransportAddr` | Enum: `Relay(RelayUrl)`, `Ip(SocketAddr)`, `Custom(CustomAddr)`. |
|
||||
| `CustomAddr` | Opaque address for custom transports (id + bytes). |
|
||||
| `KeyParsingError` | Error type for key parsing. |
|
||||
| `RelayUrlParseError` | Error type for URL parsing. |
|
||||
|
||||
### `EndpointAddr` Methods
|
||||
|
||||
```rust
|
||||
impl EndpointAddr {
|
||||
pub fn new(id: PublicKey) -> Self;
|
||||
pub fn from_parts(id: PublicKey, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
|
||||
pub fn with_relay_url(self, relay_url: RelayUrl) -> Self;
|
||||
pub fn with_ip_addr(self, addr: SocketAddr) -> Self;
|
||||
pub fn with_addrs(self, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
|
||||
pub fn is_empty(&self) -> bool;
|
||||
pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr>;
|
||||
pub fn relay_urls(&self) -> impl Iterator<Item = &RelayUrl>;
|
||||
}
|
||||
```
|
||||
|
||||
### Serialization
|
||||
- `PublicKey`/`EndpointId`: Human-readable → z-base-32 (a base32 variant); Binary → 32 raw bytes
|
||||
- `EndpointAddr`: Serialized as `{id, addrs}` with `TransportAddr` as tagged enum
|
||||
- `RelayUrl`: Serialized as URL string
|
||||
|
||||
---
|
||||
|
||||
## `iroh-dns`
|
||||
|
||||
**Purpose**: DNS resolver and endpoint info serialization for address discovery.
|
||||
**Key Features**: pkarr signed packet creation/verification, DNS TXT record parsing, configurable DNS resolver.
|
||||
|
||||
### Modules
|
||||
|
||||
| Module | Description |
|
||||
|--------|-------------|
|
||||
| `dns` | `DnsResolver` — configurable async DNS resolver with IPv4/IPv6 staggered lookup |
|
||||
| `endpoint_info` | `EndpointInfo`, `EndpointData`, `AddrFilter`, `UserData` — serialization/deserialization |
|
||||
| `pkarr` | Pkarr signed packet creation and verification |
|
||||
| `attrs` | Low-level TXT record attribute parsing |
|
||||
|
||||
### `DnsResolver`
|
||||
|
||||
```rust
|
||||
impl DnsResolver {
|
||||
pub fn new() -> Self;
|
||||
pub fn with_nameserver(addr: SocketAddr) -> Self;
|
||||
pub fn with_nameservers(addrs: Vec<SocketAddr>) -> Self;
|
||||
|
||||
// Lookup methods
|
||||
pub async fn lookup_ipv4(&self, host: String) -> Result<...>;
|
||||
pub async fn lookup_ipv6(&self, host: String) -> Result<...>;
|
||||
pub async fn lookup_ipv4_ipv6_staggered(&self, host: &str, timeout: Duration, delays: &[u64]) -> Result<...>;
|
||||
pub async fn lookup_txt(&self, host: String) -> Result<...>;
|
||||
pub async fn lookup_endpoint_by_id(&self, id: &EndpointId, origin: &str) -> Result<EndpointInfo>;
|
||||
|
||||
// Cache management
|
||||
pub fn clear_cache(&self);
|
||||
pub fn reset_resolver(&self);
|
||||
}
|
||||
```
|
||||
|
||||
### `EndpointInfo` & `EndpointData`
|
||||
|
||||
```rust
|
||||
pub struct EndpointInfo {
|
||||
pub endpoint_id: EndpointId,
|
||||
pub data: EndpointData,
|
||||
}
|
||||
|
||||
pub struct EndpointData {
|
||||
addrs: Vec<TransportAddr>,
|
||||
user_data: Option<UserData>,
|
||||
}
|
||||
|
||||
impl EndpointData {
|
||||
pub fn new(addrs: Vec<TransportAddr>) -> Self;
|
||||
pub fn from_iter(addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
|
||||
pub fn with_user_data(mut self, user_data: UserData) -> Self;
|
||||
pub fn addrs(&self) -> impl Iterator<Item = &TransportAddr>;
|
||||
pub fn user_data(&self) -> Option<&UserData>;
|
||||
pub fn apply_filter(&self, filter: &AddrFilter) -> Cow<'_, EndpointData>;
|
||||
}
|
||||
```
|
||||
|
||||
### `AddrFilter`
|
||||
|
||||
Controls which addresses are published in address lookup:
|
||||
|
||||
```rust
|
||||
pub enum AddrFilter {
|
||||
RelayOnly, // Only relay URLs
|
||||
Unfiltered, // All addresses
|
||||
Custom(fn(&[TransportAddr]) -> Vec<TransportAddr>),
|
||||
}
|
||||
```
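How the three variants behave can be sketched with simplified stand-in types. `Addr`, `Filter`, `apply`, and `ip_only` below are hypothetical mirrors written for this example; they are not the `iroh-dns` definitions, and the real `apply_filter` returns a `Cow` rather than a fresh `Vec`.

```rust
use std::net::SocketAddr;

// Simplified stand-in for `TransportAddr` (custom transports omitted).
#[derive(Debug, Clone, PartialEq)]
enum Addr {
    Relay(String),
    Ip(SocketAddr),
}

// Mirrors the shape of the enum above: two fixed policies plus a function pointer.
enum Filter {
    RelayOnly,
    Unfiltered,
    Custom(fn(&[Addr]) -> Vec<Addr>),
}

/// Return the addresses that would be published under `filter`.
fn apply(filter: &Filter, addrs: &[Addr]) -> Vec<Addr> {
    match filter {
        Filter::RelayOnly => addrs
            .iter()
            .filter(|a| matches!(a, Addr::Relay(_)))
            .cloned()
            .collect(),
        Filter::Unfiltered => addrs.to_vec(),
        Filter::Custom(f) => f(addrs),
    }
}

// Example custom policy: publish only direct IP addresses.
fn ip_only(addrs: &[Addr]) -> Vec<Addr> {
    addrs
        .iter()
        .filter(|a| matches!(a, Addr::Ip(_)))
        .cloned()
        .collect()
}

fn main() {
    let addrs = vec![
        Addr::Relay("https://relay.example".to_string()),
        Addr::Ip("192.0.2.1:4433".parse().unwrap()),
    ];
    println!("{:?}", apply(&Filter::RelayOnly, &addrs));
    println!("{:?}", apply(&Filter::Custom(ip_only), &addrs));
}
```

Publishing only relay URLs keeps direct IP addresses out of public lookup records, which is presumably why relay-only is the default noted for `PkarrPublisher`.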
|
||||
|
||||
### Pkarr Integration
|
||||
|
||||
```rust
|
||||
// Creating signed packets
|
||||
let info = EndpointInfo::new(secret_key.public())
|
||||
.with_relay_url(relay_url);
|
||||
let packet = info.to_pkarr_signed_packet(&secret_key, 30)?; // 30 second TTL
|
||||
|
||||
// Verifying and extracting
|
||||
let info = EndpointInfo::from_pkarr_signed_packet(&packet)?;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## `iroh-relay`
|
||||
|
||||
**Purpose**: Relay server and client implementation. Provides DERP-like relay protocol, QAD support, and relay server binary.
|
||||
|
||||
### Key Exports
|
||||
|
||||
| Type | Description |
|
||||
|------|-------------|
|
||||
| `RelayMap` | Thread-safe map of `RelayUrl → RelayConfig` |
|
||||
| `RelayConfig` | Configuration for a single relay server |
|
||||
| `RelayQuicConfig` | QUIC address discovery configuration |
|
||||
| `KeyCache` | Cache for relay server public keys |
|
||||
| `PingTracker` | Ping/pong tracking for relay connections |
|
||||
| `MAX_PACKET_SIZE` | Maximum relay packet size (64KB - overhead) |
|
||||
|
||||
### Modules
|
||||
|
||||
| Module | Description |
|
||||
|--------|-------------|
|
||||
| `client` | HTTP client for relay server connections |
|
||||
| `http` | HTTP-related relay functionality |
|
||||
| `protos` | Protocol definitions (handshake, relay, streams) |
|
||||
| `quic` | QUIC client for QAD probing |
|
||||
| `server` | Full relay server implementation (`feature = "server"`) |
|
||||
| `tls` | TLS configuration utilities |
|
||||
|
||||
### `RelayConfig`
|
||||
|
||||
```rust
|
||||
pub struct RelayConfig {
|
||||
pub url: RelayUrl,
|
||||
pub quic: Option<RelayQuicConfig>,
|
||||
}
|
||||
|
||||
impl RelayConfig {
|
||||
pub fn new(url: RelayUrl, quic: Option<RelayQuicConfig>) -> Self;
|
||||
pub fn from(url: RelayUrl) -> Self; // No QAD
|
||||
}
|
||||
```
|
||||
|
||||
### `RelayMap`
|
||||
|
||||
```rust
|
||||
impl RelayMap {
|
||||
pub fn empty() -> Self;
|
||||
pub fn from(relay: RelayConfig) -> Self;
|
||||
pub fn from_iter(iter: impl IntoIterator<Item = impl Into<RelayConfig>>) -> Self;
|
||||
pub fn try_from_iter(iter: impl IntoIterator<Item = &str>) -> Result<Self, RelayUrlParseError>;
|
||||
pub fn insert(&self, url: RelayUrl, config: Arc<RelayConfig>) -> Option<Arc<RelayConfig>>;
|
||||
pub fn remove(&self, url: &RelayUrl) -> Option<Arc<RelayConfig>>;
|
||||
pub fn len(&self) -> usize;
|
||||
pub fn is_empty(&self) -> bool;
|
||||
pub fn urls<T: FromIterator<RelayUrl>>(&self) -> T;
|
||||
pub fn relays<T: FromIterator<Arc<RelayConfig>>>(&self) -> T;
|
||||
}
|
||||
```
|
||||
|
||||
### Relay Protocol (DERP-like)
|
||||
|
||||
The relay protocol is based on Tailscale's DERP protocol, adapted for iroh:
|
||||
|
||||
1. Client connects via HTTPS, upgrades to custom protocol
|
||||
2. Authentication via raw public key (Ed25519)
|
||||
3. Encrypted datagram forwarding by `EndpointId`
|
||||
4. QAD probes via QUIC for address discovery
|
||||
5. Ping/pong keepalive mechanism
|
||||
|
||||
### TLS Utilities
|
||||
|
||||
```rust
|
||||
pub use iroh_relay::tls::{CaRootsConfig, default_provider};
|
||||
|
||||
// Skip certificate verification (testing only)
|
||||
let config = CaRootsConfig::insecure_skip_verify();
|
||||
|
||||
// Use system trust roots
|
||||
let config = CaRootsConfig::platform_verifier();
|
||||
|
||||
// Use specific roots
|
||||
let config = CaRootsConfig::from_pem(pem_bytes);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## `iroh-dns-server`
|
||||
|
||||
**Purpose**: DNS server that resolves iroh `EndpointId`s to addressing information. Powers `dns.iroh.link`.
|
||||
|
||||
### Key Features
|
||||
- Serves DNS TXT records for `_iroh.<z32-endpoint-id>.<origin>` queries
|
||||
- Integrates with pkarr for signed record verification
|
||||
- Supports production (`dns.iroh.link`) and staging (`staging-dns.iroh.link`) origins
|
||||
- Includes benchmarking support
|
||||
|
||||
### Configuration Files
|
||||
- `config.dev.toml` — Development configuration
|
||||
- `config.prod.toml` — Production configuration
|
||||
|
||||
---
|
||||
|
||||
## Internal Modules in `iroh` Crate
|
||||
|
||||
### `socket` Module
|
||||
The connectivity layer — manages the `Socket` struct that orchestrates:
|
||||
- Multiple transport paths
|
||||
- Network change detection
|
||||
- Address discovery and publication
|
||||
- Remote state actors (per-peer state machines)
|
||||
|
||||
**Key sub-modules**:
|
||||
|
||||
| Sub-module | Description |
|
||||
|-----------|-------------|
|
||||
| `transports/` | Transport implementations (IP, relay, custom) |
|
||||
| `transports/ip.rs` | IPv4/IPv6 UDP transport |
|
||||
| `transports/relay.rs` | Relay WebSocket transport |
|
||||
| `transports/relay/actor.rs` | Relay connection management actor |
|
||||
| `transports/custom.rs` | Unstable custom transport API |
|
||||
| `remote_map.rs` | Per-peer `RemoteStateActor` management |
|
||||
| `remote_map/remote_state.rs` | State machine for connecting to a peer |
|
||||
| `mapped_addrs.rs` | Address mapping for QUIC layer |
|
||||
| `concurrent_read_map.rs` | Lock-free concurrent map for remote actors |
|
||||
| `metrics.rs` | Socket-level metrics |
|
||||
|
||||
### `net_report` Module
|
||||
Network condition reporter:
|
||||
- Discovers external IP addresses (QAD)
|
||||
- Measures relay latencies
|
||||
- Detects NAT types
|
||||
- Detects captive portals
|
||||
- Selects preferred relay
|
||||
|
||||
### `portmapper` Module
|
||||
UPnP/PCP/NAT-PMP port mapping:
|
||||
- Gateway discovery
|
||||
- Port mapping procurement
|
||||
- External address monitoring
|
||||
|
||||
### `address_lookup` Module
|
||||
Pluggable address discovery:
|
||||
|
||||
| Sub-module | Description |
|
||||
|-----------|-------------|
|
||||
| `dns.rs` | `DnsAddressLookup` — resolves via DNS TXT records |
|
||||
| `pkarr.rs` | `PkarrPublisher` — publishes via HTTP PUT to pkarr relay; `PkarrResolver` — resolves from pkarr relay |
|
||||
| `memory.rs` | `MemoryLookup` — in-memory lookup for testing |
|
||||
|
||||
### `runtime` Module
|
||||
Tokio-based async runtime wrapper for `noq`:
|
||||
- Task spawning with cancellation support
|
||||
- Timer management
|
||||
- Graceful and abrupt shutdown
|
||||
- WASM browser support (delegates to `wasm-bindgen-futures`)
|
||||
|
||||
### `defaults` Module
|
||||
Default configuration values:
|
||||
- Production relay servers (4 regions)
|
||||
- Staging relay servers (2 regions)
|
||||
- Timeout constants
|
||||
- Environment variable for forcing staging (`IROH_FORCE_STAGING_RELAYS`)
|
||||
|
||||
### `metrics` Module
|
||||
`EndpointMetrics` collection:
|
||||
- Socket metrics (datagrams sent/received, data by transport type)
|
||||
- Net report metrics (reports generated, full vs incremental)
|
||||
- Port mapper metrics
|
||||
261
docs/research/references/iroh/iroh/05-data-flow-internals.md
Normal file
@@ -0,0 +1,261 @@
|
||||
# Iroh: Data Flow & Internal Architecture
|
||||
|
||||
## Data Flow: Connecting to a Remote Endpoint
|
||||
|
||||
```
|
||||
Endpoint::connect(endpoint_addr, alpn)
|
||||
│
|
||||
▼
|
||||
resolve_remote(endpoint_addr)
|
||||
│
|
||||
├─ If addr has direct IPs or relay URL → use those
|
||||
│
|
||||
└─ If addr is just EndpointId → query AddressLookupServices
|
||||
│
|
||||
├─ PkarrPublisher/PkarrResolver (HTTP)
|
||||
├─ DnsAddressLookup (DNS TXT)
|
||||
├─ MemoryLookup (in-memory)
|
||||
└─ ...custom implementations
|
||||
│
|
||||
▼
|
||||
Map EndpointId → MappedAddr for QUIC layer
|
||||
│
|
||||
▼
|
||||
noq::Endpoint::connect(client_config, dest_addr, server_name)
|
||||
│
|
||||
├─ TLS handshake with Raw Public Key authentication
|
||||
│ server_name = "<z32-encoded-endpoint-id>.iroh.invalid"
|
||||
│
|
||||
└─ QUIC connection established
|
||||
│
|
||||
▼
|
||||
Connecting → Connection
|
||||
│
|
||||
├─ Connection stays on relay path initially
|
||||
│
|
||||
└─ RemoteStateActor discovers direct paths
|
||||
│
|
||||
├─ QAD-discovered addresses
|
||||
├─ Addresses from Address Lookup
|
||||
├─ Port mapper external addresses
|
||||
│
|
||||
└─ Path migration: relay → direct (if possible)
|
||||
```
|
||||
|
||||
## Data Flow: Accepting Connections
|
||||
|
||||
```
|
||||
Endpoint::accept() → Accept<'_>
|
||||
│
|
||||
▼ (incoming QUIC packet arrives on any transport)
|
||||
│
|
||||
noq::Endpoint::accept()
|
||||
│
|
||||
▼
|
||||
Incoming
|
||||
│
|
||||
├─ incoming.remote_addr() → IncomingAddr (Ip/Relay/Custom)
|
||||
├─ incoming.remote_addr_validated() → bool
|
||||
├─ incoming.accept() → Accepting
|
||||
├─ incoming.refuse() → reject
|
||||
├─ incoming.retry() → QUIC retry (address validation)
|
||||
└─ incoming.ignore() → drop silently
|
||||
│
|
||||
Accepting
|
||||
│
|
||||
├─ accepting.alpn().await → alpn bytes
|
||||
├─ accepting.into_0rtt() → (OutgoingZeroRtt, Connection) [optional]
|
||||
└─ accepting.await → Connection
|
||||
```
|
||||
|
||||
## Data Flow: Router Accept Loop
|
||||
|
||||
```
|
||||
Router::spawn()
|
||||
│
|
||||
├─ endpoint.set_alpns(registered_alpns)
|
||||
│
|
||||
└─ Loop:
|
||||
│
|
||||
├─ endpoint.accept().await → Incoming
|
||||
│ │
|
||||
│ ├─ Apply incoming_filter (optional)
|
||||
│ │ ├─ Accept → continue
|
||||
│ │ ├─ Retry → incoming.retry()
|
||||
│ │ ├─ Reject → incoming.refuse()
|
||||
│ │ └─ Ignore → incoming.ignore()
|
||||
│ │
|
||||
│ ├─ incoming.accept() → Accepting
|
||||
│ ├─ accepting.alpn().await → determine ALPN
|
||||
│ │
|
||||
│ └─ protocols.get(alpn) → handler
|
||||
│ │
|
||||
│ ├─ handler.on_accepting(accepting).await
|
||||
│ └─ handler.accept(connection).await
|
||||
│
|
||||
└─ On shutdown:
|
||||
├─ protocols.shutdown().await
|
||||
├─ handler_cancel_token.cancel()
|
||||
└─ endpoint.close().await
|
||||
```
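The core of the accept loop is a lookup from ALPN bytes to a handler. The sketch below models only that dispatch step, synchronously and without iroh. `Router`, `Handler`, `accept`, `dispatch`, and `echo` here are simplified, hypothetical stand-ins, not the real `iroh::protocol` API.

```rust
use std::collections::HashMap;

// A handler takes a peer label and returns a response; real handlers are async.
type Handler = fn(&str) -> String;

struct Router {
    protocols: HashMap<Vec<u8>, Handler>,
}

impl Router {
    fn new() -> Self {
        Router { protocols: HashMap::new() }
    }

    /// Register a handler for an ALPN, builder style.
    fn accept(mut self, alpn: &[u8], handler: Handler) -> Self {
        self.protocols.insert(alpn.to_vec(), handler);
        self
    }

    /// Route one incoming connection; `None` means no handler for this ALPN.
    fn dispatch(&self, alpn: &[u8], peer: &str) -> Option<String> {
        self.protocols.get(alpn).map(|handler| handler(peer))
    }
}

fn echo(peer: &str) -> String {
    format!("echo:{peer}")
}

fn main() {
    let router = Router::new().accept(b"echo/1", echo);
    println!("{:?}", router.dispatch(b"echo/1", "peer-a"));
    println!("{:?}", router.dispatch(b"unknown/1", "peer-a"));
}
```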
|
||||
|
||||
## Actor Model: Per-Remote State
|
||||
|
||||
Each remote peer gets a `RemoteStateActor` that manages the connection state:
|
||||
|
||||
```
|
||||
┌───────────────────────────────────────────────┐
|
||||
│ RemoteStateActor │
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌─────────────────┐ │
|
||||
│ │ Address │ │ Connection │ │
|
||||
│ │ Lookup │ │ Tracker │ │
|
||||
│ │ Resolution │ │ │ │
|
||||
│ └──────┬──────┘ └────────┬────────┘ │
|
||||
│ │ │ │
|
||||
│ ▼ ▼ │
|
||||
│ ┌──────────────────────────────────┐ │
|
||||
│ │ Path Selection │ │
|
||||
│ │ ┌────────┐ ┌────────┐ │ │
|
||||
│ │ │ IPv4 │ │ IPv6 │ │ │
|
||||
│ │ │primary │ │primary │ │ │
|
||||
│ │ └────────┘ └────────┘ │ │
|
||||
│ │ ┌────────┐ ┌────────┐ │ │
|
||||
│ │ │ Relay │ │Custom │ │ │
|
||||
│ │ │backup │ │primary │ │ │
|
||||
│ │ └────────┘ └────────┘ │ │
|
||||
│ └──────────────────────────────────┘ │
|
||||
│ │
|
||||
│ ┌──────────────────────────────────┐ │
|
||||
│ │ Mapped Addresses │ │
|
||||
│ │ EndpointId → MappedIPv6Addr │ │
|
||||
│ │ (RelayUrl, EndpointId) → Addr │ │
|
||||
│ │ CustomAddr → MappedIPv6Addr │ │
|
||||
│ └──────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Messages: │
|
||||
│ ├─ ResolveRemote(EndpointAddr, reply) │
|
||||
│ ├─ AddConnection(EndpointId, WeakConn, reply)│
|
||||
│ └─ RemoteInfo(reply) │
|
||||
└───────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Data Flow: Socket Actor
|
||||
|
||||
The `Actor` in `Socket` runs as a background task handling network changes:
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────┐
|
||||
│ Socket Actor │
|
||||
│ │
|
||||
│ ┌──────────────────┐ ┌─────────────────┐ │
|
||||
│ │ Network Monitor │ │ Direct Addr │ │
|
||||
│ │ (netwatch) │ │ Update State │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ Detects: │ │ Manages: │ │
|
||||
│ │ - Interface up/down│ │ - NetReport runs │ │
|
||||
│ │ - Address changes │ │ - Port mapper │ │
|
||||
│ │ - Route changes │ │ - Direct addrs │ │
|
||||
│ └────────┬─────────┘ └────────┬──────────┘ │
|
||||
│ │ │ │
|
||||
│ ▼ ▼ │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ Triggers │ │
|
||||
│ │ - NetworkChange (major/minor) │ │
|
||||
│ │ - PeriodicReStun (every 30s-5min) │ │
|
||||
│ │ - PortmapUpdated │ │
|
||||
│ │ - RelayMapChange │ │
|
||||
│ │ - DirectAddrRefresh │ │
|
||||
│ │ - ResolveRemote (from connect) │ │
|
||||
│ │ - AddConnection (from new QUIC conn) │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ On address change: │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ 1. Run net_report to discover external addrs │ │
|
||||
│ │ 2. Update direct_addrs watchable │ │
|
||||
│ │ 3. Publish new addresses to AddressLookup │ │
|
||||
│ │ 4. Notify noq of network changes │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
└────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Shutdown Sequence
|
||||
|
||||
```
|
||||
Endpoint::close()
|
||||
│
|
||||
├─ Cancel at_close_start token
|
||||
│ (stops net_reports, address lookups)
|
||||
│
|
||||
├─ Clear address_lookup services
|
||||
│
|
||||
├─ noq_endpoint.close(0, b"")
|
||||
│ (refuses new connections, starts close for existing)
|
||||
│
|
||||
├─ noq_endpoint.wait_idle().await
|
||||
│ (waits for close frames to be acknowledged)
|
||||
│
|
||||
├─ Cancel at_endpoint_closed token
|
||||
│
|
||||
├─ Wait for actor task (100ms timeout, then abort)
|
||||
│
|
||||
└─ runtime.shutdown().await
|
||||
(waits for all spawned tasks)
|
||||
```
|
||||
|
||||
## WASM/Browser Differences
|
||||
|
||||
When compiled to `wasm32-unknown-unknown`:
|
||||
|
||||
| Feature | Native | WASM/Browser |
|
||||
|---------|--------|-------------|
|
||||
| IP transports | Yes (IPv4 + IPv6) | No (no socket access) |
|
||||
| DNS resolution | `DnsAddressLookup` (system DNS) | `PkarrResolver` (HTTP) |
|
||||
| Network monitoring | `netwatch` (interface changes) | Not available |
|
||||
| Port mapping | UPnP/PCP/NAT-PMP | Not available |
|
||||
| Net report | Full (QAD, HTTPS probes) | Limited |
|
||||
| Runtime | Tokio | `wasm-bindgen-futures` |
|
||||
| Timer | Tokio timer | `web::Timer` wrapping `sleep_until` |
|
||||
|
||||
## Thread Safety & Concurrency
|
||||
|
||||
- `Endpoint` is `Clone` (wraps `Arc<EndpointInner>`)
|
||||
- `Socket` is `Arc<Socket>` — shared across all connections
|
||||
- `RemoteMap` uses `ConcurrentReadMap` — lock-free reads for hot path
|
||||
- `AddressLookupServices` uses `RwLock` — infrequent writes, frequent reads
|
||||
- `DirectAddrs` uses `Watchable` — publishes changes to watchers
|
||||
- `HomeRelayWatch` uses `n0_watcher::Direct` — efficient change notification
|
||||
|
||||
## Error Handling Patterns
|
||||
|
||||
Iroh uses the `n0_error::stack_error` macro for rich error chains:
|
||||
|
||||
```rust
|
||||
#[stack_error(derive, add_meta, from_sources)]
|
||||
pub enum ConnectError {
|
||||
#[error(transparent)]
|
||||
Connect { source: ConnectWithOptsError },
|
||||
#[error(transparent)]
|
||||
Connecting { source: ConnectingError },
|
||||
#[error(transparent)]
|
||||
Connection { source: ConnectionError },
|
||||
}
|
||||
|
||||
// Usage:
|
||||
// ConnectError::Connect { source: ConnectWithOptsError::SelfConnect }
|
||||
// ConnectError::Connecting { source: ConnectingError::AuthenticationError { .. } }
|
||||
```
|
||||
|
||||
## Key Constants & Timeouts
|
||||
|
||||
| Constant | Value | Purpose |
|
||||
|----------|-------|---------|
|
||||
| `HEARTBEAT_INTERVAL` | 5s | Keepalive PING interval |
|
||||
| `PATH_MAX_IDLE_TIMEOUT` | 15s | Max idle before closing direct path |
|
||||
| `RELAY_PATH_MAX_IDLE_TIMEOUT` | 30s | Max idle before closing relay path |
|
||||
| `MAX_MULTIPATH_PATHS` | 12 | Max concurrent paths per connection |
|
||||
| `DEFAULT_MAX_TLS_TICKETS` | 256 (8×32) | TLS session ticket cache size |
|
||||
| `NET_REPORT_TIMEOUT` | 10s | Max time for net report |
|
||||
| `FULL_REPORT_INTERVAL` | 5min | Time between full net reports |
|
||||
| `DEFAULT_RELAY_QUIC_PORT` | 3478 | QAD port on relay servers |
|
||||
@@ -0,0 +1,108 @@
|
||||
# irpc: Overview and Architecture
|
||||
|
||||
## What is irpc?
|
||||
|
||||
`irpc` is a **streaming RPC system** built for [iroh](https://docs.rs/iroh) and [noq](https://docs.rs/noq) (QUIC-based transports). It provides a framework for defining RPC protocols in Rust that work identically whether the communication is **in-process** (via tokio channels) or **cross-process/cross-network** (via QUIC streams).
|
||||
|
||||
**Key design goals:**
|
||||
|
||||
1. **Zero-overhead local use** — When used in-process, irpc should be as lightweight as raw tokio channels, replacing the common pattern of a giant `enum` over an `mpsc` channel with typed backchannels.
|
||||
2. **Transparent local/remote abstraction** — The same protocol definition and client API works for both in-process and remote communication.
|
||||
3. **Streaming-first** — Full support for unary RPC, server streaming, client streaming, and bidirectional streaming interaction patterns.
|
||||
4. **QUIC-native** — Does not abstract over stream types; directly uses noq/iroh QUIC streams, enabling per-request stream tuning (priorities, etc.).
|
||||
|
||||
**Non-goals:**
|
||||
|
||||
- Cross-language interop (Rust-to-Rust only)
|
||||
- Versioning (users must handle this themselves)
|
||||
- Making remote calls look like local async function calls
|
||||
- Runtime agnosticism (tokio only)
|
||||
|
||||
## Crate Structure
|
||||
|
||||
```
|
||||
irpc/
|
||||
├── src/lib.rs # Core library: traits, channels, Client, RPC module
|
||||
├── src/util.rs # Varint utilities, noq endpoint setup helpers
|
||||
├── src/tests.rs # Channel filter/map tests
|
||||
├── irpc-derive/ # Procedural macro crate (rpc_requests)
|
||||
├── irpc-iroh/ # Iroh transport integration
|
||||
├── examples/ # Working examples (storage, compute, derive, local)
|
||||
└── tests/ # Integration tests (channels, derive)
|
||||
```
|
||||
|
||||
### Features
|
||||
|
||||
| Feature | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `rpc` | ✅ | Enables remote RPC (noq transport, postcard serialization) |
|
||||
| `derive` | ✅ | Enables the `#[rpc_requests]` macro |
|
||||
| `spans` | ✅ | Preserves tracing spans across message passing |
|
||||
| `stream` | ✅ | Enables `into_stream()` on mpsc receivers |
|
||||
| `noq_endpoint_setup` | ✅ | Utilities to create noq endpoints (testing, localhost) |
|
||||
| `varint-util` | ❌ | Varint read/write utilities without full RPC |
|
||||
|
||||
## High-Level Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Application │
|
||||
│ │
|
||||
│ ┌──────────┐ ┌───────────┐ ┌───────────┐ │
|
||||
│ │ Client │─────│ Protocol │─────│ Actor/ │ │
|
||||
│ │<S> │ │ Enum (S) │ │ Handler │ │
|
||||
│ └────┬─────┘ └───────────┘ └─────┬─────┘ │
|
||||
│ │ │ │
|
||||
│ ┌────▼─────────────────────────────────────▼─────┐ │
|
||||
│ │ WithChannels<I, S> │ │
|
||||
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌─────┐ │ │
|
||||
│ │ │ inner │ │ tx │ │ rx │ │span │ │ │
|
||||
│ │ │ (I) │ │(Sender)│ │(Recv) │ │ │ │ │
|
||||
│ │ └────────┘ └────────┘ └────────┘ └─────┘ │ │
|
||||
│ └────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ ┌────────────────────┐ ┌─────────────────────────┐ │
|
||||
│ │ Local Path │ │ Remote Path (rpc feat) │ │
|
||||
│ │ tokio::mpsc │ │ noq QUIC streams │ │
|
||||
│ │ tokio::oneshot │ │ postcard serialization │ │
|
||||
│ └────────────────────┘ └─────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Core Flow
|
||||
|
||||
1. **Define a protocol** — An enum where each variant represents an RPC method, annotated with `#[rpc(tx=..., rx=...)]`.
|
||||
2. **The `rpc_requests` macro** generates:
|
||||
- `Channels<S>` impl for each request type
|
||||
- A message enum wrapping each request in `WithChannels<I, S>`
|
||||
- `Service` and `RemoteService` trait implementations
|
||||
- `From` conversions between request types, protocol enum, and message enum
|
||||
3. **Client sends messages** — `Client<S>` either sends over a local `mpsc` channel or serializes and sends over a QUIC stream.
|
||||
4. **Actor/handler processes messages** — Matches on the message enum, extracts `WithChannels { inner, tx, rx, .. }`, and uses `tx`/`rx` to communicate back.
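
The four steps above can be sketched without irpc at all. The following is a minimal, hand-written analogue that uses only `std` threads and channels; the names `StorageMessage`, `spawn_actor`, `get`, and `set` are illustrative assumptions, and a plain `std::sync::mpsc` channel stands in for the oneshot response channel that irpc would generate.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Request types (step 1: one per RPC method).
struct Get { key: String }
struct Set { key: String, value: String }

/// Hand-written analogue of a generated message enum (step 2): each variant
/// pairs a request with the channel used to answer it.
enum StorageMessage {
    Get(Get, mpsc::Sender<Option<String>>),
    Set(Set, mpsc::Sender<()>),
}

/// Step 4: the actor owns the state and handles one message at a time.
fn spawn_actor() -> mpsc::Sender<StorageMessage> {
    let (tx, rx) = mpsc::channel::<StorageMessage>();
    thread::spawn(move || {
        let mut store: HashMap<String, String> = HashMap::new();
        // The loop ends once every sender has been dropped.
        for msg in rx {
            match msg {
                StorageMessage::Get(req, reply) => {
                    let _ = reply.send(store.get(&req.key).cloned());
                }
                StorageMessage::Set(req, reply) => {
                    store.insert(req.key, req.value);
                    let _ = reply.send(());
                }
            }
        }
    });
    tx
}

/// Step 3: create the reply channel, send the message, wait for the answer.
fn get(actor: &mpsc::Sender<StorageMessage>, key: &str) -> Option<String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    actor.send(StorageMessage::Get(Get { key: key.to_string() }, reply_tx)).ok()?;
    reply_rx.recv().ok()?
}

fn set(actor: &mpsc::Sender<StorageMessage>, key: &str, value: &str) -> bool {
    let (reply_tx, reply_rx) = mpsc::channel();
    let req = Set { key: key.to_string(), value: value.to_string() };
    actor.send(StorageMessage::Set(req, reply_tx)).is_ok() && reply_rx.recv().is_ok()
}
```

In irpc the message enum and the conversions are generated by the macro, and the same client call can also travel over a QUIC stream; this sketch only covers the local path.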
|
||||
|
||||
## Dependency Graph
|
||||
|
||||
```
|
||||
irpc (core)
|
||||
├── serde (always)
|
||||
├── tokio (sync, macros)
|
||||
├── tokio-util
|
||||
├── n0-error
|
||||
├── n0-future
|
||||
├── postcard (rpc feature)
|
||||
├── noq (rpc feature)
|
||||
├── smallvec (rpc feature)
|
||||
├── tracing (spans feature)
|
||||
└── irpc-derive (derive feature)
|
||||
|
||||
irpc-iroh
|
||||
├── irpc
|
||||
├── iroh
|
||||
├── iroh-base
|
||||
├── postcard
|
||||
└── n0-error, n0-future, tokio, tracing, serde
|
||||
```
|
||||
|
||||
## License
|
||||
|
||||
Dual-licensed: Apache-2.0 OR MIT
|
||||
239
docs/research/references/iroh/irpc/02-types-and-traits.md
Normal file
@@ -0,0 +1,239 @@
|
||||
# irpc: Key Types and Traits
|
||||
|
||||
## Core Traits
|
||||
|
||||
### `RpcMessage`
|
||||
|
||||
```rust
|
||||
pub trait RpcMessage: Debug + Serialize + DeserializeOwned + Send + Sync + Unpin + 'static {}
|
||||
```
|
||||
|
||||
A blanket trait implemented for all types that satisfy the bounds. Every message sent through irpc (both local and remote) must implement this. The `Serialize + DeserializeOwned` requirement exists even without the `rpc` feature because the same protocol definition should work in both modes.
|
||||
|
||||
### `Service`
|
||||
|
||||
```rust
|
||||
pub trait Service: Serialize + DeserializeOwned + Send + Sync + Debug + 'static {
|
||||
type Message: Send + Unpin + 'static;
|
||||
}
|
||||
```
|
||||
|
||||
Implemented on the **protocol enum** (e.g., `StorageProtocol`). The `Message` associated type is the **message enum** — an enum with identical variant names but whose single field is `WithChannels<InnerType, Self>`.
|
||||
|
||||
The `Service` trait acts as a **scope** for channel type definitions, allowing the same inner request type to be used with multiple services.
|
||||
|
||||
### `Channels<S>`
|
||||
|
||||
```rust
|
||||
pub trait Channels<S: Service>: Send + 'static {
|
||||
type Tx: Sender;
|
||||
type Rx: Receiver;
|
||||
}
|
||||
```
|
||||
|
||||
Implemented on each **request type** (e.g., `Get`, `Set`). Specifies what kind of channels accompany that request when sent through service `S`. The `Tx` type is the response channel (server → client); the `Rx` type is the update channel (client → server).
|
||||
|
||||
### `Sender` and `Receiver`
|
||||
|
||||
```rust
|
||||
pub trait Sender: Debug + Sealed {}
|
||||
pub trait Receiver: Debug + Sealed {}
|
||||
```
|
||||
|
||||
Sealed marker traits. Only the types in `irpc::channel` implement these: `oneshot::Sender`, `oneshot::Receiver`, `mpsc::Sender`, `mpsc::Receiver`, `NoSender`, `NoReceiver`.
|
||||
|
||||
### `RemoteService` (rpc feature)
|
||||
|
||||
```rust
|
||||
pub trait RemoteService: Service + Sized {
|
||||
fn with_remote_channels(self, rx: noq::RecvStream, tx: noq::SendStream) -> Self::Message;
|
||||
|
||||
fn remote_handler(local_sender: LocalSender<Self>) -> Handler<Self> {
|
||||
// Default: convert deserialized protocol enum + streams → Message, send to local sender
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Implemented on the protocol enum. Maps a deserialized protocol variant + a pair of QUIC streams into a `WithChannels` message, which is then forwarded to the local actor.
|
||||
|
||||
### `RemoteConnection` (rpc feature)
|
||||
|
||||
```rust
|
||||
pub trait RemoteConnection: Send + Sync + Debug + 'static {
|
||||
fn clone_boxed(&self) -> Box<dyn RemoteConnection>;
|
||||
fn open_bi(&self) -> BoxFuture<Result<(noq::SendStream, noq::RecvStream), RequestError>>;
|
||||
fn zero_rtt_accepted(&self) -> BoxFuture<bool>;
|
||||
}
|
||||
```
|
||||
|
||||
Abstraction over how to open a bidirectional QUIC stream. Implemented for:
|
||||
- `noq::Connection` — direct noq connection
|
||||
- `NoqLazyRemoteConnection` — lazy connection that caches the underlying QUIC connection
|
||||
- `IrohRemoteConnection` — iroh connection (in `irpc-iroh`)
|
||||
- `IrohLazyRemoteConnection` — lazy iroh connection (in `irpc-iroh`)
|
||||
- `IrohZrttRemoteConnection` — 0-RTT iroh connection (in `irpc-iroh`)
|
||||
|
||||
## Key Structs
|
||||
|
||||
### `WithChannels<I, S>`
|
||||
|
||||
```rust
|
||||
pub struct WithChannels<I: Channels<S>, S: Service> {
|
||||
pub inner: I,
|
||||
pub tx: <I as Channels<S>>::Tx,
|
||||
pub rx: <I as Channels<S>>::Rx,
|
||||
#[cfg(feature = "spans")]
|
||||
pub span: tracing::Span,
|
||||
}
|
||||
```
|
||||
|
||||
The central message wrapper. Wraps a request type `I` with its typed channels for service `S`. Implements `Deref` to `I` for convenient field access.
|
||||
|
||||
**Construction** via tuple conversions:
|
||||
- `(inner, tx, rx)` → full channels
|
||||
- `(inner, tx)` → when `Rx = NoReceiver` (most common for RPC/server-streaming)
|
||||
- `(inner,)` → when `Tx = NoSender, Rx = NoReceiver` (notify)
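
To illustrate why the two-element tuple conversion only exists when `Rx = NoReceiver`, here is a simplified model of the pattern. The traits below are stripped-down stand-ins for the real `Service` and `Channels` (all bounds omitted), `std::sync::mpsc` replaces the irpc channel types, and `Storage`, `Get`, and `demo` are hypothetical names.

```rust
use std::ops::Deref;
use std::sync::mpsc;

// Stripped-down stand-ins for the traits described above.
trait Service {}
trait Channels<S: Service> {
    type Tx;
    type Rx;
}

struct NoReceiver;

struct WithChannels<I: Channels<S>, S: Service> {
    inner: I,
    tx: <I as Channels<S>>::Tx,
    rx: <I as Channels<S>>::Rx,
}

// Deref to the inner request gives direct access to its fields.
impl<I: Channels<S>, S: Service> Deref for WithChannels<I, S> {
    type Target = I;
    fn deref(&self) -> &I {
        &self.inner
    }
}

// `(inner, tx)` conversion: the `Rx = NoReceiver` bound is what lets the
// impl fill in the missing receiver on its own.
impl<I, S> From<(I, <I as Channels<S>>::Tx)> for WithChannels<I, S>
where
    I: Channels<S, Rx = NoReceiver>,
    S: Service,
{
    fn from((inner, tx): (I, <I as Channels<S>>::Tx)) -> Self {
        WithChannels { inner, tx, rx: NoReceiver }
    }
}

// A hypothetical service with one request type.
struct Storage;
impl Service for Storage {}

struct Get { key: String }
impl Channels<Storage> for Get {
    type Tx = mpsc::Sender<Option<String>>;
    type Rx = NoReceiver;
}

fn demo() -> (String, Option<String>) {
    let (tx, rx) = mpsc::channel::<Option<String>>();
    let msg = WithChannels::<Get, Storage>::from((Get { key: "a".to_string() }, tx));
    let key = msg.key.clone(); // field of `Get`, reached through Deref
    msg.tx.send(Some("1".to_string())).unwrap();
    let _unused: &NoReceiver = &msg.rx;
    (key, rx.recv().unwrap())
}
```

Because the channel types are associated types keyed on the service `S`, the same request type could carry different channels under a different service.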
|
||||
|
||||
### `Client<S>`
|
||||
|
||||
```rust
|
||||
#[derive(Debug)]
|
||||
pub struct Client<S: Service>(ClientInner<S::Message>, PhantomData<S>);
|
||||
```
|
||||
|
||||
The primary client type. Generic over a service `S`. Can be either local or remote.
|
||||
|
||||
**Construction:**
|
||||
- `Client::local(mpsc_sender)` — from a tokio mpsc sender
|
||||
- `Client::noq(endpoint, addr)` — from a noq endpoint + address (rpc feature)
|
||||
- `Client::boxed(remote_connection)` — from any `RemoteConnection` impl
|
||||
|
||||
**Key methods** (all handle both local and remote transparently):
|
||||
|
||||
| Method | Pattern | Tx Type | Rx Type |
|
||||
|---|---|---|---|
|
||||
| `rpc()` | Unary RPC | `oneshot::Sender<Res>` | `NoReceiver` |
|
||||
| `server_streaming()` | Server streaming | `mpsc::Sender<Res>` | `NoReceiver` |
|
||||
| `client_streaming()` | Client streaming | `oneshot::Sender<Res>` | `mpsc::Receiver<Update>` |
|
||||
| `bidi_streaming()` | Bidirectional | `mpsc::Sender<Res>` | `mpsc::Receiver<Update>` |
|
||||
| `notify()` | Fire-and-forget | `NoSender` | `NoReceiver` |
|
||||
| `rpc_0rtt()` | 0-RTT unary | `oneshot::Sender<Res>` | `NoReceiver` |
|
||||
| `server_streaming_0rtt()` | 0-RTT server streaming | `mpsc::Sender<Res>` | `NoReceiver` |
|
||||
| `notify_0rtt()` | 0-RTT fire-and-forget | `NoSender` | `NoReceiver` |
|
||||
|
||||
Each method creates the appropriate channel pair, wraps the message into `WithChannels`, and sends it.
|
||||
|
||||
### `LocalSender<S>`
|
||||
|
||||
```rust
|
||||
#[repr(transparent)]
|
||||
pub struct LocalSender<S: Service>(crate::channel::mpsc::Sender<S::Message>);
|
||||
```
|
||||
|
||||
A thin wrapper around `mpsc::Sender<S::Message>` for sending messages to a local actor. Provides:
|
||||
|
||||
```rust
|
||||
impl<S: Service> LocalSender<S> {
|
||||
pub fn send<T>(&self, value: impl Into<WithChannels<T, S>>) -> impl Future<Output = Result<(), SendError>>
|
||||
where
|
||||
T: Channels<S>,
|
||||
S::Message: From<WithChannels<T, S>>;
|
||||
|
||||
pub fn send_raw(&self, value: S::Message) -> impl Future<Output = Result<(), SendError>>;
|
||||
}
|
||||
```
|
||||
|
||||
### `Request<L, R>`
|
||||
|
||||
```rust
|
||||
pub enum Request<L, R> {
|
||||
Local(L),
|
||||
Remote(R),
|
||||
}
|
||||
```
|
||||
|
||||
A generic enum distinguishing local vs remote requests. `Client::request()` returns `Request<LocalSender<S>, RemoteSender<S>>`.
|
||||
|
||||
### `RemoteSender<S>` (rpc feature)
|
||||
|
||||
```rust
|
||||
pub struct RemoteSender<S>(noq::SendStream, noq::RecvStream, PhantomData<S>);
|
||||
```
|
||||
|
||||
Holds a QUIC stream pair after opening a bidirectional stream. The `write()` method serializes the protocol message with postcard + varint length prefix and sends it over the send stream.
|
||||
|
||||
### `Handler<R>` (rpc feature)
|
||||
|
||||
```rust
|
||||
pub type Handler<R> = Arc<
|
||||
dyn Fn(R, noq::RecvStream, noq::SendStream) -> BoxFuture<Result<(), SendError>>
|
||||
+ Send + Sync + 'static,
|
||||
>;
|
||||
```
|
||||
|
||||
A shared handler function that processes incoming remote requests. Typically created via `Protocol::remote_handler(local_sender)`.
|
||||
|
||||
## Error Types
|
||||
|
||||
### `RequestError`
|
||||
|
||||
```rust
|
||||
pub enum RequestError {
|
||||
Connect { source: noq::ConnectError }, // Connection establishment failed
|
||||
Connection { source: noq::ConnectionError }, // Stream open failed
|
||||
Other { source: AnyError }, // Generic error for non-noq transports
|
||||
}
|
||||
```
|
||||
|
||||
### `SendError` (in `channel` module)
|
||||
|
||||
```rust
|
||||
pub enum SendError {
|
||||
ReceiverClosed, // Local: receiver dropped
|
||||
MaxMessageSizeExceeded, // Remote: message > 16 MiB
|
||||
Io { source: io::Error }, // Remote: network/serialization error
|
||||
}
|
||||
```
|
||||
|
||||
### `RecvError` (oneshot and mpsc variants)
|
||||
|
||||
```rust
|
||||
// oneshot::RecvError
|
||||
pub enum RecvError {
|
||||
SenderClosed, // Local: sender dropped
|
||||
MaxMessageSizeExceeded, // Remote: message > 16 MiB
|
||||
Io { source: io::Error }, // Remote: network/deserialization error
|
||||
}
|
||||
|
||||
// mpsc::RecvError
|
||||
pub enum RecvError {
|
||||
MaxMessageSizeExceeded, // Remote: message > 16 MiB
|
||||
Io { source: io::Error }, // Remote: network/deserialization error
|
||||
}
|
||||
```
|
||||
|
||||
Note: `mpsc::RecvError` does **not** have `SenderClosed` — mpsc receivers return `Ok(None)` when the sender is dropped.
|
||||
|
||||
### `WriteError` (rpc feature)
|
||||
|
||||
```rust
|
||||
pub enum WriteError {
|
||||
Noq { source: noq::WriteError }, // QUIC stream write error
|
||||
MaxMessageSizeExceeded, // Message > 16 MiB
|
||||
Io { source: io::Error }, // Serialization error
|
||||
}
|
||||
```
|
||||
|
||||
### `Error` (top-level umbrella)
|
||||
|
||||
```rust
|
||||
pub enum Error {
|
||||
Request { source: RequestError },
|
||||
Send { source: SendError },
|
||||
MpscRecv { source: mpsc::RecvError },
|
||||
OneshotRecv { source: oneshot::RecvError },
|
||||
Write { source: rpc::WriteError }, // rpc feature only
|
||||
}
|
||||
```
|
||||
|
||||
Each of these error types converts into `io::Error` (`io::Error` implements `From` for them), so they can be propagated with `?` in functions that return `io::Result`.
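
As a sketch of how such a conversion lets `?` work in an `io::Result` context, the example below defines a hypothetical `DemoSendError` rather than using the real irpc enums. The mapping from variants to `io::ErrorKind` values is an assumption made for illustration, not taken from the crate.

```rust
use std::fmt;
use std::io;

/// Hypothetical error type standing in for one of the irpc error enums.
#[derive(Debug)]
enum DemoSendError {
    ReceiverClosed,
    MaxMessageSizeExceeded,
}

impl fmt::Display for DemoSendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoSendError::ReceiverClosed => write!(f, "receiver closed"),
            DemoSendError::MaxMessageSizeExceeded => write!(f, "max message size exceeded"),
        }
    }
}

impl std::error::Error for DemoSendError {}

// The conversion that makes `?` work below. The chosen kinds are assumptions.
impl From<DemoSendError> for io::Error {
    fn from(e: DemoSendError) -> Self {
        let kind = match &e {
            DemoSendError::ReceiverClosed => io::ErrorKind::BrokenPipe,
            DemoSendError::MaxMessageSizeExceeded => io::ErrorKind::InvalidData,
        };
        io::Error::new(kind, e)
    }
}

/// Pretend send: fails if the receiver is gone or the message is too large.
fn try_send(receiver_closed: bool, len: usize) -> Result<(), DemoSendError> {
    if receiver_closed {
        return Err(DemoSendError::ReceiverClosed);
    }
    if len > 16 * 1024 * 1024 {
        return Err(DemoSendError::MaxMessageSizeExceeded);
    }
    Ok(())
}

/// A function in "io::Result land": `?` converts the error via `From`.
fn write_message(receiver_closed: bool, len: usize) -> io::Result<()> {
    try_send(receiver_closed, len)?;
    Ok(())
}
```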
|
||||
168
docs/research/references/iroh/irpc/03-channel-system.md
Normal file
@@ -0,0 +1,168 @@
|
||||
# irpc: Channel System
|
||||
|
||||
The channel system is the heart of irpc. It provides channel types that abstract over local (tokio) and remote (QUIC stream) communication, with the same API surface regardless of transport.
|
||||
|
||||
## Channel Kinds
|
||||
|
||||
irpc provides three kinds of channels, each with local and remote variants:
|
||||
|
||||
### Oneshot Channels (`channel::oneshot`)
|
||||
|
||||
Single-value, single-use channels for RPC responses.
|
||||
|
||||
| Type | Local Backend | Remote Backend |
|
||||
|---|---|---|
|
||||
| `oneshot::Sender<T>` | `tokio::sync::oneshot::Sender` | `BoxedSender<T>` (FnOnce over QUIC write) |
|
||||
| `oneshot::Receiver<T>` | `FusedOneshotReceiver<T>` | `BoxedReceiver<T>` (boxed future over QUIC read) |
|
||||
|
||||
**Creation:** `oneshot::channel::<T>()` returns `(Sender<T>, Receiver<T>)`
|
||||
|
||||
**Sender behavior:**
|
||||
- Local: `send(value)` is effectively synchronous (it never waits on I/O) and fails only if the receiver was dropped
|
||||
- Remote: `send(value)` is async — serializes with postcard, length-prefixes with varint, writes to QUIC stream
|
||||
|
||||
**Receiver behavior:**
|
||||
- Implements `Future<Output = Result<T, RecvError>>`
|
||||
- Local: resolves to the value or `SenderClosed` error
|
||||
- Remote: reads varint length prefix, reads that many bytes, deserializes with postcard
|
||||
|
||||
**Filtering/Mapping** (on `Sender<T>` where `T: Send + Sync + 'static`):
|
||||
```rust
|
||||
sender.with_filter(|v| v > 0) // Drop messages failing predicate
|
||||
sender.with_map(|v: U| v.into()) // Transform before sending
|
||||
sender.with_filter_map(|v| ...) // Combined filter + map
|
||||
```
|
||||
|
||||
### MPSC Channels (`channel::mpsc`)
|
||||
|
||||
Multi-producer, single-consumer streaming channels for server-streaming, client-streaming, and bidirectional patterns.
|
||||
|
||||
| Type | Local Backend | Remote Backend |
|
||||
|---|---|---|
|
||||
| `mpsc::Sender<T>` | `tokio::sync::mpsc::Sender` | `Arc<DynSender<T>>` (NoqSender) |
|
||||
| `mpsc::Receiver<T>` | `tokio::sync::mpsc::Receiver` | `Box<dyn DynReceiver<T>>` (NoqReceiver) |
|
||||
|
||||
**Creation:** `mpsc::channel::<T>(buffer)` returns `(Sender<T>, Receiver<T>)`
|
||||
|
||||
**Sender behavior:**
|
||||
- `send(value).await` — sends, yielding if full (remote: serializes + writes to stream)
|
||||
- `try_send(value).await` — non-blocking attempt; returns `Ok(false)` if would block
|
||||
- `closed().await` — waits until all receivers are dropped
|
||||
- `is_rpc()` — returns `true` for remote senders
|
||||
|
||||
**Receiver behavior:**
|
||||
- `recv().await` → `Result<Option<T>, RecvError>` — `None` means sender closed/cleanly finished
|
||||
- `filter(pred)`, `map(fn)`, `filter_map(fn)` — chainable transformations
|
||||
- `into_stream()` (with `stream` feature) — converts to `Stream<Item = Result<T, RecvError>>`
|
||||
|
||||
**Cloning:** `mpsc::Sender<T>` implements `Clone`. Local senders clone the underlying tokio sender; remote senders clone the `Arc`.
|
||||
|
||||
### None Channels (`channel::none`)
|
||||
|
||||
Placeholder channels for when no communication is needed.
|
||||
|
||||
```rust
|
||||
pub struct NoSender; // Implements Sender, does nothing
|
||||
pub struct NoReceiver; // Implements Receiver, does nothing
|
||||
```
|
||||
|
||||
Used as defaults when `#[rpc(tx=...)]` or `#[rpc(rx=...)]` are omitted.
|
||||
|
||||
## Remote Channel Internals
|
||||
|
||||
### NoqSender<T>
|
||||
|
||||
```rust
|
||||
struct NoqSender<T>(tokio::sync::Mutex<NoqSenderState<T>>);
|
||||
|
||||
enum NoqSenderState<T> {
|
||||
Open(NoqSenderInner<T>),
|
||||
Closed,
|
||||
}
|
||||
|
||||
struct NoqSenderInner<T> {
|
||||
send: noq::SendStream,
|
||||
buffer: SmallVec<[u8; 128]>, // Stack-allocated buffer for small messages
|
||||
_marker: PhantomData<T>,
|
||||
}
|
||||
```
|
||||
|
||||
Key behaviors:
|
||||
- **Mutex-protected state**: The inner state is `Mutex`-protected because `DynSender::send()` takes `&self`. When a send fails, the state transitions to `Closed` and all subsequent sends return `BrokenPipe`.
|
||||
- **Buffer reuse**: Uses `SmallVec<[u8; 128]>` to avoid heap allocation for messages that serialize to ≤128 bytes.
|
||||
- **Serialization**: Each message is postcard-serialized with a varint length prefix. If serialization exceeds `MAX_MESSAGE_SIZE` (16 MiB), the stream is reset with error code `ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED` (1).
|
||||
- **Serialization errors**: If postcard serialization fails, the stream is reset with `ERROR_CODE_INVALID_POSTCARD` (2).
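
The Open/Closed transition described above can be modelled in a few lines. This is a synchronous sketch under stated assumptions: `FakeStream` stands in for `noq::SendStream`, `std::sync::Mutex` replaces the tokio mutex, and `StickySender` is a made-up name. It is not the crate's implementation.

```rust
use std::io;
use std::sync::Mutex;

/// Stand-in for a QUIC send stream; `fail_writes` simulates a write error.
struct FakeStream {
    written: Vec<u8>,
    fail_writes: bool,
}

enum State {
    Open(FakeStream),
    Closed,
}

/// `send` takes `&self`, so the mutable state lives behind a mutex.
struct StickySender(Mutex<State>);

impl StickySender {
    fn new(stream: FakeStream) -> Self {
        Self(Mutex::new(State::Open(stream)))
    }

    fn send(&self, bytes: &[u8]) -> io::Result<()> {
        let mut guard = self.0.lock().unwrap();
        // Take the state out and leave `Closed` behind. It is restored to
        // `Open` only if the write succeeds.
        match std::mem::replace(&mut *guard, State::Closed) {
            State::Open(mut stream) => {
                if stream.fail_writes {
                    return Err(io::Error::new(io::ErrorKind::Other, "write failed"));
                }
                stream.written.extend_from_slice(bytes);
                *guard = State::Open(stream);
                Ok(())
            }
            State::Closed => Err(io::ErrorKind::BrokenPipe.into()),
        }
    }

    /// Number of bytes written so far (0 once the sender is closed).
    fn bytes_written(&self) -> usize {
        let guard = self.0.lock().unwrap();
        match &*guard {
            State::Open(stream) => stream.written.len(),
            State::Closed => 0,
        }
    }
}
```

Leaving `Closed` in place while the write is in progress means any early exit, including an error return, poisons the sender for every clone that shares the state.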
|
||||
|
||||
### NoqReceiver<T>
|
||||
|
||||
```rust
|
||||
struct NoqReceiver<T> {
|
||||
recv: noq::RecvStream,
|
||||
_marker: PhantomData<T>,
|
||||
}
|
||||
```
|
||||
|
||||
Reads a varint length prefix, allocates a buffer of that size, reads the data, and deserializes with postcard. If the length exceeds `MAX_MESSAGE_SIZE`, stops the stream with the appropriate error code.
|
||||
|
||||
### Oneshot Remote Sender
|
||||
|
||||
For `oneshot::Sender<T>` over QUIC, the sender is a `BoxedSender<T>` — a `Box<dyn FnOnce(T) -> BoxFuture<Result<(), SendError>>>`. This captures the `noq::SendStream` and on invocation:
|
||||
1. Computes `postcard::experimental::serialized_size(&value)`
|
||||
2. Checks against `MAX_MESSAGE_SIZE`
|
||||
3. Writes length-prefixed postcard data to the stream
|
||||
|
||||
### Oneshot Remote Receiver
|
||||
|
||||
For `oneshot::Receiver<T>` over QUIC, the receiver is constructed from a `noq::RecvStream`:
|
||||
1. Reads a varint length prefix
|
||||
2. Reads that many bytes
|
||||
3. Deserializes with postcard
|
||||
4. Returns the value
|
||||
|
||||
## Channel Conversion Table
|
||||
|
||||
When a QUIC stream pair `(SendStream, RecvStream)` is received for a request:
|
||||
|
||||
| Channel Kind | `Tx` (SendStream →) | `Rx` (RecvStream →) |
|
||||
|---|---|---|
|
||||
| `oneshot::Sender<T>` | Serialize + write, then finish | N/A |
|
||||
| `mpsc::Sender<T>` | Repeatedly serialize + write | N/A |
|
||||
| `oneshot::Receiver<T>` | N/A | Read single length-prefixed value |
|
||||
| `mpsc::Receiver<T>` | N/A | Repeatedly read length-prefixed values |
|
||||
| `NoSender` | Drop the stream | N/A |
|
||||
| `NoReceiver` | N/A | Drop the stream |
|
||||
|
||||
The `From<noq::RecvStream>` and `From<noq::SendStream>` impls handle these conversions automatically based on the target type.
|
||||
|
||||
## DynSender and DynReceiver Traits
|
||||
|
||||
The `mpsc` module exposes traits for dynamic dispatch:
|
||||
|
||||
```rust
|
||||
pub trait DynSender<T>: Debug + Send + Sync + 'static {
|
||||
fn send(&self, value: T) -> Pin<Box<dyn Future<Output = Result<(), SendError>> + Send + '_>>;
|
||||
fn try_send(&self, value: T) -> Pin<Box<dyn Future<Output = Result<bool, SendError>> + Send + '_>>;
|
||||
fn closed(&self) -> Pin<Box<dyn Future<Output = ()> + Send + Sync + '_>>;
|
||||
fn is_rpc(&self) -> bool;
|
||||
}
|
||||
|
||||
pub trait DynReceiver<T>: Debug + Send + Sync + 'static {
|
||||
fn recv(&mut self) -> Pin<Box<dyn Future<Output = Result<Option<T>, RecvError>> + Send + Sync + '_>>;
|
||||
}
|
||||
```
|
||||
|
||||
These enable boxing of remote senders/receivers while keeping the local variants unboxed for zero overhead.
|
||||
|
||||
## FusedOneshotReceiver
|
||||
|
||||
A thin wrapper around `tokio::sync::oneshot::Receiver` that prevents panics when polling an already-completed receiver. It tracks completion state and returns `Poll::Pending` indefinitely after resolution, matching the `FusedFuture` pattern.
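
The fusing behaviour can be shown with a generic wrapper over any `Unpin` future. This is a sketch of the idea only: `Fused`, `NoopWake`, and `poll_twice` are hypothetical names, and `std::future::ready` stands in for a oneshot receiver that already holds a value.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Wraps a future and remembers completion, so later polls never reach the
/// inner future again.
struct Fused<F> {
    inner: Option<F>,
}

impl<F: Future + Unpin> Future for Fused<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        let Some(inner) = self.inner.as_mut() else {
            // Already resolved: stay pending instead of polling again.
            return Poll::Pending;
        };
        match Pin::new(inner).poll(cx) {
            Poll::Ready(value) => {
                self.inner = None;
                Poll::Ready(value)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the fused future twice and returns both results.
fn poll_twice<F: Future + Unpin>(fut: F) -> (Poll<F::Output>, Poll<F::Output>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fused = Fused { inner: Some(fut) };
    let first = Pin::new(&mut fused).poll(&mut cx);
    let second = Pin::new(&mut fused).poll(&mut cx);
    (first, second)
}
```

Polling `std::future::ready` directly a second time would panic, which is the kind of failure the wrapper is meant to rule out.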
|
||||
|
||||
## Cancellation Safety
|
||||
|
||||
For remote `mpsc::Sender`:
|
||||
- If a `send()` future is dropped before completion, the underlying QUIC stream is closed.
|
||||
- All clones of the sender will receive `SendError::Io(BrokenPipe)` on subsequent send attempts.
|
||||
- This is documented behavior: **always poll send futures to completion if you want to reuse the sender**.
|
||||
|
||||
For remote `oneshot::Sender`:
|
||||
- Since it's `FnOnce`, dropping the future before sending simply means the value is never sent. The receiver will get `SenderClosed`.
|
||||
@@ -0,0 +1,272 @@
|
||||
# irpc: Protocol and Message Flow
|
||||
|
||||
## Wire Protocol
|
||||
|
||||
When the `rpc` feature is enabled, irpc uses the following wire format over QUIC streams:
|
||||
|
||||
### Message Framing
|
||||
|
||||
Every message on the wire is **length-prefixed using postcard varints** (LEB128 encoding):
|
||||
|
||||
```
|
||||
┌─────────────────┬──────────────────────┐
|
||||
│ varint length │ postcard-serialized │
|
||||
│ (1-10 bytes) │ message data │
|
||||
└─────────────────┴──────────────────────┘
|
||||
```
|
||||
|
||||
- **Length prefix**: LEB128 varint encoding of `u64` length. Each byte uses 7 bits for the value and the MSB as a continuation bit. Maximum 10 bytes for a full `u64`.
|
||||
- **Payload**: Postcard-encoded (compact, no-schema serde format) Rust message.
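
The framing rule can be made concrete with a hand-rolled LEB128 encoder and decoder. This is an illustrative sketch using only `std`; irpc relies on postcard and its own varint utilities, and the function names `encode_varint`, `decode_varint`, and `frame` are assumptions.

```rust
/// Encode `value` as an unsigned LEB128 varint: 7 value bits per byte, with
/// the MSB set on every byte except the last.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Decode a varint from the front of `input`. Returns the value and the
/// number of bytes consumed, or `None` if the input is truncated or does not
/// fit in a `u64`.
fn decode_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        // The 10th byte may only contribute the single remaining bit.
        if i == 9 && byte > 0x01 {
            return None;
        }
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Prefix an already-serialized payload with its varint length.
fn frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 10);
    encode_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}
```

For example, a length of 300 is encoded as the two bytes `0xAC 0x02`, and a payload shorter than 128 bytes costs a single prefix byte.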
|
||||
|
||||
### Maximum Message Size
|
||||
|
||||
`MAX_MESSAGE_SIZE = 16 MiB (16 * 1024 * 1024)`
|
||||
|
||||
Messages exceeding this limit are rejected:
|
||||
- **Send side**: The sender checks `postcard::experimental::serialized_size()` before sending. If exceeded, the stream is reset with error code `1` (`ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED`).
|
||||
- **Receive side**: After reading the varint length, if it exceeds `MAX_MESSAGE_SIZE`, the stream is stopped with error code `1`.
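
The receive-side check matters because the length prefix is read before any payload: rejecting an oversized length up front avoids allocating a buffer for it. A minimal sketch over `std::io::Read` follows; irpc reads from a QUIC stream and stops it with error code `1`, whereas this sketch (with the assumed names `read_varint` and `read_frame`) simply returns an `io::Error`.

```rust
use std::io::{self, Read};

/// Limit taken from the text above: 16 MiB.
const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024;

/// Read one LEB128 varint from `reader`, one byte at a time.
fn read_varint(reader: &mut impl Read) -> io::Result<u64> {
    let mut value = 0u64;
    for i in 0..10 {
        let mut byte = [0u8; 1];
        reader.read_exact(&mut byte)?;
        value |= u64::from(byte[0] & 0x7f) << (7 * i);
        if byte[0] & 0x80 == 0 {
            return Ok(value);
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidData, "varint too long"))
}

/// Read one frame, rejecting oversized lengths before allocating the buffer.
fn read_frame(reader: &mut impl Read) -> io::Result<Vec<u8>> {
    let len = read_varint(reader)?;
    if len > MAX_MESSAGE_SIZE {
        // Per the text, irpc would stop the stream with error code 1 here;
        // this sketch just reports an error.
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "max message size exceeded",
        ));
    }
    let mut buf = vec![0u8; len as usize];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```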
|
||||
|
||||
### Error Codes
|
||||
|
||||
| Code | Constant | Meaning |
|
||||
|---|---|---|
|
||||
| `1` | `ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED` | Message larger than 16 MiB |
|
||||
| `2` | `ERROR_CODE_INVALID_POSTCARD` | Postcard serialization failed |
|
||||
|
||||
These are used as QUIC stream reset/stop error codes.
|
||||
|
||||
### Connection Closure
|
||||
|
||||
Error code `0` on the QUIC connection means "clean close" — the remote side intentionally shut down. This is distinguished from actual errors.
|
||||
|
||||
## Message Flow: Local Path
|
||||
|
||||
```
|
||||
Client Actor
|
||||
│ │
|
||||
│ Client::rpc(Get { key: "x" }) │
|
||||
│ │
|
||||
│ 1. Create oneshot channel pair │
|
||||
│ (tx, rx) = oneshot::channel() │
|
||||
│ │
|
||||
│ 2. Wrap into WithChannels │
|
||||
│ WithChannels { │
|
||||
│ inner: Get { key: "x" }, │
|
||||
│ tx: oneshot::Sender<Res>, │
|
||||
│ rx: NoReceiver, │
|
||||
│ span: current_span, │
|
||||
│ } │
|
||||
│ │
|
||||
│ 3. Convert to Message enum │
|
||||
│ StorageMessage::Get(wc) │
|
||||
│ │
|
||||
│ 4. Send over mpsc channel ────────►│
|
||||
│ │
|
||||
│ 5. Await on oneshot receiver │
|
||||
│ rx.await ◄─────────────────────│
|
||||
│ tx.send(res)│
|
||||
│ │
|
||||
│ Result: res │
|
||||
```
|
||||
|
||||
For bidirectional streaming:
|
||||
```
|
||||
Client Actor
|
||||
│ │
|
||||
│ Client::bidi_streaming(Sum, 4, 4) │
|
||||
│ │
|
||||
│ 1. Create channel pairs │
|
||||
│ (update_tx, update_rx) │
|
||||
│ (res_tx, res_rx) │
|
||||
│ │
|
||||
│ 2. WithChannels { │
|
||||
│ inner: Sum, │
|
||||
│ tx: mpsc::Sender<i64>, │
|
||||
│ rx: mpsc::Receiver<i64>, │
|
||||
│ } │
|
||||
│ │
|
||||
│ 3. Send message ──────────────────►│
|
||||
│ │
|
||||
│ 4. Use update_tx.send(val) ───────►│
|
||||
│ Use res_rx.recv() ◄─────────│
|
||||
│ res_tx.send(val)
|
||||
│ │
|
||||
```
|
||||
|
||||
## Message Flow: Remote Path
|
||||
|
||||
```
|
||||
Client Server
|
||||
│ │
|
||||
│ Client::rpc(Get { key: "x" }) │
|
||||
│ │
|
||||
│ 1. open_bi() → (SendStream, RecvStream)
|
||||
│ │
|
||||
│ 2. Serialize StorageProtocol::Get(Get { key: "x" })
|
||||
│ with postcard + varint prefix │
|
||||
│ │
|
||||
│ 3. Write to SendStream ───────────►│
|
||||
│ │
|
||||
│ │ 4. Accept bi stream
|
||||
│ │ 5. Read varint + deserialize
|
||||
│ │ 6. RemoteService::with_remote_channels()
|
||||
│ │ → WithChannels { inner, tx, rx }
|
||||
│ │ 7. Forward to local actor
|
||||
│ │
|
||||
│ │ Actor processes, sends response
|
||||
│ │ on the SendStream (which is the
|
||||
│ │ oneshot::Sender<T> backed by QUIC)
|
||||
│ │
|
||||
│ 8. Read from RecvStream ◄──────────│
|
||||
│ 9. Deserialize response │
|
||||
│ │
|
||||
│ Result: res │
|
||||
```
|
||||
|
||||
For bidirectional streaming over remote:
|
||||
```
|
||||
Client Server
|
||||
│ │
|
||||
│ Client::bidi_streaming(Sum, 4, 4) │
|
||||
│ │
|
||||
│ open_bi() → (SendStream, RecvStream)
|
||||
│ │
|
||||
│ SendStream → mpsc::Sender<Update> │ RecvStream → mpsc::Receiver<Update>
|
||||
│ RecvStream → mpsc::Receiver<Res> │ SendStream → mpsc::Sender<Res>
│ (oneshot::Receiver<Res> instead │
│ for client streaming) │
|
||||
│ │
|
||||
│ The initial message is sent on │
|
||||
│ SendStream with varint prefix. │
|
||||
│ │
|
||||
│ Subsequent updates are sent on │
|
||||
│ the same SendStream as varint- │
|
||||
│ prefixed postcard messages. │
|
||||
│ │
|
||||
│ The response stream is read from │
|
||||
│ the RecvStream as varint-prefixed │
|
||||
│ postcard messages. │
|
||||
```
|
||||
|
||||
## Stream Direction Convention
|
||||
|
||||
In irpc's QUIC stream model:
|
||||
- **Client opens** a bidirectional stream (`open_bi()`)
|
||||
- **SendStream** (client → server): carries the initial request message, plus any client-streaming updates
|
||||
- **RecvStream** (server → client): carries the response(s) from the server
|
||||
|
||||
The `RemoteService::with_remote_channels()` method decides how to map streams to channels:
|
||||
|
||||
```rust
|
||||
// For a simple RPC (tx=oneshot, rx=none):
|
||||
fn with_remote_channels(self, rx: RecvStream, tx: SendStream) -> Self::Message {
|
||||
// rx stream is unused (NoReceiver), tx carries response
|
||||
WithChannels::from((msg, tx.into(), rx.into()))
|
||||
// tx → oneshot::Sender<Res> (or mpsc::Sender<Res>)
|
||||
// rx → NoReceiver
|
||||
}
|
||||
```
|
||||
|
||||
The snippet above leaves one point implicit: `with_remote_channels(self, rx: RecvStream, tx: SendStream)` runs on the **server**, on the stream pair the server accepted. Its `RecvStream` is therefore what the server reads from (client updates), and its `SendStream` is what the server writes to (server responses).
|
||||
|
||||
In the `with_remote_channels` generated code:
|
||||
```rust
|
||||
// For rpc(tx=oneshot::Sender<Res>, rx=mpsc::Receiver<Update>):
|
||||
WithChannels::from((msg, tx.into(), rx.into()))
|
||||
// tx (SendStream) → oneshot::Sender<Res> — server writes response
|
||||
// rx (RecvStream) → mpsc::Receiver<Update> — server reads client updates
|
||||
```
|
||||
|
||||
So the naming in `with_remote_channels` is from the **server's perspective**:
|
||||
- `rx` parameter = RecvStream = what server receives (client → server updates)
|
||||
- `tx` parameter = SendStream = what server sends (server → client responses)
|
||||
|
||||
## Connection Management
|
||||
|
||||
### NoqLazyRemoteConnection
|
||||
|
||||
```rust
|
||||
struct NoqLazyRemoteConnection(Arc<NoqLazyRemoteConnectionInner>);
|
||||
|
||||
struct NoqLazyRemoteConnectionInner {
|
||||
endpoint: noq::Endpoint,
|
||||
addr: SocketAddr,
|
||||
connection: Mutex<Option<noq::Connection>>,
|
||||
}
|
||||
```
|
||||
|
||||
- Lazily establishes connection on first use
|
||||
- Caches the `noq::Connection` inside a `Mutex<Option<...>>`
|
||||
- On `open_bi()`: if cached connection exists, tries to reuse it; if it fails, clears cache and reconnects once
|
||||
- Thread-safe via `Arc` + `Mutex`
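
The caching and single-retry behaviour listed above can be sketched synchronously. Everything here is a stand-in: `FakeConn` replaces `noq::Connection`, the `connect` closure replaces the endpoint and address, and `LazyConn` is a hypothetical name. The real type is async and shared through an `Arc`.

```rust
use std::sync::Mutex;

/// Stand-in for a QUIC connection; `alive` simulates whether streams can
/// still be opened on it.
#[derive(Debug)]
struct FakeConn {
    id: u32,
    alive: bool,
}

impl FakeConn {
    /// "Opens a stream" by returning the connection id, or fails if dead.
    fn open_bi(&self) -> Result<u32, &'static str> {
        if self.alive { Ok(self.id) } else { Err("connection lost") }
    }
}

struct LazyConn<F> {
    connect: F,
    cached: Mutex<Option<FakeConn>>,
}

impl<F: Fn() -> Result<FakeConn, &'static str>> LazyConn<F> {
    fn new(connect: F) -> Self {
        Self { connect, cached: Mutex::new(None) }
    }

    /// Returns the id of the connection the stream was opened on.
    fn open_bi(&self) -> Result<u32, &'static str> {
        let mut cached = self.cached.lock().unwrap();
        // Fast path: reuse the cached connection if it still works.
        let reused = match &*cached {
            Some(conn) => conn.open_bi().ok(),
            None => None,
        };
        if let Some(stream) = reused {
            return Ok(stream);
        }
        // Nothing cached, or the cached connection is dead: clear the cache
        // and reconnect exactly once.
        *cached = None;
        let conn = (self.connect)()?;
        let stream = conn.open_bi()?;
        *cached = Some(conn);
        Ok(stream)
    }
}
```

If the single reconnect attempt also fails, the error is returned to the caller and the cache stays empty, so the next call starts from a fresh connection attempt.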
|
||||
|
||||
### IrohLazyRemoteConnection (irpc-iroh)
|
||||
|
||||
Same pattern but for iroh endpoints, with an additional `alpn` field for protocol identification.
|
||||
|
||||
### 0-RTT Support
|
||||
|
||||
irpc supports QUIC 0-RTT for reduced latency on reconnections:
|
||||
|
||||
- `Client::rpc_0rtt()` — sends request immediately with 0-RTT data; if the server rejects 0-RTT, re-sends
|
||||
- `Client::server_streaming_0rtt()` — same for server-streaming
|
||||
- `Client::notify_0rtt()` — same for fire-and-forget
|
||||
|
||||
The 0-RTT flow:
|
||||
1. Client serializes the message into a buffer (`prepare_write()`)
|
||||
2. Sends the buffer over a 0-RTT connection
|
||||
3. Awaits `zero_rtt_accepted()` to check if 0-RTT was accepted
|
||||
4. If not accepted, opens a new connection and re-sends the same buffer
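
The point of serializing into a buffer first is that the exact same bytes can be sent a second time if 0-RTT is rejected. A small sketch of that control flow, using a hypothetical `Transport` trait and `FakeTransport` in place of real QUIC connections:

```rust
/// Hypothetical transport abstraction for this sketch (not an irpc trait).
trait Transport {
    /// Send `buf` as 0-RTT data; returns whether the server accepted it.
    fn send_0rtt(&mut self, buf: &[u8]) -> bool;
    /// Send `buf` over a fully established connection.
    fn send_full(&mut self, buf: &[u8]);
}

/// Records what the "server" actually received.
struct FakeTransport {
    accept_0rtt: bool,
    delivered: Vec<Vec<u8>>,
}

impl Transport for FakeTransport {
    fn send_0rtt(&mut self, buf: &[u8]) -> bool {
        // Rejected 0-RTT data is discarded by the server.
        if self.accept_0rtt {
            self.delivered.push(buf.to_vec());
        }
        self.accept_0rtt
    }

    fn send_full(&mut self, buf: &[u8]) {
        self.delivered.push(buf.to_vec());
    }
}

/// Returns how many times the buffer had to be transmitted (1 or 2).
fn send_with_0rtt_fallback(transport: &mut impl Transport, message: &str) -> usize {
    // Step 1: serialize once into a reusable buffer.
    let buf = message.as_bytes().to_vec();
    // Steps 2-3: try 0-RTT and check whether it was accepted.
    if transport.send_0rtt(&buf) {
        return 1;
    }
    // Step 4: rejected, so re-send the same buffer on a full connection.
    transport.send_full(&buf);
    2
}
```

Either way the server ends up with exactly one copy of the request, which is why the fallback is safe for requests that are not idempotent at the transport level.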
|
||||
|
||||
`RemoteConnection::zero_rtt_accepted()` returns `true` for regular connections and for lazy connections. For `IrohZrttRemoteConnection`, it checks the actual 0-RTT status via `handshake_completed()`.
|
||||
|
||||
## Server-Side: Accepting Connections
|
||||
|
||||
### Using noq (direct QUIC)
|
||||
|
||||
```rust
|
||||
irpc::rpc::listen(endpoint, handler)
|
||||
```
|
||||
|
||||
This function:
|
||||
1. Loops on `endpoint.accept()` to accept incoming connections
|
||||
2. For each connection, spawns a task running `handle_connection()`
|
||||
3. `handle_connection()` loops on `read_request_raw()` to read requests from bidirectional streams
|
||||
4. Each request is deserialized and passed to the `Handler`
|
||||
|
||||
### Using iroh
|
||||
|
||||
```rust
|
||||
IrohProtocol::with_sender(local_sender)
|
||||
```
|
||||
|
||||
This creates a `ProtocolHandler` that can be registered with `iroh::protocol::Router`. When a connection arrives, it calls `handle_connection()` from irpc-iroh, which handles the protocol handshake and reads requests.
|
||||
|
||||
For 0-RTT support:
|
||||
```rust
|
||||
Iroh0RttProtocol::with_sender(local_sender)
|
||||
```
|
||||
|
||||
This implements `ProtocolHandler::on_accepting()` to handle 0-RTT connections.
|
||||
|
||||
### Handler Function
|
||||
|
||||
```rust
|
||||
type Handler<R> = Arc<
|
||||
dyn Fn(R, noq::RecvStream, noq::SendStream) -> BoxFuture<Result<(), SendError>>
|
||||
+ Send + Sync + 'static,
|
||||
>;
|
||||
```
|
||||
|
||||
The handler receives:
|
||||
1. The deserialized protocol message (`R`)
|
||||
2. The `RecvStream` (for client → server updates)
|
||||
3. The `SendStream` (for server → client responses)
|
||||
|
||||
Typically created via `Protocol::remote_handler(local_sender)`, which converts streams to typed channels and forwards the `WithChannels` message to a local actor.
|
||||
278
docs/research/references/iroh/irpc/05-rpc-requests-macro.md
Normal file
@@ -0,0 +1,278 @@
|
||||
# irpc: The rpc_requests Macro
|
||||
|
||||
The `#[rpc_requests]` attribute macro is the primary way to define an irpc protocol. It generates the boilerplate for channel typing, message wrapping, and service trait implementations.
|
||||
|
||||
## Basic Usage
|
||||
|
||||
```rust
|
||||
use irpc::{channel::{mpsc, oneshot}, rpc_requests, Client, WithChannels};
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
#[rpc_requests(message = ComputeMessage)]
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
enum ComputeProtocol {
|
||||
/// Unary RPC: one request, one response
|
||||
#[rpc(tx=oneshot::Sender<i64>)]
|
||||
#[wrap(Multiply)]
|
||||
Multiply(i64, i64),
|
||||
|
||||
/// Bidirectional streaming
|
||||
#[rpc(tx=mpsc::Sender<i64>, rx=mpsc::Receiver<i64>)]
|
||||
#[wrap(Sum)]
|
||||
Sum,
|
||||
}
|
||||
```
|
||||
|
||||
This single macro invocation generates:
|
||||
|
||||
1. **Wrapper structs** (from `#[wrap]`): `Multiply` and `Sum` struct types
|
||||
2. **`Channels<ComputeProtocol>` impls**: For each variant's inner type, specifying `Tx` and `Rx`
|
||||
3. **`Service` impl**: `impl Service for ComputeProtocol { type Message = ComputeMessage; }`
|
||||
4. **`RemoteService` impl** (rpc feature): Maps protocol variants + QUIC streams to messages
|
||||
5. **`ComputeMessage` enum**: Wraps each request in `WithChannels`
|
||||
6. **`From` conversions**: Between inner types, `ComputeProtocol`, and `ComputeMessage`
|
||||
|
||||
## Macro Arguments
|
||||
|
||||
### Top-level (on the enum)
|
||||
|
||||
| Argument | Required | Description |
|
||||
|---|---|---|
|
||||
| `message = Name` | Recommended | Name of the generated message enum. Also generates `Service` and `RemoteService` impls. |
|
||||
| `alias = "Suffix"` | Optional | Generates type aliases like `MultiplyMsg = WithChannels<Multiply, ComputeProtocol>` |
|
||||
| `rpc_feature = "feat"` | Optional | Feature-gates the `RemoteService` impl with `#[cfg(feature = "feat")]` |
|
||||
| `no_rpc` | Optional | Skips generating `RemoteService` impl entirely |
|
||||
| `no_spans` | Optional | Skips span-related code (for use without the `spans` feature) |
|
||||
|
||||
### Per-variant
|
||||
|
||||
#### `#[rpc(tx=Type, rx=Type)]`
|
||||
|
||||
Specifies channel types for each request:
|
||||
- `tx` — response channel type (server → client). Defaults to `NoSender`.
|
||||
- `rx` — update channel type (client → server). Defaults to `NoReceiver`.
|
||||
|
||||
Valid types:
|
||||
- `oneshot::Sender<T>` — single response
|
||||
- `mpsc::Sender<T>` — streaming response
|
||||
- `oneshot::Receiver<T>` — not valid as tx (use for rx pattern)
|
||||
- `mpsc::Receiver<T>` — streaming updates (client → server)
|
||||
- `NoSender` / `NoReceiver` — no channel in that direction
|
||||
|
||||
#### `#[wrap(TypeName, derive(Traits))]`
|
||||
|
||||
Generates a struct from the variant's fields:
|
||||
- `TypeName` — name of the generated struct
|
||||
- Optional visibility prefix (e.g., `pub(crate) TypeName`)
|
||||
- `derive(...)` — additional derive macros beyond the default `Serialize, Deserialize, Debug`
|
||||
|
||||
If `#[wrap]` is not used, each variant must have exactly one unnamed field (a named type).
|
||||
|
||||
## Generated Code Walkthrough
|
||||
|
||||
Given this input:
|
||||
```rust
|
||||
#[rpc_requests(message = StoreMessage)]
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
enum StoreProtocol {
|
||||
#[rpc(tx=oneshot::Sender<String>)]
|
||||
#[wrap(GetRequest, derive(Clone))]
|
||||
Get(String),
|
||||
|
||||
#[rpc(tx=oneshot::Sender<()>)]
|
||||
#[wrap(SetRequest)]
|
||||
Set { key: String, value: String },
|
||||
}
|
||||
```
|
||||
|
||||
The macro generates:
|
||||
|
||||
### 1. Wrapper Structs
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Serialize, Deserialize, Clone)]
|
||||
pub struct GetRequest(pub String);
|
||||
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
pub struct SetRequest { pub key: String, pub value: String }
|
||||
```
|
||||
|
||||
The variants are rewritten to use these:
|
||||
```rust
|
||||
enum StoreProtocol {
|
||||
Get(GetRequest),
|
||||
Set(SetRequest),
|
||||
}
|
||||
```
|
||||
|
||||
### 2. Channels Implementations
|
||||
|
||||
```rust
|
||||
impl Channels<StoreProtocol> for GetRequest {
|
||||
type Tx = oneshot::Sender<String>;
|
||||
type Rx = NoReceiver;
|
||||
}
|
||||
|
||||
impl Channels<StoreProtocol> for SetRequest {
|
||||
type Tx = oneshot::Sender<()>;
|
||||
type Rx = NoReceiver;
|
||||
}
|
||||
```
|
||||
|
||||
### 3. Message Enum
|
||||
|
||||
```rust
|
||||
#[doc = "Message enum for [`StoreProtocol`]"]
|
||||
#[allow(missing_docs)]
|
||||
#[derive(Debug)]
|
||||
pub enum StoreMessage {
|
||||
Get(WithChannels<GetRequest, StoreProtocol>),
|
||||
Set(WithChannels<SetRequest, StoreProtocol>),
|
||||
}
|
||||
```
|
||||
|
||||
### 4. Service Implementation
|
||||
|
||||
```rust
|
||||
impl Service for StoreProtocol {
|
||||
type Message = StoreMessage;
|
||||
}
|
||||
```
|
||||
|
||||
### 5. RemoteService Implementation (rpc feature)
|
||||
|
||||
```rust
|
||||
impl RemoteService for StoreProtocol {
|
||||
fn with_remote_channels(
|
||||
self,
|
||||
rx: noq::RecvStream,
|
||||
tx: noq::SendStream,
|
||||
) -> Self::Message {
|
||||
match self {
|
||||
StoreProtocol::Get(msg) => {
|
||||
StoreMessage::from(WithChannels::from((msg, tx, rx)))
|
||||
}
|
||||
StoreProtocol::Set(msg) => {
|
||||
StoreMessage::from(WithChannels::from((msg, tx, rx)))
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 6. From Conversions
|
||||
|
||||
```rust
|
||||
// Inner type → Protocol enum
|
||||
impl From<GetRequest> for StoreProtocol { ... }
|
||||
impl From<SetRequest> for StoreProtocol { ... }
|
||||
|
||||
// WithChannels → Message enum
|
||||
impl From<WithChannels<GetRequest, StoreProtocol>> for StoreMessage { ... }
|
||||
impl From<WithChannels<SetRequest, StoreProtocol>> for StoreMessage { ... }
|
||||
```
|
||||
|
||||
### 7. parent_span Method (spans feature)
|
||||
|
||||
```rust
|
||||
impl StoreMessage {
|
||||
pub fn parent_span(&self) -> tracing::Span {
|
||||
let span = match self {
|
||||
StoreMessage::Get(inner) => inner.parent_span_opt(),
|
||||
StoreMessage::Set(inner) => inner.parent_span_opt(),
|
||||
};
|
||||
span.cloned().unwrap_or_else(|| tracing::Span::current())
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Interaction Pattern Mapping
|
||||
|
||||
The `#[rpc]` attribute maps directly to gRPC-like patterns:
|
||||
|
||||
| Pattern | `tx` type | `rx` type | Example |
|
||||
|---|---|---|---|
|
||||
| **Unary RPC** | `oneshot::Sender<R>` | `NoReceiver` | Get by key, return value |
|
||||
| **Server streaming** | `mpsc::Sender<R>` | `NoReceiver` | List all items |
|
||||
| **Client streaming** | `oneshot::Sender<R>` | `mpsc::Receiver<U>` | Upload items, get count |
|
||||
| **Bidirectional** | `mpsc::Sender<R>` | `mpsc::Receiver<U>` | Chat, live updates |
|
||||
| **Notify (fire & forget)** | `NoSender` | `NoReceiver` | Log event |
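
The table above can be read as a lookup from a `(tx, rx)` pair to an interaction pattern. The sketch below restates it with plain enums. The names `Tx`, `Rx`, `Pattern`, and `classify` are invented for illustration and are not part of irpc, which expresses the same choice through the associated types of `Channels`.

```rust
/// Illustrative stand-ins for the channel kinds named in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tx {
    NoSender,
    Oneshot,
    Mpsc,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rx {
    NoReceiver,
    Mpsc,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Pattern {
    Unary,
    ServerStreaming,
    ClientStreaming,
    Bidirectional,
    Notify,
}

/// Map a channel pair to the pattern listed in the table.
/// Returns `None` for the one combination the table does not list.
fn classify(tx: Tx, rx: Rx) -> Option<Pattern> {
    match (tx, rx) {
        (Tx::Oneshot, Rx::NoReceiver) => Some(Pattern::Unary),
        (Tx::Mpsc, Rx::NoReceiver) => Some(Pattern::ServerStreaming),
        (Tx::Oneshot, Rx::Mpsc) => Some(Pattern::ClientStreaming),
        (Tx::Mpsc, Rx::Mpsc) => Some(Pattern::Bidirectional),
        (Tx::NoSender, Rx::NoReceiver) => Some(Pattern::Notify),
        (Tx::NoSender, Rx::Mpsc) => None,
    }
}
```

For example, `classify(Tx::Oneshot, Rx::Mpsc)` yields `Some(Pattern::ClientStreaming)`, matching the "Upload items, get count" row.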
|
||||
|
||||
## Client Methods Generated by Patterns
|
||||
|
||||
The `Client<S>` methods correspond to channel types:
|
||||
|
||||
```rust
|
||||
// Unary RPC: tx=oneshot::Sender<Res>, rx=NoReceiver
|
||||
client.rpc(Get { key: "x" }).await // → Result<Res>
|
||||
|
||||
// Server streaming: tx=mpsc::Sender<Res>, rx=NoReceiver
|
||||
client.server_streaming(List, 16).await // → Result<mpsc::Receiver<Res>>
|
||||
|
||||
// Client streaming: tx=oneshot::Sender<Res>, rx=mpsc::Receiver<Update>
|
||||
client.client_streaming(SetMany, 4).await // → Result<(mpsc::Sender<Update>, oneshot::Receiver<Res>)>
|
||||
|
||||
// Bidirectional: tx=mpsc::Sender<Res>, rx=mpsc::Receiver<Update>
|
||||
client.bidi_streaming(Sum, 4, 4).await // → Result<(mpsc::Sender<Update>, mpsc::Receiver<Res>)>
|
||||
|
||||
// Notify: tx=NoSender, rx=NoReceiver
|
||||
client.notify(Log { msg: "hi" }).await // → Result<()>
|
||||
```
|
||||
|
||||
## Manual Protocol Definition (Without Macro)
|
||||
|
||||
You can define protocols manually instead of using the macro:
|
||||
|
||||
```rust
|
||||
use irpc::{channel::{mpsc, none::NoReceiver, oneshot}, rpc::RemoteService, Channels, Service, WithChannels};
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
// 1. Define request types
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
struct Get { key: String }
|
||||
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
struct Set { key: String, value: String }
|
||||
|
||||
// 2. Implement Channels for each type
|
||||
impl Channels<StorageProtocol> for Get {
|
||||
type Tx = oneshot::Sender<Option<String>>;
|
||||
type Rx = NoReceiver;
|
||||
}
|
||||
|
||||
impl Channels<StorageProtocol> for Set {
|
||||
type Tx = oneshot::Sender<()>;
|
||||
type Rx = NoReceiver;
|
||||
}
|
||||
|
||||
// 3. Define protocol enum
|
||||
#[derive(derive_more::From, Serialize, Deserialize, Debug)]
|
||||
enum StorageProtocol {
|
||||
Get(Get),
|
||||
Set(Set),
|
||||
}
|
||||
|
||||
// 4. Define message enum
|
||||
#[derive(derive_more::From)]
|
||||
enum StorageMessage {
|
||||
Get(WithChannels<Get, StorageProtocol>),
|
||||
Set(WithChannels<Set, StorageProtocol>),
|
||||
}
|
||||
|
||||
// 5. Implement Service
|
||||
impl Service for StorageProtocol {
|
||||
type Message = StorageMessage;
|
||||
}
|
||||
|
||||
// 6. Implement RemoteService (rpc feature)
|
||||
impl RemoteService for StorageProtocol {
|
||||
fn with_remote_channels(self, rx: noq::RecvStream, tx: noq::SendStream) -> Self::Message {
|
||||
match self {
|
||||
StorageProtocol::Get(msg) => WithChannels::from((msg, tx, rx)).into(),
|
||||
StorageProtocol::Set(msg) => WithChannels::from((msg, tx, rx)).into(),
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This manual approach gives full control but requires more boilerplate. The macro generates all of this automatically.
|
||||
@@ -0,0 +1,274 @@
|
||||
# irpc: RPC Module and Remote Transport
|
||||
|
||||
The `rpc` module (enabled by the `rpc` feature) contains all cross-process RPC functionality: QUIC stream handling, connection management, serialization, and server-side request processing.
|
||||
|
||||
## Module Structure
|
||||
|
||||
```rust
|
||||
pub mod rpc {
|
||||
pub const MAX_MESSAGE_SIZE: u64 = 1024 * 1024 * 16;
|
||||
pub const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;
|
||||
pub const ERROR_CODE_INVALID_POSTCARD: u32 = 2;
|
||||
|
||||
pub enum WriteError { Noq, MaxMessageSizeExceeded, Io }
|
||||
pub trait RemoteConnection: Send + Sync + Debug + 'static { ... }
|
||||
pub struct RemoteSender<S>(SendStream, RecvStream, PhantomData<S>);
|
||||
pub type Handler<R> = Arc<dyn Fn(R, RecvStream, SendStream) -> BoxFuture<Result<(), SendError>> + Send + Sync>;
|
||||
pub trait RemoteService: Service + Sized { ... }
|
||||
pub async fn listen<R>(endpoint, handler);
|
||||
pub async fn handle_connection<R>(connection, handler) -> io::Result<()>;
|
||||
pub async fn read_request<S: RemoteService>(connection) -> io::Result<Option<S::Message>>;
|
||||
pub async fn read_request_raw<R>(connection) -> io::Result<Option<(R, RecvStream, SendStream)>>;
|
||||
}
|
||||
```
|
||||
|
||||
## RemoteConnection Implementations
|
||||
|
||||
### NoqLazyRemoteConnection
|
||||
|
||||
The default remote connection for noq (QUIC-by-socket-address):
|
||||
|
||||
```rust
|
||||
struct NoqLazyRemoteConnection(Arc<NoqLazyRemoteConnectionInner>);
|
||||
|
||||
struct NoqLazyRemoteConnectionInner {
|
||||
endpoint: noq::Endpoint,
|
||||
addr: SocketAddr,
|
||||
connection: Mutex<Option<noq::Connection>>,
|
||||
}
|
||||
```
|
||||
|
||||
**Behavior:**
|
||||
- `open_bi()`:
|
||||
1. Locks the `Mutex<Option<Connection>>`
|
||||
2. If a cached connection exists, tries `conn.open_bi()`
|
||||
3. If that fails, clears the cache and establishes a new connection
|
||||
4. If no cached connection, establishes a new one
|
||||
5. Returns `(SendStream, RecvStream)` pair
|
||||
- `zero_rtt_accepted()`: Always returns `true` (noq has no 0-RTT concept in this context)
|
||||
- `clone_boxed()`: Clones the `Arc`, sharing the same connection cache
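
The caching behavior described above can be sketched without any QUIC dependency. The code below is a synchronous, std-only approximation. `Conn`, `LazyConn`, and `MockConn` are hypothetical names invented for this sketch; the real type is async and holds a `noq::Endpoint` and a `SocketAddr` rather than a `connect` closure.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a QUIC connection; not a noq or irpc type.
trait Conn {
    type Stream;
    fn open_bi(&self) -> Result<Self::Stream, String>;
}

/// Caches one connection and redials when the cached one stops working.
struct LazyConn<C, F> {
    connect: F,
    cached: Mutex<Option<C>>,
}

impl<C: Conn, F: Fn() -> Result<C, String>> LazyConn<C, F> {
    fn new(connect: F) -> Self {
        Self { connect, cached: Mutex::new(None) }
    }

    fn open_bi(&self) -> Result<C::Stream, String> {
        let mut cached = self.cached.lock().unwrap();
        // Try the cached connection first, if there is one.
        if let Some(Ok(stream)) = cached.as_ref().map(C::open_bi) {
            return Ok(stream);
        }
        // No usable connection: clear the cache and dial again.
        *cached = None;
        let conn = (self.connect)()?;
        let stream = conn.open_bi()?;
        *cached = Some(conn);
        Ok(stream)
    }
}

/// Test double: a connection that fails once `alive` is set to false.
struct MockConn {
    id: u32,
    alive: Arc<AtomicBool>,
}

impl Conn for MockConn {
    type Stream = u32;
    fn open_bi(&self) -> Result<u32, String> {
        if self.alive.load(Ordering::SeqCst) {
            Ok(self.id)
        } else {
            Err("closed".into())
        }
    }
}
```

Holding the lock across the reconnect keeps concurrent callers from dialing twice, at the cost of serializing `open_bi()` calls while a dial is in progress. Whether the real implementation makes the same trade-off is not stated in this reference.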
|
||||
|
||||
### Direct noq::Connection
|
||||
|
||||
```rust
|
||||
impl RemoteConnection for noq::Connection {
|
||||
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
|
||||
// Directly opens a bidirectional stream on the connection
|
||||
}
|
||||
fn zero_rtt_accepted(&self) -> BoxFuture<bool> { Box::pin(async { true }) }
|
||||
}
|
||||
```
|
||||
|
||||
## RemoteSender
|
||||
|
||||
```rust
|
||||
pub struct RemoteSender<S>(noq::SendStream, noq::RecvStream, PhantomData<S>);
|
||||
```
|
||||
|
||||
Created by `Client::request()` when the client is remote. Holds both sides of a QUIC bidirectional stream.
|
||||
|
||||
### Key Methods
|
||||
|
||||
```rust
|
||||
impl<S: Service> RemoteSender<S> {
|
||||
pub fn new(send: SendStream, recv: RecvStream) -> Self;
|
||||
|
||||
pub async fn write(self, msg: impl Into<S>) -> Result<(SendStream, RecvStream), WriteError> {
|
||||
let buf = prepare_write(msg)?;
|
||||
self.write_raw(&buf).await
|
||||
}
|
||||
|
||||
// Internal: writes pre-serialized buffer
|
||||
pub(crate) async fn write_raw(self, buf: &[u8]) -> Result<(SendStream, RecvStream), WriteError>;
|
||||
}
|
||||
```
|
||||
|
||||
The `write()` method:
|
||||
1. Converts `msg` into the protocol enum `S` via `Into`
|
||||
2. Checks serialized size against `MAX_MESSAGE_SIZE`
|
||||
3. Length-prefixes with varint + postcard serialization
|
||||
4. Writes to the `SendStream`
|
||||
5. Returns the stream pair (now usable for response channels)
|
||||
|
||||
The `write_raw()` method is used for 0-RTT where the message is pre-serialized to allow re-sending without re-serialization.
|
||||
|
||||
### prepare_write
|
||||
|
||||
```rust
|
||||
fn prepare_write<S: Service>(msg: impl Into<S>) -> Result<SmallVec<[u8; 128]>, WriteError> {
|
||||
let msg = msg.into();
|
||||
if postcard::experimental::serialized_size(&msg)? as u64 > MAX_MESSAGE_SIZE {
|
||||
return Err(WriteError::MaxMessageSizeExceeded);
|
||||
}
|
||||
let mut buf = SmallVec::<[u8; 128]>::new();
|
||||
buf.write_length_prefixed(&msg)?;
|
||||
Ok(buf)
|
||||
}
|
||||
```
|
||||
|
||||
Uses `SmallVec<[u8; 128]>` to avoid heap allocation for small messages.
|
||||
|
||||
## Stream-to-Channel Conversions
|
||||
|
||||
When a QUIC stream pair is received on the server side, it needs to be converted into typed channels. The `From` implementations handle this:
|
||||
|
||||
### SendStream → Channel Tx
|
||||
|
||||
```rust
|
||||
// NoSender: drop the stream
|
||||
impl From<SendStream> for NoSender { ... }
|
||||
|
||||
// Oneshot: serialize and send single value, then done
|
||||
impl<T: RpcMessage> From<SendStream> for oneshot::Sender<T> { ... }
|
||||
|
||||
// MPSC: repeatedly serialize and send values
|
||||
impl<T: RpcMessage> From<SendStream> for mpsc::Sender<T> { ... }
|
||||
```
|
||||
|
||||
### RecvStream → Channel Rx
|
||||
|
||||
```rust
|
||||
// NoReceiver: drop the stream
|
||||
impl From<RecvStream> for NoReceiver { ... }
|
||||
|
||||
// Oneshot: read single length-prefixed value
|
||||
impl<T: DeserializeOwned> From<RecvStream> for oneshot::Receiver<T> { ... }
|
||||
|
||||
// MPSC: repeatedly read length-prefixed values
|
||||
impl<T: RpcMessage> From<RecvStream> for mpsc::Receiver<T> { ... }
|
||||
```
|
||||
|
||||
## Server-Side Request Processing
|
||||
|
||||
### read_request_raw
|
||||
|
||||
```rust
|
||||
pub async fn read_request_raw<R: DeserializeOwned + 'static>(
|
||||
connection: &noq::Connection,
|
||||
) -> io::Result<Option<(R, RecvStream, SendStream)>>
|
||||
```
|
||||
|
||||
1. Calls `connection.accept_bi()` to accept an incoming bidirectional stream
|
||||
2. If `ApplicationClosed(0)`, returns `Ok(None)` (clean shutdown)
|
||||
3. Reads a varint length prefix from the `RecvStream`
|
||||
4. Checks against `MAX_MESSAGE_SIZE`
|
||||
5. Reads `length` bytes from the stream
|
||||
6. Deserializes with `postcard::from_bytes::<R>()`
|
||||
7. Returns `(deserialized_message, RecvStream, SendStream)`
|
||||
|
||||
### read_request (typed)
|
||||
|
||||
```rust
|
||||
pub async fn read_request<S: RemoteService>(
|
||||
connection: &noq::Connection,
|
||||
) -> io::Result<Option<S::Message>>
|
||||
```
|
||||
|
||||
Calls `read_request_raw()` and then applies `S::with_remote_channels()` to convert the raw protocol message + stream pair into a `WithChannels`-wrapped `Message`.
|
||||
|
||||
### handle_connection
|
||||
|
||||
```rust
|
||||
pub async fn handle_connection<R: DeserializeOwned + 'static>(
|
||||
connection: noq::Connection,
|
||||
handler: Handler<R>,
|
||||
) -> io::Result<()>
|
||||
```
|
||||
|
||||
Loops:
|
||||
1. Calls `read_request_raw()` to get the next request
|
||||
2. If `None`, returns `Ok(())` (connection closed)
|
||||
3. Invokes `handler(msg, rx, tx)` to process the request
|
||||
4. Continues until the connection closes or an error occurs
|
||||
|
||||
Each connection is handled in a separate task (spawned by `listen()`).
|
||||
|
||||
### listen
|
||||
|
||||
```rust
|
||||
pub async fn listen<R: DeserializeOwned + 'static>(
|
||||
endpoint: noq::Endpoint,
|
||||
handler: Handler<R>,
|
||||
)
|
||||
```
|
||||
|
||||
The top-level server loop:
|
||||
1. Accepts incoming connections from the `noq::Endpoint`
|
||||
2. Spawns a task for each connection
|
||||
3. Each task calls `handle_connection()`
|
||||
4. Uses a `JoinSet` to manage and clean up completed tasks
|
||||
|
||||
## The Handler and Local Forwarding
|
||||
|
||||
The typical handler is created by `Protocol::remote_handler(local_sender)`:
|
||||
|
||||
```rust
|
||||
fn remote_handler(local_sender: LocalSender<Self>) -> Handler<Self> {
|
||||
Arc::new(move |msg, rx, tx| {
|
||||
let msg = Self::with_remote_channels(msg, rx, tx);
|
||||
Box::pin(local_sender.send_raw(msg))
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
This converts the raw (deserialized protocol message, RecvStream, SendStream) tuple into a typed `WithChannels` message and forwards it to the local actor via the mpsc channel. The local actor can then use the typed channels without knowing whether they're local or remote.
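
The forwarding idea can be illustrated with std channels alone. In this sketch `RawRequest`, `Message`, and the synchronous `Handler` alias are hypothetical simplifications: the real `Handler<R>` is an `Arc`-wrapped async closure over QUIC streams, and the reply channel here stands in for the typed `tx` that `with_remote_channels` builds from the `SendStream`.

```rust
use std::sync::mpsc::{SendError, Sender};

/// Hypothetical request as it comes off the wire, before typed channels exist.
#[derive(Debug, PartialEq)]
struct RawRequest {
    key: String,
}

/// Hypothetical typed message: the request plus a reply channel for the actor.
struct Message {
    request: RawRequest,
    reply: Sender<String>,
}

/// Simplified, synchronous handler type for this sketch.
type Handler = Box<dyn Fn(RawRequest, Sender<String>) -> Result<(), SendError<Message>> + Send>;

/// Build a handler that wraps each raw request with its reply channel
/// and forwards it to the actor's inbox.
fn remote_handler(local_sender: Sender<Message>) -> Handler {
    Box::new(move |request, reply| local_sender.send(Message { request, reply }))
}
```

The actor on the receiving end only sees `Message` values and a reply channel, so it handles local and forwarded requests with the same code path.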
|
||||
|
||||
## Full Request Lifecycle (Remote)
|
||||
|
||||
```
|
||||
CLIENT SERVER
|
||||
│ │
|
||||
│ 1. Client::request() │
|
||||
│ → open_bi() on connection │
|
||||
│ │
|
||||
│ 2. RemoteSender::write(protocol_msg) │
|
||||
│ → serialize + send on SendStream ────►│
|
||||
│ │ 3. accept_bi()
|
||||
│ │ 4. read_request_raw()
|
||||
│ │ → read varint + data
|
||||
│ │ → deserialize protocol_msg
|
||||
│ │
|
||||
│ │ 5. RemoteService::with_remote_channels()
|
||||
│ │ → creates WithChannels
|
||||
│ │ → SendStream → tx channel
|
||||
│ │ → RecvStream → rx channel
|
||||
│ │
|
||||
│ │ 6. handler(msg, rx, tx)
|
||||
│ │ → local_sender.send_raw(message)
|
||||
│ │ → message goes to actor
|
||||
│ │
|
||||
│ │ 7. Actor processes:
|
||||
│ │ match message {
|
||||
│ │ Msg::Get(wc) => {
|
||||
│ │ let res = db.get(wc.inner.key);
|
||||
│ │ wc.tx.send(res).await;
|
||||
│ │ // tx.send() writes to SendStream
|
||||
│ │ }
|
||||
│ │ }
|
||||
│ │
|
||||
│ 8. RecvStream reads response ◄───────────│
|
||||
│ 9. Deserialize response │
|
||||
│ 10. Return to caller │
|
||||
```
|
||||
|
||||
## 0-RTT Flow
|
||||
|
||||
```
|
||||
CLIENT SERVER
|
||||
│ │
|
||||
│ 1. Serialize message into buffer │
|
||||
│ (prepare_write) │
|
||||
│ │
|
||||
│ 2. Open 0-RTT connection │
|
||||
│ → write buffer immediately ─────────►│
|
||||
│ │
|
||||
│ 3. Check zero_rtt_accepted() │
|
||||
│ → If true: done, read response │
|
||||
│ → If false: │
|
||||
│ 4. Open new (full) connection │
|
||||
│ 5. Re-send same buffer ────────────►│
|
||||
│ │
|
||||
│ 6. Read response ◄──────────────────────│
|
||||
```
|
||||
|
||||
The key insight: the message buffer is pre-serialized so it can be re-sent without re-serialization if 0-RTT is rejected.
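
A minimal sketch of that retry logic, assuming a hypothetical `Transport` trait in place of real QUIC connections. `Transport`, `send_with_0rtt`, and `Recorder` are names made up for illustration, and the "serialization" is a plain byte copy rather than postcard.

```rust
/// Hypothetical transport used only for this sketch; not an irpc or iroh type.
trait Transport {
    /// Send `buf` as early data; returns whether the server accepted 0-RTT.
    fn send_early(&mut self, buf: &[u8]) -> bool;
    /// Send `buf` over a fully established connection.
    fn send_full(&mut self, buf: &[u8]);
}

/// Serialize once, then send the same bytes on whichever path succeeds.
/// Returns the number of sends that were needed.
fn send_with_0rtt<T: Transport>(transport: &mut T, message: &str) -> usize {
    // Stand-in for `prepare_write`: the real code produces postcard bytes.
    let buf: Vec<u8> = message.as_bytes().to_vec();
    if transport.send_early(&buf) {
        return 1;
    }
    // 0-RTT was rejected: re-send the identical buffer, no re-serialization.
    transport.send_full(&buf);
    2
}

/// Test double that records every buffer it is asked to send.
struct Recorder {
    accept_early: bool,
    sent: Vec<Vec<u8>>,
}

impl Transport for Recorder {
    fn send_early(&mut self, buf: &[u8]) -> bool {
        self.sent.push(buf.to_vec());
        self.accept_early
    }

    fn send_full(&mut self, buf: &[u8]) {
        self.sent.push(buf.to_vec());
    }
}
```

With `accept_early: false`, the recorder ends up holding two identical buffers, which is the property the pre-serialized design relies on.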
|
||||
271
docs/research/references/iroh/irpc/07-irpc-iroh.md
Normal file
@@ -0,0 +1,271 @@
|
||||
# irpc: irpc-iroh — Iroh Transport Integration
|
||||
|
||||
The `irpc-iroh` crate provides transport integration for iroh, enabling irpc to work with iroh's QUIC connections that use endpoint IDs (rather than socket addresses) for routing.
|
||||
|
||||
## Crate Overview
|
||||
|
||||
```toml
|
||||
[package]
|
||||
name = "irpc-iroh"
|
||||
version = "0.13.0"
|
||||
description = "Iroh transport for irpc"
|
||||
```
|
||||
|
||||
Dependencies: `iroh`, `irpc`, `tokio`, `tracing`, `serde`, `postcard`, `n0-error`, `n0-future`
|
||||
|
||||
## Key Types
|
||||
|
||||
### IrohRemoteConnection
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct IrohRemoteConnection(Connection);
|
||||
```
|
||||
|
||||
Wraps an existing iroh `Connection`. This is the simplest way to use irpc with iroh: create a connection externally and wrap it.
|
||||
|
||||
```rust
|
||||
impl RemoteConnection for IrohRemoteConnection {
|
||||
fn clone_boxed(&self) -> Box<dyn RemoteConnection> { ... }
|
||||
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
|
||||
// Delegates to connection.open_bi()
|
||||
}
|
||||
fn zero_rtt_accepted(&self) -> BoxFuture<bool> {
|
||||
// Always true — fully authenticated connection
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Note:** This stops working when the underlying connection is closed. For automatic reconnection, use `IrohLazyRemoteConnection`.
|
||||
|
||||
### IrohZrttRemoteConnection
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct IrohZrttRemoteConnection(OutgoingZeroRttConnection);
|
||||
```
|
||||
|
||||
Wraps an iroh 0-RTT (Zero Round Trip Time) connection. This enables sending data before the full handshake completes for reduced latency on reconnections.
|
||||
|
||||
```rust
|
||||
impl RemoteConnection for IrohZrttRemoteConnection {
|
||||
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
|
||||
// Delegates to the 0-RTT connection's open_bi()
|
||||
}
|
||||
fn zero_rtt_accepted(&self) -> BoxFuture<bool> {
|
||||
// Actually checks handshake_completed() to determine
|
||||
// if 0-RTT data was accepted
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `zero_rtt_accepted()` method:
|
||||
- Returns `true` if `ZeroRttStatus::Accepted`
|
||||
- Returns `false` if `ZeroRttStatus::Rejected` or on error
|
||||
- This allows the `Client` to decide whether to re-send data
|
||||
|
||||
### IrohLazyRemoteConnection
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct IrohLazyRemoteConnection(Arc<IrohRemoteConnectionInner>);
|
||||
|
||||
struct IrohRemoteConnectionInner {
|
||||
endpoint: iroh::Endpoint,
|
||||
addr: iroh::EndpointAddr,
|
||||
connection: tokio::sync::Mutex<Option<Connection>>,
|
||||
alpn: Vec<u8>,
|
||||
}
|
||||
```
|
||||
|
||||
The lazy connection caches the underlying iroh `Connection` and reconnects automatically:
|
||||
|
||||
1. On first `open_bi()`, establishes a connection via `endpoint.connect(addr, alpn)`
|
||||
2. Caches the connection in a `Mutex<Option<Connection>>`
|
||||
3. On subsequent `open_bi()`, tries to reuse the cached connection
|
||||
4. If the cached connection fails, clears the cache and reconnects once
|
||||
|
||||
The `alpn` field is required because iroh connections need an ALPN protocol identifier.
|
||||
|
||||
### `client()` Function
|
||||
|
||||
```rust
|
||||
pub fn client<S: irpc::Service>(
|
||||
endpoint: iroh::Endpoint,
|
||||
addr: impl Into<iroh::EndpointAddr>,
|
||||
alpn: impl AsRef<[u8]>,
|
||||
) -> irpc::Client<S>
|
||||
```
|
||||
|
||||
Convenience function to create a `Client<S>` using iroh. Creates an `IrohLazyRemoteConnection` and wraps it with `Client::boxed()`.
|
||||
|
||||
## Server-Side: IrohProtocol
|
||||
|
||||
### IrohProtocol
|
||||
|
||||
```rust
|
||||
pub struct IrohProtocol<R> {
|
||||
handler: Handler<R>,
|
||||
request_id: AtomicU64,
|
||||
}
|
||||
```
|
||||
|
||||
Implements `iroh::protocol::ProtocolHandler`, allowing it to be registered with iroh's `Router`:
|
||||
|
||||
```rust
|
||||
impl<R: DeserializeOwned + Send + 'static> ProtocolHandler for IrohProtocol<R> {
|
||||
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> {
|
||||
// Handle the connection using irpc's handle_connection
|
||||
let handler = self.handler.clone();
|
||||
let fut = handle_connection(&connection, handler).map_err(AcceptError::from_err);
|
||||
fut.instrument(span).await
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Usage:**
|
||||
```rust
|
||||
let protocol = IrohProtocol::with_sender(local_sender);
|
||||
// or
|
||||
let protocol = IrohProtocol::new(handler);
|
||||
|
||||
let router = Router::builder(endpoint)
|
||||
.accept(ALPN, protocol)
|
||||
.spawn();
|
||||
```
|
||||
|
||||
### Iroh0RttProtocol
|
||||
|
||||
```rust
|
||||
pub struct Iroh0RttProtocol<R> { ... }
|
||||
```
|
||||
|
||||
Supports 0-RTT connections by implementing `ProtocolHandler::on_accepting()`:
|
||||
|
||||
```rust
|
||||
impl<R: DeserializeOwned + Send + 'static> ProtocolHandler for Iroh0RttProtocol<R> {
|
||||
async fn on_accepting(&self, accepting: Accepting) -> Result<Connection, AcceptError> {
|
||||
let zrtt_conn = accepting.into_0rtt();
|
||||
// Handle 0-RTT data immediately
|
||||
handle_connection(&zrtt_conn, handler).await?;
|
||||
// Wait for handshake completion
|
||||
let conn = zrtt_conn.handshake_completed().await?;
|
||||
Ok(conn)
|
||||
}
|
||||
|
||||
async fn accept(&self, _connection: Connection) -> Result<(), AcceptError> {
|
||||
// Noop — handled in on_accepting
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Warning:** 0-RTT data is replayable. Only use for idempotent operations. See <https://www.iroh.computer/blog/0rtt-api>.
|
||||
|
||||
### IncomingRemoteConnection Trait
|
||||
|
||||
```rust
|
||||
pub trait IncomingRemoteConnection {
|
||||
fn accept_bi(&self) -> impl Future<Output = Result<(SendStream, RecvStream), ConnectionError>> + Send;
|
||||
fn close(&self, error_code: VarInt, reason: &[u8]);
|
||||
fn remote_id(&self) -> Result<EndpointId, RemoteEndpointIdError>;
|
||||
}
|
||||
```
|
||||
|
||||
Abstraction over `Connection` and `IncomingZeroRttConnection`, enabling `handle_connection` and `read_request` to work with both regular and 0-RTT connections.
|
||||
|
||||
Implemented for:
|
||||
- `Connection` — regular iroh connection
|
||||
- `IncomingZeroRttConnection` — 0-RTT connection
|
||||
|
||||
## handle_connection (iroh variant)
|
||||
|
||||
```rust
|
||||
pub async fn handle_connection<R: DeserializeOwned + 'static>(
|
||||
connection: &impl IncomingRemoteConnection,
|
||||
handler: Handler<R>,
|
||||
) -> io::Result<()>
|
||||
```
|
||||
|
||||
Similar to the noq version but works with iroh's `IncomingRemoteConnection` trait. Records the remote endpoint ID in the tracing span.
|
||||
|
||||
## read_request and read_request_raw (iroh variants)
|
||||
|
||||
Same logic as the noq versions but using `IncomingRemoteConnection` instead of `noq::Connection`:
|
||||
|
||||
```rust
|
||||
pub async fn read_request<S: RemoteService>(
|
||||
connection: &impl IncomingRemoteConnection,
|
||||
) -> io::Result<Option<S::Message>>
|
||||
|
||||
pub async fn read_request_raw<R: DeserializeOwned + 'static>(
|
||||
connection: &impl IncomingRemoteConnection,
|
||||
) -> io::Result<Option<(R, RecvStream, SendStream)>>
|
||||
```
|
||||
|
||||
## listen (iroh variant)
|
||||
|
||||
```rust
|
||||
pub async fn listen<R: DeserializeOwned + 'static>(endpoint: iroh::Endpoint, handler: Handler<R>)
|
||||
```
|
||||
|
||||
Accepts connections from an iroh `Endpoint` and handles them with the provided handler. Uses `n0_future::task::JoinSet` for task management.
|
||||
|
||||
## Example Usage
|
||||
|
||||
### Server
|
||||
|
||||
```rust
|
||||
use irpc::{rpc_requests, channel::oneshot, Client, WithChannels};
|
||||
use irpc_iroh::IrohProtocol;
use serde::{Deserialize, Serialize};
|
||||
use iroh::{endpoint::presets, protocol::Router, Endpoint};
|
||||
|
||||
#[rpc_requests(message = FooMessage)]
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
enum FooProtocol {
|
||||
#[rpc(tx=oneshot::Sender<String>)]
|
||||
Get(String),
|
||||
}
|
||||
|
||||
async fn server() -> Result<()> {
|
||||
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||
tokio::task::spawn(actor(rx));
|
||||
let client = Client::<FooProtocol>::local(tx);
|
||||
|
||||
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||
let protocol = IrohProtocol::with_sender(client.as_local().unwrap());
|
||||
let router = Router::builder(endpoint).accept(ALPN, protocol).spawn();
|
||||
// ... keep running
|
||||
}
|
||||
```
|
||||
|
||||
### Client
|
||||
|
||||
```rust
|
||||
async fn connect(endpoint_id: EndpointId) -> Result<Client<FooProtocol>> {
|
||||
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||
let client = irpc_iroh::client(endpoint, endpoint_id, ALPN);
|
||||
Ok(client)
|
||||
}
|
||||
|
||||
// Or with direct connection:
|
||||
async fn connect_direct(endpoint: Endpoint, addr: EndpointAddr) -> Result<Client<FooProtocol>> {
|
||||
let conn = endpoint.connect(addr, ALPN).await?;
|
||||
Ok(Client::boxed(IrohRemoteConnection::new(conn)))
|
||||
}
|
||||
```
|
||||
|
||||
### 0-RTT Client
|
||||
|
||||
```rust
|
||||
async fn connect_0rtt(endpoint: Endpoint, addr: EndpointAddr) -> Result<Client<EchoProtocol>> {
|
||||
let connecting = endpoint.connect_with_opts(addr, ALPN, Default::default()).await?;
|
||||
match connecting.into_0rtt() {
|
||||
Ok(conn) => Ok(Client::boxed(IrohZrttRemoteConnection::new(conn))),
|
||||
Err(connecting) => {
|
||||
let conn = connecting.await?;
|
||||
Ok(Client::boxed(IrohRemoteConnection::new(conn)))
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,134 @@
|
||||
# irpc: Serialization and Utility Modules
|
||||
|
||||
## Varint Utilities
|
||||
|
||||
The `varint-util` module (available with `rpc` or `varint-util` feature) provides LEB128 varint encoding/decoding compatible with postcard's format.
|
||||
|
||||
### Async Reading
|
||||
|
||||
```rust
|
||||
pub async fn read_varint_u64<R: AsyncRead + Unpin>(reader: &mut R) -> io::Result<Option<u64>>
|
||||
```
|
||||
|
||||
Reads a LEB128-encoded `u64` from an async reader. Returns `Ok(None)` on `UnexpectedEof` at the first byte position (clean stream end).
|
||||
|
||||
**Format:** Each byte carries 7 bits of the value, with the most significant bit as a continuation flag. Groups are stored little-endian (least significant group first).
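
As a worked illustration of this format, here is a std-only encoder and decoder over byte slices. The function names `encode_varint_u64` and `decode_varint_u64` are chosen for this sketch and are not the crate's API, which works on readers and writers instead.

```rust
/// Encode `value` as LEB128: 7 value bits per byte, least significant group first.
fn encode_varint_u64(mut value: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity(10);
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            // Last group: continuation bit stays clear.
            out.push(byte);
            return out;
        }
        // More groups follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Decode a LEB128 `u64` from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or runs past the 10 bytes a `u64` can need.
fn decode_varint_u64(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For instance, 300 is `0b1_0010_1100`: the low seven bits give `0x2c`, which becomes `0xac` with the continuation bit set, and the remaining bits give `0x02`, so the encoding is `[0xac, 0x02]`.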
|
||||
|
||||
### Sync Writing
|
||||
|
||||
```rust
|
||||
pub fn write_varint_u64_sync<W: io::Write>(writer: &mut W, value: u64) -> io::Result<usize>
|
||||
```
|
||||
|
||||
Writes a `u64` as LEB128 to a synchronous writer.
|
||||
|
||||
### Length-Prefixed Encoding
|
||||
|
||||
```rust
|
||||
// Sync:
|
||||
pub fn write_length_prefixed<T: Serialize>(write: impl io::Write, value: T) -> io::Result<()>
|
||||
pub trait WriteVarintExt: io::Write {
|
||||
fn write_varint_u64(&mut self, value: u64) -> io::Result<usize>;
|
||||
fn write_length_prefixed<T: Serialize>(&mut self, value: T) -> io::Result<()>;
|
||||
}
|
||||
|
||||
// Async:
|
||||
pub trait AsyncReadVarintExt: AsyncRead + Unpin {
|
||||
fn read_varint_u64(&mut self) -> impl Future<Output = io::Result<Option<u64>>>;
|
||||
fn read_length_prefixed<T: DeserializeOwned>(&mut self, max_size: usize) -> impl Future<Output = io::Result<T>>;
|
||||
}
|
||||
|
||||
pub trait AsyncWriteVarintExt: AsyncWrite + Unpin {
|
||||
fn write_varint_u64(&mut self, value: u64) -> impl Future<Output = io::Result<usize>>;
|
||||
fn write_length_prefixed<T: Serialize>(&mut self, value: T) -> impl Future<Output = io::Result<usize>>;
|
||||
}
|
||||
```
|
||||
|
||||
The length-prefix format is:
|
||||
```
|
||||
[varint-encoded-length][postcard-serialized-data]
|
||||
```
|
||||
|
||||
Used internally by irpc for framing all messages on QUIC streams. The `max_size` parameter in `read_length_prefixed` prevents memory exhaustion from malicious length values.
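
To make the role of `max_size` concrete, the following sketch frames opaque byte payloads with `std::io`. It is an approximation: `write_frame` and `read_frame` are hypothetical names, the payload is raw bytes rather than postcard output, and the code is synchronous where the crate's traits are async.

```rust
use std::io::{self, Read, Write};

/// Write `payload` as `[varint length][payload bytes]`.
fn write_frame<W: Write>(mut w: W, payload: &[u8]) -> io::Result<()> {
    let mut len = payload.len() as u64;
    loop {
        let byte = (len & 0x7f) as u8;
        len >>= 7;
        if len == 0 {
            w.write_all(&[byte])?;
            break;
        }
        w.write_all(&[byte | 0x80])?;
    }
    w.write_all(payload)
}

/// Read one frame, rejecting any declared length above `max_size`
/// before a buffer is allocated for it.
fn read_frame<R: Read>(mut r: R, max_size: usize) -> io::Result<Vec<u8>> {
    let mut len = 0u64;
    let mut shift = 0u32;
    loop {
        let mut byte = [0u8; 1];
        r.read_exact(&mut byte)?;
        if shift >= 64 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "varint too long"));
        }
        len |= u64::from(byte[0] & 0x7f) << shift;
        if byte[0] & 0x80 == 0 {
            break;
        }
        shift += 7;
    }
    if len > max_size as u64 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds max_size"));
    }
    let mut buf = vec![0u8; len as usize];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

The size check runs before `vec![0u8; len]`, so a peer that announces a huge length cannot force a large allocation. It only gets an `InvalidData` error.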
|
||||
|
||||
## noq Endpoint Setup
|
||||
|
||||
The `noq_endpoint_setup` feature provides helpers for creating noq endpoints with TLS configuration:
|
||||
|
||||
```rust
|
||||
pub fn configure_client(server_certs: &[&[u8]]) -> Result<ClientConfig>
|
||||
pub fn configure_server() -> Result<(ServerConfig, Vec<u8>)>
|
||||
pub fn configure_client_insecure() -> Result<ClientConfig>
|
||||
|
||||
// Non-WASM only:
|
||||
pub fn make_client_endpoint(bind_addr: SocketAddr, server_certs: &[&[u8]]) -> Result<Endpoint>
|
||||
pub fn make_insecure_client_endpoint(bind_addr: SocketAddr) -> Result<Endpoint>
|
||||
pub fn make_server_endpoint(bind_addr: SocketAddr) -> Result<(Endpoint, Vec<u8>)>
|
||||
```
|
||||
|
||||
- `configure_server()`: Creates a self-signed certificate with rcgen and configures the server with TLS 1.3. Returns the DER-encoded certificate for clients to trust.
|
||||
- `configure_client()`: Configures a client to trust specific DER certificates.
|
||||
- `configure_client_insecure()`: Skips certificate verification (for testing only).
|
||||
- Server endpoints set `max_concurrent_uni_streams(0)` to disable unidirectional streams (only bidirectional streams are used).
|
||||
- Keep-alive interval is set to 1 second on client configs.
|
||||
|
||||
## FusedOneshotReceiver
|
||||
|
||||
```rust
|
||||
pub(crate) struct FusedOneshotReceiver<T>(pub tokio::sync::oneshot::Receiver<T>);
|
||||
```
|
||||
|
||||
A wrapper that prevents panics when polling an already-completed oneshot receiver. After the inner receiver resolves, subsequent polls return `Poll::Pending` indefinitely instead of panicking.
|
||||
|
||||
This is important because irpc's `oneshot::Receiver` can be wrapped in `Receiver::Boxed` (a `BoxFuture`), and the inner future might be polled multiple times in certain select patterns.
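
The fusing behavior can be sketched for an arbitrary future using only the standard library. `Fused`, `NoopWake`, and `poll_once` are illustrative names for this sketch. The crate's wrapper is specific to `tokio::sync::oneshot::Receiver` and may be implemented differently.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Wraps a future and remembers that it has completed, so later polls
/// return `Pending` instead of polling the finished future again.
struct Fused<F> {
    inner: Option<Pin<Box<F>>>,
}

impl<F: Future> Fused<F> {
    fn new(future: F) -> Self {
        Self { inner: Some(Box::pin(future)) }
    }
}

impl<F: Future> Future for Fused<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Fused<F>` is `Unpin` because the inner future lives in a `Box`.
        let this = self.get_mut();
        let Some(inner) = this.inner.as_mut() else {
            return Poll::Pending; // already completed earlier
        };
        match inner.as_mut().poll(cx) {
            Poll::Ready(output) => {
                this.inner = None;
                Poll::Ready(output)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

/// A waker that does nothing; enough to poll by hand in a demo.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future once with a no-op waker.
fn poll_once<F: Future + Unpin>(future: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(future).poll(&mut cx)
}
```

Wrapping `std::future::ready(5)` shows the difference: the bare future panics if polled after completion, while the fused wrapper yields `Ready(5)` once and `Pending` afterwards.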
|
||||
|
||||
## now_or_never
|
||||
|
||||
```rust
|
||||
pub(crate) fn now_or_never<F: Future>(future: F) -> Option<F::Output>
|
||||
```
|
||||
|
||||
Attempts to complete a future immediately without blocking. If the future would block, returns `None`. Used internally by `NoqSenderInner::try_send()` to attempt an immediate write to the QUIC stream without yielding.
|
||||
|
||||
Implementation uses a no-op waker to poll the future once.
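
A minimal sketch of that technique, using only the standard library. The names `now_or_never_sketch` and `NoopWaker` are hypothetical, and the crate's actual implementation may differ in detail.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker whose `wake` does nothing: nobody will re-poll the future anyway.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Poll `future` exactly once; return its output only if it was ready immediately.
fn now_or_never_sketch<F: Future>(future: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // Pin the future on the stack so it can be polled without boxing.
    let future = pin!(future);
    match future.poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

The future is dropped when the function returns, so a `None` result also cancels whatever work the future had started.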
|
||||
|
||||
## Spans Feature
|
||||
|
||||
When the `spans` feature is enabled (default), `WithChannels` includes a `span: tracing::Span` field:
|
||||
|
||||
```rust
|
||||
pub struct WithChannels<I: Channels<S>, S: Service> {
|
||||
pub inner: I,
|
||||
pub tx: <I as Channels<S>>::Tx,
|
||||
pub rx: <I as Channels<S>>::Rx,
|
||||
#[cfg(feature = "spans")]
|
||||
pub span: tracing::Span,
|
||||
}
|
||||
```
|
||||
|
||||
The span is captured from `tracing::Span::current()` at the time of `WithChannels` construction (via `From` implementations). This preserves tracing context across async message-passing boundaries.
|
||||
|
||||
The `rpc_requests` macro generates a `parent_span()` method on the message enum when `no_spans` is not set:
|
||||
|
||||
```rust
|
||||
impl ComputeMessage {
|
||||
pub fn parent_span(&self) -> tracing::Span {
|
||||
let span = match self {
|
||||
ComputeMessage::Multiply(inner) => inner.parent_span_opt(),
|
||||
ComputeMessage::Sum(inner) => inner.parent_span_opt(),
|
||||
};
|
||||
span.cloned().unwrap_or_else(|| tracing::Span::current())
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This allows server-side handlers to enter the client's tracing span:
|
||||
|
||||
```rust
|
||||
async fn handle(msg: ComputeMessage) {
|
||||
let _entered = msg.parent_span().enter();
|
||||
// ... processing happens in the client's tracing context
|
||||
}
|
||||
```
|
||||
|
||||
When `no_spans` is set in the macro, no span-related code is generated, making it compatible with builds that don't have the `spans` feature enabled.
|
||||
@@ -0,0 +1,249 @@
|
||||
# irpc: Design Patterns and Usage Examples
|
||||
|
||||
## Pattern 1: Actor Model (Most Common)
|
||||
|
||||
The primary usage pattern is an actor that receives messages and processes them sequentially:
|
||||
|
||||
```rust
|
||||
struct StorageActor {
|
||||
recv: tokio::sync::mpsc::Receiver<StorageMessage>,
|
||||
state: BTreeMap<String, String>,
|
||||
}
|
||||
|
||||
impl StorageActor {
|
||||
pub fn spawn() -> StorageApi {
|
||||
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||
let actor = Self { recv: rx, state: BTreeMap::new() };
|
||||
tokio::task::spawn(actor.run());
|
||||
StorageApi { inner: Client::local(tx) }
|
||||
}
|
||||
|
||||
async fn run(mut self) {
|
||||
while let Some(msg) = self.recv.recv().await {
|
||||
self.handle(msg).await;
|
||||
}
|
||||
}
|
||||
|
||||
async fn handle(&mut self, msg: StorageMessage) {
|
||||
match msg {
|
||||
StorageMessage::Get(wc) => {
|
||||
let WithChannels { inner, tx, .. } = wc;
|
||||
tx.send(self.state.get(&inner.key).cloned()).await.ok();
|
||||
}
|
||||
StorageMessage::Set(wc) => {
|
||||
let WithChannels { inner, tx, .. } = wc;
|
||||
self.state.insert(inner.key, inner.value);
|
||||
tx.send(()).await.ok();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key points:**
|
||||
- The actor owns state and processes messages sequentially
|
||||
- `Client::local(tx)` wraps the sender side of the mpsc channel
|
||||
- `WithChannels` destructuring gives access to `inner` (the request data), `tx` (response channel), and `rx` (update channel)
|
||||
- The `..` pattern ignores `rx` when it's `NoReceiver` and `span` (with `spans` feature)
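
The request-plus-reply-channel shape behind `WithChannels` can be tried without irpc or tokio. The sketch below swaps in `std::thread` and `std::sync::mpsc` for the async runtime; `StorageMsg`, `spawn_storage_actor`, `set`, and `get` are hypothetical names chosen for illustration, not part of the crate.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

/// Each request carries its own reply channel, mirroring `inner` + `tx`.
enum StorageMsg {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Set { key: String, value: String, reply: mpsc::Sender<()> },
}

/// Start the actor thread and return the handle used to send it messages.
fn spawn_storage_actor() -> mpsc::Sender<StorageMsg> {
    let (tx, rx) = mpsc::channel::<StorageMsg>();
    thread::spawn(move || {
        // State is owned by the actor thread, so no lock is needed.
        let mut state: BTreeMap<String, String> = BTreeMap::new();
        // `recv` fails once every sender is dropped, which ends the actor.
        while let Ok(msg) = rx.recv() {
            match msg {
                StorageMsg::Get { key, reply } => {
                    // Ignore send errors: the caller may have stopped waiting.
                    reply.send(state.get(&key).cloned()).ok();
                }
                StorageMsg::Set { key, value, reply } => {
                    state.insert(key, value);
                    reply.send(()).ok();
                }
            }
        }
    });
    tx
}

fn set(actor: &mpsc::Sender<StorageMsg>, key: &str, value: &str) {
    let (reply, response) = mpsc::channel();
    actor
        .send(StorageMsg::Set { key: key.into(), value: value.into(), reply })
        .expect("actor stopped");
    response.recv().expect("actor dropped the reply channel");
}

fn get(actor: &mpsc::Sender<StorageMsg>, key: &str) -> Option<String> {
    let (reply, response) = mpsc::channel();
    actor
        .send(StorageMsg::Get { key: key.into(), reply })
        .expect("actor stopped");
    response.recv().expect("actor dropped the reply channel")
}
```

Because requests are handled one at a time in arrival order, a `get` issued after a `set` from the same caller always observes the written value.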
|
||||
|
||||
## Pattern 2: Concurrent Task Per Request
|
||||
|
||||
For long-running or independent requests, spawn a task per message:
|
||||
|
||||
```rust
|
||||
async fn run(mut self) {
|
||||
while let Ok(Some(msg)) = self.recv.recv().await {
|
||||
tokio::task::spawn(async move {
|
||||
if let Err(cause) = Self::handle(msg).await {
|
||||
eprintln!("Error: {cause}");
|
||||
}
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
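This is useful for CPU-intensive or I/O-bound requests that shouldn't block other requests. Because `Self::handle` takes no `self` here, each spawned task has to carry everything it needs in the message or reach it through shared state.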
|
||||
|
||||
## Pattern 3: Local-Only Usage
|
||||
|
||||
irpc can be used without any RPC feature for pure in-process communication:
|
||||
|
||||
```rust
|
||||
// Cargo.toml: default-features = false, features = ["derive"]
|
||||
#[rpc_requests(message = StorageMessage, no_rpc, no_spans)]
|
||||
#[derive(Serialize, Deserialize, Debug)]
|
||||
enum StorageProtocol {
|
||||
#[rpc(tx=oneshot::Sender<Option<String>>)]
|
||||
Get(Get),
|
||||
#[rpc(tx=oneshot::Sender<()>)]
|
||||
Set(Set),
|
||||
}
|
||||
```
|
||||
|
||||
The `no_rpc` flag prevents `RemoteService` from being generated, and `no_spans` removes the tracing dependency. This leaves only the local channel mechanism, with minimal dependencies (serde, tokio, tokio-util).
|
||||
|
||||
## Pattern 4: API Type Wrapping Client
|
||||
|
||||
The recommended pattern is to wrap `Client<S>` in a higher-level API type:
|
||||
|
||||
```rust
|
||||
struct StorageApi {
|
||||
inner: Client<StorageProtocol>,
|
||||
}
|
||||
|
||||
impl StorageApi {
|
||||
// Local
|
||||
pub fn spawn() -> Self {
|
||||
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||
tokio::task::spawn(StorageActor::new(rx).run());
|
||||
Self { inner: Client::local(tx) }
|
||||
}
|
||||
|
||||
// Remote (noq)
|
||||
pub fn connect(endpoint: noq::Endpoint, addr: SocketAddr) -> Self {
|
||||
Self { inner: Client::noq(endpoint, addr) }
|
||||
}
|
||||
|
||||
// Remote (iroh)
|
||||
pub fn connect_iroh(endpoint: iroh::Endpoint, addr: EndpointAddr) -> Self {
|
||||
Self { inner: irpc_iroh::client(endpoint, addr, ALPN) }
|
||||
}
|
||||
|
||||
// Type-safe methods that work for both local and remote
|
||||
pub async fn get(&self, key: String) -> irpc::Result<Option<String>> {
|
||||
self.inner.rpc(Get { key }).await
|
||||
}
|
||||
|
||||
pub async fn set(&self, key: String, value: String) -> irpc::Result<()> {
|
||||
self.inner.rpc(Set { key, value }).await
|
||||
}
|
||||
|
||||
pub async fn list(&self) -> irpc::Result<mpsc::Receiver<String>> {
|
||||
self.inner.server_streaming(List, 16).await
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This encapsulates the protocol details and provides a clean, type-safe API. The same `StorageApi` works identically whether connected locally or remotely.
|
||||
|
||||
## Pattern 5: Server Setup
|
||||
|
||||
### With noq
|
||||
|
||||
```rust
|
||||
fn serve(api: &StorageApi, endpoint: noq::Endpoint) -> Result<JoinHandle<()>> {
|
||||
let local = api.inner.as_local().context("cannot listen on remote service")?;
|
||||
let handler = StorageProtocol::remote_handler(local);
|
||||
Ok(tokio::task::spawn(irpc::rpc::listen(endpoint, handler)))
|
||||
}
|
||||
```
|
||||
|
||||
### With iroh
|
||||
|
||||
```rust
|
||||
fn serve(api: &StorageApi, endpoint: iroh::Endpoint) -> Result<Router> {
|
||||
let local = api.inner.as_local().context("cannot listen on remote service")?;
|
||||
let protocol = IrohProtocol::with_sender(local);
|
||||
Ok(Router::builder(endpoint).accept(ALPN, protocol).spawn())
|
||||
}
|
||||
```
|
||||
|
||||
## Pattern 6: Low-Level Request Handling
|
||||
|
||||
For more control than the `Client` methods provide, use `request()` directly:
|
||||
|
||||
```rust
|
||||
async fn custom_request(&self, msg: Get) -> anyhow::Result<oneshot::Receiver<Option<String>>> {
|
||||
match self.inner.request().await? {
|
||||
Request::Local(request) => {
|
||||
let (tx, rx) = oneshot::channel();
|
||||
request.send((msg, tx)).await?;
|
||||
Ok(rx)
|
||||
}
|
||||
Request::Remote(request) => {
|
||||
let (_tx, rx) = request.write(msg).await?;
|
||||
Ok(rx.into())
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This allows custom channel creation logic, e.g., different buffer sizes for local vs remote.
|
||||
|
||||
## Pattern 7: Channel Filtering and Mapping
|
||||
|
||||
irpc channels support filtering and mapping, which work for both local and remote channels:
|
||||
|
||||
```rust
|
||||
// Server-side: filter responses to only include values > 10
|
||||
let filtered_tx = wc.tx.with_filter(|v: &i64| *v > 10);
|
||||
|
||||
// Server-side: transform responses
|
||||
let mapped_tx = wc.tx.with_map(|v: i64| v * 2);
|
||||
|
||||
// Client-side: filter received updates
|
||||
let filtered_rx = rx.filter(|update: &Update| update.is_relevant());
|
||||
```
|
||||
|
||||
For both remote and local channels, these create boxed wrappers. The overhead is negligible for remote channels (network latency dominates) but is a real, if small, cost for local ones.
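
What a boxed wrapper means in practice can be sketched with `std::sync::mpsc`. This is an illustration under stated assumptions, not irpc's implementation: `FilteredSender` and `filtered_demo` are hypothetical names, and the real channel types are async.

```rust
use std::sync::mpsc;

/// Hypothetical sender wrapper: forwards a value only if the boxed predicate accepts it.
struct FilteredSender<T> {
    inner: mpsc::Sender<T>,
    // Boxing erases the closure type, at the cost of a dynamic call per send.
    keep: Box<dyn Fn(&T) -> bool + Send>,
}

impl<T> FilteredSender<T> {
    fn new(inner: mpsc::Sender<T>, keep: impl Fn(&T) -> bool + Send + 'static) -> Self {
        Self { inner, keep: Box::new(keep) }
    }

    /// Silently drops rejected values; fails only if the receiver is gone.
    fn send(&self, value: T) -> Result<(), mpsc::SendError<T>> {
        if (self.keep)(&value) {
            self.inner.send(value)
        } else {
            Ok(())
        }
    }
}

/// Send a few values through a `> 10` filter and collect what arrives.
fn filtered_demo() -> Vec<i64> {
    let (tx, rx) = mpsc::channel();
    let filtered = FilteredSender::new(tx, |v: &i64| *v > 10);
    for v in [5, 20, 11, 10] {
        filtered.send(v).expect("receiver is still alive");
    }
    drop(filtered); // close the channel so the iterator below terminates
    rx.iter().collect()
}
```

The extra indirection is one heap allocation when the wrapper is built and one dynamic call per message, which is why it only matters on the local path.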
|
||||
|
||||
## Pattern 8: Using the `wrap` Attribute
|
||||
|
||||
The `#[wrap]` attribute generates named structs from variant fields:
|
||||
|
||||
```rust
|
||||
#[rpc_requests(message = StoreMessage)]
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
enum StoreProtocol {
|
||||
#[rpc(tx=oneshot::Sender<Option<String>>)]
|
||||
#[wrap(GetRequest, derive(Clone))]
|
||||
Get(String), // Generates: pub struct GetRequest(pub String);
|
||||
|
||||
#[rpc(tx=oneshot::Sender<()>)]
|
||||
#[wrap(SetRequest)]
|
||||
Set { key: String, value: String }, // Generates: pub struct SetRequest { pub key: String, pub value: String }
|
||||
}
|
||||
```
|
||||
|
||||
Benefits:
|
||||
- Named request types can be imported and constructed by name
|
||||
- Additional derives (e.g., `Clone`) can be added
|
||||
- Custom visibility can be specified: `#[wrap(pub(crate) GetRequest)]`
|
||||
- The generated struct inherits the enum's visibility by default
|
||||
|
||||
## Pattern 9: 0-RTT Connections
|
||||
|
||||
For reduced latency on reconnections with iroh:
|
||||
|
||||
```rust
|
||||
// Client side
|
||||
let result = client.rpc_0rtt(Get { key: "x".into() }).await?;
|
||||
|
||||
// Server side (iroh)
|
||||
let protocol = Iroh0RttProtocol::with_sender(local_sender);
|
||||
let router = Router::builder(endpoint).accept(ALPN, protocol).spawn();
|
||||
```
|
||||
|
||||
**Important:** Only use 0-RTT for idempotent operations, as the data may be replayed by an attacker.
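
Why idempotency matters under replay can be shown with a small model. The `Op` type and the `apply` and `replay` helpers below are hypothetical and exist only for this illustration; they are unrelated to irpc's API.

```rust
use std::collections::BTreeMap;

/// Hypothetical request types, used only to contrast replay behavior.
enum Op {
    /// Idempotent: applying it twice leaves the same state as applying it once.
    Set { key: String, value: i64 },
    /// Not idempotent: every replay changes the state again.
    Add { key: String, delta: i64 },
}

fn apply(state: &mut BTreeMap<String, i64>, op: &Op) {
    match op {
        Op::Set { key, value } => {
            state.insert(key.clone(), *value);
        }
        Op::Add { key, delta } => {
            *state.entry(key.clone()).or_insert(0) += *delta;
        }
    }
}

/// Apply the same request `times` times, as a replaying attacker could.
fn replay(op: &Op, times: usize) -> BTreeMap<String, i64> {
    let mut state = BTreeMap::new();
    for _ in 0..times {
        apply(&mut state, op);
    }
    state
}
```

Replaying `Set` any number of times ends in the same state, so it is a reasonable candidate for 0-RTT. Replaying `Add` does not, so it should go over a fully established connection.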
|
||||
|
||||
## Pattern 10: Shared State in Actor
|
||||
|
||||
For actors that need shared state accessible from multiple handlers:
|
||||
|
||||
```rust
|
||||
struct Actor {
|
||||
recv: tokio::sync::mpsc::Receiver<Message>,
|
||||
state: Arc<Mutex<SharedState>>,
|
||||
}
|
||||
```
|
||||
|
||||
Or use the actor pattern with internal mutation:
|
||||
|
||||
```rust
|
||||
struct Actor {
|
||||
recv: tokio::sync::mpsc::Receiver<Message>,
|
||||
db: HashMap<String, String>, // owned state
|
||||
}
|
||||
```
|
||||
|
||||
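In the owned-state form, the actor processes messages sequentially, so no internal synchronization is needed. Reach for `Arc<Mutex<...>>` only when the state must also be touched outside the actor loop, for example from tasks spawned per request.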
|
||||
230
docs/research/references/iroh/irpc/10-quick-reference.md
Normal file
@@ -0,0 +1,230 @@
|
||||
# irpc: Quick Reference
|
||||
|
||||
## Crate Info
|
||||
|
||||
- **Name:** `irpc`
|
||||
- **Version:** 0.13.0
|
||||
- **License:** Apache-2.0 OR MIT
|
||||
- **Repository:** https://github.com/n0-computer/irpc
|
||||
- **MSRV:** 1.89
|
||||
|
||||
## Feature Flags
|
||||
|
||||
| Feature | Default | Dependencies Added |
|---|---|---|
| `rpc` | ✅ | noq, postcard, smallvec, tracing, tokio/io-util |
| `derive` | ✅ | irpc-derive |
| `spans` | ✅ | tracing |
| `stream` | ✅ | futures-util |
| `noq_endpoint_setup` | ✅ | rustls, rcgen, futures-buffered |
| `varint-util` | ❌ | postcard, smallvec, tokio/io-util |
|
||||
|
||||
## Type Quick Reference
|
||||
|
||||
### Core Types
|
||||
|
||||
```
|
||||
Service trait — implemented on protocol enum, defines Message type
|
||||
Channels<S> trait — implemented on request types, defines Tx/Rx types
|
||||
RpcMessage trait — blanket impl for Debug+Serialize+DeserializeOwned+Send+Sync+Unpin+'static
|
||||
Sender trait — sealed marker for sender types
|
||||
Receiver trait — sealed marker for receiver types
|
||||
WithChannels<I,S> struct — wraps request I with tx/rx/span for service S
|
||||
Client<S> struct — client to service S (local or remote)
|
||||
LocalSender<S> struct — local sender wrapping mpsc::Sender<S::Message>
|
||||
Request<L,R> enum — Local(L) or Remote(R) request
|
||||
RemoteSender<S> struct — holds QUIC stream pair for sending initial message
|
||||
```
|
||||
|
||||
### Channel Types
|
||||
|
||||
```
|
||||
oneshot::Sender<T> — Tokio or Boxed; single value; async send
|
||||
oneshot::Receiver<T> — Tokio or Boxed; single value; Future impl
|
||||
mpsc::Sender<T> — Tokio or Arc<DynSender>; stream; async send/try_send
|
||||
mpsc::Receiver<T> — Tokio or Box<DynReceiver>; stream; async recv
|
||||
NoSender — No-op sender
|
||||
NoReceiver — No-op receiver
|
||||
```
|
||||
|
||||
### Remote Types (rpc feature)
|
||||
|
||||
```
|
||||
RemoteConnection trait — open_bi(), zero_rtt_accepted(), clone_boxed()
|
||||
NoqLazyRemoteConnection — lazy noq connection with cache
|
||||
Handler<R> type — Arc<dyn Fn(R, RecvStream, SendStream) -> ...>
|
||||
```
|
||||
|
||||
### irpc-iroh Types
|
||||
|
||||
```
|
||||
IrohRemoteConnection — wraps iroh::Connection
|
||||
IrohZrttRemoteConnection — wraps iroh::OutgoingZeroRttConnection
|
||||
IrohLazyRemoteConnection — lazy iroh connection with cache
|
||||
IrohProtocol<R> — ProtocolHandler for iroh Router
|
||||
Iroh0RttProtocol<R> — ProtocolHandler with 0-RTT support
|
||||
IncomingRemoteConnection trait — abstraction over Connection and ZeroRttConnection
|
||||
```
|
||||
|
||||
## Interaction Patterns Cheatsheet
|
||||
|
||||
```rust
|
||||
// ═══════════════════════════════════════════
|
||||
// Protocol Definition
|
||||
// ═══════════════════════════════════════════
|
||||
|
||||
#[rpc_requests(message = MyMessage)]
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
enum MyProtocol {
|
||||
// Unary RPC
|
||||
#[rpc(tx=oneshot::Sender<Response>)]
|
||||
#[wrap(GetReq)]
|
||||
Get(String),
|
||||
|
||||
// Server streaming
|
||||
#[rpc(tx=mpsc::Sender<Item>)]
|
||||
#[wrap(ListReq)]
|
||||
List(ListParams),
|
||||
|
||||
// Client streaming
|
||||
#[rpc(tx=oneshot::Sender<Count>, rx=mpsc::Receiver<Item>)]
|
||||
#[wrap(UploadReq)]
|
||||
Upload,
|
||||
|
||||
// Bidirectional streaming
|
||||
#[rpc(tx=mpsc::Sender<Result>, rx=mpsc::Receiver<Update>)]
|
||||
#[wrap(ProcessReq)]
|
||||
Process(ProcessConfig),
|
||||
|
||||
// Fire and forget
|
||||
#[rpc]
|
||||
#[wrap(LogReq)]
|
||||
Log(String),
|
||||
}
|
||||
|
||||
// ═══════════════════════════════════════════
|
||||
// Client Usage
|
||||
// ═══════════════════════════════════════════
|
||||
|
||||
// Local
|
||||
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||
tokio::task::spawn(actor(rx));
|
||||
let client: Client<MyProtocol> = Client::local(tx);
|
||||
|
||||
// Remote (noq)
|
||||
let client: Client<MyProtocol> = Client::noq(endpoint, addr);
|
||||
|
||||
// Remote (iroh)
|
||||
let client: Client<MyProtocol> = irpc_iroh::client(endpoint, addr, alpn);
|
||||
|
||||
// ═══════════════════════════════════════════
|
||||
// Making Requests
|
||||
// ═══════════════════════════════════════════
|
||||
|
||||
// Unary
|
||||
let result: Response = client.rpc(GetReq("key".into())).await?;
|
||||
|
||||
// Server streaming
|
||||
let mut rx: mpsc::Receiver<Item> = client.server_streaming(ListReq(params), 16).await?;
|
||||
while let Some(item) = rx.recv().await? { ... }
|
||||
|
||||
// Client streaming
|
||||
let (update_tx, response_rx): (mpsc::Sender<Item>, oneshot::Receiver<Count>) =
|
||||
client.client_streaming(Upload, 4).await?;
|
||||
update_tx.send(item).await?;
|
||||
let count = response_rx.await?;
|
||||
|
||||
// Bidirectional
|
||||
let (update_tx, mut result_rx): (mpsc::Sender<Update>, mpsc::Receiver<Result>) =
|
||||
client.bidi_streaming(ProcessReq(config), 4, 16).await?;
|
||||
update_tx.send(update).await?;
|
||||
while let Some(result) = result_rx.recv().await? { ... }
|
||||
|
||||
// Fire and forget
|
||||
client.notify(LogReq("message".into())).await?;
|
||||
|
||||
// ═══════════════════════════════════════════
|
||||
// Server Setup
|
||||
// ═══════════════════════════════════════════
|
||||
|
||||
// noq
|
||||
let handler = MyProtocol::remote_handler(local_sender);
|
||||
irpc::rpc::listen(endpoint, handler).await;
|
||||
|
||||
// iroh
|
||||
let protocol = IrohProtocol::with_sender(local_sender);
|
||||
Router::builder(endpoint).accept(ALPN, protocol).spawn();
|
||||
|
||||
// ═══════════════════════════════════════════
|
||||
// Actor Message Handling
|
||||
// ═══════════════════════════════════════════
|
||||
|
||||
async fn handle(&mut self, msg: MyMessage) {
|
||||
match msg {
|
||||
MyMessage::Get(wc) => {
|
||||
let WithChannels { inner, tx, .. } = wc;
|
||||
let result = self.db.get(&inner.0).cloned();
|
||||
tx.send(result).await.ok();
|
||||
}
|
||||
MyMessage::List(wc) => {
|
||||
let WithChannels { tx, .. } = wc;
|
||||
for item in &self.items {
|
||||
if tx.send(item.clone()).await.is_err() { break; }
|
||||
}
|
||||
}
|
||||
MyMessage::Upload(wc) => {
|
||||
let WithChannels { tx, mut rx, .. } = wc;
|
||||
let mut count = 0;
|
||||
while let Ok(Some(item)) = rx.recv().await {
|
||||
self.process(item);
|
||||
count += 1;
|
||||
}
|
||||
tx.send(count).await.ok();
|
||||
}
|
||||
MyMessage::Process(wc) => {
|
||||
let WithChannels { tx, mut rx, inner, .. } = wc;
|
||||
tokio::task::spawn(async move {
|
||||
while let Ok(Some(update)) = rx.recv().await {
|
||||
if let Some(result) = process(update, &inner) {
|
||||
if tx.send(result).await.is_err() { break; }
|
||||
}
|
||||
}
|
||||
});
|
||||
}
|
||||
MyMessage::Log(wc) => {
|
||||
let WithChannels { inner, .. } = wc;
|
||||
println!("{}", inner.0);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Error Handling Quick Reference
|
||||
|
||||
```rust
|
||||
// Client-side errors
|
||||
use irpc::{Error, RequestError, Result};
|
||||
|
||||
// Request errors (connection/stream open failures)
|
||||
match client.rpc(GetReq("key".into())).await {
|
||||
Ok(result) => { ... }
|
||||
Err(Error::Request { source }) => { ... } // Connection failed
|
||||
Err(Error::OneshotRecv { source }) => { ... } // Response channel error
|
||||
}
|
||||
|
||||
// Channel errors
|
||||
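// Both receive errors are named `RecvError`, so alias them to avoid a name clash.
use irpc::channel::{SendError, mpsc::RecvError as MpscRecvError, oneshot::RecvError as OneshotRecvError};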
|
||||
|
||||
// SendError: ReceiverClosed | MaxMessageSizeExceeded | Io
|
||||
// RecvError (oneshot): SenderClosed | MaxMessageSizeExceeded | Io
|
||||
// RecvError (mpsc): MaxMessageSizeExceeded | Io
|
||||
```
|
||||
|
||||
## Constants
|
||||
|
||||
```rust
|
||||
pub const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024; // 16 MiB
|
||||
pub const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;
|
||||
pub const ERROR_CODE_INVALID_POSTCARD: u32 = 2;
|
||||
// Connection close code 0 = clean shutdown
|
||||
```
|
||||