docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs
@@ -0,0 +1,138 @@
# iroh-blobs: Overview and Architecture

**Version**: 0.100.0
**Repository**: https://github.com/n0-computer/iroh-blobs
**License**: MIT OR Apache-2.0
**Rust Edition**: 2021
**MSRV**: 1.89

## What It Is

`iroh-blobs` is a Rust crate for content-addressed blob transfer over QUIC connections, built on top of [iroh](https://docs.rs/iroh). It implements a request-response protocol for streaming BLAKE3-verified data between peers, along with store implementations for persisting blobs locally.

The core value proposition: transfer data of arbitrary size with **cryptographic integrity guaranteed in-stream**. Every 16 KiB chunk group can be verified against the BLAKE3 hash tree as it arrives, without waiting for the complete transfer.

## Core Concepts

| Concept | Description |
|---------|-------------|
| **Blob** | A sequence of bytes of arbitrary size, identified by its BLAKE3 hash. No metadata. |
| **Link** | A 32-byte BLAKE3 hash of a blob, used as its content address. |
| **HashSeq** | A blob whose content is a sequence of BLAKE3 hashes (each 32 bytes). Length must be a multiple of 32. |
| **Provider** | The side serving data. Waits for incoming requests and responds. |
| **Requester** | The side requesting data. Initiates connections and sends requests. |
| **Tag** | A persistent named reference to a `HashAndFormat`, protecting blobs from garbage collection. |
| **TempTag** | An ephemeral in-memory reference that protects content while the process runs. |
| **Chunk** | The fundamental BLAKE3 unit: 1024 bytes. |
| **Chunk Group** | Iroh's grouping of 16 chunks (16 KiB), the minimum granularity for range requests and verification. |

## Architecture Diagram

```
┌─────────────────────────────────────────────────────┐
│                     Application                     │
│                                                     │
│  ┌──────────┐   ┌──────────┐  ┌──────────────────┐  │
│  │  Blobs   │   │   Tags   │  │    Downloader    │  │
│  │   API    │   │   API    │  │       API        │  │
│  └────┬─────┘   └────┬─────┘  └───────┬──────────┘  │
│       │              │                │             │
│       └──────────────┴────────────────┘             │
│                      │                              │
│              ┌───────┴───────┐                      │
│              │  Store (API)  │  ← Actor-based, RPC  │
│              │   Commands    │    message passing   │
│              └───────┬───────┘                      │
│                      │                              │
│        ┌─────────────┼─────────────┐                │
│        │             │             │                │
│  ┌─────┴─────┐  ┌────┴────┐  ┌─────┴─────┐          │
│  │ MemStore  │  │ FsStore │  │ Readonly  │          │
│  │           │  │ (redb + │  │ MemStore  │          │
│  │           │  │   fs)   │  │           │          │
│  └───────────┘  └─────────┘  └───────────┘          │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│                    Network Layer                    │
│                                                     │
│  ┌──────────────────┐     ┌──────────────────────┐  │
│  │  BlobsProtocol   │     │   Remote (Client)    │  │
│  │ (Provider side)  │     │   (Requester side)   │  │
│  │                  │     │                      │  │
│  │  handle_conn()   │     │   Remote::fetch()    │  │
│  │ handle_stream()  │     │   Remote::local()    │  │
│  └────────┬─────────┘     └──────────┬───────────┘  │
│           │                          │              │
│           └──────── iroh QUIC ───────┘              │
│                 ALPN: /iroh-bytes/4                 │
└─────────────────────────────────────────────────────┘
```

## Module Structure

```
iroh-blobs/src/
├── lib.rs                  # Crate root, re-exports
├── hash.rs                 # Hash, BlobFormat, HashAndFormat
├── hashseq.rs              # HashSeq type
├── format.rs               # Format module (Collection)
│   └── collection.rs       # Collection type with metadata
├── protocol.rs             # Wire protocol types (GetRequest, etc.)
│   └── range_spec.rs       # ChunkRangesSeq, RangeSpec wire encoding
├── net_protocol.rs         # BlobsProtocol (iroh ProtocolHandler)
├── provider.rs             # Server-side request handling
│   └── events.rs           # Event system (connect/disconnect/progress)
├── get.rs                  # Client-side FSM for getting data
│   ├── error.rs            # GetError, GetResult types
│   └── request.rs          # Request execution helpers
├── api/                    # High-level store API
│   ├── blobs.rs            # Blob operations (add, export, read, etc.)
│   │   └── reader.rs       # BlobReader (AsyncRead + AsyncSeek)
│   ├── downloader.rs       # Multi-source download coordinator
│   ├── remote.rs           # Remote peer interaction (fetch, observe)
│   ├── tags.rs             # Tag management API
│   ├── proto.rs            # Store command protocol (RPC messages)
│   └── proto/              # Proto sub-modules
│       └── bitfield.rs     # Bitfield type for chunk tracking
├── store/                  # Storage implementations
│   ├── mod.rs              # IROH_BLOCK_SIZE, GcConfig
│   ├── mem.rs              # MemStore (in-memory, mutable)
│   ├── fs.rs               # FsStore (filesystem + redb hybrid)
│   ├── readonly_mem.rs     # Read-only memory store
│   ├── gc.rs               # Garbage collection
│   ├── util.rs             # Shared utilities (Tag, SparseMemFile, etc.)
│   └── test.rs             # Test utilities
├── ticket.rs               # BlobTicket (shareable connection info)
├── metrics.rs              # Prometheus metrics definitions
└── util/                   # Utilities
    ├── channel.rs          # Channel helpers
    ├── connection_pool.rs  # Connection pooling
    ├── stream.rs           # Stream abstractions
    └── temp_tag.rs         # TempTag, TagCounter, TempTags scope management
```

## Key Dependencies

| Dependency | Purpose |
|------------|---------|
| `bao-tree` | BLAKE3 verified streaming, outboard storage, BaoTree encoding/decoding |
| `iroh` | QUIC networking, endpoint, router |
| `irpc` | RPC framework for store commands |
| `postcard` | Wire serialization (compact, no-schema) |
| `redb` | Embedded key-value database (fs-store feature) |
| `range-collections` | RangeSet2 / ChunkRanges for chunk tracking |
| `bytes` | Efficient byte buffer handling |

## Feature Flags

| Feature | Default | Description |
|---------|---------|-------------|
| `fs-store` | ✅ | Filesystem-based store with redb + file hybrid |
| `rpc` | ✅ | RPC support via `noq` / `irpc` |
| `metrics` | ❌ | Prometheus metrics |
| `hide-proto-docs` | ✅ | Hides protocol docs from rustdocs |

## BLAKE3 Block Size

The crate uses a fixed block size of `IROH_BLOCK_SIZE = BlockSize::from_chunk_log(4)`, which means each chunk group is 2^4 = 16 chunks = 16 × 1024 = 16,384 bytes (16 KiB). This is the minimum granularity for range requests and verification.
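
The arithmetic above can be checked with a small standalone sketch. The constant and function names below are illustrative assumptions, not items exported by `iroh-blobs` or `bao-tree`; the sketch only derives the group size from the chunk log and counts how many groups cover a blob of a given length.

```rust
/// Size of one BLAKE3 chunk in bytes.
const CHUNK_SIZE: u64 = 1024;

/// Bytes per chunk group for a given chunk log (hypothetical helper).
const fn chunk_group_size(chunk_log: u8) -> u64 {
    CHUNK_SIZE << chunk_log
}

/// Number of chunk groups needed to cover `blob_len` bytes (hypothetical helper).
fn chunk_groups_covering(blob_len: u64, chunk_log: u8) -> u64 {
    blob_len.div_ceil(chunk_group_size(chunk_log))
}
```

With a chunk log of 4, `chunk_group_size(4)` is 16,384, so a 1 MiB blob spans 64 chunk groups.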
195
docs/research/references/iroh/iroh-blobs/02-key-types.md
Normal file
@@ -0,0 +1,195 @@
# iroh-blobs: Key Types and Data Structures

## Hash

```rust
// src/hash.rs
pub struct Hash(blake3::Hash); // 32-byte BLAKE3 hash, wraps blake3::Hash
```

The fundamental content address. Created via `Hash::new(data)` or `Hash::from_bytes([u8; 32])`. Has a constant `Hash::EMPTY` for the empty blob. Supports hex display, serde (compact binary for non-human-readable formats), and is stored as a 32-byte fixed array in redb.

Wire format: 32 raw bytes (postcard serialization). No framing overhead.

## BlobFormat

```rust
pub enum BlobFormat {
    Raw,     // A single blob
    HashSeq, // A sequence of BLAKE3 hashes
}
```

Distinguishes between a raw binary blob and a hash sequence. Wire format: single byte (0 = Raw, 1 = HashSeq).

## HashAndFormat

```rust
pub struct HashAndFormat {
    pub hash: Hash,
    pub format: BlobFormat,
}
```

Pairs a hash with its format. Wire format: 33 bytes (32 for hash + 1 for format). Display format: hex string, optionally prefixed with 's' for HashSeq.

## HashSeq

```rust
// src/hashseq.rs
pub struct HashSeq(Bytes); // Wrapper around Bytes, length must be multiple of 32
```

A blob interpreted as a sequence of 32-byte BLAKE3 hashes. Created from `Bytes` via `HashSeq::new(bytes)` (returns `None` if length is not a multiple of 32). Iterable, supports `get(index)`, `pop_front()`.

Used extensively: collections are stored as a HashSeq where the first child is metadata and subsequent children are data blobs.
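
The length rule can be illustrated without the crate. The sketch below splits a byte slice into 32-byte hashes and rejects any other length, mirroring the `None` case described for `HashSeq::new`. `split_hashes` is a hypothetical stand-in, and it uses plain `[u8; 32]` arrays rather than the crate's `Hash` type.

```rust
/// Split `bytes` into 32-byte hashes, or return `None` when the length is
/// not a multiple of 32 (hypothetical helper, not the `HashSeq` API).
fn split_hashes(bytes: &[u8]) -> Option<Vec<[u8; 32]>> {
    if bytes.len() % 32 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(32)
            // Every chunk is exactly 32 bytes, so the conversion cannot fail.
            .map(|chunk| <[u8; 32]>::try_from(chunk).expect("chunk is 32 bytes"))
            .collect(),
    )
}
```

An empty slice yields an empty sequence, 64 bytes yield two hashes, and 33 bytes are rejected.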

## Bitfield

```rust
// src/api/proto/bitfield.rs
pub struct Bitfield {
    pub size: u64,           // Total size of the blob in bytes
    pub ranges: ChunkRanges, // Which chunks are verified/present
}
```

Tracks which chunks of a blob are present and verified. Key methods:
- `is_complete()`: all chunks present
- `validated_size()`: how many bytes are verified
- `diff(&other)`: compute the delta between two bitfields

Used by the observe protocol and internal state tracking.

## Tag

```rust
// src/store/util.rs
pub struct Tag(pub Bytes); // Named reference, arbitrary bytes, typically UTF-8
```

A persistent named reference to content in the store. Tags protect content from garbage collection. Auto-generated tags use the format `"auto-2026-01-15T12:34:56.789Z"`. Tags are stored in the store's database and can be listed, created, renamed, and deleted.

## TempTag

```rust
// src/util/temp_tag.rs
pub struct TempTag {
    inner: HashAndFormat,
    on_drop: Option<Weak<dyn TagDrop>>, // Callback when dropped
}
```

An ephemeral, in-memory tag. While a `TempTag` exists, its referenced content is protected from garbage collection. When dropped, the `TagDrop` callback notifies the store to unprotect. Can be `leak()`ed to make the protection permanent for the process lifetime.

Scopes: `TempTagScope` manages groups of temp tags. `Scope::GLOBAL` is the default scope. Batches of operations can create scoped temp tags that are cleaned up together.
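
The drop-to-unprotect behavior can be modeled in a few lines of standard Rust. This is a simplified sketch under stated assumptions: content is keyed by a `String` instead of `HashAndFormat`, `ProtectedSet` is a hypothetical stand-in for the store, and the `TagDrop` method signature is invented for illustration rather than copied from the crate.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

/// Callback invoked when a temp tag goes away (simplified signature).
trait TagDrop {
    fn on_drop(&self, content: &str);
}

/// Toy store that counts live temp tags per content key (hypothetical).
#[derive(Default)]
struct ProtectedSet {
    counts: Mutex<HashMap<String, usize>>,
}

impl TagDrop for ProtectedSet {
    fn on_drop(&self, content: &str) {
        let mut counts = self.counts.lock().unwrap();
        let remaining = match counts.get_mut(content) {
            Some(n) => {
                *n -= 1;
                *n
            }
            None => return,
        };
        if remaining == 0 {
            counts.remove(content); // no live tags left: content is unprotected
        }
    }
}

impl ProtectedSet {
    /// Register one more live tag for `content` and hand out the guard.
    fn temp_tag(store: &Arc<ProtectedSet>, content: &str) -> TempTag {
        *store.counts.lock().unwrap().entry(content.to_string()).or_insert(0) += 1;
        let weak: Weak<ProtectedSet> = Arc::downgrade(store);
        let on_drop: Weak<dyn TagDrop> = weak;
        TempTag { inner: content.to_string(), on_drop: Some(on_drop) }
    }

    fn is_protected(&self, content: &str) -> bool {
        self.counts.lock().unwrap().contains_key(content)
    }
}

struct TempTag {
    inner: String,
    on_drop: Option<Weak<dyn TagDrop>>,
}

impl TempTag {
    /// Discard the callback so the protection outlives this tag.
    fn leak(mut self) {
        self.on_drop = None;
    }
}

impl Drop for TempTag {
    fn drop(&mut self) {
        // The store may already be gone, hence the Weak upgrade.
        if let Some(store) = self.on_drop.take().and_then(|weak| weak.upgrade()) {
            store.on_drop(&self.inner);
        }
    }
}
```

Holding only a `Weak` reference means a tag never keeps the store alive; if the store has shut down, dropping the tag is a no-op.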

## BlobTicket

```rust
// src/ticket.rs
pub struct BlobTicket {
    addr: EndpointAddr, // How to reach the provider (includes EndpointId, relay URL, direct addresses)
    format: BlobFormat, // Raw or HashSeq
    hash: Hash,         // What to retrieve
}
```

A shareable token containing everything needed to retrieve a blob from a provider. Serialized via the `iroh_tickets::Ticket` trait (base32-encoded with "blob" prefix). Wire format uses postcard with a variant discriminator.

```rust
// Creating a ticket
let ticket = BlobTicket::new(addr, hash, BlobFormat::Raw);

// From a ticket string
let ticket: BlobTicket = ticket_str.parse()?;
```

## ChunkRanges and ChunkRangesSeq

### ChunkRanges

```rust
pub type ChunkRanges = RangeSet2<ChunkNum>; // From range_collections crate
```

A set of non-overlapping chunk ranges. Supports boolean operations (union, intersection, difference). The fundamental unit is `ChunkNum` (a u64 newtype representing a 1024-byte BLAKE3 chunk).

Helper trait `ChunkRangesExt` provides:
- `ChunkRanges::all()`: all chunks
- `ChunkRanges::bytes(range)`: byte range rounded up to chunk boundaries
- `ChunkRanges::chunks(range)`: chunk range from u64 bounds
- `ChunkRanges::last_chunk()`: the very last chunk (for size verification)
- `ChunkRanges::chunk(n)`: a single chunk
- `ChunkRanges::offset(n)`: a single byte offset rounded to chunk
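
To make the rounding concrete, here is a minimal sketch of converting a byte range into the chunk range that covers it. It assumes the range is widened outward (start rounded down, end rounded up) so that every requested byte is included; `bytes_to_chunks` is a hypothetical helper working on plain `u64` ranges, not the `ChunkRangesExt` implementation.

```rust
use std::ops::Range;

/// Size of one BLAKE3 chunk in bytes.
const CHUNK_SIZE: u64 = 1024;

/// Smallest chunk range that covers `bytes` (hypothetical helper).
fn bytes_to_chunks(bytes: Range<u64>) -> Range<u64> {
    if bytes.start >= bytes.end {
        return 0..0; // an empty byte range selects no chunks
    }
    // Round the start down and the end up to chunk boundaries.
    (bytes.start / CHUNK_SIZE)..bytes.end.div_ceil(CHUNK_SIZE)
}
```

Under this assumption, bytes `0..1000` map to chunk `0..1`, while bytes `1000..1025` straddle a boundary and map to chunks `0..2`.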

### ChunkRangesSeq

```rust
// src/protocol/range_spec.rs
pub struct ChunkRangesSeq(SmallVec<[(u64, ChunkRanges); 2]>);
```

A sequence of `ChunkRanges`, one per blob in a HashSeq. Uses run-length encoding: stores `(offset, ranges)` pairs, where offset is the first blob index with that range spec. Unspecified indices default to the most recent range (or empty for finite sequences).

Key methods:
- `ChunkRangesSeq::all()`: request everything (root + all children, forever)
- `ChunkRangesSeq::root()`: request only the root blob
- `ChunkRangesSeq::empty()`: request nothing
- `ChunkRangesSeq::from_ranges(ranges)`: from explicit iterator
- `ChunkRangesSeq::from_ranges_infinite(ranges)`: last range repeats forever
- `.iter_non_empty_infinite()`: iterate only non-empty ranges
- `.is_blob()`: true if requesting a single blob (offset 0 with one entry)

### RangeSpec (Wire Format)

```rust
pub struct RangeSpec(SmallVec<[u64; 2]>);
```

The on-wire encoding of `ChunkRanges`. Uses alternating span lengths: the first span is deselected, the second is selected, and so on. A trailing selected span with no end stays open to the end of the blob. SmallVec avoids allocation for the common case of a single range.

Examples:
- `[]`: empty (nothing selected)
- `[0]`: everything from chunk 0 selected (entire blob)
- `[2, 5, 3, 1]`: chunk ranges `2..7` and `10..11` selected (half-open, so chunks 2 through 6 plus chunk 10)
- `[u64::MAX]`: only the last chunk (size proof)
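
The examples above follow from a simple decoding rule, sketched here on plain integers. `decode_spans` is a hypothetical function written for illustration, not the crate's decoder; it returns half-open `(start, end)` pairs, with `None` as the end of a range that stays open.

```rust
/// Decode alternating span lengths into selected chunk ranges.
/// Each pair is `(start, end)` with an exclusive end; `None` means "to the end".
fn decode_spans(spans: &[u64]) -> Vec<(u64, Option<u64>)> {
    let mut ranges = Vec::new();
    let mut boundary = 0u64;
    // `start` is set while we are inside a selected span.
    let mut start: Option<u64> = None;
    for &span in spans {
        boundary = boundary.saturating_add(span);
        match start.take() {
            // Leaving a deselected span: a selected span begins here.
            None => start = Some(boundary),
            // Leaving a selected span: close the range.
            Some(s) => ranges.push((s, Some(boundary))),
        }
    }
    if let Some(s) = start {
        ranges.push((s, None)); // last selected span is unbounded
    }
    ranges
}
```

Each element toggles between deselected and selected at the running boundary, which is why an odd number of elements always leaves the final range open.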

### ChunkRangesSeq Wire Format

Serialized as `(SmallVec<[(u64, RangeSpec); 2]>)` where each element is `(delta_offset, rangespec)`. The `delta_offset` is the distance from the previous entry. Uses postcard varint encoding for compact transmission.

## Store Command Protocol

The store API uses an RPC-style command pattern via `irpc`. Each command has a `Command` enum variant with typed request/response channels:

```rust
#[rpc_requests(message = Command, alias = "Msg", rpc_feature = "rpc")]
pub enum Request {
    ListBlobs(ListRequest),
    Batch(BatchRequest),
    DeleteBlobs(BlobDeleteRequest),
    ImportBao(ImportBaoRequest),               // streaming: rx bao items, tx result
    ExportBao(ExportBaoRequest),               // streaming: tx encoded items
    ExportRanges(ExportRangesRequest),         // streaming: tx range data
    Observe(ObserveRequest),                   // streaming: tx bitfield updates
    BlobStatus(BlobStatusRequest),
    ImportBytes(ImportBytesRequest),
    ImportByteStream(ImportByteStreamRequest), // duplex streaming
    ImportPath(ImportPathRequest),
    ExportPath(ExportPathRequest),
    ListTags(ListTagsRequest),
    SetTag(SetTagRequest),
    DeleteTags(DeleteTagsRequest),
    RenameTag(RenameTagRequest),
    CreateTag(CreateTagRequest),
    CreateTempTag(CreateTempTagRequest),
    ListTempTags(ListTempTagsRequest),
    SyncDb(SyncDbRequest),
    WaitIdle(WaitIdleRequest),
    Shutdown(ShutdownRequest),
    ClearProtected(ClearProtectedRequest),
}
```

This allows both local (in-process) and remote (RPC) store access through the same API surface.
249
docs/research/references/iroh/iroh-blobs/03-transfer-protocol.md
Normal file
@@ -0,0 +1,249 @@
# iroh-blobs: Transfer Protocol

## Overview

The transfer protocol is a **request-response** protocol operating over QUIC streams (via iroh). The ALPN is `b"/iroh-bytes/4"`.

The requester opens a bidirectional QUIC stream, sends a request, and the provider responds with BLAKE3-verified streaming data on the same stream.

**Key properties**:
- Data integrity is verified in-stream: every 16 KiB chunk group can be independently verified against the BLAKE3 hash tree
- No upper limit on blob or collection size: the streaming design avoids buffering entire transfers
- Zero round-trip overhead for multiple small blobs (via HashSeq/GetManyRequest)
- Range requests supported at chunk granularity

## Request Types

```rust
pub enum Request {
    Get(GetRequest),
    Observe(ObserveRequest),
    Slot2, Slot3, Slot4, Slot5, Slot6, Slot7, // Reserved
    Push(PushRequest),
    GetMany(GetManyRequest),
}
```

Wire format: 1-byte discriminator (postcard-encoded `RequestType` enum), followed by postcard-serialized request body.

### GetRequest

```rust
pub struct GetRequest {
    pub hash: Hash,             // BLAKE3 hash of the root blob
    pub ranges: ChunkRangesSeq, // What ranges to request
}
```

The most common request type. The `ranges` field uses `ChunkRangesSeq` to express which parts of the root blob and its children to request.

**Common patterns**:

```rust
// Request an entire single blob
let req = GetRequest::blob(hash);
// -> ChunkRangesSeq with a single element: all chunks of the root

// Request a HashSeq (root + all children)
let req = GetRequest::all(hash);
// -> ChunkRangesSeq::all() - infinite sequence of "all chunks"

// Request parts of a single blob
let req = GetRequest::builder()
    .root(ChunkRanges::bytes(0..1000))
    .build(hash);

// Request a HashSeq with specific child ranges
let req = GetRequest::builder()
    .root(ChunkRanges::all())             // full root (the hash seq)
    .child(1, ChunkRanges::bytes(0..100)) // partial child 1
    .next(ChunkRanges::all())             // full remaining children
    .build_open(hash);                    // build_open = last range repeats forever
```

### GetManyRequest

```rust
pub struct GetManyRequest {
    pub hashes: Vec<Hash>,      // Sorted, deduplicated list of hashes
    pub ranges: ChunkRangesSeq, // Ranges for each hash (no root entry)
}
```

Like a `GetRequest` for a HashSeq, but the hashes are provided by the requester instead of looked up from the provider. This avoids the provider needing to have a pre-existing HashSeq blob.

```rust
let req = GetManyRequest::builder()
    .hash(hash1, ChunkRanges::all())
    .hash(hash2, ChunkRanges::all())
    .build();
// Deduplicates and sorts hashes automatically
```

### PushRequest

```rust
pub struct PushRequest(GetRequest); // Wraps a GetRequest
```

The inverse of a GetRequest: the requester pushes data to the provider. The request describes what will be sent, followed by the actual data stream. Providers may reject push requests (disabled by default via `EventMask`).

### ObserveRequest

```rust
pub struct ObserveRequest {
    pub hash: Hash,
    pub ranges: RangeSpec, // Which ranges to observe
}
```

Subscribes to availability changes for a blob's bitfield. The provider sends `ObserveItem` updates as chunks become available.

## Response Format

### For Get/GetMany/Push

The response is BLAKE3-verified streaming data (bao-tree format). For each blob in the request:

1. **8-byte size header** (little-endian u64): the total size of the blob
2. **BLAKE3 verified stream**: encoded data for the requested ranges, using bao-tree's mixed encoding:
   - `BaoContentItem::Parent(node, (left_hash, right_hash))`: internal hash tree nodes (64 bytes each)
   - `BaoContentItem::Leaf(Leaf { offset, data })`: actual data chunks

The data is sent in order: ascending chunks for each blob, blobs in HashSeq order.

**Verification**: The requester validates each chunk group against the expected BLAKE3 hash tree. Invalid data is detected within at most 16 KiB of reception. Missing data (provider doesn't have a chunk) causes the provider to close the stream at the point where data becomes unavailable.
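
Reading the size header described in step 1 needs nothing beyond the standard library. The sketch below splits the first 8 bytes off a buffered response and decodes them as a little-endian `u64`; `split_size_header` is a hypothetical helper operating on an in-memory slice, whereas the crate reads the header from the QUIC stream inside its FSM.

```rust
/// Split a response buffer into the declared blob size and the remaining
/// encoded bytes. Returns `None` if fewer than 8 bytes are available.
fn split_size_header(stream: &[u8]) -> Option<(u64, &[u8])> {
    let (header, rest) = stream.split_first_chunk::<8>()?;
    Some((u64::from_le_bytes(*header), rest))
}
```

The declared size is not trusted on its own: per the verification rule above, it is only as good as the hash tree data that follows it.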

### For Observe

The provider sends length-prefixed `ObserveItem` messages:

```rust
pub struct ObserveItem {
    pub size: u64,           // Blob size
    pub ranges: ChunkRanges, // Available chunks
}
```

Updates are sent as deltas: only the new chunks that have become available since the last update.

## Error Handling

Error codes for stream/connection closure:

| Code | Name | Meaning |
|------|------|---------|
| 0 | StreamDropped | RecvStream was dropped |
| 1 | ProviderTerminating | Provider is shutting down |
| 2 | RequestReceived | Only one request per stream allowed |
| 1 (application) | ERR_PERMISSION | Permission denied |
| 2 (application) | ERR_LIMIT | Rate limited |
| 3 (application) | ERR_INTERNAL | Internal error |

## Client-Side FSM (Get)

The `get::fsm` module implements the get request as a **finite state machine** for maximum control:

```
AtInitial
    │ (open QUIC stream)
    ▼
AtConnected
    │ (send request, drop writer)
    ▼
ConnectedNext ─┬─ StartRoot(hash, ranges)     // offset 0 = root blob
               ├─ StartChild(offset, ranges)  // offset > 0 = child blob
               └─ Closing                     // empty request
    │
AtStartRoot / AtStartChild
    │ (determine hash for child)
    ▼
AtBlobHeader
    │ (read 8-byte size)
    ▼
AtBlobContent
    │ (stream BLAKE3-verified items)
    ├─ More(content_item) → AtBlobContent  // loop
    └─ Done → AtEndBlob
    │
AtEndBlob
    │ (iterate to next blob in sequence)
    ├─ MoreChildren(AtStartChild)
    └─ Closing
        │ (drain remaining bytes)
        ▼
    Stats (transfer statistics)
```

Each state transition is explicit. The FSM gives the consumer full control:
- `AtBlobContent::next()` returns `BlobContentNext::More((content, item))` or `BlobContentNext::Done(end)`
- `AtBlobHeader::next()` reads the size header and creates a `ResponseDecoder`
- `AtStartChild::next(hash)` requires the caller to supply the hash (from the HashSeq)

### Stats Tracking

```rust
pub struct Stats {
    pub payload_bytes_read: u64,    // Actual data bytes
    pub other_bytes_read: u64,      // Hash pairs, headers
    pub payload_bytes_written: u64, // For push
    pub other_bytes_written: u64,   // For push
    pub elapsed: Duration,
}
```

## Provider-Side Handling

```rust
pub async fn handle_connection(connection: Connection, store: Store, events: EventSender);
```

The provider accepts QUIC streams on a connection. For each stream:
1. Read the request type byte
2. Deserialize the request
3. Dispatch to `handle_get`, `handle_get_many`, `handle_observe`, or `handle_push`
4. For `handle_get`: iterate over the `ChunkRangesSeq`, streaming each blob via `store.export_bao(hash, ranges)`
5. For HashSeq requests: load the root blob, parse it as `HashSeq`, then stream each requested child

### Event System

The provider can emit events for monitoring and access control:

```rust
pub struct EventMask {
    pub connected: ConnectMode, // None, Notify, Intercept
    pub get: RequestMode,       // None, Notify, Intercept, NotifyLog, InterceptLog, Disabled
    pub get_many: RequestMode,
    pub push: RequestMode,      // Disabled by default!
    pub observe: ObserveMode,
    pub throttle: ThrottleMode, // None, Intercept
}
```

- **None**: No events, requests processed normally
- **Notify**: Events sent but cannot block requests
- **Intercept**: Events sent as RPC requests; handler can reject with `AbortReason`
- **Disabled**: All requests of this type rejected

Progress events: `TransferStarted`, `TransferProgress`, `TransferCompleted`, `TransferAborted`.

## Collection Format

```rust
pub struct Collection {
    blobs: Vec<(String, Hash)>, // Named references to child blobs
}
```

Wire format (as a HashSeq blob):
1. First child blob: `CollectionMeta` serialized with postcard
2. Remaining children: the actual data blobs

```rust
pub struct CollectionMeta {
    header: [u8; 13],   // Must be b"CollectionV0."
    names: Vec<String>, // Names for each child blob
}
```

The header `b"CollectionV0."` is a magic number for format identification. The meta blob's hash becomes the first entry in the HashSeq, followed by the hashes of each data blob. Names correspond 1:1 with data blobs (excluding the meta entry).
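
The 1:1 pairing of names and data blobs can be sketched as follows. This is an illustration only: `pair_names` is a hypothetical function, hashes are plain `[u8; 32]` arrays, and the names are assumed to have already been decoded from the meta blob (postcard decoding is left out).

```rust
/// Plain stand-in for the crate's `Hash` type.
type Hash = [u8; 32];

/// Pair each name with its data blob hash, skipping the meta entry at index 0.
/// Returns `None` if the hash sequence is empty or the counts do not match.
fn pair_names(names: &[String], hash_seq: &[Hash]) -> Option<Vec<(String, Hash)>> {
    // The first entry is the hash of the meta blob, not a data blob.
    let (_meta, children) = hash_seq.split_first()?;
    if children.len() != names.len() {
        return None;
    }
    Some(names.iter().cloned().zip(children.iter().copied()).collect())
}
```

Rejecting a count mismatch up front keeps a truncated or padded HashSeq from silently shifting names onto the wrong blobs.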
250
docs/research/references/iroh/iroh-blobs/04-storage.md
Normal file
@@ -0,0 +1,250 @@
# iroh-blobs: Storage Architecture

## Overview

iroh-blobs provides three store implementations sharing a common `Store` API surface:

| Store | Location | Mutable | Use Case |
|-------|----------|---------|----------|
| `MemStore` | In-memory | ✅ | Small data, testing, WASM |
| `FsStore` | Filesystem + redb | ✅ | Production, large data |
| `ReadonlyMemStore` | In-memory | ❌ | Static data serving |

All stores implement the same RPC-based command protocol (`Command` enum), allowing both local in-process and remote RPC access through the same `Store` type.

## Store API Surface

The `Store` type (from `api::Store`) is the primary interface. It's accessed via typed sub-APIs:

```rust
let store: Store = /* ... */;

// Blob operations
store.blobs()          // → Blobs API (add, export, read, delete, observe, etc.)
store.tags()           // → Tags API (create, list, set, delete, rename)

// Direct operations
store.add_bytes(data)  // → AddProgress
store.add_slice(data)  // → TempTag (convenience)
store.get_bytes(hash)  // → Result<Bytes>
store.has(hash)        // → bool
store.shutdown()       // Clean shutdown
store.wait_idle()      // Wait for all tasks to complete
store.sync_db()        // Sync database to disk (FsStore)
```

## Blobs API

```rust
let blobs = store.blobs();

// Import
blobs.add_slice(data)                           // → AddProgress (raw format)
blobs.add_bytes(data)                           // → AddProgress (raw format)
blobs.add_bytes_with_opts(AddBytesOptions{..})  // → AddProgress (with format)
blobs.import_byte_stream(format)                // → streaming import

// Export
blobs.reader(hash)                              // → BlobReader (AsyncRead + AsyncSeek)
blobs.export(hash, path)                        // → export to filesystem
blobs.export_bao(hash, ranges)                  // → ExportBao (BLAKE3 verified stream)
blobs.export_ranges(hash, ranges)               // → ExportRanges (raw data ranges)

// Observe (subscribe to chunk availability)
blobs.observe(hash)                             // → ObserveAt (bitfield stream)

// Status
blobs.status(hash)                              // → BlobStatus (NotFound/Partial/Complete)

// Import BAO-encoded data
blobs.import_bao_bytes(hash, ranges, data)      // → import verified BAO stream
blobs.import_bao_reader(hash, ranges, reader)   // → import from async reader

// Batch operations (scoped temp tags)
blobs.batch()                                   // → Batch (auto-cleanup scope)

// Delete
blobs.delete(hashes)                            // → force delete (use GC normally)
```

## Tags API

```rust
let tags = store.tags();

tags.set(name, value)        // Set a persistent tag
tags.create(value)           // Auto-generate a tag name, return Tag
tags.get(name)               // → Option<TagInfo>
tags.list()                  // → Stream<TagInfo>
tags.list_hash_seq()         // → Stream<TagInfo> (only HashSeq format)
tags.delete(name)            // Delete a tag
tags.delete_range(range)     // Delete tags in range
tags.delete_prefix(prefix)   // Delete tags with prefix
tags.rename(from, to)        // Atomically rename a tag
tags.temp_tag(value)         // → TempTag (ephemeral protection)
```

## MemStore Architecture

The in-memory store uses a simple actor pattern:

```
MemStore (ApiClient)
  │
  └── Actor (tokio task)
        ├── State
        │     ├── data: HashMap<Hash, BaoFileHandle>   // All blob data
        │     ├── tags: BTreeMap<Tag, HashAndFormat>   // Persistent tags
        │     └── empty_hash: BaoFileHandle            // Special entry for empty blob
        ├── tasks: JoinSet<TaskResult>                 // Spawned import/export tasks
        ├── temp_tags: TempTags                        // Ephemeral protection
        ├── protected: HashSet<Hash>                   // GC-protected hashes
        └── idle_waiters: Vec<oneshot::Sender<()>>     // Wait-idle notifications
```
|
||||||
|
|
||||||
|
### BaoFileHandle / BaoFileStorage
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum BaoFileStorage {
|
||||||
|
Partial(PartialMemStorage), // Still downloading
|
||||||
|
Complete(CompleteStorage), // Fully available
|
||||||
|
}
|
||||||
|
|
||||||
|
pub struct PartialMemStorage {
|
||||||
|
data: SparseMemFile, // Sparse byte array for data
|
||||||
|
outboard: SparseMemFile, // Sparse byte array for BLAKE3 hash tree
|
||||||
|
size: SizeInfo, // Known/estimated size
|
||||||
|
bitfield: Bitfield, // Which chunks are verified
|
||||||
|
}
|
||||||
|
|
||||||
|
pub struct CompleteStorage {
|
||||||
|
data: Bytes, // Complete data
|
||||||
|
outboard: Bytes, // Complete outboard (hash tree)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Each handle keeps its `BaoFileStorage` behind a `watch::Sender<BaoFileStorage>`, so subscribers can watch state changes as they happen (this is what backs the `observe` API).
|
||||||
|
|
||||||
|
### Data Flow (Import)
|
||||||
|
|
||||||
|
1. `add_bytes(data)` → compute outboard via `PreOrderMemOutboard::create()` → transition `Partial → Complete`
|
||||||
|
2. `import_bao(hash, size, stream)` → receive `BaoContentItem` stream → write to `PartialMemStorage` → update bitfield → transition to `Complete` when all chunks present
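
The `Partial → Complete` transition described in the two flows above can be sketched as a small state machine. This is an illustrative model, not the crate's code: `Storage`, `new_partial`, and `write_leaf` are hypothetical names, the bitfield is modelled as a set of chunk-group indices, and each leaf is assumed to be already verified, to start on a group boundary, and to fit inside the declared size.

```rust
use std::collections::BTreeSet;

const GROUP_SIZE: u64 = 16 * 1024; // one chunk group

#[derive(Debug, PartialEq)]
pub enum Storage {
    Partial { data: Vec<u8>, verified: BTreeSet<u64>, size: u64 },
    Complete { data: Vec<u8> },
}

impl Storage {
    /// Start with a zero-filled buffer and an empty bitfield.
    pub fn new_partial(size: u64) -> Self {
        Storage::Partial { data: vec![0; size as usize], verified: BTreeSet::new(), size }
    }

    /// Store one verified leaf, mark its group, and promote to `Complete`
    /// once every group of the blob has been seen.
    pub fn write_leaf(self, offset: u64, leaf: &[u8]) -> Self {
        match self {
            Storage::Partial { mut data, mut verified, size } => {
                let start = offset as usize;
                data[start..start + leaf.len()].copy_from_slice(leaf);
                verified.insert(offset / GROUP_SIZE);
                let groups = ((size + GROUP_SIZE - 1) / GROUP_SIZE).max(1);
                if verified.len() as u64 == groups {
                    Storage::Complete { data }
                } else {
                    Storage::Partial { data, verified, size }
                }
            }
            // Writes to an already complete blob are ignored in this sketch.
            complete => complete,
        }
    }
}
```

Taking `self` by value makes the transition explicit: a caller cannot keep using the old `Partial` value after it has been promoted.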
|
||||||
|
|
||||||
|
### Data Flow (Export)
|
||||||
|
|
||||||
|
1. `export_bao(hash, ranges)` → look up `BaoFileHandle` → `traverse_ranges_validated(data, outboard, &ranges, tx)` — streams validated BAO data
|
||||||
|
|
||||||
|
## FsStore Architecture (Hybrid Store)
|
||||||
|
|
||||||
|
The filesystem store uses a **hybrid approach** that stores small data inline in redb and large data as files on disk.
|
||||||
|
|
||||||
|
### Design Rationale (from DESIGN.md)
|
||||||
|
|
||||||
|
- **Databases** are good for small blobs (low per-entry overhead, fast random access)
|
||||||
|
- **Filesystems** are good for large blobs (OS-level caching, direct file access)
|
||||||
|
- **Neither alone** works well for both cases
|
||||||
|
|
||||||
|
### Layout
|
||||||
|
|
||||||
|
```
|
||||||
|
<data_dir>/
|
||||||
|
├── db/ # redb database
|
||||||
|
│ ├── metadata table # Hash → EntryState
|
||||||
|
│ ├── inline_data table # Hash → Bytes (for small blobs)
|
||||||
|
│ ├── inline_outboard table # Hash → Bytes (for small outboards)
|
||||||
|
│ └── tags table # Tag → HashAndFormat
|
||||||
|
├── data/<hash>.data # Large blob data files
|
||||||
|
├── data/<hash>.outboard # Large outboard files
|
||||||
|
├── data/<hash>.sizes # Size tracking for partial files
|
||||||
|
└── data/<hash>.bitfield # Validated chunk tracking for partial files
|
||||||
|
```
|
||||||
|
|
||||||
|
### EntryState
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Simplified from src/store/fs/entry_state.rs
|
||||||
|
pub enum EntryState {
|
||||||
|
Complete(CompleteEntryState),
|
||||||
|
Partial(PartialEntryState),
|
||||||
|
}
|
||||||
|
|
||||||
|
pub struct CompleteEntryState {
|
||||||
|
pub data: DataLocation, // Inline, Owned (canonical path), or External (user path)
|
||||||
|
pub outboard: OutboardLocation, // Inline, Owned, or NotNeeded
|
||||||
|
pub size: u64,
|
||||||
|
}
|
||||||
|
|
||||||
|
pub enum DataLocation {
|
||||||
|
Inline, // Stored in redb inline_data table
|
||||||
|
Owned, // File at canonical path <hash>.data
|
||||||
|
External(Vec<PathBuf>), // User-owned file paths
|
||||||
|
}
|
||||||
|
|
||||||
|
pub enum OutboardLocation {
|
||||||
|
Inline, // Stored in redb inline_outboard table
|
||||||
|
Owned, // File at canonical path <hash>.outboard
|
||||||
|
NotNeeded, // Data ≤ 16 KiB, no outboard needed
|
||||||
|
}
|
||||||
|
|
||||||
|
pub struct PartialEntryState {
|
||||||
|
// Either we know the verified size, or we don't yet
|
||||||
|
pub verified_size: Option<NonZeroU64>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Thresholds
|
||||||
|
|
||||||
|
- **Data inline threshold**: 16 KiB (default) — blobs smaller than this are stored entirely in redb
|
||||||
|
- **Outboard inline threshold**: 16 KiB (default) — outboards smaller than this are stored in redb
|
||||||
|
- Data ≤ 16 KiB has no outboard (not needed for verification of a single chunk group)
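
To make the thresholds concrete, here is a rough sketch of the placement decision. It rests on several assumptions that are not confirmed by the source: the outboard is taken to be one 64-byte hash pair per internal node of a binary tree over chunk groups, the boundary comparisons (`<` versus `<=`) simply follow the wording of the bullets above, and `DataPlacement`, `OutboardPlacement`, `outboard_size`, and `place` are hypothetical names.

```rust
const CHUNK_GROUP: u64 = 16 * 1024; // one chunk group
const DATA_INLINE_THRESHOLD: u64 = 16 * 1024; // documented default
const OUTBOARD_INLINE_THRESHOLD: u64 = 16 * 1024; // documented default

#[derive(Debug, PartialEq)]
pub enum DataPlacement { Inline, File }

#[derive(Debug, PartialEq)]
pub enum OutboardPlacement { NotNeeded, Inline, File }

/// Assumed outboard size: a binary tree with n leaves has n - 1 parents,
/// and each parent stores two 32-byte hashes.
pub fn outboard_size(data_len: u64) -> u64 {
    let groups = (data_len + CHUNK_GROUP - 1) / CHUNK_GROUP;
    groups.saturating_sub(1) * 64
}

/// Decide where data and outboard would live for a blob of `data_len` bytes.
pub fn place(data_len: u64) -> (DataPlacement, OutboardPlacement) {
    let data = if data_len < DATA_INLINE_THRESHOLD {
        DataPlacement::Inline
    } else {
        DataPlacement::File
    };
    let outboard = if data_len <= CHUNK_GROUP {
        OutboardPlacement::NotNeeded
    } else if outboard_size(data_len) < OUTBOARD_INLINE_THRESHOLD {
        OutboardPlacement::Inline
    } else {
        OutboardPlacement::File
    };
    (data, outboard)
}
```

Under these assumptions a 1 MiB blob has 64 chunk groups and a 4032-byte outboard, so its data goes to a file while its outboard stays inline in the database.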
|
||||||
|
|
||||||
|
### Blob Lifecycle
|
||||||
|
|
||||||
|
**Adding a local file (known data, unknown hash)**:
|
||||||
|
1. Compute the full BLAKE3 hash and outboard
|
||||||
|
2. Atomically move the file into the store under the hash name
|
||||||
|
3. Apply inlining rules: small files → redb, large files → filesystem
|
||||||
|
|
||||||
|
**Syncing from remote (known hash, unknown data)**:
|
||||||
|
1. Start with no data — keep state in memory (not in database)
|
||||||
|
2. As chunks arrive, write incrementally to partial files
|
||||||
|
3. Once size is known to exceed the inline threshold, create database entry + filesystem files
|
||||||
|
4. On completion, transition to `Complete` state and apply inlining rules
|
||||||
|
|
||||||
|
**Deletion**:
|
||||||
|
- Tags protect content from GC
|
||||||
|
- `TempTag` provides ephemeral (process-lifetime) protection
|
||||||
|
- HashSeq tags protect the root blob AND all referenced child blobs
|
||||||
|
- GC is mark-and-sweep: mark all reachable content via tags → sweep (delete) everything else
|
||||||
|
- Explicit `force` deletion bypasses protection (emergency use only)
|
||||||
|
|
||||||
|
### FsStore Actor Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
FsStore (ApiClient)
|
||||||
|
│
|
||||||
|
└── MainActor (tokio task)
|
||||||
|
├── TaskContext { config, db_actor_sender }
|
||||||
|
├── EntityMap: HashMap<Hash, ActiveEntityState> // Currently active entities
|
||||||
|
├── JoinSet<TaskResult> // Running tasks
|
||||||
|
├── TempTags // Ephemeral protection
|
||||||
|
├── ProtectedSet // GC protection
|
||||||
|
└── idle_waiters
|
||||||
|
```
|
||||||
|
|
||||||
|
The FsStore uses an **entity manager** pattern where each hash gets a `BaoFileHandle` (like MemStore) when active, and entries are cleaned up when tasks complete.
|
||||||
|
|
||||||
|
## Garbage Collection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct GcConfig {
|
||||||
|
pub interval: Duration,
|
||||||
|
pub add_protected: Option<ProtectCb>, // Optional callback to add more protected hashes
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
GC is a two-phase process:
|
||||||
|
1. **Mark**: Walk all tags (persistent + temp), collect reachable hashes. For HashSeq format, traverse the hash sequence to find all child hashes.
|
||||||
|
2. **Sweep**: Delete all blobs not in the reachable set, in batches of 100.
|
||||||
|
|
||||||
|
GC runs automatically at a configurable interval via `run_gc(store, config)`, or manually via `gc_run_once(store, live)`.
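
The two phases can be sketched over plain in-memory collections. This is a simplified model under stated assumptions, not the crate's `run_gc`: the store is a `HashMap`, tags are a slice of `(hash, format)` pairs, and `mark`, `sweep`, and `Format` are hypothetical names. `batch` must be non-zero.

```rust
use std::collections::{HashMap, HashSet};

type Hash = [u8; 32];

#[derive(Clone, Copy)]
pub enum Format { Raw, HashSeq }

/// Mark phase: every tagged hash is live; a HashSeq root also keeps
/// each 32-byte child hash listed in its content alive.
pub fn mark(tags: &[(Hash, Format)], blobs: &HashMap<Hash, Vec<u8>>) -> HashSet<Hash> {
    let mut live = HashSet::new();
    for (hash, format) in tags {
        live.insert(*hash);
        if let Format::HashSeq = format {
            if let Some(bytes) = blobs.get(hash) {
                for child in bytes.chunks_exact(32) {
                    let mut h = [0u8; 32];
                    h.copy_from_slice(child);
                    live.insert(h);
                }
            }
        }
    }
    live
}

/// Sweep phase: delete everything outside `live`, `batch` hashes at a time.
/// Returns the number of batches issued.
pub fn sweep(blobs: &mut HashMap<Hash, Vec<u8>>, live: &HashSet<Hash>, batch: usize) -> usize {
    let dead: Vec<Hash> = blobs.keys().filter(|h| !live.contains(*h)).copied().collect();
    let mut batches = 0;
    for chunk in dead.chunks(batch) {
        for hash in chunk {
            blobs.remove(hash);
        }
        batches += 1;
    }
    batches
}
```

Collecting the dead set before deleting keeps the mark result stable while the sweep mutates the store.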
|
||||||
@@ -0,0 +1,202 @@
|
|||||||
|
# iroh-blobs: Remote API and Downloader
|
||||||
|
|
||||||
|
## Remote API
|
||||||
|
|
||||||
|
The `Remote` type (`api::remote::Remote`) provides the client-side interface for interacting with remote iroh-blobs providers. It's a thin wrapper around `ApiClient` that exposes fetch, observe, and push operations.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let remote = store.remote(); // or Remote::from_sender(client)
|
||||||
|
|
||||||
|
// Get local info about what we already have
|
||||||
|
let local = remote.local(hash_and_format).await?;
|
||||||
|
|
||||||
|
// Compute what we need
|
||||||
|
let missing = local.missing();
|
||||||
|
|
||||||
|
// Execute a download
|
||||||
|
let stats = remote.execute_get(connection, request).await?;
|
||||||
|
|
||||||
|
// Or use the simpler fetch API
|
||||||
|
let progress = remote.fetch(connection, hash, format, store);
|
||||||
|
```
|
||||||
|
|
||||||
|
### LocalInfo
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct LocalInfo {
|
||||||
|
pub size: Option<u64>, // Total size if known
|
||||||
|
pub present: ChunkRanges, // Chunks we already have
|
||||||
|
pub missing: ChunkRanges, // Chunks we still need
|
||||||
|
pub hash_and_format: HashAndFormat,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`LocalInfo` is computed by querying the local store's bitfield for a given hash and comparing it against what a full download would require.
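
The "present versus required" comparison amounts to subtracting ranges. The sketch below models chunk ranges as plain `Range<u64>` values rather than the crate's `ChunkRanges` type; `missing` is a hypothetical helper and assumes `present` is sorted and non-overlapping.

```rust
use std::ops::Range;

/// Subtract the sorted, non-overlapping `present` ranges from `0..total_chunks`.
pub fn missing(total_chunks: u64, present: &[Range<u64>]) -> Vec<Range<u64>> {
    let mut out = Vec::new();
    let mut cursor = 0; // first chunk not yet accounted for
    for r in present {
        let start = r.start.min(total_chunks);
        if start > cursor {
            out.push(cursor..start); // gap before this present range
        }
        cursor = cursor.max(r.end.min(total_chunks));
    }
    if cursor < total_chunks {
        out.push(cursor..total_chunks); // tail after the last present range
    }
    out
}
```

An empty result means the blob is complete locally and no request needs to be sent.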
|
||||||
|
|
||||||
|
### Fetch Process
|
||||||
|
|
||||||
|
The `fetch` method handles the complete lifecycle:
|
||||||
|
|
||||||
|
1. **Local check**: Query the store for what we already have
|
||||||
|
2. **Request computation**: If format is HashSeq, read the local HashSeq to compute precise missing ranges
|
||||||
|
3. **Connection**: Open a QUIC stream to the provider
|
||||||
|
4. **Transfer**: Use the get FSM to stream data into the store
|
||||||
|
5. **Verification**: BLAKE3 verification happens in-stream during the transfer
|
||||||
|
|
||||||
|
For HashSeq format:
|
||||||
|
- First fetch the root blob (the HashSeq)
|
||||||
|
- Parse it to get child hashes
|
||||||
|
- For each child, check local availability and compute missing ranges
|
||||||
|
- Fetch only what's missing
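
The HashSeq steps above can be sketched as a planning function: parse the root blob into 32-byte child hashes, then list the request offsets of the children that are not held locally. The helper names are hypothetical. The offset convention (offset 0 is the root, child `i` sits at offset `i + 1`) follows the provider-side flow shown later in this reference; whole-blob granularity is a simplification, since the real fetch computes missing ranges per child.

```rust
use std::collections::HashSet;

type Hash = [u8; 32];

/// Split a HashSeq blob into its child hashes; `None` if the length is not a multiple of 32.
pub fn parse_hash_seq(bytes: &[u8]) -> Option<Vec<Hash>> {
    if bytes.len() % 32 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(32)
            .map(|c| {
                let mut h = [0u8; 32];
                h.copy_from_slice(c);
                h
            })
            .collect(),
    )
}

/// Request offsets of the children that still need to be fetched.
pub fn missing_child_offsets(root: &[u8], local: &HashSet<Hash>) -> Option<Vec<u64>> {
    let children = parse_hash_seq(root)?;
    Some(
        children
            .iter()
            .enumerate()
            .filter(|(_, h)| !local.contains(*h))
            .map(|(i, _)| i as u64 + 1)
            .collect(),
    )
}
```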
|
||||||
|
|
||||||
|
### Observe
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Subscribe to bitfield updates from a remote provider
|
||||||
|
let mut stream = remote.observe(connection, hash).stream().await?;
|
||||||
|
while let Some(bitfield) = stream.next().await {
|
||||||
|
// Process availability updates
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The observe protocol sends `ObserveItem` messages (size + available ranges) whenever new chunks become available on the provider. The initial message contains the full current state; subsequent messages contain only deltas.
|
||||||
|
|
||||||
|
### Push
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Push local data to a remote provider
|
||||||
|
let progress = remote.push(connection, request, store);
|
||||||
|
```
|
||||||
|
|
||||||
|
Push uses the same FSM-style approach but in reverse — the local side reads from the store and writes BLAKE3-verified data to the QUIC stream.
|
||||||
|
|
||||||
|
## Downloader API
|
||||||
|
|
||||||
|
The `Downloader` (`api::downloader::Downloader`) coordinates downloads from multiple sources:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let downloader = Downloader::new(store, endpoint);
|
||||||
|
|
||||||
|
// Download from specific providers
|
||||||
|
let progress = downloader.download(DownloadRequest {
|
||||||
|
request: FiniteRequest::Get(get_request),
|
||||||
|
providers: vec![endpoint_id_1, endpoint_id_2],
|
||||||
|
strategy: SplitStrategy::Split,
|
||||||
|
}).stream();
|
||||||
|
```
|
||||||
|
|
||||||
|
### SplitStrategy
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum SplitStrategy {
|
||||||
|
Split, // Split the request across multiple providers
|
||||||
|
None, // Use a single provider
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When `SplitStrategy::Split` is used, the downloader:
|
||||||
|
1. Splits the `GetRequest` into per-child requests
|
||||||
|
2. Distributes children across available providers
|
||||||
|
3. Downloads in parallel from multiple sources
|
||||||
|
4. Stores each completed child into the local store
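
As an illustration of step 2, the sketch below assigns child offsets to providers round-robin. The source does not say which distribution policy the downloader actually uses, so round-robin is purely an assumption here, and `split_round_robin` is a hypothetical helper.

```rust
/// Assign each child offset to a provider index in round-robin order.
/// Returns one list of offsets per provider (empty lists if there are no children).
pub fn split_round_robin(children: &[u64], providers: usize) -> Vec<Vec<u64>> {
    let mut plan = vec![Vec::new(); providers];
    if providers == 0 {
        return plan; // nothing to assign to
    }
    for (i, child) in children.iter().enumerate() {
        plan[i % providers].push(*child);
    }
    plan
}
```

A real scheduler would also have to reassign work when a provider fails, which is what the `ProviderFailed` progress item further down hints at.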
|
||||||
|
|
||||||
|
### DownloadRequest
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct DownloadRequest {
|
||||||
|
pub request: FiniteRequest, // What to download
|
||||||
|
pub providers: Vec<EndpointId>, // Who to download from
|
||||||
|
pub strategy: SplitStrategy, // How to split work
|
||||||
|
}
|
||||||
|
|
||||||
|
pub enum FiniteRequest {
|
||||||
|
Get(GetRequest),
|
||||||
|
GetMany(GetManyRequest),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Download Progress
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum DownloadProgressItem {
|
||||||
|
TryProvider { id: EndpointId, request: Arc<GetRequest> },
|
||||||
|
ProviderFailed { id: EndpointId, request: Arc<GetRequest> },
|
||||||
|
PartComplete { request: Arc<GetRequest> },
|
||||||
|
Progress(u64),
|
||||||
|
DownloadError,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Connection Pooling
|
||||||
|
|
||||||
|
The `util::connection_pool::ConnectionPool` manages reusable QUIC connections:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let pool = ConnectionPool::new(endpoint, ALPN, options);
|
||||||
|
let connection = pool.connect(endpoint_id).await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
Options include connection timeout, idle timeout, and maximum connections per peer.
|
||||||
|
|
||||||
|
## Integration with iroh
|
||||||
|
|
||||||
|
### BlobsProtocol
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// src/net_protocol.rs
|
||||||
|
pub struct BlobsProtocol {
|
||||||
|
inner: Arc<BlobsInner>, // (Store, EventSender)
|
||||||
|
}
|
||||||
|
|
||||||
|
impl ProtocolHandler for BlobsProtocol {
|
||||||
|
async fn accept(&self, conn: Connection) -> Result<(), AcceptError> {
|
||||||
|
crate::provider::handle_connection(conn, store, events).await;
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
async fn shutdown(&self) { /* shutdown store */ }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Usage with iroh Router:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let store = MemStore::new(); // or FsStore::load(path).await?
|
||||||
|
let blobs = BlobsProtocol::new(&store, None);
|
||||||
|
let router = Router::builder(endpoint)
|
||||||
|
.accept(iroh_blobs::ALPN, blobs)
|
||||||
|
.spawn();
|
||||||
|
```
|
||||||
|
|
||||||
|
### Creating a BlobTicket
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
endpoint.online().await;
|
||||||
|
let addr = endpoint.addr();
|
||||||
|
|
||||||
|
let tag = store.add_slice(b"hello world").await?;
|
||||||
|
let ticket = BlobTicket::new(addr, tag.hash, tag.format);
|
||||||
|
println!("Share this: {ticket}");
|
||||||
|
```
|
||||||
|
|
||||||
|
### Fetching from a Ticket
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// On the requester side
|
||||||
|
let ticket: BlobTicket = ticket_str.parse()?;
|
||||||
|
let (addr, hash, format) = ticket.into_parts();
|
||||||
|
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;
|
||||||
|
|
||||||
|
let request = match format {
|
||||||
|
BlobFormat::Raw => GetRequest::blob(hash),
|
||||||
|
BlobFormat::HashSeq => GetRequest::all(hash),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Use the get FSM
|
||||||
|
let fsm = get::fsm::start(conn, request, RequestCounters::default());
|
||||||
|
let connected = fsm.next().await?;
|
||||||
|
// ... drive the FSM to completion
|
||||||
|
```
|
||||||
@@ -0,0 +1,312 @@
|
|||||||
|
# iroh-blobs: Data Flow and Complete Example
|
||||||
|
|
||||||
|
## Complete Data Flow: Provider Side
|
||||||
|
|
||||||
|
```
|
||||||
|
QUIC Connection Arrives
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
handle_connection(conn, store, events)
|
||||||
|
│
|
||||||
|
┌──────────┴──────────┐
|
||||||
|
│ Accept QUIC BIDI │
|
||||||
|
│ streams in loop │
|
||||||
|
└──────────┬──────────┘
|
||||||
|
│
|
||||||
|
handle_stream(pair, store)
|
||||||
|
│
|
||||||
|
┌──────────┴──────────┐
|
||||||
|
│ Read Request type │
|
||||||
|
│ byte + deserialize │
|
||||||
|
└──────────┬──────────┘
|
||||||
|
│
|
||||||
|
┌─────────────┬───────┼───────┬──────────────┐
|
||||||
|
│ │ │ │ │
|
||||||
|
handle_get handle_get handle handle (reserved)
|
||||||
|
_many _observe _push
|
||||||
|
│ │ │ │
|
||||||
|
▼ ▼ ▼ ▼
|
||||||
|
┌─────────────────────────────────────────────────┐
|
||||||
|
│ For each (offset, ranges) in request.ranges: │
|
||||||
|
│ │
|
||||||
|
│ if offset == 0: │
|
||||||
|
│ send_blob(store, 0, hash, ranges, writer) │
|
||||||
|
│ else: │
|
||||||
|
│ lookup hash in HashSeq[offset-1] │
|
||||||
|
│ send_blob(store, offset, child_hash, ranges, writer) │
|
||||||
|
│ │
|
||||||
|
│ send_blob: │
|
||||||
|
│ store.export_bao(hash, ranges) │
|
||||||
|
│ .write_with_progress(writer, ctx, &hash, idx) │
|
||||||
|
└─────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
## Complete Data Flow: Requester Side (Get FSM)
|
||||||
|
|
||||||
|
```
|
||||||
|
Create GetRequest
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
fsm::start(connection, request, counters)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
AtInitial.next()
|
||||||
|
│ (open_bi, send request)
|
||||||
|
▼
|
||||||
|
AtConnected.next()
|
||||||
|
│
|
||||||
|
┌───────────┼───────────┐
|
||||||
|
│ │ │
|
||||||
|
StartRoot StartChild Closing
|
||||||
|
(offset=0) (offset>0) (empty)
|
||||||
|
│ │ │
|
||||||
|
▼ ▼ ▼
|
||||||
|
AtBlobHeader AtBlobHeader AtClosing
|
||||||
|
.next() .next(hash) .next()
|
||||||
|
│ │ │
|
||||||
|
▼ ▼ ▼
|
||||||
|
(size, AtBlobContent) Stats
|
||||||
|
│
|
||||||
|
┌────────┴────────┐
|
||||||
|
│ │
|
||||||
|
More(item) Done
|
||||||
|
(loop back to (AtEndBlob)
|
||||||
|
AtBlobContent) │
|
||||||
|
┌─────┼─────┐
|
||||||
|
│ │
|
||||||
|
MoreChildren Closing
|
||||||
|
(AtStartChild) (AtClosing)
|
||||||
|
│ │
|
||||||
|
└───────────┘
|
||||||
|
```
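
The diagram describes a typestate machine: each state is its own type, and `next()` consumes the state and returns the only states that may follow. The sketch below shows that pattern in a synchronous, in-memory toy. It is not the crate's `get::fsm` API: the header states are folded away, the "response" is a queue of `(offset, payload)` pairs, and every type and method here is a hypothetical simplification.

```rust
use std::collections::VecDeque;

/// Toy response: one `(offset, payload)` entry per blob; offset 0 is the root.
pub type Response = VecDeque<(u64, Vec<u8>)>;

pub struct AtConnected { pending: Response }
pub struct AtBlobContent { offset: u64, data: Vec<u8>, pending: Response, bytes_read: u64 }
pub struct AtEndBlob { pending: Response, bytes_read: u64 }
pub struct AtClosing { bytes_read: u64 }

pub enum ConnectedNext { StartRoot(AtBlobContent), StartChild(AtBlobContent), Closing(AtClosing) }
pub enum EndBlobNext { MoreChildren(AtBlobContent), Closing(AtClosing) }

/// Shared transition: start the next blob, or close if none is left.
fn start_blob(mut pending: Response, bytes_read: u64) -> Result<AtBlobContent, AtClosing> {
    match pending.pop_front() {
        Some((offset, data)) => Ok(AtBlobContent { offset, data, pending, bytes_read }),
        None => Err(AtClosing { bytes_read }),
    }
}

impl AtConnected {
    pub fn new(pending: Response) -> Self { AtConnected { pending } }
    pub fn next(self) -> ConnectedNext {
        match start_blob(self.pending, 0) {
            Ok(c) if c.offset == 0 => ConnectedNext::StartRoot(c),
            Ok(c) => ConnectedNext::StartChild(c),
            Err(closing) => ConnectedNext::Closing(closing),
        }
    }
}

impl AtBlobContent {
    pub fn offset(&self) -> u64 { self.offset }
    /// Consume the content state, returning the follow-up state and the payload.
    pub fn read_all(self) -> (AtEndBlob, Vec<u8>) {
        let bytes_read = self.bytes_read + self.data.len() as u64;
        (AtEndBlob { pending: self.pending, bytes_read }, self.data)
    }
}

impl AtEndBlob {
    pub fn next(self) -> EndBlobNext {
        match start_blob(self.pending, self.bytes_read) {
            Ok(c) => EndBlobNext::MoreChildren(c),
            Err(closing) => EndBlobNext::Closing(closing),
        }
    }
}

impl AtClosing {
    /// Final state: report the total payload bytes read.
    pub fn finish(self) -> u64 { self.bytes_read }
}

/// Drive the machine to completion, collecting every blob.
pub fn drive(response: Response) -> (Vec<(u64, Vec<u8>)>, u64) {
    let mut blobs = Vec::new();
    let mut content = match AtConnected::new(response).next() {
        ConnectedNext::StartRoot(c) | ConnectedNext::StartChild(c) => c,
        ConnectedNext::Closing(closing) => return (blobs, closing.finish()),
    };
    loop {
        let offset = content.offset();
        let (end, data) = content.read_all();
        blobs.push((offset, data));
        content = match end.next() {
            EndBlobNext::MoreChildren(c) => c,
            EndBlobNext::Closing(closing) => return (blobs, closing.finish()),
        };
    }
}
```

Because each transition takes `self` by value, the compiler rejects code that tries to read content twice or skip the end-of-blob step.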
|
||||||
|
|
||||||
|
### Blob Content Items
|
||||||
|
|
||||||
|
During `AtBlobContent`, items arrive as `BaoContentItem`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum BaoContentItem {
|
||||||
|
Parent(ParentNode), // (node, (left_hash, right_hash)) — 64 bytes
|
||||||
|
Leaf(Leaf), // { offset: u64, data: Bytes } — actual data
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- **Parent nodes** contain BLAKE3 hash pairs for tree verification. They carry no payload: each one is 64 bytes (two 32-byte hashes) of verification overhead.
|
||||||
|
- **Leaf nodes** contain actual data chunks. Each leaf's data is at most `IROH_BLOCK_SIZE` bytes (16 KiB).
|
||||||
|
|
||||||
|
Verification is automatic: the `ResponseDecoder` from `bao-tree` validates each chunk against the expected hash tree rooted at the request hash.
|
||||||
|
|
||||||
|
## Blob Verification and BaoTree Encoding
|
||||||
|
|
||||||
|
### How BLAKE3 Verified Streaming Works
|
||||||
|
|
||||||
|
1. **The hash is the root** of a binary Merkle tree
|
||||||
|
2. **Internal nodes** store `(left_child_hash, right_child_hash)` — 64 bytes each
|
||||||
|
3. **Leaf nodes** store the actual data chunks (up to 1024 bytes each in standard BLAKE3, or 16 KiB in iroh's block size)
|
||||||
|
4. **Chunk groups** (16 chunks = 16 KiB) are the minimum verification unit in iroh-blobs
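
The size relationships in the list above reduce to a little integer arithmetic. The helper names below are hypothetical; the constants come from the text (1024-byte chunks, 16 chunks per group).

```rust
const CHUNK_SIZE: u64 = 1024; // BLAKE3 chunk
const CHUNK_LOG: u32 = 4; // 2^4 = 16 chunks per group
const GROUP_SIZE: u64 = CHUNK_SIZE << CHUNK_LOG; // 16 KiB

/// Number of BLAKE3 chunks needed to cover `len` bytes.
pub fn chunks_in(len: u64) -> u64 {
    (len + CHUNK_SIZE - 1) / CHUNK_SIZE
}

/// Number of chunk groups needed to cover `len` bytes.
pub fn groups_in(len: u64) -> u64 {
    (len + GROUP_SIZE - 1) / GROUP_SIZE
}

/// Half-open range of chunk-group indices touched by the byte range `start..end`.
pub fn groups_for_bytes(start: u64, end: u64) -> (u64, u64) {
    (start / GROUP_SIZE, (end + GROUP_SIZE - 1) / GROUP_SIZE)
}
```

For example, asking for bytes 20,000 to 40,000 touches groups 1 and 2, so 32 KiB must be transferred to verify a 20 KB slice.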
|
||||||
|
|
||||||
|
For a request with specific ranges:
|
||||||
|
- The provider traverses the tree, yielding only nodes needed to verify the requested ranges
|
||||||
|
- The requester can verify each chunk group independently after receiving its parent hash pair
|
||||||
|
- Corruption is detected after at most one chunk group (16 KiB) of unverified data has been received
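
The in-stream check can be sketched with a stack of expected hashes: a parent is accepted only if hashing its pair yields the hash currently expected, and it then queues its two children as the next expectations. Important caveat: to stay dependency-free this sketch uses the standard library's 64-bit `DefaultHasher` as a stand-in, not BLAKE3, and it omits BLAKE3 details such as chunk counters and root flags. `Item`, `toy_hash`, `parent_hash`, and `verify_stream` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in hash (NOT BLAKE3), used only to keep the sketch self-contained.
pub fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

/// Hash of a parent node: the hash of its two child hashes concatenated.
pub fn parent_hash(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_le_bytes());
    buf[8..].copy_from_slice(&right.to_le_bytes());
    toy_hash(&buf)
}

pub enum Item {
    Parent(u64, u64),
    Leaf(Vec<u8>),
}

/// Verify a pre-order stream against a trusted `root`.
/// Returns the verified byte count, or the index of the first bad item.
pub fn verify_stream(root: u64, items: &[Item]) -> Result<usize, usize> {
    let mut expected = vec![root]; // hashes still to be confirmed
    let mut verified = 0;
    for (i, item) in items.iter().enumerate() {
        let want = expected.pop().ok_or(i)?;
        match item {
            Item::Parent(l, r) => {
                if parent_hash(*l, *r) != want {
                    return Err(i);
                }
                expected.push(*r); // right subtree is checked after the left
                expected.push(*l);
            }
            Item::Leaf(data) => {
                if toy_hash(data) != want {
                    return Err(i);
                }
                verified += data.len();
            }
        }
    }
    Ok(verified)
}
```

The function stops at the first mismatching item, so a tampered leaf is rejected before any later data is accepted.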
|
||||||
|
|
||||||
|
### Outboard Storage
|
||||||
|
|
||||||
|
The **outboard** is the BLAKE3 hash tree stored separately from the data. For the provider:
|
||||||
|
- Small blobs (≤16 KiB): outboard is empty (not needed, single chunk group)
|
||||||
|
- Large blobs: outboard stored as `PreOrderMemOutboard` (in-memory) or as a file (filesystem store)
|
||||||
|
|
||||||
|
For the requester, the outboard is built incrementally as data arrives.
|
||||||
|
|
||||||
|
## Import and Export Flows
|
||||||
|
|
||||||
|
### Import Bytes (Local Data)
|
||||||
|
|
||||||
|
```
|
||||||
|
add_bytes(data) / add_slice(data)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
ImportBytesRequest { data, format, scope }
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Actor::import_bytes()
|
||||||
|
│ 1. Send AddProgressItem::Size(len)
|
||||||
|
│ 2. Send AddProgressItem::CopyDone
|
||||||
|
│ 3. Compute outboard: PreOrderMemOutboard::create(&data, IROH_BLOCK_SIZE)
|
||||||
|
│ 4. Return ImportEntry { data, outboard, scope, format, tx }
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Actor::finish_import()
|
||||||
|
│ 1. Get hash from outboard.root()
|
||||||
|
│ 2. Get or create BaoFileHandle for hash
|
||||||
|
│ 3. Transition BaoFileStorage::Partial → Complete
|
||||||
|
│ 4. Create TempTag for the hash_and_format
|
||||||
|
│ 5. Send AddProgressItem::Done(temp_tag)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Import BAO Stream (Remote Data)
|
||||||
|
|
||||||
|
```
|
||||||
|
import_bao_bytes(hash, ranges, data) / import_bao_reader(hash, ranges, reader)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
ImportBaoRequest { hash, size }
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Actor::import_bao()
|
||||||
|
│ 1. Set size on partial entry
|
||||||
|
│ 2. Create BaoTree for the size
|
||||||
|
│ 3. For each BaoContentItem from stream:
|
||||||
|
│ - Parent: write hash pair to outboard
|
||||||
|
│ - Leaf: write data to storage, update bitfield
|
||||||
|
│ - If bitfield becomes complete: transition Partial → Complete
|
||||||
|
│ 4. Send result
|
||||||
|
```
|
||||||
|
|
||||||
|
### Export BAO
|
||||||
|
|
||||||
|
```
|
||||||
|
export_bao(hash, ranges) → ExportBao
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Actor::export_bao()
|
||||||
|
│ 1. Look up BaoFileHandle for hash
|
||||||
|
│ 2. If not found: send EncodeError::NotFound and return
|
||||||
|
│ 3. Create BaoTreeSender from data + outboard readers
|
||||||
|
│ 4. Call traverse_ranges_validated(data, outboard, &ranges, tx)
|
||||||
|
│ → streams validated BAO items to the sender
|
||||||
|
```
|
||||||
|
|
||||||
|
### Export Path (To Filesystem)
|
||||||
|
|
||||||
|
```
|
||||||
|
export(hash, target_path) → ExportPath
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Actor::export_path()
|
||||||
|
│ 1. Look up BaoFileHandle for hash
|
||||||
|
│ 2. Create parent directories if needed
|
||||||
|
│ 3. Create file at target_path
|
||||||
|
│ 4. Send ExportProgressItem::Size(total_size)
|
||||||
|
│ 5. Read data from store in 64 KiB chunks
|
||||||
|
│ 6. Write to file, yielding ExportProgressItem::CopyProgress(offset)
|
||||||
|
│ 7. Send ExportProgressItem::Done
|
||||||
|
```
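
Steps 4 to 7 of the export flow can be sketched with the standard `Read` and `Write` traits. This is an illustration rather than the store's actor code: `export` and the `Progress` enum are hypothetical, with variant names chosen to mirror the `ExportProgressItem` variants mentioned above, and progress is delivered through a callback instead of a channel.

```rust
use std::io::{self, Read, Write};

const COPY_CHUNK: usize = 64 * 1024; // copy granularity from the flow above

#[derive(Debug, PartialEq)]
pub enum Progress {
    Size(u64),
    CopyProgress(u64),
    Done,
}

/// Copy `src` to `dst` in 64 KiB steps, reporting the running offset after each write.
pub fn export<R: Read, W: Write>(
    mut src: R,
    mut dst: W,
    size: u64,
    mut report: impl FnMut(Progress),
) -> io::Result<()> {
    report(Progress::Size(size));
    let mut buf = vec![0u8; COPY_CHUNK];
    let mut offset = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        dst.write_all(&buf[..n])?;
        offset += n as u64;
        report(Progress::CopyProgress(offset));
    }
    report(Progress::Done);
    Ok(())
}
```

Reporting the absolute offset rather than a per-step byte count lets a consumer drop intermediate events without losing track of progress.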
|
||||||
|
|
||||||
|
## Observe Protocol Detail
|
||||||
|
|
||||||
|
```
|
||||||
|
Requester Provider
|
||||||
|
│ │
|
||||||
|
│ ObserveRequest {hash, ranges} │
|
||||||
|
│─────────────────────────────────►│
|
||||||
|
│ │
|
||||||
|
│ ObserveItem {size, ranges} │ (initial state)
|
||||||
|
│◄─────────────────────────────────│
|
||||||
|
│ │
|
||||||
|
│ ... (time passes, more data │
|
||||||
|
│ becomes available) │
|
||||||
|
│ │
|
||||||
|
│ ObserveItem {size, ranges} │ (delta update)
|
||||||
|
│◄─────────────────────────────────│
|
||||||
|
│ │
|
||||||
|
│ ... (continue until │
|
||||||
|
│ requester stops │
|
||||||
|
│ or connection closes) │
|
||||||
|
│ │
|
||||||
|
│ STOP_STREAM │
|
||||||
|
│─────────────────────────────────►│
|
||||||
|
```
|
||||||
|
|
||||||
|
The observe protocol uses `Bitfield::diff()` to send only the new chunks since the last update, minimizing bandwidth.
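
The delta idea can be sketched with sets of chunk-group indices. `ToyBitfield` below is a hypothetical model and not the crate's `Bitfield` type; its `diff` returns what is newly available compared with the previously sent state, and `apply` is what a requester would do with each delta.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of an availability bitfield: blob size plus available chunk groups.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ToyBitfield {
    pub size: u64,
    pub groups: BTreeSet<u64>,
}

impl ToyBitfield {
    /// Chunk groups present in `self` but not in `previous`.
    pub fn diff(&self, previous: &ToyBitfield) -> ToyBitfield {
        ToyBitfield {
            size: self.size,
            groups: self.groups.difference(&previous.groups).copied().collect(),
        }
    }

    /// Merge a delta received from the provider into the local view.
    pub fn apply(&mut self, delta: &ToyBitfield) {
        self.size = delta.size;
        self.groups.extend(delta.groups.iter().copied());
    }
}
```

Applying the initial full state followed by every delta reproduces the provider's current bitfield on the requester side.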
|
||||||
|
|
||||||
|
## Full Working Example
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use iroh::{protocol::Router, Endpoint, endpoint::presets};
|
||||||
|
use iroh_blobs::{store::mem::MemStore, BlobsProtocol, ticket::BlobTicket, BlobFormat};
|
||||||
|
|
||||||
|
// === Provider Side ===
|
||||||
|
async fn provider() -> anyhow::Result<()> {
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let store = MemStore::new();
|
||||||
|
|
||||||
|
// Add some data
|
||||||
|
let tag = store.add_slice(b"Hello, iroh-blobs!").await?;
|
||||||
|
|
||||||
|
let _ = endpoint.online().await;
|
||||||
|
let addr = endpoint.addr();
|
||||||
|
|
||||||
|
// Create ticket for sharing
|
||||||
|
let ticket = BlobTicket::new(addr, tag.hash, BlobFormat::Raw);
|
||||||
|
println!("Ticket: {ticket}");
|
||||||
|
|
||||||
|
// Start serving
|
||||||
|
let blobs = BlobsProtocol::new(&store, None);
|
||||||
|
let router = Router::builder(endpoint)
|
||||||
|
.accept(iroh_blobs::ALPN, blobs)
|
||||||
|
.spawn();
|
||||||
|
|
||||||
|
tokio::signal::ctrl_c().await?;
|
||||||
|
router.shutdown().await?;
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
// === Requester Side ===
|
||||||
|
async fn requester(ticket: BlobTicket) -> anyhow::Result<()> {
|
||||||
|
let (addr, hash, format) = ticket.into_parts();
|
||||||
|
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;
|
||||||
|
|
||||||
|
// Build request based on format
|
||||||
|
let request = match format {
|
||||||
|
BlobFormat::Raw => iroh_blobs::protocol::GetRequest::blob(hash),
|
||||||
|
BlobFormat::HashSeq => iroh_blobs::protocol::GetRequest::all(hash),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Use the get FSM
|
||||||
|
let start = iroh_blobs::get::fsm::start(conn, request, Default::default());
|
||||||
|
let connected = start.next().await?;
|
||||||
|
let connected = connected.next().await?;
|
||||||
|
|
||||||
|
match connected {
|
||||||
|
iroh_blobs::get::fsm::ConnectedNext::StartRoot(at_root) => {
|
||||||
|
let (at_content, size) = at_root.next().next().await?;
|
||||||
|
let (at_end, data) = at_content.concatenate_into_vec().await?;
|
||||||
|
println!("Got {} bytes: {:?}", size, data);
|
||||||
|
// ...
|
||||||
|
}
|
||||||
|
iroh_blobs::get::fsm::ConnectedNext::StartChild(at_child) => {
|
||||||
|
// Need to know the child hash
|
||||||
|
}
|
||||||
|
iroh_blobs::get::fsm::ConnectedNext::Closing(at_closing) => {
|
||||||
|
println!("Empty response");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Simplified Fetch (Using Store + Remote)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// The simplest way to download data
|
||||||
|
let store = MemStore::new();
|
||||||
|
let remote = store.remote();
|
||||||
|
|
||||||
|
// Fetch with automatic local availability checking
|
||||||
|
let result = remote.fetch(connection, hash, format, &store).await?;
|
||||||
|
// Result includes Stats with transfer metrics
|
||||||
|
```
|
||||||
|
|
||||||
|
## Key Error Types
|
||||||
|
|
||||||
|
| Error Type | Location | Purpose |
|
||||||
|
|------------|----------|---------|
|
||||||
|
| `GetError` | `get::error` | Errors during get FSM |
|
||||||
|
| `ExportBaoError` | `api` | Errors during BAO export |
|
||||||
|
| `RequestError` | `api` | Store command errors |
|
||||||
|
| `DecodeError` | `get::fsm` | BAO stream decode errors |
|
||||||
|
| `ProgressError` | `provider::events` | Provider event errors |
|
||||||
60
docs/research/references/iroh/iroh-blobs/README.md
Normal file
@@ -0,0 +1,60 @@
|
|||||||
|
# iroh-blobs Reference Documentation
|
||||||
|
|
||||||
|
This directory contains a comprehensive reference for the `iroh-blobs` crate (v0.100.0), a Rust library for content-addressed blob transfer over QUIC connections using BLAKE3 verified streaming.
|
||||||
|
|
||||||
|
## Documents
|
||||||
|
|
||||||
|
1. **[Overview and Architecture](01-overview-and-architecture.md)** — Core concepts, module structure, feature flags, and architecture diagram. Start here.
|
||||||
|
|
||||||
|
2. **[Key Types and Data Structures](02-key-types.md)** — Detailed reference for `Hash`, `BlobFormat`, `HashAndFormat`, `HashSeq`, `Bitfield`, `Tag`, `TempTag`, `BlobTicket`, `ChunkRanges`/`ChunkRangesSeq`/`RangeSpec`, and the store command protocol.
|
||||||
|
|
||||||
|
3. **[Transfer Protocol](03-transfer-protocol.md)** — Wire protocol specification: request types (`GetRequest`, `GetManyRequest`, `PushRequest`, `ObserveRequest`), response format (BLAKE3 verified streaming), the client-side FSM, provider handling, event system, and the Collection format.
|
||||||
|
|
||||||
|
4. **[Storage Architecture](04-storage.md)** — Store implementations: `MemStore` (in-memory), `FsStore` (hybrid redb + filesystem), `ReadonlyMemStore`. Covers the actor pattern, `BaoFileHandle`/`BaoFileStorage`, partial/complete states, the hybrid inline/file approach, entry states, blob lifecycle, and garbage collection.
|
||||||
|
|
||||||
|
5. **[Remote API and Downloader](05-remote-and-downloader.md)** — `Remote` API for fetching from/observing/pushing to peers, `Downloader` for multi-source downloads, connection pooling, and iroh integration via `BlobsProtocol`.
|
||||||
|
|
||||||
|
6. **[Data Flow and Examples](06-data-flow-and-examples.md)** — End-to-end data flow diagrams for provider and requester sides, BLAKE3 verification mechanics, import/export flows, observe protocol detail, and complete working examples.
|
||||||
|
|
||||||
|
## Quick Reference
|
||||||
|
|
||||||
|
### Creating a Provider
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use iroh::{protocol::Router, Endpoint, endpoint::presets};
|
||||||
|
use iroh_blobs::{store::mem::MemStore, BlobsProtocol};
|
||||||
|
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let store = MemStore::new();
|
||||||
|
let tag = store.add_slice(b"data").await?;
|
||||||
|
let blobs = BlobsProtocol::new(&store, None);
|
||||||
|
let router = Router::builder(endpoint)
|
||||||
|
.accept(iroh_blobs::ALPN, blobs)
|
||||||
|
.spawn();
|
||||||
|
```
|
||||||
|
|
||||||
|
### Key Constants
|
||||||
|
|
||||||
|
| Constant | Value | Meaning |
|
||||||
|
|----------|-------|---------|
|
||||||
|
| `ALPN` | `b"/iroh-bytes/4"` | QUIC ALPN protocol identifier |
|
||||||
|
| `IROH_BLOCK_SIZE` | `BlockSize::from_chunk_log(4)` | 16 KiB chunk groups |
|
||||||
|
| `MAX_MESSAGE_SIZE` | `1 MiB` | Maximum request message size |
|
||||||
|
| `Hash::EMPTY` | BLAKE3 of `b""` | Hash of the empty blob |
|
||||||
|
|
||||||
|
### Core Crate Exports
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub use hash::{BlobFormat, Hash, HashAndFormat};
|
||||||
|
pub use hashseq::HashSeq;
|
||||||
|
pub use net_protocol::BlobsProtocol;
|
||||||
|
pub use protocol::ALPN;
|
||||||
|
pub mod api; // Store API, Blobs, Tags, Downloader, Remote
|
||||||
|
pub mod format; // Collection type
|
||||||
|
pub mod get; // Client-side FSM
|
||||||
|
pub mod protocol; // Wire protocol types (GetRequest, etc.)
|
||||||
|
pub mod provider; // Server-side handling
|
||||||
|
pub mod store; // Storage implementations
|
||||||
|
pub mod ticket; // BlobTicket
|
||||||
|
pub mod util; // Connection pool, temp tags, stream helpers
|
||||||
|
```
|
||||||
@@ -0,0 +1,98 @@
|
|||||||
|
# iroh-docs: Overview and Architecture
|
||||||
|
|
||||||
|
> Reference document for the `iroh-docs` crate (v0.98.0).
|
||||||
|
> Source: `/workspace/iroh-docs`
|
||||||
|
|
||||||
|
## What Is iroh-docs?
|
||||||
|
|
||||||
|
`iroh-docs` is a Rust crate implementing **multi-dimensional key-value documents with an efficient synchronization protocol**. It provides:
|
||||||
|
|
||||||
|
1. **A CRDT-based document model** — Replicas (documents) hold entries identified by namespace + author + key, with content-addressed values (BLAKE3 hashes).
|
||||||
|
2. **Range-based set reconciliation** — An efficient sync protocol based on [Aljoscha Meyer's paper](https://arxiv.org/abs/2212.13567) for reconciling sets between peers.
|
||||||
|
3. **Live sync via gossip** — Real-time document updates propagated through an iroh-gossip swarm.
|
||||||
|
4. **Persistent storage** — A `redb`-backed store supporting both in-memory and file-based modes.
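
To give a feel for item 2, here is a toy version of range-based set reconciliation: compare a fingerprint of a range on both sides, stop if it matches, exchange items directly if either side holds at most one item in the range, and otherwise split the range and recurse. This is a single-process sketch over `u64` items, not the code in `ranger.rs`: the XOR-of-hashes fingerprint, the `DefaultHasher`, the split-in-half strategy, and the function names are all assumptions made for illustration, and no messages are actually exchanged.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

fn item_hash(x: u64) -> u64 {
    let mut h = DefaultHasher::new();
    x.hash(&mut h);
    h.finish()
}

/// XOR of the hashes of all items in `lo..hi` (0 for an empty range).
fn fingerprint(set: &BTreeSet<u64>, lo: u64, hi: u64) -> u64 {
    set.range(lo..hi).fold(0, |acc, x| acc ^ item_hash(*x))
}

/// Make `a` and `b` agree on `lo..hi`. Returns the number of range comparisons,
/// a rough stand-in for protocol messages.
pub fn reconcile(a: &mut BTreeSet<u64>, b: &mut BTreeSet<u64>, lo: u64, hi: u64) -> usize {
    if lo >= hi {
        return 0;
    }
    if fingerprint(a, lo, hi) == fingerprint(b, lo, hi) {
        return 1; // ranges already agree, nothing to send
    }
    let small = a.range(lo..hi).count() <= 1 || b.range(lo..hi).count() <= 1;
    if small {
        // Base case: one side has at most one item here, so exchange items directly.
        let from_a: Vec<u64> = a.range(lo..hi).copied().collect();
        let from_b: Vec<u64> = b.range(lo..hi).copied().collect();
        a.extend(from_b);
        b.extend(from_a);
        return 1;
    }
    let mid = lo + (hi - lo) / 2;
    1 + reconcile(a, b, lo, mid) + reconcile(a, b, mid, hi)
}
```

The payoff is that ranges which already match cost a single fingerprint comparison regardless of how many items they contain.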
|
||||||
|
|
||||||
|
## High-Level Architecture
|
||||||
|
```
┌──────────────────────────────────────────────────────────────┐
│                       Docs (Protocol)                        │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                         Engine                         │  │
│  │ ┌──────────┐  ┌──────────────┐  ┌───────────────────┐  │  │
│  │ │ LiveActor│  │ GossipState  │  │ SyncHandle/Actor  │  │  │
│  │ │ (events) │  │ (iroh-gossip)│  │ (store + sync)    │  │  │
│  │ └──────────┘  └──────────────┘  └───────────────────┘  │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │
│  │ Replica        │  │ SignedEntry    │  │ Author/        │  │
│  │ (sync.rs)      │  │ Entry/Record   │  │ Namespace keys │  │
│  └────────────────┘  └────────────────┘  └────────────────┘  │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                      Store (redb)                      │  │
│  │ Authors │ Namespaces │ Records │ RecordsByKey │ ...    │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
### Module Layout
|
||||||
|
| Module | Purpose |
|--------|---------|
| `sync.rs` | Core types: `Replica`, `Entry`, `SignedEntry`, `Record`, `RecordIdentifier`, `Capability`, events |
| `keys.rs` | Cryptographic key types: `Author`, `NamespaceSecret`, `AuthorId`, `NamespaceId` |
| `ranger.rs` | Range-based set reconciliation algorithm implementation |
| `heads.rs` | `AuthorHeads`: latest timestamps per author for efficient sync decisions |
| `store/` | Storage abstraction and `redb`-backed persistent store |
| `store/fs.rs` | File-based `Store` implementation with redb tables |
| `store/pubkeys.rs` | `PublicKeyStore` trait for caching expanded ed25519 public keys |
| `actor.rs` | `SyncHandle` / Actor: single-threaded executor for store and replica operations |
| `engine/` | Live sync coordination: `Engine`, `LiveActor`, `GossipState`, `NamespaceStates` |
| `engine/live.rs` | The `LiveActor` event loop: handles sync, gossip, content download |
| `engine/gossip.rs` | Integration with `iroh-gossip` for broadcasting document operations |
| `engine/state.rs` | `NamespaceStates`: tracks per-namespace, per-peer sync state |
| `net/` | Network protocol: ALPN `/iroh-sync/1`, connection handling |
| `net/codec.rs` | Wire codec: length-prefixed postcard-serialized `Message` frames |
| `protocol.rs` | `Docs` struct (the `ProtocolHandler`) and `Builder` |
| `api/` | irpc-based RPC API for external access |
| `ticket.rs` | `DocTicket`: shareable document capability + peer addresses |
|
||||||
|
|
||||||
|
## Key Design Principles
|
||||||
|
|
||||||
|
1. **Two-key identity model**: Every entry is uniquely identified by the triple (namespace, author, key) and is signed with two keypairs. The namespace key provides write authorization; the author key provides attribution.
|
||||||
|
|
||||||
|
2. **Content-addressed values**: Entries store a BLAKE3 hash + length, not the actual content. Content blobs are handled separately by `iroh-blobs`.
|
||||||
|
|
||||||
|
3. **Prefix deletion**: An entry with key `foo` supersedes older entries by the same author whose keys start with the bytes `foo` (for example `foo/bar`). An empty entry at `foo` therefore acts as a tombstone for the whole prefix, which enables hierarchical key structures.
|
||||||
|
|
||||||
|
4. **Last-writer-wins with per-author timestamps**: Entries are ordered by (timestamp, hash). Newer entries dominate older ones. Different authors can have entries for the same key simultaneously (multi-dimensional).
|
||||||
|
|
||||||
|
5. **Actor-based concurrency**: All store and replica mutations go through a single `SyncHandle` actor thread, eliminating the need for locks on the store.
|
||||||
|
|
||||||
|
6. **Event-driven live sync**: The `LiveActor` coordinates gossip, direct sync, and content downloads through a `tokio::select!` event loop.
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
Key dependencies from `Cargo.toml`:
|
||||||
|
| Crate | Purpose |
|-------|---------|
| `iroh` | Networking: endpoints, connections, protocol routing |
| `iroh-blobs` | Content-addressed blob storage and transfer |
| `iroh-gossip` | Gossip protocol for broadcasting updates |
| `iroh-tickets` | Ticket-based sharing mechanism |
| `redb` | Embedded key-value store for persistence |
| `ed25519-dalek` | Ed25519 signatures for entries |
| `blake3` | Hashing (fingerprints + content hashes) |
| `postcard` | Serialization (wire format for sync protocol) |
| `irpc` / `noq` | RPC framework for API |
|
||||||
|
|
||||||
|
## Feature Flags
|
||||||
|
| Feature | Default | Description |
|---------|---------|-------------|
| `metrics` | Yes | Enables iroh-metrics instrumentation |
| `rpc` | Yes | Enables irpc-based RPC API (depends on `noq`) |
| `fs-store` | Yes | Enables persistent file-based store |
|
||||||
201
docs/research/references/iroh/iroh-docs/02-document-model.md
Normal file
@@ -0,0 +1,201 @@
|
|||||||
|
# iroh-docs: Document Model and CRDT Details
|
||||||
|
|
||||||
|
## Core Data Model
|
||||||
|
|
||||||
|
### Namespace (Document Identity)
|
||||||
|
|
||||||
|
A **Namespace** is the identity of a document. It consists of:
|
||||||
|
|
||||||
|
- **`NamespaceSecret`** — An Ed25519 signing key (32 bytes) that grants write capability
|
||||||
|
- **`NamespacePublicKey`** — The corresponding verifying key (32 bytes)
|
||||||
|
- **`NamespaceId`** — A `[u8; 32]` that is the byte representation of the public key; this serves as the unique identifier for a document/replica
|
||||||
|
```
NamespaceSecret (signing key) ──derives──▶ NamespacePublicKey (verifying key)
NamespacePublicKey            ──into─────▶ NamespaceId ([u8; 32])
```
|
||||||
|
|
||||||
|
### Author (Writer Identity)
|
||||||
|
|
||||||
|
An **Author** represents a writer identity within a document. Multiple authors can write to the same namespace.
|
||||||
|
|
||||||
|
- **`Author`** — An Ed25519 signing key (32 bytes)
|
||||||
|
- **`AuthorPublicKey`** — The corresponding verifying key (32 bytes)
|
||||||
|
- **`AuthorId`** — A `[u8; 32]` byte representation of the public key
|
||||||
|
|
||||||
|
Authors are application-defined: an application might create one author per device, per user, or per session.
|
||||||
|
|
||||||
|
### Capability
|
||||||
|
|
||||||
|
Access to a document is controlled through a `Capability`:
|
||||||
|
```rust
pub enum Capability {
    Write(NamespaceSecret), // Full read-write access
    Read(NamespaceId),      // Read-only access (can sync but not insert)
}
```
|
||||||
|
|
||||||
|
Capabilities can be **merged** — a `Read` capability can be upgraded to `Write` if a matching `Write` is presented:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
capability.merge(other_capability) // Read + Write → Write
|
||||||
|
```
|
||||||
|
|
||||||
|
The raw representation is `(u8, [u8; 32])` — a kind byte followed by 32 bytes of key material.
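
The merge rule and the raw `(u8, [u8; 32])` form can be sketched as follows. This is a simplified illustration, not the crate's implementation: `NamespaceSecret` is a stand-in that carries its own id instead of deriving it with Ed25519, the error type is a plain string, and the kind bytes `1` (write) and `2` (read) are taken from the `CapabilityKind` values described in the store schema.

```rust
/// Stand-in for the real Ed25519-backed secret (assumption: the id is stored
/// alongside the secret here instead of being derived from it).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NamespaceSecret {
    pub secret: [u8; 32],
    pub id: [u8; 32],
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Capability {
    Write(NamespaceSecret),
    Read([u8; 32]),
}

impl Capability {
    /// The namespace id this capability refers to.
    pub fn id(&self) -> [u8; 32] {
        match self {
            Capability::Write(secret) => secret.id,
            Capability::Read(id) => *id,
        }
    }

    /// Merge `other` into `self`. Returns `Ok(true)` if `self` was upgraded
    /// from Read to Write, `Ok(false)` if nothing changed, and an error if
    /// the two capabilities refer to different namespaces.
    pub fn merge(&mut self, other: Capability) -> Result<bool, &'static str> {
        if self.id() != other.id() {
            return Err("namespace mismatch");
        }
        if matches!(*self, Capability::Read(_)) && matches!(other, Capability::Write(_)) {
            *self = other;
            return Ok(true);
        }
        Ok(false)
    }

    /// Raw form: a kind byte followed by 32 bytes of key material.
    pub fn raw(&self) -> (u8, [u8; 32]) {
        match self {
            Capability::Write(secret) => (1, secret.secret),
            Capability::Read(id) => (2, *id),
        }
    }
}
```

Merging never downgrades: presenting a `Read` capability to a replica that already holds `Write` leaves it unchanged.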
|
||||||
|
|
||||||
|
### Entry (The Fundamental Record)
|
||||||
|
|
||||||
|
An **`Entry`** is the core data unit, consisting of:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Entry {
|
||||||
|
id: RecordIdentifier, // (namespace, author, key)
|
||||||
|
record: Record, // (hash, len, timestamp)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### RecordIdentifier
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct RecordIdentifier(Bytes); // namespace[0..32] || author[32..64] || key[64..]
|
||||||
|
```
|
||||||
|
|
||||||
|
The key is a variable-length byte sequence. `RecordIdentifier` implements `Ord` by comparing namespace first, then author, then key — this ordering is critical for the range-based sync algorithm.
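
A minimal sketch of that layout, using `Vec<u8>` in place of `Bytes` (an assumption made to keep the example dependency-free). Because the namespace and author segments have a fixed width of 32 bytes each, plain lexicographic comparison of the concatenated bytes already yields the namespace, author, key ordering.

```rust
/// Concatenated identifier: namespace[0..32] || author[32..64] || key[64..].
/// Deriving `Ord` on the byte vector compares lexicographically, which orders
/// by namespace first, then author, then key.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct RecordIdentifier(Vec<u8>);

impl RecordIdentifier {
    pub fn new(namespace: [u8; 32], author: [u8; 32], key: &[u8]) -> Self {
        let mut bytes = Vec::with_capacity(64 + key.len());
        bytes.extend_from_slice(&namespace);
        bytes.extend_from_slice(&author);
        bytes.extend_from_slice(key);
        Self(bytes)
    }

    pub fn namespace(&self) -> &[u8] {
        &self.0[..32]
    }

    pub fn author(&self) -> &[u8] {
        &self.0[32..64]
    }

    pub fn key(&self) -> &[u8] {
        &self.0[64..]
    }
}
```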
|
||||||
|
|
||||||
|
#### Record
|
||||||
|
```rust
pub struct Record {
    len: u64,       // byte length of the content
    hash: Hash,     // BLAKE3 hash of the content (32 bytes)
    timestamp: u64, // microseconds since Unix epoch
}
```
|
||||||
|
|
||||||
|
The `Record` comparison uses `(timestamp, hash)` ordering — this is the **Last-Writer-Wins** rule for same-key entries. When two records for the same key exist, the one with the higher timestamp wins; if timestamps are equal, the higher hash wins as a tiebreaker.
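
The ordering can be sketched with a manual `Ord` implementation. The field types are simplified (a raw `[u8; 32]` stands in for `Hash`), and note that a derived `Ord` would not work here, because the struct's field order (`len`, `hash`, `timestamp`) differs from the comparison order.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Record {
    pub len: u64,
    pub hash: [u8; 32], // stand-in for the crate's `Hash` type
    pub timestamp: u64,
}

impl Ord for Record {
    fn cmp(&self, other: &Self) -> Ordering {
        // Newer timestamp wins; equal timestamps fall back to the hash.
        // `len` is not compared: equal hashes imply equal content.
        self.timestamp
            .cmp(&other.timestamp)
            .then_with(|| self.hash.cmp(&other.hash))
    }
}

impl PartialOrd for Record {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

With this ordering, picking the winner among competing records for one key is simply `records.iter().max()`.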
|
||||||
|
|
||||||
|
### SignedEntry (Entry with Proofs)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct SignedEntry {
|
||||||
|
signature: EntrySignature, // dual Ed25519 signatures
|
||||||
|
entry: Entry,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### EntrySignature
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct EntrySignature {
|
||||||
|
author_signature: Signature, // 64-byte Ed25519 signature
|
||||||
|
namespace_signature: Signature, // 64-byte Ed25519 signature
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Both signatures cover the canonical byte encoding of the `Entry` (id + record). This means:
|
||||||
|
- The **namespace signature** proves write authorization (only holders of `NamespaceSecret` can produce valid entries)
|
||||||
|
- The **author signature** proves authorship (provides attribution and non-repudiation)
|
||||||
|
|
||||||
|
#### Verification
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn verify<S: PublicKeyStore>(&self, store: &S) -> Result<(), SignatureError>
|
||||||
|
```
|
||||||
|
|
||||||
|
Verification requires both the `NamespacePublicKey` and `AuthorPublicKey`, which are derived from the entry's namespace and author IDs. The `PublicKeyStore` trait provides caching for these expanded keys.
|
||||||
|
|
||||||
|
### Empty Entries (Tombstones / Prefix Deletion)
|
||||||
|
|
||||||
|
An entry is **empty** when `hash == Hash::EMPTY && len == 0`. Empty entries serve as **deletion markers**:
|
||||||
|
|
||||||
|
- **Key deletion**: Inserting an empty entry with the exact key removes the previous entry for that key
|
||||||
|
- **Prefix deletion**: Inserting an empty entry with key "foo" removes all older entries by the same author whose keys start with "foo"
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn delete_prefix(&mut self, prefix: impl AsRef<[u8]>, author: &Author) -> Result<usize, InsertError>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Insert Semantics (CRDT Rules)
|
||||||
|
|
||||||
|
When a `SignedEntry` is inserted into a replica via `Store::put()` (the ranger store trait):
|
||||||
|
|
||||||
|
1. **Check prefixes**: Look up all existing entries whose key is a **prefix** of the new entry's key. If any prefix entry has a value `>=` the new entry's value, the new entry is **rejected** (`InsertOutcome::NotInserted`).
|
||||||
|
|
||||||
|
2. **Remove dominated entries**: Remove all existing entries whose key **starts with** the new entry's key (i.e., the new key is a prefix of theirs) AND whose value is `<=` the new entry's value.
|
||||||
|
|
||||||
|
3. **Insert**: If not rejected, the new entry is stored.
|
||||||
|
|
||||||
|
This implements a **prefix-aware last-writer-wins** CRDT:
|
||||||
|
- Newer entries for the same (namespace, author, key) tuple replace older ones
|
||||||
|
- A new entry at key "/foo" can delete all entries under "/foo/*" if it's newer
|
||||||
|
- Different authors can coexist on the same key — each author's latest entry is kept
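
The three steps can be tried out on a toy store. This sketch covers a single namespace and author, reduces the record value to a bare `u64` timestamp, and assumes that "prefix" includes the key itself (so an equal-or-newer entry at the same key also causes rejection). `ToyStore` is a hypothetical name, not a type from the crate.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Eq)]
pub enum InsertOutcome {
    NotInserted,
    Inserted { removed: usize },
}

/// Toy store for one (namespace, author): key bytes -> value (a timestamp).
#[derive(Default)]
pub struct ToyStore {
    entries: BTreeMap<Vec<u8>, u64>,
}

impl ToyStore {
    pub fn put(&mut self, key: &[u8], value: u64) -> InsertOutcome {
        // 1. Reject if any existing entry whose key is a prefix of `key`
        //    (including `key` itself) has a value >= the new value.
        let dominated = self
            .entries
            .iter()
            .any(|(k, v)| key.starts_with(k) && *v >= value);
        if dominated {
            return InsertOutcome::NotInserted;
        }
        // 2. Remove entries whose key starts with `key` and whose value is <=.
        let before = self.entries.len();
        self.entries
            .retain(|k, v| !(k.starts_with(key) && *v <= value));
        let removed = before - self.entries.len();
        // 3. Store the new entry.
        self.entries.insert(key.to_vec(), value);
        InsertOutcome::Inserted { removed }
    }
}
```

For example, after inserting `foo/a` at time 1 and `foo/b` at time 2, inserting `foo` at time 3 removes both, and a later attempt to insert `foo/c` at time 2 is rejected because `foo` dominates it.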
|
||||||
|
|
||||||
|
### Timestamp and Future Shift
|
||||||
|
|
||||||
|
Timestamps are in **microseconds since Unix epoch**. There is a maximum allowed future shift:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 = 10 * 60 * Duration::from_secs(1).as_millis() as u64;
|
||||||
|
```
|
||||||
|
|
||||||
|
Entries with timestamps more than 10 minutes in the future of the local clock are rejected during validation.
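
The arithmetic behind the constant can be checked directly. The sketch below evaluates the same expression and converts the result under two unit interpretations; `exceeds_future_shift` is a hypothetical helper, not a function from the crate.

```rust
use std::time::Duration;

/// Same expression as quoted above: 10 * 60 * 1000.
pub const MAX_TIMESTAMP_FUTURE_SHIFT: u64 =
    10 * 60 * Duration::from_secs(1).as_millis() as u64;

/// Hypothetical helper: is `entry_ts` further ahead of `now` than allowed?
/// Both arguments must use the same unit as the constant.
pub fn exceeds_future_shift(entry_ts: u64, now: u64) -> bool {
    entry_ts > now.saturating_add(MAX_TIMESTAMP_FUTURE_SHIFT)
}

/// The constant read as microseconds.
pub fn shift_if_micros() -> Duration {
    Duration::from_micros(MAX_TIMESTAMP_FUTURE_SHIFT)
}

/// The constant read as milliseconds.
pub fn shift_if_millis() -> Duration {
    Duration::from_millis(MAX_TIMESTAMP_FUTURE_SHIFT)
}
```

The expression evaluates to 600,000. Read as milliseconds that is 10 minutes, but read as microseconds (the unit stated for entry timestamps) it is only 0.6 seconds. The "10 minutes" figure therefore depends on which unit the comparison actually uses, which is worth confirming against the crate source.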
|
||||||
|
|
||||||
|
### Content Status
|
||||||
|
|
||||||
|
Each entry's content has an availability status:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum ContentStatus {
|
||||||
|
Complete, // Content blob is fully available locally
|
||||||
|
Incomplete, // Partially available
|
||||||
|
Missing, // Not available
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This status is communicated during sync to help peers decide whether to download content.
|
||||||
|
|
||||||
|
### AuthorHeads (Efficient Sync Optimization)
|
||||||
|
|
||||||
|
`AuthorHeads` tracks the latest timestamp for each author in a document:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct AuthorHeads {
|
||||||
|
heads: BTreeMap<AuthorId, Timestamp>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This enables a quick check: `has_news_for(other)` — comparing local and remote heads to determine whether sync would yield any new entries. If all timestamps are at least as recent locally, no sync is needed.
|
||||||
|
|
||||||
|
`AuthorHeads` can be serialized with a size limit, dropping the oldest entries when the limit is exceeded.
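
A minimal sketch of the head comparison, assuming that "news" means an author the other side has not seen or a strictly newer timestamp for a known author. `AuthorId` and `Timestamp` are simplified aliases, and the `insert` helper is an assumption added for the example.

```rust
use std::collections::BTreeMap;

pub type AuthorId = [u8; 32];
pub type Timestamp = u64;

#[derive(Debug, Default, Clone)]
pub struct AuthorHeads {
    heads: BTreeMap<AuthorId, Timestamp>,
}

impl AuthorHeads {
    /// Record a timestamp for an author, keeping only the latest one.
    pub fn insert(&mut self, author: AuthorId, ts: Timestamp) {
        let head = self.heads.entry(author).or_insert(ts);
        if ts > *head {
            *head = ts;
        }
    }

    /// True if `self` knows an author that `other` lacks, or has a strictly
    /// newer timestamp for an author both sides know.
    pub fn has_news_for(&self, other: &Self) -> bool {
        self.heads.iter().any(|(author, ts)| match other.heads.get(author) {
            None => true,
            Some(other_ts) => ts > other_ts,
        })
    }
}
```

A peer that receives remote heads can call `remote.has_news_for(&local)` and skip the sync round when it returns `false`.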
|
||||||
|
|
||||||
|
## Event System
|
||||||
|
|
||||||
|
Replicas emit events through a subscription system:
|
||||||
|
```rust
pub enum Event {
    LocalInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
    },
    RemoteInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
        from: PeerIdBytes,
        should_download: bool, // based on download policy
        remote_content_status: ContentStatus,
    },
}
```
|
||||||
|
|
||||||
|
Subscribers use `async_channel` for non-blocking notification delivery. The `ReplicaInfo::subscribe()` method registers a sender, and events are fanned out to all subscribers.
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
Entry validation during insertion checks:
|
||||||
|
|
||||||
|
1. **Namespace match**: The entry's namespace must match the replica's namespace
|
||||||
|
2. **Signature verification**: For non-local entries, both namespace and author signatures are verified
|
||||||
|
3. **Timestamp check**: The entry must not be more than `MAX_TIMESTAMP_FUTURE_SHIFT` in the future
|
||||||
|
4. **Empty entry check**: An empty entry must have `hash == EMPTY && len == 0`, and a non-empty entry must have `len != 0`
|
||||||
272
docs/research/references/iroh/iroh-docs/03-sync-protocol.md
Normal file
@@ -0,0 +1,272 @@
|
|||||||
|
# iroh-docs: Range-Based Set Reconciliation (Ranger)
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The sync protocol in iroh-docs is based on **Range-Based Set Reconciliation**, implementing the algorithm described in [Aljoscha Meyer's paper (arXiv:2212.13567)](https://arxiv.org/abs/2212.13567).
|
||||||
|
|
||||||
|
The core idea: two peers can efficiently compute the union of their entry sets by recursively partitioning the sets and comparing **fingerprints** (hashes) of partitions. When fingerprints match, no further work is needed. When they differ, the partition is subdivided until the difference can be resolved by sending the actual entries.
|
||||||
|
|
||||||
|
## Key Abstractions
|
||||||
|
|
||||||
|
### RangeEntry Trait
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait RangeEntry: Debug + Clone {
|
||||||
|
type Key: RangeKey;
|
||||||
|
type Value: RangeValue;
|
||||||
|
|
||||||
|
fn key(&self) -> &Self::Key;
|
||||||
|
fn value(&self) -> &Self::Value;
|
||||||
|
fn as_fingerprint(&self) -> Fingerprint;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`SignedEntry` implements `RangeEntry`:
|
||||||
|
- `Key` = `RecordIdentifier` (namespace || author || key bytes)
|
||||||
|
- `Value` = `Record` (timestamp, hash, len)
|
||||||
|
- Fingerprint = BLAKE3 hash of (namespace || author || key || timestamp || content_hash)
|
||||||
|
|
||||||
|
### RangeKey Trait
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait RangeKey: Sized + Debug + Ord + PartialEq + Clone + 'static {
|
||||||
|
fn is_prefix_of(&self, other: &Self) -> bool; // test-only
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`RecordIdentifier` implements this via byte-level prefix matching: `(namespace, author, key)` where key prefix matching supports the hierarchical deletion semantics.
|
||||||
|
|
||||||
|
### RangeValue Trait
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait RangeValue: Sized + Debug + Ord + PartialEq + Clone + 'static {}
|
||||||
|
```
|
||||||
|
|
||||||
|
`Record` implements `RangeValue` with ordering by `(timestamp, hash)` — the Last-Writer-Wins ordering.
|
||||||
|
|
||||||
|
### Fingerprint
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Fingerprint(pub [u8; 32]); // BLAKE3 hash
|
||||||
|
```
|
||||||
|
|
||||||
|
Fingerprints are computed by XOR-ing the individual entry fingerprints within a range. This means:
|
||||||
|
- The fingerprint of the empty set is `BLAKE3([])` (the hash of nothing)
|
||||||
|
- Adding/removing an entry toggles its contribution via XOR
|
||||||
|
- Equal sets produce equal fingerprints
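
These properties follow from XOR being commutative and self-inverse, and can be demonstrated with a small sketch. The digest below is a stand-in built on `std`'s `DefaultHasher` to avoid an external dependency; it is NOT BLAKE3, and `of_entry` / `range_fingerprint` are hypothetical helper names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fingerprint(pub [u8; 32]);

/// Stand-in 32-byte digest (NOT BLAKE3): four seeded 64-bit hashes.
fn toy_digest(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut hasher = DefaultHasher::new();
        (i as u64).hash(&mut hasher);
        data.hash(&mut hasher);
        chunk.copy_from_slice(&hasher.finish().to_be_bytes());
    }
    out
}

impl Fingerprint {
    /// Fingerprint of the empty set: the digest of no input.
    pub fn empty() -> Self {
        Fingerprint(toy_digest(&[]))
    }

    /// Fingerprint of a single entry, given its encoded bytes.
    pub fn of_entry(bytes: &[u8]) -> Self {
        Fingerprint(toy_digest(bytes))
    }

    /// Fold another fingerprint into this one byte by byte.
    pub fn xor_assign(&mut self, other: &Fingerprint) {
        for (a, b) in self.0.iter_mut().zip(other.0.iter()) {
            *a ^= b;
        }
    }
}

/// Fingerprint of a set of entries: start from the empty fingerprint and
/// XOR in each entry's fingerprint. The result is independent of order.
pub fn range_fingerprint<'a>(entries: impl IntoIterator<Item = &'a [u8]>) -> Fingerprint {
    let mut fp = Fingerprint::empty();
    for entry in entries {
        fp.xor_assign(&Fingerprint::of_entry(entry));
    }
    fp
}
```

The converse of the last bullet holds only probabilistically: differing sets produce differing fingerprints unless their XOR-combined hashes happen to collide.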
|
||||||
|
|
||||||
|
## Range Concept
|
||||||
|
|
||||||
|
A `Range<K>` represents a half-open interval `[x, y)` in the key space, with special semantics:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub(crate) struct Range<K> {
|
||||||
|
x: K,
|
||||||
|
y: K,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `x == y`: The entire set (all elements)
|
||||||
|
- `x < y`: Standard half-open interval `[x, y)` — includes `x`, excludes `y`
|
||||||
|
- `x > y`: Wrapping range — elements from `x` to end + beginning to `y`
|
||||||
|
|
||||||
|
This wrapping range concept allows the algorithm to work with circular key spaces where the "first" element might be anywhere.
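
The three cases translate into a small membership test. This is a sketch with public helper names (`new`, `contains`) chosen for the example; the crate's `Range` is `pub(crate)` and its exact method names are not shown in this document.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone)]
pub struct Range<K> {
    x: K,
    y: K,
}

impl<K: Ord> Range<K> {
    pub fn new(x: K, y: K) -> Self {
        Range { x, y }
    }

    /// Membership test covering the three cases listed above.
    pub fn contains(&self, t: &K) -> bool {
        match self.x.cmp(&self.y) {
            // x == y: the range covers the entire set.
            Ordering::Equal => true,
            // x < y: ordinary half-open interval [x, y).
            Ordering::Less => &self.x <= t && t < &self.y,
            // x > y: wraps around, [x, end] plus [start, y).
            Ordering::Greater => &self.x <= t || t < &self.y,
        }
    }
}
```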
|
||||||
|
|
||||||
|
## Protocol Messages
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub type ProtocolMessage = crate::ranger::Message<SignedEntry>;
|
||||||
|
```
|
||||||
|
|
||||||
|
### Message Structure
|
||||||
|
```rust
pub struct Message<E: RangeEntry> {
    parts: Vec<MessagePart<E>>,
}

pub enum MessagePart<E: RangeEntry> {
    RangeFingerprint(RangeFingerprint<E::Key>), // "Here's a fingerprint for this range"
    RangeItem(RangeItem<E>),                    // "Here are the entries in this range"
}

pub struct RangeFingerprint<K> {
    range: Range<K>,
    fingerprint: Fingerprint,
}

pub struct RangeItem<E: RangeEntry> {
    range: Range<E::Key>,
    values: Vec<(E, ContentStatus)>,
    have_local: bool, // If true, sender already has these entries
}
```
|
||||||
|
|
||||||
|
The `have_local` flag is an optimization: when a peer sends entries AND indicates it already has them locally, the receiver doesn't need to send its own entries in that range back.
|
||||||
|
|
||||||
|
### Wire Format
|
||||||
|
|
||||||
|
Messages are serialized using `postcard` (a compact serde format) and framed with a 4-byte big-endian length prefix via `SyncCodec`:
|
||||||
|
```
┌─────────────────┬──────────────────────────────┐
│ u32 BE length   │ postcard-encoded Message     │
└─────────────────┴──────────────────────────────┘
```
|
||||||
|
|
||||||
|
Max message size: 1 GiB (`MAX_MESSAGE_SIZE = 1024 * 1024 * 1024`).
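
The framing itself is independent of `postcard` and can be sketched on raw byte payloads. `encode_frame` and `decode_frame` are hypothetical names for this illustration; the crate's `SyncCodec` is not reproduced here.

```rust
pub const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024; // 1 GiB

/// Prepend a 4-byte big-endian length to an already-serialized payload.
pub fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, String> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(format!("payload of {} bytes exceeds limit", payload.len()));
    }
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    Ok(frame)
}

/// Try to read one frame from the front of `buf`.
/// Returns `Ok(None)` when more bytes are needed, otherwise the payload and
/// the total number of bytes consumed (prefix + payload).
pub fn decode_frame(buf: &[u8]) -> Result<Option<(&[u8], usize)>, String> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(format!("declared length {len} exceeds limit"));
    }
    if buf.len() < 4 + len {
        return Ok(None);
    }
    Ok(Some((&buf[4..4 + len], 4 + len)))
}
```

Checking the declared length before waiting for the payload lets a receiver reject an oversized frame without buffering it.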
|
||||||
|
|
||||||
|
## Sync Algorithm Walkthrough
|
||||||
|
|
||||||
|
### 1. Initiation (Alice → Bob)
|
||||||
|
|
||||||
|
Alice generates the initial message:
|
||||||
|
```rust
fn init<S: Store<E>>(store: &mut S) -> Result<Self, S::Error> {
    let x = store.get_first()?;             // First key, or default
    let range = Range::new(x.clone(), x);   // "All elements" range
    let fingerprint = store.get_fingerprint(&range)?;
    Ok(Message { parts: vec![RangeFingerprint { range, fingerprint }] })
}
```
|
||||||
|
|
||||||
|
This sends a single fingerprint covering the entire set.
|
||||||
|
|
||||||
|
### 2. Processing (Bob processes Alice's message)
|
||||||
|
|
||||||
|
For each part in the message:
|
||||||
|
|
||||||
|
**Case 1: RangeFingerprint matches local fingerprint** → Nothing to do, sets are equal in this range.
|
||||||
|
|
||||||
|
**Case 2: The received fingerprint equals the empty-set fingerprint, OR the range has ≤ 1 local entry** → Send all local entries in the range as a `RangeItem`.
|
||||||
|
|
||||||
|
**Case 3: Recurse** → Split the range into `split_factor` partitions, compute fingerprints, and send either `RangeFingerprint` (if partition is large) or `RangeItem` (if partition is small enough, ≤ `max_set_size`).
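
The three cases can be condensed into a decision function. This is a simplified sketch under stated assumptions: keys are plain `u32` values already sorted, fingerprint comparison is reduced to two booleans, and partitions are formed by equal-sized chunking, which may differ from how the crate picks partition boundaries. All names here are hypothetical.

```rust
/// What to send for one partition of a mismatching range.
#[derive(Debug, PartialEq, Eq)]
pub enum Part {
    /// Partition is small enough: send the entries themselves.
    Items(Vec<u32>),
    /// Partition is still large: send only a fingerprint for these keys.
    Fingerprint(Vec<u32>),
}

#[derive(Debug, PartialEq, Eq)]
pub enum Reply {
    Nothing,            // Case 1: fingerprints match
    AllItems(Vec<u32>), // Case 2: send everything in the range
    Split(Vec<Part>),   // Case 3: recurse into partitions
}

pub fn respond(
    fingerprints_match: bool,
    remote_is_empty: bool,
    local: &[u32], // sorted local keys inside the received range
    split_factor: usize,
    max_set_size: usize,
) -> Reply {
    if fingerprints_match {
        return Reply::Nothing;
    }
    if remote_is_empty || local.len() <= 1 {
        return Reply::AllItems(local.to_vec());
    }
    // At least two local entries here, so the chunk length is >= 1.
    let chunk_len = local.len().div_ceil(split_factor.max(1));
    let parts = local
        .chunks(chunk_len)
        .map(|chunk| {
            if chunk.len() <= max_set_size {
                Part::Items(chunk.to_vec())
            } else {
                Part::Fingerprint(chunk.to_vec())
            }
        })
        .collect();
    Reply::Split(parts)
}
```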
|
||||||
|
|
||||||
|
### 3. Processing RangeItem
|
||||||
|
|
||||||
|
When a peer receives a `RangeItem`:
|
||||||
|
|
||||||
|
1. **Validate** each incoming entry using `validate_cb`
|
||||||
|
2. **Insert** valid entries via `Store::put()` (which handles prefix deletion)
|
||||||
|
3. **Notify** via `on_insert_cb` for actually-inserted entries
|
||||||
|
4. If `have_local` is false, compute the **diff** — entries in the local range not present in the received set — and send them back
|
||||||
|
|
||||||
|
### Configuration
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct SyncConfig {
|
||||||
|
max_set_size: usize, // Default: 1 — entries to send before using fingerprints
|
||||||
|
split_factor: usize, // Default: 2 — number of partitions per recursion step
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
With `max_set_size = 1` and `split_factor = 2`, the algorithm behaves like a binary search: each fingerprint mismatch splits the range in two and sends fingerprints for both halves.
|
||||||
|
|
||||||
|
## Store Trait
|
||||||
|
|
||||||
|
The `Store` trait provides the interface that the reconciliation algorithm needs:
|
||||||
|
```rust
pub trait Store<E: RangeEntry>: Sized {
    type Error: Debug + Send + Sync + Into<anyhow::Error> + 'static;
    type RangeIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;
    type ParentIterator<'a>: Iterator<Item = Result<E, Self::Error>> where Self: 'a, E: 'a;

    fn get_first(&mut self) -> Result<E::Key, Self::Error>;
    fn get_fingerprint(&mut self, range: &Range<E::Key>) -> Result<Fingerprint, Self::Error>;
    fn entry_put(&mut self, entry: E) -> Result<(), Self::Error>;
    fn get_range(&mut self, range: Range<E::Key>) -> Result<Self::RangeIterator<'_>, Self::Error>;
    fn prefixes_of(&mut self, key: &E::Key) -> Result<Self::ParentIterator<'_>, Self::Error>;
    fn remove_prefix_filtered(&mut self, prefix: &E::Key, predicate: impl Fn(&E::Value) -> bool) -> Result<usize, Self::Error>;
    fn initial_message(&mut self) -> Result<Message<E>, Self::Error>;
    async fn process_message<F, F2, F3>(...) -> Result<Option<Message<E>>, Self::Error>;
    fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error>;
}
```
|
||||||
|
|
||||||
|
### Insert Semantics in `Store::put()`
|
||||||
|
|
||||||
|
The `put` method implements the CRDT insert logic:
|
||||||
|
```rust
fn put(&mut self, entry: E) -> Result<InsertOutcome, Self::Error> {
    // 1. Check prefix entries: if any parent entry has value >= new entry, reject
    for prefix_entry in self.prefixes_of(entry.key())? {
        // The iterator yields `Result<E, Self::Error>`, so unwrap each item first
        let prefix_entry = prefix_entry?;
        if entry.value() <= prefix_entry.value() {
            return Ok(InsertOutcome::NotInserted);
        }
    }

    // 2. Remove entries whose key is prefixed by new entry's key AND whose value is <=
    let removed = self.remove_prefix_filtered(entry.key(), |v| entry.value() >= v)?;

    // 3. Insert the new entry
    self.entry_put(entry)?;
    Ok(InsertOutcome::Inserted { removed })
}
```
|
||||||
|
|
||||||
|
### InsertOutcome
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum InsertOutcome {
|
||||||
|
NotInserted, // A newer or equal entry already exists
|
||||||
|
Inserted { removed: usize }, // Successfully inserted; reports removed entries
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Sync Flow at the Protocol Level
|
||||||
|
|
||||||
|
The `Replica` type provides the sync interface:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Create initial message for sync
|
||||||
|
fn sync_initial_message(&mut self) -> anyhow::Result<ProtocolMessage>
|
||||||
|
|
||||||
|
// Process an incoming message and produce optional reply
|
||||||
|
async fn sync_process_message(
|
||||||
|
&mut self,
|
||||||
|
message: ProtocolMessage,
|
||||||
|
from_peer: PeerIdBytes,
|
||||||
|
state: &mut SyncOutcome,
|
||||||
|
) -> Result<Option<ProtocolMessage>, anyhow::Error>
|
||||||
|
```
|
||||||
|
|
||||||
|
### SyncOutcome
|
||||||
|
|
||||||
|
Tracks the result of a sync session:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct SyncOutcome {
|
||||||
|
pub heads_received: AuthorHeads, // Latest timestamps per author from remote
|
||||||
|
pub num_recv: usize, // Number of entries received
|
||||||
|
pub num_sent: usize, // Number of entries sent
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Network Protocol (Codec)
|
||||||
|
|
||||||
|
The sync protocol operates over a QUIC bidirectional stream:
|
||||||
|
|
||||||
|
1. **Alice** (initiator) sends `Message::Init { namespace, message }`
|
||||||
|
2. **Bob** (responder) validates the namespace and either:
|
||||||
|
- Accepts and processes the initial message
|
||||||
|
- Rejects with `Message::Abort { reason }`
|
||||||
|
3. Both peers exchange `Message::Sync(message)` rounds until one side has no reply (convergence reached)
|
||||||
|
|
||||||
|
The `BobState` manages the responder side, tracking namespace and `SyncOutcome` progress across message rounds.
|
||||||
|
|
||||||
|
### Abort Reasons
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum AbortReason {
|
||||||
|
NotFound, // Namespace not available
|
||||||
|
AlreadySyncing, // Already syncing this namespace
|
||||||
|
InternalServerError,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Concurrent Sync Prevention
|
||||||
|
|
||||||
|
When both peers try to sync with each other simultaneously, the system uses a deterministic tiebreaker based on comparing `EndpointId` bytes — the peer with the larger ID accepts, the other connects.
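
The tiebreaker amounts to a byte-wise comparison that both peers evaluate identically. A minimal sketch, assuming 32-byte endpoint IDs; `Role` and `resolve_simultaneous_sync` are hypothetical names for this illustration.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Role {
    /// Keep the incoming sync request and drop our outgoing attempt.
    Accept,
    /// Keep our outgoing attempt; the remote side will accept it.
    Connect,
}

/// Deterministic tiebreak for simultaneous sync attempts: the peer with the
/// larger ID accepts, the other connects. Arrays compare lexicographically.
pub fn resolve_simultaneous_sync(me: &[u8; 32], peer: &[u8; 32]) -> Role {
    if me > peer {
        Role::Accept
    } else {
        Role::Connect
    }
}
```

Because the comparison is antisymmetric, the two peers always end up with opposite roles.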
|
||||||
@@ -0,0 +1,257 @@
|
|||||||
|
# iroh-docs: Store and Persistence
|
||||||
|
|
||||||
|
## Store Architecture
|
||||||
|
|
||||||
|
The store is implemented in `store::fs::Store` using `redb`, an embedded key-value database. It supports two modes:
|
||||||
|
|
||||||
|
- **In-memory**: `Store::memory()` — backed by a `Vec<u8>` via `redb::backends::InMemoryBackend`
|
||||||
|
- **Persistent**: `Store::persistent(path)` — backed by a single file on disk
|
||||||
|
|
||||||
|
Both modes use the same `redb` table structure.
|
||||||
|
|
||||||
|
## redb Table Schema
|
||||||
|
|
||||||
|
### Authors Table
|
||||||
|
```
|
||||||
|
Table: "authors-1"
|
||||||
|
Key: [u8; 32] (AuthorId)
|
||||||
|
Value: [u8; 32] (Author secret key bytes)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Namespaces Table
|
||||||
|
```
|
||||||
|
Table: "namespaces-2"
|
||||||
|
Key: [u8; 32] (NamespaceId)
|
||||||
|
Value: (u8, [u8; 32]) (CapabilityKind, key bytes)
|
||||||
|
```
|
||||||
|
|
||||||
|
The `CapabilityKind` discriminates between `Write = 1` (full key stored) and `Read = 2` (only the public key / namespace ID stored).
|
||||||
|
|
||||||
|
### Records Table (Primary)
|
||||||
|
```
|
||||||
|
Table: "records-1"
|
||||||
|
Key: (NamespaceId, AuthorId, key_bytes) = ([u8; 32], [u8; 32], &[u8])
|
||||||
|
Value: (timestamp, namespace_sig, author_sig, len, hash) = (u64, &[u8; 64], &[u8; 64], u64, &[u8; 32])
|
||||||
|
```
|
||||||
|
|
||||||
|
This is the main table storing all document entries. The key layout `(namespace, author, key)` enables efficient range queries for the sync algorithm.
|
||||||
|
|
||||||
|
### Latest-Per-Author Table
|
||||||
|
```
|
||||||
|
Table: "latest-by-author-1"
|
||||||
|
Key: (NamespaceId, AuthorId) = (&[u8; 32], &[u8; 32])
|
||||||
|
Value: (timestamp, key_bytes) = (u64, &[u8])
|
||||||
|
```
|
||||||
|
|
||||||
|
Used to quickly determine the latest entry timestamp for each author, supporting `AuthorHeads` computation and `has_news_for_us()` checks.
|
||||||
|
|
||||||
|
### Records-By-Key Table (Index)
|
||||||
|
```
|
||||||
|
Table: "records-by-key-1"
|
||||||
|
Key: (NamespaceId, key_bytes, AuthorId) = (&[u8; 32], &[u8], &[u8; 32])
|
||||||
|
Value: ()
|
||||||
|
```
|
||||||
|
|
||||||
|
An index table that enables efficient queries by key prefix, supporting `Query::key_prefix()` and `Query::key_exact()` lookups.
|
||||||
|
|
||||||
|
### Namespace Peers Table (Multimap)
|
||||||
|
```
|
||||||
|
MultimapTable: "sync-peers-1"
|
||||||
|
Key: &[u8; 32] (NamespaceId)
|
||||||
|
Value: (Nanos, &PeerIdBytes) (timestamp_nanos, peer_id)
|
||||||
|
```
|
||||||
|
|
||||||
|
Stores up to 5 (`PEERS_PER_DOC_CACHE_SIZE`) recently useful peers per namespace. It behaves as an LRU cache: once it is full, registering a new peer evicts the oldest one.
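
The eviction behavior can be sketched in memory with a small vector instead of a `redb` multimap. `PeerCache` and its methods are hypothetical names, and refreshing an already-known peer's timestamp on re-registration is an assumption of this sketch.

```rust
pub const PEERS_PER_DOC_CACHE_SIZE: usize = 5;

/// In-memory stand-in for the per-namespace peer list: (timestamp_nanos, peer_id).
#[derive(Debug, Default)]
pub struct PeerCache {
    peers: Vec<(u64, [u8; 32])>,
}

impl PeerCache {
    /// Register a useful peer. A known peer gets its timestamp refreshed;
    /// when the cache overflows, the entry with the oldest timestamp is evicted.
    pub fn register(&mut self, now_nanos: u64, peer: [u8; 32]) {
        self.peers.retain(|(_, known)| *known != peer);
        self.peers.push((now_nanos, peer));
        if self.peers.len() > PEERS_PER_DOC_CACHE_SIZE {
            let oldest = (0..self.peers.len()).min_by_key(|&i| self.peers[i].0);
            if let Some(index) = oldest {
                self.peers.remove(index);
            }
        }
    }

    pub fn contains(&self, peer: &[u8; 32]) -> bool {
        self.peers.iter().any(|(_, known)| known == peer)
    }

    pub fn len(&self) -> usize {
        self.peers.len()
    }
}
```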
|
||||||
|
|
||||||
|
### Download Policy Table
|
||||||
|
```
|
||||||
|
Table: "download-policy-1"
|
||||||
|
Key: &[u8; 32] (NamespaceId)
|
||||||
|
Value: &[u8] (postcard-encoded DownloadPolicy)
|
||||||
|
```
|
||||||
|
|
||||||
|
Per-namespace download policies controlling which content blobs to automatically download.
|
||||||
|
|
||||||
|
## Store Operations
|
||||||
|
|
||||||
|
### Transaction Model
|
||||||
|
|
||||||
|
The `Store` uses a "current transaction" approach:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum CurrentTransaction {
|
||||||
|
None,
|
||||||
|
Read(ReadOnlyTables),
|
||||||
|
Write(TransactionAndTables),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- Read operations obtain a read snapshot
|
||||||
|
- Write operations batch into a write transaction
|
||||||
|
- Transactions older than `MAX_COMMIT_DELAY` (500ms) are automatically committed
|
||||||
|
- `flush()` commits any pending write transaction
|
||||||
|
|
||||||
|
### Core Methods
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Create/open/close replicas
|
||||||
|
fn new_replica(&mut self, namespace: NamespaceSecret) -> Result<Replica<'_>>;
|
||||||
|
fn open_replica(&mut self, namespace_id: &NamespaceId) -> Result<Replica<'_>>;
|
||||||
|
fn close_replica(&mut self, id: NamespaceId);
|
||||||
|
fn import_namespace(&mut self, capability: Capability) -> Result<ImportNamespaceOutcome>;
|
||||||
|
|
||||||
|
// Author management
|
||||||
|
fn new_author<R: CryptoRng>(&mut self, rng: &mut R) -> Result<Author>;
|
||||||
|
fn import_author(&mut self, author: Author) -> Result<()>;
|
||||||
|
fn get_author(&mut self, author_id: &AuthorId) -> Result<Option<Author>>;
|
||||||
|
fn delete_author(&mut self, author: AuthorId) -> Result<()>;
|
||||||
|
|
||||||
|
// Queries
|
||||||
|
fn get_many(&mut self, namespace: NamespaceId, query: impl Into<Query>) -> Result<QueryIterator>;
|
||||||
|
fn get_exact(&mut self, namespace: NamespaceId, author: AuthorId, key: impl AsRef<[u8]>, include_empty: bool) -> Result<Option<SignedEntry>>;
|
||||||
|
fn get_latest_for_each_author(&mut self, namespace: NamespaceId) -> Result<LatestIterator<'_>>;
|
||||||
|
|
||||||
|
// Sync support
|
||||||
|
fn has_news_for_us(&mut self, namespace: NamespaceId, heads: &AuthorHeads) -> Result<Option<NonZeroU64>>;
|
||||||
|
fn get_sync_peers(&mut self, namespace: &NamespaceId) -> Result<Option<PeersIter>>;
|
||||||
|
fn register_useful_peer(&mut self, namespace: NamespaceId, peer: PeerIdBytes) -> Result<()>;
|
||||||
|
|
||||||
|
// Content
|
||||||
|
fn content_hashes(&mut self) -> Result<ContentHashesIterator>;
|
||||||
|
```
|
||||||
|
|
||||||
|
### ImportNamespaceOutcome
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum ImportNamespaceOutcome {
|
||||||
|
Inserted, // New namespace created
|
||||||
|
Upgraded, // Existing namespace upgraded from Read to Write
|
||||||
|
NoChange, // Namespace already existed with same or higher capability
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Query System
|
||||||
|
|
||||||
|
The `Query` type supports flexible entry lookups:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Query {
|
||||||
|
kind: QueryKind,
|
||||||
|
filter_author: AuthorFilter,
|
||||||
|
filter_key: KeyFilter,
|
||||||
|
limit: Option<u64>,
|
||||||
|
offset: u64,
|
||||||
|
include_empty: bool,
|
||||||
|
sort_direction: SortDirection,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Query Kinds
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum QueryKind {
|
||||||
|
Flat(FlatQuery), // Returns all matching entries
|
||||||
|
SingleLatestPerKey(SingleLatestPerKeyQuery), // Returns only latest entry per key
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- **Flat**: Returns all entries matching the filters, sorted by `(namespace, author, key)` or `(namespace, key, author)` depending on `SortBy`
|
||||||
|
- **SingleLatestPerKey**: Groups by key and returns only the latest entry (by record value ordering) per key
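To make the difference between the two query kinds concrete, here is a small std-only sketch of the "single latest per key" reduction. It assumes that "latest" means the highest timestamp and that ties keep the first entry seen; the `Entry` struct and the function name are illustrative and not taken from the crate.

```rust
use std::collections::BTreeMap;

/// Simplified entry: a one-byte author and a plain timestamp stand in for
/// `AuthorId` and the record ordering used by the real store.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    key: Vec<u8>,
    author: u8,
    timestamp: u64,
}

/// Groups entries by key and keeps only the newest one per key.
/// The result is ordered by key because a `BTreeMap` is used for grouping.
fn single_latest_per_key(entries: &[Entry]) -> Vec<Entry> {
    let mut latest: BTreeMap<&[u8], &Entry> = BTreeMap::new();
    for e in entries {
        let k = e.key.as_slice();
        // Replace only when there is no entry yet or this one is strictly newer.
        let replace = latest.get(k).map_or(true, |cur| e.timestamp > cur.timestamp);
        if replace {
            latest.insert(k, e);
        }
    }
    latest.into_values().cloned().collect()
}
```

A flat query over the same input would return every entry, including older versions of a key written by other authors.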
|
||||||
|
|
||||||
|
### Filters
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum KeyFilter {
|
||||||
|
Any, // Match all keys
|
||||||
|
Exact(Bytes), // Exact key match
|
||||||
|
Prefix(Bytes), // Key starts with prefix
|
||||||
|
}
|
||||||
|
|
||||||
|
enum AuthorFilter {
|
||||||
|
Any, // Match all authors
|
||||||
|
Exact(AuthorId), // Match specific author
|
||||||
|
}
|
||||||
|
```
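The matching semantics of the two filters can be sketched as follows. This is an illustration only: `Vec<u8>` replaces `Bytes` and `[u8; 32]` replaces `AuthorId` so the code needs nothing beyond std, and the `matches` helpers are hypothetical names rather than the crate's API.

```rust
#[derive(Debug, Clone, PartialEq)]
enum KeyFilter {
    Any,
    Exact(Vec<u8>),
    Prefix(Vec<u8>),
}

#[derive(Debug, Clone, PartialEq)]
enum AuthorFilter {
    Any,
    Exact([u8; 32]),
}

impl KeyFilter {
    /// Hypothetical helper: does `key` pass this filter?
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            KeyFilter::Any => true,
            KeyFilter::Exact(k) => key == k.as_slice(),
            KeyFilter::Prefix(p) => key.starts_with(p),
        }
    }
}

impl AuthorFilter {
    /// Hypothetical helper: does `author` pass this filter?
    fn matches(&self, author: &[u8; 32]) -> bool {
        match self {
            AuthorFilter::Any => true,
            AuthorFilter::Exact(a) => a == author,
        }
    }
}
```

An entry is part of a query result only if both its key and its author pass their respective filters.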
|
||||||
|
|
||||||
|
### Builder Pattern
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Get all entries
|
||||||
|
Query::all()
|
||||||
|
|
||||||
|
// Get entries by author
|
||||||
|
Query::author(author_id)
|
||||||
|
|
||||||
|
// Get entries by key prefix
|
||||||
|
Query::key_prefix(b"/path/")
|
||||||
|
|
||||||
|
// Get single latest entry per key
|
||||||
|
Query::single_latest_per_key()
|
||||||
|
.key_prefix(b"/path/")
|
||||||
|
.author(author_id)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Download Policy
|
||||||
|
|
||||||
|
A `DownloadPolicy` controls which content blobs are downloaded automatically after sync:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum DownloadPolicy {
|
||||||
|
NothingExcept(Vec<FilterKind>), // Only download matching entries
|
||||||
|
EverythingExcept(Vec<FilterKind>), // Download all except matching (default)
|
||||||
|
}
|
||||||
|
|
||||||
|
pub enum FilterKind {
|
||||||
|
Prefix(Bytes), // Matches keys starting with bytes
|
||||||
|
Exact(Bytes), // Matches exact key
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Default: `EverythingExcept(Vec::new())` — download everything.
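How the two policy variants interact with the filter list can be sketched like this. The sketch evaluates the policy against a raw key and uses `Vec<u8>` in place of `Bytes`; the method name `should_download` is an assumption made for illustration.

```rust
enum FilterKind {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterKind {
    fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterKind::Prefix(p) => key.starts_with(p),
            FilterKind::Exact(k) => key == k.as_slice(),
        }
    }
}

enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>),
    EverythingExcept(Vec<FilterKind>),
}

impl DownloadPolicy {
    /// Hypothetical helper: should the content behind `key` be fetched?
    fn should_download(&self, key: &[u8]) -> bool {
        match self {
            // Allow-list: download only if some filter matches.
            DownloadPolicy::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            // Deny-list: download unless some filter matches.
            DownloadPolicy::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}
```

With an empty filter list, `EverythingExcept` downloads everything and `NothingExcept` downloads nothing, which is why the former is a natural default.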
|
||||||
|
|
||||||
|
## PublicKeyStore
|
||||||
|
|
||||||
|
The `PublicKeyStore` trait caches expanded `ed25519_dalek::VerifyingKey` objects to avoid repeated curve point decompression:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait PublicKeyStore {
|
||||||
|
fn public_key(&self, id: &[u8; 32]) -> Result<VerifyingKey, SignatureError>;
|
||||||
|
fn namespace_key(&self, bytes: &NamespaceId) -> Result<NamespacePublicKey, SignatureError>;
|
||||||
|
fn author_key(&self, bytes: &AuthorId) -> Result<AuthorPublicKey, SignatureError>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `MemPublicKeyStore` implementation uses `Arc<RwLock<HashMap<[u8; 32], VerifyingKey>>>` for thread-safe caching.
|
||||||
|
|
||||||
|
The `Store` itself implements `PublicKeyStore`, leveraging its redb tables for author storage and the in-memory cache for fast verification.
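The caching idea behind `MemPublicKeyStore` can be sketched with std types alone. In this sketch `ExpandedKey` is a placeholder for `VerifyingKey`, the "expansion" is a trivial copy instead of real curve point decompression, and the `expansions` counter exists only to make the cache hit observable; none of these names come from the crate.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Placeholder for an expanded verifying key.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ExpandedKey([u8; 32]);

#[derive(Clone, Default)]
struct MemKeyCache {
    keys: Arc<RwLock<HashMap<[u8; 32], ExpandedKey>>>,
    expansions: Arc<AtomicUsize>,
}

impl MemKeyCache {
    fn public_key(&self, id: &[u8; 32]) -> ExpandedKey {
        // Fast path: shared read lock, no expansion work.
        if let Some(key) = self.keys.read().unwrap().get(id) {
            return *key;
        }
        // Slow path: expand once (stand-in for decompression), then cache.
        self.expansions.fetch_add(1, Ordering::Relaxed);
        let key = ExpandedKey(*id);
        self.keys.write().unwrap().insert(*id, key);
        key
    }

    fn expansions(&self) -> usize {
        self.expansions.load(Ordering::Relaxed)
    }
}
```

Two threads that miss at the same time may both expand the same key in this sketch; that is harmless because the result is deterministic and the second insert overwrites an identical value.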
|
||||||
|
|
||||||
|
## StoreInstance
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct StoreInstance<'a> {
|
||||||
|
namespace: NamespaceId,
|
||||||
|
store: &'a mut Store,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
A `StoreInstance` bundles a namespace ID with a mutable reference to the store, providing the `ranger::Store<SignedEntry>` implementation for the sync algorithm. This is what `Replica` uses internally to perform sync operations.
|
||||||
|
|
||||||
|
## Replica
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Replica<'a, I = Box<ReplicaInfo>> {
|
||||||
|
store: StoreInstance<'a>,
|
||||||
|
info: I,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`Replica` is the primary user-facing type for document operations. It combines:
|
||||||
|
- A `StoreInstance` for data access
|
||||||
|
- `ReplicaInfo` for metadata (capability, subscribers, content status callback)
|
||||||
|
|
||||||
|
Key methods:
|
||||||
|
- `insert(key, author, hash, len)` — Insert a new entry
|
||||||
|
- `delete_prefix(prefix, author)` — Delete entries by key prefix
|
||||||
|
- `insert_remote_entry(entry, from, content_status)` — Insert from sync
|
||||||
|
- `hash_and_insert(key, author, data)` — Hash data and insert
|
||||||
|
- `sync_initial_message()` / `sync_process_message()` — Sync protocol operations
|
||||||
@@ -0,0 +1,343 @@
|
|||||||
|
# iroh-docs: Engine and Live Sync
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The `Engine` is the top-level coordinator for live document synchronization. It brings together:
|
||||||
|
|
||||||
|
1. **SyncHandle/Actor** — Single-threaded actor for all store and replica operations
|
||||||
|
2. **LiveActor** — Async event loop coordinating sync, gossip, and content downloads
|
||||||
|
3. **GossipState** — Integration with `iroh-gossip` for broadcasting updates
|
||||||
|
4. **Blobs/Downloader** — Integration with `iroh-blobs` for content transfer
|
||||||
|
|
||||||
|
## Engine
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Engine {
|
||||||
|
pub endpoint: Endpoint,
|
||||||
|
pub sync: SyncHandle,
|
||||||
|
pub default_author: DefaultAuthor,
|
||||||
|
to_live_actor: mpsc::Sender<ToLiveActor>,
|
||||||
|
actor_handle: AbortOnDropHandle<()>,
|
||||||
|
content_status_cb: ContentStatusCallback,
|
||||||
|
blob_store: iroh_blobs::api::Store,
|
||||||
|
_gc_protect_task: AbortOnDropHandle<()>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Initialization
|
||||||
|
|
||||||
|
```rust
|
||||||
|
Engine::spawn(
|
||||||
|
endpoint, // iroh Endpoint for QUIC connections
|
||||||
|
gossip, // iroh-gossip instance
|
||||||
|
replica_store, // Store for document data
|
||||||
|
bao_store, // iroh-blobs Store for content blobs
|
||||||
|
downloader, // Downloader for fetching blobs
|
||||||
|
default_author_storage, // Where to persist the default author
|
||||||
|
protect_cb, // Optional GC protection callback
|
||||||
|
) -> Result<Self>
|
||||||
|
```
|
||||||
|
|
||||||
|
During spawn:
|
||||||
|
1. A `ContentStatusCallback` is created that checks blob availability in `iroh-blobs`
|
||||||
|
2. A `SyncHandle` actor is spawned on a dedicated thread
|
||||||
|
3. A `LiveActor` is spawned as a tokio task
|
||||||
|
4. The default author is loaded or created
|
||||||
|
5. A GC protection task is started (if callback provided)
|
||||||
|
|
||||||
|
### Key Engine Methods
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Start syncing a document with given peers
|
||||||
|
async fn start_sync(&self, namespace: NamespaceId, peers: Vec<EndpointAddr>) -> Result<()>
|
||||||
|
|
||||||
|
// Stop syncing and leave gossip swarm
|
||||||
|
async fn leave(&self, namespace: NamespaceId, kill_subscribers: bool) -> Result<()>
|
||||||
|
|
||||||
|
// Subscribe to document events
|
||||||
|
async fn subscribe(&self, namespace: NamespaceId) -> Result<impl Stream<Item = Result<LiveEvent>>>
|
||||||
|
|
||||||
|
// Handle incoming QUIC connections
|
||||||
|
async fn handle_connection(&self, conn: Connection) -> Result<()>
|
||||||
|
|
||||||
|
// Shutdown the engine
|
||||||
|
async fn shutdown(&self) -> Result<()>
|
||||||
|
```
|
||||||
|
|
||||||
|
### GC Protection
|
||||||
|
|
||||||
|
The `ProtectCallbackHandler` bridges iroh-docs with iroh-blobs' garbage collection:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let (handler, protect_cb) = ProtectCallbackHandler::new();
|
||||||
|
// protect_cb goes into iroh-blobs GC config
|
||||||
|
// handler goes into Engine::spawn
|
||||||
|
```
|
||||||
|
|
||||||
|
When iroh-blobs runs GC, it calls `protect_cb` which queries the docs store for all content hashes, ensuring blobs referenced by document entries are not garbage-collected.
|
||||||
|
|
||||||
|
## SyncHandle / Actor
|
||||||
|
|
||||||
|
The `SyncHandle` is a handle to a single-threaded actor that processes all store and replica operations sequentially:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct SyncHandle {
|
||||||
|
tx: async_channel::Sender<Action>,
|
||||||
|
join_handle: Arc<Option<std::thread::JoinHandle<()>>>,
|
||||||
|
metrics: Arc<Metrics>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Actor Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
External Code ──async──▶ SyncHandle ──channel──▶ Actor Thread
|
||||||
|
│
|
||||||
|
Store (redb)
|
||||||
|
Replica operations
|
||||||
|
Flush on timeout (500ms)
|
||||||
|
```
|
||||||
|
|
||||||
|
The actor runs on a **dedicated OS thread** (not a tokio task), using `tokio::runtime::Builder::new_current_thread()` internally. This ensures store operations are never concurrent.
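The handle-plus-dedicated-thread pattern can be illustrated without any async machinery. The sketch below swaps `async_channel` and the current-thread tokio runtime for `std::sync::mpsc` and a plain blocking loop, and replaces the redb store with a `HashMap`; the `Action` variants and `Handle` methods are simplified, hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Each request carries its own reply channel, as in the real `Action` enum.
enum Action {
    Insert { key: String, value: u64, reply: mpsc::Sender<Option<u64>> },
    Get { key: String, reply: mpsc::Sender<Option<u64>> },
    Shutdown,
}

struct Handle {
    tx: mpsc::Sender<Action>,
    join: Option<thread::JoinHandle<()>>,
}

impl Handle {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel::<Action>();
        let join = thread::spawn(move || {
            // The store lives on this thread only, so access is never concurrent.
            let mut store: HashMap<String, u64> = HashMap::new();
            for action in rx {
                match action {
                    Action::Insert { key, value, reply } => {
                        let _ = reply.send(store.insert(key, value));
                    }
                    Action::Get { key, reply } => {
                        let _ = reply.send(store.get(&key).copied());
                    }
                    Action::Shutdown => break,
                }
            }
        });
        Handle { tx, join: Some(join) }
    }

    /// Returns the previous value for `key`, if any.
    fn insert(&self, key: &str, value: u64) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Insert { key: key.to_string(), value, reply }).ok()?;
        rx.recv().ok().flatten()
    }

    fn get(&self, key: &str) -> Option<u64> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(Action::Get { key: key.to_string(), reply }).ok()?;
        rx.recv().ok().flatten()
    }

    fn shutdown(mut self) {
        let _ = self.tx.send(Action::Shutdown);
        if let Some(join) = self.join.take() {
            let _ = join.join();
        }
    }
}
```

Because every operation is a message processed in order by one thread, callers get sequential consistency without taking any locks on the store itself.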
|
||||||
|
|
||||||
|
### Action Types
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum Action {
|
||||||
|
ImportAuthor { author, reply },
|
||||||
|
ExportAuthor { author, reply },
|
||||||
|
DeleteAuthor { author, reply },
|
||||||
|
ImportNamespace { capability, reply },
|
||||||
|
ListAuthors { reply },
|
||||||
|
ListReplicas { reply },
|
||||||
|
ContentHashes { reply },
|
||||||
|
FlushStore { reply },
|
||||||
|
Replica(NamespaceId, ReplicaAction),
|
||||||
|
Shutdown { reply },
|
||||||
|
}
|
||||||
|
|
||||||
|
enum ReplicaAction {
|
||||||
|
Open { reply, opts },
|
||||||
|
Close { reply },
|
||||||
|
GetState { reply },
|
||||||
|
SetSync { sync, reply },
|
||||||
|
Subscribe { sender, reply },
|
||||||
|
Unsubscribe { sender, reply },
|
||||||
|
InsertLocal { author, key, hash, len, reply },
|
||||||
|
DeletePrefix { author, key, reply },
|
||||||
|
InsertRemote { entry, from, content_status, reply },
|
||||||
|
SyncInitialMessage { reply },
|
||||||
|
SyncProcessMessage { message, from, state, reply },
|
||||||
|
GetSyncPeers { reply },
|
||||||
|
RegisterUsefulPeer { peer, reply },
|
||||||
|
GetExact { author, key, include_empty, reply },
|
||||||
|
GetMany { query, reply },
|
||||||
|
DropReplica { reply },
|
||||||
|
ExportSecretKey { reply },
|
||||||
|
HasNewsForUs { heads, reply },
|
||||||
|
SetDownloadPolicy { policy, reply },
|
||||||
|
GetDownloadPolicy { reply },
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Replica Opening
|
||||||
|
|
||||||
|
When a replica is opened via the actor, an `OpenReplica` struct is created:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct OpenReplica {
|
||||||
|
info: ReplicaInfo, // Capability, subscribers, content status callback
|
||||||
|
sync: bool, // Whether to accept sync requests
|
||||||
|
handles: usize, // Reference count for open handles
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Multiple handles to the same replica are supported via reference counting.
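A minimal sketch of that reference counting, assuming the replica is released only when the last handle is closed. The `OpenReplicas` type and its methods are hypothetical, and a raw `[u8; 32]` stands in for `NamespaceId`.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct OpenReplicas {
    /// Number of open handles per namespace.
    handles: HashMap<[u8; 32], usize>,
}

impl OpenReplicas {
    /// Registers one more handle and returns the new handle count.
    fn open(&mut self, namespace: [u8; 32]) -> usize {
        let count = self.handles.entry(namespace).or_insert(0);
        *count += 1;
        *count
    }

    /// Drops one handle. Returns `true` only when this was the last handle
    /// and the replica was actually released.
    fn close(&mut self, namespace: &[u8; 32]) -> bool {
        let remaining = match self.handles.get_mut(namespace) {
            Some(count) => {
                *count -= 1; // entries are removed at zero, so count >= 1 here
                *count
            }
            None => return false,
        };
        if remaining == 0 {
            self.handles.remove(namespace);
            true
        } else {
            false
        }
    }

    fn is_open(&self, namespace: &[u8; 32]) -> bool {
        self.handles.contains_key(namespace)
    }
}
```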
|
||||||
|
|
||||||
|
## LiveActor
|
||||||
|
|
||||||
|
The `LiveActor` is the central async coordinator:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct LiveActor {
|
||||||
|
inbox: mpsc::Receiver<ToLiveActor>,
|
||||||
|
sync: SyncHandle,
|
||||||
|
endpoint: Endpoint,
|
||||||
|
bao_store: Store,
|
||||||
|
downloader: Downloader,
|
||||||
|
memory_lookup: MemoryLookup,
|
||||||
|
replica_events_tx: async_channel::Sender<Event>,
|
||||||
|
replica_events_rx: async_channel::Receiver<Event>,
|
||||||
|
sync_actor_tx: mpsc::Sender<ToLiveActor>,
|
||||||
|
gossip: GossipState,
|
||||||
|
running_sync_connect: JoinSet<SyncConnectRes>,
|
||||||
|
running_sync_accept: JoinSet<SyncAcceptRes>,
|
||||||
|
download_tasks: JoinSet<DownloadRes>,
|
||||||
|
missing_hashes: HashSet<Hash>,
|
||||||
|
queued_hashes: QueuedHashes,
|
||||||
|
hash_providers: ProviderNodes,
|
||||||
|
subscribers: SubscribersMap,
|
||||||
|
state: NamespaceStates,
|
||||||
|
metrics: Arc<Metrics>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Event Loop
|
||||||
|
|
||||||
|
The `LiveActor::run_inner()` loop uses `tokio::select!` with biased polling:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
tokio::select! {
|
||||||
|
biased;
|
||||||
|
msg = self.inbox.recv() => { /* handle actor messages */ }
|
||||||
|
event = self.replica_events_rx.recv() => { /* handle replica insert events */ }
|
||||||
|
res = self.running_sync_connect.join_next() => { /* sync connect finished */ }
|
||||||
|
res = self.running_sync_accept.join_next() => { /* sync accept finished */ }
|
||||||
|
res = self.download_tasks.join_next() => { /* download completed */ }
|
||||||
|
res = self.gossip.progress() => { /* gossip task progress */ }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### ToLiveActor Messages
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum ToLiveActor {
|
||||||
|
StartSync { namespace, peers, reply },
|
||||||
|
Leave { namespace, kill_subscribers, reply },
|
||||||
|
Shutdown { reply },
|
||||||
|
Subscribe { namespace, sender, reply },
|
||||||
|
HandleConnection { conn },
|
||||||
|
AcceptSyncRequest { namespace, peer, reply },
|
||||||
|
IncomingSyncReport { from, report },
|
||||||
|
NeighborContentReady { namespace, node, hash },
|
||||||
|
NeighborUp { namespace, peer },
|
||||||
|
NeighborDown { namespace, peer },
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Gossip Operations (Op)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Op {
|
||||||
|
Put(SignedEntry), // New entry inserted
|
||||||
|
ContentReady(Hash), // Content blob now available
|
||||||
|
SyncReport(SyncReport), // Heads summary after sync
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Gossip broadcasts `Op` messages to all swarm participants. When a `Put` is received, the entry is inserted into the local replica. When a `ContentReady` is received, peers know they can download the blob. When a `SyncReport` is received, peers check `has_news_for_us()` to decide if they should sync.
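The `SyncReport` check can be pictured as a comparison of per-author head timestamps. The sketch below assumes `AuthorHeads` behaves like a map from author to latest timestamp and that the result counts the authors for which the remote side is ahead; the free function and the explicit `local` parameter are simplifications of the store method.

```rust
use std::collections::BTreeMap;
use std::num::NonZeroU64;

/// Simplified heads: author id -> newest timestamp seen for that author.
type AuthorHeads = BTreeMap<[u8; 32], u64>;

/// Returns how many authors have newer entries on the remote side,
/// or `None` when the remote has nothing we lack.
fn has_news_for_us(local: &AuthorHeads, remote: &AuthorHeads) -> Option<NonZeroU64> {
    let newer = remote
        .iter()
        .filter(|(author, ts)| match local.get(*author) {
            Some(ours) => *ts > ours, // remote head is strictly newer
            None => true,             // author unknown locally
        })
        .count() as u64;
    NonZeroU64::new(newer)
}
```

A `None` result means a full sync round can be skipped for that report.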
|
||||||
|
|
||||||
|
### Content Download Flow
|
||||||
|
|
||||||
|
1. When a `RemoteInsert` event occurs with `should_download: true`, the entry's content hash is queued for download
|
||||||
|
2. The `LiveActor` uses `iroh_blobs::downloader::Downloader` to fetch the blob
|
||||||
|
3. Known providers (peers who had `ContentStatus::Complete`) are used as download sources
|
||||||
|
4. On download completion, a `LiveEvent::ContentReady` event is emitted
|
||||||
|
|
||||||
|
### LiveEvent (Public API)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum LiveEvent {
|
||||||
|
InsertLocal { entry: Entry },
|
||||||
|
InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
|
||||||
|
ContentReady { hash: Hash },
|
||||||
|
PendingContentReady,
|
||||||
|
NeighborUp(PublicKey),
|
||||||
|
NeighborDown(PublicKey),
|
||||||
|
SyncFinished(SyncEvent),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`SyncEvent` wraps `SyncFinished`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct SyncFinished {
|
||||||
|
pub namespace: NamespaceId,
|
||||||
|
pub peer: PublicKey,
|
||||||
|
pub outcome: SyncOutcome,
|
||||||
|
pub timings: Timings,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## NamespaceStates
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct NamespaceStates(BTreeMap<NamespaceId, NamespaceState>);
|
||||||
|
|
||||||
|
struct NamespaceState {
|
||||||
|
nodes: BTreeMap<EndpointId, PeerState>,
|
||||||
|
may_emit_ready: bool,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Each peer has a `PeerState` tracking sync progress:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct PeerState {
|
||||||
|
state: SyncState, // Idle or Running
|
||||||
|
resync_requested: bool, // Whether a resync was requested during active sync
|
||||||
|
last_sync: Option<(Instant, Result<SyncFinished>)>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This state machine prevents concurrent syncs with the same peer for the same namespace and queues resync requests when needed.
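The per-peer state machine can be sketched as two transitions. This is an illustrative reduction that omits `last_sync`; the method names `request_sync` and `finish` are assumptions, not the crate's API.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SyncState {
    Idle,
    Running,
}

struct PeerState {
    state: SyncState,
    resync_requested: bool,
}

impl PeerState {
    fn new() -> Self {
        PeerState { state: SyncState::Idle, resync_requested: false }
    }

    /// Returns `true` if the caller should start a sync now. While a sync is
    /// already running, the request is only recorded for later.
    fn request_sync(&mut self) -> bool {
        match self.state {
            SyncState::Idle => {
                self.state = SyncState::Running;
                true
            }
            SyncState::Running => {
                self.resync_requested = true;
                false
            }
        }
    }

    /// Marks the running sync as finished. Returns `true` if a resync was
    /// queued in the meantime and should be started next.
    fn finish(&mut self) -> bool {
        self.state = SyncState::Idle;
        std::mem::take(&mut self.resync_requested)
    }
}
```

Any number of requests arriving during a running sync collapse into a single follow-up sync.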
|
||||||
|
|
||||||
|
## DefaultAuthor
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct DefaultAuthor {
|
||||||
|
value: RwLock<AuthorId>,
|
||||||
|
storage: DefaultAuthorStorage,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `DefaultAuthorStorage::Mem` — Ephemeral, creates a new author each time
|
||||||
|
- `DefaultAuthorStorage::Persistent(path)` — Stores the author ID as hex in a file, loads it on startup
|
||||||
|
|
||||||
|
The default author provides a convenient "current user" identity for applications.
|
||||||
|
|
||||||
|
## Docs Protocol Handler
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Docs {
|
||||||
|
engine: Arc<Engine>,
|
||||||
|
api: DocsApi,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`Docs` implements `ProtocolHandler` for integration with iroh's `Router`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl ProtocolHandler for Docs {
|
||||||
|
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> { ... }
|
||||||
|
async fn shutdown(&self) { ... }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `Builder` pattern configures storage:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let docs = Docs::memory()
|
||||||
|
.spawn(endpoint, blobs, gossip)
|
||||||
|
.await?;
|
||||||
|
// or
|
||||||
|
let docs = Docs::persistent(path)
|
||||||
|
.protect_handler(handler)
|
||||||
|
.spawn(endpoint, blobs, gossip)
|
||||||
|
.await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
## DocTicket
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct DocTicket {
|
||||||
|
pub capability: Capability,
|
||||||
|
pub nodes: Vec<EndpointAddr>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
A `DocTicket` encapsulates everything needed to join a document:
|
||||||
|
- A `Capability` (Read or Write) — provides the namespace key
|
||||||
|
- A list of `EndpointAddr` — bootstrap peers to connect to
|
||||||
|
|
||||||
|
Tickets are serialized as base32-encoded postcard data with a `"doc"` prefix, using the `iroh_tickets::Ticket` trait.
|
||||||
189
docs/research/references/iroh/iroh-docs/06-network-protocol.md
Normal file
@@ -0,0 +1,189 @@
|
|||||||
|
# iroh-docs: Network Protocol and Wire Format
|
||||||
|
|
||||||
|
## ALPN
|
||||||
|
|
||||||
|
The docs protocol uses ALPN `/iroh-sync/1` for QUIC connection identification.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub const ALPN: &[u8] = b"/iroh-sync/1";
|
||||||
|
```
|
||||||
|
|
||||||
|
## Connection Flow
|
||||||
|
|
||||||
|
### Outgoing Sync (Alice — Initiator)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn connect_and_sync(
|
||||||
|
endpoint: &Endpoint,
|
||||||
|
sync: &SyncHandle,
|
||||||
|
namespace: NamespaceId,
|
||||||
|
peer: EndpointAddr,
|
||||||
|
metrics: Option<&Metrics>,
|
||||||
|
) -> Result<SyncFinished, ConnectError>
|
||||||
|
```
|
||||||
|
|
||||||
|
1. Open a QUIC connection to the peer with ALPN `/iroh-sync/1`
|
||||||
|
2. Open a bidirectional QUIC stream
|
||||||
|
3. Run the Alice (initiator) protocol via `run_alice()`
|
||||||
|
4. Close the stream and return `SyncFinished`
|
||||||
|
|
||||||
|
### Incoming Sync (Bob — Responder)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn handle_connection<F, Fut>(
|
||||||
|
sync: SyncHandle,
|
||||||
|
connection: Connection,
|
||||||
|
accept_cb: F,
|
||||||
|
metrics: Option<&Metrics>,
|
||||||
|
) -> Result<SyncFinished, AcceptError>
|
||||||
|
```
|
||||||
|
|
||||||
|
1. Accept a bidirectional QUIC stream from the connection
|
||||||
|
2. Run the Bob (responder) protocol via `BobState::run()`
|
||||||
|
3. The `accept_cb` determines whether to accept or reject each namespace
|
||||||
|
4. Close the stream and return `SyncFinished`
|
||||||
|
|
||||||
|
## Wire Format
|
||||||
|
|
||||||
|
### Frame Codec
|
||||||
|
|
||||||
|
All messages are length-prefixed:
|
||||||
|
|
||||||
|
```
|
||||||
|
┌──────────────────────┬──────────────────────────────┐
|
||||||
|
│ u32 big-endian len │ postcard-serialized Message │
|
||||||
|
└──────────────────────┴──────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
Maximum message size: 1 GiB.
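The framing described above can be sketched on plain byte slices. This is a std-only illustration rather than the `SyncCodec` implementation: the payload is treated as opaque bytes (no postcard), and `encode_frame`, `decode_frame`, and `FrameError` are hypothetical names.

```rust
/// 1 GiB, matching the limit stated above.
const MAX_MESSAGE_SIZE: usize = 1024 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge(usize),
}

/// Appends a `u32` big-endian length prefix followed by the payload.
fn encode_frame(payload: &[u8], out: &mut Vec<u8>) -> Result<(), FrameError> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(FrameError::TooLarge(payload.len()));
    }
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    Ok(())
}

/// Tries to read one frame from the start of `buf`.
/// Returns `Ok(None)` when more bytes are needed, otherwise the payload and
/// the total number of bytes consumed (prefix included).
fn decode_frame(buf: &[u8]) -> Result<Option<(&[u8], usize)>, FrameError> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    // Reject oversized frames before waiting for (or allocating) the body.
    if len > MAX_MESSAGE_SIZE {
        return Err(FrameError::TooLarge(len));
    }
    if buf.len() < 4 + len {
        return Ok(None);
    }
    Ok(Some((&buf[4..4 + len], 4 + len)))
}
```

Checking the length before reading the body is what makes the size limit effective against a peer that announces a huge frame.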
|
||||||
|
|
||||||
|
### Message Types
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum Message {
|
||||||
|
Init {
|
||||||
|
namespace: NamespaceId, // Which document to sync
|
||||||
|
message: ProtocolMessage, // Initial sync message (ranger::Message<SignedEntry>)
|
||||||
|
},
|
||||||
|
Sync(ProtocolMessage), // Subsequent sync round-trip messages
|
||||||
|
Abort { reason: AbortReason }, // Responder rejects the request
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Serialization
|
||||||
|
|
||||||
|
Messages use `postcard` (a compact `serde` format optimized for embedded/no-std use). The `SyncCodec` implements `tokio_util::codec::Encoder` and `Decoder` for async stream framing.
|
||||||
|
|
||||||
|
## Protocol Sequence
|
||||||
|
|
||||||
|
```
|
||||||
|
Alice (Initiator) Bob (Responder)
|
||||||
|
│ │
|
||||||
|
│──── Init { namespace, initial_msg } ───────▶│
|
||||||
|
│ │
|
||||||
|
│◀─── Sync(reply_msg) ────────────────────── │ (or Abort)
|
||||||
|
│ │
|
||||||
|
│──── Sync(next_msg) ──────────────────────▶│
|
||||||
|
│ │
|
||||||
|
│◀─── Sync(reply_msg) ────────────────────── │
|
||||||
|
│ │
|
||||||
|
│──── Sync(next_msg) ──────────────────────▶│
|
||||||
|
│ │
|
||||||
|
│ ... until convergence ... │
|
||||||
|
│ │
|
||||||
|
│──── (stream closed) ─────────────────────▶│
|
||||||
|
│ │
|
||||||
|
```
|
||||||
|
|
||||||
|
The protocol terminates when one side has no more messages to send (convergence reached). Each `Sync` message carries a `ProtocolMessage` which is a `ranger::Message<SignedEntry>` containing `MessagePart`s (either `RangeFingerprint` or `RangeItem`).
|
||||||
|
|
||||||
|
## SyncFinished Result
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct SyncFinished {
|
||||||
|
pub namespace: NamespaceId,
|
||||||
|
pub peer: PublicKey,
|
||||||
|
pub outcome: SyncOutcome, // heads_received, num_recv, num_sent
|
||||||
|
pub timings: Timings, // connect duration, process duration
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Error Types
|
||||||
|
|
||||||
|
### ConnectError
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum ConnectError {
|
||||||
|
Connect { error: anyhow::Error }, // Connection failed
|
||||||
|
RemoteAbort(AbortReason), // Remote rejected our request
|
||||||
|
Sync { error: anyhow::Error }, // Sync protocol error
|
||||||
|
Close { error: anyhow::Error }, // Stream close error
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### AcceptError
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum AcceptError {
|
||||||
|
Connect { error: anyhow::Error }, // Connection failed
|
||||||
|
Open { peer: PublicKey, error }, // Failed to open replica
|
||||||
|
Abort { peer, namespace, reason }, // We aborted
|
||||||
|
Sync { peer, namespace, error }, // Sync protocol error
|
||||||
|
Close { peer, namespace, error }, // Stream close error
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Gossip Integration
|
||||||
|
|
||||||
|
The `GossipState` manages iroh-gossip subscriptions per namespace:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct GossipState {
|
||||||
|
gossip: Gossip,
|
||||||
|
sync: SyncHandle,
|
||||||
|
to_live_actor: mpsc::Sender<ToLiveActor>,
|
||||||
|
active: HashMap<NamespaceId, ActiveState>,
|
||||||
|
active_tasks: JoinSet<(NamespaceId, Result<()>)>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When a document starts syncing:
|
||||||
|
1. The engine joins a gossip topic for that namespace
|
||||||
|
2. `GossipState::join()` subscribes with bootstrap peers
|
||||||
|
3. A receive loop task is spawned to process incoming gossip messages
|
||||||
|
4. `Op` messages (Put, ContentReady, SyncReport) are deserialized and forwarded to `LiveActor`
|
||||||
|
|
||||||
|
When receiving an `Op::Put`:
|
||||||
|
```rust
|
||||||
|
// In the gossip receive loop:
|
||||||
|
let entry = SignedEntry::from_entry(...); // deserialize
|
||||||
|
sync.insert_remote(namespace, entry, from, content_status).await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
When receiving an `Op::SyncReport`:
|
||||||
|
```rust
|
||||||
|
// Forward to LiveActor which checks has_news_for_us()
|
||||||
|
to_live_actor.send(ToLiveActor::IncomingSyncReport { from, report }).await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
Broadcasting:
|
||||||
|
```rust
|
||||||
|
// When a local insert occurs:
|
||||||
|
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::Put(entry))).await;
|
||||||
|
|
||||||
|
// When content becomes ready:
|
||||||
|
gossip.broadcast(&namespace, postcard::to_stdvec(&Op::ContentReady(hash))).await;
|
||||||
|
```
|
||||||
|
|
||||||
|
## Sync Report Compression
|
||||||
|
|
||||||
|
`SyncReport` encodes `AuthorHeads` with an optional size limit:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct SyncReport {
|
||||||
|
namespace: NamespaceId,
|
||||||
|
heads: Vec<u8>, // postcard-encoded AuthorHeads with size limit
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The size limit keeps gossip messages small: when the encoded heads would exceed it, the entries with the oldest timestamps are dropped first.
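One way to picture the size-limited encoding is to sort heads newest-first and keep as many as fit. The sketch below assumes a fixed 40-byte encoding per head (32-byte author id plus 8-byte timestamp) in place of postcard, so the byte layout, `ENTRY_SIZE`, and `encode_heads` are illustrative only.

```rust
/// Hypothetical fixed-size encoding: 32-byte author id + 8-byte timestamp.
const ENTRY_SIZE: usize = 32 + 8;

/// Encodes heads newest-first and truncates to `size_limit` bytes, so the
/// entries with the oldest timestamps are the ones that get dropped.
fn encode_heads(heads: &[([u8; 32], u64)], size_limit: Option<usize>) -> Vec<u8> {
    let mut sorted: Vec<([u8; 32], u64)> = heads.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1)); // descending by timestamp
    let max_entries = match size_limit {
        Some(limit) => limit / ENTRY_SIZE,
        None => sorted.len(),
    };
    let mut out = Vec::new();
    for (author, timestamp) in sorted.into_iter().take(max_entries) {
        out.extend_from_slice(&author);
        out.extend_from_slice(&timestamp.to_be_bytes());
    }
    out
}
```

Dropping old heads is safe for the purpose of a report: a receiver may miss a hint that it is behind for a rarely active author, but it never receives incorrect data.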
|
||||||
188
docs/research/references/iroh/iroh-docs/07-api-and-data-flow.md
Normal file
@@ -0,0 +1,188 @@
|
|||||||
|
# iroh-docs: API and RPC
|
||||||
|
|
||||||
|
## DocsApi
|
||||||
|
|
||||||
|
The `DocsApi` provides an RPC-based interface to the docs engine, implemented via `irpc`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct DocsApi {
|
||||||
|
inner: Client<DocsProtocol>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Methods (via irpc)
|
||||||
|
|
||||||
|
The API exposes document operations through an RPC protocol defined in `api/protocol.rs`:
|
||||||
|
|
||||||
|
| Method | Request | Response | Description |
|
||||||
|
|--------|---------|----------|-------------|
|
||||||
|
| `Open` | `OpenRequest { doc_id }` | `OpenResponse` | Open a document for operations |
|
||||||
|
| `Close` | `CloseRequest { doc_id }` | `CloseResponse` | Close a document |
|
||||||
|
| `Status` | `StatusRequest { doc_id }` | `StatusResponse { status: OpenState }` | Get document open state |
|
||||||
|
| `List` | `ListRequest` | Stream of `ListResponse { id, capability }` | List all documents |
|
||||||
|
| `Create` | `CreateRequest` | `CreateResponse { id }` | Create a new document |
|
||||||
|
| `Drop` | `DropRequest { doc_id }` | `DropResponse` | Remove a document |
|
||||||
|
| `Import` | `ImportRequest { capability }` | `ImportResponse { doc_id }` | Import a document by capability |
|
||||||
|
| `Set` | `SetRequest { doc_id, author_id, key, value }` | `SetResponse { entry }` | Set a key-value pair |
|
||||||
|
| `SetHash` | `SetHashRequest { doc_id, author_id, key, hash, size }` | `SetHashResponse` | Set a key with pre-hashed content |
|
||||||
|
| `GetMany` | `GetManyRequest { doc_id, query }` | Stream of entries | Query entries |
|
||||||
|
| `GetExact` | `GetExactRequest { doc_id, key, author, include_empty }` | `GetExactResponse { entry }` | Get single entry |
|
||||||
|
| `Del` | `DelRequest { doc_id, author_id, key }` | `DelResponse { removed }` | Delete by key prefix |
|
||||||
|
| `Subscribe` | `SubscribeRequest { doc_id }` | Stream of `LiveEvent` | Subscribe to document events |
|
||||||
|
| `Share` | `ShareRequest { doc_id, mode, peers }` | `ShareResponse { ticket }` | Create a sharing ticket |
|
||||||
|
| `StartSync` | `StartSyncRequest { doc_id, peers }` | `StartSyncResponse` | Start live sync |
|
||||||
|
| `Leave` | `LeaveRequest { doc_id }` | `LeaveResponse` | Leave gossip swarm |
|
||||||
|
| `ImportFile` | `ImportFileRequest { ... }` | Stream of `ImportProgress` | Import file content and set key |
|
||||||
|
| `ExportFile` | `ExportFileRequest { ... }` | Stream of `ExportProgress` | Export content to file |
|
||||||
|
| `AuthorList` | `AuthorListRequest` | Stream of `AuthorListResponse` | List authors |
|
||||||
|
| `AuthorCreate` | `AuthorCreateRequest` | `AuthorCreateResponse { author_id }` | Create new author |
|
||||||
|
| `AuthorImport` | `AuthorImportRequest { author }` | `AuthorImportResponse { author_id }` | Import author key |
|
||||||
|
| `AuthorExport` | `AuthorExportRequest { author_id }` | `AuthorExportResponse { author }` | Export author key |
|
||||||
|
| `AuthorDelete` | `AuthorDeleteRequest { author_id }` | `AuthorDeleteResponse` | Delete author |
|
||||||
|
| `AuthorGetDefault` | `AuthorGetDefaultRequest` | `AuthorGetDefaultResponse { author_id }` | Get default author |
|
||||||
|
| `AuthorSetDefault` | `AuthorSetDefaultRequest { author_id }` | `AuthorSetDefaultResponse` | Set default author |
|
||||||
|
| `SetDownloadPolicy` | `SetDownloadPolicyRequest { doc_id, policy }` | `SetDownloadPolicyResponse` | Set download policy |
|
||||||
|
| `GetDownloadPolicy` | `GetDownloadPolicyRequest { doc_id }` | `GetDownloadPolicyResponse { policy }` | Get download policy |
|
||||||
|
| `GetSyncPeers` | `GetSyncPeersRequest { doc_id }` | `GetSyncPeersResponse { peers }` | Get known sync peers |
|
||||||
|
|
||||||
|
## RPC Implementation
|
||||||
|
|
||||||
|
The RPC is implemented via `irpc` (for local/remote procedure calls) and `noq` (for remote network access):
|
||||||
|
|
||||||
|
### Local API
|
||||||
|
|
||||||
|
`DocsApi::spawn(engine)` creates an `RpcActor` that processes requests against the engine directly:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl DocsApi {
|
||||||
|
pub fn spawn(engine: Arc<Engine>) -> Self {
|
||||||
|
RpcActor::spawn(engine)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Remote API
|
||||||
|
|
||||||
|
When the `rpc` feature is enabled, `DocsApi::connect(endpoint, addr)` creates a remote client that sends requests over the network via `noq`.
|
||||||
|
|
||||||
|
### Protocol Dispatch
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Pseudocode: irpc::rpc::Handler<DocsProtocol> forwards each variant to the local actor:
|
||||||
|
DocsProtocol::Open(msg) => local.send((msg, tx)).await
|
||||||
|
DocsProtocol::Set(msg) => local.send((msg, tx)).await
|
||||||
|
// ... etc
|
||||||
|
```
|
||||||
|
|
||||||
|
## RpcActor
|
||||||
|
|
||||||
|
The `RpcActor` (in `api/actor.rs`) bridges the RPC protocol to the `Engine`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct RpcActor {
|
||||||
|
engine: Arc<Engine>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
It handles each request type by calling the corresponding `Engine`/`SyncHandle` method and returning the result through the RPC channel.
|
||||||
|
|
||||||
|
For streaming responses (like `GetMany`, `Subscribe`, `AuthorList`), the actor sends results through an `mpsc` channel that the RPC framework streams back to the client.
|
||||||
|
|
||||||
|
## Share Mode and Tickets
|
||||||
|
|
||||||
|
When sharing a document:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum ShareMode {
|
||||||
|
Read, // Share with read-only capability
|
||||||
|
Write, // Share with full write capability
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `Share` RPC method:
|
||||||
|
1. Gets or creates the namespace capability
|
||||||
|
2. Creates a `DocTicket` with the capability and provided peer addresses
|
||||||
|
3. Starts sync with the provided peers
|
||||||
|
4. Returns the ticket for distribution

## Example: Basic Setup

```rust
use iroh::{endpoint::presets, protocol::Router, Endpoint};
use iroh_blobs::{BlobsProtocol, store::mem::MemStore, ALPN as BLOBS_ALPN};
use iroh_docs::{protocol::Docs, ALPN as DOCS_ALPN};
use iroh_gossip::{net::Gossip, ALPN as GOSSIP_ALPN};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Bind the iroh endpoint that all three protocols share.
    let endpoint = Endpoint::bind(presets::N0).await?;
    // In-memory blob store for entry content.
    let blobs = MemStore::default();
    // Gossip carries live update notifications between peers.
    let gossip = Gossip::builder().spawn(endpoint.clone());
    // In-memory docs engine wired to the blob store and gossip.
    let docs = Docs::memory()
        .spawn(endpoint.clone(), (*blobs).clone(), gossip.clone())
        .await?;

    // Register one handler per ALPN. Keep `router` alive for as long as
    // the node should accept incoming connections.
    let router = Router::builder(endpoint.clone())
        .accept(BLOBS_ALPN, BlobsProtocol::new(&blobs, None))
        .accept(GOSSIP_ALPN, gossip)
        .accept(DOCS_ALPN, docs)
        .spawn();

    Ok(())
}
```

## Data Flow Summary

```
┌─────────────────────────────────────────────────────────────────┐
│  Application / RPC                                              │
│  DocsApi ──irpc──▶ RpcActor ──▶ Engine / SyncHandle             │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  Live Sync (per document)                                       │
│                                                                 │
│  LiveActor event loop:                                          │
│  ┌────────────────┐ ┌────────────────┐ ┌─────────────────┐      │
│  │ Actor Messages │ │ Replica Events │ │ Gossip Events   │      │
│  │ (StartSync,    │ │ (LocalInsert,  │ │ (Put,           │      │
│  │  Subscribe,    │ │  RemoteInsert) │ │  ContentReady,  │      │
│  │  Leave, ...)   │ │                │ │  SyncReport)    │      │
│  └──────┬─────────┘ └───────┬────────┘ └──────┬──────────┘      │
│         │                   │                 │                 │
│         ▼                   ▼                 ▼                 │
│  ┌───────────────────────────────────────────────────────┐      │
│  │ LiveActor::run_inner()                                │      │
│  │ tokio::select! { ... }                                │      │
│  │                                                       │      │
│  │ - Start/stop gossip subscriptions                     │      │
│  │ - Initiate outgoing syncs (connect_and_sync)          │      │
│  │ - Accept incoming syncs (handle_connection)           │      │
│  │ - Queue content downloads                             │      │
│  │ - Broadcast local inserts via gossip                  │      │
│  │ - Emit LiveEvent to subscribers                       │      │
│  └───────────────────────────────────────────────────────┘      │
│                                                                 │
│  Running Tasks:                                                 │
│  ┌─────────────────────┐ ┌─────────────────────┐                │
│  │ sync_connect tasks  │ │ sync_accept tasks   │                │
│  └─────────────────────┘ └─────────────────────┘                │
│  ┌─────────────────────┐ ┌─────────────────────┐                │
│  │ download tasks      │ │ gossip receive loop │                │
│  └─────────────────────┘ └─────────────────────┘                │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  Sync Actor (dedicated thread)                                  │
│                                                                 │
│  ┌────────────┐   ┌─────────────────────────────────────────┐   │
│  │ Action     │   │ Replica Operations:                     │   │
│  │ Channel    │──▶│ Insert, Delete, Get, Query,             │   │
│  │ (bounded)  │   │ SyncInit, SyncProcess, Open, Close, ... │   │
│  └────────────┘   └─────────────────────────────────────────┘   │
│                                                                 │
│  Store (redb) ──▶ All reads/writes on this thread               │
└─────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,318 @@
# iroh-docs: Key Types Reference

## Cryptographic Keys

### NamespaceSecret

```rust
pub struct NamespaceSecret {
    signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
}
```

- The write capability for a document
- Can sign entries (namespace signature)
- Derives `NamespacePublicKey` and `NamespaceId`
- Serialized as 32 bytes

### NamespacePublicKey

```rust
pub struct NamespacePublicKey(VerifyingKey); // ed25519_dalek::VerifyingKey
```

- The verifying key corresponding to `NamespaceSecret`
- Can verify namespace signatures on entries
- Serialized as 32 bytes

### NamespaceId

```rust
pub struct NamespaceId([u8; 32]);
```

- The byte representation of `NamespacePublicKey`
- Serves as the unique identifier for a document
- Can be converted back to `NamespacePublicKey` via `PublicKeyStore` (handles invalid curve points)

### Author

```rust
pub struct Author {
    signing_key: SigningKey, // ed25519_dalek::SigningKey (32 bytes)
}
```

- A writer identity within a document
- Can sign entries (author signature)
- Derives `AuthorPublicKey` and `AuthorId`
- Created randomly with `Author::new(&mut rng)`
- Stored persistently in the redb authors table

### AuthorPublicKey

```rust
pub struct AuthorPublicKey(VerifyingKey);
```

- The verifying key corresponding to an `Author`
- Can verify author signatures on entries
- Serialized as 32 bytes

### AuthorId

```rust
pub struct AuthorId([u8; 32]);
```

- Byte representation of `AuthorPublicKey`
- Used as a component of `RecordIdentifier`
- Has `fmt_short()` for human-readable display (first 10 hex chars)

## Entry Types

### RecordIdentifier

```rust
pub struct RecordIdentifier(Bytes);
// Layout: [NamespaceId(32) | AuthorId(32) | Key(variable)]
```

- The composite key for an entry
- Byte layout: 32 bytes namespace + 32 bytes author + variable-length key
- Ordering: namespace → author → key (lexicographic)
- This ordering is critical for the range-based sync algorithm
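
The byte layout above can be illustrated with plain byte vectors. This is a sketch under stated assumptions: `record_id_bytes` and `split_record_id` are hypothetical helpers, and `Vec<u8>` stands in for the `Bytes` wrapper. Because the namespace and author prefixes have fixed width, ordinary lexicographic comparison of the concatenated bytes sorts by namespace, then author, then key.

```rust
/// Builds the composite key: 32-byte namespace, 32-byte author, then the key.
pub fn record_id_bytes(namespace: &[u8; 32], author: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(64 + key.len());
    out.extend_from_slice(namespace);
    out.extend_from_slice(author);
    out.extend_from_slice(key);
    out
}

/// Splits a composite key back into (namespace, author, key),
/// or returns `None` if it is shorter than the two fixed-width parts.
pub fn split_record_id(id: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    if id.len() < 64 {
        return None;
    }
    let (namespace, rest) = id.split_at(32);
    let (author, key) = rest.split_at(32);
    Some((namespace, author, key))
}
```

Comparing two such vectors with `<` then gives the namespace → author → key order that a range-based sync can partition on.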

### Record

```rust
pub struct Record {
    len: u64, // Byte length of content
    hash: Hash, // BLAKE3 hash of content (32 bytes)
    timestamp: u64, // Microseconds since Unix epoch
}
```

- The value portion of an entry
- Ordering: timestamp first, then hash (Last-Writer-Wins)
- `Record::empty(timestamp)` creates a tombstone (hash=EMPTY, len=0)
- `Record::new_current(hash, len)` uses current system time
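
The last-writer-wins ordering can be sketched as an `Ord` implementation. `RecordSketch` and `pick_winner` are hypothetical names, and a raw `[u8; 32]` stands in for `Hash`; the final comparison on `len` is only there so that `Ord` agrees with the derived `Eq` in this sketch and is not taken from the source.

```rust
use std::cmp::Ordering;

/// Simplified stand-in for `Record`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RecordSketch {
    pub len: u64,
    pub hash: [u8; 32],
    pub timestamp: u64,
}

impl Ord for RecordSketch {
    fn cmp(&self, other: &Self) -> Ordering {
        // Timestamp decides first; the hash breaks ties deterministically.
        self.timestamp
            .cmp(&other.timestamp)
            .then_with(|| self.hash.cmp(&other.hash))
            // Sketch-only tiebreaker to keep `Ord` consistent with `Eq`.
            .then_with(|| self.len.cmp(&other.len))
    }
}

impl PartialOrd for RecordSketch {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Returns the record that wins a conflict on the same identifier.
pub fn pick_winner(a: RecordSketch, b: RecordSketch) -> RecordSketch {
    if b > a { b } else { a }
}
```

Since every peer applies the same total order, two replicas that see the same pair of conflicting records settle on the same winner regardless of arrival order.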

### Entry

```rust
pub struct Entry {
    id: RecordIdentifier,
    record: Record,
}
```

- Combines key and value
- `Entry::new(id, record)` constructor
- `Entry::new_empty(id)` creates a tombstone with current timestamp
- `entry.sign(namespace, author)` produces a `SignedEntry`

### SignedEntry

```rust
pub struct SignedEntry {
    signature: EntrySignature, // Dual Ed25519 signatures
    entry: Entry,
}
```

- An entry with cryptographic proof of authorization and authorship
- `SignedEntry::from_entry(entry, namespace, author)` — create from entry
- `signed_entry.verify(store)` — verify both signatures using a `PublicKeyStore`
- Implements `RangeEntry` for the sync algorithm

### EntrySignature

```rust
pub struct EntrySignature {
    author_signature: Signature, // 64-byte Ed25519 signature
    namespace_signature: Signature, // 64-byte Ed25519 signature
}
```

- Created by signing the canonical byte encoding of the `Entry`
- Both signatures cover the same message bytes
- Verification requires both `NamespacePublicKey` and `AuthorPublicKey`

## Sync Types

### SyncOutcome

```rust
pub struct SyncOutcome {
    pub heads_received: AuthorHeads,
    pub num_recv: usize,
    pub num_sent: usize,
}
```

- Tracks the result of a sync session
- `heads_received` accumulates the latest timestamp seen from each author on the remote side

### ProtocolMessage

```rust
pub type ProtocolMessage = ranger::Message<SignedEntry>;
```

- The wire type for sync protocol messages
- Contains `Vec<MessagePart<SignedEntry>>`

### ContentStatus

```rust
pub enum ContentStatus {
    Complete, // Content blob fully available
    Incomplete, // Partially available
    Missing, // Not available
}
```

- Communicated alongside entries during sync
- Helps peers decide whether to download content

### InsertOrigin

```rust
pub enum InsertOrigin {
    Local,
    Sync {
        from: PeerIdBytes, // [u8; 32] — the remote peer
        remote_content_status: ContentStatus,
    },
}
```

## Event Types

### Event (Internal)

```rust
pub enum Event {
    LocalInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
    },
    RemoteInsert {
        namespace: NamespaceId,
        entry: SignedEntry,
        from: PeerIdBytes,
        should_download: bool,
        remote_content_status: ContentStatus,
    },
}
```

- Emitted by `Replica` via `ReplicaInfo` subscribers
- `should_download` is determined by the `DownloadPolicy`

### LiveEvent (Public)

```rust
pub enum LiveEvent {
    InsertLocal { entry: Entry },
    InsertRemote { from: PublicKey, entry: Entry, content_status: ContentStatus },
    ContentReady { hash: Hash },
    PendingContentReady,
    NeighborUp(PublicKey),
    NeighborDown(PublicKey),
    SyncFinished(SyncEvent),
}
```

- Emitted by the `Engine` through `subscribe()`
- `InsertLocal` / `InsertRemote` are derived from `Event` by stripping `SignedEntry` → `Entry`
- `ContentReady` is emitted when a blob download completes
- `SyncFinished` wraps the `SyncEvent` reported by the network layer when a sync session ends

## Store Types

### Store (store::fs::Store)

```rust
pub struct Store {
    db: Database, // redb database
    transaction: CurrentTransaction, // Current read/write transaction
    open_replicas: HashSet<NamespaceId>, // Track which replicas are open
    pubkeys: MemPublicKeyStore, // Cache for expanded public keys
}
```

### Query

```rust
pub struct Query {
    kind: QueryKind, // Flat or SingleLatestPerKey
    filter_author: AuthorFilter, // Any or Exact
    filter_key: KeyFilter, // Any, Exact, or Prefix
    limit: Option<u64>,
    offset: u64,
    include_empty: bool,
    sort_direction: SortDirection,
}
```

### Capability

```rust
pub enum Capability {
    Write(NamespaceSecret),
    Read(NamespaceId),
}
```

- `Write` allows inserting entries and signing them
- `Read` allows syncing and reading but not inserting
- Can be serialized as `(u8, [u8; 32])` — kind byte + key bytes
- `merge()` can upgrade `Read` to `Write`
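
The upgrade behavior of `merge()` can be sketched as follows. This is an illustration with hypothetical types, not the crate's implementation: `CapabilitySketch` carries the namespace id explicitly instead of deriving it from a secret key, and the `String` error and the namespace-mismatch check are assumptions of the sketch.

```rust
/// Simplified stand-in for `Capability`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CapabilitySketch {
    Write { id: [u8; 32], secret: [u8; 32] },
    Read { id: [u8; 32] },
}

impl CapabilitySketch {
    /// The namespace this capability refers to.
    pub fn id(&self) -> [u8; 32] {
        match self {
            CapabilitySketch::Write { id, .. } | CapabilitySketch::Read { id } => *id,
        }
    }

    /// Merges `other` into `self`; returns `Ok(true)` if `self` was upgraded.
    pub fn merge(&mut self, other: CapabilitySketch) -> Result<bool, String> {
        if self.id() != other.id() {
            return Err("namespace mismatch".to_string());
        }
        // Only Read -> Write is an upgrade; everything else is a no-op.
        let upgrade = matches!(self, CapabilitySketch::Read { .. })
            && matches!(other, CapabilitySketch::Write { .. });
        if upgrade {
            *self = other;
        }
        Ok(upgrade)
    }
}
```

A consequence of this shape is that importing a read ticket for a document you can already write never downgrades the stored capability.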

### DownloadPolicy

```rust
pub enum DownloadPolicy {
    NothingExcept(Vec<FilterKind>), // Whitelist mode
    EverythingExcept(Vec<FilterKind>), // Blacklist mode (default)
}
```
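
How the two policy modes decide on a key can be sketched like this. The names `FilterSketch`, `PolicySketch`, and `should_download` are hypothetical, and the sketch assumes filters match either a key prefix or an exact key; check the crate's `FilterKind` for the real variants.

```rust
/// Simplified stand-in for a key filter (assumed: prefix or exact match).
#[derive(Debug, Clone)]
pub enum FilterSketch {
    Prefix(Vec<u8>),
    Exact(Vec<u8>),
}

impl FilterSketch {
    pub fn matches(&self, key: &[u8]) -> bool {
        match self {
            FilterSketch::Prefix(prefix) => key.starts_with(prefix),
            FilterSketch::Exact(exact) => key == exact.as_slice(),
        }
    }
}

/// Simplified stand-in for `DownloadPolicy`.
#[derive(Debug, Clone)]
pub enum PolicySketch {
    NothingExcept(Vec<FilterSketch>),
    EverythingExcept(Vec<FilterSketch>),
}

impl PolicySketch {
    /// Whitelist mode downloads only matching keys;
    /// blacklist mode downloads everything that does not match.
    pub fn should_download(&self, key: &[u8]) -> bool {
        match self {
            PolicySketch::NothingExcept(filters) => filters.iter().any(|f| f.matches(key)),
            PolicySketch::EverythingExcept(filters) => !filters.iter().any(|f| f.matches(key)),
        }
    }
}
```

With an empty filter list, `EverythingExcept` downloads every entry, which is consistent with it being the default mode.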

### DocTicket

```rust
pub struct DocTicket {
    pub capability: Capability,
    pub nodes: Vec<EndpointAddr>,
}
```

- Serializable as a base32 string with "doc" prefix
- Contains everything needed to join a document
- The wire format uses a versioned enum: `TicketWireFormat::Variant0(DocTicket)`

## OpenState

```rust
pub struct OpenState {
    pub sync: bool, // Whether sync is enabled
    pub subscribers: usize, // Number of event subscribers
    pub handles: usize, // Number of open handles
}
```

Returned by the `Status` RPC method to report the state of an open document.

## Utility Constants

| Constant | Value | Purpose |
|----------|-------|---------|
| `MAX_TIMESTAMP_FUTURE_SHIFT` | 10 min in μs | Max future drift for entry timestamps |
| `MAX_COMMIT_DELAY` | 500ms | Auto-commit interval for store transactions |
| `ACTION_CAP` | 1024 | Bounded channel capacity for SyncHandle actions |
| `ACTOR_CHANNEL_CAP` | 64 | Channel capacity for LiveActor messages |
| `SUBSCRIBE_CHANNEL_CAP` | 256 | Channel capacity for event subscriptions |
| `PEERS_PER_DOC_CACHE_SIZE` | 5 | LRU cache size for sync peers per document |
| `MAX_MESSAGE_SIZE` | 1 GiB | Max wire message size |
59
docs/research/references/iroh/iroh-docs/README.md
Normal file
@@ -0,0 +1,59 @@
# iroh-docs Reference Documentation

> Version: 0.98.0
> Repository: https://github.com/n0-computer/iroh-docs
> License: MIT/Apache-2.0
> Based on: [Range-Based Set Reconciliation (Meyer, 2022)](https://arxiv.org/abs/2212.13567)

## Document Index

| # | File | Topic |
|---|------|-------|
| 01 | [Overview and Architecture](01-overview-and-architecture.md) | High-level architecture, module layout, dependencies, feature flags |
| 02 | [Document Model](02-document-model.md) | CRDT data model: namespaces, authors, entries, signatures, prefix deletion, timestamps |
| 03 | [Sync Protocol](03-sync-protocol.md) | Range-based set reconciliation algorithm, fingerprints, message format, Store trait |
| 04 | [Store and Persistence](04-store-and-persistence.md) | redb table schema, transaction model, queries, download policies, PublicKeyStore |
| 05 | [Engine and Live Sync](05-engine-and-live-sync.md) | Engine, LiveActor, GossipState, content download, event system, DefaultAuthor |
| 06 | [Network Protocol](06-network-protocol.md) | ALPN, wire format, Alice/Bob protocol flow, error types, gossip integration |
| 07 | [API and Data Flow](07-api-and-data-flow.md) | RPC API, DocsApi, protocol messages, data flow diagrams |
| 08 | [Key Types Reference](08-key-types-reference.md) | All public types, constants, and their relationships |

## Quick Reference

### Core Concepts

- **Namespace**: A document identity. Identified by `NamespaceId` (32 bytes), backed by an Ed25519 keypair (`NamespaceSecret`).
- **Author**: A writer identity. Identified by `AuthorId` (32 bytes), backed by an Ed25519 keypair (`Author`).
- **Entry**: A record identified by (namespace, author, key) with a value of (hash, len, timestamp).
- **SignedEntry**: An entry with dual Ed25519 signatures (namespace + author) proving authorization and authorship.
- **Replica**: A local instance of a document, holding entries in a store.
- **Capability**: Either `Write(NamespaceSecret)` or `Read(NamespaceId)` — controls whether entries can be inserted.
- **Store**: A `redb`-backed persistent store managing authors, namespaces, entries, and peer caches.
- **Engine**: Coordinates sync actors, gossip, and content downloads for live synchronization.

### Key Algorithms

1. **Range-based set reconciliation**: Efficiently compute the union of two entry sets over a network by comparing fingerprints of partitions, subdividing when fingerprints differ.
2. **Prefix deletion**: An entry at key "foo" acts as a tombstone for all entries whose key starts with "foo/".
3. **Last-writer-wins**: When entries conflict on the same (namespace, author, key), the one with the higher (timestamp, hash) wins.
4. **XOR fingerprints**: Fingerprint of a set is the XOR of individual entry fingerprints (BLAKE3 hashes of key data).
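
The XOR fingerprint idea can be sketched with the standard library. Two loud assumptions: `DefaultHasher` (64-bit) stands in for BLAKE3 (32 bytes) purely so the example has no dependencies, and the function names are hypothetical. What carries over is the algebra: XOR is commutative and self-inverse, so a set fingerprint does not depend on insertion order and can be updated incrementally.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fingerprint of one entry key (stand-in hash, not BLAKE3).
pub fn entry_fingerprint(key: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Fingerprint of a set: XOR of the member fingerprints; the empty set is 0.
pub fn set_fingerprint(keys: &[&str]) -> u64 {
    keys.iter().fold(0, |acc, key| acc ^ entry_fingerprint(key))
}

/// One reconciliation check: differing fingerprints mean the range
/// must be subdivided or its entries exchanged.
pub fn ranges_differ(ours: &[&str], theirs: &[&str]) -> bool {
    set_fingerprint(ours) != set_fingerprint(theirs)
}
```

During reconciliation, two peers would compare `set_fingerprint` over the same key range and only recurse into sub-ranges when `ranges_differ` reports a mismatch.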

### Data Flow

```
Application → DocsApi → Engine → LiveActor → GossipState → iroh-gossip
↓ ↓
SyncHandle → Actor → Store (redb) ← QUIC streams (iroh)
↓
iroh-blobs (content transfer)
```

### Dependencies

- `iroh` — QUIC networking
- `iroh-blobs` — Content-addressed blob storage and transfer
- `iroh-gossip` — Gossip protocol for live updates
- `redb` — Embedded key-value store
- `ed25519-dalek` — Ed25519 signatures
- `blake3` — Hashing
- `postcard` — Serialization
@@ -0,0 +1,79 @@
# iroh-gossip: Overview & Architecture

## What Is iroh-gossip?

`iroh-gossip` is a Rust crate that implements an **epidemic broadcast tree** protocol for disseminating messages among a swarm of peers interested in a common **topic**. It is based on two academic papers:

- **HyParView** — A hybrid partial view membership protocol for reliable swarm management ([paper](https://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf))
- **PlumTree** — An epidemic broadcast tree protocol for efficient message dissemination ([paper](https://asc.di.fct.unl.pt/~jleitao/pdf/srds07-leitao.pdf))

The crate is designed as a protocol layer for the [iroh](https://docs.rs/iroh) networking library, but the core protocol logic is **IO-free** and can be used independently.

## High-Level Architecture

The crate is organized into two primary modules:

| Module | Purpose | IO-aware? |
|--------|---------|-----------|
| `proto` | Pure state-machine implementation of the gossip protocol | No — completely IO-free |
| `net` | Networking layer that runs the protocol over iroh connections | Yes — depends on `iroh` and tokio |

The `net` module is behind the `net` feature flag (enabled by default). An optional `rpc` feature adds remote procedure call support via the `irpc`/`noq` crates.

### Module Dependency Graph

```
┌──────────────┐
│     api      │  ← Public API (Gossip, GossipTopic, GossipSender, GossipReceiver)
└──────┬───────┘
       │
┌──────▼───────┐
│     net      │  ← Networking actor, connection loops, dialer
└──────┬───────┘
       │
┌──────▼───────┐
│    proto     │  ← Pure protocol state machines
│ ┌─────────┐  │
│ │hyparview│  │  ← Membership layer
│ ├─────────┤  │
│ │ plumtree│  │  ← Broadcast layer
│ ├─────────┤  │
│ │  topic  │  │  ← Per-topic coordinator
│ ├─────────┤  │
│ │  state  │  │  ← Multi-topic state manager
│ ├─────────┤  │
│ │  util   │  │  ← Shared data structures (IndexSet, TimeBoundCache, TimerMap)
│ └─────────┘  │
└──────────────┘
```

### Key Design Principles

1. **IO-free protocol core**: The `proto` module is a pure state machine. It takes `InEvent`s, produces `OutEvent`s, and has no knowledge of sockets, async runtimes, or network IO.

2. **Topic-based isolation**: Each topic (`TopicId` = 32-byte identifier) has completely independent state. Topics are separate swarms and broadcast scopes. Joining multiple topics increases connections and routing table size proportionally.

3. **Actor model for networking**: The `net` module runs a single async `Actor` that manages all topics, connections, and timers. It bridges between the protocol state machine and real network IO.

4. **Wire protocol**: Messages are serialized with `postcard` (a `no_std`-friendly serde format) and sent over QUIC streams via iroh connections. Each stream is prefixed with a `StreamHeader` containing the topic ID.
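
The IO-free principle can be illustrated with a toy state machine. Everything here is hypothetical (`InEventSketch`, `OutEventSketch`, `ProtoSketch`, and their variants are invented for the example and are far simpler than the crate's real event types); the point is only the shape: events go in, a list of events comes out, and nothing touches a socket or a runtime.

```rust
use std::collections::BTreeSet;

/// Hypothetical input events fed in by a networking layer.
#[derive(Debug, PartialEq)]
pub enum InEventSketch {
    PeerConnected(u32),
    PeerDisconnected(u32),
    Broadcast(Vec<u8>),
}

/// Hypothetical output events for the networking layer to act on.
#[derive(Debug, PartialEq)]
pub enum OutEventSketch {
    NeighborUp(u32),
    NeighborDown(u32),
    SendMessage { to: u32, payload: Vec<u8> },
}

/// A pure state machine: no sockets, timers, or async runtime inside.
#[derive(Default)]
pub struct ProtoSketch {
    neighbors: BTreeSet<u32>,
}

impl ProtoSketch {
    /// Consumes one input event and returns the resulting output events.
    pub fn handle(&mut self, event: InEventSketch) -> Vec<OutEventSketch> {
        match event {
            InEventSketch::PeerConnected(peer) => {
                if self.neighbors.insert(peer) {
                    vec![OutEventSketch::NeighborUp(peer)]
                } else {
                    Vec::new()
                }
            }
            InEventSketch::PeerDisconnected(peer) => {
                if self.neighbors.remove(&peer) {
                    vec![OutEventSketch::NeighborDown(peer)]
                } else {
                    Vec::new()
                }
            }
            InEventSketch::Broadcast(payload) => self
                .neighbors
                .iter()
                .map(|&to| OutEventSketch::SendMessage { to, payload: payload.clone() })
                .collect(),
        }
    }
}
```

Because `handle` is deterministic, a state machine of this shape can be unit-tested or simulated without any network, which is the property the design principle is after.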

## Crate Features

| Feature | Default? | Description |
|---------|----------|-------------|
| `net` | Yes | Networking layer (requires `iroh`, `tokio`, etc.) |
| `rpc` | No | RPC support via `irpc`/`noq` for remote control |
| `metrics` | Yes | Prometheus-style metrics via `iroh-metrics` |
| `test-utils` | No | Test utilities (seeded RNG, etc.) |
| `simulator` | No | CLI simulator for testing |
| `examples` | No | Example binaries (chat, setup) |

## Cargo Dependencies (Key Ones)

- `iroh` / `iroh-base` — Networking primitives (Endpoint, EndpointId, PublicKey, etc.)
- `postcard` — Wire serialization (serde-based, `no_std` compatible)
- `blake3` — Message ID hashing
- `ed25519-dalek` — Cryptographic signatures
- `n0-future` / `n0-error` — Async utilities and error handling
- `irpc` / `noq` — RPC infrastructure (optional)
- `indexmap` — Order-preserving hash collections used in `IndexSet`
@@ -0,0 +1,169 @@
# iroh-gossip: HyParView Membership Protocol

## Overview

The HyParView protocol provides **swarm membership management** — it maintains which peers are currently part of the swarm for a given topic and ensures the overlay network remains connected even as nodes join, leave, or fail.

It is implemented in `src/proto/hyparview.rs`.

## Core Concept: Two Views

Each peer maintains two sets of peers:

| View | Description | Default Size | Connection? |
|------|-------------|--------------|-------------|
| **Active View** | Peers we maintain active bidirectional connections to | 5 | Yes — TCP/QUIC connection is kept open |
| **Passive View** | An address book of peers we know about but are not connected to | 30 | No — just contact information |

Key invariants:
- **Active connections are always bidirectional**: If peer A has peer B in its active view, peer B also has peer A in its active view.
- The passive view serves as a **failover pool**: When an active peer disconnects, a random peer from the passive view is promoted to fill the slot.

## Configuration (`hyparview::Config`)

```rust
pub struct Config {
    pub active_view_capacity: usize, // Default: 5
    pub passive_view_capacity: usize, // Default: 30
    pub active_random_walk_length: Ttl, // Default: Ttl(6)
    pub passive_random_walk_length: Ttl, // Default: Ttl(3)
    pub shuffle_random_walk_length: Ttl, // Default: Ttl(6)
    pub shuffle_active_view_count: usize, // Default: 3
    pub shuffle_passive_view_count: usize, // Default: 4
    pub shuffle_interval: Duration, // Default: 60s
    pub neighbor_request_timeout: Duration, // Default: 500ms
}
```

These defaults come directly from the HyParView paper (p9), except for `shuffle_interval` and `neighbor_request_timeout` which are "wild guesses" in the code.

## State Structure

```rust
pub struct State<PI, RG = ThreadRng> {
    me: PI, // Our peer identity
    me_data: Option<PeerData>, // Opaque data we share with peers
    pub active_view: IndexSet<PI>, // Connected peers
    pub passive_view: IndexSet<PI>, // Known but disconnected peers
    config: Config,
    shuffle_scheduled: bool, // Whether shuffle timer is active
    rng: RG, // Random number generator
    stats: Stats,
    pending_neighbor_requests: HashSet<PI>, // Peers we've sent Neighbor to but no reply yet
    peer_data: HashMap<PI, PeerData>, // Opaque data received from other peers
    alive_disconnect_peers: HashSet<PI>, // Peers disconnecting but to keep in passive view
}
```

## Messages (`hyparview::Message`)

| Message | Direction | Purpose |
|---------|-----------|---------|
| `Join(Option<PeerData>)` | New node → Contact | Sent to a known peer to join the swarm |
| `ForwardJoin(ForwardJoin)` | Propagated | Forwarded to active view to introduce a new member |
| `Neighbor(Neighbor)` | Bidirectional | Request to add sender to active view (with priority) |
| `Disconnect(Disconnect)` | Bidirectional | Notification that a peer is leaving or being demoted |
| `Shuffle(Shuffle)` | Initiated periodically | Sent to random peer to exchange passive view contacts |
| `ShuffleReply(ShuffleReply)` | Reply to Shuffle | Returns a random subset of our views to the origin |

### Message Details

```rust
pub struct ForwardJoin<PI> {
    peer: PeerInfo<PI>, // The new peer's identity + optional data
    ttl: Ttl, // Time-to-live, decremented per hop
}

pub struct Shuffle<PI> {
    origin: PI, // Who initiated the shuffle
    nodes: Vec<PeerInfo<PI>>, // Random subset of our views
    ttl: Ttl, // Time-to-live for the random walk
}

pub struct Neighbor {
    priority: Priority, // High (cannot be denied) or Low (can be denied)
    data: Option<PeerData>,
}

pub struct Disconnect {
    alive: bool, // If true, peer is still alive (just demoting)
    _respond: bool, // Obsolete, kept for wire compat
}
```

## Join Procedure (Step by Step)

1. A new node sends `Join(me_data)` to a known contact peer.
2. The contact peer adds the new node to its active view (even evicting a random peer if necessary).
3. The contact peer forwards `ForwardJoin` to all other peers in its active view with `TTL = active_random_walk_length`.
4. Each peer receiving `ForwardJoin`:
   - If `TTL == 0` or active view has ≤1 peer: sends `Neighbor(High)` to the new node (which adds it to active view).
   - If `TTL == passive_random_walk_length`: adds the new node to passive view.
   - Decrements TTL and forwards to a random active peer (different from sender).

5. The `Neighbor` message establishes the bidirectional active connection. A `Priority::High` neighbor request **must** be accepted (potentially evicting a random active peer). A `Priority::Low` request is only accepted if there is room.
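
Step 4 can be sketched as a pure decision function. This is one reading of the list above, following the structure of the HyParView paper: the walk ends with a high-priority neighbor request when the TTL is exhausted or the receiver has at most one active peer; otherwise the message is forwarded with a decremented TTL, and the new node is also remembered in the passive view when the TTL equals the passive walk length. The names `NextStep`, `ForwardJoinOutcome`, and `on_forward_join` are hypothetical, plain `u8` stands in for `Ttl`, and the random choice of the next hop is left to the caller.

```rust
/// What to do with the message after local bookkeeping.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Ask the new node to become an active neighbor (cannot be refused).
    SendNeighborHigh,
    /// Pass the message on to one other active peer with this TTL.
    ForwardWithTtl(u8),
}

/// Result of handling one `ForwardJoin` in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct ForwardJoinOutcome {
    pub add_to_passive: bool,
    pub next: NextStep,
}

/// Decides how a peer reacts to a `ForwardJoin` carrying `ttl`.
pub fn on_forward_join(
    ttl: u8,
    active_view_len: usize,
    passive_random_walk_length: u8,
) -> ForwardJoinOutcome {
    // End of the random walk, or a nearly isolated peer: connect directly.
    if ttl == 0 || active_view_len <= 1 {
        return ForwardJoinOutcome {
            add_to_passive: false,
            next: NextStep::SendNeighborHigh,
        };
    }
    ForwardJoinOutcome {
        // Midway through the walk the new node is added to the passive view.
        add_to_passive: ttl == passive_random_walk_length,
        // `ttl` is at least 1 here, so the subtraction cannot underflow.
        next: NextStep::ForwardWithTtl(ttl - 1),
    }
}
```

With the documented defaults (`active_random_walk_length = 6`, `passive_random_walk_length = 3`), a join would be stored passively by the peer that sees TTL 3 and would end in a `Neighbor(High)` request at TTL 0.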

## Shuffle Mechanism

Periodically (every `shuffle_interval`), each node:
1. Picks a random active peer.
2. Sends `Shuffle` containing a random subset of active + passive views plus the origin's info, with a TTL.
3. The shuffle message does a random walk (each hop decrements TTL).
4. When TTL reaches 0 or the active view has ≤1 peer, the receiving peer accepts the shuffle and replies with `ShuffleReply` containing its own random peers.
5. The origin receives `ShuffleReply` and adds new peers to its passive view.

This keeps the passive view fresh and provides good connectivity even in dynamic networks.

## Failure Recovery

When a peer in the active view disconnects (detected via `PeerDisconnected`):
1. The peer is removed from the active view.
2. A `NeighborDown` event is emitted.
3. A random peer from the passive view is selected and sent a `Neighbor(Low)` request.
4. If that peer doesn't respond within `neighbor_request_timeout`, it's removed from the passive view and another peer is tried.
5. This continues until a connection is established or the passive view is exhausted.

If a `Disconnect(alive=true)` message is received:
- The peer is moved to the passive view (not just dropped), because it's still alive.
- The `alive_disconnect_peers` set tracks which disconnected peers should be retained in passive view when their connection eventually closes.
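
The recovery loop can be sketched as follows. Several simplifications are assumptions of the sketch rather than facts about the crate: `ViewsSketch` and `on_peer_disconnected` are hypothetical names, candidates are taken from the front of a queue instead of at random, and the closure `try_neighbor` stands in for "sent `Neighbor(Low)` and got an answer before `neighbor_request_timeout`".

```rust
use std::collections::VecDeque;

/// Simplified active and passive views keyed by a numeric peer id.
pub struct ViewsSketch {
    pub active: Vec<u32>,
    pub passive: VecDeque<u32>,
}

impl ViewsSketch {
    /// Handles the loss of `peer` and returns the promoted replacement, if any.
    /// `alive` mirrors `Disconnect { alive: true }`: the peer is still up
    /// and is kept as a passive contact.
    pub fn on_peer_disconnected(
        &mut self,
        peer: u32,
        alive: bool,
        mut try_neighbor: impl FnMut(u32) -> bool,
    ) -> Option<u32> {
        self.active.retain(|p| *p != peer);
        let mut promoted = None;
        // Try passive peers one by one; unresponsive ones are dropped.
        while let Some(candidate) = self.passive.pop_front() {
            if try_neighbor(candidate) {
                self.active.push(candidate);
                promoted = Some(candidate);
                break;
            }
        }
        // Re-add the departed peer last so it is not immediately re-promoted.
        if alive {
            self.passive.push_back(peer);
        }
        promoted
    }
}
```

The loop ends either with one promoted peer or with an empty passive view, matching steps 3 to 5 above.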

## PeerData

`PeerData` is an opaque `Bytes` type that peers exchange when joining. In the `net` module, it is used to serialize and transmit addressing information (`AddrInfo`):

```rust
struct AddrInfo {
    relay_url: Option<RelayUrl>,
    direct_addresses: BTreeSet<SocketAddr>,
}
```

This allows the gossip protocol itself to help propagate connectivity information, enabling the `GossipAddressLookup` service to feed addresses back into iroh's endpoint discovery system.

## Events (`hyparview::Event`)

| Event | Meaning |
|-------|---------|
| `NeighborUp(PI)` | A peer was added to our active view |
| `NeighborDown(PI)` | A peer was removed from our active view |

These events are forwarded up to the PlumTree layer and to the application.

## Timers

| Timer | Purpose |
|-------|---------|
| `DoShuffle` | Periodically trigger a shuffle operation |
| `PendingNeighborRequest(PI)` | Timeout for a pending neighbor request |

## IO Trait Pattern

The HyParView state machine is generic over an `IO` trait:

```rust
pub trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}
```

This allows the protocol to emit output events without knowing about the networking layer. The upper layers supply a `VecDeque<OutEvent>` or similar container.
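
A minimal sketch of how a queue can satisfy such a trait is shown below. The trait signature is copied from the snippet above; the `OutEvent` variants and the `announce` function are hypothetical and exist only to make the example self-contained.

```rust
use std::collections::VecDeque;

/// Hypothetical output event type; the real `OutEvent` has other variants.
#[derive(Debug, PartialEq)]
pub enum OutEvent<PI> {
    SendMessage(PI, String),
    EmitEvent(String),
}

/// Same shape as the trait shown above.
pub trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}

/// A plain queue is enough to collect output events.
impl<PI: Clone> IO<PI> for VecDeque<OutEvent<PI>> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>) {
        self.push_back(event.into());
    }
}

/// Protocol-side code only sees `IO`, never a socket or a runtime.
pub fn announce<PI: Clone>(peers: &[PI], io: &mut impl IO<PI>) {
    for peer in peers {
        io.push(OutEvent::SendMessage(peer.clone(), "hello".to_string()));
    }
}
```

The caller drains the queue after each step and turns the collected events into real network sends, which keeps the state machine itself free of IO.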
@@ -0,0 +1,256 @@
|
|||||||
|
# iroh-gossip: PlumTree Broadcast Protocol
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The PlumTree (Epidemic Broadcast Trees) protocol provides **efficient message broadcasting** across all peers in a topic's swarm. It builds on top of HyParView's membership layer, using the active view as its peer set.
|
||||||
|
|
||||||
|
It is implemented in `src/proto/plumtree.rs`.
|
||||||
|
|
||||||
|
## Core Concept: Eager vs Lazy Push
|
||||||
|
|
||||||
|
Each peer maintains two subsets of its HyParView active view:
|
||||||
|
|
||||||
|
| Set | Description | Behavior |
|
||||||
|
|-----|-------------|----------|
|
||||||
|
| **Eager push peers** | Peers to whom full messages are sent immediately | Messages are pushed eagerly (full content) |
|
||||||
|
| **Lazy push peers** | Peers to whom only message IDs (hashes) are sent | Only `IHave` announcements are sent; the peer requests the content if it has not received it otherwise |
|
||||||
|
|
||||||
|
When a peer broadcasts a message:
|
||||||
|
1. The **full message** is pushed to all **eager** peers.
|
||||||
|
2. The **message ID** (a blake3 hash) is pushed to all **lazy** peers (after a short delay for batching).
|
||||||
|
|
||||||
|
This creates an **optimized broadcast tree**: eager peers form a spanning tree for low-latency delivery, while lazy peers provide redundancy through timeout-based recovery.
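The split between the two sets can be sketched as a small planning function. This is a minimal sketch, not the crate's code: the `Push` enum and `plan_pushes` are hypothetical names, and batching of the lazy announcements is left out.

```rust
use std::collections::BTreeSet;

/// What one peer should receive for a single broadcast (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Push<PI> {
    /// Full payload, sent immediately.
    Eager(PI),
    /// Message ID only, sent later as part of a batched `IHave`.
    Lazy(PI),
}

/// Plan the pushes for one message, skipping the peer it was received from (if any).
pub fn plan_pushes<PI: Ord + Clone>(
    eager: &BTreeSet<PI>,
    lazy: &BTreeSet<PI>,
    received_from: Option<&PI>,
) -> Vec<Push<PI>> {
    let mut plan = Vec::new();
    for peer in eager {
        if Some(peer) != received_from {
            plan.push(Push::Eager(peer.clone()));
        }
    }
    for peer in lazy {
        if Some(peer) != received_from {
            plan.push(Push::Lazy(peer.clone()));
        }
    }
    plan
}
```

Passing `None` for `received_from` models a message that originates locally; passing `Some(peer)` models forwarding.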
|
||||||
|
|
||||||
|
## Configuration (`plumtree::Config`)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Config {
|
||||||
|
pub graft_timeout_1: Duration, // Default: 80ms
|
||||||
|
pub graft_timeout_2: Duration, // Default: 40ms
|
||||||
|
pub dispatch_timeout: Duration, // Default: 5ms
|
||||||
|
pub optimization_threshold: Round, // Default: Round(7)
|
||||||
|
pub message_cache_retention: Duration, // Default: 30s
|
||||||
|
pub message_id_retention: Duration, // Default: 90s
|
||||||
|
pub cache_evict_interval: Duration, // Default: 1s
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Timeout Semantics
|
||||||
|
|
||||||
|
- **`graft_timeout_1`**: After receiving an `IHave`, wait this long for the full message from an eager peer. If it doesn't arrive, send a `Graft` to the `IHave` sender.
|
||||||
|
- **`graft_timeout_2`**: After sending a `Graft`, wait this shorter timeout for the reply. If no reply, try the next `IHave` sender.
|
||||||
|
- **`dispatch_timeout`**: Delay before batching and sending `IHave` messages. This allows multiple announcements to be aggregated into a single message.
|
||||||
|
- **`optimization_threshold`**: Number of hops difference required to trigger tree optimization (see below).
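To make the interaction of the two graft timeouts concrete, the sketch below computes when the n-th `Graft` for a missing message would be sent, counted from the first `IHave`. It assumes timers fire exactly on time and that every earlier attempt went unanswered; `GraftTimeouts` and `nth_graft_delay` are hypothetical names used only here.

```rust
use std::time::Duration;

/// Only the two graft timeouts from `plumtree::Config`, with the defaults listed above.
pub struct GraftTimeouts {
    pub graft_timeout_1: Duration,
    pub graft_timeout_2: Duration,
}

impl Default for GraftTimeouts {
    fn default() -> Self {
        Self {
            graft_timeout_1: Duration::from_millis(80),
            graft_timeout_2: Duration::from_millis(40),
        }
    }
}

/// Delay between the first `IHave` and the `n`-th `Graft` (1-based) for the same message.
/// Returns `None` for `n == 0`, which has no meaning.
pub fn nth_graft_delay(timeouts: &GraftTimeouts, n: u32) -> Option<Duration> {
    if n == 0 {
        return None;
    }
    // First graft after graft_timeout_1, every further one after another graft_timeout_2.
    Some(timeouts.graft_timeout_1 + timeouts.graft_timeout_2 * (n - 1))
}
```

With the defaults, the first `Graft` goes out after 80 ms and the third after 160 ms, provided at least three peers announced the message.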
|
||||||
|
|
||||||
|
### Cache Settings
|
||||||
|
|
||||||
|
- **`message_cache_retention`**: How long to keep full message payloads in cache. This enables replying to `Graft` requests from peers who missed the eager push.
|
||||||
|
- **`message_id_retention`**: How long to remember that we've already seen a message ID. This prevents re-delivering duplicate messages.
|
||||||
|
- **`cache_evict_interval`**: How often to check and evict expired entries.
|
||||||
|
|
||||||
|
## State Structure
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct State<PI> {
|
||||||
|
me: PI, // Our peer identity
|
||||||
|
config: Config, // Protocol configuration
|
||||||
|
|
||||||
|
pub eager_push_peers: BTreeSet<PI>, // Full message delivery peers
|
||||||
|
pub lazy_push_peers: BTreeSet<PI>, // Message-ID-only delivery peers
|
||||||
|
|
||||||
|
lazy_push_queue: BTreeMap<PI, Vec<IHave>>, // Pending IHave announcements (batched)
|
||||||
|
|
||||||
|
missing_messages: HashMap<MessageId, VecDeque<(PI, Round)>>, // IHave senders awaiting delivery
|
||||||
|
received_messages: TimeBoundCache<MessageId, ()>, // Seen message IDs
|
||||||
|
cache: TimeBoundCache<MessageId, Gossip>, // Full message payloads
|
||||||
|
|
||||||
|
graft_timer_scheduled: HashSet<MessageId>, // Active graft timers
|
||||||
|
dispatch_timer_scheduled: bool, // Whether IHave dispatch is pending
|
||||||
|
|
||||||
|
init: bool, // Whether first event was processed
|
||||||
|
stats: Stats, // Message counters
|
||||||
|
max_message_size: usize, // Maximum allowed message size
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Message Types (`plumtree::Message`)
|
||||||
|
|
||||||
|
| Message | Direction | Purpose |
|
||||||
|
|---------|-----------|---------|
|
||||||
|
| `Gossip(Gossip)` | Eager push | Full message content, broadcast to eager peers |
|
||||||
|
| `Prune` | Bidirectional | Sent when moving a peer from eager to lazy set |
|
||||||
|
| `Graft(Graft)` | Lazy → Eager upgrade | Request to become an eager peer; may include a message ID to request re-delivery |
|
||||||
|
| `IHave(Vec<IHave>)` | Lazy push | Announcement: "I have these messages" (batched, sent after `dispatch_timeout`) |
|
||||||
|
|
||||||
|
### Gossip Message Structure
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Gossip {
|
||||||
|
id: MessageId, // blake3 hash of content
|
||||||
|
content: Bytes, // The actual message payload
|
||||||
|
scope: DeliveryScope, // Swarm(round) or Neighbors
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `DeliveryScope` tracks how many hops the message has traveled:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum DeliveryScope {
|
||||||
|
Swarm(Round), // Delivered via the swarm; Round = hop count from origin
|
||||||
|
Neighbors, // Delivered only to direct neighbors (not forwarded further)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Each time a `Gossip` message is forwarded, its `Round` is incremented via `next_round()`. `Neighbors`-scope messages are not forwarded at all.
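The forwarding rule can be sketched as follows. The inner `u16` of `Round`, the method name `next`, and the helper `forwarded_scope` are assumptions made for this example rather than the crate's definitions.

```rust
/// Hop counter; the inner integer type is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Round(pub u16);

impl Round {
    /// One more hop; saturates instead of overflowing.
    pub fn next(self) -> Round {
        Round(self.0.saturating_add(1))
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryScope {
    Swarm(Round),
    Neighbors,
}

/// Scope to attach when forwarding a message; `None` means "do not forward".
pub fn forwarded_scope(scope: DeliveryScope) -> Option<DeliveryScope> {
    match scope {
        DeliveryScope::Swarm(round) => Some(DeliveryScope::Swarm(round.next())),
        DeliveryScope::Neighbors => None,
    }
}
```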
|
||||||
|
|
||||||
|
### IHave Structure
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct IHave {
|
||||||
|
id: MessageId, // The blake3 hash of the message content
|
||||||
|
round: Round, // The hop count at which the sender received this message
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Graft Structure
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Graft {
|
||||||
|
id: Option<MessageId>, // If set, also reply with full message content
|
||||||
|
round: Round, // The round from the IHave that triggered this graft
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Message ID
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct MessageId([u8; 32]); // blake3 hash of message content
|
||||||
|
|
||||||
|
impl MessageId {
|
||||||
|
pub fn from_content(message: &[u8]) -> Self {
|
||||||
|
Self::from(blake3::hash(message))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Messages are validated: when receiving a `Gossip`, the receiver checks that `MessageId::from_content(&content) == id`. Spoofed messages (where the hash doesn't match the content) are silently discarded.
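The check itself is a single comparison. The sketch below keeps the hash function as a parameter so it runs without external crates; `toy_hash` is a deliberately trivial, non-cryptographic stand-in for BLAKE3 and must not be used for anything real.

```rust
/// 32-byte message ID, as in the struct above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MessageId(pub [u8; 32]);

/// Toy stand-in for BLAKE3 so the sketch needs no external crate. NOT cryptographic.
pub fn toy_hash(content: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, byte) in content.iter().enumerate() {
        out[i % 32] = out[i % 32].wrapping_mul(31).wrapping_add(*byte);
    }
    out
}

/// Accept a message only if its claimed ID matches the hash of its content.
pub fn validate(id: MessageId, content: &[u8], hash: impl Fn(&[u8]) -> [u8; 32]) -> bool {
    MessageId(hash(content)) == id
}
```

In the real protocol the `hash` argument would be BLAKE3; the comparison logic stays the same.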
|
||||||
|
|
||||||
|
## Broadcast Flow
|
||||||
|
|
||||||
|
### Sending a Message
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Compute MessageId = blake3(content)
|
||||||
|
2. Create Gossip { id, content, scope: Swarm(Round(0)) or Neighbors }
|
||||||
|
3. If Swarm scope:
|
||||||
|
a. Add to received_messages and cache
|
||||||
|
b. Queue IHave for lazy peers (dispatched after dispatch_timeout)
|
||||||
|
4. Eager-push Gossip to all eager peers (except self and sender)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Receiving a Gossip Message
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Validate: message.id == blake3(message.content) → discard if invalid
|
||||||
|
2. If already received (in received_messages):
|
||||||
|
→ Send Prune to sender (move sender to lazy set)
|
||||||
|
→ Return (don't re-broadcast)
|
||||||
|
3. If Swarm scope:
|
||||||
|
a. Add to received_messages
|
||||||
|
b. Increment round (next_round)
|
||||||
|
c. Add to cache (for Graft replies)
|
||||||
|
d. Eager-push to all eager peers (except sender)
|
||||||
|
e. Lazy-push IHave to all lazy peers (except sender)
|
||||||
|
f. Check if any prior IHave senders had a shorter path → optimize tree
|
||||||
|
4. Emit Received event to application
|
||||||
|
```
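A reduced sketch of the receive path for swarm-scoped messages is shown below. It covers only duplicate detection, pruning, and eager forwarding; hash validation, caching, `IHave` queueing, and tree optimization are omitted. Peer IDs and message IDs are plain integers, and `Peer`, `Action`, and `on_gossip` are hypothetical names.

```rust
use std::collections::{BTreeSet, HashSet};

#[derive(Debug, PartialEq)]
pub enum Action {
    /// Tell this peer to stop eager-pushing to us.
    Prune(u32),
    /// Eager-push the payload to this peer.
    Forward(u32),
    /// Hand the payload to the application.
    Deliver,
}

pub struct Peer {
    pub eager: BTreeSet<u32>,
    pub lazy: BTreeSet<u32>,
    /// Stands in for `received_messages`.
    pub seen: HashSet<u64>,
}

impl Peer {
    pub fn on_gossip(&mut self, from: u32, msg_id: u64) -> Vec<Action> {
        if !self.seen.insert(msg_id) {
            // Duplicate: demote the sender to the lazy set and ask it to prune us.
            self.eager.remove(&from);
            self.lazy.insert(from);
            return vec![Action::Prune(from)];
        }
        // First sighting: forward to every eager peer except the sender, then deliver.
        let mut actions: Vec<Action> = self
            .eager
            .iter()
            .filter(|peer| **peer != from)
            .map(|peer| Action::Forward(*peer))
            .collect();
        actions.push(Action::Deliver);
        actions
    }
}
```

The second delivery of the same message is what converts a redundant eager link into a lazy one, which is how the spanning tree emerges.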
|
||||||
|
|
||||||
|
### Receiving an IHave
|
||||||
|
|
||||||
|
```
|
||||||
|
For each IHave entry:
|
||||||
|
If message ID not in received_messages:
|
||||||
|
Add (sender, round) to missing_messages[message_id]
|
||||||
|
If no graft timer scheduled for this message:
|
||||||
|
Schedule SendGraft timer (graft_timeout_1)
|
||||||
|
```
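The bookkeeping above can be sketched with standard collections. The type aliases are placeholders, and `on_ihave` returns the message IDs for which a new timer must be scheduled instead of scheduling anything itself.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

pub type PeerId = u32; // placeholder for the real peer identity
pub type MessageId = u64; // placeholder for the 32-byte hash
pub type Round = u16; // assumed integer type

#[derive(Default)]
pub struct LazyState {
    pub received: HashSet<MessageId>,
    pub missing: HashMap<MessageId, VecDeque<(PeerId, Round)>>,
    pub graft_timer_scheduled: HashSet<MessageId>,
}

impl LazyState {
    /// Record the announcements and return the IDs that need a new
    /// graft timer (to be armed with `graft_timeout_1`).
    pub fn on_ihave(&mut self, from: PeerId, ihaves: &[(MessageId, Round)]) -> Vec<MessageId> {
        let mut to_schedule = Vec::new();
        for &(id, round) in ihaves {
            if self.received.contains(&id) {
                continue; // already have the payload, nothing to recover
            }
            self.missing.entry(id).or_default().push_back((from, round));
            // `insert` returns true only the first time, so one timer per message.
            if self.graft_timer_scheduled.insert(id) {
                to_schedule.push(id);
            }
        }
        to_schedule
    }
}
```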
|
||||||
|
|
||||||
|
### Graft Timer Expiry (Two-Phase)
|
||||||
|
|
||||||
|
**Phase 1 (`graft_timeout_1`):**
|
||||||
|
```
|
||||||
|
If message already received → no-op (cancel)
|
||||||
|
Otherwise:
|
||||||
|
Pop first (peer, round) from missing_messages[message_id]
|
||||||
|
Move peer to eager set
|
||||||
|
Send Graft { id: Some(message_id), round } to that peer
|
||||||
|
Schedule another SendGraft timer (graft_timeout_2) for fallback
|
||||||
|
```
|
||||||
|
|
||||||
|
**Phase 2 (`graft_timeout_2`):**
|
||||||
|
```
|
||||||
|
If message already received → no-op
|
||||||
|
Otherwise:
|
||||||
|
Pop next (peer, round) from missing_messages[message_id]
|
||||||
|
Move that peer to eager set
|
||||||
|
Send Graft { id: Some(message_id), round }
|
||||||
|
Schedule another SendGraft timer (graft_timeout_2)
|
||||||
|
(continues until the message is received or senders are exhausted)
|
||||||
|
```
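Both phases run the same logic and differ only in the timeout used to re-arm the timer, so a single handler is enough for a sketch. `GraftState`, `TimerOutcome`, and the integer ID types are hypothetical simplifications.

```rust
use std::collections::{BTreeSet, HashMap, HashSet, VecDeque};

pub struct GraftState {
    pub eager: BTreeSet<u32>,
    pub lazy: BTreeSet<u32>,
    pub received: HashSet<u64>,
    pub missing: HashMap<u64, VecDeque<(u32, u16)>>,
}

/// Outcome of one timer expiry.
#[derive(Debug, PartialEq)]
pub enum TimerOutcome {
    /// The message arrived in the meantime, or nobody is left to ask.
    Done,
    /// Send `Graft` to `peer` and re-arm the timer with `graft_timeout_2`.
    Graft { peer: u32, round: u16 },
}

impl GraftState {
    pub fn on_graft_timer(&mut self, id: u64) -> TimerOutcome {
        if self.received.contains(&id) {
            self.missing.remove(&id);
            return TimerOutcome::Done;
        }
        let next = self.missing.get_mut(&id).and_then(|queue| queue.pop_front());
        match next {
            Some((peer, round)) => {
                // Promote the announcer: it becomes part of our eager set.
                self.lazy.remove(&peer);
                self.eager.insert(peer);
                TimerOutcome::Graft { peer, round }
            }
            None => {
                self.missing.remove(&id);
                TimerOutcome::Done
            }
        }
    }
}
```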
|
||||||
|
|
||||||
|
### Receiving a Graft
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Move sender to eager set
|
||||||
|
2. If Graft contains a message ID:
|
||||||
|
Look up message in cache
|
||||||
|
If found: send Gossip(message) to the requesting peer
|
||||||
|
```
|
||||||
|
|
||||||
|
### Receiving a Prune
|
||||||
|
|
||||||
|
```
|
||||||
|
Move sender from eager set to lazy set
|
||||||
|
```
|
||||||
|
|
||||||
|
## Tree Optimization
|
||||||
|
|
||||||
|
The PlumTree self-optimizes based on hop counts. When a `Gossip` message is received, if we previously received an `IHave` for the same message from a different peer, we check whether the `IHave` path was significantly shorter:
|
||||||
|
|
||||||
|
```
|
||||||
|
if (ihave_round < gossip_round) && (gossip_round - ihave_round) >= optimization_threshold:
|
||||||
|
Graft the IHave sender (move to eager)
|
||||||
|
Prune the Gossip sender (move to lazy)
|
||||||
|
```
|
||||||
|
|
||||||
|
This means that if a lazy peer offers a path to the message origin that is shorter by at least the threshold, it is promoted to eager and the longer-path peer is demoted to lazy. The `optimization_threshold` (default: 7 hops) prevents thrashing from minor path-length differences.
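The condition translates directly into a small predicate. Rounds are plain `u16` values here, which is an assumption about the underlying type.

```rust
/// `true` when the lazy (`IHave`) path is shorter than the eager (`Gossip`) path
/// by at least `threshold` hops.
pub fn should_optimize(ihave_round: u16, gossip_round: u16, threshold: u16) -> bool {
    // The first comparison also guards the subtraction against underflow.
    ihave_round < gossip_round && gossip_round - ihave_round >= threshold
}
```

With the default threshold of 7, an `IHave` at round 2 triggers optimization against a `Gossip` at round 9, but not against one at round 8.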
|
||||||
|
|
||||||
|
## Neighbor Events
|
||||||
|
|
||||||
|
PlumTree receives neighbor events from HyParView:
|
||||||
|
|
||||||
|
- **`NeighborUp(peer)`**: Add peer to eager set (all new neighbors start as eager)
|
||||||
|
- **`NeighborDown(peer)`**: Remove from both eager and lazy sets; clean up any `IHave` entries from this peer in `missing_messages`
|
||||||
|
|
||||||
|
## Neighbor-Only Broadcast
|
||||||
|
|
||||||
|
The `Scope::Neighbors` broadcast scope sends a message only to directly connected peers (the active view), without any forwarding:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Scope {
|
||||||
|
Swarm, // Broadcast to all peers in the swarm
|
||||||
|
Neighbors, // Broadcast only to immediate neighbors
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Neighbor-scoped messages are useful for localized communication and are not cached or re-broadcast.
|
||||||
|
|
||||||
|
## Cache Management
|
||||||
|
|
||||||
|
The PlumTree maintains two time-bounded caches:
|
||||||
|
|
||||||
|
1. **`cache`** (`TimeBoundCache<MessageId, Gossip>`): Stores full message payloads for `message_cache_retention` (default 30s). This enables replying to `Graft` requests for recently-broadcast messages.
|
||||||
|
|
||||||
|
2. **`received_messages`** (`TimeBoundCache<MessageId, ()>`): Tracks which messages have been seen for `message_id_retention` (default 90s). This prevents duplicate delivery.
|
||||||
|
|
||||||
|
Both caches are periodically evicted (every `cache_evict_interval`, default 1s) via the `EvictCache` timer.
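The idea behind both caches can be sketched with a map that stores a deadline next to each value. `ExpiringCache` is a hypothetical, simplified type written for this example and is not the crate's `TimeBoundCache`.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical, simplified expiring map.
pub struct ExpiringCache<K, V> {
    entries: HashMap<K, (Instant, V)>,
}

impl<K: Hash + Eq, V> ExpiringCache<K, V> {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Store `value` until `now + ttl`.
    pub fn insert(&mut self, key: K, value: V, now: Instant, ttl: Duration) {
        self.entries.insert(key, (now + ttl, value));
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        self.entries.get(key).map(|(_, value)| value)
    }

    /// Drop every entry whose deadline is at or before `now`; returns how many were removed.
    pub fn evict(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, (deadline, _)| *deadline > now);
        before - self.entries.len()
    }
}
```

Using one instance with a 30 s TTL for payloads and another with a 90 s TTL for seen IDs reproduces the asymmetry described above: after 31 s the payload is gone, but the ID is still remembered, so a late duplicate is not delivered again.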
|
||||||
187
docs/research/references/iroh/iroh-gossip/04-state-and-topic.md
Normal file
@@ -0,0 +1,187 @@
|
|||||||
|
# iroh-gossip: Protocol State & Topic Coordination
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The `state` module (`src/proto/state.rs`) provides the **top-level protocol state machine** that manages multiple topics. The `topic` module (`src/proto/topic.rs`) coordinates the HyParView and PlumTree state machines for a single topic.
|
||||||
|
|
||||||
|
## Multi-Topic State (`state::State`)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct State<PI, R> {
|
||||||
|
me: PI, // Our peer identity
|
||||||
|
me_data: PeerData, // Our opaque peer data
|
||||||
|
config: Config, // Protocol configuration
|
||||||
|
rng: R, // Random number generator
|
||||||
|
states: HashMap<TopicId, topic::State<PI, R>>, // Per-topic state
|
||||||
|
outbox: Outbox<PI>, // Buffered output events
|
||||||
|
peer_topics: ConnsMap<PI>, // Maps peer → set of shared topics
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `State` acts as a **multiplexer** — it routes events to the correct topic's state and collects output events. It also tracks which topics are shared with each peer (in `peer_topics`), which is used to determine when a peer connection can safely be closed (only when no topic still needs it).
|
||||||
|
|
||||||
|
### TopicId
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Clone, Copy, Eq, PartialEq, Hash, Serialize, Ord, PartialOrd, Deserialize)]
|
||||||
|
pub struct TopicId([u8; 32]);
|
||||||
|
```
|
||||||
|
|
||||||
|
A 32-byte identifier for a topic. Typically created as `blake3::hash(topic_name)` or from raw bytes. Each topic is an independent swarm and broadcast scope.
|
||||||
|
|
||||||
|
### Wire Message Format
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Message<PI> {
|
||||||
|
pub topic: TopicId,
|
||||||
|
pub message: topic::Message<PI>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Every wire message carries the `TopicId` prefix, allowing multiplexing of multiple topics over a single connection.
|
||||||
|
|
||||||
|
### Event Routing
|
||||||
|
|
||||||
|
`InEvent` is mapped to either a topic-specific event or a global event:
|
||||||
|
|
||||||
|
| InEvent | Routing |
|
||||||
|
|---------|---------|
|
||||||
|
| `RecvMessage(from, Message{topic, message})` | → Topic-specific: `topic::InEvent::RecvMessage` |
|
||||||
|
| `Command(topic, command)` | → Topic-specific: `topic::InEvent::Command` |
|
||||||
|
| `TimerExpired(Timer{topic, timer})` | → Topic-specific: `topic::InEvent::TimerExpired` |
|
||||||
|
| `PeerDisconnected(peer)` | → Broadcast to ALL topics |
|
||||||
|
| `UpdatePeerData(data)` | → Broadcast to ALL topics |
|
||||||
|
|
||||||
|
### Topic Lifecycle
|
||||||
|
|
||||||
|
When a `Command::Join(peers)` is received for a topic that doesn't yet have state, a new `topic::State` is automatically created. When `Command::Quit` is received, the topic's state is removed after processing the quit event.
|
||||||
|
|
||||||
|
### Connection Management
|
||||||
|
|
||||||
|
When a `topic::OutEvent::DisconnectPeer(peer)` is emitted, the state module checks `peer_topics` to see if any other topic still needs a connection to that peer. Only when no topic needs the peer anymore is `OutEvent::DisconnectPeer(peer)` emitted at the top level.
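A minimal sketch of this reference tracking is shown below. `PeerTopics` and its methods are hypothetical; the source only states that `peer_topics` maps each peer to the set of topics shared with it.

```rust
use std::collections::{HashMap, HashSet};

pub type PeerId = u32; // placeholder for the real peer identity
pub type TopicId = [u8; 32];

#[derive(Default)]
pub struct PeerTopics {
    inner: HashMap<PeerId, HashSet<TopicId>>,
}

impl PeerTopics {
    /// Record that `topic` uses a connection to `peer`.
    pub fn insert(&mut self, peer: PeerId, topic: TopicId) {
        self.inner.entry(peer).or_default().insert(topic);
    }

    /// Record that `topic` no longer needs `peer`.
    /// Returns `true` when no topic is left, i.e. the connection may be closed.
    pub fn remove(&mut self, peer: PeerId, topic: &TopicId) -> bool {
        let now_empty = match self.inner.get_mut(&peer) {
            Some(topics) => {
                topics.remove(topic);
                topics.is_empty()
            }
            None => return true,
        };
        if now_empty {
            self.inner.remove(&peer);
        }
        now_empty
    }
}
```

Only when `remove` returns `true` would the top-level state emit its own `DisconnectPeer`.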
|
||||||
|
|
||||||
|
## Topic State (`topic::State`)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct State<PI, R> {
|
||||||
|
me: PI,
|
||||||
|
pub swarm: hyparview::State<PI, R>, // HyParView membership
|
||||||
|
pub gossip: plumtree::State<PI>, // PlumTree broadcast
|
||||||
|
outbox: VecDeque<OutEvent<PI>>,
|
||||||
|
stats: Stats,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The topic state **composes** HyParView and PlumTree, bridging them together:
|
||||||
|
|
||||||
|
### Event Forwarding
|
||||||
|
|
||||||
|
When `topic::State::handle()` is called:
|
||||||
|
|
||||||
|
1. **HyParView events** are processed first (membership layer).
|
||||||
|
2. **NeighborUp/NeighborDown events** emitted by HyParView are forwarded to PlumTree:
|
||||||
|
- `NeighborUp(peer)` → `plumtree::InEvent::NeighborUp(peer)` — adds peer to eager set
|
||||||
|
- `NeighborDown(peer)` → `plumtree::InEvent::NeighborDown(peer)` — removes peer from both sets
|
||||||
|
3. All output events from both layers are collected and returned.
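The bridging step can be sketched as follows. `SwarmEvent`, `Broadcast`, and `Topic` are hypothetical, stripped-down stand-ins for the HyParView events, the PlumTree state, and `topic::State`; only the neighbor forwarding is modeled.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SwarmEvent {
    NeighborUp(u32),
    NeighborDown(u32),
}

/// Stand-in for the broadcast layer: only the two peer sets.
#[derive(Default)]
pub struct Broadcast {
    pub eager: BTreeSet<u32>,
    pub lazy: BTreeSet<u32>,
}

impl Broadcast {
    pub fn handle(&mut self, event: SwarmEvent) {
        match event {
            // New neighbors start out as eager peers.
            SwarmEvent::NeighborUp(peer) => {
                self.eager.insert(peer);
            }
            // Lost neighbors leave both sets.
            SwarmEvent::NeighborDown(peer) => {
                self.eager.remove(&peer);
                self.lazy.remove(&peer);
            }
        }
    }
}

#[derive(Default)]
pub struct Topic {
    pub gossip: Broadcast,
    pub outbox: Vec<SwarmEvent>,
}

impl Topic {
    /// Forward membership events to the broadcast layer, then surface them to the application.
    pub fn forward(&mut self, swarm_events: Vec<SwarmEvent>) {
        for event in swarm_events {
            self.gossip.handle(event);
            self.outbox.push(event);
        }
    }
}
```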
|
||||||
|
|
||||||
|
### Command Handling
|
||||||
|
|
||||||
|
| Command | Action |
|
||||||
|
|---------|--------|
|
||||||
|
| `Join(peers)` | Sends `RequestJoin(peer)` to HyParView for each peer in the list |
|
||||||
|
| `Broadcast(data, scope)` | Sends `Broadcast(data, scope)` to PlumTree |
|
||||||
|
| `Quit` | Sends `Quit` to HyParView (which sends `Disconnect` to all active peers) |
|
||||||
|
|
||||||
|
### Message Routing
|
||||||
|
|
||||||
|
When a topic message is received:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
match message {
|
||||||
|
Message::Swarm(message) => hyparview.handle(RecvMessage(from, message)),
|
||||||
|
Message::Gossip(message) => plumtree.handle(RecvMessage(from, message)),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Timer Routing
|
||||||
|
|
||||||
|
```rust
|
||||||
|
match timer {
|
||||||
|
Timer::Swarm(timer) => hyparview.handle(TimerExpired(timer)),
|
||||||
|
Timer::Gossip(timer) => plumtree.handle(TimerExpired(timer)),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Topic Messages (`topic::Message`)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Message<PI> {
|
||||||
|
Swarm(hyparview::Message<PI>), // Membership messages
|
||||||
|
Gossip(plumtree::Message), // Broadcast messages
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The message kind is used for metrics tracking:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn kind(&self) -> MessageKind {
|
||||||
|
match self {
|
||||||
|
Message::Swarm(_) => MessageKind::Control,
|
||||||
|
Message::Gossip(message) => match message {
|
||||||
|
plumtree::Message::Gossip(_) => MessageKind::Data,
|
||||||
|
_ => MessageKind::Control,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Topic Events (`topic::Event`)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Event<PI> {
|
||||||
|
NeighborUp(PI), // From HyParView: new active neighbor
|
||||||
|
NeighborDown(PI), // From HyParView: lost active neighbor
|
||||||
|
Received(GossipEvent<PI>), // From PlumTree: received a gossip message
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `Received` event contains:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct GossipEvent<PI> {
|
||||||
|
pub content: Bytes, // Message payload
|
||||||
|
pub delivered_from: PI, // Peer that delivered the message to us
|
||||||
|
pub scope: DeliveryScope, // Swarm(round) or Neighbors
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Topic Configuration
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Config {
|
||||||
|
pub membership: hyparview::Config, // HyParView configuration
|
||||||
|
pub broadcast: plumtree::Config, // PlumTree configuration
|
||||||
|
pub max_message_size: usize, // Maximum wire message size (default: 4096)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `max_message_size` is the total wire-level message size including headers. The actual payload capacity is computed as `max_message_size - postcard_header_size`, where the header size accounts for the topic ID and message envelope overhead.
|
||||||
|
|
||||||
|
## Statistics
|
||||||
|
|
||||||
|
Each topic tracks:
|
||||||
|
```rust
|
||||||
|
pub struct Stats {
|
||||||
|
pub messages_sent: usize,
|
||||||
|
pub messages_received: usize,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The PlumTree layer also tracks:
|
||||||
|
```rust
|
||||||
|
pub struct Stats {
|
||||||
|
pub payload_messages_received: u64,
|
||||||
|
pub control_messages_received: u64,
|
||||||
|
pub max_last_delivery_hop: u16,
|
||||||
|
}
|
||||||
|
```
|
||||||
244
docs/research/references/iroh/iroh-gossip/05-net-actor.md
Normal file
@@ -0,0 +1,244 @@
|
|||||||
|
# iroh-gossip: Networking Layer & Actor Model
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The `net` module (`src/net.rs` and submodules) provides the async runtime layer that connects the IO-free protocol state machine to real network IO via iroh QUIC connections. It is built around a **single Actor** that manages all topics and connections.
|
||||||
|
|
||||||
|
## ALPN Protocol
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub const GOSSIP_ALPN: &[u8] = b"/iroh-gossip/1";
|
||||||
|
```
|
||||||
|
|
||||||
|
This ALPN identifier is used when establishing QUIC connections through iroh.
|
||||||
|
|
||||||
|
## Gossip Handle (`net::Gossip`)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct Gossip {
|
||||||
|
pub(crate) inner: Arc<Inner>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`Gossip` is the primary public handle. It derefs to `GossipApi`, providing the user-facing interface:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Subscribe to a topic
|
||||||
|
let (sender, receiver) = gossip.subscribe(topic_id, bootstrap_peers).await?.split();
|
||||||
|
|
||||||
|
// Subscribe and wait for at least one connection
|
||||||
|
let topic = gossip.subscribe_and_join(topic_id, bootstrap_peers).await?;
|
||||||
|
|
||||||
|
// Broadcast a message
|
||||||
|
sender.broadcast(b"hello world".to_vec().into()).await?;
|
||||||
|
|
||||||
|
// Broadcast to neighbors only
|
||||||
|
sender.broadcast_neighbors(b"local announcement".to_vec().into()).await?;
|
||||||
|
|
||||||
|
// Join additional peers
|
||||||
|
sender.join_peers(vec![peer_id]).await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
### Builder Pattern
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let gossip = Gossip::builder()
|
||||||
|
.max_message_size(8192) // Default: 4096
|
||||||
|
.membership_config(hyparview_config) // HyParView settings
|
||||||
|
.broadcast_config(plumtree_config) // PlumTree settings
|
||||||
|
.alpn(b"/custom-alpn") // Custom ALPN (must match across network)
|
||||||
|
.spawn(endpoint);
|
||||||
|
```
|
||||||
|
|
||||||
|
## Architecture: The Actor
|
||||||
|
|
||||||
|
The core of the networking layer is the `Actor` struct, which runs as a single async task:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct Actor {
|
||||||
|
alpn: Bytes,
|
||||||
|
state: proto::State<PublicKey, StdRng>, // Protocol state machine
|
||||||
|
endpoint: Endpoint, // iroh endpoint for connections
|
||||||
|
dialer: Dialer, // Manages outgoing connections
|
||||||
|
rpc_rx: mpsc::Receiver<RpcMessage>, // API commands
|
||||||
|
local_rx: mpsc::Receiver<LocalActorMessage>, // Local commands (connections, shutdown)
|
||||||
|
in_event_tx: mpsc::Sender<InEvent>, // Protocol input channel
|
||||||
|
in_event_rx: mpsc::Receiver<InEvent>, // Protocol input channel (receiver)
|
||||||
|
timers: Timers<Timer>, // Scheduled timers
|
||||||
|
topics: HashMap<TopicId, TopicState>, // Per-topic subscription state
|
||||||
|
peers: HashMap<EndpointId, PeerState>, // Per-peer connection state
|
||||||
|
command_rx: stream_group::Keyed<TopicCommandStream>, // Per-topic command streams
|
||||||
|
quit_queue: VecDeque<TopicId>, // Topics pending unsubscription
|
||||||
|
connection_tasks: JoinSet<...>, // Running connection loop tasks
|
||||||
|
metrics: Arc<Metrics>,
|
||||||
|
topic_event_forwarders: JoinSet<TopicId>, // Tasks forwarding events to subscribers
|
||||||
|
address_lookup: GossipAddressLookup, // Address discovery integration
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Event Loop
|
||||||
|
|
||||||
|
The actor's `run()` method calls `event_loop()` in a loop. Each iteration uses `tokio::select!` to handle:
|
||||||
|
|
||||||
|
| Source | Action |
|
||||||
|
|--------|--------|
|
||||||
|
| `local_rx` (local messages) | Handle incoming connections or shutdown |
|
||||||
|
| `rpc_rx` (RPC messages) | Process `Join` requests from the API |
|
||||||
|
| `command_rx` (per-topic commands) | Process `Broadcast`, `BroadcastNeighbors`, `JoinPeers`, or stream closure |
|
||||||
|
| `addr_updates` (endpoint addr changes) | Update our `PeerData` in the protocol state |
|
||||||
|
| `dialer` (connection establishment) | Handle successful/failed outgoing connections |
|
||||||
|
| `in_event_rx` (protocol events from connections) | Feed events to the protocol state machine |
|
||||||
|
| `timers` (scheduled timers) | Feed timer expirations to the protocol state machine |
|
||||||
|
| `connection_tasks` (connection task completions) | Handle peer disconnections |
|
||||||
|
| `topic_event_forwarders` (subscription tasks) | Handle topic cleanup when all subscribers drop |
|
||||||
|
|
||||||
|
### Processing InEvents
|
||||||
|
|
||||||
|
When an `InEvent` is processed, the actor calls `self.state.handle(event, now, metrics)`, which returns `Vec<OutEvent>`. For each `OutEvent`:
|
||||||
|
|
||||||
|
| OutEvent | Action |
|
||||||
|
|----------|--------|
|
||||||
|
| `SendMessage(peer, message)` | Send via peer's active connection or queue for pending connection |
|
||||||
|
| `EmitEvent(topic, event)` | Forward to topic's `broadcast::Sender` → subscribers |
|
||||||
|
| `ScheduleTimer(delay, timer)` | Schedule timer via `Timers` data structure |
|
||||||
|
| `DisconnectPeer(peer)` | Drop the peer's send channel, removing from `peers` map |
|
||||||
|
| `PeerData(endpoint_id, data)` | Decode `AddrInfo` from `PeerData`, add to `GossipAddressLookup` |
|
||||||
|
|
||||||
|
## Connection Management
|
||||||
|
|
||||||
|
### Peer States
|
||||||
|
|
||||||
|
```rust
|
||||||
|
enum PeerState {
|
||||||
|
Pending {
|
||||||
|
queue: Vec<ProtoMessage>, // Messages queued while connecting
|
||||||
|
},
|
||||||
|
Active {
|
||||||
|
active_send_tx: mpsc::Sender<ProtoMessage>, // Current active send channel
|
||||||
|
active_conn_id: ConnId, // Stable ID of active connection
|
||||||
|
other_conns: Vec<ConnId>, // Older connections still closing
|
||||||
|
},
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When a message needs to be sent to a peer:
|
||||||
|
- **Active**: Send immediately via `active_send_tx`
|
||||||
|
- **Pending**: Queue the message and initiate a dial
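The send-or-queue behavior can be sketched with a two-variant enum. This is a simplified sketch: `std::sync::mpsc` stands in for the async channel, `ProtoMessage` is a `String` placeholder, connection IDs are dropped, and `send` and `activate` are hypothetical method names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

pub type ProtoMessage = String; // placeholder for the real message type

pub enum PeerState {
    Pending { queue: Vec<ProtoMessage> },
    Active { send_tx: Sender<ProtoMessage> },
}

impl PeerState {
    /// Send now if connected, otherwise buffer until the dial completes.
    pub fn send(&mut self, msg: ProtoMessage) {
        match self {
            PeerState::Pending { queue } => queue.push(msg),
            PeerState::Active { send_tx } => {
                // A send error means the connection task is gone; this sketch ignores it.
                let _ = send_tx.send(msg);
            }
        }
    }

    /// Called once a connection is up: switch to `Active` and return the backlog
    /// together with the receiving half for the connection task.
    pub fn activate(&mut self) -> (Vec<ProtoMessage>, Receiver<ProtoMessage>) {
        let (send_tx, send_rx) = channel();
        let old = std::mem::replace(self, PeerState::Active { send_tx });
        let backlog = match old {
            PeerState::Pending { queue } => queue,
            PeerState::Active { .. } => Vec::new(),
        };
        (backlog, send_rx)
    }
}
```

The returned backlog corresponds to the `queue` argument that the connection loop sends before it starts draining the channel.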
|
||||||
|
|
||||||
|
### Dialer
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct Dialer {
|
||||||
|
endpoint: Endpoint,
|
||||||
|
pending: JoinSet<(EndpointId, Option<Result<Connection, ConnectError>>)>,
|
||||||
|
pending_dials: HashMap<EndpointId, CancellationToken>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `Dialer` manages outgoing connections. It:
|
||||||
|
1. Checks if a dial is already pending for a peer
|
||||||
|
2. Spawns an async connection task with cancellation support
|
||||||
|
3. Returns completed connections via `next_conn()`
|
||||||
|
|
||||||
|
### Connection Loop
|
||||||
|
|
||||||
|
Each peer connection runs a `connection_loop` task:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn connection_loop(
|
||||||
|
from: PublicKey, // Remote peer's public key
|
||||||
|
conn: Connection, // QUIC connection
|
||||||
|
origin: ConnOrigin, // Accept (incoming) or Dial (outgoing)
|
||||||
|
send_rx: mpsc::Receiver<ProtoMessage>, // Messages to send
|
||||||
|
in_event_tx: mpsc::Sender<InEvent>, // Channel to protocol
|
||||||
|
max_message_size: usize, // Maximum message size
|
||||||
|
queue: Vec<ProtoMessage>, // Queued messages to send first
|
||||||
|
) -> Result<(), ConnectionLoopError>
|
||||||
|
```
|
||||||
|
|
||||||
|
The connection loop:
|
||||||
|
1. First sends any queued messages
|
||||||
|
2. Runs a send loop and receive loop concurrently (`tokio::join!`)
|
||||||
|
3. Uses iroh QUIC unidirectional streams for communication (one per topic, see Wire Protocol)
|
||||||
|
|
||||||
|
### Wire Protocol
|
||||||
|
|
||||||
|
Messages are serialized with `postcard` and sent as **length-prefixed frames** over QUIC unidirectional streams:
|
||||||
|
|
||||||
|
```
|
||||||
|
┌──────────────┐
|
||||||
|
│ Stream Header │ ── Contains TopicId (sent once per stream)
|
||||||
|
├──────────────┤
|
||||||
|
│ Frame (len) │ ── u32 length prefix
|
||||||
|
│ Frame (data) │ ── postcard-encoded topic::Message<PublicKey>
|
||||||
|
├──────────────┤
|
||||||
|
│ Frame (len) │ ── next message...
|
||||||
|
│ Frame (data) │
|
||||||
|
└──────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
Each topic gets its own unidirectional stream. The stream header is sent once when the stream is opened. Disconnect messages close the stream after being sent.
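Length-prefixed framing of this kind can be sketched in a few lines. The byte order of the `u32` prefix is not stated here, so big-endian is an assumption, and `write_frame` and `read_frame` are hypothetical helpers that operate on byte buffers instead of QUIC streams.

```rust
/// Append one frame: a big-endian u32 length (assumed byte order) followed by the payload.
pub fn write_frame(
    buf: &mut Vec<u8>,
    payload: &[u8],
    max_message_size: usize,
) -> Result<(), String> {
    if payload.len() > max_message_size || payload.len() > u32::MAX as usize {
        return Err(format!(
            "message of {} bytes exceeds limit of {} bytes",
            payload.len(),
            max_message_size
        ));
    }
    buf.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    buf.extend_from_slice(payload);
    Ok(())
}

/// Try to split one frame off the front of `input`.
/// Returns `Ok(None)` when more bytes are needed, otherwise `(frame, rest)`.
pub fn read_frame(
    input: &[u8],
    max_message_size: usize,
) -> Result<Option<(&[u8], &[u8])>, String> {
    if input.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([input[0], input[1], input[2], input[3]]) as usize;
    if len > max_message_size {
        return Err(format!("announced frame of {} bytes exceeds limit", len));
    }
    if input.len() < 4 + len {
        return Ok(None);
    }
    Ok(Some((&input[4..4 + len], &input[4 + len..])))
}
```

Checking the announced length before buffering the body is what lets a receiver reject oversized messages without allocating for them.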
|
||||||
|
|
||||||
|
The `SendLoop` manages per-topic streams within a connection:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct SendLoop {
|
||||||
|
conn: Connection,
|
||||||
|
streams: HashMap<TopicId, SendStream>, // One stream per topic
|
||||||
|
buffer: Vec<u8>,
|
||||||
|
max_message_size: usize,
|
||||||
|
send_rx: mpsc::Receiver<ProtoMessage>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When a disconnect message is sent for a topic, the stream for that topic is closed (via `finish()`).
|
||||||
|
|
||||||
|
## Topic State (Net Layer)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct TopicState {
|
||||||
|
neighbors: BTreeSet<EndpointId>, // Current active neighbors (from protocol)
|
||||||
|
event_sender: broadcast::Sender<ProtoEvent>, // Broadcast channel to subscribers
|
||||||
|
command_rx_keys: HashSet<stream_group::Key>, // Active command stream keys
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
A topic is considered "still needed" if it has either:
|
||||||
|
- Active command receivers (publishers), or
|
||||||
|
- Active event subscribers (subscribers)
|
||||||
|
|
||||||
|
When neither exists, the topic is queued for quit/unsubscription.
|
||||||
|
|
||||||
|
## Address Lookup Integration
|
||||||
|
|
||||||
|
The `GossipAddressLookup` integrates with iroh's address discovery system:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub(crate) struct GossipAddressLookup {
|
||||||
|
endpoints: NodeMap, // BTreeMap<EndpointId, StoredEndpointInfo>
|
||||||
|
_task_handle: Arc<AbortOnDropHandle<()>>, // Background eviction task
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
It implements iroh's `AddressLookup` trait, allowing gossip-discovered peer addresses to feed back into iroh's connection establishment. This means that when a peer shares its address information in `Join` or `ForwardJoin` messages, that information is used to help iroh connect to that peer.
|
||||||
|
|
||||||
|
Entries expire after 5 minutes (configurable via `RetentionOpts`), with eviction checks every 30 seconds.
|
||||||
|
|
||||||
|
## Metrics
|
||||||
|
|
||||||
|
The `Metrics` struct tracks various counters:
|
||||||
|
|
||||||
|
| Metric | Description |
|
||||||
|
|--------|-------------|
|
||||||
|
| `msgs_ctrl_sent` | Control messages sent |
|
||||||
|
| `msgs_ctrl_recv` | Control messages received |
|
||||||
|
| `msgs_data_sent` | Data messages sent |
|
||||||
|
| `msgs_data_recv` | Data messages received |
|
||||||
|
| `msgs_data_sent_size` | Total size of data messages sent |
|
||||||
|
| `msgs_data_recv_size` | Total size of data messages received |
|
||||||
|
| `msgs_ctrl_sent_size` | Total size of control messages sent |
|
||||||
|
| `msgs_ctrl_recv_size` | Total size of control messages received |
|
||||||
|
| `neighbor_up` | Neighbor connections established |
|
||||||
|
| `neighbor_down` | Neighbor connections lost |
|
||||||
|
| `actor_tick_*` | Various event loop tick counters |
|
||||||
290
docs/research/references/iroh/iroh-gossip/06-api-data-flow.md
Normal file
@@ -0,0 +1,290 @@
|
|||||||
|
# iroh-gossip: Public API & Data Flow
|
||||||
|
|
||||||
|
## Public API Types
|
||||||
|
|
||||||
|
### Gossip (Main Handle)
|
||||||
|
|
||||||
|
The `Gossip` struct is the main entry point, created via a `Builder`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let gossip = Gossip::builder()
|
||||||
|
.max_message_size(8192)
|
||||||
|
.membership_config(HyparviewConfig { ... })
|
||||||
|
.broadcast_config(PlumtreeConfig { ... })
|
||||||
|
.alpn(b"/custom-alpn")
|
||||||
|
.spawn(endpoint);
|
||||||
|
```
|
||||||
|
|
||||||
|
It derefs to `GossipApi`, which provides:
|
||||||
|
|
||||||
|
| Method | Description |
|--------|-------------|
| `subscribe(topic_id, bootstrap)` | Join a topic with default options |
| `subscribe_and_join(topic_id, bootstrap)` | Join and wait for at least one connection |
| `subscribe_with_opts(topic_id, opts)` | Join with custom `JoinOptions` |
| `handle_connection(conn)` | Handle an incoming QUIC connection |
| `shutdown()` | Gracefully leave all topics and stop |
| `max_message_size()` | Get configured max message size |
| `metrics()` | Get metrics handle |
|
||||||
|
|
||||||
|
### GossipTopic (Subscription Handle)
|
||||||
|
|
||||||
|
Returned by `subscribe()`, it is a `Stream<Item = Result<Event, ApiError>>`:
|
||||||
|
|
||||||
|
```rust
// `mut` is needed because `joined()` takes `&mut self`
let mut topic: GossipTopic = gossip.subscribe(topic_id, peers).await?;
topic.broadcast(b"hello".to_vec().into()).await?;
topic.broadcast_neighbors(b"local".to_vec().into()).await?;
topic.joined().await?; // Wait for first connection
```
|
||||||
|
|
||||||
|
Can be split into sender and receiver:
|
||||||
|
|
||||||
|
```rust
let (sender, receiver) = topic.split();
// sender: GossipSender - can broadcast and join peers
// receiver: GossipReceiver - can receive events and check neighbors
```
|
||||||
|
|
||||||
|
### GossipSender
|
||||||
|
|
||||||
|
```rust
pub struct GossipSender(mpsc::Sender<Command>);

impl GossipSender {
    pub async fn broadcast(&self, message: Bytes) -> Result<(), ApiError>;
    pub async fn broadcast_neighbors(&self, message: Bytes) -> Result<(), ApiError>;
    pub async fn join_peers(&self, peers: Vec<EndpointId>) -> Result<(), ApiError>;
}
```
|
||||||
|
|
||||||
|
### GossipReceiver
|
||||||
|
|
||||||
|
```rust
pub struct GossipReceiver {
    stream: Pin<Box<dyn Stream<Item = Result<Event, ApiError>> + Send + Sync + 'static>>,
    neighbors: HashSet<EndpointId>,
}

impl GossipReceiver {
    pub fn neighbors(&self) -> impl Iterator<Item = EndpointId> + '_;
    pub async fn joined(&mut self) -> Result<(), ApiError>;
    pub fn is_joined(&self) -> bool;
}
```
|
||||||
|
|
||||||
|
The `GossipReceiver` tracks the neighbor set internally by processing `NeighborUp` and `NeighborDown` events.
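
The neighbor bookkeeping described above can be sketched as follows. This is a minimal illustration, not the crate's code: `PeerId`, the simplified `Event` enum, and `NeighborTracker` are hypothetical stand-ins, and the rule that "joined" means "at least one neighbor" is an assumption drawn from the `joined()` comment ("wait for first connection").

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `EndpointId` (the real type is a public key).
type PeerId = u32;

/// Simplified mirror of the `Event` enum; the message payload is reduced to raw bytes.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    NeighborUp(PeerId),
    NeighborDown(PeerId),
    Received(Vec<u8>),
    Lagged,
}

/// Hypothetical helper that keeps a neighbor set in sync with the event stream.
#[derive(Debug, Default)]
struct NeighborTracker {
    neighbors: HashSet<PeerId>,
}

impl NeighborTracker {
    /// Updates the set on membership events; all other events leave it untouched.
    fn observe(&mut self, event: &Event) {
        match event {
            Event::NeighborUp(id) => {
                self.neighbors.insert(*id);
            }
            Event::NeighborDown(id) => {
                self.neighbors.remove(id);
            }
            Event::Received(_) | Event::Lagged => {}
        }
    }

    /// Assumed semantics: joined once at least one neighbor is connected.
    fn is_joined(&self) -> bool {
        !self.neighbors.is_empty()
    }
}
```

Feeding every event through `observe` before handing it to the application keeps the set current without a second channel.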
|
||||||
|
|
||||||
|
### Event Types
|
||||||
|
|
||||||
|
```rust
pub enum Event {
    NeighborUp(EndpointId),   // New direct neighbor connected
    NeighborDown(EndpointId), // Direct neighbor disconnected
    Received(Message),        // Gossip message received
    Lagged,                   // Internal channel lagged (messages dropped)
}

pub struct Message {
    pub content: Bytes,             // Message content
    pub scope: DeliveryScope,       // Swarm(round) or Neighbors
    pub delivered_from: EndpointId, // Peer that delivered the message to us
}
```
|
||||||
|
|
||||||
|
### Command Types
|
||||||
|
|
||||||
|
```rust
pub enum Command {
    Broadcast(Bytes),           // Broadcast to all in swarm
    BroadcastNeighbors(Bytes),  // Broadcast to direct neighbors only
    JoinPeers(Vec<EndpointId>), // Join additional peers
}
```
|
||||||
|
|
||||||
|
### JoinOptions
|
||||||
|
|
||||||
|
```rust
pub struct JoinOptions {
    pub bootstrap: BTreeSet<EndpointId>, // Initial peers to connect to
    pub subscription_capacity: usize,    // Event channel capacity (default: 2048)
}
```
|
||||||
|
|
||||||
|
### DeliveryScope
|
||||||
|
|
||||||
|
```rust
pub enum DeliveryScope {
    Swarm(Round), // Message traveled `Round` hops from origin
    Neighbors,    // Direct neighbor message (not forwarded)
}
```
|
||||||
|
|
||||||
|
`DeliveryScope::Swarm(Round(0))` means the neighbor that delivered the message is also its original broadcaster (zero forwarding hops). `Round(n)` means the message was forwarded over `n` hops before it reached this node.
|
||||||
|
|
||||||
|
## Data Flow Diagrams
|
||||||
|
|
||||||
|
### Joining a Topic
|
||||||
|
|
||||||
|
```
User Code                    GossipApi                 Actor                  Proto State
    |                            |                       |                         |
    |-- subscribe(topic, peers)->|                       |                         |
    |                            |-- JoinRequest ------->|                         |
    |                            |                       |-- Command::Join ------->|
    |                            |                       |                         |-- RequestJoin(peers)
    |                            |                       |                         |-- SendMessage(peer, Join)
    |                            |                       |                         |-- ...
    |                            |<-- NeighborUp events--|<-- EmitEvent(NeighborUp)|
    |<-- Event::NeighborUp ------|                       |                         |
```
|
||||||
|
|
||||||
|
### Broadcasting a Message
|
||||||
|
|
||||||
|
```
User Code          GossipSender          Actor            Proto State           Network
    |                    |                 |                   |                   |
    |-- broadcast(msg) ->|                 |                   |                   |
    |                    |-- Command:: --> |                   |                   |
    |                    |   Broadcast     |                   |                   |
    |                    |                 |-- Broadcast ----->|                   |
    |                    |                 |                   |-- eager_push ---->|
    |                    |                 |                   |   (Gossip msgs)   |
    |                    |                 |                   |-- lazy_push ----->|
    |                    |                 |                   |   (IHave msgs)    |
    |                    |                 |                   |                   |
    |  (other peer receives Gossip)        |                   |                   |
    |                    |                 |                   |<-- RecvMessage ---|
    |                    |                 |<-- InEvent -------|                   |
    |                    |                 |                   |   (validates ID)  |
    |                    |                 |                   |   (forwards)      |
    |<-- Received(msg) --|<-- EmitEvent ---|                   |                   |
```
|
||||||
|
|
||||||
|
### Receiving and Processing IHave/Graft
|
||||||
|
|
||||||
|
```
Time →

Peer A                 Our Node                    Peer B
  |                        |                          |
  |                        |<-- IHave(id, round) -----|
  |                        | Schedule graft_timeout_1 |
  |                        | (wait for eager push)    |
  |                        |                          |
  |   [timeout expires]    |                          |
  |                        |-- Graft(id, round) ----->|  (Peer B sent IHave)
  |                        |                          |
  |                        |<-- Gossip(content) ------|  (Peer B replies)
  |                        |                          |
  |                        |-- Prune ---------------->|  (maybe, if optimization)
```
|
||||||
|
|
||||||
|
### HyParView Join Flow
|
||||||
|
|
||||||
|
```
New Node          Contact Node        Active Peers of Contact
   |                    |                        |
   |-- Join(me_data) -->|                        |
   |                    |-- add_active(new)      |
   |                    |-- Neighbor(High) ----->|  (to new node)
   |                    |-- ForwardJoin -------->|  (to each active peer)
   |                    |                        |-- add_active or add_passive
   |                    |                        |-- Neighbor(Low/High) -> (to new node)
   |                    |                        |-- ForwardJoin -> (random peer)
   |                    |                        |
   |<-- Neighbor(High) -|                        |
   |<-- Neighbor(Low/High) ----------------------|
   |                    |                        |
```
|
||||||
|
|
||||||
|
### Shuffle Periodic Operation
|
||||||
|
|
||||||
|
```
Node A                Node B                  Random Node
  |                      |                         |
  |-- Shuffle ---------->|                         |
  |   (origin=A, nodes,  |                         |
  |    TTL=6)            |                         |
  |                      |-- Shuffle ------------->|
  |                      |   (origin=A, nodes,     |
  |                      |    TTL=5)               |
  |                      |                         |-- ...
  |                      |                         |-- (TTL reaches 0)
  |                      |                         |
  |<-- ShuffleReply -----|<-- ShuffleReply --------|
  |   (random nodes)     |   (random nodes)        |
  |                      |                         |
  |-- add_passive(nodes from reply)                |
```
|
||||||
|
|
||||||
|
## RPC Support (Optional Feature)
|
||||||
|
|
||||||
|
When the `rpc` feature is enabled, `GossipApi` can also operate remotely:
|
||||||
|
|
||||||
|
```rust
// Server side
gossip.listen(rpc_endpoint).await;

// Client side
let api = GossipApi::connect(rpc_endpoint, addr);
let topic = api.subscribe_and_join(topic_id, bootstrap).await?;
```
|
||||||
|
|
||||||
|
This uses the `irpc`/`noq` crates for bidirectional streaming RPC. The `Join` request establishes a bidirectional stream:
|
||||||
|
- Client → Server: `Command` messages (Broadcast, BroadcastNeighbors, JoinPeers)
|
||||||
|
- Server → Client: `Event` messages (NeighborUp, NeighborDown, Received, Lagged)
|
||||||
|
|
||||||
|
## Channel Architecture
|
||||||
|
|
||||||
|
```
                 ┌─────────────────────────────────────────────────┐
                 │                      Actor                      │
                 │                                                 │
RPC/Local ──────►│ rpc_rx ◄────────────────────────────────────────│
Commands         │ local_rx ◄── HandleConnection, Shutdown         │
                 │                                                 │
                 │ in_event_tx ──► in_event_rx ────────────────────│──► proto::State::handle()
                 │                                                 │         │
                 │ ◄── OutEvent ───────────────────────────────────│◄────    │
                 │  │                                              │
                 │  ├──► SendMessage ──► peer.send_tx              │
                 │  ├──► EmitEvent ──► topic.event_sender          │
                 │  ├──► ScheduleTimer ──► timers                  │
                 │  ├──► DisconnectPeer ──► drop peer              │
                 │  └──► PeerData ──► address_lookup               │
                 │                                                 │
                 │ topic.event_sender ──► broadcast channel ───────│──► GossipReceiver
                 │                                                 │
                 │ command_rx ◄─── per-topic command streams ──────│◄── GossipSender
                 │                                                 │
                 └─────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
## Configuration Defaults Summary
|
||||||
|
|
||||||
|
| Parameter | Default | Source |
|-----------|---------|--------|
| Active view capacity | 5 | HyParView paper (p9) |
| Passive view capacity | 30 | HyParView paper (p9) |
| Active random walk length | 6 | HyParView paper (p9) |
| Passive random walk length | 3 | HyParView paper (p9) |
| Shuffle random walk length | 6 | HyParView paper (p9) |
| Shuffle active view count | 3 | HyParView paper (p9) |
| Shuffle passive view count | 4 | HyParView paper (p9) |
| Shuffle interval | 60s | Implementation choice |
| Neighbor request timeout | 500ms | Implementation choice |
| Graft timeout 1 | 80ms | Implementation choice |
| Graft timeout 2 | 40ms | Implementation choice |
| Dispatch timeout | 5ms | Implementation choice |
| Optimization threshold | 7 hops | PlumTree paper (p12) |
| Message cache retention | 30s | Implementation choice |
| Message ID retention | 90s | Implementation choice |
| Cache evict interval | 1s | Implementation choice |
| Max message size | 4096 bytes | Implementation choice |
| Send queue capacity | 64 messages | Implementation choice |
| To-actor channel capacity | 64 messages | Implementation choice |
| In-event channel capacity | 1024 messages | Implementation choice |
| Topic event channel capacity | 256 events | Implementation choice |
| Topic events default capacity | 2048 events | Implementation choice |
| Topic commands channel capacity | 64 commands | Implementation choice |
|
||||||
@@ -0,0 +1,176 @@
|
|||||||
|
# iroh-gossip: Utility Data Structures & Wire Format
|
||||||
|
|
||||||
|
## IndexSet (`proto::util::IndexSet`)
|
||||||
|
|
||||||
|
A wrapper around `indexmap::IndexSet` that provides random selection capabilities needed by HyParView:
|
||||||
|
|
||||||
|
```rust
pub(crate) struct IndexSet<T> {
    inner: indexmap::IndexSet<T>,
}
```
|
||||||
|
|
||||||
|
### Key Operations
|
||||||
|
|
||||||
|
| Method | Purpose |
|--------|---------|
| `insert(value)` | Add element (returns false if already present) |
| `remove(value)` | Remove by value (swap-remove, O(1)) |
| `remove_index(index)` | Remove by index (swap-remove) |
| `remove_random(rng)` | Remove a random element |
| `pick_random(rng)` | Get reference to random element |
| `pick_random_without(exclude, rng)` | Random element excluding certain elements |
| `pick_random_index(rng)` | Random index |
| `shuffled(rng)` | All elements in random order |
| `shuffled_and_capped(len, rng)` | First `len` elements after shuffle |
| `shuffled_without(exclude, rng)` | Random order excluding certain elements |
| `shuffled_without_and_capped(exclude, len, rng)` | Capped shuffle excluding elements |
| `iter_without(value)` | Iterator skipping a specific element |
|
||||||
|
|
||||||
|
These operations are critical for HyParView's random walks, shuffle exchanges, and passive view management.
|
||||||
|
|
||||||
|
## TimerMap (`proto::util::TimerMap`)
|
||||||
|
|
||||||
|
A priority queue of timer entries sorted by `Instant`, with stable ordering via a sequence counter:
|
||||||
|
|
||||||
|
```rust
pub struct TimerMap<T> {
    heap: BinaryHeap<TimerMapEntry<T>>,
    seq: u64,
}
```
|
||||||
|
|
||||||
|
Used by the protocol state machine for scheduling future events (shuffles, graft timeouts, cache eviction). The networking layer wraps this in an async-friendly `Timers` type that can `wait_next()`.
|
||||||
|
|
||||||
|
### Key Operations
|
||||||
|
|
||||||
|
| Method | Purpose |
|--------|---------|
| `insert(instant, item)` | Schedule a timer |
| `pop_before(limit)` | Pop the earliest entry if it's before `limit` |
| `drain_until(from)` | Drain all entries up to a time |
| `first()` | Get reference to earliest entry |
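
To make the "sorted by `Instant`, stable via a sequence counter" idea concrete, here is a minimal sketch built on `std::collections::BinaryHeap`. The names `Entry` and `TimerQueue` are hypothetical, the ordering is reversed by hand because `BinaryHeap` is a max-heap, and treating an entry exactly at `limit` as due is an assumption about the boundary, not something the crate is confirmed to do.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::time::{Duration, Instant};

/// Hypothetical heap entry: ordered by (time, seq) only, never by the item.
struct Entry<T> {
    time: Instant,
    seq: u64,
    item: T,
}

impl<T> PartialEq for Entry<T> {
    fn eq(&self, other: &Self) -> bool {
        self.time == other.time && self.seq == other.seq
    }
}
impl<T> Eq for Entry<T> {}
impl<T> PartialOrd for Entry<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<T> Ord for Entry<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed comparison so the max-heap yields the earliest (time, seq) first.
        (other.time, other.seq).cmp(&(self.time, self.seq))
    }
}

/// Hypothetical timer queue with insertion-order tie-breaking.
struct TimerQueue<T> {
    heap: BinaryHeap<Entry<T>>,
    seq: u64,
}

impl<T> TimerQueue<T> {
    fn new() -> Self {
        Self { heap: BinaryHeap::new(), seq: 0 }
    }

    /// Schedules `item` at `time`; the counter keeps equal deadlines in insertion order.
    fn insert(&mut self, time: Instant, item: T) {
        self.heap.push(Entry { time, seq: self.seq, item });
        self.seq += 1;
    }

    /// Pops the earliest entry if it is due at or before `limit` (boundary is an assumption).
    fn pop_before(&mut self, limit: Instant) -> Option<(Instant, T)> {
        let due = self.heap.peek().map_or(false, |e| e.time <= limit);
        if !due {
            return None;
        }
        self.heap.pop().map(|e| (e.time, e.item))
    }
}
```

Without the sequence counter, two timers scheduled for the same `Instant` could pop in either order, which would make a seeded simulation non-reproducible.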
|
||||||
|
|
||||||
|
## TimeBoundCache (`proto::util::TimeBoundCache`)
|
||||||
|
|
||||||
|
A `HashMap` where entries expire after a specified `Instant`:
|
||||||
|
|
||||||
|
```rust
pub struct TimeBoundCache<K, V> {
    map: HashMap<K, (Instant, V)>,
    expiry: TimerMap<K>,
}
```
|
||||||
|
|
||||||
|
Used by PlumTree for:
|
||||||
|
- `received_messages: TimeBoundCache<MessageId, ()>` — deduplication
|
||||||
|
- `cache: TimeBoundCache<MessageId, Gossip>` — message payload storage for Graft replies
|
||||||
|
|
||||||
|
### Key Operations
|
||||||
|
|
||||||
|
| Method | Purpose |
|--------|---------|
| `insert(key, value, expires)` | Insert with expiration |
| `contains_key(key)` | Check existence |
| `get(key)` | Get value |
| `expires(key)` | Get expiration time |
| `expire_until(instant)` | Remove all expired entries, returns count |
| `len()` / `is_empty()` | Size queries |
|
||||||
|
|
||||||
|
The `expire_until` method handles re-insertions correctly: if a key is re-inserted with a later expiration time while an older entry for it is still in the expiry queue, the stale queue entry is skipped when it comes due, and the key stays in the map until its current expiration time.
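
The re-insertion rule can be illustrated with a small self-contained sketch. `ExpiringMap` is a hypothetical name, and a `BTreeMap` keyed by `(deadline, sequence)` stands in for the `TimerMap`; the crate's actual implementation may differ in detail.

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical cache: values expire at a deadline, stale queue entries are ignored.
struct ExpiringMap<K, V> {
    map: HashMap<K, (Instant, V)>,
    /// (deadline, insertion sequence) -> key; stands in for the `TimerMap`.
    expiry: BTreeMap<(Instant, u64), K>,
    seq: u64,
}

impl<K: Hash + Eq + Clone, V> ExpiringMap<K, V> {
    fn new() -> Self {
        Self { map: HashMap::new(), expiry: BTreeMap::new(), seq: 0 }
    }

    /// Inserts or overwrites `key`; any older queue entry for it is left in place.
    fn insert(&mut self, key: K, value: V, expires: Instant) {
        self.map.insert(key.clone(), (expires, value));
        self.expiry.insert((expires, self.seq), key);
        self.seq += 1;
    }

    fn contains_key(&self, key: &K) -> bool {
        self.map.contains_key(key)
    }

    /// Removes every entry whose current deadline is at or before `now`; returns the count.
    fn expire_until(&mut self, now: Instant) -> usize {
        let mut removed = 0;
        while let Some(entry) = self.expiry.first_entry() {
            if entry.key().0 > now {
                break;
            }
            let ((deadline, _), key) = entry.remove_entry();
            // Only evict if this queue entry still matches the key's current deadline.
            let is_current = matches!(self.map.get(&key), Some((current, _)) if *current == deadline);
            if is_current {
                self.map.remove(&key);
                removed += 1;
            }
        }
        removed
    }
}
```

Leaving stale entries in the queue and checking them lazily avoids a search through the queue on every re-insert.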
|
||||||
|
|
||||||
|
## Wire Format
|
||||||
|
|
||||||
|
### Frame Encoding
|
||||||
|
|
||||||
|
Messages are encoded using `postcard` (a `no_std`-friendly, `serde`-compatible format) and sent as length-prefixed frames:
|
||||||
|
|
||||||
|
```
┌──────────────┬──────────────┬─────────────────┐
│ Length (u32) │ TopicHeader  │ Message Payload │
│ big-endian   │ postcard     │ postcard        │
└──────────────┴──────────────┴─────────────────┘
```
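
The length prefix can be sketched with the standard library alone. This is an illustration under stated assumptions: the payload is treated as opaque bytes (the `postcard` encoding step is left out), `encode_frame`, `decode_frame`, and `FrameError` are hypothetical names, and applying the size limit to the payload length is an assumption about where the check happens.

```rust
/// Hypothetical error type; the crate reports oversized writes as `WriteError::TooLarge`.
#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge,
}

/// Prefixes `payload` with its length as a big-endian `u32`.
fn encode_frame(payload: &[u8], max_size: usize) -> Result<Vec<u8>, FrameError> {
    if payload.len() > max_size {
        return Err(FrameError::TooLarge);
    }
    let len = u32::try_from(payload.len()).map_err(|_| FrameError::TooLarge)?;
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(payload);
    Ok(frame)
}

/// Tries to read one frame from the start of `buf`.
/// Returns the payload and the number of bytes consumed, or `None` if more data is needed.
fn decode_frame(buf: &[u8], max_size: usize) -> Result<Option<(&[u8], usize)>, FrameError> {
    let Some(prefix) = buf.get(..4) else {
        return Ok(None);
    };
    let len = u32::from_be_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    if len > max_size {
        // Reject before buffering, so a peer cannot make us allocate for a huge frame.
        return Err(FrameError::TooLarge);
    }
    match buf.get(4..4 + len) {
        Some(payload) => Ok(Some((payload, 4 + len))),
        None => Ok(None),
    }
}
```

Returning `Ok(None)` for a short buffer lets a read loop keep appending bytes until a whole frame is available.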
|
||||||
|
|
||||||
|
### Stream Protocol
|
||||||
|
|
||||||
|
Each QUIC unidirectional stream is dedicated to a single topic. The stream begins with a `StreamHeader`:
|
||||||
|
|
||||||
|
```rust
pub(crate) struct StreamHeader {
    pub(crate) topic_id: TopicId,
}
```
|
||||||
|
|
||||||
|
All subsequent frames on that stream carry messages for that topic. When a `Disconnect` message is sent, the stream is closed (via `finish()`).
|
||||||
|
|
||||||
|
### Message Types on Wire
|
||||||
|
|
||||||
|
```rust
pub enum Message<PI> {
    Swarm(hyparview::Message<PI>), // Membership messages
    Gossip(plumtree::Message),     // Broadcast messages
}
```
|
||||||
|
|
||||||
|
Here `PI` is `PublicKey` (a 32-byte Ed25519 public key) in the networking layer.
|
||||||
|
|
||||||
|
The `MessageKind` classification is used for metrics:
|
||||||
|
|
||||||
|
| Kind | Message Types |
|------|--------------|
| `Data` | `Gossip` messages (actual content) |
| `Control` | All Swarm messages, plus `Prune`, `Graft`, `IHave` |
|
||||||
|
|
||||||
|
### Message Size Limits
|
||||||
|
|
||||||
|
- Default max message size: 4096 bytes (minimum: 512)
|
||||||
|
- The header size is computed at compile time via `postcard::experimental::serialized_size`
|
||||||
|
- Actual payload capacity = `max_message_size - header_size`
|
||||||
|
|
||||||
|
The `SendLoop` checks message size before writing and returns `WriteError::TooLarge` if exceeded.
|
||||||
|
|
||||||
|
## PeerData & Address Propagation
|
||||||
|
|
||||||
|
The `PeerData` type is an opaque `Bytes` wrapper used in HyParView messages. In the `net` layer, it carries addressing information:
|
||||||
|
|
||||||
|
```rust
struct AddrInfo {
    relay_url: Option<RelayUrl>,
    direct_addresses: BTreeSet<SocketAddr>,
}
```
|
||||||
|
|
||||||
|
This is serialized with `postcard` and passed as `PeerData` in `Join`, `ForwardJoin`, and `Neighbor` messages. When received, the `AddrInfo` is decoded and fed into `GossipAddressLookup`, which implements iroh's `AddressLookup` trait, allowing gossip-discovered addresses to be used for future connections.
|
||||||
|
|
||||||
|
## GossipAddressLookup
|
||||||
|
|
||||||
|
```rust
pub(crate) struct GossipAddressLookup {
    endpoints: NodeMap, // Arc<RwLock<BTreeMap<EndpointId, StoredEndpointInfo>>>
    _task_handle: Arc<AbortOnDropHandle<()>>, // Background eviction task
}
```
|
||||||
|
|
||||||
|
Key behaviors:
|
||||||
|
- **Merging**: When adding addresses for an already-known endpoint, new addresses are merged (union of direct addresses, relay URL is overwritten)
|
||||||
|
- **Expiration**: Entries expire after 5 minutes, with eviction checks every 30 seconds
|
||||||
|
- **Integration**: Implements `iroh::address_lookup::AddressLookup`, returning data with provenance "gossip"
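
The merge and expiry behaviors listed above can be sketched as follows. Everything here is illustrative: `AddressBook` and `Stored` are hypothetical names, the endpoint ID is a plain `u32`, the relay URL is a `String`, and two points are assumptions rather than confirmed behavior: the relay URL is overwritten even when the update carries `None`, and the retention window restarts on every update.

```rust
use std::collections::btree_map::Entry;
use std::collections::{BTreeMap, BTreeSet};
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// Simplified `AddrInfo`: the relay URL is a plain string in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct AddrInfo {
    relay_url: Option<String>,
    direct_addresses: BTreeSet<SocketAddr>,
}

/// Hypothetical stored record with the time of its last update.
struct Stored {
    info: AddrInfo,
    last_updated: Instant,
}

/// Hypothetical address book keyed by a stand-in endpoint ID.
struct AddressBook {
    entries: BTreeMap<u32, Stored>,
    retention: Duration,
}

impl AddressBook {
    fn new(retention: Duration) -> Self {
        Self { entries: BTreeMap::new(), retention }
    }

    /// Merges `info` into any existing record for `id`.
    fn add(&mut self, id: u32, info: AddrInfo, now: Instant) {
        match self.entries.entry(id) {
            Entry::Occupied(mut slot) => {
                let stored = slot.get_mut();
                // Union of direct addresses.
                stored.info.direct_addresses.extend(info.direct_addresses);
                // Assumption: the relay URL is replaced unconditionally.
                stored.info.relay_url = info.relay_url;
                // Assumption: an update restarts the retention window.
                stored.last_updated = now;
            }
            Entry::Vacant(slot) => {
                slot.insert(Stored { info, last_updated: now });
            }
        }
    }

    fn get(&self, id: u32) -> Option<&AddrInfo> {
        self.entries.get(&id).map(|s| &s.info)
    }

    /// Drops records older than the retention window; meant to run periodically.
    fn evict(&mut self, now: Instant) {
        let retention = self.retention;
        self.entries
            .retain(|_, s| now.duration_since(s.last_updated) < retention);
    }
}
```

With the defaults quoted above, `retention` would be 5 minutes and `evict` would be called every 30 seconds by the background task.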
|
||||||
|
|
||||||
|
## Dialer
|
||||||
|
|
||||||
|
```rust
struct Dialer {
    endpoint: Endpoint,
    pending: JoinSet<(EndpointId, Option<Result<Connection, ConnectError>>)>,
    pending_dials: HashMap<EndpointId, CancellationToken>,
}
```
|
||||||
|
|
||||||
|
The `Dialer` manages outgoing connection attempts:
|
||||||
|
- Queues a dial via `queue_dial(endpoint_id, alpn)`
|
||||||
|
- Checks for pending dials to avoid duplicate connections
|
||||||
|
- Supports cancellation of in-progress dials
|
||||||
|
- Returns completed connections via `next_conn()`
|
||||||
|
|
||||||
|
When a dial succeeds, the connection is passed to `handle_connection()`. When a dial fails and the peer is not already active, a `PeerDisconnected` event is injected into the protocol state.
|
||||||
@@ -0,0 +1,169 @@
|
|||||||
|
# iroh-gossip: Testing & Simulation
|
||||||
|
|
||||||
|
## Test Infrastructure
|
||||||
|
|
||||||
|
The crate includes four layers of testing:
|
||||||
|
|
||||||
|
### 1. Unit Tests (in source files)
|
||||||
|
|
||||||
|
Unit tests are embedded in each module file behind `#[cfg(test)]`:
|
||||||
|
|
||||||
|
| Module | Tests |
|--------|-------|
| `proto/hyparview.rs` | Not listed here (tests live in the module file) |
| `proto/plumtree.rs` | `optimize_tree`, `spoofed_messages_are_ignored`, `cache_is_evicted` |
| `proto.rs` | `hyparview_smoke`, `plumtree_smoke`, `quit` |
| `net.rs` | `gossip_net_smoke`, `subscription_cleanup` |
| `api.rs` | `test_rpc`, `ensure_gossip_topic_is_sync` |
| `proto/util.rs` | `indexset`, `timer_map`, `hex`, `time_bound_cache` |
|
||||||
|
|
||||||
|
### 2. Protocol Simulator (`proto::sim`)
|
||||||
|
|
||||||
|
The `sim` module (behind `test-utils` feature) provides a deterministic network simulator:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Available when feature = "test-utils"
|
||||||
|
pub mod sim;
|
||||||
|
```
|
||||||
|
|
||||||
|
This allows testing the protocol logic without any real networking, using seeded RNG for reproducibility.
|
||||||
|
|
||||||
|
The simulator creates a `Network` of virtual nodes, each running their own `proto::State`. Events are processed in discrete "trips" (round-trips), allowing controlled testing of protocol behavior.
|
||||||
|
|
||||||
|
### 3. Simulation Binary (`simulator` feature)
|
||||||
|
|
||||||
|
The crate includes a CLI simulator (behind `simulator` feature) that can run large-scale simulations:
|
||||||
|
|
||||||
|
```
|
||||||
|
cargo run --bin sim --features simulator
|
||||||
|
```
|
||||||
|
|
||||||
|
This uses `rayon` for parallel execution and `comfy-table` for result output.
|
||||||
|
|
||||||
|
### 4. Integration Tests (`tests/sim.rs`)
|
||||||
|
|
||||||
|
Behind the `test-utils` feature, provides end-to-end protocol testing.
|
||||||
|
|
||||||
|
## Key Test Patterns
|
||||||
|
|
||||||
|
### Protocol-Level Smoke Test
|
||||||
|
|
||||||
|
From `proto.rs`:
|
||||||
|
|
||||||
|
```rust
#[test]
fn hyparview_smoke() {
    let rng = ChaCha12Rng::seed_from_u64(0);
    let mut config = Config::default();
    config.membership.active_view_capacity = 2;
    let mut network = Network::new(config.into(), rng);
    for i in 0..4 {
        network.insert(i);
    }
    let t: TopicId = [0u8; 32].into();

    // Join nodes
    network.command(0, t, Command::Join(vec![1, 2]));
    network.command(1, t, Command::Join(vec![2]));
    network.command(2, t, Command::Join(vec![]));
    network.run_trips(3);

    // Verify events and connections
    assert_eq!(network.events_sorted(), expected);
    assert_eq!(network.conns(), vec![(0, 1), (0, 2), (1, 2)]);
    assert!(network.check_synchronicity());
}
```
|
||||||
|
|
||||||
|
### PlumTree Optimization Test
|
||||||
|
|
||||||
|
From `plumtree.rs`:
|
||||||
|
|
||||||
|
```rust
#[test]
fn optimize_tree() {
    // When an IHave message arrives with fewer hops than the Gossip message,
    // and the difference exceeds optimization_threshold, the tree is restructured:
    // - The IHave sender is promoted to eager (Graft)
    // - The Gossip sender is demoted to lazy (Prune)
}
```
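
The rule in the comments above can be written as a small decision function. This is a sketch of the wording only, not the crate's code: `TreeAction` and `optimize` are hypothetical names, peers are plain `u32` values, and the strict `>` comparison follows the phrase "exceeds"; the real implementation may treat the boundary differently.

```rust
/// Hypothetical outcome of the optimization check.
#[derive(Debug, PartialEq)]
enum TreeAction {
    /// Leave the eager and lazy sets unchanged.
    Keep,
    /// Promote `graft` to eager and demote `prune` to lazy.
    Swap { graft: u32, prune: u32 },
}

/// Compares the hop count of a received Gossip with that of an IHave for the same message.
fn optimize(
    gossip_from: u32,
    gossip_round: u16,
    ihave_from: u32,
    ihave_round: u16,
    threshold: u16,
) -> TreeAction {
    // `saturating_sub` yields 0 when the IHave path is not shorter, so no swap happens.
    let saved_hops = gossip_round.saturating_sub(ihave_round);
    if saved_hops > threshold {
        TreeAction::Swap { graft: ihave_from, prune: gossip_from }
    } else {
        TreeAction::Keep
    }
}
```

With the default threshold of 7 hops, a Gossip that arrived after 10 hops and an IHave that arrived after 1 hop would trigger a swap, while a 5-hop Gossip would not.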
|
||||||
|
|
||||||
|
### Spoofed Message Test
|
||||||
|
|
||||||
|
```rust
#[test]
fn spoofed_messages_are_ignored() {
    // Messages where MessageId != blake3(content) are silently discarded
    let message = Message::Gossip(Gossip {
        content: content.clone(),
        id: MessageId::from_content(b"wrong_content"), // Spoofed!
        scope: DeliveryScope::Swarm(Round(1)),
    });
    state.handle(InEvent::RecvMessage(2, message), now, &mut io);
    // No events are emitted
}
```
|
||||||
|
|
||||||
|
### Networking Smoke Test
|
||||||
|
|
||||||
|
From `net.rs`:
|
||||||
|
|
||||||
|
```rust
#[tokio::test]
async fn gossip_net_smoke() {
    // Creates 3 endpoints with a relay server
    // Subscribes and joins a topic
    // Broadcasts messages and verifies reception
    // Uses real QUIC connections via iroh
}
```
|
||||||
|
|
||||||
|
## Metrics
|
||||||
|
|
||||||
|
The `Metrics` struct (in `src/metrics.rs`) uses `iroh_metrics::MetricsGroup`:
|
||||||
|
|
||||||
|
```rust
#[derive(Debug, Default, MetricsGroup)]
#[metrics(name = "gossip")]
pub struct Metrics {
    pub msgs_ctrl_sent: Counter,
    pub msgs_ctrl_recv: Counter,
    pub msgs_data_sent: Counter,
    pub msgs_data_recv: Counter,
    pub msgs_data_sent_size: Counter,
    pub msgs_data_recv_size: Counter,
    pub msgs_ctrl_sent_size: Counter,
    pub msgs_ctrl_recv_size: Counter,
    pub neighbor_up: Counter,
    pub neighbor_down: Counter,
    pub actor_tick_main: Counter,
    pub actor_tick_rx: Counter,
    pub actor_tick_endpoint: Counter,
    pub actor_tick_dialer: Counter,
    pub actor_tick_dialer_success: Counter,
    pub actor_tick_dialer_failure: Counter,
    pub actor_tick_in_event_rx: Counter,
    pub actor_tick_timers: Counter,
}
```
|
||||||
|
|
||||||
|
These are tracked both in the protocol state machine (for message counts) and in the actor event loop (for tick-level diagnostics). When the `metrics` feature is enabled, they are exported via Prometheus-compatible endpoints.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
### Academic Papers
|
||||||
|
|
||||||
|
- **HyParView**: Leitao, J., Pereira, J., & Rodrigues, L. (2007). "HyParView: A Membership Protocol for Reliable Gossip-Based Broadcast." [PDF](https://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf)
|
||||||
|
- **PlumTree**: Leitao, J., Pereira, J., & Rodrigues, L. (2007). "Epidemic Broadcast Trees." [PDF](https://asc.di.fct.unl.pt/~jleitao/pdf/srds07-leitao.pdf)
|
||||||
|
|
||||||
|
### Implementation Reference
|
||||||
|
|
||||||
|
- Bartosz Sypytkowski's example implementation: [gist](https://gist.github.com/Horusiath/84fac596101b197da0546d1697580d99)
|
||||||
|
|
||||||
|
### Related Projects
|
||||||
|
|
||||||
|
- [iroh](https://docs.rs/iroh) — The networking library that iroh-gossip integrates with
|
||||||
|
- [Earthstar](https://github.com/earthstar-project/earthstar) — Another PlumTree implementation referenced in code comments
|
||||||
|
|
||||||
|
### Crate Repository
|
||||||
|
|
||||||
|
- [github.com/n0-computer/iroh-gossip](https://github.com/n0-computer/iroh-gossip)
|
||||||
40
docs/research/references/iroh/iroh-gossip/README.md
Normal file
@@ -0,0 +1,40 @@
|
|||||||
|
# iroh-gossip Reference Documentation
|
||||||
|
|
||||||
|
This directory contains a deep-dive reference on how the `iroh-gossip` crate works, based on source code analysis of the repository at `/workspace/iroh-gossip`.
|
||||||
|
|
||||||
|
## Documents
|
||||||
|
|
||||||
|
| # | File | Topic |
|---|------|-------|
| 01 | [Overview & Architecture](01-overview-architecture.md) | Crate structure, module organization, design principles, features, dependencies |
| 02 | [HyParView Membership](02-hyparview-membership.md) | Swarm membership protocol: active/passive views, join procedure, shuffle mechanism, failure recovery, PeerData |
| 03 | [PlumTree Broadcast](03-plumtree-broadcast.md) | Epidemic broadcast trees: eager/lazy push, Graft/IHave/Prune, tree optimization, message deduplication, cache management |
| 04 | [State & Topic Coordination](04-state-and-topic.md) | Multi-topic state management, topic lifecycle, event routing between HyParView and PlumTree |
| 05 | [Net Actor & Networking](05-net-actor.md) | Actor model, event loop, connection management, Dialer, wire protocol, address lookup, topic state in the net layer |
| 06 | [API & Data Flow](06-api-data-flow.md) | Public API types, subscription model, event/command flow, channel architecture, configuration defaults |
| 07 | [Utilities & Wire Format](07-utilities-wire-format.md) | IndexSet, TimerMap, TimeBoundCache, serialization, PeerData/AddrInfo, Dialer internals |
| 08 | [Testing & Metrics](08-testing-metrics-refs.md) | Test infrastructure, simulation, key test patterns, metrics, references |
|
||||||
|
|
||||||
|
## Quick Reference
|
||||||
|
|
||||||
|
### Version
|
||||||
|
`iroh-gossip` v0.97.0
|
||||||
|
|
||||||
|
### ALPN
|
||||||
|
`/iroh-gossip/1`
|
||||||
|
|
||||||
|
### Core Protocols
|
||||||
|
- **HyParView**: Hybrid partial view membership (active view = 5, passive view = 30 by default)
|
||||||
|
- **PlumTree**: Epidemic broadcast trees (eager + lazy push with Graft/IHave optimization)
|
||||||
|
|
||||||
|
### Key Abstractions
|
||||||
|
- **TopicId**: 32-byte identifier for a topic/swarm
|
||||||
|
- **PeerIdentity**: Generic trait (instantiated as `PublicKey` in the net layer)
|
||||||
|
- **PeerData**: Opaque bytes exchanged on join (carries `AddrInfo` in net layer)
|
||||||
|
- **IO trait**: Interface for protocol output events (pure state machine, no IO)
|
||||||
|
|
||||||
|
### Wire Format
|
||||||
|
- Postcard (serde) encoding over QUIC unidirectional streams
|
||||||
|
- Length-prefixed frames (u32 length + postcard payload)
|
||||||
|
- Stream header with TopicId
|
||||||
|
- Max message size: 4096 bytes (configurable, minimum 512)
|
||||||
@@ -0,0 +1,104 @@
|
|||||||
|
# iroh-live: Overview and Architecture
|
||||||
|
|
||||||
|
## What It Is
|
||||||
|
|
||||||
|
iroh-live is a real-time audio/video streaming system built on top of [iroh](https://github.com/n0-computer/iroh) (QUIC-based P2P networking) and [Media over QUIC (MoQ)](https://moq.dev/). It handles the full pipeline: camera/mic capture → encoding → transport → decoding → rendering. Connections are peer-to-peer by default, with an optional relay server for browser access via WebTransport.
|
||||||
|
|
||||||
|
**Status:** Early tech preview. APIs are unstable. Windows support is missing. Audio-video sync is basic.
|
||||||
|
|
||||||
|
## Workspace Crates
|
||||||
|
|
||||||
|
| Crate | Description |
|-------|-------------|
| `iroh-live` | High-level API: `Live`, `Call`, `Room`, tickets, subscriptions |
| `iroh-moq` | MoQ transport layer over iroh/QUIC via `web-transport-iroh` |
| `iroh-live-relay` | Relay server bridging iroh P2P to browser WebTransport |
| `moq-media` | Media pipelines: capture, encode, decode, publish, subscribe, adaptive bitrate. No iroh dependency |
| `rusty-codecs` | Codec implementations (H264/openh264, AV1/rav1e + rav1d, Opus), hardware accel (VAAPI, V4L2, VideoToolbox) |
| `rusty-capture` | Cross-platform capture: PipeWire, V4L2, X11, ScreenCaptureKit, AVFoundation |
| `moq-media-egui` | egui integration for video rendering |
| `moq-media-dioxus` | dioxus-native integration for video rendering |
| `moq-media-android` | Android camera, EGL rendering, JNI bridge |
| `iroh-live-cli` | CLI tool (`irl`) for publishing, playing, calls, rooms, relay |
|
||||||
|
|
||||||
|
## Layer Architecture
|
||||||
|
|
||||||
|
Three distinct layers, each usable independently:
|
||||||
|
|
||||||
|
```
┌──────────────────────────────────────────────────────────┐
│                        iroh-live                         │
│  Session management, tickets, rooms, calls               │
│  Re-exports: moq-media, iroh-moq                         │
├──────────────────────────────────────────────────────────┤
│                        moq-media                         │
│  Media pipelines: LocalBroadcast, RemoteBroadcast,       │
│  codecs, adaptive bitrate, playout                       │
│  NO iroh dependency (transport-agnostic)                 │
├──────────────────────────────────────────────────────────┤
│                         iroh-moq                         │
│  MoQ session management, publish/subscribe over QUIC     │
│  Uses web-transport-iroh + moq-lite                      │
└──────────────────────────────────────────────────────────┘

Below moq-media:
  rusty-codecs  ─ codec implementations, hardware accel, wgpu rendering
  rusty-capture ─ platform-specific screen/camera capture
```
|
||||||
|
|
||||||
|
## Design Principles
|
||||||
|
|
||||||
|
1. **`&self` everywhere** — All public types use interior mutability. Safe to share across async tasks/threads without wrappers.
|
||||||
|
2. **Drop-based cleanup** — Dropping a `Call` closes it. Dropping `LocalBroadcast` tears down encoders. Dropping `VideoTrack` stops its decoder thread.
|
||||||
|
3. **Watcher for continuous state, Stream for discrete events** — Connection quality and catalog contents use `n0_watcher::Direct<T>`. Participant joins use `impl Stream`.
|
||||||
|
4. **Declarative intent, not mechanism** — `VideoTarget::default().max_pixels(1280*720)` describes what quality you need. The catalog selects the best rendition.
|
||||||
|
5. **moq-media is standalone** — A recording pipeline can use `LocalBroadcast`/`RemoteBroadcast` without iroh-live. The transport boundary is the `PacketSink`/`PacketSource` trait pair.
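
Principles 1 and 2 can be illustrated with a standard-library-only sketch. `WorkerHandle` is a hypothetical type invented for this example (it is not part of iroh-live); it only shows the general pattern of `&self` accessors over shared atomics plus a `Drop` impl that stops and joins a background thread.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle that owns a background thread, in the spirit of
/// `VideoTrack` owning its decoder thread. Not an iroh-live type.
pub struct WorkerHandle {
    stop: Arc<AtomicBool>,
    ticks: Arc<AtomicU64>,
    worker: Option<JoinHandle<()>>,
}

impl WorkerHandle {
    pub fn spawn() -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let ticks = Arc::new(AtomicU64::new(0));
        let (stop_t, ticks_t) = (stop.clone(), ticks.clone());
        let worker = thread::spawn(move || {
            // Stand-in for a decode loop: do work until asked to stop.
            while !stop_t.load(Ordering::Acquire) {
                ticks_t.fetch_add(1, Ordering::Relaxed);
                thread::sleep(Duration::from_millis(1));
            }
        });
        Self { stop, ticks, worker: Some(worker) }
    }

    /// `&self` accessor: the shared state sits behind atomics, so no `&mut` is needed.
    pub fn ticks(&self) -> u64 {
        self.ticks.load(Ordering::Relaxed)
    }

    /// Exposes the stop flag so the effect of dropping the handle can be observed.
    pub fn stop_flag(&self) -> Arc<AtomicBool> {
        self.stop.clone()
    }
}

impl Drop for WorkerHandle {
    fn drop(&mut self) {
        // Signal the thread, then wait for it so nothing outlives the handle.
        self.stop.store(true, Ordering::Release);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

Joining inside `drop` keeps teardown deterministic at the cost of blocking the dropping thread briefly; whether iroh-live joins or detaches its threads is not stated in this document.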

## Data Flow (End-to-End)

```
Publisher Side:
  capture source (rusty-capture, VideoSource trait)
       │
       ▼
  encoder pipeline (moq-media, dedicated OS thread)
       │
       ▼ EncodedFrame
  PacketSink (MoqPacketSink — starts new MoQ group on keyframe)
       │
       ▼ MoQ transport (iroh-moq, QUIC streams)

Subscriber Side:
  PacketSource (MoqPacketSource — reads ordered frames from MoQ)
       │
       ▼ MediaPacket
  decoder pipeline (moq-media, dedicated OS thread)
       │
       ▼ VideoFrame
  FramePacer (PTS-based sleep) or Sync (shared playout clock)
       │
       ▼
  renderer (wgpu texture upload or egui widget)
```

Encoder and decoder pipelines run on **dedicated OS threads**, not tokio tasks, so slow codec operations never block the async runtime. The `forward_packets` async task bridges the network-side `PacketSource` into an mpsc channel that the decoder thread reads synchronously.
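
The thread-plus-channel bridge can be sketched with the standard library alone. Everything below is an assumption made for illustration: `MediaPacket` and `VideoFrame` are simplified stand-ins, `spawn_decoder` is a hypothetical helper, and `std::sync::mpsc` replaces whatever channel type the real `forward_packets` task uses.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread::{self, JoinHandle};

/// Hypothetical encoded packet; the real `MediaPacket` carries more fields.
pub struct MediaPacket {
    pub pts_us: u64,
    pub payload: Vec<u8>,
}

/// Hypothetical decoded frame.
#[derive(Debug, PartialEq)]
pub struct VideoFrame {
    pub pts_us: u64,
    pub bytes: usize,
}

/// Spawns a dedicated OS thread that reads packets synchronously.
/// Returns the sending half (what a forwarding task would hold),
/// the frame receiver, and the join handle.
pub fn spawn_decoder(
    capacity: usize,
) -> (SyncSender<MediaPacket>, Receiver<VideoFrame>, JoinHandle<()>) {
    let (packet_tx, packet_rx) = sync_channel::<MediaPacket>(capacity);
    let (frame_tx, frame_rx) = sync_channel::<VideoFrame>(capacity);
    let handle = thread::spawn(move || {
        // Blocking `recv` is fine here: this thread is not an async worker.
        // The loop ends once every sender has been dropped.
        while let Ok(packet) = packet_rx.recv() {
            // Stand-in for a slow codec call.
            let frame = VideoFrame {
                pts_us: packet.pts_us,
                bytes: packet.payload.len(),
            };
            if frame_tx.send(frame).is_err() {
                break; // the consumer went away
            }
        }
    });
    (packet_tx, frame_rx, handle)
}
```

A bounded channel is used so that a stalled decoder applies backpressure to the forwarding side instead of buffering without limit; the actual capacity and overflow policy in moq-media are not documented here.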

## Key Dependencies

| Dependency | Purpose |
|------------|---------|
| `iroh` | QUIC endpoint, connection management, P2P connectivity |
| `iroh-gossip` | Gossip protocol for room participant discovery |
| `iroh-tickets` | Ticket serialization for `RoomTicket` |
| `iroh-smol-kv` | Distributed KV store for room state (gossip-backed) |
| `moq-lite` | Core MoQ protocol: BroadcastProducer, BroadcastConsumer, Track, Group |
| `hang` | Catalog management for broadcast metadata |
| `moq-mux` | MoQ multiplexing |
| `moq-relay` | Relay server implementation (used by iroh-live-relay) |
| `web-transport-iroh` | WebTransport over iroh QUIC connections |
| `n0-future` | Async utilities (FuturesUnordered, AbortOnDropHandle) |
| `n0-watcher` | Watchable/Direct reactive state |

## License

Dual-licensed: MIT OR Apache-2.0. Copyright 2025 N0, INC.
167
docs/research/references/iroh/iroh-live/02-core-api.md
Normal file
@@ -0,0 +1,167 @@
# iroh-live: Core API — Live, Call, Subscription, Ticket

## `Live` — Entry Point

The primary entry point for all iroh-live operations. Manages an iroh `Endpoint`, the MoQ transport (`Moq`), and optionally a `Gossip` instance for rooms.

### Construction

```rust
// Simple: from environment, accept incoming connections
let live = Live::from_env().await?.with_router().spawn();

// With gossip for rooms
let live = Live::from_env().await?.with_router().with_gossip().spawn();

// From an existing endpoint
let live = Live::builder(endpoint).with_router().with_gossip().spawn();

// Manual router mounting (when you have other protocols)
let router = live.register_protocols(Router::builder(endpoint));
let router = router.accept(other_protocol, other_handler);
let router = router.spawn();
```

### Key Methods

| Method | Description |
|--------|-------------|
| `publish(name, &LocalBroadcast)` | Register a broadcast for all connected peers |
| `subscribe(remote, name)` | Connect to a peer and subscribe to a broadcast → `Subscription` |
| `subscribe_media(remote, name, audio, config)` | Connect, subscribe, decode → `(MoqSession, MediaTracks)` |
| `join_room(ticket)` | Join a gossip-based multi-party room → `Room` |
| `endpoint()` | Access the underlying iroh `Endpoint` |
| `transport()` | Access the `Moq` transport for advanced operations |
| `gossip()` | Access the `Gossip` instance (if enabled) |
| `shutdown()` | Close all sessions, stop router, close endpoint |

### Builder Options

- **`with_router()`** — Spawns an internal `Router` so the endpoint accepts incoming MoQ sessions. Without this, only outbound connections work.
- **`with_gossip()`** — Creates a `Gossip` instance (required for rooms). Internally mounts on the Router if `with_router` is also set.
- **`gossip(gossip)`** — Use an externally-managed `Gossip` instance.

### Internal Architecture

`Live` holds:
- `endpoint: Endpoint` — iroh QUIC endpoint
- `moq: Moq` — Internal actor for session/broadcast management
- `gossip: Option<Gossip>` — For room coordination
- `router: Option<Router>` — For accepting incoming connections

The `from_env()` method reads `IROH_SECRET` for the secret key and generates one if not set. It uses the `N0` preset for relay and DNS discovery.

## `LiveTicket` — Connection Sharing

A serializable ticket that contains everything needed to connect to a publisher.

```rust
// Create a ticket
let ticket = LiveTicket::new(endpoint.addr(), "my-stream");

// Serialize to URI string (fits in QR codes)
let s = ticket.to_string();
// → "iroh-live:<base64url(postcard(EndpointAddr))>/my-stream"

// Deserialize
let parsed: LiveTicket = s.parse()?;

// With relay URLs for indirect connectivity
let ticket = LiveTicket::new(addr, "stream").with_relay_urls(vec![
    "https://relay.example.com".to_string(),
]);
```

**Format:** `iroh-live:<base64url(postcard(EndpointAddr))>/<name>`

Also supports legacy `name@base32` format for backward compatibility.

The ticket string is kept short enough for QR codes (< 2000 bytes). It uses postcard (binary) serialization with base64url encoding.
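
The URI layout can be illustrated by splitting a ticket string into its parts. `split_ticket` is a hypothetical helper written for this example, not the `LiveTicket` parser: it stops before the base64url and postcard decoding steps, assumes the payload is unpadded base64url, and does not handle the legacy `name@base32` form.

```rust
/// Scheme prefix from the documented format.
const SCHEME: &str = "iroh-live:";

/// Hypothetical helper: splits a ticket URI into the still-encoded address
/// payload and the broadcast name. Decoding the payload (base64url, then
/// postcard) is left to the real `LiveTicket` parser.
pub fn split_ticket(s: &str) -> Option<(&str, &str)> {
    let rest = s.strip_prefix(SCHEME)?;
    // The base64url alphabet contains no '/', so the first '/' is the separator.
    let (payload, name) = rest.split_once('/')?;
    if payload.is_empty() || name.is_empty() {
        return None;
    }
    // Assumption: unpadded base64url, i.e. only A-Z, a-z, 0-9, '-' and '_'.
    let valid = payload
        .bytes()
        .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_');
    valid.then_some((payload, name))
}
```

Because the separator search stops at the first `/`, a broadcast name that itself contains slashes survives intact in this sketch; whether the real parser permits such names is not stated here.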

## `Call` — 1:1 Video Call

A convenience wrapper over MoQ primitives for bidirectional calls.

### Flow

1. Each side creates a `LocalBroadcast` with video/audio configured
2. **Dialer:** `Call::dial(live, remote_addr, local_broadcast)` — connects, publishes "call" broadcast, subscribes to remote's "call" broadcast
3. **Acceptor:** `Call::accept(session, local_broadcast)` — accepts an incoming session, publishes and subscribes

The broadcast name is always `"call"` — this is hardcoded (`CALL_BROADCAST_NAME`).

```rust
// Dialer side
let local = LocalBroadcast::new();
local.video().set_source(camera, VideoCodec::H264, [VideoPreset::P720])?;
let call = Call::dial(&live, remote_addr, local).await?;

// Access remote media
let remote_broadcast = call.remote();
let video = remote_broadcast.video()?;

// Wait for call to end
let reason = call.closed().await;
```

### Key Properties

- `call.local()` → `&LocalBroadcast` (your media)
- `call.remote()` → `&RemoteBroadcast` (peer's media)
- `call.signals()` → `watch::Receiver<NetworkSignals>` (for adaptive bitrate)
- `call.close()` — closes with error code 0 and reason "call ended"
- `call.closed()` → waits for close, returns `DisconnectReason` (LocalClose, RemoteClose, TransportError)

Auto-wires stats recording and network signal production on the connection.

## `Subscription` — Subscribe Handle

Created by `Live::subscribe()`. Wraps the MoQ session, remote broadcast, and network signals into a single handle. The constructor auto-wires stats recording and signal production.

```rust
let sub = live.subscribe(remote_addr, "stream").await?;

// Access components
sub.session()    // &MoqSession
sub.broadcast()  // &RemoteBroadcast
sub.signals()    // &watch::Receiver<NetworkSignals>

// Convenience methods
let tracks = sub.media(&audio_backend, Default::default()).await?;
let tracks = sub.media_with_decoders::<DefaultDecoders>(&audio_backend, config).await?;

// Decompose
let (session, broadcast, signals) = sub.into_parts();
```

## `DisconnectReason`

```rust
pub enum DisconnectReason {
    LocalClose,
    RemoteClose,
    TransportError,
}
```

Derived from the QUIC connection's close reason. Used by `Call::closed()`.

## `util` Module

### `secret_key_from_env()`

Loads the iroh secret key from the `IROH_SECRET` environment variable. Generates a new key if not set, printing the hex-encoded key for reuse.

### `spawn_signal_producer(conn, shutdown)`

Spawns a background task that polls QUIC connection path stats every 200ms and produces `NetworkSignals` for adaptive rendition selection. Returns a `watch::Receiver<NetworkSignals>`.

Computes:
- **RTT** — from `selected_path.rtt()`
- **Loss rate** — delta-based (lost packets / (sent + lost) over the interval)
- **Available bandwidth** — estimated from congestion window: `cwnd * 8 / rtt`
- **Congestion events** — monotonically increasing counter
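
The two derived quantities can be written out as plain arithmetic. The sketch below assumes a hypothetical `PathSnapshot` of cumulative counters; its field names are illustrative and do not mirror the real connection stats type. It also assumes `cwnd` is in bytes and the result is in bits per second, which is what the factor of 8 suggests.

```rust
use std::time::Duration;

/// Hypothetical snapshot of cumulative path counters; field names are
/// illustrative and do not mirror the real connection stats struct.
#[derive(Clone, Copy)]
pub struct PathSnapshot {
    pub sent_packets: u64,
    pub lost_packets: u64,
    pub cwnd_bytes: u64,
    pub rtt: Duration,
}

/// Loss rate over one polling interval: lost / (sent + lost), using deltas
/// between two consecutive snapshots.
pub fn loss_rate(prev: &PathSnapshot, cur: &PathSnapshot) -> f64 {
    let sent = cur.sent_packets.saturating_sub(prev.sent_packets);
    let lost = cur.lost_packets.saturating_sub(prev.lost_packets);
    let total = sent + lost;
    if total == 0 {
        0.0 // idle interval: nothing sent, nothing lost
    } else {
        lost as f64 / total as f64
    }
}

/// Bandwidth estimate in bits per second: cwnd (bytes) * 8 / rtt (seconds).
/// Returns `None` when no RTT sample is available yet.
pub fn available_bandwidth_bps(cur: &PathSnapshot) -> Option<f64> {
    let rtt = cur.rtt.as_secs_f64();
    if rtt > 0.0 {
        Some(cur.cwnd_bytes as f64 * 8.0 / rtt)
    } else {
        None
    }
}
```

For example, a 125,000-byte congestion window with a 100 ms RTT gives roughly 10 Mbit/s, and 10 lost packets against 90 sent in one interval gives a loss rate of 0.1.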

### `spawn_stats_recorder(conn, net_stats, shutdown)`

Records connection stats (RTT, loss rate, bandwidth, path type) into `NetStats` for debug overlay display. Runs every 200ms.
164
docs/research/references/iroh/iroh-live/03-iroh-moq-transport.md
Normal file
@@ -0,0 +1,164 @@
# iroh-moq: MoQ Transport Layer

## Overview

`iroh-moq` is the transport bridge between iroh's QUIC endpoint and the moq-lite broadcast protocol. It manages connections, session lifecycle, broadcast routing, and subscription handling. This is the only crate in the workspace that directly interacts with QUIC transport — everything above uses `Moq`/`MoqSession` as the interface.

**ALPN:** `moq-lite-03`

## Core Types

### `Moq` — Transport Manager

The top-level transport entry point. Wraps an iroh `Endpoint` and runs an internal actor (`Actor`) that handles all connection and broadcast management.

```rust
let moq = Moq::new(endpoint);
```

**Internal architecture:**

`Moq` holds an `mpsc::Sender<ActorMessage>` to communicate with a spawned actor task. The actor manages:
- A `HashMap<EndpointId, MoqSession>` of active sessions
- A `HashMap<BroadcastName, BroadcastProducer>` of locally published broadcasts
- A `JoinSet` of session tasks (each tracks session lifetime)
- A `FuturesUnordered` of pending connect tasks
- A `broadcast::Sender<MoqSession>` for incoming session notifications

**Key methods:**

| Method | Description |
|--------|-------------|
| `new(endpoint)` | Creates transport and spawns the actor |
| `protocol_handler()` | Returns `MoqProtocolHandler` for Router registration |
| `publish(name, producer)` | Register a broadcast for all current and future sessions |
| `connect(remote)` | Connect to remote peer, deduplicating existing connections |
| `incoming_sessions()` | Get stream of incoming sessions |
| `published_broadcasts()` | List currently published broadcast names |
| `shutdown()` | Cancel the shutdown token, closing all sessions |

### `MoqProtocolHandler`

Implements iroh's `ProtocolHandler` trait. When the Router receives an incoming connection with the `moq-lite-03` ALPN:

1. Accepts the raw QUIC `Connection`
2. Wraps it in a `web_transport_iroh::Session::raw(connection)`
3. Completes the moq-lite server handshake: `MoqSession::session_accept(wt_session)`
4. Sends the session to the actor via `ActorMessage::HandleSession`

### `MoqSession` — Single Peer Connection

Represents a MoQ connection with one remote peer. Created via:
- `Moq::connect()` (outbound, client role)
- `IncomingSession::accept()` (inbound, server role)

```rust
// Outbound
let session = moq.connect(remote_addr).await?;

// Inbound
let incoming = incoming_session.next().await?;
let session = incoming.accept(); // or incoming.reject()
```

**Internal structure:**

```rust
pub struct MoqSession {
    wt_session: web_transport_iroh::Session,
    _moq_session: Arc<moq_lite::Session>,
    publish: OriginProducer,   // For announcing local broadcasts
    subscribe: OriginConsumer, // For consuming remote broadcasts
}
```

The `OriginProducer`/`OriginConsumer` pair comes from moq-lite. The session creates them before the handshake:

- **Client (connect):** Creates `OriginProducer` for publish and `OriginConsumer` for subscribe, then `Client::new().with_publish(...).with_consume(...).connect(session)`
- **Server (accept):** Same pattern with `Server::new().with_publish(...).with_consume(...).accept(session)`

**Key methods:**

| Method | Description |
|--------|-------------|
| `subscribe(name)` | Wait for remote to announce broadcast, return `BroadcastConsumer` |
| `publish(name, consumer)` | Make a broadcast available to remote peer |
| `conn()` | Reference to underlying QUIC `Connection` (for stats) |
| `remote_id()` | Remote peer's `EndpointId` |
| `close(code, reason)` | Close the session |
| `closed()` | Wait for session to close, returns `SessionError` |
| `origin_producer()` | Direct access to moq-lite publish origin |
| `origin_consumer()` | Direct access to moq-lite subscribe origin |

### `IncomingSession` / `IncomingSessionStream`

`IncomingSession` wraps a `MoqSession` that has completed the handshake. Provides:
- `remote_id()` — the connecting peer's identity
- `accept()` — returns the `MoqSession`
- `reject()` — closes with error code 1

`IncomingSessionStream` is an async stream that yields `IncomingSession` values. Uses a `broadcast::Receiver<MoqSession>` internally, handling lag by skipping missed sessions.

## Actor Internals

The `Actor` is the core event loop for the `Moq` transport:

```
loop {
    select! {
        msg = inbox.recv() → handle_message(msg)
        session_closed → remove session, log
        broadcast_closed → remove from publishing map
        connect_completed → handle_session or reply to caller
    }
}
```

### Message Types

```rust
enum ActorMessage {
    HandleSession { session: Box<MoqSession> },
    LocalBroadcast { broadcast_name: String, producer: BroadcastProducer },
    Connect { remote: EndpointAddr, reply: oneshot::Sender<...> },
    GetPublished { reply: oneshot::Sender<Vec<String>> },
}
```

### Connection Deduplication

When `Connect` is received for a peer that already has an active session, the existing session is returned immediately. If a connection attempt is already in progress, the oneshot reply is queued and notified when the connection completes.
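
The deduplication rule can be modelled with two maps. This is a synchronous sketch under stated assumptions: `PeerId`, `Session`, `ConnectTable`, and `ConnectAction` are hypothetical names, a `std::sync::mpsc::Sender` stands in for the oneshot reply channel, and failed dials are not modelled.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::mpsc::Sender;

/// Hypothetical stand-ins: a peer id and a cheaply clonable session handle.
pub type PeerId = u32;

#[derive(Clone, Debug, PartialEq)]
pub struct Session(pub PeerId);

/// What happened to a connect request.
pub enum ConnectAction {
    /// An active session existed and was sent on the reply channel.
    Reused,
    /// A dial is already in flight; the reply was queued behind it.
    Queued,
    /// No session and no dial in flight: the caller must start one.
    StartDial,
}

#[derive(Default)]
pub struct ConnectTable {
    sessions: HashMap<PeerId, Session>,
    pending: HashMap<PeerId, Vec<Sender<Session>>>,
}

impl ConnectTable {
    pub fn connect(&mut self, peer: PeerId, reply: Sender<Session>) -> ConnectAction {
        if let Some(session) = self.sessions.get(&peer) {
            let _ = reply.send(session.clone());
            return ConnectAction::Reused;
        }
        match self.pending.entry(peer) {
            Entry::Occupied(mut waiters) => {
                waiters.get_mut().push(reply);
                ConnectAction::Queued
            }
            Entry::Vacant(slot) => {
                slot.insert(vec![reply]);
                ConnectAction::StartDial
            }
        }
    }

    /// Called when a dial finishes: wake every waiter, then store the session.
    pub fn dial_completed(&mut self, session: Session) {
        for reply in self.pending.remove(&session.0).unwrap_or_default() {
            let _ = reply.send(session.clone());
        }
        self.sessions.insert(session.0, session);
    }
}
```

Send errors are ignored on purpose: a caller that gave up waiting has dropped its receiver, and that should not affect the other waiters.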

### Broadcast Fan-out

When a `LocalBroadcast` is published via `Moq::publish()`:
1. The actor stores the `BroadcastProducer` in its `publishing` map
2. It immediately announces the broadcast to all existing sessions by calling `session.publish(name, producer.consume())` on each
3. For future sessions, the actor iterates `publishing` entries and announces each one
4. A `FuturesUnordered` tracks when each broadcast closes, removing it from the map
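
The four steps above can be sketched as a small state machine. The types here are simplified stand-ins invented for the example: `Session` only records what was announced to it, and the `publishing` map stores unit values where the real actor holds a `BroadcastProducer`.

```rust
use std::collections::{BTreeMap, HashMap};

pub type PeerId = u32;

/// Hypothetical session that only records which broadcasts were announced to it.
#[derive(Default)]
pub struct Session {
    pub announced: Vec<String>,
}

impl Session {
    fn publish(&mut self, name: &str) {
        self.announced.push(name.to_string());
    }
}

#[derive(Default)]
pub struct Actor {
    /// Stand-in for the `publishing` map; the value would be a `BroadcastProducer`.
    /// A `BTreeMap` keeps the announce order deterministic for this sketch.
    publishing: BTreeMap<String, ()>,
    sessions: HashMap<PeerId, Session>,
}

impl Actor {
    /// Steps 1 and 2: remember the broadcast, announce it to every live session.
    pub fn publish(&mut self, name: &str) {
        self.publishing.insert(name.to_string(), ());
        for session in self.sessions.values_mut() {
            session.publish(name);
        }
    }

    /// Step 3: a new session is told about everything already published.
    pub fn handle_session(&mut self, peer: PeerId) {
        let mut session = Session::default();
        for name in self.publishing.keys() {
            session.publish(name);
        }
        self.sessions.insert(peer, session);
    }

    /// Step 4: when a broadcast closes, forget it so later sessions skip it.
    pub fn broadcast_closed(&mut self, name: &str) {
        self.publishing.remove(name);
    }

    pub fn announced_to(&self, peer: PeerId) -> &[String] {
        match self.sessions.get(&peer) {
            Some(session) => &session.announced,
            None => &[],
        }
    }
}
```

Both paths end in the same state: a session sees every broadcast that was live at any point while it was connected, regardless of whether the session or the broadcast came first.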

### Session Lifecycle

When a session is established (either incoming or outgoing):
1. All currently published broadcasts are announced to it
2. It's stored in `sessions` by `EndpointId`
3. A session task is spawned that waits for the session to close
4. If there were pending connect requests for this peer, they're fulfilled

## Error Types

```rust
enum Error {
    Connect(ConnectError),                        // iroh connection failure
    Moq(moq_lite::Error),                         // MoQ protocol error
    Server(web_transport_iroh::ServerError),      // WebTransport server error
    InternalConsistencyError(LiveActorDiedError), // Actor died
    Request(WriteError),                          // QUIC write error
}

enum SubscribeError {
    NotAnnounced,                // Track was not announced
    Closed,                      // Track was closed
    SessionClosed(SessionError), // Session closed
}
```
185
docs/research/references/iroh/iroh-live/04-rooms.md
Normal file
@@ -0,0 +1,185 @@
# iroh-live: Rooms — Multi-Party Coordination

## Overview

The `rooms` module provides multi-party room coordination over iroh-gossip. Participants discover each other via a gossip topic, automatically connect and subscribe to each other's broadcasts, and receive `RoomEvent` notifications as peers join, publish, and leave.

## Core Types

### `Room`

The main room handle. Created via `Room::new(live, ticket)`. Spawns an internal actor that manages all peer coordination.

```rust
// Create a room (generates a random topic)
let ticket = RoomTicket::generate();
let room = Room::new(&live, ticket.clone()).await?;

// Or join an existing room
let room = Room::new(&live, existing_ticket).await?;
```

**Methods:**
- `recv()` — Wait for next `RoomEvent`
- `try_recv()` — Non-blocking event check
- `ticket()` — Get a ticket that includes this peer as a bootstrap node
- `split()` — Decompose into `(RoomEvents, RoomHandle)` for use in separate tasks
- `publish(name, &LocalBroadcast)` — Publish a broadcast to the room
- `set_chat_publisher(publisher)` — Register a chat publisher
- `send_chat(text)` — Send a chat message

### `RoomHandle`

Cloneable handle for publishing into a room. Obtained from `Room::split()`. Can be shared across tasks.

```rust
let (events, handle) = room.split();

// In one task: receive events
while let Some(event) = events.recv().await {
    match event { ... }
}

// In another task: publish
handle.publish("camera", &broadcast).await?;
handle.send_chat("Hello!").await?;
handle.set_display_name("Alice").await?;
```

### `RoomTicket`

```rust
pub struct RoomTicket {
    pub bootstrap: Vec<EndpointId>, // Bootstrap peer IDs for gossip
    pub topic_id: TopicId,          // Gossip topic identifier
}
```

Serialized via `iroh_tickets` (binary format). Can be created from:
- `RoomTicket::generate()` — Random topic, no bootstrap
- `RoomTicket::new(topic_id, bootstrap)` — Specific topic and peers
- `RoomTicket::new_from_env()` — From `IROH_LIVE_ROOM` or `IROH_LIVE_TOPIC` env vars

### `RoomEvent`

```rust
pub enum RoomEvent {
    RemoteAnnounced {
        remote: EndpointId,
        broadcasts: Vec<String>,
    },
    BroadcastSubscribed {
        session: Box<MoqSession>,
        broadcast: Box<RemoteBroadcast>,
    },
    PeerJoined {
        remote: EndpointId,
        display_name: Option<String>,
    },
    PeerLeft {
        remote: EndpointId,
    },
    ChatReceived {
        remote: EndpointId,
        message: ChatMessage,
    },
}
```

## Room Actor — Internal Architecture

The room actor is a spawned task that manages the gossip KV subscription and coordinates all peer connections.

### State

```rust
struct Actor {
    me: EndpointId,
    _gossip: Gossip,
    live: Live,
    active_subscribe: HashSet<BroadcastId>,           // (EndpointId, name) pairs
    active_publish: HashSet<String>,                  // Locally published broadcast names
    known_peers: HashMap<EndpointId, Option<String>>, // display names
    connecting: ConnectingFutures,                    // In-flight subscribe attempts
    subscribe_closed: FuturesUnordered,               // Track subscription lifetimes
    publish_closed: FuturesUnordered,                 // Track publish lifetimes
    chat_messages: FuturesUnordered,                  // Active chat subscribers
    chat_publisher: Option<ChatPublisher>,
    display_name: Option<String>,
    event_tx: mpsc::Sender<RoomEvent>,
    kv: iroh_smol_kv::Client,                         // Distributed KV for peer state
    kv_writer: WriteScope,                            // KV write access
}
```

### Gossip KV for Peer Discovery

The room uses `iroh-smol-kv` over gossip for peer state coordination. Each peer writes their `PeerState` to key `b"s"`:

```rust
struct PeerState {
    broadcasts: Vec<String>,
    display_name: Option<String>,
}
```

Serialized with postcard (binary format — **no `skip_serializing_if`** allowed since postcard is positional).

### Event Loop

```
loop {
    select! {
        update = gossip_kv_stream.next() → handle_gossip_update
        msg = inbox.recv() → handle_api_message
        result = connecting.next() → subscribe succeeded/failed
        broadcast_closed → remove from active, maybe emit PeerLeft
        publish_closed → remove from active_publish, update KV
        chat_message → emit ChatReceived
    }
}
```

### Peer Discovery Flow

1. Peer A publishes a broadcast via `handle.publish("camera", &broadcast)`
2. Actor publishes to MoQ AND updates gossip KV with `PeerState { broadcasts: ["camera"], display_name: ... }`
3. Peer B's gossip KV stream receives the update
4. Peer B's actor checks `known_peers` — if new, emits `PeerJoined`
5. Peer B's actor checks `active_subscribe` — if new broadcast, initiates `live.subscribe(remote, name)`
6. When subscription succeeds, Peer B emits `BroadcastSubscribed`
7. If the broadcast has a chat track, a chat subscriber is spawned
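
Steps 4 and 5 of the flow above can be sketched as a pure function over the two bookkeeping sets. This is a simplified model, not the actor's code: `Discovery` and `Outcome` are hypothetical names, `PeerId` replaces `EndpointId`, the subscribe call is represented by a `StartSubscribe` value rather than performed, and display-name changes for already-known peers are not modelled.

```rust
use std::collections::{HashMap, HashSet};

pub type PeerId = u32;

/// Mirrors the documented `PeerState` shape.
pub struct PeerState {
    pub broadcasts: Vec<String>,
    pub display_name: Option<String>,
}

/// Simplified outcomes of one gossip update (hypothetical, not `RoomEvent`).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    PeerJoined { remote: PeerId, display_name: Option<String> },
    /// The real actor would start `live.subscribe(remote, name)` here.
    StartSubscribe { remote: PeerId, name: String },
}

#[derive(Default)]
pub struct Discovery {
    known_peers: HashMap<PeerId, Option<String>>,
    active_subscribe: HashSet<(PeerId, String)>,
}

impl Discovery {
    pub fn handle_gossip_update(&mut self, remote: PeerId, state: PeerState) -> Vec<Outcome> {
        let mut out = Vec::new();
        // Step 4: first sighting of this peer.
        if !self.known_peers.contains_key(&remote) {
            self.known_peers.insert(remote, state.display_name.clone());
            out.push(Outcome::PeerJoined { remote, display_name: state.display_name });
        }
        // Step 5: subscribe only to broadcasts that are not already tracked.
        for name in state.broadcasts {
            if self.active_subscribe.insert((remote, name.clone())) {
                out.push(Outcome::StartSubscribe { remote, name });
            }
        }
        out
    }
}
```

Because both checks are idempotent, a repeated or re-delivered gossip update produces no duplicate events, which matters on a gossip transport where the same state can arrive more than once.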

### Chat

Chat uses a dedicated MoQ track within each broadcast. Each message is a single MoQ group containing one frame of UTF-8 text. The sender identity comes from the broadcast context (peer ID), not the message payload.

### Connection Lifecycle

- When a broadcast closes (`subscribe_closed`), it's removed from `active_subscribe`
- If this was the last broadcast from that peer, `PeerLeft` is emitted
- When a publish closes (`publish_closed`), the KV is updated to remove that broadcast
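
The "last broadcast from that peer" rule reduces to a set removal followed by a membership scan. The function below is a hypothetical sketch of that check; `PeerId` stands in for `EndpointId`, and the boolean result stands in for emitting `PeerLeft`.

```rust
use std::collections::HashSet;

pub type PeerId = u32;

/// Removes a closed subscription and reports whether the peer has no
/// broadcasts left, which is when `PeerLeft` would be emitted.
pub fn on_subscribe_closed(
    active_subscribe: &mut HashSet<(PeerId, String)>,
    remote: PeerId,
    name: &str,
) -> bool {
    active_subscribe.remove(&(remote, name.to_string()));
    !active_subscribe.iter().any(|(peer, _)| *peer == remote)
}
```

The scan is linear in the number of active subscriptions, which is an acceptable trade-off for a sketch; a per-peer counter would avoid it if rooms grew large.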

### `RoomPublisherSync`

A convenience wrapper for the common pattern of publishing camera+audio and optionally screen share into a room:

```rust
let publisher = RoomPublisherSync::new(room_handle, audio_backend);
publisher.set_state(&PublishOpts::default())?;
```

Automatically publishes a "camera" broadcast and manages a "screen" broadcast when screen sharing is toggled on.

## API Messages

```rust
enum ApiMessage {
    Publish { name: String, producer: BroadcastProducer },
    SendChat { text: String },
    SetChatPublisher { publisher: ChatPublisher },
    SetDisplayName { name: String },
}
```

These are sent from `RoomHandle` to the actor via an mpsc channel.
105
docs/research/references/iroh/iroh-live/05-relay.md
Normal file
@@ -0,0 +1,105 @@
# iroh-live-relay: Browser Bridging

## Overview

The relay server bridges iroh P2P streams to browser clients via WebTransport. Browsers cannot speak iroh's QUIC protocol directly, so the relay accepts WebTransport connections and either serves locally-published broadcasts or pulls them from remote iroh publishers on demand.

**Architecture:**

```
iroh-live publish --(iroh P2P)--> iroh-live-relay <--(WebTransport)-- browser
browser --(WebTransport)--> iroh-live-relay --(iroh P2P)--> iroh-live subscribe
```

## Components

### `RelayConfig` (CLI Configuration)

```rust
pub struct RelayConfig {
    pub bind: SocketAddr,      // QUIC/WebTransport bind (default: [::]:4443)
    pub http_bind: SocketAddr, // HTTP static files bind (default: same as bind)
}
```

Flattenable into a clap CLI via `#[command(flatten)]`.

### `run(config)` — Main Server Loop

The main entry point. Sets up:

1. **QUIC/WebTransport server** — Uses `moq-native::ServerConfig` with:
   - QUIC backend: `noq` (a custom QUIC implementation)
   - iroh endpoint integration
   - Self-signed TLS certificates (dev mode) for `localhost`
   - Max streams: `moq_relay::DEFAULT_MAX_STREAMS`

2. **iroh endpoint** — Binds an iroh endpoint for P2P connectivity, prints its ID

3. **moq-relay Cluster** — The broadcast routing engine. Manages broadcast lifecycle: when all subscribers disconnect, the broadcast is removed.

4. **HTTP server** — Axum router serving:
   - `GET /certificate.sha256` — TLS fingerprint for dev mode
   - `GET /` — Web viewer landing page
   - `GET /{path}` — Static file serving with CORS
   - Embedded via `include_dir!` from `web/dist/`

5. **Pull mode** — If iroh endpoint is available, creates a `PullState` for on-demand remote broadcast fetching

6. **Connection loop** — Accepts incoming connections, parses the URL path as a `LiveTicket`, and if valid, triggers a pull before running the connection

### `PullState` — On-Demand Remote Fetching

When a browser connects with a broadcast name that is a valid `LiveTicket`, the relay:

1. Checks if the broadcast already exists in the cluster (fast path)
2. If not, connects to the remote publisher via iroh-live's `Moq::connect()`
3. Subscribes to the remote broadcast
4. Publishes the consumer into the local cluster under the ticket string as the name
5. Spawns a keepalive task that holds the session until it closes

**Concurrency:** Duplicate concurrent pulls for the same ticket are deduplicated using a `HashMap<String, Arc<Notify>>`. Waiters block on the `Notify` until the first connector finishes.

```rust
pub(crate) struct PullState {
    live: iroh_live::Live,
    cluster: Cluster,
    connecting: Arc<Mutex<HashMap<String, Arc<Notify>>>>,
}
```
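
The pull deduplication can be approximated without an async runtime. The sketch below is an analogue, not the relay's implementation: it swaps `tokio::sync::Notify` for a `Mutex<bool>` plus `Condvar` gate, uses blocking threads instead of tasks, and the names `Gate`, `PullDedup`, and `pull_once` are invented for the example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Per-ticket gate: `done` flips to true once the first puller finishes.
#[derive(Default)]
struct Gate {
    done: Mutex<bool>,
    cv: Condvar,
}

/// Std-only stand-in for the `connecting` map.
#[derive(Default)]
pub struct PullDedup {
    connecting: Mutex<HashMap<String, Arc<Gate>>>,
}

impl PullDedup {
    /// Runs `pull` at most once per ticket among concurrent callers.
    /// Returns true if this caller performed the pull, false if it waited.
    pub fn pull_once(&self, ticket: &str, pull: impl FnOnce()) -> bool {
        let (gate, leader) = {
            let mut map = self.connecting.lock().unwrap();
            let existing = map.get(ticket).cloned();
            match existing {
                Some(gate) => (gate, false),
                None => {
                    let gate = Arc::new(Gate::default());
                    map.insert(ticket.to_string(), gate.clone());
                    (gate, true)
                }
            }
        };
        if leader {
            // The map lock is released here, so other tickets proceed in parallel.
            pull();
            self.connecting.lock().unwrap().remove(ticket);
            *gate.done.lock().unwrap() = true;
            gate.cv.notify_all();
        } else {
            // Checking `done` under the lock avoids a lost wakeup if the
            // leader finished before this waiter started waiting.
            let mut done = gate.done.lock().unwrap();
            while !*done {
                done = gate.cv.wait(done).unwrap();
            }
        }
        leader
    }
}
```

Once the first pull finishes, its entry is removed, so a later caller would pull again in this sketch; in the relay that case is covered by the cluster fast path described in step 1.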

### Web Viewer

The relay embeds a SolidJS + TypeScript web application compiled by Vite. It uses:
- `@moq/watch` — Web component for watching streams via WebCodecs
- `@moq/publish` — Web component for publishing from browser camera/mic
- WebTransport — For QUIC connectivity from the browser

Watch URLs: `https://relay:4443/?name=<BROADCAST_OR_TICKET>`

### Data Directory

The relay persists data to `$IROH_LIVE_RELAY_DATA` (or the platform default). This includes:
- iroh secret key (`iroh_secret_key`) — ensures endpoint ID stability across restarts
- TLS certificates

### TLS and Certificates

Currently **self-signed only**. ACME/Let's Encrypt is planned but not implemented. In dev mode, browsers need `--ignore-certificate-errors` or the relay's fingerprint (served at `/certificate.sha256`) for WebTransport to work.

## Authentication

No authentication is implemented yet. The relay accepts all connections. MoQ supports token-based authentication, which could be added.

## CLI Binary

```rust
// iroh-live-relay/src/main.rs
#[derive(Parser)]
struct Cli {
    #[command(flatten)]
    relay: RelayConfig,
}
```

Must call `rustls::crypto::aws_lc_rs::default_provider().install_default()` before `run()`.
@@ -0,0 +1,304 @@
# moq-media: Media Pipelines

## Overview

`moq-media` owns the media pipeline: broadcast management, codec orchestration, playout timing, adaptive bitrate, and audio backend. **It has no dependency on iroh** — it works with any transport that implements `PacketSource` and `PacketSink`. This makes it usable for recording pipelines, studio links, and camera dashboards without RTC.

## Module Structure

```
moq-media/
├── lib.rs            — Re-exports and feature-gated modules
├── publish.rs        — LocalBroadcast, VideoPublisher, AudioPublisher
├── subscribe.rs      — RemoteBroadcast, VideoTrack, AudioTrack, MediaTracks
├── transport.rs      — PacketSource/PacketSink traits, MoqPacketSource, MoqPacketSink
├── net.rs            — NetworkSignals (RTT, loss rate, available bandwidth)
├── adaptive.rs       — Adaptive rendition switching algorithm
├── playout.rs        — PlaybackPolicy, SyncMode
├── chat.rs           — ChatPublisher, ChatSubscriber (MoQ track-based)
├── frame_channel.rs  — Single-frame channel (last-writer-wins for video)
├── sync.rs           — Shared playout clock (Sync) for A/V sync
├── stats.rs          — Metric, Label, NetStats, EncodeStats, RenderStats, etc.
├── pipeline.rs       — Pipeline orchestration
├── pipeline/         — VideoEncoderPipeline, AudioEncoderPipeline, VideoDecoderPipeline, etc.
├── audio_backend.rs  — AudioBackend trait and device enumeration
├── audio_backend/    — Platform-specific audio backends (cpal, etc.)
├── capture.rs        — Camera/screen capture integration
├── source_spec.rs    — VideoInput, PreEncodedTrack
├── test_util.rs      — Test utilities (feature-gated)
└── processing/       — Scale, color conversion, etc.
```

## Publish Pipeline — `LocalBroadcast`

`LocalBroadcast` manages encoder pipelines and publishes a catalog that subscribers use to discover available renditions. It owns a `BroadcastProducer` (from moq-lite) and coordinates video and audio track lifecycles.
|
||||||
|
|
||||||
|
### Construction
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let broadcast = LocalBroadcast::new();
|
||||||
|
broadcast.video().set_source(camera, VideoCodec::H264, [VideoPreset::P720])?;
|
||||||
|
broadcast.audio().set(mic, AudioCodec::Opus, [AudioPreset::Hq])?;
|
||||||
|
|
||||||
|
// Or pre-encoded sources
|
||||||
|
broadcast.video().set(VideoInput::pre_encoded("video/h264-pi", config, factory))?;
|
||||||
|
```
|
||||||
|
|
||||||
|
### Slot Handles
|
||||||
|
|
||||||
|
- `broadcast.video()` → `VideoPublisher` (borrows `&self`)
|
||||||
|
- `broadcast.audio()` → `AudioPublisher` (borrows `&self`)
|
||||||
|
|
||||||
|
Both use interior mutability. Calling `set()` tears down any existing pipeline and installs the new one.
|
||||||
|
|
||||||
|
### Video Input Modes
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum VideoInput {
|
||||||
|
Renditions(VideoRenditions), // Raw source → multiple encoded renditions (simulcast)
|
||||||
|
PreEncoded(Vec<PreEncodedTrack>), // Already-encoded tracks pass through
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**`VideoRenditions`** holds a `SharedVideoSource` and a map of rendition names to encoder factories. Multiple renditions share the same source via `watch::Receiver<Option<VideoFrame>>`. Slow encoders never cause backpressure on the source — intermediate frames are silently skipped.
|
||||||
|
|
||||||
|
**`PreEncodedTrack`** is for hardware encoders that produce compressed output directly (e.g., rpicam-vid on Raspberry Pi). Each track carries a name, `VideoConfig`, and a factory closure that creates a fresh source per subscriber.
|
||||||
|
|
||||||
|
### SharedVideoSource
|
||||||
|
|
||||||
|
Runs the capture source on a dedicated OS thread. Parks when no subscribers are connected (releasing camera/screen resources) and unparks when the first subscriber arrives. Uses `AtomicU32` subscriber counting with proper memory ordering (`AcqRel`/`Acquire`).
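The subscriber-counting scheme can be illustrated in a few lines. This is a sketch under assumptions: `SubscriberCount` and its method names are hypothetical, and only the `AtomicU32` type and the `AcqRel`/`Acquire` orderings come from the description above.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical subscriber counter for a shared capture source.
#[derive(Default)]
struct SubscriberCount(AtomicU32);

impl SubscriberCount {
    /// Returns true when this call added the first subscriber,
    /// i.e. the capture thread should be unparked.
    fn subscribe(&self) -> bool {
        self.0.fetch_add(1, Ordering::AcqRel) == 0
    }

    /// Returns true when this call removed the last subscriber,
    /// i.e. the capture thread may park and release the device.
    fn unsubscribe(&self) -> bool {
        self.0.fetch_sub(1, Ordering::AcqRel) == 1
    }

    /// Read by the capture thread before grabbing the next frame.
    fn is_idle(&self) -> bool {
        self.0.load(Ordering::Acquire) == 0
    }
}
```

Returning a flag from `subscribe` and `unsubscribe` lets exactly one caller observe each 0-to-1 and 1-to-0 transition, so only that caller needs to unpark or park the thread.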
|
||||||
|
|
||||||
|
Frames are distributed via `watch::Sender<Option<VideoFrame>>` — always contains the latest frame, so slow encoders never block the source.
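The latest-frame semantics can be modelled without tokio. The following is a rough sketch, assuming a version counter is an acceptable substitute for the change tracking in `watch`; `LatestSlot` and its methods are hypothetical names.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical single-slot channel: the writer overwrites, readers skip ahead.
#[derive(Clone)]
struct LatestSlot<T> {
    inner: Arc<Mutex<(u64, Option<T>)>>, // (version, latest value)
}

impl<T: Clone> LatestSlot<T> {
    fn new() -> Self {
        Self { inner: Arc::new(Mutex::new((0, None))) }
    }

    /// Never waits for readers: the previous value is simply replaced.
    fn publish(&self, value: T) {
        let mut slot = self.inner.lock().unwrap();
        slot.0 += 1;
        slot.1 = Some(value);
    }

    /// Returns the newest value if it is newer than `last_seen`,
    /// updating `last_seen` so the same frame is not returned twice.
    fn latest_since(&self, last_seen: &mut u64) -> Option<T> {
        let slot = self.inner.lock().unwrap();
        if slot.0 == *last_seen {
            return None;
        }
        *last_seen = slot.0;
        slot.1.clone()
    }
}
```

A reader that falls behind sees only the most recent value, which is the behaviour that lets a slow encoder skip intermediate frames.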
|
||||||
|
|
||||||
|
### Demand-Driven Track Startup
|
||||||
|
|
||||||
|
The broadcast's run loop (`LocalBroadcast::run_dynamic`) calls `producer.requested_track().await` to wait for subscriber demand. When a subscriber requests a specific rendition:
|
||||||
|
|
||||||
|
1. The loop looks up the rendition in the current `VideoInput` or `AudioRenditions`
|
||||||
|
2. It starts the corresponding encoder pipeline on a dedicated OS thread
|
||||||
|
3. When all subscribers disconnect (tracked via `track.unused().await`), the pipeline is stopped
|
||||||
|
|
||||||
|
As a result, encoder threads run only while at least one subscriber is consuming the track.
|
||||||
|
|
||||||
|
### Catalog
|
||||||
|
|
||||||
|
`LocalBroadcast` maintains a catalog track (hang's built-in catalog mechanism) listing all available video and audio renditions with codec configuration, dimensions, and bitrate. Updated whenever video or audio is set/cleared.
|
||||||
|
|
||||||
|
Catalog format follows the `hang::catalog::Catalog` structure with `Video` and `Audio` entries, each containing a `BTreeMap<String, Config>` of rendition names to configurations.
|
||||||
|
|
||||||
|
### Encoder Pipeline Architecture
|
||||||
|
|
||||||
|
All encoder pipelines run on **dedicated OS threads** (`spawn_thread`), not tokio tasks. Codec operations are CPU-intensive and sometimes block on hardware (VAAPI, V4L2), so running on tokio tasks would starve other async work.
|
||||||
|
|
||||||
|
Communication with the async runtime:
|
||||||
|
- **VideoEncoderPipeline**: reads `SharedVideoSource` via `watch::Receiver`, writes encoded frames to `MoqPacketSink`
|
||||||
|
- **AudioEncoderPipeline**: reads from `AudioSource`, writes to `MoqPacketSink`
|
||||||
|
- **PreEncodedVideoPipeline**: reads from `PreEncodedVideoSource`, writes to `MoqPacketSink`
|
||||||
|
|
||||||
|
### Chat
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let chat_publisher = broadcast.enable_chat()?;
|
||||||
|
chat_publisher.send("Hello!")?;
|
||||||
|
|
||||||
|
// Subscriber side
|
||||||
|
if let Some(chat_sub) = remote_broadcast.chat() {
|
||||||
|
let msg = chat_sub.recv().await;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Each chat message is a single MoQ group with one frame of UTF-8 text. The track name is `"chat"` with priority 10.
|
||||||
|
|
||||||
|
## Subscribe Pipeline — `RemoteBroadcast`
|
||||||
|
|
||||||
|
`RemoteBroadcast` wraps a `BroadcastConsumer` and watches its catalog for available video and audio renditions. Created with a `BroadcastConsumer` and a `PlaybackPolicy`.
|
||||||
|
|
||||||
|
### Construction
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let broadcast = RemoteBroadcast::new("stream-name", consumer).await?;
|
||||||
|
// Or with explicit policy
|
||||||
|
let broadcast = RemoteBroadcast::with_playback_policy("stream", consumer, policy).await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
On construction, spawns a catalog-watching task that publishes snapshots via `Watchable<CatalogSnapshot>`.
|
||||||
|
|
||||||
|
### `CatalogSnapshot`
|
||||||
|
|
||||||
|
Point-in-time view of the broadcast's catalog. Derefs to `hang::Catalog`. Carries a sequence number for change detection.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let catalog = broadcast.catalog();
|
||||||
|
catalog.video_renditions() // Iterator of rendition names sorted by width
|
||||||
|
catalog.audio_renditions() // Iterator of audio rendition names
|
||||||
|
catalog.select_video_rendition(Quality::High)? // Best match for quality
|
||||||
|
catalog.has_video()
|
||||||
|
catalog.has_audio()
|
||||||
|
catalog.has_chat()
|
||||||
|
catalog.user() // User metadata from publisher
|
||||||
|
```
|
||||||
|
|
||||||
|
### Rendition Selection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Quality { Highest, High, Mid, Low }
|
||||||
|
|
||||||
|
pub struct VideoTarget {
|
||||||
|
pub max_pixels: Option<u32>,
|
||||||
|
pub max_bitrate_kbps: Option<u32>,
|
||||||
|
pub rendition: Option<String>, // Pin to specific rendition
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Each `Quality` variant maps to a `VideoTarget` cap; for example, `Quality::High` maps to `max_pixels(1280*720)`. If `rendition` is set, it takes priority over the caps.
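How a target might be resolved against a catalog can be sketched as follows. This is an illustration only: the `Rendition` struct, the `select` function, and the behaviour when nothing fits (returning `None`) are assumptions, while the three `VideoTarget` fields mirror the struct shown above.

```rust
/// Mirrors the `VideoTarget` fields shown above.
#[derive(Default)]
struct VideoTarget {
    max_pixels: Option<u32>,
    max_bitrate_kbps: Option<u32>,
    rendition: Option<String>,
}

/// Hypothetical catalog entry; the real catalog stores codec configs.
struct Rendition {
    name: &'static str,
    pixels: u32,
    bitrate_kbps: u32,
}

/// Picks the pinned rendition if one is named, otherwise the largest
/// rendition that fits within both caps.
fn select<'a>(renditions: &'a [Rendition], target: &VideoTarget) -> Option<&'a Rendition> {
    if let Some(pinned) = &target.rendition {
        return renditions.iter().find(|r| r.name == pinned.as_str());
    }
    renditions
        .iter()
        .filter(|r| target.max_pixels.map_or(true, |max| r.pixels <= max))
        .filter(|r| target.max_bitrate_kbps.map_or(true, |max| r.bitrate_kbps <= max))
        .max_by_key(|r| r.pixels)
}
```

A real implementation may prefer to fall back to the lowest rendition when no candidate satisfies the caps.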
|
||||||
|
|
||||||
|
### VideoTrack
|
||||||
|
|
||||||
|
Represents a decoded video stream from a remote broadcast. The decoder runs on a dedicated OS thread.
|
||||||
|
|
||||||
|
**Creation flow:**
|
||||||
|
|
||||||
|
1. Pick a rendition (via `VideoTarget` or explicit name)
|
||||||
|
2. Create `TrackConsumer` from `BroadcastConsumer`, wrap in `OrderedConsumer` with `PlaybackPolicy::max_latency`
|
||||||
|
3. Wrap in `MoqPacketSource`
|
||||||
|
4. A `forward_packets` async task reads from `MoqPacketSource` → `mpsc` channel
|
||||||
|
5. Decoder thread reads `mpsc` → decoder → output via `Sync` playout clock (or `FramePacer`)
|
||||||
|
6. Output channel: `FrameReceiver<VideoFrame>` (latest-frame wins, suitable for rendering)
|
||||||
|
|
||||||
|
**Frame access:**
|
||||||
|
- `track.try_recv()` — Returns latest frame, draining older buffered frames (for game loops)
|
||||||
|
- `track.next_frame().await` — Async wait for next frame
|
||||||
|
- `track.has_frame()` — Check without consuming
|
||||||
|
|
||||||
|
**Adaptive rendition switching:**
|
||||||
|
```rust
|
||||||
|
track.enable_adaptation(broadcast, signals, config, decode_config)?;
|
||||||
|
track.disable_adaptation();
|
||||||
|
track.is_adaptive();
|
||||||
|
track.selected_rendition();
|
||||||
|
track.set_rendition_mode(RenditionMode::Fixed("video/h264-360p".into()));
|
||||||
|
track.set_rendition_mode(RenditionMode::Auto);
|
||||||
|
track.rendition_watcher(); // Direct<String> watcher for rendition changes
|
||||||
|
```
|
||||||
|
|
||||||
|
### AudioTrack
|
||||||
|
|
||||||
|
Same pattern as `VideoTrack` but sends decoded samples to an `AudioSink` (typically cpal + sonora). The audio decoder thread runs a 10ms tick loop.
|
||||||
|
|
||||||
|
### MediaTracks
|
||||||
|
|
||||||
|
Convenience struct combining `RemoteBroadcast` with optional `VideoTrack` and `AudioTrack`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct MediaTracks {
|
||||||
|
pub broadcast: RemoteBroadcast,
|
||||||
|
pub video: Option<VideoTrack>,
|
||||||
|
pub audio: Option<AudioTrack>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Lifecycle
|
||||||
|
|
||||||
|
Both `VideoTrack` and `AudioTrack` use drop-based cleanup. Dropping cancels the decoder thread (via `CancellationToken`) and the `forward_packets` task (via `AbortOnDropHandle`). The `OrderedConsumer` is dropped, signaling the transport that the track is no longer needed.
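The drop-based pattern can be shown with a plain thread and a shared flag. This sketch assumes an `AtomicBool` in place of `CancellationToken` and joins the worker in `drop`, which the real tracks may not do; `TrackHandle` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle that owns a worker thread and stops it on drop.
struct TrackHandle {
    cancel: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl TrackHandle {
    fn spawn() -> Self {
        let cancel = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancel);
        let worker = thread::spawn(move || {
            // Placeholder for the decode loop: poll the flag between units of work.
            while !flag.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1));
            }
        });
        Self { cancel, worker: Some(worker) }
    }
}

impl Drop for TrackHandle {
    fn drop(&mut self) {
        // Signal cancellation, then wait for the worker to exit.
        self.cancel.store(true, Ordering::Release);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

Tying shutdown to `Drop` means callers cannot forget to stop the thread: letting the handle go out of scope is enough.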
|
||||||
|
|
||||||
|
## Transport Abstraction — `PacketSource` / `PacketSink`
|
||||||
|
|
||||||
|
The transport boundary between moq-media and the network:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait PacketSource: Send + 'static {
|
||||||
|
fn read(&mut self) -> impl Future<Output = Result<Option<MediaPacket>>> + Send;
|
||||||
|
}
|
||||||
|
|
||||||
|
pub trait PacketSink: Send + 'static {
|
||||||
|
fn write(&mut self, packet: EncodedFrame) -> Result<()>;
|
||||||
|
fn finish(&mut self) -> Result<()>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**`MoqPacketSink`** wraps an `OrderedProducer`. When it receives an `EncodedFrame` with `is_keyframe = true`, it calls `keyframe()` on the producer to start a new MoQ group. This keyframe-to-group mapping is how subscribers can join at any group boundary.
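The keyframe-to-group mapping can be sketched with an in-memory sink. This is not the `MoqPacketSink` implementation: `GroupingSink` is hypothetical, `EncodedFrame` is reduced to two fields, and opening a group for a leading non-keyframe is an assumption made so the example never discards data.

```rust
/// Minimal stand-in for an encoded frame; only the fields needed here.
struct EncodedFrame {
    is_keyframe: bool,
    payload: Vec<u8>,
}

/// Hypothetical sink that records groups in memory instead of writing to MoQ.
#[derive(Default)]
struct GroupingSink {
    groups: Vec<Vec<Vec<u8>>>,
}

impl GroupingSink {
    fn write(&mut self, frame: EncodedFrame) {
        // A keyframe opens a new group; so does the very first frame (assumption).
        if frame.is_keyframe || self.groups.is_empty() {
            self.groups.push(Vec::new());
        }
        if let Some(group) = self.groups.last_mut() {
            group.push(frame.payload);
        }
    }
}
```

Because every group begins with a keyframe, a subscriber that starts reading at any group boundary can decode without earlier frames.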
|
||||||
|
|
||||||
|
**`MoqPacketSource`** wraps an `OrderedConsumer` and reads frames, converting them to `MediaPacket`.
|
||||||
|
|
||||||
|
**`PipeSink` / `PipeSource`** — In-memory pipe for local encode→decode without network (testing, local preview).
|
||||||
|
|
||||||
|
## Adaptive Rendition Switching
|
||||||
|
|
||||||
|
The adaptation algorithm runs in a background task that monitors `NetworkSignals` and decides whether to switch to a different video rendition.
|
||||||
|
|
||||||
|
### Algorithm
|
||||||
|
|
||||||
|
Renditions are ranked by pixel count (highest first). The algorithm maintains state across ticks:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Decision {
|
||||||
|
Hold, // Stay on current rendition
|
||||||
|
Downgrade(usize), // Switch to lower at index
|
||||||
|
Emergency, // Drop to lowest immediately
|
||||||
|
StartProbe(usize), // Try upgrading to index
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Emergency** (immediate): Loss rate ≥ 20% → drop to lowest rendition
|
||||||
|
|
||||||
|
**Downgrade** (sustained 500ms): Loss rate ≥ 10% OR available bandwidth < 85% of current rendition's bitrate
|
||||||
|
|
||||||
|
**Upgrade probe** (sustained 4s good conditions): Loss ≤ 2%, bandwidth ≥ 120% of next-higher rendition's bitrate → start 3-second probe on the higher rendition
|
||||||
|
|
||||||
|
**Probe abort**: Loss ≥ 5% or new congestion events during probe → abort, 8s cooldown
|
||||||
|
|
||||||
|
**Post-downgrade cooldown**: 4s after any downgrade before probes are allowed
|
||||||
|
|
||||||
|
### Implementation
|
||||||
|
|
||||||
|
The adaptation task (`adaptation_task_v2`) creates new `VideoDecoderPipeline`s that write to the same `FrameSender` via `with_sender()`. The frame channel stays the same while the underlying decoder pipeline gets swapped. When switching:
|
||||||
|
|
||||||
|
1. Create a new decoder pipeline for the target rendition
|
||||||
|
2. Drop the old pipeline handle
|
||||||
|
3. Update `selected_rendition` Watchable
|
||||||
|
|
||||||
|
## Playback and Sync
|
||||||
|
|
||||||
|
### PlaybackPolicy
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct PlaybackPolicy {
|
||||||
|
pub sync: SyncMode, // Synced (shared clock) or Unmanaged (PTS pacing)
|
||||||
|
pub max_latency: Duration, // Default: 150ms — how much buffering before skipping forward
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### SyncMode
|
||||||
|
|
||||||
|
- **`Synced`** (default): Shared playout clock (`Sync`). Video frames are gated by `Sync::wait(pts)`, which blocks until `reference + pts + latency` arrives. Audio paces itself through its ring buffer (~80ms).
|
||||||
|
- **`Unmanaged`**: No synchronization. `FramePacer` sleeps between frames based on PTS deltas, clamped to 2× frame period.
|
||||||
|
|
||||||
|
### Sync
|
||||||
|
|
||||||
|
The `Sync` type records arrival offsets via `received(pts)` and blocks on `wait(pts)` until `reference + pts + latency`. This keeps audio and video aligned without cross-path gating or signaling. Ported from the moq/js implementation.
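The two pacing rules can be written down as small functions. This is a sketch under stated assumptions: `pacer_delay` and `PlayoutClock` are hypothetical names, and taking `reference` as the smallest observed `arrival - pts` offset is a guess at how `received(pts)` is used, not something the source confirms.

```rust
use std::time::Duration;

/// Unmanaged pacing: sleep for the PTS delta, but never longer than
/// two frame periods.
fn pacer_delay(prev_pts: Duration, pts: Duration, frame_period: Duration) -> Duration {
    pts.saturating_sub(prev_pts).min(frame_period * 2)
}

/// Hypothetical playout clock. Times are measured from an arbitrary local epoch.
struct PlayoutClock {
    reference: Option<Duration>, // assumed: earliest observed (arrival - pts)
    latency: Duration,
}

impl PlayoutClock {
    /// Records the arrival of a frame and keeps the smallest offset seen.
    fn received(&mut self, arrival: Duration, pts: Duration) {
        let offset = arrival.saturating_sub(pts);
        self.reference = Some(self.reference.map_or(offset, |r| r.min(offset)));
    }

    /// Local time at which a frame with this PTS should be presented:
    /// reference + pts + latency. `None` until a frame has been received.
    fn deadline(&self, pts: Duration) -> Option<Duration> {
        self.reference.map(|r| r + pts + self.latency)
    }
}
```

Because audio and video share one `reference`, frames with the same PTS get the same deadline on both paths, which is what keeps them aligned without direct signalling between the two.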
|
||||||
|
|
||||||
|
## Stats
|
||||||
|
|
||||||
|
moq-media has a structured stats system for debug overlays:
|
||||||
|
|
||||||
|
- **`NetStats`** — RTT, loss%, bandwidth, path type (written by iroh-live transport bridge)
|
||||||
|
- **`EncodeStats`** — FPS, encode time, bitrate, codec, encoder, resolution, capture path
|
||||||
|
- **`RenderStats`** — FPS, decode time, decoder, renderer, rendition
|
||||||
|
- **`TimingStats`** — Audio buffer level, video/audio lag, A/V delta, video buffer depth
|
||||||
|
- **`Timeline`** — Ring buffer of `FrameMeta` entries for timeline visualization
|
||||||
|
|
||||||
|
Each `Metric` has EMA smoothing, a history ring buffer, and optional color thresholds. `Label` provides atomic string values.
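EMA smoothing with a bounded history can be illustrated as follows. The field names, the `alpha` parameter, and storing smoothed values in the history are assumptions for this sketch; the real `Metric` type may differ.

```rust
use std::collections::VecDeque;

/// Hypothetical metric: exponentially smoothed value plus bounded history.
struct Metric {
    alpha: f64, // smoothing factor in (0, 1]; assumed, not from the source
    smoothed: Option<f64>,
    history: VecDeque<f64>,
    capacity: usize, // assumed to be greater than zero
}

impl Metric {
    fn new(alpha: f64, capacity: usize) -> Self {
        Self { alpha, smoothed: None, history: VecDeque::with_capacity(capacity), capacity }
    }

    /// Folds a raw sample into the moving average and returns the new value.
    fn record(&mut self, sample: f64) -> f64 {
        let next = match self.smoothed {
            Some(prev) => prev + self.alpha * (sample - prev),
            None => sample, // the first sample seeds the average
        };
        self.smoothed = Some(next);
        if self.history.len() == self.capacity {
            self.history.pop_front(); // ring-buffer behaviour: drop the oldest entry
        }
        self.history.push_back(next);
        next
    }
}
```

A larger `alpha` tracks changes faster at the cost of a noisier overlay.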
|
||||||
|
|
||||||
|
## Codec Support
|
||||||
|
|
||||||
|
Feature-gated codec support:
|
||||||
|
|
||||||
|
| Feature | Codec | Backend |
|
||||||
|
|---------|-------|---------|
|
||||||
|
| `h264` | H.264 | openh264 (software) |
|
||||||
|
| `av1` | AV1 | rav1e encoder, rav1d decoder |
|
||||||
|
| `opus` | Opus | opus crate |
|
||||||
|
| `vaapi` | VAAPI | Linux hardware encode/decode |
|
||||||
|
| `videotoolbox` | VideoToolbox | macOS hardware |
|
||||||
|
| `v4l2` | V4L2 | Raspberry Pi hardware |
|
||||||
|
| `pcm` | Raw PCM | No encoding |
|
||||||
@@ -0,0 +1,95 @@
|
|||||||
|
# iroh-live: Network Signals and Adaptive Bitrate
|
||||||
|
|
||||||
|
## NetworkSignals
|
||||||
|
|
||||||
|
Produced by polling iroh QUIC connection stats. Consumed by `VideoTrack::enable_adaptation()` to decide when to switch video renditions.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct NetworkSignals {
|
||||||
|
pub rtt: Duration, // Round-trip time to remote peer
|
||||||
|
pub loss_rate: f64, // Recent packet loss rate (0.0..=1.0), 200ms delta window
|
||||||
|
pub available_bps: u64, // Estimated available bandwidth (cwnd * 8 / rtt)
|
||||||
|
pub congestion_events: u64, // Monotonically increasing congestion counter
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Production
|
||||||
|
|
||||||
|
`spawn_signal_producer()` in `iroh-live/src/util.rs` polls every 200ms:
|
||||||
|
|
||||||
|
1. Gets connection paths via `conn.paths().get()`
|
||||||
|
2. Finds the selected path (`is_selected()`)
|
||||||
|
3. Reads path stats (`lost_packets`, `udp_tx.datagrams`, `cwnd`) and RTT
|
||||||
|
4. Computes delta-based loss rate: `delta_lost / (delta_sent + delta_lost)`
|
||||||
|
5. Estimates bandwidth: `cwnd * 8 * 1e9 / rtt_ns`
|
||||||
|
6. Writes to `watch::Sender<NetworkSignals>`
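Steps 4 and 5 reduce to two small calculations. The sketch below assumes cumulative counters and a congestion window measured in bytes; `PathCounters`, `loss_rate`, and `available_bps` are hypothetical names, and the zero-division guards are additions for safety.

```rust
use std::time::Duration;

/// Cumulative counters read from the selected path (hypothetical shape).
#[derive(Clone, Copy)]
struct PathCounters {
    lost_packets: u64,
    sent_datagrams: u64,
}

/// Loss rate over one polling interval: delta_lost / (delta_sent + delta_lost).
fn loss_rate(prev: PathCounters, cur: PathCounters) -> f64 {
    let lost = cur.lost_packets.saturating_sub(prev.lost_packets);
    let sent = cur.sent_datagrams.saturating_sub(prev.sent_datagrams);
    let total = sent + lost;
    if total == 0 {
        0.0
    } else {
        lost as f64 / total as f64
    }
}

/// Bandwidth estimate in bits per second: cwnd * 8 * 1e9 / rtt_ns.
fn available_bps(cwnd_bytes: u64, rtt: Duration) -> u64 {
    let rtt_ns = rtt.as_nanos();
    if rtt_ns == 0 {
        return 0;
    }
    (cwnd_bytes as u128 * 8 * 1_000_000_000 / rtt_ns) as u64
}
```

Working from deltas rather than lifetime totals keeps the loss rate responsive: a burst of loss shows up within one 200 ms window instead of being averaged away.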
|
||||||
|
|
||||||
|
Separately, `spawn_stats_recorder()` records RTT, loss percentage, inbound and outbound bandwidth, and path type into `NetStats` for the debug overlay.
|
||||||
|
|
||||||
|
## Adaptive Rendition Algorithm
|
||||||
|
|
||||||
|
Located in `moq-media/src/adaptive.rs`. The algorithm evaluates `NetworkSignals` against configured thresholds and produces `Decision` values.
|
||||||
|
|
||||||
|
### Configuration (`AdaptiveConfig`)
|
||||||
|
|
||||||
|
| Parameter | Default | Description |
|
||||||
|
|-----------|---------|-------------|
|
||||||
|
| `upgrade_hold` | 4s | Sustained good conditions before upgrade probe |
|
||||||
|
| `downgrade_hold` | 500ms | Sustained bad conditions before downgrade |
|
||||||
|
| `probe_duration` | 3s | How long a probe runs before committing |
|
||||||
|
| `probe_cooldown` | 8s | Cooldown after a failed probe |
|
||||||
|
| `post_downgrade_cooldown` | 4s | Cooldown after any downgrade |
|
||||||
|
| `loss_downgrade` | 10% | Loss rate threshold for downgrade |
|
||||||
|
| `loss_emergency` | 20% | Loss rate for immediate drop to lowest |
|
||||||
|
| `loss_good` | 2% | Loss rate considered "good" |
|
||||||
|
| `loss_probe_abort` | 5% | Loss rate that aborts an active probe |
|
||||||
|
| `bw_downgrade_ratio` | 85% | Downgrade when available bandwidth falls below this fraction of the current rendition's bitrate |
|
||||||
|
| `bw_probe_headroom` | 120% | Probe only when available bandwidth is at least this multiple of the next-higher rendition's bitrate |
|
||||||
|
| `check_interval` | 200ms | How often adaptation task checks signals |
|
||||||
|
|
||||||
|
### Decision Logic
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Emergency: loss >= 20% AND not already lowest → Drop to lowest immediately
|
||||||
|
|
||||||
|
2. Downgrade check:
|
||||||
|
- bandwidth_stressed (available < current_bitrate * 85%) OR loss >= 10%
|
||||||
|
- sustained for downgrade_hold (500ms) → Downgrade(next_lower)
|
||||||
|
|
||||||
|
3. Upgrade check:
|
||||||
|
- Already at highest → Hold
|
||||||
|
- Within post_downgrade_cooldown (4s) → Hold
|
||||||
|
- Within probe_cooldown (8s) → Hold
|
||||||
|
- bandwidth_headroom (available >= next_higher_bitrate * 120%) AND loss <= 2%
|
||||||
|
- sustained for upgrade_hold (4s) → StartProbe(next_higher)
|
||||||
|
|
||||||
|
4. Otherwise: Hold
|
||||||
|
```
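The decision steps above can be turned into a small state machine. This is a sketch, not the code in `adaptive.rs`: `State`, `Sample`, and `decide` are hypothetical, thresholds are hard-coded to the documented defaults, and details such as when the timers reset or whether an emergency starts the post-downgrade cooldown are assumptions.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Decision {
    Hold,
    Downgrade(usize),
    Emergency,
    StartProbe(usize),
}

/// One reading of network conditions.
struct Sample {
    now: Duration, // time since the adaptation task started
    loss_rate: f64,
    available_bps: u64,
}

/// Timers carried across ticks.
#[derive(Default)]
struct State {
    bad_since: Option<Duration>,
    good_since: Option<Duration>,
    last_downgrade: Option<Duration>,
    last_failed_probe: Option<Duration>,
}

/// `bitrates_bps` must be non-empty and ranked highest first (index 0 is best).
fn decide(state: &mut State, s: &Sample, current: usize, bitrates_bps: &[u64]) -> Decision {
    let lowest = bitrates_bps.len() - 1;
    // 1. Emergency: heavy loss drops straight to the lowest rendition.
    if s.loss_rate >= 0.20 && current != lowest {
        state.bad_since = None;
        state.good_since = None;
        state.last_downgrade = Some(s.now);
        return Decision::Emergency;
    }
    // 2. Downgrade: bad conditions must persist for downgrade_hold (500 ms).
    let stressed = (s.available_bps as f64) < (bitrates_bps[current] as f64) * 0.85;
    if stressed || s.loss_rate >= 0.10 {
        state.good_since = None;
        let since = *state.bad_since.get_or_insert(s.now);
        if current != lowest && s.now - since >= Duration::from_millis(500) {
            state.bad_since = None;
            state.last_downgrade = Some(s.now);
            return Decision::Downgrade(current + 1);
        }
        return Decision::Hold;
    }
    state.bad_since = None;
    // 3. Upgrade: blocked at the top rung and during either cooldown.
    let cooling = |at: Option<Duration>, len: Duration| at.map_or(false, |t| s.now - t < len);
    if current == 0
        || cooling(state.last_downgrade, Duration::from_secs(4))
        || cooling(state.last_failed_probe, Duration::from_secs(8))
    {
        state.good_since = None;
        return Decision::Hold;
    }
    let headroom = (s.available_bps as f64) >= (bitrates_bps[current - 1] as f64) * 1.20;
    if headroom && s.loss_rate <= 0.02 {
        let since = *state.good_since.get_or_insert(s.now);
        if s.now - since >= Duration::from_secs(4) {
            state.good_since = None;
            return Decision::StartProbe(current - 1);
        }
    } else {
        state.good_since = None;
    }
    // 4. Otherwise: stay put.
    Decision::Hold
}
```

Passing time in as a `Duration` instead of reading a clock inside `decide` makes the logic easy to drive from tests with synthetic timelines.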
|
||||||
|
|
||||||
|
### Probe Lifecycle
|
||||||
|
|
||||||
|
When `StartProbe(idx)` is decided:
|
||||||
|
1. Create a new decoder pipeline for the higher rendition
|
||||||
|
2. Write frames to the same `FrameSender` (seamless switch for the consumer)
|
||||||
|
3. Monitor signals during the probe period
|
||||||
|
4. If `should_abort_probe()` (loss ≥ 5% or new congestion events) → abort, drop probe pipeline, cooldown 8s
|
||||||
|
5. If probe duration (3s) passes without abort → commit, replace current pipeline
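The abort and commit conditions in steps 4 and 5 can be expressed as a single check per tick. This is an illustrative sketch: `Probe`, `ProbeOutcome`, and `step_probe` are hypothetical names, and the thresholds are hard-coded to the documented defaults rather than read from `AdaptiveConfig`.

```rust
/// Hypothetical probe bookkeeping captured when the probe starts.
struct Probe {
    started_ms: u64,
    congestion_at_start: u64,
}

#[derive(Debug, PartialEq)]
enum ProbeOutcome {
    Continue,
    Abort,  // drop the probe pipeline and start the 8 s cooldown
    Commit, // the probed rendition becomes the current one
}

/// Evaluates one tick of an active probe.
fn step_probe(probe: &Probe, now_ms: u64, loss_rate: f64, congestion_events: u64) -> ProbeOutcome {
    // Abort on loss >= 5% or any congestion event since the probe began.
    if loss_rate >= 0.05 || congestion_events > probe.congestion_at_start {
        return ProbeOutcome::Abort;
    }
    // Commit once the probe has survived for probe_duration (3 s).
    if now_ms.saturating_sub(probe.started_ms) >= 3_000 {
        return ProbeOutcome::Commit;
    }
    ProbeOutcome::Continue
}
```

Comparing against the counter value captured at probe start works because `congestion_events` only ever increases.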
|
||||||
|
|
||||||
|
### Rendition Ranking
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn rank_renditions(renditions: &BTreeMap<String, VideoConfig>) -> Vec<RankedRendition>
|
||||||
|
```
|
||||||
|
|
||||||
|
Sorts by pixel count descending (highest quality = index 0). Each `RankedRendition` carries name, pixels, bitrate_bps, width, height.
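A ranking function with this signature might look like the sketch below. The `VideoConfig` here is a hypothetical subset with plain integer fields; the real type comes from the catalog and may use optional fields, so treat the field access as an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical subset of the catalog's video configuration.
struct VideoConfig {
    width: u32,
    height: u32,
    bitrate_bps: u64,
}

#[derive(Debug, PartialEq)]
struct RankedRendition {
    name: String,
    pixels: u32,
    bitrate_bps: u64,
    width: u32,
    height: u32,
}

/// Ranks renditions by pixel count, highest first (index 0 is the best quality).
fn rank_renditions(renditions: &BTreeMap<String, VideoConfig>) -> Vec<RankedRendition> {
    let mut ranked: Vec<RankedRendition> = renditions
        .iter()
        .map(|(name, cfg)| RankedRendition {
            name: name.clone(),
            pixels: cfg.width * cfg.height,
            bitrate_bps: cfg.bitrate_bps,
            width: cfg.width,
            height: cfg.height,
        })
        .collect();
    ranked.sort_by(|a, b| b.pixels.cmp(&a.pixels));
    ranked
}
```

With this ordering, "next lower" is `index + 1` and "next higher" is `index - 1`, which matches how `Downgrade(usize)` and `StartProbe(usize)` carry indices.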
|
||||||
|
|
||||||
|
### RenditionMode
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum RenditionMode {
|
||||||
|
Auto, // Algorithm-driven switching
|
||||||
|
Fixed(String), // Pin to a specific rendition
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Controlled via `VideoTrack::set_rendition_mode()`. In Fixed mode, the algorithm switches directly to the named rendition without probing.
|
||||||
85
docs/research/references/iroh/iroh-live/08-p2p-and-relay.md
Normal file
@@ -0,0 +1,85 @@
|
|||||||
|
# iroh-live: P2P Connectivity and Relay Architecture
|
||||||
|
|
||||||
|
## Direct Connectivity
|
||||||
|
|
||||||
|
iroh connects peers directly when possible:
|
||||||
|
|
||||||
|
- **Same LAN:** Communicates over the local network without traffic leaving the subnet
|
||||||
|
- **Public IP / simple NAT:** iroh's hole-punching establishes a direct UDP path
|
||||||
|
- **Symmetric NAT / corporate firewalls / CGNAT:** Falls back to iroh relay network
|
||||||
|
|
||||||
|
The iroh endpoint exposes path statistics via `conn.paths()`, which returns a `Watcher<PathInfoList>`. Each `PathInfo` reports RTT, whether the path is selected, and the remote address. The selected path is the one actively carrying traffic; iroh may maintain multiple candidate paths and switch between them.
|
||||||
|
|
||||||
|
The transition between direct and relayed paths is transparent to the application. The media pipeline sees only changes in RTT and bandwidth, which adaptive rendition switching handles automatically.
|
||||||
|
|
||||||
|
## iroh-live-relay: Architecture
|
||||||
|
|
||||||
|
The relay serves two transport protocols simultaneously:
|
||||||
|
|
||||||
|
```
|
||||||
|
iroh P2P publisher ──(QUIC, moq-lite-03)──> iroh-live-relay <──(WebTransport/H3, noq)── browser
|
||||||
|
```
|
||||||
|
|
||||||
|
Both protocols feed into `moq-relay`'s shared `Origin`, which manages broadcast routing. A broadcast published via iroh is automatically available to WebTransport subscribers, and vice versa.
|
||||||
|
|
||||||
|
### Pull Model
|
||||||
|
|
||||||
|
The relay operates in **pull mode**: it connects to iroh publishers on demand when a browser client requests a broadcast. The broadcast name in the URL can be a `LiveTicket` URI. Multiple browser clients watching the same broadcast share a single upstream iroh connection.
|
||||||
|
|
||||||
|
Pull flow:
|
||||||
|
1. Browser connects via WebTransport, requests broadcast by name (or ticket)
|
||||||
|
2. Relay checks if broadcast already exists in local cluster → fast path
|
||||||
|
3. If not, relay uses iroh-live `Moq::connect()` to connect to the remote publisher
|
||||||
|
4. Subscribes to the broadcast via `session.subscribe(broadcast_name)`
|
||||||
|
5. Publishes the consumer into the local cluster under the ticket string as the name
|
||||||
|
6. Spawns a keepalive task holding the session until it closes
|
||||||
|
7. Browser receives the stream through the relay's WebTransport frontend
|
||||||
|
|
||||||
|
### Connection Deduplication
|
||||||
|
|
||||||
|
`PullState` uses a `HashMap<String, Arc<Notify>>` to prevent duplicate concurrent connections to the same remote. If a pull is already in progress for a given ticket, subsequent requests wait on the `Notify` and then check if the broadcast appeared in the cluster.
|
||||||
|
|
||||||
|
### QUIC Backend: noq
|
||||||
|
|
||||||
|
The relay uses `noq` as its QUIC backend (not quinn). This is configured via:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
server_config.backend = Some(moq_native::QuicBackend::Noq);
|
||||||
|
```
|
||||||
|
|
||||||
|
### iroh Endpoint Integration
|
||||||
|
|
||||||
|
The relay also binds an iroh endpoint:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let mut iroh_config = moq_native::IrohEndpointConfig::default();
|
||||||
|
iroh_config.enabled = Some(true);
|
||||||
|
iroh_config.secret = Some(relay.iroh_secret_path_str());
|
||||||
|
let iroh = iroh_config.bind().await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
This enables the relay to participate in the iroh P2P network directly.
|
||||||
|
|
||||||
|
## Ticket Format
|
||||||
|
|
||||||
|
`LiveTicket` serves as the connection mechanism for both P2P and relay scenarios:
|
||||||
|
|
||||||
|
- **P2P:** Subscriber uses the `EndpointAddr` (node ID + relay URLs) to connect directly
|
||||||
|
- **Relay:** The full ticket string becomes the broadcast name in the URL: `https://relay:4443/?name=iroh-live:...`
|
||||||
|
|
||||||
|
The ticket format: `iroh-live:<base64url(postcard(EndpointAddr))>/<broadcast_name>`
|
||||||
|
|
||||||
|
It also supports a legacy format: `<name>@<base32(postcard(EndpointAddr))>`
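Splitting the two textual forms into their parts can be sketched without decoding the address. This is an assumption-laden illustration: `TicketParts` and `split_ticket` are hypothetical, the encoded `EndpointAddr` is left as an opaque string, and splitting on the first `/` relies on base64url never containing that character.

```rust
/// The undecoded pieces of a ticket string.
#[derive(Debug, PartialEq)]
struct TicketParts<'a> {
    encoded_addr: &'a str,
    broadcast_name: &'a str,
}

/// Splits `iroh-live:<addr>/<name>` or the legacy `<name>@<addr>` form.
fn split_ticket(s: &str) -> Option<TicketParts<'_>> {
    if let Some(rest) = s.strip_prefix("iroh-live:") {
        // The broadcast name may itself contain '/', so split on the first one.
        let (encoded_addr, broadcast_name) = rest.split_once('/')?;
        return Some(TicketParts { encoded_addr, broadcast_name });
    }
    // Legacy form: the address follows the last '@'.
    let (broadcast_name, encoded_addr) = s.rsplit_once('@')?;
    Some(TicketParts { encoded_addr, broadcast_name })
}
```

Decoding `encoded_addr` (base64url or base32 followed by postcard) would be the next step and needs the corresponding crates.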
|
||||||
|
|
||||||
|
## Connection Access in iroh-moq
|
||||||
|
|
||||||
|
`MoqSession::conn()` returns a reference to the underlying iroh `Connection`. This is used by:
|
||||||
|
|
||||||
|
1. **Signal producer** — Polls path stats for `NetworkSignals`
|
||||||
|
2. **Stats recorder** — Records into `NetStats` for debug overlays
|
||||||
|
3. **Call::closed()** — Inspects QUIC close reason to determine `DisconnectReason`
|
||||||
|
|
||||||
|
The connection provides:
|
||||||
|
- `paths().get()` — List of active network paths with RTT, stats, relay status
|
||||||
|
- `close_reason()` — Why the connection closed (LocallyClosed, ApplicationClosed, ConnectionClosed, Reset)
|
||||||
|
- `remote_id()` — Remote peer's endpoint ID
|
||||||
42
docs/research/references/iroh/iroh-live/README.md
Normal file
@@ -0,0 +1,42 @@
|
|||||||
|
# iroh-live Reference Documentation
|
||||||
|
|
||||||
|
> **Status:** Early tech preview. APIs are unstable. Based on source code analysis of the iroh-live workspace.
|
||||||
|
|
||||||
|
## Files
|
||||||
|
|
||||||
|
| File | Topic |
|
||||||
|
|------|-------|
|
||||||
|
| [01-overview-and-architecture](01-overview-and-architecture.md) | Workspace structure, crate layers, design principles, data flow, dependencies |
|
||||||
|
| [02-core-api](02-core-api.md) | `Live`, `LiveTicket`, `Call`, `Subscription`, `DisconnectReason`, `util` module |
|
||||||
|
| [03-iroh-moq-transport](03-iroh-moq-transport.md) | `Moq`, `MoqSession`, `MoqProtocolHandler`, actor internals, session lifecycle, error types |
|
||||||
|
| [04-rooms](04-rooms.md) | `Room`, `RoomHandle`, `RoomTicket`, `RoomEvent`, gossip KV coordination, actor architecture |
|
||||||
|
| [05-relay](05-relay.md) | `iroh-live-relay`: browser bridging, pull model, `RelayConfig`, `PullState`, web viewer |
|
||||||
|
| [06-moq-media-pipelines](06-moq-media-pipelines.md) | `LocalBroadcast`, `RemoteBroadcast`, `VideoTrack`, `AudioTrack`, transport abstraction, codec support |
|
||||||
|
| [07-network-signals-and-adaptive-bitrate](07-network-signals-and-adaptive-bitrate.md) | `NetworkSignals`, adaptation algorithm, `AdaptiveConfig`, `Decision`, probe lifecycle |
|
||||||
|
| [08-p2p-and-relay](08-p2p-and-relay.md) | iroh P2P connectivity, relay architecture, pull model, ticket format, connection access |
|
||||||
|
|
||||||
|
## Quick Navigation
|
||||||
|
|
||||||
|
### "How do I..."
|
||||||
|
|
||||||
|
- **Publish a stream?** → [02-core-api](02-core-api.md) (`Live::publish`) + [06-moq-media-pipelines](06-moq-media-pipelines.md) (`LocalBroadcast`)
|
||||||
|
- **Subscribe to a stream?** → [02-core-api](02-core-api.md) (`Live::subscribe`) + [06-moq-media-pipelines](06-moq-media-pipelines.md) (`RemoteBroadcast`)
|
||||||
|
- **Make a 1:1 call?** → [02-core-api](02-core-api.md) (`Call::dial` / `Call::accept`)
|
||||||
|
- **Create a multi-party room?** → [04-rooms](04-rooms.md) (`Room::new`, `RoomTicket`)
|
||||||
|
- **Bridge to browsers?** → [05-relay](05-relay.md) (`iroh-live-relay`)
|
||||||
|
- **Adapt quality to network conditions?** → [07-network-signals-and-adaptive-bitrate](07-network-signals-and-adaptive-bitrate.md)
|
||||||
|
- **Understand the MoQ transport?** → [03-iroh-moq-transport](03-iroh-moq-transport.md)
|
||||||
|
- **Understand the media pipeline?** → [06-moq-media-pipelines](06-moq-media-pipelines.md)
|
||||||
|
|
||||||
|
### Key Source Files
|
||||||
|
|
||||||
|
| Component | Path |
|
||||||
|
|-----------|------|
|
||||||
|
| iroh-live crate | `iroh-live/src/{lib, live, call, subscription, ticket, types, util, rooms}.rs` |
|
||||||
|
| iroh-moq crate | `iroh-moq/src/lib.rs` |
|
||||||
|
| iroh-live-relay | `iroh-live-relay/src/{lib, main, pull}.rs` |
|
||||||
|
| moq-media publish | `moq-media/src/publish.rs` |
|
||||||
|
| moq-media subscribe | `moq-media/src/subscribe.rs` |
|
||||||
|
| moq-media adaptive | `moq-media/src/adaptive.rs` |
|
||||||
|
| moq-media transport | `moq-media/src/transport.rs` |
|
||||||
|
| moq-media network signals | `moq-media/src/net.rs` |
|
||||||
160
docs/research/references/iroh/iroh/01-overview-architecture.md
Normal file
@@ -0,0 +1,160 @@
|
|||||||
|
# Iroh: Overview & Architecture
|
||||||
|
|
||||||
|
**Version**: 0.98.1
|
||||||
|
**Repository**: https://github.com/n0-computer/iroh
|
||||||
|
**License**: MIT OR Apache-2.0
|
||||||
|
**Rust Edition**: 2024
|
||||||
|
**MSRV**: 1.89
|
||||||
|
|
||||||
|
## What is Iroh?
|
||||||
|
|
||||||
|
Iroh is a Rust library for establishing **peer-to-peer QUIC connections dialed by public key**. You provide an `EndpointAddr` (which identifies a peer), and iroh finds and maintains the fastest connection route — whether direct (hole-punched) or relayed through a server.
|
||||||
|
|
||||||
|
Core value propositions:
|
||||||
|
- **Dial by public key** — no IP addresses or hostnames needed at the application layer
|
||||||
|
- **Hole-punching** — automatically attempts direct P2P connectivity
|
||||||
|
- **Relay fallback** — encrypted relay servers ensure connectivity even behind NATs
|
||||||
|
- **Built on QUIC** — uses the `noq` QUIC implementation for multiplexed, encrypted streams
|
||||||
|
- **Address Lookup** — pluggable discovery system to resolve `EndpointId → addressing info`
|
||||||
|
|
||||||
|
## Workspace Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
iroh/ # Core library (p2p QUIC connections)
|
||||||
|
├── iroh-base/ # Fundamental types: SecretKey, PublicKey, EndpointId, RelayUrl, EndpointAddr
|
||||||
|
├── iroh-dns/ # DNS resolver + endpoint info serialization (pkarr)
|
||||||
|
├── iroh-dns-server/ # DNS server implementation (powers dns.iroh.link)
|
||||||
|
├── iroh-relay/ # Relay server + client implementation
|
||||||
|
└── iroh/bench/ # Benchmarks
|
||||||
|
```
|
||||||
|
|
||||||
|
### Dependency Graph
|
||||||
|
|
||||||
|
```
|
||||||
|
iroh depends on:
|
||||||
|
├── iroh-base (key types, EndpointAddr, RelayUrl)
|
||||||
|
├── iroh-dns (DNS resolution, EndpointInfo serialization)
|
||||||
|
├── iroh-relay (RelayMap, RelayConfig, relay client/server, QUIC client)
|
||||||
|
├── noq (QUIC implementation)
|
||||||
|
├── noq-proto (QUIC protocol types)
|
||||||
|
├── noq-udp (UDP socket abstraction)
|
||||||
|
├── netwatch (network interface monitoring)
|
||||||
|
├── portmapper (UPnP/PCP/NAT-PMP port mapping, optional)
|
||||||
|
├── n0-future (async utilities)
|
||||||
|
├── n0-watcher (watch/subscribe primitives)
|
||||||
|
└── iroh-metrics (metrics collection)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Key Concepts
|
||||||
|
|
||||||
|
### EndpointId / PublicKey
|
||||||
|
Every iroh endpoint has a unique Ed25519 cryptographic key pair. The public key doubles as the endpoint identifier (`EndpointId`). It's used for both:
|
||||||
|
- **Identity** — unique addressing in the network
|
||||||
|
- **Encryption** — TLS authentication (via RFC 7250 Raw Public Keys, no X.509 certificates)
|
||||||
|
|
||||||
|
### EndpointAddr
|
||||||
|
The addressing structure that combines identity with network paths:
|
||||||
|
```rust
|
||||||
|
pub struct EndpointAddr {
|
||||||
|
pub id: EndpointId, // Who to connect to
|
||||||
|
pub addrs: BTreeSet<TransportAddr>, // How to reach them
|
||||||
|
}
|
||||||
|
|
||||||
|
pub enum TransportAddr {
|
||||||
|
Relay(RelayUrl), // Via relay server
|
||||||
|
Ip(SocketAddr), // Direct IP address
|
||||||
|
Custom(CustomAddr), // Via custom transport
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Relay Servers
|
||||||
|
Relay servers provide:
|
||||||
|
1. **Reliable connectivity** — always reachable, forward encrypted traffic to the correct endpoint by `EndpointId`
|
||||||
|
2. **Hole-punching assistance** — QUIC Address Discovery (QAD), STUN-like services
|
||||||
|
3. **Traffic relay** — fallback when direct connections are impossible
|
||||||
|
|
||||||
|
Connections to relays use HTTP/1.1 with TLS, then upgrade to a custom protocol. The relay only sees encrypted traffic.
|
||||||
|
|
||||||
|
### Connection Flow
|
||||||
|
1. Endpoint binds, connects to a "home relay"
|
||||||
|
2. To connect to peer: resolve `EndpointId` → `EndpointAddr` via Address Lookup
|
||||||
|
3. Establish initial connection via relay
|
||||||
|
4. Attempt direct connection (hole-punching if needed)
|
||||||
|
5. Migrate to direct connection when available (relay becomes backup)
|
||||||
|
|
||||||
|
## Crate: `iroh` (Core Library)
|
||||||
|
|
||||||
|
### Main Types
|
||||||
|
| Type | Module | Purpose |
|
||||||
|
|------|--------|---------|
|
||||||
|
| `Endpoint` | `endpoint` | Central API — connect, accept, manage connections |
|
||||||
|
| `Builder` | `endpoint` | Configure and construct an `Endpoint` |
|
||||||
|
| `Router` | `protocol` | Accept loop that dispatches to `ProtocolHandler`s |
|
||||||
|
| `ProtocolHandler` | `protocol` | Trait for handling incoming connections by ALPN |
|
||||||
|
| `Connection` | `endpoint::connection` | QUIC connection wrapper |
|
||||||
|
| `Incoming` | `endpoint::connection` | Pre-handshake incoming connection |
|
||||||
|
| `Accepting` | `endpoint::connection` | Post-accept, pre-handshake state |
|
||||||
|
|
||||||
|
### Feature Flags
|
||||||
|
- `default` = `["metrics", "fast-apple-datapath", "portmapper", "tls-ring"]`
|
||||||
|
- `metrics` — Prometheus-style metrics collection
|
||||||
|
- `portmapper` — UPnP/PCP/NAT-PMP support
|
||||||
|
- `test-utils` — Testing utilities
|
||||||
|
- `platform-verifier` — Use OS TLS trust anchors
|
||||||
|
- `qlog` — QUIC event logging
|
||||||
|
- `fast-apple-datapath` — Private Apple APIs for batched sends
|
||||||
|
- `tls-ring` / `tls-aws-lc-rs` — Choose TLS crypto backend
|
||||||
|
- `unstable-custom-transports` — Custom transport API (unstable)
|
||||||
|
|
||||||
|
### WASM Support
|
||||||
|
The crate compiles to `wasm32-unknown-unknown` for browser targets. Browser builds:
|
||||||
|
- Use `PkarrResolver` instead of `DnsAddressLookup` (DNS-over-HTTPS)
|
||||||
|
- Cannot bind IP sockets (no direct connectivity)
|
||||||
|
- Use `wasm-bindgen-futures` for async runtime
|
||||||
|
|
||||||
|
## Presets
|
||||||
|
|
||||||
|
The `presets` module provides common configurations:
|
||||||
|
|
||||||
|
| Preset | Description |
|
||||||
|
|--------|-------------|
|
||||||
|
| `Empty` | No defaults — you must set all required options yourself |
|
||||||
|
| `Minimal` | Sets only the crypto provider (ring or aws-lc-rs) |
|
||||||
|
| `N0` | Full n0 defaults: crypto provider, Pkarr publisher, DNS resolver, n0 relay servers |
|
||||||
|
| `N0DisableRelay` | N0 defaults but with `RelayMode::Disabled` |
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Quick start with full n0 infrastructure
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
|
||||||
|
// Minimal — just crypto, no relay or address lookup
|
||||||
|
let endpoint = Endpoint::bind(presets::Minimal).await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
## Encryption & Authentication
|
||||||
|
|
||||||
|
Iroh uses **RFC 7250 Raw Public Keys** for TLS — no X.509 certificates. Each endpoint has:
|
||||||
|
- `SecretKey` (Ed25519) — used for TLS authentication and signing
|
||||||
|
- `PublicKey`/`EndpointId` — derived from `SecretKey`, used as identity
|
||||||
|
|
||||||
|
The TLS server name is encoded as `<base32-dnssec-encoded-public-key>.iroh.invalid` to ensure 0-RTT session ticket separation per endpoint.
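
The encoding step can be illustrated with a small standalone sketch. It assumes "base32-dnssec" means the RFC 4648 base32hex alphabet in lowercase without padding; `base32_dnssec` and `server_name` are hypothetical helper names, not iroh functions, and the exact encoding should be checked against the iroh source.

```rust
/// base32hex alphabet, lowercase (assumed to be what "base32-dnssec" refers to).
const ALPHABET: &[u8; 32] = b"0123456789abcdefghijklmnopqrstuv";

/// Encode bytes as unpadded lowercase base32hex (hypothetical helper).
fn base32_dnssec(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut buffer: u32 = 0; // bit accumulator; stale high bits are masked off
    let mut bits: u32 = 0; // number of not-yet-emitted bits in `buffer`
    for &b in bytes {
        buffer = (buffer << 8) | b as u32;
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
        }
    }
    if bits > 0 {
        // Left-align the remaining bits in a final 5-bit group.
        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Build the per-endpoint TLS server name described above (hypothetical helper).
fn server_name(public_key: &[u8; 32]) -> String {
    format!("{}.iroh.invalid", base32_dnssec(public_key))
}
```

A 32-byte key encodes to 52 characters, which fits within the 63-byte limit for a single DNS label.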
|
||||||
|
|
||||||
|
## 0-RTT Support
|
||||||
|
|
||||||
|
Iroh supports QUIC 0-RTT connections:
|
||||||
|
- `Connecting::into_0rtt()` on the client side
|
||||||
|
- `Accepting::into_0rtt()` on the server side
|
||||||
|
- TLS session tickets cached per remote endpoint (default 256 tickets = ~150 KiB)
|
||||||
|
- `max_tls_tickets()` builder option to tune cache size
|
||||||
|
|
||||||
|
## Default Infrastructure (n0)
|
||||||
|
|
||||||
|
Production relay servers (4 regions):
|
||||||
|
| Region | Hostname |
|
||||||
|
|--------|----------|
|
||||||
|
| NA East | `use1-1.relay.n0.iroh-canary.iroh.link` |
|
||||||
|
| NA West | `usw1-1.relay.n0.iroh-canary.iroh.link` |
|
||||||
|
| EU | `euc1-1.relay.n0.iroh-canary.iroh.link` |
|
||||||
|
| AP | `aps1-1.relay.n0.iroh-canary.iroh.link` |
|
||||||
|
|
||||||
|
DNS Address Lookup origin: `dns.iroh.link`
|
||||||
392
docs/research/references/iroh/iroh/02-key-types-traits.md
Normal file
@@ -0,0 +1,392 @@
|
|||||||
|
# Iroh: Key Types and Traits
|
||||||
|
|
||||||
|
## Core Identity Types (`iroh-base`)
|
||||||
|
|
||||||
|
### `SecretKey`
|
||||||
|
Ed25519 signing key (32 bytes). Used for:
|
||||||
|
- TLS authentication (RFC 7250 Raw Public Key)
|
||||||
|
- Signing pkarr packets for address discovery
|
||||||
|
- Generating the corresponding `PublicKey`/`EndpointId`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Generation
|
||||||
|
let secret_key = SecretKey::generate();
|
||||||
|
|
||||||
|
// From bytes
|
||||||
|
let secret_key = SecretKey::from_bytes(&[0u8; 32]);
|
||||||
|
|
||||||
|
// Access public key
|
||||||
|
let public_key: PublicKey = secret_key.public();
|
||||||
|
```
|
||||||
|
|
||||||
|
### `PublicKey` / `EndpointId`
|
||||||
|
`EndpointId` is a type alias for `PublicKey`. Both are 32-byte Ed25519 compressed points.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub type EndpointId = PublicKey;
|
||||||
|
|
||||||
|
impl PublicKey {
|
||||||
|
pub const LENGTH: usize = 32;
|
||||||
|
pub fn from_bytes(bytes: &[u8; 32]) -> Result<Self, KeyParsingError>;
|
||||||
|
pub fn as_bytes(&self) -> &[u8; 32];
|
||||||
|
pub fn verify(&self, message: &[u8], signature: &Signature) -> Result<(), SignatureError>;
|
||||||
|
pub fn fmt_short(&self) -> impl Display; // First 5 bytes hex
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Serialization: Human-readable → base32 z-base-32 encoding; Binary → 32 raw bytes.
|
||||||
|
|
||||||
|
### `Signature`
|
||||||
|
Ed25519 signature (64 bytes). Used in pkarr for signing endpoint discovery records.
|
||||||
|
|
||||||
|
### `KeyParsingError`
|
||||||
|
Error type for key parsing failures.
|
||||||
|
|
||||||
|
## Addressing Types (`iroh-base`)
|
||||||
|
|
||||||
|
### `EndpointAddr`
|
||||||
|
The primary addressing type — combines identity with network paths:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct EndpointAddr {
|
||||||
|
pub id: EndpointId,
|
||||||
|
pub addrs: BTreeSet<TransportAddr>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl EndpointAddr {
|
||||||
|
pub fn new(id: PublicKey) -> Self;
|
||||||
|
pub fn from_parts(id: PublicKey, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
|
||||||
|
pub fn with_relay_url(self, relay_url: RelayUrl) -> Self;
|
||||||
|
pub fn with_ip_addr(self, addr: SocketAddr) -> Self;
|
||||||
|
pub fn is_empty(&self) -> bool;
|
||||||
|
pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr>;
|
||||||
|
pub fn relay_urls(&self) -> impl Iterator<Item = &RelayUrl>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Can be constructed from just an `EndpointId` (relies on Address Lookup), or with explicit paths:
|
||||||
|
```rust
|
||||||
|
// From just EndpointId — needs Address Lookup
|
||||||
|
let addr = EndpointAddr::new(endpoint_id);
|
||||||
|
|
||||||
|
// With relay URL
|
||||||
|
let addr = EndpointAddr::new(endpoint_id).with_relay_url(relay_url);
|
||||||
|
|
||||||
|
// With both
|
||||||
|
let addr = EndpointAddr::from_parts(endpoint_id, [
|
||||||
|
TransportAddr::Relay(relay_url),
|
||||||
|
TransportAddr::Ip(socket_addr),
|
||||||
|
]);
|
||||||
|
```
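
To show how the set-based `addrs` field behaves, here is a minimal std-only model of the type. The stand-in types below (a `String` relay URL, a raw byte array as the id) are assumptions made to keep the sketch self-contained; they are not the real `iroh-base` definitions.

```rust
use std::collections::BTreeSet;
use std::net::SocketAddr;

/// Stand-in for `TransportAddr`; the relay URL is a plain `String` here.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum TransportAddr {
    Relay(String),
    Ip(SocketAddr),
}

/// Stand-in for `EndpointAddr`; the id is a raw 32-byte array.
#[derive(Debug, Clone)]
struct EndpointAddr {
    id: [u8; 32],
    addrs: BTreeSet<TransportAddr>,
}

impl EndpointAddr {
    fn new(id: [u8; 32]) -> Self {
        Self { id, addrs: BTreeSet::new() }
    }

    fn with_relay_url(mut self, url: &str) -> Self {
        self.addrs.insert(TransportAddr::Relay(url.to_string()));
        self
    }

    fn with_ip_addr(mut self, addr: SocketAddr) -> Self {
        // Inserting the same address twice is a no-op in a set.
        self.addrs.insert(TransportAddr::Ip(addr));
        self
    }

    /// True when only the id is known and Address Lookup would be needed.
    fn is_empty(&self) -> bool {
        self.addrs.is_empty()
    }

    fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr> {
        self.addrs.iter().filter_map(|a| match a {
            TransportAddr::Ip(addr) => Some(addr),
            _ => None,
        })
    }

    fn relay_urls(&self) -> impl Iterator<Item = &String> {
        self.addrs.iter().filter_map(|a| match a {
            TransportAddr::Relay(url) => Some(url),
            _ => None,
        })
    }
}
```

Because the paths live in a `BTreeSet`, duplicates collapse and iteration order is deterministic, which keeps published address sets stable across runs.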
|
||||||
|
|
||||||
|
### `TransportAddr`
|
||||||
|
```rust
|
||||||
|
pub enum TransportAddr {
|
||||||
|
Relay(RelayUrl),
|
||||||
|
Ip(SocketAddr),
|
||||||
|
Custom(CustomAddr),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `CustomAddr`
|
||||||
|
Opaque custom transport address (for `unstable-custom-transports` feature):
|
||||||
|
```rust
|
||||||
|
pub struct CustomAddr {
|
||||||
|
id: u32,
|
||||||
|
addr: Vec<u8>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `RelayUrl`
|
||||||
|
Arc-wrapped `Url` identifying a relay server. Cheaply clonable. Encourages fully-qualified DNS names (trailing dot).
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let url: RelayUrl = "https://use1-1.relay.n0.iroh-canary.iroh.link.".parse()?;
|
||||||
|
```
|
||||||
|
|
||||||
|
## Endpoint Types (`iroh`)
|
||||||
|
|
||||||
|
### `Endpoint`
|
||||||
|
The central type — created via `Builder`, used for all connection operations:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl Endpoint {
|
||||||
|
// Construction
|
||||||
|
pub fn builder(preset: impl Preset) -> Builder;
|
||||||
|
pub async fn bind(preset: impl Preset) -> Result<Self, BindError>;
|
||||||
|
|
||||||
|
// Connection
|
||||||
|
pub async fn connect(&self, addr: impl Into<EndpointAddr>, alpn: &[u8]) -> Result<Connection, ConnectError>;
|
||||||
|
pub async fn connect_with_opts(&self, addr: impl Into<EndpointAddr>, alpn: &[u8], opts: ConnectOptions) -> Result<Connecting, ConnectWithOptsError>;
|
||||||
|
pub fn accept(&self) -> Accept<'_>;
|
||||||
|
|
||||||
|
// Identity
|
||||||
|
pub fn id(&self) -> EndpointId;
|
||||||
|
pub fn secret_key(&self) -> &SecretKey;
|
||||||
|
pub fn addr(&self) -> EndpointAddr;
|
||||||
|
pub fn watch_addr(&self) -> impl Watcher<Value = EndpointAddr>;
|
||||||
|
|
||||||
|
// Lifecycle
|
||||||
|
pub async fn close(&self);
|
||||||
|
pub fn is_closed(&self) -> bool;
|
||||||
|
pub fn closed(&self) -> EndpointClosed;
|
||||||
|
pub async fn online(&self); // Wait for relay connection
|
||||||
|
|
||||||
|
// Configuration changes
|
||||||
|
pub fn set_alpns(&self, alpns: Vec<Vec<u8>>);
|
||||||
|
pub async fn insert_relay(&self, relay: RelayUrl, config: Arc<RelayConfig>) -> Option<Arc<RelayConfig>>;
|
||||||
|
pub async fn remove_relay(&self, relay: &RelayUrl) -> Option<Arc<RelayConfig>>;
|
||||||
|
pub async fn add_external_addr(&self, addr: SocketAddr);
|
||||||
|
pub async fn remove_external_addr(&self, addr: &SocketAddr) -> bool;
|
||||||
|
pub fn set_user_data_for_address_lookup(&self, user_data: Option<UserData>);
|
||||||
|
pub async fn network_change(&self);
|
||||||
|
|
||||||
|
// Observers
|
||||||
|
pub fn home_relay_status(&self) -> impl Watcher<Value = Vec<RelayStatus>>;
|
||||||
|
pub fn net_report(&self) -> impl Watcher<Value = Option<NetReport>>;
|
||||||
|
pub fn remote_info(&self, id: EndpointId) -> Option<RemoteInfo>;
|
||||||
|
pub fn metrics(&self) -> &EndpointMetrics;
|
||||||
|
pub fn bound_sockets(&self) -> Vec<SocketAddr>;
|
||||||
|
pub fn dns_resolver(&self) -> Result<&DnsResolver, EndpointError>;
|
||||||
|
pub fn tls_config(&self) -> &rustls::ClientConfig;
|
||||||
|
pub fn address_lookup(&self) -> Result<&AddressLookupServices, EndpointError>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `Builder`
|
||||||
|
Fluent builder for `Endpoint`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let ep = Endpoint::builder(presets::N0)
|
||||||
|
.secret_key(secret_key) // Identity
|
||||||
|
.alpns(vec![b"my-alpn".to_vec()]) // Accepted protocols
|
||||||
|
.relay_mode(RelayMode::Default) // Relay configuration
|
||||||
|
.address_lookup(PkarrPublisher::n0_dns()) // Address discovery
|
||||||
|
.address_lookup(DnsAddressLookup::n0_dns()) // DNS resolution
|
||||||
|
.addr_filter(AddrFilter::relay_only()) // Filter published addresses
|
||||||
|
.user_data_for_address_lookup(user_data) // Custom discovery data
|
||||||
|
.transport_config(QuicTransportConfig::default()) // QUIC tuning
|
||||||
|
.dns_resolver(dns_resolver) // Custom DNS resolver
|
||||||
|
.proxy_url(proxy_url) // HTTP proxy
|
||||||
|
.ca_roots_config(CaRootsConfig::default()) // TLS CA roots
|
||||||
|
.keylog(true) // SSLKEYLOGFILE debug
|
||||||
|
.max_tls_tickets(256) // 0-RTT ticket cache
|
||||||
|
.hooks(my_hook) // Connection hooks
|
||||||
|
.portmapper_config(PortmapperConfig::Enabled) // UPnP/NAT-PMP
|
||||||
|
.external_addr(addr) // Advertised external addr
|
||||||
|
.bind_addr("0.0.0.0:0")? // Bind specific socket
|
||||||
|
.bind() // Build & bind
|
||||||
|
.await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
### `RelayMode`
|
||||||
|
```rust
|
||||||
|
pub enum RelayMode {
|
||||||
|
Disabled, // No relay
|
||||||
|
Default, // n0 production relays
|
||||||
|
Staging, // n0 staging relays
|
||||||
|
Custom(RelayMap), // Custom relay configuration
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Protocol Handler (`iroh::protocol`)
|
||||||
|
|
||||||
|
### `ProtocolHandler`
|
||||||
|
Trait for handling incoming connections by ALPN:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait ProtocolHandler: Send + Sync + Debug + 'static {
|
||||||
|
// Optional: intercept at Accepting stage (supports 0-RTT)
|
||||||
|
fn on_accepting(&self, accepting: Accepting) -> impl Future<Output = Result<Connection, AcceptError>> + Send;
|
||||||
|
|
||||||
|
// Required: handle the established connection
|
||||||
|
fn accept(&self, connection: Connection) -> impl Future<Output = Result<(), AcceptError>> + Send;
|
||||||
|
|
||||||
|
// Optional: called on graceful shutdown
|
||||||
|
fn shutdown(&self) -> impl Future<Output = ()> + Send;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `Router`
|
||||||
|
Spawns an accept loop that dispatches incoming connections to registered handlers:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let router = Router::builder(endpoint)
|
||||||
|
.accept(b"/my-alpn", Arc::new(MyHandler))
|
||||||
|
.incoming_filter(|incoming| {
|
||||||
|
if !incoming.remote_addr_validated() {
|
||||||
|
IncomingFilterOutcome::Retry
|
||||||
|
} else {
|
||||||
|
IncomingFilterOutcome::Accept
|
||||||
|
}
|
||||||
|
})
|
||||||
|
.spawn();
|
||||||
|
|
||||||
|
// Later...
|
||||||
|
router.shutdown().await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
### `IncomingFilterOutcome`
|
||||||
|
```rust
|
||||||
|
pub enum IncomingFilterOutcome {
|
||||||
|
Accept, // Allow the connection
|
||||||
|
Retry, // Send QUIC retry (address validation)
|
||||||
|
Reject, // Refuse with CONNECTION_REFUSED
|
||||||
|
Ignore, // Drop silently (remote times out)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `AccessLimit`
|
||||||
|
Wrapper that limits connections to allowed `EndpointId`s:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let handler = AccessLimit::new(MyHandler, |endpoint_id| allowed_set.contains(&endpoint_id));
|
||||||
|
```
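
The wrapper pattern behind `AccessLimit` can be sketched without iroh. Everything below is a simplified, synchronous model: the `Handler` trait, the `Echo` handler, and `allow_list` are hypothetical stand-ins, not the real `ProtocolHandler` machinery.

```rust
use std::collections::HashSet;

type EndpointId = [u8; 32];

/// Simplified, synchronous stand-in for a protocol handler.
trait Handler {
    fn accept(&self, remote: EndpointId) -> Result<(), String>;
}

/// Wraps a handler and rejects remotes that fail the predicate.
struct AccessLimit<H, F> {
    inner: H,
    is_allowed: F,
}

impl<H: Handler, F: Fn(EndpointId) -> bool> AccessLimit<H, F> {
    fn new(inner: H, is_allowed: F) -> Self {
        Self { inner, is_allowed }
    }
}

impl<H: Handler, F: Fn(EndpointId) -> bool> Handler for AccessLimit<H, F> {
    fn accept(&self, remote: EndpointId) -> Result<(), String> {
        if (self.is_allowed)(remote) {
            self.inner.accept(remote)
        } else {
            Err("endpoint not allowed".to_string())
        }
    }
}

/// Build a predicate from a fixed allow-list (hypothetical helper).
fn allow_list(ids: HashSet<EndpointId>) -> impl Fn(EndpointId) -> bool {
    move |id| ids.contains(&id)
}

/// Trivial inner handler used for illustration.
struct Echo;

impl Handler for Echo {
    fn accept(&self, _remote: EndpointId) -> Result<(), String> {
        Ok(())
    }
}
```

The check runs before the inner handler sees the connection, so the wrapped handler needs no access-control logic of its own.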
|
||||||
|
|
||||||
|
### `EndpointHooks`
|
||||||
|
Intercept connection establishment at two points:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait EndpointHooks: Debug + Send + Sync {
|
||||||
|
// Before outgoing connection starts
|
||||||
|
fn before_connect<'a>(&'a self, remote_addr: &'a EndpointAddr, alpn: &'a [u8])
|
||||||
|
-> BoxFuture<'a, BeforeConnectOutcome>;
|
||||||
|
|
||||||
|
// After TLS handshake completes (on both sides)
|
||||||
|
fn after_handshake<'a>(&'a self, info: &'a ConnectionInfo)
|
||||||
|
-> BoxFuture<'a, AfterHandshakeOutcome>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Connection Types (`iroh::endpoint::connection`)
|
||||||
|
|
||||||
|
### `Connecting`
|
||||||
|
The state between initiating a connection and completing the handshake:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl Connecting {
|
||||||
|
// `Connecting` is itself a future: `connecting.await` yields `Result<Connection, ConnectingError>`
|
||||||
|
pub fn into_0rtt(self) -> Result<(OutgoingZeroRttConnection, Connection), Connecting>;
|
||||||
|
pub fn alpn(&self) -> Result<Vec<u8>, ConnectingError>;
|
||||||
|
pub fn remote_id(&self) -> Result<EndpointId, RemoteEndpointIdError>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `Connection`
|
||||||
|
Wraps a `noq::Connection` with iroh-specific metadata:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl Connection {
|
||||||
|
// Stream operations
|
||||||
|
pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), OpenBi>;
|
||||||
|
pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), AcceptBi>;
|
||||||
|
pub async fn open_uni(&self) -> Result<SendStream, OpenUni>;
|
||||||
|
pub async fn accept_uni(&self) -> Result<RecvStream, AcceptUni>;
|
||||||
|
|
||||||
|
// Datagrams
|
||||||
|
pub fn send_datagram(&self, data: SendDatagram) -> Result<(), SendDatagramError>;
|
||||||
|
pub async fn read_datagram(&self) -> Result<Bytes, ReadDatagram>;
|
||||||
|
|
||||||
|
// Connection lifecycle
|
||||||
|
pub fn close(&self, error_code: VarInt, reason: &[u8]);
|
||||||
|
pub async fn closed(&self) -> ConnectionError;
|
||||||
|
|
||||||
|
// Identity
|
||||||
|
pub fn remote_id(&self) -> EndpointId;
|
||||||
|
pub fn alpn(&self) -> Vec<u8>;
|
||||||
|
|
||||||
|
// Path observation
|
||||||
|
pub fn paths(&self) -> PathWatcher;
|
||||||
|
|
||||||
|
// Keying material export
|
||||||
|
pub fn export_keying_material(&self, output: &mut [u8], label: &[u8], context: Option<&[u8]>) -> Result<(), ExportKeyingMaterialError>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `Incoming`
|
||||||
|
Pre-accept incoming connection:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl Incoming {
|
||||||
|
pub fn accept(self) -> Result<Accepting, ConnectionError>;
|
||||||
|
pub fn accept_with(self, server_config: Arc<ServerConfig>) -> Result<Accepting, ConnectionError>;
|
||||||
|
pub fn refuse(self);
|
||||||
|
pub fn retry(self) -> Result<(), RetryError>;
|
||||||
|
pub fn ignore(self);
|
||||||
|
pub fn remote_addr(&self) -> IncomingAddr;
|
||||||
|
pub fn local_ip(&self) -> Option<IpAddr>;
|
||||||
|
pub fn remote_addr_validated(&self) -> bool;
|
||||||
|
pub fn decrypt(&self) -> Option<DecryptedInitial>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `IncomingAddr`
|
||||||
|
```rust
|
||||||
|
pub enum IncomingAddr {
|
||||||
|
Ip(SocketAddr),
|
||||||
|
Relay { url: RelayUrl, endpoint_id: EndpointId },
|
||||||
|
Custom(CustomAddr),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## `RelayMap` and `RelayConfig` (`iroh-relay`)
|
||||||
|
|
||||||
|
### `RelayMap`
|
||||||
|
Thread-safe map of relay servers:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let map = RelayMap::from_iter([
|
||||||
|
"https://relay1.example.org".parse()?,
|
||||||
|
"https://relay2.example.org".parse()?,
|
||||||
|
]);
|
||||||
|
```
|
||||||
|
|
||||||
|
### `RelayConfig`
|
||||||
|
```rust
|
||||||
|
pub struct RelayConfig {
|
||||||
|
pub url: RelayUrl,
|
||||||
|
pub quic: Option<RelayQuicConfig>, // QAD support
|
||||||
|
}
|
||||||
|
|
||||||
|
pub struct RelayQuicConfig {
|
||||||
|
pub port: u16, // Default: 3478
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## `EndpointData` and `EndpointInfo` (`iroh-dns`)
|
||||||
|
|
||||||
|
### `EndpointData`
|
||||||
|
The data published about an endpoint:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct EndpointData {
|
||||||
|
addrs: Vec<TransportAddr>,
|
||||||
|
user_data: Option<UserData>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `EndpointInfo`
|
||||||
|
Combines `EndpointId` with `EndpointData`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct EndpointInfo {
|
||||||
|
pub endpoint_id: EndpointId,
|
||||||
|
pub data: EndpointData,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `UserData`
|
||||||
|
Application-defined string data published alongside addressing info:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct UserData(String); // Max 256 bytes
|
||||||
|
```
|
||||||
|
|
||||||
|
### `AddrFilter`
|
||||||
|
Controls which addresses are published to address lookup services:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let filter = AddrFilter::relay_only(); // Only relay URLs
|
||||||
|
let filter = AddrFilter::unfiltered(); // All addresses
|
||||||
|
let filter = AddrFilter::custom(|addrs| { /* custom logic */ });
|
||||||
|
```
|
||||||
401
docs/research/references/iroh/iroh/03-networking-protocols.md
Normal file
@@ -0,0 +1,401 @@
|
|||||||
|
# Iroh: Networking & Protocol Details
|
||||||
|
|
||||||
|
## Connection Establishment
|
||||||
|
|
||||||
|
### Overview
|
||||||
|
The connection process follows this sequence:
|
||||||
|
|
||||||
|
```
|
||||||
|
Caller                                      Callee
  |                                            |
  |--- connect(EndpointAddr, alpn) ----------->|  (via relay first)
  |                                            |
  |<------ TLS Handshake (Raw Public Key) ---->|
  |                                            |
  |<====== QUIC Connection Established ========|
  |                                            |
  |   (iroh attempts direct path migration)    |
  |                                            |
  |--- open_bi() / open_uni() ---------------->|
  |<--- accept_bi() / accept_uni() ------------|
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step-by-Step
|
||||||
|
|
||||||
|
1. **Resolve addressing** — `resolve_remote(EndpointAddr)` starts a `RemoteStateActor` for the peer. If no direct addresses or relay URL are provided, Address Lookup services are queried.
|
||||||
|
|
||||||
|
2. **Map addresses** — `EndpointId` is mapped to a synthetic IPv6 address for the QUIC layer (`EndpointIdMappedAddr`). Relay and custom transport addresses are similarly mapped.
|
||||||
|
|
||||||
|
3. **TLS connection** — Uses RFC 7250 Raw Public Keys. The server name is encoded as `<z32-encoded-pubkey>.iroh.invalid`. Both sides authenticate by `EndpointId`.
|
||||||
|
|
||||||
|
4. **ALPN negotiation** — The Application-Layer Protocol Negotiation determines which protocol handler receives the connection.
|
||||||
|
|
||||||
|
5. **Path migration** — Once a QUIC connection is established (initially via relay), iroh continuously searches for better paths. Direct IP paths are preferred when available.
|
||||||
|
|
||||||
|
## Transport Layer Architecture
|
||||||
|
|
||||||
|
### The `Socket` — Core Connectivity Engine
|
||||||
|
|
||||||
|
The `Socket` struct is the heart of iroh's networking. It manages:
|
||||||
|
- Multiple transport paths (IPv4, IPv6, relay, custom)
|
||||||
|
- Address discovery and NAT traversal
|
||||||
|
- Path migration between relay and direct connections
|
||||||
|
|
||||||
|
```
|
||||||
|
            ┌──────────────────────┐
            │       Endpoint       │  (Public API)
            │ (Arc<EndpointInner>) │
            └──────────┬───────────┘
                       │
            ┌──────────▼───────────┐
            │        Socket        │  (Connectivity engine)
            │    (Arc<Socket>)     │
            └──────────┬───────────┘
                       │
      ┌────────────────┼─────────────────┐
      │                │                 │
┌─────▼──────┐  ┌──────▼───────┐  ┌──────▼────────┐
│IpTransport │  │RelayTransport│  │CustomTransport│
│ (IPv4/v6)  │  │              │  │  (unstable)   │
└─────┬──────┘  └──────┬───────┘  └───────────────┘
      │                │
┌─────▼──────┐  ┌──────▼───────┐
│ UdpSocket  │  │  WebSocket   │
│ (netwatch) │  │    Actor     │
└────────────┘  └──────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Transport Configuration
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum TransportConfig {
|
||||||
|
Ip {
|
||||||
|
config: IpConfig, // IPv4 or IPv6 socket config
|
||||||
|
is_user_defined: bool,
|
||||||
|
},
|
||||||
|
Relay {
|
||||||
|
relay_map: RelayMap, // Which relay servers to use
|
||||||
|
is_user_defined: bool,
|
||||||
|
},
|
||||||
|
#[cfg(feature = "unstable-custom-transports")]
|
||||||
|
Custom(Arc<dyn CustomTransport>),
|
||||||
|
}
|
||||||
|
|
||||||
|
pub enum IpConfig {
|
||||||
|
V4 { ip_net: Ipv4Net, port: u16, is_required: bool, is_default: bool },
|
||||||
|
V6 { ip_net: Ipv6Net, scope_id: u32, port: u16, is_required: bool, is_default: bool },
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Address Mapping
|
||||||
|
|
||||||
|
Iroh maps all transport addresses to IPv6 for the QUIC layer:
|
||||||
|
|
||||||
|
- **IPv4/IPv6 addresses** → used directly as QUIC path addresses
|
||||||
|
- **Relay addresses** → mapped to synthetic IPv6 addresses in a dedicated range
|
||||||
|
- **Custom addresses** → mapped to synthetic IPv6 addresses in another range
|
||||||
|
|
||||||
|
The `MappedAddrs` struct maintains these mappings:
|
||||||
|
```rust
|
||||||
|
pub(crate) struct MappedAddrs {
|
||||||
|
pub(super) endpoint_addrs: AddrMap<EndpointId, EndpointIdMappedAddr>,
|
||||||
|
pub(super) relay_addrs: AddrMap<(RelayUrl, EndpointId), RelayMappedAddr>,
|
||||||
|
pub(super) custom_addrs: AddrMap<CustomAddr, CustomMappedAddr>,
|
||||||
|
}
|
||||||
|
```
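
A minimal sketch of the mapping idea, assuming a simple counter-based allocator inside a fixed 64-bit prefix. The `AddrMap` below is a hypothetical re-implementation for illustration; the real type's prefix choice, generics, and locking are not shown in this document.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::net::Ipv6Addr;

/// Two-way map between opaque keys and synthetic IPv6 addresses.
struct AddrMap<K> {
    prefix: [u8; 8], // upper 64 bits shared by all addresses in this range
    next: u64,       // counter that fills the lower 64 bits
    forward: HashMap<K, Ipv6Addr>,
    reverse: HashMap<Ipv6Addr, K>,
}

impl<K: Clone + Eq + Hash> AddrMap<K> {
    fn new(prefix: [u8; 8]) -> Self {
        Self { prefix, next: 0, forward: HashMap::new(), reverse: HashMap::new() }
    }

    /// Return the existing mapping for `key`, or allocate a fresh address.
    fn get_or_insert(&mut self, key: K) -> Ipv6Addr {
        if let Some(addr) = self.forward.get(&key) {
            return *addr;
        }
        let mut octets = [0u8; 16];
        octets[..8].copy_from_slice(&self.prefix);
        octets[8..].copy_from_slice(&self.next.to_be_bytes());
        self.next += 1;
        let addr = Ipv6Addr::from(octets);
        self.forward.insert(key.clone(), addr);
        self.reverse.insert(addr, key);
        addr
    }

    /// Translate a synthetic address seen by the QUIC layer back to its key.
    fn lookup(&self, addr: &Ipv6Addr) -> Option<&K> {
        self.reverse.get(addr)
    }
}
```

Giving each address kind its own prefix lets the socket layer tell from the address alone whether a packet should go out over IP, a relay, or a custom transport.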
|
||||||
|
|
||||||
|
### Transport Bias
|
||||||
|
|
||||||
|
Path selection uses a configurable bias system:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let endpoint = Endpoint::builder(presets::N0)
|
||||||
|
.transport_bias(AddrKind::Custom(42), TransportBias::primary())
|
||||||
|
.bind()
|
||||||
|
.await?;
|
||||||
|
```
|
||||||
|
|
||||||
|
Default biases:
|
||||||
|
- IPv4 and IPv6 are **primary** (IPv6 gets a small RTT advantage)
|
||||||
|
- Relay is **backup** (only used when no primary transport available)
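
The default biases can be read as a two-level ordering: tier first, then biased RTT. The sketch below models that reading; the `Kind`, `Tier`, and `Path` types and the 3 ms IPv6 advantage are assumptions for illustration, not values taken from iroh.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Ipv4,
    Ipv6,
    Relay,
}

/// Declaration order matters: `Primary` sorts before `Backup`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    Primary,
    Backup,
}

#[derive(Debug, Clone, Copy)]
struct Path {
    kind: Kind,
    rtt: Duration,
}

/// Hypothetical size of the IPv6 advantage.
const IPV6_RTT_ADVANTAGE: Duration = Duration::from_millis(3);

fn tier(kind: Kind) -> Tier {
    match kind {
        Kind::Ipv4 | Kind::Ipv6 => Tier::Primary,
        Kind::Relay => Tier::Backup,
    }
}

/// RTT used for comparison: IPv6 paths are treated as slightly faster.
fn biased_rtt(path: &Path) -> Duration {
    match path.kind {
        Kind::Ipv6 => path.rtt.saturating_sub(IPV6_RTT_ADVANTAGE),
        _ => path.rtt,
    }
}

/// Pick the best path: any primary beats any backup, then lowest biased RTT.
fn select(paths: &[Path]) -> Option<&Path> {
    paths.iter().min_by_key(|p| (tier(p.kind), biased_rtt(p)))
}
```

With this ordering a relay path is chosen only when no primary path exists, even if the relay happens to have the lowest measured RTT.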
|
||||||
|
|
||||||
|
## Relay Protocol
|
||||||
|
|
||||||
|
### Architecture
|
||||||
|
|
||||||
|
The relay system is based on a revised version of Tailscale's DERP (Designated Encrypted Relay for Packets) protocol.
|
||||||
|
|
||||||
|
```
|
||||||
|
Client A             Relay Server              Client B
   │                       │                       │
   │─── HTTP CONNECT ─────>│                       │
   │<── 200 OK ────────────│                       │
   │                       │<─── HTTP CONNECT ─────│
   │                       │──── 200 OK ──────────>│
   │                       │                       │
   │─── Encrypted QUIC ───>│─── Encrypted QUIC ───>│
   │<── Encrypted QUIC ────│<── Encrypted QUIC ────│
|
||||||
|
```
|
||||||
|
|
||||||
|
### Relay Actor
|
||||||
|
|
||||||
|
The `RelayActor` manages the WebSocket connection to the relay:
|
||||||
|
- Connects to relay via HTTPS, upgrades to custom protocol
|
||||||
|
- Sends/receives encrypted datagrams on behalf of the local endpoint
|
||||||
|
- Manages reconnection on network changes or relay restarts
|
||||||
|
- Reports connection status via `HomeRelayWatch`
|
||||||
|
|
||||||
|
### Relay Data Flow
|
||||||
|
1. Outgoing packet → `RelayTransport::send()` → `RelayActor` → WebSocket → Relay server → WebSocket → remote `RelayActor` → remote `RelayTransport::recv()` → QUIC
|
||||||
|
2. The relay only sees encrypted QUIC packets — it cannot decode application data
|
||||||
|
|
||||||
|
### Home Relay Selection
|
||||||
|
|
||||||
|
The `net_report` module continuously probes relay servers and maintains latency statistics. The "home relay" is selected based on:
|
||||||
|
- Lowest recent latency (with hysteresis to avoid flapping)
|
||||||
|
- A switch threshold: a candidate replaces the current home relay only if its latency is at most 2/3 of the current relay's latency
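
The hysteresis rule can be sketched as a single comparison. This assumes the 2/3 threshold means "the candidate's latency must be at most two thirds of the current relay's latency"; `pick_home_relay` is a hypothetical function, and the exact comparison in `net_report` may differ.

```rust
use std::time::Duration;

/// Decide which relay should be the home relay.
///
/// `current` is the existing home relay and its latest latency, if any;
/// `best` is the lowest-latency relay from the most recent probes.
fn pick_home_relay<'a>(
    current: Option<(&'a str, Duration)>,
    best: (&'a str, Duration),
) -> &'a str {
    match current {
        Some((cur, cur_latency)) if cur != best.0 => {
            // Switch only when best <= 2/3 * current, written without division.
            if best.1 * 3 <= cur_latency * 2 {
                best.0
            } else {
                cur
            }
        }
        // No home relay yet, or the best relay already is the home relay.
        _ => best.0,
    }
}
```

For example, with the current relay at 90 ms, a candidate at 70 ms is ignored while a candidate at 60 ms triggers a switch. The threshold keeps small latency fluctuations from causing the home relay to flap.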
|
||||||
|
|
||||||
|
## Hole-Punching & NAT Traversal
|
||||||
|
|
||||||
|
### QUIC Address Discovery (QAD)
|
||||||
|
|
||||||
|
Iroh uses QUIC Address Discovery (based on [draft-ietf-quic-address-discovery](https://datatracker.ietf.org/doc/draft-ietf-quic-address-discovery/)) to discover external IP addresses. The relay servers expose QAD endpoints.
|
||||||
|
|
||||||
|
The `net_report` module:
|
||||||
|
1. Establishes QUIC connections to relay servers
|
||||||
|
2. Uses `observed_external_addr()` to learn external addresses
|
||||||
|
3. Reports NAT type, mapping behavior, and preferred relay
|
||||||
|
|
||||||
|
### NAT Traversal Strategy
|
||||||
|
|
||||||
|
```
|
||||||
|
┌──────────────────────────────────────┐
│            NAT Traversal             │
│                                      │
│ 1. Direct connection attempt         │
│    (simultaneous open)               │
│                                      │
│ 2. QAD-discovered addresses          │
│    (relay reports observed IP)       │
│                                      │
│ 3. Port mapping (UPnP/PCP/NAT-PMP)   │
│    (if supported by gateway)         │
│                                      │
│ 4. Relay fallback                    │
│    (always available)                │
└──────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Port Mapper
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum PortmapperConfig {
|
||||||
|
Enabled {}, // Default: tries UPnP, PCP, NAT-PMP
|
||||||
|
Disabled, // No port mapping
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When enabled, the port mapper:
|
||||||
|
- Discovers gateway devices
|
||||||
|
- Requests port mappings
|
||||||
|
- Provides external addresses to the endpoint
|
||||||
|
- Updates when mappings change
|
||||||
|
|
||||||
|
### Net Report
|
||||||
|
|
||||||
|
`NetReport` discovers network conditions:
|
||||||
|
- IPv4/IPv6 connectivity
|
||||||
|
- NAT mapping behavior (varies by destination or not)
|
||||||
|
- Captive portal detection
|
||||||
|
- Preferred relay selection
|
||||||
|
- External IP addresses (via QAD)
|
||||||
|
|
||||||
|
Key timeouts:
|
||||||
|
- `NET_REPORT_TIMEOUT` = 10 seconds
|
||||||
|
- `FULL_REPORT_INTERVAL` = 5 minutes
|
||||||
|
- `HEARTBEAT_INTERVAL` = 5 seconds (keepalive)
|
||||||
|
- `PATH_MAX_IDLE_TIMEOUT` = 15 seconds (direct)
|
||||||
|
- `RELAY_PATH_MAX_IDLE_TIMEOUT` = 30 seconds (relay)
|
||||||
|
|
||||||
|
## Address Lookup System
|
||||||
|
|
||||||
|
### Trait Definition
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait AddressLookup: Debug + Send + Sync + 'static {
|
||||||
|
fn publish(&self, data: &EndpointData);
|
||||||
|
fn resolve(&self, endpoint_id: EndpointId) -> Option<BoxStream<Result<Item, Error>>>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `AddressLookupServices`
|
||||||
|
A composite that runs multiple lookup services concurrently:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let services = AddressLookupServices::default();
|
||||||
|
services.set_addr_filter(AddrFilter::relay_only());
|
||||||
|
services.add(publisher);
|
||||||
|
services.add(resolver);
|
||||||
|
```
|
||||||
|
|
||||||
|
Resolution merges results from all services. Individual service errors don't block other services.
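
The merge-and-tolerate-errors behaviour can be modelled synchronously. The real services resolve concurrently and yield streams; the `Lookup` trait, `Static`, `Failing`, and `resolve_all` below are hypothetical simplifications that show only the merging logic.

```rust
use std::collections::BTreeSet;

/// Simplified, blocking stand-in for an address lookup service.
trait Lookup {
    fn resolve(&self, endpoint_id: &str) -> Result<Vec<String>, String>;
}

/// Service that always returns a fixed set of addresses.
struct Static(Vec<String>);

/// Service that always fails.
struct Failing;

impl Lookup for Static {
    fn resolve(&self, _endpoint_id: &str) -> Result<Vec<String>, String> {
        Ok(self.0.clone())
    }
}

impl Lookup for Failing {
    fn resolve(&self, _endpoint_id: &str) -> Result<Vec<String>, String> {
        Err("lookup unavailable".to_string())
    }
}

/// Query every service, merging successes and collecting errors separately.
fn resolve_all(
    services: &[Box<dyn Lookup>],
    endpoint_id: &str,
) -> (BTreeSet<String>, Vec<String>) {
    let mut addrs = BTreeSet::new();
    let mut errors = Vec::new();
    for service in services {
        match service.resolve(endpoint_id) {
            Ok(found) => addrs.extend(found), // the set drops duplicates
            Err(err) => errors.push(err),     // a failure never aborts the loop
        }
    }
    (addrs, errors)
}
```

A failing service contributes only an error entry, so one unreachable lookup backend does not prevent a connection attempt using addresses found elsewhere.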
|
||||||
|
|
||||||
|
### Built-in Implementations
|
||||||
|
|
||||||
|
#### `PkarrPublisher`
|
||||||
|
Publishes endpoint info to a pkarr relay via HTTP PUT:
|
||||||
|
```rust
|
||||||
|
let publisher = PkarrPublisher::builder(pkarr_url)
|
||||||
|
.addr_filter(AddrFilter::relay_only()) // Default: relay-only
|
||||||
|
.build(secret_key, tls_config);
|
||||||
|
```
|
||||||
|
|
||||||
|
#### `PkarrResolver` (browser/WASM)
|
||||||
|
Resolves endpoint info from a pkarr relay via HTTP GET.
|
||||||
|
|
||||||
|
#### `DnsAddressLookup` (non-browser)
|
||||||
|
Resolves endpoint info via DNS TXT records:
|
||||||
|
```rust
|
||||||
|
// Default n0 DNS
|
||||||
|
let lookup = DnsAddressLookup::n0_dns();
|
||||||
|
|
||||||
|
// Custom DNS origin
|
||||||
|
let lookup = DnsAddressLookup::new(dns_resolver, origin);
|
||||||
|
```
|
||||||
|
|
||||||
|
#### `MemoryLookup`
|
||||||
|
In-memory address lookup for testing:
|
||||||
|
```rust
|
||||||
|
let lookup = MemoryLookup::new();
|
||||||
|
lookup.add_endpoint(endpoint_id, endpoint_data);
|
||||||
|
```
|
||||||
|
|
||||||
|
### DNS Record Format
|
||||||
|
```
|
||||||
|
_iroh.<z32-encoded-endpoint-id>.<origin-domain> TXT
|
||||||
|
```
|
||||||
|
Attributes:
|
||||||
|
- `relay=<url>` — Home relay URL
|
||||||
|
- `addr=<addr> <addr>` — Space-separated socket addresses
|
||||||
|
- `user_data=<base64-encoded-data>` — Application-specific data
|
||||||
|
|
||||||
|
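The record layout above can be sketched with the standard library alone. This is a minimal illustration, not iroh's implementation: `z32_encode`, `record_name`, `ParsedAttrs`, and `parse_attrs` are hypothetical names, the alphabet is the commonly published z-base-32 one, and the sketch assumes each TXT string carries exactly one `key=value` attribute.

```rust
use std::net::SocketAddr;

/// z-base-32 alphabet (assumption: the commonly published one).
const Z32_ALPHABET: &[u8; 32] = b"ybndrfg8ejkmcpqxot1uwisza345h769";

/// Encode bytes as z-base-32, most significant bits first, without padding.
fn z32_encode(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut buffer: u32 = 0; // bits read but not yet emitted
    let mut bits: u32 = 0;
    for &byte in bytes {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            let index = (buffer >> bits) & 0x1f;
            out.push(Z32_ALPHABET[index as usize] as char);
        }
        buffer &= (1 << bits) - 1; // keep only the pending bits
    }
    if bits > 0 {
        // Left-align the leftover bits in a final 5-bit group.
        let index = (buffer << (5 - bits)) & 0x1f;
        out.push(Z32_ALPHABET[index as usize] as char);
    }
    out
}

/// Build the TXT record name `_iroh.<z32-endpoint-id>.<origin>`.
fn record_name(endpoint_id: &[u8; 32], origin: &str) -> String {
    format!("_iroh.{}.{}", z32_encode(endpoint_id), origin)
}

/// Attributes recovered from the TXT strings of one record.
#[derive(Debug, Default, PartialEq)]
struct ParsedAttrs {
    relay: Option<String>,
    addrs: Vec<SocketAddr>,
    user_data: Option<String>, // left encoded; decoding is out of scope here
}

/// Parse `key=value` strings; unknown keys and unparsable addresses are skipped.
fn parse_attrs<'a>(txt_strings: impl IntoIterator<Item = &'a str>) -> ParsedAttrs {
    let mut parsed = ParsedAttrs::default();
    for entry in txt_strings {
        let Some((key, value)) = entry.split_once('=') else { continue };
        match key {
            "relay" => parsed.relay = Some(value.to_string()),
            "addr" => parsed.addrs.extend(
                value
                    .split_whitespace()
                    .filter_map(|a| a.parse::<SocketAddr>().ok()),
            ),
            "user_data" => parsed.user_data = Some(value.to_string()),
            _ => {}
        }
    }
    parsed
}

fn main() {
    println!("{}", record_name(&[0u8; 32], "dns.iroh.link"));
    let attrs = parse_attrs([
        "relay=https://relay.example",
        "addr=192.0.2.1:4433 [2001:db8::1]:4433",
    ]);
    println!("{attrs:?}");
}
```

A 32-byte ID is 256 bits, so the encoded label is always 52 characters long, which fits within the 63-character DNS label limit.
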
## TLS Configuration

### `TlsConfig`
Manages TLS state shared across sessions:
```rust
struct TlsConfig {
    secret_key: SecretKey,
    cert_resolver: Arc<ResolveRawPublicKeyCert>,
    server_verifier: Arc<ServerCertificateVerifier>,
    client_verifier: Arc<ClientCertificateVerifier>,
    session_store: Arc<dyn ClientSessionStore>,
    crypto_provider: Arc<CryptoProvider>,
}
```

### Raw Public Key Certificate
Uses RFC 7250 — no X.509 certificates. The `ResolveRawPublicKeyCert` resolver creates TLS certificates on-the-fly from the Ed25519 public key.

### Verification Flow
- **Client verifies server**: The `ServerCertificateVerifier` checks that the server's `EndpointId` matches the expected `EndpointId` encoded in the TLS server name.
- **Server verifies client**: The `ClientCertificateVerifier` ensures the client presents a valid raw public key.

### Crypto Providers
Two built-in options via feature flags:
- `tls-ring` — uses `ring` crypto (default)
- `tls-aws-lc-rs` — uses AWS LC-RS crypto

Custom providers can be set via `Builder::crypto_provider()`.

## Multipath & Path Migration

Iroh supports QUIC multipath connections. Multiple paths can be active simultaneously:

```rust
// Watch path changes
let paths = connection.paths();
while let Some(infos) = paths.stream().next().await {
    for info in infos.iter() {
        if info.is_ip() { /* direct path */ }
        if info.is_relay() { /* relay path */ }
    }
}
```

Maximum multipath paths per connection: 12 (`MAX_MULTIPATH_PATHS`).

### Path Types
```rust
pub struct PathInfo {
    pub addr: TransportAddr,
    pub usage: TransportAddrUsage,
}

pub enum TransportAddrUsage {
    DefaultRoute,
    SubnetRoute,
    Backup,
}
```

## Connection Hooks

```rust
#[derive(Debug, Clone)]
struct MyHook;

impl EndpointHooks for MyHook {
    fn before_connect<'a>(
        &'a self,
        remote_addr: &'a EndpointAddr,
        alpn: &'a [u8],
    ) -> BoxFuture<'a, BeforeConnectOutcome> {
        Box::pin(async move {
            if is_allowed(remote_addr.id()) {
                BeforeConnectOutcome::Accept
            } else {
                BeforeConnectOutcome::Reject
            }
        })
    }

    fn after_handshake<'a>(
        &'a self,
        info: &'a ConnectionInfo,
    ) -> BoxFuture<'a, AfterHandshakeOutcome> {
        Box::pin(async move {
            AfterHandshakeOutcome::Accept
        })
    }
}
```

## Custom Transports (Unstable)

```rust
pub trait CustomTransport: Send + Sync + Debug + 'static {
    // Create an endpoint for this transport
    fn create_endpoint(&self, config: CustomEndpointConfig) -> Result<Arc<dyn CustomEndpoint>, CustomTransportError>;
}

pub trait CustomEndpoint: Send + Sync + Debug + 'static {
    fn send(&self, item: CustomSendItem) -> Result<(), CustomTransportError>;
    fn recv(&self) -> Result<CustomRecvItem, CustomTransportError>;
}

// Register:
let ep = Endpoint::builder(presets::N0)
    .add_custom_transport(Arc::new(MyTransport))
    .bind()
    .await?;
```

Transport IDs (from `TRANSPORTS.md`):

| ID | Transport | Address format |
|----|-----------|---------------|
| `0x00-0x1F` | Reserved | - |
| `0x20` | Test | Ed25519 public key (32 bytes) |
| `0x544F52` | Tor | Ed25519 public key (32 bytes) |
| `0x424C45` | BLE | Bluetooth MAC address (6 bytes) |

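The two multi-byte IDs in the table decode to printable ASCII when read as big-endian bytes. The sketch below only demonstrates that property of the numeric values; it says nothing about how iroh assigns or encodes IDs. `ascii_tag` is a hypothetical helper, and the `u64` width is an assumption.

```rust
/// Read a transport ID as big-endian bytes and return them as text
/// if, after skipping leading zero bytes, every byte is an ASCII uppercase letter.
fn ascii_tag(id: u64) -> Option<String> {
    let bytes: Vec<u8> = id
        .to_be_bytes()
        .iter()
        .copied()
        .skip_while(|&b| b == 0)
        .collect();
    if !bytes.is_empty() && bytes.iter().all(|b| b.is_ascii_uppercase()) {
        Some(bytes.iter().map(|&b| b as char).collect())
    } else {
        None
    }
}

fn main() {
    for id in [0x20u64, 0x544F52, 0x424C45] {
        println!("{id:#x} -> {:?}", ascii_tag(id));
    }
}
```
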
294
docs/research/references/iroh/iroh/04-sub-crates.md
Normal file
@@ -0,0 +1,294 @@
# Iroh: Sub-Crates

## `iroh-base`

**Purpose**: Fundamental types shared across all iroh crates.
**Features**: `key` (default), `relay` (default)

### Key Types

| Type | Description |
|------|-------------|
| `SecretKey` | Ed25519 signing key (32 bytes). Generated randomly or from bytes. |
| `PublicKey` | Ed25519 public key (32 bytes). Verifies signatures. |
| `EndpointId` | Type alias for `PublicKey` — used as network identity. |
| `Signature` | Ed25519 signature (64 bytes). |
| `RelayUrl` | Arc-wrapped `Url` identifying a relay server. |
| `EndpointAddr` | Combines `EndpointId` + `BTreeSet<TransportAddr>`. Primary addressing type. |
| `TransportAddr` | Enum: `Relay(RelayUrl)`, `Ip(SocketAddr)`, `Custom(CustomAddr)`. |
| `CustomAddr` | Opaque address for custom transports (id + bytes). |
| `KeyParsingError` | Error type for key parsing. |
| `RelayUrlParseError` | Error type for URL parsing. |

### `EndpointAddr` Methods

```rust
impl EndpointAddr {
    pub fn new(id: PublicKey) -> Self;
    pub fn from_parts(id: PublicKey, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
    pub fn with_relay_url(self, relay_url: RelayUrl) -> Self;
    pub fn with_ip_addr(self, addr: SocketAddr) -> Self;
    pub fn with_addrs(self, addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
    pub fn is_empty(&self) -> bool;
    pub fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr>;
    pub fn relay_urls(&self) -> impl Iterator<Item = &RelayUrl>;
}
```

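The builder-style methods above combine naturally with a set-backed address list. The following is a sketch with local stand-in types (hypothetical mirrors of `EndpointAddr` and `TransportAddr`, with relay URLs as plain strings and the `Custom` variant omitted), intended only to show how a `BTreeSet` absorbs duplicates and how a typed accessor filters by variant.

```rust
use std::collections::BTreeSet;
use std::net::SocketAddr;

/// Local stand-in for `TransportAddr`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum TransportAddr {
    Relay(String),
    Ip(SocketAddr),
}

/// Local stand-in for `EndpointAddr`: an identity plus a deduplicated address set.
#[derive(Debug, Clone)]
struct EndpointAddr {
    id: [u8; 32],
    addrs: BTreeSet<TransportAddr>,
}

impl EndpointAddr {
    fn new(id: [u8; 32]) -> Self {
        Self { id, addrs: BTreeSet::new() }
    }

    fn with_relay_url(mut self, url: &str) -> Self {
        self.addrs.insert(TransportAddr::Relay(url.to_string()));
        self
    }

    fn with_ip_addr(mut self, addr: SocketAddr) -> Self {
        self.addrs.insert(TransportAddr::Ip(addr));
        self
    }

    fn is_empty(&self) -> bool {
        self.addrs.is_empty()
    }

    /// Iterate over the direct socket addresses only.
    fn ip_addrs(&self) -> impl Iterator<Item = &SocketAddr> {
        self.addrs.iter().filter_map(|a| match a {
            TransportAddr::Ip(ip) => Some(ip),
            _ => None,
        })
    }
}

fn main() {
    let ip: SocketAddr = "192.0.2.1:4433".parse().unwrap();
    let addr = EndpointAddr::new([7u8; 32])
        .with_relay_url("https://relay.example")
        .with_ip_addr(ip)
        .with_ip_addr(ip); // the duplicate is absorbed by the set
    println!(
        "id starts with {}, {} addresses, {} direct, empty: {}",
        addr.id[0],
        addr.addrs.len(),
        addr.ip_addrs().count(),
        addr.is_empty()
    );
}
```
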
### Serialization
- `PublicKey`/`EndpointId`: Human-readable → z-base-32 string; Binary → 32 raw bytes
- `EndpointAddr`: Serialized as `{id, addrs}` with `TransportAddr` as tagged enum
- `RelayUrl`: Serialized as URL string

---

## `iroh-dns`

**Purpose**: DNS resolver and endpoint info serialization for address discovery.
**Key Features**: pkarr signed packet creation/verification, DNS TXT record parsing, configurable DNS resolver.

### Modules

| Module | Description |
|--------|-------------|
| `dns` | `DnsResolver` — configurable async DNS resolver with IPv4/IPv6 staggered lookup |
| `endpoint_info` | `EndpointInfo`, `EndpointData`, `AddrFilter`, `UserData` — serialization/deserialization |
| `pkarr` | Pkarr signed packet creation and verification |
| `attrs` | Low-level TXT record attribute parsing |

### `DnsResolver`

```rust
impl DnsResolver {
    pub fn new() -> Self;
    pub fn with_nameserver(addr: SocketAddr) -> Self;
    pub fn with_nameservers(addrs: Vec<SocketAddr>) -> Self;

    // Lookup methods
    pub async fn lookup_ipv4(&self, host: String) -> Result<...>;
    pub async fn lookup_ipv6(&self, host: String) -> Result<...>;
    pub async fn lookup_ipv4_ipv6_staggered(&self, host: &str, timeout: Duration, delays: &[u64]) -> Result<...>;
    pub async fn lookup_txt(&self, host: String) -> Result<...>;
    pub async fn lookup_endpoint_by_id(&self, id: &EndpointId, origin: &str) -> Result<EndpointInfo>;

    // Cache management
    pub fn clear_cache(&self);
    pub fn reset_resolver(&self);
}
```

### `EndpointInfo` & `EndpointData`

```rust
pub struct EndpointInfo {
    pub endpoint_id: EndpointId,
    pub data: EndpointData,
}

pub struct EndpointData {
    addrs: Vec<TransportAddr>,
    user_data: Option<UserData>,
}

impl EndpointData {
    pub fn new(addrs: Vec<TransportAddr>) -> Self;
    pub fn from_iter(addrs: impl IntoIterator<Item = TransportAddr>) -> Self;
    pub fn with_user_data(mut self, user_data: UserData) -> Self;
    pub fn addrs(&self) -> impl Iterator<Item = &TransportAddr>;
    pub fn user_data(&self) -> Option<&UserData>;
    pub fn apply_filter(&self, filter: &AddrFilter) -> Cow<'_, EndpointData>;
}
```

### `AddrFilter`

Controls which addresses are published in address lookup:

```rust
pub enum AddrFilter {
    RelayOnly,   // Only relay URLs
    Unfiltered,  // All addresses
    Custom(fn(&[TransportAddr]) -> Vec<TransportAddr>),
}
```

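To make the three variants concrete, here is a self-contained sketch of how such a filter could be applied to an address list. The types are local stand-ins rather than the real `iroh-dns` definitions, and the `apply` method and `ipv6_only` function are hypothetical (the real API exposes `EndpointData::apply_filter`, whose internals are not shown in these notes).

```rust
use std::net::SocketAddr;

/// Local stand-in for `TransportAddr` (relay URLs as plain strings).
#[derive(Debug, Clone, PartialEq)]
enum TransportAddr {
    Relay(String),
    Ip(SocketAddr),
}

/// Mirrors the three variants listed above.
enum AddrFilter {
    RelayOnly,
    Unfiltered,
    Custom(fn(&[TransportAddr]) -> Vec<TransportAddr>),
}

impl AddrFilter {
    /// Return the addresses that would be published under this filter.
    fn apply(&self, addrs: &[TransportAddr]) -> Vec<TransportAddr> {
        match self {
            AddrFilter::RelayOnly => addrs
                .iter()
                .filter(|a| matches!(a, TransportAddr::Relay(_)))
                .cloned()
                .collect(),
            AddrFilter::Unfiltered => addrs.to_vec(),
            AddrFilter::Custom(f) => f(addrs),
        }
    }
}

/// Example custom filter: keep only IPv6 socket addresses.
fn ipv6_only(addrs: &[TransportAddr]) -> Vec<TransportAddr> {
    addrs
        .iter()
        .filter(|a| matches!(a, TransportAddr::Ip(ip) if ip.is_ipv6()))
        .cloned()
        .collect()
}

/// One relay URL, one IPv4 address, one IPv6 address (documentation ranges).
fn sample_addrs() -> Vec<TransportAddr> {
    vec![
        TransportAddr::Relay("https://relay.example".to_string()),
        TransportAddr::Ip("192.0.2.1:4433".parse().unwrap()),
        TransportAddr::Ip("[2001:db8::1]:4433".parse().unwrap()),
    ]
}

fn main() {
    let addrs = sample_addrs();
    println!("relay only: {:?}", AddrFilter::RelayOnly.apply(&addrs));
    println!("unfiltered: {} entries", AddrFilter::Unfiltered.apply(&addrs).len());
    println!("custom:     {:?}", AddrFilter::Custom(ipv6_only).apply(&addrs));
}
```

Because `Custom` holds a plain `fn` pointer rather than a closure, the filter cannot capture state, which keeps the enum cheap to copy and compare.
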
### Pkarr Integration

```rust
// Creating signed packets
let info = EndpointInfo::new(secret_key.public())
    .with_relay_url(relay_url);
let packet = info.to_pkarr_signed_packet(&secret_key, 30)?; // 30 second TTL

// Verifying and extracting
let info = EndpointInfo::from_pkarr_signed_packet(&packet)?;
```

---

## `iroh-relay`

**Purpose**: Relay server and client implementation. Provides DERP-like relay protocol, QAD support, and relay server binary.

### Key Exports

| Type | Description |
|------|-------------|
| `RelayMap` | Thread-safe map of `RelayUrl → RelayConfig` |
| `RelayConfig` | Configuration for a single relay server |
| `RelayQuicConfig` | QUIC address discovery configuration |
| `KeyCache` | Cache for relay server public keys |
| `PingTracker` | Ping/pong tracking for relay connections |
| `MAX_PACKET_SIZE` | Maximum relay packet size (64KB - overhead) |

### Modules

| Module | Description |
|--------|-------------|
| `client` | HTTP client for relay server connections |
| `http` | HTTP-related relay functionality |
| `protos` | Protocol definitions (handshake, relay, streams) |
| `quic` | QUIC client for QAD probing |
| `server` | Full relay server implementation (`feature = "server"`) |
| `tls` | TLS configuration utilities |

### `RelayConfig`

```rust
pub struct RelayConfig {
    pub url: RelayUrl,
    pub quic: Option<RelayQuicConfig>,
}

impl RelayConfig {
    pub fn new(url: RelayUrl, quic: Option<RelayQuicConfig>) -> Self;
    pub fn from(url: RelayUrl) -> Self; // No QAD
}
```

### `RelayMap`

```rust
impl RelayMap {
    pub fn empty() -> Self;
    pub fn from(relay: RelayConfig) -> Self;
    pub fn from_iter(iter: impl IntoIterator<Item = impl Into<RelayConfig>>) -> Self;
    pub fn try_from_iter(iter: impl IntoIterator<Item = &str>) -> Result<Self, RelayUrlParseError>;
    pub fn insert(&self, url: RelayUrl, config: Arc<RelayConfig>) -> Option<Arc<RelayConfig>>;
    pub fn remove(&self, url: &RelayUrl) -> Option<Arc<RelayConfig>>;
    pub fn len(&self) -> usize;
    pub fn is_empty(&self) -> bool;
    pub fn urls<T: FromIterator<RelayUrl>>(&self) -> T;
    pub fn relays<T: FromIterator<Arc<RelayConfig>>>(&self) -> T;
}
```

### Relay Protocol (DERP-like)

The relay protocol is based on Tailscale's DERP protocol, adapted for iroh:

1. Client connects via HTTPS, upgrades to custom protocol
2. Authentication via raw public key (Ed25519)
3. Encrypted datagram forwarding by `EndpointId`
4. QAD probes via QUIC for address discovery
5. Ping/pong keepalive mechanism

### TLS Utilities

```rust
pub use iroh_relay::tls::{CaRootsConfig, default_provider};

// Skip certificate verification (testing only)
let config = CaRootsConfig::insecure_skip_verify();

// Use system trust roots
let config = CaRootsConfig::platform_verifier();

// Use specific roots
let config = CaRootsConfig::from_pem(pem_bytes);
```

---

## `iroh-dns-server`

**Purpose**: DNS server that resolves iroh `EndpointId`s to addressing information. Powers `dns.iroh.link`.

### Key Features
- Serves DNS TXT records for `_iroh.<z32-endpoint-id>.<origin>` queries
- Integrates with pkarr for signed record verification
- Supports production (`dns.iroh.link`) and staging (`staging-dns.iroh.link`) origins
- Includes benchmarking support

### Configuration Files
- `config.dev.toml` — Development configuration
- `config.prod.toml` — Production configuration

---

## Internal Modules in `iroh` Crate

### `socket` Module
The connectivity layer — manages the `Socket` struct that orchestrates:
- Multiple transport paths
- Network change detection
- Address discovery and publication
- Remote state actors (per-peer state machines)

**Key sub-modules**:

| Sub-module | Description |
|-----------|-------------|
| `transports/` | Transport implementations (IP, relay, custom) |
| `transports/ip.rs` | IPv4/IPv6 UDP transport |
| `transports/relay.rs` | Relay WebSocket transport |
| `transports/relay/actor.rs` | Relay connection management actor |
| `transports/custom.rs` | Unstable custom transport API |
| `remote_map.rs` | Per-peer `RemoteStateActor` management |
| `remote_map/remote_state.rs` | State machine for connecting to a peer |
| `mapped_addrs.rs` | Address mapping for QUIC layer |
| `concurrent_read_map.rs` | Lock-free concurrent map for remote actors |
| `metrics.rs` | Socket-level metrics |

### `net_report` Module
Network condition reporter:
- Discovers external IP addresses (QAD)
- Measures relay latencies
- Detects NAT types
- Detects captive portals
- Selects preferred relay

### `portmapper` Module
UPnP/PCP/NAT-PMP port mapping:
- Gateway discovery
- Port mapping procurement
- External address monitoring

### `address_lookup` Module
Pluggable address discovery:

| Sub-module | Description |
|-----------|-------------|
| `dns.rs` | `DnsAddressLookup` — resolves via DNS TXT records |
| `pkarr.rs` | `PkarrPublisher` — publishes via HTTP PUT to pkarr relay; `PkarrResolver` — resolves from pkarr relay |
| `memory.rs` | `MemoryLookup` — in-memory lookup for testing |

### `runtime` Module
Tokio-based async runtime wrapper for `noq`:
- Task spawning with cancellation support
- Timer management
- Graceful and abrupt shutdown
- WASM browser support (delegates to `wasm-bindgen-futures`)

### `defaults` Module
Default configuration values:
- Production relay servers (4 regions)
- Staging relay servers (2 regions)
- Timeout constants
- Environment variable for forcing staging (`IROH_FORCE_STAGING_RELAYS`)

### `metrics` Module
`EndpointMetrics` collection:
- Socket metrics (datagrams sent/received, data by transport type)
- Net report metrics (reports generated, full vs incremental)
- Port mapper metrics

261
docs/research/references/iroh/iroh/05-data-flow-internals.md
Normal file
@@ -0,0 +1,261 @@
# Iroh: Data Flow & Internal Architecture

## Data Flow: Connecting to a Remote Endpoint

```
Endpoint::connect(endpoint_addr, alpn)
    │
    ▼
resolve_remote(endpoint_addr)
    │
    ├─ If addr has direct IPs or relay URL → use those
    │
    └─ If addr is just EndpointId → query AddressLookupServices
         │
         ├─ PkarrPublisher/PkarrResolver (HTTP)
         ├─ DnsAddressLookup (DNS TXT)
         ├─ MemoryLookup (in-memory)
         └─ ...custom implementations
    │
    ▼
Map EndpointId → MappedAddr for QUIC layer
    │
    ▼
noq::Endpoint::connect(client_config, dest_addr, server_name)
    │
    ├─ TLS handshake with Raw Public Key authentication
    │    server_name = "<z32-encoded-endpoint-id>.iroh.invalid"
    │
    └─ QUIC connection established
         │
         ▼
    Connecting → Connection
         │
         ├─ Connection stays on relay path initially
         │
         └─ RemoteStateActor discovers direct paths
              │
              ├─ QAD-discovered addresses
              ├─ Addresses from Address Lookup
              ├─ Port mapper external addresses
              │
              └─ Path migration: relay → direct (if possible)
```

## Data Flow: Accepting Connections

```
Endpoint::accept() → Accept<'_>
    │
    ▼ (incoming QUIC packet arrives on any transport)
    │
noq::Endpoint::accept()
    │
    ▼
Incoming
    │
    ├─ incoming.remote_addr() → IncomingAddr (Ip/Relay/Custom)
    ├─ incoming.remote_addr_validated() → bool
    ├─ incoming.accept() → Accepting
    ├─ incoming.refuse() → reject
    ├─ incoming.retry() → QUIC retry (address validation)
    └─ incoming.ignore() → drop silently
    │
Accepting
    │
    ├─ accepting.alpn().await → alpn bytes
    ├─ accepting.into_0rtt() → (OutgoingZeroRtt, Connection) [optional]
    └─ accepting.await → Connection
```

## Data Flow: Router Accept Loop

```
Router::spawn()
    │
    ├─ endpoint.set_alpns(registered_alpns)
    │
    └─ Loop:
         │
         ├─ endpoint.accept().await → Incoming
         │    │
         │    ├─ Apply incoming_filter (optional)
         │    │    ├─ Accept → continue
         │    │    ├─ Retry → incoming.retry()
         │    │    ├─ Reject → incoming.refuse()
         │    │    └─ Ignore → incoming.ignore()
         │    │
         │    ├─ incoming.accept() → Accepting
         │    ├─ accepting.alpn().await → determine ALPN
         │    │
         │    └─ protocols.get(alpn) → handler
         │         │
         │         ├─ handler.on_accepting(accepting).await
         │         └─ handler.accept(connection).await
         │
         └─ On shutdown:
              ├─ protocols.shutdown().await
              ├─ handler_cancel_token.cancel()
              └─ endpoint.close().await
```

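The filter-then-dispatch step of the loop above can be sketched without any async machinery. This is a simplified, synchronous illustration with hypothetical types (`FilterOutcome`, `Handler`, `Echo`, and this `Router` are not the iroh definitions); it only shows the shape of "apply filter, look up handler by ALPN, hand over the connection".

```rust
use std::collections::HashMap;

/// Outcome of the optional incoming filter, mirroring the four branches in the diagram.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FilterOutcome {
    Accept,
    Retry,
    Reject,
    Ignore,
}

/// Hypothetical synchronous stand-in for a protocol handler.
trait Handler {
    fn accept(&self, remote: &str) -> String;
}

struct Echo;

impl Handler for Echo {
    fn accept(&self, remote: &str) -> String {
        format!("echo accepted {remote}")
    }
}

/// Hypothetical router: maps ALPN byte strings to handlers.
#[derive(Default)]
struct Router {
    protocols: HashMap<Vec<u8>, Box<dyn Handler>>,
}

impl Router {
    fn register(&mut self, alpn: &[u8], handler: Box<dyn Handler>) {
        self.protocols.insert(alpn.to_vec(), handler);
    }

    /// One loop iteration: apply the filter result, then look up the handler by ALPN.
    fn dispatch(&self, filter: FilterOutcome, alpn: &[u8], remote: &str) -> Result<String, String> {
        match filter {
            FilterOutcome::Accept => {}
            other => return Err(format!("connection not accepted: {other:?}")),
        }
        match self.protocols.get(alpn) {
            Some(handler) => Ok(handler.accept(remote)),
            None => Err("no handler registered for ALPN".to_string()),
        }
    }
}

fn main() {
    let mut router = Router::default();
    router.register(b"example/echo/0", Box::new(Echo));
    let outcomes = [
        FilterOutcome::Accept,
        FilterOutcome::Retry,
        FilterOutcome::Reject,
        FilterOutcome::Ignore,
    ];
    for outcome in outcomes {
        println!("{:?}", router.dispatch(outcome, b"example/echo/0", "peer-a"));
    }
}
```

Keying the map by raw ALPN bytes means lookup is an exact byte comparison, so two protocol versions with different ALPN strings never share a handler by accident.
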
## Actor Model: Per-Remote State

Each remote peer gets a `RemoteStateActor` that manages the connection state:

```
┌───────────────────────────────────────────────┐
│               RemoteStateActor                │
│                                               │
│  ┌─────────────┐    ┌─────────────────┐       │
│  │ Address     │    │ Connection      │       │
│  │ Lookup      │    │ Tracker         │       │
│  │ Resolution  │    │                 │       │
│  └──────┬──────┘    └────────┬────────┘       │
│         │                    │                │
│         ▼                    ▼                │
│  ┌──────────────────────────────────┐         │
│  │          Path Selection          │         │
│  │   ┌────────┐    ┌────────┐       │         │
│  │   │  IPv4  │    │  IPv6  │       │         │
│  │   │primary │    │primary │       │         │
│  │   └────────┘    └────────┘       │         │
│  │   ┌────────┐    ┌────────┐       │         │
│  │   │ Relay  │    │ Custom │       │         │
│  │   │ backup │    │primary │       │         │
│  │   └────────┘    └────────┘       │         │
│  └──────────────────────────────────┘         │
│                                               │
│  ┌──────────────────────────────────┐         │
│  │         Mapped Addresses         │         │
│  │ EndpointId → MappedIPv6Addr      │         │
│  │ (RelayUrl, EndpointId) → Addr    │         │
│  │ CustomAddr → MappedIPv6Addr      │         │
│  └──────────────────────────────────┘         │
│                                               │
│  Messages:                                    │
│  ├─ ResolveRemote(EndpointAddr, reply)        │
│  ├─ AddConnection(EndpointId, WeakConn, reply)│
│  └─ RemoteInfo(reply)                         │
└───────────────────────────────────────────────┘
```

## Data Flow: Socket Actor

The `Actor` in `Socket` runs as a background task handling network changes:

```
┌────────────────────────────────────────────────────────────┐
│                        Socket Actor                        │
│                                                            │
│  ┌──────────────────────┐    ┌────────────────────┐        │
│  │ Network Monitor      │    │ Direct Addr        │        │
│  │ (netwatch)           │    │ Update State       │        │
│  │                      │    │                    │        │
│  │ Detects:             │    │ Manages:           │        │
│  │ - Interface up/down  │    │ - NetReport runs   │        │
│  │ - Address changes    │    │ - Port mapper      │        │
│  │ - Route changes      │    │ - Direct addrs     │        │
│  └──────────┬───────────┘    └─────────┬──────────┘        │
│             │                          │                   │
│             ▼                          ▼                   │
│  ┌────────────────────────────────────────────────┐        │
│  │                    Triggers                    │        │
│  │ - NetworkChange (major/minor)                  │        │
│  │ - PeriodicReStun (every 30s-5min)              │        │
│  │ - PortmapUpdated                               │        │
│  │ - RelayMapChange                               │        │
│  │ - DirectAddrRefresh                            │        │
│  │ - ResolveRemote (from connect)                 │        │
│  │ - AddConnection (from new QUIC conn)           │        │
│  └────────────────────────────────────────────────┘        │
│                                                            │
│  On address change:                                        │
│  ┌────────────────────────────────────────────────┐        │
│  │ 1. Run net_report to discover external addrs   │        │
│  │ 2. Update direct_addrs watchable               │        │
│  │ 3. Publish new addresses to AddressLookup      │        │
│  │ 4. Notify noq of network changes               │        │
│  └────────────────────────────────────────────────┘        │
└────────────────────────────────────────────────────────────┘
```

## Shutdown Sequence

```
Endpoint::close()
    │
    ├─ Cancel at_close_start token
    │    (stops net_reports, address lookups)
    │
    ├─ Clear address_lookup services
    │
    ├─ noq_endpoint.close(0, b"")
    │    (refuses new connections, starts close for existing)
    │
    ├─ noq_endpoint.wait_idle().await
    │    (waits for close frames to be acknowledged)
    │
    ├─ Cancel at_endpoint_closed token
    │
    ├─ Wait for actor task (100ms timeout, then abort)
    │
    └─ runtime.shutdown().await
         (waits for all spawned tasks)
```

## WASM/Browser Differences

When compiled to `wasm32-unknown-unknown`:

| Feature | Native | WASM/Browser |
|---------|--------|-------------|
| IP transports | Yes (IPv4 + IPv6) | No (no socket access) |
| DNS resolution | `DnsAddressLookup` (system DNS) | `PkarrResolver` (HTTP) |
| Network monitoring | `netwatch` (interface changes) | Not available |
| Port mapping | UPnP/PCP/NAT-PMP | Not available |
| Net report | Full (QAD, HTTPS probes) | Limited |
| Runtime | Tokio | `wasm-bindgen-futures` |
| Timer | Tokio timer | `web::Timer` wrapping `sleep_until` |

## Thread Safety & Concurrency

- `Endpoint` is `Clone` (wraps `Arc<EndpointInner>`)
- `Socket` is `Arc<Socket>` — shared across all connections
- `RemoteMap` uses `ConcurrentReadMap` — lock-free reads for hot path
- `AddressLookupServices` uses `RwLock` — infrequent writes, frequent reads
- `DirectAddrs` uses `Watchable` — publishes changes to watchers
- `HomeRelayWatch` uses `n0_watcher::Direct` — efficient change notification

## Error Handling Patterns

Iroh uses the `n0_error::stack_error` macro for rich error chains:

```rust
#[stack_error(derive, add_meta, from_sources)]
pub enum ConnectError {
    #[error(transparent)]
    Connect { source: ConnectWithOptsError },
    #[error(transparent)]
    Connecting { source: ConnectingError },
    #[error(transparent)]
    Connection { source: ConnectionError },
}

// Usage:
// ConnectError::Connect { source: ConnectWithOptsError::SelfConnect }
// ConnectError::Connecting { source: ConnectingError::AuthenticationError { .. } }
```

## Key Constants & Timeouts

| Constant | Value | Purpose |
|----------|-------|---------|
| `HEARTBEAT_INTERVAL` | 5s | Keepalive PING interval |
| `PATH_MAX_IDLE_TIMEOUT` | 15s | Max idle before closing direct path |
| `RELAY_PATH_MAX_IDLE_TIMEOUT` | 30s | Max idle before closing relay path |
| `MAX_MULTIPATH_PATHS` | 12 | Max concurrent paths per connection |
| `DEFAULT_MAX_TLS_TICKETS` | 256 (8×32) | TLS session ticket cache size |
| `NET_REPORT_TIMEOUT` | 10s | Max time for net report |
| `FULL_REPORT_INTERVAL` | 5min | Time between full net reports |
| `DEFAULT_RELAY_QUIC_PORT` | 3478 | QAD port on relay servers |

@@ -0,0 +1,108 @@
# irpc: Overview and Architecture

## What is irpc?

`irpc` is a **streaming RPC system** built for [iroh](https://docs.rs/iroh) and [noq](https://docs.rs/noq) (QUIC-based transports). It provides a framework for defining RPC protocols in Rust that work identically whether the communication is **in-process** (via tokio channels) or **cross-process/cross-network** (via QUIC streams).

**Key design goals:**

1. **Zero-overhead local use** — When used in-process, irpc should be as lightweight as raw tokio channels, replacing the common pattern of a giant `enum` over an `mpsc` channel with typed backchannels.
2. **Transparent local/remote abstraction** — The same protocol definition and client API works for both in-process and remote communication.
3. **Streaming-first** — Full support for unary RPC, server streaming, client streaming, and bidirectional streaming interaction patterns.
4. **QUIC-native** — Does not abstract over stream types; directly uses noq/iroh QUIC streams, enabling per-request stream tuning (priorities, etc.).

**Non-goals:**

- Cross-language interop (Rust-to-Rust only)
- Versioning (users must handle this themselves)
- Making remote calls look like local async function calls
- Runtime agnosticism (tokio only)

## Crate Structure

```
irpc/
├── src/lib.rs      # Core library: traits, channels, Client, RPC module
├── src/util.rs     # Varint utilities, noq endpoint setup helpers
├── src/tests.rs    # Channel filter/map tests
├── irpc-derive/    # Procedural macro crate (rpc_requests)
├── irpc-iroh/      # Iroh transport integration
├── examples/       # Working examples (storage, compute, derive, local)
└── tests/          # Integration tests (channels, derive)
```

### Features

| Feature | Default | Purpose |
|---|---|---|
| `rpc` | ✅ | Enables remote RPC (noq transport, postcard serialization) |
| `derive` | ✅ | Enables the `#[rpc_requests]` macro |
| `spans` | ✅ | Preserves tracing spans across message passing |
| `stream` | ✅ | Enables `into_stream()` on mpsc receivers |
| `noq_endpoint_setup` | ✅ | Utilities to create noq endpoints (testing, localhost) |
| `varint-util` | ❌ | Varint read/write utilities without full RPC |

## High-Level Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────┐
|
||||||
|
│ Application │
|
||||||
|
│ │
|
||||||
|
│ ┌──────────┐ ┌───────────┐ ┌───────────┐ │
|
||||||
|
│ │ Client │─────│ Protocol │─────│ Actor/ │ │
|
||||||
|
│ │<S> │ │ Enum (S) │ │ Handler │ │
|
||||||
|
│ └────┬─────┘ └───────────┘ └─────┬─────┘ │
|
||||||
|
│ │ │ │
|
||||||
|
│ ┌────▼─────────────────────────────────────▼─────┐ │
|
||||||
|
│ │ WithChannels<I, S> │ │
|
||||||
|
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌─────┐ │ │
|
||||||
|
│ │ │ inner │ │ tx │ │ rx │ │span │ │ │
|
||||||
|
│ │ │ (I) │ │(Sender)│ │(Recv) │ │ │ │ │
|
||||||
|
│ │ └────────┘ └────────┘ └────────┘ └─────┘ │ │
|
||||||
|
│ └────────────────────────────────────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ ┌────────────────────┐ ┌─────────────────────────┐ │
|
||||||
|
│ │ Local Path │ │ Remote Path (rpc feat) │ │
|
||||||
|
│ │ tokio::mpsc │ │ noq QUIC streams │ │
|
||||||
|
│ │ tokio::oneshot │ │ postcard serialization │ │
|
||||||
|
│ └────────────────────┘ └─────────────────────────┘ │
|
||||||
|
└─────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Core Flow
|
||||||
|
|
||||||
|
1. **Define a protocol** — An enum where each variant represents an RPC method, annotated with `#[rpc(tx=..., rx=...)]`.
|
||||||
|
2. **The `rpc_requests` macro** generates:
|
||||||
|
- `Channels<S>` impl for each request type
|
||||||
|
- A message enum wrapping each request in `WithChannels<I, S>`
|
||||||
|
- `Service` and `RemoteService` trait implementations
|
||||||
|
- `From` conversions between request types, protocol enum, and message enum
|
||||||
|
3. **Client sends messages** — `Client<S>` either sends over a local `mpsc` channel or serializes and sends over a QUIC stream.
|
||||||
|
4. **Actor/handler processes messages** — Matches on the message enum, extracts `WithChannels { inner, tx, rx, .. }`, and uses `tx`/`rx` to communicate back.
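
The four steps above can be sketched with only the standard library. This is not irpc's API: `Get`, `Set`, `StorageMessage`, `spawn_actor`, `get`, and `set` are hypothetical stand-ins, and `std::sync::mpsc` plays the role of both the request channel and the per-request reply channel (irpc would use `WithChannels` with a `oneshot::Sender`).

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Step 1: hypothetical request types, one per protocol variant.
struct Get { key: String }
struct Set { key: String, value: String }

// Step 2: hand-written analogue of the generated message enum.
// Each variant pairs a request with the channel used to answer it.
enum StorageMessage {
    Get { inner: Get, tx: Sender<Option<String>> },
    Set { inner: Set, tx: Sender<()> },
}

// Step 4: the actor owns the state and matches on incoming messages.
fn spawn_actor() -> Sender<StorageMessage> {
    let (msg_tx, msg_rx) = channel::<StorageMessage>();
    thread::spawn(move || {
        let mut store: HashMap<String, String> = HashMap::new();
        for msg in msg_rx {
            match msg {
                StorageMessage::Get { inner, tx } => {
                    // Ignore send errors: the caller may have stopped waiting.
                    let _ = tx.send(store.get(&inner.key).cloned());
                }
                StorageMessage::Set { inner, tx } => {
                    store.insert(inner.key, inner.value);
                    let _ = tx.send(());
                }
            }
        }
    });
    msg_tx
}

// Step 3: a client call creates the reply channel, sends, then waits.
fn get(client: &Sender<StorageMessage>, key: &str) -> Option<String> {
    let (tx, rx) = channel();
    let inner = Get { key: key.to_string() };
    client.send(StorageMessage::Get { inner, tx }).ok()?;
    rx.recv().ok()?
}

fn set(client: &Sender<StorageMessage>, key: &str, value: &str) {
    let (tx, rx) = channel();
    let inner = Set { key: key.to_string(), value: value.to_string() };
    if client.send(StorageMessage::Set { inner, tx }).is_ok() {
        let _ = rx.recv();
    }
}
```

The point of the pattern is that the reply channel travels inside the message, so the actor needs no knowledge of who is calling it.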
|
||||||
|
|
||||||
|
## Dependency Graph
|
||||||
|
|
||||||
|
```
|
||||||
|
irpc (core)
|
||||||
|
├── serde (always)
|
||||||
|
├── tokio (sync, macros)
|
||||||
|
├── tokio-util
|
||||||
|
├── n0-error
|
||||||
|
├── n0-future
|
||||||
|
├── postcard (rpc feature)
|
||||||
|
├── noq (rpc feature)
|
||||||
|
├── smallvec (rpc feature)
|
||||||
|
├── tracing (spans feature)
|
||||||
|
└── irpc-derive (derive feature)
|
||||||
|
|
||||||
|
irpc-iroh
|
||||||
|
├── irpc
|
||||||
|
├── iroh
|
||||||
|
├── iroh-base
|
||||||
|
├── postcard
|
||||||
|
└── n0-error, n0-future, tokio, tracing, serde
|
||||||
|
```
|
||||||
|
|
||||||
|
## License
|
||||||
|
|
||||||
|
Dual-licensed: Apache-2.0 OR MIT
|
||||||
239
docs/research/references/iroh/irpc/02-types-and-traits.md
Normal file
@@ -0,0 +1,239 @@
|
|||||||
|
# irpc: Key Types and Traits
|
||||||
|
|
||||||
|
## Core Traits
|
||||||
|
|
||||||
|
### `RpcMessage`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait RpcMessage: Debug + Serialize + DeserializeOwned + Send + Sync + Unpin + 'static {}
|
||||||
|
```
|
||||||
|
|
||||||
|
A blanket trait implemented for all types that satisfy the bounds. Every message sent through irpc (both local and remote) must implement this. The `Serialize + DeserializeOwned` requirement exists even without the `rpc` feature because the same protocol definition should work in both modes.
|
||||||
|
|
||||||
|
### `Service`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait Service: Serialize + DeserializeOwned + Send + Sync + Debug + 'static {
|
||||||
|
type Message: Send + Unpin + 'static;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Implemented on the **protocol enum** (e.g., `StorageProtocol`). The `Message` associated type is the **message enum** — an enum with identical variant names but whose single field is `WithChannels<InnerType, Self>`.
|
||||||
|
|
||||||
|
The `Service` trait acts as a **scope** for channel type definitions, allowing the same inner request type to be used with multiple services.
|
||||||
|
|
||||||
|
### `Channels<S>`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait Channels<S: Service>: Send + 'static {
|
||||||
|
type Tx: Sender;
|
||||||
|
type Rx: Receiver;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Implemented on each **request type** (e.g., `Get`, `Set`). Specifies what kind of channels accompany that request when sent through service `S`. The `Tx` type is the response channel (server → client); the `Rx` type is the update channel (client → server).
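
The "service as scope" idea can be illustrated with a reduced version of the two traits. This is a sketch, assuming simplified trait definitions without irpc's bounds or sealing; `KvService`, `AdminService`, and `tx_type` are hypothetical names, and `std::sync::mpsc::Sender` stands in for the irpc channel types.

```rust
use std::any::TypeId;
use std::sync::mpsc::Sender;

// Simplified stand-ins for the irpc traits (no bounds, no sealing).
trait Service {}
trait Channels<S: Service> {
    type Tx: 'static;
    type Rx: 'static;
}

// Two hypothetical services sharing one request type.
struct KvService;
struct AdminService;
impl Service for KvService {}
impl Service for AdminService {}

struct Get;

// Under KvService, `Get` is answered with a String ...
impl Channels<KvService> for Get {
    type Tx = Sender<String>;
    type Rx = ();
}

// ... under AdminService, the same request type is answered with a count.
impl Channels<AdminService> for Get {
    type Tx = Sender<u64>;
    type Rx = ();
}

// TypeId of the response channel chosen for request `I` under service `S`.
fn tx_type<I: Channels<S>, S: Service>() -> TypeId {
    TypeId::of::<<I as Channels<S>>::Tx>()
}
```

Because the service is a type parameter of the trait rather than an associated type, one request type can carry a different channel pairing per service without conflicting impls.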
|
||||||
|
|
||||||
|
### `Sender` and `Receiver`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait Sender: Debug + Sealed {}
|
||||||
|
pub trait Receiver: Debug + Sealed {}
|
||||||
|
```
|
||||||
|
|
||||||
|
Sealed marker traits. Only the types in `irpc::channel` implement these: `oneshot::Sender`, `oneshot::Receiver`, `mpsc::Sender`, `mpsc::Receiver`, `NoSender`, `NoReceiver`.
|
||||||
|
|
||||||
|
### `RemoteService` (rpc feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait RemoteService: Service + Sized {
|
||||||
|
fn with_remote_channels(self, rx: noq::RecvStream, tx: noq::SendStream) -> Self::Message;
|
||||||
|
|
||||||
|
fn remote_handler(local_sender: LocalSender<Self>) -> Handler<Self> {
|
||||||
|
// Default: convert deserialized protocol enum + streams → Message, send to local sender
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Implemented on the protocol enum. Maps a deserialized protocol variant + a pair of QUIC streams into a `WithChannels` message, which is then forwarded to the local actor.
|
||||||
|
|
||||||
|
### `RemoteConnection` (rpc feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait RemoteConnection: Send + Sync + Debug + 'static {
|
||||||
|
fn clone_boxed(&self) -> Box<dyn RemoteConnection>;
|
||||||
|
fn open_bi(&self) -> BoxFuture<Result<(noq::SendStream, noq::RecvStream), RequestError>>;
|
||||||
|
fn zero_rtt_accepted(&self) -> BoxFuture<bool>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Abstraction over how to open a bidirectional QUIC stream. Implemented for:
|
||||||
|
- `noq::Connection` — direct noq connection
|
||||||
|
- `NoqLazyRemoteConnection` — lazy connection that caches the underlying QUIC connection
|
||||||
|
- `IrohRemoteConnection` — iroh connection (in `irpc-iroh`)
|
||||||
|
- `IrohLazyRemoteConnection` — lazy iroh connection (in `irpc-iroh`)
|
||||||
|
- `IrohZrttRemoteConnection` — 0-RTT iroh connection (in `irpc-iroh`)
|
||||||
|
|
||||||
|
## Key Structs
|
||||||
|
|
||||||
|
### `WithChannels<I, S>`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct WithChannels<I: Channels<S>, S: Service> {
|
||||||
|
pub inner: I,
|
||||||
|
pub tx: <I as Channels<S>>::Tx,
|
||||||
|
pub rx: <I as Channels<S>>::Rx,
|
||||||
|
#[cfg(feature = "spans")]
|
||||||
|
pub span: tracing::Span,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The central message wrapper. Wraps a request type `I` with its typed channels for service `S`. Implements `Deref` to `I` for convenient field access.
|
||||||
|
|
||||||
|
**Construction** via tuple conversions:
|
||||||
|
- `(inner, tx, rx)` → full channels
|
||||||
|
- `(inner, tx)` → when `Rx = NoReceiver` (most common for RPC/server-streaming)
|
||||||
|
- `(inner,)` → when `Tx = NoSender, Rx = NoReceiver` (notify)
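
The three tuple conversions can be pictured as follows. This is a simplified sketch, not the irpc definition: the real `WithChannels` is generic over `I: Channels<S>` and `S: Service`, whereas the version here takes the channel types directly as parameters so that it compiles on its own.

```rust
// Local stand-ins for the placeholder channel types.
#[derive(Debug, PartialEq)]
struct NoSender;
#[derive(Debug, PartialEq)]
struct NoReceiver;

// Reduced wrapper: request plus its two channels.
#[derive(Debug, PartialEq)]
struct WithChannels<I, Tx, Rx> {
    inner: I,
    tx: Tx,
    rx: Rx,
}

// (inner, tx, rx): both channels supplied.
impl<I, Tx, Rx> From<(I, Tx, Rx)> for WithChannels<I, Tx, Rx> {
    fn from((inner, tx, rx): (I, Tx, Rx)) -> Self {
        WithChannels { inner, tx, rx }
    }
}

// (inner, tx): only available when the receive side is `NoReceiver`.
impl<I, Tx> From<(I, Tx)> for WithChannels<I, Tx, NoReceiver> {
    fn from((inner, tx): (I, Tx)) -> Self {
        WithChannels { inner, tx, rx: NoReceiver }
    }
}

// (inner,): notify-style message with no channels at all.
impl<I> From<(I,)> for WithChannels<I, NoSender, NoReceiver> {
    fn from((inner,): (I,)) -> Self {
        WithChannels { inner, tx: NoSender, rx: NoReceiver }
    }
}
```

Since the three tuple shapes are distinct types, the impls never overlap, and the shorter forms are only usable when the target type really has placeholder channels.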
|
||||||
|
|
||||||
|
### `Client<S>`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug)]
|
||||||
|
pub struct Client<S: Service>(ClientInner<S::Message>, PhantomData<S>);
|
||||||
|
```
|
||||||
|
|
||||||
|
The primary client type. Generic over a service `S`. Can be either local or remote.
|
||||||
|
|
||||||
|
**Construction:**
|
||||||
|
- `Client::local(mpsc_sender)` — from a tokio mpsc sender
|
||||||
|
- `Client::noq(endpoint, addr)` — from a noq endpoint + address (rpc feature)
|
||||||
|
- `Client::boxed(remote_connection)` — from any `RemoteConnection` impl
|
||||||
|
|
||||||
|
**Key methods** (all handle both local and remote transparently):
|
||||||
|
|
||||||
|
| Method | Pattern | Tx Type | Rx Type |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `rpc()` | Unary RPC | `oneshot::Sender<Res>` | `NoReceiver` |
|
||||||
|
| `server_streaming()` | Server streaming | `mpsc::Sender<Res>` | `NoReceiver` |
|
||||||
|
| `client_streaming()` | Client streaming | `oneshot::Sender<Res>` | `mpsc::Receiver<Update>` |
|
||||||
|
| `bidi_streaming()` | Bidirectional | `mpsc::Sender<Res>` | `mpsc::Receiver<Update>` |
|
||||||
|
| `notify()` | Fire-and-forget | `NoSender` | `NoReceiver` |
|
||||||
|
| `rpc_0rtt()` | 0-RTT unary | `oneshot::Sender<Res>` | `NoReceiver` |
|
||||||
|
| `server_streaming_0rtt()` | 0-RTT server streaming | `mpsc::Sender<Res>` | `NoReceiver` |
|
||||||
|
| `notify_0rtt()` | 0-RTT fire-and-forget | `NoSender` | `NoReceiver` |
|
||||||
|
|
||||||
|
Each method creates the appropriate channel pair, wraps the message into `WithChannels`, and sends it.
|
||||||
|
|
||||||
|
### `LocalSender<S>`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[repr(transparent)]
|
||||||
|
pub struct LocalSender<S: Service>(crate::channel::mpsc::Sender<S::Message>);
|
||||||
|
```
|
||||||
|
|
||||||
|
A thin wrapper around `mpsc::Sender<S::Message>` for sending messages to a local actor. Provides:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl<S: Service> LocalSender<S> {
|
||||||
|
pub fn send<T>(&self, value: impl Into<WithChannels<T, S>>) -> impl Future<Output = Result<(), SendError>>
|
||||||
|
where
|
||||||
|
T: Channels<S>,
|
||||||
|
S::Message: From<WithChannels<T, S>>;
|
||||||
|
|
||||||
|
pub fn send_raw(&self, value: S::Message) -> impl Future<Output = Result<(), SendError>>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `Request<L, R>`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Request<L, R> {
|
||||||
|
Local(L),
|
||||||
|
Remote(R),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
A generic enum distinguishing local vs remote requests. `Client::request()` returns `Request<LocalSender<S>, RemoteSender<S>>`.
|
||||||
|
|
||||||
|
### `RemoteSender<S>` (rpc feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct RemoteSender<S>(noq::SendStream, noq::RecvStream, PhantomData<S>);
|
||||||
|
```
|
||||||
|
|
||||||
|
Holds a QUIC stream pair after opening a bidirectional stream. The `write()` method serializes the protocol message with postcard + varint length prefix and sends it over the send stream.
|
||||||
|
|
||||||
|
### `Handler<R>` (rpc feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub type Handler<R> = Arc<
|
||||||
|
dyn Fn(R, noq::RecvStream, noq::SendStream) -> BoxFuture<Result<(), SendError>>
|
||||||
|
+ Send + Sync + 'static,
|
||||||
|
>;
|
||||||
|
```
|
||||||
|
|
||||||
|
A shared handler function that processes incoming remote requests. Typically created via `Protocol::remote_handler(local_sender)`.
|
||||||
|
|
||||||
|
## Error Types
|
||||||
|
|
||||||
|
### `RequestError`
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum RequestError {
|
||||||
|
Connect { source: noq::ConnectError }, // Connection establishment failed
|
||||||
|
Connection { source: noq::ConnectionError }, // Stream open failed
|
||||||
|
Other { source: AnyError }, // Generic error for non-noq transports
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `SendError` (in `channel` module)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum SendError {
|
||||||
|
ReceiverClosed, // Local: receiver dropped
|
||||||
|
MaxMessageSizeExceeded, // Remote: message > 16 MiB
|
||||||
|
Io { source: io::Error }, // Remote: network/serialization error
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `RecvError` (oneshot and mpsc variants)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// oneshot::RecvError
|
||||||
|
pub enum RecvError {
|
||||||
|
SenderClosed, // Local: sender dropped
|
||||||
|
MaxMessageSizeExceeded, // Remote: message > 16 MiB
|
||||||
|
Io { source: io::Error }, // Remote: network/deserialization error
|
||||||
|
}
|
||||||
|
|
||||||
|
// mpsc::RecvError
|
||||||
|
pub enum RecvError {
|
||||||
|
MaxMessageSizeExceeded, // Remote: message > 16 MiB
|
||||||
|
Io { source: io::Error }, // Remote: network/deserialization error
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Note: `mpsc::RecvError` does **not** have `SenderClosed` — mpsc receivers return `Ok(None)` when the sender is dropped.
|
||||||
|
|
||||||
|
### `WriteError` (rpc feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum WriteError {
|
||||||
|
Noq { source: noq::WriteError }, // QUIC stream write error
|
||||||
|
MaxMessageSizeExceeded, // Message > 16 MiB
|
||||||
|
Io { source: io::Error }, // Serialization error
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### `Error` (top-level umbrella)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub enum Error {
|
||||||
|
Request { source: RequestError },
|
||||||
|
Send { source: SendError },
|
||||||
|
MpscRecv { source: mpsc::RecvError },
|
||||||
|
OneshotRecv { source: oneshot::RecvError },
|
||||||
|
Write { source: rpc::WriteError }, // rpc feature only
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
All of these error types convert into `io::Error` through `From` impls (for example `impl From<Error> for io::Error`), which allows using `?` in `io::Result` contexts.
|
||||||
168
docs/research/references/iroh/irpc/03-channel-system.md
Normal file
@@ -0,0 +1,168 @@
|
|||||||
|
# irpc: Channel System
|
||||||
|
|
||||||
|
The channel system is the heart of irpc. It provides channel types that abstract over local (tokio) and remote (QUIC stream) communication, with the same API surface regardless of transport.
|
||||||
|
|
||||||
|
## Channel Kinds
|
||||||
|
|
||||||
|
irpc provides three kinds of channels. Oneshot and mpsc channels each have a local and a remote variant; none channels are placeholders:
|
||||||
|
|
||||||
|
### Oneshot Channels (`channel::oneshot`)
|
||||||
|
|
||||||
|
Single-value, single-use channels for RPC responses.
|
||||||
|
|
||||||
|
| Type | Local Backend | Remote Backend |
|
||||||
|
|---|---|---|
|
||||||
|
| `oneshot::Sender<T>` | `tokio::sync::oneshot::Sender` | `BoxedSender<T>` (FnOnce over QUIC write) |
|
||||||
|
| `oneshot::Receiver<T>` | `FusedOneshotReceiver<T>` | `BoxedReceiver<T>` (boxed future over QUIC read) |
|
||||||
|
|
||||||
|
**Creation:** `oneshot::channel::<T>()` returns `(Sender<T>, Receiver<T>)`
|
||||||
|
|
||||||
|
**Sender behavior:**
|
||||||
|
- Local: `send(value)` completes without any I/O and fails only if the receiver was dropped
|
||||||
|
- Remote: `send(value)` is async — serializes with postcard, length-prefixes with varint, writes to QUIC stream
|
||||||
|
|
||||||
|
**Receiver behavior:**
|
||||||
|
- Implements `Future<Output = Result<T, RecvError>>`
|
||||||
|
- Local: resolves to the value or `SenderClosed` error
|
||||||
|
- Remote: reads varint length prefix, reads that many bytes, deserializes with postcard
|
||||||
|
|
||||||
|
**Filtering/Mapping** (on `Sender<T>` where `T: Send + Sync + 'static`):
|
||||||
|
```rust
|
||||||
|
sender.with_filter(|v| v > 0) // Drop messages failing predicate
|
||||||
|
sender.with_map(|v: U| v.into()) // Transform before sending
|
||||||
|
sender.with_filter_map(|v| ...) // Combined filter + map
|
||||||
|
```
|
||||||
|
|
||||||
|
### MPSC Channels (`channel::mpsc`)
|
||||||
|
|
||||||
|
Multi-producer, single-consumer streaming channels for server-streaming, client-streaming, and bidirectional patterns.
|
||||||
|
|
||||||
|
| Type | Local Backend | Remote Backend |
|
||||||
|
|---|---|---|
|
||||||
|
| `mpsc::Sender<T>` | `tokio::sync::mpsc::Sender` | `Arc<DynSender<T>>` (NoqSender) |
|
||||||
|
| `mpsc::Receiver<T>` | `tokio::sync::mpsc::Receiver` | `Box<dyn DynReceiver<T>>` (NoqReceiver) |
|
||||||
|
|
||||||
|
**Creation:** `mpsc::channel::<T>(buffer)` returns `(Sender<T>, Receiver<T>)`
|
||||||
|
|
||||||
|
**Sender behavior:**
|
||||||
|
- `send(value).await` — sends, yielding if full (remote: serializes + writes to stream)
|
||||||
|
- `try_send(value).await` — non-blocking attempt; returns `Ok(false)` if the send would block
|
||||||
|
- `closed().await` — waits until all receivers are dropped
|
||||||
|
- `is_rpc()` — returns `true` for remote senders
|
||||||
|
|
||||||
|
**Receiver behavior:**
|
||||||
|
- `recv().await` → `Result<Option<T>, RecvError>` — `None` means sender closed/cleanly finished
|
||||||
|
- `filter(pred)`, `map(fn)`, `filter_map(fn)` — chainable transformations
|
||||||
|
- `into_stream()` (with `stream` feature) — converts to `Stream<Item = Result<T, RecvError>>`
|
||||||
|
|
||||||
|
**Cloning:** `mpsc::Sender<T>` implements `Clone`. Local senders clone the underlying tokio sender; remote senders clone the `Arc`.
|
||||||
|
|
||||||
|
### None Channels (`channel::none`)
|
||||||
|
|
||||||
|
Placeholder channels for when no communication is needed.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct NoSender; // Implements Sender, does nothing
|
||||||
|
pub struct NoReceiver; // Implements Receiver, does nothing
|
||||||
|
```
|
||||||
|
|
||||||
|
Used as defaults when `#[rpc(tx=...)]` or `#[rpc(rx=...)]` are omitted.
|
||||||
|
|
||||||
|
## Remote Channel Internals
|
||||||
|
|
||||||
|
### NoqSender<T>
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct NoqSender<T>(tokio::sync::Mutex<NoqSenderState<T>>);
|
||||||
|
|
||||||
|
enum NoqSenderState<T> {
|
||||||
|
Open(NoqSenderInner<T>),
|
||||||
|
Closed,
|
||||||
|
}
|
||||||
|
|
||||||
|
struct NoqSenderInner<T> {
|
||||||
|
send: noq::SendStream,
|
||||||
|
buffer: SmallVec<[u8; 128]>, // Stack-allocated buffer for small messages
|
||||||
|
_marker: PhantomData<T>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Key behaviors:
|
||||||
|
- **Mutex-protected state**: The inner state is `Mutex`-protected because `DynSender::send()` takes `&self`. When a send fails, the state transitions to `Closed` and all subsequent sends return `BrokenPipe`.
|
||||||
|
- **Buffer reuse**: Uses `SmallVec<[u8; 128]>` to avoid heap allocation for messages that serialize to ≤128 bytes.
|
||||||
|
- **Serialization**: Each message is postcard-serialized with a varint length prefix. If the serialized size exceeds `MAX_MESSAGE_SIZE` (16 MiB), the stream is reset with error code `ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED` (1).
|
||||||
|
- **Serialization errors**: If postcard serialization fails, the stream is reset with `ERROR_CODE_INVALID_POSTCARD` (2).
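
The "close on first failure" behavior can be sketched without QUIC. In this minimal sketch, `PoisoningSender`, `SenderState`, and `Flaky` are hypothetical names, any `std::io::Write` stands in for the `noq::SendStream`, and a `std::sync::Mutex` replaces the tokio mutex because nothing here is async.

```rust
use std::io::{self, ErrorKind, Write};
use std::sync::Mutex;

// Stand-in for the sender state: `W` plays the role of the QUIC send stream.
enum SenderState<W> {
    Open(W),
    Closed,
}

struct PoisoningSender<W>(Mutex<SenderState<W>>);

impl<W: Write> PoisoningSender<W> {
    fn new(stream: W) -> Self {
        PoisoningSender(Mutex::new(SenderState::Open(stream)))
    }

    // Takes `&self`, so the state has to sit behind a lock.
    fn send(&self, payload: &[u8]) -> io::Result<()> {
        let mut state = self.0.lock().unwrap();
        let result = match &mut *state {
            SenderState::Open(stream) => stream.write_all(payload),
            SenderState::Closed => return Err(ErrorKind::BrokenPipe.into()),
        };
        if result.is_err() {
            // The first failure closes the sender for every later caller.
            *state = SenderState::Closed;
        }
        result
    }
}

// Test double: a stream that accepts `budget` bytes in total, then fails.
struct Flaky {
    budget: usize,
}

impl Write for Flaky {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if buf.len() > self.budget {
            return Err(io::Error::new(ErrorKind::Other, "stream reset"));
        }
        self.budget -= buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

The caller that triggers the failure sees the original error; everyone after that sees `BrokenPipe`, which matches the behavior described in the list above.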
|
||||||
|
|
||||||
|
### NoqReceiver<T>
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct NoqReceiver<T> {
|
||||||
|
recv: noq::RecvStream,
|
||||||
|
_marker: PhantomData<T>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Reads a varint length prefix, allocates a buffer of that size, reads the data, and deserializes with postcard. If the length exceeds `MAX_MESSAGE_SIZE`, stops the stream with the appropriate error code.
|
||||||
|
|
||||||
|
### Oneshot Remote Sender
|
||||||
|
|
||||||
|
For `oneshot::Sender<T>` over QUIC, the sender is a `BoxedSender<T>` — a `Box<dyn FnOnce(T) -> BoxFuture<Result<(), SendError>>>`. This captures the `noq::SendStream` and on invocation:
|
||||||
|
1. Computes `postcard::experimental::serialized_size(&value)`
|
||||||
|
2. Checks against `MAX_MESSAGE_SIZE`
|
||||||
|
3. Writes length-prefixed postcard data to the stream
|
||||||
|
|
||||||
|
### Oneshot Remote Receiver
|
||||||
|
|
||||||
|
For `oneshot::Receiver<T>` over QUIC, the receiver is constructed from a `noq::RecvStream`:
|
||||||
|
1. Reads a varint length prefix
|
||||||
|
2. Reads that many bytes
|
||||||
|
3. Deserializes with postcard
|
||||||
|
4. Returns the value
|
||||||
|
|
||||||
|
## Channel Conversion Table
|
||||||
|
|
||||||
|
When a QUIC stream pair `(SendStream, RecvStream)` is received for a request:
|
||||||
|
|
||||||
|
| Channel Kind | `Tx` (SendStream →) | `Rx` (RecvStream →) |
|
||||||
|
|---|---|---|
|
||||||
|
| `oneshot::Sender<T>` | Serialize + write, then finish | N/A |
|
||||||
|
| `mpsc::Sender<T>` | Repeatedly serialize + write | N/A |
|
||||||
|
| `oneshot::Receiver<T>` | N/A | Read single length-prefixed value |
|
||||||
|
| `mpsc::Receiver<T>` | N/A | Repeatedly read length-prefixed values |
|
||||||
|
| `NoSender` | Drop the stream | N/A |
|
||||||
|
| `NoReceiver` | N/A | Drop the stream |
|
||||||
|
|
||||||
|
The `From<noq::RecvStream>` and `From<noq::SendStream>` impls handle these conversions automatically based on the target type.
|
||||||
|
|
||||||
|
## DynSender and DynReceiver Traits
|
||||||
|
|
||||||
|
The `mpsc` module exposes traits for dynamic dispatch:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait DynSender<T>: Debug + Send + Sync + 'static {
|
||||||
|
fn send(&self, value: T) -> Pin<Box<dyn Future<Output = Result<(), SendError>> + Send + '_>>;
|
||||||
|
fn try_send(&self, value: T) -> Pin<Box<dyn Future<Output = Result<bool, SendError>> + Send + '_>>;
|
||||||
|
fn closed(&self) -> Pin<Box<dyn Future<Output = ()> + Send + Sync + '_>>;
|
||||||
|
fn is_rpc(&self) -> bool;
|
||||||
|
}
|
||||||
|
|
||||||
|
pub trait DynReceiver<T>: Debug + Send + Sync + 'static {
|
||||||
|
fn recv(&mut self) -> Pin<Box<dyn Future<Output = Result<Option<T>, RecvError>> + Send + Sync + '_>>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
These enable boxing of remote senders/receivers while keeping the local variants unboxed for zero overhead.
|
||||||
|
|
||||||
|
## FusedOneshotReceiver
|
||||||
|
|
||||||
|
A thin wrapper around `tokio::sync::oneshot::Receiver` that prevents panics when polling an already-completed receiver. It tracks completion state and returns `Poll::Pending` indefinitely after resolution, matching the `FusedFuture` pattern.
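
The fusing idea can be shown with a generic wrapper and no async runtime. This is a sketch under assumptions, not irpc's implementation: `Fused`, `NoopWake`, and `poll_once` are hypothetical names, and `std::future::ready` stands in for the tokio receiver (like it, `Ready` panics if polled again after completion).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Wrapper that remembers completion instead of re-polling the inner future.
struct Fused<F> {
    inner: Option<F>,
}

impl<F> Fused<F> {
    fn new(fut: F) -> Self {
        Fused { inner: Some(fut) }
    }
}

impl<F: Future + Unpin> Future for Fused<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        let out = match this.inner.as_mut() {
            Some(fut) => Pin::new(fut).poll(cx),
            // Already resolved: stay pending instead of panicking.
            None => return Poll::Pending,
        };
        if out.is_ready() {
            // Drop the inner future so it is never polled again.
            this.inner = None;
        }
        out
    }
}

// Minimal waker so the example can poll by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Polling the wrapper a second time is harmless, which is what makes it safe to keep inside `select!`-style loops that may poll a branch again after it has fired.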
|
||||||
|
|
||||||
|
## Cancellation Safety
|
||||||
|
|
||||||
|
For remote `mpsc::Sender`:
|
||||||
|
- If a `send()` future is dropped before completion, the underlying QUIC stream is closed.
|
||||||
|
- All clones of the sender will receive `SendError::Io(BrokenPipe)` on subsequent send attempts.
|
||||||
|
- This is documented behavior: **always poll send futures to completion if you want to reuse the sender**.
|
||||||
|
|
||||||
|
For remote `oneshot::Sender`:
|
||||||
|
- Since it's `FnOnce`, dropping the future before sending simply means the value is never sent. The receiver will get `SenderClosed`.
|
||||||
@@ -0,0 +1,272 @@
|
|||||||
|
# irpc: Protocol and Message Flow
|
||||||
|
|
||||||
|
## Wire Protocol
|
||||||
|
|
||||||
|
When the `rpc` feature is enabled, irpc uses the following wire format over QUIC streams:
|
||||||
|
|
||||||
|
### Message Framing
|
||||||
|
|
||||||
|
Every message on the wire is **length-prefixed using postcard varints** (LEB128 encoding):
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────┬──────────────────────┐
|
||||||
|
│ varint length │ postcard-serialized │
|
||||||
|
│ (1-10 bytes) │ message data │
|
||||||
|
└─────────────────┴──────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
- **Length prefix**: LEB128 varint encoding of `u64` length. Each byte uses 7 bits for the value and the MSB as a continuation bit. Maximum 10 bytes for a full `u64`.
|
||||||
|
- **Payload**: Postcard-encoded (compact, no-schema serde format) Rust message.
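
The framing can be reproduced with a few lines of standard-library Rust. This is a sketch of unsigned LEB128 length-prefixing as described above; the function names are hypothetical, and the payload is treated as opaque bytes (postcard serialization itself is out of scope here).

```rust
// Encode `value` as an unsigned LEB128 varint: 7 payload bits per byte,
// MSB set on every byte except the last.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

// Decode a varint from the front of `input`; returns (value, bytes consumed).
fn decode_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated input, or no terminating byte within 10 bytes
}

// Frame an already-serialized payload: varint length, then the bytes.
fn frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 10);
    encode_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}

// Split one frame off the front of `input`; returns (payload, rest).
fn unframe(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let (len, used) = decode_varint(input)?;
    let body = input.get(used..)?;
    if (body.len() as u64) < len {
        return None; // the declared length runs past the available bytes
    }
    Some(body.split_at(len as usize))
}
```

For example, a length of 300 encodes as the two bytes `0xAC 0x02`, and `u64::MAX` needs the full 10 bytes.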
|
||||||
|
|
||||||
|
### Maximum Message Size
|
||||||
|
|
||||||
|
`MAX_MESSAGE_SIZE = 16 MiB (16 * 1024 * 1024)`
|
||||||
|
|
||||||
|
Messages exceeding this limit are rejected:
|
||||||
|
- **Send side**: The sender checks `postcard::experimental::serialized_size()` before sending. If exceeded, the stream is reset with error code `1` (`ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED`).
|
||||||
|
- **Receive side**: After reading the varint length, if it exceeds `MAX_MESSAGE_SIZE`, the stream is stopped with error code `1`.
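
The receive-side check amounts to a comparison made before any buffer is allocated. The sketch below assumes the limit is inclusive (a message of exactly 16 MiB is accepted), which follows from the word "exceeds" above but is not confirmed against the source; `LengthCheck` and `check_declared_length` are hypothetical names.

```rust
const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024;
const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;

#[derive(Debug, PartialEq)]
enum LengthCheck {
    // Safe to allocate this many bytes and read the payload.
    Accept(usize),
    // Stop the stream with this QUIC error code instead of reading.
    Reject { error_code: u32 },
}

// Decide what to do with a length prefix that was just read off the wire.
fn check_declared_length(len: u64) -> LengthCheck {
    if len > MAX_MESSAGE_SIZE {
        LengthCheck::Reject {
            error_code: ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED,
        }
    } else {
        LengthCheck::Accept(len as usize)
    }
}
```

Checking the declared length first means a peer cannot force a large allocation just by sending a big varint.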
|
||||||
|
|
||||||
|
### Error Codes
|
||||||
|
|
||||||
|
| Code | Constant | Meaning |
|
||||||
|
|---|---|---|
|
||||||
|
| `1` | `ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED` | Message larger than 16 MiB |
|
||||||
|
| `2` | `ERROR_CODE_INVALID_POSTCARD` | Postcard serialization failed |
|
||||||
|
|
||||||
|
These are used as QUIC stream reset/stop error codes.
|
||||||
|
|
||||||
|
### Connection Closure
|
||||||
|
|
||||||
|
Error code `0` on the QUIC connection means "clean close" — the remote side intentionally shut down. This is distinguished from actual errors.
|
||||||
|
|
||||||
|
## Message Flow: Local Path
|
||||||
|
|
||||||
|
```
|
||||||
|
Client Actor
|
||||||
|
│ │
|
||||||
|
│ Client::rpc(Get { key: "x" }) │
|
||||||
|
│ │
|
||||||
|
│ 1. Create oneshot channel pair │
|
||||||
|
│ (tx, rx) = oneshot::channel() │
|
||||||
|
│ │
|
||||||
|
│ 2. Wrap into WithChannels │
|
||||||
|
│ WithChannels { │
|
||||||
|
│ inner: Get { key: "x" }, │
|
||||||
|
│ tx: oneshot::Sender<Res>, │
|
||||||
|
│ rx: NoReceiver, │
|
||||||
|
│ span: current_span, │
|
||||||
|
│ } │
|
||||||
|
│ │
|
||||||
|
│ 3. Convert to Message enum │
|
||||||
|
│ StorageMessage::Get(wc) │
|
||||||
|
│ │
|
||||||
|
│ 4. Send over mpsc channel ────────►│
|
||||||
|
│ │
|
||||||
|
│ 5. Await on oneshot receiver │
|
||||||
|
│ rx.await ◄─────────────────────│
|
||||||
|
│ tx.send(res)│
|
||||||
|
│ │
|
||||||
|
│ Result: res │
|
||||||
|
```
|
||||||
|
|
||||||
|
For bidirectional streaming:
|
||||||
|
```
|
||||||
|
Client Actor
|
||||||
|
│ │
|
||||||
|
│ Client::bidi_streaming(Sum, 4, 4) │
|
||||||
|
│ │
|
||||||
|
│ 1. Create channel pairs │
|
||||||
|
│ (update_tx, update_rx) │
|
||||||
|
│ (res_tx, res_rx) │
|
||||||
|
│ │
|
||||||
|
│ 2. WithChannels { │
|
||||||
|
│ inner: Sum, │
|
||||||
|
│ tx: mpsc::Sender<i64>, │
|
||||||
|
│ rx: mpsc::Receiver<i64>, │
|
||||||
|
│ } │
|
||||||
|
│ │
|
||||||
|
│ 3. Send message ──────────────────►│
|
||||||
|
│ │
|
||||||
|
│ 4. Use update_tx.send(val) ───────►│
|
||||||
|
│ Use res_rx.recv() ◄─────────│
|
||||||
|
│ res_tx.send(val)
|
||||||
|
│ │
|
||||||
|
```
|
||||||
|
|
||||||
|
## Message Flow: Remote Path
|
||||||
|
|
||||||
|
```
|
||||||
|
Client Server
|
||||||
|
│ │
|
||||||
|
│ Client::rpc(Get { key: "x" }) │
|
||||||
|
│ │
|
||||||
|
│ 1. open_bi() → (SendStream, RecvStream)
|
||||||
|
│ │
|
||||||
|
│ 2. Serialize StorageProtocol::Get(Get { key: "x" })
|
||||||
|
│ with postcard + varint prefix │
|
||||||
|
│ │
|
||||||
|
│ 3. Write to SendStream ───────────►│
|
||||||
|
│ │
|
||||||
|
│ │ 4. Accept bi stream
|
||||||
|
│ │ 5. Read varint + deserialize
|
||||||
|
│ │ 6. RemoteService::with_remote_channels()
|
||||||
|
│ │ → WithChannels { inner, tx, rx }
|
||||||
|
│ │ 7. Forward to local actor
|
||||||
|
│ │
|
||||||
|
│ │ Actor processes, sends response
|
||||||
|
│ │ on the SendStream (which is the
|
||||||
|
│ │ oneshot::Sender<T> backed by QUIC)
|
||||||
|
│ │
|
||||||
|
│ 8. Read from RecvStream ◄──────────│
|
||||||
|
│ 9. Deserialize response │
|
||||||
|
│ │
|
||||||
|
│ Result: res │
|
||||||
|
```
|
||||||
|
|
||||||
|
For bidirectional streaming over remote:
|
||||||
|
```
|
||||||
|
Client Server
|
||||||
|
│ │
|
||||||
|
│ Client::bidi_streaming(Sum, 4, 4) │
|
||||||
|
│ │
|
||||||
|
│ open_bi() → (SendStream, RecvStream)
|
||||||
|
│ │
|
||||||
|
│ SendStream → mpsc::Sender<Update> │ RecvStream → mpsc::Receiver<Update>
|
||||||
|
│ RecvStream → mpsc::Receiver<Res> │ SendStream → mpsc::Sender<Res>
│ (oneshot::Receiver<Res> for │
│ client-streaming with oneshot tx) │
|
||||||
|
│ │
|
||||||
|
│ The initial message is sent on │
|
||||||
|
│ SendStream with varint prefix. │
|
||||||
|
│ │
|
||||||
|
│ Subsequent updates are sent on │
|
||||||
|
│ the same SendStream as varint- │
|
||||||
|
│ prefixed postcard messages. │
|
||||||
|
│ │
|
||||||
|
│ The response stream is read from │
|
||||||
|
│ the RecvStream as varint-prefixed │
|
||||||
|
│ postcard messages. │
|
||||||
|
```
|
||||||
|
|
||||||
|
## Stream Direction Convention
|
||||||
|
|
||||||
|
In irpc's QUIC stream model:
|
||||||
|
- **Client opens** a bidirectional stream (`open_bi()`)
|
||||||
|
- **SendStream** (client → server): carries the initial request message, plus any client-streaming updates
|
||||||
|
- **RecvStream** (server → client): carries the response(s) from the server
|
||||||
|
|
||||||
|
The `RemoteService::with_remote_channels()` method decides how to map streams to channels:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// For a simple RPC (tx=oneshot, rx=none). Sketch of one generated match arm;
// `StorageProtocol::Get` stands in for any variant of the protocol enum.
fn with_remote_channels(self, rx: RecvStream, tx: SendStream) -> Self::Message {
    match self {
        // tx → oneshot::Sender<Res> (or mpsc::Sender<Res>): carries the response
        // rx → NoReceiver: the receive stream is unused
        StorageProtocol::Get(msg) => WithChannels::from((msg, tx.into(), rx.into())).into(),
    }
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The parameter order deserves a closer look. `RemoteService::with_remote_channels` takes `(self, rx: RecvStream, tx: SendStream)`. It runs on the accepting side, so:

- `rx` = the `RecvStream` of the accepted bidirectional stream (carries what the client writes)
- `tx` = the `SendStream` of the accepted bidirectional stream (carries what the client reads)

On the **server side**, therefore, the `RecvStream` is what the server reads from (client updates), and the `SendStream` is what the server writes to (server responses).
|
||||||
|
|
||||||
|
In the `with_remote_channels` generated code:
|
||||||
|
```rust
|
||||||
|
// For rpc(tx=oneshot::Sender<Res>, rx=mpsc::Receiver<Update>):
|
||||||
|
WithChannels::from((msg, tx.into(), rx.into()))
|
||||||
|
// tx (SendStream) → oneshot::Sender<Res> — server writes response
|
||||||
|
// rx (RecvStream) → mpsc::Receiver<Update> — server reads client updates
|
||||||
|
```
|
||||||
|
|
||||||
|
So the naming in `with_remote_channels` is from the **server's perspective**:
|
||||||
|
- `rx` parameter = RecvStream = what server receives (client → server updates)
|
||||||
|
- `tx` parameter = SendStream = what server sends (server → client responses)
|
||||||
|
|
||||||
|
## Connection Management
|
||||||
|
|
||||||
|
### NoqLazyRemoteConnection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct NoqLazyRemoteConnection(Arc<NoqLazyRemoteConnectionInner>);
|
||||||
|
|
||||||
|
struct NoqLazyRemoteConnectionInner {
|
||||||
|
endpoint: noq::Endpoint,
|
||||||
|
addr: SocketAddr,
|
||||||
|
connection: Mutex<Option<noq::Connection>>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- Lazily establishes connection on first use
|
||||||
|
- Caches the `noq::Connection` inside a `Mutex<Option<...>>`
|
||||||
|
- On `open_bi()`: if cached connection exists, tries to reuse it; if it fails, clears cache and reconnects once
|
||||||
|
- Thread-safe via `Arc` + `Mutex`
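
The cache-and-retry-once logic can be sketched synchronously. Everything here is a stand-in: `Dialer`, `LazyConnection`, and `FakeDialer` are hypothetical names, errors are plain `String`s, and a `std::sync::Mutex` is held across the reconnect, which a real async implementation would have to handle differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// Hypothetical transport: `connect` dials, `open` opens a stream on a connection.
trait Dialer {
    type Conn;
    type Stream;
    fn connect(&self) -> Result<Self::Conn, String>;
    fn open(&self, conn: &Self::Conn) -> Result<Self::Stream, String>;
}

struct LazyConnection<D: Dialer> {
    dialer: D,
    cached: Mutex<Option<D::Conn>>,
}

impl<D: Dialer> LazyConnection<D> {
    fn new(dialer: D) -> Self {
        LazyConnection { dialer, cached: Mutex::new(None) }
    }

    fn open_stream(&self) -> Result<D::Stream, String> {
        let mut cached = self.cached.lock().unwrap();
        // Fast path: try the cached connection, if there is one.
        let reused = match &*cached {
            Some(conn) => self.dialer.open(conn).ok(),
            None => None,
        };
        if let Some(stream) = reused {
            return Ok(stream);
        }
        // Nothing cached, or the cached connection is stale: reconnect once.
        *cached = None;
        let conn = self.dialer.connect()?;
        let stream = self.dialer.open(&conn)?;
        *cached = Some(conn);
        Ok(stream)
    }
}

// Test double: connections are numbered; `dead` names one that no longer works.
struct FakeDialer {
    connects: AtomicUsize,
    dead: AtomicUsize,
}

impl Dialer for FakeDialer {
    type Conn = usize;
    type Stream = usize; // id of the connection the stream was opened on

    fn connect(&self) -> Result<usize, String> {
        Ok(self.connects.fetch_add(1, Ordering::SeqCst) + 1)
    }

    fn open(&self, conn: &usize) -> Result<usize, String> {
        if *conn == self.dead.load(Ordering::SeqCst) {
            Err("connection lost".to_string())
        } else {
            Ok(*conn)
        }
    }
}
```

A failed second attempt is returned to the caller rather than retried again, which keeps a dead peer from turning one call into an unbounded reconnect loop.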
|
||||||
|
|
||||||
|
### IrohLazyRemoteConnection (irpc-iroh)
|
||||||
|
|
||||||
|
Same pattern but for iroh endpoints, with an additional `alpn` field for protocol identification.
|
||||||
|
|
||||||
|
### 0-RTT Support
|
||||||
|
|
||||||
|
irpc supports QUIC 0-RTT for reduced latency on reconnections:
|
||||||
|
|
||||||
|
- `Client::rpc_0rtt()` — sends the request immediately as 0-RTT data; if the server rejects 0-RTT, the request is re-sent over a new connection
|
||||||
|
- `Client::server_streaming_0rtt()` — same for server-streaming
|
||||||
|
- `Client::notify_0rtt()` — same for fire-and-forget
|
||||||
|
|
||||||
|
The 0-RTT flow:
|
||||||
|
1. Client serializes the message into a buffer (`prepare_write()`)
|
||||||
|
2. Sends the buffer over a 0-RTT connection
|
||||||
|
3. Awaits `zero_rtt_accepted()` to check if 0-RTT was accepted
|
||||||
|
4. If not accepted, opens a new connection and re-sends the same buffer
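
The four steps can be condensed into one function. This is a synchronous sketch with hypothetical names (`Conn`, `send_0rtt`, `FakeConn`); copying the request bytes stands in for `prepare_write()`, and the `reconnect` closure stands in for opening a regular connection.

```rust
// Hypothetical connection abstraction for the sketch.
trait Conn {
    fn send(&mut self, bytes: &[u8]);
    fn zero_rtt_accepted(&self) -> bool;
}

// Returns the connection that ended up carrying the request.
fn send_0rtt<C, F>(mut early: C, reconnect: F, request: &[u8]) -> C
where
    C: Conn,
    F: FnOnce() -> C,
{
    // 1. Serialize once into a buffer (stand-in: copy the bytes).
    let buf = request.to_vec();
    // 2. Send optimistically as early data.
    early.send(&buf);
    // 3. Check whether the server accepted the early data.
    if early.zero_rtt_accepted() {
        return early;
    }
    // 4. Rejected: open a regular connection and re-send the same buffer.
    let mut full = reconnect();
    full.send(&buf);
    full
}

// Test double that records what was sent on it.
struct FakeConn {
    accepted: bool,
    sent: Vec<Vec<u8>>,
}

impl FakeConn {
    fn new(accepted: bool) -> Self {
        FakeConn { accepted, sent: Vec::new() }
    }
}

impl Conn for FakeConn {
    fn send(&mut self, bytes: &[u8]) {
        self.sent.push(bytes.to_vec());
    }

    fn zero_rtt_accepted(&self) -> bool {
        self.accepted
    }
}
```

Serializing up front is what makes the retry cheap: the fallback path only repeats the write, not the encoding.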
|
||||||
|
|
||||||
|
`RemoteConnection::zero_rtt_accepted()` returns `true` for regular connections and for lazy connections. For `IrohZrttRemoteConnection`, it checks the actual 0-RTT status via `handshake_completed()`.
|
||||||
|
|
||||||
|
## Server-Side: Accepting Connections
|
||||||
|
|
||||||
|
### Using noq (direct QUIC)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
irpc::rpc::listen(endpoint, handler)
|
||||||
|
```
|
||||||
|
|
||||||
|
This function:
|
||||||
|
1. Loops on `endpoint.accept()` to accept incoming connections
|
||||||
|
2. For each connection, spawns a task running `handle_connection()`
|
||||||
|
3. `handle_connection()` loops on `read_request_raw()` to read requests from bidirectional streams
|
||||||
|
4. Each request is deserialized and passed to the `Handler`
|
||||||
|
|
||||||
|
### Using iroh
|
||||||
|
|
||||||
|
```rust
|
||||||
|
IrohProtocol::with_sender(local_sender)
|
||||||
|
```
|
||||||
|
|
||||||
|
This creates a `ProtocolHandler` that can be registered with `iroh::protocol::Router`. When a connection arrives, it calls `handle_connection()` from irpc-iroh, which handles the protocol handshake and reads requests.
|
||||||
|
|
||||||
|
For 0-RTT support:
|
||||||
|
```rust
|
||||||
|
Iroh0RttProtocol::with_sender(local_sender)
|
||||||
|
```
|
||||||
|
|
||||||
|
This implements `ProtocolHandler::on_accepting()` to handle 0-RTT connections.
|
||||||
|
|
||||||
|
### Handler Function
|
||||||
|
|
||||||
|
```rust
|
||||||
|
type Handler<R> = Arc<
|
||||||
|
dyn Fn(R, noq::RecvStream, noq::SendStream) -> BoxFuture<Result<(), SendError>>
|
||||||
|
+ Send + Sync + 'static,
|
||||||
|
>;
|
||||||
|
```
|
||||||
|
|
||||||
|
The handler receives:
|
||||||
|
1. The deserialized protocol message (`R`)
|
||||||
|
2. The `RecvStream` (for client → server updates)
|
||||||
|
3. The `SendStream` (for server → client responses)
|
||||||
|
|
||||||
|
Typically created via `Protocol::remote_handler(local_sender)`, which converts streams to typed channels and forwards the `WithChannels` message to a local actor.
|
||||||
278
docs/research/references/iroh/irpc/05-rpc-requests-macro.md
Normal file
@@ -0,0 +1,278 @@
|
|||||||
|
# irpc: The rpc_requests Macro
|
||||||
|
|
||||||
|
The `#[rpc_requests]` attribute macro is the primary way to define an irpc protocol. It generates the boilerplate for channel typing, message wrapping, and service trait implementations.
|
||||||
|
|
||||||
|
## Basic Usage
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use irpc::{channel::{mpsc, oneshot}, rpc_requests, Client, WithChannels};
|
||||||
|
use serde::{Deserialize, Serialize};
|
||||||
|
|
||||||
|
#[rpc_requests(message = ComputeMessage)]
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
enum ComputeProtocol {
|
||||||
|
/// Unary RPC: one request, one response
|
||||||
|
#[rpc(tx=oneshot::Sender<i64>)]
|
||||||
|
#[wrap(Multiply)]
|
||||||
|
Multiply(i64, i64),
|
||||||
|
|
||||||
|
/// Bidirectional streaming
|
||||||
|
#[rpc(tx=mpsc::Sender<i64>, rx=mpsc::Receiver<i64>)]
|
||||||
|
#[wrap(Sum)]
|
||||||
|
Sum,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This single macro invocation generates:
|
||||||
|
|
||||||
|
1. **Wrapper structs** (from `#[wrap]`): `Multiply` and `Sum` struct types
|
||||||
|
2. **`Channels<ComputeProtocol>` impls**: For each variant's inner type, specifying `Tx` and `Rx`
|
||||||
|
3. **`Service` impl**: `impl Service for ComputeProtocol { type Message = ComputeMessage; }`
|
||||||
|
4. **`RemoteService` impl** (rpc feature): Maps protocol variants + QUIC streams to messages
|
||||||
|
5. **`ComputeMessage` enum**: Wraps each request in `WithChannels`
|
||||||
|
6. **`From` conversions**: Between inner types, `ComputeProtocol`, and `ComputeMessage`
|
||||||
|
|
||||||
|
## Macro Arguments
|
||||||
|
|
||||||
|
### Top-level (on the enum)
|
||||||
|
|
||||||
|
| Argument | Required | Description |
|---|---|---|
| `message = Name` | Recommended | Name of the generated message enum. Also generates `Service` and `RemoteService` impls. |
| `alias = "Suffix"` | Optional | Generates type aliases like `MultiplyMsg = WithChannels<Multiply, ComputeProtocol>` |
| `rpc_feature = "feat"` | Optional | Feature-gates the `RemoteService` impl with `#[cfg(feature = "feat")]` |
| `no_rpc` | Optional | Skips generating `RemoteService` impl entirely |
| `no_spans` | Optional | Skips span-related code (for use without the `spans` feature) |
|
||||||
|
|
||||||
|
### Per-variant
|
||||||
|
|
||||||
|
#### `#[rpc(tx=Type, rx=Type)]`
|
||||||
|
|
||||||
|
Specifies channel types for each request:
|
||||||
|
- `tx` — response channel type (server → client). Defaults to `NoSender`.
|
||||||
|
- `rx` — update channel type (client → server). Defaults to `NoReceiver`.
|
||||||
|
|
||||||
|
Valid types:
|
||||||
|
- `oneshot::Sender<T>` — single response
|
||||||
|
- `mpsc::Sender<T>` — streaming response
|
||||||
|
- `oneshot::Receiver<T>` — not valid as tx (use for rx pattern)
|
||||||
|
- `mpsc::Receiver<T>` — streaming updates (client → server)
|
||||||
|
- `NoSender` / `NoReceiver` — no channel in that direction
|
||||||
|
|
||||||
|
#### `#[wrap(TypeName, derive(Traits))]`
|
||||||
|
|
||||||
|
Generates a struct from the variant's fields:
|
||||||
|
- `TypeName` — name of the generated struct
|
||||||
|
- Optional visibility prefix (e.g., `pub(crate) TypeName`)
|
||||||
|
- `derive(...)` — additional derive macros beyond the default `Serialize, Deserialize, Debug`
|
||||||
|
|
||||||
|
If `#[wrap]` is not used, each variant must have exactly one unnamed field (a named type).
|
||||||
|
|
||||||
|
## Generated Code Walkthrough
|
||||||
|
|
||||||
|
Given this input:
|
||||||
|
```rust
|
||||||
|
#[rpc_requests(message = StoreMessage)]
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
enum StoreProtocol {
|
||||||
|
#[rpc(tx=oneshot::Sender<String>)]
|
||||||
|
#[wrap(GetRequest, derive(Clone))]
|
||||||
|
Get(String),
|
||||||
|
|
||||||
|
#[rpc(tx=oneshot::Sender<()>)]
|
||||||
|
#[wrap(SetRequest)]
|
||||||
|
Set { key: String, value: String },
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The macro generates:
|
||||||
|
|
||||||
|
### 1. Wrapper Structs
|
||||||
|
|
||||||
|
```rust
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct GetRequest(pub String);

#[derive(Debug, Serialize, Deserialize)]
pub struct SetRequest { pub key: String, pub value: String }
```
|
||||||
|
|
||||||
|
The variants are rewritten to use these:
|
||||||
|
```rust
|
||||||
|
enum StoreProtocol {
|
||||||
|
Get(GetRequest),
|
||||||
|
Set(SetRequest),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Channels Implementations
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl Channels<StoreProtocol> for GetRequest {
|
||||||
|
type Tx = oneshot::Sender<String>;
|
||||||
|
type Rx = NoReceiver;
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Channels<StoreProtocol> for SetRequest {
|
||||||
|
type Tx = oneshot::Sender<()>;
|
||||||
|
type Rx = NoReceiver;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. Message Enum
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[doc = "Message enum for [`StoreProtocol`]"]
|
||||||
|
#[allow(missing_docs)]
|
||||||
|
#[derive(Debug)]
|
||||||
|
pub enum StoreMessage {
|
||||||
|
Get(WithChannels<GetRequest, StoreProtocol>),
|
||||||
|
Set(WithChannels<SetRequest, StoreProtocol>),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4. Service Implementation
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl Service for StoreProtocol {
|
||||||
|
type Message = StoreMessage;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 5. RemoteService Implementation (rpc feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl RemoteService for StoreProtocol {
|
||||||
|
fn with_remote_channels(
|
||||||
|
self,
|
||||||
|
rx: noq::RecvStream,
|
||||||
|
tx: noq::SendStream,
|
||||||
|
) -> Self::Message {
|
||||||
|
match self {
|
||||||
|
StoreProtocol::Get(msg) => {
|
||||||
|
StoreMessage::from(WithChannels::from((msg, tx, rx)))
|
||||||
|
}
|
||||||
|
StoreProtocol::Set(msg) => {
|
||||||
|
StoreMessage::from(WithChannels::from((msg, tx, rx)))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 6. From Conversions
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Inner type → Protocol enum
|
||||||
|
impl From<GetRequest> for StoreProtocol { ... }
|
||||||
|
impl From<SetRequest> for StoreProtocol { ... }
|
||||||
|
|
||||||
|
// WithChannels → Message enum
|
||||||
|
impl From<WithChannels<GetRequest, StoreProtocol>> for StoreMessage { ... }
|
||||||
|
impl From<WithChannels<SetRequest, StoreProtocol>> for StoreMessage { ... }
|
||||||
|
```
|
||||||
|
|
||||||
|
### 7. parent_span Method (spans feature)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl StoreMessage {
|
||||||
|
pub fn parent_span(&self) -> tracing::Span {
|
||||||
|
let span = match self {
|
||||||
|
StoreMessage::Get(inner) => inner.parent_span_opt(),
|
||||||
|
StoreMessage::Set(inner) => inner.parent_span_opt(),
|
||||||
|
};
|
||||||
|
span.cloned().unwrap_or_else(|| tracing::Span::current())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Interaction Pattern Mapping
|
||||||
|
|
||||||
|
The `#[rpc]` attribute maps directly to gRPC-like patterns:
|
||||||
|
|
||||||
|
| Pattern | `tx` type | `rx` type | Example |
|
||||||
|
|---|---|---|---|
|
||||||
|
| **Unary RPC** | `oneshot::Sender<R>` | `NoReceiver` | Get by key, return value |
|
||||||
|
| **Server streaming** | `mpsc::Sender<R>` | `NoReceiver` | List all items |
|
||||||
|
| **Client streaming** | `oneshot::Sender<R>` | `mpsc::Receiver<U>` | Upload items, get count |
|
||||||
|
| **Bidirectional** | `mpsc::Sender<R>` | `mpsc::Receiver<U>` | Chat, live updates |
|
||||||
|
| **Notify (fire & forget)** | `NoSender` | `NoReceiver` | Log event |
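
The table can be read as a lookup from the pair of channel kinds to an interaction pattern. The following is a minimal sketch of that lookup in plain Rust; `TxKind`, `RxKind`, and `pattern` are hypothetical names chosen for illustration and are not part of irpc.

```rust
/// Hypothetical stand-ins for the `tx` channel kinds named in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TxKind {
    Oneshot,
    Mpsc,
    NoSender,
}

/// Hypothetical stand-ins for the `rx` channel kinds named in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RxKind {
    Mpsc,
    NoReceiver,
}

/// Maps a (tx, rx) pair to the pattern listed in the table.
/// Returns `None` for the one combination the table does not list.
fn pattern(tx: TxKind, rx: RxKind) -> Option<&'static str> {
    match (tx, rx) {
        (TxKind::Oneshot, RxKind::NoReceiver) => Some("unary"),
        (TxKind::Mpsc, RxKind::NoReceiver) => Some("server streaming"),
        (TxKind::Oneshot, RxKind::Mpsc) => Some("client streaming"),
        (TxKind::Mpsc, RxKind::Mpsc) => Some("bidirectional"),
        (TxKind::NoSender, RxKind::NoReceiver) => Some("notify"),
        (TxKind::NoSender, RxKind::Mpsc) => None,
    }
}
```

In irpc itself the mapping is expressed through the `Channels` associated types rather than a runtime match; the sketch only makes the table's structure explicit.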
|
||||||
|
|
||||||
|
## Client Methods Generated by Patterns
|
||||||
|
|
||||||
|
The `Client<S>` methods correspond to channel types:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Unary RPC: tx=oneshot::Sender<Res>, rx=NoReceiver
|
||||||
|
client.rpc(Get { key: "x" }).await // → Result<Res>
|
||||||
|
|
||||||
|
// Server streaming: tx=mpsc::Sender<Res>, rx=NoReceiver
|
||||||
|
client.server_streaming(List, 16).await // → Result<mpsc::Receiver<Res>>
|
||||||
|
|
||||||
|
// Client streaming: tx=oneshot::Sender<Res>, rx=mpsc::Receiver<Update>
|
||||||
|
client.client_streaming(SetMany, 4).await // → Result<(mpsc::Sender<Update>, oneshot::Receiver<Res>)>
|
||||||
|
|
||||||
|
// Bidirectional: tx=mpsc::Sender<Res>, rx=mpsc::Receiver<Update>
|
||||||
|
client.bidi_streaming(Sum, 4, 4).await // → Result<(mpsc::Sender<Update>, mpsc::Receiver<Res>)>
|
||||||
|
|
||||||
|
// Notify: tx=NoSender, rx=NoReceiver
|
||||||
|
client.notify(Log { msg: "hi" }).await // → Result<()>
|
||||||
|
```
|
||||||
|
|
||||||
|
## Manual Protocol Definition (Without Macro)
|
||||||
|
|
||||||
|
You can define protocols manually instead of using the macro:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use irpc::{channel::{none::NoReceiver, oneshot}, rpc::RemoteService, Channels, Service, WithChannels};
|
||||||
|
use serde::{Deserialize, Serialize};
|
||||||
|
|
||||||
|
// 1. Define request types
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
struct Get { key: String }
|
||||||
|
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
struct Set { key: String, value: String }
|
||||||
|
|
||||||
|
// 2. Implement Channels for each type
|
||||||
|
impl Channels<StorageProtocol> for Get {
|
||||||
|
type Tx = oneshot::Sender<Option<String>>;
|
||||||
|
type Rx = NoReceiver;
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Channels<StorageProtocol> for Set {
|
||||||
|
type Tx = oneshot::Sender<()>;
|
||||||
|
type Rx = NoReceiver;
|
||||||
|
}
|
||||||
|
|
||||||
|
// 3. Define protocol enum
|
||||||
|
#[derive(derive_more::From, Serialize, Deserialize, Debug)]
|
||||||
|
enum StorageProtocol {
|
||||||
|
Get(Get),
|
||||||
|
Set(Set),
|
||||||
|
}
|
||||||
|
|
||||||
|
// 4. Define message enum
|
||||||
|
#[derive(derive_more::From)]
|
||||||
|
enum StorageMessage {
|
||||||
|
Get(WithChannels<Get, StorageProtocol>),
|
||||||
|
Set(WithChannels<Set, StorageProtocol>),
|
||||||
|
}
|
||||||
|
|
||||||
|
// 5. Implement Service
|
||||||
|
impl Service for StorageProtocol {
|
||||||
|
type Message = StorageMessage;
|
||||||
|
}
|
||||||
|
|
||||||
|
// 6. Implement RemoteService (rpc feature)
|
||||||
|
impl RemoteService for StorageProtocol {
|
||||||
|
fn with_remote_channels(self, rx: noq::RecvStream, tx: noq::SendStream) -> Self::Message {
|
||||||
|
match self {
|
||||||
|
StorageProtocol::Get(msg) => WithChannels::from((msg, tx, rx)).into(),
|
||||||
|
StorageProtocol::Set(msg) => WithChannels::from((msg, tx, rx)).into(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This manual approach gives full control but requires more boilerplate. The macro generates all of this automatically.
|
||||||
@@ -0,0 +1,274 @@
|
|||||||
|
# irpc: RPC Module and Remote Transport
|
||||||
|
|
||||||
|
The `rpc` module (enabled by the `rpc` feature) contains all cross-process RPC functionality: QUIC stream handling, connection management, serialization, and server-side request processing.
|
||||||
|
|
||||||
|
## Module Structure
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub mod rpc {
|
||||||
|
pub const MAX_MESSAGE_SIZE: u64 = 1024 * 1024 * 16;
|
||||||
|
pub const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;
|
||||||
|
pub const ERROR_CODE_INVALID_POSTCARD: u32 = 2;
|
||||||
|
|
||||||
|
pub enum WriteError { Noq, MaxMessageSizeExceeded, Io }
|
||||||
|
pub trait RemoteConnection: Send + Sync + Debug + 'static { ... }
|
||||||
|
pub struct RemoteSender<S>(SendStream, RecvStream, PhantomData<S>);
|
||||||
|
pub type Handler<R> = Arc<dyn Fn(R, RecvStream, SendStream) -> BoxFuture<Result<(), SendError>> + Send + Sync>;
|
||||||
|
pub trait RemoteService: Service + Sized { ... }
|
||||||
|
pub async fn listen<R>(endpoint, handler);
|
||||||
|
pub async fn handle_connection<R>(connection, handler) -> io::Result<()>;
|
||||||
|
pub async fn read_request<S: RemoteService>(connection) -> io::Result<Option<S::Message>>;
|
||||||
|
pub async fn read_request_raw<R>(connection) -> io::Result<Option<(R, RecvStream, SendStream)>>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## RemoteConnection Implementations
|
||||||
|
|
||||||
|
### NoqLazyRemoteConnection
|
||||||
|
|
||||||
|
The default remote connection for noq (QUIC-by-socket-address):
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct NoqLazyRemoteConnection(Arc<NoqLazyRemoteConnectionInner>);
|
||||||
|
|
||||||
|
struct NoqLazyRemoteConnectionInner {
|
||||||
|
endpoint: noq::Endpoint,
|
||||||
|
addr: SocketAddr,
|
||||||
|
connection: Mutex<Option<noq::Connection>>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Behavior:**
|
||||||
|
- `open_bi()`:
|
||||||
|
1. Locks the `Mutex<Option<Connection>>`
|
||||||
|
2. If a cached connection exists, tries `conn.open_bi()`
|
||||||
|
3. If that fails, clears the cache and establishes a new connection
|
||||||
|
4. If no cached connection, establishes a new one
|
||||||
|
5. Returns `(SendStream, RecvStream)` pair
|
||||||
|
- `zero_rtt_accepted()`: Always returns `true` (noq has no notion of 0-RTT in this context)
|
||||||
|
- `clone_boxed()`: Clones the `Arc`, sharing the same connection cache
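
The caching behavior of `open_bi()` can be sketched without any QUIC dependency. The example below is a simplified, synchronous illustration under stated assumptions: `FakeConnection`, `LazyConnection`, `break_cached`, and `connect_count` are hypothetical names, and the "stream" is just the connection id. It is not the irpc implementation.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a QUIC connection. `alive` simulates whether
/// opening a stream on it still succeeds.
struct FakeConnection {
    id: usize,
    alive: AtomicBool,
}

impl FakeConnection {
    /// Returns the connection id as a stand-in for a (send, recv) stream pair.
    fn open_bi(&self) -> Result<usize, &'static str> {
        if self.alive.load(Ordering::SeqCst) {
            Ok(self.id)
        } else {
            Err("connection closed")
        }
    }
}

/// Caches one connection and replaces it when opening a stream fails.
struct LazyConnection {
    cached: Mutex<Option<FakeConnection>>,
    connects: AtomicUsize,
}

impl LazyConnection {
    fn new() -> Self {
        Self { cached: Mutex::new(None), connects: AtomicUsize::new(0) }
    }

    /// Simulated dial: every call yields a fresh, live connection.
    fn connect(&self) -> FakeConnection {
        let id = self.connects.fetch_add(1, Ordering::SeqCst) + 1;
        FakeConnection { id, alive: AtomicBool::new(true) }
    }

    fn open_bi(&self) -> Result<usize, &'static str> {
        let mut cached = self.cached.lock().unwrap();
        // Try the cached connection first, if there is one.
        if let Some(stream) = cached.as_ref().and_then(|conn| conn.open_bi().ok()) {
            return Ok(stream);
        }
        // Nothing cached, or the cached connection is dead: dial again.
        let conn = self.connect();
        let stream = conn.open_bi()?;
        *cached = Some(conn);
        Ok(stream)
    }

    /// Test helper: marks the cached connection as closed.
    fn break_cached(&self) {
        if let Some(conn) = self.cached.lock().unwrap().as_ref() {
            conn.alive.store(false, Ordering::SeqCst);
        }
    }

    /// Number of times a new connection was dialed.
    fn connect_count(&self) -> usize {
        self.connects.load(Ordering::SeqCst)
    }
}
```

Calling `open_bi()` twice reuses the first connection; after `break_cached()`, the next `open_bi()` dials a second one. Holding the lock across the dial keeps concurrent callers from opening duplicate connections, at the cost of serializing them while a reconnect is in flight.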
|
||||||
|
|
||||||
|
### Direct noq::Connection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl RemoteConnection for noq::Connection {
|
||||||
|
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
|
||||||
|
// Directly opens a bidirectional stream on the connection
|
||||||
|
}
|
||||||
|
fn zero_rtt_accepted(&self) -> BoxFuture<bool> { Box::pin(async { true }) }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## RemoteSender
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct RemoteSender<S>(noq::SendStream, noq::RecvStream, PhantomData<S>);
|
||||||
|
```
|
||||||
|
|
||||||
|
Created by `Client::request()` when the client is remote. Holds both sides of a QUIC bidirectional stream.
|
||||||
|
|
||||||
|
### Key Methods
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl<S: Service> RemoteSender<S> {
|
||||||
|
pub fn new(send: SendStream, recv: RecvStream) -> Self;
|
||||||
|
|
||||||
|
pub async fn write(self, msg: impl Into<S>) -> Result<(SendStream, RecvStream), WriteError> {
|
||||||
|
let buf = prepare_write(msg)?;
|
||||||
|
self.write_raw(&buf).await
|
||||||
|
}
|
||||||
|
|
||||||
|
// Internal: writes pre-serialized buffer
|
||||||
|
pub(crate) async fn write_raw(self, buf: &[u8]) -> Result<(SendStream, RecvStream), WriteError>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `write()` method:
|
||||||
|
1. Converts `msg` into the protocol enum `S` via `Into`
|
||||||
|
2. Checks serialized size against `MAX_MESSAGE_SIZE`
|
||||||
|
3. Length-prefixes with varint + postcard serialization
|
||||||
|
4. Writes to the `SendStream`
|
||||||
|
5. Returns the stream pair (now usable for response channels)
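
Steps 2 and 3 amount to a size check followed by varint length-prefixed framing. The sketch below shows that framing on an already serialized payload, using only the standard library. `frame`, `write_varint`, and `FrameError` are hypothetical names, and the byte slice stands in for the postcard output; the real code path goes through `prepare_write`.

```rust
/// Same 16 MiB limit as `MAX_MESSAGE_SIZE` in the module listing.
const MAX_MESSAGE_SIZE: u64 = 1024 * 1024 * 16;

#[derive(Debug, PartialEq, Eq)]
enum FrameError {
    MaxMessageSizeExceeded,
}

/// Appends `value` as an unsigned LEB128 varint: 7 payload bits per byte,
/// least significant group first, high bit set on every byte but the last.
fn write_varint(buf: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            buf.push(byte);
            return;
        }
        buf.push(byte | 0x80);
    }
}

/// Builds one frame: varint(payload length) followed by the payload bytes.
/// `payload` stands in for the serialized protocol message.
fn frame(payload: &[u8]) -> Result<Vec<u8>, FrameError> {
    if payload.len() as u64 > MAX_MESSAGE_SIZE {
        return Err(FrameError::MaxMessageSizeExceeded);
    }
    // A u64 varint needs at most 10 bytes.
    let mut buf = Vec::with_capacity(payload.len() + 10);
    write_varint(&mut buf, payload.len() as u64);
    buf.extend_from_slice(payload);
    Ok(buf)
}
```

For a 300-byte payload the prefix is the two bytes `0xAC 0x02`, so the frame is 302 bytes long. Rejecting oversized messages before writing means the receiver never sees a partial frame for a message it would refuse anyway.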
|
||||||
|
|
||||||
|
The `write_raw()` method exists for the 0-RTT path: the message is serialized once, so the same bytes can be re-sent if the server rejects the early data.
|
||||||
|
|
||||||
|
### prepare_write
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn prepare_write<S: Service>(msg: impl Into<S>) -> Result<SmallVec<[u8; 128]>, WriteError> {
|
||||||
|
let msg = msg.into();
|
||||||
|
if postcard::experimental::serialized_size(&msg)? as u64 > MAX_MESSAGE_SIZE {
|
||||||
|
return Err(WriteError::MaxMessageSizeExceeded);
|
||||||
|
}
|
||||||
|
let mut buf = SmallVec::<[u8; 128]>::new();
|
||||||
|
buf.write_length_prefixed(&msg)?;
|
||||||
|
Ok(buf)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Uses `SmallVec<[u8; 128]>` to avoid heap allocation for small messages.
|
||||||
|
|
||||||
|
## Stream-to-Channel Conversions
|
||||||
|
|
||||||
|
When a QUIC stream pair is received on the server side, it needs to be converted into typed channels. The `From` implementations handle this:
|
||||||
|
|
||||||
|
### SendStream → Channel Tx
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// NoSender: drop the stream
|
||||||
|
impl From<SendStream> for NoSender { ... }
|
||||||
|
|
||||||
|
// Oneshot: serialize and send single value, then done
|
||||||
|
impl<T: RpcMessage> From<SendStream> for oneshot::Sender<T> { ... }
|
||||||
|
|
||||||
|
// MPSC: repeatedly serialize and send values
|
||||||
|
impl<T: RpcMessage> From<SendStream> for mpsc::Sender<T> { ... }
|
||||||
|
```
|
||||||
|
|
||||||
|
### RecvStream → Channel Rx
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// NoReceiver: drop the stream
|
||||||
|
impl From<RecvStream> for NoReceiver { ... }
|
||||||
|
|
||||||
|
// Oneshot: read single length-prefixed value
|
||||||
|
impl<T: DeserializeOwned> From<RecvStream> for oneshot::Receiver<T> { ... }
|
||||||
|
|
||||||
|
// MPSC: repeatedly read length-prefixed values
|
||||||
|
impl<T: RpcMessage> From<RecvStream> for mpsc::Receiver<T> { ... }
|
||||||
|
```
|
||||||
|
|
||||||
|
## Server-Side Request Processing
|
||||||
|
|
||||||
|
### read_request_raw
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn read_request_raw<R: DeserializeOwned + 'static>(
|
||||||
|
connection: &noq::Connection,
|
||||||
|
) -> io::Result<Option<(R, RecvStream, SendStream)>>
|
||||||
|
```
|
||||||
|
|
||||||
|
1. Calls `connection.accept_bi()` to accept an incoming bidirectional stream
|
||||||
|
2. If `ApplicationClosed(0)`, returns `Ok(None)` (clean shutdown)
|
||||||
|
3. Reads a varint length prefix from the `RecvStream`
|
||||||
|
4. Checks against `MAX_MESSAGE_SIZE`
|
||||||
|
5. Reads `length` bytes from the stream
|
||||||
|
6. Deserializes with `postcard::from_bytes::<R>()`
|
||||||
|
7. Returns `(deserialized_message, RecvStream, SendStream)`
|
||||||
|
|
||||||
|
### read_request (typed)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn read_request<S: RemoteService>(
|
||||||
|
connection: &noq::Connection,
|
||||||
|
) -> io::Result<Option<S::Message>>
|
||||||
|
```
|
||||||
|
|
||||||
|
Calls `read_request_raw()` and then applies `S::with_remote_channels()` to convert the raw protocol message + stream pair into a `WithChannels`-wrapped `Message`.
|
||||||
|
|
||||||
|
### handle_connection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn handle_connection<R: DeserializeOwned + 'static>(
|
||||||
|
connection: noq::Connection,
|
||||||
|
handler: Handler<R>,
|
||||||
|
) -> io::Result<()>
|
||||||
|
```
|
||||||
|
|
||||||
|
Loops:
|
||||||
|
1. Calls `read_request_raw()` to get the next request
|
||||||
|
2. If `None`, returns `Ok(())` (connection closed)
|
||||||
|
3. Invokes `handler(msg, rx, tx)` to process the request
|
||||||
|
4. Continues until the connection closes or an error occurs
|
||||||
|
|
||||||
|
Each connection is handled in a separate task (spawned by `listen()`).
|
||||||
|
|
||||||
|
### listen
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn listen<R: DeserializeOwned + 'static>(
|
||||||
|
endpoint: noq::Endpoint,
|
||||||
|
handler: Handler<R>,
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
The top-level server loop:
|
||||||
|
1. Accepts incoming connections from the `noq::Endpoint`
|
||||||
|
2. Spawns a task for each connection
|
||||||
|
3. Each task calls `handle_connection()`
|
||||||
|
4. Uses a `JoinSet` to manage and clean up completed tasks
|
||||||
|
|
||||||
|
## The Handler and Local Forwarding
|
||||||
|
|
||||||
|
The typical handler is created by `Protocol::remote_handler(local_sender)`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn remote_handler(local_sender: LocalSender<Self>) -> Handler<Self> {
|
||||||
|
Arc::new(move |msg, rx, tx| {
|
||||||
|
let msg = Self::with_remote_channels(msg, rx, tx);
|
||||||
|
Box::pin(local_sender.send_raw(msg))
|
||||||
|
})
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This converts the raw (deserialized protocol message, RecvStream, SendStream) tuple into a typed `WithChannels` message and forwards it to the local actor via the mpsc channel. The local actor can then use the typed channels without knowing whether they're local or remote.
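
The forwarding idea can be illustrated with standard-library channels. This is a synchronous sketch under stated assumptions, not the irpc types: `Request`, `Message`, `Handler`, and `actor` are hypothetical, `std::sync::mpsc` stands in for both the QUIC streams and the local actor channel, and the handler returns a plain `Result` instead of a boxed future.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

/// Hypothetical wire-level request, as it would come out of deserialization.
#[derive(Debug)]
enum Request {
    Get(String),
}

/// Hypothetical typed message: the request plus its response channel.
enum Message {
    Get { key: String, tx: Sender<Option<String>> },
}

/// Synchronous stand-in for `Handler<R>`.
type Handler = Arc<dyn Fn(Request, Sender<Option<String>>) -> Result<(), String> + Send + Sync>;

/// Builds a handler that attaches the response channel to the request and
/// forwards the resulting message to the local actor.
fn remote_handler(local_sender: Sender<Message>) -> Handler {
    // The Mutex keeps the closure `Sync` regardless of toolchain version.
    let local_sender = Mutex::new(local_sender);
    Arc::new(move |req: Request, tx: Sender<Option<String>>| {
        let msg = match req {
            Request::Get(key) => Message::Get { key, tx },
        };
        local_sender
            .lock()
            .unwrap()
            .send(msg)
            .map_err(|_| "actor is gone".to_string())
    })
}

/// Toy actor: knows a single key and answers on the attached channel.
fn actor(inbox: Receiver<Message>) {
    for msg in inbox {
        match msg {
            Message::Get { key, tx } => {
                let value = if key == "a" { Some("1".to_string()) } else { None };
                // The requester may have gone away; ignore send errors.
                let _ = tx.send(value);
            }
        }
    }
}
```

The actor only ever sees `Message` values with a channel attached, which is the property the real `WithChannels` wrapper provides: the same actor code serves local and remote callers.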
|
||||||
|
|
||||||
|
## Full Request Lifecycle (Remote)
|
||||||
|
|
||||||
|
```
|
||||||
|
CLIENT SERVER
|
||||||
|
│ │
|
||||||
|
│ 1. Client::request() │
|
||||||
|
│ → open_bi() on connection │
|
||||||
|
│ │
|
||||||
|
│ 2. RemoteSender::write(protocol_msg) │
|
||||||
|
│ → serialize + send on SendStream ────►│
|
||||||
|
│ │ 3. accept_bi()
|
||||||
|
│ │ 4. read_request_raw()
|
||||||
|
│ │ → read varint + data
|
||||||
|
│ │ → deserialize protocol_msg
|
||||||
|
│ │
|
||||||
|
│ │ 5. RemoteService::with_remote_channels()
|
||||||
|
│ │ → creates WithChannels
|
||||||
|
│ │ → SendStream → tx channel
|
||||||
|
│ │ → RecvStream → rx channel
|
||||||
|
│ │
|
||||||
|
│ │ 6. handler(msg, rx, tx)
|
||||||
|
│ │ → local_sender.send_raw(message)
|
||||||
|
│ │ → message goes to actor
|
||||||
|
│ │
|
||||||
|
│ │ 7. Actor processes:
|
||||||
|
│ │ match message {
|
||||||
|
│ │ Msg::Get(wc) => {
|
||||||
|
│ │ let res = db.get(wc.inner.key);
|
||||||
|
│ │ wc.tx.send(res).await;
|
||||||
|
│ │ // tx.send() writes to SendStream
|
||||||
|
│ │ }
|
||||||
|
│ │ }
|
||||||
|
│ │
|
||||||
|
│ 8. RecvStream reads response ◄───────────│
|
||||||
|
│ 9. Deserialize response │
|
||||||
|
│ 10. Return to caller │
|
||||||
|
```
|
||||||
|
|
||||||
|
## 0-RTT Flow
|
||||||
|
|
||||||
|
```
|
||||||
|
CLIENT SERVER
|
||||||
|
│ │
|
||||||
|
│ 1. Serialize message into buffer │
|
||||||
|
│ (prepare_write) │
|
||||||
|
│ │
|
||||||
|
│ 2. Open 0-RTT connection │
|
||||||
|
│ → write buffer immediately ─────────►│
|
||||||
|
│ │
|
||||||
|
│ 3. Check zero_rtt_accepted() │
|
||||||
|
│ → If true: done, read response │
|
||||||
|
│ → If false: │
|
||||||
|
│ 4. Open new (full) connection │
|
||||||
|
│ 5. Re-send same buffer ────────────►│
|
||||||
|
│ │
|
||||||
|
│ 6. Read response ◄──────────────────────│
|
||||||
|
```
|
||||||
|
|
||||||
|
The key insight: the message buffer is pre-serialized so it can be re-sent without re-serialization if 0-RTT is rejected.
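
A small simulation makes the "serialize once, send up to twice" property concrete. Everything here is hypothetical scaffolding for illustration (`FakeTransport`, `serialize`, `send_with_0rtt`); it models only the control flow of the diagram, not real QUIC behavior.

```rust
/// Hypothetical transport that records what was written on each path.
struct FakeTransport {
    /// What `zero_rtt_accepted()` would report after the handshake.
    accepted: bool,
    /// Buffers written as 0-RTT early data.
    early: Vec<Vec<u8>>,
    /// Buffers written on a fully established connection.
    full: Vec<Vec<u8>>,
}

impl FakeTransport {
    fn new(accepted: bool) -> Self {
        Self { accepted, early: Vec::new(), full: Vec::new() }
    }
}

/// Stand-in for `prepare_write`; bumps `serializations` on every call.
fn serialize(msg: &str, serializations: &mut usize) -> Vec<u8> {
    *serializations += 1;
    msg.as_bytes().to_vec()
}

/// Follows the diagram: serialize, send as early data, re-send if rejected.
fn send_with_0rtt(transport: &mut FakeTransport, msg: &str, serializations: &mut usize) {
    // Step 1: serialize exactly once.
    let buf = serialize(msg, serializations);
    // Step 2: write the buffer immediately as early data.
    transport.early.push(buf.clone());
    // Steps 3-5: if the server rejected 0-RTT, re-send the same bytes.
    if !transport.accepted {
        transport.full.push(buf);
    }
}
```

With a rejecting transport, the same bytes appear on both paths while the serialization counter stays at one. Because early data can be replayed, this pattern only suits idempotent requests.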
|
||||||
271
docs/research/references/iroh/irpc/07-irpc-iroh.md
Normal file
@@ -0,0 +1,271 @@
|
|||||||
|
# irpc: irpc-iroh — Iroh Transport Integration
|
||||||
|
|
||||||
|
The `irpc-iroh` crate provides transport integration for iroh, enabling irpc to work with iroh's QUIC connections that use endpoint IDs (rather than socket addresses) for routing.
|
||||||
|
|
||||||
|
## Crate Overview
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[package]
|
||||||
|
name = "irpc-iroh"
|
||||||
|
version = "0.13.0"
|
||||||
|
description = "Iroh transport for irpc"
|
||||||
|
```
|
||||||
|
|
||||||
|
Dependencies: `iroh`, `irpc`, `tokio`, `tracing`, `serde`, `postcard`, `n0-error`, `n0-future`
|
||||||
|
|
||||||
|
## Key Types
|
||||||
|
|
||||||
|
### IrohRemoteConnection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct IrohRemoteConnection(Connection);
|
||||||
|
```
|
||||||
|
|
||||||
|
Wraps an existing iroh `Connection`. Simplest way to use irpc with iroh — create a connection externally and wrap it.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl RemoteConnection for IrohRemoteConnection {
|
||||||
|
fn clone_boxed(&self) -> Box<dyn RemoteConnection> { ... }
|
||||||
|
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
|
||||||
|
// Delegates to connection.open_bi()
|
||||||
|
}
|
||||||
|
fn zero_rtt_accepted(&self) -> BoxFuture<bool> {
|
||||||
|
// Always true — fully authenticated connection
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Note:** This stops working when the underlying connection is closed. For automatic reconnection, use `IrohLazyRemoteConnection`.
|
||||||
|
|
||||||
|
### IrohZrttRemoteConnection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct IrohZrttRemoteConnection(OutgoingZeroRttConnection);
|
||||||
|
```
|
||||||
|
|
||||||
|
Wraps an iroh 0-RTT (Zero Round Trip Time) connection. This enables sending data before the full handshake completes for reduced latency on reconnections.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl RemoteConnection for IrohZrttRemoteConnection {
|
||||||
|
fn open_bi(&self) -> BoxFuture<Result<(SendStream, RecvStream), RequestError>> {
|
||||||
|
// Delegates to the 0-RTT connection's open_bi()
|
||||||
|
}
|
||||||
|
fn zero_rtt_accepted(&self) -> BoxFuture<bool> {
|
||||||
|
// Actually checks handshake_completed() to determine
|
||||||
|
// if 0-RTT data was accepted
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `zero_rtt_accepted()` method:
|
||||||
|
- Returns `true` if `ZeroRttStatus::Accepted`
|
||||||
|
- Returns `false` if `ZeroRttStatus::Rejected` or on error
|
||||||
|
- This allows the `Client` to decide whether to re-send data
|
||||||
|
|
||||||
|
### IrohLazyRemoteConnection
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct IrohLazyRemoteConnection(Arc<IrohRemoteConnectionInner>);
|
||||||
|
|
||||||
|
struct IrohRemoteConnectionInner {
|
||||||
|
endpoint: iroh::Endpoint,
|
||||||
|
addr: iroh::EndpointAddr,
|
||||||
|
connection: tokio::sync::Mutex<Option<Connection>>,
|
||||||
|
alpn: Vec<u8>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The lazy connection caches the underlying iroh `Connection` and reconnects automatically:
|
||||||
|
|
||||||
|
1. On first `open_bi()`, establishes a connection via `endpoint.connect(addr, alpn)`
|
||||||
|
2. Caches the connection in a `Mutex<Option<Connection>>`
|
||||||
|
3. On subsequent `open_bi()`, tries to reuse the cached connection
|
||||||
|
4. If the cached connection fails, clears the cache and reconnects once
|
||||||
|
|
||||||
|
The `alpn` field is required because iroh connections need an ALPN protocol identifier.
|
||||||
|
|
||||||
|
### `client()` Function
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn client<S: irpc::Service>(
|
||||||
|
endpoint: iroh::Endpoint,
|
||||||
|
addr: impl Into<iroh::EndpointAddr>,
|
||||||
|
alpn: impl AsRef<[u8]>,
|
||||||
|
) -> irpc::Client<S>
|
||||||
|
```
|
||||||
|
|
||||||
|
Convenience function to create a `Client<S>` using iroh. Creates an `IrohLazyRemoteConnection` and wraps it with `Client::boxed()`.
|
||||||
|
|
||||||
|
## Server-Side: IrohProtocol
|
||||||
|
|
||||||
|
### IrohProtocol
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct IrohProtocol<R> {
|
||||||
|
handler: Handler<R>,
|
||||||
|
request_id: AtomicU64,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Implements `iroh::protocol::ProtocolHandler`, allowing it to be registered with iroh's `Router`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl<R: DeserializeOwned + Send + 'static> ProtocolHandler for IrohProtocol<R> {
|
||||||
|
async fn accept(&self, connection: Connection) -> Result<(), AcceptError> {
|
||||||
|
// Handle the connection using irpc's handle_connection
|
||||||
|
let handler = self.handler.clone();
|
||||||
|
let fut = handle_connection(&connection, handler).map_err(AcceptError::from_err);
|
||||||
|
fut.instrument(span).await
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Usage:**
|
||||||
|
```rust
|
||||||
|
let protocol = IrohProtocol::with_sender(local_sender);
|
||||||
|
// or
|
||||||
|
let protocol = IrohProtocol::new(handler);
|
||||||
|
|
||||||
|
let router = Router::builder(endpoint)
|
||||||
|
.accept(ALPN, protocol)
|
||||||
|
.spawn();
|
||||||
|
```
|
||||||
|
|
||||||
|
### Iroh0RttProtocol
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct Iroh0RttProtocol<R> { ... }
|
||||||
|
```
|
||||||
|
|
||||||
|
Supports 0-RTT connections by implementing `ProtocolHandler::on_accepting()`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl<R: DeserializeOwned + Send + 'static> ProtocolHandler for Iroh0RttProtocol<R> {
|
||||||
|
async fn on_accepting(&self, accepting: Accepting) -> Result<Connection, AcceptError> {
|
||||||
|
let zrtt_conn = accepting.into_0rtt();
|
||||||
|
// Handle 0-RTT data immediately
|
||||||
|
handle_connection(&zrtt_conn, handler).await?;
|
||||||
|
// Wait for handshake completion
|
||||||
|
let conn = zrtt_conn.handshake_completed().await?;
|
||||||
|
Ok(conn)
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn accept(&self, _connection: Connection) -> Result<(), AcceptError> {
|
||||||
|
// Noop — handled in on_accepting
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Warning:** 0-RTT data is replayable. Only use for idempotent operations. See <https://www.iroh.computer/blog/0rtt-api>.
|
||||||
|
|
||||||
|
### IncomingRemoteConnection Trait
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait IncomingRemoteConnection {
|
||||||
|
fn accept_bi(&self) -> impl Future<Output = Result<(SendStream, RecvStream), ConnectionError>> + Send;
|
||||||
|
fn close(&self, error_code: VarInt, reason: &[u8]);
|
||||||
|
fn remote_id(&self) -> Result<EndpointId, RemoteEndpointIdError>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Abstraction over `Connection` and `IncomingZeroRttConnection`, enabling `handle_connection` and `read_request` to work with both regular and 0-RTT connections.
|
||||||
|
|
||||||
|
Implemented for:
|
||||||
|
- `Connection` — regular iroh connection
|
||||||
|
- `IncomingZeroRttConnection` — 0-RTT connection
|
||||||
|
|
||||||
|
## handle_connection (iroh variant)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn handle_connection<R: DeserializeOwned + 'static>(
|
||||||
|
connection: &impl IncomingRemoteConnection,
|
||||||
|
handler: Handler<R>,
|
||||||
|
) -> io::Result<()>
|
||||||
|
```
|
||||||
|
|
||||||
|
Similar to the noq version but works with iroh's `IncomingRemoteConnection` trait. Records the remote endpoint ID in the tracing span.
|
||||||
|
|
||||||
|
## read_request and read_request_raw (iroh variants)
|
||||||
|
|
||||||
|
Same logic as the noq versions but using `IncomingRemoteConnection` instead of `noq::Connection`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn read_request<S: RemoteService>(
|
||||||
|
connection: &impl IncomingRemoteConnection,
|
||||||
|
) -> io::Result<Option<S::Message>>
|
||||||
|
|
||||||
|
pub async fn read_request_raw<R: DeserializeOwned + 'static>(
|
||||||
|
connection: &impl IncomingRemoteConnection,
|
||||||
|
) -> io::Result<Option<(R, RecvStream, SendStream)>>
|
||||||
|
```
|
||||||
|
|
||||||
|
## listen (iroh variant)
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn listen<R: DeserializeOwned + 'static>(endpoint: iroh::Endpoint, handler: Handler<R>)
|
||||||
|
```
|
||||||
|
|
||||||
|
Accepts connections from an iroh `Endpoint` and handles them with the provided handler. Uses `n0_future::task::JoinSet` for task management.
|
||||||
|
|
||||||
|
## Example Usage
|
||||||
|
|
||||||
|
### Server
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use irpc::{rpc_requests, channel::oneshot, Client, WithChannels};
use irpc_iroh::IrohProtocol;
use iroh::{endpoint::presets, protocol::Router, Endpoint};
use serde::{Deserialize, Serialize};
|
||||||
|
|
||||||
|
#[rpc_requests(message = FooMessage)]
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
enum FooProtocol {
|
||||||
|
#[rpc(tx=oneshot::Sender<String>)]
|
||||||
|
Get(String),
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn server() -> Result<()> {
|
||||||
|
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||||
|
tokio::task::spawn(actor(rx));
|
||||||
|
let client = Client::<FooProtocol>::local(tx);
|
||||||
|
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let protocol = IrohProtocol::with_sender(client.as_local().unwrap());
|
||||||
|
let router = Router::builder(endpoint).accept(ALPN, protocol).spawn();
|
||||||
|
// ... keep running
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Client
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn connect(endpoint_id: EndpointId) -> Result<Client<FooProtocol>> {
|
||||||
|
let endpoint = Endpoint::bind(presets::N0).await?;
|
||||||
|
let client = irpc_iroh::client(endpoint, endpoint_id, ALPN);
|
||||||
|
Ok(client)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Or with direct connection:
|
||||||
|
async fn connect_direct(endpoint: Endpoint, addr: EndpointAddr) -> Result<Client<FooProtocol>> {
|
||||||
|
let conn = endpoint.connect(addr, ALPN).await?;
|
||||||
|
Ok(Client::boxed(IrohRemoteConnection::new(conn)))
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 0-RTT Client
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn connect_0rtt(endpoint: Endpoint, addr: EndpointAddr) -> Result<Client<FooProtocol>> {
|
||||||
|
let connecting = endpoint.connect_with_opts(addr, ALPN, Default::default()).await?;
|
||||||
|
match connecting.into_0rtt() {
|
||||||
|
Ok(conn) => Ok(Client::boxed(IrohZrttRemoteConnection::new(conn))),
|
||||||
|
Err(connecting) => {
|
||||||
|
let conn = connecting.await?;
|
||||||
|
Ok(Client::boxed(IrohRemoteConnection::new(conn)))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
@@ -0,0 +1,134 @@
|
|||||||
|
# irpc: Serialization and Utility Modules
|
||||||
|
|
||||||
|
## Varint Utilities
|
||||||
|
|
||||||
|
The `varint-util` module (available with `rpc` or `varint-util` feature) provides LEB128 varint encoding/decoding compatible with postcard's format.
|
||||||
|
|
||||||
|
### Async Reading
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub async fn read_varint_u64<R: AsyncRead + Unpin>(reader: &mut R) -> io::Result<Option<u64>>
|
||||||
|
```
|
||||||
|
|
||||||
|
Reads a LEB128-encoded `u64` from an async reader. Returns `Ok(None)` on `UnexpectedEof` at the first byte position (clean stream end).
|
||||||
|
|
||||||
|
**Format:** Each byte carries 7 bits of the value, with the most significant bit (MSB) acting as a continuation flag. Groups are stored little-endian (least significant group first).
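The byte layout described above can be illustrated with a small, synchronous sketch over in-memory buffers. The function names `encode_varint_u64` and `decode_varint_u64` are hypothetical and are not part of irpc; the real module works against readers and writers rather than slices.

```rust
/// Encode `value` as LEB128: 7 payload bits per byte, least significant
/// group first, MSB set on every byte except the last.
/// (Hypothetical helper for illustration, not irpc's API.)
pub fn encode_varint_u64(mut value: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity(10);
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            // Last group: continuation bit stays clear.
            out.push(byte);
            return out;
        }
        // More groups follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Decode a LEB128 `u64` from the start of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or does not fit in 64 bits.
pub fn decode_varint_u64(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in input.iter().enumerate().take(10) {
        let group = u64::from(byte & 0x7f);
        // The tenth byte may only contribute the single remaining bit.
        if i == 9 && group > 1 {
            return None;
        }
        value |= group << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For example, 300 encodes to the two bytes `0xAC 0x02`: the low seven bits (`0x2C`) come first with the continuation bit set, followed by the remaining bits (`0x02`).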
|
||||||
|
|
||||||
|
### Sync Writing
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn write_varint_u64_sync<W: io::Write>(writer: &mut W, value: u64) -> io::Result<usize>
|
||||||
|
```
|
||||||
|
|
||||||
|
Writes a `u64` as LEB128 to a synchronous writer.
|
||||||
|
|
||||||
|
### Length-Prefixed Encoding
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Sync:
pub fn write_length_prefixed<T: Serialize>(write: impl io::Write, value: T) -> io::Result<()>

pub trait WriteVarintExt: io::Write {
    fn write_varint_u64(&mut self, value: u64) -> io::Result<usize>;
    fn write_length_prefixed<T: Serialize>(&mut self, value: T) -> io::Result<()>;
}

// Async:
pub trait AsyncReadVarintExt: AsyncRead + Unpin {
    fn read_varint_u64(&mut self) -> impl Future<Output = io::Result<Option<u64>>>;
    fn read_length_prefixed<T: DeserializeOwned>(&mut self, max_size: usize) -> impl Future<Output = io::Result<T>>;
}

pub trait AsyncWriteVarintExt: AsyncWrite + Unpin {
    fn write_varint_u64(&mut self, value: u64) -> impl Future<Output = io::Result<usize>>;
    fn write_length_prefixed<T: Serialize>(&mut self, value: T) -> impl Future<Output = io::Result<usize>>;
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The length-prefix format is:
|
||||||
|
```
|
||||||
|
[varint-encoded-length][postcard-serialized-data]
|
||||||
|
```
|
||||||
|
|
||||||
|
Used internally by irpc for framing all messages on QUIC streams. The `max_size` parameter in `read_length_prefixed` prevents memory exhaustion from malicious length values.
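As a rough illustration of this framing and of why the size cap matters, here is a synchronous, std-only sketch. It is not irpc's implementation: `write_frame` and `read_frame` are hypothetical names, and the payload is treated as opaque bytes instead of postcard-serialized data.

```rust
use std::io::{self, Read, Write};

/// Write `payload` as `[varint length][payload bytes]`.
/// (Hypothetical helper; irpc serializes the payload with postcard.)
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    let mut len = payload.len() as u64;
    loop {
        let byte = (len & 0x7f) as u8;
        len >>= 7;
        if len == 0 {
            writer.write_all(&[byte])?;
            break;
        }
        writer.write_all(&[byte | 0x80])?;
    }
    writer.write_all(payload)
}

/// Read one frame. Returns `Ok(None)` if the stream ends cleanly before the
/// first length byte, and an error if the declared length exceeds `max_size`.
pub fn read_frame<R: Read>(reader: &mut R, max_size: usize) -> io::Result<Option<Vec<u8>>> {
    let mut len: u64 = 0;
    let mut shift = 0u32;
    loop {
        let mut byte = [0u8; 1];
        match reader.read_exact(&mut byte) {
            Ok(()) => {}
            // EOF before any length byte means there is no further frame.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof && shift == 0 => return Ok(None),
            Err(e) => return Err(e),
        }
        let group = u64::from(byte[0] & 0x7f);
        if shift > 63 || (shift == 63 && group > 1) {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "varint overflows u64"));
        }
        len |= group << shift;
        if byte[0] & 0x80 == 0 {
            break;
        }
        shift += 7;
    }
    // Check the declared length before allocating, so a hostile peer cannot
    // make the reader reserve an arbitrary amount of memory.
    if len > max_size as u64 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds max_size"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

The important design point is the order of operations in `read_frame`: the length is validated against `max_size` before the buffer is allocated.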
|
||||||
|
|
||||||
|
## noq Endpoint Setup
|
||||||
|
|
||||||
|
The `noq_endpoint_setup` feature provides helpers for creating noq endpoints with TLS configuration:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn configure_client(server_certs: &[&[u8]]) -> Result<ClientConfig>
|
||||||
|
pub fn configure_server() -> Result<(ServerConfig, Vec<u8>)>
|
||||||
|
pub fn configure_client_insecure() -> Result<ClientConfig>
|
||||||
|
|
||||||
|
// Non-WASM only:
|
||||||
|
pub fn make_client_endpoint(bind_addr: SocketAddr, server_certs: &[&[u8]]) -> Result<Endpoint>
|
||||||
|
pub fn make_insecure_client_endpoint(bind_addr: SocketAddr) -> Result<Endpoint>
|
||||||
|
pub fn make_server_endpoint(bind_addr: SocketAddr) -> Result<(Endpoint, Vec<u8>)>
|
||||||
|
```
|
||||||
|
|
||||||
|
- `configure_server()`: Creates a self-signed certificate with rcgen and configures the server with TLS 1.3. Returns the DER-encoded certificate for clients to trust.
|
||||||
|
- `configure_client()`: Configures a client to trust specific DER certificates.
|
||||||
|
- `configure_client_insecure()`: Skips certificate verification (for testing only).
|
||||||
|
- Server endpoints set `max_concurrent_uni_streams(0)` to disable unidirectional streams (only bidirectional streams are used).
|
||||||
|
- Keep-alive interval is set to 1 second on client configs.
|
||||||
|
|
||||||
|
## FusedOneshotReceiver
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub(crate) struct FusedOneshotReceiver<T>(pub tokio::sync::oneshot::Receiver<T>);
|
||||||
|
```
|
||||||
|
|
||||||
|
A wrapper that prevents panics when polling an already-completed oneshot receiver. After the inner receiver resolves, subsequent polls return `Poll::Pending` indefinitely instead of panicking.
|
||||||
|
|
||||||
|
This is important because irpc's `oneshot::Receiver` can be wrapped in `Receiver::Boxed` (a `BoxFuture`), and the inner future might be polled multiple times in certain select patterns.
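The fusing behaviour can be sketched without tokio. The following is a generic illustration, not irpc's code: `FusedOnce`, `NoopWake`, and `poll_once` are hypothetical names, and the wrapper boxes an arbitrary future rather than a oneshot receiver.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper: yields the inner output once, then stays pending.
pub struct FusedOnce<F> {
    inner: Option<Pin<Box<F>>>,
}

impl<F: Future> FusedOnce<F> {
    pub fn new(future: F) -> Self {
        Self { inner: Some(Box::pin(future)) }
    }
}

impl<F: Future> Future for FusedOnce<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `FusedOnce` is `Unpin` because the inner future lives in a `Box`.
        let this = self.get_mut();
        match this.inner.as_mut() {
            // Already completed: never touch the inner future again.
            None => Poll::Pending,
            Some(inner) => match inner.as_mut().poll(cx) {
                Poll::Ready(output) => {
                    this.inner = None;
                    Poll::Ready(output)
                }
                Poll::Pending => Poll::Pending,
            },
        }
    }
}

/// Waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future exactly once with a no-op waker.
pub fn poll_once<F: Future + Unpin>(future: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(future).poll(&mut cx)
}
```

Wrapping `std::future::ready(5)` shows the effect: the bare `Ready` future panics when polled after completion, while the wrapped one returns `Poll::Ready(5)` once and `Poll::Pending` on every later poll.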
|
||||||
|
|
||||||
|
## now_or_never
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub(crate) fn now_or_never<F: Future>(future: F) -> Option<F::Output>
|
||||||
|
```
|
||||||
|
|
||||||
|
Attempts to complete a future immediately without blocking. If the future would block, returns `None`. Used internally by `NoqSenderInner::try_send()` to attempt an immediate write to the QUIC stream without yielding.
|
||||||
|
|
||||||
|
Implementation uses a no-op waker to poll the future once.
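A minimal std-only sketch of this technique follows. It mirrors the signature shown above but is an illustration under assumptions, not the crate's source; the `NoopWake` type is a hypothetical helper.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that ignores wake-ups; the future is only ever polled once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll `future` a single time. Returns its output if it was immediately
/// ready, or `None` if it would have had to wait.
pub fn now_or_never<F: Future>(future: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Pin the future on the stack so it can be polled.
    let future = pin!(future);
    match future.poll(&mut cx) {
        Poll::Ready(output) => Some(output),
        Poll::Pending => None,
    }
}
```

Because the future is dropped when the function returns, a `None` result means any partial progress is discarded, which is why this is only suitable for operations that are safe to retry or abandon.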
|
||||||
|
|
||||||
|
## Spans Feature
|
||||||
|
|
||||||
|
When the `spans` feature is enabled (default), `WithChannels` includes a `span: tracing::Span` field:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct WithChannels<I: Channels<S>, S: Service> {
|
||||||
|
pub inner: I,
|
||||||
|
pub tx: <I as Channels<S>>::Tx,
|
||||||
|
pub rx: <I as Channels<S>>::Rx,
|
||||||
|
#[cfg(feature = "spans")]
|
||||||
|
pub span: tracing::Span,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The span is captured from `tracing::Span::current()` at the time of `WithChannels` construction (via `From` implementations). This preserves tracing context across async message-passing boundaries.
|
||||||
|
|
||||||
|
The `rpc_requests` macro generates a `parent_span()` method on the message enum when `no_spans` is not set:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
impl ComputeMessage {
|
||||||
|
pub fn parent_span(&self) -> tracing::Span {
|
||||||
|
let span = match self {
|
||||||
|
ComputeMessage::Multiply(inner) => inner.parent_span_opt(),
|
||||||
|
ComputeMessage::Sum(inner) => inner.parent_span_opt(),
|
||||||
|
};
|
||||||
|
span.cloned().unwrap_or_else(|| tracing::Span::current())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This allows server-side handlers to enter the client's tracing span:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn handle(msg: ComputeMessage) {
|
||||||
|
let _entered = msg.parent_span().enter();
|
||||||
|
// ... processing happens in the client's tracing context
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When `no_spans` is set in the macro, no span-related code is generated, making it compatible with builds that don't have the `spans` feature enabled.
|
||||||
@@ -0,0 +1,249 @@
|
|||||||
|
# irpc: Design Patterns and Usage Examples
|
||||||
|
|
||||||
|
## Pattern 1: Actor Model (Most Common)
|
||||||
|
|
||||||
|
The primary usage pattern is an actor that receives messages and processes them sequentially:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct StorageActor {
|
||||||
|
recv: tokio::sync::mpsc::Receiver<StorageMessage>,
|
||||||
|
state: BTreeMap<String, String>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl StorageActor {
|
||||||
|
pub fn spawn() -> StorageApi {
|
||||||
|
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||||
|
let actor = Self { recv: rx, state: BTreeMap::new() };
|
||||||
|
tokio::task::spawn(actor.run());
|
||||||
|
StorageApi { inner: Client::local(tx) }
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn run(mut self) {
|
||||||
|
while let Some(msg) = self.recv.recv().await {
|
||||||
|
self.handle(msg).await;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn handle(&mut self, msg: StorageMessage) {
|
||||||
|
match msg {
|
||||||
|
StorageMessage::Get(wc) => {
|
||||||
|
let WithChannels { inner, tx, .. } = wc;
|
||||||
|
tx.send(self.state.get(&inner.key).cloned()).await.ok();
|
||||||
|
}
|
||||||
|
StorageMessage::Set(wc) => {
|
||||||
|
let WithChannels { inner, tx, .. } = wc;
|
||||||
|
self.state.insert(inner.key, inner.value);
|
||||||
|
tx.send(()).await.ok();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key points:**
|
||||||
|
- The actor owns state and processes messages sequentially
|
||||||
|
- `Client::local(tx)` wraps the sender side of the mpsc channel
|
||||||
|
- `WithChannels` destructuring gives access to `inner` (the request data), `tx` (response channel), and `rx` (update channel)
|
||||||
|
- The `..` pattern skips the fields the handler does not need: `rx` (a `NoReceiver` for these requests) and, with the `spans` feature enabled, `span`
|
||||||
|
|
||||||
|
## Pattern 2: Concurrent Task Per Request
|
||||||
|
|
||||||
|
For long-running or independent requests, spawn a task per message:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn run(mut self) {
|
||||||
|
while let Some(msg) = self.recv.recv().await {
|
||||||
|
tokio::task::spawn(async move {
|
||||||
|
if let Err(cause) = Self::handle(msg).await {
|
||||||
|
eprintln!("Error: {cause}");
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This is useful for CPU-intensive or I/O-bound requests that shouldn't block other requests.
|
||||||
|
|
||||||
|
## Pattern 3: Local-Only Usage
|
||||||
|
|
||||||
|
irpc can be used without any RPC feature for pure in-process communication:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Cargo.toml: default-features = false, features = ["derive"]
|
||||||
|
#[rpc_requests(message = StorageMessage, no_rpc, no_spans)]
|
||||||
|
#[derive(Serialize, Deserialize, Debug)]
|
||||||
|
enum StorageProtocol {
|
||||||
|
#[rpc(tx=oneshot::Sender<Option<String>>)]
|
||||||
|
Get(Get),
|
||||||
|
#[rpc(tx=oneshot::Sender<()>)]
|
||||||
|
Set(Set),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `no_rpc` flag prevents `RemoteService` from being generated, and `no_spans` removes the tracing dependency. This leaves only the local channel mechanism, with minimal dependencies (serde, tokio, tokio-util).
|
||||||
|
|
||||||
|
## Pattern 4: API Type Wrapping Client
|
||||||
|
|
||||||
|
The recommended pattern is to wrap `Client<S>` in a higher-level API type:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct StorageApi {
|
||||||
|
inner: Client<StorageProtocol>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl StorageApi {
|
||||||
|
// Local
|
||||||
|
pub fn spawn() -> Self {
|
||||||
|
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||||
|
tokio::task::spawn(StorageActor::new(rx).run());
|
||||||
|
Self { inner: Client::local(tx) }
|
||||||
|
}
|
||||||
|
|
||||||
|
// Remote (noq)
|
||||||
|
pub fn connect(endpoint: noq::Endpoint, addr: SocketAddr) -> Self {
|
||||||
|
Self { inner: Client::noq(endpoint, addr) }
|
||||||
|
}
|
||||||
|
|
||||||
|
// Remote (iroh)
|
||||||
|
pub fn connect_iroh(endpoint: iroh::Endpoint, addr: EndpointAddr) -> Self {
|
||||||
|
Self { inner: irpc_iroh::client(endpoint, addr, ALPN) }
|
||||||
|
}
|
||||||
|
|
||||||
|
// Type-safe methods that work for both local and remote
|
||||||
|
pub async fn get(&self, key: String) -> irpc::Result<Option<String>> {
|
||||||
|
self.inner.rpc(Get { key }).await
|
||||||
|
}
|
||||||
|
|
||||||
|
pub async fn set(&self, key: String, value: String) -> irpc::Result<()> {
|
||||||
|
self.inner.rpc(Set { key, value }).await
|
||||||
|
}
|
||||||
|
|
||||||
|
pub async fn list(&self) -> irpc::Result<mpsc::Receiver<String>> {
|
||||||
|
self.inner.server_streaming(List, 16).await
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This encapsulates the protocol details and provides a clean, type-safe API. The same `StorageApi` works identically whether connected locally or remotely.
|
||||||
|
|
||||||
|
## Pattern 5: Server Setup
|
||||||
|
|
||||||
|
### With noq
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn serve(api: &StorageApi, endpoint: noq::Endpoint) -> Result<JoinHandle<()>> {
|
||||||
|
let local = api.inner.as_local().context("cannot listen on remote service")?;
|
||||||
|
let handler = StorageProtocol::remote_handler(local);
|
||||||
|
Ok(tokio::task::spawn(irpc::rpc::listen(endpoint, handler)))
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### With iroh
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn serve(api: &StorageApi, endpoint: iroh::Endpoint) -> Result<Router> {
|
||||||
|
let local = api.inner.as_local().context("cannot listen on remote service")?;
|
||||||
|
let protocol = IrohProtocol::with_sender(local);
|
||||||
|
Ok(Router::builder(endpoint).accept(ALPN, protocol).spawn())
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Pattern 6: Low-Level Request Handling
|
||||||
|
|
||||||
|
For more control than the `Client` methods provide, use `request()` directly:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn custom_request(&self, msg: Get) -> anyhow::Result<oneshot::Receiver<Option<String>>> {
|
||||||
|
match self.inner.request().await? {
|
||||||
|
Request::Local(request) => {
|
||||||
|
let (tx, rx) = oneshot::channel();
|
||||||
|
request.send((msg, tx)).await?;
|
||||||
|
Ok(rx)
|
||||||
|
}
|
||||||
|
Request::Remote(request) => {
|
||||||
|
let (_tx, rx) = request.write(msg).await?;
|
||||||
|
Ok(rx.into())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This allows custom channel creation logic, e.g., different buffer sizes for local vs remote.
|
||||||
|
|
||||||
|
## Pattern 7: Channel Filtering and Mapping
|
||||||
|
|
||||||
|
irpc channels support filtering and mapping, which work for both local and remote channels:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Server-side: filter responses to only include values > 10
|
||||||
|
let filtered_tx = wc.tx.with_filter(|v: &i64| *v > 10);
|
||||||
|
|
||||||
|
// Server-side: transform responses
|
||||||
|
let mapped_tx = wc.tx.with_map(|v: i64| v * 2);
|
||||||
|
|
||||||
|
// Client-side: filter received updates
|
||||||
|
let filtered_rx = rx.filter(|update: &Update| update.is_relevant());
|
||||||
|
```
|
||||||
|
|
||||||
|
Both local and remote channels are wrapped in boxed adapters when filtered or mapped. For remote channels the extra indirection is negligible because network latency dominates; for local channels it is an added per-message cost.
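To make the boxing cost concrete, here is a simplified, synchronous sketch of how filter and map adapters can be layered over a sender. `BoxedSender` is a hypothetical type for illustration only; irpc's real senders are async and their internals may differ.

```rust
/// Hypothetical boxed sender: a heap-allocated closure that accepts one value.
pub struct BoxedSender<T> {
    send: Box<dyn FnMut(T)>,
}

impl<T: 'static> BoxedSender<T> {
    pub fn new(send: impl FnMut(T) + 'static) -> Self {
        Self { send: Box::new(send) }
    }

    /// Deliver one value through the (possibly layered) closure chain.
    pub fn send(&mut self, value: T) {
        (self.send)(value)
    }

    /// Forward only the values for which `keep` returns true.
    pub fn with_filter(mut self, keep: impl Fn(&T) -> bool + 'static) -> BoxedSender<T> {
        BoxedSender::new(move |value| {
            if keep(&value) {
                self.send(value)
            }
        })
    }

    /// Accept values of type `U` and convert them before forwarding.
    pub fn with_map<U: 'static>(mut self, f: impl Fn(U) -> T + 'static) -> BoxedSender<U> {
        BoxedSender::new(move |value| self.send(f(value)))
    }
}
```

Each adapter adds one more boxed closure and one more dynamic call per message, which is the overhead the paragraph above refers to.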
|
||||||
|
|
||||||
|
## Pattern 8: Using the `wrap` Attribute
|
||||||
|
|
||||||
|
The `#[wrap]` attribute generates named structs from variant fields:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[rpc_requests(message = StoreMessage)]
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
enum StoreProtocol {
|
||||||
|
#[rpc(tx=oneshot::Sender<Option<String>>)]
|
||||||
|
#[wrap(GetRequest, derive(Clone))]
|
||||||
|
Get(String), // Generates: pub struct GetRequest(pub String);
|
||||||
|
|
||||||
|
#[rpc(tx=oneshot::Sender<()>)]
|
||||||
|
#[wrap(SetRequest)]
|
||||||
|
Set { key: String, value: String }, // Generates: pub struct SetRequest { pub key: String, pub value: String }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Benefits:
|
||||||
|
- Named request types can be imported and constructed by name
|
||||||
|
- Additional derives (e.g., `Clone`) can be added
|
||||||
|
- Custom visibility can be specified: `#[wrap(pub(crate) GetRequest)]`
|
||||||
|
- The generated struct inherits the enum's visibility by default
|
||||||
|
|
||||||
|
## Pattern 9: 0-RTT Connections
|
||||||
|
|
||||||
|
For reduced latency on reconnections with iroh:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Client side
|
||||||
|
let result = client.rpc_0rtt(Get { key: "x".into() }).await?;
|
||||||
|
|
||||||
|
// Server side (iroh)
|
||||||
|
let protocol = Iroh0RttProtocol::with_sender(local_sender);
|
||||||
|
let router = Router::builder(endpoint).accept(ALPN, protocol).spawn();
|
||||||
|
```
|
||||||
|
|
||||||
|
**Important:** Only use 0-RTT for idempotent operations, as the data may be replayed by an attacker.
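The difference between replay-safe and replay-unsafe requests can be shown with a toy state machine. The `Request` enum and the `apply` and `deliver` functions below are hypothetical and unrelated to irpc's API; `copies` simulates an attacker re-sending the same 0-RTT data.

```rust
use std::collections::HashMap;

/// Hypothetical request type used only for this illustration.
#[derive(Clone, Debug)]
pub enum Request {
    /// Idempotent: applying it twice leaves the same state as applying it once.
    Set { key: String, value: i64 },
    /// Not idempotent: every delivery changes the state again.
    Increment { key: String },
}

/// Apply a single request to the server state.
pub fn apply(state: &mut HashMap<String, i64>, request: &Request) {
    match request {
        Request::Set { key, value } => {
            state.insert(key.clone(), *value);
        }
        Request::Increment { key } => {
            *state.entry(key.clone()).or_insert(0) += 1;
        }
    }
}

/// Deliver the same request `copies` times to a fresh state, as a replay would.
pub fn deliver(request: &Request, copies: usize) -> HashMap<String, i64> {
    let mut state = HashMap::new();
    for _ in 0..copies {
        apply(&mut state, request);
    }
    state
}
```

A replayed `Set` leaves the state unchanged, whereas a replayed `Increment` silently applies a second time, so only the former kind of request is a reasonable candidate for 0-RTT.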
|
||||||
|
|
||||||
|
## Pattern 10: Shared State in Actor
|
||||||
|
|
||||||
|
For actors that need shared state accessible from multiple handlers:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct Actor {
|
||||||
|
recv: tokio::sync::mpsc::Receiver<Message>,
|
||||||
|
state: Arc<Mutex<SharedState>>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Or use the actor pattern with internal mutation:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct Actor {
|
||||||
|
recv: tokio::sync::mpsc::Receiver<Message>,
|
||||||
|
db: HashMap<String, String>, // owned state
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Since the actor processes messages sequentially, no internal synchronization is needed.
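The owned-state variant can be sketched with only the standard library. This version uses an OS thread and `std::sync::mpsc` in place of a tokio task and irpc channels, so it is an analogy rather than irpc usage; `Message`, `spawn_actor`, `get`, and `set` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical message type; each variant carries its own reply channel.
pub enum Message {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Set { key: String, value: String, reply: mpsc::Sender<()> },
}

/// Spawn the actor thread and return the handle used to send it messages.
pub fn spawn_actor() -> mpsc::Sender<Message> {
    let (tx, rx) = mpsc::channel::<Message>();
    thread::spawn(move || {
        // Owned state: only this thread ever touches the map, so no lock is needed.
        let mut db: HashMap<String, String> = HashMap::new();
        // Messages are handled one at a time, in arrival order.
        for msg in rx {
            match msg {
                Message::Get { key, reply } => {
                    let _ = reply.send(db.get(&key).cloned());
                }
                Message::Set { key, value, reply } => {
                    db.insert(key, value);
                    let _ = reply.send(());
                }
            }
        }
    });
    tx
}

/// Ask the actor for a value; `None` if the key is missing or the actor is gone.
pub fn get(actor: &mpsc::Sender<Message>, key: &str) -> Option<String> {
    let (reply, rx) = mpsc::channel();
    actor.send(Message::Get { key: key.to_string(), reply }).ok()?;
    rx.recv().ok()?
}

/// Store a value; returns `true` once the actor has acknowledged the write.
pub fn set(actor: &mpsc::Sender<Message>, key: &str, value: &str) -> bool {
    let (reply, rx) = mpsc::channel();
    let sent = actor.send(Message::Set { key: key.into(), value: value.into(), reply });
    sent.is_ok() && rx.recv().is_ok()
}
```

All mutation happens inside the single loop, so callers on any thread can use `get` and `set` concurrently without a `Mutex`.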
|
||||||
230
docs/research/references/iroh/irpc/10-quick-reference.md
Normal file
@@ -0,0 +1,230 @@
|
|||||||
|
# irpc: Quick Reference
|
||||||
|
|
||||||
|
## Crate Info
|
||||||
|
|
||||||
|
- **Name:** `irpc`
|
||||||
|
- **Version:** 0.13.0
|
||||||
|
- **License:** Apache-2.0 OR MIT
|
||||||
|
- **Repository:** https://github.com/n0-computer/irpc
|
||||||
|
- **MSRV:** 1.89
|
||||||
|
|
||||||
|
## Feature Flags
|
||||||
|
|
||||||
|
| Feature | Default | Dependencies Added |
|---|---|---|
| `rpc` | ✅ | noq, postcard, smallvec, tracing, tokio/io-util |
| `derive` | ✅ | irpc-derive |
| `spans` | ✅ | tracing |
| `stream` | ✅ | futures-util |
| `noq_endpoint_setup` | ✅ | rustls, rcgen, futures-buffered |
| `varint-util` | ❌ | postcard, smallvec, tokio/io-util |
|
||||||
|
|
||||||
|
## Type Quick Reference
|
||||||
|
|
||||||
|
### Core Types
|
||||||
|
|
||||||
|
```
|
||||||
|
Service trait — implemented on protocol enum, defines Message type
|
||||||
|
Channels<S> trait — implemented on request types, defines Tx/Rx types
|
||||||
|
RpcMessage trait — blanket impl for Debug+Serialize+DeserializeOwned+Send+Sync+Unpin+'static
|
||||||
|
Sender trait — sealed marker for sender types
|
||||||
|
Receiver trait — sealed marker for receiver types
|
||||||
|
WithChannels<I,S> struct — wraps request I with tx/rx/span for service S
|
||||||
|
Client<S> struct — client to service S (local or remote)
|
||||||
|
LocalSender<S> struct — local sender wrapping mpsc::Sender<S::Message>
|
||||||
|
Request<L,R> enum — Local(L) or Remote(R) request
|
||||||
|
RemoteSender<S> struct — holds QUIC stream pair for sending initial message
|
||||||
|
```
|
||||||
|
|
||||||
|
### Channel Types
|
||||||
|
|
||||||
|
```
|
||||||
|
oneshot::Sender<T> — Tokio or Boxed; single value; async send
|
||||||
|
oneshot::Receiver<T> — Tokio or Boxed; single value; Future impl
|
||||||
|
mpsc::Sender<T> — Tokio or Arc<DynSender>; stream; async send/try_send
|
||||||
|
mpsc::Receiver<T> — Tokio or Box<DynReceiver>; stream; async recv
|
||||||
|
NoSender — No-op sender
|
||||||
|
NoReceiver — No-op receiver
|
||||||
|
```
|
||||||
|
|
||||||
|
### Remote Types (rpc feature)
|
||||||
|
|
||||||
|
```
|
||||||
|
RemoteConnection trait — open_bi(), zero_rtt_accepted(), clone_boxed()
|
||||||
|
NoqLazyRemoteConnection — lazy noq connection with cache
|
||||||
|
Handler<R> type — Arc<dyn Fn(R, RecvStream, SendStream) -> ...>
|
||||||
|
```
|
||||||
|
|
||||||
|
### irpc-iroh Types
|
||||||
|
|
||||||
|
```
|
||||||
|
IrohRemoteConnection — wraps iroh::Connection
|
||||||
|
IrohZrttRemoteConnection — wraps iroh::OutgoingZeroRttConnection
|
||||||
|
IrohLazyRemoteConnection — lazy iroh connection with cache
|
||||||
|
IrohProtocol<R> — ProtocolHandler for iroh Router
|
||||||
|
Iroh0RttProtocol<R> — ProtocolHandler with 0-RTT support
|
||||||
|
IncomingRemoteConnection trait — abstraction over Connection and ZeroRttConnection
|
||||||
|
```
|
||||||
|
|
||||||
|
## Interaction Patterns Cheatsheet
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
// Protocol Definition
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
|
||||||
|
#[rpc_requests(message = MyMessage)]
|
||||||
|
#[derive(Debug, Serialize, Deserialize)]
|
||||||
|
enum MyProtocol {
|
||||||
|
// Unary RPC
|
||||||
|
#[rpc(tx=oneshot::Sender<Response>)]
|
||||||
|
#[wrap(GetReq)]
|
||||||
|
Get(String),
|
||||||
|
|
||||||
|
// Server streaming
|
||||||
|
#[rpc(tx=mpsc::Sender<Item>)]
|
||||||
|
#[wrap(ListReq)]
|
||||||
|
List(ListParams),
|
||||||
|
|
||||||
|
// Client streaming
|
||||||
|
#[rpc(tx=oneshot::Sender<Count>, rx=mpsc::Receiver<Item>)]
|
||||||
|
#[wrap(UploadReq)]
|
||||||
|
Upload,
|
||||||
|
|
||||||
|
// Bidirectional streaming
|
||||||
|
#[rpc(tx=mpsc::Sender<Result>, rx=mpsc::Receiver<Update>)]
|
||||||
|
#[wrap(ProcessReq)]
|
||||||
|
Process(ProcessConfig),
|
||||||
|
|
||||||
|
// Fire and forget
|
||||||
|
#[rpc]
|
||||||
|
#[wrap(LogReq)]
|
||||||
|
Log(String),
|
||||||
|
}
|
||||||
|
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
// Client Usage
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
|
||||||
|
// Local
|
||||||
|
let (tx, rx) = tokio::sync::mpsc::channel(16);
|
||||||
|
tokio::task::spawn(actor(rx));
|
||||||
|
let client: Client<MyProtocol> = Client::local(tx);
|
||||||
|
|
||||||
|
// Remote (noq)
|
||||||
|
let client: Client<MyProtocol> = Client::noq(endpoint, addr);
|
||||||
|
|
||||||
|
// Remote (iroh)
|
||||||
|
let client: Client<MyProtocol> = irpc_iroh::client(endpoint, addr, alpn);
|
||||||
|
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
// Making Requests
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
|
||||||
|
// Unary
|
||||||
|
let result: Response = client.rpc(GetReq("key".into())).await?;
|
||||||
|
|
||||||
|
// Server streaming
|
||||||
|
let mut rx: mpsc::Receiver<Item> = client.server_streaming(ListReq(params), 16).await?;
|
||||||
|
while let Some(item) = rx.recv().await? { ... }
|
||||||
|
|
||||||
|
// Client streaming
|
||||||
|
let (update_tx, response_rx): (mpsc::Sender<Item>, oneshot::Receiver<Count>) =
|
||||||
|
client.client_streaming(Upload, 4).await?;
|
||||||
|
update_tx.send(item).await?;
|
||||||
|
let count = response_rx.await?;
|
||||||
|
|
||||||
|
// Bidirectional
|
||||||
|
let (update_tx, mut result_rx): (mpsc::Sender<Update>, mpsc::Receiver<Result>) =
|
||||||
|
client.bidi_streaming(ProcessReq(config), 4, 16).await?;
|
||||||
|
update_tx.send(update).await?;
|
||||||
|
while let Some(result) = result_rx.recv().await? { ... }
|
||||||
|
|
||||||
|
// Fire and forget
|
||||||
|
client.notify(LogReq("message".into())).await?;
|
||||||
|
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
// Server Setup
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
|
||||||
|
// noq
|
||||||
|
let handler = MyProtocol::remote_handler(local_sender);
|
||||||
|
irpc::rpc::listen(endpoint, handler).await;
|
||||||
|
|
||||||
|
// iroh
|
||||||
|
let protocol = IrohProtocol::with_sender(local_sender);
|
||||||
|
Router::builder(endpoint).accept(ALPN, protocol).spawn();
|
||||||
|
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
// Actor Message Handling
|
||||||
|
// ═══════════════════════════════════════════
|
||||||
|
|
||||||
|
async fn handle(&mut self, msg: MyMessage) {
|
||||||
|
match msg {
|
||||||
|
MyMessage::Get(wc) => {
|
||||||
|
let WithChannels { inner, tx, .. } = wc;
|
||||||
|
let result = self.db.get(&inner.0).cloned();
|
||||||
|
tx.send(result).await.ok();
|
||||||
|
}
|
||||||
|
MyMessage::List(wc) => {
|
||||||
|
let WithChannels { tx, .. } = wc;
|
||||||
|
for item in &self.items {
|
||||||
|
if tx.send(item.clone()).await.is_err() { break; }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
MyMessage::Upload(wc) => {
|
||||||
|
let WithChannels { tx, mut rx, .. } = wc;
|
||||||
|
let mut count = 0;
|
||||||
|
while let Ok(Some(item)) = rx.recv().await {
|
||||||
|
self.process(item);
|
||||||
|
count += 1;
|
||||||
|
}
|
||||||
|
tx.send(count).await.ok();
|
||||||
|
}
|
||||||
|
MyMessage::Process(wc) => {
|
||||||
|
let WithChannels { tx, mut rx, inner, .. } = wc;
|
||||||
|
tokio::task::spawn(async move {
|
||||||
|
while let Ok(Some(update)) = rx.recv().await {
|
||||||
|
if let Some(result) = process(update, &inner) {
|
||||||
|
if tx.send(result).await.is_err() { break; }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
MyMessage::Log(wc) => {
|
||||||
|
let WithChannels { inner, .. } = wc;
|
||||||
|
println!("{}", inner.0);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Error Handling Quick Reference
|
||||||
|
|
||||||
|
```rust
|
||||||
|
// Client-side errors
|
||||||
|
use irpc::{Error, RequestError, Result};
|
||||||
|
|
||||||
|
// Request errors (connection/stream open failures)
|
||||||
|
match client.rpc(GetReq("key".into())).await {
    Ok(result) => { ... }
    Err(Error::Request { source }) => { ... } // Connection failed
    Err(Error::OneshotRecv { source }) => { ... } // Response channel error
    Err(other) => { ... } // Any other error variant
}
|
||||||
|
|
||||||
|
// Channel errors
|
||||||
|
use irpc::channel::{SendError, mpsc::RecvError as MpscRecvError, oneshot::RecvError as OneshotRecvError};
|
||||||
|
|
||||||
|
// SendError: ReceiverClosed | MaxMessageSizeExceeded | Io
|
||||||
|
// RecvError (oneshot): SenderClosed | MaxMessageSizeExceeded | Io
|
||||||
|
// RecvError (mpsc): MaxMessageSizeExceeded | Io
|
||||||
|
```
|
||||||
|
|
||||||
|
## Constants
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub const MAX_MESSAGE_SIZE: u64 = 16 * 1024 * 1024; // 16 MiB
|
||||||
|
pub const ERROR_CODE_MAX_MESSAGE_SIZE_EXCEEDED: u32 = 1;
|
||||||
|
pub const ERROR_CODE_INVALID_POSTCARD: u32 = 2;
|
||||||
|
// Connection close code 0 = clean shutdown
|
||||||
|
```
|
||||||