docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs
docs/research/references/iroh/iroh-blobs/03-transfer-protocol.md
@@ -0,0 +1,249 @@
# iroh-blobs: Transfer Protocol

## Overview

The transfer protocol is a **request-response** protocol operating over QUIC streams (via iroh). The ALPN is `b"/iroh-bytes/4"`.

The requester opens a bidirectional QUIC stream, sends a request, and the provider responds with BLAKE3-verified streaming data on the same stream.

**Key properties**:

- Data integrity is verified in-stream: every 16 KiB chunk group can be independently verified against the BLAKE3 hash tree
- No upper limit on blob or collection size: the streaming design avoids buffering entire transfers
- No additional round trips for multiple small blobs (via HashSeq/GetManyRequest)
- Range requests supported at chunk granularity

## Request Types

```rust
pub enum Request {
    Get(GetRequest),
    Observe(ObserveRequest),
    Slot2, Slot3, Slot4, Slot5, Slot6, Slot7, // Reserved
    Push(PushRequest),
    GetMany(GetManyRequest),
}
```

Wire format: 1-byte discriminator (postcard-encoded `RequestType` enum), followed by postcard-serialized request body.

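To illustrate the framing, here is a minimal sketch that splits a raw request into its type byte and the still-encoded body. The numeric values are an assumption: they follow the variant order of the `Request` enum, which is how postcard numbers enum variants. `RequestKind` and `split_request` are hypothetical names, not part of iroh-blobs.

```rust
/// Request kinds, numbered by their position in the `Request` enum.
/// Assumption: the wire discriminator equals the variant index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestKind {
    Get = 0,
    Observe = 1,
    // 2..=7 are the reserved slots.
    Push = 8,
    GetMany = 9,
}

/// Split a raw request into its kind and the still-encoded body.
/// The body would be handed to a postcard deserializer afterwards.
pub fn split_request(buf: &[u8]) -> Result<(RequestKind, &[u8]), String> {
    let (&tag, body) = buf.split_first().ok_or("empty request")?;
    let kind = match tag {
        0 => RequestKind::Get,
        1 => RequestKind::Observe,
        2..=7 => return Err(format!("reserved request slot {}", tag)),
        8 => RequestKind::Push,
        9 => RequestKind::GetMany,
        other => return Err(format!("unknown request type {}", other)),
    };
    Ok((kind, body))
}
```

Rejecting the reserved slots explicitly keeps "not yet assigned" distinguishable from "out of range" in error messages.
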

### GetRequest

```rust
pub struct GetRequest {
    pub hash: Hash,             // BLAKE3 hash of the root blob
    pub ranges: ChunkRangesSeq, // What ranges to request
}
```

The most common request type. The `ranges` field uses `ChunkRangesSeq` to express which parts of the root blob and its children to request.

**Common patterns**:

```rust
// Request an entire single blob
let req = GetRequest::blob(hash);
// -> ChunkRangesSeq with a single element: all chunks of the root

// Request a HashSeq (root + all children)
let req = GetRequest::all(hash);
// -> ChunkRangesSeq::all() - infinite sequence of "all chunks"

// Request parts of a single blob
let req = GetRequest::builder()
    .root(ChunkRanges::bytes(0..1000))
    .build(hash);

// Request a HashSeq with specific child ranges
let req = GetRequest::builder()
    .root(ChunkRanges::all())             // full root (the hash seq)
    .child(1, ChunkRanges::bytes(0..100)) // partial child 1
    .next(ChunkRanges::all())             // full remaining children
    .build_open(hash);                    // build_open = last range repeats forever
```
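Because range requests work at chunk granularity, a byte range such as `0..1000` has to be widened to whole chunks. The sketch below shows that rounding with plain integers, assuming the 1024-byte BLAKE3 chunk size; `covering_chunks` is a hypothetical helper, not the `ChunkRanges::bytes` implementation.

```rust
use std::ops::Range;

/// BLAKE3 chunk size in bytes.
const CHUNK_SIZE: u64 = 1024;

/// Smallest chunk range that covers the given byte range
/// (hypothetical helper for illustration only).
pub fn covering_chunks(bytes: Range<u64>) -> Range<u64> {
    if bytes.start >= bytes.end {
        return 0..0; // empty byte range: nothing to request
    }
    let start = bytes.start / CHUNK_SIZE; // round the start down
    let end = bytes.end.div_ceil(CHUNK_SIZE); // round the end up
    start..end
}
```

Under this model, bytes `0..1000` map to the single chunk `0..1`, so the requester receives slightly more data than it asked for and trims it locally.
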

### GetManyRequest

```rust
pub struct GetManyRequest {
    pub hashes: Vec<Hash>,      // Sorted, deduplicated list of hashes
    pub ranges: ChunkRangesSeq, // Ranges for each hash (no root entry)
}
```

Like a `GetRequest` for a HashSeq, but the hashes are provided by the requester instead of looked up from the provider. This avoids the provider needing to have a pre-existing HashSeq blob.

```rust
let req = GetManyRequest::builder()
    .hash(hash1, ChunkRanges::all())
    .hash(hash2, ChunkRanges::all())
    .build();
// Deduplicates and sorts hashes automatically
```
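One way to obtain the sorted, deduplicated list while keeping each hash aligned with its ranges is to collect entries in an ordered map. This is a sketch only: `GetManySketch` is a hypothetical stand-in, plain `Range<u64>` vectors replace `ChunkRanges`, and merging the ranges of a repeated hash is an assumption about the desired behaviour, not a statement about the real builder.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

type Hash = [u8; 32];
type Ranges = Vec<Range<u64>>; // stand-in for ChunkRanges

/// Hypothetical builder that keeps hashes sorted and unique.
#[derive(Default)]
pub struct GetManySketch {
    entries: BTreeMap<Hash, Ranges>,
}

impl GetManySketch {
    /// Add a hash; a repeated hash extends the ranges already stored
    /// (assumption made for this sketch).
    pub fn hash(mut self, hash: Hash, ranges: Ranges) -> Self {
        self.entries.entry(hash).or_default().extend(ranges);
        self
    }

    /// Produce parallel vectors: hashes in ascending order and the
    /// ranges for each hash at the same index.
    pub fn build(self) -> (Vec<Hash>, Vec<Ranges>) {
        self.entries.into_iter().unzip()
    }
}
```

The ordered map gives sorting and deduplication for free, and the two output vectors stay index-aligned because they come from the same iteration.
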

### PushRequest

```rust
pub struct PushRequest(GetRequest); // Wraps a GetRequest
```

The inverse of a GetRequest: the requester pushes data to the provider. The request describes what will be sent and is followed by the actual data stream. Providers may reject push requests (push is disabled by default via `EventMask`).

### ObserveRequest

```rust
pub struct ObserveRequest {
    pub hash: Hash,
    pub ranges: RangeSpec, // Which ranges to observe
}
```

Subscribes to availability changes for a blob's bitfield. The provider sends `ObserveItem` updates as chunks become available.

## Response Format

### For Get/GetMany/Push

The response is BLAKE3-verified streaming data (bao-tree format). For each blob in the request:

1. **8-byte size header** (little-endian u64): the total size of the blob
2. **BLAKE3 verified stream**: encoded data for the requested ranges, using bao-tree's mixed encoding:
   - `BaoContentItem::Parent(node, (left_hash, right_hash))`: internal hash tree nodes (64 bytes each)
   - `BaoContentItem::Leaf(Leaf { offset, data })`: actual data chunks

The data is sent in order: ascending chunks for each blob, blobs in HashSeq order.

**Verification**: The requester validates each chunk group against the expected BLAKE3 hash tree, so invalid data is detected after receiving at most 16 KiB (one chunk group). If the provider is missing a chunk, it closes the stream at the point where data becomes unavailable.

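As a small illustration of the per-blob framing, the sketch below reads the 8-byte little-endian size header from a byte slice and computes how many 16 KiB chunk groups are needed to cover that many bytes. `read_size_header` and `groups_covering` are hypothetical helpers; the real decoder works on a QUIC stream and also verifies hashes, which is left out here.

```rust
/// Size of one chunk group in bytes (16 chunks of 1 KiB each).
const CHUNK_GROUP_SIZE: u64 = 16 * 1024;

/// Read the 8-byte little-endian size header and return the blob size
/// together with the rest of the buffer.
pub fn read_size_header(stream: &[u8]) -> Option<(u64, &[u8])> {
    if stream.len() < 8 {
        return None; // header not fully received yet
    }
    let (header, rest) = stream.split_at(8);
    let mut raw = [0u8; 8];
    raw.copy_from_slice(header);
    Some((u64::from_le_bytes(raw), rest))
}

/// Number of 16 KiB groups needed to cover `size` bytes.
pub fn groups_covering(size: u64) -> u64 {
    size.div_ceil(CHUNK_GROUP_SIZE)
}
```

For a 40,000-byte blob this gives three groups, so corruption anywhere in the blob is caught in one of three verification steps rather than only at the end.
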

### For Observe

The provider sends length-prefixed `ObserveItem` messages:

```rust
pub struct ObserveItem {
    pub size: u64,           // Blob size
    pub ranges: ChunkRanges, // Available chunks
}
```

Updates are sent as deltas: each message carries only the chunks that have become available since the last update.

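Since each update is a delta, the observer has to accumulate them to know the full availability. A minimal sketch of that bookkeeping, using a set of chunk numbers in place of `ChunkRanges`; the `Availability` type is hypothetical and assumes deltas only ever add chunks.

```rust
use std::collections::BTreeSet;
use std::ops::Range;

/// Accumulated view of which chunks a provider has (hypothetical type).
#[derive(Default)]
pub struct Availability {
    pub size: u64,
    chunks: BTreeSet<u64>,
}

impl Availability {
    /// Apply one delta update: record the reported size and add the
    /// newly available chunk ranges to the known set.
    pub fn apply(&mut self, size: u64, delta: &[Range<u64>]) {
        self.size = size;
        for range in delta {
            self.chunks.extend(range.clone());
        }
    }

    /// Whether a given chunk is known to be available.
    pub fn has(&self, chunk: u64) -> bool {
        self.chunks.contains(&chunk)
    }

    /// Number of chunks known to be available so far.
    pub fn available(&self) -> usize {
        self.chunks.len()
    }
}
```

A set of individual chunk numbers is the simplest representation; a range-set structure would be more compact for large blobs.
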

## Error Handling

Error codes for stream/connection closure:

| Code | Name | Meaning |
|------|------|---------|
| 0 | StreamDropped | RecvStream was dropped |
| 1 | ProviderTerminating | Provider is shutting down |
| 2 | RequestReceived | Only one request per stream allowed |
| 1 (application) | ERR_PERMISSION | Permission denied |
| 2 (application) | ERR_LIMIT | Rate limited |
| 3 (application) | ERR_INTERNAL | Internal error |

## Client-Side FSM (Get)

The `get::fsm` module implements the get request as a **finite state machine** for maximum control:

```
AtInitial
  │ (open QUIC stream)
  ▼
AtConnected
  │ (send request, drop writer)
  ▼
ConnectedNext ─┬─ StartRoot(hash, ranges)     // offset 0 = root blob
               ├─ StartChild(offset, ranges)  // offset > 0 = child blob
               └─ Closing                     // empty request
  │
AtStartRoot / AtStartChild
  │ (determine hash for child)
  ▼
AtBlobHeader
  │ (read 8-byte size)
  ▼
AtBlobContent
  │ (stream BLAKE3-verified items)
  ├─ More(content_item) → AtBlobContent  // loop
  └─ Done → AtEndBlob
  │
AtEndBlob
  │ (iterate to next blob in sequence)
  ├─ MoreChildren(AtStartChild)
  └─ Closing
       │ (drain remaining bytes)
       ▼
     Stats (transfer statistics)
```

Each state transition is explicit. The FSM gives the consumer full control:

- `AtBlobContent::next()` returns `BlobContentNext::More((content, item))` or `BlobContentNext::Done(end)`
- `AtBlobHeader::next()` reads the size header and creates a `ResponseDecoder`
- `AtStartChild::next(hash)` requires the caller to supply the hash (from the HashSeq)

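The shape of this state machine can be sketched with a plain enum and a driver loop. The model below is deliberately simplified and is not the `get::fsm` API: it reads from an in-memory byte slice, treats content as raw payload with no parent nodes or hash verification, and takes the number of blobs as a parameter. `State`, `SketchStats`, and `run_get` are hypothetical names.

```rust
/// Simplified states; the names mirror the diagram above.
pub enum State {
    AtBlobHeader { blob: usize },
    AtBlobContent { blob: usize, remaining: u64 },
    AtEndBlob { blob: usize },
    Closing,
}

/// Byte counters in the spirit of the transfer statistics.
#[derive(Debug, Default)]
pub struct SketchStats {
    pub payload_bytes_read: u64,
    pub other_bytes_read: u64,
}

/// Drive the simplified FSM over `input`, expecting `blobs` blobs.
pub fn run_get(mut input: &[u8], blobs: usize) -> Result<SketchStats, String> {
    let mut stats = SketchStats::default();
    let mut state = if blobs == 0 {
        State::Closing // empty request
    } else {
        State::AtBlobHeader { blob: 0 }
    };
    loop {
        state = match state {
            State::AtBlobHeader { blob } => {
                if input.len() < 8 {
                    return Err("truncated size header".to_string());
                }
                let (header, rest) = input.split_at(8);
                input = rest;
                let mut raw = [0u8; 8];
                raw.copy_from_slice(header);
                stats.other_bytes_read += 8;
                State::AtBlobContent { blob, remaining: u64::from_le_bytes(raw) }
            }
            // Equivalent of `Done`: nothing left to read for this blob.
            State::AtBlobContent { blob, remaining: 0 } => State::AtEndBlob { blob },
            // Equivalent of `More`: consume up to one 1 KiB chunk.
            State::AtBlobContent { blob, remaining } => {
                let n = remaining.min(1024).min(input.len() as u64);
                if n == 0 {
                    return Err("stream closed before blob was complete".to_string());
                }
                input = &input[n as usize..];
                stats.payload_bytes_read += n;
                State::AtBlobContent { blob, remaining: remaining - n }
            }
            State::AtEndBlob { blob } if blob + 1 < blobs => {
                State::AtBlobHeader { blob: blob + 1 }
            }
            State::AtEndBlob { .. } => State::Closing,
            State::Closing => return Ok(stats),
        };
    }
}
```

Encoding each state as a value that is consumed to produce the next one makes it impossible to, for example, read content before the header has been read.
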

### Stats Tracking

```rust
pub struct Stats {
    pub payload_bytes_read: u64,    // Actual data bytes
    pub other_bytes_read: u64,      // Hash pairs, headers
    pub payload_bytes_written: u64, // For push
    pub other_bytes_written: u64,   // For push
    pub elapsed: Duration,
}
```

## Provider-Side Handling

```rust
pub async fn handle_connection(connection: Connection, store: Store, events: EventSender);
```

The provider accepts QUIC streams on a connection. For each stream:

1. Read the request type byte
2. Deserialize the request
3. Dispatch to `handle_get`, `handle_get_many`, `handle_observe`, or `handle_push`
4. For `handle_get`: iterate over the `ChunkRangesSeq`, streaming each blob via `store.export_bao(hash, ranges)`
5. For HashSeq requests: load the root blob, parse it as `HashSeq`, then stream each requested child

### Event System

The provider can emit events for monitoring and access control:

```rust
pub struct EventMask {
    pub connected: ConnectMode, // None, Notify, Intercept
    pub get: RequestMode,       // None, Notify, Intercept, NotifyLog, InterceptLog, Disabled
    pub get_many: RequestMode,
    pub push: RequestMode,      // Disabled by default!
    pub observe: ObserveMode,
    pub throttle: ThrottleMode, // None, Intercept
}
```

- **None**: No events, requests processed normally
- **Notify**: Events sent but cannot block requests
- **Intercept**: Events sent as RPC requests; handler can reject with `AbortReason`
- **Disabled**: All requests of this type rejected

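The four modes described above amount to a small admission decision per request. The following sketch models that decision; `Mode`, `Decision`, and `admit` are hypothetical, the event channel is reduced to a log vector, and the interceptor is a plain closure instead of an RPC call.

```rust
/// The four modes described above (the `*Log` variants are omitted).
#[derive(Clone, Copy)]
pub enum Mode {
    None,
    Notify,
    Intercept,
    Disabled,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Proceed,
    Rejected,
}

/// Decide whether a request may proceed under the given mode.
/// `log` stands in for the event channel, `handler_allows` for the
/// intercepting handler's verdict.
pub fn admit(
    mode: Mode,
    log: &mut Vec<&'static str>,
    handler_allows: impl FnOnce() -> bool,
) -> Decision {
    match mode {
        // No events; the request is processed normally.
        Mode::None => Decision::Proceed,
        // Event is emitted, but nobody can block the request.
        Mode::Notify => {
            log.push("notify");
            Decision::Proceed
        }
        // Event is emitted and the handler's answer is binding.
        Mode::Intercept => {
            log.push("intercept");
            if handler_allows() {
                Decision::Proceed
            } else {
                Decision::Rejected
            }
        }
        // Requests of this type are always rejected.
        Mode::Disabled => Decision::Rejected,
    }
}
```

Note that the handler is only consulted in `Intercept` mode, which is why `Notify` is the cheaper choice when monitoring is all that is needed.
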

Progress events: `TransferStarted`, `TransferProgress`, `TransferCompleted`, `TransferAborted`.

## Collection Format

```rust
pub struct Collection {
    blobs: Vec<(String, Hash)>, // Named references to child blobs
}
```

Wire format (as a HashSeq blob):

1. First child blob: `CollectionMeta` serialized with postcard
2. Remaining children: the actual data blobs

```rust
pub struct CollectionMeta {
    header: [u8; 13],   // Must be b"CollectionV0."
    names: Vec<String>, // Names for each child blob
}
```

The header `b"CollectionV0."` is a magic number for format identification. The meta blob's hash becomes the first entry in the HashSeq, followed by the hashes of each data blob. Names correspond 1:1 with data blobs (excluding the meta entry).

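The index relationship between names and HashSeq entries is easy to get off by one, so here is a minimal sketch of it. It works on plain vectors and a caller-supplied meta hash; `hash_seq_layout`, `lookup`, and `valid_header` are hypothetical helpers, and no postcard serialization or hashing is performed.

```rust
type Hash = [u8; 32];

/// Magic header expected at the start of the collection metadata.
const HEADER: &[u8; 13] = b"CollectionV0.";

/// Check the magic header of a decoded meta blob.
pub fn valid_header(header: &[u8; 13]) -> bool {
    header == HEADER
}

/// Lay out a collection as (names, HashSeq entries): the meta hash
/// comes first, followed by the data blob hashes in order.
pub fn hash_seq_layout(meta_hash: Hash, blobs: &[(String, Hash)]) -> (Vec<String>, Vec<Hash>) {
    let names: Vec<String> = blobs.iter().map(|(name, _)| name.clone()).collect();
    let mut hashes = vec![meta_hash];
    hashes.extend(blobs.iter().map(|(_, hash)| *hash));
    (names, hashes)
}

/// Find the hash for a name; the `+ 1` skips the meta entry.
pub fn lookup<'a>(names: &[String], hashes: &'a [Hash], name: &str) -> Option<&'a Hash> {
    let index = names.iter().position(|n| n.as_str() == name)?;
    hashes.get(index + 1)
}
```

In this layout the HashSeq always has exactly one more entry than there are names.
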