docs(research): add iroh suite deep-dive references for iroh, irpc, iroh-blobs, iroh-gossip, iroh-live, and iroh-docs
# iroh-blobs: Data Flow and Complete Example

## Complete Data Flow: Provider Side

```
                      QUIC Connection Arrives
                                 │
                                 ▼
              handle_connection(conn, store, events)
                                 │
                      ┌──────────┴──────────┐
                      │  Accept QUIC BIDI   │
                      │  streams in loop    │
                      └──────────┬──────────┘
                                 │
                    handle_stream(pair, store)
                                 │
                      ┌──────────┴──────────┐
                      │  Read Request type  │
                      │  byte + deserialize │
                      └──────────┬──────────┘
                                 │
     ┌─────────────┬─────────────┼─────────────┬─────────────┐
     │             │             │             │             │
handle_get    handle_get      handle        handle      (reserved)
                 _many       _observe        _push
     │             │             │             │
     ▼             ▼             ▼             ▼
┌────────────────────────────────────────────────────────────────┐
│ For each (offset, ranges) in request.ranges:                   │
│                                                                │
│   if offset == 0:                                              │
│     send_blob(store, 0, hash, ranges, writer)                  │
│   else:                                                        │
│     lookup hash in HashSeq[offset-1]                           │
│     send_blob(store, offset, child_hash, ranges, writer)       │
│                                                                │
│ send_blob:                                                     │
│   store.export_bao(hash, ranges)                               │
│     .write_with_progress(writer, ctx, &hash, idx)              │
└────────────────────────────────────────────────────────────────┘
```

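The offset rule in the diagram above (offset 0 addresses the root blob, offset `n > 0` addresses entry `n - 1` of the hash sequence) can be sketched in plain Rust. This is a minimal std-only illustration, not the provider's actual code: `Hash` and `resolve_hash` are hypothetical names introduced here.

```rust
use std::convert::TryFrom;

/// Stand-in for a 32-byte BLAKE3 hash (hypothetical type for this sketch).
type Hash = [u8; 32];

/// Resolve which blob a request offset refers to.
/// Offset 0 is the root blob; offset n > 0 is entry n - 1 of the hash sequence.
/// Returns `None` when the offset points past the end of the sequence.
fn resolve_hash(root: Hash, hash_seq: &[Hash], offset: u64) -> Option<Hash> {
    if offset == 0 {
        Some(root)
    } else {
        let idx = usize::try_from(offset - 1).ok()?;
        hash_seq.get(idx).copied()
    }
}
```

With a two-entry sequence, offsets 1 and 2 resolve to the first and second child, and offset 3 yields `None`.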
## Complete Data Flow: Requester Side (Get FSM)

```
                Create GetRequest
                        │
                        ▼
    fsm::start(connection, request, counters)
                        │
                        ▼
                AtInitial.next()
                        │  (open_bi, send request)
                        ▼
               AtConnected.next()
                        │
        ┌───────────────┼───────────────┐
        │               │               │
    StartRoot      StartChild        Closing
   (offset=0)      (offset>0)        (empty)
        │               │               │
        ▼               ▼               ▼
  AtBlobHeader    AtBlobHeader      AtClosing
     .next()       .next(hash)       .next()
        │               │               │
        ▼               ▼               ▼
      (AtBlobContent, size)           Stats
                │
       ┌────────┴────────┐
       │                 │
  More(item)           Done
 (loop back to      (AtEndBlob)
  AtBlobContent)         │
                 ┌───────┴───────┐
                 │               │
           MoreChildren       Closing
          (AtStartChild)    (AtClosing)
                 │               │
                 └───────────────┘
```

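The transitions in the diagram can be modeled as a small table-driven state machine. The sketch below is a simplified, synchronous stand-in for illustration only: the real FSM uses one type per state and async `next()` calls, and the `State`, `Event`, `step`, and `run` names are hypothetical. `StartRoot` and `StartChild` are collapsed into a single `StartBlob` event.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Initial,
    Connected,
    BlobHeader,
    BlobContent,
    EndBlob,
    Closing,
    Finished,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    RequestSent, // request written to the stream
    StartBlob,   // StartRoot or StartChild / MoreChildren
    NothingLeft, // no (more) blobs in the request
    SizeRead,    // blob header (size) received
    Item,        // one parent or leaf item received
    BlobDone,    // current blob fully received
    Closed,      // stream closed, stats available
}

/// Apply one event; `None` means the transition is not allowed.
fn step(state: State, event: Event) -> Option<State> {
    match (state, event) {
        (State::Initial, Event::RequestSent) => Some(State::Connected),
        (State::Connected, Event::StartBlob) | (State::EndBlob, Event::StartBlob) => {
            Some(State::BlobHeader)
        }
        (State::Connected, Event::NothingLeft) | (State::EndBlob, Event::NothingLeft) => {
            Some(State::Closing)
        }
        (State::BlobHeader, Event::SizeRead) => Some(State::BlobContent),
        (State::BlobContent, Event::Item) => Some(State::BlobContent),
        (State::BlobContent, Event::BlobDone) => Some(State::EndBlob),
        (State::Closing, Event::Closed) => Some(State::Finished),
        _ => None,
    }
}

/// Run a whole event sequence starting from `Initial`.
fn run(events: &[Event]) -> Option<State> {
    events.iter().try_fold(State::Initial, |s, &e| step(s, e))
}
```

Encoding each state as its own type, as the diagram's `At*` names suggest, makes invalid transitions a compile-time error rather than the runtime `None` used here.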
### Blob Content Items

During `AtBlobContent`, items arrive as `BaoContentItem`:

```rust
pub enum BaoContentItem {
    Parent(ParentNode), // (node, (left_hash, right_hash)): 64 bytes of hashes
    Leaf(Leaf),         // { offset: u64, data: Bytes }: actual data
}
```

- **Parent nodes** contain BLAKE3 hash pairs for tree verification. They are pure verification overhead: 64 bytes (two 32-byte hashes) per internal node.
- **Leaf nodes** contain actual data chunks. Each leaf's data is at most `IROH_BLOCK_SIZE` bytes (16 KiB).

Verification is automatic: the `ResponseDecoder` from `bao-tree` validates each chunk group against the expected hash tree rooted at the request hash.

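To make the split between payload and verification overhead concrete, here is a minimal sketch that tallies both over a list of items. `ContentItem` is a simplified, hypothetical stand-in for `BaoContentItem` (plain arrays and `Vec<u8>` instead of the real node and `Bytes` types), and `tally` is an illustrative helper, not part of any library.

```rust
/// Simplified stand-in for `BaoContentItem` (hypothetical, std-only).
enum ContentItem {
    /// A pair of child hashes, 2 x 32 bytes.
    Parent { left: [u8; 32], right: [u8; 32] },
    /// A block of actual blob data starting at `offset`.
    Leaf { offset: u64, data: Vec<u8> },
}

/// Returns (payload_bytes, overhead_bytes) for a sequence of items.
fn tally(items: &[ContentItem]) -> (u64, u64) {
    let mut payload = 0u64;
    let mut overhead = 0u64;
    for item in items {
        match item {
            ContentItem::Parent { left, right } => {
                overhead += (left.len() + right.len()) as u64;
            }
            ContentItem::Leaf { data, .. } => payload += data.len() as u64,
        }
    }
    (payload, overhead)
}
```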
## Blob Verification and BaoTree Encoding

### How BLAKE3 Verified Streaming Works

1. **The hash is the root** of a binary Merkle tree
2. **Internal nodes** store `(left_child_hash, right_child_hash)` (64 bytes each)
3. **Leaf nodes** store the actual data chunks (up to 1024 bytes each in standard BLAKE3, or 16 KiB in iroh's block size)
4. **Chunk groups** (16 chunks = 16 KiB) are the minimum verification unit in iroh-blobs

For a request with specific ranges:

- The provider traverses the tree, yielding only nodes needed to verify the requested ranges
- The requester can verify each chunk group independently after receiving its parent hash pair
- Corruption is caught at chunk-group granularity: at most 16 KiB (one chunk group) is received before it is verified

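Because the chunk group is the verification unit, a byte range is effectively rounded outward to chunk-group boundaries. A minimal sketch of that arithmetic, assuming the 16 KiB group size stated above (`chunk_groups_for_range` is a hypothetical helper name):

```rust
use std::ops::Range;

/// Chunk group size used by iroh-blobs as described above: 16 chunks of 1 KiB.
const CHUNK_GROUP_SIZE: u64 = 16 * 1024;

/// Indices of the chunk groups touched by the byte range `start..end`.
/// An empty byte range touches no chunk group.
fn chunk_groups_for_range(start: u64, end: u64) -> Range<u64> {
    if start >= end {
        return 0..0;
    }
    let first = start / CHUNK_GROUP_SIZE;
    // Round the exclusive end up to the next group boundary.
    let last = (end + CHUNK_GROUP_SIZE - 1) / CHUNK_GROUP_SIZE;
    first..last
}
```

For example, a two-byte read straddling the first boundary (bytes 16383 and 16384) touches groups 0 and 1, so up to 32 KiB of data is transferred and verified for it.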
### Outboard Storage

The **outboard** is the BLAKE3 hash tree stored separately from the data. For the provider:

- Small blobs (≤16 KiB): outboard is empty (not needed, single chunk group)
- Large blobs: outboard stored as `PreOrderMemOutboard` (in-memory) or as a file (filesystem store)

For the requester, the outboard is built incrementally as data arrives.

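The outboard size follows from the tree shape: a binary tree with `n` leaves has `n - 1` internal nodes, and each internal node is a 64-byte hash pair. The sketch below estimates the size under that assumption only; it ignores any header or size prefix a concrete on-disk format may add, and the function names are hypothetical.

```rust
const CHUNK_GROUP_SIZE: u64 = 16 * 1024;
const HASH_PAIR_SIZE: u64 = 64;

/// Number of chunk groups (tree leaves) for a blob of `size` bytes.
/// An empty blob still counts as one (empty) chunk group.
fn chunk_group_count(size: u64) -> u64 {
    std::cmp::max(1, (size + CHUNK_GROUP_SIZE - 1) / CHUNK_GROUP_SIZE)
}

/// Estimated outboard size: one hash pair per internal node.
fn outboard_size(size: u64) -> u64 {
    (chunk_group_count(size) - 1) * HASH_PAIR_SIZE
}
```

A 1 MiB blob has 64 chunk groups and therefore 63 hash pairs, about 4 KiB of outboard, while any blob of 16 KiB or less needs none.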
## Import and Export Flows

### Import Bytes (Local Data)

```
add_bytes(data) / add_slice(data)
    │
    ▼
ImportBytesRequest { data, format, scope }
    │
    ▼
Actor::import_bytes()
    │  1. Send AddProgressItem::Size(len)
    │  2. Send AddProgressItem::CopyDone
    │  3. Compute outboard: PreOrderMemOutboard::create(&data, IROH_BLOCK_SIZE)
    │  4. Return ImportEntry { data, outboard, scope, format, tx }
    │
    ▼
Actor::finish_import()
    │  1. Get hash from outboard.root()
    │  2. Get or create BaoFileHandle for hash
    │  3. Transition BaoFileStorage::Partial → Complete
    │  4. Create TempTag for the hash_and_format
    │  5. Send AddProgressItem::Done(temp_tag)
```

### Import BAO Stream (Remote Data)

```
import_bao_bytes(hash, ranges, data) / import_bao_reader(hash, ranges, reader)
    │
    ▼
ImportBaoRequest { hash, size }
    │
    ▼
Actor::import_bao()
    │  1. Set size on partial entry
    │  2. Create BaoTree for the size
    │  3. For each BaoContentItem from stream:
    │     - Parent: write hash pair to outboard
    │     - Leaf: write data to storage, update bitfield
    │     - If bitfield becomes complete: transition Partial → Complete
    │  4. Send result
```

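Step 3 of the flow above, tracking received leaves in a bitfield and flipping to `Complete` once every chunk group has arrived, can be sketched as follows. This is an illustrative model only: `Entry`, `Storage`, and `on_leaf` are hypothetical names, and a `Vec<bool>` stands in for the range-based bitfield the store actually uses.

```rust
const CHUNK_GROUP_SIZE: u64 = 16 * 1024;

#[derive(Debug, PartialEq)]
enum Storage {
    Partial,
    Complete,
}

/// Hypothetical partial entry: one flag per chunk group.
struct Entry {
    received: Vec<bool>,
    storage: Storage,
}

impl Entry {
    /// Create a partial entry for a blob of `size` bytes.
    fn new(size: u64) -> Self {
        let groups = std::cmp::max(1, (size + CHUNK_GROUP_SIZE - 1) / CHUNK_GROUP_SIZE);
        Entry {
            received: vec![false; groups as usize],
            storage: Storage::Partial,
        }
    }

    /// Record a leaf that starts at byte `offset` and report the new state.
    fn on_leaf(&mut self, offset: u64) -> &Storage {
        let idx = (offset / CHUNK_GROUP_SIZE) as usize;
        if let Some(slot) = self.received.get_mut(idx) {
            *slot = true;
        }
        // Transition once no chunk group is missing.
        if self.received.iter().all(|&r| r) {
            self.storage = Storage::Complete;
        }
        &self.storage
    }
}
```

Leaves may arrive in any order; the entry only becomes `Complete` when the last missing group is filled.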
### Export BAO

```
export_bao(hash, ranges) → ExportBao
    │
    ▼
Actor::export_bao()
    │  1. Look up BaoFileHandle for hash
    │  2. If not found: send EncodeError::NotFound and return
    │  3. Create BaoTreeSender from data + outboard readers
    │  4. Call traverse_ranges_validated(data, outboard, &ranges, tx)
    │     → streams validated BAO items to the sender
```

### Export Path (To Filesystem)

```
export(hash, target_path) → ExportPath
    │
    ▼
Actor::export_path()
    │  1. Look up BaoFileHandle for hash
    │  2. Create parent directories if needed
    │  3. Create file at target_path
    │  4. Send ExportProgressItem::Size(total_size)
    │  5. Read data from store in 64 KiB chunks
    │  6. Write to file, yielding ExportProgressItem::CopyProgress(offset)
    │  7. Send ExportProgressItem::Done
```

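Steps 5 and 6 above amount to a chunked copy loop that reports the running offset after each write. A minimal std-only sketch, where `copy_with_progress` and its callback are hypothetical stand-ins for the actor's progress channel:

```rust
use std::io::{self, Read, Write};

/// Copy buffer size matching the 64 KiB chunks described above.
const COPY_CHUNK: usize = 64 * 1024;

/// Copy `src` to `dst` in 64 KiB chunks, calling `on_progress` with the
/// total number of bytes written so far after each chunk.
fn copy_with_progress<R: Read, W: Write>(
    mut src: R,
    mut dst: W,
    mut on_progress: impl FnMut(u64),
) -> io::Result<u64> {
    let mut buf = vec![0u8; COPY_CHUNK];
    let mut offset = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        dst.write_all(&buf[..n])?;
        offset += n as u64;
        on_progress(offset);
    }
    Ok(offset)
}
```

Reporting the cumulative offset rather than the chunk length lets a consumer render progress against the total size sent in step 4 without keeping its own counter.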
## Observe Protocol Detail

```
Requester                          Provider
    │                                  │
    │  ObserveRequest {hash, ranges}   │
    │─────────────────────────────────►│
    │                                  │
    │  ObserveItem {size, ranges}      │  (initial state)
    │◄─────────────────────────────────│
    │                                  │
    │  ... (time passes, more data     │
    │       becomes available)         │
    │                                  │
    │  ObserveItem {size, ranges}      │  (delta update)
    │◄─────────────────────────────────│
    │                                  │
    │  ... (continue until             │
    │       requester stops            │
    │       or connection closes)      │
    │                                  │
    │  STOP_STREAM                     │
    │─────────────────────────────────►│
```

The observe protocol uses `Bitfield::diff()` to send only the chunks that became available since the last update, minimizing bandwidth.

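The idea behind the delta updates can be illustrated with plain sets of chunk indices. This is a conceptual sketch only: the real `Bitfield` works on chunk ranges and its `diff()` semantics may differ in detail, and `newly_available` is a hypothetical helper.

```rust
use std::collections::BTreeSet;

/// Chunks present in `curr` but not in `prev`: what a delta update
/// would need to announce since the last `ObserveItem`.
fn newly_available(prev: &BTreeSet<u64>, curr: &BTreeSet<u64>) -> BTreeSet<u64> {
    curr.difference(prev).copied().collect()
}
```

If the provider last announced chunks 0 to 2 and now holds 0 to 2 plus 5 and 6, only 5 and 6 are sent; an unchanged bitfield produces an empty delta and no update is needed.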
## Full Working Example

```rust
use iroh::{protocol::Router, Endpoint, endpoint::presets};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol, ticket::BlobTicket, BlobFormat};

// === Provider Side ===
async fn provider() -> anyhow::Result<()> {
    let endpoint = Endpoint::bind(presets::N0).await?;
    let store = MemStore::new();

    // Add some data; the returned tag carries the blob's hash
    let tag = store.add_slice(b"Hello, iroh-blobs!").await?;

    // Wait for the endpoint to come online before reading its address
    let _ = endpoint.online().await;
    let addr = endpoint.addr();

    // Create ticket for sharing
    let ticket = BlobTicket::new(addr, tag.hash, BlobFormat::Raw);
    println!("Ticket: {ticket}");

    // Start serving
    let blobs = BlobsProtocol::new(&store, None);
    let router = Router::builder(endpoint)
        .accept(iroh_blobs::ALPN, blobs)
        .spawn();

    // Serve until Ctrl-C, then shut down cleanly
    tokio::signal::ctrl_c().await?;
    router.shutdown().await?;
    Ok(())
}

// === Requester Side ===
async fn requester(ticket: BlobTicket) -> anyhow::Result<()> {
    let (addr, hash, format) = ticket.into_parts();

    let endpoint = Endpoint::bind(presets::N0).await?;
    let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;

    // Build request based on format
    let request = match format {
        BlobFormat::Raw => iroh_blobs::protocol::GetRequest::blob(hash),
        BlobFormat::HashSeq => iroh_blobs::protocol::GetRequest::all(hash),
    };

    // Use the get FSM: AtInitial -> AtConnected -> ConnectedNext
    let initial = iroh_blobs::get::fsm::start(conn, request, Default::default());
    let connected = initial.next().await?;
    let next = connected.next().await?;

    match next {
        iroh_blobs::get::fsm::ConnectedNext::StartRoot(at_root) => {
            // AtStartRoot -> AtBlobHeader -> (AtBlobContent, size)
            let (at_content, size) = at_root.next().next().await?;
            // Read and verify the whole blob into memory
            let (_at_end, data) = at_content.concatenate_into_vec().await?;
            println!("Got {} bytes: {:?}", size, data);
            // ...
        }
        iroh_blobs::get::fsm::ConnectedNext::StartChild(_at_child) => {
            // Need to know the child hash
        }
        iroh_blobs::get::fsm::ConnectedNext::Closing(_at_closing) => {
            println!("Empty response");
        }
    }

    Ok(())
}
```

## Simplified Fetch (Using Store + Remote)

```rust
// The simplest way to download data
let store = MemStore::new();
let remote = store.remote();

// Fetch with automatic local availability checking
let result = remote.fetch(connection, hash, format, &store).await?;
// Result includes Stats with transfer metrics
```

## Key Error Types

| Error Type | Location | Purpose |
|------------|----------|---------|
| `GetError` | `get::error` | Errors during get FSM |
| `ExportBaoError` | `api` | Errors during BAO export |
| `RequestError` | `api` | Store command errors |
| `DecodeError` | `get::fsm` | BAO stream decode errors |
| `ProgressError` | `provider::events` | Provider event errors |