iroh-blobs: Data Flow and Complete Example
Complete Data Flow: Provider Side
QUIC Connection Arrives
│
▼
handle_connection(conn, store, events)
│
┌──────────┴──────────┐
│ Accept QUIC BIDI │
│ streams in loop │
└──────────┬──────────┘
│
handle_stream(pair, store)
│
┌──────────┴──────────┐
│ Read Request type │
│ byte + deserialize │
└──────────┬──────────┘
│
┌─────────────┬───────┼───────┬──────────────┐
│ │ │ │ │
handle_get   handle_get_many   handle_observe   handle_push   (reserved)
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────┐
│ For each (offset, ranges) in request.ranges: │
│ │
│ if offset == 0: │
│ send_blob(store, 0, hash, ranges, writer) │
│ else: │
│ lookup hash in HashSeq[offset-1] │
│ send_blob(store, offset, child_hash, ranges, writer) │
│ │
│ send_blob: │
│ store.export_bao(hash, ranges) │
│ .write_with_progress(writer, ctx, &hash, idx) │
└─────────────────────────────────────────────────┘
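The offset rule in the box above can be sketched on its own. This is a minimal illustration, assuming a hash sequence can be treated as a plain slice of 32-byte hashes; `Hash` and `resolve_hash` are hypothetical names and not part of the iroh-blobs API.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a 32-byte BLAKE3 hash.
type Hash = [u8; 32];

/// Resolve the blob that a request offset refers to.
///
/// Offset 0 is the root blob itself; offset n > 0 is entry n - 1 of the
/// hash sequence stored in the root blob. Returns `None` when the offset
/// points past the end of the sequence.
fn resolve_hash(root: Hash, hash_seq: &[Hash], offset: u64) -> Option<Hash> {
    if offset == 0 {
        Some(root)
    } else {
        let index = usize::try_from(offset - 1).ok()?;
        hash_seq.get(index).copied()
    }
}
```

With a two-entry sequence, offsets 0, 1 and 2 resolve to the root, the first child and the second child, and offset 3 resolves to nothing.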
Complete Data Flow: Requester Side (Get FSM)
Create GetRequest
│
▼
fsm::start(connection, request, counters)
│
▼
AtInitial.next()
│ (open_bi, send request)
▼
AtConnected.next()
│
┌───────────┼───────────┐
│ │ │
StartRoot StartChild Closing
(offset=0) (offset>0) (empty)
│ │ │
▼ ▼ ▼
AtBlobHeader AtBlobHeader AtClosing
.next() .next(hash) .next()
│ │ │
▼ ▼ ▼
(size, AtBlobContent) Stats
│
┌────────┴────────┐
│ │
More(item) Done
(loop back to (AtEndBlob)
AtBlobContent) │
┌─────┼─────┐
│ │
MoreChildren Closing
(AtStartChild) (AtClosing)
│ │
└───────────┘
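The order of states in the diagram can be traced with a small stand-alone sketch. It does not use the iroh-blobs types; `trace` is a hypothetical helper that only lists the state names a requester would pass through for a given list of requested offsets, assuming every requested blob is delivered in full.

```rust
/// List the state names visited for the given request offsets.
/// Offset 0 selects the root blob, any other offset selects a child.
fn trace(offsets: &[u64]) -> Vec<String> {
    let mut states = vec!["AtConnected".to_string()];
    for &offset in offsets {
        if offset == 0 {
            states.push("StartRoot".to_string());
        } else {
            states.push(format!("StartChild({})", offset));
        }
        // Every blob goes through header, content and end-of-blob.
        states.push("AtBlobHeader".to_string());
        states.push("AtBlobContent".to_string());
        states.push("AtEndBlob".to_string());
    }
    // After the last blob (or immediately, for an empty request).
    states.push("AtClosing".to_string());
    states
}
```

An empty request goes straight from AtConnected to AtClosing, which matches the Closing branch in the diagram.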
Blob Content Items
During AtBlobContent, items arrive as BaoContentItem:
pub enum BaoContentItem {
    Parent(ParentNode), // (node, (left_hash, right_hash)): 64 bytes of hashes
    Leaf(Leaf),         // { offset: u64, data: Bytes }: actual data
}
- Parent nodes contain BLAKE3 hash pairs for tree verification. They're overhead (~64 bytes per internal node).
- Leaf nodes contain actual data chunks. Each leaf's data is at most IROH_BLOCK_SIZE bytes (16 KiB).
Verification is automatic: the ResponseDecoder from bao-tree validates each chunk against the expected hash tree rooted at the request hash.
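To show how the two item kinds are consumed, here is a minimal sketch that reassembles a blob from already-verified items. `Item` is a simplified, hypothetical mirror of BaoContentItem (plain arrays and `Vec<u8>` instead of the real types), and `assemble` performs no hash verification itself.

```rust
/// Simplified, hypothetical mirror of `BaoContentItem`.
enum Item {
    /// Hash pair for an internal tree node; carries no payload data.
    Parent { left: [u8; 32], right: [u8; 32] },
    /// A run of payload bytes starting at `offset` in the blob.
    Leaf { offset: u64, data: Vec<u8> },
}

/// Copy every leaf into its place in a buffer of `size` bytes.
/// Returns the buffer and the number of overhead bytes (64 per parent).
/// Panics if a leaf does not fit inside `size` bytes.
fn assemble(size: usize, items: &[Item]) -> (Vec<u8>, usize) {
    let mut buf = vec![0u8; size];
    let mut overhead = 0;
    for item in items {
        match item {
            Item::Parent { left, right } => overhead += left.len() + right.len(),
            Item::Leaf { offset, data } => {
                let start = *offset as usize;
                buf[start..start + data.len()].copy_from_slice(data);
            }
        }
    }
    (buf, overhead)
}
```

Because each leaf carries its own offset, the same loop works for partial range requests, where the leaves do not cover the whole blob.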
Blob Verification and BaoTree Encoding
How BLAKE3 Verified Streaming Works
- The hash is the root of a binary Merkle tree
- Internal nodes store (left_child_hash, right_child_hash), 64 bytes each
- Leaf nodes store the actual data chunks (up to 1024 bytes each in standard BLAKE3, or 16 KiB with iroh's block size)
- Chunk groups (16 chunks = 16 KiB) are the minimum verification unit in iroh-blobs
For a request with specific ranges:
- The provider traverses the tree, yielding only nodes needed to verify the requested ranges
- The requester can verify each chunk group independently after receiving its parent hash pair
- Corruption is detected within one chunk group, so at most 16 KiB of bad data can arrive before verification fails
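Since the chunk group is the verification unit, a byte range is always served in whole groups. The arithmetic can be sketched as follows; `groups_for_bytes` and `verified_bytes` are hypothetical helpers, and the calculation ignores parent-node overhead and a possibly shorter final group.

```rust
/// Bytes per chunk group: 16 BLAKE3 chunks of 1024 bytes each.
const CHUNK_GROUP_SIZE: u64 = 16 * 1024;

/// Indices of the chunk groups that overlap the byte range `start..end`.
/// Returns an empty range when `start >= end`.
fn groups_for_bytes(start: u64, end: u64) -> std::ops::Range<u64> {
    if start >= end {
        return 0..0;
    }
    let first = start / CHUNK_GROUP_SIZE;
    let last = (end + CHUNK_GROUP_SIZE - 1) / CHUNK_GROUP_SIZE;
    first..last
}

/// Payload bytes that must be transferred to serve `start..end`,
/// counted in whole chunk groups.
fn verified_bytes(start: u64, end: u64) -> u64 {
    let groups = groups_for_bytes(start, end);
    (groups.end - groups.start) * CHUNK_GROUP_SIZE
}
```

For example, requesting bytes 20,000 to 40,000 touches groups 1 and 2, so 32 KiB of payload is transferred to deliver 20,000 requested bytes.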
Outboard Storage
The outboard is the BLAKE3 hash tree stored separately from the data. For the provider:
- Small blobs (≤16 KiB): outboard is empty (not needed, single chunk group)
- Large blobs: outboard stored as PreOrderMemOutboard (in-memory) or as a file (filesystem store)
For the requester, the outboard is built incrementally as data arrives.
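A rough size estimate for the outboard follows from the tree shape: a binary tree with n leaves has n - 1 internal nodes, and each stores one 64-byte hash pair. The sketch below assumes exactly that layout with one leaf per chunk group; `chunk_groups` and `outboard_size` are hypothetical helpers, and a real on-disk format may add headers that are not counted here.

```rust
/// Bytes per chunk group (16 chunks of 1024 bytes).
const CHUNK_GROUP_SIZE: u64 = 16 * 1024;
/// Bytes per internal node: two 32-byte hashes.
const HASH_PAIR_SIZE: u64 = 64;

/// Number of chunk groups for a blob of `size` bytes.
/// An empty blob still counts as one (empty) group.
fn chunk_groups(size: u64) -> u64 {
    ((size + CHUNK_GROUP_SIZE - 1) / CHUNK_GROUP_SIZE).max(1)
}

/// Estimated outboard size: one hash pair per internal node.
fn outboard_size(size: u64) -> u64 {
    (chunk_groups(size) - 1) * HASH_PAIR_SIZE
}
```

Under these assumptions a blob of up to 16 KiB needs no outboard at all, and a 1 MiB blob (64 groups) needs 63 hash pairs, or 4,032 bytes.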
Import and Export Flows
Import Bytes (Local Data)
add_bytes(data) / add_slice(data)
│
▼
ImportBytesRequest { data, format, scope }
│
▼
Actor::import_bytes()
│ 1. Send AddProgressItem::Size(len)
│ 2. Send AddProgressItem::CopyDone
│ 3. Compute outboard: PreOrderMemOutboard::create(&data, IROH_BLOCK_SIZE)
│ 4. Return ImportEntry { data, outboard, scope, format, tx }
│
▼
Actor::finish_import()
│ 1. Get hash from outboard.root()
│ 2. Get or create BaoFileHandle for hash
│ 3. Transition BaoFileStorage::Partial → Complete
│ 4. Create TempTag for the hash_and_format
│ 5. Send AddProgressItem::Done(temp_tag)
Import BAO Stream (Remote Data)
import_bao_bytes(hash, ranges, data) / import_bao_reader(hash, ranges, reader)
│
▼
ImportBaoRequest { hash, size }
│
▼
Actor::import_bao()
│ 1. Set size on partial entry
│ 2. Create BaoTree for the size
│ 3. For each BaoContentItem from stream:
│ - Parent: write hash pair to outboard
│ - Leaf: write data to storage, update bitfield
│ - If bitfield becomes complete: transition Partial → Complete
│ 4. Send result
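The leaf handling and the Partial → Complete transition in step 3 can be sketched with a plain vector of flags standing in for the bitfield. `PartialEntry` and `EntryState` are hypothetical types for illustration only, and the sketch assumes every leaf covers exactly one chunk group (the last one may be shorter).

```rust
/// Bytes per chunk group, the unit tracked by the bitfield.
const GROUP: usize = 16 * 1024;

/// Hypothetical two-state marker for the Partial → Complete transition.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EntryState {
    Partial,
    Complete,
}

/// Hypothetical in-memory partial entry; not an iroh-blobs type.
struct PartialEntry {
    data: Vec<u8>,
    /// One flag per chunk group, set once that group's leaf is written.
    have: Vec<bool>,
    state: EntryState,
}

impl PartialEntry {
    /// Steps 1 and 2: fix the size and derive the number of chunk groups.
    fn new(size: usize) -> Self {
        let groups = ((size + GROUP - 1) / GROUP).max(1);
        PartialEntry {
            data: vec![0; size],
            have: vec![false; groups],
            state: EntryState::Partial,
        }
    }

    /// Step 3, leaf case: store the bytes, mark the group, and switch to
    /// Complete once every group is present.
    fn write_leaf(&mut self, offset: usize, leaf: &[u8]) -> EntryState {
        self.data[offset..offset + leaf.len()].copy_from_slice(leaf);
        self.have[offset / GROUP] = true;
        if self.have.iter().all(|&present| present) {
            self.state = EntryState::Complete;
        }
        self.state
    }
}
```

Leaves may arrive in any order; the entry only becomes Complete when the last missing group is written.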
Export BAO
export_bao(hash, ranges) → ExportBao
│
▼
Actor::export_bao()
│ 1. Look up BaoFileHandle for hash
│ 2. If not found: send EncodeError::NotFound and return
│ 3. Create BaoTreeSender from data + outboard readers
│ 4. Call traverse_ranges_validated(data, outboard, &ranges, tx)
│ → streams validated BAO items to the sender
Export Path (To Filesystem)
export(hash, target_path) → ExportPath
│
▼
Actor::export_path()
│ 1. Look up BaoFileHandle for hash
│ 2. Create parent directories if needed
│ 3. Create file at target_path
│ 4. Send ExportProgressItem::Size(total_size)
│ 5. Read data from store in 64 KiB chunks
│ 6. Write to file, yielding ExportProgressItem::CopyProgress(offset)
│ 7. Send ExportProgressItem::Done
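Steps 5 and 6 amount to a chunked copy loop with a progress callback. The sketch below uses only `std::io`; `copy_with_progress` and `COPY_CHUNK` are hypothetical names, and the callback stands in for sending ExportProgressItem::CopyProgress(offset).

```rust
use std::io::{self, Read, Write};

/// Copy buffer size, matching the 64 KiB chunks described above.
const COPY_CHUNK: usize = 64 * 1024;

/// Copy `src` to `dst` in chunks of up to 64 KiB, reporting the running
/// byte offset after each chunk. Returns the total bytes copied.
fn copy_with_progress<R: Read, W: Write>(
    mut src: R,
    mut dst: W,
    mut on_progress: impl FnMut(u64),
) -> io::Result<u64> {
    let mut buf = vec![0u8; COPY_CHUNK];
    let mut offset = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        dst.write_all(&buf[..n])?;
        offset += n as u64;
        on_progress(offset);
    }
    dst.flush()?;
    Ok(offset)
}
```

A reader is allowed to return fewer bytes than the buffer holds, so progress reports are not guaranteed to land on 64 KiB boundaries for every source.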
Observe Protocol Detail
Requester Provider
│ │
│ ObserveRequest {hash, ranges} │
│─────────────────────────────────►│
│ │
│ ObserveItem {size, ranges} │ (initial state)
│◄─────────────────────────────────│
│ │
│ ... (time passes, more data │
│ becomes available) │
│ │
│ ObserveItem {size, ranges} │ (delta update)
│◄─────────────────────────────────│
│ │
│ ... (continue until │
│ requester stops │
│ or connection closes) │
│ │
│ STOP_STREAM │
│─────────────────────────────────►│
After the initial state, the provider uses Bitfield::diff() to send only the ranges that changed since the last update, which keeps observe traffic small.
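The idea of a delta update can be illustrated with ordinary sets. This is only a sketch of the concept, assuming availability is tracked as a set of chunk-group indices; it does not reproduce the real Bitfield type or the exact semantics of its diff, and `newly_available` is a hypothetical helper.

```rust
use std::collections::BTreeSet;

/// Chunk-group indices present in `current` but not in `previous`,
/// i.e. what a delta update would need to announce.
fn newly_available(previous: &BTreeSet<u64>, current: &BTreeSet<u64>) -> BTreeSet<u64> {
    current.difference(previous).copied().collect()
}
```

If the previous update announced groups 0 and 1 and the store now holds 0, 1, 2 and 5, the delta contains only 2 and 5.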
Full Working Example
use iroh::{protocol::Router, Endpoint, endpoint::presets};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol, ticket::BlobTicket, BlobFormat};

// === Provider Side ===
async fn provider() -> anyhow::Result<()> {
    let endpoint = Endpoint::bind(presets::N0).await?;
    let store = MemStore::new();
    // Add some data; the returned tag keeps the blob alive in the store
    let tag = store.add_slice(b"Hello, iroh-blobs!").await?;
    // Wait until the endpoint is reachable before publishing its address
    let _ = endpoint.online().await;
    let addr = endpoint.addr();
    // Create ticket for sharing
    let ticket = BlobTicket::new(addr, tag.hash, BlobFormat::Raw);
    println!("Ticket: {ticket}");
    // Start serving
    let blobs = BlobsProtocol::new(&store, None);
    let router = Router::builder(endpoint)
        .accept(iroh_blobs::ALPN, blobs)
        .spawn();
    // Serve until Ctrl-C, then shut down cleanly
    tokio::signal::ctrl_c().await?;
    router.shutdown().await?;
    Ok(())
}

// === Requester Side ===
async fn requester(ticket: BlobTicket) -> anyhow::Result<()> {
    let (addr, hash, format) = ticket.into_parts();
    let endpoint = Endpoint::bind(presets::N0).await?;
    let conn = endpoint.connect(addr, iroh_blobs::ALPN).await?;
    // Build request based on format
    let request = match format {
        BlobFormat::Raw => iroh_blobs::protocol::GetRequest::blob(hash),
        BlobFormat::HashSeq => iroh_blobs::protocol::GetRequest::all(hash),
    };
    // Use the get FSM: AtInitial -> AtConnected -> ConnectedNext
    let start = iroh_blobs::get::fsm::start(conn, request, Default::default());
    let connected = start.next().await?;
    let connected = connected.next().await?;
    match connected {
        iroh_blobs::get::fsm::ConnectedNext::StartRoot(at_root) => {
            // Read the size header, then collect the verified content
            let (at_content, size) = at_root.next().next().await?;
            let (_at_end, data) = at_content.concatenate_into_vec().await?;
            println!("Got {} bytes: {:?}", size, data);
            // ...
        }
        iroh_blobs::get::fsm::ConnectedNext::StartChild(_at_child) => {
            // Need to know the child hash before the child can be read
        }
        iroh_blobs::get::fsm::ConnectedNext::Closing(_at_closing) => {
            println!("Empty response");
        }
    }
    Ok(())
}
Simplified Fetch (Using Store + Remote)
// The simplest way to download data
let store = MemStore::new();
let remote = store.remote();
// Fetch with automatic local availability checking
let result = remote.fetch(connection, hash, format, &store).await?;
// Result includes Stats with transfer metrics
Key Error Types
| Error Type | Location | Purpose |
|---|---|---|
| GetError | get::error | Errors during get FSM |
| ExportBaoError | api | Errors during BAO export |
| RequestError | api | Store command errors |
| DecodeError | get::fsm | BAO stream decode errors |
| ProgressError | provider::events | Provider event errors |