iroh-gossip: PlumTree Broadcast Protocol
Overview
The PlumTree (Epidemic Broadcast Trees) protocol provides efficient message broadcasting across all peers in a topic's swarm. It builds on top of HyParView's membership layer, using the active view as its peer set.
It is implemented in src/proto/plumtree.rs.
Core Concept: Eager vs Lazy Push
Each peer maintains two subsets of its HyParView active view:
| Set | Description | Behavior |
|---|---|---|
| Eager push peers | Peers to whom full messages are sent immediately | Messages are pushed eagerly (full content) |
| Lazy push peers | Peers to whom only message IDs (hashes) are sent | IHave announcements are sent; the recipient requests the full content only if it has not already received it |
When a peer broadcasts a message:
- The full message is pushed to all eager peers.
- The message ID (a blake3 hash) is pushed to all lazy peers (after a short delay for batching).
This creates an optimized broadcast tree: eager peers form a spanning tree for low-latency delivery, while lazy peers provide redundancy through timeout-based recovery.
Configuration (plumtree::Config)
pub struct Config {
    pub graft_timeout_1: Duration,         // Default: 80ms
    pub graft_timeout_2: Duration,         // Default: 40ms
    pub dispatch_timeout: Duration,        // Default: 5ms
    pub optimization_threshold: Round,     // Default: Round(7)
    pub message_cache_retention: Duration, // Default: 30s
    pub message_id_retention: Duration,    // Default: 90s
    pub cache_evict_interval: Duration,    // Default: 1s
}
Timeout Semantics
- graft_timeout_1: After receiving an IHave, wait this long for the full message from an eager peer. If it doesn't arrive, send a Graft to the IHave sender.
- graft_timeout_2: After sending a Graft, wait this shorter timeout for the reply. If no reply arrives, try the next IHave sender.
- dispatch_timeout: Delay before batching and sending IHave messages. This allows multiple announcements to be aggregated into a single message.
- optimization_threshold: Number of hops difference required to trigger tree optimization (see below).
Cache Settings
- message_cache_retention: How long to keep full message payloads in cache. This enables replying to Graft requests from peers who missed the eager push.
- message_id_retention: How long to remember that we've already seen a message ID. This prevents re-delivering duplicate messages.
- cache_evict_interval: How often to check and evict expired entries.
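The defaults listed in the Config comments can be written out as a Default implementation. This is a minimal sketch, not the crate's code: the Round newtype here is a local stand-in (its inner integer width is an assumption), and slow_network_config is a hypothetical helper that only illustrates overriding selected fields.

```rust
use std::time::Duration;

/// Local stand-in for the crate's `Round` type (inner width is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Round(pub u8);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Config {
    pub graft_timeout_1: Duration,
    pub graft_timeout_2: Duration,
    pub dispatch_timeout: Duration,
    pub optimization_threshold: Round,
    pub message_cache_retention: Duration,
    pub message_id_retention: Duration,
    pub cache_evict_interval: Duration,
}

impl Default for Config {
    /// Values taken from the "Default:" comments in the struct listing.
    fn default() -> Self {
        Self {
            graft_timeout_1: Duration::from_millis(80),
            graft_timeout_2: Duration::from_millis(40),
            dispatch_timeout: Duration::from_millis(5),
            optimization_threshold: Round(7),
            message_cache_retention: Duration::from_secs(30),
            message_id_retention: Duration::from_secs(90),
            cache_evict_interval: Duration::from_secs(1),
        }
    }
}

/// Hypothetical helper: longer graft timeouts for a high-latency network,
/// keeping every other default unchanged.
pub fn slow_network_config() -> Config {
    Config {
        graft_timeout_1: Duration::from_millis(400),
        graft_timeout_2: Duration::from_millis(200),
        ..Config::default()
    }
}
```

Keeping graft_timeout_2 shorter than graft_timeout_1 preserves the two-phase behavior described above: the first wait gives the eager path a chance, and later retries move through the remaining IHave senders more quickly.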
State Structure
pub struct State<PI> {
    me: PI,                                    // Our peer identity
    config: Config,                            // Protocol configuration
    pub eager_push_peers: BTreeSet<PI>,        // Full message delivery peers
    pub lazy_push_peers: BTreeSet<PI>,         // Message-ID-only delivery peers
    lazy_push_queue: BTreeMap<PI, Vec<IHave>>, // Pending IHave announcements (batched)
    missing_messages: HashMap<MessageId, VecDeque<(PI, Round)>>, // IHave senders awaiting delivery
    received_messages: TimeBoundCache<MessageId, ()>, // Seen message IDs
    cache: TimeBoundCache<MessageId, Gossip>,  // Full message payloads
    graft_timer_scheduled: HashSet<MessageId>, // Active graft timers
    dispatch_timer_scheduled: bool,            // Whether IHave dispatch is pending
    init: bool,                                // Whether first event was processed
    stats: Stats,                              // Message counters
    max_message_size: usize,                   // Maximum allowed message size
}
Message Types (plumtree::Message)
| Message | Direction | Purpose |
|---|---|---|
| Gossip(Gossip) | Eager push | Full message content, broadcast to eager peers |
| Prune | Bidirectional | Sent when moving a peer from eager to lazy set |
| Graft(Graft) | Lazy → Eager upgrade | Request to become an eager peer; may include a message ID to request re-delivery |
| IHave(Vec<IHave>) | Lazy push | Announcement: "I have these messages" (batched, sent after dispatch_timeout) |
Gossip Message Structure
pub struct Gossip {
    id: MessageId,        // blake3 hash of content
    content: Bytes,       // The actual message payload
    scope: DeliveryScope, // Swarm(round) or Neighbors
}
The DeliveryScope tracks how many hops the message has traveled:
pub enum DeliveryScope {
    Swarm(Round), // Delivered via the swarm; Round = hop count from origin
    Neighbors,    // Delivered only to direct neighbors (not forwarded further)
}
Each time a Gossip message is forwarded, its Round is incremented via next_round(). Neighbors-scope messages are not forwarded at all.
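The forwarding rule can be sketched as a small function over the scope. This is an illustration under assumptions: the types below are local stand-ins, placing next_round on DeliveryScope is a choice made for this sketch, and saturating at the maximum round (rather than wrapping) is assumed.

```rust
/// Local stand-in for the crate's hop counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Round(pub u8);

impl Round {
    /// Next hop count; saturates instead of wrapping (an assumption of this sketch).
    pub fn next(self) -> Round {
        Round(self.0.saturating_add(1))
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryScope {
    Swarm(Round),
    Neighbors,
}

impl DeliveryScope {
    /// Scope to attach when forwarding the message one hop further,
    /// or `None` if the message must not be forwarded at all.
    pub fn next_round(self) -> Option<DeliveryScope> {
        match self {
            DeliveryScope::Swarm(round) => Some(DeliveryScope::Swarm(round.next())),
            DeliveryScope::Neighbors => None,
        }
    }
}
```

Returning an Option makes the "do not forward" case explicit at the call site: a forwarding loop only runs when next_round yields a value.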
IHave Structure
pub struct IHave {
    id: MessageId, // The blake3 hash of the message content
    round: Round,  // The hop count at which the sender received this message
}
Graft Structure
pub struct Graft {
    id: Option<MessageId>, // If set, also reply with full message content
    round: Round,          // The round from the IHave that triggered this graft
}
Message ID
pub struct MessageId([u8; 32]); // blake3 hash of message content

impl MessageId {
    pub fn from_content(message: &[u8]) -> Self {
        Self::from(blake3::hash(message))
    }
}
Messages are validated: when receiving a Gossip, the receiver checks that MessageId::from_content(&content) == id. Spoofed messages (where the hash doesn't match the content) are silently discarded.
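The validation check amounts to recomputing the ID from the content and comparing. The sketch below keeps the example free of external crates by swapping in a 64-bit FNV-1a hash as a stand-in for blake3; toy_hash, the u64-based MessageId, and the Vec<u8> payload are all assumptions of this sketch, and FNV-1a is not collision-resistant, so it is not suitable for real use.

```rust
/// Stand-in for blake3: 64-bit FNV-1a. NOT cryptographically secure;
/// used only so the example compiles without external crates.
fn toy_hash(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for byte in data {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Simplified message ID (the real one is a 32-byte blake3 hash).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct MessageId(u64);

impl MessageId {
    pub fn from_content(content: &[u8]) -> Self {
        MessageId(toy_hash(content))
    }
}

pub struct Gossip {
    pub id: MessageId,
    pub content: Vec<u8>,
}

impl Gossip {
    /// Builds a message whose ID is derived from its content.
    pub fn new(content: Vec<u8>) -> Self {
        let id = MessageId::from_content(&content);
        Gossip { id, content }
    }

    /// True if the claimed ID matches the hash of the content.
    pub fn validate(&self) -> bool {
        MessageId::from_content(&self.content) == self.id
    }
}
```

A receiver would call validate before any other processing and drop the message when it returns false.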
Broadcast Flow
Sending a Message
1. Compute MessageId = blake3(content)
2. Create Gossip { id, content, scope: Swarm(Round(0)) or Neighbors }
3. If Swarm scope:
   a. Add to received_messages and cache
   b. Queue IHave for lazy peers (dispatched after dispatch_timeout)
4. Eager-push Gossip to all eager peers (except self and sender)
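The sending steps can be sketched over a reduced state. This is a simplified model rather than the crate's code: PeerId and MessageId are hypothetical integer stand-ins, the payload cache is left out, queued IHave entries carry only the ID, and the function returns the eager recipients instead of emitting I/O events.

```rust
use std::collections::{BTreeMap, BTreeSet};

type PeerId = u32; // hypothetical stand-in for the peer identity type
type MessageId = u64; // hypothetical stand-in for the blake3-based ID

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    Swarm,
    Neighbors,
}

#[derive(Default)]
pub struct State {
    pub eager: BTreeSet<PeerId>,
    pub lazy: BTreeSet<PeerId>,
    pub received: BTreeSet<MessageId>,
    /// IHave announcements waiting for the dispatch timer, per lazy peer.
    pub lazy_queue: BTreeMap<PeerId, Vec<MessageId>>,
}

impl State {
    /// Originates a broadcast and returns the peers that should receive
    /// the full message immediately.
    pub fn broadcast(&mut self, id: MessageId, scope: Scope) -> Vec<PeerId> {
        if scope == Scope::Swarm {
            // Step 3a: remember our own message so echoes are treated as duplicates.
            self.received.insert(id);
            // Step 3b: queue an announcement for every lazy peer.
            for peer in &self.lazy {
                self.lazy_queue.entry(*peer).or_default().push(id);
            }
        }
        // Step 4: the full message goes to all eager peers.
        self.eager.iter().copied().collect()
    }
}
```

Queuing instead of sending IHave entries directly is what lets the dispatch timer batch several announcements into one message per lazy peer.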
Receiving a Gossip Message
1. Validate: message.id == blake3(message.content) → discard if invalid
2. If already received (in received_messages):
   → Send Prune to sender (move sender to lazy set)
   → Return (don't re-broadcast)
3. If Swarm scope:
   a. Add to received_messages
   b. Increment round (next_round)
   c. Add to cache (for Graft replies)
   d. Eager-push to all eager peers (except sender)
   e. Lazy-push IHave to all lazy peers (except sender)
   f. Check if any prior IHave senders had a shorter path → optimize tree
4. Emit Received event to application
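A reduced model of the receive path, covering duplicate detection, pruning, and forwarding. Validation (step 1), caching (3c), and tree optimization (3f) are omitted, and the Out enum, integer ID types, and the choice to announce the incremented round in IHave are assumptions of this sketch.

```rust
use std::collections::BTreeSet;

type PeerId = u32; // hypothetical stand-in
type MessageId = u64; // hypothetical stand-in

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryScope {
    Swarm(u8),
    Neighbors,
}

/// Actions the handler asks the caller to perform (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Out {
    Prune(PeerId),
    Gossip(PeerId, MessageId, u8),
    IHave(PeerId, MessageId, u8),
    Deliver(MessageId),
}

#[derive(Default)]
pub struct State {
    pub eager: BTreeSet<PeerId>,
    pub lazy: BTreeSet<PeerId>,
    pub received: BTreeSet<MessageId>,
}

impl State {
    pub fn on_gossip(&mut self, sender: PeerId, id: MessageId, scope: DeliveryScope) -> Vec<Out> {
        let mut out = Vec::new();
        // Step 2: a duplicate means the sender's link is redundant, so demote it.
        if self.received.contains(&id) {
            if self.eager.remove(&sender) {
                self.lazy.insert(sender);
            }
            out.push(Out::Prune(sender));
            return out;
        }
        // Step 3: swarm-scoped messages are recorded and forwarded one hop further.
        if let DeliveryScope::Swarm(round) = scope {
            self.received.insert(id);
            let next = round.saturating_add(1);
            for peer in self.eager.iter().filter(|p| **p != sender) {
                out.push(Out::Gossip(*peer, id, next));
            }
            for peer in self.lazy.iter().filter(|p| **p != sender) {
                out.push(Out::IHave(*peer, id, next));
            }
        }
        // Step 4: hand the message to the application.
        out.push(Out::Deliver(id));
        out
    }
}
```

The second copy of a message never reaches the application: the handler returns after the Prune, which is how redundant eager links are trimmed into a tree over time.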
Receiving an IHave
For each IHave entry:
  If message ID not in received_messages:
    Add (sender, round) to missing_messages[message_id]
    If no graft timer scheduled for this message:
      Schedule SendGraft timer (graft_timeout_1)
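The IHave handling above can be sketched as follows. The integer ID types and the Timer enum are hypothetical, and the function returns the timers to schedule (each with a delay of graft_timeout_1) rather than scheduling them itself.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

type PeerId = u32; // hypothetical stand-in
type MessageId = u64; // hypothetical stand-in
type Round = u8; // hypothetical stand-in

#[derive(Debug, PartialEq, Eq)]
pub enum Timer {
    SendGraft(MessageId),
}

#[derive(Default)]
pub struct State {
    pub received: HashSet<MessageId>,
    pub missing: HashMap<MessageId, VecDeque<(PeerId, Round)>>,
    pub graft_timer_scheduled: HashSet<MessageId>,
}

impl State {
    /// Records announcements for messages we have not seen and returns
    /// the graft timers that need to be scheduled.
    pub fn on_ihave(&mut self, sender: PeerId, ihaves: &[(MessageId, Round)]) -> Vec<Timer> {
        let mut timers = Vec::new();
        for &(id, round) in ihaves {
            if self.received.contains(&id) {
                continue; // already delivered, nothing to recover
            }
            // Remember who announced it, in arrival order, for later grafting.
            self.missing.entry(id).or_default().push_back((sender, round));
            // `insert` returns true only the first time, so each missing
            // message gets at most one pending timer.
            if self.graft_timer_scheduled.insert(id) {
                timers.push(Timer::SendGraft(id));
            }
        }
        timers
    }
}
```

Keeping the announcers in a queue, rather than only the first one, is what gives the graft timer fallback candidates if the first peer does not respond.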
Graft Timer Expiry (Two-Phase)
Phase 1 (graft_timeout_1):
  If message already received → no-op (cancel)
  Otherwise:
    Pop first (peer, round) from missing_messages[message_id]
    Move peer to eager set
    Send Graft { id: Some(message_id), round } to that peer
    Schedule another SendGraft timer (graft_timeout_2) for fallback
Phase 2 (graft_timeout_2):
  If message already received → no-op
  Otherwise:
    Pop next (peer, round) from missing_messages[message_id]
    Move that peer to eager set
    Send Graft { id: Some(message_id), round }
    Schedule another SendGraft timer (graft_timeout_2)
  (continues until the message is received or senders are exhausted)
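Both phases share the same body and differ only in the delay that led to them, so a single handler can model the timer expiry. In this sketch the ID types and the Out enum are hypothetical, and the caller is assumed to schedule the follow-up timer with graft_timeout_2.

```rust
use std::collections::{BTreeSet, HashMap, HashSet, VecDeque};

type PeerId = u32; // hypothetical stand-in
type MessageId = u64; // hypothetical stand-in
type Round = u8; // hypothetical stand-in

#[derive(Debug, PartialEq, Eq)]
pub enum Out {
    Graft { to: PeerId, id: MessageId, round: Round },
    /// Ask the caller to fire this handler again after graft_timeout_2.
    ScheduleGraftTimer(MessageId),
}

#[derive(Default)]
pub struct State {
    pub eager: BTreeSet<PeerId>,
    pub lazy: BTreeSet<PeerId>,
    pub received: HashSet<MessageId>,
    pub missing: HashMap<MessageId, VecDeque<(PeerId, Round)>>,
    pub graft_timer_scheduled: HashSet<MessageId>,
}

impl State {
    pub fn on_graft_timer(&mut self, id: MessageId) -> Vec<Out> {
        self.graft_timer_scheduled.remove(&id);
        // The message arrived while the timer was pending: nothing to do.
        if self.received.contains(&id) {
            self.missing.remove(&id);
            return Vec::new();
        }
        // Take the next peer that announced this message, if any.
        let next = self.missing.get_mut(&id).and_then(|queue| queue.pop_front());
        match next {
            Some((peer, round)) => {
                // Promote the announcer and ask it for the full payload.
                self.lazy.remove(&peer);
                self.eager.insert(peer);
                self.graft_timer_scheduled.insert(id);
                vec![Out::Graft { to: peer, id, round }, Out::ScheduleGraftTimer(id)]
            }
            None => {
                // All announcers tried; give up on this message.
                self.missing.remove(&id);
                Vec::new()
            }
        }
    }
}
```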
Receiving a Graft
1. Move sender to eager set
2. If Graft contains a message ID:
   Look up message in cache
   If found: send Gossip(message) to the requesting peer
Receiving a Prune
Move sender from eager set to lazy set
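Graft and Prune are mirror images: one promotes the sender, the other demotes it. A minimal sketch, assuming integer ID stand-ins and a plain HashMap in place of the time-bounded payload cache; on_graft returns the payload to send back instead of emitting a Gossip message.

```rust
use std::collections::{BTreeSet, HashMap};

type PeerId = u32; // hypothetical stand-in
type MessageId = u64; // hypothetical stand-in

#[derive(Default)]
pub struct State {
    pub eager: BTreeSet<PeerId>,
    pub lazy: BTreeSet<PeerId>,
    /// Simplified payload cache (no expiry in this sketch).
    pub cache: HashMap<MessageId, Vec<u8>>,
}

impl State {
    /// Promotes the sender to eager and returns the payload to send back,
    /// if one was requested and is still cached.
    pub fn on_graft(&mut self, sender: PeerId, id: Option<MessageId>) -> Option<Vec<u8>> {
        self.lazy.remove(&sender);
        self.eager.insert(sender);
        id.and_then(|id| self.cache.get(&id).cloned())
    }

    /// Demotes the sender to lazy: it will only get IHave announcements from now on.
    pub fn on_prune(&mut self, sender: PeerId) {
        self.eager.remove(&sender);
        self.lazy.insert(sender);
    }
}
```

If the requested payload has already been evicted from the cache, the sender is still promoted, so later messages reach it eagerly even though this one cannot be replayed.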
Tree Optimization
The PlumTree self-optimizes based on path length, measured in hops (rounds). When a Gossip message is received, if we previously received an IHave for the same message from a different peer, we check whether the IHave path was significantly shorter:
if (ihave_round < gossip_round) && (gossip_round - ihave_round) >= optimization_threshold:
    Graft the IHave sender (move to eager)
    Prune the Gossip sender (move to lazy)
This means that if a peer consistently has a shorter path to the message origin, it is promoted to eager and the longer-path peer is demoted. The optimization_threshold (default: 7 hops) prevents thrashing when the two paths differ by only a few hops.
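The condition is small enough to isolate as a pure function. The sketch below uses plain u8 values for rounds (an assumption) and a hypothetical function name; the short-circuiting && guarantees the subtraction never underflows.

```rust
/// Returns true if the lazy (IHave) path is shorter than the eager (Gossip)
/// path by at least `threshold` hops, meaning the two links should swap roles.
pub fn should_optimize(ihave_round: u8, gossip_round: u8, threshold: u8) -> bool {
    // The subtraction only runs when ihave_round < gossip_round.
    ihave_round < gossip_round && (gossip_round - ihave_round) >= threshold
}
```

With the default threshold of 7, an IHave announced at round 1 triggers a swap only when the same message arrives eagerly at round 8 or later; a Gossip arriving at round 7 (a 6-hop difference) leaves the tree unchanged.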
Neighbor Events
PlumTree receives neighbor events from HyParView:
- NeighborUp(peer): Add peer to eager set (all new neighbors start as eager)
- NeighborDown(peer): Remove from both eager and lazy sets; clean up any IHave entries from this peer in missing_messages
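The two neighbor events can be sketched over the same reduced state used for IHave tracking. The ID types are hypothetical stand-ins, and dropping a missing_messages entry once its queue becomes empty is an assumption of this sketch.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

type PeerId = u32; // hypothetical stand-in
type MessageId = u64; // hypothetical stand-in
type Round = u8; // hypothetical stand-in

#[derive(Default)]
pub struct State {
    pub eager: BTreeSet<PeerId>,
    pub lazy: BTreeSet<PeerId>,
    pub missing: HashMap<MessageId, VecDeque<(PeerId, Round)>>,
}

impl State {
    /// New neighbors start as eager; duplicates will prune them later if redundant.
    pub fn on_neighbor_up(&mut self, peer: PeerId) {
        self.lazy.remove(&peer);
        self.eager.insert(peer);
    }

    /// Forget the peer entirely, including announcements we can no longer graft.
    pub fn on_neighbor_down(&mut self, peer: PeerId) {
        self.eager.remove(&peer);
        self.lazy.remove(&peer);
        self.missing.retain(|_, queue| {
            queue.retain(|&(sender, _)| sender != peer);
            !queue.is_empty() // drop messages with no remaining announcers
        });
    }
}
```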
Neighbor-Only Broadcast
The Scope::Neighbors broadcast scope sends a message only to directly connected peers (the active view), without any forwarding:
pub enum Scope {
    Swarm,     // Broadcast to all peers in the swarm
    Neighbors, // Broadcast only to immediate neighbors
}
Neighbor-scoped messages are useful for localized communication and are not cached or re-broadcast.
Cache Management
The PlumTree maintains two time-bounded caches:
- cache (TimeBoundCache<MessageId, Gossip>): Stores full message payloads for message_cache_retention (default 30s). This enables replying to Graft requests for recently-broadcast messages.
- received_messages (TimeBoundCache<MessageId, ()>): Tracks which messages have been seen for message_id_retention (default 90s). This prevents duplicate delivery.
Both caches are periodically evicted (every cache_evict_interval, default 1s) via the EvictCache timer.
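A time-bounded cache of this kind can be approximated with a map from key to expiry time and value. The ExpiringCache type below is a hypothetical, simplified stand-in and not the crate's TimeBoundCache; its method names and the linear eviction scan are assumptions of this sketch.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Simplified stand-in for a time-bounded cache (hypothetical type).
pub struct ExpiringCache<K, V> {
    entries: HashMap<K, (Instant, V)>,
}

impl<K: Hash + Eq, V> ExpiringCache<K, V> {
    pub fn new() -> Self {
        ExpiringCache { entries: HashMap::new() }
    }

    /// Stores `value` until the absolute time `expires`.
    pub fn insert(&mut self, key: K, value: V, expires: Instant) {
        self.entries.insert(key, (expires, value));
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        self.entries.get(key).map(|(_, value)| value)
    }

    /// Removes every entry whose expiry is at or before `now` and returns
    /// how many were evicted. Intended to be called from a periodic timer.
    pub fn evict_expired(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, entry| entry.0 > now);
        before - self.entries.len()
    }
}

/// Example retention periods matching the defaults quoted above.
pub const PAYLOAD_RETENTION: Duration = Duration::from_secs(30);
pub const ID_RETENTION: Duration = Duration::from_secs(90);
```

Because IDs are retained longer than payloads, there is a window in which a message is still recognized as a duplicate but can no longer be served in response to a Graft.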