iroh-gossip: HyParView Membership Protocol
Overview
The HyParView protocol provides swarm membership management — it maintains which peers are currently part of the swarm for a given topic and ensures the overlay network remains connected even as nodes join, leave, or fail.
It is implemented in src/proto/hyparview.rs.
Core Concept: Two Views
Each peer maintains two sets of peers:
| View | Description | Default Size | Connection? |
|---|---|---|---|
| Active View | Peers we maintain active bidirectional connections to | 5 | Yes — TCP/QUIC connection is kept open |
| Passive View | An address book of peers we know about but are not connected to | 30 | No — just contact information |
Key invariants:
- Active connections are always bidirectional: If peer A has peer B in its active view, peer B also has peer A in its active view.
- The passive view serves as a failover pool: When an active peer disconnects, a random peer from the passive view is promoted to fill the slot.
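The interplay between the two views can be sketched with the standard library alone. This is a simplified illustration, not the crate's code: the Views type, its methods, and the u32 peer IDs are hypothetical. The caller supplies the eviction and promotion indices where a real node would draw random ones, and dropping the oldest passive entry when the passive view is full is an assumption made to keep the example deterministic.

```rust
/// Hypothetical, simplified model of the two views (not the iroh-gossip types).
#[derive(Debug)]
struct Views {
    active: Vec<u32>,  // peers we hold open connections to
    passive: Vec<u32>, // address book of known, unconnected peers
    active_capacity: usize,
    passive_capacity: usize,
}

impl Views {
    fn new(active_capacity: usize, passive_capacity: usize) -> Self {
        Views { active: Vec::new(), passive: Vec::new(), active_capacity, passive_capacity }
    }

    /// Remember a peer in the passive view. When the view is full, the oldest
    /// entry is dropped (assumption: a real node would pick a random entry).
    fn add_passive(&mut self, peer: u32) {
        if self.passive_capacity == 0
            || self.active.contains(&peer)
            || self.passive.contains(&peer)
        {
            return;
        }
        if self.passive.len() >= self.passive_capacity {
            self.passive.remove(0);
        }
        self.passive.push(peer);
    }

    /// Add a peer to the active view. If the view is full, the peer at
    /// `evict_index` is demoted to the passive view and returned.
    fn add_active(&mut self, peer: u32, evict_index: usize) -> Option<u32> {
        if self.active_capacity == 0 || self.active.contains(&peer) {
            return None;
        }
        self.passive.retain(|p| *p != peer);
        let mut demoted = None;
        if self.active.len() >= self.active_capacity {
            let victim = self.active.remove(evict_index % self.active.len());
            self.add_passive(victim);
            demoted = Some(victim);
        }
        self.active.push(peer);
        demoted
    }

    /// Fill a free active slot with the passive peer at `index`. In the real
    /// protocol this only happens after a successful Neighbor handshake.
    fn promote(&mut self, index: usize) -> Option<u32> {
        if self.passive.is_empty() || self.active.len() >= self.active_capacity {
            return None;
        }
        let peer = self.passive.remove(index % self.passive.len());
        self.active.push(peer);
        Some(peer)
    }
}
```

With an active capacity of 2, adding a third peer demotes one existing peer to the passive view, and that peer can later be promoted again once a slot frees up.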
Configuration (hyparview::Config)
pub struct Config {
    pub active_view_capacity: usize,      // Default: 5
    pub passive_view_capacity: usize,     // Default: 30
    pub active_random_walk_length: Ttl,   // Default: Ttl(6)
    pub passive_random_walk_length: Ttl,  // Default: Ttl(3)
    pub shuffle_random_walk_length: Ttl,  // Default: Ttl(6)
    pub shuffle_active_view_count: usize, // Default: 3
    pub shuffle_passive_view_count: usize, // Default: 4
    pub shuffle_interval: Duration,       // Default: 60s
    pub neighbor_request_timeout: Duration, // Default: 500ms
}
These defaults come directly from the HyParView paper (p. 9), except for shuffle_interval and neighbor_request_timeout, which the code itself describes as "wild guesses".
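As a sketch of how these defaults might be expressed and selectively overridden, the following block mirrors the field list above in a standalone Default implementation. The Ttl stand-in (including its u16 width) and the small_swarm_config helper with its values are assumptions for illustration; the crate's own types may differ.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the crate's Ttl type; the integer width is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ttl(pub u16);

/// Mirror of the configuration fields listed in this section.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Config {
    pub active_view_capacity: usize,
    pub passive_view_capacity: usize,
    pub active_random_walk_length: Ttl,
    pub passive_random_walk_length: Ttl,
    pub shuffle_random_walk_length: Ttl,
    pub shuffle_active_view_count: usize,
    pub shuffle_passive_view_count: usize,
    pub shuffle_interval: Duration,
    pub neighbor_request_timeout: Duration,
}

impl Default for Config {
    /// Default values as documented in the comments of the struct listing.
    fn default() -> Self {
        Config {
            active_view_capacity: 5,
            passive_view_capacity: 30,
            active_random_walk_length: Ttl(6),
            passive_random_walk_length: Ttl(3),
            shuffle_random_walk_length: Ttl(6),
            shuffle_active_view_count: 3,
            shuffle_passive_view_count: 4,
            shuffle_interval: Duration::from_secs(60),
            neighbor_request_timeout: Duration::from_millis(500),
        }
    }
}

/// Hypothetical override for a small test swarm: smaller views,
/// everything else left at its default.
pub fn small_swarm_config() -> Config {
    Config {
        active_view_capacity: 3,
        passive_view_capacity: 12,
        ..Config::default()
    }
}
```

Struct update syntax keeps an override readable: only the fields that deviate from the paper's values appear at the call site.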
State Structure
pub struct State<PI, RG = ThreadRng> {
    me: PI,                                 // Our peer identity
    me_data: Option<PeerData>,              // Opaque data we share with peers
    pub active_view: IndexSet<PI>,          // Connected peers
    pub passive_view: IndexSet<PI>,         // Known but disconnected peers
    config: Config,
    shuffle_scheduled: bool,                // Whether shuffle timer is active
    rng: RG,                                // Random number generator
    stats: Stats,
    pending_neighbor_requests: HashSet<PI>, // Peers we've sent Neighbor to but no reply yet
    peer_data: HashMap<PI, PeerData>,       // Opaque data received from other peers
    alive_disconnect_peers: HashSet<PI>,    // Peers disconnecting but to keep in passive view
}
Messages (hyparview::Message)
| Message | Direction | Purpose |
|---|---|---|
| Join(Option<PeerData>) | New node → Contact | Sent to a known peer to join the swarm |
| ForwardJoin(ForwardJoin) | Propagated | Forwarded to active view to introduce a new member |
| Neighbor(Neighbor) | Bidirectional | Request to add sender to active view (with priority) |
| Disconnect(Disconnect) | Bidirectional | Notification that a peer is leaving or being demoted |
| Shuffle(Shuffle) | Initiated periodically | Sent to random peer to exchange passive view contacts |
| ShuffleReply(ShuffleReply) | Reply to Shuffle | Returns a random subset of our views to the origin |
Message Details
pub struct ForwardJoin<PI> {
    peer: PeerInfo<PI>, // The new peer's identity + optional data
    ttl: Ttl,           // Time-to-live, decremented per hop
}

pub struct Shuffle<PI> {
    origin: PI,               // Who initiated the shuffle
    nodes: Vec<PeerInfo<PI>>, // Random subset of our views
    ttl: Ttl,                 // Time-to-live for the random walk
}

pub struct Neighbor {
    priority: Priority, // High (cannot be denied) or Low (can be denied)
    data: Option<PeerData>,
}

pub struct Disconnect {
    alive: bool,    // If true, peer is still alive (just demoting)
    _respond: bool, // Obsolete, kept for wire compat
}
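The effect of the priority field on a Neighbor request can be sketched as a small decision function. The NeighborDecision enum and decide_neighbor function are hypothetical names introduced for illustration, and the Priority enum here is a local stand-in rather than the crate's definition.

```rust
/// Local stand-in for the priority carried by a Neighbor message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Priority {
    High, // cannot be denied
    Low,  // can be denied
}

/// Hypothetical result type describing how the receiver reacts.
#[derive(Debug, PartialEq, Eq)]
enum NeighborDecision {
    /// There is a free slot in the active view.
    Accept,
    /// The view is full, so one active peer must be evicted first.
    AcceptAfterEviction,
    /// The view is full and the request may be denied.
    Reject,
}

/// Decide how to answer a Neighbor request given the current active view size.
fn decide_neighbor(
    priority: Priority,
    active_len: usize,
    active_capacity: usize,
) -> NeighborDecision {
    let has_room = active_len < active_capacity;
    match (priority, has_room) {
        (_, true) => NeighborDecision::Accept,
        (Priority::High, false) => NeighborDecision::AcceptAfterEviction,
        (Priority::Low, false) => NeighborDecision::Reject,
    }
}
```

Keeping the decision separate from the state mutation makes the two priorities easy to test in isolation.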
Join Procedure (Step by Step)
- A new node sends Join(me_data) to a known contact peer.
- The contact peer adds the new node to its active view (evicting a random peer if necessary).
- The contact peer forwards ForwardJoin to all other peers in its active view with TTL = active_random_walk_length.
- Each peer receiving ForwardJoin:
  - If TTL == 0 or active view has ≤1 peer: sends Neighbor(High) to the new node (which adds it to active view).
  - If TTL == passive_random_walk_length: adds the new node to passive view.
  - Unless the first case applied, decrements TTL and forwards to a random active peer (different from sender).
- The Neighbor message establishes the bidirectional active connection. A Priority::High neighbor request must be accepted (potentially evicting a random active peer). A Priority::Low request is only accepted if there is room.
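The ForwardJoin handling described in the steps above can be sketched as a pure function of the TTL and the receiver's active view size. The Ttl stand-in, the ForwardJoinAction enum, and both functions are hypothetical and only model the decision, not the message sending.

```rust
/// Hypothetical stand-in for the crate's Ttl type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ttl(u16);

impl Ttl {
    fn expired(self) -> bool {
        self.0 == 0
    }
    fn next(self) -> Ttl {
        Ttl(self.0.saturating_sub(1))
    }
}

/// What a peer does with a received ForwardJoin.
#[derive(Debug, PartialEq, Eq)]
enum ForwardJoinAction {
    /// Send Neighbor(High) to the new node; the random walk ends here.
    SendNeighborHigh,
    /// Optionally store the new node in the passive view, then forward
    /// the message to a random active peer with the decremented TTL.
    Forward { add_to_passive: bool, next_ttl: Ttl },
}

fn on_forward_join(
    ttl: Ttl,
    active_len: usize,
    passive_random_walk_length: Ttl,
) -> ForwardJoinAction {
    if ttl.expired() || active_len <= 1 {
        return ForwardJoinAction::SendNeighborHigh;
    }
    ForwardJoinAction::Forward {
        add_to_passive: ttl == passive_random_walk_length,
        next_ttl: ttl.next(),
    }
}

/// Follow a walk through peers that all have `active_len` active neighbors.
/// Returns the number of forwards and the TTL at which some peer stored the
/// new node in its passive view, if that happened.
fn simulate_walk(start: Ttl, active_len: usize, prwl: Ttl) -> (u32, Option<Ttl>) {
    let mut ttl = start;
    let mut hops = 0;
    let mut stored_at = None;
    loop {
        match on_forward_join(ttl, active_len, prwl) {
            ForwardJoinAction::SendNeighborHigh => return (hops, stored_at),
            ForwardJoinAction::Forward { add_to_passive, next_ttl } => {
                if add_to_passive {
                    stored_at = Some(ttl);
                }
                ttl = next_ttl;
                hops += 1;
            }
        }
    }
}
```

With the default walk lengths (6 and 3) and well-connected peers, the walk is forwarded six times, one peer along the way records the new node in its passive view, and the peer that sees TTL 0 sends the Neighbor(High) request.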
Shuffle Mechanism
Periodically (every shuffle_interval), each node:
- Picks a random active peer.
- Sends Shuffle containing a random subset of active + passive views plus the origin's info, with a TTL.
- The shuffle message does a random walk (each hop decrements TTL).
- When TTL reaches 0 or the active view is ≤1, the peer accepts the shuffle and replies with ShuffleReply containing its own random peers.
- The origin receives ShuffleReply and adds new peers to its passive view.
This ensures the passive view remains fresh and provides good connectivity even in dynamic networks.
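The two ends of a shuffle exchange can be sketched as follows: building the node list for an outgoing Shuffle and merging a ShuffleReply into the passive view. Both function names and the u32 peer IDs are hypothetical. To stay deterministic, the sketch takes the first entries of each view and drops the oldest passive entry when full, where a real node would choose randomly.

```rust
/// Build the node list for an outgoing Shuffle: the origin itself plus up to
/// `active_count` active peers and `passive_count` passive peers.
/// Assumption: the first entries are taken instead of a random subset.
fn build_shuffle_nodes(
    me: u32,
    active: &[u32],
    passive: &[u32],
    active_count: usize,
    passive_count: usize,
) -> Vec<u32> {
    let mut nodes = vec![me];
    nodes.extend(active.iter().copied().take(active_count));
    nodes.extend(passive.iter().copied().take(passive_count));
    nodes
}

/// Merge the peers from a ShuffleReply into the passive view.
/// Peers we already know (ourselves, active, or passive) are skipped.
/// Returns how many peers were newly added.
fn merge_shuffle_reply(
    me: u32,
    active: &[u32],
    passive: &mut Vec<u32>,
    reply: &[u32],
    passive_capacity: usize,
) -> usize {
    if passive_capacity == 0 {
        return 0;
    }
    let mut added = 0;
    for &peer in reply {
        if peer == me || active.contains(&peer) || passive.contains(&peer) {
            continue;
        }
        if passive.len() >= passive_capacity {
            // Assumption: evict the oldest entry to make room.
            passive.remove(0);
        }
        passive.push(peer);
        added += 1;
    }
    added
}
```

With the default counts (3 active, 4 passive), an outgoing shuffle carries at most eight entries including the origin.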
Failure Recovery
When a peer in the active view disconnects (detected via PeerDisconnected):
- The peer is removed from the active view.
- A NeighborDown event is emitted.
- A random peer from the passive view is selected and sent a Neighbor(Low) request.
- If that peer doesn't respond within neighbor_request_timeout, it's removed from the passive view and another peer is tried.
- This continues until a connection is established or the passive view is exhausted.
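The recovery loop above can be sketched synchronously. The on_peer_disconnected function and the local Event enum are hypothetical. The responds closure stands in for sending Neighbor(Low) and waiting up to neighbor_request_timeout, returning false when no reply arrives in time. Candidates are tried in order rather than at random, which is an assumption made for determinism.

```rust
/// Local stand-in for the events emitted by the membership layer.
#[derive(Debug, PartialEq, Eq)]
enum Event {
    NeighborDown(u32),
    NeighborUp(u32),
}

/// Handle the loss of an active peer and try to refill the slot from the
/// passive view. Returns the events that would be emitted.
fn on_peer_disconnected(
    peer: u32,
    active: &mut Vec<u32>,
    passive: &mut Vec<u32>,
    mut responds: impl FnMut(u32) -> bool,
) -> Vec<Event> {
    let mut events = Vec::new();

    // Remove the peer from the active view; ignore peers that were not active.
    let before = active.len();
    active.retain(|p| *p != peer);
    if active.len() == before {
        return events;
    }
    events.push(Event::NeighborDown(peer));

    // Try passive peers until one answers or the passive view is exhausted.
    while !passive.is_empty() {
        let candidate = passive.remove(0);
        if responds(candidate) {
            active.push(candidate);
            events.push(Event::NeighborUp(candidate));
            break;
        }
        // No reply within the timeout: the candidate stays removed.
    }
    events
}
```

In the real state machine this is asynchronous: the timeout is driven by the PendingNeighborRequest timer rather than a blocking call.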
If a Disconnect(alive=true) message is received:
- The peer is moved to the passive view (not just dropped), because it's still alive.
- The alive_disconnect_peers set tracks which disconnected peers should be retained in passive view when their connection eventually closes.
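A minimal sketch of this bookkeeping follows. The MiniState type and its two methods are hypothetical. Two behaviors are assumptions not stated above: a Disconnect with alive=false drops the peer from the passive view, and a closed connection drops the peer from the passive view unless it was marked alive. Passive view capacity is ignored for brevity.

```rust
use std::collections::HashSet;

/// Hypothetical, reduced state holding only what this example needs.
#[derive(Debug, Default)]
struct MiniState {
    active: Vec<u32>,
    passive: Vec<u32>,
    alive_disconnect_peers: HashSet<u32>,
}

impl MiniState {
    /// Handle a received Disconnect message.
    fn on_disconnect_message(&mut self, from: u32, alive: bool) {
        self.active.retain(|p| *p != from);
        if alive {
            // The peer is only demoting us: keep it as a passive contact and
            // remember that its connection closing is expected.
            self.alive_disconnect_peers.insert(from);
            if !self.passive.contains(&from) {
                self.passive.push(from);
            }
        } else {
            // Assumption: a peer that is going away is forgotten entirely.
            self.passive.retain(|p| *p != from);
        }
    }

    /// Handle the underlying connection to `peer` closing.
    fn on_connection_closed(&mut self, peer: u32) {
        self.active.retain(|p| *p != peer);
        // `remove` returns true if the peer was marked as still alive.
        let keep_in_passive = self.alive_disconnect_peers.remove(&peer);
        if !keep_in_passive {
            // Assumption: an unexpected close means the contact is stale.
            self.passive.retain(|p| *p != peer);
        }
    }
}
```

The set is consumed on close, so a later unexpected failure of the same peer is treated like any other disconnect.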
PeerData
PeerData is an opaque Bytes type that peers exchange when joining. In the net module, it is used to serialize and transmit addressing information (AddrInfo):
struct AddrInfo {
    relay_url: Option<RelayUrl>,
    direct_addresses: BTreeSet<SocketAddr>,
}
This allows the gossip protocol itself to help propagate connectivity information, enabling the GossipAddressLookup service to feed addresses back into iroh's endpoint discovery system.
Events (hyparview::Event)
| Event | Meaning |
|---|---|
| NeighborUp(PI) | A peer was added to our active view |
| NeighborDown(PI) | A peer was removed from our active view |
These events are forwarded up to the PlumTree layer and to the application.
Timers
| Timer | Purpose |
|---|---|
| DoShuffle | Periodically trigger a shuffle operation |
| PendingNeighborRequest(PI) | Timeout for a pending neighbor request |
IO Trait Pattern
The HyParView state machine is generic over an IO trait:
pub trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}
This allows the protocol to emit output events without knowing about the networking layer. The upper layers supply a VecDeque<OutEvent> or similar container.
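A sketch of how a VecDeque could satisfy this trait is shown below. The trait signature is taken from the listing above; the OutEvent and Event enums are reduced, hypothetical stand-ins (the crate's real OutEvent has different variants), and announce_neighbor is an illustrative helper, not part of the crate.

```rust
use std::collections::VecDeque;

/// Reduced stand-in for the membership events.
#[derive(Debug, PartialEq, Eq)]
pub enum Event<PI> {
    NeighborUp(PI),
    NeighborDown(PI),
}

/// Hypothetical, reduced output event type.
#[derive(Debug, PartialEq, Eq)]
pub enum OutEvent<PI> {
    /// Ask the networking layer to send a named message to a peer.
    SendMessage(PI, &'static str),
    /// Surface a membership event to the layers above.
    EmitEvent(Event<PI>),
}

impl<PI> From<Event<PI>> for OutEvent<PI> {
    fn from(event: Event<PI>) -> Self {
        OutEvent::EmitEvent(event)
    }
}

pub trait IO<PI: Clone> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>);
}

/// A plain queue is enough to collect output events.
impl<PI: Clone> IO<PI> for VecDeque<OutEvent<PI>> {
    fn push(&mut self, event: impl Into<OutEvent<PI>>) {
        self.push_back(event.into());
    }
}

/// Protocol-side code depends only on the trait, never on the container.
pub fn announce_neighbor<PI: Clone>(peer: PI, io: &mut impl IO<PI>) {
    io.push(Event::NeighborUp(peer.clone()));
    io.push(OutEvent::SendMessage(peer, "Neighbor"));
}
```

Because push accepts anything convertible into OutEvent, protocol code can emit either a bare Event or a fully formed OutEvent, and the caller drains the queue in order afterwards.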