---
status: draft
last_updated: 2026-04-30
---
# Iroh Transport
P2P QUIC event target using iroh. More complex than the other transports due to NAT traversal, crypto identity, and byte-stream framing.
**Import**: `@alkdev/pubsub/event-target-iroh` (not yet implemented)
**Peer dep**: `@rayhanadev/iroh` (optional, NAPI-RS native addon)
## Why Iroh
WebSocket requires the hub to have a publicly reachable address. Iroh removes that requirement and adds capabilities WebSocket lacks:
1. **Hub behind NAT** — spokes dial by `NodeId` through relay servers, no public IP needed
2. **Spoke push** — hub can initiate connections to spokes by `NodeId` (impossible with WS without polling)
3. **P2P spoke-to-spoke** — direct spoke-to-spoke communication without routing through hub
4. **Cryptographic identity** — Ed25519 `NodeId` doubles as spoke authentication
## Iroh Binding
Using `@rayhanadev/iroh` (v0.1.1) as the NAPI-RS binding. Community binding, one author, no tests. It has everything needed for hub-spoke 1:1 bidirectional streams:
| Method | Purpose |
|--------|---------|
| `Endpoint.create()` / `createWithOptions({ alpns })` | Create QUIC endpoint |
| `Endpoint.connect(nodeId, alpn)` | Connect to a peer by public key |
| `Endpoint.accept()` | Accept incoming connection |
| `Endpoint.nodeId()` | Get our public key identity |
| `Connection.openBi()` | Open bidirectional stream (spoke side) |
| `Connection.acceptBi()` | Accept bidirectional stream (hub side) |
| `SendStream.writeAll(data)` | Send data on stream |
| `RecvStream.readExact(len)` | Read exact bytes from stream |
| `Connection.remoteNodeId()` | Get peer's public key |
| `Connection.sendDatagram()` / `readDatagram()` | Unreliable datagrams |
Not exposed (not critical): `Endpoint.watch_addr()`, `Connection.close_reason()`, `Connection.stats()`.
## Protocol
Single bidirectional QUIC stream per connection. Length-prefixed JSON messages.
### Framing
QUIC streams are byte streams (no message boundaries). We use a 4-byte big-endian length prefix:
```
[4 bytes: length N][N bytes: JSON payload]
```
`RecvStream.readExact(4)` reads the length, then `readExact(N)` reads the payload. This is trivial with iroh's `readExact()` API.
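The framing above can be sketched as follows. This is a minimal illustration, not the package's implementation: `SendStreamLike`, `RecvStreamLike`, `encodeFrame`, `readFrame`, `writeFrame`, and `MAX_FRAME_BYTES` are hypothetical names, and the sketch assumes the binding's `writeAll()` and `readExact()` are promise-based and exchange `Uint8Array`-compatible buffers.

```ts
// Structural stand-ins for the two binding methods used here.
// Assumption: both are promise-based and operate on Uint8Array.
interface SendStreamLike {
  writeAll(data: Uint8Array): Promise<void>;
}
interface RecvStreamLike {
  readExact(len: number): Promise<Uint8Array>;
}

// Hypothetical safety cap so a corrupt length prefix cannot trigger a huge read.
const MAX_FRAME_BYTES = 16 * 1024 * 1024;

/** Build [4-byte big-endian length N][N bytes of JSON] for one message. */
function encodeFrame(message: unknown): Uint8Array {
  const payload = new TextEncoder().encode(JSON.stringify(message));
  const frame = new Uint8Array(4 + payload.length);
  // The third argument `false` selects big-endian byte order.
  new DataView(frame.buffer).setUint32(0, payload.length, false);
  frame.set(payload, 4);
  return frame;
}

/** Write one framed message to the stream. */
async function writeFrame(send: SendStreamLike, message: unknown): Promise<void> {
  await send.writeAll(encodeFrame(message));
}

/** Read one frame: the 4-byte length first, then exactly that many payload bytes. */
async function readFrame(recv: RecvStreamLike): Promise<unknown> {
  const header = await recv.readExact(4);
  // Respect byteOffset in case the returned buffer is a view into a larger one.
  const length = new DataView(header.buffer, header.byteOffset, 4).getUint32(0, false);
  if (length > MAX_FRAME_BYTES) {
    throw new RangeError(`frame too large: ${length} bytes`);
  }
  const payload = await recv.readExact(length);
  return JSON.parse(new TextDecoder().decode(payload));
}
```

The size cap is not part of the protocol described here; it is a defensive choice so that a misaligned read does not turn into a multi-gigabyte allocation.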
### Message Format
Same `type` + `detail` shape as all other transports:
```json
{ "type": "call.requested", "detail": { ... } }
```
Maps directly to `new CustomEvent(type, { detail })`.
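Because the payload arrives from a remote peer, it is worth checking the shape before turning it into an event. The sketch below shows one way to do that; `WireMessage` and `parseMessage` are hypothetical names, and the validation rules (non-empty string `type`, any `detail`) are an assumption rather than something this document specifies.

```ts
// Hypothetical name for the { type, detail } shape shown above.
interface WireMessage {
  type: string;
  detail: unknown;
}

/** Parse a JSON payload and check that it has the { type, detail } shape. */
function parseMessage(json: string): WireMessage {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null) {
    throw new TypeError("message must be a JSON object");
  }
  const { type, detail } = value as { type?: unknown; detail?: unknown };
  // Assumption: an event type is always a non-empty string.
  if (typeof type !== "string" || type.length === 0) {
    throw new TypeError("message.type must be a non-empty string");
  }
  return { type, detail };
}
```

A validated message can then be passed on as `new CustomEvent(msg.type, { detail: msg.detail })`.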
## Two-Sided Design
Unlike Redis and WebSocket, Iroh has distinct hub and spoke connection patterns:
### Spoke Side
```ts
const conn = await endpoint.connect(hubNodeId, "alkhub/1");
const eventTarget = await createSpokeIrohEventTarget(conn);
```
Spoke opens the bidirectional stream with `openBi()`. The event target wraps the `SendStream` and `RecvStream`.
### Hub Side
```ts
const conn = await endpoint.accept();
const eventTarget = await createHubIrohEventTarget(conn);
```
Hub accepts the connection, then accepts the stream with `acceptBi()`. Same `TypedEventTarget` interface on both sides.
### Why Two Factories?
The connection initiator (spoke) calls `openBi()`. The listener (hub) calls `acceptBi()`. Both get `SendStream` + `RecvStream` — the framing and event handling are identical. The split is about connection establishment, not event handling. Could be unified as `createIrohEventTarget(sendStream, recvStream)` with separate helpers for connection, but the two-factory pattern makes the hub/spoke asymmetry explicit.
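The asymmetry can be reduced to a single branch, as the sketch below illustrates. All names here (`ConnectionLike`, `BiStreamLike`, `Role`, `openStreams`) are hypothetical, and the assumption that `openBi()` and `acceptBi()` resolve to a `{ send, recv }` pair has not been checked against the binding.

```ts
// Structural stand-ins for the binding types. Assumption: openBi()/acceptBi()
// resolve to an object holding the send and receive halves of the stream.
interface SendStreamLike {
  writeAll(data: Uint8Array): Promise<void>;
}
interface RecvStreamLike {
  readExact(len: number): Promise<Uint8Array>;
}
interface BiStreamLike {
  send: SendStreamLike;
  recv: RecvStreamLike;
}
interface ConnectionLike {
  openBi(): Promise<BiStreamLike>;
  acceptBi(): Promise<BiStreamLike>;
}

type Role = "spoke" | "hub";

/**
 * The only role-specific step: the spoke opens the stream, the hub accepts it.
 * Everything after this point (framing, event dispatch) is the same for both.
 */
async function openStreams(conn: ConnectionLike, role: Role): Promise<BiStreamLike> {
  return role === "spoke" ? conn.openBi() : conn.acceptBi();
}
```

With a helper like this, the two factories would differ only in the `role` they pass, which is the sense in which they could be unified.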
## Identity
`Connection.remoteNodeId()` returns the peer's Ed25519 public key. This is cryptographic identity — no separate API key exchange needed for authentication. The hub can verify that a connection comes from an expected spoke by checking its `NodeId`.
This is stronger than WebSocket's token-in-URL or first-message approach, because there is no shared secret to leak. The trade-off is revocation: disabling a spoke requires a denylist of `NodeId`s rather than rotating a token.
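A hub-side check along these lines might look like the sketch below. `NodeIdPolicy` and `isSpokeAuthorized` are hypothetical names, and the sketch assumes the `NodeId` is available as a string in one consistent encoding, which has not been verified against the binding.

```ts
// Hypothetical policy shape: known spokes plus an explicit denylist for revocation.
interface NodeIdPolicy {
  allowed: ReadonlySet<string>;
  denied: ReadonlySet<string>;
}

/**
 * Decide whether a connecting peer may register as a spoke.
 * Assumption: remoteNodeId is the string form of Connection.remoteNodeId().
 */
function isSpokeAuthorized(remoteNodeId: string, policy: NodeIdPolicy): boolean {
  // The denylist wins, so a revoked key stays revoked even if still listed as allowed.
  if (policy.denied.has(remoteNodeId)) return false;
  return policy.allowed.has(remoteNodeId);
}
```

Checking the denylist first keeps revocation a one-step operation: adding a key to `denied` is enough, without also having to edit `allowed`.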
## Connection Startup
On connection, both sides exchange the operations they expose (same `hub.register` pattern as WebSocket). The `NodeId` serves as identity — no separate API key exchange.
## Reconnection
Same pattern as WebSocket — detect connection failure, reconnect, re-register. QUIC handles multipath better than TCP but the application still needs reconnection logic.
Detection: `RecvStream.readExact()` throws on connection close. The event target should propagate this as an error event or let the caller handle it.
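One way to structure the detect, reconnect, re-register cycle is sketched below. Nothing here comes from the package: `ReconnectOptions`, `connectWithBackoff`, and `runWithReconnect` are hypothetical names, exponential backoff is an assumed policy, and `runSession` stands for whatever re-registers and then reads frames until `readExact()` throws.

```ts
// Hypothetical options; the backoff policy is an assumption, not part of this design.
interface ReconnectOptions {
  initialDelayMs: number;
  maxDelayMs: number;
  maxAttempts: number; // consecutive failed connect attempts before giving up
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Retry connect() with exponential backoff, rethrowing the last error on give-up. */
async function connectWithBackoff<C>(
  connect: () => Promise<C>,
  options: ReconnectOptions,
): Promise<C> {
  const sleep = options.sleep ?? defaultSleep;
  let delay = options.initialDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await connect();
    } catch (err) {
      if (attempt >= options.maxAttempts) throw err;
      await sleep(delay);
      delay = Math.min(delay * 2, options.maxDelayMs);
    }
  }
}

/**
 * Keep a session alive: connect, run the session, and reconnect when it fails.
 * runSession should re-register first, then read until the stream closes.
 */
async function runWithReconnect<C>(
  connect: () => Promise<C>,
  runSession: (conn: C) => Promise<void>,
  options: ReconnectOptions,
): Promise<void> {
  for (;;) {
    const conn = await connectWithBackoff(connect, options);
    try {
      await runSession(conn);
      return; // session ended cleanly, so stop reconnecting
    } catch {
      // Connection lost (for example readExact() threw): loop and reconnect.
    }
  }
}
```

In this sketch a session that fails immediately after every successful connect would reconnect without delay; a production version would likely want a minimum pause between sessions as well.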
## Browser Limitations
Iroh in browsers is relay-only (no UDP hole punching from the browser sandbox). This means:
- Browser spokes always route through relay servers
- WebSocketEventTarget is the right browser transport today (native, no extra deps)
- IrohEventTarget for browsers would use the WASM build over relay — future option
## Multi-Node (Future)
For 1:N fan-out, `iroh-gossip` is the right tool. No TS binding exists yet. Options:
1. Write a minimal Rust NAPI crate wrapping `iroh-gossip::Gossip.subscribe() + broadcast()`
2. Contribute gossip to `@rayhanadev/iroh`
3. Use hub as a relay point (hub receives once, fans out to each spoke's `IrohEventTarget` individually)
For now, 1:1 connections are sufficient. The hub fans out to multiple spokes by dispatching to each spoke's `IrohEventTarget` individually — same pattern as WebSocketEventTarget on the hub side.
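The hub-side fan-out described above can be sketched as a loop over per-spoke targets. `DispatchTarget` and `fanOut` are hypothetical names, the map keyed by `NodeId` string is an assumed bookkeeping structure, and the sketch assumes a failed dispatch surfaces as a synchronous throw, which may not match how the real event target reports write errors.

```ts
// Minimal structural type: anything with a dispatchEvent method.
interface DispatchTarget<E> {
  dispatchEvent(event: E): boolean;
}

/**
 * Dispatch one event to every connected spoke.
 * Returns the NodeIds whose dispatch threw, so the caller can drop or reconnect them.
 */
function fanOut<E>(
  spokes: ReadonlyMap<string, DispatchTarget<E>>,
  event: E,
): string[] {
  const failed: string[] = [];
  for (const [nodeId, target] of spokes) {
    try {
      target.dispatchEvent(event);
    } catch {
      // One broken spoke must not prevent delivery to the others.
      failed.push(nodeId);
    }
  }
  return failed;
}
```

The cost is one write per spoke, which is the reason gossip becomes attractive as the number of spokes grows.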
## Comparison with WebSocketEventTarget
| Aspect | WebSocket | Iroh |
|--------|-----------|------|
| Connection | `new WebSocket(url)` | `endpoint.connect(nodeId, alpn)` |
| Accept | Hono WS upgrade | `endpoint.accept()` |
| Identity | API key/token | Ed25519 NodeId (cryptographic, mutual) |
| NAT traversal | Requires reverse proxy / tunnel | Built-in (relay + hole punching) |
| Framing | WS frames (built-in) | QUIC stream (length-prefix needed) |
| Hub behind NAT | Not possible without tunneling | Yes |
| Browser | Yes (native) | Limited (WASM build, relay-only) |
| Native addon | No | Yes (NAPI-RS) |
## Open Questions
1. **Binding stability**: `@rayhanadev/iroh` has one author and no tests. If it breaks, we may need to fork or write our own NAPI wrapper. Mitigation: the API surface we use is small (10 methods) and the binding is thin.
2. **NAPI under Deno**: NAPI-RS `.node` binaries need testing under Deno 2.x. Since we're building with tsup for npm, the runtime is Node.js.
3. **Datagram support**: `sendDatagram`/`readDatagram` could be used for fire-and-forget events (no response expected). Not needed for hub-spoke but could be useful for broadcast. Deferred.