glm-5.1 13b0991fb8 Resolve all architecture open questions, add 13 ADRs, update specs

Resolved all 11 open questions based on project guidance:

Transport:
- OQ-01/OQ-07: ACME/Let's Encrypt with domain + IP paths (ADR-008)
- OQ-02: Default to n0 relay, --iroh-relay override (ADR-009)
- OQ-05: Transport chaining supported natively (ADR-010)

Client:
- OQ-06: Programmatic-first API, no ~/.ssh/config (ADR-011)

Server:
- OQ-04: Ed25519 + OpenSSH cert-authority, no password auth (ADR-012)
- OQ-08: fail2ban-friendly logging + built-in rate limiting (ADR-013)

TUN:
- OQ-03/OQ-09: Deferred entirely, recommend tun2proxy (ADR-014)
- tun-shim.md marked deprecated

NAPI:
- OQ-10: Expose both connect() and serve() (ADR-016)
- OQ-11: Use napi-rs for FFI bridge (ADR-015)

Additional ADRs created during review:
- ADR-006: No logging of tunnel destinations (was phantom reference)
- ADR-017: Stealth mode protocol multiplexing
- ADR-018: Control channel for pubsub over SSH

Fixed: ADR-002 status → Superseded, ADR-007 title typo,
WRAUTH_SERVER typo, ADR-005 stale wraith-tun refs,
undefined ACL feature removed from server.md,
--proxy semantic difference documented.

2026-06-01 17:31:28 +00:00

ADR-013: Fail2ban-Friendly Server Logging

Status

Accepted

Context

The server needs to handle abuse on public-facing deployments. Our production infrastructure runs fail2ban on Linux (documented in /workspace/system/dev1/fail2ban.md) with nftables and the systemd journal backend. fail2ban needs structured, parseable logs to identify abusive IP addresses.

However, fail2ban is primarily deployed on Linux. On other platforms (macOS, Windows, BSD) it is unavailable or uncommon, so users need a different way to reject abusive connections. The server should provide enough logging for fail2ban on Linux and enough built-in protection for other platforms.

Decision

The server logs connection and authentication events at INFO level with structured fields, and provides configurable per-IP connection and per-connection authentication limits as a built-in defense.

Logging (for fail2ban integration on Linux):

Log auth attempts: level=INFO, msg="auth attempt", remote_addr=<ip>, user=<user>, key_fingerprint=<sha256>, result=<accept|reject>
Log new connections: level=INFO, msg="connection opened", remote_addr=<ip>, transport=<tcp|tls|iroh>
Log disconnections: level=INFO, msg="connection closed", remote_addr=<ip>, duration=<secs>
Do NOT log: channel open targets, DNS resolutions, bytes transferred

This matches what fail2ban needs: source IP + failure indicator. Our existing fail2ban setup filters on similar fields for SSH and nginx.
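The ADR names the log fields but not their exact serialization, so the following is only a minimal sketch of how such lines could be emitted and matched. It assumes logfmt-style `key=value` output. The names `format_event`, `FAIL_RE`, and `failures_per_ip` are hypothetical and are not part of wraith. fail2ban filters use Python regular expressions with a `<HOST>` placeholder; the sketch uses a named group instead so it runs with the standard `re` module alone.

```python
import re
from collections import Counter

# Hypothetical serialization: the ADR lists the fields but not the wire
# format, so logfmt-style key=value pairs are an assumption of this sketch.
def format_event(level, msg, **fields):
    parts = [f"level={level}", f'msg="{msg}"']
    parts += [f"{key}={value}" for key, value in fields.items()]
    return " ".join(parts)

# Matches rejected auth attempts and captures the source address.
# In an actual fail2ban filter the named group would become <HOST>.
FAIL_RE = re.compile(
    r'msg="auth attempt" remote_addr=(?P<host>\S+) .*\bresult=reject\b'
)

def failures_per_ip(lines):
    """Count rejected auth attempts per source IP."""
    counts = Counter()
    for line in lines:
        match = FAIL_RE.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts

lines = [
    format_event("INFO", "connection opened",
                 remote_addr="203.0.113.7", transport="tcp"),
    format_event("INFO", "auth attempt", remote_addr="203.0.113.7",
                 user="root", key_fingerprint="SHA256:example", result="reject"),
    format_event("INFO", "auth attempt", remote_addr="203.0.113.7",
                 user="admin", key_fingerprint="SHA256:example", result="reject"),
    format_event("INFO", "auth attempt", remote_addr="198.51.100.2",
                 user="alice", key_fingerprint="SHA256:example", result="accept"),
]
print(failures_per_ip(lines))
```

Because `user` is supplied by the remote peer, a real serializer would need to quote or escape it so that a crafted username cannot inject a fake `result=` field into the line.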

Built-in rate limiting (for all platforms):

--max-connections-per-ip <n> (default: 0 = unlimited): reject new connections from an IP that already has N active connections
--max-auth-attempts <n> (default: 10): disconnect a connection after N failed auth attempts on it
Both limits are enforced at the SSH layer, before any channels are opened

With these limits, the server can reject obviously abusive connections even without fail2ban. Note that the per-IP connection cap is disabled by default and must be set explicitly.
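The two limits can be illustrated with a small sketch. It is written in Python for brevity and is not the server's implementation; the class names `ConnectionLimiter` and `AuthAttemptCounter` are hypothetical. It also assumes that "disconnect after N failed auth attempts" means the connection is dropped once the Nth failure is recorded.

```python
from collections import defaultdict

class ConnectionLimiter:
    """Tracks active connections per IP; a limit of 0 means unlimited."""

    def __init__(self, max_connections_per_ip=0):
        self.max_per_ip = max_connections_per_ip
        self.active = defaultdict(int)

    def try_open(self, ip):
        """Return True and count the connection, or False if over the cap."""
        if self.max_per_ip and self.active[ip] >= self.max_per_ip:
            return False
        self.active[ip] += 1
        return True

    def close(self, ip):
        """Release one connection slot for this IP."""
        if self.active.get(ip, 0) > 0:
            self.active[ip] -= 1
            if self.active[ip] == 0:
                del self.active[ip]  # keep the table from growing forever


class AuthAttemptCounter:
    """Per-connection counter for failed auth attempts."""

    def __init__(self, max_auth_attempts=10):
        self.max_attempts = max_auth_attempts
        self.failed = 0

    def record_failure(self):
        """Return True when the connection should be disconnected."""
        self.failed += 1
        return self.failed >= self.max_attempts


limiter = ConnectionLimiter(max_connections_per_ip=2)
print([limiter.try_open("203.0.113.7") for _ in range(3)])  # [True, True, False]
limiter.close("203.0.113.7")
print(limiter.try_open("203.0.113.7"))  # True
```

Keeping the per-IP table keyed only by addresses with active connections bounds its size by the number of open connections, which is the "connection tracking per IP" cost noted under Consequences.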

Consequences

Positive: fail2ban can parse wraith logs the same way it parses SSH and nginx logs on our production systems.
Positive: Built-in rate limiting provides protection on platforms without fail2ban.
Positive: No privacy-sensitive data in logs (no tunnel destinations).
Negative: Slightly more code in the server for connection tracking per IP.
Negative: Users with custom fail2ban filters need to write regexes for wraith's log format (documented examples are provided).

References

server.md
OQ-08 — resolved by this ADR
Production fail2ban setup: /workspace/system/dev1/fail2ban.md
