wraith/docs/architecture/decisions/013-fail2ban-friendly-logging.md
glm-5.1 13b0991fb8 Resolve all architecture open questions, add 13 ADRs, update specs
Resolved all 11 open questions based on project guidance:

Transport:
- OQ-01/OQ-07: ACME/Let's Encrypt with domain + IP paths (ADR-008)
- OQ-02: Default to n0 relay, --iroh-relay override (ADR-009)
- OQ-05: Transport chaining supported natively (ADR-010)

Client:
- OQ-06: Programmatic-first API, no ~/.ssh/config (ADR-011)

Server:
- OQ-04: Ed25519 + OpenSSH cert-authority, no password auth (ADR-012)
- OQ-08: fail2ban-friendly logging + built-in rate limiting (ADR-013)

TUN:
- OQ-03/OQ-09: Deferred entirely, recommend tun2proxy (ADR-014)
- tun-shim.md marked deprecated

NAPI:
- OQ-10: Expose both connect() and serve() (ADR-016)
- OQ-11: Use napi-rs for FFI bridge (ADR-015)

Additional ADRs created during review:
- ADR-006: No logging of tunnel destinations (was phantom reference)
- ADR-017: Stealth mode protocol multiplexing
- ADR-018: Control channel for pubsub over SSH

Fixed: ADR-002 status → Superseded, ADR-007 title typo,
WRAUTH_SERVER typo, ADR-005 stale wraith-tun refs,
undefined ACL feature removed from server.md,
--proxy semantic difference documented.
2026-06-01 17:31:28 +00:00


ADR-013: Fail2ban-Friendly Server Logging

Status

Accepted

Context

The server needs to handle abuse on public-facing deployments. Our production infrastructure uses fail2ban on Linux (documented in /workspace/system/dev1/fail2ban.md) with nftables and the systemd journal backend. fail2ban needs structured, parseable logs to identify abusive IP addresses.

However, fail2ban is Linux-specific. On other platforms (macOS, Windows, BSD), users need a different approach to reject abusive connections. The server should provide enough logging for fail2ban on Linux and enough built-in protection for other platforms.

Decision

The server logs connection and authentication events at INFO level with structured fields, and provides a configurable connection rate limiter as a built-in defense.

Logging (for fail2ban integration on Linux):

  • Log auth attempts: level=INFO, msg="auth attempt", remote_addr=<ip>, user=<user>, key_fingerprint=<sha256>, result=<accept|reject>
  • Log new connections: level=INFO, msg="connection opened", remote_addr=<ip>, transport=<tcp|tls|iroh>
  • Log disconnections: level=INFO, msg="connection closed", remote_addr=<ip>, duration=<secs>
  • Do NOT log: channel open targets, DNS resolutions, bytes transferred

This matches what fail2ban needs: a source IP (remote_addr) plus a failure indicator (result=reject). Our existing fail2ban setup filters on similar fields for SSH and nginx.
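As an illustration of the log shape above, here is a minimal Python sketch that renders an auth event and extracts the source address from rejected attempts. It assumes a space-separated key=value layout; the ADR lists the fields but does not fix the separator. The names `format_event`, `rejected_host`, and `REJECTED_AUTH` are hypothetical and not part of wraith. A real fail2ban filter would express the same match as a `failregex` using fail2ban's `<HOST>` tag rather than a Python named group.

```python
import json
import re


def _render(value):
    """Quote values containing whitespace, quotes, or '=' (assumed convention)."""
    text = str(value)
    if re.search(r'[\s"=]', text):
        # json.dumps escapes quotes and newlines, so a hostile user name
        # cannot break out of its field or start a new log line.
        return json.dumps(text)
    return text


def format_event(msg, **fields):
    """Render one event as: level=INFO msg="..." key=value ..."""
    parts = ["level=INFO", f'msg="{msg}"']
    parts += [f"{key}={_render(value)}" for key, value in fields.items()]
    return " ".join(parts)


# Anchoring result=reject at end of line means text injected into an
# earlier field (such as user) cannot fake a rejection.
REJECTED_AUTH = re.compile(
    r'msg="auth attempt" remote_addr=(?P<host>\S+) .*\bresult=reject$'
)


def rejected_host(line):
    """Return the remote address for a rejected auth attempt, else None."""
    match = REJECTED_AUTH.search(line)
    return match.group("host") if match else None


line = format_event(
    "auth attempt",
    remote_addr="203.0.113.7",
    user="deploy",
    key_fingerprint="SHA256:examplefingerprint",  # placeholder value
    result="reject",
)
print(line)
print(rejected_host(line))
```

Keeping `result` as the last field and anchoring the pattern to the end of the line is a design choice of this sketch: attacker-controlled fields such as `user` then cannot make an accepted login look like a rejected one.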

Built-in rate limiting (for all platforms):

  • --max-connections-per-ip <n> (default: 0 = unlimited): reject new connections from an IP that already has n active connections
  • --max-auth-attempts <n> (default: 10): disconnect after n failed auth attempts on a single connection
  • Rate limiting happens at the SSH layer, before channels are opened

This ensures that even without fail2ban, the server rejects obviously abusive connections.
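The two limits can be sketched as follows. This is an illustrative Python model under stated assumptions, not wraith's implementation: the class names `ConnectionLimiter` and `AuthAttemptCounter` are hypothetical, and only the defaults (0 = unlimited, 10 attempts) come from the flags above.

```python
class ConnectionLimiter:
    """Tracks active connections per IP (models --max-connections-per-ip)."""

    def __init__(self, max_connections_per_ip=0):
        self.max_connections_per_ip = max_connections_per_ip  # 0 = unlimited
        self._active = {}  # ip -> number of currently open connections

    def try_open(self, ip):
        """Count a new connection and return True, or return False at the limit."""
        limit = self.max_connections_per_ip
        current = self._active.get(ip, 0)
        if limit > 0 and current >= limit:
            return False
        self._active[ip] = current + 1
        return True

    def close(self, ip):
        """Release one connection slot; drop the entry when none remain."""
        remaining = self._active.get(ip, 0) - 1
        if remaining > 0:
            self._active[ip] = remaining
        else:
            self._active.pop(ip, None)


class AuthAttemptCounter:
    """Counts failed auth attempts on one connection (models --max-auth-attempts)."""

    def __init__(self, max_auth_attempts=10):
        self.max_auth_attempts = max_auth_attempts
        self.failures = 0

    def record_failure(self):
        """Return True once the connection should be disconnected."""
        self.failures += 1
        return self.failures >= self.max_auth_attempts


limiter = ConnectionLimiter(max_connections_per_ip=2)
results = [limiter.try_open("198.51.100.4") for _ in range(3)]
print(results)  # third attempt is rejected
limiter.close("198.51.100.4")
print(limiter.try_open("198.51.100.4"))  # a slot is free again
```

Removing an IP's entry once its last connection closes keeps the tracking table proportional to the number of currently connected addresses, which bounds the extra per-IP state noted under Consequences.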

Consequences

  • Positive: fail2ban can parse wraith logs the same way it parses SSH and nginx logs on our production systems.
  • Positive: Built-in rate limiting provides protection on platforms without fail2ban.
  • Positive: No privacy-sensitive data in logs (no tunnel destinations).
  • Negative: Slightly more code in the server for connection tracking per IP.
  • Negative: Users with custom fail2ban filters must write their own regex for wraith's log format (documented examples are provided).

References

  • server.md
  • OQ-08 — resolved by this ADR
  • Production fail2ban setup: /workspace/system/dev1/fail2ban.md