
glm-5.1 d3633b7839 docs: complete Phase 0 architecture — spec updates, review fixes, and link portability

Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to
reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy,
OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034)
for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05
in open-questions.md, deprecated hub terminology, undefined AuthService and noq
terms. Replace inline OQ text with cross-references per format rules. Add
ConfigServiceImpl definition to configuration.md. Port absolute workspace paths
to project-relative links by copying referenced docs (feasibility, certbot,
fail2ban, event_source_types) into docs/research/.

2026-06-07 11:27:52 +00:00


ADR-013: Fail2ban-Friendly Server Logging

Status

Accepted

Context

The server needs to handle abuse on public-facing deployments. Our production infrastructure uses fail2ban on Linux (documented in fail2ban.md) with nftables and the systemd journal backend. fail2ban needs structured, parseable logs to identify abusive IP addresses.

However, fail2ban is Linux-specific. On other platforms (macOS, Windows, BSD), users need a different approach to reject abusive connections. The server should provide enough logging for fail2ban on Linux and enough built-in protection for other platforms.

Decision

The server logs connection and authentication events at INFO level with structured fields, and provides a configurable connection rate limiter as a built-in defense.

Logging (for fail2ban integration on Linux):

Log auth attempts: level=INFO, msg="auth attempt", remote_addr=<ip>, user=<user>, key_fingerprint=<sha256>, result=<accept|reject>
Log new connections: level=INFO, msg="connection opened", remote_addr=<ip>, transport=<tcp|tls|iroh>
Log disconnections: level=INFO, msg="connection closed", remote_addr=<ip>, duration=<secs>
Do NOT log: channel open targets, DNS resolutions, bytes transferred

This matches what fail2ban needs: a source IP (remote_addr) and a failure indicator (result=reject). Our existing fail2ban setup filters on similar fields for SSH and nginx.
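
As a rough illustration of how a filter could pick offending addresses out of these events, the sketch below assumes each event is written as one line of logfmt-style key=value pairs. The ADR fixes the field names but not the serialization, so the line layout, the sample line, and the helpers `parse_fields` and `rejected_ip` are hypothetical. A real fail2ban filter would express the same match as a failregex rather than Python code.

```python
from __future__ import annotations

import re
from typing import Optional

# Assumed serialization: space-separated key=value pairs,
# with values optionally wrapped in double quotes.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> dict[str, str]:
    """Split one logfmt-style line into a dictionary of fields."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def rejected_ip(line: str) -> Optional[str]:
    """Return remote_addr if the line is a rejected auth attempt, else None."""
    fields = parse_fields(line)
    if fields.get("msg") == "auth attempt" and fields.get("result") == "reject":
        return fields.get("remote_addr")
    return None


# Hypothetical sample line using the field names from the list above.
sample = (
    'level=INFO msg="auth attempt" remote_addr=203.0.113.7 '
    "user=alice key_fingerprint=SHA256:abc result=reject"
)
```

Only `msg`, `result`, and `remote_addr` are needed to make a ban decision, which is why the remaining fields can change without breaking a filter built this way.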

Built-in rate limiting (for all platforms):

--max-connections-per-ip <n> (default: 0 = unlimited) — reject new connections from an IP that already has N active connections
--max-auth-attempts <n> (default: 10) — disconnect after N failed auth attempts from one connection
Rate limiting happens at the SSH layer, before channels are opened

This ensures that even without fail2ban, the server rejects obviously abusive connections.
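
The two limits can be sketched as a pair of small counters. This is a minimal illustration under stated assumptions, not the server's implementation: the class names `ConnectionLimiter` and `AuthAttempts` are hypothetical, and it assumes the connection is dropped on the Nth failed attempt. Only the defaults (0 for unlimited connections, 10 auth attempts) come from the flags above.

```python
from __future__ import annotations

from collections import defaultdict


class ConnectionLimiter:
    """Counts active connections per IP; max_per_ip=0 means unlimited."""

    def __init__(self, max_per_ip: int = 0) -> None:
        self.max_per_ip = max_per_ip
        self.active = defaultdict(int)

    def try_open(self, ip: str) -> bool:
        """Admit the connection unless the IP is already at its limit."""
        if self.max_per_ip and self.active[ip] >= self.max_per_ip:
            return False
        self.active[ip] += 1
        return True

    def close(self, ip: str) -> None:
        """Release one slot for the IP, dropping the entry once it is idle."""
        if self.active.get(ip, 0) > 0:
            self.active[ip] -= 1
            if self.active[ip] == 0:
                del self.active[ip]


class AuthAttempts:
    """Counts failed auth attempts on a single connection."""

    def __init__(self, max_attempts: int = 10) -> None:
        self.max_attempts = max_attempts
        self.failures = 0

    def record_failure(self) -> bool:
        """Record a failure; return True when the connection should be dropped."""
        self.failures += 1
        return self.failures >= self.max_attempts
```

A server would call `try_open` when a connection is accepted and `close` when it ends, and would keep one `AuthAttempts` instance per connection, so both checks run before any channel is opened.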

Consequences

Positive: fail2ban can parse alknet logs the same way it parses SSH and nginx logs on our production systems.
Positive: Built-in rate limiting provides protection on platforms without fail2ban.
Positive: No privacy-sensitive data in logs (no tunnel destinations).
Negative: Slightly more code in the server for connection tracking per IP.
Negative: Users with custom fail2ban filters must write their own regex for alknet's log format (documented examples provided).

References

server.md
OQ-08 — resolved by this ADR
Production fail2ban setup: fail2ban.md
