Expand architecture: multi-site Phase 1, multi-domain TLS, fix review issues

Promote multi-site support from Phase 2 to Phase 1 (ADR-010): the proxy
must support git.alk.dev and alk.dev from initial release. Add multi-domain
TLS configuration (ADR-011): acme_domains array replaces acme_domain string,
single SAN certificate via rustls-acme.

Key changes:
- ADR-010: Multi-site in Phase 1 — avoids config format migration later
- ADR-011: Multi-domain TLS — single SAN cert, acme_domains Vec<String>
- ADR-002: Updated rationale for multi-site (one upstream per domain)
- overview.md: Phase 1 now includes multi-site, alk.dev pass-through,
  dual licensing (MIT OR Apache-2.0), real IP removed
- config.md: acme_domain → acme_domains, TOML example shows both sites,
  validation adds unique host check, real IP replaced with 203.0.113.10
- tls.md: Multi-domain SNI section moved from Future to current, manual
  mode uses ResolvesServerCert for SNI mapping, TOML header fixed
- proxy.md: Updated for multi-site, removed single-domain language
- operations.md: RFC 5737 documentation IPs, clarified rate limit eviction
  semantics (distinct scan interval vs eviction age)
- open-questions.md: OQ-05 resolved (single bind_addr sufficient), new
  OQ-07 (per-site TLS overrides)

Review fixes:
- acme_domains (plural) consistently used across all docs and diagram
- ADR-011 clearly scopes acme_domain as previous design
- Inline decision rationale extracted: tls.md hot-reload → ADR-004 ref,
  config.md static/dynamic → ADR-008 ref
- TOML section headers consistent (server.tls)
2026-06-11 08:50:03 +00:00
parent 8ee6284b62
commit 7efc142406
10 changed files with 356 additions and 108 deletions


@@ -42,9 +42,10 @@ Requests` and logs the event with structured fields.
### State Eviction
The per-IP token bucket state grows over time as new IPs are seen. A
-background task runs at a configurable interval (default: 60 seconds) and
-removes entries that haven't been accessed within the cleanup interval. This
-prevents unbounded memory growth.
+background task runs every 60 seconds (configurable) and removes entries
+whose last access timestamp is older than a configurable eviction age
+(default: 300 seconds / 5 minutes). This prevents unbounded memory growth
+while preserving recent entries that may still receive traffic.
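The eviction rule described in the updated paragraph could be sketched roughly as follows. This is an illustrative sketch only, not the project's implementation: `BucketState`, its fields, and `evict_stale` are hypothetical names, and the map layout is an assumption. It shows the scan interval (how often the function is called) kept separate from the eviction age (how old an entry must be before it is dropped).

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical per-IP state; the real token bucket fields are not shown in the docs.
#[derive(Debug, Clone)]
pub struct BucketState {
    pub tokens: f64,
    pub last_access: Instant,
}

/// Removes entries whose last access is older than `eviction_age`
/// and returns how many entries were evicted.
///
/// A background task would call this once per scan interval (for example
/// every 60 seconds) with an eviction age such as 300 seconds.
pub fn evict_stale(
    buckets: &mut HashMap<IpAddr, BucketState>,
    now: Instant,
    eviction_age: Duration,
) -> usize {
    let before = buckets.len();
    // Keep an entry only if the time since its last access does not exceed the age limit.
    buckets.retain(|_, state| {
        now.saturating_duration_since(state.last_access) <= eviction_age
    });
    before - buckets.len()
}
```

Passing `now` in as a parameter, rather than calling `Instant::now()` inside the function, keeps the rule easy to test with synthetic timestamps.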
### Fail2ban Integration
@@ -55,7 +56,7 @@ format decision.
The log format uses `key=value` pairs with a `RATE_LIMIT` prefix:
```
-RATE_LIMIT client_ip=X.X.X.X host=Y.Z path=/W status=429
+RATE_LIMIT client_ip=203.0.113.50 host=Y.Z path=/W status=429
```
A corresponding fail2ban filter and jail configuration are provided as part
@@ -71,15 +72,15 @@ log entries:
1. **Access logs**: Every proxied request is logged at `info` level with
structured fields.
-```
-REQUEST client_ip=1.2.3.4 host=git.alk.dev method=GET path=/user/repo status=200 upstream=127.0.0.1:3000 duration_ms=45
-```
+```
+REQUEST client_ip=203.0.113.50 host=git.alk.dev method=GET path=/user/repo status=200 upstream=127.0.0.1:3000 duration_ms=45
+```
2. **Event logs**: Rate limits, TLS errors, upstream failures, config reloads,
etc.
```
-RATE_LIMIT client_ip=1.2.3.4 host=git.alk.dev path=/login status=429
+RATE_LIMIT client_ip=203.0.113.50 host=git.alk.dev path=/login status=429
UPSTREAM_ERROR host=git.alk.dev upstream=127.0.0.1:3000 error="connection refused"
CONFIG_RELOAD status=success sites=1
```
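For readers who want to consume these log lines programmatically, the prefix plus `key=value` layout shown above could be parsed along these lines. This is a minimal sketch under stated assumptions: `parse_event` is a hypothetical helper, not part of the project, and it assumes values are either unquoted with no spaces or wrapped in double quotes with no escaped quotes inside.

```rust
/// Splits a log line into its prefix (e.g. `RATE_LIMIT`) and ordered key/value fields.
/// Returns `None` for empty lines or fields that lack `=` or a closing quote.
///
/// Assumption: quoted values contain no escaped double quotes.
pub fn parse_event(line: &str) -> Option<(String, Vec<(String, String)>)> {
    let line = line.trim();
    // The first space-separated token is the event prefix.
    let (prefix, mut rest) = match line.split_once(' ') {
        Some((p, r)) => (p, r.trim_start()),
        None => (line, ""),
    };
    if prefix.is_empty() {
        return None;
    }
    let mut fields = Vec::new();
    while !rest.is_empty() {
        let (key, after_key) = rest.split_once('=')?;
        let (value, remaining) = if let Some(quoted) = after_key.strip_prefix('"') {
            // Quoted value: read up to the closing double quote.
            quoted.split_once('"')?
        } else {
            // Bare value: read up to the next space or the end of the line.
            match after_key.split_once(' ') {
                Some((v, r)) => (v, r),
                None => (after_key, ""),
            }
        };
        fields.push((key.to_string(), value.to_string()));
        rest = remaining.trim_start();
    }
    Some((prefix.to_string(), fields))
}
```

Fields are returned as an ordered list rather than a map so that the original field order in the log line is preserved.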