Resolve all architecture review findings (7 critical, 14 warnings, 6 suggestions)
Critical findings resolved:
- C1: Site routing is global (per-listener TOML, global runtime lookup)
- C2: X-Forwarded-For replaces (not appends); edge proxy model (ADR-021)
- C3: Hop-by-hop header handling rules specified (proxy.md)
- C4: ACME failure behavior defined (tls.md)
- C5: Startup sequence with fail-fast semantics (operations.md)
- C6: Per-listener Router instances with shared global state (overview.md)
- C7: Rate limiter adopts new params on next request, no state clear (operations.md)

Warnings resolved:
- W1: Admin socket wire protocol specified
- W2: Host header port stripped, hostnames only in config
- W3: HTTP redirect URL construction with port handling
- W4: /health on HTTPS matches regardless of Host header
- W5: Static config changes logged as warning during reload
- W6: Reload operations serialized via Mutex
- W7: http_port validation rules added (9 new rules total)
- W8: upstream format validation (host:port required, no scheme)
- W9: TLS error handling table (SNI, version, cipher failures)
- W10: IPv6 rate limited per /64 prefix
- W11: Graceful shutdown sequence specified (6 steps)
- W12: Error response bodies: minimal plain text, no version disclosure
- W13: upstream_scheme HTTPS uses system CA store
- W14: allow_wildcard_bind is OR between config and CLI
- W15: ADR-010 Phase 2 list updated (timeouts moved to Phase 1)
- W17: LoggingConfig static/restart note added

Suggestions applied:
- S2: ConnectInfo propagation note
- S3: Case-insensitive host matching (RFC 7230)
- S5: Response streaming behavior (chunk-by-chunk)
- S6: Token bucket nodelay semantics
- S7: File watching explicitly out of scope
- S8: All paths forwarded without filtering
- S9: shutdown_timeout_secs referenced in shutdown description
- S11: Consolidated defaults table in config.md
@@ -36,6 +36,12 @@ Rate limits are global per-IP in Phase 1 (not per-site). A request from IP
address X counts against the same bucket regardless of which site it targets.
Per-site rate limits may be added in Phase 2.

The token bucket uses **nodelay** semantics matching nginx's `limit_req`
directive with `burst=<n> nodelay`: when the bucket is empty, the request is
immediately rejected with 429; requests are not queued. Tokens are added at a
rate of `requests_per_second` (one token every 1000 ms / `requests_per_second`),
and the bucket capacity is the `burst` value.

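The nodelay behavior described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `TokenBucket` type, its fields, and the `try_acquire` signature are hypothetical names, and the sketch assumes a new bucket starts full.

```rust
use std::time::Instant;

/// Hypothetical token bucket; names and layout are illustrative assumptions.
pub struct TokenBucket {
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Assumption: a bucket for a newly seen client starts full.
    pub fn new(burst: u32, now: Instant) -> Self {
        Self { tokens: f64::from(burst), last_refill: now }
    }

    /// Returns `true` if the request may proceed and `false` if it should be
    /// rejected with 429. Nothing is queued or delayed (nodelay).
    pub fn try_acquire(&mut self, requests_per_second: f64, burst: u32, now: Instant) -> bool {
        // Refill: one token per (1 / requests_per_second) seconds elapsed,
        // never exceeding the bucket capacity (`burst`).
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.last_refill = now;
        self.tokens = (self.tokens + elapsed * requests_per_second).min(f64::from(burst));

        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Passing the clock in as `now` keeps the sketch deterministic to test; a real middleware would more likely call `Instant::now()` itself.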
When a request exceeds the rate limit, the middleware returns `429 Too Many
Requests` and logs the event with structured fields.

@@ -47,6 +53,37 @@ whose last access timestamp is older than a configurable eviction age
(default: 300 seconds / 5 minutes). This prevents unbounded memory growth
while preserving recent entries that may still receive traffic.

### Config Reload Behavior

When rate limit parameters change (e.g., from 10 req/s burst 20 to 20 req/s
burst 40), the behavior is:

1. New `DynamicConfig` is swapped in via ArcSwap.
2. On the next request from an existing IP, the rate limiter reads the current
   `DynamicConfig` for rate/burst parameters.
3. The token bucket refills using the new rate, and its capacity is set to the
   new burst maximum.
4. If the current token count exceeds the new burst maximum, it is capped to
   the new burst maximum.

The HashMap is **not** cleared, because clearing it would reset every client's
bucket and create a rate-limiting gap. Existing buckets adopt new parameters on
their next request. The eviction task continues removing stale entries
independently.

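A std-only sketch of the reload steps above. It is an illustration under stated assumptions rather than the actual code: `RwLock<Arc<DynamicConfig>>` stands in for `ArcSwap<DynamicConfig>` so the example needs no external crate, and `RateLimiter`, `Bucket`, `check`, `reload`, and `tokens` are hypothetical names.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::sync::{Arc, Mutex, RwLock};
use std::time::Instant;

/// Only the two fields relevant to rate limiting are modeled here.
#[derive(Clone, Copy)]
pub struct DynamicConfig {
    pub requests_per_second: f64,
    pub burst: u32,
}

struct Bucket {
    tokens: f64,
    last_seen: Instant,
}

pub struct RateLimiter {
    // Stand-in for ArcSwap<DynamicConfig> (assumption made to stay std-only).
    config: RwLock<Arc<DynamicConfig>>,
    buckets: Mutex<HashMap<IpAddr, Bucket>>,
}

impl RateLimiter {
    pub fn new(cfg: DynamicConfig) -> Self {
        Self { config: RwLock::new(Arc::new(cfg)), buckets: Mutex::new(HashMap::new()) }
    }

    /// Step 1: swap in the new config. The bucket map is deliberately untouched.
    pub fn reload(&self, cfg: DynamicConfig) {
        *self.config.write().unwrap() = Arc::new(cfg);
    }

    /// Steps 2-4: read the current config on every request, refill at the
    /// current rate, and cap the token count to the current burst.
    pub fn check(&self, ip: IpAddr, now: Instant) -> bool {
        let cfg: DynamicConfig = **self.config.read().unwrap();
        let cap = f64::from(cfg.burst);
        let mut buckets = self.buckets.lock().unwrap();
        let b = buckets.entry(ip).or_insert(Bucket { tokens: cap, last_seen: now });
        let elapsed = now.saturating_duration_since(b.last_seen).as_secs_f64();
        b.last_seen = now;
        b.tokens = (b.tokens + elapsed * cfg.requests_per_second).min(cap);
        if b.tokens >= 1.0 {
            b.tokens -= 1.0;
            true
        } else {
            false
        }
    }

    /// Test helper: current token count for an IP, if a bucket exists.
    pub fn tokens(&self, ip: IpAddr) -> Option<f64> {
        self.buckets.lock().unwrap().get(&ip).map(|b| b.tokens)
    }
}
```

Because the parameters are read per request instead of being stored in each bucket, a reload needs no pass over the map: a bucket holding 39 tokens under burst 40 is capped to 20 the next time its IP is seen under burst 20.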
### IPv6 Rate Limiting

IPv6 addresses have a vastly larger address space than IPv4. Rate limiting per
individual IPv6 address (`/128`) is ineffective against attackers who can
generate many addresses within a `/64` prefix.

- **IPv4**: Rate limited per individual address (`/32`).
- **IPv6**: Rate limited per `/64` prefix. All addresses in the same `/64` share
  the same token bucket. This matches RFC 4941 privacy extension boundaries and
  common anti-abuse practice.

The rate limiter normalizes IPv6 addresses to their `/64` prefix before
bucket lookup.

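One way the `/64` normalization could look, using only `std::net`. The function name `rate_limit_key` is hypothetical. The handling of IPv4-mapped IPv6 addresses (`::ffff:a.b.c.d`) is an assumption that the text above does not cover: without it, every IPv4-mapped client would collapse into a single `/64` bucket.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Maps a client address to the key used for bucket lookup:
/// IPv4 stays per-address (/32), IPv6 is truncated to its /64 prefix.
pub fn rate_limit_key(addr: IpAddr) -> IpAddr {
    match addr {
        IpAddr::V4(_) => addr,
        IpAddr::V6(v6) => {
            // Assumption (not stated in the source): treat IPv4-mapped
            // addresses as plain IPv4 so they keep per-address limits.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return IpAddr::V4(v4);
            }
            // Keep the first four 16-bit segments (64 bits), zero the rest.
            let s = v6.segments();
            IpAddr::V6(Ipv6Addr::new(s[0], s[1], s[2], s[3], 0, 0, 0, 0))
        }
    }
}
```

Using the normalized `IpAddr` as the map key means the existing `HashMap<IpAddr, TokenBucket>` type can stay as it is.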
### Fail2ban Integration

Rate limit events are logged in a structured format that a custom fail2ban
@@ -225,13 +262,46 @@ process does not exit on SIGHUP.
The admin Unix domain socket provides programmatic config reload with feedback.
This is useful for CI/CD pipelines and automation tools. See ADR-014 for the
-command protocol.
rationale.

-### Timeout
**Protocol:**

-In-flight requests have a configurable shutdown timeout (default: 30 seconds).
-After the timeout, remaining connections are forcefully closed and the process
-exits.
- **Connection lifecycle**: One command per connection. Client connects, sends
  one newline-terminated command, receives one newline-terminated JSON
  response, then the server closes the connection.
- **Message framing**: Newline-delimited (`\n`). Responses end with `\n`.
- **Commands**:
  - `reload`: Re-read config file, validate, and swap DynamicConfig. Returns
    `{"status": "ok"}` or `{"status": "error", "message": "..."}`.
  - `status`: Return basic process info. Returns, for example,
    `{"status": "ok", "uptime_secs": 1234, "sites": 2}`.
- **Error responses**: Unrecognized commands return
  `{"status": "error", "message": "unknown command: <cmd>"}`. Invalid or empty
  input returns `{"status": "error", "message": "invalid input"}`.
- **Concurrency**: Multiple clients can connect simultaneously, but reload
  operations are serialized (see Config Reload section in config.md).
- **Socket cleanup**: The proxy removes any stale socket file at startup
  before binding. If the file exists and another process is listening, a warning
  is logged and the admin socket is disabled (but the proxy continues starting).

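The command handling described above can be sketched as a pure function from one request line to one response line. Everything named here (`Status`, `handle_command`, `json_escape`, the `reload` callback) is hypothetical. JSON is built by hand only to keep the example free of dependencies, and a real implementation would more likely use a JSON library. The sketch treats empty input as "invalid input"; how other invalid input (such as non-UTF-8 bytes) is detected is left out.

```rust
/// Hypothetical snapshot of the values reported by `status`.
pub struct Status {
    pub uptime_secs: u64,
    pub sites: usize,
}

/// Minimal JSON string escaping for echoed text (quotes, backslashes, controls).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Turns one newline-terminated command into one newline-terminated JSON reply.
/// `reload` is a caller-supplied callback that re-reads and swaps the config.
pub fn handle_command(
    line: &str,
    status: &Status,
    reload: impl FnOnce() -> Result<(), String>,
) -> String {
    let body = match line.trim() {
        "" => r#"{"status": "error", "message": "invalid input"}"#.to_string(),
        "reload" => match reload() {
            Ok(()) => r#"{"status": "ok"}"#.to_string(),
            Err(e) => format!(r#"{{"status": "error", "message": "{}"}}"#, json_escape(&e)),
        },
        "status" => format!(
            r#"{{"status": "ok", "uptime_secs": {}, "sites": {}}}"#,
            status.uptime_secs, status.sites
        ),
        other => format!(
            r#"{{"status": "error", "message": "unknown command: {}"}}"#,
            json_escape(other)
        ),
    };
    body + "\n"
}
```

Keeping the handler free of socket I/O makes the one-command-per-connection lifecycle easy to wrap around it: read one line, call the function, write the reply, close.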
### Shutdown Sequence

On SIGTERM or SIGINT, the proxy performs a graceful shutdown:

1. **Stop accepting new connections**: Close all TCP listening sockets. No new
   connections are accepted.
2. **Close idle keep-alive connections**: Send `Connection: close` on any idle
   connections in the keep-alive pool.
3. **Wait for in-flight requests**: Wait up to `shutdown_timeout_secs`
   (default: 30) for active requests to complete.
4. **Force-close remaining connections**: After the timeout, any remaining
   connections are forcefully closed via TCP RST.
5. **Cancel background tasks**: ACME renewal tasks, rate limiter eviction task,
   and admin socket listener are all cancelled.
6. **Exit with code 0**.

`shutdown_timeout_secs` is configurable in StaticConfig (default: 30
seconds). See config.md for details.

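Steps 3 and 4 of the shutdown sequence amount to a bounded wait. The sketch below shows that idea with blocking std primitives only. It assumes a shared in-flight request counter, which the text does not describe, and the names `wait_for_in_flight` and `DrainOutcome` are hypothetical. The proxy itself presumably does this with async tasks and timers instead of a polling loop.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum DrainOutcome {
    /// All in-flight requests finished before the deadline.
    Drained,
    /// The deadline passed; `remaining` connections must be force-closed.
    TimedOut { remaining: usize },
}

/// Waits until `in_flight` reaches zero or `timeout` (shutdown_timeout_secs)
/// elapses, whichever comes first.
pub fn wait_for_in_flight(in_flight: &AtomicUsize, timeout: Duration) -> DrainOutcome {
    let deadline = Instant::now() + timeout;
    loop {
        let remaining = in_flight.load(Ordering::Acquire);
        if remaining == 0 {
            return DrainOutcome::Drained;
        }
        if Instant::now() >= deadline {
            return DrainOutcome::TimedOut { remaining };
        }
        // Polling interval is an arbitrary choice for this sketch.
        thread::sleep(Duration::from_millis(10));
    }
}
```

A caller would pass `Duration::from_secs(shutdown_timeout_secs)` and, on `TimedOut`, proceed to the force-close step before cancelling background tasks.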
## Deployment
@@ -443,6 +513,51 @@ continues to be routed directly to the Gitea container via Docker port
publishing (e.g., `203.0.113.10:22:2222`), matching the current deployment
pattern.

## Startup Sequence

The proxy starts components in a specific order to ensure fail-fast behavior
and correct dependency initialization:

1. **Parse and validate config**: Read the TOML config file, deserialize into
   `StaticConfig` and `DynamicConfig`, and validate all rules. If validation
   fails, log the errors and exit with a non-zero code. No ports are bound.

2. **Initialize DynamicConfig**: Load sites, rate limits, and body limits into
   `ArcSwap<DynamicConfig>`.

3. **Initialize shared state**: Create the rate limiter
   `HashMap<IpAddr, TokenBucket>`, the shared `hyper::Client`, and the
   `tracing-subscriber` with file and stdout layers.

4. **Bind health check port** (if enabled): Bind `127.0.0.1:{health_check_port}`.
   Fail-fast if bind fails.

5. **Bind admin socket** (if enabled): Remove any stale socket file first, then
   bind the Unix domain socket. If the socket file exists and another process is
   listening, log a warning and disable the admin socket, but continue starting
   (the admin socket is non-critical).

6. **Bind all listener ports**: For each listener, bind the HTTP port (if
   enabled) and the HTTPS port. If any bind fails, fail-fast and exit. All ports
   are bound before proceeding.

7. **Load TLS configuration**: For each listener, load manual certificates or
   initialize the ACME state machine. If manual certificate loading fails,
   fail-fast and exit. For ACME: if no cached certificate exists and ACME
   provisioning fails, fail-fast and exit.

8. **Start TCP listeners**: Begin accepting connections on all bound ports.

9. **Start background tasks**: ACME renewal tasks (per listener in ACME mode),
   rate limiter eviction task, signal handler task, admin socket handler task.

10. **Signal readiness**: Send `sd_notify("READY=1")` to systemd (if running
    under systemd).

**Failure semantics**: **Fail-fast**. If any step fails, the process exits with
a non-zero code. The only exception is the non-critical admin socket bind
(step 5), whose failure is logged as a warning. Apart from that, the proxy does
not partially start. All ports are bound before any connections are accepted.

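The ordering and failure semantics above can be modeled as a small driver loop. This is a sketch of the control flow only: the `Step` enum, `STARTUP_ORDER`, and `run_startup` are hypothetical names, and the closure stands in for the real work of each step.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    ParseConfig,
    InitDynamicConfig,
    InitSharedState,
    BindHealthPort,
    BindAdminSocket,
    BindListeners,
    LoadTls,
    StartListeners,
    StartBackgroundTasks,
    SignalReady,
}

/// Steps in the documented order: every port is bound (steps 4-6) and TLS is
/// loaded (step 7) before any listener starts accepting (step 8).
pub const STARTUP_ORDER: [Step; 10] = [
    Step::ParseConfig,
    Step::InitDynamicConfig,
    Step::InitSharedState,
    Step::BindHealthPort,
    Step::BindAdminSocket,
    Step::BindListeners,
    Step::LoadTls,
    Step::StartListeners,
    Step::StartBackgroundTasks,
    Step::SignalReady,
];

impl Step {
    /// Only the admin socket bind is allowed to fail without aborting startup.
    pub fn is_critical(self) -> bool {
        self != Step::BindAdminSocket
    }
}

/// Runs each step in order. Returns the non-critical steps that failed, or the
/// first critical failure, at which point the caller exits with a non-zero code.
pub fn run_startup(
    mut run: impl FnMut(Step) -> Result<(), String>,
) -> Result<Vec<Step>, (Step, String)> {
    let mut degraded = Vec::new();
    for step in STARTUP_ORDER.iter().copied() {
        if let Err(err) = run(step) {
            if step.is_critical() {
                return Err((step, err));
            }
            eprintln!("warning: {:?} failed, continuing: {}", step, err);
            degraded.push(step);
        }
    }
    Ok(degraded)
}
```

In `main`, an `Err` from `run_startup` would map to logging the error and calling `std::process::exit` with a non-zero code, so no later step ever runs after a critical failure.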
## Design Decisions

All design decisions are documented as ADRs in [decisions/](decisions/).