Update architecture specs to reflect live deployment findings and fix two bugs

Architecture updates based on gaps discovered during live deployment testing:

- ADR-023: HTTP/2 client-facing support via ALPN-based protocol detection.
  The spec previously said HTTP/2 was out of scope, but the deployment
  revealed that modern browsers negotiate HTTP/2 via ALPN. The proxy now
  correctly detects the negotiated ALPN protocol and uses the appropriate
  HTTP server builder (http2::Builder for h2, auto::Builder for http/1.1).
  Upstream connections remain HTTP/1.1. Host resolution now falls back to
  URI host for HTTP/2 :authority pseudo-headers.

- ADR-024: ANSI-disabled logging. All tracing-subscriber layers now use
  with_ansi(false) to prevent ANSI escape codes in log output, which broke
  fail2ban regex matching in Docker deployments. Also documents the fail2ban
  regex anchor fix (^RATE_LIMIT → RATE_LIMIT).

Bug fixes found by architecture review:

- Fix missing ALPN protocols in manual TLS mode. build_manual_server_config
  and build_multi_domain_server_config did not set alpn_protocols, meaning
  manual TLS mode could not support HTTP/2. Added h2 and http/1.1 ALPN
  entries to both functions (acme-tls/1 only in ACME mode).

- Fix missing with_ansi(false) in JSON log format. The init_json function
  with file output did not disable ANSI on stdout or file layers, which would
  break fail2ban in production JSON logging mode.

Other spec updates:

- All document statuses updated from draft to reviewed
- proxy.md: documented Server header removal, upstream HTTPS client,
  two-phase timeout enforcement, HTTP/2 host resolution, connect timeout
- tls.md: documented ALPN configuration differing by mode (ACME vs manual)
- overview.md: added HTTP/2 client-facing support to scope, updated crate
  deps (hyper-rustls, rustls-native-certs, hyper-util), clarified out-of-scope
- config.md: fixed http_port type (u16→u32) to match implementation, added
  ANSI-disabled note for LoggingConfig
- operations.md: documented ANSI-disabled logging, fail2ban regex anchor
- open-questions.md: updated OQ-09 resolution (connect timeout fully
  implemented), OQ-10 (C2 bug is fixed)
This commit is contained in:
2026-06-12 11:28:31 +00:00
parent c2eefddb4f
commit 0d54eba41e
11 changed files with 313 additions and 66 deletions

View File

@@ -1,5 +1,5 @@
---
status: draft
status: reviewed
last_updated: 2026-06-12
---
@@ -114,15 +114,14 @@ last_updated: 2026-06-12
- **Origin**: Implementation review finding W4, ADR-015, ADR-017
- **Status**: resolved
- **Priority**: medium
- **Resolution**: This is an implementation gap, not an architectural unknown.
The architecture already specifies a 5-second default connect timeout
separate from the request timeout (ADR-015, ADR-017), and `SiteConfig`
already includes `upstream_connect_timeout_secs`. The implementation must
wire this field to hyper's `connect_timeout` parameter. If hyper's API
doesn't expose a separate connect timeout, a two-phase `tokio::time::timeout`
approach should be used for Phase 2. For Phase 1, the connect timeout field
exists in config but is not enforced — this is a documented known gap. No ADR
needed; the decision was already made in ADR-015.
- **Resolution**: Implemented using a two-phase `tokio::time::timeout` approach.
The inner timeout uses the per-site `upstream_connect_timeout_secs` (default
5s) for the connect + first-byte phase, and the outer timeout uses
`upstream_request_timeout_secs` (default 60s) for the full request/response
cycle. Additionally, `HttpConnector::set_connect_timeout()` enforces the
TCP-level connect timeout on both HTTP and HTTPS clients. The implementation
is in `handler.rs` and `create_http_client()`/`create_https_client()`.
No new ADR needed; the decision was already made in ADR-015.
- **Cross-references**: ADR-015, ADR-017
### ~~OQ-10: Should ACME contact email be a required config field?~~
@@ -134,9 +133,10 @@ last_updated: 2026-06-12
specifies `acme_contact` as a required field in ACME mode (config.md
validation rule 19). The field is defined in the `ListenerConfig` table and
shown in TOML examples. Let's Encrypt requires a contact email for production
certificate requests. The implementation bug (C2: `contact: vec![]`) must be
fixed to use the configured `acme_contact` value. No new ADR needed — the
decision is already documented in config.md and tls.md.
certificate requests. The implementation bug (C2: `contact: vec![]`) has been
fixed `acme_contact` is now correctly wired from config to the ACME state
machine. No new ADR needed — the decision is already documented in config.md
and tls.md.
- **Cross-references**: ADR-004
### ~~OQ-11: How should `X-Forwarded-Proto` be derived per-listener?~~