Resolve 5 open questions, add 7 ADRs for previously undocumented decisions

Resolve open questions:
- OQ-01: Restrict cipher suites to match nginx scope (4 ECDHE-AES-GCM
  suites for TLS 1.2 + all TLS 1.3 suites) — ADR-012
- OQ-03: Health check on separate local port (default 9900, localhost
  only) — ADR-013
- OQ-04: Add Unix domain socket admin API for config reload alongside
  SIGHUP, with structured success/failure responses — ADR-014
- OQ-06: Per-site upstream timeouts with defaults (5s connect, 60s
  request), overridable in SiteConfig — ADR-015

Document previously undocumented decisions flagged by architecture review:
- ADR-016: Explicit bind address requirement (reject 0.0.0.0)
- ADR-017: Upstream connection defaults (HTTP/1.1, no redirects, pooling)
- ADR-018: 100 MB body size limit (matches nginx, Gitea compatibility)

OQ-07 (per-site TLS overrides) remains open for future consideration.

Spec updates:
- config.md: add health_check_port, admin_socket_path, per-site timeout
  fields, update TOML example and validation rules
- proxy.md: reference ADR-015/017/018 for timeouts, connection defaults,
  and body limit decisions
- tls.md: replace OQ-01 cipher suite section with ADR-012 decision
- operations.md: add local health check port section, admin socket reload
- overview.md: update Phase 1 scope with new features, add ADR references
- open-questions.md: resolve OQ-01/03/04/06, keep OQ-07 open
This commit is contained in:
2026-06-11 09:07:36 +00:00
parent 7efc142406
commit 9a2352e61c
14 changed files with 613 additions and 89 deletions

View File

@@ -109,12 +109,15 @@ The proxy handler constructs a new request to the upstream:
5. Stream the response back to the client
The hyper Client is created once at startup and shared via axum's `State`. It
must be configured with:
must be configured with (see ADR-017 for rationale):
- Connection pooling (hyper default behavior)
- Connect timeout: 5 seconds
- Request timeout: 60 seconds
- HTTP/1.1 only for upstream connections (HTTP/2 proxying is out of scope)
- No redirect following (proxies should not follow redirects)
Per-site timeout overrides are available via `upstream_connect_timeout_secs`
and `upstream_request_timeout_secs` in `SiteConfig` (see ADR-015). When not
specified, defaults of 5s connect and 60s request are used.
### 4. Error Handling
| Upstream Condition | Response | Notes |
@@ -147,10 +150,11 @@ between the proxy and backend services on loopback is unnecessary.
## Body Size Limit
axum's `DefaultBodyLimit` layer sets the maximum request body size. For
compatibility with Gitea's push operations (large pack files), this defaults
to 100 MB. In Phase 1, the body limit is a global setting; Phase 2 may add
per-site body limits.
axum's `DefaultBodyLimit` layer sets the maximum request body size. The default
of 100 MB (104,857,600 bytes) matches our current nginx configuration and
accommodates Gitea's push operations with large pack files (see ADR-018). In
Phase 1, the body limit is a global setting; Phase 2 may add per-site body
limits.
## Design Decisions
@@ -161,11 +165,14 @@ All design decisions are documented as ADRs in [decisions/](decisions/).
| [002](decisions/002-custom-proxy-handler.md) | Custom proxy handler | One upstream per domain — simpler than a general proxy library |
| [007](decisions/007-custom-log-format.md) | Custom structured log format | key=value pairs with RATE_LIMIT prefix for fail2ban |
| [010](decisions/010-multi-site-phase1.md) | Multi-site in Phase 1 | Multiple domains from initial release |
| [015](decisions/015-per-site-timeouts.md) | Per-site upstream timeouts with defaults | 5s connect / 60s request defaults, per-site overrides |
| [017](decisions/017-upstream-connection-defaults.md) | Upstream connection defaults | HTTP/1.1, no redirects, connection pooling |
| [018](decisions/018-body-size-limit.md) | Request body size limit | 100 MB default matching nginx, Gitea push compatibility |
## Open Questions
Open questions are tracked in [open-questions.md](open-questions.md). Key
questions affecting this document:
- **OQ-06**: Should upstream timeouts be configurable per-site? (open — Phase 1
uses global defaults of 5s connect, 60s request)
- ~~**OQ-06**: Should upstream timeouts be configurable per-site?~~ (resolved —
ADR-015: per-site timeout overrides with defaults)