- Replace handle.abort() for HTTPS server tasks with timeout-based join,
allowing in-flight requests to drain before forceful shutdown
- Add shutdown_rx to start_admin_socket with tokio::select! for clean
accept loop exit and Unix socket file cleanup on shutdown
- Add shutdown_rx to start_eviction_task with tokio::select! for
cancellable eviction loop
- Add shutdown channel to spawn_acme_state for cancellable ACME state
machine via tokio::select!
- Pass Arc<GracefulShutdown> through setup_tls to ACME state machine
- Move GracefulShutdown creation before admin socket and TLS setup
- Update integration test for new start_eviction_task signature
The main code changes were already committed (3f2550f), but test config
TOML strings in cli.rs, admin/socket.rs, shutdown.rs, and
integration_test.rs still needed the new acme_contact field to pass
validation rule 19.
Add log_request! calls for every proxied request (success, 4xx/5xx from
upstream, 502/504 errors) and log_upstream_error! calls for upstream
connection failures and timeouts. Duration is tracked from request entry
to response using std::time::Instant.
- Remove health_handler and /health early return from proxy_handler
- Remove /health route from proxy_router (now just fallback)
- Remove is_https field from ProxyState struct
- Remove is_https parameter from inject_proxy_headers, hardcode https
- Add comment explaining why X-Forwarded-Proto is always https
- Remove health_path_returns_200 and health_with_unknown_host tests
- Update all inject_proxy_headers test calls to remove is_https param
- Remove inject_proxy_headers_sets_x_forwarded_proto_http test
Extract strip_port_from_host into shared utils module and update normalize_host to properly strip brackets from IPv6 addresses like [::1]:443 -> ::1 instead of incorrectly using split(':').next().
Change ConfigReloadHandle.static_config from StaticConfig to ArcSwap<StaticConfig>
so that after each reload, the stored static config is updated with the new value.
This prevents repeated stale warnings about the same static config fields on
every reload.
The init_creates_log_directory_and_file test called init() which sets a
global tracing subscriber. When tests run in parallel, other tests may
have already set the subscriber, causing init() to return an error and
the test to fail. Now the test tolerates the 'already set' error while
still asserting the log file is created.
Change http_port type from u16 to u32 to allow out-of-range values to be
caught by validation. Add HttpPortInvalid error variant and validation check
for http_port > 65535. Add test for http_port=65536 producing HttpPortInvalid.
http_port=0 (disabled) remains valid per existing test.
Resolve OQ-08 through OQ-12 after reviewing implementation findings:
- OQ-08: Remove /health route from the main HTTPS listener entirely.
Health checking belongs on port 9900 and admin socket only, not on
the public-facing proxy. This eliminates upstream collision problems
and special-case routing logic. (ADR-022)
- OQ-09: Not an architectural unknown — ADR-015 already decided on a
separate connect timeout. The implementation gap is a known issue.
- OQ-10: Not an open question — acme_contact is already specified as
required in config.md. The empty contact list is bug C2.
- OQ-11: Hardcoded is_https=true is correct for a TLS-terminating
proxy. HTTP listener redirects, doesn't proxy. Just needs a comment.
- OQ-12: Access logging is already specified as mandatory/always-on in
operations.md. Missing log_request! calls are bug W13.
Updated docs: proxy.md, operations.md, overview.md, config.md,
open-questions.md, README.md, ADR-013. Created ADR-022.
Analyzed 29 findings from the implementation review (002-implementation-review.md)
and identified 8 architecture-level concerns requiring spec changes:
Architecture gaps addressed:
- C2: Added acme_contact field to config.md, tls.md, and operations.md.
Let's Encrypt requires a contact email for production; the spec was missing
this required field.
- C4: Added StaticConfig drift tracking requirement to config.md reload
section. ConfigReloadHandle must update its stored StaticConfig after each
successful reload to prevent stale warnings.
- W1: Updated shutdown sequence in operations.md to specify that server tasks
should be joined (not aborted) during the drain window.
- W5: Added health check path collision note to proxy.md.
- W13: Clarified that access logging is always-on in operations.md.
- W14: Updated X-Forwarded-Proto description in proxy.md to clarify that it
is always 'https' since the HTTP listener redirects rather than proxies.
New open questions added:
- OQ-08: Should /health use a less common path to avoid upstream collision?
- OQ-09: How should upstream_connect_timeout_secs be enforced?
- OQ-10: Should ACME contact email be a required config field?
- OQ-11: How should X-Forwarded-Proto be derived per-listener?
- OQ-12: Should request access logging be mandatory or optional?
The remaining 21 findings are implementation-level bugs, code quality issues,
or Phase 2 improvements that don't require architecture spec changes.
W13: No request access logging - log_request! macro defined but never called
W14: is_https hardcoded to true on ProxyState - X-Forwarded-Proto always https
S9: Rate limiting silently bypassed when no client IP found
S10: Integration test TOML has [[listeners.listeners.sites]] typo
S11: No Server response header added by proxy (upstream's is stripped)
- Replace determine_if_https() with ProxyState.is_https field so X-Forwarded-Proto
reflects the listener's protocol instead of guessing from the Host header
- Return ProxyError::BadGateway with host/upstream context for non-connect upstream
errors instead of bare StatusCode::BAD_GATEWAY
- Implement InFlightCounter with RAII guard for tracking in-flight connections
- Add drain_in_flight() to wait for connections to complete on shutdown, with
configurable timeout before forcing exit
- Mark review/core-components and review/integration-readiness as complete
- Add server module that orchestrates the full startup sequence:
parse config, init dynamic config, init shared state, bind health
check, bind admin socket, bind all listener ports, load TLS config,
start TCP listeners, start background tasks, signal readiness
- For each ListenerConfig: bind TCP listener, construct appropriate
ServerConfig (manual or ACME via TlsMode), create TlsAcceptor
- ConnectInfo<SocketAddr> populated from TcpStream::peer_addr() BEFORE
TLS wrapping via ConnectInfoService wrapper that inserts ConnectInfo
into request extensions for each connection
- Per-listener axum::Router instances sharing Arc<ProxyState> via State
- Fail-fast: if any bind or TLS load fails, exit with non-zero code
- All ports bound before any connections accepted
- /health endpoint available on HTTPS listener(s) as fallback
(proxy_router already includes /health route)
- sd_notify(READY=1) sent after all listeners started
- Use hyper_util for TLS connection serving with TowerToHyperService
and ConnectInfoService to bridge ConnectInfo from pre-TLS peer_addr
- Add sd-notify dependency for systemd readiness notification
- Add ProxyError enum with IntoResponse for error handling (400, 404, 502, 504)
- Implement proxy header injection: X-Real-IP, X-Forwarded-For (replaced, not appended), X-Forwarded-Proto
- Implement hop-by-hop header removal for both request and response headers
- Implement request forwarding via shared hyper::Client with HTTP and HTTPS support
- Add ProxyState with http_client and https_client instances shared via axum State
- Add per-site timeout overrides using tokio::time::timeout
- Add HTTPS upstream support with system native TLS root certificates
- No Server or Via headers added to responses
- Host header preserved as-is
- Add unit tests for header injection, hop-by-hop removal, and URI building
- Add integration tests for proxy forwarding, hop-by-hop removal, and 502 on unreachable upstream