Commit Graph

124 Commits

SHA1 Message Date
603d722ad0 feat(rate-limiter): add ConnectInfo-based tests for rate limiter (ADR-025) 2026-06-12 14:24:17 +00:00
db982e9c4d Mark fix/inflight-counter-increment, fix/consolidate-config-types, fix/rate-limiter-ip-source as completed 2026-06-12 14:02:02 +00:00
e6d22bdcb8 Merge remote-tracking branch 'origin/fix/fix/rate-limiter-ip-source' 2026-06-12 14:01:16 +00:00
ad9b9b9b78 fix(rate_limit): use ConnectInfo as sole IP source, reject without it
The rate limiter previously extracted client IP from the X-Forwarded-For
header first, falling back to ConnectInfo. This allowed attackers to bypass
rate limits by sending spoofed X-Forwarded-For headers. Per ADR-025, the
rate limiter now uses ConnectInfo<SocketAddr> exclusively and rejects
requests with 429 when ConnectInfo is absent.
2026-06-12 14:00:31 +00:00
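The rule described in ad9b9b9b78 can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `rate_limit_key` is a hypothetical helper, `peer` stands in for the address carried by `ConnectInfo<SocketAddr>`, and the header parameter exists only to show that it is ignored.

```rust
use std::net::{IpAddr, SocketAddr};

/// HTTP 429 Too Many Requests, returned when no peer address is available.
const TOO_MANY_REQUESTS: u16 = 429;

/// Hypothetical helper: choose the IP address the rate limiter keys on.
///
/// `peer` models the socket address from ConnectInfo<SocketAddr>.
/// `_x_forwarded_for` is deliberately unused: a client controls that
/// header, so it must not influence the key.
fn rate_limit_key(
    peer: Option<SocketAddr>,
    _x_forwarded_for: Option<&str>,
) -> Result<IpAddr, u16> {
    match peer {
        // Key on the address of the TCP peer only.
        Some(addr) => Ok(addr.ip()),
        // No connection info: reject instead of trusting headers.
        None => Err(TOO_MANY_REQUESTS),
    }
}
```

With this shape, a spoofed `X-Forwarded-For` value cannot move a client into a different rate-limit bucket, because the header never reaches the key.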
77ea1160de Merge remote-tracking branch 'origin/fix/fix/consolidate-config-types' 2026-06-12 14:00:10 +00:00
1ba1d2a4de Consolidate config types: remove RawConfig, use FullConfig in load_config
Delete the duplicate RawConfig struct and collect_sites helper from cli.rs.
Rewrite load_config to use FullConfig::parse + into_static_and_dynamic,
eliminating the redundant manual construction path.
2026-06-12 13:58:36 +00:00
05fea1a8e2 Fix InFlightCounter: increment in new(), use new() constructor, drain interval 100ms 2026-06-12 13:58:04 +00:00
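One way to read the fix in 05fea1a8e2 is as an RAII guard that increments in its constructor and decrements on drop, with shutdown polling the count every 100ms. The sketch below assumes that design; `InFlightGuard` and `wait_for_drain` are hypothetical names, and the real `InFlightCounter` may be structured differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical guard: one instance per in-flight request.
struct InFlightGuard {
    count: Arc<AtomicUsize>,
}

impl InFlightGuard {
    /// Increment inside the constructor, so a guard can never exist
    /// without having been counted.
    fn new(count: Arc<AtomicUsize>) -> Self {
        count.fetch_add(1, Ordering::SeqCst);
        Self { count }
    }
}

impl Drop for InFlightGuard {
    /// Decrement when the request finishes, on every exit path.
    fn drop(&mut self) {
        self.count.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Poll every 100ms until the count reaches zero or `deadline` elapses.
/// Returns true if all in-flight requests drained in time.
fn wait_for_drain(count: &AtomicUsize, deadline: Duration) -> bool {
    let start = Instant::now();
    while count.load(Ordering::SeqCst) > 0 {
        if start.elapsed() >= deadline {
            return false;
        }
        std::thread::sleep(Duration::from_millis(100));
    }
    true
}
```

Constructing guards only through `new()` is what keeps the increment and the decrement paired; building the struct directly would skip the increment and let the count underflow on drop.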
54f1725173 Decompose security review #003 findings into 17 fix tasks and 1 review task
Address 4 critical, 8 warning, and 5 suggestion findings from the
security and bug review by creating atomic, dependency-ordered tasks:

Critical fixes (C1-C4): rate limiter IP source (ADR-025), InFlightCounter
increment + drain interval, connector timeout ceiling (ADR-026), JSON format
without log file.

Validation tightening (W1, W2): upstream host validation, ACME contact email
validation.

Robustness (W3, W4, W5, W12): upstream URI error handling (502 not silent
drop), admin socket resource limits (ADR-027), TlsMode wildcard mismatch,
http_port u32→u16.

Code quality (W6, W10, W11, S1, S3, W8/W9): config type consolidation,
TokenBucket field visibility, reload_mutex #[cfg(test)], dead code removal,
root cert count logging, misleading test names.

Test coverage (S10): rate limiter ConnectInfo tests (depends on C1 fix).

Review: post-security-fix-review checkpoint covering all critical fixes
and sensitive config consolidation path.
2026-06-12 13:42:37 +00:00
80d1fd0fb3 Update architecture docs to address security review #003 findings
Add three ADRs (025-027) and update five spec documents to close gaps
identified in the security and bug review:

- ADR-025: Rate limiter IP source must be ConnectInfo only (C1 fix)
- ADR-026: Connector timeout ceiling of 30s for per-site timeouts (C3 fix)
- ADR-027: Admin socket resource limits — 5s timeout, 4096 byte line limit (W4 fix)

Spec changes:
- proxy.md: add rate limiter IP source section, URI error handling
  constraint, connector ceiling description, renumber sections
- operations.md: add ConnectInfo-only IP source, in-flight counter
  architectural requirement (C2), JSON format guarantee (C4), admin
  socket resource limits, 100ms drain polling interval
- config.md: fix http_port type u32→u16 (W12), tighten upstream host
  validation (W1), tighten ACME contact validation (W2), add
  X-Forwarded-Proto cross-reference, clarify alknet ADR-030 reference
- overview.md: fix ambiguous C1 reference, add ADR/OQ cross-references
- open-questions.md: update OQ-09 resolution, add OQ-13 (acme_contact
  Vec) and OQ-14 (eviction configurability)
- README.md: add ADR-025/026/027 and OQ-13/14, update doc statuses to draft

Also fix reviewer findings: alknet ADR-030 scope clarification, RFC 2616
reference updated to RFC 7230.
2026-06-12 13:17:39 +00:00
4f537c80d2 Add security and bug review #003 (4 critical, 12 warnings, 10 suggestions) 2026-06-12 13:03:20 +00:00
c8ab794ef3 Add LICENSE, README, AGENTS.md, and deployment setup guide
Dual MIT/Apache-2.0 license, public-facing README with quick start
and config reference, step-by-step deploy/README.md for Docker and
systemd setups, and AGENTS.md for LLM-assisted development.
2026-06-12 11:42:08 +00:00
0d54eba41e Update architecture specs to reflect live deployment findings and fix two bugs
Architecture updates based on gaps discovered during live deployment testing:

- ADR-023: HTTP/2 client-facing support via ALPN-based protocol detection.
  The spec previously said HTTP/2 was out of scope, but the deployment
  revealed that modern browsers negotiate HTTP/2 via ALPN. The proxy now
  correctly detects the negotiated ALPN protocol and uses the appropriate
  HTTP server builder (http2::Builder for h2, auto::Builder for http/1.1).
  Upstream connections remain HTTP/1.1. Host resolution now falls back to
  URI host for HTTP/2 :authority pseudo-headers.

- ADR-024: ANSI-disabled logging. All tracing-subscriber layers now use
  with_ansi(false) to prevent ANSI escape codes in log output, which broke
  fail2ban regex matching in Docker deployments. Also documents the fail2ban
  regex anchor fix (^RATE_LIMIT → RATE_LIMIT).

Bug fixes found by architecture review:

- Fix missing ALPN protocols in manual TLS mode. build_manual_server_config
  and build_multi_domain_server_config did not set alpn_protocols, meaning
  manual TLS mode could not support HTTP/2. Added h2 and http/1.1 ALPN
  entries to both functions (acme-tls/1 only in ACME mode).

- Fix missing with_ansi(false) in JSON log format. The init_json function
  with file output did not disable ANSI on stdout or file layers, which would
  break fail2ban in production JSON logging mode.

Other spec updates:

- All document statuses updated from draft to reviewed
- proxy.md: documented Server header removal, upstream HTTPS client,
  two-phase timeout enforcement, HTTP/2 host resolution, connect timeout
- tls.md: documented ALPN configuration differing by mode (ACME vs manual)
- overview.md: added HTTP/2 client-facing support to scope, updated crate
  deps (hyper-rustls, rustls-native-certs, hyper-util), clarified out-of-scope
- config.md: fixed http_port type (u16→u32) to match implementation, added
  ANSI-disabled note for LoggingConfig
- operations.md: documented ANSI-disabled logging, fail2ban regex anchor
- open-questions.md: updated OQ-09 resolution (connect timeout fully
  implemented), OQ-10 (C2 bug is fixed)
2026-06-12 11:28:31 +00:00
c2eefddb4f Disable ANSI colors in logs and fix fail2ban regex
- Add with_ansi(false) to all tracing_subscriber fmt layers so log
  output (both stdout and file) is plain text without escape codes.
  This is critical for Docker deployments and fail2ban log parsing.

- Remove ^ anchor from fail2ban failregex since log lines have a
  timestamp/level prefix before RATE_LIMIT.
2026-06-12 10:15:50 +00:00
9ebb8ee7a8 Fix HTTP/2 support: use ALPN-based protocol detection and fallback to URI host
Two changes to properly support HTTP/2 clients:

1. server.rs: Detect ALPN protocol after TLS handshake and use
   hyper::server::conn::http2::Builder for H2 connections instead
   of the auto::Builder which failed to detect HTTP/2 over TLS.
   The auto::Builder's ReadVersion mechanism doesn't work reliably
   with tokio-rustls TlsStreams. For H1 connections, continue using
   auto::Builder with upgrade support.

2. handler.rs: Fallback to URI host when Host header is missing.
   In HTTP/2, the host is conveyed via :authority pseudo-header which
   hyper represents as the URI host, not a Host header.
2026-06-12 06:14:46 +00:00
da28ea749d Mark fix/clean-dead-code as completed 2026-06-12 05:13:10 +00:00
cfba7491ae Merge branch 'fix/fix/clean-dead-code' 2026-06-12 05:12:47 +00:00
cbcd746c9f Remove dead_code annotations and add #[non_exhaustive] to public enums
All #[allow(dead_code)] annotations on now-used items have been removed
(acceptor.rs, acme.rs, config.rs, static_config.rs). #[non_exhaustive]
added to TlsMode, ProxyError, AdminSocketError, and ValidationError
with wildcard match arms in main.rs for the non-exhaustive enums.
2026-06-12 05:12:32 +00:00
8f3c56e6bc Mark fix/add-code-comments as completed 2026-06-12 05:05:36 +00:00
9b3fe23499 Add clarifying comments for correct-but-non-obvious behaviors (C3, W8, W10, W11, S9) 2026-06-12 05:05:10 +00:00
516efb0403 Mark fix/connect-timeout as completed 2026-06-12 05:02:41 +00:00
0c769e682e Wire upstream_connect_timeout_secs to enforce separate connect timeout
Implement two-phase timeout in proxy_handler:
- Inner timeout uses per-site upstream_connect_timeout_secs (default 5s)
  for the connect + first-byte phase
- Outer timeout uses upstream_request_timeout_secs (default 60s) for the
  full request/response cycle
- Set connect_timeout on HttpConnector for both HTTP and HTTPS clients
  (default 5s) to enforce TCP-level connect timeouts
- Use wrap_connector for HTTPS client to apply connect_timeout on the
  underlying HttpConnector
- Add Ok(Err(_)) handler for connect timeout returning 504 Gateway Timeout
2026-06-12 05:01:54 +00:00
1da01a2336 Mark fix/graceful-shutdown as completed 2026-06-12 05:00:28 +00:00
6cb0f8e6fe Merge branch 'fix/fix/graceful-shutdown' into fix/acme-contact-and-challenge 2026-06-12 04:59:32 +00:00
280fe782a1 Implement graceful shutdown for listeners, admin socket, eviction task, and ACME
- Replace handle.abort() for HTTPS server tasks with timeout-based join,
  allowing in-flight requests to drain before forceful shutdown
- Add shutdown_rx to start_admin_socket with tokio::select! for clean
  accept loop exit and Unix socket file cleanup on shutdown
- Add shutdown_rx to start_eviction_task with tokio::select! for
  cancellable eviction loop
- Add shutdown channel to spawn_acme_state for cancellable ACME state
  machine via tokio::select!
- Pass Arc<GracefulShutdown> through setup_tls to ACME state machine
- Move GracefulShutdown creation before admin socket and TLS setup
- Update integration test for new start_eviction_task signature
2026-06-12 04:59:18 +00:00
9bdc2b72af Add acme_contact to test config TOML strings
The main code changes were already committed (3f2550f), but test config
TOML strings in cli.rs, admin/socket.rs, shutdown.rs, and
integration_test.rs still needed the new acme_contact field to pass
validation rule 19.
2026-06-12 04:48:25 +00:00
abc8a44134 Mark fix/request-timeout-scope as completed 2026-06-12 04:47:15 +00:00
3f20c9d01f Add request timeout scope comment (fix/request-timeout-scope) 2026-06-12 04:47:06 +00:00
f02670d5ef Mark Batch 2 tasks as completed (remove-health, access-logging, acme-contact) 2026-06-12 04:46:34 +00:00
5529cf2add Merge branch 'fix/fix/access-logging'
# Conflicts:
#	src/proxy/handler.rs
2026-06-12 04:46:26 +00:00
4cdc3aa0b8 Merge branch 'fix/fix/remove-health-and-hardcode-https' 2026-06-12 04:44:54 +00:00
3f2550fa20 Fix ACME contact email wiring and remove unused challenge config 2026-06-12 04:44:41 +00:00
23c8b74058 Wire up access logging in proxy handler
Add log_request! calls for every proxied request (success, 4xx/5xx from
upstream, 502/504 errors) and log_upstream_error! calls for upstream
connection failures and timeouts. Duration is tracked from request entry
to response using std::time::Instant.
2026-06-12 04:43:59 +00:00
a826106673 Remove /health route from main listener and hardcode X-Forwarded-Proto to https
- Remove health_handler and /health early return from proxy_handler
- Remove /health route from proxy_router (now just fallback)
- Remove is_https field from ProxyState struct
- Remove is_https parameter from inject_proxy_headers, hardcode https
- Add comment explaining why X-Forwarded-Proto is always https
- Remove health_path_returns_200 and health_with_unknown_host tests
- Update all inject_proxy_headers test calls to remove is_https param
- Remove inject_proxy_headers_sets_x_forwarded_proto_http test
2026-06-12 04:43:59 +00:00
19efbd42ee Mark fix/normalize-host-ipv6 as completed 2026-06-12 04:41:52 +00:00
f59a86a8cf Merge branch 'fix/fix/normalize-host-ipv6' 2026-06-12 04:41:20 +00:00
42c721e954 fix: normalize_host handles IPv6 bracket notation
Extract strip_port_from_host into a shared utils module and update normalize_host to strip brackets from IPv6 addresses, so that [::1]:443 becomes ::1. The previous split(':').next() approach handled such hosts incorrectly.
2026-06-12 04:40:43 +00:00
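The behaviour described in 42c721e954 can be illustrated with a small standalone function. This is a sketch under assumptions: it borrows the name `strip_port_from_host` from the commit message, but the body, including how malformed input and bare IPv6 addresses are treated, is a guess rather than the project's implementation.

```rust
/// Sketch of a port-stripping helper that understands IPv6 brackets.
fn strip_port_from_host(host: &str) -> &str {
    if let Some(rest) = host.strip_prefix('[') {
        // Bracketed IPv6 literal: keep what sits between '[' and ']'.
        if let Some(end) = rest.find(']') {
            return &rest[..end];
        }
        // Unterminated bracket: leave the input unchanged (assumption).
        return host;
    }
    // Exactly one ':' means "name:port" or "ipv4:port". More than one
    // means a bare IPv6 address without a port, which is kept whole.
    match host.rfind(':') {
        Some(idx) if host.matches(':').count() == 1 => &host[..idx],
        _ => host,
    }
}
```

For comparison, `"[::1]:443".split(':').next()` yields `"["`, because the first colon sits inside the address itself.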
53d601522e Mark fix/config-reload-static-drift as completed 2026-06-12 04:36:34 +00:00
a78e3bf374 Fix ConfigReloadHandle static config drift causing stale diff warnings
Change ConfigReloadHandle.static_config from StaticConfig to ArcSwap<StaticConfig>
so that after each reload, the stored static config is updated with the new value.
This prevents repeated stale warnings about the same static config fields on
every reload.
2026-06-12 04:35:20 +00:00
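The drift fixed in a78e3bf374 comes down to whether the stored static config is replaced after each reload. The sketch below uses `RwLock<Arc<StaticConfig>>` from the standard library as a stand-in for `ArcSwap<StaticConfig>`; the single `http_port` field and the `reload` method are illustrative assumptions, not the project's types.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical static config with one field, for illustration only.
#[derive(Clone, Debug, PartialEq)]
struct StaticConfig {
    http_port: u32,
}

/// Stand-in for the reload handle: RwLock<Arc<_>> plays the role that
/// ArcSwap<StaticConfig> plays in the commit.
struct ConfigReloadHandle {
    static_config: RwLock<Arc<StaticConfig>>,
}

impl ConfigReloadHandle {
    fn new(initial: StaticConfig) -> Self {
        Self { static_config: RwLock::new(Arc::new(initial)) }
    }

    /// Returns true when the static section differs from the stored
    /// one (the case that would log a warning), then stores the new
    /// value so the next reload diffs against it.
    fn reload(&self, new: StaticConfig) -> bool {
        let mut current = self.static_config.write().unwrap();
        let changed = **current != new;
        *current = Arc::new(new);
        changed
    }
}
```

Reloading the same changed file twice reports a difference only the first time. Without the store step, every reload would compare against the startup value and repeat the warning.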
d7f811ffb5 Mark fix/logging-test-global-subscriber as completed 2026-06-12 04:29:48 +00:00
634ceb365a Merge branch 'fix/fix/logging-test-global-subscriber' 2026-06-12 04:29:40 +00:00
667495cf43 fix(logging): handle global subscriber conflict in test
The init_creates_log_directory_and_file test called init() which sets a
global tracing subscriber. When tests run in parallel, other tests may
have already set the subscriber, causing init() to return an error and
the test to fail. Now the test tolerates the 'already set' error while
still asserting the log file is created.
2026-06-12 04:29:28 +00:00
c50d2e8d1b Mark fix/http-port-validation as completed 2026-06-12 04:29:02 +00:00
d24148dae9 Add http_port range validation (0 or 1-65535)
Change http_port type from u16 to u32 to allow out-of-range values to be
caught by validation. Add HttpPortInvalid error variant and validation check
for http_port > 65535. Add test for http_port=65536 producing HttpPortInvalid.
http_port=0 (disabled) remains valid per existing test.
2026-06-12 04:28:35 +00:00
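The check introduced in d24148dae9 can be sketched as follows. The `HttpPortInvalid` variant name comes from the commit message; its payload and the `validate_http_port` function are assumptions made for this example. Note that the later review (W12) moves the field back to u16, so this reflects the state at this commit only.

```rust
/// Largest valid TCP port, widened to match the u32 config field.
const MAX_PORT: u32 = u16::MAX as u32;

#[derive(Debug, PartialEq)]
enum ValidationError {
    /// Carries the offending value (payload is an assumption).
    HttpPortInvalid(u32),
}

/// Accepts 0 (listener disabled) and 1-65535; rejects anything larger.
/// Parsing into u32 lets an out-of-range value reach this check
/// instead of failing earlier as a type error.
fn validate_http_port(http_port: u32) -> Result<(), ValidationError> {
    if http_port > MAX_PORT {
        Err(ValidationError::HttpPortInvalid(http_port))
    } else {
        Ok(())
    }
}
```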
53ef5b32c3 Mark fix/fragile-error-detection as completed 2026-06-12 04:25:49 +00:00
8f9e3b639d Merge branch 'fix/fix/fragile-error-detection' 2026-06-12 04:25:35 +00:00
067f8a9012 fix: use typed hyper::Error::is_incomplete_message() instead of fragile string matching 2026-06-12 04:25:11 +00:00
4db4ecbeb9 Mark fix/integration-test-toml as completed 2026-06-12 04:23:27 +00:00
c4872cb88c fix: correct TOML nesting from [[listeners.listeners.sites]] to [[listeners.sites]] 2026-06-12 04:22:46 +00:00
426333eeda Mark fix/token-bucket-nanosecond as completed 2026-06-12 04:22:35 +00:00
a701c82c90 fix: use nanosecond precision in token bucket refill calculation 2026-06-12 04:21:53 +00:00
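A refill calculation with nanosecond precision, as named in a701c82c90, might look like the sketch below. The struct layout and field names are assumptions; the commit does not show the original code or say which coarser unit it used before.

```rust
use std::time::Duration;

/// Hypothetical token bucket; field names are illustrative.
struct TokenBucket {
    tokens: f64,
    capacity: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    /// Add tokens for the elapsed time, using the full nanosecond
    /// value so that sub-millisecond intervals still contribute.
    fn refill(&mut self, elapsed: Duration) {
        let secs = elapsed.as_nanos() as f64 / 1_000_000_000.0;
        let refilled = self.tokens + secs * self.refill_per_sec;
        // Never exceed the configured burst capacity.
        self.tokens = refilled.min(self.capacity);
    }
}
```

The precision matters under high request rates: a 500µs interval truncates to 0 when read with `as_millis()`, so a bucket refilled from millisecond values would gain nothing between closely spaced requests.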