Decompose security review #003 findings into 17 fix tasks and 1 review task
Address 4 critical, 8 warning, and 5 suggestion findings from the security and bug review by creating atomic, dependency-ordered tasks: Critical fixes (C1-C4): rate limiter IP source (ADR-025), InFlightCounter increment + drain interval, connector timeout ceiling (ADR-026), JSON format without log file. Validation tightening (W1, W2): upstream host validation, ACME contact email validation. Robustness (W3, W4, W5, W12): upstream URI error handling (502 not silent drop), admin socket resource limits (ADR-027), TlsMode wildcard mismatch, http_port u32→u16. Code quality (W6, W10, W11, S1, S3, W8/W9): config type consolidation, TokenBucket field visibility, reload_mutex #[cfg(test)], dead code removal, root cert count logging, misleading test names. Test coverage (S10): rate limiter ConnectInfo tests (depends on C1 fix). Review: post-security-fix-review checkpoint covering all critical fixes and sensitive config consolidation path.
This commit is contained in:
68
tasks/fix/acme-contact-validation.md
Normal file
68
tasks/fix/acme-contact-validation.md
Normal file
@@ -0,0 +1,68 @@
|
||||
---
|
||||
id: fix/acme-contact-validation
|
||||
name: Validate ACME contact email has non-empty address with @ sign (W2)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [W2]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
The ACME contact validation only checks that `contact.starts_with("mailto:")`
|
||||
but doesn't verify there's an actual email address after the prefix. Values
|
||||
like `mailto:` (empty email) pass validation but will fail at the Let's Encrypt
|
||||
API at certificate provisioning time — after the proxy has already started.
|
||||
|
||||
The spec (config.md validation rule 19) now requires: "a valid `mailto:` URI
|
||||
with a non-empty email address containing an `@` sign."
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/config/validation.rs`** — ACME contact validation (lines 148-153):
|
||||
- After checking `starts_with("mailto:")`, validate the email part:
|
||||
```rust
|
||||
let contact = &listener.tls.acme_contact;
|
||||
if contact.is_empty() || !contact.starts_with("mailto:") {
|
||||
errors.push(ValidationError::AcmeContactInvalid {
|
||||
bind_addr: listener.bind_addr.clone(),
|
||||
});
|
||||
} else {
|
||||
let email = &contact[7..]; // after "mailto:"
|
||||
if email.is_empty() || !email.contains('@') {
|
||||
errors.push(ValidationError::AcmeContactInvalid {
|
||||
bind_addr: listener.bind_addr.clone(),
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
- Add tests for:
|
||||
- Valid: `mailto:admin@example.com`
|
||||
- Invalid: `mailto:` (empty), `mailto:user` (no @), empty string, non-mailto
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `mailto:` (empty email) is rejected by validation
|
||||
- [ ] `mailto:user` (no @ sign) is rejected by validation
|
||||
- [ ] `mailto:admin@example.com` still passes validation
|
||||
- [ ] Non-mailto values still rejected
|
||||
- [ ] New unit tests for the tightened validation
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/config.md — validation rule 19
|
||||
- docs/reviews/003-security-and-bug-review.md — W2 finding
|
||||
- src/config/validation.rs — ACME contact validation, existing tests
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
52
tasks/fix/admin-socket-reload-mutex-visibility.md
Normal file
52
tasks/fix/admin-socket-reload-mutex-visibility.md
Normal file
@@ -0,0 +1,52 @@
|
||||
---
|
||||
id: fix/admin-socket-reload-mutex-visibility
|
||||
name: Gate AdminSocket::reload_mutex with #[cfg(test)] (W11)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [W11]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`AdminSocket::reload_mutex()` is a public method that exists solely for the
|
||||
`test_reload_serialized_with_mutex` test. It exposes an internal synchronization
|
||||
primitive, and the test acquires the mutex before sending a reload command —
|
||||
coupling the test to implementation details.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/admin/socket.rs`**:
|
||||
- Gate `reload_mutex()` with `#[cfg(test)]`:
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
pub fn reload_mutex(&self) -> Arc<Mutex<()>> {
|
||||
self.reload_mutex.clone()
|
||||
}
|
||||
```
|
||||
- The existing test `test_reload_serialized_with_mutex` already uses this
|
||||
method, so it will continue to work.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `reload_mutex()` is only available in test builds (`#[cfg(test)]`)
|
||||
- [ ] The `test_reload_serialized_with_mutex` test still compiles and passes
|
||||
- [ ] `cargo clippy` passes with no warnings in non-test build
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — W11 finding
|
||||
- src/admin/socket.rs — `AdminSocket::reload_mutex()`, test
|
||||
|
||||
## Notes
|
||||
|
||||
> The review suggests alternatively removing the method entirely and testing
|
||||
> serialization through observable behavior. For Phase 1, gating with
|
||||
> `#[cfg(test)]` is the simpler fix that preserves the existing test.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
100
tasks/fix/admin-socket-resource-limits.md
Normal file
100
tasks/fix/admin-socket-resource-limits.md
Normal file
@@ -0,0 +1,100 @@
|
||||
---
|
||||
id: fix/admin-socket-resource-limits
|
||||
name: Add read timeout and line length limit to admin socket (ADR-027)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [W4, S4]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
The admin socket's `handle_connection` reads one newline-terminated line with
|
||||
`reader.read_line(&mut line)` but sets no timeout and no length limit. This
|
||||
allows:
|
||||
1. A client to connect and send no data, holding a connection indefinitely
|
||||
2. A client to send unbounded data without a newline, causing OOM
|
||||
|
||||
ADR-027 specifies: 5-second read timeout, 4096 byte line length limit.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/admin/socket.rs`** — `handle_connection` function (lines 166-210):
|
||||
- Wrap the `BufReader` with `tokio::io::take` to limit read size to 4096 bytes:
|
||||
```rust
|
||||
let (reader, mut writer) = stream.into_split();
|
||||
let mut reader = BufReader::new(tokio::io::take(reader, 4096));
|
||||
let mut line = String::new();
|
||||
```
|
||||
- Wrap the `read_line` call in a `tokio::time::timeout`:
|
||||
```rust
|
||||
use std::time::Duration;
|
||||
let read_result = tokio::time::timeout(
|
||||
Duration::from_secs(5),
|
||||
reader.read_line(&mut line),
|
||||
).await;
|
||||
```
|
||||
- Handle timeout and line-too-long cases:
|
||||
```rust
|
||||
match read_result {
|
||||
Ok(Ok(0)) | Ok(Err(_)) => {
|
||||
// existing "invalid input" handling
|
||||
}
|
||||
Err(_) => {
|
||||
// timeout
|
||||
tracing::debug!("admin socket connection timed out");
|
||||
let _ = writer.write_all(
|
||||
format!("{}\n", serde_json::to_string(&ErrorResponse {
|
||||
status: "error",
|
||||
message: "read timeout".to_string(),
|
||||
}).unwrap()).as_bytes()
|
||||
).await;
|
||||
return;
|
||||
}
|
||||
Ok(Ok(n)) => {
|
||||
// Check if line was truncated (no newline found within limit)
|
||||
if !line.ends_with('\n') && n > 0 {
|
||||
tracing::warn!("admin socket command exceeded 4096 byte limit");
|
||||
let _ = writer.write_all(
|
||||
format!("{}\n", serde_json::to_string(&ErrorResponse {
|
||||
status: "error",
|
||||
message: "command too long".to_string(),
|
||||
}).unwrap()).as_bytes()
|
||||
).await;
|
||||
return;
|
||||
}
|
||||
// ... existing command handling
|
||||
}
|
||||
}
|
||||
```
|
||||
- Update existing tests and add new tests for timeout and line length limit.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Read timeout of 5 seconds applied to admin socket connections
|
||||
- [ ] Line length limit of 4096 bytes applied (via `tokio::io::take`)
|
||||
- [ ] Timeout logged at `debug` level per ADR-027
|
||||
- [ ] Line-too-long logged at `warn` level per ADR-027
|
||||
- [ ] Both conditions return appropriate error JSON to the client
|
||||
- [ ] Legitimate commands (`reload`, `status`) still work
|
||||
- [ ] New tests for timeout and line length limit behavior
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/decisions/027-admin-socket-resource-limits.md — ADR-027
|
||||
- docs/architecture/operations.md — admin socket resource limits
|
||||
- docs/reviews/003-security-and-bug-review.md — W4, S4 findings
|
||||
- src/admin/socket.rs — `handle_connection`
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
58
tasks/fix/connector-timeout-ceiling.md
Normal file
58
tasks/fix/connector-timeout-ceiling.md
Normal file
@@ -0,0 +1,58 @@
|
||||
---
|
||||
id: fix/connector-timeout-ceiling
|
||||
name: Raise connector timeout ceiling to 30s per ADR-026
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: low
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [C3]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
The HTTP connector's `set_connect_timeout` is hardcoded to 5 seconds. Per-site
|
||||
connect timeout values > 5s are silently capped because the connector's internal
|
||||
timeout fires before the `tokio::time::timeout` wrapper.
|
||||
|
||||
ADR-026 establishes a 30-second ceiling on the connector. The per-site
|
||||
`tokio::time::timeout` enforces the actual per-site connect timeout. The
|
||||
connector ceiling is a safety backstop, not the primary enforcement mechanism.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/proxy/handler.rs`**:
|
||||
- Change `DEFAULT_CONNECT_TIMEOUT_SECS` from 5 to 30:
|
||||
```rust
|
||||
const CONNECT_TIMEOUT_CEILING_SECS: u64 = 30;
|
||||
```
|
||||
- Rename the constant to make its role clear (ceiling, not the default connect
|
||||
timeout for sites).
|
||||
- Update both `create_http_client` and `create_https_client` to use the renamed
|
||||
constant.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Connector `set_connect_timeout` is set to 30 seconds
|
||||
- [ ] Constant is named to reflect its ceiling role (not "default")
|
||||
- [ ] Per-site connect timeout (default 5s) via `tokio::time::timeout` is unchanged
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/decisions/026-connector-timeout-ceiling.md — ADR-026
|
||||
- docs/architecture/proxy.md — Upstream connection section
|
||||
- docs/reviews/003-security-and-bug-review.md — C3 finding
|
||||
- src/proxy/handler.rs — `create_http_client`, `create_https_client`
|
||||
|
||||
## Notes
|
||||
|
||||
> The previous `fix/connect-timeout` task wired the two-phase timeout approach.
|
||||
> This task completes that work by raising the connector ceiling so per-site
|
||||
> timeouts > 5s actually work.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
95
tasks/fix/consolidate-config-types.md
Normal file
95
tasks/fix/consolidate-config-types.md
Normal file
@@ -0,0 +1,95 @@
|
||||
---
|
||||
id: fix/consolidate-config-types
|
||||
name: Delete RawConfig and use FullConfig in load_config (W6, S5)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: medium
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [W6, S5]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`RawConfig` (in `src/cli.rs`) and `FullConfig` (in `src/config/mod.rs`) have
|
||||
identical fields and identical serde attributes. They exist because the initial
|
||||
load path manually constructs `StaticConfig` + `SerializableDynamicConfig`,
|
||||
while the reload path uses `FullConfig::into_static_and_dynamic()`. Any new
|
||||
config field must be added in two places.
|
||||
|
||||
The fix is to delete `RawConfig` and use `FullConfig` in `load_config`. The
|
||||
`collect_sites` helper can also be removed since `into_static_and_dynamic`
|
||||
already collects sites from all listeners.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/cli.rs`**:
|
||||
- Delete the `RawConfig` struct (lines 49-65)
|
||||
- Rewrite `load_config` to use `FullConfig`:
|
||||
```rust
|
||||
pub fn load_config(cli: &Cli) -> Result<LoadedConfig> {
|
||||
let config_path = Path::new(&cli.config);
|
||||
let config_content = std::fs::read_to_string(config_path)
|
||||
.with_context(|| format!("failed to read config file: {}", cli.config))?;
|
||||
|
||||
let full_config = crate::config::FullConfig::parse(&config_content)
|
||||
.with_context(|| format!("failed to parse config file: {}", cli.config))?;
|
||||
|
||||
let (static_config, dynamic_config) = full_config.into_static_and_dynamic();
|
||||
|
||||
let allow_wildcard_bind = static_config.allow_wildcard_bind || cli.allow_wildcard_bind;
|
||||
|
||||
validate(&static_config, &dynamic_config, cli.allow_wildcard_bind).map_err(|errors| {
|
||||
anyhow::anyhow!(
|
||||
"config validation failed:\n{}",
|
||||
errors.iter()
|
||||
.map(|e| format!(" - {}", e))
|
||||
.collect::<Vec<_>>()
|
||||
.join("\n")
|
||||
)
|
||||
})?;
|
||||
|
||||
Ok(LoadedConfig {
|
||||
static_config,
|
||||
dynamic_config,
|
||||
allow_wildcard_bind,
|
||||
})
|
||||
}
|
||||
```
|
||||
- Delete the `collect_sites` helper function (lines 112-118)
|
||||
- Remove the now-unused imports: `SerializableDynamicConfig`, `BodyConfig`,
|
||||
`RateLimitConfig` (if they're only used via `RawConfig`)
|
||||
|
||||
**`src/config/mod.rs`**:
|
||||
- No changes needed — `FullConfig` already has `into_static_and_dynamic()`
|
||||
- Verify `FullConfig::parse` and `into_static_and_dynamic` produce identical
|
||||
results to the old `RawConfig` path
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `RawConfig` struct deleted from `src/cli.rs`
|
||||
- [ ] `collect_sites` function deleted from `src/cli.rs`
|
||||
- [ ] `load_config` uses `FullConfig::parse` + `into_static_and_dynamic`
|
||||
- [ ] Startup config loading produces same results as before
|
||||
- [ ] Config validation still runs on both startup and reload
|
||||
- [ ] All existing `cli.rs` tests pass
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/config.md — config reload, FullConfig
|
||||
- docs/reviews/003-security-and-bug-review.md — W6, S5 findings
|
||||
- src/cli.rs — RawConfig, load_config, collect_sites
|
||||
- src/config/mod.rs — FullConfig, into_static_and_dynamic
|
||||
|
||||
## Notes
|
||||
|
||||
> This changes the startup config parsing path. While the behavior should be
|
||||
> identical (same fields, same serde attributes), this is a sensitive area. A
|
||||
> review task follows the code quality generation.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
77
tasks/fix/http-port-type-u16.md
Normal file
77
tasks/fix/http-port-type-u16.md
Normal file
@@ -0,0 +1,77 @@
|
||||
---
|
||||
id: fix/http-port-type-u16
|
||||
name: Change http_port type from u32 to u16 per spec (W12)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [W12]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`http_port` is declared as `u32` in `ListenerConfig` but `https_port` is `u16`.
|
||||
Both represent TCP port numbers (valid range 1–65535). The type inconsistency
|
||||
means comparisons require casting (`listener.http_port == listener.https_port as
|
||||
u32`) and `http_port` could theoretically hold values > 65535 caught by
|
||||
validation rather than the type system.
|
||||
|
||||
The spec (config.md) now declares `http_port` as `u16`.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/config/static_config.rs`**:
|
||||
- Change `http_port` field type from `u32` to `u16` in `ListenerConfig`
|
||||
- Update `default_http_port()` return type to `u16`
|
||||
|
||||
**`src/config/validation.rs`**:
|
||||
- Change `DuplicateHttpBind` error type: `http_port: u32` → `http_port: u16`
|
||||
- Change `HttpsAndHttpPortSame` error type: `http_port: u32` → `http_port: u16`
|
||||
- Change `HttpPortInvalid` error type: `http_port: u32` → `http_port: u16`
|
||||
- Remove `as u32` casts — both `http_port` and `https_port` are now `u16`
|
||||
- Remove the `http_port > 65535` check (impossible with `u16`, but keep `http_port > 0`
|
||||
for the "disabled" check)
|
||||
- Update comparison: `listener.http_port == listener.https_port` (no cast needed)
|
||||
- Update health check port comparison: remove `as u32` cast
|
||||
|
||||
**`src/main.rs`**:
|
||||
- Update any `http_port` references that assume `u32`
|
||||
|
||||
**`src/cli.rs`**:
|
||||
- Update `RawConfig.http_port` type from `u32` to `u16` (if `RawConfig` still
|
||||
exists after `fix/consolidate-config-types`; if not, this file is unaffected)
|
||||
|
||||
**`src/config/test_fixtures.rs`**:
|
||||
- Update any test fixture `http_port` values from `u32` to `u16`
|
||||
|
||||
**`tests/integration_test.rs`**:
|
||||
- Update any hardcoded `http_port` values
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `http_port` is `u16` in `ListenerConfig`
|
||||
- [ ] All `as u32` casts on `http_port` removed
|
||||
- [ ] `http_port > 65535` validation check removed (impossible with u16)
|
||||
- [ ] `http_port == https_port` comparison works without casting
|
||||
- [ ] All validation tests pass
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/config.md — `http_port` type declaration
|
||||
- docs/reviews/003-security-and-bug-review.md — W12 finding
|
||||
- src/config/static_config.rs — `ListenerConfig` struct
|
||||
- src/config/validation.rs — validation rules, error types
|
||||
|
||||
## Notes
|
||||
|
||||
> If `fix/consolidate-config-types` runs first and removes `RawConfig`, the
|
||||
> `src/cli.rs` changes in this task are reduced. The two tasks are independent
|
||||
> in terms of the type change itself.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
74
tasks/fix/inflight-counter-increment.md
Normal file
74
tasks/fix/inflight-counter-increment.md
Normal file
@@ -0,0 +1,74 @@
|
||||
---
|
||||
id: fix/inflight-counter-increment
|
||||
name: Fix InFlightCounter to increment before spawning task (C2 + drain interval)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: medium
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [C2]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`InFlightCounter::increment()` is never called anywhere in the codebase. The
|
||||
`InFlightGuard` only decrements on drop. Since `count` stays at 0, the first
|
||||
guard drop does `fetch_sub(1)` on an `AtomicUsize` with value 0, which wraps to
|
||||
`usize::MAX`. `is_zero()` checks `count == 0`, which never becomes true again.
|
||||
The drain logic in `drain_in_flight` always times out.
|
||||
|
||||
The spec (operations.md shutdown sequence) states: "each request **must**
|
||||
increment the counter when it begins and decrement when it completes (via guard
|
||||
drop). The increment must happen before the request task is spawned."
|
||||
|
||||
Additionally, the spec states the drain polls every 100ms, but the current
|
||||
implementation uses 50ms. Align with the spec.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/server.rs`**:
|
||||
- Fold the increment into `InFlightGuard::new()` so it's impossible to forget:
|
||||
```rust
|
||||
impl InFlightGuard {
|
||||
fn new(counter: Arc<InFlightCounter>) -> Self {
|
||||
counter.increment();
|
||||
Self(counter)
|
||||
}
|
||||
}
|
||||
```
|
||||
- Update `serve_https_listener` to use `InFlightGuard::new(in_flight.clone())`
|
||||
instead of `InFlightGuard(in_flight.clone())`.
|
||||
- Make `InFlightGuard`'s tuple struct private (if it isn't already) so callers
|
||||
must use `new()`.
|
||||
|
||||
**`src/server.rs` — `drain_in_flight`**:
|
||||
- Change polling interval from 50ms to 100ms per the spec (operations.md):
|
||||
```rust
|
||||
tokio::time::sleep(std::time::Duration::from_millis(100)).await;
|
||||
```
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `InFlightGuard::new()` calls `counter.increment()` before returning
|
||||
- [ ] `InFlightGuard` is constructed via `new()` only, not the tuple struct
|
||||
- [ ] `serve_https_listener` uses `InFlightGuard::new(in_flight.clone())`
|
||||
- [ ] `drain_in_flight` polls every 100ms (not 50ms)
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/operations.md — shutdown sequence, in-flight counter
|
||||
- docs/reviews/003-security-and-bug-review.md — C2 finding
|
||||
- src/server.rs — InFlightCounter, InFlightGuard, drain_in_flight
|
||||
- src/main.rs — drain_in_flight caller
|
||||
|
||||
## Notes
|
||||
|
||||
> The previous `fix/graceful-shutdown` task addressed the abort-vs-join logic
|
||||
> but did not fix the increment bug. This task completes that work.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
58
tasks/fix/json-format-without-logfile.md
Normal file
58
tasks/fix/json-format-without-logfile.md
Normal file
@@ -0,0 +1,58 @@
|
||||
---
|
||||
id: fix/json-format-without-logfile
|
||||
name: Fix JSON format not applied when no log file is configured (C4)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [C4]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
When `format = "json"` is configured but no `log_file_path` is set, the `None`
|
||||
branch of `init_json` creates a layer **without** calling `.json()`. The output
|
||||
is plain text, not JSON. The `format = "json"` config value is silently ignored.
|
||||
|
||||
The spec (operations.md) states: "Both output destinations must respect the
|
||||
`format` config value: when `format = "json"`, both file and stdout output must
|
||||
use JSON formatting."
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/logging/mod.rs`** — `init_json` function, `None` branch (lines 54-58):
|
||||
- Add `.json()` to the stdout-only layer:
|
||||
```rust
|
||||
None => {
|
||||
let layer = tracing_subscriber::fmt::layer()
|
||||
.json()
|
||||
.with_ansi(false)
|
||||
.with_filter(env_filter);
|
||||
tracing_subscriber::registry().with(layer).try_init()?;
|
||||
}
|
||||
```
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `.json()` is called in the `None` branch of `init_json`
|
||||
- [ ] `format = "json"` with no `log_file_path` produces JSON output on stdout
|
||||
- [ ] `format = "json"` with `log_file_path` still produces JSON on both outputs
|
||||
- [ ] `format = "text"` paths are unchanged
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/operations.md — logging output, format guarantee
|
||||
- docs/reviews/003-security-and-bug-review.md — C4 finding
|
||||
- src/logging/mod.rs — `init_json` function
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
66
tasks/fix/log-root-cert-count.md
Normal file
66
tasks/fix/log-root-cert-count.md
Normal file
@@ -0,0 +1,66 @@
|
||||
---
|
||||
id: fix/log-root-cert-count
|
||||
name: Log system root certificate count at startup (S3)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [S3]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`root_certs()` loads native certificates silently — only logs errors. If the
|
||||
system has zero root certificates (misconfigured CA bundle), all HTTPS upstream
|
||||
connections will fail with opaque TLS errors and no diagnostic message.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/proxy/handler.rs`** — `root_certs()` function (lines 246-258):
|
||||
- Add an info-level log with cert count and warn if zero:
|
||||
```rust
|
||||
fn root_certs() -> rustls::RootCertStore {
|
||||
let mut roots = rustls::RootCertStore::empty();
|
||||
let result = rustls_native_certs::load_native_certs();
|
||||
for cert in result.certs {
|
||||
roots.add(cert).ok();
|
||||
}
|
||||
let cert_count = roots.len();
|
||||
let error_count = result.errors.len();
|
||||
if cert_count == 0 {
|
||||
warn!(certs_loaded = cert_count, errors = error_count,
|
||||
"no system root certificates loaded — HTTPS upstream connections will fail");
|
||||
} else {
|
||||
info!(certs_loaded = cert_count, errors = error_count,
|
||||
"loaded system root certificates");
|
||||
}
|
||||
for err in &result.errors {
|
||||
warn!(error = %err, "failed to load native certificate");
|
||||
}
|
||||
roots
|
||||
}
|
||||
```
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Info-level log with cert count when certs > 0
|
||||
- [ ] Warn-level log when cert count is 0
|
||||
- [ ] Error count included in log output
|
||||
- [ ] Individual cert load errors still logged at warn level
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — S3 finding
|
||||
- src/proxy/handler.rs — `root_certs()` function
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
67
tasks/fix/rate-limiter-connectinfo-tests.md
Normal file
67
tasks/fix/rate-limiter-connectinfo-tests.md
Normal file
@@ -0,0 +1,67 @@
|
||||
---
|
||||
id: fix/rate-limiter-connectinfo-tests
|
||||
name: Update rate limiter tests to use ConnectInfo instead of X-Forwarded-For (S10)
|
||||
status: pending
|
||||
depends_on: [fix/rate-limiter-ip-source]
|
||||
scope: narrow
|
||||
risk: medium
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [S10]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
After `fix/rate-limiter-ip-source` removes X-Forwarded-For parsing from the rate
|
||||
limiter, existing integration tests that pass client IPs via `X-Forwarded-For`
|
||||
headers will no longer work correctly. The rate limiter now reads exclusively
|
||||
from `ConnectInfo<SocketAddr>`, so tests must provide `ConnectInfo` in request
|
||||
extensions.
|
||||
|
||||
This task also addresses review finding S10: "No test verifies the ConnectInfo
|
||||
extraction path, which is the primary path after C1 is fixed."
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`tests/integration_test.rs`**:
|
||||
- Find all rate limit integration tests (lines 90-163 and any others) that set
|
||||
`X-Forwarded-For` headers for rate limiting purposes
|
||||
- Update those tests to set `ConnectInfo<SocketAddr>` in request extensions
|
||||
instead of (or in addition to) `X-Forwarded-For`
|
||||
- Verify that `X-Forwarded-For` headers are now **ignored** by the rate limiter
|
||||
— add a test that sends requests with different `X-Forwarded-For` values from
|
||||
the same `ConnectInfo` IP and confirms they all count against the same bucket
|
||||
- Add a test that verifies requests **without** `ConnectInfo` are rejected with
|
||||
429 (per ADR-025)
|
||||
|
||||
**`src/rate_limit/mod.rs`** (test module):
|
||||
- Update any unit tests that rely on `X-Forwarded-For` extraction
|
||||
- Add a test that verifies `ConnectInfo`-based rate limiting works without any
|
||||
`X-Forwarded-For` header
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] All rate limit integration tests use `ConnectInfo` for client IP
|
||||
- [ ] Tests verify `X-Forwarded-For` headers are ignored by rate limiter
|
||||
- [ ] New test: requests without `ConnectInfo` are rejected with 429
|
||||
- [ ] New test: different `X-Forwarded-For` values from same `ConnectInfo` IP
|
||||
count against the same bucket (proving XFF is ignored)
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/decisions/025-rate-limiter-ip-source.md — ADR-025
|
||||
- docs/reviews/003-security-and-bug-review.md — S10 finding
|
||||
- tests/integration_test.rs — rate limit tests
|
||||
- src/rate_limit/mod.rs — rate limiter middleware and tests
|
||||
|
||||
## Notes
|
||||
|
||||
> This task depends on `fix/rate-limiter-ip-source` because the code changes
|
||||
> must be in place before the tests can be updated. Attempting this before the
|
||||
> C1 fix would require writing tests for the old (broken) behavior.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
78
tasks/fix/rate-limiter-ip-source.md
Normal file
78
tasks/fix/rate-limiter-ip-source.md
Normal file
@@ -0,0 +1,78 @@
|
||||
---
|
||||
id: fix/rate-limiter-ip-source
|
||||
name: Fix rate limiter to use ConnectInfo only (ADR-025)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: high
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [C1]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
The rate limiter currently extracts the client IP by checking the `X-Forwarded-For`
|
||||
header **first**, then falling back to `ConnectInfo`. This is a security
|
||||
vulnerability: since the rate limiter runs as middleware before the proxy
|
||||
handler injects trusted headers, the `X-Forwarded-For` value is whatever the
|
||||
client sent — completely untrusted. This enables two attack vectors:
|
||||
|
||||
1. **Rate limit bypass**: Attacker sends each request with a different random
|
||||
`X-Forwarded-For` value, evading per-IP token buckets entirely.
|
||||
2. **DoS via IP spoofing**: Attacker sends requests with `X-Forwarded-For:
|
||||
<victim IP>`, depleting the victim's bucket.
|
||||
|
||||
ADR-025 establishes that the rate limiter must use `ConnectInfo<SocketAddr>` as
|
||||
the **sole** source of client IP. If `ConnectInfo` is absent, the request must
|
||||
be **rejected** (not fall back to an untrusted header).
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/rate_limit/mod.rs`**:
|
||||
- Replace the current IP extraction logic in `rate_limit_middleware` (lines 66-76)
|
||||
with ConnectInfo-first, no fallback:
|
||||
```rust
|
||||
let client_ip = req
|
||||
.extensions()
|
||||
.get::<axum::extract::ConnectInfo<std::net::SocketAddr>>()
|
||||
.map(|ci| ci.ip());
|
||||
```
|
||||
- When `ConnectInfo` is absent, return 429 (reject the request) rather than
|
||||
passing through without rate limiting. Per ADR-025: "If ConnectInfo is absent,
|
||||
the request must be rejected rather than falling back to an untrusted header."
|
||||
```rust
|
||||
let Some(ip) = client_ip else {
|
||||
return (StatusCode::TOO_MANY_REQUESTS, "Too Many Requests").into_response();
|
||||
};
|
||||
```
|
||||
- Remove all `X-Forwarded-For` header parsing from the rate limiter middleware.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Rate limiter extracts client IP from `ConnectInfo<SocketAddr>` only
|
||||
- [ ] No `X-Forwarded-For` header parsing in the rate limiter middleware
|
||||
- [ ] Requests without `ConnectInfo` are rejected with 429 (not passed through)
|
||||
- [ ] `cargo test` passes (note: integration tests that pass `X-Forwarded-For`
|
||||
for rate limiting will need to be updated in the separate test task
|
||||
`fix/rate-limiter-connectinfo-tests`)
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/decisions/025-rate-limiter-ip-source.md — ADR-025
|
||||
- docs/architecture/operations.md — Rate limiting design, IP source section
|
||||
- docs/architecture/proxy.md — Rate limiter IP source section
|
||||
- docs/reviews/003-security-and-bug-review.md — C1 finding
|
||||
- src/rate_limit/mod.rs — current implementation (lines 61-104)
|
||||
|
||||
## Notes
|
||||
|
||||
> This is the highest-priority security fix. After this change, the integration
|
||||
> tests in `tests/integration_test.rs` that rely on `X-Forwarded-For` for rate
|
||||
> limiting will need to be updated. That work is tracked separately in
|
||||
> `fix/rate-limiter-connectinfo-tests`.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
96
tasks/fix/remove-dead-code-remnants.md
Normal file
96
tasks/fix/remove-dead-code-remnants.md
Normal file
@@ -0,0 +1,96 @@
|
||||
---
|
||||
id: fix/remove-dead-code-remnants
|
||||
name: Remove dead code items identified in security review #003 (S1)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: trivial
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [S1]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
Security review #003 identifies the following dead code items that survived the
|
||||
previous `fix/clean-dead-code` task. These items are defined but never called in
|
||||
production code:
|
||||
|
||||
| Item | File | Note |
|
||||
|------|------|------|
|
||||
| `log_rate_limit!` macro | `src/logging/format.rs:72-83` | Rate limiter uses `warn!` directly |
|
||||
| `log_config_reload!` macro | `src/logging/format.rs:97-106` | Reload uses `info!`/`warn!` directly |
|
||||
| `format_event_fields()` | `src/logging/format.rs:50-54` | Never called |
|
||||
| `ProxyError::PayloadTooLarge` | `src/proxy/error.rs:12` | Body limit returns tuple directly |
|
||||
| `ProxyError::NotFound` | `src/proxy/error.rs:20` | Code uses `UnknownHost` instead |
|
||||
| `ProxyError::BadRequest` | `src/proxy/error.rs:22` | Code uses `MissingHost` instead |
|
||||
| `ProxyError::UpstreamTls` | `src/proxy/error.rs:28` | Never constructed |
|
||||
| `build_multi_domain_server_config()` | `src/tls/config.rs:78-102` | Never called |
|
||||
| `SniCertResolver` | `src/tls/config.rs:104-126` | Only used by above |
|
||||
| `AcmeTlsConfig::directory_url()` | `src/tls/acme.rs:53-59` | Only used in tests |
|
||||
|
||||
### Changes Required
|
||||
|
||||
For each item, either wire it up (if it should be used) or remove it:
|
||||
|
||||
**Logging macros** (`src/logging/format.rs`):
|
||||
- `log_rate_limit!`: The rate limiter middleware now uses `warn!` directly.
|
||||
Remove the macro and its test invocation in `log_macros_compile`.
|
||||
- `log_config_reload!`: The admin socket reload uses `info!`/`warn!` directly.
|
||||
Remove the macro and its test invocation.
|
||||
- `format_event_fields()`: Never called externally. Remove it.
|
||||
|
||||
**ProxyError variants** (`src/proxy/error.rs`):
|
||||
- `PayloadTooLarge`: Not used by body limit middleware (returns tuple directly).
|
||||
Keep for now with a comment explaining it's reserved for future use, OR remove
|
||||
if no plan to use it. The `#[non_exhaustive]` attribute means removing a variant
|
||||
is not a breaking change for external consumers.
|
||||
- `NotFound`: Superseded by `UnknownHost`. Remove and update `status_code()` and
|
||||
`body()` match arms.
|
||||
- `BadRequest`: Superseded by `MissingHost`. Remove and update match arms.
|
||||
- `UpstreamTls`: Never constructed. Remove and update match arms.
|
||||
- Update unit tests that reference removed variants.
|
||||
|
||||
**TLS dead code** (`src/tls/config.rs`, `src/tls/acme.rs`):
|
||||
- `build_multi_domain_server_config()`: Remove function.
|
||||
- `SniCertResolver`: Remove struct and impl (only used by above function).
|
||||
- `AcmeTlsConfig::directory_url()`: Check if it's used in tests. If only in
|
||||
tests, gate with `#[cfg(test)]`. If truly unused, remove.
|
||||
|
||||
**Test helpers** (`tests/helpers/http_test_helper.rs`):
|
||||
- `TestUpstream::url()`, `TestUpstream::upstream_addr()`: Currently annotated
|
||||
`#[allow(dead_code)]`. If not used in any test, remove. If used, remove the
|
||||
`#[allow(dead_code)]` annotation.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `log_rate_limit!` and `log_config_reload!` macros removed
|
||||
- [ ] `format_event_fields()` removed
|
||||
- [ ] Unused `ProxyError` variants removed (`NotFound`, `BadRequest`, `UpstreamTls`)
|
||||
- [ ] `PayloadTooLarge` either removed or documented with a TODO comment
|
||||
- [ ] `build_multi_domain_server_config()` and `SniCertResolver` removed
|
||||
- [ ] `directory_url()` either `#[cfg(test)]`-gated or removed
|
||||
- [ ] Dead test helper methods removed or `#[allow(dead_code)]` removed
|
||||
- [ ] All match arms in `ProxyError` methods updated for removed variants
|
||||
- [ ] All unit tests updated for removed variants
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — S1 finding
|
||||
- src/logging/format.rs — dead macros and function
|
||||
- src/proxy/error.rs — unused ProxyError variants
|
||||
- src/tls/config.rs — dead TLS functions
|
||||
- src/tls/acme.rs — directory_url method
|
||||
- tests/helpers/http_test_helper.rs — dead_code-annotated helpers
|
||||
|
||||
## Notes
|
||||
|
||||
> The previous `fix/clean-dead-code` task added `#[non_exhaustive]` and removed
|
||||
> some dead code. This task addresses the items that survived that round or were
|
||||
> newly identified in review #003.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
57
tasks/fix/rename-misleading-test.md
Normal file
57
tasks/fix/rename-misleading-test.md
Normal file
@@ -0,0 +1,57 @@
|
||||
---
|
||||
id: fix/rename-misleading-test
|
||||
name: Rename misleading health check test and fix dynamic config test (W8, W9)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [W8, W9]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
Two test quality issues in `tests/integration_test.rs`:
|
||||
|
||||
**W8**: `test_health_check_disabled_when_port_zero` is misleading — port 0 means
|
||||
"OS picks a random port" not "disabled". The test verifies that
|
||||
`start_health_check_listener(0)` works (binds to a random port), which is
|
||||
correct but the name implies it tests the disable path. The actual disable
|
||||
logic is in `main.rs:94` where `health_check_port > 0` gates the call.
|
||||
|
||||
**W9**: `test_dynamic_config_with_limit` constructs `DynamicConfig` directly
|
||||
with `routing_table: Default::default()` (empty HashMap), bypassing
|
||||
`DynamicConfig::from_sites()`. The body limit tests only read
|
||||
`body.limit_bytes` so it doesn't matter, but it's a latent trap for anyone
|
||||
copying this pattern.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`tests/integration_test.rs`**:
|
||||
- Rename `test_health_check_disabled_when_port_zero` to
|
||||
`test_health_check_binds_random_port_when_zero`
|
||||
- Update `test_dynamic_config_with_limit` to use `DynamicConfig::from_sites()`
|
||||
instead of directly constructing with an empty routing table, OR add a
|
||||
comment explaining why the routing table is intentionally empty for this test
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Test renamed to `test_health_check_binds_random_port_when_zero`
|
||||
- [ ] `test_dynamic_config_with_limit` either uses `from_sites()` or has a
|
||||
comment explaining the intentional empty routing table
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — W8, W9 findings
|
||||
- tests/integration_test.rs — test names and implementations
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
62
tasks/fix/tls-mode-wildcard-mismatch.md
Normal file
62
tasks/fix/tls-mode-wildcard-mismatch.md
Normal file
@@ -0,0 +1,62 @@
|
||||
---
|
||||
id: fix/tls-mode-wildcard-mismatch
|
||||
name: Add explicit listener/acceptor count check or remove TlsMode wildcard (W5)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [W5]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
The `match tls_mode` in `main.rs` has a wildcard `_` arm that logs a warning
|
||||
and pushes **no** acceptor. Then `bound_listeners.into_iter().zip(tls_acceptors.into_iter())`
|
||||
uses `zip`, which silently stops at the shorter iterator. If the wildcard arm
|
||||
were ever reached, some listeners would have no TLS acceptor and would be
|
||||
silently dropped.
|
||||
|
||||
`setup_tls` already rejects unknown modes with `bail!`, so the wildcard is
|
||||
unreachable in practice. But it's a latent bug for future refactors.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/main.rs`** (lines 170-194):
|
||||
- Option A (preferred): Remove the wildcard `_` arm entirely. Since `TlsMode`
|
||||
only has two variants (`Manual` and `Acme`) and `setup_tls` already validates,
|
||||
the wildcard is dead code. Removing it means the compiler will catch future
|
||||
`TlsMode` additions.
|
||||
|
||||
- Option B: Add an explicit count check after the match loop:
|
||||
```rust
|
||||
if bound_listeners.len() != tls_acceptors.len() {
|
||||
anyhow::bail!("listener/acceptor count mismatch: {} listeners, {} acceptors",
|
||||
bound_listeners.len(), tls_acceptors.len());
|
||||
}
|
||||
```
|
||||
|
||||
If removing the wildcard, this check is redundant but harmless as a
|
||||
defense-in-depth assertion.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Wildcard `_` arm removed from the `match tls_mode` block, OR
|
||||
- [ ] Explicit count mismatch check added after the acceptor construction loop
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — W5 finding
|
||||
- src/main.rs — TLS acceptor construction loop (lines 170-194)
|
||||
- src/tls/acceptor.rs — `setup_tls`, `TlsMode` enum
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
51
tasks/fix/token-bucket-field-visibility.md
Normal file
51
tasks/fix/token-bucket-field-visibility.md
Normal file
@@ -0,0 +1,51 @@
|
||||
---
|
||||
id: fix/token-bucket-field-visibility
|
||||
name: Make TokenBucket fields private except last_access (W10, S6)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
impact: isolated
|
||||
level: implementation
|
||||
review_findings: [W10, S6]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
All `TokenBucket` fields are `pub` but only `last_access` is read externally (by
|
||||
`evict_stale` in `rate_limit/mod.rs`). The other fields (`tokens`, `last_refill`,
|
||||
`rate`, `max`) should be private to prevent accidental direct mutation that
|
||||
bypasses `try_consume`/`refill` logic.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/rate_limit/bucket.rs`**:
|
||||
- Make `tokens`, `last_refill`, `rate`, `max` private (remove `pub`)
|
||||
- Keep `last_access` as `pub(crate)` for `evict_stale` access
|
||||
- `TokenBucket::new()` already exists as a constructor, so no changes needed there
|
||||
- Update any unit tests that directly access private fields. The tests in
|
||||
`bucket.rs` are in the same module so they have access to private fields.
|
||||
Tests in `mod.rs` may need adjustment if they access `bucket.tokens` etc.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `tokens`, `last_refill`, `rate`, `max` fields are private
|
||||
- [ ] `last_access` is `pub(crate)`
|
||||
- [ ] `new()` constructor is the only way to create a `TokenBucket` externally
|
||||
- [ ] `evict_stale` still compiles and works (uses `last_access`)
|
||||
- [ ] All unit tests pass (in-module tests can still access private fields)
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — W10, S6 findings
|
||||
- src/rate_limit/bucket.rs — TokenBucket struct
|
||||
- src/rate_limit/mod.rs — evict_stale
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
83
tasks/fix/upstream-host-validation.md
Normal file
83
tasks/fix/upstream-host-validation.md
Normal file
@@ -0,0 +1,83 @@
|
||||
---
|
||||
id: fix/upstream-host-validation
|
||||
name: Validate host part of upstream address in config (W1)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [W1]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`is_valid_upstream` checks that the upstream has a `host:port` format with a
|
||||
valid port number, but performs **no validation on the host part** beyond
|
||||
checking it's non-empty and doesn't start with `http://` or `https://`. Values
|
||||
like `!!!bad!!!:3000` or `@#$%:8080` pass validation.
|
||||
|
||||
The spec (config.md validation rule 17) now requires: "the host part must parse
|
||||
as a valid IpAddr or pass is_valid_hostname validation." Bracket-enclosed values
|
||||
must be parsed as IPv6 addresses.
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/config/validation.rs`** — `is_valid_upstream` function (lines 309-327):
|
||||
- After validating the port, validate the host part:
|
||||
```rust
|
||||
fn is_valid_upstream(upstream: &str) -> bool {
|
||||
if let Some(idx) = upstream.rfind(':') {
|
||||
let host_part = &upstream[..idx];
|
||||
let port_str = &upstream[idx + 1..];
|
||||
if host_part.is_empty() { return false; }
|
||||
if upstream.starts_with("http://") || upstream.starts_with("https://") { return false; }
|
||||
let port: u16 = match port_str.parse() { Ok(p) => p, Err(_) => return false };
|
||||
if port == 0 { return false; }
|
||||
// Validate host part per config.md rule 17
|
||||
if host_part.starts_with('[') && host_part.ends_with(']') {
|
||||
let inner = &host_part[1..host_part.len()-1];
|
||||
inner.parse::<std::net::Ipv6Addr>().is_ok()
|
||||
} else {
|
||||
host_part.parse::<std::net::IpAddr>().is_ok() || is_valid_hostname(host_part)
|
||||
}
|
||||
} else {
|
||||
false
|
||||
}
|
||||
}
|
||||
```
|
||||
- Note: `is_valid_hostname` already exists in the same file and is used for site
|
||||
`host` validation. It rejects IP addresses, which is correct for site hosts
|
||||
but wrong for upstream hosts — upstream hosts CAN be IPs. The upstream
|
||||
validation must check `IpAddr::parse` first, then fall back to
|
||||
`is_valid_hostname` for DNS names.
|
||||
|
||||
- Add tests for:
|
||||
- Valid: `gitea:3000`, `127.0.0.1:3000`, `[::1]:3000`
|
||||
- Invalid: `!!!bad!!!:3000`, `@#$%:8080`, `:3000` (empty host)
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `is_valid_upstream` validates the host part as IP address or valid hostname
|
||||
- [ ] IPv6 bracket notation is handled (e.g., `[::1]:3000`)
|
||||
- [ ] Invalid host characters like `!!!bad!!!:3000` are rejected
|
||||
- [ ] Valid upstream formats still pass: `gitea:3000`, `127.0.0.1:3000`
|
||||
- [ ] New unit tests for valid and invalid upstream host parts
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/config.md — validation rule 17
|
||||
- docs/reviews/003-security-and-bug-review.md — W1 finding
|
||||
- src/config/validation.rs — `is_valid_upstream`, `is_valid_hostname`
|
||||
|
||||
## Notes
|
||||
|
||||
> `is_valid_hostname` currently rejects IP addresses (intentional for site
|
||||
> hosts). The upstream validation must handle IPs separately before falling
|
||||
> back to `is_valid_hostname`.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
80
tasks/fix/upstream-uri-error-handling.md
Normal file
80
tasks/fix/upstream-uri-error-handling.md
Normal file
@@ -0,0 +1,80 @@
|
||||
---
|
||||
id: fix/upstream-uri-error-handling
|
||||
name: Return 502 on upstream URI parse failure instead of dropping query string (W3)
|
||||
status: pending
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
impact: component
|
||||
level: implementation
|
||||
review_findings: [W3]
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`build_upstream_uri` silently drops the query string on parse failure and
|
||||
`.unwrap()`s the fallback. This corrupts requests — the upstream receives the
|
||||
wrong URL with no query parameters, and neither the client nor operator is
|
||||
notified. The `.unwrap()` on the fallback could also panic.
|
||||
|
||||
The spec (proxy.md request forwarding step 1) states: "If URI construction
|
||||
fails (e.g., the resulting URI is malformed), the proxy must return 502 Bad
|
||||
Gateway and log the error at warn level. The proxy must never silently drop
|
||||
parts of the URI."
|
||||
|
||||
### Changes Required
|
||||
|
||||
**`src/proxy/handler.rs`**:
|
||||
- Change `build_upstream_uri` to return `Result<Uri, ()>`:
|
||||
```rust
|
||||
fn build_upstream_uri(scheme: &str, upstream: &str, original_uri: &Uri) -> Result<Uri, ()> {
|
||||
let path = original_uri.path();
|
||||
let query = original_uri
|
||||
.query()
|
||||
.map(|q| format!("?{}", q))
|
||||
.unwrap_or_default();
|
||||
let uri_string = format!("{}://{}{}{}", scheme, upstream, path, query);
|
||||
uri_string.parse::<Uri>().map_err(|e| {
|
||||
warn!(error = %e, uri = %uri_string, "failed to parse upstream URI");
|
||||
})
|
||||
}
|
||||
```
|
||||
- Update `proxy_handler` to handle the `Err` case:
|
||||
```rust
|
||||
let upstream_uri = match build_upstream_uri(&upstream_scheme, &upstream, req.uri()) {
|
||||
Ok(uri) => uri,
|
||||
Err(()) => {
|
||||
log_upstream_error!(&host_owned, &upstream_addr, "malformed upstream URI");
|
||||
let duration_ms = start.elapsed().as_millis() as u64;
|
||||
log_request!(&client_ip, &host_owned, &method, &path, 502u16, &upstream, duration_ms);
|
||||
return StatusCode::BAD_GATEWAY.into_response();
|
||||
}
|
||||
};
|
||||
```
|
||||
- Update the existing unit tests for `build_upstream_uri` to handle the
|
||||
`Result` return type.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `build_upstream_uri` returns `Result<Uri, ()>` instead of `Uri`
|
||||
- [ ] URI parse failure logs a warning with the malformed URI string
|
||||
- [ ] URI parse failure returns 502 Bad Gateway to the client
|
||||
- [ ] No `.unwrap()` calls in `build_upstream_uri`
|
||||
- [ ] Query strings are never silently dropped
|
||||
- [ ] Existing unit tests updated for `Result` return type
|
||||
- [ ] `cargo test` passes
|
||||
- [ ] `cargo clippy` passes with no warnings
|
||||
|
||||
## References
|
||||
|
||||
- docs/architecture/proxy.md — request forwarding step 1, URI error handling
|
||||
- docs/reviews/003-security-and-bug-review.md — W3 finding
|
||||
- src/proxy/handler.rs — `build_upstream_uri`, `proxy_handler`
|
||||
|
||||
## Notes
|
||||
|
||||
> To be filled on completion
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
60
tasks/review/post-security-fix-review.md
Normal file
60
tasks/review/post-security-fix-review.md
Normal file
@@ -0,0 +1,60 @@
|
||||
---
|
||||
id: review/post-security-fix-review
|
||||
name: Review security fix implementations before production consideration
|
||||
status: pending
|
||||
depends_on:
|
||||
- fix/rate-limiter-ip-source
|
||||
- fix/inflight-counter-increment
|
||||
- fix/connector-timeout-ceiling
|
||||
- fix/json-format-without-logfile
|
||||
- fix/upstream-host-validation
|
||||
- fix/acme-contact-validation
|
||||
- fix/upstream-uri-error-handling
|
||||
- fix/admin-socket-resource-limits
|
||||
- fix/consolidate-config-types
|
||||
- fix/rate-limiter-connectinfo-tests
|
||||
scope: moderate
|
||||
risk: low
|
||||
impact: project
|
||||
level: review
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
Review all security and bug fix implementations from Review #003 before
|
||||
considering them production-ready. Verify that the fixes correctly implement
|
||||
the architecture decisions (ADR-025, ADR-026, ADR-027) and the updated spec
|
||||
documents.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] C1 fix: Rate limiter uses ConnectInfo only, rejects without it (ADR-025)
|
||||
- [ ] C2 fix: InFlightCounter increments before task spawn, drain polls 100ms
|
||||
- [ ] C3 fix: Connector ceiling is 30s, per-site timeouts work >5s (ADR-026)
|
||||
- [ ] C4 fix: JSON format applied in stdout-only path
|
||||
- [ ] W1 fix: Upstream host part validated (DNS name or IP, IPv6 brackets)
|
||||
- [ ] W2 fix: ACME contact email validated (non-empty, contains @)
|
||||
- [ ] W3 fix: URI parse failure returns 502, never drops query string silently
|
||||
- [ ] W4 fix: Admin socket has 5s timeout and 4096 byte line limit (ADR-027)
|
||||
- [ ] W6 fix: RawConfig eliminated, FullConfig used in both paths
|
||||
- [ ] S10 fix: Rate limit tests use ConnectInfo, verify XFF is ignored
|
||||
- [ ] All `cargo test` passes
|
||||
- [ ] All `cargo clippy` passes with no warnings
|
||||
- [ ] No regressions in integration tests
|
||||
|
||||
## References
|
||||
|
||||
- docs/reviews/003-security-and-bug-review.md — all findings
|
||||
- docs/architecture/decisions/025-rate-limiter-ip-source.md — ADR-025
|
||||
- docs/architecture/decisions/026-connector-timeout-ceiling.md — ADR-026
|
||||
- docs/architecture/decisions/027-admin-socket-resource-limits.md — ADR-027
|
||||
|
||||
## Notes
|
||||
|
||||
> This review covers the critical security fixes and the sensitive config
|
||||
> consolidation. It should be the last task before the generation 4+ code
|
||||
> quality items are considered final.
|
||||
|
||||
## Summary
|
||||
|
||||
> To be filled on completion
|
||||
Reference in New Issue
Block a user