154
docs/reviews/004-post-fix-review.md
Normal file
154
docs/reviews/004-post-fix-review.md
Normal file
@@ -0,0 +1,154 @@
|
|||||||
|
---
|
||||||
|
status: draft
|
||||||
|
last_updated: 2026-06-12
|
||||||
|
reviewed_code:
|
||||||
|
- src/main.rs
|
||||||
|
- src/server.rs
|
||||||
|
- src/cli.rs
|
||||||
|
- src/config/static_config.rs
|
||||||
|
- src/config/validation.rs
|
||||||
|
- src/proxy/handler.rs
|
||||||
|
- src/proxy/error.rs
|
||||||
|
- src/rate_limit/mod.rs
|
||||||
|
- src/rate_limit/bucket.rs
|
||||||
|
- src/admin/socket.rs
|
||||||
|
- src/logging/mod.rs
|
||||||
|
- src/logging/format.rs
|
||||||
|
- src/tls/acceptor.rs
|
||||||
|
- src/tls/config.rs
|
||||||
|
- tests/integration_test.rs
|
||||||
|
- tests/helpers/http_test_helper.rs
|
||||||
|
reviewer: code-reviewer
|
||||||
|
based_on: docs/reviews/003-security-and-bug-review.md
|
||||||
|
---
|
||||||
|
|
||||||
|
# Post-Fix Review #004
|
||||||
|
|
||||||
|
## Purpose
|
||||||
|
|
||||||
|
Verify that the 4 critical, 12 warning, and 10 suggestion findings from review
|
||||||
|
#003 were correctly addressed. Flag any regressions or remaining issues.
|
||||||
|
|
||||||
|
## Critical Findings — All Fixed
|
||||||
|
|
||||||
|
### C1. Rate Limiter IP Source ✓
|
||||||
|
|
||||||
|
`src/rate_limit/mod.rs:66-73` — Now uses `ConnectInfo` exclusively. No fallback
|
||||||
|
to client-supplied `X-Forwarded-For`. Returns 429 if `ConnectInfo` is absent.
|
||||||
|
Three new tests verify: ConnectInfo-only extraction, rejection without
|
||||||
|
ConnectInfo, and XFF header ignored (same bucket for different XFF values).
|
||||||
|
|
||||||
|
### C2. InFlightCounter Increment ✓
|
||||||
|
|
||||||
|
`src/server.rs:23-28,81` — `InFlightGuard::new()` now calls
|
||||||
|
`counter.increment()` on construction. The guard is created with
|
||||||
|
`InFlightGuard::new(in_flight.clone())` at line 81. Drain interval tightened
|
||||||
|
from 50ms to 100ms.
|
||||||
|
|
||||||
|
### C3. Connect Timeout Ceiling ✓
|
||||||
|
|
||||||
|
`src/proxy/handler.rs:230,234,242` — `CONNECT_TIMEOUT_CEILING_SECS = 30`
|
||||||
|
replaces the hardcoded 5s. Per-site `tokio::time::timeout` can now enforce any
|
||||||
|
value up to 30s.
|
||||||
|
|
||||||
|
### C4. JSON Format Without Log File ✓
|
||||||
|
|
||||||
|
`src/logging/mod.rs:55-56` — `.json()` added to the `None` branch of
|
||||||
|
`init_json`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Warning Findings — All Fixed
|
||||||
|
|
||||||
|
| # | Finding | Status |
|
||||||
|
|---|---------|--------|
|
||||||
|
| W1 | Upstream host validation | ✓ `is_valid_upstream` validates host part (IPv6 bracket, IP, hostname). 9 new tests. |
|
||||||
|
| W2 | ACME contact validation | ✓ Checks non-empty email with `@` after `mailto:`. 2 new tests (`mailto:` empty, no `@`). |
|
||||||
|
| W3 | Query string silently dropped | ✓ `build_upstream_uri` returns `Result`, logs warning, handler returns 502. |
|
||||||
|
| W4 | Admin socket resource limits | ✓ 5s read timeout, 4096 byte limit via `reader.take(4096)`. 3 new tests. |
|
||||||
|
| W5 | TlsMode wildcard mismatch | ✓ Wildcard arm removed. Count mismatch check added at `main.rs:190-196`. |
|
||||||
|
| W6 | RawConfig/FullConfig duplication | ✓ `RawConfig` removed. `load_config` uses `FullConfig::parse()` + `into_static_and_dynamic()`. |
|
||||||
|
| W7 | cleanup_stale_socket side effect | Not addressed (acceptable per review — Phase 1). |
|
||||||
|
| W8 | Misleading test name | ~ Test name unchanged despite commit claim. See N1 below. |
|
||||||
|
| W9 | Empty routing_table in test helper | ✓ `test_dynamic_config_with_limit` now uses `DynamicConfig::from_sites()`. |
|
||||||
|
| W10 | TokenBucket pub fields | ✓ `tokens`, `last_refill`, `rate`, `max` now private. `last_access` is `pub(crate)`. |
|
||||||
|
| W11 | reload_mutex exposed for testing | ✓ Gated with `#[cfg(test)]`. |
|
||||||
|
| W12 | http_port u32 vs u16 | ✓ `http_port` changed to `u16`. All `as u32` casts removed. |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Suggestions — Addressed
|
||||||
|
|
||||||
|
| # | Suggestion | Status |
|
||||||
|
|---|-----------|--------|
|
||||||
|
| S1 | Remove dead code | ✓ `log_rate_limit!`, `log_config_reload!`, `format_event_fields`, 4 unused `ProxyError` variants, `build_multi_domain_server_config`, `SniCertResolver`, `TestUpstream::url()`/`upstream_addr()` all removed. `KvVisitor` gated with `#[cfg(test)]`. |
|
||||||
|
| S3 | Log root cert count | ✓ `root_certs()` logs count at info, warns if zero. |
|
||||||
|
| S4 | Admin socket resource limits | ✓ (same as W4) |
|
||||||
|
| S5 | Consolidate config types | ✓ (same as W6) |
|
||||||
|
| S6 | TokenBucket fields private | ✓ (same as W10) |
|
||||||
|
| S8 | Connect timeout ceiling | ✓ (same as C3) |
|
||||||
|
| S10 | ConnectInfo rate limit tests | ✓ 3 new tests in `src/rate_limit/mod.rs`. |
|
||||||
|
| S2, S7, S9 | Eviction config, status info, acme_contact Vec | Deferred (Phase 2 / config format change). |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## New Findings
|
||||||
|
|
||||||
|
### N1. Test Name `test_health_check_disabled_when_port_zero` Still Misleading
|
||||||
|
|
||||||
|
**File**: `tests/integration_test.rs:82`
|
||||||
|
|
||||||
|
**Problem**: Commit `855c0f1` claimed to rename this test but the name is
|
||||||
|
unchanged. Port 0 means "OS picks a random port" — the health check listener
|
||||||
|
still starts. The actual disable logic is `health_check_port > 0` in
|
||||||
|
`main.rs:94`. The test name implies it tests the disable path but it tests the
|
||||||
|
random-port binding path.
|
||||||
|
|
||||||
|
**Severity**: Suggestion — rename to `test_health_check_binds_random_port_when_zero`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### N2. `FullConfig` Test Struct Duplicated in `static_config.rs`
|
||||||
|
|
||||||
|
**File**: `src/config/static_config.rs:189-206`
|
||||||
|
|
||||||
|
**Problem**: The test module defines its own `FullConfig` struct with
|
||||||
|
`#[allow(dead_code)]` that mirrors the production `FullConfig` in
|
||||||
|
`config/mod.rs`. If fields change, both must be updated. This is test
|
||||||
|
infrastructure only, so low risk.
|
||||||
|
|
||||||
|
**Severity**: Suggestion — use the production `FullConfig` from `config/mod.rs`
|
||||||
|
in these tests, or accept the duplication as intentional test isolation.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### N3. `is_valid_upstream` Accepts Bare IPv6 Without Brackets
|
||||||
|
|
||||||
|
**File**: `src/config/validation.rs:327`
|
||||||
|
|
||||||
|
**Problem**: `::1:3000` passes validation because `::1` parses as a valid
|
||||||
|
`IpAddr`. The config spec says IPv6 should use bracket notation
|
||||||
|
(`[::1]:3000`). In practice nobody writes bare IPv6 with a port, and the
|
||||||
|
address would still work, so this is extremely low risk.
|
||||||
|
|
||||||
|
**Severity**: Suggestion — could tighten by requiring brackets for IPv6, but
|
||||||
|
not worth the churn.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
All 4 critical and 11 of 12 warning findings are properly fixed. W7 was
|
||||||
|
explicitly deferred as acceptable for Phase 1. W8 (test rename) was claimed
|
||||||
|
fixed but the name wasn't actually changed — minor. Three new suggestion-level
|
||||||
|
notes, none blocking.
|
||||||
|
|
||||||
|
The codebase is in significantly better shape than at review #003. The rate
|
||||||
|
limiter spoofing vulnerability (C1) was the most important fix and it's done
|
||||||
|
correctly with good test coverage.
|
||||||
|
|
||||||
|
| Severity | Count |
|
||||||
|
|----------|-------|
|
||||||
|
| Critical | 0 |
|
||||||
|
| Warning | 0 |
|
||||||
|
| Suggestion | 3 (new, minor) |
|
||||||
638
docs/reviews/005-admin-socket-security-review.md
Normal file
638
docs/reviews/005-admin-socket-security-review.md
Normal file
@@ -0,0 +1,638 @@
|
|||||||
|
---
|
||||||
|
status: draft
|
||||||
|
last_updated: 2026-06-14
|
||||||
|
reviewed_code:
|
||||||
|
- src/admin/socket.rs
|
||||||
|
- src/admin/mod.rs
|
||||||
|
- src/main.rs
|
||||||
|
- src/shutdown.rs
|
||||||
|
- src/config/dynamic_config.rs
|
||||||
|
- src/config/validation.rs
|
||||||
|
- src/health.rs
|
||||||
|
- src/proxy/handler.rs
|
||||||
|
- src/proxy/headers.rs
|
||||||
|
- src/proxy/mod.rs
|
||||||
|
- src/rate_limit/mod.rs
|
||||||
|
- src/server.rs
|
||||||
|
reviewer: code-reviewer
|
||||||
|
based_on: docs/reviews/004-post-fix-review.md
|
||||||
|
---
|
||||||
|
|
||||||
|
# Admin Socket Security Review #005
|
||||||
|
|
||||||
|
## Purpose
|
||||||
|
|
||||||
|
Focused security review of `src/admin/socket.rs` and related code paths,
|
||||||
|
triggered by unusual traffic patterns observed on the file in both public and
|
||||||
|
self-hosted git. The review examines the Unix domain socket admin interface for
|
||||||
|
vulnerabilities including symlink attacks, privilege escalation, information
|
||||||
|
disclosure, and DoS vectors. Broader codebase was also surveyed for related
|
||||||
|
issues.
|
||||||
|
|
||||||
|
## Severity Definitions
|
||||||
|
|
||||||
|
| Severity | Meaning |
|
||||||
|
|----------|---------|
|
||||||
|
| **Critical** | Will cause incorrect behavior or security issues in production |
|
||||||
|
| **Warning** | Could cause issues under specific conditions or represents a missed edge case |
|
||||||
|
| **Suggestion** | Code quality, style, or minor improvement opportunity |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Critical Findings
|
||||||
|
|
||||||
|
### C1. Symlink Race in `cleanup_stale_socket` Enables Arbitrary File Deletion
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:143-161`
|
||||||
|
|
||||||
|
**Problem**: `cleanup_stale_socket` checks whether a socket file exists and
|
||||||
|
whether another process is listening on it, then removes the file. Between the
|
||||||
|
`is_socket_active()` check (which connects to the socket) and the
|
||||||
|
`remove_file()` call, an attacker with local access can replace the socket file
|
||||||
|
with a symlink pointing to an arbitrary path (e.g., `/etc/passwd`, a critical
|
||||||
|
database file). The `remove_file()` call then follows the symlink and deletes
|
||||||
|
the target:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn cleanup_stale_socket(path: &str) -> Result<(), AdminSocketError> {
|
||||||
|
let socket_path = Path::new(path);
|
||||||
|
if !socket_path.exists() {
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
|
||||||
|
if is_socket_active(path).await { // check
|
||||||
|
// ...
|
||||||
|
return Err(AdminSocketError::SocketInUse(path.to_string()));
|
||||||
|
}
|
||||||
|
|
||||||
|
warn!("removing stale socket file: {}", path);
|
||||||
|
tokio::fs::remove_file(path) // act — follows symlinks!
|
||||||
|
.await
|
||||||
|
.map_err(AdminSocketError::Io)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This is a classic TOCTOU (time-of-check/time-of-use) race. The window is small
|
||||||
|
but exploitable with local access, which is exactly the threat model for a Unix
|
||||||
|
domain socket (any local user can reach it).
|
||||||
|
|
||||||
|
Additionally, `is_socket_active` works by **actually connecting** to the socket
|
||||||
|
(`src/admin/socket.rs:163-165`). If the path is a symlink to another service's
|
||||||
|
Unix socket, this creates a real connection in that service, which is an
|
||||||
|
unintended side effect that could be logged or trigger behavior in the other
|
||||||
|
service.
|
||||||
|
|
||||||
|
**Solution**: Replace the check-then-remove sequence with a safe alternative:
|
||||||
|
|
||||||
|
1. Use `std::fs::metadata()` to verify the file is actually a socket before
|
||||||
|
removal (sockets cannot be symlinked to — `metadata` does not follow
|
||||||
|
symlinks when `std::fs::symlink_metadata` is used, and sockets are a
|
||||||
|
distinct file type).
|
||||||
|
|
||||||
|
2. Alternatively, use `std::fs::remove_file` only after verifying with
|
||||||
|
`symlink_metadata` that the file type is `FileType::is_socket()`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
async fn cleanup_stale_socket(path: &str) -> Result<(), AdminSocketError> {
|
||||||
|
let socket_path = Path::new(path);
|
||||||
|
if !socket_path.exists() {
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
|
||||||
|
let metadata = std::fs::symlink_metadata(path)
|
||||||
|
.map_err(AdminSocketError::Io)?;
|
||||||
|
|
||||||
|
if metadata.file_type().is_symlink() {
|
||||||
|
warn!("admin socket path {} is a symlink, refusing to remove", path);
|
||||||
|
return Err(AdminSocketError::BindFailed(
|
||||||
|
"socket path is a symlink, refusing to remove".to_string()
|
||||||
|
));
|
||||||
|
}
|
||||||
|
|
||||||
|
if !metadata.file_type().is_socket() {
|
||||||
|
warn!("admin socket path {} is not a socket file, refusing to remove", path);
|
||||||
|
return Err(AdminSocketError::BindFailed(
|
||||||
|
"path exists but is not a socket".to_string()
|
||||||
|
));
|
||||||
|
}
|
||||||
|
|
||||||
|
if is_socket_active(path).await {
|
||||||
|
warn!("socket file {} exists and another process is listening; disabling admin socket", path);
|
||||||
|
return Err(AdminSocketError::SocketInUse(path.to_string()));
|
||||||
|
}
|
||||||
|
|
||||||
|
warn!("removing stale socket file: {}", path);
|
||||||
|
tokio::fs::remove_file(path)
|
||||||
|
.await
|
||||||
|
.map_err(AdminSocketError::Io)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This prevents both symlink attacks (refuses to remove symlinks) and accidental
|
||||||
|
removal of non-socket files.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### C2. Admin Socket Has No Access Control — Any Local User Can Trigger Reload
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:108-133,254-305`
|
||||||
|
|
||||||
|
**Problem**: The admin socket accepts connections from any local user. There is
|
||||||
|
no authentication, no peer credential check, and no ownership/permission
|
||||||
|
restriction on the socket file itself. After `UnixListener::bind()`, the socket
|
||||||
|
inherits the process umask but no explicit restrictive permissions are set.
|
||||||
|
|
||||||
|
Any local user who can reach the socket path can:
|
||||||
|
|
||||||
|
1. **Trigger a config reload** (`reload` command) — re-reads the config file
|
||||||
|
from disk and hot-swaps the live routing table. If an attacker can write to
|
||||||
|
the config file path (e.g., via a separate misconfiguration or directory
|
||||||
|
permission issue), they can chain this: write a malicious config, then send
|
||||||
|
`reload` via the admin socket to activate it. This could redirect traffic
|
||||||
|
to an attacker-controlled upstream.
|
||||||
|
|
||||||
|
2. **Read status** (`status` command) — reveals uptime and number of configured
|
||||||
|
sites. Minor information disclosure but useful for reconnaissance.
|
||||||
|
|
||||||
|
The `reload` command is particularly dangerous because it reads the config file
|
||||||
|
from disk each time (`src/admin/socket.rs:257`). The config path is set at
|
||||||
|
startup from `StaticConfig` and cannot be changed at runtime, but the file
|
||||||
|
contents at that path can be modified by any process with write access. The
|
||||||
|
admin socket becomes a trigger mechanism for activating malicious configs.
|
||||||
|
|
||||||
|
**Solution**: Multi-layered defense:
|
||||||
|
|
||||||
|
1. Set restrictive permissions on the socket immediately after binding:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use std::os::unix::fs::PermissionsExt;
|
||||||
|
|
||||||
|
let listener = UnixListener::bind(socket_path)?;
|
||||||
|
let perms = std::fs::Permissions::from_mode(0o660); // owner + group only
|
||||||
|
std::fs::set_permissions(socket_path, perms)?;
|
||||||
|
```
|
||||||
|
|
||||||
|
Or `0o600` for owner-only access. This requires the proxy to run under a
|
||||||
|
dedicated user and for the admin tool (`socat`, etc.) to run as the same
|
||||||
|
user or group.
|
||||||
|
|
||||||
|
2. Add peer credential checking using `SO_PEERCRED` on Linux:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use std::os::unix::net::UnixStream;
|
||||||
|
|
||||||
|
fn check_peer_uid(stream: &UnixStream) -> bool {
|
||||||
|
use std::os::unix::io::AsRawFd;
|
||||||
|
let uid = nix::sys::socket::getpeereid(stream.as_raw_fd())
|
||||||
|
.map(|(_, uid, _)| uid);
|
||||||
|
match uid {
|
||||||
|
Ok(peer_uid) => peer_uid == 0 || peer_uid == get_current_uid(),
|
||||||
|
Err(_) => false,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Document that the socket path should be in a directory with restrictive
|
||||||
|
permissions (e.g., `/run/reverse-proxy/` owned by the proxy user with mode
|
||||||
|
`0700`).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### C3. Admin Socket `reload` Response Leaks Filesystem Paths and Error Details
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:257-265,268-276`
|
||||||
|
|
||||||
|
**Problem**: When the `reload` command fails, the error response includes the
|
||||||
|
full `std::io::Error` or `toml::de::Error` message, which can contain absolute
|
||||||
|
filesystem paths, file permissions, and internal config structure details:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
Err(e) => {
|
||||||
|
return serde_json::to_string(&ErrorResponse {
|
||||||
|
status: "error",
|
||||||
|
message: format!("failed to read config file: {}", e), // leaks path
|
||||||
|
})
|
||||||
|
.unwrap();
|
||||||
|
}
|
||||||
|
|
||||||
|
Err(e) => {
|
||||||
|
return serde_json::to_string(&ErrorResponse {
|
||||||
|
status: "error",
|
||||||
|
message: format!("failed to parse config file: {}", e), // leaks structure
|
||||||
|
})
|
||||||
|
.unwrap();
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The same applies to the `unknown command` response at line 247, which echoes
|
||||||
|
arbitrary input back without sanitization:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
message: format!("unknown command: {}", command),
|
||||||
|
```
|
||||||
|
|
||||||
|
For an unauthenticated socket, this information disclosure helps an attacker
|
||||||
|
enumerate the system (filesystem layout, config syntax, software version via
|
||||||
|
error messages).
|
||||||
|
|
||||||
|
**Solution**: Return generic error messages to the socket client and log the
|
||||||
|
details server-side:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
Err(e) => {
|
||||||
|
tracing::error!(error = %e, "failed to read config file");
|
||||||
|
serde_json::to_string(&ErrorResponse {
|
||||||
|
status: "error",
|
||||||
|
message: "reload failed".to_string(),
|
||||||
|
}).unwrap()
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
For unknown commands, avoid echoing input:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
_ => serde_json::to_string(&ErrorResponse {
|
||||||
|
status: "error",
|
||||||
|
message: "unknown command".to_string(),
|
||||||
|
}).unwrap(),
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Warning Findings
|
||||||
|
|
||||||
|
### W1. No Concurrency Limit on Admin Socket Connections
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:108-133`
|
||||||
|
|
||||||
|
**Problem**: Each accepted connection spawns a new tokio task (`tokio::spawn` at
|
||||||
|
line 114) with no concurrency limit. A local user with access to the socket can
|
||||||
|
open many simultaneous connections, spawning unlimited tasks. While the 5-second
|
||||||
|
read timeout and 4096-byte limit (added in review #004) mitigate the most
|
||||||
|
trivial DoS, a determined attacker can:
|
||||||
|
|
||||||
|
- Open many connections simultaneously, each sending data slowly (within the
|
||||||
|
5-second timeout), consuming memory and task slots.
|
||||||
|
- Send data up to 4096 bytes per connection — with enough concurrent
|
||||||
|
connections, this still consumes significant memory.
|
||||||
|
|
||||||
|
**Solution**: Add a `tokio::sync::Semaphore` with a reasonable limit (e.g., 10
|
||||||
|
concurrent connections):
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let semaphore = Arc::new(tokio::sync::Semaphore::new(10));
|
||||||
|
|
||||||
|
// In the accept loop:
|
||||||
|
let permit = match semaphore.clone().acquire_owned().await {
|
||||||
|
Ok(p) => p,
|
||||||
|
Err(_) => {
|
||||||
|
warn!("admin socket connection limit reached, dropping connection");
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
};
|
||||||
|
tokio::spawn(async move {
|
||||||
|
let _permit = permit;
|
||||||
|
handle_connection(stream, admin_socket).await;
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### W2. Config File TOCTOU: Reload Reads File Without Atomicity
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:257`, `src/shutdown.rs:88`
|
||||||
|
|
||||||
|
**Problem**: Both `handle_reload` and `handle_sighup_reload` read the config
|
||||||
|
file with `tokio::fs::read_to_string()`, then parse it. If another process is
|
||||||
|
writing to the config file at the same time (e.g., a configuration management
|
||||||
|
tool like Ansible writing a partial file), the proxy could read a partially
|
||||||
|
written config and apply it. This is a filesystem-level TOCTOU issue.
|
||||||
|
|
||||||
|
Unlike the symlink race (C1), this is harder to exploit directly — the parse
|
||||||
|
will likely fail on a partial file, resulting in a reload error rather than a
|
||||||
|
bad config being applied. However, it could result in a brief window where the
|
||||||
|
operator sees confusing errors during config rotation.
|
||||||
|
|
||||||
|
**Solution**: Use atomic file replacement — write to a temporary file in the
|
||||||
|
same directory, then rename over the target. Document this pattern for
|
||||||
|
operators. Alternatively, compute a checksum or stat the file before and after
|
||||||
|
reading to detect mid-write changes:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let metadata_before = tokio::fs::metadata(&admin_socket.config_path).await?;
|
||||||
|
let config_content = tokio::fs::read_to_string(&admin_socket.config_path).await?;
|
||||||
|
let metadata_after = tokio::fs::metadata(&admin_socket.config_path).await?;
|
||||||
|
|
||||||
|
if metadata_before.modified()? != metadata_after.modified()? {
|
||||||
|
return serde_json::to_string(&ErrorResponse {
|
||||||
|
status: "error",
|
||||||
|
message: "config file changed during read, please retry".to_string(),
|
||||||
|
}).unwrap();
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### W3. Admin Socket Path Is Not Validated or Sanitized
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:81-83`, `src/config/static_config.rs:22-24`
|
||||||
|
|
||||||
|
**Problem**: The `admin_socket_path` from the config file is used directly as a
|
||||||
|
filesystem path with no validation. A malicious or misconfigured path could
|
||||||
|
point to:
|
||||||
|
|
||||||
|
- A path on a critical filesystem (e.g., `/etc/passwd`)
|
||||||
|
- An extremely long path (potential buffer issues in downstream code)
|
||||||
|
- A path with directory traversal (e.g., `../../etc/cron.d/malicious`)
|
||||||
|
|
||||||
|
While the default `/run/reverse-proxy/admin.sock` is safe, the config is user-
|
||||||
|
controlled and loaded from disk. Combined with C2 (no authentication), a local
|
||||||
|
attacker who can write to the config file and trigger a reload could redirect
|
||||||
|
the admin socket to an arbitrary path.
|
||||||
|
|
||||||
|
This is partially mitigated by the fact that `admin_socket_path` is in
|
||||||
|
`StaticConfig` (requires restart, not hot-reloadable), but the startup config
|
||||||
|
still trusts the path.
|
||||||
|
|
||||||
|
**Solution**: Add validation that the socket path:
|
||||||
|
|
||||||
|
1. Ends with `.sock` or `.socket` (or at least doesn't end with a suspicious
|
||||||
|
extension)
|
||||||
|
2. Is under a known-safe directory prefix (e.g., `/run/`, `/var/run/`, or a
|
||||||
|
configurable base directory)
|
||||||
|
3. Does not contain path traversal components (`..`)
|
||||||
|
4. Has a reasonable length limit
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn validate_admin_socket_path(path: &str) -> Result<(), ValidationError> {
|
||||||
|
if path.is_empty() { return Ok(()); } // disabled is valid
|
||||||
|
if path.len() > 255 {
|
||||||
|
return Err(ValidationError::AdminSocketPathTooLong);
|
||||||
|
}
|
||||||
|
if path.contains("..") {
|
||||||
|
return Err(ValidationError::AdminSocketPathTraversal);
|
||||||
|
}
|
||||||
|
let path = Path::new(path);
|
||||||
|
if path.is_absolute() {
|
||||||
|
Ok(())
|
||||||
|
} else {
|
||||||
|
Err(ValidationError::AdminSocketPathRelative)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### W4. `is_socket_active` Side-Effect on Other Processes
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:163-165`
|
||||||
|
|
||||||
|
**Problem**: This was flagged in review #003 (W7) and accepted as Phase 1.
|
||||||
|
However, in the context of the symlink attack in C1, this function becomes more
|
||||||
|
dangerous: if an attacker replaces the socket path with a symlink to another
|
||||||
|
service's socket, `is_socket_active` will connect to that service, which could
|
||||||
|
trigger behavior in the target service (e.g., accepting a connection, logging
|
||||||
|
it, starting a session). This amplifies the C1 symlink attack beyond file
|
||||||
|
deletion.
|
||||||
|
|
||||||
|
**Solution**: See C1 — the proposed fix uses `symlink_metadata()` and checks
|
||||||
|
`is_socket()` before calling `is_socket_active`, which eliminates the symlink
|
||||||
|
attack surface. Additionally, the `is_socket_active` check should only be
|
||||||
|
reached after verifying the file is not a symlink and is actually a socket
|
||||||
|
file type.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### W5. `reload` Command Does Not Validate Config Before Applying
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:268-279`, `src/config/dynamic_config.rs:134-158`
|
||||||
|
|
||||||
|
**Problem**: While `ConfigReloadHandle::reload()` calls `validate()` before
|
||||||
|
storing the new config, the validation does not check that the config file
|
||||||
|
itself hasn't been tampered with (e.g., via checksum). More critically, the
|
||||||
|
validation passes `cli_allow_wildcard_bind: false` during reload
|
||||||
|
(`validate(&new_static, &new_dynamic, false)` at line 141), but the startup
|
||||||
|
path may have passed `true` (via `--allow-wildcard-bind` CLI flag). This means
|
||||||
|
a config reload could tighten the wildcard bind restriction that was
|
||||||
|
intentionally relaxed via CLI, causing existing listeners on `0.0.0.0` to
|
||||||
|
continue running while the validation reports them as errors.
|
||||||
|
|
||||||
|
This is a corner case — the running listeners are not stopped — but it creates
|
||||||
|
a confusing state where the config is rejected on reload even though it was
|
||||||
|
accepted at startup.
|
||||||
|
|
||||||
|
**Solution**: Store the `cli_allow_wildcard_bind` flag in `ConfigReloadHandle`
|
||||||
|
or `StaticConfig` so that reload uses the same flag as startup:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct ConfigReloadHandle {
|
||||||
|
config: Arc<ArcSwap<DynamicConfig>>,
|
||||||
|
static_config: ArcSwap<StaticConfig>,
|
||||||
|
reload_mutex: Mutex<()>,
|
||||||
|
cli_allow_wildcard_bind: bool, // stored from startup
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### W6. Reload Error Response Includes Static Config Change Warnings That Should Not Be Exposed
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:286-293`
|
||||||
|
|
||||||
|
**Problem**: On successful reload, if static config fields have changed, the
|
||||||
|
response includes a log at warning level but the admin socket only returns
|
||||||
|
`{"status": "ok"}`. However, the `reload_mutex` inside `ConfigReloadHandle`
|
||||||
|
already ensures only one reload runs at a time, but the `changed_fields` list
|
||||||
|
is only logged, not returned to the caller. An operator sending `reload` via
|
||||||
|
`socat` has no way to know if their reload actually changed static fields that
|
||||||
|
require a restart. This is an operational concern rather than a security issue.
|
||||||
|
|
||||||
|
**Solution**: Include `changed_fields` in the reload response when non-empty:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
Ok(changed_fields) => {
|
||||||
|
if !changed_fields.is_empty() {
|
||||||
|
serde_json::to_string(&OkWithFieldsResponse {
|
||||||
|
status: "ok",
|
||||||
|
message: format!("static fields changed (restart required): {}", changed_fields.join(", ")),
|
||||||
|
}).unwrap()
|
||||||
|
} else {
|
||||||
|
serde_json::to_string(&OkResponse { status: "ok" }).unwrap()
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### W7. Health Check Endpoint Exposes Service on Predictable Port
|
||||||
|
|
||||||
|
**File**: `src/health.rs:17-36`
|
||||||
|
|
||||||
|
**Problem**: The health check listener binds to `127.0.0.1` on the configured
|
||||||
|
port (default 9900). While binding to localhost is correct, the port is
|
||||||
|
predictable and not configurable to bind to a Unix socket instead. On shared
|
||||||
|
systems, any local process can connect to `127.0.0.1:9900/health` and confirm
|
||||||
|
the proxy is running. This is low risk (the response is just `200 OK` with no
|
||||||
|
body), but the endpoint is completely unauthenticated and could be used for
|
||||||
|
reconnaissance.
|
||||||
|
|
||||||
|
More importantly, there's no rate limiting on the health endpoint. A local
|
||||||
|
attacker could flood it to consume connection resources, though this is
|
||||||
|
mitigated by the fact that it's localhost-only and a simple GET.
|
||||||
|
|
||||||
|
**Solution**: Consider adding an option to bind the health check to a Unix
|
||||||
|
socket instead of a TCP port, or adding a shared-secret token header check.
|
||||||
|
For Phase 1, this is acceptable as-is since the endpoint is localhost-only and
|
||||||
|
returns minimal information.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Suggestions
|
||||||
|
|
||||||
|
### S1. Add Connection Peer Logging to Admin Socket
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:112`
|
||||||
|
|
||||||
|
**Suggestion**: Log the peer credentials (UID, PID, GID) of each admin socket
|
||||||
|
connection using `SO_PEERCRED`. This provides an audit trail for reload
|
||||||
|
operations:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use std::os::unix::io::AsRawFd;
|
||||||
|
|
||||||
|
fn log_peer_credentials(stream: &tokio::net::UnixStream) {
|
||||||
|
if let Ok((_, uid)) = nix::sys::socket::getpeereid(stream.as_raw_fd()) {
|
||||||
|
info!(peer_uid = uid, "admin socket connection accepted");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### S2. Use `AF_UNIX` Peer Credential Validation Instead of File Permissions
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs`
|
||||||
|
|
||||||
|
**Suggestion**: Rather than (or in addition to) setting socket file permissions,
|
||||||
|
use `SO_PEERCRED`/`getpeereid` to validate the connecting process's UID at
|
||||||
|
accept time. This is more robust than filesystem permissions because:
|
||||||
|
|
||||||
|
1. It works regardless of the umask
|
||||||
|
2. It allows fine-grained control (e.g., allow root and the proxy user only)
|
||||||
|
3. It cannot be bypassed by group membership changes
|
||||||
|
|
||||||
|
Requires adding `nix` as a dependency (or using `libc` directly).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### S3. Set Socket File Permissions Immediately After Bind
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:90-102`
|
||||||
|
|
||||||
|
**Suggestion**: After the successful `UnixListener::bind()`, immediately set
|
||||||
|
restrictive permissions on the socket file:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let listener = UnixListener::bind(socket_path)?;
|
||||||
|
let perms = std::fs::Permissions::from_mode(0o660);
|
||||||
|
std::fs::set_permissions(socket_path, perms)?;
|
||||||
|
info!("admin socket listening on {} (mode 660)", socket_path);
|
||||||
|
```
|
||||||
|
|
||||||
|
This is the minimum defense-in-depth even if full peer credential checking
|
||||||
|
is not implemented in Phase 1.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### S4. Consider Adding a Simple Challenge-Response to the Admin Socket Protocol
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:167-252`
|
||||||
|
|
||||||
|
**Suggestion**: Even without full authentication, a simple shared-secret token
|
||||||
|
check would significantly raise the bar for local attackers. Add a `token`
|
||||||
|
field to `StaticConfig` (or an environment variable) and require it as a prefix
|
||||||
|
to commands:
|
||||||
|
|
||||||
|
```
|
||||||
|
<token>:reload\n
|
||||||
|
<token>:status\n
|
||||||
|
```
|
||||||
|
|
||||||
|
This prevents casual discovery of the socket path from enabling reload. The
|
||||||
|
token could be stored in an environment variable (`ADMIN_SOCKET_TOKEN`) that
|
||||||
|
the proxy reads at startup, keeping it out of the config file.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### S5. Move `is_socket_active` Check After File Type Verification
|
||||||
|
|
||||||
|
**File**: `src/admin/socket.rs:143-161`
|
||||||
|
|
||||||
|
**Suggestion**: This is the implementation detail of C1's fix. The current
|
||||||
|
order is:
|
||||||
|
|
||||||
|
1. Check if path exists
|
||||||
|
2. Check if socket is active (connect to it)
|
||||||
|
3. Remove the file
|
||||||
|
|
||||||
|
The safe order should be:
|
||||||
|
|
||||||
|
1. Check if path exists
|
||||||
|
2. Verify it's not a symlink (`symlink_metadata` + `is_symlink()`)
|
||||||
|
3. Verify it's a socket file type (`is_socket()`)
|
||||||
|
4. Check if socket is active
|
||||||
|
5. Remove the file
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### S6. Document Admin Socket Security Model
|
||||||
|
|
||||||
|
**Suggestion**: Add a section to `docs/architecture/operations.md` (or similar)
|
||||||
|
that clearly states:
|
||||||
|
|
||||||
|
1. The admin socket is unauthenticated by design for Phase 1
|
||||||
|
2. The socket path must be in a directory with restrictive permissions
|
||||||
|
3. The socket file permissions should be `0600` or `0660`
|
||||||
|
4. Only trusted users should have filesystem access to the socket path
|
||||||
|
5. Future phases may add peer credential authentication or challenge-response
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Summary Statistics
|
||||||
|
|
||||||
|
| Severity | Count | Status |
|
||||||
|
|----------|-------|--------|
|
||||||
|
| Critical | 3 | Must fix before production |
|
||||||
|
| Warning | 7 | Should fix — security hardening |
|
||||||
|
| Suggestion | 6 | Consider for defense-in-depth |
|
||||||
|
|
||||||
|
## Recommended Fix Priority
|
||||||
|
|
||||||
|
1. **C1 (symlink race / arbitrary file deletion)** — Exploitable by any local
|
||||||
|
user who can reach the socket directory. Fix with `symlink_metadata` + file
|
||||||
|
type check before removal.
|
||||||
|
2. **C2 (no authentication / unrestricted socket permissions)** — Any local user
|
||||||
|
can trigger config reload. Set socket permissions to 0o660 or 0o600 after
|
||||||
|
bind as a minimum; add peer credential checking for full protection.
|
||||||
|
3. **C3 (error messages leak filesystem paths)** — Information disclosure on an
|
||||||
|
unauthenticated socket. Return generic errors and log details server-side.
|
||||||
|
4. **W1 (no connection concurrency limit)** — DoS vector. Add a semaphore.
|
||||||
|
5. **W2 (config file read without atomicity)** — Partial-read risk during
|
||||||
|
config rotation. Document atomic replacement pattern for operators.
|
||||||
|
6. **W3 (socket path not validated)** — Path traversal/symlink risk in config.
|
||||||
|
Add basic validation.
|
||||||
|
7. **W4 (is_socket_active side effect)** — Amplifies C1. Fixed by C1's
|
||||||
|
`symlink_metadata` guard.
|
||||||
|
8. **W5 (reload validation uses different wildcard flag)** — Inconsistency
|
||||||
|
between startup and reload validation.
|
||||||
|
9. **Remaining W and S findings** — Fix opportunistically.
|
||||||
|
|
||||||
|
## Additional Notes
|
||||||
|
|
||||||
|
The unusual traffic on this file was likely driven by interest in C1 and C2.
|
||||||
|
The symlink race (C1) is the most directly exploitable vulnerability — it
|
||||||
|
requires only local filesystem access to the socket directory and timing to
|
||||||
|
replace the socket with a symlink. Combined with C2 (no authentication), a
|
||||||
|
local attacker who can write to the socket directory can both delete arbitrary
|
||||||
|
files (C1) and trigger config reloads with attacker-controlled content (C2 +
|
||||||
|
config file write access).
|
||||||
Reference in New Issue
Block a user