| status | last_updated | reviewed_code | reviewer | based_on |
|---|---|---|---|---|
| draft | 2026-06-14 | | code-reviewer | docs/reviews/004-post-fix-review.md |
Admin Socket Security Review #005
Purpose
Focused security review of src/admin/socket.rs and related code paths,
triggered by unusual traffic patterns observed on the file in both public and
self-hosted git. The review examines the Unix domain socket admin interface for
vulnerabilities including symlink attacks, privilege escalation, information
disclosure, and DoS vectors. Broader codebase was also surveyed for related
issues.
Severity Definitions
| Severity | Meaning |
|---|---|
| Critical | Will cause incorrect behavior or security issues in production |
| Warning | Could cause issues under specific conditions or represents a missed edge case |
| Suggestion | Code quality, style, or minor improvement opportunity |
Critical Findings
C1. Symlink Race in cleanup_stale_socket Enables Arbitrary File Deletion
File: src/admin/socket.rs:143-161
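Problem: cleanup_stale_socket checks whether a socket file exists and
whether another process is listening on it, then removes the file. Each step
resolves the path by name again, so between the is_socket_active() check
(which connects to the socket) and the remove_file() call, an attacker with
write access to the socket directory or one of its parent directories can
change what the path refers to, for example by replacing a directory component
with a symlink so that the path resolves into another directory.
remove_file() then unlinks whatever the path names at that moment, not the
file that was checked. Note that on Unix remove_file() maps to unlink(2),
which removes a symlink in the final path component itself rather than its
target; symlinks in parent components are followed, and both exists() and the
connect() inside is_socket_active() do follow a final-component symlink: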
async fn cleanup_stale_socket(path: &str) -> Result<(), AdminSocketError> {
    let socket_path = Path::new(path);
    if !socket_path.exists() {
        return Ok(());
    }
    if is_socket_active(path).await { // check
        // ...
        return Err(AdminSocketError::SocketInUse(path.to_string()));
    }
    warn!("removing stale socket file: {}", path);
    tokio::fs::remove_file(path) // act: unlinks whatever the path names now
        .await
        .map_err(AdminSocketError::Io)
}
This is a classic TOCTOU (time-of-check/time-of-use) race. The window is small but exploitable with local access, which is exactly the threat model for a Unix domain socket (any local user can reach it).
Additionally, is_socket_active works by actually connecting to the socket
(src/admin/socket.rs:163-165). If the path is a symlink to another service's
Unix socket, this creates a real connection in that service, which is an
unintended side effect that could be logged or trigger behavior in the other
service.
Solution: Verify what the path itself is before connecting to it or removing it:
- Use std::fs::symlink_metadata() rather than std::fs::metadata(): symlink_metadata does not follow symlinks, so a symlink is reported as a symlink even when it points at a socket.
- Call remove_file only after symlink_metadata confirms that the file type is a socket (FileType::is_socket(), provided by std::os::unix::fs::FileTypeExt):
use std::os::unix::fs::FileTypeExt; // provides FileType::is_socket()

async fn cleanup_stale_socket(path: &str) -> Result<(), AdminSocketError> {
    // symlink_metadata (lstat) inspects the path itself and does not follow symlinks.
    // Unlike Path::exists(), it also sees a dangling symlink.
    let metadata = match std::fs::symlink_metadata(path) {
        Ok(metadata) => metadata,
        // Nothing at the path: no cleanup needed.
        Err(e) if e.kind() == std::io::ErrorKind::NotFound => return Ok(()),
        Err(e) => return Err(AdminSocketError::Io(e)),
    };
    if metadata.file_type().is_symlink() {
        warn!("admin socket path {} is a symlink, refusing to remove", path);
        return Err(AdminSocketError::BindFailed(
            "socket path is a symlink, refusing to remove".to_string()
        ));
    }
    if !metadata.file_type().is_socket() {
        warn!("admin socket path {} is not a socket file, refusing to remove", path);
        return Err(AdminSocketError::BindFailed(
            "path exists but is not a socket".to_string()
        ));
    }
    if is_socket_active(path).await {
        warn!("socket file {} exists and another process is listening; disabling admin socket", path);
        return Err(AdminSocketError::SocketInUse(path.to_string()));
    }
    warn!("removing stale socket file: {}", path);
    tokio::fs::remove_file(path)
        .await
        .map_err(AdminSocketError::Io)
}
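This refuses to remove symlinks and non-socket files. The type check and the removal are still separate path lookups, so the fix narrows the race rather than closing it; keeping the socket in a directory that only the proxy user can write to (see C2) removes the attacker's ability to swap the path.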
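The distinction the fix relies on can be checked in isolation. The following is a minimal std-only sketch, not the project's code: `PathKind` and `classify_socket_path` are hypothetical names introduced for illustration. It classifies a path with `symlink_metadata`, so a symlink that points at a live socket is still reported as a symlink, and a missing path is reported without a separate `exists()` call.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::FileTypeExt; // provides FileType::is_socket()
use std::path::Path;

/// What the path itself is, without following a final-component symlink.
/// (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq, Eq)]
pub enum PathKind {
    Missing,
    Symlink,
    Socket,
    Other,
}

/// Classify `path` using `symlink_metadata` (lstat), so a symlink is
/// reported as a symlink even when it points at a real socket.
pub fn classify_socket_path(path: &Path) -> io::Result<PathKind> {
    let meta = match fs::symlink_metadata(path) {
        Ok(meta) => meta,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(PathKind::Missing),
        Err(e) => return Err(e),
    };
    let file_type = meta.file_type();
    Ok(if file_type.is_symlink() {
        PathKind::Symlink
    } else if file_type.is_socket() {
        PathKind::Socket
    } else {
        PathKind::Other
    })
}
```

A cleanup routine built on this would proceed to the liveness check and removal only for `PathKind::Socket`. By contrast, `std::fs::metadata` on the same symlink follows the link and reports the target's type.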
C2. Admin Socket Has No Access Control — Any Local User Can Trigger Reload
File: src/admin/socket.rs:108-133,254-305
Problem: The admin socket accepts connections from any local user. There is
no authentication, no peer credential check, and no ownership/permission
restriction on the socket file itself. After UnixListener::bind(), the socket
inherits the process umask but no explicit restrictive permissions are set.
Any local user who can reach the socket path can:
- Trigger a config reload (reload command): this re-reads the config file from disk and hot-swaps the live routing table. If an attacker can write to the config file path (e.g., via a separate misconfiguration or directory permission issue), they can chain the two: write a malicious config, then send reload via the admin socket to activate it. This could redirect traffic to an attacker-controlled upstream.
- Read status (status command): this reveals uptime and the number of configured sites. It is a minor information disclosure, but useful for reconnaissance.
The reload command is particularly dangerous because it reads the config file
from disk each time (src/admin/socket.rs:257). The config path is set at
startup from StaticConfig and cannot be changed at runtime, but the file
contents at that path can be modified by any process with write access. The
admin socket becomes a trigger mechanism for activating malicious configs.
Solution: Multi-layered defense:
- Set restrictive permissions on the socket immediately after binding:
use std::os::unix::fs::PermissionsExt;
let listener = UnixListener::bind(socket_path)?;
let perms = std::fs::Permissions::from_mode(0o660); // owner + group only
std::fs::set_permissions(socket_path, perms)?;
Or 0o600 for owner-only access. This requires the proxy to run under a
dedicated user and for the admin tool (socat, etc.) to run as the same
user or group.
- Add peer credential checking using SO_PEERCRED on Linux. tokio exposes it as UnixStream::peer_cred(); get_current_uid() below stands for a helper that returns the proxy's own UID:
use tokio::net::UnixStream;
fn check_peer_uid(stream: &UnixStream) -> bool {
    // peer_cred() reads the connecting process's credentials from the kernel.
    match stream.peer_cred() {
        // Allow root and the user the proxy itself runs as.
        Ok(cred) => cred.uid() == 0 || cred.uid() == get_current_uid(),
        Err(_) => false,
    }
}
- Document that the socket path should be in a directory with restrictive permissions (e.g., /run/reverse-proxy/ owned by the proxy user with mode 0700).
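The directory and file permission layers can be combined in one place. The sketch below is an illustration under stated assumptions rather than the project's implementation: `bind_private_socket` is a hypothetical helper, and it uses the blocking `std::os::unix::net::UnixListener` instead of tokio's listener so that it stays self-contained. The private directory is created (or tightened) first, so the socket is never reachable by other users, even in the short interval between `bind()` and the permission change on the socket file.

```rust
use std::fs::{self, DirBuilder, Permissions};
use std::io;
use std::os::unix::fs::{DirBuilderExt, PermissionsExt};
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Create (or tighten) a private directory, bind the socket inside it, and
/// restrict the socket file to its owner. (Hypothetical helper.)
pub fn bind_private_socket(dir: &Path, name: &str) -> io::Result<UnixListener> {
    // Request mode 0700 at creation time; with `recursive(true)` an existing
    // directory is not an error.
    DirBuilder::new().recursive(true).mode(0o700).create(dir)?;
    // Re-apply the mode in case the directory already existed with looser permissions.
    fs::set_permissions(dir, Permissions::from_mode(0o700))?;

    let socket_path = dir.join(name);
    let listener = UnixListener::bind(&socket_path)?;
    // The directory already blocks other users; this is defense in depth.
    fs::set_permissions(&socket_path, Permissions::from_mode(0o600))?;
    Ok(listener)
}
```

With tokio, the same order of operations applies; only the listener type changes.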
C3. Admin Socket reload Response Leaks Filesystem Paths and Error Details
File: src/admin/socket.rs:257-265,268-276
Problem: When the reload command fails, the error response includes the
full std::io::Error or toml::de::Error message, which can contain absolute
filesystem paths, file permissions, and internal config structure details:
Err(e) => {
    return serde_json::to_string(&ErrorResponse {
        status: "error",
        message: format!("failed to read config file: {}", e), // leaks path
    })
    .unwrap();
}

Err(e) => {
    return serde_json::to_string(&ErrorResponse {
        status: "error",
        message: format!("failed to parse config file: {}", e), // leaks structure
    })
    .unwrap();
}
The same applies to the unknown command response at line 247, which echoes
arbitrary input back without sanitization:
message: format!("unknown command: {}", command),
For an unauthenticated socket, this information disclosure helps an attacker enumerate the system (filesystem layout, config syntax, software version via error messages).
Solution: Return generic error messages to the socket client and log the details server-side:
Err(e) => {
    tracing::error!(error = %e, "failed to read config file");
    serde_json::to_string(&ErrorResponse {
        status: "error",
        message: "reload failed".to_string(),
    }).unwrap()
}
For unknown commands, avoid echoing input:
_ => serde_json::to_string(&ErrorResponse {
    status: "error",
    message: "unknown command".to_string(),
}).unwrap(),
Warning Findings
W1. No Concurrency Limit on Admin Socket Connections
File: src/admin/socket.rs:108-133
Problem: Each accepted connection spawns a new tokio task (tokio::spawn at
line 114) with no concurrency limit. A local user with access to the socket can
open many simultaneous connections, spawning unlimited tasks. While the 5-second
read timeout and 4096-byte limit (added in review #004) mitigate the most
trivial DoS, a determined attacker can:
- Open many connections simultaneously, each sending data slowly (within the 5-second timeout), consuming memory and task slots.
- Send data up to 4096 bytes per connection — with enough concurrent connections, this still consumes significant memory.
Solution: Add a tokio::sync::Semaphore with a reasonable limit (e.g., 10
concurrent connections):
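let semaphore = Arc::new(tokio::sync::Semaphore::new(10));
// In the accept loop:
// try_acquire_owned() fails immediately when no permit is free;
// acquire_owned().await would wait for one instead of dropping the connection.
let permit = match semaphore.clone().try_acquire_owned() {
    Ok(p) => p,
    Err(_) => {
        warn!("admin socket connection limit reached, dropping connection");
        continue;
    }
};
tokio::spawn(async move {
    let _permit = permit; // released when the task finishes
    handle_connection(stream, admin_socket).await;
});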
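The reject-when-full behaviour that the warning message describes can be illustrated without a runtime. This is a std-only stand-in for the tokio semaphore, not a replacement for it: `ConnectionLimiter` and `ConnectionPermit` are hypothetical names, and the limit is passed in by the caller. A permit is handed out only while fewer than `max` are outstanding, and dropping the permit frees the slot.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Caps the number of connections handled at once. (Hypothetical type.)
#[derive(Clone)]
pub struct ConnectionLimiter {
    active: Arc<AtomicUsize>,
    max: usize,
}

/// Held for the lifetime of one connection; frees the slot on drop.
pub struct ConnectionPermit {
    active: Arc<AtomicUsize>,
}

impl ConnectionLimiter {
    pub fn new(max: usize) -> Self {
        Self { active: Arc::new(AtomicUsize::new(0)), max }
    }

    /// Returns `None` immediately when the limit is reached instead of waiting.
    pub fn try_acquire(&self) -> Option<ConnectionPermit> {
        let max = self.max;
        self.active
            // Atomically increment only while the count is below the limit.
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < max { Some(n + 1) } else { None }
            })
            .ok()
            .map(|_| ConnectionPermit { active: Arc::clone(&self.active) })
    }
}

impl Drop for ConnectionPermit {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}
```

In an accept loop, a `None` result would mean closing the new connection straight away, which bounds both the task count and the buffered bytes.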
W2. Config File TOCTOU: Reload Reads File Without Atomicity
File: src/admin/socket.rs:257, src/shutdown.rs:88
Problem: Both handle_reload and handle_sighup_reload read the config
file with tokio::fs::read_to_string(), then parse it. If another process is
writing to the config file at the same time (e.g., a configuration management
tool like Ansible writing a partial file), the proxy could read a partially
written config and apply it. This is a filesystem-level TOCTOU issue.
Unlike the symlink race (C1), this is harder to exploit directly — the parse will likely fail on a partial file, resulting in a reload error rather than a bad config being applied. However, it could result in a brief window where the operator sees confusing errors during config rotation.
Solution: Use atomic file replacement — write to a temporary file in the same directory, then rename over the target. Document this pattern for operators. Alternatively, compute a checksum or stat the file before and after reading to detect mid-write changes:
let metadata_before = tokio::fs::metadata(&admin_socket.config_path).await?;
let config_content = tokio::fs::read_to_string(&admin_socket.config_path).await?;
let metadata_after = tokio::fs::metadata(&admin_socket.config_path).await?;
if metadata_before.modified()? != metadata_after.modified()? {
    return serde_json::to_string(&ErrorResponse {
        status: "error",
        message: "config file changed during read, please retry".to_string(),
    }).unwrap();
}
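For the operator-side pattern (write to a temporary file in the same directory, then rename over the target), a minimal sketch might look like the following. The function name `write_config_atomically` and the temporary-file naming scheme are assumptions made for illustration; configuration management tools generally have their own equivalent.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Replace `target` with `contents` so that readers see either the old file
/// or the new one, never a partially written mix. (Hypothetical helper.)
pub fn write_config_atomically(target: &Path, contents: &[u8]) -> io::Result<()> {
    let dir = target.parent().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "target has no parent directory")
    })?;
    let file_name = target.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "target has no file name")
    })?;

    // The temporary file lives in the same directory (and so on the same
    // filesystem); a rename across filesystems would fail.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(format!(".tmp.{}", std::process::id()));
    let tmp_path = dir.join(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;
    tmp.sync_all()?; // flush data to disk before it becomes visible under the real name
    drop(tmp);

    // rename() swaps the directory entry in one step on POSIX filesystems.
    fs::rename(&tmp_path, target)
}
```

A reload that races with this writer reads either the complete old config or the complete new one, so the mtime comparison becomes a secondary safeguard.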
W3. Admin Socket Path Is Not Validated or Sanitized
File: src/admin/socket.rs:81-83, src/config/static_config.rs:22-24
Problem: The admin_socket_path from the config file is used directly as a
filesystem path with no validation. A malicious or misconfigured path could
point to:
- A path to a critical system file (e.g., /etc/passwd)
- An extremely long path (potential buffer issues in downstream code)
- A path with directory traversal (e.g., ../../etc/cron.d/malicious)
While the default /run/reverse-proxy/admin.sock is safe, the config is
user-controlled and loaded from disk. Combined with C2 (no authentication), a local
attacker who can write to the config file and trigger a reload could redirect
the admin socket to an arbitrary path.
This is partially mitigated by the fact that admin_socket_path is in
StaticConfig (requires restart, not hot-reloadable), but the startup config
still trusts the path.
Solution: Add validation that the socket path:
- Ends with .sock or .socket (or at least doesn't end with a suspicious extension)
- Is under a known-safe directory prefix (e.g., /run/, /var/run/, or a configurable base directory)
- Does not contain path traversal components (..)
- Has a reasonable length limit (on Linux, sun_path caps Unix socket paths at 108 bytes, including the terminating NUL)
use std::path::{Component, Path};

fn validate_admin_socket_path(path: &str) -> Result<(), ValidationError> {
    if path.is_empty() { return Ok(()); } // disabled is valid
    if path.len() > 255 {
        return Err(ValidationError::AdminSocketPathTooLong);
    }
    let path = Path::new(path);
    // Match whole ".." components; a substring test would also reject names like "a..b".
    if path.components().any(|c| c == Component::ParentDir) {
        return Err(ValidationError::AdminSocketPathTraversal);
    }
    if path.is_absolute() {
        Ok(())
    } else {
        Err(ValidationError::AdminSocketPathRelative)
    }
}
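The snippet above covers length, traversal, and absoluteness. The two remaining checks from the list (directory prefix and extension) could be added along the following lines. This is a self-contained sketch under assumptions: `SocketPathError`, `validate_socket_path`, the allowed prefixes, and the 107-byte limit (chosen to fit Linux's 108-byte `sun_path`) are illustrative choices, not values from the codebase.

```rust
use std::path::{Component, Path};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SocketPathError {
    TooLong,
    Relative,
    Traversal,
    OutsideAllowedDir,
    BadExtension,
}

/// Linux `sun_path` is 108 bytes including the trailing NUL.
const MAX_SOCKET_PATH_BYTES: usize = 107;

/// Directories the socket may live under (assumed values for illustration).
const ALLOWED_PREFIXES: [&str; 2] = ["/run", "/var/run"];

pub fn validate_socket_path(raw: &str) -> Result<(), SocketPathError> {
    if raw.is_empty() {
        return Ok(()); // empty means the admin socket is disabled
    }
    if raw.len() > MAX_SOCKET_PATH_BYTES {
        return Err(SocketPathError::TooLong);
    }
    let path = Path::new(raw);
    if !path.is_absolute() {
        return Err(SocketPathError::Relative);
    }
    // Compare whole components so that names such as "a..b" are not rejected.
    if path.components().any(|c| c == Component::ParentDir) {
        return Err(SocketPathError::Traversal);
    }
    // `starts_with` also matches whole components: "/runaway" is not under "/run".
    if !ALLOWED_PREFIXES.iter().any(|prefix| path.starts_with(prefix)) {
        return Err(SocketPathError::OutsideAllowedDir);
    }
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("sock") | Some("socket") => Ok(()),
        _ => Err(SocketPathError::BadExtension),
    }
}
```

These checks are purely lexical. They do not resolve symlinks on disk, so they complement rather than replace the file type checks proposed in C1.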
W4. is_socket_active Side-Effect on Other Processes
File: src/admin/socket.rs:163-165
Problem: This was flagged in review #003 (W7) and accepted as Phase 1.
However, in the context of the symlink attack in C1, this function becomes more
dangerous: if an attacker replaces the socket path with a symlink to another
service's socket, is_socket_active will connect to that service, which could
trigger behavior in the target service (e.g., accepting a connection, logging
it, starting a session). This amplifies the C1 symlink attack beyond file
deletion.
Solution: See C1. The proposed fix calls symlink_metadata() and checks
is_symlink() and is_socket() before is_socket_active, so the connection
attempt is only reached once the path is known to be a real socket file
rather than a symlink.
W5. reload Validation Ignores the --allow-wildcard-bind Flag Used at Startup
File: src/admin/socket.rs:268-279, src/config/dynamic_config.rs:134-158
Problem: While ConfigReloadHandle::reload() calls validate() before
storing the new config, the validation does not check that the config file
itself hasn't been tampered with (e.g., via checksum). More critically, the
validation passes cli_allow_wildcard_bind: false during reload
(validate(&new_static, &new_dynamic, false) at line 141), but the startup
path may have passed true (via --allow-wildcard-bind CLI flag). This means
a config reload could tighten the wildcard bind restriction that was
intentionally relaxed via CLI, causing existing listeners on 0.0.0.0 to
continue running while the validation reports them as errors.
This is a corner case — the running listeners are not stopped — but it creates a confusing state where the config is rejected on reload even though it was accepted at startup.
Solution: Store the cli_allow_wildcard_bind flag in ConfigReloadHandle
or StaticConfig so that reload uses the same flag as startup:
pub struct ConfigReloadHandle {
    config: Arc<ArcSwap<DynamicConfig>>,
    static_config: ArcSwap<StaticConfig>,
    reload_mutex: Mutex<()>,
    cli_allow_wildcard_bind: bool, // stored from startup
}
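The effect of storing the flag can be shown with a stripped-down model. Everything in this sketch is a simplified stand-in: the real `validate` takes both static and dynamic config, and the real handle holds the fields listed above. Here `DynamicConfig` carries only bind addresses, and `validate` only checks for wildcard binds, so that the sketch stays self-contained.

```rust
/// Simplified stand-in for the dynamic config: only bind addresses matter here.
pub struct DynamicConfig {
    pub bind_addrs: Vec<String>,
}

/// Simplified stand-in for the real `validate`: rejects wildcard binds unless allowed.
pub fn validate(config: &DynamicConfig, allow_wildcard_bind: bool) -> Result<(), String> {
    for addr in &config.bind_addrs {
        if addr.starts_with("0.0.0.0:") && !allow_wildcard_bind {
            return Err(format!("wildcard bind not allowed: {}", addr));
        }
    }
    Ok(())
}

/// Simplified stand-in for the reload handle.
pub struct ConfigReloadHandle {
    cli_allow_wildcard_bind: bool, // captured once at startup
}

impl ConfigReloadHandle {
    pub fn new(cli_allow_wildcard_bind: bool) -> Self {
        Self { cli_allow_wildcard_bind }
    }

    /// Reload validates with the startup flag rather than a hard-coded `false`.
    pub fn reload(&self, new_config: &DynamicConfig) -> Result<(), String> {
        validate(new_config, self.cli_allow_wildcard_bind)
    }
}
```

With the flag stored, a config that was accepted at startup under --allow-wildcard-bind is accepted again on reload, and one started without the flag is still rejected.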
W6. Reload Response Omits Static Config Changes That Require a Restart
File: src/admin/socket.rs:286-293
Problem: On a successful reload, if static config fields have changed, the
change is logged at warning level, but the admin socket returns only
{"status": "ok"}. Although the reload_mutex inside ConfigReloadHandle
already ensures that only one reload runs at a time, the changed_fields list
is only logged, not returned to the caller. An operator sending reload via
socat has no way to know whether the reload touched static fields that
require a restart. This is an operational concern rather than a security issue.
Solution: Include changed_fields in the reload response when non-empty:
Ok(changed_fields) => {
    if !changed_fields.is_empty() {
        serde_json::to_string(&OkWithFieldsResponse {
            status: "ok",
            message: format!("static fields changed (restart required): {}", changed_fields.join(", ")),
        }).unwrap()
    } else {
        serde_json::to_string(&OkResponse { status: "ok" }).unwrap()
    }
}
W7. Health Check Endpoint Exposes Service on Predictable Port
File: src/health.rs:17-36
Problem: The health check listener binds to 127.0.0.1 on the configured
port (default 9900). Binding to localhost is correct, but the default port is
predictable and there is no option to serve the health check on a Unix socket
instead. On shared systems, any local process can connect to
127.0.0.1:9900/health and confirm that the proxy is running. The risk is low
(the response is just 200 OK with no body), but the endpoint is completely
unauthenticated and could be used for reconnaissance.
More importantly, there's no rate limiting on the health endpoint. A local attacker could flood it to consume connection resources, though this is mitigated by the fact that it's localhost-only and a simple GET.
Solution: Consider adding an option to bind the health check to a Unix socket instead of a TCP port, or adding a shared-secret token header check. For Phase 1, this is acceptable as-is since the endpoint is localhost-only and returns minimal information.
Suggestions
S1. Add Connection Peer Logging to Admin Socket
File: src/admin/socket.rs:112
Suggestion: Log the peer credentials (UID, PID, GID) of each admin socket
connection using SO_PEERCRED. This provides an audit trail for reload
operations:
fn log_peer_credentials(stream: &tokio::net::UnixStream) {
    // peer_cred() wraps SO_PEERCRED on Linux; pid() returns an Option.
    if let Ok(cred) = stream.peer_cred() {
        info!(
            peer_uid = cred.uid(),
            peer_gid = cred.gid(),
            peer_pid = ?cred.pid(),
            "admin socket connection accepted"
        );
    }
}
S2. Use AF_UNIX Peer Credential Validation Instead of File Permissions
File: src/admin/socket.rs
Suggestion: Rather than (or in addition to) setting socket file permissions,
use SO_PEERCRED/getpeereid to validate the connecting process's UID at
accept time. This is more robust than filesystem permissions because:
- It works regardless of the umask
- It allows fine-grained control (e.g., allow root and the proxy user only)
- It cannot be bypassed by group membership changes
With tokio, UnixStream::peer_cred() provides this without a new dependency; nix or libc are alternatives if direct access to the socket option is preferred.
S3. Set Socket File Permissions Immediately After Bind
File: src/admin/socket.rs:90-102
Suggestion: After the successful UnixListener::bind(), immediately set
restrictive permissions on the socket file:
let listener = UnixListener::bind(socket_path)?;
let perms = std::fs::Permissions::from_mode(0o660);
std::fs::set_permissions(socket_path, perms)?;
info!("admin socket listening on {} (mode 660)", socket_path);
This is the minimum defense-in-depth even if full peer credential checking is not implemented in Phase 1.
S4. Consider Adding a Shared-Secret Token to the Admin Socket Protocol
File: src/admin/socket.rs:167-252
Suggestion: Even without full authentication, a simple shared-secret token
check would significantly raise the bar for local attackers. Add a token
field to StaticConfig (or an environment variable) and require it as a prefix
to commands:
<token>:reload\n
<token>:status\n
This prevents casual discovery of the socket path from enabling reload. The
token could be stored in an environment variable (ADMIN_SOCKET_TOKEN) that
the proxy reads at startup, keeping it out of the config file.
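A parser for the prefixed form could look like the sketch below. The names `Command`, `constant_time_eq`, and `parse_authenticated_command` are hypothetical, and the format assumes the token itself contains no colon. The comparison avoids returning at the first mismatching byte, which is a best-effort measure against timing leaks rather than a guarantee.

```rust
/// Commands the admin socket understands (reload and status, per this review).
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Reload,
    Status,
}

/// Compare two byte strings without returning early at the first mismatch,
/// so timing does not reveal how many leading bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // the length is not treated as secret here
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Parse a `<token>:<command>` line. Returns `None` for a missing or wrong
/// token and for unknown commands, without distinguishing the cases.
pub fn parse_authenticated_command(line: &str, expected_token: &str) -> Option<Command> {
    let line = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
    let (token, command) = line.split_once(':')?;
    // An empty configured token never authenticates anything.
    if expected_token.is_empty()
        || !constant_time_eq(token.as_bytes(), expected_token.as_bytes())
    {
        return None;
    }
    match command {
        "reload" => Some(Command::Reload),
        "status" => Some(Command::Status),
        _ => None,
    }
}
```

Returning the same `None` for a bad token and an unknown command keeps the response from confirming whether a guessed token was correct.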
S5. Move is_socket_active Check After File Type Verification
File: src/admin/socket.rs:143-161
Suggestion: This is the implementation detail of C1's fix. The current order is:
- Check if path exists
- Check if socket is active (connect to it)
- Remove the file
The safe order should be:
- Check if path exists
- Verify it's not a symlink (symlink_metadata + is_symlink())
- Verify it's a socket file type (is_socket())
- Check if socket is active
- Remove the file
S6. Document Admin Socket Security Model
Suggestion: Add a section to docs/architecture/operations.md (or similar)
that clearly states:
- The admin socket is unauthenticated by design for Phase 1
- The socket path must be in a directory with restrictive permissions
- The socket file permissions should be 0600 or 0660
- Only trusted users should have filesystem access to the socket path
- Future phases may add peer credential authentication or challenge-response
Summary Statistics
| Severity | Count | Status |
|---|---|---|
| Critical | 3 | Must fix before production |
| Warning | 7 | Should fix — security hardening |
| Suggestion | 6 | Consider for defense-in-depth |
Recommended Fix Priority
- C1 (symlink race / arbitrary file deletion): Exploitable by any local user who can reach the socket directory. Fix with symlink_metadata + file type check before removal.
- C2 (no authentication / unrestricted socket permissions): Any local user can trigger config reload. Set socket permissions to 0o660 or 0o600 after bind as a minimum; add peer credential checking for full protection.
- C3 (error messages leak filesystem paths): Information disclosure on an unauthenticated socket. Return generic errors and log details server-side.
- W1 (no connection concurrency limit): DoS vector. Add a semaphore.
- W2 (config file read without atomicity): Partial-read risk during config rotation. Document atomic replacement pattern for operators.
- W3 (socket path not validated): Path traversal/symlink risk in config. Add basic validation.
- W4 (is_socket_active side effect): Amplifies C1. Fixed by C1's symlink_metadata guard.
- W5 (reload validation uses different wildcard flag): Inconsistency between startup and reload validation.
- Remaining W and S findings: Fix opportunistically.
Additional Notes
The unusual traffic on this file was likely driven by interest in C1 and C2. The symlink race (C1) is the most directly exploitable vulnerability — it requires only local filesystem access to the socket directory and timing to replace the socket with a symlink. Combined with C2 (no authentication), a local attacker who can write to the socket directory can both delete arbitrary files (C1) and trigger config reloads with attacker-controlled content (C2 + config file write access).