---
status: draft
last_updated: 2026-06-11
---
# Overview
## Vision
A memory-safe, minimal reverse proxy that replaces our vulnerable nginx instance
in front of backend services. The proxy terminates TLS, injects standard proxy
headers, enforces rate limits, and forwards requests to upstream services, with
operational feature parity for our current single-domain Gitea setup.
## Why This Exists
Our nginx 1.24.0 installation is vulnerable to multiple actively exploited
CVEs, including CVE-2026-42945 (unauthenticated RCE via `rewrite`/`set`
directives). The broader threat landscape is worsening: LLM-assisted fuzzing
is accelerating bug discovery in nginx's C codebase, and security researchers
report additional undisclosed vulnerabilities. Upgrading nginx patches the known
CVEs but does not address the structural problem: memory corruption bugs are
endemic to C, and new ones keep surfacing.

Safe Rust rules out the buffer overflow, use-after-free, and double-free bugs
that account for 6 of 7 recent nginx CVEs. Combined with rustls (TLS protocol
logic written in Rust, with no OpenSSL dependency), this provides a
fundamentally safer baseline.

See [threat-landscape.md](../research/threat-landscape.md) for full vulnerability
details.
## Scope
### In Scope
- **Phase 1**: Replace nginx for `git.alk.dev` with feature parity
  - TLS termination with ACME (Let's Encrypt) certificate management
  - Manual certificate paths as fallback mode
  - HTTP → HTTPS redirect
  - Reverse proxy to Gitea at `127.0.0.1:3000`
  - Proxy header injection (Host, X-Real-IP, X-Forwarded-For, X-Forwarded-Proto)
  - Request rate limiting with fail2ban-compatible logging (global per-IP; per-site in Phase 2)
  - 100 MB body size limit (global; per-site in Phase 2)
  - Configurable bind address (no `0.0.0.0` default)
  - Health check endpoint
  - Graceful shutdown (SIGTERM handling)
  - Systemd unit file
- **Phase 2**: Multi-site support
  - SNI-based TLS routing for multiple domains
  - Config file for site definitions
  - Dynamic config reload (ArcSwap pattern)
- **Phase 3**: Operational hardening
  - Metrics endpoint (Prometheus-compatible)
  - Connection limits and timeouts
  - Log rotation
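
The proxy header injection item in Phase 1 can be sketched without any framework types. This is a minimal, std-only illustration, not the project's handler: the function name `proxy_headers` and its parameters are hypothetical, and appending to an existing `X-Forwarded-For` chain (rather than overwriting it) is an assumption modelled on nginx's `$proxy_add_x_forwarded_for`.

```rust
use std::net::IpAddr;

/// Headers the proxy would set on the upstream request (hypothetical helper).
/// `existing_xff` is the incoming X-Forwarded-For value, if the client sent one.
pub fn proxy_headers(
    host: &str,
    client_ip: IpAddr,
    existing_xff: Option<&str>,
    is_tls: bool,
) -> Vec<(&'static str, String)> {
    // Assumption: append the peer address to any existing chain instead of
    // replacing it. Ignore an empty or whitespace-only incoming value.
    let xff = match existing_xff.map(str::trim).filter(|v| !v.is_empty()) {
        Some(prev) => format!("{prev}, {client_ip}"),
        None => client_ip.to_string(),
    };
    let proto = if is_tls { "https" } else { "http" };
    vec![
        ("host", host.to_string()),
        ("x-real-ip", client_ip.to_string()),
        ("x-forwarded-for", xff),
        ("x-forwarded-proto", proto.to_string()),
    ]
}
```

If the proxy is the only hop that clients can reach, overwriting `X-Forwarded-For` with the peer address may be the safer choice, since a client-supplied chain cannot be trusted.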
### Out of Scope
- HTTP/2 or HTTP/3 proxying (services that need these run their own native
Rust servers — e.g., `api.alk.dev`)
- Load balancing or round-robin upstream selection
- WebSocket proxying (can be added later if needed)
- Static file serving
- Access control beyond rate limiting (no auth, no IP allowlists in Phase 1)
- CGI, SCGI, uWSGI, FastCGI
## Architecture
```
                    ┌──────────────────────────────────────────┐
                    │  reverse-proxy (Rust/axum)               │
config.toml ──────► │  StaticConfig + DynamicConfig            │
                    │  (ArcSwap for hot-reload)                │
                    │                                          │
bind_addr:80 ─────► │  HTTP listener → 301 redirect            │
                    │  to HTTPS                                │
                    │                                          │
bind_addr:443 ────► │  TLS listener (tokio-rustls)             │
                    │  ├─ ACME mode: rustls-acme resolver      │
                    │  │  (auto cert provisioning/renewal)     │
                    │  └─ Manual mode: cert/key file paths     │
                    │                                          │
                    │  axum router                             │
                    │  ├─ Host-based routing                   │
                    │  ├─ Rate limiting middleware             │
                    │  ├─ Proxy header injection               │
                    │  ├─ Body size limit (100 MB)             │
                    │  └─ Reverse proxy handler                │
                    │     └─ hyper Client → upstream           │
                    │                                          │
                    │  /health → 200 OK                        │
                    └──────────────────────────────────────────┘
```
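
The port-80 path in the diagram reduces to computing a `Location` value for the 301 response. The sketch below shows one way to do that with std only. The function name `https_redirect_location` and the validation rules are assumptions made for illustration, not behaviour specified by this document.

```rust
/// Build the `Location` value for the HTTP listener's 301 response
/// (hypothetical helper). Returns `None` when the Host header is missing
/// or contains characters that do not belong in a host name.
pub fn https_redirect_location(
    host_header: Option<&str>,
    path_and_query: &str,
) -> Option<String> {
    let host = host_header?.trim();
    // Drop an explicit port such as ":80"; the HTTPS listener is on 443.
    // A bracketed IPv6 literal without a port ends in ']' and is kept whole.
    let host = match host.rfind(':') {
        Some(i) if !host.ends_with(']') => &host[..i],
        _ => host,
    };
    // Reject anything that could smuggle a path or userinfo into the URL.
    let valid = !host.is_empty()
        && host
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '[' | ']' | ':'));
    if !valid {
        return None;
    }
    // Fall back to "/" if the request target is not an origin-form path.
    let path = if path_and_query.starts_with('/') {
        path_and_query
    } else {
        "/"
    };
    Some(format!("https://{host}{path}"))
}
```

For the single-domain Phase 1 setup, redirecting to the configured domain instead of echoing the `Host` header would avoid reflecting client-controlled input at all.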
## Crate Dependencies
### Core
| Crate | Version | Purpose | Notes |
|-------|---------|---------|-------|
| `axum` | 0.8 | HTTP framework | Routing, middleware, extractors |
| `tokio` | 1 (full) | Async runtime | Multi-threaded runtime |
| `hyper` | 1 | HTTP protocol | Used via axum, and for the proxy client (in hyper 1.x the pooled `Client` lives in the companion `hyper-util` crate) |
| `tower` | 0.5 | Middleware ecosystem | Service trait, layers |
| `rustls` | 0.23 | TLS implementation | `aws_lc_rs` crypto provider |
| `tokio-rustls` | 0.26 | Async TLS I/O | Wraps TCP with TLS |
| `rustls-acme` | 0.12 | ACME client | Let's Encrypt auto-provisioning and renewal |
### Supporting
| Crate | Version | Purpose | Notes |
|-------|---------|---------|-------|
| `serde` | 1 | Serialization | TOML config deserialization |
| `toml` | 0.8 | Config format | Declarative site definitions |
| `arc-swap` | 1 | Atomic config swap | Lock-free DynamicConfig reload |
| `tracing` | 0.1 | Structured logging | fail2ban-compatible output |
| `tracing-subscriber` | 0.3 | Log output | File + journald support |
| `rustls-pemfile` | 2 | PEM parsing | Manual cert loading |
| `rustls-pki-types` | 1 | TLS types | CertificateDer, PrivateKeyDer |
| `clap` | 4 | CLI arguments | Server startup options |
| `signal-hook` | 0.3 | Signal handling | SIGTERM/SIGINT for shutdown, SIGHUP for config reload |

Versions listed are the compatible release series (the major version, or the
minor version for 0.x crates). Exact versions should be pinned by committing
`Cargo.lock`, which is standard practice for Rust binaries.
## Exports
This is a single-binary deployment. There are no library exports. The product
is the `reverse-proxy` binary plus a systemd unit file and a config file.
## Dependencies on Other Projects
- **alknet**: The `ArcSwap<DynamicConfig>` pattern, `tokio-rustls` TLS acceptor
construction, `rustls-acme` integration, and `ServerConfig` builder patterns
are adapted from alknet's transport and config layers. These patterns are
referenced as validation that the approaches work in production; all code
in this project is written from scratch.
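
To make the static/dynamic split concrete, here is a std-only sketch of the shape it could take. It deliberately uses `RwLock<Arc<DynamicConfig>>` as a stand-in for `ArcSwap<DynamicConfig>` so the example has no external dependencies; the real design uses `arc-swap` for lock-free loads. The `ConfigHandle` type and all field names are hypothetical.

```rust
use std::net::SocketAddr;
use std::sync::{Arc, RwLock};

/// Fixed for the life of the process; changing it requires a restart.
#[derive(Debug, Clone, PartialEq)]
pub struct StaticConfig {
    pub bind_addr: SocketAddr, // hypothetical field
}

/// Replaced wholesale on reload.
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub upstream: SocketAddr, // hypothetical field
    pub max_body_bytes: u64,  // hypothetical field
}

pub struct ConfigHandle {
    static_cfg: StaticConfig,
    // Stand-in for ArcSwap<DynamicConfig>; same load/store shape, but locked.
    dynamic: RwLock<Arc<DynamicConfig>>,
}

impl ConfigHandle {
    pub fn new(static_cfg: StaticConfig, dynamic: DynamicConfig) -> Self {
        Self {
            static_cfg,
            dynamic: RwLock::new(Arc::new(dynamic)),
        }
    }

    pub fn static_cfg(&self) -> &StaticConfig {
        &self.static_cfg
    }

    /// Take a snapshot for one request. The snapshot stays valid and
    /// unchanged even if a reload swaps the config mid-request.
    pub fn load(&self) -> Arc<DynamicConfig> {
        let guard = self.dynamic.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Swap in a freshly parsed config, for example after SIGHUP.
    pub fn store(&self, next: DynamicConfig) {
        *self.dynamic.write().unwrap() = Arc::new(next);
    }
}
```

The point of the pattern is that a request handler calls `load()` once and works from that snapshot, so a reload never changes settings underneath an in-flight request.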
## Design Decisions
All design decisions are documented as ADRs in [decisions/](decisions/).
| ADR | Decision | Summary |
|-----|----------|---------|
| [001](decisions/001-rust-axum.md) | Rust with axum | Memory safety eliminates the bug class causing nginx CVEs; axum provides ergonomic tower integration |
| [002](decisions/002-custom-proxy-handler.md) | Custom proxy handler | Single upstream, single domain — axum-reverse-proxy adds unnecessary complexity |
| [003](decisions/003-toml-config.md) | TOML configuration format | Rust-native, unambiguous, excellent serde support |
| [004](decisions/004-rustls-acme.md) | ACME-primary certificate management | Eliminates certbot dependency; automatic provisioning and renewal |
| [005](decisions/005-tokio-rustls-direct.md) | tokio-rustls directly, not axum-server | Full control over TLS config, ACME resolver integration, cipher suite configuration |
| [006](decisions/006-rate-limiting-approach.md) | Token bucket rate limiting | In-memory per-IP token bucket matching nginx burst semantics |
| [007](decisions/007-custom-log-format.md) | Custom structured log format | key=value pairs with RATE_LIMIT prefix for fail2ban |
| [008](decisions/008-static-dynamic-config-split.md) | Static/dynamic config with ArcSwap | Immutable StaticConfig, hot-reloadable DynamicConfig via ArcSwap |
| [009](decisions/009-signal-handling.md) | Signal handling strategy | signal-hook for SIGTERM/SIGINT/SIGHUP |
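
ADRs 006 and 007 together describe an in-memory per-IP token bucket whose rejections are logged for fail2ban. The following std-only sketch illustrates that combination under stated assumptions: the `RateLimiter` type, its parameters, and the exact log fields (`client_ip`, `path`) are hypothetical, and only the `RATE_LIMIT` prefix and key=value style come from the ADR summary.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::Instant;

struct Bucket {
    tokens: f64,
    last: Instant,
}

pub struct RateLimiter {
    rate_per_sec: f64, // sustained refill rate
    burst: f64,        // bucket capacity
    buckets: HashMap<IpAddr, Bucket>,
}

impl RateLimiter {
    pub fn new(rate_per_sec: f64, burst: u32) -> Self {
        Self {
            rate_per_sec,
            burst: f64::from(burst),
            buckets: HashMap::new(),
        }
    }

    /// Returns true if the request may proceed. `now` is passed in so the
    /// behaviour can be tested without sleeping.
    pub fn allow(&mut self, ip: IpAddr, now: Instant) -> bool {
        let (rate, burst) = (self.rate_per_sec, self.burst);
        // A new client starts with a full bucket.
        let b = self
            .buckets
            .entry(ip)
            .or_insert(Bucket { tokens: burst, last: now });
        // Refill for the time elapsed since the last request, capped at burst.
        let elapsed = now.saturating_duration_since(b.last).as_secs_f64();
        b.tokens = (b.tokens + elapsed * rate).min(burst);
        b.last = now;
        if b.tokens >= 1.0 {
            b.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

/// One possible key=value line for a fail2ban filter to match.
/// The field names are illustrative, not the project's format.
pub fn rate_limit_log_line(ip: IpAddr, path: &str) -> String {
    format!("RATE_LIMIT client_ip={ip} path={path}")
}
```

A production version would also need to evict idle buckets so the map does not grow without bound, and to escape the path before logging it so a client cannot inject extra fields or newlines.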
## Open Questions
Open questions are tracked in [open-questions.md](open-questions.md). Key
questions affecting this document:
- **OQ-01**: Should cipher suites be restricted beyond rustls defaults? (open)
- **OQ-03**: Should the health check endpoint be on a separate port? (open)
- **OQ-05**: Should the proxy bind to multiple addresses or just one? (open)