
glm-5.1 8ee6284b62 Add architecture specification for Rust/axum reverse proxy

Phase 1 architecture docs covering proxy handler, TLS termination (ACME +
manual), TOML config with static/dynamic split (ArcSwap), and operations
(rate limiting, logging, health check, systemd, graceful shutdown).

Nine ADRs documenting key decisions: Rust/axum, custom proxy handler,
TOML config, rustls-acme for cert management, tokio-rustls direct,
token bucket rate limiting, custom log format for fail2ban,
static/dynamic config split, and signal handling strategy.

Includes threat landscape research documenting the nginx CVEs motivating
this project.

2026-06-11 07:25:50 +00:00

2.9 KiB


ADR-006: Token Bucket Rate Limiting with In-Memory State

Status

Accepted

Context

The proxy must enforce request rate limits per client IP address, replacing nginx's limit_req_zone directive. Rate limiting is critical for preventing abuse and for fail2ban integration (rate-limited requests trigger fail2ban actions).

Several rate limiting approaches exist:

Token bucket: Tokens accumulate at a fixed rate; each request consumes a token. Allows short bursts up to the bucket capacity.
Leaky bucket: Requests are processed at a fixed rate; excess requests queue or are rejected. No burst allowance.
Fixed window: Count requests in fixed time windows (e.g., per minute). Allows burst at window boundaries.
Sliding window: Count requests in a rolling time window. More accurate than fixed window but more complex.

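The current nginx config uses limit_req zone=gitea_limit burst=20 nodelay. nginx documents limit_req as a leaky bucket, but with burst and nodelay it admits bursts immediately and rejects the excess, which behaves like a token bucket with burst allowance.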
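As a rough illustration of the token bucket described above, here is a minimal sketch using only the standard library. The type name `TokenBucket`, its fields, and the `try_acquire` method are hypothetical, not taken from the project's code. A capacity of 20 loosely mirrors burst=20; nginx's exact accounting differs in detail. The caller passes in the current time so the behavior is deterministic.

```rust
use std::time::Instant;

/// Hypothetical token bucket; names and fields are illustrative only.
#[derive(Debug)]
pub struct TokenBucket {
    capacity: f64,        // maximum number of tokens (burst size)
    refill_per_sec: f64,  // steady-state rate in tokens per second
    tokens: f64,          // tokens currently available
    last_refill: Instant, // last time the token count was updated
}

impl TokenBucket {
    /// Start with a full bucket so a new client may burst immediately.
    pub fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        Self {
            capacity: f64::from(capacity),
            refill_per_sec,
            tokens: f64::from(capacity),
            last_refill: now,
        }
    }

    /// Refill according to elapsed time, then try to take one token.
    /// Returns true if the request is allowed.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last_refill);
        if !elapsed.is_zero() {
            let refilled = self.tokens + elapsed.as_secs_f64() * self.refill_per_sec;
            self.tokens = refilled.min(self.capacity);
            self.last_refill = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With a capacity of 20 and a rate of one token per second, twenty back-to-back requests pass, the next is rejected, and one more is admitted after a second has elapsed.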

For state storage:

In-memory HashMap: Fast, no external dependencies, lost on restart.
External store (Redis, etc.): Shared across instances, persists across restarts. Adds operational complexity.
tower-governor crate: Pre-built rate limiting middleware. Uses the generic cell rate algorithm (GCRA). Adds dependency.

Decision

Use a token bucket algorithm with in-memory HashMap<IpAddr, TokenBucket> state, protected by tokio::sync::Mutex. Rate limiting runs as axum middleware before the proxy handler.
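The per-IP state in this decision could look roughly like the following sketch. It is dependency-free on purpose: in the proxy, the `RateLimiter` would presumably sit behind `Arc<tokio::sync::Mutex<...>>` and be called from the axum middleware, which is omitted here. `RateLimiter`, `Bucket`, and `check` are hypothetical names.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::Instant;

/// Per-client state (hypothetical layout).
struct Bucket {
    tokens: f64,
    last_refill: Instant,
}

/// Global per-IP limiter: one bucket per client address.
pub struct RateLimiter {
    capacity: f64,
    refill_per_sec: f64,
    buckets: HashMap<IpAddr, Bucket>,
}

impl RateLimiter {
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        Self {
            capacity: f64::from(capacity),
            refill_per_sec,
            buckets: HashMap::new(),
        }
    }

    /// Returns true if the request from `ip` is allowed at time `now`.
    pub fn check(&mut self, ip: IpAddr, now: Instant) -> bool {
        let capacity = self.capacity;
        // First request from an address creates a full bucket.
        let bucket = self.buckets.entry(ip).or_insert(Bucket {
            tokens: capacity,
            last_refill: now,
        });
        // Refill based on time since the last request from this address.
        let elapsed = now.saturating_duration_since(bucket.last_refill).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * self.refill_per_sec).min(capacity);
        bucket.last_refill = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Because each address has its own bucket, one client exhausting its tokens does not affect requests from another address.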

Rate limits are global per-IP (not per-site) in Phase 1. Per-site rate limits may be added in Phase 2 as the config model evolves.

Stale entries in the HashMap are cleaned up periodically. A background task scans the HashMap at a configurable interval (default: 60 seconds) and removes entries that haven't been accessed within the cleanup interval.
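The cleanup pass described above can be sketched as a single `retain` call over the map. The `Entry` type and the `evict_stale` function are hypothetical; in the proxy this would presumably run from a background task (for example, one driven by `tokio::time::interval`) while holding the limiter's lock.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical per-IP entry; only the access timestamp matters for eviction.
pub struct Entry {
    pub last_seen: Instant,
}

/// Remove entries idle for at least `max_idle`; returns how many were evicted.
pub fn evict_stale(
    buckets: &mut HashMap<IpAddr, Entry>,
    now: Instant,
    max_idle: Duration,
) -> usize {
    let before = buckets.len();
    // Keep only entries that were accessed within the idle window.
    buckets.retain(|_, entry| now.saturating_duration_since(entry.last_seen) < max_idle);
    before - buckets.len()
}
```

An evicted client simply starts with a fresh, full bucket on its next request, so eviction never makes the limit stricter than configured.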

Rationale

Token bucket matches nginx's limit_req burst semantics, ensuring behavioral compatibility during migration.
In-memory state is sufficient for a single-instance proxy (no shared state needed).
tokio::sync::Mutex (not std::sync::Mutex) waits for the lock asynchronously instead of blocking a runtime worker thread, and its guard can be held across await points if the critical section ever needs to await.
Custom implementation gives full control over logging output for fail2ban integration (ADR-007).
State loss on restart is acceptable — rate limit state is inherently ephemeral.

Consequences

Positive:

Behavioral compatibility with nginx rate limiting
Full control over fail2ban log format
No external dependencies (Redis, etc.)
Simple implementation (~100 lines)

Negative:

Rate limit state is lost on restart (acceptable for single-instance deploy)
Not suitable for multi-instance deployments without external state store (Phase 1 is single-instance)
HashMap grows with the number of distinct client IPs until stale entries are evicted (mitigated by periodic cleanup)

References

operations.md
nginx limit_req documentation
