# ADR-010: Monotonic Spline Distributions for Behavioral Region Modeling

## Status

Accepted

## Context

After projecting activations onto SVD dimensions, the firewall needs to score how "normal" or "anomalous" a projection is relative to the distribution of normal inputs. This requires modeling the probability density of normal inputs along each dimension.

Alternatives:

- **Gaussian**: Simple, well-understood. But real behavioral distributions are often skewed, multimodal, or heavy-tailed. Gaussian assumes symmetry.
- **Kernel Density Estimation (KDE)**: Non-parametric, flexible. But bandwidth selection is tricky, and KDE doesn't provide a parametric form for efficient storage and fast evaluation.
- **Mixture of Gaussians**: More flexible than single Gaussian. But requires choosing the number of components and risks overfitting.
- **Empirical CDF**: Non-parametric, no assumptions. But requires storing all calibration data points, so it is not compact.
- **Monotonic spline distributions**: Parametric CDF modeled as a monotonic spline. Compact (handful of knots), smooth, tail-sensitive, and differentiable. The CDF is naturally monotonic, which enforces a valid probability distribution.

## Decision

Use monotonic spline distributions to model behavioral regions along each SVD dimension. The CDF is represented as a monotonic cubic spline with a small number of knots (typically 10–20 per dimension). Tail behavior uses exponential decay beyond the observed range.

The scoring function computes how far a projection falls in the tail of the distribution: projections well within the normal region score low (CLEAR), projections near or beyond the tail score increasingly high.
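To make the decision concrete, here is a minimal pure-Python sketch of one way such a distribution could be fitted and scored for a single dimension. It is not the metaspline PoC code. The class name `SplineCDF`, the quantile-based knot placement, the harmonic-mean slope rule (a Fritsch-Carlson-style choice that keeps a cubic Hermite spline monotone), the slope-matched exponential tails, and the `-log` tail-probability score are all assumptions made for illustration.

```python
import bisect
import math
import random


class SplineCDF:
    """Monotone cubic Hermite CDF with exponential tails (illustrative sketch)."""

    def __init__(self, samples, n_knots=12):
        xs = sorted(samples)
        n = len(xs)
        self.x, self.p = [], []
        for k in range(n_knots):
            i = round(k * (n - 1) / (n_knots - 1))
            # Skip ties so knot positions stay strictly increasing.
            if not self.x or xs[i] > self.x[-1]:
                self.x.append(xs[i])
                self.p.append((i + 0.5) / n)  # plotting position, strictly inside (0, 1)
        if len(self.x) < 2:
            raise ValueError("need at least two distinct knot positions")
        # Secant slopes between neighbouring knots (all positive).
        d = [(p1 - p0) / (x1 - x0)
             for x0, x1, p0, p1 in zip(self.x, self.x[1:], self.p, self.p[1:])]
        # Knot slopes: harmonic mean of adjacent secants, secant slope at the ends.
        # Each slope is at most twice either adjacent secant, which is sufficient
        # for the cubic Hermite interpolant to be monotone.
        self.m = [d[0]] + [2 * a * b / (a + b) for a, b in zip(d, d[1:])] + [d[-1]]

    def cdf_and_survival(self, v):
        """Return (F(v), 1 - F(v)); each tail is computed directly to avoid cancellation."""
        x, p, m = self.x, self.p, self.m
        if v <= x[0]:
            # Lower tail: exponential decay matching value and slope at the first knot.
            f = p[0] * math.exp((v - x[0]) * m[0] / p[0])
            return f, 1.0 - f
        if v >= x[-1]:
            # Upper tail: exponential decay of the survival function.
            q = 1.0 - p[-1]
            s = q * math.exp(-(v - x[-1]) * m[-1] / q)
            return 1.0 - s, s
        k = bisect.bisect_right(x, v) - 1
        h = x[k + 1] - x[k]
        t = (v - x[k]) / h
        # Cubic Hermite basis functions on [0, 1].
        h00 = (1 + 2 * t) * (1 - t) ** 2
        h10 = t * (1 - t) ** 2
        h01 = t * t * (3 - 2 * t)
        h11 = t * t * (t - 1)
        f = h00 * p[k] + h10 * h * m[k] + h01 * p[k + 1] + h11 * h * m[k + 1]
        return f, 1.0 - f

    def cdf(self, v):
        return self.cdf_and_survival(v)[0]

    def score(self, v):
        """Assumed score: -log of the two-sided tail probability (0 at the median)."""
        f, s = self.cdf_and_survival(v)
        return -math.log(max(2.0 * min(f, s), 1e-300))


# Hypothetical calibration data: skewed projections along one dimension.
rng = random.Random(0)
calib = [rng.lognormvariate(0.0, 0.5) for _ in range(2000)]
model = SplineCDF(calib, n_knots=12)

for v in (1.0, 2.5, 6.0):
    print(f"x={v:4.1f}  cdf={model.cdf(v):.4f}  score={model.score(v):.2f}")
```

Only the knot positions, knot probabilities, and knot slopes need to be stored per dimension, which is where the compactness comes from. In this sketch a projection near the median scores close to zero, and the score grows without a hard cutoff as the projection moves into either tail.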
## Consequences

**Positive**:

- **Smooth scoring**: Continuous score rather than hard threshold, avoiding cliff-edge behavior
- **Tail sensitivity**: Exponential tails capture rare-but-critical anomalous inputs without flagging the bulk of normal inputs
- **Parametric compactness**: A handful of spline knots (10–20) represent the full distribution shape. Very small storage footprint.
- **Differentiability**: Scores are differentiable, which leaves room for future adversarial training or gradient-based analysis
- **No distributional assumptions**: Unlike Gaussian, spline distributions handle skew, heavy tails, and non-standard shapes

**Negative**:

- More complex than Gaussian: requires spline fitting during codebook compilation
- Spline knot selection affects scoring quality: poor knot placement can miss important distribution features
- Less familiar to most ML practitioners than Gaussian or KDE

## References

- [codebook.md](../codebook.md)
- metaspline PoC: `spline.py`, `transform.py`, `space.py` (~280 lines total)