# ADR-005: Safetensors-Only Model Loading

## Status

Accepted

## Context

Model weight files come in two formats:

- **Pickle-based** (`.pt`, `.bin`, `.pth`): Can execute arbitrary Python code
  during loading. Known supply chain attack vector.
- **safetensors**: Simple binary format with JSON header. No code execution.
  76x faster CPU loading. Zero-copy/lazy loading support.

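To make the format comparison concrete, here is a minimal sketch that reads a safetensors header using only the Python standard library. It relies on the published file layout: an 8-byte little-endian length, then a UTF-8 JSON header, then the raw tensor bytes. The helper names `read_safetensors_header` and `write_tiny_safetensors` and the `MAX_HEADER_BYTES` cap are illustrative assumptions, not part of alknet-firewall or of the safetensors library.

```python
import json
import struct
import tempfile
from pathlib import Path

# Hypothetical sanity cap for this sketch; the value is an assumption.
MAX_HEADER_BYTES = 100 * 1024 * 1024


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be safetensors")
        # First 8 bytes: unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > MAX_HEADER_BYTES:
            raise ValueError("header length exceeds sanity cap")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("truncated header")
    # The header is plain JSON: parsing it cannot run code.
    return json.loads(raw.decode("utf-8"))


def write_tiny_safetensors(path):
    """Write a one-tensor file by hand (illustration only)."""
    data = struct.pack("<2f", 1.0, 2.0)  # two float32 values, 8 bytes
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    raw = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(raw)))
        f.write(raw)
        f.write(data)


# Round-trip demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "tiny.safetensors"
    write_tiny_safetensors(demo_path)
    header = read_safetensors_header(demo_path)

print(header)
```

Reading the metadata is a length read plus `json.loads`; nothing in the file is executed, which is the property pickle-based formats lack. In practice the safetensors library would do this parsing.
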
This is a security product. Loading untrusted pickle files in a security
product is a contradiction. The LiteLLM supply chain attack (CVE-2026-33634,
CVSS 9.4) demonstrated that compromised model files can lead to credential
theft and backdoors.

## Decision

Only load model weights from safetensors format. Never load `.pt`, `.bin`,
or `.pth` files. Apply this policy to both the detector model and the codebook
tensors.

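One way this policy could be enforced at the file boundary is a small guard that runs before any loader is called. The sketch below is an assumption about how such a check might look: `require_safetensors` and `UnsafeWeightFormatError` are hypothetical names, not the library's actual API.

```python
from pathlib import Path

# Pickle-based extensions named in this ADR.
PICKLE_SUFFIXES = frozenset({".pt", ".bin", ".pth"})
ALLOWED_SUFFIX = ".safetensors"


class UnsafeWeightFormatError(ValueError):
    """Raised when a weight file is not in safetensors format (hypothetical name)."""


def require_safetensors(path):
    """Return the path unchanged if it names a safetensors file, else raise."""
    p = Path(path)
    suffix = p.suffix.lower()  # compare case-insensitively, so ".PTH" is rejected too
    if suffix in PICKLE_SUFFIXES:
        raise UnsafeWeightFormatError(f"refusing pickle-based weights: {p.name}")
    if suffix != ALLOWED_SUFFIX:
        raise UnsafeWeightFormatError(f"unsupported weight format: {p.name}")
    return p


# The same guard would apply to detector weights and codebook tensors alike.
checked = require_safetensors("codebook/basis.safetensors")
print(checked.name)
```

The guard only inspects the file name. The actual protection comes from handing the file to a safetensors parser (for example `safetensors.torch.load_file`), which never unpickles.
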
## Consequences

**Positive**:

- Eliminates an entire class of supply chain attacks via model files
- 76x faster model loading on CPU
- Zero-copy/lazy loading reduces memory usage
- Cross-framework compatible (PyTorch, ONNX, NumPy)
- Consistent with HuggingFace's own migration to safetensors-default

**Negative**:

- Some older models only ship `.bin` weights and must be converted before use
- Safetensors doesn't support saving optimizer state (irrelevant, since we
  only do inference)
- Explicit `use_safetensors=True` parameter needed in transformers for older
  versions

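The last point can be handled by always passing the flag, whatever the installed transformers version. This is a minimal sketch under that assumption: `safetensors_only_kwargs` and `load_detector_model` are hypothetical helper names, while `AutoModel.from_pretrained` and its `use_safetensors` argument come from transformers.

```python
def safetensors_only_kwargs(**extra):
    """Build from_pretrained keyword arguments that always force safetensors."""
    if extra.get("use_safetensors") is False:
        # A caller trying to opt out would violate this ADR.
        raise ValueError("use_safetensors=False is not allowed")
    return {**extra, "use_safetensors": True}


def load_detector_model(model_id, **extra):
    """Load a model via transformers with safetensors enforced (sketch)."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import AutoModel

    return AutoModel.from_pretrained(model_id, **safetensors_only_kwargs(**extra))


print(safetensors_only_kwargs(revision="main"))
```

With the flag set, a repository that ships only `.bin` weights is expected to fail to load instead of falling back to pickle, so the conversion step has to happen before the model reaches this code.
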
## References

- [python-ml-packaging.md](../research/python-ml-packaging.md), Section 6:
  safetensors format comparison
- CVE-2026-33634: LiteLLM supply chain attack