feat: initial architecture specification and research
Phase 0→1 setup for alknet-firewall, a behavioral signal detection library
that screens untrusted LLM inputs using small model activations.

Architecture docs (5 specs, 10 ADRs, 7 open questions):
- overview: vision, scope, dependencies, package structure
- firewall: core API, alarm protocol, score composition, error handling
- codebook: SVD basis, spline distributions, calibration, tensor format
- model: activation extraction, model-agnostic interface, lazy loading
- configuration: thresholds, model selection, detection tuning

Research reports:
- modern-python-project-setup: uv, pyproject.toml, src layout, ruff, CI
- python-ml-packaging: optional PyTorch, HF Hub download, safetensors
- llm-input-safety-landscape: threat taxonomy, defenses, academic evidence

Agent role adaptations for Python project (replaced Rust conventions).
docs/architecture/decisions/005-safetensors-only.md
@@ -0,0 +1,47 @@
# ADR-005: Safetensors-Only Model Loading

## Status

Accepted

## Context

Model weight files come in two formats:

- **Pickle-based** (`.pt`, `.bin`, `.pth`): Can execute arbitrary Python code
  during loading. Known supply chain attack vector.
- **safetensors**: Simple binary format with JSON header. No code execution.
  76x faster CPU loading. Zero-copy/lazy loading support.

This is a security product. Loading untrusted pickle files in a security
product is a contradiction. The LiteLLM supply chain attack (CVE-2026-33634,
CVSS 9.4) demonstrated that compromised model files can lead to credential
theft and backdoors.

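The "simple binary format with JSON header" point above can be illustrated with the standard library alone. This is a minimal sketch, assuming the published safetensors layout (an 8-byte little-endian header length, then a UTF-8 JSON header, then raw tensor bytes). The function names `read_safetensors_header` and `build_demo_payload` are hypothetical and are not part of alknet-firewall or the safetensors package.

```python
import json
import struct


def read_safetensors_header(blob: bytes) -> dict:
    """Parse only the JSON header of a safetensors payload (hypothetical helper)."""
    if len(blob) < 8:
        raise ValueError("too short to contain a header length")
    # First 8 bytes: header size as an unsigned little-endian 64-bit integer.
    (header_len,) = struct.unpack("<Q", blob[:8])
    if header_len > len(blob) - 8:
        raise ValueError("declared header length exceeds payload size")
    # The header is plain UTF-8 JSON, so reading it involves no unpickling.
    return json.loads(blob[8 : 8 + header_len].decode("utf-8"))


def build_demo_payload() -> bytes:
    """Hand-build a tiny payload holding one float32 tensor 'w' of shape [2]."""
    data = struct.pack("<2f", 1.0, 2.0)  # 8 bytes of raw tensor data
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


header = read_safetensors_header(build_demo_payload())
print(header["w"]["shape"])
```

Because the header only describes names, dtypes, shapes, and byte offsets, a loader can validate a file's layout before touching any tensor data.
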
## Decision

Only load model weights from safetensors format. Never load `.pt`, `.bin`,
or `.pth` files. Apply this policy to both the detector model and the codebook
tensors.

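One way the policy above could be enforced at load time is a path check that runs before any file is opened. This is only a sketch: `check_weight_path`, `load_weights`, and `UnsafeWeightFormatError` are hypothetical names, not the project's API. The only third-party call used is `safetensors.numpy.load_file`, which is imported lazily so the check itself depends on nothing but the standard library.

```python
from pathlib import Path

# Extensions this ADR forbids because they are pickle-based.
PICKLE_SUFFIXES = {".pt", ".bin", ".pth"}


class UnsafeWeightFormatError(ValueError):
    """Raised when a weight file is not in safetensors format (hypothetical name)."""


def check_weight_path(path: str) -> Path:
    """Return the path if it names a .safetensors file, otherwise raise."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix in PICKLE_SUFFIXES:
        raise UnsafeWeightFormatError(f"refusing pickle-based weights: {p.name}")
    if suffix != ".safetensors":
        raise UnsafeWeightFormatError(f"unsupported weight format: {p.name}")
    return p


def load_weights(path: str):
    """Load tensors as NumPy arrays, but only after the policy check passes."""
    checked = check_weight_path(path)
    # Lazy import: requires the safetensors package to be installed.
    from safetensors.numpy import load_file

    return load_file(str(checked))
```

Routing both the detector model and the codebook tensors through a single function like `load_weights` would keep the policy in one place. An extension check alone does not prove a file's contents, so it complements the format's own header parsing and does not replace it.
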
## Consequences

**Positive**:
- Eliminates an entire class of supply chain attacks delivered via model files
- 76x faster model loading on CPU
- Zero-copy/lazy loading reduces memory usage
- Cross-framework compatible (PyTorch, TensorFlow, JAX, NumPy)
- Consistent with HuggingFace's own migration to safetensors-default

**Negative**:
- Some older models only ship `.bin` weights; these must be converted before
  use
- Safetensors doesn't support saving optimizer state (irrelevant: we only
  do inference)
- Older versions of transformers need an explicit `use_safetensors=True`
  argument

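The two practical costs listed under **Negative** (converting `.bin`-only models and passing `use_safetensors=True`) might look roughly like the sketch below. The helper names are hypothetical. It assumes the checkpoint is a plain state dict of tensors, and that `torch`, `safetensors`, and `transformers` are installed. Conversion still has to read a pickle file, so it is assumed to run once, offline, in an isolated environment, never inside the firewall itself.

```python
from pathlib import Path


def converted_path(src: str) -> Path:
    """Target path next to the source, with the suffix swapped to .safetensors."""
    return Path(src).with_suffix(".safetensors")


def convert_checkpoint(src: str) -> Path:
    """One-off offline conversion of a pickle checkpoint (hypothetical helper)."""
    import torch
    from safetensors.torch import save_file

    # weights_only=True restricts unpickling to tensors and basic containers.
    state_dict = torch.load(src, map_location="cpu", weights_only=True)
    # save_file expects contiguous tensors.
    tensors = {name: t.contiguous() for name, t in state_dict.items()}
    dst = converted_path(src)
    save_file(tensors, str(dst))
    return dst


def load_detector(model_id: str):
    """Ask transformers explicitly for safetensors weights (hypothetical helper)."""
    from transformers import AutoModel

    return AutoModel.from_pretrained(model_id, use_safetensors=True)
```

Checkpoints whose tensors share memory (for example tied embeddings) are rejected by `save_file` and would need extra handling that this sketch does not cover.
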
## References

- [python-ml-packaging.md](../research/python-ml-packaging.md), Section 6:
  safetensors format comparison
- CVE-2026-33634: LiteLLM supply chain attack