# ADR-005: Safetensors-Only Model Loading

## Status

Accepted

## Context

Model weight files come in two formats:

- **Pickle-based** (`.pt`, `.bin`, `.pth`): Can execute arbitrary Python code during loading. A known supply chain attack vector.
- **safetensors**: Simple binary format with a JSON header. No code execution. 76x faster CPU loading. Zero-copy/lazy loading support.

This is a security product. Loading untrusted pickle files in a security product is a contradiction. The LiteLLM supply chain attack (CVE-2026-33634, CVSS 9.4) demonstrated that compromised model files can lead to credential theft and backdoors.

## Decision

Only load model weights from safetensors format. Never load `.pt`, `.bin`, or `.pth` files. Apply this policy to both the detector model and the codebook tensors.

## Consequences

**Positive**:

- Eliminates an entire class of supply chain attacks via model files
- 76x faster model loading on CPU
- Zero-copy/lazy loading reduces memory usage
- Cross-framework compatible (PyTorch, ONNX, numpy)
- Consistent with HuggingFace's own migration to safetensors-default

**Negative**:

- Some older models only ship `.bin` weights and must be converted before use
- Safetensors doesn't support saving optimizer state (irrelevant, since we only do inference)
- Older versions of transformers need an explicit `use_safetensors=True` parameter

## References

- [python-ml-packaging.md](../research/python-ml-packaging.md), Section 6: safetensors format comparison
- CVE-2026-33634: LiteLLM supply chain attack
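
## Appendix: Enforcement Sketch

The Decision could be enforced with a small guard that runs before any weight file reaches a loader. The following is a minimal sketch, not the project's actual implementation: the names `require_safetensors`, `read_safetensors_header`, and `UnsafeWeightFormatError` are hypothetical. It relies only on the safetensors on-disk layout (an 8-byte little-endian header length followed by a JSON header), so it also rejects a pickle file that has merely been renamed to `.safetensors`.

```python
import json
import struct
from pathlib import Path

# Pickle-based extensions that the Decision section forbids.
PICKLE_SUFFIXES = {".pt", ".bin", ".pth"}


class UnsafeWeightFormatError(ValueError):
    """Hypothetical error type: the file is not acceptable safetensors input."""


def read_safetensors_header(path: Path) -> dict:
    """Parse only the JSON header of a safetensors file; tensor data is not read.

    Layout: 8-byte little-endian unsigned header length N, then N bytes of JSON.
    """
    file_size = path.stat().st_size
    with path.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise UnsafeWeightFormatError(f"{path}: too short to be safetensors")
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > file_size - 8:
            raise UnsafeWeightFormatError(f"{path}: header length exceeds file size")
        try:
            header = json.loads(f.read(header_len).decode("utf-8"))
        except ValueError as exc:  # covers UnicodeDecodeError and JSONDecodeError
            raise UnsafeWeightFormatError(f"{path}: header is not valid JSON") from exc
    if not isinstance(header, dict):
        raise UnsafeWeightFormatError(f"{path}: header is not a JSON object")
    return header


def require_safetensors(path) -> Path:
    """Return the path if it looks like a real safetensors file, else raise."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix in PICKLE_SUFFIXES:
        raise UnsafeWeightFormatError(f"{path}: pickle-based weights are not allowed")
    if suffix != ".safetensors":
        raise UnsafeWeightFormatError(f"{path}: expected a .safetensors file")
    read_safetensors_header(path)  # structural check; raises on mismatch
    return path
```

Under these assumptions, both the detector model and the codebook tensors would pass through `require_safetensors` first, and only then be handed to a loader such as `safetensors.torch.load_file`. The header check is a structural sanity test only; it does not establish that the file comes from a trusted source.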