alknet-firewall/docs/architecture/decisions/005-safetensors-only.md

# ADR-005: Safetensors-Only Model Loading

## Status

Accepted

## Context

Model weight files are distributed in two kinds of format:

- **Pickle-based** (`.pt`, `.bin`, `.pth`): Unpickling can execute arbitrary
  Python code at load time. Known supply chain attack vector.
- **safetensors**: Simple binary format: a JSON header followed by raw tensor
  bytes. No code execution on load. Roughly 76x faster CPU loading in the
  safetensors project's own benchmark. Zero-copy/lazy loading support.
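
The pickle risk in the first bullet can be shown with the standard library alone. The sketch below is illustrative and not part of the firewall: `Payload` is a hypothetical class, and `eval` on a harmless arithmetic string stands in for whatever an attacker would actually run.

```python
import pickle


class Payload:
    """Hypothetical malicious object hidden inside a 'weights' file."""

    # pickle stores the (callable, args) pair returned by __reduce__;
    # the unpickler then calls callable(*args) while loading.
    def __reduce__(self):
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes runs eval("6 * 7"). No Payload object is ever rebuilt;
# the caller simply gets back whatever the attacker-chosen call returned.
result = pickle.loads(blob)
print(result)
```

Nothing in `blob` looks like code to a casual reader, yet loading it calls a function chosen by whoever produced the file. A safetensors file has no equivalent mechanism: its header is plain JSON and the rest is tensor bytes.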

This is a security product. Loading untrusted pickle files in a security
product is a contradiction. The LiteLLM supply chain attack (CVE-2026-33634,
CVSS 9.4) demonstrated that compromised model files can lead to credential
theft and backdoors.

## Decision

Load model weights only from safetensors files. Never load `.pt`, `.bin`,
or `.pth` files. This policy applies to both the detector model and the
codebook tensors.
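
One way such a policy could be enforced at load time is sketched below, assuming a single gate that every weights path passes through. The names `check_weights_file`, `UnsafeWeightsError`, and `MAX_HEADER_BYTES` are hypothetical and do not come from the codebase. The gate refuses any suffix other than `.safetensors`, then confirms that the file starts with an 8-byte little-endian header length followed by a JSON object.

```python
import json
import struct
from pathlib import Path

ALLOWED_SUFFIX = ".safetensors"
MAX_HEADER_BYTES = 100 * 1024 * 1024  # assumed sanity cap chosen for this sketch


class UnsafeWeightsError(ValueError):
    """Raised when a weights file violates the safetensors-only policy."""


def check_weights_file(path):
    """Return the parsed safetensors JSON header, or raise UnsafeWeightsError."""
    path = Path(path)
    # Policy gate: anything that is not .safetensors (.pt, .bin, .pth, ...) is refused.
    if path.suffix.lower() != ALLOWED_SUFFIX:
        raise UnsafeWeightsError(f"refusing to load {path.name}: not a safetensors file")
    with path.open("rb") as fh:
        prefix = fh.read(8)
        if len(prefix) != 8:
            raise UnsafeWeightsError(f"{path.name}: too short to hold a header length")
        # The file begins with the header size as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > MAX_HEADER_BYTES:
            raise UnsafeWeightsError(f"{path.name}: implausible header size {header_len}")
        raw = fh.read(header_len)
    if len(raw) != header_len:
        raise UnsafeWeightsError(f"{path.name}: truncated header")
    try:
        header = json.loads(raw.decode("utf-8"))
    except ValueError as exc:  # covers UnicodeDecodeError and json.JSONDecodeError
        raise UnsafeWeightsError(f"{path.name}: header is not valid JSON") from exc
    if not isinstance(header, dict):
        raise UnsafeWeightsError(f"{path.name}: header must be a JSON object")
    return header
```

Passing this check only says the file is structurally a safetensors file. Reading the tensors themselves would still go through the safetensors library.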

## Consequences

**Positive**:
- Eliminates an entire class of supply chain attacks delivered via model files
- Roughly 76x faster model loading on CPU, per the safetensors project's
  benchmark
- Zero-copy/lazy loading reduces memory usage
- Cross-framework compatible (the safetensors library ships loaders for
  PyTorch, NumPy, TensorFlow, and Flax)
- Consistent with Hugging Face's own move to safetensors as the default format

**Negative**:
- Some older models ship only `.bin` weights and must be converted before use
- Safetensors doesn't support saving optimizer state (irrelevant, since we
  only do inference)
- Older versions of transformers need an explicit `use_safetensors=True`
  argument to `from_pretrained`
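
The last point can be sketched as a thin wrapper that always passes `use_safetensors=True`. This is a minimal illustration, not the project's loader: `load_detector` and its `from_pretrained` hook are hypothetical names, and `AutoModel` is assumed here only because the ADR does not say which transformers class the detector uses.

```python
def load_detector(model_id, from_pretrained=None):
    """Load a model while insisting on safetensors weights.

    `from_pretrained` is an injection point for tests; by default the
    transformers loader is imported lazily so this module imports cleanly
    even where transformers is not installed.
    """
    if from_pretrained is None:
        from transformers import AutoModel  # assumed model class for this sketch

        from_pretrained = AutoModel.from_pretrained
    # With use_safetensors=True, transformers is asked for .safetensors weights
    # explicitly rather than being left to pick a checkpoint format itself.
    return from_pretrained(model_id, use_safetensors=True)
```

Setting the flag explicitly keeps the behaviour the same across transformers versions, including those where safetensors is not yet the default.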

## References

- [python-ml-packaging.md](../research/python-ml-packaging.md) — Section 6:
  safetensors format comparison
- CVE-2026-33634 — LiteLLM supply chain attack