alknet-firewall/docs/architecture/decisions/005-safetensors-only.md
glm-5.1 cf464c2296 feat: initial architecture specification and research
Phase 0→1 setup for alknet-firewall — a behavioral signal detection
library that screens untrusted LLM inputs using small model activations.

Architecture docs (5 specs, 10 ADRs, 7 open questions):
- overview: vision, scope, dependencies, package structure
- firewall: core API, alarm protocol, score composition, error handling
- codebook: SVD basis, spline distributions, calibration, tensor format
- model: activation extraction, model-agnostic interface, lazy loading
- configuration: thresholds, model selection, detection tuning

Research reports:
- modern-python-project-setup: uv, pyproject.toml, src layout, ruff, CI
- python-ml-packaging: optional PyTorch, HF Hub download, safetensors
- llm-input-safety-landscape: threat taxonomy, defenses, academic evidence

Agent role adaptations for Python project (replaced Rust conventions).
2026-06-13 05:17:40 +00:00


ADR-005: Safetensors-Only Model Loading

Status

Accepted

Context

Model weight files come in two formats:

  • Pickle-based (.pt, .bin, .pth): Can execute arbitrary Python code during loading. Known supply chain attack vector.
  • safetensors: Simple binary format with a JSON header. No code execution on load. Up to 76x faster CPU loading in the safetensors project's own benchmark. Supports zero-copy and lazy loading.
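
To make the "no code execution" point concrete, the sketch below reads a safetensors header using only the standard library. It assumes the published file layout: an 8-byte little-endian unsigned length, then a UTF-8 JSON header, then raw tensor bytes. The helper names (`write_minimal_safetensors`, `read_safetensors_header`) are hypothetical and are not part of alknet-firewall or the safetensors package.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_minimal_safetensors(path: Path) -> None:
    """Write a tiny safetensors file holding one float32 tensor of shape [2]."""
    data = struct.pack("<2f", 1.0, 2.0)  # 8 bytes of raw tensor data
    header = {
        "weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]},
    }
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with path.open("wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian length
        f.write(header_bytes)
        f.write(data)


def read_safetensors_header(path: Path) -> dict:
    """Return the JSON header. Bytes are read and parsed as JSON; nothing is executed."""
    with path.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.safetensors"
    write_minimal_safetensors(demo_path)
    header = read_safetensors_header(demo_path)
    print(header["weight"]["shape"])
```

By contrast, unpickling a .pt or .bin file can invoke arbitrary callables named inside the file, so there is no equivalent "inspect without running" step for the pickle-based formats.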

This is a security product. Loading untrusted pickle files in a security product is a contradiction. The LiteLLM supply chain attack (CVE-2026-33634, CVSS 9.4) demonstrated that compromised model files can lead to credential theft and backdoors.

Decision

Load model weights only from the safetensors format. Never load .pt, .bin, or .pth files. This policy applies to both the detector model and the codebook tensors.
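
One way this policy could be enforced at the loading boundary is a small path check that runs before any file is opened. This is a minimal sketch, not the project's actual API: `require_safetensors` and `UnsafeWeightFormatError` are hypothetical names, and a suffix check is only a first gate, since a renamed file would still need to parse as safetensors.

```python
from pathlib import Path

ALLOWED_SUFFIX = ".safetensors"
PICKLE_SUFFIXES = {".pt", ".bin", ".pth"}  # the formats this ADR forbids


class UnsafeWeightFormatError(ValueError):
    """Raised when a weight file is not in safetensors format (hypothetical name)."""


def require_safetensors(path: str) -> Path:
    """Return the path if it names a .safetensors file, otherwise raise."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix in PICKLE_SUFFIXES:
        raise UnsafeWeightFormatError(f"refusing pickle-based weight file: {p.name}")
    if suffix != ALLOWED_SUFFIX:
        raise UnsafeWeightFormatError(f"unsupported weight format: {p.name}")
    return p


print(require_safetensors("codebook.safetensors").name)
```

Routing both the detector model and the codebook tensors through the same guard keeps the rule in one place.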

Consequences

Positive:

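  • Eliminates code execution during weight deserialization, and with it an entire class of supply chain attacks via model files
  • Up to 76x faster model loading on CPU (figure from the safetensors project's own benchmark; actual gains depend on model and hardware)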
  • Zero-copy/lazy loading reduces memory usage
  • Cross-framework compatible (PyTorch, TensorFlow, JAX/Flax, NumPy)
  • Consistent with HuggingFace's own migration to safetensors-default
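
The lazy-loading benefit follows from the header recording a byte range for every tensor, so a reader can seek straight to one tensor and skip the rest. The sketch below illustrates this with the standard library only, assuming the published layout in which `data_offsets` are relative to the first byte after the header. The function names are hypothetical, and real code would normally use the safetensors package rather than hand-rolled parsing.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_two_tensor_file(path: Path) -> None:
    """Write a safetensors file with two float32 tensors, 'a' and 'b'."""
    a = struct.pack("<2f", 1.0, 2.0)
    b = struct.pack("<3f", 3.0, 4.0, 5.0)
    header = {
        "a": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(a)]},
        "b": {"dtype": "F32", "shape": [3], "data_offsets": [len(a), len(a) + len(b)]},
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with path.open("wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(a + b)


def read_f32_tensor(path: Path, name: str) -> list:
    """Read one float32 tensor by seeking to its byte range; other tensors are not read."""
    with path.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        start, end = header[name]["data_offsets"]
        f.seek(8 + header_len + start)  # offsets are relative to the end of the header
        raw = f.read(end - start)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "two.safetensors"
    write_two_tensor_file(demo_path)
    tensor_b = read_f32_tensor(demo_path, "b")
    print(tensor_b)
```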

Negative:

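  • Some older models ship only .bin weights and must be converted to safetensors before use
  • Safetensors stores only tensors and string metadata, so full optimizer state cannot be saved as arbitrary Python objects (irrelevant here, since we only do inference)
  • Older transformers versions need an explicit use_safetensors=True argument to from_pretrained so that loading does not fall back to pickle weights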
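
The last point can be sketched as a thin wrapper that always passes `use_safetensors=True` to `from_pretrained`. The wrapper names are hypothetical, and the choice of `AutoModel` is an assumption, since this ADR does not say which transformers class the detector uses. With the flag set, loading is expected to fail for a repository that has no safetensors weights instead of silently using a .bin file, though the exact behavior may vary by transformers version.

```python
def safetensors_only_kwargs() -> dict:
    """Keyword arguments that pin transformers loading to safetensors weights."""
    return {"use_safetensors": True}


def load_detector_model(model_id: str):
    """Load a model with safetensors enforced (hypothetical helper).

    transformers is imported lazily so the package stays an optional dependency.
    """
    from transformers import AutoModel  # assumption: AutoModel fits the detector

    return AutoModel.from_pretrained(model_id, **safetensors_only_kwargs())


print(safetensors_only_kwargs())
```

Keeping the flag in one helper means a single place to audit if the policy ever needs to be verified.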

References

  • python-ml-packaging.md — Section 6: safetensors format comparison
  • CVE-2026-33634 — LiteLLM supply chain attack