Research-driven resolution of OQ-01, OQ-02, OQ-05, OQ-06:

- OQ-01: Remove ONNX Runtime from scope entirely: doesn't support activation extraction natively (optimum #972 closed as not planned), bloated model exports; burn/cublas via safetensors is a better future path
- OQ-02: Codebook compresses ~65% (1,245 → 500-600 lines); add Package Structure and Extraction from PoC sections to codebook.md based on PoC analysis of metaspline firewall_codebook.py
- OQ-05: Standalone API + thin adapter pattern (ADR-011); Phase 1 ships `Firewall.screen()` only, Phase 2 adds <100-line adapter packages for LlamaFirewall, OpenAI Agents SDK, NeMo Guardrails
- OQ-06: TOML for file-based config: standard modern Python, two-way door

Also: research OQ-03 rolling windows from taskgraph-semantic reference code, remove onnxruntime/optimum from dependencies, move streaming screening to Phase 2, add burn/cublas as Phase 3 alternative backend.

64 lines
2.1 KiB
Markdown
# ADR-006: PyTorch as Optional Dependency

## Status

Accepted

## Context

PyTorch is the inference backend for the detector model. However, PyTorch is
large:

- `torch` (CPU): ~200MB download, ~700MB installed
- `torch` (CUDA): ~2.5GB download, ~5GB+ installed

Making PyTorch a required dependency would force a 200MB-2.5GB download on
every user, even those who already have PyTorch installed. This is the
standard problem for ML libraries, and the HuggingFace ecosystem has
converged on a solution.

## Decision

Make PyTorch an optional dependency via extras
(`pip install 'alknet-firewall[torch]'`). The base install includes all non-ML
dependencies (scikit-learn, huggingface-hub, safetensors, tokenizers, numpy).
ML inference backends are installed separately.

Use lazy imports with clear error messages when PyTorch is not installed:

```python
try:
    import torch
except ImportError as exc:
    # Re-raise with installation guidance, keeping the original error as the cause.
    raise ImportError(
        "PyTorch is required for alknet-firewall inference. "
        "Install with: pip install 'alknet-firewall[torch]' "
        "or pip install torch --index-url https://download.pytorch.org/whl/cpu"
    ) from exc
```

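A guard placed at module level runs as soon as the module is imported. To defer the check until inference is actually requested, the import can be moved into a helper that is called on first use. The following is a minimal sketch under that assumption; `require_backend` and `_INSTALL_HINT` are hypothetical names, not part of the package's confirmed API.

```python
import importlib
from types import ModuleType

# Same guidance as the module-level guard, kept in one place.
_INSTALL_HINT = (
    "PyTorch is required for alknet-firewall inference. "
    "Install with: pip install 'alknet-firewall[torch]' "
    "or pip install torch --index-url https://download.pytorch.org/whl/cpu"
)


def require_backend(module_name: str = "torch", hint: str = _INSTALL_HINT) -> ModuleType:
    """Import an optional backend on first use, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(hint) from exc
```

Code that needs the backend would call `torch = require_backend()` inside the function that runs inference, so importing the package itself never fails on a base install.
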
## Consequences

**Positive**:
- Base install is lightweight: ~30MB download, ~100MB installed
- Users with existing PyTorch installations don't re-download it
- Follows HuggingFace ecosystem conventions (transformers, safetensors, HF
  hub all use this pattern)
- uv supports CPU/GPU torch variant selection via `[tool.uv.sources]` and
  `[[tool.uv.index]]`

**Negative**:
- More complex dependency specification in pyproject.toml
- Users must read installation docs to choose the right extra
- Runtime import errors if users forget to install a backend
- CPU-only torch requires a two-step install or uv configuration (can't be
  expressed in pip extras alone)
- PyTorch is the only supported inference backend; future alternatives
  (burn/cublas via safetensors) would require separate integration work

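One way to soften the runtime-import-error consequence is to probe for the backend without importing it, for example at startup or in a diagnostics command. This is a minimal sketch; `backend_available` is a hypothetical helper name, and it assumes a top-level module name such as `torch`.

```python
import importlib.util


def backend_available(module_name: str = "torch") -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if backend_available():
        print("PyTorch backend found")
    else:
        print("PyTorch not installed; run: pip install 'alknet-firewall[torch]'")
```

Because `find_spec` only locates the module, the check stays cheap even when PyTorch is installed, and the full import cost is paid only when inference runs.
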
## References

- [modern-python-project-setup.md](../research/modern-python-project-setup.md),
  Section 2: PyTorch handling
- [python-ml-packaging.md](../research/python-ml-packaging.md), Section 1:
  PyTorch as dependency