docs: resolve 4 open questions, add research, spec codebook package structure

Research-driven resolution of OQ-01, OQ-02, OQ-05, OQ-06:

- OQ-01: Remove ONNX Runtime from scope entirely. It doesn't support
  activation extraction natively (optimum #972 closed as not planned)
  and produces bloated model exports; burn/cublas via safetensors is a
  better future path
- OQ-02: Codebook compresses ~65% (1,245 → 500-600 lines); add Package
  Structure and Extraction from PoC sections to codebook.md based on PoC
  analysis of metaspline firewall_codebook.py
- OQ-05: Standalone API + thin adapter pattern (ADR-011); Phase 1 ships
  Firewall.screen() only, Phase 2 adds <100-line adapter packages for
  LlamaFirewall, OpenAI Agents SDK, NeMo Guardrails
- OQ-06: TOML for file-based config (standard modern Python, two-way
  door)

Also: research OQ-03 rolling windows from taskgraph-semantic reference
code, remove onnxruntime/optimum from dependencies, move streaming
screening to Phase 2, add burn/cublas as Phase 3 alternative backend.
@@ -6,17 +6,16 @@ Accepted
 
 ## Context
 
-PyTorch is the primary inference backend for the detector model. However,
-PyTorch is large:
+PyTorch is the inference backend for the detector model. However, PyTorch is
+large:
 
 - `torch` (CPU): ~200MB download, ~700MB installed
 - `torch` (CUDA): ~2.5GB download, ~5GB+ installed
-- `onnxruntime`: ~30-50MB download, ~300MB installed
 
 Making PyTorch a required dependency would force a 200MB-2.5GB download on
-every user, even those who already have PyTorch installed or prefer ONNX
-Runtime. This is the standard problem for ML libraries, and the HuggingFace
-ecosystem has converged on a solution.
+every user, even those who already have PyTorch installed. This is the
+standard problem for ML libraries, and the HuggingFace ecosystem has
+converged on a solution.
 
 ## Decision
 
@@ -43,7 +42,6 @@ except ImportError:
 **Positive**:
 - Base install is ~30MB download, ~100MB installed — very lightweight
 - Users with existing PyTorch installations don't re-download
-- ONNX Runtime alternative available for minimal footprint (~100MB total)
 - Follows HuggingFace ecosystem conventions (transformers, safetensors, HF
 hub all use this pattern)
 - uv supports CPU/GPU torch variant selection via `[tool.uv.sources]` and
@@ -55,6 +53,8 @@ except ImportError:
 - Runtime import errors if users forget to install a backend
 - CPU-only torch requires two-step install or uv configuration (can't be
 expressed in pip extras alone)
+- PyTorch is the only supported inference backend; future alternatives
+(burn/cublas via safetensors) would require separate integration work
 
 ## References
 
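
The "Runtime import errors if users forget to install a backend" consequence in the hunk above is commonly softened by guarding the backend import and failing with an actionable message. The following is only a sketch of that idea, not the ADR's actual code: `MissingBackendError`, `backend_available`, and `require_backend` are hypothetical names, and the install hint is a generic placeholder rather than the project's real extras syntax.

```python
import importlib
import importlib.util


class MissingBackendError(ImportError):
    """Raised when the optional inference backend is not installed."""


def backend_available(module_name: str = "torch") -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_backend(module_name: str = "torch"):
    """Import and return the backend module, or raise with an install hint."""
    if not backend_available(module_name):
        # Hypothetical wording; the real package may point at its own extras.
        raise MissingBackendError(
            f"No inference backend found: '{module_name}' is not installed. "
            f"Install it separately, e.g. `pip install {module_name}`."
        )
    return importlib.import_module(module_name)
```

Calling `require_backend()` at the point where the detector model is first loaded, rather than at package import time, keeps the base install importable without PyTorch.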
@@ -0,0 +1,75 @@
+# ADR-011: Standalone API with Thin Adapter Integration Strategy
+
+## Status
+
+Accepted
+
+## Context
+
+alknet-firewall provides behavioral signal detection — fundamentally different
+from text-surface defenses like Llama Guard, NeMo Guardrails, or Guardrails AI.
+It requires running a small detector model and extracting hidden state
+activations, not classifying input text. Users may want to run both text-surface
+defenses and behavioral detection in series.
+
+Research into existing guardrail systems ([patterns-analysis.md](../../research/guardrail-integration-patterns/patterns-analysis.md))
+identified three viable integration targets with high compatibility:
+
+- **LlamaFirewall**: `BaseScanner.scan()` → `ScanResult` maps directly to
+`Firewall.screen()` → `Alarm`
+- **OpenAI Agents SDK**: `@input_guardrail` decorator pattern with blocking
+execution
+- **NeMo Guardrails**: Custom Python action in input rails (Colang DSL can't
+express behavioral detection natively)
+
+Two systems have low compatibility: Guardrails AI (expects text-surface
+validators with content fixes, not alarms) and Amazon Bedrock Guardrails
+(closed service, no extension mechanism).
+
+## Decision
+
+**Phase 1**: Ship a standalone API only. No adapters, no common interface.
+
+```python
+# The core API — simple, composable, no framework dependencies
+firewall = Firewall()
+alarm = firewall.screen("untrusted input text")
+```
+
+**Phase 2**: Build thin adapter packages as optional dependencies. Each adapter
+is <100 lines and has no impact on the core library:
+
+- `alknet-firewall-llamafirewall`: Custom `BaseScanner` subclass
+- `alknet-firewall-agents`: `@input_guardrail` wrapper
+- `alknet-firewall-nemo`: Custom NeMo input rail action
+
+Do NOT build a common `ScreeningProvider` interface. The integration patterns
+differ enough between systems that a shared abstraction would be premature and
+constraining. If a common pattern emerges organically from the adapters,
+extract it then.
+
+## Consequences
+
+**Positive**:
+- Phase 1 ships faster — no adapter development or testing overhead
+- Core API stays clean and framework-independent
+- Users can compose manually: call `firewall.screen()` then pass results to
+any guardrail system
+- Adapters are optional packages, not core dependencies — no coupling
+- Thin adapters are easy to maintain when guardrail frameworks change their
+APIs
+
+**Negative**:
+- Phase 1 users must write their own glue code (typically 5–10 lines)
+- No "pip install and configure" experience until Phase 2
+- Multiple small adapter packages to maintain
+- Risk of API drift between core and adapters if adapters are maintained
+infrequently
+
+## References
+
+- [OQ-05](../open-questions.md) — How should the firewall integrate with
+existing guardrail systems?
+- [patterns-analysis.md](../../research/guardrail-integration-patterns/patterns-analysis.md) — Full research analysis
+- [ADR-002](002-behavioral-signals.md) — Behavioral signal detection (not text
+classification)
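
ADR-011 notes that Phase 1 users compose manually, calling `firewall.screen()` and passing the result to whatever guardrail they already run. A rough sketch of what that glue code might look like follows. The `Alarm` fields (`triggered`, `reason`) and the stub `Firewall` body are assumptions made so the example runs on its own, since the real types are not shown in this commit; `screen_in_series` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Alarm:
    # Hypothetical fields; the real Alarm type is not shown in this commit.
    triggered: bool
    reason: str = ""


class Firewall:
    """Stand-in for the real Firewall so the sketch is self-contained."""

    def screen(self, text: str) -> Alarm:
        # Placeholder logic only: the real detector inspects hidden-state
        # activations rather than matching on the input text.
        suspicious = "ignore previous instructions" in text.lower()
        return Alarm(triggered=suspicious, reason="stub match" if suspicious else "")


def screen_in_series(
    text: str,
    text_surface_check: Callable[[str], bool],
    firewall: Firewall,
) -> bool:
    """Return True only if the input passes both checks, run one after another."""
    if not text_surface_check(text):
        return False  # rejected by the existing text-surface guardrail
    return not firewall.screen(text).triggered
```

Here `text_surface_check` stands for any existing guardrail wrapped as a callable that returns `True` when the input is acceptable; running it first means the detector model is only invoked for inputs that survive the cheaper check.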