docs: resolve 4 open questions, add research, spec codebook package structure
Research-driven resolution of OQ-01, OQ-02, OQ-05, OQ-06:

- OQ-01: Remove ONNX Runtime from scope entirely. It doesn't support activation extraction natively (optimum #972 closed as not planned) and its model exports are bloated; burn/cublas via safetensors is a better future path.
- OQ-02: Codebook compresses ~65% (1,245 → 500-600 lines); add Package Structure and Extraction from PoC sections to codebook.md based on PoC analysis of metaspline firewall_codebook.py.
- OQ-05: Standalone API + thin adapter pattern (ADR-011); Phase 1 ships Firewall.screen() only, Phase 2 adds <100-line adapter packages for LlamaFirewall, OpenAI Agents SDK, NeMo Guardrails.
- OQ-06: TOML for file-based config: standard modern Python, two-way door.

Also: research OQ-03 rolling windows from taskgraph-semantic reference code, remove onnxruntime/optimum from dependencies, move streaming screening to Phase 2, add burn/cublas as Phase 3 alternative backend.
@@ -46,6 +46,7 @@ raises "behavioral alarms" without needing to know specific attack types.
 | [008](decisions/008-three-level-alarm.md) | Three-Level Alarm System | Accepted |
 | [009](decisions/009-last-token-extraction.md) | Last-Token Activation Extraction | Accepted |
 | [010](decisions/010-monotonic-spline-distributions.md) | Monotonic Spline Distributions | Accepted |
+| [011](decisions/011-guardrail-integration-strategy.md) | Standalone API + Thin Adapter Integration | Accepted |

 ## Open Questions

@@ -53,12 +54,12 @@ See [open-questions.md](open-questions.md) for the full tracker.

 | OQ | Question | Priority | Status |
 |----|----------|----------|--------|
-| OQ-01 | Should ONNX Runtime be a supported inference backend in Phase 1? | medium | open |
-| OQ-02 | What is the minimum viable codebook — can the 1,245-line codebook be compressed? | high | open |
-| OQ-03 | Should the firewall support streaming/chunked input screening? | medium | open |
+| ~~OQ-01~~ | ~~Should ONNX Runtime be a supported inference backend in Phase 1?~~ | ~~medium~~ | **resolved** (removed from scope; burn/cublas is better future path) |
+| ~~OQ-02~~ | ~~What is the minimum viable codebook — can the 1,245-line codebook be compressed?~~ | ~~high~~ | **resolved** (~65% compression to 500–600 lines) |
+| OQ-03 | Should the firewall support streaming/chunked input screening? | medium | open (research complete, Phase 2) |
 | ~~OQ-04~~ | ~~Should detection thresholds be per-model or globally configurable?~~ | ~~medium~~ | **resolved** (both: model-specific defaults, user-overridable) |
-| OQ-05 | How should the firewall integrate with existing guardrail systems? | medium | open |
-| OQ-06 | Should file-based configuration use TOML or YAML? | low | open |
+| ~~OQ-05~~ | ~~How should the firewall integrate with existing guardrail systems?~~ | ~~medium~~ | **resolved** (ADR-011: standalone API + thin adapters) |
+| ~~OQ-06~~ | ~~Should file-based configuration use TOML or YAML?~~ | ~~low~~ | **resolved** (TOML) |

 ## Document Lifecycle
