docs: resolve 4 open questions, add research, spec codebook package structure

Research-driven resolution of OQ-01, OQ-02, OQ-05, OQ-06:

- OQ-01: Remove ONNX Runtime from scope entirely: it doesn't support activation extraction natively (optimum #972 closed as not planned) and produces bloated model exports; burn/cublas via safetensors is a better future path
- OQ-02: Codebook compresses ~65% (1,245 → 500-600 lines); add Package Structure and Extraction from PoC sections to codebook.md based on PoC analysis of metaspline firewall_codebook.py
- OQ-05: Standalone API + thin adapter pattern (ADR-011); Phase 1 ships Firewall.screen() only, Phase 2 adds <100-line adapter packages for LlamaFirewall, OpenAI Agents SDK, NeMo Guardrails
- OQ-06: TOML for file-based config: standard modern Python, two-way door

Also: research OQ-03 rolling windows from taskgraph-semantic reference code, remove onnxruntime/optimum from dependencies, move streaming screening to Phase 2, add burn/cublas as Phase 3 alternative backend.
@@ -4,45 +4,40 @@ Centralized tracker for unresolved questions across all architecture documents.

 ## Theme: Inference Backend

-### OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?
+### ~~OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?~~

 - **Origin**: [model.md](model.md), [overview.md](overview.md)
-- **Status**: open
+- **Status**: **resolved**
 - **Priority**: medium
-- **Resolution**: (pending — needs research into ONNX export path)
+- **Resolution**: Removed from scope entirely. ONNX Runtime does not support
+`output_hidden_states=True` natively (HuggingFace optimum issue #972 was
+closed as "not planned"), making activation extraction — the core operation —
+impractical without a custom ONNX graph modification pipeline. The ONNX
+model format also produces bloated exports. A future alternative inference
+path using burn/cublas with safetensors is more promising since it supports
+all platforms and uses the same model format we already require.
 - **Cross-references**: ADR-006
-
-ONNX Runtime provides a much smaller install footprint (~30-50MB vs 200MB-2.5GB
-for PyTorch) and is well-suited for inference-only use. HuggingFace's `optimum`
-library provides drop-in replacement classes. However, supporting it in Phase 1
-adds complexity: model must be exported to ONNX format, `optimum` integration
-must be tested, and the activation extraction API may differ from PyTorch.
-
-The likely path is: build with PyTorch first, then export to ONNX by default.
-This needs research to confirm the activation extraction API compatibility and
-ONNX export quality for SmolLM2-135M. Leave open for now.

 ---

 ## Theme: Codebook Design

-### OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?
+### ~~OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?~~

 - **Origin**: [codebook.md](codebook.md)
-- **Status**: open
+- **Status**: **resolved**
 - **Priority**: high
-- **Resolution**: (pending — dedicated research session needed)
+- **Resolution**: Yes — ~65% compression to 500–600 lines total (400–500 runtime
++ 150–200 training). The PoC contains ~480 lines of essential runtime code
+plus ~178 lines needed from metaspline core. The 5x-repeated decomposition
+pipeline collapses into a single `decompose()` function (~50 lines saved).
+The histogram classifier (~130 lines) is exploratory and not MVP. The
+`build()` method (429 lines) is decomposed: training logic moves to
+`training/compiler.py`, runtime state becomes immutable serialized data.
+See [poc-architecture.md](../research/codebook-analysis/poc-architecture.md)
+and the Package Structure section in [codebook.md](codebook.md).
 - **Cross-references**: ADR-004
-
-The PoC codebook is 1,245 lines — much of it may be boilerplate, dead code,
-or excessive parameterization from the research phase. Understanding what's
-essential vs. exploratory is critical for the initial extraction. The codebook
-training pipeline (`run_manifold_projection.py`) should also be analyzed.
-
-Consider: How many SVD dimensions are actually needed? What's the minimum
-calibration dataset? Can spline distributions be simplified? This needs a
-dedicated session to analyze the PoC codebase.

 ---
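
The OQ-02 resolution separates training logic (`training/compiler.py`) from runtime state that is "immutable serialized data". A minimal sketch of what that split could look like, assuming a frozen dataclass round-tripped through JSON. The `CodebookState` name and all of its fields are hypothetical and are not taken from the PoC or from codebook.md:

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CodebookState:
    """Hypothetical runtime state: written by training, read-only at runtime."""

    layer: int                                  # hypothetical: source layer index
    components: tuple[tuple[float, ...], ...]   # hypothetical: projection rows
    threshold: float                            # hypothetical: alarm threshold

    def to_json(self) -> str:
        # asdict() yields plain containers; json.dumps writes tuples as lists.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "CodebookState":
        raw = json.loads(payload)
        return cls(
            layer=raw["layer"],
            # Convert lists back to tuples so the loaded state stays hashable
            # and compares equal to the original.
            components=tuple(tuple(row) for row in raw["components"]),
            threshold=raw["threshold"],
        )
```

With `frozen=True`, any attempt to assign to a field after construction raises `dataclasses.FrozenInstanceError`, so runtime code cannot mutate what the training step produced.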

 ## Theme: API Design
@@ -103,42 +98,30 @@ candidate for Phase 2.

 ## Theme: Integration

-### OQ-05: How should the firewall integrate with existing guardrail systems?
+### ~~OQ-05: How should the firewall integrate with existing guardrail systems?~~

 - **Origin**: [firewall.md](firewall.md), [overview.md](overview.md)
-- **Status**: open
+- **Status**: **resolved**
 - **Priority**: medium
-- **Resolution**: (pending — needs deep dive into current guardrail landscape)
-- **Cross-references**: ADR-002
-
-The behavioral firewall is complementary to text-surface defenses. Users may
-want to run both Llama Guard (text classification) and alknet-firewall
-(behavioral signals) in series. However, what we're doing is fundamentally
-different — it requires having the model and having trained on its specific
-behavioral signals. This means direct API-level integration with other systems
-may not be straightforward.
-
-A deep dive into the current state of guardrail integration patterns
-(LlamaFirewall's scanner interface, NeMo Guardrails' Colang DSL, etc.) is
-needed to determine whether we should build adapters, define a common
-interface, or simply provide a clean standalone API and let users compose
-systems themselves.
-
-Leave open — will research soon.
+- **Resolution**: Standalone API + thin adapter pattern (ADR-011). Phase 1:
+ship the standalone `Firewall.screen(text) → Alarm` API only. Phase 2:
+build thin adapter packages (<100 lines each) for LlamaFirewall,
+OpenAI Agents SDK, and NeMo Guardrails as optional dependencies. Do NOT
+build a common `ScreeningProvider` interface — behavioral detection is
+fundamentally different from text-surface defenses and premature abstraction
+would be constraining.
+- **Cross-references**: ADR-002, ADR-011

 ---
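
The OQ-05 resolution describes a standalone `Firewall.screen(text) → Alarm` API with thin per-framework adapters layered on top. A minimal sketch of that shape, under stated assumptions: only the `Firewall.screen` name and the `Alarm` return type come from the resolution text. The `Alarm` fields, the `scorer` callable, and `ExampleFrameworkAdapter` with its `scan()` contract are hypothetical and do not reproduce the real LlamaFirewall, OpenAI Agents SDK, or NeMo Guardrails interfaces.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Alarm:
    """Hypothetical result type; the real Alarm fields are not given here."""

    triggered: bool
    score: float


class Firewall:
    """Stand-in for the standalone API: screen(text) -> Alarm."""

    def __init__(self, scorer: Callable[[str], float], threshold: float = 0.5) -> None:
        # `scorer` is a placeholder for the real behavioral-signal pipeline.
        self._scorer = scorer
        self._threshold = threshold

    def screen(self, text: str) -> Alarm:
        score = self._scorer(text)
        return Alarm(triggered=score >= self._threshold, score=score)


class ExampleFrameworkAdapter:
    """Thin adapter: maps a host framework's call shape onto screen()."""

    def __init__(self, firewall: Firewall) -> None:
        self._firewall = firewall

    def scan(self, message: str) -> dict:
        # Hypothetical host contract: a dict with "allow" and "reason" keys.
        alarm = self._firewall.screen(message)
        return {"allow": not alarm.triggered, "reason": f"score={alarm.score:.2f}"}
```

The adapter holds no detection logic of its own; it only translates inputs and outputs. That is what keeps each adapter small and lets the core package ship without any guardrail framework as a dependency.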

 ## Theme: Project Setup

-### OQ-06: Should file-based configuration use TOML or YAML?
+### ~~OQ-06: Should file-based configuration use TOML or YAML?~~

 - **Origin**: [configuration.md](configuration.md)
-- **Status**: open
+- **Status**: **resolved**
 - **Priority**: low
-- **Resolution**: (pending — Phase 2 concern)
-- **Cross-references**: None
-
-Phase 1 uses constructor-based configuration only. A future phase may add
-file-based configuration for easier deployment. TOML is consistent with
-Python packaging (pyproject.toml) and increasingly the standard for Python
-config. YAML is more familiar in ops/ML contexts. Either works.
+- **Resolution**: TOML. Consistent with modern Python packaging conventions
+(`pyproject.toml`) and increasingly the standard for Python configuration.
+This is a two-way door decision — reverting to YAML later is straightforward.
+- **Cross-references**: None