alknet-firewall/docs/architecture/open-questions.md
glm-5.1 c225cf420c docs: resolve OQ-03 — adopt rolling token window screening (ADR-012)
Research confirmed rolling token windows as the right approach for long
document screening. ADR-012 formalizes the decision: Phase 2 implements
screen_document() with 25% overlap (512 tokens for SmolLM2-135M), max
pooling aggregation, and character offset tracking. Short inputs fall
through to screen() unchanged.

This resolves the last open question. All 6 original OQs are now resolved:
- OQ-01: ONNX removed (burn/cublas better future path)
- OQ-02: 65% codebook compression achievable
- OQ-03: Rolling token windows for Phase 2 (ADR-012)
- OQ-04: Both model-specific defaults + user-overridable
- OQ-05: Standalone API + thin adapters (ADR-011)
- OQ-06: TOML for file-based config
2026-06-13 08:25:12 +00:00

# Open Questions
Centralized tracker for unresolved questions across all architecture documents.
## Theme: Inference Backend
### ~~OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?~~
- **Origin**: [model.md](model.md), [overview.md](overview.md)
- **Status**: **resolved**
- **Priority**: medium
- **Resolution**: Removed from scope entirely. ONNX Runtime does not support
`output_hidden_states=True` natively (HuggingFace optimum issue #972 was
closed as "not planned"), making activation extraction — the core operation —
impractical without a custom ONNX graph modification pipeline. The ONNX
model format also produces bloated exports. A future alternative inference
path using burn/cublas with safetensors is more promising since it supports
all platforms and uses the same model format we already require.
- **Cross-references**: ADR-006
---
## Theme: Codebook Design
### ~~OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?~~
- **Origin**: [codebook.md](codebook.md)
- **Status**: **resolved**
- **Priority**: high
- **Resolution**: Yes: ~65% compression to 500–600 lines total (400–500 runtime +
150–200 training). The PoC contains ~480 lines of essential runtime code
plus ~178 lines needed from metaspline core. The 5x-repeated decomposition
pipeline collapses into a single `decompose()` function (~50 lines saved).
The histogram classifier (~130 lines) is exploratory and not MVP. The
`build()` method (429 lines) is decomposed: training logic moves to
`training/compiler.py`, runtime state becomes immutable serialized data.
See [poc-architecture.md](../research/codebook-analysis/poc-architecture.md)
and the Package Structure section in [codebook.md](codebook.md).
- **Cross-references**: ADR-004
---
## Theme: API Design
### ~~OQ-03: Should the firewall support streaming/chunked input screening?~~
- **Origin**: [firewall.md](firewall.md)
- **Status**: **resolved**
- **Priority**: medium
- **Resolution**: Rolling token window approach (ADR-012). Phase 2 implements
`screen_document()` with overlapping token windows (25% overlap, model's
full context length per window), max pooling for score aggregation, and
character offset tracking for granular "which sections are suspicious"
reporting. Short inputs fall through to the single-window `screen()` path.
The research doc includes a directionally correct implementation sketch.
Two distinct windowing concepts are now clearly separated: token-level
smoothing (within a single forward pass, already in codebook) vs
input-level rolling windows (multiple forward passes for long documents,
Phase 2).
- **Cross-references**: ADR-003, ADR-012
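
The rolling-window resolution for OQ-03 can be illustrated with a minimal sketch. It is not the implementation from ADR-012 or the research doc. The names `window_spans`, `screen_document_sketch`, `WindowScore`, and `score_window` are hypothetical. The sketch assumes the text is already tokenized with per-token character offsets (for example, what a HuggingFace fast tokenizer returns with `return_offsets_mapping=True`), and that a single-window scorer is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class WindowScore:
    char_start: int  # inclusive character offset into the original text
    char_end: int    # exclusive character offset
    score: float


def window_spans(n_tokens: int, window: int, overlap: float = 0.25) -> List[Tuple[int, int]]:
    """Return [start, end) token ranges that cover n_tokens with the given overlap."""
    if n_tokens <= window:
        # Short input: a single window, mirroring the fall-through to screen().
        return [(0, n_tokens)]
    stride = max(1, window - int(window * overlap))
    spans = []
    start = 0
    while True:
        end = min(start + window, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        start += stride
    return spans


def screen_document_sketch(
    token_offsets: Sequence[Tuple[int, int]],
    score_window: Callable[[int, int], float],
    window: int,
    overlap: float = 0.25,
) -> Tuple[float, List[WindowScore]]:
    """Score each window, track its character span, and max-pool the scores."""
    if not token_offsets:
        return 0.0, []
    results = []
    for start, end in window_spans(len(token_offsets), window, overlap):
        results.append(
            WindowScore(
                char_start=token_offsets[start][0],
                char_end=token_offsets[end - 1][1],
                score=score_window(start, end),  # assumed: one forward pass per window
            )
        )
    # Max pooling: the document is as suspicious as its most suspicious window.
    return max(r.score for r in results), results
```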
---
### ~~OQ-04: Should detection thresholds be per-model or globally configurable?~~
- **Origin**: [configuration.md](configuration.md), [codebook.md](codebook.md)
- **Status**: **resolved**
- **Priority**: medium
- **Resolution**: Both — thresholds are **model-specific by default** (shipped
with the codebook) but **globally overridable by the user**. Once calibrated,
different models produce remarkably similar behavioral patterns (inspired
by the "platonic representation hypothesis" — different models converge on
similar internal representations of the same data). The individual activation
spaces differ, but the behavioral patterns they encode are consistent enough
that thresholds transfer reasonably well. The codebook ships recommended
thresholds calibrated for its model; users can adjust.
- **Cross-references**: ADR-003, ADR-004
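
One way to picture the "model-specific defaults, user-overridable" rule from OQ-04 is a simple merge in which user values win. This is a sketch under assumptions: the function name `resolve_thresholds` and the threshold keys are hypothetical, not names from the codebook or configuration docs.

```python
from typing import Dict, Mapping, Optional


def resolve_thresholds(
    codebook_defaults: Mapping[str, float],
    user_overrides: Optional[Mapping[str, float]] = None,
) -> Dict[str, float]:
    """Start from the thresholds shipped with the codebook, then apply user overrides."""
    resolved = dict(codebook_defaults)
    for name, value in (user_overrides or {}).items():
        if name not in codebook_defaults:
            # Reject unknown keys so that a typo does not silently do nothing.
            raise KeyError(f"unknown threshold: {name!r}")
        resolved[name] = float(value)
    return resolved


# Hypothetical threshold names, for illustration only.
defaults = {"alarm": 0.8, "warn": 0.5}
effective = resolve_thresholds(defaults, {"alarm": 0.9})
```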
---
## Theme: Integration
### ~~OQ-05: How should the firewall integrate with existing guardrail systems?~~
- **Origin**: [firewall.md](firewall.md), [overview.md](overview.md)
- **Status**: **resolved**
- **Priority**: medium
- **Resolution**: Standalone API + thin adapter pattern (ADR-011). Phase 1:
ship the standalone `Firewall.screen(text) → Alarm` API only. Phase 2:
build thin adapter packages (<100 lines each) for LlamaFirewall,
OpenAI Agents SDK, and NeMo Guardrails as optional dependencies. Do NOT
build a common `ScreeningProvider` interface — behavioral detection is
fundamentally different from text-surface defenses and premature abstraction
would be constraining.
- **Cross-references**: ADR-002, ADR-011
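
The thin adapter idea from OQ-05 might look roughly like the sketch below: each adapter wraps `screen()` and reshapes the result for its host framework, with no shared interface. Everything here other than the `screen(text)` call is an assumption. The `Alarm` fields, `StubFirewall`, `as_guardrail_check`, and the returned dict shape are hypothetical, and no real LlamaFirewall, OpenAI Agents SDK, or NeMo Guardrails API is used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Alarm:
    # Hypothetical stand-in; the real Alarm type may carry different fields.
    triggered: bool
    score: float


class StubFirewall:
    """Stand-in for the standalone Firewall so the sketch runs without a model."""

    def screen(self, text: str) -> Alarm:
        suspicious = "ignore previous instructions" in text.lower()
        return Alarm(triggered=suspicious, score=0.95 if suspicious else 0.05)


def as_guardrail_check(firewall) -> Callable[[str], Dict[str, Union[bool, float]]]:
    """Wrap firewall.screen() in the callable shape a hypothetical host framework expects."""

    def check(text: str) -> Dict[str, Union[bool, float]]:
        alarm = firewall.screen(text)
        return {"allowed": not alarm.triggered, "score": alarm.score}

    return check


check = as_guardrail_check(StubFirewall())
```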
---
## Theme: Project Setup
### ~~OQ-06: Should file-based configuration use TOML or YAML?~~
- **Origin**: [configuration.md](configuration.md)
- **Status**: **resolved**
- **Priority**: low
- **Resolution**: TOML. Consistent with modern Python packaging conventions
(`pyproject.toml`) and increasingly the standard for Python configuration.
This is a two-way door decision — reverting to YAML later is straightforward.
- **Cross-references**: None
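
As an illustration of the TOML decision in OQ-06, a config file could be read with the standard library's `tomllib` (Python 3.11+; the `tomli` backport exposes the same `loads` function on older versions). The table and key names below are hypothetical and are not taken from configuration.md.

```python
try:
    import tomllib  # standard library from Python 3.11
except ImportError:  # older interpreters: fall back to the tomli backport
    import tomli as tomllib

# Hypothetical layout, for illustration only.
EXAMPLE_CONFIG = """
[firewall]
model = "SmolLM2-135M"

[firewall.thresholds]
alarm = 0.9
"""


def load_config(text: str) -> dict:
    """Parse TOML text into nested dicts."""
    return tomllib.loads(text)


config = load_config(EXAMPLE_CONFIG)
```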