feat: initial architecture specification and research

Phase 0→1 setup for alknet-firewall, a behavioral signal detection
library that screens untrusted LLM inputs using small model activations.

Architecture docs (5 specs, 10 ADRs, 7 open questions):
- overview: vision, scope, dependencies, package structure
- firewall: core API, alarm protocol, score composition, error handling
- codebook: SVD basis, spline distributions, calibration, tensor format
- model: activation extraction, model-agnostic interface, lazy loading
- configuration: thresholds, model selection, detection tuning

Research reports:
- modern-python-project-setup: uv, pyproject.toml, src layout, ruff, CI
- python-ml-packaging: optional PyTorch, HF Hub download, safetensors
- llm-input-safety-landscape: threat taxonomy, defenses, academic evidence

Agent role adaptations for Python project (replaced Rust conventions).
2026-06-13 05:17:40 +00:00
parent 141628bae4
commit cf464c2296
23 changed files with 3900 additions and 44 deletions
--- a/docs/architecture/open-questions.md
+++ b/docs/architecture/open-questions.md
@@ -0,0 +1,129 @@
+# Open Questions
+
+Centralized tracker for unresolved questions across all architecture documents.
+
+## Theme: Inference Backend
+
+### OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?
+
+- **Origin**: [model.md](model.md), [overview.md](overview.md)
+- **Status**: open
+- **Priority**: medium
+- **Resolution**: (pending)
+- **Cross-references**: ADR-006
+
+ONNX Runtime has a much smaller install footprint (~30-50 MB vs. 200 MB-2.5 GB
+for PyTorch) and is well-suited to inference-only use. Hugging Face's `optimum`
+library provides drop-in replacement classes. However, supporting it in Phase 1
+adds complexity: the model must be exported to ONNX format, the `optimum`
+integration must be tested, and the activation extraction API may differ from
+PyTorch's.
+
+Consider: Is the smaller footprint worth the integration complexity in Phase 1,
+or should ONNX support wait until Phase 2 when the core API is stable?
+
+---
+
+## Theme: Codebook Design
+
+### OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?
+
+- **Origin**: [codebook.md](codebook.md)
+- **Status**: open
+- **Priority**: high
+- **Resolution**: (pending)
+- **Cross-references**: ADR-004
+
+The PoC codebook is 1,245 lines, much of which may be boilerplate, dead code,
+or excessive parameterization left over from the research phase. Separating
+what is essential from what was exploratory is critical for the initial
+extraction. The codebook training pipeline (`run_manifold_projection.py`)
+should be analyzed as well.
+
+Consider: How many SVD dimensions are actually needed? What's the minimum
+calibration dataset? Can spline distributions be simplified?
+
+---
+
+## Theme: API Design
+
+### OQ-03: Should the firewall support streaming/chunked input screening?
+
+- **Origin**: [firewall.md](firewall.md)
+- **Status**: open
+- **Priority**: low
+- **Resolution**: (pending)
+- **Cross-references**: ADR-003
+
+Some inputs arrive in chunks (streaming API responses, large documents). Should
+the firewall support incremental screening as chunks arrive, or require the
+full input before screening? Incremental screening could detect attacks earlier
+but requires buffering and state management.
+
+This is low priority for Phase 1 but affects the internal API design.
+
+---
+
+### OQ-04: Should detection thresholds be per-model or globally configurable?
+
+- **Origin**: [configuration.md](configuration.md), [codebook.md](codebook.md)
+- **Status**: open
+- **Priority**: medium
+- **Resolution**: (pending)
+- **Cross-references**: ADR-003, ADR-004
+
+Different detector models may produce different score distributions. Thresholds
+that work for SmolLM2-135M may not work for a different model. Should
+thresholds be tied to the codebook (per-model) or set globally by the user?
+
+Consider: Should there be per-model defaults with user overrides? Should the
+codebook ship with recommended thresholds that the user can adjust?
+
+---
+
+## Theme: Integration
+
+### OQ-05: How should the firewall integrate with existing guardrail systems?
+
+- **Origin**: [firewall.md](firewall.md), [overview.md](overview.md)
+- **Status**: open
+- **Priority**: medium
+- **Resolution**: (pending)
+- **Cross-references**: ADR-002
+
+The behavioral firewall is complementary to text-surface defenses. Users may
+want to run both Llama Guard (text classification) and alknet-firewall
+(behavioral signals) in series. How should these be composed?
+
+Consider: Integration adapters? A common interface? Callback hooks? Or is
+composition the user's responsibility and we just provide a clean standalone API?
+
+---
+
+## Theme: Project Setup
+
+### OQ-06: Should file-based configuration use TOML or YAML?
+
+- **Origin**: [configuration.md](configuration.md)
+- **Status**: open
+- **Priority**: low
+- **Resolution**: (pending)
+- **Cross-references**: None
+
+Phase 1 uses constructor-based configuration only. A future phase may add
+file-based configuration for easier deployment. TOML is consistent with
+Python packaging (pyproject.toml) and increasingly the standard for Python
+config. YAML is more familiar in ops/ML contexts. Either works.
+
+---
+
+### OQ-07: Is a Rust port feasible given current ML framework maturity?
+
+- **Origin**: [overview.md](overview.md), ADR-001
+- **Status**: open
+- **Priority**: low
+- **Resolution**: (pending)
+- **Cross-references**: ADR-001
+
+A Rust port using burn/cubecl was attempted during the PoC phase and failed.
+The ML framework ecosystem in Rust is not yet mature enough for this type
+of work. This remains a speculative Phase 3 goal. Revisit when burn/cubecl
+matures or alternative Rust ML frameworks emerge.
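
Note on OQ-04: the "per-model defaults with user overrides" option could look roughly like the sketch below. It is only an illustration under stated assumptions, not the project's API. The `Codebook` class, the `resolve_thresholds` helper, and the threshold names `alarm` and `warn` are all hypothetical, and the numeric values are placeholders rather than calibrated results.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class Codebook:
    """Hypothetical stand-in for a codebook that ships recommended thresholds."""

    model_id: str
    recommended_thresholds: Mapping[str, float] = field(default_factory=dict)


def resolve_thresholds(
    codebook: Codebook,
    overrides: Optional[Mapping[str, float]] = None,
) -> dict[str, float]:
    """Start from the codebook's per-model defaults, then apply user overrides."""
    resolved = dict(codebook.recommended_thresholds)
    for name, value in (overrides or {}).items():
        if name not in resolved:
            # Reject unknown names so a typo cannot silently leave a default in place.
            raise KeyError(
                f"unknown threshold {name!r} for model {codebook.model_id}"
            )
        resolved[name] = float(value)
    return resolved


# Placeholder values for illustration only; not taken from any real calibration.
smol = Codebook("SmolLM2-135M", {"alarm": 0.8, "warn": 0.5})
print(resolve_thresholds(smol, {"alarm": 0.9}))
```

With this shape, a user who passes no overrides gets the thresholds calibrated for the chosen model, and swapping the detector model swaps the defaults with it. A purely global setting would lose that coupling.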