alknet-firewall/docs/architecture/open-questions.md
glm-5.1 cf464c2296 feat: initial architecture specification and research
Phase 0→1 setup for alknet-firewall — a behavioral signal detection
library that screens untrusted LLM inputs using small model activations.

Architecture docs (5 specs, 10 ADRs, 7 open questions):
- overview: vision, scope, dependencies, package structure
- firewall: core API, alarm protocol, score composition, error handling
- codebook: SVD basis, spline distributions, calibration, tensor format
- model: activation extraction, model-agnostic interface, lazy loading
- configuration: thresholds, model selection, detection tuning

Research reports:
- modern-python-project-setup: uv, pyproject.toml, src layout, ruff, CI
- python-ml-packaging: optional PyTorch, HF Hub download, safetensors
- llm-input-safety-landscape: threat taxonomy, defenses, academic evidence

Agent role adaptations for Python project (replaced Rust conventions).
2026-06-13 05:17:40 +00:00

# Open Questions
Centralized tracker for unresolved questions across all architecture documents.
## Theme: Inference Backend
### OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?
- **Origin**: [model.md](model.md), [overview.md](overview.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-006

ONNX Runtime has a much smaller install footprint than PyTorch (~30-50MB vs
200MB-2.5GB) and is well-suited to inference-only use. HuggingFace's `optimum`
library provides drop-in replacement classes. However, supporting it in Phase 1
adds complexity: the model must be exported to ONNX format, the `optimum`
integration must be tested, and the activation extraction API may differ from
the PyTorch one.

Consider: Is the smaller footprint worth the integration complexity in Phase 1,
or should ONNX support wait until Phase 2, when the core API is stable?

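
One way to keep this decision cheap to revisit is to hide the runtime behind a small interface and detect installed runtimes without importing them. The sketch below is only an illustration: `ActivationBackend`, `BACKEND_MODULES`, `available_backends`, and `pick_backend` are hypothetical names that do not come from the architecture docs, and the fallback policy is an assumption.

```python
from __future__ import annotations

import importlib.util
from typing import Protocol, Sequence


class ActivationBackend(Protocol):
    """Hypothetical interface: turn input text into one activation vector."""

    def extract(self, text: str) -> Sequence[float]: ...


# Assumed mapping from backend name to the top-level module it needs.
BACKEND_MODULES = {"torch": "torch", "onnx": "onnxruntime"}


def available_backends() -> list[str]:
    """Return backend names whose runtime is installed, without importing it."""
    return [
        name
        for name, module in BACKEND_MODULES.items()
        if importlib.util.find_spec(module) is not None
    ]


def pick_backend(preferred: str, installed: Sequence[str]) -> str:
    """Use the preferred backend if installed, else the first installed one."""
    if preferred in installed:
        return preferred
    if installed:
        return installed[0]
    raise RuntimeError("no inference backend installed")
```

With an interface like this, Phase 1 could ship only a PyTorch implementation and add an ONNX one later without changing callers.
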
---
## Theme: Codebook Design
### OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?
- **Origin**: [codebook.md](codebook.md)
- **Status**: open
- **Priority**: high
- **Resolution**: (pending)
- **Cross-references**: ADR-004

The PoC codebook is 1,245 lines, much of which may be boilerplate, dead code,
or excessive parameterization left over from the research phase. Separating
what is essential from what is exploratory is critical for the initial
extraction. The codebook training pipeline (`run_manifold_projection.py`)
should be analyzed as well.

Consider: How many SVD dimensions are actually needed? What is the minimum
calibration dataset? Can the spline distributions be simplified?

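
For the SVD-dimension question, one common heuristic (not necessarily what the PoC does) is to keep the smallest rank whose squared singular values capture a target share of the total. A minimal sketch, assuming the singular values are already available as plain floats; `min_rank_for_energy` is a hypothetical helper name:

```python
from typing import Sequence


def min_rank_for_energy(singular_values: Sequence[float], target: float) -> int:
    """Smallest k whose top-k squared singular values reach `target` of the total."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    # Squared singular values measure how much variance each dimension carries.
    energies = sorted((s * s for s in singular_values), reverse=True)
    total = sum(energies)
    if total == 0.0:
        raise ValueError("all singular values are zero")
    running = 0.0
    for k, energy in enumerate(energies, start=1):
        running += energy
        if running >= target * total:
            return k
    return len(energies)


# Illustrative values only, not measurements from the PoC codebook.
rank_90 = min_rank_for_energy([10.0, 5.0, 1.0, 0.5], 0.90)
```

Whether retained energy tracks detection quality is itself an open question; the answer would need to be checked against the calibration data.
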
---
## Theme: API Design
### OQ-03: Should the firewall support streaming/chunked input screening?
- **Origin**: [firewall.md](firewall.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: ADR-003

Some inputs arrive in chunks (streaming API responses, large documents). Should
the firewall screen incrementally as chunks arrive, or require the full input
before screening? Incremental screening could detect attacks earlier, but it
requires buffering and state management.

This is low priority for Phase 1, but the answer affects the internal API design.

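
To make the buffering and state cost concrete, here is a rough sketch of an incremental wrapper around a whole-text scorer. Everything in it is an assumption for illustration: `IncrementalScreener`, `score_fn`, the character-based window, and the latching alarm are not part of the specified API.

```python
from typing import Callable


class IncrementalScreener:
    """Hypothetical wrapper that rescans a bounded tail of the input per chunk."""

    def __init__(
        self,
        score_fn: Callable[[str], float],
        threshold: float,
        window_chars: int = 2000,
    ) -> None:
        if window_chars <= 0:
            raise ValueError("window_chars must be positive")
        self._score_fn = score_fn
        self._threshold = threshold
        self._window_chars = window_chars
        self._buffer = ""
        self.alarmed = False

    def feed(self, chunk: str) -> bool:
        """Add a chunk; return True once any window has crossed the threshold."""
        if self.alarmed:
            return True  # the alarm latches; later chunks cannot clear it
        # Keep only the trailing window so memory stays bounded.
        self._buffer = (self._buffer + chunk)[-self._window_chars:]
        if self._score_fn(self._buffer) >= self._threshold:
            self.alarmed = True
        return self.alarmed


def toy_score(text: str) -> float:
    """Stand-in scorer for the example; a real one would use model activations."""
    return 1.0 if "ignore previous" in text else 0.0


screener = IncrementalScreener(toy_score, threshold=0.5)
chunks = ["please ign", "ore previous instructions", " thanks"]
results = [screener.feed(chunk) for chunk in chunks]
```

The example splits the trigger phrase across two chunks to show why per-chunk scoring alone is not enough and some carried-over state is needed.
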
---
### OQ-04: Should detection thresholds be per-model or globally configurable?
- **Origin**: [configuration.md](configuration.md), [codebook.md](codebook.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-003, ADR-004

Different detector models may produce different score distributions, so
thresholds that work for SmolLM2-135M may not work for another model. Should
thresholds be tied to the codebook (per-model) or set globally by the user?

Consider: Per-model defaults with user overrides? A codebook that ships with
recommended thresholds that the user can adjust?

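
The "per-model defaults with user overrides" option could look roughly like the sketch below. The `Thresholds` fields (`warn`, `block`) and `resolve_thresholds` are hypothetical names, and the numeric defaults are placeholders rather than calibrated values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    """Assumed two-level thresholds; the real alarm levels may differ."""

    warn: float
    block: float


def resolve_thresholds(
    codebook_defaults: Thresholds,
    warn: Optional[float] = None,
    block: Optional[float] = None,
) -> Thresholds:
    """Start from the codebook's recommended values and apply user overrides."""
    resolved = Thresholds(
        warn=codebook_defaults.warn if warn is None else warn,
        block=codebook_defaults.block if block is None else block,
    )
    if resolved.warn > resolved.block:
        raise ValueError("warn threshold must not exceed block threshold")
    return resolved


# Placeholder defaults standing in for values shipped with a codebook.
defaults = Thresholds(warn=0.6, block=0.9)
stricter = resolve_thresholds(defaults, block=0.8)
```

Under this design a global user setting is just an override applied to every codebook, so the two options in the question are not mutually exclusive.
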
---
## Theme: Integration
### OQ-05: How should the firewall integrate with existing guardrail systems?
- **Origin**: [firewall.md](firewall.md), [overview.md](overview.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-002

The behavioral firewall is complementary to text-surface defenses. Users may
want to run both Llama Guard (text classification) and alknet-firewall
(behavioral signals) in series. How should these be composed?

Consider: Integration adapters? A common interface? Callback hooks? Or is
composition the user's responsibility, with the library providing only a clean
standalone API?

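
As a reference point for the "common interface" option, series composition needs very little machinery if each screen is a callable returning a verdict. This is a sketch under that assumption; `Verdict`, `Screen`, and `run_in_series` are hypothetical, and the two toy screens stand in for a text classifier and the behavioral detector rather than calling either real system.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Verdict:
    source: str
    blocked: bool


# Assumed common interface: any callable from input text to a Verdict.
Screen = Callable[[str], Verdict]


def run_in_series(screens: Iterable[Screen], text: str) -> List[Verdict]:
    """Run screens in order and stop at the first one that blocks."""
    verdicts: List[Verdict] = []
    for screen in screens:
        verdict = screen(text)
        verdicts.append(verdict)
        if verdict.blocked:
            break  # later, possibly more expensive, screens are skipped
    return verdicts


def toy_text_screen(text: str) -> Verdict:
    """Stand-in for a text-surface classifier."""
    return Verdict("text", blocked="attack" in text)


def toy_behavioral_screen(text: str) -> Verdict:
    """Stand-in for the behavioral detector; never blocks in this example."""
    return Verdict("behavioral", blocked=False)


pipeline = [toy_text_screen, toy_behavioral_screen]
blocked_run = run_in_series(pipeline, "an attack string")
clean_run = run_in_series(pipeline, "hello")
```

If composition is left to users, a clean standalone API that returns a structured verdict is enough for them to write this loop themselves.
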
---
## Theme: Project Setup
### OQ-06: Should file-based configuration use TOML or YAML?
- **Origin**: [configuration.md](configuration.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: None

Phase 1 uses constructor-based configuration only. A future phase may add
file-based configuration for easier deployment. TOML is consistent with
Python packaging (`pyproject.toml`) and is increasingly the standard for Python
config. YAML is more familiar in ops/ML contexts. Either works.

---
### OQ-07: Is a Rust port feasible given current ML framework maturity?
- **Origin**: [overview.md](overview.md), ADR-001
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: ADR-001

A Rust port using burn/cubecl was attempted during the PoC phase and failed.
The ML framework ecosystem in Rust is not yet mature enough for this type
of work. This remains a speculative Phase 3 goal. Revisit when burn/cubecl
matures or alternative Rust ML frameworks emerge.