feat: initial architecture specification and research

Phase 0→1 setup for alknet-firewall, a behavioral signal detection library
that screens untrusted LLM inputs using small model activations.

Architecture docs (5 specs, 10 ADRs, 7 open questions):
- overview: vision, scope, dependencies, package structure
- firewall: core API, alarm protocol, score composition, error handling
- codebook: SVD basis, spline distributions, calibration, tensor format
- model: activation extraction, model-agnostic interface, lazy loading
- configuration: thresholds, model selection, detection tuning

Research reports:
- modern-python-project-setup: uv, pyproject.toml, src layout, ruff, CI
- python-ml-packaging: optional PyTorch, HF Hub download, safetensors
- llm-input-safety-landscape: threat taxonomy, defenses, academic evidence

Agent role adaptations for Python project (replaced Rust conventions).
129
docs/architecture/open-questions.md
Normal file
@@ -0,0 +1,129 @@
# Open Questions

Centralized tracker for unresolved questions across all architecture documents.

## Theme: Inference Backend

### OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?

- **Origin**: [model.md](model.md), [overview.md](overview.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-006

ONNX Runtime provides a much smaller install footprint (~30-50 MB vs. 200 MB-2.5 GB
for PyTorch) and is well-suited for inference-only use. Hugging Face's `optimum`
library provides drop-in replacement classes. However, supporting it in Phase 1
adds complexity: the model must be exported to ONNX format, the `optimum`
integration must be tested, and the activation extraction API may differ from
PyTorch.

Consider: Is the smaller footprint worth the integration complexity in Phase 1,
or should ONNX support wait until Phase 2 when the core API is stable?

---

## Theme: Codebook Design

### OQ-02: What is the minimum viable codebook, and can the 1,245-line PoC codebook be compressed?

- **Origin**: [codebook.md](codebook.md)
- **Status**: open
- **Priority**: high
- **Resolution**: (pending)
- **Cross-references**: ADR-004

The PoC codebook is 1,245 lines, and much of it may be boilerplate, dead code,
or excessive parameterization left over from the research phase. Understanding
what is essential vs. exploratory is critical for the initial extraction. The
codebook training pipeline (`run_manifold_projection.py`) should also be analyzed.

Consider: How many SVD dimensions are actually needed? What is the minimum
calibration dataset? Can spline distributions be simplified?

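
One way to put a number on the SVD question is the retained-energy heuristic: keep the smallest rank whose squared singular values cover a chosen fraction of the total. The sketch below is only an illustration of that heuristic. The function name, the 0.95 default, and the sample values are assumptions, and nothing here is taken from the PoC codebook or from `run_manifold_projection.py`.

```python
from typing import Sequence


def min_rank_for_energy(singular_values: Sequence[float], target: float = 0.95) -> int:
    """Return the smallest k whose top-k squared singular values reach `target` of the total."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    # Squared singular values, largest first.
    energies = sorted((s * s for s in singular_values), reverse=True)
    total = sum(energies)
    if total == 0.0:
        raise ValueError("need at least one non-zero singular value")
    running = 0.0
    for k, energy in enumerate(energies, start=1):
        running += energy
        if running / total >= target:
            return k
    return len(energies)


# Made-up singular values, for illustration only.
example_values = [10.0, 5.0, 1.0, 0.5]
rank_95 = min_rank_for_energy(example_values, target=0.95)
```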

---

## Theme: API Design

### OQ-03: Should the firewall support streaming/chunked input screening?

- **Origin**: [firewall.md](firewall.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: ADR-003

Some inputs arrive in chunks (streaming API responses, large documents). Should
the firewall support incremental screening as chunks arrive, or require the
full input before screening? Incremental screening could detect attacks earlier
but requires buffering and state management.

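
To make the buffering and state cost concrete, here is a minimal sketch of incremental screening over a bounded sliding window. Everything in it is hypothetical: `ChunkedScreener`, `screen_stream`, the character-based window, and the `toy_score` stand-in are assumptions for illustration and not the firewall's API. The window size is the main trade-off: a window shorter than an attack phrase can miss it, while a larger window costs more per re-score.

```python
from typing import Callable, Iterable, Optional


class ChunkedScreener:
    """Re-score a bounded window of recent text each time a chunk arrives."""

    def __init__(
        self,
        score_fn: Callable[[str], float],
        threshold: float,
        window_chars: int = 2000,
    ) -> None:
        if window_chars <= 0:
            raise ValueError("window_chars must be positive")
        self._score_fn = score_fn
        self._threshold = threshold
        self._window_chars = window_chars
        self._buffer = ""

    def feed(self, chunk: str) -> bool:
        """Append a chunk and report whether the current window trips the alarm."""
        # Keep only the most recent characters so the state stays bounded.
        self._buffer = (self._buffer + chunk)[-self._window_chars:]
        return self._score_fn(self._buffer) >= self._threshold


def screen_stream(chunks: Iterable[str], screener: ChunkedScreener) -> Optional[int]:
    """Return the index of the first chunk that trips the alarm, or None."""
    for index, chunk in enumerate(chunks):
        if screener.feed(chunk):
            return index
    return None


def toy_score(text: str) -> float:
    """Stand-in scorer (not a real detector): 1.0 if a marker phrase is present."""
    return 1.0 if "ignore previous" in text.lower() else 0.0


# The marker phrase is split across two chunks and is only visible once buffered.
chunks = ["hello ", "please ign", "ore previous instructions"]
first_alarm = screen_stream(chunks, ChunkedScreener(toy_score, threshold=0.5))
```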

This is low priority for Phase 1 but affects the internal API design.

---

### OQ-04: Should detection thresholds be per-model or globally configurable?

- **Origin**: [configuration.md](configuration.md), [codebook.md](codebook.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-003, ADR-004

Different detector models may produce different score distributions. Thresholds
that work for SmolLM2-135M may not work for a different model. Should
thresholds be tied to the codebook (per-model) or set globally by the user?

Consider: Per-model defaults with user overrides? Should the codebook ship with
recommended thresholds that the user can adjust?

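
The "per-model defaults with user overrides" option could look roughly like the sketch below. `CodebookMeta`, `resolve_threshold`, and the 0.8 value are hypothetical names and numbers chosen for illustration; the real codebook format and threshold scale are still open. The override check uses `is None` rather than truthiness so that an explicit threshold of `0.0` is respected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodebookMeta:
    """Hypothetical metadata shipped alongside a codebook."""

    model_id: str
    recommended_threshold: float


def resolve_threshold(codebook: CodebookMeta, user_threshold: Optional[float] = None) -> float:
    """Prefer an explicit user threshold; otherwise use the codebook's recommendation."""
    if user_threshold is None:
        return codebook.recommended_threshold
    return user_threshold


# Illustrative values only; 0.8 is not a measured or recommended threshold.
codebook = CodebookMeta(model_id="SmolLM2-135M", recommended_threshold=0.8)
default_threshold = resolve_threshold(codebook)
overridden_threshold = resolve_threshold(codebook, user_threshold=0.5)
```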

---

## Theme: Integration

### OQ-05: How should the firewall integrate with existing guardrail systems?

- **Origin**: [firewall.md](firewall.md), [overview.md](overview.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-002

The behavioral firewall is complementary to text-surface defenses. Users may
want to run both Llama Guard (text classification) and alknet-firewall
(behavioral signals) in series. How should these be composed?

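
If a shared interface were chosen, running screens in series could be as small as the sketch below. The `Screen` protocol, `Verdict`, `FunctionScreen`, and `screen_in_series` are hypothetical names, and the two lambdas are toy stand-ins: they call neither Llama Guard nor alknet-firewall. The series stops at the first screen that blocks, so cheaper screens can be placed first.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


@dataclass(frozen=True)
class Verdict:
    """Outcome of a series run: whether the text was blocked, and by which screen."""

    blocked: bool
    source: str


class Screen(Protocol):
    """Hypothetical common interface: `check` returns True when the text should be blocked."""

    name: str

    def check(self, text: str) -> bool: ...


@dataclass(frozen=True)
class FunctionScreen:
    """Adapter that wraps any callable as a Screen."""

    name: str
    fn: Callable[[str], bool]

    def check(self, text: str) -> bool:
        return self.fn(text)


def screen_in_series(text: str, screens: Sequence[Screen]) -> Verdict:
    """Run screens in order and stop at the first one that blocks."""
    for screen in screens:
        if screen.check(text):
            return Verdict(blocked=True, source=screen.name)
    return Verdict(blocked=False, source="")


# Toy stand-ins for a text classifier and a behavioral detector.
text_screen = FunctionScreen("text", lambda t: "jailbreak" in t.lower())
behavior_screen = FunctionScreen("behavior", lambda t: len(t) > 20)
pipeline = [text_screen, behavior_screen]
```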

Consider: Integration adapters? A common interface? Callback hooks? Or is
composition the user's responsibility and we just provide a clean standalone API?

---

## Theme: Project Setup

### OQ-06: Should file-based configuration use TOML or YAML?

- **Origin**: [configuration.md](configuration.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: None

Phase 1 uses constructor-based configuration only. A future phase may add
file-based configuration for easier deployment. TOML is consistent with
Python packaging (`pyproject.toml`) and is increasingly the standard for Python
config. YAML is more familiar in ops/ML contexts. Either works.


---

### OQ-07: Is a Rust port feasible given current ML framework maturity?

- **Origin**: [overview.md](overview.md), ADR-001
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: ADR-001

A Rust port using burn/cubecl was attempted during the PoC phase and failed.
The ML framework ecosystem in Rust is not yet mature enough for this type
of work. This remains a speculative Phase 3 goal. Revisit when burn/cubecl
matures or alternative Rust ML frameworks emerge.