docs: resolve 4 open questions, add research, spec codebook package structure
Research-driven resolution of OQ-01, OQ-02, OQ-05, OQ-06:

- OQ-01: Remove ONNX Runtime from scope entirely. It doesn't support activation extraction natively (optimum #972 closed as not planned) and produces bloated model exports; burn/cublas via safetensors is a better future path.
- OQ-02: Codebook compresses ~65% (1,245 → 500-600 lines); add Package Structure and Extraction from PoC sections to codebook.md based on PoC analysis of metaspline firewall_codebook.py.
- OQ-05: Standalone API + thin adapter pattern (ADR-011); Phase 1 ships Firewall.screen() only, Phase 2 adds <100-line adapter packages for LlamaFirewall, OpenAI Agents SDK, NeMo Guardrails.
- OQ-06: TOML for file-based config (standard modern Python, two-way door).

Also: research OQ-03 rolling windows from taskgraph-semantic reference code, remove onnxruntime/optimum from dependencies, move streaming screening to Phase 2, add burn/cublas as Phase 3 alternative backend.
@@ -46,6 +46,7 @@ raises "behavioral alarms" without needing to know specific attack types.
|
||||
| [008](decisions/008-three-level-alarm.md) | Three-Level Alarm System | Accepted |
|
||||
| [009](decisions/009-last-token-extraction.md) | Last-Token Activation Extraction | Accepted |
|
||||
| [010](decisions/010-monotonic-spline-distributions.md) | Monotonic Spline Distributions | Accepted |
|
||||
| [011](decisions/011-guardrail-integration-strategy.md) | Standalone API + Thin Adapter Integration | Accepted |
|
||||
|
||||
## Open Questions
|
||||
|
||||
@@ -53,12 +54,12 @@ See [open-questions.md](open-questions.md) for the full tracker.
|
||||
|
||||
| OQ | Question | Priority | Status |
|
||||
|----|----------|----------|--------|
|
||||
| OQ-01 | Should ONNX Runtime be a supported inference backend in Phase 1? | medium | open |
|
||||
| OQ-02 | What is the minimum viable codebook — can the 1,245-line codebook be compressed? | high | open |
|
||||
| OQ-03 | Should the firewall support streaming/chunked input screening? | medium | open |
|
||||
| ~~OQ-01~~ | ~~Should ONNX Runtime be a supported inference backend in Phase 1?~~ | ~~medium~~ | **resolved** (removed from scope; burn/cublas is a better future path) |
|
||||
| ~~OQ-02~~ | ~~What is the minimum viable codebook — can the 1,245-line codebook be compressed?~~ | ~~high~~ | **resolved** (~65% compression to 500–600 lines) |
|
||||
| OQ-03 | Should the firewall support streaming/chunked input screening? | medium | open (research complete, Phase 2) |
|
||||
| ~~OQ-04~~ | ~~Should detection thresholds be per-model or globally configurable?~~ | ~~medium~~ | **resolved** (both: model-specific defaults, user-overridable) |
|
||||
| OQ-05 | How should the firewall integrate with existing guardrail systems? | medium | open |
|
||||
| OQ-06 | Should file-based configuration use TOML or YAML? | low | open |
|
||||
| ~~OQ-05~~ | ~~How should the firewall integrate with existing guardrail systems?~~ | ~~medium~~ | **resolved** (ADR-011: standalone API + thin adapters) |
|
||||
| ~~OQ-06~~ | ~~Should file-based configuration use TOML or YAML?~~ | ~~low~~ | **resolved** (TOML) |
|
||||
|
||||
## Document Lifecycle
|
||||
|
||||
|
||||
@@ -151,6 +151,71 @@ model. The bundled codebook is specific to the default detector model
|
||||
(SmolLM2-135M at the pinned revision). Users who switch to a different
|
||||
detector model must provide a matching codebook via `codebook_path`.
|
||||
|
||||
## Package Structure
|
||||
|
||||
Based on analysis of the PoC codebook
|
||||
([poc-architecture.md](../research/codebook-analysis/poc-architecture.md)),
|
||||
the production codebook decomposes into:
|
||||
|
||||
```
|
||||
src/alknet_firewall/
|
||||
├── codebook/
|
||||
│ ├── __init__.py # Public exports
|
||||
│ ├── codebook.py # Codebook class (init, load, project, score)
|
||||
│ ├── transforms.py # simplex, reverse_bary3d, bary_to_simplex
|
||||
│ ├── splines.py # MonotonicCubicSpline, SplineDistribution
|
||||
│ ├── profiles.py # DirectionProfile, population stats
|
||||
│ ├── classifiers.py # DirectionClassifier (logistic weights)
|
||||
│ ├── results.py # DetectionResult, DimensionSignal, AlarmLevel
|
||||
│ ├── projection.py # project(), decompose()
|
||||
│ └── detection.py # detect(), threshold comparison
|
||||
├── training/
|
||||
│ ├── __init__.py
|
||||
│ ├── compiler.py # build() — SVD, spline fitting, profile comp
|
||||
│ ├── stats.py # pooled_std, cohen_d, silhouette
|
||||
│ └── data_loader.py # Condition catalog, prompt sets, data loading
|
||||
└── data/
|
||||
└── codebook/
|
||||
├── basis.safetensors
|
||||
├── regions.safetensors
|
||||
├── splines.json
|
||||
└── config.json
|
||||
```
|
||||
|
||||
### Extraction from PoC
|
||||
|
||||
The PoC `firewall_codebook.py` is 1,245 lines with significant duplication
|
||||
(the decomposition pipeline z → CDF → simplex → barycentric → (sum, u, v) is
|
||||
repeated 5 times). Analysis identifies:
|
||||
|
||||
- **~480 lines of essential runtime code** in the PoC
|
||||
- **~178 lines needed from metaspline core** (SplineDistribution,
|
||||
MonotonicCubicSpline, ensure_strictly_increasing, simplex)
|
||||
- **~130 lines of histogram classifier** — exploratory alternative, not MVP
|
||||
(the continuous logistic classifier is superior)
|
||||
- **~95 lines of AUC evaluation** — testing tool, not runtime
|
||||
- **~429 lines in `build()`** — must be decomposed: training moves to
|
||||
`training/compiler.py`, runtime state becomes immutable serialized data
|
||||
|
||||
Target: **~400–500 lines runtime + ~150–200 lines training = ~65% compression**
|
||||
from the PoC's 1,245 lines.
|
||||
|
||||
### Key Extraction Decisions
|
||||
|
||||
1. **`build()` moves entirely to `training/compiler.py`** — Runtime codebook
|
||||
is read-only. The codebook class should not have a `build()` method.
|
||||
2. **`decompose()` becomes a pure function** — `decompose(z, splines)` is a
|
||||
pure mathematical transform. No state dependencies beyond splines.
|
||||
3. **Detection is separate from the codebook class** — `detect()` is a
|
||||
stateless function given codebook data. Enables swapping detection
|
||||
strategies without touching the codebook.
|
||||
4. **Only 4 components of the 502-line metaspline core are needed at runtime**:
|
||||
`SplineDistribution`, `MonotonicCubicSpline`, `ensure_strictly_increasing`,
|
||||
and `simplex()`. Everything else (DensitySpline, unfold/fold, dcs_norm) is
|
||||
dropped entirely.
|
||||
5. **Saved `.pt` files from the PoC provide golden test data** — manifold
|
||||
projection results for Qwen3-0.6B/1.7B can be reused for integration tests.
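
As an illustration of decisions 1 to 3, the sketch below separates read-only codebook data from a pure `decompose()` and a stateless `detect()`. It is a minimal sketch under stated assumptions, not the PoC code: the field names, the tail-deviation score, the threshold values, and `logistic_cdf` are hypothetical, and `decompose()` here covers only the CDF step of the pipeline (the simplex and barycentric steps are omitted).

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Sequence


class AlarmLevel(Enum):
    CLEAR = 0
    SUSPICIOUS = 1
    DANGEROUS = 2


@dataclass(frozen=True)
class CodebookData:
    """Read-only runtime state (hypothetical field names)."""
    cdfs: tuple[Callable[[float], float], ...]  # one fitted CDF per direction
    suspicious_threshold: float
    dangerous_threshold: float


def decompose(z: Sequence[float],
              cdfs: Sequence[Callable[[float], float]]) -> list[float]:
    """Pure transform: map each projected coordinate through its CDF."""
    return [cdf(value) for cdf, value in zip(cdfs, z)]


def detect(data: CodebookData, z: Sequence[float]) -> AlarmLevel:
    """Stateless detection over codebook data.

    The score is a placeholder: the largest deviation of any quantile
    from the median, rescaled to the range [0, 1].
    """
    quantiles = decompose(z, data.cdfs)
    score = max(abs(q - 0.5) * 2.0 for q in quantiles)
    if score >= data.dangerous_threshold:
        return AlarmLevel.DANGEROUS
    if score >= data.suspicious_threshold:
        return AlarmLevel.SUSPICIOUS
    return AlarmLevel.CLEAR


def logistic_cdf(x: float) -> float:
    """Stand-in for a fitted spline CDF, used only to run the sketch."""
    return 1.0 / (1.0 + math.exp(-x))


data = CodebookData(
    cdfs=(logistic_cdf, logistic_cdf),
    suspicious_threshold=0.8,   # hypothetical value
    dangerous_threshold=0.95,   # hypothetical value
)
level = detect(data, [0.0, 0.0])
```

Because `detect()` takes the data as an argument, a different detection strategy can be swapped in as another function with the same signature, without touching `CodebookData`.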
|
||||
|
||||
## Data Format
|
||||
|
||||
The codebook is stored as:
|
||||
@@ -243,6 +308,5 @@ class Codebook:
|
||||
Open questions are tracked in [open-questions.md](open-questions.md). Key
|
||||
questions affecting this document:
|
||||
|
||||
- **OQ-02**: What is the minimum viable codebook — can the 1,245-line PoC
|
||||
codebook be compressed? (open)
|
||||
- **OQ-04**: Should detection thresholds be per-model or globally configurable? (open)
|
||||
- ~~**OQ-02**~~: ~~What is the minimum viable codebook: can the 1,245-line PoC codebook be compressed?~~ (resolved: ~65% compression to 500–600 lines; see Package Structure section)
|
||||
- ~~**OQ-04**~~: ~~Should detection thresholds be per-model or globally configurable?~~ (resolved — both: model-specific defaults, user-overridable)
|
||||
@@ -93,7 +93,8 @@ alarm = firewall.screen("Hello, how are you?")
|
||||
```
|
||||
|
||||
No configuration file is required. All parameters can be passed via the
|
||||
constructor. A future phase may add file-based configuration (TOML or YAML).
|
||||
constructor. A future phase may add file-based configuration (TOML, consistent
|
||||
with Python packaging conventions and `pyproject.toml`).
|
||||
|
||||
## Design Decisions
|
||||
|
||||
@@ -108,4 +109,5 @@ constructor. A future phase may add file-based configuration (TOML or YAML).
|
||||
Open questions are tracked in [open-questions.md](open-questions.md). Key
|
||||
questions affecting this document:
|
||||
|
||||
- ~~**OQ-04**~~: ~~Should detection thresholds be per-model or globally configurable?~~ (resolved — both: model-specific defaults shipped with codebook, user-overridable)
|
||||
- ~~**OQ-06**~~: ~~Should file-based configuration use TOML or YAML?~~ (resolved — TOML, consistent with modern Python packaging)
|
||||
@@ -6,17 +6,16 @@ Accepted
|
||||
|
||||
## Context
|
||||
|
||||
PyTorch is the primary inference backend for the detector model. However,
|
||||
PyTorch is large:
|
||||
PyTorch is the inference backend for the detector model. However, PyTorch is
|
||||
large:
|
||||
|
||||
- `torch` (CPU): ~200MB download, ~700MB installed
|
||||
- `torch` (CUDA): ~2.5GB download, ~5GB+ installed
|
||||
- `onnxruntime`: ~30-50MB download, ~300MB installed
|
||||
|
||||
Making PyTorch a required dependency would force a 200MB-2.5GB download on
|
||||
every user, even those who already have PyTorch installed or prefer ONNX
|
||||
Runtime. This is the standard problem for ML libraries, and the HuggingFace
|
||||
ecosystem has converged on a solution.
|
||||
every user, even those who already have PyTorch installed. This is the
|
||||
standard problem for ML libraries, and the HuggingFace ecosystem has
|
||||
converged on a solution.
|
||||
|
||||
## Decision
|
||||
|
||||
@@ -43,7 +42,6 @@ except ImportError:
|
||||
**Positive**:
|
||||
- Base install is ~30MB download, ~100MB installed — very lightweight
|
||||
- Users with existing PyTorch installations don't re-download
|
||||
- ONNX Runtime alternative available for minimal footprint (~100MB total)
|
||||
- Follows HuggingFace ecosystem conventions (transformers, safetensors, HF
|
||||
hub all use this pattern)
|
||||
- uv supports CPU/GPU torch variant selection via `[tool.uv.sources]` and
|
||||
@@ -55,6 +53,8 @@ except ImportError:
|
||||
- Runtime import errors if users forget to install a backend
|
||||
- CPU-only torch requires two-step install or uv configuration (can't be
|
||||
expressed in pip extras alone)
|
||||
- PyTorch is the only supported inference backend; future alternatives
|
||||
(burn/cublas via safetensors) would require separate integration work
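
One way to soften the runtime import error listed above is to check for the backend up front and fail with an actionable message. This is a minimal sketch, assuming a hypothetical `BackendNotInstalledError` class and assuming the package installs as `alknet-firewall` with a `[torch]` extra; the library's actual exception hierarchy and install command may differ.

```python
import importlib.util


class BackendNotInstalledError(ImportError):
    """Hypothetical error type raised when no inference backend is present."""


def require_backend(module_name: str = "torch", extra: str = "torch") -> None:
    """Raise a descriptive error if a top-level backend module is missing.

    find_spec() only locates the module; it does not import it, so the
    check stays cheap even when the backend is installed.
    """
    if importlib.util.find_spec(module_name) is None:
        raise BackendNotInstalledError(
            f"Inference backend '{module_name}' is not installed. "
            f"Install it with: pip install 'alknet-firewall[{extra}]'"
        )
```

Calling `require_backend()` at the point where the detector model is first loaded keeps the base import lightweight while still telling users exactly what to install.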
|
||||
|
||||
## References
|
||||
|
||||
|
||||
@@ -0,0 +1,75 @@
|
||||
# ADR-011: Standalone API with Thin Adapter Integration Strategy
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
alknet-firewall provides behavioral signal detection — fundamentally different
|
||||
from text-surface defenses like Llama Guard, NeMo Guardrails, or Guardrails AI.
|
||||
It requires running a small detector model and extracting hidden state
|
||||
activations, not classifying input text. Users may want to run both text-surface
|
||||
defenses and behavioral detection in series.
|
||||
|
||||
Research into existing guardrail systems ([patterns-analysis.md](../../research/guardrail-integration-patterns/patterns-analysis.md))
|
||||
identified three viable integration targets with high compatibility:
|
||||
|
||||
- **LlamaFirewall**: `BaseScanner.scan()` → `ScanResult` maps directly to
|
||||
`Firewall.screen()` → `Alarm`
|
||||
- **OpenAI Agents SDK**: `@input_guardrail` decorator pattern with blocking
|
||||
execution
|
||||
- **NeMo Guardrails**: Custom Python action in input rails (Colang DSL can't
|
||||
express behavioral detection natively)
|
||||
|
||||
Two systems have low compatibility: Guardrails AI (expects text-surface
|
||||
validators with content fixes, not alarms) and Amazon Bedrock Guardrails
|
||||
(closed service, no extension mechanism).
|
||||
|
||||
## Decision
|
||||
|
||||
**Phase 1**: Ship a standalone API only. No adapters, no common interface.
|
||||
|
||||
```python
|
||||
# The core API — simple, composable, no framework dependencies
|
||||
firewall = Firewall()
|
||||
alarm = firewall.screen("untrusted input text")
|
||||
```
|
||||
|
||||
**Phase 2**: Build thin adapter packages as optional dependencies. Each adapter
|
||||
is <100 lines and has no impact on the core library:
|
||||
|
||||
- `alknet-firewall-llamafirewall`: Custom `BaseScanner` subclass
|
||||
- `alknet-firewall-agents`: `@input_guardrail` wrapper
|
||||
- `alknet-firewall-nemo`: Custom NeMo input rail action
|
||||
|
||||
Do NOT build a common `ScreeningProvider` interface. The integration patterns
|
||||
differ enough between systems that a shared abstraction would be premature and
|
||||
constraining. If a common pattern emerges organically from the adapters,
|
||||
extract it then.
|
||||
|
||||
## Consequences
|
||||
|
||||
**Positive**:
|
||||
- Phase 1 ships faster — no adapter development or testing overhead
|
||||
- Core API stays clean and framework-independent
|
||||
- Users can compose manually: call `firewall.screen()` then pass results to
|
||||
any guardrail system
|
||||
- Adapters are optional packages, not core dependencies — no coupling
|
||||
- Thin adapters are easy to maintain when guardrail frameworks change their
|
||||
APIs
|
||||
|
||||
**Negative**:
|
||||
- Phase 1 users must write their own glue code (typically 5–10 lines)
|
||||
- No "pip install and configure" experience until Phase 2
|
||||
- Multiple small adapter packages to maintain
|
||||
- Risk of API drift between core and adapters if adapters are maintained
|
||||
infrequently
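
The Phase 1 glue code mentioned above might look like the following. This is a sketch under stated assumptions: `StubFirewall`, `Alarm`, and the `level` attribute are stand-ins written so the example runs without a detector model, and `text_check` represents any text-surface guardrail reduced to a pass/fail callable. The stub's keyword match is not how behavioral detection works.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class AlarmLevel(Enum):
    CLEAR = "clear"
    SUSPICIOUS = "suspicious"
    DANGEROUS = "dangerous"


@dataclass(frozen=True)
class Alarm:
    level: AlarmLevel  # hypothetical attribute name


class StubFirewall:
    """Stand-in for Firewall; the real one inspects model activations."""

    def screen(self, text: str) -> Alarm:
        flagged = "ignore previous instructions" in text.lower()
        return Alarm(AlarmLevel.DANGEROUS if flagged else AlarmLevel.CLEAR)


def screen_in_series(text: str,
                     text_check: Callable[[str], bool],
                     firewall: StubFirewall) -> bool:
    """Return True only if both layers let the input through."""
    if not text_check(text):  # text-surface defense runs first
        return False
    return firewall.screen(text).level is AlarmLevel.CLEAR


allowed = screen_in_series("Hello, how are you?", lambda t: True, StubFirewall())
```

The function body is the "5–10 lines" of glue; everything above it exists only to make the sketch self-contained.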
|
||||
|
||||
## References
|
||||
|
||||
- [OQ-05](../open-questions.md) — How should the firewall integrate with
|
||||
existing guardrail systems?
|
||||
- [patterns-analysis.md](../../research/guardrail-integration-patterns/patterns-analysis.md) — Full research analysis
|
||||
- [ADR-002](002-behavioral-signals.md) — Behavioral signal detection (not text
|
||||
classification)
|
||||
@@ -196,5 +196,5 @@ All exception types subclass `AlknetFirewallError` (base library exception).
|
||||
Open questions are tracked in [open-questions.md](open-questions.md). Key
|
||||
questions affecting this document:
|
||||
|
||||
- **OQ-03**: Should the firewall support streaming/chunked input screening? (open — rolling window approach is promising)
|
||||
- **OQ-05**: How should the firewall integrate with existing guardrail systems? (open — needs research)
|
||||
- **OQ-03**: Should the firewall support streaming/chunked input screening? (open — rolling window approach is promising; [research complete](../research/streaming-screening-patterns/rolling-window-analysis.md))
|
||||
- ~~**OQ-05**~~: ~~How should the firewall integrate with existing guardrail systems?~~ (resolved — ADR-011: standalone API + thin adapters Phase 2)
|
||||
@@ -72,8 +72,7 @@ class DetectorModel(Protocol):
|
||||
```
|
||||
|
||||
The `infer` method returns hidden states at key layers, abstracting away
|
||||
whether the backend is PyTorch, ONNX Runtime, or a future Rust inference
|
||||
engine.
|
||||
whether the backend is PyTorch or a future alternative inference engine.
|
||||
|
||||
### Lazy Loading
|
||||
|
||||
@@ -158,4 +157,4 @@ class HFDetectorModel:
|
||||
Open questions are tracked in [open-questions.md](open-questions.md). Key
|
||||
questions affecting this document:
|
||||
|
||||
- **OQ-01**: Should ONNX Runtime be a supported inference backend in Phase 1? (open)
|
||||
- ~~**OQ-01**~~: ~~Should ONNX Runtime be a supported inference backend in Phase 1?~~ (resolved: removed from scope; burn/cublas is a better future path)
|
||||
@@ -4,45 +4,40 @@ Centralized tracker for unresolved questions across all architecture documents.
|
||||
|
||||
## Theme: Inference Backend
|
||||
|
||||
### OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?
|
||||
### ~~OQ-01: Should ONNX Runtime be a supported inference backend in Phase 1?~~
|
||||
|
||||
- **Origin**: [model.md](model.md), [overview.md](overview.md)
|
||||
- **Status**: open
|
||||
- **Status**: **resolved**
|
||||
- **Priority**: medium
|
||||
- **Resolution**: (pending — needs research into ONNX export path)
|
||||
- **Resolution**: Removed from scope entirely. ONNX Runtime does not support
|
||||
`output_hidden_states=True` natively (HuggingFace optimum issue #972 was
|
||||
closed as "not planned"), making activation extraction — the core operation —
|
||||
impractical without a custom ONNX graph modification pipeline. The ONNX
|
||||
model format also produces bloated exports. A future alternative inference
|
||||
path using burn/cublas with safetensors is more promising since it supports
|
||||
all platforms and uses the same model format we already require.
|
||||
- **Cross-references**: ADR-006
|
||||
|
||||
ONNX Runtime provides a much smaller install footprint (~30-50MB vs 200MB-2.5GB
|
||||
for PyTorch) and is well-suited for inference-only use. HuggingFace's `optimum`
|
||||
library provides drop-in replacement classes. However, supporting it in Phase 1
|
||||
adds complexity: model must be exported to ONNX format, `optimum` integration
|
||||
must be tested, and the activation extraction API may differ from PyTorch.
|
||||
|
||||
The likely path is: build with PyTorch first, then export to ONNX by default.
|
||||
This needs research to confirm the activation extraction API compatibility and
|
||||
ONNX export quality for SmolLM2-135M. Leave open for now.
|
||||
|
||||
---
|
||||
|
||||
## Theme: Codebook Design
|
||||
|
||||
### OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?
|
||||
### ~~OQ-02: What is the minimum viable codebook — can the 1,245-line PoC codebook be compressed?~~
|
||||
|
||||
- **Origin**: [codebook.md](codebook.md)
|
||||
- **Status**: open
|
||||
- **Status**: **resolved**
|
||||
- **Priority**: high
|
||||
- **Resolution**: (pending — dedicated research session needed)
|
||||
- **Resolution**: Yes — ~65% compression to 500–600 lines total (400–500 runtime
|
||||
+ 150–200 training). The PoC contains ~480 lines of essential runtime code
|
||||
plus ~178 lines needed from metaspline core. The 5x-repeated decomposition
|
||||
pipeline collapses into a single `decompose()` function (~50 lines saved).
|
||||
The histogram classifier (~130 lines) is exploratory and not MVP. The
|
||||
`build()` method (429 lines) is decomposed: training logic moves to
|
||||
`training/compiler.py`, runtime state becomes immutable serialized data.
|
||||
See [poc-architecture.md](../research/codebook-analysis/poc-architecture.md)
|
||||
and the Package Structure section in [codebook.md](codebook.md).
|
||||
- **Cross-references**: ADR-004
|
||||
|
||||
The PoC codebook is 1,245 lines — much of it may be boilerplate, dead code,
|
||||
or excessive parameterization from the research phase. Understanding what's
|
||||
essential vs. exploratory is critical for the initial extraction. The codebook
|
||||
training pipeline (`run_manifold_projection.py`) should also be analyzed.
|
||||
|
||||
Consider: How many SVD dimensions are actually needed? What's the minimum
|
||||
calibration dataset? Can spline distributions be simplified? This needs a
|
||||
dedicated session to analyze the PoC codebase.
|
||||
|
||||
---
|
||||
|
||||
## Theme: API Design
|
||||
@@ -103,42 +98,30 @@ candidate for Phase 2.
|
||||
|
||||
## Theme: Integration
|
||||
|
||||
### OQ-05: How should the firewall integrate with existing guardrail systems?
|
||||
### ~~OQ-05: How should the firewall integrate with existing guardrail systems?~~
|
||||
|
||||
- **Origin**: [firewall.md](firewall.md), [overview.md](overview.md)
|
||||
- **Status**: open
|
||||
- **Status**: **resolved**
|
||||
- **Priority**: medium
|
||||
- **Resolution**: (pending — needs deep dive into current guardrail landscape)
|
||||
- **Cross-references**: ADR-002
|
||||
|
||||
The behavioral firewall is complementary to text-surface defenses. Users may
|
||||
want to run both Llama Guard (text classification) and alknet-firewall
|
||||
(behavioral signals) in series. However, what we're doing is fundamentally
|
||||
different — it requires having the model and having trained on its specific
|
||||
behavioral signals. This means direct API-level integration with other systems
|
||||
may not be straightforward.
|
||||
|
||||
A deep dive into the current state of guardrail integration patterns
|
||||
(LlamaFirewall's scanner interface, NeMo Guardrails' Colang DSL, etc.) is
|
||||
needed to determine whether we should build adapters, define a common
|
||||
interface, or simply provide a clean standalone API and let users compose
|
||||
systems themselves.
|
||||
|
||||
Leave open — will research soon.
|
||||
- **Resolution**: Standalone API + thin adapter pattern (ADR-011). Phase 1:
|
||||
ship the standalone `Firewall.screen(text) → Alarm` API only. Phase 2:
|
||||
build thin adapter packages (<100 lines each) for LlamaFirewall,
|
||||
OpenAI Agents SDK, and NeMo Guardrails as optional dependencies. Do NOT
|
||||
build a common `ScreeningProvider` interface — behavioral detection is
|
||||
fundamentally different from text-surface defenses and premature abstraction
|
||||
would be constraining.
|
||||
- **Cross-references**: ADR-002, ADR-011
|
||||
|
||||
---
|
||||
|
||||
## Theme: Project Setup
|
||||
|
||||
### OQ-06: Should file-based configuration use TOML or YAML?
|
||||
### ~~OQ-06: Should file-based configuration use TOML or YAML?~~
|
||||
|
||||
- **Origin**: [configuration.md](configuration.md)
|
||||
- **Status**: open
|
||||
- **Status**: **resolved**
|
||||
- **Priority**: low
|
||||
- **Resolution**: (pending — Phase 2 concern)
|
||||
- **Cross-references**: None
|
||||
|
||||
Phase 1 uses constructor-based configuration only. A future phase may add
|
||||
file-based configuration for easier deployment. TOML is consistent with
|
||||
Python packaging (pyproject.toml) and increasingly the standard for Python
|
||||
config. YAML is more familiar in ops/ML contexts. Either works.
|
||||
- **Resolution**: TOML. Consistent with modern Python packaging conventions
|
||||
(`pyproject.toml`) and increasingly the standard for Python configuration.
|
||||
This is a two-way door decision: switching to YAML later is straightforward.
|
||||
- **Cross-references**: None
|
||||
@@ -56,17 +56,16 @@ for the full threat analysis and academic evidence.
|
||||
- Interpretable detection signals (SVD direction analysis)
|
||||
|
||||
- **Phase 2**: Integration and operational hardening
|
||||
- ONNX Runtime inference backend
|
||||
- Async/batch screening API
|
||||
- Integration adapters for LlamaFirewall, NeMo Guardrails
|
||||
- Integration adapters for LlamaFirewall, NeMo Guardrails, OpenAI Agents SDK
|
||||
- Metrics and observability
|
||||
- Codebook training pipeline (`run_manifold_projection.py` extraction)
|
||||
- Streaming/rolling-window input screening (granular detection for documents)
|
||||
|
||||
- **Phase 3**: Advanced capabilities
|
||||
- Multi-turn attack detection (payload splitting)
|
||||
- Streaming/rolling-window input screening (granular detection for documents)
|
||||
- Custom model fine-tuning for domain-specific detection
|
||||
- ONNX Runtime inference backend (export from PyTorch)
|
||||
- Alternative inference backends (burn/cublas via safetensors)
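
The rolling-window item in Phase 2 is not specified in this diff; the referenced research document holds the actual design. As a generic illustration of the idea, overlapping windows over a token sequence can be produced as below. The `size` and `stride` parameters and the choice to emit a shorter final window are assumptions of this sketch.

```python
from typing import Iterator, Sequence


def rolling_windows(tokens: Sequence[str], size: int,
                    stride: int) -> Iterator[tuple[int, Sequence[str]]]:
    """Yield (start_index, window) pairs covering every token.

    Consecutive windows overlap when stride < size. The final window
    may be shorter than `size` so the tail of the input is not dropped.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not tokens:
        return
    start = 0
    while True:
        yield start, tokens[start:start + size]
        if start + size >= len(tokens):
            break
        start += stride


windows = list(rolling_windows(list("abcdefghij"), size=4, stride=3))
```

Each window could then be screened independently and the most severe alarm reported, which is what would give document-level inputs the granular detection the roadmap item describes.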
|
||||
|
||||
### Out of Scope
|
||||
|
||||
@@ -138,8 +137,6 @@ for the full threat analysis and academic evidence.
|
||||
|---------|-------|---------|---------|-------|
|
||||
| `torch` | `[torch]` | >=2.2 | Model inference | 200MB-2.5GB; optional dependency |
|
||||
| `transformers` | `[torch]` | >=4.40 | Model loading pipeline | Required with torch extra |
|
||||
| `onnxruntime` | `[onnx]` | >=1.17 | Alternative inference | ~30-50MB; Phase 2 |
|
||||
| `optimum` | `[onnx]` | latest | ONNX Runtime integration | Phase 2 |
|
||||
|
||||
### Development (Not Published)
|
||||
|
||||
@@ -187,6 +184,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/).
|
||||
| [008](decisions/008-three-level-alarm.md) | Three-level alarm system | CLEAR/SUSPICIOUS/DANGEROUS balances simplicity with nuance |
|
||||
| [009](decisions/009-last-token-extraction.md) | Last-token activation extraction | Standard for autoregressive models; full sequence context |
|
||||
| [010](decisions/010-monotonic-spline-distributions.md) | Monotonic spline distributions | Compact, smooth, tail-sensitive behavioral region modeling |
|
||||
| [011](decisions/011-guardrail-integration-strategy.md) | Standalone API + thin adapters | Phase 1 standalone, Phase 2 thin adapter packages |
|
||||
|
||||
## Dependencies on Other Projects
|
||||
|
||||
@@ -204,5 +202,5 @@ All design decisions are documented as ADRs in [decisions/](decisions/).
|
||||
Open questions are tracked in [open-questions.md](open-questions.md). Key
|
||||
questions affecting this document:
|
||||
|
||||
- **OQ-01**: Should ONNX Runtime be a supported inference backend in Phase 1? (open)
|
||||
- **OQ-05**: How should the firewall integrate with existing guardrail systems? (open)
|
||||
- **OQ-01**: Should ONNX Runtime be a supported inference backend in Phase 1? (resolved — removed from scope; ONNX doesn't support activation extraction natively, and burn/cublas is a better future path)
|
||||
- **OQ-05**: How should the firewall integrate with existing guardrail systems? (resolved — ADR-011: standalone API + thin adapters in Phase 2)
|
||||