storage/docs/architecture/open-questions.md
glm-5.1 ae242f33b9 Restructure identity tables: separate credential types, add peer_credentials, specify FK cascades and indexes
Identity tables were derived from hub's PostgreSQL schema but simplified
without documenting what was removed or why. This restructures them for the
current auth landscape (API key + wraith SSH/cert-authority):

- ADR-049: Separate api_keys and peer_credentials tables (different lookup
  patterns, columns, lifecycles), remove Gitea columns, map hub data→metadata
- ADR-050: Extract SHA-256 vs KDF decision from inline spec text
- Add peer_credentials table for SSH key and cert-authority auth
- Specify all FK cascade behaviors within system DB (RESTRICT, CASCADE, SET NULL)
- Complete index specifications for all identity tables
- Add scope boundary section (storage owns schemas, not auth/authorization)
- Update audit_logs with credentialId+credentialType polymorphic reference
- Add 3 new open questions (OQ-33/34/35) for credential type expansion
2026-06-02 12:33:20 +00:00


---
status: draft
last_updated: 2026-06-02
---
# Open Questions Tracker
Cross-cutting compilation of all unresolved questions across the storage
architecture documents, organized by theme.
## Summary
| Status | Count |
|--------|-------|
| Open | 16 |
| Resolved (this revision) | 18 |
| Previously resolved | 11 |
**Open questions requiring decisions:**
- **OQ-07** (encryptRaw performance) — low priority, add if needed
- **OQ-10** (Edit[] classification) — needs POC
- **OQ-11** (auto-migrate vs explicit) — conditional on OQ-10
- **OQ-12** (schema evolution vs event-sourced replay) — post-v1 concern
- **OQ-13** (schema evolution events in event stream) — post-v1
- **OQ-20** (DelegatesEdge expiration) — low priority, per-session edges may suffice
- **OQ-25** (scope string semantics) — evaluator concern
- **OQ-27** (tenant DB schema migration strategy) — multi-tenant operations
- **OQ-28** (cross-tenant delegation with separate DBs) — cross-DB coordination
- **OQ-29** (standalone drizzle-honker npm package) — community value
- **OQ-30** (composite event target for single-node hub) — latency optimization
- **OQ-31** (consumer naming for durable subscriptions) — restart stability
- **OQ-32** (Drizzle Kit migration compatibility) — custom adapter
- **OQ-33** (peer_credentials SSH key type expansion) — defer until needed
- **OQ-34** (hub api_keys migration path) — needed for hub transition
- **OQ-35** (peer_credentials Iroh auth metadata) — defer until Iroh NAPI complete
## Theme 1: Package Boundaries and Dependencies
### OQ-01: Should @alkdev/flowgraph export a Type.Module?
- **Origin**: [metagraph-module.md](metagraph-module.md)
- **Status**: resolved
- **Priority**: high
- **Resolution**: Storage can start with standalone schemas. Adopt `Import()` when flowgraph provides a Module. No circular dependency.
- **Cross-references**: ADR-003, ADR-010
### OQ-02: Should concrete graph type Modules live in storage or their packages?
- **Origin**: [metagraph-module.md](metagraph-module.md)
- **Status**: resolved
- **Priority**: medium
- **Resolution**: Both. Storage provides reference Modules; packages may also export their own.
- **Cross-references**: ADR-003
## Theme 2: Data Model
### OQ-03: Should actors be a node type or a standalone table?
- **Origin**: [overview.md](overview.md)
- **Status**: resolved
- **Priority**: medium
- **Resolution**: Actors become `PrincipalNode` in ACL graph. `actors` table removed. `ACTOR_TYPE` replaced by `IdentityType` in AclGraph Module. See ADR-035.
- **Cross-references**: ADR-035, ADR-034
### OQ-04: Should the repository layer be host-specific or host-agnostic?
- **Origin**: [overview.md](overview.md)
- **Status**: resolved
- **Priority**: medium
- **Resolution**: Single host (SQLite). Question is moot — no dual-host repository needed. ADR-038.
### OQ-05: Should `*EdgeConstraints` entries use `Type.Ref` or `Type.String`?
- **Origin**: [metagraph-module.md](metagraph-module.md)
- **Status**: resolved
- **Priority**: low
- **Resolution**: `Type.String()` — constraint arrays contain names, not schemas. ADR-015.
### OQ-06: Graph pointer abstraction vs repository layer?
- **Origin**: [metagraph-module.md](metagraph-module.md)
- **Status**: resolved
- **Priority**: low
- **Resolution**: Direct key-based addressing for v1. Typed pointers post-v1. ADR-017.
## Theme 3: Encryption and Security
### OQ-07: Add encryptRaw() for performance?
- **Origin**: [encrypted-data.md](encrypted-data.md)
- **Status**: open
- **Priority**: low
- **Notes**: PBKDF2 adds ~100ms. Add if batch operations demand it.
### OQ-08: Should key attribute on secret nodes be encrypted?
- **Status**: resolved
- **Priority**: low
- **Resolution**: Plaintext for now. Add `keyHash` if names are sensitive.
### OQ-09: Should secret nodes have lastUsedAt and expiresAt as columns?
- **Status**: resolved
- **Priority**: low
- **Resolution**: JSON attributes for spoke, standalone table for hub.
## Theme 4: Schema Evolution
### OQ-10: Can Value.Diff Edit[] be reliably classified as breaking vs non-breaking?
- **Origin**: [schema-evolution.md](schema-evolution.md)
- **Status**: open
- **Priority**: high
- **Notes**: Theoretical classification needs POC validation.
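
To make the classification question concrete, the following is a minimal sketch of one candidate rule set. The `Edit` type is redeclared locally and is assumed to mirror the shape of the edits returned by `Value.Diff`; `classifyEdit` and `classifyDiff` are hypothetical names, and the rules are untested heuristics that the POC would need to confirm or replace.

```typescript
// Local mirror of the edit shape produced by Value.Diff (an assumption,
// redeclared here so the sketch has no dependencies).
type Edit =
  | { type: "insert"; path: string; value: unknown }
  | { type: "update"; path: string; value: unknown }
  | { type: "delete"; path: string };

type Impact = "breaking" | "non-breaking" | "unknown";

// Hypothetical heuristics for a diff between two JSON Schema objects.
function classifyEdit(edit: Edit): Impact {
  const segments = edit.path.split("/").filter(Boolean);
  const touchesRequired = segments.includes("required");
  const last = segments[segments.length - 1];

  if (edit.type === "delete") {
    // Dropping a "required" entry loosens the schema; any other deletion
    // removes something existing consumers may depend on.
    return touchesRequired ? "non-breaking" : "breaking";
  }
  if (edit.type === "insert") {
    // A new "required" entry rejects previously valid data.
    if (touchesRequired) return "breaking";
    // A new entry directly under "properties" is an optional addition.
    return segments[segments.length - 2] === "properties" ? "non-breaking" : "unknown";
  }
  // Updates: a changed "type" keyword is treated as breaking; the rest is undecided.
  return last === "type" ? "breaking" : "unknown";
}

// A diff is as severe as its most severe edit.
function classifyDiff(edits: Edit[]): Impact {
  const impacts = edits.map(classifyEdit);
  if (impacts.includes("breaking")) return "breaking";
  if (impacts.includes("unknown")) return "unknown";
  return "non-breaking";
}
```

The `unknown` bucket is deliberate: it marks the cases where path inspection alone cannot decide, which is the part of the question the POC has to answer.
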
### OQ-11: Auto-migrate data on schema change, or explicit consumer action?
- **Origin**: [schema-evolution.md](schema-evolution.md)
- **Status**: open
- **Priority**: high
- **Notes**: Conditional on OQ-10 POC outcome.
### OQ-12: Schema evolution vs event-sourced replay?
- **Origin**: [schema-evolution.md](schema-evolution.md)
- **Status**: open
- **Priority**: medium
- **Notes**: Post-v1. Honker streams enable event-sourced replay more naturally than before.
### OQ-13: Schema evolution events in event stream?
- **Origin**: [schema-evolution.md](schema-evolution.md)
- **Status**: open
- **Priority**: low
- **Notes**: Post-v1. Honker streams provide a natural transport for schema change events.
## Theme 5: Encrypted Data Scope
### OQ-14: Per-attribute, per-node, or per-graph encryption?
- **Status**: resolved
- **Priority**: high
- **Resolution**: Per-attribute. ADR-023.
### OQ-15: Key management in this package?
- **Status**: resolved
- **Priority**: high
- **Resolution**: No. Application provides key ring. ADR-026.
## Theme 6: Repository Layer
### OQ-16: Should repository layer live in storage or consumer?
- **Status**: resolved
- **Priority**: high
- **Resolution**: CRUD in storage; operations bridging in consumer. ADR-033.
### OQ-17: Attribute queries — JSON path, native columns, or dbtype-generated?
- **Status**: resolved (updated)
- **Priority**: high
- **Resolution**: JSON path for metagraph attributes (dynamic schemas). Native columns for domain-specific tables (CallGraph, etc.). OperationSpecs provide the CRUD contract for both patterns. ADR-048 supersedes ADR-033.
- **Cross-references**: ADR-048
### OQ-18: Auto-generated vs hand-written CRUD?
- **Status**: resolved (updated)
- **Priority**: medium
- **Resolution**: Not hand-written CRUD — OperationSpecs. Storage outputs `OperationSpec[]` from table definitions. The consumer (hub/spoke) registers handlers. ADR-048.
- **Cross-references**: ADR-048
### OQ-19: Storage-operations bridge package location?
- **Status**: resolved (updated)
- **Priority**: medium
- **Resolution**: No separate bridge package needed. Storage outputs `OperationSpec[]` as part of its table definitions (type-only peer dep on `@alkdev/operations`). The consumer wires specs into the registry. ADR-048.
## Theme 7: Access Control
### OQ-20: Should DelegatesEdge support expiration?
- **Origin**: [acl.md](acl.md)
- **Status**: open
- **Priority**: low
- **Notes**: Session-scoped delegation could be modeled by creating/removing edges per session rather than adding `expiresAt`.
### OQ-21: Should ACL evaluator live in storage or hub?
- **Origin**: [acl.md](acl.md)
- **Status**: resolved
- **Priority**: high
- **Resolution**: Storage provides traversal primitives; hub composes with operations `enforceAccess`. The single-host model (no PG/SQLite split) simplifies this — no cross-DB joins needed for ACL evaluation within a tenant DB. ADR-034.
### OQ-22: How are ACL graph instances created and managed?
- **Origin**: [acl.md](acl.md)
- **Status**: resolved
- **Priority**: medium
- **Resolution**: One ACL graph instance per tenant DB (ADR-040). The tenant DB is inherently org-scoped, so the ACL graph covers one org. No cross-org scoping issue within a single tenant DB.
- **Cross-references**: ADR-040
### OQ-23: BelongsToEdge derived or primary?
- **Origin**: [acl.md](acl.md)
- **Status**: resolved
- **Priority**: medium
- **Resolution**: Derived. `organization_members` SQL table is authoritative for indexed lookups; `BelongsToEdge` in ACL graph enables traversal evaluation. ADR-045.
- **Cross-references**: ADR-045
### OQ-24: How does identityId reference hub entities without package dependency?
- **Origin**: [acl.md](acl.md)
- **Status**: resolved
- **Priority**: medium
- **Resolution**: Logical string references, consistent with ADR-020. With identity tables now in `@alkdev/storage` (ADR-041), the `PrincipalNode.identityId` logically references `accounts.id` in the system DB. Same pattern, clearer provenance.
- **Cross-references**: ADR-020, ADR-041
### OQ-25: Scope string semantics for subset validation?
- **Origin**: [acl.md](acl.md)
- **Status**: open
- **Priority**: high
- **Notes**: Keypal's colon-separated hierarchical scope model with `*` wildcard. ACL evaluator must use same semantics. Scope matching is an evaluator concern, not a storage concern.
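
As a starting point for the discussion, here is a sketch of one possible reading of "colon-separated hierarchical scopes with a `*` wildcard". The semantics are an assumption and have not been checked against Keypal: a `*` segment matches any single segment, and a trailing `*` also covers all remaining segments. `scopeCovers` and `isScopeSubset` are hypothetical names.

```typescript
// Assumed semantics (NOT confirmed against Keypal): scopes are colon-separated;
// "*" matches any single segment, and a trailing "*" covers the rest of the scope.
function scopeCovers(granted: string, requested: string): boolean {
  const g = granted.split(":");
  const r = requested.split(":");
  for (let i = 0; i < g.length; i++) {
    const isLast = i === g.length - 1;
    if (g[i] === "*") {
      if (i >= r.length) return false; // nothing left for the wildcard to match
      if (isLast) return true;         // trailing wildcard covers the remainder
      continue;                        // inner wildcard matches this one segment
    }
    if (g[i] !== r[i]) return false;   // literal mismatch, or requested is shorter
  }
  return g.length === r.length;        // no unmatched trailing segments
}

// Subset validation: every delegated scope must be covered by some parent scope.
function isScopeSubset(delegated: string[], parent: string[]): boolean {
  return delegated.every((d) => parent.some((p) => scopeCovers(p, d)));
}
```

Under this reading, `graph:*` covers `graph:read` but not a bare `graph`, and a delegated `graph:*` is rejected when the parent only holds `graph:read`. Whichever semantics are chosen, the evaluator and the key issuer need to share the same function.
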
## Theme 8: Honker and SQLite
### OQ-26: Can Honker fully replace @alkdev/pubsub's Redis transport for single-node deployments?
- **Origin**: [honker-integration.md](honker-integration.md)
- **Status**: resolved
- **Priority**: high
- **Resolution**: Yes for single-node. The `HonkerEventTarget` adapter (ADR-047) implements pubsub's `TypedEventTarget` on Honker's `notify`/`listen` and `stream`/`subscribe`. POC 2-4 validated: same-process pub/sub works, transactional semantics hold, concurrent listeners work. Redis still needed for multi-node deployments. In-process EventTarget provides sub-ms latency for hot paths (vs ~17ms for Honker round-trip).
- **Cross-references**: ADR-047
### OQ-27: How are schema migrations applied across all tenant DBs?
- **Origin**: [honker-integration.md](honker-integration.md)
- **Status**: open
- **Priority**: high
- **Notes**: Each tenant DB has its own migration history. When a schema change is deployed, all tenant DBs need migration. Options: (1) Migration queue — enqueue a migration job per tenant DB, workers claim and execute. (2) Lazy migration — migrate on first access. (3) Startup sweep — hub iterates all tenant DBs at startup and applies pending migrations.
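
Options 2 and 3 above differ mainly in when the same per-tenant routine runs. The sketch below illustrates that with in-memory stand-ins; `TenantDb`, `Migration`, `migrateTenant`, `startupSweep`, and `openTenant` are all hypothetical names, and a real runner would wrap each step and its version bump in a transaction.

```typescript
// Hypothetical shapes for illustration; none of these names come from @alkdev/storage.
interface TenantDb {
  id: string;
  schemaVersion: number;
  log: string[]; // stands in for real DDL side effects
}

interface Migration {
  version: number;
  up: (db: TenantDb) => void;
}

// Apply every migration newer than the tenant's recorded version, in order.
// Returns the number of migrations that were applied.
function migrateTenant(db: TenantDb, migrations: Migration[]): number {
  const pending = migrations
    .filter((m) => m.version > db.schemaVersion)
    .sort((a, b) => a.version - b.version);
  for (const m of pending) {
    m.up(db);
    db.schemaVersion = m.version;
  }
  return pending.length;
}

// Option 3, startup sweep: migrate every known tenant before serving traffic.
function startupSweep(tenants: TenantDb[], migrations: Migration[]): number {
  return tenants.reduce((total, db) => total + migrateTenant(db, migrations), 0);
}

// Option 2, lazy migration: migrate a tenant the first time it is opened.
function openTenant(
  tenants: Map<string, TenantDb>,
  id: string,
  migrations: Migration[],
): TenantDb {
  const db = tenants.get(id);
  if (!db) throw new Error(`unknown tenant: ${id}`);
  migrateTenant(db, migrations);
  return db;
}
```

Option 1 (a migration queue) would enqueue one job per tenant and have workers call the same `migrateTenant` routine, so the choice between the three is largely about scheduling and failure isolation rather than migration logic.
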
### OQ-28: How does cross-tenant delegation work with separate DBs?
- **Origin**: [overview.md](overview.md)
- **Status**: open
- **Priority**: medium
- **Notes**: If a user in org A delegates to a user in org B, both tenant DBs are involved. The hub mediates. For v1, cross-tenant delegation can be deferred or handled via the system DB as a coordination point.
### OQ-29: Should the Drizzle-Honker adapter be published as a standalone npm package?
- **Origin**: [honker-integration.md](honker-integration.md)
- **Status**: open
- **Priority**: low
- **Notes**: The adapter is ~100 lines and useful to anyone combining Drizzle with Honker. Publishing as `drizzle-honker` would benefit the community. Decision: start inside `@alkdev/storage`, extract later if there's demand.
### OQ-30: Composite event target for single-node hub deployments?
- **Origin**: [honker-integration.md](honker-integration.md)
- **Status**: open
- **Priority**: medium
- **Notes**: POC 2 showed ~17ms median latency for Honker notify→listen vs sub-ms for in-process EventTarget. For single-node hubs, a composite that dispatches to both (in-process for speed, Honker for durability/cross-process) would be the ideal default. Design needed.
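
A minimal sketch of the composite idea, under stated assumptions: `DurableSink` is a hypothetical stand-in for a Honker-backed publisher (it is not the `HonkerEventTarget` API), and `CompositeEventTarget` and `PayloadEvent` are illustrative names. Local listeners are served synchronously by the platform `EventTarget`, and the same event is then handed to the durable path.

```typescript
// Hypothetical durable sink standing in for a Honker-backed publisher.
interface DurableSink {
  publish(type: string, payload: unknown): void;
}

// Event subclass that carries a typed payload.
class PayloadEvent<T> extends Event {
  readonly payload: T;
  constructor(type: string, payload: T) {
    super(type);
    this.payload = payload;
  }
}

class CompositeEventTarget {
  private readonly local = new EventTarget();
  private readonly durable: DurableSink;

  constructor(durable: DurableSink) {
    this.durable = durable;
  }

  // Subscribe to in-process delivery only.
  on<T>(type: string, handler: (payload: T) => void): void {
    this.local.addEventListener(type, (event) => {
      handler((event as PayloadEvent<T>).payload);
    });
  }

  // Dispatch to in-process listeners first, then to the durable path.
  emit<T>(type: string, payload: T): void {
    this.local.dispatchEvent(new PayloadEvent(type, payload));
    this.durable.publish(type, payload);
  }
}
```

One design point this sketch leaves open is deduplication: a process that also listens on the durable path would see its own events twice unless events carry an origin marker or the durable listener filters local emissions.
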
### OQ-31: Consumer naming convention for durable stream subscriptions?
- **Origin**: [honker-integration.md](honker-integration.md)
- **Status**: open
- **Priority**: medium
- **Notes**: Honker's `stream.subscribe(consumer)` requires a consumer name for offset tracking. The name must be stable across hub restarts (PID-based names don't survive restart). Need a convention: `{service}:{host}` or a configurable consumer group ID.
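
The two candidate conventions can be expressed with one small helper. This is a sketch only: `consumerName` and its options are hypothetical, and the character normalization is an assumption, since the notes do not say which characters Honker accepts in consumer names.

```typescript
interface ConsumerNameOptions {
  service: string; // logical service name, e.g. "hub"
  host: string;    // stable host identifier from configuration, never a PID
  group?: string;  // optional consumer group ID; takes precedence over host
}

// Normalize a name part to a conservative character set (assumed, not required by Honker).
function normalizePart(part: string): string {
  return part.trim().toLowerCase().replace(/[^a-z0-9._-]+/g, "-");
}

// Build "{service}:{group}" when a group is configured, otherwise "{service}:{host}".
function consumerName(options: ConsumerNameOptions): string {
  const suffix = options.group ?? options.host;
  return `${normalizePart(options.service)}:${normalizePart(suffix)}`;
}
```

Because both inputs come from configuration rather than process state, the resulting name is identical after a restart, which is what offset tracking needs.
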
### OQ-32: Drizzle Kit migration compatibility with Honker adapter?
- **Origin**: [honker-integration.md](honker-integration.md)
- **Status**: open
- **Priority**: medium
- **Notes**: Drizzle Kit supports SQLite migrations but expects `better-sqlite3` or `libsql`. Need to verify `drizzle-kit push`/`drizzle-kit generate` works with the custom Honker adapter, or whether we need a custom migration runner.
## Theme 9: Identity and Credentials
### OQ-33: Should `peer_credentials.credentialType` support additional SSH key types beyond Ed25519?
- **Origin**: [sqlite-host.md](sqlite-host.md)
- **Status**: open
- **Priority**: low
- **Notes**: Current spec assumes Ed25519 only (matching wraith ADR-012). RSA and ECDSA keys are common in legacy SSH deployments. If wraith adds support for additional key types, `credentialType` values like `ssh_key_rsa`, `ssh_key_ecdsa` or a `keyType` column may be needed. Defer until wraith supports additional key types.
### OQ-34: How should hub `api_keys` data migrate to the restructured storage schema?
- **Origin**: [sqlite-host.md](sqlite-host.md), [ADR-049](decisions/049-identity-schema-restructuring.md)
- **Status**: open
- **Priority**: medium
- **Notes**: The hub's existing PostgreSQL `api_keys` table has columns (`description`, `keyId`) that map differently to storage's schema. `description` maps to `metadata` (no dedicated column). `keyId` (FK → api_keys.id) becomes `credentialId` + `credentialType` (polymorphic). Hub's `data` columns map to `commonCols.metadata`. A migration script is needed when the hub consumes storage's identity tables.
### OQ-35: Should `peer_credentials` support Iroh-specific authentication metadata?
- **Origin**: [sqlite-host.md](sqlite-host.md)
- **Status**: open
- **Priority**: low
- **Notes**: Iroh connections use node IDs (base58-encoded) for addressing. If Iroh provides an authentication mechanism beyond SSH key auth (e.g., node ID-based trust), `peer_credentials` may need an iroh-specific credential type or additional columns. The Iroh NAPI wrapper is not yet complete; defer until its pubsub integration is implemented.
## ADR Impact
| ADR | Resolves | Informs |
|-----|----------|---------|
| ADR-003 | OQ-01 (partial) | |
| ADR-015 | OQ-05 | |
| ADR-017 | OQ-06 | |
| ADR-020 | OQ-24 | |
| ADR-023 | OQ-14 | |
| ADR-026 | OQ-15 | |
| ADR-033 | OQ-04, OQ-16, OQ-17, OQ-18 | OQ-17 (superseded by ADR-048) |
| ADR-034 | OQ-03, OQ-21 | OQ-25 |
| ADR-035 | OQ-03 | |
| ADR-038 | OQ-04 (moot) | OQ-17 (less pressure) |
| ADR-040 | OQ-22 | OQ-27, OQ-28 |
| ADR-041 | OQ-24 | |
| ADR-042 | | OQ-24 |
| ADR-043 | | |
| ADR-044 | OQ-19 (less pressure) | OQ-29, OQ-32 |
| ADR-045 | OQ-23 | OQ-20 |
| ADR-046 | | OQ-17 |
| ADR-047 | OQ-26 | OQ-30 |
| ADR-048 | OQ-17 (updated), OQ-18 (updated), OQ-19 (updated) | |
| ADR-049 | | OQ-33, OQ-34, OQ-35 |
| ADR-050 | | |