storage/docs/architecture/open-questions.md
glm-5.1 6b5f32bad4 Add ACL graph architecture spec with principal-agent framework
- New acl.md: AclGraph Module definition (PrincipalNode, ResourceNode,
  DelegatesEdge, ScopesEdge, MemberEdge), principal-agent hierarchy
  with no-escalation invariant, setup-time vs runtime separation,
  multi-parent aggregation rules, cycle detection, scope semantics
- ADR-034: ACL as metagraph (not domain-specific tables)
- ADR-035: Actors become PrincipalNode entries, standalone table removed
- ADR-036: Principal-agent as DelegatesEdge with scope narrowing
- ADR-037: Setup-time definitions seed graph types, runtime instances
  are separate graphs
- Resolve OQ-03 (actors table design) — actors become ACL nodes
- Add OQ-20 through OQ-25 (delegation expiration, evaluator location,
  graph instance lifecycle, BelongsToEdge derivation, identityId
  references, scope string semantics)
- Update README.md and overview.md to reflect new doc and ADRs
- Note: multi-tenancy / graph scoping problem (no ownerId/scopeId on
  graphs table, no identity tables at this level) still needs
  resolution — identity and org tables will likely need to be added
  at this level for referential integrity
2026-05-31 07:11:59 +00:00


status: draft
last_updated: 2026-05-30

Open Questions Tracker

Cross-cutting compilation of all unresolved questions across the storage architecture documents, organized by theme. Questions that appear in multiple documents are unified here with cross-references.

When a question is resolved, update its status to resolved and add a resolution note. Once all questions in a theme are resolved, the theme section can be removed and the resolution noted in the relevant ADR.

Summary

| Status | Count |
|---|---|
| Open | 13 |
| Partially resolved | 1 |
| Resolved | 11 |

Open questions requiring decisions:

  • OQ-04 (repository layer host-specific vs host-agnostic) — start host-specific
  • OQ-07 (encryptRaw performance) — low priority, add if needed
  • OQ-10 (Edit[] classification) — needs POC
  • OQ-11 (auto-migrate vs explicit consumer action) — conditional on OQ-10
  • OQ-12 (schema evolution vs event-sourced replay) — post-v1 concern
  • OQ-13 (schema evolution events in event stream) — post-v1
  • OQ-19 (storage-operations bridge package location) — depends on long-term CRUD strategy
  • OQ-20 (delegation expiration) — ACL design
  • OQ-21 (ACL evaluator location) — ACL design
  • OQ-22 (ACL graph instance lifecycle) — ACL design
  • OQ-23 (BelongsToEdge derivation) — ACL design
  • OQ-24 (identityId reference mechanism) — ACL design
  • OQ-25 (scope string semantics for subset validation) — ACL design

Partially resolved:

  • OQ-01 (flowgraph Module export) — storage can start without it

Resolved (v1 direction decided, long-term question remains open):

  • OQ-17 (attribute query strategy) — JSON path for v1 (ADR-033), hybrid viable with dbtype later
  • OQ-18 (auto-generated vs hand-written CRUD) — hand-write for v1 (ADR-033), auto-gen remains an option

How to Use This Document

  • Each question has an ID (e.g., OQ-01), status, origin (which doc(s)), and priority
  • Cross-references link related questions and ADRs
  • Resolved questions have a resolution note

ADR Impact

ADR Resolves Informs
ADR-003 OQ-01 (partial — storage can start without flowgraph Module)
ADR-015 OQ-05 (constraint semantics)
ADR-018 OQ-17 (v1 decision: dbtype integration deferred, JSON path for v1)
ADR-020 OQ-02 (no nodeTypeId for now, can add later)
ADR-033 OQ-17 (JSON path queries for v1), OQ-18 (hand-written CRUD for v1)
ADR-034 OQ-03 (actors become ACL nodes) OQ-21 (evaluator location), OQ-23 (BelongsToEdge derivation), OQ-24 (identityId references)
ADR-035 OQ-03 (standalone table removed)
ADR-036 OQ-20 (delegation expiration)
ADR-037 OQ-21 (evaluator location), OQ-22 (graph instance lifecycle)

Theme 1: Package Boundaries and Dependencies

OQ-01: Should @alkdev/flowgraph export a Type.Module, or should storage define its own entries with documented correspondence?

  • Origin: metagraph-module.md
  • Status: partially resolved
  • Priority: high
  • Notes: Storage can start with standalone schemas and Type.Composite([BaseNode, CallNodeAttrs]) — no dependency on flowgraph. Adopt Import() when flowgraph provides a Module. This avoids a circular dependency: @alkdev/storage does NOT depend on @alkdev/flowgraph.
  • Cross-references: ADR-003, ADR-010

OQ-02: Should concrete graph type Modules live in storage or in their respective packages?

  • Origin: metagraph-module.md
  • Status: resolved
  • Priority: medium
  • Resolution: Both. Storage provides reference Modules in modules/ that consumers can use directly or replace. Flowgraph may also export a Module — the two are compatible via Module $defs.
  • Cross-references: ADR-003

Theme 2: Data Model

OQ-03: Should actors be a node type or a standalone table?

  • Origin: overview.md
  • Status: resolved
  • Priority: medium
  • Resolution: Actors become PrincipalNode entries in the ACL graph instance. The standalone actors table is removed. ACTOR_TYPE is replaced by the IdentityType enum in the AclGraph Module. See ADR-035.
  • Cross-references: ADR-035, ADR-034, acl.md

OQ-04: Should the repository layer be host-specific or host-agnostic?

  • Origin: overview.md
  • Status: open
  • Priority: medium
  • Notes: A host-agnostic repository requires an abstraction over Drizzle's query builder. A host-specific repository is simpler but means duplicating query logic when a PostgreSQL host is added. Working direction (not yet final): start host-specific on SQLite and extract common patterns later.
  • Cross-references: sqlite-host.md

OQ-05: Should *EdgeConstraints entries use Type.Ref or Type.String for allowed source/target types?

  • Origin: metagraph-module.md
  • Status: resolved
  • Priority: low
  • Resolution: Type.String() — the constraint arrays contain node type names, not node type schemas.
  • Cross-references: ADR-015

OQ-06: How does the graph pointer abstraction interact with the repository layer?

  • Origin: metagraph-module.md
  • Status: resolved
  • Priority: low
  • Resolution: For v1, repository functions use direct key-based addressing. Validate on read — if data doesn't match the Module entry, throw. Typed pointers are post-v1 (ADR-017).
  • Cross-references: ADR-017, forward-look.md
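
The validate-on-read rule can be sketched roughly as follows. This is a minimal illustration, not the repository API: `readValidated`, the in-memory `Map` store, and the `check` callback (a stand-in for validating against the Module entry) are all hypothetical.

```typescript
// Hypothetical sketch of key-based addressing with validation on read.
// `check` stands in for a schema check against the Module entry.
function readValidated<T>(
  store: Map<string, unknown>,
  key: string,
  check: (value: unknown) => value is T,
): T {
  if (!store.has(key)) {
    throw new Error(`no entry for key ${key}`);
  }
  const value = store.get(key);
  // Throw instead of returning data that does not match the expected shape.
  if (!check(value)) {
    throw new Error(`entry ${key} does not match its schema`);
  }
  return value;
}
```

Throwing on mismatch keeps malformed rows from leaking into callers, at the cost of one check per read.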

Theme 3: Encryption and Security

OQ-07: Should we add encryptRaw() for performance?

  • Origin: encrypted-data.md
  • Status: open
  • Priority: low
  • Notes: PBKDF2 derivation adds ~100ms per operation. For batch operations this adds up: rotating 1000 keys costs roughly 1000 × 100ms = 100 seconds in key derivation alone. An encryptRaw() that skips PBKDF2 would be much faster. Decision: add in a future iteration if performance demands it.

OQ-08: Should the key attribute on secret nodes be encrypted?

  • Origin: encrypted-data.md
  • Status: resolved
  • Priority: low
  • Resolution: Plaintext key names are acceptable for now. If secret names are sensitive, add a keyHash attribute for blind lookups.

OQ-09: Should secret nodes have lastUsedAt and expiresAt as first-class columns?

  • Origin: encrypted-data.md
  • Status: resolved
  • Priority: low
  • Resolution: For spoke use (occasional lookups), JSON attributes are fine. For hub use (high-throughput key validation), a standalone api_keys table with proper indexes is still needed.

Theme 4: Schema Evolution

OQ-10: Can Value.Diff Edit[] be reliably classified as breaking vs non-breaking?

  • Origin: schema-evolution.md
  • Status: open
  • Priority: high
  • Notes: The classification table in schema-evolution.md is theoretical. A POC should validate whether Edit[] output contains enough information to distinguish String → Literal("x") (narrowing, non-breaking) from String → Number (incompatible, breaking). Alternative: skip classification and just use Value.Check(newSchema, storedData) for verification.

OQ-11: Should the repository layer auto-migrate data on schema change, or require explicit consumer action?

  • Origin: schema-evolution.md
  • Status: open
  • Priority: high
  • Notes: Conditional on OQ-10 POC outcome. If classification is feasible, the repository layer auto-applies Value.Cast for non-breaking changes and requires explicit consumer action for breaking changes. If classification is not feasible, the repository layer auto-applies Value.Cast only when Value.Check(newSchema, storedData) passes for all stored data.
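
The fallback path (no classification, verify against stored data) might look like the sketch below. It assumes a hypothetical `checkNew` callback standing in for Value.Check(newSchema, storedData); `StoredRow`, `MigrationPlan`, and `planMigration` are illustrative names, not part of the package.

```typescript
// Hypothetical sketch: decide between auto-casting and explicit consumer action.
interface StoredRow {
  id: string;
  attributes: unknown;
}

type MigrationPlan =
  | { action: "auto-cast" }
  | { action: "explicit"; failingIds: string[] };

function planMigration(
  rows: StoredRow[],
  checkNew: (value: unknown) => boolean, // stand-in for a check against the new schema
): MigrationPlan {
  // Collect every stored row that the new schema would reject.
  const failingIds = rows.filter((row) => !checkNew(row.attributes)).map((row) => row.id);
  // Auto-apply only when all stored data already passes the new schema.
  return failingIds.length === 0
    ? { action: "auto-cast" }
    : { action: "explicit", failingIds };
}
```

Returning the failing IDs gives the consumer a concrete list to act on when the change cannot be applied automatically.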

OQ-12: How does schema evolution interact with the hub's event-sourced call graph?

  • Origin: schema-evolution.md
  • Status: open
  • Priority: medium
  • Notes: If the hub migrates to event-sourced replay (projector evolution), storage's call graph tables become disposable projections. But other graph types (ACL, tasks, secrets) may not have an event stream to replay from. The schema evolution design should work for both projections and direct-persisted data.

OQ-13: Should schema evolution events be part of the event stream?

  • Origin: schema-evolution.md
  • Status: open
  • Priority: low
  • Notes: Post-v1. For v1, schema changes are applied directly via the repository layer with version tracking.

Theme 5: Encrypted Data Scope

OQ-14: Should encryption be per-attribute, per-node, or per-graph?

  • Origin: overview.md
  • Status: resolved
  • Priority: high
  • Resolution: Per-attribute. The EncryptedData schema is a single attribute within a node type, not the entire node. This preserves queryability on non-sensitive fields (ADR-023).

OQ-15: Should key management be in this package?

  • Origin: overview.md
  • Status: resolved
  • Priority: high
  • Resolution: No. @alkdev/storage provides encryption/decryption primitives but NOT key management. The consuming application provides the key ring (ADR-026).

Theme 6: Repository Layer

OQ-16: Should the repository layer live in @alkdev/storage or in a consumer package?

  • Origin: overview.md
  • Status: resolved
  • Priority: high
  • Resolution: The repository CRUD layer (host-specific typed queries, schema validation before writes) belongs in @alkdev/storage. The operations bridging layer (generating OperationSpecs from metagraph schemas) belongs in a consumer or adapter package. These are separate concerns — CRUD is a storage concern; call protocol integration is an application concern.

Theme 7: Repository Layer Strategy

OQ-17: How should the repository layer handle attribute queries — JSON path, native columns, or dbtype-generated?

  • Origin: forward-look.md
  • Status: resolved (v1)
  • Priority: high
  • Resolution: For v1, attribute queries use JSON path extraction (json_extract on SQLite, ->> and #>> on PostgreSQL), and CRUD for static tables is hand-written. dbtype integration and the hybrid approach are post-v1; see ADR-033. Whether to adopt the hybrid approach (static tables via dbtype, dynamic attributes remain JSON) remains open for future iterations.
  • Cross-references: ADR-033, ADR-018, forward-look.md
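
As a rough illustration of the v1 approach, the helper below builds the extraction expression for each host. It is a sketch under assumptions: `jsonTextPath`, the `Dialect` type, and the `attributes` column name are hypothetical, and a real repository would more likely go through the query builder than through string fragments.

```typescript
// Hypothetical sketch: build a JSON path extraction expression per host.
type Dialect = "sqlite" | "pg";

function jsonTextPath(dialect: Dialect, column: string, path: string[]): string {
  // Accept only plain identifiers so the fragment cannot carry injected SQL.
  const ident = /^[A-Za-z_][A-Za-z0-9_]*$/;
  if (!ident.test(column) || path.length === 0 || !path.every((p) => ident.test(p))) {
    throw new Error("invalid column or path segment");
  }
  if (dialect === "sqlite") {
    // SQLite: json_extract(column, '$.a.b')
    return `json_extract(${column}, '$.${path.join(".")}')`;
  }
  // PostgreSQL: ->> for a single key, #>> with a path array for nested keys.
  return path.length === 1
    ? `${column} ->> '${path[0]}'`
    : `${column} #>> '{${path.join(",")}}'`;
}
```

For example, `jsonTextPath("pg", "attributes", ["owner", "name"])` produces `attributes #>> '{owner,name}'`. On SQLite, json_extract returns a typed SQL value for scalars, whereas ->> and #>> on PostgreSQL return text, so comparisons may need a cast on one side.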

OQ-18: Should the repository layer's CRUD operations be auto-generated (drizzle-graphql pattern) or hand-written?

  • Origin: forward-look.md
  • Status: resolved (v1)
  • Priority: medium
  • Resolution: For v1, hand-write CRUD functions with explicit signatures. The three long-term options (hand-written, auto-generated from Drizzle, auto-generated from dbtype) remain open for future iterations. See ADR-033.
  • Cross-references: ADR-033, OQ-17

OQ-19: Where does the storage-operations bridge package live in the @alkdev workspace?

  • Origin: forward-look.md
  • Status: open
  • Priority: medium
  • Notes: Four options: (1) hub-internal code, (2) dedicated @alkdev/storage-operations adapter, (3) from-storage adapter inside @alkdev/operations, (4) part of @alkdev/dbtype's from-dbtype adapter. Option 1 is the most immediate (no new package). Option 2 is the cleanest separation. Option 3 creates an undesirable dependency direction (operations → storage). Option 4 is the long-term goal if dbtype is adopted. The choice depends on OQ-17/OQ-18 resolution: if hand-written CRUD, the bridge is trivial and can live in the hub; if auto-generated from dbtype, the bridge naturally lives with dbtype.
  • Cross-references: OQ-16, OQ-17, ADR-033

Theme 8: Access Control

OQ-20: Should DelegatesEdge support temporary delegation with expiration?

  • Origin: acl.md
  • Status: open
  • Priority: low
  • Notes: Currently, DelegatesEdge has narrowedScopes and narrowedResources but no expiresAt. If delegation should be time-limited (e.g., "delegate for this session only" or "delegate for 24 hours"), an expiration attribute is needed. Session-scoped delegation could be modeled by creating/removing edges per session, avoiding the need for an expiresAt attribute. Time-based expiration adds complexity to the evaluator (checking edge validity at call time) but may be useful for non-session contexts.
  • Cross-references: ADR-036
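
If time-based expiration were adopted, the evaluator-side check could be as small as the sketch below. The `expiresAt` attribute does not exist in the current design; its name, its ISO 8601 string format, and the `isDelegationActive` helper are assumptions for illustration.

```typescript
// Hypothetical sketch: only narrowedScopes and narrowedResources exist today.
interface DelegatesEdgeAttrs {
  narrowedScopes: string[];
  narrowedResources: string[];
  expiresAt?: string; // assumed optional ISO 8601 timestamp
}

function isDelegationActive(edge: DelegatesEdgeAttrs, now: Date): boolean {
  // An edge without expiresAt never expires.
  if (edge.expiresAt === undefined) return true;
  const expiry = Date.parse(edge.expiresAt);
  // Fail closed: an unparseable timestamp is treated as already expired.
  if (Number.isNaN(expiry)) return false;
  return now.getTime() < expiry;
}
```

The check itself is cheap; the added evaluator complexity comes from having to apply it to every edge in a delegation chain at call time.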

OQ-21: Should the ACL evaluator live in @alkdev/storage or in the hub?

  • Origin: acl.md
  • Status: open
  • Priority: high
  • Notes: The ACL evaluator traverses delegation chains and computes effective scopes. Three options: (1) @alkdev/storage provides traversal primitives (walk edges, compute effective scopes for a principal given a graph instance) and the hub composes them with @alkdev/operations' enforceAccess. (2) The hub implements the evaluator from scratch, using storage's repository layer for graph queries. (3) A new @alkdev/acl package provides the evaluator, depending on both @alkdev/storage and @alkdev/operations. Option 1 keeps the dependency direction clean (storage doesn't depend on operations). Option 3 is the cleanest separation but adds a package. The choice depends on whether the evaluator is generic enough to be reusable across different hub implementations.
  • Cross-references: ADR-034, ADR-037
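
To make option 1 more concrete, a traversal primitive might look like the sketch below. It is heavily simplified and every name is hypothetical: it assumes each delegate has at most one incoming delegation, uses exact string equality instead of wildcard scope matching, and does not model the multi-parent aggregation rules described in acl.md.

```typescript
// Hypothetical sketch of a traversal primitive (single-parent chains only).
interface Delegation {
  from: string; // delegating principal id
  to: string; // delegate principal id
  narrowedScopes: string[];
}

function effectiveScopes(
  principalId: string,
  rootScopes: Map<string, string[]>, // scopes granted directly, outside delegation
  delegations: Delegation[],
  seen: Set<string> = new Set(),
): string[] {
  const direct = rootScopes.get(principalId);
  if (direct !== undefined) return direct;
  // Cycle guard: a principal revisited during the walk contributes nothing.
  if (seen.has(principalId)) return [];
  seen.add(principalId);
  const edge = delegations.find((d) => d.to === principalId);
  if (edge === undefined) return [];
  const parent = new Set(effectiveScopes(edge.from, rootScopes, delegations, seen));
  // No escalation: a delegate keeps only scopes its delegator effectively holds.
  return edge.narrowedScopes.filter((scope) => parent.has(scope));
}
```

A function of this shape depends only on graph data, so it could live in storage without pulling in @alkdev/operations; the hub would still be responsible for calling enforceAccess with the result.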

OQ-22: How are ACL graph instances created and managed?

  • Origin: acl.md
  • Status: open
  • Priority: medium
  • Notes: Several options: (1) One global ACL graph instance per hub. Simple but means all orgs share a single graph — large graphs may have traversal performance implications. (2) One ACL graph instance per org. Isolated, each org's permissions are self-contained. Requires cross-org delegation to span graphs. (3) One ACL graph instance per "scoping context" (e.g., per spoke context). Most granular but most complex. The choice depends on whether delegation crosses org boundaries (if a user delegates to an agent in another org's context, graphs must be traversable across instances).
  • Cross-references: ADR-037

OQ-23: Should BelongsToEdge be derived (materialized from organization_members) or primary (ACL graph is the source of truth)?

  • Origin: acl.md
  • Status: open
  • Priority: medium
  • Notes: The hub already has an organization_members table with membershipLevel. If BelongsToEdge is derived, the hub writes both organization_members rows and ACL graph edges when membership changes, keeping them in sync. If BelongsToEdge is primary, the ACL graph is the source of truth and the hub reads org membership from the graph. Derived is consistent with the hub's existing identity tables being authoritative. Primary means the ACL graph replaces org membership data, requiring graph queries for simple membership lookups. Lean toward derived — the hub's identity tables are authoritative for authentication, the ACL graph is authoritative for authorization.
  • Cross-references: ADR-034

OQ-24: How does identityId reference hub entities without creating a package dependency?

  • Origin: acl.md
  • Status: open
  • Priority: medium
  • Notes: PrincipalNode.identityId references an account, organization, or role in the hub's database, but @alkdev/storage must not depend on @alkdev/operations or the hub. The identityId is a string, not a FK. This is consistent with ADR-020 (no nodeTypeId on nodes) — the metagraph pattern stores node attributes without assuming external referential integrity. Options: (1) Logical references (current design) — identityId is a string that the hub resolves. (2) Convention-based references — a URI scheme like alk://account/user-1 or alk://org/acme that encodes the entity type and ID. (3) A shared types package that both storage and hub import. Option 1 is the simplest and consistent with the existing pattern. The burden of referential integrity falls on the consumer (the hub), not on storage.
  • Cross-references: ADR-020, ADR-034
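
Option 2 could be sketched as a small parser and formatter for the URI convention. This is illustrative only: the `alk://` examples come from the note above, but the `role` kind segment, the `IdentityRef` type, and both function names are assumptions.

```typescript
// Hypothetical sketch of convention-based identity references.
type IdentityKind = "account" | "org" | "role"; // "role" segment is assumed

interface IdentityRef {
  kind: IdentityKind;
  id: string;
}

const IDENTITY_KINDS: readonly string[] = ["account", "org", "role"];

function parseIdentityRef(identityId: string): IdentityRef | null {
  const match = /^alk:\/\/([a-z]+)\/([^/]+)$/.exec(identityId);
  if (match === null) return null;
  const kind = match[1];
  const id = match[2];
  if (kind === undefined || id === undefined) return null;
  // Reject kinds outside the known set rather than guessing.
  if (!IDENTITY_KINDS.includes(kind)) return null;
  return { kind: kind as IdentityKind, id };
}

function formatIdentityRef(ref: IdentityRef): string {
  return `alk://${ref.kind}/${ref.id}`;
}
```

Storage would still treat identityId as an opaque string under this scheme; only the hub (or a shared utility) needs to parse it, so no package dependency is introduced.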

OQ-25: What are the scope string semantics for subset validation?

  • Origin: acl.md
  • Status: open
  • Priority: high
  • Notes: narrowedScopes ⊆ effectiveScopes is the no-escalation invariant, but the semantics of this subset check depend on how scope strings are matched. @alkdev/operations uses keypal's scope model (colon-separated hierarchical segments, with a * wildcard for suffix matching), so "dev:*" matches "dev:read", "dev:write", "dev:fs:read", etc. The ACL evaluator must use the same semantics or delegation validation will be inconsistent with runtime access checks. Option: import scope matching logic from @alkdev/operations or extract it to a shared utility. The ACL graph stores scopes as plain strings; matching is an evaluator concern, not a storage concern.
  • Cross-references: ADR-036, /workspace/@alkdev/operations/src/access.ts
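
The subset check under wildcard semantics can be sketched as follows. This is a minimal sketch, assuming colon-separated segments and a trailing `*` that matches any suffix; `scopeMatches` and `isScopeSubset` are hypothetical names and do not reproduce the actual matching logic in @alkdev/operations or keypal.

```typescript
// Hypothetical sketch; assumes colon-separated scopes with a trailing "*" wildcard.

/** True when the granted scope covers the requested scope. */
function scopeMatches(granted: string, requested: string): boolean {
  if (granted === "*" || granted === requested) return true;
  if (granted.endsWith(":*")) {
    // Keep the trailing colon ("dev:") so "dev:*" does not match "devops:read".
    const prefix = granted.slice(0, -1);
    return requested.startsWith(prefix);
  }
  return false;
}

/** No-escalation check: every narrowed scope must be covered by some effective scope. */
function isScopeSubset(narrowed: string[], effective: string[]): boolean {
  return narrowed.every((n) => effective.some((e) => scopeMatches(e, n)));
}
```

Under these assumptions, `isScopeSubset(["dev:fs:read"], ["dev:*"])` holds, while `isScopeSubset(["dev:*"], ["dev:read"])` does not, because a wildcard cannot be delegated from a single concrete scope. A plain set-inclusion test on the strings would get the first case wrong, which is why the evaluator needs the same matcher as the runtime access check.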