storage/docs/architecture/overview.md
glm-5.1 a2ee452a63 Add repository layer strategy: JSON path queries, CRUD decisions, ecosystem integration
Add three open questions (OQ-17, OQ-18, OQ-19) covering attribute query
strategy, CRUD generation approach, and storage-operations bridge placement.
Create ADR-033 recording the v1 decision: JSON path queries for attributes
with hand-written CRUD for static tables.

Expand forward-look.md with Repository Layer Strategy section analyzing
three approaches (JSON path, native columns via dbtype, hybrid) and their
implications for the metagraph pattern. Add drizzle-graphql and dbtype
from-dbtype comparison showing neither handles dynamic schema-as-data.

Update overview.md with dbtype/ujsx in the dependency diagram, expanded
ecosystem context in the bridging pattern section, and new open questions.

Align open-questions.md: resolve OQ-17 and OQ-18 for v1 (ADR-033), add
OQ-19 as open, update summary counts and ADR impact table.
2026-05-30 11:02:49 +00:00


---
status: reviewed
last_updated: 2026-05-30
---
# @alkdev/storage — Overview
Typed graph storage designed for two database hosts (SQLite implemented, PostgreSQL planned). Deno-first, published via JSR.
## Purpose
`@alkdev/storage` provides a **metagraph** storage model: graph types define
schemas, node types define data shapes within those graphs, and edge types
define typed relationships. Instances of these type definitions become actual
graphs populated with nodes and edges.
This pattern replaces domain-specific table proliferation with a small number of
general-purpose tables that can model anything — call graphs, ACL rules, task
dependencies, encrypted secrets — while enforcing schema integrity through
TypeBox validation.
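
The type/instance split described above can be illustrated with a small, dependency-free sketch. It is not the package's API: the real definitions are TypeBox Modules, and every name here (`GraphTypeDef`, `NodeTypeDef`, `EdgeTypeDef`, `checkNode`, the `operationId` and `durationMs` attributes) is hypothetical.

```typescript
// Hypothetical sketch of the metagraph idea; the real package expresses
// these shapes as TypeBox Modules rather than plain interfaces.
type AttrKind = "string" | "number" | "boolean";

interface NodeTypeDef {
  name: string;
  attributes: Record<string, AttrKind>;
}

interface EdgeTypeDef {
  name: string;
  source: string; // node type allowed as the edge source
  target: string; // node type allowed as the edge target
}

interface GraphTypeDef {
  name: string;
  directed: boolean;
  nodeTypes: NodeTypeDef[];
  edgeTypes: EdgeTypeDef[];
}

interface NodeInstance {
  id: string;
  type: string;
  attributes: Record<string, unknown>;
}

// One graph type definition; many graph instances can conform to it.
const callGraphType: GraphTypeDef = {
  name: "call-graph",
  directed: true,
  nodeTypes: [
    { name: "call", attributes: { operationId: "string", durationMs: "number" } },
  ],
  edgeTypes: [{ name: "triggered", source: "call", target: "call" }],
};

// Returns a list of problems; an empty list means the node conforms.
function checkNode(graphType: GraphTypeDef, node: NodeInstance): string[] {
  const def = graphType.nodeTypes.find((t) => t.name === node.type);
  if (!def) return [`unknown node type "${node.type}"`];
  const problems: string[] = [];
  for (const [key, kind] of Object.entries(def.attributes)) {
    if (typeof node.attributes[key] !== kind) {
      problems.push(`attribute "${key}" must be a ${kind}`);
    }
  }
  return problems;
}

const conformingNode: NodeInstance = {
  id: "n1",
  type: "call",
  attributes: { operationId: "fs.read", durationMs: 12 },
};
const malformedNode: NodeInstance = {
  id: "n2",
  type: "call",
  attributes: { operationId: 42 },
};
```

Adding a new domain in this model means adding another `GraphTypeDef` value, not another set of tables.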
The package evolved from `@ade/ade-v0/packages/core/graphs` and
`@ade/ade-v0/packages/storage_sqlite`, simplified and refactored for the @alkdev
ecosystem.
## Architecture
```
@alkdev/storage/
├── mod.ts                        → re-exports graphs/ (zero db deps)
├── src/
│   ├── graphs/                   → Metagraph Module, bridge functions, crypto (no db deps)
│   │   ├── modules/              → TypeBox Module definitions
│   │   │   ├── metagraph.ts      → Config, BaseNode, BaseEdge
│   │   │   ├── call-graph.ts     → CallGraph reference Module
│   │   │   ├── secret-graph.ts   → SecretGraph reference Module
│   │   │   └── index.ts          → barrel re-export
│   │   ├── bridge.ts             → moduleToDbSchema, validateNode, validateEdge
│   │   ├── crypto.ts             → encrypt, decrypt, generateEncryptionKey, EncryptedDataSchema
│   │   └── mod.ts                → re-exports all graphs exports
│   ├── sqlite/                   → SQLite host (drizzle-orm/libsql)
│   │   ├── tables/               → drizzle table definitions
│   │   ├── relations.ts          → drizzle relational mappings
│   │   ├── schema.ts             → barrel re-export
│   │   └── client.ts             → injectable createSqliteDatabase()
│   └── pg/                       → PostgreSQL host (NOT YET IMPLEMENTED)
└── test/
    └── reference-modules.test.ts → Metagraph, bridge, crypto tests
```
### Subpath Exports (JSR/npm)
| Export | Contents | Dependencies |
| ------------------------ | --------------------------------------- | --------------------------------------- |
| `@alkdev/storage` | Graph schema types, Metagraph Module | `@alkdev/typebox` |
| `@alkdev/storage/graphs` | Same as `.` — alias for the main export | Same as `.` |
| `@alkdev/storage/sqlite` | SQLite tables, relations, client | `@alkdev/drizzlebox`, `drizzle-orm`, `@libsql/client` |
| `@alkdev/storage/pg` | PostgreSQL tables, relations, client | ⚠️ NOT YET IMPLEMENTED |
The `./graphs` subpath exists because the source code lives in `src/graphs/` and
the main `mod.ts` re-exports it. Importing from either `@alkdev/storage` or
`@alkdev/storage/graphs` yields the same types and Metagraph Module.
## Terminology
| Term | Definition |
| ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Metagraph** | A type system where graph types define schemas, node types define data shapes within those graphs, and edge types define typed relationships. Graph instances are concrete data conforming to these type definitions. |
| **Hub** | The central service in the hub-spoke architecture. A consumer of `@alkdev/storage` that uses the PostgreSQL host for persistent graph storage. The hub also depends on `@alkdev/operations`, `@alkdev/pubsub`, `@alkdev/flowgraph`, and `@alkdev/taskgraph`. |
| **Spoke** | A local/embedded instance that runs per-project or per-session. A consumer of `@alkdev/storage` — uses the SQLite host for local graph storage. |
| **Graph type** | A class of graphs (e.g., "call-graph", "acl"). Defines structural constraints (directed/undirected/mixed, multi-edges, self-loops) and the valid node/edge type vocabularies. Stored in the `graph_types` table. |
| **Node type** | A category of node within a graph type. Defines the attribute schema for nodes of that type. Stored in the `node_types` table. |
| **Edge type** | A category of edge within a graph type. Defines the attribute schema and optionally restricts which node types can be source/target. Stored in the `edge_types` table. |
| **Graph instance** | A concrete graph belonging to a graph type. Contains nodes and edges conforming to its type definitions. Stored in the `graphs` table. |
| **Consumer** | Code that imports `@alkdev/storage` (or a subpath) to define graph types and persist graph data. The hub, spokes, and other @alkdev packages are consumers. |
| **Repository layer** | ⚠️ Not yet implemented. The typed CRUD functions (insert, find, update, delete) that sit between consumer code and raw Drizzle queries. Performs schema validation before writes. No dependency on `@alkdev/operations` — the consumer wires CRUD into the registry. |
| **Validation boundary** | The line where schema validation is enforced. In this package, validation happens in the Metagraph Module (at type definition time) and the repository layer (at mutation time), NOT in the database. |
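
To make the validation boundary concrete, here is a minimal sketch of a repository function that validates before it writes. The repository layer is not implemented yet, so this is only an illustration: `createNodeRepository`, `Validator`, `InsertResult`, and the `operationId` check are hypothetical, and a `Map` stands in for the database.

```typescript
// Hypothetical sketch: validation is enforced by the repository function,
// not by database constraints. A Map stands in for the database table.
type Validator = (attributes: unknown) => string[];

interface NodeRow {
  id: string;
  nodeType: string;
  attributes: unknown;
}

type InsertResult =
  | { ok: true; row: NodeRow }
  | { ok: false; problems: string[] };

function createNodeRepository(validators: Map<string, Validator>) {
  const rows = new Map<string, NodeRow>();
  return {
    insert(row: NodeRow): InsertResult {
      const validate = validators.get(row.nodeType);
      if (!validate) {
        return { ok: false, problems: [`unknown node type "${row.nodeType}"`] };
      }
      const problems = validate(row.attributes);
      if (problems.length > 0) return { ok: false, problems };
      rows.set(row.id, row); // the write happens only after validation passes
      return { ok: true, row };
    },
    find(id: string): NodeRow | undefined {
      return rows.get(id);
    },
  };
}

// Example validator for a hypothetical "call" node type.
const requireOperationId: Validator = (attributes) => {
  const value = (attributes as { operationId?: unknown } | null)?.operationId;
  return typeof value === "string" ? [] : ['"operationId" must be a string'];
};

const nodes = createNodeRepository(new Map([["call", requireOperationId]]));
```

Because the store never sees a rejected row, the database schema can stay generic while the type definitions carry the constraints.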
## Design Decisions
All design decisions are documented as ADRs in [decisions/](decisions/).
| ADR | Decision | Summary |
|-----|----------|---------|
| [001](decisions/001-deno-first-jsr-publishes.md) | Deno-first, JSR publishes | Published to JSR; npm comes free via `@jsr/alkdev__storage` |
| [002](decisions/002-metagraph-over-domain-tables.md) | Metagraph over domain-specific tables | 6 general-purpose tables serve all domains |
| [003](decisions/003-typebox-module-as-api-surface.md) | TypeBox Module as API surface | `Type.Module()` replaces `SchemaBuilder`; `Metagraph.Import()` + `Type.Composite()` |
| [004](decisions/004-injectable-clients-no-side-effects.md) | Injectable clients, no side effects | `createSqliteDatabase(client)` takes a pre-created client |
| [005](decisions/005-drizzle-plus-typebox-via-drizzlebox.md) | Drizzle + TypeBox via drizzlebox | Drizzle tables are single source of truth; drizzlebox generates TypeBox schemas |
| [006](decisions/006-enum-pattern-as-const-objects.md) | `as const` objects, not TypeScript enums | Avoids JSR slow-types; consistent pattern across codebase |
| [007](decisions/007-no-comments-in-code.md) | No comments in code | Documentation lives in architecture docs and TypeBox descriptions |
| [008](decisions/008-common-columns-pattern.md) | Common columns pattern | `id`, `metadata`, `createdAt`, `updatedAt` on every table |
| [033](decisions/033-json-path-queries-for-v1.md) | JSON path queries and hand-written CRUD for v1 | Attribute queries use JSON path; CRUD is hand-written; dbtype and auto-generation are post-v1 |
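
Two of these decisions are easy to show in code. The sketch below illustrates the `as const` pattern (ADR-006) and an injectable client factory (ADR-004) under stated assumptions: `GraphDirection`, `SqlClient`, and `createDatabase` are hypothetical stand-ins, not the package's actual exports or the real `createSqliteDatabase` signature.

```typescript
// ADR-006 style: an `as const` object plus a derived union type,
// used instead of a TypeScript `enum`.
const GraphDirection = {
  Directed: "directed",
  Undirected: "undirected",
  Mixed: "mixed",
} as const;

type GraphDirection = (typeof GraphDirection)[keyof typeof GraphDirection];

function isGraphDirection(value: string): value is GraphDirection {
  return (Object.values(GraphDirection) as string[]).includes(value);
}

// ADR-004 style: the factory receives a pre-created client and performs
// no I/O of its own. `SqlClient` is a hypothetical minimal interface.
interface SqlClient {
  execute(sql: string): string;
}

function createDatabase(client: SqlClient) {
  return {
    listGraphTypes: (): string => client.execute("SELECT * FROM graph_types"),
  };
}

// A fake client is enough for tests because nothing connects at import time.
const executed: string[] = [];
const fakeClient: SqlClient = {
  execute(sql) {
    executed.push(sql);
    return "ok";
  },
};
const db = createDatabase(fakeClient);
```

Creating `db` runs no query; the first statement reaches the client only when `db.listGraphTypes()` is called.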
## Dependencies
| Package | Purpose | Layer |
| -------------------- | ------------------------------------ | ------------------------ |
| `@alkdev/typebox` | Runtime schema validation | graphs/ |
| `@alkdev/drizzlebox` | Generate TypeBox from Drizzle tables | sqlite/ |
| `drizzle-orm` | ORM, table definitions, queries | sqlite/ (and future pg/) |
| `@libsql/client` | SQLite client (libsql/turso) | sqlite/ |
| `postgres` | PostgreSQL client | pg/ (not yet used) |
`@alkdev/typebox` and `@alkdev/drizzlebox` are published to npm (not yet to
JSR). JSR packages can depend on npm packages directly.
**Ecosystem packages are not runtime dependencies of `@alkdev/storage`.** All
ecosystem references in this document describe consumer-side data shapes and
integration patterns, not import dependencies. The `@alkdev/operations`,
`@alkdev/pubsub`, `@alkdev/flowgraph`, and `@alkdev/taskgraph` packages are
consumed by the hub and spokes, not by storage itself.
## What Exists vs. What's Needed
### Implemented
- Metagraph Module (`Type.Module` with Config, BaseNode, BaseEdge entries)
- Bridge functions (`moduleToDbSchema`, `validateNode`, `validateEdge`)
- Reference graph type Modules (CallGraph, SecretGraph)
- Crypto utility (AES-256-GCM + PBKDF2, `EncryptedDataSchema`)
- SQLite host: 6 metagraph tables + actors table + Drizzle relations + client
factory
- TypeBox select/insert schemas generated from Drizzle tables (drizzlebox)
- Reference module tests (bridge functions, validation, Module composition)
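
For readers unfamiliar with the AES-256-GCM + PBKDF2 combination listed above, the following sketch shows the general technique. It is not the package's `crypto.ts`: it uses Node's synchronous `node:crypto` API for brevity (the package may well use Web Crypto), and `EncryptedData`, `encryptText`, `decryptText`, the iteration count, and the salt size are all assumptions.

```typescript
import { Buffer } from "node:buffer";
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

// Hypothetical envelope shape; all fields are base64-encoded.
interface EncryptedData {
  salt: string;
  iv: string;
  tag: string;
  ciphertext: string;
}

const ITERATIONS = 100000; // assumption: choose per your own security policy

// PBKDF2 stretches the passphrase into a 32-byte (256-bit) AES key.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, ITERATIONS, 32, "sha256");
}

function encryptText(plaintext: string, passphrase: string): EncryptedData {
  const salt = randomBytes(16);
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

function decryptText(data: EncryptedData, passphrase: string): string {
  const key = deriveKey(passphrase, Buffer.from(data.salt, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(data.iv, "base64"));
  decipher.setAuthTag(Buffer.from(data.tag, "base64"));
  // `final()` throws if the key is wrong or the ciphertext was tampered with.
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(data.ciphertext, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

GCM authenticates as well as encrypts, so a wrong passphrase or a modified ciphertext fails loudly instead of yielding garbage.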
### Not Yet Implemented
| Gap | Priority | Notes |
| ----------------------------------------- | ------------ | --------------------------------------------------------------------------------------------------- |
| Repository/CRUD layer | High | ⚠️ Not yet implemented. Typed insert, find, update, delete functions for graphs, nodes, edges. No dependency on `@alkdev/operations` — consumer wires CRUD into registry. |
| PostgreSQL host | Medium | Same table shapes, `pgTable` + `jsonb` + `timestamp` + `pgEnum`. Stub only. |
| ACL graph type | Medium | Access control as a graph. Informed by `@alkdev/operations`' `Identity` and `AccessControl`. Depends on CRUD layer. |
| Task graph type | Low | Informed by `@alkdev/taskgraph`'s `TaskGraphNodeAttributes` and `DependencyEdge` schemas. |
| Graphology bridge | Low | `moduleToGraphology()` and `fromGraphologyExport()` — Phase 4 of the metagraph implementation path. |
## Ecosystem Integration
`@alkdev/storage` is a **data layer package** consumed by other packages in the
@alkdev ecosystem. It does not depend on the hub — the dependency flows the
other way. The hub consumes storage (along with operations, pubsub, flowgraph,
and taskgraph) as part of its architecture.
### Dependency Direction
```
@alkdev/pubsub      ← transport only (no storage dependency)
@alkdev/operations  ← call protocol, registry, identity, access control
    ↑ (depends on: @alkdev/pubsub, @alkdev/typebox)
@alkdev/flowgraph   ← call graph schema, operation graph, workflow templates
    ↑ (depends on: @alkdev/operations [peer], @alkdev/typebox)
@alkdev/taskgraph   ← task dependency graph schema, cost-benefit analysis
      (depends on: @alkdev/typebox)
@alkdev/dbtype      ← schema-first multi-dialect DB type system (Phase 0, not yet implemented)
      (depends on: @alkdev/typebox, @alkdev/ujsx)
      Renders UJSX element trees to Drizzle dialects; future: from-dbtype
      adapter generates CRUD OperationSpecs for @alkdev/operations
@alkdev/storage     ← YOU ARE HERE: typed graph persistence
      (depends on: @alkdev/typebox, @alkdev/drizzlebox)
        ↑                     ↑
        |                     |
   Hub / Spoke         Any consumer that needs
   (consumes all)      persistent graph storage
```
The key insight: `@alkdev/storage` provides the **persistence primitives**
(schemas, tables, and the planned repository layer). The **domain semantics** (what a call graph
means, what identity looks like, how access control works) are defined by the
packages above. Storage stores the shapes those packages define; it does not
define the semantics itself.
### What Comes from Where
| Concept | Source package | Storage's role |
|---------|---------------|----------------|
| Call protocol events (`call.requested`, `call.responded`, etc.) | `@alkdev/operations` | Storage persists the outcomes — graphs with `CallNodeAttrs` nodes |
| Identity (`id`, `scopes`, `resources`) | `@alkdev/operations` | Storage stores identity as node attributes; `Identity` is a data shape, not a storage concept |
| Access control (`AccessControl`, `requiredScopes`) | `@alkdev/operations` | Storage's ACL graph type mirrors the operations `AccessControl` schema as graph structure |
| Call graph schema (`CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus`) | `@alkdev/flowgraph` | Storage persists these in-memory shapes to the database |
| Task graph schema (`TaskGraphNodeAttributes`, `DependencyEdge`) | `@alkdev/taskgraph` | Storage persists task dependency shapes |
| Event transport (`TypedEventTarget`, `EventEnvelope`) | `@alkdev/pubsub` | Storage is not involved in event routing; it stores the events' outcomes |
| Database schema rendering (`<table>`, `<column>`, HostConfig) | `@alkdev/dbtype` | Storage's static metagraph tables could be dbtype-rendered in the future (OQ-17, OQ-18) |
| Universal IR (`h()`, `createComponent`, `createRoot`) | `@alkdev/ujsx` | Storage's `Type.Module` format is structurally compatible with ujsx rendering; no runtime dependency |
### Repository Layer Bridging Pattern (Consumer-Side Concern)
The repository layer in `@alkdev/storage` (not yet implemented) is intended to
provide typed CRUD with no `@alkdev/operations` dependency. A **consumer-side**
bridging module can then wire these CRUD functions into the
`@alkdev/operations` registry, analogous to how `drizzle-graphql`
auto-generates a GraphQL schema from Drizzle tables, but using operations
(queries, mutations, subscriptions) instead of GraphQL resolvers. This works
because:
1. `@alkdev/operations` already maps closely to GraphQL's
queries/mutations/subscriptions (it was modeled after that pattern)
2. `@alkdev/pubsub` provides the subscription transport (forked from
graphql-yoga's pubsub with additions like in-memory, Redis, WebSocket,
WebWorker event targets)
3. `@alkdev/storage`'s metagraph tables are the data source, analogous to
Drizzle tables for drizzle-graphql
The bridging module would live in a consumer package (e.g., the hub or a
dedicated `@alkdev/storage-operations` adapter), not in `@alkdev/storage` itself,
to avoid circular dependencies:
```
@alkdev/storage → defines types + tables (no operations dependency)
@alkdev/operations → defines call protocol + registry (no storage dependency)
Consumer (hub / adapter) → imports both, generates operations from schemas
```
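
The three roles in the diagram above can be sketched in one file. Nothing here is the real API of either package: `createGraphRepository`, `createRegistry`, `bridgeGraphCrud`, the `graphs.insert` / `graphs.find` operation names, and this `OperationSpec` shape are hypothetical stand-ins chosen to show the dependency direction.

```typescript
// --- stand-in for @alkdev/storage: typed CRUD, no operations dependency ---
interface GraphRecord {
  id: string;
  graphType: string;
}

function createGraphRepository() {
  const rows = new Map<string, GraphRecord>();
  return {
    insert(record: GraphRecord): GraphRecord {
      rows.set(record.id, record);
      return record;
    },
    find(id: string): GraphRecord | undefined {
      return rows.get(id);
    },
  };
}
type GraphRepository = ReturnType<typeof createGraphRepository>;

// --- stand-in for @alkdev/operations: registry, no storage dependency ---
interface OperationSpec {
  name: string;
  kind: "query" | "mutation";
  handler: (input: unknown) => unknown;
}

function createRegistry() {
  const specs = new Map<string, OperationSpec>();
  return {
    register(spec: OperationSpec): void {
      specs.set(spec.name, spec);
    },
    call(name: string, input: unknown): unknown {
      const spec = specs.get(name);
      if (!spec) throw new Error(`unknown operation "${name}"`);
      return spec.handler(input);
    },
    names(): string[] {
      return Array.from(specs.keys());
    },
  };
}

// --- consumer (hub / adapter): the only code that knows about both ---
function bridgeGraphCrud(repo: GraphRepository): OperationSpec[] {
  return [
    {
      name: "graphs.insert",
      kind: "mutation",
      handler: (input) => repo.insert(input as GraphRecord),
    },
    {
      name: "graphs.find",
      kind: "query",
      handler: (input) => repo.find(input as string),
    },
  ];
}

const registry = createRegistry();
for (const spec of bridgeGraphCrud(createGraphRepository())) {
  registry.register(spec);
}
```

A real bridge would validate `input` against a schema instead of casting; the casts are kept only to keep the sketch short.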
#### Ecosystem Context
The question of *where* this bridge lives and *how* it's generated connects to
the broader ecosystem:
- **drizzle-graphql** (`/workspace/drizzle-graphql`): Auto-generates GraphQL
CRUD from Drizzle tables. The reference pattern for "database schema → API
surface." Produces `{ schema, entities }` from `buildSchema(db)`. No TypeBox,
no metagraph.
- **@alkdev/dbtype**: Schema-first multi-dialect system using ujsx element trees.
Defines `<table>`, `<column>` elements rendered to Drizzle via HostConfig. Has
a designed `from-dbtype` adapter that generates `OperationSpec[]` from element
trees + Type.Module bundles. Phase 0 (architecture only, no implementation).
- **@alkdev/operations**: Runtime-agnostic typed operations registry with
adapters (`FromOpenAPI`, `from_mcp`, `from_typemap`) that generate
`OperationSpec[]` from external specifications. The `from-dbtype` adapter would
be another adapter in the same pattern.
The strategic question (OQ-17, OQ-18) is whether storage's repository CRUD
operations should be hand-written, auto-generated from Drizzle schemas, or
auto-generated from dbtype element trees once dbtype is implemented. For v1,
hand-written CRUD is the simplest path and doesn't block any long-term option.
See [forward-look.md](forward-look.md) for the full analysis.
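
As a rough picture of what hand-written CRUD with JSON path queries could look like, the sketch below builds a parameterized SQLite query and includes an in-memory reader for the same path syntax. `json_extract` is SQLite's JSON function; the table and column names (`nodes`, `graph_id`, `attributes`) and the helpers `findNodesByAttribute` and `readPath` are assumptions for illustration, not the package's schema or API.

```typescript
interface SqlQuery {
  text: string;
  params: (string | number)[];
}

// Accept only simple dotted paths such as "$.status" or "$.result.code".
const SIMPLE_JSON_PATH = /^\$(\.[A-Za-z_][A-Za-z0-9_]*)+$/;

function assertSimplePath(path: string): void {
  if (!SIMPLE_JSON_PATH.test(path)) {
    throw new Error(`unsupported JSON path: ${path}`);
  }
}

// Hand-written query: the path and value are bound as parameters,
// so caller input is never concatenated into the SQL text.
function findNodesByAttribute(
  graphId: string,
  path: string,
  value: string | number,
): SqlQuery {
  assertSimplePath(path);
  return {
    text: "SELECT * FROM nodes WHERE graph_id = ? AND json_extract(attributes, ?) = ?",
    params: [graphId, path, value],
  };
}

// In-memory equivalent, useful for checking what a path selects.
function readPath(attributes: unknown, path: string): unknown {
  assertSimplePath(path);
  let current: unknown = attributes;
  for (const key of path.slice(2).split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const completedCalls = findNodesByAttribute("g1", "$.status", "completed");
```

The trade-off behind this approach is that attribute schemas can change without migrations, at the cost of the indexing and type guarantees that native columns would give. On PostgreSQL the comparable expression would use `jsonb` operators such as `->>` rather than `json_extract`.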
### Avoiding Circular Dependencies
`@alkdev/storage` and `@alkdev/operations` should not depend on each other
directly. Storage defines the schema types and database tables; operations
defines the call protocol and execution model. The consumer (hub, spoke, or
adapter package) imports both and bridges them. This preserves the
single-responsibility principle and allows each package to evolve independently.
If shared type definitions are needed (e.g., `Identity` referenced in both
storage node attributes and operations call events), they should either:
1. Be duplicated in each package with a documented correspondence (acceptable
for small, stable types)
2. Be extracted to a minimal shared types package if the duplication becomes
burdensome
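
For option 1, TypeScript's structural typing allows the correspondence to be checked at compile time in the consumer, which already imports both packages. The sketch below is an assumption-laden illustration: the field names `id`, `scopes`, and `resources` come from the "What Comes from Where" table, while the field types and the names `OperationsIdentity`, `StoredIdentityAttrs`, and `MutuallyAssignable` are hypothetical.

```typescript
// As the shape might be declared on the operations side.
interface OperationsIdentity {
  id: string;
  scopes: string[];
  resources: Record<string, string[]>;
}

// As the shape might be duplicated on the storage side (node attributes).
interface StoredIdentityAttrs {
  id: string;
  scopes: string[];
  resources: Record<string, string[]>;
}

// Resolves to `true` only when each type is assignable to the other,
// so drift in either copy turns the assignment below into a type error.
type MutuallyAssignable<A, B> = [A] extends [B]
  ? ([B] extends [A] ? true : false)
  : false;

const identityShapesMatch: MutuallyAssignable<OperationsIdentity, StoredIdentityAttrs> = true;

// Structural typing lets the consumer pass one shape where the other is expected.
function toNodeAttributes(identity: OperationsIdentity): StoredIdentityAttrs {
  return identity;
}
```

Placing the check in the consumer keeps both packages free of any import on each other while still catching divergence at build time.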
## Open Questions
Open questions are tracked in [open-questions.md](open-questions.md). Key
questions affecting this package:
- **OQ-03**: Should actors be a node type or a standalone table? (open, deferred to ACL design)
- **OQ-04**: Should the repository layer be host-specific or host-agnostic? (open, start host-specific)
- **OQ-14**: Should encryption be per-attribute, per-node, or per-graph? (resolved: per-attribute)
- **OQ-15**: Should key management be in this package? (resolved: no, application provides key ring)
- **OQ-16**: Should the repository layer live in storage or a consumer package? (resolved: CRUD in storage, operations bridging in consumer)
- **OQ-17**: How should the repository layer handle attribute queries: JSON path, native columns, or dbtype-generated? (resolved for v1 by ADR-033: JSON path; long-term approach open)
- **OQ-18**: Should CRUD operations be auto-generated or hand-written? (resolved for v1 by ADR-033: hand-written; auto-generation is post-v1)
- **OQ-19**: Where does the storage-operations bridge package live? (open, depends on OQ-17/OQ-18)
## References
- Metagraph Module evolution: [metagraph-module.md](./metagraph-module.md)
- Schema evolution via TypeBox value system: [schema-evolution.md](./schema-evolution.md)
- Forward-looking connections: [forward-look.md](./forward-look.md)
- Operations architecture: `/workspace/@alkdev/operations/docs/architecture/README.md`
- Pubsub architecture: `/workspace/@alkdev/pubsub/docs/architecture/README.md`
- Flowgraph architecture: `/workspace/@alkdev/flowgraph/docs/architecture/README.md`
- Taskgraph architecture: `/workspace/@alkdev/taskgraph_ts/docs/architecture/README.md`
- drizzle-graphql (reference for repo bridging pattern): `/workspace/drizzle-graphql/`
- Source heritage: `@ade/ade-v0/packages/core/graphs` and
`@ade/ade-v0/packages/storage_sqlite`
- Drizzle ORM: https://orm.drizzle.team/
- TypeBox: `/workspace/@alkdev/typebox/`
- JSR: https://jsr.io/