Reorient @alkdev/storage around a single SQLite database host with Honker
for pub/sub, event streams, and task queues. PostgreSQL is removed as a
target (ADR-038), eliminating dual schema maintenance and infrastructure
complexity. Honker provides DB + pubsub + queues in one .db file (ADR-039).
Add system/tenant DB model (ADR-040): identity tables in system.db, all
graph data in tenant-{orgId}.db files. Identity tables move from the hub
into storage (ADR-041). Scoping columns (ownerId, projectId) added to
graphs table (ADR-042). Graph types get scope (system/tenant/user) to
protect infrastructure schemas (ADR-043).
Define Drizzle-Honker session adapter (ADR-044): ~100-line adapter enabling
Drizzle typed queries and Honker pubsub/queue on a single connection with
transactional consistency.
Resolve OQ-03, OQ-04, OQ-19, OQ-21, OQ-22, OQ-23, OQ-24. Add new
open questions OQ-26 through OQ-29 for Honker integration specifics.
New docs: honker-integration.md (adapter, event patterns, migration).
Scrub all PG/jsonb/libsql references from existing spec docs.
399 lines
18 KiB
Markdown
---
status: draft
last_updated: 2026-05-31
---

# Forward Look: Pointers, dbtype, and Universal IR

How the Module-based metagraph connects to the broader @alkdev ecosystem:
typed graph pointers, dbtype table rendering, and the ujsx universal IR
pipeline. These are forward-looking designs that justify why certain structural
decisions were made now
(pointer abstraction deferred per [ADR-017](./decisions/017-pointer-abstraction-is-forward-looking.md),
dbtype integration deferred per [ADR-018](./decisions/018-dbtype-integration-is-post-v1.md)).

## Overview

Three packages in the @alkdev ecosystem share the same pipeline shape:

```
Schema (TypeBox Module) → Element Tree (ujsx) → Host (HostConfig)
```

| Package | Schema | Element tree | Host |
|---------|--------|-------------|------|
| `@alkdev/ujsx` | `UJSX` Module | `<element>`, `<root>` | DOM, custom |
| `@alkdev/dbtype` | Table/Column schemas | `<table>`, `<column>` | SQLite, PG, MySQL drizzle dialects |
| `@alkdev/storage` | `Metagraph` Module | ⚠️ Future: `<graphSchema>`, `<nodeType>` | ⚠️ Future: graph DB hosts |

When storage's graph type definitions align with the Module pattern, they
join this same pipeline. The immediate benefit is recursive/cross-referencing
schemas (today). The forward benefit is that graph type definitions, table
definitions, and pointer expressions can all be authored as ujsx element trees
rendered to different hosts.

## Pointer Abstraction

Addressing nodes and edges within a graph instance follows the same pattern as
ujsx's `ValuePointer` and `selectNode`/`setNode`, and the same pattern as
jsonpathly's JPATH Module for path expressions.

### ujsx's pointer system (proven)

ujsx already implements a reactive pointer system:

```ts
// Signatures only; bodies and imports are omitted.
declare class ValuePointer<T> {
  private _signal: Signal<T>;
  private _path: string[];
  get value(): T;
  set value(v: T);
  get reactive(): ReadonlySignal<T>;
  get path(): string[];
}

declare function selectNode(root: UNode, path: string[]): UNode | undefined;
declare function setNode(root: UNode, path: string[], value: UNode): UNode;
```

This addresses elements within a ujsx tree by path segments (child indices,
prop names). A graph instance has analogous structure: nodes identified by
key, edges identified by key, attributes addressed by JSON path.

### Graph pointer analogy

```ts
// ujsx pointer: element tree → path → value
// (the child index is written as a string segment to match `path: string[]`)
selectNode(root, ["children", "0", "props", "name"])

// Graph pointer: graph instance → path → value
selectNode(graph, ["nodes", "call-001", "attributes", "requestId"])
```

The structural analogy:

| ujsx concept | Graph concept |
|-------------|---------------|
| Element tree root | Graph instance |
| `UNode` | Node or Edge |
| `path: string[]` | Key path: `["nodes", key]` or `["edges", key]` |
| `selectNode(root, path)` | `selectGraphNode(graph, path)` |
| `setNode(root, path, value)` | `setGraphNode(graph, path, value)` (via repository) |

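To make the analogy concrete, here is a minimal sketch of what a `selectGraphNode` could look like over an in-memory graph. The `GraphInstance` and `GraphEntry` shapes are assumptions made for illustration; the real storage types and repository-backed lookup are not defined in this document.

```ts
// Hypothetical in-memory shapes, used only for this sketch.
type Attributes = Record<string, unknown>;

interface GraphEntry {
  key: string;
  attributes: Attributes;
}

interface GraphInstance {
  nodes: Record<string, GraphEntry>;
  edges: Record<string, GraphEntry & { source: string; target: string }>;
}

// Walk the key path one segment at a time; any miss yields undefined.
function selectGraphNode(graph: GraphInstance, path: string[]): unknown {
  let current: unknown = graph;
  for (const segment of path) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

const graph: GraphInstance = {
  nodes: {
    "call-001": { key: "call-001", attributes: { requestId: "req-42" } },
  },
  edges: {},
};

const requestId = selectGraphNode(graph, ["nodes", "call-001", "attributes", "requestId"]);
const missing = selectGraphNode(graph, ["nodes", "call-999", "attributes"]);
console.log(requestId, missing); // req-42 undefined
```

The return type is `unknown` on purpose: with plain string paths, nothing ties the path to the node type's schema, which is the gap typed pointers are meant to close.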
### JPATH Module (jsonpathly)

The research shows that JSONPath expressions can themselves be a TypeBox Module
(`JPATH = Type.Module({...})` with recursive `Type.Ref("Subscript")`). This means
pointer paths are not just runtime strings; they're typed schemas that can be
validated and composed.

For graph storage, this opens the possibility of **typed graph queries**: a
pointer expression like `nodes.call-001.attributes.requestId` has a schema that
validates against the graph type's Module. If `CallNode` doesn't have a
`requestId` field, the pointer expression is invalid at compile time.

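The compile-time property described above can be approximated with plain TypeScript types. The sketch below is not the JPATH Module; it stands in for it with a tuple type, and `CallNodeAttributes`, `NodeAttributePath`, and `selectAttribute` are hypothetical names introduced only for this example.

```ts
// Hypothetical attribute shape; in storage this would come from the graph type Module.
interface CallNodeAttributes {
  requestId: string;
  status: "pending" | "done";
}

// A node attribute pointer: fixed segments plus a key that must exist on the attributes.
type NodeAttributePath<K extends string> = ["nodes", string, "attributes", K];

function selectAttribute<A, K extends keyof A & string>(
  nodes: Record<string, { attributes: A }>,
  path: NodeAttributePath<K>,
): A[K] | undefined {
  const [, nodeKey, , attribute] = path;
  return nodes[nodeKey]?.attributes[attribute];
}

const callNodes: Record<string, { attributes: CallNodeAttributes }> = {
  "call-001": { attributes: { requestId: "req-42", status: "done" } },
};

// Typed as `string | undefined` because "requestId" is a known key.
const requestId = selectAttribute(callNodes, ["nodes", "call-001", "attributes", "requestId"]);
console.log(requestId); // req-42

// The next call would be rejected by the compiler, since "duration" is not a key
// of CallNodeAttributes:
// selectAttribute(callNodes, ["nodes", "call-001", "attributes", "duration"]);
```

A schema-level approach such as JPATH would additionally allow the same check at runtime, on paths that arrive as data rather than as literals in source code.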
### Scope for v1

The pointer abstraction is a forward-looking design. For v1:

- **Repository functions** use direct key-based addressing:
  `findNode(graphId, nodeKey)`, `findEdge(graphId, edgeKey)`
- **Attribute access** is untyped JSON retrieval:
  `node.attributes.requestId`
- **The Module** validates attribute shapes, but query paths are strings

The jump to typed pointers requires either the JPATH Module (for path
validation) or ujsx-style `ValuePointer` with signals (for reactive graph
observation). Both are post-v1 concerns, but the graph type Module makes them
feasible because it provides the schema the pointer validates against.

## Relationship to @alkdev/dbtype

`@alkdev/dbtype` defines database schemas as ujsx element trees and renders them
to Drizzle dialects via HostConfig. Storage's SQLite/PG table definitions are a
natural consumer of this pipeline.

### Current vs. Future Table Definition

**Current** (manual Drizzle table defs):

```ts
import { sqliteTable, text } from "drizzle-orm/sqlite-core";

export const graphTypes = sqliteTable("graph_types", {
  id: text("id").primaryKey(),
  name: text("name").notNull(),
  // JSON is stored as text and parsed/serialized by Drizzle
  config: text("config", { mode: "json" }).notNull(),
  // ...
});
```

**Future** (dbtype element tree → HostConfig rendering):

```tsx
const GraphTypesEl = h("table", { name: "graph_types" },
  h(IdColumn, {}),
  h("column", { name: "name", type: "string", notNull: true }),
  h("column", { name: "config", type: "json", mode: "json", notNull: true }),
  h(AuditColumns, {}),
);

const root = createRoot(sqliteHost, {});
root.render(GraphTypesEl);
const drizzleTable = root.ctx.tables.graph_types;
```

### Why this matters for storage

1. **Single source of truth**: Today's `sqlite/tables/` and future `pg/tables/`
   define the same shapes in two different Drizzle dialects. dbtype renders the
   same element tree to both, with no manual duplication.
2. **Schema extraction**: `extractTable()` produces both TypeBox schemas (for
   validation) and column metadata (for Drizzle rendering) from the same tree.
   Storage gets `SelectGraphType` and `InsertGraphType` schemas for free.
3. **Module alignment**: dbtype assembles extracted schemas into a
   `Type.Module` for cross-table references. Storage's metagraph Module and
   dbtype's table Module could share a namespace: the `graph_types.config`
   column stores the JSON Schema from `Metagraph.Config`.

### v1 approach

For v1, storage continues with manual Drizzle table definitions. The dbtype
integration is deferred because:

- dbtype is Phase 0 (architecture complete, no implementation)
- The manual defs work and are well-understood
- The Module pattern for graph types can be adopted independently (no dbtype
  dependency)
- With PostgreSQL removed (ADR-038), the original pressure for dbtype
  (eliminating dual SQLite/PG table maintenance) is significantly reduced.
  There is now only one set of table definitions to maintain.

When dbtype reaches Phase 1 (implementation), storage can migrate from
Drizzle table definitions to dbtype elements one table at a time. The Module-based
graph type definitions are already compatible: both are TypeBox `Type.Module`
objects.

## ujsx as Universal IR

The three packages (ujsx, dbtype, storage) share the same pipeline shape:
**Schema → Element Tree → Host**. This is not coincidental: ujsx is a
universal declarative IR, and different "render targets" are just different
HostConfigs.

### What this could look like

```tsx
// Graph type definitions as ujsx elements (future)
const CallGraphSchema = h("graphSchema", { name: "call-graph" },
  h("config", { type: "directed", multi: false, allowSelfLoops: false }),
  h("nodeType", { name: "call" },
    h(BaseNode, {}),
    h("attr", { name: "requestId", type: "string", required: true }),
    h("attr", { name: "status", ref: "CallStatus" }),
  ),
  h("edgeType", { name: "triggered" },
    h(BaseEdge, {}),
    h("attr", { name: "type", literal: "triggered" }),
  ),
  h("edgeConstraints", { edgeType: "triggered",
    allowedSourceTypes: ["Call"],
    allowedTargetTypes: ["Call", "Subcall"] }),
);
```

Rendered to different hosts:

| Host | Output |
|------|--------|
| TypeBox Host | `Type.Module({ CallNode: ..., TriggeredEdge: ... })` |
| SQLite Host | `sqliteTable("node_types", { ... })` + `sqliteTable("edge_types", { ... })` |
| PG Host | `pgTable("node_types", { ... })` + `pgTable("edge_types", { ... })` |
| graphology Host | `SerializedGraph` format |
| Documentation Host | Mermaid diagram, typed API docs |

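The "one tree, many hosts" idea can be illustrated without ujsx at all. The sketch below uses a deliberately simplified, hypothetical `h()` helper and reduces a host to a single function; it is not ujsx's `HostConfig` API, and the `source`/`target` props on `edgeType` are a simplification of the `edgeConstraints` element shown above.

```ts
// Hypothetical element shape and helper; ujsx's real API may differ.
interface El {
  tag: string;
  props: Record<string, unknown>;
  children: El[];
}

const h = (tag: string, props: Record<string, unknown>, ...children: El[]): El => ({
  tag,
  props,
  children,
});

// In this sketch a host is just a function from an element tree to some output.
type Host<Out> = (root: El) => Out;

const schema = h("graphSchema", { name: "call-graph" },
  h("nodeType", { name: "call" }),
  h("nodeType", { name: "subcall" }),
  h("edgeType", { name: "triggered", source: "call", target: "subcall" }),
);

// Host 1: list the type entries a schema-producing host would need to emit.
const typeNamesHost: Host<string[]> = (root) =>
  root.children.map((child) => `${child.tag}:${String(child.props.name)}`);

// Host 2: render edge types as Mermaid flowchart lines for documentation.
const mermaidHost: Host<string> = (root) => {
  const lines = root.children
    .filter((child) => child.tag === "edgeType")
    .map((e) => `  ${String(e.props.source)} -->|${String(e.props.name)}| ${String(e.props.target)}`);
  return ["graph LR", ...lines].join("\n");
};

console.log(typeNamesHost(schema));
console.log(mermaidHost(schema));
```

Both outputs derive from the same `schema` value, so a change to the tree propagates to every host without a second definition to keep in sync.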
### What's real today vs. aspirational

| Capability | Status |
|-----------|--------|
| `Type.Module` for graph type definitions | ✅ Implemented: Metagraph, CallGraph, SecretGraph Modules |
| Bridge functions (`moduleToDbSchema`, `validateNode`, `validateEdge`) | ✅ Implemented |
| Reference graph type Modules (CallGraph, SecretGraph) | ✅ Implemented |
| Crypto utility (`encrypt`, `decrypt`, `generateEncryptionKey`, `EncryptedDataSchema`) | ✅ Implemented |
| Codegen from TypeScript interfaces → Module entries | ✅ TsToModule exists |
| dbtype element trees → Drizzle tables | ⚠️ dbtype Phase 0, no implementation |
| `<graphSchema>` ujsx elements | ⚠️ Conceptual: needs HostConfig design |
| Typed graph pointers via JPATH | ⚠️ Conceptual: needs JPATH Module design |
| Reactive graph observation via ValuePointer | ⚠️ Conceptual: needs signal integration |

The Module-based graph type definitions (this spec) are the **first concrete
step** in this pipeline. Everything else builds on having a `Type.Module` as
the schema source of truth.

## Repository Layer Strategy

The repository layer (typed CRUD for the 6 metagraph tables + queries for graph data)
is the next major feature to implement. The question of *how* it queries attributes
connects to broader ecosystem decisions about dbtype and operations.

### Three Approaches

#### A. JSON Path Queries (Near-Term)

The repository layer maps filter criteria to JSON path extraction:

```ts
findNodes({ graphId, attributes: { status: "active" } })
// SQLite: json_extract(attributes, '$.status') = 'active'
// PG:     attributes ->> 'status' = 'active'
```

- Works with current table definitions (no schema changes)
- SQLite `json_extract()` and PG `->>` / `#>>` operators handle JSON path
- No native index support on individual JSON attributes
- PG can add GIN indexes on `jsonb` columns for containment queries, but not for
  arbitrary key-value lookups
- Simple, immediate, no new infrastructure

This is the pragmatic v1 approach. The metagraph pattern *requires* JSON attributes
because node types are dynamic schemas (defined at runtime, stored in
`node_types.schema`), not static columns known at database definition time.

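As a rough illustration of the filter-to-SQL mapping for SQLite, the sketch below builds a parameterized `WHERE` fragment from an attribute filter. `buildNodeFilter`, the `graph_id` column name, and the restriction to flat attribute names are assumptions for this example, not the repository layer's actual design.

```ts
// Hypothetical helper: attribute filter → parameterized SQLite WHERE fragment.
type AttributeFilter = Record<string, string | number | boolean>;

interface SqlFragment {
  sql: string;
  params: (string | number)[];
}

function buildNodeFilter(graphId: string, attributes: AttributeFilter): SqlFragment {
  const clauses = ["graph_id = ?"]; // column name is an assumption
  const params: (string | number)[] = [graphId];

  for (const [key, value] of Object.entries(attributes)) {
    // The key is embedded in the JSON path, so only plain identifiers are accepted.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) {
      throw new Error(`Unsupported attribute name: ${key}`);
    }
    clauses.push(`json_extract(attributes, '$.${key}') = ?`);
    // SQLite's json_extract() returns JSON true/false as the integers 1/0.
    params.push(typeof value === "boolean" ? (value ? 1 : 0) : value);
  }

  return { sql: clauses.join(" AND "), params };
}

const filter = buildNodeFilter("graph-1", { status: "active" });
console.log(filter.sql);
// graph_id = ? AND json_extract(attributes, '$.status') = ?
console.log(filter.params);
// [ 'graph-1', 'active' ]
```

Values always travel as bound parameters; only the validated attribute name is interpolated into the SQL text.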
#### B. Native Columns via dbtype (Long-Term, Speculative)

If storage migrates to dbtype element trees for table definitions, the 6 static
metagraph tables (graph_types, node_types, edge_types, graphs, nodes, edges) could
be rendered via the dbtype pipeline: element tree → HostConfig → Drizzle tables.
This would eliminate the manual duplication between `sqlite/` and future `pg/`.

However, dbtype does NOT solve the attribute indexing problem:

- The metagraph's `attributes` column MUST remain JSON because the shape is defined
  by runtime schemas (node type definitions), not by static column definitions
- dbtype generates static table schemas; it does not handle dynamic schema-as-data
  patterns like the metagraph
- A "call" node's attributes (`requestId`, `status`, `duration`) are not columns
  on the `nodes` table; they're values in the `attributes` JSON column, validated
  by the corresponding node type's TypeBox schema

#### C. Hybrid: Static Tables via dbtype, Dynamic Attributes Remain JSON

The hybrid approach preserves the metagraph's dynamic schema model while leveraging
dbtype for the static table scaffolding:

1. **Static tables**: dbtype renders the 6 metagraph tables to Drizzle dialects.
   This eliminates the SQLite/PG manual duplication for table *structure*.
   The `attributes` column is still `text/jsonb` across both dialects.

2. **Dynamic attributes**: Remain JSON. The Module-based node type schemas validate
   data at the application layer, not the database layer. This is by design
   (ADR-003, ADR-014).

3. **Virtual columns / computed columns**: A post-v1 optimization, not a v1 concern.
   Frequently queried attributes could be extracted to indexed columns as a
   performance optimization. For example, if `nodes.attributes.status` is a common
   filter, a computed column or trigger could copy it to `nodes.status_column` with
   an index. This would be a denormalization trade-off (triggers, migration
   complexity, dual-write responsibility) and is not designed or planned for v1.

4. **Repository CRUD**: The static table CRUD operations (insert graph type, find
   node by key) could be auto-generated like drizzle-graphql or the dbtype
   `from-dbtype` adapter. Graph-specific attribute queries remain JSON path.

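For the computed-column variant of item 3, SQLite (3.31.0 and later) supports generated columns, and a `VIRTUAL` generated column can be added with `ALTER TABLE` and then indexed. The helper below is a hypothetical sketch that only produces the DDL strings; `virtualColumnDdl` and the index naming scheme are assumptions, and nothing like it is planned for v1.

```ts
// Hypothetical helper: DDL for exposing one JSON attribute as an indexed virtual column.
function virtualColumnDdl(table: string, attribute: string): string[] {
  const identifier = /^[A-Za-z_][A-Za-z0-9_]*$/;
  if (!identifier.test(table) || !identifier.test(attribute)) {
    throw new Error("table and attribute must be plain identifiers");
  }
  const column = `${attribute}_column`; // e.g. status_column, as in the example above
  return [
    // VIRTUAL columns are computed on read, so no trigger or dual write is involved.
    `ALTER TABLE ${table} ADD COLUMN ${column} TEXT ` +
      `GENERATED ALWAYS AS (json_extract(attributes, '$.${attribute}')) VIRTUAL`,
    `CREATE INDEX idx_${table}_${column} ON ${table} (${column})`,
  ];
}

for (const statement of virtualColumnDdl("nodes", "status")) {
  console.log(statement + ";");
}
```

Compared with a trigger-maintained copy, this keeps `attributes` as the only written location; the trade-off that remains is migration complexity, since each indexed attribute becomes part of the table definition.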
### Implications for Each Approach

| Concern | Path A (JSON) | Path B (Native) | Path C (Hybrid) |
|---------|---------------|-----------------|------------------|
| Works today | ✅ | ❌ (requires dbtype) | ❌ (requires dbtype) |
| Preserves metagraph pattern | ✅ | ❌ (conflicts with dynamic schemas) | ✅ |
| Eliminates SQLite/PG duplication | ❌ | ✅ | ✅ |
| Indexes on attributes | GIN on PG only | ✅ full native | GIN + virtual columns |
| Repository generation | Hand-write CRUD | Auto-gen from dbtype | Auto-gen for static, JSON path for dynamic |
| Dependency on dbtype | None | Full | Partial (static tables only) |

### Connection to drizzle-graphql

The overview references drizzle-graphql as a pattern for auto-generating a CRUD/query
surface. The dbtype `from-dbtype` adapter is the @alkdev equivalent: it consumes
element trees + Type.Module bundles and produces `OperationSpec[]` for the
operations registry.

The parallel:

| Concern | drizzle-graphql | dbtype from-dbtype |
|---------|----------------|-------------------|
| Input | Drizzle schema (tables + relations) | UJSX element tree + Type.Module |
| Output | GraphQL schema (queries + mutations) | `OperationSpec[]` (CRUD operations) |
| Dialects | SQLite, PG, MySQL | SQLite, PG, MySQL (via HostConfig) |
| Table model | Static columns only | Static columns only |
| Dynamic data (JSON attrs) | Not handled | Not handled |

Neither drizzle-graphql nor dbtype's `from-dbtype` handles dynamic schema-as-data
patterns. The metagraph's JSON attributes require their own query layer, regardless
of whether the static tables are auto-generated. This means the repository layer
for `@alkdev/storage` will always have two parts:

1. **Static table CRUD**: either auto-generated by dbtype or hand-written
2. **Graph data queries**: JSON path queries against the `attributes` column,
   validated by the Module schema at the application layer

### v1 Decision

For v1, the practical path is **A (JSON path queries) with hand-written CRUD**. This
decision is recorded as [ADR-033](./decisions/033-json-path-queries-for-v1.md). The
hybrid approach (C) remains viable for a future iteration when dbtype reaches
implementation, and it doesn't require any changes to the metagraph data model,
only to how the static table definitions are generated. See OQ-17, OQ-18, OQ-19
in [open-questions.md](./open-questions.md) for the specific long-term questions
that remain open beyond v1.

### Decisions Required

- **OQ-17**: JSON path vs native columns vs hybrid for attribute queries (resolved for v1; see ADR-033)
- **OQ-18**: Auto-generated vs hand-written CRUD for static tables (resolved for v1; see ADR-033)
- **OQ-19**: Where the storage-operations bridge package should live (open)

## Constraints on Current Design

The forward-looking patterns documented here constrain the Module evolution
design in [metagraph-module.md](./metagraph-module.md):

1. **The Module format must be self-contained**: `Type.Module({...})` entries
   with `Type.Ref` and `Type.Composite` are the same structures that a ujsx
   TypeBox Host would produce. If the Module format were an ad-hoc builder
   output, it couldn't be rendered by a different host later.

2. **Edge constraints must be schema entries, not just DB columns**: the
   constraint data needs to survive serialization/deserialization and be
   validatable independently. DB-only columns can't do this.

3. **The base attribute schemas (`BaseNode`, `BaseEdge`) must be TypeBox
   schemas**: not Drizzle column definitions, not builder-internal objects.
   Only TypeBox schemas can be composed via `Type.Composite`, referenced via
   `Type.Ref`, and serialized to JSON Schema.

4. **No ujsx dependency**: storage's Module-based graph types join the
   pipeline conceptually, not as a runtime dependency. The `Type.Module`
   output is the same shape that a ujsx HostConfig would produce, but storage
   doesn't need ujsx to create it. The alignment is structural, not a
   dependency.

5. **Schemas-as-JSON enables `Value.Diff`/`Value.Patch`/`Value.Cast`**:
   because TypeBox Modules serialize to JSON Schema, the TypeBox value system
   can operate on schemas themselves (diff to detect changes, patch to update
   stored schemas, cast to migrate data). This is not possible if schemas are
   opaque builder objects or Drizzle column definitions. See
   [schema-evolution.md](./schema-evolution.md).

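Constraint 2 can be illustrated with a small sketch: an edge constraint held as plain data survives a JSON round trip and can be checked without touching the database. The `EdgeConstraint` interface and `isEdgeAllowed` are hypothetical names; the field names mirror the `allowedSourceTypes` / `allowedTargetTypes` props used in the `edgeConstraints` example earlier in this document.

```ts
// Hypothetical constraint entry, stored as data alongside the schema.
interface EdgeConstraint {
  edgeType: string;
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

// Pure check: no database access, only the constraint data and the candidate edge.
function isEdgeAllowed(
  constraint: EdgeConstraint,
  edgeType: string,
  sourceType: string,
  targetType: string,
): boolean {
  return (
    constraint.edgeType === edgeType &&
    constraint.allowedSourceTypes.includes(sourceType) &&
    constraint.allowedTargetTypes.includes(targetType)
  );
}

const constraint: EdgeConstraint = {
  edgeType: "triggered",
  allowedSourceTypes: ["Call"],
  allowedTargetTypes: ["Call", "Subcall"],
};

// Round trip through JSON, as happens when a schema is stored and reloaded.
const restored: EdgeConstraint = JSON.parse(JSON.stringify(constraint));

console.log(isEdgeAllowed(restored, "triggered", "Call", "Subcall")); // true
console.log(isEdgeAllowed(restored, "triggered", "Subcall", "Call")); // false
```

Because the check depends only on serialized data, the same constraint could be validated by any consumer of the schema, not only by code that can read the constraint columns.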
## References

- ujsx pointer system: `/workspace/@alkdev/ujsx/src/core/pointer.ts`
- ujsx HostConfig adapter: `/workspace/@alkdev/ujsx/src/host/config.ts`
- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md`
- dbtype elements: `/workspace/@alkdev/dbtype/docs/architecture/elements.md`
- dbtype module: `/workspace/@alkdev/dbtype/docs/architecture/module.md`
- dbtype repo adapter: `/workspace/@alkdev/dbtype/docs/architecture/repo-adapter.md`
- drizzle-graphql (reference for CRUD generation pattern): `/workspace/drizzle-graphql/`
- Operations registry: `/workspace/@alkdev/operations/docs/architecture/README.md`
- JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts`
- jsonpathly source: `/workspace/jsonpathly/`
- Module evolution spec: [metagraph-module.md](./metagraph-module.md)
- Schema evolution spec: [schema-evolution.md](./schema-evolution.md)
- ADR-033: JSON path queries and hand-written CRUD for v1