Add repository layer strategy: JSON path queries, CRUD decisions, ecosystem integration

Add three open questions (OQ-17, OQ-18, OQ-19) covering attribute query
strategy, CRUD generation approach, and storage-operations bridge placement.
Create ADR-033 recording the v1 decision: JSON path queries for attributes
with hand-written CRUD for static tables.

Expand forward-look.md with Repository Layer Strategy section analyzing
three approaches (JSON path, native columns via dbtype, hybrid) and their
implications for the metagraph pattern. Add drizzle-graphql and dbtype
from-dbtype comparison showing neither handles dynamic schema-as-data.

Update overview.md with dbtype/ujsx in the dependency diagram, expanded
ecosystem context in the bridging pattern section, and new open questions.

Align open-questions.md: resolve OQ-17 and OQ-18 for v1 (ADR-033), add
OQ-19 as open, update summary counts and ADR impact table.
2026-05-30 11:02:49 +00:00
parent ed8710a7f5
commit a2ee452a63
5 changed files with 258 additions and 6 deletions


@@ -1,5 +1,5 @@
---
status: draft
status: reviewed
last_updated: 2026-05-30
---
@@ -225,6 +225,129 @@ The Module-based graph type definitions (this spec) are the **first concrete
step** in this pipeline. Everything else builds on having a `Type.Module` as
the schema source of truth.
## Repository Layer Strategy
The repository layer (typed CRUD for the six metagraph tables, plus queries for graph
data) is the next major feature to implement. *How* it queries attributes depends on
broader ecosystem decisions about dbtype and operations.
### Three Approaches
#### A. JSON Path Queries (Near-Term)
The repository layer maps filter criteria to JSON path extraction:
```ts
findNodes({ graphId, attributes: { status: "active" } })
// SQLite: json_extract(attributes, '$.status') = 'active'
// PG: attributes ->> 'status' = 'active'
```
- Works with current table definitions (no schema changes)
- SQLite `json_extract()` and PG `->>` / `#>>` operators handle JSON path
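- No index on individual JSON attributes by default; an expression index must be
  declared separately for each attribute path
- PG can add GIN indexes on `jsonb` columns for containment (`@>`) queries, but
  those indexes do not serve the `->>` equality comparisons shown above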
- Simple, immediate, no new infrastructure
This is the pragmatic v1 approach. The metagraph pattern *requires* JSON attributes
because node types are dynamic schemas (defined at runtime, stored in
`node_types.schema`), not static columns known at database definition time.
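
The filter-to-SQL mapping described above can be sketched as follows. This is a minimal illustration, not the `@alkdev/storage` implementation: the `attributeFilter` helper, the `Dialect` and `SqlFragment` types, and the restriction to string values and top-level keys are all assumptions made for the example.

```ts
// Hypothetical helper: names and shapes are illustrative only.
type Dialect = "sqlite" | "pg";

interface SqlFragment {
  sql: string;      // WHERE-clause fragment with placeholders
  params: string[]; // values bound to the placeholders, in order
}

// Attribute keys are interpolated into the SQL text, so restrict them to
// plain identifiers. Values are always bound as parameters.
const SAFE_KEY = /^[A-Za-z_][A-Za-z0-9_]*$/;

function attributeFilter(
  dialect: Dialect,
  attributes: Record<string, string>,
): SqlFragment {
  const clauses: string[] = [];
  const params: string[] = [];
  for (const [key, value] of Object.entries(attributes)) {
    if (!SAFE_KEY.test(key)) {
      throw new Error(`Unsupported attribute key: ${key}`);
    }
    params.push(value);
    clauses.push(
      dialect === "sqlite"
        ? `json_extract(attributes, '$.${key}') = ?`
        : `attributes ->> '${key}' = $${params.length}`,
    );
  }
  // An empty filter yields an empty fragment; the caller decides how to handle it.
  return { sql: clauses.join(" AND "), params };
}
```

Binding values as parameters keeps user data out of the SQL text; only the validated key reaches the JSON path. Nested paths and non-string comparisons would need dialect-specific casts and are left out of this sketch.
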
#### B. Native Columns via dbtype (Long-Term, Speculative)
If storage migrates to dbtype element trees for table definitions, the 6 static
metagraph tables (graph_types, node_types, edge_types, graphs, nodes, edges) could
be rendered via the dbtype pipeline: element tree → HostConfig → Drizzle tables.
This would eliminate the manual duplication between `sqlite/` and future `pg/`.
However, dbtype does NOT solve the attribute indexing problem:
- The metagraph's `attributes` column MUST remain JSON because the shape is defined
by runtime schemas (node type definitions), not by static column definitions
- dbtype generates static table schemas; it does not handle dynamic schema-as-data
patterns like the metagraph
- A "call" node's attributes (`requestId`, `status`, `duration`) are not columns
on the `nodes` table — they're values in the `attributes` JSON column, validated
by the corresponding node type's TypeBox schema
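
To make the last point concrete, the sketch below shows a "call" node stored as a row whose `attributes` column holds JSON text, with validation happening in application code. The `NodeRow` shape and the tiny `AttributeSchema` validator are assumptions for illustration; the validator is a dependency-free stand-in for the node type's TypeBox schema, not the real validation path.

```ts
// Illustrative row shape; columns other than `attributes` are assumptions.
interface NodeRow {
  id: string;
  typeKey: string;
  attributes: string; // JSON text, opaque to the database schema
}

// Stand-in for a node type schema: attribute name -> expected primitive type.
type FieldKind = "string" | "number";
type AttributeSchema = Record<string, FieldKind>;

const callNodeSchema: AttributeSchema = {
  requestId: "string",
  status: "string",
  duration: "number",
};

// Returns a list of validation errors; an empty list means the row is valid.
function validateAttributes(schema: AttributeSchema, row: NodeRow): string[] {
  const parsed: unknown = JSON.parse(row.attributes);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["attributes must be a JSON object"];
  }
  const attrs = parsed as Record<string, unknown>;
  const errors: string[] = [];
  for (const [key, kind] of Object.entries(schema)) {
    if (typeof attrs[key] !== kind) {
      errors.push(`${key}: expected ${kind}`);
    }
  }
  return errors;
}
```

The database only ever sees one `attributes` column; which keys are legal inside it is decided by whichever schema the node's type points to at runtime.
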
#### C. Hybrid: Static Tables via dbtype, Dynamic Attributes Remain JSON
The hybrid approach preserves the metagraph's dynamic schema model while leveraging
dbtype for the static table scaffolding:
1. **Static tables**: dbtype renders the 6 metagraph tables to Drizzle dialects.
This eliminates the SQLite/PG manual duplication for table *structure*.
   The `attributes` column stays JSON in both dialects (`text` on SQLite, `jsonb` on PG).
2. **Dynamic attributes**: Remain JSON. The Module-based node type schemas validate
data at the application layer, not the database layer. This is by design
(ADR-003, ADR-014).
3. **Virtual columns / computed columns**: A post-v1 optimization. Frequently
   queried attributes could be extracted to indexed columns. For example, if
   `nodes.attributes.status` is a common filter, a computed column or trigger could
   copy it to `nodes.status_column` with an index. This is a denormalization
   trade-off (triggers, migration complexity, dual-write responsibility) and is
   neither designed nor planned for v1.
4. **Repository CRUD**: The static table CRUD operations (insert graph type, find
   node by key) could be auto-generated, in the style of drizzle-graphql or the
   dbtype `from-dbtype` adapter. Graph-specific attribute queries remain JSON path.
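
As a rough illustration of item 3 only, the sketch below emits the DDL that would promote one attribute to an indexed generated column. Nothing here is designed or planned; `promoteAttributeDdl`, the `<attribute>_column` naming, and the index name are hypothetical. It assumes SQLite 3.31 or later (generated columns), PostgreSQL 12 or later (stored generated columns), and a `jsonb` `attributes` column on PG.

```ts
// Hypothetical DDL generator; not part of any existing package.
type SqlDialect = "sqlite" | "pg";

// The attribute name is interpolated into DDL, so allow plain identifiers only.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function promoteAttributeDdl(dialect: SqlDialect, attribute: string): string[] {
  if (!IDENTIFIER.test(attribute)) {
    throw new Error(`Unsupported attribute name: ${attribute}`);
  }
  const column = `${attribute}_column`;
  const addColumn =
    dialect === "sqlite"
      ? // SQLite allows ALTER TABLE ADD COLUMN for VIRTUAL generated columns.
        `ALTER TABLE nodes ADD COLUMN ${column} TEXT ` +
        `GENERATED ALWAYS AS (json_extract(attributes, '$.${attribute}')) VIRTUAL`
      : // PostgreSQL 12+ supports STORED generated columns.
        `ALTER TABLE nodes ADD COLUMN ${column} text ` +
        `GENERATED ALWAYS AS (attributes ->> '${attribute}') STORED`;
  const createIndex = `CREATE INDEX idx_nodes_${column} ON nodes (${column})`;
  return [addColumn, createIndex];
}
```

A generated column avoids the dual-write problem of a trigger-based copy, since the database derives the value from `attributes` itself, but it still adds a migration per promoted attribute.
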
### Implications for Each Approach
| Concern | Path A (JSON) | Path B (Native) | Path C (Hybrid) |
|---------|---------------|-----------------|------------------|
| Works today | ✅ | ❌ (requires dbtype) | ❌ (requires dbtype) |
| Preserves metagraph pattern | ✅ | ❌ (conflicts with dynamic schemas) | ✅ |
| Eliminates SQLite/PG duplication | ❌ | ✅ | ✅ |
| Indexes on attributes | GIN on PG only | ✅ full native | GIN + virtual columns |
| Repository generation | Hand-write CRUD | Auto-gen from dbtype | Auto-gen for static, JSON path for dynamic |
| Dependency on dbtype | None | Full | Partial (static tables only) |
### Connection to drizzle-graphql
The overview references drizzle-graphql as a pattern for auto-generating a CRUD/query
surface. The dbtype `from-dbtype` adapter is the @alkdev equivalent: it consumes
element trees + Type.Module bundles and produces `OperationSpec[]` for the
operations registry.
The parallel:
| Concern | drizzle-graphql | dbtype from-dbtype |
|---------|----------------|-------------------|
| Input | Drizzle schema (tables + relations) | UJSX element tree + Type.Module |
| Output | GraphQL schema (queries + mutations) | `OperationSpec[]` (CRUD operations) |
| Dialects | SQLite, PG, MySQL | SQLite, PG, MySQL (via HostConfig) |
| Table model | Static columns only | Static columns only |
| Dynamic data (JSON attrs) | Not handled | Not handled |
Neither drizzle-graphql nor dbtype's `from-dbtype` handles dynamic schema-as-data
patterns. The metagraph's JSON attributes require their own query layer, regardless
of whether the static tables are auto-generated. This means the repository layer
for `@alkdev/storage` will always have two parts:
1. **Static table CRUD**: hand-written for v1; could later be auto-generated by dbtype
2. **Graph data queries**: JSON path queries against the `attributes` column,
   validated by the Module schema at the application layer
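
The two-part split can be pictured as two separate interfaces behind one repository. The sketch below is an in-memory stand-in under assumed names (`StaticNodeCrud`, `GraphDataQueries`, `InMemoryNodeRepository`, `StoredNode`); a real implementation would issue SQL rather than filter an array, and would cover all six tables rather than nodes alone.

```ts
// In-memory illustration only; all names here are hypothetical.
interface StoredNode {
  graphId: string;
  key: string;
  attributes: Record<string, unknown>;
}

interface NodeFilter {
  graphId: string;
  attributes?: Record<string, unknown>;
}

// Part 1: static table CRUD (a candidate for auto-generation later).
interface StaticNodeCrud {
  insertNode(node: StoredNode): void;
  findNodeByKey(graphId: string, key: string): StoredNode | undefined;
}

// Part 2: graph data queries over the dynamic `attributes` payload.
interface GraphDataQueries {
  findNodes(filter: NodeFilter): StoredNode[];
}

class InMemoryNodeRepository implements StaticNodeCrud, GraphDataQueries {
  private readonly rows: StoredNode[] = [];

  insertNode(node: StoredNode): void {
    this.rows.push(node);
  }

  findNodeByKey(graphId: string, key: string): StoredNode | undefined {
    return this.rows.find((n) => n.graphId === graphId && n.key === key);
  }

  findNodes(filter: NodeFilter): StoredNode[] {
    const wanted = Object.entries(filter.attributes ?? {});
    return this.rows.filter(
      (n) =>
        n.graphId === filter.graphId &&
        wanted.every(([k, v]) => n.attributes[k] === v),
    );
  }
}
```

Keeping the two interfaces separate means the first could be swapped for generated code without touching callers of the second.
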
### v1 Decision
For v1, the practical path is **A (JSON path queries) with hand-written CRUD**. This
decision is recorded as [ADR-033](./decisions/033-json-path-queries-for-v1.md). The
hybrid approach (C) remains viable for a future iteration when dbtype reaches
implementation, and it doesn't require any changes to the metagraph data model —
only to how the static table definitions are generated. See OQ-17, OQ-18, OQ-19
in [open-questions.md](./open-questions.md) for the specific long-term questions
that remain open beyond v1.
### Decisions Required
- **OQ-17**: JSON path vs native columns vs hybrid for attribute queries (resolved for v1 — see ADR-033)
- **OQ-18**: Auto-generated vs hand-written CRUD for static tables (resolved for v1 — see ADR-033)
- **OQ-19**: Where the storage-operations bridge package should live (open)
## Constraints on Current Design
The forward-looking patterns documented here constrain the Module evolution
@@ -263,7 +386,11 @@ design in [metagraph-module.md](./metagraph-module.md):
- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md`
- dbtype elements: `/workspace/@alkdev/dbtype/docs/architecture/elements.md`
- dbtype module: `/workspace/@alkdev/dbtype/docs/architecture/module.md`
- dbtype repo adapter: `/workspace/@alkdev/dbtype/docs/architecture/repo-adapter.md`
- drizzle-graphql (reference for CRUD generation pattern): `/workspace/drizzle-graphql/`
- Operations registry: `/workspace/@alkdev/operations/docs/architecture/README.md`
- JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts`
- jsonpathly source: `/workspace/jsonpathly/`
- Module evolution spec: [metagraph-module.md](./metagraph-module.md)
- Schema evolution spec: [schema-evolution.md](./schema-evolution.md)
- ADR-033: JSON path queries and hand-written CRUD for v1