docs: add schema-evolution.md — TypeBox Value.Diff/Patch/Cast for schema evolution
@@ -244,6 +244,13 @@ design in [metagraph-module.md](./metagraph-module.md):
output is the same shape that a ujsx HostConfig would produce, but storage
doesn't need ujsx to create it. The alignment is structural, not dependent.

+5. **Schemas-as-JSON enables `Value.Diff`/`Value.Patch`/`Value.Cast`**:
+because TypeBox Modules serialize to JSON Schema, the TypeBox value system
+can operate on schemas themselves (diff to detect changes, patch to update
+stored schemas, cast to migrate data). This is not possible if schemas are
+opaque builder objects or Drizzle column definitions. See
+[schema-evolution.md](./schema-evolution.md).
+
## References

- ujsx pointer system: `/workspace/@alkdev/ujsx/src/core/pointer.ts`
@@ -254,3 +261,4 @@ design in [metagraph-module.md](./metagraph-module.md):
- JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts`
- jsonpathly source: `/workspace/jsonpathly/`
- Module evolution spec: [metagraph-module.md](./metagraph-module.md)
+- Schema evolution spec: [schema-evolution.md](./schema-evolution.md)
@@ -282,10 +282,11 @@ storage node attributes and operations call events), they should either:
deployment-specific secret management.

5. **Schema evolution strategy**: When graph type schemas evolve (new node types,
-changed attribute schemas), who handles migration? The repository layer
-should support schema version checking, but actual data migration scripts are
-application-level. See [metagraph.md](./metagraph.md) for the versioning
-approach.
+changed attribute schemas), how are changes detected and data migrated?
+TypeBox's `Value.Diff` can diff schemas-as-JSON to detect changes,
+`Value.Cast` can migrate data shapes, and `Value.Check` can verify
+compatibility. The `version` field on `graph_types` tracks breaking changes.
+See [schema-evolution.md](./schema-evolution.md) for the full design.

6. **~~Should the repository layer live in `@alkdev/storage` or in a consumer
package?~~** Decision: the repository CRUD layer (host-specific typed
@@ -301,6 +302,7 @@ storage node attributes and operations call events), they should either:
## References

- Metagraph Module evolution: [metagraph-module.md](./metagraph-module.md)
+- Schema evolution via TypeBox value system: [schema-evolution.md](./schema-evolution.md)
- Forward-looking connections: [forward-look.md](./forward-look.md)
- Operations architecture: `/workspace/@alkdev/operations/docs/architecture/README.md`
- Pubsub architecture: `/workspace/@alkdev/pubsub/docs/architecture/README.md`
559
docs/architecture/schema-evolution.md
Normal file
@@ -0,0 +1,559 @@
---
status: draft
last_updated: 2026-05-28
---

# Schema Evolution

How graph type schemas evolve over time: detecting changes, classifying their
impact, and migrating stored data. Uses TypeBox's `Value.Diff`/`Value.Patch`/
`Value.Cast` to operate on schemas-as-JSON and data-as-JSON, aligned with the
ecosystem's event-sourced design.

## Overview

The @alkdev ecosystem is event-driven. Call protocol events are append-only.
Graph instances in storage act as projections: materialized views of the kind
produced by replaying events through projector functions (today the metagraph
tables are written directly by the repository layer; see Event-Sourced Replay
below). When a graph type's schema changes, stored data may need migration, and
the repository layer needs to detect and handle the change.

The key insight: **TypeBox schemas are JSON**. They're JSON Schema objects
stored as JSON in `node_types.schema` and `edge_types.schema` columns (text in
SQLite, jsonb in PG). Because they're JSON, TypeBox's own `Value.Diff` and
`Value.Patch` can operate on them directly (diffing schemas to detect changes,
patching stored schemas to update them), while `Value.Cast` reshapes stored
data to fit new schema shapes.

Two distinct domains of JSON values are involved:

- **Schemas-as-JSON**: The TypeBox schema objects stored in `node_types.schema`
  and `edge_types.schema` columns. `Value.Diff`/`Value.Patch` operate on these
  (detecting schema changes, updating stored schemas).
- **Data-as-JSON**: The node/edge attribute values stored in `nodes.attributes`
  and `edges.attributes` columns. `Value.Cast`/`Value.Check` operate on these
  (migrating data to fit new schemas, verifying compatibility).

This is not a migration framework. It's the observation that the existing
TypeBox value system, combined with schemas-as-JSON storage, gives us schema
evolution primitives for free; we just need to wire them together.

### The Edit Type

`Value.Diff` returns an array of `Edit` objects: structural delta operations
that transform one JSON value into another.

```ts
type Insert = { type: "insert"; path: string; value: unknown };
type Update = { type: "update"; path: string; value: unknown };
type Delete = { type: "delete"; path: string };
type Edit = Insert | Update | Delete;
```

Paths use RFC 6901 JSON Pointer format (`/properties/requestId`,
`/properties/status/anyOf/0/const`). The `value` field contains the inserted
or updated value.

`Value.Patch(current, edits)` applies the edits to `current` and returns the
result. `Value.Diff` and `Value.Patch` round-trip:
`Patch(current, Diff(current, next))` ≈ `next`.

**Critical: `Edit` paths are structural, not semantic.** An edit under
`/properties/status` could mean type narrowing (`String` → `Literal`) or type
change (`String` → `Number`). Diff doesn't know; it only sees the raw JSON
structure. Classification logic (breaking vs non-breaking) must interpret the
edits with schema awareness.

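
To make "structural, not semantic" concrete, here is a minimal sketch of what a classifier has to do before it can interpret an edit: resolve the JSON Pointer against the stored schema and compare the old value with the new one. The helper names (`parsePointer`, `getAt`, `describeEdit`) are hypothetical and not part of TypeBox, and the sample edit is written by hand rather than produced by `Value.Diff`.

```typescript
// Local copy of the Edit shape described above; not imported from TypeBox.
type Edit =
  | { type: "insert"; path: string; value: unknown }
  | { type: "update"; path: string; value: unknown }
  | { type: "delete"; path: string };

// Split an RFC 6901 pointer into unescaped reference tokens.
function parsePointer(path: string): string[] {
  if (path === "") return [];
  return path
    .slice(1)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"));
}

// Resolve a pointer against a JSON value; undefined when the path is absent.
function getAt(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const token of parsePointer(path)) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[token];
  }
  return current;
}

// Pair an edit with the value it replaces so old and new can be compared.
function describeEdit(
  stored: unknown,
  edit: Edit,
): { path: string; before: unknown; after: unknown } {
  return {
    path: edit.path,
    before: getAt(stored, edit.path),
    after: edit.type === "delete" ? undefined : edit.value,
  };
}

const storedSchema = {
  type: "object",
  properties: { status: { type: "string" } },
};

// Hand-written edit of the kind a String -> Number change might produce.
const edit: Edit = { type: "update", path: "/properties/status/type", value: "number" };

const described = describeEdit(storedSchema, edit);
console.log(described); // { path: "/properties/status/type", before: "string", after: "number" }
```

The edit alone only says "something at this path is now `"number"`". Only with the `before` value in hand can a classifier tell an incompatible type change from a harmless one.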
## The Event-Sourced Context

The @alkdev ecosystem uses an append-only event model:

- **Call protocol** (`@alkdev/operations`): 5 typed events (`call.requested`,
  `call.responded`, `call.completed`, `call.aborted`, `call.error`) form an
  append-only event stream per workflow execution
- **Event log as source of truth** (`@alkdev/flowgraph` ADR-005): The in-memory
  `CallEventMapValue[]` is the authoritative state; projected views (status,
  results, call graph) are derived from it
- **Hub persistence**: The hub persists call protocol events to Postgres and
  replays them to reconstruct reactive state after restart
- **Storage as projection**: The metagraph tables store the projected state
  (graph instances, nodes, edges), not the raw event stream

This means storage's graph data is analogous to a read model in event-sourced
systems. Schema evolution in storage is projection migration, not event
migration. The event stream (managed by the hub) retains full history;
storage's tables hold the current materialized view.

### Single-Author, Not CRDT

Unlike Yjs or Automerge, the @alkdev event model is single-author per session
with central coordination. There are no concurrent multi-author edit conflicts.
This means:

- **Idempotent replay** (same events → same state) is sufficient; CRDT merge
  semantics are not needed
- **Schema evolution** is forward-only (new code processes old data), not
  bidirectional (concurrent versions merging)
- **Migration** is apply-on-read or apply-on-write, not conflict resolution

If future use cases require multi-author concurrent graph editing (e.g.,
collaborative task boards), a CRDT layer would need to sit between the event
stream and storage. That's a post-v1 concern.

## Schema-as-JSON: Value.Diff on Schemas

TypeBox schemas are JSON Schema objects, which are plain JSON values. The current
`node_types.schema` and `edge_types.schema` columns store them as JSON text
(SQLite) or jsonb (PG). This means `Value.Diff` can diff schemas themselves.

### Detecting Schema Changes

```ts
import { eq } from "drizzle-orm";
import { Value } from "@alkdev/typebox";

// `db`, `nodeTypes`, and `CallGraph` come from the consuming package.

// Stored schema (from DB). Await the query first, then read `.schema` from the
// row; findFirst resolves to undefined when no row matches.
const storedRow = await db.query.nodeTypes.findFirst({
  where: eq(nodeTypes.name, "Call"),
});
if (!storedRow) throw new Error('node_types has no row named "Call"');
const storedSchema = storedRow.schema;

// Current schema (from Module)
const currentSchema = CallGraph.CallNode;

// Diff the schemas
const edits = Value.Diff(storedSchema, currentSchema);

if (edits.length === 0) {
  // No change: schema matches
} else {
  // Schema has changed: classify the edits
}
```

### Classifying Schema Edits

`Value.Diff` is schema-agnostic: it diffs raw JSON structure without
understanding JSON Schema semantics. A classification layer is needed to
determine whether an edit is breaking or non-breaking.

| Edit pattern | Schema-level meaning | Breaking? |
|---|---|---|
| Insert new property with `default` | New field with default | No (`Value.Cast` fills old data from `default`) |
| Insert new property without `default` | New required field; old data won't have it | Yes (unless `Type.Optional`) |
| Insert new property with `Optional` wrapper | New optional field; old data stays valid | No |
| Update property type: `String` → `Literal("x")` | Type narrowing (subtype) | No, provided stored values are already `"x"`; any other stored value becomes invalid |
| Update property type: `String` → `Number` | Type change (incompatible) | Yes |
| Update property type: add `Optional` wrapper | Making field optional | No (backward compatible) |
| Delete property | Field removed from schema | Yes (old data with this field is non-conforming if `additionalProperties: false`) |
| Update `$defs` references | Cross-reference changes | Depends (see Relationship to the Module) |

**⚠️ Needs POC**: The classification above is theoretical. Whether this can be
done reliably from `Edit[]` objects (which are raw JSON pointer + value pairs)
needs validation. POC scope:

- **Test corpus**: ~20 schema change patterns covering each row in the table
  above (add optional field, add required field, narrow type, change type,
  remove field, add enum value, `$ref` change)
- **Success criteria**: Classification accuracy >95% against expected
  breaking/non-breaking labels for each pattern (at 20 patterns, a single miss
  is exactly 95% and fails the bar)
- **Fallback**: If classification accuracy is too low, use Strategy C (hybrid
  with `Value.Check` verification) instead of Strategy A.

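
As a starting point for the POC, a classifier over the table rows might look like the sketch below. It is an illustration under stated assumptions, not a validated design: `classifyEdit`, `classify`, the `ObjectSchema` view, and the `"needs-check"` verdict are all hypothetical names, the rules cover only a few rows, and the sample edits are hand-written. Real `Value.Diff` output for a new required field would probably also include an edit under `/required`, which this sketch deliberately routes to `"needs-check"`.

```typescript
// Local Edit shape mirroring the one described earlier; not imported from TypeBox.
type Edit =
  | { type: "insert"; path: string; value: unknown }
  | { type: "update"; path: string; value: unknown }
  | { type: "delete"; path: string };

// Minimal view of an object schema; real schemas carry many more keywords.
interface ObjectSchema {
  type: "object";
  properties: Record<string, Record<string, unknown>>;
  required?: string[];
}

type Verdict = "breaking" | "non-breaking" | "needs-check";

// Classify one edit against the NEW schema. Anything no rule covers falls
// through to "needs-check" so that a Value.Check pass over data can decide.
function classifyEdit(edit: Edit, next: ObjectSchema): Verdict {
  const match = /^\/properties\/([^/]+)(\/.+)?$/.exec(edit.path);
  if (match === null) return "needs-check"; // e.g. /required/2 or /$defs/...
  const name = match[1] ?? "";
  const rest: string | undefined = match[2];

  if (rest === undefined) {
    // The edit targets a whole property.
    if (edit.type === "delete") return "breaking"; // conservative: see table
    if (edit.type === "insert") {
      const isRequired = (next.required ?? []).includes(name);
      const hasDefault = "default" in (next.properties[name] ?? {});
      return !isRequired || hasDefault ? "non-breaking" : "breaking";
    }
    return "needs-check"; // whole sub-schema replaced
  }
  // A changed `type` keyword is treated as an incompatible type change.
  if (rest === "/type" && edit.type === "update") return "breaking";
  return "needs-check";
}

// Combine per-edit verdicts: any breaking edit wins, then any uncertainty.
function classify(edits: Edit[], next: ObjectSchema): Verdict {
  const verdicts = edits.map((e) => classifyEdit(e, next));
  if (verdicts.includes("breaking")) return "breaking";
  return verdicts.includes("needs-check") ? "needs-check" : "non-breaking";
}

const nextSchema: ObjectSchema = {
  type: "object",
  properties: {
    requestId: { type: "string" },
    priority: { type: "number", default: 0 },
    note: { type: "string" },
  },
  required: ["requestId", "priority"],
};

// Hand-written edits: one new required field with a default, one optional field.
const handWrittenEdits: Edit[] = [
  { type: "insert", path: "/properties/priority", value: { type: "number", default: 0 } },
  { type: "insert", path: "/properties/note", value: { type: "string" } },
];

const verdict = classify(handWrittenEdits, nextSchema);
console.log(verdict); // "non-breaking"
```

The three-valued verdict is a design choice: rather than guessing on edits it cannot interpret, the classifier hands them to data verification, which matches the Strategy C fallback.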
### Three Detection Strategies

**Strategy A: Diff schemas, classify edits**

```
Value.Diff(storedSchema, currentSchema) → Edit[]
classify(Edit[]) → { breaking: boolean, edits: ClassifiedEdit[] }
```

Pros: Identifies *what* changed and *how*. Enables targeted migration (only
migrate data affected by breaking changes).

Cons: Classification is hard to get right. Schema semantics are rich (type
narrowing, union widening, `$ref` changes, `additionalProperties` interactions).

**Strategy B: Diff schemas, test data compatibility**

```
const edits = Value.Diff(storedSchema, currentSchema); // Edit[]
if (edits.length > 0) {
  // Schema changed: test if existing data is still valid
  const sampleData = fetchSampleNodeData(graphTypeId, nodeTypeName);
  const compatible = sampleData.every(d => Value.Check(currentSchema, d));
}
```

Pros: Simple. No classification logic needed. Works with any schema change.

Cons: Requires fetching data to test. Binary answer (compatible or not) with no
information about *what* changed. Sample may not be representative.

**Strategy C: Hybrid (diff for detection, Check for verification)**

```
const edits = Value.Diff(storedSchema, currentSchema);
if (edits.length === 0) return "unchanged";

// Schema changed. Fast-path: check if version bump covers it.
// storedVersion = graph_types.version from DB
// currentVersion = consumer-defined constant (e.g., CURRENT_CALL_GRAPH_VERSION = 2)
if (storedVersion < currentVersion) return "version-mismatch";

// No version bump but schema changed, so a non-breaking change is expected.
// Verify by checking stored data against new schema.
// findIncompatibleNodes = repository query that returns nodes
// where Value.Check(currentSchema, node.attributes) is false
const incompatible = await findIncompatibleNodes(graphTypeId, nodeTypeName, currentSchema);
if (incompatible.length === 0) return "non-breaking";
return "breaking";
```

`findIncompatibleNodes` is a repository query function: fetch nodes of the
given type in the given graph, filter to those where
`!Value.Check(currentSchema, node.attributes)`. `currentVersion` is a
consumer-defined integer constant that the consumer increments when making
a breaking change to their graph type Module.

This combines detection (diff), metadata (version), and verification (check)
and is likely the most robust approach. **Recommended for POC exploration.**

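
The Strategy C decision can be isolated as a pure function, which makes it easy to unit-test apart from the database. The sketch below is one possible shape, not the repository API: `detectSchemaChange`, `DetectionInput`, and the in-memory `findIncompatibleNodes` are hypothetical, and `isValid` stands in for `(attrs) => Value.Check(currentSchema, attrs)`.

```typescript
type Outcome = "unchanged" | "version-mismatch" | "non-breaking" | "breaking";

interface StoredNode {
  id: string;
  attributes: unknown;
}

// Stand-in for `(attrs) => Value.Check(currentSchema, attrs)`.
type Validator = (attributes: unknown) => boolean;

// In-memory version of the repository query described above.
function findIncompatibleNodes(nodes: StoredNode[], isValid: Validator): StoredNode[] {
  return nodes.filter((node) => !isValid(node.attributes));
}

interface DetectionInput {
  editCount: number; // length of Value.Diff(storedSchema, currentSchema)
  storedVersion: number; // graph_types.version
  currentVersion: number; // consumer-defined constant
  nodes: StoredNode[];
  isValid: Validator;
}

function detectSchemaChange(input: DetectionInput): Outcome {
  if (input.editCount === 0) return "unchanged";
  if (input.storedVersion < input.currentVersion) return "version-mismatch";
  // No version bump: verify stored data against the new schema.
  const incompatible = findIncompatibleNodes(input.nodes, input.isValid);
  return incompatible.length === 0 ? "non-breaking" : "breaking";
}

// Toy validator: the new schema requires a string `requestId`.
const hasRequestId: Validator = (a) =>
  typeof a === "object" &&
  a !== null &&
  typeof (a as { requestId?: unknown }).requestId === "string";

const sampleNodes: StoredNode[] = [
  { id: "n1", attributes: { requestId: "req-001" } },
  { id: "n2", attributes: {} },
];

const outcome = detectSchemaChange({
  editCount: 1,
  storedVersion: 2,
  currentVersion: 2,
  nodes: sampleNodes,
  isValid: hasRequestId,
});
console.log(outcome); // "breaking" (n2 has no requestId)
```

Keeping the decision free of I/O means the POC corpus can exercise it with plain fixtures, while the real repository layer supplies the diff length, versions, and node rows.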
## Data Migration via Value.Cast

Once a schema change is detected and classified, `Value.Cast` can migrate
existing data to fit the new schema shape.

### How Value.Cast Works

`Value.Cast(schema, value)` attempts to fit a value into the shape defined by
the schema:

- **Matching properties**: Retained from the original value
- **Missing required properties with defaults**: Filled from `schema.default`
- **Missing required properties without defaults**: Created as zero values
  (`0`, `""`, `false`, `{}`, `[]`)
- **Unknown properties**: Dropped if `additionalProperties: false`, retained
  otherwise
- **Union types**: Each variant is scored and the best match is selected

This is exactly what's needed for data migration: reshape stored node
attributes to fit a new schema while preserving matching fields.

### Example: Adding a Required Field with Default

```ts
import { Type, Value } from "@alkdev/typebox";

// Schema v1 (stored in DB)
const CallNodeV1 = Type.Object({
  requestId: Type.String(),
  operationId: Type.String(),
  status: Type.Union([...]),
});

// Schema v2 (new Module entry with an added priority field)
const CallNodeV2 = Type.Object({
  requestId: Type.String(),
  operationId: Type.String(),
  status: Type.Union([...]),
  priority: Type.Number({ default: 0 }), // new field with default
});

// Existing data
const oldNode = {
  requestId: "req-001",
  operationId: "op-call",
  status: "completed",
};

// Cast migrates the data
const migratedNode = Value.Cast(CallNodeV2, oldNode);
// → { requestId: "req-001", operationId: "op-call", status: "completed", priority: 0 }
```

### Example: Type Narrowing (Non-Breaking)

```ts
// Schema v1: type: Type.String()
// Schema v2: type: Type.Literal("triggered")

const oldEdge = { type: "triggered", metadata: {} };
const migrated = Value.Cast(CallGraphV2.TriggeredEdge, oldEdge);
// → { type: "triggered", metadata: {} } (unchanged: "triggered" satisfies Literal)
// A stored `type` other than "triggered" would fail Value.Check against v2, so
// narrowing is only non-breaking when existing data already conforms.
```

### Example: Type Change (Breaking)

```ts
// Schema v1: priority: Type.Number()
// Schema v2: priority: Type.Union([Type.Literal("low"), Type.Literal("high")])

const oldNode = { requestId: "req-001", priority: 3 };
const migrated = Value.Cast(CallNodeV2, oldNode);
// Cast scores each union variant against the value.
// Neither "low" nor "high" matches 3. Cast still picks a best match,
// but the result is likely incorrect (data loss).
```

**⚠️ Cast limitation**: `Value.Cast` does not provide custom migration
functions. For breaking changes that require transformation logic (e.g.,
`priority: 3` → `priority: "high"` based on a threshold), the repository layer
needs a custom migration handler. Cast handles the common case (add fields,
narrow types, drop removed fields); custom logic handles the rest.

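
A custom migration handler for the threshold case could be as small as the sketch below. Everything here is hypothetical: the `MigrationFn` type, the registry keyed by node type and source version, and the threshold of 2 are assumptions chosen for illustration, not part of `@alkdev/storage`.

```typescript
type Attributes = Record<string, unknown>;
type MigrationFn = (old: Attributes) => Attributes;

// Hypothetical cut-off: numeric priorities above this become "high".
const HIGH_PRIORITY_THRESHOLD = 2;

// Transformation logic that Value.Cast cannot infer from the schemas alone.
const priorityToLabel: MigrationFn = (old) => {
  const value = old["priority"];
  const label =
    typeof value === "number" && value > HIGH_PRIORITY_THRESHOLD ? "high" : "low";
  return { ...old, priority: label };
};

// Registry keyed by node type and the version being migrated away from.
const migrations: Record<string, MigrationFn> = {
  "Call@2": priorityToLabel,
};

function migrate(nodeType: string, fromVersion: number, attrs: Attributes): Attributes {
  const fn = migrations[`${nodeType}@${fromVersion}`];
  if (fn === undefined) {
    throw new Error(`No custom migration registered for ${nodeType}@${fromVersion}`);
  }
  return fn(attrs);
}

const migratedCall = migrate("Call", 2, { requestId: "req-001", priority: 3 });
console.log(migratedCall); // { requestId: "req-001", priority: "high" }
```

The result of a handler like this should still pass through the same `Value.Check` guard as a Cast result before it is written.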
## Schema-as-JSON: Patching Stored Schemas

When a graph type Module changes, the stored schema in `node_types.schema` must
be updated. `Value.Patch` can apply the diff to the stored schema:

```ts
// 1. Diff current stored schema against new Module entry
const edits = Value.Diff(storedSchema, CallGraph.CallNode);

// 2. Patch the stored schema
const newStoredSchema = Value.Patch(storedSchema, edits);

// 3. Update the DB row
await db.update(nodeTypes)
  .set({ schema: newStoredSchema, updatedAt: new Date() })
  .where(eq(nodeTypes.name, "Call"));
```

This is simpler than reconstructing the full Module and running
`moduleToDbSchema()` again. It carries the same caveats as `Value.Diff`,
though: the edits are structural, not semantic. If the Module entry's `$defs`
structure changed (e.g., a `Type.Ref` target was updated), the diff may not
capture the semantic change correctly.

**Alternative**: Re-run `moduleToDbSchema()` on the updated Module and write the
full output. This is simpler and more reliable but requires the full Module
(not just the schema entry) to be available at migration time.

**Decision for v1**: Re-run `moduleToDbSchema()` on the updated Module. The
Module is always available when the consumer defines graph types. Patch-based
schema update is an optimization for later.

## Evolution Strategies

### Additive-Only (Recommended for v1)

The simplest strategy: only add optional fields and new node/edge types. Never
remove or rename existing fields.

- **New optional fields**: `Type.Optional(Type.String())`. Old data is still
  valid under the new schema
- **New node/edge types**: New rows in `node_types`/`edge_types`. Existing
  rows are unchanged
- **New enum values**: Add to `Type.Union` of `Type.Literal`. Old data with
  existing values is still valid

This avoids the need for `Value.Cast` migrations entirely. The `version` field
on `graph_types` does not need to be bumped.

**When additive-only breaks down**: If a field was incorrectly designed (wrong
type, wrong name, wrong semantics), additive-only forces you to deprecate the
old field and add a new one. The deprecated field stays in the schema forever.
This is acceptable for early development but creates technical debt over time.

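
The additive-only rule can be checked mechanically at the property level. The sketch below is a deliberately shallow illustration: `additiveViolations` and the `ObjectSchema` view are hypothetical, and the check only looks at removed properties and newly required ones. It does not inspect changes inside a property's own schema (a type change would slip through), so it complements rather than replaces `Value.Check` verification.

```typescript
interface ObjectSchema {
  properties: Record<string, unknown>;
  required?: string[];
}

// Returns the reasons a change is NOT additive; an empty list means the
// property-level rules of the additive-only strategy are satisfied.
function additiveViolations(prev: ObjectSchema, next: ObjectSchema): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(prev.properties)) {
    if (!(name in next.properties)) problems.push(`removed property: ${name}`);
  }
  const wasRequired = prev.required ?? [];
  for (const name of next.required ?? []) {
    if (!wasRequired.includes(name)) problems.push(`newly required property: ${name}`);
  }
  return problems;
}

const v1: ObjectSchema = {
  properties: { requestId: { type: "string" }, status: { type: "string" } },
  required: ["requestId", "status"],
};

// Additive: one new optional field.
const v2: ObjectSchema = {
  properties: { ...v1.properties, note: { type: "string" } },
  required: ["requestId", "status"],
};

// Not additive: `status` removed, `priority` added as required.
const v3: ObjectSchema = {
  properties: { requestId: { type: "string" }, priority: { type: "number" } },
  required: ["requestId", "priority"],
};

console.log(additiveViolations(v1, v2)); // []
console.log(additiveViolations(v1, v3));
// [ "removed property: status", "newly required property: priority" ]
```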
### Version-Bumped with Cast Migration

When a breaking change is needed, the consumer bumps its version constant
(`CURRENT_CALL_GRAPH_VERSION` below). The repository layer compares it with the
stored `graph_types.version` before processing and applies `Value.Cast` to
migrate data when the stored version is behind.

```ts
const graphType = await findGraphType("call-graph");
const currentModule = CallGraph;

if (graphType.version < CURRENT_CALL_GRAPH_VERSION) {
  // Mark migration in progress (even → odd)
  await updateGraphTypeVersion(graphType.id, graphType.version + 1);

  // Schema has changed: migrate all nodes
  const nodes = await findNodesByGraphType(graphType.id);
  for (const node of nodes) {
    const schema = currentModule[`${node.nodeType}Node`];
    const migrated = Value.Cast(schema, node.attributes);
    // Guard: verify the cast result before writing
    if (!Value.Check(schema, migrated)) {
      throw new Error(
        `Cast produced invalid data for node ${node.id}. ` +
        `Custom migration required for this schema change.`
      );
    }
    await updateNode(node.id, { attributes: migrated });
  }
  // Update the schema and finalize version (odd → next even).
  // graphType.version still holds the value loaded above, hence + 2.
  await updateNodeTypesSchemas(graphType.id, moduleToDbSchema(currentModule));
  await updateGraphTypeVersion(graphType.id, graphType.version + 2);
}
```

**Version bump contract** (decision: even/odd scheme):
- **Even version**: Stable schema, no pending migrations
- **Odd version**: Migration in progress; reads return stale data or error
  (consumer-configurable)
- **After migration**: Version is bumped to the next even number

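
The even/odd contract reduces to a few lines of arithmetic. The helpers below are a hypothetical sketch (`versionState`, `beginMigration`, and `finishMigration` are not existing repository functions) that makes the two transitions explicit and refuses to start a migration on top of an unfinished one.

```typescript
type VersionState = "stable" | "migrating";

// Even versions are stable; odd versions mean a migration is in progress.
function versionState(version: number): VersionState {
  return version % 2 === 0 ? "stable" : "migrating";
}

// Even -> odd: mark the migration as in progress.
function beginMigration(version: number): number {
  if (versionState(version) !== "stable") {
    throw new Error(`Version ${version} is already mid-migration`);
  }
  return version + 1;
}

// Odd -> next even: mark the migration as complete.
function finishMigration(version: number): number {
  if (versionState(version) !== "migrating") {
    throw new Error(`Version ${version} has no migration in progress`);
  }
  return version + 1;
}

const started = beginMigration(2); // 3
const finished = finishMigration(started); // 4
console.log(versionState(started), versionState(finished)); // "migrating" "stable"
```

Routing every version write through helpers like these would make it hard to end a migration on an odd number by accident.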
### Migration Safety

Partial migration is the primary risk: if the process crashes or errors on
node N of M, some nodes are migrated and some are not. The even/odd version
scheme provides the recovery mechanism:

1. **Before migration**: Set `version` to odd (in-progress state)
2. **During migration**: Each node is cast, verified with `Value.Check`, then
   written. If any cast produces invalid data, the migration aborts with an
   error, and the consumer must provide a custom migration function for that
   schema change.
3. **After migration**: Set `version` to even (stable state), update stored
   schemas via `moduleToDbSchema()`
4. **Recovery**: If the process crashes during migration, the odd version
   signals that migration is incomplete. On restart, the repository layer
   detects the odd version and can either resume the migration (migrate the
   remaining nodes) or roll back (restore from backup / re-project from events).

**Read behavior during migration**: Consumer-configurable. Options:
- Reject reads with an error ("migration in progress")
- Return stale data (from unmigrated nodes)
- Return mixed data (some migrated, some not; not recommended)

**Cast safety guard**: Every `Value.Cast` result must be verified with
`Value.Check(newSchema, migratedData)` before writing. If the check fails,
the migration is classified as breaking after all, and the consumer must
provide a custom migration function. This is a hard requirement.

**Performance**: For large datasets (10K+ nodes), the migration loop should
batch reads/writes to avoid holding all nodes in memory. The graph type is
effectively read-only during migration: writes to that graph type's nodes
should be rejected while `version` is odd.

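
One way to batch the migration loop is keyset pagination: read a page of nodes ordered by id, cast and verify each one, write the page, then continue after the last id seen. The sketch below assumes numeric, increasing ids and uses synchronous, injected hooks so it stays self-contained; `migrateInBatches` and the `MigrationDeps` hooks are hypothetical, and a real repository layer would most likely be async.

```typescript
interface NodeRow {
  id: number;
  attributes: unknown;
}

interface MigrationDeps {
  // Hypothetical repository hooks, injected so the loop is testable.
  fetchPage: (afterId: number, limit: number) => NodeRow[]; // ordered by id
  cast: (attributes: unknown) => unknown; // stand-in for Value.Cast(schema, attrs)
  check: (attributes: unknown) => boolean; // stand-in for Value.Check(schema, attrs)
  writeBatch: (rows: NodeRow[]) => void;
}

// Migrates every node, holding at most one page in memory at a time.
// Returns the number of migrated nodes; throws on the first invalid cast.
function migrateInBatches(deps: MigrationDeps, batchSize: number): number {
  let afterId = 0;
  let migratedCount = 0;
  for (;;) {
    const page = deps.fetchPage(afterId, batchSize);
    if (page.length === 0) return migratedCount;

    const migrated: NodeRow[] = [];
    for (const row of page) {
      const attributes = deps.cast(row.attributes);
      // Cast safety guard: verify before anything in this page is written.
      if (!deps.check(attributes)) {
        throw new Error(`Cast produced invalid data for node ${row.id}`);
      }
      migrated.push({ id: row.id, attributes });
      afterId = row.id;
    }
    deps.writeBatch(migrated);
    migratedCount += migrated.length;
  }
}

// In-memory stand-ins for the node table and the write path.
const table: NodeRow[] = [1, 2, 3, 4, 5].map((id) => ({
  id,
  attributes: { requestId: `req-${id}` },
}));
const written: NodeRow[] = [];

const total = migrateInBatches(
  {
    fetchPage: (afterId, limit) => table.filter((r) => r.id > afterId).slice(0, limit),
    cast: (a) => ({ ...(a as object), priority: 0 }),
    check: (a) => typeof (a as { priority?: unknown }).priority === "number",
    writeBatch: (rows) => {
      written.push(...rows);
    },
  },
  2,
);
console.log(total); // 5, written in pages of 2, 2, and 1
```

Because the cursor is the last migrated id, the same loop can also serve the resume path after a crash if the repository records how far it got.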
This is the simplest versioning that handles breaking changes. It's
application-level (the consumer decides when to bump and migrate), not
framework-level (storage doesn't auto-migrate). `graph_types.version` is
consistent with `@alkdev/flowgraph` ADR-004: flowgraph itself doesn't version
schemas (it's in-memory), but the persisting consumer (storage) provides the
versioned envelope.

### Event-Sourced Replay (Forward-Looking)

In a fully event-sourced model, schema evolution is handled by replaying events
through updated projector functions. Storage's tables are rebuilt from the
event log, so no data migration is needed; you just re-project.

This requires the event log to be the authoritative source and storage to be
a disposable projection. The hub's call graph persistence (Postgres) is
approaching this model: events are persisted, and state is reconstructed by
replay. But the current metagraph tables are not rebuilt from events; they're
written directly by the repository layer.

If the hub migrates to a model where call graph nodes/edges are projections
of the call protocol event stream, then schema evolution becomes projector
evolution: update the projector, replay the events, rebuild the projection.
No `Value.Cast` is needed for stored data (because there is no stored data, just
re-projected views).

**This is a post-v1 design**: it requires the event log to be the primary
persistence, with storage tables as read-optimized projections. The current
model writes directly to storage tables; the event stream is separate (managed
by the hub's call graph module, not by `@alkdev/storage`).

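
Projector evolution can be illustrated with a fold over the event log. The sketch below uses a simplified, hypothetical event shape (the real call protocol events carry more fields) and two made-up view shapes; it only shows that changing the projector and replaying the same log yields the new shape without casting any stored rows.

```typescript
// Simplified event shape; only the fields this sketch needs.
interface CallEvent {
  type: "call.requested" | "call.responded" | "call.completed" | "call.aborted" | "call.error";
  requestId: string;
}

type Projector<View> = (view: View, event: CallEvent) => View;

// Rebuild a projection from scratch by folding the log through a projector.
function project<View>(events: CallEvent[], projector: Projector<View>, initial: View): View {
  return events.reduce<View>(projector, initial);
}

// v1 view: only the latest status per call.
type StatusViewV1 = Record<string, string>;
const projectStatusV1: Projector<StatusViewV1> = (view, event) => ({
  ...view,
  [event.requestId]: event.type,
});

// v2 view: the "schema change" adds an event count per call. The same log is
// replayed through the new projector; nothing stored is migrated.
type CallViewV2 = Record<string, { status: string; eventCount: number }>;
const projectCallV2: Projector<CallViewV2> = (view, event) => ({
  ...view,
  [event.requestId]: {
    status: event.type,
    eventCount: (view[event.requestId]?.eventCount ?? 0) + 1,
  },
});

const log: CallEvent[] = [
  { type: "call.requested", requestId: "req-001" },
  { type: "call.responded", requestId: "req-001" },
  { type: "call.completed", requestId: "req-001" },
];

const viewV1 = project<StatusViewV1>(log, projectStatusV1, {});
const viewV2 = project<CallViewV2>(log, projectCallV2, {});
console.log(viewV1); // { "req-001": "call.completed" }
console.log(viewV2); // { "req-001": { status: "call.completed", eventCount: 3 } }
```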
## Relationship to the Module

The `Metagraph` Module ([metagraph-module.md](./metagraph-module.md)) is the
source of truth for graph type schemas. When the Module changes:

1. **`moduleToDbSchema()`** produces the updated DB row values
2. **`Value.Diff(storedSchema, moduleEntry)`** detects what changed
3. **`Value.Cast(moduleEntry, storedData)`** migrates affected node/edge data
4. **`Value.Patch(storedSchema, edits)`** updates the stored schema (or
   re-run `moduleToDbSchema()`)

The Module's `$defs` structure adds complexity to diffing: a `Type.Ref`
resolution changes the effective schema in ways that a raw JSON diff might not
capture. Using `moduleToDbSchema()` (which resolves refs before writing to the
DB) avoids this problem because the stored schema is already dereferenced.

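
To show why dereferencing helps, here is a small sketch of inlining `$ref` targets from a `$defs` map before a schema is stored or diffed. It is not the implementation of `moduleToDbSchema()`, whose internals this document does not describe; `dereference` is a hypothetical helper, and treating each `$ref` value as a bare key into `$defs` is an assumption about how Module entries reference each other.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively replace `{ $ref: name }` nodes with the schema stored under
// `defs[name]`. Cyclic references are rejected rather than expanded forever.
function dereference(
  node: Json,
  defs: Record<string, Json>,
  seen: readonly string[] = [],
): Json {
  if (Array.isArray(node)) return node.map((item) => dereference(item, defs, seen));
  if (node === null || typeof node !== "object") return node;

  const ref = node["$ref"];
  if (typeof ref === "string") {
    const target = defs[ref];
    if (target === undefined) throw new Error(`Unresolved $ref: ${ref}`);
    if (seen.includes(ref)) throw new Error(`Cyclic $ref: ${ref}`);
    return dereference(target, defs, [...seen, ref]);
  }

  const out: { [key: string]: Json } = {};
  for (const [key, value] of Object.entries(node)) {
    out[key] = dereference(value, defs, seen);
  }
  return out;
}

const defs: Record<string, Json> = {
  Status: { anyOf: [{ const: "pending" }, { const: "completed" }] },
};
const callNode: Json = {
  type: "object",
  properties: { requestId: { type: "string" }, status: { $ref: "Status" } },
};

const inlined = dereference(callNode, defs);
console.log(JSON.stringify(inlined));
// {"type":"object","properties":{"requestId":{"type":"string"},"status":{"anyOf":[{"const":"pending"},{"const":"completed"}]}}}
```

With refs inlined, a change to the `Status` definition shows up as edits under `/properties/status` in the diff of the entry itself, instead of leaving `{ $ref: "Status" }` textually unchanged.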
## Design Decisions

### SE1: Additive-only for v1, Cast migration when needed

For v1, schema changes should be additive (new optional fields, new types,
new enum values). This avoids data migration entirely. When additive-only is
insufficient, `Value.Cast` handles the common migration cases. Custom
migration functions are the consumer's responsibility.

### SE2: Version as a coarse-grained breaking-change signal

The `version` integer on `graph_types` tracks **breaking** schema changes.
Non-breaking changes (additive) do not require a version bump. This is a
coarse signal: the repository layer checks version before processing and
knows to run migration logic when it doesn't match.

### SE3: Schema change detection via Value.Diff, not manual tracking

Rather than maintaining a separate "schema version log" or changelog, the
repository layer uses `Value.Diff(storedSchema, moduleEntry)` to detect when
a stored schema has diverged from the current Module entry. This is
schema-agnostic and works for any change.

### SE4: moduleToDbSchema() for schema updates, not Value.Patch

When updating stored schemas, re-run `moduleToDbSchema()` on the full Module
rather than using `Value.Patch` to apply edits. This is more reliable because
it doesn't depend on Diff correctly capturing `Type.Ref`/`$defs` changes.
Patch-based schema update is an optimization for later.

### SE5: Single-author model, not CRDT

Schema evolution assumes single-author per graph type. There is no concurrent
multi-author editing of graph types. If this changes (multiple consumers
defining the same graph type with different schemas), a merge/CRDT layer would
be needed. That's a post-v1 concern.

## Open Questions

1. **Can `Edit[]` from `Value.Diff` be reliably classified as breaking vs
   non-breaking?** The classification table above is theoretical. A POC should
   validate whether the `Edit[]` output contains enough information to
   distinguish, for example, `String → Literal("x")` (narrowing, non-breaking)
   from `String → Number` (incompatible, breaking). Alternative: skip
   classification and just use `Value.Check(newSchema, storedData)` for
   verification.

2. **Should the repository layer auto-migrate data on schema change, or
   require explicit consumer action?** Auto-migration is simpler for consumers
   but risky (data transformation without consumer awareness). Explicit
   migration is safer but more boilerplate. **Decision (conditional on OQ1
   POC outcome):** if classification is feasible (OQ1 POC succeeds), the
   repository layer auto-applies `Value.Cast` for changes it classifies as
   non-breaking, and requires explicit consumer action for breaking changes.
   If classification is not feasible, the fallback is: the repository layer
   auto-applies `Value.Cast` only when `Value.Check(newSchema, storedData)`
   passes for all stored data (verification, not classification), and requires
   explicit consumer action otherwise. This ensures auto-migration never
   corrupts data; if in doubt, the consumer decides.

3. **How does this interact with the hub's event-sourced call graph
   persistence?** If the hub migrates to event-sourced replay (projector
   evolution), storage's call graph tables become disposable projections and
   `Value.Cast` migration is unnecessary. But other graph types (ACL, tasks,
   secrets) may not have an event stream to replay from. The schema evolution
   design should work for both projections and direct-persisted data.

4. **Should schema evolution events be part of the event stream?** If the
   system is event-sourced, schema changes themselves could be events
   (`schema.updated`, `schema.version_bumped`). This would give a full audit
   trail of schema evolution, but adds complexity. **Decision: post-v1.** For
   v1, schema changes are applied directly via the repository layer with version
   tracking.

## References

- TypeBox `Value` namespace: `/workspace/@alkdev/typebox/src/value/`
- TypeBox `Value.Diff`/`Value.Patch`: `/workspace/@alkdev/typebox/src/value/delta/delta.ts`
- TypeBox `Value.Cast`: `/workspace/@alkdev/typebox/src/value/cast/cast.ts`
- TypeBox `Value.Check`: `/workspace/@alkdev/typebox/src/value/check/check.ts`
- Event Log as Source of Truth (ADR-005): `/workspace/@alkdev/flowgraph/docs/architecture/decisions/005-event-log-as-source-of-truth.md`
- Call protocol: `/workspace/@alkdev/operations/docs/architecture/call-protocol.md`
- Metagraph Module: [metagraph-module.md](./metagraph-module.md)
- Current schema versioning: [metagraph.md](./metagraph.md)