- Create decisions/ directory with 32 numbered ADRs (ADR-001 through ADR-032) extracted from inline DD/SD/ED/SE decision sections
- Create open-questions.md with 16 OQs organized by theme, cross-referenced to ADRs, with status tracking (resolved/open)
- Create README.md as architecture index with doc table, ADR table, and lifecycle status definitions (draft/reviewed/stable/deprecated)
- Replace inline decision sections in all spec docs with ADR reference tables
- Replace inline open questions with OQ references to centralized tracker
- Update frontmatter: metagraph-module.md, overview.md, sqlite-host.md → reviewed; schema-evolution.md and encrypted-data.md remain draft
- DD1-DD10 → ADR-009 through ADR-018
- D1-D8 → ADR-001 through ADR-008
- SD1-SD5 → ADR-019 through ADR-023 (SD5 folded into ADR-006/008)
- ED1-ED5 → ADR-023 through ADR-027
- SE1-SE5 → ADR-028 through ADR-032
| status | last_updated |
|---|---|
| draft | 2026-05-28 |
Schema Evolution
How graph type schemas evolve over time — detecting changes, classifying their
impact, and migrating stored data. Uses TypeBox's Value.Diff/Value.Patch/
Value.Cast to operate on schemas-as-JSON and data-as-JSON, aligned with the
ecosystem's event-sourced design.
Overview
The @alkdev ecosystem is event-driven. Call protocol events are append-only. Graph instances in storage are projections — materialized views produced by replaying events through projector functions. When a graph type's schema changes, stored data may need migration, and the repository layer needs to detect and handle the change.
The key insight: TypeBox schemas are JSON. They're JSON Schema objects
stored as JSON in node_types.schema and edge_types.schema columns (text in
SQLite, jsonb in PG). Because they're JSON, TypeBox's own Value.Diff,
Value.Patch, and Value.Cast can operate on them directly — diffing schemas
to detect changes, patching stored schemas to update them, and casting stored
data to fit new schema shapes.
Two distinct domains of JSON values are involved:
- Schemas-as-JSON: The TypeBox schema objects stored in node_types.schema and edge_types.schema columns. Value.Diff/Value.Patch operate on these (detecting schema changes, updating stored schemas).
- Data-as-JSON: The node/edge attribute values stored in nodes.attributes and edges.attributes columns. Value.Cast/Value.Check operate on these (migrating data to fit new schemas, verifying compatibility).
This is not a migration framework. It's the observation that the existing TypeBox value system, combined with schemas-as-JSON storage, gives us schema evolution primitives for free — we just need to wire them together.
The Edit Type
Value.Diff returns an array of Edit objects — structural delta operations
that transform one JSON value into another:
type Insert = { type: "insert", path: string, value: unknown }
type Update = { type: "update", path: string, value: unknown }
type Delete = { type: "delete", path: string }
type Edit = Insert | Update | Delete
Paths use RFC 6901 JSON Pointer format (/properties/requestId,
/properties/status/anyOf/0/const). The value field contains the inserted
or updated value.
Value.Patch(current, edits) applies the edits to current and returns the
result. Value.Diff and Value.Patch are inverses:
Patch(current, Diff(current, next)) ≈ next.
Critical: Edit paths are structural, not semantic. An Update at path
/properties/status could mean type narrowing (String → Literal) or type
change (String → Number). Diff doesn't know — it only sees the raw JSON
structure. Classification logic (breaking vs non-breaking) must interpret the
edits with schema awareness.
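To make the Edit shape concrete, here is a minimal sketch of applying an Edit[] to a plain JSON object. It is not TypeBox's Value.Patch implementation: applyEdits and pointerSegments are hypothetical helpers, the edits are written by hand rather than produced by Value.Diff, and root-level edits are not handled.

```typescript
type Insert = { type: "insert"; path: string; value: unknown };
type Update = { type: "update"; path: string; value: unknown };
type Delete = { type: "delete"; path: string };
type Edit = Insert | Update | Delete;

type JsonObject = { [key: string]: unknown };

// Split an RFC 6901 JSON Pointer into its unescaped segments.
function pointerSegments(path: string): string[] {
  return path
    .split("/")
    .slice(1)
    .map((s) => s.replace(/~1/g, "/").replace(/~0/g, "~"));
}

// Hypothetical helper: apply edits to a deep copy of `current`.
function applyEdits(current: JsonObject, edits: Edit[]): JsonObject {
  const next = JSON.parse(JSON.stringify(current)) as JsonObject;
  for (const edit of edits) {
    const keys = pointerSegments(edit.path);
    const last = keys.pop();
    if (last === undefined) throw new Error("root-level edits are out of scope here");
    // Walk to the parent container of the edited key.
    let target = next;
    for (const key of keys) target = target[key] as JsonObject;
    if (edit.type === "delete") delete target[last];
    else target[last] = edit.value; // insert and update both assign
  }
  return next;
}

const storedSchema: JsonObject = {
  type: "object",
  properties: { status: { type: "string" }, legacy: { type: "string" } },
};

// Hand-written edits in the shape described above (not real Value.Diff output).
const edits: Edit[] = [
  { type: "update", path: "/properties/status/type", value: "number" },
  { type: "insert", path: "/properties/priority", value: { type: "number", default: 0 } },
  { type: "delete", path: "/properties/legacy" },
];

const patched = applyEdits(storedSchema, edits);
console.log(JSON.stringify(patched.properties));
```

Nothing in the update edit says whether the change from "string" to "number" is safe. The applier only moves values around, which is why classification has to happen in a separate, schema-aware step.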
The Event-Sourced Context
The @alkdev ecosystem uses an append-only event model:
- Call protocol (@alkdev/operations): 5 typed events (call.requested, call.responded, call.completed, call.aborted, call.error) form an append-only event stream per workflow execution
- Event log as source of truth (@alkdev/flowgraph ADR-005): The in-memory CallEventMapValue[] is the authoritative state; projected views (status, results, call graph) are derived from it
- Hub persistence: The hub persists call protocol events to Postgres and replays them to reconstruct reactive state after restart
- Storage as projection: The metagraph tables store the projected state (graph instances, nodes, edges), not the raw event stream
This means storage's graph data is analogous to a read model in event-sourced systems. Schema evolution in storage is projection migration, not event migration. The event stream (managed by the hub) retains full history; storage's tables hold the current materialized view.
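The projection relationship can be illustrated with a small projector that folds an event list into node records. This is a sketch under assumptions: only the five event names come from the call protocol, while the payload fields (requestId, operationId), the CallNode shape, and the project function are hypothetical.

```typescript
// Hypothetical event payloads; only the event names are taken from the call protocol.
type CallEvent =
  | { type: "call.requested"; requestId: string; operationId: string }
  | { type: "call.responded"; requestId: string }
  | { type: "call.completed"; requestId: string }
  | { type: "call.aborted"; requestId: string }
  | { type: "call.error"; requestId: string };

interface CallNode {
  requestId: string;
  operationId: string;
  status: "requested" | "responded" | "completed" | "aborted" | "error";
}

// Projector: fold the append-only stream into the current materialized view.
function project(events: readonly CallEvent[]): Record<string, CallNode> {
  const nodes: Record<string, CallNode> = {};
  for (const event of events) {
    if (event.type === "call.requested") {
      nodes[event.requestId] = {
        requestId: event.requestId,
        operationId: event.operationId,
        status: "requested",
      };
      continue;
    }
    const node = nodes[event.requestId];
    if (!node) continue; // events for unknown calls are ignored in this sketch
    // "call.completed" becomes "completed", and so on.
    node.status = event.type.slice("call.".length) as CallNode["status"];
  }
  return nodes;
}

const log: CallEvent[] = [
  { type: "call.requested", requestId: "req-001", operationId: "op-call" },
  { type: "call.responded", requestId: "req-001" },
  { type: "call.completed", requestId: "req-001" },
];

// Replaying the same log twice yields the same view.
const firstView = project(log);
const secondView = project(log);
console.log(JSON.stringify(firstView) === JSON.stringify(secondView));
```

Changing the CallNode shape here means changing project and replaying, which is the projector-evolution path. Migrating rows already written to storage is the projection-migration path this document focuses on.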
Single-Author, Not CRDT
Unlike Yjs or Automerge, the @alkdev event model is single-author per session with central coordination. There are no concurrent multi-author edit conflicts. This means:
- Idempotent replay (same events → same state) is sufficient; CRDT merge semantics are not needed
- Schema evolution is forward-only (new code processes old data), not bidirectional (concurrent versions merging)
- Migration is apply-on-read or apply-on-write, not conflict resolution
If future use cases require multi-author concurrent graph editing (e.g., collaborative task boards), a CRDT layer would need to sit between the event stream and storage. That's a post-v1 concern.
Schema-as-JSON: Value.Diff on Schemas
TypeBox schemas are JSON Schema objects — plain JSON values. The current
node_types.schema and edge_types.schema columns store them as JSON text
(SQLite) or jsonb (PG). This means Value.Diff can diff schemas themselves.
Detecting Schema Changes
import { Value } from "@alkdev/typebox";
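// Stored schema (from DB). Await the row first, then read its schema column;
// reading .schema directly off the un-awaited query would return undefined.
const storedRow = await db.query.nodeTypes.findFirst({
  where: eq(nodeTypes.name, "Call"),
});
const storedSchema = storedRow?.schema;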
// Current schema (from Module)
const currentSchema = CallGraph.CallNode;
// Diff the schemas
const edits = Value.Diff(storedSchema, currentSchema);
if (edits.length === 0) {
// No change — schema matches
} else {
// Schema has changed — classify the edits
}
Classifying Schema Edits
Value.Diff is schema-agnostic — it diffs raw JSON structure without
understanding JSON Schema semantics. A classification layer is needed to
determine whether an edit is breaking or non-breaking.
| Edit pattern | Schema-level meaning | Breaking? |
|---|---|---|
| Insert new property with default | New field with default | No |
| Insert new property without default | New required field; old data won't have it | Yes (unless Type.Optional) |
| Insert new property with Optional wrapper | New optional field; old data valid | No |
| Update property type: String → Literal("x") | Type narrowing (subtype) | No (existing data with "x" is valid) |
| Update property type: String → Number | Type change (incompatible) | Yes |
| Update property type: add Optional wrapper | Making field optional | No (backward compatible) |
| Delete property | Field removed from schema | Yes (old data with this field is non-conforming if additionalProperties: false) |
| Update $defs references | Cross-reference changes | Depends (see note) |
⚠️ Needs POC: The classification above is theoretical. Whether this can be
done reliably from Edit[] objects (which are raw JSON pointer + value pairs)
needs validation. POC scope:
- Test corpus: ~20 schema change patterns covering each row in the table above (add optional field, add required field, narrow type, change type, remove field, add enum value, $ref change)
- Success criteria: Classification accuracy >95% against expected breaking/non-breaking labels for each pattern
- Fallback: If classification accuracy is too low, use Strategy C (hybrid with Value.Check verification) instead of Strategy A.
Three Detection Strategies
Strategy A: Diff schemas, classify edits
Value.Diff(storedSchema, currentSchema) → Edit[]
classify(Edit[]) → { breaking: boolean, edits: ClassifiedEdit[] }
Pros: Identifies what changed and how. Enables targeted migration (only migrate data affected by breaking changes).
Cons: Classification is hard to get right. Schema semantics are rich (type
narrowing, union widening, $ref changes, additionalProperties interactions).
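As a starting point for the POC, Strategy A's classify step might look like the sketch below. Everything in it is an assumption: the Verdict labels, the ObjectSchema view, and the rules, which cover only whole-property inserts and deletes and defer every other edit to verification. The edits are hand-written, not real Value.Diff output.

```typescript
type Edit =
  | { type: "insert"; path: string; value: unknown }
  | { type: "update"; path: string; value: unknown }
  | { type: "delete"; path: string };

type Verdict = "non-breaking" | "breaking" | "needs-check";
type ClassifiedEdit = { edit: Edit; verdict: Verdict };

// Minimal view of the new object schema: only what the rules below read.
interface ObjectSchema {
  required?: string[];
  properties: Record<string, { default?: unknown } | undefined>;
}

function classifyEdit(edit: Edit, next: ObjectSchema): Verdict {
  const match = /^\/properties\/([^/]+)(\/.*)?$/.exec(edit.path);
  // Anything outside /properties (e.g. /required, /$defs) is left for verification.
  if (!match) return "needs-check";
  const name = match[1] ?? "";
  const nested = match[2] !== undefined;
  // Changes inside an existing property (type, anyOf, ...) need schema semantics.
  if (nested || edit.type === "update") return "needs-check";
  if (edit.type === "delete") return "breaking";
  // Whole-property insert: breaking only if required and without a default.
  const isRequired = (next.required ?? []).indexOf(name) !== -1;
  const hasDefault = next.properties[name]?.default !== undefined;
  return isRequired && !hasDefault ? "breaking" : "non-breaking";
}

function classify(
  edits: Edit[],
  next: ObjectSchema,
): { breaking: boolean; edits: ClassifiedEdit[] } {
  const classified = edits.map((edit) => ({ edit, verdict: classifyEdit(edit, next) }));
  return { breaking: classified.some((c) => c.verdict === "breaking"), edits: classified };
}

const nextSchema: ObjectSchema = {
  required: ["requestId", "priority", "owner"],
  properties: { requestId: {}, priority: { default: 0 }, owner: {}, note: {} },
};

const result = classify(
  [
    { type: "insert", path: "/properties/priority", value: { type: "number", default: 0 } },
    { type: "insert", path: "/properties/note", value: { type: "string" } },
    { type: "insert", path: "/properties/owner", value: { type: "string" } },
    { type: "update", path: "/properties/status/type", value: "number" },
    { type: "delete", path: "/properties/legacy" },
  ],
  nextSchema,
);
console.log(result.breaking, result.edits.map((c) => c.verdict));
```

The "needs-check" verdict is deliberate: rather than guessing at narrowing or $ref semantics, the sketch hands those edits to a data-level check, which is where Strategy A shades into Strategy C.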
Strategy B: Diff schemas, test data compatibility
Value.Diff(storedSchema, currentSchema) → Edit[]
if (edits.length > 0) {
// Schema changed — test if existing data is still valid
const sampleData = fetchSampleNodeData(graphTypeId, nodeTypeName);
const compatible = sampleData.every(d => Value.Check(currentSchema, d));
}
Pros: Simple. No classification logic needed. Works with any schema change.
Cons: Requires fetching data to test. Binary answer (compatible or not) — no information about what changed. Sample may not be representative.
Strategy C: Hybrid — diff for detection, Check for verification
edits = Value.Diff(storedSchema, currentSchema)
if (edits.length === 0) return "unchanged";
// Schema changed. Fast-path: check if version bump covers it.
// storedVersion = graph_types.version from DB
// currentVersion = consumer-defined constant (e.g., CURRENT_CALL_GRAPH_VERSION = 2)
if (storedVersion < currentVersion) return "version-mismatch";
// No version bump but schema changed — non-breaking change expected.
// Verify by checking stored data against new schema.
// findIncompatibleNodes = repository query that returns nodes
// where Value.Check(currentSchema, node.attributes) is false
const incompatible = await findIncompatibleNodes(graphTypeId, nodeTypeName, currentSchema);
if (incompatible.length === 0) return "non-breaking";
return "breaking";
findIncompatibleNodes is a repository query function: fetch nodes of the
given type in the given graph, filter to those where
!Value.Check(currentSchema, node.attributes). currentVersion is a
consumer-defined integer constant that the consumer increments when making
a breaking change to their graph type Module.
This combines detection (diff), metadata (version), and verification (check) and is likely the most robust approach. Recommended for POC exploration.
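The decision order in Strategy C can be isolated as a small function whose inputs are injected, which keeps it easy to test in a POC. This is a sketch under assumptions: detectSchemaChange and DetectionInput are hypothetical names, and countIncompatible is synchronous here for brevity, whereas a real repository query would be async.

```typescript
type Detection = "unchanged" | "version-mismatch" | "non-breaking" | "breaking";

interface DetectionInput {
  editCount: number;      // Value.Diff(storedSchema, currentSchema).length
  storedVersion: number;  // graph_types.version from the DB
  currentVersion: number; // consumer-defined constant
  // Lazily counts stored nodes that fail Value.Check against the new schema.
  countIncompatible: () => number;
}

function detectSchemaChange(input: DetectionInput): Detection {
  // 1. Detection: no structural diff means nothing to do.
  if (input.editCount === 0) return "unchanged";
  // 2. Metadata: a declared version bump short-circuits the data scan.
  if (input.storedVersion < input.currentVersion) return "version-mismatch";
  // 3. Verification: only now pay for checking stored data.
  return input.countIncompatible() === 0 ? "non-breaking" : "breaking";
}

let scans = 0;
const scan = (count: number) => () => {
  scans += 1;
  return count;
};

console.log(
  detectSchemaChange({ editCount: 0, storedVersion: 2, currentVersion: 2, countIncompatible: scan(5) }),
  detectSchemaChange({ editCount: 3, storedVersion: 2, currentVersion: 4, countIncompatible: scan(5) }),
  detectSchemaChange({ editCount: 3, storedVersion: 2, currentVersion: 2, countIncompatible: scan(0) }),
  detectSchemaChange({ editCount: 3, storedVersion: 2, currentVersion: 2, countIncompatible: scan(5) }),
);
```

Passing the scan as a callback means the potentially expensive data check runs only when the diff and version checks cannot settle the question.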
Data Migration via Value.Cast
Once a schema change is detected and classified, Value.Cast can migrate
existing data to fit the new schema shape.
How Value.Cast Works
Value.Cast(schema, value) attempts to fit a value into the shape defined by
the schema:
- Matching properties: Retained from the original value
- Missing required properties with defaults: Filled from schema.default
- Missing required properties without defaults: Created as zeros (0, "", false, {}, [])
- Unknown properties: Dropped if additionalProperties: false, retained otherwise
- Union types: Each variant is scored and the best match is selected
This is exactly what's needed for data migration — reshape stored node attributes to fit a new schema while preserving matching fields.
Example: Adding a Required Field with Default
// Schema v1 (stored in DB)
const CallNodeV1 = Type.Object({
requestId: Type.String(),
operationId: Type.String(),
status: Type.Union([...]),
});
// Schema v2 (new Module entry — added priority field)
const CallNodeV2 = Type.Object({
requestId: Type.String(),
operationId: Type.String(),
status: Type.Union([...]),
priority: Type.Number({ default: 0 }), // new field with default
});
// Existing data
const oldNode = {
requestId: "req-001",
operationId: "op-call",
status: "completed",
};
// Cast migrates the data
const migratedNode = Value.Cast(CallNodeV2, oldNode);
// → { requestId: "req-001", operationId: "op-call", status: "completed", priority: 0 }
Example: Type Narrowing (Non-Breaking)
// Schema v1: type: Type.String()
// Schema v2: type: Type.Literal("triggered")
const oldEdge = { type: "triggered", metadata: {} };
const migrated = Value.Cast(CallGraphV2.TriggeredEdge, oldEdge);
// → { type: "triggered", metadata: {} } (unchanged — "triggered" satisfies Literal)
Example: Type Change (Breaking)
// Schema v1: priority: Type.Number()
// Schema v2: priority: Type.Union([Type.Literal("low"), Type.Literal("high")])
const oldNode = { requestId: "req-001", priority: 3 };
const migrated = Value.Cast(CallNodeV2, oldNode);
// Cast scores each union variant against the value.
// Neither "low" nor "high" matches 3 — Cast picks the best match,
// but the result is likely incorrect (data loss).
⚠️ Cast limitation: Value.Cast does not provide custom migration
functions. For breaking changes that require transformation logic (e.g.,
priority: 3 → priority: "high" based on a threshold), the repository layer
needs a custom migration handler. Cast handles the common case (add fields,
narrow types, drop removed fields); custom logic handles the rest.
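A custom migration handler for the priority example might look like the sketch below. The handler registry, the MigrationHandler signature, and the threshold value are all assumptions made for illustration; the document does not define this API.

```typescript
type Attributes = Record<string, unknown>;
type MigrationHandler = (old: Attributes) => Attributes;

// Hypothetical threshold: numeric priorities at or above it map to "high".
const HIGH_PRIORITY_THRESHOLD = 2;

const migratePriority: MigrationHandler = (old) => {
  const priority = old["priority"];
  // Already migrated (or absent): leave untouched so the handler is idempotent.
  if (typeof priority !== "number") return old;
  return { ...old, priority: priority >= HIGH_PRIORITY_THRESHOLD ? "high" : "low" };
};

// Hypothetical registry keyed by node type name. Types without a handler
// would fall through to Value.Cast in the repository layer.
const customMigrations: Record<string, MigrationHandler | undefined> = {
  Call: migratePriority,
};

function migrateAttributes(nodeType: string, attributes: Attributes): Attributes {
  const handler = customMigrations[nodeType];
  return handler ? handler(attributes) : attributes;
}

const migratedCall = migrateAttributes("Call", { requestId: "req-001", priority: 3 });
console.log(migratedCall);
```

Making the handler idempotent matters if a crashed migration is resumed, because some nodes may pass through it twice.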
Schema-as-JSON: Patching Stored Schemas
When a graph type Module changes, the stored schema in node_types.schema must
be updated. Value.Patch can apply the diff to the stored schema:
// 1. Diff current stored schema against new Module entry
const edits = Value.Diff(storedSchema, CallGraph.CallNode);
// 2. Patch the stored schema
const newStoredSchema = Value.Patch(storedSchema, edits);
// 3. Update the DB row
await db.update(nodeTypes)
.set({ schema: newStoredSchema, updatedAt: new Date() })
.where(eq(nodeTypes.name, "Call"));
This is simpler than reconstructing the full Module and running
moduleToDbSchema() again. But it requires the same caveats as Value.Diff —
the edits are structural, not semantic. If the Module entry's $defs structure
changed (e.g., a Type.Ref target was updated), the diff may not capture the
semantic change correctly.
Alternative: Re-run moduleToDbSchema() on the updated Module and write the
full output. This is simpler and more reliable but requires the full Module —
not just the schema entry — to be available at migration time.
Decision for v1: Re-run moduleToDbSchema() on the updated Module. The
Module is always available when the consumer defines graph types. Patch-based
schema update is an optimization for later.
Evolution Strategies
Additive-Only (Recommended for v1)
The simplest strategy: only add optional fields and new node/edge types. Never remove or rename existing fields.
- New optional fields: Type.Optional(Type.String()); old data is still valid under the new schema
- New node/edge types: New rows in node_types/edge_types; existing rows unchanged
- New enum values: Add to Type.Union of Type.Literal; old data with existing values is still valid
This avoids the need for Value.Cast migrations entirely. The version field
on graph_types stays at 1.
When additive-only breaks down: If a field was incorrectly designed (wrong type, wrong name, wrong semantics), additive-only forces you to deprecate the old field and add a new one. The deprecated field stays in the schema forever. This is acceptable for early development but creates technical debt over time.
Version-Bumped with Cast Migration
When a breaking change is needed, bump the graph_types.version integer. The
repository layer checks the version before processing and applies Value.Cast
to migrate data when the version doesn't match.
const graphType = await findGraphType("call-graph");
const currentModule = CallGraph;
if (graphType.version < CURRENT_CALL_GRAPH_VERSION) {
// Mark migration in progress
await updateGraphTypeVersion(graphType.id, graphType.version + 1);
// Schema has changed — migrate all nodes
const nodes = await findNodesByGraphType(graphType.id);
for (const node of nodes) {
const migrated = Value.Cast(
currentModule[`${node.nodeType}Node`],
node.attributes,
);
// Guard: verify the cast result before writing
if (!Value.Check(currentModule[`${node.nodeType}Node`], migrated)) {
throw new Error(
`Cast produced invalid data for node ${node.id}. ` +
`Custom migration required for this schema change.`
);
}
await updateNode(node.id, { attributes: migrated });
}
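// Update the schema and finalize the version. graphType.version still holds the
// value read before migration, so + 2 lands on the next even (stable) number;
// + 1 would leave the version odd, i.e. still marked as in progress.
await updateNodeTypesSchemas(graphType.id, moduleToDbSchema(currentModule));
await updateGraphTypeVersion(graphType.id, graphType.version + 2);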
}
Version bump contract (decision: even/odd scheme):
- Even version: Stable schema, no pending migrations
- Odd version: Migration in progress — reads return stale data or error (consumer-configurable)
- After migration: Version is bumped to the next even number
Migration Safety
Partial migration is the primary risk — if the process crashes or errors on node N of M, some nodes are migrated and some are not. The even/odd version scheme provides the recovery mechanism:
- Before migration: Set version to odd (in-progress state)
- During migration: Each node is cast, verified with Value.Check, then written. If any cast produces invalid data, the migration aborts with an error, and the consumer must provide a custom migration function for that schema change.
- After migration: Set version to even (stable state), update stored schemas via moduleToDbSchema()
- Recovery: If the process crashes during migration, the odd version signals that migration is incomplete. On restart, the repository layer detects the odd version and can either resume the migration (migrate the remaining nodes) or roll back (restore from backup / re-project from events).
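The even/odd contract reduces to a few pure helpers, sketched here. The function names and the ReadPolicy type are hypothetical; only the even-is-stable, odd-is-in-progress rule comes from the text.

```typescript
type MigrationState = "stable" | "in-progress";

// Even = stable, odd = migration in progress.
function migrationState(version: number): MigrationState {
  return version % 2 === 0 ? "stable" : "in-progress";
}

// Stable (even) to in-progress (odd).
function beginMigration(version: number): number {
  if (migrationState(version) !== "stable") {
    throw new Error(`version ${version} is already mid-migration; resume or roll back first`);
  }
  return version + 1;
}

// In-progress (odd) to the next stable (even) version.
function finishMigration(version: number): number {
  if (migrationState(version) !== "in-progress") {
    throw new Error(`version ${version} has no migration in progress`);
  }
  return version + 1;
}

// Hypothetical consumer-configurable read policy during migration.
type ReadPolicy = "reject" | "stale";
function canRead(version: number, policy: ReadPolicy): boolean {
  return migrationState(version) === "stable" || policy === "stale";
}

const started = beginMigration(2);
const finished = finishMigration(started);
console.log(started, finished, canRead(started, "reject"));
```

On restart, a repository layer could call migrationState on the stored version to decide whether recovery is needed before serving any requests.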
Read behavior during migration: Consumer-configurable. Options:
- Reject reads with an error ("migration in progress")
- Return stale data (from unmigrated nodes)
- Return mixed data (some migrated, some not — not recommended)
Cast safety guard: Every Value.Cast result must be verified with
Value.Check(newSchema, migratedData) before writing. If the check fails,
the migration is classified as breaking after all, and the consumer must
provide a custom migration function. Document this as a hard requirement.
Performance: For large datasets (10K+ nodes), the migration loop should
batch reads/writes to avoid holding all nodes in memory. The graph type is
effectively read-only during migration — writes to that graph type's nodes
should be rejected while version is odd.
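A batched version of the migration loop might be structured as below. This is a sketch with assumptions: MigrationDeps and its four callbacks are hypothetical stand-ins for the repository layer, Value.Cast, and Value.Check. The calls are synchronous to keep the example short; real database calls would be async. Paging uses an id cursor and assumes rows are returned in id order.

```typescript
interface NodeRow {
  id: string;
  attributes: Record<string, unknown>;
}

interface MigrationDeps {
  // Hypothetical repository calls; a real implementation would hit the DB.
  fetchBatch: (afterId: string | null, limit: number) => NodeRow[];
  writeBatch: (rows: NodeRow[]) => void;
  // Stand-ins for Value.Cast(schema, a) and Value.Check(schema, a).
  cast: (attributes: Record<string, unknown>) => Record<string, unknown>;
  check: (attributes: Record<string, unknown>) => boolean;
}

// Migrate in pages so only one batch is held in memory at a time.
function migrateInBatches(deps: MigrationDeps, batchSize: number): number {
  let cursor: string | null = null;
  let migrated = 0;
  for (;;) {
    const batch: NodeRow[] = deps.fetchBatch(cursor, batchSize);
    if (batch.length === 0) return migrated;
    const next: NodeRow[] = batch.map((row) => {
      const attributes = deps.cast(row.attributes);
      // Cast safety guard: never write unverified data.
      if (!deps.check(attributes)) {
        throw new Error(`Cast produced invalid data for node ${row.id}`);
      }
      return { id: row.id, attributes };
    });
    deps.writeBatch(next);
    migrated += next.length;
    const last = batch[batch.length - 1];
    if (last) cursor = last.id;
  }
}

// In-memory stand-in for the nodes table, ordered by id.
const store: NodeRow[] = [
  { id: "n1", attributes: { requestId: "req-001" } },
  { id: "n2", attributes: { requestId: "req-002" } },
  { id: "n3", attributes: { requestId: "req-003" } },
];

const migratedCount = migrateInBatches(
  {
    fetchBatch: (afterId, limit) =>
      store.filter((r) => afterId === null || r.id > afterId).slice(0, limit),
    writeBatch: (rows) => {
      for (const row of rows) {
        const i = store.findIndex((r) => r.id === row.id);
        if (i >= 0) store[i] = row;
      }
    },
    cast: (a) => ({ priority: 0, ...a }),
    check: (a) => typeof a["priority"] === "number",
  },
  2,
);
console.log(migratedCount);
```

Because the check runs before each batch is written, a failure aborts with earlier batches already persisted. That is exactly the partial-migration state the odd version number is meant to flag.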
This is the simplest versioning that handles breaking changes. It's
application-level (the consumer decides when to bump and migrate), not
framework-level (storage doesn't auto-migrate). graph_types.version is
consistent with @alkdev/flowgraph ADR-004: flowgraph itself doesn't version
schemas (it's in-memory), but the persisting consumer (storage) provides the
versioned envelope.
Event-Sourced Replay (Forward-Looking)
In a fully event-sourced model, schema evolution is handled by replaying events through updated projector functions. Storage's tables are rebuilt from the event log, so no data migration is needed — you just re-project.
This requires the event log to be the authoritative source and storage to be a disposable projection. The hub's call graph persistence (Postgres) is approaching this model: events are persisted, and state is reconstructed by replay. But the current metagraph tables are not rebuilt from events — they're written directly by the repository layer.
If the hub migrates to a model where call graph nodes/edges are projections
of the call protocol event stream, then schema evolution becomes projector
evolution — update the projector, replay the events, rebuild the projection.
No Value.Cast needed for stored data (because there is no stored data — just
re-projected views).
This is a post-v1 design — it requires the event log to be the primary
persistence, with storage tables as read-optimized projections. The current
model writes directly to storage tables; the event stream is separate (managed
by the hub's call graph module, not by @alkdev/storage).
Relationship to the Module
The Metagraph Module (metagraph-module.md) is the
source of truth for graph type schemas. When the Module changes:
1. moduleToDbSchema() produces the updated DB row values
2. Value.Diff(storedSchema, moduleEntry) detects what changed
3. Value.Cast(moduleEntry, storedData) migrates affected node/edge data
4. Value.Patch(storedSchema, edits) updates the stored schema (or re-run moduleToDbSchema())
The Module's $defs structure adds complexity to diffing — a Type.Ref
resolution changes the effective schema in ways that a raw JSON diff might not
capture. Using moduleToDbSchema() (which resolves refs before writing to the
DB) avoids this problem — the stored schema is already dereferenced.
Design Decisions
All design decisions are documented as ADRs in decisions/.
| ADR | Decision | Summary |
|---|---|---|
| 028 | Additive-only for v1, Cast migration when needed | Additive changes avoid migration; Value.Cast for common cases |
| 029 | Version as a coarse-grained breaking-change signal | Only breaking changes bump the version; even/odd for migration state |
| 030 | Schema change detection via Value.Diff | No manual changelog; diff stored vs current schemas |
| 031 | moduleToDbSchema() for schema updates | Re-run full Module projection, not Value.Patch |
| 032 | Single-author model, not CRDT | No concurrent multi-author graph types |
Open Questions
Open questions are tracked in open-questions.md. Key questions affecting schema evolution:
- OQ-10: Can Edit[] from Value.Diff be reliably classified as breaking vs non-breaking?
- OQ-11: Should the repository layer auto-migrate data on schema change, or require explicit consumer action?
- OQ-12: How does schema evolution interact with the hub's event-sourced call graph persistence?
- OQ-13: Should schema evolution events be part of the event stream?
References
- TypeBox Value namespace: /workspace/@alkdev/typebox/src/value/
- TypeBox Value.Diff/Value.Patch: /workspace/@alkdev/typebox/src/value/delta/delta.ts
- TypeBox Value.Cast: /workspace/@alkdev/typebox/src/value/cast/cast.ts
- TypeBox Value.Check: /workspace/@alkdev/typebox/src/value/check/check.ts
- Event Log as Source of Truth (ADR-005): /workspace/@alkdev/flowgraph/docs/architecture/decisions/005-event-log-as-source-of-truth.md
- Call protocol: /workspace/@alkdev/operations/docs/architecture/call-protocol.md
- Metagraph Module: metagraph-module.md
- Schema versioning in the data model: metagraph-module.md (Versioning section and DD3)