flowgraph/docs/architecture/decisions/003-storage-decoupled.md
glm-5.1 d2253099ee add flowgraph architecture docs (Phase 1 SDD)
Draft architecture specification for @alkdev/flowgraph — a workflow graph library providing DAG-based orchestration over operations. Covers two graph types (operation graph, call graph), ujsx workflow templates, GraphologyHost and ReactiveHost configs, signal-driven execution, type-compatibility analysis, error hierarchy, and build/distribution. Includes 3 ADRs: ujsx as template IR, DAG-only enforcement, decoupled storage.
2026-05-19 09:36:22 +00:00


ADR-003: Decoupled Storage — In-Memory Graph with Export/Import Boundary

Status

Proposed

Context

Call graphs need to persist across hub restarts. The alkhub storage schema (call_graph_nodes and call_graph_edges tables) stores call data in Postgres. The question is: should flowgraph handle its own persistence, or should it provide a serialization boundary and let the hub handle storage?

Taskgraph takes the serialization boundary approach: export() returns a graphology JSON blob, fromJSON() restores it. The hub stores this data in whatever format it needs.

The alkhub call graph storage schema has specific requirements (payload truncation, redaction, indexing) that are storage-layer concerns, not graph concerns.

Decision

Flowgraph operates on in-memory graphology instances and provides export()/fromJSON() for serialization. Storage, persistence, and database operations are the hub's concern, not flowgraph's.

// In-memory graph
const graph = FlowGraph.fromCallEvents(events);

// Export for persistence
const data = graph.export();  // graphology native JSON

// Hub stores this in Postgres
await db.saveCallGraph(data);

// Restore from storage
const restored = FlowGraph.fromJSON(await db.loadCallGraph());
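Seen from the hub's side, the boundary might look like the following minimal sketch. It assumes a simplified serialized shape and invented row shapes: `SerializedGraph`, `NodeRow`, `EdgeRow`, `toRows`, `fromRows`, and all column names are hypothetical and are not taken from flowgraph or the alkhub schema. Only the two table names come from the Context section.

```typescript
// Simplified stand-in for the exported graph (assumption, not FlowGraphSerialized).
interface SerializedGraph {
  nodes: { key: string; attributes: Record<string, unknown> }[];
  edges: { source: string; target: string; attributes: Record<string, unknown> }[];
}

// Hypothetical row shapes for call_graph_nodes and call_graph_edges.
interface NodeRow { graph_id: string; node_key: string; attrs: string }
interface EdgeRow { graph_id: string; source_key: string; target_key: string; attrs: string }

// Flatten one exported graph into rows for the two tables.
function toRows(
  graphId: string,
  data: SerializedGraph,
): { nodes: NodeRow[]; edges: EdgeRow[] } {
  return {
    nodes: data.nodes.map((n) => ({
      graph_id: graphId,
      node_key: n.key,
      attrs: JSON.stringify(n.attributes),
    })),
    edges: data.edges.map((e) => ({
      graph_id: graphId,
      source_key: e.source,
      target_key: e.target,
      attrs: JSON.stringify(e.attributes),
    })),
  };
}

// Rebuild the serialized shape from rows, ready for a fromJSON()-style loader.
function fromRows(nodes: NodeRow[], edges: EdgeRow[]): SerializedGraph {
  return {
    nodes: nodes.map((r) => ({ key: r.node_key, attributes: JSON.parse(r.attrs) })),
    edges: edges.map((r) => ({
      source: r.source_key,
      target: r.target_key,
      attributes: JSON.parse(r.attrs),
    })),
  };
}
```

Both functions are pure, so the mapping can be unit-tested without a database, and the SQL itself stays entirely in the hub.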

Rationale

  1. Separation of concerns — flowgraph is a graph library, not a database client. Mixing graph operations with SQL queries violates the single-responsibility principle.

  2. Storage varies by consumer — the hub uses Postgres, but other consumers might use SQLite, IndexedDB, or in-memory caches. Flowgraph shouldn't prescribe a storage backend.

  3. The storage schema has concerns beyond the graph — payload truncation (10KB threshold), field redaction (stripping API keys), and indexing are storage-layer concerns. Flowgraph stores raw input/output/error fields; the hub handles truncation at the persistence boundary.

  4. Taskgraph's pattern works — the same approach has served taskgraph well. The hub loads graph data from DB, constructs a TaskGraph in memory, runs analysis, and saves changes back.

  5. Platform-agnostic requirement — flowgraph must work in Deno, Node, and Bun. Database clients vary by platform (native addons, connection pooling, etc.). Keeping flowgraph pure JS means no native dependencies.
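Point 3 can be illustrated with a minimal sketch of what the hub might do at the persistence boundary. Only the 10KB threshold and the idea of stripping API keys come from the text above. The field names in `REDACTED_FIELDS`, the `[REDACTED]` placeholder, and the `redact` and `prepareForStorage` helpers are hypothetical.

```typescript
const MAX_PAYLOAD_BYTES = 10 * 1024; // 10KB threshold from the rationale

// Hypothetical set of field names treated as secrets.
const REDACTED_FIELDS = new Set(["apiKey", "authorization", "token"]);

// Recursively replace secret fields with a placeholder; other values pass through.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = REDACTED_FIELDS.has(k) ? "[REDACTED]" : redact(v);
    }
    return out;
  }
  return value;
}

interface StoredPayload { text: string; truncated: boolean }

// Redact first, then truncate the serialized text to the byte limit.
function prepareForStorage(payload: unknown): StoredPayload {
  const text = JSON.stringify(redact(payload)) ?? "null";
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= MAX_PAYLOAD_BYTES) return { text, truncated: false };

  // Step back past UTF-8 continuation bytes so no character is split.
  let end = MAX_PAYLOAD_BYTES;
  while (end > 0 && ((bytes[end] ?? 0) & 0xc0) === 0x80) end--;
  return { text: new TextDecoder().decode(bytes.slice(0, end)), truncated: true };
}
```

A truncated payload is no longer valid JSON, which is why the sketch returns it as plain text with a `truncated` flag. The in-memory graph keeps the raw value either way.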

Consequences

  • export() and fromJSON() are the persistence boundary — consumers that need persistence serialize the graph and handle storage themselves.
  • No database imports in flowgraph: pg, better-sqlite3, mongodb, etc. are not in flowgraph's dependency tree.
  • Payload handling is the hub's concern — flowgraph stores raw input/output/error on call nodes. Truncation and redaction happen when the hub writes to Postgres.
  • fromJSON() validates the data structure: it runs Value.Check() against the FlowGraphSerialized schema and throws InvalidInputError on invalid data. It does not validate business rules such as acyclicity; that is the job of validateGraph().
  • The hub must keep its storage schema in sync with flowgraph's FlowGraphSerialized — if the storage column types change, the hub's mapping code needs updating, not flowgraph.
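The split between structural validation and business-rule validation can be sketched without any dependencies. In this sketch a hand-written type guard stands in for the Value.Check() call, and a Kahn-style topological pass stands in for the cycle check in validateGraph(). The `Serialized` shape, the `parseSerialized` and `hasCycle` helpers, and the local `InvalidInputError` class are hypothetical and are not flowgraph's actual API.

```typescript
// Simplified stand-in for FlowGraphSerialized (assumption).
interface Serialized {
  nodes: { key: string }[];
  edges: { source: string; target: string }[];
}

// Local stand-in for the InvalidInputError named in the ADR.
class InvalidInputError extends Error {}

// Structural check only: are the expected arrays and string fields present?
function isSerialized(data: unknown): data is Serialized {
  if (data === null || typeof data !== "object") return false;
  const d = data as { nodes?: unknown; edges?: unknown };
  if (!Array.isArray(d.nodes) || !Array.isArray(d.edges)) return false;
  return (
    d.nodes.every((n) => typeof n?.key === "string") &&
    d.edges.every((e) => typeof e?.source === "string" && typeof e?.target === "string")
  );
}

// fromJSON()-style entry point: rejects malformed data but accepts cyclic graphs.
function parseSerialized(data: unknown): Serialized {
  if (!isSerialized(data)) throw new InvalidInputError("malformed graph data");
  return data;
}

// Business rule, checked separately. Assumes every edge endpoint is a declared node.
function hasCycle(g: Serialized): boolean {
  const indegree = new Map<string, number>();
  const out = new Map<string, string[]>();
  for (const n of g.nodes) {
    indegree.set(n.key, 0);
    out.set(n.key, []);
  }
  for (const e of g.edges) {
    out.get(e.source)?.push(e.target);
    indegree.set(e.target, (indegree.get(e.target) ?? 0) + 1);
  }
  // Repeatedly remove nodes with no incoming edges; leftovers lie on a cycle.
  const ready = [...indegree].filter(([, d]) => d === 0).map(([k]) => k);
  let visited = 0;
  while (ready.length > 0) {
    const key = ready.pop() as string;
    visited++;
    for (const next of out.get(key) ?? []) {
      const d = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, d);
      if (d === 0) ready.push(next);
    }
  }
  return visited < g.nodes.length;
}
```

Keeping the two checks separate means a consumer can load stored data that is well-formed but violates a rule, then inspect or repair it before running the stricter validation.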

References

  • Taskgraph serialization: @alkdev/taskgraph_ts/src/graph/construction.ts (fromJSON, export)
  • Call graph storage: @alkdev/alkhub_ts/docs/architecture/storage/call-graph.md
  • Schema: schema.md — FlowGraphSerialized format