Systematically compared @alkdev/taskgraph, @alkdev/operations, and
@alkdev/flowgraph against storage/arch specs and fixed all mismatches.
Key changes:
Tasks (storage/tasks.md + ADR-011):
- Rename TaskFrontmatter → TaskInput to match library export
- Fix dependsOn (was depends_on) in field mappings — library uses
camelCase; parseFrontmatter normalizes YAML snake_case on input
- Document DependencyEdge shape {from, to, qualityRetention?} and
DB↔library field mapping
- Document graph node vs DB column distinction (TaskGraphNodeAttrs
is a subset of TaskInput)
- Fix default risk fallback from low → medium (matches resolveDefaults)
- Fix cross-project guard column references (dependentTaskId, not taskId)
- Clarify @alkdev/taskgraph TS is source of truth; frontmatter is for
LLM output parsing and legacy imports, not Rust CLI
- Add complete library exports reference
Operations (storage/spokes.md + operations.md):
- Add version, title, _meta columns to operations table (required by
OperationSpec, were missing)
- Fix type casing: query/mutation/subscription (lowercase, matching
OperationType runtime values)
- Make outputSchema and accessControl NOT NULL (matching library)
- Document ErrorDefinition shape {code, description, schema, httpStatus?}
- Document _meta vs commonCols.metadata distinction
- Add registerAll, get, getHandler, getByName, list, subscribe methods
- Fix buildCallHandler signature ({ registry, callMap })
- Fix OperationType values (lowercase)
Call graph (storage/call-graph.md + call-graph.md):
- Change operationId to NOT NULL with RESTRICT FK (was nullable/SET NULL)
— matches flowgraph's required CallNodeAttrs.operationId
- Document sentinel __removed__ operation strategy for deletions
- Document ISO 8601 string ↔ timestamptz conversion requirement
- Rewrite CallEventMap to match actual library: flat dot-notation keys,
timestamp on all events, nested error structure, optional output on
completed event
- Remove call.running event (doesn't exist in library) — hub calls
updateStatus(running) directly on dispatch
- Fix buildCallHandler({ registry, callMap }) signature
- Fix PendingRequestMap constructor (positional EventTarget)
- Add updateCall/removeCall/graph methods to API summary
- Document abort cascade as hub logic, not flowgraph logic
- Add open questions for operation deletion and reactive vs call graph
semantics
Table reference (storage/table-reference.md):
- Update call_graph_nodes.operationId cascade to RESTRICT
- Update operations.type comment to lowercase
- Update status enum reference
122 lines
14 KiB
Markdown
---
status: draft
last_updated: 2026-05-25
---

# Table Schemas: Call Graph

Call graph observability tables. For cross-cutting reference (cascade behavior, index reference, status enums, relations), see [table-reference.md](./table-reference.md). For design decisions, see [../../../decisions/](../../../decisions/). For call protocol architecture, see [../../call-graph.md](../../call-graph.md). For the flowgraph library that manages call/operation graphs in memory, see `@alkdev/flowgraph`.

### `call_graph_nodes`
Call graph entries for observability. Every operation invocation creates a node; parent-child relationships create edges. The `status` column matches `@alkdev/flowgraph/schema`'s `CallStatus` enum. See call-graph.md for the full call protocol spec.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| requestId | text NOT NULL UNIQUE | Protocol-level correlation key. Also serves as the flowgraph node key. |
| operationId | text NOT NULL | FK → operations.id (RESTRICT) — The operation definition that was called. **NOT NULL** — `@alkdev/flowgraph`'s `CallNodeAttrs.operationId` is a required string. The RESTRICT constraint means an operation cannot be deleted while call records reference it. **Deletion strategy**: The hub should deny operation removal when active call records exist. If removal is required (e.g., cleanup), the hub must first reassign call records to a sentinel operation row (pre-seeded in migrations with id `__removed__`, name `removed`, namespace `system`), then delete the original operation. |
| parentRequestId | text | Parent call's requestId (null = top-level call). Denormalized fast lookup — redundant with the `triggered` edge in `call_graph_edges`. |
| identity | jsonb | Caller identity at time of call (`{ id, scopes, resources? }`), matching `@alkdev/flowgraph/schema`'s `CallNodeAttrs.identity`. |
| callerAccountId | text | FK → accounts.id — The account that initiated this call. Nullable — system-initiated calls may not have an account. onDelete: SET NULL (calls survive account deletion for audit). This follows the D1 cascade policy — live session/call data uses nullable FK + SET NULL to preserve audit history. |
| status | text NOT NULL | Matches `@alkdev/flowgraph/schema`'s `CallStatus` enum: `pending`, `running`, `completed`, `failed`, `aborted`. State transitions are enforced by the flowgraph state machine — `pending → running → completed/failed` and `pending/running → aborted`. |
| input | jsonb | Call input (redacted before storage — see Payload Redaction). |
| output | jsonb | Call output (on success). **Contains `ResponseEnvelope.data` only** — the hub unwraps the envelope before storing in the call graph. Maps to `CallNodeAttrs.output` in flowgraph. |
| error | jsonb | `{ code, message, details? }` (on failure). Maps to `CallNodeAttrs.error` in flowgraph. |
| startedAt | timestamp with tz | When the call was dispatched. Maps to `CallNodeAttrs.startedAt` in flowgraph. **Type conversion**: flowgraph stores timestamps as ISO 8601 strings; the storage layer must convert between `timestamptz` and ISO strings during read/write. |
| completedAt | timestamp with tz | When the call completed/failed/aborted. Maps to `CallNodeAttrs.completedAt` in flowgraph. **Type conversion**: same as `startedAt`. |

**identity boundaries**: Caller identity at time of call (account, scopes, resources). This is immutable after creation.

**metadata boundaries**: Retention metadata and other system fields. User-facing data goes in `input`/`output`.

**Timestamp serialization**: `@alkdev/flowgraph`'s `CallNodeAttrs` stores `startedAt` and `completedAt` as **ISO 8601 strings** (`Type.Optional(Type.String())`), not native Date objects. The storage layer stores them as Postgres `timestamp with tz`. The hub must:
- **On read (DB→flowgraph)**: Convert `timestamptz` → ISO string via `.toISOString()`
- **On write (flowgraph→DB)**: Convert ISO string → `Date` or pass as parameterized timestamp

**Indexes**:

- `idx_call_graph_nodes_request_id` UNIQUE on `(requestId)`
- `idx_call_graph_nodes_operation_id` on `(operationId)`
- `idx_call_graph_nodes_status` on `(status)`
- `idx_call_graph_nodes_caller_account_id` on `(callerAccountId)`
- `idx_call_graph_nodes_created_at` on `(createdAt)` — time-range queries
- `idx_call_graph_nodes_operation_created` on `(operationId, createdAt)` — operation + time queries
- `idx_call_graph_nodes_started_at` on `(startedAt)` — p99 latency analysis

**Call graph payload size**: The `input` and `output` JSONB columns can grow arbitrarily large. For observability, the full payload is valuable but can bloat storage. Strategy: truncate payloads larger than 10KB to `{ _truncated: true, size: number, preview: string }` at the application layer. Full payloads can optionally be stored in object storage (S3/MinIO) with a reference URL in the `metadata` column (Phase 2, see Payload Truncation). This keeps the call graph table lean while preserving the ability to inspect large payloads when needed.

**`call.running` and `startedAt`**: There is no `call.running` event in `@alkdev/flowgraph`'s `CallEventMapValue`. The `call.requested` event creates the node in `pending` state. The transition to `running` is performed by the hub's `CallHandler` calling `flowGraph.updateStatus(requestId, "running", { startedAt: now.toISOString() })` directly when it dispatches the operation handler. This is hub-initiated, not event-driven. See call-graph.md for the write path details.

**Mapping to `@alkdev/flowgraph`**: The `call_graph_nodes` columns map directly to `CallNodeAttrs` in `@alkdev/flowgraph/schema`. The in-memory flowgraph instance uses `requestId` as the node key. Storage reads populate an in-memory call graph (for example via `FlowGraph.fromCallEvents()`) for observability queries, and storage writes persist each call protocol event incrementally.
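The conversion between the `timestamptz` columns and flowgraph's ISO 8601 strings could look roughly like the sketch below. It assumes the database driver returns `timestamptz` values as `Date` objects and NULL as `null`. The names `CallNodeRowTimes`, `CallNodeAttrTimes`, `rowTimesToAttrs`, `isoToDate`, and `attrTimesToRow` are hypothetical and not part of `@alkdev/flowgraph`.

```typescript
// Hypothetical row shape: assumes the DB driver returns `timestamptz` as Date.
interface CallNodeRowTimes {
  startedAt: Date | null;
  completedAt: Date | null;
}

// Timestamp fields of CallNodeAttrs as described in this document:
// optional ISO 8601 strings.
interface CallNodeAttrTimes {
  startedAt?: string;
  completedAt?: string;
}

// DB -> flowgraph: a Date becomes an ISO 8601 string, NULL becomes an absent field.
function rowTimesToAttrs(row: CallNodeRowTimes): CallNodeAttrTimes {
  const attrs: CallNodeAttrTimes = {};
  if (row.startedAt !== null) attrs.startedAt = row.startedAt.toISOString();
  if (row.completedAt !== null) attrs.completedAt = row.completedAt.toISOString();
  return attrs;
}

// Parse one ISO string, rejecting values that do not produce a valid Date.
function isoToDate(value: string | undefined): Date | null {
  if (value === undefined) return null;
  const parsed = new Date(value);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Invalid ISO 8601 timestamp: ${value}`);
  }
  return parsed;
}

// flowgraph -> DB: an absent field becomes NULL, a string becomes a Date
// that can be bound as a query parameter.
function attrTimesToRow(attrs: CallNodeAttrTimes): CallNodeRowTimes {
  return {
    startedAt: isoToDate(attrs.startedAt),
    completedAt: isoToDate(attrs.completedAt),
  };
}
```

Rejecting unparseable strings at this boundary is a design choice of the sketch, so that an invalid value never reaches the `timestamptz` column.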
### `call_graph_edges`

Edges in call graph (typed directed edges between calls). The `edgeType` column aligns with `@alkdev/flowgraph/schema`'s `EdgeType` enum for the edge types that flowgraph models (`triggered`, `depends_on`). The `requested_by` type is a storage-layer extension for identity tracing.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| sourceId | text NOT NULL | FK → call_graph_nodes.id (CASCADE) — deleting a source node removes its outgoing edges |
| targetId | text NOT NULL | FK → call_graph_nodes.id (CASCADE) — deleting a target node removes its incoming edges |
| edgeType | text NOT NULL | Edge type (see Edge Type Semantics below) |

**Indexes**:

- `idx_call_graph_edges_source_id` on `(sourceId)` — find calls originating from a node
- `idx_call_graph_edges_target_id` on `(targetId)` — find calls targeting a node
- `idx_call_graph_edges_source_id_type` on `(sourceId, edgeType)` — find outgoing calls of a specific type

**Unique constraint**: `unq_call_graph_edges_source_target_type` UNIQUE on `(sourceId, targetId, edgeType)` — prevents duplicate edges from retries/reconnections.

### Edge Type Semantics
The `edgeType` column is an extensible text field. The initial set of edge types aligns with `@alkdev/flowgraph/schema`'s `EdgeType` enum for the first two, with a storage-layer extension for the third:

| Edge Type | Flowgraph `EdgeType` | Meaning |
|-----------|---------------------|---------|
| `triggered` | `EdgeType.triggered` | The source node caused the target node to execute. Represents the parent-child call hierarchy — when call A invokes call B (via `parentRequestId`), a `triggered` edge connects them. This is the most common edge type and corresponds to the call graph nesting described in the call protocol. Created automatically by `FlowGraph.addCall()` when `parentRequestId` is present. |
| `depends_on` | `EdgeType.depends_on` | The source node requires the result of the target node before it can complete. Represents a data dependency — call A cannot proceed until call B's output is available. Unlike `triggered`, the source does not cause the target to execute; it merely waits on it. Created by coordination logic via `FlowGraph.addDependency()`. |
| `requested_by` | Storage extension (no flowgraph `EdgeType`) | The target node was executed on behalf of the source node's identity. Represents the identity/authorization chain — call A's identity was delegated or propagated to call B. Used to trace which account's authority a call was performed under, distinct from the execution hierarchy (`triggered`). This is persisted in the database for observability but not modeled in the in-memory flowgraph graph. |

New edge types may be added as the call protocol evolves. Convention: use `snake_case` names, document each new type in this table, and ensure the type has a clear semantic distinction from existing types.

### Relationship: parentRequestId vs call_graph_edges
The `parentRequestId` column on `call_graph_nodes` and `triggered` edges in `call_graph_edges` both represent the parent-child call hierarchy, but serve different purposes:

- **`parentRequestId`** is a convenience shortcut on the node itself, set at call creation time from the call protocol's `parentRequestId` field. It enables fast point lookups ("who is this call's parent?") without a JOIN. Its value is the parent's `requestId`, which is also the parent's node key in the flowgraph instance.
- **`triggered` edges** represent the same relationship in the graph structure, enabling traversal queries ("find all children of this node"), path queries, and graph algorithm operations (topological sort, cycle detection).
- They are **intentionally redundant**: `parentRequestId` is denormalized for fast reads; edges are normalized for graph operations. Both should be kept consistent — when a node with a `parentRequestId` is stored, a `triggered` edge should also be created.
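One way to keep the two representations in step is to derive the `triggered` edge from the node row at write time, and to check for drift afterwards. The sketch below is illustrative only: `NodeRow`, `EdgeRow`, `StoredEdgeType`, `triggeredEdgeFor`, and `missingTriggeredEdges` are hypothetical names, and the rows are reduced to the columns relevant here.

```typescript
type StoredEdgeType = "triggered" | "depends_on" | "requested_by";

// Reduced row shapes (assumption: only the columns needed for this check).
interface NodeRow {
  id: string; // call_graph_nodes.id (PK, FK target for edges)
  requestId: string; // protocol-level key
  parentRequestId: string | null;
}

interface EdgeRow {
  sourceId: string;
  targetId: string;
  edgeType: StoredEdgeType;
}

// Build the `triggered` edge for a node: source = parent, target = child.
// Edges reference call_graph_nodes.id, so the parent's requestId is resolved first.
function triggeredEdgeFor(
  node: NodeRow,
  idByRequestId: ReadonlyMap<string, string>,
): EdgeRow | null {
  if (node.parentRequestId === null) return null; // top-level call
  const parentId = idByRequestId.get(node.parentRequestId);
  if (parentId === undefined) {
    throw new Error(`Parent call ${node.parentRequestId} is not stored`);
  }
  return { sourceId: parentId, targetId: node.id, edgeType: "triggered" };
}

// Return the requestIds of nodes whose parentRequestId has no matching edge.
function missingTriggeredEdges(nodes: NodeRow[], edges: EdgeRow[]): string[] {
  const idByRequestId = new Map<string, string>();
  for (const node of nodes) idByRequestId.set(node.requestId, node.id);
  const present = new Set(
    edges
      .filter((edge) => edge.edgeType === "triggered")
      .map((edge) => `${edge.sourceId}->${edge.targetId}`),
  );
  const missing: string[] = [];
  for (const node of nodes) {
    if (node.parentRequestId === null) continue;
    const parentId = idByRequestId.get(node.parentRequestId);
    if (parentId === undefined || !present.has(`${parentId}->${node.id}`)) {
      missing.push(node.requestId);
    }
  }
  return missing;
}
```

A check like `missingTriggeredEdges` could run in tests or a periodic audit; whether the hub actually does this is not specified here.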
### Mapping to `@alkdev/flowgraph` In-Memory Model

The storage tables map to `@alkdev/flowgraph` types as follows:

| Storage Table/Column | Flowgraph Type | Notes |
|----------------------|---------------|-------|
| `call_graph_nodes` row | `CallNodeAttrs` (node in `FlowGraph`) | `requestId` is the node key in the flowgraph instance |
| `call_graph_nodes.status` | `CallStatus` enum | Same values: `pending`, `running`, `completed`, `failed`, `aborted` |
| `call_graph_nodes.identity` | `CallNodeAttrs.identity` | `{ id, scopes, resources? }` |
| `call_graph_nodes.error` | `CallNodeAttrs.error` | `{ code, message, details? }` |
| `call_graph_edges` with `edgeType='triggered'` | `TriggeredEdgeAttrs` | Created by `FlowGraph.addCall()` when `parentRequestId` is present |
| `call_graph_edges` with `edgeType='depends_on'` | `DependencyEdgeAttrs` | Created by `FlowGraph.addDependency()` |
| `call_graph_edges` with `edgeType='requested_by'` | No flowgraph equivalent | Storage-layer only, not modeled in the in-memory graph |

**Reconstruction**: After a hub restart, the call graph is rebuilt either from stored events using `FlowGraph.fromCallEvents()`, or by iterating over `call_graph_nodes` + `call_graph_edges` rows and populating a `FlowGraph` instance via `addCall()` and `addDependency()`.

**Identifier mapping**: `call_graph_nodes` uses two identifiers — `id` (UUID, from `commonCols`, used as PK and FK target for edges) and `requestId` (text, UNIQUE, used as the flowgraph node key). When writing edges to `call_graph_edges`, the hub resolves `requestId` → `call_graph_nodes.id` for the FK references. When reconstructing from the database, the hub resolves `call_graph_nodes.id` → `requestId` for flowgraph node keys. This mapping is efficient because `call_graph_nodes.requestId` has a UNIQUE index.

**Serialization**: Flowgraph's `export()` produces graphology's native JSON format (`CallGraphSerialized`), which is suitable for snapshot/restore but not for incremental queries. The hub uses incremental storage for real-time observability and can optionally persist snapshots for fast recovery.
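The row-based reconstruction path, including the `id` → `requestId` resolution, might be sketched as follows. The `GraphSink` interface is a hypothetical stand-in: the method names `addCall` and `addDependency` come from this document, but their parameter shapes here are assumptions, not the real `FlowGraph` signatures. The sketch also assumes a parent row is always created before its children, so ordering by `createdAt` puts parents first.

```typescript
// Reduced row shapes (assumption: only the columns needed for rebuilding).
interface StoredNode {
  id: string;
  requestId: string;
  parentRequestId: string | null;
  createdAt: Date;
}

interface StoredEdge {
  sourceId: string;
  targetId: string;
  edgeType: string;
}

// Hypothetical minimal surface; the real FlowGraph signatures may differ.
interface GraphSink {
  addCall(attrs: { requestId: string; parentRequestId?: string }): void;
  addDependency(fromRequestId: string, toRequestId: string): void;
}

function rebuildGraph(nodes: StoredNode[], edges: StoredEdge[], sink: GraphSink): void {
  // Edges reference call_graph_nodes.id; the graph is keyed by requestId.
  const requestIdById = new Map<string, string>();
  for (const node of nodes) requestIdById.set(node.id, node.requestId);

  // Assumption: parents are created before children, so this order is safe.
  const ordered = [...nodes].sort(
    (a, b) => a.createdAt.getTime() - b.createdAt.getTime(),
  );
  for (const node of ordered) {
    sink.addCall(
      node.parentRequestId === null
        ? { requestId: node.requestId }
        : { requestId: node.requestId, parentRequestId: node.parentRequestId },
    );
  }

  for (const edge of edges) {
    // `triggered` edges are recreated by addCall via parentRequestId;
    // `requested_by` edges are storage-only and skipped.
    if (edge.edgeType !== "depends_on") continue;
    const from = requestIdById.get(edge.sourceId);
    const to = requestIdById.get(edge.targetId);
    if (from === undefined || to === undefined) {
      throw new Error(`Edge ${edge.sourceId} -> ${edge.targetId} references an unknown node`);
    }
    sink.addDependency(from, to);
  }
}
```

Passing the graph in behind a small interface keeps the sketch testable without the library; real code would hand over the `FlowGraph` instance or an adapter around it.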
### Retention Policy

Call graph data is retained for 90 days by default (configurable via hub config). Completed/failed/aborted nodes and their edges older than the retention period are cleaned up by a background job. Pending/running nodes are never auto-deleted.

Aggregation for observability: Before deletion, summary statistics (call counts, average duration, error rates by operation) may be computed and stored in a separate aggregation table (deferred to Phase 2).

The `metadata` column on `call_graph_nodes` stores retention metadata: `{ _retentionExpiresAt: timestamp }` for tracking when a node becomes eligible for cleanup.
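The eligibility rule above can be sketched as a pure function. This is a minimal sketch with two labeled assumptions: `_retentionExpiresAt` is stored as an ISO 8601 string, and the retention window is measured from whichever timestamp the caller passes as `from` (the text does not say which one). The names `RetentionCandidate`, `retentionExpiresAt`, and `isEligibleForCleanup` are hypothetical.

```typescript
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

interface RetentionCandidate {
  status: CallStatus;
  // Assumption: the timestamp is serialized as an ISO 8601 string in metadata.
  metadata: { _retentionExpiresAt?: string };
}

const TERMINAL_STATUSES: ReadonlySet<CallStatus> = new Set<CallStatus>([
  "completed",
  "failed",
  "aborted",
]);

const DAY_MS = 24 * 60 * 60 * 1000;

// Compute the expiry timestamp; 90 days is the documented default.
function retentionExpiresAt(from: Date, retentionDays: number = 90): string {
  return new Date(from.getTime() + retentionDays * DAY_MS).toISOString();
}

// Pending/running nodes are never eligible; terminal nodes are eligible
// once their recorded expiry has passed.
function isEligibleForCleanup(node: RetentionCandidate, now: Date): boolean {
  if (!TERMINAL_STATUSES.has(node.status)) return false;
  const expiresAt = node.metadata._retentionExpiresAt;
  if (expiresAt === undefined) return false; // no expiry recorded: keep the node
  return new Date(expiresAt).getTime() <= now.getTime();
}
```

Treating a missing or unparseable expiry as "not eligible" is the conservative choice here, since deleting audit data by mistake is harder to undo than keeping it too long.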
### Payload Redaction

Call graph `input` and `output` payloads may contain sensitive data (API keys, tokens, personal information). A redaction strategy is applied before storage.

**Redaction rules**:

1. Known sensitive field names (`apiKey`, `token`, `password`, `secret`, `authorization`, `key`) are replaced with `[REDACTED]`.
2. String values matching common secret patterns (Bearer tokens, base64-encoded secrets) are replaced with `[REDACTED]`.
3. Redaction is applied BEFORE the 10KB truncation, so the truncated preview contains only redacted data.

**Redaction timing**: Applied at the application layer before DB write. Never store raw payloads and redact them on read; redaction must be one-way.

**Configuration**: The list of redacted field names and patterns is configurable via hub config, with sensible defaults.
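A recursive redaction pass over a JSON payload might look like the sketch below. It is not the hub's implementation: `RedactionConfig`, `defaultRedaction`, and `redact` are hypothetical names, field-name matching is assumed to be case-insensitive, and only a Bearer-token value pattern is included (a base64 pattern is left out because a naive one produces many false positives).

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface RedactionConfig {
  fieldNames: string[];
  // Patterns must not use the `g` flag: RegExp.test() is stateful with it.
  valuePatterns: RegExp[];
}

const REDACTED = "[REDACTED]";

// Field names taken from the redaction rules above; the pattern is an assumption.
const defaultRedaction: RedactionConfig = {
  fieldNames: ["apiKey", "token", "password", "secret", "authorization", "key"],
  valuePatterns: [/^Bearer\s+\S+$/i],
};

// Returns a redacted copy; the input value is not mutated.
function redact(value: Json, config: RedactionConfig = defaultRedaction): Json {
  if (typeof value === "string") {
    return config.valuePatterns.some((pattern) => pattern.test(value))
      ? REDACTED
      : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => redact(item, config));
  }
  if (value !== null && typeof value === "object") {
    // Assumption: field names are compared case-insensitively.
    const sensitive = new Set(config.fieldNames.map((name) => name.toLowerCase()));
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = sensitive.has(key.toLowerCase()) ? REDACTED : redact(child, config);
    }
    return out;
  }
  return value; // null, boolean, number
}
```

A sensitive field is replaced wholesale, including nested objects under it, which errs on the side of removing too much rather than too little.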
### Payload Truncation

**Truncation timing**: Payloads are truncated on DB write, not in-flight. In-flight calls hold full payloads in memory for processing. Only the persisted version is truncated.

**Truncation strategy**: Payloads larger than 10KB are truncated to `{ _truncated: true, size: number, preview: string }` where `preview` is the first 1024 bytes (not characters) of the JSON-serialized payload. The threshold is configurable via `HubConfig.callGraph.payloadTruncationThreshold` (defaults to 10240 bytes).

**Object storage reference**: For payloads exceeding the truncation threshold, the full payload MAY be stored in object storage (S3/MinIO) with a reference URL in the `metadata` column as `{ _storageRef: 's3://bucket/key' }`. This is Phase 2 and not yet implemented.
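The byte-based truncation described above could be sketched like this. It assumes the payload has already been redacted and that `size` means the UTF-8 byte length of the serialized JSON; `TruncatedPayload` and `truncatePayload` are hypothetical names. `TextEncoder` and `TextDecoder` are standard globals in Node.js, Deno, Bun, and browsers.

```typescript
interface TruncatedPayload {
  _truncated: true;
  size: number; // assumption: UTF-8 byte length of the serialized payload
  preview: string;
}

// Returns the payload unchanged when it fits, otherwise a truncation marker.
function truncatePayload<T>(
  payload: T,
  thresholdBytes: number = 10240,
  previewBytes: number = 1024,
): T | TruncatedPayload {
  // JSON.stringify returns undefined for values like `undefined` or functions.
  const json: string | undefined = JSON.stringify(payload);
  if (json === undefined) return payload;

  const bytes = new TextEncoder().encode(json);
  if (bytes.length <= thresholdBytes) return payload;

  // Cut on a byte boundary, as specified. If the cut splits a multi-byte
  // character, the non-fatal decoder emits U+FFFD for the incomplete sequence.
  const preview = new TextDecoder("utf-8").decode(bytes.subarray(0, previewBytes));
  return { _truncated: true, size: bytes.length, preview };
}
```

Because the preview is cut at a byte offset, it is generally not valid JSON on its own and should be treated as display text only.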