hub/docs/architecture/storage/call-graph.md

---
status: draft
last_updated: 2026-05-25
---

# Table Schemas: Call Graph

Call graph observability tables. For cross-cutting reference (cascade behavior, index reference, status enums, relations), see [table-reference.md](./table-reference.md). For design decisions, see [../../../decisions/](../../../decisions/). For call protocol architecture, see [../../call-graph.md](../../call-graph.md). For the flowgraph library that manages call/operation graphs in memory, see `@alkdev/flowgraph`.

### `call_graph_nodes`

Call graph entries for observability. Every operation invocation creates a node; parent-child relationships create edges. The `status` column matches `@alkdev/flowgraph/schema`'s `CallStatus` enum. See call-graph.md for the full call protocol spec.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| requestId | text NOT NULL UNIQUE | Protocol-level correlation key. Also serves as the flowgraph node key. |
| operationId | text NOT NULL | FK → operations.id (RESTRICT) — The operation definition that was called. **NOT NULL** — `@alkdev/flowgraph`'s `CallNodeAttrs.operationId` is a required string. The RESTRICT constraint means an operation cannot be deleted while call records reference it. **Deletion strategy**: The hub should deny operation removal when active call records exist. If removal is required (e.g., cleanup), the hub must first reassign call records to a sentinel operation row (pre-seeded in migrations with id `__removed__`, name `removed`, namespace `system`), then delete the original operation. |
| parentRequestId | text | Parent call's requestId (null = top-level call). Denormalized fast lookup — redundant with `triggered` edge in `call_graph_edges`. |
| identity | jsonb | Caller identity at time of call (`{ id, scopes, resources? }`), matching `@alkdev/flowgraph/schema`'s `CallNodeAttrs.identity`. |
| callerAccountId | text | FK → accounts.id — The account that initiated this call. Nullable — system-initiated calls may not have an account. onDelete: SET NULL (calls survive account deletion for audit). This follows the D1 cascade policy — live session/call data uses nullable FK + SET NULL to preserve audit history. |
| status | text NOT NULL | Matches `@alkdev/flowgraph/schema`'s `CallStatus` enum: `pending`, `running`, `completed`, `failed`, `aborted`. State transitions are enforced by the flowgraph state machine — `pending → running → completed/failed` and `pending/running → aborted`. |
| input | jsonb | Call input (redacted before storage — see Payload Redaction). |
| output | jsonb | Call output (on success). **Contains `ResponseEnvelope.data` only** — the hub unwraps the envelope before storing in the call graph. Maps to `CallNodeAttrs.output` in flowgraph. |
| error | jsonb | `{ code, message, details? }` (on failure). Maps to `CallNodeAttrs.error` in flowgraph. |
| startedAt | timestamp with tz | When call was dispatched. Maps to `CallNodeAttrs.startedAt` in flowgraph. **Type conversion**: flowgraph stores timestamps as ISO 8601 strings; storage layer must convert between `timestamptz` and ISO strings during read/write. |
| completedAt | timestamp with tz | When call completed/failed/aborted. Maps to `CallNodeAttrs.completedAt` in flowgraph. **Type conversion**: same as `startedAt`. |

**`identity` boundaries**: Caller identity at the time of the call (`id`, `scopes`, optional `resources`). It is immutable after creation. **`metadata` boundaries**: Retention metadata and other system fields. User-facing data goes in `input`/`output`.

**Timestamp serialization**: `@alkdev/flowgraph`'s `CallNodeAttrs` stores `startedAt` and `completedAt` as **ISO 8601 strings** (`Type.Optional(Type.String())`), not native Date objects. The storage layer stores them as Postgres `timestamp with tz`. The hub must:
- **On read (DB→flowgraph)**: Convert `timestamptz` → ISO string via `.toISOString()`
- **On write (flowgraph→DB)**: Convert ISO string → `Date`, or pass the string as a parameterized timestamp
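
The two conversions above can be sketched as a pair of small helpers. This is a minimal sketch, not the hub's actual code: the function names are hypothetical, and it assumes the database driver hands `timestamptz` values back as JavaScript `Date` objects (or `null` for SQL NULL).

```typescript
// Hypothetical helpers; the names are illustrative and not part of @alkdev/flowgraph.

/** DB -> flowgraph: a Date read from a timestamptz column becomes an ISO 8601 string. */
function dbToFlowgraphTimestamp(value: Date | null): string | undefined {
  // Assumption: a NULL column maps to an absent (optional) flowgraph attribute.
  return value === null ? undefined : value.toISOString();
}

/** flowgraph -> DB: an ISO 8601 string becomes a Date suitable for a parameterized query. */
function flowgraphToDbTimestamp(value: string | undefined): Date | null {
  if (value === undefined) return null;
  const parsed = new Date(value);
  // Reject strings that do not parse, rather than writing an invalid timestamp.
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Invalid ISO 8601 timestamp: ${value}`);
  }
  return parsed;
}
```

Naming the helpers by direction (DB to flowgraph, flowgraph to DB) rather than by "read" or "write" keeps the call sites unambiguous.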

**Indexes**: `idx_call_graph_nodes_request_id` UNIQUE on `(requestId)`, `idx_call_graph_nodes_operation_id` on `(operationId)`, `idx_call_graph_nodes_status` on `(status)`, `idx_call_graph_nodes_caller_account_id` on `(callerAccountId)`, `idx_call_graph_nodes_created_at` on `(createdAt)` — time-range queries, `idx_call_graph_nodes_operation_created` on `(operationId, createdAt)` — operation + time queries, `idx_call_graph_nodes_started_at` on `(startedAt)` — p99 latency analysis.

**Call graph payload size**: The `input` and `output` JSONB columns can grow arbitrarily large. For observability, the full payload is valuable but can bloat storage. Strategy: at the application layer, replace payloads larger than 10KB with `{ _truncated: true, size: number, preview: string }`. Full payloads may optionally be stored in object storage (S3/MinIO) with a reference URL in the `metadata` column; this is deferred to Phase 2. This keeps the call graph table lean while preserving the ability to inspect large payloads when needed. See Payload Truncation for the details.

**`call.running` and `startedAt`**: There is no `call.running` event in `@alkdev/flowgraph`'s `CallEventMapValue`. The `call.requested` event creates the node in `pending` state. The transition to `running` is performed by the hub's `CallHandler` calling `flowGraph.updateStatus(requestId, "running", { startedAt: now.toISOString() })` directly when it dispatches the operation handler. This is hub-initiated, not event-driven. See call-graph.md for the write path details.
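
Because the `pending → running` step is hub-initiated, the hub may want to check a transition against the rules listed for the `status` column before applying it. The following is a minimal sketch of that transition table. The `CallStatus` type here is a local stand-in that mirrors the documented values, and `canTransition` is a hypothetical helper, not the flowgraph state machine itself.

```typescript
// Local stand-in mirroring the documented CallStatus values (not imported from the library).
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

// Allowed transitions as documented:
//   pending -> running -> completed/failed, and pending/running -> aborted.
const ALLOWED_TRANSITIONS: Record<CallStatus, readonly CallStatus[]> = {
  pending: ["running", "aborted"],
  running: ["completed", "failed", "aborted"],
  completed: [], // terminal
  failed: [],    // terminal
  aborted: [],   // terminal
};

/** Returns true when moving from `from` to `to` is one of the documented transitions. */
function canTransition(from: CallStatus, to: CallStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```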

**Mapping to `@alkdev/flowgraph`**: The `call_graph_nodes` columns map directly to `CallNodeAttrs` in `@alkdev/flowgraph/schema`. The in-memory flowgraph instance uses `requestId` as the node key. On read, stored data is loaded into a `FlowGraph` (for example via `FlowGraph.fromCallEvents()`) for observability queries; on write, each call protocol event is persisted incrementally.

### `call_graph_edges`

Edges in call graph (typed directed edges between calls). The `edgeType` column aligns with `@alkdev/flowgraph/schema`'s `EdgeType` enum for the edge types that flowgraph models (`triggered`, `depends_on`). The `requested_by` type is a storage-layer extension for identity tracing.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| sourceId | text NOT NULL | FK → call_graph_nodes.id (CASCADE) — deleting a source node removes its outgoing edges |
| targetId | text NOT NULL | FK → call_graph_nodes.id (CASCADE) — deleting a target node removes its incoming edges |
| edgeType | text NOT NULL | Edge type (see Edge Type Semantics below) |

**Indexes**: `idx_call_graph_edges_source_id` on `(sourceId)` — find calls originating from a node, `idx_call_graph_edges_target_id` on `(targetId)` — find calls targeting a node, `idx_call_graph_edges_source_id_type` on `(sourceId, edgeType)` — find outgoing calls of a specific type.

**Unique constraint**: `unq_call_graph_edges_source_target_type` UNIQUE on `(sourceId, targetId, edgeType)` — prevents duplicate edges from retries/reconnections.

### Edge Type Semantics

The `edgeType` column is an extensible text field. The initial set of edge types aligns with `@alkdev/flowgraph/schema`'s `EdgeType` enum for the first two, with a storage-layer extension for the third:

| Edge Type | Flowgraph `EdgeType` | Meaning |
|-----------|---------------------|---------|
| `triggered` | `EdgeType.triggered` | The source node caused the target node to execute. Represents the parent-child call hierarchy — when call A invokes call B (via `parentRequestId`), a `triggered` edge connects them. This is the most common edge type and corresponds to the call graph nesting described in the call protocol. Created automatically by `FlowGraph.addCall()` when `parentRequestId` is present. |
| `depends_on` | `EdgeType.depends_on` | The source node requires the result of the target node before it can complete. Represents a data dependency — call A cannot proceed until call B's output is available. Unlike `triggered`, the source does not cause the target to execute; it merely waits on it. Created by coordination logic via `FlowGraph.addDependency()`. |
| `requested_by` | Storage extension (no flowgraph `EdgeType`) | The target node was executed on behalf of the source node's identity. Represents the identity/authorization chain — call A's identity was delegated or propagated to call B. Used to trace which account's authority a call was performed under, distinct from the execution hierarchy (`triggered`). This is persisted in the database for observability but not modeled in the in-memory flowgraph graph. |

New edge types may be added as the call protocol evolves. Convention: use `snake_case` names, document each new type in this table, and ensure the type has a clear semantic distinction from existing types.

### Relationship: parentRequestId vs call_graph_edges

The `parentRequestId` column on `call_graph_nodes` and `triggered` edges in `call_graph_edges` both represent the parent-child call hierarchy, but serve different purposes:

- **`parentRequestId`** is a convenience shortcut on the node itself, set at call creation time from the call protocol's `parentRequestId` field. It enables fast point lookups ("who is this call's parent?") without a JOIN. Its value is the parent's `requestId`, which is also the parent's node key in the flowgraph instance.
- **`triggered` edges** represent the same relationship in the graph structure, enabling traversal queries ("find all children of this node"), path queries, and graph algorithm operations (topological sort, cycle detection).
- They are **intentionally redundant**: `parentRequestId` is denormalized for fast reads; edges are normalized for graph operations. Both should be kept consistent — when a node with a `parentRequestId` is stored, a `triggered` edge should also be created.
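
Since the two representations are meant to stay consistent, a periodic check can flag drift between them. The sketch below assumes rows have already been loaded into memory. The row interfaces are trimmed to the columns involved, and `findMissingTriggeredEdges` is a hypothetical helper name.

```typescript
// Trimmed row shapes; only the columns needed for the check are included.
interface NodeRow { id: string; requestId: string; parentRequestId: string | null }
interface EdgeRow { sourceId: string; targetId: string; edgeType: string }

/**
 * Returns the requestIds of nodes that have a parentRequestId
 * but no matching `triggered` edge (source = parent, target = child).
 */
function findMissingTriggeredEdges(nodes: NodeRow[], edges: EdgeRow[]): string[] {
  const idByRequestId = new Map(nodes.map((n) => [n.requestId, n.id] as const));
  const triggered = new Set(
    edges
      .filter((e) => e.edgeType === "triggered")
      .map((e) => `${e.sourceId}->${e.targetId}`),
  );

  const missing: string[] = [];
  for (const node of nodes) {
    if (node.parentRequestId === null) continue; // top-level call
    const parentId = idByRequestId.get(node.parentRequestId);
    // Design choice in this sketch: a parent missing from the loaded set is also reported.
    if (parentId === undefined || !triggered.has(`${parentId}->${node.id}`)) {
      missing.push(node.requestId);
    }
  }
  return missing;
}
```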

### Mapping to `@alkdev/flowgraph` In-Memory Model

The storage tables map to `@alkdev/flowgraph` types as follows:

| Storage Table/Column | Flowgraph Type | Notes |
|----------------------|---------------|-------|
| `call_graph_nodes` row | `CallNodeAttrs` (node in `FlowGraph`) | `requestId` is the node key in the flowgraph instance |
| `call_graph_nodes.status` | `CallStatus` enum | Same values: `pending`, `running`, `completed`, `failed`, `aborted` |
| `call_graph_nodes.identity` | `CallNodeAttrs.identity` | `{ id, scopes, resources? }` |
| `call_graph_nodes.error` | `CallNodeAttrs.error` | `{ code, message, details? }` |
| `call_graph_edges` with `edgeType='triggered'` | `TriggeredEdgeAttrs` | Created by `FlowGraph.addCall()` when `parentRequestId` is present |
| `call_graph_edges` with `edgeType='depends_on'` | `DependencyEdgeAttrs` | Created by `FlowGraph.addDependency()` |
| `call_graph_edges` with `edgeType='requested_by'` | No flowgraph equivalent | Storage-layer only, not modeled in the in-memory graph |

**Reconstruction**: After a hub restart, the call graph is rebuilt in one of two ways: from stored call events via `FlowGraph.fromCallEvents()`, or by iterating over the `call_graph_nodes` and `call_graph_edges` rows and populating a `FlowGraph` instance via `addCall()` and `addDependency()`.

**Identifier mapping**: `call_graph_nodes` uses two identifiers — `id` (UUID, from `commonCols`, used as PK and FK target for edges) and `requestId` (text, UNIQUE, used as the flowgraph node key). When writing edges to `call_graph_edges`, the hub resolves `requestId` → `call_graph_nodes.id` for the FK references. When reconstructing from the database, the hub resolves `call_graph_nodes.id` → `requestId` for flowgraph node keys. This mapping is efficient because `call_graph_nodes.requestId` has a UNIQUE index.
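
The two-way resolution described above can be sketched with in-memory lookup maps. This is an illustration under stated assumptions: the `FlowEdge` shape (edges keyed by `requestId`) and both function names are hypothetical, and a real implementation would more likely resolve ids in SQL using the UNIQUE index.

```typescript
interface NodeRow { id: string; requestId: string }
interface EdgeRow { sourceId: string; targetId: string; edgeType: string }
// Hypothetical in-memory edge shape keyed by requestId (flowgraph node keys).
interface FlowEdge { sourceRequestId: string; targetRequestId: string; edgeType: string }

function lookupOrThrow(map: Map<string, string>, key: string, kind: string): string {
  const value = map.get(key);
  if (value === undefined) throw new Error(`Unknown ${kind}: ${key}`);
  return value;
}

/** Write path: requestId -> call_graph_nodes.id for the edge FK columns. */
function toEdgeRows(nodes: NodeRow[], flowEdges: FlowEdge[]): EdgeRow[] {
  const idByRequestId = new Map(nodes.map((n) => [n.requestId, n.id] as const));
  return flowEdges.map((e) => ({
    sourceId: lookupOrThrow(idByRequestId, e.sourceRequestId, "requestId"),
    targetId: lookupOrThrow(idByRequestId, e.targetRequestId, "requestId"),
    edgeType: e.edgeType,
  }));
}

/** Reconstruction path: call_graph_nodes.id -> requestId for flowgraph node keys. */
function toFlowEdges(nodes: NodeRow[], edgeRows: EdgeRow[]): FlowEdge[] {
  const requestIdById = new Map(nodes.map((n) => [n.id, n.requestId] as const));
  return edgeRows.map((e) => ({
    sourceRequestId: lookupOrThrow(requestIdById, e.sourceId, "node id"),
    targetRequestId: lookupOrThrow(requestIdById, e.targetId, "node id"),
    edgeType: e.edgeType,
  }));
}
```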

**Serialization**: Flowgraph's `export()` produces graphology's native JSON format (`CallGraphSerialized`), which is suitable for snapshot/restore but not for incremental queries. The hub uses incremental storage for real-time observability and can optionally persist snapshots for fast recovery.

### Retention Policy

Call graph data is retained for 90 days by default (configurable via hub config). Completed/failed/aborted nodes and their edges older than the retention period are cleaned up by a background job. Pending/running nodes are never auto-deleted.

Aggregation for observability: Before deletion, summary statistics (call counts, average duration, error rates by operation) may be computed and stored in a separate aggregation table (deferred to Phase 2).

The `metadata` column on `call_graph_nodes` stores retention metadata: `{ _retentionExpiresAt: timestamp }` for tracking when a node becomes eligible for cleanup.
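
A cleanup job could decide eligibility along the following lines. This is a sketch under assumptions the text above does not settle: that the retention window is measured from `completedAt`, and that `_retentionExpiresAt` is stored as an ISO 8601 string inside the JSONB `metadata`. Both helper names are hypothetical.

```typescript
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

const TERMINAL_STATUSES: ReadonlySet<CallStatus> = new Set<CallStatus>([
  "completed",
  "failed",
  "aborted",
]);

const DAY_MS = 24 * 60 * 60 * 1000;

/** Assumption: the window starts at completedAt; 90 days is the documented default. */
function retentionExpiresAt(completedAt: Date, retentionDays = 90): string {
  return new Date(completedAt.getTime() + retentionDays * DAY_MS).toISOString();
}

interface RetainableNode {
  status: CallStatus;
  metadata: { _retentionExpiresAt?: string };
}

/** Pending/running nodes are never eligible; terminal nodes are eligible once expired. */
function isEligibleForCleanup(node: RetainableNode, now: Date): boolean {
  if (!TERMINAL_STATUSES.has(node.status)) return false;
  const expires = node.metadata._retentionExpiresAt;
  return expires !== undefined && new Date(expires).getTime() <= now.getTime();
}
```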

### Payload Redaction

Call graph `input` and `output` payloads may contain sensitive data (API keys, tokens, personal information). A redaction strategy is applied before storage.

**Redaction rules**:

1. Known sensitive field names (`apiKey`, `token`, `password`, `secret`, `authorization`, `key`) are replaced with `[REDACTED]`.
2. String values matching common secret patterns (Bearer tokens, base64-encoded secrets) are replaced with `[REDACTED]`.
3. Redaction is applied BEFORE the 10KB truncation, so the truncated preview contains only redacted data.

**Redaction timing**: Applied at the application layer before DB write. Never store raw payloads and redact on read — redaction must be one-way.

**Configuration**: The list of redacted field names and patterns is configurable via hub config, with sensible defaults.
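
A recursive redactor following the field-name and value-pattern rules might look like the sketch below. It is illustrative only: `RedactionConfig`, `DEFAULT_REDACTION`, and `redact` are hypothetical names, field names are matched case-insensitively as an assumption, and only a Bearer-token pattern is included (a base64 heuristic is left out because it is easy to get wrong).

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical config shape; the real hub config may differ.
interface RedactionConfig {
  fieldNames: string[];
  valuePatterns: RegExp[];
}

const DEFAULT_REDACTION: RedactionConfig = {
  fieldNames: ["apiKey", "token", "password", "secret", "authorization", "key"],
  valuePatterns: [/^Bearer\s+\S+$/i],
};

/** Returns a redacted deep copy; the input payload is not mutated. */
function redact(value: Json, config: RedactionConfig = DEFAULT_REDACTION): Json {
  // Assumption: field names are compared case-insensitively.
  const names = new Set(config.fieldNames.map((n) => n.toLowerCase()));

  const walk = (v: Json): Json => {
    if (typeof v === "string") {
      return config.valuePatterns.some((p) => p.test(v)) ? "[REDACTED]" : v;
    }
    if (Array.isArray(v)) return v.map(walk);
    if (v !== null && typeof v === "object") {
      const out: { [key: string]: Json } = {};
      for (const [k, child] of Object.entries(v)) {
        // A sensitive key hides its whole value, whatever its type.
        out[k] = names.has(k.toLowerCase()) ? "[REDACTED]" : walk(child);
      }
      return out;
    }
    return v; // numbers, booleans, null
  };

  return walk(value);
}
```

Returning a copy rather than mutating in place lets in-flight code keep the full payload while only the redacted version is written to the database.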

### Payload Truncation

**Truncation timing**: Payloads are truncated on DB write, not in-flight. In-flight calls hold full payloads in memory for processing. Only the persisted version is truncated.

**Truncation strategy**: Payloads larger than 10KB are truncated to `{ _truncated: true, size: number, preview: string }` where `preview` is the first 1024 bytes (not characters) of the JSON-serialized payload. The threshold is configurable via `HubConfig.callGraph.payloadTruncationThreshold` (defaults to 10240 bytes).
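
The truncation step can be sketched as follows. This is a minimal illustration, not the hub's implementation: `truncatePayload` is a hypothetical name, `size` is assumed to be the byte length of the JSON-serialized payload, and the input is assumed to have been redacted already.

```typescript
interface TruncatedPayload {
  _truncated: true;
  size: number;    // assumption: byte length of the JSON-serialized payload
  preview: string; // first `previewBytes` bytes of the serialized payload
}

/** Returns the payload unchanged when it fits, otherwise a truncation marker object. */
function truncatePayload(
  payload: unknown,
  thresholdBytes = 10240,
  previewBytes = 1024,
): unknown {
  const bytes = new TextEncoder().encode(JSON.stringify(payload));
  if (bytes.length <= thresholdBytes) return payload;

  // Slicing by bytes can cut a multi-byte UTF-8 character in half; the decoder
  // then emits U+FFFD for the incomplete tail instead of throwing.
  const preview = new TextDecoder().decode(bytes.subarray(0, previewBytes));
  const marker: TruncatedPayload = { _truncated: true, size: bytes.length, preview };
  return marker;
}
```

Because the preview is cut on a byte boundary, it is generally not valid JSON and should be treated as display text only.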

**Object storage reference**: For payloads exceeding the truncation threshold, the full payload MAY be stored in object storage (S3/MinIO) with a reference URL in the `metadata` column as `{ _storageRef: 's3://bucket/key' }`. This is Phase 2 and not yet implemented.