From bde1cc4e70163c1dadc083d67809ea948ba78c93 Mon Sep 17 00:00:00 2001 From: "glm-5.1" Date: Sun, 26 Apr 2026 06:38:52 +0000 Subject: [PATCH] Decompose monolithic architecture.md into modular docs/architecture/ documents MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The 751-line architecture.md violated the SDD process modular documentation target (~500 lines). It also had duplicate TaskGraph class definitions (one monolith, one decomposed) that directly contradicted each other, and embedded consumer-specific tool dispatch mappings that belong in downstream projects. Changes: - Split into 8 focused documents + 7 ADR records + redirect page - Removed the monolithic TaskGraph class (kept only decomposed version) - Moved CLI→plugin dispatch mapping out (belongs in plugin architecture) - Extracted implementation code (frontmatter splitter, findCycles, DAG propagation) into WHAT/WHY descriptions per architect role spec - Added proper ADR format for all resolved design decisions - Fixed review issues: C_fail mapping, DuplicateNodeError/DuplicateEdgeError types, ValidationError/GraphValidationError definitions, mutation error handling contract, enum naming convention, validation timing clarification --- docs/architecture.md | 760 +----------------- docs/architecture/README.md | 138 ++++ docs/architecture/api-surface.md | 215 +++++ docs/architecture/build-distribution.md | 89 ++ docs/architecture/cost-benefit.md | 131 +++ .../001-pivot-to-typescript-graphology.md | 31 + .../decisions/002-rebuild-vs-incremental.md | 26 + .../003-topo-order-throws-on-cycle.md | 27 + .../004-workflow-cost-dag-propagation.md | 28 + .../decisions/005-no-depth-escalation-v1.md | 26 + .../decisions/006-deterministic-edge-keys.md | 26 + .../decisions/007-subgraph-internal-only.md | 25 + docs/architecture/errors-validation.md | 129 +++ docs/architecture/frontmatter.md | 78 ++ docs/architecture/graph-model.md | 89 ++ docs/architecture/schemas.md | 194 
+++++ 16 files changed, 1264 insertions(+), 748 deletions(-) create mode 100644 docs/architecture/README.md create mode 100644 docs/architecture/api-surface.md create mode 100644 docs/architecture/build-distribution.md create mode 100644 docs/architecture/cost-benefit.md create mode 100644 docs/architecture/decisions/001-pivot-to-typescript-graphology.md create mode 100644 docs/architecture/decisions/002-rebuild-vs-incremental.md create mode 100644 docs/architecture/decisions/003-topo-order-throws-on-cycle.md create mode 100644 docs/architecture/decisions/004-workflow-cost-dag-propagation.md create mode 100644 docs/architecture/decisions/005-no-depth-escalation-v1.md create mode 100644 docs/architecture/decisions/006-deterministic-edge-keys.md create mode 100644 docs/architecture/decisions/007-subgraph-internal-only.md create mode 100644 docs/architecture/errors-validation.md create mode 100644 docs/architecture/frontmatter.md create mode 100644 docs/architecture/graph-model.md create mode 100644 docs/architecture/schemas.md diff --git a/docs/architecture.md b/docs/architecture.md index a09f844..61cd320 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,751 +1,15 @@ -# @alkdev/taskgraph Architecture +# Architecture -> Status: draft — pivot from napi/Rust to pure TypeScript with graphology. +> **This document has been decomposed into modular documents.** See [docs/architecture/](architecture/) for the current architecture specification. -## Why This Exists +The monolithic architecture document was split to follow the SDD process's modular documentation pattern (~500 line target per document). The content now lives in: -The taskgraph CLI (`/workspace/@alkimiadev/taskgraph`) is useful but requires bash access. In agent systems, bash + untrusted data sources (web content, academic papers, etc.) is a security risk — adversarial content can instruct agents to exfiltrate data or take harmful actions through the shell. 
We've seen this in practice: researchers hiding prompt injections in academic papers using Unicode steganography that bypassed review systems. - -Rather than restricting which agents get bash access and hoping nothing goes wrong, we expose the graph and cost-benefit operations as a library callable as a native tool — no shell involved. - -The same graph code also serves agents that *do* have bash access — they call these operations directly as tools rather than shelling out to the CLI, which is faster and avoids argument parsing issues. - -## Why Not NAPI/Rust - -The original draft specified a Rust core with napi-rs bindings. That added significant complexity with minimal benefit for our use case: - -- **Cross-platform build pain** — macOS x64/ARM64, Linux x64/ARM64, Windows x64. Each needs a separate binary. Publishing is a headache. -- **Realistic graph sizes are small** — task graphs are typically 10–50 nodes, rarely exceeding 200. The performance difference between Rust and JS is negligible at this scale. -- **graphology already exists** — it provides all the DAG algorithms we need, and we already have it in the dependency tree at `/workspace/graphology`. -- **Runtime compatibility** — pure JS/TS works in Node, Deno, and Bun without native addon headaches. No platform-specific binaries. -- **Future UI path** — graphology is the graph engine behind sigma.js/react-sigma, making visualization straightforward later. -- **Near 1:1 petgraph ↔ graphology mapping** — porting back to Rust later is tractable because the graph operation semantics align closely. - -## Core Principle - -**The graph algorithms and cost-benefit math are the value.** Everything else — frontmatter parsing, file discovery, CLI output formatting — is input/output that belongs to the caller or to specific consumers. - -This is a standalone implementation. It replicates the essential logic from `/workspace/@alkimiadev/taskgraph` but does not depend on it. 
The upstream CLI continues to exist for human use and offline analysis. - -## Two Consumers - -### 1. alkhub (hub-spoke coordinator) - -The hub's database is the source of truth for tasks at runtime. The coordinator loads task rows + dependency edges from the DB, builds a graphology graph in memory, and runs graph algorithms (topo, cycles, parallel, critical path, bottleneck, risk-path). - -See `/workspace/@alkdev/alkhub_ts/docs/architecture/storage/tasks.md` for the DB schema and the graphology integration section. - -### 2. OpenCode plugin (task tool) - -An OpenCode plugin following the registry pattern (like `@alkdev/open-memory` and `@alkdev/open-coordinator`). Exposes a single `task` tool with `{action, args}` dispatch. Reads frontmatter from markdown files on disk, runs the same graph algorithms. Functionally replaces the `taskgraph` CLI for agents within OpenCode — no bash required. - -Commands replicated from the CLI (minus `graph`/DOT export which was added speculatively and isn't used): - -| CLI Command | Plugin Action | Notes | -|-------------|---------------|-------| -| `list` | `task({action: "list"})` | List all tasks | -| `show` | `task({action: "show", args: {id}})` | Show task details | -| `deps` | `task({action: "deps", args: {id}})` | What a task depends on | -| `dependents` | `task({action: "dependents", args: {id}})` | What depends on a task | -| `topo` | `task({action: "topo"})` | Topological order | -| `cycles` | `task({action: "cycles"})` | Cycle detection | -| `parallel` | `task({action: "parallel"})` | Parallel execution groups | -| `critical` | `task({action: "critical"})` | Critical path | -| `bottleneck` | `task({action: "bottleneck"})` | High-betweenness tasks | -| `risk` | `task({action: "risk"})` | Risk distribution | -| `risk-path` | `task({action: "riskPath"})` | Highest cumulative risk path | -| `decompose` | `task({action: "decompose"})` | Tasks that should be broken down | -| `workflow-cost` | `task({action: "workflowCost"})` | 
Expected value cost analysis | -| `validate` | `task({action: "validate"})` | Schema + graph validation | -| `init` | `task({action: "init", args: {id, name}})` | Scaffold a new task file | - -`init` is the **only write action**. All other actions are read-only. This matters for the security model: a read-only task tool is safe to expose to any agent; `init` requires write scope. - -## What We Replicate from taskgraph (Rust) - -### DependencyGraph — all algorithms - -| Operation | Source (Rust) | Implementation (TS) | -|-----------|---------------|---------------------| -| `hasCycles` | petgraph `is_cyclic_directed` | `graphology-dag` `hasCycle` | -| `findCycles` | DFS with recursion stack | Custom: DFS with 3-color marking + back-edge path extraction (see §findCycles) | -| `topologicalOrder` | petgraph `toposort` | `graphology-dag` `topologicalSort` | -| `dependencies(id)` | Incoming edges | graphology `inNeighbors` | -| `dependents(id)` | Outgoing edges | graphology `outNeighbors` | -| `parallelGroups` | Generational grouping | `graphology-dag` `topologicalGenerations` | -| `criticalPath` | Longest path by node count (memoized DFS) | Custom: same algorithm on graphology graph | -| `weightedCriticalPath` | Longest path by cumulative weight | Custom: same algorithm with weight function | -| `bottlenecks` | All-pairs path counting | `graphology-metrics` `betweenness` (Brandes) | - -### Categorical enums with numeric methods - -| Enum | Values | Method | Range | -|------|--------|--------|-------| -| `TaskScope` | single, narrow, moderate, broad, system | `costEstimate()` | 1.0–5.0 | -| `TaskRisk` | trivial, low, medium, high, critical | `successProbability()` | 0.50–0.98 | -| `TaskImpact` | isolated, component, phase, project | `weight()` | 1.0–3.0 | -| `TaskLevel` | planning, decomposition, implementation, review, research | — | (labeling only) | -| `TaskPriority` | low, medium, high, critical | — | (labeling only) | -| `TaskStatus` | pending, in-progress, 
completed, failed, blocked | — | (labeling only) | - -### Numeric method tables - -| TaskScope | costEstimate | tokenEstimate | -|-----------|-------------|---------------| -| single | 1.0 | 500 | -| narrow | 2.0 | 1500 | -| moderate | 3.0 | 3000 | -| broad | 4.0 | 6000 | -| system | 5.0 | 10000 | - -| TaskRisk | successProbability | riskWeight (1-p) | -|----------|--------------------|--------------------| -| trivial | 0.98 | 0.02 | -| low | 0.90 | 0.10 | -| medium | 0.80 | 0.20 | -| high | 0.65 | 0.35 | -| critical | 0.50 | 0.50 | - -| TaskImpact | weight | -|-----------|--------| -| isolated | 1.0 | -| component | 1.5 | -| phase | 2.0 | -| project | 3.0 | - -### Cost-benefit math - -- `calculateTaskEv` — expected value with retry logic (exact formula from Rust CLI) -- `riskPath` — `weightedCriticalPath(weight = riskWeight * impactWeight)` -- `shouldDecompose` — risk >= high OR scope >= broad -- `workflowCost` — DAG-propagation EV aggregation (see §Workflow-Cost DAG Propagation). Skips completed tasks unless flagged. 
-- `riskDistribution` — bucket tasks by risk category, show counts/percentages
-
-### Error types
-
-Typed error classes for programmatic recovery:
-
-```typescript
-class TaskgraphError extends Error {}
-class TaskNotFoundError extends TaskgraphError { taskId: string }
-class CircularDependencyError extends TaskgraphError { cycles: string[][] }
-class InvalidInputError extends TaskgraphError { field: string; message: string }
-```
-
-## What We Don't Replicate
-
-- `Task` / `TaskFrontmatter` Rust structs — replaced by typebox schemas + graphology node attributes
-- `TaskCollection` / directory scanning — filesystem discovery belongs to the consumer
-- `Config` / `.taskgraph.toml` — CLI configuration, not a library concern
-- `clap` command definitions — CLI dispatch, replaced by plugin tool dispatch or direct API calls
-- `toDot()` / DOT export — added speculatively, not used, dropped
-- Rust's all-pairs path-counting bottleneck — replaced by graphology betweenness (Brandes, O(VE) vs O(N²×paths))
-- Zod interop — typebox is the sole schema system. No Zod bridge planned. Consumers with Zod in their stack can convert at their boundary.
-
-## Schema & Types (@alkdev/typebox)
-
-All data shapes are defined as typebox schemas. This gives us:
-
-1. **Static TypeScript types** via `Static<T>` — compile-time safety
-2. **Runtime validation** via `Value.Check()` / `Value.Assert()` — reject bad input before it hits the graph
-3. **JSON Schema** for free — can be used by consumers for their own validation, API contracts, etc.
-
-The typebox schemas serve as the single source of truth for both types and validation. No separate type definitions, no Zod, no ad-hoc validation logic. 
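-
-Because the categorical enums are plain string unions, the numeric methods reduce to lookup tables. A dependency-free sketch (values transcribed from the numeric method tables above; function names match the standalone API listed later, but this is illustrative, not the package source):
-
```typescript
// Lookup tables transcribed from the numeric method tables in this document.
type TaskScope = "single" | "narrow" | "moderate" | "broad" | "system"
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical"
type TaskImpact = "isolated" | "component" | "phase" | "project"

const SCOPE_COST: Record<TaskScope, number> = { single: 1.0, narrow: 2.0, moderate: 3.0, broad: 4.0, system: 5.0 }
const SCOPE_TOKENS: Record<TaskScope, number> = { single: 500, narrow: 1500, moderate: 3000, broad: 6000, system: 10000 }
const RISK_P: Record<TaskRisk, number> = { trivial: 0.98, low: 0.90, medium: 0.80, high: 0.65, critical: 0.50 }
const IMPACT_WEIGHT: Record<TaskImpact, number> = { isolated: 1.0, component: 1.5, phase: 2.0, project: 3.0 }

function scopeCostEstimate(scope: TaskScope): number { return SCOPE_COST[scope] }
function scopeTokenEstimate(scope: TaskScope): number { return SCOPE_TOKENS[scope] }
function riskSuccessProbability(risk: TaskRisk): number { return RISK_P[risk] }
function riskWeight(risk: TaskRisk): number { return 1 - RISK_P[risk] } // riskWeight = 1 - p
function impactWeight(impact: TaskImpact): number { return IMPACT_WEIGHT[impact] }
```
-
-Keeping the mappings as data rather than switch statements keeps them auditable against the Rust CLI's tables.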
-
-### TaskInput schema
-
-The universal input shape for a task, matching the Rust `TaskFrontmatter` field set:
-
-```typescript
-const TaskInput = Type.Object({
-  id: Type.String(),
-  name: Type.String(),
-  dependsOn: Type.Array(Type.String()),
-  status: Type.Optional(TaskStatusEnum),
-  scope: Type.Optional(TaskScopeEnum),
-  risk: Type.Optional(TaskRiskEnum),
-  impact: Type.Optional(TaskImpactEnum),
-  level: Type.Optional(TaskLevelEnum),
-  priority: Type.Optional(TaskPriorityEnum),
-  tags: Type.Optional(Type.Array(Type.String())),
-  assignee: Type.Optional(Type.String()),
-  due: Type.Optional(Type.String()),
-  created: Type.Optional(Type.String()),
-  modified: Type.Optional(Type.String()),
-})
-```
-
-Categorical enums are defined with `Type.Union([Type.Literal(...), ...])` — string values matching the DB and frontmatter conventions.
-
-### DependencyEdge schema
-
-```typescript
-const DependencyEdge = Type.Object({
-  from: Type.String(), // prerequisite task id
-  to: Type.String(), // dependent task id
-  qualityDegradation: Type.Optional(Type.Number()), // 0.0–1.0, default 0.9
-})
-```
-
-The `qualityDegradation` field models how much upstream failure bleeds through to the dependent task. Value of 0.0 means no propagation (independent model), 1.0 means full propagation. Default is 0.9 following the Python research model. Only used by `workflowCost` in DAG-propagation mode; ignored by all other algorithms.
-
-### TaskGraphNodeAttributes schema
-
-Node attributes stored on the graphology graph. The node key is the task `id` (slug). 
Attributes carry only the metadata needed for graph analysis — no body/content:
-
-```typescript
-const TaskGraphNodeAttributes = Type.Object({
-  name: Type.String(),
-  scope: Type.Optional(TaskScopeEnum),
-  risk: Type.Optional(TaskRiskEnum),
-  impact: Type.Optional(TaskImpactEnum),
-  level: Type.Optional(TaskLevelEnum),
-  priority: Type.Optional(TaskPriorityEnum),
-  status: Type.Optional(TaskStatusEnum),
-})
-```
-
-### TaskGraphEdgeAttributes schema
-
-```typescript
-const TaskGraphEdgeAttributes = Type.Object({
-  qualityDegradation: Type.Optional(Type.Number()),
-})
-```
-
-Edges carry `qualityDegradation` for the DAG-propagation cost model. If absent, the default (0.9) is used by `workflowCost`. Other algorithms ignore edge attributes.
-
-### SerializedGraph schema
-
-Following the graphology native JSON format, parameterized with our attribute types:
-
-```typescript
-const TaskGraphSerialized = SerializedGraph(
-  TaskGraphNodeAttributes,
-  TaskGraphEdgeAttributes,
-  Type.Object({})
-)
-```
-
-This validates the graphology `export()` output and enables `import()` from validated JSON blobs.
-
-## Graph Model
-
-### Edge direction
-
-**prerequisite → dependent** (matches Rust CLI convention).
-
-If task B has `dependsOn: ["A"]`, the edge is **A → B** (A must complete before B).
-
-In graphology terms:
-- `graph.inNeighbors(B)` → prerequisites (what B depends on)
-- `graph.outNeighbors(A)` → dependents (what depends on A)
-- `graph.addEdge(A, B)` — prerequisite is source, dependent is target
-
-### Construction
-
-The graph must be constructable from multiple sources.
-
-```typescript
-// 1. From TaskInput array (frontmatter/JSON — most common)
-const graph = TaskGraph.fromTasks(tasks) // tasks: TaskInput[]
-
-// 2. From DB query results (alkhub use case — explicit edges with optional qualityDegradation)
-const graph = TaskGraph.fromRecords(tasks, edges) // edges: DependencyEdge[]
-
-// 3. From graphology native JSON (export/import round-trip)
-const graph = TaskGraph.fromJSON(data) // data: TaskGraphSerialized
-
-// 4. Incremental construction (programmatic/testing)
-const graph = new TaskGraph()
-graph.addTask("a", { name: "Task A" })
-graph.addTask("b", { name: "Task B", scope: "broad" })
-graph.addDependency("a", "b") // a is prerequisite of b
-```
-
-For paths 1 and 2, the preferred internal approach is to build a `SerializedGraph` JSON blob (nodes array + edges array) and call `graph.import()`. This is faster than N individual `addNode`/`addEdge` calls and avoids the verbose builder API. See graphology performance tips at `/workspace/graphology/docs/performance-tips.md`.
-
-**Note on qualityDegradation:** `fromTasks` constructs edges from `dependsOn` arrays in frontmatter, which cannot express per-edge `qualityDegradation`. Those edges get the default (0.9). `fromRecords` and `fromJSON` support per-edge values. Edges can be augmented after construction via `updateEdgeAttributes` if needed.
-
-### Categorical field defaults
-
-Categorical fields (`scope`, `risk`, `impact`, `level`) are optional (nullable) — NULL means "not yet assessed." The analysis functions need numeric values, so we provide a `resolveDefaults` helper:
-
-```typescript
-function resolveDefaults(attrs: TaskGraphNodeAttributes): ResolvedTaskAttributes
-```
-
-This maps None → the Rust CLI's default values:
-- risk: None → successProbability 0.80 (medium), riskWeight 0.20
-- scope: None → costEstimate 2.0 (narrow)
-- impact: None → weight 1.0 (isolated)
-
-The raw nullable data is preserved on the graph. `resolveDefaults` is called internally by analysis functions but is also available to consumers that need the same default logic.
-
-### Task metadata lives on nodes
-
-Unlike the original napi design where `DependencyGraph` only stored IDs, node attributes carry the categorical metadata directly. 
This eliminates the need to pass `TaskInput[]` alongside the graph — `weightedCriticalPath` and `riskPath` read attributes from the graph nodes. The graph acts as an in-memory index/metadata store; task body content stays external.
-
-### Graph reactivity
-
-graphology's `Graph` class extends Node.js `EventEmitter` and emits fine-grained mutation events: `nodeAdded`, `edgeAdded`, `nodeDropped`, `edgeDropped`, `nodeAttributesUpdated`, `edgeAttributesUpdated`, `cleared`, `edgesCleared`. `TaskGraph` does **not** wrap or re-emit these. Consumers that need reactivity (e.g., the OpenCode plugin for file-watch → coordinator notification) access the underlying graphology instance via `graph.raw` and attach listeners directly. This keeps `TaskGraph` as a pure computation library with no opinion about reactivity.
-
-## API Surface
-
-### TaskGraph class
-
-```typescript
-class TaskGraph {
-  // Construction
-  static fromTasks(tasks: TaskInput[]): TaskGraph
-  static fromRecords(tasks: TaskInput[], edges: DependencyEdge[]): TaskGraph
-  static fromJSON(data: TaskGraphSerialized): TaskGraph
-  addTask(id: string, attributes: TaskGraphNodeAttributes): void
-  addDependency(prerequisite: string, dependent: string): void
-
-  // Mutation
-  removeTask(id: string): void
-  removeDependency(prerequisite: string, dependent: string): void
-  updateTask(id: string, attributes: Partial<TaskGraphNodeAttributes>): void
-  updateEdgeAttributes(prerequisite: string, dependent: string, attrs: Partial<TaskGraphEdgeAttributes>): void
-
-  // Queries
-  hasCycles(): boolean
-  findCycles(): string[][]
-  topologicalOrder(): string[] // throws CircularDependencyError if cyclic
-  dependencies(taskId: string): string[]
-  dependents(taskId: string): string[]
-  taskCount(): number
-  getTask(taskId: string): TaskGraphNodeAttributes | undefined
-
-  // Analysis
-  parallelGroups(): string[][]
-  criticalPath(): string[]
-  weightedCriticalPath(weightFn: (taskId: string, attrs: TaskGraphNodeAttributes) => number): string[]
-  bottlenecks(): Array<{ taskId: string; 
score: number }>
-
-  // Cost-benefit (methods that use categorical data on nodes)
-  riskPath(): RiskPathResult
-  shouldDecompose(taskId: string): DecomposeResult
-  workflowCost(options?: WorkflowCostOptions): WorkflowCostResult
-  riskDistribution(): RiskDistributionResult
-
-  // Subgraph
-  subgraph(filter: (taskId: string, attrs: TaskGraphNodeAttributes) => boolean): TaskGraph
-
-  // Validation
-  validateSchema(): ValidationError[]
-  validateGraph(): GraphValidationError[]
-  validate(): ValidationError[]
-
-  // Export
-  export(): TaskGraphSerialized
-  toJSON(): TaskGraphSerialized
-
-  // Reactivity
-  get raw(): Graph // underlying graphology instance for direct event listener attachment
-}
-```
-
-### Standalone functions (can be used without TaskGraph class)
-
-```typescript
-// Categorical enum numeric methods
-function scopeCostEstimate(scope: TaskScope): number // 1.0–5.0
-function scopeTokenEstimate(scope: TaskScope): number // 500–10000
-function riskSuccessProbability(risk: TaskRisk): number // 0.50–0.98
-function riskWeight(risk: TaskRisk): number // 0.02–0.50
-function impactWeight(impact: TaskImpact): number // 1.0–3.0
-
-// Defaults resolution
-function resolveDefaults(attrs: Partial<TaskGraphNodeAttributes>): ResolvedTaskAttributes
-
-// Cost-benefit
-function calculateTaskEv(p: number, scopeCost: number, impactWeight: number, config?: EvConfig): EvResult
-function shouldDecomposeTask(attrs: TaskGraphNodeAttributes): DecomposeResult
-```
-
-### Return types
-
-```typescript
-import { Type, Static } from "@alkdev/typebox";
-
-export const RiskPathResult = Type.Object({
-  path: Type.Array(Type.String()),
-  totalRisk: Type.Number(),
-});
-export type RiskPathResult = Static<typeof RiskPathResult>;
-
-export const DecomposeResult = Type.Object({
-  shouldDecompose: Type.Boolean(),
-  reasons: Type.Array(Type.String()),
-});
-export type DecomposeResult = Static<typeof DecomposeResult>;
-
-export const WorkflowCostOptions = Type.Object({
-  includeCompleted: Type.Optional(Type.Boolean()),
-  limit: Type.Optional(Type.Number()),
- 
propagationMode: Type.Optional(
-    Type.Union([Type.Literal("independent"), Type.Literal("dag-propagate")])
-  ),
-  defaultQualityDegradation: Type.Optional(Type.Number()),
-});
-export type WorkflowCostOptions = Static<typeof WorkflowCostOptions>;
-
-export const WorkflowCostResult = Type.Object({
-  tasks: Type.Array(
-    Type.Object({
-      taskId: Type.String(),
-      name: Type.String(),
-      ev: Type.Number(),
-      pIntrinsic: Type.Number(),
-      pEffective: Type.Number(),
-      probability: Type.Number(),
-      scopeCost: Type.Number(),
-      impactWeight: Type.Number(),
-    })
-  ),
-  totalEv: Type.Number(),
-  averageEv: Type.Number(),
-  propagationMode: Type.Union([
-    Type.Literal("independent"),
-    Type.Literal("dag-propagate"),
-  ]),
-});
-export type WorkflowCostResult = Static<typeof WorkflowCostResult>;
-
-export const EvConfig = Type.Object({
-  retries: Type.Optional(Type.Number()),
-  fallbackCost: Type.Optional(Type.Number()),
-  timeLost: Type.Optional(Type.Number()),
-  valueRate: Type.Optional(Type.Number()),
-});
-export type EvConfig = Static<typeof EvConfig>;
-
-export const EvResult = Type.Object({
-  ev: Type.Number(),
-  pSuccess: Type.Number(),
-  expectedRetries: Type.Number(),
-});
-export type EvResult = Static<typeof EvResult>;
-
-export const RiskDistributionResult = Type.Object({
-  trivial: Type.Array(Type.String()),
-  low: Type.Array(Type.String()),
-  medium: Type.Array(Type.String()),
-  high: Type.Array(Type.String()),
-  critical: Type.Array(Type.String()),
-  unspecified: Type.Array(Type.String()),
-});
-export type RiskDistributionResult = Static<typeof RiskDistributionResult>;
-```
-
-## findCycles Implementation
-
-graphology does not provide a cycle extraction function — only `hasCycle` (boolean) and `stronglyConnectedComponents` (node groups, not paths). We implement a custom DFS cycle path extractor in `src/graph/queries.ts`.
-
-**Algorithm:** Extend the 3-color DFS (WHITE/GREY/BLACK) used by `graphology-dag`'s `hasCycle`. When a back edge is found (GREY → GREY), trace back through the recursion stack to extract the cycle path as an ordered node sequence. 
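-
-A dependency-free sketch of that extraction over a plain adjacency map (illustrative — the shipped `findCycles` walks the graphology instance and adds the SCC pre-check; this naive version may report overlapping cycles more than once on dense graphs):
-
```typescript
// 3-color DFS: WHITE = unvisited, GREY = on the current recursion stack, BLACK = finished.
// On a GREY→GREY back edge, slice the recursion stack to recover the ordered cycle path.
type AdjacencyMap = Record<string, string[]>

function findCycles(adjacency: AdjacencyMap): string[][] {
  const color = new Map<string, "white" | "grey" | "black">()
  const stack: string[] = []
  const cycles: string[][] = []

  const visit = (node: string): void => {
    color.set(node, "grey")
    stack.push(node)
    for (const next of adjacency[node] ?? []) {
      const c = color.get(next) ?? "white"
      if (c === "grey") {
        // Back edge: the cycle is the stack segment from `next` to `node`.
        cycles.push(stack.slice(stack.indexOf(next)))
      } else if (c === "white") {
        visit(next)
      }
    }
    stack.pop()
    color.set(node, "black")
  }

  for (const node of Object.keys(adjacency)) {
    if ((color.get(node) ?? "white") === "white") visit(node)
  }
  return cycles
}
```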
This returns the actual cycle paths needed for error reporting in `validate()`.
-
-**Optimization:** Use `stronglyConnectedComponents()` from `graphology-components` as a fast pre-check. If there are zero multi-node SCCs (and no self-loops), skip the DFS entirely — the graph is acyclic.
-
-**Relationship to `topologicalOrder`:** `topologicalOrder()` throws `CircularDependencyError` (with `cycles` populated) when the graph is cyclic, rather than returning `null`. This prevents silent ignoring of cycles and gives consumers the cycle information needed for error reporting.
-
-## Workflow-Cost DAG Propagation
-
-The Rust CLI computes EV per-task independently — no upstream quality degradation. As the framework doc in the Rust source notes, this is a simplified model (the "Kuhn poker analogy") — it captures a structural property of the problem but ignores how upstream failure degrades downstream work. The Python research notebook (`/workspace/@alkimiadev/taskgraph/docs/research/cost_benefit_analysis_framework.py`) implements a DAG-propagation model that addresses this.
-
-### Why DAG propagation matters
-
-The independent model is dangerously optimistic for non-trivial workflows. In a dependency chain where planning has p=0.65 (poor), the Python model shows a **213% cost increase** vs good planning (p=0.92). The independent model barely shows a difference because it ignores cascading failure. This structural property is independent of the "type" of developer — human, LLM, or otherwise.
-
-### Implementation
-
-We implement DAG propagation as the default mode, with the independent model as a degenerate case (sketch — numeric resolution and result assembly elided):
-
-```typescript
-function calculateWorkflowCost(graph, options): WorkflowCostResult {
-  const topoOrder = graph.topologicalOrder()
-  const upstreamSuccessProbs = new Map<string, number>()
-  let totalEv = 0
-
-  for (const nodeId of topoOrder) {
-    const pEff = options.propagationMode === 'dag-propagate'
-      ? computeEffectiveP(nodeId, upstreamSuccessProbs, graph, options)
-      : getIntrinsicP(nodeId) // intrinsic p via resolveDefaults(node attrs)
-
-    // scopeCost and impactWeight also come from resolveDefaults(node attrs)
-    const { ev, pSuccess } = calculateTaskEv(pEff, scopeCost, impactWeight, config)
-    upstreamSuccessProbs.set(nodeId, pSuccess)
-    totalEv += ev * impactWeight
-  }
-  // ...assemble per-task rows, compute averageEv, return WorkflowCostResult
-}
-
-function computeEffectiveP(nodeId, upstreamSuccessProbs, graph, options): number {
-  const parents = graph.dependencies(nodeId) // inNeighbors
-  if (parents.length === 0) return getIntrinsicP(nodeId)
-
-  let inheritedQuality = 1.0
-  for (const parent of parents) {
-    const parentP = upstreamSuccessProbs.get(parent)
-    const degradation = getEdgeDegradation(parent, nodeId) ?? options.defaultQualityDegradation
-    inheritedQuality *= (parentP + (1 - parentP) * (1 - degradation))
-  }
-  return getIntrinsicP(nodeId) * inheritedQuality
-}
-```
-
-**Key design choices:**
-- **Default mode:** `dag-propagate` — the independent model is the degenerate case (set `defaultQualityDegradation: 0`)
-- **Edge-level `qualityDegradation`** — carried on `TaskGraphEdgeAttributes`, defaults to 0.9. Expressible via `fromRecords` and `fromJSON`; frontmatter `dependsOn` gets the default. 
-- **Per-task output includes both `pIntrinsic` and `pEffective`** so consumers can see the degradation effect
-- **Depth-escalation** (increasing risk at deeper chain levels) is a future v2 consideration pending empirical calibration data from actual task outcomes
-
-### Comparison with Rust CLI
-
-| Dimension | Rust CLI (Simple Sum) | TS (DAG Propagation) |
-|-----------|----------------------|---------------------|
-| Topology awareness | None | Full — topological order + upstream propagation |
-| Upstream failure modeling | Ignored | Each parent's failure degrades child's effective p |
-| Edge semantics | Not used | `qualityDegradation` per edge, default 0.9 |
-| Result interpretation | Sum of independent per-task costs | Total workflow cost accounting for cascading failure |
-| Degenerate case | — | Set `propagationMode: 'independent'` or `defaultQualityDegradation: 0` |
-
-## Validation
-
-Two levels, plus a combined convenience call, consistent with the Rust CLI's `validate` command:
-
-1. **`validateSchema()`** — typebox `Value.Check` on input data (frontmatter fields, enum values, required fields)
-2. **`validateGraph()`** — graph-level invariants: cycle detection, dangling dependency references
-3. **`validate()`** — both, for convenience
-
-## Frontmatter Parsing
-
-Included in this package (not a separate module). Supports the same YAML frontmatter format as the Rust CLI.
-
-```typescript
-function parseFrontmatter(markdown: string): TaskInput
-function parseTaskFile(filePath: string): Promise<TaskInput>
-function parseTaskDirectory(dirPath: string): Promise<TaskInput[]>
-function serializeFrontmatter(task: TaskInput, body?: string): string
-```
-
-### No gray-matter — self-contained splitter + `yaml`
-
-We write our own `---` delimited frontmatter splitter (~40 lines) and use `yaml` (by eemeli) as the sole YAML parser. 
**`gray-matter` is not a dependency.** - -This is a deliberate supply-chain security decision: - -- **`gray-matter` depends on `js-yaml@3.x`** — an old version with known code injection vulnerabilities, pinned but unmaintained (last publish April 2021). Even with gray-matter's custom engine API, `js-yaml` is still *installed* in `node_modules` as a transitive dependency. The attack surface is the install, not the import. -- **js-yaml has an active CVE** (CVE-2025-64718 — prototype pollution via YAML merge key `<<`). Installing it at all is unacceptable. -- **gray-matter's full tree is 11 packages** (js-yaml, argparse, kind-of, section-matter, extend-shallow, is-extendable, strip-bom-string, etc.) — none of which we need for our use case. -- **Recent npm supply chain attacks** (April 2026: 18-package phishing compromise targeting chalk/debug/etc., the Shai-Hulud self-replicating worm hitting 500+ packages, the axios RAT incident) demonstrate that every dependency in the tree is potential attack surface. Small, focused libraries with zero transitive deps are the class of packages most likely to survive the current ecosystem trend — massive dependency trees for trivial functionality are becoming a liability. 
**The splitter implementation:**
-
-```typescript
-import { parse as yamlParse, stringify as yamlStringify } from 'yaml'
-
-const DELIMITER = '---'
-
-function splitFrontmatter(str: string): { data: string; content: string } | null {
-  if (!str.startsWith(DELIMITER)) return null
-  if (str.charAt(DELIMITER.length) === DELIMITER.slice(-1)) return null // reject '----', '-----', … (horizontal rules, not frontmatter)
-
-  const afterOpen = str.slice(DELIMITER.length)
-  const closeIndex = afterOpen.indexOf('\n' + DELIMITER)
-  if (closeIndex === -1) return null
-
-  const data = afterOpen.slice(0, closeIndex)
-  const content = afterOpen.slice(closeIndex + 1 + DELIMITER.length).replace(/^\r?\n/, '')
-  return { data, content }
-}
-
-function parseFrontmatter(markdown: string): { data: Record<string, unknown>; content: string } {
-  const split = splitFrontmatter(markdown)
-  if (!split || split.data.trim() === '') return { data: {}, content: split?.content ?? markdown }
-  return { data: yamlParse(split.data), content: split.content }
-}
-
-function serializeFrontmatter(task: TaskInput, body?: string): string {
-  const frontmatter = yamlStringify(task)
-  return DELIMITER + '\n' + frontmatter + DELIMITER + '\n' + (body ?? '')
-}
-```
-
-(The low-level `parseFrontmatter` shown here returns the raw `{ data, content }` pair; the public API in `src/frontmatter/parse.ts` layers typebox validation on top to produce a `TaskInput`.)
-
-**What we don't replicate from gray-matter:** TOML/Coffee engines, JavaScript eval engine, `section-matter` (nested sections), in-memory cache, `stringify()`. We don't use any of these. The `yaml` package handles `stringify` natively.
-
-**`yaml` package profile:**
-- Zero dependencies, full YAML 1.2 spec compliance, no known CVEs
-- Actively maintained, excellent TypeScript types
-- Single-package blast radius — if it's ever compromised, we fork it (pure JS, tractable to maintain)
-
-### WASM YAML parser — considered and rejected
-
-A Rust YAML crate compiled to WASM was considered as an alternative. This would eliminate even the `yaml` JS dependency, but it reintroduces complexity the napi→graphology pivot was designed to remove (Rust toolchain in CI, WASM compile target, cold-start latency, FFI boundary). 
The marginal security gain over `yaml` (already zero-dep) doesn't justify the added build complexity. - -## Project Structure - -``` -taskgraph_ts/ -├── package.json -├── tsconfig.json -├── src/ -│ ├── index.ts # Public API surface, re-exports -│ ├── schema/ -│ │ ├── index.ts # Re-exports all schemas -│ │ ├── enums.ts # TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskStatus, TaskPriority -│ │ ├── task.ts # TaskInput, DependencyEdge schemas -│ │ ├── graph.ts # TaskGraphNodeAttributes, TaskGraphEdgeAttributes, SerializedGraph -│ │ └── results.ts # RiskPathResult, DecomposeResult, WorkflowCostResult, RiskDistributionResult -│ ├── graph/ -│ │ ├── index.ts # TaskGraph class -│ │ ├── construction.ts # fromTasks, fromRecords, fromJSON, incremental building -│ │ ├── queries.ts # hasCycles, findCycles, topologicalOrder, dependencies, dependents -│ │ └── mutation.ts # removeTask, removeDependency, updateTask, updateEdgeAttributes -│ ├── analysis/ -│ │ ├── index.ts # Re-exports -│ │ ├── critical-path.ts # criticalPath, weightedCriticalPath -│ │ ├── bottleneck.ts # bottlenecks (graphology betweenness) -│ │ ├── risk.ts # riskPath, riskDistribution -│ │ ├── cost-benefit.ts # calculateTaskEv, workflowCost, computeEffectiveP -│ │ ├── decompose.ts # shouldDecompose -│ │ └── defaults.ts # resolveDefaults, enum numeric methods -│ ├── frontmatter/ -│ │ ├── index.ts # parseFrontmatter, parseTaskFile, parseTaskDirectory, serializeFrontmatter -│ │ ├── parse.ts # YAML/frontmatter parsing + typebox validation -│ │ └── serialize.ts # TaskInput → markdown with frontmatter -│ └── error/ -│ └── index.ts # TaskgraphError, TaskNotFoundError, CircularDependencyError, InvalidInputError -├── test/ -│ ├── graph.test.ts -│ ├── analysis.test.ts -│ ├── schema.test.ts -│ ├── frontmatter.test.ts -│ └── cost-benefit.test.ts -└── docs/ - └── architecture.md # This file -``` - -## Dependencies - -| Package | Purpose | -|---------|---------| -| `graphology` | Directed graph data structure + event emitter | 
-| `graphology-dag` | hasCycle, topologicalSort, topologicalGenerations | -| `graphology-metrics` | betweenness centrality (bottleneck) | -| `graphology-components` | strongly-connected components (findCycles pre-check) | -| `graphology-operators` | subgraph extraction | -| `@alkdev/typebox` | Schema definition, static types, runtime validation | -| `yaml` | YAML 1.2 parser (zero dependencies, no known CVEs) | - -## Build & Distribution - -- **Package**: `@alkdev/taskgraph` on npm -- **Module**: ESM primary, CJS compat -- **Targets**: Node 18+, Deno, Bun — pure JS, no native addons -- **Build**: `tsc` for declarations + bundler for distribution -- **No platform-specific binaries** — this is the whole point of the pivot - -## Resolved Design Decisions - -1. **Incremental vs rebuild on file change** — **Rebuild.** For our graph sizes (10–200 nodes), `graph.import()` from a serialized blob is sub-millisecond. Incremental updates would require tracking ID renames, dependency removals, and edge reconciliation — a whole change-detection layer for zero measurable performance gain. Both consumers (alkhub builds from DB query results; OpenCode plugin rebuilds from directory on file change) are well-served by rebuild. If a future use case requires incremental updates, add it as an optimization then. - -2. **Subgraph behavior** — **Strict internal-only.** `subgraph(filter)` returns a new `TaskGraph` with matching nodes and only edges where both endpoints are in the filtered set. This matches `graphology-operators` `subgraph` behavior and produces valid subgraphs for all algorithms (topo sort, betweenness, etc.). External dependency information is available on the original graph via `dependencies()`/`dependents()`. A separate `externalDependencies(filter)` utility can be added later if consumers need "show me what this subgraph depends on outside itself." - -3. 
**`topologicalOrder` on cyclic graph** — **Throw `CircularDependencyError`.** Both consumers treat cycles as bugs: alkhub's data comes from a validated DB schema; the OpenCode plugin's data comes from frontmatter that should be validated before graph construction. A partial ordering return type adds API complexity for a case that shouldn't happen in practice. `findCycles()` already exists for debugging when cycles are detected. - -4. **`workflowCost` skip-completed semantics** — **Always propagate through completed nodes; exclude from output only.** When `includeCompleted: false`, completed tasks are excluded from the result's task list, but they **remain in the propagation chain** with p=1.0. Removing completed tasks from propagation would *worsen* downstream probability estimates — exactly the opposite of what "what's left" queries need. The "show me what's done / not done" UX concern belongs in `list` with status filtering, not in `workflowCost`. - -5. **Depth-escalation for DAG propagation** — **Deferred to v2.** The multiplicative propagation model already captures depth effects implicitly: each hop compounds another `<1.0` factor. The Python research model shows substantial EV divergence between good and poor upstream planning (213% cost increase) purely from this compounding — without any explicit depth penalty. Adding an explicit depth heuristic on top would double-count the depth effect until we have empirical calibration data. The architecture supports future depth-escalation via per-edge `qualityDegradation` adjustments or `risk` categorical escalation without API changes. - -6. **Edge key generation** — **Adopt `${source}->${target}` keys from the start.** Using `addEdgeWithKey` with deterministic keys (`task-a->task-b`) avoids graphology's random key generation overhead and produces readable/debuggable edge identifiers. The constraint — no parallel edges between the same node pair — is correct for DAG dependency graphs. 
Duplicate dependency declarations are a validation error, not a valid use case. - -## Class Decomposition: Avoiding the Monolith - -The `TaskGraph` class as specified has ~25 methods spanning graph construction, mutation, queries, analysis, cost-benefit math, validation, and export. Making it a monolith would create duplicate work: both alkhub and the OpenCode plugin need to call the same analysis functions, but through different dispatch mechanisms. - -**The library is decomposed into standalone functions + a thin `TaskGraph` data class.** - -The `TaskGraph` class handles **graph construction, mutation, and basic queries only**: - -```typescript -class TaskGraph { - // Construction - static fromTasks(tasks: TaskInput[]): TaskGraph - static fromRecords(tasks: TaskInput[], edges: DependencyEdge[]): TaskGraph - static fromJSON(data: TaskGraphSerialized): TaskGraph - addTask(id: string, attributes: TaskGraphNodeAttributes): void - addDependency(prerequisite: string, dependent: string): void - - // Mutation - removeTask(id: string): void - removeDependency(prerequisite: string, dependent: string): void - updateTask(id: string, attributes: Partial): void - updateEdgeAttributes(prerequisite: string, dependent: string, attrs: Partial): void - - // Queries - hasCycles(): boolean - findCycles(): string[][] - topologicalOrder(): string[] - dependencies(taskId: string): string[] - dependents(taskId: string): string[] - taskCount(): number - getTask(taskId: string): TaskGraphNodeAttributes | undefined - - // Export - export(): TaskGraphSerialized - toJSON(): TaskGraphSerialized - get raw(): Graph -} -``` - -**All analysis functions are standalone** — they take a `TaskGraph` (or its underlying `Graph`) as their first argument. 
This is what the project structure already reflects (`src/analysis/critical-path.ts`, `src/analysis/risk.ts`, etc.): - -```typescript -// Analysis functions (standalone, composable) -function parallelGroups(graph: TaskGraph): string[][] -function criticalPath(graph: TaskGraph): string[] -function weightedCriticalPath(graph: TaskGraph, weightFn: ...): string[] -function bottlenecks(graph: TaskGraph): Array<{ taskId: string; score: number }> -function riskPath(graph: TaskGraph): RiskPathResult -function shouldDecomposeTask(attrs: TaskGraphNodeAttributes): DecomposeResult -function workflowCost(graph: TaskGraph, options?: WorkflowCostOptions): WorkflowCostResult -function riskDistribution(graph: TaskGraph): RiskDistributionResult -``` - -**The operations pattern (env/registry) belongs at the consumer layer, not the library layer.** The library exports pure functions. The OpenCode plugin wraps them in its own dispatch (`task({action: "workflowCost"})`). alkhub wraps them in its own operation definitions. The library doesn't need a registry — it's a toolkit, not a service. - -This avoids duplicate work: the same `workflowCost` implementation is called by both consumers, each wrapping it in their own dispatch mechanism. - -## Performance Notes - -From graphology's performance tips (`/workspace/graphology/docs/performance-tips.md`): - -- Prefer callback iteration (`forEachNode`, `forEachEdge`) over array-returning methods (`nodes()`, `edges()`) when iterating -- Use `addEdgeWithKey` with simple incremental keys instead of `addEdge` to skip the automatic key generation overhead -- Avoid callback nesting in hot loops; hoist inner callbacks -- For bulk construction, `graph.import(serializedData)` is faster than N individual add calls - -Realistic task graphs (10–200 nodes) make all of this academic, but the patterns are free to adopt. 
- -## Threat Model Context - -For background on the security motivation: - -- **Attack vector**: Agents with bash access processing untrusted content (web pages, academic papers, API responses) can be manipulated via prompt injection. This includes subtle attacks like Unicode steganography hiding instructions in otherwise legitimate content. -- **Defense in depth**: The instruction firewall project (using Ternary Bonsai 1.7b classifier to detect instruction-bearing content) addresses detection. This project addresses the other side — reducing the blast radius by removing bash as a requirement for analysis operations. -- **Tool-based access**: Instead of `taskgraph --json list | jq`, agents call `task.list()` as a tool. No shell, no injection surface, no data exfiltration path through bash. -- **Supply chain defense**: We write our own `---` frontmatter splitter (~40 lines) and depend only on `yaml` (zero transitive deps, no known CVEs). No `gray-matter`, no `js-yaml` — eliminates 11 packages from the tree. Recent npm supply chain attacks (18-package phishing compromise, Shai-Hulud self-replicating worm, axios RAT) demonstrate that every installed dependency is attack surface. Small, focused libraries with zero transitive deps are the class of packages most likely to survive the current ecosystem trend — massive dependency trees for trivial functionality are becoming a liability. 
- -## References - -- Rust taskgraph CLI: `/workspace/@alkimiadev/taskgraph/` -- graphology monorepo: `/workspace/graphology/` -- alkhub task storage spec: `/workspace/@alkdev/alkhub_ts/docs/architecture/storage/tasks.md` -- @alkdev/typebox: `/workspace/@alkdev/typebox/` -- open-memory plugin (registry pattern ref): `/workspace/@alkdev/open-memory/` -- open-coordinator plugin (registry pattern ref): `/workspace/@alkimiadev/open-coordinator/` -- Older graphology + typebox POC: `/workspace/lbug_test/convert_graphology.ts` -- Older taskgraph MCP POC (graphology usage ref): `/workspace/tools/ade_mcp/src/core/TaskGraphManager.ts` -- Python cost-benefit research: `/workspace/@alkimiadev/taskgraph/docs/research/cost_benefit_analysis_framework.py` \ No newline at end of file +- [architecture/README.md](architecture/README.md) — Overview, problem statement, consumer context +- [architecture/graph-model.md](architecture/graph-model.md) — Edge direction, construction, defaults, metadata +- [architecture/api-surface.md](architecture/api-surface.md) — TaskGraph class, standalone functions, return types +- [architecture/schemas.md](architecture/schemas.md) — TypeBox schemas, enums, numeric methods +- [architecture/cost-benefit.md](architecture/cost-benefit.md) — EV math, risk, DAG propagation, findCycles +- [architecture/frontmatter.md](architecture/frontmatter.md) — Parsing, serialization, supply chain security +- [architecture/errors-validation.md](architecture/errors-validation.md) — Error types, validation levels +- [architecture/build-distribution.md](architecture/build-distribution.md) — Dependencies, project structure, targets +- [architecture/decisions/](architecture/decisions/) — ADR records for design decisions \ No newline at end of file diff --git a/docs/architecture/README.md b/docs/architecture/README.md new file mode 100644 index 0000000..17e977e --- /dev/null +++ b/docs/architecture/README.md @@ -0,0 +1,138 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + 
+# @alkdev/taskgraph Architecture + +Pure TypeScript task graph library with graphology. Replicates and extends the essential graph algorithms and cost-benefit math from the Rust taskgraph CLI. + +## Why This Exists + +The taskgraph CLI (`@alkimiadev/taskgraph`) is useful but requires bash access. In agent systems, bash + untrusted data sources is a security risk — adversarial content can instruct agents to exfiltrate data or take harmful actions through the shell. This has been observed in practice: researchers hiding prompt injections in academic papers using Unicode steganography that bypassed review systems. + +Rather than restricting which agents get bash access and hoping nothing goes wrong, this library exposes the graph and cost-benefit operations as a callable API — no shell involved. + +The same graph code also serves agents that *do* have bash access — they call these operations directly rather than shelling out to the CLI, which is faster and avoids argument parsing issues. + +## Core Principle + +**The graph algorithms and cost-benefit math are the value.** Everything else — frontmatter parsing, file discovery, CLI output formatting — is input/output that belongs to the caller or to specific consumers. + +This is a standalone implementation. It replicates the essential logic from the Rust CLI but does not depend on it. The upstream CLI continues to exist for human use and offline analysis. + +## Why Not NAPI/Rust + +The original draft specified a Rust core with napi-rs bindings. That added significant complexity with minimal benefit for our use case: + +- **Cross-platform build pain** — macOS x64/ARM64, Linux x64/ARM64, Windows x64. Each needs a separate binary. +- **Realistic graph sizes are small** — task graphs are typically 10–50 nodes, rarely exceeding 200. The performance difference between Rust and JS is negligible at this scale. 
+- **graphology already exists** — it provides all the DAG algorithms we need, and we already have it in the dependency tree. +- **Runtime compatibility** — pure JS/TS works in Node, Deno, and Bun without native addon headaches. +- **Future UI path** — graphology is the graph engine behind sigma.js/react-sigma, making visualization straightforward later. +- **Near 1:1 petgraph ↔ graphology mapping** — porting back to Rust later is tractable because the graph operation semantics align closely. + +> See [ADR-001: Pivot to TypeScript + graphology](decisions/001-pivot-to-typescript-graphology.md) for the full decision record. + +## What This Library Provides + +Replicated from the Rust CLI: + +- **Graph algorithms** — topological sort, cycle detection, parallel groups, critical path, bottleneck analysis, dependency queries +- **Categorical enums with numeric methods** — TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskPriority, TaskStatus +- **Cost-benefit analysis** — expected value calculation, risk distribution, decomposition detection +- **DAG-propagation cost model** — extends the Rust CLI's independent model with multiplicative upstream failure propagation. The Rust CLI treats each task's cost independently; the Python research model demonstrates that this is dangerously optimistic for non-trivial workflows — poor planning (p=0.65) produces a 213% cost increase vs good planning (p=0.92) when accounting for cascading failure. + +> See [cost-benefit.md](cost-benefit.md) for the propagation model details. 
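To make the compounding claim concrete, here is a back-of-envelope sketch. It is an illustration only: it assumes pure multiplicative propagation along a linear chain with a uniform per-task p, with no per-edge `qualityDegradation` and none of the research model's cost terms, so it does not reproduce the 213% figure.

```typescript
// Probability that a linear chain of `depth` tasks completes cleanly when
// each task succeeds with probability p and success must compound hop by hop.
function chainSuccessProbability(p: number, depth: number): number {
  let acc = 1
  for (let i = 0; i < depth; i++) acc *= p
  return acc
}

const good = chainSuccessProbability(0.92, 5) // ≈ 0.659
const poor = chainSuccessProbability(0.65, 5) // ≈ 0.116
```

Five hops at p=0.65 leave roughly a 1-in-9 clean-completion rate versus roughly 2-in-3 at p=0.92; the divergence comes entirely from compounding, with no explicit depth penalty.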
+ +Not replicated (belongs to callers/specific consumers): + +- `Task` / `TaskFrontmatter` Rust structs — replaced by TypeBox schemas + graphology node attributes +- `TaskCollection` / directory scanning — filesystem discovery belongs to the consumer +- `Config` / `.taskgraph.toml` — CLI configuration, not a library concern +- `clap` command definitions — CLI dispatch, replaced by consumer's own dispatch +- `toDot()` / DOT export — added speculatively in Rust, not used, dropped +- Zod interop — TypeBox is the sole schema system + +## Consumer Context + +Two downstream projects consume this library. Understanding their needs shapes the library's construction and API design: + +### alkhub (hub-spoke coordinator) + +The hub's database is the source of truth for tasks at runtime. The coordinator loads task rows + dependency edges from the DB, builds a graphology graph in memory, and runs graph algorithms. This consumer: + +- Builds graphs from structured data (DB query results), not files +- Needs per-edge `qualityDegradation` attributes for the DAG propagation model +- Requires the same analysis functions the CLI provides, but called as an API, not via shell + +> See alkhub task storage spec: `/workspace/@alkdev/alkhub_ts/docs/architecture/storage/tasks.md` + +### OpenCode plugin (future) + +An OpenCode plugin following the registry pattern (like `@alkdev/open-memory` and `@alkdev/open-coordinator`). Will expose a `task` tool with `{action, args}` dispatch. Reads frontmatter from markdown files on disk, runs the same graph algorithms. Functionally replaces the taskgraph CLI for agents within OpenCode — no bash required. 
This consumer: + +- Builds graphs from file-based frontmatter, not DB queries +- Uses the library's frontmatter parsing (included in this package) +- Wraps library functions in its own dispatch mechanism +- Needs `init` as the only write action; all other actions are read-only (security model) + +The specific CLI→plugin dispatch mapping belongs in the plugin's own architecture, not here. The library's contract is: export pure functions, let consumers wrap them however they need. + +## Threat Model + +- **Attack vector**: Agents with bash access processing untrusted content (web pages, academic papers, API responses) can be manipulated via prompt injection, including subtle attacks like Unicode steganography hiding instructions in otherwise legitimate content. +- **Defense in depth**: The instruction firewall project (Ternary Bonsai classifier to detect instruction-bearing content) addresses detection. This library addresses the other side — reducing blast radius by removing bash as a requirement for analysis operations. +- **Tool-based access**: Instead of `taskgraph --json list | jq`, agents call library functions directly. No shell, no injection surface, no data exfiltration path through bash. +- **Supply chain defense**: The frontmatter parser avoids `gray-matter` (which pulls in the vulnerable `js-yaml@3.x`). The library depends only on `yaml` (zero transitive deps, no known CVEs). See [frontmatter.md](frontmatter.md) for the full supply chain argument. + +## Structural Principle: Upstream Failures Multiply + +The cost-benefit framework demonstrates a structural property independent of developer type (human, LLM, or otherwise): errors upstream multiply the surface area for errors downstream. 
+ +``` +planning failure → wrong decomposition → wasted implementation +decomposition failure → unclear tasks → rework +review failure → bugs shipped → rework +``` + +This is why the library implements DAG-propagation as the default cost model: it captures this multiplicative effect structurally, rather than treating each task's cost as independent. When people simplistically complain about "AI slop," what they should really be saying is "I suck at planning and that leads to poor implementations" — the structural property holds regardless of who's doing the work. + +> See [cost-benefit.md](cost-benefit.md) and the Rust taskgraph's framework doc: `/workspace/@alkimiadev/taskgraph/docs/framework.md` + +## Architecture Documents + +| Document | Content | +|----------|---------| +| [graph-model.md](graph-model.md) | Edge direction, construction paths, categorical defaults, node metadata, reactivity | +| [api-surface.md](api-surface.md) | TaskGraph data class, standalone analysis functions, return types | +| [schemas.md](schemas.md) | TypeBox schemas, categorical enums, numeric methods | +| [cost-benefit.md](cost-benefit.md) | EV math, risk analysis, DAG propagation, findCycles approach | +| [frontmatter.md](frontmatter.md) | Parsing, serialization, supply chain security decisions | +| [errors-validation.md](errors-validation.md) | Error types, validation levels | +| [build-distribution.md](build-distribution.md) | Dependencies, project structure, targets, performance | + +### Design Decisions + +All significant decisions are documented as ADRs in [decisions/](decisions/): + +| ADR | Decision | +|-----|----------| +| [001](decisions/001-pivot-to-typescript-graphology.md) | Pivot from NAPI/Rust to TypeScript + graphology | +| [002](decisions/002-rebuild-vs-incremental.md) | Rebuild graph on change, not incremental updates | +| [003](decisions/003-topo-order-throws-on-cycle.md) | topologicalOrder throws CircularDependencyError | +| 
[004](decisions/004-workflow-cost-dag-propagation.md) | DAG-propagation as default workflow cost model | +| [005](decisions/005-no-depth-escalation-v1.md) | No depth-escalation heuristic in v1 | +| [006](decisions/006-deterministic-edge-keys.md) | Deterministic edge keys via addEdgeWithKey | +| [007](decisions/007-subgraph-internal-only.md) | Subgraph returns internal-only edges | + +## References + +- Rust taskgraph CLI: `/workspace/@alkimiadev/taskgraph/` +- graphology monorepo: `/workspace/graphology/` +- alkhub task storage spec: `/workspace/@alkdev/alkhub_ts/docs/architecture/storage/tasks.md` +- @alkdev/typebox: `/workspace/@alkdev/typebox/` +- Cost-benefit framework: `/workspace/@alkimiadev/taskgraph/docs/framework.md` +- Workflow guide: `/workspace/@alkimiadev/taskgraph/docs/workflow.md` +- Python cost-benefit research: `/workspace/@alkimiadev/taskgraph/docs/research/cost_benefit_analysis_framework.py` +- SDD process: `/workspace/@alkdev/taskgraph_ts/docs/sdd_process.md` \ No newline at end of file diff --git a/docs/architecture/api-surface.md b/docs/architecture/api-surface.md new file mode 100644 index 0000000..947f747 --- /dev/null +++ b/docs/architecture/api-surface.md @@ -0,0 +1,215 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# API Surface + +The library's public API: a thin `TaskGraph` data class for graph construction/mutation/basic queries, plus standalone composable analysis functions. + +## Design Principle: Decomposition over Monolith + +The `TaskGraph` class handles **graph construction, mutation, and basic queries only**. All analysis functions (parallel groups, critical path, cost-benefit, etc.) are standalone functions that take a `TaskGraph` as their first argument. + +**Why**: Both consumers (alkhub, OpenCode plugin) need the same analysis functions but through different dispatch mechanisms. The library exports pure functions; each consumer wraps them in its own dispatch. 
This avoids duplicate work and prevents the class from becoming a 25+ method monolith. + +> The operations/dispatch pattern belongs at the consumer layer, not the library layer. The library is a toolkit, not a service. + +## TaskGraph Class + +```typescript +class TaskGraph { + // Construction + static fromTasks(tasks: TaskInput[]): TaskGraph + static fromRecords(tasks: TaskInput[], edges: DependencyEdge[]): TaskGraph + static fromJSON(data: TaskGraphSerialized): TaskGraph + addTask(id: string, attributes: TaskGraphNodeAttributes): void + addDependency(prerequisite: string, dependent: string): void + + // Mutation + removeTask(id: string): void + removeDependency(prerequisite: string, dependent: string): void + updateTask(id: string, attributes: Partial<TaskGraphNodeAttributes>): void + updateEdgeAttributes(prerequisite: string, dependent: string, attrs: Partial<TaskGraphEdgeAttributes>): void + + // Queries + hasCycles(): boolean + findCycles(): string[][] + topologicalOrder(): string[] // throws CircularDependencyError if cyclic + dependencies(taskId: string): string[] + dependents(taskId: string): string[] + taskCount(): number + getTask(taskId: string): TaskGraphNodeAttributes | undefined + + // Subgraph + subgraph(filter: (taskId: string, attrs: TaskGraphNodeAttributes) => boolean): TaskGraph + + // Export + export(): TaskGraphSerialized + toJSON(): TaskGraphSerialized + + // Reactivity + get raw(): Graph // underlying graphology instance for direct event listener attachment +} +``` + +**Notes**: +- `topologicalOrder()` throws `CircularDependencyError` (with `cycles` populated) when cyclic — see [ADR-003](decisions/003-topo-order-throws-on-cycle.md) +- `subgraph()` returns a new `TaskGraph` with matching nodes and only edges where both endpoints are in the filtered set — see [ADR-007](decisions/007-subgraph-internal-only.md) +- `addDependency` uses `addEdgeWithKey` with deterministic keys (`${source}->${target}`) — see [ADR-006](decisions/006-deterministic-edge-keys.md) +- `addTask` throws
`DuplicateNodeError` if the ID already exists, `addDependency` throws `DuplicateEdgeError` if the edge already exists, and `TaskNotFoundError` if either endpoint doesn't exist in the graph — see [errors-validation.md](errors-validation.md) + +## Standalone Analysis Functions + +All analysis functions take a `TaskGraph` (or its raw graphology `Graph`) as their first argument. They are composable and stateless. + +### Graph analysis + +```typescript +function parallelGroups(graph: TaskGraph): string[][] +function criticalPath(graph: TaskGraph): string[] +function weightedCriticalPath(graph: TaskGraph, weightFn: (taskId: string, attrs: TaskGraphNodeAttributes) => number): string[] +function bottlenecks(graph: TaskGraph): Array<{ taskId: string; score: number }> +``` + +### Cost-benefit analysis + +```typescript +function riskPath(graph: TaskGraph): RiskPathResult +function shouldDecomposeTask(attrs: TaskGraphNodeAttributes): DecomposeResult +function workflowCost(graph: TaskGraph, options?: WorkflowCostOptions): WorkflowCostResult +function riskDistribution(graph: TaskGraph): RiskDistributionResult +``` + +> **Note on `shouldDecomposeTask`**: Takes `TaskGraphNodeAttributes` (nullable categorical fields) and internally calls `resolveDefaults` for `risk` and `scope`. Unassessed fields (null) use defaults that are below the decomposition threshold, so only explicitly-assessed high-risk or broad-scope tasks are flagged. See [cost-benefit.md](cost-benefit.md). + +> **Note on `workflowCost` vs `calculateTaskEv`**: `calculateTaskEv` is a pure math function (takes numeric inputs, returns `EvResult`). `workflowCost` orchestrates the per-task calls, handles DAG propagation, and enriches results with `taskId` and `name` from the graph's node attributes. The per-task `EvResult` is a subset of `WorkflowCostResult.tasks[i]`. 
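As a sketch of what consumer-side wrapping of these standalone functions looks like in practice (the graph shape, function bodies, and action names below are hypothetical stand-ins, not this library's API):

```typescript
// Two stand-in "analysis" functions following the library's convention:
// pure, stateless, graph as the first argument.
type MiniGraph = { nodes: string[] }
const taskCount = (g: MiniGraph): number => g.nodes.length
const rootTask = (g: MiniGraph): string | undefined => g.nodes[0]

// Consumer-layer dispatch over {action}: the registry lives in the consumer,
// not the library, so both consumers reuse the same pure functions.
const registry: Record<string, (g: MiniGraph) => unknown> = { taskCount, rootTask }

function dispatch(g: MiniGraph, action: string): unknown {
  const fn = registry[action]
  if (!fn) throw new Error(`unknown action: ${action}`)
  return fn(g)
}
```

Each consumer builds its own `registry` and error-reporting policy; the library ships only the pure functions.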
### Categorical enum numeric methods + +```typescript +function scopeCostEstimate(scope: TaskScope): number // 1.0–5.0 +function scopeTokenEstimate(scope: TaskScope): number // 500–10000 +function riskSuccessProbability(risk: TaskRisk): number // 0.50–0.98 +function riskWeight(risk: TaskRisk): number // 0.02–0.50 +function impactWeight(impact: TaskImpact): number // 1.0–3.0 + +function resolveDefaults(attrs: Partial<TaskGraphNodeAttributes>): ResolvedTaskAttributes +``` + +### Cost-benefit core + +```typescript +function calculateTaskEv(p: number, scopeCost: number, impactWeight: number, config?: EvConfig): EvResult +``` + +> See [schemas.md](schemas.md) for the enum definitions and numeric mapping tables. + +## Return Types + +All return types are defined as TypeBox schemas (for runtime validation + JSON Schema export) with corresponding static TypeScript types. + +### RiskPathResult + +```typescript +const RiskPathResult = Type.Object({ + path: Type.Array(Type.String()), + totalRisk: Type.Number(), +}) +``` + +### DecomposeResult + +```typescript +const DecomposeResult = Type.Object({ + shouldDecompose: Type.Boolean(), + reasons: Type.Array(Type.String()), +}) +``` + +### WorkflowCostOptions + +```typescript +const WorkflowCostOptions = Type.Object({ + includeCompleted: Type.Optional(Type.Boolean()), + limit: Type.Optional(Type.Number()), + propagationMode: Type.Optional( + Type.Union([Type.Literal("independent"), Type.Literal("dag-propagate")]) + ), + defaultQualityDegradation: Type.Optional(Type.Number()), +}) +``` + +### WorkflowCostResult + +```typescript +const WorkflowCostResult = Type.Object({ + tasks: Type.Array( + Type.Object({ + taskId: Type.String(), + name: Type.String(), + ev: Type.Number(), + pIntrinsic: Type.Number(), + pEffective: Type.Number(), + probability: Type.Number(), + scopeCost: Type.Number(), + impactWeight: Type.Number(), + }) + ), + totalEv: Type.Number(), + averageEv: Type.Number(), + propagationMode: Type.Union([ + Type.Literal("independent"), +
Type.Literal("dag-propagate"), + ]), +}) +``` + +### EvConfig / EvResult + +```typescript +const EvConfig = Type.Object({ + retries: Type.Optional(Type.Number()), + fallbackCost: Type.Optional(Type.Number()), + timeLost: Type.Optional(Type.Number()), + valueRate: Type.Optional(Type.Number()), +}) + +const EvResult = Type.Object({ + ev: Type.Number(), + pSuccess: Type.Number(), + expectedRetries: Type.Number(), +}) +``` + +### RiskDistributionResult + +```typescript +const RiskDistributionResult = Type.Object({ + trivial: Type.Array(Type.String()), + low: Type.Array(Type.String()), + medium: Type.Array(Type.String()), + high: Type.Array(Type.String()), + critical: Type.Array(Type.String()), + unspecified: Type.Array(Type.String()), +}) +``` + +> Full schema definitions with Static type exports are in [schemas.md](schemas.md). + +## Validation API + +```typescript +// On TaskGraph instances: +validateSchema(): ValidationError[] // TypeBox validation on input data +validateGraph(): GraphValidationError[] // Graph-level invariants (cycles, dangling refs) +validate(): ValidationError[] // Both, for convenience +``` + +> See [errors-validation.md](errors-validation.md) for error types and validation details. + +## Constraints + +- **No write actions in analysis functions** — all analysis functions are pure reads. `shouldDecomposeTask` only inspects attributes, it doesn't modify the graph. +- **throw-on-cycle for topo sort** — `topologicalOrder` throws rather than returning a partial result. See [ADR-003](decisions/003-topo-order-throws-on-cycle.md). +- **Analysis functions are independent** — they can be called in any order, without prerequisites beyond a valid graph. 
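The throw-on-cycle contract can be sketched with Kahn's algorithm. This is a self-contained illustration with hypothetical node/edge shapes; the real implementation delegates to `graphology-dag` and populates `cycles` on the error, which this sketch omits:

```typescript
class CircularDependencyError extends Error {}

// Kahn's algorithm: repeatedly emit zero-indegree nodes. If some nodes are
// never emitted, they sit on a cycle, so the contract is to throw rather
// than return a partial ordering.
function topologicalOrderSketch(nodes: string[], edges: [string, string][]): string[] {
  const indegree = new Map(nodes.map((n): [string, number] => [n, 0]))
  for (const [, target] of edges) indegree.set(target, (indegree.get(target) ?? 0) + 1)
  const queue = nodes.filter((n) => indegree.get(n) === 0)
  const order: string[] = []
  while (queue.length > 0) {
    const node = queue.shift()!
    order.push(node)
    for (const [source, target] of edges) {
      if (source !== node) continue
      const next = indegree.get(target)! - 1
      indegree.set(target, next)
      if (next === 0) queue.push(target)
    }
  }
  if (order.length !== nodes.length) throw new CircularDependencyError("cycle detected")
  return order
}
```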
\ No newline at end of file diff --git a/docs/architecture/build-distribution.md b/docs/architecture/build-distribution.md new file mode 100644 index 0000000..bad0e32 --- /dev/null +++ b/docs/architecture/build-distribution.md @@ -0,0 +1,89 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# Build & Distribution + +Dependencies, project structure, build targets, and performance notes. + +## Dependencies + +| Package | Purpose | +|---------|---------| +| `graphology` | Directed graph data structure + event emitter | +| `graphology-dag` | hasCycle, topologicalSort, topologicalGenerations | +| `graphology-metrics` | betweenness centrality (bottleneck) | +| `graphology-components` | strongly-connected components (findCycles pre-check) | +| `graphology-operators` | subgraph extraction | +| `@alkdev/typebox` | Schema definition, static types, runtime validation | +| `yaml` | YAML 1.2 parser (zero dependencies, no known CVEs) | + +## Project Structure + +``` +taskgraph_ts/ +├── package.json +├── tsconfig.json +├── src/ +│ ├── index.ts # Public API surface, re-exports +│ ├── schema/ +│ │ ├── index.ts # Re-exports all schemas +│ │ ├── enums.ts # TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskStatus, TaskPriority +│ │ ├── task.ts # TaskInput, DependencyEdge schemas +│ │ ├── graph.ts # TaskGraphNodeAttributes, TaskGraphEdgeAttributes, SerializedGraph +│ │ └── results.ts # RiskPathResult, DecomposeResult, WorkflowCostResult, RiskDistributionResult +│ ├── graph/ +│ │ ├── index.ts # TaskGraph class +│ │ ├── construction.ts # fromTasks, fromRecords, fromJSON, incremental building +│ │ ├── queries.ts # hasCycles, findCycles, topologicalOrder, dependencies, dependents +│ │ └── mutation.ts # removeTask, removeDependency, updateTask, updateEdgeAttributes +│ ├── analysis/ +│ │ ├── index.ts # Re-exports +│ │ ├── critical-path.ts # criticalPath, weightedCriticalPath +│ │ ├── bottleneck.ts # bottlenecks (graphology betweenness) +│ │ ├── risk.ts # riskPath, riskDistribution +│ │ 
├── cost-benefit.ts # calculateTaskEv, workflowCost, computeEffectiveP +│ │ ├── decompose.ts # shouldDecomposeTask +│ │ └── defaults.ts # resolveDefaults, enum numeric methods +│ ├── frontmatter/ +│ │ ├── index.ts # parseFrontmatter, parseTaskFile, parseTaskDirectory, serializeFrontmatter +│ │ ├── parse.ts # YAML/frontmatter parsing + typebox validation +│ │ └── serialize.ts # TaskInput → markdown with frontmatter +│ └── error/ +│ └── index.ts # TaskgraphError, TaskNotFoundError, CircularDependencyError, InvalidInputError +├── test/ +│ ├── graph.test.ts +│ ├── analysis.test.ts +│ ├── schema.test.ts +│ ├── frontmatter.test.ts +│ └── cost-benefit.test.ts +└── docs/ + └── architecture/ # This architecture document set +``` + +The structure reflects the decomposition decision: `src/analysis/` contains standalone functions, `src/graph/` contains the TaskGraph data class. This is not an accident — it enforces at the filesystem level that analysis functions are separate from the graph class. + +## Build & Distribution + +- **Package**: `@alkdev/taskgraph` on npm +- **Module**: ESM primary, CJS compat +- **Targets**: Node 18+, Deno, Bun — pure JS, no native addons +- **Build**: `tsc` for declarations + bundler for distribution +- **No platform-specific binaries** — this is the whole point of the pivot from NAPI/Rust + +## Performance Notes + +From graphology's performance tips: +- Prefer callback iteration (`forEachNode`, `forEachEdge`) over array-returning methods (`nodes()`, `edges()`) when iterating +- Use `addEdgeWithKey` with deterministic `${source}->${target}` keys instead of `addEdge` to skip the automatic key generation overhead — see [ADR-006](decisions/006-deterministic-edge-keys.md) +- Avoid callback nesting in hot loops; hoist inner callbacks +- For bulk construction, `graph.import(serializedData)` is faster than N individual add calls + +Realistic task graphs (10–200 nodes) make all of this academic, but the patterns are free to adopt. 
+ +## Constraints + +- **Pure JavaScript** — no Rust, no WASM, no native addons. This is non-negotiable — it's the core design decision. +- **ESM primary** — CJS compat is a distribution concern, not a design choice. Consumers should import as ESM. +- **No platform-specific binaries** — the library must work in Node, Deno, and Bun without compilation steps. \ No newline at end of file diff --git a/docs/architecture/cost-benefit.md b/docs/architecture/cost-benefit.md new file mode 100644 index 0000000..0cc43d2 --- /dev/null +++ b/docs/architecture/cost-benefit.md @@ -0,0 +1,131 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# Cost-Benefit Analysis + +Expected value math, risk analysis, DAG-propagation cost model, and cycle detection. + +## Overview + +The cost-benefit functions are the key analytical value of the library. They go beyond simple graph topology to answer structural questions about task workflows: which path has the highest cumulative risk? What's the expected cost of a workflow? Which tasks should be decomposed? + +These functions implement the cost-benefit framework from `/workspace/@alkimiadev/taskgraph/docs/framework.md` and extend it with DAG-propagation (from the Python research model) that the Rust CLI's independent model ignores. + +## Core Concepts + +### Expected Value of a Task + +``` +EV_task = P_success × C_success + (1 - P_success) × C_fail +``` + +Where categorical fields provide the inputs: +- **P_success** = `riskSuccessProbability(risk)` — probability the task completes successfully +- **C_success** = `scopeCostEstimate(scope)` — cost when it works +- **C_fail** = modeled via `EvConfig` parameters: `scopeCost + fallbackCost + timeLost × expectedRetries`. The `calculateTaskEv` function uses `scopeCost` as `C_success` and derives `C_fail` from the same `scopeCost` plus `fallbackCost` and `timeLost` scaled by expected retry count. 
`fallbackCost` and `timeLost` default to 0 if not provided, yielding `C_fail = C_success` in the simplest case. The `valueRate` parameter converts the result to dollar terms if needed. + +### Structural Insight: Upstream Failures Multiply + +``` +planning failure → wrong decomposition → wasted implementation +decomposition failure → unclear tasks → rework +review failure → bugs shipped → rework +``` + +This means `risk: critical` at planning level > `risk: critical` at implementation level. The cost-benefit framework demonstrates this: poor planning (p=0.65) increases total cost by 150% compared to good planning (p=0.92), even with identical implementation tasks. + +The failure propagates: poor planning reduces decomposition quality, which reduces implementation effectiveness, which increases integration issues. This structural property is independent of the developer type — human, LLM, or otherwise. + +### Decomposition Threshold + +`shouldDecomposeTask` flags tasks where: +- risk >= high, OR +- scope >= broad + +This is a structural insight: large or risky tasks have higher failure rates and should be broken down. The threshold is consistent with the Rust CLI's `decompose` command. + +## DAG-Propagation Cost Model + +### Why + +The Rust CLI computes EV per-task independently — no upstream quality degradation. As the Python research model demonstrates, this is dangerously optimistic for non-trivial workflows. In a dependency chain where planning has p=0.65 (poor), the Python model shows a **213% cost increase** vs good planning (p=0.92). The independent model barely shows a difference because it ignores cascading failure. + +### Implementation Approach + +DAG propagation is the **default mode**. The independent model is a degenerate case (set `defaultQualityDegradation: 0` or `propagationMode: 'independent'`). + +The algorithm processes tasks in topological order, maintaining an `upstreamSuccessProbs` map: + +1. 
For each task in topological order: + - If propagation mode is `dag-propagate`: compute `pEffective` from intrinsic probability + upstream propagation + - If propagation mode is `independent`: use intrinsic probability directly + - Calculate EV using `calculateTaskEv` + - Store the task's actual success probability for downstream propagation + +2. When computing effective probability for a task with prerequisites: + - Start with intrinsic probability + - For each prerequisite, compute inherited quality: `parentP + (1 - parentP) × (1 - qualityDegradation)` + - Multiply all inherited quality factors together with intrinsic probability + +3. The `qualityDegradation` per edge determines how much a parent's failure bleeds through: + - 0.0 = no propagation (independent model) + - 1.0 = full propagation (parent failure guarantees child failure) + - default 0.9 = high but not total propagation + +### Per-task output + +Each task in the `WorkflowCostResult.tasks` array includes both `pIntrinsic` and `pEffective` so consumers can see the degradation effect. The per-task entries also include `taskId` and `name` (enriched from the graph's node attributes) — `calculateTaskEv` is the pure math function (takes only numeric inputs), while `workflowCost` is the aggregate that orchestrates the per-task calls and enriches results with identity metadata from the graph. + +### Skip-completed semantics + +When `includeCompleted: false`, completed tasks are excluded from the result's task list, but they **remain in the propagation chain** with p=1.0. Removing completed tasks from propagation would *worsen* downstream probability estimates — exactly the opposite of what "what's left" queries need. + +> See [ADR-004](decisions/004-workflow-cost-dag-propagation.md) and [ADR-005](decisions/005-no-depth-escalation-v1.md). 
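The per-task EV formula and the effective-probability steps above can be condensed into a short sketch. This is illustrative only: the names `calculateTaskEv` and `computeEffectiveP` come from this document's module layout, but the signatures shown here are assumptions, not the library's final API.

```typescript
// Hedged sketch of the cost model; only the math is taken from this document.
interface ParentInfo {
  pEffective: number;         // parent's effective success probability
  qualityDegradation: number; // per-edge bleed-through factor (default 0.9)
}

// EV_task = P × C_success + (1 − P) × C_fail, where
// C_fail = scopeCost + fallbackCost + timeLost × expectedRetries.
function calculateTaskEv(
  pSuccess: number,
  scopeCost: number,
  fallbackCost = 0,
  timeLost = 0,
  expectedRetries = 0,
): number {
  const cFail = scopeCost + fallbackCost + timeLost * expectedRetries;
  return pSuccess * scopeCost + (1 - pSuccess) * cFail;
}

// Effective probability under DAG-propagation: multiply the intrinsic
// probability by each parent's inherited-quality factor.
function computeEffectiveP(pIntrinsic: number, parents: ParentInfo[]): number {
  return parents.reduce((p, parent) => {
    const inherited =
      parent.pEffective + (1 - parent.pEffective) * (1 - parent.qualityDegradation);
    return p * inherited;
  }, pIntrinsic);
}
```

With `qualityDegradation: 0` every inherited factor is 1.0 and the result collapses to the independent model; with `1.0`, a parent's failure passes through in full.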
+ +### Comparison with Rust CLI + +| Dimension | Rust CLI (Simple Sum) | This Library (DAG Propagation) | +|-----------|----------------------|-------------------------------| +| Topology awareness | None | Full — topological order + upstream propagation | +| Upstream failure modeling | Ignored | Each parent's failure degrades child's effective p | +| Edge semantics | Not used | `qualityDegradation` per edge, default 0.9 | +| Result interpretation | Sum of independent per-task costs | Total workflow cost accounting for cascading failure | +| Degenerate case | — | Set `propagationMode: 'independent'` or `defaultQualityDegradation: 0` | + +## Risk Analysis Functions + +### riskPath + +`riskPath(graph)` → `RiskPathResult` + +Calls `weightedCriticalPath` with weight function `riskWeight * impactWeight`. Returns the path with highest cumulative risk and its total risk score. + +### riskDistribution + +`riskDistribution(graph)` → `RiskDistributionResult` + +Groups tasks by risk category. Returns the task IDs in each bucket (string arrays, matching the `RiskDistributionResult` schema): trivial, low, medium, high, critical, unspecified. + +### shouldDecomposeTask + +`shouldDecomposeTask(attrs: TaskGraphNodeAttributes)` → `DecomposeResult` + +Pure function — takes node attributes (not a graph). Internally calls `resolveDefaults` to handle nullable `risk`/`scope` fields. A task with `risk: null` uses the default (medium, which is below the threshold); a task with `scope: null` uses the default (narrow, which is below the threshold). This means unassessed tasks are never flagged for decomposition — an explicit `risk: "high"` or `scope: "broad"` is required. + +## findCycles + +graphology provides `hasCycle` (boolean) and `stronglyConnectedComponents` (node groups, not paths). The library implements a custom cycle path extractor for error reporting: + +- **Algorithm**: Extended 3-color DFS (WHITE/GREY/BLACK). When a back edge is found (GREY → GREY), trace back through the recursion stack to extract the cycle path as an ordered node sequence.
Each inner array in the returned `string[][]` is a single cycle — an ordered sequence of node IDs where the last node has an edge back to the first. The algorithm returns **one representative cycle per back edge**, not an exhaustive enumeration of all simple cycles (which could be exponential). For error reporting, one cycle per problematic region is sufficient. +- **Optimization**: Use `stronglyConnectedComponents()` as a fast pre-check. If there are zero multi-node SCCs (and no self-loops), skip the DFS entirely. +- **Relationship to topologicalOrder**: `topologicalOrder()` throws `CircularDependencyError` (with `cycles` populated from `findCycles`) when the graph is cyclic. This gives consumers the cycle information needed for error reporting. + +> See [errors-validation.md](errors-validation.md) for error handling. + +## Constraints + +- **DAG-propagation is default** — the independent model is opt-in, not the other way around. The independent model is the degenerate case, not the norm. +- **No depth-escalation in v1** — the multiplicative propagation model already captures depth effects implicitly (each hop compounds another `<1.0` factor). Adding an explicit depth penalty would double-count until we have empirical calibration data. See [ADR-005](decisions/005-no-depth-escalation-v1.md). +- **Categorical estimates, not numeric** — The framework uses categorical fields because LLMs reliably distinguish "high vs medium risk" but struggle with "$3.42 vs $3.50". Categoricals remain valid across environments (different models, providers, token costs). 
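The 3-color DFS described in the findCycles section can be sketched over a plain adjacency map. The real implementation walks a graphology instance and adds the SCC pre-check; this standalone version is illustrative only.

```typescript
// Hedged sketch of the cycle extractor: one representative cycle per back
// edge, not an exhaustive enumeration. SCC pre-check omitted for brevity.
type Adjacency = Record<string, string[]>;

const WHITE = 0, GREY = 1, BLACK = 2;

function findCycles(adj: Adjacency): string[][] {
  const color = new Map<string, number>();
  const stack: string[] = []; // current DFS path
  const cycles: string[][] = [];

  const visit = (node: string): void => {
    color.set(node, GREY);
    stack.push(node);
    for (const next of adj[node] ?? []) {
      const c = color.get(next) ?? WHITE;
      if (c === GREY) {
        // Back edge: the cycle is the stack segment from `next` to `node`.
        cycles.push(stack.slice(stack.indexOf(next)));
      } else if (c === WHITE) {
        visit(next);
      }
    }
    stack.pop();
    color.set(node, BLACK);
  };

  for (const node of Object.keys(adj)) {
    if ((color.get(node) ?? WHITE) === WHITE) visit(node);
  }
  return cycles;
}
```

Each returned inner array is an ordered node sequence whose last element has an edge back to the first, matching the `CircularDependencyError.cycles` shape.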
\ No newline at end of file diff --git a/docs/architecture/decisions/001-pivot-to-typescript-graphology.md b/docs/architecture/decisions/001-pivot-to-typescript-graphology.md new file mode 100644 index 0000000..906f1fa --- /dev/null +++ b/docs/architecture/decisions/001-pivot-to-typescript-graphology.md @@ -0,0 +1,31 @@ +# ADR-001: Pivot from NAPI/Rust to TypeScript + graphology + +**Status**: Accepted + +## Context + +The original design specified a Rust core with napi-rs bindings, extracting the graph logic from the existing taskgraph CLI. This would provide high performance but introduced significant complexity. + +## Decision + +Pivot to pure TypeScript with graphology as the graph engine. No Rust compilation, no native addons, no platform-specific binaries. + +## Consequences + +### Positive +- Cross-platform builds eliminated — pure JS works in Node, Deno, and Bun +- graphology already provides all needed DAG algorithms, and is already in our dependency tree +- Publishing is simple (`npm publish` with no CI matrix for platform binaries) +- Future UI path is straightforward — graphology powers sigma.js/react-sigma +- Near 1:1 petgraph ↔ graphology mapping means porting back to Rust is tractable + +### Negative +- Raw algorithm performance is slower than Rust for very large graphs +- graphology's API differences require adaptation (not a drop-in petgraph replacement) + +### Neutral +- The Rust CLI continues to exist for human/offline use — this is not a replacement, it's a parallel implementation for different consumers + +## Trade-off + +Performance at realistic graph sizes (10–200 nodes) is negligible between Rust and JS. The build/publish complexity savings of pure JS massively outweigh the theoretical performance gain. 
\ No newline at end of file diff --git a/docs/architecture/decisions/002-rebuild-vs-incremental.md b/docs/architecture/decisions/002-rebuild-vs-incremental.md new file mode 100644 index 0000000..6c4086a --- /dev/null +++ b/docs/architecture/decisions/002-rebuild-vs-incremental.md @@ -0,0 +1,26 @@ +# ADR-002: Rebuild graph on change, not incremental updates + +**Status**: Accepted + +## Context + +When task data changes (file edits, DB updates), the in-memory graph needs to reflect the new state. Two approaches: incremental updates (add/remove individual nodes/edges) or full rebuild from source data. + +## Decision + +**Rebuild.** For our graph sizes (10–200 nodes), `graph.import()` from a serialized blob is sub-millisecond. Both consumers (alkhub builds from DB query results; OpenCode plugin rebuilds from directory on file change) are well-served by rebuild. + +## Consequences + +### Positive +- No change-detection layer needed — no tracking ID renames, dependency removals, edge reconciliation +- Simpler codebase — no diff algorithm, no incremental update logic +- Always consistent — rebuild guarantees the graph matches the source data exactly + +### Negative +- Technically wasteful for small changes (rebuilding entire graph when one task changed) +- Not suitable for very large graphs or extremely frequent updates + +### Mitigation + +If a future use case requires incremental updates, add it as an optimization then. The API surface (construction methods) supports both patterns — incremental construction exists via `addTask`/`addDependency`. 
\ No newline at end of file diff --git a/docs/architecture/decisions/003-topo-order-throws-on-cycle.md b/docs/architecture/decisions/003-topo-order-throws-on-cycle.md new file mode 100644 index 0000000..689d13d --- /dev/null +++ b/docs/architecture/decisions/003-topo-order-throws-on-cycle.md @@ -0,0 +1,27 @@ +# ADR-003: topologicalOrder throws CircularDependencyError on cyclic graphs + +**Status**: Accepted + +## Context + +When a graph has cycles, topological sort cannot produce a complete ordering. Options: return `null`, return a partial ordering, or throw an error with cycle information. + +## Decision + +**Throw `CircularDependencyError`** with `cycles` populated from `findCycles()`. Do not return a partial ordering or `null`. + +## Consequences + +### Positive +- Prevents silent ignoring of cycles — consumers get explicit error information +- `CircularDependencyError.cycles` provides the actual cycle paths for error reporting +- Simpler return type — `string[]` instead of `string[] | null` or `string[][]` +- Both consumers treat cycles as bugs: alkhub data comes from validated DB schema; OpenCode plugin data comes from frontmatter that should be validated before graph construction + +### Negative +- Callers who want "best effort" ordering on cyclic graphs must catch the error and call `findCycles()` separately +- Cannot get partial results — if you want "topo sort of the acyclic portions," that requires filtering first + +### Mitigation + +`findCycles()` and `hasCycles()` are available for consumers that want to handle cycles gracefully before calling `topologicalOrder()`. 
\ No newline at end of file diff --git a/docs/architecture/decisions/004-workflow-cost-dag-propagation.md b/docs/architecture/decisions/004-workflow-cost-dag-propagation.md new file mode 100644 index 0000000..3fc512c --- /dev/null +++ b/docs/architecture/decisions/004-workflow-cost-dag-propagation.md @@ -0,0 +1,28 @@ +# ADR-004: DAG-propagation as default workflow cost model + +**Status**: Accepted + +## Context + +The Rust CLI computes expected value per-task independently — no upstream quality degradation. The Python research model implements DAG-propagation where each parent's failure degrades the child's effective probability. The independent model is dangerously optimistic for non-trivial workflows: poor planning (p=0.65) shows a 213% cost increase vs good planning (p=0.92) with the propagation model, but barely any difference with the independent model. + +## Decision + +**DAG-propagation is the default mode.** The independent model is a degenerate case accessible via `propagationMode: 'independent'` or `defaultQualityDegradation: 0`. + +## Consequences + +### Positive +- More accurate cost estimates — captures the structural reality that upstream failures multiply downstream damage +- Per-task output includes both `pIntrinsic` and `pEffective` so consumers can see the degradation effect +- The independent model is still available as an opt-in degenerate case +- Per-edge `qualityDegradation` allows fine-grained modeling of how much each dependency bleeds failure + +### Negative +- More complex implementation than simple sum +- Results differ from the Rust CLI — consumers migrating from CLI to library will see different numbers +- Requires `qualityDegradation` per edge (default 0.9) which adds a concept the Rust CLI didn't have + +### Mitigation + +The `propagationMode` option allows consumers to start with the independent model and migrate to DAG-propagation when ready. The per-task `pIntrinsic`/`pEffective` split makes the propagation effect transparent. 
\ No newline at end of file diff --git a/docs/architecture/decisions/005-no-depth-escalation-v1.md b/docs/architecture/decisions/005-no-depth-escalation-v1.md new file mode 100644 index 0000000..783389e --- /dev/null +++ b/docs/architecture/decisions/005-no-depth-escalation-v1.md @@ -0,0 +1,26 @@ +# ADR-005: No depth-escalation heuristic in v1 + +**Status**: Accepted + +## Context + +In the DAG-propagation model, each hop compounds another `<1.0` factor. This implicitly captures depth effects — deeper chains have more compounding. An explicit depth-escalation heuristic (increasing risk at deeper chain levels) would add another multiplicative penalty on top. + +## Decision + +**Defer depth-escalation to v2.** The multiplicative propagation model already captures depth effects implicitly. Adding an explicit depth heuristic would double-count the depth effect until we have empirical calibration data from actual task outcomes. + +## Consequences + +### Positive +- No double-counting of depth effects +- Simpler model to explain, implement, and debug +- Architecture supports future depth-escalation via per-edge `qualityDegradation` adjustments or `risk` categorical escalation without API changes + +### Negative +- May underestimate cost for very deep dependency chains where risk genuinely escalates with depth +- The model treats all "hops" as equivalent — a 5-hop chain where each step is moderate risk may actually be worse than the model predicts + +### Future + +If empirical data from actual task outcomes shows that depth-escalation is needed, it can be added without API changes — either by adjusting `qualityDegradation` per depth, or by escalating the `risk` categorical. This is a calibration question, not an architecture question. 
\ No newline at end of file diff --git a/docs/architecture/decisions/006-deterministic-edge-keys.md b/docs/architecture/decisions/006-deterministic-edge-keys.md new file mode 100644 index 0000000..9c4dc01 --- /dev/null +++ b/docs/architecture/decisions/006-deterministic-edge-keys.md @@ -0,0 +1,26 @@ +# ADR-006: Deterministic edge keys via addEdgeWithKey + +**Status**: Accepted + +## Context + +graphology's default `addEdge(source, target)` generates random edge keys (e.g., `ge8kq2`). This makes debugging harder and adds overhead from key generation. For our use case, each source→target pair has at most one edge (no parallel edges in a DAG dependency graph). + +## Decision + +Use `addEdgeWithKey` with deterministic keys in the format `${source}->${target}` (e.g., `task-a->task-b`). This produces readable, debuggable edge identifiers and skips graphology's key generation overhead. + +## Consequences + +### Positive +- Debuggable edge identifiers — `task-a->task-b` is immediately understandable +- No random key generation overhead +- Deterministic — exporting and re-importing produces the same graph + +### Negative +- Constraint enforced: no parallel edges between the same node pair +- Key format collision if task IDs contain `->` (extremely unlikely with kebab-case slugs) + +### Mitigation + +Duplicate dependency declarations (same source→target pair declared twice) are a validation error, not a valid use case. The constraint is correct for DAG dependency graphs. \ No newline at end of file diff --git a/docs/architecture/decisions/007-subgraph-internal-only.md b/docs/architecture/decisions/007-subgraph-internal-only.md new file mode 100644 index 0000000..0978bf2 --- /dev/null +++ b/docs/architecture/decisions/007-subgraph-internal-only.md @@ -0,0 +1,25 @@ +# ADR-007: Subgraph returns internal-only edges + +**Status**: Accepted + +## Context + +When filtering a graph to a subset of nodes, what happens to edges where only one endpoint is in the filtered set? 
Options: include cross-boundary edges (external dependencies visible), or strict internal-only (only edges where both endpoints are in the filtered set). + +## Decision + +**Strict internal-only.** `subgraph(filter)` returns a new `TaskGraph` with matching nodes and only edges where both endpoints are in the filtered set. This matches `graphology-operators` `subgraph` behavior and produces valid subgraphs for all algorithms (topo sort, betweenness, etc.). + +## Consequences + +### Positive +- Result is always a valid (potentially disconnected) subgraph — all algorithms work correctly +- Matches graphology's built-in subgraph behavior +- No surprise external references in analysis results + +### Negative +- External dependency information is lost — you can't see "what does this subgraph depend on outside itself" from the subgraph alone + +### Mitigation + +External dependency information is available on the original graph via `dependencies()`/`dependents()`. A separate `externalDependencies(filter)` utility can be added later if consumers need "show me what this subgraph depends on outside itself." \ No newline at end of file diff --git a/docs/architecture/errors-validation.md b/docs/architecture/errors-validation.md new file mode 100644 index 0000000..d7f4d9b --- /dev/null +++ b/docs/architecture/errors-validation.md @@ -0,0 +1,129 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# Errors & Validation + +Error types and validation levels for the library. + +## Error Types + +Typed error classes for programmatic recovery. All library errors extend `TaskgraphError`. 
+ +```typescript +class TaskgraphError extends Error {} + +class TaskNotFoundError extends TaskgraphError { + taskId: string +} + +class CircularDependencyError extends TaskgraphError { + cycles: string[][] // each inner array is an ordered cycle path (last node → first node) +} + +class InvalidInputError extends TaskgraphError { + field: string + message: string +} + +class DuplicateNodeError extends TaskgraphError { + taskId: string +} + +class DuplicateEdgeError extends TaskgraphError { + source: string + target: string +} +``` + +### When Each Error Is Thrown + +| Error | Trigger | +|-------|---------| +| `TaskNotFoundError` | `getTask`, `dependencies`, `dependents` called with non-existent task ID | +| `CircularDependencyError` | `topologicalOrder()` called on a cyclic graph | +| `InvalidInputError` | Frontmatter parsing finds invalid field values or missing required fields | +| `DuplicateNodeError` | `addTask` called with an ID that already exists in the graph | +| `DuplicateEdgeError` | `addDependency` called for a source→target pair that already exists | + +### Mutation Operations on Non-Existent Targets + +| Operation | Behavior When Target Doesn't Exist | +|-----------|-----------------------------------| +| `removeTask(id)` | No-op — if the node doesn't exist, nothing to remove | +| `removeDependency(src, tgt)` | No-op — if the edge doesn't exist, nothing to remove | +| `updateTask(id, attrs)` | Throws `TaskNotFoundError` — cannot update attributes of a non-existent node | +| `updateEdgeAttributes(src, tgt, attrs)` | Throws `TaskNotFoundError` — cannot update attributes of an edge that is not in the graph | +| `addDependency(prereq, dep)` | Throws `TaskNotFoundError` — both endpoints must exist first (use `addTask` before `addDependency`) | + +This policy avoids silent failures on writes that should succeed (update, add) while allowing idempotent removals (remove is a no-op, not an error).
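A brief sketch of what the typed hierarchy buys consumers: `instanceof`-based recovery instead of message matching. The `topologicalOrderOrThrow` stand-in and `describeCycles` helper are hypothetical; only the class shapes follow this document.

```typescript
// Class shapes follow the document; everything else here is a hypothetical
// consumer-side sketch, not the library's API.
class TaskgraphError extends Error {}

class CircularDependencyError extends TaskgraphError {
  constructor(public cycles: string[][]) {
    super(`graph contains ${cycles.length} cycle(s)`);
    this.name = "CircularDependencyError";
  }
}

// Stand-in for topologicalOrder(): throws with cycle info when cyclic.
function topologicalOrderOrThrow(order: string[], cycles: string[][]): string[] {
  if (cycles.length > 0) throw new CircularDependencyError(cycles);
  return order;
}

// Typed catch — consumers branch on the class, never on message text.
function describeCycles(err: unknown): string[] {
  if (err instanceof CircularDependencyError) {
    return err.cycles.map((cycle) => [...cycle, cycle[0]].join(" -> "));
  }
  throw err;
}
```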
+ +## Validation Levels + +Two validation levels, consistent with the Rust CLI's `validate` command: + +### 1. Schema validation (`validateSchema()`) + +TypeBox `Value.Check` on input data — frontmatter fields, enum values, required fields. Returns `ValidationError[]`. Catches: +- Missing required fields (`id`, `name`) +- Invalid enum values (e.g., `risk: "extreme"`) +- Type mismatches (e.g., `dependsOn: "not-an-array"`) + +### 2. Graph validation (`validateGraph()`) + +Graph-level invariants — catches problems that exist between tasks, not within a single task. Returns `GraphValidationError[]`: +- Cycle detection (via `findCycles()`) +- Dangling dependency references (task depends on an ID not in the graph) + +### 3. Combined validation (`validate()`) + +Runs both schema and graph validation. Returns the concatenation of both result arrays — `(ValidationError | GraphValidationError)[]`, discriminated by the `type` field. + +### Validation Return Types + +```typescript +interface ValidationError { + type: "schema" + taskId?: string // which task has the issue (if applicable) + field: string // which field is invalid + message: string // human-readable description + value?: unknown // the invalid value (if safe to include) +} + +interface GraphValidationError { + type: "graph" + category: "cycle" | "dangling-reference" + taskId?: string + message: string + details?: unknown // e.g., cycles: string[][] for cycle errors +} +``` + +Both types are returned as arrays. Validation never throws — it collects all issues and returns them. This allows consumers to implement "collect all errors" strategies.
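The dangling-reference half of graph validation can be sketched as a pure collect-all pass over `TaskInput`-shaped records. This is a simplified stand-in for part of `validateGraph()`, not the actual implementation (which walks the constructed graph).

```typescript
// Hedged sketch: collect-all semantics, never throws. The interface shape
// follows the document; checkDanglingRefs itself is hypothetical.
interface GraphValidationError {
  type: "graph";
  category: "cycle" | "dangling-reference";
  taskId?: string;
  message: string;
  details?: unknown;
}

function checkDanglingRefs(
  tasks: { id: string; dependsOn?: string[] }[],
): GraphValidationError[] {
  const ids = new Set(tasks.map((t) => t.id));
  const errors: GraphValidationError[] = [];
  for (const task of tasks) {
    for (const dep of task.dependsOn ?? []) {
      if (!ids.has(dep)) {
        errors.push({
          type: "graph",
          category: "dangling-reference",
          taskId: task.id,
          message: `task "${task.id}" depends on unknown task "${dep}"`,
        });
      }
    }
  }
  return errors; // all issues collected — the caller decides how to react
}
```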
+ +## Cycle Handling + +The library takes a strict approach to cycles: + +- `hasCycles()` returns a boolean — no side effects +- `findCycles()` returns the actual cycle paths — for debugging and error reporting +- `topologicalOrder()` **throws** `CircularDependencyError` when the graph is cyclic, rather than returning a partial ordering — see [ADR-003](decisions/003-topo-order-throws-on-cycle.md) + +**Cyclic graphs are a valid graph state** — they can be constructed, queried, and validated. Only operations that require a DAG (topo sort, critical path, parallel groups, workflow cost) throw on cycles. Construction does not reject cycles. + +## Construction vs. Validation Error Handling + +The fundamental contract: + +1. **Construction never throws on invalid data** — `fromTasks`, `fromRecords`, `fromJSON`, `addTask`, `addDependency` do not validate field values; schema problems surface through validation, not exceptions. The only construction-time throws are `DuplicateNodeError`, `DuplicateEdgeError`, and `TaskNotFoundError` (an `addDependency` endpoint that was never added) — programming errors, not data validation issues. +2. **Validation returns error arrays** — `validateSchema()`, `validateGraph()`, and `validate()` collect issues without throwing. +3. **`topologicalOrder()` is the operation-level exception** — it throws because returning a partial result would be silently incorrect. + +This distinction exists because validation is a "check before you proceed" operation (collect all issues, show the user), while topo sort is an operation that cannot produce a meaningful result on a cyclic graph. + +## Constraints + +- **All errors are typed** — no string-based error matching. Consumers can catch specific error classes. +- **Validation returns arrays, not throws** — consumers choose their own error handling strategy (fail-fast vs. collect-all-errors). +- **`topologicalOrder` is the sole operation-level exception** — it throws on cyclic graphs because returning a partial result would be silently incorrect.
\ No newline at end of file diff --git a/docs/architecture/frontmatter.md new file mode 100644 index 0000000..0bbe09b --- /dev/null +++ b/docs/architecture/frontmatter.md @@ -0,0 +1,78 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# Frontmatter Parsing + +Parsing and serialization of task markdown files with YAML frontmatter. Included in this package, not a separate module. + +## Overview + +The library provides frontmatter parsing so that file-based consumers (e.g., the future OpenCode plugin) can read task markdown files directly without depending on an external parser. This supports the same YAML frontmatter format as the Rust CLI. + +## Public Functions + +```typescript +function parseFrontmatter(markdown: string): TaskInput +function parseTaskFile(filePath: string): Promise<TaskInput> +function parseTaskDirectory(dirPath: string): Promise<TaskInput[]> +function serializeFrontmatter(task: TaskInput, body?: string): string +``` + +`parseFrontmatter` and `parseTaskFile` also run TypeBox validation on the parsed data before returning — invalid frontmatter throws `InvalidInputError` with field-level details. + +### parseTaskDirectory Semantics + +- **Recursive** — scans subdirectories recursively +- **File extension** — `.md` only +- **No frontmatter** — files without valid `---`-delimited frontmatter are silently skipped +- **I/O errors** — throws the underlying Node.js error (ENOENT, EACCES, etc.) + +This is a convenience wrapper for the common case. Consumers that need different discovery semantics (non-recursive, different extensions, custom filtering) should implement their own file discovery and call `parseTaskFile` per file. + +## No gray-matter — Self-contained Splitter + yaml + +The library writes its own `---` delimited frontmatter splitter and uses `yaml` (by eemeli) as the sole YAML parser.
**`gray-matter` is not a dependency.** + +This is a deliberate supply-chain security decision: + +- **`gray-matter` depends on `js-yaml@3.x`** — an old version with known vulnerabilities (CVE-2025-64718 — prototype pollution via the YAML merge key `<<`). Even with gray-matter's custom engine API, `js-yaml` is still *installed* in `node_modules` as a transitive dependency. The attack surface is the install, not the import. +- **gray-matter's full tree is 11 packages** (js-yaml, argparse, kind-of, section-matter, extend-shallow, is-extendable, strip-bom-string, etc.) — none of which we need. +- **Recent npm supply chain attacks** (April 2026: 18-package phishing compromise targeting chalk/debug/etc., the Shai-Hulud self-replicating worm hitting 500+ packages, the axios RAT incident) demonstrate that every dependency in the tree is potential attack surface. + +### What we don't replicate from gray-matter + +TOML/Coffee engines, JavaScript eval engine, `section-matter` (nested sections), in-memory cache. We don't use any of these. + +### `yaml` package profile + +- Zero dependencies, full YAML 1.2 spec compliance, no known CVEs +- Actively maintained, excellent TypeScript types +- Single-package blast radius — if compromised, tractable to fork (pure JS) + +### WASM YAML parser — considered and rejected + +A Rust YAML crate compiled to WASM was considered as an alternative, but it reintroduces complexity the napi→graphology pivot was designed to remove (Rust toolchain in CI, WASM compile target, cold-start latency, FFI boundary). The marginal security gain over `yaml` (already zero-dep) doesn't justify the added build complexity. + +## Splitter Design + +The frontmatter splitter is a simple `---` delimiter parser (~40 lines). It: + +1. Checks for opening `---` delimiter (not `----`) +2. Finds closing `\n---` delimiter +3. Extracts the YAML data string and the markdown content body +4.
Returns `{ data: string, content: string }` or `null` if no valid frontmatter + +The actual YAML parsing is delegated to `yaml.parse()`. The serializer uses `yaml.stringify()` for the data portion. + +## Constraints + +- **No gray-matter, no js-yaml** — these are hard exclusions for supply chain security. +- **YAML 1.2 only** — the `yaml` package implements YAML 1.2, which is a superset of JSON and avoids the ambiguous type coercion issues of YAML 1.1. +- **Frontmatter is a parsing concern, not a graph concern** — parsed `TaskInput` objects are fed to `TaskGraph.fromTasks()`. The parser doesn't know about graphs; the graph doesn't know about files. + +## References + +- `yaml` package: https://github.com/eemeli/yaml +- CVE-2025-64718 (js-yaml prototype pollution): tracked in npm audit database \ No newline at end of file diff --git a/docs/architecture/graph-model.md b/docs/architecture/graph-model.md new file mode 100644 index 0000000..bd0a049 --- /dev/null +++ b/docs/architecture/graph-model.md @@ -0,0 +1,89 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# Graph Model + +How task graphs are represented, constructed, and queried within the library. + +## Overview + +The library uses graphology's `DirectedGraph` as the underlying data structure. Tasks are nodes (keyed by task `id`), dependencies are directed edges, and categorical metadata (scope, risk, impact, etc.) lives on node attributes. The `TaskGraph` class wraps graphology and provides construction, mutation, and basic query operations — see [api-surface.md](api-surface.md) for the full API. + +## Edge Direction + +**prerequisite → dependent** (matches Rust CLI convention). + +If task B has `dependsOn: ["A"]`, the edge is **A → B** (A must complete before B). 
+ +In graphology terms: +- `graph.inNeighbors(B)` → prerequisites (what B depends on) +- `graph.outNeighbors(A)` → dependents (what depends on A) +- `graph.addEdge(A, B)` — prerequisite is source, dependent is target + +This convention is critical: it determines the semantics of `topologicalOrder` (prerequisites before dependents), `criticalPath` (longest path from source to sink), and `parallelGroups` (generational grouping by depth from sources). + +## Construction Paths + +The graph must be constructable from multiple sources to serve both consumers: + +| Path | Source | Consumer | Edge Attributes | |------|--------|----------|----------------| | `fromTasks` | `TaskInput[]` (frontmatter/JSON) | OpenCode plugin, tests | Default `qualityDegradation` (0.9) | | `fromRecords` | `TaskInput[]` + `DependencyEdge[]` | alkhub (DB query results) | Per-edge `qualityDegradation` | | `fromJSON` | `TaskGraphSerialized` (graphology export) | Persistence/round-trip | Preserved from source | | Incremental | `addTask` / `addDependency` calls | Programmatic/testing | Default or explicit | + +**Preferred internal approach**: For the `fromTasks` and `fromRecords` paths, build a serialized graph JSON blob (nodes array + edges array) and call `graph.import()`. This is faster than N individual `addNode`/`addEdge` calls and avoids the verbose builder API. + +### qualityDegradation on Construction + +`fromTasks` constructs edges from `dependsOn` arrays in frontmatter, which cannot express per-edge `qualityDegradation`. Those edges get the default (0.9). `fromRecords` and `fromJSON` support per-edge values. Edges can be augmented after construction via `updateEdgeAttributes`. + +This distinction exists because the file-based frontmatter model has no syntax for per-edge attributes, while the DB-backed model (alkhub) stores per-edge `qualityDegradation` in the `task_dependencies` table. The library serves both without forcing either into the other's shape.
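The import-blob approach can be sketched as follows — a minimal sketch, assuming a trimmed-down `TaskInput` shape and graphology's serialized `{ nodes, edges }` layout; the real `fromTasks` additionally validates input against the TypeBox schemas and rejects duplicate ids:

```typescript
// Sketch: turn TaskInput[] into a single graphology-style import payload.
// One graph.import(payload) call replaces N addNode/addEdge calls.
interface TaskInputLite {
  id: string
  name: string
  dependsOn: string[]
}

const DEFAULT_QUALITY_DEGRADATION = 0.9

function buildImportPayload(tasks: TaskInputLite[]) {
  return {
    nodes: tasks.map((t) => ({ key: t.id, attributes: { name: t.name } })),
    // Edge direction: prerequisite → dependent, so the prerequisite is the source.
    edges: tasks.flatMap((t) =>
      t.dependsOn.map((prereq) => ({
        source: prereq,
        target: t.id,
        attributes: { qualityDegradation: DEFAULT_QUALITY_DEGRADATION },
      })),
    ),
  }
}
```

A `DirectedGraph` would then consume this in one step via `graph.import(buildImportPayload(tasks))`.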
+ +## Categorical Field Defaults + +Categorical fields (`scope`, `risk`, `impact`, `level`) are optional (nullable) — NULL means "not yet assessed." This matches the Rust CLI's `Option<TaskScope>`, `Option<TaskRisk>`, etc. and the alkhub DB schema's nullable columns. + +The analysis functions need numeric values, so a `resolveDefaults` helper provides fallbacks: + +| Field | When NULL | Fallback | |-------|-----------|----------| | risk | not assessed | successProbability 0.80 (medium), riskWeight 0.20 | | scope | not assessed | costEstimate 2.0 (narrow) | | impact | not assessed | weight 1.0 (isolated) | + +The raw nullable data is preserved on the graph. `resolveDefaults` is called internally by analysis functions but is also available to consumers that need the same default logic. This ensures the library never silently reinterprets "not assessed" as a specific value — the distinction is explicit. + +> See [schemas.md](schemas.md) for the full enum definitions and numeric method tables. + +## Node Metadata + +Unlike the original napi design where `DependencyGraph` only stored IDs, node attributes carry the categorical metadata directly. This eliminates the need to pass `TaskInput[]` alongside the graph — `weightedCriticalPath` and `riskPath` read attributes from the graph nodes. + +The graph acts as an in-memory index/metadata store for categorical fields. Task body content, file path, and other non-graph data stay external to the library. + +## Edge Attributes + +Edges carry `qualityDegradation` for the DAG-propagation cost model. If absent, the default (0.9) is used by `workflowCost`. Other algorithms ignore edge attributes. + +> See [cost-benefit.md](cost-benefit.md) for how qualityDegradation is used in propagation. + +## Graph Reactivity + +graphology's `Graph` class extends Node.js `EventEmitter` and emits fine-grained mutation events: `nodeAdded`, `edgeAdded`, `nodeDropped`, `edgeDropped`, `nodeAttributesUpdated`, `edgeAttributesUpdated`, `cleared`, `edgesCleared`.
+ +`TaskGraph` does **not** wrap or re-emit these events. Consumers that need reactivity (e.g., file-watch → coordinator notification) access the underlying graphology instance via `graph.raw` and attach listeners directly. This keeps `TaskGraph` as a pure computation library with no opinion about reactivity. + +## Constraints + +- **DAG structure** — The library models task dependencies as a directed acyclic graph. Cycles are detected and reported as errors, not silently tolerated. See [errors-validation.md](errors-validation.md). +- **No parallel edges** — Between any node pair (A, B), at most one edge A→B exists. Duplicate dependency declarations are a validation error, not a valid use case. See [ADR-006](decisions/006-deterministic-edge-keys.md). +- **Unique node keys** — Task IDs (slugs) are unique within a graph. Adding a node with a duplicate key is an error. +- **Small graph sizes** — Realistic task graphs are 10–200 nodes. This means rebuild-on-change is always sub-millisecond. See [ADR-002](decisions/002-rebuild-vs-incremental.md). + +## Open Questions + +- Should we support multi-graphs (same node pair, multiple edges with different attributes)? Not needed for current use cases but could arise if conditional dependencies are introduced. See [ADR-006](decisions/006-deterministic-edge-keys.md) for the no-parallel-edges constraint. \ No newline at end of file diff --git a/docs/architecture/schemas.md b/docs/architecture/schemas.md new file mode 100644 index 0000000..7306008 --- /dev/null +++ b/docs/architecture/schemas.md @@ -0,0 +1,194 @@ +--- +status: draft +last_updated: 2026-04-26 +--- + +# Schemas + +TypeBox schema definitions, categorical enums, and their numeric methods. + +## Design Decision: TypeBox as Single Source of Truth + +All data shapes are defined as TypeBox schemas. This gives us: + +1. **Static TypeScript types** via `Static` — compile-time safety +2. 
**Runtime validation** via `Value.Check()` / `Value.Assert()` — reject bad input before it hits the graph +3. **JSON Schema** for free — can be used by consumers for their own validation, API contracts, etc. + +The TypeBox schemas serve as the single source of truth for both types and validation. No separate type definitions, no Zod, no ad-hoc validation logic. Consumers with Zod in their stack can convert at their boundary. + +## Input Schemas + +### TaskInput + +The universal input shape for a task, matching the Rust `TaskFrontmatter` field set: + +```typescript +const TaskInput = Type.Object({ + id: Type.String(), + name: Type.String(), + dependsOn: Type.Array(Type.String()), + status: Type.Optional(TaskStatusEnum), + scope: Type.Optional(TaskScopeEnum), + risk: Type.Optional(TaskRiskEnum), + impact: Type.Optional(TaskImpactEnum), + level: Type.Optional(TaskLevelEnum), + priority: Type.Optional(TaskPriorityEnum), + tags: Type.Optional(Type.Array(Type.String())), + assignee: Type.Optional(Type.String()), + due: Type.Optional(Type.String()), + created: Type.Optional(Type.String()), + modified: Type.Optional(Type.String()), +}) +``` + +### DependencyEdge + +```typescript +const DependencyEdge = Type.Object({ + from: Type.String(), // prerequisite task id + to: Type.String(), // dependent task id + qualityDegradation: Type.Optional(Type.Number()), // 0.0–1.0, default 0.9 +}) +``` + +The `qualityDegradation` field models how much upstream failure bleeds through to the dependent task. Value of 0.0 means no propagation (independent model), 1.0 means full propagation. Default is 0.9 following the Python research model. Only used by `workflowCost` in DAG-propagation mode; ignored by all other algorithms. + +## Graph Attribute Schemas + +### TaskGraphNodeAttributes + +Node attributes stored on the graphology graph. The node key is the task `id` (slug). 
Attributes carry only the metadata needed for graph analysis — no body/content: + +```typescript +const TaskGraphNodeAttributes = Type.Object({ + name: Type.String(), + scope: Type.Optional(TaskScopeEnum), + risk: Type.Optional(TaskRiskEnum), + impact: Type.Optional(TaskImpactEnum), + level: Type.Optional(TaskLevelEnum), + priority: Type.Optional(TaskPriorityEnum), + status: Type.Optional(TaskStatusEnum), +}) +``` + +### TaskGraphEdgeAttributes + +```typescript +const TaskGraphEdgeAttributes = Type.Object({ + qualityDegradation: Type.Optional(Type.Number()), +}) +``` + +### SerializedGraph + +Following the graphology native JSON format, parameterized with our attribute types: + +```typescript +const TaskGraphSerialized = SerializedGraph( + TaskGraphNodeAttributes, + TaskGraphEdgeAttributes, + Type.Object({}) +) +``` + +This validates the graphology `export()` output and enables `import()` from validated JSON blobs. + +**No schema version field**: The serialized format follows graphology's native JSON format and does not include a version field. Serialized graphs are not a persistence format with backward-compatibility guarantees. They serve as an intermediate transport format (e.g., for caching, IPC, or test fixtures). Consumers that need persistence should wrap the serialized output in their own versioned envelope. + +## Categorical Enums + +### Enum Definitions + +Categorical enums are defined with `Type.Union(Type.Literal(...))` — string values matching the DB and frontmatter conventions. + +**Naming convention**: The TypeBox schema constants use an `Enum` suffix (e.g., `TaskScopeEnum`, `TaskRiskEnum`). The corresponding TypeScript type aliases drop the suffix (e.g., `type TaskScope = Static<typeof TaskScopeEnum>`). The schema constant is the runtime value; the type alias is the compile-time type. All function signatures use the compile-time type names.
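As a dependency-free illustration of this convention (the real schemas use TypeBox `Type.Union`/`Type.Literal` with `Static`; here a const tuple stands in so the pattern can be shown without the library):

```typescript
// Illustration only — a const tuple standing in for the TypeBox schema constant.
const TaskScopeEnum = ['single', 'narrow', 'moderate', 'broad', 'system'] as const

// The type alias drops the Enum suffix: the compile-time counterpart of the constant.
type TaskScope = (typeof TaskScopeEnum)[number]

// Runtime membership check, analogous to Value.Check(TaskScopeEnum, value) with TypeBox.
function isTaskScope(value: string): value is TaskScope {
  return (TaskScopeEnum as readonly string[]).includes(value)
}
```

Either way, the point is one declaration yielding both the runtime value and the compile-time type.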
+ +| Enum Schema Constant | TypeScript Type | Values | +|----------------------|-----------------|--------| +| `TaskScopeEnum` | `TaskScope` | single, narrow, moderate, broad, system | +| `TaskRiskEnum` | `TaskRisk` | trivial, low, medium, high, critical | +| `TaskImpactEnum` | `TaskImpact` | isolated, component, phase, project | +| `TaskLevelEnum` | `TaskLevel` | planning, decomposition, implementation, review, research | +| `TaskPriorityEnum` | `TaskPriority` | low, medium, high, critical | +| `TaskStatusEnum` | `TaskStatus` | pending, in-progress, completed, failed, blocked | + +### Numeric Methods + +#### TaskScope → cost/token estimates + +| TaskScope | costEstimate | tokenEstimate | +|-----------|-------------|---------------| +| single | 1.0 | 500 | +| narrow | 2.0 | 1500 | +| moderate | 3.0 | 3000 | +| broad | 4.0 | 6000 | +| system | 5.0 | 10000 | + +#### TaskRisk → probability/weight + +| TaskRisk | successProbability | riskWeight (1-p) | +|----------|--------------------|--------------------| +| trivial | 0.98 | 0.02 | +| low | 0.90 | 0.10 | +| medium | 0.80 | 0.20 | +| high | 0.65 | 0.35 | +| critical | 0.50 | 0.50 | + +#### TaskImpact → weight + +| TaskImpact | weight | +|-----------|--------| +| isolated | 1.0 | +| component | 1.5 | +| phase | 2.0 | +| project | 3.0 | + +#### Label-only enums + +`TaskLevel` and `TaskPriority` have no numeric methods — they are for labeling/filtering only. 
+ +### Standalone Numeric Functions + +These are standalone functions (not methods on enum objects) for maximum composability: + +```typescript +function scopeCostEstimate(scope: TaskScope): number // 1.0–5.0 +function scopeTokenEstimate(scope: TaskScope): number // 500–10000 +function riskSuccessProbability(risk: TaskRisk): number // 0.50–0.98 +function riskWeight(risk: TaskRisk): number // 0.02–0.50 +function impactWeight(impact: TaskImpact): number // 1.0–3.0 +function resolveDefaults(attrs: Partial<TaskGraphNodeAttributes>): ResolvedTaskAttributes +``` + +## ResolvedTaskAttributes + +The output of `resolveDefaults` — all categorical fields resolved to their numeric equivalents for use in analysis: + +```typescript +interface ResolvedTaskAttributes { + name: string + scope: TaskScope + risk: TaskRisk + impact: TaskImpact + level: TaskLevel | null + priority: TaskPriority | null + status: TaskStatus | null + // Numeric equivalents (always present after resolution): + costEstimate: number + tokenEstimate: number + successProbability: number + riskWeight: number + impactWeight: number +} +``` + +**Why `level`, `priority`, and `status` remain nullable**: These three fields are label-only enums with no numeric methods (see "Label-only enums" above). They are used for filtering and labeling, not for cost calculations. A task with `level: null` simply hasn't been categorized — the analysis functions don't need a numeric value for it. `risk`, `scope`, and `impact` are the only fields that feed into EV and risk calculations, so they're the only ones that need default resolution. + +> **Note on `level`**: While the cost-benefit framework shows that "risk: critical at planning level > risk: critical at implementation level" (upstream failures multiply), this is captured by the DAG-propagation model's topology-aware cost computation, not by a numeric value on `level` itself. The `level` field serves as metadata for filtering and display, not as a cost input.
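The lookup tables and NULL fallbacks can be sketched together — a simplified sketch covering only the numeric side of `resolveDefaults` (it omits `tokenEstimate`; fallbacks per graph-model.md are scope narrow, risk medium, impact isolated):

```typescript
type TaskScope = 'single' | 'narrow' | 'moderate' | 'broad' | 'system'
type TaskRisk = 'trivial' | 'low' | 'medium' | 'high' | 'critical'
type TaskImpact = 'isolated' | 'component' | 'phase' | 'project'

// Lookup tables mirroring the numeric method tables above.
const SCOPE_COST: Record<TaskScope, number> = { single: 1.0, narrow: 2.0, moderate: 3.0, broad: 4.0, system: 5.0 }
const RISK_SUCCESS: Record<TaskRisk, number> = { trivial: 0.98, low: 0.9, medium: 0.8, high: 0.65, critical: 0.5 }
const IMPACT_WEIGHT: Record<TaskImpact, number> = { isolated: 1.0, component: 1.5, phase: 2.0, project: 3.0 }

// NULL ("not yet assessed") resolves to narrow / medium / isolated.
function resolveNumeric(attrs: { scope?: TaskScope; risk?: TaskRisk; impact?: TaskImpact }) {
  const risk = attrs.risk ?? 'medium'
  return {
    costEstimate: SCOPE_COST[attrs.scope ?? 'narrow'],
    successProbability: RISK_SUCCESS[risk],
    riskWeight: 1 - RISK_SUCCESS[risk], // riskWeight is defined as 1 − p
    impactWeight: IMPACT_WEIGHT[attrs.impact ?? 'isolated'],
  }
}
```

A fully unassessed task thus resolves to costEstimate 2.0, successProbability 0.80, riskWeight 0.20, impactWeight 1.0 — matching the fallback table in graph-model.md.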
+ +## Constraints + +- **Nullable categorical fields are meaningful** — NULL means "not yet assessed," not "use default." The `resolveDefaults` helper makes this explicit. See [graph-model.md](graph-model.md) for the default mappings. +- **No Zod bridge** — Consumers with Zod in their stack can convert at their boundary. The library does not provide a Zod interop layer. +- **Enum values match DB and frontmatter conventions** — The string values are identical to the Rust `TaskFrontmatter` field values and the alkhub `pgEnum` definitions. \ No newline at end of file