Decompose monolithic architecture.md into modular docs/architecture/ documents

The 751-line architecture.md violated the SDD process modular documentation
target (~500 lines). It also had duplicate TaskGraph class definitions (one
monolith, one decomposed) that directly contradicted each other, and embedded
consumer-specific tool dispatch mappings that belong in downstream projects.

Changes:
- Split into 8 focused documents + 7 ADR records + redirect page
- Removed the monolithic TaskGraph class (kept only decomposed version)
- Moved CLI→plugin dispatch mapping out (belongs in plugin architecture)
- Extracted implementation code (frontmatter splitter, findCycles, DAG
  propagation) into WHAT/WHY descriptions per architect role spec
- Added proper ADR format for all resolved design decisions
- Fixed review issues: C_fail mapping, DuplicateNodeError/DuplicateEdgeError
  types, ValidationError/GraphValidationError definitions, mutation error
  handling contract, enum naming convention, validation timing clarification
2026-04-26 06:38:52 +00:00
parent bac335274d
commit bde1cc4e70
16 changed files with 1264 additions and 748 deletions

@@ -0,0 +1,131 @@
---
status: draft
last_updated: 2026-04-26
---
# Cost-Benefit Analysis
Expected value math, risk analysis, DAG-propagation cost model, and cycle detection.
## Overview
The cost-benefit functions are the key analytical value of the library. They go beyond simple graph topology to answer structural questions about task workflows: which path has the highest cumulative risk? What's the expected cost of a workflow? Which tasks should be decomposed?
These functions implement the cost-benefit framework from `/workspace/@alkimiadev/taskgraph/docs/framework.md` and extend it with DAG-propagation (from the Python research model) that the Rust CLI's independent model ignores.
## Core Concepts
### Expected Value of a Task
```
EV_task = P_success × C_success + (1 - P_success) × C_fail
```
Where categorical fields provide the inputs:
- **P_success** = `riskSuccessProbability(risk)` — probability the task completes successfully
- **C_success** = `scopeCostEstimate(scope)` — cost when it works
- **C_fail** = `scopeCost + fallbackCost + timeLost × expectedRetries`, modeled via `EvConfig` parameters. `calculateTaskEv` uses `scopeCost` as `C_success` and derives `C_fail` by adding `fallbackCost` and `timeLost` scaled by the expected retry count. `fallbackCost` and `timeLost` default to 0 when not provided, so `C_fail = C_success` in the simplest case. The `valueRate` parameter converts the result to dollar terms if needed.
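As a minimal sketch of the formula above: the `EvConfig` field names are taken from the prose, but the exact shape, the function signature, and the `expectedRetries` default of 1 are assumptions made for illustration. `valueRate` is omitted.

```typescript
// Hypothetical shape: the real EvConfig may differ.
interface EvConfigSketch {
  scopeCost: number;        // used as C_success
  fallbackCost?: number;    // extra cost on failure, defaults to 0
  timeLost?: number;        // cost of time lost per retry, defaults to 0
  expectedRetries?: number; // ASSUMED default of 1, not stated in the source
}

// EV_task = P_success * C_success + (1 - P_success) * C_fail
function calculateTaskEvSketch(pSuccess: number, config: EvConfigSketch): number {
  const fallbackCost = config.fallbackCost ?? 0;
  const timeLost = config.timeLost ?? 0;
  const expectedRetries = config.expectedRetries ?? 1;

  const cSuccess = config.scopeCost;
  const cFail = config.scopeCost + fallbackCost + timeLost * expectedRetries;
  return pSuccess * cSuccess + (1 - pSuccess) * cFail;
}

// With no failure parameters, C_fail equals C_success, so EV is just scopeCost.
const evSimple = calculateTaskEvSketch(0.5, { scopeCost: 3 });

// C_fail = 2 + 1 + 0.5 * 2 = 4, so EV = 0.8 * 2 + 0.2 * 4 = 2.4
const evWithFailureCosts = calculateTaskEvSketch(0.8, {
  scopeCost: 2,
  fallbackCost: 1,
  timeLost: 0.5,
  expectedRetries: 2,
});
```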
### Structural Insight: Upstream Failures Multiply
```
planning failure → wrong decomposition → wasted implementation
decomposition failure → unclear tasks → rework
review failure → bugs shipped → rework
```
This means `risk: critical` at the planning level outweighs `risk: critical` at the implementation level. The cost-benefit framework demonstrates this: poor planning (p=0.65) increases total cost by 150% compared to good planning (p=0.92), even with identical implementation tasks.
The failure propagates: poor planning reduces decomposition quality, which reduces implementation effectiveness, which increases integration issues. This structural property is independent of the developer type — human, LLM, or otherwise.
### Decomposition Threshold
`shouldDecomposeTask` flags tasks where:
- risk >= high, OR
- scope >= broad
This is a structural insight: large or risky tasks have higher failure rates and should be broken down. The threshold is consistent with the Rust CLI's `decompose` command.
## DAG-Propagation Cost Model
### Why
The Rust CLI computes EV per-task independently — no upstream quality degradation. As the Python research model demonstrates, this is dangerously optimistic for non-trivial workflows. In a dependency chain where planning has p=0.65 (poor), the Python model shows a **213% cost increase** vs good planning (p=0.92). The independent model barely shows a difference because it ignores cascading failure.
### Implementation Approach
DAG propagation is the **default mode**. The independent model is a degenerate case (set `defaultQualityDegradation: 0` or `propagationMode: 'independent'`).
The algorithm processes tasks in topological order, maintaining an `upstreamSuccessProbs` map:
1. For each task in topological order:
   - If propagation mode is `dag-propagate`: compute `pEffective` from intrinsic probability + upstream propagation
   - If propagation mode is `independent`: use intrinsic probability directly
   - Calculate EV using `calculateTaskEv`
   - Store the task's actual success probability for downstream propagation
2. When computing effective probability for a task with prerequisites:
   - Start with intrinsic probability
   - For each prerequisite, compute inherited quality: `parentP + (1 - parentP) × (1 - qualityDegradation)`
   - Multiply all inherited quality factors together with intrinsic probability
3. The `qualityDegradation` per edge determines how much a parent's failure bleeds through:
   - 0.0 = no propagation (independent model)
   - 1.0 = full propagation (parent failure guarantees child failure)
   - default 0.9 = high but not total propagation
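The propagation steps above can be sketched as follows. This is an illustration, not the library's code. It assumes the input is already in topological order, that the value stored for downstream use is the task's `pEffective`, and that a single `qualityDegradation` applies to every edge. The `PropTask` type and function name are hypothetical.

```typescript
// Hypothetical input shape for illustration only.
interface PropTask {
  id: string;
  pIntrinsic: number;
  prerequisites: string[]; // IDs of tasks this one depends on
}

// Returns pEffective per task ID. ASSUMES tasks arrive in topological order.
function propagateSketch(
  tasksInTopoOrder: PropTask[],
  qualityDegradation: number = 0.9,
): Record<string, number> {
  const upstreamSuccessProbs: Record<string, number> = {};
  for (const task of tasksInTopoOrder) {
    let pEffective = task.pIntrinsic;
    for (const parentId of task.prerequisites) {
      const parentP = upstreamSuccessProbs[parentId];
      // Inherited quality: 1.0 when the parent is certain to succeed,
      // shrinking toward parentP as qualityDegradation approaches 1.
      const inherited = parentP + (1 - parentP) * (1 - qualityDegradation);
      pEffective *= inherited;
    }
    upstreamSuccessProbs[task.id] = pEffective;
  }
  return upstreamSuccessProbs;
}

const chain: PropTask[] = [
  { id: "plan", pIntrinsic: 0.65, prerequisites: [] },
  { id: "implement", pIntrinsic: 0.9, prerequisites: ["plan"] },
];

// Default 0.9: inherited = 0.65 + 0.35 * 0.1 = 0.685, so 0.9 * 0.685 = 0.6165
const pDefault = propagateSketch(chain)["implement"];
// 0.0 reproduces the independent model: implement keeps its intrinsic 0.9
const pIndependent = propagateSketch(chain, 0)["implement"];
// 1.0 is full propagation: 0.9 * 0.65 = 0.585
const pFull = propagateSketch(chain, 1)["implement"];
```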
### Per-task output
Each task in the `WorkflowCostResult.tasks` array includes both `pIntrinsic` and `pEffective` so consumers can see the degradation effect. The per-task entries also include `taskId` and `name` (enriched from the graph's node attributes) — `calculateTaskEv` is the pure math function (takes only numeric inputs), while `workflowCost` is the aggregate that orchestrates the per-task calls and enriches results with identity metadata from the graph.
### Skip-completed semantics
When `includeCompleted: false`, completed tasks are excluded from the result's task list, but they **remain in the propagation chain** with p=1.0. Removing completed tasks from propagation would *worsen* downstream probability estimates — exactly the opposite of what "what's left" queries need.
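One way to picture these semantics, as a sketch under the same assumptions as the propagation description (pre-sorted input, one global `qualityDegradation`): a completed task is recorded upstream with p=1.0 but is left out of the returned list. The type and function names here are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface RemainingTask {
  id: string;
  pIntrinsic: number;
  prerequisites: string[];
  completed: boolean;
}

interface RemainingResult {
  taskId: string;
  pIntrinsic: number;
  pEffective: number;
}

// Sketch of includeCompleted: false. ASSUMES topological input order.
function remainingTasksSketch(
  tasksInTopoOrder: RemainingTask[],
  qualityDegradation: number = 0.9,
): RemainingResult[] {
  const upstream: Record<string, number> = {};
  const results: RemainingResult[] = [];
  for (const task of tasksInTopoOrder) {
    if (task.completed) {
      upstream[task.id] = 1.0; // stays in the chain as a certain success
      continue;                // but is excluded from the output list
    }
    let pEffective = task.pIntrinsic;
    for (const parentId of task.prerequisites) {
      const parentP = upstream[parentId];
      pEffective *= parentP + (1 - parentP) * (1 - qualityDegradation);
    }
    upstream[task.id] = pEffective;
    results.push({ taskId: task.id, pIntrinsic: task.pIntrinsic, pEffective });
  }
  return results;
}

const remaining = remainingTasksSketch([
  { id: "plan", pIntrinsic: 0.65, prerequisites: [], completed: true },
  { id: "implement", pIntrinsic: 0.9, prerequisites: ["plan"], completed: false },
  { id: "test", pIntrinsic: 0.8, prerequisites: ["implement"], completed: false },
]);
// "plan" is omitted; "implement" inherits a factor of 1.0 and keeps 0.9;
// "test" gets 0.8 * (0.9 + 0.1 * 0.1) = 0.728
```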
> See [ADR-004](decisions/004-workflow-cost-dag-propagation.md) and [ADR-005](decisions/005-no-depth-escalation-v1.md).
### Comparison with Rust CLI
| Dimension | Rust CLI (Simple Sum) | This Library (DAG Propagation) |
|-----------|----------------------|-------------------------------|
| Topology awareness | None | Full — topological order + upstream propagation |
| Upstream failure modeling | Ignored | Each parent's failure degrades child's effective p |
| Edge semantics | Not used | `qualityDegradation` per edge, default 0.9 |
| Result interpretation | Sum of independent per-task costs | Total workflow cost accounting for cascading failure |
| Degenerate case | — | Set `propagationMode: 'independent'` or `defaultQualityDegradation: 0` |
## Risk Analysis Functions
### riskPath
`riskPath(graph)` → `RiskPathResult`
Calls `weightedCriticalPath` with weight function `riskWeight * impactWeight`. Returns the path with highest cumulative risk and its total risk score.
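To illustrate the idea, the sketch below finds the highest-weight path through a DAG with per-node weights. It is a stand-in for `weightedCriticalPath`, whose real signature is not shown here. The `RiskNode` type, the pre-sorted input, positive weights, and the numeric weight values are all assumptions.

```typescript
// Hypothetical node shape: weight stands for riskWeight * impactWeight.
interface RiskNode {
  id: string;
  weight: number;
  dependsOn: string[];
}

// Longest (highest cumulative weight) path in a DAG.
// ASSUMES nodes are in topological order and weights are positive.
function criticalPathSketch(nodesInTopoOrder: RiskNode[]): { path: string[]; totalRisk: number } {
  const best: Record<string, number> = {};          // best cumulative score ending at node
  const prev: Record<string, string | null> = {};   // predecessor on that best path
  let endId: string | null = null;

  for (const node of nodesInTopoOrder) {
    let bestParent: string | null = null;
    let bestParentScore = 0;
    for (const parentId of node.dependsOn) {
      if (best[parentId] > bestParentScore) {
        bestParentScore = best[parentId];
        bestParent = parentId;
      }
    }
    best[node.id] = bestParentScore + node.weight;
    prev[node.id] = bestParent;
    if (endId === null || best[node.id] > best[endId]) endId = node.id;
  }

  // Walk predecessors back from the best end node to recover the path.
  const path: string[] = [];
  let cur: string | null = endId;
  while (cur !== null) {
    path.unshift(cur);
    cur = prev[cur];
  }
  return { path, totalRisk: endId === null ? 0 : best[endId] };
}

const riskiest = criticalPathSketch([
  { id: "a", weight: 2 * 1, dependsOn: [] },
  { id: "b", weight: 3 * 2, dependsOn: ["a"] },
  { id: "c", weight: 1 * 1, dependsOn: ["a"] },
  { id: "d", weight: 3 * 1, dependsOn: ["b", "c"] },
]);
// path a -> b -> d with cumulative weight 2 + 6 + 3 = 11
```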
### riskDistribution
`riskDistribution(graph)` → `RiskDistributionResult`
Groups tasks by risk category. Returns counts per bucket: trivial, low, medium, high, critical, unspecified.
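A rough sketch of the bucketing, operating on a plain array of risk values rather than a graph. The function name, the input shape, and the choice to count unrecognized strings as `unspecified` are assumptions.

```typescript
type RiskBucket = "trivial" | "low" | "medium" | "high" | "critical" | "unspecified";

// Counts tasks per risk bucket; null or undefined risk lands in "unspecified".
function riskDistributionSketch(
  risks: Array<string | null | undefined>,
): Record<RiskBucket, number> {
  const counts: Record<RiskBucket, number> = {
    trivial: 0, low: 0, medium: 0, high: 0, critical: 0, unspecified: 0,
  };
  for (const risk of risks) {
    // ASSUMPTION: unknown strings are treated like missing values.
    const known = risk != null && Object.prototype.hasOwnProperty.call(counts, risk);
    const bucket: RiskBucket = known ? (risk as RiskBucket) : "unspecified";
    counts[bucket] += 1;
  }
  return counts;
}

const distribution = riskDistributionSketch(["high", null, "low", "high", undefined]);
// high: 2, low: 1, unspecified: 2, all other buckets 0
```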
### shouldDecomposeTask
`shouldDecomposeTask(attrs: TaskGraphNodeAttributes)` → `DecomposeResult`
Pure function — takes node attributes (not a graph). Internally calls `resolveDefaults` to handle nullable `risk`/`scope` fields. A task with `risk: null` uses the default (medium, which is below the threshold); a task with `scope: null` uses the default (narrow, which is below the threshold). This means unassessed tasks are never flagged for decomposition — an explicit `risk: "high"` or `scope: "broad"` is required.
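The default handling can be sketched as below. The risk scale matches the buckets listed for `riskDistribution`. The scope scale, the result shape, and the inlined defaulting (standing in for `resolveDefaults`) are hypothetical; only `narrow` and `broad` are named in this document.

```typescript
const RISK_ORDER = ["trivial", "low", "medium", "high", "critical"] as const;
// HYPOTHETICAL scope scale: only "narrow" and "broad" appear in the source.
const SCOPE_ORDER = ["single", "narrow", "moderate", "broad", "system"] as const;

type RiskLevel = (typeof RISK_ORDER)[number];
type ScopeLevel = (typeof SCOPE_ORDER)[number];

interface DecomposeAttrsSketch {
  risk: RiskLevel | null;
  scope: ScopeLevel | null;
}

// Hypothetical result shape for illustration.
function shouldDecomposeSketch(
  attrs: DecomposeAttrsSketch,
): { shouldDecompose: boolean; reasons: string[] } {
  // Unassessed fields fall back to defaults that sit below both thresholds.
  const risk: RiskLevel = attrs.risk ?? "medium";
  const scope: ScopeLevel = attrs.scope ?? "narrow";

  const reasons: string[] = [];
  if (RISK_ORDER.indexOf(risk) >= RISK_ORDER.indexOf("high")) reasons.push("risk: " + risk);
  if (SCOPE_ORDER.indexOf(scope) >= SCOPE_ORDER.indexOf("broad")) reasons.push("scope: " + scope);
  return { shouldDecompose: reasons.length > 0, reasons };
}

const unassessed = shouldDecomposeSketch({ risk: null, scope: null });     // not flagged
const riskyTask = shouldDecomposeSketch({ risk: "high", scope: null });    // flagged by risk
const broadTask = shouldDecomposeSketch({ risk: "low", scope: "broad" });  // flagged by scope
```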
## findCycles
graphology provides `hasCycle` (boolean) and `stronglyConnectedComponents` (node groups, not paths). The library implements a custom cycle path extractor for error reporting:
- **Algorithm**: Extended 3-color DFS (WHITE/GREY/BLACK). When a back edge is found (an edge from the GREY node being explored to another GREY node still on the stack), trace back through the recursion stack to extract the cycle path as an ordered node sequence. Each inner array in the returned `string[][]` is a single cycle: an ordered sequence of node IDs where the last node has an edge back to the first. The algorithm returns **one representative cycle per back edge**, not an exhaustive enumeration of all simple cycles (which could be exponential). For error reporting, one cycle per problematic region is sufficient.
- **Optimization**: Use `stronglyConnectedComponents()` as a fast pre-check. If there are zero multi-node SCCs (and no self-loops), skip the DFS entirely.
- **Relationship to topologicalOrder**: `topologicalOrder()` throws `CircularDependencyError` (with `cycles` populated from `findCycles`) when the graph is cyclic. This gives consumers the cycle information needed for error reporting.
> See [errors-validation.md](errors-validation.md) for error handling.
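The 3-color DFS described for `findCycles` might look roughly like this. The sketch works on a plain adjacency record instead of a graphology graph, uses recursion for brevity, and skips the SCC pre-check. The function name and input shape are assumptions.

```typescript
// Adjacency list: node ID -> IDs of nodes it has an edge to.
type AdjacencySketch = Record<string, string[]>;

// Returns one representative cycle per back edge, as ordered node IDs.
function findCyclesSketch(adjacency: AdjacencySketch): string[][] {
  const WHITE = 0, GREY = 1, BLACK = 2;
  const color: Record<string, number> = {};
  const stack: string[] = [];   // current DFS path, all GREY
  const cycles: string[][] = [];

  function visit(node: string): void {
    color[node] = GREY;
    stack.push(node);
    for (const next of adjacency[node] ?? []) {
      const nextColor = color[next] ?? WHITE;
      if (nextColor === GREY) {
        // Back edge: the cycle is the stack slice from `next` to `node`.
        cycles.push(stack.slice(stack.indexOf(next)));
      } else if (nextColor === WHITE) {
        visit(next);
      }
      // BLACK nodes are fully explored and cannot start a new cycle here.
    }
    stack.pop();
    color[node] = BLACK;
  }

  for (const node of Object.keys(adjacency)) {
    if ((color[node] ?? WHITE) === WHITE) visit(node);
  }
  return cycles;
}

const cyclesFound = findCyclesSketch({
  a: ["b"],
  b: ["c"],
  c: ["a", "d"], // c -> a closes the cycle a -> b -> c
  d: [],
  e: ["e"],      // self-loop
});
// [["a", "b", "c"], ["e"]]
```

A production version would likely use an explicit stack to avoid deep recursion on long dependency chains.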
## Constraints
- **DAG-propagation is default** — the independent model is opt-in, not the other way around. The independent model is the degenerate case, not the norm.
- **No depth-escalation in v1** — the multiplicative propagation model already captures depth effects implicitly (each hop compounds another `<1.0` factor). Adding an explicit depth penalty would double-count until we have empirical calibration data. See [ADR-005](decisions/005-no-depth-escalation-v1.md).
- **Categorical estimates, not numeric** — The framework uses categorical fields because LLMs reliably distinguish "high vs medium risk" but struggle with "$3.42 vs $3.50". Categoricals remain valid across environments (different models, providers, token costs).