TaskGraph Rust Source - Comprehensive Research Report

Source: /workspace/@alkimiadev/taskgraph (Rust CLI project)
Report date: 2026-04-23
Version: 0.1.3


Table of Contents

  1. Project Structure
  2. Cargo.toml Details
  3. Core Data Types and Public APIs
  4. Functions/Methods to Expose via NAPI
  5. Serialization (Serde) Support
  6. Error Types and Error Handling
  7. Input/Output Patterns
  8. Existing Tests and Benchmarks

1. Project Structure

Directory Layout

taskgraph/
├── Cargo.toml              # Package manifest (single crate, not a workspace)
├── Cargo.lock               # Locked dependencies
├── LICENSE-APACHE           # Apache-2.0 license
├── LICENSE-MIT              # MIT license
├── README.md                # User-facing documentation
├── AGENTS.md                # AI agent context file
├── opencode.json            # OpenCode configuration
├── .github/
│   └── workflows/
│       └── ci.yml           # CI: fmt, clippy, test, coverage
├── docs/
│   ├── ARCHITECTURE.md      # Full architecture spec
│   ├── framework.md         # Cost-benefit framework rationale
│   ├── workflow.md          # Practical workflow guide
│   ├── implementation.md   # Tools/models/guidelines
│   ├── phase-1.md through phase-4.md  # Phase plans
│   ├── issues/              # Blocking issues tracking
│   ├── reviews/             # Code review docs
│   └── research/
│       └── cost_benefit_analysis_framework.py
├── scripts/
│   └── benchmark.sh         # Manual benchmark script
├── benches/
│   └── graph_benchmarks.rs  # Criterion benchmarks
├── src/
│   ├── main.rs              # Binary entry point (thin: parse CLI, execute)
│   ├── lib.rs               # Library root - re-exports public API
│   ├── cli.rs               # CLI argument definitions (clap derive)
│   ├── task.rs              # Task, TaskFrontmatter, enums (serde types)
│   ├── graph.rs             # DependencyGraph (petgraph wrapper)
│   ├── error.rs             # Error enum (thiserror)
│   ├── config.rs            # Config loading (.taskgraph.toml)
│   ├── discovery.rs         # TaskCollection (directory scanning)
│   └── commands/
│       ├── mod.rs            # Command module re-exports
│       ├── init.rs           # `init` command
│       ├── validate.rs       # `validate` command
│       ├── list.rs           # `list` command
│       ├── show.rs            # `show` command
│       ├── deps.rs            # `deps` command
│       ├── topo.rs            # `topo` command
│       ├── cycles.rs          # `cycles` command
│       ├── parallel.rs        # `parallel` command
│       ├── critical.rs        # `critical` command
│       ├── bottleneck.rs      # `bottleneck` command
│       ├── risk.rs            # `risk` command
│       ├── decompose.rs       # `decompose` command
│       ├── workflow_cost.rs   # `workflow-cost` command
│       ├── risk_path.rs       # `risk-path` command
│       └── graph_cmd.rs       # `graph` command (DOT output)
└── tests/
    ├── integration/
    │   └── commands.rs        # 31 integration tests (assert_cmd)
    └── fixtures/
        ├── tasks/             # 3 valid tasks (one depends on another)
        ├── cycles/             # 3 tasks forming a cycle
        ├── invalid/            # 1 task with missing dependency
        ├── risk/               # 5 tasks with various risk levels
        └── decompose/          # 4 tasks for decomposition testing

Module Dependency Graph

lib.rs
  ├── cli          → commands::*, config, discovery, graph
  ├── commands/*   → cli, discovery, graph, task
  ├── config       → error
  ├── discovery    → task, error
  ├── error        → (thiserror, std, serde_yaml, serde_json)
  ├── graph        → discovery, task, petgraph
  └── task         → (serde, chrono, gray_matter, error)

Crates

This is a single-crate project (not a Cargo workspace). It produces:

  • Library: libtaskgraph (from src/lib.rs)
  • Binary: taskgraph (from src/main.rs)

2. Cargo.toml Details

Package Metadata

| Field | Value |
|---|---|
| name | taskgraph |
| version | 0.1.3 |
| edition | 2021 |
| license | MIT OR Apache-2.0 |
| description | CLI tool for managing task dependencies using markdown files |
| repository | https://github.com/alkimiadev/taskgraph |
| keywords | task, dependency, graph, cli, markdown |
| categories | command-line-utilities, development-tools |

Dependencies (Production)

| Crate | Version | Features | Purpose |
|---|---|---|---|
| petgraph | 0.7 | - | Directed graph data structure & algorithms (toposort, cycle detection, etc.) |
| gray_matter | 0.2 | - | Markdown frontmatter extraction (YAML engine) |
| serde | 1.0 | derive | Serialization/deserialization framework |
| serde_json | 1.0 | - | JSON serialization (for --format json output) |
| serde_yaml | 0.9 | - | YAML serialization (for frontmatter parsing & roundtrip) |
| clap | 4.5 | derive | CLI argument parsing |
| clap_complete | 4.5 | - | Shell completion generation |
| chrono | 0.4 | serde | Date/time with serde support |
| anyhow | 1.0 | - | Ergonomic error handling (used in CLI/binary) |
| thiserror | 2.0 | - | Derived error types (used in library) |
| dirs | 6.0 | - | Platform directories (future: global config) |
| walkdir | 2.5 | - | Recursive directory walking |
| tracing | 0.1 | - | Structured logging |
| tracing-subscriber | 0.3 | env-filter | Log output formatting |
| toml | 0.8 | - | Config file parsing |

Dev Dependencies

| Crate | Version | Purpose |
|---|---|---|
| tempfile | 3.0 | Temporary directories for tests |
| assert_cmd | 2.0 | CLI integration testing |
| predicates | 3.0 | Assertion predicates for integration tests |
| criterion | 0.5 | Benchmarking framework |

Features

[features]
default = []

No feature flags exist yet, which makes this a good place to add an optional napi feature.

Release Profile

[profile.release]
opt-level = 3
lto = true
strip = true

3. Core Data Types and Public APIs

3.1 Task (src/task.rs)

The central data type. Represents a single task file.

/// A task with its content.
#[derive(Debug, Clone)]
pub struct Task {
    pub frontmatter: TaskFrontmatter,
    pub body: String,           // Markdown body content
    pub source: Option<String>, // Source file path (if loaded from file)
}

Methods:

| Method | Signature | Returns | Description |
|---|---|---|---|
| id() | &self -> &str | Task ID | Accessor for frontmatter.id |
| name() | &self -> &str | Task name | Accessor for frontmatter.name |
| status() | &self -> TaskStatus | Status enum | Accessor for frontmatter.status |
| depends_on() | &self -> &[String] | Dependency list | Accessor for frontmatter.depends_on |
| from_file() | &Path -> Result<Self> | Parsed Task | Parse from a .md file on disk |
| from_markdown() | &str, Option<String> -> Result<Self> | Parsed Task | Parse from markdown string + optional source name |
| to_markdown() | &self -> Result<String, serde_yaml::Error> | Markdown string | Serialize back to markdown with YAML frontmatter |

Key observation: Task itself does NOT derive Serialize or Deserialize. Only TaskFrontmatter does. The body and source fields are not serialized through serde - they're managed separately during parse/render.
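Because the body is handled outside serde, parsing has to separate the YAML frontmatter from the markdown body first. The crate delegates this to gray_matter; the sketch below is only a std-only illustration of the idea, and split_frontmatter is a hypothetical helper, not a function in the source.

```rust
/// Split a markdown document into (frontmatter, body).
/// Hypothetical helper for illustration; the real crate uses gray_matter.
/// Returns None when the document does not start with a `---` fence
/// or the closing fence is missing.
fn split_frontmatter(content: &str) -> Option<(&str, &str)> {
    // The opening fence must be the very first line.
    let rest = content
        .strip_prefix("---\r\n")
        .or_else(|| content.strip_prefix("---\n"))?;

    // Walk line by line (keeping terminators) to find the closing fence.
    let mut offset = 0;
    for line in rest.split_inclusive('\n') {
        if line.trim_end() == "---" {
            let frontmatter = &rest[..offset];
            let body = &rest[offset + line.len()..];
            return Some((frontmatter, body));
        }
        offset += line.len();
    }
    None
}
```

The frontmatter slice would then go to a YAML deserializer to produce TaskFrontmatter, while the body slice is stored as-is.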

3.2 TaskFrontmatter (src/task.rs)

The structured metadata extracted from YAML frontmatter:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TaskFrontmatter {
    pub id: String,
    pub name: String,
    #[serde(default)]
    pub status: TaskStatus,
    #[serde(default, rename = "depends_on")]
    pub depends_on: Vec<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub priority: Option<String>,
    #[serde(default, skip_serializing_if = "Vec::is_empty")]
    pub tags: Vec<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub created: Option<DateTime<Utc>>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub modified: Option<DateTime<Utc>>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub assignee: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub due: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub scope: Option<TaskScope>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub risk: Option<TaskRisk>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub impact: Option<TaskImpact>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub level: Option<TaskLevel>,
}

Serde details:

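  • All enums use #[serde(rename_all = "kebab-case")], so variant names serialize as kebab-case values (for example, InProgress becomes in-progress)
  • Optional fields use skip_serializing_if = "Option::is_none" to keep output clean
  • tags uses skip_serializing_if = "Vec::is_empty"
  • depends_on carries an explicit rename = "depends_on"; the name is unchanged, so the attribute only makes the YAML key explicit
  • status defaults to TaskStatus::Pending when omitted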

3.3 Enum Types (src/task.rs)

All enums derive Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, Default.

TaskStatus

#[serde(rename_all = "kebab-case")]
pub enum TaskStatus {
    Pending,       // default
    InProgress,    // "in-progress" in YAML/JSON
    Completed,
    Failed,
    Blocked,
}

Also implements Display (kebab-case strings).
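A NAPI layer that passes statuses as JS strings needs the same kebab-case mapping in both directions. The following is a minimal std-only sketch: the Display strings follow the serde names above, while as_str and the FromStr impl are assumptions added for illustration (the source only confirms Display).

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskStatus {
    #[default]
    Pending,
    InProgress,
    Completed,
    Failed,
    Blocked,
}

impl TaskStatus {
    /// Kebab-case name, matching the serde representation.
    /// Hypothetical helper; not confirmed by the source.
    pub fn as_str(self) -> &'static str {
        match self {
            TaskStatus::Pending => "pending",
            TaskStatus::InProgress => "in-progress",
            TaskStatus::Completed => "completed",
            TaskStatus::Failed => "failed",
            TaskStatus::Blocked => "blocked",
        }
    }
}

impl fmt::Display for TaskStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

// Assumption: FromStr is not mentioned in the source; it mirrors Display here.
impl FromStr for TaskStatus {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pending" => Ok(TaskStatus::Pending),
            "in-progress" => Ok(TaskStatus::InProgress),
            "completed" => Ok(TaskStatus::Completed),
            "failed" => Ok(TaskStatus::Failed),
            "blocked" => Ok(TaskStatus::Blocked),
            other => Err(format!("unknown task status: {}", other)),
        }
    }
}
```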

TaskScope

#[serde(rename_all = "kebab-case")]
pub enum TaskScope {
    Single,    // ~500 tokens, cost 1.0
    Narrow,    // default, ~1500 tokens, cost 2.0
    Moderate,  // ~3000 tokens, cost 3.0
    Broad,     // ~6000 tokens, cost 4.0
    System,    // ~10000 tokens, cost 5.0
}

Methods: token_estimate() -> u32, cost_estimate() -> f64, Display
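The numeric mappings can be written as plain match expressions. This sketch uses the values from the comments above; taking self by value is an assumption, since the report lists only the return types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskScope {
    Single,
    #[default]
    Narrow,
    Moderate,
    Broad,
    System,
}

impl TaskScope {
    /// Approximate token budget for a task of this scope.
    pub fn token_estimate(self) -> u32 {
        match self {
            TaskScope::Single => 500,
            TaskScope::Narrow => 1500,
            TaskScope::Moderate => 3000,
            TaskScope::Broad => 6000,
            TaskScope::System => 10000,
        }
    }

    /// Relative cost on a 1.0 to 5.0 scale.
    pub fn cost_estimate(self) -> f64 {
        match self {
            TaskScope::Single => 1.0,
            TaskScope::Narrow => 2.0,
            TaskScope::Moderate => 3.0,
            TaskScope::Broad => 4.0,
            TaskScope::System => 5.0,
        }
    }
}
```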

TaskRisk

#[serde(rename_all = "kebab-case")]
pub enum TaskRisk {
    Trivial,   // p=0.98
    Low,       // default, p=0.90
    Medium,    // p=0.80
    High,      // p=0.65
    Critical,  // p=0.50
}

Methods: success_probability() -> f64, Display

TaskImpact

#[serde(rename_all = "kebab-case")]
pub enum TaskImpact {
    Isolated,   // default, weight 1.0
    Component,  // weight 1.5
    Phase,      // weight 2.0
    Project,    // weight 3.0
}

Methods: weight() -> f64, Display
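TaskRisk and TaskImpact follow the same pattern as TaskScope. A minimal sketch using the probabilities and weights from the comments above (receiver types are assumed):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskRisk {
    Trivial,
    #[default]
    Low,
    Medium,
    High,
    Critical,
}

impl TaskRisk {
    /// Estimated probability that a task at this risk level succeeds.
    pub fn success_probability(self) -> f64 {
        match self {
            TaskRisk::Trivial => 0.98,
            TaskRisk::Low => 0.90,
            TaskRisk::Medium => 0.80,
            TaskRisk::High => 0.65,
            TaskRisk::Critical => 0.50,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskImpact {
    #[default]
    Isolated,
    Component,
    Phase,
    Project,
}

impl TaskImpact {
    /// Multiplier reflecting how far a failure would spread.
    pub fn weight(self) -> f64 {
        match self {
            TaskImpact::Isolated => 1.0,
            TaskImpact::Component => 1.5,
            TaskImpact::Phase => 2.0,
            TaskImpact::Project => 3.0,
        }
    }
}
```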

TaskLevel

#[serde(rename_all = "kebab-case")]
pub enum TaskLevel {
    Planning,
    Decomposition,
    Implementation,  // default
    Review,
    Research,
}

Methods: Display only

3.4 DependencyGraph (src/graph.rs)

A directed graph of task dependencies built from a TaskCollection.

pub struct DependencyGraph {
    graph: DiGraph<TaskId, ()>,       // petgraph directed graph
    index_map: HashMap<TaskId, NodeIndex>,  // task ID -> node index
}

Edge direction: from -> to means "from must complete before to" (dependency must complete first).
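The edge direction is easy to get backwards, so here is a small std-only model of it. MiniGraph is a hypothetical stand-in for the petgraph-backed DependencyGraph; it only illustrates that an edge from -> to makes `from` a dependency of `to`, and that unknown IDs are ignored.

```rust
use std::collections::HashMap;

/// Hypothetical adjacency-list stand-in for DependencyGraph.
#[derive(Default)]
struct MiniGraph {
    /// id -> tasks that must complete before it (incoming edges)
    incoming: HashMap<String, Vec<String>>,
    /// id -> tasks that wait on it (outgoing edges)
    outgoing: HashMap<String, Vec<String>>,
}

impl MiniGraph {
    fn add_task(&mut self, id: &str) {
        self.incoming.entry(id.to_string()).or_default();
        self.outgoing.entry(id.to_string()).or_default();
    }

    /// Add edge from -> to ("from must complete before to").
    /// Edges that mention an unknown ID are silently dropped.
    fn add_dependency(&mut self, from: &str, to: &str) {
        if !self.incoming.contains_key(from) || !self.incoming.contains_key(to) {
            return;
        }
        self.outgoing.get_mut(from).unwrap().push(to.to_string());
        self.incoming.get_mut(to).unwrap().push(from.to_string());
    }

    /// Direct dependencies: the sources of incoming edges.
    fn dependencies(&self, id: &str) -> Vec<String> {
        self.incoming.get(id).cloned().unwrap_or_default()
    }

    /// Direct dependents: the targets of outgoing edges.
    fn dependents(&self, id: &str) -> Vec<String> {
        self.outgoing.get(id).cloned().unwrap_or_default()
    }
}
```

With add_dependency("t1", "t2"), dependencies("t2") contains t1 and dependents("t1") contains t2.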

Public API:

| Method | Signature | Returns | Description |
|---|---|---|---|
| new() | -> Self | Empty graph | Create empty graph |
| from_collection() | &TaskCollection -> Self | Built graph | Build from discovered tasks |
| from_tasks() | Vec<&Task> -> Self | Built graph | Build from explicit task list |
| add_task() | &mut self, TaskId | () | Add node |
| add_dependency() | &mut self, &str, &str | () | Add edge (from->to); silently ignores unknown IDs |
| has_cycles() | &self -> bool | Boolean | Uses petgraph::algo::is_cyclic_directed |
| find_cycles() | &self -> Vec<Vec<TaskId>> | Cycles | Custom DFS cycle finder |
| topological_order() | &self -> Option<Vec<TaskId>> | Order or None | Uses petgraph::algo::toposort |
| dependencies() | &self, &str -> Vec<TaskId> | Incoming neighbors | What this task depends on (direct) |
| dependents() | &self, &str -> Vec<TaskId> | Outgoing neighbors | What depends on this (direct) |
| parallel_groups() | &self -> Vec<Vec<TaskId>> | Generations | Tasks grouped by level (can run concurrently) |
| critical_path() | &self -> Vec<TaskId> | Path | Longest path through the graph |
| weighted_critical_path() | &self, F: Fn(&str)->f64 -> Vec<TaskId> | Weighted path | Path with highest cumulative weight |
| bottlenecks() | &self -> Vec<(TaskId, usize)> | Ranked list | Betweenness centrality via path counting |
| to_dot() | &self -> String | DOT string | GraphViz DOT format export |

Also implements Default (returns new()).

Important: DependencyGraph does NOT implement Serialize/Deserialize. It's a compute-only structure built fresh each time from tasks.
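To make the "generations" returned by parallel_groups() concrete, the sketch below groups tasks level by level in the style of Kahn's algorithm. It is a std-only illustration under assumptions: the input is a plain map of task ID to dependency IDs rather than a petgraph graph, and returning None on a cycle is a choice made here, not confirmed behavior of the crate.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group tasks into generations that can run concurrently.
/// `deps` maps each task ID to the IDs it depends on.
/// Returns None if a cycle prevents a complete grouping (assumed behavior).
fn parallel_groups<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<Vec<String>>> {
    // Unmet dependencies per task; dependency IDs with no task are ignored.
    let mut remaining: BTreeMap<&'a str, BTreeSet<&'a str>> = deps
        .iter()
        .map(|(id, ds)| {
            let known = ds.iter().copied().filter(|d| deps.contains_key(d)).collect();
            (*id, known)
        })
        .collect();

    let mut groups = Vec::new();
    while !remaining.is_empty() {
        // Every task whose dependencies are all satisfied can run now.
        let ready: Vec<&'a str> = remaining
            .iter()
            .filter(|(_, ds)| ds.is_empty())
            .map(|(id, _)| *id)
            .collect();
        if ready.is_empty() {
            return None; // only tasks on a cycle remain
        }
        for id in &ready {
            remaining.remove(id);
        }
        // Mark the finished generation as satisfied for everyone else.
        for ds in remaining.values_mut() {
            for id in &ready {
                ds.remove(id);
            }
        }
        groups.push(ready.into_iter().map(String::from).collect());
    }
    Some(groups)
}
```

Flattening the groups in order yields one valid topological order, which is why the two operations are closely related.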

3.5 TaskCollection (src/discovery.rs)

Collection of tasks discovered from a directory:

#[derive(Debug, Default)]
pub struct TaskCollection {
    tasks: HashMap<String, Task>,        // Tasks indexed by ID
    paths: HashMap<String, PathBuf>,     // File paths indexed by ID
    errors: Vec<DiscoveryError>,         // Parse errors encountered
}

Public API:

| Method | Signature | Returns | Description |
|---|---|---|---|
| new() | -> Self | Empty collection | Constructor |
| from_directory() | &Path -> Self | Populated collection | Scan directory recursively for .md files |
| get() | &self, &str -> Option<&Task> | Task or None | Lookup by ID |
| path() | &self, &str -> Option<&PathBuf> | Path or None | File path for task ID |
| tasks() | &self -> impl Iterator<Item = &Task> | Iterator | All tasks |
| ids() | &self -> impl Iterator<Item = &str> | Iterator | All task IDs |
| len() | &self -> usize | Count | Number of tasks |
| is_empty() | &self -> bool | Boolean | Empty check |
| errors() | &self -> &[DiscoveryError] | Errors | Parse errors from discovery |
| missing_dependencies() | &self -> HashMap<String, Vec<String>> | Map | Task ID -> missing dep IDs |
| validate() | &self -> ValidationResult | Result | Full validation |

Important: TaskCollection does NOT implement Serialize/Deserialize either. It's built procedurally.
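The missing_dependencies() check reduces to a set-membership test over task IDs. A std-only sketch, assuming the input is a map of task ID to its depends_on list and that tasks with nothing missing are left out of the result (the report does not say which):

```rust
use std::collections::HashMap;

/// For each task, list the dependency IDs that match no known task.
/// Assumption: tasks without missing dependencies are omitted.
fn missing_dependencies(tasks: &HashMap<String, Vec<String>>) -> HashMap<String, Vec<String>> {
    let mut missing = HashMap::new();
    for (id, deps) in tasks {
        let unknown: Vec<String> = deps
            .iter()
            .filter(|dep| !tasks.contains_key(*dep))
            .cloned()
            .collect();
        if !unknown.is_empty() {
            missing.insert(id.clone(), unknown);
        }
    }
    missing
}
```

This is also the shape a NAPI binding would hand to JS as Record<string, string[]>.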

3.6 DiscoveryError (src/discovery.rs)

#[derive(Debug, Clone)]
pub struct DiscoveryError {
    pub path: PathBuf,
    pub message: String,
}

No serde derives. Simple struct for error reporting.

3.7 ValidationResult (src/discovery.rs)

#[derive(Debug)]
pub struct ValidationResult {
    pub task_count: usize,
    pub errors: Vec<DiscoveryError>,
    pub missing_dependencies: HashMap<String, Vec<String>>,
}

Methods: is_valid() -> bool, issue_count() -> usize

No serde derives on the Rust type itself, but it's converted to ValidationOutput (which does derive Serialize) in the validate command.

3.8 Config (src/config.rs)

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct Config {
    #[serde(default)]
    pub project: ProjectConfig,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ProjectConfig {
    #[serde(default = "default_tasks_dir")]
    pub tasks_dir: String,  // default: "tasks"
}

API:

| Method | Signature | Returns | Description |
|---|---|---|---|
| from_file() | &Path -> Result<Self> | Config | Load from .taskgraph.toml |
| find_and_load() | -> Option<Self> | Config or None | Search up directory tree |
| tasks_path() | &self -> PathBuf | Path | Get tasks directory |
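"Search up directory tree" can be sketched with Path::ancestors. The helper below is hypothetical: find_config_from and its explicit start directory are assumptions for illustration, and the real find_and_load() may start from the current directory and parse the file as well.

```rust
use std::path::{Path, PathBuf};

const CONFIG_FILE: &str = ".taskgraph.toml";

/// Walk from `start` towards the filesystem root and return the path of
/// the first config file found. Hypothetical helper, not the crate's API.
fn find_config_from(start: &Path) -> Option<PathBuf> {
    start
        .ancestors() // yields start, its parent, grandparent, ...
        .map(|dir| dir.join(CONFIG_FILE))
        .find(|candidate| candidate.is_file())
}
```

Taking the start directory as a parameter keeps the search testable without changing the process working directory.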

3.9 CLI Types (src/cli.rs)

#[derive(Clone, Copy, Debug, Default, ValueEnum)]
pub enum OutputFormat {
    Plain,   // default
    Json,
}

#[derive(Parser, Debug)]
pub struct Cli {
    pub path: Option<String>,
    pub format: OutputFormat,
    pub command: Commands,
}

#[derive(Subcommand, Debug)]
pub enum Commands {
    Init { id, name, scope, risk },
    Validate { strict },
    List { status, tag },
    Show { id },
    Deps { id },
    Dependents { id },
    Topo { status },
    Cycles,
    Parallel,
    Critical,
    Bottleneck,
    Risk,
    Decompose,
    WorkflowCost { include_completed, limit },
    RiskPath,
    Graph { output },
    Completions { shell },
}

The Cli::execute() method dispatches all commands. Each command builds a fresh TaskCollection from the tasks directory.

3.10 Lib.rs Public Re-exports

pub mod cli;
pub mod commands;
pub mod config;
pub mod discovery;
pub mod error;
pub mod graph;
pub mod task;

pub use config::Config;
pub use discovery::{DiscoveryError, TaskCollection, ValidationResult};
pub use error::{Error, Result};
pub use graph::DependencyGraph;
pub use task::{Task, TaskFrontmatter, TaskImpact, TaskLevel, TaskRisk, TaskScope, TaskStatus};

4. Functions/Methods to Expose via NAPI

Priority 1: Core Data Types (Must Have)

These are the foundational types that everything else depends on:

| Rust Type | NAPI Class | Why |
|---|---|---|
| Task | Task | Central unit of work; must be creatable, readable, serializable from JS |
| TaskFrontmatter | Embedded in Task or separate class | All metadata is here; JS needs to read/write fields |
| TaskStatus | String enum mapping | Simple 5-variant enum; map to JS string union |
| TaskScope | String enum mapping | 5 variants with numeric mappings; map to JS string union |
| TaskRisk | String enum mapping | 5 variants with probability; map to JS string union |
| TaskImpact | String enum mapping | 4 variants with weight; map to JS string union |
| TaskLevel | String enum mapping | 5 variants; map to JS string union |

Priority 2: Core Functions (Must Have)

| Rust Function | NAPI Method | Input | Output | Why |
|---|---|---|---|---|
| Task::from_markdown() | Task.fromMarkdown(content, source?) | string, string? | Task | Parse task from markdown string |
| Task::from_file() | Task.fromFile(path) | string | Task | Parse task from file path |
| Task::to_markdown() | task.toMarkdown() | - | string | Serialize task back to markdown |
| Task::id() | task.id (getter) | - | string | Accessor |
| Task::name() | task.name (getter) | - | string | Accessor |
| Task::status() | task.status (getter) | - | string | Accessor |
| Task::depends_on() | task.dependsOn (getter) | - | string[] | Accessor |
| TaskScope::token_estimate() | scope.tokenEstimate() | - | number | Numeric mapping |
| TaskScope::cost_estimate() | scope.costEstimate() | - | number | Numeric mapping |
| TaskRisk::success_probability() | risk.successProbability() | - | number | Numeric mapping |
| TaskImpact::weight() | impact.weight() | - | number | Numeric mapping |

Priority 3: Collection & Discovery (Must Have)

| Rust Function | NAPI Method | Input | Output | Why |
|---|---|---|---|---|
| TaskCollection::from_directory() | TaskCollection.fromDirectory(path) | string | TaskCollection | Primary entry point: discover all tasks |
| TaskCollection::new() | new TaskCollection() | - | TaskCollection | Empty constructor for building manually |
| TaskCollection::get() | collection.get(id) | string | Task \| null | Lookup by ID |
| TaskCollection::len() | collection.length (getter) | - | number | Task count |
| TaskCollection::ids() | collection.ids() | - | string[] | All task IDs |
| TaskCollection::tasks() | collection.tasks() | - | Task[] | All tasks |
| TaskCollection::errors() | collection.errors (getter) | - | DiscoveryError[] | Parse errors |
| TaskCollection::missing_dependencies() | collection.missingDependencies() | - | Record<string, string[]> | Find broken deps |
| TaskCollection::validate() | collection.validate() | - | ValidationResult | Full validation |

Priority 4: Graph Operations (Must Have)

| Rust Function | NAPI Method | Input | Output | Why |
|---|---|---|---|---|
| DependencyGraph::from_collection() | DependencyGraph.fromCollection(collection) | TaskCollection | DependencyGraph | Build graph |
| DependencyGraph::new() | new DependencyGraph() | - | DependencyGraph | Empty graph constructor |
| DependencyGraph::from_tasks() | DependencyGraph.fromTasks(tasks[]) | Task[] | DependencyGraph | Build from JS array |
| add_task() | graph.addTask(id) | string | void | Add node |
| add_dependency() | graph.addDependency(from, to) | string, string | void | Add edge |
| has_cycles() | graph.hasCycles() | - | boolean | Cycle detection |
| find_cycles() | graph.findCycles() | - | string[][] | Get actual cycles |
| topological_order() | graph.topologicalOrder() | - | string[] \| null | Execution order |
| dependencies() | graph.dependencies(id) | string | string[] | Direct deps |
| dependents() | graph.dependents(id) | string | string[] | What depends on this |
| parallel_groups() | graph.parallelGroups() | - | string[][] | Parallel work groups |
| critical_path() | graph.criticalPath() | - | string[] | Longest path |
| weighted_critical_path() | graph.weightedCriticalPath(weightFn) | (id: string) => number | string[] | Weighted longest path |
| bottlenecks() | graph.bottlenecks() | - | [string, number][] | Betweenness centrality |
| to_dot() | graph.toDot() | - | string | GraphViz DOT format |

Priority 5: Config (Nice to Have)

| Rust Function | NAPI Method | Input | Output | Why |
|---|---|---|---|---|
| Config::from_file() | Config.fromFile(path) | string | Config | Load config |
| Config::find_and_load() | Config.findAndLoad() | - | Config \| null | Auto-discover config |
| Config::tasks_path() | config.tasksPath (getter) | - | string | Get tasks dir |

Priority 6: Workflow Cost Calculation (Nice to Have)

The workflow_cost command uses calculate_task_ev(), which is a private function. Consider exposing:

| Function | NAPI Method | Input | Output | Why |
|---|---|---|---|---|
| calculate_task_ev() (currently private) | calculateTaskEv(p, scopeCost, impactWeight) | number, number, number | number | Expected value calculation |

This would need to be made pub or reimplemented in the NAPI layer.

Notes on weighted_critical_path for NAPI

The weighted_critical_path takes a Rust closure F: Fn(&str) -> f64. For NAPI, this would need to:

  1. Accept a JavaScript function callback, OR
  2. Accept a Record<string, number> map of task ID -> weight

Option 2 is simpler and avoids cross-language callback overhead. For example:

interface DependencyGraph {
  // Option 1: callback (more complex; each weight lookup crosses the JS/Rust boundary)
  weightedCriticalPath(weightFn: (taskId: string) => number): string[];

  // Option 2: map (simpler)
  weightedCriticalPath(weights: Record<string, number>): string[];
}
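On the Rust side, the map approach needs only a small adapter that turns the map into the Fn(&str) -> f64 closure the existing API expects. A minimal sketch under assumptions: weight_fn_from_map and its default-weight parameter are hypothetical, and total_weight merely stands in for a closure-taking API such as weighted_critical_path.

```rust
use std::collections::HashMap;

/// Stand-in for an API that takes a weight closure. It only sums the
/// weights of `ids`; the real method would search the graph instead.
fn total_weight<F: Fn(&str) -> f64>(ids: &[&str], weight: F) -> f64 {
    ids.iter().map(|id| weight(*id)).sum()
}

/// Hypothetical adapter: expose a weight map as a closure.
/// IDs missing from the map fall back to `default`.
fn weight_fn_from_map<'a>(
    weights: &'a HashMap<String, f64>,
    default: f64,
) -> impl Fn(&str) -> f64 + 'a {
    move |id: &str| weights.get(id).copied().unwrap_or(default)
}
```

A binding could deserialize the JS object into the HashMap once and call the closure entirely in Rust, so no JS callback runs during the path search.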

5. Serialization (Serde) Support

Full Serde Support (Serialize + Deserialize)

| Type | Serialize | Deserialize | Notes |
|---|---|---|---|
| TaskStatus | Yes | Yes | rename_all = "kebab-case" |
| TaskScope | Yes | Yes | rename_all = "kebab-case" |
| TaskRisk | Yes | Yes | rename_all = "kebab-case" |
| TaskImpact | Yes | Yes | rename_all = "kebab-case" |
| TaskLevel | Yes | Yes | rename_all = "kebab-case" |
| TaskFrontmatter | Yes | Yes | Rich serde attributes (skip_serializing_if, rename, default) |
| Config | Yes | Yes | Via TOML |
| ProjectConfig | Yes | Yes | Via TOML |

No Serde Support

| Type | Serialize | Deserialize | Reason |
|---|---|---|---|
| Task | No | No | body and source are separate from frontmatter; to_markdown() handles serialization manually |
| DependencyGraph | No | No | Computed structure; rebuilt from tasks each time |
| TaskCollection | No | No | Procedurally built from directory scanning |
| DiscoveryError | No | No | Error reporting struct |
| ValidationResult | No | No | Internal result type |
| Error | No | No | Error enum |
| OutputFormat | No | No | CLI-only (ValueEnum, not serde) |
| Cli | No | No | CLI-only (clap derive) |
| Commands | No | No | CLI-only enum |

JSON Serialization in Commands (Ad-hoc)

Several command modules define private structs that derive Serialize for JSON output:

| File | Struct | Fields |
|---|---|---|
| validate.rs | ValidationOutput | valid, task_count, error_count, errors[], missing_deps |
| validate.rs | ValidationError | path, message |
| list.rs | TaskSummary | id, name, status, scope |
| show.rs | TaskDetails | id, name, status, depends_on, scope, risk, impact, level, tags, body |
| deps.rs | DependencyInfo | id, status, exists |
| deps.rs | DependenciesOutput | task_id, dependencies[] |
| topo.rs | TopoTask | position, id, name, status |
| topo.rs | TopoOutput | order[], has_cycles |
| cycles.rs | CyclesOutput | has_cycles, cycle_count, cycles[] |
| workflow_cost.rs | TaskCost | id, name, cost |
These are private to each command module and not part of the public API. For NAPI, we would define equivalent TypeScript interfaces or create new public serializable structs.

Serialization Format Details

YAML (frontmatter): TaskFrontmatter uses serde_yaml with:

  • rename_all = "kebab-case" on enums → in-progress, narrow, high, etc.
  • rename = "depends_on" on the depends_on field (explicit)
  • default on fields that may be omitted from the YAML (status, depends_on, and every optional field)
  • skip_serializing_if = "Option::is_none" for optional fields
  • skip_serializing_if = "Vec::is_empty" for tags

JSON (output): Uses serde_json::to_string_pretty() in commands.

TOML (config): Config uses toml::from_str().

Roundtrip: Task::from_markdown() + Task::to_markdown() should produce equivalent output (tested implicitly).


6. Error Types and Error Handling

Library Error Type (src/error.rs)

#[derive(Error, Debug)]
pub enum Error {
    #[error("Task not found: {0}")]
    TaskNotFound(String),

    #[error("Task already exists: {0}")]
    TaskAlreadyExists(String),

    #[error("Circular dependency detected: {0}")]
    CircularDependency(String),

    #[error("Invalid frontmatter in {file}: {message}")]
    InvalidFrontmatter { file: String, message: String },

    #[error("Missing required field '{field}' in {file}")]
    MissingField { file: String, field: String },

    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),

    #[error("YAML parsing error: {0}")]
    Yaml(#[from] serde_yaml::Error),

    #[error("JSON serialization error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("Graph error: {0}")]
    Graph(String),
}

pub type Result<T> = std::result::Result<T, Error>;

Error conversion: From impls via #[from] for std::io::Error, serde_yaml::Error, serde_json::Error.

Usage patterns:

  • Library code returns crate::Result<T> (= Result<T, Error>)
  • anyhow::Result is used only at the binary boundary (main.rs and Cli::execute())
  • thiserror provides Display impls automatically

CLI Error Handling

The Cli::execute() method returns anyhow::Result<()>. Each command function returns crate::Result<()>. The ? operator converts between them naturally.

Error handling at boundaries:

  • Task::from_file(): IO errors → Error::Io, parse errors → Error::InvalidFrontmatter
  • TaskCollection::from_directory(): Silently skips files without frontmatter, stores errors in DiscoveryError list (non-fatal)
  • Config::from_file(): TOML parse errors → Error::Graph(format!(...)) (note: reuses Graph variant)
  • Command functions: Error::TaskNotFound when task ID missing, Error::TaskAlreadyExists on duplicate init

NAPI Error Mapping Strategy

For the Node.js wrapper, we should map:

| Rust Error | Node.js Error | Notes |
|---|---|---|
| TaskNotFound(id) | Generic Error with message | JS: throw new Error("Task not found: <id>") |
| TaskAlreadyExists(id) | Generic Error with message | JS: throw new Error("Task already exists: <id>") |
| CircularDependency(msg) | Generic Error with message | JS: throw new Error("Circular dependency detected: <msg>") |
| InvalidFrontmatter { file, message } | Generic Error with message | JS: throw new Error("Invalid frontmatter in <file>: <message>") |
| MissingField { file, field } | Generic Error with message | JS: throw new Error("Missing required field '<field>' in <file>") |
| Io(err) | Generic Error with message | JS: throw new Error("IO error: <message>") |
| Yaml(err) | Generic Error with message | JS: throw new Error("YAML parsing error: <message>") |
| Json(err) | Generic Error with message | JS: throw new Error("JSON serialization error: <message>") |
| Graph(msg) | Generic Error with message | JS: throw new Error("Graph error: <msg>") |

Alternatively, we could create custom JS error classes for better programmatic handling:

declare class TaskNotFoundError extends Error { readonly taskId: string }
declare class CircularDependencyError extends Error {}
declare class InvalidFrontmatterError extends Error { readonly file: string } // message is inherited from Error
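A middle ground between generic errors and custom classes is to attach a stable code string to each variant, which JS callers could read from error.code. The sketch below is std-only and hypothetical: it mirrors three variants of the Error enum with a hand-written Display (the crate derives it with thiserror), and the code() method and its strings are assumptions, not part of the source.

```rust
use std::fmt;

/// Reduced mirror of the library's Error enum, for illustration only.
#[derive(Debug)]
enum Error {
    TaskNotFound(String),
    CircularDependency(String),
    InvalidFrontmatter { file: String, message: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Messages copy the #[error(...)] strings from src/error.rs.
        match self {
            Error::TaskNotFound(id) => write!(f, "Task not found: {}", id),
            Error::CircularDependency(msg) => {
                write!(f, "Circular dependency detected: {}", msg)
            }
            Error::InvalidFrontmatter { file, message } => {
                write!(f, "Invalid frontmatter in {}: {}", file, message)
            }
        }
    }
}

impl Error {
    /// Hypothetical stable identifier a wrapper could expose as `error.code`.
    fn code(&self) -> &'static str {
        match self {
            Error::TaskNotFound(_) => "TASK_NOT_FOUND",
            Error::CircularDependency(_) => "CIRCULAR_DEPENDENCY",
            Error::InvalidFrontmatter { .. } => "INVALID_FRONTMATTER",
        }
    }
}
```

The Display string becomes the JS message, while the code lets callers branch without parsing message text.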

7. Input/Output Patterns

Data Flow Overview

                    DISCOVERY
tasks/*.md files ──────────────> TaskCollection
   (disk)                       (HashMap<String, Task>)
                                     │
                                     │ from_collection() / from_tasks()
                                     ▼
                               DependencyGraph
                               (DiGraph<String, ()>)
                                     │
                    ┌────────────────┼────────────────────┐
                    │                 │                     │
                    ▼                 ▼                     ▼
              topological       parallel_groups      critical_path
              order()           ()                   ()
                    │                 │                     │
                    └────────────────┴─────────────────────┘
                                     │
                                     ▼
                              Output (plain/JSON)

Input Patterns

  1. File-based input (primary): TaskCollection::from_directory(path) scans a directory recursively for .md files, parses each, and builds the collection. This is the main entry point.

  2. String-based input: Task::from_markdown(content, source) parses a single markdown string. Useful for programmatic construction.

  3. Path-based input: Task::from_file(path) reads a single file and parses it.

  4. Programmatic construction: DependencyGraph::new() + add_task() + add_dependency() for building graphs manually.

Output Patterns

  1. Plain text (default): Human-readable terminal output with tables, arrows, and formatting.

  2. JSON output (--format json): Structured JSON using ad-hoc Serialize structs in each command. This is the primary programmatic output format.

  3. DOT format: DependencyGraph::to_dot() returns GraphViz DOT format string.

  4. Markdown roundtrip: Task::to_markdown() produces valid markdown with YAML frontmatter.

Typical Usage Flow

// 1. Discover tasks
let collection = TaskCollection::from_directory(Path::new("./tasks"));

// 2. Validate
let result = collection.validate();
if !result.is_valid() { /* handle errors */ }

// 3. Build graph
let graph = DependencyGraph::from_collection(&collection);

// 4. Analyze
let has_cycles = graph.has_cycles();
let order = graph.topological_order();
let parallel = graph.parallel_groups();
let critical = graph.critical_path();
let bottlenecks = graph.bottlenecks();

NAPI Data Flow Design

For the Node.js wrapper, the recommended data flow is:

// Option A: File-based (mirrors Rust CLI)
const collection = TaskCollection.fromDirectory('./tasks');
const graphFromDir = DependencyGraph.fromCollection(collection);

// Option B: Programmatic (unique to NAPI)
const tasks = [
  Task.fromMarkdown('---\nid: t1\nname: Task 1\n---\nBody'),
  Task.fromMarkdown('---\nid: t2\nname: Task 2\ndepends_on: [t1]\n---\nBody'),
];
const graphFromTasks = DependencyGraph.fromTasks(tasks);

// Option C: Manual graph construction
const manualGraph = new DependencyGraph();
manualGraph.addTask('t1');
manualGraph.addTask('t2');
manualGraph.addDependency('t1', 't2'); // t1 must complete before t2

Memory/Ownership Considerations for NAPI

  • Task is Clone; a clone copies the body String, the TaskFrontmatter, and the Option<String> source, which is cheap for typical task files
  • TaskCollection owns all Task objects (HashMap<String, Task>)
  • DependencyGraph owns the graph structure (not the tasks themselves; only stores task IDs as node weights)
  • DependencyGraph::from_collection() borrows &TaskCollection (doesn't take ownership)
  • Task::from_file() and from_markdown() return owned Task values

For NAPI, we need to decide:

  1. Should TaskCollection hold JS-managed task objects or Rust-owned? Probably Rust-owned (tasks are parsed from files/strings, not constructed in JS).
  2. Should graph operations return strings or Task references? Currently returns Vec<TaskId> (strings). The JS side can look up tasks from the collection. This is efficient.
  3. Should DependencyGraph keep a reference to TaskCollection? Currently no. This means JS must pass the collection alongside the graph for enriched output. We could create a combined TaskGraph class in the NAPI layer.
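The combined class from point 3 could own the tasks alongside the dependency data, so graph queries return enriched results without a second lookup from JS. A std-only sketch under assumptions: TaskGraph, TaskInfo, and dependencies_with_names are hypothetical names, and the real version would wrap TaskCollection and DependencyGraph rather than a plain map.

```rust
use std::collections::HashMap;

/// Reduced stand-in for Task: just the fields this sketch needs.
struct TaskInfo {
    name: String,
    depends_on: Vec<String>,
}

/// Hypothetical combined type that owns the tasks it answers queries about.
struct TaskGraph {
    tasks: HashMap<String, TaskInfo>,
}

impl TaskGraph {
    fn new(tasks: HashMap<String, TaskInfo>) -> Self {
        Self { tasks }
    }

    /// Direct dependencies of `id` as (id, name) pairs.
    /// Unknown task IDs and unresolved dependencies yield no entries.
    fn dependencies_with_names(&self, id: &str) -> Vec<(String, String)> {
        let task = match self.tasks.get(id) {
            Some(task) => task,
            None => return Vec::new(),
        };
        task.depends_on
            .iter()
            .filter_map(|dep| {
                self.tasks
                    .get(dep)
                    .map(|found| (dep.clone(), found.name.clone()))
            })
            .collect()
    }
}
```

Keeping ownership in Rust means the NAPI object holds one handle, and JS never has to keep a collection and a graph in sync.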

8. Existing Tests and Benchmarks

Unit Tests (in-source)

| File | Test Count | Key Tests |
|---|---|---|
| src/graph.rs | 12 | Empty graph, add task/dep, missing deps, cycle detection, topo sort, parallel groups, critical path, bottleneck, DOT output, unknown task queries |
| src/discovery.rs | 5 | Single task discovery, skip files without frontmatter, duplicate ID detection, missing dependencies, validation result |
| src/config.rs | 2 | Default config, load from file |

Integration Tests (tests/integration/commands.rs)

31 tests are listed below, all using assert_cmd:

| Test | Command | What It Verifies |
|---|---|---|
| test_list_command | list | Lists all 3 fixture tasks |
| test_list_with_status_filter | list --status completed | Filters correctly |
| test_show_command | show task-one | Shows task details |
| test_show_missing_task | show missing-task | Fails on missing |
| test_validate_command | validate | Succeeds on valid fixtures |
| test_validate_with_missing_dependency | validate (invalid) | Reports missing deps |
| test_topo_command | topo | Outputs topological order |
| test_deps_command | deps task-two | Shows task-one as dependency |
| test_dependents_command | dependents task-one | Shows tasks two and three |
| test_cycles_command_no_cycles | cycles | No cycles in valid fixtures |
| test_cycles_command_with_cycles | cycles (cycles fixtures) | Detects cycle |
| test_parallel_command | parallel | Shows generation groups |
| test_critical_command | critical | Shows critical path |
| test_graph_command | graph | Outputs DOT format |
| test_bottleneck_command | bottleneck | Shows bottleneck tasks |
| test_init_command | init new-task | Creates file |
| test_init_duplicate_task | init task-one | Fails on duplicate |
| test_init_with_options | init --scope narrow --risk low | Writes scope/risk to file |
| test_risk_command | risk | Distribution with counts |
| test_risk_command_empty | risk (empty dir) | "No tasks found" |
| test_decompose_command | decompose | Flags high-risk/broad-scope tasks |
| test_decompose_command_none_needed | decompose (low-risk tasks) | "No tasks need decomposition" |
| test_workflow_cost_command | workflow-cost | Shows cost analysis |
| test_workflow_cost_command_empty | workflow-cost (empty) | "No tasks found" |
| test_risk_path_command | risk-path | Shows risk path |
| test_risk_path_command_empty | risk-path (empty) | "No tasks found" |
| test_help_flag | --help | Shows help text |
| test_version_flag | --version | Succeeds |
| test_completions_bash | completions bash | Bash completion output |
| test_completions_zsh | completions zsh | Zsh completion output |
| test_completions_fish | completions fish | Fish completion output |

Benchmark Suite (benches/graph_benchmarks.rs)

Uses Criterion. Two benchmark groups:

  1. load_tasks: Measures TaskCollection::from_directory() + DependencyGraph::from_collection() for 50, 100, 500, 1000 tasks.

  2. graph_ops: On 1000-task graph, measures:

    • topological_sort_1000
    • cycle_detection_1000
    • critical_path_1000
    • bottlenecks_1000

Test data: linear chain of tasks (task-i depends on task-(i-1)).
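
For reference when porting the benchmarks to the Node.js side, the linear-chain fixture could be reproduced in memory roughly as follows. This is a minimal sketch: the `ChainTask` shape, the `linearChain` name, and the zero-based `task-0` starting index are assumptions for illustration, not taken from the Rust benchmark code.

```typescript
// Hypothetical in-memory task shape used only for this sketch.
interface ChainTask {
  id: string;
  dependsOn: string[];
}

// Builds n tasks where task-i depends on task-(i-1); the first task has no dependencies.
function linearChain(n: number): ChainTask[] {
  const tasks: ChainTask[] = [];
  for (let i = 0; i < n; i++) {
    tasks.push({
      id: `task-${i}`,
      dependsOn: i === 0 ? [] : [`task-${i - 1}`],
    });
  }
  return tasks;
}
```

A chain like this has exactly one path between any ordered pair of tasks, so it exercises graph depth rather than branching.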

Performance Numbers (from README)

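| Tasks | Load Time | Topo Sort | Cycles | Critical Path |
|-------|-----------|-----------|--------|---------------|
| 50 | 3ms | 3ms | 2ms | 8ms |
| 500 | 19ms | 21ms | 14ms | 52ms |
| 1,000 | 34ms | 42ms | 26ms | 82ms |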

(Benchmarked on AMD EPYC 9004 series)

CI Pipeline (.github/workflows/ci.yml)

Two jobs:

  1. Test: checkout -> install Rust (with rustfmt, clippy) -> cache -> fmt check -> clippy -> test -> build release
  2. Coverage: checkout -> install Rust -> cache -> install cargo-llvm-cov -> generate lcov -> upload to Codecov

Test Coverage

Reported at 89%, above the 80% target set in AGENTS.md.


Appendix A: Complete Type Reference for NAPI Mapping

Enums to JS String Unions

// task.ts
type TaskStatus = "pending" | "in-progress" | "completed" | "failed" | "blocked";
type TaskScope = "single" | "narrow" | "moderate" | "broad" | "system";
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";
type TaskImpact = "isolated" | "component" | "phase" | "project";
type TaskLevel = "planning" | "decomposition" | "implementation" | "review" | "research";
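
Because these unions exist only at compile time, values arriving from untyped sources (JSON, CLI output) may need a runtime check before they are treated as a `TaskStatus`. One possible sketch is below; the helper names `TASK_STATUSES`, `isTaskStatus`, and `parseTaskStatus` are hypothetical, and only the string values come from the union above.

```typescript
// The allowed values mirror the TaskStatus union; the helper names are illustrative.
const TASK_STATUSES = ["pending", "in-progress", "completed", "failed", "blocked"] as const;
type TaskStatus = (typeof TASK_STATUSES)[number];

// Type guard: narrows an unknown value to TaskStatus when it matches one of the literals.
function isTaskStatus(value: unknown): value is TaskStatus {
  return typeof value === "string" && (TASK_STATUSES as readonly string[]).indexOf(value) !== -1;
}

// Throwing variant for call sites that cannot continue with an invalid value.
function parseTaskStatus(value: unknown): TaskStatus {
  if (!isTaskStatus(value)) {
    throw new TypeError(`invalid task status: ${String(value)}`);
  }
  return value;
}
```

The same pattern would apply to `TaskScope`, `TaskRisk`, `TaskImpact`, and `TaskLevel`.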

Proposed NAPI Class Structure

// task.ts
declare class Task {
  // Static constructors
  static fromMarkdown(content: string, source?: string): Task;
  static fromFile(path: string): Task;

  // Getters
  get id(): string;
  get name(): string;
  get status(): TaskStatus;
  get dependsOn(): string[];
  get body(): string;
  get source(): string | null;

  // Frontmatter access (via JS object)
  get frontmatter(): TaskFrontmatter;

  // Serialization
  toMarkdown(): string;
}

interface TaskFrontmatter {
  id: string;
  name: string;
  status: TaskStatus;
  dependsOn: string[];
  priority?: string;
  tags: string[];
  created?: string;     // ISO 8601
  modified?: string;    // ISO 8601
  assignee?: string;
  due?: string;
  scope?: TaskScope;
  risk?: TaskRisk;
  impact?: TaskImpact;
  level?: TaskLevel;
}

// collection.ts
declare class TaskCollection {
  static fromDirectory(path: string): TaskCollection;
  get(id: string): Task | null;
  get length(): number;
  ids(): string[];
  tasks(): Task[];
  get errors(): DiscoveryError[];
  missingDependencies(): Record<string, string[]>;
  validate(): ValidationResult;
}

interface DiscoveryError {
  path: string;
  message: string;
}

interface ValidationResult {
  taskCount: number;
  errors: DiscoveryError[];
  missingDependencies: Record<string, string[]>;
  isValid(): boolean;
  issueCount(): number;
}

// graph.ts
declare class DependencyGraph {
  static fromCollection(collection: TaskCollection): DependencyGraph;
  static fromTasks(tasks: Task[]): DependencyGraph;

  addTask(id: string): void;
  addDependency(from: string, to: string): void;
  hasCycles(): boolean;
  findCycles(): string[][];
  topologicalOrder(): string[] | null;
  dependencies(taskId: string): string[];
  dependents(taskId: string): string[];
  parallelGroups(): string[][];
  criticalPath(): string[];
  weightedCriticalPath(weights: Record<string, number>): string[];
  bottlenecks(): [string, number][];
  toDot(): string;
}

// config.ts
declare class Config {
  static fromFile(path: string): Config;
  static findAndLoad(): Config | null;
  get tasksPath(): string;
}

// workflow.ts
declare function calculateTaskEv(p: number, scopeCost: number, impactWeight: number): number;

Key Decisions for NAPI Implementation

  1. Task mutability: The Rust Task struct is Clone but has no setters. For NAPI, we should either:

    • Make the JS Task immutable (read-only after creation) - simpler, matches Rust
    • Add a TaskBuilder pattern for constructing tasks programmatically
  2. Enum representation: Use JS string literals (not numeric enums) to match the kebab-case serde serialization.

  3. Error handling: Throw JS Error objects from NAPI. Consider custom error classes for TaskNotFound and InvalidFrontmatter.

  4. DateTime handling: chrono::DateTime<Utc> maps to ISO 8601 strings in JS. No need for JS Date objects in the NAPI layer.

  5. Graph lifetime: The Rust DependencyGraph borrows nothing (stores owned String node weights). It can be freely moved/owned in NAPI.

  6. Collection lifetime: TaskCollection owns its tasks. The NAPI class should hold the Rust struct. Returning Task references from collection.get() requires careful lifetime management - consider returning clones.

  7. weighted_critical_path callback: Replace the Rust closure with a JS Record<string, number> dict lookup to avoid FFI callback overhead and complexity.
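
For decision 3, the custom error classes might look roughly like this on the JS side. This is a sketch under assumptions: the class names, the `taskId` and `path` fields, the message formats, and the `requireTask` helper are all hypothetical, and how the native layer would construct these errors is left open.

```typescript
// Hypothetical base class so callers can catch every binding error with one instanceof check.
class TaskGraphError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Restore the prototype chain so instanceof also works when compiled to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class TaskNotFoundError extends TaskGraphError {
  readonly taskId: string;
  constructor(taskId: string) {
    super(`task not found: ${taskId}`);
    this.taskId = taskId;
  }
}

class InvalidFrontmatterError extends TaskGraphError {
  readonly path: string;
  constructor(path: string, detail: string) {
    super(`invalid frontmatter in ${path}: ${detail}`);
    this.path = path;
  }
}

// Example call site: look up a task or throw the typed error.
function requireTask<T>(tasks: Map<string, T>, id: string): T {
  const task = tasks.get(id);
  if (task === undefined) throw new TaskNotFoundError(id);
  return task;
}
```

Typed errors let callers branch on `instanceof` instead of parsing message strings.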


Appendix B: Notable Implementation Details

Bottleneck Algorithm

The current bottlenecks() implementation uses an O(n^2 * P) algorithm, where n is the number of tasks and P is the number of paths between a pair of nodes. It enumerates all paths between all pairs, then counts how many paths each task appears on. This is not true betweenness centrality (which Brandes' algorithm computes in O(VE) time on unweighted graphs) but a simpler path-counting approach. For large graphs, this could be slow. The benchmark only tests up to 1000 nodes with linear topology, where there is exactly one path between any ordered pair, so it does not exercise the cost of heavily branching graphs.
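
To make the path-counting idea concrete, here is a minimal sketch of the approach as described, not a port of the Rust code. The adjacency shape (each task mapped to the tasks that depend on it), the `pathCountScores` name, and the choice to credit only interior nodes of each path are assumptions.

```typescript
// Hypothetical input: every task id is a key, mapped to the ids of tasks that depend on it.
type DependentsMap = Record<string, string[]>;

function pathCountScores(edges: DependentsMap): Map<string, number> {
  const nodes = Object.keys(edges);
  const scores = new Map<string, number>();
  for (const n of nodes) scores.set(n, 0);

  // Depth-first enumeration of every simple path from the start of `path` to `target`.
  const walk = (current: string, target: string, path: string[]): void => {
    if (current === target) {
      // Credit interior nodes only (endpoints are excluded by assumption).
      for (const mid of path.slice(1, -1)) {
        scores.set(mid, (scores.get(mid) ?? 0) + 1);
      }
      return;
    }
    for (const next of edges[current] ?? []) {
      if (path.indexOf(next) !== -1) continue; // never revisit a node, so cycles terminate
      path.push(next);
      walk(next, target, path);
      path.pop();
    }
  };

  // Try every ordered pair of distinct nodes.
  for (const source of nodes) {
    for (const target of nodes) {
      if (source !== target) walk(source, target, [source]);
    }
  }
  return scores;
}

// Diamond a -> {b, c} -> d -> e: d lies on the most paths.
const diamondScores = pathCountScores({ a: ["b", "c"], b: ["d"], c: ["d"], d: ["e"], e: [] });
```

The number of enumerated paths grows with branching, which is why this approach can degrade quickly on wide graphs.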

Critical Path Algorithm

Uses a recursive, memoized longest-path computation. It works well for DAGs but returns empty or incorrect results if the graph contains cycles (the parallel_groups method also breaks silently in that case), so callers should check for cycles first.
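
The following sketch illustrates a recursive memoized longest-path search combined with the `Record<string, number>` weight lookup proposed for the NAPI layer. It is an illustration under assumptions rather than the Rust implementation: the input shape, the default weight of 1, positive weights, and returning `null` on a cycle (instead of an empty or incorrect path) are choices made here.

```typescript
// Hypothetical input: every task id is a key, mapped to the ids it depends on.
type DependsOnMap = Record<string, string[]>;

function weightedCriticalPath(
  dependsOn: DependsOnMap,
  weights: Record<string, number> = {},
): string[] | null {
  const best = new Map<string, { cost: number; prev: string | null }>();
  const visiting = new Set<string>();
  const cycleHits: string[] = [];

  // Cost of the heaviest dependency chain ending at `id`, memoized in `best`.
  const longestTo = (id: string): number => {
    const cached = best.get(id);
    if (cached !== undefined) return cached.cost;
    if (visiting.has(id)) {
      cycleHits.push(id); // re-entered a node still on the stack: cycle
      return 0;
    }
    visiting.add(id);
    let prev: string | null = null;
    let prevCost = 0;
    for (const dep of dependsOn[id] ?? []) {
      if (!(dep in dependsOn)) continue; // ignore dependencies that are not tasks
      const cost = longestTo(dep);
      if (cost > prevCost) {
        prevCost = cost;
        prev = dep;
      }
    }
    visiting.delete(id);
    const total = prevCost + (weights[id] ?? 1); // unweighted tasks count as 1
    best.set(id, { cost: total, prev });
    return total;
  };

  let end: string | null = null;
  let endCost = 0;
  for (const id of Object.keys(dependsOn)) {
    const cost = longestTo(id);
    if (cost > endCost) {
      endCost = cost;
      end = id;
    }
  }
  if (cycleHits.length > 0) return null; // results are unreliable on cyclic input

  // Walk the predecessor links back from the most expensive endpoint.
  const path: string[] = [];
  let cur: string | null = end;
  while (cur !== null) {
    path.unshift(cur);
    cur = best.get(cur)?.prev ?? null;
  }
  return path;
}

const deps: DependsOnMap = { a: [], b: ["a"], c: ["a"], d: ["b", "c"] };
const unweighted = weightedCriticalPath(deps);           // ["a", "b", "d"]
const weighted = weightedCriticalPath(deps, { c: 5 });   // ["a", "c", "d"]
```

Passing weights as a plain object keeps the whole computation on one side of the FFI boundary, which is the motivation given for replacing the Rust closure.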

Missing: Task Serialization

Task does not implement Serialize/Deserialize. The to_markdown() method manually concatenates YAML frontmatter + markdown body. If we need JSON serialization of the full Task (including body), we should add a new serializable struct like:

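use serde::Serialize;

// JSON-friendly view of a Task; assumes TaskFrontmatter already derives Serialize.
#[derive(Serialize)]
pub struct SerializableTask {
    pub frontmatter: TaskFrontmatter,
    pub body: String,
    pub source: Option<String>,
}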

Or implement Serialize for Task directly.

Missing: Task Mutability

There are no methods to update a task's status, dependencies, etc. in place. The current design assumes files are the source of truth and are edited directly. For an NAPI wrapper, we may want to add:

  • task.set_status(status: TaskStatus)
  • task.set_depends_on(deps: Vec<String>)
  • etc.

Or use a builder pattern for creating new tasks.
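
If the JS Task stays immutable, updates could instead be expressed as copy-with-change helpers on the TypeScript side. The sketch below assumes a plain-data `TaskData` shape and hypothetical `withStatus` / `withDependsOn` helpers; none of these exist in the Rust crate.

```typescript
type Status = "pending" | "in-progress" | "completed" | "failed" | "blocked";

// Hypothetical plain-data view of a task, limited to a few fields for brevity.
interface TaskData {
  readonly id: string;
  readonly name: string;
  readonly status: Status;
  readonly dependsOn: readonly string[];
}

// Returns a frozen copy with a new status; the original object is left untouched.
function withStatus(task: TaskData, status: Status): TaskData {
  return Object.freeze({ ...task, status });
}

// Returns a frozen copy with new dependencies, de-duplicated in first-seen order.
function withDependsOn(task: TaskData, deps: readonly string[]): TaskData {
  const unique = deps.filter((d, i) => deps.indexOf(d) === i);
  return Object.freeze({ ...task, dependsOn: Object.freeze(unique) });
}
```

This keeps the "files are the source of truth" model intact: a changed copy would still need to be written back with something like `toMarkdown()`.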

Missing: Partial Graph Building

DependencyGraph::from_collection() adds edges only for dependencies that exist as nodes in the graph. Missing dependencies are silently ignored (no error, no warning). This matches the add_dependency() behavior which checks index_map before adding edges.
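
A wrapper that wants to surface the silently dropped edges could collect them while building the graph. The sketch below is illustrative only: `TaskLike`, `BuiltGraph`, `buildGraph`, and the `[dependency, dependent]` edge orientation are assumptions, not the Rust API.

```typescript
// Hypothetical minimal task shape for this sketch.
interface TaskLike {
  id: string;
  dependsOn: string[];
}

interface BuiltGraph {
  nodes: string[];
  edges: [string, string][];          // [dependency, dependent] (orientation assumed)
  missing: Record<string, string[]>;  // task id -> dependency ids that are not tasks
}

function buildGraph(tasks: TaskLike[]): BuiltGraph {
  const known = new Set(tasks.map((t) => t.id));
  const edges: [string, string][] = [];
  const missing: Record<string, string[]> = {};

  for (const task of tasks) {
    for (const dep of task.dependsOn) {
      if (known.has(dep)) {
        edges.push([dep, task.id]);
      } else {
        // Record the skipped edge instead of dropping it silently.
        if (missing[task.id] === undefined) missing[task.id] = [];
        missing[task.id].push(dep);
      }
    }
  }
  return { nodes: tasks.map((t) => t.id), edges, missing };
}
```

The `missing` map has the same shape as the proposed `missingDependencies()` return type, so the two could be reported together.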

walkdir: follow_links(false)

TaskCollection::from_directory() does not follow symlinks. This is intentional for safety.


Appendix C: Dependency Version Compatibility Notes

| Crate | Version | Notes for NAPI |
|-------|---------|----------------|
| petgraph | 0.7 | Stable API; DiGraph and algorithms are well-defined |
| gray_matter | 0.2 | Minor version; API may change in 0.3 |
| serde | 1.0 | Very stable; derive feature needed |
| serde_json | 1.0 | Very stable |
| serde_yaml | 0.9 | Note: serde_yaml 0.9 is the last version before potential breaking changes |
| chrono | 0.4 | Stable; serde feature for serialization |
| clap | 4.5 | CLI-only; not needed in NAPI lib |
| thiserror | 2.0 | Error derive; v2 is newer than commonly seen |
| toml | 0.8 | For config loading |
| walkdir | 2.5 | For directory scanning |

For NAPI, we can exclude from the build:

  • clap / clap_complete (CLI-only, not needed for library)
  • tracing / tracing-subscriber (logging, optional)
  • dirs (platform directories, only for CLI default paths)

This could be done with feature flags:

[features]
default = ["cli"]
# Each crate named in `cli` must be declared with `optional = true` under [dependencies].
cli = ["clap", "clap_complete", "tracing", "tracing-subscriber", "dirs"]
napi = []  # Minimal dependencies for Node.js binding