alkdev/taskgraph_ts

glm-5.1 ba8c382d53 Add architecture doc and research reports for taskgraph_ts napi wrapper

2026-04-23 10:30:40 +00:00

TaskGraph Rust Source - Comprehensive Research Report

Source: /workspace/@alkimiadev/taskgraph (Rust CLI project)
Report date: 2026-04-23
Version: 0.1.3

1. Project Structure
2. Cargo.toml Details
3. Core Data Types and Public APIs
4. Functions/Methods to Expose via NAPI
5. Serialization (Serde) Support
6. Error Types and Error Handling
7. Input/Output Patterns
8. Existing Tests and Benchmarks

1. Project Structure

Directory Layout

taskgraph/
├── Cargo.toml              # Package manifest (single crate, not a workspace)
├── Cargo.lock               # Locked dependencies
├── LICENSE-APACHE           # Apache-2.0 license
├── LICENSE-MIT              # MIT license
├── README.md                # User-facing documentation
├── AGENTS.md                # AI agent context file
├── opencode.json            # OpenCode configuration
├── .github/
│   └── workflows/
│       └── ci.yml           # CI: fmt, clippy, test, coverage
├── docs/
│   ├── ARCHITECTURE.md      # Full architecture spec
│   ├── framework.md         # Cost-benefit framework rationale
│   ├── workflow.md          # Practical workflow guide
│   ├── implementation.md   # Tools/models/guidelines
│   ├── phase-1.md through phase-4.md  # Phase plans
│   ├── issues/              # Blocking issues tracking
│   ├── reviews/             # Code review docs
│   └── research/
│       └── cost_benefit_analysis_framework.py
├── scripts/
│   └── benchmark.sh         # Manual benchmark script
├── benches/
│   └── graph_benchmarks.rs  # Criterion benchmarks
├── src/
│   ├── main.rs              # Binary entry point (thin: parse CLI, execute)
│   ├── lib.rs               # Library root - re-exports public API
│   ├── cli.rs               # CLI argument definitions (clap derive)
│   ├── task.rs              # Task, TaskFrontmatter, enums (serde types)
│   ├── graph.rs             # DependencyGraph (petgraph wrapper)
│   ├── error.rs             # Error enum (thiserror)
│   ├── config.rs            # Config loading (.taskgraph.toml)
│   ├── discovery.rs         # TaskCollection (directory scanning)
│   └── commands/
│       ├── mod.rs            # Command module re-exports
│       ├── init.rs           # `init` command
│       ├── validate.rs       # `validate` command
│       ├── list.rs           # `list` command
│       ├── show.rs            # `show` command
│       ├── deps.rs            # `deps` command
│       ├── topo.rs            # `topo` command
│       ├── cycles.rs          # `cycles` command
│       ├── parallel.rs        # `parallel` command
│       ├── critical.rs        # `critical` command
│       ├── bottleneck.rs      # `bottleneck` command
│       ├── risk.rs            # `risk` command
│       ├── decompose.rs       # `decompose` command
│       ├── workflow_cost.rs   # `workflow-cost` command
│       ├── risk_path.rs       # `risk-path` command
│       └── graph_cmd.rs       # `graph` command (DOT output)
└── tests/
    ├── integration/
    │   └── commands.rs        # 25 integration tests (assert_cmd)
    └── fixtures/
        ├── tasks/             # 3 valid tasks (one depends on another)
        ├── cycles/             # 3 tasks forming a cycle
        ├── invalid/            # 1 task with missing dependency
        ├── risk/               # 5 tasks with various risk levels
        └── decompose/          # 4 tasks for decomposition testing

Module Dependency Graph

lib.rs
  ├── cli          → commands::*, config, discovery, graph
  ├── commands/*   → cli, discovery, graph, task
  ├── config       → error
  ├── discovery    → task, error
  ├── error        → (thiserror, std, serde_yaml, serde_json)
  ├── graph        → discovery, task, petgraph
  └── task         → (serde, chrono, gray_matter, error)

Crates

This is a single crate project (not a Cargo workspace). It produces:

Library: libtaskgraph (from src/lib.rs)
Binary: taskgraph (from src/main.rs)

2. Cargo.toml Details

Package Metadata

Field	Value
name	`taskgraph`
version	`0.1.3`
edition	`2021`
license	`MIT OR Apache-2.0`
description	CLI tool for managing task dependencies using markdown files
repository	`https://github.com/alkimiadev/taskgraph`
keywords	`task`, `dependency`, `graph`, `cli`, `markdown`
categories	`command-line-utilities`, `development-tools`

Dependencies (Production)

Crate	Version	Features	Purpose
`petgraph`	`0.7`	-	Directed graph data structure & algorithms (toposort, cycle detection, etc.)
`gray_matter`	`0.2`	-	Markdown frontmatter extraction (YAML engine)
`serde`	`1.0`	`derive`	Serialization/deserialization framework
`serde_json`	`1.0`	-	JSON serialization (for `--format json` output)
`serde_yaml`	`0.9`	-	YAML serialization (for frontmatter parsing & roundtrip)
`clap`	`4.5`	`derive`	CLI argument parsing
`clap_complete`	`4.5`	-	Shell completion generation
`chrono`	`0.4`	`serde`	Date/time with serde support
`anyhow`	`1.0`	-	Ergonomic error handling (used in CLI/binary)
`thiserror`	`2.0`	-	Derived error types (used in library)
`dirs`	`6.0`	-	Platform directories (future: global config)
`walkdir`	`2.5`	-	Recursive directory walking
`tracing`	`0.1`	-	Structured logging
`tracing-subscriber`	`0.3`	`env-filter`	Log output formatting
`toml`	`0.8`	-	Config file parsing

Dev Dependencies

Crate	Version	Purpose
`tempfile`	`3.0`	Temporary directories for tests
`assert_cmd`	`2.0`	CLI integration testing
`predicates`	`3.0`	Assertion predicates for integration tests
`criterion`	`0.5`	Benchmarking framework

Features

[features]
default = []

No feature flags exist yet, which makes the crate a good candidate for an optional napi feature.

Release Profile

[profile.release]
opt-level = 3
lto = true
strip = true

3. Core Data Types and Public APIs

3.1 Task (`src/task.rs`)

The central data type. Represents a single task file.

/// A task with its content.
#[derive(Debug, Clone)]
pub struct Task {
    pub frontmatter: TaskFrontmatter,
    pub body: String,           // Markdown body content
    pub source: Option<String>, // Source file path (if loaded from file)
}

Methods:

Method	Signature	Returns	Description
`id()`	`&self -> &str`	Task ID	Accessor for frontmatter.id
`name()`	`&self -> &str`	Task name	Accessor for frontmatter.name
`status()`	`&self -> TaskStatus`	Status enum	Accessor for frontmatter.status
`depends_on()`	`&self -> &[String]`	Dependency list	Accessor for frontmatter.depends_on
`from_file()`	`&Path -> Result<Self>`	Parsed Task	Parse from a .md file on disk
`from_markdown()`	`&str, Option<String> -> Result<Self>`	Parsed Task	Parse from markdown string + optional source name
`to_markdown()`	`&self -> Result<String, serde_yaml::Error>`	Markdown string	Serialize back to markdown with YAML frontmatter

Key observation: Task itself does NOT derive Serialize or Deserialize. Only TaskFrontmatter does. The body and source fields are not serialized through serde - they're managed separately during parse/render.
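
Because body and source sit outside serde, parsing has to separate the YAML block from the markdown body before TaskFrontmatter can be deserialized. The sketch below illustrates that split and its inverse using only the standard library. It is not the crate's code: the real implementation delegates to gray_matter and serde_yaml, and the names `split_frontmatter` and `join_frontmatter`, as well as the assumption of `\n` line endings, are hypothetical.

```rust
/// Splits a markdown document into `(frontmatter, body)`.
/// Returns `None` when the document does not open with a `---` line
/// or when no closing `---` line is found.
/// Assumption: the input uses `\n` line endings.
pub fn split_frontmatter(content: &str) -> Option<(&str, &str)> {
    // The opening fence must be the very first line.
    let rest = content.strip_prefix("---\n")?;
    let mut offset = 0;
    for line in rest.split_inclusive('\n') {
        // The closing fence is a line consisting only of `---`.
        if line.trim_end() == "---" {
            let frontmatter = &rest[..offset];
            let body = &rest[offset + line.len()..];
            return Some((frontmatter, body));
        }
        offset += line.len();
    }
    None
}

/// Inverse of `split_frontmatter`: re-wraps the YAML text in `---` fences.
/// `frontmatter` is expected to end with a newline.
pub fn join_frontmatter(frontmatter: &str, body: &str) -> String {
    format!("---\n{frontmatter}---\n{body}")
}
```

In this sketch only the frontmatter slice would ever be handed to a YAML deserializer; the body is carried alongside it untouched, which mirrors why Task needs no serde derives of its own.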

3.2 TaskFrontmatter (`src/task.rs`)

The structured metadata extracted from YAML frontmatter:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TaskFrontmatter {
    pub id: String,
    pub name: String,
    #[serde(default)]
    pub status: TaskStatus,
    #[serde(default, rename = "depends_on")]
    pub depends_on: Vec<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub priority: Option<String>,
    #[serde(default, skip_serializing_if = "Vec::is_empty")]
    pub tags: Vec<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub created: Option<DateTime<Utc>>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub modified: Option<DateTime<Utc>>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub assignee: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub due: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub scope: Option<TaskScope>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub risk: Option<TaskRisk>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub impact: Option<TaskImpact>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub level: Option<TaskLevel>,
}

Serde details:

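All enums use #[serde(rename_all = "kebab-case")], so variants serialize as kebab-case strings (for example, InProgress becomes in-progress)
Optional fields use skip_serializing_if = "Option::is_none" to keep output clean
tags uses skip_serializing_if = "Vec::is_empty", so an empty list is omitted
depends_on carries rename = "depends_on", which matches the Rust field name and therefore only makes the YAML key explicit
status falls back to TaskStatus::Pending when omitted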

3.3 Enum Types (`src/task.rs`)

All enums derive Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, Default.

TaskStatus

#[serde(rename_all = "kebab-case")]
pub enum TaskStatus {
    Pending,       // default
    InProgress,    // "in-progress" in YAML/JSON
    Completed,
    Failed,
    Blocked,
}

Also implements Display (kebab-case strings).
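
To make the kebab-case mapping concrete, here is a std-only sketch of the string conversion that a NAPI layer could reuse when mapping TaskStatus to a JS string union. The source only confirms that Display yields kebab-case strings; the `as_str` helper and the FromStr impl with a String error are assumptions added for illustration, and serde derives are omitted so the block stands alone.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskStatus {
    #[default]
    Pending,
    InProgress,
    Completed,
    Failed,
    Blocked,
}

impl TaskStatus {
    /// Kebab-case name, matching the YAML/JSON representation.
    pub fn as_str(self) -> &'static str {
        match self {
            TaskStatus::Pending => "pending",
            TaskStatus::InProgress => "in-progress",
            TaskStatus::Completed => "completed",
            TaskStatus::Failed => "failed",
            TaskStatus::Blocked => "blocked",
        }
    }
}

impl fmt::Display for TaskStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

/// Hypothetical inverse mapping, useful when a status arrives from JS as a string.
impl FromStr for TaskStatus {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pending" => Ok(TaskStatus::Pending),
            "in-progress" => Ok(TaskStatus::InProgress),
            "completed" => Ok(TaskStatus::Completed),
            "failed" => Ok(TaskStatus::Failed),
            "blocked" => Ok(TaskStatus::Blocked),
            other => Err(format!("unknown task status: {other}")),
        }
    }
}
```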

TaskScope

#[serde(rename_all = "kebab-case")]
pub enum TaskScope {
    Single,    // ~500 tokens, cost 1.0
    Narrow,    // default, ~1500 tokens, cost 2.0
    Moderate,  // ~3000 tokens, cost 3.0
    Broad,     // ~6000 tokens, cost 4.0
    System,    // ~10000 tokens, cost 5.0
}

Methods: token_estimate() -> u32, cost_estimate() -> f64, Display
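
The numeric mappings can be read straight off the variant comments above. The following sketch restates them as code and adds one hypothetical helper, `total_tokens`, which assumes that a task without an explicit scope falls back to the default variant (Narrow); whether the crate treats a missing scope that way is not confirmed by the source.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskScope {
    Single,
    #[default]
    Narrow,
    Moderate,
    Broad,
    System,
}

impl TaskScope {
    /// Approximate token budget per scope, as listed in the variant comments.
    pub fn token_estimate(self) -> u32 {
        match self {
            TaskScope::Single => 500,
            TaskScope::Narrow => 1500,
            TaskScope::Moderate => 3000,
            TaskScope::Broad => 6000,
            TaskScope::System => 10000,
        }
    }

    /// Relative cost per scope, as listed in the variant comments.
    pub fn cost_estimate(self) -> f64 {
        match self {
            TaskScope::Single => 1.0,
            TaskScope::Narrow => 2.0,
            TaskScope::Moderate => 3.0,
            TaskScope::Broad => 4.0,
            TaskScope::System => 5.0,
        }
    }
}

/// Hypothetical helper: sums token estimates over optional scopes,
/// assuming a missing scope counts as the default (`Narrow`).
pub fn total_tokens(scopes: &[Option<TaskScope>]) -> u32 {
    scopes
        .iter()
        .map(|scope| scope.unwrap_or_default().token_estimate())
        .sum()
}
```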

TaskRisk

#[serde(rename_all = "kebab-case")]
pub enum TaskRisk {
    Trivial,   // p=0.98
    Low,       // default, p=0.90
    Medium,    // p=0.80
    High,      // p=0.65
    Critical,  // p=0.50
}

Methods: success_probability() -> f64, Display

TaskImpact

#[serde(rename_all = "kebab-case")]
pub enum TaskImpact {
    Isolated,   // default, weight 1.0
    Component,  // weight 1.5
    Phase,      // weight 2.0
    Project,    // weight 3.0
}

Methods: weight() -> f64, Display

TaskLevel

#[serde(rename_all = "kebab-case")]
pub enum TaskLevel {
    Planning,
    Decomposition,
    Implementation,  // default
    Review,
    Research,
}

Methods: Display only

3.4 DependencyGraph (`src/graph.rs`)

A directed graph of task dependencies built from a TaskCollection.

pub struct DependencyGraph {
    graph: DiGraph<TaskId, ()>,       // petgraph directed graph
    index_map: HashMap<TaskId, NodeIndex>,  // task ID -> node index
}

Edge direction: from -> to means "from must complete before to" (dependency must complete first).
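
The edge convention is easy to get backwards, so a small std-only sketch may help. `EdgeList` is a hypothetical stand-in for the petgraph-backed structure, not the crate's type; it only demonstrates that, with edges pointing from dependency to dependent, a task's dependencies are its incoming neighbors and its dependents are its outgoing neighbors.

```rust
/// Minimal stand-in for the graph: a list of `(from, to)` edges,
/// where `from` must complete before `to`.
#[derive(Debug, Default)]
pub struct EdgeList {
    edges: Vec<(String, String)>,
}

impl EdgeList {
    /// Records that `from` must complete before `to`.
    pub fn add_dependency(&mut self, from: &str, to: &str) {
        self.edges.push((from.to_string(), to.to_string()));
    }

    /// Direct dependencies of `id`: sources of edges that point at `id`.
    pub fn dependencies(&self, id: &str) -> Vec<String> {
        self.edges
            .iter()
            .filter(|(_, to)| to.as_str() == id)
            .map(|(from, _)| from.clone())
            .collect()
    }

    /// Direct dependents of `id`: targets of edges that start at `id`.
    pub fn dependents(&self, id: &str) -> Vec<String> {
        self.edges
            .iter()
            .filter(|(from, _)| from.as_str() == id)
            .map(|(_, to)| to.clone())
            .collect()
    }
}
```

With `add_dependency("t1", "t2")`, asking for the dependencies of `t2` yields `t1`, and asking for the dependents of `t1` yields `t2`.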

Public API:

Method	Signature	Returns	Description
`new()`	`-> Self`	Empty graph	Create empty graph
`from_collection()`	`&TaskCollection -> Self`	Built graph	Build from discovered tasks
`from_tasks()`	`Vec<&Task> -> Self`	Built graph	Build from explicit task list
`add_task()`	`&mut self, TaskId`	()	Add node
`add_dependency()`	`&mut self, &str, &str`	()	Add edge (from->to); silently ignores unknown IDs
`has_cycles()`	`&self -> bool`	Boolean	Uses `petgraph::algo::is_cyclic_directed`
`find_cycles()`	`&self -> Vec<Vec<TaskId>>`	Cycles	Custom DFS cycle finder
`topological_order()`	`&self -> Option<Vec<TaskId>>`	Order or None	Uses `petgraph::algo::toposort`
`dependencies()`	`&self, &str -> Vec<TaskId>`	Incoming neighbors	What this task depends on (direct)
`dependents()`	`&self, &str -> Vec<TaskId>`	Outgoing neighbors	What depends on this (direct)
`parallel_groups()`	`&self -> Vec<Vec<TaskId>>`	Generations	Tasks grouped by level (can run concurrently)
`critical_path()`	`&self -> Vec<TaskId>`	Path	Longest path through the graph
`weighted_critical_path()`	`&self, F: Fn(&str)->f64 -> Vec<TaskId>`	Weighted path	Path with highest cumulative weight
`bottlenecks()`	`&self -> Vec<(TaskId, usize)>`	Ranked list	Betweenness centrality via path counting
`to_dot()`	`&self -> String`	DOT string	GraphViz DOT format export

Also implements Default (returns new()).
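
The grouping that `parallel_groups()` returns can be pictured as repeated peeling of tasks whose dependencies are all satisfied. The sketch below does this with Kahn-style layering over plain slices. It is an assumption about the shape of the result, not the crate's implementation: the free function, its slice-based inputs, and the alphabetical ordering inside each group are all chosen for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups nodes into generations: every node in group `k` depends only on
/// nodes in earlier groups. Edges are `(from, to)` with `from` first.
/// Edges that mention unknown nodes are ignored; nodes on a cycle never
/// become ready and are therefore left out of the result.
pub fn parallel_groups<'a>(nodes: &[&'a str], edges: &[(&'a str, &'a str)]) -> Vec<Vec<String>> {
    let mut indegree: BTreeMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(from, to) in edges {
        if !indegree.contains_key(from) || !indegree.contains_key(to) {
            continue;
        }
        *indegree.get_mut(to).unwrap() += 1;
        dependents.entry(from).or_default().push(to);
    }

    // First generation: nodes with no dependencies at all.
    let mut ready: BTreeSet<&str> = indegree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(node, _)| *node)
        .collect();

    let mut groups = Vec::new();
    while !ready.is_empty() {
        let mut next = BTreeSet::new();
        for node in &ready {
            if let Some(targets) = dependents.get(*node) {
                for &target in targets {
                    let degree = indegree.get_mut(target).unwrap();
                    *degree -= 1;
                    // A node becomes ready once its last dependency is placed.
                    if *degree == 0 {
                        next.insert(target);
                    }
                }
            }
        }
        groups.push(ready.iter().map(|n| n.to_string()).collect());
        ready = next;
    }
    groups
}
```

For a diamond (`a` before `b` and `c`, both before `d`), this yields three groups: `a`, then `b` and `c` together, then `d`.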

Important: DependencyGraph does NOT implement Serialize/Deserialize. It's a compute-only structure built fresh each time from tasks.

3.5 TaskCollection (`src/discovery.rs`)

Collection of tasks discovered from a directory:

#[derive(Debug, Default)]
pub struct TaskCollection {
    tasks: HashMap<String, Task>,        // Tasks indexed by ID
    paths: HashMap<String, PathBuf>,     // File paths indexed by ID
    errors: Vec<DiscoveryError>,         // Parse errors encountered
}

Public API:

Method	Signature	Returns	Description
`new()`	`-> Self`	Empty collection	Constructor
`from_directory()`	`&Path -> Self`	Populated collection	Scan directory recursively for .md files
`get()`	`&self, &str -> Option<&Task>`	Task or None	Lookup by ID
`path()`	`&self, &str -> Option<&PathBuf>`	Path or None	File path for task ID
`tasks()`	`&self -> impl Iterator<Item = &Task>`	Iterator	All tasks
`ids()`	`&self -> impl Iterator<Item = &str>`	Iterator	All task IDs
`len()`	`&self -> usize`	Count	Number of tasks
`is_empty()`	`&self -> bool`	Boolean	Empty check
`errors()`	`&self -> &[DiscoveryError]`	Errors	Parse errors from discovery
`missing_dependencies()`	`&self -> HashMap<String, Vec<String>>`	Map	Task ID -> missing dep IDs
`validate()`	`&self -> ValidationResult`	Result	Full validation

Important: TaskCollection does NOT implement Serialize/Deserialize either. It's built procedurally.
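
The return shape of `missing_dependencies()` (task ID mapped to the dependency IDs that do not resolve) can be sketched with std collections alone. The function below is a hypothetical free-standing version that takes a map of task ID to declared dependencies instead of a TaskCollection; the real method presumably reads the same information from each task's frontmatter.

```rust
use std::collections::HashMap;

/// For every task, lists the declared dependencies that are not themselves
/// present in the collection. Tasks with no missing dependencies are omitted.
pub fn missing_dependencies(
    tasks: &HashMap<String, Vec<String>>,
) -> HashMap<String, Vec<String>> {
    let mut missing = HashMap::new();
    for (id, deps) in tasks {
        let absent: Vec<String> = deps
            .iter()
            .filter(|dep| !tasks.contains_key(*dep))
            .cloned()
            .collect();
        if !absent.is_empty() {
            missing.insert(id.clone(), absent);
        }
    }
    missing
}
```

On the JS side this shape corresponds to the `Record<string, string[]>` proposed for `collection.missingDependencies()`.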

3.6 DiscoveryError (`src/discovery.rs`)

#[derive(Debug, Clone)]
pub struct DiscoveryError {
    pub path: PathBuf,
    pub message: String,
}

No serde derives. Simple struct for error reporting.

3.7 ValidationResult (`src/discovery.rs`)

#[derive(Debug)]
pub struct ValidationResult {
    pub task_count: usize,
    pub errors: Vec<DiscoveryError>,
    pub missing_dependencies: HashMap<String, Vec<String>>,
}

Methods: is_valid() -> bool, issue_count() -> usize

No serde derives on the Rust type itself, but it's converted to ValidationOutput (which does derive Serialize) in the validate command.

3.8 Config (`src/config.rs`)

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct Config {
    #[serde(default)]
    pub project: ProjectConfig,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ProjectConfig {
    #[serde(default = "default_tasks_dir")]
    pub tasks_dir: String,  // default: "tasks"
}

API:

Method	Signature	Returns	Description
`from_file()`	`&Path -> Result<Self>`	Config	Load from .taskgraph.toml
`find_and_load()`	`-> Option<Self>`	Config or None	Search up directory tree
`tasks_path()`	`&self -> PathBuf`	Path	Get tasks directory

3.9 CLI Types (`src/cli.rs`)

#[derive(Clone, Copy, Debug, Default, ValueEnum)]
pub enum OutputFormat {
    Plain,   // default
    Json,
}

#[derive(Parser, Debug)]
pub struct Cli {
    pub path: Option<String>,
    pub format: OutputFormat,
    pub command: Commands,
}

#[derive(Subcommand, Debug)]
pub enum Commands {
    Init { id, name, scope, risk },
    Validate { strict },
    List { status, tag },
    Show { id },
    Deps { id },
    Dependents { id },
    Topo { status },
    Cycles,
    Parallel,
    Critical,
    Bottleneck,
    Risk,
    Decompose,
    WorkflowCost { include_completed, limit },
    RiskPath,
    Graph { output },
    Completions { shell },
}

The Cli::execute() method dispatches all commands, building a TaskCollection from the tasks directory for each one.

3.10 Lib.rs Public Re-exports

pub mod cli;
pub mod commands;
pub mod config;
pub mod discovery;
pub mod error;
pub mod graph;
pub mod task;

pub use config::Config;
pub use discovery::{DiscoveryError, TaskCollection, ValidationResult};
pub use error::{Error, Result};
pub use graph::DependencyGraph;
pub use task::{Task, TaskFrontmatter, TaskImpact, TaskLevel, TaskRisk, TaskScope, TaskStatus};

4. Functions/Methods to Expose via NAPI

Priority 1: Core Data Types (Must Have)

These are the foundational types that everything else depends on:

Rust Type	NAPI Class	Why
`Task`	`Task`	Central unit of work; must be creatable, readable, serializable from JS
`TaskFrontmatter`	Embedded in `Task` or separate class	All metadata is here; JS needs to read/write fields
`TaskStatus`	String enum mapping	Simple 5-variant enum; map to JS string union
`TaskScope`	String enum mapping	5 variants with numeric mappings; map to JS string union
`TaskRisk`	String enum mapping	5 variants with probability; map to JS string union
`TaskImpact`	String enum mapping	4 variants with weight; map to JS string union
`TaskLevel`	String enum mapping	5 variants; map to JS string union

Priority 2: Core Functions (Must Have)

Rust Function	NAPI Method	Input	Output	Why
`Task::from_markdown()`	`Task.fromMarkdown(content, source?)`	`string, string?`	`Task`	Parse task from markdown string
`Task::from_file()`	`Task.fromFile(path)`	`string`	`Task`	Parse task from file path
`Task::to_markdown()`	`task.toMarkdown()`	-	`string`	Serialize task back to markdown
`Task::id()`	`task.id` (getter)	-	`string`	Accessor
`Task::name()`	`task.name` (getter)	-	`string`	Accessor
`Task::status()`	`task.status` (getter)	-	`string`	Accessor
`Task::depends_on()`	`task.dependsOn` (getter)	-	`string[]`	Accessor
`TaskScope::token_estimate()`	`scope.tokenEstimate()`	-	`number`	Numeric mapping
`TaskScope::cost_estimate()`	`scope.costEstimate()`	-	`number`	Numeric mapping
`TaskRisk::success_probability()`	`risk.successProbability()`	-	`number`	Numeric mapping
`TaskImpact::weight()`	`impact.weight()`	-	`number`	Numeric mapping

Priority 3: Collection & Discovery (Must Have)

Rust Function	NAPI Method	Input	Output	Why
`TaskCollection::from_directory()`	`TaskCollection.fromDirectory(path)`	`string`	`TaskCollection`	Primary entry point: discover all tasks
`TaskCollection::new()`	`new TaskCollection()`	-	`TaskCollection`	Empty constructor for building manually
`TaskCollection::get()`	`collection.get(id)`	`string`	`Task | null`	Lookup by ID
`TaskCollection::len()`	`collection.length` (getter)	-	`number`	Task count
`TaskCollection::ids()`	`collection.ids()`	-	`string[]`	All task IDs
`TaskCollection::tasks()`	`collection.tasks()`	-	`Task[]`	All tasks
`TaskCollection::errors()`	`collection.errors` (getter)	-	`DiscoveryError[]`	Parse errors
`TaskCollection::missing_dependencies()`	`collection.missingDependencies()`	-	`Record<string, string[]>`	Find broken deps
`TaskCollection::validate()`	`collection.validate()`	-	`ValidationResult`	Full validation

Priority 4: Graph Operations (Must Have)

Rust Function	NAPI Method	Input	Output	Why
`DependencyGraph::from_collection()`	`DependencyGraph.fromCollection(collection)`	`TaskCollection`	`DependencyGraph`	Build graph
`DependencyGraph::new()`	`new DependencyGraph()`	-	`DependencyGraph`	Empty graph constructor
`DependencyGraph::from_tasks()`	`DependencyGraph.fromTasks(tasks[])`	`Task[]`	`DependencyGraph`	Build from JS array
`add_task()`	`graph.addTask(id)`	`string`	`void`	Add node
`add_dependency()`	`graph.addDependency(from, to)`	`string, string`	`void`	Add edge
`has_cycles()`	`graph.hasCycles()`	-	`boolean`	Cycle detection
`find_cycles()`	`graph.findCycles()`	-	`string[][]`	Get actual cycles
`topological_order()`	`graph.topologicalOrder()`	-	`string[] | null`	Execution order
`dependencies()`	`graph.dependencies(id)`	`string`	`string[]`	Direct deps
`dependents()`	`graph.dependents(id)`	`string`	`string[]`	What depends on this
`parallel_groups()`	`graph.parallelGroups()`	-	`string[][]`	Parallel work groups
`critical_path()`	`graph.criticalPath()`	-	`string[]`	Longest path
`weighted_critical_path()`	`graph.weightedCriticalPath(weightFn)`	`(id: string) => number`	`string[]`	Weighted longest path
`bottlenecks()`	`graph.bottlenecks()`	-	`[string, number][]`	Betweenness centrality
`to_dot()`	`graph.toDot()`	-	`string`	GraphViz DOT format

Priority 5: Config (Nice to Have)

Rust Function	NAPI Method	Input	Output	Why
`Config::from_file()`	`Config.fromFile(path)`	`string`	`Config`	Load config
`Config::find_and_load()`	`Config.findAndLoad()`	-	`Config | null`	Auto-discover config
`Config::tasks_path()`	`config.tasksPath` (getter)	-	`string`	Get tasks dir

Priority 6: Workflow Cost Calculation (Nice to Have)

The workflow_cost command uses calculate_task_ev() which is a private function. Consider exposing:

Function	NAPI Method	Input	Output	Why
`calculate_task_ev()` (currently private)	`calculateTaskEv(p, scopeCost, impactWeight)`	`number, number, number`	`number`	Expected value calculation

This would need to be made pub or reimplemented in the NAPI layer.

Notes on `weighted_critical_path` for NAPI

weighted_critical_path takes a Rust closure F: Fn(&str) -> f64. The NAPI binding would need to do one of the following:

Option A: accept a JavaScript callback, (taskId: string) => number
Option B: accept a Record<string, number> map of task ID -> weight

Option B is simpler and avoids the overhead of calling back into JavaScript for every task. The two candidate signatures:

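// Candidate signatures on DependencyGraph; only one would be shipped.
// The interface names are illustrative.
interface WeightedPathOptionA {
  // Callback approach (more complex)
  weightedCriticalPath(weightFn: (taskId: string) => number): string[];
}
interface WeightedPathOptionB {
  // Map approach (simpler)
  weightedCriticalPath(weights: Record<string, number>): string[];
}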
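
On the Rust side, the map approach reduces to a longest-weighted-path computation in which the closure is replaced by a lookup. The sketch below shows one way to do that with std only: a topological order via Kahn's algorithm, followed by dynamic programming over predecessors. It is not the crate's petgraph-based implementation. The free function, its slice inputs, and the fallback weight of 1.0 for IDs missing from the map are assumptions.

```rust
use std::collections::HashMap;

/// Returns the path with the highest cumulative weight.
/// Edges are `(from, to)`, meaning `from` must complete before `to`.
/// Assumptions: node IDs are unique; nodes on a cycle are ignored.
pub fn weighted_critical_path<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
    weights: &HashMap<String, f64>,
) -> Vec<String> {
    // Assumption: tasks missing from the map weigh 1.0.
    let weight = |id: &str| weights.get(id).copied().unwrap_or(1.0);

    // Keep only edges whose endpoints are known nodes.
    let mut indegree: HashMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let edges: Vec<(&str, &str)> = edges
        .iter()
        .copied()
        .filter(|(from, to)| indegree.contains_key(*from) && indegree.contains_key(*to))
        .collect();
    for &(_, to) in &edges {
        *indegree.get_mut(to).unwrap() += 1;
    }

    // Kahn's algorithm: repeatedly take a node with no unplaced dependencies.
    let mut ready: Vec<&str> = nodes.iter().copied().filter(|n| indegree[*n] == 0).collect();
    let mut order: Vec<&str> = Vec::new();
    while let Some(node) = ready.pop() {
        order.push(node);
        for &(from, to) in &edges {
            if from == node {
                let degree = indegree.get_mut(to).unwrap();
                *degree -= 1;
                if *degree == 0 {
                    ready.push(to);
                }
            }
        }
    }

    // best[n] = (heaviest path weight ending at n, predecessor on that path).
    let mut best: HashMap<&str, (f64, Option<&str>)> = HashMap::new();
    for &node in &order {
        let mut entry = (weight(node), None);
        for &(from, to) in &edges {
            if to == node {
                if let Some(&(cost, _)) = best.get(from) {
                    if cost + weight(node) > entry.0 {
                        entry = (cost + weight(node), Some(from));
                    }
                }
            }
        }
        best.insert(node, entry);
    }

    // Walk back from the heaviest endpoint to recover the path.
    let mut current = order
        .iter()
        .copied()
        .max_by(|a, b| best[*a].0.total_cmp(&best[*b].0));
    let mut path = Vec::new();
    while let Some(node) = current {
        path.push(node.to_string());
        current = best[node].1;
    }
    path.reverse();
    path
}
```

A binding that receives a `Record<string, number>` could also leave the existing closure-based method untouched and simply pass it a closure that performs the same map lookup.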

5. Serialization (Serde) Support

Full Serde Support (Serialize + Deserialize)

Type	Serialize	Deserialize	Notes
`TaskStatus`	Yes	Yes	`rename_all = "kebab-case"`
`TaskScope`	Yes	Yes	`rename_all = "kebab-case"`
`TaskRisk`	Yes	Yes	`rename_all = "kebab-case"`
`TaskImpact`	Yes	Yes	`rename_all = "kebab-case"`
`TaskLevel`	Yes	Yes	`rename_all = "kebab-case"`
`TaskFrontmatter`	Yes	Yes	Rich serde attributes (skip_serializing_if, rename, default)
`Config`	Yes	Yes	Via TOML
`ProjectConfig`	Yes	Yes	Via TOML

No Serde Support

Type	Serialize	Deserialize	Reason
`Task`	No	No	`body` and `source` are separate from frontmatter; `to_markdown()` handles serialization manually
`DependencyGraph`	No	No	Computed structure; rebuilt from tasks each time
`TaskCollection`	No	No	Procedurally built from directory scanning
`DiscoveryError`	No	No	Error reporting struct
`ValidationResult`	No	No	Internal result type
`Error`	No	No	Error enum
`OutputFormat`	No	No	CLI-only (ValueEnum, not serde)
`Cli`	No	No	CLI-only (clap derive)
`Commands`	No	No	CLI-only enum

JSON Serialization in Commands (Ad-hoc)

Several command modules define private structs that derive Serialize for JSON output:

File	Struct	Fields
`validate.rs`	`ValidationOutput`	valid, task_count, error_count, errors[], missing_deps
`validate.rs`	`ValidationError`	path, message
`list.rs`	`TaskSummary`	id, name, status, scope
`show.rs`	`TaskDetails`	id, name, status, depends_on, scope, risk, impact, level, tags, body
`deps.rs`	`DependencyInfo`	id, status, exists
`deps.rs`	`DependenciesOutput`	task_id, dependencies[]
`topo.rs`	`TopoTask`	position, id, name, status
`topo.rs`	`TopoOutput`	order[], has_cycles
`cycles.rs`	`CyclesOutput`	has_cycles, cycle_count, cycles[]
`workflow_cost.rs`	`TaskCost`	id, name, cost

These are private to each command module and not part of the public API. For NAPI, we would define equivalent TypeScript interfaces or create new public serializable structs.

Serialization Format Details

YAML (frontmatter): TaskFrontmatter uses serde_yaml with:

rename_all = "kebab-case" on enums → in-progress, narrow, high, etc.
rename = "depends_on" on the depends_on field (explicit)
default on every field that may be omitted (status, depends_on, and all optional fields)
skip_serializing_if = "Option::is_none" for optional fields
skip_serializing_if = "Vec::is_empty" for tags

JSON (output): Uses serde_json::to_string_pretty() in commands.

TOML (config): Config uses toml::from_str().

Roundtrip: Task::from_markdown() + Task::to_markdown() should produce equivalent output (tested implicitly).

6. Error Types and Error Handling

Library Error Type (`src/error.rs`)

#[derive(Error, Debug)]
pub enum Error {
    #[error("Task not found: {0}")]
    TaskNotFound(String),

    #[error("Task already exists: {0}")]
    TaskAlreadyExists(String),

    #[error("Circular dependency detected: {0}")]
    CircularDependency(String),

    #[error("Invalid frontmatter in {file}: {message}")]
    InvalidFrontmatter { file: String, message: String },

    #[error("Missing required field '{field}' in {file}")]
    MissingField { file: String, field: String },

    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),

    #[error("YAML parsing error: {0}")]
    Yaml(#[from] serde_yaml::Error),

    #[error("JSON serialization error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("Graph error: {0}")]
    Graph(String),
}

pub type Result<T> = std::result::Result<T, Error>;

Error conversion: From impls via #[from] for std::io::Error, serde_yaml::Error, serde_json::Error.

Usage patterns:

Library code returns crate::Result<T> (= Result<T, Error>)
anyhow::Result is used only at the binary boundary (main.rs and Cli::execute())
thiserror provides Display impls automatically

CLI Error Handling

Cli::execute() returns anyhow::Result<()>, while each command function returns crate::Result<()>. The ? operator bridges the two: crate::Error implements std::error::Error (via thiserror), so it converts into anyhow::Error automatically.

Error handling at boundaries:

Task::from_file(): IO errors → Error::Io, parse errors → Error::InvalidFrontmatter
TaskCollection::from_directory(): Silently skips files without frontmatter, stores errors in DiscoveryError list (non-fatal)
Config::from_file(): TOML parse errors → Error::Graph(format!(...)) (note: reuses Graph variant)
Command functions: Error::TaskNotFound when task ID missing, Error::TaskAlreadyExists on duplicate init

NAPI Error Mapping Strategy

For the Node.js wrapper, we should map:

Rust Error	Node.js Error	Notes
`TaskNotFound(id)`	Generic `Error` with message	JS: `throw new Error("Task not found: <id>")`
`TaskAlreadyExists(id)`	Generic `Error` with message	JS: `throw new Error("Task already exists: <id>")`
`CircularDependency(msg)`	Generic `Error` with message	JS: `throw new Error("Circular dependency detected: <msg>")`
`InvalidFrontmatter { file, message }`	Generic `Error` with message	JS: `throw new Error("Invalid frontmatter in <file>: <message>")`
`MissingField { file, field }`	Generic `Error` with message	JS: `throw new Error("Missing required field '<field>' in <file>")`
`Io(err)`	Generic `Error` with message	JS: `throw new Error("IO error: <message>")`
`Yaml(err)`	Generic `Error` with message	JS: `throw new Error("YAML parsing error: <message>")`
`Json(err)`	Generic `Error` with message	JS: `throw new Error("JSON serialization error: <message>")`
`Graph(msg)`	Generic `Error` with message	JS: `throw new Error("Graph error: <msg>")`

Alternatively, we could create custom JS error classes for better programmatic handling:

class TaskNotFoundError extends Error { taskId: string }
class CircularDependencyError extends Error { }
class InvalidFrontmatterError extends Error { file: string; message: string }
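
Either strategy needs the Rust side to tell JavaScript which variant occurred. One possible approach, sketched below, pairs each variant with a stable code string next to its Display message, so the JS layer can either throw a generic Error or pick a custom class by code. The trimmed-down `Error` enum is a std-only stand-in for the thiserror-derived one (the variants that wrap other error types are omitted), and the code strings and the `to_js_error_parts` helper are hypothetical.

```rust
use std::fmt;

/// Reduced copy of the library error enum, for illustration only.
#[derive(Debug)]
pub enum Error {
    TaskNotFound(String),
    CircularDependency(String),
    InvalidFrontmatter { file: String, message: String },
    Graph(String),
}

impl fmt::Display for Error {
    // Messages mirror the #[error(...)] strings of the original enum.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::TaskNotFound(id) => write!(f, "Task not found: {id}"),
            Error::CircularDependency(msg) => write!(f, "Circular dependency detected: {msg}"),
            Error::InvalidFrontmatter { file, message } => {
                write!(f, "Invalid frontmatter in {file}: {message}")
            }
            Error::Graph(msg) => write!(f, "Graph error: {msg}"),
        }
    }
}

/// Hypothetical stable identifier per variant, for programmatic handling in JS.
pub fn error_code(err: &Error) -> &'static str {
    match err {
        Error::TaskNotFound(_) => "TASK_NOT_FOUND",
        Error::CircularDependency(_) => "CIRCULAR_DEPENDENCY",
        Error::InvalidFrontmatter { .. } => "INVALID_FRONTMATTER",
        Error::Graph(_) => "GRAPH_ERROR",
    }
}

/// The `(code, message)` pair a binding layer could attach to the thrown JS error.
pub fn to_js_error_parts(err: &Error) -> (&'static str, String) {
    (error_code(err), err.to_string())
}
```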

7. Input/Output Patterns

Data Flow Overview

                    DISCOVERY
tasks/*.md files ──────────────> TaskCollection
   (disk)                       (HashMap<String, Task>)
                                     │
                                     │ from_collection() / from_tasks()
                                     ▼
                               DependencyGraph
                               (DiGraph<String, ()>)
                                     │
                    ┌────────────────┼────────────────────┐
                    │                 │                     │
                    ▼                 ▼                     ▼
              topological       parallel_groups      critical_path
              order()           ()                   ()
                    │                 │                     │
                    └────────────────┴─────────────────────┘
                                     │
                                     ▼
                              Output (plain/JSON)

Input Patterns

File-based input (primary): TaskCollection::from_directory(path) scans a directory recursively for .md files, parses each, and builds the collection. This is the main entry point.
String-based input: Task::from_markdown(content, source) parses a single markdown string. Useful for programmatic construction.
Path-based input: Task::from_file(path) reads a single file and parses it.
Programmatic construction: DependencyGraph::new() + add_task() + add_dependency() for building graphs manually.

Output Patterns

Plain text (default): Human-readable terminal output with tables, arrows, and formatting.
JSON output (--format json): Structured JSON using ad-hoc Serialize structs in each command. This is the primary programmatic output format.
DOT format: DependencyGraph::to_dot() returns GraphViz DOT format string.
Markdown roundtrip: Task::to_markdown() produces valid markdown with YAML frontmatter.

Typical Usage Flow

// 1. Discover tasks
let collection = TaskCollection::from_directory(Path::new("./tasks"));

// 2. Validate
let result = collection.validate();
if !result.is_valid() { /* handle errors */ }

// 3. Build graph
let graph = DependencyGraph::from_collection(&collection);

// 4. Analyze
let has_cycles = graph.has_cycles();
let order = graph.topological_order();
let parallel = graph.parallel_groups();
let critical = graph.critical_path();
let bottlenecks = graph.bottlenecks();

NAPI Data Flow Design

For the Node.js wrapper, the recommended data flow is:

// Option A: File-based (mirrors Rust CLI)
const collection = TaskCollection.fromDirectory('./tasks');
const graph = DependencyGraph.fromCollection(collection);

// Option B: Programmatic (unique to NAPI)
const tasks = [
  Task.fromMarkdown('---\nid: t1\nname: Task 1\n---\nBody'),
  Task.fromMarkdown('---\nid: t2\nname: Task 2\ndepends_on: [t1]\n---\nBody'),
];
const graph = DependencyGraph.fromTasks(tasks);

// Option C: Manual graph construction
const graph = new DependencyGraph();
graph.addTask('t1');
graph.addTask('t2');
graph.addDependency('t1', 't2');

Memory/Ownership Considerations for NAPI

Task is Clone; a clone deep-copies the body String, the TaskFrontmatter, and the optional source path, so its cost grows with the size of the task body
TaskCollection owns all Task objects (HashMap<String, Task>)
DependencyGraph owns the graph structure (not the tasks themselves; only stores task IDs as node weights)
DependencyGraph::from_collection() borrows &TaskCollection (doesn't take ownership)
Task::from_file() and from_markdown() return owned Task values

For NAPI, we need to decide:

Should TaskCollection hold JS-managed task objects or Rust-owned? Probably Rust-owned (tasks are parsed from files/strings, not constructed in JS).
Should graph operations return strings or Task references? Currently returns Vec<TaskId> (strings). The JS side can look up tasks from the collection. This is efficient.
Should DependencyGraph keep a reference to TaskCollection? Currently no. This means JS must pass the collection alongside the graph for enriched output. We could create a combined TaskGraph class in the NAPI layer.
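
The combined class mentioned in the last point could look roughly like the following. This is a sketch under stated assumptions, not a proposed final design: `TaskInfo` is a hypothetical reduced task record, `TaskGraph` owns the tasks on the Rust side, and graph queries return full records rather than bare IDs so that JS callers do not need a second lookup.

```rust
use std::collections::HashMap;

/// Hypothetical reduced task record (the real Task carries far more metadata).
#[derive(Debug, Clone, PartialEq)]
pub struct TaskInfo {
    pub id: String,
    pub name: String,
    pub depends_on: Vec<String>,
}

impl TaskInfo {
    pub fn new(id: &str, name: &str, depends_on: &[&str]) -> Self {
        Self {
            id: id.to_string(),
            name: name.to_string(),
            depends_on: depends_on.iter().map(|d| d.to_string()).collect(),
        }
    }
}

/// Owns the tasks and answers graph queries with enriched results.
#[derive(Debug, Default)]
pub struct TaskGraph {
    tasks: HashMap<String, TaskInfo>,
}

impl TaskGraph {
    pub fn from_tasks(tasks: Vec<TaskInfo>) -> Self {
        let tasks = tasks.into_iter().map(|t| (t.id.clone(), t)).collect();
        Self { tasks }
    }

    /// Direct dependencies as full records; IDs that do not resolve are skipped.
    pub fn dependencies(&self, id: &str) -> Vec<&TaskInfo> {
        match self.tasks.get(id) {
            Some(task) => task
                .depends_on
                .iter()
                .filter_map(|dep| self.tasks.get(dep))
                .collect(),
            None => Vec::new(),
        }
    }

    /// Direct dependents as full records, sorted by ID for stable output.
    pub fn dependents(&self, id: &str) -> Vec<&TaskInfo> {
        let mut found: Vec<&TaskInfo> = self
            .tasks
            .values()
            .filter(|t| t.depends_on.iter().any(|dep| dep.as_str() == id))
            .collect();
        found.sort_by(|a, b| a.id.cmp(&b.id));
        found
    }
}
```

The trade-off is that such a wrapper duplicates some of what TaskCollection and DependencyGraph already do separately, in exchange for a single object to pass across the NAPI boundary.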

8. Existing Tests and Benchmarks

Unit Tests (in-source)

File	Test Count	Key Tests
`src/graph.rs`	12	Empty graph, add task/dep, missing deps, cycle detection, topo sort, parallel groups, critical path, bottleneck, DOT output, unknown task queries
`src/discovery.rs`	5	Single task discovery, skip files without frontmatter, duplicate ID detection, missing dependencies, validation result
`src/config.rs`	2	Default config, load from file

Integration Tests (`tests/integration/commands.rs`)

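Tests use assert_cmd. The table below lists 31 tests, although the directory layout in section 1 records 25; one of the two counts is stale: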

| Test | Command | What It Verifies |
| --- | --- | --- |
| `test_list_command` | `list` | Lists all 3 fixture tasks |
| `test_list_with_status_filter` | `list --status completed` | Filters correctly |
| `test_show_command` | `show task-one` | Shows task details |
| `test_show_missing_task` | `show missing-task` | Fails on missing |
| `test_validate_command` | `validate` | Succeeds on valid fixtures |
| `test_validate_with_missing_dependency` | `validate` (invalid) | Reports missing deps |
| `test_topo_command` | `topo` | Outputs topological order |
| `test_deps_command` | `deps task-two` | Shows task-one as dependency |
| `test_dependents_command` | `dependents task-one` | Shows tasks two and three |
| `test_cycles_command_no_cycles` | `cycles` | No cycles in valid fixtures |
| `test_cycles_command_with_cycles` | `cycles` (cycles fixtures) | Detects cycle |
| `test_parallel_command` | `parallel` | Shows generation groups |
| `test_critical_command` | `critical` | Shows critical path |
| `test_graph_command` | `graph` | Outputs DOT format |
| `test_bottleneck_command` | `bottleneck` | Shows bottleneck tasks |
| `test_init_command` | `init new-task` | Creates file |
| `test_init_duplicate_task` | `init task-one` | Fails on duplicate |
| `test_init_with_options` | `init --scope narrow --risk low` | Writes scope/risk to file |
| `test_risk_command` | `risk` | Distribution with counts |
| `test_risk_command_empty` | `risk` (empty dir) | "No tasks found" |
| `test_decompose_command` | `decompose` | Flags high-risk/broad-scope tasks |
| `test_decompose_command_none_needed` | `decompose` (low-risk tasks) | "No tasks need decomposition" |
| `test_workflow_cost_command` | `workflow-cost` | Shows cost analysis |
| `test_workflow_cost_command_empty` | `workflow-cost` (empty) | "No tasks found" |
| `test_risk_path_command` | `risk-path` | Shows risk path |
| `test_risk_path_command_empty` | `risk-path` (empty) | "No tasks found" |
| `test_help_flag` | `--help` | Shows help text |
| `test_version_flag` | `--version` | Succeeds |
| `test_completions_bash` | `completions bash` | Bash completion output |
| `test_completions_zsh` | `completions zsh` | Zsh completion output |
| `test_completions_fish` | `completions fish` | Fish completion output |

Benchmark Suite (`benches/graph_benchmarks.rs`)

Uses Criterion. Two benchmark groups:

load_tasks: Measures TaskCollection::from_directory() + DependencyGraph::from_collection() for 50, 100, 500, 1000 tasks.
graph_ops: On 1000-task graph, measures:
- topological_sort_1000
- cycle_detection_1000
- critical_path_1000
- bottlenecks_1000

Test data: linear chain of tasks (task-i depends on task-(i-1)).

Performance Numbers (from README)

| Tasks | Load Time | Topo Sort | Cycles | Critical Path |
| --- | --- | --- | --- | --- |
| 50 | 3ms | 3ms | 2ms | 8ms |
| 500 | 19ms | 21ms | 14ms | 52ms |
| 1,000 | 34ms | 42ms | 26ms | 82ms |

(Benchmarked on AMD EPYC 9004 series)

CI Pipeline (`.github/workflows/ci.yml`)

Two jobs:

Test: checkout -> install Rust (with rustfmt, clippy) -> cache -> fmt check -> clippy -> test -> build release
Coverage: checkout -> install Rust -> cache -> install cargo-llvm-cov -> generate lcov -> upload to Codecov

Test Coverage

Reported at 89% (meeting the 80% target from AGENTS.md).

Appendix A: Complete Type Reference for NAPI Mapping

Enums to JS String Unions

// task.ts
type TaskStatus = "pending" | "in-progress" | "completed" | "failed" | "blocked";
type TaskScope = "single" | "narrow" | "moderate" | "broad" | "system";
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";
type TaskImpact = "isolated" | "component" | "phase" | "project";
type TaskLevel = "planning" | "decomposition" | "implementation" | "review" | "research";
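
Because these unions only exist at compile time, strings arriving from the native layer or from user input are not checked at runtime. A minimal sketch of a runtime guard for one of them, assuming the same kebab-case values as above (`TASK_STATUSES`, `isTaskStatus`, and `parseTaskStatus` are hypothetical helper names):

```typescript
// Single source of truth: the union type is derived from this array.
const TASK_STATUSES = ["pending", "in-progress", "completed", "failed", "blocked"] as const;
type TaskStatus = (typeof TASK_STATUSES)[number];

// Type guard: narrows an arbitrary value to TaskStatus.
function isTaskStatus(value: unknown): value is TaskStatus {
  return (
    typeof value === "string" &&
    (TASK_STATUSES as readonly string[]).indexOf(value) !== -1
  );
}

// Throwing variant for call sites that cannot continue with a bad value.
function parseTaskStatus(value: unknown): TaskStatus {
  if (!isTaskStatus(value)) {
    throw new TypeError(`Unknown task status: ${String(value)}`);
  }
  return value;
}
```

The same pattern would apply to the scope, risk, impact, and level unions.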

Proposed NAPI Class Structure

// task.ts
class Task {
  // Static constructors
  static fromMarkdown(content: string, source?: string): Task;
  static fromFile(path: string): Task;

  // Getters
  get id(): string;
  get name(): string;
  get status(): TaskStatus;
  get dependsOn(): string[];
  get body(): string;
  get source(): string | null;

  // Frontmatter access (via JS object)
  get frontmatter(): TaskFrontmatter;

  // Serialization
  toMarkdown(): string;
}

interface TaskFrontmatter {
  id: string;
  name: string;
  status: TaskStatus;
  dependsOn: string[];
  priority?: string;
  tags: string[];
  created?: string;     // ISO 8601
  modified?: string;    // ISO 8601
  assignee?: string;
  due?: string;
  scope?: TaskScope;
  risk?: TaskRisk;
  impact?: TaskImpact;
  level?: TaskLevel;
}

// collection.ts
class TaskCollection {
  static fromDirectory(path: string): TaskCollection;
  get(id: string): Task | null;
  get length(): number;
  ids(): string[];
  tasks(): Task[];
  get errors(): DiscoveryError[];
  missingDependencies(): Record<string, string[]>;
  validate(): ValidationResult;
}

interface DiscoveryError {
  path: string;
  message: string;
}

interface ValidationResult {
  taskCount: number;
  errors: DiscoveryError[];
  missingDependencies: Record<string, string[]>;
  isValid(): boolean;
  issueCount(): number;
}

// graph.ts
class DependencyGraph {
  static fromCollection(collection: TaskCollection): DependencyGraph;
  static fromTasks(tasks: Task[]): DependencyGraph;

  addTask(id: string): void;
  addDependency(from: string, to: string): void;
  hasCycles(): boolean;
  findCycles(): string[][];
  topologicalOrder(): string[] | null;
  dependencies(taskId: string): string[];
  dependents(taskId: string): string[];
  parallelGroups(): string[][];
  criticalPath(): string[];
  weightedCriticalPath(weights: Record<string, number>): string[];
  bottlenecks(): [string, number][];
  toDot(): string;
}

// config.ts
class Config {
  static fromFile(path: string): Config;
  static findAndLoad(): Config | null;
  get tasksPath(): string;
}

// workflow.ts
function calculateTaskEv(p: number, scopeCost: number, impactWeight: number): number;

Key Decisions for NAPI Implementation

Task mutability: The Rust Task struct is Clone but has no setters. For NAPI, we should either:
- Make the JS Task immutable (read-only after creation) - simpler, matches Rust
- Add a TaskBuilder pattern for constructing tasks programmatically
Enum representation: Use JS string literals (not numeric enums) to match the kebab-case serde serialization.
Error handling: Throw JS Error objects from NAPI. Consider custom error classes for TaskNotFound and InvalidFrontmatter.
DateTime handling: chrono::DateTime<Utc> maps to ISO 8601 strings in JS. No need for JS Date objects in the NAPI layer.
Graph lifetime: The Rust DependencyGraph borrows nothing (stores owned String node weights). It can be freely moved/owned in NAPI.
Collection lifetime: TaskCollection owns its tasks. The NAPI class should hold the Rust struct. Returning Task references from collection.get() requires careful lifetime management - consider returning clones.
weighted_critical_path callback: Replace the Rust closure with a JS Record<string, number> dict lookup to avoid FFI callback overhead and complexity.
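
For the error-handling decision, one option is to define the custom classes in a thin TypeScript layer and translate native failures into them. The sketch below shows only the JS side; the constructor shapes and the `getOrThrow` helper are assumptions, and how the native layer signals the failure kind is left open:

```typescript
// Hypothetical error classes; names follow the decision above,
// but the fields and messages are illustrative only.
class TaskNotFoundError extends Error {
  readonly taskId: string;

  constructor(taskId: string) {
    super(`Task not found: ${taskId}`);
    this.name = "TaskNotFoundError";
    this.taskId = taskId;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, TaskNotFoundError.prototype);
  }
}

class InvalidFrontmatterError extends Error {
  readonly source: string | null;

  constructor(message: string, source: string | null = null) {
    super(`Invalid frontmatter: ${message}`);
    this.name = "InvalidFrontmatterError";
    this.source = source;
    Object.setPrototypeOf(this, InvalidFrontmatterError.prototype);
  }
}

// Example lookup that raises the typed error instead of returning null.
function getOrThrow<T>(tasks: Map<string, T>, id: string): T {
  const task = tasks.get(id);
  if (task === undefined) throw new TaskNotFoundError(id);
  return task;
}
```

Callers can then branch with `instanceof` rather than matching on message strings.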

Appendix B: Notable Implementation Details

Bottleneck Algorithm

The current bottlenecks() implementation uses an O(n^2 * P) algorithm where P is the number of paths between nodes. It enumerates all paths between all pairs, then counts how many paths each task appears on. This is not true betweenness centrality (which uses Brandes' O(VE) algorithm) but a simpler path-counting approach. For large graphs, this could be slow. The benchmark only tests up to 1000 nodes with linear topology.
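
To make the cost of the path-counting approach concrete, here is a minimal TypeScript sketch of the same idea. It is not a port of the Rust code: it assumes the input is a DAG, that every node appears as a key in the adjacency map, and that only interior nodes of a path are counted (endpoints excluded). `Adjacency`, `allPaths`, and `pathCountBottlenecks` are hypothetical names.

```typescript
type Adjacency = Map<string, string[]>;

// Enumerate every path from `from` to `to` by depth-first search.
// Terminates only on acyclic graphs.
function allPaths(graph: Adjacency, from: string, to: string): string[][] {
  if (from === to) return [[from]];
  const paths: string[][] = [];
  for (const next of graph.get(from) ?? []) {
    for (const tail of allPaths(graph, next, to)) {
      paths.push([from, ...tail]);
    }
  }
  return paths;
}

// For each node, count the paths (over all ordered pairs) that pass through it.
function pathCountBottlenecks(graph: Adjacency): [string, number][] {
  const nodes = Array.from(graph.keys());
  const counts = new Map<string, number>();
  for (const node of nodes) counts.set(node, 0);

  for (const source of nodes) {
    for (const target of nodes) {
      if (source === target) continue;
      for (const path of allPaths(graph, source, target)) {
        // slice(1, -1) drops the two endpoints of the path.
        for (const node of path.slice(1, -1)) {
          counts.set(node, (counts.get(node) ?? 0) + 1);
        }
      }
    }
  }
  // Highest counts first.
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}
```

On a linear chain there is at most one path per pair, which is why the existing benchmark topology does not exercise the path-explosion case; densely layered graphs can have exponentially many paths.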

Critical Path Algorithm

Uses recursive memoized longest-path computation. Works well for DAGs but will return empty/incorrect results if cycles exist (the parallel_groups method also silently breaks if cycles exist).
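
A small TypeScript sketch of memoized longest-path search, with two deliberate differences from the behavior described above: it throws on a cycle instead of returning an empty or incorrect result, and it takes weights as a `Record<string, number>` lookup as proposed for the NAPI layer. The default weight of 1 for unlisted tasks and all names here are assumptions.

```typescript
// node -> successors (tasks that come after it on a path)
type SuccessorMap = Map<string, string[]>;

function weightedCriticalPath(
  graph: SuccessorMap,
  weights: Record<string, number>,
): string[] {
  const best = new Map<string, { cost: number; next: string | null }>();
  const visiting = new Set<string>();

  // Cost of the heaviest path starting at `node`, memoized in `best`.
  function longestFrom(node: string): number {
    const cached = best.get(node);
    if (cached !== undefined) return cached.cost;
    if (visiting.has(node)) {
      throw new Error(`Cycle detected at task: ${node}`);
    }
    visiting.add(node);

    let tailCost = 0;
    let next: string | null = null;
    for (const succ of graph.get(node) ?? []) {
      const cost = longestFrom(succ);
      if (cost > tailCost) {
        tailCost = cost;
        next = succ;
      }
    }
    visiting.delete(node);

    const own = weights[node] ?? 1; // assumed default weight
    best.set(node, { cost: own + tailCost, next });
    return own + tailCost;
  }

  // Pick the start node whose heaviest path is the most expensive overall.
  let start: string | null = null;
  let startCost = 0;
  for (const node of Array.from(graph.keys())) {
    const cost = longestFrom(node);
    if (cost > startCost) {
      startCost = cost;
      start = node;
    }
  }

  // Follow the recorded `next` pointers to rebuild the path.
  const path: string[] = [];
  let cursor: string | null = start;
  while (cursor !== null) {
    path.push(cursor);
    cursor = best.get(cursor)?.next ?? null;
  }
  return path;
}
```

Failing loudly on cycles is a design choice for the wrapper; callers that prefer the current silent behavior could check `hasCycles()` first and skip the call.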

Missing: Task Serialization

Task does not implement Serialize/Deserialize. The to_markdown() method manually concatenates YAML frontmatter + markdown body. If we need JSON serialization of the full Task (including body), we should add a new serializable struct like:

#[derive(Serialize)]
pub struct SerializableTask {
    pub frontmatter: TaskFrontmatter,
    pub body: String,
    pub source: Option<String>,
}

Or implement Serialize for Task directly.

Missing: Task Mutability

There are no methods to update a task's status, dependencies, etc. in place. The current design assumes files are the source of truth and are edited directly. For an NAPI wrapper, we may want to add:

task.set_status(status: TaskStatus)
task.set_depends_on(deps: Vec<String>)
etc.

Or use a builder pattern for creating new tasks.

Missing: Partial Graph Building

DependencyGraph::from_collection() adds edges only for dependencies that exist as nodes in the graph. Missing dependencies are silently ignored (no error, no warning). This matches the add_dependency() behavior which checks index_map before adding edges.
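
If the wrapper should surface these dropped edges rather than hide them, one option is to collect them while building. A minimal TypeScript sketch of that idea follows; `TaskDeps`, `EdgeBuildResult`, and `buildEdges` are hypothetical names, and the edge direction (dependency first, dependent second) is an assumption:

```typescript
interface TaskDeps {
  id: string;
  dependsOn: string[];
}

interface EdgeBuildResult {
  edges: [string, string][];   // [dependency, dependent]
  skipped: [string, string][]; // [task, missing dependency]
}

function buildEdges(tasks: TaskDeps[]): EdgeBuildResult {
  // Every task ID is a known node before any edge is considered.
  const known = new Set<string>();
  for (const task of tasks) known.add(task.id);

  const edges: [string, string][] = [];
  const skipped: [string, string][] = [];
  for (const task of tasks) {
    for (const dep of task.dependsOn) {
      if (known.has(dep)) {
        edges.push([dep, task.id]);
      } else {
        // Same outcome as today (no edge), but the omission is recorded.
        skipped.push([task.id, dep]);
      }
    }
  }
  return { edges, skipped };
}
```

The graph that results is identical to the current behavior; the `skipped` list only makes the omissions visible to the caller.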

WalkDir::follow_links(false)

TaskCollection::from_directory() does not follow symlinks. This is intentional for safety.

Appendix C: Dependency Version Compatibility Notes

| Crate | Version | Notes for NAPI |
| --- | --- | --- |
| `petgraph` | `0.7` | Stable API; `DiGraph` and algorithms are well-defined |
| `gray_matter` | `0.2` | Pre-1.0; API may change in `0.3` |
| `serde` | `1.0` | Very stable; `derive` feature needed |
| `serde_json` | `1.0` | Very stable |
| `serde_yaml` | `0.9` | Final release line; the crate has been deprecated upstream, so no further updates are expected |
| `chrono` | `0.4` | Stable; `serde` feature for serialization |
| `clap` | `4.5` | CLI-only; not needed in NAPI lib |
| `thiserror` | `2.0` | Error derive; many crates still depend on v1, so both major versions may appear in the dependency tree |
| `toml` | `0.8` | For config loading |
| `walkdir` | `2.5` | For directory scanning |

For NAPI, we can exclude from the build:

clap / clap_complete (CLI-only, not needed for library)
tracing / tracing-subscriber (logging, optional)
dirs (platform directories, only for CLI default paths)

This could be done with feature flags:

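[features]
default = ["cli"]
# Each crate listed under `cli` must be declared with `optional = true` in [dependencies].
cli = ["clap", "clap_complete", "tracing", "tracing-subscriber", "dirs"]
# Because `cli` is a default feature, NAPI builds must pass --no-default-features.
napi = []  # Minimal dependencies for Node.js binding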
