# TaskGraph Rust Source - Comprehensive Research Report > Source: `/workspace/@alkimiadev/taskgraph` (Rust CLI project) > Report date: 2026-04-23 > Version: 0.1.3 --- ## Table of Contents 1. [Project Structure](#1-project-structure) 2. [Cargo.toml Details](#2-cargotoml-details) 3. [Core Data Types and Public APIs](#3-core-data-types-and-public-apis) 4. [Functions/Methods to Expose via NAPI](#4-functionsmethods-to-expose-via-napi) 5. [Serialization (Serde) Support](#5-serialization-serde-support) 6. [Error Types and Error Handling](#6-error-types-and-error-handling) 7. [Input/Output Patterns](#7-inputoutput-patterns) 8. [Existing Tests and Benchmarks](#8-existing-tests-and-benchmarks) --- ## 1. Project Structure ### Directory Layout ``` taskgraph/ ├── Cargo.toml # Package manifest (single crate, not a workspace) ├── Cargo.lock # Locked dependencies ├── LICENSE-APACHE # Apache-2.0 license ├── LICENSE-MIT # MIT license ├── README.md # User-facing documentation ├── AGENTS.md # AI agent context file ├── opencode.json # OpenCode configuration ├── .github/ │ └── workflows/ │ └── ci.yml # CI: fmt, clippy, test, coverage ├── docs/ │ ├── ARCHITECTURE.md # Full architecture spec │ ├── framework.md # Cost-benefit framework rationale │ ├── workflow.md # Practical workflow guide │ ├── implementation.md # Tools/models/guidelines │ ├── phase-1.md through phase-4.md # Phase plans │ ├── issues/ # Blocking issues tracking │ ├── reviews/ # Code review docs │ └── research/ │ └── cost_benefit_analysis_framework.py ├── scripts/ │ └── benchmark.sh # Manual benchmark script ├── benches/ │ └── graph_benchmarks.rs # Criterion benchmarks ├── src/ │ ├── main.rs # Binary entry point (thin: parse CLI, execute) │ ├── lib.rs # Library root - re-exports public API │ ├── cli.rs # CLI argument definitions (clap derive) │ ├── task.rs # Task, TaskFrontmatter, enums (serde types) │ ├── graph.rs # DependencyGraph (petgraph wrapper) │ ├── error.rs # Error enum (thiserror) │ ├── config.rs # Config loading 
(.taskgraph.toml) │ ├── discovery.rs # TaskCollection (directory scanning) │ └── commands/ │ ├── mod.rs # Command module re-exports │ ├── init.rs # `init` command │ ├── validate.rs # `validate` command │ ├── list.rs # `list` command │ ├── show.rs # `show` command │ ├── deps.rs # `deps` command │ ├── topo.rs # `topo` command │ ├── cycles.rs # `cycles` command │ ├── parallel.rs # `parallel` command │ ├── critical.rs # `critical` command │ ├── bottleneck.rs # `bottleneck` command │ ├── risk.rs # `risk` command │ ├── decompose.rs # `decompose` command │ ├── workflow_cost.rs # `workflow-cost` command │ ├── risk_path.rs # `risk-path` command │ └── graph_cmd.rs # `graph` command (DOT output) └── tests/ ├── integration/ │ └── commands.rs # 25 integration tests (assert_cmd) └── fixtures/ ├── tasks/ # 3 valid tasks (one depends on another) ├── cycles/ # 3 tasks forming a cycle ├── invalid/ # 1 task with missing dependency ├── risk/ # 5 tasks with various risk levels └── decompose/ # 4 tasks for decomposition testing ``` ### Module Dependency Graph ``` lib.rs ├── cli → commands::*, config, discovery, graph ├── commands/* → cli, discovery, graph, task ├── config → error ├── discovery → task, error ├── error → (thiserror, std, serde_yaml, serde_json) ├── graph → discovery, task, petgraph └── task → (serde, chrono, gray_matter, error) ``` ### Crates This is a **single crate** project (not a Cargo workspace). It produces: - **Library**: `libtaskgraph` (from `src/lib.rs`) - **Binary**: `taskgraph` (from `src/main.rs`) --- ## 2. 
Cargo.toml Details ### Package Metadata | Field | Value | |-------|-------| | name | `taskgraph` | | version | `0.1.3` | | edition | `2021` | | license | `MIT OR Apache-2.0` | | description | CLI tool for managing task dependencies using markdown files | | repository | `https://github.com/alkimiadev/taskgraph` | | keywords | `task`, `dependency`, `graph`, `cli`, `markdown` | | categories | `command-line-utilities`, `development-tools` | ### Dependencies (Production) | Crate | Version | Features | Purpose | |-------|---------|----------|---------| | `petgraph` | `0.7` | - | Directed graph data structure & algorithms (toposort, cycle detection, etc.) | | `gray_matter` | `0.2` | - | Markdown frontmatter extraction (YAML engine) | | `serde` | `1.0` | `derive` | Serialization/deserialization framework | | `serde_json` | `1.0` | - | JSON serialization (for `--format json` output) | | `serde_yaml` | `0.9` | - | YAML serialization (for frontmatter parsing & roundtrip) | | `clap` | `4.5` | `derive` | CLI argument parsing | | `clap_complete` | `4.5` | - | Shell completion generation | | `chrono` | `0.4` | `serde` | Date/time with serde support | | `anyhow` | `1.0` | - | Ergonomic error handling (used in CLI/binary) | | `thiserror` | `2.0` | - | Derived error types (used in library) | | `dirs` | `6.0` | - | Platform directories (future: global config) | | `walkdir` | `2.5` | - | Recursive directory walking | | `tracing` | `0.1` | - | Structured logging | | `tracing-subscriber` | `0.3` | `env-filter` | Log output formatting | | `toml` | `0.8` | - | Config file parsing | ### Dev Dependencies | Crate | Version | Purpose | |-------|---------|---------| | `tempfile` | `3.0` | Temporary directories for tests | | `assert_cmd` | `2.0` | CLI integration testing | | `predicates` | `3.0` | Assertion predicates for integration tests | | `criterion` | `0.5` | Benchmarking framework | ### Features ```toml [features] default = [] ``` No feature flags exist yet. 
A `napi` feature would be a natural first addition.

### Release Profile

```toml
[profile.release]
opt-level = 3
lto = true
strip = true
```

---

## 3. Core Data Types and Public APIs

### 3.1 Task (`src/task.rs`)

The central data type, representing a single task file.

```rust
/// A task with its content.
#[derive(Debug, Clone)]
pub struct Task {
    pub frontmatter: TaskFrontmatter,
    pub body: String,   // Markdown body content
    pub source: Option, // Source file path (if loaded from file)
}
```

**Methods:**

| Method | Signature | Returns | Description |
|--------|-----------|---------|-------------|
| `id()` | `&self -> &str` | Task ID | Accessor for frontmatter.id |
| `name()` | `&self -> &str` | Task name | Accessor for frontmatter.name |
| `status()` | `&self -> TaskStatus` | Status enum | Accessor for frontmatter.status |
| `depends_on()` | `&self -> &[String]` | Dependency list | Accessor for frontmatter.depends_on |
| `from_file()` | `&Path -> Result` | Parsed Task | Parse from a .md file on disk |
| `from_markdown()` | `&str, Option -> Result` | Parsed Task | Parse from markdown string + optional source name |
| `to_markdown()` | `&self -> Result` | Markdown string | Serialize back to markdown with YAML frontmatter |

**Key observation:** `Task` itself does **NOT** derive `Serialize` or `Deserialize`. Only `TaskFrontmatter` does. The `body` and `source` fields are not serialized through serde; they are managed separately during parse/render.
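To illustrate how the body can be kept apart from the serde-managed frontmatter during parse and render, here is a minimal std-only sketch. It is not the crate's implementation (which relies on `gray_matter`); `split_frontmatter` and `render` are hypothetical helpers, and the sketch assumes the document starts with a `---` line and has a non-empty frontmatter block.

```rust
/// Splits a markdown document into `(frontmatter, body)`.
/// Returns `None` when the input does not start with a `---` line
/// or has no closing `---` delimiter.
fn split_frontmatter(input: &str) -> Option<(&str, &str)> {
    // Strip the opening delimiter line.
    let rest = input.strip_prefix("---\n")?;
    // The closing delimiter is the next `---` that starts a line.
    let end = rest.find("\n---")?;
    let frontmatter = &rest[..end];
    let after = &rest[end + "\n---".len()..];
    // Drop the newline that follows the closing delimiter, if any.
    let body = after.strip_prefix('\n').unwrap_or(after);
    Some((frontmatter, body))
}

/// Reassembles a document from its two parts (the inverse of the split).
fn render(frontmatter: &str, body: &str) -> String {
    format!("---\n{frontmatter}\n---\n{body}")
}
```

In the real crate the frontmatter string would then go through `serde_yaml`, while the body is carried along untouched; this sketch only shows the separation step.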
### 3.2 TaskFrontmatter (`src/task.rs`) The structured metadata extracted from YAML frontmatter: ```rust #[derive(Debug, Clone, Serialize, Deserialize)] pub struct TaskFrontmatter { pub id: String, pub name: String, #[serde(default)] pub status: TaskStatus, #[serde(default, rename = "depends_on")] pub depends_on: Vec, #[serde(default, skip_serializing_if = "Option::is_none")] pub priority: Option, #[serde(default, skip_serializing_if = "Vec::is_empty")] pub tags: Vec, #[serde(default, skip_serializing_if = "Option::is_none")] pub created: Option>, #[serde(default, skip_serializing_if = "Option::is_none")] pub modified: Option>, #[serde(default, skip_serializing_if = "Option::is_none")] pub assignee: Option, #[serde(default, skip_serializing_if = "Option::is_none")] pub due: Option, #[serde(default, skip_serializing_if = "Option::is_none")] pub scope: Option, #[serde(default, skip_serializing_if = "Option::is_none")] pub risk: Option, #[serde(default, skip_serializing_if = "Option::is_none")] pub impact: Option, #[serde(default, skip_serializing_if = "Option::is_none")] pub level: Option, } ``` **Serde details:** - All enums use `#[serde(rename_all = "kebab-case")]` for YAML keys - Optional fields use `skip_serializing_if` to keep output clean - Tags use `skip_serializing_if = "Vec::is_empty"` - `depends_on` renamed from Rust `depends_on` (same, but explicitly) - `status` has a default of `TaskStatus::Pending` ### 3.3 Enum Types (`src/task.rs`) All enums derive `Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, Default`. #### TaskStatus ```rust #[serde(rename_all = "kebab-case")] pub enum TaskStatus { Pending, // default InProgress, // "in-progress" in YAML/JSON Completed, Failed, Blocked, } ``` Also implements `Display` (kebab-case strings). 
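A `Display` implementation that yields kebab-case strings might look like the following sketch. It is a stand-in with the serde derives omitted; `as_str` and the `FromStr` impl are hypothetical additions (not confirmed to exist in the crate) that could be handy when status strings cross the NAPI boundary.

```rust
use std::fmt;
use std::str::FromStr;

// Stand-in for the crate's `TaskStatus`; serde derives are left out.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TaskStatus {
    #[default]
    Pending,
    InProgress,
    Completed,
    Failed,
    Blocked,
}

impl TaskStatus {
    /// Kebab-case name, matching what `rename_all = "kebab-case"` produces.
    pub fn as_str(self) -> &'static str {
        match self {
            TaskStatus::Pending => "pending",
            TaskStatus::InProgress => "in-progress",
            TaskStatus::Completed => "completed",
            TaskStatus::Failed => "failed",
            TaskStatus::Blocked => "blocked",
        }
    }
}

impl fmt::Display for TaskStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

// Hypothetical inverse mapping from a kebab-case string.
impl FromStr for TaskStatus {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pending" => Ok(TaskStatus::Pending),
            "in-progress" => Ok(TaskStatus::InProgress),
            "completed" => Ok(TaskStatus::Completed),
            "failed" => Ok(TaskStatus::Failed),
            "blocked" => Ok(TaskStatus::Blocked),
            other => Err(format!("unknown status: {other}")),
        }
    }
}
```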
#### TaskScope ```rust #[serde(rename_all = "kebab-case")] pub enum TaskScope { Single, // ~500 tokens, cost 1.0 Narrow, // default, ~1500 tokens, cost 2.0 Moderate, // ~3000 tokens, cost 3.0 Broad, // ~6000 tokens, cost 4.0 System, // ~10000 tokens, cost 5.0 } ``` Methods: `token_estimate() -> u32`, `cost_estimate() -> f64`, `Display` #### TaskRisk ```rust #[serde(rename_all = "kebab-case")] pub enum TaskRisk { Trivial, // p=0.98 Low, // default, p=0.90 Medium, // p=0.80 High, // p=0.65 Critical, // p=0.50 } ``` Methods: `success_probability() -> f64`, `Display` #### TaskImpact ```rust #[serde(rename_all = "kebab-case")] pub enum TaskImpact { Isolated, // default, weight 1.0 Component, // weight 1.5 Phase, // weight 2.0 Project, // weight 3.0 } ``` Methods: `weight() -> f64`, `Display` #### TaskLevel ```rust #[serde(rename_all = "kebab-case")] pub enum TaskLevel { Planning, Decomposition, Implementation, // default Review, Research, } ``` Methods: `Display` only ### 3.4 DependencyGraph (`src/graph.rs`) A directed graph of task dependencies built from a `TaskCollection`. ```rust pub struct DependencyGraph { graph: DiGraph, // petgraph directed graph index_map: HashMap, // task ID -> node index } ``` **Edge direction:** `from -> to` means "from must complete before to" (dependency must complete first). 
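The edge direction can be made concrete with a small std-only sketch. `MiniGraph` is a hypothetical stand-in, not the petgraph-backed `DependencyGraph`; it only shows that an edge `from -> to` makes `from` a dependency of `to` and `to` a dependent of `from`.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-in for illustrating edge direction only.
#[derive(Default)]
struct MiniGraph {
    /// prerequisite -> tasks that directly depend on it (outgoing edges)
    outgoing: BTreeMap<String, BTreeSet<String>>,
    /// task -> its direct prerequisites (incoming edges)
    incoming: BTreeMap<String, BTreeSet<String>>,
}

impl MiniGraph {
    /// Records that `from` must complete before `to`.
    fn add_dependency(&mut self, from: &str, to: &str) {
        self.outgoing
            .entry(from.to_string())
            .or_default()
            .insert(to.to_string());
        self.incoming
            .entry(to.to_string())
            .or_default()
            .insert(from.to_string());
    }

    /// Incoming neighbors: what `id` directly depends on.
    fn dependencies(&self, id: &str) -> Vec<String> {
        self.incoming
            .get(id)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default()
    }

    /// Outgoing neighbors: what directly depends on `id`.
    fn dependents(&self, id: &str) -> Vec<String> {
        self.outgoing
            .get(id)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```

After `add_dependency("t1", "t2")`, `dependencies("t2")` contains `t1` and `dependents("t1")` contains `t2`.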
**Public API:** | Method | Signature | Returns | Description | |--------|-----------|---------|-------------| | `new()` | `-> Self` | Empty graph | Create empty graph | | `from_collection()` | `&TaskCollection -> Self` | Built graph | Build from discovered tasks | | `from_tasks()` | `Vec<&Task> -> Self` | Built graph | Build from explicit task list | | `add_task()` | `&mut self, TaskId` | () | Add node | | `add_dependency()` | `&mut self, &str, &str` | () | Add edge (from->to); silently ignores unknown IDs | | `has_cycles()` | `&self -> bool` | Boolean | Uses `petgraph::algo::is_cyclic_directed` | | `find_cycles()` | `&self -> Vec>` | Cycles | Custom DFS cycle finder | | `topological_order()` | `&self -> Option>` | Order or None | Uses `petgraph::algo::toposort` | | `dependencies()` | `&self, &str -> Vec` | Incoming neighbors | What this task depends on (direct) | | `dependents()` | `&self, &str -> Vec` | Outgoing neighbors | What depends on this (direct) | | `parallel_groups()` | `&self -> Vec>` | Generations | Tasks grouped by level (can run concurrently) | | `critical_path()` | `&self -> Vec` | Path | Longest path through the graph | | `weighted_critical_path()` | `&self, F: Fn(&str)->f64 -> Vec` | Weighted path | Path with highest cumulative weight | | `bottlenecks()` | `&self -> Vec<(TaskId, usize)>` | Ranked list | Betweenness centrality via path counting | | `to_dot()` | `&self -> String` | DOT string | GraphViz DOT format export | Also implements `Default` (returns `new()`). **Important:** `DependencyGraph` does **NOT** implement `Serialize`/`Deserialize`. It's a compute-only structure built fresh each time from tasks. 
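As an illustration of what "tasks grouped by level" means for `parallel_groups()`, the sketch below peels off generations of tasks whose prerequisites are all satisfied (a Kahn-style layering). This is an assumption about the general technique, not the crate's petgraph-based code; the free function, its signature, and the `None` return for cyclic input are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups tasks into generations: every task in a group depends only on
/// tasks from earlier groups. Returns `None` if the edges contain a cycle.
/// Each edge `(from, to)` means "`from` must complete before `to`".
fn parallel_groups<'a>(
    tasks: &[&'a str],
    edges: &[(&'a str, &'a str)],
) -> Option<Vec<Vec<String>>> {
    let mut indegree: BTreeMap<&str, usize> = tasks.iter().map(|t| (*t, 0)).collect();
    let mut outgoing: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(from, to) in edges {
        // Edges that mention unknown task IDs are skipped.
        if !indegree.contains_key(from) || !indegree.contains_key(to) {
            continue;
        }
        *indegree.get_mut(to).unwrap() += 1;
        outgoing.entry(from).or_default().push(to);
    }

    // First generation: tasks with no prerequisites.
    let mut current: BTreeSet<&str> = indegree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(task, _)| *task)
        .collect();
    let mut groups: Vec<Vec<String>> = Vec::new();
    let mut placed = 0;

    while !current.is_empty() {
        let mut next = BTreeSet::new();
        for task in &current {
            for &child in outgoing.get(task).into_iter().flatten() {
                let degree = indegree.get_mut(child).unwrap();
                *degree -= 1;
                if *degree == 0 {
                    next.insert(child);
                }
            }
        }
        placed += current.len();
        groups.push(current.iter().map(|t| t.to_string()).collect());
        current = next;
    }

    // Tasks stuck with a non-zero in-degree are part of a cycle.
    (placed == indegree.len()).then_some(groups)
}
```

For a diamond (`a -> b`, `a -> c`, `b -> d`, `c -> d`) this yields three groups, with `b` and `c` sharing the middle one.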
### 3.5 TaskCollection (`src/discovery.rs`) Collection of tasks discovered from a directory: ```rust #[derive(Debug, Default)] pub struct TaskCollection { tasks: HashMap, // Tasks indexed by ID paths: HashMap, // File paths indexed by ID errors: Vec, // Parse errors encountered } ``` **Public API:** | Method | Signature | Returns | Description | |--------|-----------|---------|-------------| | `new()` | `-> Self` | Empty collection | Constructor | | `from_directory()` | `&Path -> Self` | Populated collection | Scan directory recursively for .md files | | `get()` | `&self, &str -> Option<&Task>` | Task or None | Lookup by ID | | `path()` | `&self, &str -> Option<&PathBuf>` | Path or None | File path for task ID | | `tasks()` | `&self -> impl Iterator` | Iterator | All tasks | | `ids()` | `&self -> impl Iterator` | Iterator | All task IDs | | `len()` | `&self -> usize` | Count | Number of tasks | | `is_empty()` | `&self -> bool` | Boolean | Empty check | | `errors()` | `&self -> &[DiscoveryError]` | Errors | Parse errors from discovery | | `missing_dependencies()` | `&self -> HashMap>` | Map | Task ID -> missing dep IDs | | `validate()` | `&self -> ValidationResult` | Result | Full validation | **Important:** `TaskCollection` does **NOT** implement `Serialize`/`Deserialize` either. It's built procedurally. ### 3.6 DiscoveryError (`src/discovery.rs`) ```rust #[derive(Debug, Clone)] pub struct DiscoveryError { pub path: PathBuf, pub message: String, } ``` No serde derives. Simple struct for error reporting. ### 3.7 ValidationResult (`src/discovery.rs`) ```rust #[derive(Debug)] pub struct ValidationResult { pub task_count: usize, pub errors: Vec, pub missing_dependencies: HashMap>, } ``` Methods: `is_valid() -> bool`, `issue_count() -> usize` No serde derives on the Rust type itself, but it's converted to `ValidationOutput` (which does derive `Serialize`) in the validate command. 
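The missing-dependency check behind `missing_dependencies()` can be sketched as below. This is a simplified assumption of the logic: tasks are reduced to an ID plus a `depends_on` list, and `BTreeMap` is used instead of `HashMap` only to make the output order deterministic.

```rust
use std::collections::BTreeMap;

/// Maps each task ID to the dependency IDs that do not exist in `tasks`.
/// Tasks whose dependencies all resolve are left out of the result.
/// `tasks` maps a task ID to its `depends_on` list (simplified stand-in).
fn missing_dependencies(
    tasks: &BTreeMap<String, Vec<String>>,
) -> BTreeMap<String, Vec<String>> {
    let mut missing = BTreeMap::new();
    for (id, deps) in tasks {
        let unknown: Vec<String> = deps
            .iter()
            .filter(|dep| !tasks.contains_key(*dep))
            .cloned()
            .collect();
        if !unknown.is_empty() {
            missing.insert(id.clone(), unknown);
        }
    }
    missing
}
```

A validation step built on this would treat a non-empty result as an issue to report, alongside any parse errors collected during discovery.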
### 3.8 Config (`src/config.rs`) ```rust #[derive(Debug, Default, Serialize, Deserialize)] pub struct Config { #[serde(default)] pub project: ProjectConfig, } #[derive(Debug, Serialize, Deserialize)] pub struct ProjectConfig { #[serde(default = "default_tasks_dir")] pub tasks_dir: String, // default: "tasks" } ``` **API:** | Method | Signature | Returns | Description | |--------|-----------|---------|-------------| | `from_file()` | `&Path -> Result` | Config | Load from .taskgraph.toml | | `find_and_load()` | `-> Option` | Config or None | Search up directory tree | | `tasks_path()` | `&self -> PathBuf` | Path | Get tasks directory | ### 3.9 CLI Types (`src/cli.rs`) ```rust #[derive(Clone, Copy, Debug, Default, ValueEnum)] pub enum OutputFormat { Plain, // default Json, } #[derive(Parser, Debug)] pub struct Cli { pub path: Option, pub format: OutputFormat, pub command: Commands, } #[derive(Subcommand, Debug)] pub enum Commands { Init { id, name, scope, risk }, Validate { strict }, List { status, tag }, Show { id }, Deps { id }, Dependents { id }, Topo { status }, Cycles, Parallel, Critical, Bottleneck, Risk, Decompose, WorkflowCost { include_completed, limit }, RiskPath, Graph { output }, Completions { shell }, } ``` The `Cli::execute()` method dispatches all commands. It creates `TaskCollection` from directory for each command. ### 3.10 Lib.rs Public Re-exports ```rust pub mod cli; pub mod commands; pub mod config; pub mod discovery; pub mod error; pub mod graph; pub mod task; pub use config::Config; pub use discovery::{DiscoveryError, TaskCollection, ValidationResult}; pub use error::{Error, Result}; pub use graph::DependencyGraph; pub use task::{Task, TaskFrontmatter, TaskImpact, TaskLevel, TaskRisk, TaskScope, TaskStatus}; ``` --- ## 4. 
Functions/Methods to Expose via NAPI ### Priority 1: Core Data Types (Must Have) These are the foundational types that everything else depends on: | Rust Type | NAPI Class | Why | |-----------|------------|-----| | `Task` | `Task` | Central unit of work; must be creatable, readable, serializable from JS | | `TaskFrontmatter` | Embedded in `Task` or separate class | All metadata is here; JS needs to read/write fields | | `TaskStatus` | String enum mapping | Simple 5-variant enum; map to JS string union | | `TaskScope` | String enum mapping | 5 variants with numeric mappings; map to JS string union | | `TaskRisk` | String enum mapping | 5 variants with probability; map to JS string union | | `TaskImpact` | String enum mapping | 4 variants with weight; map to JS string union | | `TaskLevel` | String enum mapping | 5 variants; map to JS string union | ### Priority 2: Core Functions (Must Have) | Rust Function | NAPI Method | Input | Output | Why | |---------------|-------------|-------|--------|-----| | `Task::from_markdown()` | `Task.fromMarkdown(content, source?)` | `string, string?` | `Task` | Parse task from markdown string | | `Task::from_file()` | `Task.fromFile(path)` | `string` | `Task` | Parse task from file path | | `Task::to_markdown()` | `task.toMarkdown()` | - | `string` | Serialize task back to markdown | | `Task::id()` | `task.id` (getter) | - | `string` | Accessor | | `Task::name()` | `task.name` (getter) | - | `string` | Accessor | | `Task::status()` | `task.status` (getter) | - | `string` | Accessor | | `Task::depends_on()` | `task.dependsOn` (getter) | - | `string[]` | Accessor | | `TaskScope::token_estimate()` | `scope.tokenEstimate()` | - | `number` | Numeric mapping | | `TaskScope::cost_estimate()` | `scope.costEstimate()` | - | `number` | Numeric mapping | | `TaskRisk::success_probability()` | `risk.successProbability()` | - | `number` | Numeric mapping | | `TaskImpact::weight()` | `impact.weight()` | - | `number` | Numeric mapping | ### Priority 
3: Collection & Discovery (Must Have) | Rust Function | NAPI Method | Input | Output | Why | |---------------|-------------|-------|--------|-----| | `TaskCollection::from_directory()` | `TaskCollection.fromDirectory(path)` | `string` | `TaskCollection` | Primary entry point: discover all tasks | | `TaskCollection::new()` | `new TaskCollection()` | - | `TaskCollection` | Empty constructor for building manually | | `TaskCollection::get()` | `collection.get(id)` | `string` | `Task\|null` | Lookup by ID | | `TaskCollection::len()` | `collection.length` (getter) | - | `number` | Task count | | `TaskCollection::ids()` | `collection.ids()` | - | `string[]` | All task IDs | | `TaskCollection::tasks()` | `collection.tasks()` | - | `Task[]` | All tasks | | `TaskCollection::errors()` | `collection.errors` (getter) | - | `DiscoveryError[]` | Parse errors | | `TaskCollection::missing_dependencies()` | `collection.missingDependencies()` | - | `Record` | Find broken deps | | `TaskCollection::validate()` | `collection.validate()` | - | `ValidationResult` | Full validation | ### Priority 4: Graph Operations (Must Have) | Rust Function | NAPI Method | Input | Output | Why | |---------------|-------------|-------|--------|-----| | `DependencyGraph::from_collection()` | `DependencyGraph.fromCollection(collection)` | `TaskCollection` | `DependencyGraph` | Build graph | | `DependencyGraph::new()` | `new DependencyGraph()` | - | `DependencyGraph` | Empty graph constructor | | `DependencyGraph::from_tasks()` | `DependencyGraph.fromTasks(tasks[])` | `Task[]` | `DependencyGraph` | Build from JS array | | `add_task()` | `graph.addTask(id)` | `string` | `void` | Add node | | `add_dependency()` | `graph.addDependency(from, to)` | `string, string` | `void` | Add edge | | `has_cycles()` | `graph.hasCycles()` | - | `boolean` | Cycle detection | | `find_cycles()` | `graph.findCycles()` | - | `string[][]` | Get actual cycles | | `topological_order()` | `graph.topologicalOrder()` | - | 
`string[]\|null` | Execution order |
| `dependencies()` | `graph.dependencies(id)` | `string` | `string[]` | Direct deps |
| `dependents()` | `graph.dependents(id)` | `string` | `string[]` | What depends on this |
| `parallel_groups()` | `graph.parallelGroups()` | - | `string[][]` | Parallel work groups |
| `critical_path()` | `graph.criticalPath()` | - | `string[]` | Longest path |
| `weighted_critical_path()` | `graph.weightedCriticalPath(weightFn)` | `(id: string) => number` | `string[]` | Weighted longest path |
| `bottlenecks()` | `graph.bottlenecks()` | - | `[string, number][]` | Betweenness centrality |
| `to_dot()` | `graph.toDot()` | - | `string` | GraphViz DOT format |

### Priority 5: Config (Nice to Have)

| Rust Function | NAPI Method | Input | Output | Why |
|---------------|-------------|-------|--------|-----|
| `Config::from_file()` | `Config.fromFile(path)` | `string` | `Config` | Load config |
| `Config::find_and_load()` | `Config.findAndLoad()` | - | `Config\|null` | Auto-discover config |
| `Config::tasks_path()` | `config.tasksPath` (getter) | - | `string` | Get tasks dir |

### Priority 6: Workflow Cost Calculation (Nice to Have)

The `workflow_cost` command uses `calculate_task_ev()`, which is a private function. Consider exposing:

| Function | NAPI Method | Input | Output | Why |
|----------|-------------|-------|--------|-----|
| `calculate_task_ev()` (currently private) | `calculateTaskEv(p, scopeCost, impactWeight)` | `number, number, number` | `number` | Expected value calculation |

This would need to be made `pub` or reimplemented in the NAPI layer.

### Notes on `weighted_critical_path` for NAPI

`weighted_critical_path` takes a Rust closure `F: Fn(&str) -> f64`. For NAPI, this would need to:

1. Accept a JavaScript function callback, OR
2. Accept a `Record<string, number>` map of task ID -> weight

Option 2 is simpler and avoids cross-language callback overhead.
For example:

```typescript
// NAPI signature option A (callback approach - complex)
graph.weightedCriticalPath(weightFn: (taskId: string) => number): string[]

// NAPI signature option B (map approach - simpler)
graph.weightedCriticalPath(weights: Record<string, number>): string[]
```

---

## 5. Serialization (Serde) Support

### Full Serde Support (Serialize + Deserialize)

| Type | Serialize | Deserialize | Notes |
|------|-----------|-------------|-------|
| `TaskStatus` | Yes | Yes | `rename_all = "kebab-case"` |
| `TaskScope` | Yes | Yes | `rename_all = "kebab-case"` |
| `TaskRisk` | Yes | Yes | `rename_all = "kebab-case"` |
| `TaskImpact` | Yes | Yes | `rename_all = "kebab-case"` |
| `TaskLevel` | Yes | Yes | `rename_all = "kebab-case"` |
| `TaskFrontmatter` | Yes | Yes | Rich serde attributes (skip_serializing_if, rename, default) |
| `Config` | Yes | Yes | Via TOML |
| `ProjectConfig` | Yes | Yes | Via TOML |

### No Serde Support

| Type | Serialize | Deserialize | Reason |
|------|-----------|-------------|--------|
| `Task` | No | No | `body` and `source` are separate from frontmatter; `to_markdown()` handles serialization manually |
| `DependencyGraph` | No | No | Computed structure; rebuilt from tasks each time |
| `TaskCollection` | No | No | Procedurally built from directory scanning |
| `DiscoveryError` | No | No | Error reporting struct |
| `ValidationResult` | No | No | Internal result type |
| `Error` | No | No | Error enum |
| `OutputFormat` | No | No | CLI-only (ValueEnum, not serde) |
| `Cli` | No | No | CLI-only (clap derive) |
| `Commands` | No | No | CLI-only enum |

### JSON Serialization in Commands (Ad-hoc)

Several command modules define private structs that derive `Serialize` for JSON output:

| File | Struct | Fields |
|------|--------|--------|
| `validate.rs` | `ValidationOutput` | valid, task_count, error_count, errors[], missing_deps |
| `validate.rs` | `ValidationError` | path, message |
| `list.rs` | `TaskSummary` | id, name, status, scope |
| `show.rs` |
`TaskDetails` | id, name, status, depends_on, scope, risk, impact, level, tags, body | | `deps.rs` | `DependencyInfo` | id, status, exists | | `deps.rs` | `DependenciesOutput` | task_id, dependencies[] | | `topo.rs` | `TopoTask` | position, id, name, status | | `topo.rs` | `TopoOutput` | order[], has_cycles | | `cycles.rs` | `CyclesOutput` | has_cycles, cycle_count, cycles[] | | `workflow_cost.rs` | `TaskCost` | id, name, cost | These are **private** to each command module and not part of the public API. For NAPI, we would define equivalent TypeScript interfaces or create new public serializable structs. ### Serialization Format Details **YAML (frontmatter):** `TaskFrontmatter` uses `serde_yaml` with: - `rename_all = "kebab-case"` on enums → `in-progress`, `narrow`, `high`, etc. - `rename = "depends_on"` on the `depends_on` field (explicit) - `default` on required-ish fields - `skip_serializing_if = "Option::is_none"` for optional fields - `skip_serializing_if = "Vec::is_empty"` for tags **JSON (output):** Uses `serde_json::to_string_pretty()` in commands. **TOML (config):** `Config` uses `toml::from_str()`. **Roundtrip:** `Task::from_markdown()` + `Task::to_markdown()` should produce equivalent output (tested implicitly). --- ## 6. 
Error Types and Error Handling

### Library Error Type (`src/error.rs`)

```rust
use thiserror::Error;

#[derive(Error, Debug)]
pub enum Error {
    #[error("Task not found: {0}")]
    TaskNotFound(String),

    #[error("Task already exists: {0}")]
    TaskAlreadyExists(String),

    #[error("Circular dependency detected: {0}")]
    CircularDependency(String),

    #[error("Invalid frontmatter in {file}: {message}")]
    InvalidFrontmatter { file: String, message: String },

    #[error("Missing required field '{field}' in {file}")]
    MissingField { file: String, field: String },

    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),

    #[error("YAML parsing error: {0}")]
    Yaml(#[from] serde_yaml::Error),

    #[error("JSON serialization error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("Graph error: {0}")]
    Graph(String),
}

// Crate-wide alias: the error type is fixed to `Error`.
pub type Result<T> = std::result::Result<T, Error>;
```

**Error conversion:** `From` impls via `#[from]` for `std::io::Error`, `serde_yaml::Error`, `serde_json::Error`.

**Usage patterns:**

- Library code returns `crate::Result<T>` (= `std::result::Result<T, Error>`)
- `anyhow::Result` is used only in `main.rs` for the binary entry point
- `thiserror` provides `Display` impls automatically

### CLI Error Handling

The `Cli::execute()` method returns `anyhow::Result<()>`. Each command function returns `crate::Result<()>`. The `?` operator converts between them automatically, because `anyhow::Error` implements `From` for any error type that implements `std::error::Error + Send + Sync + 'static`.
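The conversion performed by `?` at the CLI boundary can be sketched with the standard library alone. This is a simplified stand-in under stated assumptions: `Box<dyn std::error::Error>` plays the role of `anyhow::Error`, the `Display`/`From` impls are written by hand instead of derived by `thiserror`, only two variants are reproduced, and `find_task`, `read_task`, and `execute` are hypothetical functions.

```rust
use std::error::Error as StdError;
use std::fmt;
use std::io;

// Hand-written stand-in for the thiserror-derived library error.
#[derive(Debug)]
pub enum Error {
    TaskNotFound(String),
    Io(io::Error),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::TaskNotFound(id) => write!(f, "Task not found: {id}"),
            Error::Io(err) => write!(f, "IO error: {err}"),
        }
    }
}

impl StdError for Error {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        match self {
            Error::Io(err) => Some(err),
            Error::TaskNotFound(_) => None,
        }
    }
}

// What `#[from]` would generate for the `Io` variant.
impl From<io::Error> for Error {
    fn from(err: io::Error) -> Self {
        Error::Io(err)
    }
}

pub type Result<T> = std::result::Result<T, Error>;

/// Library-style function: returns the crate's own error type.
fn find_task(known: &[&str], id: &str) -> Result<usize> {
    known
        .iter()
        .position(|k| *k == id)
        .ok_or_else(|| Error::TaskNotFound(id.to_string()))
}

/// Library-style function: `?` turns `io::Error` into `Error::Io`.
fn read_task(path: &str) -> Result<String> {
    Ok(std::fs::read_to_string(path)?)
}

/// Binary-style function: `?` turns `Error` into the boxed error,
/// much as it would into `anyhow::Error`.
fn execute(known: &[&str], id: &str) -> std::result::Result<(), Box<dyn StdError>> {
    let index = find_task(known, id)?;
    println!("found {id} at index {index}");
    Ok(())
}
```

The same two-step conversion (foreign error into library error, library error into boundary error) is what an NAPI layer would need to repeat once more when turning `Error` into a JavaScript exception.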
**Error handling at boundaries:**

- `Task::from_file()`: IO errors → `Error::Io`, parse errors → `Error::InvalidFrontmatter`
- `TaskCollection::from_directory()`: silently skips files without frontmatter and stores parse errors in a `DiscoveryError` list (non-fatal)
- `Config::from_file()`: TOML parse errors → `Error::Graph(format!(...))` (note: this reuses the `Graph` variant, since the enum has no config-specific variant)
- Command functions: `Error::TaskNotFound` when a task ID is missing, `Error::TaskAlreadyExists` on duplicate init

### NAPI Error Mapping Strategy

For the Node.js wrapper, we should map:

| Rust Error | Node.js Error | Notes |
|------------|---------------|-------|
| `TaskNotFound(id)` | Generic `Error` with message | JS: `throw new Error("Task not found: <id>")` |
| `TaskAlreadyExists(id)` | Generic `Error` with message | JS: `throw new Error("Task already exists: <id>")` |
| `CircularDependency(msg)` | Generic `Error` with message | JS: `throw new Error("Circular dependency: <msg>")` |
| `InvalidFrontmatter { file, message }` | Generic `Error` with message | JS: `throw new Error("Invalid frontmatter in <file>: <message>")` |
| `MissingField { file, field }` | Generic `Error` with message | JS: `throw new Error("Missing field <field> in <file>")` |
| `Io(err)` | Generic `Error` with message | JS: `throw new Error("IO error: <err>")` |
| `Yaml(err)` | Generic `Error` with message | JS: `throw new Error("YAML parsing error: <err>")` |
| `Json(err)` | Generic `Error` with message | JS: `throw new Error("JSON error: <err>")` |
| `Graph(msg)` | Generic `Error` with message | JS: `throw new Error("Graph error: <msg>")` |

Alternatively, we could create custom JS error classes for better programmatic handling:

```typescript
class TaskNotFoundError extends Error { taskId: string }
class CircularDependencyError extends Error { }
class InvalidFrontmatterError extends Error { file: string; message: string }
```

---

## 7.
Input/Output Patterns

### Data Flow Overview

```
                    DISCOVERY
tasks/*.md files ──────────────> TaskCollection
     (disk)                        (HashMap)
                                       │
                                       │ from_collection() / from_tasks()
                                       ▼
                                DependencyGraph
                                   (DiGraph)
                                       │
                      ┌────────────────┼────────────────────┐
                      │                │                    │
                      ▼                ▼                    ▼
                 topological    parallel_groups       critical_path
                   order()             ()                   ()
                      │                │                    │
                      └────────────────┴────────────────────┘
                                       │
                                       ▼
                              Output (plain/JSON)
```

### Input Patterns

1. **File-based input (primary):** `TaskCollection::from_directory(path)` scans a directory recursively for `.md` files, parses each, and builds the collection. This is the main entry point.
2. **String-based input:** `Task::from_markdown(content, source)` parses a single markdown string. Useful for programmatic construction.
3. **Path-based input:** `Task::from_file(path)` reads a single file and parses it.
4. **Programmatic construction:** `DependencyGraph::new()` + `add_task()` + `add_dependency()` for building graphs manually.

### Output Patterns

1. **Plain text (default):** Human-readable terminal output with tables, arrows, and formatting.
2. **JSON output (`--format json`):** Structured JSON using ad-hoc `Serialize` structs in each command. This is the primary programmatic output format.
3. **DOT format:** `DependencyGraph::to_dot()` returns GraphViz DOT format string.
4. **Markdown roundtrip:** `Task::to_markdown()` produces valid markdown with YAML frontmatter.

### Typical Usage Flow

```rust
// 1. Discover tasks
let collection = TaskCollection::from_directory(Path::new("./tasks"));

// 2. Validate
let result = collection.validate();
if !result.is_valid() { /* handle errors */ }

// 3. Build graph
let graph = DependencyGraph::from_collection(&collection);

// 4.
// Analyze
let has_cycles = graph.has_cycles();
let order = graph.topological_order();
let parallel = graph.parallel_groups();
let critical = graph.critical_path();
let bottlenecks = graph.bottlenecks();
```

### NAPI Data Flow Design

For the Node.js wrapper, the recommended data flow is one of the following three options:

```typescript
// Option A: File-based (mirrors Rust CLI)
const collection = TaskCollection.fromDirectory('./tasks');
const graphFromDirectory = DependencyGraph.fromCollection(collection);

// Option B: Programmatic (unique to NAPI)
const tasks = [
  Task.fromMarkdown('---\nid: t1\nname: Task 1\n---\nBody'),
  Task.fromMarkdown('---\nid: t2\nname: Task 2\ndepends_on: [t1]\n---\nBody'),
];
const graphFromTasks = DependencyGraph.fromTasks(tasks);

// Option C: Manual graph construction
const manualGraph = new DependencyGraph();
manualGraph.addTask('t1');
manualGraph.addTask('t2');
manualGraph.addDependency('t1', 't2');
```

### Memory/Ownership Considerations for NAPI

- `Task` is `Clone` (cheap to clone; contains String, TaskFrontmatter, Option)
- `TaskCollection` owns all `Task` objects (stored in a `HashMap`)
- `DependencyGraph` owns the graph structure (not the tasks themselves; only stores task IDs as node weights)
- `DependencyGraph::from_collection()` borrows `&TaskCollection` (doesn't take ownership)
- `Task::from_file()` and `Task::from_markdown()` return owned `Task` values

For NAPI, we need to decide:

1. **Should `TaskCollection` hold JS-managed task objects or Rust-owned?** Probably Rust-owned (tasks are parsed from files/strings, not constructed in JS).
2. **Should graph operations return strings or Task references?** They currently return `Vec<String>` (task IDs). The JS side can look up tasks from the collection, which avoids copying task data across the FFI boundary.
3. **Should `DependencyGraph` keep a reference to `TaskCollection`?** Currently no. This means JS must pass the collection alongside the graph for enriched output. We could create a combined `TaskGraph` class in the NAPI layer.

---

## 8. Existing Tests and Benchmarks

### Unit Tests (in-source)

| File | Test Count | Key Tests |
|------|-----------|-----------|
| `src/graph.rs` | 12 | Empty graph, add task/dep, missing deps, cycle detection, topo sort, parallel groups, critical path, bottleneck, DOT output, unknown task queries |
| `src/discovery.rs` | 5 | Single task discovery, skip files without frontmatter, duplicate ID detection, missing dependencies, validation result |
| `src/config.rs` | 2 | Default config, load from file |

### Integration Tests (`tests/integration/commands.rs`)

The table below lists 31 tests, all using `assert_cmd`:

| Test | Command | What It Verifies |
|------|---------|-----------------|
| `test_list_command` | `list` | Lists all 3 fixture tasks |
| `test_list_with_status_filter` | `list --status completed` | Filters correctly |
| `test_show_command` | `show task-one` | Shows task details |
| `test_show_missing_task` | `show missing-task` | Fails on missing |
| `test_validate_command` | `validate` | Succeeds on valid fixtures |
| `test_validate_with_missing_dependency` | `validate` (invalid) | Reports missing deps |
| `test_topo_command` | `topo` | Outputs topological order |
| `test_deps_command` | `deps task-two` | Shows task-one as dependency |
| `test_dependents_command` | `dependents task-one` | Shows tasks two and three |
| `test_cycles_command_no_cycles` | `cycles` | No cycles in valid fixtures |
| `test_cycles_command_with_cycles` | `cycles` (cycles fixtures) | Detects cycle |
| `test_parallel_command` | `parallel` | Shows generation groups |
| `test_critical_command` | `critical` | Shows critical path |
| `test_graph_command` | `graph` | Outputs DOT format |
| `test_bottleneck_command` | `bottleneck` | Shows bottleneck tasks |
| `test_init_command` | `init new-task` | Creates file |
| `test_init_duplicate_task` | `init task-one` | Fails on duplicate |
| `test_init_with_options` | `init --scope narrow --risk low` | Writes scope/risk to file |
| `test_risk_command` | `risk` | Distribution with counts |
| `test_risk_command_empty` | `risk` (empty dir) | "No tasks found" |
| `test_decompose_command` | `decompose` | Flags high-risk/broad-scope tasks |
| `test_decompose_command_none_needed` | `decompose` (low-risk tasks) | "No tasks need decomposition" |
| `test_workflow_cost_command` | `workflow-cost` | Shows cost analysis |
| `test_workflow_cost_command_empty` | `workflow-cost` (empty) | "No tasks found" |
| `test_risk_path_command` | `risk-path` | Shows risk path |
| `test_risk_path_command_empty` | `risk-path` (empty) | "No tasks found" |
| `test_help_flag` | `--help` | Shows help text |
| `test_version_flag` | `--version` | Succeeds |
| `test_completions_bash` | `completions bash` | Bash completion output |
| `test_completions_zsh` | `completions zsh` | Zsh completion output |
| `test_completions_fish` | `completions fish` | Fish completion output |

### Benchmark Suite (`benches/graph_benchmarks.rs`)

Uses Criterion. Two benchmark groups:

1. **`load_tasks`**: Measures `TaskCollection::from_directory()` + `DependencyGraph::from_collection()` for 50, 100, 500, 1000 tasks.
2. **`graph_ops`**: On a 1000-task graph, measures:
   - `topological_sort_1000`
   - `cycle_detection_1000`
   - `critical_path_1000`
   - `bottlenecks_1000`

Test data: linear chain of tasks (task-i depends on task-(i-1)).

### Performance Numbers (from README)

| Tasks | Load Time | Topo Sort | Cycles | Critical Path |
|-------|-----------|-----------|--------|---------------|
| 50 | 3ms | 3ms | 2ms | 8ms |
| 500 | 19ms | 21ms | 14ms | 52ms |
| 1,000 | 34ms | 42ms | 26ms | 82ms |

(Benchmarked on AMD EPYC 9004 series)

### CI Pipeline (`.github/workflows/ci.yml`)

Two jobs:

1. **Test**: checkout -> install Rust (with rustfmt, clippy) -> cache -> fmt check -> clippy -> test -> build release
2. **Coverage**: checkout -> install Rust -> cache -> install cargo-llvm-cov -> generate lcov -> upload to Codecov

### Test Coverage

Reported at 89%, above the 80% target from AGENTS.md.
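The linear-chain fixture used by the benchmark suite (task-i depends on task-(i-1)) can be sketched as follows. This is a minimal illustration, not the code in `benches/graph_benchmarks.rs`: the frontmatter keys follow the `Task.fromMarkdown` examples in section 7, while the function names, file names, and body text are assumptions made for the example.

```rust
/// Render the markdown for task `i` in a linear chain.
/// task-0 has no dependencies; task-i depends on task-(i-1).
/// (Hypothetical helper; names and body text are illustrative.)
fn chain_task_markdown(i: usize) -> String {
    let depends_on = if i == 0 {
        String::new()
    } else {
        format!("depends_on: [task-{}]\n", i - 1)
    };
    format!(
        "---\nid: task-{i}\nname: Task {i}\n{depends_on}---\nBody of task {i}\n"
    )
}

/// Build `n` fixture documents as (file name, contents) pairs.
/// Writing them to a temporary directory is left to the caller.
fn linear_chain(n: usize) -> Vec<(String, String)> {
    (0..n)
        .map(|i| (format!("task-{i}.md"), chain_task_markdown(i)))
        .collect()
}
```

A chain like this has exactly one topological order and at most one path between any pair of tasks, so it exercises the best case for the path-based algorithms rather than a typical branching project.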
---

## Appendix A: Complete Type Reference for NAPI Mapping

### Enums to JS String Unions

```typescript
// task.ts
type TaskStatus = "pending" | "in-progress" | "completed" | "failed" | "blocked";
type TaskScope = "single" | "narrow" | "moderate" | "broad" | "system";
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";
type TaskImpact = "isolated" | "component" | "phase" | "project";
type TaskLevel = "planning" | "decomposition" | "implementation" | "review" | "research";
```

### Proposed NAPI Class Structure

```typescript
// task.ts
declare class Task {
  // Static constructors
  static fromMarkdown(content: string, source?: string): Task;
  static fromFile(path: string): Task;

  // Getters
  get id(): string;
  get name(): string;
  get status(): TaskStatus;
  get dependsOn(): string[];
  get body(): string;
  get source(): string | null;

  // Frontmatter access (via JS object)
  get frontmatter(): TaskFrontmatter;

  // Serialization
  toMarkdown(): string;
}

interface TaskFrontmatter {
  id: string;
  name: string;
  status: TaskStatus;
  dependsOn: string[];
  priority?: string;
  tags: string[];
  created?: string; // ISO 8601
  modified?: string; // ISO 8601
  assignee?: string;
  due?: string;
  scope?: TaskScope;
  risk?: TaskRisk;
  impact?: TaskImpact;
  level?: TaskLevel;
}

// collection.ts
declare class TaskCollection {
  static fromDirectory(path: string): TaskCollection;
  get(id: string): Task | null;
  get length(): number;
  ids(): string[];
  tasks(): Task[];
  get errors(): DiscoveryError[];
  // Task ID -> missing dependency IDs (type arguments assumed)
  missingDependencies(): Record<string, string[]>;
  validate(): ValidationResult;
}

interface DiscoveryError {
  path: string;
  message: string;
}

interface ValidationResult {
  taskCount: number;
  errors: DiscoveryError[];
  missingDependencies: Record<string, string[]>; // type arguments assumed
  isValid(): boolean;
  issueCount(): number;
}

// graph.ts
declare class DependencyGraph {
  static fromCollection(collection: TaskCollection): DependencyGraph;
  static fromTasks(tasks: Task[]): DependencyGraph;
  addTask(id: string): void;
  addDependency(from: string, to: string): void;
  hasCycles(): boolean;
  findCycles(): string[][];
  topologicalOrder(): string[] | null;
  dependencies(taskId: string): string[];
  dependents(taskId: string): string[];
  parallelGroups(): string[][];
  criticalPath(): string[];
  // Task ID -> weight (type arguments assumed)
  weightedCriticalPath(weights: Record<string, number>): string[];
  bottlenecks(): [string, number][];
  toDot(): string;
}

// config.ts
declare class Config {
  static fromFile(path: string): Config;
  static findAndLoad(): Config | null;
  get tasksPath(): string;
}

// workflow.ts
declare function calculateTaskEv(p: number, scopeCost: number, impactWeight: number): number;
```

### Key Decisions for NAPI Implementation

1. **Task mutability:** The Rust `Task` struct is `Clone` but has no setters. For NAPI, we should either:
   - Make the JS `Task` immutable (read-only after creation) - simpler, matches Rust
   - Add a `TaskBuilder` pattern for constructing tasks programmatically
2. **Enum representation:** Use JS string literals (not numeric enums) to match the `kebab-case` serde serialization.
3. **Error handling:** Throw JS `Error` objects from NAPI. Consider custom error classes for `TaskNotFound` and `InvalidFrontmatter`.
4. **DateTime handling:** `chrono::DateTime` maps to ISO 8601 strings in JS. No need for JS `Date` objects in the NAPI layer.
5. **Graph lifetime:** The Rust `DependencyGraph` borrows nothing (stores owned `String` node weights). It can be freely moved/owned in NAPI.
6. **Collection lifetime:** `TaskCollection` owns its tasks. The NAPI class should hold the Rust struct. Returning `Task` references from `collection.get()` requires careful lifetime management - consider returning clones.
7. **`weighted_critical_path` callback:** Replace the Rust closure with a lookup in a JS `Record` keyed by task ID to avoid FFI callback overhead and complexity.

---

## Appendix B: Notable Implementation Details

### Bottleneck Algorithm

The current `bottlenecks()` implementation uses an O(n^2 * P) algorithm, where P is the number of paths between a pair of nodes. It enumerates all paths between all pairs, then counts how many paths each task appears on. This is **not** true betweenness centrality (which counts only shortest paths and can be computed with Brandes' O(VE) algorithm) but a simpler path-counting approach. Because P can grow exponentially in densely connected DAGs, this could be slow for large graphs. The benchmark only tests up to 1000 nodes with linear topology, where each pair of nodes has at most one path between them.

### Critical Path Algorithm

Uses recursive memoized longest-path computation. Works well for DAGs but will return empty/incorrect results if cycles exist (the `parallel_groups` method also silently breaks if cycles exist).

### Missing: Task Serialization

`Task` does not implement `Serialize`/`Deserialize`. The `to_markdown()` method manually concatenates YAML frontmatter + markdown body. If we need JSON serialization of the full `Task` (including body), we should add a new serializable struct like:

```rust
use serde::Serialize;

#[derive(Serialize)]
pub struct SerializableTask {
    pub frontmatter: TaskFrontmatter,
    pub body: String,
    // Inner type assumed to be `String`, matching the `string | null` JS getter.
    pub source: Option<String>,
}
```

Or implement `Serialize` for `Task` directly.

### Missing: Task Mutability

There are no methods to update a task's status, dependencies, etc. in place. The current design assumes files are the source of truth and are edited directly. For an NAPI wrapper, we may want to add:

- `task.set_status(status: TaskStatus)`
- `task.set_depends_on(deps: Vec<String>)`
- etc.

Or use a builder pattern for creating new tasks.

### Missing: Partial Graph Building

`DependencyGraph::from_collection()` adds edges only for dependencies that exist as nodes in the graph. Missing dependencies are silently ignored (no error, no warning). This matches the `add_dependency()` behavior, which checks `index_map` before adding edges.

### walkdir: `follow_links(false)`

`TaskCollection::from_directory()` does not follow symlinks. This is intentional for safety.
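The recursive memoized longest-path idea described under "Critical Path Algorithm" can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: it uses a plain `HashMap` adjacency list instead of `petgraph`, measures path length by node count, and treats an edge as pointing from a task to the tasks that depend on it. The function names are hypothetical.

```rust
use std::collections::HashMap;

/// Longest chain that starts at `node`, following task -> dependent edges.
/// Results are cached in `memo`, so each node is expanded only once.
/// Assumes the graph is acyclic; a cycle would recurse without end.
fn longest_from<'a>(
    node: &'a str,
    dependents: &HashMap<&'a str, Vec<&'a str>>,
    memo: &mut HashMap<&'a str, Vec<&'a str>>,
) -> Vec<&'a str> {
    if let Some(path) = memo.get(node) {
        return path.clone();
    }
    // Pick the longest continuation among the direct dependents.
    let mut best: Vec<&'a str> = Vec::new();
    if let Some(next) = dependents.get(node) {
        for &n in next {
            let candidate = longest_from(n, dependents, memo);
            if candidate.len() > best.len() {
                best = candidate;
            }
        }
    }
    let mut path = vec![node];
    path.extend(best);
    memo.insert(node, path.clone());
    path
}

/// Critical path: the longest chain over all possible start nodes.
fn critical_path<'a>(
    nodes: &[&'a str],
    dependents: &HashMap<&'a str, Vec<&'a str>>,
) -> Vec<&'a str> {
    let mut memo = HashMap::new();
    let mut best = Vec::new();
    for &n in nodes {
        let path = longest_from(n, dependents, &mut memo);
        if path.len() > best.len() {
            best = path;
        }
    }
    best
}
```

The sketch makes the cycle caveat concrete: nothing in the recursion detects a back edge, so a caller would need to rule out cycles before asking for a critical path.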
---

## Appendix C: Dependency Version Compatibility Notes

| Crate | Version | Notes for NAPI |
|-------|---------|---------------|
| `petgraph` | `0.7` | Stable API; `DiGraph` and algorithms are well-defined |
| `gray_matter` | `0.2` | Pre-1.0 minor version; API may change in `0.3` |
| `serde` | `1.0` | Very stable; `derive` feature needed |
| `serde_json` | `1.0` | Very stable |
| `serde_yaml` | `0.9` | 0.9 is the final release line; the crate is deprecated upstream and no further versions are expected |
| `chrono` | `0.4` | Stable; `serde` feature for serialization |
| `clap` | `4.5` | CLI-only; not needed in NAPI lib |
| `thiserror` | `2.0` | Error derive; v2 is newer than commonly seen |
| `toml` | `0.8` | For config loading |
| `walkdir` | `2.5` | For directory scanning |

For NAPI, we can exclude from the build:

- `clap` / `clap_complete` (CLI-only, not needed for library)
- `tracing` / `tracing-subscriber` (logging, optional)
- `dirs` (platform directories, only for CLI default paths)

This could be done with feature flags:

```toml
[features]
default = ["cli"]
# Each dependency named here must be declared with `optional = true` in [dependencies].
cli = ["clap", "clap_complete", "tracing", "tracing-subscriber", "dirs"]
napi = [] # Minimal dependencies for Node.js binding
```
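On the Rust side, the feature flags above would gate code with `cfg` attributes. The following is a rough sketch of that pattern, assuming the `cli` and `napi` feature names from the manifest fragment; the module names and the `enabled_frontends` helper are hypothetical and exist only to show the mechanism.

```rust
/// Compiled only when the `cli` feature is enabled
/// (hypothetical module standing in for the clap-based front end).
#[cfg(feature = "cli")]
pub mod cli_frontend {
    pub fn describe() -> &'static str {
        "command-line front end"
    }
}

/// Compiled only when the `napi` feature is enabled
/// (hypothetical module standing in for the Node.js binding layer).
#[cfg(feature = "napi")]
pub mod node_binding {
    pub fn describe() -> &'static str {
        "Node.js binding"
    }
}

/// Always compiled: reports which optional front ends were built in.
/// `cfg!` evaluates to a compile-time boolean, so both branches type-check
/// even when the corresponding feature is off.
pub fn enabled_frontends() -> Vec<&'static str> {
    let mut frontends = vec!["core"];
    if cfg!(feature = "cli") {
        frontends.push("cli");
    }
    if cfg!(feature = "napi") {
        frontends.push("napi");
    }
    frontends
}
```

With this layout, a binding build would use `--no-default-features --features napi`, so none of the CLI-only crates are compiled into the Node.js addon.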