glm-5.1 34d1802d30 Add TaskSource abstraction, config schema, and Bun.Glob file scanning

Architecture updates to support the plugin's I/O and configuration layer:

- TaskSource interface abstracts task loading from I/O, making future
  sources (API, database, test) swappable without operation changes
- FileSource implements v1: Bun.Glob for directory scanning, Bun.file
  for reading, parseFrontmatter for parsing (single-pass I/O)
- SourceResult provides raw file content (for show) and per-file error
  detail (for validate) that parseTaskDirectory couldn't offer
- Config schema uses TypeBox (already a dep via taskgraph) for
  compile-time types, runtime validation, and JSON Schema export
- ADR-005: TaskSource abstraction rationale
- ADR-006: Bun.Glob over parseTaskDirectory rationale
- Performance benchmark added (43 tasks full pipeline: ~150ms)
- AGENTS.md updated with config section and source structure

2026-04-28 10:06:18 +00:00


status	last_updated
draft	2026-04-28

Open Tasks: Architecture Overview

Structured task management for OpenCode agents — graph analysis, dependency insight, decomposition guidance, and workflow cost estimation. Exposes a single tasks tool using a registry pattern to keep the agent's visible tool count minimal.

Problem

The taskgraph Rust CLI provides task graph operations but requires shell invocation — agents must compose bash commands and parse plain-text output. This is error-prone, context-expensive, and gives no structural validation or rich formatting. The TypeScript core library (@alkdev/taskgraph) now provides all graph operations natively. This plugin wraps that library into an OpenCode tool interface so agents get first-class, structured access without leaving the conversation.

What This Plugin Is

A read-only analysis and query layer on top of the project's tasks/ directory. It:

Reads task markdown files with YAML frontmatter via @alkdev/taskgraph parsing
Constructs an in-memory TaskGraph per invocation
Runs analysis functions (critical path, parallel groups, bottlenecks, risk, workflow cost, decomposition)
Returns formatted markdown to the agent

What This Plugin Is Not

Not a task editor — it does not create, modify, or delete task files. Task creation and status updates are the agent's responsibility (Write/Edit tools).
Not a task runner — it does not coordinate execution. That's the role of open-coordinator.
Not a persistence layer — there is no database, no cache, no state between invocations. Each tool call reads files fresh.

Architecture

Single-Tool Registry Pattern

Following open-memory's proven approach, the plugin exposes one tool (tasks) with internal operation dispatch:

tasks({tool: "help"})                          → Show available operations
tasks({tool: "list"})                          → List tasks in project
tasks({tool: "show", args: {id: "..."}})       → Show task details
tasks({tool: "deps", args: {id: "..."}})       → Task prerequisites
tasks({tool: "dependents", args: {id: "..."}}) → Tasks depending on a task
tasks({tool: "validate"})                      → Validate all task files
tasks({tool: "topo"})                          → Topological ordering
tasks({tool: "cycles"})                        → Circular dependency detection
tasks({tool: "critical"})                      → Critical path
tasks({tool: "parallel"})                      → Parallel execution groups
tasks({tool: "bottleneck"})                    → Bottleneck analysis
tasks({tool: "risk"})                          → Risk path + distribution
tasks({tool: "cost"})                          → Workflow cost estimate
tasks({tool: "decompose", args: {id: "..."}})  → Decomposition guidance

Why: Each tool definition adds JSON schema to the system prompt (~200-300 tokens each). 14 operations as 14 separate tools = ~3500 tokens of tool definitions. The registry pattern collapses this to ~250 tokens (one tool schema) plus an on-demand help text the agent retrieves only when needed. This is the same math that drove open-memory's design.
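
The estimate can be checked with a few lines of arithmetic. This sketch uses only the per-tool range and the operation count stated above; the variable names are illustrative, and the registry cost is taken as the midpoint of the stated range.

```typescript
// Per-tool schema cost range and operation count, as stated above.
const tokensPerToolLow = 200
const tokensPerToolHigh = 300
const operationCount = 14

// Cost of exposing every operation as its own tool.
const separateLow = operationCount * tokensPerToolLow
const separateHigh = operationCount * tokensPerToolHigh
const separateMid = (separateLow + separateHigh) / 2

// Cost of one registry tool, taken at the midpoint of the per-tool range.
const registryCost = (tokensPerToolLow + tokensPerToolHigh) / 2

// Approximate tokens saved in the system prompt by the registry pattern.
const savedTokens = separateMid - registryCost
```

The separate-tool cost spans 2800 to 4200 tokens, so ~3500 is the midpoint of the range rather than a measured figure.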

Component Structure

src/
├── index.ts              # Plugin entry: tool registration + config loading
├── tools.ts              # Tool definition — single `tasks` tool with registry dispatch
├── registry.ts           # Operation registry (dispatch table, arg validation)
├── config.ts             # Plugin config schema + resolution (TypeBox, validated)
├── sources/
│   ├── types.ts          # TaskSource interface
│   ├── file-source.ts    # FileSource — reads tasks/ directory via Bun.Glob + parseFrontmatter
│   └── index.ts          # Source factory: resolves config → TaskSource
├── operations/            # Individual operation implementations
│   ├── help.ts            # Help reference and per-operation details
│   ├── list.ts            # List and filter tasks
│   ├── show.ts            # Show full task details
│   ├── deps.ts            # Show prerequisites
│   ├── dependents.ts      # Show dependents
│   ├── validate.ts        # Validate task files
│   ├── topo.ts            # Topological ordering
│   ├── cycles.ts          # Cycle detection
│   ├── critical.ts        # Critical path
│   ├── parallel.ts        # Parallel execution groups
│   ├── bottleneck.ts      # Bottleneck scores
│   ├── risk.ts            # Risk path + risk distribution
│   ├── cost.ts            # Workflow cost estimate
│   └── decompose.ts       # Decomposition guidance
└── formatting.ts          # Shared markdown formatting helpers

Plugin Configuration

The plugin reads optional configuration from opencode.json under the plugin entry:

// opencode.json
{
  "plugin": [
    ["@alkdev/open-tasks", {
      "tasksPath": "tasks"  // relative to workspace root (default: "tasks")
    }]
  ]
}

If no config is provided, the plugin defaults to "tasks" (a tasks/ directory relative to the workspace root).

The config schema uses TypeBox (already a dependency via @alkdev/taskgraph), giving us:

Compile-time types — Static<typeof ConfigSchema> for TypeScript inference
Runtime validation — Value.Check(ConfigSchema, configObj) to reject invalid config
JSON Schema export — for tooling/IDE support

import { Type, type Static } from "@alkdev/typebox"

export const ConfigSchema = Type.Object({
  tasksPath: Type.Optional(Type.String({ default: "tasks" })),
})

export type Config = Static<typeof ConfigSchema>

This minimal schema is forward-looking. Future sources (API endpoints, databases) will add their own config keys.

TaskSource Abstraction

Operations don't read the filesystem directly. They go through a TaskSource interface:

interface TaskSource {
  /** Human-readable description for error messages */
  readonly name: string

  /** Load all tasks, returning parsed TaskInput[] and raw file data */
  load(): Promise<SourceResult>
}

interface SourceResult {
  tasks: TaskInput[]           // parsed frontmatter from @alkdev/taskgraph
  rawFiles: Map<string, string> // taskId → full file content (for `show` operation)
  errors: SourceError[]         // files that failed to parse
}

interface SourceError {
  filePath: string
  error: string
}

Why an interface? v1 only has FileSource (reads from tasks/ directory). But the abstraction makes it trivial to add:

ApiSource — tasks fetched from a remote endpoint (future: project management tools, CI dashboards)
MixedSource — merge multiple sources with precedence rules
TestSource — in-memory tasks for unit testing operations without filesystem

Each source implements load() and returns the same shape. Operations receive a SourceResult and work with it; they never know (or care) where the data came from. This is the same pattern that lets open-memory's tool work with SQLite in production while remaining testable with in-memory data.
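
As an illustration of how small an alternative source can be, here is a minimal sketch of an in-memory source for tests. The class name InMemorySource and the three TaskInput fields are assumptions made for this example; the real TaskInput type comes from @alkdev/taskgraph and has more fields.

```typescript
// Minimal stand-ins for the types described above. The TaskInput fields
// here are illustrative assumptions, not the library's actual shape.
interface TaskInput {
  id: string
  name: string
  dependsOn: string[]
}

interface SourceError {
  filePath: string
  error: string
}

interface SourceResult {
  tasks: TaskInput[]
  rawFiles: Map<string, string>
  errors: SourceError[]
}

interface TaskSource {
  readonly name: string
  load(): Promise<SourceResult>
}

// Hypothetical in-memory source for unit tests: no filesystem access.
class InMemorySource implements TaskSource {
  readonly name = "InMemorySource"
  private readonly fixture: TaskInput[]

  constructor(fixture: TaskInput[]) {
    this.fixture = fixture
  }

  async load(): Promise<SourceResult> {
    // Synthesize a raw body per task so `show`-style handlers have content.
    const rawFiles = new Map<string, string>()
    for (const task of this.fixture) {
      rawFiles.set(task.id, `---\nid: ${task.id}\n---\n\n# ${task.name}\n`)
    }
    // Copy the array so callers cannot mutate the fixture between loads.
    return { tasks: [...this.fixture], rawFiles, errors: [] }
  }
}

const source = new InMemorySource([
  { id: "setup", name: "Set up repo", dependsOn: [] },
  { id: "build", name: "Build plugin", dependsOn: ["setup"] },
])
```

An operation handler under test would receive `source` in place of a FileSource and call load() exactly as it does in production.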

FileSource Implementation

The v1 concrete source reads markdown files from a directory:

class FileSource implements TaskSource {
  readonly name: string

  constructor(private dirPath: string) {
    this.name = `FileSource(${dirPath})`
  }

  async load(): Promise<SourceResult> {
    const glob = new Bun.Glob("**/*.md")
    const files = Array.from(glob.scanSync({ cwd: this.dirPath }))
    // ... read each file, parse with parseFrontmatter, collect results
  }
}

Why Bun.Glob instead of parseTaskDirectory? The library's parseTaskDirectory uses node:fs/promises.readdir recursively and silently skips files with invalid frontmatter. We use Bun.Glob instead because:

We need raw file content — the show operation returns the full markdown body, not just frontmatter. parseTaskDirectory only returns parsed TaskInput objects; we'd need a separate pass to read file contents.
We need error detail — parseTaskDirectory silently skips invalid files. We need to surface parse errors with filenames so the validate operation can report them.
Single-pass I/O — Bun.Glob gives us file paths, then we read each file once with Bun.file() and parse with parseFrontmatter. One I/O pass, not two.
Consistent runtime — the plugin targets Bun. Bun.Glob and Bun.file() are the native APIs; no reason to use Node compat shims.

The library's parseFrontmatter (singular) is still the right tool for parsing individual file content. We just replace the directory-scanning and file-reading parts.
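
The single-pass collection step can be sketched independently of the I/O. In the sketch below, the files are assumed to have been read already (path to content), and the parser is injected. The function name collect and the toy parser are hypothetical; the real parseFrontmatter signature and the real TaskInput shape may differ.

```typescript
// Hypothetical shapes; the real TaskInput comes from @alkdev/taskgraph.
interface TaskInput { id: string; dependsOn: string[] }
interface SourceError { filePath: string; error: string }
interface SourceResult {
  tasks: TaskInput[]
  rawFiles: Map<string, string>
  errors: SourceError[]
}

type Parser = (content: string) => TaskInput

// Collect tasks, raw content, and per-file errors in one pass.
// `files` maps a relative path to the file's full text.
function collect(files: Map<string, string>, parse: Parser): SourceResult {
  const result: SourceResult = { tasks: [], rawFiles: new Map(), errors: [] }
  files.forEach((content, filePath) => {
    try {
      const task = parse(content)
      result.tasks.push(task)
      result.rawFiles.set(task.id, content) // keyed by task id, for `show`
    } catch (err) {
      // A bad file is recorded and skipped; the remaining files still load.
      const message = err instanceof Error ? err.message : String(err)
      result.errors.push({ filePath, error: message })
    }
  })
  return result
}

// Toy parser used only to exercise the loop. It is NOT parseFrontmatter.
const toyParse: Parser = (content) => {
  const lines = content.split("\n")
  if (lines[0] !== "---") throw new Error("missing frontmatter")
  const idLine = lines.find((line) => line.startsWith("id: "))
  if (idLine === undefined) throw new Error("missing id field")
  return { id: idLine.slice(4).trim(), dependsOn: [] }
}

const loaded = collect(
  new Map([
    ["a.md", "---\nid: a\n---\nBody of a"],
    ["broken.md", "no frontmatter here"],
  ]),
  toyParse,
)
```

Here `loaded` holds one parsed task, its raw content under the key "a", and one error entry naming broken.md, which is the per-file detail the validate operation needs.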

Data Flow

Each operation follows the same pipeline:

Agent calls tasks({tool: "list", args: {status: "pending"}})
  │
  ├─ registry.ts validates tool name and args
  │
  ├─ Operation handler:
  │   │
  │   ├─ source.load() → SourceResult (tasks, rawFiles, errors)
  │   │
  │   ├─ TaskGraph.fromTasks(sourceResult.tasks) → in-memory graph
  │   │
  │   ├─ Analysis function (e.g., parallelGroups(graph))
  │   │
  │   └─ format result as markdown
  │
  └─ Return formatted markdown to agent

The source is resolved once at plugin initialization (in index.ts) and passed to all operation handlers via the registry. Operations call source.load() to get fresh data — no caching between calls.
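
A handler following this pipeline might look roughly like the sketch below. It is a simplified stand-in for a list-style operation: the graph construction step is replaced by a plain filter, and the TaskInput fields, listHandler, and stubSource are assumptions for illustration only.

```typescript
interface TaskInput { id: string; name: string; status: string }
interface SourceResult { tasks: TaskInput[] }
interface TaskSource { readonly name: string; load(): Promise<SourceResult> }

// Sketch of a `list`-style handler. The real pipeline would build a
// TaskGraph from result.tasks before analysis; a plain filter stands in.
async function listHandler(
  args: Record<string, unknown>,
  source: TaskSource,
): Promise<string> {
  const result = await source.load() // fresh read on every call, no cache
  const statusArg = args["status"]
  const wanted = typeof statusArg === "string" ? statusArg : undefined
  const rows = result.tasks.filter((t) => wanted === undefined || t.status === wanted)
  if (rows.length === 0) return `No matching tasks in ${source.name}`
  const header = "| ID | Name | Status |\n| --- | --- | --- |"
  const body = rows.map((t) => `| ${t.id} | ${t.name} | ${t.status} |`).join("\n")
  return `${header}\n${body}`
}

// Stub source resolved once and reused across calls.
const stubSource: TaskSource = {
  name: "stub",
  load: async () => ({
    tasks: [
      { id: "a", name: "First", status: "pending" },
      { id: "b", name: "Second", status: "done" },
    ],
  }),
}
```

Calling `listHandler({ status: "pending" }, stubSource)` resolves to a markdown table containing only task `a`.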

Path Resolution

The plugin resolves its tasks directory from config with safe defaults:

Config — tasksPath from plugin config (if provided). Treated as relative to workspace root. Path traversal is rejected.
Default — tasks/ relative to workspace root (from ctx.directory in PluginInput).
No config, no directory — operations return a clear message explaining how to create a tasks/ directory.

There is no CWD fallback. The workspace root from the OpenCode plugin context is the authoritative base path.
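
One way the traversal check could work is sketched below with plain string handling. The function name resolveTasksPath and the exact rejection rules are assumptions; the plugin may implement this differently, for example with a path library.

```typescript
// Resolve a configured tasksPath against the workspace root, rejecting
// absolute paths and any ".." segment that would escape the root.
// The result uses "/" separators regardless of the input style.
function resolveTasksPath(workspaceRoot: string, tasksPath: string = "tasks"): string {
  // Leading slash, backslash, or drive letter means the path is absolute.
  if (/^([\\/]|[A-Za-z]:)/.test(tasksPath)) {
    throw new Error(`tasksPath must be relative: ${tasksPath}`)
  }
  const segments: string[] = []
  for (const part of tasksPath.split(/[\\/]+/)) {
    if (part === "" || part === ".") continue
    if (part === "..") {
      // Popping past the root would leave the workspace.
      if (segments.length === 0) {
        throw new Error(`tasksPath escapes the workspace root: ${tasksPath}`)
      }
      segments.pop()
    } else {
      segments.push(part)
    }
  }
  const root = workspaceRoot.replace(/[\\/]+$/, "")
  return segments.length === 0 ? root : `${root}/${segments.join("/")}`
}
```

With a workspace root of `/repo`, the default resolves to `/repo/tasks`, `docs/../tasks` also resolves to `/repo/tasks`, and `../outside` throws.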

Operations Reference

Query Operations

Operation	Maps to	Key Args	Output
`list`	`TaskGraph` iteration	`status`, `scope`, `risk` (filter)	Filtered task table
`show`	`graph.getTask()`	`id` (required)	Full task details + markdown body
`deps`	`graph.dependencies()`	`id` (required)	Prerequisite task list
`dependents`	`graph.dependents()`	`id` (required)	Dependent task list
`topo`	`graph.topologicalOrder()`	—	Ordered task list
`cycles`	`graph.findCycles()`	—	Cycle report or "no cycles"
`validate`	`graph.validate()`	—	Validation errors or "all valid"

Analysis Operations

Operation	Maps to	Key Args	Output
`critical`	`criticalPath()`, `weightedCriticalPath()`	—	Critical path with task names
`parallel`	`parallelGroups()`	—	Grouped task lists by generation
`bottleneck`	`bottlenecks()`	—	Ranked task list with scores
`risk`	`riskPath()`, `riskDistribution()`	—	Highest-risk path + distribution table
`cost`	`workflowCost()`	`propagationMode`, `defaultQualityRetention`, `includeCompleted`	Per-task EV + totals
`decompose`	`shouldDecomposeTask()`	`id` (required)	Decomposition verdict + reasons

Help Operation

tasks({tool: "help"}) returns the full operation reference table. tasks({tool: "help", args: {tool: "list"}}) returns detailed usage for one operation including argument shapes and example calls.

Design Decisions

D1: Registry Pattern (single tool, not 14)

Context: 14 operations could each be a separate tool or collapsed into one router.
Choice: Single tasks tool with {tool, args} dispatch.
Consequences: Agent always has access to the help reference. Adding operations never increases context bloat. Trade-off: the tool and args fields are not individually validated by the outer schema — validation happens inside the dispatch.
Reference: See ADR-001

D2: No Caching, Fresh Graph Per Call

Context: Task files change as agents work (status updates, new tasks, removed tasks). A cached graph would become stale.
Choice: Each tool invocation reads the tasks directory fresh and builds a new graph.
Consequences: Slightly redundant I/O for consecutive calls, but guarantees correctness. The tasks directory is typically small (<50 files), and the FileSource.load() + TaskGraph.fromTasks pipeline is fast (about 150ms for 43 tasks in the benchmark under Performance Budget).
Reference: See ADR-002

D3: `risk` Operation Merges `risk-path` and Risk Distribution

Context: The CLI has separate risk (distribution) and risk-path (path) subcommands. Both are risk-related and an agent asking "what's the risk situation?" wants both.
Choice: Single risk operation returns both risk distribution (grouped by category) and risk path (the highest-cumulative-risk path through the DAG).
Consequences: One call gives the full risk picture. Saves the agent from needing two calls and correlating results.
Reference: See ADR-003

D4: `decompose` Takes Task ID, Not Raw Attributes

Context: shouldDecomposeTask() in the core library accepts TaskGraphNodeAttributes directly (an object with id, name, risk, scope, impact, etc. — all categorical fields nullable). The plugin could expose this raw or resolve by task ID.
Choice: The decompose operation takes a task id, looks up the task from the graph (graph.getTask(id)), and passes its attributes to shouldDecomposeTask().
Consequences: Agent-friendly — just pass the task ID rather than reconstructing attributes. If the task doesn't exist, a clear error is returned. The library function is still available for programmatic use; this is an interface convenience.

D5: `cost` Defaults Match SDD Process

Context: workflowCost() supports propagationMode (independent vs dag-propagate), defaultQualityRetention, and includeCompleted. Different defaults make sense for different workflows.
Choice: Default to propagationMode: "dag-propagate", includeCompleted: false, defaultQualityRetention: 0.9 — matching the Spec-Driven Development (SDD) process's assumption that completed tasks are factored out of remaining cost, and that quality degrades probabilistically across dependencies. See SDD Process for the overall workflow.
Consequences: The most common use case (active project planning) gets sensible defaults. Agents can override per-call.
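
A per-call override could be merged over these defaults as in the sketch below. The option names and default values are the ones given in D5. The helper name resolveCostOptions and the 0 to 1 range check on defaultQualityRetention are assumptions of this example.

```typescript
type PropagationMode = "independent" | "dag-propagate"

interface CostOptions {
  propagationMode: PropagationMode
  defaultQualityRetention: number
  includeCompleted: boolean
}

// Defaults stated in D5.
const COST_DEFAULTS: CostOptions = {
  propagationMode: "dag-propagate",
  defaultQualityRetention: 0.9,
  includeCompleted: false,
}

// Merge per-call overrides over the defaults, ignoring values of the wrong
// type. The [0, 1] bound on retention is an assumption of this sketch.
function resolveCostOptions(args: Record<string, unknown>): CostOptions {
  const options: CostOptions = { ...COST_DEFAULTS }
  const mode = args["propagationMode"]
  if (mode === "independent" || mode === "dag-propagate") {
    options.propagationMode = mode
  }
  const retention = args["defaultQualityRetention"]
  if (typeof retention === "number" && retention >= 0 && retention <= 1) {
    options.defaultQualityRetention = retention
  }
  const include = args["includeCompleted"]
  if (typeof include === "boolean") {
    options.includeCompleted = include
  }
  return options
}
```

Silently ignoring a malformed override keeps the call usable; a stricter handler could instead return an error string naming the bad argument.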

D6: Separate `registry.ts` From `tools.ts`

Context: Open-memory puts all handler logic in tools.ts (~500 lines). That works for a single cohesive domain (SQL queries) but open-tasks has 14 operations that each wrap a distinct library function.
Choice: tools.ts defines the tool schema and dispatch. registry.ts maps operation names to handler functions. Each operation is a separate file under operations/.
Consequences: Each operation is independently understandable and testable. Adding a new operation means adding one file and one registry entry, not editing a growing monolith.

D7: TaskSource Abstraction

Context: v1 reads tasks from a local tasks/ directory. Future sources could include API endpoints, databases, or remote project management tools. Hardcoding file I/O in each operation would make this evolution painful.
Choice: Define a TaskSource interface with a single load() method returning SourceResult { tasks, rawFiles, errors }. v1 implements FileSource (reads from filesystem). The source is resolved once at plugin initialization and passed to all operations.
Consequences: Operations are decoupled from I/O. FileSource uses Bun.Glob for discovery and parseFrontmatter for parsing. Future ApiSource would swap in a fetch call. Test sources can provide in-memory data. The show operation gets raw file content via rawFiles — no second I/O pass needed.

D8: Bun.Glob Over `parseTaskDirectory`

Context: @alkdev/taskgraph provides parseTaskFile and parseTaskDirectory for file I/O. However, parseTaskDirectory silently skips invalid files and returns only TaskInput[] — no raw content, no error detail.
Choice: Use Bun.Glob("**/*.md") for directory scanning, Bun.file() for reading, and parseFrontmatter() (singular) for parsing. The show operation needs full markdown content (not just frontmatter), and validate needs to report filenames with errors.
Consequences: Single I/O pass per call. We get raw file content for show, error detail for validate, and the same parseFrontmatter parsing we'd get from the library. The library is still the dependency for parseFrontmatter, TaskGraph, and all analysis — we just don't use its directory-scanning convenience function.

Interfaces

Plugin Entry (`src/index.ts`)

import type { Plugin, PluginOptions } from "@opencode-ai/plugin"
import { Value } from "@alkdev/typebox/value"
import { ConfigSchema, type Config } from "./config.js"
import { createSource } from "./sources/index.js"
import { createTools } from "./tools.js"

const OpenTasksPlugin: Plugin = async (ctx, options) => {
  const config = resolveConfig(options)
  const source = createSource(config, ctx.directory)

  return {
    tool: createTools(ctx, source),
  }
}

function resolveConfig(options?: PluginOptions): Config {
  if (options && Object.keys(options).length > 0) {
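    if (!Value.Check(ConfigSchema, options)) {
      // Invalid config: warn, then let Value.Cast below coerce it to the schema
      console.warn("[open-tasks] invalid plugin config; falling back to defaults")
    }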
    return Value.Cast(ConfigSchema, options) as Config
  }
  return { tasksPath: "tasks" }
}

export default OpenTasksPlugin

No hooks in v1. Future: task status injection into system prompt (similar to open-memory's context awareness hook).

Tool Definition (`src/tools.ts`)

Single tool with {tool: string, args?: Record<string, unknown>} schema. The tool field dispatches to an operation handler via the registry. Unknown tool names produce a friendly error directing to tasks({tool: "help"}).

The source is passed from the plugin entry to createTools() and stored in the registry for all operations to use.
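
The dispatch step can be sketched as a lookup in a table of handlers. The two toy operations, the ToolCall type, and the exact wording of the error message are assumptions for illustration; the source and context parameters are omitted to keep the example short.

```typescript
type Handler = (args: Record<string, unknown>) => string | Promise<string>

// Hypothetical registry with two toy operations.
const registry: Record<string, Handler> = {
  help: () => "Available operations: help, list",
  list: (args) => `Listing tasks (status filter: ${String(args["status"] ?? "none")})`,
}

interface ToolCall {
  tool: string
  args?: Record<string, unknown>
}

async function dispatch(call: ToolCall): Promise<string> {
  // Own-property check so names like "toString" are not treated as operations.
  if (!Object.prototype.hasOwnProperty.call(registry, call.tool)) {
    return `Unknown operation "${call.tool}". Run tasks({tool: "help"}) to see available operations.`
  }
  const handler = registry[call.tool] as Handler
  return handler(call.args ?? {})
}
```

Adding an operation means adding one entry to the table; the outer tool schema stays unchanged.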

Operation Handler Signature

import type { PluginInput } from "@opencode-ai/plugin"
import type { TaskSource } from "./sources/types.js"

type OperationHandler = (
  args: Record<string, unknown>,
  source: TaskSource,
  ctx: PluginInput,
) => string | Promise<string>

Each handler receives raw args (which the handler validates itself), the TaskSource for loading task data, and the plugin context. PluginInput provides directory (the workspace root) and the worktree path. The handler returns a formatted markdown string.

Compatibility Surface

This plugin depends on @alkdev/taskgraph for all graph and parsing operations. Any contract divergence between the library and existing task files surfaces as a runtime issue in the plugin — and these are easy to miss until they break.

Resolved: The Rust CLI uses depends_on (snake_case) in YAML frontmatter while the TypeScript library uses dependsOn (camelCase). This was a bug in the library's parser — parseFrontmatter() would silently strip depends_on and then fail on the missing required field. Fixed in @alkdev/taskgraph v0.0.2: a normalization step now maps depends_on → dependsOn before schema validation, so both forms are accepted transparently. See ADR-004.

The broader lesson remains: issues upstream increase the surface area of issues downstream. A naming convention in the Rust tooling created a fault line that propagated to every consumer. These are the corners that are hard to see around in linear text — exactly what DAG-structured task analysis is designed to surface.

Constraints

Read-only — the plugin never writes to the filesystem. Task mutations happen through Write/Edit tools.
No network in v1 — FileSource reads local files only. The TaskSource abstraction makes future network sources possible but v1 has no ApiSource.
No state between calls — each invocation is independent. No caching, no session storage.
Task files are the source of truth — markdown files in tasks/ directory (or configured path). No database, no alternative storage in v1.
Depends on @alkdev/taskgraph — all graph construction and frontmatter parsing comes from the core library. This plugin provides the I/O layer, config, and formatting. Contract changes in the library (field naming, schema changes) propagate here — see Compatibility Surface.
Task directory required — operations fail gracefully if no tasks/ directory is found, returning a clear message about where to create one.
Circular dependency handling — if TaskGraph.fromTasks() detects cycles via the topologicalOrder() path, the cycles operation surfaces the cycle details. Other operations that rely on topological ordering (topo, critical, parallel, cost) report the error and suggest running cycles first.
Frontmatter key normalization resolved — @alkdev/taskgraph v0.0.2+ accepts both depends_on and dependsOn in YAML frontmatter. The plugin pins ^0.0.2. See ADR-004 and Compatibility Surface.
Operations never touch the filesystem directly — they go through TaskSource.load(). This enforces the read-only constraint and makes operations testable with in-memory sources.

Error Handling

Operations encounter two categories of errors:

Infrastructure Errors (tasks directory / file I/O)

No tasks directory: Return a clear message identifying the searched paths and how to create a tasks/ directory
Empty tasks directory: Return "No task files found in <path>"
Malformed task file: Include the filename and parse error in the output. Other valid files are still processed — a single bad file does not block the entire operation
File permission errors: Return the OS error with the file path. Operation continues processing remaining files

Graph Errors (validation / cycles)

Cycle detection: The cycles operation surfaces all cycles. Operations that require topological ordering (topo, critical, parallel, cost) catch CircularDependencyError and return a message suggesting tasks({tool: "cycles"}) first
Validation errors: The validate operation returns both schema errors (field-level: invalid enums, missing required fields) and graph errors (dangling references, duplicate edges). Other operations call graph.validate() only when structural correctness matters
Task not found: Operations that take a task id return a clear "not found" message listing the available task IDs (up to 20)
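
The "task not found" message could be built as in this sketch. The cap of 20 IDs comes from the text above. The helper name taskNotFound, the message wording, and the trailing "and N more" line are assumptions.

```typescript
// Build a "not found" message that lists at most `limit` known ids.
function taskNotFound(id: string, knownIds: string[], limit: number = 20): string {
  const shown = knownIds.slice(0, limit)
  const hidden = knownIds.length - shown.length
  const lines = [`Task "${id}" not found.`, "", "Available task IDs:"]
  for (const known of shown) {
    lines.push(`- ${known}`)
  }
  // Tell the agent the list was truncated instead of silently cutting it.
  if (hidden > 0) {
    lines.push(`- ... and ${hidden} more`)
  }
  return lines.join("\n")
}
```

Capping the list keeps the error useful for small projects without flooding the agent's context on large ones.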

Error Format

All errors are returned as markdown-formatted strings (not thrown). The agent sees a helpful message, not a stack trace. This matches open-memory's pattern where every handler returns a string.
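
A wrapper along these lines would turn thrown errors into returned strings. The class below is a local stand-in for the library's CircularDependencyError, and matching on the error's name rather than its class is a choice of this sketch, not a confirmed property of @alkdev/taskgraph. The wrapper name withErrorFormatting is also hypothetical.

```typescript
// Local stand-in; the real CircularDependencyError comes from the library.
class CircularDependencyError extends Error {
  constructor(message: string) {
    super(message)
    this.name = "CircularDependencyError"
  }
}

type Handler = (args: Record<string, unknown>) => string | Promise<string>

// Wrap a handler so failures come back as markdown text, not exceptions.
function withErrorFormatting(operation: string, handler: Handler): Handler {
  return async (args) => {
    try {
      return await handler(args)
    } catch (err) {
      if (err instanceof Error && err.name === "CircularDependencyError") {
        return `**${operation} failed:** the task graph contains a cycle. Run tasks({tool: "cycles"}) first.`
      }
      const message = err instanceof Error ? err.message : String(err)
      return `**${operation} failed:** ${message}`
    }
  }
}

// Toy handlers that always fail, to exercise both branches.
const topoHandler = withErrorFormatting("topo", () => {
  throw new CircularDependencyError("a -> b -> a")
})
const showHandler = withErrorFormatting("show", () => {
  throw new Error("permission denied: tasks/a.md")
})
```

Applying the wrapper once in the registry keeps individual handlers free of repetitive try/catch blocks.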

Performance Budget

Each operation should complete within these targets (assumes ≤50 task files):

Operation	Target	Reasoning
`help`, `list`, `show`, `deps`, `dependents`	<200ms	Single-pass read + format
`validate`, `topo`, `cycles`	<300ms	Graph construction + traversal
`critical`, `parallel`, `bottleneck`	<400ms	Graph construction + analysis
`risk`, `cost`	<500ms	Graph construction + cost-benefit analysis
`decompose`	<200ms	Single task lookup + check

At 100+ files, expect 2-3x slowdown. The dominant cost is file I/O (reading and parsing YAML), not graph algorithms.

Benchmark data (43 tasks, all analysis functions, Bun runtime):

Glob scan (Bun.Glob): ~1ms
File read + parse (parseFrontmatter per file): ~140ms
Graph construction (TaskGraph.fromTasks): ~5ms
All six analysis functions combined: ~17ms
Total pipeline: ~150ms

The Rust CLI is faster on raw file I/O and YAML parsing (native binary, no JS overhead), but the plugin wins on overall call latency — no subprocess spawn, no plain-text parsing by the LLM, no context-wasting bash composition. The ~150ms is well within agent tool call budgets.

Versioning

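The plugin pins @alkdev/taskgraph at ^0.0.2 in package.json dependencies. For a 0.0.x version, npm's caret range matches only that exact patch (>=0.0.2 <0.0.3), so every library upgrade is an explicit change in this plugin. Once the library stabilizes, the range should be limited to a single minor version to prevent unexpected contract changes. Major version bumps in the library require explicit review of this plugin's compatibility surface.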

Operation Lifecycle

New operations can be added freely — the registry pattern means no schema bloat. When an operation needs removal:

Mark as deprecated in the help text for one minor version
Return a deprecation notice from the handler for one minor version
Remove in the next major version
Any removal requires an ADR documenting the reason

Test Strategy

Unit tests: Each operation handler tested with mock TaskGraph inputs (no file I/O). @alkdev/taskgraph functions are mocked — we test formatting and dispatch, not the library's analysis.
Integration tests: End-to-end tool dispatch with a fixture tasks/ directory containing sample task files. Tests write temporary files, invoke operations, and assert on markdown output.
Error tests: Missing tasks/ directory, malformed YAML, cyclic graphs, missing task IDs — each error path has at least one test.
Run with bun test. Test fixtures live in test/fixtures/tasks/.

Formatting Conventions

Tables for list, cost, bottleneck — pipe-delimited columns, sorted by relevance
Hierarchical lists for deps, dependents — indented dependency chains
Sectioned output for risk — distribution table followed by risk path
Header + detail for show — frontmatter fields as labeled list, then markdown body
Status badges for validate — ✓ valid / ✗ with error details
Grouped output for parallel — numbered generations with task lists

Relationship to Other Plugins

Plugin	Relationship
open-memory	Complementary — memory handles session introspection; tasks handles task graph analysis. Both use the registry pattern.
open-coordinator	Downstream consumer — coordinator uses `tasks` to identify parallelizable work, then spawns worktrees. The `parallel` and `critical` operations inform coordination decisions.
taskgraph CLI	Functional equivalent — the Rust CLI and this plugin expose the same operations, but this plugin is native TypeScript + in-process, while the CLI is a separate binary.
@alkdev/taskgraph	Core dependency — all graph operations. This plugin is a thin wrapper.

Open Questions

Should show include the task's markdown body? Task files can be long (especially with acceptance criteria and notes). Option A: always include full body. Option B: show returns frontmatter summary, show --full includes body. Recommendation: always include body — agents need the full context for implementation tasks, and show is on-demand (not in every call).
Should cost accept --format json? The CLI supports JSON output for programmatic consumption. Since the plugin returns to an agent (not a script), markdown is always appropriate. JSON output is out of scope.
Future hook: task status injection? Open-memory injects context percentage into the system prompt. Could open-tasks inject a brief task summary ("3 pending, 1 in-progress, 2 blocked")? This would require reading tasks on every message, which is cheap for small task sets but could be noisy. Defer to v2.

References

@alkdev/taskgraph API surface: see @alkdev/taskgraph docs/architecture/api-surface.md or the local clone at /workspace/@alkdev/taskgraph_ts/docs/architecture/api-surface.md
@alkdev/taskgraph README: local clone at /workspace/@alkdev/taskgraph_ts/README.md
open-memory architecture: /workspace/@alkdev/open-memory/docs/architecture.md (reference implementation for the registry pattern)
open-memory tools.ts: /workspace/@alkdev/open-memory/src/tools.ts (reference for handler pattern)
SDD process: ../sdd_process.md
OpenCode plugin SDK: @opencode-ai/plugin npm package
