Agent Definitions Pattern: Research & HUD/AUI Implications

1. alkhub_ts Agent Definitions

1.1 Directory Structure

Agent definitions in alkhub_ts live in .opencode/agents/ as individual Markdown files:

.opencode/agents/
├── architect.md
├── architecture-reviewer.md
├── code-reviewer.md
├── coordinator.md
├── decomposer.md
├── implementation-specialist.md
├── poc-specialist.md
└── research-specialist.md

1.2 File Format: YAML Frontmatter + Markdown Body

Each file uses YAML frontmatter (parsed with gray-matter) for structured metadata and a Markdown body for the system prompt:

---
description: Short one-liner describing the agent's purpose
mode: primary | subagent
temperature: 0.2
---

You are the **Role Name**, [long-form system prompt...]

Frontmatter fields observed across all 8 agents:

| Field | Type | Required | Purpose |
| --- | --- | --- | --- |
| description | string | yes | One-line summary shown in agent picker / @ autocomplete |
| mode | "primary" \| "subagent" | yes | Whether the agent appears as a top-level mode or only as a subagent |
| temperature | number | sometimes | Model sampling temperature override |

Additional fields supported by OpenCode but not used in alkhub_ts:

| Field | Type | Purpose |
| --- | --- | --- |
| model | string | Override the model (e.g., "anthropic/claude-sonnet-4") |
| variant | string | Model variant to use when using this agent's configured model |
| top_p | number | Top-p sampling override |
| hidden | boolean | Hide from the UI (for internal agents like compaction, title) |
| color | string | Hex color or theme color for UI display |
| steps | number | Maximum agentic iterations before forcing text-only response |
| permission | object | Per-tool permission rules (allow/deny/ask) |
| options | object | Arbitrary provider options merged into model calls |
| disable | boolean | Disable a built-in agent |
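
A minimal sketch of how a file in this format splits into metadata and prompt. It is hand-rolled to stay dependency-free and only handles flat `key: value` pairs; OpenCode itself relies on gray-matter, which supports full YAML. The names `parseAgentFile`, `parseScalar`, and `ParsedAgentFile` are hypothetical and exist only for illustration.

```typescript
// Hypothetical helper: splits "---\n<frontmatter>\n---\n<body>" into data and body.
// Only flat `key: value` pairs are supported; real YAML needs a real parser.
type FrontmatterValue = string | number | boolean

interface ParsedAgentFile {
  data: Record<string, FrontmatterValue>
  body: string
}

function parseScalar(raw: string): FrontmatterValue {
  const value = raw.trim()
  if (value === "true") return true
  if (value === "false") return false
  if (value !== "" && !Number.isNaN(Number(value))) return Number(value)
  // Strip one pair of matching surrounding quotes, if present.
  const quoted = /^(["'])(.*)\1$/.exec(value)
  return quoted ? quoted[2] : value
}

function parseAgentFile(source: string): ParsedAgentFile {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source)
  // No frontmatter block: treat the whole file as the prompt body.
  if (!match) return { data: {}, body: source.trim() }

  const data: Record<string, FrontmatterValue> = {}
  for (const line of match[1].split(/\r?\n/)) {
    const colon = line.indexOf(":")
    if (colon === -1 || line.trim().startsWith("#")) continue
    data[line.slice(0, colon).trim()] = parseScalar(line.slice(colon + 1))
  }
  return { data, body: match[2].trim() }
}

const parsedExample = parseAgentFile(
  [
    "---",
    "description: Reviews code quality at checkpoints",
    "mode: subagent",
    "temperature: 0.2",
    "---",
    "",
    "You are the **Code Reviewer**.",
  ].join("\n"),
)
```

Here `parsedExample.data` holds the three frontmatter fields and `parsedExample.body` holds the system prompt text.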

1.3 Agent Roles in alkhub_ts

The 8 agents form a coordinated workflow:

| Agent | Mode | Role |
| --- | --- | --- |
| coordinator | primary | Orchestrates parallel task execution across worktrees |
| architect | primary | Creates/maintains architecture specifications (WHAT & WHY) |
| decomposer | primary | Breaks architecture into atomic, dependency-ordered tasks |
| implementation-specialist | primary | Executes atomic tasks in isolated worktrees |
| poc-specialist | primary | Creates proof-of-concepts in research worktrees |
| research-specialist | subagent | Researches technical topics, documents findings |
| code-reviewer | subagent | Reviews code quality at checkpoints |
| architecture-reviewer | subagent | Reviews architecture specs for gaps/risks |

Key patterns:

  • Primary agents are selectable top-level modes in the TUI
  • Subagents are invoked only via the @agent-name syntax or programmatically via the task tool
  • Each agent has a detailed system prompt defining its workflow, constraints, and output format
  • The coordinator prompt describes both the current (open-coordinator plugin) and future (hub operations) execution models

1.4 Agent Prompt Design Patterns

The alkhub_ts agents demonstrate several reusable patterns:

  1. Environment scoping: Implementation specialist and POC specialist both specify exact worktree paths and use workdir parameter patterns
  2. Workflow phases: Structured numbered steps (1. Load Task → 2. Verify → 3. Implement → 4. Verify → 5. Update → 6. Commit)
  3. Safe Exit protocol: Standardized failure handling with status updates and escalation
  4. Role constraints: "You coordinate, you do not implement" — explicit boundaries
  5. Template outputs: Structured output templates (review reports, research documents)
  6. Tool gating: References to specific tools available to the agent

2. OpenCode Agent System (Source Code Analysis)

2.1 Agent Schema (Agent.Info)

Defined in /workspace/opencode/packages/opencode/src/agent/agent.ts (lines 27-52):

export const Info = z.object({
  name: z.string(),
  description: z.string().optional(),
  mode: z.enum(["subagent", "primary", "all"]),
  native: z.boolean().optional(),
  hidden: z.boolean().optional(),
  topP: z.number().optional(),
  temperature: z.number().optional(),
  color: z.string().optional(),
  permission: Permission.Ruleset,
  model: z.object({
    modelID: ModelID.zod,
    providerID: ProviderID.zod,
  }).optional(),
  variant: z.string().optional(),
  prompt: z.string().optional(),
  options: z.record(z.string(), z.any()),
  steps: z.number().int().positive().optional(),
})

2.2 Config Schema (Config.Agent)

Defined in /workspace/opencode/packages/opencode/src/config/config.ts (lines 466-553):

export const Agent = z.object({
  model: ModelId.optional(),
  variant: z.string().optional(),
  temperature: z.number().optional(),
  top_p: z.number().optional(),
  prompt: z.string().optional(),
  tools: z.record(z.string(), z.boolean()).optional(),  // deprecated
  disable: z.boolean().optional(),
  description: z.string().optional(),
  mode: z.enum(["subagent", "primary", "all"]).optional(),
  hidden: z.boolean().optional(),
  options: z.record(z.string(), z.any()).optional(),
  color: z.union([z.string().regex(...), z.enum([...])]).optional(),
  steps: z.number().int().positive().optional(),
  maxSteps: z.number().int().positive().optional(),  // deprecated
  permission: Permission.optional(),
}).catchall(z.any()).transform(...)

Notable: catchall(z.any()) lets unknown fields in the YAML frontmatter or JSON config pass validation, and the transform step then sweeps them into options. This is by design: it allows arbitrary per-agent configuration that gets merged into model call parameters.
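
The effect of sweeping unknown fields into options can be sketched in plain TypeScript. This is an illustrative stand-in for the elided `.transform(...)` step, not the actual OpenCode code; `sweepUnknownIntoOptions` and `KNOWN_AGENT_KEYS` are hypothetical names, and the rule that a swept field overrides an identically named key already in `options` is an assumption.

```typescript
// Hypothetical stand-in for the elided transform: keys outside the known set
// are moved under `options`. The key list mirrors the schema shown above.
const KNOWN_AGENT_KEYS = new Set([
  "model", "variant", "temperature", "top_p", "prompt", "tools", "disable",
  "description", "mode", "hidden", "options", "color", "steps", "maxSteps", "permission",
])

type RawAgentConfig = Record<string, unknown>

function sweepUnknownIntoOptions(raw: RawAgentConfig): RawAgentConfig {
  const existing = raw["options"]
  // Start from a copy of any explicitly provided options object.
  const options: Record<string, unknown> =
    typeof existing === "object" && existing !== null
      ? { ...(existing as Record<string, unknown>) }
      : {}
  const result: RawAgentConfig = {}

  for (const [key, value] of Object.entries(raw)) {
    if (key === "options") continue
    if (KNOWN_AGENT_KEYS.has(key)) result[key] = value
    else options[key] = value // unknown field: treated as a provider option
  }
  result["options"] = options
  return result
}

const sweptExample = sweepUnknownIntoOptions({
  mode: "subagent",
  temperature: 0.2,
  reasoningEffort: "high", // not a schema field, so it lands in options
})
```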

2.3 Loading Pipeline

Agent definitions are loaded from four directory patterns (in /workspace/opencode/packages/opencode/src/config/config.ts, line 209):

/.opencode/agent/    (singular)
/.opencode/agents/   (plural)
/agent/              (singular, no dot)
/agents/             (plural, no dot)

The loading function loadAgent() (lines 189-226):

  1. Globs for *.md files in all matching directories
  2. Parses each file with ConfigMarkdown.parse() which uses gray-matter to extract YAML frontmatter
  3. Extracts the agent name from the file path (stripping directory prefixes and .md extension)
  4. Combines frontmatter data + markdown body as prompt
  5. Validates against the Agent schema
  6. Returns a Record<string, Agent> mapping name → config

Name resolution (line 211):

const patterns = ["/.opencode/agent/", "/.opencode/agents/", "/agent/", "/agents/"]
const file = rel(item, patterns) ?? path.basename(item)
const agentName = trim(file)  // removes .md extension

This means:

  • .opencode/agents/coordinator.md → agent name "coordinator"
  • .opencode/agents/nested/child.md → agent name "nested/child"
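
The name resolution above can be sketched as follows. The bodies of `rel` and `trim` are not shown in the source, so `relativeToPattern` and `stripMarkdownExtension` below are assumed reimplementations (first matching pattern wins, first occurrence in the path), not the real helpers.

```typescript
const AGENT_DIR_PATTERNS = ["/.opencode/agent/", "/.opencode/agents/", "/agent/", "/agents/"]

// Assumed behavior of `rel`: return the part of the path after the first pattern found.
function relativeToPattern(filePath: string, patterns: string[]): string | undefined {
  for (const pattern of patterns) {
    const index = filePath.indexOf(pattern)
    if (index !== -1) return filePath.slice(index + pattern.length)
  }
  return undefined
}

// Assumed behavior of `trim`: drop a trailing .md extension.
function stripMarkdownExtension(file: string): string {
  return file.endsWith(".md") ? file.slice(0, -3) : file
}

function agentNameFromPath(filePath: string): string {
  // Normalize Windows separators so the patterns match on every platform.
  const normalized = filePath.replace(/\\/g, "/")
  const base = normalized.slice(normalized.lastIndexOf("/") + 1)
  return stripMarkdownExtension(relativeToPattern(normalized, AGENT_DIR_PATTERNS) ?? base)
}
```

Keeping the directory prefix for nested files is what yields names such as "nested/child" rather than a bare "child".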

2.4 Merge Strategy

Built-in agents (build, plan, general, explore, compaction, title, summary) are defined in code. User-defined agents from .opencode/agents/*.md are merged on top:

for (const [key, value] of Object.entries(cfg.agent ?? {})) {
  if (value.disable) {
    delete agents[key]
    continue
  }
  let item = agents[key]
  if (!item) {
    item = agents[key] = {
      name: key,
      mode: "all",
      permission: Permission.merge(defaults, user),
      options: {},
      native: false,
    }
  }
  // Merge each field: prompt, model, temperature, mode, etc.
  item.prompt = value.prompt ?? item.prompt
  item.model = value.model ? Provider.parseModel(value.model) : item.model
  item.variant = value.variant ?? item.variant
  // ... etc
}

Key behaviors:

  • disable: true removes a built-in agent entirely
  • If a new name doesn't match a built-in, a fresh agent with mode: "all" is created
  • Frontmatter fields override built-in values (not deep-merge for most fields)
  • Permission configs are merged (not replaced)
  • options are deep-merged with mergeDeep()
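
A simplified, self-contained sketch of these merge behaviors. Permissions are left out, `deepMerge` is a hand-written stand-in for mergeDeep(), and the `ResolvedAgent` / `AgentOverride` types are hypothetical reductions of the real schemas.

```typescript
interface ResolvedAgent {
  name: string
  mode: "subagent" | "primary" | "all"
  native: boolean
  prompt?: string
  temperature?: number
  options: Record<string, unknown>
}

interface AgentOverride {
  disable?: boolean
  mode?: ResolvedAgent["mode"]
  prompt?: string
  temperature?: number
  options?: Record<string, unknown>
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Recursive merge for plain objects; arrays and scalars from `source` win.
function deepMerge(
  target: Record<string, unknown>,
  source: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...target }
  for (const [key, value] of Object.entries(source)) {
    const current = out[key]
    out[key] = isPlainObject(current) && isPlainObject(value) ? deepMerge(current, value) : value
  }
  return out
}

function mergeAgents(
  builtIn: Record<string, ResolvedAgent>,
  overrides: Record<string, AgentOverride>,
): Record<string, ResolvedAgent> {
  const agents: Record<string, ResolvedAgent> = {}
  for (const [key, agent] of Object.entries(builtIn)) agents[key] = { ...agent }

  for (const [key, value] of Object.entries(overrides)) {
    if (value.disable) {
      delete agents[key] // disable: true removes the agent entirely
      continue
    }
    // Unknown names get a fresh agent with mode "all".
    const item: ResolvedAgent = agents[key] ?? { name: key, mode: "all", native: false, options: {} }
    agents[key] = {
      ...item,
      mode: value.mode ?? item.mode, // scalar fields: plain override
      prompt: value.prompt ?? item.prompt,
      temperature: value.temperature ?? item.temperature,
      options: deepMerge(item.options, value.options ?? {}), // options: deep merge
    }
  }
  return agents
}

const mergedExample = mergeAgents(
  {
    build: { name: "build", mode: "primary", native: true, prompt: "built-in", options: { thinking: { budget: 1000, enabled: true } } },
    plan: { name: "plan", mode: "primary", native: true, options: {} },
  },
  {
    build: { temperature: 0.1, options: { thinking: { budget: 2000 } } },
    plan: { disable: true },
    coordinator: { prompt: "You coordinate, you do not implement." },
  },
)
```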

2.5 System Prompt Assembly

When an LLM call is made, the system prompt is assembled in this order (from /workspace/opencode/packages/opencode/src/session/llm.ts, lines 101-126):

const system: string[] = []
system.push(
  [
    // 1. Agent-specific prompt OR provider default prompt
    ...(input.agent.prompt ? [input.agent.prompt] : SystemPrompt.provider(input.model)),
    // 2. Custom system prompt from the call
    ...input.system,
    // 3. Custom system prompt from the user message
    ...(input.user.system ? [input.user.system] : []),
  ]
    .filter((x) => x)
    .join("\n"),
)

Then the plugin hook experimental.chat.system.transform is triggered, allowing plugins to modify the system prompt array.
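
A standalone sketch of this first assembly stage followed by the plugin transform. `assembleSystem`, `AssembleInput`, and `SystemTransform` are hypothetical names; `providerDefault` stands in for the result of SystemPrompt.provider(model), and the transform callbacks approximate what a plugin hook could do to the array.

```typescript
interface AssembleInput {
  agentPrompt?: string
  providerDefault: string[] // stand-in for SystemPrompt.provider(model)
  callSystem: string[]
  userSystem?: string
}

// A transform may mutate the system array in place, as the plugin hook does.
type SystemTransform = (system: string[]) => void

function assembleSystem(input: AssembleInput, transforms: SystemTransform[] = []): string[] {
  const first = [
    // Agent prompt replaces the provider default rather than adding to it.
    ...(input.agentPrompt ? [input.agentPrompt] : input.providerDefault),
    ...input.callSystem,
    ...(input.userSystem ? [input.userSystem] : []),
  ]
    .filter((x) => x) // drop empty segments
    .join("\n")

  const system = [first]
  for (const transform of transforms) transform(system)
  return system
}
```

For example, passing a transform such as `(system) => system.push("HUD: context at 42%")` yields a two-element array: the joined base prompt followed by the injected status line.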

After this, additional segments are added (from /workspace/opencode/packages/opencode/src/session/prompt.ts, lines 1500-1509):

const [skills, env, instructions, modelMsgs] = yield* Effect.all([
  Effect.promise(() => SystemPrompt.skills(agent)),
  Effect.promise(() => SystemPrompt.environment(model)),
  instruction.system(),
  Effect.promise(() => MessageV2.toModelMessages(msgs, model)),
])
const system = [...env, ...(skills ? [skills] : []), ...instructions]

The full system prompt order (earlier entries come first; later entries are appended):

  1. Agent prompt (from .opencode/agents/*.md body) — or a model-specific default (anthropic.txt, gpt.txt, etc.)
  2. Custom system (from plugin hooks, compaction, plan mode injection)
  3. User-provided system prompt (from the user message)
  4. Plugin modifications via experimental.chat.system.transform
  5. Environment info (model name, working directory, platform, date)
  6. Skills list (markdown-formatted available skills)
  7. Instruction files (AGENTS.md, CLAUDE.md found walking up directory tree)

2.6 Agent Name Usage in Messages

The AgentPart type (SDK types, lines 833-844):

export type AgentPart = {
  id: string
  sessionID: string
  messageID: string
  type: "agent"
  name: string           // agent name, e.g. "explore"
  source?: { value: string, start: number, end: number }
}

When a user types @explore in their message, OpenCode parses this into an AgentPart. During prompt processing, if the text contains @agent-name, it resolves to the corresponding agent definition, and the subagent is launched via the task tool.
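
A rough sketch of how @mentions might be turned into AgentPart-like records. The mention syntax (an "@" at the start of the text or after whitespace, followed by letters, digits, "-", "_" or "/") and the exclusive `end` offset are assumptions; OpenCode's actual parser may differ. `findAgentMentions` and `AgentMention` are hypothetical names.

```typescript
interface AgentMention {
  type: "agent"
  name: string
  source: { value: string; start: number; end: number }
}

function findAgentMentions(text: string, knownAgents: string[]): AgentMention[] {
  // Group 1 captures the leading boundary so the offset can skip it.
  const pattern = /(^|\s)@([A-Za-z0-9_\/-]+)/g
  const mentions: AgentMention[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(text)) !== null) {
    const name = match[2]
    if (!knownAgents.includes(name)) continue // ignore unknown names
    const start = match.index + match[1].length
    const value = "@" + name
    mentions.push({ type: "agent", name, source: { value, start, end: start + value.length } })
  }
  return mentions
}
```

Filtering against the list of known agents keeps stray "@" tokens in a message from being treated as agent invocations.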

2.7 Agent Generation

OpenCode includes an LLM-powered agent generator (Agent.generate()). When invoked, it:

  1. Collects the list of existing agent names to avoid collisions
  2. Uses a structured output call with schema { identifier, whenToUse, systemPrompt }
  3. The prompt (generate.txt) instructs the model to create an agent configuration

This is used by the /agent command in the CLI to dynamically create agents from descriptions.


3. Relationship Between Agents and Sessions

3.1 Agent per Message, Not per Session

Each user message carries an agent field indicating which agent handled it. This is NOT a session-level property — a single session can switch between agents:

// Message info structure (simplified)
interface MessageInfo {
  id: MessageID
  role: "user" | "assistant"
  agent: string       // e.g. "build", "explore", "coordinator"
  model: { providerID, modelID }
  // ...
}

From prompt.ts line 1593:

const agentName = cmd.agent ?? input.agent ?? (yield* agents.defaultAgent())

This means:

  • A user can type @explore mid-conversation to switch to the explore agent for that turn
  • The next turn may return to the default agent
  • Each message remembers which agent produced it

3.2 Agent Switching and Plan Mode

Plan mode has special handling. From prompt.ts lines 261-302:

  • When switching FROM plan TO build, a system reminder is injected explaining the transition
  • When NOT in plan mode but the previous assistant message was from plan, a different reminder is injected
  • Plan mode restricts edit permissions

3.3 No Agent-Scoped State or Memory

OpenCode does not have a concept of "agent state" or "agent-scoped memory". Each agent is stateless — it's defined by its:

  • System prompt
  • Permission ruleset
  • Model configuration
  • Tool access

State lives in the session (messages, tool results, compaction summaries). The agent definition is purely declarative configuration for how to run LLM calls within a session.

The options field on agents supports arbitrary key-value pairs that get merged into LLM call parameters, but these are static configuration, not runtime state.


4. Relevance to HUD/AUI Concept

4.1 Could HUD Sections Be Defined as Declarative Configs?

Yes — and the agent definition pattern provides a strong analogy.

An agent definition is essentially:

frontmatter (structured metadata) → controls behavior
markdown body (unstructured prompt) → controls content

A HUD section definition could follow the same pattern:

---
section: context-status
position: top
refresh: on-event          # on-event | on-demand | periodic
priority: 10
collapse-threshold: 70     # percentage above which to always expand
always-show: false
---

Template for rendering this section (can reference data sources)...

Just as agent definitions declare their mode, temperature, and permission, HUD definitions would declare their position, refresh strategy, and data requirements.
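
One way the parsed frontmatter of such a section could be normalized into a typed object. Everything here is hypothetical: the `HudSectionMeta` shape, the `normalizeHudMeta` name, and the default values (position "top", refresh "on-event", priority 100, threshold 70) are assumptions chosen for illustration.

```typescript
type HudRefresh = "on-event" | "on-demand" | "periodic"

interface HudSectionMeta {
  section: string
  position: string
  refresh: HudRefresh
  priority: number
  collapseThreshold: number
  alwaysShow: boolean
}

function isHudRefresh(value: string): value is HudRefresh {
  return value === "on-event" || value === "on-demand" || value === "periodic"
}

function asString(value: unknown, fallback: string): string {
  return typeof value === "string" ? value : fallback
}

function asNumber(value: unknown, fallback: number): number {
  return typeof value === "number" && Number.isFinite(value) ? value : fallback
}

function asBoolean(value: unknown, fallback: boolean): boolean {
  return typeof value === "boolean" ? value : fallback
}

// `fallbackName` would typically be derived from the filename, as with agents.
function normalizeHudMeta(data: Record<string, unknown>, fallbackName: string): HudSectionMeta {
  const refresh = asString(data["refresh"], "on-event")
  if (!isHudRefresh(refresh)) throw new Error(`invalid refresh value: ${refresh}`)
  return {
    section: asString(data["section"], fallbackName),
    position: asString(data["position"], "top"),
    refresh,
    priority: asNumber(data["priority"], 100),
    collapseThreshold: asNumber(data["collapse-threshold"], 70),
    alwaysShow: asBoolean(data["always-show"], false),
  }
}
```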

4.2 Declarative vs. Imperative: What Agent Definitions Teach Us

Agent definitions are declarative configs with a procedural core:

| Aspect | Agent Definition | HUD Definition (Proposed) |
| --- | --- | --- |
| Metadata | YAML frontmatter | YAML frontmatter |
| Content | Markdown system prompt | Markdown template or rendering spec |
| Behavior | Controls LLM call parameters | Controls HUD rendering and data fetching |
| Overrides | Built-in agents can be extended/overridden | Built-in HUD sections could be extended/overridden |
| Merge | Field-level override of built-ins; mergeDeep for options | Similar merge with project-level overrides |

The critical design insight from OpenCode's agent system: the same merge strategy that allows .opencode/agents/*.md files to override built-in agents could allow .opencode/hud/*.md files to override built-in HUD sections.

4.3 Project-Specific HUD Layouts

Different project types could have different HUD layouts, just as different projects have different agent rosters:

# A web app project might define:
.opencode/hud/context-bar.md    → Shows token usage, model, cost
.opencode/hud/task-tracker.md   → Shows task progress from tasks/*.md
.opencode/hud/test-runner.md    → Shows test results

# A data pipeline project might define:
.opencode/hud/pipeline-status.md → Shows last pipeline run status
.opencode/hud/data-quality.md    → Shows data quality metrics  
.opencode/hud/context-bar.md     → Override: add data volume info

This mirrors how coordinator.md uses worktree-specific context that implementation-specialist.md doesn't need.

4.4 How Could This Be Done Without Modifying OpenCode Core?

OpenCode's plugin system provides the necessary hooks. The relevant hooks are:

  1. experimental.chat.system.transform — already used by open-memory to inject context status. This hook receives { sessionID, model } and { system } (a mutable array of system prompt strings).

  2. experimental.session.compacting — receives compaction events.

  3. event — receives all SSE events, which include message updates with token counts.

A HUD definition system could work as a plugin:

@alkdev/open-memory/ (or a separate @alkdev/open-hud plugin)
├── src/
│   ├── index.ts           # Plugin entry
│   ├── hud/
│   │   ├── loader.ts      # Load .opencode/hud/*.md files (like loadAgent)
│   │   ├── renderer.ts    # Render HUD sections into system prompt
│   │   └── sections/      # Built-in section definitions
│   │       ├── context.md
│   │       ├── tasks.md
│   │       └── git.md
│   └── hooks/
│       ├── system-prompt.ts  # experimental.chat.system.transform
│       └── event.ts         # SSE event processing for data

The key architectural insight: we don't need OpenCode to render a visual HUD. Instead, we inject structured status information into the system prompt, and the agent's response becomes the "rendered" HUD. This is exactly what open-memory already does with context percentage injection.
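
A minimal sketch of the injection step inside such a plugin. The input and output shapes follow the description of the hook above ({ sessionID, model } and { system }), but the types are local stand-ins rather than imports from the OpenCode plugin SDK, and `HudRenderer`, `injectHudSections`, and `createHudHooks` are hypothetical names.

```typescript
// Local stand-ins for the hook's input/output shapes described above.
interface SystemTransformInput {
  sessionID: string
  model: unknown
}

interface SystemTransformOutput {
  system: string[] // mutable array of system prompt strings
}

// A renderer returns the text for one HUD section, or undefined to skip it.
type HudRenderer = (sessionID: string) => string | undefined

function injectHudSections(
  sessionID: string,
  output: SystemTransformOutput,
  renderers: HudRenderer[],
): void {
  for (const render of renderers) {
    const text = render(sessionID)
    if (text && text.trim() !== "") output.system.push(text)
  }
}

// Hypothetical hook table; the key matches the hook name used in this document.
function createHudHooks(renderers: HudRenderer[]) {
  return {
    "experimental.chat.system.transform": async (
      input: SystemTransformInput,
      output: SystemTransformOutput,
    ): Promise<void> => {
      injectHudSections(input.sessionID, output, renderers)
    },
  }
}
```

Keeping the injection logic in a plain synchronous function makes it easy to unit-test without a running OpenCode session.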

4.5 Proposed HUD Definition Schema

Drawing from the agent definition pattern:

---
# Section identity
name: context-status           # unique identifier (from filename)
description: Context window usage and status

# Rendering behavior
position: header               # header | sidebar | footer | inline
priority: 10                   # lower = shown first
refresh: on-event              # on-event | on-demand | periodic | once
collapse-threshold: 70         # auto-collapse below this threshold

# Data requirements
data-sources:
  - context-tracker            # from this plugin
  - session-info                # from OpenCode

# Rendering constraints
max-length: 500                # max chars in system prompt injection
always-show: false             # always inject, even when collapsed

# Agent targeting
agents:                        # which agents should see this section
  - build
  - plan
  # (null/undefined = all agents)
---

## Context Status

Your context window is at {{context.percentage}}% usage ({{context.tokens}} / {{context.limit}} tokens).

{{#if context.status.critical}}
⚠️ CRITICAL: Context usage above 92%. Consider using memory_compact() immediately.
{{else if context.status.red}}
🔴 Context usage above 85%. Consider compacting soon.
{{else if context.status.yellow}}
🟡 Context usage above 70%. Monitor but proceed normally.
{{else}}
🟢 Context usage is healthy (below 70%).
{{/if}}
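
The threshold logic in this template can also be expressed directly in code. The sketch below assumes the thresholds are strict ("above 70%" means greater than 70, so exactly 70% counts as healthy); `classifyContext`, `renderContextStatus`, and the `ContextStatus` shape are hypothetical names, not part of open-memory.

```typescript
type ContextLevel = "green" | "yellow" | "red" | "critical"

interface ContextStatus {
  percentage: number // rounded, for display only
  tokens: number
  limit: number
  level: ContextLevel
}

function classifyContext(tokens: number, limit: number): ContextStatus {
  if (limit <= 0) throw new Error("limit must be positive")
  // Compare scaled values to avoid floating-point error at the boundaries.
  const scaled = tokens * 100
  const level: ContextLevel =
    scaled > 92 * limit ? "critical" : scaled > 85 * limit ? "red" : scaled > 70 * limit ? "yellow" : "green"
  return { percentage: Math.round(scaled / limit), tokens, limit, level }
}

function renderContextStatus(status: ContextStatus): string {
  const advice: Record<ContextLevel, string> = {
    critical: "CRITICAL: Context usage above 92%. Consider using memory_compact() immediately.",
    red: "Context usage above 85%. Consider compacting soon.",
    yellow: "Context usage above 70%. Monitor but proceed normally.",
    green: "Context usage is healthy (below 70%).",
  }
  return (
    `Your context window is at ${status.percentage}% usage ` +
    `(${status.tokens} / ${status.limit} tokens).\n\n${advice[status.level]}`
  )
}
```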

4.6 Comparison: Agent Definitions vs. HUD Definitions

| Dimension | Agent Definition | HUD Definition (Proposed) |
| --- | --- | --- |
| Format | YAML frontmatter + Markdown body | YAML frontmatter + template body |
| Loading | .opencode/agents/*.md | .opencode/hud/*.md (or plugin-scoped) |
| Merge | Built-in + config + user overrides | Built-in + project overrides |
| Scope | Per-agent (LLM call config) | Per-section (status display config) |
| State | None (stateless config) | Reactive data sources |
| Output | System prompt content | System prompt injection (agent-visible) |
| Trigger | User selects @agent-name | System prompt assembly (every turn) |
| Data | Static config only | Dynamic (from SSE events, DB queries) |

4.7 Key Differences and Challenges

  1. Statefulness: Agent definitions are purely static config. HUD sections need reactive data (context percentage, session counts, git status). This requires runtime state management that doesn't exist in the agent system.

  2. Rendering: Agent definitions are consumed by the LLM as freeform text. HUD sections could be either:

    • Prompt-injection style (like current open-memory context injection): the agent "sees" the HUD
    • Tool-response style: the agent queries HUD data via a memory tool

     The agent definition pattern suggests prompt-injection, but tool-response may be better for on-demand data.

  3. Conditional visibility: Agent definitions have hidden and mode fields. HUD sections need richer conditions — "show only when context > 70%" or "show only when git has uncommitted changes". This is more complex than the simple boolean/enum agent system.

  4. Layout ordering: Agent definitions don't have a concept of ordering (they're selected by name). HUD sections need positional semantics (which section appears first, which is collapsible, etc.).

  5. Refresh cadence: Agent configs are loaded once. HUD data may need to refresh on events, periodically, or on-demand. The agent system has no equivalent concept.
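
Conditional visibility, agent targeting, and layout ordering from the list above could be combined in a single selection step. This is a sketch under assumptions: `HudSectionDef`, `selectSections`, and the `showAbovePercent` condition field are hypothetical and not part of any existing schema.

```typescript
interface HudSectionDef {
  name: string
  priority: number // lower = shown first
  agents?: string[] // undefined = visible to all agents
  showAbovePercent?: number // hypothetical condition: show only above this context usage
  alwaysShow?: boolean
  maxLength?: number // max characters injected into the system prompt
  text: string
}

function selectSections(sections: HudSectionDef[], agent: string, contextPercent: number): string[] {
  return sections
    // Agent targeting: keep sections without a list, or that list this agent.
    .filter((s) => !s.agents || s.agents.includes(agent))
    // Conditional visibility: alwaysShow bypasses the usage condition.
    .filter((s) => s.alwaysShow || s.showAbovePercent === undefined || contextPercent > s.showAbovePercent)
    // Layout ordering: by priority, then by name for a stable result.
    .sort((a, b) => a.priority - b.priority || a.name.localeCompare(b.name))
    .map((s) => (s.maxLength !== undefined ? s.text.slice(0, s.maxLength) : s.text))
}
```

Because `filter` returns a new array, the subsequent `sort` does not mutate the caller's section list.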

4.8 Recommended Approach

Phase 1: Mimic the agent definition loading pattern exactly.

Store HUD section templates as .opencode/hud/*.md with YAML frontmatter. Load them using the same gray-matter + glob pattern that OpenCode uses for agents. Inject them via the experimental.chat.system.transform hook.

This requires no OpenCode core changes and establishes the file format convention.

Phase 2: Add data binding and conditional rendering.

Extend the template body with simple {{variable}} interpolation and {{#if}} conditionals, matching the template syntax used in section 4.5. The plugin maintains a reactive data store (context tracker, session stats) that fills in these variables at system prompt assembly time.
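
A minimal sketch of the data-binding step, using the `{{path}}` placeholder style from the section 4.5 template. It covers only dotted-path interpolation; block helpers such as `{{#if}}` are left untouched and would need a real template engine. `interpolate`, `lookupPath`, and `TemplateData` are hypothetical names.

```typescript
type TemplateValue = string | number | boolean | TemplateData
type TemplateData = { [key: string]: TemplateValue }

// Walks a dotted path such as "context.percentage" through nested data.
function lookupPath(data: TemplateData, path: string): TemplateValue | undefined {
  let current: TemplateValue | undefined = data
  for (const key of path.split(".")) {
    if (typeof current !== "object") return undefined
    if (!Object.prototype.hasOwnProperty.call(current, key)) return undefined
    current = current[key]
  }
  return current
}

function interpolate(template: string, data: TemplateData): string {
  return template.replace(/\{\{\s*([A-Za-z0-9_.]+)\s*\}\}/g, (placeholder: string, path: string) => {
    const value = lookupPath(data, path)
    // Leave unknown or non-scalar placeholders untouched so gaps stay visible.
    return value === undefined || typeof value === "object" ? placeholder : String(value)
  })
}
```

Leaving unresolved placeholders in the output, rather than replacing them with an empty string, makes a missing data source obvious in the injected prompt.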

Phase 3: Consider proposing first-class HUD support to OpenCode.

If the pattern proves valuable, propose that OpenCode adds a .opencode/hud/ directory as a first-class concept, similar to .opencode/agents/ and .opencode/skills/. The loading infrastructure already exists (glob + gray-matter + merge). The new concept is just the "HUD section" schema with its position, refresh, and data-source metadata.


5. Summary of Findings

Agent Definition System (OpenCode)

  • Format: YAML frontmatter + Markdown body in .opencode/agents/*.md
  • Schema: Config.Agent (config input) and Agent.Info (resolved agent), with fields for model, prompt, mode, permissions, options, etc.
  • Loading: Glob + gray-matter parsing, merged over built-in agents
  • Resolution: Agent name derived from filename (with directory prefix for nested files)
  • Usage: Selected per-message via @agent-name syntax or as default agent
  • System prompt: Agent's prompt field becomes the primary system prompt (replacing provider default)
  • No state: Agents are stateless config; state lives in sessions

alkhub_ts Agent Definitions

  • 8 agents forming a coordinated workflow (architect → decomposer → implementation-specialist)
  • Rich prompts: Detailed workflows, constraints, output templates, tool references
  • Pattern: Primary agents for top-level use, subagents for specialized delegation
  • Innovation: Worktree-scoped environment constraints, safe exit protocols, AAR processes

HUD/AUI Implications

  • The agent definition pattern (YAML frontmatter + template body, glob loading, merge strategy) translates directly to HUD section definitions
  • Agent definitions prove the pattern works for declarative, project-specific configuration
  • The key difference is state: agents are static config, HUD needs reactive data
  • Can be implemented as a plugin without OpenCode core changes using experimental.chat.system.transform
  • The same .opencode/ directory convention would make HUD definitions discoverable and project-specific