# Agent HUD: A Context Management Architecture for LLM Agents

## Research → Design Synthesis

This document synthesizes findings from six research areas into a concrete architecture for an agent HUD system that replaces ad-hoc compaction with structured, cache-aware context management.

**See also**: [`ujsx-v2-typebox-rewrite.md`](./ujsx-v2-typebox-rewrite.md) — The UJSX v2 rewrite plan, replacing the existing POC at `/workspace/aui` with a TypeBox-schema-driven universal JSX IR that supports actual JSX syntax, bi-directional transforms, and component schemas as tool schemas.

---

## The Problem

Current LLM agent interfaces have a fundamental flaw: **context is managed as a monolithic conversation log that grows until it must be violently compacted**. This creates several pathologies:

1. **Information cliff**: Compaction replaces rich conversation with a lossy summary. Everything before the compaction boundary is gone — the agent can't even know what it lost.

2. **Reactive, not proactive**: OpenCode's compaction fires at ~92% context usage. There's no budgeting — system prompt, tool definitions, and conversation history compete for the same fixed space with no allocation strategy.

3. **Cognitive waste**: The agent sees the full conversation log every turn, including messages that are no longer relevant. Early messages about setup, resolved errors, and abandoned approaches consume tokens without providing value.

4. **Cognitive strain**: The agent has no ambient awareness of its own state. It must explicitly call tools to check context usage, search history, or understand where it is in a task. Each tool call costs a turn and consumes context.

5. **No provider-cache alignment**: OpenCode's 2-part system prompt split (`llm.ts:115-126`) is the only cache-aware structure. The rest of the context — messages, tool results, the full conversation — has no cache segmentation.

The key insight: **the "agent frame" — what the LLM sees in a single turn — should be treated as a composition problem, not an append-only log problem.**

---

## The Leverage Point

OpenCode's `llm.ts:115-126` reveals the critical pattern:

```typescript
// rejoin to maintain 2-part structure for caching if header unchanged
if (system.length > 2 && system[0] === header) {
  const rest = system.slice(1)
  system.length = 0
  system.push(header, rest.join("\n"))
}
```

This splits the system prompt into:
- **Part 0** (cached, stable): Agent prompt, provider prompt — changes rarely, benefits from prompt caching
- **Part 1** (dynamic, re-cached each call): Environment, skills, instructions — changes often

The plugin system pushes to `output.system[]` which gets merged into Part 1. The open-memory plugin injects context status (~50 tokens) into this dynamic part.

**This 2-part pattern generalizes.** We can extend it to a multi-part context composition with different cache characteristics:

| Part | Content | Changes | Cache Value |
|------|---------|---------|-------------|
| 0 | Agent identity, core instructions | Rarely | Highest (static across turns) |
| 1 | HUD static layer (task, key decisions) | On agent action | High (stable within a task phase) |
| 2 | HUD dynamic layer (notes, recent events) | Every turn | Medium (re-cached each call) |
| 3 | Conversation messages | Every turn | Low (grows monotonically) |
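
The ordering in the table matters because provider prompt caches generally reuse only a matching prefix. A minimal sketch of that idea follows, assuming a hypothetical `FramePart` shape and `sharedPrefixParts` helper (neither exists in OpenCode): it counts how many leading parts are unchanged between two turns.

```typescript
// Hypothetical names throughout; none of these come from OpenCode.
interface FramePart {
  name: string
  content: string
}

// Counts how many leading parts are identical between two frames.
// A prefix-based prompt cache can only reuse this leading run.
function sharedPrefixParts(prev: FramePart[], next: FramePart[]): number {
  let i = 0
  while (i < prev.length && i < next.length && prev[i].content === next[i].content) {
    i++
  }
  return i
}

const turn1: FramePart[] = [
  { name: "identity", content: "You are a coding agent." },
  { name: "hud-static", content: "# Task\nImplement auth module" },
  { name: "hud-dynamic", content: "## Context\n45% used" },
]

// Only the dynamic layer changed on the next turn.
const turn2: FramePart[] = [
  turn1[0],
  turn1[1],
  { name: "hud-dynamic", content: "## Context\n47% used" },
]

const reusableParts = sharedPrefixParts(turn1, turn2)
```

With stable parts first, two of the three parts stay reusable. If the dynamic layer were placed first, the shared prefix would drop to zero even though the same content changed.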

---

## Architecture: The Agent HUD

### Core Concept

The HUD is a **structured markdown document** that the agent sees at the top of every turn, injected via the `experimental.chat.system.transform` plugin hook. Unlike compaction (which discards information), the HUD is:
- **Composed**: Built from components, not a flat log
- **Bidirectional**: The agent can update it via tools
- **Cache-aware**: Structured so static parts benefit from provider caching
- **Adaptive**: Automatically adjusts density based on context pressure

### The "Agent Frame" Model

Instead of treating the conversation as an append-only log, think of each agent turn as a **frame** composed of:

```
┌─────────────────────────────────────┐
│  System Prompt Part 0 (cached)       │  ← Agent identity, core instructions
│  [never changes within a session]    │
├─────────────────────────────────────┤
│  System Prompt Part 1 (re-cached)    │  ← HUD rendered to markdown
│  ┌─────────────────────────────────┐ │
│  │ # Task                          │ │  ← Static HUD (changes on action)
│  │ Implement auth module            │ │
│  │                                 │ │
│  │ ## Context                      │ │  ← Adaptive density
│  │ 67% used (134k/200k tokens)     │ │  ← Context status
│  │ Trend: growing rapidly           │ │
│  │                                 │ │
│  │ ## Key Decisions                │ │  ← Agent-maintained state
│  │ • Using JWT over sessions       │ │
│  │ • bcrypt for password hashing   │ │
│  │                                 │ │
│  │ ## Active Files                 │ │  ← Derived from tool calls
│  │ • src/auth/mod.ts (editing)     │ │
│  │ • src/auth/jwt.ts (referenced)  │ │
│  │                                 │ │
│  │ ## Next Steps                    │ │  ← Agent's own plan
│  │ 1. Add refresh token rotation    │ │
│  │ 2. Write auth middleware         │ │
│  │ 3. Add tests                    │ │
│  │                                 │ │
│  │ ## Notes                        │ │  ← Agent's scratchpad
│  │ • DB schema: users, sessions   │ │
│  │ • Rate limit: 100/min default   │ │
│  └─────────────────────────────────┘ │
├─────────────────────────────────────┤
│  Conversation Messages              │  ← Standard message history
│  [filtered by compaction boundary]  │
└─────────────────────────────────────┘
```

### The Append-Only Event Log

Underlying the HUD is an **append-only event log** — similar to Yjs/CRDT event streams but simpler because we're single-author (one agent per session). This is the "source of truth" that the HUD renders from.

```typescript
interface HudEvent {
  id: string              // UUID
  type: HudEventType      // event kind (string-literal union, defined below)
  timestamp: number       // Unix ms
  sessionId: string       // opencode session ID
  turn: number            // which agent turn
  payload: unknown        // type-specific data
}

type HudEventType =
  | "task.set"            // Agent sets the current task description
  | "task.update"         // Agent refines the task
  | "decision.record"    // Agent records a key decision
  | "note.add"           // Agent adds a note
  | "note.update"        // Agent updates a note
  | "note.remove"        // Agent removes a note
  | "file.open"          // Agent reads a file (derived from tool calls)
  | "file.edit"          // Agent edits a file (derived from tool calls)
  | "file.close"         // File falls out of recent context
  | "step.complete"      // Agent marks a step complete
  | "step.add"          // Agent adds a next step
  | "step.reorder"       // Agent reorders steps
  | "error.encountered"  // Agent encounters an error
  | "error.resolved"     // Agent resolves an error
  | "blocker.add"       // Agent identifies a blocker
  | "blocker.remove"     // Agent removes a blocker
  | "context.snapshot"   // Context window usage snapshot (auto-generated)
  | "compact.before"     // Pre-compaction state snapshot
  | "compact.after"      // Post-compaction state + summary
```

The event log is persisted and append-only. It replaces the "conversation log as only state" model with "conversation log as one input to the HUD, alongside structured state."
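
Since `payload` is typed as `unknown` above, consumers need some way to narrow it. One possible approach, sketched here for three of the event types, pairs each `type` with a payload shape so TypeScript can narrow on the discriminant. The payload fields and the names `TypedHudEvent`, `EventEnvelope`, `StoredHudEvent`, and `summarizeEvent` are assumptions for illustration, not part of the design.

```typescript
// Payload shapes are assumptions; the source only fixes the event type names.
type TypedHudEvent =
  | { type: "task.set"; payload: { description: string } }
  | { type: "decision.record"; payload: { summary: string; details?: string } }
  | { type: "note.add"; payload: { content: string } }

// Fields shared by every stored event (mirrors HudEvent minus type/payload).
interface EventEnvelope {
  id: string
  timestamp: number
  sessionId: string
  turn: number
}

type StoredHudEvent = EventEnvelope & TypedHudEvent

// Switching on `type` narrows `payload` to the matching shape.
function summarizeEvent(event: StoredHudEvent): string {
  switch (event.type) {
    case "task.set":
      return `task: ${event.payload.description}`
    case "decision.record":
      return `decision: ${event.payload.summary}`
    case "note.add":
      return `note: ${event.payload.content}`
  }
}

const sampleEvent: StoredHudEvent = {
  id: "e1",
  timestamp: 0,
  sessionId: "s1",
  turn: 1,
  type: "decision.record",
  payload: { summary: "Using JWT over sessions" },
}
const sampleSummary = summarizeEvent(sampleEvent)
```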

### Rendering Pipeline: UJSX → MDAST → Markdown

The HUD uses the **UJSX v2** universal JSX IR (see [`ujsx-v2-typebox-rewrite.md`](./ujsx-v2-typebox-rewrite.md)) built on TypeBox schemas. The existing POC at `/workspace/aui` already implements UJSX → mdast → markdown with a `TransformRegistry` and `mdast-util-to-markdown`. The v2 rewrite adds:

- Actual JSX syntax (via `jsxImportSource: "@ade/ujsx"`)
- TypeBox schema-driven node types (schemas ARE types, schemas ARE tool parameter schemas)
- Bi-directional transforms (UJSX ↔ mdast, UJSX ↔ hast, etc.)
- HTML-agnostic core (no `onClick`, `className` in universal props)

```
HUD Components (.tsx with JSX syntax)
    │
    ▼
h() factory / JSX transform → UElement tree (TypeBox-schema-validated)
    │
    ▼  TransformRegistry (direction: 'ujsx->mdast')
    │  Rules match on TypeBox schemas, not string tags
    │
mdast tree
    │
    ▼  mdast-util-to-markdown + mdast-util-gfm
    │
Markdown String
    │
    ▼  Injected via experimental.chat.system.transform
    │
System Prompt Part 1
```

**Why UJSX (not Hono JSXNode, not hast→mdast)?**

1. **Schema-driven**: TypeBox schemas serve triple duty — TypeScript types, runtime validation, and tool parameter schemas. Component props = tool input schemas. Zero duplication.
2. **Bi-directional**: Rules convert both UJSX→mdast AND mdast→UJSX. Parse existing markdown (notes, AGENTS.md) back into the IR.
3. **HTML-agnostic**: No `onClick`, `className`, `aria-*` in the core. The IR isn't pretending to be HTML.
4. **HostConfig preserved**: The same component tree renders to graphology, markdown, or future targets.
5. **Actual JSX syntax**: With `jsxImportSource`, write `<TaskSection task={state.task} density="compact" />` not `h('TaskSection', { task: state.task, density: 'compact' })`.

### HUD Component Architecture

```tsx
// The root HUD component
function HUD({ state, contextInfo }: { state: HudState, contextInfo: ContextInfo }) {
  const density = getDensity(contextInfo.percentage)

  return (
    <Container>
      <TaskSection task={state.task} density={density} />
      <ContextBar info={contextInfo} density={density} />
      {density !== 'minimal' && <DecisionsList decisions={state.decisions} density={density} />}
      {density !== 'minimal' && <ActiveFiles files={state.activeFiles} density={density} />}
      <NextSteps steps={state.nextSteps} density={density} />
      {density === 'full' && <NotesSection notes={state.notes} />}
      {contextInfo.percentage >= 85 && <WarningBanner info={contextInfo} />}
    </Container>
  )
}

// Adaptive density: renders differently based on context pressure
type Density = 'full' | 'compact' | 'minimal'

function getDensity(percentage: number): Density {
  if (percentage < 70) return 'full'
  if (percentage < 85) return 'compact'
  return 'minimal'
}

// Example adaptive component
function DecisionsList({ decisions, density }: { decisions: Decision[], density: Density }) {
  if (density === 'compact') {
    return <text>{`## Decisions (${decisions.length})\n` + decisions.map(d => `- ${d.summary}`).join('\n')}</text>
  }
  return (
    <section title="Decisions">
      {decisions.map(d => <DecisionItem decision={d} />)}
    </section>
  )
}
```

### Cache-Aware System Prompt Structure

The key extension of OpenCode's 2-part pattern:

```
System Message 0 (cached, stable):
  - Agent identity prompt
  - Provider-specific prompt
  — Never changes within a session

System Message 1 (re-cached each call):
  - HUD static layer (task, decisions, next steps)
  - HUD dynamic layer (context bar, notes, active files)
  - Environment info, skills, instructions
  — Changes based on agent actions and context pressure
```

The **static layer** of the HUD should change only when the agent explicitly updates it (task change, decision recorded, step completed). The **dynamic layer** changes every turn (context percentage, recent files, notes).

For Anthropic's `cache_control`, we mark:
- System message 0 → `cacheControl: { type: "ephemeral" }` (already done by opencode)
- System message 1 → `cacheControl: { type: "ephemeral" }` (already done by opencode)

The savings come from keeping message 0 stable (0.1x cost on cache hits) and making message 1 as small as possible while still providing full situational awareness.
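
To make the savings concrete, here is a rough cost sketch. The 0.1x read multiplier is the figure quoted above; the 1.25x write multiplier, the token counts, and the function name `systemPromptCost` are assumptions for illustration and should be checked against the provider's current pricing.

```typescript
// Multipliers relative to the base input-token price.
// 0.1 for cache reads comes from the text above; 1.25 for cache writes is an assumption.
const CACHE_READ = 0.1
const CACHE_WRITE = 1.25

// Cost of one turn's system prompt, in "base-price token" units.
function systemPromptCost(stableTokens: number, dynamicTokens: number, stableHit: boolean): number {
  const stable = stableTokens * (stableHit ? CACHE_READ : CACHE_WRITE)
  // Message 1 changes every call, so it is assumed to be written to the cache each time.
  const dynamic = dynamicTokens * CACHE_WRITE
  return stable + dynamic
}

// Hypothetical sizes: a 4000-token stable message 0 and a 400-token message 1.
const warmTurn = systemPromptCost(4000, 400, true)
const coldTurn = systemPromptCost(4000, 400, false)
```

Under these assumed numbers a warm turn costs roughly 900 units against 5500 for a cold one, which is why any churn in message 0 is expensive and why message 1 should stay small.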

### HUD State and Event Log Storage

The event log uses the same SQLite database opencode already has, with a new table:

```sql
CREATE TABLE hud_event (
  id TEXT PRIMARY KEY,
  session_id TEXT NOT NULL REFERENCES session(id),
  type TEXT NOT NULL,           -- HudEventType
  turn INTEGER NOT NULL,       -- agent turn number
  timestamp INTEGER NOT NULL,   -- Unix ms
  payload TEXT NOT NULL,       -- JSON
  created_at INTEGER NOT NULL DEFAULT (unixepoch())  -- Unix seconds (timestamp above is ms); unixepoch() needs SQLite 3.38+
);

CREATE INDEX idx_hud_event_session ON hud_event(session_id, timestamp);
CREATE INDEX idx_hud_event_type ON hud_event(session_id, type);
```

The **HUD state** is derived from the event log by projectors (similar to opencode's existing event sourcing pattern):

```typescript
interface HudState {
  task: { description: string; updatedAt: number } | null
  decisions: Array<{ id: string; summary: string; details: string; recordedAt: number }>
  notes: Array<{ id: string; content: string; updatedAt: number }>
  activeFiles: Array<{ path: string; lastAccessed: number; status: 'reading' | 'editing' | 'referenced' }>
  nextSteps: Array<{ id: string; description: string; completed: boolean; order: number }>
  blockers: Array<{ id: string; description: string; resolved: boolean }>
  errors: Array<{ id: string; message: string; resolved: boolean; resolvedAt?: number }>
}
```

Projectors are pure functions: `(state, event) => newState`. Because they are deterministic, replaying the event log in order always rebuilds the same state.
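
A minimal sketch of such a projector, using a reduced, hypothetical slice of the state (`MiniState`) and three event types with assumed payload shapes:

```typescript
// Reduced, hypothetical slice of HudState and its events, for illustration only.
interface MiniState {
  task: string | null
  notes: Array<{ id: string; content: string }>
}

type MiniEvent =
  | { type: "task.set"; payload: { description: string } }
  | { type: "note.add"; payload: { id: string; content: string } }
  | { type: "note.remove"; payload: { id: string } }

const emptyState: MiniState = { task: null, notes: [] }

// Pure projector: returns a new state and never mutates its input.
function project(state: MiniState, event: MiniEvent): MiniState {
  switch (event.type) {
    case "task.set":
      return { ...state, task: event.payload.description }
    case "note.add":
      return { ...state, notes: [...state.notes, event.payload] }
    case "note.remove": {
      const removedId = event.payload.id
      return { ...state, notes: state.notes.filter(n => n.id !== removedId) }
    }
  }
}

// Rebuilds state from scratch by folding the whole log.
function replay(events: MiniEvent[]): MiniState {
  return events.reduce(project, emptyState)
}

const log: MiniEvent[] = [
  { type: "task.set", payload: { description: "Implement auth module" } },
  { type: "note.add", payload: { id: "n1", content: "DB schema: users, sessions" } },
  { type: "note.add", payload: { id: "n2", content: "Rate limit: 100/min default" } },
  { type: "note.remove", payload: { id: "n1" } },
]
const rebuilt = replay(log)
```

Replaying the same log twice yields equal states, which is the property the HUD relies on after a restart or compaction.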

Some events are **derived** rather than agent-authored:
- `file.open`, `file.edit`, `file.close`: Intercepted from opencode's `tool.execute.before/after` hook by watching for Read, Write, Edit tools
- `context.snapshot`: Auto-generated from `SessionProcessor` events tracking token usage
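
The file-event derivation could look roughly like the sketch below. The tool names (Read, Write, Edit) come from the list above; the `ToolCall` shape, the `filePath` argument name, and `deriveFileEvent` are assumptions, not opencode's actual hook payload.

```typescript
// Assumed shape of an intercepted tool call; the real hook payload may differ.
interface ToolCall {
  tool: string
  args: { filePath?: string }
}

type FileEventType = "file.open" | "file.edit"

// Maps a tool call to a derived HUD event, or null if the call is not file-related.
function deriveFileEvent(call: ToolCall): { type: FileEventType; path: string } | null {
  const path = call.args.filePath
  if (!path) return null
  switch (call.tool.toLowerCase()) {
    case "read":
      return { type: "file.open", path }
    case "write":
    case "edit":
      return { type: "file.edit", path }
    default:
      return null
  }
}

const derived = deriveFileEvent({ tool: "Edit", args: { filePath: "src/auth/mod.ts" } })
```

`file.close` is left out here because it depends on recency rather than on any single tool call.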

### HUD Tools (Agent-Facing)

The agent interacts with the HUD through two tools (following the router pattern from open-memory):

**`hud`** (router tool — reduces context bloat from tool definitions):

```typescript
hud(input: {
  tool: string,    // operation name
  args?: object    // operation arguments
})
```

Operations:
- `task.get` / `task.set` / `task.update` — Manage current task
- `decisions.list` / `decisions.record` / `decisions.remove` — Key decisions log
- `notes.list` / `notes.add` / `notes.update` / `notes.remove` — Scratchpad
- `steps.list` / `steps.add` / `steps.complete` / `steps.reorder` — Next steps / plan
- `blockers.list` / `blockers.add` / `blockers.remove` — Blockers
- `snapshot` — Full HUD state
- `history` — Recent event log (for understanding what changed)
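
A router of this kind can be sketched as a lookup table from operation name to handler. The names `OpHandler` and `createHudRouter` and the in-memory `notes` array are hypothetical; a real implementation would record events through the state manager instead.

```typescript
type OpHandler = (args: Record<string, unknown>) => unknown

// Builds a single entry point that dispatches on the `tool` field.
function createHudRouter(handlers: Record<string, OpHandler>) {
  // A Map avoids accidental hits on inherited keys such as "toString".
  const ops = new Map<string, OpHandler>(Object.entries(handlers))
  return (input: { tool: string; args?: Record<string, unknown> }): unknown => {
    const handler = ops.get(input.tool)
    if (!handler) {
      // Unknown operations return the list of valid names instead of throwing.
      return { error: `unknown operation: ${input.tool}`, available: Array.from(ops.keys()) }
    }
    return handler(input.args ?? {})
  }
}

// In-memory stand-in for the event-log-backed store.
const notes: string[] = []
const hud = createHudRouter({
  "notes.add": args => {
    notes.push(String(args["content"]))
    return { ok: true }
  },
  "notes.list": () => [...notes],
})

hud({ tool: "notes.add", args: { content: "Rate limit: 100/min default" } })
```

Returning the available operation names on a miss gives the agent a cheap way to recover from a typo without a separate lookup call.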

**`hud_compact`** (mutation tool — separate to prevent accidental use):

```typescript
hud_compact()  // Triggers both HUD compaction AND opencode compaction
```

This is distinct from `memory_compact` because:
1. It snapshots HUD state to the event log before compaction
2. After compaction, the persisted HUD state carries forward, so the agent doesn't lose its task, decisions, or notes
3. It can trigger compaction at a natural breakpoint (when the agent says "next steps updated, good time to compact")

### Plugin Integration (opencode)

The HUD is implemented as an opencode plugin, using these hooks:

```typescript
const HUDPlugin: Plugin = async (ctx) => {
  const stateManager = new HudStateManager(ctx)  // Manages event log + projectors
  const renderer = new HudRenderer()             // JSX component → markdown

  return {
    tool: { hud: createHudTool(ctx, stateManager), hud_compact: createHudCompactTool(ctx, stateManager) },

    "experimental.chat.system.transform": async (input, output) => {
      // Render HUD and inject as system prompt
      const state = stateManager.getState(input.sessionID)
      const contextInfo = getContextInfo(input.sessionID)
      const markdown = renderer.render(HUD, { state, contextInfo })
      output.system.push(markdown)
    },

    "experimental.session.compacting": async (input, output) => {
      // Before compaction: snapshot HUD state to event log
      stateManager.recordEvent(input.sessionID, { type: "compact.before", payload: stateManager.getState(input.sessionID) })
      // Replace compaction prompt with self-continuity + HUD-aware prompt
      output.prompt = getCompactionPrompt(stateManager.getState(input.sessionID))
    },

    "event": async ({ event }) => {
      // Derive file events from tool calls
      if (event.type === "tool.execute.after") {
        stateManager.maybeRecordFileEvent(event)
      }
      // Track context snapshots
      if (event.type === "message.updated") {
        stateManager.maybeRecordContextSnapshot(event)
      }
    },

    "tool.execute.before": async (input, output) => {
      // Track file access from tool calls
      stateManager.trackToolCall(input)
    },
  }
}
```

### Rendering Pipeline Implementation

```typescript
// core/renderer.ts

import type { FC } from "hono/jsx"

interface RenderContext {
  density: Density
  state: HudState
  contextInfo: ContextInfo
}

class HudRenderer {
  // Custom walker over Hono JSXNode tree
  render(component: FC<any>, props: Record<string, any>): string {
    const node = component(props)
    return this.walkNode(node)
  }

  private walkNode(node: any): string {
    if (typeof node === "string" || typeof node === "number") {
      return String(node)
    }
    if (node === null || node === undefined || node === false) {
      return ""
    }
    if (node instanceof Promise) {
      throw new Error("Async components not supported in HUD rendering")
    }

    // JSXFragmentNode — just concatenate children
    if (node.tag === null || node.tag === Symbol.for("hono.fragment")) {
      return (node.children as any[]).map(c => this.walkNode(c)).join("\n")
    }

    // Function component — call it and walk the result
    if (typeof node.tag === "function") {
      const result = node.tag({ ...node.props, children: node.children })
      return this.walkNode(result)
    }

    // Intrinsic element — map HTML tag to markdown
    return this.renderElement(node.tag, node.props, node.children)
  }

  private renderElement(tag: string, props: any, children: any[]): string {
    const childContent = (children || []).map(c => this.walkNode(c)).join("\n")

    switch (tag) {
      case "h1": return `# ${childContent}`
      case "h2": return `## ${childContent}`
      case "h3": return `### ${childContent}`
      case "strong": case "b": return `**${childContent}**`
      case "em": case "i": return `*${childContent}*`
      case "code": return props.lang
        ? `\`\`\`${props.lang}\n${childContent}\n\`\`\``
        : `\`${childContent}\``
      case "ul": return childContent
      case "li": return `- ${childContent}`
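      case "ol": {
        // <li> always emits "- ", so renumber the items here
        let n = 0
        return childContent.split("\n").map((l: string) => (l.startsWith("- ") ? `${++n}. ${l.slice(2)}` : l)).join("\n")
      }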
      case "p": return childContent
      case "div": case "section": case "article": case "main": case "header":
      case "footer": case "nav": case "aside": case "span":
        return childContent  // container — just pass through content
      case "a": return `[${childContent}](${props.href})`
      case "blockquote": return childContent.split("\n").map(l => `> ${l}`).join("\n")
      case "hr": return "---"
      case "br": return "\n"
      case "pre": return childContent  // content already formatted by <code>
      case "table": return this.renderTable(props, children)
      default:
        // Unknown/custom tags: use data-md attribute or just render content
        if (props?.["data-md"]) {
          // Custom markdown rendering hint
          return this.renderCustomMd(props["data-md"], props, childContent)
        }
        return childContent
    }
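  }

  // Placeholder: table layout is not specified here, so rows pass through as plain lines
  private renderTable(_props: any, children: any[]): string {
    return (children || []).map(c => this.walkNode(c)).join("\n")
  }

  // Placeholder: the semantics of data-md are not defined here, so content passes through
  private renderCustomMd(_hint: string, _props: any, content: string): string {
    return content
  }
}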
```

### Adaptive Density in Practice

The key insight from open-memory's context status thresholding: **information density should be proportional to context pressure.**

```
Context < 70% (GREEN / "full"):
  ┌────────────────────────────────────┐
  │ # Task                             │
  │ Implement user authentication      │
  │                                    │
  │ ## Context: 45% (90k/200k) stable  │
  │                                    │
  │ ## Decisions (3)                   │
  │ • Using JWT over sessions         │
  │ • bcrypt for password hashing     │
  │ • Rate limiting: 100/min default   │
  │                                    │
  │ ## Active Files                    │
  │ • src/auth/mod.ts (editing)        │
  │ • src/auth/jwt.ts (referenced)    │
  │ • src/db/schema.ts (referenced)   │
  │                                    │
  │ ## Next Steps                      │
  │ 1. ~~Add refresh token rotation~~  │
  │ 2. Write auth middleware           │
  │ 3. Add tests                      │
  │                                    │
  │ ## Notes                           │
  │ • DB schema: users, sessions      │
  │ • Env vars: JWT_SECRET, DB_URL    │
  └────────────────────────────────────┘
  ~300-500 tokens

Context 70-85% (YELLOW / "compact"):
  ┌────────────────────────────────────┐
  │ Task: Implement auth │ 72% (144k) ↑ │
  │ Decisions: JWT, bcrypt, 100/min    │
  │ Steps: ~~rotation~~, middleware,   │
  │        tests                       │
  │ Files: auth/mod.ts, auth/jwt.ts    │
  │ Notes: DB schema users/sessions   │
  └────────────────────────────────────┘
  ~100-150 tokens

Context 85-92% (RED / "minimal"):
  ┌────────────────────────────────────┐
  │ Auth impl │ 89% │ ⚠ compact soon  │
  │ JWT+bcrypt, step: middleware       │
  └────────────────────────────────────┘
  ~30-50 tokens
```

This adaptive compression means the agent always has situational awareness, while the HUD's token cost shrinks as context pressure rises.

### Compaction That Preserves State

The critical difference from current compaction: **the HUD state survives compaction.**

Current flow:
```
[conversation] → COMPACT → [summary replaces everything] → agent loses context
```

HUD-aware flow:
```
[event log] → project to HUD state → render HUD → [inject into system prompt]
[conversation] → COMPACT → [summary replaces conversation, but HUD persisted state carries forward]
```

Before compaction:
1. The plugin records a `compact.before` event with the full HUD state (in the `session.compacting` hook)
2. HUD state is persisted to the event log (already append-only)
3. Compaction prompt includes: "Your HUD state: {rendered HUD}. This will survive compaction."

After compaction:
1. `filterCompacted` drops pre-compaction messages
2. But the system prompt still contains the fully rendered HUD
3. The agent sees its task, decisions, notes, and next steps — not just a narrative summary

The compaction summary becomes a **complement** to the HUD, not a **replacement** for lost context. The summary handles conversational continuity ("we were discussing..."), while the HUD provides structured persistent state.
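
As a rough illustration of the HUD-aware compaction prompt, the sketch below builds a prompt that embeds the carried-over state. The plugin code calls a `getCompactionPrompt` helper without defining it; this body, the `CarryOverState` shape, and the prompt wording are all assumptions.

```typescript
// Hypothetical, flattened view of the state that survives compaction.
interface CarryOverState {
  task: string | null
  decisions: string[]
  nextSteps: string[]
}

// Builds a compaction prompt that tells the model what is already preserved.
function getCompactionPrompt(state: CarryOverState): string {
  const hud = [
    `Task: ${state.task ?? "(none)"}`,
    `Decisions: ${state.decisions.join("; ") || "(none)"}`,
    `Next steps: ${state.nextSteps.join("; ") || "(none)"}`,
  ].join("\n")

  return [
    "Summarize the conversation so far for your own continuation.",
    "Your HUD state is listed below. It will survive compaction, so do not restate it.",
    "Focus on conversational context that the HUD does not capture.",
    "",
    hud,
  ].join("\n")
}

const compactionPrompt = getCompactionPrompt({
  task: "Implement auth module",
  decisions: ["Using JWT over sessions", "bcrypt for password hashing"],
  nextSteps: ["Write auth middleware", "Add tests"],
})
```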

### Comparison: Current vs HUD Architecture

| Aspect | Current (Compaction) | HUD Architecture |
|--------|---------------------|------------------|
| State management | Monolithic conversation log | Structured event log + HUD projection |
| Context loss | All-or-nothing compaction cliff | Adaptive density that preserves key state |
| Agent awareness | Must call memory tool | Ambient via system prompt injection |
| Cache optimization | 2-part system prompt only | Multi-part with stable HUD layer |
| Recovery from compaction | LLM-generated summary (lossy) | Structured HUD state (deterministic) |
| Cognitive load | Full conversation always visible | Relevant state always visible, details on demand |
| Token cost at 50% | Full conversation (~100k tokens of history) | Conversation + HUD (~100k + ~300 tokens) |
| Token cost at 90% | Conversation (truncated by compaction, then grows again) | Conversation + minimal HUD (truncated + ~30-50 tokens) |

---

## Implementation Plan

### Phase 1: Minimal Viable HUD (Plugin)

1. **Event log table**: Add `hud_event` table to opencode's SQLite database (via plugin)
2. **Basic event types**: `task.set`, `task.update`, `decision.record`, `note.add`, `step.add`, `step.complete`
3. **State projector**: Pure functions that fold events into `HudState`
4. **Simple renderer**: Template-based markdown rendering (no JSX yet)
5. **Plugin hooks**: `system.transform` for injection, `event` for derivation, `session.compacting` for pre-compaction snapshot
6. **Two tools**: `hud` (router) and `hud_compact`
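
For the template-based renderer in step 4, something along these lines may be enough before JSX arrives. The thresholds match `getDensity` from earlier in this document; the `Phase1State` shape, `renderHud`, and the exact output layout are assumptions.

```typescript
type Density = "full" | "compact" | "minimal"

// Hypothetical, simplified state for the Phase 1 renderer.
interface Phase1State {
  task: string | null
  decisions: string[]
  nextSteps: string[]
}

function getDensity(percentage: number): Density {
  if (percentage < 70) return "full"
  if (percentage < 85) return "compact"
  return "minimal"
}

// Plain string templates, one per density level.
function renderHud(state: Phase1State, percentage: number): string {
  const task = state.task ?? "(no task set)"
  const density = getDensity(percentage)

  if (density === "minimal") {
    return `${task} | ${percentage}% | compact soon`
  }
  if (density === "compact") {
    return [
      `Task: ${task} | ${percentage}%`,
      `Decisions: ${state.decisions.join(", ")}`,
      `Steps: ${state.nextSteps.join(", ")}`,
    ].join("\n")
  }
  return [
    "# Task",
    task,
    "",
    `## Context: ${percentage}%`,
    "",
    "## Decisions",
    ...state.decisions.map(d => `- ${d}`),
    "",
    "## Next Steps",
    ...state.nextSteps.map((s, i) => `${i + 1}. ${s}`),
  ].join("\n")
}

const demoState: Phase1State = {
  task: "Auth impl",
  decisions: ["JWT", "bcrypt"],
  nextSteps: ["middleware", "tests"],
}
```

Calling `renderHud(demoState, 45)` produces the sectioned layout, while the same state at 89% collapses to a single status line.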

### Phase 2: JSX Rendering Pipeline

1. **Hono JSXNode walking**: Custom `HudRenderer` that walks JSXNode trees directly to markdown
2. **HUD components**: `<HUD>`, `<TaskSection>`, `<ContextBar>`, `<DecisionsList>`, `<NotesSection>`, `<NextSteps>`
3. **Adaptive density**: Components that render differently based on `density` prop
4. **Test rendering**: Snapshot tests comparing component output to expected markdown

### Phase 3: Cache Optimization

1. **HUD layer splitting**: Separate HUD into static layer (task, decisions, next steps) and dynamic layer (context bar, notes, active files)
2. **Static layer diffing**: Only push to `output.system[]` when static content changes, reducing cache misses
3. **Dynamic layer minimal updates**: Track what changed since last render, include only deltas
4. **Measure cache hit rates**: Instrument provider token usage to verify caching improvements
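
Static layer diffing (step 2) can start as a per-session string comparison. The `StaticLayerCache` class below is a hypothetical sketch of that bookkeeping, not an existing opencode API; how the result feeds into `output.system[]` is left open.

```typescript
// Remembers the last rendered static layer per session.
class StaticLayerCache {
  private last = new Map<string, string>()

  // Returns true when the static layer differs from the previous render for this session.
  update(sessionId: string, rendered: string): boolean {
    if (this.last.get(sessionId) === rendered) return false
    this.last.set(sessionId, rendered)
    return true
  }
}

const layerCache = new StaticLayerCache()
const firstRender = layerCache.update("s1", "# Task\nImplement auth module")
const unchanged = layerCache.update("s1", "# Task\nImplement auth module")
const afterEdit = layerCache.update("s1", "# Task\nImplement auth middleware")
```

Counting how often `update` returns true per session would also give a first approximation of the cache-miss rate for step 4.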

### Phase 4: Advanced Features

1. **Derived events**: File tracking from tool calls, context snapshots from message events
2. **Search integration**: `hud.search` operation using FTS5 instead of LIKE
3. **Cross-session HUD**: Persist HUD state across sessions, enable resuming tasks
4. **QuickJS scripting**: Use toolEnv's envProxy pattern to let agents customize their HUD components
5. **Multi-agent HUD**: When `coord.spawn` creates sub-agents, share HUD state via parent session events

---

## Key Design Decisions

### Why event log over direct state mutation?

1. **Auditability**: Every state change has a trace. You can reconstruct the HUD at any point in time.
2. **Compaction resilience**: Events survive compaction because they're in a separate table, not the conversation message stream.
3. **No conflict resolution needed**: Unlike CRDTs, we're single-author (one agent per session). The event log is append-only with no merge conflicts.
4. **Projection flexibility**: Different projectors can derive different views from the same events. The HUD is one projection; a task progress tracker could be another.

### Why JSX over Handlebars/templates?

1. **Component composition**: JSX supports arbitrary nesting and composition. `<CompactView>` can wrap `<DecisionsList>` with different rendering logic.
2. **Conditional rendering**: `{density === 'full' && <NotesSection />}` is cleaner than `{{#if density.full}}...{{/if}}`.
3. **Type safety**: Components are typed functions. Props, state, density — all compile-time checked.
4. **Developer familiarity**: React-like patterns are widely understood.
5. **Direct JSXNode walking avoids HTML roundtrip**: No `renderToStaticMarkup → parse → hast → mdast → markdown` needed.

### Why system prompt injection instead of a dedicated message role?

1. **Cache alignment**: OpenCode already manages the system prompt as a cache-aware structure. Injecting into `output.system[]` gives us immediate cache optimization.
2. **No protocol change**: We don't need to change opencode's core messaging protocol. The plugin hook is sufficient.
3. **Survives compaction**: System prompt is always included (it's never compacted). The HUD is always visible.
4. **Provider compatibility**: System prompts work with every LLM provider. A custom message role might not.

### Why router pattern (1 tool) over separate tools?

From open-memory's research: each tool definition adds ~1-2k tokens to the system prompt. With 10+ HUD operations, that's 10-20k tokens of overhead. A single `hud` router tool costs ~2k tokens regardless of how many operations it supports. The `help` operation provides inline documentation.

---

## File Structure (Proposed)

```
packages/hud-plugin/
  src/
    index.ts                    # Plugin entry point
    state/
      types.ts                  # HudState, HudEvent types
      projector.ts              # Event → State projectors
      store.ts                  # SQLite read/write for hud_event
    renderer/
      components/
        HUD.tsx                 # Root HUD component
        TaskSection.tsx         # Task display
        ContextBar.tsx          # Context percentage bar
        DecisionsList.tsx       # Key decisions
        NotesSection.tsx        # Scratchpad
        NextSteps.tsx           # Plan/next steps
        ActiveFiles.tsx         # Currently active files
        WarningBanner.tsx       # Context pressure warning
      renderer.ts               # JSXNode → markdown walker
      density.ts                # Adaptive density logic
    tools/
      hud.ts                    # hud router tool definition
      hud_compact.ts            # hud_compact tool definition
      operations/               # Individual operations
        task.ts
        decisions.ts
        notes.ts
        steps.ts
        files.ts
        snapshot.ts
        history.ts
    hooks/
      system-transform.ts       # experimental.chat.system.transform
      compacting.ts             # experimental.session.compacting
      event.ts                  # event derivation (file tracking, context snapshots)
    context/
      tracker.ts                # Context window tracking (from open-memory)
      thresholds.ts             # Density thresholds
```

---

## Risks and Open Questions

1. **Token overhead at low context usage**: At low usage the HUD renders at full density, costing ~300-500 tokens, or about 0.15-0.25% of a 200k context (even the "minimal" HUD adds ~30-50 tokens). Is that overhead worth paying when there is no context pressure? Probably yes: the ROI is in the 70%+ range, where the HUD prevents catastrophic compaction. The payoff is avoiding a compaction event that loses the entire conversation.

2. **Agent compliance**: Will agents consistently use `hud` tools to update their state? The current approach relies on the agent choosing to call `hud` tools. Alternatives:
   - **Auto-derivation**: More events derived automatically from tool calls (file tracking is automatic, but decisions/notes require agent action)
   - **AGENTS.md prompt**: Include HUD tool usage in instructions
   - **Conventional prompting**: "Always update your HUD after completing a step or making a decision" in the compaction/system prompt

3. **Stale state**: If the agent doesn't update the HUD before compaction, the HUD might be stale. Mitigation: auto-snapshot HUD state before compaction in the `session.compacting` hook.

4. **JSX dependency weight**: Hono's JSX runtime adds a dependency. Alternative: use a lightweight custom JSX transform that doesn't need Hono. The renderer only needs `JSXNode` types and the walker — not the full Hono framework.

5. **Multi-agent sessions**: When sub-agents are spawned, should they share HUD state? The event log is per-session, but parent-child relationships in opencode could enable HUD state inheritance.

6. **Search vs. structured state**: Should agents search conversation history (like `memory({tool: "search"})`) or maintain structured state (like `hud({tool: "decisions.record"})`)? Both. The HUD is for state the agent actively maintains. Search is for recovering information the agent didn't think to record. They complement each other.

7. **Event log growth**: The `hud_event` table will grow. Needs a cleanup strategy — perhaps tied to compaction events (archive events older than the last compaction point).
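
One way the cleanup in item 7 might work: since each `compact.before` event carries a full HUD state snapshot, everything before the most recent one could be archived without losing the ability to rebuild state. The sketch below assumes events are already sorted by time; `LoggedEvent` and `splitAtLastCompaction` are hypothetical names.

```typescript
interface LoggedEvent {
  id: string
  type: string
  timestamp: number
}

// Splits a time-ordered log at the most recent "compact.before" snapshot.
// Events before the snapshot can be archived; the snapshot and later events are kept.
function splitAtLastCompaction(events: LoggedEvent[]): { archive: LoggedEvent[]; keep: LoggedEvent[] } {
  let boundary = -1
  for (let i = events.length - 1; i >= 0; i--) {
    if (events[i].type === "compact.before") {
      boundary = i
      break
    }
  }
  // No compaction yet: nothing is safe to archive.
  if (boundary === -1) return { archive: [], keep: events.slice() }
  return { archive: events.slice(0, boundary), keep: events.slice(boundary) }
}

const eventLog: LoggedEvent[] = [
  { id: "e1", type: "task.set", timestamp: 1 },
  { id: "e2", type: "note.add", timestamp: 2 },
  { id: "e3", type: "compact.before", timestamp: 3 },
  { id: "e4", type: "compact.after", timestamp: 4 },
  { id: "e5", type: "step.add", timestamp: 5 },
]
const pruned = splitAtLastCompaction(eventLog)
```

Archiving rather than deleting keeps the auditability argument from the design decisions intact.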