import research docs from prior conversation and scattered sources

2026-04-29 15:11:46 +00:00
parent 9915be2ca6
commit b256fc7eb5
9 changed files with 4274 additions and 0 deletions
--- a/docs/research/agent-hud-architecture.md
+++ b/docs/research/agent-hud-architecture.md
@@ -0,0 +1,675 @@
+# Agent HUD: A Context Management Architecture for LLM Agents
+
+## Research → Design Synthesis
+
+This document synthesizes findings from six research areas into a concrete architecture for an agent HUD system that replaces ad-hoc compaction with structured, cache-aware context management.
+
+**See also**: [`ujsx-v2-typebox-rewrite.md`](./ujsx-v2-typebox-rewrite.md) — The UJSX v2 rewrite plan, replacing the existing POC at `/workspace/aui` with a TypeBox-schema-driven universal JSX IR that supports actual JSX syntax, bi-directional transforms, and component schemas as tool schemas.
+
+---
+
+## The Problem
+
+Current LLM agent interfaces have a fundamental flaw: **context is managed as a monolithic conversation log that grows until it must be violently compacted**. This creates several pathologies:
+
+1. **Information cliff**: Compaction replaces rich conversation with a lossy summary. Everything before the compaction boundary is gone — the agent can't even know what it lost.
+
+2. **Reactive, not proactive**: OpenCode's compaction fires at ~92% context usage. There's no budgeting — system prompt, tool definitions, and conversation history compete for the same fixed space with no allocation strategy.
+
+3. **Cognitive waste**: The agent sees the full conversation log every turn, including messages that are no longer relevant. Early messages about setup, resolved errors, and abandoned approaches consume tokens without providing value.
+
+4. **Cognitive strain**: The agent has no ambient awareness of its own state. It must explicitly call tools to check context usage, search history, or understand where it is in a task. Each tool call costs a turn and consumes context.
+
+5. **No provider-cache alignment**: OpenCode's 2-part system prompt split (`llm.ts:115-126`) is the only cache-aware structure. The rest of the context — messages, tool results, the full conversation — has no cache segmentation.
+
+The key insight: **the "agent frame" — what the LLM sees in a single turn — should be treated as a composition problem, not an append-only log problem.**
+
+---
+
+## The Leverage Point
+
+OpenCode's `llm.ts:115-126` reveals the critical pattern:
+
+```typescript
+// rejoin to maintain 2-part structure for caching if header unchanged
+if (system.length > 2 && system[0] === header) {
+  const rest = system.slice(1)
+  system.length = 0
+  system.push(header, rest.join("\n"))
+}
+```
+
+This splits the system prompt into:
+- **Part 0** (cached, stable): Agent prompt, provider prompt — changes rarely, benefits from prompt caching
+- **Part 1** (dynamic, re-cached each call): Environment, skills, instructions — changes often
+
+The plugin system pushes to `output.system[]` which gets merged into Part 1. The open-memory plugin injects context status (~50 tokens) into this dynamic part.
+
+**This 2-part pattern generalizes.** We can extend it to a multi-part context composition with different cache characteristics:
+
+| Part | Content | Changes | Cache Value |
+|------|---------|---------|-------------|
+| 0 | Agent identity, core instructions | Rarely | Highest (static across turns) |
+| 1 | HUD static layer (task, key decisions) | On agent action | High (stable within a task phase) |
+| 2 | HUD dynamic layer (notes, recent events) | Every turn | Medium (re-cached each call) |
+| 3 | Conversation messages | Every turn | Low (grows monotonically) |
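+
+Why ordering parts by stability matters can be shown with a small sketch. It assumes the provider cache is prefix-based, so only the leading run of parts that is byte-identical to the previous call can be reused. `stablePrefixLength` and the sample part strings are hypothetical illustrations, not opencode APIs.
+
+```typescript
+// Hypothetical helper: counts how many leading parts are identical between
+// two consecutive turns. A prefix-based cache can reuse at most this run.
+function stablePrefixLength(prev: string[], next: string[]): number {
+  let i = 0
+  while (i < prev.length && i < next.length && prev[i] === next[i]) i++
+  return i
+}
+
+// Parts 0-3 from the table above, for two consecutive turns (sample data).
+const turn1 = ["agent identity", "task: auth module", "context 45%", "msg1"]
+const turn2 = ["agent identity", "task: auth module", "context 47%", "msg1\nmsg2"]
+
+console.log(stablePrefixLength(turn1, turn2)) // 2
+```
+
+Here parts 0 and 1 are reusable and the prefix breaks at part 2. If the per-turn context line were placed before the task section instead, the reusable prefix would shrink to part 0 alone.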
+
+---
+
+## Architecture: The Agent HUD
+
+### Core Concept
+
+The HUD is a **structured markdown document** that the agent sees at the top of every turn, injected via the `experimental.chat.system.transform` plugin hook. Unlike compaction (which discards information), the HUD is:
+- **Composed**: Built from components, not a flat log
+- **Bidirectional**: The agent can update it via tools
+- **Cache-aware**: Structured so static parts benefit from provider caching
+- **Adaptive**: Automatically adjusts density based on context pressure
+
+### The "Agent Frame" Model
+
+Instead of treating the conversation as an append-only log, think of each agent turn as a **frame** composed of:
+
+```
+┌─────────────────────────────────────┐
+│  System Prompt Part 0 (cached)       │  ← Agent identity, core instructions
+│  [never changes within a session]    │
+├─────────────────────────────────────┤
+│  System Prompt Part 1 (re-cached)    │  ← HUD rendered to markdown
+│  ┌─────────────────────────────────┐ │
+│  │ # Task                          │ │  ← Static HUD (changes on action)
+│  │ Implement auth module            │ │
+│  │                                 │ │
+│  │ ## Context                      │ │  ← Adaptive density
+│  │ 67% used (134k/200k tokens)     │ │  ← Context status
+│  │ Trend: growing rapidly           │ │
+│  │                                 │ │
+│  │ ## Key Decisions                │ │  ← Agent-maintained state
+│  │ • Using JWT over sessions       │ │
+│  │ • bcrypt for password hashing   │ │
+│  │                                 │ │
+│  │ ## Active Files                 │ │  ← Derived from tool calls
+│  │ • src/auth/mod.ts (editing)     │ │
+│  │ • src/auth/jwt.ts (referenced)  │ │
+│  │                                 │ │
+│  │ ## Next Steps                    │ │  ← Agent's own plan
+│  │ 1. Add refresh token rotation    │ │
+│  │ 2. Write auth middleware         │ │
+│  │ 3. Add tests                    │ │
+│  │                                 │ │
+│  │ ## Notes                        │ │  ← Agent's scratchpad
+│  │ • DB schema: users, sessions   │ │
+│  │ • Rate limit: 100/min default   │ │
+│  └─────────────────────────────────┘ │
+├─────────────────────────────────────┤
+│  Conversation Messages              │  ← Standard message history
+│  [filtered by compaction boundary]  │
+└─────────────────────────────────────┘
+```
+
+### The Append-Only Event Log
+
+Underlying the HUD is an **append-only event log** — similar to Yjs/CRDT event streams but simpler because we're single-author (one agent per session). This is the "source of truth" that the HUD renders from.
+
+```typescript
+interface HudEvent {
+  id: string              // UUID
+  type: HudEventType      // discriminated union
+  timestamp: number       // Unix ms
+  sessionId: string       // opencode session ID
+  turn: number            // which agent turn
+  payload: unknown        // type-specific data
+}
+
+type HudEventType = 
+  | "task.set"            // Agent sets the current task description
+  | "task.update"         // Agent refines the task
+  | "decision.record"    // Agent records a key decision
+  | "note.add"           // Agent adds a note
+  | "note.update"        // Agent updates a note
+  | "note.remove"        // Agent removes a note
+  | "file.open"          // Agent reads a file (derived from tool calls)
+  | "file.edit"          // Agent edits a file (derived from tool calls)
+  | "file.close"         // File falls out of recent context
+  | "step.complete"      // Agent marks a step complete
+  | "step.add"          // Agent adds a next step
+  | "step.reorder"       // Agent reorders steps
+  | "error.encountered"  // Agent encounters an error
+  | "error.resolved"     // Agent resolves an error
+  | "blocker.add"       // Agent identifies a blocker
+  | "blocker.remove"     // Agent removes a blocker
+  | "context.snapshot"   // Context window usage snapshot (auto-generated)
+  | "compact.before"     // Pre-compaction state snapshot
+  | "compact.after"      // Post-compaction state + summary
+```
+
+The event log is persisted and append-only. It replaces the "conversation log as only state" model with "conversation log as one input to the HUD, alongside structured state."
+
+### Rendering Pipeline: UJSX → MDAST → Markdown
+
+The HUD uses the **UJSX v2** universal JSX IR (see [`ujsx-v2-typebox-rewrite.md`](./ujsx-v2-typebox-rewrite.md)) built on TypeBox schemas. The existing POC at `/workspace/aui` already implements UJSX → mdast → markdown with a `TransformRegistry` and `mdast-util-to-markdown`. The v2 rewrite adds:
+
+- Actual JSX syntax (via `jsxImportSource: "@ade/ujsx"`)
+- TypeBox schema-driven node types (schemas ARE types, schemas ARE tool parameter schemas)
+- Bi-directional transforms (UJSX ↔ mdast, UJSX ↔ hast, etc.)
+- HTML-agnostic core (no `onClick`, `className` in universal props)
+
+```
+HUD Components (.tsx with JSX syntax)
+    │
+    ▼
+h() factory / JSX transform → UElement tree (TypeBox-schema-validated)
+    │
+    ▼  TransformRegistry (direction: 'ujsx->mdast')
+    │  Rules match on TypeBox schemas, not string tags
+    │
+mdast tree
+    │
+    ▼  mdast-util-to-markdown + mdast-util-gfm
+    │
+Markdown String
+    │
+    ▼  Injected via experimental.chat.system.transform
+    │
+System Prompt Part 1
+```
+
+**Why UJSX (not Hono JSXNode, not hast→mdast)?**
+
+1. **Schema-driven**: TypeBox schemas serve triple duty — TypeScript types, runtime validation, and tool parameter schemas. Component props = tool input schemas. Zero duplication.
+2. **Bi-directional**: Rules convert both UJSX→mdast AND mdast→UJSX. Parse existing markdown (notes, AGENTS.md) back into the IR.
+3. **HTML-agnostic**: No `onClick`, `className`, `aria-*` in the core. The IR isn't pretending to be HTML.
+4. **HostConfig preserved**: The same component tree renders to graphology, markdown, or future targets.
+5. **Actual JSX syntax**: With `jsxImportSource`, write `<TaskSection task={state.task} density="compact" />` not `h('TaskSection', { task: state.task, density: 'compact' })`.
+
+### HUD Component Architecture
+
+```tsx
+// The root HUD component
+function HUD({ state, contextInfo }: { state: HudState, contextInfo: ContextInfo }) {
+  const density = getDensity(contextInfo.percentage)
+  
+  return (
+    <Container>
+      <TaskSection task={state.task} density={density} />
+      <ContextBar info={contextInfo} density={density} />
+      {density !== 'minimal' && <DecisionsList decisions={state.decisions} density={density} />}
+      {density !== 'minimal' && <ActiveFiles files={state.activeFiles} density={density} />}
+      <NextSteps steps={state.nextSteps} density={density} />
+      {density === 'full' && <NotesSection notes={state.notes} />}
+      {contextInfo.percentage >= 85 && <WarningBanner info={contextInfo} />}
+    </Container>
+  )
+}
+
+// Adaptive density: renders differently based on context pressure
+type Density = 'full' | 'compact' | 'minimal'
+
+function getDensity(percentage: number): Density {
+  if (percentage < 70) return 'full'
+  if (percentage < 85) return 'compact'
+  return 'minimal'
+}
+
+// Example adaptive component
+function DecisionsList({ decisions, density }: { decisions: Decision[], density: Density }) {
+  if (density === 'compact') {
+    return <text>{`## Decisions (${decisions.length})\n` + decisions.map(d => `- ${d.summary}`).join('\n')}</text>
+  }
+  return (
+    <section title="Decisions">
+      {decisions.map(d => <DecisionItem decision={d} />)}
+    </section>
+  )
+}
+```
+
+### Cache-Aware System Prompt Structure
+
+The key extension of OpenCode's 2-part pattern:
+
+```
+System Message 0 (cached, stable):
+  - Agent identity prompt
+  - Provider-specific prompt
+  — Never changes within a session
+
+System Message 1 (re-cached each call):
+  - HUD static layer (task, decisions, next steps)
+  - HUD dynamic layer (context bar, notes, active files)
+  - Environment info, skills, instructions
+  — Changes based on agent actions and context pressure
+```
+
+The **static layer** of the HUD should change only when the agent explicitly updates it (task change, decision recorded, step completed). The **dynamic layer** changes every turn (context percentage, recent files, notes).
+
+For Anthropic's `cache_control`, we mark:
+- System message 0 → `cacheControl: { type: "ephemeral" }` (already done by opencode)
+- System message 1 → `cacheControl: { type: "ephemeral" }` (already done by opencode)
+
+The savings come from keeping message 0 stable (0.1x cost on cache hits) and making message 1 as small as possible while still providing full situational awareness.
+
+### HUD State and Event Log Storage
+
+The event log uses the same SQLite database opencode already has, with a new table:
+
+```sql
+CREATE TABLE hud_event (
+  id TEXT PRIMARY KEY,
+  session_id TEXT NOT NULL REFERENCES session(id),
+  type TEXT NOT NULL,           -- HudEventType
+  turn INTEGER NOT NULL,       -- agent turn number
+  timestamp INTEGER NOT NULL,   -- Unix ms
+  payload TEXT NOT NULL,       -- JSON
+  created_at INTEGER NOT NULL DEFAULT (unixepoch())
+);
+
+CREATE INDEX idx_hud_event_session ON hud_event(session_id, timestamp);
+CREATE INDEX idx_hud_event_type ON hud_event(session_id, type);
+```
+
+The **HUD state** is derived from the event log by projectors (similar to opencode's existing event sourcing pattern):
+
+```typescript
+interface HudState {
+  task: { description: string; updatedAt: number } | null
+  decisions: Array<{ id: string; summary: string; details: string; recordedAt: number }>
+  notes: Array<{ id: string; content: string; updatedAt: number }>
+  activeFiles: Array<{ path: string; lastAccessed: number; status: 'reading' | 'editing' | 'referenced' }>
+  nextSteps: Array<{ id: string; description: string; completed: boolean; order: number }>
+  blockers: Array<{ id: string; description: string; resolved: boolean }>
+  errors: Array<{ id: string; message: string; resolved: boolean; resolvedAt?: number }>
+}
+```
+
+Projectors are pure functions: `(state, event) => newState`. They're deterministic, so the state can always be rebuilt by replaying the event log from the start.
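+
+A minimal sketch of such a projector, covering only a subset of the event types. `MiniEvent`, `MiniState`, `project`, and `replay` are hypothetical names, and the payload shapes are assumptions, since the design above leaves `payload` as `unknown`.
+
+```typescript
+// Assumed payload shapes for a handful of event types (illustrative only).
+type MiniEvent =
+  | { type: "task.set"; payload: { description: string }; timestamp: number }
+  | { type: "note.add"; payload: { id: string; content: string }; timestamp: number }
+  | { type: "note.remove"; payload: { id: string }; timestamp: number }
+  | { type: "step.add"; payload: { id: string; description: string }; timestamp: number }
+  | { type: "step.complete"; payload: { id: string }; timestamp: number }
+
+interface MiniState {
+  task: { description: string; updatedAt: number } | null
+  notes: Array<{ id: string; content: string; updatedAt: number }>
+  nextSteps: Array<{ id: string; description: string; completed: boolean; order: number }>
+}
+
+const emptyState: MiniState = { task: null, notes: [], nextSteps: [] }
+
+// Pure projector: returns a new state and never mutates its input.
+function project(state: MiniState, event: MiniEvent): MiniState {
+  switch (event.type) {
+    case "task.set":
+      return { ...state, task: { description: event.payload.description, updatedAt: event.timestamp } }
+    case "note.add":
+      return { ...state, notes: [...state.notes, { ...event.payload, updatedAt: event.timestamp }] }
+    case "note.remove":
+      return { ...state, notes: state.notes.filter(n => n.id !== event.payload.id) }
+    case "step.add":
+      return {
+        ...state,
+        nextSteps: [...state.nextSteps, { ...event.payload, completed: false, order: state.nextSteps.length }],
+      }
+    case "step.complete":
+      return {
+        ...state,
+        nextSteps: state.nextSteps.map(s => (s.id === event.payload.id ? { ...s, completed: true } : s)),
+      }
+  }
+}
+
+// Rebuild state by folding the whole log from the empty state.
+function replay(events: MiniEvent[]): MiniState {
+  return events.reduce(project, emptyState)
+}
+
+const state = replay([
+  { type: "task.set", payload: { description: "Implement auth module" }, timestamp: 1 },
+  { type: "step.add", payload: { id: "s1", description: "Write auth middleware" }, timestamp: 2 },
+  { type: "note.add", payload: { id: "n1", content: "Rate limit: 100/min default" }, timestamp: 3 },
+  { type: "step.complete", payload: { id: "s1" }, timestamp: 4 },
+])
+```
+
+This sketch does not deduplicate by `id`, so replaying a duplicated `note.add` would add the note twice. A real projector would probably need to guard against that.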
+
+Some events are **derived** rather than agent-authored:
+- `file.open`, `file.edit`, `file.close`: Derived from opencode's `tool.execute.before` and `tool.execute.after` hooks by watching for the Read, Write, and Edit tools
+- `context.snapshot`: Auto-generated from `SessionProcessor` events tracking token usage
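+
+The file-event derivation could look roughly like the sketch below. `deriveFileEvent` is a hypothetical helper, and the `filePath` argument name is an assumption about the tool input shape, not a confirmed opencode field.
+
+```typescript
+type DerivedFileEvent = { type: "file.open" | "file.edit"; path: string }
+
+// Maps a tool call to a file event, or null when the call is not file-related.
+// Assumption: file tools carry their target in a `filePath` argument.
+function deriveFileEvent(tool: string, args: { filePath?: string }): DerivedFileEvent | null {
+  if (!args.filePath) return null
+  switch (tool.toLowerCase()) {
+    case "read":
+      return { type: "file.open", path: args.filePath }
+    case "write":
+    case "edit":
+      return { type: "file.edit", path: args.filePath }
+    default:
+      return null
+  }
+}
+
+const opened = deriveFileEvent("Read", { filePath: "src/auth/jwt.ts" })
+const edited = deriveFileEvent("Edit", { filePath: "src/auth/mod.ts" })
+const ignored = deriveFileEvent("Bash", {})
+```
+
+`file.close` is left out because it depends on recency ("falls out of recent context") rather than on any single tool call.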
+
+### HUD Tools (Agent-Facing)
+
+The agent interacts with the HUD through two tools (following the router pattern from open-memory):
+
+**`hud`** (router tool — reduces context bloat from tool definitions):
+
+```typescript
+hud(input: {
+  tool: string,    // operation name
+  args?: object    // operation arguments
+})
+```
+
+Operations:
+- `task.get` / `task.set` / `task.update` — Manage current task
+- `decisions.list` / `decisions.record` / `decisions.remove` — Key decisions log
+- `notes.list` / `notes.add` / `notes.update` / `notes.remove` — Scratchpad
+- `steps.list` / `steps.add` / `steps.complete` / `steps.reorder` — Next steps / plan
+- `blockers.list` / `blockers.add` / `blockers.remove` — Blockers
+- `snapshot` — Full HUD state
+- `history` — Recent event log (for understanding what changed)
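+
+A minimal sketch of how the router might dispatch operations. `createRouter` and the in-memory `notes` array are hypothetical; a real implementation would record events through the state manager instead. The `help` operation follows the one mentioned under "Key Design Decisions".
+
+```typescript
+type Handler = (args: Record<string, unknown>) => unknown
+
+// Builds a single entry point that dispatches on the `tool` field.
+function createRouter(ops: Record<string, Handler>) {
+  return (input: { tool: string; args?: Record<string, unknown> }): unknown => {
+    // `help` lists the available operations as inline documentation.
+    if (input.tool === "help") return Object.keys(ops).sort()
+    // Own-property check so inherited keys like "toString" are not dispatched.
+    if (!Object.prototype.hasOwnProperty.call(ops, input.tool)) {
+      return { error: `unknown operation: ${input.tool}` }
+    }
+    return ops[input.tool](input.args ?? {})
+  }
+}
+
+// In-memory stand-in for the event log (illustration only).
+const notes: string[] = []
+
+const hud = createRouter({
+  "notes.add": args => {
+    notes.push(String(args.content))
+    return { ok: true }
+  },
+  "notes.list": () => [...notes],
+})
+
+hud({ tool: "notes.add", args: { content: "DB schema: users, sessions" } })
+```
+
+Unknown operations return an error object instead of throwing, so the agent sees a recoverable tool result.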
+
+**`hud_compact`** (mutation tool — separate to prevent accidental use):
+
+```typescript
+hud_compact()  // Triggers both HUD compaction AND opencode compaction
+```
+
+This is distinct from `memory_compact` because:
+1. It snapshots HUD state to the event log before compaction
+2. After compaction, the stable HUD layer carries forward — the agent doesn't lose its task, decisions, or notes
+3. It can trigger compaction at a natural breakpoint (when the agent says "next steps updated, good time to compact")
+
+### Plugin Integration (opencode)
+
+The HUD is implemented as an opencode plugin, using these hooks:
+
+```typescript
+const HUDPlugin: Plugin = async (ctx) => {
+  const stateManager = new HudStateManager(ctx)  // Manages event log + projectors
+  const renderer = new HudRenderer()             // JSX component → markdown
+
+  return {
+    tool: { hud: createHudTool(ctx, stateManager), hud_compact: createHudCompactTool(ctx, stateManager) },
+    
+    "experimental.chat.system.transform": async (input, output) => {
+      // Render HUD and inject as system prompt
+      const state = stateManager.getState(input.sessionID)
+      const contextInfo = getContextInfo(input.sessionID)
+      const markdown = renderer.render(HUD, { state, contextInfo })
+      output.system.push(markdown)
+    },
+    
+    "experimental.session.compacting": async (input, output) => {
+      // Before compaction: snapshot HUD state to event log
+      stateManager.recordEvent(input.sessionID, { type: "compact.before", payload: stateManager.getState(input.sessionID) })
+      // Replace compaction prompt with self-continuity + HUD-aware prompt
+      output.prompt = getCompactionPrompt(stateManager.getState(input.sessionID))
+    },
+    
+    "event": async ({ event }) => {
+      // Derive file events from tool calls
+      if (event.type === "tool.execute.after") {
+        stateManager.maybeRecordFileEvent(event)
+      }
+      // Track context snapshots
+      if (event.type === "message.updated") {
+        stateManager.maybeRecordContextSnapshot(event)
+      }
+    },
+    
+    "tool.execute.before": async (input, output) => {
+      // Track file access from tool calls
+      stateManager.trackToolCall(input)
+    },
+  }
+}
+```
+
+### Rendering Pipeline Implementation
+
+```typescript
+// core/renderer.ts
+
+import type { FC } from "hono/jsx"
+
+interface RenderContext {
+  density: Density
+  state: HudState
+  contextInfo: ContextInfo
+}
+
+class HudRenderer {
+  // Custom walker over Hono JSXNode tree
+  render(component: FC<any>, props: Record<string, any>): string {
+    const node = component(props)
+    return this.walkNode(node)
+  }
+
+  private walkNode(node: any): string {
+    if (typeof node === "string" || typeof node === "number") {
+      return String(node)
+    }
+    if (node === null || node === undefined || node === false) {
+      return ""
+    }
+    if (node instanceof Promise) {
+      throw new Error("Async components not supported in HUD rendering")
+    }
+
+    // JSXFragmentNode — just concatenate children
+    if (node.tag === null || node.tag === Symbol.for("hono.fragment")) {
+      return (node.children as any[]).map(c => this.walkNode(c)).join("\n")
+    }
+
+    // Function component — call it and walk the result
+    if (typeof node.tag === "function") {
+      const result = node.tag({ ...node.props, children: node.children })
+      return this.walkNode(result)
+    }
+
+    // Intrinsic element — map HTML tag to markdown
+    return this.renderElement(node.tag, node.props, node.children)
+  }
+
+  private renderElement(tag: string, props: any, children: any[]): string {
+    const childContent = (children || []).map(c => this.walkNode(c)).join("\n")
+    
+    switch (tag) {
+      case "h1": return `# ${childContent}`
+      case "h2": return `## ${childContent}`
+      case "h3": return `### ${childContent}`
+      case "strong": case "b": return `**${childContent}**`
+      case "em": case "i": return `*${childContent}*`
+      case "code": return props.lang 
+        ? `\`\`\`${props.lang}\n${childContent}\n\`\`\`` 
+        : `\`${childContent}\``
+      case "ul": return childContent
+      case "li": return `- ${childContent}`
+      case "ol": return childContent  // NOTE: <li> children already render as "- " bullets; numbering is not implemented here
+      case "p": return childContent
+      case "div": case "section": case "article": case "main": case "header":
+      case "footer": case "nav": case "aside": case "span": 
+        return childContent  // container — just pass through content
+      case "a": return `[${childContent}](${props.href})`
+      case "blockquote": return childContent.split("\n").map(l => `> ${l}`).join("\n")
+      case "hr": return "---"
+      case "br": return "\n"
+      case "pre": return childContent  // content already formatted by <code>
+      case "table": return this.renderTable(props, children)
+      default:
+        // Unknown/custom tags: use data-md attribute or just render content
+        if (props?.["data-md"]) {
+          // Custom markdown rendering hint
+          return this.renderCustomMd(props["data-md"], props, childContent)
+        }
+        return childContent
+    }
+  }
+}
+```
+
+### Adaptive Density in Practice
+
+The key insight from open-memory's context status thresholding: **information density should be proportional to context pressure.**
+
+```
+Context < 70% (GREEN / "full"):
+  ┌────────────────────────────────────┐
+  │ # Task                             │
+  │ Implement user authentication      │
+  │                                    │
+  │ ## Context: 45% (90k/200k) stable  │
+  │                                    │
+  │ ## Decisions (3)                   │
+  │ • Using JWT over sessions         │
+  │ • bcrypt for password hashing     │
+  │ • Rate limiting: 100/min default   │
+  │                                    │
+  │ ## Active Files                    │
+  │ • src/auth/mod.ts (editing)        │
+  │ • src/auth/jwt.ts (referenced)    │
+  │ • src/db/schema.ts (referenced)   │
+  │                                    │
+  │ ## Next Steps                      │
+  │ 1. ~~Add refresh token rotation~~  │
+  │ 2. Write auth middleware           │
+  │ 3. Add tests                      │
+  │                                    │
+  │ ## Notes                           │
+  │ • DB schema: users, sessions      │
+  │ • Env vars: JWT_SECRET, DB_URL    │
+  └────────────────────────────────────┘
+  ~300-500 tokens
+
+Context 70-85% (YELLOW / "compact"):
+  ┌────────────────────────────────────┐
+  │ Task: Implement auth │ 72% (144k) ↑ │
+  │ Decisions: JWT, bcrypt, 100/min    │
+  │ Steps: ~~rotation~~, middleware,   │
+  │        tests                       │
+  │ Files: auth/mod.ts, auth/jwt.ts    │
+  │ Notes: DB schema users/sessions   │
+  └────────────────────────────────────┘
+  ~100-150 tokens
+
+Context 85-92% (RED / "minimal"):
+  ┌────────────────────────────────────┐
+  │ Auth impl │ 89% │ ⚠ compact soon  │
+  │ JWT+bcrypt, step: middleware       │
+  └────────────────────────────────────┘
+  ~30-50 tokens
+```
+
+This adaptive compression means the agent always has situational awareness, while the HUD's own token cost shrinks as context pressure rises.
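+
+As an illustration, a context line can render at each of the three densities as sketched below. `renderContextLine` and the `used`/`limit` fields are assumptions made for this example; `getDensity` repeats the thresholds defined earlier.
+
+```typescript
+type Density = "full" | "compact" | "minimal"
+
+// Assumed shape for this sketch: raw token counts rather than a percentage.
+interface ContextUsage {
+  used: number
+  limit: number
+}
+
+function getDensity(percentage: number): Density {
+  if (percentage < 70) return "full"
+  if (percentage < 85) return "compact"
+  return "minimal"
+}
+
+// Renders the context line with less detail as pressure rises.
+function renderContextLine(info: ContextUsage): string {
+  const pct = Math.round((info.used / info.limit) * 100)
+  const k = (n: number) => `${Math.round(n / 1000)}k`
+  switch (getDensity(pct)) {
+    case "full":
+      return `## Context: ${pct}% (${k(info.used)}/${k(info.limit)})`
+    case "compact":
+      return `${pct}% (${k(info.used)})`
+    case "minimal":
+      return `${pct}% | compact soon`
+  }
+}
+
+console.log(renderContextLine({ used: 90_000, limit: 200_000 }))  // ## Context: 45% (90k/200k)
+console.log(renderContextLine({ used: 144_000, limit: 200_000 })) // 72% (144k)
+console.log(renderContextLine({ used: 178_000, limit: 200_000 })) // 89% | compact soon
+```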
+
+### Compaction That Preserves State
+
+The critical difference from current compaction: **the HUD state survives compaction.**
+
+Current flow:
+```
+[conversation] → COMPACT → [summary replaces everything] → agent loses context
+```
+
+HUD-aware flow:
+```
+[event log] → project to HUD state → render HUD → [inject into system prompt]
+[conversation] → COMPACT → [summary replaces conversation, but HUD persisted state carries forward]
+```
+
+Before compaction:
+1. The plugin records a `compact.before` event with the full HUD state (in the `session.compacting` hook)
+2. HUD state is persisted to the event log (already append-only)
+3. Compaction prompt includes: "Your HUD state: {rendered HUD}. This will survive compaction."
+
+After compaction:
+1. `filterCompacted` drops pre-compaction messages
+2. But the system prompt still contains the fully rendered HUD
+3. The agent sees its task, decisions, notes, and next steps — not just a narrative summary
+
+The compaction summary becomes a **complement** to the HUD, not a **replacement** for lost context. The summary handles conversational continuity ("we were discussing..."), while the HUD provides structured persistent state.
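+
+A sketch of what a HUD-aware compaction prompt might look like. `buildCompactionPrompt` is a hypothetical helper that takes the already-rendered HUD markdown, and the wording beyond the "will survive compaction" sentence is illustrative only.
+
+```typescript
+// Builds a compaction prompt that tells the model its HUD state is preserved,
+// so the summary can focus on conversational continuity.
+function buildCompactionPrompt(renderedHud: string): string {
+  return [
+    "Summarize the conversation so far for your own continuity.",
+    "Your HUD state is shown below. This will survive compaction,",
+    "so focus on conversational context that the HUD does not capture.",
+    "",
+    renderedHud,
+  ].join("\n")
+}
+
+const prompt = buildCompactionPrompt("# Task\nImplement auth module")
+```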
+
+### Comparison: Current vs HUD Architecture
+
+| Aspect | Current (Compaction) | HUD Architecture |
+|--------|---------------------|------------------|
+| State management | Monolithic conversation log | Structured event log + HUD projection |
+| Context loss | All-or-nothing compaction cliff | Adaptive density that preserves key state |
+| Agent awareness | Must call memory tool | Ambient via system prompt injection |
+| Cache optimization | 2-part system prompt only | Multi-part with stable HUD layer |
+| Recovery from compaction | LLM-generated summary (lossy) | Structured HUD state (deterministic) |
+| Cognitive load | Full conversation always visible | Relevant state always visible, details on demand |
+| Token cost at 50% | Full conversation (~100k tokens of history) | Conversation + HUD (~100k + ~300 tokens) |
+| Token cost at 90% | Conversation (truncated by compaction, then grows again) | Conversation + compact HUD (truncated + ~50 tokens) |
+
+---
+
+## Implementation Plan
+
+### Phase 1: Minimal Viable HUD (Plugin)
+
+1. **Event log table**: Add `hud_event` table to opencode's SQLite database (via plugin)
+2. **Basic event types**: `task.set`, `task.update`, `decision.record`, `note.add`, `step.add`, `step.complete`
+3. **State projector**: Pure functions that fold events into `HudState`
+4. **Simple renderer**: Template-based markdown rendering (no JSX yet)
+5. **Plugin hooks**: `system.transform` for injection, `event` for derivation, `session.compacting` for pre-compaction snapshot
+6. **Two tools**: `hud` (router) and `hud_compact`
+
+### Phase 2: JSX Rendering Pipeline
+
+1. **Hono JSXNode walking**: Custom `HudRenderer` that walks JSXNode trees directly to markdown
+2. **HUD components**: `<HUD>`, `<TaskSection>`, `<ContextBar>`, `<DecisionsList>`, `<NotesSection>`, `<NextSteps>`
+3. **Adaptive density**: Components that render differently based on `density` prop
+4. **Test rendering**: Snapshot tests comparing component output to expected markdown
+
+### Phase 3: Cache Optimization
+
+1. **HUD layer splitting**: Separate HUD into static layer (task, decisions, next steps) and dynamic layer (context bar, notes, active files)
+2. **Static layer diffing**: Only push to `output.system[]` when static content changes, reducing cache misses
+3. **Dynamic layer minimal updates**: Track what changed since last render, include only deltas
+4. **Measure cache hit rates**: Instrument provider token usage to verify caching improvements
+
+### Phase 4: Advanced Features
+
+1. **Derived events**: File tracking from tool calls, context snapshots from message events
+2. **Search integration**: `hud.search` operation using FTS5 instead of LIKE
+3. **Cross-session HUD**: Persist HUD state across sessions, enable resuming tasks
+4. **QuickJS scripting**: Use toolEnv's envProxy pattern to let agents customize their HUD components
+5. **Multi-agent HUD**: When `coord.spawn` creates sub-agents, share HUD state via parent session events
+
+---
+
+## Key Design Decisions
+
+### Why event log over direct state mutation?
+
+1. **Auditability**: Every state change has a trace. You can reconstruct the HUD at any point in time.
+2. **Compaction resilience**: Events survive compaction because they're in a separate table, not the conversation message stream.
+3. **No conflict resolution needed**: Unlike CRDTs, we're single-author (one agent per session). The event log is append-only with no merge conflicts.
+4. **Projection flexibility**: Different projectors can derive different views from the same events. The HUD is one projection; a task progress tracker could be another.
+
+### Why JSX over Handlebars/templates?
+
+1. **Component composition**: JSX supports arbitrary nesting and composition. `<CompactView>` can wrap `<DecisionsList>` with different rendering logic.
+2. **Conditional rendering**: `{density === 'full' && <NotesSection />}` is cleaner than `{{#if density.full}}...{{/if}}`.
+3. **Type safety**: Components are typed functions. Props, state, density — all compile-time checked.
+4. **Developer familiarity**: React-like patterns are widely understood.
+5. **Direct JSXNode walking avoids HTML roundtrip**: No `renderToStaticMarkup → parse → hast → mdast → markdown` needed.
+
+### Why system prompt injection instead of a dedicated message role?
+
+1. **Cache alignment**: OpenCode already manages the system prompt as a cache-aware structure. Injecting into `output.system[]` gives us immediate cache optimization.
+2. **No protocol change**: We don't need to change opencode's core messaging protocol. The plugin hook is sufficient.
+3. **Survives compaction**: System prompt is always included (it's never compacted). The HUD is always visible.
+4. **Provider compatibility**: System prompts work with every LLM provider. A custom message role might not.
+
+### Why router pattern (1 tool) over separate tools?
+
+From open-memory's research: each tool definition adds ~1-2k tokens to the system prompt. With 10+ HUD operations, that's 10-20k tokens of overhead. A single `hud` router tool costs ~2k tokens regardless of how many operations it supports. The `help` operation provides inline documentation.
+
+---
+
+## File Structure (Proposed)
+
+```
+packages/hud-plugin/
+  src/
+    index.ts                    # Plugin entry point
+    state/
+      types.ts                  # HudState, HudEvent types
+      projector.ts              # Event → State projectors
+      store.ts                  # SQLite read/write for hud_event
+    renderer/
+      components/
+        HUD.tsx                 # Root HUD component
+        TaskSection.tsx         # Task display
+        ContextBar.tsx          # Context percentage bar
+        DecisionsList.tsx       # Key decisions
+        NotesSection.tsx        # Scratchpad
+        NextSteps.tsx           # Plan/next steps
+        ActiveFiles.tsx         # Currently active files
+        WarningBanner.tsx       # Context pressure warning
+      renderer.ts               # JSXNode → markdown walker
+      density.ts                # Adaptive density logic
+    tools/
+      hud.ts                    # hud router tool definition
+      hud_compact.ts            # hud_compact tool definition
+      operations/               # Individual operations
+        task.ts
+        decisions.ts
+        notes.ts
+        steps.ts
+        files.ts
+        snapshot.ts
+        history.ts
+    hooks/
+      system-transform.ts       # experimental.chat.system.transform
+      compacting.ts             # experimental.session.compacting
+      event.ts                  # event derivation (file tracking, context snapshots)
+    context/
+      tracker.ts                # Context window tracking (from open-memory)
+      thresholds.ts             # Density thresholds
+```
+
+---
+
+## Risks and Open Questions
+
+1. **Token overhead at low context usage**: At low usage the HUD renders at full density, costing ~300-500 tokens, which is at most ~0.25% of a 200k context (the "minimal" form is only ~30-50 tokens). Is that overhead worth paying near 0% usage? Probably yes. The ROI is in the 70%+ range, where the HUD guards against catastrophic compaction: the payoff is avoiding a compaction event that loses the entire conversation.
+
+2. **Agent compliance**: Will agents consistently use `hud` tools to update their state? The current approach relies on the agent choosing to call `hud` tools. Alternatives:
+   - **Auto-derivation**: More events derived automatically from tool calls (file tracking is automatic, but decisions/notes require agent action)
+   - **AGENTS.md prompt**: Include HUD tool usage in instructions
+   - **Conventional prompting**: "Always update your HUD after completing a step or making a decision" in the compaction/system prompt
+
+3. **Stale state**: If the agent doesn't update the HUD before compaction, the HUD might be stale. Mitigation: auto-snapshot HUD state before compaction in the `session.compacting` hook.
+
+4. **JSX dependency weight**: Hono's JSX runtime adds a dependency. Alternative: use a lightweight custom JSX transform that doesn't need Hono. The renderer only needs `JSXNode` types and the walker — not the full Hono framework.
+
+5. **Multi-agent sessions**: When sub-agents are spawned, should they share HUD state? The event log is per-session, but parent-child relationships in opencode could enable HUD state inheritance.
+
+6. **Search vs. structured state**: Should agents search conversation history (like `memory({tool: "search"})`) or maintain structured state (like `hud({tool: "decisions.record"})`)? Both. The HUD is for state the agent actively maintains. Search is for recovering information the agent didn't think to record. They complement each other.
+
+7. **Event log growth**: The `hud_event` table will grow. Needs a cleanup strategy — perhaps tied to compaction events (archive events older than the last compaction point).
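+
+One possible shape for that cleanup is sketched below, under the assumption that the latest `compact.after` event carries enough state to seed the projectors. `splitAtLastCompaction` and `LoggedEvent` are hypothetical names.
+
+```typescript
+interface LoggedEvent {
+  id: string
+  type: string
+  timestamp: number
+}
+
+// Splits a chronologically ordered log at the most recent compact.after event.
+// Events before it are archive candidates; it and later events are kept.
+function splitAtLastCompaction(events: LoggedEvent[]): { archive: LoggedEvent[]; keep: LoggedEvent[] } {
+  let cut = -1
+  events.forEach((e, i) => {
+    if (e.type === "compact.after") cut = i
+  })
+  if (cut < 0) return { archive: [], keep: events }
+  return { archive: events.slice(0, cut), keep: events.slice(cut) }
+}
+
+const split = splitAtLastCompaction([
+  { id: "a", type: "task.set", timestamp: 1 },
+  { id: "b", type: "compact.before", timestamp: 2 },
+  { id: "c", type: "compact.after", timestamp: 3 },
+  { id: "d", type: "note.add", timestamp: 4 },
+])
+```
+
+In this example `split.archive` holds events `a` and `b`, and `split.keep` starts at the `compact.after` event `c`. Archiving rather than deleting keeps the auditability argument intact.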