# Agent HUD: A Context Management Architecture for LLM Agents

## Research → Design Synthesis

This document synthesizes findings from six research areas into a concrete architecture for an agent HUD system that replaces ad-hoc compaction with structured, cache-aware context management.

**See also**: [`ujsx-v2-typebox-rewrite.md`](./ujsx-v2-typebox-rewrite.md) — The UJSX v2 rewrite plan, replacing the existing POC at `/workspace/aui` with a TypeBox-schema-driven universal JSX IR that supports actual JSX syntax, bi-directional transforms, and component schemas as tool schemas.

---
## The Problem

Current LLM agent interfaces have a fundamental flaw: **context is managed as a monolithic conversation log that grows until it must be violently compacted**. This creates several pathologies:

1. **Information cliff**: Compaction replaces rich conversation with a lossy summary. Everything before the compaction boundary is gone — the agent can't even know what it lost.

2. **Reactive, not proactive**: OpenCode's compaction fires at ~92% context usage. There's no budgeting — system prompt, tool definitions, and conversation history compete for the same fixed space with no allocation strategy.

3. **Cognitive waste**: The agent sees the full conversation log every turn, including messages that are no longer relevant. Early messages about setup, resolved errors, and abandoned approaches consume tokens without providing value.

4. **Cognitive strain**: The agent has no ambient awareness of its own state. It must explicitly call tools to check context usage, search history, or understand where it is in a task. Each tool call costs a turn and consumes context.

5. **No provider-cache alignment**: OpenCode's 2-part system prompt split (`llm.ts:115-126`) is the only cache-aware structure. The rest of the context — messages, tool results, the full conversation — has no cache segmentation.

The key insight: **the "agent frame" — what the LLM sees in a single turn — should be treated as a composition problem, not an append-only log problem.**

---
## The Leverage Point

OpenCode's `llm.ts:115-126` reveals the critical pattern:

```typescript
// rejoin to maintain 2-part structure for caching if header unchanged
if (system.length > 2 && system[0] === header) {
  const rest = system.slice(1)
  system.length = 0
  system.push(header, rest.join("\n"))
}
```

This splits the system prompt into:
- **Part 0** (cached, stable): Agent prompt, provider prompt — changes rarely, benefits from prompt caching
- **Part 1** (dynamic, re-cached each call): Environment, skills, instructions — changes often

The plugin system pushes to `output.system[]` which gets merged into Part 1. The open-memory plugin injects context status (~50 tokens) into this dynamic part.

**This 2-part pattern generalizes.** We can extend it to a multi-part context composition with different cache characteristics:

| Part | Content | Changes | Cache Value |
|------|---------|---------|-------------|
| 0 | Agent identity, core instructions | Rarely | Highest (static across turns) |
| 1 | HUD static layer (task, key decisions) | On agent action | High (stable within a task phase) |
| 2 | HUD dynamic layer (notes, recent events) | Every turn | Medium (re-cached each call) |
| 3 | Conversation messages | Every turn | Low (grows monotonically) |

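
The ordering in the table matters because provider caches generally match on a shared prefix: once one part changes, everything after it is re-processed. The following is a minimal sketch of that idea, not OpenCode code. `FramePart`, `composeFrame`, and `stablePrefixLength` are hypothetical names introduced here, and the sample strings are invented. It sorts parts from most to least stable and counts how many leading parts survive unchanged between two turns.

```typescript
// Hypothetical types for illustration; none of these names exist in OpenCode.
interface FramePart {
  index: number;   // position in the frame (0 = most stable)
  content: string; // rendered text for this part
}

// Order parts from most stable to least stable so that the longest
// possible prefix stays byte-identical from one turn to the next.
function composeFrame(parts: FramePart[]): string[] {
  return [...parts].sort((a, b) => a.index - b.index).map((p) => p.content);
}

// Count the leading parts that are unchanged between two turns. A
// prefix-based provider cache can reuse at most this many parts.
function stablePrefixLength(prev: string[], next: string[]): number {
  let n = 0;
  while (n < prev.length && n < next.length && prev[n] === next[n]) n++;
  return n;
}

// Turn 1 and turn 2 differ only in the dynamic HUD layer and the messages.
const turn1 = composeFrame([
  { index: 3, content: "user: add login" },
  { index: 0, content: "You are a coding agent." },
  { index: 1, content: "# Task\nImplement auth module" },
  { index: 2, content: "Context: 40% used" },
]);
const turn2 = composeFrame([
  { index: 0, content: "You are a coding agent." },
  { index: 1, content: "# Task\nImplement auth module" },
  { index: 2, content: "Context: 43% used" },
  { index: 3, content: "user: add login\nassistant: done" },
]);

// Parts 0 and 1 are identical across both turns, so they form the reusable prefix.
const reusableParts = stablePrefixLength(turn1, turn2);
```
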
---

## Architecture: The Agent HUD

### Core Concept

The HUD is a **structured markdown document** that the agent sees at the top of every turn, injected via the `experimental.chat.system.transform` plugin hook. Unlike compaction (which discards information), the HUD is:
- **Composed**: Built from components, not a flat log
- **Bidirectional**: The agent can update it via tools
- **Cache-aware**: Structured so static parts benefit from provider caching
- **Adaptive**: Automatically adjusts density based on context pressure

### The "Agent Frame" Model

Instead of treating the conversation as an append-only log, think of each agent turn as a **frame** composed of:

```
┌─────────────────────────────────────┐
│ System Prompt Part 0 (cached)       │ ← Agent identity, core instructions
│ [never changes within a session]    │
├─────────────────────────────────────┤
│ System Prompt Part 1 (re-cached)    │ ← HUD rendered to markdown
│ ┌─────────────────────────────────┐ │
│ │ # Task                          │ │ ← Static HUD (changes on action)
│ │ Implement auth module           │ │
│ │                                 │ │
│ │ ## Context                      │ │ ← Adaptive density
│ │ 67% used (134k/200k tokens)     │ │ ← Context status
│ │ Trend: growing rapidly          │ │
│ │                                 │ │
│ │ ## Key Decisions                │ │ ← Agent-maintained state
│ │ • Using JWT over sessions       │ │
│ │ • bcrypt for password hashing   │ │
│ │                                 │ │
│ │ ## Active Files                 │ │ ← Derived from tool calls
│ │ • src/auth/mod.ts (editing)     │ │
│ │ • src/auth/jwt.ts (referenced)  │ │
│ │                                 │ │
│ │ ## Next Steps                   │ │ ← Agent's own plan
│ │ 1. Add refresh token rotation   │ │
│ │ 2. Write auth middleware        │ │
│ │ 3. Add tests                    │ │
│ │                                 │ │
│ │ ## Notes                        │ │ ← Agent's scratchpad
│ │ • DB schema: users, sessions    │ │
│ │ • Rate limit: 100/min default   │ │
│ └─────────────────────────────────┘ │
├─────────────────────────────────────┤
│ Conversation Messages               │ ← Standard message history
│ [filtered by compaction boundary]   │
└─────────────────────────────────────┘
```

### The Append-Only Event Log

Underlying the HUD is an **append-only event log** — similar to Yjs/CRDT event streams but simpler because we're single-author (one agent per session). This is the "source of truth" that the HUD renders from.

```typescript
interface HudEvent {
  id: string            // UUID
  type: HudEventType    // discriminated union
  timestamp: number     // Unix ms
  sessionId: string     // opencode session ID
  turn: number          // which agent turn
  payload: unknown      // type-specific data
}

type HudEventType =
  | "task.set"           // Agent sets the current task description
  | "task.update"        // Agent refines the task
  | "decision.record"    // Agent records a key decision
  | "note.add"           // Agent adds a note
  | "note.update"        // Agent updates a note
  | "note.remove"        // Agent removes a note
  | "file.open"          // Agent reads a file (derived from tool calls)
  | "file.edit"          // Agent edits a file (derived from tool calls)
  | "file.close"         // File falls out of recent context
  | "step.complete"      // Agent marks a step complete
  | "step.add"           // Agent adds a next step
  | "step.reorder"       // Agent reorders steps
  | "error.encountered"  // Agent encounters an error
  | "error.resolved"     // Agent resolves an error
  | "blocker.add"        // Agent identifies a blocker
  | "blocker.remove"     // Agent removes a blocker
  | "context.snapshot"   // Context window usage snapshot (auto-generated)
  | "compact.before"     // Pre-compaction state snapshot
  | "compact.after"      // Post-compaction state + summary
```

The event log is persisted and append-only. It replaces the "conversation log as only state" model with "conversation log as one input to the HUD, alongside structured state."

### Rendering Pipeline: UJSX → MDAST → Markdown

The HUD uses the **UJSX v2** universal JSX IR (see [`ujsx-v2-typebox-rewrite.md`](./ujsx-v2-typebox-rewrite.md)) built on TypeBox schemas. The existing POC at `/workspace/aui` already implements UJSX → mdast → markdown with a `TransformRegistry` and `mdast-util-to-markdown`. The v2 rewrite adds:

- Actual JSX syntax (via `jsxImportSource: "@ade/ujsx"`)
- TypeBox schema-driven node types (schemas ARE types, schemas ARE tool parameter schemas)
- Bi-directional transforms (UJSX ↔ mdast, UJSX ↔ hast, etc.)
- HTML-agnostic core (no `onClick`, `className` in universal props)

```
HUD Components (.tsx with JSX syntax)
    │
    ▼
h() factory / JSX transform → UElement tree (TypeBox-schema-validated)
    │
    ▼  TransformRegistry (direction: 'ujsx->mdast')
    │  Rules match on TypeBox schemas, not string tags
    │
mdast tree
    │
    ▼  mdast-util-to-markdown + mdast-util-gfm
    │
Markdown String
    │
    ▼  Injected via experimental.chat.system.transform
    │
System Prompt Part 1
```

**Why UJSX (not Hono JSXNode, not hast→mdast)?**

1. **Schema-driven**: TypeBox schemas serve triple duty — TypeScript types, runtime validation, and tool parameter schemas. Component props = tool input schemas. Zero duplication.
2. **Bi-directional**: Rules convert both UJSX→mdast AND mdast→UJSX. Parse existing markdown (notes, AGENTS.md) back into the IR.
3. **HTML-agnostic**: No `onClick`, `className`, `aria-*` in the core. The IR isn't pretending to be HTML.
4. **HostConfig preserved**: The same component tree renders to graphology, markdown, or future targets.
5. **Actual JSX syntax**: With `jsxImportSource`, write `<TaskSection task={state.task} density="compact" />` not `h('TaskSection', { task: state.task, density: 'compact' })`.

### HUD Component Architecture

```tsx
// The root HUD component
function HUD({ state, contextInfo }: { state: HudState, contextInfo: ContextInfo }) {
  const density = getDensity(contextInfo.percentage)

  return (
    <Container>
      <TaskSection task={state.task} density={density} />
      <ContextBar info={contextInfo} density={density} />
      {density !== 'minimal' && <DecisionsList decisions={state.decisions} density={density} />}
      {density !== 'minimal' && <ActiveFiles files={state.activeFiles} density={density} />}
      <NextSteps steps={state.nextSteps} density={density} />
      {density === 'full' && <NotesSection notes={state.notes} />}
      {contextInfo.percentage > 85 && <WarningBanner info={contextInfo} />}
    </Container>
  )
}

// Adaptive density: renders differently based on context pressure
type Density = 'full' | 'compact' | 'minimal'

function getDensity(percentage: number): Density {
  if (percentage < 70) return 'full'
  if (percentage < 85) return 'compact'
  return 'minimal'
}

// Example adaptive component
function DecisionsList({ decisions, density }: { decisions: Decision[], density: Density }) {
  if (density === 'compact') {
    return <text>{`## Decisions (${decisions.length})\n` + decisions.map(d => `- ${d.summary}`).join('\n')}</text>
  }
  return (
    <section title="Decisions">
      {decisions.map(d => <DecisionItem decision={d} />)}
    </section>
  )
}
```

### Cache-Aware System Prompt Structure

The key extension of OpenCode's 2-part pattern:

```
System Message 0 (cached, stable):
  - Agent identity prompt
  - Provider-specific prompt
  — Never changes within a session

System Message 1 (re-cached each call):
  - HUD static layer (task, decisions, next steps)
  - HUD dynamic layer (context bar, notes, active files)
  - Environment info, skills, instructions
  — Changes based on agent actions and context pressure
```

The **static layer** of the HUD should change only when the agent explicitly updates it (task change, decision recorded, step completed). The **dynamic layer** changes every turn (context percentage, recent files, notes).

For Anthropic's `cache_control`, we mark:
- System message 0 → `cacheControl: { type: "ephemeral" }` (already done by opencode)
- System message 1 → `cacheControl: { type: "ephemeral" }` (already done by opencode)

The savings come from keeping message 0 stable (0.1x cost on cache hits) and making message 1 as small as possible while still providing full situational awareness.

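
To make the savings concrete, here is a rough cost model, offered as a sketch under stated assumptions rather than a billing calculation. The 0.1x read multiplier is the figure quoted above; the 1.25x cache-write premium, the function names, and the token counts are assumptions chosen for illustration and should be checked against the provider's current pricing. With these numbers, 20 turns cost about 25,100 token-equivalents with caching versus 90,000 without.

```typescript
// Assumed pricing multipliers relative to the base input-token price.
// 0.1x for cache reads is the figure quoted in the text; the 1.25x
// write premium is an assumption and should be verified.
const CACHE_READ_MULTIPLIER = 0.1;
const CACHE_WRITE_MULTIPLIER = 1.25;

// Relative input cost of the two system messages over `turns` calls.
// Message 0 is written once and read on every later turn; message 1
// changes each turn, so it is re-written every time.
function cachedSystemCost(stableTokens: number, dynamicTokens: number, turns: number): number {
  const stable = stableTokens * (CACHE_WRITE_MULTIPLIER + CACHE_READ_MULTIPLIER * (turns - 1));
  const dynamic = dynamicTokens * CACHE_WRITE_MULTIPLIER * turns;
  return stable + dynamic;
}

// Baseline: no caching, every token is billed at full price every turn.
function uncachedSystemCost(stableTokens: number, dynamicTokens: number, turns: number): number {
  return (stableTokens + dynamicTokens) * turns;
}

// Hypothetical sizes: a 4,000-token stable prompt and a 500-token HUD.
const withCache = cachedSystemCost(4000, 500, 20);
const withoutCache = uncachedSystemCost(4000, 500, 20);
```
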
### HUD State and Event Log Storage

The event log uses the same SQLite database opencode already has, with a new table:

```sql
CREATE TABLE hud_event (
  id TEXT PRIMARY KEY,
  session_id TEXT NOT NULL REFERENCES session(id),
  type TEXT NOT NULL,           -- HudEventType
  turn INTEGER NOT NULL,        -- agent turn number
  timestamp INTEGER NOT NULL,   -- Unix ms
  payload TEXT NOT NULL,        -- JSON
  created_at INTEGER NOT NULL DEFAULT (unixepoch())
);

CREATE INDEX idx_hud_event_session ON hud_event(session_id, timestamp);
CREATE INDEX idx_hud_event_type ON hud_event(session_id, type);
```

The **HUD state** is derived from the event log by projectors (similar to opencode's existing event sourcing pattern):

```typescript
interface HudState {
  task: { description: string; updatedAt: number } | null
  decisions: Array<{ id: string; summary: string; details: string; recordedAt: number }>
  notes: Array<{ id: string; content: string; updatedAt: number }>
  activeFiles: Array<{ path: string; lastAccessed: number; status: 'reading' | 'editing' | 'referenced' }>
  nextSteps: Array<{ id: string; description: string; completed: boolean; order: number }>
  blockers: Array<{ id: string; description: string; resolved: boolean }>
  errors: Array<{ id: string; message: string; resolved: boolean; resolvedAt?: number }>
}
```

Projectors are pure functions: `(state, event) => newState`. Because they are deterministic, the state can always be rebuilt by replaying the event log. Replaying a duplicated event is only harmless if the projector keys its updates on entity IDs.

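
A projector of this shape can be sketched as a reducer over a typed subset of the events. This is an illustrative sketch, not the plugin's implementation: `SketchEvent`, `SketchState`, `project`, and `rebuild` are hypothetical names, the payload shapes are assumptions (the source leaves `payload` as `unknown`), and only five event types are handled. Updates are keyed on IDs so that replaying the same event twice leaves the state unchanged.

```typescript
// Reduced, hypothetical subset of the event and state types above.
type SketchEvent =
  | { type: "task.set"; timestamp: number; payload: { description: string } }
  | { type: "note.add"; timestamp: number; payload: { id: string; content: string } }
  | { type: "note.remove"; timestamp: number; payload: { id: string } }
  | { type: "step.add"; timestamp: number; payload: { id: string; description: string } }
  | { type: "step.complete"; timestamp: number; payload: { id: string } };

interface SketchState {
  task: { description: string; updatedAt: number } | null;
  notes: Array<{ id: string; content: string; updatedAt: number }>;
  nextSteps: Array<{ id: string; description: string; completed: boolean; order: number }>;
}

const emptyState: SketchState = { task: null, notes: [], nextSteps: [] };

// Pure projector: never mutates `state`, always returns a new value.
function project(state: SketchState, event: SketchEvent): SketchState {
  switch (event.type) {
    case "task.set":
      return { ...state, task: { description: event.payload.description, updatedAt: event.timestamp } };
    case "note.add": {
      // Replace any note with the same id so a replayed event is harmless.
      const { id, content } = event.payload;
      const others = state.notes.filter((n) => n.id !== id);
      return { ...state, notes: [...others, { id, content, updatedAt: event.timestamp }] };
    }
    case "note.remove": {
      const { id } = event.payload;
      return { ...state, notes: state.notes.filter((n) => n.id !== id) };
    }
    case "step.add": {
      const { id, description } = event.payload;
      if (state.nextSteps.some((s) => s.id === id)) return state;
      const step = { id, description, completed: false, order: state.nextSteps.length };
      return { ...state, nextSteps: [...state.nextSteps, step] };
    }
    case "step.complete": {
      const { id } = event.payload;
      return { ...state, nextSteps: state.nextSteps.map((s) => (s.id === id ? { ...s, completed: true } : s)) };
    }
  }
}

// Rebuild the full state by folding the log from the empty state.
function rebuild(log: SketchEvent[]): SketchState {
  return log.reduce(project, emptyState);
}

const sampleLog: SketchEvent[] = [
  { type: "task.set", timestamp: 1, payload: { description: "Implement auth module" } },
  { type: "step.add", timestamp: 2, payload: { id: "s1", description: "Add refresh token rotation" } },
  { type: "step.add", timestamp: 3, payload: { id: "s2", description: "Write auth middleware" } },
  { type: "note.add", timestamp: 4, payload: { id: "n1", content: "DB schema: users, sessions" } },
  { type: "step.complete", timestamp: 5, payload: { id: "s1" } },
];
const rebuilt = rebuild(sampleLog);
```
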
Some events are **derived** rather than agent-authored:
- `file.open`, `file.edit`, `file.close`: Intercepted from opencode's `tool.execute.before` / `tool.execute.after` hooks by watching for the Read, Write, and Edit tools
- `context.snapshot`: Auto-generated from `SessionProcessor` events tracking token usage

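
The file-event derivation might look like the sketch below. The `ToolCallInfo` shape (a `tool` name plus an `args.filePath` field) is an assumption made for illustration; the actual payload passed to opencode's hooks may use different field names. `deriveFileEvent` is likewise a hypothetical helper.

```typescript
// Hypothetical shape of a tool call as seen by a plugin hook; the real
// opencode hook payload may differ.
interface ToolCallInfo {
  tool: string;
  args: { filePath?: string };
}

type DerivedFileEvent = { type: "file.open" | "file.edit"; path: string };

// Map lowercased tool names to the event they imply. A Map is used rather
// than a plain object so that names like "constructor" cannot match
// inherited properties.
const FILE_TOOLS = new Map<string, DerivedFileEvent["type"]>([
  ["read", "file.open"],
  ["write", "file.edit"],
  ["edit", "file.edit"],
]);

// Return the derived event, or null when the call is not file-related
// or carries no path.
function deriveFileEvent(call: ToolCallInfo): DerivedFileEvent | null {
  const type = FILE_TOOLS.get(call.tool.toLowerCase());
  const path = call.args.filePath;
  if (type === undefined || !path) return null;
  return { type, path };
}

const opened = deriveFileEvent({ tool: "Read", args: { filePath: "src/auth/jwt.ts" } });
const edited = deriveFileEvent({ tool: "edit", args: { filePath: "src/auth/mod.ts" } });
const ignored = deriveFileEvent({ tool: "bash", args: {} });
```
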
### HUD Tools (Agent-Facing)

The agent interacts with the HUD through two tools (following the router pattern from open-memory):

**`hud`** (router tool — reduces context bloat from tool definitions):

```typescript
hud(input: {
  tool: string,      // operation name
  args?: object      // operation arguments
})
```

Operations:
- `task.get` / `task.set` / `task.update` — Manage current task
- `decisions.list` / `decisions.record` / `decisions.remove` — Key decisions log
- `notes.list` / `notes.add` / `notes.update` / `notes.remove` — Scratchpad
- `steps.list` / `steps.add` / `steps.complete` / `steps.reorder` — Next steps / plan
- `blockers.list` / `blockers.add` / `blockers.remove` — Blockers
- `snapshot` — Full HUD state
- `history` — Recent event log (for understanding what changed)

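
The router pattern can be sketched as a single entry point that dispatches on the operation name. This is a minimal illustration under assumptions: `createHudRouter`, `OperationHandler`, and the in-memory notes store are hypothetical, and the `help` response format is invented here (the operation list above does not define one). Unknown operations return an error object that lists the valid names, so the agent can self-correct without a second tool definition.

```typescript
type OperationHandler = (args: Record<string, unknown>) => unknown;

// Build one tool function that routes `input.tool` to a registered handler.
function createHudRouter(handlers: Map<string, OperationHandler>) {
  const operations = () => Array.from(handlers.keys()).sort();
  return function hud(input: { tool: string; args?: Record<string, unknown> }): unknown {
    // `help` is handled by the router itself: it lists what is registered.
    if (input.tool === "help") return { operations: operations() };
    const handler = handlers.get(input.tool);
    if (!handler) return { error: `Unknown operation: ${input.tool}`, operations: operations() };
    return handler(input.args ?? {});
  };
}

// Hypothetical in-memory scratchpad standing in for the event log.
const scratchpad: string[] = [];
const hudTool = createHudRouter(
  new Map<string, OperationHandler>([
    ["notes.add", (args) => {
      scratchpad.push(String(args.content));
      return { ok: true, count: scratchpad.length };
    }],
    ["notes.list", () => ({ notes: [...scratchpad] })],
  ]),
);

const added = hudTool({ tool: "notes.add", args: { content: "DB schema: users, sessions" } });
const listed = hudTool({ tool: "notes.list" });
const unknownOp = hudTool({ tool: "notes.purge" });
```
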
**`hud_compact`** (mutation tool — separate to prevent accidental use):

```typescript
hud_compact()  // Triggers both HUD compaction AND opencode compaction
```

This is distinct from `memory_compact` because:
1. It snapshots HUD state to the event log before compaction
2. After compaction, the stable HUD layer carries forward — the agent doesn't lose its task, decisions, or notes
3. It can trigger compaction at a natural breakpoint (when the agent says "next steps updated, good time to compact")

### Plugin Integration (opencode)

The HUD is implemented as an opencode plugin, using these hooks:

```typescript
const HUDPlugin: Plugin = async (ctx) => {
  const stateManager = new HudStateManager(ctx)  // Manages event log + projectors
  const renderer = new HudRenderer()             // JSX component → markdown

  return {
    tool: { hud: createHudTool(ctx, stateManager), hud_compact: createHudCompactTool(ctx, stateManager) },

    "experimental.chat.system.transform": async (input, output) => {
      // Render HUD and inject as system prompt
      const state = stateManager.getState(input.sessionID)
      const contextInfo = getContextInfo(input.sessionID)
      const markdown = renderer.render(HUD, { state, contextInfo })
      output.system.push(markdown)
    },

    "experimental.session.compacting": async (input, output) => {
      // Before compaction: snapshot HUD state to event log
      stateManager.recordEvent(input.sessionID, { type: "compact.before", payload: stateManager.getState(input.sessionID) })
      // Replace compaction prompt with self-continuity + HUD-aware prompt
      output.prompt = getCompactionPrompt(stateManager.getState(input.sessionID))
    },

    "event": async ({ event }) => {
      // Derive file events from tool calls
      if (event.type === "tool.execute.after") {
        stateManager.maybeRecordFileEvent(event)
      }
      // Track context snapshots
      if (event.type === "message.updated") {
        stateManager.maybeRecordContextSnapshot(event)
      }
    },

    "tool.execute.before": async (input, output) => {
      // Track file access from tool calls
      stateManager.trackToolCall(input)
    },
  }
}
```

### Rendering Pipeline Implementation

```typescript
// core/renderer.ts

import type { FC } from "hono/jsx"

interface RenderContext {
  density: Density
  state: HudState
  contextInfo: ContextInfo
}

class HudRenderer {
  // Custom walker over Hono JSXNode tree
  render(component: FC<any>, props: Record<string, any>): string {
    const node = component(props)
    return this.walkNode(node)
  }

  private walkNode(node: any): string {
    if (typeof node === "string" || typeof node === "number") {
      return String(node)
    }
    if (node === null || node === undefined || node === false) {
      return ""
    }
    if (node instanceof Promise) {
      throw new Error("Async components not supported in HUD rendering")
    }

    // JSXFragmentNode — just concatenate children
    if (node.tag === null || node.tag === Symbol.for("hono.fragment")) {
      return (node.children as any[]).map(c => this.walkNode(c)).join("\n")
    }

    // Function component — call it and walk the result
    if (typeof node.tag === "function") {
      const result = node.tag({ ...node.props, children: node.children })
      return this.walkNode(result)
    }

    // Intrinsic element — map HTML tag to markdown
    return this.renderElement(node.tag, node.props, node.children)
  }

  private renderElement(tag: string, props: any, children: any[]): string {
    const childContent = (children || []).map(c => this.walkNode(c)).join("\n")

    switch (tag) {
      case "h1": return `# ${childContent}`
      case "h2": return `## ${childContent}`
      case "h3": return `### ${childContent}`
      case "strong": case "b": return `**${childContent}**`
      case "em": case "i": return `*${childContent}*`
      // props may be null for elements without attributes, hence the optional chaining
      case "code": return props?.lang
        ? `\`\`\`${props.lang}\n${childContent}\n\`\`\``
        : `\`${childContent}\``
      case "ul": return childContent
      case "li": return `- ${childContent}`
      case "ol": return childContent  // handled by parent
      case "p": return childContent
      case "div": case "section": case "article": case "main": case "header":
      case "footer": case "nav": case "aside": case "span":
        return childContent  // container — just pass through content
      case "a": return `[${childContent}](${props?.href ?? ""})`
      case "blockquote": return childContent.split("\n").map(l => `> ${l}`).join("\n")
      case "hr": return "---"
      case "br": return "\n"
      case "pre": return childContent  // content already formatted by <code>
      case "table": return this.renderTable(props, children)
      default:
        // Unknown/custom tags: use data-md attribute or just render content
        if (props?.["data-md"]) {
          // Custom markdown rendering hint
          return this.renderCustomMd(props["data-md"], props, childContent)
        }
        return childContent
    }
  }

  // renderTable and renderCustomMd are not shown in this excerpt
}
```

### Adaptive Density in Practice

The key insight from open-memory's context status thresholding: **information density should be proportional to context pressure.**

```
Context < 70% (GREEN / "full"):
┌────────────────────────────────────┐
│ # Task                             │
│ Implement user authentication      │
│                                    │
│ ## Context: 45% (90k/200k) stable  │
│                                    │
│ ## Decisions (3)                   │
│ • Using JWT over sessions          │
│ • bcrypt for password hashing      │
│ • Rate limiting: 100/min default   │
│                                    │
│ ## Active Files                    │
│ • src/auth/mod.ts (editing)        │
│ • src/auth/jwt.ts (referenced)     │
│ • src/db/schema.ts (referenced)    │
│                                    │
│ ## Next Steps                      │
│ 1. ~~Add refresh token rotation~~  │
│ 2. Write auth middleware           │
│ 3. Add tests                       │
│                                    │
│ ## Notes                           │
│ • DB schema: users, sessions       │
│ • Env vars: JWT_SECRET, DB_URL     │
└────────────────────────────────────┘
~300-500 tokens

Context 70-85% (YELLOW / "compact"):
┌────────────────────────────────────┐
│ Task: Implement auth │ 72% (144k) ↑│
│ Decisions: JWT, bcrypt, 100/min    │
│ Steps: ~~rotation~~, middleware,   │
│        tests                       │
│ Files: auth/mod.ts, auth/jwt.ts    │
│ Notes: DB schema users/sessions    │
└────────────────────────────────────┘
~100-150 tokens

Context 85-92% (RED / "minimal"):
┌────────────────────────────────────┐
│ Auth impl │ 89% │ ⚠ compact soon   │
│ JWT+bcrypt, step: middleware       │
└────────────────────────────────────┘
~30-50 tokens
```

This adaptive compression means the agent always has situational awareness, while the HUD's token cost shrinks as context pressure grows: from ~300-500 tokens when context is plentiful down to ~30-50 tokens near the compaction threshold.

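
The three tiers can be approximated with a template-based renderer before any JSX pipeline exists. The sketch below is an assumption-laden illustration: `HudSummary` and `renderHud` are hypothetical names, the output formats only loosely follow the mock-ups above, and character length is used as a crude stand-in for token count.

```typescript
type Density = "full" | "compact" | "minimal";

// Hypothetical, flattened view of the HUD state for template rendering.
interface HudSummary {
  task: string;
  percentage: number; // context usage, 0-100
  decisions: string[];
  nextSteps: string[];
}

// Same thresholds as getDensity in the component example.
function getDensity(percentage: number): Density {
  if (percentage < 70) return "full";
  if (percentage < 85) return "compact";
  return "minimal";
}

// Render progressively terser markdown as context pressure rises.
function renderHud(hud: HudSummary): string {
  const density = getDensity(hud.percentage);
  if (density === "minimal") {
    const next = hud.nextSteps[0] ?? "none";
    return `${hud.task} | ${hud.percentage}% | compact soon\nnext: ${next}`;
  }
  if (density === "compact") {
    return [
      `Task: ${hud.task} | ${hud.percentage}%`,
      `Decisions: ${hud.decisions.join(", ")}`,
      `Steps: ${hud.nextSteps.join(", ")}`,
    ].join("\n");
  }
  return [
    "# Task", hud.task, "",
    `## Context: ${hud.percentage}%`, "",
    `## Decisions (${hud.decisions.length})`,
    ...hud.decisions.map((d) => `- ${d}`), "",
    "## Next Steps",
    ...hud.nextSteps.map((s, i) => `${i + 1}. ${s}`),
  ].join("\n");
}

const base = { task: "Implement auth", decisions: ["JWT", "bcrypt"], nextSteps: ["middleware", "tests"] };
const fullHud = renderHud({ ...base, percentage: 45 });
const compactHud = renderHud({ ...base, percentage: 75 });
const minimalHud = renderHud({ ...base, percentage: 90 });
```
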
### Compaction That Preserves State

The critical difference from current compaction: **the HUD state survives compaction.**

Current flow:
```
[conversation] → COMPACT → [summary replaces everything] → agent loses context
```

HUD-aware flow:
```
[event log] → project to HUD state → render HUD → [inject into system prompt]
[conversation] → COMPACT → [summary replaces conversation, but HUD persisted state carries forward]
```

Before compaction:
1. The plugin's `session.compacting` hook records a `compact.before` event with the full HUD state
2. HUD state is persisted to the event log (already append-only)
3. Compaction prompt includes: "Your HUD state: {rendered HUD}. This will survive compaction."

After compaction:
1. `filterCompacted` drops pre-compaction messages
2. But the system prompt still contains the fully rendered HUD
3. The agent sees its task, decisions, notes, and next steps — not just a narrative summary

The compaction summary becomes a **complement** to the HUD, not a **replacement** for lost context. The summary handles conversational continuity ("we were discussing..."), while the HUD provides structured persistent state.

### Comparison: Current vs HUD Architecture

| Aspect | Current (Compaction) | HUD Architecture |
|--------|---------------------|------------------|
| State management | Monolithic conversation log | Structured event log + HUD projection |
| Context loss | All-or-nothing compaction cliff | Adaptive density that preserves key state |
| Agent awareness | Must call memory tool | Ambient via system prompt injection |
| Cache optimization | 2-part system prompt only | Multi-part with stable HUD layer |
| Recovery from compaction | LLM-generated summary (lossy) | Structured HUD state (deterministic) |
| Cognitive load | Full conversation always visible | Relevant state always visible, details on demand |
| Token cost at 50% | Full conversation (~100k tokens of history) | Conversation + HUD (~100k + ~300 tokens) |
| Token cost at 90% | Conversation (truncated by compaction, then grows again) | Conversation + compact HUD (truncated + ~50 tokens) |

---
## Implementation Plan

### Phase 1: Minimal Viable HUD (Plugin)

1. **Event log table**: Add `hud_event` table to opencode's SQLite database (via plugin)
2. **Basic event types**: `task.set`, `task.update`, `decision.record`, `note.add`, `step.add`, `step.complete`
3. **State projector**: Pure functions that fold events into `HudState`
4. **Simple renderer**: Template-based markdown rendering (no JSX yet)
5. **Plugin hooks**: `system.transform` for injection, `event` for derivation, `session.compacting` for pre-compaction snapshot
6. **Two tools**: `hud` (router) and `hud_compact`

### Phase 2: JSX Rendering Pipeline

1. **Hono JSXNode walking**: Custom `HudRenderer` that walks JSXNode trees directly to markdown
2. **HUD components**: `<HUD>`, `<TaskSection>`, `<ContextBar>`, `<DecisionsList>`, `<NotesSection>`, `<NextSteps>`
3. **Adaptive density**: Components that render differently based on `density` prop
4. **Test rendering**: Snapshot tests comparing component output to expected markdown

### Phase 3: Cache Optimization

1. **HUD layer splitting**: Separate HUD into static layer (task, decisions, next steps) and dynamic layer (context bar, notes, active files)
2. **Static layer diffing**: Only push to `output.system[]` when static content changes, reducing cache misses
3. **Dynamic layer minimal updates**: Track what changed since last render, include only deltas
4. **Measure cache hit rates**: Instrument provider token usage to verify caching improvements

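
One way to approach the static-layer diffing in Phase 3 is to memoize the rendered static layer on a serialized key, so an unchanged state always yields the exact same string and the cached prefix is not invalidated by incidental re-rendering. This is a sketch of that one idea only; `StaticState`, `createStaticLayerRenderer`, and the render callback are hypothetical names, and using `JSON.stringify` as the change detector is an assumption that holds only while field order is stable.

```typescript
// Hypothetical view of the fields that make up the static HUD layer.
interface StaticState {
  task: string | null;
  decisions: string[];
  nextSteps: string[];
}

// Wrap a render function so it only runs when the static state changes.
function createStaticLayerRenderer(render: (s: StaticState) => string) {
  let lastKey: string | null = null;
  let lastOutput = "";
  let renderCount = 0;
  return {
    get(state: StaticState): string {
      // Serialize the fields in a fixed order to get a comparable key.
      const key = JSON.stringify([state.task, state.decisions, state.nextSteps]);
      if (key !== lastKey) {
        lastOutput = render(state);
        lastKey = key;
        renderCount++;
      }
      return lastOutput;
    },
    renders: () => renderCount,
  };
}

const staticLayer = createStaticLayerRenderer(
  (s) => `# Task\n${s.task ?? "(none)"}\n\n## Decisions\n${s.decisions.map((d) => `- ${d}`).join("\n")}`,
);

const first = staticLayer.get({ task: "Implement auth", decisions: ["JWT"], nextSteps: [] });
const second = staticLayer.get({ task: "Implement auth", decisions: ["JWT"], nextSteps: [] });
const third = staticLayer.get({ task: "Implement auth", decisions: ["JWT", "bcrypt"], nextSteps: [] });
```
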
### Phase 4: Advanced Features

1. **Derived events**: File tracking from tool calls, context snapshots from message events
2. **Search integration**: `hud.search` operation using FTS5 instead of LIKE
3. **Cross-session HUD**: Persist HUD state across sessions, enable resuming tasks
4. **QuickJS scripting**: Use toolEnv's envProxy pattern to let agents customize their HUD components
5. **Multi-agent HUD**: When `coord.spawn` creates sub-agents, share HUD state via parent session events

---
## Key Design Decisions

### Why event log over direct state mutation?

1. **Auditability**: Every state change has a trace. You can reconstruct the HUD at any point in time.
2. **Compaction resilience**: Events survive compaction because they're in a separate table, not the conversation message stream.
3. **No conflict resolution needed**: Unlike CRDTs, we're single-author (one agent per session). The event log is append-only with no merge conflicts.
4. **Projection flexibility**: Different projectors can derive different views from the same events. The HUD is one projection; a task progress tracker could be another.

### Why JSX over Handlebars/templates?

1. **Component composition**: JSX supports arbitrary nesting and composition. `<CompactView>` can wrap `<DecisionsList>` with different rendering logic.
2. **Conditional rendering**: `{density === 'full' && <NotesSection />}` is cleaner than `{{#if density.full}}...{{/if}}`.
3. **Type safety**: Components are typed functions. Props, state, density — all compile-time checked.
4. **Developer familiarity**: React-like patterns are widely understood.
5. **Direct JSXNode walking avoids HTML roundtrip**: No `renderToStaticMarkup → parse → hast → mdast → markdown` needed.

### Why system prompt injection instead of a dedicated message role?

1. **Cache alignment**: OpenCode already manages the system prompt as a cache-aware structure. Injecting into `output.system[]` gives us immediate cache optimization.
2. **No protocol change**: We don't need to change opencode's core messaging protocol. The plugin hook is sufficient.
3. **Survives compaction**: System prompt is always included (it's never compacted). The HUD is always visible.
4. **Provider compatibility**: System prompts work with every LLM provider. A custom message role might not.

### Why router pattern (1 tool) over separate tools?

From open-memory's research: each tool definition adds ~1-2k tokens to the system prompt. With 10+ HUD operations, that's 10-20k tokens of overhead. A single `hud` router tool costs ~2k tokens regardless of how many operations it supports. The `help` operation provides inline documentation.

---
## File Structure (Proposed)

```
packages/hud-plugin/
  src/
    index.ts                    # Plugin entry point
    state/
      types.ts                  # HudState, HudEvent types
      projector.ts              # Event → State projectors
      store.ts                  # SQLite read/write for hud_event
    renderer/
      components/
        HUD.tsx                 # Root HUD component
        TaskSection.tsx         # Task display
        ContextBar.tsx          # Context percentage bar
        DecisionsList.tsx       # Key decisions
        NotesSection.tsx        # Scratchpad
        NextSteps.tsx           # Plan/next steps
        ActiveFiles.tsx         # Currently active files
        WarningBanner.tsx       # Context pressure warning
      renderer.ts               # JSXNode → markdown walker
      density.ts                # Adaptive density logic
    tools/
      hud.ts                    # hud router tool definition
      hud_compact.ts            # hud_compact tool definition
      operations/               # Individual operations
        task.ts
        decisions.ts
        notes.ts
        steps.ts
        files.ts
        snapshot.ts
        history.ts
    hooks/
      system-transform.ts       # experimental.chat.system.transform
      compacting.ts             # experimental.session.compacting
      event.ts                  # event derivation (file tracking, context snapshots)
    context/
      tracker.ts                # Context window tracking (from open-memory)
      thresholds.ts             # Density thresholds
```

---
## Risks and Open Questions

1. **Token overhead at low context usage**: Is the HUD worth its overhead when the context is nearly empty? Probably yes. At low usage the HUD renders at full density, which costs ~300-500 tokens, or roughly 0.15-0.25% of a 200k context; the ~30-50 token "minimal" form only appears above 85% usage. The ROI is in the 70%+ range, where the payoff is avoiding a compaction event that loses the agent's working state along with the conversation.

2. **Agent compliance**: Will agents consistently use `hud` tools to update their state? The current approach relies on the agent choosing to call `hud` tools. Alternatives:
   - **Auto-derivation**: More events derived automatically from tool calls (file tracking is automatic, but decisions/notes require agent action)
   - **AGENTS.md prompt**: Include HUD tool usage in instructions
   - **Conventional prompting**: "Always update your HUD after completing a step or making a decision" in the compaction/system prompt

3. **Stale state**: If the agent doesn't update the HUD before compaction, the HUD might be stale. Mitigation: auto-snapshot HUD state before compaction in the `session.compacting` hook.

4. **JSX dependency weight**: Hono's JSX runtime adds a dependency. Alternative: use a lightweight custom JSX transform that doesn't need Hono. The renderer only needs `JSXNode` types and the walker — not the full Hono framework.

5. **Multi-agent sessions**: When sub-agents are spawned, should they share HUD state? The event log is per-session, but parent-child relationships in opencode could enable HUD state inheritance.

6. **Search vs. structured state**: Should agents search conversation history (like `memory({tool: "search"})`) or maintain structured state (like `hud({tool: "decisions.record"})`)? Both. The HUD is for state the agent actively maintains. Search is for recovering information the agent didn't think to record. They complement each other.

7. **Event log growth**: The `hud_event` table will grow. Needs a cleanup strategy — perhaps tied to compaction events (archive events older than the last compaction point).
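
One possible shape for that cleanup, sketched under assumptions: because a `compact.before` event carries a full HUD state snapshot, everything recorded before the most recent one could be archived and the state rebuilt from the snapshot plus later events. `LoggedEvent` and `splitAtLastSnapshot` are hypothetical names, and the sketch assumes the log is already sorted by timestamp.

```typescript
// Minimal, hypothetical view of a row in hud_event.
interface LoggedEvent {
  id: string;
  type: string;
  timestamp: number;
}

// Split a timestamp-ordered log at the most recent compact.before event.
// Events before the snapshot can be archived; the snapshot and everything
// after it must be kept to rebuild the current state.
function splitAtLastSnapshot(log: LoggedEvent[]): { archive: LoggedEvent[]; keep: LoggedEvent[] } {
  let cut = -1;
  for (let i = log.length - 1; i >= 0; i--) {
    if (log[i].type === "compact.before") {
      cut = i;
      break;
    }
  }
  // No snapshot yet: nothing is safe to archive.
  if (cut === -1) return { archive: [], keep: log.slice() };
  return { archive: log.slice(0, cut), keep: log.slice(cut) };
}

const sampleRows: LoggedEvent[] = [
  { id: "a", type: "task.set", timestamp: 1 },
  { id: "b", type: "compact.before", timestamp: 2 },
  { id: "c", type: "note.add", timestamp: 3 },
  { id: "d", type: "compact.before", timestamp: 4 },
  { id: "e", type: "step.add", timestamp: 5 },
];
const split = splitAtLastSnapshot(sampleRows);
```
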