glm-5.1 3dceb30ce9 Review cleanup: fix stale tool references, update docs, add README

- Remove unused src/context/notify.ts (never wired up)
- Fix format.ts/search.ts: update memory_messages references to router pattern
- Update AGENTS.md: reflect current state, add recommended consumer additions
- Update docs/architecture.md: match router pattern, remove stale phases
- Add README.md: problem/solution, install, tools, agent guidance

2026-04-21 12:41:14 +00:00

Open Memory: Architecture & Research

Note: AGENTS.md is the canonical operational reference for this project. This document provides deeper context on the research and design decisions.

Overview

@alkdev/open-memory is a standalone OpenCode plugin providing three capabilities:

Context Awareness — real-time tracking of context window usage with proactive warnings
Session History Browser — structured access to past sessions, messages, plans, and search
Compaction Management — better compaction prompts and on-demand compaction triggering

The core problem: OpenCode's automatic compaction fires at ~92% context usage with no warning, and the default prompt frames the task as "summarize for another agent" even though the same agent continues afterward. The agent is interrupted at an arbitrary point and nudged to write as if handing off. Open-memory gives agents awareness of context usage, control over when compaction happens, and a prompt that produces better summaries.

Problem Statement

Automatic Compaction is Disorienting

Fires at ~92% with no advance warning
Default prompt says "summarize for another agent" — misleading
Agent loses context at an unpredictable point
No way to compact at a natural breakpoint

No History Access Within Sessions

Agents can't reference prior sessions, decisions, or work
The opencode-memory.md skill shows queries are possible via sqlite3 but require manual bash commands
No structured tool interface for browsing history

Context Window Opacity

The agent has no idea how close it is to compaction
No visibility into token usage trends within a session

Architecture

Tool Design: Router Pattern

The plugin exposes exactly 2 tools to the agent:

Tool	Type	Purpose
`memory`	Read-only router	Dispatches to 8 internal operations by `{tool: "name", args: {...}}`
`memory_compact`	Mutation	Triggers compaction via `ctx.client.session.summarize()`

Why a router? OpenCode carries roughly 13.5k tokens of baseline context even for a "hello world" exchange, and each tool definition adds its JSON schema to the system prompt. Exposing the 8 read operations as separate tools would add 8 schemas (9 with memory_compact). With the router, the agent sees 2 tool definitions instead of 9, and per-operation details are available on demand through the help operation.

This pattern is inspired by toolEnv's /call registry approach and is applicable to other plugins that expose many operations.
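The dispatch step can be sketched as a lookup table keyed by operation name. This is a minimal illustration, not the plugin's actual code: the `operations` registry, the handler bodies, and the `memoryRouter` name are hypothetical, and only three of the eight operations are stubbed.

```typescript
// Hypothetical registry; operation names follow the table in this document,
// handler bodies are placeholders.
type Args = Record<string, unknown>;
type Handler = (args: Args) => string;

const operations: Record<string, Handler> = {
  summary: () => "projects: 0, sessions: 0, messages: 0, todos: 0",
  sessions: (args) => `listing up to ${Number(args.limit ?? 10)} sessions`,
  help: () => "Available operations: " + Object.keys(operations).join(", "),
};

// The single entry point the agent sees: { tool: "name", args: {...} }.
function memoryRouter(request: { tool: string; args?: Args }): string {
  // Own-property check so names like "toString" are not treated as operations.
  if (!Object.prototype.hasOwnProperty.call(operations, request.tool)) {
    // Return guidance instead of throwing, so the agent can correct itself.
    return `Unknown operation "${request.tool}". ${operations.help({})}`;
  }
  return operations[request.tool](request.args ?? {});
}
```

Only the router's schema reaches the system prompt; adding a ninth operation under this scheme changes the registry, not the tool list.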

Three Pillars

1. Context Awareness

SSE-based token tracking:

Subscribe to message.updated events via the event plugin hook
Track tokens.input from assistant messages per session
The tokens.input on the latest assistant message = current context size
Compare against model's limit.context to compute percentage used
Model limits available from ctx.client.config.get()
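The tracking steps above can be sketched as a small class. The event shape here is an assumption reduced to the fields this document mentions (`tokens.input` on assistant messages, keyed by session); the real SDK types are richer. Passing the context limit as an argument is a simplification for the example.

```typescript
// Assumed, simplified event shape; not the SDK's actual type definitions.
interface PluginEvent {
  type: string;
  properties?: {
    info?: { sessionID: string; role: string; tokens?: { input: number } };
  };
}

class ContextTracker {
  // Latest tokens.input seen per session ID.
  private latestInput: Record<string, number> = {};

  handleEvent(event: PluginEvent): void {
    if (event.type !== "message.updated") return;
    const info = event.properties?.info;
    // Only assistant messages carry the token counts we care about.
    if (!info || info.role !== "assistant" || !info.tokens) return;
    this.latestInput[info.sessionID] = info.tokens.input;
  }

  getContextInfo(sessionID: string, contextLimit: number) {
    const used = this.latestInput[sessionID];
    if (used === undefined) return undefined;
    return { used, limit: contextLimit, percentage: Math.round((used * 100) / contextLimit) };
  }
}
```

A session with no assistant message yet yields `undefined`, which lets callers skip the status line rather than report 0%.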

Thresholds (defined in src/context/thresholds.ts as the single source of truth):

Green (<70%): Healthy, no action needed
Yellow (70-85%): Consider compacting at the next natural breakpoint
Red (85-92%): Strongly recommend compacting now
Critical (>92%): Imminent automatic compaction
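As a sketch, the bands above map to a small classifier. The constant and function names are hypothetical, and the boundary handling is an assumption: the list does not say which band exactly 70%, 85%, or 92% falls into, so this version puts 70 in yellow, 85 in red, and treats only values above 92 as critical.

```typescript
type ContextStatus = "green" | "yellow" | "red" | "critical";

// Hypothetical constants mirroring the bands listed above.
const THRESHOLDS = { yellow: 70, red: 85, critical: 92 } as const;

function classify(percentage: number): ContextStatus {
  if (percentage > THRESHOLDS.critical) return "critical"; // >92%
  if (percentage >= THRESHOLDS.red) return "red";          // 85-92%
  if (percentage >= THRESHOLDS.yellow) return "yellow";    // 70-85%
  return "green";                                          // <70%
}
```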

Proactive notification:

experimental.chat.system.transform hook injects context percentage into system prompt
Agent always knows its context status without calling a tool
At yellow/red thresholds, injects an explicit advisory note

2. Compaction Management

memory_compact tool:

Calls ctx.client.session.summarize() to trigger compaction on the current session
Requires providerID and modelID — obtained from the session's last user message or context tracker
Must NOT await summarize(): the tool returns immediately and schedules the call via setTimeout(0). Compaction cannot start until the tool returns control to the event loop, so awaiting it inside the tool would deadlock
Refuses to compact if context is below 50% (compacting that early wastes a compaction cycle)
This gives the agent explicit control over when compaction happens
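The fire-and-forget behavior can be sketched as follows. This is an illustration under assumptions: `compactNow`, `MIN_PERCENT`, and the injectable `schedule` parameter are hypothetical names, and `summarize` stands in for `ctx.client.session.summarize()` with the request shape shown later in this document.

```typescript
type Summarize = (req: {
  path: { id: string };
  body: { providerID: string; modelID: string };
}) => Promise<unknown>;
type Schedule = (fn: () => void) => void;

const MIN_PERCENT = 50; // below this, compaction is refused

function compactNow(
  opts: { sessionID: string; providerID: string; modelID: string; percentage: number },
  summarize: Summarize,
  // Default defers the call until the tool has returned to the event loop.
  schedule: Schedule = (fn) => { setTimeout(fn, 0); },
): string {
  if (opts.percentage < MIN_PERCENT) {
    return `Context is at ${opts.percentage}%, below ${MIN_PERCENT}%; skipping compaction.`;
  }
  schedule(() => {
    // Deliberately not awaited by the tool; the catch avoids an unhandled rejection.
    summarize({
      path: { id: opts.sessionID },
      body: { providerID: opts.providerID, modelID: opts.modelID },
    }).catch((err) => console.error("compaction failed", err));
  });
  return "Compaction scheduled.";
}
```

Making the scheduler injectable is a choice made for this example so the deferral can be exercised without timers.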

experimental.session.compacting hook:

Replaces the default "summarize for another agent" prompt
Better prompt emphasizes self-continuity, preserving task context, decisions, and next steps
Uses structured template: Goal, Instructions, Discoveries, Accomplished, Relevant files, Notes
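A minimal sketch of such a prompt builder is shown below. The section names come from the list above; the header wording and the `SECTIONS` constant are hypothetical and are not the text shipped in src/compaction/prompt.ts.

```typescript
const SECTIONS = ["Goal", "Instructions", "Discoveries", "Accomplished", "Relevant files", "Notes"];

function getCompactionPrompt(): string {
  // Hypothetical wording: addresses the agent as the one who will continue.
  const header =
    "You are summarizing your own session so that you can continue it after compaction. " +
    "Preserve task context, decisions, and next steps.";
  const template = SECTIONS.map((name) => `## ${name}\n(fill in)`).join("\n\n");
  return `${header}\n\nUse this structure:\n\n${template}`;
}
```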

3. Session History Browser

All backed by read-only bun:sqlite queries to ${XDG_DATA_HOME:-$HOME/.local/share}/opencode/opencode.db.
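The shell-style default in that path can be reproduced in TypeScript as below. `resolveDbPath` is a hypothetical helper; it takes the environment as an argument so the fallback is easy to check, and it treats an empty `XDG_DATA_HOME` the same as an unset one, matching the `:-` form.

```typescript
function resolveDbPath(env: Record<string, string | undefined>): string {
  // `||` falls back on both undefined and "", like ${VAR:-default} in the shell.
  const dataHome = env.XDG_DATA_HOME || `${env.HOME ?? ""}/.local/share`;
  return `${dataHome}/opencode/opencode.db`;
}
```

In a plugin this would typically be called as `resolveDbPath(process.env)` before opening the database in read-only mode.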

Operations (all accessed via the memory router):

Operation	Purpose	Key args
help	Show available operations	tool (optional, for details on one)
summary	Quick counts: projects, sessions, messages, todos	—
sessions	List recent sessions with metadata	limit, projectPath
messages	Read messages from a session as markdown	sessionId, limit
search	Text search across all conversations (LIKE-based)	query, limit
compactions	List/read compaction checkpoints for a session	sessionId, read (1-based index)
context	Current context window usage	—
plans	List and read saved plans	read (filename)

Rendering:

Markdown tables for session lists
Formatted conversation transcripts for messages
Snippet + session reference for search results
Compaction checkpoints as navigable indices with summary previews
All queries use LIMIT and parameterized db.prepare().all(params)
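To illustrate the LIMIT and parameter-binding rule for the LIKE-based search, the sketch below builds a query and its bound values without executing it. The helper name, the column names (`message_id`, `data`), and the limit bounds are assumptions for illustration; only the `part` table name is taken from this document.

```typescript
interface BuiltQuery {
  sql: string;
  params: [string, number];
}

function buildSearchQuery(query: string, limit = 20, maxLimit = 100): BuiltQuery {
  // Escape LIKE wildcards (and the escape character itself) so user text matches literally.
  const escaped = query.replace(/[\\%_]/g, (ch) => "\\" + ch);
  // Clamp the limit so a caller cannot request an unbounded result set.
  const bounded = Math.max(1, Math.min(Math.floor(limit), maxLimit));
  return {
    // Hypothetical columns; user input is bound, never interpolated into the SQL.
    sql: "SELECT message_id, data FROM part WHERE data LIKE ? ESCAPE '\\' LIMIT ?",
    params: [`%${escaped}%`, bounded],
  };
}
```

The resulting pair would then be handed to a prepared statement, for example `db.prepare(sql).all(...params)`.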

Compaction Data in DB

When compaction occurs, OpenCode creates:

A synthetic user message with a compaction-type part (part.data = {type: "compaction", auto: true/false, overflow: true/false})
message.data.summary = {diffs: [...]} on the compaction message
The assistant message immediately after contains the actual summary text in a text-type part

The compactions operation queries for compaction-type parts and retrieves the adjacent summary text, presenting them as navigable checkpoints. This is a stepping stone toward agents having their own UI with HUD + last N messages + tools for long-term memories.
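The pairing logic can be sketched over in-memory data. The `Message` and `Part` shapes here are simplified assumptions (the real rows come from the `message` and `part` tables), and `findCheckpoints` is a hypothetical name. Checkpoints are numbered from 1 to match the operation's 1-based `read` argument.

```typescript
interface Part { type: string; text?: string; auto?: boolean }
interface Message { id: string; role: "user" | "assistant"; parts: Part[] }
interface Checkpoint { index: number; messageId: string; auto: boolean; summary: string }

// Expects messages in chronological order for a single session.
function findCheckpoints(messages: Message[]): Checkpoint[] {
  const checkpoints: Checkpoint[] = [];
  for (let i = 0; i < messages.length; i++) {
    const marker = messages[i].parts.filter((p) => p.type === "compaction")[0];
    if (messages[i].role !== "user" || !marker) continue;
    // The summary lives in the text parts of the assistant message right after the marker.
    const next = messages[i + 1];
    const summary = next && next.role === "assistant"
      ? next.parts.filter((p) => p.type === "text").map((p) => p.text ?? "").join("\n")
      : "";
    checkpoints.push({
      index: checkpoints.length + 1,
      messageId: messages[i].id,
      auto: marker.auto === true,
      summary,
    });
  }
  return checkpoints;
}
```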

Component Design

src/
├── index.ts              # Plugin entry: hooks + tool registration
├── tools.ts              # 2 tools: memory router + memory_compact (with setTimeout fix)
├── context/
│   ├── tracker.ts        # SSE token tracking (per-session context usage)
│   └── thresholds.ts     # Threshold constants + ContextStatus type (single source of truth)
├── history/
│   ├── queries.ts        # bun:sqlite read-only query helper (lazy singleton)
│   ├── format.ts         # Markdown rendering for session/message output
│   └── search.ts         # LIKE-based text search across conversations
└── compaction/
    └── prompt.ts         # Compaction prompt template (self-continuity, not "for another agent")

Key Technical Details

Context Percentage Calculation

From overflow.ts in OpenCode source:

count = tokens.total || (input + output + cache.read + cache.write)
reserved = config.compaction?.reserved ?? min(20000, maxOutputTokens)
usable = model.limit.input ? model.limit.input - reserved
                        : model.limit.context - maxOutputTokens

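OpenCode's own overflow check uses the full count above. The plugin uses a simpler approximation: tokens.input on the last assistant message is the context size at the time that request was sent, so we track it and compare it against the model's context limit (from config/providers), falling back to 200k tokens when no limit is available.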
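For reference, the overflow.ts pseudo-code above can be written out as a function. This is a transcription under assumptions: the `Tokens` and `ModelLimit` shapes and the `usagePercent` name are hypothetical, and `reservedOverride` stands in for `config.compaction?.reserved`.

```typescript
interface Tokens {
  total?: number;
  input: number;
  output: number;
  cache: { read: number; write: number };
}
interface ModelLimit { context: number; input?: number }

function usagePercent(
  tokens: Tokens,
  limit: ModelLimit,
  maxOutputTokens: number,
  reservedOverride?: number,
): number {
  // Prefer the reported total; otherwise sum the individual counters.
  const count = tokens.total || tokens.input + tokens.output + tokens.cache.read + tokens.cache.write;
  const reserved = reservedOverride ?? Math.min(20000, maxOutputTokens);
  // Models with an explicit input limit subtract the reserve; others subtract max output.
  const usable = limit.input ? limit.input - reserved : limit.context - maxOutputTokens;
  return (count * 100) / usable;
}
```

With 150,000 input and 2,000 output tokens on a 200,000-token model with a 32,000-token output cap, the count is 152,000 against 168,000 usable tokens, or about 90%.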

Session Summarize API

The SDK exposes ctx.client.session.summarize():

ctx.client.session.summarize({
  path: { id: sessionID },
  body: { providerID, modelID },
})

This triggers the compaction flow in OpenCode's server. Must not be awaited — see the memory_compact deadlock note above.

Plugin Hooks

experimental.session.compacting:

async (input, output) => {
  output.prompt = getCompactionPrompt(); // replaces default entirely
}

experimental.chat.system.transform:

async (input, output) => {
  const info = contextTracker.getContextInfo(input.sessionID);
  if (info) {
    output.system.push(`🟢 Context: ${info.percentage}% used (...)`);
  }
}

event:

async ({ event }) => {
  contextTracker.handleEvent(event);
}

Relationship to `open-coordinator`

Open-coordinator handles worktree orchestration, session spawning, bidirectional communication
Open-memory handles session introspection, context awareness, history browsing
Both use SSE event streams but for different purposes
Both can be used together — coordinator for multi-session workflows, memory for context management
Both implement experimental.session.compacting — open-memory's version is more detailed
The router pattern (2 tools instead of many) was first applied here and can be applied to open-coordinator

Future Work

FTS5 virtual table support for better search (stemming, ranking)
Configurable thresholds via plugin config
Session comparison tools
Export/import helpers
Integration tests

References

OpenCode source: /workspace/opencode — especially packages/opencode/src/session/compaction.ts, overflow.ts
OpenCode plugin SDK: /workspace/opencode/packages/plugin/src/index.ts
OpenCode plugin types: see Hooks interface for all available hooks
Open-coordinator plugin: /workspace/@alkimiadev/open-coordinator — architecture pattern reference
OpenCode DB schema: message, part, session, project, todo tables
OpenCode config schema: compaction.auto, compaction.prune, compaction.reserved fields
Bun SQLite docs: https://bun.com/docs/runtime/sqlite
