---
status: resolved
created: 2026-04-17
last_updated: 2026-04-17
---

# Documentation Consistency Review

Review of AGENTS.md and all 12 architecture docs for conflicting, confusing, and inconsistent content. Findings are organized by severity: Conflicts (actively misleading), Inconsistencies (confusing), and Gaps (missing info).

Each finding has a resolution status: **open** (needs decision), **resolved** (fixed), or **wontfix** (explicitly justified with rationale).

---

## 🔴 Conflicts — Actively Misleading

### C1. Runner/Spoke writes directly to Postgres vs. "No Postgres Connection" — ✅ resolved

**Files**: `agent-sessions.md`, `spoke-runner.md`, `packages.md`

**Problem**: `agent-sessions.md` diagram showed direct Postgres access from runner, contradicting spoke-runner.md ("No Postgres connection") and packages.md.

**Resolution**: Fixed diagram — session writes now go through hub operations (call protocol), not direct Postgres. Runner is stateless.

---

### C2. Hub "inherits from spoke" — ✅ resolved

**Files**: `hub-architecture.md`, `packages.md`, `AGENTS.md`

**Problem**: "Hub = Spoke + Orchestration — *inherits* the spoke's operation registry..." implied hub depends on spoke. Actual model: both → core independently.

**Resolution**: Rewrote to "Hub shares core with spoke, adds orchestration." Updated table section from "Kept from ade_spoke (wholesale)" to "From core (shared with spoke)."

---

### C3. Call protocol: conflicting signals on whether to build it now — ✅ resolved

**Files**: `call-graph.md`, `operations.md`, `overview.md`

**Problem**: Three docs gave different signals — call-graph.md said initial implementation, operations.md said stopgap without it, overview.md said needs implementation.

**Resolution**: Call protocol is in initial implementation. Removed stopgap language from operations.md. Updated overview.md to clarify it's the implementation that's needed, not the design decision. The stopgap reference was from a session that conflated the open-coordinator dev plugin with the project's native call protocol.

---

### C4. Coordination operations use `registry.execute()` — ✅ resolved

**Files**: `coordination.md`, `call-graph.md`

**Problem**: All `coord.*` operations showed `registry.execute()` calls, bypassing the call protocol designed to solve exactly the abort cascading problem that coordination needs.

**Resolution**: Updated coordination.md to use `env.*` (call protocol via buildEnv) instead of `registry.execute()`. The previous form was from the initial POC; the real implementation should use the call protocol.

---
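
To illustrate why C4 matters, here is a minimal sketch of abort cascading through an env-style call chain. It is not the project's `buildEnv()`: the names `buildEnvSketch`, `Env`, `Handler`, and the `coord.fanout` / `worker.run` operations are hypothetical, and the only assumption is that nested calls inherit the caller's `AbortSignal`, which a direct `registry.execute()` call would not do.

```typescript
// Hypothetical sketch: none of these names come from the project's code.
type Handler = (input: unknown, env: Env) => unknown;

interface Env {
  signal: AbortSignal;
  call(name: string, input: unknown): unknown;
}

// Build an env whose nested calls inherit the caller's abort signal.
function buildEnvSketch(
  handlers: Map<string, Handler>,
  parentSignal: AbortSignal,
  trace: string[],
): Env {
  return {
    signal: parentSignal,
    call(name, input) {
      if (parentSignal.aborted) {
        throw new Error(`aborted before ${name}`);
      }
      const handler = handlers.get(name);
      if (!handler) throw new Error(`unknown operation: ${name}`);
      // Each child gets its own controller, linked to the parent signal.
      const child = new AbortController();
      parentSignal.addEventListener("abort", () => child.abort(), { once: true });
      trace.push(name);
      return handler(input, buildEnvSketch(handlers, child.signal, trace));
    },
  };
}

const handlers = new Map<string, Handler>();
const root = new AbortController();
const trace: string[] = [];

handlers.set("worker.run", (input) => `done:${String(input)}`);
handlers.set("coord.fanout", (_input, env) => {
  const first = env.call("worker.run", "a");
  root.abort(); // simulate the original caller cancelling mid-workflow
  try {
    env.call("worker.run", "b");
    return [first, "unexpected"];
  } catch {
    return [first, "cancelled"];
  }
});

const result = buildEnvSketch(handlers, root.signal, trace).call("coord.fanout", null);
console.log(result, trace);
```

Because every nested env is derived from its parent's signal, cancelling the root stops the second `worker.run` without the coordinator having to track its children by hand.

---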

### C5. PendingRequestMap package location: core vs. hub — ✅ resolved

**Files**: `call-graph.md`, `operations.md`, `packages.md`

**Problem**: `buildEnv()` in `core/operations/env.ts` takes `callMap: PendingRequestMap`. `packages.md` listed PendingRequestMap in hub. Circular dependency risk.

**Resolution**: PendingRequestMap belongs in core because both hub and spoke need it. Updated `packages.md` to list `call/` module in core with PendingRequestMap, CallHandler, and call event types. Hub module changed from "Call protocol" to "Call graph" (runtime tracking/observability using core's PendingRequestMap).

> **Resolution (2026-05-18)**: PendingRequestMap is now in `@alkdev/operations` package with full implementation (not just an interface). The complete class includes `call()`, `subscribe()`, `respond()`, `emitError()`, `complete()`, and `abort()` methods. Resolved by core library extraction to `@alkdev/operations`. See `docs/reviews/core-library-extraction-sync-2026-05-18.md`.

---

## 🟡 Inconsistencies — Confusing

### I1. Redis EventTarget status duplicated in AGENTS.md provenance — ✅ resolved

**Problem**: Same work described in both "PubSub" row and "Redis EventTarget" row.

**Resolution**: Merged. Provenance table now has separate rows for PubSub (createPubSub + operators), TypedEventTarget, Redis EventTarget — each with single source of truth.

---

### I2. "Do not reference paths outside this repo" vs. provenance external refs — ✅ resolved

**Problem**: Rule prohibited external paths but provenance table was full of them with no exemption.

**Resolution**: Rewrote provenance section with explanation: "ade_spoke was a predecessor project — references are for historical traceability only." Sources now say "Copied from predecessor project" instead of `ade_spoke/operations/`. Made the rule clearer: `/workspace/` checkouts of public packages are fine; private project paths are not.

---

### I3. "Not for copying code from" vs. "Copied to core/" — ✅ resolved

**Problem**: Reference deps say read-only; provenance shows code copied from those same sources.

**Resolution**: Rules now clarify: provenance code was copied during initial setup; going forward reference deps are read-only for source-level understanding only. The distinction is: (1) use local clones as references when you have questions — source and tests beat docs, (2) don't pull in references to in-house private projects that outsiders won't have access to.

---

### I4. graphql-yoga "should fork in" (future) vs. already forked (past) — ✅ resolved

**Problem**: In AGENTS.md, line 97 said "we should fork in" while line 76 said "Done ✅."

**Resolution**: Updated AGENTS.md graphql-yoga row to past tense: "Source of createPubSub + event-target code (already forked into core/pubsub/). Kept for reference only."

---

### I5. AI SDK version column had three different versions — ✅ resolved

**Problem**: npm Version `6.0.138`, parenthetical "latest 6.x stable", git checkout `6.0.165`.

**Resolution**: Updated to: npm "Will use latest 6.x stable (currently 6.0.168)", git checkout `6.0.165` (slightly behind). Removed the stale `6.0.138` reference.

---

### I6. Four operations vs. Three MCP tools — ✅ resolved

**Problem**: Spoke protocol has `list`; MCP server didn't expose it.

**Resolution**: Added `list` as a fourth MCP tool. Updated mcp-server.md throughout (3→4 tools). Updated overview.md and AGENTS.md to match.

---

### I7. `mappings` table schema conflicts — ✅ resolved

**Resolution**: Renamed `storage-pattern.md` → `storage.md`. All table schemas now canonical in storage.md. Removed inline schemas from coordination.md and call-graph.md — they now link to storage.md. Added `detections` table, `status` column on `mappings`, and full column lists for `call_graph_nodes`/`call_graph_edges`.

---

### I8. Status enum mismatch: call graph vs. mappings — ✅ resolved

**Resolution**: Added a "Status Enum Reference" section to storage.md documenting all status enums and explaining that `mappings.active` and `call_graph_nodes.pending`/`running` are different concepts — "active" = workflow in progress, "pending"/"running" = call execution state.

---

### I9. `call_graph_nodes` columns missing from storage-pattern.md summary — ✅ resolved

**Resolution**: Full column lists for all tables now in storage.md. Removed the abbreviated summary table format in favor of per-table detailed specs.

---

### I10. Identity model — ✅ resolved

**Problem**: Call protocol `Identity` had `roles: string[]` and `AccessControl` had `requiredRoles`. These came from a prior project's dual auth system (token/keys + iroh identities). With keypal as the single auth mechanism, "roles" are just scope bundles — a configuration convention, not a separate type.

**Resolution**:
- Removed `roles` from `Identity` interface and TypeBox schema. Now `{ id, scopes, resources }` — matches keypal's `ApiKeyMetadata` exactly.
- Renamed `AccessControl.requiredRoles` → `requiredScopesAny` (OR semantics for "any of these scopes").
- Added Access Control Model section to operations.md explaining how keypal scopes/resources map to AccessControl checks.
- Updated call-graph.md `CallEventMap` and error model to match.
- All 16 core tests pass.

---
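
The I10 model can be sketched as follows. This is an illustration under stated assumptions, not the TypeBox schema from the codebase: the field names `id`, `scopes`, `resources`, and `requiredScopesAny` come from the resolution above, while the shape of `resources`, the helper `isAllowed`, the `roleBundles` table, and the scope strings are hypothetical.

```typescript
// Field names follow the resolution text: { id, scopes, resources }.
interface Identity {
  id: string;
  scopes: string[];
  // Assumption: a map from resource type to the ids the key may touch.
  resources: Record<string, string[]>;
}

interface AccessControl {
  // OR semantics: holding at least one of these scopes is enough.
  requiredScopesAny?: string[];
}

// Hypothetical helper: an empty or missing list means "no scope required".
function isAllowed(identity: Identity, access: AccessControl): boolean {
  const wanted = access.requiredScopesAny ?? [];
  if (wanted.length === 0) return true;
  const held = new Set(identity.scopes);
  return wanted.some((scope) => held.has(scope));
}

// A "role" is only a named bundle of scopes, expanded when a key is issued.
const roleBundles: Record<string, string[]> = {
  operator: ["sessions:read", "sessions:write"],
  viewer: ["sessions:read"],
};

const viewer: Identity = { id: "key_1", scopes: roleBundles.viewer, resources: {} };

console.log(isAllowed(viewer, { requiredScopesAny: ["sessions:write", "sessions:read"] }));
console.log(isAllowed(viewer, { requiredScopesAny: ["sessions:write"] }));
```

Since the role name never reaches `Identity`, the access check only ever compares scope strings, which is the point of dropping `roles` as a separate type.

---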

### I11. "Kept from ade_spoke" section includes new designs — ✅ resolved (with C2)

**Resolution**: Section renamed to "From core (shared with spoke)" and new designs moved or reclassified.

---

### I12. SSE vs WebSocket clarification — ✅ resolved

**Resolution**: Added clarification to call-graph.md: WebSocket is primary bidirectional transport for hub↔spoke and hub↔client-spoke. SSE exists for compatibility (OpenAI proxy, legacy clients) but is not preferred. A client connecting as a spoke gets full bidirectional communication over a single WebSocket. Updated AGENTS.md constraint to match. Updated hub-architecture.md hub responsibilities.

---

### I13. WebSocketEventTarget: hub-side spec — ✅ resolved (architectural task noted)

**Resolution**: Added "Hub-Side WebSocket Handling (Architectural Task)" section to spoke-runner.md outlining the needed components: Hono WebSocket upgrade, per-connection WebSocketEventTarget + PendingRequestMap, spoke lifecycle management, identity/authentication at upgrade. Flagged as architectural task needing deeper design before implementation.

---

### I14. Container Manager → Container Spoke (deferred) — ✅ resolved

**Resolution**: Renamed "Container Manager" → "Container Spoke (deferred)" in hub-architecture.md. Added "Container Spoke (deferred)" spoke type to spoke-runner.md explaining it extends base spoke with Docker + opencode lifecycle. Prerequisite: working hub + minimal base spoke first. Also added a vast.ai variant note.

---

### I15. OpenAI Proxy needs a doc home — ✅ resolved

**Resolution**: Added "OpenAI proxy — LLM provider proxy, key management, rate limiting (blocks all LLM usage)" to hub modules in packages.md. Added "Proxy LLM calls" to hub responsibilities in hub-architecture.md.

---

### I16. `ade_spoke` / `ade-v0` / `open-coordinator` unexplained external references — ✅ resolved (with I2)

**Resolution**: AGENTS.md provenance now explains predecessor project context. Sources say "Copied from predecessor project" instead of cryptic paths. open-coordinator references removed from architecture docs (it's a dev tool, not project code).

---

### I17. Open questions not cross-referenced between docs — ✅ resolved

**Resolution**: Added cross-references between hub-architecture.md (API auth question) and spoke-runner.md (WebSocket auth question). Updated container lifecycle question in spoke-runner.md to reference the deferred container spoke. These cross-references should reduce future drift, because each open question now points at the related doc that must be updated alongside it.

---

### I18. AGENTS.md: "call ≡ subscribe at protocol level" ambiguous — ✅ resolved

**Resolution**: Expanded in AGENTS.md to: "see call-graph.md: a call resolves after one event, a subscription stays open and yields events until stopped. Same message format, different consumption pattern."

---
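
The distinction in I18 can be shown with a small sketch. It is a simplification, not the protocol implementation: the `CallEvent` envelope and the `subscribe` / `call` functions are hypothetical, and synchronous generators stand in for the real transport so the example runs without a network.

```typescript
// Hypothetical envelope: the docs only say both patterns share one message format.
interface CallEvent<T> {
  requestId: string;
  payload: T;
}

// Subscription: yields every event until the consumer stops iterating.
function* subscribe<T>(source: Iterable<T>, requestId: string): Generator<CallEvent<T>> {
  for (const payload of source) {
    yield { requestId, payload };
  }
}

// Call: the same stream, consumed for exactly one event and then closed.
function call<T>(source: Iterable<T>, requestId: string): CallEvent<T> {
  const stream = subscribe(source, requestId);
  const first = stream.next();
  stream.return(undefined); // stop the underlying subscription
  if (first.done) throw new Error(`no response for ${requestId}`);
  return first.value;
}

const ticks = [1, 2, 3];

// One event, then done.
const reply = call(ticks, "r1");

// Many events, until the consumer decides to stop.
const seen: number[] = [];
for (const event of subscribe(ticks, "r2")) {
  seen.push(event.payload);
  if (seen.length === 2) break;
}

console.log(reply, seen);
```

Here `call` is written in terms of `subscribe`, so both paths emit identical envelopes and differ only in how many events the consumer takes.

---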

## 🔵 Gaps — Missing Info (Not Contradictory)

| # | Gap | Where | Status | Suggested Fix |
|---|-----|-------|--------|---------------|
| G1 | `detections` table not in storage docs | coordination.md, storage.md | ✅ resolved | Added to storage.md table list |
| G2 | MCP client vs MCP server not distinguished | packages.md | ✅ resolved | Added clarification: MCP client in core (spokes need it), MCP server hub-only |
| G3 | No Deno version specified | AGENTS.md | ✅ resolved | Added: "latest stable, currently 2.6.9" |
| G4 | Do `hub/` and `spoke/` dirs exist? | AGENTS.md workspace structure | ✅ resolved | All three package dirs exist |
| G5 | Keypal version "close enough" | AGENTS.md | ✅ resolved | Updated to note "behind npm — needs tag update" |
| G6 | `DbType.Table` not explained | AGENTS.md | ✅ resolved | Added explanation: "from our prior project's storage layer — use drizzle-typebox pattern instead" |
| G7 | Graphology "not installed yet" may be stale | AGENTS.md | ✅ resolved | Verified: not in deno.json yet, updated phrasing |
| G8 | Provenance statuses undated | AGENTS.md | ✅ resolved | Rewrote provenance for clarity; historical context noted |
| G9 | `scripts/analyze_lint.ts` not explained | AGENTS.md | ✅ resolved | Verified exists; added description: in-house dev tool (filtering, stats for large lint output) |

---

## Resolution Log

| ID | Decision | Date | Rationale |
|----|----------|------|-----------|
| C1 | Fixed diagram: session writes go through hub, not direct Postgres | 2026-04-17 | Spokes have no Postgres connection; writes must go through hub operations |
| C2 | Rewrote "inherits spoke" to "shares core, adds orchestration" | 2026-04-17 | Actual dependency model is hub→core, spoke→core, not hub→spoke |
| C3 | Call protocol is initial implementation; removed stopgap language | 2026-04-17 | Stopgap/open-coordinator references were from a session that conflated dev plugin with project code. Call protocol is project code |
| C4 | Coordination ops use call protocol (env.*) not registry.execute() | 2026-04-17 | registry.execute() was POC pattern; call protocol provides abort cascading and observability that coordination needs |
| C5 | PendingRequestMap is in core, not hub | 2026-04-17 | Both hub and spoke need it; core's buildEnv() references it |
| I1-I6 | AGENTS.md provenance and reference deps rewritten for clarity | 2026-04-17 | Eliminated duplicated rows, clarified rules about external refs vs reference deps, fixed version info, added list to MCP tools |
| I7/I8/I9 | Storage doc centralized all table schemas; removed inline duplications | 2026-04-17 | Renamed storage-pattern.md → storage.md; coordination.md and call-graph.md now link to it; added detections table, status column on mappings, full column lists |
| I10 | Removed roles from Identity; renamed requiredRoles → requiredScopesAny | 2026-04-17 | With keypal as single auth, "roles" are scope bundles (convention), not a type. Identity now { id, scopes, resources } matching keypal's ApiKeyMetadata. AccessControl.requiredRoles → requiredScopesAny |
| I12 | SSE/WebSocket transport distinction clarified | 2026-04-17 | WebSocket primary for all bidirectional communication; SSE for compatibility only. Updated call-graph.md, AGENTS.md, hub-architecture.md |
| I13 | Hub-side WebSocket handling flagged as architectural task | 2026-04-17 | Added spec outline to spoke-runner.md; needs deeper design |
| I14 | Renamed Container Manager → Container Spoke (deferred) | 2026-04-17 | Extends base spoke with Docker/opencode lifecycle. Prerequisite: working hub + minimal spoke first |
| I15 | OpenAI proxy added to hub module list and responsibilities | 2026-04-17 | Added to packages.md and hub-architecture.md |
| I16 | open-coordinator references removed from architecture docs | 2026-04-17 | It's a dev tool for local agent coordination, not a project dependency |
| I17 | Cross-references added between hub and spoke open questions | 2026-04-17 | Auth and container questions now link between docs |
| I18 | "call ≡ subscribe" expanded with explanation and link | 2026-04-17 | AGENTS.md now explains: call resolves after one event, subscribe streams until stopped |

---

## Superseding Resolutions (2026-05-18 Core Library Extraction)

The following findings from this review have been further resolved by the extraction of `@alkdev/operations` v0.1.0 and `@alkdev/pubsub` v0.1.0 to npm. The original resolution in each case was correct at the time; these notes record the additional progress.

| Finding | Original Issue | Additional Resolution |
|---------|---------------|----------------------|
| C5 | PendingRequestMap is in core, not hub | **Further resolved**: PendingRequestMap is now in `@alkdev/operations` package with full implementation (not just an interface). Resolved by core library extraction to `@alkdev/operations`. See `docs/reviews/core-library-extraction-sync-2026-05-18.md`. |
| I2 | `env.ts` has PendingRequestMap interface only | **Further resolved**: Full PendingRequestMap class is now in `@alkdev/operations` with `call()`, `subscribe()`, `respond()`, `emitError()`, `complete()`, and `abort()`. Resolved by core library extraction to `@alkdev/operations`. See `docs/reviews/core-library-extraction-sync-2026-05-18.md`. |
| I5 | `OperationContext.pubsub` typed as unknown | **Further resolved**: `pubsub` field has been removed from OperationContext in `@alkdev/operations`. Subscriptions use `PendingRequestMap.subscribe()` instead. Resolved by core library extraction to `@alkdev/operations`. See `docs/reviews/core-library-extraction-sync-2026-05-18.md`. |
| I6 | `OperationContext.stream` never populated | **Further resolved**: `stream` field has been removed from OperationContext in `@alkdev/operations`. Resolved by core library extraction to `@alkdev/operations`. See `docs/reviews/core-library-extraction-sync-2026-05-18.md`. |
| I7 | `@repeaterjs/repeater` version mismatch risk | **Further resolved**: Repeater is now inlined in `@alkdev/pubsub`, eliminating the external dependency and version mismatch risk. Resolved by core library extraction to `@alkdev/pubsub`. See `docs/reviews/core-library-extraction-sync-2026-05-18.md`. |

---

## Remaining Open Items

All items from this review have been resolved. Future architecture work that was identified:

1. **Hub-side WebSocket handling** (I13) — spec outline added, needs deeper design before implementation
2. **Container spoke** (I14) — deferred until hub + minimal spoke are working
3. **Instruction firewall** — future project for safe bash/filesystem access from untrusted agent roles
4. **Message/part schema iteration** — storage.md has structure, detailed data shapes need more work

The following entries are still listed here but are marked resolved in the sections above:

- **I17** — Cross-reference open questions between docs (✅ resolved)
- **I18** — "call ≡ subscribe" needs clarification (✅ resolved)
- **G1/G2/G3/G9** — Small gaps: detections table, MCP client/server, Deno version, lint script (✅ resolved)