Replace the Vercel AI SDK with direct OpenAI SDK calls and a custom AgentLoop. The AI SDK has zero runtime integration today, so removing it costs nothing. Supply chain risk (2-5 releases/day, April 2026 Vercel breach, bus factor of 1) makes it a liability we don't need.

Key changes:
- ADR-018 accepted: openai package (zero runtime deps) replaces ai SDK
- AgentLoop handles multi-step tool execution explicitly (~300 LOC vs AI SDK's ~2700 LOC streamText)
- Hub owns UIMessage/UIPart/ToolCallState types (extends ADR-016)
- Hub owns streaming protocol (subset of AI SDK's UIMessageChunk wire format with step boundaries, error handling, usage tracking)
- operationToOpenAITool() maps TypeBox schemas directly, no adapter
- Trade-off: ~1100 LOC total new code for the savings of 6+ transitive deps, supply chain risk, and release cadence coupling

Updates AGENTS.md constraints and dependencies, adds OQ-63/OQ-64/OQ-65 and Theme 11 (Inference & LLM Integration) to open questions.
| status | last_updated |
|---|---|
| reviewed | 2026-05-26 |
Open Questions Tracker
Cross-cutting compilation of all unresolved questions across the hub architecture documents, organized by theme. Questions that appear in multiple documents are unified here with cross-references.
How to Use This Document
- Each question has an ID (e.g., OQ-01), status, origin (which doc(s)), and priority assessment
- Cross-references link related questions that may conflict or answer each other
- When a question is resolved, update its status to `resolved` and add a resolution note
- Once all questions in a theme are resolved, the theme section can be removed
Table of Contents
| Theme | Questions | Focus |
|---|---|---|
| 1. Authentication & Authorization | OQ-01–OQ-05 | Auth models, permissions, SSO |
| 2. Spoke Connectivity & Lifecycle | OQ-06–OQ-11, OQ-61–OQ-62 | Spoke provisioning, WebSocket, concurrent ops, dev spoke |
| 3. Data Integrity & Lifecycle | OQ-12–OQ-15 | Deletion, retention, truncation, FK enforcement |
| 4. Session & Schema Design | OQ-16–OQ-19 | Message schema, compaction, versioning, nesting |
| 5. Configuration & Infrastructure | OQ-20–OQ-25 | Config reload, CI/CD, SSL, tokenEnv, secret refs |
| 6. Role & Identity Management | OQ-26–OQ-29 | Role sync, inheritance, dynamic creation, switching |
| 7. Task Management | OQ-30–OQ-33 | Task storage, bulk updates, embeddings |
| 8. Deployment & Operations | OQ-34–OQ-37 | Migrations, hot spare, observability, Redis topology |
| 9. Cross-Cutting Implementation Gaps | OQ-38–OQ-50 | Startup, config, logger, Gitea, keypal, auth, schemas |
| 10. Future / Low Priority | OQ-51–OQ-60 | Phase 3+, memory, versioning, visualization |
| 11. Inference & LLM Integration | OQ-63–OQ-65 | Streaming protocol, SDK choice, part persistence |
Resolved by ADRs
| ADR | Questions Resolved |
|---|---|
| ADR-014 | OQ-21, OQ-22, OQ-34, OQ-35, OQ-36, OQ-37 |
| ADR-015 | OQ-16, OQ-17, OQ-26, OQ-28, OQ-51, OQ-55 |
| ADR-016 | OQ-18, OQ-19 (confirmed) |
| ADR-017 | OQ-26, OQ-28, OQ-51 (overlaps with ADR-015) |
| ADR-018 | OQ-16 (extended to TypeScript types) |
Theme 1: Authentication & Authorization
OQ-01: What is the API authentication model?
- Origin: hub-architecture.md OQ-2
- Status: open
- Priority: high — blocks all authenticated endpoints
- Question: Should the hub use API keys with the keypal pattern, a simpler token auth stopgap, or something else? This affects every authenticated endpoint in the system.
- Cross-references: OQ-02 (WebSocket auth), OQ-43 (MCP auth)
OQ-02: How does WebSocket authentication work for spoke connections?
- Origin: spoke-runner.md OQ-4
- Status: open
- Priority: high — blocks all spoke connections
- Question: Should the spoke authenticate via token in the first message after connect, token in the query string, or token in the subprotocol header? This also affects the `SpokeConfig.auth` format: the config system currently supports `tokenFile`, but the actual auth protocol is undefined.
- Cross-references: OQ-01 (API auth model), OQ-46 (spoke config auth field), infrastructure.md Security section
OQ-03: How are permissions enforced at the call protocol layer?
- Origin: agent-roles.md OQ-2
- Status: resolved
- Priority: high
- Resolution: `OperationContext.identity` carries the resolved permissions from `sessions.data.scope`. The `CallHandler` evaluates `AccessControl.requiredScopes` against the session's resolved scope. The principal-agent framework ensures delegated permissions are properly intersected.
OQ-04: How are service accounts provisioned?
- Origin: agent-roles.md OQ-6
- Status: narrowed
- Priority: medium
- Question: Does the hub need a `hub.createAccount` operation for programmatic service account creation, or is manual creation (with keypal CLI) sufficient for v1? LLM-specific email conventions (e.g., `glm-5.1@alk.dev`) are deployment-specific, not core architecture. Git attribution for LLM accounts uses the account's `giteaUsername`; this is a config concern, not an auth architecture concern.
- Narrowed by: ADR-017. LLM accounts are service accounts with specific scopes, same pattern as any other automated identity. V1: manual creation. Future: `hub.createAccount` operation.
OQ-05: Should the hub integrate with git providers via SSO?
- Origin: hub-architecture.md OQ-3
- Status: narrowed
- Priority: low
- Question: Originally: should `api.alk.dev` share sessions with Gitea? Narrowed: git provider integration (Gitea, GitHub, etc.) is a spoke concern via operations, not through SSO. The dev spoke exposes git operations. SSO with any specific git provider is out of scope for a generalized hub. For v1, git access is through the dev spoke's git operations, not shared auth.
- Narrowed by: ADR-015 and ADR-017
Theme 2: Spoke Connectivity & Lifecycle
OQ-06: How does a spoke receive its project context?
- Origin: spoke-runner.md OQ-1
- Status: open
- Priority: high — blocks spoke provisioning
- Question: Does the hub tell the spoke which git repo to clone, or does the spoke come pre-configured with a project? For the dev spoke specifically: the spoke binary connects to the hub, receives project/workspace context via `hub.register`, and clones/checks out the relevant repo. The exact protocol needs specification.
- Cross-references: OQ-07 (source sync), OQ-61 (dev spoke operations)
OQ-07: How does source sync work for external compute?
- Origin: spoke-runner.md OQ-3
- Status: open
- Priority: medium
- Question: For GPU compute spokes on vast.ai — does the spoke clone from Gitea automatically, or does the hub push source to it?
- Cross-references: OQ-06 (project context)
OQ-08: Can a spoke handle concurrent operations?
- Origin: spoke-runner.md OQ-5
- Status: narrowed
- Priority: medium → low
- Question: Originally: can a spoke handle multiple `call.requested` events concurrently? Narrowed: the hub doesn't decide this; the spoke does. The hub dispatches `call.requested` and the spoke processes it. A spoke can process concurrently (multiple handlers) or serially (queue). The only hub-side question is whether the hub should respect a spoke-advertised concurrency limit in its `hub.register` payload (default: 1). This is a minor spoke registration enhancement, not an architectural question.
- Cross-references: OQ-09 (operation list freshness)
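The hub-side half of this question could look roughly like the sketch below: a per-spoke dispatcher that never has more calls in flight than the spoke advertised. This is an illustration under assumptions, not the hub's design. `SpokeDispatcher`, the `Call` shape, and the queue-when-full behavior are hypothetical; only the default limit of 1 comes from the text above.

```typescript
// Hypothetical sketch: cap in-flight calls per spoke at the limit the spoke
// advertised in its hub.register payload (default: 1).
type Call = { id: string; run: () => Promise<void> };

class SpokeDispatcher {
  private inFlight = 0;
  private readonly queue: Call[] = [];
  private readonly maxConcurrency: number;

  constructor(maxConcurrency: number = 1) {
    this.maxConcurrency = maxConcurrency;
  }

  // Number of calls currently running on the spoke.
  get active(): number {
    return this.inFlight;
  }

  // Accept a call: it starts at once if a slot is free, otherwise it waits.
  dispatch(call: Call): void {
    this.queue.push(call);
    this.drain();
  }

  private drain(): void {
    while (this.inFlight < this.maxConcurrency) {
      const call = this.queue.shift();
      if (call === undefined) break;
      this.inFlight++;
      // Free the slot whether the call resolves or rejects, then start the next.
      const done = () => {
        this.inFlight--;
        this.drain();
      };
      call.run().then(done, done);
    }
  }
}
```

With the default limit the dispatcher behaves like a serial queue, so a spoke that advertises nothing keeps today's one-at-a-time behavior.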
OQ-09: When does a spoke re-register its operations?
- Origin: spoke-runner.md OQ-6
- Status: narrowed
- Priority: low
- Resolution: For v1, re-register on reconnect only. Spokes disconnect and reconnect (the call protocol handles abort cascading for in-flight calls). Push-based registry updates are a v2 enhancement. The `hub.register` call on reconnect is sufficient for v1.
OQ-10: What is the design for the hub-side WebSocket handler?
- Origin: spoke-runner.md Hub-Side WebSocket Handling section
- Status: open
- Priority: high — blocks all spoke functionality
- Question: What is the full design for the hub-side WebSocket handler? This includes: Hono WebSocket upgrade handler, per-connection `WebSocketEventTarget`, per-connection `PendingRequestMap`, spoke lifecycle management (connect/register/heartbeat/disconnect), identity/authentication integration, and reconnection state recovery. Currently described as "an architectural task that needs deeper design" with no spec.
- Cross-references: OQ-02 (WebSocket auth), OQ-06 (spoke project context, which constrains handler message types), OQ-08 (concurrent operations)
OQ-11: Dev spoke and compute spoke lifecycle
- Origin: spoke-runner.md OQ-2, hub-architecture.md Components table
- Status: narrowed
- Priority: low
- Question: Originally: "Container spoke extends base spoke with Docker container lifecycle management and opencode integration." Narrowed by ADR-015: the dev spoke is a compiled Deno binary (not an opencode container) that exposes dev operations over the standard call protocol. Compute spokes (GPU, vast.ai) are separate spoke types. The container spoke concept is deferred — the dev spoke replaces it for v1.
- Cross-references: OQ-06 (project context), OQ-61 (dev spoke operations)
OQ-61: What operations does the dev spoke expose?
- Origin: ADR-015
- Status: open
- Priority: medium
- Question: The dev spoke replaces opencode's tool suite. What operations does it expose? Minimum: `dev.bash.exec`, `dev.fs.read`, `dev.fs.write`, `dev.fs.list`, `dev.git.status`, `dev.git.diff`, `dev.git.commit`, `dev.git.clone`, `dev.git.checkout`. Web search may be a hub-native operation or a separate spoke. The exact operation set and their input/output schemas need specification.
- Cross-references: OQ-06 (project context), OQ-11 (dev spoke lifecycle)
OQ-62: How is the dev spoke distributed and configured?
- Origin: ADR-015
- Status: open
- Priority: medium
- Question: The dev spoke is a compiled Deno binary. How is it distributed (Docker image, binary download, package manager)? How is it configured (hub URL, auth token, project context)? Does it use `SpokeConfig` from hub-config.md or a separate config format? The spoke doesn't have Postgres or Redis, just a WebSocket connection to the hub and local tools.
- Cross-references: OQ-06 (project context), OQ-46 (spoke config auth field)
Theme 3: Data Integrity & Lifecycle
OQ-12: Operation deletion and call graph referential integrity
- Origin: call-graph.md OQ-1, storage/spokes.md OQ-1
- Status: open
- Priority: high — blocks operation lifecycle management
- Question: The `call_graph_nodes.operationId` column has a RESTRICT FK to `operations.id`. An operation cannot be deleted while any call records reference it. Two strategies proposed: (a) deny removal while call records exist, or (b) reassign referencing call records to a sentinel `__removed__` operation. Making `operationId` nullable in flowgraph's `CallNodeAttrs` is another option. This needs coordination with the `@alkdev/flowgraph` package.
- Cross-references: OQ-13 (call graph retention interacts with deletion constraints)
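To make strategies (a) and (b) concrete, here is a small in-memory sketch. It models the tables as plain collections and is not the hub's storage code: the `Store` and `CallNode` shapes and the `deleteOperation` helper are hypothetical, and the sentinel name is taken from the question above.

```typescript
// Hypothetical in-memory model of the two proposed deletion strategies.
type CallNode = { id: string; operationId: string };
type Store = { operations: Set<string>; callNodes: CallNode[] };

const REMOVED_SENTINEL = "__removed__";

function deleteOperation(
  store: Store,
  operationId: string,
  strategy: "deny" | "reassign",
): void {
  const referencing = store.callNodes.filter((n) => n.operationId === operationId);
  if (referencing.length > 0 && strategy === "deny") {
    // Strategy (a): behave like the RESTRICT FK and refuse the delete.
    throw new Error(
      `operation ${operationId} is referenced by ${referencing.length} call record(s)`,
    );
  }
  if (referencing.length > 0) {
    // Strategy (b): point the call records at the sentinel operation first,
    // so the foreign key stays valid after the real operation is removed.
    store.operations.add(REMOVED_SENTINEL);
    for (const node of referencing) node.operationId = REMOVED_SENTINEL;
  }
  store.operations.delete(operationId);
}
```

Strategy (b) keeps the call history but loses the link to the original operation, which is one reason it interacts with the retention question in OQ-13.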
OQ-13: Call graph retention policy
- Origin: storage/call-graph.md Retention Policy section, storage/README.md OQ-3
- Status: open
- Priority: medium
- Question: Need TTL-based cleanup of completed/failed call graph records older than N days, with aggregation for observability. Default 90 days is specified but no config field exists yet. Aggregation for observability (dashboards, metrics) is deferred to Phase 2.
- Cross-references: OQ-12 (operation deletion)
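The selection rule for TTL cleanup can be sketched as a pure function. This is only an illustration: the `CallRecord` shape, the status names, and the `finishedAt` field are assumptions, and `retentionDays` stands in for the config field that does not exist yet. The 90-day default is the one stated above.

```typescript
// Hypothetical record shape; only completed/failed records are eligible.
type CallRecord = {
  id: string;
  status: "pending" | "running" | "completed" | "failed";
  finishedAt?: Date;
};

const DEFAULT_RETENTION_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the ids of finished records older than the retention window.
function selectExpired(
  records: CallRecord[],
  now: Date,
  retentionDays: number = DEFAULT_RETENTION_DAYS,
): string[] {
  const cutoff = now.getTime() - retentionDays * MS_PER_DAY;
  return records
    .filter((r) => r.status === "completed" || r.status === "failed")
    .filter((r) => r.finishedAt !== undefined && r.finishedAt.getTime() < cutoff)
    .map((r) => r.id);
}
```

In-flight records are never selected regardless of age, so a long-running call cannot be deleted out from under its caller.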
OQ-14: Call graph payload truncation strategy
- Origin: storage/call-graph.md
- Status: open
- Priority: medium
- Question: Strategy defined (10KB threshold, `{ _truncated }` format) but no config field exists for the threshold. Payload redaction strategy also needs config fields. Object storage for payloads exceeding the truncation threshold is Phase 2 and not yet implemented.
- Cross-references: OQ-13 (retention policy)
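A threshold check of this kind might look like the sketch below. It is a guess at the shape, not the specified format: the source only names a `{ _truncated }` marker, so the `originalBytes` field, the `truncatePayload` helper, and reading "10KB" as 10 × 1024 bytes are all assumptions. `maxBytes` stands in for the missing config field.

```typescript
// Assumption: "10KB" is read as 10 * 1024 bytes of UTF-8 encoded JSON.
const DEFAULT_TRUNCATE_BYTES = 10 * 1024;

// Hypothetical marker shape; only the `_truncated` key comes from the spec.
type TruncatedMarker = { _truncated: true; originalBytes: number };

function truncatePayload(
  payload: unknown,
  maxBytes: number = DEFAULT_TRUNCATE_BYTES,
): unknown {
  // JSON.stringify returns undefined for values such as `undefined` or functions.
  const json: string | undefined = JSON.stringify(payload);
  if (json === undefined) return payload;
  const bytes = new TextEncoder().encode(json).length;
  if (bytes <= maxBytes) return payload;
  const marker: TruncatedMarker = { _truncated: true, originalBytes: bytes };
  return marker;
}
```

Measuring encoded bytes rather than string length matters for non-ASCII payloads, where one character can take several bytes.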
OQ-15: Polymorphic FK enforcement for providerId
- Origin: storage/spokes.md OQ-2
- Status: open
- Priority: medium
- Question: `providerId` in `spokes` table references different parent tables depending on `spokeType` (either `dev_env_spokes` or `compute_spokes`). Current approach is application-layer enforcement. Alternatives (two nullable FK columns, DB triggers) are deferred.
Theme 4: Session & Schema Design
OQ-16: Session/message schema finalization
- Origin: agent-sessions.md Schema Research Needed section, storage/sessions.md
- Status: resolved by ADR-016
- Priority: high → medium (unblocked, narrower scope)
- Resolution: The hub defines its own canonical message/part format based on AI SDK `UIMessage` + parts. Opencode's format is an import concern, not an architectural constraint. The hub's format stays close to opencode's for import compatibility but is self-determined. The remaining design work is specifying the hub's exact part types and session data shapes; this is an implementation task, not an open architectural question.
- Cross-references: OQ-17 (compaction), OQ-19 (part nesting)
OQ-17: Session message compaction
- Origin: agent-sessions.md, storage/README.md OQ-2
- Status: resolved by ADR-016
- Priority: medium → low
- Resolution: Compaction is an opencode concept (LLM-driven summarization). The hub may need message pruning (server-side truncation of old messages for API response size), but this is different from compaction. For v1, full message history is served. Pruning is a potential future optimization, not a current design concern.
OQ-18: Message data versioning
- Origin: storage/README.md OQ-1
- Status: resolved by ADR-016
- Priority: medium
- Resolution: The hub's `data` JSONB columns are implicitly versioned by their TypeBox schema evolution. No separate `version` column is needed. Each schema change is documented in migration history, and the TypeBox schemas in the codebase are the source of truth. Opencode's `version` column was an opencode concern, not a hub pattern.
OQ-19: Part nesting
- Origin: storage/sessions.md
- Status: resolved by ADR-016
- Priority: low
- Resolution: Flat parts with `messageId` FK for v1. If nesting becomes necessary for the hub's own use cases (e.g., tool results containing sub-parts), a `parentId` column can be added. No need to carry opencode's tree structure.
Theme 5: Configuration & Infrastructure
OQ-20: Config reload without restart
- Origin: hub-config.md OQ-1, hub-startup.md OQ-2
- Status: open
- Priority: medium
- Question: For non-encrypted fields (logLevel, cache TTLs), should SIGHUP or an API call trigger re-reading the config file? Encrypted fields would need the master key to remain in memory, which the current design explicitly avoids after startup. Currently config is read-once; changes require restart.
- Cross-references: OQ-21 (CI/CD config generation), OQ-22 (config layers)
OQ-21: Config file generation for CI/CD
- Origin: hub-config.md OQ-2
- Status: resolved by ADR-014
- Priority: high → resolved
- Resolution: In the Docker deployment model, config files are pre-encrypted by the operator (using `alkhub-config`) and mounted at runtime. The Docker secret provides the master key. CI/CD doesn't need the master key because it doesn't encrypt. Config files are built into the Docker image or mounted as volumes. The `alkhub-config` CLI runs on the operator's machine, not in CI/CD.
OQ-22: Multiple config file layers
- Origin: hub-config.md OQ-4
- Status: resolved by ADR-014
- Priority: low → resolved
- Resolution: Docker Compose handles environment variation via different config files mounted at different paths. Dev: local decrypted config. Prod: pre-encrypted config mounted as read-only volume. No overlay system needed — the Docker model makes this straightforward with volume mounts.
OQ-23: What are the production SSL/TLS requirements for PostgresConfig?
- Origin: hub-config.md PostgresConfig section
- Status: resolved by ADR-014
- Priority: medium → low
- Resolution: In the Docker deployment model, Postgres and the hub run in the same Docker network. TLS between containers in the same network is not required; Docker network policies handle isolation. TLS termination happens at the reverse proxy (nginx/caddy) for external traffic. `PostgresConfig.ssl: boolean` is sufficient for v1. If a future deployment topology puts Postgres on a different network, a `PostgresSslConfig` object can be added, but same-network Docker deployment doesn't need it.
OQ-24: HTTPServiceConfig.auth.tokenEnv deprecation
- Origin: hub-config.md, operations.md
- Status: open
- Priority: high — security violation
- Question: `HTTPServiceConfig.auth.tokenEnv` is deprecated and should be removed. The `from_openapi.ts` line `Deno.env.get(config.auth.tokenEnv)` is a bug that violates the "no secrets in env vars" rule. All outbound auth tokens should be resolved from `client_secrets` via `secretKey` wiring. This needs to be removed and replaced before any production use.
OQ-25: Secret reference resolution ordering
- Origin: hub-config.md OQ-7
- Status: open
- Priority: medium
- Question: Should `resolveSecretRefs` fail at startup if a referenced `secretKey` doesn't exist in `client_secrets` yet? Current preference: fail at startup for clients that are `enabled: true`. If a client is disabled, the missing secret is logged as a warning and left unresolved.
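The stated preference can be expressed as a short sketch. `resolveSecretRefs`, `secretKey`, and `enabled` are names from the question above, but the signature, the `ClientDef` shape, and modeling `client_secrets` as a `Map` are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real config and storage types are not specified here.
type ClientDef = { name: string; enabled: boolean; secretKey?: string };
type ResolveResult = { resolved: Map<string, string>; warnings: string[] };

// `secrets` stands in for decrypted client_secrets rows, keyed by secretKey.
function resolveSecretRefs(
  clients: ClientDef[],
  secrets: Map<string, string>,
): ResolveResult {
  const resolved = new Map<string, string>();
  const warnings: string[] = [];
  for (const client of clients) {
    if (client.secretKey === undefined) continue; // client needs no secret
    const value = secrets.get(client.secretKey);
    if (value !== undefined) {
      resolved.set(client.name, value);
    } else if (client.enabled) {
      // Enabled client with a dangling reference: abort startup.
      throw new Error(`client ${client.name}: secretKey ${client.secretKey} not found`);
    } else {
      // Disabled client: leave unresolved and report a warning instead.
      warnings.push(`client ${client.name}: secretKey ${client.secretKey} not found (disabled)`);
    }
  }
  return { resolved, warnings };
}
```

Returning warnings rather than logging inside the function keeps the sketch independent of the logger, which is itself an open item (OQ-40).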
Theme 6: Role & Identity Management
OQ-26: Role import/sync operation
- Origin: agent-roles.md OQ-1, storage/README.md OQ-9 (partial), storage/roles.md
- Status: resolved by ADR-017
- Priority: medium → resolved
- Resolution: The hub is database-first for roles from day one. No `roles.sync` from `.opencode/agents/*.md` is needed. Role definitions are seeded by migrations for built-in roles. Custom roles are created via `hub.createRole`. Opencode's `.opencode/agents/` file format is an opencode concern, not a hub concern.
OQ-27: Role inheritance with permission resolution
- Origin: agent-roles.md OQ-8
- Status: open
- Priority: medium
- Question: When a role has a `parentId`, its permissions are unioned with the parent's, with the child's rules taking priority in case of conflict. Max depth: 3 levels. Circular inheritance is prevented at role creation time. The description exists but the implementation is not yet specified.
- Cross-references: OQ-26 (role sync: resolved, but inheritance still needs implementation)
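One way to read the description above is sketched here. The rule representation (a permission key mapped to `allow` or `deny`), the `Role` shape, and the `resolvePermissions` helper are hypothetical, and "3 levels" is interpreted as a chain of at most three roles including the role itself. The cycle check is a defensive duplicate of the creation-time guard.

```typescript
// Hypothetical rule model: permission key -> effect.
type Effect = "allow" | "deny";
type Role = { id: string; parentId?: string; rules: Record<string, Effect> };

// Assumption: the role itself counts as level 1.
const MAX_DEPTH = 3;

function resolvePermissions(
  roleId: string,
  roles: Map<string, Role>,
): Record<string, Effect> {
  const chain: Role[] = [];
  const seen = new Set<string>();
  let current: string | undefined = roleId;
  while (current !== undefined) {
    if (seen.has(current)) throw new Error(`circular inheritance at ${current}`);
    const role: Role | undefined = roles.get(current);
    if (role === undefined) throw new Error(`unknown role ${current}`);
    seen.add(current);
    chain.push(role);
    if (chain.length > MAX_DEPTH) throw new Error(`inheritance deeper than ${MAX_DEPTH} levels`);
    current = role.parentId;
  }
  // Merge from the root ancestor down so a child's rule overrides its parent's.
  const merged: Record<string, Effect> = {};
  for (const role of chain.reverse()) Object.assign(merged, role.rules);
  return merged;
}
```

Merging root-first gives the union of all rules, and the later assignment from the child is what implements "child takes priority in case of conflict".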
OQ-28: Dynamic role creation
- Origin: agent-roles.md OQ-3
- Status: resolved by ADR-017
- Priority: low → resolved
- Resolution: The hub supports `hub.createRole` for programmatic role creation. Opencode's `Agent.generate()` pattern (on-the-fly LLM-driven role creation) is not a hub concern. Roles are DB records, created via hub operations.
OQ-29: Per-session role switching
- Origin: agent-roles.md OQ-4
- Status: narrowed
- Priority: medium
- Question: Originally: should a session be able to change roles mid-conversation, like opencode supports? Narrowed by ADR-015: this is a hub-only concern. The hub binds role at session creation. `session.updateRole` is a potential operation if needed, but v1 roles are bound at creation. No opencode agent model to reconcile.
Theme 7: Task Management
OQ-30: Task storage and sync implementation
- Origin: storage/README.md OQ-9
- Status: open
- Priority: high
- Question: The database is the source of truth for tasks; markdown files are the authoring surface. The sync operation (files → database) exists conceptually but is not yet implemented. This blocks the SDD workflow from using database-backed task tracking.
- Cross-references: OQ-26 (role sync — resolved, similar pattern)
OQ-31: Bulk task status updates
- Origin: storage/tasks.md OQ-2
- Status: open
- Priority: medium
- Question: Should completing a meta task auto-mark all sub-tasks as completed? Likely yes, but this is application-level logic that needs implementation.
OQ-32: Cross-project task dependencies
- Origin: storage/tasks.md OQ-3
- Status: open
- Priority: low
- Question: Not supported for v1. Application-layer validation prevents cross-project references. DB trigger guard deferred to Phase 2.
OQ-33: Task embeddings
- Origin: storage/tasks.md OQ-1
- Status: open
- Priority: low
- Question: Vector embeddings for similarity search. `metadata` JSONB can hold an embedding reference later, or a separate `task_embeddings` table can be added. Deferred.
Theme 8: Deployment & Operations
OQ-34: Background migration vs. startup migration
- Origin: hub-startup.md OQ-1
- Status: resolved by ADR-014
- Priority: medium → resolved
- Resolution: Single-container deployment means migrations must complete before the hub serves. Background migration requires schema version negotiation that adds complexity without benefit. Migrations block startup. Docker's restart policy handles failures.
OQ-35: Hot spare / zero-downtime restart
- Origin: hub-startup.md OQ-3
- Status: resolved by ADR-014
- Priority: low → resolved
- Resolution: v1 is single-container deployment with Docker restart policy. Zero-downtime restart requires connection draining and session transfer, which is Phase 2. For v1, Docker restart with health checks is sufficient.
OQ-36: Startup observability
- Origin: hub-startup.md OQ-4
- Status: resolved by ADR-014
- Priority: low → resolved
- Resolution: The `/health` endpoint with step-level progress and structured JSON logging to stdout is sufficient. Docker's `HEALTHCHECK` directive and log aggregation handle the rest. No pub/sub startup events needed.
OQ-37: Redis deployment topology
- Origin: hub-architecture.md OQ-1
- Status: resolved by ADR-014
- Priority: medium → resolved
- Resolution: Redis runs in the same Docker network as the hub. Spokes connect via WebSocket, not Redis. Redis is hub-internal only. Latency between hub and Redis is negligible within the same Docker network. If a future topology needs Redis closer to remote spokes, that would be a spoke-level concern (a spoke-side Redis), not the hub's Redis.
Theme 9: Cross-Cutting Implementation Gaps
OQ-38: Hub startup implementation
- Origin: hub-startup.md — full startup sequence spec, no implementation yet
- Status: open
- Priority: high — blocks all functionality
- Question: `src/main.ts` and `startHub()` are not yet implemented. The full 11-step startup sequence is specified in hub-startup.md. This is the single most blocking implementation gap.
- Cross-references: OQ-20 (config reload), OQ-24 (tokenEnv deprecation)
OQ-39: Hub-specific config in operations package
- Origin: operations.md Known Gaps
- Status: open
- Priority: high — blocks hub startup
- Question: `core/config/types.ts` in the operations package has spoke-only config. Hub-specific config (postgres, redis, auth) needs to be added. This overlaps with the hub-config.md spec but the actual code doesn't exist yet.
- Cross-references: OQ-38 (startup implementation)
OQ-40: Logger configuration
- Origin: operations.md Known Gaps
- Status: open
- Priority: medium
- Question: `core/logger/mod.ts` is a stub that only logs the `["logtape", "meta"]` category. Needs proper config for app-level loggers. Hub startup Step 3 configures the logger, but the implementation is stub-level.
OQ-41: Gitea operations at startup
- Origin: storage/README.md OQ-7
- Status: narrowed
- Priority: medium
- Question: Originally: load Gitea swagger spec at startup and register ~300 operations via FromOpenAPI. Narrowed by ADR-015: Gitea integration is no longer a core hub dependency. Git operations come from the dev spoke (or a separate Gitea spoke via FromOpenAPI). Loading Gitea's OpenAPI spec at startup is optional — a future spoke can provide it. For v1, git operations are exposed through the dev spoke, not hub-native Gitea operations.
OQ-42: Keypal adapter testing
- Origin: storage/README.md OQ-4
- Status: open
- Priority: medium
- Question: `HubKeyStorage` (the Drizzle adapter for keypal) needs comprehensive tests before production use.
OQ-43: MCP endpoint authentication detail
- Origin: mcp-server.md Auth section
- Status: open
- Priority: medium
- Question: "The MCP endpoint uses bearer token auth. Each runner gets a token at registration." No detail on token format, rotation, issuance, or how tokens are validated. This connects to OQ-01 (API auth model) and OQ-02 (WebSocket auth).
- Cross-references: OQ-01 (API auth model), OQ-02 (WebSocket auth)
OQ-44: Reactive vs. call graph requested semantics
- Origin: call-graph.md OQ-2
- Status: open
- Priority: medium
- Question: In `FlowGraph`, `call.requested` creates a node in `pending` state. In `WorkflowReactiveRoot`, `call.requested` maps to `NodeStatus.running`. This is a deliberate semantic difference: the reactive model tracks execution progress while the call graph model tracks protocol state. But implementers must be aware that feeding the same event to both models produces different initial statuses.
OQ-45: Client config schema evolution
- Origin: storage/README.md OQ-8
- Status: open
- Priority: medium
- Question: Existing DB rows in `clients.config` may fail validation if the TypeBox schema changes. Using `Type.Optional()` for new fields helps, but breaking changes need a strategy. Full contract migration protocol is a pending task.
OQ-46: Spoke auth field format in config
- Origin: hub-config.md OQ-3
- Status: open
- Priority: high
- Question: The `SpokeConfig.auth` field format is blocked on the spoke-runner WebSocket auth design (OQ-02). Config system supports `tokenFile` but actual protocol is TBD. The dev spoke (ADR-015) will use `tokenFile` to read its auth token from a Docker secret or mounted file.
- Cross-references: OQ-02 (WebSocket auth), OQ-62 (dev spoke distribution)
OQ-47: Config schema version
- Origin: hub-config.md OQ-5
- Status: open
- Priority: low
- Question: `BaseConfig.$schema` is optional. `alkhub-config init` should generate it. This is an implementation detail: it doesn't block anything but supports forward compatibility and editor validation.
OQ-48: Cross-doc terminology migration
- Origin: storage/README.md OQ-5
- Status: open
- Priority: low
- Question: The ADR-005 rename from "runner" to "spoke" is done in primary specs but "runner/runnerId" references still exist in other architecture docs. Need updating for consistency.
OQ-49: ADR-012 migration
- Origin: ADR-012 — Proposed, not Accepted
- Status: open
- Priority: medium
- Question: ADR-012 proposes terminology changes (`sessions.agentName` → `roleName`, etc.). The ADR is in "Proposed" status, not "Accepted". The storage docs already use "role" terminology, but the rename needs a migration plan and the ADR needs to be accepted or rejected.
OQ-50: Key rotation background sweep implementation
- Origin: storage-spec-phase1-resolutions.md D4
- Status: open
- Priority: high
- Question: Task `specify-key-rotation-protocol` addresses key rotation, and the protocol is described in storage/services.md, but the background sweep implementation (cron job that re-encrypts `client_secrets` rows with the current key version) is not yet implemented.
- Cross-references: OQ-24 (tokenEnv deprecation: more secrets flow through `client_secrets` after this), OQ-25 (secret reference resolution: sweep depends on correct secret ref wiring)
Theme 10: Future / Low Priority
OQ-51: Role database-authoritative (Phase 3)
- Origin: agent-roles.md Phase 3, storage/roles.md
- Status: resolved by ADR-017
- Priority: low → resolved
- Resolution: The hub is database-first from day one. There is no Phase 1 (file-based) or Phase 2 (file sync). Roles are defined in the `roles` table from the start, seeded by migrations. Markdown files are not part of the hub's role system.
OQ-52: Memory across sessions
- Origin: agent-roles.md OQ-7
- Status: deferred
- Priority: low
- Question: Should LLM accounts have persistent memory across sessions? This is separate from session message history. Could be a `memories` table or vector store. Deferred: separate feature with no current requirement.
OQ-53: Task versioning
- Origin: storage/tasks.md OQ-4
- Status: open
- Priority: low
- Question: Should previous versions of task body be kept? Decision for v1: no versioning, just update in place.
OQ-54: High-contention task notes
- Origin: storage/tasks.md
- Status: open
- Priority: low
- Question: DB-level concatenation for append is specified, but consider separating into a `task_notes` table for high-contention scenarios.
OQ-55: Anthropic conversation import
- Origin: storage/README.md OQ-6
- Status: deferred by ADR-015
- Priority: low
- Question: Import script for Anthropic conversations. This is a nice-to-have research tool for re-importing past conversations, not a core feature. The format has likely changed since it was last relevant. Not shipped with the codebase.
OQ-56: ADR-013 out-of-scope items
- Origin: ADR-013
- Status: open
- Priority: low
- Question: Several items explicitly out of scope for ADR-013: bidirectional Zod ↔ TypeBox sync, runtime schema migration, auto-generation of TypeScript types from wire schemas, converting Zod `.transform()`/`.pipe()` output types. May revisit if needed.
OQ-57: Call graph visualization
- Origin: call-graph.md What We Defer #2
- Status: open
- Priority: low
- Question: API only, no Sigma.js UI for v1.
OQ-58: Stream deduplication
- Origin: call-graph.md What We Defer #3
- Status: open
- Priority: medium
- Question: `Value.Hash({operationId, input})` deduplication for multiple subscribers to the same stream. May be needed for subscription scalability.
OQ-59: requested_by edge in flowgraph
- Origin: call-graph.md What We Defer #4
- Status: open
- Priority: low
- Question: The `requested_by` edge type is a storage-layer concept for identity tracing. It's persisted in `call_graph_edges` but not modeled in `@alkdev/flowgraph`'s `CallEdgeAttrs`. May be added to flowgraph in the future.
OQ-60: Full ujsx call templates
- Origin: call-graph.md What We Defer #1
- Status: open
- Priority: low
- Question: Currently using hardcoded workflow sequences. `@alkdev/flowgraph/component` provides `Operation`, `Sequential`, `Parallel`, `Conditional`, `Map` components for declarative template definition. Will adopt when workflow complexity justifies it.
Theme 11: Inference & LLM Integration
OQ-63: What is the exact subset of UIMessageChunk types the hub proxy emits?
- Origin: ADR-018
- Status: open
- Priority: medium
- Question: ADR-018 defines an initial subset of AI SDK's UIMessageChunk protocol for the hub's SSE streaming format. The initial set covers text, reasoning, tool call lifecycle, step boundaries, and error events. As features are added (e.g., source URLs, file attachments, dynamic tools), new chunk types need to be specified. Should the hub define a formal schema for its streaming protocol, or document it informally? How do we version the protocol if chunk types change?
- Cross-references: OQ-64 (raw HTTP vs SDK streaming)
OQ-64: Should the direct agent use the openai SDK's streaming API or raw HTTP?
- Origin: ADR-018
- Status: open
- Priority: low
- Question: The direct agent path can use the `openai` SDK's typed streaming API (`client.chat.completions.create({ stream: true })`) or raw HTTP for more control over SSE parsing. The SDK provides convenience (typed responses, automatic tool call accumulation) but adds abstraction. The proxy path must use raw HTTP (Hono SSE handler). Should both paths use the same approach for consistency, or is it acceptable to use the SDK for the direct agent and raw HTTP for the proxy?
- Cross-references: OQ-63 (streaming protocol)
OQ-65: What is the buffered write strategy for part persistence?
- Origin: ADR-018
- Status: open
- Priority: medium
- Question: Streaming LLM responses produce many part updates (text deltas, state transitions, tool call results). Writing each delta as a separate database write would be extremely expensive. Options: (a) flush on `*-end` events (per-part commits: text parts committed when done, tool parts committed when complete), (b) flush on `step-finish` (per-step commits: all parts in a step committed together), (c) flush on `finish` (per-message commits: all parts committed when the agent turn is complete). Per-part (a) balances latency and write volume best for real-time SSE updates.
- Cross-references: OQ-63 (streaming protocol defines when `*-end` events fire)
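Option (a) can be illustrated with a small buffer that accumulates deltas in memory and issues one write per part when its end event arrives. This is a sketch under assumptions, not the hub's persistence layer: the `text-start`/`text-delta`/`text-end` chunk names and their fields are assumed for illustration (the hub's actual subset is the subject of OQ-63), and `PartBuffer`, `PersistedPart`, and the `write` callback are hypothetical.

```typescript
// Assumed chunk shapes for text parts; tool and reasoning parts are omitted.
type Chunk =
  | { type: "text-start"; id: string }
  | { type: "text-delta"; id: string; delta: string }
  | { type: "text-end"; id: string };

type PersistedPart = { id: string; text: string };

class PartBuffer {
  // Text accumulated so far for each part that has started but not ended.
  private readonly open = new Map<string, string>();
  private readonly write: (part: PersistedPart) => void;

  // `write` stands in for the database write; it runs once per finished part.
  constructor(write: (part: PersistedPart) => void) {
    this.write = write;
  }

  handle(chunk: Chunk): void {
    switch (chunk.type) {
      case "text-start":
        this.open.set(chunk.id, "");
        break;
      case "text-delta":
        // Deltas only touch memory; nothing is persisted yet.
        this.open.set(chunk.id, (this.open.get(chunk.id) ?? "") + chunk.delta);
        break;
      case "text-end": {
        const text = this.open.get(chunk.id) ?? "";
        this.open.delete(chunk.id);
        this.write({ id: chunk.id, text });
        break;
      }
    }
  }
}
```

Options (b) and (c) would follow the same shape with the flush moved to a later event, trading fewer writes for a larger window of unpersisted output if the process dies mid-stream.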
Cross-Cutting Dependencies
These questions block each other or share resolution paths:
- API Auth Chain: OQ-01 → OQ-02 → OQ-43 → OQ-46. The API auth model determines WebSocket auth, which determines MCP auth and spoke config format. Resolve top-down.
- Spoke Connectivity Chain: OQ-06 → OQ-10. Spoke provisioning can't work without the hub-side WebSocket handler. Resolve OQ-10 first.
- Implementation Bootstrap: OQ-38 → OQ-39 → OQ-40. Hub startup implementation needs hub config types and proper logger config. These are the minimum viable path to a running hub.
- Config Security Chain: OQ-24 → OQ-25 → OQ-50. Token env deprecation and secret reference resolution are intertwined. OQ-24 must be resolved (remove tokenEnv) before OQ-25 can be validated. After OQ-25, the key rotation background sweep (OQ-50) becomes more important because more secrets flow through `client_secrets`.
- Data Lifecycle Chain: OQ-12 → OQ-13 → OQ-14. Operation deletion strategy, call graph retention, and payload truncation interact. OQ-12 determines whether operations can be removed at all.
- Inference Chain: OQ-63 → OQ-64, OQ-65. The streaming protocol subset (OQ-63) determines what the direct agent and proxy need to produce. The SDK vs. raw HTTP choice (OQ-64) and the persistence strategy (OQ-65) depend on the protocol definition.
Summary Table
| ID | Question | Origin | Priority | Status |
|---|---|---|---|---|
| OQ-01 | API authentication model | hub-architecture | high | open |
| OQ-02 | WebSocket auth for spokes | spoke-runner | high | open |
| OQ-03 | Permission enforcement at call protocol | agent-roles | high | resolved |
| OQ-04 | Service account provisioning | agent-roles | medium | narrowed |
| OQ-05 | Git provider SSO integration | hub-architecture | low | narrowed |
| OQ-06 | Spoke project context | spoke-runner | high | open |
| OQ-07 | Source sync for external compute | spoke-runner | medium | open |
| OQ-08 | Concurrent spoke operations | spoke-runner | low | narrowed |
| OQ-09 | Spoke operation list freshness | spoke-runner | low | narrowed |
| OQ-10 | Hub-side WebSocket handler design | spoke-runner | high | open |
| OQ-11 | Dev spoke and compute spoke lifecycle | spoke-runner, hub-architecture | low | narrowed |
| OQ-12 | Operation deletion vs. call graph FK | call-graph, storage/spokes | high | open |
| OQ-13 | Call graph retention policy | storage/call-graph, storage/README | medium | open |
| OQ-14 | Call graph payload truncation config | storage/call-graph | medium | open |
| OQ-15 | Polymorphic FK for `providerId` | storage/spokes | medium | open |
| OQ-16 | Session/message schema finalization | agent-sessions, storage/sessions | medium | resolved (ADR-016) |
| OQ-17 | Session message compaction | agent-sessions, storage/README | low | resolved (ADR-016) |
| OQ-18 | Message data versioning | storage/README | medium | resolved (ADR-016) |
| OQ-19 | Part nesting | storage/sessions | low | resolved (ADR-016) |
| OQ-20 | Config reload without restart | hub-config, hub-startup | medium | open |
| OQ-21 | CI/CD config generation | hub-config | high | resolved (ADR-014) |
| OQ-22 | Multiple config file layers | hub-config | low | resolved (ADR-014) |
| OQ-23 | PostgresConfig SSL details | hub-config | low | resolved (ADR-014) |
| OQ-24 | HTTPServiceConfig.auth.tokenEnv deprecation | hub-config, operations | high | open |
| OQ-25 | Secret reference resolution ordering | hub-config | medium | open |
| OQ-26 | Role import/sync operation | agent-roles, storage/README, storage/roles | medium | resolved (ADR-017) |
| OQ-27 | Role inheritance with permission resolution | agent-roles | medium | open |
| OQ-28 | Dynamic role creation | agent-roles | low | resolved (ADR-017) |
| OQ-29 | Per-session role switching | agent-roles | medium | narrowed |
| OQ-30 | Task storage and sync implementation | storage/README | high | open |
| OQ-31 | Bulk task status updates | storage/tasks | medium | open |
| OQ-32 | Cross-project task dependencies | storage/tasks | low | open |
| OQ-33 | Task embeddings | storage/tasks | low | open |
| OQ-34 | Background vs. startup migration | hub-startup | medium | resolved (ADR-014) |
| OQ-35 | Hot spare / zero-downtime restart | hub-startup | low | resolved (ADR-014) |
| OQ-36 | Startup observability | hub-startup | low | resolved (ADR-014) |
| OQ-37 | Redis deployment topology | hub-architecture | medium | resolved (ADR-014) |
| OQ-38 | Hub startup implementation | hub-startup | high | open |
| OQ-39 | Hub-specific config in operations package | operations | high | open |
| OQ-40 | Logger configuration | operations | medium | open |
| OQ-41 | Gitea operations at startup | storage/README | medium | narrowed |
| OQ-42 | Keypal adapter testing | storage/README | medium | open |
| OQ-43 | MCP endpoint authentication detail | mcp-server | medium | open |
| OQ-44 | Reactive vs. call graph requested semantics | call-graph | medium | open |
| OQ-45 | Client config schema evolution | storage/README | medium | open |
| OQ-46 | Spoke auth field format in config | hub-config | high | open |
| OQ-47 | Config schema version | hub-config | low | open |
| OQ-48 | Cross-doc terminology migration | storage/README | low | open |
| OQ-49 | ADR-012 migration | decisions/ADR-012 | medium | open |
| OQ-50 | Key rotation background sweep | decisions/storage-spec-phase1 | high | open |
| OQ-51 | Role database-authoritative (Phase 3) | agent-roles, storage/roles | low | resolved (ADR-017) |
| OQ-52 | Memory across sessions | agent-roles | low | deferred |
| OQ-53 | Task versioning | storage/tasks | low | open |
| OQ-54 | High-contention task notes | storage/tasks | low | open |
| OQ-55 | Anthropic conversation import | storage/README | low | deferred |
| OQ-56 | ADR-013 out-of-scope items | decisions/ADR-013 | low | open |
| OQ-57 | Call graph visualization | call-graph | low | open |
| OQ-58 | Stream deduplication | call-graph | medium | open |
| OQ-59 | `requested_by` edge in flowgraph | call-graph | low | open |
| OQ-60 | Full ujsx call templates | call-graph | low | open |
| OQ-61 | Dev spoke operations | ADR-015 | medium | open |
| OQ-62 | Dev spoke distribution and config | ADR-015 | medium | open |
| OQ-63 | Hub proxy SSE chunk type subset | ADR-018 | medium | open |
| OQ-64 | Direct agent: openai SDK vs raw HTTP | ADR-018 | low | open |
| OQ-65 | Part persistence buffered write strategy | ADR-018 | medium | open |
High Priority Open Questions (Blocking)
These questions block core functionality and should be resolved first:
| ID | Question | Blocks |
|---|---|---|
| OQ-01 | API authentication model | All authenticated endpoints |
| OQ-02 | WebSocket auth for spokes | All spoke connections |
| OQ-06 | Spoke project context | Spoke provisioning |
| OQ-10 | Hub-side WebSocket handler design | All spoke functionality |
| OQ-12 | Operation deletion vs. call graph FK | Operation lifecycle |
| OQ-24 | HTTPServiceConfig.auth.tokenEnv deprecation | Security (env var leak) |
| OQ-38 | Hub startup implementation | All functionality |
| OQ-39 | Hub-specific config in operations package | Hub startup |
| OQ-46 | Spoke auth field format in config | Spoke config |
| OQ-50 | Key rotation background sweep | Production secret management |
Resolution Priority Order
Suggested order for resolving the high-priority questions, based on dependency chains:
- OQ-38 + OQ-39 — Hub startup implementation + config types (enables everything)
- OQ-01 — API auth model (unblocks OQ-02, OQ-43, OQ-46)
- OQ-02 — WebSocket auth (unblocks OQ-10, OQ-46)
- OQ-10 — Hub-side WebSocket handler (enables spokes)
- OQ-24 — tokenEnv deprecation (security fix)
- OQ-12 — Operation deletion strategy (data integrity)
- OQ-06 — Spoke project context (spoke provisioning)
- OQ-50 — Key rotation sweep (production secret management)
Questions Resolved by ADRs
| ADR | Questions Resolved | Key Decisions |
|---|---|---|
| ADR-014 | OQ-21, OQ-22, OQ-23, OQ-34, OQ-35, OQ-36, OQ-37 | Docker as primary deployment; Redis/Postgres same network; config via mounted volumes; single-container restart; migrations block startup |
| ADR-015 | OQ-16, OQ-17, OQ-26, OQ-28, OQ-51, OQ-55 | Dev spoke replaces opencode integration; hub owns session format; opencode compat is import tool; opencode-spoke is optional future |
| ADR-016 | OQ-18, OQ-19 | Hub defines own canonical schema; JSONB implicitly versioned; flat parts for v1; import is compat tool |
| ADR-017 | OQ-26, OQ-28, OQ-51 | Database-first roles; seeded by migrations; hub.createRole for custom roles; no .opencode/agents/ file sync |