Resolve 22 open questions via 4 ADRs; add dev spoke questions (OQ-61, OQ-62)

ADR-014 (Docker-first deployment): resolves OQ-21, OQ-22, OQ-23,
OQ-34, OQ-35, OQ-36, OQ-37. Docker is the primary deployment model.
Redis/Postgres in same network. Config via mounted volumes. Single-
container restart for v1. Migrations block startup.

ADR-015 (Dev spoke, not opencode): resolves OQ-16, OQ-17, OQ-26,
OQ-28, OQ-51, OQ-55. Replaces opencode integration with a compiled
dev spoke binary. Hub owns session format. Opencode compat is an
import tool, not an architectural constraint. Adds OQ-61 (dev spoke
operations) and OQ-62 (dev spoke distribution).

ADR-016 (Hub-own schema): resolves OQ-18, OQ-19. Hub defines its own
canonical message/part format. JSONB is implicitly versioned. Flat
parts for v1. Compaction is a hub concern (pruning), not opencode's.

ADR-017 (Hub-first roles): resolves OQ-26, OQ-28, OQ-51 (overlapping
with ADR-015). Hub is database-first for roles. Seeded by migrations.
No file sync needed. hub.createRole for custom roles.

Also narrowed: OQ-04 (service accounts), OQ-05 (git SSO out of scope),
OQ-08 (spoke-side concern), OQ-09 (v1: reconnect only), OQ-11 (dev
spoke replaces container spoke), OQ-29 (hub-only concern), OQ-41
(gitea ops are optional spoke concern).

Deferred: OQ-52 (memory), OQ-55 (anthropic import).

Net result: 15 resolved, 7 narrowed, 2 deferred out of 62 total.
39 remain open, down from 60 in the original tracker.
2026-05-26 05:40:54 +00:00
parent 3d7f90dec9
commit 2d7f9c11cb
5 changed files with 410 additions and 122 deletions

@@ -1,6 +1,6 @@
---
status: reviewed
last_updated: 2026-05-25
last_updated: 2026-05-26
---
# Open Questions Tracker
@@ -19,7 +19,7 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
| Theme | Questions | Focus |
|-------|-----------|-------|
| 1. Authentication & Authorization | OQ-01–OQ-05 | Auth models, permissions, SSO |
| 2. Spoke Connectivity & Lifecycle | OQ-06–OQ-11 | Spoke provisioning, WebSocket, concurrent ops |
| 2. Spoke Connectivity & Lifecycle | OQ-06–OQ-11, OQ-61–OQ-62 | Spoke provisioning, WebSocket, concurrent ops, dev spoke |
| 3. Data Integrity & Lifecycle | OQ-12–OQ-15 | Deletion, retention, truncation, FK enforcement |
| 4. Session & Schema Design | OQ-16–OQ-19 | Message schema, compaction, versioning, nesting |
| 5. Configuration & Infrastructure | OQ-20–OQ-25 | Config reload, CI/CD, SSL, tokenEnv, secret refs |
@@ -29,6 +29,15 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
| 9. Cross-Cutting Implementation Gaps | OQ-38–OQ-50 | Startup, config, logger, Gitea, keypal, auth, schemas |
| 10. Future / Low Priority | OQ-51–OQ-60 | Phase 3+, memory, versioning, visualization |
### Resolved by ADRs
| ADR | Questions Resolved |
|-----|-------------------|
| [ADR-014](../decisions/ADR-014-docker-first-deployment.md) | OQ-21, OQ-22, OQ-23, OQ-34, OQ-35, OQ-36, OQ-37 |
| [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md) | OQ-16, OQ-17, OQ-26, OQ-28, OQ-51, OQ-55 |
| [ADR-016](../decisions/ADR-016-hub-own-schema.md) | OQ-18, OQ-19 (confirmed) |
| [ADR-017](../decisions/ADR-017-hub-first-roles.md) | OQ-26, OQ-28, OQ-51 (overlaps with ADR-015) |
---
## Theme 1: Authentication & Authorization
@@ -39,7 +48,7 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Status**: open
- **Priority**: high — blocks all authenticated endpoints
- **Question**: Should the hub use API keys with the keypal pattern, a simpler token auth stopgap, or something else? This affects every authenticated endpoint in the system.
- **Cross-references**: OQ-02 (WebSocket auth), OQ-24 (MCP auth)
- **Cross-references**: OQ-02 (WebSocket auth), OQ-43 (MCP auth)
### OQ-02: How does WebSocket authentication work for spoke connections?
@@ -47,7 +56,7 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Status**: open
- **Priority**: high — blocks all spoke connections
- **Question**: Should the spoke authenticate via token in the first message after connect, token in the query string, or token in the subprotocol header? This also affects the `SpokeConfig.auth` format — the config system currently supports `tokenFile` but the actual auth protocol is undefined.
- **Cross-references**: OQ-01 (API auth model), OQ-18 (spoke config auth field), [infrastructure.md](infrastructure.md) Security section
- **Cross-references**: OQ-01 (API auth model), OQ-46 (spoke config auth field), [infrastructure.md](infrastructure.md) Security section
### OQ-03: How are permissions enforced at the call protocol layer?
@@ -56,20 +65,21 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Priority**: high
- **Resolution**: `OperationContext.identity` carries the resolved permissions from `sessions.data.scope`. The `CallHandler` evaluates `AccessControl.requiredScopes` against the session's resolved scope. The principal-agent framework ensures delegated permissions are properly intersected.
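
The scope check described in this resolution can be sketched roughly as follows. This is a minimal illustration, not the hub's implementation: the tracker names `OperationContext.identity` and `AccessControl.requiredScopes` but does not define their fields, so the `Identity` shape, the flat string scopes, and the helper names `intersectScopes` and `isAuthorized` are all assumptions.

```typescript
// Hypothetical shapes: the tracker names these types but not their fields.
type Scope = string;

interface AccessControl {
  requiredScopes: Scope[];
}

interface Identity {
  // Assumed to be a flat list copied from sessions.data.scope.
  scopes: Scope[];
}

// Delegation: an agent keeps only the scopes its principal also holds.
function intersectScopes(principal: Scope[], agent: Scope[]): Scope[] {
  const granted = new Set(principal);
  return agent.filter((scope) => granted.has(scope));
}

// A call is allowed only if every required scope is present on the identity.
function isAuthorized(identity: Identity, access: AccessControl): boolean {
  const held = new Set(identity.scopes);
  return access.requiredScopes.every((scope) => held.has(scope));
}
```

Computing the intersection once at session creation and storing the result would keep the per-call check a plain subset test, which seems to match the "resolved permissions" wording above.
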
### OQ-04: How are LLM service accounts provisioned?
### OQ-04: How are service accounts provisioned?
- **Origin**: [agent-roles.md](agent-roles.md) OQ-6
- **Status**: open
- **Status**: narrowed
- **Priority**: medium
- **Question**: LLM accounts currently require manual creation (e.g., `glm-5.1@alk.dev`). Should there be an automated provisioning flow (`hub.createAccount` operation), or is manual provisioning sufficient for v1?
- **Cross-references**: OQ-03 (permission enforcement)
- **Question**: Does the hub need a `hub.createAccount` operation for programmatic service account creation, or is manual creation (with keypal CLI) sufficient for v1? LLM-specific email conventions (e.g., `glm-5.1@alk.dev`) are deployment-specific, not core architecture. Git attribution for LLM accounts uses the account's `giteaUsername` — this is a config concern, not an auth architecture concern.
- **Narrowed by**: [ADR-017](../decisions/ADR-017-hub-first-roles.md) — LLM accounts are service accounts with specific scopes, same pattern as any other automated identity. V1: manual creation. Future: `hub.createAccount` operation.
### OQ-05: Should SSO be shared with Gitea?
### OQ-05: Should the hub integrate with git providers via SSO?
- **Origin**: [hub-architecture.md](hub-architecture.md) OQ-3
- **Status**: open
- **Priority**: medium
- **Question**: Gitea at `git.alk.dev` uses its own auth. Should `api.alk.dev` share sessions with Gitea, or maintain separate auth? This affects user experience but is not blocking for v1.
- **Status**: narrowed
- **Priority**: low
- **Question**: Originally: should `api.alk.dev` share sessions with Gitea? Narrowed: git provider integration (Gitea, GitHub, etc.) is a spoke concern via operations, not through SSO. The dev spoke exposes git operations. SSO with any specific git provider is out of scope for a generalized hub. For v1, git access is through the dev spoke's git operations, not shared auth.
- **Narrowed by**: [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md) and [ADR-017](../decisions/ADR-017-hub-first-roles.md)
---
@@ -80,8 +90,8 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Origin**: [spoke-runner.md](spoke-runner.md) OQ-1
- **Status**: open
- **Priority**: high — blocks spoke provisioning
- **Question**: Does the hub tell the spoke which git repo to clone, or does the spoke come pre-configured with a project? This is fundamental to the spoke provisioning workflow.
- **Cross-references**: OQ-07 (source sync)
- **Question**: Does the hub tell the spoke which git repo to clone, or does the spoke come pre-configured with a project? For the dev spoke specifically: the spoke binary connects to the hub, receives project/workspace context via `hub.register`, and clones/checks out the relevant repo. The exact protocol needs specification.
- **Cross-references**: OQ-07 (source sync), OQ-61 (dev spoke operations)
### OQ-07: How does source sync work for external compute?
@@ -94,17 +104,17 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-08: Can a spoke handle concurrent operations?
- **Origin**: [spoke-runner.md](spoke-runner.md) OQ-5
- **Status**: open
- **Priority**: medium
- **Question**: Can a spoke handle multiple `call.requested` events concurrently? Concurrent processing is better for SUBSCRIPTION operations but introduces complexity in state management on the spoke side.
- **Status**: narrowed
- **Priority**: medium → low
- **Question**: Originally: can a spoke handle multiple `call.requested` events concurrently? Narrowed: the hub doesn't decide this — the spoke does. The hub dispatches `call.requested` and the spoke processes it. A spoke can process concurrently (multiple handlers) or serially (queue). The only hub-side question is whether the hub should respect a spoke-advertised concurrency limit in its `hub.register` payload (default: 1). This is a minor spoke registration enhancement, not an architectural question.
- **Cross-references**: OQ-09 (operation list freshness)
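
If the hub does end up respecting a spoke-advertised limit, the hub-side bookkeeping could look something like the sketch below. The `concurrencyLimit` field name, the `SpokeDispatcher` class, and the `send` callback are hypothetical; only the default of 1 comes from the text above.

```typescript
interface RegisterPayload {
  operations: string[];
  // Hypothetical field name; the tracker only says the limit defaults to 1.
  concurrencyLimit?: number;
}

class SpokeDispatcher {
  private readonly limit: number;
  private readonly send: (callId: string) => void;
  private inFlight = 0;
  private readonly queue: string[] = [];

  constructor(payload: RegisterPayload, send: (callId: string) => void) {
    this.limit = Math.max(1, payload.concurrencyLimit ?? 1);
    this.send = send;
  }

  // Forward the call immediately if the spoke has capacity, otherwise queue it.
  dispatch(callId: string): void {
    if (this.inFlight < this.limit) {
      this.inFlight += 1;
      this.send(callId);
    } else {
      this.queue.push(callId);
    }
  }

  // Called when the spoke reports a call as finished; releases one slot.
  complete(): void {
    if (this.inFlight === 0) return;
    this.inFlight -= 1;
    const next = this.queue.shift();
    if (next !== undefined) {
      this.inFlight += 1;
      this.send(next);
    }
  }
}
```

A spoke that processes calls concurrently would advertise a higher limit; a serial spoke would omit the field and get the queueing behaviour for free.
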
### OQ-09: When does a spoke re-register its operations?
- **Origin**: [spoke-runner.md](spoke-runner.md) OQ-6
- **Status**: open
- **Priority**: medium
- **Question**: Does the spoke re-register on reconnect only, or does it push updates when its local registry changes? This affects the hub's operation routing and the RunnerPool design.
- **Status**: narrowed
- **Priority**: low
- **Resolution**: For v1, re-register on reconnect only. Spokes disconnect and reconnect (the call protocol handles abort cascading for in-flight calls). Push-based registry updates are a v2 enhancement. The `hub.register` call on reconnect is sufficient for v1.
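
One way to read "re-register on reconnect only" is that each `hub.register` call replaces the spoke's previous operation list wholesale instead of merging into it. A minimal sketch under that assumption follows; `OperationRegistry` and its method names are illustrative and not taken from the hub codebase.

```typescript
class OperationRegistry {
  private readonly bySpoke = new Map<string, Set<string>>();

  // Called for every hub.register: replaces any earlier list for this spoke.
  register(spokeId: string, operations: string[]): void {
    this.bySpoke.set(spokeId, new Set(operations));
  }

  // Called on disconnect: the spoke's operations stop being routable.
  disconnect(spokeId: string): void {
    this.bySpoke.delete(spokeId);
  }

  // Returns the ids of connected spokes that currently offer the operation.
  spokesFor(operation: string): string[] {
    const result: string[] = [];
    for (const [spokeId, operations] of this.bySpoke) {
      if (operations.has(operation)) result.push(spokeId);
    }
    return result;
  }
}
```

Replacing the list on each registration means a stale operation disappears as soon as the spoke reconnects, without any push-based update protocol.
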
### OQ-10: What is the design for the hub-side WebSocket handler?
@@ -114,13 +124,29 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Question**: What is the full design for the hub-side WebSocket handler? This includes: Hono WebSocket upgrade handler, per-connection `WebSocketEventTarget`, per-connection `PendingRequestMap`, spoke lifecycle management (connect/register/heartbeat/disconnect), identity/authentication integration, and reconnection state recovery. Currently described as "an architectural task that needs deeper design" with no spec.
- **Cross-references**: OQ-02 (WebSocket auth), OQ-06 (spoke project context — constrains handler message types), OQ-08 (concurrent operations)
### OQ-11: Container spoke lifecycle
### OQ-11: Dev spoke and compute spoke lifecycle
- **Origin**: [spoke-runner.md](spoke-runner.md) OQ-2, [hub-architecture.md](hub-architecture.md) Components table
- **Status**: open
- **Status**: narrowed
- **Priority**: low
- **Question**: Container spoke extends base spoke with Docker container lifecycle management and opencode integration. Design is deferred until base spoke is working.
- **Cross-references**: OQ-06 (project context), OQ-07 (source sync)
- **Question**: Originally: "Container spoke extends base spoke with Docker container lifecycle management and opencode integration." Narrowed by [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md): the dev spoke is a compiled Deno binary (not an opencode container) that exposes dev operations over the standard call protocol. Compute spokes (GPU, vast.ai) are separate spoke types. The container spoke concept is deferred — the dev spoke replaces it for v1.
- **Cross-references**: OQ-06 (project context), OQ-61 (dev spoke operations)
### OQ-61: What operations does the dev spoke expose?
- **Origin**: [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md)
- **Status**: open
- **Priority**: medium
- **Question**: The dev spoke replaces opencode's tool suite. What operations does it expose? Minimum: `dev.bash.exec`, `dev.fs.read`, `dev.fs.write`, `dev.fs.list`, `dev.git.status`, `dev.git.diff`, `dev.git.commit`, `dev.git.clone`, `dev.git.checkout`. Web search may be a hub-native operation or a separate spoke. The exact operation set and their input/output schemas need specification.
- **Cross-references**: OQ-06 (project context), OQ-11 (dev spoke lifecycle)
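
As a starting point for the specification work, the minimum operation set listed above could be captured as a typed constant. The operation names come from the question text; the `isDevOperation` guard and the `FsReadInput`/`FsReadOutput` shapes are assumptions for illustration, since the real schemas are still unspecified.

```typescript
// Minimum operation names as listed in OQ-61.
const MIN_DEV_OPERATIONS = [
  "dev.bash.exec",
  "dev.fs.read",
  "dev.fs.write",
  "dev.fs.list",
  "dev.git.status",
  "dev.git.diff",
  "dev.git.commit",
  "dev.git.clone",
  "dev.git.checkout",
] as const;

type DevOperationName = (typeof MIN_DEV_OPERATIONS)[number];

// Type guard: narrows an arbitrary string to a known dev operation name.
function isDevOperation(name: string): name is DevOperationName {
  return (MIN_DEV_OPERATIONS as readonly string[]).includes(name);
}

// Hypothetical input/output shapes for a single operation.
interface FsReadInput {
  path: string;
}

interface FsReadOutput {
  content: string;
}
```

Keeping the names in one `as const` array gives both a runtime list for `hub.register` and a compile-time union for handler maps.
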
### OQ-62: How is the dev spoke distributed and configured?
- **Origin**: [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md)
- **Status**: open
- **Priority**: medium
- **Question**: The dev spoke is a compiled Deno binary. How is it distributed (Docker image, binary download, package manager)? How is it configured (hub URL, auth token, project context)? Does it use `SpokeConfig` from hub-config.md or a separate config format? The spoke doesn't have Postgres or Redis — just a WebSocket connection to the hub and local tools.
- **Cross-references**: OQ-06 (project context), OQ-46 (spoke config auth field)
---
@@ -156,7 +182,6 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Status**: open
- **Priority**: medium
- **Question**: `providerId` in `spokes` table references different parent tables depending on `spokeType` (either `dev_env_spokes` or `compute_spokes`). Current approach is application-layer enforcement. Alternatives (two nullable FK columns, DB triggers) are deferred.
- **Cross-references**: None currently
---
@@ -165,34 +190,31 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-16: Session/message schema finalization
- **Origin**: [agent-sessions.md](agent-sessions.md) Schema Research Needed section, [storage/sessions.md](storage/sessions.md)
- **Status**: open
- **Priority**: high — blocks session storage implementation
- **Question**: The message/part schema needs more iteration. Opencode's drizzle+sqlite schema uses a message tree format with parent/child parts that needs reconciliation with AI SDK `UIMessage` part types. Which exact subset of opencode's part types does the hub use? How do we handle the session `data` column shapes (formally type-constrained or application-layer guidance)?
- **Status**: resolved by [ADR-016](../decisions/ADR-016-hub-own-schema.md)
- **Priority**: high → medium (unblocked, narrower scope)
- **Resolution**: The hub defines its own canonical message/part format based on AI SDK `UIMessage` + parts. Opencode's format is an import concern, not an architectural constraint. The hub's format stays close to opencode's for import compatibility but is self-determined. The remaining design work is specifying the hub's exact part types and session data shapes — this is an implementation task, not an open architectural question.
- **Cross-references**: OQ-17 (compaction), OQ-19 (part nesting)
### OQ-17: Session message compaction
- **Origin**: [agent-sessions.md](agent-sessions.md), [storage/README.md](storage/README.md) OQ-2
- **Status**: open
- **Priority**: medium
- **Question**: Need to define what compaction means for hub-direct AI SDK sessions. Opencode has a `compaction` agent/part type. The hub needs a strategy for long-running sessions that accumulate many messages.
- **Cross-references**: OQ-16 (schema finalization)
- **Status**: resolved by [ADR-016](../decisions/ADR-016-hub-own-schema.md)
- **Priority**: medium → low
- **Resolution**: Compaction is an opencode concept (LLM-driven summarization). The hub may need message **pruning** (server-side truncation of old messages for API response size), but this is different from compaction. For v1, full message history is served. Pruning is a potential future optimization, not a current design concern.
### OQ-18: Message data versioning
- **Origin**: [storage/README.md](storage/README.md) OQ-1
- **Status**: open
- **Status**: resolved by [ADR-016](../decisions/ADR-016-hub-own-schema.md)
- **Priority**: medium
- **Question**: Should the `data` column format be versioned for forward compatibility? Opencode has a `version` column on sessions. If the data shape evolves, old records need to be readable.
- **Cross-references**: OQ-45 (client config schema evolution), OQ-16 (schema finalization)
- **Resolution**: The hub's `data` JSONB columns are implicitly versioned by their TypeBox schema evolution. No separate `version` column is needed. Each schema change is documented in migration history, and the TypeBox schemas in the codebase are the source of truth. Opencode's `version` column was an opencode concern, not a hub pattern.
### OQ-19: Part nesting
- **Origin**: [storage/sessions.md](storage/sessions.md)
- **Status**: open
- **Status**: resolved by [ADR-016](../decisions/ADR-016-hub-own-schema.md)
- **Priority**: low
- **Question**: Currently flat parts with `messageId` FK. If nesting becomes necessary (e.g., tool results containing sub-parts), it would require a `parentId` column on `parts`. Not needed for v1 but should be considered in schema design.
- **Cross-references**: OQ-16 (schema finalization)
- **Resolution**: Flat parts with `messageId` FK for v1. If nesting becomes necessary for the hub's own use cases (e.g., tool results containing sub-parts), a `parentId` column can be added. No need to carry opencode's tree structure.
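
The flat layout can be illustrated with a small sketch. The field names other than `messageId` and `parentId` are assumptions, and `groupPartsByMessage` is a hypothetical helper showing how parts would be reassembled per message without any tree traversal.

```typescript
interface Part {
  id: string;
  messageId: string; // FK to the owning message
  type: string; // assumed discriminator, e.g. "text" or "tool-result"
  parentId?: string; // unused in v1; would only be added if nesting is needed
}

// Groups a flat list of parts by message, preserving input order per message.
function groupPartsByMessage(parts: Part[]): Map<string, Part[]> {
  const grouped = new Map<string, Part[]>();
  for (const part of parts) {
    const bucket = grouped.get(part.messageId) ?? [];
    bucket.push(part);
    grouped.set(part.messageId, bucket);
  }
  return grouped;
}
```
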
---
@@ -209,25 +231,23 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-21: Config file generation for CI/CD
- **Origin**: [hub-config.md](hub-config.md) OQ-2
- **Status**: open
- **Priority**: high — blocks deployment
- **Question**: The `alkhub-config` CLI requires the master key to encrypt values. How does CI/CD get the master key? Options: (a) CI has access to the master key secret, (b) config files are pre-encrypted and stored in a private repo, (c) encryption happens at deploy time on the host.
- **Cross-references**: OQ-20 (config reload)
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: high → resolved
- **Resolution**: In the Docker deployment model, config files are pre-encrypted by the operator (using `alkhub-config`) and mounted at runtime. The Docker secret provides the master key. CI/CD doesn't need the master key — it doesn't encrypt. Config files are built into the Docker image or mounted as volumes. The `alkhub-config` CLI runs on the operator's machine, not in CI/CD.
### OQ-22: Multiple config file layers
- **Origin**: [hub-config.md](hub-config.md) OQ-4
- **Status**: open
- **Priority**: low
- **Question**: Should the config loader support a base config + overlay pattern (e.g., `/etc/alkhub/config.json` + `/etc/alkhub/config.local.json`)? Useful for dev vs. prod.
- **Cross-references**: OQ-20 (config reload)
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: low → resolved
- **Resolution**: Docker Compose handles environment variation via different config files mounted at different paths. Dev: local decrypted config. Prod: pre-encrypted config mounted as read-only volume. No overlay system needed — the Docker model makes this straightforward with volume mounts.
### OQ-23: What are the production SSL/TLS requirements for PostgresConfig?
- **Origin**: [hub-config.md](hub-config.md) PostgresConfig section
- **Status**: open
- **Priority**: medium
- **Question**: `PostgresConfig.ssl` is currently `Type.Optional(Type.Boolean())` — "true = enable SSL with default CA verification". For production, TLS to Postgres is essential. What detailed SSL config is needed (CA certs, client certs, verify modes, custom CA)? Should we use a `PostgresSslConfig` object or a connection string-based approach?
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: medium → low
- **Resolution**: In the Docker deployment model, Postgres and the hub run in the same Docker network. TLS between containers in the same network is not required — Docker network policies handle isolation. TLS termination happens at the reverse proxy (nginx/caddy) for external traffic. `PostgresConfig.ssl: boolean` is sufficient for v1. If a future deployment topology puts Postgres on a different network, a `PostgresSslConfig` object can be added, but same-network Docker deployment doesn't need it.
### OQ-24: HTTPServiceConfig.auth.tokenEnv deprecation
@@ -250,10 +270,9 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-26: Role import/sync operation
- **Origin**: [agent-roles.md](agent-roles.md) OQ-1, [storage/README.md](storage/README.md) OQ-9 (partial), [storage/roles.md](storage/roles.md)
- **Status**: open
- **Priority**: medium
- **Question**: Should there be a `roles.sync` operation that reads `.opencode/agents/*.md` and syncs them to the `roles` table? Phase 2 of the role transition plan. Files are the authoring surface; database is the source of truth at runtime.
- **Cross-references**: OQ-27 (role inheritance resolution), OQ-28 (dynamic roles)
- **Status**: resolved by [ADR-017](../decisions/ADR-017-hub-first-roles.md)
- **Priority**: medium → resolved
- **Resolution**: The hub is database-first for roles from day one. No `roles.sync` from `.opencode/agents/*.md` is needed. Role definitions are seeded by migrations for built-in roles. Custom roles are created via `hub.createRole`. Opencode's `.opencode/agents/` file format is an opencode concern, not a hub concern.
### OQ-27: Role inheritance with permission resolution
@@ -261,22 +280,21 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Status**: open
- **Priority**: medium
- **Question**: When a role has a `parentId`, its permissions are unioned with the parent's, with the child's rules taking priority in case of conflict. Max depth: 3 levels. Circular inheritance is prevented at role creation time. The description exists but the implementation is not yet specified.
- **Cross-references**: OQ-26 (role sync)
- **Cross-references**: OQ-26 (role sync — resolved, but inheritance still needs implementation)
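
Since the implementation is not yet specified, the following is only a sketch of how the described rules might fit together. It assumes permissions are a map from rule key to `allow`/`deny`, and it reads "max depth: 3 levels" as "at most three roles in the chain"; both are assumptions, as is the `resolvePermissions` name. The cycle check here runs at resolution time as a safety net, in addition to the creation-time prevention the question describes.

```typescript
type Effect = "allow" | "deny";

interface Role {
  id: string;
  parentId?: string;
  permissions: Record<string, Effect>; // hypothetical rule representation
}

const MAX_DEPTH = 3; // assumed to mean: role, parent, grandparent

function resolvePermissions(roleId: string, roles: Map<string, Role>): Record<string, Effect> {
  const chain: Role[] = [];
  const seen = new Set<string>();
  let current: string | undefined = roleId;

  // Walk from the role up to its root ancestor.
  while (current !== undefined) {
    if (seen.has(current)) throw new Error(`circular inheritance at ${current}`);
    seen.add(current);
    const role: Role | undefined = roles.get(current);
    if (role === undefined) throw new Error(`unknown role ${current}`);
    chain.push(role);
    if (chain.length > MAX_DEPTH) throw new Error(`inheritance deeper than ${MAX_DEPTH} levels`);
    current = role.parentId;
  }

  // Apply the root first so that child rules overwrite parent rules on conflict.
  const resolved: Record<string, Effect> = {};
  for (const role of [...chain].reverse()) {
    Object.assign(resolved, role.permissions);
  }
  return resolved;
}
```
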
### OQ-28: Dynamic role creation
- **Origin**: [agent-roles.md](agent-roles.md) OQ-3
- **Status**: open
- **Priority**: low
- **Question**: Opencode supports `Agent.generate()` for on-the-fly role creation. The hub currently only supports predefined roles. Should dynamic role creation be supported? Decision: start with predefined, add later if needed.
- **Cross-references**: OQ-26 (role sync)
- **Status**: resolved by [ADR-017](../decisions/ADR-017-hub-first-roles.md)
- **Priority**: low → resolved
- **Resolution**: The hub supports `hub.createRole` for programmatic role creation. Opencode's `Agent.generate()` pattern (on-the-fly LLM-driven role creation) is not a hub concern. Roles are DB records, created via hub operations.
### OQ-29: Per-session role switching
- **Origin**: [agent-roles.md](agent-roles.md) OQ-4
- **Status**: open
- **Status**: narrowed
- **Priority**: medium
- **Question**: Should a session be able to change roles mid-conversation? Opencode supports this. Our current model binds role at session creation. Decision: support `session.updateRole` operation, but this requires re-evaluating and storing new resolved permissions in `sessions.data.scope`.
- **Question**: Originally: should a session be able to change roles mid-conversation, like opencode supports? Narrowed by [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md): this is a hub-only concern. The hub binds role at session creation. `session.updateRole` is a potential operation if needed, but v1 roles are bound at creation. No opencode agent model to reconcile.
---
@@ -288,7 +306,7 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Status**: open
- **Priority**: high
- **Question**: The database is the source of truth for tasks; markdown files are the authoring surface. The sync operation (files → database) exists conceptually but is not yet implemented. This blocks the SDD workflow from using database-backed task tracking.
- **Cross-references**: OQ-26 (role sync — similar pattern)
- **Cross-references**: OQ-26 (role sync — resolved, similar pattern)
### OQ-31: Bulk task status updates
@@ -318,31 +336,30 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-34: Background migration vs. startup migration
- **Origin**: [hub-startup.md](hub-startup.md) OQ-1
- **Status**: open
- **Priority**: medium
- **Question**: Should migrations block startup, or should they run in the background while the hub serves with the old schema? Recommendation: block for now (simpler, safer). Revisit if startup latency becomes a problem with large migrations.
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: medium → resolved
- **Resolution**: Single-container deployment means migrations must complete before the hub serves. Background migration requires schema version negotiation that adds complexity without benefit. Migrations block startup. Docker's restart policy handles failures.
### OQ-35: Hot spare / zero-downtime restart
- **Origin**: [hub-startup.md](hub-startup.md) OQ-3
- **Status**: open
- **Priority**: low
- **Question**: For production deployments, can we start a new hub process before shutting down the old one? Requires connection draining and session transfer. Deferred — hub is single-instance for now.
- **Cross-references**: [infrastructure.md](infrastructure.md) single-instance model
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: low → resolved
- **Resolution**: v1 is single-container deployment with Docker restart policy. Zero-downtime restart requires connection draining and session transfer, which is Phase 2. For v1, Docker restart with health checks is sufficient.
### OQ-36: Startup observability
- **Origin**: [hub-startup.md](hub-startup.md) OQ-4
- **Status**: open
- **Priority**: low
- **Question**: Should the startup sequence emit pub/sub events so monitoring systems can track progress, or is the `/health` endpoint plus structured logs sufficient? Recommendation: `/health` for now.
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: low → resolved
- **Resolution**: The `/health` endpoint with step-level progress and structured JSON logging to stdout is sufficient. Docker's `HEALTHCHECK` directive and log aggregation handle the rest. No pub/sub startup events needed.
### OQ-37: Redis deployment topology
- **Origin**: [hub-architecture.md](hub-architecture.md) OQ-1
- **Status**: open
- **Priority**: medium
- **Question**: Redis is deployed on the hub server. For production with many spokes on a compute server, may want Redis closer to containers for lower pub/sub latency. Current approach works for v1 but may need topology changes at scale.
- **Status**: resolved by [ADR-014](../decisions/ADR-014-docker-first-deployment.md)
- **Priority**: medium → resolved
- **Resolution**: Redis runs in the same Docker network as the hub. Spokes connect via WebSocket, not Redis. Redis is hub-internal only. Latency between hub and Redis is negligible within the same Docker network. If a future topology needs Redis closer to remote spokes, that would be a spoke-level concern (a spoke-side Redis), not the hub's Redis.
---
@@ -374,9 +391,9 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-41: Gitea operations at startup
- **Origin**: [storage/README.md](storage/README.md) OQ-7
- **Status**: open
- **Status**: narrowed
- **Priority**: medium
- **Question**: Load Gitea swagger spec at startup and register ~300 operations via FromOpenAPI. This wires the hub to the Gitea API for repository operations but is not yet implemented.
- **Question**: Originally: load Gitea swagger spec at startup and register ~300 operations via FromOpenAPI. Narrowed by [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md): Gitea integration is no longer a core hub dependency. Git operations come from the dev spoke (or a separate Gitea spoke via FromOpenAPI). Loading Gitea's OpenAPI spec at startup is optional — a future spoke can provide it. For v1, git operations are exposed through the dev spoke, not hub-native Gitea operations.
### OQ-42: Keypal adapter testing
@@ -391,6 +408,7 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Status**: open
- **Priority**: medium
- **Question**: "The MCP endpoint uses bearer token auth. Each runner gets a token at registration." No detail on token format, rotation, issuance, or how tokens are validated. This connects to OQ-01 (API auth model) and OQ-02 (WebSocket auth).
- **Cross-references**: OQ-01 (API auth model), OQ-02 (WebSocket auth)
### OQ-44: Reactive vs. call graph `requested` semantics
@@ -411,8 +429,8 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
- **Origin**: [hub-config.md](hub-config.md) OQ-3
- **Status**: open
- **Priority**: high
- **Question**: The `SpokeConfig.auth` field format is blocked on the spoke-runner WebSocket auth design (OQ-02). Config system supports `tokenFile` but actual protocol is TBD.
- **Cross-references**: OQ-02 (WebSocket auth)
- **Question**: The `SpokeConfig.auth` field format is blocked on the spoke-runner WebSocket auth design (OQ-02). Config system supports `tokenFile` but actual protocol is TBD. The dev spoke (ADR-015) will use `tokenFile` to read its auth token from a Docker secret or mounted file.
- **Cross-references**: OQ-02 (WebSocket auth), OQ-62 (dev spoke distribution)
### OQ-47: Config schema version
@@ -447,19 +465,19 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
## Theme 10: Future / Low Priority
### OQ-51: Role database-authoritative phase (Phase 3)
### OQ-51: Role database-authoritative (Phase 3)
- **Origin**: [agent-roles.md](agent-roles.md) Phase 3, [storage/roles.md](storage/roles.md)
- **Status**: open
- **Priority**: low
- **Question**: Eventually, role definitions should be primarily in the database with files only for version control. This is Phase 3 and not blocking for v1.
- **Status**: resolved by [ADR-017](../decisions/ADR-017-hub-first-roles.md)
- **Priority**: low → resolved
- **Resolution**: The hub is database-first from day one. There is no Phase 1 (file-based) or Phase 2 (file sync). Roles are defined in the `roles` table from the start, seeded by migrations. Markdown files are not part of the hub's role system.
### OQ-52: Memory across sessions
- **Origin**: [agent-roles.md](agent-roles.md) OQ-7
- **Status**: open
- **Status**: deferred
- **Priority**: low
- **Question**: Should LLM accounts have persistent memory across sessions? This is separate from session message history. Could be a `memories` table or vector store. Deferred — separate feature.
- **Question**: Should LLM accounts have persistent memory across sessions? This is separate from session message history. Could be a `memories` table or vector store. Deferred — separate feature with no current requirement.
### OQ-53: Task versioning
@@ -478,9 +496,9 @@ Cross-cutting compilation of all unresolved questions across the hub architectur
### OQ-55: Anthropic conversation import
- **Origin**: [storage/README.md](storage/README.md) OQ-6
- **Status**: open
- **Status**: deferred by [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md)
- **Priority**: low
- **Question**: Import script for Anthropic conversations is deferred. Export format is documented.
- **Question**: Import script for Anthropic conversations. This is a nice-to-have research tool for re-importing past conversations, not a core feature. The format has likely changed since it was last relevant. Not shipped with the codebase.
### OQ-56: ADR-013 out-of-scope items
@@ -527,13 +545,13 @@ These questions block each other or share resolution paths:
2. **Spoke Connectivity Chain**: OQ-06 → OQ-10 — Spoke provisioning can't work without the hub-side WebSocket handler. Resolve OQ-10 first.
3. **Session Schema Chain**: OQ-16 → OQ-17 → OQ-18 → OQ-19 – Schema finalization blocks compaction and versioning design. Resolve OQ-16 first.
3. **Implementation Bootstrap**: OQ-38 → OQ-39 → OQ-40 – Hub startup implementation needs hub config types and proper logger config. These are the minimum viable path to a running hub.
4. **Implementation Bootstrap**: OQ-38 → OQ-39 → OQ-40 — Hub startup implementation needs hub config types and proper logger config. These are the minimum viable path to a running hub.
4. **Config Security Chain**: OQ-24 → OQ-25 → OQ-50 — Token env deprecation and secret reference resolution are intertwined. OQ-24 must be resolved (remove tokenEnv) before OQ-25 can be validated. After OQ-25, the key rotation background sweep (OQ-50) becomes more important because more secrets flow through `client_secrets`.
5. **Config Security Chain**: OQ-24 → OQ-25 → OQ-50 — Token env deprecation and secret reference resolution are intertwined. OQ-24 must be resolved (remove tokenEnv) before OQ-25 can be validated. After OQ-25, the key rotation background sweep (OQ-50) becomes more important because more secrets flow through `client_secrets`.
5. **Data Lifecycle Chain**: OQ-12 → OQ-13 → OQ-14 — Operation deletion strategy, call graph retention, and payload truncation interact. OQ-12 determines whether operations can be removed at all.
6. **Data Lifecycle Chain**: OQ-12 → OQ-13 → OQ-14 — Operation deletion strategy, call graph retention, and payload truncation interact. OQ-12 determines whether operations can be removed at all.
6. **Dev Spoke Chain**: OQ-61 → OQ-62 → OQ-06 — Dev spoke operations and distribution need specification before spoke provisioning can be fully designed. OQ-11 is narrowed by ADR-015 but not resolved.
---
@@ -543,45 +561,45 @@ These questions block each other or share resolution paths:
|----|----------|--------|----------|--------|
| OQ-01 | API authentication model | hub-architecture | high | open |
| OQ-02 | WebSocket auth for spokes | spoke-runner | high | open |
| OQ-03 | Permission enforcement at call protocol | agent-roles | high | resolved |
| OQ-04 | LLM account provisioning | agent-roles | medium | open |
| OQ-05 | SSO with Gitea | hub-architecture | medium | open |
| OQ-03 | Permission enforcement at call protocol | agent-roles | high | **resolved** |
| OQ-04 | Service account provisioning | agent-roles | medium | **narrowed** |
| OQ-05 | Git provider SSO integration | hub-architecture | low | **narrowed** |
| OQ-06 | Spoke project context | spoke-runner | high | open |
| OQ-07 | Source sync for external compute | spoke-runner | medium | open |
| OQ-08 | Concurrent spoke operations | spoke-runner | medium | open |
| OQ-09 | Spoke operation list freshness | spoke-runner | medium | open |
| OQ-08 | Concurrent spoke operations | spoke-runner | low | **narrowed** |
| OQ-09 | Spoke operation list freshness | spoke-runner | low | **narrowed** |
| OQ-10 | Hub-side WebSocket handler design | spoke-runner | high | open |
| OQ-11 | Container spoke lifecycle | spoke-runner, hub-architecture | low | open |
| OQ-11 | Dev spoke and compute spoke lifecycle | spoke-runner, hub-architecture | low | **narrowed** |
| OQ-12 | Operation deletion vs. call graph FK | call-graph, storage/spokes | high | open |
| OQ-13 | Call graph retention policy | storage/call-graph, storage/README | medium | open |
| OQ-14 | Call graph payload truncation config | storage/call-graph | medium | open |
| OQ-15 | Polymorphic FK for `providerId` | storage/spokes | medium | open |
| OQ-16 | Session/message schema finalization | agent-sessions, storage/sessions | high | open |
| OQ-17 | Session message compaction | agent-sessions, storage/README | medium | open |
| OQ-18 | Message data versioning | storage/README | medium | open |
| OQ-19 | Part nesting | storage/sessions | low | open |
| OQ-16 | Session/message schema finalization | agent-sessions, storage/sessions | medium | **resolved (ADR-016)** |
| OQ-17 | Session message compaction | agent-sessions, storage/README | low | **resolved (ADR-016)** |
| OQ-18 | Message data versioning | storage/README | medium | **resolved (ADR-016)** |
| OQ-19 | Part nesting | storage/sessions | low | **resolved (ADR-016)** |
| OQ-20 | Config reload without restart | hub-config, hub-startup | medium | open |
| OQ-21 | CI/CD config generation | hub-config | high | open |
| OQ-22 | Multiple config file layers | hub-config | low | open |
| OQ-23 | PostgresConfig SSL details | hub-config | medium | open |
| OQ-21 | CI/CD config generation | hub-config | high | **resolved (ADR-014)** |
| OQ-22 | Multiple config file layers | hub-config | low | **resolved (ADR-014)** |
| OQ-23 | PostgresConfig SSL details | hub-config | low | **resolved (ADR-014)** |
| OQ-24 | HTTPServiceConfig.auth.tokenEnv deprecation | hub-config, operations | high | open |
| OQ-25 | Secret reference resolution ordering | hub-config | medium | open |
| OQ-26 | Role import/sync operation | agent-roles, storage/README, storage/roles | medium | open |
| OQ-26 | Role import/sync operation | agent-roles, storage/README, storage/roles | medium | **resolved (ADR-017)** |
| OQ-27 | Role inheritance with permission resolution | agent-roles | medium | open |
| OQ-28 | Dynamic role creation | agent-roles | low | open |
| OQ-29 | Per-session role switching | agent-roles | medium | open |
| OQ-28 | Dynamic role creation | agent-roles | low | **resolved (ADR-017)** |
| OQ-29 | Per-session role switching | agent-roles | medium | **narrowed** |
| OQ-30 | Task storage and sync implementation | storage/README | high | open |
| OQ-31 | Bulk task status updates | storage/tasks | medium | open |
| OQ-32 | Cross-project task dependencies | storage/tasks | low | open |
| OQ-33 | Task embeddings | storage/tasks | low | open |
| OQ-34 | Background vs. startup migration | hub-startup | medium | open |
| OQ-35 | Hot spare / zero-downtime restart | hub-startup | low | open |
| OQ-36 | Startup observability | hub-startup | low | open |
| OQ-37 | Redis deployment topology | hub-architecture | medium | open |
| OQ-34 | Background vs. startup migration | hub-startup | medium | **resolved (ADR-014)** |
| OQ-35 | Hot spare / zero-downtime restart | hub-startup | low | **resolved (ADR-014)** |
| OQ-36 | Startup observability | hub-startup | low | **resolved (ADR-014)** |
| OQ-37 | Redis deployment topology | hub-architecture | medium | **resolved (ADR-014)** |
| OQ-38 | Hub startup implementation | hub-startup | high | open |
| OQ-39 | Hub-specific config in operations package | operations | high | open |
| OQ-40 | Logger configuration | operations | medium | open |
| OQ-41 | Gitea operations at startup | storage/README | medium | open |
| OQ-41 | Gitea operations at startup | storage/README | medium | **narrowed** |
| OQ-42 | Keypal adapter testing | storage/README | medium | open |
| OQ-43 | MCP endpoint authentication detail | mcp-server | medium | open |
| OQ-44 | Reactive vs. call graph requested semantics | call-graph | medium | open |
@@ -591,20 +609,22 @@ These questions block each other or share resolution paths:
| OQ-48 | Cross-doc terminology migration | storage/README | low | open |
| OQ-49 | ADR-012 migration | decisions/ADR-012 | medium | open |
| OQ-50 | Key rotation background sweep | decisions/storage-spec-phase1 | high | open |
| OQ-51 | Role database-authoritative (Phase 3) | agent-roles, storage/roles | low | open |
| OQ-52 | Memory across sessions | agent-roles | low | open |
| OQ-51 | Role database-authoritative (Phase 3) | agent-roles, storage/roles | low | **resolved (ADR-017)** |
| OQ-52 | Memory across sessions | agent-roles | low | **deferred** |
| OQ-53 | Task versioning | storage/tasks | low | open |
| OQ-54 | High-contention task notes | storage/tasks | low | open |
| OQ-55 | Anthropic conversation import | storage/README | low | open |
| OQ-55 | Anthropic conversation import | storage/README | low | **deferred** |
| OQ-56 | ADR-013 out-of-scope items | decisions/ADR-013 | low | open |
| OQ-57 | Call graph visualization | call-graph | low | open |
| OQ-58 | Stream deduplication | call-graph | medium | open |
| OQ-59 | `requested_by` edge in flowgraph | call-graph | low | open |
| OQ-60 | Full ujsx call templates | call-graph | low | open |
| OQ-61 | Dev spoke operations | ADR-015 | medium | open |
| OQ-62 | Dev spoke distribution and config | ADR-015 | medium | open |
### High Priority Open Questions (Blocking)
These 11 questions block core functionality and should be resolved first:
These questions block core functionality and should be resolved first:
| ID | Question | Blocks |
|----|----------|--------|
@@ -613,8 +633,6 @@ These 11 questions block core functionality and should be resolved first:
| OQ-06 | Spoke project context | Spoke provisioning |
| OQ-10 | Hub-side WebSocket handler design | All spoke functionality |
| OQ-12 | Operation deletion vs. call graph FK | Operation lifecycle |
| OQ-16 | Session/message schema finalization | Session storage implementation |
| OQ-21 | CI/CD config generation | Deployment |
| OQ-24 | HTTPServiceConfig.auth.tokenEnv deprecation | Security (env var leak) |
| OQ-38 | Hub startup implementation | All functionality |
| OQ-39 | Hub-specific config in operations package | Hub startup |
@@ -630,8 +648,15 @@ Suggested order for resolving the high-priority questions, based on dependency c
3. **OQ-02** — WebSocket auth (unblocks OQ-10, OQ-46)
4. **OQ-10** — Hub-side WebSocket handler (enables spokes)
5. **OQ-24** — tokenEnv deprecation (security fix)
6. **OQ-16** — Session/message schema (enables storage)
7. **OQ-12** — Operation deletion strategy (data integrity)
8. **OQ-21** — CI/CD config generation (deployment)
9. **OQ-06** — Spoke project context (spoke provisioning)
10. **OQ-50** — Key rotation sweep (production secret management)
6. **OQ-12** — Operation deletion strategy (data integrity)
7. **OQ-06** — Spoke project context (spoke provisioning)
8. **OQ-50** — Key rotation sweep (production secret management)
### Questions Resolved by ADRs
| ADR | Questions Resolved | Key Decisions |
|-----|-------------------|---------------|
| [ADR-014](../decisions/ADR-014-docker-first-deployment.md) | OQ-21, OQ-22, OQ-23, OQ-34, OQ-35, OQ-36, OQ-37 | Docker as primary deployment; Redis/Postgres same network; config via mounted volumes; single-container restart; migrations block startup |
| [ADR-015](../decisions/ADR-015-dev-spoke-not-opencode.md) | OQ-16, OQ-17, OQ-26, OQ-28, OQ-51, OQ-55 | Dev spoke replaces opencode integration; hub owns session format; opencode compat is import tool; opencode-spoke is optional future |
| [ADR-016](../decisions/ADR-016-hub-own-schema.md) | OQ-18, OQ-19 | Hub defines own canonical schema; JSONB implicitly versioned; flat parts for v1; import is compat tool |
| [ADR-017](../decisions/ADR-017-hub-first-roles.md) | OQ-26, OQ-28, OQ-51 | Database-first roles; seeded by migrations; `hub.createRole` for custom roles; no `.opencode/agents/` file sync |