hub/docs/architecture/agent-roles.md
glm-5.1 2b63cda1c7 Setup repo: migrate architecture specs, code stubs, and tasks from alkhub_ts
Copy architecture docs, ADRs, storage domain specs, research, reviews,
and 56 storage architecture tasks from the alkhub_ts monorepo. Adapt for
standalone @alkdev/hub repo structure (src/ not packages/hub/).

Sanitize all sensitive information:
- Replace private IPs (10.0.0.1) with localhost defaults
- Remove internal server hostnames (dev1, ns528096)
- Replace /workspace/ private paths with npm package references
- Remove hardcoded credentials from examples
- Rewrite infrastructure.md without private network details

Add Deno project scaffolding: deno.json (pinned deps), .gitignore,
AGENTS.md, entry point. Migrate existing code stubs (crypto, config
types, logger) with updated import paths.
2026-05-25 10:56:32 +00:00

---
status: draft
last_updated: 2026-04-20
---
# Agent Roles & Identity
How the hub models agents, roles, accounts, and the permissions that flow between them.
## Overview
Three distinct concepts that are often conflated:
1. **Account** — An identity in the system (human, service, or LLM). Accounts own resources, authenticate, and bear liability. Stored in `accounts` table.
2. **Role** — A behavioral specification that any account can fill. Roles define what operations are available, what permissions are granted, and what scope constraints apply. Roles are defined declaratively (currently as `.opencode/agents/*.md` files; eventually as database records). An account fills a role for the duration of a session.
3. **Session** — A unit of work where an account fills a role. Sessions bind an account to a role for their lifetime. The `sessions.roleName` column tracks which role is active.
**Key insight**: An LLM doesn't need its own account to be an "agent" — it needs an account because it needs an identity that owns its sessions, API keys, and audit trail. A human can fill the same "implementer" role that an LLM fills. The role defines behavior; the account provides identity and accountability.
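The three concepts can be sketched as TypeScript types. This is a minimal illustration rather than the hub's actual schema: `roleName` comes from the text above, while `AccountRecord`, `RoleSpec`, `SessionRecord`, the `kind` field, and the `startSession` helper are hypothetical names chosen for the example.

```typescript
// Hypothetical shapes for illustration; only `roleName` is taken from the spec text.
type AccountKind = "human" | "service" | "llm";

interface AccountRecord {
  id: string;
  kind: AccountKind; // assumption: the spec only says accounts can be human, service, or LLM
}

interface RoleSpec {
  name: string; // e.g. "implementation-specialist"
}

interface SessionRecord {
  id: string;
  accountId: string; // the identity that owns the session
  roleName: string; // the behavioral spec that is active
}

// A session binds one account to one role for its lifetime, so the result is frozen.
function startSession(
  id: string,
  account: AccountRecord,
  role: RoleSpec,
): Readonly<SessionRecord> {
  return Object.freeze({ id, accountId: account.id, roleName: role.name });
}

// The same role can be filled by a human account or an LLM account.
const humanAccount: AccountRecord = { id: "acct-1", kind: "human" };
const llmAccount: AccountRecord = { id: "acct-2", kind: "llm" };
const implementerRole: RoleSpec = { name: "implementation-specialist" };

const humanSession = startSession("sess-1", humanAccount, implementerRole);
const llmSession = startSession("sess-2", llmAccount, implementerRole);
```

Both sessions carry the same `roleName` but different `accountId` values, which is the separation of behavior from identity described above.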
## Terminology Decision
We use **"role"** for the behavioral specification and **"account"** for the identity, intentionally avoiding "agent" as a primary term. See [ADR-012](../decisions/ADR-012-agent-vs-role-vs-account.md) for the full rationale.
| We say | We don't say | Why |
|--------|--------------|-----|
| **role** | agent (behavioral sense) | A role is something you fill, not something you are |
| **account** | agent (identity sense) | An account is an identity that can be human, service, or LLM |
| **session** | agent run | A session is where account + role intersect |
| **spoke** | runner | Legacy rename, see spoke-runner.md |
When referencing OpenCode's data model (for import compatibility), we map their `agent` field to our `roleName` field. The OpenCode concept of "agent" maps to our "role" — it's a behavioral spec, not an identity.
## Account-Role Relationship
```
┌────────────┐            fills            ┌────────────┐       in        ┌────────────┐
│  Account   │ ──────────────────────────→ │    Role    │ ──────────────→ │  Session   │
│ (identity) │                             │ (behavior) │                 │   (work)   │
└────────────┘                             └────────────┘                 └────────────┘
      │                                          │                              │
      │ owns sessions, API keys,                 │ defines perms,               │ binds account
      │ audit trail, resources                   │ scoping, tools               │ to role for duration
      │                                          │                              │
      │ can be: human, service, LLM              │ can be filled by             │ has: project, workspace,
      │                                          │ any capable account          │ parent (if spawned)
```
An account can fill different roles at different times — a human might coordinate and an LLM might implement, or vice versa. The role constrains what operations are available; the account provides identity and ownership.
### Why LLMs Need Accounts
LLMs (like agents working in this codebase) need their own accounts because:
- **Audit trail**: Every session, every operation call, every API key usage needs to be attributable to an identity
- **Resource ownership**: Sessions and their messages belong to an account. API keys are owned by accounts.
- **Principal-agent liability**: If a coordinator spawns an implementation specialist and it makes a mistake, the coordinator's account is responsible for the delegation. The implementer's account is responsible for the execution. This is the same principal-agent framework that applies to human delegation.
- **Access control**: API key scopes and operation permissions are evaluated against the account's identity and the session's role.
- **Gitea integration**: Commit attribution goes to the account's `giteaUsername`. The `glm-5.1@alk.dev` git user is an account, just like any human developer.
### Service Accounts for LLMs
LLM accounts use `accessLevel: "service"` in the `accounts` table. This is the same `service` level used for spoke credentials and CI tokens. It indicates an automated identity, which may or may not have a linked Gitea user (an LLM account that commits code gets one for attribution). The distinction between a "spoke credential" service account and an "LLM worker" service account is in the API key scopes and the roles they fill in sessions, not in the account type itself.
```
Account (service, giteaUsername: null)
├── API Key 1 (scope: ["session:create", "coord:*"])
│ → Used to fill coordinator role
├── API Key 2 (scope: ["session:create", "dev:*"])
│ → Used to fill implementation-specialist role
└── Audit trail: all actions attributable to this identity
```
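How a wildcard scope such as `coord:*` might be checked against a required scope can be sketched as follows. The matching rule (a trailing `:*` grants every scope in that namespace) and the scope name `coord:spawn` are assumptions made for this example; the spec does not define the scope grammar.

```typescript
// Assumption: scopes look like "namespace:action", and "namespace:*" grants the whole namespace.
function scopeMatches(granted: string, required: string): boolean {
  if (granted === required) return true;
  // "coord:*" becomes the prefix "coord:" and therefore matches "coord:spawn".
  return granted.endsWith(":*") && required.startsWith(granted.slice(0, -1));
}

// A key allows an operation when any of its scopes matches the required scope.
function keyAllows(keyScopes: string[], required: string): boolean {
  return keyScopes.some((granted) => scopeMatches(granted, required));
}

// The two keys from the tree above.
const coordinatorKey = ["session:create", "coord:*"];
const implementerKey = ["session:create", "dev:*"];

const coordinatorCanSpawn = keyAllows(coordinatorKey, "coord:spawn"); // hypothetical scope name
const implementerCanSpawn = keyAllows(implementerKey, "coord:spawn");
```

Under these assumptions the same account can hold both keys, and which role it can fill depends on which key it presents.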
## Role Definitions
### Current State: File-Based
Roles are currently defined in `.opencode/agents/*.md` as markdown files with YAML frontmatter. This is the OpenCode convention and works for the current stopgap workflow:
```
.opencode/agents/
├── architect.md # Creates architecture specs
├── architecture-reviewer.md # Reviews architecture for ambiguities
├── code-reviewer.md # Reviews code quality
├── coordinator.md # Orchestrates parallel execution
├── decomposer.md # Breaks architecture into task graph
├── implementation-specialist.md # Executes atomic tasks
├── poc-specialist.md # Creates proof-of-concepts
└── research-specialist.md # Researches and documents findings
```
Each file contains:
- `description`: What the role does
- `mode`: `"primary"` (user-facing) or `"subagent"` (spawned by coordinator)
- `temperature`: Model temperature
- Body: Behavioral specification, tools, constraints
### Transition: File-Based → Database
Following the same pattern as `taskgraph` (which moved from file-based to database), roles should eventually become database records. The transition plan:
1. **Phase 1 (current)**: Role definitions are markdown files. The hub reads them when creating sessions or when the OpenCode convention requires them.
2. **Phase 2 (near future)**: A `roles` table in Postgres stores role definitions. Markdown files remain the authoring surface (like tasks). An ingestion operation syncs `.opencode/agents/*.md` → `roles` table.
3. **Phase 3 (eventual)**: Role definitions are primarily in the database. The files exist only for version control and offline editing. The hub's role management UI/API replaces file editing for common cases.
### Role Schema
A role definition includes:
| Field | Type | Description |
|-------|------|-------------|
| name | text NOT NULL UNIQUE | Role identifier (e.g., "architect", "implementation-specialist") |
| description | text | Human-readable description |
| mode | text NOT NULL | `"primary"` or `"subagent"` |
| temperature | real | Model sampling temperature |
| permissions | jsonb | Permission ruleset (what operations this role can access) |
| tools | jsonb | Tool availability map (which tools are enabled/disabled) |
| prompt | text | System prompt template |
| parentId | text | FK → `roles.id` — Parent role (for role specialization) |
| scopes | text[] | API key scopes this role requires |
| data | jsonb | Additional role-specific configuration |
The `permissions` field uses the same format as OpenCode's `Permission.Ruleset` — an array of `{ action, permission, pattern }` rules evaluated first-match:
```json
[
  { "action": "allow", "permission": "read", "pattern": "src/**" },
  { "action": "allow", "permission": "bash", "pattern": "deno *" },
  { "action": "deny", "permission": "bash", "pattern": "*" },
  { "action": "allow", "permission": "webSearch", "pattern": "*" }
]
```
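First-match evaluation over a ruleset like the one above can be sketched as below. This is an illustration, not OpenCode's or the hub's evaluator: the glob handling (`*` matches any run of characters, including `/`) and the deny-by-default fallback when no rule matches are both assumptions.

```typescript
type RuleAction = "allow" | "deny";

interface PermissionRule {
  action: RuleAction;
  permission: string;
  pattern: string;
}

// Assumption: "*" matches any run of characters, so "**" behaves the same as "*".
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*+/g, ".*") + "$");
}

// Returns the action of the first matching rule; denies when nothing matches (assumed default).
function evaluateRules(
  rules: PermissionRule[],
  permission: string,
  target: string,
): RuleAction {
  for (const rule of rules) {
    if (rule.permission === permission && globToRegExp(rule.pattern).test(target)) {
      return rule.action;
    }
  }
  return "deny";
}

const exampleRules: PermissionRule[] = [
  { action: "allow", permission: "read", pattern: "src/**" },
  { action: "allow", permission: "bash", pattern: "deno *" },
  { action: "deny", permission: "bash", pattern: "*" },
  { action: "allow", permission: "webSearch", pattern: "*" },
];

const denoTest = evaluateRules(exampleRules, "bash", "deno test"); // matches "deno *" first
const removeAll = evaluateRules(exampleRules, "bash", "rm -rf /"); // falls through to the bash deny rule
```

Because the first match wins, the specific `deno *` allow rule has to precede the catch-all bash deny rule; swapping them would deny every bash command.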
The `tools` field maps tool names to boolean (enabled/disabled):
```json
{
  "read": true,
  "write": true,
  "edit": true,
  "glob": true,
  "grep": true,
  "bash": true,
  "webSearch": true,
  "webfetch": true
}
```
**Important**: The `permissions` and `tools` fields here define what the role *requests*. The actual capabilities available to a session also depend on the account's API key scopes and the spoke type's trust level (see Permission Resolution below).
### Predefined Roles
These roles correspond to the SDD process roles defined in `docs/sdd_process.md`:
| Role | Mode | Key Permissions | Key Constraints |
|------|------|----------------|----------------|
| `architect` | primary | read, write, webSearch | No bash, no implementation |
| `architecture-reviewer` | subagent | read, grep | Read-only access |
| `code-reviewer` | subagent | read, grep, bash (read-only) | Read-only access, can run tests |
| `coordinator` | primary | worktree_*, read, bash (limited) | No implementation, orchestrates only |
| `decomposer` | primary | read, taskgraph | No bash, no implementation |
| `implementation-specialist` | primary | read, write, edit, bash, webSearch | Scoped to worktree |
| `poc-specialist` | primary | read, write, edit, bash, webSearch | Scoped to research worktree |
| `research-specialist` | subagent | webSearch, read, write | No bash, no edit |
## Permission Resolution
Permissions are resolved at session creation time by intersecting three sources:
```
Effective permissions = Role.requested ∩ Account.allowed ∩ SpokeType.capable
```
Each source provides a different constraint:
1. **Role.requested**: Which operations and tools the role *wants* to use (defined in `roles.permissions` and `roles.tools`)
2. **Account.allowed**: What the account's API key *permits* (from `api_keys.metadata.scopes` and `api_keys.metadata.resources`)
3. **SpokeType.capable**: What the execution environment *physically supports* (from spoke type trust level)
The intersection is computed per-tool and per-permission:
```ts
// Pseudocode for permission resolution at session creation.
// ALL_TOOLS, TRUST_LEVELS, ResolvedPermissions, and the is*() helpers are assumed to be defined elsewhere.
function resolvePermissions(role, account, spokeType): ResolvedPermissions {
  // role.tools is a name → boolean map, so collect only the enabled tool names
  const requested = new Set(
    Object.keys(role.tools).filter((tool) => role.tools[tool]),
  )                                        // e.g., ["read", "write", "bash", "webSearch"]
  const allowed = new Set(account.scopes)  // e.g., ["session:create", "dev:*"]
  const capable = TRUST_LEVELS[spokeType]  // e.g., { bash: "worktree", write: "worktree", read: true }

  // Tool availability: role wants it AND account allows it AND spoke can do it
  const effectiveTools: Record<string, boolean> = {}
  for (const tool of ALL_TOOLS) {
    if (!requested.has(tool)) continue           // Role doesn't request it
    if (!isToolAllowed(tool, allowed)) continue  // Account key doesn't permit it
    if (!isToolCapable(tool, capable)) continue  // Spoke type can't do it
    effectiveTools[tool] = true
  }

  // Permission ruleset: role defines it, account scopes filter it
  const effectivePermissions = role.permissions.filter((rule) => isActionAllowed(rule, allowed))

  return { tools: effectiveTools, permissions: effectivePermissions }
}
```
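The pseudocode leaves its helpers undefined. One possible runnable reading of the tool intersection is sketched below. The `TOOL_SCOPE` mapping from tool names to API key scopes, the scope names in it, and the capability maps for each spoke are all hypothetical; the spec does not say how scopes such as `dev:*` relate to individual tools.

```typescript
type ToolMap = Record<string, boolean>;
type SpokeCapabilities = Record<string, boolean | "worktree">;

// Hypothetical: which API key scope unlocks each tool. Not defined by the spec.
const TOOL_SCOPE = new Map<string, string>([
  ["read", "dev:read"],
  ["write", "dev:write"],
  ["bash", "dev:bash"],
  ["webSearch", "research:search"],
]);

// Assumption: "ns:*" grants every scope in namespace "ns".
function scopeGrants(granted: string, required: string): boolean {
  if (granted === required) return true;
  return granted.endsWith(":*") && required.startsWith(granted.slice(0, -1));
}

// A tool is effective only if the role requests it, the key allows it, and the spoke supports it.
function resolveTools(
  roleTools: ToolMap,
  accountScopes: string[],
  capable: SpokeCapabilities,
): ToolMap {
  const effective: ToolMap = {};
  for (const [tool, requested] of Object.entries(roleTools)) {
    const requiredScope = TOOL_SCOPE.get(tool);
    const allowed = requiredScope !== undefined &&
      accountScopes.some((granted) => scopeGrants(granted, requiredScope));
    const supported = Boolean(capable[tool]); // true and "worktree" both count as capable
    effective[tool] = requested && allowed && supported;
  }
  return effective;
}

const requestedTools: ToolMap = { read: true, write: true, bash: true, webSearch: true };
const devScopes = ["session:create", "dev:*"];
const devEnvSpoke: SpokeCapabilities = { read: true, write: "worktree", bash: "worktree", webSearch: true };
const researchSpoke: SpokeCapabilities = { read: true, webSearch: true };

const onDevEnv = resolveTools(requestedTools, devScopes, devEnvSpoke);
const onResearch = resolveTools(requestedTools, devScopes, researchSpoke);
```

With these inputs, `webSearch` is dropped on the dev spoke because the key lacks a matching scope, and `write` and `bash` are dropped on the research spoke because the spoke cannot perform them. Unlike the pseudocode, this variant records rejected tools as `false` instead of omitting them.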
**Resolved scope storage**: The result of permission resolution is stored in `sessions.data.scope`:
```ts
// sessions.data.scope shape (computed at session creation)
{
  tools: Record<string, boolean>,   // e.g., { read: true, write: true, bash: false }
  permissions: PermissionRuleSet,   // filtered role permissions
  resolvedAt: string,               // ISO timestamp of resolution
  resolutionInputs: {               // For audit/debugging
    roleId: string,
    accountScopes: string[],
    spokeType: string
  }
}
```
**Mutability**: The resolved scope is computed once at session creation. If account scopes change mid-session, the session retains its original scope. Role changes require creating a new session. This is a deliberate design choice — changing permissions mid-session creates audit confusion and risks inconsistent behavior.
**Re-evaluation**: Operations that spawn new sessions (e.g., `coord.spawn`) create a new session with fresh permission resolution for the target role and the spawning account's scopes.
### Trust Levels by Spoke Type
| Spoke Type | Trust Level | Bash | Network | Filesystem |
|------------|-------------|------|---------|------------|
| Hub-direct | Highest | Within hub process (no host access) | Hub's network | Read-only code access |
| Dev env | High | Scoped to worktree | Outbound allowed | Scoped to worktree |
| Client | Medium | None | Client-initiated only | None |
| Research | Low | None | WebSearch only | Read-only specific dirs |
| GPU compute | Minimal | None | None | None (data pushed to it) |
This matches the instruction firewall research finding: agents that process external data (research, web content) should have minimal capabilities. A compromised research agent has limited blast radius because it can't execute commands, modify the filesystem, or access internal APIs.
**Enforcement mechanism**: Trust levels are assigned at spoke registration time. When a spoke calls `hub.register`, it declares its `spokeType`. The hub validates that the registered operations match the declared trust level — a "research" spoke cannot register `bash.exec` or `fs.write` operations. The trust level is stored in `spokes.data.trustLevel` and used at permission resolution time. Trust levels cannot be escalated by the spoke itself; they are set by the hub based on the spoke type and confirmed at registration. See [spoke-runner.md](./spoke-runner.md) for the registration flow.
See [../research/instruction-firewall.md](../research/instruction-firewall.md) for the full security analysis.
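The registration check described in the enforcement paragraph could look roughly like this. It is a sketch under assumptions: the operation names other than `bash.exec` and `fs.write`, the prefix-based allow list, and the `validateRegistration` function are hypothetical, and only two spoke types are modeled.

```typescript
type SpokeKind = "dev-env" | "research";

// Hypothetical allow list of operation prefixes per spoke type, loosely based on the table above.
const ALLOWED_OPERATION_PREFIXES: Record<SpokeKind, string[]> = {
  "dev-env": ["bash.", "fs."],
  "research": ["web.search", "fs.read"],
};

interface RegistrationResult {
  ok: boolean;
  rejected: string[]; // operations that exceed the declared trust level
}

// Rejects any operation that does not start with a prefix allowed for the declared spoke type.
function validateRegistration(spokeType: SpokeKind, operations: string[]): RegistrationResult {
  const prefixes = ALLOWED_OPERATION_PREFIXES[spokeType];
  const rejected = operations.filter(
    (operation) => !prefixes.some((prefix) => operation.startsWith(prefix)),
  );
  return { ok: rejected.length === 0, rejected };
}

const researchAttempt = validateRegistration("research", ["web.search", "bash.exec", "fs.write"]);
const devAttempt = validateRegistration("dev-env", ["bash.exec", "fs.write"]);
```

Here the research spoke's registration fails because it declares `bash.exec` and `fs.write`, while the same operations pass for a dev environment spoke.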
## OpenCode Compatibility
### Session Import
When importing OpenCode sessions, their `agent` field maps to our `roleName`:
| OpenCode `agent` | Our `roleName` | Notes |
|------------------|----------------|-------|
| `"build"` | `"implementation-specialist"` | Primary dev role |
| `"plan"` | `"decomposer"` | Planning role |
| `"general"` | `"coordinator"` | General-purpose subagent |
| `"explore"` | `"research-specialist"` | Codebase exploration |
| `"compaction"` | (system) | Context compaction — not a user-visible role |
| `"title"` | (system) | Title generation — not a user-visible role |
| `"summary"` | (system) | Summary generation — not a user-visible role |
Custom roles from `.opencode/agents/*.md` map by name.
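The import mapping in the table can be expressed as a small lookup. The function name `mapAgentToRoleName` and the choice to return `null` for system agents are assumptions for this sketch; the table itself only says those agents are not user-visible roles.

```typescript
// Built-in OpenCode agent names and the hub role each one maps to (from the table above).
const BUILTIN_AGENT_TO_ROLE = new Map<string, string>([
  ["build", "implementation-specialist"],
  ["plan", "decomposer"],
  ["general", "coordinator"],
  ["explore", "research-specialist"],
]);

// System agents have no user-visible role.
const SYSTEM_AGENTS = new Set<string>(["compaction", "title", "summary"]);

// Returns the hub roleName for an OpenCode agent, or null for system agents (assumed convention).
function mapAgentToRoleName(agent: string): string | null {
  if (SYSTEM_AGENTS.has(agent)) return null;
  // Custom roles from .opencode/agents/*.md map by name.
  return BUILTIN_AGENT_TO_ROLE.get(agent) ?? agent;
}

const importedRoles = ["build", "title", "architect"].map(mapAgentToRoleName);
```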
### Database Schema Mapping
OpenCode stores the agent name in message data (`$.agent` on both user and assistant messages). We store it on the session (`sessions.roleName`) and optionally in message data (`messages.data.agent`). The session-level `roleName` is authoritative; the message-level `agent` is for compatibility.
OpenCode's `Agent.Info` zod schema includes:
- `name`: maps to our `roleName`
- `mode`: maps directly (primary ↔ primary, subagent ↔ subagent)
- `permission`: maps to our role's `permissions` field
- `model`: model selection per-role
- `temperature`, `topP`: per-role model parameters
- `steps`: max agentic steps per turn
These all have natural mappings to our role definition fields.
### Notable Differences
1. **OpenCode has no roles table**: agent definitions are either built in (hardcoded) or file-based. We're adding a `roles` table for database-managed role definitions.
2. **OpenCode's `Agent.generate()`** — OpenCode can dynamically create agent configs via LLM. We don't support dynamic role creation (yet); roles must be predefined.
3. **OpenCode's `SubtaskPart`** — OpenCode has a `subtask` part type for delegation to subagents. Our `agent` part type serves a similar purpose but with different semantics (see sessions.md).
4. **OpenCode's `permission` field on messages** — OpenCode stores per-message permission overrides (`$.permission` on user message data). We handle this via role-level permissions, not per-message. This is a deliberate simplification — per-message permission overrides create complexity and attack surface.
## Relationship to Existing Tables
### accounts (identity.md)
The `accounts` table needs minor refinements for the LLM-as-account model:
| Current | Change | Rationale |
|---------|--------|-----------|
| `accessLevel: "service"` for automated accounts | Keep as `accessLevel: "service"` | The `service` access level covers non-human automation |
| `giteaUsername` nullable | Keep nullable — LLM accounts may or may not have Gitea users | The `glm-5.1@alk.dev` pattern: LLM accounts get a Gitea user for commit attribution |
| `email` required | Keep, but allow fallback emails | LLM accounts use `@alk.dev` fallback email addresses |
No new columns needed. The existing `accounts` table already supports the LLM-as-account pattern through the `service` access level and nullable `giteaUsername`.
### sessions (sessions.md)
The `agentName` column should be renamed to `roleName` for clarity. It's already nullable and text, so the migration is:
```sql
ALTER TABLE sessions RENAME COLUMN agent_name TO role_name;
```
Or if we want to avoid migration churn during active development, we can add a `roleName` field to the `data` JSONB column and deprecate `agentName` in the documentation, changing it in the next schema migration.
The `sessions.data` field adds:
- `model`: Which model the role is configured to use (from role definition or override)
- `scope`: Effective resolved scope for this session (from permission resolution)
### messages (sessions.md)
The `messages.data` field's `agent` key (in both user and assistant message data shapes) should be documented as a role reference, not an account reference. No schema change needed — it's already a text field.
## The Principal-Agent Framework
### What It Means
In legal theory, a principal delegates authority to an agent. The principal is responsible for the agent's actions within the scope of delegation. This maps directly:
| Legal Concept | Hub Concept | Example |
|---------------|------------|---------|
| Principal | Coordinator account/role | Coordinator orchestrates, is accountable |
| Agent | Implementer account/role | Implementer executes, coordinator is responsible for delegation |
| Scope of authority | Role permissions + account scopes | Coordinator can only delegate within its own authority |
| Respondeat superior | Audit trail | "The coordinator (principal) told the implementer (agent) to do X" |
### How It Applies
When a coordinator account spawns an implementation session:
1. The coordinator's account creates the session (audit: "account X created session Y")
2. The session is bound to the implementation-specialist role (permissions: worktree-scoped bash, write, read)
3. The spawned session's `parentId` points to the coordinator's session
4. If the implementer fails, it's the coordinator's responsibility to handle (Safe Exit protocol)
5. The coordinator delegated, so the coordinator bears responsibility for the outcome
The same pattern applies when a human fills the coordinator role — the human is still the principal. The accountability flows through the account, regardless of whether the principal is human or LLM.
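The delegation steps above can be sketched as a spawn helper plus a walk up the `parentId` chain. The shapes and names here (`DelegatedSession`, `AuditEntry`, `spawnSession`, `delegationChain`, the `session.create` action string) are hypothetical, and the sketch assumes the spawned session is owned by a separate implementer account while the audit entry names the coordinator's account as the creator.

```typescript
interface DelegatedSession {
  id: string;
  accountId: string;
  roleName: string;
  parentId: string | null;
}

interface AuditEntry {
  actorAccountId: string;
  action: string;
  sessionId: string;
}

// Creates a child session for the worker account and records the parent's account as the creator.
function spawnSession(
  parent: DelegatedSession,
  workerAccountId: string,
  roleName: string,
  newId: string,
  audit: AuditEntry[],
): DelegatedSession {
  const child: DelegatedSession = { id: newId, accountId: workerAccountId, roleName, parentId: parent.id };
  audit.push({ actorAccountId: parent.accountId, action: "session.create", sessionId: child.id });
  return child;
}

// Walks parentId links from a session up to the root principal, collecting account ids.
function delegationChain(session: DelegatedSession, byId: Map<string, DelegatedSession>): string[] {
  const chain: string[] = [];
  let current: DelegatedSession | undefined = session;
  while (current !== undefined) {
    chain.push(current.accountId);
    current = current.parentId !== null ? byId.get(current.parentId) : undefined;
  }
  return chain;
}

const auditLog: AuditEntry[] = [];
const coordinatorSession: DelegatedSession = {
  id: "sess-coord",
  accountId: "acct-coordinator",
  roleName: "coordinator",
  parentId: null,
};
const implementerSession = spawnSession(
  coordinatorSession,
  "acct-implementer",
  "implementation-specialist",
  "sess-impl",
  auditLog,
);
const sessionsById = new Map<string, DelegatedSession>([
  [coordinatorSession.id, coordinatorSession],
  [implementerSession.id, implementerSession],
]);
const accountableAccounts = delegationChain(implementerSession, sessionsById);
```

The chain lists the executing account first and the delegating principal last, which is the order in which responsibility for execution and for delegation is read off the audit trail.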
### Memory Across Sessions
The principal-agent framework still holds when you consider memory across sessions:
- An LLM with a memory layer is still acting as an agent of the account that authorized it
- The memory doesn't change the authority relationship — it changes the capability
- If an LLM with memory makes a mistake, the account that authorized that session is still responsible
This is why accounts matter even with memory: accountability doesn't disappear just because the agent remembers past sessions.
## Role Definitions as Living Specifications
Role definitions (both file-based and database-stored) include:
1. **Behavioral specification** — What the role does, how it should behave, constraints
2. **Permission specification** — What operations the role can access
3. **Model parameters** — Temperature, model selection, max steps
4. **Tool selection** — Which tools are available/not available
5. **Scope constraints** — Worktree-scoped, project-scoped, or global
Currently these are all in the markdown files. As we move to database storage, the behavioral spec stays in markdown (for human readability and git version control) while the permission/param/tool specifications move to structured columns.
### Role Inheritance
Roles can specialize from a parent:
```
base-implementer
├── implementation-specialist (adds: webSearch, worktree scoping)
└── poc-specialist (adds: bash, research worktree scoping)
```
The `parentId` column on `roles` enables this. When evaluating permissions, the role's permissions are unioned with the parent's, with the child's rules taking precedence on conflict. This avoids duplicating common permission sets.
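One way to merge a role with its ancestors is sketched below. It assumes child rules are placed before parent rules so they win under first-match evaluation, that child tool entries override the parent's, and that chains are limited to 3 levels with cycles rejected, as noted under Open Questions. The `InheritableRole` shape, the `parentName` field standing in for `parentId`, and the example rules are hypothetical.

```typescript
interface RuleEntry {
  action: "allow" | "deny";
  permission: string;
  pattern: string;
}

interface InheritableRole {
  name: string;
  parentName: string | null; // stands in for the parentId foreign key
  permissions: RuleEntry[];
  tools: Record<string, boolean>;
}

const MAX_INHERITANCE_DEPTH = 3;

function resolveInherited(
  roleName: string,
  roles: Map<string, InheritableRole>,
): { permissions: RuleEntry[]; tools: Record<string, boolean> } {
  const chain: InheritableRole[] = []; // child first, root ancestor last
  const seen = new Set<string>();
  let currentName: string | null = roleName;
  while (currentName !== null) {
    if (seen.has(currentName)) throw new Error(`circular inheritance at ${currentName}`);
    const role: InheritableRole | undefined = roles.get(currentName);
    if (role === undefined) throw new Error(`unknown role ${currentName}`);
    seen.add(currentName);
    chain.push(role);
    if (chain.length > MAX_INHERITANCE_DEPTH) throw new Error("inheritance deeper than 3 levels");
    currentName = role.parentName;
  }
  // Child rules come first so they win under first-match evaluation.
  const permissions = chain.flatMap((role) => role.permissions);
  // Apply root tools first, then let each descendant override.
  const tools: Record<string, boolean> = Object.assign({}, ...[...chain].reverse().map((role) => role.tools));
  return { permissions, tools };
}

const roleTable = new Map<string, InheritableRole>([
  ["base-implementer", {
    name: "base-implementer",
    parentName: null,
    permissions: [{ action: "allow", permission: "read", pattern: "src/**" }],
    tools: { read: true, write: true },
  }],
  ["poc-specialist", {
    name: "poc-specialist",
    parentName: "base-implementer",
    permissions: [{ action: "allow", permission: "bash", pattern: "deno *" }],
    tools: { bash: true },
  }],
]);

const resolvedPoc = resolveInherited("poc-specialist", roleTable);
```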
## Open Questions
1. **Role import/export**: Should we have a `roles.sync` operation that reads `.opencode/agents/*.md` and syncs them to the `roles` table? This would work like `taskgraph ingest` for tasks. **Leaning yes** — Phase 2 of the transition plan involves exactly this. Files are the authoring surface; database is the source of truth at runtime. The sync operation is one-way (files → database), idempotent, and run at hub startup and on demand.
2. **Permission enforcement point**: Where exactly in the call protocol do we enforce resolved permissions? The `CallHandler` checks `AccessControl` against `Identity` — should `Identity` include the role's resolved permissions? **Resolution**: Yes — `OperationContext.identity` should carry the resolved permissions from `sessions.data.scope`. The `CallHandler` evaluates `AccessControl.requiredScopes` against the session's resolved scope.
3. **Dynamic role creation**: OpenCode supports `Agent.generate()` for on-the-fly role creation. Should the hub support this, or should roles always be predefined? Decision: start with predefined, add dynamic creation later if needed.
4. **Per-session role override**: Should a session be able to change roles mid-conversation? OpenCode supports this (user selects a different agent). Our current model binds the role at session creation, and the Mutability rule under Permission Resolution requires a new session for a role change. Decision: support role switching via a `session.updateRole` operation, which relaxes that rule; each switch must re-evaluate permissions and store the new resolution in `sessions.data.scope`.
5. **Spoke trust level enforcement**: Resolved — see the "Enforcement mechanism" paragraph in the Trust Levels section above. Trust levels are set at registration and validated by the hub.
6. **LLM account provisioning**: How are LLM accounts created and managed? Currently manual (`glm-5.1@alk.dev` was created by hand). Should there be an automated provisioning flow? Decision: start manual, add `hub.createAccount` operation later.
7. **Memory across sessions**: Should LLM accounts have persistent memory that carries across sessions? This is separate from the session message history (which is already stored). Memory could be a `memories` table or a vector store attached to accounts. Decision: deferred — see the opencode-memory research for import compatibility, but persistent memory is a separate feature.
8. **Role inheritance**: How does role inheritance work with the permission resolution model? When a role has a `parentId`, its permissions are unioned with the parent's, with the child's rules taking priority in case of conflict (first-match wins across the merged list). The `tools` field is also unioned. The `temperature`, `model`, and `prompt` fields are inherited but can be overridden. Max depth: 3 levels. Circular inheritance is prevented at role creation time.
## References
- Identity table schemas: [storage/identity.md](storage/identity.md)
- Session/message/part schemas: [storage/sessions.md](storage/sessions.md)
- Spoke design and trust levels: [spoke-runner.md](spoke-runner.md)
- SDD process and role definitions: [../sdd_process.md](../sdd_process.md)
- Agent sessions architecture: [agent-sessions.md](agent-sessions.md)
- OpenCode memory skill reference: [../research/opencode-session-access.md](../research/opencode-session-access.md)
- Instruction firewall research: [../research/instruction-firewall.md](../research/instruction-firewall.md)
- Cost-benefit framework: TaskGraph categorical estimates (`framework.md` in taskgraph docs)
- OpenCode agent types: opencode `agent.ts` (Agent.Info, Agent.Service, built-in agents)
- OpenCode permission system: opencode `permission/index.ts` (Permission.Ruleset, evaluate, merge)