docs(architecture): add ADR-023, resolve OQ-24 — operation error schemas

ADR-023 adds error_schemas to OperationSpec so operations can declare
their domain-level failure modes (FILE_NOT_FOUND, RATE_LIMITED, etc.)
distinct from protocol-level codes (NOT_FOUND, FORBIDDEN, etc.). The
call.error payload gains an optional 'details' field carrying the typed
error payload conforming to the declared schema. from_openapi/to_openapi
map OpenAPI response status codes to/from ErrorDefinitions, making the
adapter contract from ADR-017 faithful on the error axis.

Also fixes W2 (KeyVersionMismatch stale comment in encryption.md —
ADR-021 implements rotation without this variant) and W4
(derive_encryption_key_for_version missing from service.md method list).

Spec updates: operation-registry.md (OperationSpec, ErrorDefinition,
Handler error mapping, services/schema), call-protocol.md (call.error
payload, CallError, ResponseEnvelope), README.md, overview.md,
open-questions.md (OQ-24), call/README.md, encryption.md, service.md.
This commit is contained in:
2026-06-21 10:26:18 +00:00
parent 1cedc4eeba
commit 3e238a471b
9 changed files with 478 additions and 26 deletions

View File

@@ -33,6 +33,7 @@ Structured RPC over QUIC: operations, request/response, streaming subscriptions,
| [016](../../decisions/016-abort-cascade-for-nested-calls.md) | Abort Cascade for Nested Calls | `call.aborted` cascades to descendants; default `abort-dependents`, `continue-running` opt-in |
| [017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | Call Protocol Client and Adapter Contract | `CallClient` opens connections; `from_call` imports remote ops; connection direction independent of call direction |
| [022](../../decisions/022-handler-registration-provenance-and-composition-authority.md) | Handler Registration, Provenance, and Composition Authority | Registration bundle carries provenance, composition authority, scoped env, capabilities |
| [023](../../decisions/023-operation-error-schemas.md) | Operation Error Schemas | Operations declare domain errors; `call.error` carries typed `details`; adapter fidelity |
## Relevant Open Questions

View File

@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-06-21
last_updated: 2026-06-22
---
# Call Protocol
@@ -127,19 +127,28 @@ The `payload` of a `call.requested` event has this shape:
```json
{
"code": "NOT_FOUND",
"message": "operation not found: /fs/readFile",
"retryable": false
"code": "FILE_NOT_FOUND",
"message": "file not found: /etc/nonexistent",
"retryable": false,
"details": { "path": "/etc/nonexistent", "errno": 2 }
}
```
Error codes use an extensible string enum. The protocol defines the following codes:
- `NOT_FOUND` — operation not in registry
Error codes use an extensible string enum. The protocol defines the following **protocol-level codes** (emitted by the dispatch machinery, not by handlers):
- `NOT_FOUND` — operation not in registry (or Internal op called from wire)
- `FORBIDDEN` — access denied (insufficient scopes or unauthenticated)
- `INVALID_INPUT` — input doesn't match the operation's JSON Schema
- `INTERNAL` — handler error
- `INTERNAL` — handler error, panic, connection failure
- `TIMEOUT` — request timed out (retryable: true)
Operations may also declare **operation-level domain codes** in their `error_schemas` (ADR-023) — e.g., `FILE_NOT_FOUND`, `RATE_LIMITED`, `INSUFFICIENT_CREDITS`. These are emitted by handlers and carry a `details` payload conforming to the declared `ErrorDefinition.schema`. Protocol-level errors omit `details` or carry protocol-specific context (e.g., the operation name for `NOT_FOUND`).
Fields:
- `code` — the error code (protocol-level or operation-level)
- `message` — human-readable error message. For logging and debugging, not for programmatic handling. Clients should switch on `code`, not parse `message`.
- `retryable` — whether the caller should retry. `true` for transient failures, `false` for permanent ones.
- `details` — optional. When the code matches a declared `ErrorDefinition`, `details` conforms to that definition's schema. This is the typed error payload — it makes errors structured instead of string-matched. See ADR-023.
New error codes may be added in future versions. Clients should treat unknown error codes as `INTERNAL` with `retryable: false`.
### Protocol Operations
@@ -304,13 +313,14 @@ pub struct ResponseEnvelope {
}
pub struct CallError {
pub code: String,
pub message: String,
pub code: String, // protocol-level (NOT_FOUND, FORBIDDEN, ...) or operation-level (ADR-023)
pub message: String, // human-readable, for logging — not for programmatic handling
pub retryable: bool,
pub details: Option<Value>, // typed error payload, conforms to ErrorDefinition.schema (ADR-023)
}
```
Local dispatch produces `ResponseEnvelope` with no serialization overhead. The `CallAdapter` converts `ResponseEnvelope` to `EventEnvelope` for the wire.
Local dispatch produces `ResponseEnvelope` with no serialization overhead. The `CallAdapter` converts `ResponseEnvelope` to `EventEnvelope` for the wire. When a handler returns a `CallError` whose `code` matches a declared `ErrorDefinition`, the `details` field carries the typed error payload. See ADR-023.
### Connection and Stream Lifecycle
@@ -356,6 +366,7 @@ Handlers clean up resources when their call is cancelled (in Rust, the future is
| Abort cascade for nested calls | [ADR-016](../../decisions/016-abort-cascade-for-nested-calls.md) | `call.aborted` cascades to descendants; default `abort-dependents`, `continue-running` opt-in |
| Call protocol client and adapter contract | [ADR-017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | `CallClient` opens connections; `from_call` imports remote ops; connection direction independent of call direction |
| Handler registration, provenance, and composition authority | [ADR-022](../../decisions/022-handler-registration-provenance-and-composition-authority.md) | Registration bundle carries provenance, composition authority, scoped env, capabilities; dispatch path reads from bundle |
| Operation error schemas | [ADR-023](../../decisions/023-operation-error-schemas.md) | Operations declare domain errors; `call.error` carries typed `details` |
## Open Questions

View File

@@ -37,6 +37,7 @@ pub struct OperationSpec {
pub visibility: Visibility, // External (wire-callable) or Internal (composition-only)
pub input_schema: Value, // JSON Schema for input
pub output_schema: Value, // JSON Schema for output
pub error_schemas: Vec<ErrorDefinition>, // Declared domain errors (ADR-023)
pub access_control: AccessControl,
}
@@ -50,6 +51,14 @@ pub enum Visibility {
External, // Callable from the wire (call.requested from a client)
Internal, // Composition-only (env.invoke from a handler)
}
/// A declared operation-level error. See ADR-023.
pub struct ErrorDefinition {
pub code: String, // e.g., "FILE_NOT_FOUND", "RATE_LIMITED"
pub description: String, // Human-readable description
pub schema: Value, // JSON Schema for the error detail payload
pub http_status: Option<u16>, // HTTP status for adapter projection (from_openapi/to_openapi)
}
```
Operation names use slash-based paths without a leading slash, aligned with URL path conventions: `fs/readFile`, `agent/chat`, `services/list`. The leading slash is added when needed for display (`spec.path()` returns `/fs/readFile`) and for wire format (the `call.requested` payload uses `/fs/readFile`). See OQ-13 for the path format decision (single-node `service/op` vs head/worker `node/service/op`).
@@ -94,6 +103,8 @@ A handler receives:
And returns a `ResponseEnvelope` containing the result or an error. `ResponseEnvelope` is defined in [call-protocol.md](call-protocol.md#responseenvelope) — it carries the request ID and a `Result<Value, CallError>`. Local dispatch produces it with no serialization overhead; the `CallAdapter` converts it to `EventEnvelope` for the wire.
When a handler returns an error, the `CallError.code` is matched against the operation's declared `error_schemas` (ADR-023). If the code matches a declared `ErrorDefinition`, the `call.error` event carries that code and the error's detail payload. If it doesn't match, the `call.error` carries `INTERNAL`. This is how handler failures become typed errors on the wire instead of string-matched messages.
### OperationContext
```rust
@@ -272,7 +283,7 @@ These are read-only — no admin operations are exposed through the call protoco
}
```
`services/schema` accepts `{ "name": "fs/readFile" }` and returns the full `OperationSpec` including input/output JSON Schemas.
`services/schema` accepts `{ "name": "fs/readFile" }` and returns the full `OperationSpec` including input/output JSON Schemas and declared `error_schemas` (ADR-023). This enables client code generation: a client reading the schema can produce typed error enums instead of generic error handling.
### irpc Integration
@@ -392,6 +403,7 @@ The `Capabilities` type holds non-serializable, zeroized secret material. It doe
| Secret material flow and capability injection | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Capabilities carry outbound credentials; call protocol carries no secret material |
| Privilege model and authority context | [ADR-015](../../decisions/015-privilege-model-and-authority-context.md) | `internal` = authority switch not ACL skip; External/Internal visibility; composition authority + scoped env |
| Handler registration, provenance, and composition authority | [ADR-022](../../decisions/022-handler-registration-provenance-and-composition-authority.md) | Registration bundle carries provenance, composition authority, scoped env, capabilities; dispatch path reads from bundle |
| Operation error schemas | [ADR-023](../../decisions/023-operation-error-schemas.md) | Operations declare domain errors; `call.error` carries typed `details`; adapter fidelity for `from_openapi`/`to_openapi` |
## Open Questions