# ADR-013: Schema system integration (TypeBox as canonical, typemap as scanner adapter)

- **Status**: Accepted (implemented in `@alkdev/operations`)
- **Date**: 2026-04-25 (updated 2026-05-18)
- **Deciders**: alkdev

## Context

The operations system requires typed `inputSchema` and `outputSchema` on every `IOperationDefinition`. Internally, the system uses `@alkdev/typebox` (our fork of `@sinclair/typebox` 0.x LTS) exclusively: `KindGuard.IsSchema()` gates registration, `Value.Check()`/`Value.Errors()` perform validation, and `Static<>` derives TypeScript types from schemas. This is a hard dependency; the runtime requires genuine TypeBox `TSchema` objects with `[Kind]` symbols.

External systems send schemas over the wire as JSON Schema. The hub-spoke protocol is JSON over WebSocket. MCP tools and OpenAPI specs are JSON Schema. Non-TypeScript spokes (Python, Rust, etc.) send JSON Schema. This means:

1. **TypeBox is the internal runtime format.** The hub and TypeScript spokes use it for validation, type derivation, and schema checking.
2. **JSON Schema is the wire format.** TypeBox schemas serialize to JSON Schema: they are JSON Schema objects tagged with `[Kind]` symbols, and symbol-keyed properties are dropped on serialization. The hub deserializes via `FromSchema()`. Any language with a JSON Schema library and a WebSocket client can implement a spoke.
3. **Spoke authors may prefer different schema DSLs.** Zod, Valibot, or TypeScript syntax strings are more ergonomic for some developers than TypeBox's builder API. `@alkdev/typemap` (a fork of the archived `@sinclair/typemap`) provides bidirectional conversion between TypeBox, Zod, Valibot, and Syntax, with TypeBox as the canonical intermediate representation.

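
The serialization behaviour in point 2 can be illustrated without any dependency. This is a minimal sketch, assuming a hand-made `Kind` symbol and `nameSchema` object as stand-ins for what a TypeBox builder produces; neither is imported from TypeBox.

```typescript
// Stand-in for TypeBox's [Kind] symbol (hypothetical; not the real export).
const Kind = Symbol("Kind");

// Shaped like a TypeBox schema: JSON Schema keywords plus a symbol-keyed tag.
const nameSchema = {
  [Kind]: "String",
  type: "string",
  minLength: 1,
};

// JSON.stringify skips symbol-keyed properties, so only JSON Schema remains.
const wire = JSON.stringify(nameSchema);

// After a round trip the tag is gone, which is why the receiving side
// has to rebuild a genuine TSchema (in this ADR, via FromSchema()).
const received: Record<string | symbol, unknown> = JSON.parse(wire);
const hasKind = Kind in received;
```

Here `wire` holds plain JSON Schema and `hasKind` is `false`, so a spoke in any language can produce the same payload without knowing about TypeBox.
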
The question is how to integrate typemap without forcing Zod/Valibot into every install, and without changing the internal TypeBox contract.

## Decision

### TypeBox is canonical: no multi-schema internals

`IOperationDefinition.inputSchema` and `outputSchema` remain `TSchema`. The registry, validation, call protocol, and storage all use TypeBox natively. No `TSchema | ZodTypeAny | ValibotSchema` union types anywhere in core.

### JSON Schema is the wire format

The spoke registration protocol (`hub.register`) carries operation specs with their schemas serialized as JSON Schema. On deserialization, the hub converts back to TypeBox `TSchema` via `FromSchema()`. This is the same pattern already used for MCP tools and OpenAPI specs.

The call protocol events (`call.requested`, `call.responded`, etc.) carry `input` as `Type.Unknown()`: the payload is validated against the operation's `inputSchema` by the receiver, not by the transport. The schema itself isn't in every event; only the `operationId` is, and the receiver looks up the schema from its registry.

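
The receiver-side lookup can be sketched as follows. This is an illustration under stated assumptions, not the actual call protocol: the `CallRequested` and `RegisteredOperation` shapes, the `handleCall` function, and the `math.double` operation are hypothetical, and `checkInput` stands in for `Value.Check(inputSchema, input)`.

```typescript
// Hypothetical event shape: the transport carries only the id and an opaque payload.
interface CallRequested {
  operationId: string;
  input: unknown;
}

// Hypothetical registry entry; checkInput stands in for Value.Check(inputSchema, input).
interface RegisteredOperation {
  checkInput: (input: unknown) => boolean;
  handler: (input: unknown) => unknown;
}

type CallResult =
  | { ok: true; output: unknown }
  | { ok: false; error: string };

const registry = new Map<string, RegisteredOperation>();

registry.set("math.double", {
  checkInput: (input) => typeof input === "number",
  handler: (input) => (input as number) * 2,
});

// The receiver resolves the schema from its own registry, then validates.
function handleCall(event: CallRequested): CallResult {
  const op = registry.get(event.operationId);
  if (!op) {
    return { ok: false, error: `unknown operation: ${event.operationId}` };
  }
  if (!op.checkInput(event.input)) {
    return { ok: false, error: "input failed schema validation" };
  }
  return { ok: true, output: op.handler(event.input) };
}
```

Because validation happens after the lookup, the transport never needs to understand or carry the schema.
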
Any language with a JSON Schema library and a WebSocket client can implement a spoke. No TypeBox dependency is required on the spoke side.

### FromSchema() coverage is a subset of JSON Schema

`FromSchema()` (in `@alkdev/operations/from-schema`) handles the JSON Schema features most commonly encountered in operation schemas. The current implementation covers:

| Feature | Support |
|---------|---------|
| `type: "string"`, `"number"`, `"integer"`, `"boolean"`, `"null"` | ✅ Full |
| `type: "object"` with `properties` / `required` | ✅ Full |
| `type: "array"` with `items` (single schema or tuple) | ✅ Full |
| `allOf`, `anyOf`, `oneOf` | ✅ Full |
| `enum` (value arrays) | ✅ Full |
| `const` (literal values) | ✅ Full |
| `$ref` (schema references) | ⚠️ Partial: produces `Type.Ref()`, but requires definitions registered in TypeBox's schema registry for resolution at validation time |
| Schema annotations (`description`, `default`, `format`, etc.) | ✅ Passed through to TypeBox as options |
| `$defs` / `definitions` | ❌ Not handled: schemas using shared definitions must inline them before sending over the wire |
| `patternProperties`, `additionalProperties` | ❌ Not handled: falls through to `Type.Unknown()` |
| `if/then/else` | ❌ Not handled |
| `not` | ❌ Not handled |
| `contentEncoding`, `contentMediaType` | ❌ Not handled |

**Wire format constraint**: Spoke schemas sent over the wire must be **self-contained** (no `$ref`s, no `$defs`/`definitions`) and use only the supported JSON Schema subset. Unsupported features currently produce `Type.Unknown()`, which accepts any value: this avoids false rejections but performs no validation. The hardened `FromSchema()` (see security constraints below) must warn on unsupported features rather than silently degrading.

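
A spoke author could pre-check a schema against this subset before sending it. The following is a minimal sketch, not part of `@alkdev/operations`: the `findUnsupported` helper and its keyword list are assumptions derived from the table above, and it only walks the subschema positions the table lists as supported (`properties`, `items`, `allOf`, `anyOf`, `oneOf`).

```typescript
// Keywords the coverage table marks as not handled or disallowed on the wire.
const UNSUPPORTED = new Set([
  "$ref", "$defs", "definitions",
  "patternProperties", "additionalProperties",
  "if", "then", "else", "not",
  "contentEncoding", "contentMediaType",
]);

const COMBINATORS = new Set(["allOf", "anyOf", "oneOf"]);

// Returns JSON-pointer-like paths of unsupported keywords (hypothetical helper).
function findUnsupported(schema: unknown, path = "#"): string[] {
  if (typeof schema !== "object" || schema === null || Array.isArray(schema)) {
    return [];
  }
  const hits: string[] = [];
  for (const [key, value] of Object.entries(schema)) {
    const here = `${path}/${key}`;
    if (UNSUPPORTED.has(key)) {
      hits.push(here);
    } else if (key === "properties" && typeof value === "object" && value !== null) {
      // Keys under `properties` are property names, not keywords.
      for (const [name, sub] of Object.entries(value)) {
        hits.push(...findUnsupported(sub, `${here}/${name}`));
      }
    } else if (key === "items" || COMBINATORS.has(key)) {
      if (Array.isArray(value)) {
        value.forEach((sub, i) => hits.push(...findUnsupported(sub, `${here}/${i}`)));
      } else {
        hits.push(...findUnsupported(value, here));
      }
    }
  }
  return hits;
}

const report = findUnsupported({
  type: "object",
  properties: {
    not: { type: "string" }, // a property that happens to be named "not"
    tags: { type: "array", items: { type: "string", not: { const: "x" } } },
  },
  additionalProperties: false,
});
```

Treating `properties` as a name map matters: a property called `not` is legitimate, while the `not` keyword inside `items` is reported.
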
### Inbound schema processing has security constraints

When a spoke sends JSON Schema over the wire, the hub runs `FromSchema()` on it. This is processing untrusted input and must be hardened:

- **Schema depth limit**: `FromSchema()` is recursive. Schemas with deeply nested `allOf`/`anyOf` can cause stack overflows. The hub must reject schemas exceeding 10 levels of nesting.
- **Schema size limit**: The `hub.register` handler must reject operation specs whose serialized schema exceeds 64 KB per schema.
- **`$ref` policy**: Wire schemas must be self-contained. Circular `$ref`s are a DoS vector. The hub must reject any schema containing `$ref` or `$defs`/`definitions` at registration time.
- **No silent degradation**: `FromSchema()` must warn on unsupported JSON Schema features rather than silently producing `Type.Unknown()`. The hub logs which features fell through so spoke authors can fix their schemas.

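
The first three constraints can be sketched as a pre-flight check that runs before `FromSchema()`. This is an illustration under stated assumptions, not the hardened implementation: `checkWireSchema` is a hypothetical name, "64 KB" is read as 64 × 1024 UTF-8 bytes, and a nesting level is counted per schema object (arrays and `properties` maps do not add a level).

```typescript
const MAX_DEPTH = 10;
const MAX_BYTES = 64 * 1024; // assumption: "64 KB" means 65,536 UTF-8 bytes
const FORBIDDEN = new Set(["$ref", "$defs", "definitions"]);
// Values of these keywords are data, not subschemas, so they are not walked.
const DATA_KEYWORDS = new Set(["enum", "const", "default", "examples"]);

type Frame = { node: unknown; depth: number; isNameMap: boolean };

// Returns a rejection reason, or null if the schema passes (hypothetical helper).
function checkWireSchema(schema: unknown): string | null {
  const bytes = new TextEncoder().encode(JSON.stringify(schema)).length;
  if (bytes > MAX_BYTES) return `schema is ${bytes} bytes; limit is ${MAX_BYTES}`;

  // Explicit stack instead of recursion, so the check itself cannot overflow.
  const stack: Frame[] = [{ node: schema, depth: 1, isNameMap: false }];
  while (stack.length > 0) {
    const { node, depth, isNameMap } = stack.pop()!;
    if (typeof node !== "object" || node === null) continue;
    if (Array.isArray(node) || isNameMap) {
      // Containers: their children stay at the same schema depth.
      for (const child of Object.values(node)) {
        stack.push({ node: child, depth, isNameMap: false });
      }
      continue;
    }
    if (depth > MAX_DEPTH) return `schema nesting exceeds ${MAX_DEPTH} levels`;
    for (const [key, value] of Object.entries(node)) {
      if (FORBIDDEN.has(key)) return `"${key}" is not allowed in wire schemas`;
      if (DATA_KEYWORDS.has(key)) continue;
      stack.push({ node: value, depth: depth + 1, isNameMap: key === "properties" });
    }
  }
  return null;
}
```

Rejecting `$ref` outright removes the need for cycle detection in this check, since a schema without references is a finite tree.
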
### Scanner is the conversion point: typemap converts at scan time

The scanner (`@alkdev/operations/scanner`, using `ScannerFS` Deno adapter for filesystem access) walks the filesystem, imports `.ts` operation files, and registers their default exports. This is where typemap integrates: the scanner detects the schema type and converts non-TypeBox schemas before registration, using the `SchemaAdapter` pattern from `@alkdev/operations/from-typemap`.

```ts
// Scanner conversion logic (schematic)
if (KindGuard.IsSchema(schema)) {
  // TypeBox: register directly (current path)
} else if (IsZod(schema)) {
  // Zod → TypeBoxFromZod → TSchema → register
} else if (IsValibot(schema)) {
  // Valibot → TypeBoxFromValibot → TSchema → register
} else {
  throw new Error("Not a valid schema type...");
}
```

The spoke author writes their operation definition using whatever schema DSL they prefer. The scanner converts it to TypeBox transparently at registration time. No manual `fromZod()` call is needed; the author just writes Zod schemas in their operation file and the scanner handles the rest.

The conversion is one-way and happens once at scan time. After registration, only the TypeBox `TSchema` exists in the registry. The original Zod/Valibot schema is not kept; the TypeBox conversion is the authoritative schema for validation, serialization, and type derivation.

### typemap is an optional dependency with dynamic import

`@alkdev/typemap` is a peer dependency of the spoke package, not a dependency of core. The scanner uses the `SchemaAdapter` from `@alkdev/operations/from-typemap` which handles dynamic imports to load typemap's conversion functions only when needed:

```ts
import type { TSchema } from "@alkdev/typebox";

// If a Zod schema is detected and typemap isn't installed,
// the error message directs the user to install it.
async function convertFromZod(schema: unknown): Promise<TSchema> {
  // Only the import is guarded, so a genuine conversion failure
  // is not misreported as a missing dependency.
  const typemap = await import("@alkdev/typemap").catch((cause) => {
    throw new Error(
      "Zod schema detected but @alkdev/typemap is not installed. " +
        "Add it as a peer dependency to use Zod schemas in operation definitions.",
      { cause },
    );
  });
  return typemap.TypeBoxFromZod(schema);
}
```

This keeps typemap, Zod, and Valibot out of the dependency tree entirely for spoke authors who use TypeBox directly. The `import()` is conditional: if no Zod schemas are encountered, the dynamic import is never executed and the modules are never loaded.

The type detection guards (`IsZod`, `IsValibot`) use the [Standard Schema](https://github.com/standard-schema/standard-schema) `~standard` property with the `vendor` field (`"zod"` or `"valibot"`). This is a community spec implemented by Zod 3.24+ and Valibot 1.0+. The checks are small inline predicates that don't require importing Zod or Valibot themselves.

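
The vendor guards can be sketched as dependency-free predicates. This is a minimal illustration assuming only what the Standard Schema spec defines (a `~standard` property carrying a string `vendor`); the `standardVendor` helper is hypothetical, and the real guards in `@alkdev/operations/from-typemap` may differ.

```typescript
// Reads the Standard Schema vendor name, or undefined when `value`
// does not expose a `~standard` object (hypothetical helper).
function standardVendor(value: unknown): string | undefined {
  const isObjectLike = typeof value === "object" || typeof value === "function";
  if (!isObjectLike || value === null) return undefined;
  const std = (value as Record<string, unknown>)["~standard"];
  if (typeof std !== "object" || std === null) return undefined;
  const vendor = (std as Record<string, unknown>).vendor;
  return typeof vendor === "string" ? vendor : undefined;
}

// Neither predicate imports Zod or Valibot; they only inspect the tag.
const IsZod = (value: unknown): boolean => standardVendor(value) === "zod";
const IsValibot = (value: unknown): boolean => standardVendor(value) === "valibot";
```

A plain JSON Schema object or a TypeBox schema has no `~standard` property, so both predicates return `false` for it and the scanner falls through to the other branches.
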
### Hub-side registration stays unchanged

When a spoke sends its operation list over the wire in `hub.register`, the schemas arrive as plain JSON (no `[Kind]` symbols). The hub's registration handler converts them via `FromSchema()` (from `@alkdev/operations/from-schema`):

```ts
// In hub.register handler
for (const spec of wireSpecs) {
  const inputSchema = FromSchema(spec.inputSchema); // JSON Schema → TSchema
  const outputSchema = FromSchema(spec.outputSchema); // JSON Schema → TSchema
  registry.register({ ...spec, inputSchema, outputSchema });
}
```

This is already the pattern used for MCP tools and OpenAPI specs. Spoke registration is the same, whether the original author wrote in TypeBox, Zod, or Valibot: by the time it crosses the wire, it's JSON Schema.

## Consequences

**Positive:**

- Zero bloat for core or for spoke authors using TypeBox directly
- Spoke authors get ergonomic schema definition in Zod, Valibot, or Syntax transparently, because the scanner converts at registration time
- Non-TypeScript spokes use JSON Schema natively, so no adapter is needed at the protocol level
- Wire format is language-agnostic (JSON Schema)
- TypeBox remains the single canonical runtime format, with no multi-schema validation paths
- Dynamic imports mean Zod and Valibot are only loaded when schemas in those formats are actually encountered

**Negative:**

- Zod refinements that have no JSON Schema equivalent (e.g., `.refine()`, `.superRefine()`, `.transform()`) will be lost in conversion. The `TypeBoxFromZod` conversion handles declarative constraints (`.min()`, `.max()`, `.email()`, etc.) but not arbitrary validation functions. Spoke authors using Zod refinements need to understand that only the JSON Schema-representable subset survives the TypeBox conversion.
- **Type precision loss at the wire boundary**: `FromSchema()` returns a generic `TSchema`, so `Static<typeof schema>` resolves to `unknown` for wire-registered schemas (unlike in-process TypeBox schemas, where `Static<>` gives precise types). Runtime validation is preserved, but compile-time type narrowing is lost for hub-side TypeScript code consuming spoke-registered operations. This is an inherent trade-off with wire-mediated schema exchange: the hub can't reconstruct the precise TypeScript type from JSON Schema alone.
- **Error message fidelity**: When a Zod-derived schema fails validation after TypeBox conversion, error messages reference TypeBox paths and type names, not the original Zod field names. Adding `description` fields to Zod schemas helps, since those survive conversion.
- The scanner needs a fallback error path for when typemap isn't installed but a Zod/Valibot schema is encountered.
- typemap is a community-maintained fork of an archived project. This carries some maintenance risk, mitigated by it being a thin conversion layer with no runtime presence in the hub.

**Implementation status:** The scanner enhancement is now implemented in `@alkdev/operations`. The `SchemaAdapter` pattern in `@alkdev/operations/from-typemap` handles schema type detection (using Standard Schema `~standard` vendor checks) and dynamic import conversion paths. `@alkdev/typemap` is an optional peer dependency of the spoke package. `FromSchema()` in `@alkdev/operations/from-schema` is hardened with depth limits, size limits, and cycle detection.

## Out of Scope

- Bidirectional Zod ↔ TypeBox sync (conversion is one-way and one-time at scan/registration)
- Runtime schema migration or schema versioning across re-registrations
- Auto-generation of TypeScript types from wire schemas (code generation approach, deferred)
- Converting Zod `.transform()` / `.pipe()` output types (these are runtime-only, not representable in JSON Schema)

## References

- `@alkdev/typemap` npm: `@alkdev/typemap@0.10.1`, a fork of `@sinclair/typemap` 0.x
- [Standard Schema spec](https://github.com/standard-schema/standard-schema): community interface for type checking libraries
- Scanner: `@alkdev/operations/scanner` (with `ScannerFS` Deno adapter)
- `FromSchema()`: `@alkdev/operations/from-schema` (JSON Schema → TypeBox converter)
- `FromOpenAPI()`: `@alkdev/operations/from-openapi` (OpenAPI → operation definitions)
- `SchemaAdapter`: `@alkdev/operations/from-typemap` (Zod/Valibot → TypeBox conversion at registration time)
- Spoke architecture: `docs/architecture/spoke-runner.md`
- Call protocol: `docs/architecture/call-graph.md`
- Operations system: `docs/architecture/operations.md`
- ADR-006: Operation specs as capabilities (definitions vs. registrations)