hub/docs/decisions/ADR-013-schema-system-integration.md

ADR-013: Schema system integration — TypeBox as canonical, typemap as scanner adapter

  • Status: Accepted (implemented in @alkdev/operations)
  • Date: 2026-04-25 (updated 2026-05-18)
  • Deciders: alkdev

Context

The operations system requires typed inputSchema and outputSchema on every IOperationDefinition. Internally, the system uses @alkdev/typebox (our fork of @sinclair/typebox 0.x LTS) exclusively: KindGuard.IsSchema() gates registration, Value.Check() and Value.Errors() perform validation, and Static<> derives TypeScript types from schemas. This is a hard dependency; the runtime requires genuine TypeBox TSchema objects with [Kind] symbols.

External systems send schemas over the wire as JSON Schema. The hub-spoke protocol is JSON over WebSocket. MCP tools and OpenAPI specs describe their payloads with JSON Schema. Non-TypeScript spokes (Python, Rust, etc.) send JSON Schema. This means:

  1. TypeBox is the internal runtime format — the hub and TypeScript spokes use it for validation, type derivation, and schema checking.
  2. JSON Schema is the wire format — TypeBox schemas serialize to JSON Schema (they're a superset with [Kind] symbols that strip on serialization). The hub deserializes via FromSchema(). Any language with a JSON Schema library and a WebSocket client can implement a spoke.
  3. Spoke authors may prefer different schema DSLs — Zod, Valibot, or TypeScript syntax strings are more ergonomic for some developers than TypeBox's builder API. @alkdev/typemap (a fork of the archived @sinclair/typemap) provides bidirectional conversion between TypeBox, Zod, Valibot, and Syntax, with TypeBox as the canonical intermediate representation.
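Point 2 relies on ordinary JSON.stringify behaviour: symbol-keyed properties are skipped during serialization. A minimal sketch of that effect follows. The Kind symbol here is a local stand-in, not the one exported by @alkdev/typebox, and nameSchema is an illustrative object rather than a real TSchema.

```typescript
// Local stand-in for TypeBox's [Kind] symbol (assumption: not imported
// from @alkdev/typebox, so this block has no dependencies).
const Kind = Symbol("Kind");

// Shaped like a TypeBox string schema: JSON Schema keywords plus a symbol tag.
const nameSchema = { [Kind]: "String", type: "string", minLength: 1 };

// JSON.stringify skips symbol-keyed properties, leaving plain JSON Schema.
const wire = JSON.stringify(nameSchema);

// After parsing on the receiving side the tag is gone, which is why the
// hub has to rebuild a TSchema (this ADR uses FromSchema() for that).
const received: Record<string | symbol, unknown> = JSON.parse(wire);
const hasKind = Kind in received;
```

Because the tag does not survive the round trip, KindGuard.IsSchema() cannot be applied to wire schemas directly; they have to pass through a converter first.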

The question is how to integrate typemap without forcing Zod/Valibot into every install, and without changing the internal TypeBox contract.

Decision

TypeBox is canonical — no multi-schema internals

IOperationDefinition.inputSchema and outputSchema remain TSchema. The registry, validation, call protocol, and storage all use TypeBox natively. No TSchema | ZodTypeAny | ValibotSchema union types anywhere in core.

JSON Schema is the wire format

The spoke registration protocol (hub.register) carries operation specs with their schemas serialized as JSON Schema. On deserialization, the hub converts back to TypeBox TSchema via FromSchema(). This is the same pattern already used for MCP tools and OpenAPI specs.

The call protocol events (call.requested, call.responded, etc.) carry input as Type.Unknown() — the payload is validated against the operation's inputSchema by the receiver, not by the transport. The schema itself isn't in every event; only the operationId is, and the receiver looks up the schema from its registry.
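The receiver-side lookup described above might look roughly like the following sketch. The event shape, the Validator type, and the receive function are hypothetical; in the hub the validator role is played by the registered TypeBox schema and Value.Check().

```typescript
// Hypothetical shapes for illustration; the real event and registry
// types may differ.
type Validator = (input: unknown) => boolean;

interface CallRequested {
  operationId: string;
  input: unknown; // opaque to the transport
}

// Stand-in for schema-based validation of one operation's input.
const isAddInput: Validator = (input) => {
  if (typeof input !== "object" || input === null) return false;
  const { a, b } = input as { a?: unknown; b?: unknown };
  return typeof a === "number" && typeof b === "number";
};

// Stand-in registry keyed by operationId.
const validators = new Map<string, Validator>();
validators.set("math.add", isAddInput);

type Outcome = "ok" | "unknown-operation" | "invalid-input";

// The event carries only the operationId; the receiver resolves the
// schema (here: validator) locally and checks the payload itself.
function receive(call: CallRequested): Outcome {
  const validate = validators.get(call.operationId);
  if (validate === undefined) return "unknown-operation";
  return validate(call.input) ? "ok" : "invalid-input";
}
```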

Any language with a JSON Schema library and a WebSocket client can implement a spoke. No TypeBox dependency required on the spoke side.

FromSchema() coverage is a subset of JSON Schema

FromSchema() (in @alkdev/operations/from-schema) handles the JSON Schema features most commonly encountered in operation schemas. The current implementation covers:

| Feature | Support |
| --- | --- |
| type: "string", "number", "integer", "boolean", "null" | Full |
| type: "object" with properties / required | Full |
| type: "array" with items (single schema or tuple) | Full |
| allOf, anyOf, oneOf | Full |
| enum (value arrays) | Full |
| const (literal values) | Full |
| $ref (schema references) | ⚠️ Partial: produces Type.Ref() but requires definitions registered in TypeBox's schema registry for resolution at validation time |
| Schema annotations (description, default, format, etc.) | Passed through to TypeBox as options |
| $defs / definitions | Not handled: schemas using shared definitions must inline them before sending over the wire |
| patternProperties, additionalProperties | Not handled: falls through to Type.Unknown() |
| if/then/else | Not handled |
| not | Not handled |
| contentEncoding, contentMediaType | Not handled |

Wire format constraint: Spoke schemas sent over the wire must be self-contained (no $ref, no $defs/definitions) and use only the supported JSON Schema subset. Unsupported features currently produce Type.Unknown(), which accepts any value. That is safe in the sense that valid input is never rejected, but the affected part of the payload goes unvalidated. The hardened FromSchema() (see security constraints below) must warn on unsupported features rather than silently degrading.
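One way to surface unsupported features is to walk the parsed schema and collect the locations of keywords that the coverage table lists as not handled. The sketch below does that. findUnsupported is a hypothetical helper, not part of @alkdev/operations, and it is recursive, so it assumes the schema has already passed a depth check.

```typescript
// Keywords listed as "Not handled" in the coverage table of this ADR.
const UNSUPPORTED_KEYWORDS = new Set([
  "$defs", "definitions", "patternProperties", "additionalProperties",
  "if", "then", "else", "not", "contentEncoding", "contentMediaType",
]);

// Keywords whose values are literal data, not subschemas.
const DATA_KEYWORDS = new Set(["enum", "const", "default", "examples"]);

// Returns JSON-pointer-like paths of unsupported keywords.
function findUnsupported(schema: unknown, path = "#", out: string[] = []): string[] {
  if (Array.isArray(schema)) {
    schema.forEach((item, i) => findUnsupported(item, `${path}/${i}`, out));
    return out;
  }
  if (typeof schema !== "object" || schema === null) return out;
  for (const [key, value] of Object.entries(schema)) {
    if (DATA_KEYWORDS.has(key)) continue;
    const here = `${path}/${key}`;
    if (UNSUPPORTED_KEYWORDS.has(key)) out.push(here);
    if (key === "properties" && typeof value === "object" && value !== null) {
      // Keys under "properties" are property names, not keywords, so a
      // field that happens to be called "not" is not reported.
      for (const [propName, sub] of Object.entries(value)) {
        findUnsupported(sub, `${here}/${propName}`, out);
      }
    } else {
      findUnsupported(value, here, out);
    }
  }
  return out;
}
```

The returned paths could be logged at registration time so that a spoke author sees exactly which parts of a schema are not being validated.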

Inbound schema processing has security constraints

When a spoke sends JSON Schema over the wire, the hub runs FromSchema() on it. This is processing untrusted input and must be hardened:

  • Schema depth limit: FromSchema() is recursive. Schemas with deeply nested allOf/anyOf can cause stack overflows. The hub must reject schemas exceeding 10 levels of nesting.
  • Schema size limit: The hub.register handler must reject operation specs whose serialized schema exceeds 64KB per schema.
  • $ref policy: Wire schemas must be self-contained. Circular $refs are a DoS vector. The hub must reject any schema containing $ref or $defs/definitions at registration time.
  • No silent degradation: FromSchema() must warn on unsupported JSON Schema features rather than silently producing Type.Unknown(). The hub logs which features fell through so spoke authors can fix their schemas.
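The first three constraints can be checked before FromSchema() ever runs. The following is a sketch under stated assumptions, not the hardened implementation: rejectReason and WireLimits are hypothetical names, "depth" is counted as the number of nested JSON objects (array items stay at the level of their array), and size is approximated by the length of the serialized string rather than the raw frame bytes.

```typescript
interface WireLimits {
  maxDepth: number; // nested JSON objects allowed
  maxSize: number;  // serialized length per schema
}

// Limits taken from this ADR: 10 levels, 64KB.
const DEFAULT_LIMITS: WireLimits = { maxDepth: 10, maxSize: 64 * 1024 };

const FORBIDDEN_KEYWORDS = new Set(["$ref", "$defs", "definitions"]);

type WalkFrame = { node: unknown; depth: number; keysAreNames: boolean };

// Returns a rejection reason, or null when the schema passes all checks.
function rejectReason(schema: unknown, limits: WireLimits = DEFAULT_LIMITS): string | null {
  // String length is a lower bound on the UTF-8 byte size; a real handler
  // would more likely measure the incoming message before parsing it.
  const size = JSON.stringify(schema).length;
  if (size > limits.maxSize) return `schema size ${size} exceeds limit ${limits.maxSize}`;

  // Explicit stack instead of recursion, so the check itself cannot
  // overflow the call stack on a hostile schema.
  const stack: WalkFrame[] = [{ node: schema, depth: 1, keysAreNames: false }];
  while (stack.length > 0) {
    const { node, depth, keysAreNames } = stack.pop() as WalkFrame;
    if (typeof node !== "object" || node === null) continue;
    if (Array.isArray(node)) {
      for (const item of node) stack.push({ node: item, depth, keysAreNames: false });
      continue;
    }
    if (depth > limits.maxDepth) return `nesting exceeds ${limits.maxDepth} levels`;
    for (const [key, value] of Object.entries(node)) {
      // Keys directly under "properties" are field names, not keywords.
      if (!keysAreNames && FORBIDDEN_KEYWORDS.has(key)) {
        return `"${key}" is not allowed in wire schemas`;
      }
      const childKeysAreNames = !keysAreNames && key === "properties";
      stack.push({ node: value, depth: depth + 1, keysAreNames: childKeysAreNames });
    }
  }
  return null;
}
```

Rejecting $ref outright, instead of resolving it, means the handler never has to follow a reference and so cannot be driven into a cycle.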

Scanner is the conversion point — typemap converts at scan time

The scanner (@alkdev/operations/scanner, using the ScannerFS Deno adapter for filesystem access) walks the filesystem, imports .ts operation files, and registers their default exports. This is where typemap integrates: the scanner detects the schema type and converts non-TypeBox schemas before registration, using the SchemaAdapter pattern from @alkdev/operations/from-typemap.

// Scanner conversion logic (schematic)
if (KindGuard.IsSchema(schema)) {
  // TypeBox — register directly (current path)
} else if (IsZod(schema)) {
  // Zod → TypeBoxFromZod → TSchema → register
} else if (IsValibot(schema)) {
  // Valibot → TypeBoxFromValibot → TSchema → register
} else {
  throw new Error("Not a valid schema type...");
}

The spoke author writes their operation definition using whatever schema DSL they prefer. The scanner converts it to TypeBox transparently at registration time. No manual fromZod() call needed — the author just writes Zod schemas in their operation file and the scanner handles the rest.

The conversion is one-way and happens once at scan time. After registration, only the TypeBox TSchema exists in the registry. The original Zod/Valibot schema is not kept — the TypeBox conversion is the authoritative schema for validation, serialization, and type derivation.

typemap is an optional dependency with dynamic import

@alkdev/typemap is an optional peer dependency of the spoke package, not a dependency of core. The scanner uses the SchemaAdapter from @alkdev/operations/from-typemap, which uses dynamic imports to load typemap's conversion functions only when needed:

import type { TSchema } from "@alkdev/typebox";

// If a Zod schema is detected and typemap isn't installed,
// the error message directs the user to install it.
async function convertFromZod(schema: unknown): Promise<TSchema> {
  let typemap: typeof import("@alkdev/typemap");
  try {
    typemap = await import("@alkdev/typemap");
  } catch {
    throw new Error(
      "Zod schema detected but @alkdev/typemap is not installed. " +
      "Add it as a peer dependency to use Zod schemas in operation definitions."
    );
  }
  // Convert outside the try block so that a conversion failure is not
  // misreported as a missing dependency.
  return typemap.TypeBoxFromZod(schema);
}

This keeps typemap, Zod, and Valibot out of the dependency tree entirely for spoke authors who use TypeBox directly. The import() is conditional — if no Zod schemas are encountered, the dynamic import is never executed and the modules are never loaded.
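The "never loaded unless needed" property can be illustrated with a small memoizing wrapper. This is a sketch only: lazy, loadStats, and getConverters are hypothetical names, and the loader below is a fake that stands in for () => import("@alkdev/typemap") so the block runs without that package installed.

```typescript
// Wraps a loader so that it runs at most once, and only on first use.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load();
    return cached;
  };
}

// Counts how often the fake module loader is invoked.
const loadStats = { count: 0 };

// Fake loader standing in for the dynamic import of the conversion module.
const getConverters = lazy(async () => {
  loadStats.count += 1;
  return { TypeBoxFromZod: (schema: unknown) => ({ converted: schema }) };
});
```

Until getConverters() is called, the loader does not run at all; repeated calls share one promise. Note that this simple version also caches a rejected promise, so a failed load would not be retried.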

The type detection guards (IsZod, IsValibot) read the Standard Schema ~standard property and check its vendor field ("zod" or "valibot"). Standard Schema is a community spec implemented by Zod 3.24+ and Valibot 1.0+. The checks are small inline predicates that don't require importing Zod or Valibot themselves.
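A vendor check of this kind needs nothing beyond property access. The sketch below shows one possible shape; standardVendor, isZod, and isValibot are illustrative names, and the actual guards in @alkdev/operations/from-typemap may be written differently.

```typescript
// Reads the Standard Schema vendor name, or returns undefined when the
// value does not expose a usable "~standard" property.
function standardVendor(schema: unknown): string | undefined {
  if ((typeof schema !== "object" && typeof schema !== "function") || schema === null) {
    return undefined;
  }
  const std = (schema as Record<string, unknown>)["~standard"];
  if (typeof std !== "object" || std === null) return undefined;
  const vendor = (std as Record<string, unknown>).vendor;
  return typeof vendor === "string" ? vendor : undefined;
}

// Neither predicate imports Zod or Valibot; both only inspect the value.
const isZod = (schema: unknown): boolean => standardVendor(schema) === "zod";
const isValibot = (schema: unknown): boolean => standardVendor(schema) === "valibot";
```

A plain TypeBox schema has no ~standard property in this sketch's model, so both predicates return false for it and the scanner would fall through to its other branches.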

Hub-side registration stays unchanged

When a spoke sends its operation list over the wire in hub.register, the schemas arrive as plain JSON (no [Kind] symbols). The hub's registration handler converts them via FromSchema() (from @alkdev/operations/from-schema):

// In hub.register handler
for (const spec of wireSpecs) {
  const inputSchema = FromSchema(spec.inputSchema);   // JSON Schema → TSchema
  const outputSchema = FromSchema(spec.outputSchema);  // JSON Schema → TSchema
  registry.register({ ...spec, inputSchema, outputSchema });
}

This is already the pattern used for MCP tools and OpenAPI specs. Spoke registration is the same, whether the original author wrote in TypeBox, Zod, or Valibot — by the time it crosses the wire, it's JSON Schema.

Consequences

Positive:

  • Zero bloat for core or for spoke authors using TypeBox directly
  • Spoke authors get ergonomic schema definition in Zod, Valibot, or Syntax transparently — the scanner converts at registration time
  • Non-TypeScript spokes use JSON Schema natively — no adapter needed at the protocol level
  • Wire format is language-agnostic (JSON Schema)
  • TypeBox remains the single canonical runtime format — no multi-schema validation paths
  • Dynamic imports mean Zod and Valibot are only loaded when schemas in those formats are actually encountered

Negative:

  • Zod refinements that have no JSON Schema equivalent (e.g., .refine(), .superRefine(), .transform()) will be lost in conversion. The TypeBoxFromZod conversion handles declarative constraints (.min(), .max(), .email(), etc.) but not arbitrary validation functions. Spoke authors using Zod refinements need to understand that only the JSON Schema-representable subset survives the TypeBox conversion.
  • Type precision loss at the wire boundary: FromSchema() returns the generic TSchema type, so Static<typeof schema> resolves to unknown for wire-registered schemas (unlike in-process TypeBox schemas, where Static<> gives precise types). Runtime validation is preserved, but compile-time type narrowing is lost for hub-side TypeScript code consuming spoke-registered operations. This is an inherent trade-off with wire-mediated schema exchange: the hub can't reconstruct the precise TypeScript type from JSON Schema alone.
  • Error message fidelity: When a Zod-derived schema fails validation after TypeBox conversion, error messages reference TypeBox paths and type names, not the original Zod field names. Adding description fields to Zod schemas helps, since those survive conversion.
  • The scanner needs a fallback error path for when typemap isn't installed but a Zod/Valibot schema is encountered.
  • typemap is a community-maintained fork of an archived project — carries some maintenance risk, mitigated by it being a thin conversion layer with no runtime presence in the hub.

Implementation status: The scanner enhancement is now implemented in @alkdev/operations. The SchemaAdapter pattern in @alkdev/operations/from-typemap handles schema type detection (using Standard Schema ~standard vendor checks) and dynamic import conversion paths. @alkdev/typemap is an optional peer dependency of the spoke package. FromSchema() in @alkdev/operations/from-schema is hardened with depth limits, size limits, and cycle detection.

Out of Scope

  • Bidirectional Zod ↔ TypeBox sync (conversion is one-way and one-time at scan/registration)
  • Runtime schema migration or schema versioning across re-registrations
  • Auto-generation of TypeScript types from wire schemas (code generation approach, deferred)
  • Converting Zod .transform() / .pipe() output types (these are runtime-only, not representable in JSON Schema)

References

  • @alkdev/typemap npm: @alkdev/typemap@0.10.1 — fork of @sinclair/typemap 0.x
  • Standard Schema spec — community interface for type checking libraries
  • Scanner: @alkdev/operations/scanner (with ScannerFS Deno adapter)
  • FromSchema(): @alkdev/operations/from-schema — JSON Schema → TypeBox converter
  • FromOpenAPI(): @alkdev/operations/from-openapi — OpenAPI → operation definitions
  • SchemaAdapter: @alkdev/operations/from-typemap — Zod/Valibot → TypeBox conversion at registration time
  • Spoke architecture: docs/architecture/spoke-runner.md
  • Call protocol: docs/architecture/call-graph.md
  • Operations system: docs/architecture/operations.md
  • ADR-006: Operation specs as capabilities (definitions vs. registrations)