ujsx/docs/architecture/transforms.md
glm-5.1 09f32f0c64 add architecture docs synced to current source and sdd process
Phase 1 of SDD process: syncing docs/architecture/ to reflect the
existing source code. Eight component documents describe WHAT and WHY
(not HOW) for each module: schema, element factory, reactive layer,
host config, transforms, events, pointers, and build distribution.
Three ADRs capture key decisions (HTML-agnostic core, TypeBox Module
as type registry, Preact signals-core for reactivity). Each doc
documents known reconciler gaps and references the research in
docs/research/reconciler/.

Also adds docs/sdd_process.md (process reference shared across
alkdev projects) matching the taskgraph_ts pattern.
2026-05-18 15:00:33 +00:00


---
status: draft
last_updated: 2026-05-18
---
# Transforms
The TransformRegistry, TransformRule, and TransformContext that power bi-directional tree conversion.
## Overview
UJSX trees need to convert to and from multiple target formats — markdown (mdast), HTML (hast), JSON paths (jpath). The transform system provides a generic, direction-aware rule engine that finds the right handler for a node and invokes it. Rules match on both direction and an arbitrary predicate, not on type tags alone. This enables the "same registry, different direction" pattern: one set of rule definitions handles both UJSX→target and target→UJSX transforms.
The registry is intentionally generic over `TInput`, `TOutput`, and `A` (ancestor type). It knows nothing about UNode, UElement, or any UJSX-specific type. This allows reuse for any tree-to-tree conversion where the same rule structure applies.
## TransformRule
```typescript
interface TransformRule<TInput, TOutput, A> {
  name: string;
  direction: Direction;
  schema?: TSchema;
  match: (node: TInput) => boolean;
  transform: (
    node: TInput,
    ctx: TransformContext<A>,
    next: TransformFn<TInput, TOutput, A>,
  ) => TOutput;
  priority?: number;
}
```
### name
Human-readable identifier. Used in error messages and in `transform.apply` events. Rules without meaningful names are hard to debug when the registry throws "no matching rule" errors.
### direction
One of six predefined strings: `"ujsx→mdast"`, `"mdast→ujsx"`, `"ujsx→jpath"`, `"jpath→ujsx"`, `"ujsx→hast"`, `"hast→ujsx"`. The direction is part of the match criteria — a rule for `ujsx→mdast` will not match when the context direction is `mdast→ujsx`. This eliminates the need for separate "encode" and "decode" registries.
The `Direction` type is defined in `context.ts`, not in the transform module. This reflects that direction is a render/conversion concept that exists outside transforms — it also appears in `RenderContext` and event payloads.
### schema
Optional TypeBox schema. When provided, it enables `matchesSchema(rule.schema, node)` as a match predicate. A rule author can use schema-based matching, predicate-based matching, or both. The registry does not automatically check `schema` during `transform()` — it is a convenience for rule authors to compose into their `match` function.
### match
A predicate that returns true if this rule should handle the given node. Combined with `direction`, this is the full match condition. Typical implementations:
- `matchesSchema(schema, node)` — TypeBox `Value.Check` for structural validation
- `(node) => node.type === "heading"` — simple equality check
- `(node) => node.type === "heading" && node.props.level > 3` — compound logic
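The predicate styles above can be sketched as plain functions. This is a minimal illustration only: `DemoNode` is a hypothetical node shape invented for the example, not a UJSX type, and schema-based matching is left out because it depends on the TypeBox setup.

```typescript
// Hypothetical node shape for illustration; real UJSX node types may differ.
interface DemoNode {
  type: string;
  props: { level?: number };
}

// Simple equality check on the type tag.
const isHeading = (node: DemoNode): boolean => node.type === "heading";

// Compound logic: a heading whose level is greater than 3.
// The explicit undefined check keeps the comparison type-safe.
const isDeepHeading = (node: DemoNode): boolean =>
  node.type === "heading" &&
  node.props.level !== undefined &&
  node.props.level > 3;

const h2: DemoNode = { type: "heading", props: { level: 2 } };
const h4: DemoNode = { type: "heading", props: { level: 4 } };
const para: DemoNode = { type: "paragraph", props: {} };
```

Because `isHeading` also accepts every node that `isDeepHeading` accepts, a rule using the narrower predicate would need a higher `priority` to be checked first.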
### transform
The conversion function. Receives the node, the transform context, and a `next` callback. `next` delegates to the next matching rule in the registry — this is a chain-of-responsibility pattern:
- **Short-circuit**: return a converted value without calling `next`. The rule handles the node completely.
- **Delegate**: call `next(node, ctx)` to fall through to the next rule. Useful for middleware-like rules that wrap or augment another rule's output.
This pattern avoids hard-coded rule chaining — rules don't reference each other. The registry manages the chain by passing `next` at invocation time.
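A minimal sketch of the two behaviors, assuming `next` re-enters the rule lookup as described under TransformRegistry. The `DemoNode` shape, the simplified `Ctx`, the rule names, and the free-standing `transform` function are hypothetical stand-ins, not the library's API. The delegating rule passes its inner node to `next`, so the lookup does not select the same rule again.

```typescript
type Direction = "ujsx→mdast" | "mdast→ujsx";

interface Ctx {
  direction: Direction;
}

// Hypothetical node shape: "loud" wraps an inner node, "text" is a leaf.
interface DemoNode {
  type: string;
  value?: string;
  inner?: DemoNode;
}

type Next = (node: DemoNode, ctx: Ctx) => string;

interface Rule {
  name: string;
  direction: Direction;
  match: (node: DemoNode) => boolean;
  transform: (node: DemoNode, ctx: Ctx, next: Next) => string;
}

const rules: Rule[] = [
  {
    name: "loud-wrapper",
    direction: "ujsx→mdast",
    match: (node) => node.type === "loud" && node.inner !== undefined,
    // Delegate: hand the inner node to whichever rule matches it,
    // then augment that rule's output.
    transform: (node, ctx, next) =>
      next(node.inner as DemoNode, ctx).toUpperCase(),
  },
  {
    name: "text",
    direction: "ujsx→mdast",
    match: (node) => node.type === "text",
    // Short-circuit: produce a value without calling next.
    transform: (node) => node.value ?? "",
  },
];

// Stand-in for the registry's transform(): first match wins,
// and next is simply this same lookup.
function transform(node: DemoNode, ctx: Ctx): string {
  const rule = rules.find(
    (r) => r.direction === ctx.direction && r.match(node),
  );
  if (!rule) throw new Error(`no matching rule for node type "${node.type}"`);
  return rule.transform(node, ctx, transform);
}

const out = transform(
  { type: "loud", inner: { type: "text", value: "hello" } },
  { direction: "ujsx→mdast" },
);
```

Here `out` is `"HELLO"`: the wrapper rule never names the `text` rule, it only calls `next`.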
### priority
Higher values are checked first. Default is `0`. Rules with equal priority are checked in registration order. Priority allows "catch-all" rules (low or negative priority) to coexist with specific rules without relying on registration order.
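The ordering can be sketched with a plain sort, assuming a missing `priority` is treated as `0`. `PrioritizedRule` and `byPriority` are hypothetical names for this example. It relies on `Array.prototype.sort` being stable (guaranteed since ES2019), which is what keeps ties in registration order.

```typescript
interface PrioritizedRule {
  name: string;
  priority?: number;
}

// Sort by priority descending; a missing priority counts as 0.
// The sort is stable, so rules with equal priority keep their input order.
function byPriority(rules: PrioritizedRule[]): PrioritizedRule[] {
  return [...rules].sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0));
}

// Listed in registration order.
const registered: PrioritizedRule[] = [
  { name: "catch-all", priority: -10 },
  { name: "heading" }, // default priority 0
  { name: "deep-heading", priority: 5 },
  { name: "paragraph" }, // default priority 0, registered after "heading"
];

const order = byPriority(registered).map((r) => r.name);
// order: ["deep-heading", "heading", "paragraph", "catch-all"]
```

The catch-all rule was registered first but is checked last, so specific rules can be added later without reordering registrations.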
## TransformContext
```typescript
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}
```
- **ancestors** — stack of ancestor nodes, root-first. `childCtx()` pushes a parent onto this stack when descending. Empty at the root level.
- **index** — position within the parent's child list. Used by `transformAll()` to pass the array index.
- **direction** — the conversion direction. Matched against rule direction during `transform()`.
- **metadata** — extensible key-value bag for rules to communicate across the tree traversal. For example, a heading rule might set `metadata.headingDepth` for descendants to reference.
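A sketch of how a context might be derived when descending into a child. The `childCtx` signature here is an assumption (the source names the helper but does not show its parameters), and this version copies the ancestor array rather than mutating it, which may differ from the actual implementation.

```typescript
type Direction =
  | "ujsx→mdast" | "mdast→ujsx"
  | "ujsx→jpath" | "jpath→ujsx"
  | "ujsx→hast" | "hast→ujsx";

interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: context for the child at `index` under `parent`.
// The ancestor array is copied so sibling traversals stay independent;
// metadata is shared by reference so rules can communicate across the tree.
function childCtx<A>(
  ctx: TransformContext<A>,
  parent: A,
  index: number,
): TransformContext<A> {
  return { ...ctx, ancestors: [...ctx.ancestors, parent], index };
}

const root: TransformContext<string> = {
  ancestors: [],
  index: 0,
  direction: "ujsx→mdast",
  metadata: {},
};

root.metadata["headingDepth"] = 2; // e.g. set by a heading rule

const inSection = childCtx(root, "section", 0);
const inHeading = childCtx(inSection, "heading", 1);
// inHeading.ancestors is root-first: ["section", "heading"]
```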
## TransformRegistry
### register(rule)
Adds a rule and re-sorts by `priority` descending. The sort happens on every registration, not just at lookup time. This is acceptable because registrations happen at setup time, not in hot loops. It guarantees that `transform()` always checks the highest-priority rules first.
### transform(node, ctx)
Finds the first rule for which `rule.direction === ctx.direction && rule.match(node)` is true, and throws if no rule matches. The matched rule receives `next`, a callback that recursively calls `transform()`, so it can delegate to another handler.
The "first match wins" semantics mean that priority and registration order resolve ambiguity. There is no rule composition beyond the `next` callback.
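The lookup can be sketched with a small stand-in class. `MiniRegistry`, the simplified `Ctx`, and the string-based "markdown" output are hypothetical and only illustrate how one registry serves both directions; real mdast nodes are structured objects, not prefixed strings.

```typescript
type Direction = "ujsx→mdast" | "mdast→ujsx";

interface Ctx {
  direction: Direction;
}

// Toy node shape used on both sides of the conversion.
interface DemoNode {
  type: string;
  text: string;
}

interface Rule<I, O> {
  name: string;
  direction: Direction;
  match: (node: I) => boolean;
  transform: (node: I, ctx: Ctx, next: (node: I, ctx: Ctx) => O) => O;
  priority?: number;
}

class MiniRegistry<I, O> {
  private rules: Rule<I, O>[] = [];

  register(rule: Rule<I, O>): void {
    this.rules.push(rule);
    // Re-sort on every registration; the stable sort keeps ties in
    // registration order.
    this.rules.sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0));
  }

  transform(node: I, ctx: Ctx): O {
    // First match wins: direction and predicate must both hold.
    const rule = this.rules.find(
      (r) => r.direction === ctx.direction && r.match(node),
    );
    if (!rule) {
      throw new Error(`no matching rule for direction ${ctx.direction}`);
    }
    return rule.transform(node, ctx, (n, c) => this.transform(n, c));
  }
}

const registry = new MiniRegistry<DemoNode, DemoNode>();
const isHeading = (n: DemoNode): boolean => n.type === "heading";

registry.register({
  name: "heading-to-mdast",
  direction: "ujsx→mdast",
  match: isHeading,
  transform: (n) => ({ type: "heading", text: `# ${n.text}` }),
});
registry.register({
  name: "heading-from-mdast",
  direction: "mdast→ujsx",
  match: isHeading,
  transform: (n) => ({ type: "heading", text: n.text.replace(/^# /, "") }),
});

const encoded = registry.transform(
  { type: "heading", text: "Intro" },
  { direction: "ujsx→mdast" },
);
const decoded = registry.transform(encoded, { direction: "mdast→ujsx" });
```

Both rules share the same `match` predicate; only `ctx.direction` decides which one runs. A node that neither rule matches, such as a paragraph, makes `transform()` throw.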
### transformAll(nodes, ctx)
Maps `transform()` over an array, passing each node's index as `ctx.index`. This is a convenience for transforming child lists — it preserves the ancestor stack from `ctx` without requiring callers to manage index tracking.
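As a rough sketch, the mapping could look like the free function below. It takes the transform as a parameter so the example stays self-contained; the real method lives on the registry, and whether it copies or mutates the context is an assumption here.

```typescript
// Simplified context with only the fields this example needs.
interface Ctx<A> {
  ancestors: A[];
  index: number;
}

// Hypothetical free-function version: maps a transform over siblings,
// giving each call a context copy whose index is the array position.
function transformAll<I, O, A>(
  nodes: I[],
  ctx: Ctx<A>,
  transform: (node: I, ctx: Ctx<A>) => O,
): O[] {
  return nodes.map((node, index) => transform(node, { ...ctx, index }));
}

const parentCtx: Ctx<string> = { ancestors: ["list"], index: 0 };

const items = transformAll(
  ["a", "b", "c"],
  parentCtx,
  (node, ctx) => `${ctx.ancestors.join("/")}[${ctx.index}]=${node}`,
);
// items: ["list[0]=a", "list[1]=b", "list[2]=c"]
```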
## Direction Definitions
The six directions pair into three bi-directional channels:
| Channel | Forward | Reverse |
|---------|---------|---------|
| Markdown | `ujsx→mdast` | `mdast→ujsx` |
| JSON Path | `ujsx→jpath` | `jpath→ujsx` |
| HTML | `ujsx→hast` | `hast→ujsx` |
These are the directions currently defined. Additional directions (e.g., `ujsx→dom`, `dom→ujsx`) can be added by extending the `Direction` union in `context.ts`. The registry itself is generic and does not enumerate directions.
## Known Gaps
### No transform composition beyond next
Rules can only delegate via `next`. There is no mechanism for a rule to compose multiple sub-rules (e.g., "transform children using rule X, then apply my own logic"). Such composition must be done in application code, outside the registry.
### No built-in error recovery
If `transform()` throws, the entire traversal aborts. There is no fallback rule, no "best effort" mode, and no way to skip a node and continue. Rule authors must handle their own error cases within the `transform` function.
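One way a rule author might contain failures is to wrap the rule's `transform` function before registering it. The `withFallback` helper below is hypothetical and not part of the registry; it is a sketch of handling errors inside the rule, as described above.

```typescript
interface Ctx {
  direction: string;
}

type Transform<I, O> = (
  node: I,
  ctx: Ctx,
  next: (node: I, ctx: Ctx) => O,
) => O;

// Hypothetical wrapper: runs a rule's transform and substitutes a
// fallback value if it throws, so the traversal can continue.
function withFallback<I, O>(
  inner: Transform<I, O>,
  fallback: (node: I, error: unknown) => O,
): Transform<I, O> {
  return (node, ctx, next) => {
    try {
      return inner(node, ctx, next);
    } catch (error) {
      return fallback(node, error);
    }
  };
}

// A transform that fails on malformed input.
const parseLevel: Transform<{ level: string }, number> = (node) => {
  const level = parseInt(node.level, 10);
  if (isNaN(level)) throw new Error(`bad heading level: ${node.level}`);
  return level;
};

const safeParseLevel = withFallback(parseLevel, () => 1);

// Neither call delegates, so next is never invoked.
const unusedNext = (): number => {
  throw new Error("next should not be called");
};

const ctx: Ctx = { direction: "mdast→ujsx" };
const ok = safeParseLevel({ level: "3" }, ctx, unusedNext); // 3
const recovered = safeParseLevel({ level: "oops" }, ctx, unusedNext); // 1
```

Note that the wrapper also swallows errors thrown by rules reached through `next`, which may or may not be desirable for a given rule.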
### No caching or memoization
`transform()` performs a linear scan of rules on every call. For large rule sets or deep trees, this could become a bottleneck. No caching of match results or memoization of previous transforms is implemented.
## Constraints
- **Direction is a string union**: not an enum or extensible type. Adding a new direction requires modifying the `Direction` type in `context.ts`. This is intentional: directions define the conversion contract and should be explicitly enumerated.
- **Priority is numeric**: rules with equal priority are ordered only by registration order. Rule authors should assign distinct priorities when order matters.
- **The registry is generic**: it has no knowledge of `UNode`, `UElement`, or any UJSX type. The same registry class could transform between any tree formats. UJSX-specific semantics live in the rules, not in the registry.
- **matchesSchema is a standalone function**: it is not called automatically by `transform()`. Rule authors opt into schema matching by including it in their `match` predicate.
- **next is a recursive call**: calling `next` from within a rule's `transform` re-enters the registry's `transform()`, which again selects the first matching rule. This is by design, but rule authors must ensure the current rule's `match` predicate rejects the node passed to `next`; otherwise the same rule is selected again and the recursion never terminates.
## References
- Source: `src/transform/registry.ts`
- Direction type: `src/core/context.ts`
- TypeBox Value.Check: `@alkdev/typebox`