ujsx/docs/architecture/transforms.md
glm-5.1 0d5b9d5ea8 stabilize architecture docs: address review findings and advance to stable
Critical fixes:
- Restructure pointers.md: move setNode prop-key writes section under
  its own heading (was incorrectly nested under selectNode)
- Add Context/Density/Direction/RenderContext documentation section
  to host-config.md (was only a brief constraint bullet)
- Advance all 5 ADRs from Status: Proposed → Accepted and frontmatter
  from status: draft → status: stable (decisions are driving implementation)
- Add error handling philosophy section to README

Warning/suggestion fixes:
- Add isUElement null check (node !== null) to schema.md discriminator table
- Add UjsxEnvelope convenience type documentation to events.md
- Add Direction Unicode arrow naming note to transforms.md
- Standardize all cross-references from absolute docs/research/ paths
  to relative ../research/ paths across all architecture docs
- Fix schema.md ADR references to use relative paths
- Reduce redundancy between transforms.md and host-config.md Direction notes
- Update all architecture doc frontmatter from draft → stable

Deferred:
- Performance model section (reconciler not yet built)
- Concepts/glossary document (low ROI at current scale)
- Line counts in source references (would date quickly)
2026-05-18 16:10:24 +00:00

---
status: stable
last_updated: 2026-05-18
---
# Transforms
The TransformRegistry, TransformRule, and TransformContext that power bi-directional tree conversion.
## Overview
UJSX trees need to convert to and from multiple target formats — markdown (mdast), HTML (hast), JSON paths (jpath). The transform system provides a generic, direction-aware rule engine that finds the right handler for a node and invokes it. Rules match on both direction and an arbitrary predicate, not on type tags alone. This enables the "same registry, different direction" pattern: one set of rule definitions handles both UJSX→target and target→UJSX transforms.
The registry is intentionally generic over `TInput`, `TOutput`, and `A` (ancestor type). It knows nothing about UNode, UElement, or any UJSX-specific type. This allows reuse for any tree-to-tree conversion where the same rule structure applies.
## TransformRule
```typescript
interface TransformRule<TInput, TOutput, A> {
  name: string;
  direction: Direction;
  schema?: TSchema;
  match: (node: TInput) => boolean;
  transform: (
    node: TInput,
    ctx: TransformContext<A>,
    next: TransformFn<TInput, TOutput, A>,
  ) => TOutput;
  priority?: number;
}
```
### name
Human-readable identifier. Used in error messages and in `transform.apply` events. Rules without meaningful names are hard to debug when the registry throws "no matching rule" errors.
### direction
One of six predefined strings: `"ujsx→mdast"`, `"mdast→ujsx"`, `"ujsx→jpath"`, `"jpath→ujsx"`, `"ujsx→hast"`, `"hast→ujsx"`. The direction is part of the match criteria — a rule for `ujsx→mdast` will not match when the context direction is `mdast→ujsx`. This eliminates the need for separate "encode" and "decode" registries.
The `Direction` type is defined in `context.ts`, not in the transform module. This reflects that direction is a render/conversion concept that exists outside transforms — it also appears in `RenderContext` and event payloads.
### schema
Optional TypeBox schema. When provided, it enables `matchesSchema(rule.schema, node)` as a match predicate. A rule author can use schema-based matching, predicate-based matching, or both. The registry does not automatically check `schema` during `transform()` — it is a convenience for rule authors to compose into their `match` function.
### match
A predicate that returns true if this rule should handle the given node. Combined with `direction`, this is the full match condition. Typical implementations:
- `matchesSchema(schema, node)` — TypeBox `Value.Check` for structural validation
- `(node) => node.type === "heading"` — simple equality check
- `(node) => node.type === "heading" && node.props.level > 3` — compound logic
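The predicate styles listed above can be sketched as plain functions. This is a minimal illustration, not the library's code: `SketchNode`, `isHeading`, `isDeepHeading`, and `anyOf` are hypothetical names, and the node shape is assumed only for the example.

```typescript
// Hypothetical node shape, used only for this sketch.
interface SketchNode {
  type: string;
  props: { level?: number };
}

// Simple equality check on the type tag.
const isHeading = (node: SketchNode): boolean => node.type === "heading";

// Compound logic: a heading whose level is greater than 3.
const isDeepHeading = (node: SketchNode): boolean =>
  isHeading(node) && node.props.level !== undefined && node.props.level > 3;

// Predicates are ordinary functions, so small combinators work on them.
function anyOf<T>(...preds: Array<(node: T) => boolean>): (node: T) => boolean {
  return (node) => preds.some((pred) => pred(node));
}

const isBlockText = anyOf<SketchNode>(
  isHeading,
  (node) => node.type === "paragraph",
);
```

Any of these functions could be supplied as a rule's `match`; the direction check is applied separately by the registry.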
### transform
The conversion function. Receives the node, the transform context, and a `next` callback. `next` delegates to the next matching rule in the registry — this is a chain-of-responsibility pattern:
- **Short-circuit**: return a converted value without calling `next`. The rule handles the node completely.
- **Delegate**: call `next(node, ctx)` to fall through to the next rule. Useful for middleware-like rules that wrap or augment another rule's output.
This pattern avoids hard-coded rule chaining — rules don't reference each other. The registry manages the chain by passing `next` at invocation time.
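The two styles can be illustrated without a registry by wiring `next` by hand. The sketch below assumes that `TransformFn` has the shape `(node, ctx) => TOutput`, which matches the `next(node, ctx)` call described above; `SketchNode`, `headingToMarkdown`, and `annotateWithType` are hypothetical names.

```typescript
// Assumed shapes, reduced from the interfaces documented on this page.
type Direction = "ujsx→mdast" | "mdast→ujsx";
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}
type TransformFn<TInput, TOutput, A> = (node: TInput, ctx: TransformContext<A>) => TOutput;

// Hypothetical node shape for the example.
interface SketchNode {
  type: string;
  text: string;
}
type SketchTransform = (
  node: SketchNode,
  ctx: TransformContext<SketchNode>,
  next: TransformFn<SketchNode, string, SketchNode>,
) => string;

// Short-circuit: produces the result itself and never calls `next`.
const headingToMarkdown: SketchTransform = (node) => `# ${node.text}`;

// Delegate: lets another handler convert the node, then wraps its output.
const annotateWithType: SketchTransform = (node, ctx, next) =>
  `<!-- ${node.type} -->\n${next(node, ctx)}`;

// Stand-in for the registry: `next` is wired by hand to show the call flow.
const sketchCtx: TransformContext<SketchNode> = {
  ancestors: [],
  index: 0,
  direction: "ujsx→mdast",
  metadata: {},
};
const endOfChain: TransformFn<SketchNode, string, SketchNode> = () => {
  throw new Error("end of chain");
};
const rendered = annotateWithType(
  { type: "heading", text: "Title" },
  sketchCtx,
  (node, ctx) => headingToMarkdown(node, ctx, endOfChain),
);
```

Neither rule names the other; only the caller that supplies `next` decides what comes after the delegating rule.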
### priority
Higher values are checked first. Default is `0`. Rules with equal priority are checked in registration order. Priority allows "catch-all" rules (low or negative priority) to coexist with specific rules without relying on registration order.
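The ordering can be sketched with a single comparator. This assumes the registry orders rules with a stable sort (as `Array.prototype.sort` has been since ES2019); `PrioritizedRule` and `byPriorityDescending` are hypothetical names.

```typescript
interface PrioritizedRule {
  name: string;
  priority?: number;
}

// Returns a copy ordered by priority, highest first; a missing priority counts as 0.
// The sort is stable, so rules with equal priority keep their registration order.
function byPriorityDescending<R extends PrioritizedRule>(rules: readonly R[]): R[] {
  return [...rules].sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0));
}

const registrationOrder: PrioritizedRule[] = [
  { name: "catch-all", priority: -10 },
  { name: "paragraph" },
  { name: "deep-heading", priority: 10 },
  { name: "heading" },
];

// Order in which the rules would be checked.
const checkOrder = byPriorityDescending(registrationOrder).map((rule) => rule.name);
```

Here the catch-all is registered first but checked last, and the two default-priority rules stay in the order they were registered.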
## TransformContext
```typescript
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}
```
- **ancestors** — stack of ancestor nodes, root-first. `childCtx()` pushes a parent onto this stack when descending. Empty at the root level.
- **index** — position within the parent's child list. `transformAll()` sets it to each node's array index.
- **direction** — the conversion direction. Matched against rule direction during `transform()`.
- **metadata** — extensible key-value bag for rules to communicate across the tree traversal. For example, a heading rule might set `metadata.headingDepth` for descendants to reference.
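As an illustration of the `metadata` bag, the sketch below records a heading depth and reads it back with a type check, since values are typed `unknown`. The helper names and the reduced `Direction` union are assumptions for the example, and it assumes descendant contexts share the same `metadata` object.

```typescript
// Reduced copies of the documented types, for a self-contained example.
type Direction = "ujsx→mdast" | "mdast→ujsx";
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}

const HEADING_DEPTH = "headingDepth";

// Hypothetical helper: a heading rule records its level for descendants.
function recordHeadingDepth(ctx: TransformContext, level: number): void {
  ctx.metadata[HEADING_DEPTH] = level;
}

// Hypothetical helper: metadata values are `unknown`, so narrow before use.
function readHeadingDepth(ctx: TransformContext): number | undefined {
  const value = ctx.metadata[HEADING_DEPTH];
  return typeof value === "number" ? value : undefined;
}

const rootContext: TransformContext = {
  ancestors: [],
  index: 0,
  direction: "ujsx→mdast",
  metadata: {},
};

// A descendant context that reuses the same metadata object (an assumption).
const descendantContext: TransformContext = {
  ...rootContext,
  ancestors: ["heading"],
  index: 1,
};
```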
## TransformRegistry
### register(rule)
Adds a rule and re-sorts by `priority` descending. The sort happens on every registration, not just at lookup time. This is acceptable because registrations happen at setup time, not in hot loops. It guarantees that `transform()` always checks the highest-priority rules first.
### transform(node, ctx)
Selects the first rule, in priority order, for which `rule.direction === ctx.direction && rule.match(node)` is true, and throws if no rule matches. The matched rule receives `next`, a callback that recursively calls `transform()`, so the rule can delegate to another handler.
The "first match wins" semantics mean that priority and registration order resolve ambiguity. There is no rule composition beyond the `next` callback.
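The `register` and `transform` behavior described above can be approximated in a few lines. This is a sketch reconstructed from the prose, not the contents of `registry.ts`: `SketchRegistry` and `SketchNode` are hypothetical, the `Direction` union is reduced to two members, and the error message is invented for the example.

```typescript
type Direction = "ujsx→mdast" | "mdast→ujsx";
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}
type TransformFn<I, O, A> = (node: I, ctx: TransformContext<A>) => O;
interface TransformRule<I, O, A> {
  name: string;
  direction: Direction;
  match: (node: I) => boolean;
  transform: (node: I, ctx: TransformContext<A>, next: TransformFn<I, O, A>) => O;
  priority?: number;
}

class SketchRegistry<I, O, A = unknown> {
  private rules: TransformRule<I, O, A>[] = [];

  // Re-sort on every registration so lookups always see priority order.
  register(rule: TransformRule<I, O, A>): void {
    this.rules.push(rule);
    this.rules.sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0));
  }

  // First match wins: direction must agree and the predicate must accept the node.
  transform(node: I, ctx: TransformContext<A>): O {
    const rule = this.rules.find((r) => r.direction === ctx.direction && r.match(node));
    if (!rule) {
      throw new Error(`no matching rule for direction ${ctx.direction}`);
    }
    // `next` re-enters transform(), as the section above describes.
    return rule.transform(node, ctx, (n, c) => this.transform(n, c));
  }
}

interface SketchNode {
  type: string;
}
const mdRegistry = new SketchRegistry<SketchNode, string>();
mdRegistry.register({
  name: "fallback",
  direction: "ujsx→mdast",
  priority: -1,
  match: () => true,
  transform: (node) => `?${node.type}`,
});
mdRegistry.register({
  name: "heading",
  direction: "ujsx→mdast",
  match: (node) => node.type === "heading",
  transform: () => "#",
});
const toMdast: TransformContext = { ancestors: [], index: 0, direction: "ujsx→mdast", metadata: {} };
```

With these two rules, a heading is handled by the specific rule even though the fallback was registered first, and any call with `direction: "mdast→ujsx"` throws because no rule is registered for that direction.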
### transformAll(nodes, ctx)
Maps `transform()` over an array, passing each node's index as `ctx.index`. This is a convenience for transforming child lists — it preserves the ancestor stack from `ctx` without requiring callers to manage index tracking.
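A sketch of the same idea as a free function: the real `transformAll` is a registry method, so the `transform` parameter here is a stand-in for the registry's own `transform()`, and the reduced types are assumptions for the example.

```typescript
type Direction = "ujsx→mdast" | "mdast→ujsx";
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}

// Maps `transform` over the nodes, giving each call a context whose `index`
// is the node's array position. Ancestors, direction, and metadata are reused.
function transformAll<TInput, TOutput, A>(
  nodes: readonly TInput[],
  ctx: TransformContext<A>,
  transform: (node: TInput, ctx: TransformContext<A>) => TOutput,
): TOutput[] {
  return nodes.map((node, index) => transform(node, { ...ctx, index }));
}

const listCtx: TransformContext<string> = {
  ancestors: ["list"],
  index: 0,
  direction: "ujsx→mdast",
  metadata: {},
};
const numbered = transformAll(
  ["a", "b", "c"],
  listCtx,
  (item, itemCtx) => `${itemCtx.index}:${item}`,
);
```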
## Helper Functions
The transform module exports two context factory functions used alongside `TransformRegistry`:
### `ctx<A>(direction, ancestors?, index?, metadata?)`
Creates a `TransformContext` from its arguments. The `direction` parameter is required; the rest default to `[]`, `0`, and `{}` respectively. Exported as `transformCtx` from the barrel (`@alkdev/ujsx/transform`) to avoid name collision with React's `ctx` naming.
### `childCtx<A>(parent, ctx, index)`
Creates a new `TransformContext` with `parent` pushed onto the ancestors stack and `index` set. This is the standard way to descend into a child node during transformation — it preserves the direction and metadata from the parent context while updating traversal state.
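Both helpers are small enough to sketch from their descriptions. The version below is an approximation, not the module's source: it uses the barrel name `transformCtx` for the first helper, and it assumes `childCtx` copies the ancestor array and shares the `metadata` object by reference.

```typescript
type Direction = "ujsx→mdast" | "mdast→ujsx";
interface TransformContext<A = unknown> {
  ancestors: A[];
  index: number;
  direction: Direction;
  metadata: Record<string, unknown>;
}

// Only `direction` is required; the rest default to [], 0, and {}.
function transformCtx<A = unknown>(
  direction: Direction,
  ancestors: A[] = [],
  index = 0,
  metadata: Record<string, unknown> = {},
): TransformContext<A> {
  return { ancestors, index, direction, metadata };
}

// Descend one level: append `parent` (root-first order) and set the child index.
// Direction and metadata carry over from the incoming context.
function childCtx<A>(parent: A, ctx: TransformContext<A>, index: number): TransformContext<A> {
  return { ...ctx, ancestors: [...ctx.ancestors, parent], index };
}

const rootCtx = transformCtx<string>("ujsx→mdast");
const docCtx = childCtx("doc", rootCtx, 0);
const sectionCtx = childCtx("section", docCtx, 3);
```

After these calls, `sectionCtx.ancestors` is `["doc", "section"]` and `sectionCtx.index` is `3`, while `rootCtx.ancestors` is still empty.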
## Direction Definitions
The six directions pair into three bi-directional channels:
| Channel | Forward | Reverse |
|---------|---------|---------|
| Markdown | `ujsx→mdast` | `mdast→ujsx` |
| JSON Path | `ujsx→jpath` | `jpath→ujsx` |
| HTML | `ujsx→hast` | `hast→ujsx` |
These are the directions currently defined. Additional directions (e.g., `ujsx→dom`, `dom→ujsx`) can be added by extending the `Direction` union in `context.ts`. The registry itself is generic and does not enumerate directions.
The `→` character in direction strings is Unicode U+2192 (RIGHTWARDS ARROW), chosen for readability over alternatives like `ujsx-to-mdast`. See [host-config.md](host-config.md) for full documentation of `Direction`, `Density`, and `RenderContext`.
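For reference, the six directions can be written as a string-literal union, and the arrow can be typed with the `\u2192` escape when it is awkward to enter directly. The union below is an assumed rendering of the type in `context.ts`; `ExtendedDirection` and `reverseDirection` are hypothetical and exist only to illustrate the channel pairing.

```typescript
// Assumed shape of the union; the real definition lives in context.ts.
type Direction =
  | "ujsx→mdast" | "mdast→ujsx"
  | "ujsx→jpath" | "jpath→ujsx"
  | "ujsx→hast" | "hast→ujsx";

// Hypothetical extension with a DOM channel.
type ExtendedDirection = Direction | "ujsx→dom" | "dom→ujsx";

// U+2192 RIGHTWARDS ARROW; the escape and the literal character are the same string.
const ARROW = "\u2192";
const viaEscape: Direction = "ujsx\u2192mdast";

// Hypothetical helper: swap the two halves to get the other side of a channel.
function reverseDirection(direction: Direction): Direction {
  const [from, to] = direction.split(ARROW);
  return `${to}${ARROW}${from}` as Direction;
}

const domDirection: ExtendedDirection = "ujsx→dom";
```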
## Known Gaps
### No transform composition beyond next
Rules can only delegate via `next`. There is no mechanism for a rule to compose multiple sub-rules (e.g., "transform children using rule X, then apply my own logic"). Such composition must be done in application code, outside the registry.
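One way to do that composition in application code is a higher-order function that takes the child converter as a parameter. This is a hypothetical pattern, not part of the registry; `SketchNode`, `Convert`, and `bulletList` are names made up for the sketch.

```typescript
// Hypothetical node shape with optional children.
interface SketchNode {
  type: string;
  text?: string;
  children?: SketchNode[];
}
type Convert = (node: SketchNode) => string;

// "Transform children using X, then apply my own logic":
// the child converter is injected, and the list formatting happens afterwards.
function bulletList(convertItem: Convert): Convert {
  return (node) =>
    (node.children ?? [])
      .map((child) => convertItem(child))
      .map((line) => `- ${line}`)
      .join("\n");
}

const plainText: Convert = (node) => node.text ?? "";
const listToMarkdown = bulletList(plainText);
const markdownList = listToMarkdown({
  type: "list",
  children: [
    { type: "item", text: "first" },
    { type: "item", text: "second" },
  ],
});
```

In practice `convertItem` could be a closure over a registry's `transform()` and a child context, so the composition stays outside the registry while still reusing its rules.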
### No built-in error recovery
If `transform()` throws, the entire traversal aborts. There is no fallback rule, no "best effort" mode, and no way to skip a node and continue. Rule authors must handle their own error cases within the `transform` function.
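Since recovery is left to rule authors, one option is to wrap a conversion function so that a thrown error yields a placeholder instead of aborting the traversal. The wrapper below is a hypothetical sketch; `withFallback`, `strictText`, and the placeholder format are not part of the library.

```typescript
// Wraps `convert` so that a thrown error is turned into a fallback value.
function withFallback<TInput, TOutput>(
  convert: (node: TInput) => TOutput,
  fallback: (node: TInput, error: unknown) => TOutput,
): (node: TInput) => TOutput {
  return (node) => {
    try {
      return convert(node);
    } catch (error) {
      return fallback(node, error);
    }
  };
}

// Hypothetical node shape and a converter that throws on missing text.
interface SketchNode {
  type: string;
  text?: string;
}
const strictText = (node: SketchNode): string => {
  if (node.text === undefined) {
    throw new Error(`node of type ${node.type} has no text`);
  }
  return node.text;
};

const lenientText = withFallback(strictText, (node) => `<!-- skipped ${node.type} -->`);
const sampleNodes: SketchNode[] = [{ type: "paragraph", text: "ok" }, { type: "image" }];
const recovered = sampleNodes.map((node) => lenientText(node));
```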
### No caching or memoization
`transform()` performs a linear scan of rules on every call. For large rule sets or deep trees, this could become a bottleneck. No caching of match results or memoization of previous transforms is implemented.
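If this became a problem, callers could memoize outside the registry. The sketch below caches results per node object in a `WeakMap`; `memoizeByNode` is a hypothetical helper, and it is only valid when the output depends on the node alone and not on `ctx` (ancestors, index, direction, or metadata).

```typescript
// Caches one result per node object. Entries are released with the node
// because WeakMap keys do not keep their objects alive.
function memoizeByNode<TInput extends object, TOutput>(
  convert: (node: TInput) => TOutput,
): (node: TInput) => TOutput {
  const cache = new WeakMap<TInput, TOutput>();
  return (node) => {
    if (cache.has(node)) {
      return cache.get(node) as TOutput;
    }
    const result = convert(node);
    cache.set(node, result);
    return result;
  };
}

// Count how often the underlying conversion actually runs.
let conversions = 0;
const slowConvert = (node: { text: string }): string => {
  conversions += 1;
  return node.text.toUpperCase();
};

const cachedConvert = memoizeByNode(slowConvert);
const sharedNode = { text: "hello" };
const firstResult = cachedConvert(sharedNode);
const secondResult = cachedConvert(sharedNode);
```

The second call returns the cached string without running `slowConvert` again; a structurally equal but distinct object would be converted afresh, because the cache is keyed by identity.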
## Constraints
- **Direction is a string union** — not an enum or extensible type. Adding a new direction requires modifying the `Direction` type in `context.ts`. This is intentional: directions define the conversion contract and should be explicitly enumerated.
- **Priority is numeric** — rules with equal priority are checked in registration order, and nothing else breaks the tie. Rule authors should assign distinct priorities when order matters rather than depend on registration order.
- **The registry is generic** — it has no knowledge of `UNode`, `UElement`, or any UJSX type. The same registry class could transform between any tree formats. UJSX-specific semantics live in the rules, not in the registry.
- **matchesSchema is a standalone function** — it is not called automatically by `transform()`. Rule authors opt into schema matching by including it in their `match` predicate.
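- **next is a recursive call** — calling `next` from within a rule's `transform` re-enters the registry's `transform()`, which is how the next matching rule is found. The lookup starts over from the highest-priority rule, so if the node passed to `next` still satisfies the current rule's `match` predicate, the same rule is selected again and the recursion never terminates. Because `match` receives only the node, a delegating rule must ensure that its own predicate rejects the node it hands to `next`.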
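One way to satisfy the recursion constraint is to mark the node before delegating so that the delegating rule's predicate rejects it on re-entry. The sketch below uses a stripped-down dispatcher without a context argument; `dispatch`, `traceRule`, and the `traced` flag are hypothetical and only illustrate the guard.

```typescript
// Hypothetical node shape; `traced` is a marker set by the delegating rule.
interface SketchNode {
  type: string;
  traced?: boolean;
}
type Next = (node: SketchNode) => string;
interface SketchRule {
  name: string;
  match: (node: SketchNode) => boolean;
  transform: (node: SketchNode, next: Next) => string;
}

// First match wins; `next` re-enters the lookup from the top of the list.
function dispatch(rules: SketchRule[], node: SketchNode): string {
  const rule = rules.find((candidate) => candidate.match(node));
  if (!rule) {
    throw new Error("no matching rule");
  }
  return rule.transform(node, (nextNode) => dispatch(rules, nextNode));
}

// Delegating rule: matches only unmarked nodes and marks the node it passes on,
// so the re-entrant lookup skips this rule.
const traceRule: SketchRule = {
  name: "trace",
  match: (node) => node.traced !== true,
  transform: (node, next) => `[trace]${next({ ...node, traced: true })}`,
};

// Terminal rule: handles any node and never calls `next`.
const typeNameRule: SketchRule = {
  name: "type-name",
  match: () => true,
  transform: (node) => node.type,
};

const traced = dispatch([traceRule, typeNameRule], { type: "heading" });
```

Without the `traced` marker, `traceRule` would match its own delegated node and recurse until the stack overflowed.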
## References
- Source: `src/transform/registry.ts`
- Direction type: `src/core/context.ts`
- TypeBox Value.Check: `@alkdev/typebox`