docs: resolve architecture open questions, add type definitions, consolidate docs

Architecture review session resolving all high-priority open questions and
filling documentation gaps identified during review:

Decisions resolved:
- OQ-04: Flat props with inner escape hatch for column validation (ADR-007)
- OQ-05: PG enum pre-declaration returns enums and tables (ADR-008)
- OQ-06: Render results accumulate in root.ctx (resolved in hosts.md)
- Column references vs fk: references is shorthand, explicit fk takes
  precedence (ADR-006)
- ADR-001, 002, 003 promoted from Proposed to Accepted (probe-validated)

Documentation improvements:
- Complete DbColumnType mapping tables for all 14 types across 3 dialects
- Define ColumnMeta, TableMeta, IndexMeta, FkMeta types in elements.md
- Document inner prop, mode prop, and default prop semantics
- Add PgRootCtx, SqliteRootCtx, MySqlRootCtx context types
- Consolidate schema.md and module.md (remove duplication)
- Add end-to-end pipeline walkthrough to README
- Add glossary with 13 terms
- Add error handling strategy
- Remove duplicate content from hosts.md (cross-ref elements.md)
2026-05-23 12:06:51 +00:00
parent 4644e1b362
commit d4fd67f4d2
12 changed files with 476 additions and 221 deletions


@@ -2,7 +2,7 @@
## Status
-Proposed
+Accepted
## Context


@@ -2,7 +2,7 @@
## Status
-Proposed
+Accepted
## Context


@@ -2,7 +2,7 @@
## Status
-Proposed
+Accepted
## Context


@@ -0,0 +1,53 @@
# ADR-006: Column `references` as FK Shorthand
## Status
Accepted
## Context
The `<column>` element has a `references` prop (`string`) that specifies a foreign key target table name. There is also a separate `<fk>` element with full FK specification (`columns`, `references`, `foreignColumns`, `onDelete`, `onUpdate`).
Two mechanisms exist for expressing foreign keys with different capabilities:
- `<column references="users">` — only specifies target table, no composite FK support, no ON DELETE/UPDATE
- `<fk>` — full FK specification with composite support and actions
This creates ambiguity: which takes precedence? Can both be used simultaneously? What if they conflict?
## Decision
`references` on `<column>` is a **shorthand for simple single-column foreign keys** that targets the referenced table's primary key by convention. `<fk>` is the **explicit form** for composite foreign keys and FKs with ON DELETE/ON UPDATE actions.
Internally, `extractTable()` normalizes all `references` props into `FkMeta` entries alongside `<fk>` elements. When a column carries `references` and an explicit `<fk>` also lists that column in its `columns`, the explicit `<fk>` takes precedence and overrides the shorthand.
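The normalization and precedence rule can be sketched as follows. This is a minimal illustration, not the actual `extractTable()` code: `ColumnInput`, `normalizeFks`, and the `primaryKeyOf` lookup are hypothetical names, and only the `FkMeta` fields shown in this ADR are taken from the source.

```typescript
// Hypothetical shapes: the FkMeta field names follow this ADR's example,
// everything else is assumed for illustration.
interface FkMeta {
  columns: string[];
  references: string;
  foreignColumns: string[];
  onDelete?: string;
  onUpdate?: string;
}

interface ColumnInput {
  name: string;
  references?: string; // shorthand: target table name
}

// Merge shorthand and explicit FKs into one FkMeta list.
// primaryKeyOf is an assumed lookup for the target table's primary key column.
function normalizeFks(
  columns: ColumnInput[],
  explicitFks: FkMeta[],
  primaryKeyOf: (table: string) => string,
): FkMeta[] {
  // Record every source column that an explicit <fk> already covers.
  const covered: Record<string, boolean> = {};
  for (const fk of explicitFks) {
    for (const col of fk.columns) covered[col] = true;
  }

  const result: FkMeta[] = explicitFks.slice();
  for (const col of columns) {
    // Skip columns without shorthand, and shorthand overridden by an explicit <fk>.
    if (col.references === undefined || covered[col.name] === true) continue;
    result.push({
      columns: [col.name],
      references: col.references,
      foreignColumns: [primaryKeyOf(col.references)],
    });
  }
  return result;
}

// userId keeps its shorthand entry; orgId is covered by the explicit <fk>.
const fks = normalizeFks(
  [
    { name: 'userId', references: 'users' },
    { name: 'orgId', references: 'orgs' },
  ],
  [{ columns: ['orgId'], references: 'orgs', foreignColumns: ['id'], onDelete: 'cascade' }],
  () => 'id',
);
```

In this sketch, precedence is decided per source column, so a composite `<fk>` overrides the shorthand on each column it lists.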
## Rationale
1. **Simple FKs are the common case**: Most foreign keys are single-column references to another table's primary key. `references="users"` covers this case ergonomically without requiring a separate `<fk>` element.
2. **Composition is additive**: `references` props and `<fk>` elements both produce `FkMeta` entries. There's no separate representation for "simple" vs "complex" FKs in the output — they're unified.
3. **Explicit overrides implicit**: When both forms specify the same relationship, the explicit `<fk>` form is more complete and should take precedence. This allows shorthand to be overridden without conflict.
4. **Convention over configuration**: `references` assumes the target is the primary key. This convention covers 95% of FK relationships. For the remaining cases (non-PK targets, composite FKs), `<fk>` provides full control.
5. **`extractTable()` normalizes**: Both forms produce the same `FkMeta` output, so downstream consumers (hosts, module, repo adapter) don't need to handle two different representations.
## Consequences
### Positive
- Ergonomic shorthand for the most common FK pattern
- No ambiguity about precedence — explicit `<fk>` always wins
- Unified internal representation (`FkMeta`) regardless of authoring style
- `<column name="userId" references="users">` maps to `{ columns: ['userId'], references: 'users', foreignColumns: ['id'] }` by convention (assuming `id` is the primary key of `users`)
### Negative
- `references` on `<column>` cannot specify target columns or ON DELETE/UPDATE — must use `<fk>` for those
- The convention that `references` targets the primary key must be documented and understood
- Two ways to express the same thing (simple FKs) — could confuse new users until they learn the convention
## References
- [elements.md](../elements.md) — column and fk element definitions
- [repo-adapter.md](../repo-adapter.md) — FK metadata consumption


@@ -0,0 +1,65 @@
# ADR-007: Flat Props with `inner` Escape Hatch for Column Validation
## Status
Accepted
## Context
Column elements carry both database metadata (notNull, primaryKey, default) and validation semantics (type, format, maxLength). The question is how to handle TypeBox validation constraints that don't have dedicated column props.
Option A (flat props only) limits validation to what has explicit props: `format`, `maxLength`, `minLength`, `pattern`, etc. Each validation constraint requires a new column prop. Custom validation beyond these requires manipulating the module entry directly.
Option B (flat props + `inner` escape hatch) uses flat props for the common 90% of cases and provides an `inner` prop that accepts a full TypeBox schema to override the auto-generated one. The host ignores `inner` — it's purely for the TypeBox schema.
Option C (inner-first) makes the TypeBox schema primary and DB metadata secondary. This is close to the `DbTypeBuilder` pattern that ADR-001 specifically rejected.
The research docs (`docs/research/architecture.md`) proposed `DbType.String({ notNull: true, inner: Type.String({ format: 'email', maxLength: 255 }) })`. With UJSX elements, this becomes `<column name="email" type="string" notNull inner={Type.String({ format: 'email', maxLength: 255 })} />`.
## Decision
Use flat props for common cases with an `inner` escape hatch for custom TypeBox schemas (Option B).
- Common validation constraints (`format`, `maxLength`, `minLength`, `pattern`, `values`) are top-level column props
- The `inner` prop accepts a TypeBox schema that overrides the auto-generated `colToTypeBox()` result
- When `inner` is provided, `extractTable()` uses it directly instead of calling `colToTypeBox(type, props)`
- The host ignores `inner` — database rendering uses only `type` and the other DB metadata props
- When `inner` is absent, `colToTypeBox(type, props)` generates the TypeBox schema from the column type and props
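The conditional described above can be sketched as follows. To stay self-contained, the sketch uses a plain-object stand-in (`SchemaLike`) instead of real TypeBox schemas, and `resolveColumnSchema` is a hypothetical name for the branch inside `extractTable()`. The body of `colToTypeBox` here is an assumed simplification, not the real mapping.

```typescript
// Stand-in for a TypeBox schema; a real implementation would use TypeBox's schema type.
type SchemaLike = Record<string, unknown>;

interface ColumnProps {
  name: string;
  type: 'string' | 'integer' | 'boolean';
  notNull?: boolean;
  format?: string;
  maxLength?: number;
  inner?: SchemaLike; // escape hatch: full schema override
}

// Assumed simplification of colToTypeBox(): copy the flat validation props
// onto a schema object keyed by the column type.
function colToTypeBox(type: ColumnProps['type'], props: ColumnProps): SchemaLike {
  const schema: SchemaLike = { type };
  if (props.format !== undefined) schema['format'] = props.format;
  if (props.maxLength !== undefined) schema['maxLength'] = props.maxLength;
  return schema;
}

// Hypothetical helper: `inner` wins when present, otherwise derive from flat props.
function resolveColumnSchema(props: ColumnProps): SchemaLike {
  return props.inner !== undefined ? props.inner : colToTypeBox(props.type, props);
}

// Flat props only: the schema is generated.
const generated = resolveColumnSchema({ name: 'email', type: 'string', format: 'email' });

// `inner` present: flat validation props such as `format` are ignored for the schema.
const overridden = resolveColumnSchema({
  name: 'embedding',
  type: 'string',
  format: 'email',
  inner: { type: 'array', minItems: 1536, maxItems: 1536 },
});
```

Note that the override is all-or-nothing in this sketch: flat validation props are not merged into `inner`.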
## Rationale
1. **Ergonomics for the common case**: Most columns only need `type`, `notNull`, `primaryKey`, `default`, and maybe `format`. Flat props (`<column name="email" type="string" notNull format="email" />`) are concise and readable.
2. **Escape hatch for complex validation**: Some columns need validation constraints that don't map to column props: `minimum`/`maximum` for numeric ranges, `contentMediaType` for embeddings, custom format validators. The `inner` prop lets you express any TypeBox validation without polluting the column props namespace.
3. **Host independence**: The host renders columns to Drizzle based on `type` and DB metadata. `inner` doesn't affect rendering at all. This means adding `inner` doesn't require host changes: it is purely a TypeBox concern.
4. **Future-proofing**: Embedding vectors, custom types, and complex validation patterns will need `inner`. For instance, a vector column might be `<column name="embedding" type="string" mode="json" inner={Type.Array(Type.Number(), { minItems: 1536, maxItems: 1536 })} />` — the host still stores it as JSON, but the TypeBox schema validates the array structure.
5. **Probe validated**: The probe scripts use flat props exclusively. `inner` adds a single conditional in `extractTable()` — if `inner` is provided, use it; otherwise, call `colToTypeBox()`. No structural change to the pipeline.
6. **No alternative duplication**: Without `inner`, consumers who need `Type.String({ format: 'email', maxLength: 255, pattern: '^[a-z]' })` would have to manipulate the module entry after extraction, which defeats the purpose of a schema-first element tree.
## Consequences
### Positive
- Flat props stay ergonomic for the 90% case
- Any TypeBox validation is expressible via `inner` without new column props
- Hosts don't need changes: `inner` is a TypeBox concern only
- Future column types (vectors, custom point types) can embed complex validation
- Implementation cost is minimal: one conditional in `extractTable()`
### Negative
- Two ways to express validation (flat props vs `inner`) — could confuse new users
- `inner` TypeBox schemas are not reflected in column props — introspecting a column's validation requires looking at both `props` and `props.inner`
- The `colToTypeBox()` function must still exist for the no-`inner` case, and its output must be compatible with what `inner` would provide
- When `inner` is provided alongside flat validation props (`format`, `maxLength`), `inner` takes precedence. This must be clearly documented.
## References
- [elements.md](../elements.md) — Column element definition and `inner` prop
- [ADR-001](001-ujsx-as-ir.md) — UJSX as the IR (rejected separate builder API)
- [ADR-004](004-format-annotation-only.md) — Format as annotation (relevant: `format` is a flat prop, `inner` is for when it's not enough)
- Research: `docs/research/architecture.md` (DbTypeBuilder inner pattern)


@@ -0,0 +1,73 @@
# ADR-008: PG Enum Pre-declaration — Return Both Enums and Tables
## Status
Accepted
## Context
With Drizzle's PostgreSQL dialect, `pgEnum()` must be declared at module scope before any table that references it, because a PostgreSQL enum is a standalone type. SQLite uses `text({ enum: [...] })` inline. MySQL uses `mysqlEnum()` inline. This creates a structural difference in the render output: PG enums must be declared separately.
Three options were considered:
**Option A**: Return both enums and tables from render. The PG host context accumulates enum declarations during the render walk and exposes them alongside tables.
**Option B**: Use `text()` for all enums initially, add native PG enum support later as an opt-in.
**Option C**: Per-column opt-in with `postgres: { nativeEnum: true }` — only generates `pgEnum` when explicitly requested.
## Decision
Use Option A: the PG host returns both enums and tables from the render context.
The PG host context shape after rendering:
```typescript
interface PgRootCtx {
  dialect: 'pg'
  tables: Record<string, PgTable>
  enums: Record<string, PgEnum> // Accumulated during render
}
```
When a `<column type="enum" values={['happy', 'sad', 'neutral']}>` is encountered, the PG host:
1. Registers the enum in `ctx.enums` (using the table and column name to derive a unique enum name)
2. Uses the registered enum in the column builder

The consumer then includes both `ctx.enums` and `ctx.tables` in their Drizzle schema.
SQLite and MySQL hosts use `ctx.tables` only — no `enums` key.
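The registration step might look like the sketch below. It is an assumption-laden illustration: `EnumDecl` stands in for Drizzle's `PgEnum`, `PgRootCtxSketch` mirrors the `PgRootCtx` shape above with placeholder value types, and `registerEnum` is a hypothetical helper name. Only the `table_column` naming rule and the `ctx.enums` accumulation come from this ADR.

```typescript
// Stand-in for a Drizzle PgEnum; a real host would call pgEnum(name, values).
interface EnumDecl {
  name: string;
  values: string[];
}

// Mirrors the PgRootCtx shape, with placeholder value types.
interface PgRootCtxSketch {
  dialect: 'pg';
  tables: Record<string, unknown>;
  enums: Record<string, EnumDecl>;
}

// Hypothetical helper: derive the enum name from table and column,
// register it once in ctx.enums, and return the declaration for the column builder.
function registerEnum(
  ctx: PgRootCtxSketch,
  table: string,
  column: string,
  values: string[],
): EnumDecl {
  const name = `${table}_${column}`;
  const existing = ctx.enums[name];
  if (existing !== undefined) return existing; // already registered during this walk
  const decl: EnumDecl = { name, values: values.slice() };
  ctx.enums[name] = decl;
  return decl;
}

const ctx: PgRootCtxSketch = { dialect: 'pg', tables: {}, enums: {} };
const moodEnum = registerEnum(ctx, 'users', 'mood', ['happy', 'sad', 'neutral']);
```

Returning the existing declaration on a repeated call is a design choice of this sketch. The ADR does not say how the host handles a second registration under the same derived name.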
## Rationale
1. **Correct PG behavior**: PG enums are a separate type declaration (`CREATE TYPE ... AS ENUM`). The render output must include them for the schema to be valid.
2. **No information loss**: Option B (use `text()` for all) loses PG's native enum validation at the database level. The whole point of defining schema types is to get correct database behavior per dialect.
3. **Clean API**: The host context already accumulates `tables`. Adding `enums` to the same context is natural and doesn't change the render pattern — it just adds a second accumulation target.
4. **SQLite/MySQL unaffected**: These dialects don't have pre-declared enums. Their host contexts don't include `enums`, so the API is dialect-appropriate.
5. **Consistent with resolved OQ-06**: The rendering pipeline accumulates results in `root.ctx`. PG enums are auxiliary state, just like tables. The context shape varies by dialect, which is expected.
6. **Deriving enum names**: For a column `status` on table `users`, the PG enum name is `users_status` (table_column convention). This is the same convention Drizzle uses for inline enums.
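The dialect-dependent context shape lends itself to a discriminated union on `dialect`. The sketch below is illustrative only: the `'sqlite'` and `'mysql'` literals, the `unknown` value types, and the `schemaExports` helper are assumptions, since this ADR only specifies the PG shape.

```typescript
// Only `dialect: 'pg'`, `tables`, and `enums` come from this ADR.
// The 'sqlite' and 'mysql' literals are assumed.
interface PgCtx {
  dialect: 'pg';
  tables: Record<string, unknown>;
  enums: Record<string, unknown>;
}
interface SqliteCtx {
  dialect: 'sqlite';
  tables: Record<string, unknown>;
}
interface MySqlCtx {
  dialect: 'mysql';
  tables: Record<string, unknown>;
}
type RootCtx = PgCtx | SqliteCtx | MySqlCtx;

// Hypothetical helper: collect everything a consumer's schema file would export.
function schemaExports(ctx: RootCtx): Record<string, unknown> {
  if (ctx.dialect === 'pg') {
    // Narrowed to PgCtx here, so `enums` is available.
    return { ...ctx.enums, ...ctx.tables };
  }
  return { ...ctx.tables };
}
```

With this shape, code that touches `enums` only compiles inside the `'pg'` branch. If an enum and a table ever shared a key, the table would win in this sketch because it is spread last.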
## Consequences
### Positive
- Correct PG enum support from day one
- No opt-in flag needed — enums just work when `type="enum"` is used with the PG host
- Dialect-appropriate context: PG has `enums + tables`, SQLite/MySQL have `tables` only
- Consistent with the context accumulation pattern
### Negative
- PG consumers must include `ctx.enums` in their schema file (alongside `ctx.tables`)
- The derived enum naming convention (`table_column`) could conflict with explicit enum names in the future — a naming override prop may be needed later
- Slightly more complex PG host implementation (tracking enum registrations during the walk)
## References
- [hosts.md](../hosts.md) — Host rendering pipeline and PG column type mapping
- [elements.md](../elements.md) — Column element `type="enum"` and `values` props
- [open-questions.md](../open-questions.md) — Original OQ-05