Add SDD architecture docs for dbtype

Phase 0 architecture specification following the alkdev documentation
pattern from @alkdev/flowgraph. Documents the validated architecture
(UJSX elements → Type.Module → Drizzle hosts) based on e2e probe results.

Docs added:
- README: Project overview, architecture, current state
- architecture/README: Index, design decisions, relationships
- architecture/schema: Type.Module as bundle, construction, serialization
- architecture/hosts: HostConfig per dialect, column mapping, symbolic defaults
- architecture/elements: UJSX element types, props, function components
- architecture/module: Module mechanics, format registration, diffing
- architecture/repo-adapter: from-dbtype operations adapter (phase 2)
- architecture/build-distribution: Package structure, exports
- architecture/open-questions: 10 open questions across all topics
- ADRs 001-005: UJSX as IR, Type.Module, HostConfig, format, repo adapter
2026-05-22 11:34:58 +00:00
parent 6fe84e1a53
commit dd2ec9df3c
14 changed files with 1447 additions and 48 deletions

docs/architecture/schema.md

@@ -0,0 +1,166 @@
---
status: draft
last_updated: 2026-05-22
---
# Schema: Type.Module as the Schema Bundle
How dbtype uses `Type.Module` to store all table schemas, relations, and derived schemas in a single namespace with automatic `Type.Ref` resolution.
## Overview
The `Type.Module` is the central data structure in dbtype. It holds every table's TypeBox schema, all cross-table relations, and derived schemas (insert, update, select variants) in one flat namespace. `Type.Ref` resolves forward and circular references naturally, eliminating the need for separate relation files or import-order management.
The module is also the serialization boundary: `JSON.stringify(module.Import('Users'))` produces valid JSON Schema with `$defs`, enabling migration diffing via `Value.Diff`.
## Construction
### From Element Tree to Module
The element tree (`<table>`, `<column>`) is walked to extract a `Record<string, TSchema>` map, then compiled into a module:
```
UJSX elements → extractTable() → { name, schema, columns } → defs map → Type.Module(defs)
```
Each `<column>` element produces a TypeBox type based on its `type` prop:
| Column Type Prop | TypeBox Schema |
|-----------------|---------------|
| `uuid` | `Type.String({ format: 'uuid' })` |
| `string` | `Type.String()` |
| `integer` | `Type.Integer()` |
| `boolean` | `Type.Boolean()` |
| `timestamp` | `Type.Number()` |
| `enum` | `Type.Union(values.map(v => Type.Literal(v)))` |
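
The mapping in the table can be sketched without a TypeBox dependency by emitting the JSON Schema fragments that the listed TypeBox calls serialize to. This is only an illustration: `ColumnProps`, `columnToSchema`, and `tableToSchema` are hypothetical names, not dbtype's API, and the `values` prop for enum columns is an assumption.

```typescript
// Hypothetical column props; dbtype's real element props may differ.
type ColumnType = 'uuid' | 'string' | 'integer' | 'boolean' | 'timestamp' | 'enum'

interface ColumnProps {
  name: string
  type: ColumnType
  values?: string[] // assumed to be supplied only for `enum` columns
}

type JsonSchema = Record<string, unknown>

// Returns the JSON Schema fragment that the matching TypeBox call serializes to.
function columnToSchema(col: ColumnProps): JsonSchema {
  switch (col.type) {
    case 'uuid': return { type: 'string', format: 'uuid' }
    case 'string': return { type: 'string' }
    case 'integer': return { type: 'integer' }
    case 'boolean': return { type: 'boolean' }
    case 'timestamp': return { type: 'number' }
    case 'enum':
      // A union of string literals serializes to `anyOf` with one `const` per value.
      return { anyOf: (col.values ?? []).map((v) => ({ const: v, type: 'string' })) }
  }
}

// Builds an object schema for one table; every column is listed as required.
function tableToSchema(columns: ColumnProps[]): JsonSchema {
  const properties: Record<string, JsonSchema> = {}
  for (const col of columns) properties[col.name] = columnToSchema(col)
  return { type: 'object', properties, required: columns.map((c) => c.name) }
}
```

In the real pipeline the values would be TypeBox schemas rather than plain objects, so they can be passed to `Type.Module` directly.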
### Incremental Construction
The defs map is a plain `Record<string, TSchema>` — it can be built incrementally, mutated, and extended before compilation:
```typescript
const defs: Record<string, any> = {}
// Add tables one at a time
defs.Users = Type.Object({ id: Type.String({ format: 'uuid' }), name: Type.String() })
defs.Tasks = Type.Object({ id: Type.String({ format: 'uuid' }), userId: Type.String({ format: 'uuid' }), title: Type.String() })
// Add columns to an existing table
defs.Users = Type.Object({ ...defs.Users.properties, role: Type.String() })
// Add relations
defs.UsersRelations = Type.Object({ tasks: Type.Array(Type.Ref('Tasks')) })
defs.TasksRelations = Type.Object({ user: Type.Ref('Users') })
// Compile
const M = Type.Module(defs)
```
Once compiled, `M.Import(key)` returns a `TImport` schema with the full `$defs` namespace embedded.
## Schema Derivation
### Select Schema
The module entry as-is is the select schema. Every column is present; nullable columns become `Type.Union([innerType, Type.Null()])`.
### Insert Schema
Derive from the table entry by:
- Removing auto-generated primary keys (columns with `primaryKey: true` and `default` set)
- Making nullable columns and columns with defaults `Type.Optional`
- Keeping required (`notNull` without default) columns mandatory
This is implemented by adding a computed entry to the module:
```typescript
defs.InsertUsers = Type.Object({
  name: Type.String(),
  email: Type.String(),
  // id, createdAt, updatedAt omitted (auto-generated)
})
```
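
The three derivation rules listed above can be sketched as a small function over column metadata. `ColumnMeta` and `deriveInsertFields` are hypothetical names for illustration; the sketch applies the rules literally, so a non-key column with a default is kept as optional, and it only decides which fields appear and whether they are optional, leaving schema construction to the caller.

```typescript
// Hypothetical column metadata mirroring the props named in the rules above.
interface ColumnMeta {
  name: string
  primaryKey?: boolean
  notNull?: boolean
  default?: unknown
}

interface InsertField {
  name: string
  optional: boolean
}

function deriveInsertFields(columns: ColumnMeta[]): InsertField[] {
  return columns
    // Rule 1: drop auto-generated primary keys (primary key with a default).
    .filter((c) => !(c.primaryKey && c.default !== undefined))
    // Rules 2 and 3: nullable or defaulted columns become optional;
    // `notNull` columns without a default stay mandatory.
    .map((c) => ({
      name: c.name,
      optional: !c.notNull || c.default !== undefined,
    }))
}
```

Each resulting field could then be emitted as either a plain property or one wrapped in `Type.Optional` when building the `Insert…` entry.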
### Update Schema
All columns optional. Use `Type.Partial(Type.Ref('TableName'))`:
```typescript
defs.UpdateUsers = Type.Partial(Type.Ref('Users'))
```
### Filter Schema
Filter schemas expose per-column comparison operators derived from the column type. They are generated by the repo adapter, not the core module.
## Relations
Relations are stored as separate entries in the module, using `Type.Ref` to reference other tables:
```typescript
defs.UsersRelations = Type.Object({ tasks: Type.Array(Type.Ref('Tasks')) })
defs.TasksRelations = Type.Object({ user: Type.Ref('Users') })
```
This gives:
- **Type-safe validation**: `Value.Check(M.Import('UsersRelations'), { tasks: [...] })` validates the full nested structure
- **No circular import issues**: `Type.Ref` resolves within the module namespace regardless of definition order
- **Queryable structure**: The `$defs` map is enumerable — you can find all relations for a table by naming convention
- **Drizzle integration**: The repo adapter reads relation entries to generate `relations()` calls for drizzle's relational query builder
Foreign key metadata lives on the column element's `references` prop (`<column name="userId" type="uuid" references="users" />`), not in the relation entry. The relation entry only describes how one table sees the other side (for example, "a user has many tasks").
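
As a rough sketch of the "queryable structure" point, the relations of a table can be read from a serialized `$defs` map by looking up the `<Table>Relations` key. `relationsFor` and `RelationInfo` are hypothetical names, and the sketch assumes relation properties are either a bare `$ref` (one) or an array whose `items` is a `$ref` (many), as in the examples above.

```typescript
type Schema = { [key: string]: unknown }

interface RelationInfo {
  target: string // referenced table key, e.g. "Tasks"
  many: boolean  // true when the relation is an array of the target
}

function isRecord(v: unknown): v is Schema {
  return typeof v === 'object' && v !== null
}

// Finds the `<table>Relations` entry by naming convention and summarizes each relation.
function relationsFor(defs: Record<string, Schema>, table: string): Record<string, RelationInfo> {
  const result: Record<string, RelationInfo> = {}
  const entry = defs[`${table}Relations`]
  if (!entry || !isRecord(entry.properties)) return result
  for (const [name, prop] of Object.entries(entry.properties)) {
    if (!isRecord(prop)) continue
    const ref = prop.$ref
    const items = prop.items
    const itemRef = isRecord(items) ? items.$ref : undefined
    if (typeof ref === 'string') {
      result[name] = { target: ref, many: false }
    } else if (prop.type === 'array' && typeof itemRef === 'string') {
      result[name] = { target: itemRef, many: true }
    }
  }
  return result
}
```

A repo adapter could use a summary like this to decide between a one-style and a many-style relation when generating `relations()` calls.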
## Serialization
`JSON.stringify(M.Import('TableName'))` produces JSON Schema with `$defs`:
```json
{
  "$defs": {
    "Users": { "$id": "Users", "type": "object", "properties": { ... } },
    "Tasks": { "$id": "Tasks", "type": "object", "properties": { ... } },
    "UsersRelations": { "$id": "UsersRelations", "type": "object", "properties": { "tasks": { "items": { "$ref": "Tasks" }, "type": "array" } } }
  },
  "$ref": "Users"
}
```
Key properties:
- Each `$defs` entry gets an `$id` matching its key name
- `Type.Ref` leaves `$ref` pointers (not inlined) — consumers must resolve them
- The serialized form is valid JSON Schema
- `Value.Diff` produces structural edits between two serialized schemas (useful for migration diffing)
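
Because `$ref` pointers are left in place, a consumer of the serialized form needs a small lookup step. The following is a minimal sketch, assuming refs are bare key names that match the `$defs` keys as in the JSON above; `resolveRef` and `resolveRoot` are hypothetical helper names.

```typescript
type Schema = { [key: string]: unknown }

function isSchema(v: unknown): v is Schema {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
}

// Looks up a bare `$ref` name (e.g. "Tasks") in the bundle's `$defs`.
function resolveRef(bundle: Schema, ref: string): Schema {
  const defs = bundle.$defs
  const target = isSchema(defs) ? defs[ref] : undefined
  if (!isSchema(target)) throw new Error('Unresolved $ref: ' + ref)
  return target
}

// Resolves the root `$ref` of a serialized import to its table schema.
function resolveRoot(bundle: Schema): Schema {
  const ref = bundle.$ref
  if (typeof ref !== 'string') throw new Error('Bundle has no root $ref')
  return resolveRef(bundle, ref)
}
```

The same `resolveRef` call works for nested pointers such as the `items.$ref` inside a relation entry.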
## Migration Diffing
```typescript
const v1 = JSON.parse(JSON.stringify(M.Import('Users')))
// ... add a column to defs.Users, then recompile the module as M2 = Type.Module(defs) ...
const v2 = JSON.parse(JSON.stringify(M2.Import('Users')))
const edits = Value.Diff(v1, v2)
// edits: [{ type: 'insert', path: '/$defs/Users/properties/role', value: { type: 'string' } }, ...]
```
The edits use JSON Pointer paths. An `insert` edit under `/$defs/<Table>/properties/<column>` corresponds to a new column and can be translated to an `ALTER TABLE ... ADD COLUMN` statement.
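
One way this translation might look is sketched below. It only handles `insert` edits on column properties, uses the module key verbatim as the table name, and relies on a hypothetical `SQL_TYPES` mapping; in dbtype the real type mapping would presumably come from the dialect host, so treat every name here as an assumption.

```typescript
// Edit shapes matching the example output above.
type Edit =
  | { type: 'insert'; path: string; value: unknown }
  | { type: 'update'; path: string; value: unknown }
  | { type: 'delete'; path: string }

// Hypothetical JSON Schema type to SQL type mapping, for illustration only.
const SQL_TYPES: Record<string, string> = {
  string: 'text',
  integer: 'integer',
  boolean: 'boolean',
  number: 'double precision',
}

// Matches /$defs/<Table>/properties/<column>; JSON Pointer escapes (~0, ~1) are not handled.
const COLUMN_PATH = /^\/\$defs\/([^/]+)\/properties\/([^/]+)$/

function addColumnStatements(edits: Edit[]): string[] {
  const statements: string[] = []
  for (const edit of edits) {
    if (edit.type !== 'insert') continue
    const match = COLUMN_PATH.exec(edit.path)
    if (!match) continue
    const [, table, column] = match
    const value = edit.value
    const schemaType =
      typeof value === 'object' && value !== null ? (value as { type?: unknown }).type : undefined
    const sqlType = typeof schemaType === 'string' ? SQL_TYPES[schemaType] : undefined
    if (!sqlType) continue // skip schema shapes this sketch does not understand
    statements.push(`ALTER TABLE "${table}" ADD COLUMN "${column}" ${sqlType}`)
  }
  return statements
}
```

Edits that touch other parts of the schema, such as the `required` array, are ignored here and would need their own handling (for example, to decide on `NOT NULL`).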
## Constraints
- **Module keys must be unique** — two tables cannot have the same name in the same module
- **`Type.Ref` resolves within the module only** — no cross-module references without `Module.Import`
- **`Type.Ref` outside a module has `static: unknown`** — always access via `M.Import(key)` for proper type inference
- **Defs map is mutable until compiled** — once passed to `Type.Module`, mutations to the original map don't affect the compiled module
- **Format validation requires `FormatRegistry.Set`** — `uuid`, `email`, and other custom formats must be registered before `Value.Check` will enforce them
## Open Questions
1. **Should relation entries use a naming convention?** Currently `UsersRelations` / `TasksRelations`. Is this sufficient, or should relations be structured differently (e.g., a `relations` field on the table entry)?
2. **Derived schemas in the module or separate?** Insert/update schemas can be added as module entries (`InsertUsers`, `UpdateUsers`) or extracted by walking the module schema. Which is cleaner for the repo adapter?
3. **Should the module support multiple databases?** One module per database, or one module with all tables across all databases? Probably one per database namespace.
## References
- TypeBox Module API: `@alkdev/typebox` source — `type/module/module.ts`, `type/ref/ref.ts`
- Probe validation: `scripts/probe-e2e.ts`
- Research: `docs/research/architecture.md`