From 3034e6ebf8a9fee845eab18005c3ea0f70c06ad2 Mon Sep 17 00:00:00 2001 From: "glm-5.1" Date: Sat, 25 Apr 2026 12:14:39 +0000 Subject: [PATCH] docs: add architecture research for schema-first multi-dialect TypeBox/Drizzle bridge --- docs/research/architecture.md | 924 ++++++++++++++++++++++++++ docs/research/dizzle-column-diffs.md | 440 ++++++++++++ docs/research/typedef-kind-pattern.md | 399 +++++++++++ docs/research/typemap-architecture.md | 475 +++++++++++++ 4 files changed, 2238 insertions(+) create mode 100644 docs/research/architecture.md create mode 100644 docs/research/dizzle-column-diffs.md create mode 100644 docs/research/typedef-kind-pattern.md create mode 100644 docs/research/typemap-architecture.md diff --git a/docs/research/architecture.md b/docs/research/architecture.md new file mode 100644 index 0000000..13565db --- /dev/null +++ b/docs/research/architecture.md @@ -0,0 +1,924 @@ +--- +status: draft +last_updated: 2026-04-25 +--- + +# DrizzleBox Architecture: Schema-First Multi-Dialect TypeBox/Drizzle Bridge + +## Design Philosophy + +DrizzleBox bridges TypeBox and Drizzle ORM in both directions: + +1. **Drizzle → TypeBox** (current): Given a Drizzle table definition, produce a TypeBox validation schema. This is what drizzlebox does today. +2. **TypeBox → Drizzle** (new): Define a schema once using TypeBox-based custom kinds, then generate Drizzle table definitions for any supported dialect. Plugin authors write schemas; hosts decide storage backend. + +The key insight is that both directions share the same **DbType IR** — a set of custom TypeBox kinds that carry both validation semantics and database metadata in one schema object. This is the hub-and-spoke pattern: the IR is the hub, dialects are the spokes, and translations go through the IR rather than directly between formats. 
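As a toy illustration of why the hub matters: with a shared IR, supporting a new dialect means writing one transform, not a pairwise converter for every existing format. A minimal sketch, where the names (`IrColumn`, `columnType`, `toColumnSql`) are hypothetical stand-ins and not part of the actual API:

```typescript
// Toy model of the hub-and-spoke pattern: one shared IR, one transform per
// dialect. Adding a dialect means adding one entry to the map below.
// All names here are illustrative, not the real drizzlebox API.

type Dialect = 'sqlite' | 'postgres'

// Minimal stand-in for a DbType IR column.
interface IrColumn {
  kind: 'DbType:String' | 'DbType:Uuid'
  columnName: string
}

// Spokes: each dialect decides how an IR kind maps to a column type.
const columnType: Record<Dialect, (col: IrColumn) => string> = {
  sqlite: () => 'text', // SQLite has no native UUID type
  postgres: (col) => (col.kind === 'DbType:Uuid' ? 'uuid' : 'text'),
}

// Hub: every translation goes through the IR, never dialect-to-dialect.
function toColumnSql(col: IrColumn, dialect: Dialect): string {
  return `${col.columnName} ${columnType[dialect](col)}`
}

const id: IrColumn = { kind: 'DbType:Uuid', columnName: 'id' }
console.log(toColumnSql(id, 'sqlite'))   // id text
console.log(toColumnSql(id, 'postgres')) // id uuid
```

The same `IrColumn` value fans out to both dialects; only the spoke differs.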
+ +``` + TypeBox Validation + ↑ + │ validate / infer types + │ +┌─────────────────────────────────────────────────────────────────┐ +│ DbType IR (the hub) │ +│ Custom TypeBox Kinds: DbType:String, DbType:Integer, ... │ +│ Each carries inner validation schema + db metadata │ +└────────┬────────────────────────────────┬───────────────────────┘ + │ │ + │ toDrizzle(schema, 'sqlite') │ toDrizzle(schema, 'postgres') + │ │ + ▼ ▼ +┌───────────────────────────┐ ┌───────────────────────────┐ +│ SQLite Transform Module │ │ PostgreSQL Transform Module│ +│ @alkdev/drizzlebox/sqlite │ │ @alkdev/drizzlebox/pg │ +│ (peerDep: drizzle-orm │ │ (peerDep: drizzle-orm │ +│ sqlite-core only) │ │ pg-core only) │ +└───────────────────────────┘ └───────────────────────────┘ +``` + +### Principles + +1. **Schema is source of truth** — validation and database structure derive from the same definition +2. **Compose, don't replace** — DbType kinds wrap inner TypeBox schemas, they don't reimplement validation +3. **Common options first, overrides only when needed** — `primaryKey`, `notNull`, `unique` are cross-dialect; dialect-specific options only appear when they diverge +4. **Tree-shakeable by default** — import only the dialect you need; don't bundle sqlite transforms if you only use postgres +5. **Extensible** — plugin authors can register custom column types and transform rules +6. **Bidirectional eventually** — the IR enables both Drizzle→TypeBox and TypeBox→Drizzle, but we start with TypeBox→Drizzle + +## The DbType IR + +### Custom Kind Pattern + +DbType uses TypeBox's `[Kind]` symbol as the dispatch key, following the established `TypeDef:*` namespace convention. Each DbType kind wraps an inner TypeBox schema and attaches structured database metadata: + +```typescript +import { Kind, TypeRegistry, TSchema, Static } from '@alkdev/typebox' + +// The core pattern: compose, don't replace +export interface TDbColumn extends TSchema { + [Kind]: string // e.g. 
'DbType:String', 'DbType:Integer'
+  static: Static<TInner>   // TypeScript infers from inner schema
+  inner: TInner            // The TypeBox validation schema
+  columnName: string       // Set by DbType.Table, not by individual columns
+  db: DbColumnMeta         // Database metadata (cross-dialect + overrides)
+}
+```
+
+All column kinds share the same `TDbColumn` interface. The `[Kind]` value distinguishes them at runtime — `'DbType:String'`, `'DbType:Integer'`, `'DbType:Boolean'`, etc.
+
+**Why wrap instead of replace?** TypeBox's built-in types carry rich validation metadata (`format`, `pattern`, `minLength`, `minimum`, `maximum`). DbType preserves all of this in `inner` while layering database semantics in `db`. A `DbType.VarChar(255)` wraps `Type.String({ maxLength: 255 })` — when you call `Value.Check(dbSchema, value)`, validation delegates to the inner schema.
+
+### Metadata Structure
+
+```typescript
+interface DbColumnMeta {
+  // Cross-dialect options — apply to all dialects unless overridden
+  primaryKey?: boolean
+  notNull?: boolean
+  unique?: boolean
+  references?: DbReferences
+  default?: DbDefault
+
+  // Dialect-specific overrides — only set when they differ from the cross-dialect default
+  sqlite?: SqliteColumnOpts
+  postgres?: PgColumnOpts
+  mysql?: MySqlColumnOpts
+}
+
+interface DbReferences {
+  table: string
+  column: string
+  onDelete?: 'cascade' | 'set null' | 'restrict' | 'no action'
+  onUpdate?: 'cascade' | 'set null' | 'restrict' | 'no action'
+}
+
+// Symbolic defaults — each dialect translates these to native SQL
+type DbDefault =
+  | 'now'               // SQLite: strftime, PG: now(), MySQL: NOW()
+  | 'uuid'              // SQLite: (lower(hex(randomblob(16)))), PG: gen_random_uuid()
+  | 'autoincrement'     // SQLite: INTEGER PRIMARY KEY, PG: SERIAL, MySQL: AUTO_INCREMENT
+  | 'current_timestamp' // Alias for 'now' with timezone context
+  | SQL                 // Raw SQL expression (drizzle-orm's sql tag)
+```
+
+The key design choice: **`primaryKey`, `notNull`, `unique`, and `references` are cross-dialect by 
default**. You only specify `sqlite` or `postgres` overrides when a dialect needs different treatment. This eliminates the duplication problem from the original storage.md design: + +```typescript +// BEFORE (storage.md — duplicated options) +DbType.String({ sqlite: { primaryKey: true }, postgres: { primaryKey: true } }) + +// AFTER (this design — cross-dialect by default) +DbType.String({ primaryKey: true }) +``` + +When a dialect-specific override is needed, it merges with and can override the cross-dialect defaults: + +```typescript +// JSON storage: SQLite uses text({ mode: 'json' }), PG uses jsonb() +DbType.Array(DbType.String(), { mode: 'json' }) +// The transform for 'json' mode knows to use the right dialect-specific type +// No manual overrides needed for this case + +// When you DO need a dialect override: +DbType.String({ format: 'uuid', postgres: { type: 'uuid' } }) +// Default: text() everywhere, PG override: uuid() +``` + +### DbDefault: Symbolic Defaults + +SQL default expressions are inherently dialect-specific. 
Rather than requiring users to write both `` sql`(strftime('%s', 'now'))` `` and `` sql`now()` ``, we introduce symbolic defaults:
+
+| Symbol | SQLite | PostgreSQL | MySQL |
+|--------|--------|------------|-------|
+| `'now'` | `strftime('%s', 'now')` (as integer epoch) | `now()` (as timestamptz) | `NOW()` |
+| `'uuid'` | `lower(hex(randomblob(16)))` | `gen_random_uuid()` | `(UUID())` |
+| `'autoincrement'` | Implicit on `INTEGER PRIMARY KEY` | `SERIAL` type | `AUTO_INCREMENT` |
+| `'current_timestamp'` | `CURRENT_TIMESTAMP` | `CURRENT_TIMESTAMP` | `CURRENT_TIMESTAMP` |
+
+For cases not covered by symbolic defaults, raw SQL is available via the `sql` tag:
+
+```typescript
+DbType.String({ default: sql`(lower(hex(randomblob(4))))` })
+```
+
+### TDbTable
+
+Table definitions group columns and carry table-level options:
+
+```typescript
+export interface TDbTable extends TSchema {
+  [Kind]: 'DbType:Table'
+  tableName: string
+  columns: Record<string, TDbColumn>
+  indexes?: TDbIndex[]
+  constraints?: DbTableConstraints
+}
+
+export interface TDbIndex {
+  name: string
+  columns: string[]
+  unique?: boolean
+}
+```
+
+### DbTypeBuilder
+
+Following TypeBox's `Type` and TypeDef's `TypeDefBuilder` pattern, `DbTypeBuilder` provides factory methods:
+
+```typescript
+class DbTypeBuilder {
+  protected Create<TInner extends TSchema>(
+    kind: string,
+    inner: TInner,
+    opts: DbColumnOpts
+  ): TDbColumn<TInner> {
+    const { sqlite, postgres, mysql, ...common } = opts
+    return {
+      [Kind]: kind,
+      inner,
+      columnName: '', // Set by Table()
+      db: {
+        ...common,
+        ...(sqlite ? { sqlite } : {}),
+        ...(postgres ? { postgres } : {}),
+        ...(mysql ?
{ mysql } : {}),
+      },
+    }
+  }
+
+  String(opts: DbColumnOpts & StringDbOpts = {}): TDbColumn<TString> {
+    const { maxLength, format, ...dbOpts } = opts
+    const inner = Type.String({ maxLength, format })
+    return this.Create('DbType:String', inner, dbOpts)
+  }
+
+  Integer(opts: DbColumnOpts = {}): TDbColumn<TInteger> {
+    return this.Create('DbType:Integer', Type.Integer(), opts)
+  }
+
+  Boolean(opts: DbColumnOpts = {}): TDbColumn<TBoolean> {
+    return this.Create('DbType:Boolean', Type.Boolean(), opts)
+  }
+
+  Timestamp(opts: DbColumnOpts & TimestampDbOpts = {}): TDbColumn<TNumber> {
+    // Stored as Unix epoch seconds (number), validated as number
+    return this.Create('DbType:Timestamp', Type.Number(), opts)
+  }
+
+  Array<T extends TSchema>(items: T, opts: DbColumnOpts & { mode?: 'json' } = {}): TDbColumn<TArray<T>> {
+    return this.Create('DbType:Array', Type.Array(items), opts)
+  }
+
+  Object<T extends TProperties>(properties: T, opts: DbColumnOpts & { mode?: 'json' } = {}): TDbColumn<TObject<T>> {
+    return this.Create('DbType:Object', Type.Object(properties), opts)
+  }
+
+  Record<V extends TSchema>(values: V, opts: DbColumnOpts & { mode?: 'json' } = {}): TDbColumn<TRecord<TString, V>> {
+    return this.Create('DbType:Record', Type.Record(Type.String(), values), opts)
+  }
+
+  Any(opts: DbColumnOpts & { mode?: 'json' } = {}): TDbColumn<TUnknown> {
+    return this.Create('DbType:Any', Type.Unknown(), opts)
+  }
+
+  Enum<T extends string[]>(values: [...T], opts: DbColumnOpts = {}): TDbColumn<TUnion<TLiteral<T[number]>[]>> {
+    const inner = Type.Union(values.map(v => Type.Literal(v)))
+    return this.Create('DbType:Enum', inner, { ...opts, enumValues: values })
+  }
+
+  VarChar(maxLength: number, opts: DbColumnOpts = {}): TDbColumn<TString> {
+    return this.Create('DbType:VarChar', Type.String({ maxLength }), opts)
+  }
+
+  Uuid(opts: DbColumnOpts = {}): TDbColumn<TString> {
+    return this.Create('DbType:Uuid', Type.String({ format: 'uuid' }), opts)
+  }
+
+  /** Mark a column as optional (nullable in DB, excluded from insert schema) */
+  Optional<T extends TDbColumn>(column: T): T {
+    return { ...column, [TypeBox.Optional]: true } as T
+  }
+
+  Table(name: string, columns: Record<string, TDbColumn>, opts?: DbTableOpts): TDbTable {
+    const
namedColumns: Record<string, TDbColumn> = {}
+    for (const [key, col] of Object.entries(columns)) {
+      namedColumns[key] = { ...col, columnName: key }
+    }
+    return {
+      [Kind]: 'DbType:Table',
+      tableName: name,
+      columns: namedColumns,
+      indexes: opts?.indexes,
+      constraints: opts?.constraints,
+    }
+  }
+}
+
+export const DbType = new DbTypeBuilder()
+```
+
+### Kind Registration
+
+DbType kinds register with TypeBox's TypeRegistry so that `Value.Check()` and `Value.Parse()` work on DbType schemas:
+
+```typescript
+// Delegate validation to inner schema
+TypeRegistry.Set('DbType:String', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Integer', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Boolean', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Timestamp', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Array', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Object', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Record', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Any', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Enum', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:VarChar', (schema, value) => Value.Check(schema.inner, value))
+TypeRegistry.Set('DbType:Uuid', (schema, value) => Value.Check(schema.inner, value))
+// TDbTable validates each column
+TypeRegistry.Set('DbType:Table', (schema, value) => {
+  return Object.entries(schema.columns).every(
+    ([key, col]) => Value.Check(col, value[key])
+  )
+})
+```
+
+### TypeGuard
+
+A `DbGuard` namespace validates the structure of DbType schema objects (not values, but the schemas themselves):
+
+```typescript
+export namespace DbGuard {
+  export function TDbColumn(schema: unknown): schema is TDbColumn {
+    return IsObject(schema)
+      && Kind in schema
+      &&
typeof schema[Kind] === 'string'
+      && (schema[Kind] as string).startsWith('DbType:')
+      && IsObject(schema['db'])
+      && TypeGuard.TSchema(schema['inner'])
+  }
+
+  export function TDbTable(schema: unknown): schema is TDbTable {
+    return IsObject(schema)
+      && schema[Kind] === 'DbType:Table'
+      && typeof schema['tableName'] === 'string'
+      && IsObject(schema['columns'])
+  }
+
+  // ... specific Kind guards
+}
+```
+
+## Dialect Transforms
+
+### Module Structure (Tree-Shakeable)
+
+```
+@alkdev/drizzlebox/
+  src/
+    index.ts            # DbType IR, builder, guard, registry
+    dbtype/
+      types.ts          # TDbColumn, TDbTable, DbColumnMeta interfaces
+      builder.ts        # DbTypeBuilder class
+      guard.ts          # DbGuard namespace
+      registry.ts       # Kind registration with TypeRegistry
+      defaults.ts       # Symbolic default translations
+      common.ts         # Common column definitions (id, createdAt, updatedAt)
+    sqlite/
+      index.ts          # Public API for SQLite dialect
+      transform.ts      # Transform registry rules
+      columns.ts        # Column mapping functions
+    pg/
+      index.ts          # Public API for PostgreSQL dialect
+      transform.ts      # Transform registry rules
+      columns.ts        # Column mapping functions
+    mysql/              # Future
+      index.ts
+      transform.ts
+      columns.ts
+    drizzle/            # Future: Drizzle → DbType direction
+      index.ts
+      from-column.ts    # Introspect Drizzle columns into DbType IR
+```
+
+Package exports for tree-shaking (npm has no per-export peer dependencies, so `drizzle-orm` is declared once as an optional peer, needed only when a dialect entry point is imported):
+
+```json
+{
+  "exports": {
+    ".": {
+      "import": "./index.mjs",
+      "require": "./index.cjs"
+    },
+    "./sqlite": {
+      "import": "./sqlite.mjs",
+      "require": "./sqlite.cjs"
+    },
+    "./pg": {
+      "import": "./pg.mjs",
+      "require": "./pg.cjs"
+    },
+    "./common": {
+      "import": "./common.mjs",
+      "require": "./common.cjs"
+    }
+  },
+  "peerDependencies": {
+    "@alkdev/typebox": ">=0.34.49",
+    "drizzle-orm": ">=0.36.0"
+  },
+  "peerDependenciesMeta": {
+    "drizzle-orm": { "optional": true }
+  }
+}
+```
+
+The core package (`@alkdev/drizzlebox`) depends only on `@alkdev/typebox`.
The dialect modules (`/sqlite`, `/pg`) have `drizzle-orm` as a peer dependency. Users who only use SQLite never import PG transforms. + +### Usage + +```typescript +import { DbType } from '@alkdev/drizzlebox' +import { toSqlite } from '@alkdev/drizzlebox/sqlite' +// Only imports sqlite-core from drizzle-orm + +const UserSchema = DbType.Table('users', { + id: DbType.Uuid({ primaryKey: true, default: 'uuid' }), + name: DbType.String({ notNull: true }), + email: DbType.String({ notNull: true, format: 'email' }), + scopes: DbType.Array(DbType.String(), { mode: 'json' }), + createdAt: DbType.Timestamp({ notNull: true, default: 'now' }), +}) + +// Generate Drizzle SQLite table +const users = toSqlite(UserSchema) +// Equivalent to: sqliteTable('users', { id: text('id').primaryKey().$defaultFn(genRandomUUID), ... }) +``` + +```typescript +import { DbType } from '@alkdev/drizzlebox' +import { toPg } from '@alkdev/drizzlebox/pg' +// Only imports pg-core from drizzle-orm + +const users = toPg(UserSchema) +// Equivalent to: pgTable('users', { id: uuid('id').primaryKey().defaultRandom(), ... 
})
+```
+
+### Transform Registry
+
+Each dialect module uses a priority-sorted rule registry to map DbType kinds to Drizzle column builders:
+
+```typescript
+// sqlite/transform.ts
+import { TransformRegistry } from '../dbtype/registry.ts'
+
+interface TransformContext {
+  dialect: 'sqlite' | 'postgres' | 'mysql'
+  ancestors: TDbColumn[]
+  metadata: Record<string, unknown>
+}
+
+type ColumnTransformResult = DrizzleColumnBuilder // from drizzle-orm
+
+interface TransformRule {
+  name: string
+  match: (schema: TDbColumn, ctx: TransformContext) => boolean
+  transform: (schema: TDbColumn, ctx: TransformContext) => ColumnTransformResult
+  priority: number // Lower = higher priority
+}
+
+const sqliteTransforms = new TransformRegistry()
+
+sqliteTransforms.register({
+  name: 'sqlite-string',
+  priority: 0,
+  match: (col) => col[Kind] === 'DbType:String',
+  transform: (col, ctx) => {
+    const db = col.db
+    const opts = resolveOpts(db, 'sqlite')
+    let builder = sqliteText(col.columnName)
+    if (opts.primaryKey) builder = builder.primaryKey()
+    if (opts.notNull) builder = builder.notNull()
+    if (opts.unique) builder = builder.unique()
+    if (opts.default !== undefined) builder = applyDefault(builder, opts.default, 'sqlite')
+    return builder
+  },
+})
+
+sqliteTransforms.register({
+  name: 'sqlite-uuid',
+  priority: -1, // Higher priority than generic string
+  match: (col) => col[Kind] === 'DbType:Uuid',
+  transform: (col, ctx) => {
+    const db = col.db
+    const opts = resolveOpts(db, 'sqlite')
+    let builder = sqliteText(col.columnName)
+    if (opts.primaryKey) builder = builder.primaryKey()
+    if (opts.notNull) builder = builder.notNull()
+    if (opts.default === 'uuid') {
+      builder = builder.$defaultFn(() => crypto.randomUUID())
+    }
+    return builder
+  },
+})
+
+sqliteTransforms.register({
+  name: 'sqlite-boolean',
+  priority: 0,
+  match: (col) => col[Kind] === 'DbType:Boolean',
+  transform: (col, ctx) => {
+    const opts = resolveOpts(col.db, 'sqlite')
+    let builder = sqliteInteger(col.columnName, {
mode: 'boolean' }) + if (opts.notNull) builder = builder.notNull() + if (opts.default !== undefined) builder = applyDefault(builder, opts.default, 'sqlite') + return builder + }, +}) + +sqliteTransforms.register({ + name: 'sqlite-timestamp', + priority: 0, + match: (col) => col[Kind] === 'DbType:Timestamp', + transform: (col, ctx) => { + const opts = resolveOpts(col.db, 'sqlite') + let builder = sqliteInteger(col.columnName, { mode: 'timestamp' }) + if (opts.notNull) builder = builder.notNull() + if (opts.default) builder = applyDefault(builder, opts.default, 'sqlite') + return builder + }, +}) + +sqliteTransforms.register({ + name: 'sqlite-json', + priority: -1, // Higher priority than string/array/object + match: (col) => col[Kind] === 'DbType:Array' || col[Kind] === 'DbType:Object' || col[Kind] === 'DbType:Record' || col[Kind] === 'DbType:Any', + transform: (col, ctx) => { + const opts = resolveOpts(col.db, 'sqlite') + let builder = sqliteText(col.columnName, { mode: 'json' }) + if (opts.notNull) builder = builder.notNull() + if (opts.default !== undefined) builder = applyDefault(builder, opts.default, 'sqlite') + return builder + }, +}) +``` + +```typescript +// pg/transform.ts — analogous but using pg-core builders +const pgTransforms = new TransformRegistry() + +pgTransforms.register({ + name: 'pg-uuid', + priority: -1, + match: (col) => col[Kind] === 'DbType:Uuid', + transform: (col, ctx) => { + const opts = resolveOpts(col.db, 'postgres') + let builder = pgUuid(col.columnName) + if (opts.primaryKey) builder = builder.primaryKey() + if (opts.notNull) builder = builder.notNull() + if (opts.default === 'uuid') builder = builder.defaultRandom() + return builder + }, +}) + +pgTransforms.register({ + name: 'pg-jsonb', + priority: -1, + match: (col) => ['DbType:Array', 'DbType:Object', 'DbType:Record', 'DbType:Any'].includes(col[Kind]), + transform: (col, ctx) => { + const opts = resolveOpts(col.db, 'postgres') + let builder = pgJsonb(col.columnName) + if 
(opts.notNull) builder = builder.notNull() + if (opts.default !== undefined) builder = applyDefault(builder, opts.default, 'postgres') + return builder + }, +}) + +pgTransforms.register({ + name: 'pg-timestamp', + priority: 0, + match: (col) => col[Kind] === 'DbType:Timestamp', + transform: (col, ctx) => { + const opts = resolveOpts(col.db, 'postgres') + let builder = pgTimestamptz(col.columnName, { withTimezone: true }) + if (opts.notNull) builder = builder.notNull() + if (opts.default === 'now') builder = builder.default(sql`now()`) + return builder + }, +}) +``` + +### Option Resolution + +The `resolveOpts` function merges cross-dialect options with dialect-specific overrides: + +```typescript +function resolveOpts(db: DbColumnMeta, dialect: 'sqlite' | 'postgres' | 'mysql'): ResolvedColumnOpts { + const dialectOpts = db[dialect] ?? {} + return { + primaryKey: dialectOpts.primaryKey ?? db.primaryKey, + notNull: dialectOpts.notNull ?? db.notNull, + unique: dialectOpts.unique ?? db.unique, + references: dialectOpts.references ?? db.references, + default: dialectOpts.default ?? db.default, + ...dialectOpts, // Any dialect-specific extras + } +} +``` + +### Symbolic Default Resolution + +```typescript +function applyDefault( + builder: ColumnBuilder, + defaultVal: DbDefault | unknown, + dialect: 'sqlite' | 'postgres' | 'mysql' +): ColumnBuilder { + if (typeof defaultVal === 'string') { + switch (defaultVal) { + case 'now': + return dialect === 'sqlite' + ? builder.default(sql`(strftime('%s', 'now'))`) + : dialect === 'postgres' + ? builder.default(sql`now()`) + : builder.default(sql`NOW()`) + case 'uuid': + return dialect === 'sqlite' + ? builder.$defaultFn(() => crypto.randomUUID()) + : dialect === 'postgres' + ? 
builder.defaultRandom() + : builder.$defaultFn(() => crypto.randomUUID()) + case 'autoincrement': + // Handled differently per dialect — usually implicit in primaryKey + return builder + case 'current_timestamp': + return builder.default(sql`CURRENT_TIMESTAMP`) + } + } + // SQL expression or literal value + if (defaultVal instanceof SQL) return builder.default(defaultVal) + return builder.default(defaultVal) +} +``` + +## Type Mapping Table + +| DbType Kind | SQLite Column | PG Column | MySQL Column | Inner TypeBox | +|-------------|---------------|-----------|--------------|---------------| +| `DbType:String` | `text()` | `text()` | `text()` | `Type.String()` | +| `DbType:Uuid` | `text()` | `uuid()` | `varchar(36)` | `Type.String({ format: 'uuid' })` | +| `DbType:VarChar` | `text()` | `varchar(n)` | `varchar(n)` | `Type.String({ maxLength: n })` | +| `DbType:Integer` | `integer()` | `integer()` | `int()` | `Type.Integer()` | +| `DbType:Boolean` | `integer({ mode: 'boolean' })` | `boolean()` | `boolean()` | `Type.Boolean()` | +| `DbType:Timestamp` | `integer({ mode: 'timestamp' })` | `timestamptz()` | `timestamp()` | `Type.Number()` | +| `DbType:Number` | `real()` | `double precision()` | `double()` | `Type.Number()` | +| `DbType:Array` (mode: 'json') | `text({ mode: 'json' })` | `jsonb()` | `json()` | `Type.Array(T)` | +| `DbType:Object` (mode: 'json') | `text({ mode: 'json' })` | `jsonb()` | `json()` | `Type.Object(T)` | +| `DbType:Record` (mode: 'json') | `text({ mode: 'json' })` | `jsonb()` | `json()` | `Type.Record(T)` | +| `DbType:Any` (mode: 'json') | `text({ mode: 'json' })` | `jsonb()` | `json()` | `Type.Unknown()` | +| `DbType:Enum` | `text({ enum: [...] })` | `pgEnum()()` or `text()` | `mysqlEnum()()` | `Type.Union([...Type.Literal()])` | +| `DbType:Real` | `real()` | `real()` | `float()` | `Type.Number()` | + +### Notes on Specific Mappings + +**UUID**: SQLite has no native UUID type. We use `text()` with a JS-side `$defaultFn` for UUID generation. 
PG gets the native `uuid()` type with `.defaultRandom()`. This is a case where the DbType kind (`DbType:Uuid`) maps to entirely different column types per dialect. + +**Timestamp**: SQLite stores as integer epoch seconds, PG as `timestamptz`. The symbolic default `'now'` resolves to `strftime('%s', 'now')` for SQLite (returning a Unix epoch integer) and `now()` for PG (returning a timestamptz). This is the other case where dialect divergence is hidden behind a single DbType kind. + +**JSON**: All compound types (`Array`, `Object`, `Record`, `Any`) with `mode: 'json'` map to `text({ mode: 'json' })` in SQLite, `jsonb()` in PG, and `json()` in MySQL. The transform registry picks the right one based on dialect. + +**Enum**: This is the most problematic mapping. PG requires `pgEnum()` at module scope (a separate type declaration), while SQLite uses `text({ enum: [...] })` and MySQL uses `mysqlEnum()()`. See Open Question #1. + +## Common Columns + +```typescript +// dbtype/common.ts +export const commonCols = { + id: DbType.Uuid({ primaryKey: true, default: 'uuid' }), + createdAt: DbType.Timestamp({ notNull: true, default: 'now' }), + updatedAt: DbType.Timestamp({ notNull: true, default: 'now' }), +} + +// Usage: +const UserSchema = DbType.Table('users', { + ...commonCols, + name: DbType.String({ notNull: true }), + email: DbType.String({ notNull: true, format: 'email' }), +}) +``` + +Compare to the storage.md version: + +```typescript +// BEFORE — repeated dialect config for identical behavior +createdAt: DbType.Timestamp({ + sqlite: { notNull: true, default: sql`(strftime('%s', 'now'))` }, + postgres: { notNull: true, default: sql`now()` } +}), + +// AFTER — cross-dialect defaults + symbolic default +createdAt: DbType.Timestamp({ notNull: true, default: 'now' }), +``` + +## Validation Schemas from DbType + +Because DbType schemas carry `inner` TypeBox schemas, extracting validation schemas is straightforward: + +```typescript +export function 
createSelectSchema(table: TDbTable): TObject {
+  const properties: Record<string, TSchema> = {}
+  for (const [name, col] of Object.entries(table.columns)) {
+    properties[name] = col.inner // Unwrap to get the inner TypeBox schema
+  }
+  return Type.Object(properties)
+}
+
+export function createInsertSchema(table: TDbTable): TObject {
+  const properties: Record<string, TSchema> = {}
+  for (const [name, col] of Object.entries(table.columns)) {
+    if (col.db.primaryKey && col.db.default === 'autoincrement') continue // Skip auto-increment PKs
+    let schema = col.inner
+    if (isOptional(col)) schema = Type.Optional(schema)
+    if (!col.db.notNull) schema = Type.Optional(Type.Union([schema, Type.Null()]))
+    properties[name] = schema
+  }
+  return Type.Object(properties)
+}
+```
+
+This means plugins define schemas once and get both validation and Drizzle table generation from the same source.
+
+## Bidirectional Support (Future)
+
+The IR design enables both directions:
+
+```
+Drizzle Column ──→ fromDrizzle(column) ──→ DbType IR ──→ toDrizzle(schema, dialect)
+                                              │
+                                              └──→ inner ──→ TypeBox validation schema
+```
+
+`fromDrizzle()` would introspect a Drizzle column and produce a `TDbColumn` with populated `db` metadata and an inferred `inner` TypeBox schema. This would be a straightforward mapping since Drizzle columns carry all the metadata we need (`dataType`, `columnType`, `notNull`, `hasDefault`, `enumValues`, etc.).
+
+The current `columnToSchema()` in drizzlebox already does the column→TypeBox part. The enhancement would be wrapping the result in a `TDbColumn` with the `db` metadata preserved.
+
+## Nullability Convention
+
+Following the storage.md convention but simplified:
+
+- `DbType.Optional(column)` — nullable in DB, excluded from insert schema
+- `{ notNull: true }` — required (non-nullable) in DB, included in insert schema
+- Neither — technically nullable, but prefer explicit `Optional()` or `notNull: true`
+
+## Open Questions
+
+### 1. 
PostgreSQL Enum Handling + +**Problem**: PG requires `pgEnum()` at module scope before tables can reference it: + +```typescript +// PG requires this: +const moodEnum = pgEnum('mood', ['happy', 'sad', 'neutral']) +const users = pgTable('users', { mood: moodEnum('mood') }) + +// vs. SQLite: +const users = sqliteTable('users', { mood: text('mood', { enum: ['happy', 'sad', 'neutral'] }) }) +``` + +**Options**: + +**A. Generate enum declarations separately**: `toPg(schema)` returns both enum declarations and table definitions: +```typescript +const { enums, tables } = toPg(UserSchema) +// enums: { mood: pgEnum('mood', ['happy', 'sad', 'neutral']) } +// tables: { users: pgTable('users', { mood: moodEnum('mood') }) } +``` + +**B. Start with `text()` for all dialects**: Don't generate native PG enums initially. Use `text()` with a check constraint or validation-only enum. Add `pgEnum` support as an explicit opt-in later. + +**C. Per-column opt-in**: `DbType.Enum({ values: [...], postgres: { nativeEnum: true } })` — only generates `pgEnum` when explicitly requested. + +**Current leaning**: **B** — start simple, add native enum support later as an opt-in feature. This avoids structural differences in the output between dialects. + +### 2. Mode Inference vs. Explicit Annotation + +**Problem**: Should `DbType.Object({...})` without `mode: 'json'` automatically infer `mode: 'json'` for storage? + +**Options**: + +**A. Require explicit mode**: All compound types must specify `mode: 'json'`. More verbose but unambiguous. + +**B. Auto-infer**: `DbType.Array()`, `DbType.Object()`, `DbType.Record()`, `DbType.Any()` automatically infer `mode: 'json'` since there's no other reasonable storage mode for compound types in relational databases. + +**Current leaning**: **B** — auto-infer `mode: 'json'` for compound types that must be stored as JSON. This matches the storage.md proposal. 
Scalar columns that happen to store JSON (like a string column holding JSON data) would need explicit annotation. + +### 3. The `inner` Schema: Who Constructs It? + +**Problem**: Should the DbTypeBuilder auto-infer the TypeBox inner schema from the column type, or should users provide it explicitly? + +**Options**: + +**A. Auto-infer**: `DbType.String({ notNull: true })` automatically creates `Type.String()` as inner. Users can override with explicit `inner` if they want validation constraints: +```typescript +DbType.String({ notNull: true }) // inner = Type.String() +DbType.String({ notNull: true, inner: Type.String({ format: 'email', maxLength: 255 }) }) +``` + +**B. Always explicit**: Users always provide the inner schema: +```typescript +DbType.String(Type.String({ format: 'email' }), { notNull: true }) +``` + +**C. Builder methods with validation sugar**: Builder methods that set both inner and db metadata: +```typescript +DbType.Email() // inner = Type.String({ format: 'email' }), [Kind] = 'DbType:String' +DbType.Uuid() // inner = Type.String({ format: 'uuid' }), [Kind] = 'DbType:Uuid' +``` + +**Current leaning**: **A+C** — auto-infer by default, with convenience builder methods for common patterns. The `inner` field is an escape hatch for custom validation constraints. + +### 4. Dialect-Specific Types That Don't Map Cleanly + +**Problem**: Some PG types have no SQLite equivalent (geometric types, `inet`, `cidr`, `macaddr`, array types). Some SQLite modes have no PG equivalent (`blob({ mode: 'bigint' })`). + +**Options**: + +**A. Dialect-specific Kind escapes**: `DbType.PgGeometry()`, `DbType.SqliteBlob()` — kinds that only work in one dialect, fail in others. + +**B. Common abstraction where possible, escape hatches otherwise**: `DbType.Array(inner, { mode: 'json' })` works everywhere (stores as JSON). PG-native arrays via `{ postgres: { nativeArray: true } }` override. + +**C. 
Only support the common subset**: Don't generate dialect-specific types from DbType at all. Users write raw Drizzle for those columns. + +**Current leaning**: **B** — start with the common subset, provide dialect-specific overrides for escape hatches. The `postgres` and `sqlite` options bags exist for this reason. + +### 5. Should `toDrizzle` Return Table Objects or Builder Callbacks? + +**Problem**: Drizzle tables are typically created with a callback that receives column builders: + +```typescript +sqliteTable('users', (t) => ({ id: t.integer().primaryKey(), name: t.text() })) +``` + +But our transform produces column-by-column. Should we produce: + +**A. Table object directly**: Return the result of `sqliteTable(name, columns)` — simpler, but the callback pattern gives access to `t` for dialect-specific features not expressible in DbType. + +**B. Column definitions only**: Return just the columns record, let users call `sqliteTable(name, columns)` themselves — more flexible but more verbose. + +**C. Table object with extra config callback**: Return the table but accept an `extraConfig` callback for indexes, unique constraints, etc. + +**Current leaning**: **C** — return the table object, handle indexes from `TDbTable.indexes`. For extra config not expressible in DbType, users can use the table extra config pattern separately. The `toDrizzle` function should handle the common 90%. + +### 6. Default Value Types: Symbolic vs Raw + +**Problem**: The `DbDefault` type currently supports a fixed set of symbolic strings plus raw SQL. What about: +- Literal defaults: `{ default: 0 }` or `{ default: '' }` +- JS-side defaults: `{ default: () => crypto.randomUUID() }` +- The difference between SQL defaults and JS-side defaults (Drizzle's `.default()` vs `.$defaultFn()`) + +**Current approach**: Symbolic strings for common patterns, `sql` tagged template for SQL expressions, literal values for simple cases. 
The transform layer decides `.default()` vs `.$defaultFn()` based on the dialect and symbol. + +**Open question**: Should we also support a function form for JS-side defaults? + +```typescript +DbType.String({ + default: 'uuid', // Symbolic — transform decides implementation + // vs. + default: () => crypto.randomUUID(), // JS-side — always uses $defaultFn +}) +``` + +**Current leaning**: Support both. Symbolic defaults are translated to the appropriate mechanism per dialect. JS function defaults always use `$defaultFn`. Raw SQL uses `.default(sql\`...\`)`. + +### 7. Relation Definitions + +**Problem**: Foreign keys work via column config `references`, but complex relations (many-to-many, join tables) need explicit relation definitions. Should these be part of `DbType.Table` or separate? + +**Status**: Deferred, same as storage.md. `references` on column config covers the common case. Complex relations can be added later via `TDbRelation` or through Drizzle's relation API directly. + +### 8. Migration Generation + +**Problem**: When should migrations be generated — at build time or at runtime? + +**Status**: Same as storage.md — build-time for now via `drizzle-kit`. The DbType schema → Drizzle table → `drizzle-kit generate` pipeline. Dynamic plugin registration is a future concern. + +### 9. Current drizzlebox Direction: Keep, Evolve, or Replace? + +**Problem**: The current `drizzlebox` does Drizzle → TypeBox (generating validation schemas from existing Drizzle tables). The new DbType IR does TypeBox → Drizzle (generating Drizzle tables from TypeBox-based schemas). These are opposite directions. + +**Options**: + +**A. Keep both directions in one package**: The current `columnToSchema()` becomes `fromDrizzle()`, the new transform registry becomes `toDrizzle()`. Both use the same DbType IR as intermediate. Package exports both directions. + +**B. Keep current direction, add new as separate sub-package**: The current `@alkdev/drizzlebox` continues as-is. 
The new TypeBox→Drizzle direction lives in `@alkdev/drizzlebox/schema` or a separate package. + +**C. Replace current direction eventually**: Phase out the current Drizzle→TypeBox in favor of the schema-first direction. Users define DbType schemas, get both validation and Drizzle for free. No need to reverse-engineer schemas from Drizzle. + +**Current leaning**: **A** — keep both. The Drizzle→TypeBox direction is useful for existing Drizzle users who want validation without rewriting their schemas. The TypeBox→Drizzle direction is for the schema-first use case. They coexist using the same IR as a bridge. The `fromDrizzle(column)` function introspects an existing Drizzle column and produces a `TDbColumn` with the inner TypeBox schema and `db` metadata. + +### 10. Naming Boundaries + +**Question**: The current package is `@alkdev/drizzlebox` with a focus on TypeBox↔Drizzle. The new DbType IR is more general — it could be its own package. Should: + +- The DbType IR live in `@alkdev/drizzlebox` (keeping everything together)? +- The DbType IR be a separate `@alkdev/dbtype` package that drizzlebox depends on? +- The DbType IR live in `@alkdev/typebox` as an extension (since it uses TypeBox's Kind/TypeRegistry)? + +**Current leaning**: Keep in `@alkdev/drizzlebox` for now. The DbType IR's sole purpose is bridging TypeBox and Drizzle. If it becomes useful outside that context, we can factor it out later. The package name `drizzlebox` is already ambiguous enough to encompass both directions. 
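
To make the bidirectional idea concrete, here is a minimal, hypothetical sketch of what `fromDrizzle` introspection could look like. Everything named here is a stand-in: `ColumnRuntime` approximates the runtime fields a built Drizzle column exposes (`dataType`, `notNull`, `primary`, `enumValues`), and `DbColumnIR` stands in for the real `TDbColumn` with its inner TypeBox schema omitted:

```typescript
// Hypothetical shapes — stand-ins for Drizzle's built column and the DbType IR.
interface ColumnRuntime {
  name: string
  dataType: 'string' | 'number' | 'boolean' | 'json' | 'date' | 'bigint'
  notNull: boolean
  primary: boolean
  enumValues?: string[]
}

interface DbColumnIR {
  kind: string // e.g. 'DbType:String'
  db: { notNull: boolean; primaryKey: boolean; enum?: string[] }
}

// dataType → DbType kind, following the hub-and-spoke mapping described above
const KIND_BY_DATA_TYPE: Record<ColumnRuntime['dataType'], string> = {
  string: 'DbType:String',
  number: 'DbType:Number',
  boolean: 'DbType:Boolean',
  json: 'DbType:Json',
  date: 'DbType:Timestamp',
  bigint: 'DbType:BigInt',
}

function fromDrizzle(column: ColumnRuntime): DbColumnIR {
  // Enum detection takes priority: all dialects surface enumValues uniformly.
  const kind = column.enumValues?.length ? 'DbType:Enum' : KIND_BY_DATA_TYPE[column.dataType]
  return {
    kind,
    db: { notNull: column.notNull, primaryKey: column.primary, enum: column.enumValues },
  }
}
```

A real implementation would also attach the inner TypeBox validation schema and carry dialect-specific runtime config; this sketch only shows the dispatch shape.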
+ +## Example: Full Table Definition + +```typescript +import { DbType } from '@alkdev/drizzlebox' +import { toSqlite } from '@alkdev/drizzlebox/sqlite' +import { toPg } from '@alkdev/drizzlebox/pg' + +const IdentitySchema = DbType.Table('identities', { + id: DbType.Uuid({ primaryKey: true, default: 'uuid' }), + keyHash: DbType.String({ notNull: true, unique: true }), + ownerId: DbType.String({ notNull: true }), + type: DbType.Enum(['api_key', 'node_identity'], { notNull: true }), + scopes: DbType.Array(DbType.String(), { notNull: true }), + roles: DbType.Optional(DbType.Array(DbType.String())), + resources: DbType.Optional(DbType.Record(DbType.Array(DbType.String()))), + name: DbType.Optional(DbType.String()), + enabled: DbType.Boolean({ default: true }), + createdAt: DbType.Timestamp({ notNull: true, default: 'now' }), + lastUsedAt: DbType.Optional(DbType.Timestamp()), + revokedAt: DbType.Optional(DbType.Timestamp()), +}, { + indexes: [ + { name: 'idx_identities_owner', columns: ['ownerId'] }, + { name: 'idx_identities_type', columns: ['type'] }, + ], +}) + +// Generate for SQLite +const sqliteIdentities = toSqlite(IdentitySchema) + +// Generate for PostgreSQL +const pgIdentities = toPg(IdentitySchema) + +// Validation (extract inner TypeBox schemas) +import { createSelectSchema, createInsertSchema } from '@alkdev/drizzlebox' + +const SelectIdentity = createSelectSchema(IdentitySchema) +const InsertIdentity = createInsertSchema(IdentitySchema) +``` + +Compare with the storage.md version: +- No per-dialect config for `primaryKey`, `notNull`, `unique`, `references` — same meaning everywhere +- Default values use symbolic `'now'` and `'uuid'` instead of dialect-specific SQL +- `mode: 'json'` is auto-inferred for `Array`, `Record`, `Object` +- Dialect overrides are only needed when types actually differ (e.g., native UUID in PG) + +## Benefits + +| Benefit | Description | +|---------|-------------| +| Single source of truth | Schema defined once, used for both 
validation and DB structure | +| No duplication | Cross-dialect options specified once, not per-dialect | +| Tree-shakeable | Import only the dialect you need | +| Type safety | TypeScript types from same schema as DB | +| Validation built-in | TypeBox schemas work for request/response validation | +| Extensible | Custom column kinds via TypeRegistry | +| Symbolic defaults | Common patterns like `'now'` and `'uuid'` translate automatically | +| Bidirectional (future) | Same IR supports Drizzle→TypeBox and TypeBox→Drizzle | \ No newline at end of file diff --git a/docs/research/dizzle-column-diffs.md b/docs/research/dizzle-column-diffs.md new file mode 100644 index 0000000..172e339 --- /dev/null +++ b/docs/research/dizzle-column-diffs.md @@ -0,0 +1,440 @@ +# Drizzle ORM Column Builder Differences: SQLite vs PostgreSQL vs MySQL + +Research based on `drizzle-orm@0.38.4` source in `node_modules/drizzle-orm`. + +## 1. Column Types by Dialect + +### SQLite (`drizzle-orm/sqlite-core/columns/`) + +| Function | columnType | dataType | Notes | +|--------------|--------------------|-----------|----------------------------------------------| +| `integer()` | `SQLiteInteger` | `number` | mode: `'number'` (default) | +| `integer()` | `SQLiteTimestamp` | `date` | mode: `'timestamp'` or `'timestamp_ms'` | +| `integer()` | `SQLiteBoolean` | `boolean` | mode: `'boolean'` | +| `text()` | `SQLiteText` | `string` | mode: `'text'` (default), supports `enum`, `length` | +| `text()` | `SQLiteTextJson` | `json` | mode: `'json'` | +| `real()` | `SQLiteReal` | `number` | - | +| `blob()` | `SQLiteBlobJson` | `json` | mode: `'json'` (default) | +| `blob()` | `SQLiteBlobBuffer` | `buffer` | mode: `'buffer'` | +| `blob()` | `SQLiteBigInt` | `bigint` | mode: `'bigint'` | +| `numeric()` | `SQLiteNumeric` | `string` | - | +| `customType()`| `SQLiteCustomColumn`| `custom` | - | + +### PostgreSQL (`drizzle-orm/pg-core/columns/`) + +| Function | columnType | dataType | Notes | 
+|------------------|--------------------|-----------|------------------------------| +| `integer()` | `PgInteger` | `number` | - | +| `smallint()` | `PgSmallInt` | `number` | - | +| `bigint()` | `PgBigInt` | `bigint` | | +| `serial()` | `PgSerial` | `number` | notNull + hasDefault built-in| +| `smallserial()` | `PgSmallSerial` | `number` | notNull + hasDefault built-in| +| `bigserial()` | `PgBigSerial` | `bigint` | notNull + hasDefault built-in| +| `boolean()` | `PgBoolean` | `boolean` | - | +| `text()` | `PgText` | `string` | supports `enum` | +| `varchar()` | `PgVarchar` | `string` | supports `enum`, `length` | +| `char()` | `PgChar` | `string` | supports `length`, `enum` | +| `numeric()` | `PgNumeric` | `string` | precision/scale config | +| `real()` | `PgReal` | `number` | - | +| `doublePrecision()`| `PgDoublePrecision`| `number` | - | +| `json()` | `PgJson` | `json` | - | +| `jsonb()` | `PgJsonb` | `json` | PG-specific | +| `uuid()` | `PgUUID` | `string` | has `.defaultRandom()` | +| `date()` | `PgDate`/`PgDateString` | `date`/`string` | mode toggle | +| `timestamp()` | `PgTimestamp`/`PgTimestampString`| `date`/`string` | mode, withTimezone, precision | +| `time()` | `PgTime`/`PgTimeString` | `string` | precision, withTimezone | +| `interval()` | `PgInterval` | `string` | PG-specific | +| `inet()` | `PgInet` | `string` | PG network type | +| `cidr()` | `PgCidr` | `string` | PG network type | +| `macaddr()` | `PgMacaddr` | `string` | PG network type | +| `macaddr8()` | `PgMacaddr8` | `string` | PG network type | +| `point()` | `PgPoint` | `string` | PG geometric type | +| `line()` | `PgLine` | `string` | PG geometric type | +| `pgEnum()` | `PgEnumColumn` | `string` | Requires pre-declared enum | +| `customType()` | `PgCustomColumn` | `custom` | - | + +### MySQL (`drizzle-orm/mysql-core/columns/`) + +| Function | columnType | dataType | Notes | +|------------------|-------------------------|-----------|--------------------------------------| +| `int()` | 
`MySqlInt` | `number` | `unsigned` config, `.autoincrement()`| +| `smallint()` | `MySqlSmallInt` | `number` | `unsigned`, `.autoincrement()` | +| `mediumint()` | `MySqlMediumInt` | `number` | `unsigned`, `.autoincrement()` | +| `tinyint()` | `MySqlTinyInt` | `number` | `unsigned`, `.autoincrement()` | +| `bigint()` | `MySqlBigInt` | `bigint` | `unsigned`, mode toggle | +| `serial()` | `MySqlSerial` | `number` | autoIncrement+primaryKey+notNull+default | +| `boolean()` | `MySqlBoolean` | `boolean` | - | +| `float()` | `MySqlFloat` | `number` | precision config | +| `double()` | `MySqlDouble` | `number` | precision config | +| `decimal()` | `MySqlDecimal` | `string` | precision/scale config | +| `real()` | `MySqlReal` | `number` | - | +| `text()` | `MySqlText` | `string` | supports `enum`, textType variants | +| `tinytext()` | `MySqlText` | `string` | textType: 'tinytext' | +| `mediumtext()` | `MySqlText` | `string` | textType: 'mediumtext' | +| `longtext()` | `MySqlText` | `string` | textType: 'longtext' | +| `varchar()` | `MySqlVarChar` | `string` | `length`, `enum` | +| `char()` | `MySqlChar` | `string` | `length`, `enum` | +| `json()` | `MySqlJson` | `json` | - | +| `date()` | `MySqlDate`/`MySqlDateString` | `date`/`string` | mode toggle | +| `datetime()` | `MySqlDateTime`/`MySqlDateTimeString` | `date`/`string` | mode, fsp | +| `timestamp()` | `MySqlTimestamp`/`MySqlTimestampString` | `date`/`string` | mode, fsp | +| `time()` | `MySqlTime`/`MySqlTimeString` | `string` | fsp | +| `binary()` | `MySqlBinary` | `buffer` | length config | +| `varbinary()` | `MySqlVarBinary` | `buffer` | length config | +| `year()` | `MySqlYear` | `number` | - | +| `mysqlEnum()` | `MySqlEnumColumn` | `string` | inline enum values | +| `customType()` | `MySqlCustomColumn` | `custom` | - | + +## 2. Builder API Naming Differences + +### Factory Function Names + +The same concept has **different factory function names** per dialect. 
A tool like drizzlebox that must handle all three cannot assume a universal name space. + +| Concept | SQLite | PostgreSQL | MySQL | +|-----------------|-----------------|---------------|-----------------| +| Integer | `integer` | `integer` | `int` | +| Small int | — | `smallint` | `smallint` | +| Big int | — | `bigint` | `bigint` | +| Auto-increment | `.primaryKey()` on `integer` | `serial`/`bigserial`/`smallserial` | `.autoincrement()` on `int` etc., or `serial` | +| Boolean | `integer({ mode: 'boolean' })` | `boolean` | `boolean` / `tinyint` | +| Text/varchar | `text` | `text` / `varchar` | `text` / `varchar` | +| JSON | `text({ mode: 'json' })` / `blob({ mode: 'json' })` | `json` / `jsonb` | `json` | +| Enum | `text({ enum: [...] })` | `pgEnum` (separate declaration) | `mysqlEnum` | +| Timestamp | `integer({ mode: 'timestamp' })` | `timestamp` | `timestamp` / `datetime` | +| UUID | `text()` (manual) | `uuid` | — | +| Numeric/decimal | `numeric` | `numeric` / `decimal` | `decimal` | +| Real/float | `real` | `real` | `float` / `double` / `real` | + +### Builder Class Names + +All builder classes are dialect-prefixed: + +- **SQLite**: `SQLiteTextBuilder`, `SQLiteIntegerBuilder`, `SQLiteColumnBuilder` (base) +- **PG**: `PgTextBuilder`, `PgIntegerBuilder`, `PgColumnBuilder` (base) +- **MySQL**: `MySqlTextBuilder`, `MySqlIntBuilder`, `MySqlColumnBuilder` (base), `MySqlColumnBuilderWithAutoIncrement` + +### Column Class Names (the `columnType` field) + +All column classes are also dialect-prefixed: + +- **SQLite**: `SQLiteText`, `SQLiteInteger`, `SQLiteTimestamp`, `SQLiteBoolean`, `SQLiteTextJson`, `SQLiteReal`, `SQLiteNumeric`, `SQLiteBlobJson`, `SQLiteBlobBuffer`, `SQLiteBigInt`, `SQLiteCustomColumn` +- **PG**: `PgText`, `PgInteger`, `PgBoolean`, `PgJson`, `PgJsonb`, `PgUUID`, `PgEnumColumn`, `PgNumeric`, `PgVarchar`, `PgSerial`, etc. 
+- **MySQL**: `MySqlText`, `MySqlInt`, `MySqlBoolean`, `MySqlJson`, `MySqlEnumColumn`, `MySqlDecimal`, `MySqlVarChar`, `MySqlSerial`, etc. + +## 3. Shared Column Builder API (from base `ColumnBuilder`) + +All three dialect builder hierarchies share a common base class `ColumnBuilder` (in `drizzle-orm/column-builder.d.ts`) that provides these universal methods: + +| Method | Description | +|---------------------|-------------| +| `.notNull()` | Makes column not null | +| `.default(value)` | Set a default value | +| `.$defaultFn(fn)` / `.$default` | Dynamic runtime default | +| `.$onUpdateFn(fn)` / `.$onUpdate` | Dynamic runtime update value | +| `.primaryKey()` | Makes column a primary key (implies notNull) | +| `.$type()` | Override the TypeScript type | +| `.generatedAlwaysAs(as, config?)` | Generated column (overridden per-dialect) | + +The `ColumnBuilderRuntimeConfig` shared by all dialects: + +```ts +{ + name: string; + keyAsName: boolean; + notNull: boolean; + default: TData | SQL | undefined; + defaultFn: (() => TData | SQL) | undefined; + onUpdateFn: (() => TData | SQL) | undefined; + hasDefault: boolean; + primaryKey: boolean; + isUnique: boolean; + uniqueName: string | undefined; + uniqueType: string | undefined; + dataType: string; + columnType: string; + generated: GeneratedColumnConfig | undefined; + generatedIdentity: GeneratedIdentityConfig | undefined; +} +``` + +The `ColumnBaseConfig` (shared, on the Column side): + +```ts +{ + name: string; + tableName: string; + dataType: ColumnDataType; // 'string' | 'number' | 'boolean' | 'array' | 'json' | 'date' | 'bigint' | 'custom' | 'buffer' + columnType: string; // Dialect-specific e.g. 'PgText', 'SQLiteInteger' + data: unknown; + driverParam: unknown; + notNull: boolean; + hasDefault: boolean; + isPrimaryKey: boolean; + isAutoincrement: boolean; + hasRuntimeDefault: boolean; + enumValues: string[] | undefined; +} +``` + +## 4. 
Dialect-Specific Differences + +### 4.1 Dialect Discriminator + +Every builder and column carries a `dialect` type tag in its `TTypeConfig`: +- **SQLite**: `{ dialect: 'sqlite' }` +- **PG**: `{ dialect: 'pg' }` +- **MySQL**: `{ dialect: 'mysql' }` + +This is used in the `BuildColumn` conditional type in `column-builder.d.ts` to route to the correct column class. + +### 4.2 SQLite-Specific + +- **`integer()` modes**: The `integer()` function accepts `{ mode: 'number' | 'timestamp' | 'timestamp_ms' | 'boolean' }`. Based on mode, it returns different builder classes (`SQLiteIntegerBuilder`, `SQLiteTimestampBuilder`, `SQLiteBooleanBuilder`). This is a compile-time type-level dispatch, not a runtime polymorphic thing. +- **`text()` modes**: `text()` accepts `{ mode: 'text' | 'json' }`. With `mode: 'json'`, it returns `SQLiteTextJsonBuilder` (dataType `'json'`), which has `mapFromDriverValue`/`mapToDriverValue` for JSON serialization. +- **`blob()` modes**: `blob()` accepts `{ mode: 'buffer' | 'json' | 'bigint' }`. Default is `'json'` (not `'buffer'`!). +- **No native boolean/date types**: SQLite uses `integer` with mode overrides instead of dedicated boolean or date column classes. +- **`primaryKey()` on integer**: SQLite's `primaryKey()` on `integer` implies auto-increment (ROWID). The `SQLiteBaseIntegerBuilder` has a `PrimaryKeyConfig` with `autoIncrement?: boolean` and `onConflict`. +- **`generatedAlwaysAs`**: Accepts `{ mode?: 'virtual' | 'stored' }` config for generated columns. +- **Enum handling**: Enums are just `text({ enum: [...] })` — a constraint on `text`, not a separate type. + +### 4.3 PostgreSQL-Specific + +- **`pgEnum`**: Enums are a top-level declaration, not a column config option. You call `pgEnum('name', ['val1', 'val2'])` at module scope to create a `PgEnum` object, then use it as a column: `myEnum()`. This is fundamentally different from SQLite/MySQL enum handling. 
+- **Array support**: `PgColumnBuilder` has an `.array(size?)` method that returns a `PgArrayBuilder`. No other dialect has this.
+- **`.unique()` with nulls**: PG's `.unique()` accepts `{ nulls: 'distinct' | 'not distinct' }` — a PG-specific option.
+- **`timestamp`/`timestamptz`**: `timestamp()` has `{ withTimezone?: boolean, precision?: number, mode?: 'date' | 'string' }`. With `mode: 'string'`, returns `PgTimestampStringBuilder` (dataType `'string'`).
+- **`jsonb`**: PG-specific JSON storage type distinct from `json`.
+- **`uuid`**: Has a `.defaultRandom()` shortcut that applies `gen_random_uuid()` as the default.
+- **Identity columns**: PG supports `generatedAlwaysAsIdentity()`/`generatedByDefaultAsIdentity()` (not just `generatedAlwaysAs`), using PG sequences.
+- **Index operator classes**: `ExtraConfigColumn` on PG columns supports `.asc()`, `.desc()`, `.nullsFirst()`, `.nullsLast()`, `.op(opClass)` for index definitions.
+- **Network/geometry types**: `inet`, `cidr`, `macaddr`, `macaddr8`, `point`, `line` — all PG-only.
+- **`interval`**: PG-specific date/time interval type.
+- **`date()`**: Has `{ mode: 'date' | 'string' }` similar to timestamp.
+
+### 4.4 MySQL-Specific
+
+- **`.autoincrement()`**: MySQL int types inherit from `MySqlColumnBuilderWithAutoIncrement` instead of plain `MySqlColumnBuilder`. This adds an `.autoincrement()` method. `serial()` is shorthand for `int().primaryKey().notNull().default(autoincrement)`.
+- **`unsigned`**: MySQL int types (`int`, `smallint`, `mediumint`, `tinyint`, `bigint`) accept `{ unsigned?: boolean }`.
+- **`generatedAlwaysAs`**: Accepts `{ mode?: 'virtual' | 'stored' }` like SQLite.
+- **Enum handling**: `mysqlEnum(values)` is a standalone column factory (like PG's `pgEnum`), but takes inline values instead of a pre-declared enum object.
+- **Text variants**: `tinytext()`, `mediumtext()`, `longtext()` are MySQL-specific shorthands.
+- **Datetime types**: `datetime()` is MySQL-specific (distinct from `timestamp`). 
Both have `{ mode?: 'date' | 'string', fsp?: number }`. +- **`year()`**: MySQL-specific. +- **Binary types**: `binary()`, `varbinary()` with length — MySQL-specific. +- **No array type**: MySQL has no `.array()` method. + +### 4.5 Column Builder Inheritance Differences + +``` +ColumnBuilder (base, drizzle-orm/column-builder.js) +├── SQLiteColumnBuilder (sqlite-core/columns/common.js) +│ ├── SQLiteTextBuilder +│ ├── SQLiteTextJsonBuilder +│ ├── SQLiteBaseIntegerBuilder (adds .primaryKey() with autoIncrement config) +│ │ ├── SQLiteIntegerBuilder +│ │ ├── SQLiteTimestampBuilder +│ │ └── SQLiteBooleanBuilder +│ ├── SQLiteRealBuilder +│ ├── SQLiteNumericBuilder +│ ├── SQLiteBlobBufferBuilder / SQLiteBlobJsonBuilder / SQLiteBigIntBuilder +│ └── SQLiteCustomColumnBuilder +│ +├── PgColumnBuilder (pg-core/columns/common.js, adds .array(), .unique() with nulls) +│ ├── PgTextBuilder, PgVarcharBuilder, PgCharBuilder +│ ├── PgIntegerBuilder, PgSmallIntBuilder, etc. +│ ├── PgBooleanBuilder +│ ├── PgJsonBuilder, PgJsonbBuilder +│ ├── PgUUIDBuilder (adds .defaultRandom()) +│ ├── PgEnumColumnBuilder +│ ├── PgDateColumnBaseBuilder (adds .defaultNow()) +│ │ ├── PgTimestampBuilder +│ │ └── PgDateBuilder +│ ├── PgNumericBuilder +│ ├── PgCustomColumnBuilder +│ └── ...network/geometry types +│ +└── MySqlColumnBuilder (mysql-core/columns/common.js) + ├── MySqlColumnBuilderWithAutoIncrement (adds .autoincrement()) + │ ├── MySqlIntBuilder + │ ├── MySqlSmallIntBuilder + │ ├── MySqlSerialBuilder + │ └── ...other int types + ├── MySqlTextBuilder + ├── MySqlVarCharBuilder + ├── MySqlBooleanBuilder + ├── MySqlJsonBuilder + ├── MySqlEnumColumnBuilder + ├── MySqlDateColumnBaseBuilder (adds .defaultNow()) + │ ├── MySqlTimestampBuilder + │ ├── MySqlDateTimeBuilder + │ └── MySqlDateBuilder + ├── MySqlDecimalBuilder + ├── MySqlCustomColumnBuilder + └── ...MySQL-specific types +``` + +## 5. 
Table Creation Functions + +### Common Pattern + +All three dialects use the same general pattern: + +```ts +dialectTable(name, columns, extraConfig?) +dialectTable(name, (columnTypes) => columns, extraConfig?) +``` + +| Dialect | Table Function | Schema Variant | +|---------|-------------------|----------------------------------| +| SQLite | `sqliteTable` | `sqliteTableCreator(fn)` | +| PG | `pgTable` | `pgTableCreator(fn)` | +| MySQL | `mysqlTable` | `mysqlTableCreator(fn)` | + +Key differences: + +- **PG**: `pgTable` extra config gets `BuildExtraConfigColumns` (columns get `ExtraConfigColumn` with `.asc()`, `.desc()`, `.nullsFirst()`, `.nullsLast()`, `.op()` for index op classes). Also supports `.enableRLS()`. +- **SQLite/MySQL**: Extra config gets `BuildColumns` without the `ExtraConfigColumn` wrapper. +- **SQLite**: Extra config types include `IndexBuilder | CheckBuilder | ForeignKeyBuilder | PrimaryKeyBuilder | UniqueConstraintBuilder`. +- **PG**: Extra config types include `AnyIndexBuilder | CheckBuilder | ForeignKeyBuilder | PrimaryKeyBuilder | UniqueConstraintBuilder | PgPolicy`. +- **MySQL**: Extra config types include `AnyIndexBuilder | CheckBuilder | ForeignKeyBuilder | PrimaryKeyBuilder | UniqueConstraintBuilder`. + +The column builder callback receives dialect-specific column builders: + +```ts +// SQLite +sqliteTable('users', (t) => ({ id: t.integer().primaryKey() })) + +// PG +pgTable('users', (t) => ({ id: t.serial().primaryKey() })) + +// MySQL +mysqlTable('users', (t) => ({ id: t.serial() })) +``` + +## 6. Key Takeaways for a Transform Registry / DbType Approach + +### 6.1 The `columnType` String is the Key Discriminator + +Each column has a unique `columnType` string (e.g., `'SQLiteText'`, `'PgJsonb'`, `'MySqlInt'`). This is the most reliable way to identify a column's dialect and specific type at runtime. 
The `dataType` field (`'string' | 'number' | 'boolean' | 'date' | 'json' | 'bigint' | 'array' | 'buffer' | 'custom'`) is shared across dialects but too coarse for schema generation.
+
+### 6.2 The `dataType` Enum Defines the Universal Type Categories
+
+The `ColumnDataType` union is:
+```ts
+'string' | 'number' | 'boolean' | 'array' | 'json' | 'date' | 'bigint' | 'custom' | 'buffer'
+```
+
+This can serve as a common "logical type" for cross-dialect schema generation, but you need dialect-specific handling for:
+- `'string'` columns that are enums (check `enumValues`)
+- `'date'` columns (timestamp vs date, timezone handling)
+- `'json'` columns (json vs jsonb, mode-based detection)
+- `'number'` columns (integer vs float vs decimal semantics)
+
+### 6.3 Enum Handling is Radically Different
+
+| Aspect         | SQLite                           | PG                             | MySQL                          |
+|----------------|----------------------------------|--------------------------------|--------------------------------|
+| Declaration    | `text({ enum: [...] })`          | `pgEnum()` at module scope     | `mysqlEnum(values)` inline     |
+| Column builder | `SQLiteTextBuilder`              | `PgEnumColumnBuilder`          | `MySqlEnumColumnBuilder`       |
+| columnType     | `SQLiteText`                     | `PgEnumColumn`                 | `MySqlEnumColumn`              |
+| Storage        | TEXT with enum constraint        | Native `CREATE TYPE ... AS ENUM`| ENUM column type              |
+| `enumValues`   | Set on builder config            | Set via `PgEnum` instance      | Set on builder constructor     |
+
+For a schema generator, all three surface the `enumValues` field on the built column, so you can reliably extract enum values regardless of dialect. But the *declaration* style varies, and PG requires a pre-declared enum type. 
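
Because all three dialects expose `enumValues` uniformly on the built column, enum detection can be a single dialect-agnostic check. A minimal sketch — the `BuiltColumnLike` shape is a stand-in for the relevant subset of a built Drizzle column, not the actual class:

```typescript
// Stand-in for the subset of a built Drizzle column this check needs.
interface BuiltColumnLike {
  columnType: string // e.g. 'SQLiteText', 'PgEnumColumn', 'MySqlEnumColumn'
  enumValues?: string[]
}

// Returns the enum values if the column is an enum, regardless of how the
// enum was declared: text({ enum }), pgEnum, or mysqlEnum.
function extractEnumValues(column: BuiltColumnLike): string[] | undefined {
  return column.enumValues && column.enumValues.length > 0 ? column.enumValues : undefined
}
```
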
+ +### 6.4 Mode-Based Polymorphism + +Several column types use a `mode` config to change the TypeScript type at compile time: + +| Dialect | Column | Modes | Effect on dataType | +|---------|-------------|------------------------------------------------|--------------------| +| SQLite | `integer()` | `'number'` / `'timestamp'` / `'timestamp_ms'` / `'boolean'` | `number` / `date` / `boolean` | +| SQLite | `text()` | `'text'` / `'json'` | `string` / `json` | +| SQLite | `blob()` | `'buffer'` / `'json'` / `'bigint'` | `buffer` / `json` / `bigint` | +| PG | `timestamp()` | `'date'` / `'string'` | `date` / `string` | +| PG | `date()` | `'date'` / `'string'` | `date` / `string` | +| PG | `time()` | `'string'` (default) / `'date'`? | `string` / `date` | +| MySQL | `timestamp()` | `'date'` / `'string'` | `date` / `string` | +| MySQL | `datetime()` | `'date'` / `'string'` | `date` / `string` | +| MySQL | `date()` | `'date'` / `'string'` | `date` / `string` | +| MySQL | `bigint()` | `'number'` / `'bigint'` | `number` / `bigint`| + +The `mode` is **not** stored in the `dataType` field on the column config directly — it's stored in the dialect-specific runtime config. The `dataType` on the built column does reflect the mode correctly (because each mode produces a different builder class with a different `dataType`). 
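
Since the mode is already baked into `dataType`, a transform can classify purely on `dataType` without re-deriving the mode. A sketch of that dispatch — the returned labels are illustrative category names, not real TypeBox calls:

```typescript
type ColumnDataType =
  | 'string' | 'number' | 'boolean' | 'array' | 'json' | 'date' | 'bigint' | 'custom' | 'buffer'

// Classify on dataType alone: integer({ mode: 'boolean' }) already arrives
// here as 'boolean', text({ mode: 'json' }) as 'json', and so on.
function logicalType(dataType: ColumnDataType): string {
  switch (dataType) {
    case 'boolean': return 'Boolean'
    case 'date':    return 'Timestamp'
    case 'json':    return 'Json'
    case 'bigint':  return 'BigInt'
    case 'number':  return 'Number'
    case 'string':  return 'String'
    case 'buffer':  return 'Buffer'
    case 'array':   return 'Array'
    default:        return 'Custom'
  }
}
```
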
+ +### 6.5 Runtime Config Differences + +Every column builder's `config` object extends `ColumnBuilderRuntimeConfig` with dialect-specific fields: + +- **SQLite integer**: `{ autoIncrement: boolean }` (on `SQLiteBaseInteger`) +- **SQLite text**: `{ length?: number, enumValues?: string[] }` +- **PG varchar**: `{ length?: number, enumValues?: string[] }` +- **PG timestamp**: `{ withTimezone: boolean, precision?: number }` +- **PG numeric**: `{ precision?: number, scale?: number }` +- **MySQL int**: `{ unsigned?: boolean }` plus `{ autoIncrement: boolean }` +- **MySQL text**: `{ textType: 'tinytext' | 'text' | 'mediumtext' | 'longtext', enumValues?: string[] }` +- **MySQL timestamp/datetime**: `{ fsp?: number }` + +### 6.6 Implications for a DbType / Transform Registry + +**Recommended approach:** + +1. **Use `dataType` as primary classification** — it's the cross-dialect enum that maps naturally to TypeBox schema types. +2. **Use `columnType` for dialect-specific dispatch** — when you need to handle a column differently based on its SQL type (e.g., `PgJsonb` vs `MySqlJson`). +3. **Check `enumValues`** to detect enum columns regardless of how they were declared. +4. **Don't rely on builder class hierarchy** — the builders are compile-time types that don't exist at runtime in a useful way for introspection. Instead, inspect the built column's properties (`dataType`, `columnType`, `enumValues`, runtime config). +5. **Mode detection**: For columns with mode variants, the `dataType` already reflects the mode (e.g., `integer({ mode: 'boolean' })` produces `dataType: 'boolean'`), so you don't need to check mode separately for most schema generation. +6. **Handle PG arrays specially**: PG's `.array()` wraps any column type into an array type with `dataType: 'array'`. The base column type info is preserved in `config.baseBuilder`. +7. 
**PG identity columns**: PG has `generatedIdentity` on columns (via `generatedAlwaysAsIdentity()`/`generatedByDefaultAsIdentity()`), which is separate from regular defaults. Check `column.generatedIdentity` or `builder._.identity`. + +### 6.7 Column Introspection at Runtime + +At runtime, all built columns extend `Column` which has these useful properties: + +```ts +column.name // Column name +column.dataType // ColumnDataType: 'string' | 'number' | ... +column.columnType // Dialect-specific string: 'PgJsonb', 'SQLiteText', etc. +column.notNull // boolean +column.hasDefault // boolean +column.primary // boolean (isPrimaryKey) +column.default // Default value or SQL +column.enumValues // string[] | undefined +column.isUnique // boolean +column.uniqueName // string | undefined +column.generated // GeneratedColumnConfig | undefined +``` + +Plus dialect-specific properties accessible via the column class (e.g., `column.withTimezone` on PgTimestamp, `column.mode` on SQLiteTimestamp, etc.). + +### 6.8 The `entityKind` Discriminator + +Every builder and column class has a static `[entityKind]` string that uniquely identifies the class. This can be used for runtime type checking: + +``` +'SQLiteTextBuilder' / 'SQLiteText' +'PgTextBuilder' / 'PgText' +'MySqlTextBuilder' / 'MySqlText' +'PgColumnBuilder' (base) +'SQLiteColumnBuilder' (base) +'MySqlColumnBuilder' (base) +// etc. +``` + +This is set via `entityKind` symbol from `drizzle-orm/entity.js`. + +## 7. Summary Table: Cross-Dialect Type Mapping + +| Logical Type | SQLite Factory | PG Factory | MySQL Factory | dataType | +|-------------|------------------------|---------------------|------------------------|----------| +| String | `text()` | `text()` / `varchar()` | `text()` / `varchar()` | `string` | + Enum | `text({ enum: [...] 
})`| `pgEnum()()` | `mysqlEnum()()` | `string` | + JSON | `text({ mode: 'json' })`| `json()` / `jsonb()`| `json()` | `json` | + Number | `integer()` / `real()` | `integer()` / `real()`| `int()` / `float()` | `number` | + BigInt | `blob({ mode: 'bigint' })`| `bigint()` | `bigint()` | `bigint` | + Boolean | `integer({ mode: 'boolean' })`| `boolean()` | `boolean()` / `tinyint()`| `boolean`| + Date | `integer({ mode: 'timestamp' })`| `timestamp()` | `timestamp()` / `datetime()`| `date` | + UUID | `text()` (manual) | `uuid()` | — | `string` | + Decimal | `numeric()` | `numeric()` / `decimal()`| `decimal()` | `string` | + Buffer | `blob({ mode: 'buffer' })`| — | `binary()` / `varbinary()`| `buffer`| + Array | — | `.array()` on any col| — | `array` | \ No newline at end of file diff --git a/docs/research/typedef-kind-pattern.md b/docs/research/typedef-kind-pattern.md new file mode 100644 index 0000000..0611807 --- /dev/null +++ b/docs/research/typedef-kind-pattern.md @@ -0,0 +1,399 @@ +# TypeDef Kind-Based Extension Pattern + +## Overview + +The `@alkdev/typebox` TypeDef example (`example/typedef/typedef.ts`) demonstrates how to build a fully custom type system on top of TypeBox's Kind-based extensibility. It replaces JSON Schema semantics with a flat, binary-protocol-oriented type vocabulary while reusing TypeBox's infrastructure (Kind symbol, TypeRegistry, schema interfaces). + +This document analyzes the pattern in detail and maps it to a future `DbType` system for drizzlebox. + +--- + +## 1. The Kind Symbol — Core Extension Mechanism + +**Location**: `@alkdev/typebox/src/type/symbols/symbols.ts:38` + +```ts +export const Kind = Symbol.for('TypeBox.Kind') +``` + +`Kind` is a global Symbol (`Symbol.for`) used as a property key on every schema object. Its value is a string that identifies the type's identity, e.g. `'String'`, `'TypeDef:String'`, `'Object'`. 
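
A tiny runnable illustration of the mechanism (simplified, not TypeBox's actual internals): because `Symbol.for` registers the symbol in the global symbol registry, every module that computes `Symbol.for('TypeBox.Kind')` agrees on the same dispatch key.

```typescript
// Simplified illustration of Kind-based dispatch, not TypeBox internals.
const Kind: unique symbol = Symbol.for('TypeBox.Kind')

// Any object carrying the Kind property can participate in dispatch.
interface KindedSchema {
  [Kind]: string
}

const TypeDefString: KindedSchema = { [Kind]: 'TypeDef:String' }

// Validation, guarding, and compilation all dispatch on this one key.
function kindOf(schema: KindedSchema): string {
  return schema[Kind]
}
```
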
+
+Every TypeBox schema interface extends `TSchema`, which extends `TKind`:
+
+```ts
+// schema.ts:49-57
+export interface TKind {
+  [Kind]: string
+}
+export interface TSchema extends TKind, SchemaOptions {
+  [ReadonlyKind]?: string
+  [OptionalKind]?: string
+  [Hint]?: string
+  params: unknown[]
+  static: unknown
+}
+```
+
+**Key insight**: The `[Kind]` property is the single dispatch key for the entire type system. Validation, type guarding, compilation — everything dispatches on `schema[Kind]`.
+
+**Namespacing convention**: TypeDef uses a colon-separated namespace: `'TypeDef:String'`, `'TypeDef:Int8'`, `'TypeDef:Struct'`. This avoids collisions with TypeBox's built-in kinds (`'String'`, `'Number'`, `'Object'`).
+
+---
+
+## 2. Defining Custom Kind Interfaces
+
+Each custom type declares a TypeScript interface extending `Types.TSchema` (imported as `Types` from `'@alkdev/typebox/type'`):
+
+```ts
+export interface TString extends Types.TSchema {
+  [Types.Kind]: 'TypeDef:String'
+  type: 'string'
+  static: string
+}
+
+export interface TInt8 extends Types.TSchema {
+  [Types.Kind]: 'TypeDef:Int8'
+  type: 'int8'
+  static: number
+}
+
+export interface TStruct<T extends TFields> extends Types.TSchema, StructMetadata {
+  [Types.Kind]: 'TypeDef:Struct'
+  static: StructStatic<T>
+  optionalProperties: { [K in Assert<OptionalKeys<T>, keyof T>]: T[K] }
+  properties: { [K in Assert<RequiredKeys<T>, keyof T>]: T[K] }
+}
+```
+
+**Pattern anatomy**:
+- `[Types.Kind]` — literal string type for dispatch identity
+- `static` — mapped TypeScript type for `Static` inference
+- Domain-specific properties (`type`, `elements`, `properties`, `optionalProperties`, `discriminator`, `mapping`, `enum`, `additionalProperties`)
+- Generic type parameters for compositional types (`TStruct`, `TArray`, `TRecord`)
+
+**For DbType**: We would define interfaces like:
+
+```ts
+export interface TDbVarChar<TInner extends TSchema> extends TSchema {
+  [Kind]: 'DbType:VarChar'
+  static: Static<TInner>
+  inner: TInner // the validation schema (e.g. 
TString with maxLength)
+  maxLength: number // database metadata
+}
+
+export interface TDbSerial<TInner extends TSchema> extends TSchema {
+  [Kind]: 'DbType:Serial'
+  static: Static<TInner>
+  inner: TInner
+  dataType: 'integer'
+}
+```
+
+The `inner` field carries the validation schema. The database-specific metadata (`maxLength`, `dataType`, `precision`, etc.) lives alongside it.
+
+---
+
+## 3. TypeDefBuilder.Create() — Metadata Attachment
+
+**Location**: `typedef.ts:522-525`
+
+```ts
+export class TypeDefBuilder {
+  protected Create(schema: Record<PropertyKey, unknown>, metadata: Record<PropertyKey, unknown>): any {
+    const keys = globalThis.Object.getOwnPropertyNames(metadata)
+    return keys.length > 0 ? { ...schema, metadata: { ...metadata } } : { ...schema }
+  }
+  // ...
+}
+```
+
+Each builder method calls `this.Create(...)` with:
+1. **The schema object** — containing `[Kind]` and structural properties
+2. **A metadata bag** — arbitrary key-value pairs
+
+If metadata is non-empty, it's spread into a `metadata` sub-object. Otherwise, just the schema is returned plain.
+
+Example builder methods:
+
+```ts
+public Int8(metadata: Metadata = {}): TInt8 {
+  return this.Create({ [Types.Kind]: 'TypeDef:Int8', type: 'int8' }, metadata)
+}
+
+public Struct<T extends TFields>(fields: T, metadata: StructMetadata = {}): TStruct<T> {
+  // ... computes optionalProperties and properties ...
+  return this.Create({ [Types.Kind]: 'TypeDef:Struct', ...requiredObject, ...optionalObject }, metadata)
+}
+```
+
+**For DbType**: The `Create()` pattern works well, but we'd modify it to:
+- Flatten database metadata into the schema directly (rather than nesting in a `metadata` sub-object), since tools like drizzle-kit need to inspect these properties
+- Or keep the `metadata` pattern but make it type-safe with a `DbMetadata` interface
+
+---
+
+## 4. 
TypeRegistry.Set — Validation Function Registration
+
+**Location**: `@alkdev/typebox/src/type/registry/type.ts`
+
+```ts
+export type TypeRegistryValidationFunction<TSchema> = (schema: TSchema, value: unknown) => boolean
+
+const map = new Map<string, TypeRegistryValidationFunction<any>>()
+
+export function Set<TSchema>(kind: string, func: TypeRegistryValidationFunction<TSchema>) {
+  map.set(kind, func)
+}
+export function Get(kind: string) {
+  return map.get(kind)
+}
+export function Has(kind: string) {
+  return map.has(kind)
+}
+```
+
+TypeDef registers all its custom kinds at module level:
+
+```ts
+Types.TypeRegistry.Set('TypeDef:Int8', (schema, value) => ValueCheck.Check(schema, value))
+Types.TypeRegistry.Set('TypeDef:String', (schema, value) => ValueCheck.Check(schema, value))
+Types.TypeRegistry.Set('TypeDef:Struct', (schema, value) => ValueCheck.Check(schema, value))
+// ... etc
+```
+
+**How TypeBox's built-in ValueCheck uses this** (`value/check/check.ts:423-500`):
+```ts
+function FromKind(schema: TSchema, references: TSchema[], value: unknown): boolean {
+  if (!TypeRegistry.Has(schema[Kind])) return false
+  const func = TypeRegistry.Get(schema[Kind])!
+  return func(schema, value)
+}
+
+function Visit(schema, references, value) {
+  switch (schema[Kind]) {
+    case 'String': return FromString(...)
+    case 'Number': return FromNumber(...)
+    // ... all built-in kinds
+    default:
+      if (!TypeRegistry.Has(schema[Kind])) throw new ValueCheckUnknownTypeError(schema)
+      return FromKind(schema, references, value)
+  }
+}
+```
+
+**Key observation**: TypeBox's own `Visit()` handles all built-in kinds in a switch statement. Custom kinds hit the `default` branch and dispatch through `TypeRegistry`. This means TypeDef provides its **own** `ValueCheck.Visit()` that dispatches on `TypeDef:*` kinds — it does NOT rely on TypeBox's `Visit()` at all. TypeDef's `ValueCheck` is a separate, independent validation path.
+
+**For DbType**: There are two approaches:
+1. 
**TypeDef-style**: Build a completely separate `DbValueCheck.Visit()` with its own switch statement. Full control but duplicates infrastructure. +2. **Registry-style**: Register `DbType:*` validation functions with `TypeRegistry.Set()` and let TypeBox's existing `ValueCheck` dispatch through `FromKind`. This is simpler and integrates with TypeBox's existing validation pipeline. + +Option 2 is preferable for DbType since we want to compose with existing TypeBox types, not replace them. + +--- + +## 5. TypeGuard — Schema Structure Validation + +**Location**: `typedef.ts:376-477` + +TypeDef implements its own `TypeGuard` namespace that validates the **structure** of schema objects (not values, but the schemas themselves): + +```ts +export namespace TypeGuard { + export function TInt8(schema: unknown): schema is TInt8 { + return IsObject(schema) && schema[Types.Kind] === 'TypeDef:Int8' && schema['type'] === 'int8' + } + + export function TStruct(schema: unknown): schema is TStruct { + if(!(IsObject(schema) && schema[Types.Kind] === 'TypeDef:Struct' && IsOptionalBoolean(schema['additionalProperties']))) return false + // ... validate properties and optionalProperties + } + + export function TSchema(schema: unknown): schema is Types.TSchema { + return ( + TArray(schema) || + TBoolean(schema) || + // ... all TypeDef kinds ... + TStruct(schema) || + TTimestamp(schema) || + (TKind(schema) && Types.TypeRegistry.Has(schema[Types.Kind])) // fallback to registry + ) + } +} +``` + +**Fallback to registry**: The last clause `(TKind(schema) && Types.TypeRegistry.Has(schema[Types.Kind]))` is crucial — it allows kinds registered with `TypeRegistry` but not known to the static `TSchema()` guard to still pass. This enables extensibility. + +**For DbType**: We need a `DbGuard` namespace that: +- Validates `DbType:*` schema shapes (checks that `inner` is a valid schema, that `maxLength` is a number, etc.) 
+- Falls back to TypeBox's built-in `TypeGuard.TSchema()` for non-DbType schemas
+- Provides a unified `isDbSchema()` function
+
+---
+
+## 6. TypeSystem.Type() — The "Simple" Custom Type API
+
+**Location**: `@alkdev/typebox/src/system/system.ts:55-58`
+
+```ts
+export namespace TypeSystem {
+  export function Type<Type, Options extends Record<PropertyKey, unknown> = Record<PropertyKey, unknown>>(
+    kind: string,
+    check: (options: Options, value: unknown) => boolean
+  ): TypeFactoryFunction<Type, Options> {
+    if (TypeRegistry.Has(kind)) throw new TypeSystemDuplicateTypeKind(kind)
+    TypeRegistry.Set(kind, check)
+    return (options: Partial<Options> = {}) => Unsafe<Type>({ ...options, [Kind]: kind })
+  }
+}
+```
+
+This is TypeBox's built-in shortcut for simple custom types:
+1. Registers a validation function with `TypeRegistry`
+2. Returns a factory that creates `TUnsafe` schemas with the custom `[Kind]`
+
+The `TUnsafe` type (`unsafe.ts`):
+```ts
+export interface TUnsafe<T> extends TSchema {
+  [Kind]: string
+  static: T
+}
+export function Unsafe<T>(options: UnsafeOptions = {}): TUnsafe<T> {
+  return CreateType({ [Kind]: options[Kind] ?? 'Unsafe' }, options) as never
+}
+```
+
+**Limitation for DbType**: `TUnsafe` provides static type inference but no structural guarantees. A `DbType:VarChar` created via `TypeSystem.Type()` would have `static: string` but the schema object wouldn't encode `inner`, `maxLength`, etc. in a type-safe way. TypeDef's pattern of explicit interfaces is strictly better for complex types.
+
+---
+
+## 7. Relationship Between TypeDef Types and TypeBox Built-in Types
+
+TypeDef **replaces** all JSON Schema types. It does not compose with them:
+- TypeBox `String` has `kind: 'String'`, supports `minLength`, `maxLength`, `pattern`, `format`
+- TypeDef `String` has `kind: 'TypeDef:String'`, supports only `type: 'string'` and metadata
+
+TypeDef's `ValueCheck` is entirely separate from TypeBox's `ValueCheck`. They dispatch on disjoint Kind namespaces (`'TypeDef:*'` vs `'*'`). 
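+
+The wrapping alternative that DbType needs can be shown without any library at all. A minimal sketch (hypothetical names, not the proposed drizzlebox API): validation stays in an `inner` check, database facts live in a `db` bag, and neither replaces the other:
+
+```typescript
+type Check = (value: unknown) => boolean
+
+// A custom kind that wraps a validator instead of reimplementing validation
+interface DbVarChar {
+  kind: 'DbType:VarChar'
+  inner: Check // validation borrowed from the wrapped schema
+  db: { dataType: 'varchar'; maxLength: number }
+}
+
+function varchar(maxLength: number): DbVarChar {
+  return {
+    kind: 'DbType:VarChar',
+    inner: (v) => typeof v === 'string' && v.length <= maxLength,
+    db: { dataType: 'varchar', maxLength },
+  }
+}
+
+const nameCol = varchar(8)
+const ok = nameCol.inner('alice')              // validation path
+const ddl = `VARCHAR(${nameCol.db.maxLength})` // DDL path reads only metadata
+```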
+
+TypeDef's `TypeGuard.TSchema()` does reference `Types.TypeRegistry.Has()` as a fallback, allowing registered types to pass. But the structural validation of TypeDef schemas is wholly custom.
+
+**For DbType**: We want **composition**, not replacement. A `DbType:VarChar` should wrap a TypeBox `TString` (with `maxLength`) and add `db: { kind: 'varchar', maxLength: 255 }`. This means DbType schemas should carry a reference to the inner TypeBox schema, not reimplement validation logic.
+
+---
+
+## 8. What the Current drizzlebox/src Approach Takes From This Pattern
+
+Looking at `column.ts:66-67`:
+```ts
+TypeRegistry.Set('Buffer', (_, value) => value instanceof Buffer);
+export const bufferSchema: BufferSchema = { [Kind]: 'Buffer', type: 'buffer' } as any;
+```
+
+This is a **minimal** application of the Kind/TypeRegistry pattern — a single custom type for Buffer validation. It uses the `TypeSystem.Type()`-style approach: register a check function, create a schema object with `[Kind]`.
+
+Similarly, `utils.ts:18-27` defines `JsonSchema` and `BufferSchema` as interfaces extending `TSchema`:
+```ts
+export interface JsonSchema extends TSchema {
+  [Kind]: 'Union'
+  static: Json
+  anyOf: Json
+}
+export interface BufferSchema extends TSchema {
+  [Kind]: 'Buffer'
+  static: Buffer
+  type: 'buffer'
+}
+```
+
+The rest of `column.ts` maps drizzle column types to **standard TypeBox types** (`t.String()`, `t.Integer()`, `t.Number()`, etc.) — meaning all the database type information (that something is a `PgInteger` vs a `MySqlInt`) is lost. The schema only preserves validation semantics.
+
+---
+
+## 9. Proposed DbType Pattern — Key Design Decisions
+
+### 9.1 Wrap, Don't Replace
+
+Each DbType schema should carry an `inner` TypeBox schema AND database metadata:
+
+```ts
+export interface TDbInteger<TInner extends TSchema> extends TSchema {
+  [Kind]: 'DbType:Integer'
+  static: Static<TInner>
+  inner: TInner
+  db: {
+    dataType: 'integer'
+    columnType: string // 'PgInteger' | 'MySqlInt' | etc. 
+    unsigned?: boolean
+    hasDefault?: boolean
+    notNull?: boolean
+  }
+}
+```
+
+This preserves both validation semantics AND database semantics in one schema object.
+
+### 9.2 Register with TypeRegistry for Composition
+
+```ts
+TypeRegistry.Set('DbType:Integer', (schema, value) => {
+  return Value.Check(schema.inner, value) // delegate to inner schema
+})
+```
+
+This lets DbType schemas compose with TypeBox's existing validation pipeline.
+
+### 9.3 Separate TypeGuard for DbType Schemas
+
+```ts
+export namespace DbGuard {
+  export function TDbInteger(schema: unknown): schema is TDbInteger<TSchema> {
+    return IsObject(schema)
+      && schema[Kind] === 'DbType:Integer'
+      && IsObject(schema['db'])
+      && TypeGuard.TSchema(schema['inner'])
+  }
+  // ... etc
+}
+```
+
+### 9.4 Builder Pattern with Column Metadata
+
+```ts
+export class DbBuilder {
+  protected Create<T extends TSchema>(schema: Record<PropertyKey, unknown>, inner: T, db: DbColumnMeta): any {
+    return { ...schema, inner, db }
+  }
+
+  public Integer(column: Column, inner?: TSchema): TDbInteger<TSchema> {
+    const defaultInner = inner ?? t.Integer({ minimum: ..., maximum: ... })
+    return this.Create(
+      { [Kind]: 'DbType:Integer' },
+      defaultInner,
+      { dataType: 'integer', columnType: column.columnType, ... }
+    )
+  }
+}
+```
+
+### 9.5 What This Enables
+
+- **Validation**: `Value.Check(dbSchema, value)` works via TypeRegistry delegation
+- **Schema introspection**: `dbSchema.inner` for validation-only, `dbSchema.db` for database metadata
+- **Type extraction**: `Static<T>` correctly resolves through `inner`
+- **Migration generation**: Walk `dbSchema.db` to produce DDL
+- **Drizzle integration**: Replace `columnToSchema()` with `DbType.fromColumn()` that produces DbType schemas
+
+---
+
+## 10. 
Key Differences from TypeDef's Approach
+
+| Aspect | TypeDef | Proposed DbType |
+|--------|---------|-----------------|
+| Relation to TypeBox | Replacement | Composition (wrapping) |
+| Validation | Custom ValueCheck entirely | Delegate to inner via TypeRegistry |
+| TypeGuard | Fully custom | Compose with TypeBox's TypeGuard |
+| Metadata | Flat `metadata: {}` bag | Structured `db: DbColumnMeta` |
+| Kind namespace | `TypeDef:*` | `DbType:*` |
+| Inner schema reference | None (TypeDef IS the schema) | `inner: TSchema` field |
+| Purpose | Binary protocol types | Database column types with validation |
+
+The composition approach is essential because DbType must preserve TypeBox's rich validation capabilities (`format`, `pattern`, `minimum`/`maximum`, etc.) while layering database semantics on top. TypeDef's flat replacement approach works because binary protocol types have simpler validation needs (int8 range checks, ISO timestamp formatting, etc.).
\ No newline at end of file
diff --git a/docs/research/typemap-architecture.md b/docs/research/typemap-architecture.md
new file mode 100644
index 0000000..5675a6e
--- /dev/null
+++ b/docs/research/typemap-architecture.md
+# Typemap Architecture Research
+
+## Overview
+
+`@alkdev/typemap` is a translation system that converts between four schema formats: TypeBox, Valibot, Zod, and Syntax (a TypeBox DSL string format). It implements an **N-way translation matrix** where each target can translate from any source, including itself (identity).
+
+The architecture has three key properties we want to understand and potentially reuse:
+1. **Module structure enabling tree-shaking**
+2. **Translation target isolation**
+3. **Guard/detection layer for runtime type dispatch**
+
+---
+
+## 1. 
Module Structure for Tree-Shaking + +### Directory Layout + +``` +src/ + index.ts # Public API barrel export + guard.ts # Runtime type detection + options.ts # Shared option types + static.ts # Static type inference utility + syntax/ + syntax.ts # Dispatcher (Syntax function) + syntax-from-syntax.ts # Identity: syntax -> syntax + syntax-from-typebox.ts # TypeBox -> Syntax + syntax-from-valibot.ts # Valibot -> Syntax (via TypeBox) + syntax-from-zod.ts # Zod -> Syntax (via TypeBox) + typebox/ + typebox.ts # Dispatcher (TypeBox function) + typebox-from-syntax.ts # Syntax -> TypeBox + typebox-from-typebox.ts # Identity: TypeBox -> TypeBox + typebox-from-valibot.ts # Valibot -> TypeBox + typebox-from-zod.ts # Zod -> TypeBox + valibot/ + valibot.ts # Dispatcher (Valibot function) + valibot-from-syntax.ts # Syntax -> Valibot (via TypeBox) + valibot-from-typebox.ts # TypeBox -> Valibot + valibot-from-valibot.ts # Identity: Valibot -> Valibot + valibot-from-zod.ts # Zod -> Valibot (via TypeBox) + common.ts # Shared Valibot type aliases + zod/ + zod.ts # Dispatcher (Zod function) + zod-from-syntax.ts # Syntax -> Zod (via TypeBox) + zod-from-typebox.ts # TypeBox -> Zod + zod-from-valibot.ts # Valibot -> Zod (via TypeBox) + zod-from-zod.ts # Identity: Zod -> Zod + compile/ + compile.ts # High-level compile API + validator.ts # Standard-schema wrapper for TypeCompiler + environment.ts # Detects eval() support + path.ts # JSON Pointer -> Standard Schema path conversion + standard.ts # Standard Schema V1 interface definition +``` + +### How Tree-Shaking Works + +Each `from-*` file is a **self-contained translation unit** that only imports: +- The source library (e.g., `valibot`) for type definitions and runtime introspection +- The `@alkdev/typebox` runtime utilities (`ValueGuard`) for structural checks +- `../guard.ts` in the dispatcher files only + +The `index.ts` re-exports everything from every `from-*` file: + +```ts +export * from './typebox/typebox-from-syntax' +export * 
from './typebox/typebox-from-typebox' +export * from './typebox/typebox-from-valibot' +export * from './typebox/typebox-from-zod' +export { type TTypeBox, TypeBox } from './typebox/typebox' +// ... same pattern for valibot/, zod/, syntax/ +``` + +**Tree-shaking mechanism**: When a bundler processes `import { TypeBox } from '@alkdev/typemap'`, it: +1. Parses `index.ts` and sees the re-exports +2. Follows into `typebox/typebox.ts` which imports all `from-*` files +3. The `TypeBox()` dispatcher function conditionally calls `TypeBoxFromSyntax`, `TypeBoxFromTypeBox`, `TypeBoxFromValibot`, or `TypeBoxFromZod` based on guards + +**Critical limitation**: Because the dispatchers (`TypeBox()`, `Valibot()`, `Zod()`, `Syntax()`) use **runtime guards** to decide which path to take, bundlers **cannot** eliminate the unused branches. If you call `TypeBox(zodSchema)`, all four `from-*` modules are still included because the bundler cannot know at build time which guard branch will execute. + +The real tree-shaking opportunity exists at the **named export level**. A consumer who only imports `TypeBoxFromZod` directly (not via the `TypeBox` dispatcher) can avoid pulling in valibot/typebox-from-syntax/etc: + +```ts +import { TypeBoxFromZod } from '@alkdev/typemap' +``` + +However, the current `package.json` exports map has a **single entry point** (`"."`), which means there's no sub-path exports for individual translation units. The tree-shaking effectiveness depends entirely on the bundler's ability to eliminate unused `export *` re-exports. 
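+
+The limitation can be seen in miniature (the functions below are hypothetical stand-ins, not typemap's real translators): a runtime-guard dispatcher keeps every branch statically reachable, while a direct named import references exactly one translator:
+
+```typescript
+type IR = { kind: string }
+
+// Stand-ins for two generated translators
+const TypeBoxFromZodLike = (_schema: object): IR => ({ kind: 'from-zod' })
+const TypeBoxFromValibotLike = (_schema: object): IR => ({ kind: 'from-valibot' })
+
+// Dispatcher form: a bundler that sees only this function must keep BOTH
+// translators, because it cannot prove which guard branch will run
+function TypeBoxLike(schema: { '~standard'?: { vendor?: string } }): IR {
+  const vendor = schema['~standard']?.vendor
+  return vendor === 'zod' ? TypeBoxFromZodLike(schema)
+    : vendor === 'valibot' ? TypeBoxFromValibotLike(schema)
+    : { kind: 'never' }
+}
+
+// Direct form: only TypeBoxFromZodLike is referenced; the valibot path is shakeable
+const viaDirect = TypeBoxFromZodLike({})
+const viaDispatch = TypeBoxLike({ '~standard': { vendor: 'valibot' } })
+```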
+ +### Build System + +The `build.mjs` produces two output formats: +- **CJS**: `tsc` with `--module Node16` -> `target/build/cjs/` +- **ESM**: `tsc` with `--module ESNext` -> `target/build/esm/`, then mutates `.js` -> `.mjs` and rewrites import specifiers + +The generated `package.json` for the published package: + +```json +{ + "types": "./build/cjs/index.d.ts", + "main": "./build/cjs/index.js", + "module": "./build/esm/index.mjs", + "esm.sh": { "bundle": false }, + "exports": { + ".": { + "require": { "types": "./build/cjs/index.d.ts", "default": "./build/cjs/index.js" }, + "import": { "types": "./build/esm/index.d.mts", "default": "./build/esm/index.mjs" } + } + } +} +``` + +The `"esm.sh": { "bundle": false }` directive tells esm.sh CDN not to bundle peer dependencies, which is important for keeping external libraries external. + +**No sub-path exports are defined**, meaning consumers can't do granular imports like `@alkdev/typemap/typebox-from-zod`. This is a missed tree-shaking opportunity. + +--- + +## 2. Translation Target Isolation + +### The Dispatcher Pattern + +Each target directory has a `dispatcher.ts` file (e.g., `typebox/typebox.ts`) that: + +1. **Imports all `from-*` modules** in the same target directory +2. **Imports `guard.ts`** for runtime type detection +3. **Exposes a single overloaded function** with conditional dispatch + +Example from `typebox/typebox.ts`: + +```ts +import { TypeBoxFromSyntax } from './typebox-from-syntax' +import { TypeBoxFromTypeBox } from './typebox-from-typebox' +import { TypeBoxFromValibot } from './typebox-from-valibot' +import { TypeBoxFromZod } from './typebox-from-zod' +import * as g from '../guard' + +export function TypeBox(...args: any[]): never { + const [parameter, type, options] = g.Signature(args) + return ( + g.IsSyntax(type) ? TypeBoxFromSyntax(ContextFromParameter(parameter), type, options) : + g.IsTypeBox(type) ? TypeBoxFromTypeBox(type) : + g.IsValibot(type) ? 
TypeBoxFromValibot(type) :
+    g.IsZod(type) ? TypeBoxFromZod(type) :
+    t.Never()
+  ) as never
+}
+```
+
+Key observations:
+- The function uses **rest args** (`...args: any[]`) and `g.Signature()` to normalize the overloaded call signatures
+- Runtime dispatch via ternary chain over guard functions
+- The same pattern is duplicated for `Valibot()`, `Zod()`, and `Syntax()`
+
+### The from-* File Pattern
+
+Each `from-*` file is a pure translation module:
+
+```
+{name}-from-{source}.ts
+```
+
+Where `{name}` is the target and `{source}` is the input format. These files:
+- Import only the **source** library types and the **target** library types
+- Implement the full translation mapping (every type node in source -> target)
+- Are self-contained: no cross-dependencies on other `from-*` files **except**:
+
+### The Two-Hop Translation Pattern
+
+Some translations don't have a direct path. Instead, they go through TypeBox as an intermediate:
+
+```ts
+// syntax-from-valibot.ts - Valibot -> Syntax (via TypeBox)
+export function SyntaxFromValibot<Type extends v.BaseSchema<unknown, unknown, v.BaseIssue<unknown>>>(type: Type): TSyntaxFromValibot<Type> {
+  const typebox = TypeBoxFromValibot(type)  // Valibot -> TypeBox
+  const result = SyntaxFromTypeBox(typebox) // TypeBox -> Syntax
+  return result as never
+}
+
+// syntax-from-zod.ts - Zod -> Syntax (via TypeBox)
+export function SyntaxFromZod<Type extends z.ZodTypeAny>(type: Type): TSyntaxFromZod<Type> {
+  const typebox = TypeBoxFromZod(type)      // Zod -> TypeBox
+  const result = SyntaxFromTypeBox(typebox) // TypeBox -> Syntax
+  return result as never
+}
+```
+
+This means TypeBox acts as a **hub/intermediate representation (IR)**. The translation graph looks like:
+
+```
+            Syntax
+           /      \
+   SyntaxFrom    SyntaxFrom
+       |             |
+  TypeBoxFrom    TypeBoxFrom
+     /    \        /    \
+    /      \      /      \
+ Valibot   Zod  Valibot   Zod
+```
+
+TypeBox is the **canonical IR**. All translations between non-TypeBox formats go through TypeBox first, then to the target. This reduces the N^2 translation problem to 2N translations (source->IR, IR->target).
+
+---
+
+## 3. 
Guard/Detection Layer
+
+The `guard.ts` module provides two mechanisms:
+
+### Type-Level Guards
+
+```ts
+export type SyntaxType = string
+export type TypeBoxType = t.TSchema
+export type ValibotType = v.BaseSchema<unknown, unknown, v.BaseIssue<unknown>>
+export type ZodType = z.ZodTypeAny | z.ZodEffects<z.ZodTypeAny>
+```
+
+These are used in the conditional type system of each dispatcher:
+
+```ts
+export type TTypeBox<Type extends object | string, Result = (
+  Type extends g.SyntaxType ? TTypeBoxFromSyntax<{}, Type> :
+  Type extends g.TypeBoxType ? TTypeBoxFromTypeBox<Type> :
+  Type extends g.ValibotType ? TTypeBoxFromValibot<Type> :
+  Type extends g.ZodType ? TTypeBoxFromZod<Type> :
+  t.TNever
+)> = Result
+```
+
+### Runtime Guards
+
+```ts
+export function IsSyntax(value: unknown): value is string {
+  return t.ValueGuard.IsString(value)
+}
+
+export function IsTypeBox(type: unknown): type is t.TSchema {
+  return t.KindGuard.IsSchema(type) // checks for [Symbol.for('@alkdev/typebox/Kind')]
+}
+
+export function IsValibot(type: unknown): type is v.AnySchema {
+  return (
+    t.ValueGuard.IsObject(type) &&
+    t.ValueGuard.HasPropertyKey(type, '~standard') &&
+    t.ValueGuard.IsObject(type['~standard']) &&
+    t.ValueGuard.HasPropertyKey(type['~standard'], 'vendor') &&
+    type['~standard'].vendor === 'valibot'
+  )
+}
+
+export function IsZod(type: unknown): type is z.ZodTypeAny {
+  return (
+    t.ValueGuard.IsObject(type) &&
+    t.ValueGuard.HasPropertyKey(type, '~standard') &&
+    t.ValueGuard.IsObject(type['~standard']) &&
+    t.ValueGuard.HasPropertyKey(type['~standard'], 'vendor') &&
+    type['~standard'].vendor === 'zod'
+  )
+}
+```
+
+Key design decisions:
+- **TypeBox detection** uses the internal `[Kind]` symbol (via `KindGuard.IsSchema`)
+- **Valibot and Zod detection** use the `~standard` property from the Standard Schema spec
+- **Syntax detection** just checks `typeof === 'string'`
+- All guards use `@alkdev/typebox`'s `ValueGuard` utility functions rather than raw JS, ensuring consistency
+
+### Signature Resolution
+
+The `Signature()` function normalizes overloaded arguments:
+
+```ts
+// (parameter, syntax, options) -> [parameter, 
type, options] +// (syntax, options) -> [{}, type, options] +// (parameter, options) -> [parameter, type, {}] +// (syntax | type) -> [{}, type, {}] +``` + +This allows the API to accept multiple calling conventions: +```ts +TypeBox({ Users: UsersSchema }, '{ id: number, name: string }', options) // with context +TypeBox('{ id: number, name: string }', options) // syntax with options +TypeBox(zodSchema) // just a schema +``` + +--- + +## 4. Compile Directory + +The `compile/` directory provides the high-level `Compile()` function that: +1. Accepts any schema type (Syntax string, TypeBox, Valibot, Zod) +2. Converts it to TypeBox via the `TypeBox()` dispatcher +3. Compiles the TypeBox schema into a `TypeCheck` validator using `@alkdev/typebox/compiler` +4. Wraps it in a `Validator` class that implements the Standard Schema V1 interface + +This is the **consumer-facing API** that ties everything together. The `Compile` function uses the same guard/signature pattern: + +```ts +export function Compile(...args: any[]): never { + const [parameter, type, options] = g.Signature(args) + const schema = t.ValueGuard.IsString(type) ? TypeBox(parameter, type, options) : TypeBox(type) + const check = ResolveTypeCheck(schema) + return new Validator(check) as never +} +``` + +The `Validator` class (`compile/validator.ts`) implements `StandardSchemaV1` with `~standard` property, providing: +- `.Check(value)` - validation +- `.Parse(value)` - decode/transform +- `.Errors(value)` - error iterator +- `.Code()` - generated validation code string + +The `environment.ts` module detects if `eval()` is available (for JIT compilation) and falls back to dynamic validation if not. + +--- + +## 5. 
Adaptation for DrizzleBox (Database Dialect Targets)
+
+### Mapping the Pattern
+
+| Typemap Concept | DrizzleBox Equivalent |
+|---|---|
+| TypeBox | **Drizzle IR** (dialect-agnostic schema representation) |
+| Valibot | **SQLite dialect** |
+| Zod | **PostgreSQL dialect** |
+| Syntax string | *no direct equivalent* (or: SQL string templates) |
+| TypeBox->Valibot | DrizzleIR->SQLite DDL |
+| TypeBox->Zod | DrizzleIR->PostgreSQL DDL |
+| Valibot->TypeBox | SQLite introspection->DrizzleIR |
+| Guard system | Dialect detection from schema objects |
+
+### Proposed Module Structure
+
+```
+src/
+  index.ts
+  guard.ts                    # Detect drizzle dialect type (sqlite/postgres/mysql)
+  options.ts                  # Shared dialect options
+  ir/
+    ir.ts                     # DrizzleIR dispatcher (IR function)
+    ir-from-sqlite.ts         # SQLite schema -> DrizzleIR
+    ir-from-postgres.ts       # PostgreSQL schema -> DrizzleIR
+    ir-from-mysql.ts          # MySQL schema -> DrizzleIR
+    ir-from-ir.ts             # Identity
+  sqlite/
+    sqlite.ts                 # SQLite dispatcher
+    sqlite-from-ir.ts         # DrizzleIR -> SQLite DDL/types
+    sqlite-from-sqlite.ts     # Identity
+    sqlite-from-postgres.ts   # Postgres -> SQLite (via IR)
+    sqlite-from-mysql.ts      # MySQL -> SQLite (via IR)
+  postgres/
+    postgres.ts               # PostgreSQL dispatcher
+    postgres-from-ir.ts       # DrizzleIR -> PostgreSQL DDL/types
+    postgres-from-postgres.ts # Identity
+    postgres-from-sqlite.ts   # SQLite -> PostgreSQL (via IR)
+    postgres-from-mysql.ts    # MySQL -> PostgreSQL (via IR)
+  mysql/
+    mysql.ts                  # MySQL dispatcher
+    mysql-from-ir.ts          # DrizzleIR -> MySQL DDL/types
+    mysql-from-mysql.ts       # Identity
+    mysql-from-sqlite.ts      # SQLite -> MySQL (via IR)
+    mysql-from-postgres.ts    # PostgreSQL -> MySQL (via IR)
+```
+
+### Guard Adaptation
+
+```ts
+// guard.ts
+import { is } from 'drizzle-orm'
+import { sqliteTable, SQLiteTable } from 'drizzle-orm/sqlite-core'
+import { pgTable, PgTable } from 'drizzle-orm/pg-core'
+import { mysqlTable, MySqlTable } from 'drizzle-orm/mysql-core'
+
+export type SqliteType = ReturnType<typeof sqliteTable>
+export type PostgresType = ReturnType<typeof pgTable>
+export type MysqlType = ReturnType<typeof mysqlTable>
+
+export 
function IsSqlite(schema: unknown): schema is SqliteType {
+  // Detect via Drizzle's internal dialect markers, e.g. drizzle-orm's
+  // entity-kind helper: is(value, Class)
+  return is(schema, SQLiteTable)
+}
+
+export function IsPostgres(schema: unknown): schema is PostgresType {
+  return is(schema, PgTable)
+}
+
+export function IsMysql(schema: unknown): schema is MysqlType {
+  return is(schema, MySqlTable)
+}
+```
+
+### Improving Tree-Shaking Over Typemap
+
+Typemap's current architecture has a tree-shaking weakness: the dispatcher functions pull in all translation paths. For DrizzleBox, we can improve this with **sub-path exports**:
+
+```json
+{
+  "exports": {
+    ".": {
+      "import": "./build/esm/index.mjs",
+      "require": "./build/cjs/index.js"
+    },
+    "./sqlite": {
+      "import": "./build/esm/sqlite/sqlite.mjs",
+      "require": "./build/cjs/sqlite/sqlite.js"
+    },
+    "./postgres": {
+      "import": "./build/esm/postgres/postgres.mjs",
+      "require": "./build/cjs/postgres/postgres.js"
+    },
+    "./mysql": {
+      "import": "./build/esm/mysql/mysql.mjs",
+      "require": "./build/cjs/mysql/mysql.js"
+    }
+  }
+}
+```
+
+This allows consumers to import only what they need:
+
+```ts
+// Only pulls in sqlite + ir code
+import { Sqlite } from '@alkdev/drizzlebox/sqlite'
+```
+
+This avoids the single-entry-point approach typemap uses, where the entire translation matrix is always imported. 
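+
+The guard layer sketched above plays the same role as typemap's `~standard` vendor sniffing, which reduces to a few structural checks. A standalone sketch (`vendorOf` and the guard names are invented here; no schema library is required):
+
+```typescript
+// Structural sniffing of the Standard Schema '~standard' marker
+function vendorOf(type: unknown): string | undefined {
+  if (typeof type !== 'object' || type === null) return undefined
+  const std = (type as Record<string, unknown>)['~standard']
+  if (typeof std !== 'object' || std === null) return undefined
+  const vendor = (std as Record<string, unknown>).vendor
+  return typeof vendor === 'string' ? vendor : undefined
+}
+
+const isZodLike = (type: unknown) => vendorOf(type) === 'zod'
+const isValibotLike = (type: unknown) => vendorOf(type) === 'valibot'
+
+// A fake schema carrying only the marker is enough to be detected
+const fake = { '~standard': { vendor: 'zod', version: 1 } }
+isZodLike(fake)     // true
+isValibotLike(fake) // false
+```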
+
+### IR-as-Hub Pattern
+
+Following typemap's TypeBox-as-IR pattern, DrizzleBox should use a **dialect-agnostic intermediate representation** as the hub:
+
+```
+        SQLite DDL
+       /          \
+  sqlite-from   from-sqlite
+      |             |
+  IR (Drizzle Intermediate Representation)
+      |             |
+  postgres-from  from-postgres
+       \          /
+      PostgreSQL DDL
+```
+
+This means we only need to write:
+- **A fixed set of translation modules per dialect**: `dialect-from-ir` (generate), `ir-from-dialect` (parse), and `dialect-from-dialect` (identity); cross-dialect shortcuts are thin wrappers that route through IR
+- **Cross-dialect translations** (e.g., SQLite->PostgreSQL) are automatically composed through the hub: `PostgresFromIr(IrFromSqlite(schema))`
+
+### Key Differences from Typemap
+
+1. **Type safety is the output, not the input**: Typemap's schemas are validation schemas. DrizzleBox's schemas are database table definitions. The "translation" is generating column types, constraints, and DDL.
+
+2. **Dialect-specific features need escape hatches**: PostgreSQL has `JSONB`, MySQL has `ENUM`, SQLite has limited `ALTER TABLE`. The IR needs a way to express "dialect-specific" types that don't translate losslessly. This is similar to how typemap handles Valibot-specific types (like `Blob`, `Custom`) by creating custom TypeBox kinds.
+
+3. **Peer dependency handling**: Typemap uses `peerDependencies` for valibot/zod - users only install what they use. DrizzleBox should do the same with `drizzle-orm/sqlite-core`, `drizzle-orm/pg-core`, `drizzle-orm/mysql-core`.
+
+4. **No identity bypass needed**: In typemap, `TypeBoxFromTypeBox` just clones. In DrizzleBox, a dialect-to-same-dialect translation might normalize/validate rather than clone.
+
+### Recommended Architecture
+
+```ts
+// Each dialect module exports:
+// 1. A dispatcher function (like TypeBox()) that auto-detects input
+// 2. Direct from-* functions for explicit, tree-shakeable usage
+// 3. 
Type-only exports for the generated types
+
+// postgres/postgres.ts
+export function Pg<Schema>(schema: Schema): TPg<Schema> { /* dispatch */ }
+export { PgFromIR } from './postgres-from-ir'
+export { PgFromSqlite } from './postgres-from-sqlite'
+export { PgFromMysql } from './postgres-from-mysql'
+export { PgFromPg } from './postgres-from-pg'
+```
+
+This gives consumers two usage modes:
+- **Convenience**: `Pg(schema)` - auto-detects, pulls in everything
+- **Tree-shakeable**: `PgFromIR(irSchema)` - explicit, minimal imports
\ No newline at end of file