---
status: reviewed
last_updated: 2026-05-30
---

# SQLite Host

The SQLite database host for `@alkdev/storage`. Uses Drizzle ORM with
libsql/Turso for the SQLite dialect and `@alkdev/drizzlebox` for TypeBox schema
generation from Drizzle table definitions.

## Overview

The SQLite host provides:

1. **Drizzle table definitions** for the metagraph pattern (graph types, node
   types, edge types, graphs, nodes, edges) plus a standalone `actors` table
2. **Drizzle relations** for the relational query API
3. **TypeBox schemas** auto-generated from Drizzle tables (select/insert
   validation)
4. **Injectable database factory**: `createSqliteDatabase(client)` accepts a
   pre-created client

The SQLite host is the first-class target. PostgreSQL will follow the same table
shapes with appropriate dialect changes.

## Package Structure

```
src/sqlite/
├── tables/
│   ├── common.ts      # commonCols, ACTOR_TYPE enum
│   ├── graphTypes.ts  # graph_types table + select/insert schemas
│   ├── nodeTypes.ts   # node_types table + select/insert schemas
│   ├── edgeTypes.ts   # edge_types table + select/insert schemas
│   ├── graphs.ts      # graphs table + select/insert schemas
│   ├── nodes.ts       # nodes table + select/insert schemas
│   ├── edges.ts       # edges table + select/insert schemas
│   ├── actors.ts      # actors table + select/insert schemas
│   └── index.ts       # barrel re-export
├── relations.ts       # Drizzle relational mappings
├── schema.ts          # re-exports tables + relations
└── client.ts          # createSqliteDatabase()
```

## Tables

### Common Columns

All tables share these columns:

```ts
import { sql } from "drizzle-orm";
import { integer, text } from "drizzle-orm/sqlite-core";

// Sketch of `commonCols` (tables/common.ts); imports added for completeness.
export const commonCols = {
  id: text("id").primaryKey(),
  metadata: text("metadata", { mode: "json" }).$type<Record<string, unknown>>().default({}),
  // Unix epoch seconds, filled in by SQLite when the row is inserted
  createdAt: integer("created_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
  updatedAt: integer("updated_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
};
```

**Notable differences from a typical PostgreSQL common columns pattern**:

| Column      | SQLite                                | PostgreSQL (typical)                                           |
| ----------- | ------------------------------------- | -------------------------------------------------------------- |
| `id`        | text PK (consumer-generated)          | text PK with `$defaultFn(() => crypto.randomUUID())`           |
| `metadata`  | `text` with JSON mode                 | `jsonb` with `$type<Record<string, unknown>>()`                |
| `createdAt` | `integer` timestamp mode (Unix epoch) | `timestamp with time zone` defaulting `now()`                  |
| `updatedAt` | `integer` timestamp mode (Unix epoch) | `timestamp with time zone` defaulting `now()` with `$onUpdate` |

The SQLite columns have no `$defaultFn` for ID generation (the consumer
provides IDs) and no `$onUpdate` for `updatedAt`. Drizzle's `$onUpdate` is an
application-level hook, not a database trigger, so without it consumers must
set `updatedAt` explicitly on every update.

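Since neither `$defaultFn` nor `$onUpdate` is configured, calling code has to supply `id` and refresh `updatedAt` itself. The sketch below shows one way this might be wrapped; the `CommonCols` interface and the `newRowCols` and `touch` helpers are hypothetical names for illustration, not part of `@alkdev/storage`.

```ts
// Hypothetical helpers; only the column names come from the common columns.
interface CommonCols {
  id: string;
  metadata: Record<string, unknown>;
  createdAt: Date;
  updatedAt: Date;
}

/** Build the common columns for a new row. The caller supplies the ID. */
function newRowCols(id: string, now: Date = new Date()): CommonCols {
  return { id, metadata: {}, createdAt: now, updatedAt: now };
}

/** Attach a fresh `updatedAt` to a partial update, since no `$onUpdate` hook runs. */
function touch<T extends object>(patch: T, now: Date = new Date()): T & { updatedAt: Date } {
  return { ...patch, updatedAt: now };
}

// Example: the values a consumer might pass to an insert and a later update.
const created = newRowCols("node-1");
const update = touch({ metadata: { "_acl.inherited": true } });
console.log(created.id, update.updatedAt instanceof Date);
```

The result of `touch(...)` would typically be passed to an update's `.set(...)` call. Note that Drizzle's `mode: "timestamp"` stores whole seconds, so sub-second precision is dropped on write.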
### `graph_types`

Stores graph type definitions (schemas for classes of graphs).

| Column      | Type                | Constraints             | Notes                                                       |
| ----------- | ------------------- | ----------------------- | ----------------------------------------------------------- |
| id          | text                | PK                      | Consumer-generated UUID                                     |
| metadata    | text (JSON)         | default `{}`            | Extension namespace                                         |
| createdAt   | integer (timestamp) | not null, default `now` |                                                             |
| updatedAt   | integer (timestamp) | not null, default `now` |                                                             |
| name        | text                | not null, **unique**    | Graph type name (e.g., "call-graph", "acl")                 |
| description | text                | default `""`            | Human-readable description                                  |
| config      | text (JSON)         | not null                | `GraphConfig`: directed/undirected/mixed, multi, self-loops |
| version     | integer             | not null, default 1     | Breaking schema version                                     |

### `node_types`

Stores node type definitions within a graph type. Each node type has a TypeBox
schema that validates node attributes.

| Column      | Type                | Constraints                            | Notes                                    |
| ----------- | ------------------- | -------------------------------------- | ---------------------------------------- |
| id          | text                | PK                                     |                                          |
| metadata    | text (JSON)         | default `{}`                           |                                          |
| createdAt   | integer (timestamp) | not null, default `now`                |                                          |
| updatedAt   | integer (timestamp) | not null, default `now`                |                                          |
| graphTypeId | text                | not null, FK → graphTypes.id (cascade) | Parent graph type                        |
| name        | text                | not null                               | Node type name (e.g., "call", "account") |
| description | text                | default `""`                           |                                          |
| schema      | text (JSON)         | not null                               | TypeBox schema for node attributes       |

**Unique constraint**: `(graphTypeId, name)`. Node type names are unique within
a graph type.

### `edge_types`

Stores edge type definitions within a graph type.

| Column             | Type                | Constraints                            | Notes                                          |
| ------------------ | ------------------- | -------------------------------------- | ---------------------------------------------- |
| id                 | text                | PK                                     |                                                |
| metadata           | text (JSON)         | default `{}`                           |                                                |
| createdAt          | integer (timestamp) | not null, default `now`                |                                                |
| updatedAt          | integer (timestamp) | not null, default `now`                |                                                |
| graphTypeId        | text                | not null, FK → graphTypes.id (cascade) | Parent graph type                              |
| name               | text                | not null                               | Edge type name (e.g., "triggered", "can_read") |
| description        | text                | default `""`                           |                                                |
| schema             | text (JSON)         | not null                               | TypeBox schema for edge attributes             |
| allowedSourceTypes | text (JSON)         | default `[]`                           | Node type names valid at source endpoint       |
| allowedTargetTypes | text (JSON)         | default `[]`                           | Node type names valid at target endpoint       |

**Unique constraint**: `(graphTypeId, name)`. Edge type names are unique within
a graph type.

**Empty array semantics**: `allowedSourceTypes` and `allowedTargetTypes` default
to `[]` (empty JSON array) in the database. `[]` means "no restriction": any
node type is a valid endpoint, matching the behavior of `undefined` in the
`EdgeType` schema layer. A non-empty array restricts endpoints to the listed
node types. There is no "no types allowed" state; if an edge type needs to be
disabled, use a status or soft-delete pattern on the edge type definition.
The repository layer must enforce this convention consistently. See
[metagraph-module.md](./metagraph-module.md) for edge endpoint semantics.

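The empty-array convention is easy to get wrong in a repository layer, so a small sketch may help. It assumes node type names are plain strings; `EdgeTypeEndpoints`, `isEndpointAllowed`, and `canConnect` are hypothetical names, not functions exported by the package.

```ts
// Hypothetical shape: only the two column names are taken from the table above.
interface EdgeTypeEndpoints {
  allowedSourceTypes?: readonly string[];
  allowedTargetTypes?: readonly string[];
}

/** True when `nodeType` may appear at an endpoint restricted by `allowed`. */
function isEndpointAllowed(allowed: readonly string[] | undefined, nodeType: string): boolean {
  // `undefined` (schema layer) and `[]` (database default) both mean "no restriction".
  if (allowed === undefined || allowed.length === 0) return true;
  return allowed.includes(nodeType);
}

/** True when an edge of this type may connect the given source and target node types. */
function canConnect(edgeType: EdgeTypeEndpoints, sourceType: string, targetType: string): boolean {
  return (
    isEndpointAllowed(edgeType.allowedSourceTypes, sourceType) &&
    isEndpointAllowed(edgeType.allowedTargetTypes, targetType)
  );
}

// Example: sources are restricted, targets are not.
const canRead: EdgeTypeEndpoints = { allowedSourceTypes: ["account"], allowedTargetTypes: [] };
console.log(canConnect(canRead, "account", "document"));
console.log(canConnect(canRead, "document", "account"));
```

Treating `undefined` and `[]` identically in one place keeps the schema layer and the database default from drifting apart.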
### `graphs`

Graph instances. Each graph belongs to a graph type.

| Column      | Type                | Constraints                                   | Notes                                          |
| ----------- | ------------------- | --------------------------------------------- | ---------------------------------------------- |
| id          | text                | PK                                            |                                                |
| metadata    | text (JSON)         | default `{}`                                  |                                                |
| createdAt   | integer (timestamp) | not null, default `now`                       |                                                |
| updatedAt   | integer (timestamp) | not null, default `now`                       |                                                |
| graphTypeId | text                | FK → graphTypes.id (set null)                 | Set null on graph type deletion (orphan graph) |
| name        | text                | not null                                      | Graph instance name                            |
| description | text                | default `""`                                  |                                                |
| status      | text                | not null, enum: `active`, `archived`, `draft` | Default: `draft`                               |

**On `graphTypeId` set null**: When a graph type is deleted, its graphs become
orphans with `graphTypeId = null`. The application should either prevent graph
type deletion while active graphs reference it, or set the affected graphs'
`status` to `archived` as part of a soft-delete workflow. Orphan graphs cannot
validate their node/edge types against a missing type definition, so code that
works with a graph should check `graphTypeId !== null` before performing
type-aware operations.

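The two safeguards described for orphan graphs can be sketched as a type guard plus a pre-deletion check. This is only an illustration over plain row objects: `GraphRow`, `hasGraphType`, and `canDeleteGraphType` are hypothetical names, and a real repository would run the second check as a query.

```ts
type GraphStatus = "active" | "archived" | "draft";

// Hypothetical row shape using the column names from the `graphs` table.
interface GraphRow {
  id: string;
  graphTypeId: string | null;
  status: GraphStatus;
}

/** Narrows a graph to one that still has a type definition to validate against. */
function hasGraphType(graph: GraphRow): graph is GraphRow & { graphTypeId: string } {
  return graph.graphTypeId !== null;
}

/** A graph type is safe to delete only when no active graph references it. */
function canDeleteGraphType(graphTypeId: string, graphs: readonly GraphRow[]): boolean {
  return !graphs.some((g) => g.graphTypeId === graphTypeId && g.status === "active");
}

const graphs: GraphRow[] = [
  { id: "g1", graphTypeId: "t1", status: "active" },
  { id: "g2", graphTypeId: null, status: "archived" }, // orphan
];

for (const g of graphs) {
  if (hasGraphType(g)) {
    console.log(`${g.id}: type-aware operations allowed for type ${g.graphTypeId}`);
  } else {
    console.log(`${g.id}: orphan graph, skipping type validation`);
  }
}
console.log(canDeleteGraphType("t1", graphs));
```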
### `nodes`

Nodes within a graph instance. Keyed by `(graphId, key)`, which is unique within
a graph.

| Column     | Type                | Constraints                        | Notes                                         |
| ---------- | ------------------- | ---------------------------------- | --------------------------------------------- |
| id         | text                | PK                                 |                                               |
| metadata   | text (JSON)         | default `{}`                       |                                               |
| createdAt  | integer (timestamp) | not null, default `now`            |                                               |
| updatedAt  | integer (timestamp) | not null, default `now`            |                                               |
| graphId    | text                | not null, FK → graphs.id (cascade) | Parent graph                                  |
| key        | text                | not null                           | Consumer-defined identity within the graph    |
| attributes | text (JSON)         | not null, default `{}`             | Node attributes validated by node type schema |

**Unique constraint**: `(graphId, key)`. Node keys are unique within a graph.

**No `nodeTypeId` column**: Nodes do not have a direct FK to `node_types`. The
node type is determined at the application layer. This is a deliberate design
decision (see [ADR 020](decisions/020-no-nodetypeid-on-nodes.md)): adding a
`nodeTypeId` FK would couple the graph instance layer to the type definition
layer. The repository layer can enforce node type constraints via validation
against the graph type's schema.

### `edges`

Edges within a graph instance. Keyed by `(graphId, key)`, which is unique within
a graph when `key` is set.

| Column        | Type                | Constraints                        | Notes                                                |
| ------------- | ------------------- | ---------------------------------- | ---------------------------------------------------- |
| id            | text                | PK                                 |                                                      |
| metadata      | text (JSON)         | default `{}`                       |                                                      |
| createdAt     | integer (timestamp) | not null, default `now`            |                                                      |
| updatedAt     | integer (timestamp) | not null, default `now`            |                                                      |
| graphId       | text                | not null, FK → graphs.id (cascade) | Parent graph                                         |
| key           | text                |                                    | Consumer-defined identity (null for anonymous edges) |
| sourceNodeKey | text                | not null                           | Source node key within the graph                     |
| targetNodeKey | text                | not null                           | Target node key within the graph                     |
| attributes    | text (JSON)         | not null, default `{}`             | Edge attributes validated by edge type schema        |
| undirected    | integer (boolean)   | default false                      | Treat as undirected regardless of graph type         |

**Unique constraint**: `(graphId, key)`. Edge keys are unique within a graph.
SQLite treats NULLs as distinct in unique constraints, so any number of
anonymous edges (null `key`) can coexist in the same graph.

**Foreign keys**: The composite pairs `(graphId, sourceNodeKey)` and
`(graphId, targetNodeKey)` each reference `(nodes.graphId, nodes.key)` with
cascade delete. Deleting a node removes all edges that start or end at it.

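Because the edge foreign keys are composite, a node key only matches edges in the same graph. The in-memory sketch below mimics that matching to show which edges a node deletion would cascade to; it is an illustration of the rule, not how the database or the package implements it, and `NodeRef`, `EdgeRow`, and `edgesCascadedBy` are hypothetical names.

```ts
// Hypothetical row shapes using column names from the `nodes` and `edges` tables.
interface NodeRef {
  graphId: string;
  key: string;
}

interface EdgeRow {
  id: string;
  graphId: string;
  key: string | null; // null for anonymous edges
  sourceNodeKey: string;
  targetNodeKey: string;
}

/** Edges that reference `node` through (graphId, sourceNodeKey) or (graphId, targetNodeKey). */
function edgesCascadedBy(node: NodeRef, edges: readonly EdgeRow[]): EdgeRow[] {
  return edges.filter(
    (e) =>
      e.graphId === node.graphId &&
      (e.sourceNodeKey === node.key || e.targetNodeKey === node.key),
  );
}

const edges: EdgeRow[] = [
  { id: "e1", graphId: "g1", key: "a->b", sourceNodeKey: "a", targetNodeKey: "b" },
  { id: "e2", graphId: "g1", key: null, sourceNodeKey: "c", targetNodeKey: "a" },
  { id: "e3", graphId: "g2", key: "a->b", sourceNodeKey: "a", targetNodeKey: "b" }, // other graph
];

// Deleting node "a" in graph g1 affects e1 and e2, but not e3.
console.log(edgesCascadedBy({ graphId: "g1", key: "a" }, edges).map((e) => e.id));
```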
### `actors`

Standalone identity table. It has no FK references to or from any metagraph
table and is not included in `relations.ts`. It is a placeholder for identity
data and may become a node type in an ACL graph (based on the `Identity`
interface from `@alkdev/operations`) or remain a standalone table. See OQ-03 in
[open-questions.md](./open-questions.md).

| Column    | Type                | Constraints                             | Notes              |
| --------- | ------------------- | --------------------------------------- | ------------------ |
| id        | text                | PK                                      |                    |
| metadata  | text (JSON)         | default `{}`                            |                    |
| createdAt | integer (timestamp) | not null, default `now`                 |                    |
| updatedAt | integer (timestamp) | not null, default `now`                 |                    |
| name      | text                | not null                                | Actor display name |
| type      | text                | not null, enum: `human`, `llm`, `agent` | Actor type         |

## Relations

Drizzle relational mappings define the following relationships:

- **graphTypes → nodeTypes**: one-to-many
- **graphTypes → edgeTypes**: one-to-many
- **graphTypes → graphs**: one-to-many
- **graphs → nodes**: one-to-many
- **graphs → edges**: one-to-many
- **nodes → outgoing edges** (sourceNode): one-to-many
- **nodes → incoming edges** (targetNode): one-to-many
- **edges → source node**: many-to-one (via composite key)
- **edges → target node**: many-to-one (via composite key)

## Client Factory

```ts
import { createSqliteDatabase } from "@alkdev/storage/sqlite";
import type { SqliteDatabase } from "@alkdev/storage/sqlite";
import { createClient } from "@libsql/client";

const client = createClient({ url: "file:local.db" });
const db: SqliteDatabase = createSqliteDatabase(client);
```

The factory takes a pre-created `@libsql/client` client and returns a typed
Drizzle database instance with the full schema attached. This enables:

- In-memory testing with `createClient({ url: ":memory:" })`
- Turso remote connections
- Custom client configuration (auth tokens, etc.)

## Design Decisions

All design decisions are documented as ADRs in [decisions/](decisions/).

| ADR | Decision | Summary |
|-----|----------|---------|
| [019](decisions/019-json-text-for-schema-columns.md) | JSON text for schema columns in SQLite | SQLite uses `text` with JSON mode; application-level validation |
| [020](decisions/020-no-nodetypeid-on-nodes.md) | No nodeTypeId on nodes | Node type enforced at application layer, not via FK |
| [021](decisions/021-edge-identity-uses-consumer-keys.md) | Edge identity uses consumer-defined keys | `(graphId, key)` as unique identity within a graph |
| [022](decisions/022-composite-fks-for-node-references.md) | Composite foreign keys for node references | Edges reference `(graphId, sourceNodeKey) → (nodes.graphId, nodes.key)` |
| [006](decisions/006-enum-pattern-as-const-objects.md) | `as const` objects, not TypeScript enums | `GRAPH_STATUS`, `ACTOR_TYPE` use const objects; TypeBox uses Literal unions |
| [008](decisions/008-common-columns-pattern.md) | Common columns pattern | `id`, `metadata`, `createdAt`, `updatedAt` on every table |

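For readers unfamiliar with the `as const` pattern named in ADR 006, here is a minimal sketch. The values come from the `graphs.status` and `actors.type` columns; the property names (`ACTIVE`, `HUMAN`, and so on) and the `isGraphStatus` guard are assumptions for illustration and may differ from the actual source. TypeBox is left out to keep the example dependency-free.

```ts
// Const objects instead of TypeScript enums: plain data with literal types.
const GRAPH_STATUS = {
  ACTIVE: "active",
  ARCHIVED: "archived",
  DRAFT: "draft",
} as const;

const ACTOR_TYPE = {
  HUMAN: "human",
  LLM: "llm",
  AGENT: "agent",
} as const;

// Derive the string-literal unions from the objects so they cannot drift apart.
type GraphStatus = (typeof GRAPH_STATUS)[keyof typeof GRAPH_STATUS];
type ActorType = (typeof ACTOR_TYPE)[keyof typeof ACTOR_TYPE];

/** Runtime guard for values read from a `text` column. */
function isGraphStatus(value: unknown): value is GraphStatus {
  return (Object.values(GRAPH_STATUS) as readonly unknown[]).includes(value);
}

const status: GraphStatus = GRAPH_STATUS.DRAFT;
const actor: ActorType = ACTOR_TYPE.AGENT;
console.log(status, actor, isGraphStatus("deleted"));
```

Unlike a TypeScript `enum`, the const object emits no extra runtime construct, and its values are ordinary strings that match what is stored in the column.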
## Metadata Convention

Every table has a `metadata` JSON column defaulting to `{}`. This is an
extension namespace for subsystem use, following a namespacing convention:
`_subsystem.key` (e.g., `_keypal.scopes`, `_retention.expiresAt`).

**What metadata is for**: Opaque key-value pairs that subsystems add without
schema changes. It is never used in WHERE clauses or JOIN conditions.

**What metadata is NOT for**: A replacement for typed columns. If a field
appears in WHERE clauses, JOIN conditions, or needs a constraint, it should be a
proper column, not buried in metadata. When in doubt, add a column.

**Namespacing convention**: Subsystems should prefix their keys (e.g.,
`_callgraph.payloadRef`, `_acl.inherited`). Unprefixed keys are reserved for the
storage package itself.

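A small helper can keep subsystem keys consistently prefixed. The sketch below assumes the dotted names are flat string keys (`"_retention.expiresAt"`) rather than nested objects, which the convention does not spell out; `metadataKey`, `withMetadata`, and `readMetadata` are hypothetical names.

```ts
type Metadata = Record<string, unknown>;

/** Build a namespaced key such as "_retention.expiresAt". */
function metadataKey(subsystem: string, key: string): string {
  return `_${subsystem}.${key}`;
}

/** Return a copy of `metadata` with one namespaced entry set (the input is not mutated). */
function withMetadata(metadata: Metadata, subsystem: string, key: string, value: unknown): Metadata {
  return { ...metadata, [metadataKey(subsystem, key)]: value };
}

/** Read one namespaced entry, or `undefined` when it is absent. */
function readMetadata(metadata: Metadata, subsystem: string, key: string): unknown {
  return metadata[metadataKey(subsystem, key)];
}

const base: Metadata = {};
const tagged = withMetadata(base, "retention", "expiresAt", 1767225600);
console.log(Object.keys(tagged), readMetadata(tagged, "retention", "expiresAt"));
```

Returning a new object keeps the helper safe to use on rows that are shared between callers.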
## Concurrency Model

The SQLite host targets spoke deployments where a single process accesses the
database. For this model, SQLite's default journal mode is sufficient. However,
for spoke deployments that may run concurrent writes (e.g., multiple worker
threads), consumers should:

1. **Enable WAL mode**: `PRAGMA journal_mode=WAL;` allows concurrent reads
   during writes
2. **Set busy timeout**: `PRAGMA busy_timeout=5000;` waits up to 5 seconds for
   lock acquisition
3. **Use a single writer**: SQLite supports one writer at a time. If multiple
   threads write, route writes through a single queue or connection

The `createSqliteDatabase()` factory does not set these pragmas; configuring the
SQLite connection is the consumer's responsibility. The libsql client can be
configured before it is passed to the factory.

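The three recommendations can be sketched without committing to a particular client. `buildPragmas` only produces the statements listed above, and `WriteQueue` serializes asynchronous write tasks within one process; both names are hypothetical and neither is provided by `@alkdev/storage`.

```ts
interface SqliteTuning {
  wal?: boolean;          // enable WAL journal mode (default: true)
  busyTimeoutMs?: number; // lock wait in milliseconds (default: 5000, 0 disables)
}

/** Build the PRAGMA statements a consumer would run once per connection. */
function buildPragmas(opts: SqliteTuning = {}): string[] {
  const { wal = true, busyTimeoutMs = 5000 } = opts;
  const pragmas: string[] = [];
  if (wal) pragmas.push("PRAGMA journal_mode=WAL;");
  if (busyTimeoutMs > 0) pragmas.push(`PRAGMA busy_timeout=${Math.trunc(busyTimeoutMs)};`);
  return pragmas;
}

/** Runs write tasks one at a time, in submission order. */
class WriteQueue {
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    // Start the task only after the previous one has settled.
    const result = this.tail.then(task);
    // Keep the chain alive even if this task rejects; the caller still sees the rejection.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Example: two writes submitted together still run sequentially.
const queue = new WriteQueue();
const order: string[] = [];
void queue.run(async () => { order.push("first"); });
void queue.run(async () => { order.push("second"); });
console.log(buildPragmas());
```

With `@libsql/client`, each statement from `buildPragmas()` could be passed to `client.execute(...)` before the client is handed to `createSqliteDatabase()`. Whether a given pragma takes effect depends on the connection type (a local file versus a remote Turso database), so treat this as a starting point to verify against the target deployment.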
## PostgreSQL Porting Notes

When implementing `src/pg/`, the table shapes remain the same but with these
changes:

| SQLite                             | PostgreSQL                               |
| ---------------------------------- | ---------------------------------------- |
| `sqliteTable`                      | `pgTable`                                |
| `text` (JSON mode)                 | `jsonb` with `.$type<T>()`               |
| `integer` (timestamp mode)         | `timestamp` with time zone               |
| `` sql`(strftime('%s', 'now'))` `` | `` sql`now()` ``                         |
| `integer` (boolean mode)           | `boolean`                                |
| `text` (enum)                      | `pgEnum` or `text` with check constraint |

See a consumer's `commonCols` pattern (e.g., the hub's
`/workspace/@alkdev/hub/docs/architecture/storage/table-reference.md`) for
PostgreSQL reference patterns.

## References

- Drizzle ORM SQLite core: https://orm.drizzle.team/docs/sqlite-core
- libsql client: https://github.com/tursodatabase/libsql
- Hub common columns (reference consumer):
  `/workspace/@alkdev/hub/docs/architecture/storage/table-reference.md`
- Operations AccessControl and Identity: `/workspace/@alkdev/operations/docs/architecture/api-surface.md`
- Source: `src/sqlite/`