---
status: draft
last_updated: 2026-05-28
---

# SQLite Host

The SQLite database host for `@alkdev/storage`. Uses Drizzle ORM with
libsql/Turso for the SQLite dialect and `@alkdev/drizzlebox` for TypeBox schema
generation from Drizzle table definitions.

## Overview

The SQLite host provides:

1. **Drizzle table definitions** for the metagraph pattern (graph types, node
   types, edge types, graphs, nodes, edges) plus a standalone `actors` table
2. **Drizzle relations** for the relational query API
3. **TypeBox schemas** auto-generated from Drizzle tables (select/insert
   validation)
4. **Injectable database factory** — `createSqliteDatabase(client)` accepts a
   pre-created client

The SQLite host is the first-class target. PostgreSQL will follow the same table
shapes with appropriate dialect changes.

## Package Structure

```
src/sqlite/
├── tables/
│   ├── common.ts           # commonCols, ACTOR_TYPE enum
│   ├── graphTypes.ts       # graph_types table + select/insert schemas
│   ├── nodeTypes.ts        # node_types table + select/insert schemas
│   ├── edgeTypes.ts        # edge_types table + select/insert schemas
│   ├── graphs.ts           # graphs table + select/insert schemas
│   ├── nodes.ts            # nodes table + select/insert schemas
│   ├── edges.ts            # edges table + select/insert schemas
│   ├── actors.ts           # actors table + select/insert schemas
│   └── index.ts            # barrel re-export
├── relations.ts            # Drizzle relational mappings
├── schema.ts               # re-exports tables + relations
└── client.ts               # createSqliteDatabase()
```

## Tables

### Common Columns

All tables share these columns:

```ts
{
  id: text("id").primaryKey(),
  metadata: text("metadata", { mode: "json" }).$type<Record<string, unknown>>().default({}),
  createdAt: integer("created_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
  updatedAt: integer("updated_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
}
```

**Notable differences from a typical PostgreSQL common columns pattern**:

| Column      | SQLite                                | PostgreSQL (typical)                                         |
| ----------- | ------------------------------------- | ------------------------------------------------------------- |
| `id`        | text PK (consumer-generated)          | text PK with `$defaultFn(() => crypto.randomUUID())`          |
| `metadata`  | `text` with JSON mode                 | `jsonb` with `$type<Record<string, unknown>>()`               |
| `createdAt` | `integer` timestamp mode (Unix epoch) | `timestamp with timezone` defaulting `now()`                  |
| `updatedAt` | `integer` timestamp mode (Unix epoch) | `timestamp with timezone` defaulting `now()` with `$onUpdate` |

The SQLite columns do NOT have `$defaultFn` for ID generation (the consumer
provides IDs) and do NOT have `$onUpdate` for `updatedAt`. Drizzle's `$onUpdate`
runs in the application, not in the database, and the SQL default only applies
on insert, so consumers must set `updatedAt` explicitly on every update.
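
Since nothing refreshes `updatedAt` automatically, every update has to carry a
fresh timestamp. The sketch below shows one way to do that. `withUpdatedAt` is a
hypothetical helper, not part of `@alkdev/storage`, and it assumes the column is
read and written as a JavaScript `Date`, as Drizzle's integer `timestamp` mode
does.

```ts
// Hypothetical helper: returns a copy of an update payload stamped with a
// fresh `updatedAt`. Drizzle's integer `timestamp` mode maps the column to a
// JS Date, so a Date is what the payload carries.
function withUpdatedAt<T extends object>(
  patch: T,
  now: Date = new Date(),
): T & { updatedAt: Date } {
  return { ...patch, updatedAt: now };
}

// Usage with a Drizzle update (sketch only, not executed here):
//   await db.update(graphs)
//     .set(withUpdatedAt({ name: "renamed" }))
//     .where(eq(graphs.id, graphId));
```

Passing `now` explicitly keeps the helper deterministic in tests.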

### `graph_types`

Stores graph type definitions (schemas for classes of graphs).

| Column      | Type                | Constraints             | Notes                                                        |
| ----------- | ------------------- | ----------------------- | ------------------------------------------------------------ |
| id          | text                | PK                      | Consumer-generated UUID                                      |
| metadata    | text (JSON)         | default `{}`            | Extension namespace                                          |
| createdAt   | integer (timestamp) | not null, default `now` |                                                              |
| updatedAt   | integer (timestamp) | not null, default `now` |                                                              |
| name        | text                | not null, **unique**    | Graph type name (e.g., "call-graph", "acl")                  |
| description | text                | default `""`            | Human-readable description                                   |
| config      | text (JSON)         | not null                | `GraphConfig` — directed/undirected/mixed, multi, self-loops |
| version     | integer             | not null, default 1     | Breaking schema version                                      |

### `node_types`

Stores node type definitions within a graph type. Each node type has a TypeBox
schema that validates node attributes.

| Column      | Type                | Constraints                            | Notes                                    |
| ----------- | ------------------- | -------------------------------------- | ---------------------------------------- |
| id          | text                | PK                                     |                                          |
| metadata    | text (JSON)         | default `{}`                           |                                          |
| createdAt   | integer (timestamp) | not null, default `now`                |                                          |
| updatedAt   | integer (timestamp) | not null, default `now`                |                                          |
| graphTypeId | text                | not null, FK → graphTypes.id (cascade) | Parent graph type                        |
| name        | text                | not null                               | Node type name (e.g., "call", "account") |
| description | text                | default `""`                           |                                          |
| schema      | text (JSON)         | not null                               | TypeBox schema for node attributes       |

**Unique constraint**: `(graphTypeId, name)` — node type names are unique within
a graph type.

### `edge_types`

Stores edge type definitions within a graph type.

| Column             | Type                | Constraints                            | Notes                                          |
| ------------------ | ------------------- | -------------------------------------- | ---------------------------------------------- |
| id                 | text                | PK                                     |                                                |
| metadata           | text (JSON)         | default `{}`                           |                                                |
| createdAt          | integer (timestamp) | not null, default `now`                |                                                |
| updatedAt          | integer (timestamp) | not null, default `now`                |                                                |
| graphTypeId        | text                | not null, FK → graphTypes.id (cascade) | Parent graph type                              |
| name               | text                | not null                               | Edge type name (e.g., "triggered", "can_read") |
| description        | text                | default `""`                           |                                                |
| schema             | text (JSON)         | not null                               | TypeBox schema for edge attributes             |
| allowedSourceTypes | text (JSON)         | default `[]`                           | Node type names valid at source endpoint       |
| allowedTargetTypes | text (JSON)         | default `[]`                           | Node type names valid at target endpoint       |

**Unique constraint**: `(graphTypeId, name)` — edge type names are unique within
a graph type.

**Empty array semantics**: `allowedSourceTypes` and `allowedTargetTypes` default
to `[]` (empty JSON array) in the database. `[]` means "no restriction" — any
node type is a valid endpoint — matching the behavior of `undefined` in the
`EdgeType` schema layer. A non-empty array restricts endpoints to only the
listed node types. There is no "no types allowed" state; if edge types need to
be disabled, use a status or soft-delete pattern on the edge type definition.
The repository layer must enforce this convention consistently. See
[metagraph-module.md](./metagraph-module.md) for edge endpoint semantics.
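
The convention above can be expressed as a small check. This is a minimal
sketch of how a repository layer might apply it. The function names are
illustrative assumptions, not the package's API.

```ts
// Hypothetical helper mirroring the rule: `[]` or `undefined` means
// "no restriction"; a non-empty list restricts to the listed names.
function isEndpointAllowed(
  allowedTypes: readonly string[] | undefined,
  nodeTypeName: string,
): boolean {
  // Empty or missing list: any node type may sit at this endpoint.
  if (allowedTypes === undefined || allowedTypes.length === 0) return true;
  // Non-empty list: only the listed node type names are valid.
  return allowedTypes.includes(nodeTypeName);
}

// Checks both endpoints of a candidate edge against an edge type definition.
function isEdgeAllowed(
  edgeType: {
    allowedSourceTypes?: readonly string[];
    allowedTargetTypes?: readonly string[];
  },
  sourceType: string,
  targetType: string,
): boolean {
  return (
    isEndpointAllowed(edgeType.allowedSourceTypes, sourceType) &&
    isEndpointAllowed(edgeType.allowedTargetTypes, targetType)
  );
}
```

Treating `undefined` and `[]` in one branch keeps the database default and the
schema-layer default from drifting apart.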

### `graphs`

Graph instances. Each graph belongs to a graph type.

| Column      | Type                | Constraints                                   | Notes                                          |
| ----------- | ------------------- | --------------------------------------------- | ---------------------------------------------- |
| id          | text                | PK                                            |                                                |
| metadata    | text (JSON)         | default `{}`                                  |                                                |
| createdAt   | integer (timestamp) | not null, default `now`                       |                                                |
| updatedAt   | integer (timestamp) | not null, default `now`                       |                                                |
| graphTypeId | text                | FK → graphTypes.id (set null)                 | Set null on graph type deletion (orphan graph) |
| name        | text                | not null                                      | Graph instance name                            |
| description | text                | default `""`                                  |                                                |
| status      | text                | not null, enum: `active`, `archived`, `draft` | Default: `draft`                               |

**On `graphTypeId` set null**: When a graph type is deleted, its graphs become
orphans with `graphTypeId = null`. The application should prevent graph type
deletion if active graphs reference it, or set affected graphs' `status` to
`archived` as part of a soft-delete workflow. Orphan graphs cannot validate
their node/edge types against a missing type definition — queries against orphan
graphs should check for `graphTypeId !== null` before performing type-aware
operations.
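
A minimal sketch of the two checks described above. The `GraphRow` shape is a
simplified stand-in for the real select type, and both function names are
hypothetical.

```ts
// Simplified row shape for illustration; the real select type comes from the
// Drizzle table definition.
interface GraphRow {
  id: string;
  graphTypeId: string | null;
  status: "active" | "archived" | "draft";
}

// Type guard: narrows to graphs that still have a graph type to validate
// against, so type-aware code never sees an orphan.
function hasGraphType(
  graph: GraphRow,
): graph is GraphRow & { graphTypeId: string } {
  return graph.graphTypeId !== null;
}

// Hypothetical pre-delete check: refuse to delete a graph type while any
// active graph still references it.
function canDeleteGraphType(
  graphTypeId: string,
  graphs: readonly GraphRow[],
): boolean {
  return !graphs.some(
    (g) => g.graphTypeId === graphTypeId && g.status === "active",
  );
}
```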

### `nodes`

Nodes within a graph instance. Keyed by `(graphId, key)` — unique within a
graph.

| Column     | Type                | Constraints                        | Notes                                         |
| ---------- | ------------------- | ---------------------------------- | --------------------------------------------- |
| id         | text                | PK                                 |                                               |
| metadata   | text (JSON)         | default `{}`                       |                                               |
| createdAt  | integer (timestamp) | not null, default `now`            |                                               |
| updatedAt  | integer (timestamp) | not null, default `now`            |                                               |
| graphId    | text                | not null, FK → graphs.id (cascade) | Parent graph                                  |
| key        | text                | not null                           | Consumer-defined identity within the graph    |
| attributes | text (JSON)         | not null, default `{}`             | Node attributes validated by node type schema |

**Unique constraint**: `(graphId, key)` — node keys are unique within a graph.

**No `nodeTypeId` column**: Nodes do not have a direct FK to `node_types`. The
node type is determined at the application layer. This is a deliberate design
decision — adding a `nodeTypeId` FK would couple the graph instance layer to the
type definition layer. The repository layer can enforce node type constraints
via validation against the graph type's schema.

### `edges`

Edges within a graph instance. Keyed by `(graphId, key)` — unique within a
graph.

| Column        | Type                | Constraints                        | Notes                                                |
| ------------- | ------------------- | ---------------------------------- | ---------------------------------------------------- |
| id            | text                | PK                                 |                                                      |
| metadata      | text (JSON)         | default `{}`                       |                                                      |
| createdAt     | integer (timestamp) | not null, default `now`            |                                                      |
| updatedAt     | integer (timestamp) | not null, default `now`            |                                                      |
| graphId       | text                | not null, FK → graphs.id (cascade) | Parent graph                                         |
| key           | text                |                                    | Consumer-defined identity (null for anonymous edges) |
| sourceNodeKey | text                | not null                           | Source node key within the graph                     |
| targetNodeKey | text                | not null                           | Target node key within the graph                     |
| attributes    | text (JSON)         | not null, default `{}`             | Edge attributes validated by edge type schema        |
| undirected    | integer (boolean)   | default false                      | Treat as undirected regardless of graph type         |

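**Unique constraint**: `(graphId, key)`, so edge keys are unique within a graph.
SQLite treats `NULL` values as distinct in unique constraints, so any number of
anonymous edges (null `key`) can coexist in one graph.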

**Foreign keys**: `(graphId, sourceNodeKey)` and `(graphId, targetNodeKey)` each
reference `(nodes.graphId, nodes.key)` with cascade delete. Deleting a node
removes every edge that starts or ends at it.
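
Because the reference includes `graphId`, an edge can only point at nodes in its
own graph. The following is an in-memory model of that behaviour, not the
Drizzle schema itself. The row shapes and function names are illustrative
assumptions.

```ts
interface NodeRow {
  graphId: string;
  key: string;
}

interface EdgeRow {
  graphId: string;
  key: string | null;
  sourceNodeKey: string;
  targetNodeKey: string;
}

// True when both endpoints resolve to a node in the edge's own graph, which
// is what the composite foreign keys require.
function endpointsExist(edge: EdgeRow, nodes: readonly NodeRow[]): boolean {
  const has = (key: string) =>
    nodes.some((n) => n.graphId === edge.graphId && n.key === key);
  return has(edge.sourceNodeKey) && has(edge.targetNodeKey);
}

// Models the cascade: deleting node (graphId, nodeKey) drops every edge in
// that graph that starts or ends at the node. Other graphs are untouched.
function edgesAfterNodeDelete(
  edges: readonly EdgeRow[],
  graphId: string,
  nodeKey: string,
): EdgeRow[] {
  return edges.filter(
    (e) =>
      !(
        e.graphId === graphId &&
        (e.sourceNodeKey === nodeKey || e.targetNodeKey === nodeKey)
      ),
  );
}
```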

### `actors`

Standalone identity table. It has no FK references to or from any metagraph
table and is not included in `relations.ts`. It is a placeholder for identity
data: it may become a node type in an ACL graph (based on the `Identity`
interface from `@alkdev/operations`) or remain a standalone table. See
[overview.md](./overview.md) Open Question 1.

| Column    | Type                | Constraints                             | Notes              |
| --------- | ------------------- | --------------------------------------- | ------------------ |
| id        | text                | PK                                      |                    |
| metadata  | text (JSON)         | default `{}`                            |                    |
| createdAt | integer (timestamp) | not null, default `now`                 |                    |
| updatedAt | integer (timestamp) | not null, default `now`                 |                    |
| name      | text                | not null                                | Actor display name |
| type      | text                | not null, enum: `human`, `llm`, `agent` | Actor type         |

## Relations

Drizzle relational mappings define the following relationships:

- **graphTypes → nodeTypes**: one-to-many
- **graphTypes → edgeTypes**: one-to-many
- **graphTypes → graphs**: one-to-many
- **graphs → nodes**: one-to-many
- **graphs → edges**: one-to-many
- **nodes → outgoing edges** (sourceNode): one-to-many
- **nodes → incoming edges** (targetNode): one-to-many
- **edges → source node**: many-to-one (via composite key)
- **edges → target node**: many-to-one (via composite key)

## Client Factory

```ts
import { createSqliteDatabase } from "@alkdev/storage/sqlite";
import type { SqliteDatabase } from "@alkdev/storage/sqlite";
import { createClient } from "@libsql/client";

const client = createClient({ url: "file:local.db" });
const db: SqliteDatabase = createSqliteDatabase(client);
```

The factory takes a pre-created `@libsql/client` client and returns a typed
Drizzle database instance with the full schema attached. This enables:

- In-memory testing with `createClient({ url: ":memory:" })`
- Turso remote connections
- Custom client configuration (auth tokens, etc.)

## Design Decisions

### SD1: JSON text vs. JSONB in SQLite

The SQLite host stores JSON in `text` columns declared with Drizzle's
`{ mode: "json" }`. PostgreSQL uses native `jsonb`. This means:

- SQLite cannot query inside JSON columns efficiently (no GIN indexes)
- SQLite JSON validation relies on application-level checks (TypeBox schemas)
- PostgreSQL will get queryability benefits for JSON columns

The trade-off: SQLite is for spokes (local, infrequent queries), PostgreSQL is
for the hub (frequent, complex queries).

### SD2: No `nodeTypeId` on nodes

Nodes don't carry a direct FK to `node_types`. The node type is enforced at the
application layer. Reasons:

- Graph type schemas define which node types are valid. Adding a FK would
  duplicate this constraint.
- Node types can evolve (schemas can change) without requiring node row updates.
- The repository layer validates node attributes against the appropriate node
  type schema before insertion.

This may change if query performance requires filtering nodes by type. In that
case a `nodeTypeId` column can be added as a denormalized, indexed column.

### SD3: Edge identity uses consumer-defined keys

Edges use `(graphId, key)` as their unique identity. The `key` is
consumer-defined, matching the metagraph model where consumers control
identifiers. For anonymous edges (common in simple graphs), `key` is either left
null, as the `edges` table allows, or auto-generated.

### SD4: Composite foreign keys for node references

Edges reference nodes via composite FKs:
`(graphId, sourceNodeKey) → (nodes.graphId, nodes.key)`. This ensures
referential integrity within a graph and cascades node deletions to connected
edges.

### SD5: Enum pattern — `as const` objects, not TypeScript enums

All enumerations use the `as const` object pattern (e.g.,
`GRAPH_STATUS = { Active: "active", ... } as const`) rather than TypeScript
`enum`. This matches the `ACTOR_TYPE` pattern in `common.ts` and avoids JSR
slow-type issues. The TypeBox schema is a `Type.Union` of `Type.Literal` values
derived from the object.
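
A minimal sketch of the pattern. The `Archived` and `Draft` key names are
assumed from the `status` values listed for the `graphs` table, and
`isGraphStatus` is a hypothetical guard. The TypeBox `Type.Union` step is left
out to keep the example dependency-free.

```ts
// `as const` keeps the literal types instead of widening them to `string`.
const GRAPH_STATUS = {
  Active: "active",
  Archived: "archived",
  Draft: "draft",
} as const;

// Union of the object's values: "active" | "archived" | "draft".
type GraphStatus = (typeof GRAPH_STATUS)[keyof typeof GRAPH_STATUS];

// Runtime list of the allowed values, derived from the same object so the
// type and the list cannot drift apart.
const GRAPH_STATUS_VALUES: readonly GraphStatus[] = Object.values(GRAPH_STATUS);

// Hypothetical runtime guard for untrusted input.
function isGraphStatus(value: unknown): value is GraphStatus {
  return GRAPH_STATUS_VALUES.some((status) => status === value);
}
```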

## Metadata Convention

Every table has a `metadata` JSON column defaulting to `{}`. This is an
extension namespace for subsystem use, following a namespacing convention:
`_subsystem.key` (e.g., `_keypal.scopes`, `_retention.expiresAt`).

**What metadata is for**: Opaque key-value pairs that subsystems add without
schema changes. It's never queried in WHERE clauses or JOINs.

**What metadata is NOT for**: A replacement for typed columns. If a field
appears in WHERE clauses, JOIN conditions, or needs a constraint, it should be a
proper column — not buried in metadata. When in doubt, add a column.

**Namespacing convention**: Subsystems should prefix their keys (e.g.,
`_callgraph.payloadRef`, `_acl.inherited`). Unprefixed keys are reserved for the
storage package itself.
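
A minimal sketch of reading and writing namespaced metadata. It assumes the
`_subsystem.key` form is stored as a flat string key on the metadata object,
which this document does not spell out. All three helpers are hypothetical.

```ts
type Metadata = Record<string, unknown>;

// Builds the `_subsystem.key` form (assumption: used as one flat key).
function metadataKey(subsystem: string, key: string): string {
  return `_${subsystem}.${key}`;
}

// Returns a copy of `metadata` with one namespaced entry set.
// The input object is not mutated.
function setNamespaced(
  metadata: Metadata,
  subsystem: string,
  key: string,
  value: unknown,
): Metadata {
  return { ...metadata, [metadataKey(subsystem, key)]: value };
}

// Reads a namespaced entry back, or `undefined` when it is absent.
function getNamespaced(
  metadata: Metadata,
  subsystem: string,
  key: string,
): unknown {
  return metadata[metadataKey(subsystem, key)];
}
```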

## Concurrency Model

The SQLite host targets spoke deployments where a single process accesses the
database. For this model, SQLite's default journal mode is sufficient. However,
for spoke deployments that may run concurrent writes (e.g., multiple worker
threads), consumers should:

1. **Enable WAL mode**: `PRAGMA journal_mode=WAL;` — allows concurrent reads
   during writes
2. **Set busy timeout**: `PRAGMA busy_timeout=5000;` — wait up to 5 seconds for
   lock acquisition
3. **Use a single writer**: SQLite supports one writer at a time. If multiple
   threads write, route writes through a single queue or connection

The `createSqliteDatabase()` factory does not set these pragmas — it's the
consumer's responsibility to configure the SQLite connection appropriately. The
libsql client used to create the connection can be pre-configured before passing
it to the factory.
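
The three recommendations can be sketched as follows. `SqlExecutor` is a
structural stand-in that is assumed to match the `execute` method of the
`@libsql/client` client. `concurrencyPragmas`, `configureForConcurrency`, and
`createWriteQueue` are hypothetical helpers, not part of `@alkdev/storage`.

```ts
// Structural subset of a client: anything with `execute(sql)` fits.
interface SqlExecutor {
  execute(sql: string): Promise<unknown>;
}

// The two pragmas recommended above, in list order.
function concurrencyPragmas(busyTimeoutMs = 5000): string[] {
  return ["PRAGMA journal_mode=WAL;", `PRAGMA busy_timeout=${busyTimeoutMs};`];
}

// Runs the pragmas on a client before it is passed to createSqliteDatabase().
async function configureForConcurrency(
  client: SqlExecutor,
  busyTimeoutMs = 5000,
): Promise<void> {
  for (const pragma of concurrencyPragmas(busyTimeoutMs)) {
    await client.execute(pragma);
  }
}

// Single-writer queue: each task starts only after the previous one settles.
function createWriteQueue() {
  let tail: Promise<unknown> = Promise.resolve();
  return function enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = tail.then(task);
    // Swallow the rejection on the chain only, so later tasks still run.
    // The caller still sees the rejection through `result`.
    tail = result.catch(() => undefined);
    return result;
  };
}
```

`journal_mode=WAL` is stored in the database file, while `busy_timeout` applies
per connection, so the timeout needs to be set on each connection that writes.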

## PostgreSQL Porting Notes

When implementing `src/pg/`, the table shapes remain the same but with these
changes:

| SQLite                             | PostgreSQL                               |
| ---------------------------------- | ---------------------------------------- |
| `sqliteTable`                      | `pgTable`                                |
| `text` (JSON mode)                 | `jsonb` with `.$type<T>()`               |
| `integer` (timestamp mode)         | `timestamp` with timezone                |
| `` sql`(strftime('%s', 'now'))` `` | `` sql`now()` ``                         |
| `integer` (boolean mode)           | `boolean`                                |
| `text` (enum)                      | `pgEnum` or `text` with check constraint |

For PostgreSQL reference patterns, see a consumer's `commonCols` definition, for
example the hub's
`/workspace/@alkdev/hub/docs/architecture/storage/table-reference.md`.

## References

- Drizzle ORM SQLite core: https://orm.drizzle.team/docs/sqlite-core
- libsql client: https://github.com/tursodatabase/libsql
- Hub common columns (reference consumer):
  `/workspace/@alkdev/hub/docs/architecture/storage/table-reference.md`
- Operations AccessControl and Identity:
  `/workspace/@alkdev/operations/docs/architecture/api-surface.md`
- Source: `src/sqlite/`