glm-5.1 33a5b0816d docs: correct ecosystem dependency direction and add integration context

Architecture docs previously referenced the hub as the authoritative source
for call/identity specs. In reality, call protocol, identity, and access control
come from @alkdev/operations; call graph schemas from @alkdev/flowgraph; task
graph schemas from @alkdev/taskgraph; event transport from @alkdev/pubsub. The
hub is a consumer of @alkdev/storage, not the other way around.

Key changes:
- overview.md: add Ecosystem Integration section with dependency direction
  diagram, What Comes From Where table, repo layer bridging pattern, and
  circular dependency avoidance guidance
- overview.md: promote repo-layer vs operations-bridging from open question
  to explicit decision (CRUD in storage, bridging in consumer)
- overview.md: add zero-ecosystem-dependency statement; fix taskgraph type
  names (TaskGraphNodeAttributes, DependencyEdge)
- overview.md: fix terminology (hub is consumer, not authority)
- metagraph.md: add Ecosystem Context section; replace hub references with
  correct ecosystem sources; fix GraphStatus/GraphBaseType enum
  mischaracterization (C1); unify empty-array semantics with sqlite-host (C2);
  clarify repo layer does NOT import operations (C3); add flowgraph canonical
  schema note; add versioning cross-reference to graph_types table
- encrypted-data.md: reframe hub as provenance not authority; update What
  Lives Where table; fix standalone table advice; update references
- sqlite-host.md: fix actors table description; unify empty-array semantics;
  contextualize hub as reference consumer; add operations identity reference

2026-05-28 14:25:16 +00:00

status	last_updated
draft	2026-05-28

SQLite Host

The SQLite database host for @alkdev/storage. Uses Drizzle ORM with libsql/Turso for the SQLite dialect and @alkdev/drizzlebox for TypeBox schema generation from Drizzle table definitions.

Overview

The SQLite host provides:

Drizzle table definitions for the metagraph pattern (graph types, node types, edge types, graphs, nodes, edges) plus a standalone actors table
Drizzle relations for the relational query API
TypeBox schemas auto-generated from Drizzle tables (select/insert validation)
Injectable database factory — createSqliteDatabase(client) accepts a pre-created client

The SQLite host is the first-class target. PostgreSQL will follow the same table shapes with appropriate dialect changes.

Package Structure

src/sqlite/
├── tables/
│   ├── common.ts          # commonCols, ACTOR_TYPE enum
│   ├── graphTypes.ts      # graph_types table + select/insert schemas
│   ├── nodeTypes.ts        # node_types table + select/insert schemas
│   ├── edgeTypes.ts        # edge_types table + select/insert schemas
│   ├── graphs.ts           # graphs table + select/insert schemas
│   ├── nodes.ts            # nodes table + select/insert schemas
│   ├── edges.ts            # edges table + select/insert schemas
│   ├── actors.ts           # actors table + select/insert schemas
│   └── index.ts            # barrel re-export
├── relations.ts            # Drizzle relational mappings
├── schema.ts              # re-exports tables + relations
└── client.ts              # createSqliteDatabase()

Tables

Common Columns

All tables share these columns:

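import { sql } from "drizzle-orm";
import { integer, text } from "drizzle-orm/sqlite-core";

// Columns shared by every table (commonCols lives in tables/common.ts).
export const commonCols = {
  // Consumer-generated ID: no $defaultFn on the SQLite side.
  id: text("id").primaryKey(),
  // Extension namespace, stored as JSON text.
  metadata: text("metadata", { mode: "json" }).$type<Record<string, unknown>>().default({}),
  // Unix epoch seconds, filled by SQLite at insert time.
  createdAt: integer("created_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
  // Same insert-time default; no $onUpdate, so updates must set it explicitly.
  updatedAt: integer("updated_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
};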

Notable differences from a typical PostgreSQL common columns pattern:

Column	SQLite	PostgreSQL (typical)
`id`	text PK (consumer-generated)	text PK with `$defaultFn(() => crypto.randomUUID())`
`metadata`	`text` with JSON mode	`jsonb` with `$type<Record<string, unknown>>()`
`createdAt`	`integer` timestamp mode (Unix epoch)	`timestamp with timezone` defaulting `now()`
`updatedAt`	`integer` timestamp mode (Unix epoch)	`timestamp with timezone` defaulting `now()` with `$onUpdate`

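The SQLite columns do NOT have $defaultFn for ID generation, so the consumer must provide every ID. They also do NOT have $onUpdate for updatedAt: the SQL default applies only at insert time, and Drizzle's $onUpdate would be an application-level hook in any case, so consumers must set updatedAt explicitly on every update.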
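
Since neither default exists on the SQLite side, the consumer has to supply both values itself. The following is a minimal, dependency-free sketch of that responsibility. The helper names `newCommonFields` and `touch` are hypothetical and are not part of @alkdev/storage.

```typescript
// Hypothetical helpers: the consumer supplies IDs and bumps updatedAt itself.
interface CommonFields {
  id: string;
  metadata: Record<string, unknown>;
  createdAt: Date;
  updatedAt: Date;
}

// Build the common fields for a new row from a consumer-generated ID.
function newCommonFields(id: string, now: Date = new Date()): CommonFields {
  return { id, metadata: {}, createdAt: now, updatedAt: now };
}

// Attach a fresh updatedAt to any update payload before it is written.
function touch<T extends object>(patch: T, now: Date = new Date()): T & { updatedAt: Date } {
  return { ...patch, updatedAt: now };
}

const created = newCommonFields("graph-1", new Date(1000));
const renamed = touch({ name: "renamed" }, new Date(2000));
```

With Drizzle, a payload like `renamed` would typically be passed to `db.update(table).set(...)`, so that every write path goes through the same `touch` step.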

`graph_types`

Stores graph type definitions (schemas for classes of graphs).

Column	Type	Constraints	Notes
id	text	PK	Consumer-generated UUID
metadata	text (JSON)	default `{}`	Extension namespace
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
name	text	not null, unique	Graph type name (e.g., "call-graph", "acl")
description	text	default `""`	Human-readable description
config	text (JSON)	not null	`GraphConfig` — directed/undirected/mixed, multi, self-loops
version	integer	not null, default 1	Breaking schema version

`node_types`

Stores node type definitions within a graph type. Each node type has a TypeBox schema that validates node attributes.

Column	Type	Constraints	Notes
id	text	PK
metadata	text (JSON)	default `{}`
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
graphTypeId	text	not null, FK → graphTypes.id (cascade)	Parent graph type
name	text	not null	Node type name (e.g., "call", "account")
description	text	default `""`
schema	text (JSON)	not null	TypeBox schema for node attributes

Unique constraint: (graphTypeId, name) — node type names are unique within a graph type.

`edge_types`

Stores edge type definitions within a graph type.

Column	Type	Constraints	Notes
id	text	PK
metadata	text (JSON)	default `{}`
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
graphTypeId	text	not null, FK → graphTypes.id (cascade)	Parent graph type
name	text	not null	Edge type name (e.g., "triggered", "can_read")
description	text	default `""`
schema	text (JSON)	not null	TypeBox schema for edge attributes
allowedSourceTypes	text (JSON)	default `[]`	Node type names valid at source endpoint
allowedTargetTypes	text (JSON)	default `[]`	Node type names valid at target endpoint

Unique constraint: (graphTypeId, name) — edge type names are unique within a graph type.

Empty array semantics: allowedSourceTypes and allowedTargetTypes default to [] (empty JSON array) in the database. [] means "no restriction" — any node type is a valid endpoint — matching the behavior of undefined in the EdgeType schema layer. A non-empty array restricts endpoints to only the listed node types. There is no "no types allowed" state; if edge types need to be disabled, use a status or soft-delete pattern on the edge type definition. The repository layer must enforce this convention consistently. See metagraph.md for the schema-layer definition.
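
The convention above fits in a single predicate. This is only a sketch of how a repository layer might apply it; `isEndpointAllowed` is a hypothetical name, not an existing export.

```typescript
// Sketch of the empty-array convention for allowedSourceTypes / allowedTargetTypes.
function isEndpointAllowed(
  allowedTypes: readonly string[] | undefined,
  nodeTypeName: string,
): boolean {
  // undefined (schema layer) and [] (database default) both mean "no restriction".
  if (allowedTypes === undefined || allowedTypes.length === 0) {
    return true;
  }
  // A non-empty array restricts the endpoint to the listed node type names.
  return allowedTypes.some((allowed) => allowed === nodeTypeName);
}
```

Treating `undefined` and `[]` in the same branch keeps the schema layer and the database default from drifting apart.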

`graphs`

Graph instances. Each graph belongs to a graph type.

Column	Type	Constraints	Notes
id	text	PK
metadata	text (JSON)	default `{}`
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
graphTypeId	text	FK → graphTypes.id (set null)	Set null on graph type deletion (orphan graph)
name	text	not null	Graph instance name
description	text	default `""`
status	text	not null, enum: `active`, `archived`, `draft`	Default: `draft`

On graphTypeId set null: When a graph type is deleted, its graphs become orphans with graphTypeId = null. The application should prevent graph type deletion if active graphs reference it, or set affected graphs' status to archived as part of a soft-delete workflow. Orphan graphs cannot validate their node/edge types against a missing type definition — queries against orphan graphs should check for graphTypeId !== null before performing type-aware operations.
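
One way to make the orphan check hard to forget is a type guard that type-aware code must pass through. The row shape and function names below are assumptions for illustration, not the package's actual types.

```typescript
// Hypothetical row shape mirroring the graphs columns used here.
interface GraphRow {
  id: string;
  graphTypeId: string | null;
  status: "active" | "archived" | "draft";
}

// Type guard: narrows to graphs that still reference a graph type.
function hasGraphType(graph: GraphRow): graph is GraphRow & { graphTypeId: string } {
  return graph.graphTypeId !== null;
}

// Returns the graph type ID, or throws for an orphan graph.
function requireGraphType(graph: GraphRow): string {
  if (!hasGraphType(graph)) {
    throw new Error(`Graph ${graph.id} is an orphan; type-aware operations are unavailable`);
  }
  return graph.graphTypeId;
}
```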

`nodes`

Nodes within a graph instance. Keyed by (graphId, key) — unique within a graph.

Column	Type	Constraints	Notes
id	text	PK
metadata	text (JSON)	default `{}`
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
graphId	text	not null, FK → graphs.id (cascade)	Parent graph
key	text	not null	Consumer-defined identity within the graph
attributes	text (JSON)	not null, default `{}`	Node attributes validated by node type schema

Unique constraint: (graphId, key) — node keys are unique within a graph.

No nodeTypeId column: Nodes do not have a direct FK to node_types. The node type is determined at the application layer. This is a deliberate design decision — adding a nodeTypeId FK would couple the graph instance layer to the type definition layer. The repository layer can enforce node type constraints via validation against the graph type's schema.

`edges`

Edges within a graph instance. Keyed by (graphId, key) — unique within a graph.

Column	Type	Constraints	Notes
id	text	PK
metadata	text (JSON)	default `{}`
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
graphId	text	not null, FK → graphs.id (cascade)	Parent graph
key	text		Consumer-defined identity (null for anonymous edges)
sourceNodeKey	text	not null	Source node key within the graph
targetNodeKey	text	not null	Target node key within the graph
attributes	text (JSON)	not null, default `{}`	Edge attributes validated by edge type schema
undirected	integer (boolean)	default false	Treat as undirected regardless of graph type

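Unique constraint: (graphId, key). Edge keys are unique within a graph. SQLite treats NULL values as distinct in unique constraints, so any number of anonymous edges (key = null) can coexist in the same graph.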

Foreign keys: (graphId, sourceNodeKey) and (graphId, targetNodeKey) each reference (nodes.graphId, nodes.key) with cascade delete, so deleting a node removes every edge connected to it.

`actors`

Standalone identity table. Currently not referenced by any relation — the actors table has no FK references to or from any metagraph table and is not included in relations.ts. This is a placeholder for identity data and may become a node type in an ACL graph (based on @alkdev/operations's Identity interface) or remain a standalone table. See overview.md Open Question 1.

Column	Type	Constraints	Notes
id	text	PK
metadata	text (JSON)	default `{}`
createdAt	integer (timestamp)	not null, default `now`
updatedAt	integer (timestamp)	not null, default `now`
name	text	not null	Actor display name
type	text	not null, enum: `human`, `llm`, `agent`	Actor type

Relations

Drizzle relational mappings define the following relationships:

graphTypes → nodeTypes: one-to-many
graphTypes → edgeTypes: one-to-many
graphTypes → graphs: one-to-many
graphs → nodes: one-to-many
graphs → edges: one-to-many
nodes → outgoing edges (sourceNode): one-to-many
nodes → incoming edges (targetNode): one-to-many
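edges → source node: many-to-one (each edge has exactly one source node, resolved via the composite key)
edges → target node: many-to-one (each edge has exactly one target node, resolved via the composite key)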

Client Factory

import { createSqliteDatabase } from "@alkdev/storage/sqlite";
import type { SqliteDatabase } from "@alkdev/storage/sqlite";
import { createClient } from "@libsql/client";

const client = createClient({ url: "file:local.db" });
const db: SqliteDatabase = createSqliteDatabase(client);

The factory takes a pre-created @libsql/client client and returns a typed Drizzle database instance with the full schema attached. This enables:

In-memory testing with createClient({ url: ":memory:" })
Turso remote connections
Custom client configuration (auth tokens, etc.)

Design Decisions

SD1: JSON text vs. JSONB in SQLite

SQLite stores JSON as text with { mode: "json" }. PostgreSQL uses native jsonb. This means:

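SQLite cannot query inside JSON columns efficiently: it has no GIN-style index, so filters on JSON paths (e.g., via json_extract()) scan rows unless an expression index is added for each path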
SQLite JSON validation relies on application-level checks (TypeBox schemas)
PostgreSQL will get queryability benefits for JSON columns

The trade-off: SQLite is for spokes (local, infrequent queries), PostgreSQL is for the hub (frequent, complex queries).

SD2: No `nodeTypeId` on nodes

Nodes don't carry a direct FK to node_types. The node type is enforced at the application layer. Reasons:

Graph type schemas define which node types are valid. Adding a FK would duplicate this constraint.
Node types can evolve (schemas can change) without requiring node row updates.
The repository layer validates node attributes against the appropriate node type schema before insertion.

This may change if query performance requires filtering nodes by type. A nodeTypeId column can be added as a denormalized index.
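
A rough sketch of application-layer enforcement follows. It assumes the caller knows the node type name and that each node type schema has been turned into a validator function (for example by compiling its TypeBox schema). The registry shape, the function names, and the `operation` attribute are all hypothetical.

```typescript
// A validator returns true when attributes satisfy a node type's schema.
type AttributeValidator = (attributes: unknown) => boolean;

// Hypothetical registry: node type name -> validator, scoped to one graph type.
type NodeTypeRegistry = Record<string, AttributeValidator>;

// Throws unless the node type exists and the attributes pass its validator.
function validateNodeAttributes(
  registry: NodeTypeRegistry,
  nodeTypeName: string,
  attributes: unknown,
): void {
  if (!Object.prototype.hasOwnProperty.call(registry, nodeTypeName)) {
    throw new Error(`Unknown node type: ${nodeTypeName}`);
  }
  if (!registry[nodeTypeName](attributes)) {
    throw new Error(`Invalid attributes for node type: ${nodeTypeName}`);
  }
}

// Example registry with one illustrative node type.
const exampleRegistry: NodeTypeRegistry = {
  call: (attrs) =>
    typeof attrs === "object" &&
    attrs !== null &&
    typeof (attrs as { operation?: unknown }).operation === "string",
};
```

Running this check before every insert gives the same guarantee a foreign key would, without storing a type reference on each node row.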

SD3: Edge identity uses consumer-defined keys

Edges use (graphId, key) as their unique identity. The key is consumer-defined, matching the metagraph model where consumers control identifiers. For anonymous edges (common in simple graphs), the key can be left null or auto-generated by the consumer.

SD4: Composite foreign keys for node references

Edges reference nodes via composite FKs: (graphId, sourceNodeKey) → (nodes.graphId, nodes.key). This ensures referential integrity within a graph and cascades node deletions to connected edges.

SD5: Enum pattern — `as const` objects, not TypeScript enums

All enumerations use the as const object pattern (e.g., GRAPH_STATUS = { Active: "active", ... } as const) rather than TypeScript enum. This matches the ACTOR_TYPE pattern in common.ts and avoids JSR slow-type issues. The TypeBox schema is a Type.Union of Type.Literal values derived from the object.
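
For illustration, the pattern might look like the sketch below. The values come from the graphs.status column; the key names other than Active and the `isGraphStatus` helper are assumptions. The TypeBox union is left out to keep the example dependency-free.

```typescript
// Enum-like object: values are string literals, preserved by `as const`.
const GRAPH_STATUS = {
  Active: "active",
  Archived: "archived",
  Draft: "draft",
} as const;

// Union of the literal values: "active" | "archived" | "draft".
type GraphStatus = (typeof GRAPH_STATUS)[keyof typeof GRAPH_STATUS];

// Runtime guard derived from the same object, so type and check stay in sync.
function isGraphStatus(value: unknown): value is GraphStatus {
  return Object.keys(GRAPH_STATUS).some(
    (key) => GRAPH_STATUS[key as keyof typeof GRAPH_STATUS] === value,
  );
}

const defaultStatus: GraphStatus = GRAPH_STATUS.Draft;
```

Deriving the type and the guard from one object means adding a status is a single-line change.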

Metadata Convention

Every table has a metadata JSON column defaulting to {}. This is an extension namespace for subsystem use, following a namespacing convention: _subsystem.key (e.g., _keypal.scopes, _retention.expiresAt).

What metadata is for: Opaque key-value pairs that subsystems add without schema changes. It's never queried in WHERE clauses or JOINs.

What metadata is NOT for: A replacement for typed columns. If a field appears in WHERE clauses, JOIN conditions, or needs a constraint, it should be a proper column — not buried in metadata. When in doubt, add a column.

Namespacing convention: Subsystems should prefix their keys (e.g., _callgraph.payloadRef, _acl.inherited). Unprefixed keys are reserved for the storage package itself.
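
A small sketch of how a subsystem could read and write its own keys. It assumes that `_subsystem.key` denotes a flat, dotted key in the metadata object rather than a nested path, which the convention above does not state explicitly. All function names are hypothetical.

```typescript
type Metadata = Record<string, unknown>;

// Build a namespaced key such as "_retention.expiresAt".
function metadataKey(subsystem: string, key: string): string {
  return `_${subsystem}.${key}`;
}

// Return a copy of the metadata with one namespaced entry set (no mutation).
function setMetadata(
  metadata: Metadata,
  subsystem: string,
  key: string,
  value: unknown,
): Metadata {
  return { ...metadata, [metadataKey(subsystem, key)]: value };
}

// List the keys that belong to one subsystem.
function keysFor(metadata: Metadata, subsystem: string): string[] {
  const prefix = `_${subsystem}.`;
  return Object.keys(metadata).filter((k) => k.indexOf(prefix) === 0);
}
```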

Concurrency Model

The SQLite host targets spoke deployments where a single process accesses the database. For this model, SQLite's default journal mode is sufficient. However, for spoke deployments that may run concurrent writes (e.g., multiple worker threads), consumers should:

Enable WAL mode: PRAGMA journal_mode=WAL; — allows concurrent reads during writes
Set busy timeout: PRAGMA busy_timeout=5000; — wait up to 5 seconds for lock acquisition
Use a single writer: SQLite supports one writer at a time. If multiple threads write, route writes through a single queue or connection

The createSqliteDatabase() factory does not set these pragmas — it's the consumer's responsibility to configure the SQLite connection appropriately. The libsql client used to create the connection can be pre-configured before passing it to the factory.
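
The recommendations above can be sketched as two small pieces: a function that issues the pragmas, and a queue that serializes writes. `SqlExecutor` is a minimal structural interface assumed to be compatible with the `execute()` method of the libsql client. `applySpokePragmas` and `WriteQueue` are hypothetical names, and whether a given client or remote connection honors these pragmas should be verified for your setup.

```typescript
// Minimal structural interface; assumed compatible with the libsql client's execute().
interface SqlExecutor {
  execute(sql: string): Promise<unknown>;
}

// Issue the pragmas before handing the client to createSqliteDatabase().
async function applySpokePragmas(client: SqlExecutor, busyTimeoutMs = 5000): Promise<void> {
  await client.execute("PRAGMA journal_mode=WAL;");
  await client.execute(`PRAGMA busy_timeout=${busyTimeoutMs};`);
}

// Serializes writes: each task starts only after the previous one has settled.
class WriteQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(() => task());
    // Keep the chain alive even if a task rejects.
    this.tail = run.then(
      () => undefined,
      () => undefined,
    );
    return run;
  }
}
```

Worker threads would then submit writes through one shared `WriteQueue` instance instead of writing directly, while reads proceed concurrently under WAL.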

PostgreSQL Porting Notes

When implementing src/pg/, the table shapes remain the same but with these changes:

SQLite	PostgreSQL
`sqliteTable`	`pgTable`
`text` (JSON mode)	`jsonb` with `.$type<T>()`
`integer` (timestamp mode)	`timestamp` with timezone
`` sql`(strftime('%s', 'now'))` ``	`` sql`now()` ``
`integer` (boolean mode)	`boolean`
`text` (enum)	`pgEnum` or `text` with check constraint

See a consumer's commonCols pattern (e.g., the hub's /workspace/@alkdev/hub/docs/architecture/storage/table-reference.md) for PostgreSQL reference patterns.

References

Drizzle ORM SQLite core: https://orm.drizzle.team/docs/sqlite-core
libsql client: https://github.com/tursodatabase/libsql
Hub common columns (reference consumer): /workspace/@alkdev/hub/docs/architecture/storage/table-reference.md
Operations AccessControl and Identity: /workspace/@alkdev/operations/docs/architecture/api-surface.md
Source: src/sqlite/
