ADR-018: Remove AI SDK, use openai SDK directly with hub-own streaming
Replace the Vercel AI SDK with direct OpenAI SDK calls and a custom AgentLoop. The AI SDK has zero runtime integration today, so removing it costs nothing. Supply chain risk (2-5 releases/day, April 2026 Vercel breach, bus factor of 1) makes it a liability we don't need.

Key changes:
- ADR-018 accepted: openai package (zero runtime deps) replaces ai SDK
- AgentLoop handles multi-step tool execution explicitly (~300 LOC vs AI SDK's ~2700 LOC streamText)
- Hub owns UIMessage/UIPart/ToolCallState types (extends ADR-016)
- Hub owns streaming protocol (subset of AI SDK's UIMessageChunk wire format with step boundaries, error handling, usage tracking)
- operationToOpenAITool() maps TypeBox schemas directly, no adapter
- Trade-off: ~1100 LOC total new code for the savings of 6+ transitive deps, supply chain risk, and release cadence coupling

Updates AGENTS.md constraints and dependencies, adds OQ-63/OQ-64/OQ-65 and Theme 11 (Inference & LLM Integration) to open questions.
docs/decisions/ADR-018-no-ai-sdk-direct-openai-proxy.md
@@ -0,0 +1,340 @@
# ADR-018: No AI SDK — direct OpenAI proxy with hub-own streaming

- **Status**: Accepted
- **Date**: 2026-05-26
- **Deciders**: alkdev

## Context

The hub was architected with the Vercel AI SDK (`ai` package + `@ai-sdk/*`) as a core dependency for LLM streaming. `agent-sessions.md` describes direct agents using `streamText()`/`generateText()` with `proxyProvider()` and `operationToTool()` bridging the operations registry to AI SDK tools. ADR-016 made AI SDK `UIMessage` the primary design constraint for the session/message/part schema.

However, the AI SDK has **zero runtime integration today**: it appears only in architecture docs, and `deno.json` has no `ai` import. The hub's `src/inference/` directory doesn't exist yet. This is the right time to remove it, before it becomes entrenched.

### Supply chain risk

The AI SDK presents moderate supply chain risk:

1. **Extreme release cadence**: 2-5 releases/day across 3 version lines (1,224 total npm versions). Every release is surface area for compromise or regression.
2. **April 2026 Vercel security incident**: A threat actor compromised a Vercel employee's Google Workspace account via a supply chain attack on Context.ai, gaining access to Vercel's internal systems. npm publish tokens were rotated after the breach. While no `ai` packages were confirmed compromised, the attack vector is real.
3. **Bus factor of 1**: One dominant contributor (Lgrammel, 1,980 commits, 5x the #2 contributor). No CODEOWNERS file, no formal governance model.
4. **Transitive dependency concerns**: `json-schema@0.4.0` is unmaintained with a single maintainer. `@vercel/oidc` is Vercel-specific infrastructure coupling (though only in `@ai-sdk/gateway`, which we wouldn't use).
5. **Automated release pipeline**: Changesets auto-merge and auto-publish. A compromised maintainer account or malicious PR could publish a poisoned package.

For comparison, the `openai` npm package has **zero runtime dependencies**, is auto-generated from an OpenAPI spec, and releases ~1/week.

### Why not "AI SDK with hardening"?

The supply chain risk assessment ([ai-sdk-supply-chain-risk.md](../research/ai-sdk-supply-chain-risk.md)) recommends "use the AI SDK with supply chain hardening" as its primary option. This ADR goes further and removes the AI SDK entirely. The reasoning:

1. **Zero runtime integration = zero migration cost**: The hub has no `ai` import in any source file. There is nothing to migrate. Removing a planned dependency that hasn't been integrated yet is essentially free; adding it and removing it later would be expensive.

2. **Ownership philosophy**: ADR-015 removed opencode because the hub should own its data model and execution model. ADR-016 established hub-own schema ownership. The same principle applies to the streaming protocol and message types: the hub should own these, not have them constrained by a third-party library's release cadence.

3. **The proxy already abstracts provider routing**: The hub's OpenAI-compatible proxy (already architecturally committed) routes calls to providers. A new provider means adding a route in the proxy, not swapping AI SDK provider packages. The AI SDK's multi-provider abstraction provides no value in this architecture.

4. **Security is cumulative**: Every supply chain attack surface we remove adds to the overall reduction. Removing opencode (ADR-015) reduced the attack surface; removing the AI SDK continues this. We're building a platform for other people's production workloads, so minimizing trust in external packages with high release cadence and corporate attack targets is a reasonable posture.

5. **The code is bounded and well-understood**: The AI SDK's streaming protocol is well-specified. Reimplementing a subset that covers the hub's needs is ~1100 lines of focused code (see Implementation scope). This is not a risky unknown; it's a straightforward SSE transformation with clear input/output formats.

### What we actually need from the AI SDK

The AI SDK provides three things the hub's architecture references:

1. **`UIMessage` format**: role + parts array for session messages
2. **`streamText()`/`generateText()`**: LLM calling with streaming, tool execution, and multi-step agent loops
3. **`tool()` + `operationToTool()`**: bridging the operations registry to AI SDK tool definitions

The proxy is already architecturally committed: `agent-sessions.md` describes `/v1/chat/completions` as a Hono HTTP endpoint. The question is whether we call OpenAI-compatible APIs through the AI SDK or directly through the `openai` npm package.

### What removing the AI SDK simplifies

After ADR-015 removed the opencode integration, the AI SDK's role narrowed significantly. The `ai-sdk-provider-opencode-sdk` package is gone. "Runner agents" now run in the dev spoke and call the hub's OpenAI proxy directly, so no AI SDK is involved on their side either.

The only place the AI SDK was used was for "direct agents" running in the hub process. These agents:

- Read messages from Postgres
- Convert operations to tools
- Call an LLM via `streamText()` (which handles multi-step tool execution internally)
- Persist the response parts back to Postgres

This is a bounded loop that the hub can implement directly, without the AI SDK's multi-provider abstraction, React hooks, or streaming protocol layers.

## Decision

Remove the Vercel AI SDK as a dependency. The hub will:

1. **Define its own `UIMessage` type** compatible with the AI SDK's format. ADR-016 already says the hub owns its schema; this extends that ownership to the TypeScript type. The type is a plain interface (role + parts array); there are no runtime dependencies.

2. **Use the `openai` npm package directly** for LLM calls. Zero runtime dependencies, well-maintained, auto-generated from OpenAPI spec, compatible with Deno via npm specifiers.

3. **Map operations to OpenAI tool calling format directly**, with no `tool()` adapter needed. The operations registry already stores JSON Schema (via TypeBox). Converting `IOperationDefinition.inputSchema` to OpenAI's `{ type: "function", function: { name, description, parameters } }` format is a JSON Schema transform with normalization.

4. **Implement hub-own streaming** for the proxy's SSE output. The proxy receives OpenAI SSE chunks and transforms them into the hub's stream format: a subset of the AI SDK's `UIMessageChunk` protocol that covers the part types the hub uses.

5. **Implement the agent execution loop directly**. The AI SDK's `streamText()` handles multi-step tool execution loops internally. The hub will implement this loop explicitly: call LLM → detect tool calls → execute tools via registry → feed results back → repeat until the LLM produces a final response with no tool calls.

### Architecture changes

**Before (AI SDK)**:

```
Direct Agent → streamText() → proxyProvider('anthropic/...') → Hub Proxy → Provider
Direct Agent → generateText() → proxyProvider('anthropic/...') → Hub Proxy → Provider
Direct Agent → tool() → operationToTool() → registry.execute()
Dev Spoke → HTTP POST → Hub Proxy → Provider
```

**After (No AI SDK)**:

```
Direct Agent → AgentLoop → openai SDK → Hub Proxy → Provider
                   ↕
       operationToOpenAITool() → registry.execute()
Dev Spoke → HTTP POST → Hub Proxy → Provider
```

Both paths go through the same proxy. The proxy adds the provider API key and forwards. The direct agent path uses the `openai` SDK pointed at `localhost` (the proxy). The dev spoke path makes HTTP requests to the proxy.

### Agent execution loop

The AI SDK's `streamText()` handles multi-step tool execution internally: detect tool calls → execute → feed results → re-prompt → repeat. Without it, the hub must implement this loop explicitly.

**The `AgentLoop`**:

```
┌──────────────────────────────────────────────────────────┐
│ 1. Load session messages from Postgres                   │
│ 2. Convert to OpenAI chat message format                 │
│ 3. Convert hub operations to OpenAI tool definitions     │
│ 4. Call LLM (via openai SDK, streaming)                  │
│ 5. Emit stream events to client (SSE)                    │
│ 6. Accumulate response                                   │
│ 7. If response contains tool_calls:                      │
│    a. Emit step-finish event                             │
│    b. For each tool_call:                                │
│       - Execute via registry.execute()                   │
│       - Emit tool-output-available event                 │
│    c. Append tool results to messages                    │
│    d. Emit step-start event                              │
│    e. Go to step 4                                       │
│ 8. If response has no tool_calls:                        │
│    a. Emit finish event (with usage data)                │
│    b. Persist messages and parts to Postgres             │
│    c. Done                                               │
└──────────────────────────────────────────────────────────┘
```

**Step boundaries**: Each LLM call within a single agent turn is a "step." Steps are bounded by `step-start` and `step-finish` SSE events so clients can distinguish between the LLM's initial response and subsequent responses after tool execution.

**Max steps**: Default 10 (configurable per session/role). Prevents infinite tool call loops. If the loop reaches the configured maximum while the LLM is still requesting tool calls, it terminates with a `finish` event containing `finishReason: "max-steps"`.

**Error handling**: If a tool execution fails, the loop reports the error to the LLM as a tool result with `errorText` and continues. The LLM can choose to retry, use a different tool, or explain the error to the user. If the LLM call itself fails (rate limit, network error), the hub retries with exponential backoff (max 3 retries for 429/5xx errors). Non-retryable errors (4xx except 429, context window exceeded) are emitted as `error` stream events and the loop terminates.

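The retry policy for failed LLM calls could be sketched roughly as follows. This is an illustration under stated assumptions, not the hub's implementation: `withRetry`, `statusOf`, the 500 ms base delay, and the injectable `sleep` parameter are hypothetical names and values chosen for the example. It also assumes that failures expose a numeric `status` field and that an error without one is a network failure. Only the policy itself (up to 3 retries on 429/5xx, no retry on other 4xx) comes from the text above.

```typescript
// Extract an HTTP status from an unknown error, if it carries one.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// 429 and 5xx are retryable. Errors without a status are treated as
// network failures and retried too (an assumption of this sketch).
function isRetryable(err: unknown): boolean {
  const status = statusOf(err);
  return status === undefined || status === 429 || status >= 500;
}

async function withRetry<T>(
  call: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-retryable errors or once the retry budget is spent;
      // the caller would then emit an `error` stream event.
      if (!isRetryable(err) || attempt >= maxRetries) throw err;
      // Delay doubles on every attempt: base, 2x base, 4x base.
      await sleep(baseDelayMs * Math.pow(2, attempt));
    }
  }
}
```

A production version would likely add jitter and honor a `Retry-After` header when the provider sends one; both are left out here to keep the sketch short.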

**Usage tracking**: The `stream_options: { include_usage: true }` parameter is sent with each LLM call. Each step's usage data (prompt tokens, completion tokens) is accumulated across all steps, and the total is included in the `finish` event. The hub's `clients` type `llm-provider` stores cost metadata; the session's `data` column records total usage per turn.

**Concurrent tool calls**: OpenAI responses can include multiple tool calls in a single response. The hub executes all tool calls in a step concurrently (via `Promise.all`), collects results, then continues the loop. All tool results are appended to messages before the next LLM call.

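To make the loop concrete, here is a minimal sketch of the control flow described in this section: step boundaries, max steps, per-step usage accumulation, and concurrent tool execution with errors reported back as tool results. Everything in it is an assumption made for illustration: the `LoopDeps` interface, the simplified `ChatMessage` and `HubEvent` shapes, and the `runAgentLoop` name are hypothetical, and the LLM call and registry are injected as plain functions so the sketch stays self-contained. Streaming and persistence are omitted.

```typescript
type ToolCall = { id: string; name: string; args: unknown };
type Usage = { promptTokens: number; completionTokens: number };
type StepResult = { text: string; toolCalls: ToolCall[]; usage: Usage };
type ChatMessage =
  | { role: "system" | "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };
type HubEvent =
  | { type: "step-start" }
  | { type: "step-finish" }
  | { type: "tool-output-available"; toolCallId: string; output: unknown }
  | { type: "tool-output-error"; toolCallId: string; errorText: string }
  | { type: "finish"; finishReason: "stop" | "max-steps"; usage: Usage };

interface LoopDeps {
  callModel(messages: ChatMessage[]): Promise<StepResult>; // one LLM call = one step
  executeTool(name: string, args: unknown): Promise<unknown>; // stands in for registry.execute()
  emit(event: HubEvent): void;
  maxSteps?: number;
}

async function runAgentLoop(messages: ChatMessage[], deps: LoopDeps): Promise<ChatMessage[]> {
  const maxSteps = deps.maxSteps ?? 10;
  const usage: Usage = { promptTokens: 0, completionTokens: 0 };
  const history: ChatMessage[] = [...messages];

  for (let step = 1; step <= maxSteps; step++) {
    const result = await deps.callModel(history);
    // Usage is summed over every step of the turn.
    usage.promptTokens += result.usage.promptTokens;
    usage.completionTokens += result.usage.completionTokens;
    history.push({ role: "assistant", content: result.text, toolCalls: result.toolCalls });

    if (result.toolCalls.length === 0) {
      deps.emit({ type: "finish", finishReason: "stop", usage });
      return history;
    }

    deps.emit({ type: "step-finish" });
    // Each call catches its own failure, so Promise.all never rejects and a
    // failed tool becomes a tool result the LLM can react to.
    const toolMessages = await Promise.all(
      result.toolCalls.map(async (call): Promise<ChatMessage> => {
        try {
          const output = await deps.executeTool(call.name, call.args);
          deps.emit({ type: "tool-output-available", toolCallId: call.id, output });
          return { role: "tool", toolCallId: call.id, content: JSON.stringify(output) };
        } catch (err) {
          const errorText = err instanceof Error ? err.message : String(err);
          deps.emit({ type: "tool-output-error", toolCallId: call.id, errorText });
          return { role: "tool", toolCallId: call.id, content: JSON.stringify({ errorText }) };
        }
      }),
    );
    history.push(...toolMessages);
    if (step < maxSteps) deps.emit({ type: "step-start" });
  }

  // The step budget ran out while the LLM was still requesting tools.
  deps.emit({ type: "finish", finishReason: "max-steps", usage });
  return history;
}
```

Wrapping each tool call in its own `try`/`catch` is one way to reconcile `Promise.all` with the error-handling rule above; `Promise.allSettled` would be an equally reasonable choice.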

### `UIMessage` type ownership

ADR-016 already established that the hub owns its schema. We now also own the TypeScript type definition:

```ts
// src/inference/types.ts

/** Tool call lifecycle states. */
type ToolCallState =
  | "streaming" // arguments are being streamed (tool-input-delta events)
  | "call"      // arguments complete, awaiting execution
  | "result"    // tool executed successfully, output available
  | "error";    // tool execution failed, errorText available

/** Compatible with AI SDK UIMessage but owned by the hub. */
type UIPart =
  | { type: "text"; text: string; state?: "streaming" | "done" }
  | { type: "reasoning"; text: string; state?: "streaming" | "done" }
  | { type: "tool"; toolCallId: string; toolName: string; state: ToolCallState; input?: unknown; output?: unknown; errorText?: string }
  | { type: "file"; mediaType: string; url: string; filename?: string }
  | { type: "source-url"; sourceId: string; url: string; title?: string }
  | { type: "step-start" }
  | { type: "data"; id?: string; data: unknown; transient?: boolean };

type UIMessage = {
  id: string;
  role: "system" | "user" | "assistant";
  parts: UIPart[];
  metadata?: {
    model?: string;
    provider?: string;
    tokens?: { prompt: number; completion: number; total: number };
    cost?: number;
    finishReason?: string;
    [key: string]: unknown;
  };
};
```

This is a **starting subset** of the AI SDK's part types (which include `source-document`, `dynamic-tool`, `approval-requested`, etc.). We add types as the hub needs them. Import compatibility with opencode sessions remains possible through a mapping layer.

**Note on `metadata`**: The `metadata` field is typed as a structured object (not `unknown`) because the hub always populates it with model, provider, usage, and finish reason data from the LLM response. The `[key: string]: unknown` index signature allows extensibility without losing type safety for the known fields.

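Step 2 of the agent loop ("Convert to OpenAI chat message format") is where these types meet the wire format. The sketch below shows one plausible conversion, assuming a trimmed copy of the part types above and a hand-written `OpenAIMessage` subset. The function names (`toOpenAIMessages`, `assistantStep`) and the choice to split an assistant message at each `step-start` part are assumptions for illustration, not decisions made by this ADR.

```typescript
type ToolCallState = "streaming" | "call" | "result" | "error";
type ToolPart = { type: "tool"; toolCallId: string; toolName: string; state: ToolCallState; input?: unknown; output?: unknown; errorText?: string };
type UIPart = { type: "text"; text: string } | ToolPart | { type: "step-start" };
type UIMessage = { id: string; role: "system" | "user" | "assistant"; parts: UIPart[] };

// Hand-written subset of the OpenAI chat message shapes used by this sketch.
type OpenAIMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: { id: string; type: "function"; function: { name: string; arguments: string } }[] }
  | { role: "tool"; tool_call_id: string; content: string };

// One step = one assistant message, followed by one tool message per call.
function assistantStep(text: string, calls: ToolPart[]): OpenAIMessage[] {
  if (text === "" && calls.length === 0) return [];
  const toolCalls = calls.map((c) => ({
    id: c.toolCallId,
    type: "function" as const,
    function: { name: c.toolName, arguments: JSON.stringify(c.input ?? {}) },
  }));
  const out: OpenAIMessage[] = [
    calls.length > 0
      ? { role: "assistant", content: text || null, tool_calls: toolCalls }
      : { role: "assistant", content: text },
  ];
  for (const c of calls) {
    const content = c.state === "error"
      ? JSON.stringify({ errorText: c.errorText ?? "unknown error" })
      : JSON.stringify(c.output ?? null);
    out.push({ role: "tool", tool_call_id: c.toolCallId, content });
  }
  return out;
}

function toOpenAIMessages(messages: UIMessage[]): OpenAIMessage[] {
  const out: OpenAIMessage[] = [];
  for (const msg of messages) {
    const role = msg.role;
    let text = "";
    let calls: ToolPart[] = [];
    for (const part of msg.parts) {
      if (part.type === "text") {
        text += part.text;
      } else if (part.type === "tool") {
        // Unfinished calls ("streaming"/"call") have no result to replay, so skip them.
        if (part.state === "result" || part.state === "error") calls.push(part);
      } else if (role === "assistant") {
        // `step-start` closes the previous step of a multi-step assistant turn.
        out.push(...assistantStep(text, calls));
        text = "";
        calls = [];
      }
    }
    if (role === "assistant") out.push(...assistantStep(text, calls));
    else out.push({ role, content: text });
  }
  return out;
}
```

Parts such as `reasoning`, `file`, and `data` are left out of the trimmed types; how they should map to chat messages is not specified here.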

### Operation → OpenAI tool mapping

```ts
import type OpenAI from "openai";

function operationToOpenAITool(spec: IOperationDefinition): OpenAI.Chat.Completions.ChatCompletionTool {
  const schema = normalizeSchemaForOpenAI(spec.inputSchema);
  return {
    type: "function",
    function: {
      // OpenAI function names may only contain a-z, A-Z, 0-9, "_" and "-"
      // (max 64 chars), so "." cannot be the separator. "__" is one possible
      // choice; it has to be reversed when mapping a tool call back to the registry.
      name: `${spec.namespace}__${spec.name}`,
      description: spec.description,
      parameters: schema,
      strict: true, // enable structured outputs when the operation schema supports it
    },
  };
}

/**
 * TypeBox produces JSON Schema, but OpenAI function calling has specific requirements:
 * - Top-level must be object type with properties
 * - additionalProperties: false at top level (required for strict mode)
 * - nested $ref needs resolution (TypeBox typically produces inline schemas)
 * - patternProperties, oneOf/anyOf with complex merging may not translate
 * This function normalizes TypeBox output for OpenAI compatibility.
 */
function normalizeSchemaForOpenAI(schema: Record<string, unknown>): Record<string, unknown> {
  // ~30-50 lines of normalization:
  // 1. Ensure top-level type: "object"
  // 2. Set additionalProperties: false for strict mode
  // 3. Strip unsupported keywords (patternProperties, etc.)
  // 4. Resolve $ref if present (unusual for TypeBox, but defensive)
  // ...
  throw new Error("normalizeSchemaForOpenAI: not implemented in this ADR sketch");
}
```

No adapter layer, no `tool()` wrapper, no AI SDK dependency. The operations registry already stores JSON Schema via TypeBox. The normalization step is necessary because OpenAI's function calling API has stricter JSON Schema requirements than TypeBox's default output.

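As a rough idea of what the normalization could look like, the sketch below covers three of the four listed steps: it checks for a top-level object schema, sets `additionalProperties: false`, and strips unsupported keywords. It makes several assumptions. The `STRIPPED_KEYWORDS` list contains only `patternProperties` because that is the only keyword named above. `additionalProperties: false` is applied to nested objects as well, since strict mode expects it on every object. `$ref` resolution is not attempted, and neither is the strict-mode expectation that every property appears in `required`. The helper names are hypothetical.

```typescript
type JsonSchema = Record<string, unknown>;

// Keywords dropped before sending the schema. Only `patternProperties` is
// named in the ADR; extend the list as provider behavior requires.
const STRIPPED_KEYWORDS = ["patternProperties"];

function isSchema(value: unknown): value is JsonSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a normalized copy of one schema node; the input is not mutated.
function normalizeNode(node: JsonSchema): JsonSchema {
  const out: JsonSchema = {};
  for (const key of Object.keys(node)) {
    if (STRIPPED_KEYWORDS.indexOf(key) === -1) out[key] = node[key];
  }

  // Recurse into property schemas (not into the `properties` map itself, so a
  // property that happens to be named like a keyword is left alone).
  const properties = out.properties;
  if (isSchema(properties)) {
    const normalized: JsonSchema = {};
    for (const name of Object.keys(properties)) {
      const child = properties[name];
      normalized[name] = isSchema(child) ? normalizeNode(child) : child;
    }
    out.properties = normalized;
  }

  // Recurse into array item schemas.
  const items = out.items;
  if (isSchema(items)) out.items = normalizeNode(items);

  if (out.type === "object") out.additionalProperties = false;
  return out;
}

function normalizeSchemaForOpenAI(schema: JsonSchema): JsonSchema {
  if (schema.type !== "object") {
    throw new Error("tool parameters must be a top-level object schema");
  }
  const out = normalizeNode(schema);
  if (!isSchema(out.properties)) out.properties = {};
  return out;
}
```

Rejecting non-object roots with an error, rather than wrapping them, keeps the mapping predictable: an operation whose input is not an object would need an explicit wrapper decided at the registry level.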

### Streaming format for the proxy

The hub's proxy emits SSE events using a **subset** of the AI SDK's `UIMessageChunk` protocol. We emit only the chunk types we need:

**Content events**:

- `text-start`, `text-delta`, `text-end`: text content
- `reasoning-start`, `reasoning-delta`, `reasoning-end`: reasoning content

**Tool call lifecycle events**:

- `tool-input-start`: the LLM is calling a tool (includes `toolCallId`, `toolName`)
- `tool-input-delta`: streaming tool arguments (JSON fragments)
- `tool-input-available`: complete tool arguments received (parsed JSON)
- `tool-output-available`: tool execution result (emitted after `registry.execute()`)
- `tool-output-error`: tool execution error

**Step and message boundary events**:

- `start`: message begins (includes optional `messageId`)
- `step-start`: new step begins (after tool results are fed back)
- `step-finish`: step ends (after LLM response, before tool execution)
- `finish`: message complete (includes `finishReason`, `usage` tokens, `metadata`)

**Error events**:

- `error`: stream error (includes `errorText`)

**Two streaming paths produce the same output format**:

1. **Proxy path** (dev spoke or external client → Hono HTTP endpoint → provider): The proxy receives OpenAI SSE chunks and transforms them into hub chunk format. This is the SSE handler in the proxy.

2. **Direct agent path** (hub process → `openai` SDK → proxy → provider): The `AgentLoop` consumes the `openai` SDK's streaming response and emits the same hub chunk format. The internal format is the same; only the input source differs.

Both paths emit the same SSE format to clients. The direct agent path has the additional responsibility of tool execution and loop management, but the streaming event vocabulary is identical.

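The shared transformation can be illustrated with a small sketch that maps OpenAI text deltas to hub chunk events and serializes them as SSE. It is deliberately simplified and rests on assumptions: it works on an array rather than an async stream, it handles text only (no reasoning or tool events), and the hub chunk field names (`id`, `delta`, `finishReason`, `usage`) as well as the `toHubChunks` and `toSSE` helpers are hypothetical, since the exact wire format is still open (OQ-63).

```typescript
// Minimal subset of an OpenAI `chat.completion.chunk` used by this sketch.
type OpenAIChunk = {
  choices: { delta: { content?: string | null }; finish_reason?: string | null }[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number } | null;
};

type HubUsage = { promptTokens: number; completionTokens: number; totalTokens: number };

type HubChunk =
  | { type: "start" }
  | { type: "text-start"; id: string }
  | { type: "text-delta"; id: string; delta: string }
  | { type: "text-end"; id: string }
  | { type: "finish"; finishReason: string; usage?: HubUsage };

function toHubChunks(chunks: OpenAIChunk[], textId = "text-0"): HubChunk[] {
  const out: HubChunk[] = [{ type: "start" }];
  let textOpen = false;
  let finishReason = "stop";
  let usage: HubUsage | undefined;

  for (const chunk of chunks) {
    if (chunk.usage) {
      usage = {
        promptTokens: chunk.usage.prompt_tokens,
        completionTokens: chunk.usage.completion_tokens,
        totalTokens: chunk.usage.total_tokens,
      };
    }
    const choice = chunk.choices[0];
    if (!choice) continue; // with include_usage, the usage chunk has no choices

    const text = choice.delta.content;
    if (text) {
      if (!textOpen) {
        textOpen = true;
        out.push({ type: "text-start", id: textId });
      }
      out.push({ type: "text-delta", id: textId, delta: text });
    }
    // OpenAI reports "tool_calls"; the hub vocabulary uses "tool-calls".
    if (choice.finish_reason) finishReason = choice.finish_reason.replace("_", "-");
  }

  if (textOpen) out.push({ type: "text-end", id: textId });
  out.push({ type: "finish", finishReason, usage });
  return out;
}

// One SSE frame per chunk: a `data:` line followed by a blank line.
function toSSE(chunk: HubChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}
```

In the hub this logic would presumably be written once as a streaming transformer and fed from either the proxy's upstream SSE response or the SDK's chunk iterator, which is what keeps the two paths identical on the wire.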

**Tool argument accumulation**: On the proxy path, the hub forwards `tool-input-delta` events and the client is responsible for accumulating the JSON fragments into complete tool arguments. On the direct agent path, the `AgentLoop` accumulates them: `client.chat.completions.create({ stream: true })` yields raw chunks whose `delta.tool_calls` entries carry argument fragments keyed by `index`, so the loop concatenates the fragments per tool call (the SDK's higher-level streaming helpers can also do this). The `tool-input-available` event contains the complete parsed JSON input.

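A minimal sketch of that accumulation, assuming the delta shape OpenAI-compatible streams use for `choices[0].delta.tool_calls[]` entries (an `index`, an `id` and `function.name` on the first fragment, and `function.arguments` pieces afterwards). The `accumulate` and `finalize` helpers and the `AccumulatedCall` type are hypothetical names for illustration.

```typescript
// Shape of one `delta.tool_calls[]` entry in a streaming chunk.
type ToolCallDelta = {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};

type AccumulatedCall = { id: string; name: string; argsJson: string };

// Merge one delta into the per-index accumulator (mutates `calls`).
function accumulate(calls: AccumulatedCall[], delta: ToolCallDelta): void {
  let call = calls[delta.index];
  if (call === undefined) {
    call = { id: "", name: "", argsJson: "" };
    calls[delta.index] = call;
  }
  if (delta.id) call.id = delta.id;
  if (delta.function?.name) call.name = delta.function.name;
  // Argument fragments are arbitrary slices of a JSON string; only the
  // concatenation of all fragments is guaranteed to parse.
  if (delta.function?.arguments) call.argsJson += delta.function.arguments;
}

// Once the step ends, parse each call into a `tool-input-available` payload.
function finalize(calls: AccumulatedCall[]): { type: "tool-input-available"; toolCallId: string; toolName: string; input: unknown }[] {
  return calls.map((c) => ({
    type: "tool-input-available" as const,
    toolCallId: c.id,
    toolName: c.name,
    input: JSON.parse(c.argsJson || "{}") as unknown,
  }));
}
```

Fragments for different tool calls can interleave, which is why the accumulator is keyed by `index` rather than by arrival order. `JSON.parse` throws on malformed arguments; a real loop would probably surface that as a `tool-output-error` rather than let it escape.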

**Finish event includes usage data**: The `finish` event includes `usage` with `{ promptTokens, completionTokens, totalTokens }` and `finishReason` (`"stop"`, `"tool-calls"`, `"length"`, `"max-steps"`, `"error"`).

### Dependencies removed

| Package | Version | Notes |
|---------|---------|-------|
| `ai` | (was planned) | Core AI SDK: streaming, tool calling, UIMessage |
| `@ai-sdk/openai-compatible` | (was planned) | Provider for OpenAI-compatible APIs |
| `@ai-sdk/provider` | (transitive) | Provider interface |
| `@ai-sdk/provider-utils` | (transitive) | Provider utilities |
| `zod` | (peer dep) | No longer needed as AI SDK peer dep; we use TypeBox |

### Dependencies added

| Package | Version | Purpose |
|---------|---------|---------|
| `openai` | Pinned in deno.json | Direct OpenAI API client, zero runtime deps |

Per project convention (AGENTS.md: "Pin dependency versions in deno.json — update manually when needed"), the `openai` package will be pinned to a specific version.

### Documents requiring update

| Document | Change | Status |
|----------|--------|--------|
| `AGENTS.md` | Remove AI SDK from External Dependencies and Constraints. Add `openai` with pinned version. Update `src/inference/` description. | ✅ Done |
| `docs/architecture/agent-sessions.md` | Remove `streamText`/`generateText`/`proxyProvider`/`operationToTool` references. Replace with `AgentLoop` using `openai` SDK and `operationToOpenAITool` mapping. Update session data shapes. | Pending |
| `docs/architecture/open-questions.md` | Add OQ-63, OQ-64, OQ-65. Add Theme 11. Add ADR-018 to resolved table. Add inference chain to cross-cutting dependencies. | ✅ Done |
| `docs/architecture/packages.md` | Replace "Agent sessions (AI SDK)" with "Agent sessions (openai SDK + AgentLoop)" or similar. | Pending |

## Consequences

### Positive

1. **Reduced supply chain attack surface**: Zero transitive dependencies from the LLM calling path. The `openai` package has zero runtime dependencies and is auto-generated from OpenAPI spec.
2. **No AI SDK release cadence coupling**: We update the `openai` package on our schedule, not at 2-5 releases/day.
3. **Reduced bundle size**: The AI SDK core (`ai`) is ~50 kB minified, `@ai-sdk/provider` adds ~19.5 kB, plus `@ai-sdk/provider-utils` and transitive deps. The `openai` package is ~129.5 kB but has zero transitive deps, so the total install footprint is significantly smaller than `ai` plus its dependency tree. More importantly, the hub's own streaming code (~250 LOC for the SSE transformer plus ~300 LOC for the `AgentLoop`) is a fraction of the AI SDK's ~2700 lines of `streamText()` alone, and we only ship what we use.
4. **Hub-own streaming protocol**: We define and evolve the SSE chunk types we need without waiting for AI SDK releases. New part types or chunk types can be added immediately.
5. **Simpler code paths**: No `proxyProvider()` factory, no `operationToTool()` adapter, no `LanguageModelV3` interface implementation. Direct `openai` SDK calls + JSON Schema tool definitions + explicit `AgentLoop`.
6. **Consistent with existing patterns**: The operations registry already uses TypeBox → JSON Schema. Mapping operations to OpenAI tool format is a JSON Schema transform, not an adapter to a third-party type system.
7. **Consistent with ADR-015 and ADR-016**: We've removed opencode's influence on the hub's data model. Removing the AI SDK continues this pattern: the hub owns its types, its streaming protocol, and its tool calling format.
8. **Explicit agent loop**: The `AgentLoop` is hub code that we can debug, extend, and add observability to. Multi-step tool execution, max steps, error recovery, and usage tracking are all visible and modifiable. The AI SDK's `streamText()` hides this loop inside ~2700 lines of framework code.

### Negative

1. **More code to maintain**: The `AgentLoop`, streaming state machine, and tool execution orchestration are additional hub code. However, this code is bounded (~1100 lines total, see Implementation scope), well-understood (LLM → tool call → execute → feed result → repeat), and has clear input/output formats. The AI SDK's equivalent is ~2700 lines of `streamText()` + the provider abstraction + the tool framework.
2. **No multi-provider abstraction**: The AI SDK lets you swap providers with one line (`anthropic(...)` → `openai(...)`). With the `openai` SDK, we're locked to OpenAI-compatible APIs. But the hub's proxy already abstracts this: all LLM calls go through `/v1/chat/completions`, and the proxy routes to providers. Adding a new provider means adding a route in the proxy, not swapping AI SDK providers. For providers that don't support OpenAI-compatible APIs (e.g., Anthropic native), the proxy translates the format.
3. **No AI SDK React hooks**: We can't use `useChat` or `useCompletion` on the frontend. The hub doesn't have a React frontend; it has an API server. Frontend concerns are out of scope.
4. **Tool calling type safety**: The AI SDK's `tool()` function provides Zod-based type safety for tool input/output. We lose that. But our operations registry already provides TypeBox-based type safety: we're mapping TypeBox schemas to OpenAI's `parameters` field, which is JSON Schema (which TypeBox produces natively).

### Implementation scope

| Component | Estimated effort | Notes |
|-----------|-----------------|-------|
| `UIMessage` + `UIPart` + `ToolCallState` type definitions | Small (~60 lines) | Plain TypeScript interfaces |
| `operationToOpenAITool()` + schema normalization | Small-Medium (~80 lines) | JSON Schema normalization for OpenAI strict mode (~30-50 lines) + mapping |
| OpenAI proxy SSE handler (Hono) | Medium (~250 lines) | Transform OpenAI SSE → hub chunk format, includes step boundary events |
| `AgentLoop`: multi-step tool execution loop | Medium (~300 lines) | Step management, tool call detection, tool execution via registry, result feeding, max steps, usage accumulation |
| Direct agent stream consumer | Small (~80 lines) | Consume `openai` SDK streaming response, emit hub chunk events |
| Part persistence from stream | Medium-Large (~250 lines) | Map stream chunks to `parts` table inserts/updates, buffered write strategy (flush on `*-end` events), state transitions |
| Proxy key routing | Small (~50 lines) | Resolve `clients` + `client_secrets` for provider keys |
| Error handling + retry logic | Small-Medium (~80 lines) | Exponential backoff for 429/5xx, non-retryable error mapping |

**Total: ~1100 lines** of focused, well-bounded code with clear input/output formats.

The `AgentLoop` is the most significant component. Its contract is simple:

- **Input**: messages + tool definitions + model config
- **Output**: SSE stream of hub chunk events + final UIMessage + usage data
- **Loop**: call → accumulate → detect tools → execute → feed → repeat

The AI SDK's `streamText()` handles this loop in ~2700 lines (including provider abstraction, middleware hooks, multi-model smoothing, and edge cases we don't need). Our `AgentLoop` handles exactly our use case in ~300 lines.

### Open questions affected

| OQ | Impact |
|----|--------|
| OQ-16 | **Simplified**: ADR-016 resolved this (the hub owns its schema). This ADR extends that to TypeScript types. The hub defines `UIMessage`, `UIPart`, and `ToolCallState` types. |
| Agent sessions architecture (`agent-sessions.md`) | **Needs update**: Remove `streamText`/`generateText`/`proxyProvider`/`operationToTool` references. Replace with `AgentLoop` using `openai` SDK and `operationToOpenAITool` mapping. Document the two streaming paths producing the same output format. |
| `AGENTS.md` Constraints and Dependencies | **Needs update**: Remove AI SDK from dependencies and constraints. Add `openai` package with pinned version. Update `src/inference/` description. |

### Open questions created

| ID | Question | Priority |
|----|----------|----------|
| OQ-63 | What is the exact subset of `UIMessageChunk` types the hub proxy emits? (This ADR lists the initial subset, but extensions will happen as features are added.) | medium |
| OQ-64 | Should the direct agent use the `openai` SDK's streaming API or raw HTTP for more control? The `openai` SDK provides a convenient typed interface, but raw HTTP gives more control over SSE parsing for the proxy path. | low |
| OQ-65 | What is the buffered write strategy for part persistence? Options: flush on `*-end` events (per-part commits), flush on `step-finish` (per-step commits), or flush on `finish` (per-message commits). Per-step balances latency and write volume. | medium |

## References

- [ADR-015: Dev spoke instead of opencode integration](ADR-015-dev-spoke-not-opencode.md): removed opencode dependency
- [ADR-016: Hub-own schema](ADR-016-hub-own-schema.md): hub owns session/message/part schema
- [AI SDK supply chain risk assessment](../research/ai-sdk-supply-chain-risk.md): detailed analysis of AI SDK risks
- [agent-sessions.md](../architecture/agent-sessions.md): current session architecture (references AI SDK)
- [OpenAI Node SDK](https://github.com/openai/openai-node): zero-dependency, auto-generated from OpenAPI spec