LLM Orchestration

Standard approach

If a package needs LLM calls, prefer the repository orchestration layer over provider-specific direct calls.

Preferred functions:

resolveConfiguredLlmRuntime()
runResolvedSkillAwareDeterministicLlmTask()

Skill delivery to the LLM

Callers deliver skills to the LLM via the skillIds parameter, never by putting skill content into the system prompt themselves. personalSkillContent is deprecated and must not be used in new code.

The orchestration wrapper (runSkillAwareDeterministicLlmTask / runResolvedSkillAwareDeterministicLlmTask) auto-selects the delivery method per provider:

OpenAI: skills are delivered as the shell tool where possible, built internally by buildSkillTools(). The LLM reads SKILL.md from the on-disk sourcePath recorded by upsertSkill. The owner-mandated rule (Phase 298.16-P2, 2026-05-15) is that the chat/widget paths MUST resolve every skill to a catalog entry with sourcePath; this is enforced upstream by ensureChatSkillRegistered and the per-widget self-heals. When buildSkillTools is called with skill IDs and NONE of them resolve with sourcePath (e.g. the /api/llm-bridge agent path with GitHub-installed or user-scoped skills that resolve to null under the model actor’s visibility filter), it falls back to read_skill so the LLM can still invoke the skill catalog primitive. A console warning is emitted so operators can see the incomplete resolution.
Anthropic: skills are delivered as the shell tool when shell is supported, or as read_skill when the Anthropic adapter runs in native-MCP mode. In that mode the adapter strips the shell tool and injects read_skill directly; buildSkillTools() is NOT involved in that carve-out.
Gemini: skill content is read directly via readSkillContent() and inlined into the system prompt. This avoids the extra round-trip where Gemini has to call a function tool to read the skill.

Consumers pass skillIds to the wrapper — the delivery method is chosen automatically. Do not call buildSkillTools or readSkillContent directly — they are internal to the orchestration layer.
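
The per-provider selection described above can be pictured as a small decision function. This is a minimal sketch and not the wrapper's actual code: selectSkillDelivery, SkillDelivery, and the context field names are hypothetical, and only the three outcomes (shell, read_skill, inlined content) come from the text. It also simplifies the Anthropic case by ignoring sourcePath resolution.

```typescript
// Hypothetical sketch of the delivery decision; every name here is illustrative.
type Provider = "openai" | "anthropic" | "gemini";
type SkillDelivery = "shell" | "read_skill" | "inline-system-prompt";

interface DeliveryContext {
  // True when at least one requested skill resolves to a catalog entry with sourcePath.
  anySkillHasSourcePath: boolean;
  // True when the Anthropic adapter runs in native-MCP mode.
  anthropicNativeMcp: boolean;
}

function selectSkillDelivery(provider: Provider, ctx: DeliveryContext): SkillDelivery {
  switch (provider) {
    case "openai":
      // Shell is preferred; read_skill is the fallback when nothing resolves on disk.
      return ctx.anySkillHasSourcePath ? "shell" : "read_skill";
    case "anthropic":
      // Native-MCP mode strips the shell tool and injects read_skill instead.
      return ctx.anthropicNativeMcp ? "read_skill" : "shell";
    case "gemini":
      // Skill content is inlined by the wrapper, which avoids a tool round-trip.
      return "inline-system-prompt";
  }
}
```

Callers never make this decision themselves; the sketch only illustrates why passing skillIds is sufficient.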

When skills are delivered as the shell tool (OpenAI path), buildSkillTools() builds:

A type: "shell" tool with local file paths for every skill whose catalog record has a sourcePath on disk. The shell tool uses cat/head/tail executed locally via readSkillFileContent — no Docker required.

read_skill is the fallback tool for: (a) the Anthropic native-MCP adapter, (b) the /api/llm-bridge shell-incompatible path, and (c) buildSkillTools when no skill resolves with sourcePath. New chat/widget code paths should ensure their skills have sourcePath via registerPackageSystemSkill.

The shell tool declaration in the API request includes the skill directory path:

{
  "type": "shell",
  "environment": {
    "type": "local",
    "skills": [{ "name": "agent-scrape", "description": "...", "path": "/abs/path/to/skill/dir" }]
  }
}

This path is present in the request only when sourcePath is set on the skill record. Skills persisted via upsertSkill always have sourcePath set.
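
To make the sourcePath rule concrete, the following sketch builds the shell declaration shown above from a list of skill records and falls back to read_skill when none of them has a sourcePath. It is an illustration under stated assumptions, not the real buildSkillTools(): SkillRecord, buildSkillToolsSketch, and the shape of the read_skill placeholder are hypothetical, since this document does not show the real read_skill schema.

```typescript
// Hypothetical shapes; only the shell declaration layout mirrors the request shown above.
interface SkillRecord {
  name: string;
  description: string;
  sourcePath?: string; // absolute skill directory on disk, when known
}

interface ShellTool {
  type: "shell";
  environment: {
    type: "local";
    skills: { name: string; description: string; path: string }[];
  };
}

// Assumed placeholder for the read_skill fallback tool.
interface ReadSkillTool {
  type: "function";
  name: "read_skill";
}

function buildSkillToolsSketch(skills: SkillRecord[]): ShellTool | ReadSkillTool | undefined {
  if (skills.length === 0) return undefined;

  // Keep only the skills that can actually be read from disk.
  const onDisk: ShellTool["environment"]["skills"] = [];
  for (const skill of skills) {
    if (skill.sourcePath) {
      onDisk.push({ name: skill.name, description: skill.description, path: skill.sourcePath });
    }
  }

  if (onDisk.length === 0) {
    // None of the requested skills resolved with sourcePath: warn and fall back.
    console.warn(`No sourcePath for ${skills.length} skill(s); falling back to read_skill`);
    return { type: "function", name: "read_skill" };
  }

  return { type: "shell", environment: { type: "local", skills: onDisk } };
}
```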

Docker-based shell

Passing includeShell: true to buildSkillTools() uses the Docker executor instead of the local file reader. This is only needed for write-capable shell tasks; regular skill reading does not require Docker.

Execution-time skill usage

For LLM-enabled package execution:

Resolve the instance skill ID — call the skill generation function at instance creation time, or use the lazy-migration helper (resolveInstanceSkillId) for old instances without a stored ID.
Resolve the configured runtime.
Pass skillIds: instanceSkillId ? [instanceSkillId] : undefined to runResolvedSkillAwareDeterministicLlmTask.
Use explicit log labels for observability.

Do not pass personalSkillContent. Do not pass useLiveTooling — the shell tool is now included automatically.
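
The argument assembly in the steps above can be sketched as a small pure helper. buildSkillAwareTaskArgs and SkillAwareTaskArgs are hypothetical names introduced for illustration; the field names (runtime, skillIds, system, user, logLabel) are the ones used with runResolvedSkillAwareDeterministicLlmTask elsewhere in this document, and the runtime type is left generic because its real shape is not shown here.

```typescript
// Hypothetical argument shape covering only the fields discussed in the steps above.
interface SkillAwareTaskArgs<R> {
  runtime: R;
  skillIds: string[] | undefined;
  system: string;
  user: string;
  logLabel: string;
}

function buildSkillAwareTaskArgs<R>(
  runtime: R,
  instanceSkillId: string | null | undefined,
  prompt: { system: string; user: string },
  logLabel: string,
): SkillAwareTaskArgs<R> {
  return {
    runtime,
    // Pass a one-element list when an instance skill exists, otherwise omit skills.
    skillIds: instanceSkillId ? [instanceSkillId] : undefined,
    system: prompt.system,
    user: prompt.user,
    // An explicit label keeps the run identifiable in logs.
    logLabel,
    // Intentionally absent: personalSkillContent and useLiveTooling.
  };
}
```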

`extraTools` — additional tools through the wrapper

When a task needs tools beyond skill tools (e.g. createWebSearchTool()), pass them via extraTools. The wrapper merges them into the final tools array:

const llmResponse = await runResolvedSkillAwareDeterministicLlmTask({
  runtime: llmRuntime,
  skillIds: ["@cinatra/example-skill:extract-data"],
  extraTools: [createWebSearchTool()],
  system: "Extract structured data from the web...",
  user: JSON.stringify({ url, instructions }),
  maxSteps: 15,
  maxOutputTokens: 4000,
  outputSchema: extractionSchema,
  signal,
  logLabel: "extract-websearch",
});

Do not build skill tools manually and merge them with extra tools — use extraTools instead.
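
For intuition only, the merge the wrapper performs might look like the sketch below. The function name, the ToolLike type, the ordering (skill tools first), and returning undefined for an empty list are all assumptions; the document only states that extraTools end up in the final tools array.

```typescript
// Hypothetical minimal tool shape for illustration.
interface ToolLike {
  type: string;
  name?: string;
}

// Assumed merge: skill tools first, then caller-supplied extras, in their original order.
function mergeToolsSketch(skillTools: ToolLike[], extraTools: ToolLike[] = []): ToolLike[] | undefined {
  const merged = [...skillTools, ...extraTools];
  // Assumption: an empty list is reported as undefined so no tools field is sent.
  return merged.length > 0 ? merged : undefined;
}
```

Because the wrapper owns this step, call sites only need to supply extraTools.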

Typical package mapping

Scrape-like packages

fetch and parse: deterministic
page discovery: LLM via orchestration with skillIds
extraction from fetched content: LLM via orchestration with skillIds
graceful fallback to deterministic extracted data when appropriate
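
The graceful fallback in the last item can be sketched as a generic helper that prefers the LLM result when it is usable and otherwise returns the deterministic extraction. This is one possible shape, not code from the repository: withDeterministicFallback, Resolved, and the isUsable predicate are hypothetical.

```typescript
// Hypothetical result wrapper; the source label records which path produced the value.
interface Resolved<T> {
  value: T;
  source: "llm" | "deterministic";
}

function withDeterministicFallback<T>(
  llmResult: T | null | undefined,
  deterministic: T,
  isUsable: (candidate: T) => boolean,
): Resolved<T> {
  // Use the LLM output only when it exists and passes the caller's usability check.
  if (llmResult !== null && llmResult !== undefined && isUsable(llmResult)) {
    return { value: llmResult, source: "llm" };
  }
  // Otherwise keep the data already extracted by the deterministic fetch-and-parse step.
  return { value: deterministic, source: "deterministic" };
}
```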

Research-like packages

validation and web checks: deterministic
plan generation: LLM via orchestration with skillIds
per-item research: LLM via orchestration with skillIds
validation outputs must be included in later LLM context

Enrichment-like packages

structured service lookups: deterministic
no LLM unless the package explicitly adds an LLM-driven enrichment mode

Native MCP server tool (LLM-to-MCP connection)

buildLlmMcpServerTool(provider) in packages/llm-orchestration/src/mcp-access.ts builds an LlmMcpServerTool that lets an LLM provider connect directly to the Cinatra MCP server.

Why it exchanges credentials for a Bearer token

LLM providers (OpenAI, Gemini) call the MCP server over the configured public base URL. The MCP server validates requests with verifyMcpAccessToken, which requires a JWT Bearer token — not raw client credentials. buildLlmMcpServerTool therefore:

Reads the stored clientId / clientSecret for the provider (from getLlmMcpCredentials)
Exchanges them for a short-lived JWT via POST /api/auth/oauth2/token (called on the local URL, not the public one)
Passes the JWT as Authorization: Bearer <token> in the MCP tool headers

The `resource` parameter is mandatory

The token request must include resource: getLocalMcpServerUrl("/api/mcp") (RFC 8707). Without it, Better Auth issues an opaque token, which cannot be verified against the JWKS. See the LLM provider access section of docs/ai/mcp-patterns.md for full details.

body: new URLSearchParams({
  grant_type: "client_credentials",
  scope: credentials.scope,
  resource: getLocalMcpServerUrl("/api/mcp"),  // ← required for JWT issuance
}),
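
The two ends of the exchange can be sketched as pure helpers: one builds the form body with the mandatory resource parameter, the other turns the returned JWT into MCP tool headers. Both function names are hypothetical, and how the client credentials themselves are transmitted is deliberately left out because this document does not specify it.

```typescript
// Builds the form body for the client_credentials exchange.
function buildTokenRequestBody(scope: string, mcpResourceUrl: string): URLSearchParams {
  return new URLSearchParams({
    grant_type: "client_credentials",
    scope,
    // RFC 8707 resource indicator; without it the server issues an opaque token.
    resource: mcpResourceUrl,
  });
}

// Wraps the short-lived JWT in the header the MCP server validates.
function buildMcpToolHeaders(accessToken: string): Record<string, string> {
  return { Authorization: `Bearer ${accessToken}` };
}
```

In the real code the resource value comes from getLocalMcpServerUrl("/api/mcp"); the sketch takes it as a plain string so that it stays self-contained.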

Returns null when unavailable

buildLlmMcpServerTool returns null (not an error) when:

No credentials are provisioned for the provider
No public MCP server URL is configured (operator did not save one in the dev tab)
Token exchange fails

Callers fall back to in-process function tools when it returns null.

Automatic injection via the registry — do not call manually

Do not call buildLlmMcpServerTool at individual call sites. The withMcpServerTool wrapper in packages/llm-orchestration/src/registry.ts intercepts every generate and stream call on the OpenAI adapter and prepends the MCP server tool automatically:

// registry.ts — applied once in resolveProviderAdapter("openai")
function withMcpServerTool(adapter: LlmProviderAdapter): LlmProviderAdapter {
  return {
    ...adapter,
    async generate(input) {
      const mcpTool = await buildLlmMcpServerTool("openai");
      return adapter.generate({ ...input, tools: mcpTool ? [mcpTool, ...(input.tools ?? [])] : input.tools });
    },
    async stream(input) { /* same pattern */ },
  };
}

This means every caller that goes through resolveProviderAdapter("openai") — the chat route, all agent execution packages, orchestration helpers — automatically gets the MCP server tool without any code changes. The tool is placed first in the tools list so the model always sees the MCP server before the in-process function tools.

Anthropic MCP mode

The Anthropic adapter has two MCP delivery modes, configurable via the mcpMode setting in @cinatra-ai/anthropic-connector. The setting is stored in the DB and managed from the /apis/claude settings page. Since the Phase 397 D1 split it belongs to the Anthropic API connector, not to the inbound MCP-client registry that v5.7 Phase 434.2 renamed to @cinatra-ai/mcp-client-registry-connector. The modes are:

"function-tools" (default): Uses client.messages.create (standard API). MCP tools are fetched as function tools via fetchMcpToolsAsLlmFunctionTools. No Anthropic beta program required.
"native": Uses client.beta.messages.create with the mcp-client-2025-11-20 beta. Requires the beta to be enabled on the Anthropic account.

If "native" is configured but the beta call throws (e.g. the beta is not active on the account), the adapter automatically falls back to "function-tools" for that run, resets conversation state, and re-fetches MCP tools as function tools. A warning is logged to the console.
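
The fallback described above can be sketched as follows. This is a simplified illustration, not the adapter's implementation: runWithMcpModeFallback and its callback parameters are hypothetical, and the state reset and MCP tool re-fetch are represented only by a comment.

```typescript
type McpMode = "function-tools" | "native";

interface RunResult {
  mode: McpMode; // the mode that actually produced the output
  output: string;
}

async function runWithMcpModeFallback(
  configured: McpMode,
  runNative: () => Promise<string>,
  runFunctionTools: () => Promise<string>,
): Promise<RunResult> {
  if (configured === "native") {
    try {
      return { mode: "native", output: await runNative() };
    } catch (err) {
      // The beta call failed, for example because the beta is not active on the account.
      console.warn("Native MCP mode failed; falling back to function-tools for this run", err);
      // The real adapter also resets conversation state and re-fetches MCP tools here.
    }
  }
  return { mode: "function-tools", output: await runFunctionTools() };
}
```

The fallback applies to the current run only; the stored mcpMode setting is not changed by this sketch.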

The LlmShellTool type is translated to a standard bash function tool on Anthropic — not to bash_20250124 (which would require the computer-use-2025-01-24 beta). No extra beta headers are needed for skill reading.

`executionProvider` routing convention

Agent builder runs carry an executionProvider field ("langgraph" | "default" | "openai" | "anthropic" | "gemini") that determines which execution path handles the run. The canonical isLangGraph check is:

const isLangGraph =
  template.executionProvider === "langgraph" ||
  template.executionProvider === "default"; // "default" maps to LangGraph per Phase 92 convention

This pattern is used in three places and must stay consistent:

| File | Context |
| --- | --- |
| `packages/agents/src/execution.ts` (line ~436) | Fresh-run dispatch: routes to `AGENT_BUILDER_LANGGRAPH_EXECUTION` |
| `packages/agents/src/mcp/handlers.ts` | Resume via MCP: `handleAgentBuilderRunResume` routes on `executionProvider` before `executionMode` |
| `packages/agents/src/review-task-actions.ts` | Approve routing: `approveReviewTaskInternal` routes LangGraph runs to LangGraph resume worker |

Rules:

Always discriminate on executionProvider, not executionMode (agentic/deterministic) or lgThreadId. executionMode is a capability flag; executionProvider is the runtime discriminator.
AGENT_BUILDER_RESUME (BullMQ job) is legacy-only — it only executes for openai, anthropic, and gemini templates. All LangGraph runs go through AGENT_BUILDER_LANGGRAPH_EXECUTION.
When resuming a LangGraph run, pass resume: { values: null }. Setup-field values are already merged into agent_runs.inputParams by approveReviewTaskInternal; the Python graph reads collected_inputs from thread state, not from command.resume.
Both resume.ts and agentic-resume.ts guard on run.lgThreadId and delegate to runLangGraphJob — they skip all state-reconstruction logic for LangGraph runs.
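
One way to keep the three call sites consistent is to centralize the check in a shared helper, sketched below. isLangGraphProvider, pickJobName, and LANGGRAPH_RESUME_COMMAND are hypothetical names; the provider values, the job names, and the resume payload come from the rules above.

```typescript
type ExecutionProvider = "langgraph" | "default" | "openai" | "anthropic" | "gemini";
type JobName = "AGENT_BUILDER_LANGGRAPH_EXECUTION" | "AGENT_BUILDER_RESUME";

// "default" maps to LangGraph, so both values take the LangGraph path.
function isLangGraphProvider(provider: ExecutionProvider): boolean {
  return provider === "langgraph" || provider === "default";
}

// Discriminates on executionProvider only, never on executionMode or lgThreadId.
function pickJobName(provider: ExecutionProvider): JobName {
  return isLangGraphProvider(provider)
    ? "AGENT_BUILDER_LANGGRAPH_EXECUTION"
    : "AGENT_BUILDER_RESUME"; // legacy path for openai, anthropic, and gemini templates
}

// Resume payload for LangGraph runs: setup-field values are already in agent_runs.inputParams.
const LANGGRAPH_RESUME_COMMAND = { resume: { values: null } } as const;
```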

Unified LLM bridge — `/api/llm-bridge`

All WayFlow LLM execution goes through /api/llm-bridge — both the TypeScript ApiNode path and the Python container path. The old /api/internal/langgraph-llm-step route was retired in Phase 183 and now returns 410 Gone.

Route: POST /api/llm-bridge

Auth: either a bridge token (X-Cinatra-Bridge-Token header, validated by isAuthorizedBridgeRequest) or a Bearer JWT (A2A token, validated by verifyLangGraphBridgeToken). No API keys are accepted from callers; Cinatra owns the LLM runtime.

Request body:

{
  "user": "workflow input text",
  "agent_id": "email-outreach",
  "max_steps": 6,
  "system": "optional fallback system text",
  "skill_source_path": "/abs/path/to/SKILL.md",
  "toolbox_ids": ["cinatra-mcp"],
  "model_id": "gpt-4o"
}

Skill IDs and custom skill content are resolved server-side from agent_id — callers never pass raw skill lists.

max_steps cap: The server clamps the value to Math.min(body.max_steps ?? 6, 20) regardless of what the caller sends. The default is 6.

Response: { "output": "final text" }. The output is an empty string if the LLM returned null.
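
The clamp and the null-to-empty-string rule are small enough to show directly. The expression inside clampMaxSteps is the one quoted above; the function and constant names are hypothetical wrappers added for illustration.

```typescript
const DEFAULT_MAX_STEPS = 6;
const MAX_STEPS_CAP = 20;

// Applies the default when the caller omits max_steps, then enforces the upper cap.
function clampMaxSteps(requested?: number | null): number {
  return Math.min(requested ?? DEFAULT_MAX_STEPS, MAX_STEPS_CAP);
}

// Normalizes a null LLM result to an empty output string.
function toBridgeResponse(llmText: string | null): { output: string } {
  return { output: llmText ?? "" };
}
```

For example, clampMaxSteps(50) yields 20 and clampMaxSteps() yields 6. Note that the expression only caps the upper bound; it does not reject small or negative values.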

Python caller: cinatra_sdk.llm_step.run_cinatra_llm_step() — see docs/ai/langgraph-graphs.md for usage. The helper derives origin from state["a2a_base_url"] by stripping the /api/a2a suffix before appending /api/llm-bridge.

Do not call this endpoint from TypeScript. TS callers use runResolvedSkillAwareDeterministicLlmTask directly. The bridge exists for Python→TS delegation from WayFlow graph nodes.

What to avoid

calling buildLlmMcpServerTool manually at individual call sites — it is injected automatically by withMcpServerTool in the registry for all OpenAI calls
calling buildSkillTools or readSkillContent directly — they are internal to the orchestration layer; pass skillIds to runSkillAwareDeterministicLlmTask or runResolvedSkillAwareDeterministicLlmTask instead
building skill tools manually and merging with extra tools — use extraTools instead
direct provider-specific calls when orchestration-layer helpers already exist
passing personalSkillContent or dumping skill content into the system prompt
passing useLiveTooling — it is a no-op; shell tool inclusion is automatic
creating skills with createSkillFromTemplate directly from agent packages — use upsertSkill({ type: "system", ... }) instead (see packages/skills/AGENTS.md)
using LLMs for HTTP fetching or other deterministic tasks