LLM Orchestration

If a package needs LLM calls, prefer the repository orchestration layer over provider-specific direct calls.

Preferred functions:

  • resolveConfiguredLlmRuntime()
  • runResolvedSkillAwareDeterministicLlmTask()

Skills are delivered to the LLM via the skillIds parameter — never by dumping content into the system prompt (personalSkillContent is deprecated and must not be used for new code).

The orchestration wrapper (runSkillAwareDeterministicLlmTask / runResolvedSkillAwareDeterministicLlmTask) auto-selects the delivery method per provider:

  • OpenAI: skills are preferentially delivered as the shell tool via buildSkillTools() internally. The LLM reads SKILL.md from the on-disk sourcePath recorded by upsertSkill. The owner-mandated rule (Phase 298.16-P2, 2026-05-15) is that the chat/widget paths MUST resolve every skill to a catalog entry with sourcePath — enforced upstream by ensureChatSkillRegistered / per-widget self-heals. When buildSkillTools is called with skill IDs and NONE resolve with sourcePath (e.g. the /api/llm-bridge agent path with GitHub-installed or user-scoped skills that resolve null under the model actor’s visibility filter), it falls back to read_skill so the LLM can still invoke the skill catalog primitive. A console warning is emitted so operators can see partial-resolution.
  • Anthropic: skills are delivered as shell when shell is supported, OR as read_skill when the Anthropic adapter runs in native-MCP mode (the adapter strips the shell tool and injects read_skill directly — buildSkillTools() is NOT involved on that carve-out).
  • Gemini: skill content is read directly via readSkillContent() and inlined into the system prompt. This avoids the extra round-trip where Gemini has to call a function tool to read the skill.

Consumers pass skillIds to the wrapper — the delivery method is chosen automatically. Do not call buildSkillTools or readSkillContent directly — they are internal to the orchestration layer.
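The per-provider selection described above can be pictured as a small decision function. This is only a sketch: `chooseSkillDelivery`, `DeliveryContext`, and the delivery labels are hypothetical names, and the real wrapper makes this choice internally. It also assumes that the non-native Anthropic path shares the same `sourcePath` fallback as OpenAI.

```typescript
// Hypothetical names throughout; only the provider rules come from the text above.
type Provider = "openai" | "anthropic" | "gemini";
type SkillDelivery = "shell" | "read_skill" | "inline-system-prompt";

interface DeliveryContext {
  // True when at least one requested skill resolves to a catalog entry with sourcePath.
  anySkillHasSourcePath: boolean;
  // True when the Anthropic adapter runs in native-MCP mode.
  anthropicNativeMcp: boolean;
}

function chooseSkillDelivery(provider: Provider, ctx: DeliveryContext): SkillDelivery {
  switch (provider) {
    case "openai":
      // Shell is preferred; read_skill is the fallback when nothing resolves on disk.
      return ctx.anySkillHasSourcePath ? "shell" : "read_skill";
    case "anthropic":
      // Native-MCP mode strips the shell tool and injects read_skill directly.
      if (ctx.anthropicNativeMcp) return "read_skill";
      // Assumption: the non-native path uses the same sourcePath fallback as OpenAI.
      return ctx.anySkillHasSourcePath ? "shell" : "read_skill";
    case "gemini":
      // Skill content is inlined into the system prompt, so no tool round-trip is needed.
      return "inline-system-prompt";
  }
}
```

Consumers never make this choice themselves; passing `skillIds` to the wrapper is enough.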

When skills are delivered as the shell tool (OpenAI path), buildSkillTools() builds:

  1. A type: "shell" tool with local file paths for every skill whose catalog record has a sourcePath on disk. The shell tool uses cat/head/tail executed locally via readSkillFileContent — no Docker required.

read_skill is the fallback tool for: (a) the Anthropic native-MCP adapter, (b) the /api/llm-bridge shell-incompatible path, and (c) buildSkillTools when no skill resolves with sourcePath. New chat/widget code paths should ensure their skills have sourcePath via registerPackageSystemSkill.

The shell tool declaration in the API request includes the skill directory path:

{
  "type": "shell",
  "environment": {
    "type": "local",
    "skills": [{ "name": "agent-scrape", "description": "...", "path": "/abs/path/to/skill/dir" }]
  }
}

This path is present in the request only when sourcePath is set on the skill record. Skills persisted via upsertSkill always have sourcePath set.
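As an illustration of how the shell declaration and the read_skill fallback relate, the sketch below builds the tool list from a set of skill records. `SkillRecord`, `SkillTool`, and `buildSkillToolsSketch` are hypothetical names, and the shape of the read_skill tool is an assumption. The real `buildSkillTools()` is internal to the orchestration layer and may differ.

```typescript
// Hypothetical types; the real catalog record and tool shapes may differ.
interface SkillRecord {
  name: string;
  description: string;
  sourcePath?: string | null; // absolute skill directory on disk, if recorded
}

interface ShellSkillEntry {
  name: string;
  description: string;
  path: string;
}

type SkillTool =
  | { type: "shell"; environment: { type: "local"; skills: ShellSkillEntry[] } }
  | { type: "read_skill" }; // assumed placeholder shape for the fallback tool

function buildSkillToolsSketch(records: SkillRecord[]): SkillTool[] {
  if (records.length === 0) return [];

  // Only skills with a sourcePath on disk can be exposed through the shell tool.
  const skills: ShellSkillEntry[] = [];
  for (const record of records) {
    if (typeof record.sourcePath === "string" && record.sourcePath.length > 0) {
      skills.push({ name: record.name, description: record.description, path: record.sourcePath });
    }
  }

  if (skills.length === 0) {
    // Nothing resolved on disk: warn and fall back to read_skill.
    console.warn(`[skills] none of ${records.length} skill(s) resolved with sourcePath; using read_skill`);
    return [{ type: "read_skill" }];
  }

  return [{ type: "shell", environment: { type: "local", skills } }];
}
```

Application code should still pass `skillIds` to the wrapper and not build these tools itself.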

Passing includeShell: true to buildSkillTools() switches from the local file reader to the Docker executor. This is only needed for write-capable shell tasks; regular skill reading does not require Docker.

For LLM-enabled package execution:

  1. Resolve the instance skill ID — call the skill generation function at instance creation time, or use the lazy-migration helper (resolveInstanceSkillId) for old instances without a stored ID.
  2. Resolve the configured runtime.
  3. Pass skillIds: instanceSkillId ? [instanceSkillId] : undefined to runResolvedSkillAwareDeterministicLlmTask.
  4. Use explicit log labels for observability.

Do not pass personalSkillContent. Do not pass useLiveTooling — the shell tool is now included automatically.
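The four steps above can be sketched as one function. The orchestration functions are injected as dependencies so the example stays self-contained. Their signatures, the `LlmRuntime` and `LlmTaskInput` shapes, the string return type, and the `runInstanceLlmStep` name are assumptions for illustration, not the real package API.

```typescript
// All signatures are assumptions; check the orchestration package for the real ones.
interface LlmRuntime {
  provider: string;
}

interface LlmTaskInput {
  runtime: LlmRuntime;
  skillIds?: string[] | undefined;
  system: string;
  user: string;
  logLabel: string;
}

interface OrchestrationDeps {
  resolveInstanceSkillId(instanceId: string): Promise<string | null>;
  resolveConfiguredLlmRuntime(): Promise<LlmRuntime>;
  runResolvedSkillAwareDeterministicLlmTask(input: LlmTaskInput): Promise<string>;
}

async function runInstanceLlmStep(
  deps: OrchestrationDeps,
  instanceId: string,
  payload: unknown,
): Promise<string> {
  // 1. Resolve the instance skill ID (null when the instance has none).
  const instanceSkillId = await deps.resolveInstanceSkillId(instanceId);

  // 2. Resolve the configured runtime.
  const runtime = await deps.resolveConfiguredLlmRuntime();

  // 3. Pass skillIds only; personalSkillContent and useLiveTooling are not passed.
  // 4. Use an explicit log label so the call is easy to find in logs.
  return deps.runResolvedSkillAwareDeterministicLlmTask({
    runtime,
    skillIds: instanceSkillId ? [instanceSkillId] : undefined,
    system: "Summarize the input.",
    user: JSON.stringify(payload),
    logLabel: "example-package:summarize",
  });
}
```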

extraTools — additional tools through the wrapper

When a task needs tools beyond skill tools (e.g. createWebSearchTool()), pass them via extraTools. The wrapper merges them into the final tools array:

const llmResponse = await runResolvedSkillAwareDeterministicLlmTask({
  runtime: llmRuntime,
  skillIds: ["@cinatra/example-skill:extract-data"],
  extraTools: [createWebSearchTool()],
  system: "Extract structured data from the web...",
  user: JSON.stringify({ url, instructions }),
  maxSteps: 15,
  maxOutputTokens: 4000,
  outputSchema: extractionSchema,
  signal,
  logLabel: "extract-websearch",
});

Do not build skill tools manually and merge them with extra tools — use extraTools instead.

Typical division of work between deterministic code and LLM calls:

  • fetch and parse: deterministic
  • page discovery: LLM via orchestration with skillIds
  • extraction from fetched content: LLM via orchestration with skillIds
  • graceful fallback to deterministic extracted data when appropriate
  • validation and web checks: deterministic
  • plan generation: LLM via orchestration with skillIds
  • per-item research: LLM via orchestration with skillIds
  • validation outputs must be included in later LLM context
  • structured service lookups: deterministic
  • no LLM unless the package explicitly adds an LLM-driven enrichment mode

Native MCP server tool (LLM-to-MCP connection)

buildLlmMcpServerTool(provider) in packages/llm-orchestration/src/mcp-access.ts builds an LlmMcpServerTool that lets an LLM provider connect directly to the Cinatra MCP server.

Why it exchanges credentials for a Bearer token

LLM providers (OpenAI, Gemini) call the MCP server over the configured public base URL. The MCP server validates requests with verifyMcpAccessToken, which requires a JWT Bearer token — not raw client credentials. buildLlmMcpServerTool therefore:

  1. Reads the stored clientId / clientSecret for the provider (from getLlmMcpCredentials)
  2. Exchanges them for a short-lived JWT via POST /api/auth/oauth2/token (local, not public-URL)
  3. Passes the JWT as Authorization: Bearer <token> in the MCP tool headers

The token request must include resource: getLocalMcpServerUrl("/api/mcp") (RFC 8707). Without it, Better Auth issues an opaque token, which cannot be verified by JWKS. See docs/ai/mcp-patterns.md — LLM provider access section for full details.

// packages/llm-orchestration/src/mcp-access.ts
body: new URLSearchParams({
  grant_type: "client_credentials",
  scope: credentials.scope,
  resource: getLocalMcpServerUrl("/api/mcp"), // ← required for JWT issuance
}),

buildLlmMcpServerTool returns null (not an error) when:

  • No credentials are provisioned for the provider
  • No public MCP server URL is configured (operator did not save one in the dev tab)
  • Token exchange fails

Callers fall back to in-process function tools when it returns null.
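The three null cases above can be sketched as follows. The dependency interface, the `McpServerToolSketch` shape, and the function names `getPublicMcpServerUrl` and `exchangeForJwt` are hypothetical stand-ins. Only `getLlmMcpCredentials` and the three null conditions come from the text, and the real `LlmMcpServerTool` type may look different.

```typescript
// Hypothetical shapes; only the three null conditions are taken from the text above.
interface McpCredentials {
  clientId: string;
  clientSecret: string;
  scope: string;
}

interface McpServerToolSketch {
  type: "mcp";
  serverUrl: string;
  headers: { Authorization: string };
}

interface McpToolDeps {
  getLlmMcpCredentials(provider: string): Promise<McpCredentials | null>;
  getPublicMcpServerUrl(): Promise<string | null>; // assumed helper name
  exchangeForJwt(credentials: McpCredentials): Promise<string>; // assumed helper; rejects on failure
}

async function buildMcpServerToolSketch(
  provider: string,
  deps: McpToolDeps,
): Promise<McpServerToolSketch | null> {
  const credentials = await deps.getLlmMcpCredentials(provider);
  if (!credentials) return null; // no credentials provisioned for the provider

  const serverUrl = await deps.getPublicMcpServerUrl();
  if (!serverUrl) return null; // no public MCP server URL configured

  try {
    const jwt = await deps.exchangeForJwt(credentials);
    // The JWT travels as a Bearer token in the MCP tool headers.
    return { type: "mcp", serverUrl, headers: { Authorization: `Bearer ${jwt}` } };
  } catch {
    return null; // token exchange failed
  }
}
```

Returning null in all three cases lets callers use one check before falling back to in-process function tools.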

Automatic injection via the registry — do not call manually

Do not call buildLlmMcpServerTool at individual call sites. The withMcpServerTool wrapper in packages/llm-orchestration/src/registry.ts intercepts every generate and stream call on the OpenAI adapter and prepends the MCP server tool automatically:

// registry.ts: applied once in resolveProviderAdapter("openai")
function withMcpServerTool(adapter: LlmProviderAdapter): LlmProviderAdapter {
  return {
    ...adapter,
    async generate(input) {
      const mcpTool = await buildLlmMcpServerTool("openai");
      return adapter.generate({ ...input, tools: mcpTool ? [mcpTool, ...(input.tools ?? [])] : input.tools });
    },
    async stream(input) { /* same pattern */ },
  };
}

This means every caller that goes through resolveProviderAdapter("openai") — the chat route, all agent execution packages, orchestration helpers — automatically gets the MCP server tool without any code changes. The tool is placed first in the tools list so the model always sees the MCP server before the in-process function tools.

The Anthropic adapter has two MCP delivery modes, selected by the mcpMode setting in @cinatra-ai/anthropic-connector. The setting is stored in the DB and managed from the /apis/claude settings page. Since the Phase 397 D1 split, it follows the Anthropic API rather than the inbound MCP-client registry (which v5.7 Phase 434.2 renamed to @cinatra-ai/mcp-client-registry-connector):

  • "function-tools" (default): Uses client.messages.create (standard API). MCP tools are fetched as function tools via fetchMcpToolsAsLlmFunctionTools. No Anthropic beta program required.
  • "native": Uses client.beta.messages.create with the mcp-client-2025-11-20 beta. Requires the beta to be enabled on the Anthropic account.

If "native" is configured but the beta call throws (e.g. the beta is not active on the account), the adapter automatically falls back to "function-tools" for that run, resets conversation state, and re-fetches MCP tools as function tools. A warning is logged to the console.
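The fallback behaviour can be sketched as a wrapper around the two call paths. `runWithMcpModeFallback` and the `ModeRunners` interface are hypothetical. In the real adapter, the runners correspond to the beta and standard Anthropic calls, and re-fetching MCP tools as function tools would happen inside the function-tools path.

```typescript
type McpMode = "function-tools" | "native";

// Hypothetical hooks standing in for the adapter's internals.
interface ModeRunners<T> {
  runNative(): Promise<T>; // e.g. the beta messages call
  runFunctionTools(): Promise<T>; // standard API with MCP tools as function tools
  resetConversationState(): void;
}

async function runWithMcpModeFallback<T>(
  mode: McpMode,
  runners: ModeRunners<T>,
): Promise<{ modeUsed: McpMode; result: T }> {
  if (mode === "native") {
    try {
      return { modeUsed: "native", result: await runners.runNative() };
    } catch (error) {
      // The beta call failed (for example, the beta is not active on the account).
      console.warn("[anthropic] native MCP call failed; using function-tools for this run", error);
      runners.resetConversationState();
    }
  }
  // Default mode, or fallback after a failed native attempt.
  return { modeUsed: "function-tools", result: await runners.runFunctionTools() };
}
```

The fallback applies to the current run only; the stored mcpMode setting is left unchanged in this sketch.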

The LlmShellTool type is translated to a standard bash function tool on Anthropic — not to bash_20250124 (which would require the computer-use-2025-01-24 beta). No extra beta headers are needed for skill reading.

Agent builder runs carry an executionProvider field ("langgraph" | "default" | "openai" | "anthropic" | "gemini") that determines which execution path handles the run. The canonical isLangGraph check is:

const isLangGraph =
  template.executionProvider === "langgraph" ||
  template.executionProvider === "default"; // "default" maps to LangGraph per Phase 92 convention

This pattern is used in three places and must stay consistent:

| File | Context |
| --- | --- |
| packages/agents/src/execution.ts (line ~436) | Fresh-run dispatch: routes to AGENT_BUILDER_LANGGRAPH_EXECUTION |
| packages/agents/src/mcp/handlers.ts | Resume via MCP: handleAgentBuilderRunResume routes on executionProvider before executionMode |
| packages/agents/src/review-task-actions.ts | Approve routing: approveReviewTaskInternal routes LangGraph runs to LangGraph resume worker |

Rules:

  • Always discriminate on executionProvider, not executionMode (agentic/deterministic) or lgThreadId. executionMode is a capability flag; executionProvider is the runtime discriminator.
  • AGENT_BUILDER_RESUME (BullMQ job) is legacy-only — it only executes for openai, anthropic, and gemini templates. All LangGraph runs go through AGENT_BUILDER_LANGGRAPH_EXECUTION.
  • When resuming a LangGraph run, pass resume: { values: null }. Setup-field values are already merged into agent_runs.inputParams by approveReviewTaskInternal; the Python graph reads collected_inputs from thread state, not from command.resume.
  • Both resume.ts and agentic-resume.ts guard on run.lgThreadId and delegate to runLangGraphJob — they skip all state-reconstruction logic for LangGraph runs.
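A compact way to picture the routing rules is a pair of helpers keyed on executionProvider alone. `isLangGraphProvider`, `resumeJobFor`, and the returned object shape are hypothetical; the job names and the `resume: { values: null }` payload are the ones named above.

```typescript
type ExecutionProvider = "langgraph" | "default" | "openai" | "anthropic" | "gemini";
type ResumeJobName = "AGENT_BUILDER_LANGGRAPH_EXECUTION" | "AGENT_BUILDER_RESUME";

// Hypothetical return shape for illustration only.
interface ResumeRoute {
  job: ResumeJobName;
  resume?: { values: null };
}

// "default" maps to LangGraph, so both values take the LangGraph path.
function isLangGraphProvider(provider: ExecutionProvider): boolean {
  return provider === "langgraph" || provider === "default";
}

function resumeJobFor(provider: ExecutionProvider): ResumeRoute {
  if (isLangGraphProvider(provider)) {
    // Setup-field values already live in agent_runs.inputParams, so resume carries no values.
    return { job: "AGENT_BUILDER_LANGGRAPH_EXECUTION", resume: { values: null } };
  }
  // Legacy path: only openai, anthropic, and gemini templates reach this job.
  return { job: "AGENT_BUILDER_RESUME" };
}
```

Neither helper reads executionMode or lgThreadId, which keeps the discriminator consistent across the three call sites.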

All WayFlow LLM execution goes through /api/llm-bridge — both the TypeScript ApiNode path and the Python container path. The old /api/internal/langgraph-llm-step route was retired in Phase 183 and now returns 410 Gone.

Route: POST /api/llm-bridge

Auth: Bridge-token (X-Cinatra-Bridge-Token header validated by isAuthorizedBridgeRequest) OR Bearer JWT (A2A token validated by verifyLangGraphBridgeToken). No API keys accepted from callers — Cinatra owns the LLM runtime.

Request body:

{
  "user": "workflow input text",
  "agent_id": "email-outreach",
  "max_steps": 6,
  "system": "optional fallback system text",
  "skill_source_path": "/abs/path/to/SKILL.md",
  "toolbox_ids": ["cinatra-mcp"],
  "model_id": "gpt-4o"
}

Skill IDs and custom skill content are resolved server-side from agent_id — callers never pass raw skill lists.

max_steps cap: Server clamps to Math.min(body.max_steps ?? 6, 20) regardless of what the caller sends. Default is 6.
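The clamp is small enough to show in isolation. The function and constant names below are hypothetical; the expression itself is the one quoted above.

```typescript
const DEFAULT_MAX_STEPS = 6;
const MAX_STEPS_CAP = 20;

// Mirrors Math.min(body.max_steps ?? 6, 20): a missing value becomes 6, large values are capped at 20.
function clampMaxSteps(body: { max_steps?: number | null }): number {
  return Math.min(body.max_steps ?? DEFAULT_MAX_STEPS, MAX_STEPS_CAP);
}
```

With this expression, an omitted `max_steps` yields 6, a request for 50 yields 20, and a request for 3 stays at 3. The expression only enforces an upper bound, so zero or negative values pass through unchanged.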

Response: { "output": "final text" } — empty string if LLM returned null.

Python caller: cinatra_sdk.llm_step.run_cinatra_llm_step() — see docs/ai/langgraph-graphs.md for usage. The helper derives origin from state["a2a_base_url"] by stripping the /api/a2a suffix before appending /api/llm-bridge.

Do not call this endpoint from TypeScript. TS callers use runResolvedSkillAwareDeterministicLlmTask directly. The bridge exists for Python→TS delegation from WayFlow graph nodes.


Avoid:

  • calling buildLlmMcpServerTool manually at individual call sites; it is injected automatically by withMcpServerTool in the registry for all OpenAI calls
  • calling buildSkillTools or readSkillContent directly — they are internal to the orchestration layer; pass skillIds to runSkillAwareDeterministicLlmTask or runResolvedSkillAwareDeterministicLlmTask instead
  • building skill tools manually and merging with extra tools — use extraTools instead
  • direct provider-specific calls when orchestration-layer helpers already exist
  • passing personalSkillContent or dumping skill content into the system prompt
  • passing useLiveTooling — it is a no-op; shell tool inclusion is automatic
  • creating skills with createSkillFromTemplate directly from agent packages — use upsertSkill({ type: "system", ... }) instead (see packages/skills/AGENTS.md)
  • using LLMs for HTTP fetching or other deterministic tasks