LLM Orchestration
Standard approach
If a package needs LLM calls, prefer the repository orchestration layer over provider-specific direct calls.
Preferred functions:
- `resolveConfiguredLlmRuntime()`
- `runResolvedSkillAwareDeterministicLlmTask()`
Skill delivery to the LLM
Skills are delivered to the LLM via the `skillIds` parameter. Callers must never paste skill content into the system prompt themselves: `personalSkillContent` is deprecated and must not be used in new code.
The orchestration wrapper (runSkillAwareDeterministicLlmTask / runResolvedSkillAwareDeterministicLlmTask) auto-selects the delivery method per provider:
- OpenAI: skills are preferentially delivered as the `shell` tool via `buildSkillTools()` internally. The LLM reads `SKILL.md` from the on-disk `sourcePath` recorded by `upsertSkill`. The owner-mandated rule (Phase 298.16-P2, 2026-05-15) is that the chat/widget paths MUST resolve every skill to a catalog entry with `sourcePath`; this is enforced upstream by `ensureChatSkillRegistered` and per-widget self-heals. When `buildSkillTools` is called with skill IDs and NONE resolve with `sourcePath` (e.g. the `/api/llm-bridge` agent path with GitHub-installed or user-scoped skills that resolve null under the model actor's visibility filter), it falls back to `read_skill` so the LLM can still invoke the skill catalog primitive. A console warning is emitted so operators can see partial-resolution.
- Anthropic: skills are delivered as `shell` when shell is supported, OR as `read_skill` when the Anthropic adapter runs in native-MCP mode (the adapter strips the shell tool and injects `read_skill` directly; `buildSkillTools()` is NOT involved on that carve-out).
- Gemini: skill content is read directly via `readSkillContent()` and inlined into the system prompt. This avoids the extra round-trip where Gemini has to call a function tool to read the skill.
Consumers pass skillIds to the wrapper — the delivery method is chosen automatically. Do not call buildSkillTools or readSkillContent directly — they are internal to the orchestration layer.
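The per-provider selection described above can be pictured as a small decision function. This is a minimal sketch under stated assumptions, not the wrapper's actual code: `SkillDelivery`, `DeliveryContext`, and `chooseSkillDelivery` are hypothetical names, and "shell is supported" on Anthropic is simplified to "not running in native-MCP mode".

```typescript
// Hypothetical types for illustration; none of these names come from the orchestration package.
type Provider = "openai" | "anthropic" | "gemini";
type SkillDelivery = "shell" | "read_skill" | "inline-system-prompt";

interface DeliveryContext {
  // True when at least one requested skill resolved to a catalog entry with sourcePath.
  anySkillHasSourcePath: boolean;
  // True when the Anthropic adapter runs in native-MCP mode.
  anthropicNativeMcp: boolean;
}

function chooseSkillDelivery(provider: Provider, ctx: DeliveryContext): SkillDelivery {
  switch (provider) {
    case "openai":
      // Shell is preferred; read_skill is the fallback when no skill has a sourcePath.
      return ctx.anySkillHasSourcePath ? "shell" : "read_skill";
    case "anthropic":
      // Simplification: native-MCP mode strips the shell tool and injects read_skill.
      return ctx.anthropicNativeMcp ? "read_skill" : "shell";
    case "gemini":
      // Skill content is inlined by the orchestration layer, not by the caller.
      return "inline-system-prompt";
  }
}
```

Callers never make this choice themselves; the sketch only illustrates why passing `skillIds` is sufficient.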
When skills are delivered as the shell tool (OpenAI path), buildSkillTools() builds:
- A `type: "shell"` tool with local file paths for every skill whose catalog record has a `sourcePath` on disk. The shell tool uses `cat`/`head`/`tail` executed locally via `readSkillFileContent`; no Docker required.
read_skill is the fall-back tool for: (a) the Anthropic native-MCP adapter, (b) the /api/llm-bridge shell-incompat path, and (c) buildSkillTools when no skill resolves with sourcePath. New chat/widget code paths should ensure their skills have sourcePath via registerPackageSystemSkill.
The shell tool declaration in the API request includes the skill directory path:

    {
      "type": "shell",
      "environment": {
        "type": "local",
        "skills": [
          { "name": "agent-scrape", "description": "...", "path": "/abs/path/to/skill/dir" }
        ]
      }
    }

This path is present in the request only when `sourcePath` is set on the skill record. Skills persisted via `upsertSkill` always have `sourcePath` set.
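To make the `sourcePath` rule concrete, the following sketch builds a declaration of the shape shown above from a list of skill records and falls back when none can be resolved. It is an illustration only: `SkillRecord`, `SkillToolSketch`, and `buildSkillToolSketch` are hypothetical, the `read_skill` placeholder shape is invented for the example, and the sketch assumes `sourcePath` can be used directly as the skill directory path.

```typescript
// Hypothetical catalog record; only the fields this sketch needs.
interface SkillRecord {
  name: string;
  description: string;
  sourcePath?: string | null;
}

interface ShellSkillEntry {
  name: string;
  description: string;
  path: string;
}

// The read_skill variant is a placeholder; its real shape is not shown in this document.
type SkillToolSketch =
  | { type: "shell"; environment: { type: "local"; skills: ShellSkillEntry[] } }
  | { type: "read_skill" };

function buildSkillToolSketch(records: SkillRecord[]): SkillToolSketch {
  const skills: ShellSkillEntry[] = [];
  for (const record of records) {
    // Only skills with an on-disk sourcePath can be exposed through the shell tool.
    if (record.sourcePath) {
      skills.push({ name: record.name, description: record.description, path: record.sourcePath });
    }
  }
  if (skills.length === 0) {
    // Mirrors the documented behaviour: warn and fall back to read_skill.
    console.warn("No skill resolved with sourcePath; falling back to read_skill");
    return { type: "read_skill" };
  }
  return { type: "shell", environment: { type: "local", skills } };
}
```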
Docker-based shell
Passing `includeShell: true` to `buildSkillTools()` uses the Docker executor instead of the local file reader. This is only needed for write-capable shell tasks; regular skill reading does not require Docker.
Execution-time skill usage
For LLM-enabled package execution:
- Resolve the instance skill ID: call the skill generation function at instance creation time, or use the lazy-migration helper (`resolveInstanceSkillId`) for old instances without a stored ID.
- Resolve the configured runtime.
- Pass `skillIds: instanceSkillId ? [instanceSkillId] : undefined` to `runResolvedSkillAwareDeterministicLlmTask`.
- Use explicit log labels for observability.
Do not pass personalSkillContent. Do not pass useLiveTooling — the shell tool is now included automatically.
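The execution-time steps above can be sketched as a helper that assembles the wrapper input. This is a hedged illustration: `SkillAwareTaskInputSketch` is a hypothetical subset of the real input type, `buildExecutionInput` is not part of the orchestration package, and the `system` text is a placeholder.

```typescript
// Hypothetical subset of the wrapper's input; the real type lives in the orchestration package.
interface SkillAwareTaskInputSketch {
  skillIds: string[] | undefined;
  system: string;
  user: string;
  logLabel: string;
}

function buildExecutionInput(
  instanceSkillId: string | null | undefined,
  payload: unknown,
  logLabel: string,
): SkillAwareTaskInputSketch {
  return {
    // Pass the instance skill by ID; omit skillIds entirely when no ID is stored.
    skillIds: instanceSkillId ? [instanceSkillId] : undefined,
    // Placeholder prompt. Note there is no personalSkillContent and no useLiveTooling.
    system: "Follow the attached skill to process the input.",
    user: JSON.stringify(payload),
    // An explicit label keeps runs distinguishable in logs.
    logLabel,
  };
}
```

The resulting object would be spread into the call to `runResolvedSkillAwareDeterministicLlmTask` together with the resolved runtime.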
extraTools — additional tools through the wrapper
When a task needs tools beyond skill tools (e.g. `createWebSearchTool()`), pass them via `extraTools`. The wrapper merges them into the final tools array:

    const llmResponse = await runResolvedSkillAwareDeterministicLlmTask({
      runtime: llmRuntime,
      skillIds: ["@cinatra/example-skill:extract-data"],
      extraTools: [createWebSearchTool()],
      system: "Extract structured data from the web...",
      user: JSON.stringify({ url, instructions }),
      maxSteps: 15,
      maxOutputTokens: 4000,
      outputSchema: extractionSchema,
      signal,
      logLabel: "extract-websearch",
    });

Do not build skill tools manually and merge them with extra tools; use `extraTools` instead.
Typical package mapping
Scrape-like packages
- fetch and parse: deterministic
- page discovery: LLM via orchestration with `skillIds`
- extraction from fetched content: LLM via orchestration with `skillIds`
- graceful fallback to deterministic extracted data when appropriate
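The graceful-fallback item in the list above could look roughly like the following. It is a sketch, not code from any scrape package: `ExtractedData`, `extractWithFallback`, and the `llmExtract` callback are hypothetical, and the callback stands in for a call through the orchestration wrapper.

```typescript
// Hypothetical result shape for a scrape-like package.
interface ExtractedData {
  title: string | null;
  links: string[];
}

async function extractWithFallback(
  deterministic: ExtractedData,
  llmExtract: () => Promise<ExtractedData | null>,
): Promise<{ data: ExtractedData; source: "llm" | "deterministic" }> {
  try {
    const result = await llmExtract();
    if (result) {
      return { data: result, source: "llm" };
    }
  } catch (error) {
    // The LLM step is best-effort; the deterministic parse is still usable.
    console.warn(`LLM extraction failed, using deterministic data: ${String(error)}`);
  }
  return { data: deterministic, source: "deterministic" };
}
```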
Research-like packages
- validation and web checks: deterministic
- plan generation: LLM via orchestration with `skillIds`
- per-item research: LLM via orchestration with `skillIds`
- validation outputs must be included in later LLM context
Enrichment-like packages
- structured service lookups: deterministic
- no LLM unless the package explicitly adds an LLM-driven enrichment mode
Native MCP server tool (LLM-to-MCP connection)
`buildLlmMcpServerTool(provider)` in `packages/llm-orchestration/src/mcp-access.ts` builds an `LlmMcpServerTool` that lets an LLM provider connect directly to the Cinatra MCP server.
Why it exchanges credentials for a Bearer token
LLM providers (OpenAI, Gemini) call the MCP server over the configured public base URL. The MCP server validates requests with `verifyMcpAccessToken`, which requires a JWT Bearer token, not raw client credentials. `buildLlmMcpServerTool` therefore:
- Reads the stored `clientId`/`clientSecret` for the provider (from `getLlmMcpCredentials`)
- Exchanges them for a short-lived JWT via `POST /api/auth/oauth2/token` (local, not public-URL)
- Passes the JWT as `Authorization: Bearer <token>` in the MCP tool headers
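The three steps above can be sketched as a pure request builder plus a header helper. Several details here are assumptions rather than facts from the source: `McpClientCredentialsSketch`, `buildTokenRequest`, and `toMcpHeaders` are hypothetical names, the local origin is passed in explicitly instead of using the real URL helper, and the client credentials are sent as form fields because the source does not show how they are attached.

```typescript
// Hypothetical credential shape; the source only shows that a scope is stored with the credentials.
interface McpClientCredentialsSketch {
  clientId: string;
  clientSecret: string;
  scope: string;
}

function buildTokenRequest(credentials: McpClientCredentialsSketch, localOrigin: string) {
  // The resource indicator (RFC 8707) names the MCP endpoint the token is for.
  const resource = `${localOrigin}/api/mcp`;
  const body = new URLSearchParams({
    grant_type: "client_credentials",
    client_id: credentials.clientId, // assumption: credentials sent as form fields
    client_secret: credentials.clientSecret, // assumption: see lead-in
    scope: credentials.scope,
    resource,
  });
  return {
    // The exchange targets the local server, not the public base URL.
    url: `${localOrigin}/api/auth/oauth2/token`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: body.toString(),
    },
  };
}

// The JWT returned by the token endpoint is passed on as a Bearer header.
function toMcpHeaders(accessToken: string): Record<string, string> {
  return { Authorization: `Bearer ${accessToken}` };
}
```

A caller would pass `url` and `init` to `fetch` and read the token from the JSON response before building the MCP tool headers.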
The resource parameter is mandatory
The token request must include `resource: getLocalMcpServerUrl("/api/mcp")` (RFC 8707). Without it, Better Auth issues an opaque token, which cannot be verified by JWKS. See the LLM provider access section of `docs/ai/mcp-patterns.md` for full details.

    body: new URLSearchParams({
      grant_type: "client_credentials",
      scope: credentials.scope,
      resource: getLocalMcpServerUrl("/api/mcp"), // ← required for JWT issuance
    }),

Returns null when unavailable
`buildLlmMcpServerTool` returns `null` (not an error) when:
- No credentials are provisioned for the provider
- No public MCP server URL is configured (operator did not save one in the dev tab)
- Token exchange fails
Callers fall back to in-process function tools when it returns null.
Automatic injection via the registry — do not call manually
Do not call `buildLlmMcpServerTool` at individual call sites. The `withMcpServerTool` wrapper in `packages/llm-orchestration/src/registry.ts` intercepts every `generate` and `stream` call on the OpenAI adapter and prepends the MCP server tool automatically:

    // registry.ts: applied once in resolveProviderAdapter("openai")
    function withMcpServerTool(adapter: LlmProviderAdapter): LlmProviderAdapter {
      return {
        ...adapter,
        async generate(input) {
          const mcpTool = await buildLlmMcpServerTool("openai");
          return adapter.generate({
            ...input,
            tools: mcpTool ? [mcpTool, ...(input.tools ?? [])] : input.tools,
          });
        },
        async stream(input) { /* same pattern */ },
      };
    }

This means every caller that goes through `resolveProviderAdapter("openai")` (the chat route, all agent execution packages, orchestration helpers) automatically gets the MCP server tool without any code changes. The tool is placed first in the tools list so the model always sees the MCP server before the in-process function tools.
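The merge rule used by the wrapper can be isolated into a small pure function, which makes the two outcomes easy to see: the MCP tool goes first when it is available, and the caller's tools pass through untouched when it is `null`. `ToolLike` and `prependMcpTool` are hypothetical names for this sketch.

```typescript
// Hypothetical minimal tool shape, used only to make the sketch self-contained.
interface ToolLike {
  name: string;
}

function prependMcpTool<T extends ToolLike>(mcpTool: T | null, tools: T[] | undefined): T[] | undefined {
  // Same expression as in the wrapper: prepend when present, otherwise leave tools as-is.
  return mcpTool ? [mcpTool, ...(tools ?? [])] : tools;
}
```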
Anthropic MCP mode
The Anthropic adapter has two MCP delivery modes configurable via the `mcpMode` setting in `@cinatra-ai/anthropic-connector` (stored in DB, managed from the `/apis/claude` settings page; the setting follows the Anthropic API since the Phase 397 D1 split, not the inbound MCP-client registry that v5.7 Phase 434.2 renamed to `@cinatra-ai/mcp-client-registry-connector`):
- `"function-tools"` (default): Uses `client.messages.create` (standard API). MCP tools are fetched as function tools via `fetchMcpToolsAsLlmFunctionTools`. No Anthropic beta program required.
- `"native"`: Uses `client.beta.messages.create` with the `mcp-client-2025-11-20` beta. Requires the beta to be enabled on the Anthropic account.
If "native" is configured but the beta call throws (e.g. the beta is not active on the account), the adapter automatically falls back to "function-tools" for that run, resets conversation state, and re-fetches MCP tools as function tools. A warning is logged to the console.
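The fallback behaviour described above follows a common try-then-degrade pattern. The sketch below is an assumption-laden illustration, not the adapter's code: `runWithMcpModeFallback` and the `ModeRunners` callbacks are hypothetical, and the `functionTools` callback is assumed to reset conversation state and re-fetch MCP tools before running.

```typescript
type McpMode = "native" | "function-tools";

// Hypothetical callbacks standing in for the two adapter code paths.
interface ModeRunners<T> {
  native: () => Promise<T>;
  // Assumed to reset conversation state and re-fetch MCP tools as function tools.
  functionTools: () => Promise<T>;
}

async function runWithMcpModeFallback<T>(
  configured: McpMode,
  runners: ModeRunners<T>,
  warn: (message: string) => void = console.warn,
): Promise<{ mode: McpMode; result: T }> {
  if (configured === "native") {
    try {
      return { mode: "native", result: await runners.native() };
    } catch (error) {
      // For example, the beta is not active on the account; degrade for this run only.
      warn(`Native MCP call failed, falling back to function-tools: ${String(error)}`);
    }
  }
  return { mode: "function-tools", result: await runners.functionTools() };
}
```

The fallback applies per run, so the stored `mcpMode` setting is not modified by this pattern.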
The LlmShellTool type is translated to a standard bash function tool on Anthropic — not to bash_20250124 (which would require the computer-use-2025-01-24 beta). No extra beta headers are needed for skill reading.
executionProvider routing convention
Agent builder runs carry an `executionProvider` field (`"langgraph" | "default" | "openai" | "anthropic" | "gemini"`) that determines which execution path handles the run. The canonical `isLangGraph` check is:

    const isLangGraph =
      template.executionProvider === "langgraph" ||
      template.executionProvider === "default"; // "default" maps to LangGraph per Phase 92 convention

This pattern is used in three places and must stay consistent:
| File | Context |
|---|---|
| `packages/agents/src/execution.ts` (line ~436) | Fresh-run dispatch: routes to `AGENT_BUILDER_LANGGRAPH_EXECUTION` |
| `packages/agents/src/mcp/handlers.ts` | Resume via MCP: `handleAgentBuilderRunResume` routes on `executionProvider` before `executionMode` |
| `packages/agents/src/review-task-actions.ts` | Approve routing: `approveReviewTaskInternal` routes LangGraph runs to LangGraph resume worker |
Rules:
- Always discriminate on `executionProvider`, not `executionMode` (agentic/deterministic) or `lgThreadId`. `executionMode` is a capability flag; `executionProvider` is the runtime discriminator.
- `AGENT_BUILDER_RESUME` (BullMQ job) is legacy-only: it only executes for `openai`, `anthropic`, and `gemini` templates. All LangGraph runs go through `AGENT_BUILDER_LANGGRAPH_EXECUTION`.
- When resuming a LangGraph run, pass `resume: { values: null }`. Setup-field values are already merged into `agent_runs.inputParams` by `approveReviewTaskInternal`; the Python graph reads `collected_inputs` from thread state, not from `command.resume`.
- Both `resume.ts` and `agentic-resume.ts` guard on `run.lgThreadId` and delegate to `runLangGraphJob`; they skip all state-reconstruction logic for LangGraph runs.
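One way to keep the check consistent across call sites is to centralise it in a single predicate. The sketch below is illustrative: `isLangGraphProvider`, `ResumeRoute`, and `routeResume` are hypothetical names, and the returned objects are simplified stand-ins for real job payloads. The job names and the `resume: { values: null }` convention are taken from the rules above.

```typescript
type ExecutionProvider = "langgraph" | "default" | "openai" | "anthropic" | "gemini";

// Single source of truth for the check; "default" maps to LangGraph.
function isLangGraphProvider(provider: ExecutionProvider): boolean {
  return provider === "langgraph" || provider === "default";
}

// Simplified stand-in for a job payload.
type ResumeRoute =
  | { job: "AGENT_BUILDER_LANGGRAPH_EXECUTION"; resume: { values: null } }
  | { job: "AGENT_BUILDER_RESUME" };

function routeResume(provider: ExecutionProvider): ResumeRoute {
  if (isLangGraphProvider(provider)) {
    // Setup-field values are already merged upstream, so the resume payload carries none.
    return { job: "AGENT_BUILDER_LANGGRAPH_EXECUTION", resume: { values: null } };
  }
  // Legacy path: only openai, anthropic, and gemini templates reach this job.
  return { job: "AGENT_BUILDER_RESUME" };
}
```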
Unified LLM bridge — /api/llm-bridge
All WayFlow LLM execution goes through `/api/llm-bridge`, from both the TypeScript ApiNode path and the Python container path. The old `/api/internal/langgraph-llm-step` route was retired in Phase 183 and now returns 410 Gone.
Route: POST /api/llm-bridge
Auth: Bridge-token (X-Cinatra-Bridge-Token header validated by isAuthorizedBridgeRequest) OR Bearer JWT (A2A token validated by verifyLangGraphBridgeToken). No API keys accepted from callers — Cinatra owns the LLM runtime.
Request body:

    {
      "user": "workflow input text",
      "agent_id": "email-outreach",
      "max_steps": 6,
      "system": "optional fallback system text",
      "skill_source_path": "/abs/path/to/SKILL.md",
      "toolbox_ids": ["cinatra-mcp"],
      "model_id": "gpt-4o"
    }

Skill IDs and custom skill content are resolved server-side from `agent_id`; callers never pass raw skill lists.
max_steps cap: Server clamps to Math.min(body.max_steps ?? 6, 20) regardless of what the caller sends. Default is 6.
Response: { "output": "final text" } — empty string if LLM returned null.
Python caller: cinatra_sdk.llm_step.run_cinatra_llm_step() — see docs/ai/langgraph-graphs.md for usage. The helper derives origin from state["a2a_base_url"] by stripping the /api/a2a suffix before appending /api/llm-bridge.
Do not call this endpoint from TypeScript. TS callers use runResolvedSkillAwareDeterministicLlmTask directly. The bridge exists for Python→TS delegation from WayFlow graph nodes.
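The server-side `max_steps` cap described above reduces to a single expression. The sketch below wraps it in a hypothetical `clampMaxSteps` helper with named constants to show the resulting values; note that the documented expression sets only an upper bound.

```typescript
const DEFAULT_MAX_STEPS = 6;
const MAX_STEPS_CEILING = 20;

// Mirrors Math.min(body.max_steps ?? 6, 20): default when missing, ceiling when too large.
function clampMaxSteps(requested: number | null | undefined): number {
  return Math.min(requested ?? DEFAULT_MAX_STEPS, MAX_STEPS_CEILING);
}
```

With this helper, an omitted value yields 6, a request for 50 yields 20, and a request for 3 is passed through unchanged.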
What to avoid
- calling `buildLlmMcpServerTool` manually at individual call sites; it is injected automatically by `withMcpServerTool` in the registry for all OpenAI calls
- calling `buildSkillTools` or `readSkillContent` directly; they are internal to the orchestration layer, so pass `skillIds` to `runSkillAwareDeterministicLlmTask` or `runResolvedSkillAwareDeterministicLlmTask` instead
- building skill tools manually and merging with extra tools; use `extraTools` instead
- direct provider-specific calls when orchestration-layer helpers already exist
- passing `personalSkillContent` or dumping skill content into the system prompt
- passing `useLiveTooling`; it is a no-op, and shell tool inclusion is automatic
- creating skills with `createSkillFromTemplate` directly from agent packages; use `upsertSkill({ type: "system", ... })` instead (see `packages/skills/AGENTS.md`)
- using LLMs for HTTP fetching or other deterministic tasks