Artifacts: LLM attachments + prompt-window file upload
Status: Phase 329b/c/330/331/332/332b shipped on PR #440 (Stage 4 wiring). Core artifacts system: see
artifacts-architecture.md and artifacts-preflight.md.
This document is the cross-cutting reference for how artifact refs flow from the prompt window through the LLM orchestration layer, the bridge, chat persistence, and A2A agent-run resume. It is the operator/dev contract — read this when adding a new LLM caller, a new provider, a new prompt-window consumer, or a new resume path.
Invariants
These hold throughout the v5.0 attachment path. Never break any of them without explicit codex review + a planned cutover.
- Byte-identical legacy. Every pre-v5.0 caller (no attachments, no resolvedAttachments, no attachmentResolverPorts, no user_envelope) MUST observe a request body indistinguishable from the prior behavior. The optionality of every new field on LlmMessage, GenerateInput, StreamInput, OrchestrateGenerateInput, OrchestrateStreamInput, DeterministicLlmExecutionInput, and the bridge RequestSchema is the load-bearing guarantee.
- Decision A: never silently drop. A non-ingestible attachment OR an attachment-carrying turn with no resolver ports OR a per-attachment resolver failure is NEVER silently dropped. The orchestration entry resolver prepends a [ATTACHMENTS …] manifest to the system prompt so the model knows the file exists and why it cannot read it.
- Resolver ports are INTERNAL. resolvedAttachments is set by the orchestration entry points (Phase 329c) from resolveAttachments(). A caller NEVER provides it. The public OrchestrateGenerateInput / OrchestrateStreamInput are Omit<…, "resolvedAttachments"> and the entry impls additionally runtime-strip a smuggled field via cast (Phase 329c codex r3 MAJOR-1).
- Cross-tenant safety. Bridge resolver ports are built ONLY from a run resolved via the auth-injected x-cinatra-a2a-context-id header. A caller-supplied body.agent_run_id cannot select the tenant namespace. If both context and body resolve, they MUST match. Without request-bound ports → ports stay undefined → entry resolver degrades to the Decision-A manifest (Phase 329c codex r3 BLOCKER-2).
- A2A messages stay text-only. Artifact refs that round-trip through the agent-run resume path ride INSIDE the text content of a single A2A part as a JSON envelope {text, attachments?}, never as native file parts (Phase 332).
- llm-orchestration stays @/lib-free in attachments/ and providers/. All app-side dependencies (cache, blob store, provider upload, run lookup) are injected via the AttachmentResolverPorts contract. rg "from ['\"]@/lib" in those directories must stay empty.
- generateWithFileInput is untouched. The single-file asset-blog path (Phase 271 era) is unrelated and must stay unchanged.
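Two of the invariants above, "Resolver ports are INTERNAL" and "Cross-tenant safety", can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the shipped implementation: InternalInput, PublicInput, stripSmuggledResolved, AgentRun, and selectRunForPorts are hypothetical names, the input shapes are heavily simplified, and comparing run ids stands in for whatever lookup the real bridge performs.

```typescript
// Hypothetical, simplified shapes; the real inputs carry many more fields.
type ResolvedAttachment = { nativeKind: string; providerFileId: string };
type InternalInput = {
  system: string;
  prompt: string;
  resolvedAttachments?: ResolvedAttachment[];
};

// Public surface: callers cannot name resolvedAttachments at the type level.
type PublicInput = Omit<InternalInput, "resolvedAttachments">;

// Runtime strip: a field smuggled past the type system is removed before
// the entry point performs its own resolution.
function stripSmuggledResolved(input: PublicInput): InternalInput {
  const copy: InternalInput = { ...(input as InternalInput) };
  delete copy.resolvedAttachments;
  return copy;
}

// Hypothetical run record; only the fields the sketch needs.
type AgentRun = { id: string; orgId: string };

// Ports may only be bound to the run resolved from the auth-injected header.
// A body-supplied id never selects a run on its own, and a mismatch yields null.
function selectRunForPorts(
  runFromContext: AgentRun | null,
  bodyAgentRunId?: string,
): AgentRun | null {
  if (runFromContext === null) return null;
  if (bodyAgentRunId !== undefined && bodyAgentRunId !== runFromContext.id) {
    return null;
  }
  return runFromContext;
}
```

When selectRunForPorts returns null, the caller would leave the resolver ports undefined, which is the case the Decision-A manifest is designed to cover.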
Prompt → LLM (chat + bridge)
    PromptField.onAttachmentsChange → ChatMessage.attachments[]
        │
        ▼
    chat-page.tsx strips API messages WITH attachments forwarded
        │
        ▼
    /api/chat POST → runner.runChatTurn
        │  builds chat-side AttachmentResolverPorts(sessionOrgId)
        │  resolves per-message attachments → stamps resolvedAttachments
        ▼
    orchestrateStream(provider, system, messages, ...)
        │
        ▼
    adapter.stream({system, messages, tools, resolvedAttachments?, ...})
        │  (provider-parts.ts: resolvedAttachmentsPerMessage)
        ▼
    provider-native parts emitted on the LAST user turn
    (or per-message when each user message carries its own resolvedAttachments)

    WayFlow / Python container → /api/llm-bridge POST
        body: {user, user_envelope?, attachments?, agent_run_id?, …}
        │
        │  runFromContext = readAgentRunByContextId(x-cinatra-a2a-context-id)
        │  runForPorts    = runFromContext (mismatch with body.agent_run_id → null)
        │
        │  envelope = parseUserEnvelope(body.user, body.user_envelope === true, body.attachments)
        │    enabled=false: text VERBATIM; enabled=true: strict-parse or 400
        │
        │  if envelope.attachments && runForPorts.orgId:
        │    attachmentResolverPorts = buildBridgeAttachmentResolverPorts({orgId})
        ▼
    runResolvedSkillAwareDeterministicLlmTask({
        user: envelope.text,
        attachments?: envelope.attachments,
        attachmentResolverPorts?: …,
    })
        │
        ▼
    resolveEntryAttachments → resolveAttachments (port-driven cache+upload)
        → maps readable → resolvedAttachments
        → prepends manifest for non-readable refs
        ▼
    adapter.generate({system: resolved.system, prompt, resolvedAttachments?, …})

A2A agent-run resume (HITL approval)
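The resume path relies on the {text, attachments?} envelope, which the bridge only unpacks when body.user_envelope is true. The sketch below illustrates that opt-in contract under stated assumptions: parseUserEnvelopeSketch, Envelope, and EnvelopeError are hypothetical names, attachments are left as unknown[] because the ref shape is not given here, and letting envelope attachments take precedence over body.attachments is a guess rather than documented behavior.

```typescript
// Hypothetical envelope shape; the real ArtifactRef type is not shown in this document.
type Envelope = { text: string; attachments?: unknown[] };

// Hypothetical error carrying the HTTP status the bridge would answer with.
class EnvelopeError extends Error {
  readonly status = 400;
}

function parseUserEnvelopeSketch(
  user: string,
  enabled: boolean,
  bodyAttachments?: unknown[],
): Envelope {
  if (!enabled) {
    // Legacy path: the text is forwarded verbatim, even if it looks like JSON.
    return bodyAttachments === undefined
      ? { text: user }
      : { text: user, attachments: bodyAttachments };
  }

  // Opt-in path: strict parse, any deviation is a 400.
  let parsed: unknown;
  try {
    parsed = JSON.parse(user);
  } catch {
    throw new EnvelopeError("user_envelope is set but body.user is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new EnvelopeError("envelope must be a JSON object");
  }
  const { text, attachments } = parsed as Record<string, unknown>;
  if (typeof text !== "string") {
    throw new EnvelopeError("envelope.text must be a string");
  }
  if (attachments !== undefined && !Array.isArray(attachments)) {
    throw new EnvelopeError("envelope.attachments must be an array when present");
  }

  // Assumption: envelope attachments win over body.attachments when both exist.
  const merged = (attachments as unknown[] | undefined) ?? bodyAttachments;
  return merged === undefined ? { text } : { text, attachments: merged };
}
```

With enabled=false, a JSON string produced by the HITL form comes back unchanged as text, which is the byte-identical legacy behavior described in this section.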
    HITL approval form → values.userResponse = JSON.stringify({text, attachments?})
        │
        ▼
    approveReviewTaskInternal:
        resumeText      = userResponseRaw (verbatim)
        submittedValues = JSON.parse(userResponseRaw)
        sendTask({message: {parts: [{kind:"text", text: resumeText}]}, …})
        │        ↑ A2A text-only invariant (one text part, no file parts)
        ▼
    WayFlow forwards body.user = resumeText to /api/llm-bridge
        WAYFLOW MUST ALSO SET body.user_envelope = true if it wants the
        bridge to extract the embedded {text, attachments}. Without that
        flag the bridge ships the JSON string VERBATIM to orchestration:
        byte-identical legacy behavior, NOT a bug.

Email-attachment provenance (Phase 332b)
    inbound email handler (future)
        → for each MIME attachment:
            await createUploadedArtifact({
                orgId, createdBy, stream, maxBytes, declaredMime, title,
                originKind: "email_attachment",
                parentId: <email object id>,
                parentType: "email",
            })

No new artifact type, no new enum value: both fields already exist on
WriteUploadedArtifactInput. The ArtifactOriginKind enum is shared
structurally between @cinatra-ai/artifacts and @cinatra-ai/llm-orchestration
so an email-attached file resolved into a later LLM turn keeps its origin
tag end-to-end.
Per-provider native emission
| Provider | nativeKind | providerFileId is… | Emitted as |
|---|---|---|---|
| OpenAI | openai_input_file | Files API file_id | { type: "input_file", file_id } |
| Anthropic | anthropic_document | Files API file_id | { type: "document", source: { type: "file", file_id } } (files-api-2025-04-14 beta gate; ONLY when document parts present — Phase 329b Option A) |
| Gemini | gemini_file_data | the file URI (NOT the resource name) | { fileData: { mimeType, fileUri } } — emitting the resource name silently fails (codex wave-1 BLOCKER-1) |
The per-provider capability registry gates each candidate by MIME type and
size. A non-ingestible candidate degrades to the manifest. Anthropic’s betas
array also lists MCP_CLIENT_BETA when native MCP is on; the two betas combine.
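The mapping in the table can be read as a single switch over nativeKind. The following is a minimal sketch, assuming a simplified ResolvedAttachment union whose mime field name is hypothetical; toNativePart and anthropicBetas are illustrative names as well, and the MCP beta value is passed in by the caller because its actual value is not given here.

```typescript
// Hypothetical, simplified resolved-attachment shapes keyed by nativeKind.
type ResolvedAttachment =
  | { nativeKind: "openai_input_file"; providerFileId: string }
  | { nativeKind: "anthropic_document"; providerFileId: string }
  | { nativeKind: "gemini_file_data"; providerFileId: string; mime: string };

// Part shapes copied from the "Emitted as" column of the table.
type NativePart =
  | { type: "input_file"; file_id: string }
  | { type: "document"; source: { type: "file"; file_id: string } }
  | { fileData: { mimeType: string; fileUri: string } };

function toNativePart(a: ResolvedAttachment): NativePart {
  switch (a.nativeKind) {
    case "openai_input_file":
      return { type: "input_file", file_id: a.providerFileId };
    case "anthropic_document":
      return { type: "document", source: { type: "file", file_id: a.providerFileId } };
    case "gemini_file_data":
      // providerFileId must already be the file URI, not the resource name.
      return { fileData: { mimeType: a.mime, fileUri: a.providerFileId } };
  }
}

const FILES_API_BETA = "files-api-2025-04-14";

// The Files API beta is listed only when document parts are present;
// an MCP beta (value supplied by the caller) combines with it.
function anthropicBetas(hasDocumentParts: boolean, mcpBeta?: string): string[] {
  const betas: string[] = [];
  if (hasDocumentParts) betas.push(FILES_API_BETA);
  if (mcpBeta !== undefined) betas.push(mcpBeta);
  return betas;
}
```

A turn with no document parts and native MCP off yields an empty betas array, so a legacy Anthropic request gains no new header values in this sketch.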
Resolver ports contract
    export type AttachmentResolverPorts = {
      cacheGet(ref, provider): Promise<string|null> | string|null;
      providerUpload(ref, provider, capability: {maxBytes, nativeKind}): Promise<string>;
      cachePut(ref, provider, providerFileId, ttlMs): Promise<void> | void;
    };

- cacheGet returns null on miss/expired/cache-outage; throws are swallowed by the resolver (treated as miss).
- providerUpload throws → that ref degrades to the manifest (the turn proceeds for the other refs). The cap is authoritative: the port MUST enforce maxBytes BEFORE materializing the buffer; it should also use the server-authoritative MIME (not ref.mime) and reject mime mismatches (codex r3 MAJOR-4).
- cachePut failures after a successful upload are best-effort: the upload still serves THIS turn (codex Stage-4-core round-1 BLOCKER-2).
Stale-cache self-heal: a Gemini cached id that does not have a URI scheme
(legacy files/<id>) is treated as a MISS so the next call re-uploads
with the correct URI (codex wave-1 r2 MEDIUM).
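The failure semantics of the three ports, together with the Gemini stale-cache rule, can be sketched as a per-ref resolution step. This is an illustration under assumptions rather than the actual resolver: Ref, Provider, Capability, Outcome, isUsableCachedId, and resolveOne are hypothetical names, and "has a URI scheme" is approximated with the RFC 3986 scheme pattern.

```typescript
type Provider = "openai" | "anthropic" | "gemini";
type Ref = { id: string; mime: string }; // hypothetical minimal ref shape
type Capability = { maxBytes: number; nativeKind: string };

type AttachmentResolverPorts = {
  cacheGet(ref: Ref, provider: Provider): Promise<string | null> | string | null;
  providerUpload(ref: Ref, provider: Provider, capability: Capability): Promise<string>;
  cachePut(ref: Ref, provider: Provider, providerFileId: string, ttlMs: number): Promise<void> | void;
};

type Outcome =
  | { kind: "resolved"; providerFileId: string }
  | { kind: "manifest"; reason: string };

// A Gemini id without a URI scheme (legacy "files/<id>") counts as a miss.
function isUsableCachedId(provider: Provider, cached: string | null): cached is string {
  if (cached === null) return false;
  if (provider !== "gemini") return true;
  return /^[a-z][a-z0-9+.-]*:/i.test(cached);
}

async function resolveOne(
  ref: Ref,
  provider: Provider,
  capability: Capability,
  ports: AttachmentResolverPorts,
  ttlMs: number,
): Promise<Outcome> {
  // cacheGet: a throw is swallowed and treated as a miss.
  let cached: string | null = null;
  try {
    cached = await ports.cacheGet(ref, provider);
  } catch {
    cached = null;
  }
  if (isUsableCachedId(provider, cached)) {
    return { kind: "resolved", providerFileId: cached };
  }

  // providerUpload: a throw degrades this ref to the manifest; other refs proceed.
  let providerFileId: string;
  try {
    providerFileId = await ports.providerUpload(ref, provider, capability);
  } catch (err) {
    return { kind: "manifest", reason: err instanceof Error ? err.message : String(err) };
  }

  // cachePut: best-effort; the upload still serves this turn if it fails.
  try {
    await ports.cachePut(ref, provider, providerFileId, ttlMs);
  } catch {
    // ignored on purpose
  }
  return { kind: "resolved", providerFileId };
}
```

Calling resolveOne once per ref and collecting the manifest outcomes keeps one failing upload from affecting the other attachments in the same turn.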
Cutover notes
| Concern | Resolution |
|---|---|
| LlmMessage.attachments / resolvedAttachments | Optional everywhere; no caller is forced to opt in. |
| body.user_envelope (bridge) | Opt-in; absent → byte-identical. WayFlow agent_loader.py must opt in when forwarding HITL {text, attachments} envelopes. |
| Gemini PROCESSING→ACTIVE poll | Wave-3 follow-up — the bridge providerUpload returns the URI immediately; a Gemini file in PROCESSING state may cause the first turn to skip the attachment silently. The right place to poll is inside the Gemini adapter’s uploadFile (where the SDK client lives). Tracked. |
| Dashboards as artifacts (Phase 333) | Not in PR #440. Plan: dashboard objects publish an artifact extension that maps the dashboard snapshot to an ArtifactRef so it can be attached/referenced like any file. |
| Generic connector-ref resolver (Phase 334) | Not in PR #440. Plan: a new connector-ref artifact type points at a connector-owned resource; resolver port supplies a connector-specific bytes loader. |
| Live-update binding (Phase 335) | Not in PR #440. Plan: when an artifact extension declares liveBindingChannel, the chat replay path subscribes (SSE/WS) and re-renders on update. |
| Email ingestion (Phase 332b consumer) | Not in PR #440. Contract is set; the future inbound-email handler calls createUploadedArtifact({originKind:"email_attachment", parentId, parentType}). |