
Artifacts v5.0 — Architecture, Threat Model & Invariants (Phase 314)

Authoritative design: docs/superpowers/specs/2026-05-17-artifacts-and-file-upload-design.md (rev 2a). This page is the binding contract every v5.0 storage/service/LLM phase (318–337) must honor. It lands before any storage code.

  • Artifact = a self-contained deliverable consumed by being opened/read/edited/published/downloaded/attached. Identity is content + provenance, not relations.
  • Data object = a record whose identity is its relations + operational status (contact, account, list, email record/draft). Not an artifact.
  • Artifacts are a typed projection over cinatra.objects — one ownership/scope/Graphiti stack — plus dedicated supporting tables (blob metadata, immutable artifact-version, normalized refs, provider-ref cache, audit/retention). No parallel ownership stack.
  • Artifact types (v5.0): file, dashboard, connector-ref. Each ships as a kind:"artifact" extension; the set is extensible by adding extensions (no core edits).

2. Hard invariants (BLOCKING — enforced by tests/greps in later phases)

  1. No bytes in objects.data. cinatra.objects.data (JSONB) holds metadata + normalized refs only — never file bytes, base64, or blob content. Blob bytes live only in the blob store; the object row carries { artifactType, latestVersionId, digest, mime, size, originKind, … }.
  2. Full-fidelity file model from its first phase (319) — never an upload-only blob. The file artifact model MUST support, from day one:
    • stable artifact id (survives every version);
    • immutable versions, each with a content digest (sha256) + blob ref;
    • MIME-driven viewer hint;
    • origin.kind: upload | email_attachment | agent_generated | external_link | live_generator;
    • arbitrary parent_id / parent_type (e.g. attachment → email object);
    • run / message / provider provenance;
    • editable text/markdown body (not only opaque binary);
    • generated-image variants;
    • publication / reference metadata (published?, editable?, referenced-by).
  3. Tenant/version-scoped blob identity. Physical sha256 dedupe is internal only — never exposed and never used for authorization or cross-tenant existence inference. Blob lookup is always scoped by org_id + artifact-version.
  4. One canonical write path. The artifact service layer (Phase 324) is the only writer. Library UI and MCP CRUD (Phase 325) call the service — never a second write path, never raw blob/object writes.
  5. Immutable, replay-safe refs. A message/run ArtifactRef pins a specific version + digest. Referenced artifacts are tombstone-deleted, never hard-deleted; a referenced version’s bytes are retained.
  6. LLM orchestration is the sole file consumer. WayFlow never consumes files. The prompt window attaches an artifact ref; resolution + provider upload/attach happens only in @cinatra-ai/llm-orchestration via /api/llm-bridge.
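
Invariant 3 can be illustrated with a minimal in-memory sketch. Everything here is an assumption for illustration (the `ScopedBlobStore` class, its method names, and the in-memory maps are hypothetical, not the Phase 318 design): bytes are deduplicated physically by sha256, but callers can only look them up by org id plus version id.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory store; class and method names are illustrative only.
class ScopedBlobStore {
  // Physical layer: bytes deduplicated by sha256. Never exposed to callers.
  private physical = new Map<string, Uint8Array>();
  // Logical layer: (orgId, versionId) -> digest. The only lookup key callers get.
  private scoped = new Map<string, string>();

  // JSON-encode the pair so ids containing separators cannot collide.
  private static key(orgId: string, versionId: string): string {
    return JSON.stringify([orgId, versionId]);
  }

  // Stores bytes for one tenant-scoped version and returns their content digest.
  put(orgId: string, versionId: string, bytes: Uint8Array): string {
    const digest = createHash("sha256").update(bytes).digest("hex");
    if (!this.physical.has(digest)) this.physical.set(digest, bytes);
    this.scoped.set(ScopedBlobStore.key(orgId, versionId), digest);
    return digest;
  }

  // Lookup is always by (orgId, versionId); there is deliberately no get-by-digest,
  // so a digest alone can neither authorize access nor reveal that a blob exists.
  get(orgId: string, versionId: string): Uint8Array | undefined {
    const digest = this.scoped.get(ScopedBlobStore.key(orgId, versionId));
    return digest === undefined ? undefined : this.physical.get(digest);
  }

  // Diagnostic only: number of physically stored blobs after dedupe.
  physicalCount(): number {
    return this.physical.size;
  }
}
```

If two orgs store identical bytes under their own version ids, the sketch keeps one physical copy, and neither org can reach the other's version id through `get`.
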

| Threat | Vector | Mitigation (phase) |
| --- | --- | --- |
| Cross-tenant file disclosure | global sha dedupe / unscoped blob path / predictable URLs | tenant+version-scoped blob identity; authz on every serve; signed/internal URLs (318, 322) |
| Stored-XSS / drive-by | serving HTML/SVG/PDF inline as active content | Content-Disposition: attachment for non-safe types, strict CSP, MIME sniffing, no-exec storage path (322) |
| Path traversal / RCE on upload | crafted filename / archive | server-generated storage keys, never client filename on disk; extension allow/deny; size cap; malware-scan hook point (321) |
| Secret/data exfiltration into graph memory | Graphiti projector serializes full objects.data | metadata/excerpt-only projection policy lands before the first artifact write (320) |
| Privilege escalation via artifact ownership | reassigning owner to widen access | promote-only ratchet; reassignment = explicit audited transfer; narrowing conservative if referenced (323) |
| Replay/audit gap | hard-deleting a referenced artifact | tombstone + retention + audit log on create/delete/transfer/promote (323) |
| Model hallucinating file access | non-ingestible type silently dropped | structured “attached, not directly readable” manifest delivered to the model (326) |
| Unbounded provider re-upload / cost | re-uploading the same blob each turn | provider-ref cache keyed by artifact-version + provider, with GC (328) |

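
The Stored-XSS mitigation can be sketched as a header-selection function. This is an illustration under stated assumptions, not the Phase 322 implementation: the `serveHeaders` name and the inline-safe allowlist are hypothetical (the spec does not enumerate safe types), and the `X-Content-Type-Options: nosniff` header is one reading of the "MIME sniffing" entry.

```typescript
// Assumed allowlist of types considered safe to render inline.
// The spec does not enumerate one; this set is purely illustrative.
const INLINE_SAFE = new Set(["image/png", "image/jpeg", "image/gif", "text/plain"]);

// Hypothetical helper: chooses response headers for serving an artifact's bytes.
function serveHeaders(mime: string): Record<string, string> {
  // Drop parameters such as "; charset=utf-8" and normalize case before matching.
  const normalized = (mime.split(";")[0] ?? "").trim().toLowerCase();
  const inline = INLINE_SAFE.has(normalized);
  return {
    // Non-safe types are served as opaque bytes so the browser never renders them.
    "Content-Type": inline ? normalized : "application/octet-stream",
    "Content-Disposition": inline ? "inline" : "attachment",
    // Stops the browser from guessing an active type from the body.
    "X-Content-Type-Options": "nosniff",
    // Strict policy: no subresource loads, and a sandboxed browsing context.
    "Content-Security-Policy": "default-src 'none'; sandbox",
  };
}
```

With this sketch, `image/svg+xml`, `text/html`, and `application/pdf` all fall outside the allowlist and are forced to download rather than render.
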
type ArtifactRef = {
  artifactId: string; // stable across versions
  versionId: string; // pinned, immutable
  digest: string; // sha256 of the pinned version's bytes
  mime: string;
  originKind: 'upload' | 'email_attachment' | 'agent_generated' | 'external_link' | 'live_generator';
};

The canonical ArtifactRef record is stored in normalized storage (the refs table). Chat-thread JSON may carry a projection or cache of the ref, but never the canonical record and never bytes.
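
To make the pinning concrete, here is a minimal sketch of resolving a ref against stored versions and checking the digest. The `StoredVersion` shape, the `resolvePinned` function, and the in-memory `Map` are hypothetical stand-ins; only the `ArtifactRef` fields come from this page.

```typescript
import { createHash } from "node:crypto";

type OriginKind =
  | "upload" | "email_attachment" | "agent_generated" | "external_link" | "live_generator";

type ArtifactRef = {
  artifactId: string;
  versionId: string;
  digest: string;
  mime: string;
  originKind: OriginKind;
};

// Hypothetical stored-version record; the real schema is not specified here.
type StoredVersion = { versionId: string; bytes: Uint8Array; tombstoned: boolean };

function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Resolves the exact version a ref pins and verifies the bytes match its digest.
function resolvePinned(ref: ArtifactRef, versions: Map<string, StoredVersion>): Uint8Array {
  const version = versions.get(ref.versionId);
  if (!version) throw new Error(`version ${ref.versionId} not found`);
  if (sha256Hex(version.bytes) !== ref.digest) {
    throw new Error("digest mismatch: stored bytes differ from the pinned ref");
  }
  // Tombstoned versions still resolve: a referenced version's bytes are retained.
  return version.bytes;
}
```

Because the ref carries both the version id and the digest, a replayed message either resolves to the same bytes it originally saw or fails loudly.
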

kind:"artifact" extensions are metadata-only (descriptor: type, viewer hint, capabilities, optional resolver). They MUST NOT contain cinatra/oas.json (so WayFlow’s agent loader never mounts them) and MUST NOT carry executable host code paths beyond the descriptor + resolver contract. The ArtifactExtensionTypeHandler validates cinatra.kind:"artifact" + @cinatra-ai/<slug>-artifact naming + absence of oas.json.
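
The three checks attributed to ArtifactExtensionTypeHandler can be sketched as a pure validator. This is not the handler's actual code: the `PackageSnapshot` shape, the `validateArtifactExtension` name, and the lowercase kebab-case slug pattern are assumptions made for illustration.

```typescript
// Hypothetical view of a package: its package.json fields plus its file list.
type PackageSnapshot = {
  packageJson: { name?: string; cinatra?: { kind?: string } };
  files: string[]; // package-relative paths
};

// Assumed slug format (lowercase kebab-case); the spec only states
// the @cinatra-ai/<slug>-artifact naming rule.
const ARTIFACT_NAME = /^@cinatra-ai\/[a-z0-9]+(?:-[a-z0-9]+)*-artifact$/;

// Returns a list of human-readable errors; an empty list means the package passes.
function validateArtifactExtension(pkg: PackageSnapshot): string[] {
  const errors: string[] = [];
  if (pkg.packageJson.cinatra?.kind !== "artifact") {
    errors.push('cinatra.kind must be "artifact"');
  }
  if (!ARTIFACT_NAME.test(pkg.packageJson.name ?? "")) {
    errors.push("name must match @cinatra-ai/<slug>-artifact");
  }
  // An oas.json would make the package mountable by the agent loader.
  if (pkg.files.includes("cinatra/oas.json")) {
    errors.push("cinatra/oas.json must be absent");
  }
  return errors;
}
```

Returning all errors at once, rather than throwing on the first, is a design choice of this sketch so that a rejected package reports every violated rule.
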

5a. Pre-existing systemic gap — dedicated prerequisite phase (Phase 313.1)


Codex round-2 review of Stage 0 surfaced a pre-existing (not v5.0-introduced) systemic gap that must be its own phase, sequenced before any artifact registry-install / marketplace surface work (i.e. before Stage 3 Phase 324):

Phase 313.1 — Extension-dispatch config-DI + kind-agnostic registry listing.

  • ensureConfig() (Phase 222 DI) throws when getAgentPackage() / getPublishedExtensionKind() are called without an explicit VerdaccioConfig. The extension install/update/uninstall/archive/restore dispatch in packages/extensions/src/actions.ts + mcp/handlers.ts calls these without config and swallows the throw → deriveTypeId(null) → "agent". As a result, non-agent extension kinds (skill / connector / artifact) are silently mis-dispatched to the agent handler on main today: a latent correctness bug independent of this milestone. Fix: call loadVerdaccioConfigForServer() once at the server/MCP boundary and thread the resolved VerdaccioConfig into resolveExtensionTypeId + every getAgentPackage/getPublishedExtensionKind call (≈14 sites). This corrects agent/skill/connector and artifact dispatch together.
  • listAgentPackages() extracts agent.json for every package and drops the rest, so skill/connector/artifact packages never appear in the registry marketplace listing. Add a kind-agnostic listExtensionPackages() summary path that reads cinatra.kind from the packument package.json, using the agent payload only when kind === "agent". Until then, in-tree built-in artifact extensions (like connectors today) are not registry-listed — an accepted interim, not a v5.0 regression.
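
The kind-agnostic listing described in the second bullet might look roughly like the sketch below. `listExtensionPackages()` does not exist yet, so its signature here is a guess, and the `PackumentEntry` and `ExtensionSummary` shapes (including the `title` field on the agent payload) are hypothetical simplifications.

```typescript
// Hypothetical, simplified packument entry: only the fields this sketch needs.
type PackumentEntry = {
  name: string;
  packageJson: { cinatra?: { kind?: string } };
  agentJson?: { title: string }; // assumed agent payload shape
};

type ExtensionSummary = { name: string; kind: string; agent?: { title: string } };

// Lists every package that declares cinatra.kind, whatever the kind is.
// The agent payload is attached only when kind === "agent".
function listExtensionPackages(entries: PackumentEntry[]): ExtensionSummary[] {
  const summaries: ExtensionSummary[] = [];
  for (const entry of entries) {
    const kind = entry.packageJson.cinatra?.kind;
    if (kind === undefined) continue; // not a cinatra extension at all
    if (kind === "agent" && entry.agentJson) {
      summaries.push({ name: entry.name, kind, agent: entry.agentJson });
    } else {
      summaries.push({ name: entry.name, kind });
    }
  }
  return summaries;
}
```

The difference from the behavior described for listAgentPackages() is that a missing agent.json no longer drops the package; only a missing cinatra.kind does.
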

Scope rationale: this is a registries-wide DI/listing change affecting all extension kinds; folding it into Stage 0 would conflate a pre-existing platform fix with the artifact feature. It is tracked here as the canonical routing per the GSD deferral policy and must land before Phase 324.

The Stage-0 kind-agnostic resolveExtensionTypeId / getPublishedExtensionKind plumbing is the correct shape; Phase 313.1 supplies the missing VerdaccioConfig so it actually resolves instead of falling through to "agent".
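
The fall-through and its fix can be reproduced in miniature. The function names below mirror the ones on this page, but every signature, the `VerdaccioConfig` shape, the registry URL, and the fake registry answer are assumptions for illustration, not the real code in packages/extensions.

```typescript
type VerdaccioConfig = { registryUrl: string }; // assumed shape

// Stand-in for the registry lookup: throws without config, as ensureConfig() is
// described to do. The name-based answer is a fake used only by this sketch.
function getPublishedExtensionKind(name: string, config?: VerdaccioConfig): string {
  if (!config) throw new Error("VerdaccioConfig is required");
  return name.endsWith("-artifact") ? "artifact" : "agent";
}

// A null kind falls through to "agent".
function deriveTypeId(kind: string | null): string {
  return kind ?? "agent";
}

// Behavior as described for main today: the throw is swallowed, kind stays null,
// and every extension kind is dispatched as "agent".
function resolveWithoutConfig(name: string): string {
  let kind: string | null = null;
  try {
    kind = getPublishedExtensionKind(name);
  } catch {
    // swallowed: this is the latent bug
  }
  return deriveTypeId(kind);
}

// Phase 313.1 shape: config is loaded once at the boundary and passed explicitly.
// A missing config is now a compile-time error rather than a silent fallback.
function resolveExtensionTypeId(name: string, config: VerdaccioConfig): string {
  return deriveTypeId(getPublishedExtensionKind(name, config));
}
```

Making `config` a required parameter is what turns the silent fallback into something the type checker catches at each call site.
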

6. Verification posture (worktree, no live server)


Code-level only in-worktree: pnpm typecheck, package vitest, targeted source greps (invariant guards). Live UAT (upload→chat→library→MCP, browser, OAuth) is GSD step-7.2 work on main post-merge — out of the worktree’s scope.
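
A targeted source grep of the kind mentioned above could be sketched as a small pure function that a package vitest test then asserts on. The `findViolations` helper, the module specifier, and the directory layout in the usage note are hypothetical; the real guards are defined in later phases.

```typescript
type Violation = { file: string; line: number; text: string };

// Scans in-memory sources (path -> content) for a forbidden pattern, skipping
// files under the one directory allowed to use it. Pass a regex WITHOUT the
// global flag so that repeated .test() calls stay stateless.
function findViolations(
  sources: Record<string, string>,
  forbidden: RegExp,
  allowedPrefix: string,
): Violation[] {
  const violations: Violation[] = [];
  for (const [file, content] of Object.entries(sources)) {
    if (file.startsWith(allowedPrefix)) continue;
    content.split("\n").forEach((text, index) => {
      if (forbidden.test(text)) {
        violations.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return violations;
}
```

For example, a guard for invariant 4 might forbid importing a blob-store module anywhere outside the artifact service directory and expect the returned list to be empty.
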