Artifacts v5.0 — Architecture, Threat Model & Invariants (Phase 314)

Authoritative design: docs/superpowers/specs/2026-05-17-artifacts-and-file-upload-design.md (rev 2a). This page is the binding contract every v5.0 storage/service/LLM phase (318–337) must honor. It lands before any storage code.

1. Conceptual model (locked)

Artifact = a self-contained deliverable consumed by being opened/read/edited/published/downloaded/attached. Identity is content + provenance, not relations.
Data object = a record whose identity is its relations + operational status (contact, account, list, email record/draft). Not an artifact.
Artifacts are a typed projection over cinatra.objects — one ownership/scope/Graphiti stack — plus dedicated supporting tables (blob metadata, immutable artifact-version, normalized refs, provider-ref cache, audit/retention). No parallel ownership stack.
Artifact types (v5.0): file, dashboard, connector-ref. Each ships as a kind:"artifact" extension; the set is extensible by adding extensions (no core edits).

2. Hard invariants (BLOCKING — enforced by tests/greps in later phases)

No bytes in objects.data. cinatra.objects.data (JSONB) holds metadata + normalized refs only — never file bytes, base64, or blob content. Blob bytes live only in the blob store; the object row carries { artifactType, latestVersionId, digest, mime, size, originKind, … }.
Full-fidelity file model from its first phase (319) — never an upload-only blob. The file artifact model MUST support, from day one:
- stable artifact id (survives every version);
- immutable versions, each with a content digest (sha256) + blob ref;
- MIME-driven viewer hint;
- origin.kind ∈ upload | email_attachment | agent_generated | external_link | live_generator;
- arbitrary parent_id / parent_type (e.g. attachment → email object);
- run / message / provider provenance;
- editable text/markdown body (not only opaque binary);
- generated-image variants;
- publication / reference metadata (published?, editable?, referenced-by).
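The capability list above can be read as a record shape. A minimal TypeScript sketch follows, assuming hypothetical field names (`blobRef`, `viewerHint`, `variantOf`, `publication`) and a hypothetical hint vocabulary; only the origin kinds and the listed capabilities are taken from this page.

```typescript
// Hypothetical shapes: only the origin kinds and the listed capabilities come from the spec.
type OriginKind =
  | "upload"
  | "email_attachment"
  | "agent_generated"
  | "external_link"
  | "live_generator";

interface FileVersion {
  readonly versionId: string;
  readonly digest: string; // sha256 of this version's bytes (hex encoding assumed)
  readonly blobRef: string; // opaque pointer into the blob store, never the bytes
  readonly mime: string;
  readonly size: number;
}

interface FileArtifact {
  readonly artifactId: string; // stable: never changes when a version is added
  readonly versions: readonly FileVersion[];
  readonly latestVersionId: string;
  readonly viewerHint: string; // derived from MIME
  readonly origin: { kind: OriginKind; runId?: string; messageId?: string; provider?: string };
  readonly parent?: { id: string; type: string }; // e.g. attachment -> email object
  readonly variantOf?: string; // artifact id this generated image is a variant of
  readonly publication: { published: boolean; editable: boolean; referencedBy: readonly string[] };
}

// Assumed hint vocabulary; text types are the ones that get an editable body.
function viewerHintFor(mime: string): "text" | "image" | "pdf" | "download" {
  if (mime.startsWith("text/")) return "text";
  if (mime.startsWith("image/")) return "image";
  if (mime === "application/pdf") return "pdf";
  return "download";
}

// Adding a version returns a new record; existing versions are never mutated or replaced.
function appendVersion(artifact: FileArtifact, next: FileVersion): FileArtifact {
  if (artifact.versions.some((v) => v.versionId === next.versionId)) {
    throw new Error(`version ${next.versionId} already exists and is immutable`);
  }
  return {
    ...artifact,
    versions: [...artifact.versions, next],
    latestVersionId: next.versionId,
    viewerHint: viewerHintFor(next.mime),
  };
}
```

The point of the sketch is that `artifactId` is carried unchanged through `appendVersion`, while each version keeps its own digest and blob ref.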
Tenant/version-scoped blob identity. Physical sha256 dedupe is internal only — never exposed and never used for authorization or cross-tenant existence inference. Blob lookup is always scoped by org_id + artifact-version.
One canonical write path. The artifact service layer (Phase 324) is the only writer. Library UI and MCP CRUD (Phase 325) call the service — never a second write path, never raw blob/object writes.
Immutable, replay-safe refs. A message/run ArtifactRef pins a specific version + digest. Referenced artifacts are tombstone-deleted, never hard-deleted; a referenced version’s bytes are retained.
LLM orchestration is the sole file consumer. WayFlow never consumes files. The prompt window attaches an artifact ref; resolution + provider upload/attach happen only in @cinatra-ai/llm-orchestration via /api/llm-bridge.
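The first invariant in this list (no bytes in objects.data) lends itself to a code-level guard. The sketch below is one possible shape for such a check, not the guard the later phases will ship: `findEmbeddedBytes`, the 256-character base64 threshold, and the data-URI pattern are all assumptions made for illustration.

```typescript
// Hypothetical guard: walks a JSON-like value and reports paths that look like embedded bytes.
// The 256-character threshold is an assumption; a 64-char sha256 hex digest stays well below it.
const BASE64_RUN = /^[A-Za-z0-9+/]{256,}={0,2}$/;
const DATA_URI = /^data:[^;,]+;base64,/i;

function findEmbeddedBytes(value: unknown, path = "data"): string[] {
  // Raw binary containers are never valid metadata.
  if (value instanceof Uint8Array || value instanceof ArrayBuffer) return [path];
  if (typeof value === "string") {
    return DATA_URI.test(value) || BASE64_RUN.test(value) ? [path] : [];
  }
  if (Array.isArray(value)) {
    return value.flatMap((item, i) => findEmbeddedBytes(item, `${path}[${i}]`));
  }
  if (value !== null && typeof value === "object") {
    return Object.entries(value as Record<string, unknown>).flatMap(([key, child]) =>
      findEmbeddedBytes(child, `${path}.${key}`),
    );
  }
  return [];
}
```

A service-layer write could call this on the candidate `data` payload and refuse the write when the returned list is non-empty; the returned paths make the violation easy to locate in a test failure.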

3. Threat model

Threat	Vector	Mitigation (phase)
Cross-tenant file disclosure	global sha dedupe / unscoped blob path / predictable URLs	tenant+version-scoped blob identity; authz on every serve; signed/internal URLs (318, 322)
Stored-XSS / drive-by	serving HTML/SVG/PDF inline as active content	`Content-Disposition: attachment` for non-safe types, strict CSP, MIME sniffing disabled (`X-Content-Type-Options: nosniff`), no-exec storage path (322)
Path traversal / RCE on upload	crafted filename / archive	server-generated storage keys, never client filename on disk; extension allow/deny; size cap; malware-scan hook point (321)
Secret/data exfiltration into graph memory	Graphiti projector serializes full `objects.data`	metadata/excerpt-only projection policy lands before the first artifact write (320)
Privilege escalation via artifact ownership	reassigning owner to widen access	promote-only ratchet; reassignment = explicit audited transfer; narrowing conservative if referenced (323)
Replay/audit gap	hard-deleting a referenced artifact	tombstone + retention + audit log on create/delete/transfer/promote (323)
Model hallucinating file access	non-ingestible type silently dropped	structured “attached, not directly readable” manifest delivered to the model (326)
Unbounded provider re-upload / cost	re-uploading the same blob each turn	provider-ref cache keyed by artifact-version + provider, with GC (328)
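The first two rows of the table (scoped blob lookup, safe serving) can be sketched together. This is an illustration under stated assumptions: `BlobRow`, `findBlob`, `serveHeaders`, and the inline allowlist are hypothetical, and treating the MIME-sniffing mitigation as `X-Content-Type-Options: nosniff` is an interpretation of the table, not a confirmed design decision.

```typescript
// Hypothetical row shape for the blob-metadata table.
interface BlobRow {
  orgId: string;
  versionId: string;
  storageKey: string; // server-generated, never derived from the client filename
  mime: string;
}

// Assumed allowlist of types that may render inline; everything else is downloaded.
const INLINE_SAFE = new Set(["image/png", "image/jpeg", "image/gif", "image/webp", "text/plain"]);

// Lookup is always scoped by org + artifact-version; a digest alone never resolves a blob.
function findBlob(rows: readonly BlobRow[], orgId: string, versionId: string): BlobRow | undefined {
  return rows.find((row) => row.orgId === orgId && row.versionId === versionId);
}

function serveHeaders(row: BlobRow): Record<string, string> {
  const mime = (row.mime.split(";")[0] ?? "").trim().toLowerCase();
  const inline = INLINE_SAFE.has(mime);
  const safeName = row.storageKey.replace(/[^A-Za-z0-9._-]/g, "_");
  return {
    "Content-Type": inline ? mime : "application/octet-stream",
    "Content-Disposition": inline ? "inline" : `attachment; filename="${safeName}"`,
    "X-Content-Type-Options": "nosniff", // browser must not sniff a different type
    "Content-Security-Policy": "default-src 'none'; sandbox", // no scripts, no subresources
  };
}
```

Because `findBlob` takes no digest parameter, a caller in another org cannot use a known sha256 to probe for a blob's existence, which is the property the tenant/version-scoped identity invariant asks for.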

4. `ArtifactRef` (normalized, immutable)

type ArtifactRef = {
  artifactId: string;        // stable across versions
  versionId:  string;        // pinned, immutable
  digest:     string;        // sha256 of the pinned version's bytes
  mime:       string;
  originKind: 'upload' | 'email_attachment' | 'agent_generated' | 'external_link' | 'live_generator';
};

Stored in normalized storage (refs table); chat-thread JSON may carry a projection/cache of the ref, never the canonical record, never bytes.
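To illustrate what "pins a specific version + digest" buys at replay time, here is a small sketch that checks resolved bytes against a ref. The helper names (`sha256Hex`, `verifyPinnedBytes`, `toThreadProjection`) are hypothetical, and hex encoding of the digest is an assumption; the spec only says sha256.

```typescript
import { createHash } from "node:crypto";

type ArtifactRef = {
  artifactId: string;
  versionId: string;
  digest: string; // sha256 of the pinned version's bytes (hex encoding assumed)
  mime: string;
  originKind: "upload" | "email_attachment" | "agent_generated" | "external_link" | "live_generator";
};

function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// True only when the bytes resolved for this ref are exactly the pinned bytes.
function verifyPinnedBytes(ref: ArtifactRef, bytes: Uint8Array): boolean {
  return sha256Hex(bytes) === ref.digest.toLowerCase();
}

// Chat-thread JSON carries a frozen copy of the ref fields only: a cache, never the bytes.
function toThreadProjection(ref: ArtifactRef): Readonly<ArtifactRef> {
  return Object.freeze({ ...ref });
}
```

A replay that resolves the ref and gets `false` from `verifyPinnedBytes` has detected that the stored bytes no longer match what the message originally saw.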

5. Extension-kind security

kind:"artifact" extensions are metadata-only (descriptor: type, viewer hint, capabilities, optional resolver). They MUST NOT contain cinatra/oas.json (so WayFlow’s agent loader never mounts them) and MUST NOT carry executable host code paths beyond the descriptor + resolver contract. The ArtifactExtensionTypeHandler validates cinatra.kind:"artifact" + @cinatra-ai/<slug>-artifact naming + absence of oas.json.
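The three checks named above can be sketched as a pure validation function. This is not the ArtifactExtensionTypeHandler itself: `PackageSnapshot`, `validateArtifactExtension`, and the slug character set (lowercase alphanumerics separated by hyphens) are assumptions made for the example.

```typescript
// Hypothetical view of an extension package as seen by the validator.
interface PackageSnapshot {
  packageName: string;
  cinatraKind: unknown; // value of cinatra.kind in package.json
  files: readonly string[]; // package-relative file paths
}

// @cinatra-ai/<slug>-artifact, where the slug charset is an assumption.
const ARTIFACT_NAME = /^@cinatra-ai\/[a-z0-9]+(?:-[a-z0-9]+)*-artifact$/;

function normalizePath(file: string): string {
  return file.replace(/\\/g, "/").replace(/^\.\//, "");
}

// Returns a list of violations; an empty list means the package passes all three checks.
function validateArtifactExtension(pkg: PackageSnapshot): string[] {
  const errors: string[] = [];
  if (pkg.cinatraKind !== "artifact") {
    errors.push('cinatra.kind must be "artifact"');
  }
  if (!ARTIFACT_NAME.test(pkg.packageName)) {
    errors.push("package name must match @cinatra-ai/<slug>-artifact");
  }
  if (pkg.files.some((file) => normalizePath(file) === "cinatra/oas.json")) {
    errors.push("artifact extensions must not contain cinatra/oas.json");
  }
  return errors;
}
```

Returning all violations rather than throwing on the first keeps install-time error messages complete, which is a design choice of this sketch rather than something the spec mandates.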

5a. Pre-existing systemic gap — dedicated prerequisite phase (Phase 313.1)

Codex round-2 review of Stage 0 surfaced a pre-existing (not v5.0-introduced) systemic gap that must be its own phase, sequenced before any artifact registry-install / marketplace surface work (i.e. before Stage 3 Phase 324):

Phase 313.1 — Extension-dispatch config-DI + kind-agnostic registry listing.

ensureConfig() (Phase 222 DI) throws when getAgentPackage() / getPublishedExtensionKind() are called without an explicit VerdaccioConfig. The extension install/update/uninstall/archive/restore dispatch in packages/extensions/src/actions.ts + mcp/handlers.ts calls these without config and swallows the throw → deriveTypeId(null) → "agent". This means non-agent extension kinds (skill / connector / artifact) are silently mis-dispatched to the agent handler on main today — a latent correctness bug independent of this milestone. Fix: load loadVerdaccioConfigForServer() once at the server/MCP boundary and thread the resolved VerdaccioConfig into resolveExtensionTypeId + every getAgentPackage/getPublishedExtensionKind call (≈14 sites). This corrects agent/skill/connector and artifact dispatch together.
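The failure mode described above is easy to reproduce in miniature. The sketch below uses stand-ins, not the real functions: `lookupKind` plays the role of a config-requiring registry call, `VerdaccioConfigLike` is a hypothetical placeholder for VerdaccioConfig, and the name-suffix lookup is a toy.

```typescript
type ExtensionKind = "agent" | "skill" | "connector" | "artifact";

// Hypothetical stand-in for VerdaccioConfig.
interface VerdaccioConfigLike {
  registryUrl: string;
}

// Stand-in for a registry call that throws without explicit config, like ensureConfig().
function lookupKind(pkgName: string, config?: VerdaccioConfigLike): ExtensionKind {
  if (!config) throw new Error("VerdaccioConfig is required");
  return pkgName.endsWith("-artifact") ? "artifact" : "agent"; // toy lookup for the example
}

// The behaviour described above: the throw is swallowed and null falls through to "agent".
function resolveTypeIdSwallowing(pkgName: string): ExtensionKind {
  let kind: ExtensionKind | null = null;
  try {
    kind = lookupKind(pkgName);
  } catch {
    // swallowed: this is the latent bug
  }
  return kind ?? "agent";
}

// The proposed shape: config is resolved once at the boundary and threaded through,
// so a missing config fails loudly instead of mis-dispatching.
function resolveTypeIdWithConfig(pkgName: string, config: VerdaccioConfigLike): ExtensionKind {
  return lookupKind(pkgName, config);
}
```

With the same package name, the first resolver reports "agent" and the second reports "artifact", which is the mis-dispatch the fix is meant to remove.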

listAgentPackages() extracts agent.json for every package and drops the rest, so skill/connector/artifact packages never appear in the registry marketplace listing. Add a kind-agnostic listExtensionPackages() summary path that reads cinatra.kind from the packument package.json, using the agent payload only when kind === "agent". Until then, in-tree built-in artifact extensions (like connectors today) are not registry-listed — an accepted interim, not a v5.0 regression.
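A kind-agnostic listing could look roughly like the following. The `PackumentLike` and `ExtensionSummary` shapes are hypothetical, and skipping packages that declare no cinatra.kind is an assumption of this sketch; the real function may treat such packages differently.

```typescript
// Hypothetical, simplified view of a registry packument.
interface PackumentLike {
  pkgName: string;
  packageJson: { cinatra?: { kind?: string } };
  agentJson?: { title: string }; // present only for agent packages
}

interface ExtensionSummary {
  pkgName: string;
  kind: string;
  agent?: { title: string };
}

// Lists every package that declares a cinatra.kind; the agent payload is attached
// only when kind === "agent", so other kinds are no longer dropped.
function listExtensionPackages(packuments: readonly PackumentLike[]): ExtensionSummary[] {
  const summaries: ExtensionSummary[] = [];
  for (const packument of packuments) {
    const kind = packument.packageJson.cinatra?.kind;
    if (!kind) continue; // assumption: packages without a declared kind are skipped
    const summary: ExtensionSummary = { pkgName: packument.pkgName, kind };
    if (kind === "agent" && packument.agentJson) {
      summary.agent = packument.agentJson;
    }
    summaries.push(summary);
  }
  return summaries;
}
```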

Scope rationale: this is a registries-wide DI/listing change affecting all extension kinds; folding it into Stage 0 would conflate a pre-existing platform fix with the artifact feature. It is tracked here as the canonical routing per the GSD deferral policy and must land before Phase 324.

The Stage-0 kind-agnostic resolveExtensionTypeId / getPublishedExtensionKind plumbing is the correct shape; Phase 313.1 supplies the missing VerdaccioConfig so it actually resolves instead of falling through to "agent".

6. Verification posture (worktree, no live server)

Code-level only in-worktree: pnpm typecheck, package vitest, targeted source greps (invariant guards). Live UAT (upload→chat→library→MCP, browser, OAuth) is GSD step-7.2 work on main post-merge — out of the worktree’s scope.