Release Workflows (v6.0)

AI-assisted, reusable product release workflows. A user describes a release in /chat; Cinatra instantiates a template, the user manages it on an interactive Gantt, approves human gates, and a durable reconciler executes the multi-week process on Cinatra’s existing stack.

Design of record: .planning/milestones/v6.0-MILESTONE.md (plus v6.0-REQUIREMENTS.md and v6.0-ROADMAP.md). Brainstormed and adversarially reviewed with Codex across six rounds.

Core principle

OpenAI (in /chat) helps CREATE/revise the workflow (proposal-only, via MCP).
Cinatra OWNS the spec, templates, validation, authz, approvals, the Gantt, audit.
A durable BullMQ reconciler EXECUTES the approved workflow (Postgres = source of truth).
WayFlow RUNS the agents (one step type among several).

A release workflow is a first-class calendar-driven process DAG — not an agent. agent_task is one of six heterogeneous step types (agent_task / approval / manual / notification / wait / checkpoint).
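One way to picture the six heterogeneous step types is as a discriminated union, so each kind carries only its own fields. The following is a minimal sketch, not the package's actual spec: the six type tags and agentRef come from this document, while every other field name (approverRole, instructions, channel, days, label) and the describeStep helper are hypothetical.

```typescript
// The six `type` tags come from the text above; every field other than
// agentRef is a hypothetical placeholder for illustration.
type Step =
  | { type: "agent_task"; agentRef: string }
  | { type: "approval"; approverRole: string }
  | { type: "manual"; instructions: string }
  | { type: "notification"; channel: string }
  | { type: "wait"; days: number }
  | { type: "checkpoint"; label: string };

// Exhaustive switch: the compiler rejects this function if a step type
// is added to the union without a matching case.
function describeStep(step: Step): string {
  switch (step.type) {
    case "agent_task":
      return `run agent ${step.agentRef}`;
    case "approval":
      return `await approval from ${step.approverRole}`;
    case "manual":
      return `manual: ${step.instructions}`;
    case "notification":
      return `notify ${step.channel}`;
    case "wait":
      return `wait ${step.days} day(s)`;
    case "checkpoint":
      return `checkpoint ${step.label}`;
    default: {
      const unreachable: never = step;
      throw new Error(`unknown step: ${JSON.stringify(unreachable)}`);
    }
  }
}

// A tiny plan mixing three of the six step types.
const plan: Step[] = [
  { type: "agent_task", agentRef: "draft-release-notes" },
  { type: "approval", approverRole: "release-manager" },
  { type: "notification", channel: "launch-announcements" },
];
const summary = plan.map(describeStep);
```

Modelling agent_task as just one variant keeps the workflow a process DAG: code that walks the plan never assumes a step is an agent.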

Architecture decisions (why)

No Temporal, no n8n, no new service. Reuse Postgres + BullMQ + WayFlow. A workflow is mutable product data (editable mid-flight), not deployed code, and it must be packageable as a marketplace extension. Both requirements rule out Temporal and n8n.
Postgres is the single source of truth (D3). The executor is a thin driver: wake → ask Postgres what’s due & unblocked → dispatch idempotently → record event → sleep. No execution state in code → editing an active workflow is tractable.
Three orthogonal gates (D12) — timing / dependency / approval — in a per-task ledger; a node dispatches only when all three pass (atomic CAS).
depends_on (execution) ≠ schedule.anchor (timing) (D7). Cascade follows schedule expressions, not dependency edges.
Approvals are workflow-native + human-only (D11). Never assistant-callable.
Triggers do not nest in workflows (D14). A node’s schedule is the timing truth.
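The thin-driver loop (D3) and the three-gate rule (D12) can be sketched together. This is a minimal in-memory illustration under stated assumptions, not the engine's code: a Map stands in for Postgres, and the names LedgerTask, claim, and tick are hypothetical. In a real store the claim would presumably be a single conditional UPDATE that succeeds only if one row was affected.

```typescript
type GateKind = "timing" | "dependency" | "approval";

interface LedgerTask {
  id: string;
  state: "pending" | "dispatched";
  gates: Record<GateKind, boolean>; // true = gate has passed (assumed shape)
}

// In-memory stand-ins for the Postgres tables and the event log.
const table = new Map<string, LedgerTask>();
const eventLog: string[] = [];

function allGatesPass(t: LedgerTask): boolean {
  return t.gates.timing && t.gates.dependency && t.gates.approval;
}

// Compare-and-set claim: flips pending -> dispatched only if the task is
// still pending and all three gates pass at claim time.
function claim(id: string): boolean {
  const t = table.get(id);
  if (!t || t.state !== "pending" || !allGatesPass(t)) return false;
  t.state = "dispatched";
  return true;
}

// One reconciler tick: ask the store what is due and unblocked, claim each
// candidate, dispatch it, and record an event. No state lives in the driver.
function tick(dispatch: (taskId: string) => void): number {
  const candidates = Array.from(table.values()).filter(
    (t) => t.state === "pending" && allGatesPass(t),
  );
  let dispatched = 0;
  for (const t of candidates) {
    if (!claim(t.id)) continue; // lost the race or a gate closed; skip
    dispatch(t.id);
    eventLog.push(`dispatched:${t.id}`);
    dispatched++;
  }
  return dispatched;
}

table.set("build", {
  id: "build",
  state: "pending",
  gates: { timing: true, dependency: true, approval: true },
});
table.set("announce", {
  id: "announce",
  state: "pending",
  gates: { timing: true, dependency: true, approval: false },
});

const sent: string[] = [];
const firstTick = tick((id) => sent.push(id));
const secondTick = tick((id) => sent.push(id)); // nothing new is due
```

Because the claim re-checks state and gates, running the same tick twice dispatches nothing the second time, and a task whose approval gate is still closed is never picked up.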

Package: `packages/release-workflows`

Foundation (Phase 436): the spec + data model + resolver + state machines + scope scaffold + object registration. No execution, no UI yet. See the package AGENTS.md for invariants.

Data model (Postgres, `cinatra` schema)

workflow_template: immutable versions
workflow: mutable, versioned, lock_version CAS
workflow_task: heterogeneous; planned/actual + anchor timing; first-class failure/retry/cancel/missed-window policy; per-task CAS
workflow_dependency: per-edge outcome
workflow_gate: ledger with explainability
workflow_event: append-only operational log
workflow_task_attempt: unique idempotency key (at-least-once guard)
workflow_artifact
workflow_approval: own solicitation schedule + deadline + review-packet hash + resolved approvers

Schedule resolver

Server-side, DST-correct (@date-fns/tz). Relative offsets resolve as calendar durations in the task/release tz then convert to UTC; pinned tasks keep absolute dates; a release-date move yields a cascade diff for unpinned tasks (the Gantt commits via CAS). A relative anchor resolves to the anchor task’s canonical dueAtUtc; pinned tasks freeze as cascade anchors.
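To illustrate why offsets are resolved as calendar durations in the release time zone rather than as fixed millisecond deltas, here is a small sketch. It deliberately uses only the built-in Intl API instead of @date-fns/tz so that it runs without dependencies; it is not the resolver's implementation. The names tzOffsetMs, addCalendarDays, SketchTask, and cascadeDiff are hypothetical, and for simplicity every unpinned task is assumed to be anchored directly to the release date.

```typescript
// Offset of `timeZone` from UTC (in ms) at a given instant, derived via Intl.
function tzOffsetMs(instantMs: number, timeZone: string): number {
  const dtf = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  });
  const p: Record<string, number> = {};
  for (const part of dtf.formatToParts(new Date(instantMs))) {
    if (part.type !== "literal") p[part.type] = Number(part.value);
  }
  const wallAsUtc = Date.UTC(p.year, p.month - 1, p.day, p.hour, p.minute, p.second);
  return wallAsUtc - Math.floor(instantMs / 1000) * 1000;
}

// Add whole calendar days in the zone, keeping the local wall-clock time,
// then convert the resulting wall time back to a UTC instant.
function addCalendarDays(anchorUtcMs: number, days: number, timeZone: string): number {
  const wall = anchorUtcMs + tzOffsetMs(anchorUtcMs, timeZone);
  const shiftedWall = wall + days * 24 * 60 * 60 * 1000;
  // First guess uses the offset near the target; the second pass corrects
  // the guess if it landed on the other side of a DST transition.
  let utc = shiftedWall - tzOffsetMs(shiftedWall, timeZone);
  utc = shiftedWall - tzOffsetMs(utc, timeZone);
  return utc;
}

interface SketchTask {
  id: string;
  pinned: boolean;
  offsetDays: number; // relative to the release date (assumption)
  dueAtUtc: number;
}

// A release-date move yields a diff for unpinned tasks only.
function cascadeDiff(tasks: SketchTask[], newReleaseUtcMs: number, timeZone: string) {
  return tasks
    .filter((t) => !t.pinned)
    .map((t) => ({
      id: t.id,
      from: t.dueAtUtc,
      to: addCalendarDays(newReleaseUtcMs, t.offsetDays, timeZone),
    }))
    .filter((d) => d.from !== d.to);
}

const tz = "America/Los_Angeles";
// 09:00 local on 2024-03-08 (PST); US DST started on 2024-03-10.
const anchor = Date.parse("2024-03-08T17:00:00Z");
const calendarDue = addCalendarDays(anchor, 3, tz);
const naiveDue = anchor + 3 * 24 * 60 * 60 * 1000;
```

Here calendarDue stays at 09:00 local time across the DST change, while naiveDue drifts to 10:00 local. That one-hour drift is the class of error the DST-correct resolver avoids.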

Agent steps (Phase 441)

agent_task steps dispatch a child agent run and poll it until it reaches a terminal state, durably and crash-safely. The leaf package can’t reach the app-layer enqueue chokepoint, so the host injects two functions at engine boot (src/lib/release-workflow-agent-executor.ts, wired in src/instrumentation.node.ts):

executor — resolves the task’s agentRef to a template, calls createAgentRun(...) and enqueueAgentRun(...), returns running + the childRunId. Delegated provenance (auth-derived orgId, runBy, projectId, workflowId, workflowTaskId) is stamped on the child run. Uses softPreflight because the reconciler has no live session actor — a missing connector then surfaces as a run failure at execution (captured by retry/dead-letter), not a hard enqueue block.
poller (getChildRunStatus) — maps an agent_run status to terminal / failed / HITL.
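The injection pattern described above can be sketched as a pair of hooks that the leaf package holds but never implements. This is an illustrative shape only: the names AgentHooks, injectAgentHooks, dispatchAgentTask, and pollAgentTask are hypothetical, the hooks are synchronous for brevity (the real ones would presumably be async), and "running" is an assumed value for a child that has not yet settled.

```typescript
// Outcomes named in the text above, plus an assumed "running" for a child
// that is not yet terminal.
type ChildOutcome = "running" | "terminal" | "failed" | "hitl";

interface AgentTaskRef {
  workflowId: string;
  taskId: string;
  attemptNo: number;
  agentRef: string;
}

interface AgentHooks {
  executor: (task: AgentTaskRef) => { status: "running"; childRunId: string };
  getChildRunStatus: (childRunId: string) => ChildOutcome;
}

// Leaf-package side: holds whatever the host injected at boot.
let injected: AgentHooks | null = null;

function injectAgentHooks(hooks: AgentHooks): void {
  injected = hooks;
}

function dispatchAgentTask(task: AgentTaskRef) {
  if (injected === null) throw new Error("agent hooks not injected");
  return injected.executor(task);
}

function pollAgentTask(childRunId: string): ChildOutcome {
  if (injected === null) throw new Error("agent hooks not injected");
  return injected.getChildRunStatus(childRunId);
}

// Host side: a fake implementation standing in for the app-layer
// createAgentRun / enqueueAgentRun calls.
const fakeRuns = new Map<string, ChildOutcome>();
injectAgentHooks({
  executor: (task) => {
    const childRunId = `run:${task.workflowId}:${task.taskId}:${task.attemptNo}`;
    fakeRuns.set(childRunId, "running");
    return { status: "running", childRunId };
  },
  getChildRunStatus: (id) => fakeRuns.get(id) ?? "failed",
});

const started = dispatchAgentTask({
  workflowId: "wf1",
  taskId: "notes",
  attemptNo: 1,
  agentRef: "draft-release-notes",
});
```

Keeping the hooks behind an interface means the leaf package depends only on their shape, so the host can swap in a fake like the one above for tests.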

Idempotency (at-least-once safe). The reconciler’s per-attempt key ${workflowId}:${taskId}:${attemptNo} is passed verbatim to createAgentRun, which is race-safe: a redispatch of the same attempt catches the partial-unique violation (agent_runs_idempotency_key_uniq), re-reads, and returns the same child run only if its provenance matches (else fails closed). A retry (new attemptNo → new key) spawns a fresh run.
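The race-safe behaviour described above (insert, catch the unique violation, re-read, compare provenance, otherwise fail closed) can be sketched with an in-memory map standing in for the agent_runs table and its partial-unique index. The function names, the trimmed Provenance shape, and the thrown error are assumptions for illustration; 23505 is the standard Postgres SQLSTATE for unique_violation.

```typescript
interface Provenance {
  orgId: string;
  workflowId: string;
  workflowTaskId: string;
}

interface AgentRunRow {
  id: string;
  idempotencyKey: string;
  provenance: Provenance;
}

// Stand-in for the table plus its unique index on the idempotency key.
const runsByKey = new Map<string, AgentRunRow>();
let nextRunNo = 1;

function insertRun(key: string, provenance: Provenance): AgentRunRow {
  if (runsByKey.has(key)) {
    // Mimic the database raising a unique_violation.
    throw Object.assign(new Error(`duplicate key: ${key}`), { code: "23505" });
  }
  const row = { id: `run_${nextRunNo++}`, idempotencyKey: key, provenance };
  runsByKey.set(key, row);
  return row;
}

function isUniqueViolation(err: unknown): boolean {
  return err instanceof Error && (err as { code?: string }).code === "23505";
}

// Hypothetical wrapper: a redispatch of the same attempt returns the same
// child run, but only if its provenance matches; otherwise it fails closed.
function createAgentRunIdempotent(key: string, provenance: Provenance): AgentRunRow {
  try {
    return insertRun(key, provenance);
  } catch (err) {
    if (!isUniqueViolation(err)) throw err;
    const existing = runsByKey.get(key);
    const matches =
      existing !== undefined &&
      existing.provenance.orgId === provenance.orgId &&
      existing.provenance.workflowId === provenance.workflowId &&
      existing.provenance.workflowTaskId === provenance.workflowTaskId;
    if (!existing || !matches) {
      throw new Error("idempotency key reused with different provenance");
    }
    return existing;
  }
}

// Per-attempt key in the format given in the text above.
const attemptKey = (workflowId: string, taskId: string, attemptNo: number) =>
  `${workflowId}:${taskId}:${attemptNo}`;

const prov: Provenance = { orgId: "org1", workflowId: "wf1", workflowTaskId: "t1" };
const first = createAgentRunIdempotent(attemptKey("wf1", "t1", 1), prov);
const redispatch = createAgentRunIdempotent(attemptKey("wf1", "t1", 1), prov);
const retry = createAgentRunIdempotent(attemptKey("wf1", "t1", 2), prov);
```

In this sketch first and redispatch are the same run, while retry is a new one because the attempt number changed the key.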

Poll ordering. Child statuses are read OUTSIDE the per-workflow advisory lock, then the workflow_task is settled UNDER the lock with a CAS that fires only if the task is still running and the attempt’s child_run_id still matches, so a stale read can never clobber a re-claimed task. On success the child run is linked as a workflow_artifact. HITL (the child run paused for human input) bubbles a single agent_hitl event and leaves the task running for the approvals UI (Phase 442).

A dispatch that crashed before recording a child id is not auto-recovered: it is indistinguishable from a slow in-flight dispatch, so retrying it would risk a duplicate run. Recovering it safely needs a durable dispatch lease (deferred); in the meantime findStuckTasks surfaces such a stuck agent_task.
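The settle-under-lock CAS can be illustrated with a small in-memory sketch. It is a simplification under stated assumptions: the lock itself is omitted, the child run id is stored directly on the task rather than on the attempt row, and the names PolledTask and settleIfCurrent are hypothetical. In SQL the same guard would presumably be a conditional UPDATE whose WHERE clause checks both the running state and the child run id.

```typescript
interface PolledTask {
  id: string;
  state: "running" | "succeeded" | "failed";
  childRunId: string | null;
}

// Compare-and-set settle: applies the outcome only if the task is still
// running AND still bound to the child run whose status was read earlier.
function settleIfCurrent(
  task: PolledTask,
  observedChildRunId: string,
  outcome: "succeeded" | "failed",
): boolean {
  if (task.state !== "running" || task.childRunId !== observedChildRunId) {
    return false; // stale read: the task moved on, leave it untouched
  }
  task.state = outcome;
  return true;
}

// Scenario: the poller read run_1 as failed (outside the lock), but before
// it could settle, the task was re-claimed by a retry bound to run_2.
const polled: PolledTask = { id: "notes", state: "running", childRunId: "run_1" };
const observed = { childRunId: "run_1", outcome: "failed" as const };

polled.childRunId = "run_2"; // the retry re-claims the task

const staleApplied = settleIfCurrent(polled, observed.childRunId, observed.outcome);
const freshApplied = settleIfCurrent(polled, "run_2", "succeeded");
```

The stale result for run_1 is dropped, and only the status read for the current child run settles the task.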

Phasing

436 spec foundation → 437 chat creation + MCP tools → 438 interactive Gantt → 439 durable engine core → 440 lifecycle/notifications → 441 agent steps → 442 approvals/governance/active-edit → 443 marketplace packaging (v1.1).