Acceptance criteria are project-scoped quality gates that are checked on every milestone pull request. Each one is a quantitative, deterministic threshold (“lint warnings must be ≤ 0”, “line coverage must be ≥ 80%”, “p95 test runtime must be ≤ 30s”) that the modernized code has to satisfy before a milestone can ship. Acceptance criteria build directly on your Lifecycle Setup: each custom criterion runs one or more of your lifecycle scripts and reads a measurable signal from the output. Setting them up is optional, and a project ships fine with only the built-in gates, but they let you encode the quality bar your team already cares about so that it is enforced automatically.

Built-in vs. custom criteria

Acceptance criteria come in two flavors. Built-in criteria ship out-of-the-box and are inherited from your lifecycle scripts — you don’t configure them:
Criterion | When it applies | What it checks
Build must succeed | Your lifecycle has a build script | The project builds without errors (compile / type-check / link errors surface here)
Tests must pass | Your lifecycle has a test script | The configured test suite passes end-to-end
Custom criteria are the ones you define for your project. Each custom criterion checks an aggregate metric against a fixed absolute threshold, measured from a single snapshot (one script run). A single criterion can include multiple metrics. Common examples:
  • Lint warnings must be ≤ 0
  • Line coverage must be ≥ 80%
  • Max cyclomatic complexity must be ≤ 10
  • High-severity audit findings must be 0
  • p95 test runtime must be ≤ 30s
Acceptance criteria are not tests. A test asserts a specific behavior (“add(2, 2) returns 4”, “the login endpoint returns 200”). An acceptance criterion gates an aggregate number against a fixed threshold. Behavior assertions belong in your test suite, where the built-in Tests must pass criterion already gates them.
A criterion can also gate several metrics at once (for example one code_coverage criterion requiring both line coverage ≥ 80% and branch coverage ≥ 70%), or span multiple services (one lint criterion measuring both a backend and a frontend script). In those cases every threshold must pass for the criterion to pass.
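The "every threshold must pass" rule can be sketched as follows. This is a minimal illustration, not the product's data model: the `Threshold` and `Criterion` classes, the metric keys, and the set of comparison operators are all assumptions made for the example.

```python
from __future__ import annotations

import operator
from dataclasses import dataclass

# Comparison operators a threshold may use (an assumption for this sketch).
_OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


@dataclass(frozen=True)
class Threshold:
    metric: str   # hypothetical metric key, e.g. "line_coverage"
    op: str       # one of "<=", ">=", "=="
    limit: float

    def passes(self, measured: float) -> bool:
        return _OPS[self.op](measured, self.limit)


@dataclass(frozen=True)
class Criterion:
    name: str
    thresholds: tuple[Threshold, ...]

    def evaluate(self, measured: dict[str, float]) -> bool:
        # The criterion passes only if every threshold passes.
        # A metric missing from the measurements counts as a failure.
        return all(
            t.metric in measured and t.passes(measured[t.metric])
            for t in self.thresholds
        )


# One criterion gating two metrics, mirroring the coverage example above.
code_coverage = Criterion(
    name="code_coverage",
    thresholds=(
        Threshold("line_coverage", ">=", 80.0),
        Threshold("branch_coverage", ">=", 70.0),
    ),
)
```

Calling `code_coverage.evaluate({"line_coverage": 85.0, "branch_coverage": 65.0})` returns `False`, because branch coverage misses its threshold even though line coverage clears its own.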

How criteria are set up

You define acceptance criteria through the chat, without filling out a form by hand. There are three ways they get created:
  1. During onboarding (optional). After your lifecycle is validated, you are asked whether you want to set up acceptance criteria. If you say yes, you are walked through discovering extra named scripts (lint, coverage, and so on), reviewing criteria auto-suggested from those scripts, and adding any custom ones. If you skip, you can do all of this later from the chat.
  2. Auto-discovery. Ask the chat something like “what acceptance criteria would you suggest?” and it inspects your lifecycle scripts, runs them, and proposes threshold gates from the signals it finds. For example: “I see npm run lint reports 12 warnings. Should we gate PRs so the warning count never goes up?” You accept, adjust the threshold, or decline each proposal.
  3. Manual requests. Describe a check in plain language, such as “add a criterion that line coverage stays at or above 80%”, and the chat confirms the exact metric and threshold with you before saving it.
Each proposed criterion comes with the value measured on your source app and a suggested threshold. Choose the current value to prevent regressions, or a stricter threshold to drive improvement.
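To make the idea of "reading a measurable signal from script output" concrete, here is a small sketch. The summary line format, the function names, and the `stricter_by` parameter are hypothetical; the document does not describe how signals are actually extracted.

```python
import re
from typing import Optional


def read_warning_count(output: str) -> Optional[int]:
    """Pull a warning count out of a lint summary line, if one is present."""
    match = re.search(r"(\d+)\s+warnings?\b", output)
    return int(match.group(1)) if match else None


def suggest_threshold(measured: int, stricter_by: int = 0) -> int:
    """Propose a 'must be <= N' threshold for a count-style metric.

    With stricter_by=0 the current value becomes the threshold (prevents
    regressions); a positive stricter_by asks for improvement. The result
    never goes below zero.
    """
    return max(0, measured - stricter_by)


# Hypothetical summary line from a lint script.
sample_output = "12 problems (0 errors, 12 warnings)"
measured = read_warning_count(sample_output)
```

With this sample, `measured` is 12, so `suggest_threshold(measured)` proposes holding the line at 12 warnings, while `suggest_threshold(measured, stricter_by=4)` proposes tightening the gate to 8.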

Baselines

When a custom criterion is added, a baseline is captured: a one-time measurement of the metric on your source (origin) application. The baseline is the reference point that milestone results are compared against, so you can see whether the modernized code held the line, improved, or regressed. Baseline capture runs in the background, and how long it takes depends on the lifecycle scripts it has to run. Each criterion shows its baseline state as a badge:
Badge | Meaning
Baseline captured | The metric was measured on origin and is ready to compare against
Baseline pending | Capture is queued or running
Baseline stale — script changed | A referenced lifecycle script changed; the baseline needs to be re-measured
Orphaned — script missing | A referenced lifecycle script no longer exists (see Orphaned criteria)
Baseline N/A | A target-only criterion with no origin counterpart, so there is nothing to baseline
Changing a criterion’s threshold or the script it reads, or editing the underlying lifecycle script, invalidates the baseline. The baseline is then re-measured automatically in the background.
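One way to picture how a badge could follow from a criterion's state is sketched below. Everything here is an assumption for illustration: the `Badge` enum, the `Baseline` record, the precedence of the checks, and detecting script changes by hashing the script body are not taken from the product. For brevity the sketch tracks a single script per criterion and covers only script edits, not threshold changes.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Badge(Enum):
    CAPTURED = "captured"
    PENDING = "pending"
    STALE = "stale"
    ORPHANED = "orphaned"
    NOT_APPLICABLE = "n/a"


def fingerprint(script_body: str) -> str:
    """Stable hash of a script's text, used to notice later edits."""
    return hashlib.sha256(script_body.encode("utf-8")).hexdigest()


@dataclass
class Baseline:
    value: Optional[float]      # None while capture is queued or running
    script_fingerprint: str     # fingerprint of the script when captured


def badge(
    baseline: Optional[Baseline],
    current_script: Optional[str],
    target_only: bool = False,
) -> Badge:
    if target_only:
        return Badge.NOT_APPLICABLE      # nothing on origin to baseline
    if current_script is None:
        return Badge.ORPHANED            # referenced script no longer exists
    if baseline is None or baseline.value is None:
        return Badge.PENDING             # capture has not finished
    if baseline.script_fingerprint != fingerprint(current_script):
        return Badge.STALE               # script changed since capture
    return Badge.CAPTURED


captured = Baseline(value=12.0, script_fingerprint=fingerprint("npm run lint"))
```

Here `badge(captured, "npm run lint")` yields `Badge.CAPTURED`, while passing an edited script body yields `Badge.STALE`, which is the point at which a re-measurement would be needed.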

Origin and target criteria

Like lifecycle configuration, acceptance criteria are tracked separately for origin (your source application) and target (the modernized application). The Acceptance Criteria view has Origin and Target tabs.
  • New criteria default to origin.
  • Most origin criteria are translated to the target automatically so the same quality bar applies to the modernized stack — even when the toolchain differs (for example, a Python linter on origin maps to the equivalent gate on a TypeScript target).
  • You can author target-only criteria for checks that only make sense on the modernized stack. These have no origin baseline and show Baseline N/A.

Viewing your criteria

Acceptance Criteria has its own folder under Project Setup in the knowledge tree (the sidebar), alongside the Origin and Target folders. Expand it and select a side:
  • Origin — criteria for your source application.
  • Target — criteria for the modernized application (shown once a target config exists).
Selecting either opens the Acceptance Criteria view for that side, which lists your gates under two subsections:
  • Built-in — the inherited Build must succeed / Tests must pass gates.
  • Custom — the criteria you added, each showing its description, the threshold it enforces, the lifecycle scripts it reads (shown as Script-<name> chips), and its baseline badge.

How criteria are enforced

During milestone review, each criterion’s lifecycle scripts are re-run against the milestone’s pull request, and the measured value is compared to the threshold. Results appear in two places:
  • The milestone’s Acceptance Criteria results tab, which shows each metric’s baseline value alongside the value measured on this milestone and whether it passed.
  • The Code Review flow, where a failing criterion surfaces as an issue you can triage.
A criterion failing doesn’t silently block progress — it shows up as a reviewable issue so you can decide how to proceed.
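The results tab pairs two separate questions for each metric: did it pass its fixed threshold, and how does it compare with the baseline? The sketch below keeps them apart. The `MetricResult` class, its field names, and the `lower_is_better` flag are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricResult:
    metric: str
    baseline: float         # value measured on origin
    measured: float         # value measured on this milestone
    limit: float            # the criterion's fixed threshold
    lower_is_better: bool   # True for counts such as lint warnings

    @property
    def passed(self) -> bool:
        # Pass/fail depends only on the threshold, never on the baseline.
        if self.lower_is_better:
            return self.measured <= self.limit
        return self.measured >= self.limit

    @property
    def trend(self) -> str:
        # The baseline only tells you the direction of travel.
        if self.measured == self.baseline:
            return "held"
        if self.lower_is_better:
            improved = self.measured < self.baseline
        else:
            improved = self.measured > self.baseline
        return "improved" if improved else "regressed"


coverage = MetricResult(
    metric="line_coverage",
    baseline=85.0,
    measured=82.0,
    limit=80.0,
    lower_is_better=False,
)
```

In this example `coverage.passed` is `True` while `coverage.trend` is `"regressed"`: the milestone still clears the 80% gate even though coverage dropped from the origin baseline. Choosing the baseline value itself as the threshold would turn that same drop into a failure.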

Orphaned criteria

A criterion becomes orphaned when one of the lifecycle scripts it reads no longer exists — usually because the script was renamed or removed. An orphaned criterion can’t be enforced, and surfaces a warning at the top of the Acceptance Criteria section. To resolve one, ask the chat to either point the criterion at the correct script or delete it. Re-measuring a baseline can’t fix an orphan on its own — the missing script has to be restored or the reference updated first.

Lifecycle Setup

Configure the scripts acceptance criteria read from

Lifecycle Strategy

Choose whether the agent builds and runs your app, or connects to a running one

Milestones

How milestones are reviewed, executed, and shipped

Code Review Chat

Triage acceptance-criteria issues found during milestone review