Built-in vs. custom criteria
Acceptance criteria come in two flavors. Built-in criteria ship out of the box and are inherited from your lifecycle scripts; you don’t configure them:

| Criterion | When it applies | What it checks |
|---|---|---|
| Build must succeed | Your lifecycle has a build script | The project builds without errors (compile / type-check / link errors surface here) |
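| Tests must pass | Your lifecycle has a test script | The configured test suite passes end-to-end |

Custom criteria are the ones you add yourself. Each gates a metric against a threshold, for example:

- Lint warnings must be ≤ 0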
- Line coverage must be ≥ 80%
- Max cyclomatic complexity must be ≤ 10
- High-severity audit findings must be 0
- p95 test runtime must be ≤ 30s

Acceptance criteria are not tests. A test asserts a specific behavior (“`add(2, 2)` returns 4”, “the login endpoint returns 200”). An acceptance criterion gates an aggregate number against a fixed threshold. Behavior assertions belong in your test suite, where the built-in Tests must pass criterion already gates them.

A single criterion can combine several thresholds (for example, a `code_coverage` criterion requiring both line coverage ≥ 80% and branch coverage ≥ 70%), or span multiple services (one lint criterion measuring both a backend and a frontend script). In those cases every threshold must pass for the criterion to pass.
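The rule that every threshold must pass can be sketched in a few lines of Python. This is only an illustration: the `Threshold` and `Criterion` classes, the metric names, and the choice to treat a missing measurement as a failure are all assumptions, not the product's actual schema or behavior.

```python
from dataclasses import dataclass
import operator
from typing import Dict, Tuple

# Hypothetical model for illustration; not the product's real data format.
OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


@dataclass(frozen=True)
class Threshold:
    metric: str   # e.g. "line_coverage"
    op: str       # one of "<=", ">=", "=="
    limit: float

    def passes(self, measured: float) -> bool:
        return OPS[self.op](measured, self.limit)


@dataclass(frozen=True)
class Criterion:
    name: str
    thresholds: Tuple[Threshold, ...]

    def passes(self, measurements: Dict[str, float]) -> bool:
        # Every threshold must pass. Assumption: a metric with no
        # measurement counts as a failure.
        return all(
            t.metric in measurements and t.passes(measurements[t.metric])
            for t in self.thresholds
        )


code_coverage = Criterion(
    name="code_coverage",
    thresholds=(
        Threshold("line_coverage", ">=", 80.0),
        Threshold("branch_coverage", ">=", 70.0),
    ),
)

print(code_coverage.passes({"line_coverage": 85.0, "branch_coverage": 72.0}))  # True
print(code_coverage.passes({"line_coverage": 85.0, "branch_coverage": 65.0}))  # False
```

In the second call the line-coverage threshold is met, but the criterion still fails because branch coverage falls below 70%.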
How criteria are set up
You define acceptance criteria through the chat, without filling out a form by hand. There are three ways they get created:
- During onboarding (optional). After your lifecycle is validated, the agent asks whether you want to set up acceptance criteria. If you say yes, it walks you through discovering extra named scripts (lint, coverage, and so on), auto-suggesting criteria from those scripts, and adding any custom ones. If you skip, you can do all of this later from the chat.
- Auto-discovery. Ask the chat something like “what acceptance criteria would you suggest?” and the agent inspects your lifecycle scripts, runs them, and proposes threshold gates from the signals it finds. For example: “I see `npm run lint` reports 12 warnings. Should we gate PRs so the warning count never goes up?” You accept, adjust the threshold, or decline each proposal.
- Manual requests. Describe a check in plain language (“add a criterion that line coverage stays at or above 80%”) and the agent confirms the exact metric and threshold with you before saving it.
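A "never goes up" proposal like the lint example amounts to pinning the threshold at the value measured today. A minimal sketch follows, assuming a hypothetical `propose_ratchet` helper and metric names chosen for illustration; how proposals are actually derived is not documented here.

```python
def propose_ratchet(metric: str, current: float, lower_is_better: bool = True) -> str:
    """Describe a gate that pins a metric at its currently measured value.

    Hypothetical helper for illustration only.
    """
    op = "<=" if lower_is_better else ">="
    return f"{metric} must be {op} {current:g}"


print(propose_ratchet("lint_warnings", 12))
# lint_warnings must be <= 12
print(propose_ratchet("line_coverage", 83.5, lower_is_better=False))
# line_coverage must be >= 83.5
```

You would still be free to tighten the suggested threshold (for example, to 0 warnings) before accepting it.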
Baselines
When a custom criterion is added, the agent captures a baseline: a one-time measurement of the metric on your source (origin) application. The baseline is the reference point milestone results compare against, so you can see whether the modernized code held the line, improved, or regressed. Baseline capture runs in the background; how long it takes depends on the lifecycle scripts it has to run. Each criterion shows its baseline state as a badge:

| Badge | Meaning |
|---|---|
| Baseline captured | The metric was measured on origin and is ready to compare against |
| Baseline pending | Capture is queued or running |
| Baseline stale — script changed | A referenced lifecycle script changed; the baseline needs to be re-measured |
| Orphaned — script missing | A referenced lifecycle script no longer exists (see Orphaned criteria) |
| Baseline N/A | A target-only criterion with no origin counterpart, so there is nothing to baseline |
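One way to read the table is as a small decision procedure over a criterion's state. The sketch below is an assumption-laden illustration: the `BaselineState` enum, the `baseline_state` function, its parameters, and especially the order in which the checks run are hypothetical and not taken from the product.

```python
from enum import Enum
from typing import Optional, Sequence, Set


class BaselineState(Enum):
    CAPTURED = "captured"
    PENDING = "pending"
    STALE = "stale"
    ORPHANED = "orphaned"
    NOT_APPLICABLE = "n/a"


def baseline_state(
    scripts: Sequence[str],        # lifecycle scripts the criterion reads
    existing_scripts: Set[str],    # scripts currently defined in the lifecycle
    target_only: bool,             # True if there is no origin counterpart
    baseline: Optional[float],     # captured value, or None if not measured yet
    script_changed: bool,          # True if a referenced script changed since capture
) -> BaselineState:
    # Assumed precedence: a missing script outranks every other state.
    if any(name not in existing_scripts for name in scripts):
        return BaselineState.ORPHANED
    if target_only:
        return BaselineState.NOT_APPLICABLE
    if baseline is None:
        return BaselineState.PENDING
    if script_changed:
        return BaselineState.STALE
    return BaselineState.CAPTURED


print(baseline_state(["lint"], {"lint", "test"}, False, 12.0, False))
# BaselineState.CAPTURED
print(baseline_state(["lint"], {"test"}, False, 12.0, False))
# BaselineState.ORPHANED
```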
Origin and target criteria
Like lifecycle configuration, acceptance criteria are tracked separately for origin (your source application) and target (the modernized application). The Acceptance Criteria view has Origin and Target tabs.
- New criteria default to origin.
- Most origin criteria are translated to the target automatically so the same quality bar applies to the modernized stack — even when the toolchain differs (for example, a Python linter on origin maps to the equivalent gate on a TypeScript target).
- You can author target-only criteria for checks that only make sense on the modernized stack. These have no origin baseline and show Baseline N/A.
Viewing your criteria
Acceptance Criteria has its own folder under Project Setup in the knowledge tree (the sidebar), alongside the Origin and Target folders. Expand it and select a side:
- Origin: criteria for your source application.
- Target: criteria for the modernized application (shown once a target config exists).

Each side groups its criteria into two sections:
- Built-in: the inherited Build must succeed / Tests must pass gates.
- Custom: the criteria you added, each showing its description, the threshold it enforces, the lifecycle scripts it reads (shown as `Script-<name>` chips), and its baseline badge.
How criteria are enforced
During milestone review, the agent re-runs each criterion’s lifecycle scripts against the milestone’s pull request and compares the measured value to the threshold. Results appear in two places:
- The milestone’s Acceptance Criteria results tab, which shows each metric’s baseline value alongside the value measured on this milestone and whether it passed.
- The Code Review flow, where a failing criterion surfaces as an issue you can triage.
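A results row carries two separate signals: whether the measured value meets the threshold, and how it compares with the baseline. The sketch below illustrates that distinction. The `Result` class, the `review` function, and the trend labels are hypothetical names, and the logic is an assumption about how such a comparison could work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Result:
    metric: str
    baseline: Optional[float]  # None for a criterion with no baseline
    measured: float
    passed: bool
    trend: str                 # "improved", "held", "regressed", or "n/a"


def review(
    metric: str,
    measured: float,
    limit: float,
    baseline: Optional[float],
    lower_is_better: bool = True,
) -> Result:
    # Pass/fail depends only on the threshold, not on the baseline.
    passed = measured <= limit if lower_is_better else measured >= limit
    if baseline is None:
        trend = "n/a"
    elif measured == baseline:
        trend = "held"
    else:
        better = measured < baseline if lower_is_better else measured > baseline
        trend = "improved" if better else "regressed"
    return Result(metric, baseline, measured, passed, trend)


lint = review("lint_warnings", measured=9, limit=12, baseline=12)
print(lint.passed, lint.trend)  # True improved

coverage = review("line_coverage", measured=85.0, limit=80.0,
                  baseline=90.0, lower_is_better=False)
print(coverage.passed, coverage.trend)  # True regressed
```

The second example shows why both columns are useful: coverage dropped relative to origin, yet the milestone still clears the 80% gate.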
Orphaned criteria
A criterion becomes orphaned when one of the lifecycle scripts it reads no longer exists, usually because the script was renamed or removed. An orphaned criterion can’t be enforced, and surfaces a warning at the top of the Acceptance Criteria section. To resolve one, ask the chat to either point the criterion at the correct script or delete it. Re-measuring a baseline can’t fix an orphan on its own: the missing script has to be restored or the reference updated first.
Related Docs
- Lifecycle Setup. Configure the scripts acceptance criteria read from.
- Lifecycle Strategy. Choose whether the agent builds and runs your app, or connects to a running one.
- Milestones. How milestones are reviewed, executed, and shipped.
- Code Review Chat. Triage acceptance-criteria issues found during milestone review.