Test Regressions
If a test with a perfect execution history suddenly fails, it is a strong signal that a recent code change is breaking the system.
Flakiness.io detects such situations and reports them with a fifth test status, regression.
Regression Window
The regression window is a configurable time period (in days) that defines how far back Flakiness.io looks for a test’s perfect execution history when determining if a failure should be classified as a regression.
This setting can be customized per project in your project settings, allowing you to adjust the sensitivity of regression detection based on your team’s needs:
- A shorter window (e.g., 7 days) makes regression detection more sensitive
- A longer window (e.g., 30 days) requires a longer period of stability before flagging regressions
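The effect of the window length can be illustrated with a minimal sketch. This is not Flakiness.io's implementation: the `has_perfect_history` helper, the `(day, status)` history format, and the rule that an empty window does not count as a perfect history are all assumptions made for illustration.

```python
from datetime import date, timedelta


def has_perfect_history(history, failure_day, window_days):
    """Return True if every status recorded in the window before failure_day is 'passed'.

    history is a list of (day, status) tuples (hypothetical format).
    """
    window_start = failure_day - timedelta(days=window_days)
    in_window = [status for day, status in history if window_start <= day < failure_day]
    # Assumption: a window with no recorded executions is not a perfect history.
    return bool(in_window) and all(status == "passed" for status in in_window)


history = [
    (date(2025, 5, 5), "failed"),   # an old failure, 20 days before the new one
    (date(2025, 5, 20), "passed"),
    (date(2025, 5, 24), "passed"),
]
failure_day = date(2025, 5, 25)

# A 7-day window does not reach back to the old failure, so the history looks perfect.
print(has_perfect_history(history, failure_day, window_days=7))   # True
# A 30-day window includes the old failure, so the new failure is not a regression.
print(has_perfect_history(history, failure_day, window_days=30))  # False
```

With the same history, the 7-day window would flag the new failure as a regression, while the 30-day window would not.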
Classifying Failures as Regressions
To compute regressions for any timeline, Flakiness.io:
- Computes test statuses for all commits
- Classifies a commit failure as a regression if the test has a perfect history within the regression window before that commit
- Classifies failed runs inside regressed commits as regressions
- Classifies failed days that contain regressed commits as regressions
To illustrate the logic, let’s say we have a testNervousSquirrel test that had never failed before,
but suddenly failed in Commit X on May 25, 2025, and has been failing ever since.
Regressions in Commits
Flakiness.io classifies a commit-level failure as a regression if the test has a perfect record throughout the preceding regression window.
So in our example:
- Since Commit X failed, and the test had a perfect record within the regression window, the failure in Commit X will be classified as a regression
- The test continued failing in both Commit Y and Commit Z. However, these failures are not new, so they will not be classified as regressions.
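The commit-level rule for this example can be sketched as follows. The `classify_commits` function, the `(name, day, status)` tuple format, the two earlier passing commits (Commit V and Commit W), and the dates of Commit Y and Commit Z are hypothetical and chosen only to match the scenario; the actual Flakiness.io logic may differ.

```python
from datetime import date, timedelta


def classify_commits(commits, window_days):
    """Classify each commit's test status; commits are ordered oldest to newest."""
    result = {}
    for i, (name, day, status) in enumerate(commits):
        if status != "failed":
            result[name] = status
            continue
        window_start = day - timedelta(days=window_days)
        # Statuses of earlier commits that fall inside the regression window.
        earlier = [s for _, d, s in commits[:i] if d >= window_start]
        # Assumption: an empty window does not count as a perfect record.
        perfect = bool(earlier) and all(s == "passed" for s in earlier)
        result[name] = "regression" if perfect else "failed"
    return result


commits = [
    ("Commit V", date(2025, 5, 20), "passed"),  # hypothetical earlier commit
    ("Commit W", date(2025, 5, 23), "passed"),  # hypothetical earlier commit
    ("Commit X", date(2025, 5, 25), "failed"),
    ("Commit Y", date(2025, 5, 25), "failed"),  # date assumed
    ("Commit Z", date(2025, 5, 26), "failed"),  # date assumed
]

statuses = classify_commits(commits, window_days=7)
for name, status in statuses.items():
    print(name, status)
```

Only Commit X comes out as `regression`. Commit Y and Commit Z stay `failed`, because the failure in Commit X already sits inside their window.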
Regressions in Runs
Flakiness.io classifies a run-level failure as a regression if the failure is classified as a regression at the commit level.
So if Commit X had, for example, two failing runs, both run failures would be classified as regressions.
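A minimal sketch of how run-level statuses could follow from commit-level ones is shown below. The `classify_runs` function, the run identifiers, and the `(run_id, commit, status)` format are assumptions for illustration, not a Flakiness.io API.

```python
def classify_runs(runs, regressed_commits):
    """Mark a failed run as a regression when its commit is a regressed commit."""
    classified = []
    for run_id, commit, status in runs:
        if status == "failed" and commit in regressed_commits:
            status = "regression"
        classified.append((run_id, commit, status))
    return classified


runs = [
    ("run-1", "Commit X", "failed"),
    ("run-2", "Commit X", "failed"),
    ("run-3", "Commit Y", "failed"),  # Commit Y is failed, but not regressed
]

run_statuses = classify_runs(runs, regressed_commits={"Commit X"})
for run in run_statuses:
    print(run)
```

Both runs of Commit X become `regression`, while the run of Commit Y remains `failed`.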
Regressions in Days
Flakiness.io classifies a “failed” status for a day as a “regression” if the regression happened during that day.
So in our example:
- The failure on May 25, 2025, will be classified as a regression, since the test regressed during this day
- The failure on the following day, May 26, 2025, will not be classified as a regression.
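The day-level rule can be sketched in the same spirit. The `classify_days` function and its input dictionaries are hypothetical; the sketch assumes the day of each regressed commit is already known.

```python
from datetime import date


def classify_days(day_statuses, commit_days, regressed_commits):
    """Mark a failed day as a regression when a regressed commit landed on that day."""
    regression_days = {commit_days[commit] for commit in regressed_commits}
    return {
        day: "regression" if status == "failed" and day in regression_days else status
        for day, status in day_statuses.items()
    }


day_statuses = {
    date(2025, 5, 24): "passed",
    date(2025, 5, 25): "failed",
    date(2025, 5, 26): "failed",
}
commit_days = {"Commit X": date(2025, 5, 25)}

days = classify_days(day_statuses, commit_days, regressed_commits={"Commit X"})
for day, status in days.items():
    print(day.isoformat(), status)
```

May 25, 2025 is reported as `regression` because Commit X regressed on that day, and May 26, 2025 stays `failed`.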