Detecting Flakiness
flakiness.io uses the following approach to identify flaky tests:
- Each test report is tagged with the source code revision (commit) that was tested
- The system compares test results across multiple runs of the same commit
- If a test both passes and fails for the same commit, it’s marked as “flaky”
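The detection rule above can be sketched in a few lines. This is an illustrative implementation, not flakiness.io's actual code; the record format (`commit`, `test`, `passed` keys) is an assumption made for the example.

```python
from collections import defaultdict

def find_flaky_tests(results):
    """Flag tests that both pass and fail for the same commit.

    `results` is a list of dicts with hypothetical keys
    "commit", "test", and "passed" (bool).
    """
    outcomes = defaultdict(set)
    for run in results:
        # Collect every observed outcome per (commit, test) pair.
        outcomes[(run["commit"], run["test"])].add(run["passed"])
    # Flaky: the same commit produced both a pass and a fail.
    return sorted({test for (commit, test), seen in outcomes.items()
                   if seen == {True, False}})

runs = [
    {"commit": "abc123", "test": "test_login", "passed": True},
    {"commit": "abc123", "test": "test_login", "passed": False},
    {"commit": "abc123", "test": "test_search", "passed": True},
    {"commit": "abc123", "test": "test_search", "passed": True},
]
print(find_flaky_tests(runs))  # → ['test_login']
```

Note that a test failing on every run of a commit is not flaky by this rule; it is simply broken at that revision.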
Understanding Environment-Specific Failures
While this approach reliably identifies flaky tests, it’s important to understand the distinction between true flakiness and environment-specific failures.
Example: Cross-Platform Testing
Consider a scenario where a test is run on both Windows and Linux:
Test A:
- ✅ Passes on Windows
- ❌ Fails on Linux
This situation can be interpreted in two ways:
- True Flakiness: If you’re not specifically testing cross-platform compatibility, this might be considered a flaky test.
- Environment-Specific Issue: If you’re intentionally testing cross-platform behavior, the Linux failure should be treated as a legitimate failure, not flakiness.
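The distinction can be made mechanical once each run is tagged with its environment: mixed outcomes *within* one environment indicate flakiness, while consistent outcomes that differ *across* environments indicate an environment-specific issue. A hedged sketch (the `env` tag and the `classify` helper are assumptions for illustration, not a flakiness.io API):

```python
from collections import defaultdict

def classify(results):
    """Classify each test given runs tagged with an environment.

    - "flaky": pass and fail within a single environment
    - "environment-specific": consistent per environment, but the
      verdict differs between environments
    - "consistent": same verdict everywhere
    """
    per_env = defaultdict(lambda: defaultdict(set))
    for r in results:
        per_env[r["test"]][r["env"]].add(r["passed"])
    verdicts = {}
    for test, envs in per_env.items():
        if any(seen == {True, False} for seen in envs.values()):
            verdicts[test] = "flaky"
        elif len({next(iter(seen)) for seen in envs.values()}) > 1:
            verdicts[test] = "environment-specific"
        else:
            verdicts[test] = "consistent"
    return verdicts

runs = [
    {"test": "Test A", "env": "windows", "passed": True},
    {"test": "Test A", "env": "linux", "passed": False},
]
print(classify(runs))  # → {'Test A': 'environment-specific'}
```

With the environment tag removed, the same two runs would look like one pass and one fail for the same test, i.e. flaky, which is exactly the ambiguity described above.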
Using Timelines for Better Analysis
To properly handle environment-specific test results, flakiness.io provides the Timeline feature. Timelines allow you to:
- Split test histories by environment
- Analyze results separately for different configurations
- Identify platform-specific issues
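Conceptually, splitting a history by environment means partitioning one flat sequence of runs into one timeline per configuration, so each can be judged on its own. A minimal sketch of that idea, assuming a hypothetical record format with `run`, `env`, `test`, and `passed` fields (not the flakiness.io data model):

```python
from collections import defaultdict

def split_by_environment(results):
    """Partition a flat test history into per-environment timelines,
    so each environment's pass/fail sequence is analyzed separately."""
    timelines = defaultdict(list)
    for r in sorted(results, key=lambda r: r["run"]):
        timelines[r["env"]].append((r["run"], r["test"], r["passed"]))
    return dict(timelines)

history = [
    {"run": 1, "env": "linux", "test": "Test A", "passed": False},
    {"run": 1, "env": "windows", "test": "Test A", "passed": True},
    {"run": 2, "env": "linux", "test": "Test A", "passed": False},
]
for env, timeline in sorted(split_by_environment(history).items()):
    print(env, timeline)
```

Viewed this way, the Linux timeline shows a consistent failure (a legitimate platform bug) rather than a test that flips between pass and fail.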
Learn more about how to use this feature in the Timelines documentation.