Test analytics for engineers and agents

The missing test layer for GitHub: regressions, flakes, artifacts, and history in one place.

Flakiness.io dashboard
Test Runners

Any test, any runner

Frontend in Vitest. E2E in Playwright. Backend in Pytest or JUnit.

Flakiness.io brings them into one test layer, with native reporters for the major runners, a JUnit XML bridge for everything else, and a Node.js SDK for custom integrations. Mixed stack, one history.
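For runners without a native reporter, the JUnit XML bridge works because JUnit XML is a de facto standard. The sketch below shows what a bridge does conceptually: flatten JUnit test cases into uniform result records. The record field names here are illustrative assumptions, not Flakiness.io's actual schema.

```python
# Sketch of a JUnit XML bridge: flatten test cases into uniform records.
# Field names in the normalized record are illustrative, not Flakiness.io's schema.
import xml.etree.ElementTree as ET

JUNIT_XML = """<testsuite name="billing" tests="2" failures="1">
  <testcase classname="billing" name="upgrades plan" time="0.42"/>
  <testcase classname="billing" name="downgrades plan" time="1.10">
    <failure message="TimeoutError">stack trace</failure>
  </testcase>
</testsuite>"""

def normalize(junit_xml: str) -> list[dict]:
    """Flatten JUnit XML test cases into uniform result records."""
    suite = ET.fromstring(junit_xml)
    results = []
    for case in suite.iter("testcase"):
        failure = case.find("failure")
        results.append({
            "suite": suite.get("name"),
            "name": case.get("name"),
            "duration_s": float(case.get("time", 0)),
            "status": "failed" if failure is not None else "passed",
            "error": failure.get("message") if failure is not None else None,
        })
    return results

records = normalize(JUNIT_XML)
print(records[1]["status"], records[1]["error"])  # failed TimeoutError
```

Any runner that can emit JUnit XML (JUnit itself, Pytest, Go test tooling, and many others) can feed a bridge like this, which is why one history can span a mixed stack.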

Reports

From failed run to root cause

A failed run carries a lot of signal. Flakiness.io puts the test waterfall, system telemetry, root-cause bins, and Flakiness Query Language in one place.

Engineers and agents can see the failure, understand it, and decide where to look next.

[Dashboard mockup: 1,246 tests; 87% passing, 8% failed, 5% flaky; test waterfall across 4 workers over 2m 48s with CPU and memory telemetry; error bins: TimeoutError × 4, AssertionError × 2, Flaky × 1]
Artifacts

Keep evidence attached

Logs, screenshots, videos, and traces stay attached to the step that produced them. Flakiness.io keeps that evidence in context, with built-in image diffing and Playwright trace viewing in the browser.

[Artifact card: step billing › upgrades plan; video (00:24), image diff, Playwright trace; attached to step, 30-day retention]
CI & Sharding

Defragment your CI

Shards finish at different times. Environments drift. CI shows the run in pieces. Flakiness.io ingests results as they land, merges shards into one report, and keeps staging and production histories separate. GitHub OIDC keeps authentication tokenless.
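The merge step can be pictured as follows. This is an illustrative sketch, not Flakiness.io's actual merge logic: per-shard result lists are combined keyed by test id, then rolled up into one report.

```python
# Illustrative sketch of shard merging: combine per-shard result lists,
# keyed by test id, into one report. Not Flakiness.io's actual merge logic.
from collections import Counter

shards = [
    [{"id": "auth::login", "status": "passed"}],
    [{"id": "billing::upgrade", "status": "failed"}],
    [{"id": "search::query", "status": "passed"}],
]

def merge_shards(shards: list[list[dict]]) -> dict:
    merged = {}
    for shard in shards:
        for result in shard:
            merged[result["id"]] = result  # last write wins per test id
    counts = Counter(r["status"] for r in merged.values())
    return {"tests": len(merged), "by_status": dict(counts)}

report = merge_shards(shards)
print(report)  # {'tests': 3, 'by_status': {'passed': 2, 'failed': 1}}
```

Because shards are ingested as they land, the report can be built incrementally rather than waiting for the slowest shard.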

[CI mockup: GitHub Actions run, OIDC authenticated; shards 1–4 (312, 308, 315, 311 tests), env: production, merged into one unified report of 1,246 tests]
Regressions vs Flakes

Separate regressions from flakes

Every result is tied to the commit and environment it ran on. Flakiness.io can tell whether a failure is new in the PR, already broken on main, or flipping on the same commit. Triage gets faster because the signal is cleaner.

The approach is grounded in commit-aware flakiness analysis described in research by Apple engineers.
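The core decision can be sketched as a small classifier. The inputs here are assumptions about what a commit-aware system would know: the outcome on the PR, the outcome for the same test on main, and whether results flip across retries of the same commit.

```python
# Sketch of commit-aware failure triage. Inputs are assumed signals:
# the PR outcome, the outcome on main, and flips across retries of one commit.
def classify(pr_failed: bool, main_failed: bool, flips_on_same_commit: bool) -> str:
    if not pr_failed:
        return "passing"
    if flips_on_same_commit:
        return "flaky"            # same code, different outcomes
    if main_failed:
        return "already broken"   # not introduced by this PR
    return "regression"           # new failure, stable on this commit

print(classify(pr_failed=True, main_failed=False, flips_on_same_commit=False))
# regression
```

Only the last branch should block a merge, which is what keeps the triage signal clean.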

[PR dashboard mockup: #1409 "Refactor auth middleware" has regressions; #1407 "Update ClickHouse client", #1406 "Fix snapshot collision", #1402 "Vendor isolation policy" mergeable; test health calendar for main, Dec–Mar, cells keyed Regression / Failure / Flaky / Passing / No data]
AI Agents

Compact context for coding agents

Raw CI output is expensive context for coding agents. Flakiness.io turns a failed run into a compact record: what failed, whether it regressed, how it behaved on main, and which logs and artifacts matter.

Agents spend fewer tokens gathering context and more time fixing the problem.
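A compact record might look like the sketch below. Every field name here is a hypothetical for illustration, not Flakiness.io's API: the point is that a failed run collapses into a few hundred bytes of verdicts and pointers instead of megabytes of raw CI logs.

```python
# Hypothetical shape of a compact failure record for a coding agent.
# All field names are assumptions for illustration, not Flakiness.io's API.
import json

raw_run = {
    "failures": [
        {"test": "billing/upgrade.spec.ts", "error": "TimeoutError",
         "failed_on_main": False, "retries_flipped": False,
         "artifacts": ["trace.zip", "video.webm", "stdout.log"]},
    ],
}

def compact(run: dict) -> str:
    """Summarize a failed run into a small JSON blob an agent can ingest."""
    record = [
        {
            "test": f["test"],
            "error": f["error"],
            "verdict": ("regression"
                        if not f["failed_on_main"] and not f["retries_flipped"]
                        else "flaky-or-known"),
            "artifacts": f["artifacts"][:2],  # cap evidence to keep context small
        }
        for f in run["failures"]
    ]
    return json.dumps(record, separators=(",", ":"))

print(compact(raw_run))
```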

[Agent demo: Claude Code, Codex, Cursor; a claude session asked to fix failures in PR #219 pulls test history from flakiness.io, finds 3 regressions in billing/*.spec.ts, analyzes flip rate on main, and drafts a fix for a race condition in billing.ts:142]
Pricing Model

We charge for storage, not tests

Most platforms charge per test run. Flakiness.io charges for stored data. Teams can run more tests, shard aggressively, retry when needed, and keep the bill predictable as coverage grows.
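The difference shows up in back-of-the-envelope arithmetic. Both prices below are made-up assumptions purely for illustration: per-run billing scales with execution count, storage billing scales with retained data.

```python
# Back-of-the-envelope comparison with assumed prices: $0.002 per test run
# versus $0.25 per GB-month of stored results. Both numbers are illustrative.
runs_per_month = 1_246 * 30 * 20          # tests x days x runs per day
per_run_bill = runs_per_month * 0.002     # grows with every test and retry
storage_gb = 50                           # result data actually retained
storage_bill = storage_gb * 0.25          # grows only with stored data

print(f"per-run: ${per_run_bill:,.0f}/mo  storage: ${storage_bill:,.2f}/mo")
```

Under per-run pricing, doubling test count or retry rate doubles the bill; under storage pricing, the bill moves only when retained data grows.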

Typical providers: $ / test run. A penalty for writing more tests, and surprise bills at scale.

Flakiness.io: $ / storage. Run as many tests as you want, with predictable bills at scale.
Platform pricing

Choose your plan

Starts free — no credit card required.

Team

$75
/ month
billed monthly
  • All Features
  • Unlimited Test Runs
  • 5 seats
  • 50GB Included Storage
  • 90 Days Data Retention
  • Standard Support
Start for free
Most Popular

Business

$300
/ month
billed monthly
  • All Features
  • Unlimited Test Runs
  • 30 seats
  • 200GB Included Storage
  • 365 Days Data Retention
  • Standard Support
Start for free

Enterprise

Custom
billed annually
  • All Features
  • Unlimited Test Runs
  • Unlimited seats
  • Custom Included Storage
  • Custom Data Retention
  • Priority Support
  • On-Premise or Cloud Deployment
  • Automatic access for GitHub collaborators
[email protected]

Frequently asked questions

Why is Flakiness.io cheaper than other platforms?

Flakiness.io charges for stored data, not per test run. The analytics engine is built on interval unions, which keeps large test histories efficient to pack and process at scale. Pricing follows that architecture: storage drives cost, not execution count.
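The FAQ names interval unions as the engine's foundation; as a standalone illustration (not Flakiness.io's implementation), this is the classic merge-overlapping-intervals algorithm that makes such representations compact.

```python
# Classic merge-overlapping-intervals algorithm, shown as a standalone
# illustration of interval unions. Not Flakiness.io's implementation.
def union(intervals: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Merge overlapping (start, end) intervals into a disjoint union."""
    merged: list[tuple[float, float]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of appending.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

print(union([(0, 5), (3, 9), (12, 15)]))  # [(0, 9), (12, 15)]
```

Many overlapping runs of the same test collapse into a handful of disjoint intervals, which is what keeps large histories cheap to store.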

Can Flakiness.io handle our massive monorepo?

Yes. Flakiness.io is built for large test histories, mixed stacks, and high test volume. We have tested the system on established projects with 500,000+ tests, and it handled the volume without issue.

Can I host Flakiness.io on my own servers?

Yes. A self-hosted deployment is one application container backed by PostgreSQL and an S3-compatible object store. The reporters and the Flakiness CLI work with custom deployments. See the on-premise docs or contact [email protected] for licensing.

Who develops the test runner integrations?

The Flakiness JSON Report format is open source, so anyone can build a reporter.

The Flakiness.io team maintains the official reporters for Playwright Test, Pytest, Vitest, and CucumberJS, along with the JUnit bridge and the platform itself.

More questions? Reach out at [email protected]

Open source program

Free for open source

We believe great testing tools should be accessible to everyone. If you maintain a popular open-source project, Flakiness.io is free.

What you get

  • Full Platform access
  • Unlimited Test Runs
  • Your own Organization
  • 10GB Included Storage
  • Public projects
  • Free forever

How to qualify

5,000+ GitHub Stars

Your repository needs at least 5,000 stars on GitHub.

Drop us an email with a link to your repository and we'll set you up.

Apply via Email

Who's behind Flakiness.io

A decade of building dev tools

Andrey Lushnikov

Flakiness.io was created by Andrey Lushnikov, who spent over a decade at Google and Microsoft building the tools testing engineers rely on every day: he created Puppeteer at Google and co-founded Playwright at Microsoft.

After years of dealing with flaky tests and fragmented test reporting, he built the tool he wished existed.
