Coverage you'll actually read
Coverage is one of those tools I genuinely like. It tells me whether the business logic is actually tested - not “do we have tests”, but “do the tests walk through the branches that matter”.
I’m not an advocate for chasing 100% across the whole repo. Plenty of code isn’t worth the ceremony. But the important stuff - business logic, calculations, algorithms - I do want at 100%, because that’s where every if, every condition, every edge case can silently be wrong. There I treat tests as documentation: this is how the code is supposed to behave under these circumstances.
Two things kept getting in the way, though.
Problem 1: coverage output is too chatty to be useful
On a large project - thousands of modules - a coverage run prints a wall of percentages. I changed three files. I care about three files. Instead I get a screen I scroll past, and in practice I just… stop checking. The signal I want is buried under everything I didn’t touch.
Problem 2: coverage is fine until it isn’t
Nobody watches coverage day to day. It quietly drifts down a little each week, and nobody notices - until one PR finally trips the CI threshold and fails. Now the person who happened to push that PR is on the hook for weeks of accumulated slippage that wasn’t theirs. The slow decline was invisible the whole way down.
Both problems have the same root cause: there’s no cheap, per-PR feedback that says did this change make coverage better or worse, on the code this change touches.
The fix: a baseline branch + a per-module diff in the PR
I solved this in al_check PR #3. Two pieces:
1. Keep the main-branch coverage somewhere stable. Every time CI runs on main, it commits the coverage numbers to a dedicated coverage_do_not_delete branch. That branch is the baseline - the last known-good coverage for every module. It never gets deleted, so there’s always something to compare a PR against.
2. On every PR, compare only the modules you changed against that baseline, and post it as a comment. No threshold gymnastics, no scrolling. The PR comment looks like this:
Test Coverage Summary Statistics - SUCCESS
Coverage: 87.5: better for 1.2% Congratulations! Good job!
Modules related to this PR - coverage
Module           Baseline  Current  Delta
Check.Summary    80.00%    100.00%  +20.00% ✅
Check.PrComment  N/A       95.00%   🆕 new
You see the overall direction at the top (better / worse / no change, with the delta), and below it a table scoped strictly to the modules in this PR - with each one’s baseline, current, and delta. New modules are flagged 🆕. That’s the whole point: feedback on your code, and a trend you can’t accidentally ignore.
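To make the table logic concrete, here is a minimal Python sketch of how those rows could be built from two {module => percentage} maps. It is not the code from the PR: the function names are hypothetical, and the ❌ and ➖ markers for "worse" and "unchanged" are my assumption (the sample above only shows ✅ and 🆕).

```python
from typing import Iterable, List, Mapping, Optional, Tuple

Row = Tuple[str, str, str, str]


def describe_delta(baseline: Optional[float], current: float) -> str:
    """Build the Delta cell: a signed change, or a 'new' flag when there is no baseline."""
    if baseline is None:
        return "🆕 new"
    delta = current - baseline
    if delta > 0:
        mark = "✅"
    elif delta < 0:
        mark = "❌"  # assumed marker for a drop
    else:
        mark = "➖"  # assumed marker for no change
    return f"{delta:+.2f}% {mark}"


def render_rows(
    baseline: Mapping[str, float],
    current: Mapping[str, float],
    changed: Iterable[str],
) -> List[Row]:
    """Return a header row plus one row per changed module that has coverage data."""
    rows: List[Row] = [("Module", "Baseline", "Current", "Delta")]
    for module in changed:
        if module not in current:
            continue  # no coverage data in this run, e.g. the module was deleted
        old = baseline.get(module)
        old_cell = "N/A" if old is None else f"{old:.2f}%"
        new = current[module]
        rows.append((module, old_cell, f"{new:.2f}%", describe_delta(old, new)))
    return rows


def to_markdown(rows: List[Row]) -> str:
    """Render the rows as a markdown table suitable for a PR comment body."""
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |", "|" + "---|" * len(header)]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


if __name__ == "__main__":
    baseline = {"Check.Summary": 80.0}
    current = {"Check.Summary": 100.0, "Check.PrComment": 95.0, "Check.Other": 50.0}
    print(to_markdown(render_rows(baseline, current, ["Check.Summary", "Check.PrComment"])))
```

Note that Check.Other never appears in the output: only modules named in the changed list get a row, which is what keeps the comment short.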
You can get the same per-module coverage for your changed files locally, with a single command from the check tool.
How to set CI up yourself
The whole thing is a CI job plus two small scripts. Rather than paste it all here, I’ll walk the flow and link each step in the ci.yml.
0. Create the baseline branch once. Make an orphan branch that only ever holds coverage artifacts:
git checkout --orphan coverage_do_not_delete
git rm -rf .
echo '{}' > best_coverage.json
git add best_coverage.json && git commit -m "init coverage baseline"
git push origin coverage_do_not_delete
You’ll also need a GH_TOKEN secret with contents: write so the job can push back to that branch.
The rest lives in the test job:
- Pull the baseline first. The job checks out coverage_do_not_delete, reads the previous per-module numbers, and stashes them in /tmp before checking out the PR code - then re-checks out with full history so it can diff against main. → ci.yml#L84-L98
- Run the tests, then emit a coverage report. Tests run through the check escript in partitions (check --only test --partitions 3), and a second call (check --full-coverage-output) writes the full per-module coverage table to a file. → ci.yml#L143-L147
- Parse the report into per-module JSON. A ~20-line script scrapes that coverage table into {module => percentage}. → parse_coverage.py
- Find which modules the PR touched, then diff them against the baseline. The job runs git diff --name-only origin/main...HEAD on lib/**/*.ex, greps the defmodule names out of those files, and feeds them to a script that renders the markdown table you saw above - baseline vs. current, only for the changed modules. → ci.yml#L177-L201 and module_coverage_diff.py
- Post (or update) the PR comment. peter-evans/find-comment locates an existing comment so reruns edit it in place instead of spamming the thread; create-or-update-comment writes the body. → ci.yml#L203-L225
- Save the new baseline - but only on main. After a main run, the job writes the fresh numbers back to coverage_do_not_delete, so the next PR compares against an up-to-date baseline. → ci.yml#L241-L256
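The parsing and module-discovery steps are small enough to sketch. The Python below is an illustration, not the linked parse_coverage.py: I'm assuming the report uses the "percentage | module" row layout that mix test --cover prints, and the function names are hypothetical. If your report looks different, only the ROW pattern should need to change.

```python
import re
from typing import Dict, List

# Assumed row shape: "   95.00% | Check.PrComment"
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s*\|\s*(\S+)\s*$")

# Matches "defmodule Some.Module do" at the start of a line (after indentation).
# Nested modules are returned by their short name, not fully qualified.
DEFMODULE = re.compile(r"^\s*defmodule\s+([A-Z][\w.]*)\s+do\b", re.MULTILINE)


def parse_coverage(report: str) -> Dict[str, float]:
    """Scrape a text coverage table into {module: percentage}, skipping the Total row."""
    coverage: Dict[str, float] = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match and match.group(2) != "Total":
            coverage[match.group(2)] = float(match.group(1))
    return coverage


def modules_in_source(source: str) -> List[str]:
    """Return the module names declared in one Elixir source file's text."""
    return DEFMODULE.findall(source)


if __name__ == "__main__":
    report = """\
Percentage | Module
-----------|--------------------------
   100.00% | Check.Summary
    95.00% | Check.PrComment
-----------|--------------------------
    97.50% | Total
"""
    print(parse_coverage(report))
    print(modules_in_source("defmodule Check.Summary do\n  def run, do: :ok\nend\n"))
```

Header and separator lines fall through because they don't start with a number followed by a percent sign, so the parser needs no special cases for them.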
Why this works
- You only read what you changed. The comment is scoped to your PR’s modules, so checking coverage is a glance, not a chore.
- The trend can’t hide. Every PR shows a delta against main. A slow leak shows up as a string of small Xs instead of one surprise CI failure months later.
- No hard threshold to game. It’s directional feedback, not a gate that punishes whoever happens to cross the line.
The Elixir specifics (the check escript, the defmodule grep) are easy to swap for whatever your stack emits - the shape is the same: parse current → diff modified modules vs. a committed baseline → comment.
Cheers!