Confidence Score

Name: Vigil
Author: Vigil

How Vigil calculates the 0-100 score and what it means.

Formula

The confidence score is a weighted average of all active signals:

Score = Sum(signal_score x weight) / Sum(weight)

Each signal produces a score from 0 to 100. The final score is the sum of each signal's score multiplied by its weight, divided by the total weight of all active signals. The result is rounded to the nearest integer.
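The formula can be sketched in a few lines of Python. This is only an illustration, not Vigil's actual implementation: the function name `confidence_score` and the `(score, weight)` pair representation are assumptions made for the example, and the handling of exact `.5` ties is not specified in this page.

```python
def confidence_score(signals):
    """Weighted average of active signals, rounded to an integer.

    `signals` is a list of (score, weight) pairs, one per active signal.
    Both the name and the input shape are hypothetical.
    """
    total_weight = sum(weight for _, weight in signals)
    if total_weight == 0:
        raise ValueError("at least one active signal with non-zero weight is required")
    weighted_sum = sum(score * weight for score, weight in signals)
    # Python's round() resolves exact .5 ties to the even integer; how Vigil
    # breaks ties is an assumption here.
    return round(weighted_sum / total_weight)


# Two signals: 80 with weight 3 and 60 with weight 1 -> (240 + 60) / 4 = 75
print(confidence_score([(80, 3), (60, 1)]))
```

Because the sum is divided by the total weight, the weights do not have to add up to 100 for the result to stay in the 0 to 100 range.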

Worked Example

Here is how the final score is calculated with all six signals active (Pro plan):

Signal                Score   Weight   Contribution
────────────────────  ─────   ──────   ────────────
Claims Verifier         78   x  30   =       2340
Undocumented Changes    60   x  25   =       1500
Credential Scan        100   x  20   =       2000
Coverage Mapper         70   x  10   =        700
Contract Checker        85   x  10   =        850
Diff Analyzer           90   x   5   =        450
                                     ────────────
Total                          100         7840

Score = 7840 / 100 = 78.4, rounded to 78

On the Free tier, only the four Trust Verification signals run (Claims Verifier, Undocumented Changes, Credential Scan, Coverage Mapper). Their weights are renormalized to total 100, so the formula stays the same.
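As a rough illustration of the renormalization, the sketch below reuses the four Trust Verification scores from the worked example. It assumes the Free-tier weights keep the same relative proportions as in the Pro table (30, 25, 20, 10); this page does not list the Free-tier weights explicitly, so treat the numbers as hypothetical.

```python
# Scores taken from the worked example; weights assumed to match the Pro table.
scores = {
    "Claims Verifier": 78,
    "Undocumented Changes": 60,
    "Credential Scan": 100,
    "Coverage Mapper": 70,
}
pro_weights = {
    "Claims Verifier": 30,
    "Undocumented Changes": 25,
    "Credential Scan": 20,
    "Coverage Mapper": 10,
}

active_total = sum(pro_weights.values())  # 85

# Scale each weight so the four active signals total 100.
renormalized = {name: w * 100 / active_total for name, w in pro_weights.items()}

# Same formula as before: sum(score * weight) / sum(weight).
free_score = round(
    sum(scores[name] * renormalized[name] for name in scores)
    / sum(renormalized.values())
)

print(free_score)  # 6540 / 85 = 76.94..., which rounds to 77
```

Scaling every weight by the same factor leaves the weighted average unchanged, which is why dividing by the sum of the active weights and renormalizing to 100 give the same score.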

Recommendation Tiers

Score Range	Recommendation	GitHub Check
80 - 100	Safe to merge	`success`
50 - 79	Review needed	`neutral`
0 - 49	Caution	`failure`

You can use the GitHub Check conclusion with branch protection rules to gate merges. For example, require the Vigil check to conclude with `success` before allowing a merge.
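The tier table amounts to a simple threshold lookup. The sketch below is illustrative only; the function name `recommendation` and the tuple return shape are assumptions, while the thresholds and conclusion strings come from the table above.

```python
def recommendation(score):
    """Map a 0-100 confidence score to (recommendation, GitHub Check conclusion)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return ("Safe to merge", "success")
    if score >= 50:
        return ("Review needed", "neutral")
    return ("Caution", "failure")


print(recommendation(78))  # the worked example lands in "Review needed"
```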

Failure Cap

If any deterministic (non-LLM) signal reports `passed: false`, the final score is capped at 70. Because "Safe to merge" starts at 80, this ensures the PR never reaches that tier when a hard check has failed.

Deterministic signals that can trigger the cap:

Credential Scan: secrets detected in the diff
Coverage Mapper: changed files have no test coverage

LLM-based signals (Claims Verifier, Undocumented Changes, Contract Checker, Diff Analyzer) do not trigger the failure cap. They are advisory: they can lower the score, but they cannot block a PR from being "Safe to merge" on their own. This reflects the inherent uncertainty of LLM analysis.
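The cap rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `apply_failure_cap` and the dictionary of per-signal `passed` flags are hypothetical, while the signal names and the cap value of 70 come from the text above.

```python
# Signals described above as deterministic (non-LLM).
DETERMINISTIC_SIGNALS = {"Credential Scan", "Coverage Mapper"}
FAILURE_CAP = 70


def apply_failure_cap(score, passed_by_signal):
    """Cap the score at 70 if any deterministic signal failed.

    `passed_by_signal` maps a signal name to its passed flag (hypothetical shape).
    Failures from LLM-based signals are ignored by this rule.
    """
    hard_failure = any(
        not passed
        for name, passed in passed_by_signal.items()
        if name in DETERMINISTIC_SIGNALS
    )
    return min(score, FAILURE_CAP) if hard_failure else score


print(apply_failure_cap(92, {"Credential Scan": False, "Claims Verifier": True}))  # capped
print(apply_failure_cap(92, {"Credential Scan": True, "Claims Verifier": False}))  # unchanged
```

Using `min` means the cap only ever lowers a score; a PR already scoring below 70 keeps its original value.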

Action Items

The PR comment includes up to 5 actionable items at the top, sorted by severity:

Must Fix: items that directly failed a deterministic check. These are blocking issues like leaked credentials or missing test coverage.
Consider: items flagged by advisory signals like Contract Checker or Diff Analyzer. These deserve a look but may be intentional.

The 5-item limit keeps the comment focused. If more issues exist, they remain visible in the detailed signal breakdown below the summary.
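The selection of action items can be sketched as a sort followed by a truncation. The `ActionItem` class, its fields, and the function name `top_action_items` are hypothetical and exist only for this example; the two severity labels and the limit of 5 come from the description above.

```python
from dataclasses import dataclass

# Lower rank sorts first, so "Must Fix" items appear before "Consider" items.
SEVERITY_RANK = {"Must Fix": 0, "Consider": 1}
MAX_ACTION_ITEMS = 5


@dataclass
class ActionItem:
    severity: str  # "Must Fix" or "Consider"
    message: str


def top_action_items(items):
    """Return at most 5 items, most severe first.

    sorted() is stable, so items with the same severity keep their input order.
    """
    ranked = sorted(items, key=lambda item: SEVERITY_RANK[item.severity])
    return ranked[:MAX_ACTION_ITEMS]


items = [ActionItem("Consider", f"advisory finding {i}") for i in range(4)]
items += [ActionItem("Must Fix", f"hard failure {i}") for i in range(3)]

for item in top_action_items(items):
    print(f"{item.severity}: {item.message}")
```

With three Must Fix items and four Consider items as input, the summary would show all three Must Fix items followed by the first two Consider items.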
