
# PR Risk Scoring: How CodeDig Decides What's Dangerous

By CodeDig Team

An inside look at how CodeDig assigns a risk score to every pull request — combining blast radius, code complexity, security findings, and historical patterns into a single actionable signal.

## Why a Single Score?

Risk assessment in code review is usually qualitative and implicit. The senior reviewer has a gut feel that "this change feels risky" and asks for more tests. The junior reviewer lacks that intuition and approves changes they should not. The difference between the two is experience, not process.

A risk score makes risk assessment repeatable and transparent. Every PR gets the same evaluation criteria, regardless of who reviews it. The score is not a replacement for reviewer judgment — it is a signal that tells the reviewer when to pay closer attention.

## The Four Dimensions of PR Risk

CodeDig's risk score is a weighted composite of four dimensions:

### 1. Blast Radius (40% weight)

Blast radius carries the largest weight because it is the strongest single predictor of risk. A change to a widely used function that affects dozens of downstream consumers is inherently more dangerous than an isolated change in a leaf module, regardless of how clean the code looks.

The blast radius component scores:

  • Number of affected downstream consumers — Raw count of call sites and dependent symbols
  • Critical path penetration — Whether affected consumers include payment processing, authentication, data persistence, or other high-stakes paths
  • Service boundary crossing — Whether the change affects symbols that are consumed across service or team boundaries
  • Change type — Signature changes score higher than implementation-only changes
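
The first signal, the count of downstream consumers, can be illustrated with a reverse walk over a call graph. This is a minimal sketch, not CodeDig's implementation: the `CALLS` graph, the symbol names, and the `downstream_consumers` helper are all hypothetical, and a real analysis would work on symbols extracted from the repository rather than a hand-written dictionary.

```python
from collections import deque

# Hypothetical call graph: each key calls the functions in its list.
CALLS = {
    "checkout": ["charge_card", "format_price"],
    "charge_card": ["format_price"],
    "render_cart": ["format_price"],
    "send_receipt": ["checkout"],
    "health_check": [],
}


def downstream_consumers(calls, changed):
    """Return every symbol that directly or transitively calls `changed`."""
    # Invert the graph so we can walk from a callee to its callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    # Breadth-first walk up the caller chain, visiting each symbol once.
    seen = set()
    queue = deque([changed])
    while queue:
        symbol = queue.popleft()
        for caller in callers.get(symbol, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


affected = downstream_consumers(CALLS, "format_price")
print(sorted(affected))
# ['charge_card', 'checkout', 'render_cart', 'send_receipt']
```

In this toy graph, changing `format_price` reaches four consumers, while changing the leaf `health_check` reaches none, which is the contrast between a widely used function and a leaf module described above.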

### 2. Code Complexity (25% weight)

A simple change to a simple function is low risk. A complex change to already-complex code is high risk. Complexity is measured by:

  • Cyclomatic complexity delta — How much the change increases branching and decision paths in the affected functions
  • Cognitive load — Nesting depth, conditional density, and error-handling coverage in the changed code
  • File churn — How many files are modified relative to the change's scope
  • Dependency count — How many distinct modules or packages are touched
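
To make the cyclomatic complexity delta concrete, here is a rough sketch that counts decision points in Python source before and after a change using the standard `ast` module. It is a simplified approximation of McCabe complexity, and the `discount` function and the choice of which node types count as branches are assumptions for illustration, not CodeDig's metric.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one extra path per additional operand.
            complexity += len(node.values) - 1
    return complexity


BEFORE = """
def discount(total):
    if total > 100:
        return total * 0.9
    return total
"""

AFTER = """
def discount(total, is_member):
    if total > 100 and is_member:
        return total * 0.8
    elif total > 100:
        return total * 0.9
    return total
"""

delta = cyclomatic_complexity(AFTER) - cyclomatic_complexity(BEFORE)
print(delta)  # 2: one new `elif` branch plus one `and` condition
```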

### 3. Security Findings (25% weight)

Security issues are not always critical, but they always add risk. The security dimension evaluates:

  • Finding severity — Critical, high, medium, or low based on OWASP classification
  • Fix complexity — Whether the finding requires a simple input sanitization fix or a fundamental architectural change
  • Exposure — Whether the vulnerable code is user-facing, internal-only, or on a non-critical path
  • Remediation confidence — Whether CodeDig's suggested fix is straightforward or requires careful manual review

### 4. Historical Risk (10% weight)

Past behavior predicts future risk. The historical dimension looks at:

  • File-level failure history — Whether the files touched in this PR have been associated with past incidents or reverts
  • Author pattern — Whether the PR author has a higher-than-average revert rate for changes in the affected modules
  • Time sensitivity — Whether the PR is opened during a release freeze window or includes last-minute changes before a deployment
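
Putting the four weights together, the composite could be computed roughly as follows. This sketch assumes a plain weighted average of 0 to 100 subscores; the post gives the weights but not the exact formula, so `composite_score` and the example subscores are hypothetical.

```python
# Default weights as stated in this post; they sum to 1.0.
DEFAULT_WEIGHTS = {
    "blast_radius": 0.40,
    "complexity": 0.25,
    "security": 0.25,
    "historical": 0.10,
}


def composite_score(subscores, weights=DEFAULT_WEIGHTS):
    """Combine per-dimension scores (0-100) into one 0-100 score."""
    if set(subscores) != set(weights):
        raise ValueError("subscores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    weighted = sum(subscores[name] * weights[name] for name in weights)
    # Dividing by the total keeps the result on a 0-100 scale.
    return weighted / total_weight


# Hypothetical PR: wide reach, simple code, no findings, clean history.
example = {"blast_radius": 80, "complexity": 20, "security": 0, "historical": 10}
print(round(composite_score(example), 1))  # 38.0
```

Even with no security findings and little complexity, the wide reach alone contributes 32 of the 38 points here, which reflects how heavily blast radius is weighted.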

## Score Ranges and What They Mean

The final score ranges from 0 (trivially safe) to 100 (extremely high risk). The ranges are:

  • 0–30: Low — Standard change with minimal blast radius and low complexity. Normal review path.
  • 31–55: Medium — Moderate risk. Reviewer should verify the blast radius analysis and check that tests exist for the changed paths.
  • 56–75: High — Significant risk. Requires additional review attention, confirmation from downstream teams, and thorough test coverage.
  • 76–100: Critical — Maximum risk. May warrant a staged rollout, additional approvals, or a phased approach to the change.
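
The ranges above translate directly into a lookup. The function name `risk_band` is hypothetical, and treating a fractional score such as 30.5 as belonging to the next band up is an assumption, since the post only lists integer boundaries.

```python
def risk_band(score):
    """Map a 0-100 risk score to the band names used in this post."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 30:
        return "Low"
    if score <= 55:
        return "Medium"
    if score <= 75:
        return "High"
    return "Critical"


for sample in (12, 48, 70, 91):
    print(sample, risk_band(sample))
# 12 Low
# 48 Medium
# 70 High
# 91 Critical
```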

## How the Score Renders in Review

When CodeDig analyzes a PR, the risk score appears in both the PR comment and the Check Run. The reviewer sees:

Risk Score: 48/100 — Medium
Breakdown:
  Blast Radius: 72/100 (47 downstream consumers, 3 service boundaries crossed)
  Complexity:   45/100 (moderate cyclomatic increase, 12 files changed)
  Security:     10/100 (no new findings detected)
  Historical:   55/100 (changed file involved in 2 past incidents)

This breakdown tells the reviewer not just that the PR is risky, but why. A high blast radius score with low complexity suggests the change itself is straightforward but has wide reach. A high security score with low blast radius suggests a focused change that carries vulnerability risk. Each pattern suggests a different review strategy.
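
One way to read the "why" out of a breakdown is to rank dimensions by their weighted contribution rather than their raw subscore. The sketch below uses the subscores from the breakdown shown above and the default weights, and assumes each dimension contributes subscore times weight; the `contributions` helper is hypothetical.

```python
WEIGHTS = {"Blast Radius": 0.40, "Complexity": 0.25, "Security": 0.25, "Historical": 0.10}

# Subscores copied from the example breakdown in this post.
BREAKDOWN = {"Blast Radius": 72, "Complexity": 45, "Security": 10, "Historical": 55}


def contributions(breakdown, weights):
    """Return each dimension's weighted points, largest first."""
    points = {name: breakdown[name] * weights[name] for name in breakdown}
    return sorted(points.items(), key=lambda item: item[1], reverse=True)


ranked = contributions(BREAKDOWN, WEIGHTS)
for name, pts in ranked:
    print(f"{name:<13} {pts:5.2f}")
# Blast Radius  28.80
# Complexity    11.25
# Historical     5.50
# Security       2.50
```

Note that Historical has a higher raw subscore than Complexity (55 versus 45) but contributes less, because its weight is smaller. Ranking by contribution points the reviewer at blast radius first.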

## Calibration and Customization

The default weighting works well for most teams, but CodeDig supports customization:

  • Weight adjustment — Teams that care more about security can increase the security dimension weight
  • Threshold configuration — The high-risk threshold can be lowered or raised per repository
  • Custom rules — Teams can add their own risk signals, such as "any PR touching the payment module is automatically medium risk"
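
As an illustration of how weight adjustment and custom rules might behave, here is a sketch under stated assumptions. It does not show CodeDig's configuration format, which this post does not describe. The helpers `with_overrides` and `apply_floor_rules`, the `payments/` path prefix, and the idea of rescaling weights so they still sum to 1 are all hypothetical.

```python
DEFAULT_WEIGHTS = {"blast_radius": 0.40, "complexity": 0.25, "security": 0.25, "historical": 0.10}


def with_overrides(weights, overrides):
    """Apply overrides, then rescale so the weights still sum to 1."""
    merged = {**weights, **overrides}
    total = sum(merged.values())
    return {name: value / total for name, value in merged.items()}


def apply_floor_rules(score, changed_paths, rules):
    """Raise the score to a rule's floor when any changed path matches its prefix."""
    for prefix, floor in rules:
        if any(path.startswith(prefix) for path in changed_paths):
            score = max(score, floor)
    return score


# Hypothetical security-focused team: double the raw security weight.
weights = with_overrides(DEFAULT_WEIGHTS, {"security": 0.50})
print(round(weights["security"], 2))  # 0.4 after rescaling

# Hypothetical rule: anything under payments/ is at least Medium (31+).
rules = [("payments/", 31)]
print(apply_floor_rules(12, ["payments/refund.py"], rules))  # 31
print(apply_floor_rules(12, ["docs/readme.md"], rules))      # 12
```

Modeling a custom rule as a floor rather than a fixed value means a payment PR that already scores High stays High instead of being pulled down to Medium.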

## The Goal: Informed Review, Not Gatekeeping

A risk score is not a merge gate. CodeDig does not block PRs based on the score. The goal is to give every reviewer — regardless of seniority — the same visibility into risk that the most experienced engineer on the team would have. The decision to merge stays with the reviewer. But now the reviewer has the full picture.

## See Your Own PR Risk Scores

Install CodeDig and open a PR. The risk score and breakdown will appear in the PR comment within seconds. You can compare the score to your own intuition and see how the four dimensions match your team's risk patterns.