Chapter 43: File Health and Max-Lines Enforcement (CB-040)

The File Health system enforces code maintainability by preventing excessively large files. Based on Toyota Production System principles and peer-reviewed research, this feature ensures:

  • New files do not exceed 500 lines
  • Existing files cannot grow (ratchet mechanism)
  • Test requirements scale with file size (Test-to-Lines Ratio, TLR)

The Problem

Large files violate the Single Responsibility Principle and become:

  • Untestable (cognitive overload)
  • Unmaintainable (merge conflicts)
  • Error-prone (complexity hotspots)

Research shows files over 500 lines have 2.4x higher defect rates (Nagappan and Ball, IEEE TSE 2006).

Quick Start

# Check file health in your project
pmat comply check

# View detailed file health report
pmat comply check --verbose

File Health Metrics

1. File Size Classes

Class        Lines        Risk Level
Optimal      0-200        Low
Acceptable   201-500      Medium
Warning      501-1000     High
Critical     1001-2000    Very High
Emergency    > 2000       Extreme
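
The size classes above map directly onto a range match. The following is a minimal sketch, not PMAT's implementation: the `SizeClass` enum and `classify_size` function are hypothetical names, and it assumes a file of exactly 2000 lines still counts as Critical.

```rust
/// Size classes from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SizeClass {
    Optimal,
    Acceptable,
    Warning,
    Critical,
    Emergency,
}

/// Classify a file by its line count.
/// Hypothetical helper for illustration; not part of PMAT's API.
pub fn classify_size(lines: usize) -> SizeClass {
    match lines {
        0..=200 => SizeClass::Optimal,
        201..=500 => SizeClass::Acceptable,
        501..=1000 => SizeClass::Warning,
        // Assumption: exactly 2000 lines is still Critical.
        1001..=2000 => SizeClass::Critical,
        _ => SizeClass::Emergency,
    }
}
```

Under these assumptions, `classify_size(12_087)` yields `SizeClass::Emergency`, matching the label shown for the largest file in the sample report later in this chapter.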

2. Test-to-Lines Ratio (TLR)

TLR requirements scale with file size:

File Size          Required TLR   Rationale
< 100 lines        0.3            Simple code needs fewer tests
100-300 lines      0.5            Moderate complexity
300-500 lines      0.8            Complex code needs more tests
500-1000 lines     1.2            High complexity penalty
> 1000 lines       1.5            Critical files need extensive tests
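
A small sketch of how the scaling might be applied. Two things here are assumptions rather than statements from this chapter: TLR is taken to mean test lines divided by source lines, and the shared boundaries in the table (300 and 500) are assigned to the lower tier, which keeps a 500-line file in the same tier as the Acceptable size class. The function names are hypothetical.

```rust
/// Required TLR for a source file with `lines` lines.
/// Assumption: the upper bound of each range is inclusive (300 -> 0.5, 500 -> 0.8).
pub fn required_tlr(lines: usize) -> f64 {
    match lines {
        0..=99 => 0.3,
        100..=300 => 0.5,
        301..=500 => 0.8,
        501..=1000 => 1.2,
        _ => 1.5,
    }
}

/// Check whether a file meets its required TLR.
/// Assumption: TLR = test lines / source lines.
/// An empty source file is treated as trivially compliant.
pub fn meets_tlr(test_lines: usize, source_lines: usize) -> bool {
    if source_lines == 0 {
        return true;
    }
    let actual = test_lines as f64 / source_lines as f64;
    actual >= required_tlr(source_lines)
}
```

For example, a 400-line file with 200 lines of tests has a TLR of 0.5, which falls short of the 0.8 required for its tier, so `meets_tlr(200, 400)` is false.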

3. File Health Score Formula

Health Score = (Size Score × 0.30) + (TLR Score × 0.40) +
               (Complexity Score × 0.20) + (Stability Score × 0.10)

Where:
- Size Score: 100 - (lines / max_lines × 100)
- TLR Score: min(100, actual_tlr / required_tlr × 100)
- Complexity Score: 100 - (avg_complexity / threshold × 100)
- Stability Score: 100 - (churn_30d / 10 × 100)
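
The formula can be sketched in a few lines of Rust. The `FileMetrics` struct and `health_score` function are hypothetical names used only for illustration. One assumption goes beyond the formula as written: every component is clamped to the 0-100 range, since the formula only caps the TLR score and the other components would otherwise go negative (for example, a file with more than `max_lines` lines). All denominators are assumed to be positive.

```rust
/// Inputs to the health score (hypothetical struct, for illustration only).
pub struct FileMetrics {
    pub lines: f64,
    pub max_lines: f64,
    pub actual_tlr: f64,
    pub required_tlr: f64,
    pub avg_complexity: f64,
    pub complexity_threshold: f64,
    pub churn_30d: f64,
}

/// Assumption: each component score is clamped to 0..=100.
fn clamp_score(x: f64) -> f64 {
    x.clamp(0.0, 100.0)
}

/// Weighted health score following the formula above.
pub fn health_score(m: &FileMetrics) -> f64 {
    let size = clamp_score(100.0 - m.lines / m.max_lines * 100.0);
    let tlr = clamp_score(m.actual_tlr / m.required_tlr * 100.0);
    let complexity =
        clamp_score(100.0 - m.avg_complexity / m.complexity_threshold * 100.0);
    let stability = clamp_score(100.0 - m.churn_30d / 10.0 * 100.0);
    size * 0.30 + tlr * 0.40 + complexity * 0.20 + stability * 0.10
}
```

As a worked example with made-up inputs: a 250-line file (max 500) with TLR 0.4 against a required 0.8, average complexity 5 against a threshold of 20, and a 30-day churn of 2 gives component scores of 50, 50, 75, and 80. The weighted sum is 15 + 20 + 15 + 8 = 58.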

4. Health Grades

Grade   Score Range   Status
A+      95-100        Excellent
A       90-94         Great
B       80-89         Good
C       70-79         Acceptable
D       60-69         Needs Work
F       < 60          Critical
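
Mapping a score to a grade is a simple threshold lookup. This sketch uses a hypothetical `grade` function and assumes that fractional scores are graded by the lower bound of each band, so 94.9 is an A rather than an A+.

```rust
/// Map a health score (0-100) to a letter grade from the table above.
/// Assumption: bands are half-open on the lower bound, e.g. [90, 95) is "A".
pub fn grade(score: f64) -> &'static str {
    match score {
        s if s >= 95.0 => "A+",
        s if s >= 90.0 => "A",
        s if s >= 80.0 => "B",
        s if s >= 70.0 => "C",
        s if s >= 60.0 => "D",
        _ => "F",
    }
}
```

With this mapping, a score of 73 produces a C.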

Pre-commit Hook Enforcement

The pre-commit hook enforces two rules:

Rule 1: New Files Must Not Exceed 500 Lines

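# New file check
# ($file is the path being checked, LINES its current line count,
#  MAX_LINES_NEW the configured limit, 500 by default)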
if [ "$LINES" -gt "$MAX_LINES_NEW" ]; then
    echo "❌ NEW file $file has $LINES lines (max: $MAX_LINES_NEW)"
    exit 1
fi

Rule 2: Existing Files Cannot Grow (Ratchet)

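# Ratchet mechanism - Toyota Way Kaizen
# If the file does not exist at HEAD, git show prints nothing and wc -l
# reports 0, so no "|| echo 0" fallback is needed (it could never run anyway,
# because the pipeline's exit status is that of wc).
BASELINE=$(git show HEAD:"$file" 2>/dev/null | wc -l)
if [ "$BASELINE" -gt 0 ] && [ "$LINES" -gt "$BASELINE" ]; then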
    echo "❌ RATCHET: $file grew from $BASELINE to $LINES lines"
    echo "   Files can only shrink or stay the same (Toyota Way: Kaizen)"
    exit 1
fi
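
The two rules combine into a single decision per file. The sketch below restates that logic as a pure function so it can be unit-tested; it is not the hook's actual implementation. The `Verdict` enum and `check_file` function are hypothetical, and it assumes that a baseline of 0 means the file is new, so Rule 1 applies only in that case.

```rust
/// Outcome of the pre-commit size check for one file (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Ok,
    NewFileTooLarge { lines: usize, max: usize },
    Grew { baseline: usize, lines: usize },
}

/// Apply Rule 1 (new-file limit) and Rule 2 (ratchet).
/// Assumption: `baseline == 0` means the file does not exist at HEAD.
pub fn check_file(lines: usize, baseline: usize, max_lines_new: usize) -> Verdict {
    if baseline == 0 {
        // Rule 1: a new file must not exceed the limit.
        if lines > max_lines_new {
            Verdict::NewFileTooLarge { lines, max: max_lines_new }
        } else {
            Verdict::Ok
        }
    } else if lines > baseline {
        // Rule 2: an existing file may only shrink or stay the same.
        Verdict::Grew { baseline, lines }
    } else {
        Verdict::Ok
    }
}
```

Note that under this logic an existing 800-line file that shrinks to 790 lines passes even though it is still above 500: the ratchet only forbids growth, so oversized legacy files converge toward the limit over time.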

Installing the Pre-commit Hook

# Install PMAT hooks (includes file health check)
pmat hooks install

# Verify hook is installed
grep "File Health" .git/hooks/pre-commit

Compliance Check Output

When you run pmat comply check, the file health section shows:

📊 File Health Summary
├── 60 files >2000 lines (CRITICAL)
├── 117 files >1000 lines
├── 459 files >500 lines
├── Average Health Score: 73%
└── Grade: C

Priority Files for Refactoring:
1. analysis_utilities.rs (12,087 lines) - EMERGENCY
2. deep_context.rs (7,211 lines) - EMERGENCY
3. commands.rs (6,273 lines) - EMERGENCY
4. tools.rs (6,111 lines) - EMERGENCY

Toyota Way Principles

Jidoka (Built-in Quality)

Quality is built into the process through automated enforcement at commit time.

Kaizen (Continuous Improvement)

The ratchet mechanism ensures existing files never grow: every commit must leave a file the same size or smaller.

Muda (Waste Elimination)

Large files represent waste: duplicated logic, dead code, and cognitive overhead.

Genchi Genbutsu (Go and See)

File health metrics are based on actual measurements, not estimates.

Refactoring Strategies

When a file exceeds limits, use these strategies:

1. Extract Module

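// Before: large_file.rs (2000+ lines) holds validation, processing,
// and reporting code in a single file.

// After: the parent module only declares the submodules; the code lives in
// validation.rs, processing.rs, reporting.rs (~500 lines each)
mod validation;
mod processing;
mod reporting;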

2. Extract Trait

// Before: monolithic struct
struct LargeService;

impl LargeService {
    fn validate(&self) { /* ... */ }
    fn process(&self) { /* ... */ }
    fn report(&self) { /* ... */ }
}

// After: focused traits
trait Validator { fn validate(&self); }
trait Processor { fn process(&self); }
trait Reporter { fn report(&self); }

3. Extract Constants

// Before: inline constants throughout
const TIMEOUT: u64 = 30;
const MAX_RETRIES: u32 = 3;

// After: constants module
mod constants {
    pub const TIMEOUT: u64 = 30;
    pub const MAX_RETRIES: u32 = 3;
}

Peer-Reviewed References

  1. Nagappan, N., Ball, T. (2006). “Using Software Dependencies and Churn Metrics to Predict Field Failures.” IEEE TSE.
  2. Zimmermann, T., Nagappan, N. (2008). “Predicting Defects using Network Analysis on Dependency Graphs.” ICSE.
  3. Ohno, T. (1988). “Toyota Production System: Beyond Large-Scale Production.” Productivity Press.
  4. Bird, C., et al. (2011). “Don’t Touch My Code! Examining the Effects of Ownership on Software Quality.” FSE.
  5. Bacchelli, A., Bird, C. (2013). “Expectations, Outcomes, and Challenges of Modern Code Review.” ICSE.

Popperian Falsifiability

The file health system uses testable hypotheses:

Falsifiable Claims

  1. Claim: Files > 500 lines have higher defect rates
    • Test: Compare defect density in small vs large files
    • Threshold: Defect density is at least 2x higher in large files
  2. Claim: TLR < 0.5 correlates with bugs
    • Test: Track bug rates by TLR quartile
    • Threshold: Bottom quartile has at least 3x more bugs
  3. Claim: Ratchet prevents regression
    • Test: Measure average file size over 6 months
    • Threshold: Average must not increase
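
To make the first claim concrete, the check could look roughly like the sketch below. Everything here is an illustrative assumption: the `FileRecord` type, the function names, and the choice of defects per line as the density measure. It does not describe how PMAT collects or evaluates defect data.

```rust
/// One file's size and recorded defect count (hypothetical record type).
pub struct FileRecord {
    pub lines: usize,
    pub defects: usize,
}

/// Defects per line over a group of files; None if the group has no lines.
fn density<'a>(files: impl Iterator<Item = &'a FileRecord>) -> Option<f64> {
    let (lines, defects) =
        files.fold((0usize, 0usize), |(l, d), f| (l + f.lines, d + f.defects));
    if lines == 0 {
        None
    } else {
        Some(defects as f64 / lines as f64)
    }
}

/// Test claim 1: files above `cutoff` lines have at least `min_ratio` times
/// the defect density of the rest. Returns None when either group is empty,
/// because the claim cannot be evaluated without both.
pub fn large_file_claim_holds(
    files: &[FileRecord],
    cutoff: usize,
    min_ratio: f64,
) -> Option<bool> {
    let small = density(files.iter().filter(|f| f.lines <= cutoff))?;
    let large = density(files.iter().filter(|f| f.lines > cutoff))?;
    if small == 0.0 {
        // No defects in small files: the claim holds only if large files have any.
        return Some(large > 0.0);
    }
    Some(large / small >= min_ratio)
}
```

Calling `large_file_claim_holds(&records, 500, 2.0)` on a project's data would return `Some(false)` if large files turn out to be less than twice as defect-dense, which is exactly the outcome that would falsify the claim.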

Configuration

Configure file health thresholds in .pmat/project.toml:

[file-health]
max_lines_new = 500
max_lines_critical = 2000
required_tlr_scaling = true
enforce_ratchet = true

[file-health.thresholds]
optimal = 200
acceptable = 500
warning = 1000
critical = 2000

Summary

File Health enforcement prevents the accumulation of technical debt through:

  1. Hard limits on new file sizes (500 lines)
  2. Ratchet mechanism preventing growth of existing files
  3. TLR scaling requiring more tests for larger files
  4. Health scoring with actionable grades
  5. Pre-commit hooks for automated enforcement

This implements Toyota Way principles (Jidoka, Kaizen, Muda elimination) with evidence-based thresholds from peer-reviewed research.