Quality Gates

Trueno enforces rigorous quality gates following Toyota Production System principles (Jidoka, Genchi Genbutsu). This chapter documents the quality enforcement mechanisms implemented in TRUENO-SPEC-013.

Overview

Quality gates are automated checks that must pass before code can be committed or merged:

GateThresholdEnforcement
Test Coverage≥90% (95% for releases)Pre-commit hook
Mutation Score≥80%Tier 3 / Nightly
PMAT TDG GradeB+ (85/100)Pre-commit hook
Bashrs Linting0 errorsPre-commit hook
Smoke TestsAll passPre-merge

Coverage Requirements

Minimum Thresholds

Daily Commits:  ≥90% line coverage
Releases:       ≥95% line coverage (TRUENO-SPEC-013)

Running Coverage

# Generate coverage report (<5 minutes)
make coverage

# View HTML report
open target/coverage/html/index.html

Coverage Breakdown

The coverage report shows per-crate metrics:

trueno:     92.44%  (core library)
trueno-gpu: 93.12%  (GPU/CUDA backend)

Technical Notes

Coverage instrumentation requires disabling the mold linker:

# The Makefile handles this automatically:
# 1. Backs up ~/.cargo/config.toml
# 2. Runs tests with llvm-cov
# 3. Restores config

Smoke Tests (TRUENO-SPEC-013)

Smoke tests verify backend equivalence across SIMD, WGPU, and CUDA:

# Run all smoke tests
make smoke

# Individual backend tests
cargo test --test smoke_e2e smoke_simd -- --nocapture
cargo test --test smoke_e2e smoke_wgpu --features gpu -- --nocapture

Smoke Test Categories

  1. SIMD Backend Tests

    • Vector add, mul, dot product
    • ReLU, Softmax activations
    • L2 norm computation
  2. WGPU Backend Tests (requires GPU)

    • Vector operations (100K+ elements)
    • Matrix multiplication (256x256+)
  3. Backend Equivalence Tests

    • Scalar vs Auto backend comparison
    • Floating-point tolerance: 1e-5
  4. Edge Case Tests (Poka-Yoke)

    • Empty inputs
    • Single element
    • Non-aligned sizes (17 elements)
    • NaN/Infinity propagation

Pixel FKR Tests (Falsification Kernel Regression)

Pixel FKR tests catch GPU kernel bugs using Popperian falsification methodology:

# Run all pixel FKR tests
make pixel-fkr-all

# Individual suites
make pixel-scalar-fkr   # Baseline truth
make pixel-simd-fkr     # SIMD vs scalar
make pixel-wgpu-fkr     # WGPU vs scalar
make pixel-ptx-fkr      # PTX validation (CUDA)

FKR Test Suites

SuitePurposeTolerance
scalar-pixel-fkrGolden baselineExact
simd-pixel-fkrSIMD correctness±1 ULP
wgpu-pixel-fkrGPU correctness±2 ULP
ptx-pixel-fkrPTX validationStatic analysis

Realizer Operations Tested

  • RMS Normalization
  • SiLU Activation
  • Softmax
  • RoPE (Rotary Position Embedding)
  • Causal Mask
  • Q4_K Dequantization

Pre-Commit Hook

The pre-commit hook (.git/hooks/pre-commit) enforces all quality gates:

# Gates checked on every commit:
1. PMAT TDG regression check
2. PMAT TDG quality check (B+ minimum)
3. Bashrs linting (Makefile, shell scripts)
4. Coverage threshold (≥90%)
# Only for emergencies - document why in commit message
git commit --no-verify

Tiered Quality Workflow

Trueno uses a tiered approach inspired by certeza (97.7% mutation score):

Tier 1: On-Save (Sub-second)

make tier1
# Checks: cargo check, clippy (lib), unit tests, property tests (10 cases)

Tier 2: On-Commit (1-5 minutes)

make tier2
# Checks: fmt, full clippy, all tests, property tests (256 cases), coverage, TDG

Tier 3: On-Merge/Nightly (Hours)

make tier3
# Checks: tier2 + mutation testing (80%), security audit, full benchmarks

PMAT Integration

PMAT (Pragmatic AI Labs Multi-Agent Toolkit) provides Technical Debt Grading:

# Check TDG grade
pmat analyze tdg --min-grade B+

# Repository health score
pmat repo-score . --min-score 90

TDG Grading Scale

GradeScoreStatus
A93-100Excellent
A-90-92Very Good
B+85-89Good (minimum)
B80-84Acceptable
C<80Needs Work

Toyota Way Principles

Jidoka (Built-in Quality)

Quality is built in through:

  • Pre-commit hooks that stop defects immediately
  • Automated testing at every tier
  • No bypass without explicit override

Genchi Genbutsu (Go and See)

  • Smoke tests run actual code on real hardware
  • Pixel FKR tests verify visual output
  • No simulation - real GPU execution

Poka-Yoke (Error Prevention)

  • Edge case tests prevent common bugs
  • Type system enforces API contracts
  • Clippy warnings are errors

Quick Reference

# Full quality check
make quality-spec-013

# Coverage only
make coverage

# Smoke tests
make smoke

# Pixel FKR
make pixel-fkr-all

# Tier 2 (pre-commit)
make tier2

See Also