Academic basis

CCPA's design draws on several lines of prior work. Each is cited where its idea informs a specific gate or technique.

Distillation framing

Hinton et al., 1503.02531, Distilling the Knowledge in a Neural Network

CCPA treats claude as the teacher and apr code as the student. The "knowledge" being distilled is the action stream — sequences of tool calls, not output logits. This generalizes the original logit-distillation framing to the agentic-execution setting.

Metamorphic testing of ML systems

Segura et al., 2208.08227, METTLE: Metamorphic Testing of Deep Learning Systems

LLMORPH, 2603.23611, Cataloged Metamorphic Relations for NLP

A metamorphic relation says: "if input X maps to output Y, then the transformed input T(X) should map to f(Y)." CCPA's per-tool equivalence rules are metamorphic relations specialized to action streams. For example:

  • Bash(cmd) and Bash(canonical_form(cmd)) should produce equivalent file-system mutations
  • Write(path, content) and Edit(path, old, new) that produce the same file SHA256 are file-mutation-equivalent
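The Write/Edit relation above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not CCPA's implementation: the `Action` enum, `apply`, and `file_mutation_equivalent` are hypothetical names, the file is modeled as an in-memory string, and final contents are compared directly rather than by SHA256 (the standard library has no SHA-256, and equal bytes imply equal digests).

```rust
/// Hypothetical model of two file-mutating tool calls on a single file.
/// These are illustration-only types, not CCPA's actual definitions.
#[derive(Debug, Clone)]
enum Action {
    /// Replace the whole file with `content`.
    Write { content: String },
    /// Replace the first occurrence of `old` with `new`.
    Edit { old: String, new: String },
}

/// Apply one action to the file's prior contents and return the new contents.
fn apply(before: &str, action: &Action) -> String {
    match action {
        Action::Write { content } => content.clone(),
        Action::Edit { old, new } => before.replacen(old.as_str(), new.as_str(), 1),
    }
}

/// Two actions are treated as file-mutation-equivalent on `before` when they
/// leave the file in the same final state. A real check would compare SHA256
/// digests of the resulting files; this sketch compares the bytes directly.
fn file_mutation_equivalent(before: &str, a: &Action, b: &Action) -> bool {
    apply(before, a) == apply(before, b)
}
```

Under this sketch, a Write that stores `let x = 2;` and an Edit that replaces `1` with `2` in `let x = 1;` are equivalent, even though the two tool calls look nothing alike in the action stream.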

The DriftCategory taxonomy maps onto Segura's metamorphic-violation severity scale.

Differential testing

2207.11976, Differential Testing of Deep Learning Frameworks

CCPA is a differential test of apr code against claude: two implementations of the same logical specification (agentic coding), compared by paired-execution divergence. The static path's compute_parity_score is a differential-testing scoring function.
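To make "paired-execution divergence" concrete, here is one possible scoring function, written as a sketch. It is not the definition of compute_parity_score, which this page does not give; `positional_parity` is a hypothetical name, and position-by-position comparison is an assumption chosen for simplicity.

```rust
/// Hypothetical parity score (NOT CCPA's compute_parity_score): the fraction
/// of positions at which the two action streams agree, measured against the
/// longer stream so that missing or extra actions count as divergence.
/// Two empty streams are considered fully in parity and score 1.0.
fn positional_parity<T: PartialEq>(teacher: &[T], student: &[T]) -> f64 {
    let longest = teacher.len().max(student.len());
    if longest == 0 {
        return 1.0;
    }
    // Count aligned positions where both implementations took the same action.
    let matches = teacher
        .iter()
        .zip(student.iter())
        .filter(|(t, s)| t == s)
        .count();
    matches as f64 / longest as f64
}
```

Identical streams score 1.0. A student stream that substitutes one action and appends an extra one scores lower, because both the mismatch and the unmatched tail reduce the ratio. A production scorer would more plausibly compare actions under the per-tool equivalence rules rather than by strict equality.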

Function-scale outcome parity

Cassano et al., 2208.08227, MultiPL-E

evidence/phase-3/multipl-e-rust-scores.json records the M150 function-scale measurement (n=5, parity=1.0000) using the MultiPL-E-Rust HumanEval subset. The benchmark is unmodified from upstream.

Project-scale Arena

Jimenez et al., 2310.06770, SWE-bench

SWE-bench formalized the "can LLMs resolve real GitHub issues" measurement at project scale. CCPA's Phase 5 corpus is hand-curated in the SWE-bench style (real GitHub-issue Rust fixtures) but smaller (n=5), because operator-coordinated dispatch is costly. Phase 6's under-contract regime adds a compliance-cost dimension that SWE-bench does not address.

Chaos engineering for LLM systems

2505.03096, Chaos Engineering for LLM Systems

CCPA's regression-corpus design (deliberate drift, must-fail) is in the spirit of chaos engineering: introduce a known failure mode and verify the meter catches it. The M196-M224 4-bug stack is the empirical justification for this practice.
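The must-fail idea can be sketched as a check over the regression corpus: every fixture carries deliberate drift, so any fixture on which the meter reports no drift is a meter bug. The `FixtureResult` type, its fields, and `missed_drifts` are hypothetical names introduced for illustration, not CCPA's actual API.

```rust
/// Outcome of running the meter on one deliberately drifted fixture.
/// Hypothetical type for illustration only.
struct FixtureResult {
    /// Fixture identifier.
    name: &'static str,
    /// Whether the meter flagged the injected drift.
    drift_detected: bool,
}

/// Return the names of must-fail fixtures that the meter wrongly passed.
/// An empty result means the meter caught every injected failure.
fn missed_drifts(results: &[FixtureResult]) -> Vec<&'static str> {
    results
        .iter()
        .filter(|r| !r.drift_detected)
        .map(|r| r.name)
        .collect()
}
```

A harness built this way fails when `missed_drifts` is non-empty, which inverts the usual test polarity: a "green" meter run on the regression corpus is the error condition.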

Sovereignty / data-residency

No single paper drives the sovereignty gate (CCPA-006). The design is informed by the broader privacy-engineering literature on differential-privacy boundaries and the FedRAMP / HIPAA classes of "data must not leave the trust boundary" guarantees. The Tier3 SovereigntyViolation category formalizes the boundary.

Per-gate mapping

See docs/specifications/academic-basis.md for the per-gate citation table — every gate has a paper that motivated its design or that it specializes.