Academic basis
CCPA's design draws on several lines of prior work. Each is cited where its idea informs a specific gate or technique.
Distillation framing
Hinton et al., 1503.02531 — Distilling the Knowledge in a Neural Network
CCPA treats claude as the teacher and apr code as the student. The "knowledge" being distilled is the action stream — sequences of tool calls, not output logits. This generalizes the original logit-distillation framing to the agentic-execution setting.
Metamorphic testing of ML systems
Segura et al., 2208.08227 — METTLE: Metamorphic Testing of Deep Learning Systems
LLMORPH, 2603.23611 — Cataloged Metamorphic Relations for NLP
A metamorphic relation says: "if input X maps to output Y, then transformation T(X) should map to f(Y)." CCPA's per-tool equivalence rules are metamorphic relations specialized to action streams:
- Bash(cmd) and Bash(canonical_form(cmd)) should produce equivalent file-system mutations
- Write(path, content) and Edit(path, old, new) that produce the same file SHA256 are file-mutation-equivalent
- etc.
The DriftCategory taxonomy maps onto Segura's metamorphic-violation severity scale.
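The Write/Edit relation can be illustrated with a minimal sketch. It assumes that "file-mutation-equivalent" means the two resulting files have identical SHA-256 digests. The helpers apply_write, apply_edit, and file_mutation_equivalent are hypothetical stand-ins invented for this illustration, not CCPA's actual rule implementation.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def apply_write(path: Path, content: str) -> None:
    """Hypothetical stand-in for a Write(path, content) tool call."""
    path.write_text(content, encoding="utf-8")


def apply_edit(path: Path, old: str, new: str) -> None:
    """Hypothetical stand-in for Edit(path, old, new): replace the first match."""
    text = path.read_text(encoding="utf-8")
    if old not in text:
        raise ValueError("old string not found in file")
    path.write_text(text.replace(old, new, 1), encoding="utf-8")


def file_mutation_equivalent(initial: str, write_content: str,
                             old: str, new: str) -> bool:
    """Apply Write and Edit to two copies of the same starting file
    and compare the resulting digests."""
    with tempfile.TemporaryDirectory() as tmp:
        via_write = Path(tmp) / "via_write.rs"
        via_edit = Path(tmp) / "via_edit.rs"
        via_write.write_text(initial, encoding="utf-8")
        via_edit.write_text(initial, encoding="utf-8")
        apply_write(via_write, write_content)
        apply_edit(via_edit, old, new)
        return file_sha256(via_write) == file_sha256(via_edit)
```

Under this assumption, a full-file Write and a targeted Edit count as equivalent whenever they leave byte-identical files, regardless of which tool produced the change.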
Differential testing
2207.11976 — Differential Testing of Deep Learning Frameworks
CCPA is a differential test of apr code against claude: two implementations of the same logical specification (agentic coding), compared by paired-execution divergence. The static path's compute_parity_score is, in effect, a differential-testing scoring function.
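To make the idea of a differential-testing score concrete, here is a toy sketch. It is not the project's compute_parity_score, whose definition is not given here. The (tool, argument) encoding of actions and the positional-match scoring rule are both assumptions chosen only for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical encoding: (tool name, canonicalized argument).
Action = Tuple[str, str]


def toy_parity_score(teacher: Sequence[Action],
                     student: Sequence[Action]) -> float:
    """Fraction of positions where both streams take the same action.

    Dividing by the longer stream's length means extra or missing
    actions count as divergence.
    """
    if not teacher and not student:
        return 1.0  # two empty streams agree trivially
    matches = sum(1 for t, s in zip(teacher, student) if t == s)
    return matches / max(len(teacher), len(student))


teacher_stream = [("Read", "src/lib.rs"), ("Edit", "src/lib.rs"),
                  ("Bash", "cargo test"), ("Bash", "cargo fmt")]
student_stream = [("Read", "src/lib.rs"), ("Edit", "src/lib.rs"),
                  ("Bash", "cargo test"), ("Bash", "cargo clippy")]

score = toy_parity_score(teacher_stream, student_stream)  # 3 of 4 positions match
```

A positional comparison is the simplest choice. An alignment-based score would tolerate an inserted or dropped step, at the cost of a more complicated definition.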
Function-scale outcome parity
MultiPL-E, 2208.08227 — Cassano et al.
evidence/phase-3/multipl-e-rust-scores.json records the M150 function-scale measurement (n=5, parity=1.0000) using the MultiPL-E-Rust HumanEval subset. The benchmark is unmodified from upstream.
Project-scale Arena
SWE-bench, 2310.06770 — Jimenez et al.
SWE-bench formalized the "can LLMs resolve real GitHub issues" measurement at project scale. CCPA's Phase 5 corpus is hand-curated in the SWE-bench style (real GitHub-issue Rust fixtures) but smaller (n=5), because operator-coordinated dispatch is costly. Phase 6's under-contract regime adds a compliance-cost dimension that SWE-bench does not address.
Chaos engineering for LLM systems
2505.03096 — Chaos Engineering for LLM Systems
CCPA's regression-corpus design (deliberate drift, must-fail) is in the spirit of chaos engineering: introduce a known failure mode and verify the meter catches it. The M196-M224 4-bug stack is the empirical justification for this practice.
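The must-fail idea can be sketched as a check that runs a meter over fixtures containing deliberate drift and reports any fixture the meter lets through. Everything in this block is hypothetical: the fixture layout, the positional_parity meter, and the pass threshold are illustrative assumptions, not CCPA's regression corpus or scoring code.

```python
from typing import Callable, List, Sequence, Tuple

Action = Tuple[str, str]  # hypothetical (tool name, argument) encoding
# (fixture name, reference stream, deliberately drifted stream)
Fixture = Tuple[str, Sequence[Action], Sequence[Action]]
Meter = Callable[[Sequence[Action], Sequence[Action]], float]


def positional_parity(reference: Sequence[Action],
                      drifted: Sequence[Action]) -> float:
    """Toy meter: fraction of positions where the two streams agree."""
    if not reference and not drifted:
        return 1.0
    matches = sum(1 for r, d in zip(reference, drifted) if r == d)
    return matches / max(len(reference), len(drifted))


def undetected_drift(fixtures: Sequence[Fixture], meter: Meter,
                     pass_threshold: float = 1.0) -> List[str]:
    """Names of must-fail fixtures that the meter wrongly scores as passing."""
    return [name for name, ref, drifted in fixtures
            if meter(ref, drifted) >= pass_threshold]


MUST_FAIL: List[Fixture] = [
    ("swapped-tool",
     [("Read", "src/lib.rs"), ("Edit", "src/lib.rs")],
     [("Read", "src/lib.rs"), ("Write", "src/lib.rs")]),
    ("dropped-step",
     [("Bash", "cargo test"), ("Edit", "src/lib.rs")],
     [("Edit", "src/lib.rs")]),
]

missed = undetected_drift(MUST_FAIL, positional_parity)
```

An empty result means the meter caught every seeded drift. A meter that always reports perfect parity would return every fixture name, which is exactly the kind of silent meter bug a must-fail corpus is meant to expose.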
Sovereignty / data-residency
No single paper drives the sovereignty gate (CCPA-006). The design is informed by the broader privacy-engineering literature on differential-privacy boundaries and the FedRAMP / HIPAA classes of "data must not leave the trust boundary" guarantees. The Tier3 SovereigntyViolation category formalizes the boundary.
Per-gate mapping
See docs/specifications/academic-basis.md for the per-gate citation table — every gate has a paper that motivated its design or that it specializes.