What is CCPA?
CCPA — the Claude Code Parity for apr code harness — is a measurement system. It does one job: produce a falsifiable, contract-gated parity score between two AI coding agents.
- Teacher (the reference): Claude Code — Anthropic's official CLI, treated as the orchestrator and the action-stream baseline.
- Student (the sovereign system under test): apr code, a locally hosted, pure-Rust coding agent that runs against a local GGUF model with no data leaving the machine.
What "parity" means here
Parity is not "the two systems produce identical bytes." Parity is action-stream semantic equivalence under a per-tool rule set.
For each pair of trace records — teacher and student — the differ asks:
- Did they invoke the same logical tool? (Bash↔Bash, Write↔Write, etc.)
- Did the tool inputs differ in ways that matter? (commands semantically equivalent? file paths normalized? content byte-equal or text-equivalent?)
- Did the resulting file-system mutations agree? (hash-checked)
- Did the OS-event trace agree, modulo allowed nondeterminism?
The output is a parity score in [0.0, 1.0] plus, for any mismatch, a value from the closed DriftCategory enum. Both the score and the category are mechanically asserted by FALSIFY-CCPA-004 through FALSIFY-CCPA-008.
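To make the per-tool comparison concrete, here is a minimal sketch of the first two questions and the resulting score. It is not the ccpa-differ implementation. The ToolCall record, the DriftCategory variant names, the whitespace-insensitive rule for Bash, and the "matched records over total records" scoring formula are all assumptions made for illustration. The file-system hash check and the OS-event comparison are left out.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical, simplified trace record: one tool invocation.
#[derive(Debug, Clone)]
struct ToolCall {
    tool: String,         // logical tool name, e.g. "Bash" or "Write"
    path: Option<String>, // file path argument, if the tool takes one
    payload: String,      // command text or file content
}

/// Closed set of mismatch categories. The variant names are illustrative
/// assumptions; the real DriftCategory enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DriftCategory {
    ToolMismatch,
    PathMismatch,
    PayloadMismatch,
    MissingRecord,
}

/// Lexically normalize a path: drop "." and resolve ".." without touching disk.
fn normalize(p: &str) -> PathBuf {
    let mut out = PathBuf::new();
    for c in Path::new(p).components() {
        match c {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Collapse runs of whitespace so "cargo  test" and "cargo test" compare equal.
fn squash(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Compare one teacher/student pair under an assumed per-tool rule set.
fn compare(teacher: &ToolCall, student: &ToolCall) -> Option<DriftCategory> {
    if teacher.tool != student.tool {
        return Some(DriftCategory::ToolMismatch);
    }
    let t_path = teacher.path.as_deref().map(normalize);
    let s_path = student.path.as_deref().map(normalize);
    if t_path != s_path {
        return Some(DriftCategory::PathMismatch);
    }
    let same_payload = match teacher.tool.as_str() {
        // Assumed rule: shell commands are equivalent up to whitespace.
        "Bash" => squash(&teacher.payload) == squash(&student.payload),
        // Assumed rule: everything else must be byte-equal.
        _ => teacher.payload == student.payload,
    };
    if same_payload {
        None
    } else {
        Some(DriftCategory::PayloadMismatch)
    }
}

/// Score = matched records / total records, always within [0.0, 1.0].
fn parity(teacher: &[ToolCall], student: &[ToolCall]) -> (f64, Vec<DriftCategory>) {
    let total = teacher.len().max(student.len());
    if total == 0 {
        return (1.0, Vec::new());
    }
    let mut drifts: Vec<DriftCategory> = teacher
        .iter()
        .zip(student)
        .filter_map(|(t, s)| compare(t, s))
        .collect();
    // Records with no counterpart on the other side count as drift.
    let unpaired = total - teacher.len().min(student.len());
    drifts.extend(std::iter::repeat(DriftCategory::MissingRecord).take(unpaired));
    let score = (total - drifts.len()) as f64 / total as f64;
    (score, drifts)
}
```

Pairing records by position keeps the sketch short. A real differ would likely need an alignment step so that one extra student action does not shift every later pair.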
What CCPA is NOT
- Not a benchmark suite for general LLMs. The corpus is curated for the apr code ↔ claude parity question. SWE-bench, HumanEval, and similar exist for general benchmarking.
- Not a record-from-API tool. The original HTTPS-proxy recording path is intentionally out of scope since the M222 directive. claude is driven as a subprocess via session-based auth (claude login); CCPA does not use ANTHROPIC_API_KEY and does not call the Anthropic API directly.
- Not a unit-test framework for claude. It's a parity harness: the meter between two systems.
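One way to enforce the "no ANTHROPIC_API_KEY" rule at the process boundary is to strip the variable from the child environment before spawning. The sketch below only builds the command and does not run it. The function names are hypothetical, the arguments are whatever the caller supplies (no particular claude flags are assumed), and nothing here is taken from the CCPA source.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Build (but do not spawn) a command for the teacher CLI.
/// `args` are caller-supplied; this sketch assumes no specific claude flags.
fn teacher_command(args: &[&str]) -> Command {
    let mut cmd = Command::new("claude");
    cmd.args(args);
    // Explicitly unset the key so the child cannot inherit it from the parent
    // environment and has only the session created by `claude login` to use.
    cmd.env_remove("ANTHROPIC_API_KEY");
    cmd
}

/// Check that the command is configured to remove the API key variable.
fn api_key_removed(cmd: &Command) -> bool {
    cmd.get_envs()
        .any(|(key, value)| key == OsStr::new("ANTHROPIC_API_KEY") && value.is_none())
}
```

Building the command separately from spawning it means the environment rule can be asserted in a test without the claude binary being installed.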
Three deliverables, one repository
| Deliverable | What it is | Where it lives |
|---|---|---|
| The differ | ccpa-differ crate + ccpa diff / ccpa corpus CLI | crates/ccpa-differ/ |
| The Arena runner | ccpa-arena crate + ccpa-arena-bench binary | crates/ccpa-arena/ |
| The fixtures | Canonical, regression, project-scale, calibration-and-scale, under-contract | fixtures/ |
All three are governed by one contract YAML — see Methodology.