What is CCPA?

CCPA — the Claude Code Parity for apr code harness — is a measurement system. It does one job: produce a falsifiable, contract-gated parity score between two AI coding agents.

  • Teacher (the reference): Claude Code — Anthropic's official CLI, treated as the orchestrator and the action-stream baseline.
  • Student (the sovereign system under test): apr code — a locally-hosted, pure-Rust coding agent that runs against a local GGUF model with no data leaving the machine.

What "parity" means here

Parity is not "the two systems produce identical bytes." Parity is action-stream semantic equivalence under a per-tool rule set.

For each pair of trace records — teacher and student — the differ asks:

  • Did they invoke the same logical tool? (Bash ↔ Bash, Write ↔ Write, etc.)
  • Did the tool inputs differ in ways that matter? (commands semantically equivalent? file paths normalized? content byte-equal or text-equivalent?)
  • Did the resulting file-system mutations agree? (hash-checked)
  • Did the OS-event trace agree, modulo allowed nondeterminism?
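The first three questions can be sketched in Rust as a per-pair check. This is a minimal illustration, not the ccpa-differ implementation: the TraceRecord and PairVerdict types, their field names, and the normalization helpers are all hypothetical. Whitespace collapsing stands in for whatever "semantically equivalent" means in the real per-tool rule set, and the OS-event comparison is left out entirely.

```rust
use std::collections::BTreeMap;

/// One recorded tool invocation. All field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct TraceRecord {
    tool: String,                        // logical tool name, e.g. "Bash" or "Write"
    input: String,                       // command text or file content
    path: Option<String>,                // target path for file tools
    fs_hashes: BTreeMap<String, String>, // path -> content hash after the call
}

/// Answers to the first three questions for one teacher/student pair.
#[derive(Debug, PartialEq)]
struct PairVerdict {
    same_tool: bool,
    inputs_equivalent: bool,
    fs_mutations_agree: bool,
}

/// Collapse runs of whitespace so "ls   -la" and "ls -la" compare equal.
/// A stand-in for real command equivalence, which is an assumption here.
fn normalize_command(cmd: &str) -> String {
    cmd.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Strip leading "./" and trailing '/' so "./src/" and "src" compare equal.
fn normalize_path(p: &str) -> String {
    p.trim_start_matches("./").trim_end_matches('/').to_string()
}

/// Per-tool input rule: each tool decides which differences matter.
fn inputs_equivalent(t: &TraceRecord, s: &TraceRecord) -> bool {
    match t.tool.as_str() {
        "Bash" => normalize_command(&t.input) == normalize_command(&s.input),
        "Write" => {
            let same_path = t.path.as_deref().map(normalize_path)
                == s.path.as_deref().map(normalize_path);
            same_path && t.input == s.input // content must be byte-equal
        }
        _ => t.input == s.input,
    }
}

/// Compare one teacher record against the corresponding student record.
fn compare_pair(t: &TraceRecord, s: &TraceRecord) -> PairVerdict {
    let same_tool = t.tool == s.tool;
    PairVerdict {
        same_tool,
        // Inputs are only comparable when both sides used the same tool.
        inputs_equivalent: same_tool && inputs_equivalent(t, s),
        fs_mutations_agree: t.fs_hashes == s.fs_hashes,
    }
}
```

Under these assumptions, a teacher `Bash` call with input `ls   -la` and a student `Bash` call with input `ls -la` yield a verdict with all three fields true, while a `Bash` versus `Write` pair fails on the first question.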

The output is a parity score in [0.0, 1.0] plus, for any mismatch, a value from the closed DriftCategory enum. Both the score and the category are mechanically asserted by FALSIFY-CCPA-004 through FALSIFY-CCPA-008.
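One way to picture this output shape is the Rust sketch below. It rests on assumptions: the DriftCategory variant names are placeholders, and scoring as "fraction of matching record pairs" is only one plausible definition. The text does not give the real variants or formula; those are fixed by the contract.

```rust
/// Closed set of mismatch kinds. Variant names are hypothetical placeholders.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DriftCategory {
    ToolMismatch,
    InputMismatch,
    FsMutationMismatch,
    OsEventMismatch,
}

/// Outcome for one teacher/student record pair: None means the pair matched.
type PairOutcome = Option<DriftCategory>;

/// Fraction of matching pairs, always within [0.0, 1.0].
/// Scoring an empty trace as 1.0 is an assumed convention.
fn parity_score(outcomes: &[PairOutcome]) -> f64 {
    if outcomes.is_empty() {
        return 1.0;
    }
    let matched = outcomes.iter().filter(|o| o.is_none()).count();
    matched as f64 / outcomes.len() as f64
}

/// Distinct drift categories observed, in first-seen order.
fn drift_categories(outcomes: &[PairOutcome]) -> Vec<DriftCategory> {
    let mut seen = Vec::new();
    for cat in outcomes.iter().flatten() {
        if !seen.contains(cat) {
            seen.push(*cat);
        }
    }
    seen
}
```

Because the enum is closed, a `match` over DriftCategory is checked for exhaustiveness by the compiler, so adding a variant forces every consumer to handle it.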

What CCPA is NOT

  • Not a benchmark suite for general LLMs. The corpus is curated for the apr code ↔ claude parity question. SWE-bench, HumanEval, and similar suites exist for general benchmarking.
  • Not a record-from-API tool. The original HTTPS-proxy recording path is intentionally out of scope following the M222 directive. claude is driven as a subprocess via session-based auth (claude login); CCPA does not use ANTHROPIC_API_KEY and does not call the Anthropic API directly.
  • Not a unit-test framework for claude. It's a parity harness — the meter between two systems.
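The subprocess approach in the second bullet can be sketched in Rust as follows. This is an illustration under stated assumptions, not the harness's actual invocation: the function name teacher_command is hypothetical, the use of claude's `-p` (print) flag for a non-interactive run is an assumption, and explicitly removing ANTHROPIC_API_KEY from the child environment is one possible way to guarantee the session from claude login is used.

```rust
use std::process::Command;

/// Build (but do not run) a command that drives the `claude` CLI as a subprocess.
/// Assumption: `-p` requests a non-interactive run; the real flags are not given in the text.
fn teacher_command(prompt: &str) -> Command {
    let mut cmd = Command::new("claude");
    cmd.arg("-p").arg(prompt);
    // Assumption: drop the API key so the child relies on the `claude login` session.
    cmd.env_remove("ANTHROPIC_API_KEY");
    cmd
}
```

A caller would then invoke `.output()` or `.spawn()` on the returned Command and record the resulting action stream.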

Three deliverables, one repository

Deliverable       | What it is                                                                  | Where it lives
The differ        | ccpa-differ crate + ccpa diff / ccpa corpus CLI                             | crates/ccpa-differ/
The Arena runner  | ccpa-arena crate + ccpa-arena-bench binary                                  | crates/ccpa-arena/
The fixtures      | Canonical, regression, project-scale, calibration-and-scale, under-contract | fixtures/

All three are governed by one contract YAML — see Methodology.