M292 — Agent-Text-Loop detector

Date: 2026-05-21

Source PR: CCPA#260 (merged)

Companion PR: CCPA#261 (M293; env-var wiring)

What it adds

A new ArenaOutcome variant and an opt-in detector that catches the M291 failure signature (consecutive text-only turns) before the full 20-turn budget is consumed.

ArenaOutcome::AgentTextLoop

AgentTextLoop {
    consecutive_text_turns: u32,
    last_text_excerpt: String,    // first 200 chars of the most recent text turn
}

Captures the "talking but not acting" failure class distinctly from OracleFailedAfterMaxTurns.

ArenaSession::with_max_consecutive_text_turns(cap)

Builder method. cap=0 (default) disables the detector — preserves M287/M291 baseline behavior. Operators opt in per-run.
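
A minimal sketch of how such a builder could look, assuming a simplified ArenaSession with only two fields. The constructor `new`, the `max_turns` field, and `text_loop_detector_enabled` are hypothetical; the accessor name is inferred from the test names listed under Test coverage, not taken from the crate itself.

```rust
/// Simplified stand-in for the real session type (fields are assumptions).
#[derive(Debug, Clone)]
pub struct ArenaSession {
    max_turns: u32,
    /// 0 means the text-loop detector is disabled.
    max_consecutive_text_turns: u32,
}

impl ArenaSession {
    /// Hypothetical constructor: the detector starts disabled (cap = 0).
    pub fn new(max_turns: u32) -> Self {
        Self { max_turns, max_consecutive_text_turns: 0 }
    }

    /// Builder method: opt in by passing a cap greater than zero.
    pub fn with_max_consecutive_text_turns(mut self, cap: u32) -> Self {
        self.max_consecutive_text_turns = cap;
        self
    }

    /// Accessor for the configured cap.
    pub fn max_consecutive_text_turns(&self) -> u32 {
        self.max_consecutive_text_turns
    }

    /// Accessor for the turn budget.
    pub fn max_turns(&self) -> u32 {
        self.max_turns
    }

    /// Hypothetical helper: true only when the operator opted in.
    pub fn text_loop_detector_enabled(&self) -> bool {
        self.max_consecutive_text_turns > 0
    }
}
```

With this shape, `ArenaSession::new(20)` keeps baseline behavior, while `ArenaSession::new(20).with_max_consecutive_text_turns(5)` opts a single run in.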

AgentTextLoopState rolling counter

Parallel to ComplianceTrapState. Pure logic:

  • Text invocation → increment counter, snapshot the excerpt.
  • Non-text invocation (Bash/Read/Write/Edit/etc.) → reset counter, clear excerpt.
  • When counter reaches cap → return AgentTextLoop outcome with current excerpt.
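
The three rules above can be sketched as a small state machine. This is an illustration under stated assumptions, not the crate's code: `Invocation`, `TextLoopHit`, `observe`, and `excerpt` are hypothetical names, and the truncation rule (first 200 characters plus an ellipsis when the text is longer) is inferred from the field comment and the truncation test description.

```rust
/// Number of characters kept from the most recent text turn.
const EXCERPT_CHARS: usize = 200;

/// Hypothetical classification of a single agent turn.
#[derive(Debug, Clone, PartialEq)]
pub enum Invocation {
    Text(String),
    Tool, // Bash, Read, Write, Edit, etc.
}

/// Hypothetical payload mirroring the AgentTextLoop variant's fields.
#[derive(Debug, Clone, PartialEq)]
pub struct TextLoopHit {
    pub consecutive_text_turns: u32,
    pub last_text_excerpt: String,
}

#[derive(Debug, Default)]
pub struct AgentTextLoopState {
    cap: u32,
    consecutive_text_turns: u32,
    last_text_excerpt: String,
}

impl AgentTextLoopState {
    /// cap = 0 disables the detector entirely.
    pub fn new(cap: u32) -> Self {
        Self { cap, ..Self::default() }
    }

    /// Feed one turn; returns Some(hit) once the counter reaches the cap.
    pub fn observe(&mut self, invocation: &Invocation) -> Option<TextLoopHit> {
        match invocation {
            Invocation::Text(text) => {
                // Text turn: increment and snapshot the excerpt.
                self.consecutive_text_turns += 1;
                self.last_text_excerpt = excerpt(text);
            }
            Invocation::Tool => {
                // Tool turn: reset the counter and clear the excerpt.
                self.consecutive_text_turns = 0;
                self.last_text_excerpt.clear();
            }
        }
        if self.cap > 0 && self.consecutive_text_turns >= self.cap {
            Some(TextLoopHit {
                consecutive_text_turns: self.consecutive_text_turns,
                last_text_excerpt: self.last_text_excerpt.clone(),
            })
        } else {
            None
        }
    }
}

/// Keep the first EXCERPT_CHARS characters; mark truncation with an ellipsis.
pub fn excerpt(text: &str) -> String {
    let mut out: String = text.chars().take(EXCERPT_CHARS).collect();
    if text.chars().count() > EXCERPT_CHARS {
        out.push('…');
    }
    out
}
```

Counting with `chars()` rather than slicing bytes avoids panicking on a multi-byte character boundary, which matters because agent text is arbitrary UTF-8.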

Test coverage (7 new tests)

  • agent_text_loop_state_increments_on_text — counter increments, trap fires at cap
  • agent_text_loop_state_resets_on_non_text — Bash invocation resets the counter; subsequent text starts at 1
  • agent_text_loop_state_excerpt_truncates_long_text — 500-char input → excerpt ≤200 chars + ellipsis
  • run_agent_text_loop_disabled_by_default_preserves_baseline — cap=0 (default) → text-only turns run to max_turns → OracleFailedAfterMaxTurns
  • run_agent_text_loop_fires_at_cap_when_enabled — 5 text turns with cap=3 → AgentTextLoop after turn 3; history has 3 records
  • run_agent_text_loop_resets_counter_on_tool_use — 2 text + 1 bash + 2 text + 1 bash pattern → no trap (counter resets twice) → runs to max_turns
  • with_max_consecutive_text_turns_accessor_returns_configured_cap + max_consecutive_text_turns_default_is_zero_disabled

All 146 ccpa-arena lib tests still pass.
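
The run-level scenarios in the list above can be reproduced with a toy replay loop. This is a sketch under loud assumptions: `Turn`, `Outcome`, and `replay` are hypothetical, the oracle never passes in this model, and the script is repeated cyclically until the turn budget runs out.

```rust
/// Hypothetical two-way classification of a turn.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Turn {
    Text,
    Tool,
}

/// Reduced outcome set for the toy model (the real enum has more variants).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    OracleFailedAfterMaxTurns,
    AgentTextLoop { consecutive_text_turns: u32 },
}

/// Replays `script` cyclically for at most `max_turns` turns.
/// Returns the outcome and the number of history records written.
pub fn replay(script: &[Turn], max_turns: usize, cap: u32) -> (Outcome, usize) {
    let mut consecutive = 0u32;
    let mut history = 0usize;
    for turn in script.iter().cycle().take(max_turns) {
        history += 1;
        match turn {
            Turn::Text => consecutive += 1,
            Turn::Tool => consecutive = 0,
        }
        // cap == 0 means the detector is disabled.
        if cap > 0 && consecutive >= cap {
            let outcome = Outcome::AgentTextLoop { consecutive_text_turns: consecutive };
            return (outcome, history);
        }
    }
    (Outcome::OracleFailedAfterMaxTurns, history)
}
```

In this model, an all-text script with cap=3 stops after three history records, the same script with cap=0 runs the full budget, and a repeating two-text-then-tool pattern never reaches a cap of 3.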

Opt-in by design

The detector defaults to cap=0 (disabled) because:

  1. Existing benches in evidence/under-contract*/ should remain comparable to new runs — turning the detector on by default would change outcome distributions for control comparisons.
  2. Future operators may want to test agents at the full 20-turn budget for non-V1_004 reasons (e.g., turn-cost ratio measurement).
  3. The Phase 6 compliance_cost_ratio aggregate sums over a specific set of outcome variants; adding a new one to the default execution path could silently change the aggregate.

Operator interface (M293)

scripts/phase-6-bench.sh now reads PHASE6_MAX_CONSECUTIVE_TEXT_TURNS (default 0 = disabled). When the value is greater than 0, the script passes --max-consecutive-text-turns=N to the ccpa-arena-bench invocation.

# Default — baseline behavior, no detector
bash scripts/phase-6-bench.sh

# Opt in — bail at 5 consecutive text-only turns
PHASE6_MAX_CONSECUTIVE_TEXT_TURNS=5 bash scripts/phase-6-bench.sh

Why this matters

Before M292, the M291 failure signature ("agent emits text for all 20 turns, never invokes a tool") was conflated with OracleFailedAfterMaxTurns — same outcome variant as "agent worked but produced wrong output." That conflation lost signal.

After M292, an operator inspecting scores.json can distinguish:

  • OracleFailedAfterMaxTurns → agent tried, wrong output
  • AgentTextLoop → agent didn't engage at all
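
A sketch of how downstream tooling might separate the two failure classes. The enum here is a reduced stand-in: `OraclePassed` is a hypothetical success variant, the real ArenaOutcome has other variants and possibly different payloads, and `diagnose` and `oracle_passed` are illustrative helpers rather than crate functions.

```rust
/// Reduced stand-in for the real outcome enum (shape is an assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum ArenaOutcome {
    OraclePassed, // hypothetical success variant
    OracleFailedAfterMaxTurns,
    AgentTextLoop {
        consecutive_text_turns: u32,
        last_text_excerpt: String,
    },
}

/// Maps an outcome to the reading an operator would take from it.
pub fn diagnose(outcome: &ArenaOutcome) -> &'static str {
    match outcome {
        ArenaOutcome::OraclePassed => "oracle passed",
        ArenaOutcome::OracleFailedAfterMaxTurns => "agent tried, wrong output",
        ArenaOutcome::AgentTextLoop { .. } => "agent did not engage",
    }
}

/// Both failure variants count as "not oracle_passed".
pub fn oracle_passed(outcome: &ArenaOutcome) -> bool {
    matches!(outcome, ArenaOutcome::OraclePassed)
}
```

Keeping the pass/fail predicate separate from the diagnostic label means the new variant adds information without changing what counts as a pass.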

This diagnostic precision lets the next experiment be designed correctly: the M294 finetune-A/B was scoped specifically around the M291 text-loop signature that M292 now measures.

What this does NOT do

  • Doesn't auto-enable in scripts/phase-6-bench.sh (operator decision per-run).
  • Doesn't change compliance_cost_ratio / recovery_rate semantics (AgentTextLoop counts as "not oracle_passed", same as OracleFailedAfterMaxTurns).
  • Doesn't discharge V1_004 — student_pass_rate > 0 is still the bar.