apr-diagnose

Automated Five Whys root-cause analysis on training checkpoints. Detects symptoms (high loss, NaN gradients, slow convergence, memory spikes, overfitting) and traces through a 5-level diagnostic chain to the root cause.

cargo run --example cli_apr_diagnose -- --demo

CLI equivalent: apr diagnose checkpoint_epoch_10.apr