trueno-cuda-edge: GPU Edge-Case Testing
trueno-cuda-edge is a GPU edge-case test framework that applies Popperian falsificationism to CUDA/GPU code: each test tries to refute a stated claim about the code rather than confirm it. It provides five falsification frameworks (F1 through F5) and a 50-point verification checklist.
Overview
GPU code is notoriously difficult to test due to:
- Non-deterministic behavior
- Hardware-dependent edge cases
- Complex lifecycle management
- Numerical precision variations
trueno-cuda-edge addresses these challenges with systematic falsification testing that integrates with batuta’s orchestration pipelines.
Integration with Batuta
Batuta orchestrates GPU workloads across the Sovereign AI Stack. trueno-cuda-edge validates that these orchestrations handle GPU edge cases correctly.
Pipeline Validation
Use trueno-cuda-edge to validate batuta’s GPU backend selection:
use trueno_cuda_edge::shmem_prober::{ComputeCapability, shared_memory_limit, check_allocation};
// Validate backend selection considers GPU capabilities
let ampere = ComputeCapability::new(8, 0);
assert_eq!(shared_memory_limit(ampere), 164 * 1024); // 164 KB
// Check allocation fits before dispatching
check_allocation(ampere, 128 * 1024)?;
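To make the idea of a capability-dependent limit concrete, here is a minimal, self-contained sketch of how such a lookup and pre-dispatch check could work. It does not use the crate: `Capability`, the table contents, and the error strings are hypothetical, and only the 8.0 entry (164 KB) comes from the example above. The other entries are per-SM figures as listed in NVIDIA's compute capability tables and should be checked against your target hardware.

```rust
/// Hypothetical stand-in for a compute capability (major.minor).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capability {
    pub major: u32,
    pub minor: u32,
}

/// Shared memory capacity in bytes for a few architectures.
/// Returns None for capabilities this illustrative table does not cover.
pub fn shared_memory_limit(cc: Capability) -> Option<usize> {
    let kib = match (cc.major, cc.minor) {
        (7, 5) => 64,  // Turing
        (8, 0) => 164, // Ampere (value used in the example above)
        (8, 6) => 100, // Ampere (consumer parts)
        (9, 0) => 228, // Hopper
        _ => return None,
    };
    Some(kib * 1024)
}

/// Rejects a request that exceeds the limit, or whose capability is unknown,
/// before any kernel is dispatched.
pub fn check_allocation(cc: Capability, bytes: usize) -> Result<(), String> {
    let limit = shared_memory_limit(cc)
        .ok_or_else(|| format!("unknown compute capability {}.{}", cc.major, cc.minor))?;
    if bytes > limit {
        Err(format!("requested {} bytes exceeds limit of {} bytes", bytes, limit))
    } else {
        Ok(())
    }
}
```

Returning `None` for unknown capabilities, rather than a default, keeps a missing table entry from silently passing validation.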
Null Pointer Safety
Prevent null pointer bugs in GPU memory operations:
use trueno_cuda_edge::null_fuzzer::{NonNullDevicePtr, InjectionStrategy, NullFuzzerConfig};
// Type-safe device pointer that rejects null at construction
let ptr = NonNullDevicePtr::<f32>::new(0x7f00_0000_0000)?;
assert!(NonNullDevicePtr::<f32>::new(0).is_err());
// Fault injection for testing error handling
let config = NullFuzzerConfig {
    strategy: InjectionStrategy::Periodic { interval: 10 },
    total_calls: 1000,
    fail_fast: false,
};
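The two ideas in this section, a pointer type that cannot hold null and periodic fault injection, can be sketched without the crate. The following is an illustration under stated assumptions, not the crate's implementation: `DevicePtr`, `PeriodicInjector`, and `count_rejections` are hypothetical names, and "every `interval`-th call returns a null address" is an assumed reading of the `Periodic` strategy.

```rust
use std::num::NonZeroU64;

/// Hypothetical non-null device pointer: the address is stored as a
/// NonZeroU64, so a null address cannot be represented at all.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DevicePtr(NonZeroU64);

impl DevicePtr {
    /// Returns an error for address 0 instead of constructing a pointer.
    pub fn new(addr: u64) -> Result<Self, &'static str> {
        NonZeroU64::new(addr).map(DevicePtr).ok_or("null device pointer")
    }

    pub fn addr(self) -> u64 {
        self.0.get()
    }
}

/// Hypothetical periodic injector: every `interval`-th call yields a null
/// address (0) in place of the real one.
pub struct PeriodicInjector {
    interval: u64,
    calls: u64,
}

impl PeriodicInjector {
    pub fn new(interval: u64) -> Self {
        assert!(interval > 0, "interval must be positive");
        Self { interval, calls: 0 }
    }

    /// Passes `addr` through unchanged, except on every `interval`-th call.
    pub fn next_addr(&mut self, addr: u64) -> u64 {
        self.calls += 1;
        if self.calls % self.interval == 0 { 0 } else { addr }
    }
}

/// Pushes `total_calls` simulated allocations through the injector and counts
/// how many were rejected by `DevicePtr::new`.
pub fn count_rejections(interval: u64, total_calls: u64) -> u64 {
    let mut injector = PeriodicInjector::new(interval);
    (0..total_calls)
        .filter(|_| DevicePtr::new(injector.next_addr(0x7f00_0000_0000)).is_err())
        .count() as u64
}
```

With an interval of 10 and 1000 calls, one call in ten is turned into a null address, so the code under test has to handle 100 rejected pointers.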
ML Converter Quantization Parity
Validate CPU/GPU numerical parity in batuta’s ML converters:
use trueno_cuda_edge::quant_oracle::{QuantFormat, check_values_parity, ParityConfig};
// Format-specific tolerances
assert_eq!(QuantFormat::Q4K.tolerance(), 0.05); // 5% for 4-bit
assert_eq!(QuantFormat::Q6K.tolerance(), 0.01); // 1% for 6-bit
// Compare CPU and GPU results.
// `cpu_values` and `gpu_values` hold the outputs of the same computation on each backend.
let config = ParityConfig::new(QuantFormat::Q4K);
let report = check_values_parity(&cpu_values, &gpu_values, &config);
assert!(report.passed());
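For readers who want to see what a tolerance-based parity check involves, here is a minimal standalone sketch. It is not the crate's algorithm: `ParityReport` and `check_parity` are hypothetical, and the sketch assumes the tolerance is a relative error measured against the CPU value, with a small floor to avoid division by zero. Non-finite values that are not exactly equal are counted as mismatches.

```rust
/// Hypothetical parity report: how many elements were compared and how many
/// exceeded the tolerance.
#[derive(Debug, Default, PartialEq)]
pub struct ParityReport {
    pub compared: usize,
    pub mismatches: usize,
    pub max_rel_error: f32,
}

impl ParityReport {
    pub fn passed(&self) -> bool {
        self.mismatches == 0
    }
}

/// Compares CPU and GPU outputs element-wise.
/// `tolerance` is a fraction (0.05 means 5%).
pub fn check_parity(cpu: &[f32], gpu: &[f32], tolerance: f32) -> Result<ParityReport, String> {
    if cpu.len() != gpu.len() {
        return Err(format!("length mismatch: {} vs {}", cpu.len(), gpu.len()));
    }
    let mut report = ParityReport::default();
    for (&c, &g) in cpu.iter().zip(gpu) {
        report.compared += 1;
        // Exact matches (including equal infinities) need no further checks.
        if c == g {
            continue;
        }
        // NaN or a one-sided infinity can never be within tolerance.
        if !c.is_finite() || !g.is_finite() {
            report.mismatches += 1;
            continue;
        }
        // Relative error against the CPU reference, with a floor near zero.
        let rel = (c - g).abs() / c.abs().max(1e-6);
        report.max_rel_error = report.max_rel_error.max(rel);
        if rel > tolerance {
            report.mismatches += 1;
        }
    }
    Ok(report)
}
```

The same pair of outputs can pass at a 5% tolerance and fail at 1%, which is why the tolerance is tied to the quantization format rather than fixed globally.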
PTX Kernel Validation
Validate PTX kernels generated by trueno:
use trueno_cuda_edge::ptx_poison::{PtxVerifier, PtxMutator, default_mutators};
// `ptx_source` is the PTX text generated by trueno
let verifier = PtxVerifier::new();
// Structural verification (6 checks)
let verified = verifier.verify(ptx_source)?;
// Mutation testing with 8 operators
let mutators = default_mutators();
let mutated = PtxMutator::FlipAddSub.apply(ptx_source);
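Mutation testing deliberately breaks the kernel and checks that the test suite notices. The sketch below shows one way an add/sub flip and a mutation score could be computed on PTX text. It is an illustration only: `flip_add_sub` and `mutation_score` are hypothetical functions, not the crate's `PtxMutator`, and the flip is simplified to opcodes at the start of a line (predicated instructions such as `@%p1 add.s32 ...` are left untouched).

```rust
/// Swaps `add.*` and `sub.*` opcodes at the start of each PTX instruction
/// line, preserving indentation. All other lines are copied unchanged.
pub fn flip_add_sub(ptx: &str) -> String {
    ptx.lines()
        .map(|line| {
            // Split the line into leading whitespace and the instruction body.
            let indent_len = line.len() - line.trim_start().len();
            let (indent, body) = line.split_at(indent_len);
            if let Some(rest) = body.strip_prefix("add.") {
                format!("{}sub.{}", indent, rest)
            } else if let Some(rest) = body.strip_prefix("sub.") {
                format!("{}add.{}", indent, rest)
            } else {
                line.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Mutation score = killed mutants / total mutants. A mutant counts as
/// "killed" when the oracle rejects it (returns false).
pub fn mutation_score<F: Fn(&str) -> bool>(mutants: &[String], oracle_accepts: F) -> f64 {
    if mutants.is_empty() {
        return 0.0;
    }
    let killed = mutants.iter().filter(|m| !oracle_accepts(m.as_str())).count();
    killed as f64 / mutants.len() as f64
}
```

A mutant that the oracle still accepts points at a gap in the tests: the kernel's behaviour changed and nothing failed.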
Falsification Frameworks
F1: Null Pointer Sentinel Fuzzer
- NonNullDevicePtr<T>: Type-safe device pointer
- InjectionStrategy: Periodic, SizeThreshold, Probabilistic, Targeted
- NullSentinelFuzzer: State machine for null injection
F2: Shared Memory Boundary Prober
- ComputeCapability: GPU capability detection
- shared_memory_limit(): SM-specific limits
- check_allocation(): Validate before dispatch
F3: Context Lifecycle Chaos
- ChaosScenario: 8 lifecycle edge cases
- ContextLeakDetector: Memory leak detection
- 1 MB tolerance for driver allocations
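The 1 MB tolerance exists because the driver may retain some memory after a context is destroyed, so a leak check has to allow a small residue. A minimal sketch of that comparison follows; `LeakCheck` and its methods are hypothetical and not the crate's `ContextLeakDetector`. In real code the free-memory readings would come from the driver (for example via `cuMemGetInfo`); here they are passed in as plain numbers so the logic can be tested without a GPU.

```rust
/// Residue the driver is allowed to keep after a context is destroyed.
pub const DRIVER_TOLERANCE_BYTES: u64 = 1024 * 1024; // 1 MB

/// Hypothetical leak check: records free device memory before a context is
/// created and compares it with free memory after the context is destroyed.
pub struct LeakCheck {
    free_before: u64,
}

impl LeakCheck {
    pub fn start(free_before: u64) -> Self {
        Self { free_before }
    }

    /// Returns Err(leaked_bytes) when more than the tolerance is still held.
    /// If free memory went up, `saturating_sub` clamps the leak to zero.
    pub fn finish(&self, free_after: u64) -> Result<(), u64> {
        let leaked = self.free_before.saturating_sub(free_after);
        if leaked > DRIVER_TOLERANCE_BYTES {
            Err(leaked)
        } else {
            Ok(())
        }
    }
}
```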
F4: Quantization Parity Oracle
- QuantFormat: Q4K, Q5K, Q6K, Q8_0, F16, F32
- BoundaryValueGenerator: Edge case inputs
- check_values_parity(): CPU/GPU comparison
F5: PTX Compilation Poison Trap
- PtxVerifier: 6 structural checks
- PtxMutator: 8 mutation operators
- Mutation score calculation
50-Point Falsification Protocol
Track verification coverage:
use trueno_cuda_edge::falsification::{FalsificationReport, all_claims};
let mut report = FalsificationReport::new();
// Mark claims as verified during testing
report.mark_verified("NF-001"); // Null fuzzer claim
report.mark_verified("QO-001"); // Quantization oracle claim
// Track coverage
println!("Coverage: {:.1}%", report.coverage() * 100.0);
assert!(report.coverage() >= 0.80); // 80% minimum for release
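Coverage here is the fraction of checklist claims that have been marked verified. A minimal sketch of such a tracker is shown below. `CoverageReport` is a hypothetical type, not the crate's `FalsificationReport`, and rejecting unknown claim IDs is a design choice of this sketch. Any claim IDs other than `NF-001` and `QO-001` that appear in usage are made up for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical coverage tracker over a fixed set of claim IDs.
pub struct CoverageReport {
    claims: BTreeSet<String>,
    verified: BTreeSet<String>,
}

impl CoverageReport {
    /// Builds a report from the full list of claim IDs in the checklist.
    pub fn new<I, S>(claims: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            claims: claims.into_iter().map(Into::into).collect(),
            verified: BTreeSet::new(),
        }
    }

    /// Marks a claim as verified. Returns false for IDs that are not in the
    /// checklist, so a typo cannot inflate coverage.
    pub fn mark_verified(&mut self, id: &str) -> bool {
        if self.claims.contains(id) {
            self.verified.insert(id.to_string());
            true
        } else {
            false
        }
    }

    /// Fraction of claims verified, in the range 0.0 to 1.0.
    pub fn coverage(&self) -> f64 {
        if self.claims.is_empty() {
            return 0.0;
        }
        self.verified.len() as f64 / self.claims.len() as f64
    }

    /// Claim IDs still unverified, in sorted order.
    pub fn unverified(&self) -> Vec<&str> {
        self.claims.difference(&self.verified).map(String::as_str).collect()
    }
}
```

Listing the unverified claims alongside the percentage makes it clear which checks still block an 80% release gate.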
Supervision Integration
Erlang OTP-style supervision for GPU workers:
use trueno_cuda_edge::supervisor::{
    SupervisionStrategy, SupervisionTree, GpuHealthMonitor, HeartbeatStatus
};
// OneForOne: isolated restarts
let mut tree = SupervisionTree::new(SupervisionStrategy::OneForOne, 4);
// Health monitoring
let monitor = GpuHealthMonitor::builder()
    .max_missed(3)
    .throttle_temp(85)
    .shutdown_temp(95)
    .build();
// Check worker health
let action = monitor.check_status(HeartbeatStatus::MissedBeats(2));
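The thresholds above suggest a simple decision rule: tolerate a few missed heartbeats, throttle when hot, shut down when too hot. The sketch below shows one plausible mapping from status to action. It is an assumption-laden illustration: `Heartbeat`, `Action`, and `HealthPolicy` are hypothetical, the `Alive { temp_c }` variant is invented for this example, and treating each threshold as inclusive (`>=`) is a guess rather than documented crate behaviour.

```rust
/// Hypothetical heartbeat status reported by a GPU worker.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Heartbeat {
    Alive { temp_c: u32 },
    MissedBeats(u32),
}

/// Hypothetical action the supervisor takes in response.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Continue,
    Throttle,
    Restart,
    Shutdown,
}

/// Thresholds matching the builder values in the example above.
pub struct HealthPolicy {
    pub max_missed: u32,
    pub throttle_temp: u32,
    pub shutdown_temp: u32,
}

impl HealthPolicy {
    /// Maps a status to an action. Thresholds are treated as inclusive.
    pub fn decide(&self, status: Heartbeat) -> Action {
        match status {
            // Too many missed beats: restart only this worker (OneForOne).
            Heartbeat::MissedBeats(n) if n >= self.max_missed => Action::Restart,
            Heartbeat::MissedBeats(_) => Action::Continue,
            // Check the shutdown threshold first, since it is the hotter one.
            Heartbeat::Alive { temp_c } if temp_c >= self.shutdown_temp => Action::Shutdown,
            Heartbeat::Alive { temp_c } if temp_c >= self.throttle_temp => Action::Throttle,
            Heartbeat::Alive { .. } => Action::Continue,
        }
    }
}
```

Under this rule, two missed beats with `max_missed` set to 3 leaves the worker running, while a third would trigger a restart.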