Benchmark Comparison

This example demonstrates how to compare performance across different implementations and configurations in the aprender ecosystem.

Overview

The bench_comparison example provides a standardized way to measure and compare:

  • CPU vs GPU performance
  • Different quantization levels (Q4_K, Q8, F16, F32)
  • Inference throughput (tokens per second)
  • Memory bandwidth utilization
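A comparison of this kind boils down to timing each variant under the same conditions. The following is a minimal sketch of such a timing loop, not the code used by bench_comparison. The names `best_of`, `dot_f32`, and `dot_f64_acc` are hypothetical and exist only for illustration; the two dot-product functions stand in for "different implementations" of the same operation.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `f` once as a warm-up, then `runs` more times,
/// and returns the fastest timed run. Returns `None` when `runs` is zero.
pub fn best_of<F: FnMut()>(runs: u32, mut f: F) -> Option<Duration> {
    f(); // warm-up call, not timed
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .min()
}

/// Stand-in implementation A: dot product accumulated in f32.
pub fn dot_f32(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Stand-in implementation B: dot product accumulated in f64, then narrowed.
pub fn dot_f64_acc(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| f64::from(*x) * f64::from(*y))
        .sum::<f64>() as f32
}
```

To compare the two variants, pass each one to `best_of` with the same inputs and the same run count, then compare the returned durations. Wrapping inputs and results in `std::hint::black_box` helps keep the optimizer from removing the work being measured.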

Running the Benchmark

cargo run --release --example bench_comparison

Key Metrics

Metric      Description
tok/s       Tokens generated per second
Bandwidth   Memory throughput (GB/s)
Latency     Time per token (ms)
Efficiency  % of theoretical peak
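The four metrics are related by simple arithmetic, which the sketch below works through. It is an illustration under stated assumptions, not the example's actual implementation: `RunMetrics` and `derive_metrics` are hypothetical names, 1 GB is taken as 10^9 bytes, and efficiency is assumed to mean achieved memory bandwidth relative to a peak bandwidth supplied by the caller.

```rust
use std::time::Duration;

/// Hypothetical container for the metrics of one benchmark run.
#[derive(Debug, Clone, PartialEq)]
pub struct RunMetrics {
    pub tokens_per_sec: f64, // tok/s
    pub latency_ms: f64,     // average time per token, in ms
    pub bandwidth_gb_s: f64, // memory throughput, in GB/s (1 GB = 1e9 bytes)
    pub efficiency_pct: f64, // achieved bandwidth as % of the supplied peak
}

/// Derives the metrics from raw measurements.
/// Returns `None` if the inputs would cause a division by zero.
pub fn derive_metrics(
    tokens: u64,
    elapsed: Duration,
    bytes_moved: u64,
    peak_gb_s: f64, // assumption: theoretical peak memory bandwidth of the device
) -> Option<RunMetrics> {
    let secs = elapsed.as_secs_f64();
    if tokens == 0 || secs <= 0.0 || peak_gb_s <= 0.0 {
        return None;
    }
    let tokens_per_sec = tokens as f64 / secs;
    let latency_ms = secs * 1000.0 / tokens as f64;
    let bandwidth_gb_s = bytes_moved as f64 / secs / 1e9;
    let efficiency_pct = 100.0 * bandwidth_gb_s / peak_gb_s;
    Some(RunMetrics {
        tokens_per_sec,
        latency_ms,
        bandwidth_gb_s,
        efficiency_pct,
    })
}
```

For example, 100 tokens generated in 2 seconds while moving 4 × 10^9 bytes, on a device with an assumed 10 GB/s peak, gives 50 tok/s, 20 ms per token, 2 GB/s, and 20% efficiency.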

See Also