Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Entropy Routing

trueno-ublk uses Shannon entropy analysis to route data to the optimal compression backend.

Shannon Entropy

Shannon entropy measures the information density of data in bits per byte. The formula is:

H(X) = -Σ p(x) * log2(p(x))

Where p(x) is the probability of each byte value occurring.

Entropy (bits/byte)Data TypeCompressibility
0.0All same valueExtremely high
1.0-3.0Simple patternsVery high
4.0-6.0Text, codeGood
7.0-7.5Compressed dataPoor
8.0Random/encryptedNone

Routing Strategy

trueno-ublk routes pages based on their entropy:

┌──────────────────────────────────────────────────────────┐
│                    Incoming Page                          │
│                         │                                 │
│                    Calculate Entropy                      │
│                         │                                 │
│           ┌─────────────┼─────────────┐                  │
│           ▼             ▼             ▼                  │
│      entropy < 4.0  4.0 ≤ e ≤ 7.0  entropy > 7.0        │
│           │             │             │                  │
│           ▼             ▼             ▼                  │
│      GPU Batch      SIMD Path    Scalar/Skip            │
│    (highly comp.)  (normal data)  (incompress.)         │
└──────────────────────────────────────────────────────────┘

GPU Batch Path (entropy < 4.0)

  • Used for highly compressible data
  • Benefits from GPU parallelism
  • Batches multiple pages for efficiency
  • Best for: zeros, repeating patterns, sparse data

SIMD Path (4.0 ≤ entropy ≤ 7.0)

  • Used for typical data
  • AVX2/AVX-512/NEON acceleration
  • Single-page processing
  • Best for: text, code, structured data

Scalar/Skip Path (entropy > 7.0)

  • Used for incompressible data
  • Avoids wasting CPU cycles
  • May store uncompressed
  • Best for: encrypted data, already-compressed media

Configuration

Set the entropy threshold when creating a device:

#![allow(unused)]
fn main() {
use trueno_ublk::BlockDevice;
use trueno_zram_core::{Algorithm, CompressorBuilder};

let compressor = CompressorBuilder::new()
    .algorithm(Algorithm::Lz4)
    .build()?;

// Lower threshold = more aggressive SIMD usage
let device = BlockDevice::with_entropy_threshold(
    1 << 30,    // Size
    compressor,
    6.5,        // Custom threshold (default is 7.0)
);
}

Monitoring Entropy

Use the CLI to analyze device entropy:

# Show entropy distribution
trueno-ublk entropy /dev/ublkb0

# Output:
# Entropy Distribution:
#   0.0-2.0 bits: ████████████████████ 45% (GPU)
#   2.0-4.0 bits: ████████ 18% (GPU)
#   4.0-6.0 bits: ██████████ 22% (SIMD)
#   6.0-7.0 bits: ████ 10% (SIMD)
#   7.0-8.0 bits: ██ 5% (Scalar)
#
# Average entropy: 3.2 bits/byte
# Recommended threshold: 7.0

Statistics

Track routing decisions via stats:

#![allow(unused)]
fn main() {
let stats = device.stats();

println!("Routing Statistics:");
println!("  GPU pages:    {} ({:.1}%)",
    stats.gpu_pages,
    100.0 * stats.gpu_pages as f64 / stats.pages_stored as f64);
println!("  SIMD pages:   {} ({:.1}%)",
    stats.simd_pages,
    100.0 * stats.simd_pages as f64 / stats.pages_stored as f64);
println!("  Scalar pages: {} ({:.1}%)",
    stats.scalar_pages,
    100.0 * stats.scalar_pages as f64 / stats.pages_stored as f64);
println!("  Zero pages:   {} ({:.1}%)",
    stats.zero_pages,
    100.0 * stats.zero_pages as f64 / stats.pages_stored as f64);
}

Zero-Page Optimization

All-zero pages receive special handling regardless of entropy:

#![allow(unused)]
fn main() {
// Zero pages are detected and deduplicated
let zeros = vec![0u8; PAGE_SIZE];
device.write(0, &zeros)?;
device.write(PAGE_SIZE as u64, &zeros)?;
device.write(2 * PAGE_SIZE as u64, &zeros)?;

let stats = device.stats();
// All three pages share the same zero-page representation
assert_eq!(stats.zero_pages, 3);
}

This achieves effective 2048:1 compression for sparse data like freshly-allocated memory.