KV Store Compression (GH-5)
Trueno-DB provides transparent LZ4/ZSTD compression for KV stores, reducing memory footprint by 5-10x for typical LLM KV caches.
Quick Start
use trueno_db::kv::{CompressedKvStore, Compression, KvStore, MemoryKvStore};
#[tokio::main]
async fn main() -> trueno_db::Result<()> {
// Wrap any KvStore with transparent compression
let inner = MemoryKvStore::new();
let store = CompressedKvStore::new(inner, Compression::Lz4);
// Use like any other KvStore - compression is transparent
store.set("key", b"value".to_vec()).await?;
let value = store.get("key").await?;
Ok(())
}
Compression Algorithms
LZ4 (Default)
- Speed: ~500 MB/s compression, ~1.5 GB/s decompression
- Ratio: 2-4x for typical data
- Use case: Real-time KV caches, low-latency requirements
let store = CompressedKvStore::new(inner, Compression::Lz4);
ZSTD
- Speed: ~150 MB/s compression, ~500 MB/s decompression
- Ratio: 3-6x for typical data
- Use case: Storage-constrained environments, cold storage
let store = CompressedKvStore::new(inner, Compression::Zstd);
LLM KV Cache Use Case
LLM attention mechanisms cache key/value tensors for each layer. For a typical model:
| Component | Size per Token | 512 Tokens | 2048 Tokens |
|---|---|---|---|
| Uncompressed | ~4 KB | 2 MB | 8 MB |
| LZ4 | ~1.5 KB | 768 KB | 3 MB |
| ZSTD | ~1 KB | 512 KB | 2 MB |
Example: 12-Layer Model
use trueno_db::kv::{CompressedKvStore, Compression, KvStore, MemoryKvStore};
let store = CompressedKvStore::new(MemoryKvStore::new(), Compression::Lz4);
// Store KV cache for each layer
for layer in 0..12 {
let kv_cache: Vec<u8> = compute_attention(layer);
store.set(&format!("layer:{layer}:kv"), kv_cache).await?;
}
// Memory usage reduced by ~3x with LZ4
API Reference
Compression Enum
pub enum Compression {
Lz4, // Fast compression (default)
Zstd, // Better ratio
}
impl Compression {
pub const fn as_str(&self) -> &'static str;
pub fn compress(&self, data: &[u8]) -> Result<Vec<u8>>;
pub fn decompress(&self, data: &[u8]) -> Result<Vec<u8>>;
}
CompressedKvStore
pub struct CompressedKvStore<S: KvStore> {
// Wraps any KvStore with transparent compression
}
impl<S: KvStore> CompressedKvStore<S> {
pub const fn new(inner: S, compression: Compression) -> Self;
pub const fn inner(&self) -> &S;
pub const fn compression(&self) -> Compression;
}
// Implements KvStore trait - all methods work transparently
impl<S: KvStore> KvStore for CompressedKvStore<S> { ... }
Feature Flag
Compression requires the compression feature:
[dependencies]
trueno-db = { version = "0.3.9", features = ["compression"] }
Running the Example
cargo run --example compressed_kv --features compression
Benchmarks
Tested on synthetic KV cache data (2MB per entry):
| Algorithm | Compression | Decompression | Ratio |
|---|---|---|---|
| LZ4 | 4.2 ms | 1.8 ms | 2.8x |
| ZSTD | 12.3 ms | 4.1 ms | 4.2x |
Choose LZ4 for latency-sensitive workloads, ZSTD for storage optimization.