Case Study: Tracing Memory Paging with Renacer
Use renacer to understand and optimize memory paging behavior in ML model loading. This case study demonstrates syscall-level profiling of aprender's bundle module.
Quick Start
# Build the demo
cargo build --example bundle_trace_demo
# Trace file operations with timing
renacer -e trace=file -T -c -- ./target/debug/examples/bundle_trace_demo
Why Trace Memory Paging?
When deploying ML models with memory constraints, you need to understand:
- When models are loaded from disk
- How much I/O is happening
- Which evictions are occurring
- Whether pre-fetching is effective
Renacer provides syscall-level visibility into these operations.
The Bundle Trace Demo
//! examples/bundle_trace_demo.rs
use aprender::bundle::{BundleBuilder, PagedBundle, PagingConfig};
fn main() {
// Create bundle with 3 models (1300 bytes total)
let bundle = BundleBuilder::new("/tmp/demo.apbundle")
.add_model("encoder", vec![1u8; 500])
.add_model("decoder", vec![2u8; 500])
.add_model("classifier", vec![3u8; 300])
.build().unwrap();
// Load with 1KB memory limit (forces paging)
let config = PagingConfig::new()
.with_max_memory(1024)
.with_prefetch(false);
let mut paged = PagedBundle::open("/tmp/demo.apbundle", config).unwrap();
// Access models - observe paging behavior
let _ = paged.get_model("encoder"); // Load: 500 bytes
let _ = paged.get_model("decoder"); // Load: 500 bytes (total: 1000)
let _ = paged.get_model("classifier"); // Evict encoder, load: 300 bytes
}
Tracing with Renacer
Basic File Trace
$ renacer -e trace=file -T -- ./target/debug/examples/bundle_trace_demo
openat("/tmp/demo.apbundle", O_CREAT|O_WRONLY) = 3 <0.000054>
write(3, ..., 1424) = 1424 <0.000019>
close(3) = 0 <0.000011>
openat("/tmp/demo.apbundle", O_RDONLY) = 3 <0.000011>
read(3, ..., 8192) = 1424 <0.000008>
lseek(3, 20, SEEK_SET) = 20 <0.000008>
read(3, ..., 8192) = 1404 <0.000008>
lseek(3, 124, SEEK_SET) = 124 <0.000008>
read(3, ..., 8192) = 1300 <0.000008>
...
What we see:
openat+write- Bundle creation (1424 bytes)openat+read- Initial manifest load- Multiple
lseek+readpairs - On-demand model loading
Summary Statistics
$ renacer -e trace=file -T -c -- ./target/debug/examples/bundle_trace_demo
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
36.86 0.000258 8 32 write
19.71 0.000138 8 17 read
8.29 0.000058 7 8 close
7.57 0.000053 6 8 lseek
17.29 0.000121 15 8 openat
4.86 0.000034 6 5 newfstatat
4.14 0.000029 29 1 unlink
------ ----------- ----------- --------- --------- ----------------
100.00 0.000700 8 80 1 total
Key metrics:
- 32 writes: Stdout output + bundle creation
- 17 reads: Manifest + model data reads
- 8 lseek: Seeking to different model offsets
- 8 openat: Library loading + bundle file access
Source Correlation
$ renacer -s -e trace=file -T -- ./target/debug/examples/bundle_trace_demo
openat("/tmp/demo.apbundle", O_RDONLY) = 3 <0.000011>
at src/bundle/format.rs:87 # BundleReader::open()
read(3, ..., 8192) = 1424 <0.000008>
at src/bundle/format.rs:102 # read_manifest()
lseek(3, 124, SEEK_SET) = 124 <0.000008>
at src/bundle/format.rs:156 # read_model()
With -s, renacer shows which source lines triggered each syscall.
Analyzing Paging Behavior
Detecting Evictions
When memory limit is exceeded, you'll see additional reads:
# First access to "encoder" (miss)
lseek(3, 124, SEEK_SET) = 124
read(3, ..., 8192) = 500
# Second access to "decoder" (miss)
lseek(3, 624, SEEK_SET) = 624
read(3, ..., 8192) = 500
# Third access to "classifier" - encoder evicted first
lseek(3, 1124, SEEK_SET) = 1124
read(3, ..., 8192) = 300
# Re-access "encoder" - must reload (was evicted)
lseek(3, 124, SEEK_SET) = 124
read(3, ..., 8192) = 500
The repeated lseek to offset 124 indicates the encoder was evicted and reloaded.
Measuring Hit Rate Impact
# Poor hit rate (thrashing)
$ renacer -c -e trace=read,lseek -- ./thrashing_workload
read: 150 calls # Many reloads
lseek: 150 calls
# Good hit rate (cached)
$ renacer -c -e trace=read,lseek -- ./sequential_workload
read: 5 calls # Load once
lseek: 5 calls
Pre-fetch Analysis
With pre-fetching enabled:
let config = PagingConfig::new()
.with_prefetch(true)
.with_prefetch_count(2);
Trace shows speculative reads:
# Access "encoder"
lseek(3, 124, ...) read(3, ...) = 500 # Requested
# Pre-fetch kicks in
lseek(3, 624, ...) read(3, ...) = 500 # Speculative (decoder)
lseek(3, 1124, ...) read(3, ...) = 300 # Speculative (classifier)
# Later access to "decoder" - no I/O (cached from pre-fetch)
# (no lseek/read syscalls)
Optimization Patterns
Pattern 1: Reduce Seeks
Problem: Many small models = many seeks
% time syscall
45% lseek # Too many seeks!
40% read
Solution: Batch small models together or increase page size
Pattern 2: Right-Size Memory Limit
Problem: Memory limit too small = thrashing
read: 500 calls # Constant reloading
evictions: 200 # High eviction count
Solution: Increase memory limit or reduce model sizes
// Before: 1KB limit, 1300 bytes of models
let config = PagingConfig::new().with_max_memory(1024);
// After: 2KB limit, fits all models
let config = PagingConfig::new().with_max_memory(2048);
Pattern 3: Enable Pre-fetching for Sequential Access
Problem: Sequential access pattern with cache misses
# Model A accessed, then B, then C - each is a miss
miss, miss, miss
Solution: Enable pre-fetching
let config = PagingConfig::new()
.with_prefetch(true)
.with_prefetch_count(2);
JSON Output for Analysis
Export traces for programmatic analysis:
$ renacer --format json -e trace=file -- ./bundle_demo > trace.json
{
"syscalls": [
{
"name": "openat",
"args": ["/tmp/demo.apbundle", "O_RDONLY"],
"result": 3,
"duration_us": 11
},
{
"name": "lseek",
"args": [3, 124, "SEEK_SET"],
"result": 124,
"duration_us": 8
}
],
"summary": {
"total_time_us": 700,
"syscall_counts": {"read": 17, "lseek": 8}
}
}
Integration with aprender Stats
Combine renacer traces with aprender's built-in statistics:
let stats = bundle.stats();
println!("Hits: {}, Misses: {}, Evictions: {}",
stats.hits, stats.misses, stats.evictions);
println!("Hit rate: {:.1}%", stats.hit_rate() * 100.0);
Output:
Hits: 47, Misses: 3, Evictions: 1
Hit rate: 94.0%
Cross-reference with renacer:
- 3 misses = 3
lseek+readpairs for model data - 1 eviction = model reloaded later (additional
lseek+read)
Troubleshooting Guide
| Symptom | Renacer Shows | Fix |
|---|---|---|
| Slow first load | Many read syscalls | Enable pre-fetching |
| Thrashing | Repeated lseek to same offset | Increase memory limit |
| High latency | Large duration_us values | Use SSD, reduce model size |
| OOM after paging | Memory syscalls fail | Reduce max_memory setting |
Complete Workflow
# 1. Build with debug symbols
cargo build --example bundle_trace_demo
# 2. Baseline run (see program output)
./target/debug/examples/bundle_trace_demo
# 3. Trace file operations
renacer -e trace=file -T -c -- ./target/debug/examples/bundle_trace_demo
# 4. Detailed trace with source
renacer -s -e trace=file -T -- ./target/debug/examples/bundle_trace_demo
# 5. Export for analysis
renacer --format json -e trace=file -- ./target/debug/examples/bundle_trace_demo > trace.json
# 6. Compare different configurations
renacer -c -e trace=file -- ./target/debug/examples/bundle_1kb_limit
renacer -c -e trace=file -- ./target/debug/examples/bundle_10kb_limit
Key Takeaways
- Use
-cfor quick overview - Shows syscall distribution - Use
-Tfor timing - Identifies slow operations - Use
-sfor debugging - Maps syscalls to source code - Focus on
lseek+readpairs - These indicate model loads - Watch for repeated seeks - Indicates eviction and reload
- Compare configurations - Measure impact of tuning
See Also
- Model Bundling and Memory Paging - Bundle module API
- AI Shell Completion - Real-world paging usage
- renacer Documentation - Full tracer reference