Chapter 46: Rust Best Practices (CB-500 to CB-530)
The CB-500 series detects generic Rust defect patterns that apply to any Rust project. These checks were motivated by cross-stack fault analysis of 10 batuta projects that revealed systematic gaps: extreme unwrap density (14.7/file in trueno-rag), missing clippy/deny configurations (5/10 projects), string byte indexing panics on non-ASCII input, and universally low Rust Tooling scores (<55%).
Overview
# Run all compliance checks including CB-500 series
pmat comply check
# Example output:
# ⚠ CB-500: Rust Best Practices (CB-500 to CB-527): [Advisory] 0 errors, 189 warnings, 160 info:
# CB-506: String byte indexing (&str[n..m]) can panic on non-ASCII input (src/lib.rs:214)
# CB-501: 8 unwrap() calls in production code (threshold: 5) (src/parser.rs:0)
# ...
The CB-500 series is advisory — it reports with Warn status but does not block CI or commits. Violations are categorized into three severity tiers:
| Severity | Meaning | Example |
|---|---|---|
| Error | Likely defect in production | >10 unwrap() per file |
| Warning | Code smell, should fix | String byte indexing, panic macros |
| Info | Suggestion, low priority | Missing clippy.toml, no deny.toml |
Defect Taxonomy
Project Configuration (CB-500, CB-503, CB-504, CB-505)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-500 | Publish Hygiene | Warning | Missing exclude in Cargo.toml |
| CB-503 | Clippy Configuration | Info | Missing .clippy.toml or no disallowed-methods |
| CB-504 | Deny Configuration | Info | Missing deny.toml for supply chain security |
| CB-505 | Workspace Lint Hygiene | Warning | Missing [lints] or [workspace.lints] section |
Code Quality (CB-501, CB-502, CB-506, CB-507, CB-508)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-501 | Unwrap Density | Warning/Error | >5 (Warn) or >10 (Error) unwrap() per file |
| CB-502 | Expect Quality | Warning | .expect(""), .expect("failed") — lazy messages |
| CB-506 | String Byte Indexing | Warning | &str[n..m] can panic on non-ASCII input |
| CB-507 | Panic Macros | Warning | todo!(), unimplemented!() in production code |
| CB-508 | Lossy Numeric Casts | Warning | >10 as u8/as i32/etc. casts per file |
Testing & Architecture (CB-509, CB-510, CB-511, CB-512)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-509 | Feature Gate Coverage | Info | Features defined but no CI matrix testing |
| CB-510 | include!() Macro Hygiene | Info | Non-standalone files included via include!() |
| CB-511 | Flaky Timing Tests | Warning | Instant::now() with duration assertions in tests |
| CB-512 | Error Propagation Gap | Warning | Functions returning Result but using unwrap() internally |
Error Handling & Debug Hygiene (CB-513, CB-514, CB-517)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-513 | Silent Error Swallowing | Warning | .unwrap_or_else(|_| and .map_err(|_| discarding error context |
| CB-514 | Debug Eprintln Leaks | Warning | eprintln!("[DEBUG/[DBG/[TRACE in production code |
| CB-517 | Stale Debug Artifacts | Warning | static AtomicUsize/AtomicBool counters, #[allow(unused)] on statics |
Pattern Safety (CB-515, CB-516, CB-518)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-515 | Catch-All Match Default | Warning | _ => returning concrete values instead of errors |
| CB-516 | Hardcoded Magic Numbers | Info | Large numeric literals in Some() or struct field contexts |
| CB-518 | Expensive Clone in Loop | Info | >3 .clone() calls inside for/while/loop bodies |
Data Pipeline & Format Safety (CB-519, CB-520, CB-521)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-519 | Lossy Data Pipeline | Warning | quantize/dequantize or encode/decode round-trips in same function |
| CB-520 | Expensive Init in Hot Path | Warning | ::new()/::open()/::connect() calls inside loop bodies |
| CB-521 | Format Without Magic Bytes | Warning | Binary format parsing without magic byte/header validation |
Robustness & Compatibility (CB-522, CB-523, CB-524, CB-525, CB-526, CB-527)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-522 | Untested Path Normalization | Info | 3+ URL/path manipulation ops without edge case coverage |
| CB-523 | External Config Over Embedded | Info | Filesystem heuristics for config discovery instead of embedded metadata |
| CB-524 | Incomplete Enum Match Coverage | Warning | 3+ wildcard match arms with concrete defaults in one file |
| CB-525 | Hardcoded Field Names | Info | 5+ .get("field") calls without .or_else() alias fallbacks |
| CB-526 | Single-Path File Resolution | Info | path.join("file").exists() without parent/recursive fallback |
| CB-527 | Incomplete Pattern List | Info | 3+ .contains() classification chain — may miss variants |
Numerical Safety (CB-528, CB-530)
| ID | Check | Severity | What it detects |
|---|---|---|---|
| CB-528 | Division by Length | Warning | x / collection.len() without is_empty() or .max(1) guard |
| CB-530 | Log Without Clamp | Warning | .ln(), .log2(), .log10() without .max(epsilon) or .clamp() guard |
Detection Algorithms
CB-500: Publish Hygiene
Checks Cargo.toml for the exclude field that prevents publishing unnecessary files to crates.io:
# ✅ Good: Critical patterns excluded
[package]
exclude = [
"target/",
".profraw",
".profdata",
".vscode/",
".idea/",
".pmat",
"proptest-regressions",
]
# ❌ Bad: No exclude field - publishes everything
[package]
name = "my-crate"
version = "0.1.0"
Three sub-checks:
- Missing exclude: If neither exclude nor include is present → Warning
- Include+Exclude conflict: If both are present → Warning (Cargo ignores exclude when include is set)
- Insufficient patterns: If exclude exists but covers <3 of 7 critical patterns → Info
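The three sub-checks can be sketched as a small decision function. This is an illustrative sketch, not the PMAT implementation: the `PublishHygiene` enum and `check_publish_hygiene` are hypothetical names, and the list of seven critical patterns is assumed to be the one shown in the "Good" example.

```rust
/// Hypothetical outcome type for the three CB-500 sub-checks.
#[derive(Debug, PartialEq)]
enum PublishHygiene {
    Pass,
    MissingExclude,                          // Warning
    IncludeExcludeConflict,                  // Warning
    InsufficientPatterns { covered: usize }, // Info
}

/// Assumption: the seven critical patterns are those in the example above.
const CRITICAL_PATTERNS: [&str; 7] = [
    "target/", ".profraw", ".profdata", ".vscode/", ".idea/", ".pmat",
    "proptest-regressions",
];

/// Classifies the `exclude` / `include` fields of a parsed Cargo.toml.
fn check_publish_hygiene(
    exclude: Option<&[&str]>,
    include: Option<&[&str]>,
) -> PublishHygiene {
    match (exclude, include) {
        (None, None) => PublishHygiene::MissingExclude,
        (Some(_), Some(_)) => PublishHygiene::IncludeExcludeConflict,
        // `include` alone is an allow-list, so nothing is flagged here.
        (None, Some(_)) => PublishHygiene::Pass,
        (Some(patterns), None) => {
            // Count critical patterns that appear in some exclude entry.
            let covered = CRITICAL_PATTERNS
                .iter()
                .filter(|critical| patterns.iter().any(|p| p.contains(**critical)))
                .count();
            if covered < 3 {
                PublishHygiene::InsufficientPatterns { covered }
            } else {
                PublishHygiene::Pass
            }
        }
    }
}
```

For instance, an `exclude` list containing only `target/` and `.idea/` covers two critical patterns and would be reported at Info level under these assumptions.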
CB-501: Unwrap Density
Counts .unwrap() calls per file in production code, excluding test files and #[cfg(test)] regions:
// ❌ High density (CB-501 Warning at >5, Error at >10):
fn process(data: &str) -> String {
    let parsed: Value = serde_json::from_str(data).unwrap();
    let field = parsed.get("key").unwrap();
    let value = field.as_str().unwrap();
    let num = value.parse::<i32>().unwrap();
    let result = compute(num).unwrap();
    format_output(result).unwrap()
}

// ✅ Better: Use ? operator or contextual errors
fn process(data: &str) -> Result<String, Error> {
    let parsed: Value = serde_json::from_str(data)?;
    let field = parsed.get("key").ok_or(Error::MissingField("key"))?;
    let value = field.as_str().ok_or(Error::TypeMismatch)?;
    let num: i32 = value.parse().map_err(Error::Parse)?;
    let result = compute(num)?;
    Ok(format_output(result)?)
}
CB-502: Expect Quality
Detects lazy or uninformative .expect() messages. A good expect message explains why the invariant should hold, not just that it failed:
// ❌ Lazy messages detected by CB-502:
let config = load_config().expect("");
let handle = open_file().expect("failed");
let conn = connect().expect("error");
let val = parse().expect("unexpected");
let item = lookup().expect("should not happen");

// ✅ Informative messages:
let config = load_config().expect("config.toml must exist in project root");
let handle = open_file().expect("log file was verified writable in init()");
let conn = connect().expect("database URL validated at startup");
Flagged patterns: "", "failed", "error", "unexpected", "should not happen", "todo", "bug", "impossible".
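A minimal sketch of how a message could be compared against this list. The function name `is_lazy_expect_message` is hypothetical. Exact matching after trimming and lowercasing is also an assumption, since the text does not say how the real detector normalizes messages.

```rust
/// The flagged messages listed above.
const LAZY_MESSAGES: [&str; 8] = [
    "", "failed", "error", "unexpected", "should not happen", "todo", "bug",
    "impossible",
];

/// Returns true when an `.expect()` message is on the lazy list.
/// Assumption: comparison is exact after trimming and lowercasing.
fn is_lazy_expect_message(message: &str) -> bool {
    let normalized = message.trim().to_lowercase();
    LAZY_MESSAGES.iter().any(|lazy| *lazy == normalized.as_str())
}
```

Under this sketch, `"Failed"` is flagged while `"config.toml must exist in project root"` is not.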
CB-506: String Byte Indexing
Detects &str[n..m] patterns that panic on multi-byte UTF-8 input:
// ❌ Panics on non-ASCII (CB-506):
let prefix = &name[..3];
let suffix = &text[start..end];

// ✅ Safe alternatives:
let prefix = name.get(..3).unwrap_or(name); // get() returns None if the range is out of bounds or splits a character
let prefix = &name.chars().take(3).collect::<String>(); // Character-aware
let suffix = text.get(start..end).unwrap_or_default(); // Safe fallback
Uses regex &\w+\[\d*\.\.\d*\] to detect the pattern. Skips test code and comments.
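To make the failure mode concrete, the sketch below takes a character-counted prefix without allocating. The helper name `char_prefix` is illustrative and not part of PMAT. In `"héllo"` the `é` occupies bytes 1 to 2, so a byte range ending at 2 splits it: `&s[..2]` would panic, while `s.get(..2)` returns `None`.

```rust
/// Returns the first `n` characters of `s` as a borrowed slice.
/// The cut point comes from `char_indices`, so it is always a valid
/// UTF-8 boundary and the slice cannot panic.
fn char_prefix(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        // Byte offset where character number `n` starts.
        Some((byte_idx, _)) => &s[..byte_idx],
        // Fewer than `n + 1` characters: the whole string is the prefix.
        None => s,
    }
}
```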
CB-507: Panic Macros
Detects todo!() and unimplemented!() in production code. These are useful during development but should be replaced before release:
// ❌ Panics at runtime (CB-507):
fn handle_edge_case(&self) -> Result<()> {
    todo!()
}
fn serialize_v2(&self) -> Vec<u8> {
    unimplemented!()
}

// ✅ Proper handling:
fn handle_edge_case(&self) -> Result<()> {
    Err(Error::NotSupported("edge case handling"))
}
fn serialize_v2(&self) -> Vec<u8> {
    self.serialize_v1() // Fallback to v1
}
The detector skips macros that appear inside string literals (e.g., "todo!() is a macro").
CB-508: Lossy Numeric Casts
Detects files with >10 as casts to narrower types without bounds checking:
// ❌ Lossy casts (CB-508):
let byte = large_number as u8; // Silently truncates
let small = big_float as f32; // Loses precision
let signed = unsigned_val as i32; // Can wrap to a negative value

// ✅ Checked alternatives:
let byte = u8::try_from(large_number)?;
let small: f32 = big_float as f32; // With #[allow(clippy::cast_possible_truncation)]
let signed = i32::try_from(unsigned_val).unwrap_or(i32::MAX);
Lines with allow(clippy::cast annotations are excluded from the count.
CB-509: Feature Gate Coverage
Projects with >3 features in Cargo.toml should have CI matrix testing to ensure all feature combinations compile:
# ✅ Good: CI tests feature combinations
jobs:
  test:
    strategy:
      matrix:
        features: ["default", "full", "minimal", "no-std"]
    steps:
      - run: cargo test --features ${{ matrix.features }}
CB-510: include!() Macro Hygiene
Flags include!() macro usage because included files are not standalone compilable — they cannot be analyzed by tree-sitter, cause false positives in complexity gates, and break IDE tooling:
// ⚠ CB-510 Info:
include!("helpers/parse_utils.rs"); // Not standalone compilable

// ✅ Better: Use modules
mod parse_utils; // Standard module system
CB-511: Flaky Timing Tests
Detects tests that use Instant::now() with duration assertions, which are inherently flaky under CI load:
// ❌ Flaky under CI load (CB-511):
#[test]
fn test_cache_performance() {
    let start = Instant::now();
    cache.lookup("key");
    assert!(start.elapsed() < Duration::from_millis(10)); // Fails on slow CI
}

// ✅ Test behavior, not timing:
#[test]
fn test_cache_hit() {
    cache.insert("key", "value");
    assert_eq!(cache.lookup("key"), Some("value"));
}
CB-512: Error Propagation Gap
Detects functions that return Result<T, E> but use .unwrap() >=3 times internally — a sign that error handling is incomplete:
// ❌ Returns Result but unwraps internally (CB-512):
fn parse_config(path: &Path) -> Result<Config, Error> {
    let content = fs::read_to_string(path)?;
    let parsed: toml::Value = toml::from_str(&content).unwrap(); // Why not ?
    let name = parsed.get("name").unwrap().as_str().unwrap(); // Two more unwraps
    Ok(Config { name: name.to_string() })
}

// ✅ Consistent error propagation:
fn parse_config(path: &Path) -> Result<Config, Error> {
    let content = fs::read_to_string(path)?;
    let parsed: toml::Value = toml::from_str(&content)?;
    let name = parsed.get("name")
        .and_then(|v| v.as_str())
        .ok_or(Error::MissingField("name"))?;
    Ok(Config { name: name.to_string() })
}
CB-513: Silent Error Swallowing
Detects patterns where errors are intentionally discarded, hiding failure context. Motivated by GH-215 where silent error swallowing in quantization hid data corruption:
// ❌ Discards error context (CB-513):
let config = load_config().unwrap_or_else(|_| Config::default());
let data = parse(input).map_err(|_| MyError::ParseFailed)?;

// ✅ Preserve error context:
let config = load_config().unwrap_or_else(|e| {
    tracing::warn!("config load failed: {e}, using defaults");
    Config::default()
});
let data = parse(input).map_err(|e| MyError::ParseFailed { source: e })?;
The |_| closure parameter is the signal — it means the original error is being intentionally thrown away.
CB-514: Debug Eprintln Leaks
Detects debug print statements left in production code. These leak internal state to stderr and indicate incomplete cleanup after debugging sessions:
// ❌ Debug output in production (CB-514):
eprintln!("[DEBUG] parsing token: {:?}", token);
eprintln!("[TRACE] entering function with state={}", state);
eprintln!("[DBG] cache size: {}", cache.len());

// ✅ Use structured logging:
tracing::debug!(?token, "parsing token");
tracing::trace!(state, "entering function");
log::debug!("cache size: {}", cache.len());
CB-515: Catch-All Match Default
Detects _ => match arms that return a concrete value instead of an error, None, or unreachable!(). Motivated by GH-236 where _ => Architecture::Qwen2 caused all unknown model architectures to silently receive wrong configuration:
// ❌ Silent default (CB-515):
fn get_architecture(name: &str) -> Architecture {
    match name {
        "gpt" => Architecture::Gpt,
        "llama" => Architecture::Llama,
        _ => Architecture::Qwen2, // All unknowns become Qwen2!
    }
}

// ✅ Explicit error on unknown:
fn get_architecture(name: &str) -> Result<Architecture, Error> {
    match name {
        "gpt" => Ok(Architecture::Gpt),
        "llama" => Ok(Architecture::Llama),
        _ => Err(Error::UnknownArchitecture(name.to_string())),
    }
}
Safe patterns that are not flagged: Err(...), None, unreachable!(), panic!(), return Err(...), Default::default(), bail!(), todo!().
CB-516: Hardcoded Magic Numbers
Detects large numeric literals (>100) in Some() or struct field contexts that likely represent configuration defaults. Motivated by GH-231 where a hardcoded rope_theta: Some(10000.0) default produced garbage output for models requiring different values:
// ❌ Hardcoded config defaults (CB-516 Info):
Config {
    rope_theta: Some(10000.0), // Wrong for Qwen2 (uses 1000000.0)
    max_seq_len: Some(4096),
}

// ✅ Named constants with documentation:
const DEFAULT_ROPE_THETA: f64 = 10000.0;
const DEFAULT_MAX_SEQ_LEN: usize = 4096;
Config {
    rope_theta: Some(DEFAULT_ROPE_THETA),
    max_seq_len: Some(DEFAULT_MAX_SEQ_LEN),
}
This is Info severity — advisory only with expected false positives. Common values (128, 256, 512, 1024, etc.) are excluded.
CB-517: Stale Debug Artifacts
Detects leftover debug instrumentation in production code — static atomic counters and #[allow(unused)] annotations on static variables that were used during debugging and not cleaned up:
// ❌ Leftover debug counter (CB-517):
static DEBUG_COUNTER: AtomicUsize = AtomicUsize::new(0);
fn process() {
    DEBUG_COUNTER.fetch_add(1, Ordering::Relaxed);
}

// ❌ Suppressed unused static (CB-517):
#[allow(unused)]
static TRACE_ENABLED: bool = false;

// ✅ Remove debug artifacts before committing, or use proper instrumentation:
fn process() {
    metrics::counter!("process_calls").increment(1);
}
CB-518: Expensive Clone in Loop
Detects loop bodies with >3 .clone() calls, which often indicate that data should be borrowed or restructured to avoid repeated allocation:
// ❌ Excessive cloning in loop (CB-518):
for item in &items {
    let name = config.name.clone();
    let path = config.path.clone();
    let data = config.data.clone();
    let meta = config.meta.clone();
    process(item, &name, &path, &data, &meta);
}

// ✅ Clone once before the loop, or borrow:
let name = &config.name;
let path = &config.path;
for item in &items {
    process(item, name, path, &config.data, &config.meta);
}
This is Info severity — advisory, as some clones are necessary (e.g., sending data across threads).
CB-519: Lossy Data Pipeline
Detects functions containing both halves of a lossy transform pair (e.g., quantize + dequantize), which indicates data being round-tripped through lossy operations. Motivated by aprender GH-215/231/234/237 where GGUF export double-quantized attention weights:
// ❌ Lossy round-trip in same function (CB-519):
fn convert_tensor(data: &[f32]) -> Vec<u8> {
    let quantized = quantize_q4(data); // f32 → q4 (lossy)
    let dequantized = dequantize_q4(&quantized); // q4 → f32 (lossy again!)
    pack_bytes(&dequantized)
}

// ✅ Single direction only:
fn export_tensor(data: &[f32]) -> Vec<u8> {
    let quantized = quantize_q4(data);
    pack_bytes(&quantized)
}
Detected pairs: quantize/dequantize, encode/decode, compress/decompress, serialize/deserialize, pack/unpack, to_bytes/from_bytes, to_f16/to_f32, to_bf16/to_f32.
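One way such a pair check could work is sketched below. It is a simplification under stated assumptions, not the actual detector: `find_lossy_round_trip` is a hypothetical name, the function body is assumed to be available as text, and comments and string literals are not stripped. Matching is done on whole identifiers by prefix, because a plain substring search would find `quantize` inside `dequantize` and flag every dequantizing function.

```rust
/// The lossy transform pairs listed above.
const LOSSY_PAIRS: [(&str, &str); 8] = [
    ("quantize", "dequantize"),
    ("encode", "decode"),
    ("compress", "decompress"),
    ("serialize", "deserialize"),
    ("pack", "unpack"),
    ("to_bytes", "from_bytes"),
    ("to_f16", "to_f32"),
    ("to_bf16", "to_f32"),
];

/// Returns the first pair whose two halves both occur in `body`.
fn find_lossy_round_trip(body: &str) -> Option<(&'static str, &'static str)> {
    // Split the source text into identifier-like tokens.
    let idents: Vec<&str> = body
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|token| !token.is_empty())
        .collect();
    let found = LOSSY_PAIRS.iter().copied().find(|(first, second)| {
        idents.iter().any(|id| id.starts_with(*first))
            && idents.iter().any(|id| id.starts_with(*second))
    });
    found
}
```

Applied to the two examples above, the body of `convert_tensor` yields the `quantize`/`dequantize` pair, while the body of `export_tensor` yields nothing.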
CB-520: Expensive Init in Hot Path
Detects constructor/load/open calls inside loop bodies where the initialization could be hoisted. Motivated by aprender GH-224 where ChatSession recreated GPU models (5-6GB VRAM upload) on every generate() call:
// ❌ Expensive init in loop (CB-520):
for item in &items {
    let client = HttpClient::new(config); // Re-created every iteration
    let conn = Database::connect("url"); // Re-connected every iteration
    client.send(item);
}

// ✅ Hoist initialization:
let client = HttpClient::new(config);
let conn = Database::connect("url");
for item in &items {
    client.send(item);
}
Detected patterns: ::new(), ::open(), ::connect(), ::create(), ::load(), ::init(), ::build(), ::from_file(), ::from_path(), File::open(). Threshold: 2+ init calls per loop body.
CB-521: Format Detection Without Magic Bytes
Detects binary format parsing functions that read binary data without validating magic bytes or format signatures. Motivated by aprender GH-213 where .safetensors.index.json was parsed as binary SafeTensors:
// ❌ No magic byte validation (CB-521):
fn parse_file(reader: &mut impl Read) -> Result<Header, Error> {
    let mut buf = [0u8; 8];
    reader.read_exact(&mut buf)?; // Assumes format is correct
    let size = u64::from_le_bytes(buf);
    Ok(Header { size })
}

// ✅ Validate magic bytes first:
fn parse_file(reader: &mut impl Read) -> Result<Header, Error> {
    let mut magic = [0u8; 4];
    reader.read_exact(&mut magic)?;
    if &magic != FILE_MAGIC {
        return Err(Error::InvalidFormat);
    }
    let mut buf = [0u8; 8];
    reader.read_exact(&mut buf)?;
    Ok(Header { size: u64::from_le_bytes(buf) })
}
CB-522: Untested Path Normalization
Detects files with 3+ URL/path manipulation operations (.replace("//"), .strip_prefix("http"), Url::parse(), etc.) which indicates complex normalization logic that needs edge case testing. Motivated by GH-221 where hf:// URLs preserved web path components as file paths.
CB-523: External Config Over Embedded Metadata
Detects filesystem heuristic patterns like path.with_file_name("config.json") that discover configuration from sibling files when the loaded data may already contain embedded metadata. Motivated by GH-222 where apr chat used sibling config.json instead of APR v2 embedded metadata.
CB-524: Incomplete Enum Match Coverage
Detects files with 3+ _ => wildcard match arms returning concrete values, indicating an enum is dispatched on inconsistently across multiple functions. Motivated by GH-233/236 where adding new Architecture variants required updating match arms in 5+ places. This is a code smell — consider #[non_exhaustive] enums or centralizing dispatch logic.
CB-525: Hardcoded Field Names Without Aliases
Detects functions with 5+ .get("field") calls on JSON values without .or_else() fallback aliases. Motivated by GH-235 where load_model_config_from_json only handled HuggingFace field names (hidden_size) but not GPT-2 names (n_embd).
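A possible shape for the alias fallback is sketched here. It uses a plain `HashMap` as a stand-in for a parsed JSON object to stay dependency-free; `get_with_aliases` and `hidden_size` are illustrative names, and only the field names `hidden_size` and `n_embd` come from the text above.

```rust
use std::collections::HashMap;

/// Looks up the first alias present in `config`, in priority order.
fn get_with_aliases(config: &HashMap<String, u64>, aliases: &[&str]) -> Option<u64> {
    aliases.iter().find_map(|name| config.get(*name).copied())
}

/// Same idea written inline with `.or_else()`, the form CB-525 looks for.
fn hidden_size(config: &HashMap<String, u64>) -> Option<u64> {
    config
        .get("hidden_size") // HuggingFace name
        .or_else(|| config.get("n_embd")) // GPT-2 name
        .copied()
}
```

Keeping the alias list in one helper means a new naming scheme is added in a single place rather than at every lookup.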
CB-526: Single-Path File Resolution
Detects path.join("filename").exists() patterns without fallback search (parent directory, recursive discovery). Motivated by GH-216 where tokenizer.json wasn’t found in workspace layouts.
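The parent-directory fallback might look like the sketch below. The name `resolve_in_ancestors` is hypothetical, and the existence check is passed in as a closure (an assumption made so the logic can be exercised without touching the filesystem). In real use the closure would typically be `|p| p.exists()`.

```rust
use std::path::{Path, PathBuf};

/// Searches `start_dir` and then each parent directory for `file_name`,
/// returning the first candidate path for which `exists` is true.
fn resolve_in_ancestors(
    start_dir: &Path,
    file_name: &str,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    start_dir
        .ancestors() // yields start_dir itself first, then its parents
        .map(|dir| dir.join(file_name))
        .find(|candidate| exists(candidate.as_path()))
}
```

For a workspace layout where `tokenizer.json` sits at the workspace root, starting the search from a member crate directory would find it two levels up.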
CB-527: Incomplete Pattern List for Classification
Detects .contains("x") || .contains("y") || ... chains with 3+ patterns, which are typically data classification logic that may miss variants. Motivated by GH-233B/234 where Rosetta density threshold only recognized "embed" but not "wte", "wpe", "position_embedding". Consider centralizing patterns in a constant array or registry.
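Centralizing the patterns in a constant array could look like this sketch. The names `EMBEDDING_PATTERNS` and `is_embedding_tensor` are illustrative, and the pattern strings are the ones mentioned above.

```rust
/// Single source of truth for embedding tensor name patterns.
const EMBEDDING_PATTERNS: &[&str] = &["embed", "wte", "wpe", "position_embedding"];

/// Classifies a tensor name by checking it against the shared list.
fn is_embedding_tensor(name: &str) -> bool {
    EMBEDDING_PATTERNS.iter().any(|pattern| name.contains(*pattern))
}
```

Every call site then picks up a newly added variant automatically, instead of each `||` chain needing its own update.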
CB-528: Division by Length Without Empty Guard
Detects x / collection.len() without a preceding is_empty() or len() > 0 guard. In ML and numerical code, dividing by the length of an empty collection causes division-by-zero: panic for integers, Inf/NaN for floats. Motivated by cross-stack analysis where mean calculations over empty batches silently produced NaN that propagated through training losses.
// ❌ Division by zero on empty input (CB-528):
fn compute_mean(values: &[f64]) -> f64 {
    let sum: f64 = values.iter().sum();
    sum / values.len() as f64 // NaN when values is empty
}
fn average_batch(batch: &[Tensor]) -> Tensor {
    let total = batch.iter().fold(Tensor::zeros(), |a, b| a + b);
    total / batch.len() as f32 // NaN/Inf when batch is empty
}

// ✅ Guarded alternatives:
fn compute_mean(values: &[f64]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    let sum: f64 = values.iter().sum();
    sum / values.len() as f64
}
fn compute_mean_alt(values: &[f64]) -> f64 {
    let sum: f64 = values.iter().sum();
    sum / values.len().max(1) as f64 // .max(1) prevents zero denominator
}
The detector looks back up to 8 lines for guard patterns: is_empty(), .len() > 0, .len() >= 1, .len() != 0, .len() == 0, and .max(1) on the same line.
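The look-back described above can be approximated as follows. This is a sketch under assumptions rather than the actual detector: `is_division_guarded` is a hypothetical name, the file is assumed to be pre-split into lines, and matching is plain substring search.

```rust
/// Guard patterns from the list above that may appear on an earlier line.
const GUARD_PATTERNS: [&str; 5] = [
    "is_empty()", ".len() > 0", ".len() >= 1", ".len() != 0", ".len() == 0",
];
/// How many preceding lines are inspected.
const LOOKBACK_LINES: usize = 8;

/// Returns true if the division on `lines[division_idx]` appears guarded.
fn is_division_guarded(lines: &[&str], division_idx: usize) -> bool {
    let current = match lines.get(division_idx) {
        Some(line) => *line,
        None => return false,
    };
    // `.max(1)` only counts when it is on the same line as the division.
    if current.contains(".max(1)") {
        return true;
    }
    // Otherwise scan up to LOOKBACK_LINES lines before the division.
    let start = division_idx.saturating_sub(LOOKBACK_LINES);
    lines[start..division_idx]
        .iter()
        .any(|line| GUARD_PATTERNS.iter().any(|guard| line.contains(*guard)))
}
```

A fixed window keeps the check cheap, at the cost of missing guards placed more than eight lines before the division.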
CB-530: Log Without Clamp Guard
Detects .ln(), .log2(), .log10() calls without a preceding .max(epsilon) or .clamp() guard. Passing zero or negative values to log functions produces -Inf or NaN, which silently corrupts ML training losses, probability calculations, and information-theoretic metrics. Discovered during 3.4.0 dogfooding where trueno’s scalar backend had 3 unguarded log calls.
// ❌ Risk of -Inf/NaN (CB-530):
fn cross_entropy(predicted: &[f64], actual: &[f64]) -> f64 {
    predicted.iter().zip(actual).map(|(p, a)| {
        -a * p.ln() // -Inf when p == 0.0
    }).sum()
}
fn information_content(probability: f64) -> f64 {
    -probability.log2() // NaN when probability < 0.0
}
fn signal_magnitude(value: f64) -> f64 {
    value.log10() // -Inf when value == 0.0
}

// ✅ Clamped alternatives:
fn cross_entropy(predicted: &[f64], actual: &[f64]) -> f64 {
    predicted.iter().zip(actual).map(|(p, a)| {
        -a * p.max(1e-10).ln() // Clamped to epsilon
    }).sum()
}
fn information_content(probability: f64) -> f64 {
    -probability.clamp(f64::EPSILON, 1.0).log2()
}
fn signal_magnitude(value: f64) -> f64 {
    value.max(f64::EPSILON).log10()
}
The detector recognizes these safe patterns and does not flag them:
- .max(epsilon).ln(): epsilon guard on the same expression
- .clamp(low, high).ln(): range clamp before log
- (1.0 + x).ln(): log of a sum with a positive constant (positive whenever x > -1)
- 2.0_f64.ln(): log of a known positive literal
- Variable guarded within 3 lines: let x = val.max(1e-10); then x.ln()
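The numeric behaviour behind this check is easy to verify directly. In the sketch below, `EPSILON`, `guarded_ln`, and `guarded_log2_probability` are illustrative names, and the floor of `1e-10` is an assumed value that should be chosen per application.

```rust
/// Assumed lower bound for log inputs; pick a value suited to the data.
const EPSILON: f64 = 1e-10;

/// Natural log with an epsilon floor: zero and negative inputs map to ln(EPSILON).
fn guarded_ln(x: f64) -> f64 {
    x.max(EPSILON).ln()
}

/// log2 of a probability, clamped into [EPSILON, 1.0] first.
fn guarded_log2_probability(p: f64) -> f64 {
    p.clamp(EPSILON, 1.0).log2()
}
```

Unguarded, `0.0_f64.ln()` evaluates to negative infinity and `(-1.0_f64).ln()` to NaN, while both guarded helpers return finite values for the same inputs. Note that `f64::max` returns the other operand when one argument is NaN, so `guarded_ln` also turns a NaN input into `ln(EPSILON)`, which may hide an upstream problem.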
Test Code Exclusion
All file-scanning checks (CB-501, CB-502, CB-506–CB-508, CB-512–CB-528, CB-530) exclude test code using two mechanisms:
- Test file exclusion: Files matching *_test.rs, *_tests.rs, or under a tests/ directory
- Test region exclusion: Code inside #[cfg(test)] module blocks within production files
This prevents false positives from test code where .unwrap() and todo!() are acceptable.
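The file-level rule can be sketched as a path predicate. The function name `is_test_file` is hypothetical, and this covers only the first mechanism, since detecting #[cfg(test)] regions requires looking at file contents.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Returns true for `*_test.rs`, `*_tests.rs`, or any file under a `tests/` directory.
fn is_test_file(path: &Path) -> bool {
    let name_matches = path
        .file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.ends_with("_test.rs") || name.ends_with("_tests.rs"))
        .unwrap_or(false);
    // Compare whole path components so `contests.rs` is not mistaken for `tests/`.
    let under_tests_dir = path
        .parent()
        .map(|dir| dir.components().any(|c| c.as_os_str() == OsStr::new("tests")))
        .unwrap_or(false);
    name_matches || under_tests_dir
}
```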
Self-Detection Avoidance
The detection code itself uses concat!() to avoid self-detection:
// The scanner uses split strings to avoid matching itself:
const DOT_UNWRAP: &str = concat!(".unwr", "ap()");
const DOT_EXPECT_QUOTE: &str = concat!(".expe", "ct(\"");
Remediation Priority
When pmat comply check reports CB-500 violations, fix them in this priority order:
- CB-501 Errors (>10 unwrap/file): highest crash risk
- CB-530: log without clamp produces -Inf/NaN that silently corrupts ML losses and metrics
- CB-519: lossy data pipeline round-trips corrupt model weights (GH-215/231/237)
- CB-528: division by .len() without empty guard causes panic (integers) or NaN (floats)
- CB-521: binary format parsing without magic bytes causes crashes (GH-213)
- CB-515, CB-524: catch-all match arms / incomplete enum coverage (GH-236)
- CB-513: silent error swallowing hiding data corruption (GH-215)
- CB-520: expensive initialization in hot path (GH-224)
- CB-512: functions claiming error handling but not doing it
- CB-506: string indexing panics on internationalized input
- CB-507: todo!/unimplemented! left in production
- CB-514, CB-517: debug artifacts leaked to production
- CB-525: hardcoded field names without aliases (GH-235)
- CB-502: lazy expect messages hide root cause during debugging
- CB-508: lossy casts cause silent data corruption
- CB-500, CB-505: project configuration hygiene
- CB-503, CB-504, CB-509–CB-511, CB-516, CB-518, CB-522, CB-523, CB-526, CB-527: informational, fix at leisure
CI/CD Integration
# .github/workflows/rust-best-practices.yml
name: Rust Best Practices
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install PMAT
        run: cargo install pmat
      - name: Check Rust Best Practices
        run: |
          OUTPUT=$(pmat comply check 2>&1)
          echo "$OUTPUT"
          # Fail on Error-severity violations.
          # The pattern follows the "N errors" summary format shown in the example output above.
          if echo "$OUTPUT" | grep -Eq "CB-500.*[1-9][0-9]* errors"; then
            echo "::error::CB-500 series has Error-severity violations"
            exit 1
          fi
Academic Foundations
The CB-500 checks are grounded in empirical research on Rust defect patterns:
| Paper | Finding | Applied To |
|---|---|---|
| Xu et al. (2021). “Memory-Safety Challenge Considered Solved?” | 30% of Rust CVEs involve unwrap/expect panics | CB-501, CB-502, CB-512 |
| Qin et al. (2020). “Understanding Memory and Thread Safety Practices” | Unsafe patterns cluster in specific files | CB-507, CB-508 |
| Evans et al. (2020). “Is Rust Used Safely?” | String boundary panics in 18% of crates | CB-506 |
| Zhu et al. (2022). “Learning and Programming Challenges of Rust” | Feature flag complexity is top-5 pain point | CB-509 |
Specification Reference
Full detection logic: src/cli/handlers/comply_cb_detect/rust_best_practices.rs (CB-500–CB-521) and src/cli/handlers/comply_cb_detect/rust_best_practices_extended.rs (CB-522–CB-530)
Aggregate check: src/cli/handlers/comply_cb_detect/check_handlers.rs (check_rust_best_practices)