# Testing Strategies
Testing is a core pillar of pforge's quality philosophy. With 115 tests spanning multiple layers and strategies, pforge ensures production-ready reliability through a rigorous, multi-faceted approach that combines traditional and advanced testing methodologies.
## The pforge Testing Philosophy
pforge’s testing strategy is built on three foundational principles:
- Extreme TDD: 5-minute cycles (RED → GREEN → REFACTOR) with quality gates at every step
- Defense in Depth: Multiple layers of testing catch different classes of bugs
- Quality as Code: Tests are first-class citizens, with coverage targets and mutation testing enforcement
This chapter provides a comprehensive guide to pforge’s testing pyramid and how each layer contributes to overall system quality.
## The Testing Pyramid
pforge implements a balanced testing pyramid that ensures comprehensive coverage without sacrificing speed or maintainability:
```text
              /\
             /  \       Property-Based Tests (12 tests, 10K cases each)
            /____\      ├─ Config serialization properties
           /      \     ├─ Handler dispatch invariants
          /        \    └─ Validation consistency
         /__________\
        /            \    Integration Tests (26 tests)
       /              \   ├─ Multi-crate workflows
      /                \  ├─ Middleware chains
     /__________________\ └─ End-to-end scenarios
    /                    \
   /______________________\  Unit Tests (74 tests, <1ms each)
                             ├─ Config parsing
                             ├─ Handler registry
                             ├─ Code generation
                             └─ Type validation
```
### Test Distribution
- 74 Unit Tests: Fast, focused tests of individual components (<1ms each)
- 26 Integration Tests: Cross-crate and system-level tests (<100ms each)
- 12 Property-Based Tests: Automated edge-case discovery (10,000 iterations each)
- 5 Doctests: Executable documentation examples
- 8 Quality Gate Tests: PMAT integration and enforcement
Total: 115 tests ensuring comprehensive coverage at every level.
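Doctests make the examples in API documentation executable: `cargo test` compiles and runs every fenced example inside a `///` comment. A minimal sketch of the idea (the `add` function and `my_crate` path are hypothetical, not part of the pforge API):

```rust
/// Adds two numbers.
///
/// # Examples
///
/// ```
/// // This fenced example runs as a doctest under `cargo test`:
/// assert_eq!(my_crate::add(2, 3), 5);
/// ```
pub fn add(a: i64, b: i64) -> i64 {
    a + b
}

fn main() {
    // The same behavior the doctest pins down:
    assert_eq!(add(2, 3), 5);
}
```

Because doctests fail the build when documentation drifts out of sync with the code, they double as regression tests for the public API's examples.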
### Performance Targets
pforge enforces strict performance requirements for tests to maintain rapid feedback cycles:
| Test Type | Target | Actual | Enforcement |
|---|---|---|---|
| Unit tests | <1ms | <1ms | CI enforced |
| Integration tests | <100ms | 15-50ms | CI enforced |
| Property tests | <5s per property | 2-4s | Local only |
| Full test suite | <30s | ~15s | CI enforced |
| Coverage generation | <2min | ~90s | Makefile target |
Fast tests enable the 5-minute TDD cycle that drives pforge development.
## Quality Metrics
pforge enforces industry-leading quality standards through automated gates:
### Coverage Requirements
- Line Coverage: ≥80% (currently ~85%)
- Branch Coverage: ≥75% (currently ~78%)
- Mutation Kill Rate: ≥90% target with cargo-mutants
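Mutation testing flips small pieces of logic (operators, boundaries, return values) and checks that at least one test fails for each mutant. Tests that pin exact boundary values are what push the kill rate toward the 90% target. A sketch with a hypothetical `clamp_percent` helper (for illustration only, not a pforge API):

```rust
/// Clamps a percentage into the range [0, 100].
/// (Hypothetical helper, for illustration only.)
fn clamp_percent(p: i32) -> i32 {
    p.max(0).min(100)
}

fn main() {
    // Each assertion kills a different class of mutants:
    assert_eq!(clamp_percent(-5), 0);    // fails if `max(0)` is mutated away
    assert_eq!(clamp_percent(101), 100); // fails if the upper bound is dropped
    assert_eq!(clamp_percent(100), 100); // pins the exact boundary value
    assert_eq!(clamp_percent(50), 50);   // fails for constant-return mutants
}
```

A test suite that only checked `clamp_percent(50)` would pass even if both bounds were deleted; the boundary assertions are what make the mutants die.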
### Complexity Limits
- Cyclomatic Complexity: ≤20 per function
- Cognitive Complexity: ≤15 per function
- Technical Debt Grade (TDG): ≥0.75
### Zero Tolerance
- No unwrap(): Production code must handle all errors explicitly
- No panic!(): All panics confined to test code only
- No SATD: Self-Admitted Technical Debt comments blocked in PRs
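The no-`unwrap()` rule means every fallible lookup returns a typed error instead of panicking. A sketch of the pattern (the `Error` variant and `get_tool` function here are hypothetical, loosely modeled on the registry examples later in this chapter):

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Error {
    ToolNotFound(String),
}

// Forbidden in production code: panics on a missing key.
// fn get_tool(map: &HashMap<String, u32>, name: &str) -> u32 {
//     *map.get(name).unwrap()
// }

// Preferred: surface the failure as a typed error the caller must handle.
fn get_tool(map: &HashMap<String, u32>, name: &str) -> Result<u32, Error> {
    map.get(name)
        .copied()
        .ok_or_else(|| Error::ToolNotFound(name.to_string()))
}

fn main() {
    let mut tools = HashMap::new();
    tools.insert("echo".to_string(), 1u32);

    assert_eq!(get_tool(&tools, "echo"), Ok(1));
    assert!(matches!(
        get_tool(&tools, "missing"),
        Err(Error::ToolNotFound(_))
    ));
}
```

Pushing errors into the type system is what makes the error-path tests shown later in this chapter possible: a panic cannot be asserted on, but an `Err` variant can.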
## Test Organization
pforge tests are organized by scope and purpose:
```text
pforge/
├── crates/*/src/**/*.rs              # Unit tests (inline #[cfg(test)] modules)
├── crates/*/tests/*.rs               # Crate-level integration tests
├── crates/pforge-integration-tests/
│   ├── integration_test.rs           # Cross-crate integration
│   └── property_test.rs              # Property-based tests
└── crates/pforge-cli/tests/
    └── scaffold_tests.rs             # CLI integration tests
```
### Test Module Structure
Each source file includes inline unit tests:
```rust
// In crates/pforge-runtime/src/registry.rs
pub struct HandlerRegistry {
    // Implementation...
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_registry_lookup() {
        // Fast, focused test (<1ms)
    }

    #[tokio::test]
    async fn test_async_dispatch() {
        // Async test with tokio runtime
    }
}
```
## Running Tests

### Quick Test Commands
```bash
# Run all tests (unit + integration + doctests)
make test

# Run only unit tests (fastest feedback)
cargo test --lib

# Run specific test
cargo test test_name

# Run tests in specific crate
cargo test -p pforge-runtime

# Run with verbose output
cargo test -- --nocapture
```
### Continuous Testing
pforge provides a watch mode for extreme TDD:
```bash
# Watch mode: auto-run tests on file changes
make watch

# Manual watch with cargo-watch
cargo watch -x 'test --lib --quiet' -x 'clippy --quiet'
```
Tests re-run automatically on file save, providing <1s feedback for unit tests.
### Coverage Analysis
```bash
# Generate comprehensive coverage report
make coverage

# View summary
make coverage-summary

# Open HTML report in browser
make coverage-open
```

Coverage generation uses `cargo-llvm-cov` with `cargo-nextest` for accurate, fast results.
## Quality Gates
Every commit must pass the quality gate:
```bash
# Run full quality gate (CI equivalent)
make quality-gate
```
This runs:

- `cargo fmt --check` - Code formatting
- `cargo clippy -- -D warnings` - Linting with zero warnings
- `cargo test --all` - All tests
- `cargo llvm-cov` - Coverage check (≥80%)
- `pmat analyze complexity --max 20` - Complexity enforcement
- `pmat analyze satd` - Technical debt detection
- `pmat tdg` - Technical Debt Grade (≥0.75)
Development is blocked if any gate fails (Jidoka/“stop the line” principle).
### Pre-Commit Hooks
pforge uses git hooks to enforce quality before commits:
```bash
#!/bin/bash
# Located at: .git/hooks/pre-commit
set -e

echo "Running pre-commit quality gates..."

# Format check
cargo fmt --check || (echo "Run 'cargo fmt' first" && exit 1)

# Clippy
cargo clippy --all-targets -- -D warnings

# Tests
cargo test --all

# PMAT checks
pmat quality-gate run

echo "✅ All quality gates passed!"
```
Commits are rejected if any check fails, ensuring the main branch always passes CI.
### Continuous Integration
GitHub Actions runs comprehensive tests on every PR:
```yaml
# .github/workflows/quality.yml
jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - name: Run quality gate
        run: make quality-gate
      - name: Mutation testing
        run: cargo mutants --check
      - name: Upload coverage
        uses: codecov/codecov-action@v3
```
CI enforces:
- All tests pass on multiple platforms (Linux, macOS, Windows)
- Coverage ≥80%
- Zero clippy warnings
- PMAT quality gates pass
- Mutation testing achieves ≥90% kill rate
## Test-Driven Development
pforge uses Extreme TDD with strict 5-minute cycles:
### The 5-Minute Cycle
- RED (2 min): Write a failing test
- GREEN (2 min): Write minimum code to pass
- REFACTOR (1 min): Clean up, run quality gates
- COMMIT: If gates pass
- RESET: If cycle exceeds 5 minutes, start over
### Example TDD Session
```rust
// RED: Write failing test (2 min)
#[test]
fn test_config_validation_rejects_duplicates() {
    let config = create_config_with_duplicate_tools();
    let result = validate_config(&config);
    assert!(result.is_err()); // FAILS: validation not implemented
}

// GREEN: Implement minimal solution (2 min)
pub fn validate_config(config: &ForgeConfig) -> Result<()> {
    let mut seen = HashSet::new();
    for tool in &config.tools {
        if !seen.insert(tool.name()) {
            return Err(ConfigError::DuplicateToolName(tool.name()));
        }
    }
    Ok(())
}

// REFACTOR: Clean up (1 min)
// - Add documentation
// - Run clippy
// - Check complexity
// - Commit if all gates pass
```
### Benefits of Extreme TDD
- Rapid Feedback: <1s for unit tests
- Quality Built In: Tests written first ensure comprehensive coverage
- Prevention Over Detection: Bugs caught at creation time
- Living Documentation: Tests document expected behavior
## Testing Best Practices

### Unit Test Guidelines
- Fast: Each test must complete in <1ms
- Focused: Test one behavior per test
- Isolated: No shared state between tests
- Deterministic: Same input always produces same result
- Clear: Test name describes what’s being tested
```rust
#[test]
fn test_handler_registry_returns_error_for_unknown_tool() {
    let registry = HandlerRegistry::new();
    let result = registry.get("nonexistent");

    assert!(result.is_err());
    assert!(matches!(result.unwrap_err(), Error::ToolNotFound(_)));
}
```
### Integration Test Guidelines
- Realistic: Test real workflows
- Efficient: Target <100ms per test
- Comprehensive: Cover all integration points
- Independent: Each test can run in isolation
```rust
#[tokio::test]
async fn test_middleware_chain_with_recovery() {
    let mut chain = MiddlewareChain::new();
    chain.add(Arc::new(ValidationMiddleware::new(vec!["input".to_string()])));
    chain.add(Arc::new(RecoveryMiddleware::new()));

    let result = chain.execute(json!({"input": 42}), handler).await;
    assert!(result.is_ok());
}
```
### Property Test Guidelines
- Universal: Test properties that hold for all inputs
- Diverse: Generate wide range of test cases
- Persistent: Save failing cases for regression prevention
- Exhaustive: Run thousands of iterations (10K default)
```rust
proptest! {
    #[test]
    fn config_serialization_roundtrip(config in arb_forge_config()) {
        let yaml = serde_yml::to_string(&config)?;
        let parsed: ForgeConfig = serde_yml::from_str(&yaml)?;
        prop_assert_eq!(config.forge.name, parsed.forge.name);
    }
}
```
## Common Testing Patterns

### Testing Error Paths
All error paths must be tested:
```rust
#[test]
fn test_handler_timeout_returns_timeout_error() {
    let handler = create_slow_handler();
    let result = execute_with_timeout(handler, Duration::from_millis(10));

    assert!(matches!(result.unwrap_err(), Error::Timeout(_)));
}
```
### Testing Async Code

Use `#[tokio::test]` for async tests:
```rust
#[tokio::test]
async fn test_concurrent_handler_dispatch() {
    let registry = create_registry();

    let handles: Vec<_> = (0..100)
        .map(|i| tokio::spawn(registry.dispatch("tool", &params(i))))
        .collect();

    for handle in handles {
        assert!(handle.await.unwrap().is_ok());
    }
}
```
### Testing State Management
Isolate state between tests:
```rust
#[tokio::test]
async fn test_state_persistence() -> Result<()> {
    let state = MemoryStateManager::new();

    state.set("key", b"value".to_vec(), None).await?;
    assert_eq!(state.get("key").await?, Some(b"value".to_vec()));

    state.delete("key").await?;
    assert_eq!(state.get("key").await?, None);
    Ok(())
}
```
## Debugging Failed Tests

### Verbose Output
```bash
# Show println! output
cargo test -- --nocapture

# Show test names as they run
cargo test -- --nocapture --test-threads=1
```
### Running Single Tests
```bash
# Run specific test
cargo test test_config_validation

# Run with backtrace
RUST_BACKTRACE=1 cargo test test_config_validation

# Run with full backtrace
RUST_BACKTRACE=full cargo test test_config_validation
```
### Test Filtering
```bash
# Run all tests matching pattern
cargo test config

# Run tests in specific module
cargo test registry::tests

# Run ignored tests
cargo test -- --ignored
```
## Summary
pforge’s testing strategy ensures production-ready quality through:
- 115 comprehensive tests across all layers
- Multiple testing strategies: unit, integration, property-based, mutation
- Strict quality gates: coverage, complexity, TDD enforcement
- Fast feedback loops: <1ms unit tests, <15s full suite
- Continuous quality: pre-commit hooks, CI/CD pipeline
The following chapters provide detailed guides for each testing layer:
- Chapter 9.1: Unit Testing - Fast, focused component tests
- Chapter 9.2: Integration Testing - Cross-crate and system tests
- Chapter 9.3: Property-Based Testing - Automated edge case discovery
- Chapter 9.4: Mutation Testing - Validating test effectiveness
Together, these strategies ensure pforge maintains the highest quality standards while enabling rapid, confident development.