Property-Based Testing
Property-based testing automatically discovers edge cases by generating thousands of random test inputs and verifying that certain properties (invariants) always hold true. pforge uses 12 property-based tests with 10,000 iterations each, totaling 120,000 automated test cases that would be infeasible to write manually.
Property-Based Testing Philosophy
Traditional example-based testing checks specific cases. Property-based testing checks universal truths:
| Approach | Example-Based | Property-Based |
|---|---|---|
| Test cases | Hand-written | Auto-generated |
| Coverage | Specific scenarios | Wide input space |
| Edge cases | Manual discovery | Automatic discovery |
| Count | Dozens | Thousands |
| Failures | Show bug | Find + minimize example |
The Power of Properties
A single property test can replace hundreds of example tests:
// Example-based: Test specific cases
#[test]
fn test_config_roundtrip_example1() {
let config = /* specific config */;
let yaml = serde_yml::to_string(&config).unwrap();
let parsed: ForgeConfig = serde_yml::from_str(&yaml).unwrap();
assert_eq!(config.forge.name, parsed.forge.name);
}
#[test]
fn test_config_roundtrip_example2() { /* ... */ }
// ... hundreds more examples needed ...
// Property-based: Test universal property
proptest! {
#[test]
fn config_serialization_roundtrip(config in arb_forge_config()) {
// Runs against 10,000 random configs once `cases` is set to 10000 (proptest's default is 256)
let yaml = serde_yml::to_string(&config)?;
let parsed: ForgeConfig = serde_yml::from_str(&yaml)?;
prop_assert_eq!(config.forge.name, parsed.forge.name);
}
}
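To make the mechanics concrete, here is a minimal std-only sketch of the loop a property-testing framework runs: generate an input, check the property, and report the first counterexample. It is not proptest's implementation. `XorShift64`, `gen_bytes`, `check_property`, and `double_reverse_is_identity` are hypothetical names introduced only for illustration.

```rust
/// Minimal xorshift PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // Xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generates a vector of 0 to 15 pseudo-random bytes.
fn gen_bytes(rng: &mut XorShift64) -> Vec<u8> {
    let len = (rng.next_u64() % 16) as usize;
    (0..len).map(|_| (rng.next_u64() >> 56) as u8).collect()
}

/// Runs `property` against `cases` generated inputs.
/// Returns the first input that violates the property, if any.
fn check_property<F>(seed: u64, cases: u32, property: F) -> Option<Vec<u8>>
where
    F: Fn(&[u8]) -> bool,
{
    let mut rng = XorShift64::new(seed);
    for _ in 0..cases {
        let input = gen_bytes(&mut rng);
        if !property(&input) {
            return Some(input);
        }
    }
    None
}

/// Property: reversing a sequence twice yields the original sequence.
fn double_reverse_is_identity(input: &[u8]) -> bool {
    let mut v = input.to_vec();
    v.reverse();
    v.reverse();
    v.as_slice() == input
}
```

Calling `check_property(42, 10_000, double_reverse_is_identity)` returns `None` because the property holds for every input. A false property such as "every vector has fewer than 3 elements" returns a counterexample instead. proptest adds strategies, shrinking, and failure persistence on top of this basic loop.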
Setup and Configuration
pforge uses the proptest crate for property-based testing:
# Cargo.toml
[dev-dependencies]
proptest = "1.0"
Proptest Configuration
proptest! {
#![proptest_config(ProptestConfig {
cases: 10000, // Run 10K iterations per property
max_shrink_iters: 10000, // Minimize failing examples
..ProptestConfig::default()
})]
#[test]
fn my_property(input in arb_my_type()) {
// Test logic...
}
}
Arbitrary Generators
Generators create random test data. pforge has custom generators for all config types:
Simple Type Generators
fn arb_simple_type() -> impl Strategy<Value = SimpleType> {
prop_oneof![
Just(SimpleType::String),
Just(SimpleType::Integer),
Just(SimpleType::Float),
Just(SimpleType::Boolean),
Just(SimpleType::Array),
Just(SimpleType::Object),
]
}
fn arb_transport_type() -> impl Strategy<Value = TransportType> {
prop_oneof![
Just(TransportType::Stdio),
Just(TransportType::Sse),
Just(TransportType::WebSocket),
]
}
fn arb_optimization_level() -> impl Strategy<Value = OptimizationLevel> {
prop_oneof![
Just(OptimizationLevel::Debug),
Just(OptimizationLevel::Release),
]
}
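Conceptually, an unweighted prop_oneof! over Just(..) values draws one of the listed variants with roughly equal probability. The following std-only sketch imitates that behavior with a hand-rolled PRNG. It is an illustration under that assumption, not proptest's implementation, and `XorShift64`, `ALL_SIMPLE_TYPES`, `pick_simple_type`, and `variant_counts` are hypothetical names.

```rust
/// Minimal xorshift PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Mirror of the variants listed in `arb_simple_type` (redeclared here for the sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SimpleType {
    String,
    Integer,
    Float,
    Boolean,
    Array,
    Object,
}

const ALL_SIMPLE_TYPES: [SimpleType; 6] = [
    SimpleType::String,
    SimpleType::Integer,
    SimpleType::Float,
    SimpleType::Boolean,
    SimpleType::Array,
    SimpleType::Object,
];

/// Picks one variant with approximately equal probability.
fn pick_simple_type(rng: &mut XorShift64) -> SimpleType {
    let index = (rng.next_u64() % ALL_SIMPLE_TYPES.len() as u64) as usize;
    ALL_SIMPLE_TYPES[index]
}

/// Counts how often each variant is drawn, in declaration order.
fn variant_counts(seed: u64, draws: u32) -> [u32; 6] {
    let mut rng = XorShift64::new(seed);
    let mut counts = [0u32; 6];
    for _ in 0..draws {
        let picked = pick_simple_type(&mut rng);
        let slot = ALL_SIMPLE_TYPES.iter().position(|t| *t == picked).unwrap();
        counts[slot] += 1;
    }
    counts
}
```

With a few thousand draws every variant shows up, which is why a small enum generator exercises all of its variants long before 10,000 cases are reached.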
Structured Generators
fn arb_forge_metadata() -> impl Strategy<Value = ForgeMetadata> {
(
"[a-z][a-z0-9_-]{2,20}", // Name regex
"[0-9]\\.[0-9]\\.[0-9]", // Version regex
arb_transport_type(),
arb_optimization_level(),
)
.prop_map(|(name, version, transport, optimization)| ForgeMetadata {
name,
version,
transport,
optimization,
})
}
fn arb_handler_ref() -> impl Strategy<Value = HandlerRef> {
"[a-z][a-z0-9_]{2,10}::[a-z][a-z0-9_]{2,10}"
.prop_map(|path| HandlerRef { path, inline: None })
}
fn arb_param_schema() -> impl Strategy<Value = ParamSchema> {
prop::collection::hash_map(
"[a-z][a-z0-9_]{2,15}", // Field names
arb_simple_type().prop_map(ParamType::Simple),
0..5, // 0 to 4 fields (upper bound is exclusive)
)
.prop_map(|fields| ParamSchema { fields })
}
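A regex strategy such as "[a-z][a-z0-9_-]{2,20}" produces strings of 3 to 21 characters: one leading letter plus 2 to 20 further characters. The std-only sketch below shows a hand-written generator and a matching checker for that pattern. `XorShift64`, `gen_name`, and `matches_name_pattern` are hypothetical helpers, not part of proptest or pforge.

```rust
/// Minimal xorshift PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

const FIRST: &[u8] = b"abcdefghijklmnopqrstuvwxyz";
const REST: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789_-";

/// Generates a string matching `[a-z][a-z0-9_-]{2,20}`.
fn gen_name(rng: &mut XorShift64) -> String {
    // The tail has between 2 and 20 characters, inclusive.
    let tail_len = 2 + (rng.next_u64() % 19) as usize;
    let mut name = String::with_capacity(1 + tail_len);
    name.push(FIRST[(rng.next_u64() % FIRST.len() as u64) as usize] as char);
    for _ in 0..tail_len {
        name.push(REST[(rng.next_u64() % REST.len() as u64) as usize] as char);
    }
    name
}

/// Returns true when `name` matches `[a-z][a-z0-9_-]{2,20}` in full.
fn matches_name_pattern(name: &str) -> bool {
    let bytes = name.as_bytes();
    (3..=21).contains(&bytes.len())
        && bytes[0].is_ascii_lowercase()
        && bytes[1..]
            .iter()
            .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || *b == b'_' || *b == b'-')
}
```

Because the generator can only emit strings the checker accepts, a property that feeds `gen_name` output into name validation should never fail for reasons of format. Inputs outside the pattern (empty strings, uppercase, Unicode) need a separate, deliberately broader generator.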
Complex Generators with Constraints
fn arb_forge_config() -> impl Strategy<Value = ForgeConfig> {
(
arb_forge_metadata(),
prop::collection::vec(arb_tool_def(), 1..10),
)
.prop_map(|(forge, tools)| {
// Ensure unique tool names (constraint)
let mut unique_tools = Vec::new();
let mut seen_names = std::collections::HashSet::new();
for tool in tools {
let name = tool.name();
if seen_names.insert(name.to_string()) {
unique_tools.push(tool);
}
}
ForgeConfig {
forge,
tools: unique_tools,
resources: vec![],
prompts: vec![],
state: None,
}
})
}
pforge’s 12 Properties
Category 1: Configuration Properties (6 tests)
Property 1: Serialization Roundtrip
Invariant: Serializing and deserializing a config preserves its structure.
proptest! {
#[test]
fn config_serialization_roundtrip(config in arb_forge_config()) {
// YAML roundtrip
let yaml = serde_yml::to_string(&config).unwrap();
let parsed: ForgeConfig = serde_yml::from_str(&yaml).unwrap();
// Key properties preserved
prop_assert_eq!(&config.forge.name, &parsed.forge.name);
prop_assert_eq!(&config.forge.version, &parsed.forge.version);
prop_assert_eq!(config.tools.len(), parsed.tools.len());
}
}
Edge cases found: Empty strings, special characters, Unicode in names.
Property 2: Tool Name Uniqueness
Invariant: All tool names in a generated config are unique (the generator deduplicates them).
proptest! {
#[test]
fn tool_names_unique(config in arb_forge_config()) {
let mut names = std::collections::HashSet::new();
for tool in &config.tools {
prop_assert!(names.insert(tool.name()));
}
}
}
Edge cases found: Case sensitivity, whitespace differences.
Property 3: Valid Configs Pass Validation
Invariant: Configs generated by our generators always pass validation.
proptest! {
#[test]
fn valid_configs_pass_validation(config in arb_forge_config()) {
let result = validate_config(&config);
prop_assert!(result.is_ok(), "Valid config failed validation: {:?}", result);
}
}
Edge cases found: Empty tool lists, minimal configs.
Property 4: Handler Paths Contain Separator
Invariant: Native tool handler paths always contain the :: separator.
proptest! {
#[test]
fn native_handler_paths_valid(config in arb_forge_config()) {
for tool in &config.tools {
if let ToolDef::Native { handler, .. } = tool {
prop_assert!(handler.path.contains("::"),
"Handler path '{}' doesn't contain ::", handler.path);
}
}
}
}
Edge cases found: Single-segment paths, paths with multiple separators.
Property 5: Transport Types Serialize Correctly
Invariant: Transport types roundtrip through serialization.
proptest! {
#[test]
fn transport_types_valid(config in arb_forge_config()) {
let yaml = serde_yml::to_string(&config.forge.transport).unwrap();
let parsed: TransportType = serde_yml::from_str(&yaml).unwrap();
prop_assert_eq!(config.forge.transport, parsed);
}
}
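For a type with only a handful of values, the roundtrip property can even be checked exhaustively. The std-only sketch below does this for a three-variant enum with a hand-written text encoding. The wire names ("stdio", "sse", "websocket") and the `as_str`, `parse`, and `roundtrips` helpers are assumptions made for illustration. pforge's actual serde representation may differ.

```rust
/// Mirror of the transport variants (redeclared here for the sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TransportType {
    Stdio,
    Sse,
    WebSocket,
}

impl TransportType {
    const ALL: [TransportType; 3] = [
        TransportType::Stdio,
        TransportType::Sse,
        TransportType::WebSocket,
    ];

    /// Hypothetical text encoding of each variant.
    fn as_str(self) -> &'static str {
        match self {
            TransportType::Stdio => "stdio",
            TransportType::Sse => "sse",
            TransportType::WebSocket => "websocket",
        }
    }

    /// Inverse of `as_str`; returns None for unknown text.
    fn parse(text: &str) -> Option<TransportType> {
        Self::ALL.iter().copied().find(|t| t.as_str() == text)
    }
}

/// Roundtrip property: decoding the encoding yields the original value.
fn roundtrips(t: TransportType) -> bool {
    TransportType::parse(t.as_str()) == Some(t)
}
```

Checking `roundtrips` for every element of `TransportType::ALL` covers the whole input space in three calls. Random generation earns its keep once the type is embedded in larger structures where exhaustive enumeration is no longer practical.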
Property 6: Tool Names Follow Conventions
Invariant: Tool names are lowercase alphanumeric with hyphens/underscores, length 3-50.
proptest! {
#[test]
fn tool_names_follow_conventions(config in arb_forge_config()) {
for tool in &config.tools {
let name = tool.name();
prop_assert!(name.chars().all(|c|
c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-' || c == '_'
), "Tool name '{}' doesn't follow conventions", name);
prop_assert!(name.len() >= 3 && name.len() <= 50,
"Tool name '{}' length {} not in range 3-50", name, name.len());
}
}
}
Category 2: Validation Properties (2 tests)
Property 7: Duplicate Names Always Rejected
Invariant: Configs with duplicate tool names always fail validation.
proptest! {
#[test]
fn duplicate_tool_names_rejected(name in "[a-z][a-z0-9_-]{2,20}") {
let config = ForgeConfig {
forge: create_test_metadata(),
tools: vec![
ToolDef::Native {
name: name.clone(),
description: "Tool 1".to_string(),
handler: HandlerRef { path: "mod1::handler".to_string(), inline: None },
params: ParamSchema { fields: HashMap::new() },
timeout_ms: None,
},
ToolDef::Native {
name: name.clone(), // Duplicate!
description: "Tool 2".to_string(),
handler: HandlerRef { path: "mod2::handler".to_string(), inline: None },
params: ParamSchema { fields: HashMap::new() },
timeout_ms: None,
},
],
resources: vec![],
prompts: vec![],
state: None,
};
let result = validate_config(&config);
prop_assert!(result.is_err(), "Duplicate names should fail validation");
prop_assert!(matches!(result.unwrap_err(), ConfigError::DuplicateToolName(_)));
}
}
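The check that this property exercises can be sketched in a few lines. The version below is a simplified, std-only stand-in that works on bare names instead of a full ForgeConfig. `validate_tool_names` is a hypothetical helper, and the String payload of `ConfigError::DuplicateToolName` is an assumption, since the source only shows that the variant carries one field.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the error type used in the property above.
#[derive(Debug, PartialEq, Eq)]
enum ConfigError {
    DuplicateToolName(String),
}

/// Returns an error naming the first tool name that appears twice.
fn validate_tool_names(names: &[&str]) -> Result<(), ConfigError> {
    let mut seen = HashSet::new();
    for name in names {
        // `insert` returns false when the value was already present.
        if !seen.insert(*name) {
            return Err(ConfigError::DuplicateToolName((*name).to_string()));
        }
    }
    Ok(())
}
```

Note that this comparison is exact: "Tool" and "tool" count as different names. Whether validation should normalize case or whitespace first is a design decision the property has to state explicitly.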
Property 8: Invalid Handler Paths Rejected
Invariant: Handler paths without :: are always rejected.
proptest! {
#[test]
fn invalid_handler_paths_rejected(path in "[a-z]{3,20}") {
// Path without :: should fail
let config = create_config_with_handler_path(path);
let result = validate_config(&config);
prop_assert!(result.is_err(), "Invalid handler path should fail validation");
}
}
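A validator satisfying this property might look like the std-only sketch below. `validate_handler_path` is a hypothetical function. The source only requires that the path contain ::, so the additional rule that every segment be non-empty is an assumption added here.

```rust
/// Accepts paths such as `module::function`; rejects paths without `::`.
/// The non-empty-segment rule is an assumption of this sketch.
fn validate_handler_path(path: &str) -> Result<(), String> {
    let segments: Vec<&str> = path.split("::").collect();
    if segments.len() < 2 {
        return Err(format!("handler path '{}' does not contain ::", path));
    }
    if segments.iter().any(|s| s.is_empty()) {
        return Err(format!("handler path '{}' has an empty segment", path));
    }
    Ok(())
}
```

Any string drawn from "[a-z]{3,20}" contains no colon at all, so it splits into a single segment and is rejected by the first check. That is exactly the input class the property above generates.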
Category 3: Edge Case Properties (2 tests)
Property 9: Empty Configs Valid
Invariant: Configs with only metadata (no tools) are valid.
proptest! {
#[test]
fn empty_config_valid(forge in arb_forge_metadata()) {
let config = ForgeConfig {
forge,
tools: vec![],
resources: vec![],
prompts: vec![],
state: None,
};
let result = validate_config(&config);
prop_assert!(result.is_ok(), "Empty config should be valid");
}
}
Property 10: Single Tool Configs Valid
Invariant: Any config with exactly one tool is valid.
proptest! {
#[test]
fn single_tool_valid(forge in arb_forge_metadata(), tool in arb_tool_def()) {
let config = ForgeConfig {
forge,
tools: vec![tool],
resources: vec![],
prompts: vec![],
state: None,
};
let result = validate_config(&config);
prop_assert!(result.is_ok(), "Single tool config should be valid");
}
}
Category 4: Type System Properties (2 tests)
Property 11: HTTP Methods Serialize Correctly
proptest! {
#[test]
fn http_methods_valid(method in arb_http_method()) {
let yaml = serde_yml::to_string(&method).unwrap();
let parsed: HttpMethod = serde_yml::from_str(&yaml).unwrap();
prop_assert_eq!(method, parsed);
}
}
Property 12: Optimization Levels Consistent
proptest! {
#[test]
fn optimization_levels_consistent(level in arb_optimization_level()) {
let yaml = serde_yml::to_string(&level).unwrap();
let parsed: OptimizationLevel = serde_yml::from_str(&yaml).unwrap();
prop_assert_eq!(level, parsed);
}
}
Shrinking: Minimal Failing Examples
When a property fails, proptest shrinks the input to find the minimal example:
// Property fails with complex config
Config {
name: "xyz_server_test_123",
tools: [tool1, tool2, tool3, tool4],
...
}
// Proptest shrinks to minimal failing case
Config {
name: "a", // Minimal failing name
tools: [], // Minimal failing tools
...
}
Failing cases are persisted in proptest-regressions/ and replayed on later runs to prevent regressions.
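The idea behind shrinking can be shown with a greedy, std-only sketch: repeatedly try simpler variants of a failing input and keep the first one that still fails. This is an illustration only. proptest's shrinkers are tied to its strategies and are more sophisticated, and `shrink_candidates` and `shrink` are hypothetical names.

```rust
/// Candidate simplifications: drop one element, or halve one element.
fn shrink_candidates(input: &[u32]) -> Vec<Vec<u32>> {
    let mut out = Vec::new();
    for i in 0..input.len() {
        let mut dropped = input.to_vec();
        dropped.remove(i);
        out.push(dropped);
    }
    for i in 0..input.len() {
        if input[i] > 0 {
            let mut halved = input.to_vec();
            halved[i] /= 2;
            out.push(halved);
        }
    }
    out
}

/// Greedily replaces a failing input with a simpler failing one
/// until no candidate fails the property any more.
fn shrink<F>(mut failing: Vec<u32>, property: F) -> Vec<u32>
where
    F: Fn(&[u32]) -> bool,
{
    loop {
        let next = shrink_candidates(&failing)
            .into_iter()
            .find(|c| !property(c.as_slice()));
        match next {
            Some(simpler) => failing = simpler,
            None => return failing,
        }
    }
}
```

For the (false) property "the sum of the elements is below 100", the failing input `[3, 250, 7, 40]` shrinks to `[125]`: the irrelevant elements are dropped first, then 250 is halved once. The result is minimal only with respect to these two moves. A binary search on the value would continue down to 100.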
Running Property Tests
Basic Commands
# Run all property tests (10K cases each)
cargo test --test property_test
# Run specific property
cargo test --test property_test config_serialization_roundtrip
# Run with more cases (takes effect only for properties that do not set `cases` explicitly in their ProptestConfig)
PROPTEST_CASES=100000 cargo test --test property_test
# Run with seed for reproducibility
PROPTEST_SEED=1234567890 cargo test --test property_test
Release Mode
Property tests run faster in release mode:
# Recommended: Run in release mode
cargo test --test property_test --release -- --test-threads=1
This is the default in the Makefile:
make test-property
Regression Files
Failed tests are saved in proptest-regressions/:
crates/pforge-integration-tests/
└── proptest-regressions/
└── property_test.txt # Failing cases
Example regression file:
# Seeds for failing test cases. Edit at your own risk.
# property: config_serialization_roundtrip
xs 3582691854 1234567890
Important: Commit regression files to git! Every later run replays the recorded failures, so a fixed bug cannot silently return.
Writing New Properties
Step 1: Define Generator
fn arb_my_type() -> impl Strategy<Value = MyType> {
(
arb_field1(),
arb_field2(),
).prop_map(|(field1, field2)| MyType { field1, field2 })
}
Step 2: Write Property
proptest! {
#[test]
fn my_property(input in arb_my_type()) {
let result = my_function(input);
prop_assert!(result.is_ok());
}
}
Step 3: Run and Refine
cargo test --test property_test my_property
If failures occur:
- Check if property is actually true
- Adjust generator constraints
- Fix implementation bugs
- Commit regression file
Property Testing Best Practices
1. Test Universal Truths
// Good: Universal property
proptest! {
#[test]
fn serialize_deserialize_roundtrip(x in any::<MyType>()) {
let json = serde_json::to_string(&x)?;
let y: MyType = serde_json::from_str(&json)?;
prop_assert_eq!(x, y); // Always true
}
}
// Bad: Specific assertion
proptest! {
#[test]
fn bad_property(x in any::<i32>()) {
prop_assert_eq!(x, 42); // Only true 1/2^32 times!
}
}
2. Use Meaningful Generators
// Good: Generates valid data
fn arb_email() -> impl Strategy<Value = String> {
"[a-z]{1,10}@[a-z]{1,10}\\.(com|org|net)"
}
// Bad: Most generated strings aren't emails
fn arb_email_bad() -> impl Strategy<Value = String> {
any::<String>() // Generates arbitrary Unicode strings
}
3. Add Constraints to Generators
fn arb_positive_number() -> impl Strategy<Value = i32> {
1..=i32::MAX // Constrained range
}
fn arb_non_empty_vec<T: Arbitrary>() -> impl Strategy<Value = Vec<T>> {
prop::collection::vec(any::<T>(), 1..100) // At least 1 element
}
4. Test Error Conditions
proptest! {
#[test]
fn invalid_input_rejected(bad_input in arb_invalid_input()) {
let result = validate(bad_input);
prop_assert!(result.is_err()); // Should always fail
}
}
Benefits and Limitations
Benefits
- Comprehensive: 10K+ cases per property vs ~10 manual examples
- Edge case discovery: Finds bugs humans miss
- Regression prevention: Failing cases saved automatically
- Documentation: Properties describe system invariants
- Confidence: Strong evidence (not a proof) that invariants hold across the sampled input space
Limitations
- Slower: 10K iterations take seconds vs milliseconds for unit tests
- Complexity: Generators can be complex to write
- False positives: An imprecisely stated property fails on correct code, so properties must be stated precisely
- Non-determinism: Random failures can be hard to debug (use seeds!)
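Seeds help with debugging because a pseudo-random generator is fully determined by its starting state. The std-only sketch below demonstrates this with a small xorshift generator. `XorShift64` and `sample` are hypothetical names, and proptest uses its own RNG and seed handling.

```rust
/// Minimal xorshift PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Draws `n` values from a generator started at `seed`.
fn sample(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = XorShift64::new(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}
```

Two calls with the same seed yield identical sequences, so a failure observed once can be replayed exactly. A different seed explores a different part of the input space.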
Integration with CI/CD
Property tests run in CI but with fewer iterations for speed:
# .github/workflows/quality.yml
- name: Property tests
run: |
PROPTEST_CASES=1000 cargo test --test property_test --release
Locally, run full 10K iterations:
make test-property # Uses 10K cases
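If a project resolves the iteration count itself, for example in a custom test harness, the lookup can be as simple as the std-only sketch below. `parse_case_count` and `case_count_from_env` are hypothetical helpers and do not describe proptest's internals. The fallback to a default for missing, malformed, or zero values is an assumption of this sketch.

```rust
/// Parses a case count, falling back to `default` when the value
/// is missing, not a number, or zero.
fn parse_case_count(raw: Option<&str>, default: u32) -> u32 {
    raw.and_then(|s| s.trim().parse::<u32>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(default)
}

/// Reads the count from the PROPTEST_CASES environment variable.
fn case_count_from_env(default: u32) -> u32 {
    let raw = std::env::var("PROPTEST_CASES").ok();
    parse_case_count(raw.as_deref(), default)
}
```

Separating the pure parsing step from the environment lookup keeps the logic testable without mutating process-wide state. With this layout CI can export `PROPTEST_CASES=1000` while local runs fall back to 10,000.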
Real-World Impact
Property-based testing has found real bugs in pforge:
- Unicode handling: Tool names with emoji crashed parser
- Empty configs: Validation rejected valid empty tool lists
- Case sensitivity: Duplicate detection was case-sensitive
- Whitespace: Leading/trailing whitespace in names caused issues
- Nesting depth: Deeply nested param schemas caused stack overflow
All caught by property tests before reaching production!
Summary
Property-based testing provides massive test coverage with minimal code:
- 12 properties generate 120,000 test cases
- Automatic edge case discovery finds bugs humans miss
- Shrinking provides minimal failing examples
- Regression prevention through persisted failing cases
- Randomized checking gives strong evidence, though not proof, that invariants hold
Combined with unit tests (Chapter 9.1) and integration tests (Chapter 9.2), property-based testing ensures pforge’s configuration system is rock-solid. Next, Chapter 9.4 covers mutation testing to validate that our tests are actually effective.
Further Reading
- Proptest Book
- QuickCheck Paper - Original property testing paper
- Hypothesis - Python property testing
- pforge property tests: crates/pforge-integration-tests/property_test.rs