RED: Write Failing Test
The RED phase is where you define what success looks like before writing any production code. You have exactly 2 minutes to write a failing test that clearly specifies the next increment of behavior.
The Purpose of RED
RED is about specification, not testing. The test you write answers the question: “What should the next tiny piece of functionality do?”
Why Tests Come First
Design Pressure: Writing tests first forces you to think from the caller’s perspective. You design interfaces that are pleasant to use, not convenient to implement.
Clear Goal: Before writing implementation, you have a concrete, executable definition of “done.” The test passes = you’re finished.
Prevents Scope Creep: Writing tests first forces you to commit to a small scope before getting distracted by implementation details.
Living Documentation: Tests document intent better than comments. Comments lie; tests are executable and must stay accurate.
The 2-Minute Budget
Two minutes to write a test feels tight. It is. This constraint forces several good practices:
Small Increments: If you can’t write a test in 2 minutes, your increment is too large. Break it down.
Test Template Reuse: You’ll develop a library of test patterns that you can copy and adapt quickly.
No Overthinking: Two minutes prevents analysis paralysis. Write the simplest test that fails for the right reason.
Anatomy of a Good RED Test
A good RED test has three characteristics:
1. Compiles (If Possible)
In typed languages like Rust, the test should compile even if types don’t exist yet. Use comments or temporary stubs:
// COMPILES - Types exist
#[tokio::test]
async fn test_greet_returns_greeting() {
let handler = GreetHandler;
let input = GreetInput {
name: "Alice".to_string(),
};
let result = handler.handle(input).await;
assert!(result.is_ok());
}
If types don’t exist:
// DOESN'T COMPILE YET - Types will be created in GREEN
#[tokio::test]
async fn test_divide_handles_zero() {
let handler = DivideHandler;
let input = DivideInput {
numerator: 10.0,
denominator: 0.0,
};
let result = handler.handle(input).await;
assert!(result.is_err());
// Will be: Error::Validation("Division by zero")
}
Both are valid RED tests. The first runs and fails (returns wrong value). The second doesn’t compile (types missing). Either way, you’re RED.
2. Fails for the Right Reason
The test must fail because the feature doesn’t exist, not because of typos or wrong imports:
// GOOD - Fails because feature missing
#[tokio::test]
async fn test_calculate_mean() {
let handler = StatisticsHandler;
let input = StatsInput {
data: vec![1.0, 2.0, 3.0, 4.0, 5.0],
};
let result = handler.handle(input).await.unwrap();
assert_eq!(result.mean, 3.0);
}
// Fails: field `mean` does not exist in `StatsOutput`
// BAD - Fails because of typo
#[tokio::test]
async fn test_calculate_mean() {
let handler = StatisticsHander; // typo!
// ...
}
// Fails: cannot find struct `StatisticsHander`
Run your test immediately after writing it to verify it fails correctly.
3. Tests One Thing
Each test should verify one specific behavior:
// GOOD - One behavior
#[tokio::test]
async fn test_divide_returns_quotient() {
let handler = DivideHandler;
let input = DivideInput {
numerator: 10.0,
denominator: 2.0,
};
let result = handler.handle(input).await.unwrap();
assert_eq!(result.quotient, 5.0);
}
// GOOD - Different behavior, separate test
#[tokio::test]
async fn test_divide_rejects_zero_denominator() {
let handler = DivideHandler;
let input = DivideInput {
numerator: 10.0,
denominator: 0.0,
};
let result = handler.handle(input).await;
assert!(result.is_err());
}
// BAD - Multiple behaviors in one test
#[tokio::test]
async fn test_divide_everything() {
// Tests division
let result1 = handler.handle(DivideInput { ... }).await.unwrap();
assert_eq!(result1.quotient, 5.0);
// Tests zero handling
let result2 = handler.handle(DivideInput { denominator: 0.0, ... }).await;
assert!(result2.is_err());
// Tests negative numbers
let result3 = handler.handle(DivideInput { numerator: -10.0, ... }).await.unwrap();
assert_eq!(result3.quotient, -5.0);
}
Multiple assertions are fine if they verify the same behavior. Multiple behaviors require separate tests.
Test Naming Conventions
Test names should read as specifications:
// GOOD - Reads as specification
test_greet_returns_personalized_message()
test_divide_rejects_zero_denominator()
test_statistics_calculates_mean_correctly()
test_file_read_handles_missing_file()
test_http_call_retries_on_timeout()
// BAD - Vague or implementation-focused
test_greet()
test_division()
test_math_works()
test_error_case()
test_function_1()
Pattern: test_<subject>_<behavior>_<condition>
Examples:
test_calculator_adds_positive_numbers
test_file_handler_creates_missing_directory
test_api_client_refreshes_expired_token
Quick Test Templates for pforge
Handler Happy Path Template
#[tokio::test]
async fn test_HANDLER_NAME_returns_OUTPUT() {
let handler = HandlerStruct;
let input = InputStruct {
field: value,
};
let result = handler.handle(input).await;
assert!(result.is_ok());
let output = result.unwrap();
assert_eq!(output.field, expected_value);
}
Handler Error Case Template
#[tokio::test]
async fn test_HANDLER_NAME_rejects_INVALID_INPUT() {
let handler = HandlerStruct;
let input = InputStruct {
field: invalid_value,
};
let result = handler.handle(input).await;
assert!(result.is_err());
match result.unwrap_err() {
Error::Validation(msg) => assert!(msg.contains("expected error substring")),
_ => panic!("Wrong error type"),
}
}
Handler Async Operation Template
#[tokio::test]
async fn test_HANDLER_NAME_completes_within_timeout() {
let handler = HandlerStruct;
let input = InputStruct { /* ... */ };
let timeout_duration = std::time::Duration::from_secs(5);
let result = tokio::time::timeout(
timeout_duration,
handler.handle(input)
).await;
assert!(result.is_ok(), "Handler timed out");
assert!(result.unwrap().is_ok());
}
Copy these templates, replace the placeholders, and you have a test in under 2 minutes.
The RED Checklist
Before moving to GREEN, verify:
- Test compiles OR fails to compile for the right reason (missing types)
- Test runs and fails OR doesn’t compile
- Test name clearly describes the behavior being specified
- Test is focused on one specific behavior
- Timer shows less than 2:00 minutes elapsed
If any item is unchecked, refine the test. If the timer exceeds 2:00, RESET.
Common RED Phase Mistakes
Mistake 1: Testing Too Much at Once
// BAD - Too much for one test
#[tokio::test]
async fn test_calculator_all_operations() {
// Addition
assert_eq!(calc.add(2, 3).await.unwrap(), 5);
// Subtraction
assert_eq!(calc.subtract(5, 3).await.unwrap(), 2);
// Multiplication
assert_eq!(calc.multiply(2, 3).await.unwrap(), 6);
// Division
assert_eq!(calc.divide(6, 3).await.unwrap(), 2);
}
Why it’s bad: If this test fails, you don’t know which operation broke. Also, implementing all four operations takes more than 2 minutes (GREEN phase).
Fix: One test per operation.
Mistake 2: Testing Implementation Details
// BAD - Tests internal structure
#[tokio::test]
async fn test_handler_uses_hashmap_internally() {
let handler = CacheHandler::new();
// Somehow peek into internals
assert!(handler.storage.is_hashmap());
}
Why it’s bad: Tests should verify behavior, not implementation. If you refactor from HashMap to BTreeMap, this test breaks even though behavior is unchanged.
Fix: Test observable behavior only.
// GOOD - Tests behavior
#[tokio::test]
async fn test_cache_retrieves_stored_value() {
let handler = CacheHandler::new();
handler.store("key", "value").await.unwrap();
let result = handler.retrieve("key").await.unwrap();
assert_eq!(result, "value");
}
Mistake 3: Complex Test Setup
// BAD - Setup takes too long
#[tokio::test]
async fn test_user_registration() {
// Too much setup
let db = setup_test_database().await;
let email_service = MockEmailService::new();
let password_hasher = Argon2::default();
let config = load_test_config("config.yaml");
let logger = setup_test_logger();
let handler = RegistrationHandler::new(db, email_service, password_hasher, config, logger);
// Test starts here...
}
Why it’s bad: You’ve exceeded 2 minutes just on setup. The test hasn’t even run yet.
Fix: Extract setup to a helper function or use test fixtures:
// GOOD - Fast setup
#[tokio::test]
async fn test_user_registration() {
let handler = create_test_registration_handler().await;
let input = RegistrationInput {
email: "test@example.com".to_string(),
password: "securepass123".to_string(),
};
let result = handler.handle(input).await;
assert!(result.is_ok());
}
// Helper function defined once, reused many times
async fn create_test_registration_handler() -> RegistrationHandler {
let db = setup_test_database().await;
let email_service = MockEmailService::new();
// ... etc
RegistrationHandler::new(db, email_service, /* ... */)
}
Mistake 4: Not Running the Test
Symptom: You write a test, assume it fails correctly, and move to GREEN.
Why it’s bad: The test might already pass (making it useless), or fail for the wrong reason (typo, wrong import).
Fix: Always run the test immediately and verify the failure message:
# After writing test
cargo test test_divide_returns_quotient
# Expected: Test failed (function not implemented)
# If: Test passed → test is useless
# If: Test failed (wrong reason) → fix test first
Advanced RED Techniques
Outside-In TDD
Start with high-level behavior, let tests drive lower-level design:
// Minute 0:00 - High-level test
#[tokio::test]
async fn test_api_returns_user_profile() {
let api = UserAPI::new();
let result = api.get_profile("user123").await;
assert!(result.is_ok());
let profile = result.unwrap();
assert_eq!(profile.username, "alice");
}
This test will drive the creation of:
UserAPI
structget_profile
methodProfile
struct- Database layer (in later cycles)
Property-Based Testing Hint
For complex logic, use RED to specify properties:
// Standard example-based test
#[tokio::test]
async fn test_sort_orders_numbers() {
let input = vec![3, 1, 4, 1, 5];
let result = sort(input).await;
assert_eq!(result, vec![1, 1, 3, 4, 5]);
}
// Property-based test (RED phase)
#[tokio::test]
async fn test_sort_maintains_length() {
use proptest::prelude::*;
proptest!(|(numbers: Vec<i32>)| {
let sorted = sort(numbers.clone()).await;
prop_assert_eq!(sorted.len(), numbers.len());
});
}
Property tests specify invariants rather than specific examples.
Test-Driven Error Messages
Write the test with the error message you want users to see:
#[tokio::test]
async fn test_divide_provides_helpful_error_message() {
let handler = DivideHandler;
let input = DivideInput {
numerator: 10.0,
denominator: 0.0,
};
let result = handler.handle(input).await;
assert!(result.is_err());
let error = result.unwrap_err();
let message = format!("{}", error);
// Specify the exact error message you want
assert!(message.contains("Division by zero"));
assert!(message.contains("denominator must be non-zero"));
}
This drives you to write good error messages, not generic “An error occurred.”
Integration with pforge Watch Mode
Run tests continuously during RED phase:
# Terminal 1: Start watch mode
cargo watch -x 'test test_divide_returns_quotient --lib'
# Terminal 2: Edit test
vim crates/pforge-runtime/tests/unit/calculator_test.rs
Watch mode gives instant feedback. Save the file, see the failure, confirm it’s RED for the right reason.
RED Phase Workflow Summary
- Start timer (5-minute cycle begins)
- Open test file (under 10 seconds)
- Copy test template (under 20 seconds)
- Fill in specifics (under 60 seconds)
- Run test (under 10 seconds)
- Verify failure (under 20 seconds)
- Total: ~2 minutes
With practice, you’ll complete RED in 90 seconds consistently, giving extra time for GREEN and REFACTOR.
Example: RED Phase Executed Correctly
Let’s implement a clamp
function that constrains a value between min and max.
Minute 0:00 - Start Timer
termdown 5m &
vim crates/pforge-runtime/src/lib.rs
Minute 0:10 - Decide on Test
Feature: Clamp function for numbers Test: Value below min returns min
Minute 0:20 - Open Test File
vim crates/pforge-runtime/tests/unit/math_test.rs
Minute 0:30 - Write Test
#[test]
fn test_clamp_returns_min_when_below_range() {
let result = clamp(5, 10, 20);
assert_eq!(result, 10);
}
Minute 0:50 - Run Test
cargo test test_clamp_returns_min_when_below_range
Output:
error: cannot find function `clamp` in this scope
Minute 1:00 - Verify RED
Perfect! Test fails because function doesn’t exist. This is the right failure.
Minute 1:10 - Document in Test
#[test]
fn test_clamp_returns_min_when_below_range() {
// clamp(value, min, max) constrains value to [min, max]
let result = clamp(5, 10, 20);
assert_eq!(result, 10);
}
Minute 2:00 - RED Phase Complete
We have:
- ✅ Test written
- ✅ Test fails for right reason
- ✅ Behavior clearly specified
- ✅ Under 2-minute budget
Time to move to GREEN.
When RED Takes Longer Than 2 Minutes
If you hit 2:00 and the test isn’t ready, you have two options:
Option 1: Finish Quickly (If < 30 Seconds Remaining)
If you’re truly close (just need to add assertions), finish quickly:
// 1:50 elapsed, just need to add:
assert_eq!(result.value, expected);
// Total: 2:05 - acceptable
Minor overruns (< 15 seconds) are acceptable if test is complete and verified RED.
Option 2: RESET (If Significantly Over)
If you’re at 2:30 and still writing the test, RESET:
git checkout .
Reflect: Why did RED take so long?
- Test setup too complex → Need helper function
- Testing too much → Break into smaller tests
- Unclear what to test → Spend 1 minute planning before next cycle
RED Phase Success Metrics
Track these metrics to improve:
Time to RED: Average time to write failing test
- Target: < 2:00
- Excellent: < 1:30
- Expert: < 1:00
RED Failure Rate: Tests that fail for wrong reason
- Target: < 10%
- Excellent: < 5%
- Expert: < 1%
RED Rewrites: Tests rewritten during same cycle
- Target: < 20%
- Excellent: < 10%
- Expert: < 5%
Psychological Benefits of RED First
Confidence: You know what you’re building before you start.
Clarity: The test clarifies vague requirements into concrete behavior.
Progress: Each RED test is a small, achievable goal.
Safety Net: Tests catch regressions as you refactor later.
Documentation: Future developers understand intent from tests.
Next Phase: GREEN
You’ve written a failing test that specifies behavior. Now it’s time to make it pass with the minimum code necessary.
The GREEN phase has one goal: get from RED to GREEN as fast as possible, even if the implementation is ugly. We’ll clean it up in REFACTOR.
Previous: The 5-Minute TDD Cycle Next: GREEN: Minimum Code