Quality Assurance with AI

AI assistants generate code quickly, but speed without quality creates technical debt. This chapter covers quality assurance patterns for AI-assisted MCP development, ensuring generated code meets production standards.

The Quality Assurance Stack

┌─────────────────────────────────────────────────────────────────────────┐
│                    Quality Assurance Layers                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ Layer 1: Compile-Time Safety                                    │    │
│  │                                                                 │    │
│  │  cargo build   →  Type errors, borrow issues, missing imports   │    │
│  │                   AI iterates until compilation succeeds        │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ Layer 2: Static Analysis                                        │    │
│  │                                                                 │    │
│  │  cargo clippy  →  Code smells, inefficiencies, patterns         │    │
│  │                   AI fixes warnings to meet zero-warning goal   │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ Layer 3: Unit Testing                                           │    │
│  │                                                                 │    │
│  │  cargo test    →  Handler logic, edge cases, error paths        │    │
│  │                   AI writes tests covering success and failure  │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ Layer 4: Integration Testing                                    │    │
│  │                                                                 │    │
│  │  cargo pmcp test →  Full server behavior, MCP protocol          │    │
│  │                      AI generates and runs scenarios            │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ Layer 5: Code Review                                            │    │
│  │                                                                 │    │
│  │  Human review  →  Logic correctness, security, maintainability  │    │
│  │                   You verify AI's work before deployment        │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Layer 1: Compile-Time Safety

The Compilation Loop

AI generates code → Compiler checks → Errors found → AI fixes → Repeat

User: Create a weather tool

AI: [Generates code]
$ cargo build

error[E0308]: mismatched types
  --> src/tools/weather.rs:25:12
   |
25 |     return temperature;
   |            ^^^^^^^^^^^ expected `WeatherOutput`, found `f64`

AI: I see the issue - I need to wrap the result.
[Fixes code]

$ cargo build
   Compiling mcp-weather-core v1.0.0
    Finished dev [unoptimized + debuginfo]

Requesting Compilation Checks

After any code change:

Run cargo build and fix any errors

AI will:

  1. Run the build
  2. Parse error messages
  3. Apply targeted fixes
  4. Repeat until success

Common Compilation Errors AI Handles

Error Type      | AI Fix
----------------|---------------------------------------
Type mismatch   | Wrap in correct type, add conversion
Missing import  | Add use statement
Borrow issue    | Clone, use reference, restructure
Lifetime error  | Add annotation, restructure ownership
Missing trait   | Add derive macro, implement trait
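
For example, the type-mismatch error from the transcript above is typically fixed by wrapping the raw value in the declared output type. A minimal sketch, assuming a hypothetical WeatherOutput struct:

// Hypothetical output type; real field names depend on your tool schema.
struct WeatherOutput {
    city: String,
    temperature_celsius: f64,
}

fn build_output(city: String, temperature: f64) -> WeatherOutput {
    // Before the fix, `return temperature;` failed because the signature
    // promises a WeatherOutput, not a bare f64. Wrap the value instead.
    WeatherOutput {
        city,
        temperature_celsius: temperature,
    }
}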

Layer 2: Static Analysis (Clippy)

Running Clippy

Run cargo clippy and fix all warnings

AI executes:

cargo clippy -- -D warnings

Common Clippy Fixes

Redundant Clone:

// Before: warning: redundant clone
let city = input.city.clone();
process(city);

// After: AI fix
let city = input.city;
process(city);

Unnecessary Collect:

// Before: warning: avoid using `collect()` followed by `into_iter()`
let items: Vec<_> = data.iter().collect();
for item in items.into_iter() { ... }

// After: AI fix
for item in &data { ... }

Complex Match:

// Before: warning: this match could be replaced
match result {
    Some(x) => x,
    None => return Err(Error::validation("not found")),
}

// After: AI fix
result.ok_or_else(|| Error::validation("not found"))?

Strict Clippy Configuration

For maximum quality, enable pedantic lints:

# Cargo.toml
[lints.clippy]
all = "warn"
pedantic = "warn"
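
If you later allow specific pedantic lints, note that Cargo resolves conflicts by priority: group lints need a lower priority than individual overrides. A sketch, using the real module_name_repetitions lint as the example override:

# Cargo.toml
[lints.clippy]
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"  # individual override wins over the group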

Layer 3: Unit Testing

Requesting Tests

Add unit tests for the get_weather handler covering:
1. Valid city returns weather data
2. Empty city returns validation error
3. Whitespace-only city returns validation error
4. Very long city name returns validation error

Generated Test Structure

#[cfg(test)]
mod tests {
    use super::*;

    fn create_test_input(city: &str) -> WeatherInput {
        WeatherInput {
            city: city.to_string(),
            days: None,
        }
    }

    #[tokio::test]
    async fn test_valid_city() {
        let input = create_test_input("London");
        let result = handler(input, RequestHandlerExtra::default()).await;

        assert!(result.is_ok());
        let output = result.unwrap();
        assert_eq!(output.city, "London");
        assert!(output.temperature_celsius > -100.0);
        assert!(output.temperature_celsius < 100.0);
    }

    #[tokio::test]
    async fn test_empty_city() {
        let input = create_test_input("");
        let result = handler(input, RequestHandlerExtra::default()).await;

        assert!(result.is_err());
        let err = result.unwrap_err();
        assert!(err.to_string().contains("cannot be empty"));
    }

    #[tokio::test]
    async fn test_whitespace_city() {
        let input = create_test_input("   ");
        let result = handler(input, RequestHandlerExtra::default()).await;

        assert!(result.is_err());
    }

    #[tokio::test]
    async fn test_very_long_city() {
        let long_city = "a".repeat(1000);
        let input = create_test_input(&long_city);
        let result = handler(input, RequestHandlerExtra::default()).await;

        assert!(result.is_err());
        let err = result.unwrap_err();
        assert!(err.to_string().contains("too long"));
    }
}

Test Coverage Goals

Request specific coverage; a sketch of the Unicode case follows the prompt:

Ensure the weather tool has:
- At least one test per error case
- Tests for boundary conditions (0, max, edge values)
- Tests with Unicode input
- Tests for optional parameter handling
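
A sketch of the Unicode case, written to slot into the tests module shown earlier (it reuses the create_test_input helper and assumes the handler accepts non-ASCII city names):

#[tokio::test]
async fn test_unicode_city() {
    // "東京" exercises multi-byte UTF-8; any length validation must not
    // assume one byte per character.
    let input = create_test_input("東京");
    let result = handler(input, RequestHandlerExtra::default()).await;

    assert!(result.is_ok());
}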

Running Tests

cargo test

With output:

running 4 tests
test tools::weather::tests::test_valid_city ... ok
test tools::weather::tests::test_empty_city ... ok
test tools::weather::tests::test_whitespace_city ... ok
test tools::weather::tests::test_very_long_city ... ok

test result: ok. 4 passed; 0 failed; 0 ignored

Layer 4: Integration Testing

Generating Scenarios

Generate integration test scenarios for the weather server

AI executes:

cargo pmcp test --server weather --generate-scenarios

Scenario Structure

Generated scenarios/weather/generated.yaml:

name: "Weather Server Integration Tests"
description: "End-to-end tests for weather MCP server"
timeout: 60
stop_on_failure: false

steps:
  - name: "List available tools"
    operation:
      type: list_tools
    assertions:
      - type: success
      - type: contains
        path: "tools"
        value: "get-current-weather"

  - name: "Get weather for valid city"
    operation:
      type: tool_call
      tool: "get-current-weather"
      arguments:
        city: "London"
    assertions:
      - type: success
      - type: field_exists
        path: "content.0.text"

  - name: "Get weather for invalid city"
    operation:
      type: tool_call
      tool: "get-current-weather"
      arguments:
        city: ""
    assertions:
      - type: error
      - type: contains
        path: "error.message"
        value: "cannot be empty"

Running Integration Tests

# Terminal 1: Start dev server
cargo pmcp dev --server weather

# Terminal 2: Run tests
cargo pmcp test --server weather

Output:

Running scenarios for weather server...

Scenario: Weather Server Integration Tests
  ✓ List available tools (12ms)
  ✓ Get weather for valid city (45ms)
  ✓ Get weather for invalid city (8ms)

Results: 3 passed, 0 failed

Custom Scenarios

Add edge cases to generated scenarios:

  - name: "Test with Unicode city name"
    operation:
      type: tool_call
      tool: "get-current-weather"
      arguments:
        city: "東京"
    assertions:
      - type: success

  - name: "Test with special characters"
    operation:
      type: tool_call
      tool: "get-current-weather"
      arguments:
        city: "São Paulo"
    assertions:
      - type: success

  - name: "Test concurrent requests"
    parallel:
      - operation:
          type: tool_call
          tool: "get-current-weather"
          arguments:
            city: "London"
      - operation:
          type: tool_call
          tool: "get-current-weather"
          arguments:
            city: "Paris"
    assertions:
      - type: all_success

Layer 5: Code Review

Review Checklist

When reviewing AI-generated code:

1. Logic Correctness

  • Does the code do what was requested?
  • Are edge cases handled?
  • Is error handling appropriate?

2. Security

  • Input validation present?
  • No injection vulnerabilities?
  • Sensitive data protected?

3. Performance

  • No unnecessary allocations?
  • Appropriate data structures?
  • No blocking in async code? (see the sketch after this checklist)

4. Maintainability

  • Clear variable names?
  • Appropriate comments?
  • Follows project conventions?
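
The most common performance finding is blocking I/O inside an async handler. A minimal sketch of the usual fix, assuming a Tokio runtime (the load_report function is illustrative):

use tokio::task;

async fn load_report(path: String) -> Result<String, std::io::Error> {
    // std::fs::read_to_string would block the async executor thread;
    // spawn_blocking moves the call onto Tokio's blocking thread pool.
    task::spawn_blocking(move || std::fs::read_to_string(path))
        .await
        .expect("blocking task panicked")
}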

Requesting Explanations

Ask AI to explain generated code:

Explain the error handling in the get_weather handler.
Why did you choose to return validation errors for these cases?

AI explains:

The handler uses two types of errors:

1. Error::validation() for user-fixable issues:
   - Empty city name (user should provide a valid city)
   - City name too long (user should shorten input)

2. Error::internal() for system issues:
   - API connection failure (not user's fault)
   - Parse failure (indicates API change)

This distinction helps AI clients understand whether to retry
with different input or report a server problem.
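
In code, that distinction might look like the following sketch. The stand-in Error enum only mirrors the validation/internal split discussed above; the real pmcp error type differs:

#[derive(Debug)]
enum Error {
    Validation(String),
    Internal(String),
}

impl Error {
    fn validation(msg: &str) -> Self {
        Error::Validation(msg.to_string())
    }
    fn internal(msg: String) -> Self {
        Error::Internal(msg)
    }
}

// Stubbed network call so the sketch is self-contained.
async fn fetch_from_api(_city: &str) -> Result<String, std::io::Error> {
    Ok("21.5°C, partly cloudy".to_string())
}

async fn get_weather(city: &str) -> Result<String, Error> {
    if city.trim().is_empty() {
        // User-fixable: the client should retry with different input.
        return Err(Error::validation("City name cannot be empty"));
    }
    fetch_from_api(city)
        .await
        // System issue: not the caller's fault, so report rather than retry.
        .map_err(|e| Error::internal(format!("Weather API unreachable: {e}")))
}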

Security Review Prompts

Review the database query tool for SQL injection vulnerabilities.
Are there any cases where user input reaches the query unsanitized?

AI responds:

I've reviewed the query handling. The current implementation uses
parameterized queries via sqlx, which prevents SQL injection:

let results = sqlx::query_as!(
    Record,
    "SELECT * FROM users WHERE name = $1",
    input.name  // Parameterized, not interpolated
)
.fetch_all(&pool)  // pool: the server's database connection pool
.await?;

No string concatenation with user input. Safe.

Quality Gate Automation

Pre-Commit Hook

Create .git/hooks/pre-commit and make it executable (chmod +x .git/hooks/pre-commit):

#!/bin/bash
set -e

echo "Running quality gates..."

# Format check
cargo fmt --check || {
    echo "❌ Run 'cargo fmt' to fix formatting"
    exit 1
}

# Clippy
cargo clippy -- -D warnings || {
    echo "❌ Fix clippy warnings"
    exit 1
}

# Unit tests
cargo test || {
    echo "❌ Fix failing tests"
    exit 1
}

echo "✅ All quality gates passed"

Makefile Integration

.PHONY: quality-gate test lint fmt

quality-gate: fmt lint test
	@echo "✅ All quality gates passed"

fmt:
	cargo fmt --check

lint:
	cargo clippy -- -D warnings

test:
	cargo test

integration-test:
	cargo pmcp test --server $(SERVER)

Usage:

make quality-gate

CI/CD Pipeline

# .github/workflows/quality.yml
name: Quality Gates

on: [push, pull_request]

jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable

      - name: Format Check
        run: cargo fmt --check

      - name: Clippy
        run: cargo clippy -- -D warnings

      - name: Unit Tests
        run: cargo test

      - name: Build Release
        run: cargo build --release

Common Quality Issues and Fixes

Issue: Unwrap in Production Code

Detection: Clippy warning or code review

Request:

Replace all unwrap() calls with proper error handling using ? or ok_or_else

Before:

let value = map.get("key").unwrap();

After:

let value = map.get("key")
    .ok_or_else(|| Error::internal("Missing required key"))?;

Issue: Missing Input Validation

Detection: Code review or integration test failure

Request:

Add input validation for the create_user tool:
- username: 3-50 chars, alphanumeric only
- email: valid email format
- age: 13-120 (if provided)
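
A sketch of validation code the AI might generate for this request. CreateUserInput and its fields are hypothetical, and the email check is deliberately shallow (full RFC-compliant parsing needs a dedicated crate):

struct CreateUserInput {
    username: String,
    email: String,
    age: Option<u8>,
}

fn validate(input: &CreateUserInput) -> Result<(), String> {
    let len = input.username.chars().count();
    if !(3..=50).contains(&len) {
        return Err("username must be 3-50 characters".to_string());
    }
    if !input.username.chars().all(|c| c.is_ascii_alphanumeric()) {
        return Err("username must contain only alphanumeric characters".to_string());
    }
    // Cheap structural check only; not a substitute for real email parsing.
    if !input.email.contains('@') || input.email.starts_with('@') {
        return Err("email must be a valid address".to_string());
    }
    if let Some(age) = input.age {
        if !(13..=120).contains(&age) {
            return Err("age must be between 13 and 120".to_string());
        }
    }
    Ok(())
}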

Issue: Incomplete Error Messages

Detection: Code review

Request:

Improve error messages in the file upload tool.
Each error should explain:
1. What went wrong
2. What the constraint is
3. How to fix it

Example:
Bad: "File too large"
Good: "File size 15MB exceeds maximum of 10MB. Reduce file size or split into parts."

Issue: Missing Tests

Detection: Coverage analysis

Request:

Add tests for these uncovered cases in the payment tool:
1. Amount of exactly 0.00
2. Negative amount
3. Amount with too many decimal places
4. Currency code not in allowed list
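
A self-contained sketch covering the first two cases, with a stubbed process_payment standing in for the real handler:

async fn process_payment(currency: &str, amount: f64) -> Result<(), String> {
    // Stub: the real handler would also check decimals and currency codes.
    if amount <= 0.0 {
        return Err(format!("Amount {amount} {currency} must be positive"));
    }
    Ok(())
}

#[tokio::test]
async fn test_zero_amount_rejected() {
    // Boundary case: exactly 0.00 is not a chargeable amount.
    assert!(process_payment("USD", 0.00).await.is_err());
}

#[tokio::test]
async fn test_negative_amount_rejected() {
    assert!(process_payment("USD", -5.00).await.is_err());
}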

Continuous Quality Improvement

Regular Audits

Periodically request:

Review the entire MCP server for:
1. Deprecated patterns
2. Unused code
3. Potential performance issues
4. Security vulnerabilities

Suggest improvements.

Dependency Updates

Check for outdated dependencies and suggest updates.
Ensure compatibility with latest pmcp version.
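
Two commands support this workflow: cargo update ships with Cargo, while cargo outdated comes from the third-party cargo-outdated crate (installed with cargo install cargo-outdated):

# Update dependencies within their semver-compatible ranges
cargo update

# Report dependencies with newer versions available
cargo outdated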

Documentation Verification

Verify all public APIs have:
1. Doc comments with descriptions
2. Example usage in doc tests
3. Parameter documentation
4. Return value documentation
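
A sketch of what that looks like for one public function. The function is illustrative, and the crate path in the doc test is a placeholder:

/// Converts a temperature from Celsius to Fahrenheit.
///
/// # Arguments
/// * `celsius` - temperature in degrees Celsius
///
/// # Returns
/// The equivalent temperature in degrees Fahrenheit.
///
/// # Examples
/// ```
/// # use my_crate::to_fahrenheit; // placeholder crate name
/// assert_eq!(to_fahrenheit(0.0), 32.0);
/// ```
pub fn to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}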

Summary

Quality assurance with AI follows a layered approach:

Layer | Tool            | What It Catches       | AI Role
------|-----------------|-----------------------|---------------------
1     | cargo build     | Type errors, syntax   | Fixes automatically
2     | cargo clippy    | Code smells, patterns | Fixes warnings
3     | cargo test      | Logic errors          | Writes tests
4     | cargo pmcp test | Integration issues    | Generates scenarios
5     | Human review    | Design flaws          | Explains, justifies

Key practices:

  1. Run all gates after every change - Don't accumulate issues
  2. Treat warnings as errors - cargo clippy -- -D warnings
  3. Generate tests automatically - --generate-scenarios
  4. Review AI output - Understand what was generated
  5. Automate with hooks - Pre-commit catches issues early

The combination of Rust's compile-time safety, cargo-pmcp's test generation, and AI's ability to iterate creates a rapid development cycle without sacrificing quality.


Return to Effective AI Collaboration | Part VII: Observability →