Metrics Collection
Metrics transform operations from reactive firefighting to proactive monitoring. This section covers Rust's metrics ecosystem, PMCP's built-in metrics middleware, and integration with popular observability platforms.
What are Metrics?
If you're new to production metrics, think of them as the vital signs of your application. Just as a doctor monitors heart rate, blood pressure, and temperature to assess health, metrics give you numbers that indicate whether your system is healthy.
┌─────────────────────────────────────────────────────────────────────────┐
│ Metrics vs Logs: When to Use Each │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ LOGS answer: "What happened?" │
│ METRICS answer: "How much/how fast/how many?" │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Scenario: Your MCP server is "slow" │
│ │
│ Logs tell you: Metrics tell you: │
│ ═══════════════ ═════════════════ │
│ │
│ "Request abc-123 took 5000ms" Requests/second: 150 │
│ "Request def-456 took 3200ms" P50 latency: 45ms │
│ "Request ghi-789 took 4800ms" P95 latency: 250ms │
│ "Request jkl-012 took 50ms" P99 latency: 4,800ms ← Problem! │
│ ... (thousands more) Error rate: 0.5% │
│ │
│ To find the problem in logs: To find the problem in metrics: │
│ • Search through thousands • Glance at dashboard │
│ • Calculate averages manually • See P99 spike immediately │
│ • Hard to spot patterns • Correlate with time │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Use LOGS when you need: Use METRICS when you need: │
│ • Full context of an event • Trends over time │
│ • Debugging specific issues • Alerting on thresholds │
│ • Audit trails • Capacity planning │
│ • Error messages • SLA monitoring │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Why Metrics Matter
| Without Metrics | With Metrics |
|---|---|
| "Users say it's slow" | "P95 latency increased from 100ms to 500ms at 2:30 PM" |
| "Something is wrong" | "Error rate jumped from 0.1% to 5% after the last deployment" |
| "We need more capacity" | "At current growth rate, we'll hit capacity limits in 3 weeks" |
| "Is the fix working?" | "Error rate dropped from 5% to 0.2% after the hotfix" |
The Three Types of Metrics
Before diving into code, let's understand the three fundamental metric types. Each serves a different purpose:
┌─────────────────────────────────────────────────────────────────────────┐
│ The Three Metric Types │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ COUNTER │
│ ═══════ │
│ "How many times did X happen?" │
│ │
│ • Only goes UP (or resets to 0) │
│ • Like an odometer in a car │
│ │
│ Examples: ┌─────────────────────────┐ │
│ • Total requests served │ requests_total │ │
│ • Total errors │ ████████████████ 1,523 │ │
│ • Total bytes transferred │ │ │
│ │ errors_total │ │
│ Use when: You want to count │ ██ 47 │ │
│ events that accumulate └─────────────────────────┘ │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ GAUGE │
│ ═════ │
│ "What is the current value of X?" │
│ │
│ • Can go UP and DOWN │
│ • Like a thermometer or fuel gauge │
│ │
│ Examples: ┌─────────────────────────┐ │
│ • Active connections │ connections_active │ │
│ • Queue depth │ ████████░░░░ 42 │ │
│ • Memory usage │ │ │
│ • Temperature │ (can increase/decrease) │ │
│ └─────────────────────────┘ │
│ Use when: You want to track │
│ current state that fluctuates │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ HISTOGRAM │
│ ═════════ │
│ "What is the distribution of X?" │
│ │
│ • Records many values, calculates percentiles │
│ • Like tracking all marathon finish times, not just the average │
│ │
│ Examples: ┌─────────────────────────┐ │
│ • Request latency │ request_duration_ms │ │
│ • Response size │ │ │
│ • Query execution time │ ▂▅█▇▄▂▁ │ │
│ │ 10 50 100 200 500 ms │ │
│ Use when: You need percentiles │ │ │
│ (P50, P95, P99) not just averages │ P50: 45ms P99: 450ms │ │
│ └─────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
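In code, each type maps to one macro from the `metrics` crate introduced later in this chapter. A minimal sketch (metric names are illustrative; without an installed exporter these calls are no-ops):

```rust
use metrics::{counter, gauge, histogram};

fn record_example_metrics(duration_ms: f64, connections: f64) {
    // Counter: only goes up - count an event that accumulates
    counter!("requests_total").increment(1);

    // Gauge: current state that can rise and fall
    gauge!("connections_active").set(connections);

    // Histogram: record individual observations so the backend
    // can compute percentiles (P50, P95, P99) over the distribution
    histogram!("request_duration_ms").record(duration_ms);
}
```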
Understanding Percentiles
Percentiles are crucial for understanding real user experience. Here's why averages can be misleading:
┌─────────────────────────────────────────────────────────────────────────┐
│ Why Percentiles Matter │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Scenario: 100 requests with these latencies: │
│ │
│ • 90 requests: 50ms each │
│ • 9 requests: 100ms each │
│ • 1 request: 5,000ms (timeout!) │
│ │
│ Average = (90×50 + 9×100 + 1×5000) / 100 = 104ms ← "Looks fine!" │
│ │
│ But look at percentiles: │
│ • P50 (median) = 50ms ← Half of users see 50ms or less │
│ • P90 = 50ms ← 90% of users see 50ms or less │
│ • P95 = 100ms ← 95% of users see 100ms or less │
│ • P99 = 5,000ms ← 1% of users wait 5 SECONDS! 🚨 │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Which percentile to monitor? │
│ │
│ • P50 (median): Typical user experience │
│ • P95: Most users' worst-case experience │
│ • P99: Your "long tail" - affects 1 in 100 users │
│ • P99.9: For high-traffic sites (1 in 1000 users) │
│ │
│ If you have 1 million requests/day: │
│ • P99 = 10,000 users having a bad experience daily │
│ • P99.9 = 1,000 users having a bad experience daily │
│ │
│ Rule of thumb: Alert on P95 or P99, not averages │
│ │
└─────────────────────────────────────────────────────────────────────────┘
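The arithmetic in the box is easy to verify. A self-contained sketch that reproduces the scenario and shows how the average hides the tail that percentiles expose:

```rust
fn main() {
    // The scenario above: 90 requests at 50 ms, 9 at 100 ms, 1 timeout at 5,000 ms
    let mut latencies_ms: Vec<f64> = Vec::new();
    latencies_ms.extend(std::iter::repeat(50.0).take(90));
    latencies_ms.extend(std::iter::repeat(100.0).take(9));
    latencies_ms.push(5000.0);
    latencies_ms.sort_by(|a, b| a.partial_cmp(b).unwrap());

    // The average looks healthy...
    let average: f64 = latencies_ms.iter().sum::<f64>() / latencies_ms.len() as f64;
    println!("average = {average} ms"); // 104 ms - "looks fine!"

    // ...but the median and the slowest 1% tell the real story
    let median = latencies_ms[latencies_ms.len() / 2];
    let slowest_1_percent = &latencies_ms[latencies_ms.len() - latencies_ms.len() / 100..];
    println!("median (P50) = {median} ms");           // 50 ms
    println!("slowest 1% = {slowest_1_percent:?} ms"); // [5000.0] - the P99 tail
}
```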
The Metrics Ecosystem
Rust's `metrics` crate provides a facade pattern similar to the `log` crate for logging: you write metrics once and choose the backend at runtime:
┌─────────────────────────────────────────────────────────────────────────┐
│ Metrics Architecture │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Application Code │
│ ════════════════ │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ counter!("requests_total").increment(1); │ │
│ │ histogram!("request_duration_ms").record(45.5); │ │
│ │ gauge!("active_connections").set(12); │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ metrics (facade crate) │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────┼─────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Prometheus │ │ Datadog │ │ CloudWatch │ │
│ │ Exporter │ │ Agent │ │ Agent │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Prometheus │ │ Datadog │ │ AWS │ │
│ │ Server │ │ Cloud │ │ CloudWatch │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Metric Types
| Type | Purpose | Example |
|---|---|---|
| Counter | Monotonically increasing count | Total requests, errors |
| Gauge | Value that can go up or down | Active connections, queue depth |
| Histogram | Distribution of values | Request duration, response size |
```rust
use metrics::{counter, gauge, histogram};
use std::time::Instant;

async fn handler(input: Input) -> Result<Output> {
    let start = Instant::now();

    // Count the request
    counter!("mcp.requests_total", "tool" => "get-weather").increment(1);

    // Track active requests
    gauge!("mcp.requests_active").increment(1.0);

    let result = process(input).await;

    // Record duration
    histogram!("mcp.request_duration_ms", "tool" => "get-weather")
        .record(start.elapsed().as_millis() as f64);

    // Track active requests
    gauge!("mcp.requests_active").decrement(1.0);

    // Count success/failure
    match &result {
        Ok(_) => counter!("mcp.requests_success").increment(1),
        Err(_) => counter!("mcp.requests_error").increment(1),
    }

    result
}
```
PMCP's Built-in Observability Metrics
PMCP v1.9.2+ includes a built-in observability module that automatically collects metrics without requiring manual middleware setup:
```rust
use pmcp::server::builder::ServerCoreBuilder;
use pmcp::server::observability::ObservabilityConfig;

// One line enables automatic metrics collection
let server = ServerCoreBuilder::new()
    .name("my-server")
    .version("1.0.0")
    .tool("weather", WeatherTool)
    .with_observability(ObservabilityConfig::development())
    .build()?;
```
Standard Metrics (Built-in)
The built-in observability automatically emits these metrics:
| Metric | Type | Description |
|---|---|---|
| `mcp.request.duration` | Histogram (ms) | Request latency per tool |
| `mcp.request.count` | Counter | Total requests processed |
| `mcp.request.errors` | Counter | Error count by type |
| `mcp.response.size` | Histogram (bytes) | Response payload sizes |
| `mcp.composition.depth` | Gauge | Nesting depth for composed servers |
For CloudWatch deployments, these are emitted as EMF (Embedded Metric Format) and automatically extracted as CloudWatch metrics under the configured namespace.
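To make "emitted as EMF" concrete, a single EMF record is a structured log line with an `_aws` envelope that tells CloudWatch which fields to extract as metrics. The sketch below (using `serde_json`) shows the general shape of the format only; the exact fields and dimensions PMCP emits are not reproduced here and the names outside the `_aws` envelope are assumptions:

```rust
use serde_json::json;

// Illustrative only: roughly what one EMF log line looks like.
// PMCP's observability module produces these for you.
fn emf_line(namespace: &str, service: &str, duration_ms: f64) -> String {
    json!({
        "_aws": {
            "Timestamp": 1_700_000_000_000u64,
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["Service"]],
                "Metrics": [{ "Name": "mcp.request.duration", "Unit": "Milliseconds" }]
            }]
        },
        // Dimension values and metric values live at the top level of the record
        "Service": service,
        "mcp.request.duration": duration_ms
    })
    .to_string()
}
```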
Custom MetricsMiddleware (Advanced)
For custom metric backends (Prometheus, Datadog, etc.), you can still use the MetricsMiddleware directly:
```rust
use pmcp::shared::MetricsMiddleware;
use pmcp::shared::EnhancedMiddlewareChain;
use std::sync::Arc;

fn build_instrumented_chain() -> EnhancedMiddlewareChain {
    let mut chain = EnhancedMiddlewareChain::new();

    // Add metrics collection
    chain.add(Arc::new(MetricsMiddleware::new("my-server".to_string())));

    chain
}
```
Recorded Metrics (Custom MetricsMiddleware)
The MetricsMiddleware automatically records:
| Metric | Type | Labels | Description |
|---|---|---|---|
| `mcp.requests.total` | Counter | service, method | Total requests processed |
| `mcp.requests.duration_ms` | Histogram | service, method | Request latency |
| `mcp.requests.errors` | Counter | service, error_type | Error count by type |
| `mcp.requests.active` | Gauge | service | In-flight requests |
Custom Metrics in Handlers
Add tool-specific metrics directly in handlers:
```rust
use metrics::{counter, histogram};
use std::time::Instant;

async fn handler(input: WeatherInput) -> Result<Weather> {
    let start = Instant::now();

    // Business metrics
    counter!(
        "weather.lookups_total",
        "city" => input.city.clone(),
        "units" => input.units.as_str()
    ).increment(1);

    let weather = match cache.get(&input.city) {
        Some(cached) => {
            counter!("weather.cache_hits").increment(1);
            cached
        }
        None => {
            counter!("weather.cache_misses").increment(1);
            let result = fetch_weather(&input.city).await?;
            histogram!("weather.api_latency_ms")
                .record(start.elapsed().as_millis() as f64);
            result
        }
    };

    // Track temperature extremes
    if weather.temperature > 40.0 {
        counter!("weather.extreme_heat_events").increment(1);
    }

    Ok(weather)
}
```
Platform Integration
Prometheus
Prometheus is the industry standard for cloud-native metrics:
```toml
# Cargo.toml
[dependencies]
metrics = "0.23"
metrics-exporter-prometheus = "0.15"
tokio = { version = "1", features = ["full"] }
```

```rust
// main.rs
use metrics_exporter_prometheus::PrometheusBuilder;

fn init_metrics() {
    // Start Prometheus exporter on port 9090
    PrometheusBuilder::new()
        .with_http_listener(([0, 0, 0, 0], 9090))
        .install()
        .expect("Failed to install Prometheus exporter");
}

#[tokio::main]
async fn main() {
    init_metrics();

    // Metrics now available at http://localhost:9090/metrics
    run_server().await;
}
```
Prometheus output format:
# HELP mcp_requests_total Total MCP requests
# TYPE mcp_requests_total counter
mcp_requests_total{service="weather-server",method="get-weather"} 1523
# HELP mcp_request_duration_ms Request latency in milliseconds
# TYPE mcp_request_duration_ms histogram
mcp_request_duration_ms_bucket{service="weather-server",le="10"} 450
mcp_request_duration_ms_bucket{service="weather-server",le="50"} 1200
mcp_request_duration_ms_bucket{service="weather-server",le="100"} 1500
mcp_request_duration_ms_bucket{service="weather-server",le="+Inf"} 1523
mcp_request_duration_ms_sum{service="weather-server"} 45678.5
mcp_request_duration_ms_count{service="weather-server"} 1523
Datadog
Datadog integration via StatsD or direct API:
```toml
# Cargo.toml
[dependencies]
metrics = "0.23"
metrics-exporter-statsd = "0.7"
```

```rust
// Using StatsD (Datadog agent listens on port 8125)
use metrics_exporter_statsd::StatsdBuilder;

fn init_metrics() {
    StatsdBuilder::from("127.0.0.1", 8125)
        .with_queue_size(5000)
        .with_buffer_size(1024)
        .install()
        .expect("Failed to install StatsD exporter");
}
```
Datadog tags:
```rust
counter!(
    "mcp.requests",
    "service" => "weather-server",
    "tool" => "get-weather",
    "env" => "production"
).increment(1);

// Becomes: mcp.requests:1|c|#service:weather-server,tool:get-weather,env:production
```
AWS CloudWatch
CloudWatch integration for AWS-hosted servers:
```toml
# Cargo.toml
[dependencies]
metrics = "0.23"
aws-sdk-cloudwatch = "1.0"
tokio = { version = "1", features = ["full"] }
```

```rust
// Custom CloudWatch recorder
use aws_sdk_cloudwatch::{Client, types::MetricDatum, types::StandardUnit};
use metrics::{Counter, Gauge, Histogram, Key, KeyName, Recorder, Unit};
use std::sync::Arc;

struct CloudWatchRecorder {
    client: Client,
    namespace: String,
}

impl CloudWatchRecorder {
    async fn new(namespace: &str) -> Self {
        let config = aws_config::load_defaults(aws_config::BehaviorVersion::latest()).await;
        Self {
            client: Client::new(&config),
            namespace: namespace.to_string(),
        }
    }

    async fn publish_metrics(&self, metrics: Vec<MetricDatum>) {
        self.client
            .put_metric_data()
            .namespace(&self.namespace)
            .set_metric_data(Some(metrics))
            .send()
            .await
            .expect("Failed to publish metrics");
    }
}
```
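To complete the picture, here is a sketch of how a datum passed to `publish_metrics` might be built; the helper, metric name, and dimension name are assumptions for illustration, and a full backend would implement the `metrics::Recorder` trait and flush batches on a timer:

```rust
use aws_sdk_cloudwatch::types::{Dimension, MetricDatum, StandardUnit};

// Hypothetical helper: turn one observation into a datum for publish_metrics()
fn request_duration_datum(service: &str, duration_ms: f64) -> MetricDatum {
    MetricDatum::builder()
        .metric_name("mcp.requests.duration_ms")
        .unit(StandardUnit::Milliseconds)
        .value(duration_ms)
        .dimensions(
            Dimension::builder()
                .name("Service")
                .value(service)
                .build(),
        )
        .build()
}

// Usage (inside an async context):
// recorder.publish_metrics(vec![request_duration_datum("weather-server", 45.5)]).await;
```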
Grafana Cloud / OpenTelemetry
For Grafana Cloud or any OpenTelemetry-compatible backend:
```toml
# Cargo.toml
[dependencies]
opentelemetry = "0.24"
opentelemetry_sdk = "0.24"
opentelemetry-otlp = "0.17"
tracing-opentelemetry = "0.25"
```

```rust
use opentelemetry::global;
use opentelemetry_sdk::metrics::MeterProvider;
use opentelemetry_otlp::WithExportConfig;

fn init_otel_metrics() -> Result<(), Box<dyn std::error::Error>> {
    let exporter = opentelemetry_otlp::new_exporter()
        .tonic()
        .with_endpoint("https://otlp.grafana.net:4317");

    let provider = MeterProvider::builder()
        .with_reader(
            opentelemetry_sdk::metrics::PeriodicReader::builder(
                exporter,
                opentelemetry_sdk::runtime::Tokio,
            )
            .with_interval(std::time::Duration::from_secs(30))
            .build(),
        )
        .build();

    global::set_meter_provider(provider);
    Ok(())
}
```
Multi-Platform Strategy
Design metrics to work across platforms:
┌─────────────────────────────────────────────────────────────────────────┐
│ Multi-Platform Metrics Design │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Application Layer │ │
│ │ │ │
│ │ Use metrics crate with consistent naming: │ │
│ │ • mcp.requests.total │ │
│ │ • mcp.requests.duration_ms │ │
│ │ • mcp.requests.errors │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Platform Adapter │ │
│ │ │ │
│ │ Choose at deployment time via environment/config: │ │
│ │ │ │
│ │ METRICS_BACKEND=prometheus → PrometheusBuilder │ │
│ │ METRICS_BACKEND=datadog → StatsdBuilder │ │
│ │ METRICS_BACKEND=cloudwatch → CloudWatchRecorder │ │
│ │ METRICS_BACKEND=otlp → OpenTelemetry │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Platform Selection at Runtime
```rust
use std::env;

fn init_metrics_backend() {
    let backend = env::var("METRICS_BACKEND")
        .unwrap_or_else(|_| "prometheus".to_string());

    match backend.as_str() {
        "prometheus" => {
            metrics_exporter_prometheus::PrometheusBuilder::new()
                .with_http_listener(([0, 0, 0, 0], 9090))
                .install()
                .expect("Prometheus exporter failed");
        }
        "statsd" | "datadog" => {
            let host = env::var("STATSD_HOST").unwrap_or_else(|_| "127.0.0.1".to_string());
            let port = env::var("STATSD_PORT")
                .unwrap_or_else(|_| "8125".to_string())
                .parse()
                .expect("Invalid STATSD_PORT");

            metrics_exporter_statsd::StatsdBuilder::from(&host, port)
                .install()
                .expect("StatsD exporter failed");
        }
        "none" | "disabled" => {
            // No-op for local development
            tracing::info!("Metrics collection disabled");
        }
        other => {
            panic!("Unknown metrics backend: {}", other);
        }
    }
}
```
Metrics Best Practices
Naming Conventions
```rust
// GOOD: Hierarchical, consistent naming
counter!("mcp.tool.requests_total", "tool" => "weather").increment(1);
histogram!("mcp.tool.duration_ms", "tool" => "weather").record(45.0);
counter!("mcp.tool.errors_total", "tool" => "weather", "error" => "timeout").increment(1);

// BAD: Inconsistent, flat naming
counter!("weather_requests").increment(1);
counter!("weatherToolDurationMs").increment(1);
counter!("errors").increment(1);
```
Cardinality Control
Cardinality refers to the number of unique combinations of label values for a metric. This is one of the most common pitfalls for newcomers to metrics—and it can crash your monitoring system.
┌─────────────────────────────────────────────────────────────────────────┐
│ The Cardinality Problem │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ What happens with high cardinality labels? │
│ ══════════════════════════════════════════ │
│ │
│ Each unique label combination creates a NEW time series in memory: │
│ │
│ counter!("requests", "user_id" => user_id) │
│ │
│ With 1 million users, this creates 1 MILLION time series: │
│ │
│ requests{user_id="user-000001"} = 5 │
│ requests{user_id="user-000002"} = 12 │
│ requests{user_id="user-000003"} = 3 │
│ ... (999,997 more) ... │
│ requests{user_id="user-999999"} = 7 │
│ requests{user_id="user-1000000"} = 1 │
│ │
│ Each time series consumes memory in: │
│ • Your application │
│ • Prometheus/metrics backend │
│ • Grafana/dashboard queries │
│ │
│ Result: Memory exhaustion, slow queries, crashed monitoring │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Good labels (bounded): Bad labels (unbounded): │
│ ══════════════════════ ══════════════════════ │
│ │
│ • tool: 10-50 tools max • user_id: millions of users │
│ • status: success/error • request_id: infinite │
│ • tier: free/pro/enterprise • city: thousands of cities │
│ • environment: dev/staging/prod • email: unbounded │
│ • http_method: GET/POST/PUT/DELETE • timestamp: infinite │
│ │
│ Rule of thumb: Labels should have fewer than 100 possible values │
│ │
└─────────────────────────────────────────────────────────────────────────┘
If you need per-user or per-request data, use logs instead of metrics. Logs are designed for high-cardinality data; metrics are not.
```rust
// BAD: Unbounded cardinality (user_id could be millions)
counter!("requests", "user_id" => user_id).increment(1);

// BAD: High cardinality (city names - thousands of values)
counter!("weather_requests", "city" => &input.city).increment(1);

// GOOD: Bounded cardinality (only 3 possible values)
counter!(
    "requests",
    "user_tier" => user.tier.as_str() // "free", "pro", "enterprise"
).increment(1);

// GOOD: Use histogram for distribution instead of labels
histogram!("request_duration_ms").record(duration);

// GOOD: Log high-cardinality data instead of metrics
tracing::info!(user_id = %user_id, city = %city, "Request processed");
```
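One common mitigation is to collapse an unbounded value into a small allow-list before using it as a label, so the number of time series stays fixed. A sketch (the city list is illustrative):

```rust
use metrics::counter;

/// Collapse an unbounded label value into a small, bounded set.
fn bounded_city_label(city: &str) -> &'static str {
    // Only the handful of cities you explicitly care about get their own
    // series; everything else shares one "other" series.
    match city {
        "London" => "london",
        "Paris" => "paris",
        "Tokyo" => "tokyo",
        _ => "other",
    }
}

// Usage: at most 4 time series, no matter how many cities users request
counter!("weather_requests", "city" => bounded_city_label(&input.city)).increment(1);
```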
Standard Labels
Apply consistent labels across all metrics:
```rust
use metrics::counter;
use std::env;
use std::sync::OnceLock;

struct MetricsContext {
    service: String,
    version: String,
    environment: String,
}

static CONTEXT: OnceLock<MetricsContext> = OnceLock::new();

fn init_context() {
    CONTEXT.get_or_init(|| MetricsContext {
        service: env::var("SERVICE_NAME").unwrap_or_else(|_| "mcp-server".to_string()),
        version: env!("CARGO_PKG_VERSION").to_string(),
        environment: env::var("ENV").unwrap_or_else(|_| "development".to_string()),
    });
}

// Helper for consistent labeling
macro_rules! labeled_counter {
    ($name:expr, $($key:expr => $value:expr),*) => {{
        let ctx = CONTEXT.get().expect("Metrics context not initialized");
        counter!(
            $name,
            "service" => ctx.service.clone(),
            "version" => ctx.version.clone(),
            "env" => ctx.environment.clone(),
            $($key => $value),*
        )
    }};
}

// Usage
labeled_counter!("mcp.requests", "tool" => "weather").increment(1);
```
Dashboard Examples
Key Performance Indicators
```yaml
# Grafana dashboard panels (pseudo-config)
panels:
  - title: "Request Rate"
    query: rate(mcp_requests_total[5m])
    type: graph

  - title: "P95 Latency"
    query: histogram_quantile(0.95, rate(mcp_request_duration_ms_bucket[5m]))
    type: graph

  - title: "Error Rate"
    query: rate(mcp_requests_errors_total[5m]) / rate(mcp_requests_total[5m])
    type: gauge
    thresholds:
      - value: 0.01
        color: yellow
      - value: 0.05
        color: red

  - title: "Active Connections"
    query: mcp_connections_active
    type: stat
```
Alert Rules
```yaml
# Prometheus alerting rules
groups:
  - name: mcp-server
    rules:
      - alert: HighErrorRate
        expr: rate(mcp_requests_errors_total[5m]) / rate(mcp_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "MCP server error rate above 5%"

      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(mcp_request_duration_ms_bucket[5m])) > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MCP server P95 latency above 1 second"

      - alert: ServiceDown
        expr: up{job="mcp-server"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "MCP server is down"
```
Testing with Metrics
Use test scenarios as health checks that verify metrics:
```yaml
# scenarios/smoke.yaml
name: "Smoke Test with Metrics Verification"

steps:
  - name: "Call weather tool"
    operation:
      type: tool_call
      tool: "get-weather"
      arguments:
        city: "London"
    assertions:
      - type: success
      - type: duration_ms
        max: 1000

  # Verify metrics endpoint
  - name: "Check metrics"
    operation:
      type: http_get
      url: "http://localhost:9090/metrics"
    assertions:
      - type: contains
        value: "mcp_requests_total"
      - type: contains
        value: 'tool="get-weather"'
```
Metrics in CI/CD
```yaml
# .github/workflows/test.yml
jobs:
  test:
    steps:
      - name: Start server
        run: cargo run --release &
        env:
          METRICS_BACKEND: prometheus

      - name: Wait for startup
        run: sleep 5

      - name: Run tests
        run: cargo pmcp test --server weather

      - name: Verify metrics
        run: |
          curl -s http://localhost:9090/metrics | grep mcp_requests_total
          curl -s http://localhost:9090/metrics | grep mcp_request_duration_ms
```
Summary
| Aspect | Recommendation |
|---|---|
| Crate | Use metrics facade for portability |
| Types | Counter (totals), Histogram (durations), Gauge (current state) |
| Naming | Hierarchical: mcp.component.metric_name |
| Labels | Service, tool, environment; avoid high cardinality |
| Platform | Configure at runtime via environment variables |
| Prometheus | Default for cloud-native, excellent Grafana support |
| Datadog | StatsD exporter, good for existing Datadog users |
| CloudWatch | Custom recorder for AWS-native deployments |
| Alerting | Error rate > 5%, P95 latency > 1s, service down |
Metrics provide the quantitative foundation for understanding system behavior. Combined with logging and tracing, they complete the observability picture for enterprise MCP servers.
Return to Middleware and Instrumentation | Continue to Operations and Monitoring →