Pagination Patterns
Your database query tool from the previous exercise works great for small result sets, but what happens when a table has millions of rows? Without proper pagination:
- Memory exhaustion: Loading 10M rows into memory crashes your server
- Timeouts: Long queries block the connection pool
- Poor UX: AI assistants can't process massive JSON responses effectively
This exercise teaches cursor-based pagination - the production pattern for handling large datasets efficiently. You'll learn why it's superior to offset-based pagination and how to implement it safely.
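To make the comparison concrete, here are the two query shapes side by side - a sketch against a hypothetical `customers` table with an indexed `id` primary key; what matters is the work the database does for each:

```rust
// Offset pagination: to serve page 1000 the database must walk past
// 999,000 rows and throw them away - the cost grows with page depth.
const OFFSET_QUERY: &str =
    "SELECT * FROM customers ORDER BY id LIMIT 1000 OFFSET 999000";

// Keyset (cursor) pagination: the database seeks straight to id 999000
// in the primary-key index and reads 1,000 rows - constant cost per page.
const KEYSET_QUERY: &str =
    "SELECT * FROM customers WHERE id > 999000 ORDER BY id LIMIT 1000";
```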
🎯 Learning Objectives

By the end of this exercise, you should be able to:
- Implement cursor-based (keyset) pagination for an MCP database tool
- Explain why `WHERE id > ?` pagination outperforms `OFFSET` on deep pages
- Encode pagination state as an opaque, validated cursor
- Use the fetch-N+1 pattern to detect whether more pages exist
- Report pagination state so an AI client knows whether to keep fetching
💬 Discussion
- If you have 10 million rows and an AI asks for "all customers", what should happen?
- Why is `OFFSET 999000 LIMIT 1000` slower than `WHERE id > 999000 LIMIT 1000`?
- How should an MCP response indicate that more data is available?
- What makes a good pagination cursor? (hint: not just a page number)
💡 Hints
Hint 1
Start by validating the table is in the allowlist:
```rust
if !ALLOWED_TABLES.contains(&input.table.as_str()) {
    return Err(anyhow::anyhow!("Table not allowed"));
}
```
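The `ALLOWED_TABLES` allowlist itself isn't shown in this exercise; a minimal sketch, with placeholder table names standing in for your actual schema:

```rust
// Hypothetical allowlist - replace with the tables your server exposes.
const ALLOWED_TABLES: &[&str] = &["customers", "orders", "products"];
```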
Hint 2
Build the query with cursor support:
```rust
let start_id = if let Some(cursor_str) = &input.cursor {
    let cursor = Cursor::decode(cursor_str)?;
    if cursor.table != input.table {
        return Err(anyhow::anyhow!("Cursor table mismatch"));
    }
    cursor.last_id
} else {
    0
};

// Fetch one extra row so we can tell whether another page exists.
// Values are interpolated here for brevity; the full solution binds them.
let query = format!(
    "SELECT * FROM {} WHERE id > {} ORDER BY id LIMIT {}",
    input.table, start_id, input.page_size + 1
);
```
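The hints call `Cursor::decode` and the solution calls `.encode()`, but the `Cursor` type itself is left to you. One way to build it is as opaque base64-encoded JSON, matching the "opaque cursors" pattern discussed after the solution. This sketch assumes the `base64` crate's 0.21+ engine API:

```rust
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine as _};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Cursor {
    last_id: i64,
    table: String,
}

impl Cursor {
    /// Serialize to JSON, then base64 - opaque to the client.
    fn encode(&self) -> String {
        URL_SAFE_NO_PAD.encode(serde_json::to_vec(self).expect("cursor serializes"))
    }

    /// Reverse of encode; fails cleanly on tampered or truncated input.
    fn decode(s: &str) -> anyhow::Result<Self> {
        let bytes = URL_SAFE_NO_PAD.decode(s)?;
        Ok(serde_json::from_slice(&bytes)?)
    }
}
```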
Hint 3
Complete implementation with has_more detection:
```rust
async fn paginated_query(pool: &DbPool, input: PaginatedQueryInput) -> Result<PaginatedResult> {
    // Validate table
    if !ALLOWED_TABLES.contains(&input.table.as_str()) {
        return Err(anyhow::anyhow!("Table '{}' not allowed", input.table));
    }

    // Limit page size
    let page_size = input.page_size.min(100);

    // Decode cursor
    let start_id = match &input.cursor {
        Some(c) => {
            let cursor = Cursor::decode(c)?;
            if cursor.table != input.table {
                return Err(anyhow::anyhow!("Cursor was for different table"));
            }
            cursor.last_id
        }
        None => 0,
    };

    // Build and execute query - fetch N+1 to detect more pages
    let query = format!(
        "SELECT * FROM {} WHERE id > {} ORDER BY id LIMIT {}",
        input.table, start_id, page_size + 1
    );
    let rows = sqlx::query(&query)
        .fetch_all(pool.as_ref())
        .await?;

    // Check for more results
    let has_more = rows.len() > page_size as usize;
    let rows: Vec<_> = rows.into_iter().take(page_size as usize).collect();

    // Build next_cursor if more pages exist...
}
```
⚠️ Try the exercise first!

Solution
```rust
use sqlx::{Column, Row};

async fn paginated_query(pool: &DbPool, input: PaginatedQueryInput) -> Result<PaginatedResult> {
    // Validate table is in allowlist
    if !ALLOWED_TABLES.contains(&input.table.as_str()) {
        return Err(anyhow::anyhow!("Table '{}' not in allowlist", input.table));
    }

    // Clamp page size to the range 1..=100
    let page_size = input.page_size.min(100).max(1);

    // Decode cursor if provided
    let start_id = match &input.cursor {
        Some(cursor_str) => {
            let cursor = Cursor::decode(cursor_str)?;
            // Validate cursor is for same table (security check)
            if cursor.table != input.table {
                return Err(anyhow::anyhow!(
                    "Cursor was created for table '{}', not '{}'",
                    cursor.table,
                    input.table
                ));
            }
            cursor.last_id
        }
        None => 0,
    };

    // Build query - fetch page_size + 1 rows to detect if more pages exist.
    // The table name is interpolated (it's allowlisted); values are bound.
    let query = format!(
        "SELECT * FROM {} WHERE id > ? ORDER BY id LIMIT ?",
        input.table
    );
    let all_rows = sqlx::query(&query)
        .bind(start_id)
        .bind(page_size + 1)
        .fetch_all(pool.as_ref())
        .await?;

    // Determine if there are more results
    let has_more = all_rows.len() > page_size as usize;
    let rows: Vec<_> = all_rows.into_iter().take(page_size as usize).collect();

    // Extract column names
    let columns: Vec<String> = if let Some(first_row) = rows.first() {
        first_row.columns().iter().map(|c| c.name().to_string()).collect()
    } else {
        vec![]
    };

    // Convert rows to JSON values (integers and strings; other types become NULL)
    let row_data: Vec<Vec<serde_json::Value>> = rows
        .iter()
        .map(|row| {
            columns
                .iter()
                .enumerate()
                .map(|(i, _)| {
                    if let Ok(v) = row.try_get::<i64, _>(i) {
                        serde_json::Value::Number(v.into())
                    } else if let Ok(v) = row.try_get::<String, _>(i) {
                        serde_json::Value::String(v)
                    } else {
                        serde_json::Value::Null
                    }
                })
                .collect()
        })
        .collect();

    // Get last ID for the cursor (assumes `id` is the first column of SELECT *)
    let last_id = row_data
        .last()
        .and_then(|row| row.first())
        .and_then(|v| v.as_i64());

    // Create next cursor only if more data exists
    let next_cursor = if has_more {
        last_id.map(|id| {
            Cursor {
                last_id: id,
                table: input.table.clone(),
            }
            .encode()
        })
    } else {
        None
    };

    // Human-readable status for the AI
    let count = row_data.len();
    let status = if has_more {
        format!(
            "Showing {} rows. More data available - pass next_cursor to continue.",
            count
        )
    } else {
        format!("Showing {} rows. This is all available data.", count)
    };

    Ok(PaginatedResult {
        columns,
        rows: row_data,
        count,
        next_cursor,
        status,
    })
}
```
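The solution assumes input and output types along these lines. The field names follow the usages above; the exact derives depend on your MCP framework, and `DbPool` is whatever pool alias your project already defines (likely an `Arc`-wrapped `sqlx` pool, given the `pool.as_ref()` calls):

```rust
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct PaginatedQueryInput {
    table: String,
    page_size: i64,
    cursor: Option<String>,
}

#[derive(Serialize)]
struct PaginatedResult {
    columns: Vec<String>,
    rows: Vec<Vec<serde_json::Value>>,
    count: usize,
    next_cursor: Option<String>,
    status: String,
}
```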
Key patterns demonstrated:
- Opaque cursors: base64 JSON hides implementation details
- Fetch N+1 pattern: efficiently detect more pages without a COUNT query
- Table validation in the cursor: prevents cursor reuse attacks
- Human-readable status: helps the AI understand pagination state
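To see the cursor protocol end to end, here is a sketch of how a caller (or an AI client) would drain a table page by page. The `customers` table name is a placeholder for any allowlisted table:

```rust
async fn fetch_all_pages(pool: &DbPool) -> anyhow::Result<usize> {
    let mut cursor = None;
    let mut total = 0;

    loop {
        let page = paginated_query(
            pool,
            PaginatedQueryInput {
                table: "customers".to_string(), // hypothetical allowlisted table
                page_size: 100,
                cursor: cursor.take(),
            },
        )
        .await?;

        total += page.count;

        // next_cursor == None signals the final page.
        match page.next_cursor {
            Some(next) => cursor = Some(next),
            None => break,
        }
    }

    Ok(total)
}
```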
🧪 Tests
Run these tests locally with:
```bash
cargo test
```
Test Code
```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_first_page() {
        // First page should return results and a next_cursor
    }

    #[tokio::test]
    async fn test_continue_with_cursor() {
        // Second page should have no overlap with first
    }

    #[tokio::test]
    async fn test_last_page() {
        // Final page should have no next_cursor
    }

    #[tokio::test]
    async fn test_invalid_table() {
        // Tables not in allowlist should error
    }

    #[tokio::test]
    async fn test_cursor_table_mismatch() {
        // Cursor from table A shouldn't work for table B
    }
}
```
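As a starting point, here is one way to exercise the cursor round-trip without touching the database. It depends only on the `Cursor` sketch shown earlier, so it runs as a plain synchronous test:

```rust
#[test]
fn cursor_roundtrip_and_mismatch() {
    let cursor = Cursor { last_id: 42, table: "customers".to_string() };
    let encoded = cursor.encode();

    // Round-trip preserves the fields.
    let decoded = Cursor::decode(&encoded).unwrap();
    assert_eq!(decoded.last_id, 42);
    assert_eq!(decoded.table, "customers");

    // A cursor minted for one table must be rejected for another -
    // paginated_query compares cursor.table against the requested table.
    assert_ne!(decoded.table, "orders");

    // Tampered input should fail to decode rather than panic.
    assert!(Cursor::decode("not-base64!!!").is_err());
}
```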
🤔 Reflection
- Why do we include the table name in the cursor?
- What would happen if rows were deleted between page fetches?
- How would you support sorting by a non-unique column?
- Why is the cursor base64-encoded JSON instead of just an ID?