
Troubleshooting Guide

Comprehensive guide for diagnosing and resolving common issues with the MCP Test Harness.

📋 Overview

This guide covers common problems, error messages, debugging techniques, and solutions to help you quickly resolve issues with the MCP Test Harness.

New to the Test Harness? Check the Installation Guide for setup instructions, or see the Quick Start Guide for basic usage.

🚨 Quick Diagnosis

Is your issue in this category?

  • ⚙️ Configuration - YAML syntax, missing fields, invalid values
  • 🖥️ Server Connection - Cannot connect to or start MCP server
  • 🧪 Test Execution - Tests failing, timeouts, validation errors
  • 📊 Performance - Slow execution, memory issues, resource limits
  • 🔐 Security - Authentication, authorization, compliance problems

Emergency Checklist

If tests are completely failing, check these first:

  1. Basic connectivity: Can you connect to your MCP server manually?
  2. Configuration syntax: Is your YAML configuration valid?
  3. Required fields: Are all required configuration fields present?
  4. File paths: Do all referenced files and directories exist?
  5. Permissions: Does the test harness have necessary file/network permissions?
  6. Resource availability: Is there enough memory/disk/network capacity?
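
Several of these checks can be automated before a run. The following is a minimal sketch, not part of the test harness: `preflight` and its parameters are hypothetical names, and it covers only the file, path, and disk-space items using the Python standard library.

```python
import os
import shutil


def preflight(config_path, referenced_paths, min_free_mb=100):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Checklist items 2-3 need a readable config file before anything else.
    if not os.path.isfile(config_path):
        problems.append(f"config file not found: {config_path}")
    elif not os.access(config_path, os.R_OK):
        problems.append(f"config file not readable: {config_path}")

    # Item 4: every referenced file or directory must exist.
    for path in referenced_paths:
        if not os.path.exists(path):
            problems.append(f"referenced path missing: {path}")

    # Item 6: check free disk space on the volume holding the current directory.
    free_mb = shutil.disk_usage(os.getcwd()).free // (1024 * 1024)
    if free_mb < min_free_mb:
        problems.append(f"low disk space: {free_mb} MB free, want {min_free_mb} MB")

    return problems
```

Calling `preflight("test-harness.yaml", ["scripts/setup.py"])` from the project directory and printing the returned list gives a quick first pass; connectivity and YAML validity still need the dedicated checks described below.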

โš™๏ธ Configuration Issuesโ€‹

Issue: Invalid YAML Syntax

Symptoms:

❌ Configuration Error: Invalid YAML syntax in test-harness.yaml at line 15
Error: mapping values are not allowed here

Solutions:

  1. Check YAML Indentation:
# ❌ Incorrect indentation
global:
max_global_concurrency: 4
  global_timeout_seconds: 300

# ✅ Correct indentation
global:
  max_global_concurrency: 4
  global_timeout_seconds: 300
  2. Validate YAML Online: paste the configuration into an online YAML validator to locate the offending line.

  3. Use YAML Linter:

# Install yamllint
pip install yamllint

# Check configuration
yamllint test-harness.yaml

Issue: Missing Required Fields

Symptoms:

❌ Configuration Error: Missing required field 'server.start_command'

Solutions:

  1. Add Missing Fields:
server:
  start_command: "cargo run --bin my-mcp-server" # Add this line
  args: ["stdio"]
  2. Use Configuration Template:
# Generate a valid template
mcp-test-harness template --minimal --output template.yaml
# Copy required fields from template
  3. Check Configuration Reference: consult the Configuration Reference for the full list of required fields.

Issue: Invalid Field Values

Symptoms:

❌ Configuration Error: Invalid value for 'global.max_global_concurrency': 0
Expected: Integer between 1 and 32

Solutions:

  1. Fix Value Ranges:
global:
  max_global_concurrency: 4 # Must be 1-32
  global_timeout_seconds: 300 # Must be 1-3600
  2. Check Data Types:
# ❌ Incorrect types
global:
  max_global_concurrency: "4" # String instead of integer
  fail_fast: "false" # String instead of boolean

# ✅ Correct types
global:
  max_global_concurrency: 4 # Integer
  fail_fast: false # Boolean
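
The range and type rules above can be expressed as a small check. This is an illustrative sketch only: `check_global` is a hypothetical helper operating on an already-parsed `global` section, and it is not the harness's own validator. The ranges are the ones quoted in this section.

```python
def check_global(cfg):
    """Validate the 'global' section of an already-parsed config dict."""
    errors = []

    # (field, minimum, maximum) taken from the ranges quoted in this guide.
    int_ranges = [
        ("max_global_concurrency", 1, 32),
        ("global_timeout_seconds", 1, 3600),
    ]
    for field, low, high in int_ranges:
        if field not in cfg:
            continue
        value = cfg[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{field}: expected integer, got {type(value).__name__}")
        elif not low <= value <= high:
            errors.append(f"{field}: {value} is outside {low}-{high}")

    if "fail_fast" in cfg and not isinstance(cfg["fail_fast"], bool):
        errors.append(f"fail_fast: expected boolean, got {type(cfg['fail_fast']).__name__}")

    return errors
```

A quoted value such as `"4"` parses as a string in YAML, which is why the type check runs before the range check.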

Issue: File Path Not Found

Symptoms:

❌ Configuration Error: Script file not found: 'scripts/setup.py'

Solutions:

  1. Check File Exists:
ls -la scripts/setup.py
  2. Use Correct Relative Paths:
# ❌ Incorrect path
setup_script: "setup.py"

# ✅ Correct path
setup_script: "scripts/setup.py"
# or use absolute path
setup_script: "/home/user/project/scripts/setup.py"
  3. Create Missing Files:
mkdir -p scripts
touch scripts/setup.py
chmod +x scripts/setup.py

๐Ÿ–ฅ๏ธ Server Connection Issuesโ€‹

Issue: Cannot Connect to MCP Server

Symptoms:

❌ Connection Error: Failed to connect to MCP server
Transport: stdio
Command: cargo run --bin my-mcp-server
Error: No such file or directory

Solutions:

  1. Verify Server Command:
# Test server command manually
cargo run --bin my-mcp-server stdio

# If command fails, check:
# - Is the binary name correct?
# - Is the project built?
# - Are you in the right directory?
  2. Check Working Directory:
server:
  start_command: "cargo run --bin my-mcp-server"
  working_dir: "/path/to/server/project" # Add this
  3. Use Absolute Paths:
server:
  start_command: "/home/user/.cargo/bin/cargo"
  args: ["run", "--bin", "my-mcp-server", "stdio"]

Issue: Server Startup Timeout

Symptoms:

❌ Server Error: Server startup timeout after 30 seconds
Server command: cargo run --bin my-mcp-server
Status: Still starting

Solutions:

  1. Increase Startup Timeout:
server:
  startup_timeout_seconds: 60 # Increase from default 30
  2. Check Server Logs:
# Run server manually to see error messages
cargo run --bin my-mcp-server stdio
  3. Optimize Server Build:
# Use release build for faster startup (build once so startup does not include compilation)
cargo build --release

# Then start the release build from the configuration
server:
  start_command: "cargo run --release --bin my-mcp-server"

Issue: HTTP Connection Refused

Symptoms:

❌ Connection Error: Connection refused
URL: http://localhost:3000
Error: Cannot connect to server

Solutions:

  1. Check Server is Running:
# Check if server is listening
netstat -tlnp | grep 3000
# or
ss -tlnp | grep 3000

# Test connection manually
curl http://localhost:3000/health
  2. Verify URL and Port:
server:
  transport: "http"
  url: "http://localhost:3000" # Check port is correct
  3. Check Firewall Settings:
# Check firewall rules
sudo ufw status
sudo iptables -L

# Allow port if needed
sudo ufw allow 3000
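
When `netstat`, `ss`, or `curl` are not available (for example in a slim container), the same reachability check can be done from Python. This is a minimal sketch using only the standard library; `port_open` is a hypothetical helper name, and it only confirms that something accepts TCP connections, not that it speaks MCP.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

`port_open("localhost", 3000)` returning `False` while the server process is running usually points at a wrong port, a different bind address, or a firewall rule.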

Issue: WebSocket Connection Failed

Symptoms:

❌ WebSocket Error: Connection failed
URL: ws://localhost:3000
Error: HTTP 404 Not Found

Solutions:

  1. Check WebSocket Endpoint:
server:
  transport: "websocket"
  url: "ws://localhost:3000/ws" # Add correct path
  2. Test WebSocket Manually:
# Use websocat to test (use the same URL and path as in the configuration)
websocat ws://localhost:3000/ws

# Or use browser dev tools
  3. Check Server WebSocket Support:
    • Verify your MCP server supports WebSocket transport
    • Check server documentation for correct WebSocket URL format

🧪 Test Execution Issues

Issue: Test Timeout

Symptoms:

❌ FAIL: repository_stats
Duration: 30000ms (TIMEOUT)
Error: Test exceeded maximum execution time

Solutions:

  1. Increase Test Timeout:
test_cases:
  - id: "repository_stats"
    timeout_override_seconds: 60 # Increase timeout
  2. Check Server Performance:
# Monitor server resource usage (-d ',' joins multiple PIDs with commas, as top expects)
top -p "$(pgrep -d ',' my-mcp-server)"
htop
  3. Optimize Test Data:
# Use smaller test datasets
input_params:
  path: "test-projects/small" # Instead of large project

Issue: Validation Failure

Symptoms:

❌ FAIL: repository_stats
Validation Failure:
- Expected field 'result.total_files' not found
- Response: {"error": "Path not found"}

Solutions:

  1. Check Server Response:
# Debug server response manually
# MCP messages are JSON-RPC 2.0; a server may also require an initialize handshake before tools/call
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "repository_stats", "arguments": {"path": "/test/project"}}}' | cargo run --bin my-mcp-server stdio
  2. Update Validation Patterns:
# Handle error responses
expected:
  patterns:
    - key: "result.total_files"
      validation: { type: "exists" }
      required: false # Make optional if server might return errors
  3. Fix Input Parameters:
input_params:
  path: "${default_project_path}" # Use valid path
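
To see why the error response above fails an `exists` check on `result.total_files`, it can help to walk the dotted key by hand. The sketch below assumes keys are resolved segment by segment through nested objects; that resolution rule and the helper names `lookup` and `exists` are assumptions for illustration, not the harness's documented behavior.

```python
_MISSING = object()


def lookup(data, dotted_key):
    """Walk a dotted key such as 'result.total_files' through nested dicts."""
    current = data
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def exists(data, dotted_key):
    """True when every segment of the dotted key is present, even if the value is falsy."""
    return lookup(data, dotted_key) is not _MISSING
```

With the response from the symptom, `exists({"error": "Path not found"}, "result.total_files")` is false because the top-level `result` object is absent, which suggests fixing the input path before loosening the pattern.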

Issue: Script Execution Failed

Symptoms:

❌ Script Error: Custom validation script failed
Script: scripts/validate_response.py
Exit Code: 1
Error: ModuleNotFoundError: No module named 'json'

Solutions:

  1. Check Script Dependencies:
# Test script manually
python3 scripts/validate_response.py < test_input.json

# Install missing dependencies
pip3 install -r scripts/requirements.txt
  2. Fix Script Permissions:
chmod +x scripts/validate_response.py
  3. Use Correct Python Path:
custom_scripts:
  - script: "scripts/validate_response.py"
    engine: "python3" # Specify interpreter
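
A quick way to confirm which modules a given interpreter can actually import is to ask `importlib` directly. This is a small sketch; `missing_modules` is a hypothetical helper, and it should be run with the same interpreter the harness is configured to use.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Since `json` is part of the Python standard library, a report that it is missing often indicates that the script is running under an unexpected or broken interpreter rather than that a package needs installing.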

Issue: Concurrent Test Failures

Symptoms:

❌ Multiple tests failing with resource exhaustion
Error: Cannot create thread: Resource temporarily unavailable

Solutions:

  1. Reduce Concurrency:
global:
  max_global_concurrency: 2 # Reduce from default 4

test_suites:
  - name: "my-suite"
    parallel_execution: false # Disable parallel execution
  2. Increase Resource Limits:
# Increase file descriptor limit
ulimit -n 4096

# Increase process limit
ulimit -u 2048
  3. Add Resource Monitoring:
environment:
  limits:
    max_memory_mb: 512 # Set reasonable limits
    max_open_files: 256
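
Before raising limits with `ulimit`, it is useful to know what the current process is allowed. The following sketch reads the limits through Python's `resource` module, which is available on Unix-like systems only; `current_limits` is a hypothetical helper name.

```python
import resource


def current_limits():
    """Return (soft, hard) limits for open files and processes on Unix-like systems."""
    limits = {"open_files": resource.getrlimit(resource.RLIMIT_NOFILE)}
    # RLIMIT_NPROC is not defined on every platform, so look it up defensively.
    nproc = getattr(resource, "RLIMIT_NPROC", None)
    if nproc is not None:
        limits["processes"] = resource.getrlimit(nproc)
    return limits
```

A value equal to `resource.RLIM_INFINITY` means the limit is unlimited. The soft limit is the one `ulimit -n` and `ulimit -u` adjust for the current shell.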

📊 Performance Issues

Issue: Slow Test Execution

Symptoms:

Performance Warning: Test suite taking longer than expected
Average response time: 15 seconds (baseline: 2 seconds)

Solutions:

  1. Profile Server Performance:
# Use profiling tools
perf record -g cargo run --bin my-mcp-server
perf report

# Monitor system resources
iostat 1
vmstat 1
  2. Optimize Test Configuration:
# Reduce test parallelism if causing contention
global:
  max_global_concurrency: 1

# Use smaller test datasets
input_params:
  max_results: 100 # Limit result size
  3. Update Performance Baselines:
# Re-establish baselines if hardware changed
mcp-test-harness benchmark --establish-baseline --config prod.yaml

Issue: Memory Exhaustion

Symptoms:

❌ Resource Error: Memory limit exceeded
Used: 2048 MB
Limit: 1024 MB
Test: large_repository_analysis

Solutions:

  1. Increase Memory Limits:
environment:
  limits:
    max_memory_mb: 4096 # Increase limit

# Or per-test basis
performance:
  max_memory_usage_mb: 2048
  2. Optimize Memory Usage:
# Process data in chunks
input_params:
  batch_size: 100
  stream_results: true
  3. Monitor Memory Usage:
# Check system memory
free -h
htop

# Check for memory leaks
# Run the built binary directly so valgrind inspects the server rather than cargo
# (path assumes Cargo's default target directory)
cargo build --bin my-mcp-server
valgrind --tool=memcheck ./target/debug/my-mcp-server

Issue: Disk Space Exhaustion

Symptoms:

❌ Storage Error: No space left on device
Available: 0 MB
Required: 500 MB

Solutions:

  1. Clean Up Temporary Files:
# Clean test artifacts
rm -rf test-reports/old-reports/
rm -rf /tmp/mcp-test-*

# Clean up Docker if using containers
# (removes all unused images, stopped containers, and unused networks)
docker system prune -a
  2. Configure Cleanup:
global:
  cleanup_on_success: true
  cleanup_on_failure: false # Keep for debugging

reporting:
  retention_days: 30 # Automatically clean old reports
  3. Use External Storage:
reporting:
  output_dir: "/mnt/external/test-reports" # Use external disk
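
Before deleting anything, it can be safer to list what a retention window would remove. The sketch below is a dry run under stated assumptions: `stale_reports` is a hypothetical helper, "age" is taken from file modification time, and it does not reflect how the harness implements `retention_days` internally.

```python
import os
import time


def stale_reports(report_dir, retention_days, now=None):
    """List files under report_dir whose modification time is older than retention_days."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    stale = []
    for root, _dirs, files in os.walk(report_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

Printing `stale_reports("test-reports", 30)` shows the candidates; removal can then be done deliberately once the list looks right.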

๐Ÿ” Security Issuesโ€‹

Issue: Authentication Failed

Symptoms:

❌ Authentication Error: Invalid credentials
Status: 401 Unauthorized
Error: Authentication token invalid or expired

Solutions:

  1. Check Token Configuration:
server:
  headers:
    Authorization: "Bearer ${API_TOKEN}" # Verify token format

# Set environment variable in the shell that launches the harness
export API_TOKEN="your-actual-token"
  2. Verify Token Validity:
# Test authentication manually
curl -H "Authorization: Bearer $API_TOKEN" http://localhost:3000/auth/verify
  3. Update Authentication Method:
# Use different auth method if needed
server:
  transport: "http"
  headers:
    X-API-Key: "${API_KEY}" # Alternative auth header
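
A common cause of 401 responses is a `${...}` placeholder whose environment variable was never exported, so the header is sent empty or unexpanded. The sketch below scans configuration text for such placeholders; `unset_placeholders` is a hypothetical helper, and the `${NAME}` syntax is taken from the examples in this section.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unset_placeholders(text, env=None):
    """Return the ${NAME} placeholders in text whose variables are unset or empty."""
    env = os.environ if env is None else env
    names = _PLACEHOLDER.findall(text)
    return sorted({name for name in names if not env.get(name)})
```

Passing the contents of the configuration file to `unset_placeholders` before a run lists every variable that still needs to be exported.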

Issue: Certificate Verification Failed

Symptoms:

❌ TLS Error: Certificate verification failed
Error: self signed certificate in certificate chain

Solutions:

  1. Disable Certificate Verification (Development Only):
server:
  tls:
    verify_certificates: false # ⚠️ Only for development
  2. Add Custom CA Bundle:
server:
  tls:
    ca_bundle_path: "/path/to/custom-ca.pem"
  3. Use Self-Signed Certificate:
# Create self-signed certificate for testing
openssl req -x509 -newkey rsa:4096 -keyout server.key -out server.crt -days 365 -nodes

Issue: Audit Log Errors

Symptoms:

❌ Audit Error: Cannot write to audit log
Path: /var/log/mcp-test-harness/audit.log
Error: Permission denied

Solutions:

  1. Fix Log Directory Permissions:
sudo mkdir -p /var/log/mcp-test-harness
sudo chown $USER:$USER /var/log/mcp-test-harness
chmod 755 /var/log/mcp-test-harness
  2. Use Alternative Log Path:
security:
  audit_log_path: "./logs/audit.log" # Use local directory
  3. Disable Audit Logging (Temporary):
security:
  enable_audit_logging: false # Temporary workaround

📈 Monitoring Issues

Issue: Metrics Collection Failed

Symptoms:

❌ Metrics Error: Cannot connect to Prometheus push gateway
URL: http://localhost:9091
Error: Connection refused

Solutions:

  1. Start Prometheus Push Gateway:
# Start push gateway
docker run -d -p 9091:9091 prom/pushgateway

# Or install locally
prometheus-pushgateway --web.listen-address=:9091
  2. Disable Metrics Collection:
monitoring:
  enabled: false # Temporary workaround
  3. Use Alternative Metrics Endpoint:
monitoring:
  prometheus:
    push_gateway:
      url: "http://monitoring-server:9091" # Different server

Issue: Alert Rules Not Working

Symptoms:

⚠️ Alert Warning: No alerts triggered despite test failures
Expected: Failure rate alert
Actual: No alerts sent

Solutions:

  1. Check Alert Configuration:
monitoring:
  alerting:
    enabled: true # Ensure alerting is enabled
    rules:
      - name: "test_failure_rate"
        condition: "test_failure_rate > 0.1" # Check condition logic
  2. Test Alert Channels:
# Test email configuration
echo "Test email" | mail -s "Alert Test" team@company.com

# Test Slack webhook
curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"Test alert"}' \
  "$SLACK_WEBHOOK_URL"
  3. Check Alert Thresholds:
# Lower thresholds for testing
monitoring:
  alerts:
    error_rate_threshold: 0.01 # 1% instead of 5%
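
Alerts that never fire are often a matter of arithmetic: the observed failure rate simply does not exceed the threshold. The sketch below works through a strict "greater than" condition like the `test_failure_rate > 0.1` rule shown above; the function names are hypothetical, and treating the rate as failed divided by total is an assumption about how the metric is computed.

```python
def failure_rate(failed, total):
    """Fraction of failed tests; 0.0 when nothing ran."""
    return failed / total if total else 0.0


def should_alert(failed, total, threshold=0.1):
    """Apply a strict 'greater than' condition such as 'test_failure_rate > 0.1'."""
    return failure_rate(failed, total) > threshold
```

Under these assumptions, 1 failure in 20 tests is a 5% rate and does not trigger a 0.1 threshold, and neither does exactly 2 in 20, because the comparison is strict. Lowering the threshold to 0.01 makes the same single failure trigger.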

🔧 Advanced Debugging

Enable Debug Logging

# Maximum debug output
RUST_LOG=debug mcp-test-harness test --verbose --config my-config.yaml

# Trace level logging
RUST_LOG=trace mcp-test-harness test -vvv --config my-config.yaml

Analyze Log Files

# View recent logs
tail -f ~/.mcp-test-harness/logs/test-harness.log

# Search for specific errors
grep "ERROR" ~/.mcp-test-harness/logs/test-harness.log

# Analyze performance
grep "PERF" ~/.mcp-test-harness/logs/test-harness.log | tail -20
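
For a quick summary rather than raw `grep` output, log lines can be tallied by level. This sketch assumes each line carries an upper-case level keyword such as `ERROR` or `INFO`; the actual log format is not specified in this guide, so the pattern and the helper name `count_levels` are assumptions to adjust.

```python
import re
from collections import Counter

_LEVEL = re.compile(r"\b(ERROR|WARN|INFO|DEBUG|TRACE)\b")


def count_levels(lines):
    """Count the first level keyword found on each line; lines without one are skipped."""
    counts = Counter()
    for line in lines:
        match = _LEVEL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Opening `~/.mcp-test-harness/logs/test-harness.log` and passing the file object to `count_levels` yields a per-level tally, which makes a sudden jump in errors easy to spot between runs.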

Network Debugging

# Monitor network traffic
sudo tcpdump -i lo -A port 3000

# Check network connections
netstat -tulpn | grep mcp

# Test connectivity
telnet localhost 3000

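Process Debugging

# pgrep compares at most 15 characters of the process name, so match the
# full command line with -f; -o picks the oldest matching process
# Monitor process tree
pstree -p "$(pgrep -o -f mcp-test-harness)"

# Check file descriptors
lsof -p "$(pgrep -o -f mcp-test-harness)"

# Monitor system calls
strace -p "$(pgrep -o -f mcp-test-harness)"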

Configuration Debugging

# Validate configuration
mcp-test-harness validate --comprehensive --config my-config.yaml

# Dry run to see execution plan
mcp-test-harness test --dry-run --config my-config.yaml

# Export effective configuration
mcp-test-harness config export --config my-config.yaml

🆘 Getting Additional Help

Collecting Diagnostic Information

When reporting issues, collect this information:

#!/bin/bash
# diagnostic-info.sh

echo "=== System Information ==="
uname -a
cat /etc/os-release

echo "=== MCP Test Harness Version ==="
mcp-test-harness --version

echo "=== Configuration Validation ==="
mcp-test-harness validate --config test-harness.yaml

echo "=== Resource Usage ==="
free -h
df -h

echo "=== Network Connectivity ==="
netstat -tulpn | grep -E ":(3000|8080|9090)"

echo "=== Recent Errors ==="
grep ERROR ~/.mcp-test-harness/logs/test-harness.log | tail -20

echo "=== Environment Variables ==="
env | grep -E "MCP|RUST|PATH"

Creating Minimal Reproduction

Create a minimal configuration that reproduces the issue:

# minimal-repro.yaml
global:
  max_global_concurrency: 1
  log_level: "debug"

server:
  start_command: "echo" # Simple command for testing
  args: ["hello"]
  transport: "stdio"

test_suites:
  - name: "simple-test"
    test_cases:
      - id: "basic-test"
        tool_name: "echo"
        input_params: {}
        expected:
          patterns:
            - key: "result"
              validation: { type: "exists" }

Support Channels

Issue Template

When reporting issues, use this template:

## Problem Description
Brief description of the issue

## Environment
- OS: Ubuntu 22.04
- MCP Test Harness Version: 1.0.0
- Server Type: HTTP/stdio/WebSocket
- Server Implementation: Custom/CodePrism/Other

## Configuration
```yaml
# Your configuration (sanitized)
```

## Expected Behavior
What you expected to happen

## Actual Behavior
What actually happened

## Error Messages
Complete error messages and stack traces

## Steps to Reproduce
1. Step 1
2. Step 2
3. Step 3

## Additional Context
Any other relevant information


---

## 📚 Additional Resources

- [User Guide](user-guide.md) - Complete user manual
- [Configuration Reference](configuration-reference.md) - Complete configuration documentation
- [Production Deployment](production-deployment.md) - Enterprise deployment guide
- [Developer Guide](developer-guide.md) - Extending the test harness

**Last Updated**: 2025-01-07
**Version**: 1.0.0