Architecture Guide
This document describes the high-level architecture, design decisions, and implementation details of CodePrism.
🚨 ARCHITECTURAL PIVOT - December 2024
After detailed review against official MCP (Model Context Protocol) documentation, the architecture has been significantly simplified to ensure MCP compliance and optimal client integration.
Table of Contents
- System Overview
- Core Principles
- MCP-Compliant Architecture
- Data Flow
- Storage Design
- Performance Architecture
- Security Architecture
- Deployment Architecture
System Overview
CodePrism is an MCP-compliant, graph-first code intelligence system designed to give LLM assistants real-time, accurate code understanding. It implements the Model Context Protocol (built on JSON-RPC 2.0) so that it integrates with MCP clients such as Claude Desktop, Cursor, and GitHub Copilot in VS Code.
MCP-Optimized Architecture
┌─────────────────────────────────────────────────────────────┐
│                   MCP-Compliant CodePrism                   │
├─────────────────────────────────────────────────────────────┤
│  MCP Clients                                                │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │   Claude    │    │   Cursor    │    │   VS Code   │      │
│  │   Desktop   │    │   Editor    │    │   Copilot   │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
│         │                  │                  │             │
│         └──────────────────┼──────────────────┘             │
│                            ▼ (JSON-RPC 2.0)                 │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                  CodePrism MCP Server                   │ │
│ │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │ │
│ │  │  Resources  │    │    Tools    │    │   Prompts   │  │ │
│ │  │   Manager   │    │   Manager   │    │   Manager   │  │ │
│ │  └─────────────┘    └─────────────┘    └─────────────┘  │ │
│ │                                                         │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │               JSON-RPC 2.0 Transport                │ │ │
│ │ │  ┌─────────────┐    ┌─────────────┐                 │ │ │
│ │ │  │    stdio    │    │ HTTP + SSE  │                 │ │ │
│ │ │  │  (Primary)  │    │ (Optional)  │                 │ │ │
│ │ │  └─────────────┘    └─────────────┘                 │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
│                              │                              │
│                              ▼                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                   Repository Manager                    │ │
│ │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │ │
│ │  │ Repository  │    │   Parser    │    │    File     │  │ │
│ │  │   Scanner   │    │   Engine    │    │   Watcher   │  │ │
│ │  └─────────────┘    └─────────────┘    └─────────────┘  │ │
│ │                                                         │ │
│ │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │ │
│ │  │    Bulk     │    │  Pipeline   │    │  Language   │  │ │
│ │  │   Indexer   │    │ Integration │    │   Parsers   │  │ │
│ │  └─────────────┘    └─────────────┘    └─────────────┘  │ │
│ └─────────────────────────────────────────────────────────┘ │
│                              │                              │
│                              ▼                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │         In-Memory Graph + Optional Persistence          │ │
│ │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │ │
│ │  │   DashMap   │    │  LRU Cache  │    │  Optional   │  │ │
│ │  │(Live Graph) │    │(Parsed AST) │    │ File Cache  │  │ │
│ │  └─────────────┘    └─────────────┘    └─────────────┘  │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Key Architectural Changes:
- ❌ Removed: Neo4j, Kafka, Redis (over-engineered for MCP)
- ✅ Added: JSON-RPC 2.0 transport layer (MCP requirement)
- ✅ Simplified: In-memory graph storage with optional persistence
- ✅ Optimized: stdio transport for fast MCP client integration
Core Principles
1. MCP Protocol Compliance
CodePrism strictly adheres to the Model Context Protocol specification:
- JSON-RPC 2.0: All communication uses proper JSON-RPC 2.0 format
- Initialization Handshake: Proper capability negotiation
- Resource/Tool/Prompt Standards: Exact specification compliance
- Transport Layer: stdio (primary) and HTTP+SSE (optional)
2. Simplicity Over Complexity
Based on MCP best practices:
- Local Execution: Fast, local processing without network dependencies
- Simple Storage: In-memory graph with optional file persistence
- Direct Access: No middleware layers or complex routing
- Fast Response: < 100ms for most operations
3. Graph-First Design
Maintains the core strength of CodePrism:
- Structural Understanding: Relationships between functions, classes, and modules
- Cross-Language Analysis: Unified representation across programming languages
- Efficient Queries: Graph traversal for code navigation and analysis
4. Real-Time Updates
Optimized for MCP client expectations:
- File Watching: Sub-second detection of changes
- Incremental Updates: Only changed components are updated
- Event Notifications: Optional real-time updates via SSE
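The incremental-update principle can be illustrated with a minimal sketch. It assumes a simplified, single-threaded graph (plain `HashMap` instead of `DashMap`) and a hypothetical `apply_file_change` method; neither the `Node` fields nor the method name are taken from the CodePrism source. When a file changes, only the nodes indexed under that file are dropped and replaced.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical, simplified node type; the real `Node` carries more data.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: u64,
    pub name: String,
}

/// Single-threaded stand-in for the graph store (assumption for this sketch).
#[derive(Default)]
pub struct Graph {
    nodes: HashMap<u64, Node>,
    file_index: HashMap<PathBuf, Vec<u64>>,
}

impl Graph {
    /// Replace everything that came from `path` with freshly parsed nodes.
    /// Nodes that belong to other files are left untouched.
    pub fn apply_file_change(&mut self, path: &Path, parsed: Vec<Node>) {
        // Drop the nodes previously extracted from this file.
        if let Some(old_ids) = self.file_index.remove(path) {
            for id in old_ids {
                self.nodes.remove(&id);
            }
        }
        // Insert the new nodes and record which file owns them.
        let ids: Vec<u64> = parsed.iter().map(|n| n.id).collect();
        for node in parsed {
            self.nodes.insert(node.id, node);
        }
        if !ids.is_empty() {
            self.file_index.insert(path.to_path_buf(), ids);
        }
    }

    pub fn node_count(&self) -> usize {
        self.nodes.len()
    }

    pub fn contains(&self, id: u64) -> bool {
        self.nodes.contains_key(&id)
    }
}
```

Edges are left out to keep the sketch short; a fuller version would also remove edges that start or end at the dropped nodes.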
MCP-Compliant Architecture
JSON-RPC 2.0 Transport Layer
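// MCP Message Format
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize)]
pub struct McpRequest {
pub jsonrpc: String, // Always "2.0"
pub id: serde_json::Value, // Request ID (number | string)
pub method: String, // MCP method name
#[serde(skip_serializing_if = "Option::is_none")]
pub params: Option<serde_json::Value>, // Method parameters (omitted when absent)
}
#[derive(Serialize, Deserialize)]
pub struct McpResponse {
pub jsonrpc: String, // Always "2.0"
pub id: serde_json::Value, // Matching request ID
// JSON-RPC 2.0: a response carries either `result` or `error`, never both,
// so the unused field is skipped instead of being serialized as null.
#[serde(skip_serializing_if = "Option::is_none")]
pub result: Option<serde_json::Value>, // Success result
#[serde(skip_serializing_if = "Option::is_none")]
pub error: Option<McpError>, // Error details
}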
// Transport Options
pub enum Transport {
Stdio, // Primary: stdin/stdout
Http { // Optional: HTTP + SSE
port: u16,
sse_endpoint: Option<String>,
},
}
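To make the stdio transport concrete, here is a minimal sketch of a read-dispatch-write loop. It relies on the MCP stdio convention that each JSON-RPC message occupies one line. The function name `serve_lines` and the string-based handler are assumptions for illustration: a real server would parse each line into `McpRequest` and serialize an `McpResponse`.

```rust
use std::io::{self, BufRead, Write};

/// Read newline-delimited messages from `input`, pass each one to `handle`,
/// and write any reply to `output` followed by a newline.
/// `handle` returns `None` for messages that need no reply (notifications).
/// Returns the number of messages handled.
pub fn serve_lines<R, W, F>(input: R, mut output: W, mut handle: F) -> io::Result<usize>
where
    R: BufRead,
    W: Write,
    F: FnMut(&str) -> Option<String>,
{
    let mut handled = 0;
    for line in input.lines() {
        let line = line?;
        let message = line.trim();
        if message.is_empty() {
            continue; // skip blank lines between messages
        }
        if let Some(reply) = handle(message) {
            output.write_all(reply.as_bytes())?;
            output.write_all(b"\n")?;
            output.flush()?; // the client waits on each reply, so flush eagerly
        }
        handled += 1;
    }
    Ok(handled)
}
```

Because the function is generic over `BufRead` and `Write`, the same loop can run against locked stdin/stdout in production and against in-memory buffers in tests.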
MCP Server Components
pub struct McpServer {
// MCP Core Components
capabilities: ServerCapabilities,
resources: ResourceManager,
tools: ToolManager,
prompts: PromptManager,
// Repository Components
repository: RepositoryManager,
transport: Transport,
// State Management
graph: Arc<DashMap<NodeId, Node>>,
edges: Arc<DashMap<NodeId, Vec<Edge>>>,
}
// MCP Capability Declaration
pub struct ServerCapabilities {
pub resources: ResourceCapabilities,
pub tools: ToolCapabilities,
pub prompts: PromptCapabilities,
pub sampling: Option<SamplingCapabilities>,
}
Resource Manager (MCP Resources)
pub struct ResourceManager {
repository_path: PathBuf,
supported_extensions: HashSet<String>,
}
impl ResourceManager {
// MCP: resources/list
pub async fn list_resources(&self) -> McpResult<ResourceList> {
// Return available resources with URIs like:
// - codeprism://repo/src/main.py (file content)
// - codeprism://graph/nodes (graph nodes)
// - codeprism://symbols/functions (code symbols)
}
// MCP: resources/read
pub async fn read_resource(&self, uri: &str) -> McpResult<ResourceContent> {
// Handle URIs and return appropriate content
match uri {
uri if uri.starts_with("codeprism://repo/") => self.read_file(uri).await,
uri if uri.starts_with("codeprism://graph/") => self.read_graph_data(uri).await,
uri if uri.starts_with("codeprism://symbols/") => self.read_symbols(uri).await,
_ => Err(McpError::InvalidResource(uri.to_string())),
}
}
}
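The prefix matching in `read_resource` can also be expressed as a small parser that turns a URI into a typed value before any I/O happens. The sketch below is one possible shape, not the project's implementation: the `ResourceUri` enum and `parse_resource_uri` function are hypothetical names, and errors are plain strings where the real code would use `McpError`.

```rust
/// The three resource families named in the URI examples above.
#[derive(Debug, PartialEq)]
pub enum ResourceUri<'a> {
    RepoFile(&'a str),
    Graph(&'a str),
    Symbols(&'a str),
}

/// Split a `codeprism://<family>/<path>` URI into its family and path.
pub fn parse_resource_uri(uri: &str) -> Result<ResourceUri<'_>, String> {
    let rest = uri
        .strip_prefix("codeprism://")
        .ok_or_else(|| format!("unsupported scheme: {}", uri))?;
    let (family, tail) = rest
        .split_once('/')
        .ok_or_else(|| format!("missing resource path: {}", uri))?;
    if tail.is_empty() {
        return Err(format!("missing resource path: {}", uri));
    }
    match family {
        "repo" => Ok(ResourceUri::RepoFile(tail)),
        "graph" => Ok(ResourceUri::Graph(tail)),
        "symbols" => Ok(ResourceUri::Symbols(tail)),
        other => Err(format!("unknown resource family: {}", other)),
    }
}
```

Parsing up front keeps validation in one place, so each reader only ever receives a path that belongs to its own family.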
Tool Manager (MCP Tools)
pub struct ToolManager {
tools: HashMap<String, Box<dyn McpTool>>,
graph: Arc<DashMap<NodeId, Node>>,
edges: Arc<DashMap<NodeId, Vec<Edge>>>,
}
#[async_trait]
pub trait McpTool: Send + Sync {
fn name(&self) -> &str;
fn description(&self) -> &str;
fn input_schema(&self) -> serde_json::Value; // JSON Schema
async fn call(&self, params: serde_json::Value) -> McpResult<ToolResult>;
}
// Example Tool Implementation
pub struct TracePathTool {
graph: Arc<DashMap<NodeId, Node>>,
edges: Arc<DashMap<NodeId, Vec<Edge>>>,
}
#[async_trait]
impl McpTool for TracePathTool {
fn name(&self) -> &str { "trace_path" }
fn description(&self) -> &str {
"Trace execution paths between code symbols"
}
fn input_schema(&self) -> serde_json::Value {
json!({
"type": "object",
"properties": {
"source": {"type": "string", "description": "Source symbol ID"},
"target": {"type": "string", "description": "Target symbol ID"},
"max_depth": {"type": "number", "default": 10}
},
"required": ["source", "target"]
})
}
async fn call(&self, _params: serde_json::Value) -> McpResult<ToolResult> {
// The path search over `self.edges` is not shown in this guide.
todo!("trace_path search")
}
}
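The guide does not say how `trace_path` searches the graph. One plausible approach, sketched here as an assumption, is a breadth-first search bounded by `max_depth`. The sketch uses plain `u64` node IDs and a `HashMap` adjacency list in place of `NodeId`, `Edge`, and `DashMap`.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first search from `source` to `target`, following at most
/// `max_depth` edges. Returns the node IDs along a shortest path, if any.
pub fn trace_path(
    edges: &HashMap<u64, Vec<u64>>,
    source: u64,
    target: u64,
    max_depth: usize,
) -> Option<Vec<u64>> {
    if source == target {
        return Some(vec![source]);
    }
    let mut parent: HashMap<u64, u64> = HashMap::new();
    let mut visited: HashSet<u64> = HashSet::from([source]);
    let mut queue: VecDeque<(u64, usize)> = VecDeque::from([(source, 0)]);

    while let Some((node, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not expand beyond the depth limit
        }
        if let Some(neighbours) = edges.get(&node) {
            for &next in neighbours {
                if !visited.insert(next) {
                    continue; // already reached by an equal or shorter route
                }
                parent.insert(next, node);
                if next == target {
                    // Walk the parent links back to the source.
                    let mut path = vec![target];
                    let mut current = target;
                    while let Some(&p) = parent.get(&current) {
                        path.push(p);
                        current = p;
                    }
                    path.reverse();
                    return Some(path);
                }
                queue.push_back((next, depth + 1));
            }
        }
    }
    None
}
```

The visited set keeps the search finite on cyclic call graphs, and the depth bound caps the work done per request.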
Prompt Manager (MCP Prompts)
pub struct PromptManager {
prompts: HashMap<String, Box<dyn McpPrompt>>,
repository: Arc<RepositoryManager>,
}
#[async_trait]
pub trait McpPrompt: Send + Sync {
fn name(&self) -> &str;
fn description(&self) -> &str;
fn arguments(&self) -> Vec<PromptArgument>;
async fn generate(&self, args: HashMap<String, String>) -> McpResult<PromptResult>;
}
// Example: Repository Overview Prompt
pub struct RepoOverviewPrompt {
repository: Arc<RepositoryManager>,
}
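#[async_trait]
impl McpPrompt for RepoOverviewPrompt {
fn name(&self) -> &str { "repo_overview" }
fn description(&self) -> &str { "Comprehensive repository analysis" }
fn arguments(&self) -> Vec<PromptArgument> { Vec::new() } // this prompt takes no arguments
async fn generate(&self, _args: HashMap<String, String>) -> McpResult<PromptResult> {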
let stats = self.repository.get_statistics().await?;
let overview = format!(
"Repository Analysis:\n\
Total files: {}\n\
Languages: {:?}\n\
Code symbols: {} functions, {} classes\n\
...",
stats.total_files,
stats.languages,
stats.functions,
stats.classes
);
Ok(PromptResult {
description: "Comprehensive repository analysis".to_string(),
messages: vec![PromptMessage {
role: "user".to_string(),
content: TextContent { text: overview },
}],
})
}
}
Data Flow
1. MCP Client Connection
sequenceDiagram
participant Client
participant McpServer
participant Repository
Client->>McpServer: initialize (JSON-RPC 2.0)
McpServer->>McpServer: Load capabilities
McpServer->>Repository: Initialize repository
Repository->>Repository: Scan and index
McpServer->>Client: initialize response
Client->>McpServer: initialized notification
Note over Client,McpServer: Connection ready for use
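The handshake in this diagram amounts to a three-state session. The following sketch shows one way a server might enforce the ordering. `SessionState` and `on_message` are hypothetical names, and the rules are simplified (for example, it rejects every request that arrives before initialization completes).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SessionState {
    AwaitingInitialize,  // nothing received yet
    AwaitingInitialized, // `initialize` answered, waiting for the notification
    Ready,               // normal operation
}

/// Advance the session given an incoming method name.
/// Returns the new state, or an error if the message arrives out of order.
pub fn on_message(state: SessionState, method: &str) -> Result<SessionState, String> {
    match (state, method) {
        (SessionState::AwaitingInitialize, "initialize") => Ok(SessionState::AwaitingInitialized),
        (SessionState::AwaitingInitialized, "notifications/initialized") => Ok(SessionState::Ready),
        (SessionState::Ready, "initialize") => Err("session already initialized".to_string()),
        (SessionState::Ready, _) => Ok(SessionState::Ready),
        (_, other) => Err(format!("'{}' received before initialization completed", other)),
    }
}
```

Keeping the transition table in one function makes it easy to reject `resources/*` or `tools/*` calls that arrive before the client has sent its `initialized` notification.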
2. Resource Access
sequenceDiagram
participant Client
participant McpServer
participant ResourceManager
participant FileSystem
Client->>McpServer: resources/list
McpServer->>ResourceManager: list_resources()
ResourceManager->>FileSystem: scan directory
FileSystem->>ResourceManager: file list
ResourceManager->>McpServer: resource URIs
McpServer->>Client: resource list
Client->>McpServer: resources/read (codeprism://repo/file.py)
McpServer->>ResourceManager: read_resource()
ResourceManager->>FileSystem: read file
FileSystem->>ResourceManager: file content
ResourceManager->>McpServer: content + metadata
McpServer->>Client: resource content
3. Tool Execution
sequenceDiagram
participant Client
participant McpServer
participant ToolManager
participant Graph
Client->>McpServer: tools/list
McpServer->>ToolManager: list_tools()
ToolManager->>McpServer: available tools
McpServer->>Client: tool definitions
Client->>McpServer: tools/call (trace_path)
McpServer->>ToolManager: execute_tool()
ToolManager->>Graph: find_path()
Graph->>ToolManager: path result
ToolManager->>McpServer: tool result
McpServer->>Client: execution result
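The `tools/list` and `tools/call` steps above reduce to a lookup in a name-keyed registry. Below is a simplified, synchronous sketch of that dispatch. The `Tool` trait, `ToolRegistry`, and `EchoTool` are illustrative stand-ins for `McpTool` and `ToolManager`, with string payloads in place of JSON values.

```rust
use std::collections::HashMap;

/// Synchronous stand-in for the `McpTool` trait (no async, string payloads).
pub trait Tool {
    fn name(&self) -> &str;
    fn call(&self, params: &str) -> Result<String, String>;
}

#[derive(Default)]
pub struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    /// Register a tool under the name it reports.
    pub fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }

    /// Sorted tool names, as a `tools/list` handler might report them.
    pub fn list(&self) -> Vec<&str> {
        let mut names: Vec<&str> = self.tools.keys().map(String::as_str).collect();
        names.sort_unstable();
        names
    }

    /// Look the tool up by name and run it, as `tools/call` would.
    pub fn call(&self, name: &str, params: &str) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool.call(params),
            None => Err(format!("unknown tool: {}", name)),
        }
    }
}

/// Hypothetical tool used only to exercise the registry.
pub struct EchoTool;

impl Tool for EchoTool {
    fn name(&self) -> &str {
        "echo"
    }
    fn call(&self, params: &str) -> Result<String, String> {
        Ok(params.to_string())
    }
}
```

An unknown tool name yields an error value, not a panic, which the server can translate into a JSON-RPC error response.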
Storage Design
Simplified In-Memory Storage
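// Primary Graph Storage (In-Memory)
// The serde derives let the optional persistence layer write the store with bincode.
// They require dashmap's `serde` feature and serde's `rc` feature (for Arc).
#[derive(Serialize, Deserialize)]
pub struct GraphStore {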
nodes: Arc<DashMap<NodeId, Node>>,
edges: Arc<DashMap<NodeId, Vec<Edge>>>,
file_index: Arc<DashMap<PathBuf, Vec<NodeId>>>,
symbol_index: Arc<DashMap<String, Vec<NodeId>>>,
}
// Optional Persistence Layer
pub struct PersistenceLayer {
cache_dir: PathBuf,
enable_cache: bool,
}
impl PersistenceLayer {
pub async fn save_graph(&self, graph: &GraphStore) -> Result<()> {
// Optional: Save graph to disk for faster startup
if self.enable_cache {
let data = bincode::serialize(graph)?; // `graph` is already a reference
tokio::fs::write(self.cache_dir.join("graph.bin"), data).await?;
}
Ok(())
}
pub async fn load_graph(&self) -> Result<Option<GraphStore>> {
// Optional: Load cached graph from disk
if self.enable_cache && self.cache_dir.join("graph.bin").exists() {
let data = tokio::fs::read(self.cache_dir.join("graph.bin")).await?;
let graph = bincode::deserialize(&data)?;
Ok(Some(graph))
} else {
Ok(None)
}
}
}
Performance Optimizations
// LRU Cache for Parsed ASTs
pub struct ParseCache {
cache: Arc<Mutex<lru::LruCache<PathBuf, ParseResult>>>,
max_size: usize,
}
// Memory Management
pub struct MemoryManager {
max_nodes: usize,
max_memory: usize,
cleanup_threshold: f64,
}
impl MemoryManager {
pub fn should_cleanup(&self, current_nodes: usize, current_memory: usize) -> bool {
current_nodes > (self.max_nodes as f64 * self.cleanup_threshold) as usize ||
current_memory > (self.max_memory as f64 * self.cleanup_threshold) as usize
}
pub fn cleanup_strategy(&self) -> CleanupStrategy {
// Remove least recently used nodes/edges
CleanupStrategy::LeastRecentlyUsed
}
}
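The least-recently-used behaviour behind `ParseCache` and `CleanupStrategy::LeastRecentlyUsed` can be sketched with the standard library alone. This is an illustration of the eviction policy, not the `lru` crate's implementation: it tracks a logical clock per entry and scans for the oldest one, which is O(n) per eviction but keeps the example short.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Minimal LRU cache keyed by file path (illustrative stand-in for `lru::LruCache`).
pub struct ParseCache<V> {
    entries: HashMap<PathBuf, (V, u64)>, // value plus last-use tick
    capacity: usize,
    tick: u64, // logical clock, bumped on every access
}

impl<V> ParseCache<V> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        ParseCache { entries: HashMap::new(), capacity, tick: 0 }
    }

    /// Look up an entry and mark it as most recently used.
    pub fn get(&mut self, path: &Path) -> Option<&V> {
        self.tick += 1;
        let tick = self.tick;
        match self.entries.get_mut(path) {
            Some(entry) => {
                entry.1 = tick;
                Some(&entry.0)
            }
            None => None,
        }
    }

    /// Insert an entry, evicting the least recently used one when full.
    pub fn put(&mut self, path: PathBuf, value: V) {
        self.tick += 1;
        if !self.entries.contains_key(&path) && self.entries.len() >= self.capacity {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (_, used))| *used)
                .map(|(key, _)| key.clone());
            if let Some(key) = oldest {
                self.entries.remove(&key);
            }
        }
        self.entries.insert(path, (value, self.tick));
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With a capacity of two, inserting `a.py` and `b.py`, reading `a.py`, and then inserting `c.py` evicts `b.py`, because it is the entry that was touched longest ago.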
Performance Architecture
MCP-Optimized Performance Targets
Target Metrics (project goals for MCP client responsiveness):
- Initialization: < 2s for a typical repository (1000 files)
- Resource Access: < 100ms per file read
- Tool Execution: < 500ms for complex queries
- Memory Usage: < 1GB for 10k nodes
- Update Latency: < 250ms for file changes
Performance Strategies:
- Lazy Loading: Only parse files when accessed
- Incremental Processing: Only update changed files
- Memory Limits: Automatic cleanup when limits reached
- Async Operations: Non-blocking I/O for all operations
Caching Strategy (Simplified)
┌─────────────────────────────────────────────────────────────┐
│                    MCP-Optimized Caching                    │
├─────────────────────────────────────────────────────────────┤
│  L1: In-Process Memory (Primary)                            │
│  ├─ Live Graph: DashMap (thread-safe)                       │
│  ├─ Parse Cache: LRU (recent files)                         │
│  └─ Query Cache: HashMap (common queries)                   │
├─────────────────────────────────────────────────────────────┤
│  L2: Optional File Cache (Secondary)                        │
│  ├─ Serialized Graph: bincode format                        │
│  ├─ Parse Results: msgpack format                           │
│  └─ Statistics: JSON format                                 │
└─────────────────────────────────────────────────────────────┘
Security Architecture
MCP Security Model
Based on MCP security requirements:
pub struct SecurityManager {
allowed_paths: Vec<PathBuf>,
file_access_limits: FileAccessLimits,
resource_permissions: ResourcePermissions,
}
#[derive(Debug)]
pub struct FileAccessLimits {
max_file_size: usize, // 10MB default
max_files_per_request: usize, // 100 default
allowed_extensions: HashSet<String>,
blocked_paths: Vec<PathBuf>, // .git, node_modules, etc.
}
impl SecurityManager {
pub fn validate_file_access(&self, path: &Path) -> SecurityResult<()> {
// Check if path is within allowed repository
if !self.is_path_allowed(path) {
return Err(SecurityError::PathNotAllowed(path.to_path_buf()));
}
// Check file size limits
if let Ok(metadata) = path.metadata() {
if metadata.len() > self.file_access_limits.max_file_size as u64 {
return Err(SecurityError::FileTooLarge);
}
}
// Check extension whitelist
if let Some(ext) = path.extension() {
if !self.file_access_limits.allowed_extensions.contains(ext.to_str().unwrap_or("")) {
return Err(SecurityError::ExtensionNotAllowed);
}
}
Ok(())
}
}
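`validate_file_access` depends on `is_path_allowed`, which is not shown. A minimal sketch of one way to enforce the repository boundary follows. `resolve_in_repo` is a hypothetical helper that works purely lexically: it never touches the filesystem, so a real implementation would likely also canonicalize the result to handle symlinks.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `requested` against `root` without touching the filesystem.
/// Rejects absolute paths, `..` segments, and any blocked directory name.
pub fn resolve_in_repo(root: &Path, requested: &str, blocked: &[&str]) -> Result<PathBuf, String> {
    let mut resolved = root.to_path_buf();
    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => {
                let name = part
                    .to_str()
                    .ok_or_else(|| "non-UTF-8 path segment".to_string())?;
                if blocked.contains(&name) {
                    return Err(format!("blocked path segment: {}", name));
                }
                resolved.push(name);
            }
            Component::CurDir => {} // "./" is harmless
            Component::ParentDir => return Err("'..' is not allowed".to_string()),
            Component::RootDir | Component::Prefix(_) => {
                return Err("absolute paths are not allowed".to_string())
            }
        }
    }
    Ok(resolved)
}
```

Rejecting `..` outright is stricter than normalizing it, but it leaves no path-arithmetic edge cases through which a request could escape the repository root.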
Privacy Controls (MCP Requirements)
- Repository Boundaries: Strict containment within specified paths
- File System Permissions: Respects OS access controls
- No External Network: Pure local analysis
- User Consent: Clear indication of access scope
- Data Minimization: Only process requested files
Deployment Architecture
MCP Client Integration
# Claude Desktop Configuration
# ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
{
"mcpServers": {
"codeprism": {
"command": "codeprism",
"args": ["serve", "/path/to/repository"],
"env": {
"PRISM_LOG_LEVEL": "info",
"PRISM_CACHE_ENABLED": "true"
}
}
}
}
# Cursor Configuration
# .vscode/settings.json
{
"mcp.servers": [
{
"name": "codeprism",
"command": ["codeprism", "serve", "."],
"capabilities": ["resources", "tools", "prompts"]
}
]
}
Development Environment
# docker-compose.yml (Optional - for development)
version: '3.8'
services:
  codeprism-dev:
    build: .
    environment:
      RUST_LOG: debug
      PRISM_REPOSITORY_PATH: /workspace
    volumes:
      - ./:/workspace
    command: ["codeprism", "serve", "/workspace", "--http", "--port", "8080"]
    ports:
      - "8080:8080"
Production Deployment (Simplified)
# Single Binary Deployment
curl -L https://github.com/org/codeprism/releases/latest/download/codeprism-linux-x64 -o codeprism
chmod +x codeprism
# Configure MCP Client
codeprism configure --client claude-desktop --repository /path/to/repo
# Start as daemon (optional)
codeprism daemon /path/to/repo --log-level info
Monitoring (Simplified)
// Built-in Metrics
pub struct Metrics {
pub requests_total: Counter,
pub request_duration: Histogram,
pub active_connections: Gauge,
pub memory_usage: Gauge,
}
// Health Check Endpoint (HTTP mode only)
#[derive(Serialize)]
pub struct HealthStatus {
pub status: String,
pub uptime: Duration,
pub repository_path: PathBuf,
pub nodes_count: usize,
pub memory_usage: usize,
}
Conclusion
This MCP-compliant architecture gives CodePrism a robust, performant foundation that:
- Meets MCP Requirements: Full JSON-RPC 2.0 compliance with proper transport
- Optimizes for Simplicity: Removed unnecessary complexity for better performance
- Enables Client Integration: Works directly with MCP clients such as Claude Desktop, Cursor, and VS Code
- Maintains Core Strengths: Graph-first intelligence with real-time updates
- Ensures Security: Proper boundaries and permission controls
The simplified architecture delivers the same graph-based code intelligence capabilities while ensuring seamless integration with the rapidly growing MCP ecosystem.