# Developer Guide
This guide covers the development workflow, architecture, and best practices for contributing to CodePrism.
## Table of Contents

- Development Setup
- Project Structure
- Development Workflow
- Testing
- Code Style
- Architecture Overview
- Adding Language Support
- Debugging
- Performance
## Development Setup

### Prerequisites

- Rust: 1.82+ with `rustfmt` and `clippy`
- Docker: For development services (Neo4j, Kafka)
- Make: For build automation
- Git: Version control
### Initial Setup

```bash
# Clone the repository
git clone https://github.com/rustic-ai/codeprism
cd codeprism

# Install Rust toolchain components
rustup component add rustfmt clippy

# Install development tools
cargo install cargo-tarpaulin   # Code coverage
cargo install cargo-watch       # File watching
cargo install cargo-expand      # Macro expansion

# Start development services
make dev-up

# Verify setup
make check
```
### Development Services

The project uses Docker Compose for development dependencies:

```yaml
# docker-compose.yml
services:
  neo4j:
    image: neo4j:5.15
    ports: ["7474:7474", "7687:7687"]
    environment:
      NEO4J_AUTH: neo4j/password

  kafka:
    image: confluentinc/cp-kafka:latest
    ports: ["9092:9092"]

  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
```

Start services:

```bash
make dev-up    # Start all services
make dev-down  # Stop all services
make dev-logs  # View service logs
```
## Project Structure

```text
codeprism/
├── crates/                        # Rust workspace crates
│   ├── codeprism/                 # Core library
│   │   ├── src/
│   │   │   ├── lib.rs             # Public API exports
│   │   │   ├── ast/               # Universal AST types
│   │   │   ├── parser/            # Parser engine
│   │   │   ├── patch/             # Graph patch system
│   │   │   ├── watcher/           # File system watcher
│   │   │   └── error.rs           # Error types
│   │   ├── tests/                 # Integration tests
│   │   └── Cargo.toml
│   │
│   ├── codeprism-lang-js/         # JavaScript/TypeScript parser
│   │   ├── src/
│   │   │   ├── lib.rs             # Public API
│   │   │   ├── parser.rs          # Main parser logic
│   │   │   ├── ast_mapper.rs      # CST to U-AST conversion
│   │   │   ├── adapter.rs         # Integration adapter
│   │   │   ├── types.rs           # Language-specific types
│   │   │   └── error.rs           # Error handling
│   │   ├── tests/
│   │   │   ├── fixtures/          # Test files
│   │   │   └── integration_test.rs
│   │   ├── build.rs               # Build script
│   │   └── Cargo.toml
│   │
│   ├── codeprism-lang-python/     # Python parser (planned)
│   ├── codeprism-lang-java/       # Java parser (planned)
│   ├── codeprism-storage/         # Neo4j integration (planned)
│   ├── codeprism-bus/             # Kafka integration (planned)
│   └── codeprism-mcp/             # MCP server (planned)
│
├── docs/                          # Documentation
│   ├── DEVELOPER.md               # This file
│   ├── API.md                     # API documentation
│   ├── ARCHITECTURE.md            # System architecture
│   └── LANGUAGE_PARSERS.md        # Language parser guide
│
├── Cargo.toml                     # Workspace configuration
├── Makefile                       # Build automation
├── docker-compose.yml             # Development services
└── README.md                      # Project overview
```

(CLI and daemon components have been removed.)
### Crate Organization

Each crate follows Rust conventions:

- `src/lib.rs`: Public API and re-exports
- `src/error.rs`: Error types using `thiserror`
- `src/types.rs`: Core data structures
- `tests/`: Integration tests
- `benches/`: Performance benchmarks (when needed)
- `examples/`: Usage examples
## Development Workflow

### Daily Development

```bash
# Start file watcher for continuous testing
cargo watch -x "test --all"

# Run specific crate tests
cargo test -p codeprism-lang-js

# Check code formatting and linting
make check

# Generate documentation
make doc

# Run benchmarks
cargo bench
```
### Making Changes

1. Create a feature branch:

   ```bash
   git checkout -b feature/new-language-parser
   ```

2. Write tests first (TDD approach):

   ```bash
   # Add test cases
   cargo test --test integration_test -- --nocapture
   ```

3. Implement the feature:

   ```bash
   # Use cargo-expand to debug macros
   cargo expand --package codeprism-lang-js
   ```

4. Verify quality:

   ```bash
   make check     # Format, lint, test
   make coverage  # Generate coverage report
   ```

5. Update documentation:

   ```bash
   cargo doc --no-deps --open
   ```
### Code Quality Checks

The project enforces quality through:

```bash
# Formatting (required)
cargo fmt --all

# Linting (required)
cargo clippy --all-targets --all-features -- -D warnings

# Testing (required)
cargo test --all

# Coverage (target: 80%+)
cargo tarpaulin --out Html --all-features

# Documentation (required for public APIs)
cargo doc --no-deps
```
## Testing

### Test Organization

```text
tests/
├── fixtures/                # Test data files
│   ├── simple.js            # Basic JavaScript
│   ├── typescript.ts        # TypeScript features
│   └── imports.js           # Import/export patterns
├── integration_test.rs      # End-to-end tests
└── common/                  # Test utilities
    └── mod.rs
```
### Test Categories

- Unit Tests: In `src/` files using `#[cfg(test)]`
- Integration Tests: In the `tests/` directory
- Documentation Tests: In doc comments
- Benchmark Tests: In the `benches/` directory
### Writing Tests

```rust
// Unit test example
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_node_id_generation() {
        let span = Span::new(0, 10, 1, 1, 1, 11);
        let id1 = NodeId::new("repo", Path::new("file.js"), &span, &NodeKind::Function);
        let id2 = NodeId::new("repo", Path::new("file.js"), &span, &NodeKind::Function);
        assert_eq!(id1, id2); // Same inputs = same ID
    }
}

// Integration test example
#[test]
fn test_parse_real_file() {
    let fixture_path = get_fixture_path("complex.js");
    let content = fs::read_to_string(&fixture_path).unwrap();

    let mut parser = JavaScriptParser::new();
    let context = ParseContext {
        repo_id: "test".to_string(),
        file_path: fixture_path,
        old_tree: None,
        content,
    };

    let result = parser.parse(&context).unwrap();

    // Verify expected nodes
    assert!(result.nodes.iter().any(|n| n.name == "expectedFunction"));
}
```
### Test Fixtures

Create realistic test files in `tests/fixtures/`:

```jsx
// tests/fixtures/react-component.jsx
import React, { useState } from 'react';

export function Counter({ initialValue = 0 }) {
  const [count, setCount] = useState(initialValue);

  const increment = () => setCount(c => c + 1);

  return (
    <div>
      <span>Count: {count}</span>
      <button onClick={increment}>+</button>
    </div>
  );
}
```
### Coverage Requirements

- Overall: 80%+ test coverage
- Core crates: 85%+ coverage
- Language parsers: 80%+ coverage
- Critical paths: 95%+ coverage

Generate coverage reports:

```bash
make coverage
open tarpaulin-report.html
```
## Code Style

### Rust Style Guide

Follow the Rust Style Guide with these additions:

- Error Handling: Use `thiserror` for error types
- Async Code: Use `tokio` for the async runtime
- Serialization: Use `serde` with appropriate derives
- Documentation: Document all public APIs
### Error Handling Pattern

```rust
use std::path::PathBuf;

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ParseError {
    #[error("Failed to parse {file}: {message}")]
    Parse { file: PathBuf, message: String },

    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),

    #[error("UTF-8 error: {0}")]
    Utf8(#[from] std::str::Utf8Error),
}

pub type Result<T> = std::result::Result<T, ParseError>;
```
### Documentation Standards

```rust
/// Parse a JavaScript or TypeScript file into a Universal AST.
///
/// This function performs incremental parsing when an old tree is provided,
/// which significantly improves performance for small edits.
///
/// # Arguments
///
/// * `context` - Parse context containing the file path, content, and optional old tree
///
/// # Returns
///
/// Returns a `ParseResult` containing the syntax tree, extracted nodes, and edges.
///
/// # Errors
///
/// Returns `ParseError` if:
/// - The file contains syntax errors
/// - The file encoding is invalid
/// - Tree-sitter fails to parse
///
/// # Examples
///
/// ```rust
/// use std::path::PathBuf;
/// use codeprism_lang_js::{JavaScriptParser, ParseContext};
///
/// let mut parser = JavaScriptParser::new();
/// let context = ParseContext {
///     repo_id: "my-repo".to_string(),
///     file_path: PathBuf::from("app.js"),
///     old_tree: None,
///     content: "function hello() {}".to_string(),
/// };
///
/// let result = parser.parse(&context)?;
/// assert!(!result.nodes.is_empty());
/// ```
pub fn parse(&mut self, context: &ParseContext) -> Result<ParseResult> {
    // Implementation...
}
```
## Architecture Overview

### Core Components

1. Universal AST (`codeprism::ast`):
   - Language-agnostic representation
   - Stable `NodeId` generation with Blake3
   - Serializable types

2. Parser Engine (`codeprism::parser`):
   - Language registry for parser plugins
   - Incremental parsing support
   - Thread-safe operation

3. File Watcher (`codeprism::watcher`):
   - Real-time file system monitoring
   - Debouncing for performance
   - Async event streams

4. Graph Patches (`codeprism::patch`):
   - Incremental graph updates
   - Serializable patch format
   - Batch operations
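The stable-`NodeId` property matters because repeated parses of unchanged code must yield identical ids, so graph patches stay minimal. The idea can be sketched with std only (the real crate hashes with Blake3; here std's `DefaultHasher` stands in, and `stable_id` with its parameter list is an illustrative stand-in, not the actual API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a deterministic id from the same kind of inputs `NodeId::new`
/// takes: repo, file path, span bytes, and node kind. Same inputs always
/// produce the same id; any change produces a different one.
fn stable_id(repo: &str, file: &str, start: u32, end: u32, kind: &str) -> u64 {
    let mut h = DefaultHasher::new();
    (repo, file, start, end, kind).hash(&mut h);
    h.finish()
}

fn main() {
    let a = stable_id("repo", "file.js", 0, 10, "Function");
    let b = stable_id("repo", "file.js", 0, 10, "Function");
    assert_eq!(a, b); // same inputs => same id across parses

    let c = stable_id("repo", "file.js", 0, 11, "Function");
    assert_ne!(a, c); // a changed span yields a new id
    println!("ok");
}
```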
### Data Flow

```text
File Change → Watcher → Parser → AST → Patch → Storage → MCP Server → LLM
```
### Thread Safety

All components are designed for concurrent access:

- Parser Engine: Uses `DashMap` for a thread-safe registry
- File Watcher: Async with `tokio`
- Language Parsers: Wrapped in a `Mutex` for safety
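To make the registry point concrete, here is a hypothetical std-only sketch (using `RwLock<HashMap>` in place of `DashMap`, with made-up `register`/`lookup` names) showing why cheap clones of the registry can be handed to worker threads:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Illustrative registry mapping file extensions to language names.
/// Cloning is cheap: clones share the same underlying map via `Arc`.
#[derive(Clone)]
struct LanguageRegistry {
    parsers: Arc<RwLock<HashMap<&'static str, &'static str>>>,
}

impl LanguageRegistry {
    fn new() -> Self {
        Self { parsers: Arc::new(RwLock::new(HashMap::new())) }
    }

    fn register(&self, ext: &'static str, lang: &'static str) {
        self.parsers.write().unwrap().insert(ext, lang);
    }

    fn lookup(&self, ext: &str) -> Option<&'static str> {
        self.parsers.read().unwrap().get(ext).copied()
    }
}

fn main() {
    let registry = LanguageRegistry::new();
    registry.register("js", "javascript");

    // A worker thread sees registrations made on the main thread.
    let handle = {
        let r = registry.clone();
        thread::spawn(move || r.lookup("js"))
    };
    assert_eq!(handle.join().unwrap(), Some("javascript"));
    println!("ok");
}
```

`DashMap` serves the same role but shards its locks internally, avoiding the single `RwLock` bottleneck under contention.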
## Adding Language Support

### 1. Create a New Crate

```bash
# Create crate structure
mkdir crates/codeprism-lang-python
cd crates/codeprism-lang-python

# Initialize Cargo.toml
cat > Cargo.toml << EOF
[package]
name = "codeprism-lang-python"
version.workspace = true
edition.workspace = true

[dependencies]
tree-sitter.workspace = true
tree-sitter-python = "0.20"
# ... other dependencies
EOF
```
### 2. Implement the Parser

```rust
// src/parser.rs
use tree_sitter::{Parser, Tree};

pub struct PythonParser {
    parser: Parser,
}

impl PythonParser {
    pub fn new() -> Self {
        let mut parser = Parser::new();
        parser
            .set_language(&tree_sitter_python::language())
            .expect("Failed to load Python grammar");
        Self { parser }
    }

    pub fn parse(&mut self, context: &ParseContext) -> Result<ParseResult> {
        // Implementation similar to the JavaScript parser
        todo!()
    }
}
```
### 3. Implement the AST Mapper

```rust
// src/ast_mapper.rs
impl AstMapper {
    fn visit_node(&mut self, cursor: &TreeCursor) -> Result<()> {
        match cursor.node().kind() {
            "function_definition" => self.handle_function(cursor)?,
            "class_definition" => self.handle_class(cursor)?,
            "import_statement" => self.handle_import(cursor)?,
            // ... other node types
            _ => {}
        }
        Ok(())
    }
}
```
### 4. Add Tests

```rust
// tests/integration_test.rs
#[test]
fn test_parse_python_function() {
    let content = r#"
def greet(name: str) -> str:
    return f"Hello, {name}!"

class Person:
    def __init__(self, name: str):
        self.name = name
"#;

    let result = parse_python(content).unwrap();
    assert!(result.nodes.iter().any(|n| n.name == "greet"));
}
```
## Debugging

### Logging

Use `tracing` for structured logging:

```rust
use tracing::{debug, info, instrument};

#[instrument(skip(content))]
pub fn parse_file(path: &Path, content: &str) -> Result<ParseResult> {
    info!("Parsing file: {}", path.display());
    debug!("Content length: {} bytes", content.len());

    // ... parsing logic producing `result` (placeholder call)
    let result = do_parse(path, content)?;

    info!("Extracted {} nodes", result.nodes.len());
    Ok(result)
}
```
### Tree-Sitter Debugging

Debug tree-sitter parsing by printing the concrete syntax tree:

```rust
fn debug_tree_structure(parser: &mut Parser, content: &str) {
    let tree = parser.parse(content, None).unwrap();
    let mut cursor = tree.walk();

    fn print_tree(cursor: &mut TreeCursor, depth: usize) {
        let node = cursor.node();
        println!(
            "{}{} [{:?}]",
            "  ".repeat(depth),
            node.kind(),
            node.start_byte()..node.end_byte()
        );

        if cursor.goto_first_child() {
            loop {
                print_tree(cursor, depth + 1);
                if !cursor.goto_next_sibling() {
                    break;
                }
            }
            cursor.goto_parent();
        }
    }

    print_tree(&mut cursor, 0);
}
```
### Performance Profiling

Use `criterion` for benchmarking:

```rust
// benches/parse_benchmark.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_parse_large_file(c: &mut Criterion) {
    let content = include_str!("../tests/fixtures/large.js");
    let mut parser = JavaScriptParser::new();

    c.bench_function("parse_large_js", |b| {
        b.iter(|| {
            let context = ParseContext {
                repo_id: "bench".to_string(),
                file_path: PathBuf::from("large.js"),
                old_tree: None,
                content: black_box(content.to_string()),
            };
            parser.parse(&context).unwrap()
        })
    });
}

criterion_group!(benches, bench_parse_large_file);
criterion_main!(benches);
```
## Performance

### Optimization Guidelines

- Minimize Allocations: Use string slices where possible
- Batch Operations: Group related operations
- Cache Results: Cache expensive computations
- Profile Regularly: Use `cargo flamegraph`
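As an illustration of the first guideline, a function that returns `&str` slices borrowed from the source avoids one `String` allocation per extracted token (`first_identifier` is a hypothetical helper, not from the codebase):

```rust
/// Return the leading identifier-like token as a slice borrowed from
/// `source` — no allocation, unlike returning a fresh `String`.
fn first_identifier(source: &str) -> &str {
    let end = source
        .find(|c: char| !c.is_alphanumeric() && c != '_')
        .unwrap_or(source.len());
    &source[..end]
}

fn main() {
    let src = "function hello() {}";
    // The returned &str points into `src`; nothing was copied.
    assert_eq!(first_identifier(src), "function");
    println!("ok");
}
```

Tying node names to slices of the source does constrain lifetimes, so in practice parsers often borrow during extraction and copy only the names that survive into the graph.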
### Performance Targets

- Parse Speed: < 5µs per line of code
- Memory Usage: < 2GB for 10M nodes
- Update Latency: < 250ms for a typical file
- Query Response: < 1s for complex queries
### Monitoring

```rust
use std::time::Instant;

#[instrument]
pub fn parse_with_timing(&mut self, context: &ParseContext) -> Result<ParseResult> {
    let start = Instant::now();
    let result = self.parse(context)?;
    let duration = start.elapsed();

    let lines = context.content.lines().count();
    let us_per_line = duration.as_micros() as f64 / lines as f64;

    info!(
        "Parsed {} lines in {:?} ({:.2}µs/line)",
        lines, duration, us_per_line
    );

    Ok(result)
}
```
## Continuous Integration

The project uses GitHub Actions for CI:

```yaml
# .github/workflows/ci.yml
name: CI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo test --all
      - run: cargo clippy -- -D warnings
      - run: cargo fmt --check
```
### Pre-commit Hooks

Set up pre-commit hooks:

```bash
#!/bin/bash
# .git/hooks/pre-commit
set -e

echo "Running pre-commit checks..."
make check
echo "All checks passed!"
```
## Getting Help

- Documentation: `cargo doc --open`
- Issues: GitHub Issues for bugs and features
- Discussions: GitHub Discussions for questions
- Code Review: All changes require review
## Next Steps

- Read the API Documentation (coming soon)
- Review the Architecture Guide
- Try implementing a simple language parser
- Contribute to existing parsers or core functionality