# Writing Effective Tests for Rustic AI Agents
This guide provides practical advice and patterns for writing effective tests for Rustic AI agents. It builds on the concepts in the Testing Agents guide, with a focus on real-world testing examples from the Rustic AI codebase.
## Key Testing Patterns in Rustic AI
When examining tests across the Rustic AI ecosystem (including specialized agents like PlaywrightAgent, SERPAgent, and LiteLLMAgent), several effective patterns emerge:
- Isolation testing with `wrap_agent_for_testing`
- Integration testing with probe agents
- Targeted mocking of external service dependencies
- Async testing with `asyncio` for agents that perform asynchronous operations
- Environment setup and cleanup using pytest fixtures
- Conditional tests that skip when external service credentials aren't available
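
The examples throughout this guide take a `generator` fixture, which supplies the ID generator used by `wrap_agent_for_testing` and by `Message` construction. A minimal `conftest.py` sketch, assuming `GemstoneGenerator` is importable from `rustic_ai.core.utils.gemstone_id` (adjust the import path to match your version of the codebase):

```python
# conftest.py -- a minimal sketch; the GemstoneGenerator import path is an
# assumption and may differ across rustic-ai versions.
import pytest

from rustic_ai.core.utils.gemstone_id import GemstoneGenerator


@pytest.fixture
def generator():
    # The machine ID (1) is arbitrary; any small integer works for tests
    return GemstoneGenerator(1)
```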
## Testing External Service Agents
Many Rustic AI agents integrate with external services like APIs, databases, or other tools. Here's a pattern for testing such agents:
```python
import os
from unittest.mock import patch

import pytest

from rustic_ai.core.guild.builders import AgentBuilder
from rustic_ai.core.messaging.core.message import AgentTag, Message
from rustic_ai.core.utils.priority import Priority
from rustic_ai.core.utils.basic_class_utils import get_qualified_class_name
from rustic_ai.testing.helpers import wrap_agent_for_testing
from rustic_ai.your_module.agent import YourServiceAgent, ServiceRequest, ServiceResponse


class TestYourServiceAgent:
    # Test with the real API if credentials are available
    @pytest.mark.skipif(
        os.getenv("YOUR_API_KEY") is None,
        reason="YOUR_API_KEY environment variable not set",
    )
    def test_with_real_api(self, generator):
        # Create and wrap the agent
        agent, results = wrap_agent_for_testing(
            AgentBuilder(YourServiceAgent)
            .set_name("TestServiceAgent")
            .set_id("test_agent")
            .set_description("Test Service Agent")
            .build(),
            generator,
        )

        # Create a test request message
        request = Message(
            topics="default_topic",
            sender=AgentTag(id="testerId", name="tester"),
            format=get_qualified_class_name(ServiceRequest),
            payload={"param1": "value1", "param2": "value2"},
            id_obj=generator.get_id(Priority.NORMAL),
        )

        # Send the message
        agent._on_message(request)

        # Assert on the results
        assert len(results) == 1
        assert results[0].in_response_to == request.id

        # Validate the response
        response = ServiceResponse.model_validate(results[0].payload)
        assert response.success is True
        assert response.data is not None

    # Test with a mocked API
    def test_with_mocked_api(self, generator):
        # Patch the client where the agent looks it up
        with patch("rustic_ai.your_module.agent.Client.call_api") as mock_api:
            # Configure the mock
            mock_api.return_value = {"result": "mocked_data"}

            # Create and wrap the agent
            agent, results = wrap_agent_for_testing(
                AgentBuilder(YourServiceAgent)
                .set_name("TestServiceAgent")
                .set_id("test_agent")
                .set_description("Test Service Agent")
                .build(),
                generator,
            )

            # Create a test request message
            request = Message(
                topics="default_topic",
                sender=AgentTag(id="testerId", name="tester"),
                format=get_qualified_class_name(ServiceRequest),
                payload={"param1": "value1", "param2": "value2"},
                id_obj=generator.get_id(Priority.NORMAL),
            )

            # Send the message
            agent._on_message(request)

            # Assert the mock was called correctly
            mock_api.assert_called_once_with("value1", "value2")

            # Assert on the results
            assert len(results) == 1
            response = ServiceResponse.model_validate(results[0].payload)
            assert response.data["result"] == "mocked_data"
```
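
For context, `ServiceRequest` and `ServiceResponse` above are hypothetical models. A minimal sketch consistent with the test's assertions (Pydantic v2, given the `model_validate` calls):

```python
# Hypothetical request/response models matching the assertions above.
from typing import Any, Optional

from pydantic import BaseModel


class ServiceRequest(BaseModel):
    param1: str
    param2: str


class ServiceResponse(BaseModel):
    success: bool
    data: Optional[dict[str, Any]] = None
```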
## Testing Asynchronous Agents
For agents that perform asynchronous operations (like web scraping or API calls), your tests need to account for the asynchronous nature:
```python
import asyncio

import pytest

from rustic_ai.core.guild.builders import AgentBuilder
from rustic_ai.testing.helpers import wrap_agent_for_testing
from rustic_ai.your_module.agent import AsyncAgent


class TestAsyncAgent:
    @pytest.mark.asyncio  # Requires pytest-asyncio
    async def test_async_operation(self, generator):
        # Create and wrap the agent
        agent, results = wrap_agent_for_testing(
            AgentBuilder(AsyncAgent)
            .set_name("TestAsyncAgent")
            .set_id("test_agent")
            .build(),
            generator,
        )

        # Create and send a test message
        # ... code to create message ...
        agent._on_message(message)

        # Wait for async operations to complete.
        # This is important: the agent's coroutines need time to run.
        tries = 0
        while True:
            await asyncio.sleep(0.5)  # Wait a bit
            tries += 1
            # Exit when we get the expected results or time out
            if len(results) > 0 or tries > 10:
                break

        # Now perform assertions
        assert len(results) > 0
        # ... more specific assertions ...
```
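
This wait-and-poll pattern recurs in most async agent tests, so it is worth factoring into a small helper. A sketch (`await_results` is our own utility name, not a Rustic AI API):

```python
# A reusable polling helper -- our own utility, not part of rustic-ai.
import asyncio


async def await_results(results: list, count: int = 1,
                        max_tries: int = 10, interval: float = 0.5) -> None:
    """Poll until `results` holds at least `count` messages, or give up."""
    for _ in range(max_tries):
        if len(results) >= count:
            return
        await asyncio.sleep(interval)


# Inside an async test body:
#     await await_results(results)
#     assert len(results) > 0
```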
## Testing with External Dependencies
Agents often depend on external services. You can inject mock or real dependencies during testing:
```python
from rustic_ai.core.guild.agent_ext.depends.dependency_resolver import DependencySpec
from rustic_ai.core.guild.builders import AgentBuilder
from rustic_ai.testing.helpers import wrap_agent_for_testing


def test_agent_with_dependencies(generator):
    # Define test dependencies
    filesystem = DependencySpec(
        class_name="rustic_ai.core.guild.agent_ext.depends.filesystem.FileSystemResolver",
        properties={
            "path_base": "/tmp/test",
            "protocol": "file",
            "storage_options": {"auto_mkdir": True},
        },
    )

    # Create and wrap the agent with dependencies
    agent, results = wrap_agent_for_testing(
        AgentBuilder(YourAgent)
        .set_id("test_agent")
        .set_name("TestAgent")
        .build(),
        generator,
        {"filesystem": filesystem},  # Inject dependencies here
    )

    # Create and send a test message
    # ... code to create and send message ...

    # You can also resolve the injected dependency and assert against it directly
    fs = filesystem.to_resolver().resolve(agent.guild_id, "GUILD_GLOBAL")
    assert fs.exists(some_path)
```
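
Hard-coding `/tmp/test` can leak state between runs. A fixture that builds the spec on pytest's built-in `tmp_path` keeps each test isolated; a sketch using the same `DependencySpec` shape as above:

```python
# A sketch: build the filesystem DependencySpec on a per-test temp directory.
import pytest

from rustic_ai.core.guild.agent_ext.depends.dependency_resolver import DependencySpec


@pytest.fixture
def filesystem_spec(tmp_path):
    return DependencySpec(
        class_name="rustic_ai.core.guild.agent_ext.depends.filesystem.FileSystemResolver",
        properties={
            "path_base": str(tmp_path),  # unique per test, cleaned up by pytest
            "protocol": "file",
            "storage_options": {"auto_mkdir": True},
        },
    )
```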
## Testing Agent Error Handling
Good agent tests verify not only correct behavior but also proper error handling:
```python
from unittest.mock import patch

from rustic_ai.core.guild.builders import AgentBuilder
from rustic_ai.testing.helpers import wrap_agent_for_testing


def test_error_handling(generator):
    # Create and wrap the agent
    agent, results = wrap_agent_for_testing(
        AgentBuilder(YourAgent)
        .set_name("TestAgent")
        .set_id("test_agent")
        .build(),
        generator,
    )

    # 1. Test with invalid input
    invalid_message = create_message_with_invalid_payload(generator)
    agent._on_message(invalid_message)

    # Verify the agent produces an appropriate error response
    assert len(results) == 1
    assert results[0].is_error_message
    assert "Invalid input" in results[0].payload.get("error", "")

    # 2. Test when a dependency fails
    with patch("some_dependency.method") as mock_dep:
        mock_dep.side_effect = Exception("Dependency failed")

        results.clear()  # Clear previous results
        valid_message = create_valid_message(generator)
        agent._on_message(valid_message)

        # Verify the agent handles the dependency failure gracefully
        assert len(results) == 1
        assert results[0].is_error_message
        assert "Dependency failed" in results[0].payload.get("error", "")
```
## Integration Testing with Probe Agent

The `ProbeAgent` is a powerful tool for testing agent interactions within a guild:
```python
import time

import pytest

from rustic_ai.core.agents.testutils.probe_agent import ProbeAgent
from rustic_ai.core.guild.builders import AgentBuilder, GuildBuilder


@pytest.fixture
def test_guild():
    # Create a guild for testing, with a probe agent included
    guild = GuildBuilder("test_guild", "Test Guild", "Guild for testing").launch(add_probe=True)

    # Add your test agents
    agent1_spec = AgentBuilder(Agent1Class).set_name("Agent1").build_spec()
    agent2_spec = AgentBuilder(Agent2Class).set_name("Agent2").build_spec()
    guild.launch_agent(agent1_spec)
    guild.launch_agent(agent2_spec)

    # Get the probe agent
    probe_agent = guild.get_agent_of_type(ProbeAgent)

    # Return the guild and probe agent
    try:
        yield guild, probe_agent
    finally:
        # Always clean up
        guild.shutdown()


def test_agent_interaction(test_guild):
    guild, probe_agent = test_guild

    # Send an initial message to trigger the interaction
    probe_agent.publish("default_topic", StartMessage(data="test"))

    # Wait for messages to be processed
    time.sleep(1)

    # Get all messages captured by the probe
    messages = probe_agent.get_messages()

    # Verify the expected sequence of interactions
    assert len(messages) >= 2
    assert messages[0].sender.id == "agent1"
    assert messages[1].sender.id == "agent2"
    # ... more detailed assertions about message content ...
```
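
The fixed `time.sleep(1)` is itself a race-condition candidate (see the pitfalls below). A polling variant, sketched here as our own helper rather than a Rustic AI API:

```python
# A sketch: poll the probe instead of sleeping for a fixed interval.
import time


def wait_for_probe_messages(probe_agent, count: int,
                            timeout: float = 5.0, interval: float = 0.1):
    """Return the probe's messages once at least `count` arrive, or on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        messages = probe_agent.get_messages()
        if len(messages) >= count:
            return messages
        time.sleep(interval)
    return probe_agent.get_messages()  # let the caller's assertions report the shortfall
```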
## Testing LLM-based Agents

For agents that use language models (like LiteLLMAgent), you can use the `mock_response` pattern:
```python
import time

from rustic_ai.core.guild.agent_ext.depends.llm.models import (
    ChatCompletionRequest,
    SystemMessage,
    UserMessage,
)
from rustic_ai.core.guild.builders import AgentBuilder


def test_llm_agent(probe_agent, guild):
    # Create the LLM agent and add it to the guild
    agent = (
        AgentBuilder(LLMAgent)
        .set_name("Test LLM Agent")
        .set_id("llm_agent")
        .build()
    )
    guild._add_local_agent(agent)

    # Create a request with a mock response to avoid real API calls
    chat_completion_request = ChatCompletionRequest(
        messages=[UserMessage(content="What is the capital of France?")],
        mock_response="The capital of France is Paris.",  # Mocked LLM output
    )

    # Send the request
    probe_agent.publish_dict(
        guild.DEFAULT_TOPIC,
        chat_completion_request,
        format=ChatCompletionRequest,
    )

    time.sleep(1)  # Wait for processing

    # Check the results
    messages = probe_agent.get_messages()
    assert len(messages) == 1
    assert "Paris" in messages[0].payload["choices"][0]["message"]["content"]
```
## Common Testing Pitfalls and Solutions

### 1. Race Conditions in Async Tests

**Problem:** Tests sometimes pass, sometimes fail due to timing issues.

**Solution:** Use `asyncio.sleep()` in a retry loop to wait for operations to complete:
```python
# Instead of a single fixed sleep, poll until the expected state or a timeout:
tries = 0
while True:
    await asyncio.sleep(0.2)
    tries += 1
    # Exit condition based on expected state
    if len(results) > 0 or tries > 10:  # Time out after 10 tries
        break
```
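
A variant that fails loudly instead of silently timing out wraps the poll in `asyncio.wait_for`; a sketch (`poll_until` is our own helper):

```python
# A sketch: a hang raises TimeoutError rather than falling through to
# an assertion on stale state.
import asyncio


async def poll_until(predicate, interval: float = 0.2) -> None:
    """Sleep in short intervals until `predicate()` becomes true."""
    while not predicate():
        await asyncio.sleep(interval)


# Inside an async test body:
#     await asyncio.wait_for(poll_until(lambda: len(results) > 0), timeout=5.0)
```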
### 2. Resource Cleanup in Tests

**Problem:** Tests leave behind resources that affect other tests.

**Solution:** Use pytest fixtures with cleanup:
```python
@pytest.fixture
def resource_fixture():
    # Setup
    resource = create_resource()

    yield resource  # Provide the resource to the test

    # Cleanup (always runs, even if the test fails)
    resource.cleanup()
```
### 3. External Service Dependency

**Problem:** Tests fail when run without access to external services.

**Solution:** Use `pytest.mark.skipif` to conditionally skip tests:
```python
@pytest.mark.skipif(
    os.getenv("REQUIRED_API_KEY") is None,
    reason="API key not available",
)
def test_with_external_service():
    # Test logic that requires the external service
    ...
```
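
When every test in a module needs the same credential, pytest's `pytestmark` applies the skip once at module level:

```python
# Skip an entire test module when the credential is missing.
import os

import pytest

pytestmark = pytest.mark.skipif(
    os.getenv("REQUIRED_API_KEY") is None,
    reason="API key not available",
)
```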
## Best Practices Summary

- **Parametrize Common Tests**: Use `@pytest.mark.parametrize` to run the same test with different inputs (see the sketch after this list).
- **Keep Dependencies Consistent**: Use the same dependency structure in tests as in production code.
- **Test Failure Modes**: Don't just test the happy path; test how your agent handles errors.
- **Mock External Services**: Use `unittest.mock.patch` to replace external API calls.
- **Combine Unit and Integration Tests**: Test components in isolation first, then together.
- **Use Descriptive Test Names**: Make your test names describe the behavior being tested.
- **Isolate Test State**: Ensure tests don't interfere with each other's state.
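
The parametrization sketch referenced in the first bullet, reusing the hypothetical `YourServiceAgent` and payload shape from earlier:

```python
# A sketch: one test body run across several payload variants.
import pytest

from rustic_ai.core.guild.builders import AgentBuilder
from rustic_ai.testing.helpers import wrap_agent_for_testing


@pytest.mark.parametrize(
    "payload",
    [
        {"param1": "value1", "param2": "value2"},
        {"param1": "", "param2": "value2"},          # empty input
        {"param1": "value1", "param2": "x" * 1024},  # oversized input
    ],
)
def test_agent_handles_varied_payloads(generator, payload):
    agent, results = wrap_agent_for_testing(
        AgentBuilder(YourServiceAgent)
        .set_name("TestServiceAgent")
        .set_id("test_agent")
        .build(),
        generator,
    )
    # ... build a Message with `payload` and assert on `results` as shown earlier ...
```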
## Real-World Examples from Rustic AI
To see these patterns in action, examine the test suites for these agents:
- PlaywrightAgent: Tests asynchronous web scraping with injected filesystem dependency
- SERPAgent: Tests API calls with both real and mocked services
- LiteLLMAgent: Tests LLM integration with mock responses
These examples demonstrate how to effectively test agents with different requirements and complexity levels.
## Conclusion
Effective testing is crucial for building reliable agent-based systems. By following these patterns and guidelines, you can create tests that validate your agents' behavior, handle edge cases, and catch regressions early. Remember that good tests verify not only that the right thing happens when the right input is provided, but also that the right thing happens when something goes wrong.