RunwaymlStableDiffusionAgent¶

The RunwaymlStableDiffusionAgent is an image generation agent that creates images from text prompts using Stability AI's Stable Diffusion model.

Purpose¶

This agent provides text-to-image generation within a Rustic AI guild, turning natural language descriptions into images. By default it loads Stable Diffusion 3.5 Medium (stabilityai/stable-diffusion-3.5-medium) from Hugging Face and runs it locally.

When to Use¶

Use the RunwaymlStableDiffusionAgent when your application needs to:

Generate images from text descriptions
Create visual content based on textual prompts
Produce illustrations or artwork programmatically
Visualize concepts or ideas described in text
Add image generation capabilities to your AI system

Configuration¶

The RunwaymlStableDiffusionAgent is configured through the RunwaymlStableDiffusionProps class, which allows setting:

class RunwaymlStableDiffusionProps(PyTorchAgentProps):
    model_id: str = "stabilityai/stable-diffusion-3.5-medium"  # The model ID to load from Hugging Face

RunwaymlStableDiffusionProps inherits from the PyTorchAgentProps base class, which provides:

class PyTorchAgentProps(BaseAgentProps):
    torch_device: str = "cuda" if torch.cuda.is_available() else "cpu"  # Device to run the model on
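
The default above picks CUDA when it is available and falls back to CPU otherwise. A minimal sketch of that selection logic, written as a standalone helper, might look like the following. The function name `resolve_torch_device` is hypothetical and not part of Rustic AI; the lazy import is only there so the sketch runs on machines without PyTorch.

```python
from typing import Optional


def resolve_torch_device(requested: Optional[str] = None) -> str:
    """Return the requested device, or fall back to CUDA if available, else CPU.

    Hypothetical helper for illustration; it mirrors the default expression
    shown for torch_device but is not the agent's own code.
    """
    if requested:
        return requested
    try:
        # Imported lazily so the sketch also runs where PyTorch is not installed.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_torch_device("cuda:0"))  # an explicit choice is returned unchanged
print(resolve_torch_device())          # "cuda" or "cpu", depending on the machine
```

Passing an explicit value such as "cuda:0" is useful on multi-GPU hosts, where the automatic default does not say which GPU to use.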

Dependencies¶

The RunwaymlStableDiffusionAgent requires:

filesystem (Guild-level dependency): A file system implementation for storing generated images

Message Types¶

Input Messages¶

ImageGenerationRequest¶

A request to generate images:

class ImageGenerationRequest(BaseModel):
    generation_prompt: str  # The text prompt describing the desired image
    num_images: int = 1  # Number of images to generate
    guidance_scale: float = 7.5  # How closely to follow the prompt (higher = more faithful)
    num_inference_steps: int = 50  # Number of denoising steps (higher = better quality, slower)
    height: int = 1024  # Height of the generated image
    width: int = 1024  # Width of the generated image
    image_format: str = "png"  # Output format for the image
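
Because every field except generation_prompt has a default, it can help to sanity-check a request before publishing it. The sketch below uses a plain dataclass as a stand-in for ImageGenerationRequest so that it runs without Rustic AI installed. The names `ImageRequestSketch` and `check_request` are hypothetical, and the "multiple of 16" rule for height and width is an assumption about the underlying pipeline, not something the agent is documented to enforce.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImageRequestSketch:
    """Stand-in mirroring the ImageGenerationRequest fields and defaults listed above."""
    generation_prompt: str
    num_images: int = 1
    guidance_scale: float = 7.5
    num_inference_steps: int = 50
    height: int = 1024
    width: int = 1024
    image_format: str = "png"


def check_request(req: ImageRequestSketch, multiple: int = 16) -> List[str]:
    """Return a list of problems; an empty list means the request looks sane.

    The `multiple` constraint is an assumption made for this sketch.
    """
    problems = []
    if not req.generation_prompt.strip():
        problems.append("generation_prompt is empty")
    if req.num_images < 1:
        problems.append("num_images must be at least 1")
    if req.num_inference_steps < 1:
        problems.append("num_inference_steps must be at least 1")
    for name in ("height", "width"):
        value = getattr(req, name)
        if value <= 0 or value % multiple != 0:
            problems.append(f"{name}={value} is not a positive multiple of {multiple}")
    return problems


print(check_request(ImageRequestSketch("a lake at sunset")))
print(check_request(ImageRequestSketch("a lake at sunset", height=770)))
```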

Output Messages¶

ImageGenerationResponse¶

Sent when image generation is completed:

class ImageGenerationResponse(BaseModel):
    files: List[MediaLink]  # List of generated image files
    errors: List[str]  # Any errors that occurred during generation
    request: str  # The original request (JSON string)

Each MediaLink contains:

- url: Path to the generated image file
- name: Filename (UUID-based)
- mimetype: Content type (image/png)
- on_filesystem: Always True for generated images
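
A consumer of this message usually wants the image paths, any errors, and the original prompt. The sketch below works on a plain dict shaped like the response above, assuming the message has already been deserialized; the helper name `summarize_response` and the example values are hypothetical. Note that `request` is a JSON string and has to be decoded separately.

```python
import json


def summarize_response(payload: dict) -> dict:
    """Pull the image URLs, errors, and original prompt out of a response-shaped dict."""
    urls = [link["url"] for link in payload.get("files", [])]
    original_request = json.loads(payload["request"])  # `request` is a JSON string
    return {
        "urls": urls,
        "errors": list(payload.get("errors", [])),
        "prompt": original_request.get("generation_prompt"),
    }


# Illustrative payload; the filename is made up for this example.
example = {
    "files": [
        {"url": "3f2b.png", "name": "3f2b.png", "mimetype": "image/png", "on_filesystem": True}
    ],
    "errors": [],
    "request": json.dumps({"generation_prompt": "A serene forest lake", "num_images": 1}),
}

summary = summarize_response(example)
print(summary["urls"], summary["prompt"])
```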

ErrorMessage¶

Sent when image generation fails:

class ErrorMessage(BaseModel):
    agent_type: str
    error_type: str  # "ImageGenerationError"
    error_message: str

Behavior¶

The agent receives an ImageGenerationRequest with a text prompt
The prompt is processed through the Stable Diffusion pipeline
The specified number of images are generated with the requested parameters
The images are saved to files with generated UUID filenames
A MediaLink is created for each generated image
An ImageGenerationResponse containing all the image links is sent
If any errors occur, an ErrorMessage is sent
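
The steps above can be sketched roughly as follows. This is not the agent's implementation: the generator is a stub that returns placeholder bytes, files are written to a temporary directory instead of the guild's filesystem dependency, messages are plain dicts, and the `agent_type` value is a placeholder. It only illustrates the save, link, and respond-or-error flow.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import Callable, List


def handle_request(request: dict, generate: Callable[[dict], List[bytes]], out_dir: Path) -> dict:
    """Generate images, save each under a UUID filename, and return a response-shaped dict.

    On any exception, return an error-shaped dict instead.
    """
    try:
        images = generate(request)
        fmt = request.get("image_format", "png")
        files = []
        for data in images:
            name = f"{uuid.uuid4()}.{fmt}"
            (out_dir / name).write_bytes(data)
            files.append(
                {"url": name, "name": name, "mimetype": f"image/{fmt}", "on_filesystem": True}
            )
        return {"files": files, "errors": [], "request": json.dumps(request)}
    except Exception as exc:
        return {
            "agent_type": "RunwaymlStableDiffusionAgent",  # placeholder value
            "error_type": "ImageGenerationError",
            "error_message": str(exc),
        }


def fake_generate(request: dict) -> List[bytes]:
    """Stub generator: returns placeholder bytes instead of real image data."""
    return [b"placeholder"] * request.get("num_images", 1)


def failing_generate(request: dict) -> List[bytes]:
    """Stub generator that always fails, to exercise the error path."""
    raise RuntimeError("out of memory")


out_dir = Path(tempfile.mkdtemp())
ok = handle_request({"generation_prompt": "a lake", "num_images": 2}, fake_generate, out_dir)
failed = handle_request({"generation_prompt": "a lake"}, failing_generate, out_dir)
print(len(ok["files"]), failed["error_type"])
```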

Sample Usage¶

from rustic_ai.core.guild.builders import AgentBuilder
from rustic_ai.core.guild.agent_ext.depends.dependency_resolver import DependencySpec
from rustic_ai.huggingface.agents.diffusion.stable_diffusion_agent import (
    RunwaymlStableDiffusionAgent,
    RunwaymlStableDiffusionProps
)

# Define a file system dependency
filesystem = DependencySpec(
    class_name="rustic_ai.core.guild.agent_ext.depends.filesystem.FileSystemResolver",
    properties={
        "path_base": "/tmp",
        "protocol": "file",
        "storage_options": {
            "auto_mkdir": True,
        },
    },
)

# Create the agent spec
sd_agent_spec = (
    AgentBuilder(RunwaymlStableDiffusionAgent)
    .set_id("image_generator")
    .set_name("Image Generator")
    .set_description("Generates images from text using Stable Diffusion")
    .set_properties(
        RunwaymlStableDiffusionProps(
            model_id="stabilityai/stable-diffusion-3.5-medium",  # Default model
            torch_device="cuda:0"  # Specify GPU device if needed
        )
    )
    .build_spec()
)

# Add dependency to guild when launching
# (guild_builder is the builder for your guild, created elsewhere)
guild_builder.add_dependency("filesystem", filesystem)
guild_builder.add_agent_spec(sd_agent_spec)

Example Request¶

from rustic_ai.huggingface.agents.models import ImageGenerationRequest

# Create an image generation request
image_request = ImageGenerationRequest(
    generation_prompt="A serene forest lake at sunset with mountains in the background",
    num_images=2,  # Generate 2 variations
    guidance_scale=8.0,  # Slightly higher guidance for more prompt adherence
    num_inference_steps=30,  # Fewer steps for faster generation
    height=768,
    width=768
)

# Send to the agent (client is assumed to be set up elsewhere)
client.publish("default_topic", image_request)

Technical Details¶

The agent uses:

- Hugging Face's diffusers library with the StableDiffusion3Pipeline
- Stability AI's Stable Diffusion 3.5 model
- PyTorch for tensor operations
- Automatic hardware detection to use GPU when available
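
For orientation, here is a hedged sketch of how a request could be mapped onto a direct StableDiffusion3Pipeline call outside the agent. The helper names `build_call_kwargs` and `generate_images` are hypothetical, and the dtype choice (bfloat16 on CUDA, float32 on CPU) is an assumption rather than the agent's documented behavior. The heavy imports are deferred so that the mapping helper can be run on its own; calling `generate_images` requires torch and diffusers to be installed, downloads the model on first use, and may require accepting the model license on Hugging Face.

```python
def build_call_kwargs(request: dict) -> dict:
    """Map request fields onto StableDiffusion3Pipeline call arguments."""
    return {
        "prompt": request["generation_prompt"],
        "num_images_per_prompt": request.get("num_images", 1),
        "guidance_scale": request.get("guidance_scale", 7.5),
        "num_inference_steps": request.get("num_inference_steps", 50),
        "height": request.get("height", 1024),
        "width": request.get("width", 1024),
    }


def generate_images(
    request: dict,
    model_id: str = "stabilityai/stable-diffusion-3.5-medium",
    device: str = "cpu",
):
    """Load the pipeline and return a list of PIL images (needs torch and diffusers)."""
    # Imported lazily so the mapping helper above runs without the heavy dependencies.
    import torch
    from diffusers import StableDiffusion3Pipeline

    # Assumption for this sketch: half-precision on GPU, full precision on CPU.
    dtype = torch.bfloat16 if device.startswith("cuda") else torch.float32
    pipe = StableDiffusion3Pipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)
    return pipe(**build_call_kwargs(request)).images


kwargs = build_call_kwargs({"generation_prompt": "a lake", "num_images": 2})
print(kwargs)
```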

Notes and Limitations¶

Requires significant VRAM to run efficiently (at least 8GB recommended)
Performance is much better with a GPU
Generation time depends on the number of inference steps and image size
First-time initialization may take longer as models are downloaded
Large image sizes (>1024x1024) may require more memory
Model is run locally, so consider hardware requirements when deploying
To run a different version of Stable Diffusion, set model_id to another Hugging Face model ID