Oryn

The Browser Designed for AI Agents

Oryn (Open Runtime for Intentful Navigation) is a browser automation system designed specifically for AI agents. Instead of forcing agents to interpret screenshots, parse HTML, or construct complex function calls, Oryn provides a semantic intent language that matches how agents reason about web interaction.


Why Oryn?

Current approaches to browser automation for AI agents fall into predictable failure patterns:

| Approach | Problem |
|----------|---------|
| Screenshot/Vision | Expensive inference, unreliable text extraction, no understanding of interactive state |
| HTML Parsing | Thousands of tokens of markup, complex reasoning about visibility and interactivity |
| Function Calls | Rigid schemas, verbose definitions, no tolerance for natural variation |

Oryn addresses these problems by design:

| Capability | Description |
|------------|-------------|
| Semantic Observations | Structured descriptions of interactive elements with meaningful labels, types, roles, and states |
| Intent Language | Natural, forgiving syntax for expressing actions at the appropriate level of abstraction |
| Consistent Behavior | Identical semantics across embedded, headless, and remote modes |
| Token Efficient | Minimal verbosity means more context for agent reasoning |

Key Features

Three Deployment Modes

A single unified binary adapts to any environment: embedded IoT devices, headless cloud servers, or browser extensions for user assistance.

Semantic Targeting

Agents can reference elements by meaning rather than implementation. Say "click login" instead of hunting for CSS selectors.
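
As a rough illustration of targeting by meaning, the Python sketch below resolves a text target such as "login" to an element id by comparing it against element labels. The fuzzy scoring with `difflib` is a stand-in chosen for this example, not Oryn's actual resolver, and `ELEMENTS` and `resolve_target` are hypothetical names.

```python
from difflib import SequenceMatcher

# Hypothetical element records: (id, kind, label).
ELEMENTS = [
    (1, "input/email", "Email"),
    (2, "input/password", "Password"),
    (3, "button/submit", "Log in"),
    (4, "link", "Forgot password?"),
]


def resolve_target(query, elements, threshold=0.5):
    """Return the id of the element whose label best matches `query`, or None."""
    q = query.lower()
    best_id, best_score = None, 0.0
    for elem_id, _kind, label in elements:
        text = label.lower()
        # A substring hit counts as a perfect match; otherwise fall back
        # to a similarity ratio between 0.0 and 1.0.
        score = 1.0 if q in text else SequenceMatcher(None, q, text).ratio()
        if score > best_score:  # strict ">" keeps the earliest element on ties
            best_id, best_score = elem_id, score
    return best_id if best_score >= threshold else None


if __name__ == "__main__":
    print(resolve_target("login", ELEMENTS))
    print(resolve_target("checkout", ELEMENTS))
```

Returning `None` below the threshold lets a caller report "no such element" instead of clicking a poor match.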

Pattern Detection

Common UI patterns (login forms, search boxes, cookie banners) are automatically recognized and reported to agents.
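
One way such recognition could work is a rule over the observed elements. The Python sketch below reports a login form when a page exposes a password input, a username-like input, and a submit button. The element shape, the `kind` strings, and `detect_login_form` are assumptions made for illustration, not Oryn's detection logic.

```python
from typing import Dict, List, Optional


def detect_login_form(elements: List[dict]) -> Optional[Dict[str, object]]:
    """Return a login-form pattern report, or None if the page has no such form."""

    def first(kinds):
        # First element whose kind is in `kinds`, or None.
        return next((e for e in elements if e["kind"] in kinds), None)

    password = first({"input/password"})
    username = first({"input/email", "input/text"})
    submit = first({"button/submit"})
    if password is None or username is None or submit is None:
        return None
    return {
        "pattern": "login_form",
        "username": username["id"],
        "password": password["id"],
        "submit": submit["id"],
    }


# Hypothetical page contents used to exercise the rule.
LOGIN_PAGE = [
    {"id": 1, "kind": "input/email", "label": "Username or email"},
    {"id": 2, "kind": "input/password", "label": "Password"},
    {"id": 3, "kind": "button/submit", "label": "Sign in"},
]
SEARCH_PAGE = [
    {"id": 1, "kind": "input/text", "label": "Search"},
    {"id": 2, "kind": "button/submit", "label": "Go"},
]

if __name__ == "__main__":
    print(detect_login_form(LOGIN_PAGE))
    print(detect_login_form(SEARCH_PAGE))
```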

Intent Engine

High-level intents like login, search, and accept_cookies encapsulate common workflows, expandable via YAML definitions.
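
To make the idea of an intent concrete, the Python sketch below expands a `login` intent into primitive `type` and `click` commands. The step layout in `LOGIN_INTENT` is a hypothetical stand-in for a declarative definition; it does not reflect Oryn's YAML schema, and `expand_intent` is an illustrative name.

```python
# Hypothetical intent definition: each step is (verb, element role, value template).
LOGIN_INTENT = {
    "name": "login",
    "steps": [
        ("type", "username", "{username}"),
        ("type", "password", "{password}"),
        ("click", "submit", None),
    ],
}


def expand_intent(intent, roles, params):
    """Expand an intent into primitive commands.

    roles  maps element roles to element ids, e.g. {"username": 1}
    params maps template names to values, e.g. {"username": "me"}
    """
    commands = []
    for verb, role, template in intent["steps"]:
        target = roles[role]
        if template is None:
            commands.append(f"{verb} {target}")
        else:
            value = template.format(**params)
            commands.append(f'{verb} {target} "{value}"')
    return commands


if __name__ == "__main__":
    roles = {"username": 1, "password": 2, "submit": 3}
    params = {"username": "myusername", "password": "mypassword"}
    for command in expand_intent(LOGIN_INTENT, roles, params):
        print(command)
```

Keeping the steps as data rather than code is what makes an intent of this kind extensible without changing the engine.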

Quick Example

Instead of parsing HTML or analyzing screenshots, agents interact naturally:

goto github.com/login
observe

@ github.com/login "Sign in to GitHub"
[1] input/email "Username or email" {required}
[2] input/password "Password" {required}
[3] button/submit "Sign in" {primary}

type 1 "myusername"
type 2 "mypassword"
click 3

The agent sees labeled interactive elements and issues simple commands. No CSS selectors, no XPath, no DOM traversal—just intent.
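
For a harness that consumes observations programmatically, the Python sketch below turns the observation text above into structured records. The line grammar is inferred from this single sample only, so the real format may allow more than the pattern handles. `Element`, `ELEMENT_RE`, and `parse_observation` are hypothetical names, not part of Oryn's API.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Grammar inferred from the sample: [id] type/role "label" {states}
ELEMENT_RE = re.compile(
    r'^\[(?P<id>\d+)\]\s+'
    r'(?P<type>[\w-]+)(?:/(?P<role>[\w-]+))?\s+'
    r'"(?P<label>[^"]*)"'
    r'(?:\s+\{(?P<states>[^}]*)\})?\s*$'
)


@dataclass
class Element:
    id: int
    type: str
    role: Optional[str]
    label: str
    states: List[str] = field(default_factory=list)


def parse_observation(text: str) -> List[Element]:
    """Parse element lines; the '@ url "title"' header and blanks are skipped."""
    elements = []
    for line in text.splitlines():
        m = ELEMENT_RE.match(line.strip())
        if m is None:
            continue
        # Assumption: multiple states are separated by commas or spaces.
        raw = m["states"] or ""
        states = [s for s in re.split(r"[,\s]+", raw) if s]
        elements.append(
            Element(int(m["id"]), m["type"], m["role"], m["label"], states)
        )
    return elements


SAMPLE = '''@ github.com/login "Sign in to GitHub"
[1] input/email "Username or email" {required}
[2] input/password "Password" {required}
[3] button/submit "Sign in" {primary}
'''

if __name__ == "__main__":
    for element in parse_observation(SAMPLE):
        print(element)
```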


Architecture Overview

Oryn's layered architecture separates command parsing, intent execution, and page scanning from the browser backend, so the same commands behave consistently across backends:

┌─────────────────────────────────────────────────────────────┐
│                      AI Agent                                │
│  (Issues intent commands: login, search, click)             │
├─────────────────────────────────────────────────────────────┤
│                    Oryn CLI / Protocol                       │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Intent      │  │   Intent    │  │   Scanner           │  │
│  │ Parser      │→ │   Engine    │→ │   Interface         │  │
│  └─────────────┘  └─────────────┘  └──────────┬──────────┘  │
└─────────────────────────────────────────────────┼────────────┘
                    ┌─────────────────────────────┴─────────────────────────────┐
                    │                    Browser Backend                         │
                    │     ┌──────────┐    ┌──────────┐    ┌──────────┐          │
                    │     │ oryn-e   │    │ oryn-h   │    │ oryn-r   │          │
                    │     │ Embedded │    │ Headless │    │ Remote   │          │
                    │     └──────────┘    └──────────┘    └──────────┘          │
                    └───────────────────────────────────────────────────────────┘

Learn more about the architecture →
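
As a sketch of the first stage in the diagram, the Python example below tokenizes commands like those shown in the Quick Example into a verb, a target, and trailing arguments. It relies on the standard library's `shlex` for quote handling. `Command` and `parse_command` are hypothetical names, and Oryn's own parser is described as more forgiving than this.

```python
import shlex
from typing import NamedTuple, Tuple, Union


class Command(NamedTuple):
    verb: str
    target: Union[int, str, None]  # element id, text target, or nothing
    args: Tuple[str, ...]


def parse_command(line: str) -> Command:
    """Split a command line into verb, target, and remaining arguments."""
    tokens = shlex.split(line)  # honors double-quoted strings
    if not tokens:
        raise ValueError("empty command")
    verb, rest = tokens[0].lower(), tokens[1:]
    if not rest:
        return Command(verb, None, ())
    first = rest[0]
    # Bare digits are treated as element ids; anything else as text.
    target = int(first) if first.isdigit() else first
    return Command(verb, target, tuple(rest[1:]))


if __name__ == "__main__":
    for line in ['goto github.com/login', 'observe', 'type 1 "myusername"', 'click 3']:
        print(parse_command(line))
```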


The Three Modes

| Mode | Binary | Engine | Best For |
|------|--------|--------|----------|
| Embedded | oryn-e | WPE WebKit | IoT, containers, edge (~50 MB RAM) |
| Headless | oryn-h | Chromium | Cloud automation, CI/CD (~99% compatibility) |
| Remote | oryn-r | User's Browser | User assistance, authenticated sessions |

All three modes share the same protocol, intent language, and Universal Scanner—ensuring consistent behavior regardless of deployment environment.


Quick Start

Get up and running with Oryn:

# Clone and build
git clone https://github.com/dragonscale/oryn.git
cd oryn
cargo build --release -p oryn

# Run in headless mode
./target/release/oryn headless

# In the REPL, navigate and observe
> goto example.com
> observe
> click "More information..."

Complete Quick Start Guide →


Documentation

Getting Started

Core Concepts

Developer Guides

Integrations

  • Google ADK — Using Oryn with Google ADK agents
  • IntentGym — Benchmark harness for evaluating Oryn-based web agents
  • Python SDK — Sync/async Python client for OIL command execution
  • Remote Extension — Connect oryn remote to the browser extension
  • WASM Extension — Building and running the standalone extension-w workflow

Reference


Project Status

Oryn is under active development. Current status:

| Feature | Status |
|---------|--------|
| Intent Language Parser | Stable |
| Universal Scanner Runtime | Stable |
| Headless Mode (oryn-h) | Stable |
| Embedded Mode (oryn-e) | Partial |
| Remote Mode (oryn-r) | Partial |
| Unified Command End-to-End Coverage | Partial |
| Built-in Intent Commands in Unified CLI (login, search, dismiss, accept_cookies) | Stable |
| Declarative Intent/Pack Management Commands (intents, define, run, ...) | Stubbed |
| Multi-step Automation via .oil Scripts | Stable |

Contributing

We welcome contributions! See the Contributing Guide for details.

# Run tests
./scripts/run-tests.sh

# Run E2E tests
./scripts/run-e2e-tests.sh

# Check formatting and lints
cargo fmt --check && cargo clippy --workspace

License

Oryn is open source under the Apache 2.0 License.