
API - Types

Core data types used throughout ARISE.

from arise.types import (
    Skill, SkillStatus, SkillOrigin,
    ToolSpec,
    Trajectory, Step,
    GapAnalysis,
    EvolutionReport,
    SandboxResult, TestResult,
    SkillValidationError,
)

A Python tool in the skill library.

@dataclass
class Skill:
    id: str                  # 8-char UUID prefix (auto-generated)
    name: str                # function name (must match [a-z_][a-z0-9_]*)
    description: str         # human-readable description
    implementation: str      # Python source code
    test_suite: str          # test source code (run in sandbox)
    version: int             # incremented on each patch/refinement
    status: SkillStatus      # TESTING, ACTIVE, or DEPRECATED
    origin: SkillOrigin      # MANUAL, SYNTHESIZED, REFINED, COMPOSED, or PATCHED
    parent_id: str | None    # ID of the skill this was derived from (patches)
    created_at: datetime
    # Performance tracking (updated by ARISE on each invocation)
    invocation_count: int
    success_count: int
    avg_latency_ms: float
    error_log: list[str]
    # Computed property
    success_rate: float      # success_count / invocation_count

Methods:

skill.to_callable() -> Callable # exec implementation, return the function
skill.to_tool_spec() -> ToolSpec # convert to ToolSpec for agent use

Skill names must match [a-z_][a-z0-9_]* — lowercase, underscores, no spaces.
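The docstring for to_callable() says it execs the implementation and returns the function. A minimal standalone sketch of that mechanism, including the name check above (this is an illustration of the described behavior, not ARISE's actual source):

```python
import re


def to_callable(name: str, implementation: str):
    """Exec skill source in a fresh namespace and return the named function.

    Standalone sketch of the behavior documented for Skill.to_callable();
    the real implementation lives inside ARISE.
    """
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
        raise ValueError(f"invalid skill name: {name!r}")
    namespace: dict = {}
    exec(implementation, namespace)  # compile and run the skill's source code
    return namespace[name]           # the function name must match the skill name


# Hypothetical skill source:
double = to_callable("double", "def double(x):\n    return 2 * x")
print(double(21))  # 42
```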


class SkillStatus(Enum):
    TESTING = "testing"           # synthesized but not yet promoted
    ACTIVE = "active"             # promoted, available to agents
    DEPRECATED = "deprecated"     # removed (rollback, lost A/B test, manually removed)

class SkillOrigin(Enum):
    MANUAL = "manual"             # added via arise.add_skill()
    SYNTHESIZED = "synthesized"   # generated by LLM from scratch
    REFINED = "refined"           # regenerated after failing adversarial tests
    COMPOSED = "composed"         # composed from multiple existing skills
    PATCHED = "patched"           # incremental fix applied to existing skill

The representation of a skill as seen by the agent. ARISE builds this from a Skill and passes it to agent_fn.

@dataclass
class ToolSpec:
    name: str
    description: str
    parameters: dict[str, Any]   # JSON Schema for the function parameters
    fn: Callable                 # the actual callable (wraps Skill.to_callable())
    skill_id: str | None         # back-reference to the Skill (None for seed tools)

ToolSpec is callable — tool_spec(arg1, arg2) delegates to tool_spec.fn(arg1, arg2).

The parameters schema follows JSON Schema format:

{
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "encoding": {"type": "string", "default": "utf-8"},
    },
    "required": ["path"],
}

A complete record of one agent episode.

@dataclass
class Trajectory:
    task: str                    # task string passed to arise.run()
    steps: list[Step]            # every tool call the agent made
    outcome: str                 # agent's final response (truncated to 1000 chars)
    reward: float                # score assigned by reward_fn (set after evaluation)
    skill_library_version: int   # library version at start of episode
    timestamp: datetime
    metadata: dict[str, Any]     # kwargs passed to arise.run(task, **kwargs)

One tool invocation within a trajectory.

@dataclass
class Step:
    observation: str              # description of what happened
    reasoning: str                # agent's stated reasoning (if available)
    action: str                   # tool name that was called
    action_input: dict[str, Any]  # keyword arguments passed to the tool
    result: str                   # tool return value (truncated to 500 chars)
    error: str | None             # exception message if the tool raised
    latency_ms: float             # wall-clock time for the tool call

step.error is None on success. In reward functions, check any(s.error for s in trajectory.steps) to detect tool failures.
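A hypothetical reward function using that check might look like the sketch below. The trimmed-down Step and Trajectory classes carry only the fields the function inspects; the real types have the full field lists documented above:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Step:
    # Only the fields the reward function below reads.
    action: str
    error: str | None = None


@dataclass
class Trajectory:
    task: str
    steps: list[Step]
    outcome: str = ""


def reward_fn(trajectory: Trajectory) -> float:
    """Hypothetical reward function: zero reward if any tool call failed."""
    if any(s.error for s in trajectory.steps):
        return 0.0  # at least one step recorded an exception
    return 1.0


traj = Trajectory(
    task="summarize report.txt",
    steps=[Step(action="read_file"), Step(action="summarize", error="TimeoutError")],
)
print(reward_fn(traj))  # 0.0 -- the second step recorded an error
```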


A detected capability gap — the output of SkillForge.detect_gaps().

@dataclass
class GapAnalysis:
    description: str             # what capability is missing
    evidence: list[str]          # quotes from failure trajectories
    suggested_name: str          # proposed function name
    suggested_signature: str     # proposed function signature
    similar_existing: list[str]  # names of existing skills that partially cover this gap

Summary of one evolution cycle.

@dataclass
class EvolutionReport:
    timestamp: datetime
    gaps_detected: list[str]              # suggested names of detected gaps
    tools_synthesized: list[str]          # names of tools attempted
    tools_promoted: list[str]             # names of tools that passed and went ACTIVE
    tools_rejected: list[dict[str, str]]  # [{"name": "...", "reason": "..."}]
    duration_ms: float                    # total time for the evolution cycle
    cost_usd: float                       # estimated LLM cost (if tracked)

Access via:

report = arise.last_evolution

for report in arise.evolution_history:
    ...

Output of running a skill’s test suite in the sandbox.

@dataclass
class SandboxResult:
    success: bool                # True if all tests passed
    test_results: list[TestResult]
    total_passed: int
    total_failed: int
    stdout: str
    stderr: str

Result of a single test case.

@dataclass
class TestResult:
    passed: bool
    test_name: str
    error: str | None
    stdout: str
    execution_time_ms: float
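The SandboxResult aggregates (success, total_passed, total_failed) follow directly from its test_results. A standalone sketch of that relationship, using trimmed stand-in classes and a hypothetical summarize() helper that is not part of the ARISE API:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TestResult:
    # Trimmed stand-in: only the fields the aggregation needs.
    passed: bool
    test_name: str
    error: str | None = None


@dataclass
class SandboxResult:
    success: bool
    test_results: list[TestResult]
    total_passed: int
    total_failed: int


def summarize(results: list[TestResult]) -> SandboxResult:
    """Hypothetical helper: fold per-test results into a SandboxResult."""
    passed = sum(1 for r in results if r.passed)
    failed = len(results) - passed
    return SandboxResult(
        success=failed == 0,  # success means every test passed
        test_results=results,
        total_passed=passed,
        total_failed=failed,
    )


res = summarize([
    TestResult(passed=True, test_name="test_basic"),
    TestResult(passed=False, test_name="test_edge", error="AssertionError"),
])
print(res.success, res.total_passed, res.total_failed)  # False 1 1
```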

Raised by SkillRegistry.pull() when validate=True and the pulled skill fails sandbox testing.

from arise.types import SkillValidationError

try:
    skill = registry.pull("parse_csv", validate=True, sandbox=sandbox)
except SkillValidationError as e:
    print(f"Skill failed validation: {e}")