
API - Types

Core data types used throughout ARISE.

from arise.types import (
    Skill, SkillStatus, SkillOrigin,
    ToolSpec,
    Trajectory, Step,
    GapAnalysis,
    EvolutionReport,
    SandboxResult, TestResult,
    SkillValidationError,
)

A Python tool in the skill library.

@dataclass
class Skill:
    id: str                  # 8-char UUID prefix (auto-generated)
    name: str                # function name (must match [a-z_][a-z0-9_]*)
    description: str         # human-readable description
    implementation: str      # Python source code
    test_suite: str          # test source code (run in sandbox)
    version: int             # incremented on each patch/refinement
    status: SkillStatus      # TESTING, ACTIVE, or DEPRECATED
    origin: SkillOrigin      # MANUAL, SYNTHESIZED, REFINED, COMPOSED, or PATCHED
    parent_id: str | None    # ID of the skill this was derived from (patches)
    created_at: datetime
    # Performance tracking (updated by ARISE on each invocation)
    invocation_count: int
    success_count: int
    avg_latency_ms: float
    error_log: list[str]
    # Computed property
    success_rate: float      # success_count / invocation_count

Methods:

skill.to_callable() -> Callable # exec implementation, return the function
skill.to_tool_spec() -> ToolSpec # convert to ToolSpec for agent use

Skill names must match [a-z_][a-z0-9_]* — lowercase, underscores, no spaces.
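The docstring for to_callable() says it execs the implementation and returns the function. A minimal standalone sketch of that mechanism, including the name check above (this is an illustration of the described behavior, not ARISE's actual source):

```python
import re


def to_callable(name: str, implementation: str):
    """Exec skill source in a fresh namespace and return the named function.

    Standalone sketch of the behavior documented for Skill.to_callable();
    the real implementation lives inside ARISE.
    """
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
        raise ValueError(f"invalid skill name: {name!r}")
    namespace: dict = {}
    exec(implementation, namespace)  # compile and run the skill's source code
    return namespace[name]           # the function name must match the skill name


# Hypothetical skill source:
double = to_callable("double", "def double(x):\n    return 2 * x")
print(double(21))  # 42
```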


class SkillStatus(Enum):
    TESTING = "testing"           # synthesized but not yet promoted
    ACTIVE = "active"             # promoted, available to agents
    DEPRECATED = "deprecated"     # removed (rollback, lost A/B test, manually removed)

class SkillOrigin(Enum):
    MANUAL = "manual"             # added via arise.add_skill()
    SYNTHESIZED = "synthesized"   # generated by LLM from scratch
    REFINED = "refined"           # regenerated after failing adversarial tests
    COMPOSED = "composed"         # composed from multiple existing skills
    PATCHED = "patched"           # incremental fix applied to existing skill

The representation of a skill as seen by the agent. ARISE builds this from a Skill and passes it to agent_fn.

@dataclass
class ToolSpec:
    name: str
    description: str
    parameters: dict[str, Any]   # JSON Schema for the function parameters
    fn: Callable                 # the actual callable (wraps Skill.to_callable())
    skill_id: str | None         # back-reference to the Skill (None for seed tools)

ToolSpec is callable — tool_spec(arg1, arg2) delegates to tool_spec.fn(arg1, arg2).

The parameters schema follows JSON Schema format:

{
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "encoding": {"type": "string", "default": "utf-8"},
    },
    "required": ["path"],
}

A complete record of one agent episode.

@dataclass
class Trajectory:
    task: str                    # task string passed to arise.run()
    steps: list[Step]            # every tool call the agent made
    outcome: str                 # agent's final response (truncated to 1000 chars)
    reward: float                # score assigned by reward_fn (set after evaluation)
    skill_library_version: int   # library version at start of episode
    timestamp: datetime
    metadata: dict[str, Any]     # kwargs passed to arise.run(task, **kwargs)

One tool invocation within a trajectory.

@dataclass
class Step:
    observation: str              # description of what happened
    reasoning: str                # agent's stated reasoning (if available)
    action: str                   # tool name that was called
    action_input: dict[str, Any]  # keyword arguments passed to the tool
    result: str                   # tool return value (truncated to 500 chars)
    error: str | None             # exception message if the tool raised
    latency_ms: float             # wall-clock time for the tool call

step.error is None on success. In reward functions, check any(s.error for s in trajectory.steps) to detect tool failures.
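A hypothetical reward function using that check might look like the sketch below. The trimmed-down Step and Trajectory classes carry only the fields the function inspects; the real types have the full field lists documented above:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Step:
    # Only the fields the reward function below reads.
    action: str
    error: str | None = None


@dataclass
class Trajectory:
    task: str
    steps: list[Step]
    outcome: str = ""


def reward_fn(trajectory: Trajectory) -> float:
    """Hypothetical reward function: zero reward if any tool call failed."""
    if any(s.error for s in trajectory.steps):
        return 0.0  # at least one step recorded an exception
    return 1.0


traj = Trajectory(
    task="summarize report.txt",
    steps=[Step(action="read_file"), Step(action="summarize", error="TimeoutError")],
)
print(reward_fn(traj))  # 0.0 -- the second step recorded an error
```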


A detected capability gap — the output of SkillForge.detect_gaps().

@dataclass
class GapAnalysis:
    description: str             # what capability is missing
    evidence: list[str]          # quotes from failure trajectories
    suggested_name: str          # proposed function name
    suggested_signature: str     # proposed function signature
    similar_existing: list[str]  # names of existing skills that partially cover this gap

Summary of one evolution cycle.

@dataclass
class EvolutionReport:
    timestamp: datetime
    gaps_detected: list[str]              # suggested names of detected gaps
    tools_synthesized: list[str]          # names of tools attempted
    tools_promoted: list[str]             # names of tools that passed and went ACTIVE
    tools_rejected: list[dict[str, str]]  # [{"name": "...", "reason": "..."}]
    duration_ms: float                    # total time for the evolution cycle
    cost_usd: float                       # estimated LLM cost (if tracked)

Access via:

report = arise.last_evolution

for report in arise.evolution_history:
    ...

Output of running a skill’s test suite in the sandbox.

@dataclass
class SandboxResult:
    success: bool                # True if all tests passed
    test_results: list[TestResult]
    total_passed: int
    total_failed: int
    stdout: str
    stderr: str

Result of a single test case.

@dataclass
class TestResult:
    passed: bool
    test_name: str
    error: str | None
    stdout: str
    execution_time_ms: float
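The SandboxResult aggregates (success, total_passed, total_failed) follow directly from its test_results. A standalone sketch of that relationship, using trimmed stand-in classes and a hypothetical summarize() helper that is not part of the ARISE API:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TestResult:
    # Trimmed stand-in: only the fields the aggregation needs.
    passed: bool
    test_name: str
    error: str | None = None


@dataclass
class SandboxResult:
    success: bool
    test_results: list[TestResult]
    total_passed: int
    total_failed: int


def summarize(results: list[TestResult]) -> SandboxResult:
    """Hypothetical helper: fold per-test results into a SandboxResult."""
    passed = sum(1 for r in results if r.passed)
    failed = len(results) - passed
    return SandboxResult(
        success=failed == 0,  # success means every test passed
        test_results=results,
        total_passed=passed,
        total_failed=failed,
    )


res = summarize([
    TestResult(passed=True, test_name="test_basic"),
    TestResult(passed=False, test_name="test_edge", error="AssertionError"),
])
print(res.success, res.total_passed, res.total_failed)  # False 1 1
```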

Raised by SkillRegistry.pull() when validate=True and the pulled skill fails sandbox testing.

from arise.types import SkillValidationError

try:
    skill = registry.pull("parse_csv", validate=True, sandbox=sandbox)
except SkillValidationError as e:
    print(f"Skill failed validation: {e}")