PhyAgentOS — Workspace & Protocols

Workspace Topology

PhyAgentOS stores all state in local Markdown files. In single-machine mode, all files live in one directory; Fleet mode splits them between a shared workspace and one workspace per robot.

flowchart TB
    subgraph SINGLE["Single-Machine Mode"]
        direction TB
        W["~/.PhyAgentOS/workspace/"]
        W --> E1["ENVIRONMENT.md"]
        W --> E2["EMBODIED.md"]
        W --> E3["ACTION.md"]
        W --> E4["LESSONS.md"]
        W --> E5["TASK.md"]
    end
    subgraph FLEET["Fleet Mode"]
        direction TB
        SW["shared_workspace/"]
        SW --> S1["ENVIRONMENT.md"]
        SW --> S2["TASK.md"]
        SW --> S3["ORCHESTRATOR.md"]
        SW --> S4["LESSONS.md"]
        SW --> S5["ROBOTS.md"]
        RW_A["robot_a/"]
        RW_A --> R1_A["ACTION.md"]
        RW_A --> R2_A["EMBODIED.md"]
        RW_B["robot_b/"]
        RW_B --> R1_B["ACTION.md"]
        RW_B --> R2_B["EMBODIED.md"]
    end

File	Single Mode	Fleet: Shared	Fleet: Per-Robot
`ENVIRONMENT.md`	workspace/	shared_workspace/	—
`EMBODIED.md`	workspace/	—	robot_X/
`ACTION.md`	workspace/	—	robot_X/
`TASK.md`	workspace/	shared_workspace/	—
`LESSONS.md`	workspace/	shared_workspace/	—
`ORCHESTRATOR.md`	—	shared_workspace/	—
`ROBOTS.md`	—	shared_workspace/	—

Single mode: all files live in one directory, created by paos onboard.
Fleet mode: the shared workspace holds global state (ENVIRONMENT.md, TASK.md, ORCHESTRATOR.md, LESSONS.md, ROBOTS.md), while each robot's dedicated workspace holds its own ACTION.md and EMBODIED.md.
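The placement table can be expressed as a small lookup. The sketch below is illustrative only: the function name, its arguments, and the assumption that shared_workspace/ and the per-robot directories sit under one common root are not taken from PhyAgentOS.

```python
from pathlib import Path
from typing import Optional

# File placement copied from the table above.
SHARED_FILES = {"ENVIRONMENT.md", "TASK.md", "LESSONS.md", "ORCHESTRATOR.md", "ROBOTS.md"}
PER_ROBOT_FILES = {"ACTION.md", "EMBODIED.md"}
FLEET_ONLY_FILES = {"ORCHESTRATOR.md", "ROBOTS.md"}


def resolve_protocol_file(name: str, root: Path, fleet: bool = False,
                          robot_id: Optional[str] = None) -> Path:
    """Return where a protocol file is expected to live under `root` (hypothetical helper)."""
    if name not in SHARED_FILES | PER_ROBOT_FILES:
        raise ValueError(f"unknown protocol file: {name}")
    if not fleet:
        if name in FLEET_ONLY_FILES:
            raise ValueError(f"{name} exists only in Fleet mode")
        return root / "workspace" / name
    if name in SHARED_FILES:
        return root / "shared_workspace" / name
    if robot_id is None:
        raise ValueError(f"{name} is per-robot in Fleet mode; robot_id is required")
    # Assumption: each robot directory is a sibling of shared_workspace/.
    return root / robot_id / name
```

For example, resolve_protocol_file("ACTION.md", root, fleet=True, robot_id="robot_a") points into robot_a/, while the same call for TASK.md ignores robot_id and points into shared_workspace/.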

ENVIRONMENT.md — Source of Truth

ENVIRONMENT.md is the single authoritative source for environment state. The Watchdog updates it after every action execution, and the Agent reads it before every reasoning cycle.

JSON Example

{
  "schema_version": "v2.0",
  "updated_at": "2025-04-01T12:00:00Z",
  "scene_graph": {
    "nodes": [
      {"id": "table_01", "type": "furniture", "position": [1.2, 0.0, 0.8]},
      {"id": "apple_01", "type": "object", "position": [1.2, 0.75, 0.8], "status": "on_table"}
    ],
    "edges": [
      {"from": "apple_01", "to": "table_01", "relation": "on"}
    ]
  },
  "robots": [
    {
      "robot_id": "franka_001",
      "pose": [0.5, 0.0, 0.0],
      "joint_state": {"joint_1": 0.0, "joint_2": -0.3},
      "gripper": "open",
      "holding": null
    }
  ],
  "objects": [
    {"id": "apple_01", "name": "Red Apple", "category": "fruit", "position": [1.2, 0.75, 0.8]}
  ],
  "perception": {
    "camera_rgb": "artifacts/camera/front_001.jpg",
    "depth": "artifacts/camera/front_001_depth.npy"
  }
}

Field	Type	Description
`schema_version`	string	Protocol version (v1 / v2; the example above writes it as `v2.0`). v2 adds `perception`, `scene_graph.edges`, the `robots` array, and the optional `map` and `tf` fields
`updated_at`	ISO 8601	Timestamp of the last file write
`scene_graph`	object	Scene graph: `nodes` record all entities, `edges` record spatial relationships
`robots`	array	Pose and joint state of all robots in the scene
`objects`	array	Position and properties of all interactable objects
`perception`	object	Perception data references (RGB/depth image paths). Added in v2
`map`	object	Optional: occupancy grid map data
`tf`	object	Optional: coordinate frame transform tree
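Because `edges` refer to nodes by `id`, a reader of ENVIRONMENT.md may want to confirm that every edge points at an existing node before trusting the graph. The following is a minimal sketch of such a check; the function name is hypothetical and the check itself is not described in the source as something PhyAgentOS performs.

```python
import json


def dangling_edges(environment: dict) -> list:
    """Return edges whose `from` or `to` id is missing from scene_graph.nodes."""
    graph = environment.get("scene_graph", {})
    node_ids = {node["id"] for node in graph.get("nodes", [])}
    return [
        edge for edge in graph.get("edges", [])
        if edge["from"] not in node_ids or edge["to"] not in node_ids
    ]


# Trimmed copy of the JSON example above (scene graph only).
env = json.loads("""
{
  "schema_version": "v2.0",
  "scene_graph": {
    "nodes": [
      {"id": "table_01", "type": "furniture", "position": [1.2, 0.0, 0.8]},
      {"id": "apple_01", "type": "object", "position": [1.2, 0.75, 0.8]}
    ],
    "edges": [
      {"from": "apple_01", "to": "table_01", "relation": "on"}
    ]
  }
}
""")

print(dangling_edges(env))  # prints []
```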

v1 vs v2 Differences

Feature	v1	v2
Scene graph relations	nodes list only	nodes + edges (spatial relations)
Perception data	none	`perception` field, supports RGB/depth refs
Multi-robot	single robot object	`robots` array, supports Fleet
Map data	none	optional `map` and `tf` fields

EMBODIED.md — Capability Profile

A Markdown file describing the robot's physical capabilities. Copied by the Watchdog at startup from hal/profiles/*.md into the workspace.

Example

# EMBODIED — Franka Emika Panda

## Identity
- **Robot Model**: Franka Emika Panda
- **DOF**: 7
- **End Effector**: Parallel Jaw Gripper
- **Driver**: rekep_real

## Sensors
- [x] RGB-D Camera (Intel RealSense D435)
- [x] Force-Torque Sensor
- [x] Joint Encoders (7x)

## Supported Actions
| Action Type       | Description                           | Parameters                            |
|-------------------|---------------------------------------|---------------------------------------|
| move_to           | Cartesian-space move to target        | target_pose: [x,y,z,roll,pitch,yaw]   |
| pick_up           | Grasp an object by ID                 | object_id: string                     |
| place             | Place held object at location         | target_position: [x,y,z]              |
| target_navigation | Visual navigation to a target         | target_label: string                  |
| real_execute      | Execute a natural-language ReKep task | nl_task: string                       |

## Physical Constraints
- **Max Reach**: 0.855 m
- **Max Payload**: 3.0 kg
- **Workspace Volume**: ~1.5 m³

The Critic must validate every action against EMBODIED.md: any action exceeding workspace bounds or payload limits is rejected.
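A minimal sketch of that validation, using the limits from the example profile. It assumes the reachable workspace can be approximated by a sphere of radius Max Reach around the robot base, which is a simplification; the function name and return convention are hypothetical, and the real Critic may check more than this.

```python
import math
from typing import Optional, Sequence

# Limits from the EMBODIED.md example above.
MAX_REACH_M = 0.855
MAX_PAYLOAD_KG = 3.0


def check_constraints(base: Sequence[float], target: Sequence[float],
                      payload_kg: float = 0.0) -> Optional[str]:
    """Return a rejection reason, or None when the action is within limits."""
    # Assumption: reach is measured as straight-line distance from the base.
    distance = math.dist(base, target)
    if distance > MAX_REACH_M:
        return (f"target is {distance:.3f} m from the base, "
                f"beyond max reach {MAX_REACH_M} m")
    if payload_kg > MAX_PAYLOAD_KG:
        return f"payload {payload_kg} kg exceeds max payload {MAX_PAYLOAD_KG} kg"
    return None
```

Returning a reason string rather than a bare boolean keeps the rejection text available for whatever the Critic writes to LESSONS.md.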

ACTION.md — Action Queue

The JSON queue through which the Agent dispatches actions to Track B. The Watchdog polls this file and picks up pending actions.

JSON Example

{
  "queue": [
    {
      "action_id": "act_001",
      "action_type": "move_to",
      "params": {
        "target_pose": [0.8, 0.3, 0.5, 0.0, 1.57, 0.0]
      },
      "status": "completed",
      "robot_id": "franka_001",
      "created_at": "2025-04-01T12:00:00Z",
      "completed_at": "2025-04-01T12:00:02Z"
    },
    {
      "action_id": "act_002",
      "action_type": "pick_up",
      "params": {
        "object_id": "apple_01"
      },
      "status": "pending",
      "robot_id": "franka_001",
      "created_at": "2025-04-01T12:00:03Z"
    }
  ]
}

action_type Enumeration

Type	Description	Supported Drivers
`move_to`	Cartesian-space motion to target pose	All physical drivers
`pick_up`	Grasp object by `object_id`	rekep_sim, rekep_real
`place`	Place held object at target position	rekep_sim, rekep_real
`target_navigation`	Navigate to a visual target using perception feedback	simulation, go2_edu
`real_execute`	Execute a natural-language ReKep grasping task	rekep_real
`strategy`	Scripted strategy action (model-free)	simulation

status State Machine

pending → running → completed | failed

The Agent writes each action with status pending. The Watchdog picks it up and sets it to running, then to completed or failed when execution ends. On failure, the failure reason and exception stack trace are also written.
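The state machine above allows only forward transitions, which can be enforced with a small table. This is a sketch, not the Watchdog's code: the helper name and the `error` field used to carry the failure reason are hypothetical, since the source does not name the field.

```python
from typing import Optional

# Allowed transitions: pending -> running -> completed | failed
TRANSITIONS = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


def advance(action: dict, new_status: str, error: Optional[str] = None) -> dict:
    """Return a copy of `action` with its status advanced, or raise on an illegal move."""
    current = action["status"]
    if new_status not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new_status}")
    updated = dict(action, status=new_status)
    if new_status == "failed" and error is not None:
        updated["error"] = error  # hypothetical field name
    return updated
```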

SESSIONS.md — Runtime Session Queue

In the V2 architecture, this session-level protocol replaces ACTION.md. The runtime Watchdog reads this file to schedule execution sessions.

YAML Example (from templates/SESSIONS.md)

sessions:
  - session_id: sess_pick_apple_001
    target_ref: "sim_franka_tabletop"
    skill_ref: "rekep_pick"
    status: pending
    priority: high
    timeouts:
      session: 120
      skill: 60
    retry:
      max_attempts: 3
      backoff: "exponential"
    routing:
      target_adapter: "SimTargetAdapter"
      skill_runtime: "ReKepPolicyRuntime"
    execution:
      params:
        object_id: "apple_01"
        gripper: "parallel_jaw"
    created_at: "2025-04-01T12:00:00Z"

Field	Type	Description
`session_id`	string	Unique session ID, format `sess_<skill>_<seq>`
`target_ref`	string	References a target id registered in TARGETS.md
`skill_ref`	string	References a skill id registered in SKILLS.md
`status`	enum	State: `pending` → `running` → `succeeded` / `failed`
`priority`	enum	Scheduling priority: `high` / `normal` / `low`
`timeouts`	object	`session` overall timeout and `skill` per-step timeout (seconds)
`retry`	object	`max_attempts` and `backoff` strategy
`routing`	object	Specifies the `target_adapter` and `skill_runtime` to use
`execution`	object	Execution parameters, e.g. `params` (key-value pairs passed to the skill)

Priority Scheduling Rules

High-priority sessions execute before normal and low ones.
Sessions with the same priority are processed FIFO by created_at timestamp.
Sessions targeting the same robot (same target_ref) run serially; sessions for different robots may run in parallel.
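The three rules can be combined into one selection step: order pending sessions by priority and then by created_at, and start at most one session per target_ref that is not already busy. The sketch below assumes sessions are already loaded as dictionaries with the fields shown earlier; the function names are hypothetical.

```python
from datetime import datetime

PRIORITY_RANK = {"high": 0, "normal": 1, "low": 2}


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2025-04-01T12:00:00Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def next_batch(sessions: list) -> list:
    """Pick the sessions that may start now, in scheduling order."""
    # A target with a running session is busy: same-target sessions run serially.
    busy = {s["target_ref"] for s in sessions if s["status"] == "running"}
    pending = sorted(
        (s for s in sessions if s["status"] == "pending"),
        key=lambda s: (PRIORITY_RANK[s["priority"]], parse_ts(s["created_at"])),
    )
    batch = []
    for session in pending:
        if session["target_ref"] not in busy:
            batch.append(session)
            busy.add(session["target_ref"])  # one new session per target
    return batch
```

Sessions left out of the batch stay pending and are reconsidered on the next call, once their target is free.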

TARGETS.md — Runtime Target Registry

Registers all available rollout targets (simulation environments or real-robot instances). The runtime Watchdog uses this file to route sessions.

YAML Example (from templates/TARGETS.md)

targets:
  - id: sim_franka_tabletop
    type: sim
    enabled: true
    backend: mujoco
    supported_skills:
      - rekep_pick
      - rekep_place
      - openvla_manipulation
    adapter:
      class: SimTargetAdapter
      obs_format: mujoco_standard
    perception:
      cameras:
        - name: front
          resolution: [640, 480]
          fps: 30
        - name: wrist
          resolution: [640, 480]
          fps: 30
    config:
      scene_xml: /path/to/tabletop.xml

  - id: real_franka_lab
    type: real_robot
    enabled: true
    supported_skills:
      - rekep_pick
    adapter:
      class: RealRobotTargetAdapter
      driver: franka
    perception:
      cameras:
        - name: front
          device: "/dev/video0"
          resolution: [1280, 720]
    config:
      ip: "192.168.1.100"
      control_mode: "joint_position"

Field	Type	Description
`id`	string	Unique target ID, referenced by `target_ref` in SESSIONS.md
`type`	enum	`sim` or `real_robot`
`backend`	string	Simulation backend (`mujoco`, `maniskill`, `isaac_sim`)
`enabled`	bool	Set to `false` to temporarily take a target offline
`supported_skills`	array	List of skill ids this target supports
`adapter`	object	Specifies the adapter `class` name plus adapter-specific settings, e.g. `obs_format` for observations (sim) or `driver` (real robot)
`perception`	object	`cameras` array config (resolution, FPS, device path)
`config`	object	Target-specific config (scene file, IP address, etc.)

SKILLS.md — Runtime Skill Registry

Registers all available skills, defining each skill's runtime requirements, policy client, and environment contract.

YAML Example (from templates/SKILLS.md)

skills:
  - id: rekep_pick
    category: manipulation
    description: "ReKep-based grasping skill with natural-language target description"
    runtime: ReKepPolicyRuntime
    supported_target_types:
      - sim
      - real_robot
    policy_client:
      type: http
      endpoint: "http://localhost:8765/predict"
    requires:
      sensors:
        - rgb_camera
        - depth_camera
        - joint_states
      environment_outputs:
        - scene_graph
        - perception_data
      strict_environment_contract: true

  - id: openvla_manipulation
    category: manipulation
    description: "OpenVLA-based general-purpose manipulation skill"
    runtime: VLAPolicyRuntime
    supported_target_types:
      - sim
    policy_client:
      type: local
      checkpoint: "/models/openvla-7b"
    requires:
      sensors:
        - rgb_camera
      environment_outputs:
        - scene_graph
      strict_environment_contract: true

Field	Type	Description
`id`	string	Unique skill ID, referenced by `skill_ref` in SESSIONS.md
`category`	string	Skill category: `manipulation`, `navigation`, `perception`
`runtime`	string	SkillRuntime class name that defines the algorithm for executing this skill
`supported_target_types`	array	Supported target types: `sim`, `real_robot`
`policy_client`	object	Policy client config: `type` (http/local), `endpoint` or `checkpoint`
`requires.sensors`	array	Required sensor list (rgb_camera, depth_camera, joint_states)
`requires.environment_outputs`	array	Data types the runtime needs from the environment
`requires.strict_environment_contract`	bool	If true, missing any required sensor causes execution to be rejected
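Taken together, the three registries imply some cross-checks before a session runs: its target_ref and skill_ref must resolve, the target must be enabled and list the skill, and the skill must support the target's type. The sketch below collects those checks over already-parsed dictionaries. Whether the runtime Watchdog performs exactly these checks is an assumption, and the function name is hypothetical.

```python
def routing_errors(session: dict, targets: list, skills: list) -> list:
    """Return human-readable problems that would prevent routing this session."""
    errors = []
    target = next((t for t in targets if t["id"] == session["target_ref"]), None)
    skill = next((k for k in skills if k["id"] == session["skill_ref"]), None)
    if target is None:
        errors.append(f"unknown target_ref: {session['target_ref']}")
    if skill is None:
        errors.append(f"unknown skill_ref: {session['skill_ref']}")
    if target is None or skill is None:
        return errors
    if not target.get("enabled", False):
        errors.append(f"target {target['id']} is disabled")
    if skill["id"] not in target.get("supported_skills", []):
        errors.append(f"target {target['id']} does not list skill {skill['id']}")
    if target["type"] not in skill.get("supported_target_types", []):
        errors.append(f"skill {skill['id']} does not support target type {target['type']}")
    return errors


# Entries reduced to the fields the checks need, taken from the examples above.
targets = [
    {"id": "sim_franka_tabletop", "type": "sim", "enabled": True,
     "supported_skills": ["rekep_pick", "rekep_place", "openvla_manipulation"]},
    {"id": "real_franka_lab", "type": "real_robot", "enabled": True,
     "supported_skills": ["rekep_pick"]},
]
skills = [
    {"id": "rekep_pick", "supported_target_types": ["sim", "real_robot"]},
    {"id": "openvla_manipulation", "supported_target_types": ["sim"]},
]
```

With these registries, a session pairing rekep_pick with sim_franka_tabletop yields no errors, while openvla_manipulation on real_franka_lab fails both the supported_skills and the target-type check.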

TASK.md — Long-Horizon Task Decomposition

The Agent breaks down long-horizon user instructions into sub-tasks and tracks progress. The Critic evaluates overall completion against this file.

Example

# Task: Clear the Table

| Sub-Task | Action              | Target Device | Status     | Depends On | Result |
|----------|---------------------|---------------|------------|------------|--------|
| 1        | Navigate to table   | franka_001    | ✅ done    | —          | Arrived at table_01 |
| 2        | Grasp apple         | franka_001    | ✅ done    | 1          | Apple picked via ReKep |
| 3        | Place in basket     | franka_001    | ⏳ running | 2          | Needs precise positioning |
| 4        | Grasp cup           | franka_001    | ⬜ pending | 3          | — |
| 5        | Place on cup holder | franka_001    | ⬜ pending | 4          | — |

**Overall Progress**: 2/5 (40%)

The table tracks each sub-task's ID, action, target device, status, dependencies, and result. The Agent updates the Status column after each action; the Critic verifies progress against the user's original goal.
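As an illustration of how the table can be consumed, the sketch below parses the example into rows, recomputes the progress figure, and lists pending sub-tasks whose dependencies are all done. The parsing rules (pipe-delimited rows, status matched by the words done and pending, comma-separated dependencies) are assumptions based on the example, not a documented format.

```python
TASK_MD = """\
# Task: Clear the Table

| Sub-Task | Action              | Target Device | Status     | Depends On | Result |
|----------|---------------------|---------------|------------|------------|--------|
| 1        | Navigate to table   | franka_001    | ✅ done    | —          | Arrived at table_01 |
| 2        | Grasp apple         | franka_001    | ✅ done    | 1          | Apple picked via ReKep |
| 3        | Place in basket     | franka_001    | ⏳ running | 2          | Needs precise positioning |
| 4        | Grasp cup           | franka_001    | ⬜ pending | 3          | — |
| 5        | Place on cup holder | franka_001    | ⬜ pending | 4          | — |
"""


def parse_task_table(markdown: str) -> list:
    """Turn the first Markdown table into a list of {column: cell} dicts."""
    header, rows = None, []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        elif set(cells[0]) <= set("-: "):
            continue  # separator row under the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def progress(rows: list) -> tuple:
    """Return (done, total)."""
    return sum(1 for r in rows if "done" in r["Status"]), len(rows)


def runnable(rows: list) -> list:
    """IDs of pending sub-tasks whose dependencies are all done."""
    done_ids = {r["Sub-Task"] for r in rows if "done" in r["Status"]}
    ready = []
    for r in rows:
        if "pending" not in r["Status"]:
            continue
        # "—" is the table's placeholder for "no dependency".
        deps = [d.strip() for d in r["Depends On"].split(",") if d.strip() not in ("", "—")]
        if all(d in done_ids for d in deps):
            ready.append(r["Sub-Task"])
    return ready


rows = parse_task_table(TASK_MD)
print(progress(rows))  # prints (2, 5)
print(runnable(rows))  # prints [] because sub-task 3 is still running
```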

ORCHESTRATOR.md — Global Dashboard

In Fleet mode, the Orchestrator maintains global task assignment and robot scheduling plans in this file.

# Orchestrator Dashboard

## Active Missions
| Mission         | Assigned To  | Priority | Status  |
|-----------------|--------------|----------|---------|
| Clear the table | franka_001   | high     | running |
| Patrol area     | go2_edu_001  | normal   | running |

## Robot Pool
| Robot         | Status | Last Heartbeat | Current Mission |
|---------------|--------|----------------|-----------------|
| franka_001    | busy   | 12:00:15       | Clear the table |
| go2_edu_001   | busy   | 12:00:12       | Patrol area     |

## Pending Queue
| Mission         | Requirements            | Priority | Queued At |
|-----------------|------------------------|----------|-----------|
| Water delivery  | mobile + manipulation  | normal   | 12:00:20  |

Active Missions: currently running missions with priority and assigned robot.
Robot Pool: live status, heartbeat, and current mission for each robot.
Pending Queue: missions waiting for a robot to become available, with required capabilities.

LESSONS.md — Failure Memory

A log of Critic rejections. The Critic writes rejection reasons here; the Agent searches this file before planning to avoid repeating known failure patterns.

# LESSONS

## 2025-04-01 12:00:05 — Grasp Failed: Object Out of Workspace
- **Action**: pick_up apple_01
- **Reason**: Target position [1.8, 0.75, 0.8] exceeds Franka max reach (0.855m)
- **Critic Rejection**: EMBODIED.md Physical Constraints validation failed
- **Fix**: Execute move_to first to bring the robot closer

## 2025-04-01 11:55:00 — Navigation Collision: Path Blocked by Obstacle
- **Action**: target_navigation to kitchen_counter
- **Reason**: Direct path blocked by a chair
- **Critic Rejection**: No obstacle consideration; suggested adding intermediate waypoints
- **Fix**: Decompose into multi-segment navigation via hallway_midpoint

Self-evolution core: LESSONS.md is PhyAgentOS's failure experience database. The Agent uses the search_lessons tool to retrieve historical failure patterns and avoid repeating mistakes.
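The source does not describe how search_lessons matches entries. As one possible reading, the sketch below splits LESSONS.md at its `## ` headings and returns the entries that contain every query word. The function name, signature, and matching rule are all assumptions, not the actual tool.

```python
import re

LESSONS_MD = """\
# LESSONS

## 2025-04-01 12:00:05 — Grasp Failed: Object Out of Workspace
- **Action**: pick_up apple_01
- **Reason**: Target position [1.8, 0.75, 0.8] exceeds Franka max reach (0.855m)
- **Fix**: Execute move_to first to bring the robot closer

## 2025-04-01 11:55:00 — Navigation Collision: Path Blocked by Obstacle
- **Action**: target_navigation to kitchen_counter
- **Reason**: Direct path blocked by a chair
- **Fix**: Decompose into multi-segment navigation via hallway_midpoint
"""


def search_lessons_sketch(markdown: str, query: str) -> list:
    """Return the `## ` entries that contain every word of `query` (case-insensitive)."""
    # Split before each line that starts a second-level heading.
    entries = re.split(r"(?m)^(?=## )", markdown)
    words = query.lower().split()
    return [
        entry.strip() for entry in entries
        if entry.startswith("## ") and all(word in entry.lower() for word in words)
    ]


for entry in search_lessons_sketch(LESSONS_MD, "pick_up reach"):
    print(entry.splitlines()[0])
```

Plain substring matching is the simplest choice here; a real implementation could equally rank entries or match on the action type only.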

Who Reads, Who Writes

Read/write permission matrix for each component relative to protocol files. R = Read, W = Write.

File	Planner (Agent)	Critic	Watchdog (Track B)	Orchestrator	Runtime Watchdog
`ENVIRONMENT.md`	R	R	W	R	R
`EMBODIED.md`	R	R	W (at startup)	—	—
`ACTION.md`	W	R	R + W	—	—
`SESSIONS.md`	W	—	—	W	R + W
`TARGETS.md`	R	—	—	R	R
`SKILLS.md`	R	—	—	R	R
`TASK.md`	W	R	—	R	—
`ORCHESTRATOR.md`	—	—	—	W	—
`LESSONS.md`	R	W	—	R	—
`ROBOTS.md`	R	—	R	R	—
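The matrix translates directly into a lookup table, which is one way a file-access wrapper could refuse writes from the wrong component. The encoding below copies the rows above; the component keys and the allowed helper are hypothetical names, and nothing in the source says PhyAgentOS enforces the matrix in code.

```python
# Column order matches the table: Planner, Critic, Watchdog, Orchestrator, Runtime Watchdog.
COMPONENTS = ("planner", "critic", "watchdog", "orchestrator", "runtime_watchdog")

# "RW" stands for "R + W"; an empty string stands for no access.
MATRIX = {
    "ENVIRONMENT.md":  ("R", "R", "W",  "R", "R"),
    "EMBODIED.md":     ("R", "R", "W",  "",  ""),   # Watchdog writes at startup only
    "ACTION.md":       ("W", "R", "RW", "",  ""),
    "SESSIONS.md":     ("W", "",  "",   "W", "RW"),
    "TARGETS.md":      ("R", "",  "",   "R", "R"),
    "SKILLS.md":       ("R", "",  "",   "R", "R"),
    "TASK.md":         ("W", "R", "",   "R", ""),
    "ORCHESTRATOR.md": ("",  "",  "",   "W", ""),
    "LESSONS.md":      ("R", "W", "",   "R", ""),
    "ROBOTS.md":       ("R", "",  "R",  "R", ""),
}


def allowed(component: str, operation: str, filename: str) -> bool:
    """Return True if `component` may perform `operation` ("R" or "W") on `filename`."""
    column = COMPONENTS.index(component)
    return operation in MATRIX[filename][column]
```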

File Lifecycle

A complete step-by-step timeline from paos onboard to state write-back.

sequenceDiagram
    actor User
    participant CLI as paos onboard
    participant WD as Watchdog (Track B)
    participant DRV as Driver
    participant ENV as ENVIRONMENT.md
    participant EMB as EMBODIED.md
    participant AGT as Agent (Track A)
    participant CRT as Critic
    participant ACT as ACTION.md
    participant LSN as LESSONS.md
    User->>CLI: paos onboard
    CLI->>ENV: create (empty template)
    CLI->>EMB: create (empty template)
    CLI->>ACT: create (empty template)
    CLI->>LSN: create (empty template)
    User->>WD: start watchdog
    WD->>DRV: load driver
    DRV->>WD: observe() → initial state
    WD->>ENV: write initial scene graph
    WD->>EMB: copy from hal/profiles/*.md
    User->>AGT: paos agent → give task
    loop Planner-Critic Loop
        AGT->>ENV: read state
        AGT->>EMB: read capabilities
        AGT->>LSN: search past failures
        AGT->>CRT: propose action
        CRT->>EMB: validate against constraints
        alt rejected
            CRT->>LSN: write rejection reason
            CRT-->>AGT: reject + feedback
        else approved
            CRT->>ACT: write action (status: pending)
        end
    end
    WD->>ACT: poll → pick up pending action
    WD->>ACT: mark running
    WD->>DRV: execute(action)
    DRV->>WD: observe() → new state
    WD->>ENV: write updated scene graph
    WD->>ACT: mark completed / failed

1. paos onboard — Initialization

Creates ~/.PhyAgentOS/config.json and the workspace/ directory, then generates empty templates: ENVIRONMENT.md, EMBODIED.md, ACTION.md, LESSONS.md.

2. Watchdog Startup — Environment Initialization

The HAL Watchdog starts the driver, calls driver.observe() to get the initial scene state, and writes it to ENVIRONMENT.md. It then copies the matching profile from hal/profiles/ into the workspace as EMBODIED.md.

3. Agent Startup — Planner-Critic Loop

Each turn: Planner reads ENVIRONMENT.md + EMBODIED.md + TASK.md + LESSONS.md → generates action plan → Critic validates against EMBODIED.md → if approved, writes ACTION.md (status: pending).

4. Watchdog Polling — Execute Action

The Watchdog polls ACTION.md and picks up a pending action → marks it running → calls driver.execute(action) → on completion marks it completed or failed.

5. State Write-Back — Update ENVIRONMENT.md

The Watchdog calls driver.observe() after each execution and writes the latest scene graph to ENVIRONMENT.md. The Agent reads the updated state in its next cycle.

6. Critic Rejection — Write to LESSONS.md

If the Critic rejects an action (because it violates EMBODIED.md constraints or no safe path exists), the rejection reason and context are written to LESSONS.md. The Agent's search_lessons tool retrieves this in later cycles.
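Steps 4 and 5 above can be sketched as a single polling pass. The code below is a simplified illustration, not the Watchdog's implementation. It assumes the protocol files contain bare JSON (the real files are Markdown and may wrap it), uses a stand-in driver exposing only the execute() and observe() calls named in the text, and stores the stack trace in a hypothetical `error` field.

```python
import json
import tempfile
import traceback
from pathlib import Path


class StubDriver:
    """Stand-in driver for the sketch; a real driver talks to hardware or a simulator."""

    def execute(self, action: dict) -> None:
        pass  # pretend the action succeeded

    def observe(self) -> dict:
        return {"scene_graph": {"nodes": [], "edges": []}}


def run_once(action_file: Path, env_file: Path, driver) -> bool:
    """Process at most one pending action; return True if one was handled."""
    data = json.loads(action_file.read_text())
    action = next((a for a in data["queue"] if a["status"] == "pending"), None)
    if action is None:
        return False
    action["status"] = "running"
    action_file.write_text(json.dumps(data, indent=2))
    try:
        driver.execute(action)
        action["status"] = "completed"
    except Exception:
        action["status"] = "failed"
        action["error"] = traceback.format_exc()  # hypothetical field name
    # State write-back: refresh ENVIRONMENT.md, then record the final status.
    env_file.write_text(json.dumps(driver.observe(), indent=2))
    action_file.write_text(json.dumps(data, indent=2))
    return True


# Usage: one pending action in a throwaway workspace.
with tempfile.TemporaryDirectory() as tmp:
    action_file = Path(tmp) / "ACTION.md"
    env_file = Path(tmp) / "ENVIRONMENT.md"
    action_file.write_text(json.dumps({"queue": [
        {"action_id": "act_002", "action_type": "pick_up",
         "params": {"object_id": "apple_01"}, "status": "pending"},
    ]}))
    handled = run_once(action_file, env_file, StubDriver())
    final_status = json.loads(action_file.read_text())["queue"][0]["status"]
```

A real Watchdog would call something like run_once in a loop with a sleep between polls; the interval is not specified in the source.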