Troubleshooting

Common problems, causes, and solutions — organized by component.

Debugging Methodology

When something goes wrong, verify each layer in order. Don't skip steps — each layer depends on the one before it.

  1. Can it import? Run python -c "from hal.drivers.simulation import SimulationDriver" to verify that all dependencies resolve.
  2. Can it start? Launch the component on its own. Does it reach a ready state without immediately crashing?
  3. Can it execute? Feed it a minimal valid input. Does it produce output without errors?
  4. Can it write back? Verify the output lands in the correct state file with the expected format.
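
The first layer is easy to script. The sketch below is an illustration, not part of PhyAgentOS: can_import is a hypothetical helper that wraps the standard importlib.import_module call and reports the failure as a short message instead of a traceback. The module name in the example run is the one from step 1.

```python
import importlib


def can_import(module_name: str) -> tuple[bool, str]:
    """Try to import a module; return (ok, message) instead of raising ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    # Module name taken from step 1; swap in the component you are debugging.
    ok, message = can_import("hal.drivers.simulation")
    print("import layer:", "PASS" if ok else "FAIL", "-", message)
```

If this layer fails, fix it before moving on: the later layers cannot pass while the import is broken.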

Installation Issues

| Symptom | Cause | Solution |
| --- | --- | --- |
| pip install fails | Python version below 3.11 | Check with python --version. Install Python ≥ 3.11. |
| paos: command not found | Editable install not applied or in wrong environment | Run pip install -e . again in the project root. Verify with which paos. |
| Conda environment conflicts | Dependency version clashes from old env | Create a fresh conda environment: conda create -n phyagentos python=3.11 then reinstall. |
| ModuleNotFoundError: hal | PYTHONPATH doesn't include project root | Ensure you run commands from the project root directory, or set export PYTHONPATH=. |
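
Most of the installation checks above can be run from Python in one go. This is a minimal sketch using only the standard library; python_ok and command_path are hypothetical helper names, and the 3.11 minimum is the one stated in the table.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the table above


def python_ok(version=None) -> bool:
    """True if the given (or running) interpreter version is at least MIN_PYTHON."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def command_path(name: str):
    """Return the resolved path of a command on PATH, or None (similar to `which`)."""
    return shutil.which(name)


if __name__ == "__main__":
    print("python:", sys.version.split()[0], "OK" if python_ok() else "too old")
    print("paos:", command_path("paos") or "not found on PATH")
    # The interpreter path shows which environment is actually active.
    print("interpreter:", sys.executable)
```

If the interpreter path points at a different environment than the one you installed into, that usually explains both the missing paos command and the missing hal module.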

Watchdog Issues

| Symptom | Cause | Solution |
| --- | --- | --- |
| Driver not found | Driver name typo or plugin not registered | Verify driver name. For plugins, run the deploy script and check PhyAgentOS_plugin.toml. |
| Profile not installed | get_profile_path() returns None or invalid path | Check the driver's get_profile_path() implementation. The EMBODIED.md file must exist at the returned path. |
| Connection timeout | Driver connect() failed (network or hardware issue) | Check network connectivity to the robot. Verify IP/port in driver config. Try pinging the robot. |
| ACTION.md format error | Invalid JSON structure or status not "pending" | Verify JSON is valid. The status field must be "pending" for the Watchdog to consume the action. |
| Port conflicts | Multiple Watchdogs on the same port | Only one Watchdog instance per workspace. Use different workspaces or --driver-config with different ports per instance. |
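
The ACTION.md format check can be reproduced outside the Watchdog. The sketch below is hedged: check_action is a hypothetical helper, and it assumes the action is a JSON object with a top-level status field. How the JSON is embedded in ACTION.md is not specified here, so the function takes the JSON text itself.

```python
import json


def check_action(raw: str) -> list[str]:
    """Return a list of problems found in a JSON action payload (empty if none)."""
    try:
        action = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})"]
    if not isinstance(action, dict):
        return ["expected a JSON object at the top level"]
    status = action.get("status")  # assumption: status is a top-level field
    if status != "pending":
        return [f'status is {status!r}, expected "pending"']
    return []


if __name__ == "__main__":
    print(check_action('{"status": "pending"}'))
    print(check_action('{"status": "executing"}'))
    print(check_action('{"status": '))
```

The line and column in the JSON error message point at the first place the parser gave up, which is usually just after the actual mistake.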

Agent Issues

| Symptom | Cause | Solution |
| --- | --- | --- |
| LLM API call fails | Missing or invalid api_key in config.json | Verify api_key is set in ~/.PhyAgentOS/config.json. Check your LLM provider's API status. |
| Tool not available | Tool not configured in tools config | Check the tools section in config.json. Ensure the tool name matches the registered tool class. |
| Critic rejects all actions | EMBODIED.md doesn't match robot capabilities, or LESSONS.md blocking patterns exist | Check EMBODIED.md for accuracy: joint ranges, payload limits, workspace boundaries. Check LESSONS.md for accumulated rejection patterns. Try resetting LESSONS.md if it's polluted. |
| Agent stuck / no response | Watchdog not running or not consuming ACTION.md | Verify the Watchdog is running and consuming actions. Check ACTION.md status. If it is stuck at executing, the Watchdog may be hung on a long-running action. |
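
To confirm that an api_key is actually set without printing the secret, something like the following can help. It is a sketch under assumptions: the layout of config.json is not described here, so the hypothetical find_api_keys helper searches the whole document for any key named api_key rather than assuming where it lives.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".PhyAgentOS" / "config.json"  # path from the table above


def find_api_keys(node, path=""):
    """Yield (location, value) for every "api_key" entry in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key == "api_key":
                yield here, value
            else:
                yield from find_api_keys(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_api_keys(item, f"{path}[{index}]")


def report(config) -> list[str]:
    """Describe each api_key as set or EMPTY without revealing its value."""
    found = list(find_api_keys(config))
    if not found:
        return ["no api_key entry found"]
    return [
        f"{where}: {'set' if isinstance(value, str) and value.strip() else 'EMPTY'}"
        for where, value in found
    ]


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        for line in report(json.loads(CONFIG_PATH.read_text(encoding="utf-8"))):
            print(line)
    else:
        print(f"{CONFIG_PATH} does not exist")
```

A key that is reported as set can still be invalid or expired; this only rules out the missing-key case.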

Fleet Issues

| Symptom | Cause | Solution |
| --- | --- | --- |
| Multiple Watchdog port conflict | Two Watchdog instances sharing a network port | Use --driver-config to assign unique ports per instance. Each robot should have its own workspace. |
| ROBOTS.md not updating | Watchdog hasn't refreshed or write permissions issue | Restart the affected Watchdog. Check write permissions on the shared workspace directory. |
| Wrong ACTION.md target | robot_id in action parameters doesn't match intended robot | Verify robot_id in each action matches the intended robot's config. Check ORCHESTRATOR.md for task assignment. |
| Shared state not visible | ENVIRONMENT.md in shared workspace is stale or locked | Check ENVIRONMENT.md in the shared workspace. Verify each Watchdog is writing to its robots.<robot_id> key. |
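
Port conflicts can be caught before launching a second Watchdog. The sketch below makes two assumptions that are not confirmed here: that the driver listens on a TCP port, and that you keep a simple robot_id to port mapping for your fleet. Both helper names and the robot ids in the example are hypothetical.

```python
import socket
from collections import defaultdict


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def duplicate_ports(assignments: dict) -> dict:
    """Map each port assigned to more than one robot to the sorted robot ids."""
    by_port = defaultdict(list)
    for robot_id, port in assignments.items():
        by_port[port].append(robot_id)
    return {port: sorted(ids) for port, ids in by_port.items() if len(ids) > 1}


if __name__ == "__main__":
    planned = {"arm_1": 9000, "arm_2": 9001, "arm_3": 9000}  # example values only
    print("duplicates:", duplicate_ports(planned))
    for robot_id, port in planned.items():
        print(robot_id, port, "free" if port_is_free(port) else "in use")
```

duplicate_ports catches mistakes in the planned assignment, while port_is_free catches a port already held by some other process on the machine.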

Runtime Issues

| Symptom | Cause | Solution |
| --- | --- | --- |
| Preflight fails | Sensor config YAML paths incorrect or calibration files missing | Verify paths in configs/runtime/sensors/*.yaml point to real files. Check calibration files exist. |
| Session marked rejected | TARGETS.md perception config incompatible, or skill requirements unmet | Check perception section in TARGETS.md. Verify SKILLS.md declares the sensors the skill needs. Check preflight error details. |
| No episode.json generated | Session didn't complete (check status in SESSIONS.md) | Look at SESSIONS.md for the session's final status. Check LOG.md for error messages. |
| Perception outputs missing | Pipeline doesn't cover required output channels | Verify the perception YAML config includes all required output channels. Check the model's output keys match the pipeline config. |
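
Two of the checks above reduce to simple comparisons once you have the values in hand. This is a minimal sketch with hypothetical helper names; it assumes you have already extracted the file paths and channel names from your YAML configs, and the channel names in the example are placeholders.

```python
from pathlib import Path


def missing_files(paths, base_dir="."):
    """Return the entries of `paths` that do not resolve to an existing file."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).is_file()]


def missing_channels(required, provided):
    """Return required output channels absent from the pipeline config, sorted."""
    return sorted(set(required) - set(provided))


if __name__ == "__main__":
    # Placeholder channel names; use the ones your skill and pipeline declare.
    print(missing_channels(required=["rgb", "depth"], provided=["rgb"]))
```

Relative paths are resolved against base_dir, so pass the same directory the runtime uses when it reads the sensor configs.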

File Inspection Checklist

When you can't identify the root cause, inspect each state file in order. This is the most reliable fallback debugging method.

  1. ENVIRONMENT.md — Is the robot state correct? Look for stale poses, wrong connection_state, or missing sensor data. If state is frozen, the Watchdog may have stopped updating.
  2. ACTION.md — Are there pending actions? If the queue is growing, the Watchdog isn't consuming them. Check the Watchdog terminal for errors.
  3. EMBODIED.md — Does the profile match the actual robot? Joint ranges, control modes, and payload limits must reflect reality. A mismatched profile causes the Critic to reject valid actions.
  4. LESSONS.md — Any recent rejections? Each rejection includes a timestamp, robot_id, and reason. Scan for patterns — repeated rejections of the same action type indicate a systemic config issue.
  5. SESSIONS.md (runtime path only) — Check session statuses. If all are rejected, the preflight or config is the problem. If timeout, check execute_timeout values.
  6. LOG.md (runtime path only) — Look at the most recent entries. LOG.md records every session outcome with error details.
Pro tip: Use watch -n 1 'cat ACTION.md' to monitor a file in real time. Combine with separate Watchdog and Agent terminals to see the full loop.
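
A frozen ENVIRONMENT.md is easier to spot from modification times than from reading the file. The sketch below uses only the standard library; file_ages and stale are hypothetical helpers, and the 30 second threshold is an arbitrary example rather than a documented value.

```python
import os
import time


def file_ages(workspace, names, now=None):
    """Return {name: seconds since last modification, or None if the file is missing}."""
    now = time.time() if now is None else now
    ages = {}
    for name in names:
        path = os.path.join(workspace, name)
        ages[name] = now - os.path.getmtime(path) if os.path.exists(path) else None
    return ages


def stale(ages, max_age_s):
    """Return names whose file is missing or older than max_age_s seconds."""
    return [name for name, age in ages.items() if age is None or age > max_age_s]


if __name__ == "__main__":
    # Only files that should change while the loop is running are checked here.
    ages = file_ages(".", ["ENVIRONMENT.md", "ACTION.md"])
    for name, age in ages.items():
        label = "missing" if age is None else f"{age:.0f}s ago"
        print(f"{name:16} {label}")
    print("stale or missing:", stale(ages, max_age_s=30))
```

Static files such as EMBODIED.md are left out on purpose, since an old modification time is normal for them.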

Log Locations

| Component | Log Location | What to look for |
| --- | --- | --- |
| Watchdog | Terminal output (stdout) | Action consumption cycles, driver connect/disconnect events, execution errors |
| Agent | Terminal output (stdout) | LLM response times, tool call traces, Critic validation results |
| Runtime (LOG.md) | <workspace>/LOG.md | Session outcome table, error messages, return values |
| Gateway | Terminal output (stdout) | Channel connection/disconnection, message routing logs |
| Channels | Depends on channel implementation | Refer to the channel's own logging configuration (file or stdout) |