Running Spirit AI Moz Robot

The Moz robot is a new-generation, high-DOF intelligent robot launched by Spirit AI (see Moz Robot).

This document covers:

How to fine-tune the PI05 base model on Spirit AI's open-source datasets so that the fine-tuned model can control Moz robots;
How to deploy the inference environment and use a Moz robot to complete specific tasks.

Model Fine-tuning

Environment Installation

Using a uv-managed Python virtual environment is recommended.

Download source code

bash

# TODO

Install dependencies

bash

cd openpi
GIT_LFS_SKIP_SMUDGE=1 uv sync
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e .

# ffmpeg library may be required
sudo apt-get install -y ffmpeg

Replace the installed transformers library

bash

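# Check which transformers version is installed and where it lives
uv pip show transformers
# Overwrite files in the installed package; adjust python3.11 if .venv uses a different Python version
cp -r ./src/openpi/models_pytorch/transformers_replace/* .venv/lib/python3.11/site-packages/transformers/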
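
The python3.11 segment in the cp target above depends on the interpreter version inside .venv. As a hedged alternative to hard-coding it, the package directory can be looked up from Python. This is a minimal sketch that uses only the standard library; the helper name package_dir is hypothetical and is not part of openpi.

```python
import importlib.util
from pathlib import Path


def package_dir(name: str) -> Path:
    """Return the directory an importable package is installed in."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"package {name!r} is not installed")
    # spec.origin points at the package's __init__.py; its parent is the package directory.
    return Path(spec.origin).parent


if __name__ == "__main__":
    # Run inside the uv environment (uv run python <this file>) so the lookup sees .venv.
    try:
        print(package_dir("transformers"))
    except ModuleNotFoundError as err:
        print(err)
```

The printed path can then be used as the destination of the cp command.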

Prepare Official Checkpoint and Dataset

Download official checkpoint

python

# Execute using uv run python xxx.py
from openpi.shared import download
checkpoint_dir = download.maybe_download("gs://openpi-assets/checkpoints/pi05_base")
print(checkpoint_dir)

Refer to the "Converting JAX Models to PyTorch" section in the official repository to convert the JAX model to a PyTorch model.

Download dataset

bash

# TODO

Compute dataset statistics (stats), which will be used during training:

The default dataset path is ~/.cache/huggingface/lerobot/spirit-ai/pickplace. To use a different location, change the repo_id of the TrainConfig named "pi05_moz" in src/openpi/training/config.py from spirit-ai/pickplace to the absolute path of the actual dataset.
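
The path rule described above can be sketched as follows. This is an illustration only: resolve_dataset_dir is a hypothetical helper, not an openpi or LeRobot function, and it assumes the cache root has not been moved through environment variables.

```python
from pathlib import Path

# Default cache root quoted in the text; assumed not to be overridden by environment variables.
LEROBOT_CACHE = Path.home() / ".cache" / "huggingface" / "lerobot"


def resolve_dataset_dir(repo_id: str) -> Path:
    """Map a repo_id to the directory the dataset is expected in."""
    candidate = Path(repo_id).expanduser()
    if candidate.is_absolute():
        # An absolute path is used as-is.
        return candidate
    # Otherwise the repo_id is a name relative to the cache root.
    return LEROBOT_CACHE / repo_id


if __name__ == "__main__":
    print(resolve_dataset_dir("spirit-ai/pickplace"))
```

Checking that the resolved directory exists before computing norm stats is a cheap way to catch a mistyped repo_id.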

Then execute

bash

python scripts/compute_norm_stats.py --config-name=pi05_moz

The computed norm stats will be written to openpi/assets/pi05_moz. If repo_id is specified as an absolute path, the norm stats will be saved to the directory specified by repo_id.

Note: If the dataset is large, computing norm stats over all of the data takes a long time. In that case, you can compute them from a limited number of batches instead of the entire dataset:

python

# Cap the pass at roughly 20,000 samples, whatever the batch size.
max_batches = 20000 // config.batch_size
limit = min(num_batches, max_batches)
for i, batch in enumerate(tqdm.tqdm(data_loader, total=limit, desc="Computing stats")):
    # Stop once the sampled batches have been consumed.
    if i >= limit:
        break
    for key in keys:
        stats[key].update(np.asarray(batch[key]))
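
The excerpt above relies on objects defined elsewhere in the script (config, data_loader, num_batches, keys, stats). To show the same capping idea in isolation, here is a self-contained sketch in plain Python. RunningStats and stats_from_sample are hypothetical stand-ins written for this illustration; they are not the classes the script uses.

```python
import math
from typing import Iterable, List, Sequence


class RunningStats:
    """Accumulates per-dimension mean and std over batches of rows."""

    def __init__(self) -> None:
        self.count = 0
        self.total: List[float] = []
        self.total_sq: List[float] = []

    def update(self, batch: Sequence[Sequence[float]]) -> None:
        for row in batch:
            if not self.total:
                self.total = [0.0] * len(row)
                self.total_sq = [0.0] * len(row)
            for d, value in enumerate(row):
                self.total[d] += value
                self.total_sq[d] += value * value
            self.count += 1

    def mean(self) -> List[float]:
        return [s / self.count for s in self.total]

    def std(self) -> List[float]:
        # Population std; clamp tiny negative values caused by rounding.
        return [
            math.sqrt(max(sq / self.count - m * m, 0.0))
            for sq, m in zip(self.total_sq, self.mean())
        ]


def stats_from_sample(batches: Iterable[Sequence[Sequence[float]]],
                      batch_size: int, max_samples: int = 20000) -> RunningStats:
    """Consume at most max_samples // batch_size batches (at least one), then stop."""
    limit = max(1, max_samples // batch_size)
    stats = RunningStats()
    for i, batch in enumerate(batches):
        if i >= limit:
            break
        stats.update(batch)
    return stats


if __name__ == "__main__":
    data = [[[1.0, 2.0], [3.0, 4.0]]] * 5  # five identical batches of two rows
    s = stats_from_sample(data, batch_size=2, max_samples=4)
    print(s.count, s.mean(), s.std())
```

Taking only the first batches is representative only if the loader shuffles the data; with an unshuffled loader the sample may cover just the first few episodes.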

Execute Training

Single machine 4-GPU example

bash

# GPU indices are zero-based; list the four devices you want to use
CUDA_VISIBLE_DEVICES=0,1,2,3 uv run torchrun --standalone --nnodes=1 --nproc_per_node=4 scripts/train_pytorch.py pi05_moz --exp_name YOUR_EXP_NAME --data.repo-id "PATH_TO_YOUR_DATASET"

If the downloaded base model is not in the default location, you can specify the model loading path by setting pytorch_weight_path in TrainConfig.

Inference

Environment Installation

First, follow the Environment Installation section under "Model Fine-tuning" to download the source code and install the dependencies.

mozrobot SDK Installation

Download and extract mozrobot SDK

bash

# TODO

unzip mozrobot-x.x.x.zip

Please refer to the documentation included with the mozrobot package for configuration and installation. Note: To correctly connect to the robot, you need to properly configure the network IP segment and install ROS 2.

Tip: Avoid installing mozrobot into the uv virtual environment created earlier, because dependency conflicts are possible. Install it into the system Python instead, or preferably use Docker for environment isolation.

Execute Inference

Start Inference Service

bash

cd openpi/
uv run scripts/serve_policy.py --env=MOZ --default_prompt='Pick up the marker pen.'

The above command will load the model from /openpi_assets/checkpoints/pi05_pickplace/.

To specify a custom model path:

bash

uv run scripts/serve_policy.py --default_prompt='Pick up the marker pen.' \
    policy:checkpoint \
      --policy.config=pi05_moz \
      --policy.dir=/openpi_assets/20251120_official_pi05_cleantable_iter3w/30000/

Start Robot Inference

Robot inference runs under the system Python. Install the required dependencies first:

bash

cd openpi/
uv pip compile ./packages/openpi-client/pyproject.toml -o /tmp/requirements.txt
pip install -r /tmp/requirements.txt

pip install typing_extensions tyro

Then start robot inference:

bash

export PYTHONPATH=$(pwd):$(pwd)/packages/openpi-client/src/:$PYTHONPATH
python3 examples/moz1_real/main.py

Parameter Description

args.host: IP of the inference service;
args.port: Port of the inference service;
args.realsense-serials: RealSense camera serial numbers, in this order: head camera, left wrist camera, right wrist camera (see the mozrobot SDK documentation for details);
args.structure: Robot configuration, which must match the actual robot; usually wholebody_without_base;
args.action-horizon: Number of actions to execute before triggering the next inference (counted at the robot control frequency);
args.max-episode-steps: Maximum number of action steps allowed.
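
How args.action-horizon and args.max-episode-steps interact can be sketched with a small receding-horizon loop. This is an illustration under stated assumptions, not the code in examples/moz1_real/main.py: run_episode, infer, and execute are hypothetical names, and the policy is replaced by a stub.

```python
from typing import Callable, List, Sequence

Action = Sequence[float]


def run_episode(infer: Callable[[], List[Action]],
                execute: Callable[[Action], None],
                action_horizon: int,
                max_episode_steps: int) -> int:
    """Execute up to action_horizon actions per inference until the step budget is spent.

    Returns the number of inference calls that were made.
    """
    if action_horizon < 1:
        raise ValueError("action_horizon must be at least 1")
    steps = 0
    inferences = 0
    while steps < max_episode_steps:
        chunk = infer()
        inferences += 1
        if not chunk:
            break  # nothing to execute; avoid looping forever
        # Only the first action_horizon actions of the chunk are executed.
        for action in chunk[:action_horizon]:
            if steps >= max_episode_steps:
                break
            execute(action)
            steps += 1
    return inferences


if __name__ == "__main__":
    executed: List[Action] = []
    # Stub policy that always returns a chunk of 200 zero actions.
    calls = run_episode(lambda: [[0.0]] * 200, executed.append,
                        action_horizon=100, max_episode_steps=250)
    print(calls, len(executed))
```

A smaller action horizon re-plans more often at the cost of more inference calls; a larger one executes more of each predicted chunk open-loop.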

Docker-based Inference

Build the image. Before building, place mozrobot-0.1.2.zip in the thirdparty directory.

bash

docker compose -f examples/moz1_real/compose.yml build

Start container

bash

export SERVER_ARGS="--env=MOZ --default_prompt='Pick up the marker pen.'"
docker compose -f examples/moz1_real/compose.yml up

Enter the container corresponding to the moz1_real image

bash

docker exec -it <container_name_of_moz1_real> bash

Execute robot inference inside the container

bash

export PYTHONPATH=$(pwd):$(pwd)/packages/openpi-client/src/:$PYTHONPATH
python3 examples/moz1_real/main.py

FAQ

Timeout issue when sending first request to model inference service

The service runs torch.compile during the first inference, so the first request can take noticeably longer than later ones and may time out on the client side.

If that happens, wait a moment and then send the request to the inference service again.
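
The wait-and-resend advice can be automated with a small retry wrapper. This is a minimal sketch: call_with_retry is a hypothetical helper, and it assumes the client surfaces the timeout as Python's built-in TimeoutError, which may not match the exception the actual client raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(request: Callable[[], T],
                    attempts: int = 5,
                    wait_s: float = 10.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call request(); on TimeoutError wait wait_s seconds and retry, up to `attempts` calls."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except TimeoutError:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(wait_s)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_request() -> str:
        # Stub: the first two calls time out, as if the server were still compiling.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TimeoutError("server still compiling")
        return "ok"

    print(call_with_retry(flaky_request, wait_s=0.1))
```

Sending one throwaway warm-up request right after the service starts serves the same purpose before the robot is in the loop.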

About action_horizon

The current model returns 50 frames of actions (30 Hz) per inference, about 1.67 s of motion, which can be interpolated to 200 frames of actions to match the robot control frequency of 120 Hz (50 × 120 / 30 = 200). Setting the args.action-horizon parameter between 100 and 200 is recommended.
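
The 50-to-200 frame conversion can be illustrated with plain linear interpolation. This is a sketch under assumptions: upsample_actions is a hypothetical helper, and the actual client may interpolate differently (for example per joint type or with smoothing). Target frames that fall after the last source frame simply hold its value.

```python
from typing import List, Sequence


def upsample_actions(actions: Sequence[Sequence[float]],
                     src_hz: float = 30.0,
                     dst_hz: float = 120.0) -> List[List[float]]:
    """Linearly interpolate an action chunk from src_hz to dst_hz."""
    if not actions:
        return []
    n_out = round(len(actions) * dst_hz / src_hz)
    last = len(actions) - 1
    out: List[List[float]] = []
    for j in range(n_out):
        pos = j * src_hz / dst_hz          # position measured in source frames
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        # Past the final source frame there is nothing to blend with, so hold it.
        frac = pos - lo if hi > lo else 0.0
        out.append([a + (b - a) * frac for a, b in zip(actions[lo], actions[hi])])
    return out


if __name__ == "__main__":
    chunk = [[float(k)] for k in range(50)]  # 50 one-dimensional frames at 30 Hz
    dense = upsample_actions(chunk)
    print(len(dense), dense[1], dense[4], dense[-1])
```

With 200 interpolated frames available, an action horizon of 100 executes half of each chunk before the next inference, and 200 executes all of it.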

If the robot doesn't move or actions are abnormal

Please first troubleshoot according to the mozrobot SDK documentation: whether robot data is being transmitted normally, whether camera images are being captured correctly; then check the data processing pipeline, and finally locate issues with the model itself.

It is recommended to start with simple tasks and verify step by step.
