Running the Spirit Moz Robot

The Moz robot is a new-generation, high-DOF intelligent robot launched by Spirit AI (see Moz Robot).

This document describes:

  1. How to fine-tune the PI05 base model based on the Spirit open-source dataset, so that the fine-tuned model can control the Moz robot;
  2. How to deploy the inference environment and use the Moz robot to complete specific tasks.

Model Fine-tuning

Environment Installation

It is recommended to use a Python virtual environment with uv.

Please download TOS Browser, and contact Spirit after-sales support to obtain the key and download the source code.

Install dependencies

bash
cd openpi
GIT_LFS_SKIP_SMUDGE=1 uv sync
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e .

# ffmpeg library may be required
sudo apt-get install -y ffmpeg

Replace the installed transformers library

bash
# Confirm where transformers is installed, then overwrite it with the patched files
uv pip show transformers
cp -r ./src/openpi/models_pytorch/transformers_replace/* .venv/lib/python3.11/site-packages/transformers/
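
If your virtual environment uses a Python version other than 3.11, the site-packages path above will differ. As a convenience, here is a small sketch (run with uv run python) that resolves the installed transformers location programmatically; it does the same thing as the cp command above:

python
# Sketch: copy the patched files into whatever site-packages directory the venv uses,
# instead of hard-coding the python3.11 path.
import os
import shutil

import transformers

src = "./src/openpi/models_pytorch/transformers_replace"
dst = os.path.dirname(transformers.__file__)  # e.g. .venv/lib/python3.11/site-packages/transformers
shutil.copytree(src, dst, dirs_exist_ok=True)
print(f"Copied patched files into {dst}")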

Prepare Official Checkpoint and Dataset

Download official checkpoint

python
# Execute using uv run python xxx.py
from openpi.shared import download
checkpoint_dir = download.maybe_download("gs://openpi-assets/checkpoints/pi05_base")
print(checkpoint_dir)

Refer to the "Converting JAX Models to PyTorch" section in the official repository to convert the JAX model to a PyTorch model.

Download dataset: Use the same TOS key to download the Pi source code and Spirit dataset.

Compute dataset statistics (stats), which will be used during training:

The default dataset path is ~/.cache/huggingface/lerobot/spirit-ai/pickplace. To change it, edit the repo_id of the "pi05_moz" TrainConfig in src/openpi/training/config.py from spirit-ai/pickplace to the absolute path of your dataset.

Then execute

bash
uv run python scripts/compute_norm_stats.py --config-name=pi05_moz

The computed norm stats will be written to openpi/assets/pi05_moz. If repo_id is specified as an absolute path, the norm stats will be saved to the directory specified by repo_id.

Note: If the dataset is large and computing norm stats over all of it would take too long, you can compute them from a limited number of sampled batches instead of the entire dataset, for example by adjusting the main loop in scripts/compute_norm_stats.py along these lines:

python
# Inside the existing loop of scripts/compute_norm_stats.py; num_batches, data_loader,
# keys and stats are variables already defined by the script.
max_batches = 20000 // config.batch_size  # sample roughly 20,000 frames in total
limit = min(num_batches, max_batches)
for i, batch in enumerate(tqdm.tqdm(data_loader, total=limit, desc="Computing stats")):
    if i >= limit:
        break
    for key in keys:
        stats[key].update(np.asarray(batch[key]))
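
Whether you use the full dataset or the sampled variant above, it is worth spot-checking the written stats afterwards. A minimal sketch, assuming the stats land in a norm_stats.json file under the output directory described above (verify the actual file name and path on your machine):

python
# Sketch: print a summary of the computed normalization stats.
# The norm_stats.json name/location is an assumption; check what compute_norm_stats.py
# actually wrote under openpi/assets/pi05_moz (or your repo_id directory).
import json
import pathlib

stats_path = pathlib.Path("assets/pi05_moz/norm_stats.json")  # adjust to the real output path
stats = json.loads(stats_path.read_text())
for key, value in stats.items():
    fields = list(value) if isinstance(value, dict) else type(value).__name__
    print(key, fields)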

Execute Training

Single-machine, 4-GPU example

bash
CUDA_VISIBLE_DEVICES=1,2,3,4 uv run torchrun --standalone --nnodes=1 --nproc_per_node=4 scripts/train_pytorch.py pi05_moz --exp_name YOUR_EXP_NAME --data.repo-id "PATH_TO_YOUR_DATASET"

If the downloaded base model is not in the default location, you can specify the model loading path by setting pytorch_weight_path in TrainConfig.

Inference

Environment Installation

First, complete the source code download and dependency installation by following the Environment Installation section in the "Model Fine-tuning" chapter.

mozrobot SDK Installation

Download mozrobot SDK version 0.1.0 (refer to the Moz Resources page), and configure and install according to its documentation.

Notes:

  • To properly connect to the robot, you need to configure the network IP segment correctly and install ROS 2.
  • Avoid installing mozrobot into the previously created uv virtual environment, as dependency conflicts may occur. Install it into the system Python instead, or preferably use Docker for environment isolation.

Execute Inference

Start Inference Service

bash
cd openpi/
uv run scripts/serve_policy.py --env=MOZ --default_prompt='Pick up the marker pen.'

The above command will load the model from /openpi_assets/checkpoints/pi05_pickplace/.

To specify a custom model path:

bash
uv run scripts/serve_policy.py --default-prompt='Pick up the marker pen.' \
    policy:checkpoint \
      --policy.config=pi05_moz \
      --policy.dir=/openpi_assets/20251120_official_pi05_cleantable_iter3w/30000/

Start Robot Inference

Use the system Python to start robot inference. Before doing so, install the necessary dependencies:

bash
cd openpi/
uv pip compile ./packages/openpi-client/pyproject.toml -o /tmp/requirements.txt
pip install -r /tmp/requirements.txt

pip install typing_extensions tyro

bash
export PYTHONPATH=$(pwd):$(pwd)/packages/openpi-client/src/:$PYTHONPATH
python3 examples/moz1_real/main.py

Parameter description

  • args.host: IP of the inference service;
  • args.port: Port of the inference service;
  • args.realsense-serials: RealSense serial numbers, in order: head camera, left wrist camera, right wrist camera (see mozrobot SDK documentation for details);
  • args.structure: Robot configuration, must match the actual robot, usually wholebody_without_base;
  • args.action-horizon: How many actions to execute (at the robot control frequency) before triggering the next inference (see the sketch after this list);
  • args.max-episode-steps: Maximum number of action steps allowed.
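
To make the role of these parameters concrete, below is a minimal sketch of the control loop that examples/moz1_real/main.py implements, using the openpi-client remote-inference API. Everything robot-specific (the get_observation / apply_action helpers and the observation contents) is a placeholder; consult examples/moz1_real/main.py and the mozrobot SDK for the real interfaces.

python
# Sketch of the client-side loop (placeholders marked); not the actual main.py.
from openpi_client import websocket_client_policy

HOST, PORT = "192.168.1.10", 8000   # args.host / args.port of the inference service
ACTION_HORIZON = 150                # args.action-horizon: steps executed per inference
MAX_EPISODE_STEPS = 2000            # args.max-episode-steps


def get_observation() -> dict:
    """Placeholder: gather camera images and robot state via the mozrobot SDK."""
    raise NotImplementedError


def apply_action(action) -> None:
    """Placeholder: send one command to the robot via the mozrobot SDK."""
    raise NotImplementedError


policy = websocket_client_policy.WebsocketClientPolicy(host=HOST, port=PORT)

steps = 0
while steps < MAX_EPISODE_STEPS:
    obs = get_observation()
    # The server returns a chunk of actions; the real script also interpolates the
    # 30 Hz chunk up to the 120 Hz control rate (see the FAQ on action_horizon).
    actions = policy.infer(obs)["actions"]
    for action in actions[:ACTION_HORIZON]:   # execute part of the chunk, then re-infer
        apply_action(action)
        steps += 1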

Docker-based Inference

Build the image. Before building, place mozrobot-0.1.2.zip in the thirdparty directory:

bash
docker compose -f examples/moz1_real/compose.yml build

Start container

bash
export SERVER_ARGS="--env=MOZ --default_prompt='Pick up the marker pen.'"
docker compose -f examples/moz1_real/compose.yml up

Enter the container corresponding to the moz1_real image

bash
docker exec -it <container_name_of_moz1_real> bash

Execute robot inference inside the container

bash
export PYTHONPATH=$(pwd):$(pwd)/packages/openpi-client/src/:$PYTHONPATH
python3 examples/moz1_real/main.py

FAQ

Timeout when sending the first request to the model inference service

The service will execute torch.compile during the first inference, which may cause the first request to take a long time.

It is recommended to wait a moment and then send the request to the inference service again.

About action_horizon

The current model returns 50 frames of actions per inference (at 30 Hz), which can be interpolated to 200 frames of actions (matching the robot control frequency of 120 Hz). It is recommended to set args.action-horizon between 100 and 200.
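
For reference, a minimal sketch of the kind of interpolation described above (the real logic lives in examples/moz1_real/main.py; the shapes and action dimension here are illustrative):

python
# Sketch: linearly up-sample a 50-step action chunk (30 Hz) to 200 steps (120 Hz).
import numpy as np


def upsample_actions(chunk: np.ndarray, factor: int = 4) -> np.ndarray:
    """chunk: (T, action_dim) at 30 Hz -> (T * factor, action_dim) at 120 Hz."""
    t_src = np.arange(chunk.shape[0])
    t_dst = np.linspace(0, chunk.shape[0] - 1, chunk.shape[0] * factor)
    return np.stack(
        [np.interp(t_dst, t_src, chunk[:, d]) for d in range(chunk.shape[1])], axis=1
    )


chunk = np.zeros((50, 16))        # illustrative: 50 frames, 16-dim action
dense = upsample_actions(chunk)   # (200, 16); args.action-horizon selects 100-200 of these steps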

If the robot doesn't move, or actions are abnormal

First troubleshoot with the mozrobot SDK documentation: check that robot data is being transmitted normally and that camera images are being captured correctly. Then check the data processing pipeline, and only then look for issues with the model itself.

It is recommended to start with simple tasks for step-by-step verification.