
aloha's Introduction

ALOHA: A Low-cost Open-source Hardware System for Bimanual Teleoperation

This codebase contains the implementation for teleoperation and data collection with the ALOHA hardware. To build ALOHA, follow the Hardware Assembly Tutorial and the quick start guide below. To train imitation learning algorithms, you will also need to install ACT.

Repo Structure

  • config: a config file for each robot, designating the port it should bind to; more details in the quick start guide.
  • launch: a ROS launch file for all 4 cameras and all 4 robots.
  • aloha_scripts: Python code for teleoperation and data collection.

Quick start guide

Hardware selection

We suggest using a "heavy-duty" computer if possible.

In particular, at least 6 USB3 ports are needed: 4 for the robot connections and 2 for the cameras. We have seen cases where a machine could not maintain a stable connection to all 4 robot arms simultaneously over USB, especially when USB hubs are used.

Software selection -- OS:

Currently tested and working configurations:

  • ✅ Ubuntu 18.04 + ROS 1 noetic
  • ✅ Ubuntu 20.04 + ROS 1 noetic

Ongoing testing (compatibility effort underway):

  • 🚧 ROS 2

Software installation - ROS:

  1. Install ROS and the Interbotix software following https://docs.trossenrobotics.com/interbotix_xsarms_docs/
  2. This will create the directory ~/interbotix_ws, which contains src.
  3. git clone this repo inside ~/interbotix_ws/src
  4. source /opt/ros/noetic/setup.sh && source ~/interbotix_ws/devel/setup.sh
  5. sudo apt-get install ros-noetic-usb-cam && sudo apt-get install ros-noetic-cv-bridge
  6. Run catkin_make inside ~/interbotix_ws and make sure the build is successful.
  7. Go to ~/interbotix_ws/src/interbotix_ros_toolboxes/interbotix_xs_toolbox/interbotix_xs_modules/src/interbotix_xs_modules/arm.py and find the function publish_positions. Change self.T_sb = mr.FKinSpace(self.robot_des.M, self.robot_des.Slist, self.joint_commands) to self.T_sb = None. This prevents the code from computing forward kinematics at every step, which delays teleoperation (see the excerpt below).
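For reference, a minimal excerpt of how the patched line should look (only the changed line is shown; the rest of publish_positions in arm.py stays as-is):

# Inside interbotix_xs_modules/arm.py, in publish_positions() -- excerpt only.
# Original line (computes forward kinematics on every command, adding latency):
#     self.T_sb = mr.FKinSpace(self.robot_des.M, self.robot_des.Slist, self.joint_commands)
# Patched line (skips the per-step FK computation):
self.T_sb = None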

Hardware installation:

The goal of this section is to run roslaunch aloha 4arms_teleop.launch, which starts communication with 4 robots and 4 cameras. It should work after finishing the following steps:

Step 1: Connect the 4 robots to the computer via USB and power them on. Do not use extension cables or USB hubs.

  • To check if the robots are connected, install Dynamixel Wizard here

  • Dynamixel Wizard is a very helpful debugging tool that connects to individual motors of the robot. It allows things such as rebooting a motor (very useful!), turning torque on/off, and sending commands. However, it has no knowledge of the robot's kinematics, so be careful about collisions. The robot will collapse if motor torque is turned off, i.e. there are no automatically engaged brakes in the joints.

  • Open Dynamixel wizard, go into options and select:

    • Protocol 2.0
    • All ports
    • 1000000 bps
    • ID range from 0-10
  • Note: repeat the above every time before you scan.

  • Then hit Scan. 4 devices should show up, each with 9 motors.

  • One issue that arises is that the port each robot binds to can change over time, e.g. a robot that was initially on ttyUSB0 might suddenly become ttyUSB5. To resolve this, we bind each robot to a fixed symlink port with the following mapping:

    • ttyDXL_master_right: right master robot (master: the robot that the operator would be holding)
    • ttyDXL_puppet_right: right puppet robot (puppet: the robot that performs the task)
    • ttyDXL_master_left: left master robot
    • ttyDXL_puppet_left: left puppet robot
  • Take ttyDXL_master_right (the right master robot) as an example:

    1. Find the port that the right master robot is currently bound to, e.g. ttyUSB0

    2. Run udevadm info --name=/dev/ttyUSB0 --attribute-walk | grep serial to obtain the serial number. Use the first one that shows up; the format should look similar to FT6S4DSP.

    3. sudo vim /etc/udev/rules.d/99-fixed-interbotix-udev.rules and add the following line:

      SUBSYSTEM=="tty", ATTRS{serial}=="<serial number here>", ENV{ID_MM_DEVICE_IGNORE}="1", ATTR{device/latency_timer}="1", SYMLINK+="ttyDXL_master_right"
      
    4. This will make sure the right master robot is always bound to ttyDXL_master_right

    5. Repeat for the remaining 3 arms.

  • To apply the changes, run sudo udevadm control --reload && sudo udevadm trigger

  • If successful, you should be able to find the ttyDXL* symlinks under /dev (the quick check below can verify this)
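As a quick sanity check, here is a minimal sketch (not part of the repo) that verifies all four symlinks exist and resolve to a real port:

#!/usr/bin/env python3
# Quick check (illustrative, not part of aloha_scripts): confirm that the
# four fixed robot symlinks from the udev rules above resolve to real ports.
import os

expected = [
    "/dev/ttyDXL_master_left",
    "/dev/ttyDXL_master_right",
    "/dev/ttyDXL_puppet_left",
    "/dev/ttyDXL_puppet_right",
]

for link in expected:
    if os.path.islink(link):
        print(f"{link} -> {os.path.realpath(link)}")
    else:
        print(f"MISSING: {link} (re-check the udev rule, then re-run udevadm trigger)")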

Step 2: Set max current for gripper motors

  • Open Dynamixel Wizard and select the gripper motor of a puppet arm; its name should be [ID:009] XM430-W350.
  • Tip: the LED on the base of the robot flashes while it is talking to Dynamixel Wizard, which helps determine which robot is selected.
  • Find register 38 Current Limit, enter 200, then hit Save at the bottom.
  • Repeat this for both puppet robots.
  • This limits the max current through the gripper motors to prevent overload errors (a programmatic alternative is sketched below).
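If you prefer to set the limit programmatically, here is a sketch adapted from the "How to set the current limit on a gripper using python api" issue further down. It assumes the service names used by the 4arms_teleop launch and that the launch is running:

#!/usr/bin/env python
# Sketch adapted from the register-setting issue below: write Current_Limit=200
# to both puppet grippers over the interbotix register service instead of
# using Dynamixel Wizard. Assumes roslaunch aloha 4arms_teleop.launch is active.
import rospy
from interbotix_xs_msgs.srv import RegisterValues

rospy.init_node('set_gripper_current_limit')
for side in ('left', 'right'):
    srv = f'/puppet_{side}/set_motor_registers'
    rospy.wait_for_service(srv)
    set_registers = rospy.ServiceProxy(srv, RegisterValues)
    set_registers('single', 'gripper', 'Current_Limit', 200)

Note that the "Grippers not respecting Current_Limit or Goal_Current" issue below reports that limits set this way were not always respected by the motor, so verify the value in Dynamixel Wizard afterwards.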

Step 3: Setup 4 cameras

  • You may use a USB hub here, but connect at most 2 cameras per hub to keep latency reasonable.

  • To make sure all 4 cameras bind to consistent ports, similar steps are needed.

  • Cameras bind to /dev/video{0, 1, 2, ...} by default, while we want the symlinks {CAM_RIGHT_WRIST, CAM_LEFT_WRIST, CAM_LOW, CAM_HIGH}

  • Take CAM_RIGHT_WRIST as an example, and let's say it is currently bound to /dev/video0. Run udevadm info --name=/dev/video0 --attribute-walk | grep serial to obtain its serial number. Use the first one that shows up; the format should look similar to 0E1A2B2F.

  • Then sudo vim /etc/udev/rules.d/99-fixed-interbotix-udev.rules and add the following line

    SUBSYSTEM=="video4linux", ATTRS{serial}=="<serial number here>", ATTR{index}=="0", ATTRS{idProduct}=="085c", ATTR{device/latency_timer}="1", SYMLINK+="CAM_RIGHT_WRIST"
    
  • Repeat this for {CAM_LEFT_WRIST, CAM_LOW, CAM_HIGH} in addition to CAM_RIGHT_WRIST

  • To apply the changes, run sudo udevadm control --reload && sudo udevadm trigger

  • If successful, you should be able to find {CAM_RIGHT_WRIST, CAM_LEFT_WRIST, CAM_LOW, CAM_HIGH} under /dev (see the quick check below)
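As a quick sanity check, here is a small sketch (not part of the repo) that grabs one frame from each camera symlink, using the opencv-python package installed in the conda environment below:

#!/usr/bin/env python3
# Quick check (illustrative): read one frame from each camera symlink to
# confirm the udev rules work. Close any other app using the webcams first.
import cv2

cameras = ["/dev/CAM_HIGH", "/dev/CAM_LOW", "/dev/CAM_LEFT_WRIST", "/dev/CAM_RIGHT_WRIST"]

for dev in cameras:
    cap = cv2.VideoCapture(dev)
    ok, frame = cap.read()
    print(f"{dev}: {'OK, frame shape ' + str(frame.shape) if ok else 'FAILED to read a frame'}")
    cap.release()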

At this point, open a new terminal:

conda deactivate # if conda shows up by default
source /opt/ros/noetic/setup.sh && source ~/interbotix_ws/devel/setup.sh
roslaunch aloha 4arms_teleop.launch

If no error messages show up, the computer is successfully connected to all 4 cameras and all 4 robots.

Troubleshooting

  • Make sure Dynamixel Wizard is disconnected and no other app is using the webcams' streams, as this will prevent ROS from connecting to these devices.

Software installation - Conda:

conda create -n aloha python=3.8.10
conda activate aloha
pip install torchvision
pip install torch
pip install pyquaternion
pip install pyyaml
pip install rospkg
pip install pexpect
pip install mujoco==2.3.7
pip install dm_control==1.0.14
pip install opencv-python
pip install matplotlib
pip install einops
pip install packaging
pip install h5py

Testing teleoperation

Notice: Before running the commands below, be sure to place all 4 robots in their sleep positions and open the master robots' grippers. All robots will then rise to a height convenient for teleoperation.

# ROS terminal
conda deactivate
source /opt/ros/noetic/setup.sh && source ~/interbotix_ws/devel/setup.sh
roslaunch aloha 4arms_teleop.launch

# Right hand terminal
conda activate aloha
cd ~/interbotix_ws/src/aloha/aloha_scripts
python3 one_side_teleop.py right

# Left hand terminal
conda activate aloha
cd ~/interbotix_ws/src/aloha/aloha_scripts
python3 one_side_teleop.py left

Teleoperation will start when the master-side gripper is closed.
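For intuition, here is a hypothetical sketch of such a trigger. This is not the repo's actual implementation in one_side_teleop.py; the topic name, joint index, and threshold are illustrative assumptions. It simply watches the master arm's joint states and reports when the gripper joint drops below a closing threshold.

#!/usr/bin/env python3
# Hypothetical illustration of a "start on gripper close" trigger; not the
# code used by one_side_teleop.py. Topic name, joint index, and threshold
# are assumptions made for this sketch.
import rospy
from sensor_msgs.msg import JointState

CLOSE_THRESH = 0.3  # assumed gripper joint position that counts as "closed"

def on_joint_states(msg):
    gripper_pos = msg.position[-1]  # assume the gripper joint is the last entry
    if gripper_pos < CLOSE_THRESH:
        rospy.loginfo("Master gripper closed -- teleoperation would start here")

if __name__ == "__main__":
    rospy.init_node("gripper_close_monitor")
    rospy.Subscriber("/master_right/joint_states", JointState, on_joint_states)
    rospy.spin()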

Example Usages

To set up a new terminal, run:

conda activate aloha
cd ~/interbotix_ws/src/aloha/aloha_scripts

The one_side_teleop.py we ran above is only for testing teleoperation and performs no data collection. To collect data for an episode, run:

python3 record_episodes.py --dataset_dir <data save dir> --episode_idx 0

This will store an HDF5 file in <data save dir>. To change the episode length and other parameters, edit constants.py directly.
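To inspect a saved episode, here is a minimal sketch assuming the dataset layout described in the "train" issue below (the file name episode_0.hdf5 is illustrative; check the actual name written to <data save dir>):

#!/usr/bin/env python3
# Minimal sketch for inspecting a recorded episode. Keys follow the dataset
# layout quoted in the issues section below; the file name is illustrative.
import h5py

dataset_path = "<data save dir>/episode_0.hdf5"

with h5py.File(dataset_path, "r") as root:
    qpos = root["/observations/qpos"][()]                  # (T, 14) float64
    action = root["/action"][()]                           # (T, 14) float64
    cam_high = root["/observations/images/cam_high"][()]   # (T, 480, 640, 3) uint8
    print(qpos.shape, action.shape, cam_high.shape)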

To visualize the episode collected, run:

python3 visualize_episodes.py --dataset_dir <data save dir> --episode_idx 0

To replay a collected episode on the real robot, run:

python3 replay_episodes.py --dataset_dir <data save dir> --episode_idx 0

To lower all 4 robots before, e.g., cutting power, run:

python3 sleep.py

aloha's People

Contributors

ayzaan, goodrichse-google, jonathantompson, peteflorence, tonyzhaozh


aloha's Issues

conda python version 3.8.16 libffi error

Hey Tony,

The latest 3.8.16 version of Python is broken on Ubuntu 20.04 and will give the error below (when executing record_episodes.py):

from cv_bridge.boost.cv_bridge_boost import getCvType
ImportError: /lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: ffi_type_pointer, version LIBFFI_BASE_7.0

This is due to an incompatibility between the libffi shipped with 20.04 and the version Python 3.8.16 expects (I think 20.04 ships libffi 3.3 but Python expects 3.4, something like that).

conda install python=3.8.10

in the aloha environment fixes the problem.

I recommend specifying 3.8.10 in the README.md instead of the current 3.8.

Multiple repositories, many moving parts; not clear where to start with imitation learning

I saw the videos of the Mobile ALOHA project and wanted to start dissecting it for my own learning. However, it seems ALOHA is built from multiple layers.

Since y'all are the foremost experts on imitation learning, I'd appreciate it if you could provide some direction on getting started. Goal: get a 3D virtual model of a simple robotic arm working with imitation learning.

These are the important base components so far; I'm not sure if I'm missing anything.

Mobile ALOHA: comes from ALOHA.
ALOHA: comes from ACT.

ALOHA: the repo provides a link to the ACT repo.
Mobile ALOHA: relies on the ACT Plus Plus repo (linked from the website).

ACT Plus Plus: linked to from the website under the ML link.

Two Papers. One related to ACT, the other related to Mobile ALOHA.

"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware" paper here

"Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation" paper here

Also, the Interbotix module seems to be used a lot.

Questions about simulation

Cool job! The simulation videos showcased in the simulation section of your website, https://aloha-2.github.io/, are particularly captivating.

Could you kindly provide information about the software used for these simulations? Any additional details regarding the simulation process, tools, or methodologies would be greatly appreciated. I am keen on understanding more about how these simulations were created and the technology behind them.

Thank you very much for your time and assistance.

Can this system be understood as an interference-resistant teach-and-playback device?

Hello, this system can clearly cope with interference, but the paper does not introduce reinforcement learning. Does that make it similar to a teach pendant that replays motions at the originally taught angles, rather than actually learning how to tie shoes, play ping-pong, or thread in different directions?

Question about the control frequency and joint states

Hello, there.

Your work looks amazing, and I want to try it in my project. After several days of struggling, I cannot get any exciting results on a pick-and-place task.
The robot arm I use is different (myCobot 280 M5) and I have several questions.

  1. Does the control frequency affect the algorithm a lot? Since my robot arm is much cheaper, it looks like it can only support 5 Hz while collecting episodes; getting the joint coordinates and executing them takes a long time.
  2. What are the joint states for your robot? For my robot I can get the coordinates (x, y, z, rx, ry, rz); is that the same as yours? At inference time, when the robot executes the predicted coordinates, it goes to a weird position or pose that did not appear in the training dataset.
  3. My experiment environment is not clean: the lighting varies and other stuff is in the camera view, and the object to pick is small. So I think ResNet-18 cannot extract good features to locate the object. I replaced it with a YOLOv8 backbone pretrained on a detection task for my object, but I still cannot get good results.

Any suggestions are appreciated.

Thank you so much

I need your help! I can't train an ACT policy as good as yours...

Hi, I've recently been using your ACT policy on my platform for imitation learning. Here's the setup:

  • Environment: CoppeliaSim 4.5.2 (simulation environment)
  • Robotic arm: single UR3 with gripper (7 DOF)
  • Task: grabbing wooden blocks and placing them at a destination (pick and place)

During training, I've found that 50 or even 100 demonstrations are not enough to get the robot to complete the task, even though my task is simpler than those in the paper. Specifically, after training with 50 demonstrations, the ACT algorithm couldn't get my robotic arm to reach around the wooden block, and with 100 demonstrations it couldn't successfully grab the block. Here are my hyperparameters:

  • policy_class: ACT
  • kl_weight: 10
  • chunk_size: 100
  • hidden_dim: 512
  • batch_size: 2
  • dim_feedforward: 3200
  • num_epochs: 2000

I would appreciate your help with this.

record_episodes.py freq_mean below 42

Hi,

Thanks for the great work.

When I run python3 record_episodes.py, it doesn't save demonstrations because the parameter freq_mean is below the threshold of 42; in my case, it is about 36-38.

freq_mean = print_dt_diagnosis(actual_dt_history)
if freq_mean < 42:
    return False

I checked that the time cost comes from the code chunk below:

self.puppet_bot_left.arm.set_joint_positions(left_action[:6], blocking=False)
self.puppet_bot_right.arm.set_joint_positions(right_action[:6], blocking=False)
self.set_gripper_pose(left_action[-1], right_action[-1])
time.sleep(DT)
return dm_env.TimeStep(
    step_type=dm_env.StepType.MID,
    reward=self.get_reward(),
    discount=None,
    observation=self.get_observation())

Any suggestions to solve this problem?

Platform note: RTX 4090, Ubuntu 20.04, ROS Noetic

Training cost

Hi there,

How much did the actual training cost?
I thought training transformers was rather costly?
Or are the transformers pretrained?

kind regards

How to set the current limit on a gripper using python api

Hey Tony,

Not an issue, feel free to close when you see it...

I read a comment in your code that you were trying to figure out how to set the current limit.

I chanced upon the solution, below...

I think you can get or set any register on the Dynamixels using these ROS services...

#!/usr/bin/env python

import rospy
from interbotix_xs_msgs.srv import RegisterValues


def service_command(service_path, cmd_type, name, reg, value=0):
    rospy.wait_for_service(service_path)

    try:
        get_registers = rospy.ServiceProxy(service_path, RegisterValues)

        # Make the service call
        response = get_registers(cmd_type, name, reg, value)

        # Process the response
        if response:
            print("Register values: ", response.values)
        else:
            print("Failed to get register values.")

    except rospy.ServiceException as e:
        print("Service call failed:", str(e))




if __name__ == '__main__':
    rospy.init_node('gripper_register_node')
    service_command('/puppet_left/get_motor_registers', 'single', 'gripper', 'Current_Limit')
    service_command('/puppet_left/set_motor_registers', 'single', 'gripper', 'Current_Limit', 800)
    service_command('/puppet_left/get_motor_registers', 'single', 'gripper', 'Current_Limit')

There is also an equivalent ROS command-line call; the form is something like:

rosservice call /puppet_left/get_motor_registers single gripper Hardware_Error_Status 0

What if the policy network without CVAE?

Hey Tony,

Thank you for your great work!

I have a question about the policy network selection. I noticed that, in order to account for the stochasticity of human demonstrations, you chose to leverage a CVAE to model the action distribution, which outperforms the method without CVAE in the ablation study.

I wonder whether the policy without CVAE is identical to a distributional policy. What if the mean and covariance are both learned during training? To this end, will introducing policy entropy into a distributional policy w/o CVAE work?

Thank you for your time. I am looking forward to your reply :D

Regards,
Dongjie

CVAE and DETR Network Problems

Why not use DETR directly for network training, instead of using a CVAE for joint-state-to-joint-state reconstruction? Would the results of using DETR directly differ greatly?

h5py_cache deprecated, expects old version of numpy

On record_episodes.py, I got this exception:

Avg freq: 48.48 Get action: 0.000 Step env: 0.021
Traceback (most recent call last):
  File "record_episodes.py", line 230, in <module>
    main(vars(parser.parse_args()))
  File "record_episodes.py", line 190, in main
    is_healthy = capture_one_episode(DT, max_timesteps, camera_names, dataset_dir, dataset_name, overwrite)
  File "record_episodes.py", line 153, in capture_one_episode
    with h5py_cache.File(dataset_path + '.hdf5', 'w', chunk_cache_mem_size=1024**2*2) as root:
  File "/home/duane/miniconda3/envs/aloha/lib/python3.8/site-packages/h5py_cache/__init__.py", line 67, in File
    bytes_per_object = np.dtype(np.float).itemsize  # assume float as most likely
  File "/home/duane/.local/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Resolved by reverting to the commented-out h5py code and adding the rdcc_nbytes argument:

from

    # import h5py_cache
    with h5py_cache.File(dataset_path + '.hdf5', 'w', chunk_cache_mem_size=1024**2*2) as root:
    # with h5py.File(dataset_path + '.hdf5', 'w') as root:

to

    # import h5py_cache
    # with h5py_cache.File(dataset_path + '.hdf5', 'w', chunk_cache_mem_size=1024**2*2) as root:
    with h5py.File(dataset_path + '.hdf5', 'w', rdcc_nbytes=1024**2*2) as root:

train

For each timestep:
observations
- images
    - cam_high          (480, 640, 3) 'uint8'
    - cam_low           (480, 640, 3) 'uint8'
    - cam_left_wrist    (480, 640, 3) 'uint8'
    - cam_right_wrist   (480, 640, 3) 'uint8'
- qpos                  (14,)         'float64'
- qvel                  (14,)         'float64'

action                  (14,)         'float64'

data_dict = {
    '/observations/qpos': [],
    '/observations/qvel': [],
    '/observations/effort': [],
    '/action': [],
}

Are qvel and effort used in the training phase? What about in the inference phase? What would happen if qvel and effort were missing during training?

ACT Tuning Tips

Thanks for your work!!!

  • Chunk size is the most important param to tune when applying ACT to a new environment. One chunk should correspond to ~1 sec of wall-clock robot motion.
  • What does "~1 sec of wall-clock robot motion" mean? (A rough interpretation is sketched below.)
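One plausible reading of the tip (an interpretation, not an official formula): a chunk of predicted actions is executed at the control frequency, so the chunk should contain roughly one second's worth of control steps.

# Rough rule of thumb implied by the tip above (an interpretation, not an
# official formula): chunk_size ~= control frequency x ~1 s of motion.
control_hz = 50          # ALOHA records and controls at roughly 50 Hz
seconds_per_chunk = 1.0  # "~1 sec of wall-clock robot motion"
chunk_size = round(control_hz * seconds_per_chunk)
print(chunk_size)        # -> 50; a 20 Hz setup would use ~20 instead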

Camera autofocus parameters in 4_arms_teleop.launch

My cameras were autofocusing, leading to blurry images in certain cases.

Currently I'm resolving this by running the commands below:

v4l2-ctl -d /dev/CAM_RIGHT_WRIST  -c focus_automatic_continuous=0
v4l2-ctl -d /dev/CAM_RIGHT_WRIST  -c focus_absolute=40
v4l2-ctl -d /dev/CAM_LEFT_WRIST  -c focus_automatic_continuous=0
v4l2-ctl -d /dev/CAM_LEFT_WRIST  -c focus_absolute=40
v4l2-ctl -d /dev/CAM_HIGH -c focus_automatic_continuous=0
v4l2-ctl -d /dev/CAM_HIGH -c focus_absolute=5
v4l2-ctl -d /dev/CAM_LOW  -c focus_automatic_continuous=0
v4l2-ctl -d /dev/CAM_LOW -c focus_absolute=35
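A small convenience sketch (not part of the repo) that applies the same settings to all four cameras in one go, using the focus values from the commands above:

#!/usr/bin/env python3
# Convenience sketch: disable continuous autofocus and apply the fixed focus
# values from the v4l2-ctl commands above to all four cameras.
import subprocess

focus_values = {
    "/dev/CAM_RIGHT_WRIST": 40,
    "/dev/CAM_LEFT_WRIST": 40,
    "/dev/CAM_HIGH": 5,
    "/dev/CAM_LOW": 35,
}

for dev, focus in focus_values.items():
    subprocess.run(["v4l2-ctl", "-d", dev, "-c", "focus_automatic_continuous=0"], check=True)
    subprocess.run(["v4l2-ctl", "-d", dev, "-c", f"focus_absolute={focus}"], check=True)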

Grippers not respecting Current_Limit or Goal_Current

I've been having trouble with the grippers frequently hitting overload errors and requiring a reboot.

After a lot of reading and debugging, I've come up with this chart.

[chart: motor_current]

Here we can see, according to the rospy API, that Current_Limit is set to 300 and Goal_Current to 200. However, we can also see that the gripper motor is not respecting these values.

I'll attach the code I used to create this graph, but to summarize, I'm pulling the data from the rospy service get_motor_registers:

'/puppet_left/get_motor_registers', 'single', 'gripper', 'Present_Current',
'/puppet_right/get_motor_registers', 'single', 'gripper', 'Goal_Current'
'/puppet_left/get_motor_registers', 'single', 'gripper', 'Current_Limit'

I suspect this is a problem with the API and not the motor.

The reason is: if I place an object in the gripper in Dynamixel Wizard, set "current-based position mode", set the Current_Limit to 300, and set the Goal_Current to 300, the motor will not go over 850 mA (300 units) and it won't overload, no matter how far I push the goal position past the point where the gripper won't close.

Unfortunately, we don't have the source code for Dynamixel Wizard, so I have no idea what it's doing that's special here ;)

Anyway, I'll keep looking into it.

Frequency Problem

Hello! My camera can only publish RGB images at a maximum frequency of 20 Hz. If I acquire data and control the robotic arm at 50 Hz, will the insufficient image acquisition frequency have a large impact?
