alescontrela / amp_for_hardware

Code for "Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions"

License: Other

Python 100.00%

amp_for_hardware's Introduction

Hi there 👋

🤖 Robotics


daydreamer		DayDreamer: World Models for Physical Robot Learning
AMP from mocap		Learning reward functions from watching animals

👾 Reinforcement Learning


VIPER		Extracting general reward functions from video models
Score Matching RL		Multimodal diffusion policies for off-policy RL

👨‍🏫 Tutorials


Numpy-CNN		Convolutional neural networks from the ground up
Differential Dynamic Programming		Differential dynamic programming examples

🛠️ Tools


coinmarketcap-history*		Coinmarketcap extraction tools
Dotfiles		dotfiles and installer for Mac
alescontrela.github.io		Custom website template (aifolio fork)

* Archived

amp_for_hardware's Issues

Bug to load a trained policy with go2 in pybullet -'SafetyError'

Hi,

Thank you for sharing this amazing work.

I trained a policy for the go2 robot using the retarget_scripts branch, but when I tried to load the policy with play_real.py in pybullet, I received the following error:

except robot_config.SafetyError as e:
AttributeError: module 'legged_gym.envs.go2_robot.robot_config' has no attribute 'SafetyError'

What could I be missing?

Thank you
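
The AttributeError above only says that the go2 robot_config module does not define the name SafetyError. One possible workaround, sketched here under the assumption that play_real.py needs nothing more than an exception type to catch, is to define that class in the module. The class body and the helper check_joint_limit are hypothetical and not taken from the repository.

```python
# Hypothetical addition to legged_gym/envs/go2_robot/robot_config.py.
# Assumption: play_real.py only needs an exception type called SafetyError
# so that its `except robot_config.SafetyError` clause can be evaluated.
class SafetyError(Exception):
    """Raised when a command or state violates a safety limit."""


def check_joint_limit(angle_rad, lower_rad, upper_rad):
    # Illustrative helper (made up for this sketch): shows how such an
    # exception would typically be raised by safety checks.
    if not lower_rad <= angle_rad <= upper_rad:
        raise SafetyError(
            f"joint angle {angle_rad} rad outside [{lower_rad}, {upper_rad}]"
        )
    return angle_rad
```

Whether the go2 configuration should also carry real safety checks is a separate question that this sketch does not answer.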

Custom Motion Dataset?

Awesome work!

I am curious how you convert the mocap data into the format used by your dataset files. If I want to train A1 with other mocap data, how do I get the motion into a format that matches yours?

(Edit: didn't mean to add "bug" label, just a question. I apologize.)

Deployment code

This looks great. Would you mind sharing the deployment code? It would help me deploy the policy on a Unitree A1 robot. Many thanks.

RuntimeError: vstack expects a non-empty TensorList

I just followed the README.
When running train.py or play.py, I get:

Traceback (most recent call last):
File "../AMP_for_hardware-main/legged_gym/scripts/play.py", line 126, in <module>
play(args)
File "../AMP_for_hardware-main/legged_gym/scripts/play.py", line 58, in play
env, _ = task_registry.make_env(name=args.task, args=args, env_cfg=env_cfg)
File "../AMP_for_hardware-main/legged_gym/utils/task_registry.py", line 97, in make_env
env = task_class( cfg=env_cfg,
File "../AMP_for_hardware-main/legged_gym/envs/base/legged_robot.py", line 90, in __init__
self.amp_loader = AMPLoader(motion_files=self.cfg.env.amp_motion_files, device=self.device, time_between_frames=self.dt)
File "../AMP_for_hardware-main/rsl_rl/rsl_rl/datasets/motion_loader.py", line 131, in __init__
self.all_trajectories_full = torch.vstack(self.trajectories_full)
RuntimeError: vstack expects a non-empty TensorList
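
torch.vstack raises this error when it is given an empty list, so the traceback suggests that no motion files reached AMPLoader. A training log in another report on this page shows motions being loaded from relative paths such as datasets/mocap_motions/trot0.txt, so one plausible cause (an assumption, not confirmed by the maintainers) is that the script was started from a directory where that relative path matches nothing. The helper below is a hypothetical pre-flight check; the default pattern is inferred from that log.

```python
import glob
import os


def find_motion_files(pattern="datasets/mocap_motions/*"):
    """Return the files matching `pattern`, or fail with a clear message.

    The default pattern is an assumption based on the relative paths
    printed in the training logs; adjust it to your checkout.
    """
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(
            f"No motion files match {pattern!r} from working directory "
            f"{os.getcwd()!r}; an empty list would later make "
            "torch.vstack fail."
        )
    return files
```

Calling find_motion_files() from the directory where train.py is launched shows quickly whether the motion clips are visible from there.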

ERROR:CHANGE GO1

Describe the bug
It seems like when I replace the robot with 'go1', the limbs become detached from the body. Additionally, in the final learned behavior of 'go1', the front legs move, but the hind legs only spread apart to the sides without any movement. Have you encountered such an issue before, and how should it be resolved?

System (please complete the following information):

Commit: [e.g. 8f3b9ca]
OS: [e.g. Ubuntu 20.04]
GPU: [e.g. RTX 4070TI]
CUDA: [e.g. 12.3]
GPU Driver: [e.g. 535]


more mocap motions data

Hello,
Your work is interesting!
How can I obtain more motion data for the A1 robot?
The file "dog_clips_info.txt" lists many task names, but the directory "mocap_motions_a1/" only includes a few of them.

How to balance task_reward and style_reward

I want to use AMP training to walk on complex terrain. Have you tried this before? Which parameters do you think have a significant impact on terrain adaptability?
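
A config posted in another issue on this page sets amp_task_reward_lerp = 0.3 and amp_reward_coef = 2.0, which look like the knobs for this balance. The sketch below assumes, based only on the parameter name, that the two rewards are mixed by linear interpolation; the function blend_rewards is hypothetical and is not the repository's implementation.

```python
def blend_rewards(style_reward, task_reward, task_reward_lerp=0.3):
    """Linearly interpolate between a style reward and a task reward.

    Assumption: task_reward_lerp = 0 keeps only the style reward and
    task_reward_lerp = 1 keeps only the task reward.
    """
    if not 0.0 <= task_reward_lerp <= 1.0:
        raise ValueError("task_reward_lerp must lie in [0, 1]")
    return (1.0 - task_reward_lerp) * style_reward + task_reward_lerp * task_reward
```

Under this assumption, raising the interpolation weight shifts the policy toward command tracking and away from imitating the reference motions.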

RuntimeError: CUDA error: an illegal memory access was encountered

Describe the bug
It worked fine on CPU, but when I switched to GPU, the following error was shown many times:
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Steps to reproduce the behavior:

Execute python3 legged_gym/scripts/train.py --task=a1_amp
See error: ...

System (please complete the following information):

Commit: 799ded4
OS: Ubuntu 20.04
GPU: RTX4070
CUDA: 11.8
GPU Driver: 525.125.06

Additional Notice

The whole output is shown below:

python3 legged_gym/scripts/train.py --task=a1_amp
Importing module 'gym_38' (/home/tianhu/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/tianhu/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
PyTorch version 2.0.0+cu118
Device count 1
/home/tianhu/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/tianhu/.cache/torch_extensions/py38_cu118 as PyTorch extensions root...
Emitting ninja build file /home/tianhu/.cache/torch_extensions/py38_cu118/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
Setting seed: 1
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
/home/tianhu/anaconda3/envs/amp_hw/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Loaded 2.499s. motion from datasets/mocap_motions/rightturn0.txt.
Loaded 10.311s. motion from datasets/mocap_motions/pace1.txt.
Loaded 0.798s. motion from datasets/mocap_motions/pace0.txt.
Loaded 0.672s. motion from datasets/mocap_motions/trot1.txt.
Loaded 0.672s. motion from datasets/mocap_motions/trot0.txt.
Loaded 0.9450000000000001s. motion from datasets/mocap_motions/leftturn0.txt.
AMPOnPolicyRunner
Actor MLP: Sequential(
(0): Linear(in_features=42, out_features=512, bias=True)
(1): ELU(alpha=1.0)
(2): Linear(in_features=512, out_features=256, bias=True)
(3): ELU(alpha=1.0)
(4): Linear(in_features=256, out_features=128, bias=True)
(5): ELU(alpha=1.0)
(6): Linear(in_features=128, out_features=12, bias=True)
)
Critic MLP: Sequential(
(0): Linear(in_features=48, out_features=512, bias=True)
(1): ELU(alpha=1.0)
(2): Linear(in_features=512, out_features=256, bias=True)
(3): ELU(alpha=1.0)
(4): Linear(in_features=256, out_features=128, bias=True)
(5): ELU(alpha=1.0)
(6): Linear(in_features=128, out_features=1, bias=True)
)
Loaded 2.499s. motion from datasets/mocap_motions/rightturn0.txt.
Loaded 10.311s. motion from datasets/mocap_motions/pace1.txt.
Loaded 0.798s. motion from datasets/mocap_motions/pace0.txt.
Loaded 0.672s. motion from datasets/mocap_motions/trot1.txt.
Loaded 0.672s. motion from datasets/mocap_motions/trot0.txt.
Loaded 0.9450000000000001s. motion from datasets/mocap_motions/leftturn0.txt.
Preloading 2000000 transitions
Finished preloading
PxgCudaDeviceMemoryAllocator fail to allocate memory 339738624 bytes!! Result = 2
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 4210
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 3480
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 3535
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 6137
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 991
Traceback (most recent call last):
File "legged_gym/scripts/train.py", line 47, in <module>
train(args)
File "legged_gym/scripts/train.py", line 42, in train
ppo_runner, train_cfg = task_registry.make_alg_runner(env=env, name=args.task, args=args)
File "/home/tianhu/AMP_for_hardware/legged_gym/utils/task_registry.py", line 149, in make_alg_runner
runner = runner_class(env, train_cfg_dict, log_dir, device=args.rl_device)
File "/home/tianhu/AMP_for_hardware/rsl_rl/rsl_rl/runners/amp_on_policy_runner.py", line 104, in __init__
_, _ = self.env.reset()
File "/home/tianhu/AMP_for_hardware/legged_gym/envs/base/legged_robot.py", line 99, in reset
obs, privileged_obs, _, _, _, _, _ = self.step(torch.zeros(self.num_envs, self.num_actions, device=self.device, requires_grad=False))
File "/home/tianhu/AMP_for_hardware/legged_gym/envs/base/legged_robot.py", line 113, in step
self.torques = self._compute_torques(self.actions).view(self.torques.shape)
File "/home/tianhu/AMP_for_hardware/legged_gym/envs/base/legged_robot.py", line 431, in _compute_torques
actions_scaled = actions * self.cfg.control.action_scale
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
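
In the output above, the first failure is not the PyTorch error but the line "PxgCudaDeviceMemoryAllocator fail to allocate memory 339738624 bytes", which appears right before the illegal memory accesses. That ordering suggests, as an assumption rather than a confirmed diagnosis, that the GPU ran out of memory. The hypothetical helper below extracts such failures from a log and converts them to MiB.

```python
import re

# Matches the allocator message exactly as it appears in the log above.
ALLOC_FAILURE = re.compile(r"fail to allocate memory (\d+) bytes")


def failed_allocations_mib(log_text):
    """Return the size in MiB of every failed allocation found in a log."""
    return [int(m.group(1)) / 2**20 for m in ALLOC_FAILURE.finditer(log_text)]


line = "PxgCudaDeviceMemoryAllocator fail to allocate memory 339738624 bytes!! Result = 2"
print(failed_allocations_mib(line))  # [324.0]
```

If memory is indeed the limit, running with fewer environments through the --num_envs flag (used in another report on this page) may be worth trying.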

ValueError: Expected parameter loc (Tensor of shape (32880, 12)) of distribution Normal(loc: torch.Size([32880, 12]), scale: torch.Size([32880, 12])) to satisfy the constraint Real(), but found invalid values:

Describe the bug
ValueError: Expected parameter loc (Tensor of shape (32880, 12)) of distribution Normal(loc: torch.Size([32880, 12]), scale: torch.Size([32880, 12])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=)

To Reproduce
Execute python legged_gym/scripts/train.py --task=a1_amp
This error occurs.

Expected behavior
Training runs and logs normally, without errors.

System (please complete the following information):
Commit: https://github.com/Alescontrela/AMP_for_hardware/commit/8f3b9caa6376b81b36d5242c45a297f79ee5348c
OS: Ubuntu 18.04
GPU: GeForce GTX 1080 Ti
CUDA: 11.1
GPU Driver: 510.47.03

Low speed for training

Describe the bug
Hi, the problem I am facing is slow training. I remember you presented a result in which training finished in 20 minutes, whereas running the program myself would take about 9-10 days. Do you know what is going on? And which commands did you use in the terminal when you ran the program?

To Reproduce
Steps to reproduce the behavior:

cd AMP_for_hardware/
Then run python legged_gym/scripts/train.py --task=a1_amp --headless --num_envs=1000
The terminal shows that the iteration time is about 1.76s and the ETA is about 760000s

Expected behavior
I hope to reproduce your result, where training finishes in about 20 minutes. Why is my training so slow, and which commands did you run?

System (please complete the following information):

OS: Centos7
GPU: NVIDIA A100 (80G). Note: utilization is 37% when I run train.py
CUDA: 11.3
GPU Driver: 510.47.03
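
The reported numbers can be checked with a little arithmetic. Assuming the ETA is simply the iteration time multiplied by the number of remaining iterations (an assumption about how the runner computes it), the figures above imply several hundred thousand remaining iterations. A config posted in another issue on this page sets max_iterations = 500000, so the ETA may reflect that iteration cap rather than the time needed to reach a usable policy.

```python
def eta_summary(iteration_time_s, eta_s):
    """Return (remaining iterations, ETA in days) implied by the two values.

    Assumption: ETA = iteration time * remaining iterations.
    """
    remaining_iterations = eta_s / iteration_time_s
    eta_days = eta_s / 86400  # seconds per day
    return remaining_iterations, eta_days


remaining, days = eta_summary(1.76, 760000)
print(round(remaining), round(days, 1))  # 431818 8.8
```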

What's the hopturn task config setting?

Hi @Alescontrela, I'm trying to reproduce the hopturn result using the provided hopturn dataset. What training settings did you use for the hopturn task? If possible, could you share the config used for it?

Race Data

Describe the bug
I checked the experiment results on the website https://sites.google.com/berkeley.edu/amp-in-real/home. In the result "Informal race between Fu et al.* (left) and AMP Policy (right)", A1 runs at a very high speed, but the trot0 and trot1 data in the datasets folder do not seem to be related to the race. If that is the case, could you provide the motion data used for the race experiment? Thanks!

No trained file

Describe the bug
There are no log files.

To Reproduce
Steps to reproduce the behavior:

Execute python legged_gym/scripts/play.py --task=a1_amp
It then reports that no run was found in logs.
I assume that the saved policy model file you used is not included with the code. Could you release it?
Could you also provide a guide on how to deploy the code on a Unitree A1? I would appreciate it very much!

RuntimeError: vstack expects a non-empty TensorList

Describe the bug
Something goes wrong when I run train.py.

To Reproduce
Steps to reproduce the behavior:

Execute ./legged_gym/scripts/train.py task=a1_amp
See error:
RuntimeError: vstack expects a non-empty TensorList

Expected behavior
I want to run the demo of train.py.

System (please complete the following information):

Commit: [a8b6f44]
OS: [e.g. Ubuntu 18.04]
GPU: [e.g. RTX 3090]
CUDA: [e.g. 11.4]
GPU Driver: [e.g. 470.82.01]


Question about motion data.

Hi Alescontrela,
Thanks for your great work on amp_for_hardware!
I can train it on A1, and it is awesome. I have a question: your mocap data is stored as txt files, whereas the official Isaac Gym ASE/AMP mocap data is stored as npy files. I want to try humanoid motion data with your repo. Could you tell me how to convert that data?
I noticed that your code AMP_LOADER has 'reorder_from_pybullet_to_isaac'. Was the data converted from pybullet?
Thanks!
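
The name 'reorder_from_pybullet_to_isaac' suggests that the two simulators list the legs in different orders and that the loader permutes per-leg blocks of joint values. The sketch below illustrates that idea only. The leg orders, the three-joints-per-leg layout, and the function reorder_legs are assumptions made for illustration and should be checked against the repository before use.

```python
def reorder_legs(joint_values,
                 source_order=("FR", "FL", "RR", "RL"),
                 target_order=("FL", "FR", "RL", "RR"),
                 joints_per_leg=3):
    """Permute per-leg blocks of joint values from one leg order to another.

    Both default orders are assumptions for this sketch, not values
    confirmed by the repository.
    """
    expected = len(source_order) * joints_per_leg
    if len(joint_values) != expected:
        raise ValueError(f"expected {expected} values, got {len(joint_values)}")
    # Slice the flat list into one block per leg, keyed by leg name.
    legs = {
        name: joint_values[i * joints_per_leg:(i + 1) * joints_per_leg]
        for i, name in enumerate(source_order)
    }
    reordered = []
    for name in target_order:
        reordered.extend(legs[name])
    return reordered
```

A humanoid dataset would need its own joint mapping; this permutation only covers a quadruped with four identical legs.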

Problem to work with another quadruped robot

Hi,

Thank you very much for sharing your work! it is awesome :)

I want to work with my quadruped robot. However, after many attempts and changes in the configurations, I still don’t know what's going wrong.

Please can you give me an idea of what's wrong with my settings to make my quadruped robot behave so strangely from the start, as you can see in the video below?

test_compressed.mp4

I generated the reference motions using this repository for my robot configuration, and it was looking fine.

I’ve created a robot class for this robot and modified some parameters to my robot config:

class QUADAMPCfg( LeggedRobotCfg ):

    class env( LeggedRobotCfg.env ):
        num_envs = 5480
        include_history_steps = None  # Number of steps of history to include.
        num_observations = 42
        num_privileged_obs = 48
        reference_state_initialization = True
        reference_state_initialization_prob = 0.85
        amp_motion_files = MOTION_FILES
        get_commands_from_joystick = False


    class init_state( LeggedRobotCfg.init_state ):
        pos = [0.0, 0.0, 0.3] # x,y,z [m]
        default_joint_angles = { # = target angles [rad] when action = 0.0
            'fl_abad': 0.126,   # [rad]
            'fl_shoulder': 0.61,   # [rad]
            'fl_knee': -1.22 ,  # [rad]

            'fr_abad': -0.126,     # [rad]
            'fr_shoulder': 0.61,   # [rad]
            'fr_knee': -1.22,     # [rad]
            
            'bl_abad': 0.126,   # [rad]
            'bl_shoulder': 0.689,
            'bl_knee':  -1.22,

            'br_abad': -0.126,   # [rad]
            'br_shoulder': 0.61,    # [rad]
            'br_knee':  -1.22,  # [rad]
        }

    class control( LeggedRobotCfg.control ):
        # PD Drive parameters:
        control_type = 'P'
        stiffness = {'': 16.}  # [N*m/rad]
        damping = {'': 0.35}     # [N*m*s/rad]
        # action scale: target angle = actionScale * action + defaultAngle
        action_scale = 0.25 
        # decimation: Number of control action updates @ sim DT per policy DT
        decimation = 6

    class terrain( LeggedRobotCfg.terrain ):
        mesh_type = 'plane'
        measure_heights = False

    class asset( LeggedRobotCfg.asset ):
        file = '{LEGGED_GYM_ROOT_DIR}/resources/robots/QuadB12/quad.urdf'
        foot_name = "contact"
        penalize_contacts_on = ["HFE", "KFE"]
        terminate_after_contacts_on = [
            "base_link", "HFE", "KFE"]
        
        self_collisions = 0 # 1 to disable, 0 to enable...bitwise filter
  
    class domain_rand:
        randomize_friction = True
        friction_range = [0.25, 1.75]
        randomize_base_mass = True
        added_mass_range = [-1., 1.]
        push_robots = True
        push_interval_s = 15
        max_push_vel_xy = 1.0
        randomize_gains = True
        stiffness_multiplier_range = [0.9, 1.1]
        damping_multiplier_range = [0.9, 1.1]

    class noise:
        add_noise = True
        noise_level = 1.0 # scales other values
        class noise_scales:
            dof_pos = 0.03
            dof_vel = 1.5
            lin_vel = 0.1
            ang_vel = 0.3
            gravity = 0.05
            height_measurements = 0.1


    class sim:
        dt =  0.005
        substeps = 1
        gravity = [0., 0. ,-9.81]  # [m/s^2]
        up_axis = 1  # 0 is y, 1 is z

        class physx:
            num_threads = 10
            solver_type = 1  # 0: pgs, 1: tgs
            num_position_iterations = 4
            num_velocity_iterations = 0
            contact_offset = 0.01  # [m]
            rest_offset = 0.0   # [m]
            bounce_threshold_velocity = 0.5 #0.5 [m/s]
            max_depenetration_velocity = 1.0
            max_gpu_contact_pairs = 2**23 #2**24 -> needed for 8000 envs and more
            default_buffer_size_multiplier = 5
            contact_collection = 2 # 0: never, 1: last sub-step, 2: all sub-steps (default=2)


    class rewards( LeggedRobotCfg.rewards ):
        soft_dof_pos_limit = 0.9
        base_height_target = 0.20
        class scales( LeggedRobotCfg.rewards.scales ):
            termination = 0.0
            tracking_lin_vel = 1.5 * 1. / (.005 * 6)
            tracking_ang_vel = 0.5 * 1. / (.005 * 6)
            lin_vel_z = 0.0
            ang_vel_xy = 0.0
            orientation = 0.0
            torques = 0.0
            dof_vel = 0.0
            dof_acc = 0.0
            base_height = 0.0 
            feet_air_time =  0.0
            collision = 0.0
            feet_stumble = 0.0 
            action_rate = 0.0
            stand_still = 0.0
            dof_pos_limits = 0.0

    class commands:
        curriculum = True
        max_curriculum = 1.
        num_commands = 4 # default: lin_vel_x, lin_vel_y, ang_vel_yaw, heading (in heading mode ang_vel_yaw is recomputed from heading error)
        resampling_time = 10. # time before command are changed[s]
        heading_command = False # if true: compute ang vel command from heading error
        class ranges:
            lin_vel_x = [-1.0, 2.0] # min max [m/s]
            lin_vel_y = [-0.3, 0.3]   # min max [m/s]
            ang_vel_yaw = [-1.57, 1.57]    # min max [rad/s]
            heading = [-3.14, 3.14]

class QUADAMPCfgPPO( LeggedRobotCfgPPO ):
    runner_class_name = 'AMPOnPolicyRunner'
    class algorithm( LeggedRobotCfgPPO.algorithm ):
        entropy_coef = 0.01
        amp_replay_buffer_size = 1000000
        num_learning_epochs = 5
        num_mini_batches = 4

    class runner( LeggedRobotCfgPPO.runner ):
        run_name = ''
        experiment_name = 'QuadB12_amp_'
        algorithm_class_name = 'AMPPPO'
        policy_class_name = 'ActorCritic'
        max_iterations = 500000 # number of policy updates

        amp_reward_coef = 2.0
        amp_motion_files = MOTION_FILES
        amp_num_preload_transitions = 2000000
        amp_task_reward_lerp = 0.3
        amp_discr_hidden_dims = [1024, 512]

        min_normalized_std = [0.05, 0.02, 0.05] * 4
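
The timing values in this config are linked: the comment on decimation describes it as the number of control updates at sim DT per policy DT, so the policy step is the sim dt multiplied by the decimation, and the tracking reward scales divide by the same product (.005 * 6). The sketch below recomputes those values so they stay consistent if dt or decimation changes. The function reward_scales is hypothetical, and treating the literals in the reward scales as tied to sim.dt and control.decimation is an assumption.

```python
def reward_scales(sim_dt=0.005, decimation=6):
    """Derive the policy step and the tracking reward scales from sim settings.

    Assumption: the literal (.005 * 6) in the config stands for
    sim.dt * control.decimation.
    """
    policy_dt = sim_dt * decimation  # seconds per policy step
    return {
        "policy_dt": policy_dt,
        "tracking_lin_vel": 1.5 * 1.0 / policy_dt,
        "tracking_ang_vel": 0.5 * 1.0 / policy_dt,
    }


for name, value in reward_scales().items():
    print(name, round(value, 3))
```

With the defaults this gives a policy step of about 0.03 s, so changing decimation without updating the reward scales would change the effective reward magnitude per step.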

I also checked if the body names were read correctly:

body_names ['base_link', 'bl_HAA_link', 'bl_HFE_link', 'bl_KFE_link', 'bl_contact', 'br_HAA_link', 'br_HFE_link', 'br_KFE_link', 'br_contact', 'fl_HAA_link', 'fl_HFE_link', 'fl_KFE_link', 'fl_contact', 'fr_HAA_link', 'fr_HFE_link', 'fr_KFE_link', 'fr_contact']

Termination contact names: ['base_link', 'bl_HFE_link', 'br_HFE_link', 'fl_HFE_link', 'fr_HFE_link', 'bl_KFE_link', 'br_KFE_link', 'fl_KFE_link', 'fr_KFE_link']

In addition, I made sure that my robot has the right orientation, and I use a simplified mesh, as you can see in the picture:

Here is the picture after more than 1k iterations; it looks like it is learning not to move.

Thank you in advance!
