

jhwangbo commented on August 18, 2024

I tried the same thing and I don't get any error. I ran Valgrind and it reported no memory errors at all. Can you post a snippet of your code? Then I might be able to tell what you did wrong.

from raisimlib.

ashish-kmr commented on August 18, 2024

To add more context, I am importing a1 urdf from here: https://github.com/unitreerobotics/unitree_pybullet/tree/master/data/a1. For this, I have changed the line

anymal_ = world_->addArticulatedSystem(resourceDir_+"/a1/urdf/a1.urdf");

Additionally, I have added the following lines at the end of updateObservation()

auto footIndex = anymal_->getBodyIdx("FR_calf");
for(auto& contact: anymal_->getContacts()) {
  if ( footIndex == contact.getlocalBodyIndex() ) {
    std::cout<<"Contact impulse in the contact frame: "<<contact.getImpulse()->e()<<std::endl;
    std::cout<<"Contact frame: \n"<<contact.getContactFrame().e()<<std::endl;
    std::cout<<"Contact impulse in the world frame: "<<contact.getContactFrame().e() * contact.getImpulse()->e()<<std::endl;
    std::cout<<"Contact Normal in the world frame: "<<contact.getNormal().e().transpose()<<std::endl;
    std::cout<<"Contact position in the world frame: "<<contact.getPosition().e().transpose()<<std::endl;
    std::cout<<"It collides with: "<<world_->getObject(contact.getPairObjectIndex())<<std::endl;
    std::cout<<"please check Contact.hpp for the full list of the methods"<<std::endl;
  }
}

If I comment out the two lines containing contact.getImpulse()->e(), I don't get a segmentation fault and everything else works fine.

Just to confirm your observation, I ran this code snippet with the anymal.urdf and everything works fine.

jhwangbo commented on August 18, 2024

I copy-pasted your example and it works fine. I pushed my code so you can take a look at it.

You cannot tell where the segfault occurs unless you check it with Valgrind. The source of the error is not getImpulse, but something else you modified. That's why you have to set up CLion + Valgrind + debug_app. With this setup, you can immediately spot where the issue is.

ashish-kmr commented on August 18, 2024

I still get a segmentation fault on the a1 example you pushed. I'm a bit new to valgrind and CLion but I am trying to set them up on my end. If I am able to localize the problem I will update here.

jhwangbo commented on August 18, 2024

One very common reason for a segfault is an old installation of raisim. RaisimGymTorch will use the lib from the source for compilation but might use an old version at runtime. So you have to make sure that you delete all old versions and install it again.
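
One way to check which copy of a library a process actually loaded is to list its shared objects from inside the process. The sketch below is only an illustration, assuming Linux with glibc (dl_iterate_phdr is a glibc function). The helper name loadedLibrariesMatching is hypothetical and not part of raisim.

```cpp
#include <link.h>   // dl_iterate_phdr (glibc, Linux only)

#include <cstddef>
#include <string>
#include <vector>

namespace {

// Search term plus the paths collected so far.
struct Query {
  std::string needle;
  std::vector<std::string> matches;
};

// Called by dl_iterate_phdr once per loaded shared object.
int collect(struct dl_phdr_info* info, std::size_t /*size*/, void* data) {
  auto* query = static_cast<Query*>(data);
  const std::string path = info->dlpi_name ? info->dlpi_name : "";
  if (!path.empty() && path.find(query->needle) != std::string::npos) {
    query->matches.push_back(path);
  }
  return 0;  // returning 0 keeps the iteration going
}

}  // namespace

// Hypothetical helper: paths of all loaded shared objects whose path contains `needle`.
std::vector<std::string> loadedLibrariesMatching(const std::string& needle) {
  Query query{needle, {}};
  dl_iterate_phdr(collect, &query);
  return query.matches;
}
```

Calling loadedLibrariesMatching("raisim") from the environment code and printing the result would show the path of the library that was loaded at runtime, which can then be compared with the location of the fresh install.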

ashish-kmr commented on August 18, 2024

I removed the raisim_build and raisim_workspace folders and did a fresh install. The segmentation fault happens sometimes, not always. I am running with config --eval_every_n 2.

When I run rsg_a1_debug_app with valgrind from terminal (still figuring out how to run it in CLion, although I have been able to run runner.py in CLion), I get "license was issued to another machine ...".

jhwangbo commented on August 18, 2024

Oops, Valgrind will create a virtual machine and the key will not work. Let me figure out how to bypass it

jhwangbo commented on August 18, 2024

now raisim should work with Valgrind if you pull

jhwangbo commented on August 18, 2024

Just to check, did you build the environments again using python setup.py develop? Since it's a C++ environment, you have to build it every time you pull or change the environment file.

ashish-kmr commented on August 18, 2024

Thanks for the push!
Yes, I rebuild the environment every time I change any cpp file. Running valgrind with rsg_a1_debug does not report any errors, but when I run the runner.py, it still seems to give a segmentation fault. valgrind was able to catch a different segmentation fault I was running into, so I think that I'm running valgrind correctly. Are you able to reliably run the runner.py file every single time? I'll probably try to dig a bit deeper sometime next week.

jhwangbo commented on August 18, 2024

Yes, it always works for me. Two things to check.

  1. How many threads are you using? Did you overclock your CPU? I once had consistent segfaults due to overclocking.
  2. Can you check whether the segfault goes away if you remove the e() part?

ashish-kmr commented on August 18, 2024
  1. Reducing the number of threads seems to resolve the problem.
  2. If I remove the e() part, I am able to run it even with a large number of threads.

jhwangbo commented on August 18, 2024

If you don't get the segfault after removing e(), this means that the segfault is due to the Eigen lib. Eigen has a well-known issue with SIMD. Did you change any of the compilation flags in the CMakeLists.txt? SIMD instruction sets are disabled by default.

ashish-kmr commented on August 18, 2024
  1. I don't get a segmentation fault when I run this on my remote machine. So it is likely an eigen3 issue.
  2. On my laptop, I get a segmentation fault even for 1 thread and 1 environment when I run it with raisimUnity (opengl version).

I installed eigen using sudo apt-get install libeigen3-dev. I did not modify the CMakeLists.txt of the raisim software.

jhwangbo commented on August 18, 2024

Does that mean that you don't get any segfault if you run it without visualization? Can you tell me the model of your laptop (or just cpu model number)?

ashish-kmr commented on August 18, 2024

It happens both with and without visualization but is very consistent with visualization with a single thread, single environment and a sleep time of 1 sec (in the visualize loop). Here are my cpu details:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 78
model name	: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz

I installed eigen-3.3.8 (stable release) from source instead of the one from apt-get, which is 3.3.4, and the seg fault still occurs.

jhwangbo commented on August 18, 2024

That is really an old computer. I pushed a change that forces gcc not to use AVX2 instruction sets. I am not sure if it helps.

If that doesn't work, can you run ./runner with Valgrind? Check out this link: https://stackoverflow.com/questions/20112989/how-to-use-valgrind-with-python

ashish-kmr commented on August 18, 2024

I haven't tried the recent commit, but I now sometimes get a segmentation fault even on my remote machine. The segfault usually happens within or immediately after the visualization loop in runner.py. Once it makes it past the visualization loop (after multiple relaunches), it doesn't segfault (with the exception of one run out of all the runs I have had until now). I will try to set up Valgrind with Python using the link you sent me above and update you on what I find.

ashish-kmr commented on August 18, 2024

I was able to reproduce the bug in the debug_app.cpp with valgrind. I made the following modification --

for (int i = 0; i < 10; i++) {
  vecEnv.reset();
  for (int j = 0; j < 400; j++) {
    action = EigenRowMajorMat::Random(config["num_envs"].template As<int>(), vecEnv.getActionDim());
    Eigen::Ref<EigenRowMajorMat> action_ref(action);
    vecEnv.step(action_ref, reward_ref, dones_ref);
  }
}

This is the error I get --

==14270== Thread 8:
==14270== Invalid read of size 8
==14270==    at 0x1263E8: raisim::Mat<3ul, 1ul>::operator()(unsigned long) const (Matrix.hpp:75)
==14270==    by 0x120B45: raisim::Mat<3ul, 1ul>::norm() const (Matrix.hpp:81)
==14270==    by 0x11F3C4: raisim::ENVIRONMENT::step(Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> > const&) (Environment.hpp:141)
==14270==    by 0x12A9A0: raisim::VectorizedEnvironment<raisim::ENVIRONMENT>::perAgentStep(int, Eigen::Ref<Eigen::Matrix<float, -1, -1, 1, -1, -1>, 0, Eigen::OuterStride<-1> >&, Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&, Eigen::Ref<Eigen::Matrix<bool, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&) (VectorizedEnvironment.hpp:124)
==14270==    by 0x110BB7: raisim::VectorizedEnvironment<raisim::ENVIRONMENT>::step(Eigen::Ref<Eigen::Matrix<float, -1, -1, 1, -1, -1>, 0, Eigen::OuterStride<-1> >&, Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&, Eigen::Ref<Eigen::Matrix<bool, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&) [clone ._omp_fn.0] (VectorizedEnvironment.hpp:72)
==14270==    by 0x580C96D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==14270==    by 0x5C446DA: start_thread (pthread_create.c:463)
==14270==    by 0x5F7DA3E: clone (clone.S:95)
==14270==  Address 0xcd11f28 is 1,896 bytes inside a block of size 3,160 free'd
==14270==    at 0x4C31078: free (vg_replace_malloc.c:538)
==14270==    by 0x4E82931: raisim::World::integrate1() (in /home/aliengo/raisim_build/lib/libraisim.so)
==14270==    by 0x4E829B8: raisim::World::integrate() (in /home/aliengo/raisim_build/lib/libraisim.so)
==14270==    by 0x11F20D: raisim::ENVIRONMENT::step(Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> > const&) (Environment.hpp:125)
==14270==    by 0x12A9A0: raisim::VectorizedEnvironment<raisim::ENVIRONMENT>::perAgentStep(int, Eigen::Ref<Eigen::Matrix<float, -1, -1, 1, -1, -1>, 0, Eigen::OuterStride<-1> >&, Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&, Eigen::Ref<Eigen::Matrix<bool, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&) (VectorizedEnvironment.hpp:124)
==14270==    by 0x110BB7: raisim::VectorizedEnvironment<raisim::ENVIRONMENT>::step(Eigen::Ref<Eigen::Matrix<float, -1, -1, 1, -1, -1>, 0, Eigen::OuterStride<-1> >&, Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&, Eigen::Ref<Eigen::Matrix<bool, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&) [clone ._omp_fn.0] (VectorizedEnvironment.hpp:72)
==14270==    by 0x580C96D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==14270==    by 0x5C446DA: start_thread (pthread_create.c:463)
==14270==    by 0x5F7DA3E: clone (clone.S:95)
==14270==  Block was alloc'd at
==14270==    at 0x4C32443: memalign (vg_replace_malloc.c:906)
==14270==    by 0x4C32546: posix_memalign (vg_replace_malloc.c:1070)
==14270==    by 0x4E82886: raisim::World::integrate1() (in /home/aliengo/raisim_build/lib/libraisim.so)
==14270==    by 0x4E829B8: raisim::World::integrate() (in /home/aliengo/raisim_build/lib/libraisim.so)
==14270==    by 0x11F20D: raisim::ENVIRONMENT::step(Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> > const&) (Environment.hpp:125)
==14270==    by 0x12A9A0: raisim::VectorizedEnvironment<raisim::ENVIRONMENT>::perAgentStep(int, Eigen::Ref<Eigen::Matrix<float, -1, -1, 1, -1, -1>, 0, Eigen::OuterStride<-1> >&, Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&, Eigen::Ref<Eigen::Matrix<bool, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&) (VectorizedEnvironment.hpp:124)
==14270==    by 0x110BB7: raisim::VectorizedEnvironment<raisim::ENVIRONMENT>::step(Eigen::Ref<Eigen::Matrix<float, -1, -1, 1, -1, -1>, 0, Eigen::OuterStride<-1> >&, Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&, Eigen::Ref<Eigen::Matrix<bool, -1, 1, 0, -1, 1>, 0, Eigen::InnerStride<1> >&) [clone ._omp_fn.0] (VectorizedEnvironment.hpp:72)
==14270==    by 0x580C96D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==14270==    by 0x5C446DA: start_thread (pthread_create.c:463)
==14270==    by 0x5F7DA3E: clone (clone.S:95)

jhwangbo commented on August 18, 2024

Thanks for the input. I'll try it myself and locate the issue today.

jhwangbo commented on August 18, 2024

Can you send me the code that you have? According to the error output that you copy-pasted, there should be a norm() function call at line 141 of Environment.hpp, but my version has something else there. So it seems like you are not running my A1 example. I made the modification you wrote and ran it with Valgrind, but I don't get any error.

jhwangbo commented on August 18, 2024

This issue should be fixed by the new commit.

When there is an internal collision, only one of the two contacts is initialized, so each contact has to be checked with the .skip() method before it is used. The example is updated here: http://raisim.com/sections/Contact.html
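
The guard pattern can be sketched without raisim at all. The snippet below is a minimal illustration under stated assumptions: MockContact and totalImpulseNorm are hypothetical stand-ins written for this example, not raisim's Contact class or API. It only shows the idea of testing skip() before touching an impulse that may never have been initialized.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a contact record; NOT raisim's Contact class.
struct MockContact {
  std::size_t localBodyIndex;            // body of this object involved in the contact
  bool skipped;                          // true for the uninitialized half of an internal-collision pair
  const std::array<double, 3>* impulse;  // assumed valid only when skipped == false

  bool skip() const { return skipped; }
};

// Sum of impulse norms acting on one body, ignoring skipped contacts.
double totalImpulseNorm(const std::vector<MockContact>& contacts, std::size_t bodyIndex) {
  double total = 0.0;
  for (const auto& contact : contacts) {
    if (contact.skip()) continue;                       // never dereference a skipped contact's impulse
    if (contact.localBodyIndex != bodyIndex) continue;  // contact belongs to a different body
    const auto& v = *contact.impulse;
    total += std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
  }
  return total;
}
```

The skip() test comes first on purpose: in this sketch a skipped contact may carry a null or stale impulse pointer, so nothing else about it is read.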

I also changed raisim such that it doesn't segfault even if you ask for a value.

Please pull and test. If it works for you, let's close the issue.

Thanks for reporting the bug :)

ashish-kmr commented on August 18, 2024

This seems to have fixed it! Thanks for the swift response!
