Comments (12)

ap229997 commented on May 24, 2024

The simulator runs at 20 fps, whereas the rotation frequency of the LiDAR sensor is 10 Hz (in accordance with the official leaderboard framework), so each simulation step captures only half a sweep. This is why you notice the flickering. I agree that the rotation frequency should be set to 20 Hz to get a full LiDAR sweep per frame, but we had to keep it at 10 Hz to match the leaderboard framework. Generally, there isn't much difference between consecutive frames, so this should still work fine.
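
For reference, a minimal sketch of where this setting lives in the CARLA 0.9.10 Python API (the host/port below are illustrative assumptions, not the leaderboard's exact setup):

import carla

# Connect to a running CARLA server (assumed local, default port).
client = carla.Client('localhost', 2000)
world = client.get_world()

lidar_bp = world.get_blueprint_library().find('sensor.lidar.ray_cast')
# Leaderboard setting: one full rotation every 0.1 s while the simulator
# steps at 20 fps, so each step only sees half a sweep (the flickering).
lidar_bp.set_attribute('rotation_frequency', '10')
# The change discussed in this thread would be one rotation per step:
# lidar_bp.set_attribute('rotation_frequency', '20')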

In my experiments, I noticed that removing the LiDAR input (which makes the model equivalent to AIM) leads to worse performance. Can you tell me which setting you used to test the performance without the LiDAR input?

HongYegg commented on May 24, 2024

I use late fusion. I think the difference between consecutive frames is very big, because consecutive frames contain the front and rear halves of the point cloud, and once processed they become very different network inputs. For example, the first frame is normal, while the second frame contains the point cloud behind the car; after the code below runs, that frame becomes basically completely black.

lidar_processed = list()
# transform the lidar point clouds to the local coordinate frame of the
# current ego pose
ego_theta = self.input_buffer['thetas'][-1]
ego_x, ego_y = self.input_buffer['gps'][-1]
for i, lidar_point_cloud in enumerate(self.input_buffer['lidar']):
    curr_theta = self.input_buffer['thetas'][i]
    curr_x, curr_y = self.input_buffer['gps'][i]
    lidar_point_cloud[:, 1] *= -1  # flip the y axis
    lidar_transformed = transform_2d_points(
        lidar_point_cloud,
        np.pi / 2 - curr_theta, -curr_x, -curr_y,
        np.pi / 2 - ego_theta, -ego_x, -ego_y)
    lidar_transformed = torch.from_numpy(
        lidar_to_histogram_features(
            lidar_transformed, crop=self.config.input_resolution)).unsqueeze(0)
    lidar_processed.append(lidar_transformed.to('cuda', dtype=torch.float32))
encoding.append(self.net.lidar_encoder(lidar_processed))

Only if we modify this line of code does the point cloud behind the vehicle in the second frame display normally, and even then it is still unreasonable.
Wait a moment; I will send you a video via email.

HongYegg commented on May 24, 2024

To make the point cloud behind the vehicle in the second frame display normally, this line of code has to be modified: lidar_point_cloud[:, 1] *= -1  # flip the y axis

ap229997 commented on May 24, 2024

Can you tell me which CARLA version you are using?

HongYegg commented on May 24, 2024

CARLA 0.9.10.1

HongYegg commented on May 24, 2024

Sorry, it is inconvenient to record a video right now. The effect is roughly this: if you visualize the variable lidar_transformed = transform_2d_points(lidar_point_cloud, np.pi/2-curr_theta, -curr_x, -curr_y, np.pi/2-ego_theta, -ego_x, -ego_y), one frame is normal and the next is basically black, so I think it is unreasonable to train the network on such input.
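
A minimal sketch of the kind of visualization described here (the helper name and the (C, H, W) feature layout are assumptions based on the output of lidar_to_histogram_features):

import matplotlib.pyplot as plt

# Hypothetical helper: render the BEV histogram produced by
# lidar_to_histogram_features so the alternating normal/black
# frames are easy to spot.
def show_lidar_features(features, step):
    # features: (C, H, W) array of per-cell point counts
    img = features.sum(axis=0)  # collapse the height bins
    plt.imshow(img, cmap='gray')
    plt.title(f'LiDAR BEV histogram, step {step}')
    plt.savefig(f'lidar_step_{step:04d}.png')
    plt.close()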

ap229997 commented on May 24, 2024

Are you visualizing the LiDAR input at every frame or only at every 10th frame (we save the data only every 10th frame)?

HongYegg commented on May 24, 2024

I'm not quite sure what you mean specifically. I just added a few lines of visualization code to the part of the code I mentioned above, so it displays at 20 fps.

HongYegg commented on May 24, 2024

But at every run_step, the input that the network receives really is unreasonable; I think this is the key to the problem.

ap229997 commented on May 24, 2024

In our code, the LiDAR is processed so as to give the front half of the point cloud at every frame, and this leads to alternating normal and black frames. However, when we generate the data for training, we save data only every 10th frame, i.e., 2 frames per second, even though the input stream consists of 20 frames per second (we don't store every frame in the training dataset).

if self.step % 10 == 0 and self.save_path is not None:
    self.save(far_node, near_command, steer, throttle, brake, target_speed, data)

So, even though the input alternates between normal and black, our training dataset only contains the normal frames.
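
A small sketch of why the saved steps all land on the same phase (that the front/rear alternation follows step parity is an assumption consistent with the description above):

# With a 10 Hz LiDAR in a 20 fps simulation, each step holds half a
# sweep, and the half alternates with step parity. Steps divisible by
# 10 are all even, so every saved frame falls on the same half.
for step in range(40):
    half = 'front' if step % 2 == 0 else 'rear'
    if step % 10 == 0:
        print(f'step {step:2d}: saved ({half} half)')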

I agree that at runtime the network input is unreasonable. The best solution would be to set the rotation frequency of the LiDAR to 20 Hz (as you pointed out), so that each frame captures a full sweep.

HongYegg commented on May 24, 2024

Very good. What I was worried about is whether the trained model would have this problem. In that case, I can adjust it myself. Thank you very much for your answer; if I have more questions, I will ask again.

ap229997 commented on May 24, 2024

Thanks for pointing this out. It'll be interesting to visualize the attention maps of the transfuser in these 'blank' frames.
