First of all, thank you for releasing your code and your great work. I have a shor

Training Set size not 17065 for NuScenes after preprocessing (motionnet, closed)

Comments (5)

pxiangwu commented on May 24, 2024

Hi, let me check the code. I may need to run the code to see what happens. Please give me a little bit time.

Also, below are the links to the pre-trained models, which might be helpful for you.

The pre-trained model for train_multi_seq.py can be downloaded from Google Drive or Dropbox
The pre-trained model for train_multi_seq_MGDA.py can be downloaded from Google Drive or Dropbox

pxiangwu commented on May 24, 2024

@DavidS3141, after quickly running the code, the output shows that scene 411 generates 34 files, scene 662 generates 34 files, scene 2 also generates 34 files, and so on. So in total we have roughly 34 * 500 = 17000 files (close to the expected 17065). So I think the code is correct.

Could you run the code on your system to check how many files scenes 411, 662, and 2 would generate? (These 3 scenes are the first 3 scenes that will be processed by the code.)
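If a (possibly partial) output directory already exists, the per-scene counts can also be read off the disk without re-running anything. The sketch below is only an illustration: it assumes each saved sample is an entry under the save path named `<scene_idx>_<seq_idx>`, which should be checked against the actual output. The helper names `count_outputs_per_scene` and `report` are made up for this example.

```python
import os
from collections import Counter


def count_outputs_per_scene(savepath):
    """Count entries named '<scene_idx>_<seq_idx>' under savepath, grouped by scene.

    The naming scheme is an assumption about the output layout.
    """
    counts = Counter()
    for name in os.listdir(savepath):
        scene, sep, seq = name.partition('_')
        # Ignore anything that does not look like '<int>_<int>'
        if sep and scene.isdigit() and seq.isdigit():
            counts[int(scene)] += 1
    return counts


def report(counts, scenes=(411, 662, 2), expected_total=17065):
    """Print the counts for a few scenes and the overall total; return the total."""
    for s in scenes:
        print("scene {}: {} entries".format(s, counts.get(s, 0)))
    total = sum(counts.values())
    print("total: {} (expected {})".format(total, expected_total))
    return total
```

Calling `report(count_outputs_per_scene(savepath))` on the training output directory would then show whether individual scenes are short or whether whole scenes are missing.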

You could also comment out some of the preprocessing code, such as the BEV rasterization and file saving, to speed up the run. That way we can quickly check the total number of files it would dump (see the code below).

# COPYRIGHT (C) Mitsubishi Electric Research Labs (MERL) 2020
# Code written by Pengxiang Wu
# March 2020

from nuscenes.nuscenes import NuScenes
import os
from nuscenes.utils.data_classes import LidarPointCloud
import numpy as np
import argparse
from data.data_utils import voxelize_occupy, gen_2d_grid_gt


parser = argparse.ArgumentParser()
parser.add_argument('-r', '--root', default=None, type=str, help='Root path to nuScenes dataset')
parser.add_argument('-s', '--split', default='train', type=str, help='The data split [train/val/test]')
parser.add_argument('-p', '--savepath', default=None, type=str, help='Directory for saving the generated data')
args = parser.parse_args()

if args.root is None or args.savepath is None:
    raise ValueError("Should specify the dataset path and the savepath.")

nusc = NuScenes(version='v1.0-trainval', dataroot=args.root, verbose=True)
print("Total number of scenes:", len(nusc.scene))

class_map = {'vehicle.car': 1, 'vehicle.bus.rigid': 1, 'vehicle.bus.bendy': 1, 'human.pedestrian': 2,
             'vehicle.bicycle': 3}  # background: 0, other: 4


if args.split == 'train':
    num_keyframe_skipped = 0  # The number of keyframes we will skip when dumping the data
    nsweeps_back = 30  # Number of frames back to the history (including the current timestamp)
    nsweeps_forward = 20  # Number of frames into the future (does not include the current timestamp)
    skip_frame = 0  # The number of frames skipped for the adjacent sequence
    num_adj_seqs = 2  # number of adjacent sequences, among which the time gap is \delta t
else:
    num_keyframe_skipped = 1
    nsweeps_back = 25  # Setting this to 30 (for training) or 25 (for testing) allows conducting ablation studies on frame numbers
    nsweeps_forward = 20
    skip_frame = 0
    num_adj_seqs = 1


# The specifications for BEV maps
voxel_size = (0.25, 0.25, 0.4)
area_extents = np.array([[-32., 32.], [-32., 32.], [-3., 2.]])
past_frame_skip = 3  # when generating the BEV maps, how many history frames need to be skipped
future_frame_skip = 0  # when generating the BEV maps, how many future frames need to be skipped
num_past_frames_for_bev_seq = 5  # the number of past frames for BEV map sequence


scenes = np.load('data/split.npy', allow_pickle=True).item().get(args.split)
print("Split: {}, which contains {} scenes.".format(args.split, len(scenes)))

# ---------------------- Extract the scenes, and then pre-process them into BEV maps ----------------------
def gen_data():
    res_scenes = list()
    for s in scenes:
        s_id = s.split('_')[1]
        res_scenes.append(int(s_id))

    total = 0
    for scene_idx in res_scenes:
        curr_scene = nusc.scene[scene_idx]

        first_sample_token = curr_scene['first_sample_token']
        curr_sample = nusc.get('sample', first_sample_token)
        curr_sample_data = nusc.get('sample_data', curr_sample['data']['LIDAR_TOP'])

        save_data_dict_list = list()  # for storing consecutive sequences; the data consists of timestamps, points, etc
        save_box_dict_list = list()  # for storing box annotations in consecutive sequences
        save_instance_token_list = list()
        adj_seq_cnt = 0
        save_seq_cnt = 0  # only used for save data file name


        # Iterate each sample data
        print("Processing scene {} ...".format(scene_idx))
        while curr_sample_data['next'] != '':

            # Get the synchronized point clouds
            all_pc, all_times, trans_matrices = \
                LidarPointCloud.from_file_multisweep_bf_sample_data(nusc, curr_sample_data,
                                                                    return_trans_matrix=True,
                                                                    nsweeps_back=nsweeps_back,
                                                                    nsweeps_forward=nsweeps_forward)
            # Store point cloud of each sweep
            pc = all_pc.points
            _, sort_idx = np.unique(all_times, return_index=True)
            unique_times = all_times[np.sort(sort_idx)]  # Preserve the item order in unique_times
            num_sweeps = len(unique_times)

            # Make sure we have sufficient past and future sweeps
            if num_sweeps != (nsweeps_back + nsweeps_forward):

                # Skip some keyframes if necessary
                flag = False
                for _ in range(num_keyframe_skipped + 1):
                    if curr_sample['next'] != '':
                        curr_sample = nusc.get('sample', curr_sample['next'])
                    else:
                        flag = True
                        break

                if flag:  # No more keyframes
                    break
                else:
                    curr_sample_data = nusc.get('sample_data', curr_sample['data']['LIDAR_TOP'])

                # Reset
                adj_seq_cnt = 0
                save_data_dict_list = list()
                save_box_dict_list = list()
                save_instance_token_list = list()
                continue

            adj_seq_cnt += 1
            if adj_seq_cnt == num_adj_seqs:

                print(">> Finish sample Num: {}".format(total + 1))
                total += 1
                # --------------------------------------------------------------------------------

                save_seq_cnt += 1
                adj_seq_cnt = 0
                save_data_dict_list = list()
                save_box_dict_list = list()
                save_instance_token_list = list()

                # Skip some keyframes if necessary
                flag = False
                for _ in range(num_keyframe_skipped + 1):
                    if curr_sample['next'] != '':
                        curr_sample = nusc.get('sample', curr_sample['next'])
                    else:
                        flag = True
                        break

                if flag:  # No more keyframes
                    break
                else:
                    curr_sample_data = nusc.get('sample_data', curr_sample['data']['LIDAR_TOP'])
            else:
                flag = False
                for _ in range(skip_frame + 1):
                    if curr_sample_data['next'] != '':
                        curr_sample_data = nusc.get('sample_data', curr_sample_data['next'])
                    else:
                        flag = True
                        break

                if flag:  # No more sample frames
                    break


# ---------------------- Convert the raw data into (dense) BEV maps ----------------------
def convert_to_dense_bev(data_dict):
    num_sweeps = data_dict['num_sweeps']
    times = data_dict['times']
    trans_matrices = data_dict['trans_matrices']

    num_past_sweeps = len(np.where(times >= 0)[0])
    num_future_sweeps = len(np.where(times < 0)[0])
    assert num_past_sweeps + num_future_sweeps == num_sweeps, "The number of sweeps is incorrect!"

    # Load point cloud
    pc_list = []

    for i in range(num_sweeps):
        pc = data_dict['pc_' + str(i)]
        pc_list.append(pc.T)

    # Reorder the pc, and skip sample frames if wanted
    # Currently the past frames in pc_list are stored in the following order [current, current + 1, current + 2, ...]
    # Therefore, we would like to reorder the frames
    tmp_pc_list_1 = pc_list[0:num_past_sweeps:(past_frame_skip + 1)]
    tmp_pc_list_1 = tmp_pc_list_1[::-1]
    tmp_pc_list_2 = pc_list[(num_past_sweeps + future_frame_skip)::(future_frame_skip + 1)]
    pc_list = tmp_pc_list_1 + tmp_pc_list_2  # now the order is: [past frames -> current frame -> future frames]

    num_past_pcs = len(tmp_pc_list_1)
    num_future_pcs = len(tmp_pc_list_2)

    # Discretize the input point clouds, and compute the ground-truth displacement vectors
    # The following two variables contain the information for the
    # compact representation of binary voxels, as described in the paper
    voxel_indices_list = list()
    padded_voxel_points_list = list()

    past_pcs_idx = list(range(num_past_pcs))
    past_pcs_idx = past_pcs_idx[-num_past_frames_for_bev_seq:]  # we typically use 5 past frames (including the current one)
    for i in past_pcs_idx:
        res, voxel_indices = voxelize_occupy(pc_list[i], voxel_size=voxel_size, extents=area_extents, return_indices=True)
        voxel_indices_list.append(voxel_indices)
        padded_voxel_points_list.append(res)

    # Compile the batch of voxels, so that they can be fed into the network.
    # Note that, the padded_voxel_points in this script will only be used for sanity check.
    padded_voxel_points = np.stack(padded_voxel_points_list, axis=0).astype(bool)

    # Finally, generate the ground-truth displacement field
    # - all_disp_field_gt: the ground-truth displacement vectors for each grid cell
    # - all_valid_pixel_maps: the masking map for valid pixels, used for loss computation
    # - non_empty_map: the mask which represents the non-empty grid cells, used for loss computation
    # - pixel_cat_map: the map specifying the category for each non-empty grid cell
    # - pixel_indices: the indices of non-empty grid cells, used to generate sparse BEV maps
    # - pixel_instance_map: the map specifying the instance id for each grid cell, used for loss computation
    all_disp_field_gt, all_valid_pixel_maps, non_empty_map, pixel_cat_map, pixel_indices, pixel_instance_map \
        = gen_2d_grid_gt(data_dict, grid_size=voxel_size[0:2], extents=area_extents,
                         frame_skip=future_frame_skip, return_instance_map=True)

    return voxel_indices_list, padded_voxel_points, pixel_indices, pixel_instance_map, all_disp_field_gt,\
        all_valid_pixel_maps, non_empty_map, pixel_cat_map, num_past_frames_for_bev_seq, num_future_pcs, trans_matrices


# ---------------------- Convert the dense BEV data into sparse format ----------------------
# This will significantly reduce the space used for data storage
def convert_to_sparse_bev(dense_bev_data):
    save_voxel_indices_list, save_voxel_points, save_pixel_indices, save_pixel_instance_maps, \
        save_disp_field_gt, save_valid_pixel_maps, save_non_empty_maps, save_pixel_cat_maps, \
        save_num_past_pcs, save_num_future_pcs, save_trans_matrices = dense_bev_data

    save_valid_pixel_maps = save_valid_pixel_maps.astype(bool)
    save_voxel_dims = save_voxel_points.shape[1:]
    num_categories = save_pixel_cat_maps.shape[-1]

    sparse_disp_field_gt = save_disp_field_gt[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1], :]
    sparse_valid_pixel_maps = save_valid_pixel_maps[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1]]
    sparse_pixel_cat_maps = save_pixel_cat_maps[save_pixel_indices[:, 0], save_pixel_indices[:, 1]]
    sparse_pixel_instance_maps = save_pixel_instance_maps[save_pixel_indices[:, 0], save_pixel_indices[:, 1]]

    save_data_dict = dict()
    for i in range(len(save_voxel_indices_list)):
        save_data_dict['voxel_indices_' + str(i)] = save_voxel_indices_list[i].astype(np.int32)

    save_data_dict['disp_field'] = sparse_disp_field_gt
    save_data_dict['valid_pixel_map'] = sparse_valid_pixel_maps
    save_data_dict['pixel_cat_map'] = sparse_pixel_cat_maps
    save_data_dict['num_past_pcs'] = save_num_past_pcs
    save_data_dict['num_future_pcs'] = save_num_future_pcs
    save_data_dict['trans_matrices'] = save_trans_matrices
    save_data_dict['3d_dimension'] = save_voxel_dims
    save_data_dict['pixel_indices'] = save_pixel_indices
    save_data_dict['pixel_instance_ids'] = sparse_pixel_instance_maps

    # -------------------------------- Sanity Check --------------------------------
    dims = save_non_empty_maps.shape

    test_disp_field_gt = np.zeros((save_num_future_pcs, dims[0], dims[1], 2), dtype=np.float32)
    test_disp_field_gt[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1], :] = sparse_disp_field_gt[:]
    assert np.all(test_disp_field_gt == save_disp_field_gt), "Error: Mismatch"

    test_valid_pixel_maps = np.zeros((save_num_future_pcs, dims[0], dims[1]), dtype=bool)
    test_valid_pixel_maps[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1]] = sparse_valid_pixel_maps[:]
    assert np.all(test_valid_pixel_maps == save_valid_pixel_maps), "Error: Mismatch"

    test_pixel_cat_maps = np.zeros((dims[0], dims[1], num_categories), dtype=np.float32)
    test_pixel_cat_maps[save_pixel_indices[:, 0], save_pixel_indices[:, 1], :] = sparse_pixel_cat_maps[:]
    assert np.all(test_pixel_cat_maps == save_pixel_cat_maps), "Error: Mismatch"

    test_non_empty_map = np.zeros((dims[0], dims[1]), dtype=np.float32)
    test_non_empty_map[save_pixel_indices[:, 0], save_pixel_indices[:, 1]] = 1.0
    assert np.all(test_non_empty_map == save_non_empty_maps), "Error: Mismatch"

    test_pixel_instance_map = np.zeros((dims[0], dims[1]), dtype=np.uint8)
    test_pixel_instance_map[save_pixel_indices[:, 0], save_pixel_indices[:, 1]] = sparse_pixel_instance_maps[:]
    assert np.all(test_pixel_instance_map == save_pixel_instance_maps), "Error: Mismatch"

    for i in range(len(save_voxel_indices_list)):
        indices = save_data_dict['voxel_indices_' + str(i)]
        curr_voxels = np.zeros(save_voxel_dims, dtype=bool)
        curr_voxels[indices[:, 0], indices[:, 1], indices[:, 2]] = 1
        assert np.all(curr_voxels == save_voxel_points[i]), "Error: Mismatch"

    return save_data_dict


if __name__ == "__main__":
    gen_data()
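As a side note, the "skip some keyframes if necessary" loop appears twice in `gen_data()` above. A small helper could express it once. This is only a sketch of a possible refactor, not part of the original script: `advance_keyframes` and its `get_sample` argument are hypothetical names, and the demo uses a plain dict in place of the nuScenes tables.

```python
def advance_keyframes(sample, get_sample, steps):
    """Follow sample['next'] `steps` times.

    Returns the sample reached, or None if the scene ends first
    (the case the script above marks with `flag = True`).
    """
    for _ in range(steps):
        if sample['next'] == '':
            return None
        sample = get_sample(sample['next'])
    return sample


# Tiny stand-in for the 'sample' table: three keyframes chained by 'next'
fake_samples = {
    'a': {'next': 'b'},
    'b': {'next': 'c'},
    'c': {'next': ''},
}
reached = advance_keyframes(fake_samples['a'], fake_samples.__getitem__, 2)
print(reached is fake_samples['c'])
```

In the script itself the call would look roughly like `advance_keyframes(curr_sample, lambda t: nusc.get('sample', t), num_keyframe_skipped + 1)`, followed by a `break` when the result is `None`.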

demmerichs commented on May 24, 2024

When I run my version of the script, I get the following, which looks a bit different from yours because of your changes:

Processing scene 411 ...
  >> Finish sample: 0, sequence 0
  >> Finish sample: 0, sequence 1
  >> Finish sample: 1, sequence 0
  >> Finish sample: 1, sequence 1
  >> Finish sample: 2, sequence 0
  >> Finish sample: 2, sequence 1
  >> Finish sample: 3, sequence 0
  >> Finish sample: 3, sequence 1
  >> Finish sample: 4, sequence 0
  >> Finish sample: 4, sequence 1
  >> Finish sample: 5, sequence 0
  >> Finish sample: 5, sequence 1
  >> Finish sample: 6, sequence 0
  >> Finish sample: 6, sequence 1
  >> Finish sample: 7, sequence 0
  >> Finish sample: 7, sequence 1
  >> Finish sample: 8, sequence 0
  >> Finish sample: 8, sequence 1
  >> Finish sample: 9, sequence 0
  >> Finish sample: 9, sequence 1
  >> Finish sample: 10, sequence 0
  >> Finish sample: 10, sequence 1
  >> Finish sample: 11, sequence 0
  >> Finish sample: 11, sequence 1
  >> Finish sample: 12, sequence 0
  >> Finish sample: 12, sequence 1
  >> Finish sample: 13, sequence 0
  >> Finish sample: 13, sequence 1
  >> Finish sample: 14, sequence 0
  >> Finish sample: 14, sequence 1
  >> Finish sample: 15, sequence 0
  >> Finish sample: 15, sequence 1
  >> Finish sample: 16, sequence 0
  >> Finish sample: 16, sequence 1
  >> Finish sample: 17, sequence 0
  >> Finish sample: 17, sequence 1
  >> Finish sample: 18, sequence 0
  >> Finish sample: 18, sequence 1
  >> Finish sample: 19, sequence 0
  >> Finish sample: 19, sequence 1
  >> Finish sample: 20, sequence 0
  >> Finish sample: 20, sequence 1
  >> Finish sample: 21, sequence 0
  >> Finish sample: 21, sequence 1
  >> Finish sample: 22, sequence 0
  >> Finish sample: 22, sequence 1
  >> Finish sample: 23, sequence 0
  >> Finish sample: 23, sequence 1
  >> Finish sample: 24, sequence 0
  >> Finish sample: 24, sequence 1
  >> Finish sample: 25, sequence 0
  >> Finish sample: 25, sequence 1
  >> Finish sample: 26, sequence 0
  >> Finish sample: 26, sequence 1
  >> Finish sample: 27, sequence 0
  >> Finish sample: 27, sequence 1
  >> Finish sample: 28, sequence 0
  >> Finish sample: 28, sequence 1
  >> Finish sample: 29, sequence 0
  >> Finish sample: 29, sequence 1
  >> Finish sample: 30, sequence 0
  >> Finish sample: 30, sequence 1
  >> Finish sample: 31, sequence 0
  >> Finish sample: 31, sequence 1
  >> Finish sample: 32, sequence 0
  >> Finish sample: 32, sequence 1
  >> Finish sample: 33, sequence 0
  >> Finish sample: 33, sequence 1
Processing scene 662 ...

The first four scenes all have 34 samples, and are 411, 662, 225, 2 in that order (I think you just missed 225, since the script you provided here also gives this order of scenes). Sadly, the loading is quite slow for me as well, but I am running your shortened script to find out the total sample count. Right now everything points to an error during preprocessing that I missed and that caused scenes to be dropped. I just realized (again) that nuScenes is quite memory hungry. I had some other applications running and had automatically started the val and test preprocessing as well, so the generation script may have been killed because of OOM. I have now stopped the other programs and will update this as soon as the scripts have finished (might take a day :/, tqdm would have been nice for this).
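For anyone comparing runs, the per-scene sample counts can be tallied from a saved log instead of by eye. This is a rough sketch that assumes the log lines look exactly like the output pasted above ("Processing scene N ..." followed by ">> Finish sample: i, sequence j"); the function name `samples_per_scene` is made up for the example.

```python
import re
from collections import OrderedDict

SCENE_RE = re.compile(r"Processing scene (\d+)")
SAMPLE_RE = re.compile(r">> Finish sample: (\d+), sequence (\d+)")


def samples_per_scene(log_lines):
    """Return {scene_idx: number of distinct samples}, in order of appearance."""
    seen = OrderedDict()  # scene index -> set of sample indices
    current = None
    for line in log_lines:
        m = SCENE_RE.search(line)
        if m:
            current = int(m.group(1))
            seen.setdefault(current, set())
            continue
        m = SAMPLE_RE.search(line)
        if m and current is not None:
            # Both sequences of a sample share the same sample index
            seen[current].add(int(m.group(1)))
    return OrderedDict((scene, len(ids)) for scene, ids in seen.items())
```

A scene that was cut short (for example by a killed process) would show up with fewer samples than its neighbours, and a scene that never started would be missing from the result entirely.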

pxiangwu commented on May 24, 2024

Yes. The data loading is slow since the data reader of nuScenes is implemented in Python instead of C++.
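Since a full pass can take many hours, some progress output per scene helps to tell a slow run from a dead one. tqdm, mentioned above, would do this directly. The sketch below is a standard-library stand-in under that same idea, not something from the repository: `with_progress` is a hypothetical helper, and the ETA is a simple average over the scenes finished so far.

```python
import sys
import time


def with_progress(items, label="scene", stream=sys.stderr):
    """Yield items one by one and print 'done/total', elapsed time and a rough ETA."""
    items = list(items)
    total = len(items)
    start = time.time()
    for done, item in enumerate(items, start=1):
        yield item
        # Execution resumes here once the caller has finished processing `item`
        elapsed = time.time() - start
        eta = elapsed / done * (total - done)
        stream.write("\r{} {}/{}  elapsed {:.0f}s  eta {:.0f}s".format(
            label, done, total, elapsed, eta))
        stream.flush()
    stream.write("\n")
```

In the script above, this would wrap the scene loop, for example `for scene_idx in with_progress(res_scenes):`.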

Let me know if you successfully generate 17065 files.

demmerichs commented on May 24, 2024

It was a problem on my side, thanks again for your help. I have now processed all scenes as expected. Closing this.
