CVPR2023-Occupancy-Prediction-Challenge

Home Page: https://opendrivelab.com/AD23Challenge.html

License: MIT License


CVPR 2023 3D Occupancy Prediction Challenge

The world's first 3D occupancy benchmark for scene perception in autonomous driving.

devkit: v0.1.0 License: Apache2.0

Introduction

Understanding the 3D surroundings, including background stuff and foreground objects, is important for autonomous driving. In the traditional 3D object detection task, a foreground object is represented by a 3D bounding box. However, the geometrical shape of an object can be complex and cannot be captured by a simple 3D box, and the perception of background stuff is absent. The goal of this task is to predict the 3D occupancy of the scene. We provide a large-scale occupancy benchmark based on the nuScenes dataset. The benchmark is a voxelized representation of 3D space, and the occupancy state and semantics of each voxel are jointly estimated. The complexity of this task lies in the dense prediction of 3D space given only surround-view images.

If you use the challenge dataset in your paper, please consider citing OccNet and Occ3D with the following BibTex:

@article{sima2023_occnet,
      title={Scene as Occupancy},
      author={Chonghao Sima and Wenwen Tong and Tai Wang and Li Chen and Silei Wu and Hanming Deng and Yi Gu and Lewei Lu and Ping Luo and Dahua Lin and Hongyang Li},
      year={2023},
      eprint={2306.02851},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
@article{tian2023occ3d,
  title={Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving},
  author={Tian, Xiaoyu and Jiang, Tao and Yun, Longfei and Wang, Yue and Wang, Yilun and Zhao, Hang},
  journal={arXiv preprint arXiv:2304.14365},
  year={2023}
}

Leaderboard

3D Occupancy Prediction Challenge at CVPR 2023 (Server remains active)

Please refer to this link. If you wish to add new results to the leaderboard or modify existing ones, please drop us an email at [email protected]

Top 10 at a glance as of June 10, 2023.

Table of Contents

  • Changelog
  • Task Definition
  • Rules for Occupancy Challenge
  • Evaluation Metrics
  • Data
  • Getting Started
  • Submission
  • Challenge Timeline
  • License

Changelog

  • June 1: Challenge closed! Please refer to the final leaderboard here.
  • May 16: Note that you must provide a correct email address and other required information to validate your submissions to the Challenge.❗
  • May 12: The evaluation server is finally online! Check out the submission format.
  • April 18: 🚀 A strong baseline based on InternImage released. Check it out here.
  • April 13: We added visualization code under utils/vis.py, as well as rules about using other datasets and future frames. Please take a look at the rules section and strictly follow them.

Task Definition

Given images from multiple cameras, the goal is to predict the current occupancy state and semantics of each voxel grid in the scene. The voxel state is predicted to be either free or occupied. If a voxel is occupied, its semantic class needs to be predicted as well. Besides, we also provide a binary observed/unobserved mask for each frame: an unobserved voxel is one that is invisible in the current camera observation, and such voxels are ignored in the evaluation stage.
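To make the expected output concrete, here is a minimal illustration (not official code) of how a per-frame semantic voxel prediction relates to the binary occupancy state, using the grid shape and the free label (17) described in the Data section below. The random prediction is purely a placeholder.

import numpy as np

# Hypothetical per-frame prediction: one semantic class index per voxel in the
# 200 x 200 x 16 grid; label 17 denotes "free" (see the Data section).
semantics = np.random.randint(0, 18, size=(200, 200, 16), dtype=np.uint8)

FREE_LABEL = 17
occupied = semantics != FREE_LABEL       # binary occupancy state per voxel
print(f'{occupied.sum()} occupied voxels out of {occupied.size}')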

Rules for Occupancy Challenge

  • We allow using annotations provided in the nuScenes dataset, and during inference, the input modality of the model should be camera only.
  • No future frame is allowed during inference.
  • In order to check compliance, we will ask the participants to provide technical reports to the challenge committee, and winners will be asked to give a public talk about their method after receiving the award.
  • Every submission provides method information. We encourage publishing code, but do not make it a requirement.
  • Each team can have at most one account on the evaluation server. Users that create multiple accounts to circumvent the rules will be excluded from the challenge.
  • Each team can submit at most three results during the challenge.
  • Faulty submissions that return an error on Eval AI do not count towards the submission limit.
  • Any attempt to circumvent these rules will result in a permanent ban of the team or company from the challenge.

(back to top)

Evaluation Metrics

Leaderboard ranking for this challenge is determined by the mean intersection-over-union (mIoU) over all classes.

mIoU

Let $C$ be the number of classes.

$$ mIoU=\frac{1}{C}\displaystyle \sum_{c=1}^{C}\frac{TP_c}{TP_c+FP_c+FN_c}, $$

where $TP_c$, $FP_c$, and $FN_c$ correspond to the number of true positive, false positive, and false negative predictions for class $c$.
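For reference, below is a minimal NumPy sketch of this metric (not the official evaluation code): it accumulates a confusion matrix over voxels and averages the per-class IoU.

import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean IoU over classes, computed from a confusion matrix.

    pred, gt: integer arrays of the same shape with values in [0, num_classes).
    """
    hist = np.bincount(num_classes * gt.reshape(-1) + pred.reshape(-1),
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(hist)                      # true positives per class
    fp = hist.sum(axis=0) - tp              # predicted as class c but labelled otherwise
    fn = hist.sum(axis=1) - tp              # labelled class c but predicted otherwise
    iou = tp / np.maximum(tp + fp + fn, 1)  # avoid division by zero for absent classes
    return iou.mean()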

(back to top)

Data

Figure 1. Semantic labels (left), visibility masks in the LiDAR (middle) and the camera (right) view. Grey voxels are unobserved in LiDAR view and white voxels are observed in the accumulative LiDAR view but unobserved in the current camera view.

Basic Information

Type        | Info
----------- | ----------------------------------
mini        | 404
train       | 28,130
val         | 6,019
test        | 6,008
cameras     | 6
voxel size  | 0.4m
range       | [-40m, -40m, -1m, 40m, 40m, 5.4m]
volume size | [200, 200, 16]
#classes    | 0 - 17
  • The dataset contains 18 classes. The definition of classes 0 to 16 is the same as in the nuScenes-lidarseg dataset. Label 17 represents voxels that are not occupied by anything and is named free. The voxel semantics for each sample frame are given as [semantics] in labels.npz.

  • How are the labels annotated? The ground-truth occupancy labels are derived from accumulated LiDAR scans with human annotations, and the occupancy is annotated in the ego coordinate system.

    • If a voxel reflects a LiDAR point, it is assigned the same semantic label as the LiDAR point;
    • If a LiDAR beam passes through a voxel in the air, the voxel is set to free;
    • Otherwise, we set the voxel to unknown, or unobserved. This happens due to the sparsity of the LiDAR or because the voxel is occluded, e.g. by a wall. In the dataset, [mask_lidar] is a 0-1 binary mask, where 0 represents unobserved voxels. As shown in Fig. 1(b), grey voxels are unobserved. Due to the limitation of the visualization tool, we only show unobserved voxels at the same height as the ground.
  • Camera visibility. Note that the installation positions of the LiDAR and the cameras are different; therefore, some voxels observed in the LiDAR view are not seen by the cameras. Since we focus on a vision-centric task, we provide a binary voxel mask [mask_camera], indicating whether each voxel is observed in the current camera view. As shown in Fig. 1(c), white voxels are observed in the accumulated LiDAR view but unobserved in the current camera view.

  • Both [mask_lidar] and [mask_camera] are optional for training, and participants do not need to predict the masks. Only [mask_camera] is used for evaluation; unobserved voxels are not involved in calculating the F-score and mIoU. A minimal loading sketch is shown below.
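As an illustration (a minimal sketch, assuming the labels.npz layout described above; the path is a placeholder following the Hierarchy section), the snippet below loads one frame's ground truth and restricts it to the camera-visible voxels:

import numpy as np

# Hypothetical frame path: gts/[scene_name]/[frame_token]/labels.npz
gt = np.load('gts/scene-xxxx/xxxxxxxxxxxxxxxx/labels.npz')
semantics = gt['semantics']          # [200, 200, 16], class indices 0-17 (17 = free)
mask_camera = gt['mask_camera']      # [200, 200, 16], 1 = observed in the camera view

# Only camera-observed voxels count towards the evaluation.
visible = semantics[mask_camera.astype(bool)]
print(visible.shape, np.unique(visible))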

Download

The files mentioned below can also be downloaded via OpenDataLab. It is recommended to use the provided command-line interface for acceleration.

Subset   | Google Drive | Baidu Cloud | Size
-------- | ------------ | ----------- | -------------
mini     | data         | data        | approx. 440M
trainval | data         | data        | approx. 32G
test     | data         | data        | approx. 6G
  • Mini and trainval data contain three parts -- imgs, gts and annotations. The imgs data have the same hierarchy as the image samples in the original nuScenes dataset.

Hierarchy

The hierarchy of folder Occpancy3D-nuScenes-V1.0/ is described below:

└── Occpancy3D-nuScenes-V1.0
    |
    ├── mini
    |
    ├── trainval
    |   ├── imgs
    |   |   ├── CAM_BACK
    |   |   |   ├── n015-2018-07-18-11-07-57+0800__CAM_BACK__1531883530437525.jpg
    |   |   |   └── ...
    |   |   ├── CAM_BACK_LEFT
    |   |   |   ├── n015-2018-07-18-11-07-57+0800__CAM_BACK_LEFT__1531883530447423.jpg
    |   |   |   └── ...
    |   |   └── ...
    |   |
    |   ├── gts
    |   |   ├── [scene_name]
    |   |   |   ├── [frame_token]
    |   |   |   |   └── labels.npz
    |   |   |   └── ...
    |   |   └── ...
    |   |
    |   └── annotations.json
    |
    └── test
        ├── imgs
        └── annotations.json

  • imgs/ contains images captured by various cameras.
  • gts/ contains the ground truth of each sample. [scene_name] specifies a sequence of frames, and [frame_token] specifies a single frame in a sequence.
  • annotations.json contains the meta info of the dataset.
  • labels.npz contains [semantics], [mask_lidar], and [mask_camera] for each frame.
annotations {
    "train_split": ["scene-0001", ...],                         <list> -- training dataset split by scene_name
    "val_split": ["scene-0003", ...],                           <list> -- validation dataset split by scene_name
    "scene_infos" {                                             <dict> -- meta infos of the scenes    
        [scene_name]: {                                         <str> -- name of the scene.  
            [frame_token]: {                                    <str> -- samples in a scene, ordered by time
                    "timestamp":                                <str> -- timestamp (or token), unique by sample
                    "camera_sensor": {                          <dict> -- meta infos of the camera sensor
                        [cam_token]: {                          <str> -- token of the camera
                            "img_path":                         <str> -- corresponding image file path, *.jpg
                            "intrinsic":                        <float> [3, 3] -- intrinsic camera calibration
                            "extrinsic":{                       <dict> -- extrinsic parameters of the camera
                                "translation":                  <float> [3] -- coordinate system origin in meters
                                "rotation":                     <float> [4] -- coordinate system orientation as quaternion
                            }   
                            "ego_pose": {                       <dict> -- vehicle pose of the camera
                                "translation":                  <float> [3] -- coordinate system origin in meters
                                "rotation":                     <float> [4] -- coordinate system orientation as quaternion
                            }                
                        },
                        ...
                    },
                    "ego_pose": {                               <dict> -- vehicle pose
                        "translation":                          <float> [3] -- coordinate system origin in meters
                        "rotation":                             <float> [4] -- coordinate system orientation as quaternion
                    },
                    "gt_path":                                  <str> -- corresponding 3D voxel gt path, *.npz
                    "next":                                     <str> -- frame_token of the next keyframe in the scene
                    "prev":                                     <str> -- frame_token of the previous keyframe in the scene
                },
                ...
        }
    }
}
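For illustration, here is a small sketch (assuming the schema above; the data root is a placeholder) that walks annotations.json and collects the surround-view images and ground-truth path for every training frame:

import json
import os

data_root = 'Occpancy3D-nuScenes-V1.0/trainval'          # placeholder path
with open(os.path.join(data_root, 'annotations.json')) as f:
    anno = json.load(f)

samples = []
for scene_name in anno['train_split']:
    for frame_token, frame in anno['scene_infos'][scene_name].items():
        samples.append({
            'frame_token': frame_token,
            'imgs': [cam['img_path'] for cam in frame['camera_sensor'].values()],
            'gt': frame['gt_path'],                       # path to labels.npz
        })
print(f'{len(samples)} training frames')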

Known Issues

  • nuScenes (issue #721) lacks translation along the z-axis, which makes it hard to recover accurate 6-DoF localization and leads to misalignment of point clouds when accumulating them over whole scenes. Ground stratification occurs in several samples.

(back to top)

Getting Started

We provide a baseline model based on BEVFormer.

Please refer to getting_started for details.

(back to top)

Submission

Submission site

Please submit your results on our evaluation server. The submission rules are described here.

Submission format

We define a standardized 3D occupancy prediction result format that serves as the input to the evaluation code. Results are evaluated per sample. The 3D occupancy prediction results for the test set are stored in a folder; the participant needs to zip the results folder and submit it to the official evaluation server.

The folder structure of the results should be as follows:

└── results_folder
    β”œβ”€β”€ [frame_token].npz
    └── ...

The results folder contains .npz files, where each .npz file contains the voxel labels of the 3D grid with shape [200, 200, 16]. Pay special attention that each set of predictions in the folder must be a .npz file named [frame_token].npz. The [frame_token] in annotations.json is the same as the sample_token in nuScenes. A .npz file contains an array of uint8 values in which each value is the predicted semantic class index of the corresponding grid cell in 3D space.

Below is an example of how to save the predictions for a single sample:

import os
import numpy as np
save_path = os.path.join(submission_prefix, '{}.npz'.format(sample_token))  # e.g. results_folder/<frame_token>.npz
np.savez_compressed(save_path, occ_pred.astype(np.uint8))  # occ_pred: [200, 200, 16] array of class indices

We provide example scripts based on mmdetection3d to generate the submission file, please refer to getting_started for details.

The official evaluation server only accepts a single *.zip file; you can zip the results folder as below:

zip -r occ_submission.zip results_folder
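Before zipping, it can be worth sanity-checking the folder against the format above. A small sketch (not official code; results_folder is a placeholder) is shown below:

import os
import numpy as np

results_folder = 'results_folder'                 # placeholder path
for name in os.listdir(results_folder):
    assert name.endswith('.npz'), f'unexpected file: {name}'
    with np.load(os.path.join(results_folder, name)) as data:
        occ = data[data.files[0]]                 # the single stored array
        assert occ.shape == (200, 200, 16), f'{name}: wrong shape {occ.shape}'
        assert occ.dtype == np.uint8, f'{name}: wrong dtype {occ.dtype}'
print('submission folder looks consistent')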

(back to top)

Challenge Timeline

  • May 12, 2023 - Challenge Period Open.
  • Jun 01, 2023 11:59:00 PM CST(UTC+8) - Challenge Period End.
  • Jun 03, 2023 - Finalist Notification.
  • Jun 10, 2023 - Technical Report Deadline.
  • Jun 12, 2023 - Winner Announcement.

(back to top)

License

Before using the dataset, you should register on the nuScenes website and agree to the nuScenes terms of use. All code within this repository is under the MIT License.

(back to top)


cvpr2023-3d-occupancy-prediction's Issues

Question about the mIoU calculation

mIoU = self.per_class_iu(self.hist)
print(f'===> mIoU of {self.cnt} samples: ' + str(round(np.nanmean(mIoU[:self.num_classes-1]) * 100, 2)))

It seems that predictions for the 'free' class affect the calculation of mIoU, but the class is not printed and does not participate in the final ranking.
This is not equivalent to ignoring it, since predicting cars or other categories in empty space produces false positives.
Is the mIoU on the test set computed in the same way?

Which timestamp (sweep or camera) does the `occ_gt` lie in?

As you said, the occ_gt is in the ego coordinate system, but both the sweep timestamp and the camera timestamps have associated ego poses. Is the occ_gt given at the timestamp of the sweep or of the camera? If the camera, which camera? Thanks.

The ego poses of the different cameras also have small differences.

Where can I download v1.0-trainval?

Hi,

thank you for your great work to organize such a challenge.

I followed the getting_started guideline to prepare the dataset. However, I ran into an error:

FileNotFoundError: [Errno 2] No such file or directory: './data/occ3d-nus/v1.0-trainval/attribute.json'

Although issue #28 mentions that v1.0-trainval is not necessary to download, this error indicates a missing file in that directory. So I went to the nuScenes downloads page, but there are too many files to check. I tried these files:

  1. nuImages -- All -- Metadata --> nuimages-v1.0-all-metadata.tgz. But the files inside are split into train and val, and they are JSON files, so they are not easy to merge.
  2. nuScenes-lidarseg -- All -- Metadata and sensor file blobs --> nuScenes-lidarseg-all-v1.0.tar. There is a subfolder named v1.0-trainval, but I cannot find the file attribute.json.

Could you explicitly let me know where I can obtain the v1.0-trainval?

Thanks in advance.

TypeError: NuSceneOcc: Unsupported format: None

Hello there! I was hoping to ask for some assistance with an issue I'm encountering. I've successfully utilized the TPVFormer environment, but I'm now facing an error message when attempting to transfer it to another project. Specifically, the error reads "TypeError: NuSceneOcc: Unsupported format: None". Would you happen to have any suggestions on how to resolve this? Thank you for your help in advance!

The voxels on the rear side of the vehicle seem to be neatly divided?

I visualized the first sample of the mini dataset, as shown in the attached image.
I found that the voxels on the rear side of the vehicle seem to be neatly divided (the positive direction of the z-axis in the figure corresponds to the negative direction of the y-axis of the LiDAR sensor in nuScenes), and other samples look the same. Is this expected, or is it caused by a mistake on my side?

HELP! GPU memory overflow during testing.

My partner and I both encountered GPU memory overflow during testing.
The error message is shown in the attached screenshot; the text itself was not saved.

About the training log

I wonder if you could provide the training log for the baseline for our reference? Thanks

Is training with other public/private dataset allowed?

Hi,
I'm wondering if the following behavior is allowed in this challenge:

(1) training/pretraining with the LiDAR semantic annotations of the original nuScenes dataset;
(2) training/pretraining with private SSC annotations of the nuScenes dataset;
(3) training/pretraining with other public datasets;
(4) training/pretraining with other private datasets.

If not, how will the compliance of the uploaded results be checked?

Thanks

Data prepare

Hi,

I'm not familiar with BEVFormer. When I prepare the data folder following the instructions, I find there is no information about how to obtain the v1.0-trainval folder in data/occ3d-nus/. Is this folder empty? If not, how can I get the files for the v1.0-trainval folder?

Thanks for your help!

About the `free` class and the `occ_metrics`

Hi, I want to ask that

It seems the baseline includes the free class in the calculation of mIoU, is that right? And in the print function, the free class is not printed:


for ind_class in range(self.num_classes-1):
    print(f'===> {self.class_names[ind_class]} - IoU = ' + str(round(mIoU[ind_class] * 100, 2)))

About the `gts` semantics in labels

I tried to visualize the semantics in gts and got one of the results shown in the attached images.

I have removed the voxels that are labeled in mask_lidar and mask_camera, so is that right? It looks like something is wrong, with so many white cubes. Could you also provide official guidance for the visualization?

BEVDet baseline with 42 mIoU

We support the occupancy prediction task with the BEVDet codebase and offer a strong baseline with performance up to 42.0 mIoU:

Config                                     | mIoU | GPU Consumption
------------------------------------------ | ---- | ----------------
bevformer_base_occ                         | 23.7 | 645.3 GPU-hours
BEVDet-Occ-R50-4D-Stereo-2x                | 36.1 | 132.0 GPU-hours
BEVDet-Occ-R50-4D-Stereo-2x-384704         | 37.3 | 164.0 GPU-hours
BEVDet-Occ-R50-4DLongterm-Stereo-2x-384704 | 39.3 | 512.0 GPU-hours
BEVDet-Occ-STBase-4D-Stereo-2x             | 42.0 | 531.2 GPU-hours

GPU consumption is measured on an RTX 3090.
More ablations will be updated gradually.

Dependencies for a Docker container

I tried to use the dependencies in Getting Started (BEVFormer) inside a Docker container but got mmdetection3d errors.
Could you please provide some alternative dependencies for non-conda environments?
Thanks

The voxel GTs do not align well with the images?

I projected the centers of the voxels to the corresponding image and found that they do not align well with the image, as shown in the attached picture.

Take a look at the person in the image: according to the voxels, he is 7 * 0.4 = 2.8 m tall. Is that reasonable?

I used the "voxel to points" function from here: https://github.com/CVPR2023-3D-Occupancy-Prediction/CVPR2023-3D-Occupancy-Prediction/blob/main/utils/vis.py#L36
And the camera intrinsics and extrinsics are from the annotations.json file.
Please tell me if I missed anything.

Eval code Error

Hi, I have a problem when evaluating the code; the error message is as follows.

trainval data.

2023-03-06 23:15:59,690 - mmdet - INFO - Epoch [1][14050/14065] lr: 2.000e-04, eta: 7 days, 18:42:48, time: 2.075, data_time: 0.027, memory: 14379, loss_occ: 0.1698, loss: 0.1698, grad_norm: 0.2118
2023-03-06 23:16:30,723 - mmdet - INFO - Saving checkpoint at 1 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>           ] 4712/6019, 6.7 task/s, elapsed: 706s, ETA:   196s
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 1 (pid: 3636) of binary: /home/aiia611/anaconda3/envs/open-mmlab/bin/python
Traceback (most recent call last):
  File "/home/aiia611/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/aiia611/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)

Bug in nuscenes_dataset.py

Hi, when I was single-stepping, I found that in nuscenes_dataset.py, line 212, rotation is assigned as a quaternion.
Assigning a quaternion to a list element results in only the first value being assigned, as shown in the attached screenshot.

Do we need to predict the free voxel?

I notice that there is no result for the "free" class (label 17) in the baseline results provided. Is this category not required for calculating the mIoU?

Is the `mask_camera` mandatory in evaluation?

I noticed that your baseline config sets use_mask to False, which means you didn't use any mask (neither mask_lidar nor mask_camera) for evaluation or loss calculation. Is this setting mandatory, or can we set it freely?

can not find `class NuScenesDataset_eval_modified`, specified in `projects/configs/datasets/custom_nus-3d.py`

Thanks for the project! But I have one question.
I cannot find the class NuScenesDataset_eval_modified specified in projects/configs/datasets/custom_nus-3d.py, where dataset_type = 'NuScenesDataset_eval_modified' is defined. So how does it work? How does the dataset get loaded during the training phase? Thanks in advance!

dataset_type = 'NuScenesDataset_eval_modified'
...
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'nuscenes_infos_train.pkl',
        pipeline=train_pipeline,
        classes=class_names,
        modality=input_modality,
        test_mode=False,
        # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
        # and box_type_3d='Depth' in sunrgbd and scannet dataset.
        box_type_3d='LiDAR'),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'nuscenes_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        modality=input_modality,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'nuscenes_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        modality=input_modality,
        test_mode=True,
        box_type_3d='LiDAR'))
...

visualization error

Hello,

I cannot run your utils/vis.py script. I get the error:

Traceback (most recent call last):
  File "utils/vis.py", line 194, in <module>
    vis_nuscene()
  File "utils/vis.py", line 175, in vis_nuscene
    vis = show_point_cloud(points=points, colors=True, points_colors=pcd_colors, voxelize=True, obj_bboxes=None,
  File "utils/vis.py", line 110, in show_point_cloud
    opt.background_color = np.asarray([1, 1, 1])
AttributeError: 'NoneType' object has no attribute 'background_color'

Any idea? I couldn't find the open3d version you use. Thanks

About the relationship between voxel indexes and coordinates.

According to this issue, I understand that the voxels use ego coordinates, but I still have doubts about the relationship between voxel indices and coordinates.

As shown in the attached picture: in the ego coordinate system, which of the cases in the picture does the position of the [0, 0, 0] voxel relative to the coordinate (0, 0, 0) correspond to, or is it neither?

Regarding the rule

I noticed that the rule regarding restrictions on the use of other datasets has been removed. Does this mean that it is now permissible to use other public or private datasets?

Question about mask_camera

Thanks for the work!
I would like to ask what 0 and 1 stand for in mask_camera.
If 1 represents a voxel that the camera can observe in the current frame, then in 'gts/scene-0362/80d56e801c7e465995bdb116b3e678aa/labels.npz' there are only 98,390 such voxels (see the attached screenshot).
Is this right or not?
Meanwhile, can mask_camera be used for the loss?
Waiting for your reply.

GPU Memory Consumption

Hi,

It would be helpful if anyone could tell me the minimum GPU memory consumption during training and testing of the base BEVFormer-Occ model.

Thanks a lot!

The "ignore" category should not be taken into account in mIoU

The first class, named "ignore" in nuScenes ("others" here), indicates noise points. This class is very small in number, is usually used only for auxiliary training (e.g. ignore_index=0 in CrossEntropyLoss), and is not taken into account in the nuScenes semantic segmentation task. In the evaluation code I noticed that this category is included in the mIoU; it would be better not to include it, for more stable evaluation results.

Error when training with fp16

Hi,

  1. I'm trying to train bevformer_small_occ with fp16 but failed.
    I added fp16 = dict(loss_scale=512.) in the config and wrapped the model to fp16 in train.py using the following code.
    model = build_model(
        cfg.model,
        train_cfg=cfg.get('train_cfg'),
        test_cfg=cfg.get('test_cfg'))

    # wrap the model to fp16
    fp16_cfg = cfg.get('fp16', None)
    if fp16_cfg is not None:
        wrap_fp16_model(model)

    model.init_weights()

Following getting_started.md, the training process then fails with the error shown in the attached screenshot.

  2. I also wonder whether it is possible to provide an official guide or implementation of fp16 training?
