Hello, thanks for your wonderful work!
I also came across an NCCL error on a single node with 4 GPUs.
I ran the following script, as suggested in issue #5:
NCCL_DEBUG=INFO python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model deit_tiny_patch16_224 --batch-size 256 --data-path /path/to/imagenet
The terminal reports:
(deit92) [yuxin.fang@gpu-dev006 deit]$ NCCL_DEBUG=INFO bash train.sh
training...
| distributed init (rank 3): env://
| distributed init (rank 1): env://
| distributed init (rank 0): env://
| distributed init (rank 2): env://
gpu-dev006:25486:25486 [0] NCCL INFO Bootstrap : Using [0]enp7s0:10.10.112.56<0> [1]virbr0:192.168.122.1<0> [2]vethee19468:fe80::4463:98ff:fe1a:66c9%vethee19468<0> [3]veth717ea13:fe80::3c8e:dcff:fed2:2236%veth717ea13<0> [4]veth9e7cb5a:fe80::94c9:90ff:fe6f:7fcb%veth9e7cb5a<0> [5]veth74a5bff:fe80::d01d:81ff:fee9:4dfa%veth74a5bff<0> [6]veth8231c1a:fe80::9068:abff:fe35:e6ad%veth8231c1a<0> [7]veth57f4fc5:fe80::446e:a2ff:fe34:fd05%veth57f4fc5<0> [8]veth35d67ed:fe80::9037:67ff:feb8:17b6%veth35d67ed<0> [9]veth22216db:fe80::70b3:b9ff:feef:be53%veth22216db<0> [10]veth207d721:fe80::1837:b5ff:feb6:b5b0%veth207d721<0> [11]veth19a2645:fe80::e4b3:40ff:fe8e:9756%veth19a2645<0> [12]veth52b5332:fe80::8052:d6ff:fe39:7c28%veth52b5332<0> [13]vethef511ca:fe80::64d0:3aff:fe3b:61d7%vethef511ca<0> [14]veth93f8d8c:fe80::d870:9bff:fec8:6c6f%veth93f8d8c<0> [15]vethcbdf2e2:fe80::786d:4fff:fef5:6daf%vethcbdf2e2<0>
gpu-dev006:25486:25486 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
gpu-dev006:25486:25486 [0] NCCL INFO NET/IB : Using [0]mlx4_0:1/RoCE ; OOB enp7s0:10.10.112.56<0>
gpu-dev006:25486:25486 [0] NCCL INFO Using network IB
NCCL version 2.7.8+cuda9.2
gpu-dev006:25488:25488 [2] NCCL INFO Bootstrap : Using [0]enp7s0:10.10.112.56<0> [1]virbr0:192.168.122.1<0> [2]vethee19468:fe80::4463:98ff:fe1a:66c9%vethee19468<0> [3]veth717ea13:fe80::3c8e:dcff:fed2:2236%veth717ea13<0> [4]veth9e7cb5a:fe80::94c9:90ff:fe6f:7fcb%veth9e7cb5a<0> [5]veth74a5bff:fe80::d01d:81ff:fee9:4dfa%veth74a5bff<0> [6]veth8231c1a:fe80::9068:abff:fe35:e6ad%veth8231c1a<0> [7]veth57f4fc5:fe80::446e:a2ff:fe34:fd05%veth57f4fc5<0> [8]veth35d67ed:fe80::9037:67ff:feb8:17b6%veth35d67ed<0> [9]veth22216db:fe80::70b3:b9ff:feef:be53%veth22216db<0> [10]veth207d721:fe80::1837:b5ff:feb6:b5b0%veth207d721<0> [11]veth19a2645:fe80::e4b3:40ff:fe8e:9756%veth19a2645<0> [12]veth52b5332:fe80::8052:d6ff:fe39:7c28%veth52b5332<0> [13]vethef511ca:fe80::64d0:3aff:fe3b:61d7%vethef511ca<0> [14]veth93f8d8c:fe80::d870:9bff:fec8:6c6f%veth93f8d8c<0> [15]vethcbdf2e2:fe80::786d:4fff:fef5:6daf%vethcbdf2e2<0>
gpu-dev006:25488:25488 [2] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
gpu-dev006:25488:25488 [2] NCCL INFO NET/IB : Using [0]mlx4_0:1/RoCE ; OOB enp7s0:10.10.112.56<0>
gpu-dev006:25488:25488 [2] NCCL INFO Using network IB
gpu-dev006:25489:25489 [3] NCCL INFO Bootstrap : Using [0]enp7s0:10.10.112.56<0> [1]virbr0:192.168.122.1<0> [2]vethee19468:fe80::4463:98ff:fe1a:66c9%vethee19468<0> [3]veth717ea13:fe80::3c8e:dcff:fed2:2236%veth717ea13<0> [4]veth9e7cb5a:fe80::94c9:90ff:fe6f:7fcb%veth9e7cb5a<0> [5]veth74a5bff:fe80::d01d:81ff:fee9:4dfa%veth74a5bff<0> [6]veth8231c1a:fe80::9068:abff:fe35:e6ad%veth8231c1a<0> [7]veth57f4fc5:fe80::446e:a2ff:fe34:fd05%veth57f4fc5<0> [8]veth35d67ed:fe80::9037:67ff:feb8:17b6%veth35d67ed<0> [9]veth22216db:fe80::70b3:b9ff:feef:be53%veth22216db<0> [10]veth207d721:fe80::1837:b5ff:feb6:b5b0%veth207d721<0> [11]veth19a2645:fe80::e4b3:40ff:fe8e:9756%veth19a2645<0> [12]veth52b5332:fe80::8052:d6ff:fe39:7c28%veth52b5332<0> [13]vethef511ca:fe80::64d0:3aff:fe3b:61d7%vethef511ca<0> [14]veth93f8d8c:fe80::d870:9bff:fec8:6c6f%veth93f8d8c<0> [15]vethcbdf2e2:fe80::786d:4fff:fef5:6daf%vethcbdf2e2<0>
gpu-dev006:25489:25489 [3] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
gpu-dev006:25489:25489 [3] NCCL INFO NET/IB : Using [0]mlx4_0:1/RoCE ; OOB enp7s0:10.10.112.56<0>
gpu-dev006:25489:25489 [3] NCCL INFO Using network IB
gpu-dev006:25487:25487 [1] NCCL INFO Bootstrap : Using [0]enp7s0:10.10.112.56<0> [1]virbr0:192.168.122.1<0> [2]vethee19468:fe80::4463:98ff:fe1a:66c9%vethee19468<0> [3]veth717ea13:fe80::3c8e:dcff:fed2:2236%veth717ea13<0> [4]veth9e7cb5a:fe80::94c9:90ff:fe6f:7fcb%veth9e7cb5a<0> [5]veth74a5bff:fe80::d01d:81ff:fee9:4dfa%veth74a5bff<0> [6]veth8231c1a:fe80::9068:abff:fe35:e6ad%veth8231c1a<0> [7]veth57f4fc5:fe80::446e:a2ff:fe34:fd05%veth57f4fc5<0> [8]veth35d67ed:fe80::9037:67ff:feb8:17b6%veth35d67ed<0> [9]veth22216db:fe80::70b3:b9ff:feef:be53%veth22216db<0> [10]veth207d721:fe80::1837:b5ff:feb6:b5b0%veth207d721<0> [11]veth19a2645:fe80::e4b3:40ff:fe8e:9756%veth19a2645<0> [12]veth52b5332:fe80::8052:d6ff:fe39:7c28%veth52b5332<0> [13]vethef511ca:fe80::64d0:3aff:fe3b:61d7%vethef511ca<0> [14]veth93f8d8c:fe80::d870:9bff:fec8:6c6f%veth93f8d8c<0> [15]vethcbdf2e2:fe80::786d:4fff:fef5:6daf%vethcbdf2e2<0>
gpu-dev006:25487:25487 [1] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
gpu-dev006:25487:25487 [1] NCCL INFO NET/IB : Using [0]mlx4_0:1/RoCE ; OOB enp7s0:10.10.112.56<0>
gpu-dev006:25487:25487 [1] NCCL INFO Using network IB
gpu-dev006:25486:25652 [0] NCCL INFO Channel 00/02 : 0 1 2 3
gpu-dev006:25489:25656 [3] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 8/8/64
gpu-dev006:25487:25659 [1] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 8/8/64
gpu-dev006:25488:25654 [2] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 8/8/64
gpu-dev006:25487:25659 [1] NCCL INFO Trees [0] 2/-1/-1->1->0|0->1->2/-1/-1 [1] 2/-1/-1->1->0|0->1->2/-1/-1
gpu-dev006:25486:25652 [0] NCCL INFO Channel 01/02 : 0 1 2 3
gpu-dev006:25489:25656 [3] NCCL INFO Trees [0] -1/-1/-1->3->2|2->3->-1/-1/-1 [1] -1/-1/-1->3->2|2->3->-1/-1/-1
gpu-dev006:25488:25654 [2] NCCL INFO Trees [0] 3/-1/-1->2->1|1->2->3/-1/-1 [1] 3/-1/-1->2->1|1->2->3/-1/-1
gpu-dev006:25487:25659 [1] NCCL INFO Setting affinity for GPU 1 to ff
gpu-dev006:25488:25654 [2] NCCL INFO Setting affinity for GPU 2 to ff00
gpu-dev006:25489:25656 [3] NCCL INFO Setting affinity for GPU 3 to ff00
gpu-dev006:25486:25652 [0] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 8/8/64
gpu-dev006:25486:25652 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1|-1->0->1/-1/-1 [1] 1/-1/-1->0->-1|-1->0->1/-1/-1
gpu-dev006:25486:25652 [0] NCCL INFO Setting affinity for GPU 0 to ff
gpu-dev006:25488:25654 [2] NCCL INFO Channel 00 : 2[82000] -> 3[83000] via direct shared memory
gpu-dev006:25486:25652 [0] NCCL INFO Channel 00 : 0[2000] -> 1[3000] via direct shared memory
gpu-dev006:25489:25656 [3] NCCL INFO Channel 00 : 3[83000] -> 0[2000] via direct shared memory
gpu-dev006:25487:25659 [1] NCCL INFO Channel 00 : 1[3000] -> 2[82000] via direct shared memory
gpu-dev006:25489:25656 [3] NCCL INFO Channel 00 : 3[83000] -> 2[82000] via direct shared memory
gpu-dev006:25488:25654 [2] NCCL INFO Channel 00 : 2[82000] -> 1[3000] via direct shared memory
gpu-dev006:25487:25659 [1] NCCL INFO Channel 00 : 1[3000] -> 0[2000] via direct shared memory
gpu-dev006:25489:25656 [3] NCCL INFO Channel 01 : 3[83000] -> 0[2000] via direct shared memory
gpu-dev006:25488:25654 [2] NCCL INFO Channel 01 : 2[82000] -> 3[83000] via direct shared memory
gpu-dev006:25487:25659 [1] NCCL INFO Channel 01 : 1[3000] -> 2[82000] via direct shared memory
gpu-dev006:25486:25652 [0] NCCL INFO Channel 01 : 0[2000] -> 1[3000] via direct shared memory
gpu-dev006:25489:25656 [3] NCCL INFO Channel 01 : 3[83000] -> 2[82000] via direct shared memory
gpu-dev006:25488:25654 [2] NCCL INFO Channel 01 : 2[82000] -> 1[3000] via direct shared memory
gpu-dev006:25489:25656 [3] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
gpu-dev006:25489:25656 [3] NCCL INFO comm 0x7f93ac000d70 rank 3 nranks 4 cudaDev 3 busId 83000 - Init COMPLETE
gpu-dev006:25487:25659 [1] NCCL INFO Channel 01 : 1[3000] -> 0[2000] via direct shared memory
gpu-dev006:25486:25652 [0] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
gpu-dev006:25486:25652 [0] NCCL INFO comm 0x7fc27c000d70 rank 0 nranks 4 cudaDev 0 busId 2000 - Init COMPLETE
gpu-dev006:25486:25486 [0] NCCL INFO Launch mode Parallel
gpu-dev006:25488:25654 [2] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
gpu-dev006:25488:25654 [2] NCCL INFO comm 0x7ff108000d70 rank 2 nranks 4 cudaDev 2 busId 82000 - Init COMPLETE
gpu-dev006:25487:25659 [1] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
gpu-dev006:25487:25659 [1] NCCL INFO comm 0x7f5a68000d70 rank 1 nranks 4 cudaDev 1 busId 3000 - Init COMPLETE
Namespace(aa='rand-m9-mstd0.5-inc1', batch_size=256, clip_grad=None, color_jitter=0.4, cooldown_epochs=10, cutmix=1.0, cutmix_minmax=None, data_path='/home/public_data/zhigang.yang/data/orig_data/imagenet', data_set='IMNET', decay_epochs=30, decay_rate=0.1, device='cuda', dist_backend='nccl', dist_url='env://', distributed=True, drop=0.0, drop_block=None, drop_path=0.1, epochs=300, eval=False, gpu=0, inat_category='name', input_size=224, lr=0.0005, lr_noise=None, lr_noise_pct=0.67, lr_noise_std=1.0, min_lr=1e-05, mixup=0.8, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='deit_tiny_patch16_224', model_ema=True, model_ema_decay=0.99996, model_ema_force_cpu=False, momentum=0.9, num_workers=10, opt='adamw', opt_betas=None, opt_eps=1e-08, output_dir='', patience_epochs=10, pin_mem=True, rank=0, recount=1, remode='pixel', repeated_aug=True, reprob=0.25, resplit=False, resume='', sched='cosine', seed=0, smoothing=0.1, start_epoch=0, train_interpolation='bicubic', warmup_epochs=5, warmup_lr=1e-06, weight_decay=0.05, world_size=4)
Creating model: deit_tiny_patch16_224
number of params: 5717416
Start training
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [0,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [1,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [7,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [9,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [10,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [13,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [14,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [16,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [17,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [21,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [25,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [30,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [31,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [32,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [35,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [36,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [37,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [38,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [39,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [41,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [42,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [44,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [46,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [48,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [49,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [50,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [53,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [54,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [56,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [57,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [58,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [60,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [61,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [62,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [0,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [1,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [7,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [10,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [13,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [14,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [16,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [17,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [24,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [25,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [31,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [32,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [33,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [35,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [37,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [38,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [41,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [42,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [44,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [46,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [48,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [49,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [50,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [53,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [56,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [57,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [58,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [60,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [61,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [62,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [0,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [1,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [10,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [14,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [17,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [22,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [24,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [31,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [32,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [33,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [35,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [37,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [38,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [40,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [41,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [42,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [43,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [44,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [48,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [49,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [50,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [53,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [55,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [56,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [57,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [58,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [60,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [61,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [62,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [0,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [1,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [2,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [8,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [10,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [11,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [14,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [17,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [22,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [24,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [26,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [31,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [32,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [33,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [37,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [38,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [40,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [42,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [43,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [47,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [48,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [49,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [50,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [55,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [57,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [58,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [59,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [60,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [61,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
/opt/conda/conda-bld/pytorch_1607370144807/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [62,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds"
failed.
Traceback (most recent call last):
File "main.py", line 335, in <module>
main(args)
File "main.py", line 295, in main
args.clip_grad, model_ema, mixup_fn
File "/home/users/yuxin.fang/vt/deit/engine.py", line 42, in train_one_epoch
outputs = model(samples)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 281, in forward
x = self.forward_features(x)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 267, in forward_features
x = self.patch_embed(x)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 165, in forward
x = self.proj(x).flatten(2).transpose(1, 2)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
Traceback (most recent call last):
File "main.py", line 335, in <module>
main(args)
File "main.py", line 295, in main
args.clip_grad, model_ema, mixup_fn
File "/home/users/yuxin.fang/vt/deit/engine.py", line 39, in train_one_epoch
samples, targets = mixup_fn(samples, targets)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/data/mixup.py", line 217, in __call__
target = mixup_target(target, self.num_classes, lam, self.label_smoothing)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/data/mixup.py", line 27, in mixup_target
return y1 * lam + y2 * (1. - lam)
RuntimeError: CUDA error: device-side assert triggered
gpu-dev006:25487:25487 [1] init.cc:924 NCCL WARN Cuda failure 'device-side assert triggered'
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370144807/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
gpu-dev006:25489:25489 [3] init.cc:924 NCCL WARN Cuda failure 'device-side assert triggered'
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370144807/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
Traceback (most recent call last):
File "main.py", line 335, in <module>
main(args)
File "main.py", line 295, in main
args.clip_grad, model_ema, mixup_fn
File "/home/users/yuxin.fang/vt/deit/engine.py", line 42, in train_one_epoch
outputs = model(samples)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 281, in forward
x = self.forward_features(x)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 267, in forward_features
x = self.patch_embed(x)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 165, in forward
x = self.proj(x).flatten(2).transpose(1, 2)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
Traceback (most recent call last):
File "main.py", line 335, in <module>
main(args)
File "main.py", line 295, in main
args.clip_grad, model_ema, mixup_fn
File "/home/users/yuxin.fang/vt/deit/engine.py", line 42, in train_one_epoch
outputs = model(samples)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 281, in forward
x = self.forward_features(x)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 267, in forward_features
x = self.patch_embed(x)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/timm/models/vision_transformer.py", line 165, in forward
x = self.proj(x).flatten(2).transpose(1, 2)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
gpu-dev006:25488:25488 [2] init.cc:924 NCCL WARN Cuda failure 'device-side assert triggered'
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370144807/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
gpu-dev006:25486:25486 [0] init.cc:924 NCCL WARN Cuda failure 'device-side assert triggered'
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370144807/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
Traceback (most recent call last):
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
main()
File "/home/users/yuxin.fang/anaconda3/envs/deit92/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/users/yuxin.fang/anaconda3/envs/deit92/bin/python', '-u', 'main.py', '--model', 'deit_tiny_patch16_224', '--batch-size', '256', '--data-path', '/home/public_data/zhigang.yang/data/orig_data/imagenet']' died with <Signals.SIGABRT: 6>.
(deit92) [yuxin.fang@gpu-dev006 deit]$
Since DeiT's implementation depends heavily on timm, I also ran the timm training script for EfficientNet_B0 on the same machine in the same environment, and it ran with 0 warnings and 0 errors.
Could you help me fix this? Thanks.
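One check I can run to narrow this down is sketched below. It rests on an unconfirmed assumption: that the "index out of bounds" assertion comes from the one-hot scatter inside timm's `mixup_target`, triggered by a label outside `[0, num_classes)`. The helper name `find_out_of_range_labels` and the sample labels are hypothetical and only illustrate the check. In practice the labels would come from the training dataset's targets.

```python
from collections import Counter


def find_out_of_range_labels(labels, num_classes):
    """Count label values that fall outside the valid range [0, num_classes)."""
    return Counter(int(y) for y in labels if not 0 <= int(y) < num_classes)


# Hypothetical labels standing in for the targets of one batch.
batch_targets = [0, 999, 1000, 5, -1, 1000]

bad = find_out_of_range_labels(batch_targets, num_classes=1000)
print(bad)  # Counter({1000: 2, -1: 1})
```

If this reports any labels for the real dataset, the class count passed to the mixup function may not match the number of class folders under the data path. An empty result would suggest the cause lies elsewhere.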