(base) root@I12f01b1f1300b01732:/hy-tmp/code/NLP-HuggingFace-Tutorial/text_classification/T5# !53
python train.py --device='cuda'
/usr/local/miniconda3/lib/python3.8/site-packages/torch/cuda/__init__.py:88: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
{'batch_size': 16,
'classes_map_dir': '/hy-tmp/code/NLP-HuggingFace-Tutorial/text_classification/T5/classes_map.json',
'data_dir': '/hy-tmp/code/NLP-HuggingFace-Tutorial/text_classification/T5/dataset',
'device': 'cuda',
'learning_rate': 0.0001,
'lr_warmup_steps': 0,
'num_train_epochs': 10,
'num_workers': 8,
'prefix_text': 'tweet_eval emotion sentence: ',
'pretrained_model_name_or_path': 't5-base',
'save_weights_path': '/hy-tmp/code/NLP-HuggingFace-Tutorial/text_classification/T5/weights',
'use_Adafactor': True,
'use_AdafactorSchedule': True,
'use_weighted_random_sampler': False,
'weight_decay': 0}
Traceback (most recent call last):
File "train.py", line 187, in <module>
main(args)
File "train.py", line 93, in main
model.to(args.device)
File "/usr/local/miniconda3/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1811, in to
return super().to(*args, **kwargs)
File "/usr/local/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 989, in to
return self._apply(convert)
File "/usr/local/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
module._apply(fn)
File "/usr/local/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 664, in _apply
param_applied = fn(param)
File "/usr/local/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 987, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "/usr/local/miniconda3/lib/python3.8/site-packages/torch/cuda/__init__.py", line 229, in _lazy_init
torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW
Hello, training works fine on CPU, but on a Linux machine with a GPU it fails with the error above.
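
For anyone reproducing this: the failure happens inside `torch._C._cuda_init()`, before any model code runs, so it may point to the CUDA driver/runtime setup on the machine rather than to `train.py` itself. Below is a minimal sketch of a check that could be run before `model.to(args.device)`. The helper name `resolve_device` is hypothetical and not part of the repository; it only uses `torch.cuda.is_available()` and a tiny tensor transfer to see whether CUDA can actually be initialized, and falls back to `'cpu'` otherwise.

```python
import warnings


def resolve_device(requested: str) -> str:
    """Return `requested` if it can be used, otherwise fall back to 'cpu'.

    Hypothetical helper for illustration; not part of train.py.
    """
    if not requested.startswith("cuda"):
        return requested
    try:
        import torch  # imported lazily so the check also works without torch
    except ImportError:
        return "cpu"
    try:
        with warnings.catch_warnings():
            # is_available() emits a UserWarning when CUDA init fails
            warnings.simplefilter("ignore")
            if torch.cuda.is_available():
                # Moving a tiny tensor forces full CUDA initialization
                torch.zeros(1).to(requested)
                return requested
    except RuntimeError as err:
        print(f"CUDA initialization failed: {err}")
    return "cpu"


if __name__ == "__main__":
    print("Using device:", resolve_device("cuda"))
```

If this prints `cpu` on the GPU machine, the problem is likely in the environment (for example the installed driver versus the CUDA version PyTorch was built with), which `nvidia-smi` and `torch.version.cuda` can help compare.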