limeng95 / pytorch_hand_classifier
Simple hand classifier by Pytorch and ResNet
Hello, I ran into this problem when running classifier_train.py, and I would like to know what environment you used to run the program. Mine is Windows 10, pytorch=1.0.0, torchvision=0.2.1, python=3.5.6 and python=3.6.9, cuda=9.0. According to what I found on Baidu, the cause is a mismatch between the PyTorch, Python, or CUDA versions, so I would like to know whether this is a version issue. Thank you.
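When asking whether a failure comes from a version mismatch, it helps to report exactly what the interpreter sees. The sketch below is one way to collect that information. The helper name `describe_environment` is made up for illustration; the only PyTorch attributes it relies on are `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib
import platform


def describe_environment():
    """Collect the version details usually needed to diagnose a mismatch."""
    info = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    for name in ("torch", "torchvision"):
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # missing package or broken native libraries
            info[name] = "import failed: {}".format(exc)
            continue
        info[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # CUDA version the installed build was compiled against
            # (None for CPU-only builds).
            info["cuda_build"] = module.version.cuda
            # Whether a usable GPU and driver are visible at runtime.
            info["cuda_available"] = module.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```

Pasting this output into the issue, together with the full traceback, makes it much easier to tell a version problem from a code problem.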
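My laptop has no GPU acceleration, so I ran this project on Google Colab. Colab provides a free GPU, but at runtime I found that shared memory was insufficient. Has anyone else run into this problem?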
2018-07-28 12:48:24 [INFO]: Start training epoch 1
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
File "classifier_train.py", line 65, in <module>
trainer.train()
File "/content/drive/pytorch_hand_classifier/utils/Trainer.py", line 104, in train
self._train_one_epoch()
File "/content/drive/pytorch_hand_classifier/utils/Trainer.py", line 133, in _train_one_epoch
for step, (data, label) in enumerate(self.train_data):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 275, in __next__
idx, batch = self._get_batch()
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 254, in _get_batch
return self.data_queue.get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 335, in get
res = self._reader.recv_bytes()
File "/usr/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 175, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 2068) is killed by signal: Bus error.
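The error message points at shared memory, which DataLoader worker processes use to pass batches back to the main process. A common workaround is to load data in the main process (`num_workers=0`) when `/dev/shm` is small. The sketch below is a heuristic and not part of this repository: the helper name `choose_num_workers` and the 1 GiB threshold are assumptions chosen for illustration.

```python
import os
import shutil


def choose_num_workers(requested, shm_path="/dev/shm", min_free_bytes=1 << 30):
    """Return `requested`, or 0 when shared memory looks too small for workers.

    The threshold is an arbitrary assumption; tune it for your batch size.
    """
    if not os.path.isdir(shm_path):
        # No shared-memory mount at this path (e.g. on Windows):
        # leave the setting alone.
        return requested
    free_bytes = shutil.disk_usage(shm_path).free
    return requested if free_bytes >= min_free_bytes else 0


if __name__ == "__main__":
    workers = choose_num_workers(4)
    print("num_workers =", workers)
    # The result would then be passed on, for example:
    # torch.utils.data.DataLoader(dataset, batch_size=32, num_workers=workers)
```

With `num_workers=0` loading is slower, but no worker processes are started, so the shared-memory path that triggers the bus error is not used.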
I have trained for 50 epochs and found that the model can still improve. How can I continue training from where these 50 epochs left off? Thank you!
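One general way to resume is to save the optimizer state and the epoch counter alongside the model weights, then restore all three before continuing. The sketch below assumes a checkpoint layout of its own (a dict with `epoch`, `model`, and `optimizer` keys) and hypothetical helper names; it is not the format this repository writes, and the `nn.Linear` model is only a stand-in for the real network.

```python
import os
import tempfile

import torch
from torch import nn, optim


def save_checkpoint(path, model, optimizer, epoch):
    """Store everything needed to continue training after `epoch`."""
    torch.save(
        {
            "epoch": epoch,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return the next epoch number."""
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["epoch"] + 1


# Stand-in model and optimizer; the real ones come from the training script.
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

ckpt_path = os.path.join(tempfile.mkdtemp(), "epoch_50.pth")
save_checkpoint(ckpt_path, model, optimizer, epoch=50)

# Later run: rebuild the same objects, then restore their state.
resumed_model = nn.Linear(4, 2)
resumed_optimizer = optim.SGD(resumed_model.parameters(), lr=0.01, momentum=0.9)
start_epoch = load_checkpoint(ckpt_path, resumed_model, resumed_optimizer)

for epoch in range(start_epoch, start_epoch + 2):
    pass  # call the per-epoch training routine here
```

If only the model weights were saved, loading them with `load_state_dict` before training still gives a warm start, but optimizer state such as momentum begins from scratch.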
Can't the official ResNet model use the GPU?
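torchvision's ResNet models are ordinary `nn.Module` subclasses, so they can be moved to a GPU the same way as any other module. A minimal sketch, using a small stand-in model so it runs without downloading anything; the helper name `to_best_device` is hypothetical:

```python
import torch
from torch import nn


def to_best_device(model):
    """Move the model to the GPU when one is available, else keep it on the CPU."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    return model.to(device), device


# Stand-in for a ResNet; any nn.Module is moved the same way.
model, device = to_best_device(nn.Linear(16, 5))

# Inputs must live on the same device as the model's parameters.
batch = torch.randn(2, 16).to(device)
output = model(batch)
print(output.shape, output.device)
```

A frequent cause of "the model does not use the GPU" is moving only the model, or only the data, to the device; both need the same `.to(device)` call.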
I printed the loss for each step. Why does the loss change so much at the start of each epoch? In the first two epochs it suddenly increases, and in later epochs it suddenly drops.
As in the title:
Traceback (most recent call last):
File "D:\workspace\pytorch_hand_classify\pytorch_hand_classifier\classifier_test.py", line 18, in <module>
tester = Tester(model, params)
File "D:\workspace\pytorch_hand_classify\pytorch_hand_classifier\utils\Tester.py", line 34, in __init__
self._load_ckpt(ckpt)
File "D:\workspace\pytorch_hand_classify\pytorch_hand_classifier\utils\Tester.py", line 67, in _load_ckpt
self.model.load_state_dict(torch.load(ckpt))
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\serialization.py", line 367, in load
return _load(f, map_location, pickle_module)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\serialization.py", line 538, in _load
result = unpickler.load()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\serialization.py", line 504, in persistent_load
data_type(size), location)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\serialization.py", line 113, in default_restore_location
result = fn(storage, location)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\serialization.py", line 94, in _cuda_deserialize
device = validate_cuda_device(location)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\serialization.py", line 78, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
[Finished in 6.1s]
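The traceback shows `torch.load(ckpt)` being called without `map_location`, and the error message itself suggests passing `map_location='cpu'`. A minimal sketch of that change, written as a standalone helper rather than as a patch to `Tester._load_ckpt`; the name `load_weights` and the `nn.Linear` stand-in model are assumptions for illustration:

```python
import os
import tempfile

import torch
from torch import nn


def load_weights(model, ckpt_path):
    """Load a saved state_dict onto whichever device is actually available."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # map_location remaps storages that were saved from a CUDA device,
    # so a CPU-only machine can still read the file.
    state_dict = torch.load(ckpt_path, map_location=device)
    model.load_state_dict(state_dict)
    return model.to(device)


# Stand-in for the real network and checkpoint file.
source = nn.Linear(4, 2)
ckpt_path = os.path.join(tempfile.mkdtemp(), "weights.pth")
torch.save(source.state_dict(), ckpt_path)

restored = load_weights(nn.Linear(4, 2), ckpt_path)
print(next(restored.parameters()).device)
```

In the repository's code, the equivalent edit would presumably be adding the `map_location` argument to the `torch.load(ckpt)` call shown at line 67 of Tester.py.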
In trainer.py in the utils folder, should self.train.model() be placed after the following lines?
def _train_one_epoch(self):
    for step, (data, label) in enumerate(self.train_data):
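Assuming the call in question is the standard `nn.Module.train()` (written `self.train.model()` in the question), it only sets a mode flag on the model. Calling it once at the start of each training epoch is enough, as long as any validation pass that called `eval()` in between is followed by another `train()` call. The sketch below uses hypothetical function names and a small stand-in model to show the flag being toggled:

```python
import torch
from torch import nn


def train_one_epoch(model, batches, optimizer, loss_fn):
    # Once per epoch is enough: switches dropout and batch norm to training behavior.
    model.train()
    for data, label in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(data), label)
        loss.backward()
        optimizer.step()


def evaluate(model, batches):
    # Switches dropout and batch norm to inference behavior.
    model.eval()
    correct = 0
    with torch.no_grad():
        for data, label in batches:
            correct += (model(data).argmax(dim=1) == label).sum().item()
    return correct


model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(0.5), nn.Linear(8, 3))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(6, 4), torch.randint(0, 3, (6,)))]

train_one_epoch(model, batches, optimizer, nn.CrossEntropyLoss())
mode_after_train = model.training  # flag left set by model.train()
evaluate(model, batches)
mode_after_eval = model.training   # flag cleared by model.eval()
print(mode_after_train, mode_after_eval)
```

Placing the call inside the `for step` loop would also work, but it repeats the same flag assignment on every batch without changing the result.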