hysts / pytorch_image_classification
PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet
License: MIT License
This repo is very detailed and informative, thanks!
However, I am a little confused about WRN-20-4 in the results. The depth of a WRN should satisfy 6n + 4, and a depth of 20 does not. How was the WRN-20-4 result obtained?
https://github.com/hysts/pytorch_image_classification#results
Best!
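The depth check behind this question can be written out as a small sketch. The function name `blocks_per_stage` and its `offset` parameter are made up for illustration and are not part of the repository. With the 6n + 4 rule quoted above, a depth of 20 is rejected; it would only fit a 6n + 2 count, which is the convention used for CIFAR ResNets. Whether the results table counts depth that way is for the author to confirm.

```python
def blocks_per_stage(depth: int, offset: int = 4) -> int:
    """Return n such that depth == 6 * n + offset, or raise if no such n exists.

    offset=4 is the WRN rule quoted in the question; offset=2 is the
    CIFAR ResNet convention. Both the name and the parameter are hypothetical.
    """
    n, remainder = divmod(depth - offset, 6)
    if n < 1 or remainder != 0:
        raise ValueError(f"depth {depth} is not of the form 6n+{offset}")
    return n


if __name__ == "__main__":
    print(blocks_per_stage(28))            # 28 = 6*4 + 4
    print(blocks_per_stage(20, offset=2))  # 20 = 6*3 + 2
    try:
        blocks_per_stage(20)               # 20 is not 6n + 4
    except ValueError as err:
        print(err)
```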
I have a dataset made up of many folders; each folder is one class and contains that class's images. I want to train on my own dataset, but I don't know how to set up the data structure. Thank you so much!
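For a folder-per-class layout like the one described, labels can be derived from the folder names. The sketch below is not taken from this repository: `index_image_folder` and `IMAGE_SUFFIXES` are hypothetical names, and the rule of sorting class folders alphabetically and numbering them from 0 simply mirrors the convention used by torchvision's `ImageFolder`.

```python
from pathlib import Path

# Hypothetical set of file extensions treated as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def index_image_folder(root):
    """Map each class folder under `root` to an integer label.

    Returns (class_to_idx, samples), where samples is a list of
    (image_path, label) pairs sorted by class and then by path.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for path in sorted((root / name).rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_to_idx[name]))
    return class_to_idx, samples
```

Calling `index_image_folder("/path/to/train")` on a directory containing one subfolder per class returns the label mapping together with the labelled file list, which a custom dataset class could then load.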
Thank you for your excellent work and for sharing it!
The results for PyramidNet and ResNeXt are very good.
However, I want to try more models, such as EfficientNet. The GitHub project pytorch-image-models that you mentioned before has many models, but it is built for training and testing on open datasets and is not well suited to running on our own dataset. Even after I changed the code to run on my own dataset, the results were worse than with your project using the same models.
So, could you give me some advice on how to add EfficientNet from pytorch-image-models to your project for ImageNet?
Thanks a lot!
How do I create the labels?
Are there training logs available?
First of all, thanks for the great repository!
I think there might be a slight inaccuracy in the way VGG is defined here: https://github.com/hysts/pytorch_image_classification/blob/master/pytorch_image_classification/models/cifar/vgg.py
Specifically, in the _make_stage function:
    def _make_stage(self, in_channels, out_channels, n_blocks):
        stage = nn.Sequential()
        for index in range(n_blocks):
            if index == 0:
                conv = nn.Conv2d(
                    in_channels,
                    out_channels,
                    kernel_size=3,
                    stride=1,
                    padding=1,
                )
            else:
                conv = nn.Conv2d(
                    out_channels,
                    out_channels,
                    kernel_size=3,
                    stride=1,
                    padding=1,
                )
            stage.add_module(f'conv{index}', conv)
            if self.use_bn:
                stage.add_module(f'bn{index}', nn.BatchNorm2d(out_channels))
            stage.add_module('relu', nn.ReLU(inplace=True))
        stage.add_module('pool', nn.MaxPool2d(kernel_size=2, stride=2))
        return stage
I think that currently only one relu module is being added, i.e. not one for each value of index, whereas in the VGG paper there should be a ReLU for each value of index. It seems the fix is just to change the line
    stage.add_module('relu', nn.ReLU(inplace=True))
to
    stage.add_module(f'relu{index}', nn.ReLU(inplace=True))
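The effect of reusing the name 'relu' can be illustrated without PyTorch. The sketch below assumes that add_module stores submodules in an ordered mapping keyed by name, so that adding a second module under an existing name replaces the entry in place rather than appending a new one. It models that mapping with an OrderedDict; `stage_layout` is a hypothetical helper, not code from the repository.

```python
from collections import OrderedDict


def stage_layout(n_blocks, use_bn=True, unique_relu_names=False):
    """Return the layer names of one stage, in execution order.

    An OrderedDict stands in for the name-keyed module registry
    (an assumption about how add_module behaves).
    """
    modules = OrderedDict()
    for index in range(n_blocks):
        modules[f"conv{index}"] = "Conv2d"
        if use_bn:
            modules[f"bn{index}"] = "BatchNorm2d"
        relu_name = f"relu{index}" if unique_relu_names else "relu"
        # Re-assigning an existing key keeps its original position.
        modules[relu_name] = "ReLU"
    modules["pool"] = "MaxPool2d"
    return list(modules)


if __name__ == "__main__":
    print(stage_layout(2))
    print(stage_layout(2, unique_relu_names=True))
```

Under that assumption, the shared name leaves a single ReLU after the first block and none after the later ones, while the per-index names give one ReLU per convolution.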
I'm running into the following problem when running the command from the README and would really appreciate your help:
Traceback (most recent call last):
File "train.py", line 445, in <module>
main()
File "train.py", line 371, in main
model, optimizer, opt_level=config.train.precision)
File "/mnt/cephfs/training/users/lilujun/miniconda3/envs/py/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/amp/frontend.py", line 358, in initialize
return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
File "/mnt/cephfs/training/users/lilujun/miniconda3/envs/py/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/amp/_initialize.py", line 171, in _initialize
check_params_fp32(models)
File "/mnt/cephfs/training/users/lilujun/miniconda3/envs/py/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/amp/_initialize.py", line 116, in check_params_fp32
name, buf.type()))
File "/mnt/cephfs/training/users/lilujun/miniconda3/envs/py/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/amp/_amp_state.py", line 32, in warn_or_err
raise RuntimeError(msg)
RuntimeError: Found buffer total_ops with type torch.DoubleTensor, expected torch.cuda.FloatTensor.
When using amp.initialize, you need to provide a model with buffers
located on a CUDA device before passing it no matter what optimization level
you chose. Use model.to('cuda') to use the default device.
se_resnet_preact.yaml: model.resnet_preact should be model.se_resnet_preact
se_resnet_preact.py: in initialize_weights(), change all module.biasd.* to module.bias.* (without d)
The normalization step might be wrong:
https://github.com/hysts/pytorch_image_classification/blob/master/dataloader.py#L134
The correct order is the one used here:
https://github.com/kuangliu/pytorch-cifar/blob/master/main.py#L35
Any idea?
Hi,
I am new to this area. Could you please give me some suggestions on how to train the model using my own dataset? Thanks in advance.
Kind regards,
Sean
Hello, how do I set up distributed training? Thanks.
Hello, what is the test command?
If I want to use the trained model to test on my own test dataset, what should I do?
Thanks!
I have a folder containing the images that need to be evaluated. The images are stored as follows.
The train dataset folder is:
/path/female_bag/female_bag*.jpg,
/path/makeup_bag/makeup_bag*.jpg,
The test dataset folder is:
/path/female_bag/female_bag*.jpg,
/path/makeup_bag/makeup_bag*.jpg,
How should I write the code to test those images?
Hi @hysts, I've successfully trained and evaluated with my own dataset. Now I want to test on a single image, but I'm not sure which files I need to change. Can you help me with this? I've tried changing some files and functions, but it doesn't work.
Thank you so much!
Hi, in your reported results, are these (1) best performing test accuracies without a validation set, (2) final test accuracies on the last epoch of training, or (3) test accuracies on the model checkpoint with the best performing validation accuracy?
https://github.com/hysts/pytorch_image_classification/blob/master/evaluate.py#L22
When I use evaluate.py, it cannot find CheckPointer.
When I try to use my modified evaluate.py to evaluate my own dataset, this error always comes up first. Do you know how I could change the code? I have already added transforms.ToTensor() in evaluate.py and changed the code in dataset.py to return self.transform(self.x[index]), self.transform(self.y[index]).
Is there any other way to eliminate the error?
Hi,
Any chance to save the weights of those trainings?
Thank you for your excellent work and for sharing it!
I have my own dataset of single-channel 64x64 grayscale images.
For every network (vgg16, resnet18, ...), if I set n_channels: 1 in the YAML file, the following error is shown:
Traceback (most recent call last):
File "train.py", line 436, in <module>
main()
File "train.py", line 404, in main
validate(0, config, model, val_loss, val_loader, logger,
File "train.py", line 259, in validate
outputs = model(data)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/apex/amp/_initialize.py", line 196, in new_fwd
output = old_fwd(*applier(args, input_caster),
File "/media/zzks/xi/2020PJ/lab/pytorch_image_classification/pytorch_image_classification/models/imagenet/vgg.py", line 80, in forward
x = self._forward_conv(x)
File "/media/zzks/xi/2020PJ/lab/pytorch_image_classification/pytorch_image_classification/models/imagenet/vgg.py", line 72, in _forward_conv
x = self.stage1(x)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)
File "/home/zzks/anaconda3/envs/dbnet/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 341, in conv2d_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Given groups=1, weight of size 64 1 3 3, expected input[100, 3, 64, 64] to have 1 channels, but got 3 channels instead
I can't find a way to resume training from an earlier epoch.
pytorch_image_classification/train.py
Line 573 in 1d76092
I notice that if no_weight_decay_on_bn is set to True, weight decay is applied only to conv.weight. It seems that weight decay on the fc layers is removed at the same time. Is there any reason for this?
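For comparison, one common alternative grouping (not the repository's code) decides by parameter shape instead of by layer type: tensors with more than one dimension, such as conv and fc weights, keep weight decay, while one-dimensional tensors, such as BatchNorm scales and all biases, do not. The sketch below works on plain (name, shape) pairs so that it runs without PyTorch; `split_decay_groups` and the example names are hypothetical.

```python
def split_decay_groups(named_shapes):
    """Split parameter names into (decay, no_decay) lists by dimensionality.

    named_shapes: iterable of (name, shape) pairs, where shape is a tuple.
    Parameters with more than one dimension get weight decay.
    """
    decay, no_decay = [], []
    for name, shape in named_shapes:
        if len(shape) > 1:
            decay.append(name)
        else:
            no_decay.append(name)
    return decay, no_decay


if __name__ == "__main__":
    # Hypothetical parameter names and shapes for a tiny network.
    params = [
        ("conv.weight", (64, 3, 3, 3)),
        ("bn.weight", (64,)),
        ("bn.bias", (64,)),
        ("fc.weight", (10, 64)),
        ("fc.bias", (10,)),
    ]
    print(split_decay_groups(params))
```

In a PyTorch training script, the two lists would typically become two optimizer parameter groups, the second with weight_decay set to 0. Note that this rule also exempts biases, which goes beyond what the option name no_weight_decay_on_bn suggests.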
Hi,
I was trying to run the code from terminal directly using
pip install -r requirements.txt
python train.py --config configs/cifar/resnet.yaml
However, I kept getting this assertion error:
AssertionError: Key env_info.cuda_version with value <class 'NoneType'> is not a valid type;
valid types: {<class 'bool'>, <class 'float'>, <class 'tuple'>, <class 'int'>, <class 'list'>, <class 'str'>}
Below is the traceback message:
Traceback (most recent call last):
File "train.py", line 436, in <module>
main()
File "train.py", line 340, in main
save_config(get_env_info(config), output_dir / 'env.yaml')
File "/Users/charlotte/Desktop/classification/pytorch_image_classification/utils/env_info.py", line 19, in get_env_info
return ConfigNode({'env_info': info})
File "/Users/charlotte/Desktop/classification/pytorch_image_classification/config/config_node.py", line 6, in __init__
super().__init__(init_dict, key_list, new_allowed)
File "/Users/charlotte/opt/miniconda3/lib/python3.7/site-packages/yacs/config.py", line 86, in __init__
init_dict = self._create_config_tree_from_dict(init_dict, key_list)
File "/Users/charlotte/opt/miniconda3/lib/python3.7/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
dic[k] = cls(v, key_list=key_list + [k])
File "/Users/charlotte/Desktop/classification/pytorch_image_classification/config/config_node.py", line 6, in __init__
super().__init__(init_dict, key_list, new_allowed)
File "/Users/charlotte/opt/miniconda3/lib/python3.7/site-packages/yacs/config.py", line 86, in __init__
init_dict = self._create_config_tree_from_dict(init_dict, key_list)
File "/Users/charlotte/opt/miniconda3/lib/python3.7/site-packages/yacs/config.py", line 132, in _create_config_tree_from_dict
".".join(key_list + [str(k)]), type(v), _VALID_TYPES
File "/Users/charlotte/opt/miniconda3/lib/python3.7/site-packages/yacs/config.py", line 525, in _assert_with_logging
assert cond, msg
Can you please give me some hints about how to fix this? Thanks!
I'm on my Ubuntu server (without a GPU), trying to run your script.
pip install -r requirements.txt succeeded (I skipped the apex requirement, since I don't have a GPU right now),
but when I ran the following command, I hit the issue below,
and all the other .yaml configs gave me the same error.
~/code/pytorch_image_classification# python train.py --config configs/cifar/resnet.yaml
Traceback (most recent call last):
File "train.py", line 449, in <module>
main()
File "train.py", line 353, in main
save_config(get_env_info(config), output_dir / 'env.yaml')
File "/root/code/pytorch_image_classification/pytorch_image_classification/utils/env_info.py", line 19, in get_env_info
return ConfigNode({'env_info': info})
File "/root/code/pytorch_image_classification/pytorch_image_classification/config/config_node.py", line 6, in __init__
super().__init__(init_dict, key_list, new_allowed)
File "/root/archiconda3/envs/python38/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
init_dict = self._create_config_tree_from_dict(init_dict, key_list)
File "/root/archiconda3/envs/python38/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
dic[k] = cls(v, key_list=key_list + [k])
File "/root/code/pytorch_image_classification/pytorch_image_classification/config/config_node.py", line 6, in __init__
super().__init__(init_dict, key_list, new_allowed)
File "/root/archiconda3/envs/python38/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
init_dict = self._create_config_tree_from_dict(init_dict, key_list)
File "/root/archiconda3/envs/python38/lib/python3.8/site-packages/yacs/config.py", line 129, in _create_config_tree_from_dict
_assert_with_logging(
File "/root/archiconda3/envs/python38/lib/python3.8/site-packages/yacs/config.py", line 545, in _assert_with_logging
assert cond, msg
AssertionError: Key env_info.pytorch_version with value <class 'torch.torch_version.TorchVersion'> is not a valid type; valid types: {<class 'NoneType'>, <class 'list'>, <class 'bool'>, <class 'int'>, <class 'tuple'>, <class 'float'>, <class 'str'>}
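Judging by the two assertion messages (this one for env_info.pytorch_version and the earlier one for env_info.cuda_version), yacs rejects any value whose exact type is not in its valid-type set, which here includes None on one installation and a version object on the other. A possible workaround, sketched below under that assumption, is to coerce such values to plain strings before the dictionary is handed to ConfigNode. `sanitize_env_info` and `VALID_TYPES` are hypothetical names, not part of the repository or of yacs.

```python
# Types that both error messages list as accepted (NoneType is left out
# because only one of the two installations accepts it).
VALID_TYPES = (bool, int, float, str, list, tuple)


def sanitize_env_info(info):
    """Return a copy of `info` in which unsupported values become strings.

    The check uses the exact type, so subclasses of str (or any other
    custom object, and None) are converted with str().
    """
    return {
        key: value if type(value) in VALID_TYPES else str(value)
        for key, value in info.items()
    }


if __name__ == "__main__":
    class FakeVersion(str):
        """Stand-in for a version object that subclasses str."""

    raw = {"pytorch_version": FakeVersion("1.11.0"), "cuda_version": None, "num_gpus": 0}
    print(sanitize_env_info(raw))
```

Going by the traceback, the place to apply this would be just before `ConfigNode({'env_info': info})` in utils/env_info.py, although that has not been verified against the repository.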
Thanks, hysts, for sharing the package.
One question: where can I find the error rate for each epoch?