Comments (15)
Zero padding means padding the image with zeros and then cropping it back to the original size. It's equivalent to translating the image and filling the vacated side with zeros. I think from this view it makes more sense.
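For concreteness, the pad-then-crop augmentation described above can be sketched in NumPy (a minimal sketch; the function name and the 4-pixel pad are illustrative, matching the usual CIFAR setup):

```python
import numpy as np

def pad_and_crop(img, pad=4, rng=None):
    """Zero-pad by `pad` pixels on each side, then take a random crop of the
    original size. Equivalent to a random translation of up to `pad` pixels,
    with the vacated border filled with zeros."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    return padded[top:top + h, left:left + w]

img = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
out = pad_and_crop(img, pad=4)
print(out.shape)  # (32, 32, 3)
```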
from densenet.
Oops, I see you've actually published the # of params in your paper.
Mind, btw, that Zagoruyko's numbers are wrong. He messed up the dataset szagoruyko/wide-residual-networks#17 (comment)
You can see that DenseNet performed worse on SVHN with 7M params vs. WideResNet with 2.7M. From your paper I see that's because Sergey didn't mess up the data in that one. Though you both only did [0..1] scaling, where mean/std standardization would probably have done better. Can't say for certain, but likely.
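For reference, the difference between plain [0..1] scaling and per-channel mean/std standardization mentioned above can be sketched as (NumPy sketch; the function names are illustrative):

```python
import numpy as np

def channel_stats(train_images):
    """Per-channel mean/std over the whole training set (N, H, W, C uint8)."""
    x = train_images.astype(np.float32) / 255.0
    return x.mean(axis=(0, 1, 2)), x.std(axis=(0, 1, 2))

def preprocess(img, mean=None, std=None):
    """Plain [0..1] scaling if no stats are given, else standardization
    with the training-set statistics."""
    x = img.astype(np.float32) / 255.0
    if mean is None:
        return x
    return (x - mean) / std

rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(64, 32, 32, 3), dtype=np.uint8)
mean, std = channel_stats(train)
z = preprocess(train[0], mean, std)  # standardized: ~zero mean, ~unit std
```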
On the other hand, one can always just go ahead and test the speed oneself. =) Though that's a last-resort measure, considering that info really belongs in tables/figures.
Mind, btw, that Zagoruyko's numbers are wrong. He messed up the dataset szagoruyko/wide-residual-networks#17 (comment)
Thanks for the reminder. Yeah, we read the Wide ResNet paper and knew that its preprocessing (whitened instead of only normalized) and data augmentation (reflect-padding instead of zero-padding) are both different from, and slightly heavier than, ours. But ours is more widely used and follows most publications (see the references in our paper).
We cannot rerun every baseline method, and we think it's fair to compare our model with Wide ResNet under the setting above.
We appreciate your reminder about #parameters and #computation. We'll consider including #computation in our next update.
For training time reference, our Densenet(L=40, k=12) with batch size 64 and 300 epochs takes about 7 hours to finish on one TITAN X GPU. This includes about 0.5 hour test time.
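As a rough sanity check on the #parameters side, a weight count for the basic DenseNet (no bottleneck or compression) can be sketched as follows. This is an estimate only: it ignores BN parameters and biases, and assumes the paper's setup of three dense blocks of (L-4)/3 layers, a 16-channel initial conv, and 1x1 transition convs that keep the channel count:

```python
def densenet_param_estimate(L=40, k=12, init_ch=16, classes=10):
    """Rough conv + classifier weight count for the basic DenseNet:
    each layer is a 3x3 conv from all accumulated features to k new ones;
    transitions are 1x1 convs; BN parameters and biases are ignored."""
    n = (L - 4) // 3                     # layers per dense block
    params = 3 * 3 * 3 * init_ch         # initial 3x3 conv from RGB
    ch = init_ch
    for block in range(3):
        for _ in range(n):
            params += 3 * 3 * ch * k     # 3x3 conv: ch inputs -> k outputs
            ch += k                      # new features are concatenated
        if block < 2:
            params += ch * ch            # 1x1 transition conv
    params += ch * classes               # global-pooled features -> classifier
    return params

print(densenet_param_estimate())  # 1001616, i.e. ~1.0M for L=40, k=12
```

This lands at roughly the 1.0M figure reported for DenseNet (L=40, k=12).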
Also, as you said, it's important to keep other settings the same when comparing different architectures. So we keep every hyperparameter and setting the same as the official ResNet implementation, and use standard preprocessing and data augmentation. We'd be interested in a comparison of your architecture with DenseNets under the same setting. But note that a set of hyperparameters and settings may be good for one architecture and bad for another.
Yes, of course you wouldn't rerun every model =) I'm just saying that, as you now know, some re-evaluation of how good DenseNet is vs. ResNet is needed, or at least will soon be. Zagoruyko is going to retest his model and, I'm guessing, update his paper accordingly.
Regarding mirroring/reflections, fb.resnet.torch uses those too: https://github.com/facebook/fb.resnet.torch/blob/master/datasets/cifar10.lua Actually, for images of such types it's the most default kind of preprocessing, much more usual than zero-padding. For images of numbers and letters, on the other hand, it's of course generally not used at all, except maybe for hand-selected parts of a dataset.
our Densenet(L=40, k=12) with batch size 64 and 300 epochs takes about 7 hours to finish on one TITAN X GPU
Yes, I actually saw that info on the main page. =) It's just that it would be more interesting to see it in a figure with dots mapped to accuracy and #parameters (#computation in another one). I'll suggest that to Sergey too.
Selecting proper hyperparameters is indeed important, and one should select the best hyperparameters when evaluating a model's performance. Though it's also part of a model's evaluation to tell how critically its performance depends on having some very exact hyperparameters.
for images of such types it's the most default kind of preprocessing, much more usual than zero-padding
It seems to me that fb.resnet.torch's code uses zero-padding instead of reflect-padding: https://github.com/facebook/fb.resnet.torch/blob/master/datasets/transforms.lua
And zero-padding is more common than reflection-padding (see the references in our paper).
For the SVHN dataset we followed Wide ResNet's preprocessing and data augmentation (i.e., no data augmentation).
Thanks for the information. Just to clarify: we didn't play any tricks and tried to keep the comparison as fair as possible.
https://github.com/facebook/fb.resnet.torch/blob/master/datasets/cifar10.lua
Notice
t.HorizontalFlip(0.5),
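The quoted transform has a straightforward NumPy counterpart (a sketch only; the function name is made up):

```python
import numpy as np

def horizontal_flip(img, p=0.5, rng=None):
    """Python counterpart of fb.resnet.torch's t.HorizontalFlip(0.5):
    mirror the image left-right with probability p."""
    rng = rng or np.random.default_rng()
    return img[:, ::-1] if rng.random() < p else img

img = np.arange(12).reshape(2, 2, 3)
flipped = horizontal_flip(img, p=1.0)  # force the flip for demonstration
print(np.array_equal(flipped, img[:, ::-1]))  # True
```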
As for which method is more popular: to tell the truth, I honestly don't know. Reflections just seem a more natural thing for this type of image, and for convolutional networks with a large pooling layer at the end, as in all modern architectures. You see, for some images the crop can remove an important part of the image, and the padding you add introduces informational noise. I don't even understand how padding actually improves anything, other than maybe giving more attention to the central part of the image and maybe helping the network learn to classify a cut part of an object as that object.
If you get a dataset where the object to classify is not located in the image center, but is often near the edges, my guess is zero-padding will do you no good.
Meanwhile, reflections can only hurt in some very rare cases, when the object is not symmetric and a mirror image of it should classify as a different object. In real life that's almost completely restricted to symbols and characters. So it's the least risky and most obvious type of augmentation.
Scaling and rotation seem to me much more meaningful augmentations than zero-padding for convolutional networks, but I guess for CIFAR they don't work very well, as the images there are tiny. They should work to some extent if you upscale the images first, though that would make the network much slower and probably defeat the purpose, as CIFAR is more of a quick playground to test ideas than actual data.
If people restricted themselves to testing on purely mirrored CIFAR without zero-padding, that would be of great help, IMHO: there are plenty of ways to zero-pad the dataset, and picking one makes no sense when the only purpose of CIFAR is to compare ideas, not to win some competition. I'd keep reflections, though, because they definitely can't hurt and are a tiny, unambiguous type of augmentation that helps networks that would otherwise tend to overfit very fast but are not inherently bad; actual datasets are usually much bigger and allow rotation, scaling and other better augmentations. Meanwhile, my guess is that zero-padding helps to a much smaller degree and is not a very fair type of augmentation, as in CIFAR the combined crops almost completely eliminate everything in the picture except the actual object to classify.
Okay, I just understood that we were talking about different things =D Reflection-padding means padding with non-zero pixels; now I get it. Yeah, reflection-padding is likely not a very good type of augmentation, IMO: you're feeding the net a lot of garbage data. With zero-padding you do that as well, but the net quickly learns that that data is garbage; with reflection-padding, I don't think it's that easy.
I just confused it with horizontal image mirroring.
Yeah, I just found that in WRN, and so in my code, it seems we used reflection padding. Lol.
I'll continue my investigations into CNNs on CIFAR without using any type of padding.
As for how zero-padding influences learning: I guess it tells the network that patterns from the cut-out parts should generally be ignored, and that the most important part of the object is in the patterns located in the middle. Quite a lot of additional info.
@ibmua regarding zero vs. reflection padding, you might also have a look at this opinion: https://twitter.com/karpathy/status/720622989289644033
As far as I see it, with zero-padding you're effectively adding gray color in place of an unimportant part of the image, thereby making the classifier more indifferent to the cut-out part and to such large gray patches. With reflection-padding you're adding very risky info: you still reap the benefit of making the classifier more indifferent to the cut-out part, but the info you add gets impressed into the weights, likely much more than with zero-padding. Maybe for some cases it works better, because it adds indifference to some background-ish patterns. But it's risky, IMHO.
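The two padding modes under discussion are easy to compare in one dimension with NumPy's `pad` (illustrative only):

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5])
# zero-padding: the border is filled with a constant "non-image" value
print(np.pad(row, 2, mode="constant"))  # [0 0 1 2 3 4 5 0 0]
# reflection-padding: the border mirrors real pixels back into the image
print(np.pad(row, 2, mode="reflect"))   # [3 2 1 2 3 4 5 4 3]
```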
Anyway, not cutting the image at all is more interesting to me from the perspective of evaluating NN quality. IMHO, cropping is probably only reasonable if you're cooking the final net for production. It's pretty orthogonal to the network model itself; it's just a way of using your knowledge about the exact dataset to make it universally easier to grasp by any type of regression model.
from densenet.