Comments (10)
from u-2-net.
Thanks for your detailed answer. I'm starting to be afraid I completely miss the point here; sorry if this question is too dumb.
As far as I can see, it doesn't matter what size RescaleT is called with during training: the input to net() in net(inputs_v) is always the crop size (288), since the crop happens after the rescale.
I understood that the crop is for data augmentation, but shouldn't the test size then be the same as the crop size?
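For reference, the training-time order of operations described above (rescale to 320, then random-crop to 288) can be sketched roughly as follows. This is a minimal numpy stand-in, not the repo's actual code: the real RescaleT uses skimage-based resizing, and the crop size/rescale size values are taken from the discussion above.

```python
import numpy as np

def rescale(img, size):
    # Nearest-neighbour resize to size x size (stand-in for RescaleT).
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def random_crop(img, size, rng=np.random):
    # Random size x size crop (stand-in for RandomCrop).
    h, w = img.shape[:2]
    top = rng.randint(0, h - size + 1)
    left = rng.randint(0, w - size + 1)
    return img[top:top + size, left:left + size]

# Training-time pipeline: rescale to 320, then crop to 288, so the
# network always sees 288x288 inputs during training.
img = np.zeros((480, 640, 3), dtype=np.uint8)
x = random_crop(rescale(img, 320), 288)
assert x.shape == (288, 288, 3)
```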
Thanks, now everything is clear and I could reproduce good results with arbitrary sizes!
Although I still wonder how the model can handle the same object when it is just further away or closer, if it's not scale invariant. Anyway, I won't bother you anymore, thanks again!
Thanks for your great work!
I am using this in an iOS application, converted to an MLModel. When I use the MLModel, I want inputs of arbitrary size, but it seems to support only square sizes; portrait or landscape shapes like 240x320 or 320x300 give an error.
What is the solution? Is there a problem with the conversion?
It is safer to resize all inputs to 320x320, which should theoretically give better results. Since there are several downsample and upsample operations, your sizes may trigger errors in those parts. So please share the error message; otherwise we can't suggest an exact solution.
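One plausible reason non-square sizes fail (an assumption, not confirmed by the error message, which wasn't posted): the encoder halves the spatial dimensions several times, and if a dimension is not divisible by the total downsampling factor, the upsampled decoder feature maps can fail to line up with the skip connections. A quick divisibility check, with `depth=5` as an assumed number of halvings:

```python
def safe_size(h, w, depth=5):
    # If h or w is not divisible by 2**depth, repeated downsampling
    # rounds the dimensions, and the decoder's upsampled feature maps
    # may no longer match the skip connections' shapes.
    factor = 2 ** depth
    return h % factor == 0 and w % factor == 0

assert safe_size(320, 320)      # the recommended size is safe
assert not safe_size(240, 300)  # 300 % 32 != 0 -> likely shape mismatch
```

Resizing (or padding) each side up to a multiple of the downsampling factor before inference would sidestep this class of error, at the cost of a resize back afterwards.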
-- Xuebin Qin, PhD, Department of Computing Science, University of Alberta, Edmonton, AB, Canada. Homepage: https://webdocs.cs.ualberta.ca/~xuebin/
Thanks for your reply. Is it better to use a square image than a portrait or landscape image with one side (height or width) set to 320? Is it difficult to support that?
"but should't be then the test size the same as used for crop?" RES: No. The networks are usually (theoretically) translation invariant but not scale invariant. The cropping mainly changes the translation. But it doesn't change the receptive fields. In both training and test, keeping the scaling consistent is necessary, while cropping isn't. Because most of the networks are not scale invariant. Besides, cropping in testing will introduce another problem. How can we achieve the complete prediction map of the whole input image in the testing process.
Hey, first of all, thank you for your work. Eagerly waiting to see your new paper (and model).
Regarding the fact that the input sizes are different for training (288x288 after cropping) and testing (320x320 after resizing), you say that scaling has to be consistent since models are generally not scale-invariant.
This brings me to the following questions:
- Since cropping is applied after resizing, the model gets 288x288 inputs during training, whereas during testing the inputs are 320x320. Since the images have different shapes, aren't their scales different (thus a case of scale variance)?
- Wouldn't it be better to apply percentage-based cropping (to account for different dataset image sizes) and then resize to 320x320? If this were done, the model would see the same input size (320x320) during both training and testing, keeping the scaling consistent.
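The percentage-based alternative proposed in the second question could be sketched like this (a hypothetical pipeline for illustration; `crop_frac=0.9` is an arbitrary choice, and the nearest-neighbour resize is a dependency-free stand-in for a proper interpolating resize):

```python
import numpy as np

def pct_crop_then_resize(img, crop_frac=0.9, size=320, rng=np.random):
    # Crop a fixed fraction of the image, THEN resize the crop to the
    # network's input size, so the model sees size x size inputs in
    # both training and testing.
    h, w = img.shape[:2]
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    top = rng.randint(0, h - ch + 1)
    left = rng.randint(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(size) * ch // size   # nearest-neighbour resize
    cols = np.arange(size) * cw // size
    return crop[rows][:, cols]

img = np.zeros((480, 640, 3), dtype=np.uint8)
assert pct_crop_then_resize(img).shape == (320, 320, 3)
```

Note this changes the effective object scale per sample (each crop is stretched by a slightly different factor), so it is a trade-off rather than a strict improvement over the repo's rescale-then-crop order.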
Once again, thanks a lot for your work. This model is a godsend.
I have been using it for my own background-removal module (trained on 720x720 images with an L2 loss to predict an alpha matte).
"but should't be then the test size the same as used for crop?" RES: No. The networks are usually (theoretically) translation invariant but not scale invariant. The cropping mainly changes the translation. But it doesn't change the receptive fields. In both training and test, keeping the scaling consistent is necessary, while cropping isn't. Because most of the networks are not scale invariant. Besides, cropping in testing will introduce another problem. How can we achieve the complete prediction map of the whole input image in the testing process.
…
On Mon, May 18, 2020 at 3:39 PM dinoByteBr @.***> wrote: thanks for your detailed answer, I start to be afraid completely miss the point here, sorry if this question is too dumb. as far as I see it, it doesn't matter with what size RescaleT is called in training, the inputs for net() is always the crop size (288) in net(inputs_v) -> crop happens after scale. I understood crop is for data aug, but should't be then the test size the same as used for crop? — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#22 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADSGORNLKWQZMJCHGAWXVWTRSGTIJANCNFSM4NEKY46A .
-- Xuebin Qin PhD Department of Computing Science University of Alberta, Edmonton, AB, Canada Homepage:https://webdocs.cs.ualberta.ca/~xuebin/Hey, first of all, thank you for your work. Eagerly waiting to see your new paper (and model).
Regarding the fact that the input sizes are different for training (288x288 after cropping) and training (320x320 after resizing), you say that scaling has to be consistent since generally models are not scale-invariant. This brings me to the following questions:-
- Since cropping is being applied after resizing, the model gets 288x288 sized input images during training, whereas during testing, the input images are 320x320. Since the images are of different shapes, aren't their scales different (thus leading to a case of scale-variance)?
- Wouldn't it be better to apply a percentage based cropping (to account for different dataset images) and then resizing them to 320x320? If this is done, the model would have the same input size (320x320) during both training and testing, thus keeping the scaling consistent.
Once again, thanks a lot for your work. This model is a godsend. I have been using it for my own background removal module (trained on 720x720 images and L2 loss to predict an alpha matte).
@xuebinqin I have the same doubts
Related Issues (20)
- How can I input video or webcam in the test.py script?
- Could you package the whole project (including the run scripts and pretrained models) with a .bat file, so it can be used with one click?
- Can this model only output binary segmentation results? Can it output multi-class results, e.g. three classes?
- Is it support person segmentation now?
- How to add evaluation metrics
- ImportError: cannot import name 'U2NET' from 'model'
- Can't Access Human Segmentation Model Weights
- Usage on cross platform mobile devices using TensorFlow or PyTorch (especially on Flutter)
- I need the foreground rather than the mask; how should I modify the code? Thanks
- ValueError: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array with array.copy().)
- u2net_train.py returning ValueError
- The effect of replacing the background with U-2-Net is not as expected, can anyone offer help
- Must label masks be strictly 0 or 1? If so, why are the binaries smooth? If a dataset's mask edges are in (0,1), does it still work?
- The results are not good when the object has small holes.
- Is there a way to keep only the size of the cut-out region instead of the original image size? Thanks
- num_workers=1
- u2net for semantic segmentation
- u2net trained on my own data with a normalized size of 320, but the matting results have obviously jagged edges
- PaddleHub is no longer under maintenance; what should I do if an error appears during installation
- How can I use my own GPU to train