kaixin96 / panet
Code for our ICCV 2019 paper PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment
I measured the inference time with the code below.
'''
time_s = time.time()
query_pred, _ = model(support_images, support_fg_mask, support_bg_mask,
query_images)
time_elapsed = time.time() - time_s
'''
I found that the model can run at more than 30 FPS. Is that right?
I thought instance segmentation was usually time-consuming.
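One thing worth checking when timing a GPU forward pass this way: CUDA kernels are launched asynchronously, so time.time() around a single call may stop the clock before the work has finished. A minimal sketch of a fairer measurement, with warm-up iterations and an optional synchronization hook, is below. The helper name measure_fps and its parameters are hypothetical and not part of the PANet code; with PyTorch one would pass torch.cuda.synchronize as sync.

```python
import time

def measure_fps(fn, sync=None, warmup=5, iters=20):
    """Return the average number of fn() calls per second.

    sync is an optional zero-argument callable that blocks until pending
    work is done (e.g. torch.cuda.synchronize when timing on a GPU).
    """
    for _ in range(warmup):          # warm-up runs are not timed
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()                       # wait for queued work before stopping the clock
    elapsed = time.perf_counter() - start
    return iters / elapsed

# Stand-in workload; replace with a lambda that calls the model on one episode.
fps = measure_fps(lambda: sum(range(10_000)))
print(f"{fps:.1f} calls per second")
```

Averaging over several iterations also smooths out one-off costs such as memory allocation on the first call.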
Hi, thanks for open-sourcing the code. How long does training take? @kaixin96
Hi Kaixin,
It seems that during both training and testing, the support and query images are resized to a fixed size (e.g. [417, 417]). However, in many few-shot segmentation works, the segmentation mask output is resized to the original image resolution for evaluation. How can I get the original query images and the corresponding ground-truth masks during the testing phase?
I also printed some unused key-value pairs in the dataloader dictionary, but they all seem to have the same fixed shape:
sample_batched['support_images_t'][0][0].shape = torch.Size([1, 3, 417, 417])
sample_batched['query_images_t'][0].shape = torch.Size([1, 3, 417, 417])
sample_batched['query_masks'][0][0].shape = torch.Size([1, 1, 417, 417])
sample_batched['query_labels'][0].shape = torch.Size([1, 417, 417])
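If the original image size is known, a predicted label mask can be scaled back to it with nearest-neighbour resizing, which keeps the values as valid class labels. A minimal pure-Python sketch of that idea follows; the function name and the toy mask are made up for illustration, and whether the dataloader exposes the original size is not something this sketch answers. With PyTorch tensors, torch.nn.functional.interpolate would typically play this role.

```python
def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2-D label mask given as a list of rows."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# Toy 2x2 prediction scaled up to a 4x4 "original" resolution.
pred = [[0, 1],
        [1, 1]]
print(resize_nearest(pred, 4, 4))
# [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
```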
Thanks for sharing your code. It is really great work! However, the accuracy I got with this code is lower than that reported in the paper. Could you share the hyperparameter settings for the VOC2012 dataset?
Hi there,
I'm working on semantic segmentation on the IDD dataset. PANet looks like a good model for few-shot semantic segmentation and is suitable for my task. It would be really helpful if you could explain how to train PANet on my dataset (IDD).
Thanks.
Hello
Why does the optimizer definition in train.py use **_config instead of _config?
optimizer = torch.optim.SGD(model.parameters(), **_config['optim'])
This caused an error when I tried to run train.py:
optimizer = torch.optim.SGD(model.parameters(), **_config['optim'])
KeyError: 'optim'
Thx for the help!
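For context, ** in a call unpacks a dict into keyword arguments, so **_config['optim'] passes each entry of the 'optim' sub-dict as a named parameter. The KeyError suggests that the config seen at run time has no 'optim' entry at all. A small sketch with a stand-in function (make_optimizer and the values shown are hypothetical, chosen only for illustration):

```python
def make_optimizer(params, lr, momentum=0.0, weight_decay=0.0):
    # Stand-in for an optimizer constructor; returns its arguments for inspection.
    return {"params": params, "lr": lr, "momentum": momentum,
            "weight_decay": weight_decay}

# Hypothetical values, not the repository's defaults.
config = {"optim": {"lr": 1e-3, "momentum": 0.9, "weight_decay": 5e-4}}

# Equivalent to make_optimizer([], lr=1e-3, momentum=0.9, weight_decay=5e-4)
opt = make_optimizer([], **config["optim"])
print(opt["lr"], opt["momentum"])

# A config without the key fails on the lookup, before the call is made.
broken_config = {}
try:
    make_optimizer([], **broken_config["optim"])
except KeyError as err:
    print("missing config entry:", err)
```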
Hi Kaixin,
Thank you for releasing the code. It shows the transparency of your work. Can you share the environment.yml file for PANet? It would be a great help!
Hi,
thanks for sharing the code.
I checked the files in the ScribbleAugAuto folder. The annotations there are just lines and do not form closed shapes. How can I generate annotations similar to Figure 6?
If that is not possible, what value of scribble_dilation in the config file did you use at test time? In my tests, performance is poor when the value is 0. Did I overlook something?
I look forward to your reply.
Regards
Hello. When I switched the backbone to ResNet, with the last two layers replaced by dilation-2 and dilation-4 convolutions and all other settings unchanged, the accuracy was much lower than with VGG. Should I change some hyperparameters, such as the learning rate or the number of training steps? Thank you.
Hello, thanks for open-sourcing your nice work! Did you explore other encoder backbones, for example ResNet-50? I used ResNet-50 as the encoder but got a worse mIoU, specifically 40.7%. What do you think of this?
Hi,
I was looking through the code but couldn't figure out what "scribble" indicates.
From dataloaders/customized.py
Line 103
support_scribbles = [[paired_sample[cumsum_idx[i] + j]['scribble'] for j in range(n_shots)]
                     for i in range(n_ways)]
Could you clarify what support_scribbles means? Also, what is "image_t"?
From dataloaders/customized.py
Line 93
support_images_t = [[paired_sample[cumsum_idx[i] + j]['image_t'] for j in range(n_shots)]
                    for i in range(n_ways)]
Is it the segmentation target mask of the given image, while label indicates the class index of the mask?
Thanks in advance,
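Reading the two comprehensions above, they appear to regroup a flat list of samples into a nested [way][shot] list. A minimal sketch of that indexing, assuming cumsum_idx holds the starting offset of each class's block in the flat list (the toy samples and the way cumsum_idx is built here are assumptions, not taken from the repository):

```python
from itertools import accumulate

n_ways, n_shots, n_queries = 2, 3, 1

# Toy flat list: for each class, n_shots support samples then n_queries query samples.
paired_sample = [
    {"image": f"class{c}_item{k}", "scribble": f"scribble_c{c}_k{k}"}
    for c in range(n_ways)
    for k in range(n_shots + n_queries)
]

# Starting offset of each class block: [0, 4, 8] in this toy setup.
block_sizes = [n_shots + n_queries] * n_ways
cumsum_idx = [0] + list(accumulate(block_sizes))

# Same pattern as in the question: outer index is the way, inner index is the shot.
support_scribbles = [
    [paired_sample[cumsum_idx[i] + j]["scribble"] for j in range(n_shots)]
    for i in range(n_ways)
]
print(support_scribbles[1][0])  # scribble_c1_k0
```

The result has n_ways inner lists of n_shots items each, so support_scribbles[i][j] is the j-th support sample of the i-th class.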
Hi! I am interested in your awesome work but have limited hardware resources.
May I ask which CPU and GPU(s) you used? Thanks!
Hi, thanks for the source code, it really helps with my ongoing project.
Can you give me some hints on how to visualize the predictions from the network? The results in the paper look good, and I am also interested in which package was used to visualize the output.
I was using TensorBoard to visualize the input images and masks, and it works fine. But for the predictions made by the network, I have little idea whether I should do some post-processing, even though the paper says no post-processing or decoding is needed. The output looks like a greyscale heatmap rather than a black-and-white mask.
BTW, the training loss curve looks exactly as the paper describes, and the test result is also close to the paper's.
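A greyscale heatmap is typical of per-class score maps. One common way to get a mask from such scores is to take the arg-max over the class axis and then map each label to a color. A minimal pure-Python sketch on a tiny 2x3 example follows; the scores, palette, and function names are made up for illustration, and with tensors the first step would typically be an argmax over the class dimension.

```python
def scores_to_labels(scores):
    """scores[c][y][x] -> labels[y][x], choosing the highest-scoring class."""
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]

def colorize(labels, palette):
    """Map each label to an RGB tuple so the mask can be saved or overlaid."""
    return [[palette[label] for label in row] for row in labels]

# Two classes (0 = background, 1 = foreground) on a 2x3 image.
scores = [
    [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3]],   # background scores
    [[0.1, 0.8, 0.4], [0.9, 0.2, 0.7]],   # foreground scores
]
palette = {0: (0, 0, 0), 1: (255, 0, 0)}

labels = scores_to_labels(scores)
print(labels)  # [[0, 1, 0], [1, 0, 1]]
rgb = colorize(labels, palette)
```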
Hi, this has been very helpful for my ongoing project.
However, is there any plan to simplify the dataloader? Transferring the model to other datasets is really inconvenient.
Another problem: increasing the batch size dramatically increases VRAM usage. I am not sure whether this is normal. If it is, is there any way to optimize it?
@kaixin96 How can I run your code on two GPUs with batch_size set to 4 for faster training? Could you provide the corresponding modifications and run command?
Thanks for your great work!
Could you give me some instructions on data preparation for the COCO dataset?
Thanks for your work.
I ran your code and got the following error. Could you give me a clue on how to solve it?
sacred.utils.ConfigAddedError: Added new config entry that is not used anywhere
Conflicting configuration values:
model.sparse=False
Hi Kaixin, sorry to trouble you. May I ask what your 'SBD' folder is and how I can get it? Sorry if I missed something. From my understanding, it seems you already have the SBD folder in place, process its contents, and save them to the SegmentationClassAug folder. Thank you very much.
sbd_instance_process.py
and transform it from .mat to .png. Then transformed
images will be saved in VOC data folder. The name of
the new folder is "SegmentationObjectAug"
Hi @kaixin96, I was trying to run this code and ended up with an out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 340.00 MiB (GPU 0; 1.96 GiB total capacity; 946.64 MiB already allocated; 23.38 MiB free; 1.33 GiB reserved in total by PyTorch)
I noticed that the batch size is already 1 so I cannot reduce it further, is there a workaround?
Or is there any way I can run it in an interactive environment like Colab? As far as I know, Sacred doesn't support interactive environments.
Your help would be greatly appreciated.
Hi @kaixin,
Thank you for sharing the code; it is helpful for me. However, the performance is bad on my dataset, which has small targets, as follows:
There is a crack defect in this picture. The mIoU is only about 0.1 and the visualization is very bad. Do you have any suggestions about this? Thank you very much!
BTW: the image size is 1488*1488 and the format is .bmp.
Thanks for your wonderful work!
I have a question about the training code.
I trained and tested without any modification, but I got only 41 mIoU.
What is the problem?
INFO - main - ----- Final Result -----
INFO - main - classIoU mean: [0.56933684 0.50324984 0.51414957 0.32956392 0.16224402]
INFO - main - classIoU std: [0.02441738 0.02545722 0.02529587 0.01004774 0.01762301]
INFO - main - meanIoU mean: 0.41570883715177515
INFO - main - meanIoU std: 0.005918026601930923
INFO - main - classIoU_binary mean: [0.88759253 0.43259521]
INFO - main - classIoU_binary std: [0.00280747 0.00433909]
INFO - main - meanIoU_binary mean: 0.66009386980461
INFO - main - meanIoU_binary std: 0.0024477935615900745
INFO - PANet - Completed after 0:14:51
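The two summary numbers in this log can be reproduced from the per-class lines, which shows what each one averages: meanIoU is the mean over the five class IoUs, while meanIoU_binary is the mean over the two entries of classIoU_binary. Judging by their magnitudes, those two entries are presumably background and merged foreground, but that reading is an assumption. A quick check in Python:

```python
# Values copied from the log above.
class_iou = [0.56933684, 0.50324984, 0.51414957, 0.32956392, 0.16224402]
class_iou_binary = [0.88759253, 0.43259521]

mean_iou = sum(class_iou) / len(class_iou)
mean_iou_binary = sum(class_iou_binary) / len(class_iou_binary)

print(round(mean_iou, 4))         # 0.4157
print(round(mean_iou_binary, 4))  # 0.6601
```

Under that reading, the higher binary figure would largely reflect the high first entry being included in the average.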
Hi @kaixin96
I can't find the JPEGImages folder, as the error message shows:
FileNotFoundError: [Errno 2] No such file or directory: './data/VOCdevkit/VOC2012/JPEGImages/2011_000997.jpg'
Where can I find the whole dataset?
Thanks.
I increased my shot from 5 to 200, but the accuracy didn't improve
Why doesn't accuracy improve when the number of shots is increased in few-shot learning? Raising the shot count from 5 to 200 did not improve accuracy.
Hi,
I read the paper and code but not sure why the query label's values are changed here:
query_labels_tmp = [torch.zeros_like(x) for x in query_labels]
for i, query_label_tmp in enumerate(query_labels_tmp):
    query_label_tmp[query_labels[i] == 255] = 255
    for j in range(n_ways):
        query_labels_tmp[query_labels[i] == class_ids[j]] = j + 1
My other question is why query_label_tmp is returned instead of query_labels_tmp?
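As I read it, the snippet is meant to remap dataset-wide class IDs to episode-local indices: 0 for background, j + 1 for the j-th sampled class, and 255 kept as the ignore value. A minimal list-based sketch of that mapping (the function name and toy label map are hypothetical; the original operates on tensors):

```python
IGNORE = 255

def remap_labels(label_map, class_ids):
    """Map global class IDs to 1..n_ways; everything else becomes 0, ignore stays 255."""
    local_index = {cid: j + 1 for j, cid in enumerate(class_ids)}
    return [
        [IGNORE if v == IGNORE else local_index.get(v, 0) for v in row]
        for row in label_map
    ]

# Episode with two sampled classes, global IDs 7 and 12.
query_label = [
    [0, 7, 7, 255],
    [12, 12, 3, 0],
]
print(remap_labels(query_label, class_ids=[7, 12]))
# [[0, 1, 1, 255], [2, 2, 0, 0]]
```

Class 3 is not part of this toy episode, so it falls back to 0 along with the true background pixels.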
Sir, I'd like to use PANet to do few-shot segmentation on a dataset I've gathered. I looked over the code but couldn't figure out how to train and test the model with an external dataset. Could you please include a script for this?
Thank you for your great work.
I have a question regarding the evaluation. As mentioned in the paper, the evaluation is computed by "average the results from 5 runs with different random seeds, each run containing 1,000 episodes."
In this case, do you keep the 5*1000 episodes fixed when you run the baseline methods (such as PL[4] and SG-One[28]) and ablation studies? And in the 2-way setting, is "person" class always held as in PL[4]?
Thanks for sharing your work! I am a little confused about the metrics. The code says that the mean IoU binary treats all classes as one foreground class. But why do mean IoU and mean IoU binary differ in the 1-way setting? There is only one class, which is the foreground class, right?
Have you tried any other backbones to evaluate the effect of your method? How did they perform?
Hi Kaixin,
May I know why the batch size is set to 1? Did you try a larger batch size in your experiments, and does it have any effect?
Thanks.
Hello, thanks for publicly sharing the code of your paper. I was going over the config.py file to see if the default settings match what is reported in the paper, but noticed that n_runs = 5. Shouldn't it be set to 4, as the reported accuracy is apparently for 4 splits?
I am under the assumption that for each run, the number reported in the paper (e.g. split-1 for run 0, split-2 for run 1, etc.) is the meanIoU as printed in the outputs of test.py. And by setting label_sets = 0, one should be able to run inference on the same splits as reported in the paper (seemingly by also setting n_runs = 4, as mentioned above). I would appreciate some clarification.
I would like to ask the author: what were the labels in ScribbleAugAuto generated by?
Hi, kaixin:
I can't find 'SegmentationClassAug' and file 'segmentation' in PASCAL or SBD.
PASCAL: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html
SBD: http://home.bharathh.info/pubs/codes/SBD/download.html
Where did you download it and what is the source of this dataset?
jachymchen
Thanks for publishing the code.
I want to try test.py but when I run it, I get this error:
INFO - PANet - Running command 'main'
INFO - PANet - Started run with ID "2"
INFO - main - ###### Create model ######
ERROR - PANet - Failed after 0:00:13!
Traceback (most recent calls WITHOUT Sacred internals):
  File "/content/drive/MyDrive/sofa/PANet/test.py", line 42, in main
    model.load_state_dict(torch.load(_config['snapshot'], map_location='cpu'))
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 271, in _open_file_like
    return open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './runs/PANet_VOC_sets_0_1way_1shot[train]/1/snapshots/30000.pth'
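The traceback says the snapshot file referenced by the config does not exist at that path, which usually means training has not been run first or it wrote to a different run ID. A small sketch for listing which snapshots are actually present, assuming the runs/<experiment>/<run id>/snapshots/*.pth layout seen in the error message (the helper name is hypothetical):

```python
from pathlib import Path

def find_snapshots(runs_dir="./runs"):
    """Return all .pth files under <runs_dir>/<experiment>/<run id>/snapshots/."""
    root = Path(runs_dir)
    if not root.is_dir():
        return []
    return sorted(root.glob("*/*/snapshots/*.pth"))

for path in find_snapshots():
    print(path)
```

If a snapshot shows up under another run ID or iteration count, the snapshot entry in the config can be pointed at that file instead.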
Hello, and first of all thank you for the open source code.
I believe that the size of fore_mask and back_mask is Wa x Sh x B x H x W, not Wa x Sh x B x H' x W'. Please correct me if I'm wrong.
Lines 64 to 67 in 39815f7
Hello, I get the following warning during training, but I don't know what to change:
"UserWarning: ndarray is defined by reference to an object we do not know how to serialize. A deep copy is serialized instead, breaking memory aliasing.
warnings.warn(msg)"
Lines 155 to 159 in c248b25
In this sampling operation, 'k' is set to 'n_elements'. I think 'k' here should be the number of classes, but 'n_elements' is the number of images per class. Is this wrong?
Hello,
Thank you for your work.
I would like to test your model on the other validation settings.
Could you provide the trained weights of your model?
Ahyun Seo
Hi, thanks for your great work!
In align_loss, I find that for a given sample it is just a binary classification problem (foreground vs. background).
So maybe we could replace the cross_entropy in align_loss with a binary loss?
Please point out if my understanding is wrong.
sorry to bother. :)
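On the loss question, a general fact may help: softmax cross-entropy over two logits is mathematically the same as binary cross-entropy on the logit difference, since softmax([z_bg, z_fg])[1] = sigmoid(z_fg - z_bg). So for a single foreground/background decision, the two losses would be expected to give the same value. A minimal numerical check follows; the logits are arbitrary illustrative values, and this says nothing about how align_loss is implemented in the repository.

```python
import math

def cross_entropy_2class(z_bg, z_fg, target):
    """Softmax cross-entropy for logits (z_bg, z_fg); target is 0 (bg) or 1 (fg)."""
    log_norm = math.log(math.exp(z_bg) + math.exp(z_fg))
    return log_norm - (z_fg if target == 1 else z_bg)

def binary_cross_entropy_with_logit(z, target):
    """Binary cross-entropy on a single logit z = z_fg - z_bg."""
    p = 1.0 / (1.0 + math.exp(-z))
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

z_bg, z_fg = 0.3, 1.7
for target in (0, 1):
    ce = cross_entropy_2class(z_bg, z_fg, target)
    bce = binary_cross_entropy_with_logit(z_fg - z_bg, target)
    print(target, round(ce, 6), round(bce, 6))
```

The two columns printed for each target agree, which is the equivalence stated above.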
First of all, thank you very much for your open source code.
I have a question for you. The mean IoU of 1-way 1-shot segmentation on the PASCAL-5i dataset in your paper is 48.1, but the result of my training with your code is 41.9. Do you have any suggestions or tricks for me? I look forward to your reply.
Hi Kaixin, thank you for the source code. May I ask if you have any suggestions for downloading the image dataset? I tried to download it from the source link (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar), but the site can never be reached and times out every time. I searched for other sources, but they all redirect back to the original link.
I know from the link you provided I can have all the masks.
Sorry if this question troubles you as this is my very first time working on this dataset. Thanks in advance.
Hi Kaixin. Thanks for sharing your great code.
I want to use your model to test MRI data (nii).
How can I feed MRI data into your model?
I am very interested in PANet, but I want to know how to visualize the segmentation results. I did not find any visualization code for the segmentation results in the repository. Thank you.
Hello,
Thank you for your nice work.
I am new to this area. My question is: how can I obtain the qualitative results shown in Figures 3 and 4 of your paper?
Thank you,
Alex ~