
panet's People

Contributors

2448845600, kaixin96

panet's Issues

Model Computation Time

I measured the inference time with the code below.
'''
time_s = time.time()
query_pred, _ = model(support_images, support_fg_mask, support_bg_mask,
query_images)
time_elapsed = time.time() - time_s
'''
I found that the model runs at more than 30 FPS. Is that really correct?
I thought instance segmentation was usually time-consuming.
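
One thing worth checking with this kind of measurement: CUDA kernels are launched asynchronously, so a bare time.time() around a GPU forward pass can stop the clock before the work has finished. Below is a minimal sketch of a timing helper. The `sync` hook and the stand-in workload are assumptions for illustration (on a GPU one would pass `torch.cuda.synchronize` and call the real model instead).

```python
import time


def time_call(fn, *args, sync=None, warmup=2, repeats=10):
    """Return the mean wall-clock seconds per call of fn(*args).

    sync is an optional zero-argument callable that blocks until pending
    work is done; on a CUDA device one would pass torch.cuda.synchronize.
    """
    for _ in range(warmup):          # warm-up calls are not timed
        fn(*args)
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    if sync is not None:
        sync()                       # wait for queued work before stopping the clock
    return (time.perf_counter() - start) / repeats


def fake_forward(n):
    """Stand-in for model(...); any callable works here."""
    return sum(i * i for i in range(n))


seconds = time_call(fake_forward, 10_000)
fps = 1.0 / seconds
```

Averaging over several calls after a warm-up also keeps one-off start-up costs out of the FPS figure.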

Question about test image size

Hi Kaixin,
It seems that during both training and testing, the support and query images are resized to a fixed size (e.g. [417, 417]). However, in many few-shot segmentation works, the segmentation mask output is resized to the original image resolution for evaluation. How can I get the original query images and the corresponding ground-truth masks during the testing phase?
I also printed some unused key-value pairs in the dataloader dictionary, but they all seem to have the same fixed shape:
sample_batched['support_images_t'][0][0].shape = torch.Size([1, 3, 417, 417])
sample_batched['query_images_t'][0].shape = torch.Size([1, 3, 417, 417])
sample_batched['query_masks'][0][0].shape = torch.Size([1, 1, 417, 417])
sample_batched['query_labels'][0].shape = torch.Size([1, 417, 417])

The accuracy is lower

Thanks for sharing your code. It is really great work! However, the accuracy I got with this code is lower than that reported in the paper. Can you share the hyperparameter settings for the VOC2012 dataset?

Train PANet on my own dataset

Hi there,
I'm working on semantic segmentation of the IDD dataset. PANet looks like a good model for few-shot semantic segmentation and seems suitable for my task. It would be really helpful if you could explain how I can train PANet on my dataset (IDD).

Thanks.

**_config in train.py optimizer definition

Hello
Why does the optimizer definition in train.py use **_config instead of _config?
optimizer = torch.optim.SGD(model.parameters(), **_config['optim'])
This caused an error when I tried to run train.py:
optimizer = torch.optim.SGD(model.parameters(), **_config['optim'])
KeyError: 'optim'
Thanks for the help!
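
For context, `**` in a call unpacks a dictionary into keyword arguments, so `**_config['optim']` passes each entry of the `optim` sub-dictionary to the optimizer as a keyword argument. A `KeyError: 'optim'` therefore points to the config having no `optim` entry at all, not to the `**` syntax. Below is a minimal sketch with a stand-in function and made-up values. The real keys live in the repository's config and are not reproduced here.

```python
def make_optimizer(params, lr, momentum=0.0, weight_decay=0.0):
    """Stand-in for an optimizer constructor; the signature is illustrative only."""
    return {"params": params, "lr": lr, "momentum": momentum,
            "weight_decay": weight_decay}


# Hypothetical config; the values are made up for the example.
config = {"optim": {"lr": 1e-3, "momentum": 0.9, "weight_decay": 5e-4}}

# Equivalent to make_optimizer([], lr=1e-3, momentum=0.9, weight_decay=5e-4)
optimizer = make_optimizer([], **config["optim"])

# A config without the 'optim' entry fails on the lookup, before the call is made.
broken_config = {}
try:
    make_optimizer([], **broken_config["optim"])
    error = None
except KeyError as exc:
    error = exc
```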

Request for environment.yml file

Hi Kaixin,

Thank you for releasing the code. It shows the transparency of your work. Can you share the environment.yml file for PANet? It would be a great help!

Question about "scribble"

Hi,
thanks for sharing the code.

I checked the files in the ScribbleAugAuto folder. The annotations there are just lines and do not form closed shapes. How can I generate annotations similar to Figure 6?

If not, what is the value of scribble_dilation in the config file during the test? It seems that the performance is not ideal (in my test) when the value is 0. Did I neglect something?

Look forward to your reply.

Regards

Lower Accuracy With ResNet BackBone

Hello. I found that when I switched the backbone to ResNet, with the last two layers replaced by dilation-2 and dilation-4 convolutions and all other settings unchanged, the accuracy is much lower than with VGG. Should I change some hyperparameters, such as the learning rate or the number of training steps? Thank you.

Performance of Resnet50

Hello, thanks for open-sourcing your nice work! Did you explore other encoder backbones, for example resnet50? I used resnet50 as the encoder but got a worse mIoU, specifically 40.7%. What do you think of this?

Question about the Dataset

Hi,

I was looking through the code but couldn't figure out what "scribble" indicates.
From dataloaders/customized.py
Line 103
support_scribbles = [[paired_sample[cumsum_idx[i] + j]['scribble'] for j in range(n_shots)]
                     for i in range(n_ways)]

Could you clarify what support_scribbles means? Also, what is "image_t"?
From dataloaders/customized.py
Line 93
support_images_t = [[paired_sample[cumsum_idx[i] + j]['image_t'] for j in range(n_shots)]
                    for i in range(n_ways)]

Is it the segmentation target mask of the given image, while label indicates the class index of the mask?

Thanks in advance,

Hardware information

Hi! I am interested in your awesome work but have limited hardware resources.
May I ask which CPU and GPU(s) you used? Thanks!

Problems about visualizing the pred output

Hi, thanks for the source code, it really helps me on my on-going project.

Can you give me some hints about how to visualize the predictions of the network? The results in the paper look good, and I am also interested in which package was used to visualize the output.

I was using TensorBoard to visualize the input images and masks, and it works fine. But for the predictions made by the network, I have little to no clue whether I should do some post-processing, even though the paper says no post-processing or decoding is needed. The output looks like a greyscale heatmap rather than a black-and-white mask.

BTW, the training loss curve looks exactly like the paper said and the test result is also close to the paper's results.
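
A greyscale heatmap is what per-class scores look like when plotted directly. If the network output holds per-class scores with shape (classes, H, W), where channel 0 is background (an assumption about the output layout, not something confirmed in this thread), taking the argmax over the class axis gives a discrete label map that can be displayed as a mask. Below is a minimal sketch using plain nested lists in place of tensors. Both function names are made up for illustration.

```python
def scores_to_mask(scores):
    """Turn per-class scores [C][H][W] into a label map [H][W] via argmax over C."""
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [[max(range(n_classes), key=lambda c: scores[c][y][x])
             for x in range(width)]
            for y in range(height)]


def mask_to_grey(mask, n_classes):
    """Spread labels 0..n_classes-1 over 0..255 so the mask is visible as an image."""
    scale = 255 // max(n_classes - 1, 1)
    return [[label * scale for label in row] for row in mask]


# Toy 2-class scores on a 2x3 image: channel 0 = background, channel 1 = foreground.
scores = [
    [[0.9, 0.2, 0.1], [0.8, 0.7, 0.3]],   # background scores
    [[0.1, 0.8, 0.9], [0.2, 0.3, 0.7]],   # foreground scores
]
mask = scores_to_mask(scores)
grey = mask_to_grey(mask, n_classes=2)
```

With tensors, the same step would be an argmax over the class dimension before the result is passed to the image logger.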

Any update plan about the Dataloaders and VRAM usage

Hi, this code has been very helpful for my ongoing project.
However, is there any plan to simplify the dataloader? Transferring the model to another dataset is really inconvenient.
Another problem: increasing the batch size dramatically increases the VRAM usage. I am not sure whether this is normal. If it is, is there any way to optimize it?

Multi-GPU training

@kaixin96 How can I run your code on two GPU cards with batch_size set to 4 for faster training? Could you provide the corresponding modifications and the run command?

sacred.utils.ConfigAddedError

Thanks for your work.

I ran your code and got the following error. Could you give me some clue to solve it?
sacred.utils.ConfigAddedError: Added new config entry that is not used anywhere
Conflicting configuration values:
model.sparse=False

'SBD' folder -sbd_instance_process.py

Hi Kaixin, sorry to trouble you. May I ask what your 'SBD' folder is and how I can get it? Sorry if I missed something. From my understanding, it seems you already have the SBD folder in place, process its contents, and save them to the SegmentationClassAug folder. Thank you very much.

sbd_instance_process.py

and transform it from .mat to .png. Then transformed
images will be saved in VOC data folder. The name of
the new folder is "SegmentationObjectAug"

Cuda out of memory error

Hi @kaixin96, I was trying to run this code and ended up with an out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 340.00 MiB (GPU 0; 1.96 GiB total capacity; 946.64 MiB already allocated; 23.38 MiB free; 1.33 GiB reserved in total by PyTorch)

I noticed that the batch size is already 1 so I cannot reduce it further, is there a workaround?

Or is there any way I can run it in an interactive environment like Colab? As far as I know, sacred doesn't support interactive environments.

Your help would be greatly appreciated.

Performance is bad on small targets

Hi @kaixin,
Thank you for sharing the code; it is helpful for me. However, the performance is bad on my dataset, which has small targets such as the following:
[image: crack]
There is a crack defect in this picture. The resulting mIoU is only about 0.1 and the visualization is very bad. Do you have any suggestions about this? Thank you very much!
BTW: The image size is 1488*1488 and the format is .bmp.

1-shot mIOU is far less than the paper

Thanks for your wonderful work!
I have a question about the training code.
I trained and tested without any modification, but I got only 41 mIoU.
What is the problem?

INFO - main - ----- Final Result -----
INFO - main - classIoU mean: [0.56933684 0.50324984 0.51414957 0.32956392 0.16224402]
INFO - main - classIoU std: [0.02441738 0.02545722 0.02529587 0.01004774 0.01762301]
INFO - main - meanIoU mean: 0.41570883715177515
INFO - main - meanIoU std: 0.005918026601930923
INFO - main - classIoU_binary mean: [0.88759253 0.43259521]
INFO - main - classIoU_binary std: [0.00280747 0.00433909]
INFO - main - meanIoU_binary mean: 0.66009386980461
INFO - main - meanIoU_binary std: 0.0024477935615900745
INFO - PANet - Completed after 0:14:51

Question about datasets.

Hi @kaixin96
I can't find the JPEGImages folder, as the error message shows:

FileNotFoundError: [Errno 2] No such file or directory: './data/VOCdevkit/VOC2012/JPEGImages/2011_000997.jpg'

Where can I find the whole dataset?
Thanks.

Why the query label pixel values are modified?

Hi,
I read the paper and the code, but I am not sure why the query label's values are changed here:

query_labels_tmp = [torch.zeros_like(x) for x in query_labels]
for i, query_label_tmp in enumerate(query_labels_tmp):
    query_label_tmp[query_labels[i] == 255] = 255
    for j in range(n_ways):
        query_labels_tmp[query_labels[i] == class_ids[j]] = j + 1

My other question is why query_label_tmp is returned instead of query_labels_tmp?
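
As far as the quoted snippet shows, the remapping converts dataset-wide class IDs into episode-local labels: 0 for background, j + 1 for the j-th sampled class, and 255 kept as the ignore value. Below is a minimal sketch of that idea on a nested list rather than a torch tensor. The function name `remap_labels` is made up for illustration.

```python
IGNORE = 255  # pixels with this value are kept as-is


def remap_labels(label, class_ids):
    """Map dataset class IDs in a [H][W] label map to episode-local labels.

    Pixels of class_ids[j] become j + 1, IGNORE stays IGNORE,
    and everything else becomes 0 (background).
    """
    local = {class_id: j + 1 for j, class_id in enumerate(class_ids)}
    local[IGNORE] = IGNORE
    return [[local.get(pixel, 0) for pixel in row] for row in label]


# Toy 2-way episode: dataset classes 7 and 15 were sampled; class 3 is not part of it.
label = [
    [0, 7, 7, 255],
    [3, 15, 15, 0],
]
remapped = remap_labels(label, class_ids=[7, 15])
```

A remapping of this kind lets the label values line up with the output channels of an N-way episode regardless of which dataset classes were drawn.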

How can I train it on my data?

I'd like to use PANet for few-shot segmentation on a dataset I've gathered. I looked over the code but couldn't figure out how to train and test the model on an external dataset. Could you please include a script for this?

Question about random sampling for evaluation

Thank you for your great work.

I have a question regarding the evaluation. As mentioned in the paper, the evaluation is computed by "average the results from 5 runs with different random seeds, each run containing 1,000 episodes."
In this case, do you keep the 5*1000 episodes fixed when you run the baseline methods (such as PL[4] and SG-One[28]) and ablation studies? And in the 2-way setting, is "person" class always held as in PL[4]?

Some questions about dataloader


self.indices = [[(dataset_idx, random.randrange(self.n_data[dataset_idx]))
                 for dataset_idx in random.sample(range(self.n_datasets),
                                                  k=n_elements)]
                for i in range(max_iters)]

In this sample operation, 'k' is set to 'n_elements'. I think 'k' here should be the number of classes, but 'n_elements' is the number of images per class. Is this a mistake?

Question about meanIoU binary

Thanks for sharing your work! I am a little confused about the metrics. The code says that the binary mean IoU treats all classes as one foreground class. But why are the mean IoU and the binary mean IoU different in the 1-way setting? There is only one class there, which is the foreground class, right?
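
The log quoted in the "1-shot mIOU is far less than the paper" issue above hints at the difference: there, `meanIoU` is the average of five per-class IoUs, while `meanIoU_binary` is the average of two values, which look like background and foreground. If that reading is right (it is inferred from the log, not confirmed by the author), the binary score includes the background IoU and will differ from the plain mean IoU even in the 1-way setting. The arithmetic can be checked directly:

```python
# Values copied from the log in the "1-shot mIOU is far less than the paper" issue.
class_iou = [0.56933684, 0.50324984, 0.51414957, 0.32956392, 0.16224402]
class_iou_binary = [0.88759253, 0.43259521]  # assumed order: background, foreground


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


mean_iou = mean(class_iou)                  # averages the per-class IoUs only
mean_iou_binary = mean(class_iou_binary)    # averages the two binary IoUs
```

Both results agree with the `meanIoU mean` and `meanIoU_binary mean` lines of that log to the printed precision of the inputs.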

Question about batch size

Hi Kaixin,

May I know why the batch size is set to 1? Did you try larger batch size in your experiments, and will it have any effect?

Thanks.

Regarding the Data Splits

Hello, thanks for publicly sharing the code of your paper. I was going over the config.py file to see if the default settings match what is reported in the paper, but noticed that n_runs = 5. Shouldn't it be set to 4, as the reported accuracy is apparently for 4 splits?

I am under the assumption that for each run, the number reported in the paper (e.g. split-1 for run 0, split-2 for run 1, etc.) is the meanIoU as printed in the output of test.py, and that by setting label_sets = 0 (and seemingly n_runs = 4, as noted above) one should be able to run inference on the same splits as reported in the paper. I would appreciate some clarification.

No such file or directory: './runs/PANet_VOC_sets_0_1way_1shot_[train]/1/snapshots/30000.pth

Thanks dear for publishing the code.

I want to try test.py but when I run it, I get this error:

INFO - PANet - Running command 'main'
INFO - PANet - Started run with ID "2"
INFO - main - ###### Create model ######
ERROR - PANet - Failed after 0:00:13!
Traceback (most recent calls WITHOUT Sacred internals):
  File "/content/drive/MyDrive/sofa/PANet/test.py", line 42, in main
    model.load_state_dict(torch.load(_config['snapshot'], map_location='cpu'))
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 271, in _open_file_like
    return open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './runs/PANet_VOC_sets_0_1way_1shot_[train]/1/snapshots/30000.pth'
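
The traceback only says that the snapshot file is missing, which could mean that training has not written it yet or that the snapshot path in the config points to a different run. Below is a minimal sketch for listing which snapshots actually exist, assuming the `<runs>/<experiment>/<run id>/snapshots/*.pth` layout seen in the error message. Both helper names are made up for illustration.

```python
from pathlib import Path


def list_snapshots(runs_dir):
    """Return all .pth files found under <runs_dir>/<experiment>/<run id>/snapshots/."""
    return sorted(Path(runs_dir).glob("*/*/snapshots/*.pth"))


def check_snapshot(path):
    """Return the path if the file exists, else raise FileNotFoundError with a hint."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run training first or fix the snapshot path")
    return path


# Print whatever snapshots are present under ./runs (prints nothing if there are none).
for snapshot in list_snapshots("./runs"):
    print(snapshot)
```

Comparing the printed paths with the one in the error message shows quickly whether the run ID or the iteration number is the part that does not match.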

Minor typo in comments

Hello, and first of all thank you for the open source code.
I believe that the size of fore_mask and back_mask is Wa x Sh x B x H x W, not Wa x Sh x B x H' x W'. Please correct me if I'm wrong.

PANet/models/fewshot.py

Lines 64 to 67 in 39815f7

fore_mask = torch.stack([torch.stack(way, dim=0)
                         for way in fore_mask], dim=0)  # Wa x Sh x B x H' x W'
back_mask = torch.stack([torch.stack(way, dim=0)
                         for way in back_mask], dim=0)  # Wa x Sh x B x H' x W'

UserWarning

Hello, I get the following warning during training, but I don't know what to change:
"UserWarning: ndarray is defined by reference to an object we do not know how to serialize. A deep copy is serialized instead, breaking memory aliasing.
warnings.warn(msg)"

Some questions about the dataloader

PANet/dataloaders/common.py

Lines 155 to 159 in c248b25

else:
    self.indices = [[(dataset_idx, random.randrange(self.n_data[dataset_idx]))
                     for dataset_idx in random.sample(range(self.n_datasets),
                                                      k=n_elements)]
                    for i in range(max_iters)]

In this sample operation, 'k' is set to 'n_elements'. I think 'k' here should be the number of classes, but 'n_elements' is the number of images per class. Is this a mistake?
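
For reference, the quoted expression builds, for every iteration, a list of `n_elements` pairs: `random.sample` picks `n_elements` distinct dataset indices out of `n_datasets`, and `random.randrange` picks one item index inside each chosen dataset. Whether each sub-dataset corresponds to a class depends on how the datasets are constructed, which is not shown in this thread. Below is a minimal standalone sketch of the same sampling with made-up sizes. The function name and the `seed` parameter are assumptions for illustration.

```python
import random


def build_indices(n_data, n_elements, max_iters, seed=0):
    """For each iteration, pick n_elements distinct datasets and one item in each.

    n_data[i] is the number of items in dataset i. Returns a list of
    max_iters lists of (dataset_idx, item_idx) pairs.
    """
    rng = random.Random(seed)  # local generator so the example is reproducible
    n_datasets = len(n_data)
    return [[(dataset_idx, rng.randrange(n_data[dataset_idx]))
             for dataset_idx in rng.sample(range(n_datasets), k=n_elements)]
            for _ in range(max_iters)]


# Hypothetical sizes: 4 sub-datasets with 10, 20, 5 and 8 items.
indices = build_indices(n_data=[10, 20, 5, 8], n_elements=2, max_iters=3)
```

Because `random.sample` draws without replacement, no dataset index appears twice within one iteration.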

Trained weights of the model

Hello,

Thank you for your work.

I would like to test your model on the other validation settings.

Could you provide the trained weights of your model?

Ahyun Seo

Align loss

Hi, thanks for your great work!
In align_loss, I find that for a given sample it is just a binary classification problem (foreground vs. background).
So maybe we could replace the cross-entropy in align_loss with a binary cross-entropy loss?

Please point out if my understanding is wrong.

Sorry to bother you. :)
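
For two classes the two losses coincide: softmax cross-entropy over the logits (z_bg, z_fg) equals binary cross-entropy applied to the single logit z_fg - z_bg. So, assuming the align loss really has only these two classes, switching would mainly change how the loss is written, not its value. A small numerical check in plain Python, with arbitrary logit values:

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against an integer class index."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def binary_cross_entropy_with_logit(logit, target):
    """Binary cross-entropy of sigmoid(logit) against a 0/1 target."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


z_bg, z_fg = 0.3, 1.7  # arbitrary background / foreground logits

# Foreground pixel: class index 1 in the softmax view, target 1 in the binary view.
ce_fg = softmax_cross_entropy([z_bg, z_fg], target=1)
bce_fg = binary_cross_entropy_with_logit(z_fg - z_bg, target=1)

# Background pixel: class index 0 in the softmax view, target 0 in the binary view.
ce_bg = softmax_cross_entropy([z_bg, z_fg], target=0)
bce_bg = binary_cross_entropy_with_logit(z_fg - z_bg, target=0)
```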

Can't reproduce the accuracy of the paper

First of all, thank you very much for your open source code.
I have a question for you. The mean IoU of 1-way 1-shot segmentation on the PASCAL-5i dataset in your paper is 48.1, but the result of my training with your code is 41.9. Do you have any suggestions or tricks for me? Looking forward to your reply.

Pascal VOC 2012 images

Hi Kaixin, thank you for the source code. May I ask if you have any suggestions for downloading the image dataset? I tried to download it from the source link (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar), but the site can never be reached and the request times out every time. I tried to search for other sources, but they all redirect back to the original link.

I know from the link you provided I can have all the masks.

Sorry if this question troubles you as this is my very first time working on this dataset. Thanks in advance.

visualization of the segmentation results

I am very interested in your PANet, but I would like to know how to visualize the segmentation results. I did not find any visualization code for the segmentation results in the repository. Thank you.
