ziqi-jin / finetune-anything
Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, and detection in specific scenarios
License: MIT License
Thank you very much for your meaningful work. I have briefly debugged the training of semantic segmentation and found that no prompt inputs were sent to ExtendSAM. My understanding is that it may be appropriate to use 32x32 grid points as a prompt and classify the mask. Or perhaps I have misunderstood the purpose of your prompt inputs, and I'm curious about why you used them.
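As a minimal sketch (not the repo's actual code) of the idea above, a 32x32 grid of point prompts can be laid out at cell centers, mirroring how SAM's automatic mask generator places its default grid:

```python
import numpy as np

def build_point_grid(n_per_side, img_h, img_w):
    """Return an (n*n, 2) array of (x, y) point prompts at grid cell centers."""
    offset = 1.0 / (2 * n_per_side)
    coords = np.linspace(offset, 1.0 - offset, n_per_side)
    xs, ys = np.meshgrid(coords, coords)
    grid = np.stack([xs.ravel(), ys.ravel()], axis=-1)  # normalized (x, y)
    return grid * np.array([img_w, img_h], dtype=float)  # scale to pixels

points = build_point_grid(32, 1024, 1024)  # 1024 point prompts for a 1024x1024 image
```

Each row could then be fed to the prompt encoder as a foreground point, with the resulting mask classified by the segmentation head.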
Looking forward to the author open-sourcing the detection task code.
Hi, I have a few questions about the config file.
Why is the image_set in the config file set to train for val? Will this cause the validation benchmark to run on the training set instead of the validation set?
Was the mIoU of 78 you reported obtained with the current version of FA? If not, can you update the code and share some insights about the new version of FA?
Thanks.
Hi. Do you only support the VOC dataset? Could you support more dataset formats? And could you please share some inference results if you have any?
Thank you so much for the contribution.
Thank you for your warm answer! After fine-tuning, how do you run prediction on samples?
Hi, do you support fine-tuning for multi-instance segmentation?
Can't wait to try it!
As the title says: if SAM performs poorly on some data, it is most likely because SAM's training set does not cover those scenarios. Fine-tuning SAM also requires considerable resources. Can simply fine-tuning SAM bridge the domain gap between datasets? Would it be better than training a smaller model from scratch?
It's not an issue, but I want to do segmentation on point clouds, and I have .laz files for fine-tuning the SAM model. So basically I should create a customized dataset for that, but I couldn't understand it properly. Can you explain step by step how to fine-tune with a customized dataset?
The MobileSAM link is as follows:
https://github.com/ChaoningZhang/MobileSAM
How to predict N classes in SemSegHead?
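As a hypothetical sketch (not necessarily the repo's actual SemSegHead), predicting N classes typically means mapping the decoder's per-pixel embedding to N logit channels with a 1x1 convolution and upsampling to the output size:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical N-class segmentation head; `in_ch` is the assumed channel
# width of the decoder feature map it consumes.
class NClassSegHead(nn.Module):
    def __init__(self, in_ch, num_classes):
        super().__init__()
        self.classifier = nn.Conv2d(in_ch, num_classes, kernel_size=1)

    def forward(self, feats, out_hw):
        logits = self.classifier(feats)  # (B, num_classes, h, w)
        return F.interpolate(logits, size=out_hw, mode='bilinear',
                             align_corners=False)

head = NClassSegHead(in_ch=256, num_classes=21)  # 21 = VOC background + 20
logits = head(torch.randn(1, 256, 64, 64), (256, 256))
```

The per-pixel class is then `logits.argmax(dim=1)`.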
Very eager for the author's test code
Thank you very much for your work 👍
I want to train with my own dataset. Can you provide guidelines for the dataset layout in tree form?
I'm not sure what kinds of files I need and what the directory structure should be.
Hello,
Thanks for this amazing repo! I was wondering what mIoU you got on the PASCAL validation set. I trained for about 50,000 iterations and got an mIoU of about 4.12. What performance should I be expecting?
Based on #23 being closed, I assume the README should be updated to reflect the change.
Hi,
very interesting work. After fine-tuning, is the SamAutomaticMaskGenerator still applicable
for generating masks similar to the original SAM?
thx
First of all, thanks for sharing this material.
According to this line(https://github.com/ziqi-jin/finetune-anything/blob/main/extend_sam/runner.py#L97),
class_names = self.val_loader.dataset.class_names.
But I can't find class_names in config file.
Is it a list consisting of 19 semantic labels? Which labels does it have?
Could you please give an example of class_names?
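For illustration, a hypothetical class_names list for PASCAL VOC 2012 semantic segmentation would look like the following (background plus 20 object classes; adapt the entries to your own dataset's labels):

```python
# Hypothetical class_names for PASCAL VOC 2012 (21 entries: background + 20
# object classes). A 19-label list would instead suggest Cityscapes classes.
class_names = [
    'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
    'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
]
```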
Looking forward to instance segmentation!
Hello author, when I run multi-GPU training with two GPUs specified, only the first GPU's memory usage rises while the other stays unchanged, and then it runs out of memory. How can I fix this?
I would like to know, from anyone who has used the available script for a semantic segmentation task:
what was the best mIoU you could obtain, and with how many epochs and what batch size?
Hi, will there be a paper for this or other documents describing all the changes made to the original SAM architecture?
Can I fine-tune the model using multiple bounding boxes for the same image
(e.g., in the case of the KITTI dataset, if I want to fine-tune the model to segment cars from the background and there are multiple car instances in the frame)?
Is there a way I can provide the bounding boxes for all cars in that frame as my prompt, or should I iteratively give the bounding box and segmentation of every single car instance?
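The official segment-anything SamPredictor does accept batched boxes via predict_torch, so all car boxes for one frame can be segmented in a single pass. A sketch under that assumption (paths, images, and boxes are placeholders):

```python
import numpy as np
import torch

def segment_instances(predictor, image, boxes_xyxy):
    """Segment every box in one forward pass.

    `predictor` is a segment_anything SamPredictor; `boxes_xyxy` is an (N, 4)
    array of (x1, y1, x2, y2) boxes, e.g. all car boxes in one KITTI frame.
    Returns masks of shape (N, 1, H, W): one mask per box.
    """
    predictor.set_image(image)  # HxWx3 uint8 RGB array
    boxes = torch.as_tensor(np.asarray(boxes_xyxy, dtype=np.float32),
                            device=predictor.device)
    boxes = predictor.transform.apply_boxes_torch(boxes, image.shape[:2])
    masks, scores, _ = predictor.predict_torch(
        point_coords=None, point_labels=None,
        boxes=boxes, multimask_output=False,
    )
    return masks
```

This is batched inference with box prompts; whether the fine-tuning loop here consumes boxes the same way depends on the repo's prompt handling.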
Hello team,
great work.
I had one doubt: during fine-tuning, what is used as the ground truth to calculate the loss against the masks generated by SAM with a given prompt?
Hey, I want to fine-tune the model on my custom dataset of images. The images and masks have 1K resolution and cannot be downscaled or resized, otherwise important details will be lost. Please let me know if you have any suggestions. ;)
How can I fine-tune on a dataset annotated with labelme, which only has images and labels? Thank you.
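One common route is to rasterize the labelme JSON polygons into integer label masks that a semantic segmentation dataset can read. A sketch, where `label_to_id` is an assumed mapping from labelme label strings to class indices that you define yourself:

```python
import json
import numpy as np
from PIL import Image, ImageDraw

def labelme_to_mask(json_path, label_to_id):
    """Convert one labelme JSON file into an (H, W) uint8 label mask.

    `label_to_id` is a user-defined dict, e.g. {'car': 1, 'person': 2};
    unlabeled pixels stay 0 (background).
    """
    with open(json_path) as f:
        data = json.load(f)
    mask = Image.new('L', (data['imageWidth'], data['imageHeight']), 0)
    draw = ImageDraw.Draw(mask)
    for shape in data['shapes']:
        points = [tuple(p) for p in shape['points']]
        draw.polygon(points, fill=label_to_id[shape['label']])
    return np.array(mask)
```

The resulting masks can then be saved as PNGs alongside the images in whatever directory layout the dataset class expects.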
Hello, can you explain the design principle of the loss function?
An interesting project, but I have some questions. Why set mask_scale=1 with multimask_output=True instead of mask_scale=0 with multimask_output=False here?
Thank you very much for the work you provided, but I encountered some difficulties when loading pre-trained models for prediction and outputting visualization results. Will the test code be updated soon?
https://github.com/yformer/EfficientSAM
https://github.com/ChaoningZhang/MobileSAM
I am not too sure how to implement EfficientSAM, but I think I know how to implement MobileSAM: by replacing the full ViT model with Tiny-ViT.
You can see how that is done in another repo here: ZrrSkywalker/Personalize-SAM#35
And the code that I think needs to change: https://github.com/ziqi-jin/finetune-anything/blob/main/extend_sam/segment_anything_ori/build_sam.py
Please release training code !
I used my own dataset for fine-tuning, but got the following error:
img_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/train/images, ann_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/train/class_labels
img_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/val/images, ann_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/val/class_labels
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [11,0,0], thread: [503,0,0] Assertion `t >= 0 && t < n_classes` failed.
(the same assertion repeated for many other thread indices)
Traceback (most recent call last):
File "/mnt/hdd0/jwc/代码仓/finetune-anything/train.py", line 45, in
runner.train(train_cfg)
File "/mnt/hdd0/jwc/代码仓/finetune-anything/extend_sam/runner.py", line 62, in train
self._compute_loss(total_loss, loss_dict, masks_pred, labels, cfg)
File "/mnt/hdd0/jwc/代码仓/finetune-anything/extend_sam/runner.py", line 127, in _compute_loss
loss_dict[item[0]] = tmp_loss.item()
RuntimeError: CUDA error: device-side assert triggered
If I only use part of the dataset, this problem does not occur, but when I train on the full dataset, the error above appears. My labels are 0-12 plus 255; I set the number of classes to 13 and ignore the value 255. My coding skills are not very strong, and the author's work is excellent. Could you give me some guidance and help me solve this problem? I hope the author has time to reply. Thanks!
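The device-side assert `t >= 0 && t < n_classes` fires when some pixel's label value falls outside [0, num_classes) and is not handled as an ignore value, which is consistent with it appearing only on the full dataset. A sketch for scanning masks before training (the 13-class / 255-ignore values come from the report above); also make sure the loss is actually built with `ignore_index=255`, e.g. `nn.CrossEntropyLoss(ignore_index=255)`:

```python
import numpy as np

def find_invalid_labels(mask, num_classes=13, ignore_index=255):
    """Return label values in `mask` that would trigger the CUDA assert.

    Valid values are 0..num_classes-1 plus the ignore_index; anything else
    (e.g. a stray 254 from antialiased resizing) is reported.
    """
    values = np.unique(mask)
    return [int(v) for v in values
            if v != ignore_index and not (0 <= v < num_classes)]
```

Running this over every annotation file (loaded with PIL as a numpy array) should pinpoint the offending images.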
Hi, I use your codebase to do training on voc2012, but the mIoU and visualization results are really poor. Can you provide more details or pretrained checkpoints?
Thanks for your excellent work.
I am trying to fine-tune the SAM model on the Salient Object Detection task. I implemented my custom dataset based on BaseSemanticDataset
by substituting my own dataset_dir,
and replaced the CE loss with BCEWithLogitsLoss to fix some bugs.
However, the loss looks strange. Do you have any idea why?
iteration : 1, bce : -218.368408203125, total_loss : -109.1842041015625, time : 0
iteration : 3, bce : -2733.0740509033203, total_loss : -1366.5370254516602, time : 0
iteration : 5, bce : -67497.31298828125, total_loss : -33748.656494140625, time : 0
iteration : 7, bce : -19547019.328125, total_loss : -9773509.6640625, time : 0
iteration : 9, bce : -303343570888.0, total_loss : -151671785444.0, time : 0
iteration : 11, bce : -6.906153512921124e+19, total_loss : -3.453076756460562e+19, time : 0
iteration : 13, bce : -inf, total_loss : -inf, time : 0
iteration : 15, bce : nan, total_loss : nan, time : 0
iteration : 17, bce : nan, total_loss : nan, time : 0
iteration : 19, bce : nan, total_loss : nan, time : 0
iteration : 21, bce : nan, total_loss : nan, time : 0
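A steadily diverging, strongly negative BCE like the log above usually means the targets are not in [0, 1], e.g. a saliency mask stored as {0, 255}. A sketch of the fix, binarizing the target before the loss (function and variable names here are illustrative, not the repo's):

```python
import torch
import torch.nn.functional as F

def bce_mask_loss(logits, target_mask):
    """BCE-with-logits over a raw mask that may be stored as {0, 255}.

    binary_cross_entropy_with_logits assumes targets in [0, 1]; feeding 255
    directly produces large negative losses that quickly blow up to -inf/nan.
    """
    target = (target_mask > 0).float()  # map {0, 255} (or any positive) to {0, 1}
    return F.binary_cross_entropy_with_logits(logits, target)
```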
Issue: 'MaskDecoder' object has no attribute 'iou_head_depth'. What is the cause of this problem?
Thanks for your great work. I wonder whether the code for semantic segmentation has been finished.
As the title says.
Hello! How many GB of GPU memory does fine-tuning SAM require at minimum?
Thank you for your excellent work. When will the complete code be uploaded?
When training on 4 GPUs (Tesla K80, 11441 MiB)
with bs = 4 and workers = 4,
I always end up with the error
RuntimeError: CUDA error: the launch timed out and was terminated. (Is there a workaround, or is it due to my compute limitations?)
test code
After fine-tuning with my own data, I want to visualize the inference results.
Traceback (most recent call last):
File "D:\3. Develop\2.AI\segmentation_anything_yaml\train.py", line 32, in
train_dataset = get_dataset(train_cfg.dataset)
File "D:\segmentation_anything_yaml\datasets\__init__.py", line 28, in get_dataset
return segment_datasets[name](**cfg.params, transform=transform, target_transform=target_transform)
TypeError: __init__() got an unexpected keyword argument 'root'
I think it's a version issue. Can you tell me your versions?
My CUDA is 11.3 and my cuDNN is v8.2.0.53.