ziqi-jin / finetune-anything
Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, and detection in specific scenarios
License: MIT License
Thank you very much for your meaningful work. I have briefly debugged the training of semantic segmentation and found that no prompt inputs were sent to ExtendSAM. My understanding is that it may be appropriate to use 32x32 grid points as a prompt and classify the mask. Or perhaps I have misunderstood the purpose of your prompt inputs, and I'm curious about why you used them.
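As a minimal sketch (not the repo's actual code) of the idea above, a 32x32 grid of point prompts can be laid out at cell centers, mirroring how SAM's automatic mask generator places its default grid:

```python
import numpy as np

def build_point_grid(n_per_side, img_h, img_w):
    """Return an (n*n, 2) array of (x, y) point prompts at grid cell centers."""
    offset = 1.0 / (2 * n_per_side)
    coords = np.linspace(offset, 1.0 - offset, n_per_side)
    xs, ys = np.meshgrid(coords, coords)
    grid = np.stack([xs.ravel(), ys.ravel()], axis=-1)  # normalized (x, y)
    return grid * np.array([img_w, img_h], dtype=float)  # scale to pixels

points = build_point_grid(32, 1024, 1024)  # 1024 point prompts for a 1024x1024 image
```

Each row could then be fed to the prompt encoder as a foreground point, with the resulting mask classified by the segmentation head.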
Looking forward to the author open-sourcing the detection task code.
Hi, I have a few questions about the config file.
Why is the image_set in the config file set to train for val? Will this cause the validation benchmark to run on the training set instead of the validation set?
Was the mIoU of 78 you reported obtained with the current version of FA? If not, can you update the code and share some insights about the new version of FA?
Thanks.
Hi. Do you only support the VOC dataset? Could you support more dataset formats? And could you please share some inference results if you have any?
Thank you so much for the contribution.
Thank you for your warm answer! After fine-tuning, how do you run prediction on samples?
Hi, do you support fine-tuning for multi-instance segmentation?
Can't wait to try it!
As the title says: if SAM performs poorly on some data, it is most likely because SAM's training set does not cover those scenarios. Fine-tuning SAM also requires considerable resources. Can simply fine-tuning SAM bridge the domain gap between datasets? Would it be better than training a smaller model from scratch?
It's not an issue, but I want to do segmentation on point clouds, and I have .laz files for fine-tuning the SAM model. So basically I should create a customized dataset for that, but I couldn't understand it properly. Can you explain step by step how to fine-tune with a customized dataset?
The MobileSAM link is as follows:
https://github.com/ChaoningZhang/MobileSAM
How to predict N classes in SemSegHead?
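As a hypothetical sketch (not necessarily the repo's actual SemSegHead), predicting N classes typically means mapping the decoder's per-pixel embedding to N logit channels with a 1x1 convolution and upsampling to the output size:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical N-class segmentation head; `in_ch` is the assumed channel
# width of the decoder feature map it consumes.
class NClassSegHead(nn.Module):
    def __init__(self, in_ch, num_classes):
        super().__init__()
        self.classifier = nn.Conv2d(in_ch, num_classes, kernel_size=1)

    def forward(self, feats, out_hw):
        logits = self.classifier(feats)  # (B, num_classes, h, w)
        return F.interpolate(logits, size=out_hw, mode='bilinear',
                             align_corners=False)

head = NClassSegHead(in_ch=256, num_classes=21)  # 21 = VOC background + 20
logits = head(torch.randn(1, 256, 64, 64), (256, 256))
```

The per-pixel class is then `logits.argmax(dim=1)`.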
Very eager for the author's test code
Thank you very much for your work 👍
I want to train with my own dataset. Can you provide guidelines for the dataset layout in tree form?
I'm not sure what kinds of files I need and what the directory structure should be.
Hello,
Thanks for this amazing repo! I was wondering what mIoU you got on the PASCAL validation set. I trained for about 50,000 iterations and got an mIoU of about 4.12. What performance should I be expecting?
Based on #23 being closed, I assume the README should be updated to reflect the change.
Hi,
very interesting work. After fine-tuning, is the SamAutomaticMaskGenerator still applicable
for generating masks similar to the original SAM?
thx
First of all, thanks for sharing this material.
According to this line(https://github.com/ziqi-jin/finetune-anything/blob/main/extend_sam/runner.py#L97),
class_names = self.val_loader.dataset.class_names.
But I can't find class_names in config file.
Is it a list consisting of 19 semantic labels? Which labels does it have?
Could you please give an example of class_names?
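For illustration, a hypothetical class_names list for PASCAL VOC 2012 semantic segmentation would look like the following (background plus 20 object classes; adapt the entries to your own dataset's labels):

```python
# Hypothetical class_names for PASCAL VOC 2012 (21 entries: background + 20
# object classes). A 19-label list would instead suggest Cityscapes classes.
class_names = [
    'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
    'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
]
```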
Looking forward to instance segmentation!
Hello author, when I run multi-GPU training with two GPUs specified, only the first GPU's memory usage rises while the other stays unchanged, and then it runs out of memory. How can I fix this?
I would like to know, from anyone who has used the available script for a semantic segmentation task:
what was the best mIoU you could obtain, and with how many epochs and what batch size?
Hi, will there be a paper for this or other documents describing all the changes made to the original SAM architecture?
Can I fine-tune the model using multiple bounding boxes for the same image
(e.g., in the case of the KITTI dataset, if I want to fine-tune the model to segment cars from the background and there are multiple car instances in the frame)?
Is there a way I can provide the bounding boxes for all cars in that frame as my prompt, or should I iteratively give the bounding box and segmentation of every single car instance?
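The official segment-anything SamPredictor does accept batched boxes via predict_torch, so all car boxes for one frame can be segmented in a single pass. A sketch under that assumption (paths, images, and boxes are placeholders):

```python
import numpy as np
import torch

def segment_instances(predictor, image, boxes_xyxy):
    """Segment every box in one forward pass.

    `predictor` is a segment_anything SamPredictor; `boxes_xyxy` is an (N, 4)
    array of (x1, y1, x2, y2) boxes, e.g. all car boxes in one KITTI frame.
    Returns masks of shape (N, 1, H, W): one mask per box.
    """
    predictor.set_image(image)  # HxWx3 uint8 RGB array
    boxes = torch.as_tensor(np.asarray(boxes_xyxy, dtype=np.float32),
                            device=predictor.device)
    boxes = predictor.transform.apply_boxes_torch(boxes, image.shape[:2])
    masks, scores, _ = predictor.predict_torch(
        point_coords=None, point_labels=None,
        boxes=boxes, multimask_output=False,
    )
    return masks
```

This is batched inference with box prompts; whether the fine-tuning loop here consumes boxes the same way depends on the repo's prompt handling.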
Hello team,
great work.
I had one doubt: during fine-tuning, what is used as the ground truth to calculate the loss against the masks generated by SAM with a given prompt?
Hey, I want to fine-tune the model on my custom dataset of images. The images and masks have 1K resolution and cannot be downscaled or resized, otherwise important details will be lost. Please let me know if you have any suggestions. ;)
How can I fine-tune on a dataset annotated with labelme, which only has images and labels? Thank you.
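One common route is to rasterize the labelme JSON polygons into integer label masks that a semantic segmentation dataset can read. A sketch, where `label_to_id` is an assumed mapping from labelme label strings to class indices that you define yourself:

```python
import json
import numpy as np
from PIL import Image, ImageDraw

def labelme_to_mask(json_path, label_to_id):
    """Convert one labelme JSON file into an (H, W) uint8 label mask.

    `label_to_id` is a user-defined dict, e.g. {'car': 1, 'person': 2};
    unlabeled pixels stay 0 (background).
    """
    with open(json_path) as f:
        data = json.load(f)
    mask = Image.new('L', (data['imageWidth'], data['imageHeight']), 0)
    draw = ImageDraw.Draw(mask)
    for shape in data['shapes']:
        points = [tuple(p) for p in shape['points']]
        draw.polygon(points, fill=label_to_id[shape['label']])
    return np.array(mask)
```

The resulting masks can then be saved as PNGs alongside the images in whatever directory layout the dataset class expects.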
Hello, can you explain the design principle of the loss function?
An interesting project, but I have some questions. Why set mask_scale=1 with multimask_output=True instead of mask_scale=0 with multimask_output=False here?
Thank you very much for the work you provided, but I encountered some difficulties when loading pre-trained models for prediction and outputting visualization results. Will the test code be updated soon?
https://github.com/yformer/EfficientSAM
https://github.com/ChaoningZhang/MobileSAM
I am not too sure how to implement EfficientSAM, but I think I know how to implement MobileSAM: by replacing the full ViT model with Tiny-ViT.
You can see how that is done in another repo here: ZrrSkywalker/Personalize-SAM#35
And the code that I think needs to change: https://github.com/ziqi-jin/finetune-anything/blob/main/extend_sam/segment_anything_ori/build_sam.py
Please release training code !
I used my own dataset for fine-tuning, but got the following error:
img_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/train/images, ann_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/train/class_labels
img_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/val/images, ann_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/val/class_labels
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [11,0,0], thread: [503,0,0] Assertion `t >= 0 && t < n_classes` failed.
(the same assertion repeated for many other thread indices)
Traceback (most recent call last):
File "/mnt/hdd0/jwc/代码仓/finetune-anything/train.py", line 45, in
runner.train(train_cfg)
File "/mnt/hdd0/jwc/代码仓/finetune-anything/extend_sam/runner.py", line 62, in train
self._compute_loss(total_loss, loss_dict, masks_pred, labels, cfg)
File "/mnt/hdd0/jwc/代码仓/finetune-anything/extend_sam/runner.py", line 127, in _compute_loss
loss_dict[item[0]] = tmp_loss.item()
RuntimeError: CUDA error: device-side assert triggered
If I only use part of the dataset, this problem does not occur, but when I train on the full dataset, the error above appears. My labels are 0-12 plus 255; I set the number of classes to 13 and ignore the value 255. My coding skills are not very strong, and the author's work is excellent. Could you give me some guidance and help me solve this problem? I hope the author has time to reply. Thanks!
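The device-side assert `t >= 0 && t < n_classes` fires when some pixel's label value falls outside [0, num_classes) and is not handled as an ignore value, which is consistent with it appearing only on the full dataset. A sketch for scanning masks before training (the 13-class / 255-ignore values come from the report above); also make sure the loss is actually built with `ignore_index=255`, e.g. `nn.CrossEntropyLoss(ignore_index=255)`:

```python
import numpy as np

def find_invalid_labels(mask, num_classes=13, ignore_index=255):
    """Return label values in `mask` that would trigger the CUDA assert.

    Valid values are 0..num_classes-1 plus the ignore_index; anything else
    (e.g. a stray 254 from antialiased resizing) is reported.
    """
    values = np.unique(mask)
    return [int(v) for v in values
            if v != ignore_index and not (0 <= v < num_classes)]
```

Running this over every annotation file (loaded with PIL as a numpy array) should pinpoint the offending images.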
Hi, I use your codebase to do training on voc2012, but the mIoU and visualization results are really poor. Can you provide more details or pretrained checkpoints?
Thanks for your excellent work.
I am trying to fine-tune the SAM model on the Salient Object Detection task. I implemented my custom dataset based on BaseSemanticDataset
by substituting my own dataset_dir,
and replaced the CE loss with BCEWithLogitsLoss to fix some bugs.
However, the loss looks strange. Do you have any idea why?
iteration : 1, bce : -218.368408203125, total_loss : -109.1842041015625, time : 0
iteration : 3, bce : -2733.0740509033203, total_loss : -1366.5370254516602, time : 0
iteration : 5, bce : -67497.31298828125, total_loss : -33748.656494140625, time : 0
iteration : 7, bce : -19547019.328125, total_loss : -9773509.6640625, time : 0
iteration : 9, bce : -303343570888.0, total_loss : -151671785444.0, time : 0
iteration : 11, bce : -6.906153512921124e+19, total_loss : -3.453076756460562e+19, time : 0
iteration : 13, bce : -inf, total_loss : -inf, time : 0
iteration : 15, bce : nan, total_loss : nan, time : 0
iteration : 17, bce : nan, total_loss : nan, time : 0
iteration : 19, bce : nan, total_loss : nan, time : 0
iteration : 21, bce : nan, total_loss : nan, time : 0
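A steadily diverging, strongly negative BCE like the log above usually means the targets are not in [0, 1], e.g. a saliency mask stored as {0, 255}. A sketch of the fix, binarizing the target before the loss (function and variable names here are illustrative, not the repo's):

```python
import torch
import torch.nn.functional as F

def bce_mask_loss(logits, target_mask):
    """BCE-with-logits over a raw mask that may be stored as {0, 255}.

    binary_cross_entropy_with_logits assumes targets in [0, 1]; feeding 255
    directly produces large negative losses that quickly blow up to -inf/nan.
    """
    target = (target_mask > 0).float()  # map {0, 255} (or any positive) to {0, 1}
    return F.binary_cross_entropy_with_logits(logits, target)
```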
Issue: 'MaskDecoder' object has no attribute 'iou_head_depth'. What is the cause of this problem?
Thanks for your great work. I wonder whether the code for semantic segmentation has been finished.
As the title says.
Hello! How many GB of GPU memory does fine-tuning SAM require at minimum?
Thank you for your excellent work. When will the complete code be uploaded?
When training on 4 GPUs (Tesla K80, 11441 MiB)
with bs = 4 and workers = 4,
I always end up with the error
RuntimeError: CUDA error: the launch timed out and was terminated. (Is there a workaround, or is it due to my compute limitations?)
test code
After fine-tuning with my own data, I want to visualize the inference results.
Traceback (most recent call last):
File "D:\3. Develop\2.AI\segmentation_anything_yaml\train.py", line 32, in
train_dataset = get_dataset(train_cfg.dataset)
File "D:\segmentation_anything_yaml\datasets\__init__.py", line 28, in get_dataset
return segment_datasets[name](**cfg.params, transform=transform, target_transform=target_transform)
TypeError: __init__() got an unexpected keyword argument 'root'
I think it's a version issue. Can you tell me your versions?
My CUDA is 11.3 and my cuDNN is v8.2.0.53.