
ziqi-jin / finetune-anything

704 stars · 13 watchers · 53 forks · 110 KB

Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, and detection in specific scenarios

License: MIT License

Python 100.00%
computer-vision deep-learning segment-anything fine-tune

finetune-anything's People

Contributors

zhaoxiaodong789, ziqi-jin


finetune-anything's Issues

about TorchVOC finetuning results

Hello,
I ran your code on the TorchVOC dataset.

After 200 iterations, I got the following results:
[results screenshot]

It seems the mIoU is very low. Did you get similar results?
I used your default configuration file.

Thanks,
Liya

No prompts passed into ExtendSAM

Thank you very much for your meaningful work. I briefly debugged the semantic segmentation training and found that no prompt inputs are sent to ExtendSAM. My understanding is that it may be appropriate to use 32x32 grid points as a prompt and classify the resulting masks. Or perhaps I have misunderstood the purpose of your prompt inputs; I'm curious why they are set up this way.
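
For readers hitting the same question, here is a minimal sketch (not finetune-anything's actual code) of how a 32x32 grid of foreground point prompts could be built and handed to SAM's prompt encoder; the build_point_grid helper and the prompt_encoder call shown in the trailing comment are illustrative assumptions.

import torch

def build_point_grid(n_per_side: int = 32, img_size: int = 1024):
    # Evenly spaced cell centers over SAM's 1024x1024 input, returned as
    # (1, n*n, 2) pixel coords (x, y) plus (1, n*n) labels, where 1 = foreground point.
    offset = img_size / (2 * n_per_side)
    coords_1d = torch.linspace(offset, img_size - offset, n_per_side)
    ys, xs = torch.meshgrid(coords_1d, coords_1d, indexing="ij")
    points = torch.stack([xs.reshape(-1), ys.reshape(-1)], dim=-1)
    labels = torch.ones(points.shape[0], dtype=torch.int64)
    return points.unsqueeze(0), labels.unsqueeze(0)

points, labels = build_point_grid()
print(points.shape, labels.shape)  # torch.Size([1, 1024, 2]) torch.Size([1, 1024])
# These could then be passed to SAM's prompt encoder, e.g.
# sparse_emb, dense_emb = sam.prompt_encoder(points=(points, labels), boxes=None, masks=None)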

Why is the image_set in the config file set to train for val?

Hi, I have a few questions about the config file.

  1. Why is the image_set in the config file set to train for val? Won't this cause the validation benchmark to run on the training set instead of the validation set?

  2. Was the mIoU of 78 you reported obtained with the current version of FA? If not, could you update the code and provide some insights about the new version of FA?

Thanks.

only support VOC dataset?

Hi. Do you only support the VOC dataset? Could you support more dataset formats? And could you please share some inference results if you have any?

Thank you so much for the contribution.

Forecast!

Thank you for your kind answer! After fine-tuning, how do you run prediction on new samples?

Is fine-tuning SAM better than training a small model from scratch?

As the title says: if SAM performs poorly on certain data, it is most likely because SAM's training set does not cover those scenes. Fine-tuning SAM also requires considerable resources. Can simply fine-tuning SAM bridge the domain gap between datasets, and would it be better than training a smaller model from scratch?

Finetuning with Customized Dataset

It's not really an issue, but I want to do segmentation on point clouds and I have .laz files for fine-tuning the SAM model. Basically I should create a customized dataset for that, but I couldn't figure it out properly. Can you explain step by step how to fine-tune with a customized dataset?

test code

I am very eager to see the author's test code.

Semantic Segmentation (Customized Dataset)

Thank you very much for your work 👍
I want to train with my own dataset. Could you give guidelines for the dataset layout in tree form?
I'm not sure what kinds of files I need or how they should be structured.
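
For anyone with the same question, here is a minimal sketch of a custom semantic segmentation dataset, assuming an images/ + class_labels/ layout similar to the paths that appear in other issues here; the folder names and whether finetune-anything's BaseSemanticDataset expects exactly this interface are assumptions to verify against the repo.

import os
from PIL import Image
from torch.utils.data import Dataset

# Assumed layout (illustrative, not the repo's official spec):
#   my_dataset/
#     train/images/xxx.jpg          RGB images
#     train/class_labels/xxx.png    single-channel masks, pixel value = class id
class MySemanticDataset(Dataset):
    def __init__(self, root, split="train", transform=None, target_transform=None):
        self.img_dir = os.path.join(root, split, "images")
        self.ann_dir = os.path.join(root, split, "class_labels")
        self.names = sorted(os.path.splitext(f)[0] for f in os.listdir(self.img_dir))
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.img_dir, name + ".jpg")).convert("RGB")
        mask = Image.open(os.path.join(self.ann_dir, name + ".png"))  # pixel value = class id
        if self.transform is not None:
            image = self.transform(image)
        if self.target_transform is not None:
            mask = self.target_transform(mask)
        return image, mask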

Performance

Hello,

Thanks for this amazing repo! I was wondering what mIoU you got on the PASCAL VOC validation set. I trained for about 50,000 iterations and got an mIoU of about 4.12. What performance should I be expecting?

Multi-GPU training

Hello, when I run multi-GPU training I specify two GPUs, but only the first GPU's memory usage goes up while the other stays unchanged, and then I run out of GPU memory. How can I solve this?
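
For reference, a minimal sketch of single-node multi-GPU data parallelism with torch.nn.DataParallel; whether finetune-anything already wraps its model this way is an assumption, so this only illustrates the general PyTorch mechanism that makes memory usage rise on every listed GPU.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())  # stand-in model
if torch.cuda.device_count() > 1:
    # Replicates the model on each listed GPU and splits every batch across them.
    model = nn.DataParallel(model, device_ids=[0, 1])
model = model.cuda()

images = torch.randn(8, 3, 256, 256).cuda()  # the batch is scattered across GPUs
outputs = model(images)
print(outputs.shape)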

mIoU

I would like to know, from anyone who has used the available script for a semantic segmentation task: what was the best mIoU you could obtain, and with how many epochs and what batch size?

Further details

Hi, will there be a paper for this or other documents describing all the changes made to the original SAM architecture?

Multiple BBOX

Can I fine-tune the model using multiple bounding boxes for the same image?
(e.g., with the KITTI dataset, if I want to fine-tune the model to segment cars from the background and there are multiple car instances in the frame.)
Is there a way I can provide the bounding boxes for all cars in that frame as my prompt, or should I iteratively give the bounding box and segmentation of every single car instance?
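
Not an answer from the repo author, but for reference: the upstream segment_anything package accepts batched box prompts for a single image. A minimal sketch follows; it uses the official SamPredictor rather than this repo's ExtendSAM wrapper, and the checkpoint path and placeholder image are assumptions.

import numpy as np
import torch
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # example checkpoint path
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # replace with your HxWx3 RGB frame
predictor.set_image(image)

# One row per car: (x1, y1, x2, y2) in original image coordinates.
boxes = torch.tensor([[50, 60, 200, 220],
                      [250, 100, 400, 300]], dtype=torch.float, device=predictor.device)
boxes = predictor.transform.apply_boxes_torch(boxes, image.shape[:2])

masks, scores, _ = predictor.predict_torch(
    point_coords=None,
    point_labels=None,
    boxes=boxes,
    multimask_output=False,
)
print(masks.shape)  # (num_boxes, 1, H, W): one mask per box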

fine tuning sam

Hello team,

great work.

I had one doubt: during fine-tuning, what is used as the ground truth when computing the loss against the masks generated by SAM from a given prompt?

training of custom resolution images.

Hey, I want to fine-tune the model on my custom dataset. The images and masks have 1K resolution and cannot be downscaled or resized, otherwise important details will be lost. If anyone has a suggestion, please let me know. ;)

RuntimeError: CUDA error: device-side assert triggered

I used my own dataset for fine-tuning, but I got the following error:
img_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/train/images, ann_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/train/class_labels
img_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/val/images, ann_folder_name: /mnt/hdd0/jwc/代码仓/finetune-anything/Datasetv1/train-and-val/val/class_labels
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [11,0,0], thread: [503,0,0] Assertion t >= 0 && t < n_classes failed.
(the same assertion is repeated for many other threads)
Traceback (most recent call last):
File "/mnt/hdd0/jwc/代码仓/finetune-anything/train.py", line 45, in
runner.train(train_cfg)
File "/mnt/hdd0/jwc/代码仓/finetune-anything/extend_sam/runner.py", line 62, in train
self._compute_loss(total_loss, loss_dict, masks_pred, labels, cfg)
File "/mnt/hdd0/jwc/代码仓/finetune-anything/extend_sam/runner.py", line 127, in _compute_loss
loss_dict[item[0]] = tmp_loss.item()
RuntimeError: CUDA error: device-side assert triggered
The problem does not occur if I only use part of the dataset, but when I train on the full dataset the error above appears. My labels are 0-12 plus 255; I set the number of classes to 13, and 255 is supposed to be ignored. My coding skills are not very strong, and your work is excellent. Could you give me some guidance and help me solve this problem? I hope you have time to reply. Thank you!
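
Not the author's answer, but a common cause worth noting: this assert fires when a label value outside [0, num_classes), other than the ignore index, reaches the NLL/cross-entropy kernel. Below is a minimal sketch of the usual fix, assuming the loss is a plain nn.CrossEntropyLoss; whether finetune-anything's loss config already passes ignore_index=255 is something to verify in its YAML and runner code.

import torch
import torch.nn as nn

num_classes = 13
criterion = nn.CrossEntropyLoss(ignore_index=255)  # pixels labeled 255 no longer reach the kernel

logits = torch.randn(2, num_classes, 64, 64)         # (B, C, H, W) mask logits
labels = torch.randint(0, num_classes, (2, 64, 64))  # stand-in labels
labels[0, :8, :8] = 255                              # ignored region

# Cheap sanity check (run on CPU) to find bad label values before training:
bad = labels[(labels != 255) & ((labels < 0) | (labels >= num_classes))]
assert bad.numel() == 0, f"out-of-range label values: {bad.unique()}"

loss = criterion(logits, labels)
print(loss.item())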

Finetune on custom task

Thanks for your excellent work.
I am trying to fine-tune the SAM model on the Salient Object Detection task. I implemented my custom dataset based on BaseSemanticDataset by replacing dataset_dir with my own, and I replaced the CE loss with BCEWithLogitsLoss to fix some bugs.
However, the loss looks strange. Do you have any idea why?

iteration : 1, bce : -218.368408203125, total_loss : -109.1842041015625, time : 0
iteration : 3, bce : -2733.0740509033203, total_loss : -1366.5370254516602, time : 0
iteration : 5, bce : -67497.31298828125, total_loss : -33748.656494140625, time : 0
iteration : 7, bce : -19547019.328125, total_loss : -9773509.6640625, time : 0
iteration : 9, bce : -303343570888.0, total_loss : -151671785444.0, time : 0
iteration : 11, bce : -6.906153512921124e+19, total_loss : -3.453076756460562e+19, time : 0
iteration : 13, bce : -inf, total_loss : -inf, time : 0
iteration : 15, bce : nan, total_loss : nan, time : 0
iteration : 17, bce : nan, total_loss : nan, time : 0
iteration : 19, bce : nan, total_loss : nan, time : 0
iteration : 21, bce : nan, total_loss : nan, time : 0
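
Not the author's fix, but one common cause of a BCE loss that goes strongly negative and then to inf/nan is targets outside [0, 1], for example saliency masks stored as 0/255. A minimal sketch of the check and fix follows; whether this is what happens inside finetune-anything's runner is an assumption to verify.

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

logits = torch.randn(2, 1, 64, 64)                                 # raw mask logits
targets_255 = torch.randint(0, 2, (2, 1, 64, 64)).float() * 255.0  # masks stored as 0/255

# Wrong: targets in {0, 255} can push the loss far outside its valid range.
# print(criterion(logits, targets_255))

# Fix: binarize/scale the targets to {0.0, 1.0} before computing the loss.
targets = (targets_255 > 127).float()
loss = criterion(logits, targets)
print(loss.item())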

segmentation!

Issue: 'MaskDecoder' object has no attribute 'iou_head_depth'. What is the cause of this problem?

Semantic Segmentation

Thanks for your great work. I wonder whether the code for semantic segmentation has been finished.

GPU memory requirements

Hello! How many GB of GPU memory are needed at minimum to fine-tune SAM?

Complete code

Thank you for your excellent work. When will the complete code be uploaded?

The loss does not decrease when training my custom dataset with cross-entropy loss

Thank you for your excellent work.
When I train using cross-entropy loss, the loss consistently does not decrease and the mIoU (mean Intersection over Union) stays around 7. I'm not sure if there is a solution to this issue.
[screenshot of the training log]

Parallel Training

When training on 4 GPUs (Tesla K80, 11441 MiB each) with bs = 4 and workers = 4, I always end up with the error:

RuntimeError: CUDA error: the launch timed out and was terminated

Is there a workaround, or is this due to my compute limitations?

I want to train with my own data, but I ran into an error.

Traceback (most recent call last):
File "D:\3. Develop\2.AI\segmentation_anything_yaml\train.py", line 32, in
train_dataset = get_dataset(train_cfg.dataset)
File "D:\segmentation_anything_yaml\datasets_init_.py", line 28, in get_dataset
return segment_datasets[name](**cfg.params, transform=transform, target_transform=target_transform)
TypeError: __init__() got an unexpected keyword argument 'root'

How can I resolve this TypeError?

I think it may be a version issue. Can you tell me your versions? My CUDA is 11.3 and cuDNN is v8.2.0.53.
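
Not the repo author's answer, but this traceback usually means the dataset constructor does not accept the keyword arguments (here root) that get_dataset forwards from **cfg.params. A minimal sketch of a matching constructor signature follows; the exact keys in your YAML (root, image_set, ...) are assumptions.

from torch.utils.data import Dataset

class MyDataset(Dataset):
    # get_dataset() calls the dataset as cls(**cfg.params, transform=..., target_transform=...),
    # so every key in cfg.params must appear here (or be absorbed by **kwargs).
    def __init__(self, root, image_set="train", transform=None,
                 target_transform=None, **kwargs):
        self.root = root
        self.image_set = image_set
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return 0  # replace with your sample count

    def __getitem__(self, idx):
        raise NotImplementedError  # replace with image/mask loading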
