camenduru / grounded-segment-anything-colab

Grounding DINO with Segment Anything & Stable Diffusion colab

License: The Unlicense

Jupyter Notebook 100.00%

colab colab-notebook colaboratory inpaint inpainting segment segmentation stable-diffusion

grounded-segment-anything-colab's Introduction

🐣 Please follow me for new updates https://twitter.com/camenduru
🔥 Please join our discord server https://discord.gg/k5BwmmvJJU
🥳 Please join my patreon community https://patreon.com/camenduru

🚦 WIP 🚦

🦒 Colab

Colab	Info
	grounded-segment-anything-colab: runwayml/stable-diffusion-inpainting
🚦 WIP 🚦	grounded-segment-anything-custom: Custom Inpainting Model. Maybe only 16-bit inpainting diffuser models are compatible with the free T4 😐
	16-bit models: ckpt/dreamlike-diffusion-1.0-inpainting, runwayml/stable-diffusion-inpainting, ckpt/f222-inpainting, ckpt/realistic_vision_inpainting, ckpt/SS_0.15_x_protogen-inpainting, ckpt/PhotoMerge-inpainting, ckpt/AniMerge-inpainting

Main Repo

https://github.com/IDEA-Research/Grounded-Segment-Anything

Paper

https://arxiv.org/abs/2304.02643
https://arxiv.org/abs/2303.05499

Tutorial

https://www.youtube.com/watch?v=A7x513Ah1Zk

grounded-segment-anything-colab's Issues

Slow inference on Google Colab A100 GPU

Hi, thanks for this repo!

I am getting very slow inference compared to the Grounded-SAM HF demo. For example:

20 sec on a T4 GPU
https://huggingface.co/spaces/IDEA-Research/Grounded-SAM

200 sec on an A100 GPU (which should be about 10x faster than a T4)
https://github.com/camenduru/grounded-segment-anything-colab/

Did you also experience this?

Gradio interface runs indefinitely with no output

Nothing shows in the notebook either; it just seems to hang indefinitely.

Error while using Inpainting

/usr/local/lib/python3.9/dist-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint.StableDiffusionInpaintPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
0% 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/content/Grounded-Segment-Anything/gradio_app.py", line 257, in run_grounded_sam
image = pipe(prompt=inpaint_prompt, image=image_pil, mask_image=mask_pil).images[0]
File "/usr/local/lib/python3.9/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 854, in __call__
latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 64 but got size 120 for tensor number 2 in the list.
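
The failing call is a `torch.cat` over latent tensors whose spatial sizes differ (64 versus 120). The Stable Diffusion VAE downsamples by a factor of 8, so 64 corresponds to 512 pixels and 120 to 960 pixels. One plausible reading is that the pipeline created latents at its default 512-pixel size while the masked image kept a 960-pixel side. A minimal sketch of one workaround under that assumption: resize the image and the mask to a shared size that is a multiple of 8, then pass the same `height` and `width` to the pipeline. The helper names and the 512-pixel cap are illustrative and not part of `gradio_app.py`.

```python
def snap_to_multiple(value, multiple=8):
    """Round value down to the nearest multiple, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


def inpaint_size(width, height, max_side=512, multiple=8):
    """Scale (width, height) so the longer side is at most max_side,
    then snap both sides down to a multiple of `multiple`."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


def prepare_inpaint_inputs(image_pil, mask_pil, max_side=512):
    """Resize a PIL image and its mask to the same inpainting-friendly size."""
    width, height = inpaint_size(*image_pil.size, max_side=max_side)
    image = image_pil.resize((width, height))
    mask = mask_pil.resize((width, height))
    return image, mask, width, height


# Assumed usage with the variables named in the traceback above:
# image, mask, width, height = prepare_inpaint_inputs(image_pil, mask_pil)
# result = pipe(prompt=inpaint_prompt, image=image, mask_image=mask,
#               height=height, width=width).images[0]
```

Resizing changes the output resolution, so the result may need to be scaled back to the original size afterwards.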

No such file or directory: '/home/ecs-user/download/sam_vit_h_4b8939.pth'

Model loaded from /root/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/6fb3434d67548d71747b1ab3a32051d27a30c71f/groundingdino_swint_ogc.pth
=> _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
/usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py:830: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
/usr/local/lib/python3.9/dist-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1108, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 915, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/content/Grounded-Segment-Anything/gradio_app.py", line 193, in run_grounded_sam
predictor = SamPredictor(build_sam(checkpoint=sam_checkpoint))
File "/usr/local/lib/python3.9/dist-packages/segment_anything/build_sam.py", line 15, in build_sam_vit_h
return _build_sam(
File "/usr/local/lib/python3.9/dist-packages/segment_anything/build_sam.py", line 104, in _build_sam
with open(checkpoint, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/ecs-user/download/sam_vit_h_4b8939.pth'
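
The path in this error (`/home/ecs-user/download/...`) looks like a location from another machine and does not exist on a Colab runtime, so `build_sam` cannot open it. A minimal sketch of one way to guard against this, assuming the checkpoint should live at a local path of your choosing: check for the file before building the model and fetch it if it is missing. The URL is the ViT-H checkpoint link published in the segment-anything README; `ensure_checkpoint` and the `/content/...` path in the usage comment are hypothetical.

```python
import os
from urllib.request import urlretrieve

# ViT-H checkpoint link listed in the segment-anything README.
SAM_VIT_H_URL = "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth"


def ensure_checkpoint(path, url=SAM_VIT_H_URL, fetch=urlretrieve):
    """Return `path`, downloading the file from `url` first if it is missing.

    `fetch` is injectable so the download step can be replaced in tests.
    """
    if not os.path.isfile(path):
        # Create the parent directory before downloading into it.
        os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
        fetch(url, path)
    return path


# Assumed usage before the line that fails in the traceback above:
# sam_checkpoint = ensure_checkpoint("/content/sam_vit_h_4b8939.pth")
# predictor = SamPredictor(build_sam(checkpoint=sam_checkpoint))
```

The checkpoint is a large file, so the first call can take a while; later calls return immediately once the file exists.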

Error while trying to use inpaint

TypeError: 'NoneType' object is not callable
final text_encoder_type: bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight']

This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Model loaded from /root/.cache/huggingface/hub/models--camenduru--GroundingDINO/snapshots/3d3869e41435a5dc9620b6b9bd37d2abb071e1c1/groundingdino_swint_ogc.pth
=> _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
/usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py:830: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
/usr/local/lib/python3.9/dist-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")