
junyi42 / sd-dino


Official Implementation of paper "A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence"

Home Page: https://sd-complements-dino.github.io

Shell 0.04% Jupyter Notebook 91.50% Python 8.46%

sd-dino's Issues

Establish environment

Hello, I am very interested in your work, but I ran into difficulties when setting up the environment. I followed the steps in the README, but something goes wrong along the way and I don't know how to fix it.
image

AttributeError: module 'keras.backend' has no attribute 'is_tensor'

Hello, I'm sorry to bother you again. I've run into a version issue: my TensorFlow and Keras versions are both 2.13.1, and I get the error above. Could you please tell me which Keras version this code requires? I couldn't find any helpful answers online, and a global search of the code turned up no occurrences of "is_tensor".
Thanks!
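
Since "is_tensor" does not appear in the repository, the call presumably comes from an installed dependency rather than from sd-dino itself (an assumption, not confirmed by the maintainers). A minimal sketch for reporting the installed versions when posting such an error; the helper name `installed_versions` is hypothetical:

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "keras")):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Including this output in the report, along with the full traceback, shows which package actually makes the failing call.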

Model parameter mismatch

Hi, thanks for sharing the codes.

I ran into a problem when running the demo code. I followed every setup step in the README without changing anything, but the downloaded pre-trained weights do not seem to match the model:

image

so I got the results which are very different from yours:
image

This problem also occurs when I run Geoaware-SC. Could you give me some advice on how to solve this?
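
One way to narrow down a mismatch like this is to compare parameter names and shapes between the model and the checkpoint. The sketch below is a generic check, not the repository's loading code; `compare_shapes` and the toy parameter names are hypothetical:

```python
def compare_shapes(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing     names the model expects but the checkpoint lacks
      unexpected  names in the checkpoint that the model does not have
      mismatched  {name: (model_shape, ckpt_shape)} where the shapes differ
    """
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    mismatched = {
        name: (tuple(model_shapes[name]), tuple(ckpt_shapes[name]))
        for name in sorted(set(model_shapes) & set(ckpt_shapes))
        if tuple(model_shapes[name]) != tuple(ckpt_shapes[name])
    }
    return missing, unexpected, mismatched


# Toy example with made-up parameter names (not taken from the repository).
model_shapes = {"backbone.conv.weight": (64, 3, 3, 3), "head.weight": (10, 64)}
ckpt_shapes = {
    "backbone.conv.weight": (64, 3, 3, 3),
    "head.weight": (20, 64),
    "extra.bias": (20,),
}
missing, unexpected, mismatched = compare_shapes(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

With PyTorch, the two mappings can be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}` and the equivalent over the loaded checkpoint's state dict. Some checkpoints nest the state dict under a top-level key, so it may need unwrapping first.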

Result different from demo_vis_features.ipynb

Hello @Junyi42, thanks for your contribution. I ran "demo_vis_features.ipynb" on the dog image provided in the default image folder, and my results differ from yours. Your masked PCA result was

image

while I am getting
image

Also, my clustering is

clustering

I didn't change anything in the code; I only moved everything from the .ipynb into a .py file. These outputs are the PNG files written to the results_vis folder.

Installation issues for Mask Former

Hello @Junyi42 ,
Thanks for your contribution. I am facing an installation issue when running the "pip install -e ." command. It fails with the following error:

Emitting ninja build file /BS/keytr_neus/work/supplementary/sd-dino/third_party/Mask2Former/build/temp.linux-x86_64-cpython-39/build.ninja...

error: [Errno 2] No such file or directory: '/BS/keytr_neus/work/supplementary/sd-dino/third_party/Mask2Former/build/temp.linux-x86_64-cpython-39/build.ninja'

ERROR: Failed building wheel for mask2former

ERROR: Could not build wheels for mask2former, which is required to install pyproject.toml-based projects

Please help me with this.

Colab Demo

Thank you for the amazing work! I am trying to visualize the feature maps for DINO and SD. Do you have a Colab notebook that I can use to run it?

get_mask cannot return valid mask

Hi! When running the demo,

src_img_path = "data/images/dog_00.jpg"
trg_img_path = "data/images/dog_59.jpg"
result = process_images(src_img_path, trg_img_path)

I found that the get_mask function does not return a valid mask; it returns an all-ones matrix instead. Is this a bug?

if DRAW_DENSE:
    if not Anno:
        mask1 = get_mask(model, aug, img1, category[0])
        mask2 = get_mask(model, aug, img2, category[-1])
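
A quick way to confirm a report like this is to measure how much of the mask is foreground. The following is a minimal sketch, assuming the mask can be converted to a NumPy array; `mask_coverage` and `is_degenerate` are hypothetical helpers, not part of the repository:

```python
import numpy as np


def mask_coverage(mask):
    """Fraction of nonzero entries (0.0 = empty mask, 1.0 = everything is foreground)."""
    arr = np.asarray(mask)
    if arr.size == 0:
        raise ValueError("mask has no elements")
    return float(np.count_nonzero(arr)) / arr.size


def is_degenerate(mask, tol=1e-3):
    """True if the mask is (almost) all zeros or (almost) all ones."""
    coverage = mask_coverage(mask)
    return coverage <= tol or coverage >= 1.0 - tol


# Toy masks: an all-ones mask like the one reported, and a half-filled one.
all_ones = np.ones((4, 4))
half = np.zeros((4, 4))
half[:2] = 1
print(is_degenerate(all_ones), mask_coverage(half))
```

Logging the coverage of mask1 and mask2 right after the calls above would show whether the segmentation step fails for both images or only one.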

cannot `get_mask` when I vary the cuda device

Hello Junyi, great job! Everything seems to work when calling get_features in extractor_sd.py on cuda:3, but inference still fails even after I change the body of

def inference(model, aug, image, vocab, label_list):

from

demo = StableDiffusionSeg(inference_model, demo_metadata, aug)
pred = demo.predict(np.array(image))

to

demo = StableDiffusionSeg(inference_model, demo_metadata, aug)
demo.model = demo.model.to(torch.device("cuda:3"))
pred = demo.predict(np.array(image))

I guess the main problem is that the decoder part of the model is loaded onto the wrong device, but I'm not sure how to fix it.
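
A common workaround when part of a pipeline still allocates on the default GPU is to expose only the desired GPU to the process through the CUDA_VISIBLE_DEVICES environment variable, so that it appears as cuda:0 everywhere. Whether this resolves the failure described above is an assumption; `pin_visible_gpu` is a hypothetical helper:

```python
import os


def pin_visible_gpu(index):
    """Expose only the physical GPU `index` to this process.

    Must run before CUDA is initialised; calling it before importing torch
    is the safe choice. The selected GPU is then addressed as "cuda:0".
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


pin_visible_gpu(3)  # physical GPU 3, as in the report above
# import torch      # import torch only after the variable has been set
```

The same effect can be had without code changes by setting the variable in the shell when launching the script, in which case every device reference in the code should stay at its default.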

License?

Hi,

Thanks for this awesome work! 🤩

DINO and StableDiffusion works have MIT licenses. Is your work also MIT?

Best,
Iago.

Details about how to extract sd features

Hi Junyi,

I am confused about how to extract SD features. The file extractor_sd.py seems to output a feature of shape [1, 1280, 16, 16] without obvious semantic information, and it seems to use the model weights from the ODISE project. Could you please provide a script to easily extract and visualize the SD features using publicly available Stable Diffusion model weights? Thanks a lot!

image
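
A raw 1280-channel map is hard to read directly; projecting it onto a few principal components often makes structure visible. The sketch below is a generic PCA projection, not the repository's visualization code, and it uses a random array as a stand-in for a real feature of shape [1, 1280, 16, 16]; `pca_to_rgb` is a hypothetical helper:

```python
import numpy as np


def pca_to_rgb(features):
    """Project a (C, H, W) feature map onto its top three principal components.

    Returns an (H, W, 3) float array with each channel scaled to [0, 1].
    Requires C >= 3 and H * W >= 3.
    """
    feats = np.asarray(features, dtype=np.float64)
    c, h, w = feats.shape
    flat = feats.reshape(c, h * w).T      # (H*W, C): one row per spatial location
    flat = flat - flat.mean(axis=0)       # centre every channel
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    proj = flat @ vt[:3].T                # (H*W, 3)
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    proj = (proj - lo) / np.maximum(hi - lo, 1e-12)
    return proj.reshape(h, w, 3)


rng = np.random.default_rng(0)
feature = rng.standard_normal((1, 1280, 16, 16))  # random stand-in, not a real SD feature
rgb = pca_to_rgb(feature[0])
print(rgb.shape)  # (16, 16, 3)
```

The result can be passed to an image viewer such as matplotlib's imshow. With random input the picture is noise; on real features, nearby colours indicate locations with similar descriptors.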

Questions about sd features

Hello, I would like to know whether the features from layers 2, 5, and 8 mentioned in the paper refer to the outputs of the actual layers 2, 5, and 8, or to the results obtained after processing with the UpSample block. I find the feature extraction in the code a bit hard to follow. I hope to receive your reply. Thank you!
