Any plans for a diffusers version? (vico issue, closed)

haoosz commented on July 24, 2024
Any plans for a diffusers version?

from vico.

Comments (8)

okaris commented on July 24, 2024

I am working on this.

haoosz commented on July 24, 2024

Yes. We will release a diffusers version once the rest of the work is done. Thank you.

tonyf commented on July 24, 2024

Amazing! Looking forward to seeing it. Just curious: is there an expected timeline for the diffusers version? I'm debating whether to implement it myself.

haoosz commented on July 24, 2024

Sorry, I am occupied with follow-up work and may not get to the diffusers version right now. I plan to work on it in August. If that is too late for you, I would be very glad for you to implement it yourself. Thank you!

garychan22 commented on July 24, 2024

I have finished a diffusers version, but simply feeding the reference image to the frozen UNet and running Otsu thresholding is slow, which is weird. Hahaha.
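For readers unfamiliar with the Otsu step mentioned here, the following is a minimal NumPy sketch of histogram-based Otsu thresholding applied to a single 2D map. It is not the code from this repo or from either diffusers port; the function names `otsu_threshold` and `foreground_mask`, the 256-bin histogram, and the optional `mask` argument are all assumptions made for illustration.

```python
import numpy as np


def otsu_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Pick the histogram split that maximizes between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weights = hist.astype(np.float64)

    w0 = np.cumsum(weights)            # count in bins 0..k (background)
    w1 = w0[-1] - w0                   # count in bins k+1.. (foreground)
    s0 = np.cumsum(weights * centers)  # running sum of bin-center values
    m0 = s0 / np.maximum(w0, 1e-12)    # approximate background mean
    m1 = (s0[-1] - s0) / np.maximum(w1, 1e-12)  # approximate foreground mean

    between = w0 * w1 * (m0 - m1) ** 2
    k = int(np.argmax(between))
    return float(edges[k + 1])         # right edge of the last background bin


def foreground_mask(attn: np.ndarray, mask=None) -> np.ndarray:
    """Hypothetical helper: use a supplied mask if given, else fall back to Otsu."""
    if mask is not None:
        return np.asarray(mask).astype(bool)
    return attn >= otsu_threshold(attn)


# Toy map: a bright 8x8 square on a dark 16x16 background.
attn = np.zeros((16, 16))
attn[4:12, 4:12] = 1.0
auto_mask = foreground_mask(attn)
```

Passing a precomputed `mask` skips the histogram work entirely, which is one way the cost of thresholding at every step could be avoided if masks are already available from preprocessing.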

okaris commented on July 24, 2024

@garychan22 I've also recently finished it and have been tuning the hyperparameters to fit my needs. Otsu itself is the bottleneck. The point of having it is to avoid a preprocessing step, but if you are already preprocessing, a manually supplied mask could replace it and speed things up. Beyond that, this repo does not take advantage of higher-performance attention processors. You can't use them for the attention calculations where you need to extract the scores, but elsewhere it's possible to use xformers or PyTorch's scaled_dot_product_attention for faster calculations.
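The trade-off described above can be sketched in plain PyTorch. This is an illustration under assumptions, not code from this repo or from diffusers: the function names and tensor shapes are made up for the example. The explicit version materializes the softmax attention map so it can be inspected, while `torch.nn.functional.scaled_dot_product_attention` (PyTorch 2.0 or later) returns only the output.

```python
import math

import torch
import torch.nn.functional as F


def attention_with_scores(q, k, v):
    """Explicit attention: returns the output and the softmax attention map."""
    scale = 1.0 / math.sqrt(q.shape[-1])
    scores = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return scores @ v, scores


def attention_fast(q, k, v):
    """Fused attention: same output, but the attention map is never exposed."""
    return F.scaled_dot_product_attention(q, k, v)


torch.manual_seed(0)
# Illustrative shapes: (batch, heads, tokens, head_dim)
q = torch.randn(2, 8, 64, 40)
k = torch.randn(2, 8, 77, 40)
v = torch.randn(2, 8, 77, 40)

out_slow, scores = attention_with_scores(q, k, v)  # use where scores are needed
out_fast = attention_fast(q, k, v)                 # use everywhere else
```

Both paths should agree on the output up to floating-point tolerance, so a port could plausibly keep the explicit path only for the layers whose scores are extracted.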

Were you able to replicate the results exactly like the samples here?

okaris commented on July 24, 2024

Also if you would like to submit a PR, here is my issue: huggingface/diffusers#3719

garychan22 commented on July 24, 2024

Thanks for the useful tips! So far I have not been able to replicate results similar to this repo's, and I will keep working on it.

Moreover, I have been training my own BLIP-Diffusion and found that it can achieve better results than DreamBooth within one minute of fine-tuning, which is awesome. I hope to replicate the results shown in the paper and release the pre-trained model to the hub soon.
