
rq-wu / lamp

Official implementation of LAMP: Learn a Motion Pattern by Few-Shot Tuning a Text-to-Image Diffusion Model (few-shot text-to-video diffusion)

Home Page: https://rq-wu.github.io/projects/LAMP/index.html

License: Other

Python 100.00%
aigc diffusion diffusion-model diffusion-models few-shot-learning stable-diffusion text-to-video video-editing

lamp's Introduction

武睿祺(Ruiqi Wu)

I am a master's student at TMCC, College of Computer Science, Nankai University, China, under the supervision of Prof. Ming-Ming Cheng & Dr. Chun-Le Guo. My research interests are computer vision and machine learning, focusing on AIGC and low-level vision.



lamp's People

Contributors

anonymous-3917, eltociear, guspan-tanadi, rq-wu, shashwatnigam99


lamp's Issues

plans for Google Drive?

Hi there - amazing work! Just wondering when you are planning to upload the models to Google Drive - excited to play with them.

Is one-shot learning possible?

Great Work!

I have a question about the setting. Is LAMP suitable only for few-shot learning, or is it also suitable for one-shot learning?
In other words, does LAMP always require 8~16 videos, or is one video enough too?

Thank you in advance.

Regarding the paper

Hi, thank you for the interesting work. I have a question about the proposed method.

a 2D convolution with an output channel of 1 along with a Sigmoid function is added

self.conv_gate = nn.Conv2d(out_channels, 1, 3, stride=1, padding=1)  # c -> 1 channel gate
x_gate = rearrange(x_2d, "b c f h w -> (b f) c h w")  # fold frames into the batch dimension
c = x_gate.shape[1]
x_gate = self.sigmoid(self.conv_gate(x_gate)).repeat(1, c, 1, 1)  # (b*f, 1, h, w) -> (b*f, c, h, w)

I would like to know what the insight is behind using a c -> 1 channel convolution and then repeating the result back to c channels. As a side question, what is the purpose of applying a sigmoid function to this branch before multiplying it with the conv_1d output?
Thanks.
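
For readers following along, here is a minimal, self-contained sketch of the shape bookkeeping in the snippet above. The tensor sizes, the variable conv_1d_out, and the final gated multiplication are assumptions made for illustration, not the authors' exact module:

import torch
import torch.nn as nn
from einops import rearrange

b, c, f, h, w = 2, 8, 16, 32, 32
x_2d = torch.randn(b, c, f, h, w)           # output of the 2D spatial branch (assumed layout)
conv_1d_out = torch.randn(b * f, c, h, w)   # hypothetical output of the temporal 1D-conv branch

conv_gate = nn.Conv2d(c, 1, 3, stride=1, padding=1)
x_gate = rearrange(x_2d, "b c f h w -> (b f) c h w")   # (b*f, c, h, w)
gate = torch.sigmoid(conv_gate(x_gate))                # (b*f, 1, h, w), values squashed to (0, 1)
gate = gate.repeat(1, c, 1, 1)                         # broadcast the single-channel mask to all c channels
fused = gate * conv_1d_out                             # soft, per-pixel gating of the temporal branch
print(fused.shape)                                     # torch.Size([32, 8, 32, 32])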

About multi-action

Thank you for your excellent work! I tried to train three actions at the same time (horse run, birds fly, waterfall), but the result is not as good as for a single action. Can you give me some suggestions?

Evaluation code

Hi! I was wondering if you could share the evaluation code used in LAMP or point me to references that you used for the results reported in the paper? Thank you!

inference_script

Hi, thanks a lot for your interesting work! I know that in your paper you explain that you use the T2I model to generate the first frame during inference, but there doesn't seem to be any code in the "inference_script" that generates the first frame. I'm wondering if I'm mistaken.
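
While waiting for an answer, a first frame can be generated with an off-the-shelf Stable Diffusion text-to-image pipeline. This is only a generic sketch with a placeholder model ID and prompt, not the repository's actual inference path:

import torch
from diffusers import StableDiffusionPipeline

# Plain text-to-image call; LAMP's inference script may wire the first frame differently.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
first_frame = pipe("a horse running on the grassland", num_inference_steps=50).images[0]
first_frame.save("first_frame.png")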

Script for evaluating the model

Thank you for your excellent work. I have noticed that you provided some functions to evaluate the model quantitatively. Can you provide a script to directly evaluate these metrics (i.e., alignment, consistency, diversity)? Thank you very much.
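
As a stopgap until an official script is released (this is not the authors' evaluation code), text-frame alignment and frame consistency are often approximated with CLIP similarities; the exact metric definitions below are assumptions, not taken from the paper:

import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_scores(frames, prompt):
    # frames: list of PIL images from one generated video; prompt: its text condition
    inputs = processor(text=[prompt], images=frames, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)  # (F, D)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)    # (1, D)
    alignment = (img @ txt.T).mean().item()    # mean text-frame cosine similarity
    consistency = (img @ img.T).mean().item()  # mean pairwise frame similarity (includes diagonal)
    return alignment, consistency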

Question about the training time

Thanks for your great work! I have a question about the training time: when I train horse_running on my GPU (RTX 3090, 24 GB), it shows about 14 hours for training. I want to know whether this is normal?
Expecting your reply!

multi-gpu training

Thanks for your nice work! I want to ask how to conduct multi-GPU training with your code. I set CUDA_VISIBLE_DEVICES=0,1, but it does not work. Hoping for your reply!
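
A note while waiting for a reply (not verified against this repo): CUDA_VISIBLE_DEVICES only controls which GPUs are visible; it does not by itself parallelize training. If the training script is built on Hugging Face accelerate, as is common for diffusion fine-tuning code, multi-GPU runs are usually started with a launcher, for example "accelerate launch --multi_gpu --num_processes 2 train.py ..." or the PyTorch equivalent "torchrun --nproc_per_node=2 train.py ..." (the script name here is a placeholder).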
