brandontrabucco / da-fusion
Effective Data Augmentation With Diffusion Models
License: MIT License
Hello, I am very interested after reading your paper! Could you tell us how to reproduce your results? I ran into a problem during reproduction:
huggingface_hub.utils._headers.LocalTokenNotFoundError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
Sorry for the trouble!
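For context, this error means huggingface_hub cannot find a cached access token. A minimal stdlib-only sketch of making one visible before the pipeline is loaded (equivalent to running `huggingface-cli login`; the cache path below is the one used by recent huggingface_hub versions, so treat it as an assumption, and `hf_your_token_here` is a placeholder):

```python
import os
from pathlib import Path

# Write a Hugging Face access token where huggingface_hub looks for it.
# "hf_your_token_here" is a placeholder; create a real read token at
# https://huggingface.co/settings/tokens. Recent huggingface_hub versions
# read ~/.cache/huggingface/token (assumption; older ones used
# ~/.huggingface/token).
token_path = Path.home() / ".cache" / "huggingface" / "token"
token_path.parent.mkdir(parents=True, exist_ok=True)
token_path.write_text("hf_your_token_here")

# Alternatively, export the token as an environment variable for this process:
os.environ["HF_TOKEN"] = "hf_your_token_here"
```

Either mechanism should satisfy the token lookup the next time `from_pretrained` runs.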
Hi, @brandontrabucco
Sorry to bother you again. Which pre-trained concept-erasure weights do you use in the implementation? The weights provided at https://erasing.baulab.info/weights/esd_models/ do not include any weights trained on the PASCAL dataset.
DEFAULT_EMBED_PATH = "/root/downloads/da-fusion/{dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt"
Hello, the .pt file cannot be found. What effect does this have on the program?
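For context, the missing .pt file is presumably the output of aggregate_embeddings.py, which merges the per-class learned_embeds.bin files produced by fine_tune.py. A hedged sketch of that step (the on-disk layout and the {token: tensor} dict format are assumptions, not the repo's confirmed behavior):

```python
import torch

def aggregate_embeddings(embed_paths, out_path):
    # Each learned_embeds.bin is assumed to be a {token: tensor} dict saved
    # by textual inversion fine-tuning; merge them into the single
    # {dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt file that
    # the generation scripts load.
    merged = {}
    for path in embed_paths:
        merged.update(torch.load(path, map_location="cpu"))
    torch.save(merged, out_path)
    return merged
```

Without that aggregated file, scripts that read DEFAULT_EMBED_PATH have no learned tokens to condition on, which would explain the failure.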
Thanks for the wonderful project.
I would appreciate it if you could provide it. Thank you
Hi, I am reading the code. The paper says the model uses img2img and textual inversion to generate data, but I did not find an img2img model in this code, only a txt2img model. Did I miss it? I am confused and hoping for your reply!
Are you going to release training/evaluation code for other datasets, such as FGVC-Aircraft or Stanford Cars?
I'd appreciate it if you could write a tutorial on how to train on custom datasets.
Can skeleton or keypoint information be used as a reference to generate images in da-fusion?
Is there any difference between fine_tune.py and fine_tune_upstream.py when I set num_vectors = 1? I did find that there are some validation steps in the latter. I'm a little confused by these two files. Looking forward to your reply.
Thanks for the good research. I have custom data for object detection, but I lack data for fine-tuning, and I wonder if da-fusion can be applied. Even if semantic information is maintained, I think the ground-truth bounding-box coordinates will not map accurately onto the generated image. What do you think?
Hi,
I tried to run the fine_tune.py script on my lab's server, which is a normal 4-GPU Ubuntu workstation without Slurm support. When I ran it without a distributed training setup, everything was okay. Then I tried to switch to a multi-GPU setting and couldn't get it to work. I have tried the following ways, and none of them seemed to work:
1. accelerate config followed by accelerate launch fine_tune.py --py_args, which gave me the following error while initializing the accelerator object:
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set
2. torchrun fine_tune.py --py_args, which gave me the same error as method 1.
3. Writing another shell script that calls the fine_tune_pascal.sh script 4 times, passing in different $SLURM_ARRAY_TASK_ID values, which seems not to be the correct way, since every process claimed to be the main process and I guess they were just generating duplicates.
Could you help me out with this? I'm pretty sure my accelerate library setup is okay, since I'm able to run the official toy example. Is it because the code inside the if __name__ == "__main__": block is not wrapped in a main() function, as instructed by huggingface accelerate? Should I wrap it?
Thank you for your impressive studies.
My question is the same as the title:
Are there erasure tokens for any other datasets (e.g., Flowers102, ImageNet, COCO, Caltech101)?
Or how can I obtain erasure-token weights for a custom dataset?
Hello @brandontrabucco ,
Thank you for your excellent work.
I have reproduced plausible generation results on the Pascal dataset. Can we generate/augment medical images that differ substantially from common/natural images? I have fine-tuned DA-Fusion on a customized medical dataset; however, running generate_images.py produces overly fanciful results, which are unusable for augmentation. Specifically, I built the dataset as a subclass of semantic_aug/few_shot_dataset.py and followed the configurations in the official code. I also noted that the results from different .bin files under customized-x-y vary; what do x and y mean? Similarly, I attempted to augment medical images with generate_augmentations.py, setting --embed-path to the learned_embeds.bin derived from the fine-tuning described above. I suspect this is an inappropriate approach because of the default ***.pt setting, which I have no idea how to obtain so far. In short, I have a customized medical dataset with corresponding labels and would like to achieve good augmentations with DA-Fusion, thanks to its image-to-image generation.
I would appreciate any guidance or suggestions.
Best,
Young
Hi, love the project. I'm trying to run this on a custom dataset, and I can't see where the image is inserted into the diffusion process, as described in equation 6 of the paper. I see the textual inversion but not that part. Can someone point me to where it is? Thanks!
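For context, this kind of insertion usually happens via the forward diffusion process rather than a separate model: the reference image is noised to an intermediate timestep chosen by the strength parameter and then denoised from there. A minimal sketch of that step in plain torch (illustrative SDEdit-style insertion, not the repo's exact code):

```python
import torch

def insert_image_at_t0(image_latent, noise, t0, alphas_cumprod):
    # Noise the clean latent to timestep t = t0 * T by scaling it with
    # sqrt(alpha_bar_t) and adding sqrt(1 - alpha_bar_t) * noise, per the
    # standard forward process. Denoising then starts from this timestep
    # instead of from pure noise; t0 corresponds to the --strength argument
    # (t0 = 0 keeps the image, t0 = 1 is pure noise).
    t = int(t0 * (len(alphas_cumprod) - 1))
    a = alphas_cumprod[t]
    return a.sqrt() * image_latent + (1.0 - a).sqrt() * noise
```

In diffusers-based code this logic typically lives inside the img2img-style pipeline rather than being spelled out in the training script, which may be why it is hard to spot.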
Hi, I have run fine_tune.py with a custom dataset containing only one class, called "Chrysomelida hindwing", and the learned embeddings were generated. However, when I try to run generate_images.py, I get the error: "OSError: Token is required (`token=True`), but no token found."
The command is as follows:
python3 generate_images.py --out=../train_dirs/hwlfc_01/generated_images \
--embed-path="../train_dirs/hwlfc_01/fine-tuned/hwlfc-0-16/Chrysomelida_hindwing/learned_embeds.bin" \
--erasure-ckpt-name None \
--prompt "a photo of a <Chrysomelida hindwing>"
And the full backtrace is as follows:
Traceback (most recent call last):
File "generate_images.py", line 46, in <module>
pipe = StableDiffusionPipeline.from_pretrained(
File "/home/dell/.local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 884, in from_pretrained
cached_folder = cls.download(
File "/home/dell/.local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1208, in download
config_file = hf_hub_download(
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1181, in hf_hub_download
headers = build_hf_headers(
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_headers.py", line 117, in build_hf_headers
token_to_send = get_token_to_send(token)
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_headers.py", line 149, in get_token_to_send
raise EnvironmentError(
OSError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
Hi,
How can I evaluate whether the fine-tuned textual inversion is good or not? Thanks.
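One common protocol (used in the Textual Inversion paper's evaluation, not something this repo necessarily ships) is CLIP image alignment: generate images with the learned token and measure the average CLIP-feature cosine similarity against the real training images. Assuming the CLIP image features have already been extracted upstream (e.g. with openai/clip-vit-base-patch32), the metric itself is just:

```python
import torch

def clip_image_alignment(gen_feats, real_feats):
    # Mean pairwise cosine similarity between CLIP image features of
    # generated samples (rows of gen_feats) and of real training images
    # (rows of real_feats). Higher means the learned token reproduces
    # the concept more faithfully; values near 1.0 are a perfect match.
    g = torch.nn.functional.normalize(gen_feats, dim=-1)
    r = torch.nn.functional.normalize(real_feats, dim=-1)
    return (g @ r.T).mean().item()
```

A complementary check is downstream accuracy: train the classifier with and without the augmentations and compare, which is what the repo's own experiments report.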
I have attempted to reproduce the results of the few-shot classification task on the PASCAL VOC dataset.
I managed to achieve comparable outcomes when utilizing the fine-tuned tokens you previously shared via the Google Drive link.
However, I was unsuccessful in reproducing the fine-tuned tokens.
When employing fine_tune.py and aggregate_embeddings.py with the provided scripts, I obtained inferior tokens, resulting in significantly lower accuracy (approximately a 10% gap in 1-shot).
Am I overlooking something?
Hi, it's me again.
When I go through your code, I find that you directly attach the pre-trained textual inversion weights to the CLIP model in your implementation. Why? Intuitively, the weights from Textual Inversion and those from CLIP should not be in the same space. Shouldn't there be an MLP to transform these weights into the same space? Forgive my ignorance; I am still new to this area.
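For context on why no MLP is needed: textual inversion optimizes the new token's vector directly in the text encoder's input-embedding space (the encoder stays frozen and gradients flow only to that one vector), so the learned weights are in the same space as CLIP's own token embeddings by construction. A minimal sketch of the insertion with a stand-in nn.Embedding (not the repo's actual code):

```python
import torch
import torch.nn as nn

def insert_learned_token(embedding: nn.Embedding,
                         learned_vector: torch.Tensor) -> nn.Embedding:
    # Append the learned vector as one new row of the token-embedding table;
    # every other row (the original CLIP vocabulary) is copied unchanged.
    new = nn.Embedding(embedding.num_embeddings + 1, embedding.embedding_dim)
    with torch.no_grad():
        new.weight[:-1] = embedding.weight
        new.weight[-1] = learned_vector
    return new
```

In diffusers this same idea is expressed via tokenizer.add_tokens plus text_encoder.resize_token_embeddings, followed by writing the learned vector into the new row.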
Hi all, I ran into the following error when trying to train image classification models using augmented images from DA-Fusion; has anyone had the same issue? The execution command is:
python train_classifier.py --logdir pascal-baselines/textual-inversion-0.5 --synthetic-dir "aug/textual-inversion-0.5/{dataset}-{seed}-{examples_per_class}" --dataset pascal --prompt "a photo of a {name}" --aug textual-inversion --guidance-scale 7.5 --strength 0.5 --mask 0 --inverted 0 --num-synthetic 10 --synthetic-probability 0.5 --num-trials 1 --examples-per-class 4
The returned error:
FileNotFoundError: [Errno 2] No such file or directory: 'pascal-tokens/pascal-0-4.pt'
Hi, thank you for your great work. I am amazed by your model's effectiveness in data augmentation. I am currently engaged in a project that requires paired images, one acting as degraded input and the other as ground truth. I would like to know whether your model supports generating such paired images, as that could greatly benefit my work.
Hi,
Could you please provide details about the ImageNet dataset and a runnable script for ImageNet?
For example, what is LABEL_SYNSET in datasets.imagenet.py?
Thanks for the good research.
I'm not having any functional issues, just curious.
What does MBDA stand for? I assume the DA stands for Data Augmentation or DA-Fusion, but what is the MB?
plot.py: lines 49-50
parser.add_argument("--method-names", nargs="+", type=str,
default=["Baseline", "Real Guidance", "MBDA (Ours)"])
I couldn't find it in the code.
It seems like it just sets t_0 = 0.5.
Hi there!
Great repo and great work! I encountered an error while installing:
When I do this: pip install -e da-fusion
It throws this error:
ERROR: da-fusion is not a valid editable requirement. It should either be a path to a local project or a VCS URL (beginning with bzr+http, bzr+https, bzr+ssh, bzr+sftp, bzr+ftp, bzr+lp, bzr+file, git+http, git+https, git+ssh, git+git, git+file, hg+file, hg+http, hg+https, hg+ssh, hg+static-http, svn+ssh, svn+http, svn+https, svn+svn, svn+file).