brandontrabucco / da-fusion
Effective Data Augmentation With Diffusion Models
License: MIT License
Hello, I am very interested after reading your paper! Could you tell us how to reproduce your results? I ran into a problem during reproduction:
huggingface_hub.utils._headers.LocalTokenNotFoundError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
Sorry for the trouble!
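For context, this error means huggingface_hub cannot find a cached access token. A minimal stdlib-only sketch of making one visible before the pipeline is loaded (equivalent to running `huggingface-cli login`; the cache path below is the one used by recent huggingface_hub versions, so treat it as an assumption, and `hf_your_token_here` is a placeholder):

```python
import os
from pathlib import Path

# Write a Hugging Face access token where huggingface_hub looks for it.
# "hf_your_token_here" is a placeholder; create a real read token at
# https://huggingface.co/settings/tokens. Recent huggingface_hub versions
# read ~/.cache/huggingface/token (assumption; older ones used
# ~/.huggingface/token).
token_path = Path.home() / ".cache" / "huggingface" / "token"
token_path.parent.mkdir(parents=True, exist_ok=True)
token_path.write_text("hf_your_token_here")

# Alternatively, export the token as an environment variable for this process:
os.environ["HF_TOKEN"] = "hf_your_token_here"
```

Either mechanism should satisfy the token lookup the next time `from_pretrained` runs.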
Hi, @brandontrabucco
Sorry to bother you again. Which pre-trained concept-erasure weights do you use in the implementation? The weights provided at https://erasing.baulab.info/weights/esd_models/ do not include any weights trained on the PASCAL dataset.
DEFAULT_EMBED_PATH = "/root/downloads/da-fusion/{dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt"
Hello, the .pt file cannot be found. What effect does this have on the program?
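For context, the missing .pt file is presumably the output of aggregate_embeddings.py, which merges the per-class learned_embeds.bin files produced by fine_tune.py. A hedged sketch of that step (the on-disk layout and the {token: tensor} dict format are assumptions, not the repo's confirmed behavior):

```python
import torch

def aggregate_embeddings(embed_paths, out_path):
    # Each learned_embeds.bin is assumed to be a {token: tensor} dict saved
    # by textual inversion fine-tuning; merge them into the single
    # {dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt file that
    # the generation scripts load.
    merged = {}
    for path in embed_paths:
        merged.update(torch.load(path, map_location="cpu"))
    torch.save(merged, out_path)
    return merged
```

Without that aggregated file, scripts that read DEFAULT_EMBED_PATH have no learned tokens to condition on, which would explain the failure.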
Thanks for the wonderful project.
I would appreciate it if you could provide it. Thank you
Hi, I am reading the code. The paper says the model uses img2img and textual inversion to generate data, but I did not find an img2img model in this code, only a txt2img model. Did I miss it? I am confused and hoping for your reply!
Are you going to release training/evaluation code for other datasets, such as FGVC-Aircraft or Stanford Cars?
I'd appreciate it if you could write a tutorial on how to train on custom datasets.
Can skeleton or keypoint information be used as a reference to generate images in da-fusion?
Is there any difference between fine_tune.py and fine_tune_upstream.py when I set num_vectors = 1? I did find that there are some validation steps in the latter. I'm a little confused by these two files. Looking forward to your reply.
Thanks for the good research. I have custom data for object detection, but I lack data for fine-tuning, and I wonder if da-fusion can be applied. Even if semantic information is maintained, I think the ground-truth bounding-box coordinates will not map accurately onto the generated image. What do you think?
Hi,
I tried to run the fine_tune.py script on my lab's server, which is a normal 4-GPU Ubuntu workstation without Slurm support. When I ran it without a distributed training setup, everything was okay. Then I tried to switch to a multi-GPU setting and couldn't get it to work. I have tried the following ways, and none of them seemed to work:
1. accelerate config followed by accelerate launch fine_tune.py --py_args, which gave me the following error while initializing the accelerator object:
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set
2. torchrun fine_tune.py --py_args, which gave me the same error as method 1.
3. Writing another shell script that calls the fine_tune_pascal.sh script 4 times, passing in different $SLURM_ARRAY_TASK_ID values, which seems not to be the correct way, since every process claimed to be the main process and I guess they were just generating duplicates.
Could you help me out with this? I'm pretty sure my accelerate library setup is okay, since I'm able to run the official toy example. Is it because the code inside the if __name__ == "__main__": block is not wrapped in a main() function, as instructed by huggingface accelerate? Should I wrap it?
Thank you for your impressive studies.
My question is the same as the title:
Are there erasure tokens for any other datasets (e.g., Flowers102, ImageNet, COCO, Caltech101)?
Or how can I obtain erasure-token weights for a custom dataset?
Hello @brandontrabucco ,
Thank you for your excellent work.
I have reproduced plausible generation results on the Pascal dataset. Can we generate/augment medical images that differ substantially from common/natural images? I have fine-tuned DA-Fusion on a customized medical dataset; however, running generate_images.py produces overly fanciful results, which are unusable for augmentation. Specifically, I built the dataset as a subclass of semantic_aug/few_shot_dataset.py and followed the configurations in the official code. I also noted that the results from different .bin files under customized-x-y vary; what do x and y mean? Similarly, I attempted to augment medical images with generate_augmentations.py, setting --embed-path to the learned_embeds.bin derived from the fine-tuning described above. I suspect this is an inappropriate approach because of the default ***.pt setting, which I have no idea how to obtain so far. In short, I have a customized medical dataset with corresponding labels and would like to achieve good augmentations with DA-Fusion, thanks to its image-to-image generation.
I would appreciate any guidance or suggestions.
Best,
Young
Hi, love the project. I'm trying to run this on a custom dataset, and I can't see where the image is inserted into the diffusion process, as described in equation 6 of the paper. I see the textual inversion but not that part. Can someone point me to where it is? Thanks!
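For context, this kind of insertion usually happens via the forward diffusion process rather than a separate model: the reference image is noised to an intermediate timestep chosen by the strength parameter and then denoised from there. A minimal sketch of that step in plain torch (illustrative SDEdit-style insertion, not the repo's exact code):

```python
import torch

def insert_image_at_t0(image_latent, noise, t0, alphas_cumprod):
    # Noise the clean latent to timestep t = t0 * T by scaling it with
    # sqrt(alpha_bar_t) and adding sqrt(1 - alpha_bar_t) * noise, per the
    # standard forward process. Denoising then starts from this timestep
    # instead of from pure noise; t0 corresponds to the --strength argument
    # (t0 = 0 keeps the image, t0 = 1 is pure noise).
    t = int(t0 * (len(alphas_cumprod) - 1))
    a = alphas_cumprod[t]
    return a.sqrt() * image_latent + (1.0 - a).sqrt() * noise
```

In diffusers-based code this logic typically lives inside the img2img-style pipeline rather than being spelled out in the training script, which may be why it is hard to spot.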
Hi, I have run fine_tune.py with a custom dataset containing only one class, called "Chrysomelida hindwing", and the learned embeddings were generated. However, when I try to run generate_images.py, I get the error: "OSError: Token is required (`token=True`), but no token found."
The command is as follows:
python3 generate_images.py --out=../train_dirs/hwlfc_01/generated_images \
--embed-path="../train_dirs/hwlfc_01/fine-tuned/hwlfc-0-16/Chrysomelida_hindwing/learned_embeds.bin" \
--erasure-ckpt-name None \
--prompt "a photo of a <Chrysomelida hindwing>"
And the full backtrace is as follows:
Traceback (most recent call last):
File "generate_images.py", line 46, in <module>
pipe = StableDiffusionPipeline.from_pretrained(
File "/home/dell/.local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 884, in from_pretrained
cached_folder = cls.download(
File "/home/dell/.local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1208, in download
config_file = hf_hub_download(
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1181, in hf_hub_download
headers = build_hf_headers(
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_headers.py", line 117, in build_hf_headers
token_to_send = get_token_to_send(token)
File "/home/dell/.local/lib/python3.8/site-packages/huggingface_hub/utils/_headers.py", line 149, in get_token_to_send
raise EnvironmentError(
OSError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
Hi,
How can I evaluate whether the fine-tuned textual inversion is good or not? Thanks.
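One common protocol (used in the Textual Inversion paper's evaluation, not something this repo necessarily ships) is CLIP image alignment: generate images with the learned token and measure the average CLIP-feature cosine similarity against the real training images. Assuming the CLIP image features have already been extracted upstream (e.g. with openai/clip-vit-base-patch32), the metric itself is just:

```python
import torch

def clip_image_alignment(gen_feats, real_feats):
    # Mean pairwise cosine similarity between CLIP image features of
    # generated samples (rows of gen_feats) and of real training images
    # (rows of real_feats). Higher means the learned token reproduces
    # the concept more faithfully; values near 1.0 are a perfect match.
    g = torch.nn.functional.normalize(gen_feats, dim=-1)
    r = torch.nn.functional.normalize(real_feats, dim=-1)
    return (g @ r.T).mean().item()
```

A complementary check is downstream accuracy: train the classifier with and without the augmentations and compare, which is what the repo's own experiments report.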
I have attempted to reproduce the results of the few-shot classification task on the PASCAL VOC dataset.
I managed to achieve comparable outcomes when utilizing the fine-tuned tokens you previously shared via the Google Drive link.
However, I was unsuccessful in reproducing the fine-tuned tokens.
When employing fine_tune.py and aggregate_embeddings.py with the provided scripts, I obtained inferior tokens, resulting in significantly lower accuracy (approximately a 10% gap in 1-shot).
Am I overlooking something?
Hi, it's me again.
When I go through your code, I find that you directly attach the pre-trained textual inversion weights to the CLIP model in your implementation. Why? Intuitively, the weights from Textual Inversion and those from CLIP should not be in the same space. Shouldn't there be an MLP to transform these weights into the same space? Forgive my ignorance; I am still new to this area.
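For context on why no MLP is needed: textual inversion optimizes the new token's vector directly in the text encoder's input-embedding space (the encoder stays frozen and gradients flow only to that one vector), so the learned weights are in the same space as CLIP's own token embeddings by construction. A minimal sketch of the insertion with a stand-in nn.Embedding (not the repo's actual code):

```python
import torch
import torch.nn as nn

def insert_learned_token(embedding: nn.Embedding,
                         learned_vector: torch.Tensor) -> nn.Embedding:
    # Append the learned vector as one new row of the token-embedding table;
    # every other row (the original CLIP vocabulary) is copied unchanged.
    new = nn.Embedding(embedding.num_embeddings + 1, embedding.embedding_dim)
    with torch.no_grad():
        new.weight[:-1] = embedding.weight
        new.weight[-1] = learned_vector
    return new
```

In diffusers this same idea is expressed via tokenizer.add_tokens plus text_encoder.resize_token_embeddings, followed by writing the learned vector into the new row.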
Hi all, I ran into the following error when trying to train image classification models using augmented images from DA-Fusion; has anyone had the same issue? The execution command is:
python train_classifier.py --logdir pascal-baselines/textual-inversion-0.5 --synthetic-dir "aug/textual-inversion-0.5/{dataset}-{seed}-{examples_per_class}" --dataset pascal --prompt "a photo of a {name}" --aug textual-inversion --guidance-scale 7.5 --strength 0.5 --mask 0 --inverted 0 --num-synthetic 10 --synthetic-probability 0.5 --num-trials 1 --examples-per-class 4
The returned error:
FileNotFoundError: [Errno 2] No such file or directory: 'pascal-tokens/pascal-0-4.pt'
Hi, thank you for your great work. I am amazed by your model's effectiveness in data augmentation. I am currently engaged in a project that requires paired images, one acting as degraded input and the other as ground truth. I would like to know whether your model supports generating such paired images, as that could greatly benefit my work.
Hi,
Could you please provide details about the ImageNet dataset and a runnable script for ImageNet?
For example, what is LABEL_SYNSET in datasets.imagenet.py?
Thanks for the good research.
I'm not having any functional issues, just curious.
What does MBDA stand for? I assume the DA stands for Data Augmentation or DA-Fusion, but what is the MB?
plot.py: lines 49-50
parser.add_argument("--method-names", nargs="+", type=str,
default=["Baseline", "Real Guidance", "MBDA (Ours)"])
I couldn't find it in the code.
It seems like it just sets t_0 = 0.5.
Hi there!
Great repo and great work! I encountered an error while installing:
When I do this: pip install -e da-fusion
It throws this error:
ERROR: da-fusion is not a valid editable requirement. It should either be a path to a local project or a VCS URL (beginning with bzr+http, bzr+https, bzr+ssh, bzr+sftp, bzr+ftp, bzr+lp, bzr+file, git+http, git+https, git+ssh, git+git, git+file, hg+file, hg+http, hg+https, hg+ssh, hg+static-http, svn+ssh, svn+http, svn+https, svn+svn, svn+file).