zqhang / AnomalyCLIP
Official implementation for AnomalyCLIP (ICLR 2024)
License: MIT License
Dear author,
I encountered an issue with running SDD.py.
No such file or directory: 'data/sdd/electrical commutators/train'
Can you tell me why the SDD dataset has the class "electrical commutators"? I can only see class names such as "kos35", "kos36", etc. in the dataset.
Thank you.
Good work! When I use the provided generate_dataset_json/SDD.py to generate meta.json, the lack of split train and test datasets prevents the generation of meta.json. How can I obtain the train and test splits? Additionally, some classes missing from the downloaded MPDD dataset cause the code to fail. Could you provide the complete MPDD dataset?
I've encountered an issue in the paper's code regarding the training approach for the anomaly detector on the MVTec AD dataset. The problem lies in the train.py script at line 35, where the Dataset object for training is created without specifying mode="train":
train_data = Dataset(root=args.train_data_path, transform=preprocess, target_transform=target_transform, dataset_name=args.dataset)
This oversight leads to two critical problems:
Could this be revised to ensure the correct dataset partitioning and training setup?
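If the Dataset class indeed accepts a mode keyword (an assumption based on this issue, not confirmed from the repository), the one-line fix would look like:

train_data = Dataset(root=args.train_data_path, transform=preprocess,
                     target_transform=target_transform, dataset_name=args.dataset,
                     mode="train")  # explicitly select the training split (assumed keyword)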
Very interesting work! Could you kindly share if there are any plans to open source your AnomalyCLIP in the near future?
Hi there, congrats on the great work!
In Table 1, I've noticed you also include the results of the original CLIP model.
Could you please share the settings of this experiment? My reimplementation based on your code shows many differences from yours, such as:
Thank you so much for your patience!
I sincerely thank you for your research and code sharing. As a medical doctor, I think your research can have a significant impact in the medical domain as well. I have read your paper and the reviews on the OpenReview page, but I have a question because I couldn't grasp the details of the text prompt template (the part marked as V_1-V_E, W_1-W_E in the paper). Looking at the training code, it seems that V_1 ~ V_E and W_1 ~ W_E are all filled with "X"; is that correct?
P.S. I know how unexpectedly cumbersome organizing and sharing experimental code after a paper is accepted can be, and how much psychological resistance it can involve, so I am truly grateful that you shared the code like this. Once again, congratulations on the ICLR acceptance.
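For context, here is a minimal sketch of the CoOp-style initialization being asked about (variable names hypothetical): the literal "X" characters only reserve token slots for the tokenizer; their embeddings are then replaced by learnable vectors, which correspond to the V_i / W_i in the paper.

import torch

n_ctx = 12                                      # number of learnable context tokens
prompt = " ".join(["X"] * n_ctx) + " object."   # placeholder template for tokenization
ctx_vectors = torch.empty(n_ctx, 768)           # 768 = assumed text embedding width
torch.nn.init.normal_(ctx_vectors, std=0.02)
ctx = torch.nn.Parameter(ctx_vectors)           # replaces the "X" embeddings at positions 1..n_ctx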
When I test pixel-level metrics with 3,700 images, the roc_auc_score computation runs out of RAM. How can I fix this? I use Colab Pro with 50 GB of RAM.
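One common workaround (not an official answer): downsample the anomaly maps and ground-truth masks before flattening them for roc_auc_score, and keep the scores in a compact dtype. A hypothetical sketch, assuming lists anomaly_maps and gt_masks of (H, W) tensors:

import numpy as np
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score

small_maps, small_masks = [], []
for amap, mask in zip(anomaly_maps, gt_masks):  # names assumed
    amap = F.interpolate(amap[None, None], scale_factor=0.25, mode="bilinear")[0, 0]
    mask = F.interpolate(mask[None, None].float(), scale_factor=0.25, mode="nearest")[0, 0]
    small_maps.append(amap.half().cpu().numpy().ravel())   # 16x fewer pixels, half precision
    small_masks.append(mask.bool().cpu().numpy().ravel())
auroc = roc_auc_score(np.concatenate(small_masks), np.concatenate(small_maps))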
Hello @zqhang
Thanks for your work. When I run bash test.sh,
it gives a permission error. Please have a look; the code may need to download the weights and does not have permission to do so.
bash test.sh
res.log
Namespace(data_path='/remote-home/iot_zhouqihang/data/mvdataset', save_path='./results/9_12_4_multiscale/zero_shot', checkpoint_path='./checkpoints/9_12_4_multiscale/epoch_15.pth', dataset='mvtec', features_list=[6, 12, 18, 24], image_size=518, depth=9, n_ctx=12, t_n_ctx=4, feature_map_layer=[0, 1, 2, 3], metrics='image-pixel-level', seed=111, sigma=4)
name ViT-L/14@336px
Traceback (most recent call last):
File "/media/cvpr/CM_1/AnomalyCLIP/test.py", line 195, in <module>
test(args)
File "/media/cvpr/CM_1/AnomalyCLIP/test.py", line 43, in test
model, _ = AnomalyCLIP_lib.load("ViT-L/14@336px", device=device, design_details = AnomalyCLIP_parameters)
File "/media/cvpr/CM_1/AnomalyCLIP/AnomalyCLIP_lib/model_load.py", line 145, in load
model_path = _download(_MODELS[name], download_root or os.path.expanduser("/remote-home/iot_zhouqihang/root/.cache/clip"))
File "/media/cvpr/CM_1/AnomalyCLIP/AnomalyCLIP_lib/model_load.py", line 39, in _download
os.makedirs(cache_dir, exist_ok=True)
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
[Previous line repeated 1 more time]
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 225, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/remote-home'
./checkpoints/9_12_4_multiscale/res.log
Namespace(data_path='/remote-home/iot_zhouqihang/data/Visa', save_path='./results/9_12_4_multiscale_visa/zero_shot', checkpoint_path='./checkpoints/9_12_4_multiscale_visa/epoch_15.pth', dataset='visa', features_list=[6, 12, 18, 24], image_size=518, depth=9, n_ctx=12, t_n_ctx=4, feature_map_layer=[0, 1, 2, 3], metrics='image-pixel-level', seed=111, sigma=4)
name ViT-L/14@336px
Traceback (most recent call last):
File "/media/cvpr/CM_1/AnomalyCLIP/test.py", line 195, in <module>
test(args)
File "/media/cvpr/CM_1/AnomalyCLIP/test.py", line 43, in test
model, _ = AnomalyCLIP_lib.load("ViT-L/14@336px", device=device, design_details = AnomalyCLIP_parameters)
File "/media/cvpr/CM_1/AnomalyCLIP/AnomalyCLIP_lib/model_load.py", line 145, in load
model_path = _download(_MODELS[name], download_root or os.path.expanduser("/remote-home/iot_zhouqihang/root/.cache/clip"))
File "/media/cvpr/CM_1/AnomalyCLIP/AnomalyCLIP_lib/model_load.py", line 39, in _download
os.makedirs(cache_dir, exist_ok=True)
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
[Previous line repeated 1 more time]
File "/home/cvpr/anaconda3/lib/python3.9/os.py", line 225, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/remote-home'
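A likely workaround, assuming AnomalyCLIP_lib.load mirrors OpenAI CLIP's download_root parameter (which the model_load.py frame above suggests): pass a writable cache directory instead of relying on the hard-coded '/remote-home/...' default, or pre-download the ViT-L/14@336px weights there.

import os
model, _ = AnomalyCLIP_lib.load("ViT-L/14@336px", device=device,
                                design_details=AnomalyCLIP_parameters,
                                download_root=os.path.expanduser("~/.cache/clip"))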
The ISIC dataset downloaded from the repository is the test set, and the isbi.py file in the generate_dataset_json folder also targets the test set. How can I use ISIC to train the model?
+1
In your work, how is the zero-shot setting reflected? Since it is zero-shot, why do you train on the MVTec dataset and then report test metrics on MVTec? One last question: why is the test set used during both training and testing?
It seems that you used the entire test set to train encode_text_learn while also using the test set for testing, which is seriously inconsistent with the zero-shot setting described in your paper. Can you explain this issue?
I trained with the visa dataset for 11 epochs, but loss and image_loss are 3.7960 and 0.5325. I feel something is wrong; I set everything up with your settings. Can you share what loss and image_loss looked like when you trained?
I don't understand why you divide by 0.07 right here:
text_probs = image_features.unsqueeze(1) @ text_features.permute(0, 2, 1)
text_probs = text_probs[:, 0, ...] / 0.07
text_probs = text_probs.squeeze()
and another here
logit_scale = self.logit_scale.exp()  # nn.Parameter(torch.ones([]) * np.log(1 / 0.07))
logits_per_image = logit_scale * image_features @ text_features.t()
logits_per_text = logits_per_image.t()
And in another paper, they multiply by 100 instead, for example:
for layer in range(len(det_patch_tokens)):
    det_patch_tokens[layer] = det_patch_tokens[layer] / det_patch_tokens[layer].norm(dim=-1, keepdim=True)
    anomaly_map = (100.0 * det_patch_tokens[layer] @ text_features)
    anomaly_map = torch.softmax(anomaly_map, dim=-1)[:, :, 1]
    anomaly_score = torch.mean(anomaly_map, dim=-1)
    det_loss += loss_bce(anomaly_score, image_label)
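For what it's worth, these look like two conventions for the same temperature scaling: dividing by tau = 0.07 is identical to multiplying by 1/0.07 ≈ 14.3, and CLIP's learnable logit_scale is initialized to exactly that value (exp(log(1/0.07))); in OpenAI's training code it is clamped at 100, which is plausibly where the other paper's 100.0 comes from. An illustrative check with hypothetical values:

import torch

sims = torch.randn(4, 2)     # hypothetical cosine similarities
a = sims / 0.07              # divide by temperature tau
b = (1 / 0.07) * sims        # identical: multiply by 1/tau ≈ 14.29
assert torch.allclose(a, b)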
First of all, thank you for your work and for releasing the AnomalyCLIP code!
I was wondering if it was possible to export the model to ONNX format? It could be very useful.
Thanks
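Not an official answer, but a minimal export sketch for the vision encoder alone, assuming model comes from AnomalyCLIP_lib.load and its visual forward is traceable with a single image tensor (the prompt learner and text encoder would need separate handling):

import torch

model.eval()
dummy = torch.randn(1, 3, 518, 518, device=device)  # 518 matches the image_size in test.sh
torch.onnx.export(model.visual.float(), dummy.float(), "anomalyclip_visual.onnx",
                  input_names=["image"], output_names=["features"], opset_version=17)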
Can you provide more details on the SDD dataset or the original splits you tested? I downloaded it from DOWNLOAD HERE with fine annotations (for the JIM2019 paper) and used this DRA; it includes 286 normal samples and 54 anomaly samples, which differs from your work (181, 74), and I got very different results based on your checkpoints:
At line 387 of AnomalyCLIP_lib/AnomalyCLIP.py, the global class token from the vision encoder is fed directly into the projector without being processed by self.ln_post, which is different from the original CLIP. Is this a mistake or a special setting? Does it contribute a lot to AnomalyCLIP?
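For reference, OpenAI's original CLIP VisionTransformer applies ln_post to the class token before the projection, roughly:

x = self.ln_post(x[:, 0, :])   # LayerNorm on the global class token
if self.proj is not None:
    x = x @ self.proj          # then the linear projection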
Given a single input image, how do I determine whether it is anomalous, and how should the code be written?
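A hypothetical single-image sketch (not the authors' code): compare the image embedding against the learned normal/anomalous text embeddings, assuming model, preprocess, device, and a (2, D) text_features tensor are already set up as in test.py, and that encode_image returns the global image feature:

import torch
from PIL import Image

image = preprocess(Image.open("sample.png")).unsqueeze(0).to(device)
with torch.no_grad():
    image_features = model.encode_image(image)  # signature assumed
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    probs = (image_features @ text_features.t() / 0.07).softmax(dim=-1)
anomaly_score = probs[0, 1].item()  # probability mass on the "anomalous" prompt
print("anomalous" if anomaly_score > 0.5 else "normal", anomaly_score)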
I can't understand the following code.
self.register_buffer("token_prefix_pos", embedding_pos[:, :, :1, :] )
self.register_buffer("token_suffix_pos", embedding_pos[:, :, 1 + n_ctx_pos:, :])
self.register_buffer("token_prefix_neg", embedding_neg[:, :, :1, :])
self.register_buffer("token_suffix_neg", embedding_neg[:, :, 1 + n_ctx_neg:, :])
I think the positive prompt should be ['X X X X X X X X X X X X object.'], so the prefix should be 'X X X X X X X X X X X X ', and the suffix should be '.'.
I don't know if my understanding is wrong; can you help me answer this?
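For context, the usual CoOp-style layout those buffers assume is [SOS] + n_ctx learnable tokens + class tokens + [EOS] + padding, so the stored prefix is the SOS embedding, not the "X" placeholders. A sketch of the slicing (comments carry the explanation; names mirror the snippet above):

prefix = embedding[:, :, :1, :]           # index 0: the [SOS] token inserted by the tokenizer
suffix = embedding[:, :, 1 + n_ctx:, :]   # "object." tokens, [EOS], and padding
# The learnable context vectors are spliced into positions 1 .. n_ctx at every
# forward pass, so the 'X' embeddings themselves are never used after initialization.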
I noticed this operation in Section 3.3. Can the loss still be computed normally on the score map after the concat? What is the purpose of this?
I want to use AnomalyCLIP to train on my custom dataset, but my dataset has no pixel-level ground truth.
Hi, it is not very clear to me how to derive a segmentation-level threshold to decide whether a pixel is classified as "normal" or "anomalous". Does the output anomaly map need to be normalized with something like this?
normalize_anomalymap = (anomalymap - anomalymap.min()) / (anomalymap.max() - anomalymap.min())
Thanks
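Not from the paper, but one common recipe: normalize as above, then choose the pixel threshold on a labeled validation split, e.g. by maximizing F1. A hypothetical sketch, assuming val_maps (anomaly maps) and val_masks (binary ground truth) are lists of (H, W) arrays:

import numpy as np
from sklearn.metrics import precision_recall_curve

scores = np.concatenate([m.ravel() for m in val_maps])
labels = np.concatenate([g.ravel() for g in val_masks])
prec, rec, thr = precision_recall_curve(labels, scores)
f1 = 2 * prec * rec / (prec + rec + 1e-8)
best_thr = thr[f1[:-1].argmax()]    # thresholds is one shorter than prec/rec
pred_mask = anomaly_map > best_thr  # binary segmentation for a new map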
Given an input image, how do I determine whether the image is anomalous, and how do I write the code?
First of all, I would like to thank you and your colleagues for your contributions to this domain. I have a question about the implementation details: you said that the learnable token embeddings are attached to the first 9 layers of the text encoder to refine the textual space, but I only see the 2nd (i = 1) through 8th (i = 7) layers being attached. Can you explain this for me? Thank you.