nelson1425 / efficientad
Unofficial implementation of EfficientAD https://arxiv.org/abs/2303.14535
Home Page: https://arxiv.org/abs/2303.14535
License: Apache License 2.0
Hello, Thank you for your work.
During training, does the image_size of the data need to be 256?
I used the Python script you gave, as follows:
python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle
but there are errors:
python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle
=== Evaluate bottle ===
Parsed 83 ground truth image files.
Read ground truth files and corresponding predictions...
0%| | 0/83 [00:00<?, ?it/s]
Traceback (most recent call last):
File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
main()
File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
calculate_au_pro_au_roc(
File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 148, in calculate_au_pro_au_roc
prediction = util.read_tiff(pred_name)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\generic_util.py", line 102, in read_tiff
raise FileNotFoundError('Could not find a file with a TIFF extension'
FileNotFoundError: Could not find a file with a TIFF extension at ./output/1/anomaly_maps/mvtec_ad/bottle\test\broken_large\000
(base) PS F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data>
Both mvtec_ad_evaluation.tar.xz and mvtec_anomaly_detection.tar.xz are the files you shared.
I am running from an Anaconda PowerShell prompt on Windows 10.
Can anyone point me in the right direction?
I used the validation code that the author provides, but it fails with “assert len(set(init_queries)) == len(init_queries)”.
I hope someone who has run into this problem can help me.
How come pixelwise-AUC isn't evaluated?
According to the paper, the feature size is 64 x 64, but in this repo it is 56 x 56.
Given an input picture, how do I determine whether it is abnormal or not? How should I write the code?
@nelson1425 Can you share the teacher model pretrained on ImageNet? The dataset is too large, so I can't train the teacher model myself.
It said: [RuntimeError: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 3 at line loss = torch.mean((target - prediction)**2)]. @nelson1425, could you please tell me how to solve it? Many thanks!
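One way to get a yes/no answer, sketched under assumptions: you already have the combined anomaly map for the image as a 2-D array (for example the map_combined tensor converted to NumPy), you take its maximum as the image-level score, and you compare that score with a threshold chosen from the scores of known-good validation images. The function names and the quantile rule below are hypothetical and not taken from the repo.

```python
import numpy as np


def image_score(anomaly_map):
    """Image-level score: the largest value in the pixel-wise anomaly map."""
    return float(np.max(anomaly_map))


def pick_threshold(normal_scores, quantile=0.995):
    """Hypothetical rule: a high quantile of the scores of good images."""
    scores = np.asarray(normal_scores, dtype=np.float64)
    return float(np.quantile(scores, quantile))


def is_anomalous(anomaly_map, threshold):
    """True if the image-level score exceeds the chosen threshold."""
    return image_score(anomaly_map) > threshold
```

The quantile controls the trade-off: a higher value flags fewer good images as defective but may miss subtle defects, so it is worth tuning on your own validation data.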
Hi, I just want to let you know that the Google Drive URL mentioned in the readme, i.e., https://drive.google.com/uc?id=1n6RF08sp7RDxzKYuUoMox4RM13hqB1Jo, is not accessible anymore. I have pretrained the teacher model using the ImageNet dataset instead.
Traceback (most recent call last):
File "D:\code\git-DS\EfficientAD\efficientad.py", line 451, in <module>
main()
File "D:\code\git-DS\EfficientAD\efficientad.py", line 268, in main
q_st_start, q_st_end, q_ae_start, q_ae_end = map_normalization(
File "C:\Users\2878045\AppData\Roaming\Python\Python39\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "D:\code\git-DS\EfficientAD\efficientad.py", line 374, in map_normalization
q_st_start = torch.quantile(maps_st, q=0.9)
RuntimeError: quantile() input tensor is too large
Has anyone solved this?
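The message indicates that torch.quantile refuses inputs above a certain number of elements, which can happen when all validation maps are concatenated. A possible workaround, assuming approximate quantiles are good enough for the map normalization, is to compute them on a random subset of the values. The sketch below uses NumPy to stay self-contained; the helper name and max_elements are hypothetical, and the same idea should carry over to a flattened torch tensor.

```python
import numpy as np


def subsampled_quantiles(values, qs, max_elements=1_000_000, seed=0):
    """Estimate quantiles from at most max_elements randomly chosen values."""
    flat = np.asarray(values).reshape(-1)
    if flat.size > max_elements:
        rng = np.random.default_rng(seed)
        # Sample without replacement so no value is counted twice.
        idx = rng.choice(flat.size, size=max_elements, replace=False)
        flat = flat[idx]
    return [float(np.quantile(flat, q)) for q in qs]
```

For example, subsampled_quantiles(maps_st_as_numpy, [0.9, 0.995]) would return two estimates in one call, where maps_st_as_numpy stands for your own array. Converting the tensor to NumPy and calling np.quantile on the full array is another option if memory allows.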
Thanks for your great effort in contributing and sharing the reimplementation.
I have a problem here that needs your assistance. After training the MVTec LOCO's breakfast box category, I want to evaluate the output feature maps (in tiff format) using their official evaluation code, but I get this error:
File "[mydrive]/EfficientAD/mvtec_loco_ad_evaluation/src/aggregation.py", line 57, in binary_refinement
assert len(set(init_queries)) == len(init_queries)
I debugged the issue and found that this error is caused by the output feature maps containing too many 0 values, so that when the evaluation code creates the initial_thresholds, a.k.a. init_queries (a list of 50 anomaly scores), the list contains duplicate 0 values. An example:
[2.83783058984375, 0.3397859036922455, 0.11129003018140793, 0.04379851371049881, 0.018422557041049004, 0.005928939674049616, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.00046889434452168643, -0.004845781251788139, -0.008791022002696991, -0.012286320328712463, -0.015344921499490738, -0.01806800439953804, -0.02048872597515583, -0.0226922407746315, -0.024705413728952408, -0.026554450392723083, -0.028272513300180435, -0.029879318550229073, -0.0313967764377594, -0.0328390970826149, -0.03421252220869064, -0.0355297289788723, -0.036807697266340256, -0.03804701194167137, -0.03924761712551117, -0.04042193666100502, -0.04157993942499161, -0.04271307215094566, -0.043838512152433395, -0.044960867613554, -0.046086572110652924, -0.04722655192017555, -0.048392221331596375, -0.049606941640377045, -0.05089380219578743, -0.052285678684711456, -0.05385420098900795, -0.055760398507118225, -0.05840958654880524, -0.07654058350419998]
I wonder if anyone has encountered the same issue. I would appreciate it if you could help me solve it.
How can I visualize the heat map?
Sorry for my ignorance, I have a small question about the teacher model.
Does pre-distilling the teacher model on the ImageNet dataset make it smarter when training on other datasets?
Can you share the background for this approach? I feel there is no similarity between the MVTec dataset and ImageNet.
Thank you.
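A minimal sketch of one way to do it, assuming you have the anomaly map already resized to the image resolution and the image as an RGB uint8 array. It rescales the map to [0, 1] and blends a red layer over the image where the score is high. The function name and the alpha parameter are hypothetical; a colormap from matplotlib or OpenCV would work just as well.

```python
import numpy as np


def to_heat_overlay(image_rgb, anomaly_map, alpha=0.5):
    """Blend a red heat layer over an RGB uint8 image of the same size."""
    amap = np.asarray(anomaly_map, dtype=np.float32)
    lo, hi = float(amap.min()), float(amap.max())
    # Min-max scaling per image; a constant map becomes all zeros.
    norm = (amap - lo) / (hi - lo) if hi > lo else np.zeros_like(amap)
    heat = np.zeros(amap.shape + (3,), dtype=np.float32)
    heat[..., 0] = 255.0 * norm         # red channel carries the score
    weight = (alpha * norm)[..., None]  # blend more strongly where score is high
    out = (1.0 - weight) * image_rgb.astype(np.float32) + weight * heat
    return np.clip(out, 0, 255).astype(np.uint8)
```

Note that per-image min-max scaling makes even a defect-free image show a red spot somewhere. Scaling with a fixed range taken from validation data avoids that.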
mvtec_ad_evaluation.tar.xz
If I can't connect to this link, is there any other way to get this file?
Hi! I want to congratulate you on the implementation of EfficientAD, and thank you for your work. I would like to better understand some parts of your implementation, specifically the student/teacher model. In the paper the model has a fixed value for the padding in the convolutions, resulting in an output with the shape 384x64x64, whereas in your implementation the padding is controlled by a boolean variable and the output shape may be different (384x56x56). Is there a reason behind it?
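The two shapes can be reproduced with the standard output-size formula for convolution and pooling layers. The sketch below assumes a layer layout for the small PDN (kernel sizes 4, 2, 4, 2, 3, 4, with paddings 3, 1, 3, 1, 1, 0 when padding is enabled). That layout is my reading of the architecture and should be checked against the code.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def pdn_small_output_size(input_size=256, padding=False):
    """Output height/width of the assumed small PDN for a square input."""
    p = 1 if padding else 0
    # (kernel, stride, padding) per spatial layer; assumed layout.
    layers = [
        (4, 1, 3 * p),  # conv 1
        (2, 2, 1 * p),  # average pooling 1
        (4, 1, 3 * p),  # conv 2
        (2, 2, 1 * p),  # average pooling 2
        (3, 1, 1 * p),  # conv 3
        (4, 1, 0),      # conv 4, never padded
    ]
    size = input_size
    for kernel, stride, pad in layers:
        size = out_size(size, kernel, stride, pad)
    return size


print(pdn_small_output_size(256, padding=False))  # 56
print(pdn_small_output_size(256, padding=True))   # 64
```

Under these assumptions the unpadded network yields 56 x 56 and the padded one 64 x 64, so the difference comes from the padding setting alone rather than from a different architecture.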
(EfficientAD) E:\PycharmProjects\EfficientAD-main>python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir E:\Dataset\anomaly_detection\mvtec_anomaly_detection\mvtec_anomaly_detection\ --anomaly_maps_dir E:\PycharmProjects\EfficientAD-main\output\1\anomaly_maps\mvtec_ad --output_dir E:\PycharmProjects\EfficientAD-main\output\1\metrics/mvtec_ad --evaluated_objects bottle
=== Evaluate bottle ===
Parsed 83 ground truth image files.
Read ground truth files and corresponding predictions...
0%| | 0/83 [00:00<?, ?it/s]
Traceback (most recent call last):
File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
main()
File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
calculate_au_pro_au_roc(
File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 148, in calculate_au_pro_au_roc
prediction = util.read_tiff(pred_name)
File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\generic_util.py", line 105, in read_tiff
raise IOError('Found multiple files with a TIFF extension at'
OSError: Found multiple files with a TIFF extension at E:\PycharmProjects\EfficientAD-main\output\1\anomaly_maps\mvtec_ad\bottle\test\broken_large\000
Please specify which TIFF extension to use via the exts parameter of this function.
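A small diagnostic that may help narrow this down: it lists every anomaly-map name that really has more than one TIFF file on disk (for example both 000.tif and 000.tiff left over from different runs). The helper names are hypothetical. If it reports nothing, one guess is that the evaluation code probes both lower-case and upper-case extensions and a case-insensitive file system such as Windows matches both, in which case passing a single extension via exts, as the message suggests, would be the thing to try.

```python
from collections import Counter
from pathlib import Path

TIFF_EXTS = {".tif", ".tiff"}


def tiff_stem_counts(maps_dir):
    """Count TIFF files per stem (path without extension) below maps_dir."""
    counts = Counter()
    for path in Path(maps_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() in TIFF_EXTS:
            counts[path.with_suffix("")] += 1
    return counts


def ambiguous_stems(maps_dir):
    """Stems that have more than one TIFF file on disk."""
    counts = tiff_stem_counts(maps_dir)
    return sorted(str(stem) for stem, n in counts.items() if n > 1)
```

Calling ambiguous_stems on the directory given as --anomaly_maps_dir returns the offending names, or an empty list if each map exists exactly once.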
Hi, first of all thank you for your work and for making it completely public. I'm curious about the hardware you used for the pretraining on ImageNet, and how long it took. Good job!
Thanks to the author for the awesome job. However, the output size of the network called PDN (patch description network) is 384 × 56 × 56, while the description in the paper says 384 × 64 × 64. Is it a mistake or my misunderstanding?
Hi, in the part that outputs the anomaly map:
map_combined = torch.nn.functional.pad(map_combined, (4, 4, 4, 4))
map_combined = torch.nn.functional.interpolate(map_combined, (orig_height, orig_width), mode='bilinear')
Why pad 4 pixels on each side? What is the difference from resizing to (orig_height, orig_width) directly?
🤔🤔🤔
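My reading, offered as an assumption rather than the author's stated reason: without padding, every value of the 56 x 56 map describes one image patch, and the first patch is centred well inside the image. Resizing 56 x 56 straight to the image size would stretch the map and shift scores towards the borders. The sketch below computes the receptive field and stride for an assumed unpadded layer layout (kernels 4, 2, 4, 2, 3, 4 with strides 1, 2, 1, 2, 1, 1) and derives how many map cells are missing at each border.

```python
def receptive_field_and_stride(layers):
    """Receptive field and total stride of a stack of (kernel, stride) layers."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) jumps
        jump *= stride             # distance in input pixels between outputs
    return rf, jump


# Assumed unpadded PDN layout: conv, pool, conv, pool, conv, conv.
PDN_LAYERS = [(4, 1), (2, 2), (4, 1), (2, 2), (3, 1), (4, 1)]

rf, stride = receptive_field_and_stride(PDN_LAYERS)
first_center = (rf - 1) // 2         # pixel index of the first patch centre
pad_cells = first_center // stride   # map cells missing at each border

print(rf, stride, first_center, pad_cells)  # 33 4 16 4
```

With these numbers the 56 x 56 map plus 4 cells per side is 64 x 64, and 64 cells of 4 pixels each cover exactly 256 pixels, so after padding the interpolation keeps each score near the centre of the patch it belongs to.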
Before the backbone model is frozen, feature_dimensions() is called with a forward pass, so the BN statistics are affected and the model is no longer the original pre-trained one, though this should make little difference.
I use a custom dataset.
(torch1) D:\code\git-DS\EfficientAD>python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir "\DS-NAS\ds\oilhole(crank)" --anomaly_maps_dir './output/1/anomaly_maps/oilhole(crank)/' --output_dir './output/1/metrics/oilhole(crank)/' --evaluated_objects type_i
=== Evaluate type_i ===
Parsed 0 ground truth image files.
Read ground truth files and corresponding predictions...
0it [00:00, ?it/s]
Compute PRO curve...
Traceback (most recent call last):
File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
main()
File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
calculate_au_pro_au_roc(
File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 157, in calculate_au_pro_au_roc
pro_curve = compute_pro(
File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\pro_curve_util.py", line 37, in compute_pro
anomaly_maps[0].shape[0],
IndexError: list index out of range
I've encountered this issue.
I made it all the way through training and got a final image AUC of 98.3421.
The TIFF files are stored in the anomaly_maps folder, but I don't know whether the problem is in reading them or somewhere else.
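The line "Parsed 0 ground truth image files." suggests the evaluator found no masks at all, and the later IndexError would then just be a consequence of the empty list. A quick check, assuming the evaluation script expects the MVTec AD layout (dataset_base_dir/object/ground_truth/defect_type/*.png), is to count the mask files per defect type. The helper name is hypothetical.

```python
from pathlib import Path


def count_ground_truth_masks(dataset_base_dir, object_name):
    """Number of PNG masks per defect type under <base>/<object>/ground_truth."""
    gt_dir = Path(dataset_base_dir) / object_name / "ground_truth"
    if not gt_dir.is_dir():
        return {}
    return {
        d.name: len(list(d.glob("*.png")))
        for d in sorted(gt_dir.iterdir())
        if d.is_dir()
    }
```

An empty result for your object name would mean the custom dataset has no ground_truth folder in the expected place, or that the base directory passed on the command line does not resolve to the dataset.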
Hi Nelson, I hope this message finds you well. I'm curious what your take on this is.
I am very satisfied with the results I obtain with pad_maps=True and padding=False.
However, as you know, pad_maps creates a "frame" that pads the resulting anomaly map, and it seems that anomalies contained in the padded region are not seen by the model.
Playing with the pad_maps and padding parameters, I realized that, regardless of the pad_maps value, it is padding that determines whether the padded region is ignored or not. So, if I set padding=True, the model also sees anomalies close to the borders of the image. Also, if I set padding=True and pad_maps=False, I obtain precise segmentations and I also see defects close to the borders. To me it seems that pad_maps is needed only to correct the "translation" of the segmentations that is caused by padding=False.
So why am I writing this message instead of simply using padding=True and pad_maps=False?
The problem is that, regardless of pad_maps (so we can forget about it, because it's getting confusing 😁), padding=True significantly worsens the results. I mean, it seems that the model learns less.
So the first thing that came to my mind is that maybe the Teacher was pretrained with padding=False, so it may have a different architecture from the Student with padding=True that I'm trying to use.
But based on the code in this repo, it seems you pretrained the Teacher using padding=True and then use a Student with padding=False for the EfficientAD training. Is this true?
This image shows what I obtain with padding=True and pad_maps=False. It perfectly segments the fake defects I am using for these tests. Unfortunately, the overall performance on the real defects in the rest of the test set is significantly worse than with the model trained with padding=False (you can also see it in this image: the model struggles to segment the real defect at the bottom).
What do you think? What could be the reason why padding=True jeopardizes my trainings?
Hello, thanks for the nice code.
How do I run inference on a single image?
Hello, my hardware is quite old, and training takes a long time. Can you please share your pretrained weights so that I can quickly check the effectiveness of this model?
https://github.com/nelson1425/EfficientAD/blob/fcab5146f84ae17597044ad5ddf1656ccf805401/efficientad.py#LL279C9-L279C75
I'm wondering why this padding is necessary. Can you help me?
Hello, thanks for your work. How can I obtain the data for --imagenet_train_path?
Hi Nelson,
can you provide the results of EfficientAD-M per category on the MVTec AD dataset? To match the results in the anomalib framework, I would like to have the per-category numbers :)
Thanks!
Why is batch_size=1 used? What is the reason for it?
train_loader = DataLoader(train_set, batch_size=1, shuffle=True,
                          num_workers=4, pin_memory=True)
train_loader_infinite = InfiniteDataloader(train_loader)
validation_loader = DataLoader(validation_set, batch_size=1)
I have a question: what is the final output obtained by converting the trained model into ONNX format and loading it into onnxruntime for inference? Is it the detection result for the input sample (whether it is an anomalous sample) and the corresponding confidence level? Or is it a single-channel anomaly_map with the same height and width as the input?
Hi, I'm puzzled about the loss between the student output and the AE output. Is it necessary?
Hi, your work is greatly appreciated! I have a question: why was the output size changed to 56 * 56 instead of the 64 * 64 in the paper?
I used the Python script the author gave, as follows:
!python efficientad.py --dataset mvtec_ad --subdataset wood
!python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle
but the error says:
"python3: can't open file '/content/efficientad.py': [Errno 2] No such file or directory"
I am running from Colab.
What should I do next? I hope someone can point me in the right direction. Thanks.
I wrote an interface for model inference following the logic of the predict function in efficientad.py.
However, in actual testing the average time was close to 1 second (excluding the model loading time), and it was also very slow after converting to an ONNX model.
Device information:
CPU:
Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
GPU:
NVIDIA GeForce RTX 3080 16G
This is the code tested with the pth model:
import time

import numpy as np
import torch
from tqdm import tqdm


def pth_predict(image, teacher_model, student_model, ae_model, teacher_mean, teacher_std, out_channels,
                q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None):
    teacher_output = teacher_model(image)
    teacher_output = (teacher_output - teacher_mean) / teacher_std
    student_output = student_model(image)
    autoencoder_output = ae_model(image)
    map_st = torch.mean((teacher_output - student_output[:, :out_channels]) ** 2,
                        dim=1, keepdim=True)
    map_ae = torch.mean((autoencoder_output -
                         student_output[:, out_channels:]) ** 2,
                        dim=1, keepdim=True)
    if q_st_start is not None:
        map_st = 0.1 * (map_st - q_st_start) / (q_st_end - q_st_start)
    if q_ae_start is not None:
        map_ae = 0.1 * (map_ae - q_ae_start) / (q_ae_end - q_ae_start)
    map_combined = 0.5 * map_st + 0.5 * map_ae
    return map_combined, map_st, map_ae


if __name__ == '__main__':
    # Load the PTH model
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    teacher_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/teacher_final.pth', map_location=device)
    student_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/student_final.pth', map_location=device)
    ae_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/autoencoder_final.pth', map_location=device)

    # Construct the input data
    fake_img_tensor = torch.rand((1, 3, 256, 256))
    output_channels_num = 384

    # Model prediction
    teacher_mean_tensor = torch.rand((1, output_channels_num, 1, 1))
    teacher_std_tensor = torch.rand((1, output_channels_num, 1, 1))
    time_range = 100
    time_cost_list = []
    for i in tqdm(range(time_range)):
        s1 = time.time()
        pth_predict(fake_img_tensor, teacher_net, student_net, ae_net, teacher_mean_tensor, teacher_std_tensor,
                    output_channels_num,
                    q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None)
        s2 = time.time()
        time_cost_list.append(s2 - s1)
    print(f'average time cost:{np.mean(time_cost_list):.6f}s')
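A few things in the script above may be worth checking before trusting the number: the models are loaded with map_location=device while fake_img_tensor is created on the CPU, nothing disables gradient tracking, and the first iterations include one-time setup cost. The helper below is a generic timing sketch in plain Python, not code from the repo. The name benchmark and the sync hook are hypothetical; on CUDA one would typically pass torch.cuda.synchronize as sync so that queued GPU work is included in the measurement.

```python
import time


def benchmark(fn, warmup=10, runs=100, sync=None):
    """Average wall-clock seconds per call of fn(), measured after warm-up.

    sync is an optional zero-argument callable that blocks until pending
    device work has finished (hypothetical hook for asynchronous backends).
    """
    for _ in range(warmup):
        fn()                      # warm-up calls are not timed
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if sync is not None:
        sync()                    # wait for the last queued call to finish
    return (time.perf_counter() - start) / runs
```

The call to pth_predict would be wrapped in a lambda and passed as fn, ideally inside a torch.no_grad() block, with the input tensor moved to the same device as the models.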
This is the test code for inference with the ONNX model:
import time

import numpy as np
import onnxruntime
from tqdm import tqdm


def onnx_predict(img_arr, teacher_session, student_session, ae_session, teacher_mean, teacher_std, out_channels,
                 q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None):
    ort_inputs1 = {teacher_session.get_inputs()[0].name: img_arr}
    teacher_output = teacher_session.run(None, ort_inputs1)
    teacher_output = teacher_output[0]
    teacher_output = (teacher_output - teacher_mean) / teacher_std
    ort_inputs2 = {student_session.get_inputs()[0].name: img_arr}
    student_output = student_session.run(None, ort_inputs2)
    student_output = student_output[0]
    ort_inputs3 = {ae_session.get_inputs()[0].name: img_arr}
    autoencoder_output = ae_session.run(None, ort_inputs3)
    autoencoder_output = autoencoder_output[0]
    map_st = np.mean((teacher_output - student_output[:, :out_channels]) ** 2, axis=1)
    map_ae = np.mean((autoencoder_output -
                      student_output[:, out_channels:]) ** 2, axis=1)
    if q_st_start is not None:
        map_st = 0.1 * (map_st - q_st_start) / (q_st_end - q_st_start)
    if q_ae_start is not None:
        map_ae = 0.1 * (map_ae - q_ae_start) / (q_ae_end - q_ae_start)
    map_combined = 0.5 * map_st + 0.5 * map_ae
    return map_combined, map_st, map_ae


if __name__ == '__main__':
    # Load the ONNX model
    teacher_ort_session = onnxruntime.InferenceSession('./output/onnx_path/teacher.onnx')
    student_ort_session = onnxruntime.InferenceSession('./output/onnx_path/student.onnx')
    ae_ort_session = onnxruntime.InferenceSession('./output/onnx_path/autoencoder.onnx')

    # Construct the input data
    fake_img_arr = np.random.rand(1, 3, 256, 256)
    fake_img_arr = fake_img_arr.astype(np.float32)
    output_channels_num = 384

    # Model prediction
    teacher_mean_arr = np.random.rand(1, output_channels_num, 1, 1)
    teacher_std_arr = np.random.rand(1, output_channels_num, 1, 1)
    time_range = 100
    time_cost_list = []
    for i in tqdm(range(time_range)):
        s1 = time.time()
        onnx_predict(fake_img_arr, teacher_ort_session, student_ort_session, ae_ort_session,
                     teacher_mean=teacher_mean_arr,
                     teacher_std=teacher_std_arr, out_channels=output_channels_num,
                     q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None)
        s2 = time.time()
        time_cost_list.append(s2 - s1)
    print(f'average time cost:{np.mean(time_cost_list):.6f}s')
This is the function for converting the PTH model to the ONNX model:
import torch


def convert_to_onnx_with_dynamic_img_shape(model, input_size, onnx_path):
    model.eval()
    dummy_input = torch.randn(1, *input_size, requires_grad=True)
    torch.onnx.export(model,
                      dummy_input,
                      onnx_path,
                      export_params=True,
                      opset_version=11,
                      do_constant_folding=True,
                      input_names=['modelInput'],
                      output_names=['modelOutput'],
                      dynamic_axes={'modelInput': {0: 'batch_size', 2: 'img_height', 3: 'img_weight'},
                                    'modelOutput': {0: 'batch_size', 2: 'img_height', 3: 'img_weight'}})
    print('Model has been converted to ONNX')


if __name__ == '__main__':
    out_channels = 384
    # teacher = get_pdn_small(out_channels)
    # student = get_pdn_small(2 * out_channels)
    # autoencoder = get_autoencoder(out_channels)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    teacher_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/teacher_final.pth',
                               map_location=device)
    student_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/student_final.pth',
                               map_location=device)
    autoencoder_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/autoencoder_final.pth',
                                   map_location=device)
    teacher_onnx_path = './output/onnx_path/teacher.onnx'
    convert_to_onnx_with_dynamic_img_shape(teacher_model, input_size=(3, 256, 256), onnx_path=teacher_onnx_path)
    student_onnx_path = './output/onnx_path/student.onnx'
    convert_to_onnx_with_dynamic_img_shape(student_model, input_size=(3, 256, 256), onnx_path=student_onnx_path)
    autoencoder_onnx_path = './output/onnx_path/autoencoder.onnx'
    convert_to_onnx_with_dynamic_img_shape(autoencoder_model, input_size=(3, 256, 256), onnx_path=autoencoder_onnx_path)
For the performance reported in the readme, did you use the same number of training steps (70,000) for all datasets (MVTec AD, MVTec LOCO, and VisA)?
Hello, thank you for your work.
To reproduce the paper results, I want to train on the MVTec and VisA datasets. However, there is no configuration for the VisA dataset, even though you report the score of EfficientAD on VisA in the table.
How can we train on the VisA dataset?
Secondly, while training, you save the model at fixed intervals without considering the AUROC score. I'm looking forward to hearing back from you. Thank you.