nelson1425 / efficientad

Unofficial implementation of EfficientAD https://arxiv.org/abs/2303.14535

Home Page: https://arxiv.org/abs/2303.14535

License: Apache License 2.0

Python 100.00%
anomaly-classification anomaly-detection anomaly-localization anomaly-segmentation efficientad paper

efficientad's Issues

The image_size

Hello, thank you for your work.
During training, does the image_size of the data need to be 256?
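
For context, the paper and this implementation work at 256×256, and the training script resizes inputs accordingly. A minimal preprocessing sketch (the exact transform in efficientad.py may differ):

from torchvision import transforms

# Any input resolution is resized to the 256x256 the networks were trained on.
resize_to_256 = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])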

Error when evaluating with MVTec code

I used the Python script you gave as follows:

python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle

but there are errors:

=== Evaluate bottle ===
Parsed 83 ground truth image files.
Read ground truth files and corresponding predictions...
0%| | 0/83 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
    main()
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
    calculate_au_pro_au_roc(
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 148, in calculate_au_pro_au_roc
    prediction = util.read_tiff(pred_name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\generic_util.py", line 102, in read_tiff
    raise FileNotFoundError('Could not find a file with a TIFF extension'
FileNotFoundError: Could not find a file with a TIFF extension at ./output/1/anomaly_maps/mvtec_ad/bottle\test\broken_large\000
(base) PS F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data>

Both mvtec_ad_evaluation.tar.xz and mvtec_anomaly_detection.tar.xz are the ones you shared.
I am running from an Anaconda PowerShell on Windows 10.

Can anyone point a way?
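
One hedged guess: note the mixed path separators in the error (bottle\test\broken_large\000), which can happen on Windows when a POSIX-style --anomaly_maps_dir is joined with backslashes. A quick diagnostic to see what was actually written to disk:

from pathlib import Path

# List whatever files exist for the first test image; expect e.g. 000.tiff.
pred_dir = Path('./output/1/anomaly_maps/mvtec_ad/bottle/test/broken_large')
print(sorted(pred_dir.glob('000.*')))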

About the validation of the MVTec LOCO AD dataset

I use the validation code that the author gives, but something goes wrong: the assertion “assert len(set(init_queries)) == len(init_queries)” fails.
I hope someone who has met this question can help me.

How to test one image?

Given a single input picture, how do I determine whether it is anomalous, and how should the code be written?
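
No official answer here, but a minimal sketch can be pieced together from the predict logic in efficientad.py. Everything below is an assumption: that the *_final.pth files store whole modules (as the training script's torch.save calls suggest), that the transform mirrors the training one, and that teacher_mean/teacher_std plus the map-normalization quantiles come from your own training run.

import torch
from PIL import Image
from torchvision import transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The training script saves whole modules, so torch.load returns them directly.
teacher = torch.load('teacher_final.pth', map_location=device).eval()
student = torch.load('student_final.pth', map_location=device).eval()
autoencoder = torch.load('autoencoder_final.pth', map_location=device).eval()

transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
image = transform(Image.open('test.png').convert('RGB')).unsqueeze(0).to(device)

out_channels = 384
with torch.no_grad():
    t = teacher(image)
    # For faithful scores, normalize t with the teacher_mean / teacher_std
    # computed during training, and rescale the maps with the q_* quantiles.
    s = student(image)
    a = autoencoder(image)
    map_st = torch.mean((t - s[:, :out_channels]) ** 2, dim=1, keepdim=True)
    map_ae = torch.mean((a - s[:, out_channels:]) ** 2, dim=1, keepdim=True)
    anomaly_map = 0.5 * map_st + 0.5 * map_ae

# Image-level decision: compare the map's maximum to a threshold
# chosen on validation images.
print('anomaly score:', anomaly_map.max().item())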

RuntimeError: quantile() input tensor is too large

Traceback (most recent call last):
  File "D:\code\git-DS\EfficientAD\efficientad.py", line 451, in <module>
    main()
  File "D:\code\git-DS\EfficientAD\efficientad.py", line 268, in main
    q_st_start, q_st_end, q_ae_start, q_ae_end = map_normalization(
  File "C:\Users\2878045\AppData\Roaming\Python\Python39\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\code\git-DS\EfficientAD\efficientad.py", line 374, in map_normalization
    q_st_start = torch.quantile(maps_st, q=0.9)
RuntimeError: quantile() input tensor is too large

Has anyone solved this?
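
For context: torch.quantile currently rejects inputs with more than 2^24 (≈16.7M) elements, and the stacked anomaly maps in map_normalization can easily exceed that. A hedged workaround sketch (safe_quantile is an illustrative helper, not part of the repo):

import numpy as np
import torch

QUANTILE_LIMIT = 2 ** 24  # torch.quantile's maximum input size

def safe_quantile(t: torch.Tensor, q: float) -> torch.Tensor:
    flat = t.reshape(-1)
    if flat.numel() <= QUANTILE_LIMIT:
        return torch.quantile(flat, q)
    # Option A: NumPy has no such limit (exact, but leaves the GPU).
    return torch.tensor(np.quantile(flat.cpu().numpy(), q), dtype=t.dtype)
    # Option B: estimate from a random subsample (stays on the GPU):
    # idx = torch.randint(0, flat.numel(), (QUANTILE_LIMIT,), device=flat.device)
    # return torch.quantile(flat[idx], q)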

Error when evaluating breakfast_box output with LOCO's official evaluation code

Thanks for your great effort in contributing and sharing the reimplementation.

I have a problem here that needs your assistance. After training on MVTec LOCO's breakfast_box category, I want to evaluate the output anomaly maps (in tiff format) using the official evaluation code, but I get this error:

File "[mydrive]/EfficientAD/mvtec_loco_ad_evaluation/src/aggregation.py", line 57, in binary_refinement
    assert len(set(init_queries)) == len(init_queries)

I debugged the issue and found that the error is caused by the output anomaly maps containing too many 0 values, so that when the evaluation code creates the initial thresholds, a.k.a. init_queries (a list of 50 anomaly scores), the list contains duplicate 0 values. For example:

[2.83783058984375, 0.3397859036922455, 0.11129003018140793, 0.04379851371049881, 0.018422557041049004, 0.005928939674049616, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.00046889434452168643, -0.004845781251788139, -0.008791022002696991, -0.012286320328712463, -0.015344921499490738, -0.01806800439953804, -0.02048872597515583, -0.0226922407746315, -0.024705413728952408, -0.026554450392723083, -0.028272513300180435, -0.029879318550229073, -0.0313967764377594, -0.0328390970826149, -0.03421252220869064, -0.0355297289788723, -0.036807697266340256, -0.03804701194167137, -0.03924761712551117, -0.04042193666100502, -0.04157993942499161, -0.04271307215094566, -0.043838512152433395, -0.044960867613554, -0.046086572110652924, -0.04722655192017555, -0.048392221331596375, -0.049606941640377045, -0.05089380219578743, -0.052285678684711456, -0.05385420098900795, -0.055760398507118225, -0.05840958654880524, -0.07654058350419998]

I wonder if anyone else encounters the same issue here. I would appreciate your help in solving it.
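
One hedged workaround, not an official fix: break the ties before the maps are saved, e.g. with jitter far below any meaningful score difference, so that the 50 candidate thresholds are all distinct (break_ties is an illustrative helper):

import numpy as np

def break_ties(anomaly_map: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    # Tiny uniform noise: negligible for the metrics, but it removes the
    # exact duplicate 0.0 scores that trip the assertion in aggregation.py.
    rng = np.random.default_rng(0)
    return anomaly_map + rng.uniform(-eps, eps, size=anomaly_map.shape)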

Distilled teacher model

Sorry for my ignorance, I have a small question about the teacher model.
Does pretraining the teacher by distillation on ImageNet make it smarter when training on other datasets?

Can you share the background of this design? I feel there is no similarity between the MVTec dataset and ImageNet.

Thank you.

Convolutions' padding - student/teacher model

Hi! I want to congratulate you on the implementation of EfficientAD; thank you for your work. I would like to better understand some parts of your implementation, specifically the student/teacher model. In the paper, the model uses a fixed padding value in its convolutions, giving an output of shape 384×64×64, whereas in your implementation the padding is controlled by a boolean variable and the output shape can differ (384×56×56). Is there a reason behind this?
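
For anyone wondering where 56×56 comes from: assuming the small PDN is the sequence Conv4 → AvgPool2 → Conv4 → AvgPool2 → Conv3 → Conv4 (stride-1 convolutions, stride-2 pools; this is a reading of the paper, not a quote of the repo), the unpadded sizes follow mechanically:

# Sketch: spatial size after each layer of the small PDN, 256x256 input.
def conv(s, k, p=0):              # stride-1 convolution
    return s + 2 * p - k + 1

def pool(s):                      # 2x2 average pool, stride 2
    return (s - 2) // 2 + 1

s = 256
for layer in (lambda s: conv(s, 4), pool, lambda s: conv(s, 4), pool,
              lambda s: conv(s, 3), lambda s: conv(s, 4)):
    s = layer(s)
    print(s)                      # 253, 126, 123, 61, 59, 56

With padding chosen so that only the two pools downsample, the output stays at 256/4 = 64, which matches the paper's 384×64×64.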

I encountered an error while evaluating

(EfficientAD) E:\PycharmProjects\EfficientAD-main>python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir E:\Dataset\anomaly_detection\mvtec_anomaly_detection\mvtec_anomaly_detection\ --anomaly_maps_dir E:\PycharmProjects\EfficientAD-main\output\1\anomaly_maps\mvtec_ad --output_dir E:\PycharmProjects\EfficientAD-main\output\1\metrics/mvtec_ad --evaluated_objects bottle
=== Evaluate bottle ===
Parsed 83 ground truth image files.
Read ground truth files and corresponding predictions...
0%| | 0/83 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
    main()
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
    calculate_au_pro_au_roc(
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 148, in calculate_au_pro_au_roc
    prediction = util.read_tiff(pred_name)
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\generic_util.py", line 105, in read_tiff
    raise IOError('Found multiple files with a TIFF extension at'
OSError: Found multiple files with a TIFF extension at E:\PycharmProjects\EfficientAD-main\output\1\anomaly_maps\mvtec_ad\bottle\test\broken_large\000
Please specify which TIFF extension to use via the exts parameter of this function.

How to interpret the hyperparameter table?

[image: hyperparameter table from the paper]
q_a, q_b, and q_hard are three hyperparameters, but the meaning of the table is vague. I think the combination of the three hyperparameters determines the AUROC, yet the table reports AUROC values for each hyperparameter value separately.

Hardware / time needed for pretraining

Hi, first of all thank you for your work and for making it completely public. I'm curious about the hardware you used for the pretraining on ImageNet, and how much time it took. Good job!

A minor typo in distillation training

Before the backbone model is frozen, feature_dimensions() is called with a forward pass, so the BatchNorm parameters are updated and the network is no longer the original pretrained model. Though it should make little difference.
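
A hedged sketch of the fix this note hints at (feature_dimensions and its exact signature are as used in the training script and may differ): switch the backbone to eval mode around the probing forward pass, since eval() is what stops BatchNorm from updating its running statistics.

# Sketch: probe feature dimensions without disturbing BatchNorm statistics.
# eval() freezes running-mean/var updates; no_grad() merely skips autograd.
backbone.eval()
with torch.no_grad():
    dims = feature_dimensions(backbone)  # the helper mentioned in this issue
backbone.train()  # restore training mode if later code expects it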

IndexError: list index out of range

I am using a custom dataset.

(torch1) D:\code\git-DS\EfficientAD>python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir "\DS-NAS\ds\oilhole(crank)" --anomaly_maps_dir './output/1/anomaly_maps/oilhole(crank)/' --output_dir './output/1/metrics/oilhole(crank)/' --evaluated_objects type_i
=== Evaluate type_i ===
Parsed 0 ground truth image files.
Read ground truth files and corresponding predictions...
0it [00:00, ?it/s]
Compute PRO curve...
Traceback (most recent call last):
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
    main()
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
    calculate_au_pro_au_roc(
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 157, in calculate_au_pro_au_roc
    pro_curve = compute_pro(
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\pro_curve_util.py", line 37, in compute_pro
    anomaly_maps[0].shape[0],
IndexError: list index out of range

I've encountered this issue.

I made it all the way through training and got a final image AUC of 98.3421.
The tiff files are stored in the anomaly_maps folder, but I don't know whether I'm having trouble reading them, or why.
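
For what it's worth, "Parsed 0 ground truth image files" usually means the --dataset_base_dir does not follow the layout the MVTec evaluation parser expects. A sketch of the standard MVTec AD structure (names illustrative, shown for a custom object called type_i):

type_i/
    ground_truth/
        defect_name/
            000_mask.png
    test/
        defect_name/
            000.png
        good/
            000.png
    train/
        good/
            000.png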

If padding=True, worse results but good segmentation precision. Why?

Hi Nelson, I hope this message finds you well. I'm curious what your take is on this.
I am very satisfied with the results I obtain with pad_maps=True and padding=False.
However, as you know, pad_maps creates a "frame" that pads the resulting anomaly map, and it seems that anomalies contained in the padded region are not seen by the model.
Playing with the pad_maps and padding parameters, I realized that, regardless of the pad_maps value, it's padding that determines whether the padded region is ignored. So if I set padding=True, the model also sees anomalies close to the borders of the image. Also, if I set padding=True and pad_maps=False, I obtain precise segmentations and I also see defects close to the borders. It seems to me that pad_maps is only needed to correct the segmentation "translation" caused by padding=False.

So why am I writing this message instead of simply using padding=True and pad_maps=False?

The problem is that, regardless of pad_maps (so we can forget about it, because it's getting confusing 😁), padding=True significantly worsens the results. It seems that the model learns less.
So the first thing that came to my mind is that maybe the teacher was pretrained with padding=False, so it may have a different architecture from the student with padding=True that I'm trying to use.
But based on the code in this repo, it seems you pretrained the teacher using padding=True and then used a student with padding=False in the EfficientAD training. Is this true?

This image shows what I obtain with padding=True and pad_maps=False. It segments the fake defects I am using for these tests perfectly. Unfortunately, the overall performance on the real defects in the rest of the test set is significantly worse than with the model trained with padding=False (you can see it in the image too: the model struggles to segment the real defect at the bottom).

[image: anomaly map obtained with pad_maps=False and padding=True]

What do you think? What could be the reason why padding=True jeopardizes my trainings?

How to run inference

Hello, thanks for the nice code.

How do I just run inference on a single image?

Pretrained weights

Hello, my hardware is quite old and training takes a long time. Can you please share your pretrained weights so that I can quickly check the effectiveness of this model?

Per category result for MVTec AD

Hi Nelson,
can you provide the per-category results of EfficientAD-M on the MVTec AD dataset? To match the results in the anomalib framework, I would like to have the single-category data :)
Thanks!

Asking why batch size = 1 is used. Please help

Why use batch size = 1? What is it for?

train_loader = DataLoader(train_set, batch_size=1, shuffle=True,
                          num_workers=4, pin_memory=True)
train_loader_infinite = InfiniteDataloader(train_loader)
validation_loader = DataLoader(validation_set, batch_size=1)

Asking about deployment of ONNX-format models

Please advise: when the trained model is converted to ONNX format and run in onnxruntime for inference, what is the final output? Is it a detection result for the input sample (whether or not it is anomalous) with a corresponding confidence, or is it an anomaly_map with one channel and the same height and width as the input?

Error when training, inferring, and evaluating with MVTec code

I used the Python script the author gave as follows:

!python efficientad.py --dataset mvtec_ad --subdataset wood
!python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle

but the error says:
"python3: can't open file '/content/efficientad.py': [Errno 2] No such file or directory"

I am running this from Colab.
What should I do next? Hope someone can point a way. Thanks.
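
A hedged guess: the notebook's working directory (/content) simply doesn't contain efficientad.py. On Colab, cloning the repository (URL assumed from this project) and changing into it first should fix the path:

!git clone https://github.com/nelson1425/EfficientAD.git
%cd EfficientAD
!python efficientad.py --dataset mvtec_ad --subdataset wood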

Inference is too slow, is something wrong?

I wrote an interface for model inference following the logic of the predict function in efficientad.py.
However, in actual testing I found that the average time was close to 1 second (excluding model loading time), and it was also very slow after converting to the ONNX model.

Device information:
CPU:
Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
GPU:
NVIDIA GeForce RTX 3080 16G

This is the code tested with the pth model:

import time
import numpy as np
import torch
from tqdm import tqdm


def pth_predict(image, teacher_model, student_model, ae_model, teacher_mean, teacher_std, out_channels,
                q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None):
    teacher_output = teacher_model(image)
    teacher_output = (teacher_output - teacher_mean) / teacher_std
    student_output = student_model(image)
    autoencoder_output = ae_model(image)
    map_st = torch.mean((teacher_output - student_output[:, :out_channels]) ** 2,
                        dim=1, keepdim=True)
    map_ae = torch.mean((autoencoder_output -
                         student_output[:, out_channels:]) ** 2,
                        dim=1, keepdim=True)
    if q_st_start is not None:
        map_st = 0.1 * (map_st - q_st_start) / (q_st_end - q_st_start)
    if q_ae_start is not None:
        map_ae = 0.1 * (map_ae - q_ae_start) / (q_ae_end - q_ae_start)
    map_combined = 0.5 * map_st + 0.5 * map_ae
    return map_combined, map_st, map_ae


if __name__ == '__main__':
    # Load the PTH model
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    teacher_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/teacher_final.pth', map_location=device)
    student_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/student_final.pth', map_location=device)
    ae_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/autoencoder_final.pth', map_location=device)

    # Construct the input data
    fake_img_tensor = torch.rand((1, 3, 256, 256))

    output_channels_num = 384
    # Model prediction
    teacher_mean_tensor = torch.rand((1, output_channels_num, 1, 1))
    teacher_std_tensor = torch.rand((1, output_channels_num, 1, 1))

    time_range = 100
    time_cost_list = []
    for i in tqdm(range(time_range)):
        s1 = time.time()
        pth_predict(fake_img_tensor, teacher_net, student_net, ae_net, teacher_mean_tensor, teacher_std_tensor,
                    output_channels_num,
                    q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None)
        s2 = time.time()
        time_cost_list.append(s2 - s1)
    print(f'average time cost:{np.mean(time_cost_list):.6f}s')
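
One possible explanation, offered as a guess rather than a diagnosis: in the script above the models and tensors never move to the GPU, the models stay in train mode, autograd is active, and asynchronous CUDA kernels are timed without synchronization. A sketch of a fairer measurement, reusing the names defined above:

import time
import torch

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
for net in (teacher_net, student_net, ae_net):
    net.to(device).eval()                      # eval mode: no dropout / BN updates
img = fake_img_tensor.to(device)
mean = teacher_mean_tensor.to(device)
std = teacher_std_tensor.to(device)

with torch.no_grad():                          # skip autograd bookkeeping
    for _ in range(10):                        # warm-up: first calls pay one-off costs
        pth_predict(img, teacher_net, student_net, ae_net, mean, std, 384)
    if device.type == 'cuda':
        torch.cuda.synchronize()               # flush pending kernels before timing
    start = time.time()
    for _ in range(100):
        pth_predict(img, teacher_net, student_net, ae_net, mean, std, 384)
    if device.type == 'cuda':
        torch.cuda.synchronize()
    print(f'average time cost: {(time.time() - start) / 100:.6f}s')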

This is the test code for inference with the ONNX model:

import time
import numpy as np
import onnxruntime
from tqdm import tqdm


def onnx_predict(img_arr, teacher_session, student_session, ae_session, teacher_mean, teacher_std, out_channels,
                 q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None):
    ort_inputs1 = {teacher_session.get_inputs()[0].name: img_arr}
    teacher_output = teacher_session.run(None, ort_inputs1)
    teacher_output = teacher_output[0]
    teacher_output = (teacher_output - teacher_mean) / teacher_std

    ort_inputs2 = {student_session.get_inputs()[0].name: img_arr}
    student_output = student_session.run(None, ort_inputs2)
    student_output = student_output[0]

    ort_inputs3 = {ae_session.get_inputs()[0].name: img_arr}
    autoencoder_output = ae_session.run(None, ort_inputs3)
    autoencoder_output = autoencoder_output[0]

    map_st = np.mean((teacher_output - student_output[:, :out_channels]) ** 2, axis=1)
    map_ae = np.mean((autoencoder_output -
                      student_output[:, out_channels:]) ** 2, axis=1)
    if q_st_start is not None:
        map_st = 0.1 * (map_st - q_st_start) / (q_st_end - q_st_start)
    if q_ae_start is not None:
        map_ae = 0.1 * (map_ae - q_ae_start) / (q_ae_end - q_ae_start)
    map_combined = 0.5 * map_st + 0.5 * map_ae

    return map_combined, map_st, map_ae


if __name__ == '__main__':
    # Load the ONNX model
    teacher_ort_session = onnxruntime.InferenceSession('./output/onnx_path/teacher.onnx')
    student_ort_session = onnxruntime.InferenceSession('./output/onnx_path/student.onnx')
    ae_ort_session = onnxruntime.InferenceSession('./output/onnx_path/autoencoder.onnx')

    # Construct the input data
    fake_img_arr = np.random.rand(1, 3, 256, 256)
    fake_img_arr = fake_img_arr.astype(np.float32)
    output_channels_num = 384
    # Model prediction
    teacher_mean_arr = np.random.rand(1, output_channels_num, 1, 1)
    teacher_std_arr = np.random.rand(1, output_channels_num, 1, 1)

    time_range = 100
    time_cost_list = []
    for i in tqdm(range(time_range)):
        s1 = time.time()
        onnx_predict(fake_img_arr, teacher_ort_session, student_ort_session, ae_ort_session,
                     teacher_mean=teacher_mean_arr,
                     teacher_std=teacher_std_arr, out_channels=output_channels_num,
                     q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None)
        s2 = time.time()
        time_cost_list.append(s2 - s1)
    print(f'average time cost:{np.mean(time_cost_list):.6f}s')
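
Also worth checking: onnxruntime runs on the CPU execution provider unless a GPU provider is requested explicitly (and the onnxruntime-gpu package is installed). A small sketch:

import onnxruntime

# Request CUDA explicitly, falling back to CPU if it is unavailable.
session = onnxruntime.InferenceSession(
    './output/onnx_path/teacher.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
print(session.get_providers())  # shows which provider is actually active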

This is the function that converts the PTH model to the ONNX model:

import torch


def convert_to_onnx_with_dynamic_img_shape(model, input_size, onnx_path):
    model.eval()

    dummy_input = torch.randn(1, *input_size, requires_grad=True)

    torch.onnx.export(model, 
                      dummy_input, 
                      onnx_path,  
                      export_params=True,  
                      opset_version=11,  
                      do_constant_folding=True,  
                      input_names=['modelInput'],  
                      output_names=['modelOutput'],  
                      dynamic_axes={'modelInput': {0: 'batch_size', 2: 'img_height', 3: 'img_width'},
                                    'modelOutput': {0: 'batch_size', 2: 'img_height', 3: 'img_width'}})
    print('Model has been converted to ONNX')


if __name__ == '__main__':
    out_channels = 384

    # teacher = get_pdn_small(out_channels)
    # student = get_pdn_small(2 * out_channels)
    # autoencoder = get_autoencoder(out_channels)

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    teacher_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/teacher_final.pth',
                               map_location=device)
    student_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/student_final.pth',
                               map_location=device)
    autoencoder_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/autoencoder_final.pth',
                                   map_location=device)

    teacher_onnx_path = './output/onnx_path/teacher.onnx'
    convert_to_onnx_with_dynamic_img_shape(teacher_model, input_size=(3, 256, 256), onnx_path=teacher_onnx_path)

    student_onnx_path = './output/onnx_path/student.onnx'
    convert_to_onnx_with_dynamic_img_shape(student_model, input_size=(3, 256, 256), onnx_path=student_onnx_path)

    autoencoder_onnx_path = './output/onnx_path/autoencoder.onnx'
    convert_to_onnx_with_dynamic_img_shape(autoencoder_model, input_size=(3, 256, 256), onnx_path=autoencoder_onnx_path)

Result of inference on broken_large bottle

Hi Nelson, and thanks for your great code.
I just executed efficientad.py with the default settings for bottle with 70000 steps.
Below is the result I got for the first broken_large image (000.png):
[screenshot: anomaly map for bottle/test/broken_large/000.png]
I just wanted to know whether the result image is what you expected.

Train on VisA dataset

Hello, thank you for your work.
To reproduce the paper's results, I want to train on the MVTec and VisA datasets, but there is no configuration for the VisA dataset, even though you mention EfficientAD's score on VisA in the table.
How can we train on the VisA dataset?
Secondly, while training, the algorithm saves the model at fixed intervals without considering the AUROC score. I'm looking forward to hearing back from you. Thank you.
