nelson1425 / efficientad

Unofficial implementation of EfficientAD https://arxiv.org/abs/2303.14535

Home Page: https://arxiv.org/abs/2303.14535

License: Apache License 2.0

Python 100.00%
anomaly-classification anomaly-detection anomaly-localization anomaly-segmentation efficientad paper

efficientad's Issues

The image_size

Hello, thank you for your work.
During training, does the image_size of the data need to be 256?
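
For context, the paper and this implementation work at 256×256, and the training script resizes inputs accordingly. A minimal preprocessing sketch (the exact transform in efficientad.py may differ):

from torchvision import transforms

# Any input resolution is resized to the 256x256 the networks were trained on.
resize_to_256 = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])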

Error when evaluating with MVTec code

I used the Python script you gave as follows:

python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle

but there are errors:

=== Evaluate bottle ===
Parsed 83 ground truth image files.
Read ground truth files and corresponding predictions...
0%| | 0/83 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
    main()
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
    calculate_au_pro_au_roc(
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\evaluate_experiment.py", line 148, in calculate_au_pro_au_roc
    prediction = util.read_tiff(pred_name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data\mvtec_ad_evaluation\generic_util.py", line 102, in read_tiff
    raise FileNotFoundError('Could not find a file with a TIFF extension'
FileNotFoundError: Could not find a file with a TIFF extension at ./output/1/anomaly_maps/mvtec_ad/bottle\test\broken_large\000
(base) PS F:\implementation_anomalydetection\nelson1425_EfficientAD\nelson_data>

Both mvtec_ad_evaluation.tar.xz and mvtec_anomaly_detection.tar.xz are the ones you shared.
I am running from an Anaconda PowerShell on Windows 10.

Can anyone point a way?
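
One hedged guess: note the mixed path separators in the error (bottle\test\broken_large\000), which can happen on Windows when a POSIX-style --anomaly_maps_dir is joined with backslashes. A quick diagnostic to see what was actually written to disk:

from pathlib import Path

# List whatever files exist for the first test image; expect e.g. 000.tiff.
pred_dir = Path('./output/1/anomaly_maps/mvtec_ad/bottle/test/broken_large')
print(sorted(pred_dir.glob('000.*')))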

About the validation of the MVTec LOCO AD dataset

I use the validation code that the author gives, but something goes wrong: the assertion “assert len(set(init_queries)) == len(init_queries)” fails.
I hope someone who has met this question can help me.

How to test one image?

Given a single input picture, how do I determine whether it is anomalous, and how should the code be written?
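
No official answer here, but a minimal sketch can be pieced together from the predict logic in efficientad.py. Everything below is an assumption: that the *_final.pth files store whole modules (as the training script's torch.save calls suggest), that the transform mirrors the training one, and that teacher_mean/teacher_std plus the map-normalization quantiles come from your own training run.

import torch
from PIL import Image
from torchvision import transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The training script saves whole modules, so torch.load returns them directly.
teacher = torch.load('teacher_final.pth', map_location=device).eval()
student = torch.load('student_final.pth', map_location=device).eval()
autoencoder = torch.load('autoencoder_final.pth', map_location=device).eval()

transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
image = transform(Image.open('test.png').convert('RGB')).unsqueeze(0).to(device)

out_channels = 384
with torch.no_grad():
    t = teacher(image)
    # For faithful scores, normalize t with the teacher_mean / teacher_std
    # computed during training, and rescale the maps with the q_* quantiles.
    s = student(image)
    a = autoencoder(image)
    map_st = torch.mean((t - s[:, :out_channels]) ** 2, dim=1, keepdim=True)
    map_ae = torch.mean((a - s[:, out_channels:]) ** 2, dim=1, keepdim=True)
    anomaly_map = 0.5 * map_st + 0.5 * map_ae

# Image-level decision: compare the map's maximum to a threshold
# chosen on validation images.
print('anomaly score:', anomaly_map.max().item())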

RuntimeError: quantile() input tensor is too large

Traceback (most recent call last):
  File "D:\code\git-DS\EfficientAD\efficientad.py", line 451, in <module>
    main()
  File "D:\code\git-DS\EfficientAD\efficientad.py", line 268, in main
    q_st_start, q_st_end, q_ae_start, q_ae_end = map_normalization(
  File "C:\Users\2878045\AppData\Roaming\Python\Python39\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\code\git-DS\EfficientAD\efficientad.py", line 374, in map_normalization
    q_st_start = torch.quantile(maps_st, q=0.9)
RuntimeError: quantile() input tensor is too large

Has anyone solved this?
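
For context: torch.quantile currently rejects inputs with more than 2^24 (≈16.7M) elements, and the stacked anomaly maps in map_normalization can easily exceed that. A hedged workaround sketch (safe_quantile is an illustrative helper, not part of the repo):

import numpy as np
import torch

QUANTILE_LIMIT = 2 ** 24  # torch.quantile's maximum input size

def safe_quantile(t: torch.Tensor, q: float) -> torch.Tensor:
    flat = t.reshape(-1)
    if flat.numel() <= QUANTILE_LIMIT:
        return torch.quantile(flat, q)
    # Option A: NumPy has no such limit (exact, but leaves the GPU).
    return torch.tensor(np.quantile(flat.cpu().numpy(), q), dtype=t.dtype)
    # Option B: estimate from a random subsample (stays on the GPU):
    # idx = torch.randint(0, flat.numel(), (QUANTILE_LIMIT,), device=flat.device)
    # return torch.quantile(flat[idx], q)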

Error when evaluating breakfast_box output with LOCO's official evaluation code

Thanks for your great effort in contributing and sharing the reimplementation.

I have a problem here that needs your assistance. After training on MVTec LOCO's breakfast_box category, I want to evaluate the output anomaly maps (in tiff format) using the official evaluation code, but I get this error:

File "[mydrive]/EfficientAD/mvtec_loco_ad_evaluation/src/aggregation.py", line 57, in binary_refinement
    assert len(set(init_queries)) == len(init_queries)

I debugged the issue and found that the error is caused by the output anomaly maps containing too many 0 values, so that when the evaluation code creates the initial thresholds, a.k.a. init_queries (a list of 50 anomaly scores), the list contains duplicate 0 values. For example:

[2.83783058984375, 0.3397859036922455, 0.11129003018140793, 0.04379851371049881, 0.018422557041049004, 0.005928939674049616, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.00046889434452168643, -0.004845781251788139, -0.008791022002696991, -0.012286320328712463, -0.015344921499490738, -0.01806800439953804, -0.02048872597515583, -0.0226922407746315, -0.024705413728952408, -0.026554450392723083, -0.028272513300180435, -0.029879318550229073, -0.0313967764377594, -0.0328390970826149, -0.03421252220869064, -0.0355297289788723, -0.036807697266340256, -0.03804701194167137, -0.03924761712551117, -0.04042193666100502, -0.04157993942499161, -0.04271307215094566, -0.043838512152433395, -0.044960867613554, -0.046086572110652924, -0.04722655192017555, -0.048392221331596375, -0.049606941640377045, -0.05089380219578743, -0.052285678684711456, -0.05385420098900795, -0.055760398507118225, -0.05840958654880524, -0.07654058350419998]

I wonder if anyone else encounters the same issue here. I would appreciate your help in solving it.
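
One hedged workaround, not an official fix: break the ties before the maps are saved, e.g. with jitter far below any meaningful score difference, so that the 50 candidate thresholds are all distinct (break_ties is an illustrative helper):

import numpy as np

def break_ties(anomaly_map: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    # Tiny uniform noise: negligible for the metrics, but it removes the
    # exact duplicate 0.0 scores that trip the assertion in aggregation.py.
    rng = np.random.default_rng(0)
    return anomaly_map + rng.uniform(-eps, eps, size=anomaly_map.shape)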

Distilled teacher model

Sorry for my ignorance, I have a small question about the teacher model.
Does pretraining the teacher by distillation on ImageNet make it smarter when training on other datasets?

Can you share the background of this design? I feel there is no similarity between the MVTec dataset and ImageNet.

Thank you.

Convolutions' padding - student/teacher model

Hi! I want to congratulate you on the implementation of EfficientAD; thank you for your work. I would like to better understand some parts of your implementation, specifically the student/teacher model. In the paper, the model uses a fixed padding value in its convolutions, giving an output of shape 384×64×64, whereas in your implementation the padding is controlled by a boolean variable and the output shape can differ (384×56×56). Is there a reason behind this?
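
For anyone wondering where 56×56 comes from: assuming the small PDN is the sequence Conv4 → AvgPool2 → Conv4 → AvgPool2 → Conv3 → Conv4 (stride-1 convolutions, stride-2 pools; this is a reading of the paper, not a quote of the repo), the unpadded sizes follow mechanically:

# Sketch: spatial size after each layer of the small PDN, 256x256 input.
def conv(s, k, p=0):              # stride-1 convolution
    return s + 2 * p - k + 1

def pool(s):                      # 2x2 average pool, stride 2
    return (s - 2) // 2 + 1

s = 256
for layer in (lambda s: conv(s, 4), pool, lambda s: conv(s, 4), pool,
              lambda s: conv(s, 3), lambda s: conv(s, 4)):
    s = layer(s)
    print(s)                      # 253, 126, 123, 61, 59, 56

With padding chosen so that only the two pools downsample, the output stays at 256/4 = 64, which matches the paper's 384×64×64.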

I encountered an error while evaluating

(EfficientAD) E:\PycharmProjects\EfficientAD-main>python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir E:\Dataset\anomaly_detection\mvtec_anomaly_detection\mvtec_anomaly_detection\ --anomaly_maps_dir E:\PycharmProjects\EfficientAD-main\output\1\anomaly_maps\mvtec_ad --output_dir E:\PycharmProjects\EfficientAD-main\output\1\metrics/mvtec_ad --evaluated_objects bottle
=== Evaluate bottle ===
Parsed 83 ground truth image files.
Read ground truth files and corresponding predictions...
0%| | 0/83 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
    main()
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
    calculate_au_pro_au_roc(
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\evaluate_experiment.py", line 148, in calculate_au_pro_au_roc
    prediction = util.read_tiff(pred_name)
  File "E:\PycharmProjects\EfficientAD-main\mvtec_ad_evaluation\generic_util.py", line 105, in read_tiff
    raise IOError('Found multiple files with a TIFF extension at'
OSError: Found multiple files with a TIFF extension at E:\PycharmProjects\EfficientAD-main\output\1\anomaly_maps\mvtec_ad\bottle\test\broken_large\000
Please specify which TIFF extension to use via the exts parameter of this function.

How to interpret the hyperparameter table?

[image: hyperparameter table from the paper]
q_a, q_b, and q_hard are three hyperparameters, but the meaning of the table is vague. I think the combination of the three hyperparameters determines the AUROC, yet the table reports AUROC values for each hyperparameter value separately.

Hardware / time needed for pretraining

Hi, first of all thank you for your work and for making it completely public. I'm curious about the hardware you used for the pretraining on ImageNet, and how much time it took. Good job!

A minor typo in distillation training

Before the backbone model is frozen, feature_dimensions() is called with a forward pass, so the BatchNorm parameters are updated and the network is no longer the original pretrained model. Though it should make little difference.
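
A hedged sketch of the fix this note hints at (feature_dimensions and its exact signature are as used in the training script and may differ): switch the backbone to eval mode around the probing forward pass, since eval() is what stops BatchNorm from updating its running statistics.

# Sketch: probe feature dimensions without disturbing BatchNorm statistics.
# eval() freezes running-mean/var updates; no_grad() merely skips autograd.
backbone.eval()
with torch.no_grad():
    dims = feature_dimensions(backbone)  # the helper mentioned in this issue
backbone.train()  # restore training mode if later code expects it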

IndexError: list index out of range

I am using a custom dataset.

(torch1) D:\code\git-DS\EfficientAD>python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir "\DS-NAS\ds\oilhole(crank)" --anomaly_maps_dir './output/1/anomaly_maps/oilhole(crank)/' --output_dir './output/1/metrics/oilhole(crank)/' --evaluated_objects type_i
=== Evaluate type_i ===
Parsed 0 ground truth image files.
Read ground truth files and corresponding predictions...
0it [00:00, ?it/s]
Compute PRO curve...
Traceback (most recent call last):
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 247, in <module>
    main()
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 215, in main
    calculate_au_pro_au_roc(
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\evaluate_experiment.py", line 157, in calculate_au_pro_au_roc
    pro_curve = compute_pro(
  File "D:\code\git-DS\EfficientAD\mvtec_ad_evaluation\pro_curve_util.py", line 37, in compute_pro
    anomaly_maps[0].shape[0],
IndexError: list index out of range

I've encountered this issue.

I made it all the way through training and got a final image AUC of 98.3421.
The tiff files are stored in the anomaly_maps folder, but I don't know whether I'm having trouble reading them, or why.
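
For what it's worth, "Parsed 0 ground truth image files" usually means the --dataset_base_dir does not follow the layout the MVTec evaluation parser expects. A sketch of the standard MVTec AD structure (names illustrative, shown for a custom object called type_i):

type_i/
    ground_truth/
        defect_name/
            000_mask.png
    test/
        defect_name/
            000.png
        good/
            000.png
    train/
        good/
            000.png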

If padding=True, worse results but good segmentation precision. Why?

Hi Nelson, I hope this message finds you well. I'm curious what your take is on this.
I am very satisfied with the results I obtain with pad_maps=True and padding=False.
However, as you know, pad_maps creates a "frame" that pads the resulting anomaly map, and it seems that anomalies contained in the padded region are not seen by the model.
Playing with the pad_maps and padding parameters, I realized that, regardless of the pad_maps value, it's padding that determines whether the padded region is ignored. So if I set padding=True, the model also sees anomalies close to the borders of the image. Also, if I set padding=True and pad_maps=False, I obtain precise segmentations and I also see defects close to the borders. It seems to me that pad_maps is only needed to correct the segmentation "translation" caused by padding=False.

So why am I writing this message instead of simply using padding=True and pad_maps=False?

The problem is that, regardless of pad_maps (so we can forget about it, because it's getting confusing 😁), padding=True significantly worsens the results. It seems that the model learns less.
So the first thing that came to my mind is that maybe the teacher was pretrained with padding=False, so it may have a different architecture from the student with padding=True that I'm trying to use.
But based on the code in this repo, it seems you pretrained the teacher using padding=True and then used a student with padding=False in the EfficientAD training. Is this true?

This image shows what I obtain with padding=True and pad_maps=False. It segments the fake defects I am using for these tests perfectly. Unfortunately, the overall performance on the real defects in the rest of the test set is significantly worse than with the model trained with padding=False (you can see it in the image too: the model struggles to segment the real defect at the bottom).

[image: anomaly map obtained with pad_maps=False and padding=True]

What do you think? What could be the reason why padding=True jeopardizes my trainings?

How to run inference

Hello, thanks for the nice code.

How do I just run inference on a single image?

Pretrained weights

Hello, my hardware is quite old and training takes a long time. Can you please share your pretrained weights so that I can quickly check the effectiveness of this model?

Per category result for MVTec AD

Hi Nelson,
can you provide the per-category results of EfficientAD-M on the MVTec AD dataset? To match the results in the anomalib framework, I would like to have the single-category data :)
Thanks!

Asking why batch size = 1 is used. Please help

Why use batch size = 1? What is it for?

train_loader = DataLoader(train_set, batch_size=1, shuffle=True,
                          num_workers=4, pin_memory=True)
train_loader_infinite = InfiniteDataloader(train_loader)
validation_loader = DataLoader(validation_set, batch_size=1)

Asking about deployment of ONNX-format models

Please advise: when the trained model is converted to ONNX format and run in onnxruntime for inference, what is the final output? Is it a detection result for the input sample (whether or not it is anomalous) with a corresponding confidence, or is it an anomaly_map with one channel and the same height and width as the input?

Error when training, inferring, and evaluating with MVTec code

I used the Python script the author gave as follows:

!python efficientad.py --dataset mvtec_ad --subdataset wood
!python mvtec_ad_evaluation/evaluate_experiment.py --dataset_base_dir './mvtec_anomaly_detection/' --anomaly_maps_dir './output/1/anomaly_maps/mvtec_ad/' --output_dir './output/1/metrics/mvtec_ad/' --evaluated_objects bottle

but the error says:
"python3: can't open file '/content/efficientad.py': [Errno 2] No such file or directory"

I am running this from Colab.
What should I do next? Hope someone can point a way. Thanks.
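
A hedged guess: the notebook's working directory (/content) simply doesn't contain efficientad.py. On Colab, cloning the repository (URL assumed from this project) and changing into it first should fix the path:

!git clone https://github.com/nelson1425/EfficientAD.git
%cd EfficientAD
!python efficientad.py --dataset mvtec_ad --subdataset wood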

Inference is too slow, is something wrong?

I wrote an interface for model inference following the logic of the predict function in efficientad.py.
However, in actual testing I found that the average time was close to 1 second (excluding model loading time), and it was also very slow after converting to the ONNX model.

Device information:
CPU:
Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
GPU:
NVIDIA GeForce RTX 3080 16G

This is the code tested with the pth model:

import time
import numpy as np
import torch
from tqdm import tqdm


def pth_predict(image, teacher_model, student_model, ae_model, teacher_mean, teacher_std, out_channels,
                q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None):
    teacher_output = teacher_model(image)
    teacher_output = (teacher_output - teacher_mean) / teacher_std
    student_output = student_model(image)
    autoencoder_output = ae_model(image)
    map_st = torch.mean((teacher_output - student_output[:, :out_channels]) ** 2,
                        dim=1, keepdim=True)
    map_ae = torch.mean((autoencoder_output -
                         student_output[:, out_channels:]) ** 2,
                        dim=1, keepdim=True)
    if q_st_start is not None:
        map_st = 0.1 * (map_st - q_st_start) / (q_st_end - q_st_start)
    if q_ae_start is not None:
        map_ae = 0.1 * (map_ae - q_ae_start) / (q_ae_end - q_ae_start)
    map_combined = 0.5 * map_st + 0.5 * map_ae
    return map_combined, map_st, map_ae


if __name__ == '__main__':
    # Load the PTH model
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    teacher_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/teacher_final.pth', map_location=device)
    student_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/student_final.pth', map_location=device)
    ae_net = torch.load('./output/ad_small/trainings/mvtec_ad/rain/autoencoder_final.pth', map_location=device)

    # Construct the input data
    fake_img_tensor = torch.rand((1, 3, 256, 256))

    output_channels_num = 384
    # Model prediction
    teacher_mean_tensor = torch.rand((1, output_channels_num, 1, 1))
    teacher_std_tensor = torch.rand((1, output_channels_num, 1, 1))

    time_range = 100
    time_cost_list = []
    for i in tqdm(range(time_range)):
        s1 = time.time()
        pth_predict(fake_img_tensor, teacher_net, student_net, ae_net, teacher_mean_tensor, teacher_std_tensor,
                    output_channels_num,
                    q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None)
        s2 = time.time()
        time_cost_list.append(s2 - s1)
    print(f'average time cost:{np.mean(time_cost_list):.6f}s')
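
One possible explanation, offered as a guess rather than a diagnosis: in the script above the models and tensors never move to the GPU, the models stay in train mode, autograd is active, and asynchronous CUDA kernels are timed without synchronization. A sketch of a fairer measurement, reusing the names defined above:

import time
import torch

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
for net in (teacher_net, student_net, ae_net):
    net.to(device).eval()                      # eval mode: no dropout / BN updates
img = fake_img_tensor.to(device)
mean = teacher_mean_tensor.to(device)
std = teacher_std_tensor.to(device)

with torch.no_grad():                          # skip autograd bookkeeping
    for _ in range(10):                        # warm-up: first calls pay one-off costs
        pth_predict(img, teacher_net, student_net, ae_net, mean, std, 384)
    if device.type == 'cuda':
        torch.cuda.synchronize()               # flush pending kernels before timing
    start = time.time()
    for _ in range(100):
        pth_predict(img, teacher_net, student_net, ae_net, mean, std, 384)
    if device.type == 'cuda':
        torch.cuda.synchronize()
    print(f'average time cost: {(time.time() - start) / 100:.6f}s')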

This is the test code for inference with the ONNX model:

import time
import numpy as np
import onnxruntime
from tqdm import tqdm


def onnx_predict(img_arr, teacher_session, student_session, ae_session, teacher_mean, teacher_std, out_channels,
                 q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None):
    ort_inputs1 = {teacher_session.get_inputs()[0].name: img_arr}
    teacher_output = teacher_session.run(None, ort_inputs1)
    teacher_output = teacher_output[0]
    teacher_output = (teacher_output - teacher_mean) / teacher_std

    ort_inputs2 = {student_session.get_inputs()[0].name: img_arr}
    student_output = student_session.run(None, ort_inputs2)
    student_output = student_output[0]

    ort_inputs3 = {ae_session.get_inputs()[0].name: img_arr}
    autoencoder_output = ae_session.run(None, ort_inputs3)
    autoencoder_output = autoencoder_output[0]

    map_st = np.mean((teacher_output - student_output[:, :out_channels]) ** 2, axis=1)
    map_ae = np.mean((autoencoder_output -
                      student_output[:, out_channels:]) ** 2, axis=1)
    if q_st_start is not None:
        map_st = 0.1 * (map_st - q_st_start) / (q_st_end - q_st_start)
    if q_ae_start is not None:
        map_ae = 0.1 * (map_ae - q_ae_start) / (q_ae_end - q_ae_start)
    map_combined = 0.5 * map_st + 0.5 * map_ae

    return map_combined, map_st, map_ae


if __name__ == '__main__':
    # Load the ONNX model
    teacher_ort_session = onnxruntime.InferenceSession('./output/onnx_path/teacher.onnx')
    student_ort_session = onnxruntime.InferenceSession('./output/onnx_path/student.onnx')
    ae_ort_session = onnxruntime.InferenceSession('./output/onnx_path/autoencoder.onnx')

    # Construct the input data
    fake_img_arr = np.random.rand(1, 3, 256, 256)
    fake_img_arr = fake_img_arr.astype(np.float32)
    output_channels_num = 384
    # Model prediction
    teacher_mean_arr = np.random.rand(1, output_channels_num, 1, 1)
    teacher_std_arr = np.random.rand(1, output_channels_num, 1, 1)

    time_range = 100
    time_cost_list = []
    for i in tqdm(range(time_range)):
        s1 = time.time()
        onnx_predict(fake_img_arr, teacher_ort_session, student_ort_session, ae_ort_session,
                     teacher_mean=teacher_mean_arr,
                     teacher_std=teacher_std_arr, out_channels=output_channels_num,
                     q_st_start=None, q_st_end=None, q_ae_start=None, q_ae_end=None)
        s2 = time.time()
        time_cost_list.append(s2 - s1)
    print(f'average time cost:{np.mean(time_cost_list):.6f}s')
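
Also worth checking: onnxruntime runs on the CPU execution provider unless a GPU provider is requested explicitly (and the onnxruntime-gpu package is installed). A small sketch:

import onnxruntime

# Request CUDA explicitly, falling back to CPU if it is unavailable.
session = onnxruntime.InferenceSession(
    './output/onnx_path/teacher.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
print(session.get_providers())  # shows which provider is actually active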

This is the function that converts the PTH model to the ONNX model:

import torch


def convert_to_onnx_with_dynamic_img_shape(model, input_size, onnx_path):
    model.eval()

    dummy_input = torch.randn(1, *input_size, requires_grad=True)

    torch.onnx.export(model, 
                      dummy_input, 
                      onnx_path,  
                      export_params=True,  
                      opset_version=11,  
                      do_constant_folding=True,  
                      input_names=['modelInput'],  
                      output_names=['modelOutput'],  
                      dynamic_axes={'modelInput': {0: 'batch_size', 2: 'img_height', 3: 'img_width'},
                                    'modelOutput': {0: 'batch_size', 2: 'img_height', 3: 'img_width'}})
    print('Model has been converted to ONNX')


if __name__ == '__main__':
    out_channels = 384

    # teacher = get_pdn_small(out_channels)
    # student = get_pdn_small(2 * out_channels)
    # autoencoder = get_autoencoder(out_channels)

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    teacher_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/teacher_final.pth',
                               map_location=device)
    student_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/student_final.pth',
                               map_location=device)
    autoencoder_model = torch.load('./output/ad_small/trainings/mvtec_ad/rain/autoencoder_final.pth',
                                   map_location=device)

    teacher_onnx_path = './output/onnx_path/teacher.onnx'
    convert_to_onnx_with_dynamic_img_shape(teacher_model, input_size=(3, 256, 256), onnx_path=teacher_onnx_path)

    student_onnx_path = './output/onnx_path/student.onnx'
    convert_to_onnx_with_dynamic_img_shape(student_model, input_size=(3, 256, 256), onnx_path=student_onnx_path)

    autoencoder_onnx_path = './output/onnx_path/autoencoder.onnx'
    convert_to_onnx_with_dynamic_img_shape(autoencoder_model, input_size=(3, 256, 256), onnx_path=autoencoder_onnx_path)

Result of inference on broken_large bottle

Hi Nelson, and thanks for your great code.
I just executed efficientad.py with the default settings for bottle with 70000 steps.
Below is the result I got for the first broken_large image (000.png):
[screenshot: anomaly map for bottle/test/broken_large/000.png]
I just wanted to know whether the result image is what you expected.

Train on VisA dataset

Hello, thank you for your work.
To reproduce the paper's results, I want to train on the MVTec and VisA datasets, but there is no configuration for the VisA dataset, even though you mention EfficientAD's score on VisA in the table.
How can we train on the VisA dataset?
Secondly, while training, the algorithm saves the model at fixed intervals without considering the AUROC score. I'm looking forward to hearing back from you. Thank you.
