cbn001012 / geo-hgan-unsupervised-anomaly-detection-in-geochemical-data-via-latent-space-learning

Geo-Hgan: Unsupervised anomaly detection in geochemical data via latent space learning

License: Apache License 2.0

Python 100.00%
anomaly-detection classification-algorithm data-mining data-science mineral-exploration transfer-learning geochemical-data

Research roadmap

Code Running Guide

Step 1: Train the Latent Space Learning Module

To train the latent space learning module, run "train_LSTM.py". The networks it uses are defined in the nets folder: "autoencoder.py" defines the structure of the autoencoder, "latent_GAN.py" defines the structure of the latent space generative adversarial network, and "sample_GAN.py" defines the structure of the sample generative adversarial network.

This step learns the latent space features of the samples and uses them to constrain the sample generation process.

Step 2: GAN-guided Variational Feature Extraction in Joint Training

Run the "GAN_guided_train_encoder.py" file to further constrain the training of the encoder using the weights of the sample generative adversarial network pre-trained in Step 1.

This step uses the pre-trained GAN to constrain how the encoder extracts variational features, so that the features of weak anomalous data can be identified.

Step 3: Evaluation of Anomaly Detection Performance

Step 1 and Step 2 together constitute Geo-Hgan. Next, run "test_anomaly_detection.py" to test the anomaly detection performance of Geo-Hgan. Additionally, run "select_anomaly_metric.py" to identify the best metric for calculating anomaly scores, based on the AUC (Area Under the Curve). Three anomaly score criteria are computed: "z_distance", "img_distance", and "anomaly_score", where "anomaly_score" is the weighted sum of "z_distance" and "img_distance". The metric with the maximum AUC is taken to be the most effective at identifying anomalies and is used as the standard for calculating the anomaly scores of the samples.
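The metric selection described above can be sketched roughly as follows. This is a minimal illustration, not the contents of "select_anomaly_metric.py": the function names `auc` and `select_metric`, the `weight` parameter, the form of the weighted sum, and the toy values are all assumptions made for the example.

```python
def auc(scores, labels):
    """AUC as the probability that a random anomaly (label 1) scores higher
    than a random normal sample (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_metric(z_distance, img_distance, labels, weight=0.5):
    """Compute the AUC of each candidate metric and return the best one.

    Assumption: the weighted sum is (1 - weight) * img_distance + weight * z_distance.
    The README only states that "anomaly_score" is a weighted sum.
    """
    anomaly_score = [
        (1 - weight) * i + weight * z for z, i in zip(z_distance, img_distance)
    ]
    candidates = {
        "z_distance": z_distance,
        "img_distance": img_distance,
        "anomaly_score": anomaly_score,
    }
    aucs = {name: auc(values, labels) for name, values in candidates.items()}
    best = max(aucs, key=aucs.get)  # metric with the maximum AUC
    return best, aucs


# Toy data (hypothetical): three normal samples followed by two anomalies.
labels = [0, 0, 0, 1, 1]
z_distance = [0.3, 0.7, 0.1, 0.5, 0.6]
img_distance = [0.2, 0.1, 0.6, 0.9, 0.5]

best, aucs = select_metric(z_distance, img_distance, labels)
print(best, aucs)
```

In this toy case neither distance separates the classes perfectly on its own, while their weighted sum does, so "anomaly_score" is selected.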

After calculating the anomaly scores for each sample in the source domain, run the "AnomFilter.py" file to filter the anomalous samples based on two criteria: outlier values and data distribution. This filtering process aims to select high-confidence unlabeled anomalous samples. These selected anomalous samples will be added to the training data of the target domain and labeled as positive samples.
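One possible reading of the two filtering criteria is sketched below. The README does not specify how "AnomFilter.py" defines them, so everything here is an assumption: the outlier criterion is taken to be Tukey's upper fence (Q3 + 1.5 × IQR), the distribution criterion is taken to be a z-score threshold, and the names `filter_anomalies`, `iqr_factor`, and `z_threshold` are hypothetical.

```python
import statistics


def filter_anomalies(scores, iqr_factor=1.5, z_threshold=2.0):
    """Return indices of samples that pass BOTH assumed criteria.

    Criterion 1 (outlier value): score above Q3 + iqr_factor * IQR.
    Criterion 2 (data distribution): z-score of the score above z_threshold.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4)
    upper_fence = q3 + iqr_factor * (q3 - q1)
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)

    selected = []
    for idx, score in enumerate(scores):
        is_outlier = score > upper_fence
        in_upper_tail = std > 0 and (score - mean) / std > z_threshold
        if is_outlier and in_upper_tail:
            selected.append(idx)
    return selected


# Toy anomaly scores (hypothetical): 18 low scores and 2 clearly high ones.
scores = [0.08 + 0.005 * (i % 9) for i in range(18)] + [0.90, 0.95]

selected = filter_anomalies(scores)
# High-confidence samples are labeled as positives (label 1) for the target domain.
pseudo_labeled = [(idx, 1) for idx in selected]
print(pseudo_labeled)
```

Requiring both criteria is deliberately conservative: it keeps only samples that are extreme by two different measures, in line with the stated goal of selecting high-confidence anomalies.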

Utility functions: "tools" folder

The "tools" folder contains "loadTifImage.py", which reads multi-channel TIFF data for use in PyTorch.

Testing and Evaluation: "evaluation" folder

The "evaluation" folder contains the code for computing the ROC (Receiver Operating Characteristic) curve for unsupervised models ("unsupervised_ROC.py"), along with the ROC results of the models on different datasets.
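For readers unfamiliar with how a ROC curve is obtained from anomaly scores, a minimal sketch follows. It is not the code in "unsupervised_ROC.py"; the function names `roc_curve` and `trapezoid_auc` and the toy data are assumptions for illustration. The curve is built by sweeping a threshold from the highest score to the lowest and recording the false positive rate and true positive rate at each step.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists from a descending threshold sweep over scores.

    labels: 1 for anomalous samples, 0 for normal samples.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = rank + 1 == len(order)
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Toy example (hypothetical scores): higher score means more anomalous.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

fpr, tpr = roc_curve(scores, labels)
area = trapezoid_auc(fpr, tpr)
print(list(zip(fpr, tpr)), area)
```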

Predicted Mineral Resources Results
