Extended Isolation Forest

This is a simple Python package implementing the Extended Isolation Forest method for detecting anomalies and outliers in a distribution of data points. It is an improvement on the original Isolation Forest algorithm, which is described (among other places) in this paper.

The original algorithm produces inconsistent anomaly scores because of the way it slices the data. Even though the slicing hyperplanes are selected at random, they are always parallel to the coordinate axes. The resulting artifacts can be seen in the score maps presented in the example notebooks in this repository. To improve the situation, we propose an extension that allows the hyperplanes to be taken at random angles. The way this is done gives rise to multiple levels of extension, depending on the dimensionality of the problem. For an N-dimensional dataset, Extended Isolation Forest has N levels of extension, numbered 0 through N-1, with level 0 being identical to the standard Isolation Forest and level N-1 being the fully extended version.
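The role of the extension level can be illustrated with a minimal sketch of a single branching step. This is not the package's source code: the function names `random_split` and `goes_left` are hypothetical, and the sketch uses plain Python rather than numpy so that it runs without dependencies. It assumes the scheme described in the pre-print, where a random normal vector is drawn, N - level - 1 of its coordinates are set to zero, and an intercept point is drawn uniformly from the bounding box of the data.

```python
import random


def random_split(points, extension_level, rng):
    """Draw one random hyperplane (normal vector, intercept point).

    Hypothetical helper for illustration only; not part of the eif API.
    """
    dim = len(points[0])
    if not 0 <= extension_level <= dim - 1:
        raise ValueError("extension_level must be between 0 and dim - 1")

    # One Gaussian draw per coordinate gives a randomly oriented normal.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # Zero out dim - extension_level - 1 coordinates. At level 0 only one
    # coordinate survives, so the cut is parallel to a coordinate axis,
    # as in the standard Isolation Forest.
    for i in rng.sample(range(dim), dim - extension_level - 1):
        normal[i] = 0.0

    # Intercept point drawn uniformly inside the bounding box of the data.
    intercept = [
        rng.uniform(min(p[i] for p in points), max(p[i] for p in points))
        for i in range(dim)
    ]
    return normal, intercept


def goes_left(point, normal, intercept):
    """Return True if the point lies on the non-positive side of the plane."""
    return sum((x - p) * n for x, p, n in zip(point, intercept, normal)) <= 0.0


rng = random.Random(0)
data = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [2.0, 1.0, -1.0]]
normal, intercept = random_split(data, extension_level=2, rng=rng)
left = [p for p in data if goes_left(p, normal, intercept)]
right = [p for p in data if not goes_left(p, normal, intercept)]
```

In a full tree, a step like this would be applied recursively to `left` and `right` until points are isolated or a depth limit is reached.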

Here we provide the source code for the algorithm as well as documented example notebooks to help you get started. Various visualizations are provided, such as score distributions, score maps, aggregate slicing of the domain, and tree and whole-forest visualizations. Most examples are in 2D, and we present one 3D example. However, the algorithm works readily with higher-dimensional data.

Installation

pip install eif

or directly from the repository

pip install git+https://github.com/sahandha/eif.git

Requirements

  • numpy

No extra requirements are needed. The package can also draw the trees it creates using the igraph library; see the example on tree visualizations.

Use

See these notebooks for examples of how to use the package.

Citation

If you use this code, please consider citing the following reference:

@ARTICLE{2018arXiv181102141H,
   author = {{Hariri}, S. and {Carrasco Kind}, M. and {Brunner}, R.~J.},
    title = "{Extended Isolation Forest}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1811.02141},
 keywords = {Computer Science - Machine Learning, Statistics - Machine Learning},
     year = 2018,
    month = nov,
   adsurl = {http://adsabs.harvard.edu/abs/2018arXiv181102141H},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

The pre-print article is available on arXiv (arXiv:1811.02141).

Releases

v1.0.2

2018-OCT-01

  • Added documentation, examples and software paper

v1.0.1

2018-AUG-08

  • Bugfix for multidimensional data

v1.0.0

2018-JUL-15

  • Initial Release
