
bharathsudharsan / tiny-impute


On-device Hybrid Anomaly Detection and Data Imputation

License: GNU General Public License v3.0

Jupyter Notebook 60.11% Python 39.89%
anamoly-detection data-quality edge-computing esp32 imputation-algorithm iot micro-python mkr1000 raspberry-pi tinyml arduino expectation-maximization knn laplacian moving-average simple-linear-regression

tiny-impute's Introduction

Tiny-Impute: On-device Hybrid Anomaly Detection and Data Imputation

Imputation Algorithms

Summary of the three hybrid anomaly detection and data imputation algorithms:

Moving Average with Simple Linear Regression (MA-SLR)

This algorithm targets MCUs and small-CPU devices (such as Arduino boards) and is designed around their hardware limitations. It is a hybrid system that combines moving averages with Z-score thresholding to detect and remove anomalous data points in a dataset, then imputes the removed values with a modified linear regression method [Code for IoT Boards] [Code for PC and RPi].
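The description above can be illustrated with a minimal sketch. It is not the repository's code: the function names, the window size, the Z-score threshold, and the use of plain ordinary least squares (rather than the authors' modified regression) are all assumptions made for illustration. It uses only built-in Python so that it stays close to what MicroPython or CircuitPython can run.

```python
def detect_anomalies(series, window=5, z_thresh=3.0):
    """Flag points whose Z-score against the moving average of the
    preceding `window` positions (skipping already-flagged points)
    exceeds z_thresh. Window and threshold are illustrative defaults."""
    flags = [False] * len(series)
    for i in range(window, len(series)):
        recent = [series[j] for j in range(i - window, i) if not flags[j]]
        if len(recent) < 2:
            continue
        mean = sum(recent) / len(recent)
        std = (sum((v - mean) ** 2 for v in recent) / len(recent)) ** 0.5
        if std > 0 and abs(series[i] - mean) / std > z_thresh:
            flags[i] = True
    return flags


def fit_line(xs, ys):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # a single point: fall back to a constant
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def impute_flagged(series, flags, window=5):
    """Replace each flagged point with the value predicted by a line
    fitted to the unflagged neighbours within `window` positions."""
    out = list(series)
    for i, bad in enumerate(flags):
        if not bad:
            continue
        lo, hi = max(0, i - window), min(len(series), i + window + 1)
        idx = [j for j in range(lo, hi) if not flags[j]]
        if not idx:
            continue  # nothing to fit on; leave the value as-is
        intercept, slope = fit_line(idx, [series[j] for j in idx])
        out[i] = intercept + slope * i
    return out


# Made-up sensor readings with one obvious spike at index 6.
readings = [20.1, 19.9, 20.0, 20.2, 19.8, 20.1, 35.0, 20.0, 19.9, 20.1]
flags = detect_anomalies(readings)
cleaned = impute_flagged(readings, flags)
```

A trailing window keeps memory use at a few values per channel, which matters on a 32 KB SRAM board. A strongly trending signal would need a larger threshold or a detrended window.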

K-Nearest Neighbors with Expectation-Maximization (KNN-EM)

This algorithm is designed for edge devices (such as gateways, AIoT boards, and SBCs) with more processing power and memory than MCUs. It combines our highly optimized unsupervised K-Nearest Neighbors (KNN) for anomaly detection with Expectation-Maximization (EM) for data imputation [Code for IoT Boards] [Code for PC and RPi].
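As a rough sketch of how the two stages could fit together (not the repository's optimized implementation), the example below scores each sample by its mean distance to its k nearest neighbours, treats unusually isolated samples as anomalies, and then fills the removed value with an EM loop for a two-variable Gaussian model. The function names, `k`, the median-based threshold, and the restriction to one fully observed column `xs` and one partially missing column `ys` are assumptions chosen to keep the example short.

```python
def knn_scores(points, k=2):
    """Mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            for j, q in enumerate(points) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_outliers(points, k=2, factor=3.0):
    """Flag points whose KNN score exceeds `factor` times the median score.
    The median-based threshold is an illustrative choice."""
    scores = knn_scores(points, k)
    median = sorted(scores)[len(scores) // 2]
    return [s > factor * median for s in scores]


def em_impute(xs, ys, iters=50):
    """EM for a bivariate Gaussian where xs is complete and ys has None gaps.
    Only the mean of y and the x-y covariance are tracked, because the
    imputed value (the conditional mean of y given x) needs nothing else.
    Assumes xs is not constant."""
    n = len(xs)
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n
    observed = [y for y in ys if y is not None]
    mu_y = sum(observed) / len(observed)
    slope = 0.0
    for _ in range(iters):
        # E-step: expected value of each missing y under current parameters.
        filled = [y if y is not None else mu_y + slope * (x - mu_x)
                  for x, y in zip(xs, ys)]
        # M-step: re-estimate the parameters from the completed data.
        mu_y = sum(filled) / n
        s_xy = sum((x - mu_x) * (f - mu_y) for x, f in zip(xs, filled)) / n
        slope = s_xy / s_xx
    return [y if y is not None else mu_y + slope * (x - mu_x)
            for x, y in zip(xs, ys)]


# Made-up samples lying on y = 2x + 1, with one corrupted y at index 2.
samples = [(1, 3), (2, 5), (3, 40), (4, 9), (5, 11), (6, 13)]
outliers = flag_outliers(samples)
xs = [x for x, _ in samples]
ys = [None if bad else y for (_, y), bad in zip(samples, outliers)]
repaired = em_impute(xs, ys)
```

The all-pairs distance computation is quadratic in the number of samples, which is one reason an approach like this suits gateways and SBCs better than MCUs.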

Optimized Laplacian Convolutional Representation (LCR-Opt)

Here, we heavily modified and optimized Laplacian Convolutional Representation (LCR), a top-performing but resource-intensive method that imputes missing data using a low-rank approximation model complemented by regularization techniques [Code for IoT Boards] [Code for PC and RPi].
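LCR itself is not reproduced here. As a much simpler stand-in for the general idea of regularized low-rank imputation, the sketch below fits a rank-1 factorization to the observed entries of a small matrix with ridge-regularized alternating least squares and fills the gaps from the fitted product. The function name, the rank-1 restriction, and the values of `lam` and `iters` are illustrative assumptions, not part of LCR-Opt.

```python
def rank1_impute(matrix, lam=1e-3, iters=100):
    """Fill None entries of `matrix` using a rank-1 model u[i] * v[j].

    Minimizes the squared error over observed entries plus
    lam * (|u|^2 + |v|^2) by alternating closed-form updates of u and v.
    """
    rows, cols = len(matrix), len(matrix[0])
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        # Update each row factor with the column factors held fixed.
        for i in range(rows):
            seen = [j for j in range(cols) if matrix[i][j] is not None]
            num = sum(matrix[i][j] * v[j] for j in seen)
            den = lam + sum(v[j] ** 2 for j in seen)
            u[i] = num / den
        # Update each column factor with the row factors held fixed.
        for j in range(cols):
            seen = [i for i in range(rows) if matrix[i][j] is not None]
            num = sum(matrix[i][j] * u[i] for i in seen)
            den = lam + sum(u[i] ** 2 for i in seen)
            v[j] = num / den
    return [
        [matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
         for j in range(cols)]
        for i in range(rows)
    ]


# A made-up rank-1 matrix (outer product of [1, 2, 3] and [1, 2, 4])
# with its centre entry, whose true value is 4, removed.
data = [
    [1.0, 2.0, 4.0],
    [2.0, None, 8.0],
    [3.0, 6.0, 12.0],
]
completed = rank1_impute(data)
```

The regularization weight `lam` trades a small shrinkage of the fitted values for numerical stability when a row or column has few observed entries.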

Test Datasets

Datasets used to test the Tiny-Impute algorithms (MA-SLR, KNN-EM, and LCR-Opt):

  • Gesture Phase Segmentation: The dataset is composed of features extracted from 7 videos of people gesticulating. It contains 50 attributes divided into two files for each video [Original Dataset] [Test Samples]

  • Iris Flowers: A small classic dataset that is very popular for evaluating classification methods [Original Dataset] [Test Samples]

  • Mammographic Mass: Discrimination of benign and malignant mammographic masses based on BI-RADS attributes and the patient's age [Original Dataset] [Test Samples]

  • Daily and Sports Activities: The dataset comprises motion sensor data of 19 daily and sports activities, each performed by 8 subjects in their own style for 5 minutes [Original Dataset] [Test Samples]

  • Urban Observatory - CO: Carbon Monoxide (CO) data taken from the Urban Observatory, Newcastle University [Original Dataset]

IoT Boards

The IoT boards used to test the three imputation algorithms on the five test datasets:

  • Arduino MKR1000: [CPU] SAMD21 Cortex-M0+ 48 MHz. [Memory] Flash 256 KB, SRAM 32 KB [Board]

  • ESP32 Dev Kit: [CPU] Xtensa LX6 240 MHz. [Memory] Flash 4 MB, SRAM 520 KB [Board]

  • Raspberry Pi 4 Model B: [CPU] Cortex-A72 1.8 GHz. [Memory] microSD 16 GB, SDRAM 4 GB [Board]

Imputation Experiments

CircuitPython & MicroPython - IoT Boards

Set up the IoT board by installing the appropriate Python implementation, following the [CircuitPython] or [MicroPython] instructions. For an easier experience coding and running the repo on MCUs, install and use the Thonny IDE.

To run the experiments on an IoT board, clone this repo, copy the dataset sample (.csv file) to the board's memory, make sure the file name referenced in the code matches the copied file, then run the algorithm's .py file on the board.
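For readers adapting the scripts to their own data, the following is a minimal sketch of reading a copied .csv sample on a board without relying on the `csv` module, which MicroPython and CircuitPython builds may not include. The function name `load_csv`, the assumption of a single header row, and the convention of mapping unparseable or NaN cells to `None` are illustrative choices and are not taken from the repository.

```python
def load_csv(path, has_header=True):
    """Read a comma-separated file into a list of rows of floats.

    Cells that cannot be parsed as numbers (empty, '?', text) and cells
    that parse to NaN are returned as None to mark them as missing.
    """
    rows = []
    with open(path) as f:
        if has_header:
            f.readline()  # skip the column names
        for line in f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            row = []
            for cell in line.split(","):
                try:
                    value = float(cell.strip())
                except ValueError:
                    value = None
                if value is not None and value != value:  # NaN check
                    value = None
                row.append(value)
            rows.append(row)
    return rows
```

Calling `load_csv("samples.csv")`, where `samples.csv` is a placeholder for whichever sample file was copied to the board, returns one list per data row. The file name passed here must match the copied file exactly.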

Jupyter Notebooks - PC / Colab

To run the experiments on a local PC, clone this repo, open the notebook (.ipynb file) for the algorithm of choice, and run all cells in sequence.

tiny-impute's People

Contributors

bharathsudharsan, shamil-al-ameen
