chrissi2802 / wisdm---biometric-time-series-data-classification

Deep learning, classification on the WISDM dataset

Python 99.31% Shell 0.69%
deep-learning neuronal-networks python pytorch time-series dataset gpu tensorflow cnn gru

WISDM - Biometric time series data classification

This repository contains several models for classifying the reduced WISDM dataset.
Neural networks are used for both feature extraction and classification.
The networks were originally implemented in Python using the PyTorch library; the most recent ones are implemented in TensorFlow. All files and folders with "_tf" or "_TF" in the name belong to the TensorFlow implementation.
This repository is based on a Kaggle competition. The website for this competition can be found here.

Data

The task is to classify biometric time series data. The dataset is the "WISDM Smartphone and Smartwatch Activity and Biometrics Dataset"; WISDM stands for Wireless Sensor Data Mining. The original dataset was created by the Department of Computer and Information Science at Fordham University in New York. The researchers collected data from the accelerometer and gyroscope sensors of a smartphone and a smartwatch while 51 subjects performed 18 diverse activities of daily living. Each activity was performed for 3 minutes, so each subject contributed 54 minutes of data.
A detailed description of the dataset is included in this repo. If you would like to view the original data, you can find the complete dataset here.

As already mentioned, a reduced dataset is used, which contains the following six activities:
A - walking
B - jogging
C - climbing stairs
D - sitting
E - standing
M - kicking soccer ball
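
The letter codes above can be mapped to integer class labels for training. The following is a minimal sketch; the index assignment (sorted code order) and the helper names `encode_labels` and `decode_labels` are assumptions made for illustration and are not taken from the repository.

```python
# Activity codes as listed above. The integer class indices are an
# illustrative assumption, not necessarily the encoding used in the repository.
ACTIVITIES = {
    "A": "walking",
    "B": "jogging",
    "C": "climbing stairs",
    "D": "sitting",
    "E": "standing",
    "M": "kicking soccer ball",
}

# Map each code to a contiguous class index (0..5) in sorted code order.
CODE_TO_INDEX = {code: idx for idx, code in enumerate(sorted(ACTIVITIES))}
INDEX_TO_CODE = {idx: code for code, idx in CODE_TO_INDEX.items()}


def encode_labels(codes):
    """Convert a sequence of activity codes into integer class indices."""
    return [CODE_TO_INDEX[code] for code in codes]


def decode_labels(indices):
    """Convert integer class indices back into activity names."""
    return [ACTIVITIES[INDEX_TO_CODE[i]] for i in indices]
```

Because the code "M" does not directly follow "E" in the alphabet, an explicit lookup table like this avoids gaps in the class indices that a naive `ord(code) - ord("A")` conversion would produce.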

Models

Eleven different neural networks are available, along with the training procedures and data pre-processing scripts.

Models (neural networks):

  • PyTorch
    • Linear / Multilayer Perceptron (MLP) model
    • Convolutional Neural Network (CNN) 1D model
    • Gated Recurrent Unit (GRU) model, a type of Recurrent Neural Network (RNN)
    • CNN 2D model
    • Long Short-Term Memory (LSTM) model
  • TensorFlow
    • MLP model
    • CNN 2D model
    • GRU model
    • LSTM model
    • Big GRU model
    • Convolutional LSTM model

Overview of the folder structure and files

| Files | Description |
|---|---|
| Datasets/ | contains the data and the submissions |
| Models/ | contains the trained models |
| Plots/ | contains all plots from the training and testing |
| .gitignore | contains files and folders that are not tracked via git |
| dataset_tf.py | provides the dataset and prepares the data for TensorFlow |
| datasets.py | provides the dataset and prepares the data for PyTorch |
| helpers.py | provides auxiliary classes and functions for neural networks |
| Job.sh | provides a script to carry out the training on a computer cluster |
| models_tf.py | provides the models for TensorFlow |
| models.py | provides the models for PyTorch |
| train_tf.py | provides functions for training and testing for TensorFlow |
| train.py | provides functions for training and testing for PyTorch |
| WISDM-dataset-description.pdf | further description of the dataset |

Achieved results

The scores were calculated by Kaggle. The metric is the categorization accuracy (ACC).
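
Categorization accuracy is the fraction of samples whose predicted class matches the true class. As a minimal sketch (the function name `categorization_accuracy` is hypothetical, and Kaggle's own scoring code is not shown in this repository), it can be computed as follows:

```python
def categorization_accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # Count positions where the predicted label equals the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

For example, three correct predictions out of four samples give an accuracy of 0.75.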

| Models | Public leaderboard score | Training time (hh:mm:ss) | Parameters of the model |
|---|---|---|---|
| MLP_NET_V1 | 0.45856 | 00:05:22 | 902 |
| CNN_NET_V1 | 0.51933 | 00:21:17 | 141,766 |
| GRU_NET | 0.00000 | PyTorch GRU does not work | 0 |
| CNN_NET_V2 | 0.85635 | 00:01:28 | 134,134 |
| LSTM_NET | 0.83425 | 00:16:16 | 529,926 |
| MLP_NET_TF | 0.90055 | 00:08:20 | 112,262 |
| CNN_NET_TF | 0.87845 | 00:06:18 | 1,641,030 |
| GRU_NET_TF | 0.89502 | 00:18:55 | 4,175,238 |
| LSTM_NET_TF | 0.88950 | 00:19:04 | 4,470,150 |
| GRU_NET_BIG_TF | 0.95027 | 00:22:47 | 10,621,830 |
| CONV_LSTM_NET_TF | 0.93370 | 00:35:53 | 14,721,926 |

The two models GRU_NET_BIG_TF and CONV_LSTM_NET_TF were trained with an extended dataset. For this purpose, three new features were added through feature engineering: the Fast Fourier Transform (FFT) of each individual signal.
In addition, these two models were trained on data created with a sliding window of size 200. All other models were trained with a window size of 100.
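
The windowing and FFT feature engineering described above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the repository's pre-processing code: the function names, the step size of 100, the use of three input channels, and the choice of the FFT magnitude (rather than the complex values) are all assumptions.

```python
import numpy as np


def sliding_windows(signal, window_size, step):
    """Split a (time, channels) array into overlapping windows.

    Returns an array of shape (num_windows, window_size, channels).
    """
    signal = np.asarray(signal, dtype=float)
    if signal.shape[0] < window_size:
        raise ValueError("signal is shorter than the window size")
    starts = range(0, signal.shape[0] - window_size + 1, step)
    return np.stack([signal[s:s + window_size] for s in starts])


def add_fft_features(windows):
    """Append the FFT magnitude of every channel as additional channels."""
    # FFT along the time axis of each window. np.fft.fft keeps the window
    # length, so the magnitudes line up with the raw samples.
    spectrum = np.abs(np.fft.fft(windows, axis=1))
    return np.concatenate([windows, spectrum], axis=2)


# Hypothetical recording: 1000 time steps of a three-axis sensor signal.
recording = np.random.default_rng(0).normal(size=(1000, 3))

windows = sliding_windows(recording, window_size=200, step=100)
extended = add_fft_features(windows)
print(windows.shape, extended.shape)
```

With three raw channels, appending one FFT channel per signal doubles the feature dimension from three to six, which matches the idea of adding three new features.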

The best model is therefore GRU_NET_BIG_TF, with a public leaderboard accuracy of 95.027%.
