Giter Site home page Giter Site logo

oc1-cuda's Introduction

OC1-CUDA

Introduction

This project is an attempt to implement an oblique classifier on a GPU using CUDA. The algorithm is inspired from the OC1 algorithm designed by Sreerama K. Murthy. Details of his work can be found here.

There are two different implementations. The first one parallelize most of the operations within one split of a set but each set is splitted serially. Since I was not satisfied by this implmentation, I implemented the same algorithm with an additional level of parallelism. In the second version, all the splits are done simultenaously. The performances are much better because the first version is linear in the number of node in the decision tree while the second version if linear in the depth of the tree.

Algorithm

Global Structure

The original algorithm is described in this thesis. I kept the main ideas.

The first step is to find the best parallel split. I do this exhaustively because it is possible to implement it linearly in the number of samples on a GPU while it is exponential in a serial implementation. It exists probabilistic algorithm that are more efficient but more complex to implement in parallel.

Then the idea is to modify the coefficients of the hyperplan one after the other. This is the perturbation phase. This cannot be parallelized. However, the parallel split is then compared to random splits. Since these splits are fully independent, they can be perturbed and their impurity can be computed in parallel. This is not the case in the first algorithm which compares in parallel different hyperplans but there is still a serial part. It is fully parallel in the second version.

Differences between version 1 and 2

The version one tests shifted versions of the currently perturbed hyperplan which leads to better results than just working on the original one. The version 2 doesn't do that for the moment but it is not a technical limitation, it is just that I had not the time to implement it.

Classification

The classification algorithm is the same in both version. It is just a thread per sample than goes through the tree. I have to work on that because it is not as efficient as it should be. The problem is probably related to branching.

First Version

This version is implemented in the following files

  • parallelOC1.py
  • impurity.c: kernels related to the computation of impurity of a split
  • set_impurity: kernels about compution of the impurity of sets
  • classify: kernel about using a tree to classify samples
  • parallel_splits.c: kernels about finding the best parallel split
  • pertub_coefficients.c: kernels about the pertubations of coefficient phase

Second version

This version is implemented in the following files

  • parallelOC1v2.py
  • parallel_build.c: all the kernels

Test it

Choose a impurity threshold which is the stopping condition: the algorithm will keep split sets until each set has a impurity level under that threshold. Impurity is a real between 0 and 1. Choose the number of points to be generated in each category. Choose which version of the algorithm to run: 1 or 2

Then to run the benchmark:

python parallelOC1Benchmark.py number_of_points impurity_threshold version

The decision tree will be displayed as a list of the hyperplans. The hyperplans are plotted. However, full lines are drawn. You need to check the order of the hyperplan to understand the plot.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.