This project forked from xtra-computing/thundergbm

ThunderGBM: Fast GBDTs and Random Forests on GPUs

License: Apache License 2.0

CMake 0.41% C++ 81.00% Cuda 17.73% Python 0.86%

Documentation | Parameters | Python (scikit-learn) interface

Overview

The mission of ThunderGBM is to help users easily and efficiently apply GBDTs and Random Forests to solve problems. ThunderGBM exploits GPUs to achieve high efficiency. Key features of ThunderGBM are as follows.

  • Often 10 times faster than other libraries.
  • Supports Python (scikit-learn) interfaces.
  • Supported Operating System(s): Linux.
  • Supports classification, regression and ranking.

Why accelerate GBDT and Random Forests: A survey conducted by Kaggle in 2017 shows that 50%, 46% and 24% of the data mining and machine learning practitioners are users of Decision Trees, Random Forests and GBMs, respectively.

GBDTs and Random Forests are often used for creating state-of-the-art data science solutions. We've listed three winning solutions using GBDTs below. Please check out the XGBoost website for more winning solutions and use cases. Here are some example successes of GBDTs and Random Forests:

Getting Started

Prerequisites

  • cmake 2.8 or above | gcc 4.8 or above for Linux | CUDA 8 or above
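
Before building, it can help to confirm that the tools above are on $PATH and recent enough. The following is a minimal sketch, not part of ThunderGBM: the helper names are hypothetical, and the executable names (in particular nvcc as a stand-in for "CUDA is installed") are assumptions. The version strings can be obtained by running each tool with --version.

```python
import re
import shutil

# Minimum versions taken from the prerequisites list above.
# The executable names are assumptions; in particular, "nvcc" is used
# here as a stand-in for "CUDA is installed".
MINIMUM_VERSIONS = {"cmake": (2, 8), "gcc": (4, 8), "nvcc": (8, 0)}


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text, minimum):
    """True if the version found in `version_text` is at least `minimum`."""
    version = parse_version(version_text)
    return version is not None and version >= minimum


def find_tools(names=MINIMUM_VERSIONS):
    """Map each tool name to its path on $PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

For example, meets_minimum("cmake version 3.22.1", MINIMUM_VERSIONS["cmake"]) evaluates to True, while a 2.6.x cmake would be rejected.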

Download

git clone https://github.com/zeyiwen/thundergbm.git
# run the submodule commands inside the cloned repository
git -C thundergbm submodule init cub && git -C thundergbm submodule update

Build on Linux

cd thundergbm
mkdir build && cd build && cmake .. && make -j

Build the test cases

git submodule update --init src/test/googletest

Quick Start

./bin/thundergbm-train ../dataset/machine.conf
./bin/thundergbm-predict ../dataset/machine.conf

You will see RMSE = 0.489562 after a successful run.
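
For readers unfamiliar with the metric, RMSE (root mean squared error) is the square root of the mean of the squared differences between true and predicted values. The sketch below only illustrates that definition; it is not ThunderGBM's own code, the function name is hypothetical, and the sample numbers are made up rather than taken from the machine dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_error / len(y_true))


if __name__ == "__main__":
    # Made-up labels and predictions, for illustration only.
    labels = [3.0, 5.0, 2.0]
    predictions = [2.0, 5.0, 4.0]
    print(f"RMSE = {rmse(labels, predictions):.6f}")
```

A lower RMSE means the predictions are closer to the true values; a perfect prediction gives 0.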

Installation

  • Add the required binaries to $PATH (where path_to_cuda is the home directory of cuda, e.g., /usr/local/cuda-9.0/bin/; path_to_mpi is the home directory of MPI, e.g., /opt/openmpi-gcc/bin/)
export PATH="path_to_cuda:$PATH"
export PATH="path_to_mpi:$PATH"
  • Build ThunderGBM
cd thundergbm
mkdir build && cd build && cmake .. && make -j
  • Run ThunderGBM with MPI
make runtest-mpi

How to cite ThunderGBM

If you use ThunderGBM in your paper, please cite our work (preprint).

@article{wenthundergbm19,
 author = {Wen, Zeyi and Shi, Jiashuai and He, Bingsheng and Chen, Jian and Li, Qinbin},
 title = {{ThunderGBM}: Fast {GBDTs} and Random Forests on {GPUs}},
 journal = {To appear in arXiv},
 year = {2019}
}

Other related paper

  • Zeyi Wen, Bingsheng He, Kotagiri Ramamohanarao, Shengliang Lu, and Jiashuai Shi. Efficient Gradient Boosted Decision Tree Training on GPUs. The 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 234-243, 2018.

Key members of ThunderGBM

  • Zeyi Wen, NUS
  • Jiashuai Shi, SCUT (a visiting student at NUS)
  • Qinbin Li, NUS
  • Advisor: Bingsheng He, NUS
  • Collaborators: Jian Chen (SCUT), Kotagiri Ramamohanarao (The University of Melbourne)

Other information

  • This work is supported by a MoE AcRF Tier 2 grant (MOE2017-T2-1-122) and an NUS startup grant in Singapore.

Related libraries
