rent's Introduction

Modifications in this forked repo

Summary of modifications

This is a summary of modifications made to the original RENT in this forked repo:

  • Bootstrapping before each repeated Elastic Net run is now stratified by the target label distribution.
  • Boosting was added to Elastic Net for RENT_Regression(). The rationale is that the current selection is based on learnt Elastic Net coefficients, but these coefficients were optimized with a soft margin and ignore the sample points that lie within it. Taking into account the weighted coefficients of the weak learners in a boosted ensemble of Elastic Nets could improve on this.
  • Learnt feature coefficients are further ranked by their normalized mean and variance, which translate to importance and stability, respectively. An option n_features was added to set the maximum number of features to return.
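
The first and third modifications can be illustrated with a minimal, standard-library sketch. It is not the code used in this repo: the function names, the assumption that target labels are discrete (or already binned), and the way the normalized mean and variance are combined into one score are all hypothetical choices made for illustration.

```python
import random
import statistics
from collections import defaultdict


def stratified_bootstrap_indices(labels, rng):
    """Draw a bootstrap sample of row indices, resampling within each label group."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    sample = []
    for members in groups.values():
        # Sample with replacement, keeping each group's original size so the
        # label distribution of the resample matches the full data set.
        sample.extend(rng.choices(members, k=len(members)))
    return sample


def rank_features(coefs_per_run, feat_names, n_features=None):
    """Rank features: large normalized |mean| is good, large normalized variance is bad."""
    columns = list(zip(*coefs_per_run))  # one tuple of coefficients per feature
    means = [abs(statistics.fmean(col)) for col in columns]
    variances = [statistics.pvariance(col) for col in columns]
    max_mean = max(means) or 1.0  # avoid division by zero
    max_var = max(variances) or 1.0
    # Assumed scoring rule: importance minus instability.
    scores = {
        name: m / max_mean - v / max_var
        for name, m, v in zip(feat_names, means, variances)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_features]  # n_features=None returns every feature


rng = random.Random(0)
print(stratified_bootstrap_indices(["a", "a", "a", "b", "b"], rng))

# Rows are repeated runs, columns are features.
coefs = [[0.9, 0.0, 0.5], [1.1, 0.1, -0.5], [1.0, -0.1, 0.5]]
print(rank_features(coefs, ["f1", "f2", "f3"], n_features=2))
```

In this toy data, f3 has a sizeable coefficient in every run but flips sign, so its high variance pushes it below the cap set by n_features.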

Purpose

This forked package is not supposed to be used independently. Feature selection is a task where noise in the data is almost certainly expected. There is strong evidence that boosting is susceptible to noise in the training data, so you might see a general drop in stability if you use the boosting setting. This is because, with boosting, slight changes in the training dataset (due to the resampling inner loop of RENT) lead to various irrelevant features being selected. The boosted RENT is therefore intended to be used together with bagging, not alone. In layman's terms, boosting allows RENT to select more relevant features, but it also selects more noise features. Luckily, the selection of these noise features is itself sensitive to changes in the training data, so an extra layer of bagging can cancel it out (or at least minimize it).
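
The idea of cancelling out unstable noise features with an outer bagging layer can be sketched as follows. This is only an illustration under stated assumptions, not the BB-RENT implementation: bagged_selection, select_fn, n_rounds and min_frequency are hypothetical names, and select_fn stands in for one full boosted RENT run that returns the selected feature names.

```python
import random
from collections import Counter


def bagged_selection(samples, select_fn, n_rounds=20, min_frequency=0.5, seed=0):
    """Keep features that select_fn picks in at least min_frequency of the rounds."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_rounds):
        # Outer bootstrap: resample the training rows with replacement.
        bag = rng.choices(samples, k=len(samples))
        # Count each feature at most once per round.
        counts.update(set(select_fn(bag)))
    return sorted(f for f, c in counts.items() if c / n_rounds >= min_frequency)


def toy_selector(bag):
    # Hypothetical stand-in for a boosted selector: the relevant feature is
    # always found, while the noise feature depends on the particular resample.
    return ["signal", "noise_{}".format(sum(bag) % 50)]


print(bagged_selection(list(range(30)), toy_selector))
```

Because each noise feature appears in only a few resamples, it falls below the frequency threshold, while the consistently selected feature survives.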

Therefore, this repo should be used with an extra layer of bagging (BB-RENT). See more here. Note that this package is not officially released; the nightly build is available on the pre-release branch.

Example

Everything remains the same as the main repo, except there is now an additional option to use boosting instead of just elastic net:

model = RENT.RENT_Regression(data=pd.DataFrame(features),
                             target=_targets.ravel(),
                             feat_names=_features_names,
                             C=C,
                             l1_ratios=l1_ratio,
                             autoEnetParSel=False,
                             poly='OFF',
                             testsize_range=(1/float(n_splits), 1/float(n_splits)),
                             K=n_trials,
                             random_state=0,
                             verbose=1,
                             scale=False,
                             boosting=boosting) # BRENT or RENT

Below are links to Jupyter-notebooks that illustrate how to use RENT for

Requirements

Make sure that Python 3.5 or higher is installed. A convenient way to install Python and many useful packages for scientific computing is to use the Anaconda Distribution. The following packages are required:

  • numpy >= 1.11.3
  • pandas >= 1.2.3
  • scikit-learn >= 0.22
  • scipy >= 1.5.0
  • hoggorm >= 0.13.3
  • hoggormplot >= 0.13.2
  • matplotlib >= 3.2.2
  • seaborn >= 0.10
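
One way to check an environment against the minimum versions listed above is sketched below. It is not part of RENT: the helper names are hypothetical, the version parsing is deliberately simple (numeric dotted parts only), and importlib.metadata requires Python 3.8 or newer, which is stricter than the minimum stated above.

```python
from importlib import metadata

# Minimum versions copied from the requirements list.
MINIMUM_VERSIONS = {
    "numpy": "1.11.3",
    "pandas": "1.2.3",
    "scikit-learn": "0.22",
    "scipy": "1.5.0",
    "hoggorm": "0.13.3",
    "hoggormplot": "0.13.2",
    "matplotlib": "3.2.2",
    "seaborn": "0.10",
}


def parse_version(text):
    """Turn '1.11.3' into (1, 11, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a dict mapping each package name to 'ok', 'missing' or 'too old (...)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else "too old ({} < {})".format(installed, minimum)
    return report


for package, status in check_requirements().items():
    print(package, status)
```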

Installation

To install the package with the pip package manager, run the following command:

python3 -m pip install git+https://github.com/NMBU-Data-Science/RENT.git

Documentation

Documentation is available at ReadTheDocs. It provides detailed explanations of the methods and their inputs.

Citing RENT

If you use RENT in a report or scientific publication, we would appreciate citations to the following paper:

Jenul et al., (2021). RENT: A Python Package for Repeated Elastic Net Feature Selection. Journal of Open Source Software, 6(63), 3323, https://doi.org/10.21105/joss.03323

Bibtex entry:

@article{RENT,
  doi = {10.21105/joss.03323},
  url = {https://doi.org/10.21105/joss.03323},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {63},
  pages = {3323},
  author = {Anna Jenul and Stefan Schrunner and Bao Ngoc Huynh and Oliver Tomic},
  title = {RENT: A Python Package for Repeated Elastic Net Feature Selection},
  journal = {Journal of Open Source Software}
}

rent's People

Contributors

alabamagan, annajenul, huynhngoc, olivertomic, uzaaft
