Feature Selection for Clustering

License: MIT

FSFC is a library with algorithms of feature selection for clustering.

It is based on the article "Feature Selection for Clustering: A Review" by S. Alelyani, J. Tang and H. Liu.

The algorithms are covered by tests that check their correctness and compute some clustering metrics. The tests use open datasets.

Project documentation is available on Read the Docs.

Implemented algorithms:

  • Generic Data:
    • SPEC family - NormalizedCut, ArbitraryClustering, FixedClustering
    • Sparse clustering - Lasso
    • Localised feature selection - LFSBSS algorithm
    • Multi-Cluster Feature Selection
    • Weighted K-means
  • Text Data:
    • Text clustering - Chi-R algorithm, Feature Set-Based Clustering (FTC)
    • Frequent itemset extraction - Apriori

Dependencies:

  • numpy
  • scikit-learn
  • scipy
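
Before copying FSFC into a project, it can help to confirm that the dependencies listed above are importable. The following is a minimal sketch, not part of FSFC: the helper name missing_modules is hypothetical, and it assumes scikit-learn is importable under its usual module name sklearn.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above;
# scikit-learn is imported under the module name "sklearn".
REQUIRED_MODULES = ["numpy", "sklearn", "scipy"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All FSFC dependencies are importable.")
```

If anything is reported as missing, install it before using the feature selectors.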

How to use:

The project is currently in an early alpha stage, so it has not been published to PyPI and cannot be installed with pip.

Because of this, installation is a bit complicated. To use FSFC:

  1. Clone the repository to your computer.
  2. Run make init to install dependencies.
  3. Copy the contents of the fsfc folder to the source root of your project.

After that, you can use the feature selectors as follows:

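import numpy as np
from fsfc.generic import NormalizedCut
from sklearn.pipeline import Pipeline
from sklearn.cluster import KMeans

# Placeholder: substitute your own 2D array of shape (n_samples, n_features)
data = np.array([...])

# Chain feature selection and clustering in a single scikit-learn pipeline
pipeline = Pipeline([
    ('select', NormalizedCut(3)),
    ('cluster', KMeans())
])
# Fit both steps and return the cluster label of every sample
pipeline.fit_predict(data)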
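
To try the same pipeline pattern end to end without FSFC on the path, here is a self-contained sketch on synthetic data. It does not use FSFC: TopVarianceSelector is a hypothetical stand-in that simply keeps the highest-variance columns, and the data shape, cluster count and random seeds are arbitrary assumptions. It only illustrates how a selector with fit and transform methods plugs into a scikit-learn Pipeline ahead of KMeans.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.pipeline import Pipeline


class TopVarianceSelector(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in selector: keeps the k highest-variance features."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y=None):
        variances = np.var(X, axis=0)
        # Indices of the k features with the largest variance, in column order
        self.selected_ = np.sort(np.argsort(variances)[-self.k:])
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.selected_]


# Synthetic data: 3 informative columns followed by 5 near-constant noise columns
rng = np.random.RandomState(0)
informative, _ = make_blobs(n_samples=150, n_features=3, centers=3, random_state=0)
noise = rng.normal(scale=0.01, size=(150, 5))
data = np.hstack([informative, noise])

pipeline = Pipeline([
    ('select', TopVarianceSelector(k=3)),
    ('cluster', KMeans(n_clusters=3, n_init=10, random_state=0)),
])
labels = pipeline.fit_predict(data)

# Columns kept by the selector and the number of samples per cluster
print(pipeline.named_steps['select'].selected_)
print(np.bincount(labels))
```

Swapping the stand-in for an FSFC selector such as NormalizedCut should only require changing the 'select' step, assuming the selector follows the same fit and transform interface as in the example above.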

How to support:

You can support development by testing, reporting bugs, or opening pull requests.

The project has tests; they can be run with the command make test.

There is also Sphinx documentation for the code; it can be built with the command make html. The documentation uses numpydoc, so it must be installed on the system. To install it, run pip install numpydoc.
