clustering's Introduction

Clustering/Subspace Clustering Algorithms in MATLAB

This repo is no longer in active development. However, reports of problems with the implementations of the existing algorithms are welcome. [Oct, 2020]

1. Clustering Algorithms

  • K-means
  • K-means++
    • Generally speaking, this algorithm is similar to K-means;
    • Unlike classic K-means, which chooses its initial centroids uniformly at random, K-means++ uses a better initialization procedure in which observations far from the existing centroids have a higher probability of being chosen as the next centroid;
    • The initialization procedure can be implemented using Fitness Proportionate Selection (roulette-wheel selection).
  • ISODATA (Iterative Self-Organizing Data Analysis)
    • To be brief, ISODATA introduces two additional operations: Splitting and Merging;
    • When the number of observations within a class falls below a predefined threshold, ISODATA merges the two classes with the minimum between-class distance;
    • When the within-class variance of a class exceeds a predefined threshold, ISODATA splits this class into two sub-classes.
  • Mean Shift
    • For each point x, find its neighbors, compute their mean vector m, and update x = m; repeat until x converges (x ≈ m);
    • Non-parametric model, no need to specify the number of classes;
    • No prior assumption about the cluster structure.
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
    • Starting with pre-selected core objects, DBSCAN extends each cluster based on the connectivity between data points;
    • DBSCAN takes noisy data into consideration and is hence robust to outliers;
    • Choosing good parameters can be hard without prior knowledge.
  • Gaussian Mixture Model (GMM)
  • LVQ (Learning Vector Quantization)
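
The K-means++ initialization described above can be sketched as follows. This is a minimal Python illustration using only the standard library, not the repository's MATLAB code; the function names `roulette_select` and `kmeans_pp_init` are hypothetical, and squared Euclidean distance to the nearest chosen centroid is assumed as the selection weight.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def roulette_select(weights, rng):
    """Fitness proportionate selection: return index i with probability
    weights[i] / sum(weights). Assumes at least one positive weight."""
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if w > 0 and cumulative >= threshold:
            return i
    # Guard against floating-point round-off: last index with positive weight.
    return max(i for i, w in enumerate(weights) if w > 0)

def kmeans_pp_init(points, k, rng=None):
    """Pick k initial centroids; far-away points are more likely to be chosen."""
    if rng is None:
        rng = random.Random()
    centroids = [rng.choice(points)]  # first centroid: uniform at random
    while len(centroids) < k:
        # Weight of each point: squared distance to its nearest centroid.
        weights = [min(squared_distance(p, c) for c in centroids)
                   for p in points]
        if sum(weights) == 0:
            # Every point coincides with a centroid; fall back to uniform.
            centroids.append(rng.choice(points))
        else:
            centroids.append(points[roulette_select(weights, rng)])
    return centroids
```

A call such as `kmeans_pp_init(data, 3, random.Random(0))` returns three data points to use as the starting centroids of an ordinary K-means run.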

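The Mean Shift loop described above (find neighbors, compute the mean, move, repeat) can be sketched in the same way. This Python sketch assumes a flat kernel, i.e. every point within a radius `bandwidth` counts equally; `bandwidth`, `tol`, and the rule of merging modes closer than half the bandwidth are illustrative assumptions, not details taken from the repository.

```python
import math

def mean_shift_point(x, points, bandwidth, tol=1e-6, max_iter=300):
    """Move x to the mean of its neighbors until it stops moving."""
    x = list(x)
    for _ in range(max_iter):
        neighbors = [p for p in points if math.dist(p, x) <= bandwidth]
        if not neighbors:
            break  # nothing within reach; keep the current position
        m = [sum(coords) / len(neighbors) for coords in zip(*neighbors)]
        if math.dist(m, x) < tol:
            return m  # converged: x == m up to the tolerance
        x = m
    return x

def mean_shift(points, bandwidth):
    """Label each point by the mode it converges to."""
    modes, labels = [], []
    for p in points:
        m = mean_shift_point(p, points, bandwidth)
        for idx, existing in enumerate(modes):
            # Treat nearby modes as the same cluster (assumed merge rule).
            if math.dist(m, existing) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels, modes
```

Note that the number of clusters is never passed in: it falls out of how many distinct modes the points converge to, which in turn depends on the chosen bandwidth.
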
2. Subspace Clustering Algorithms

  • Subspace K-means
    • This algorithm directly extends K-means to Subspace Clustering by multiplying each dimension dj by a weight mj (s.t. sum(mj)=1, j=1,2,...,p);
    • It can be efficiently solved in an Expectation-Maximization (EM) fashion. In each E-step, it updates the weights and centroids using Lagrange multipliers;
    • This rough algorithm tends to rely on just a few dimensions when clustering sparse data.
  • Entropy-Weighting Subspace K-means
    • Generally speaking, this algorithm is similar to Subspace K-means;
    • In addition, it adds a regularization term based on the entropy of the weights to the objective function, in order to mitigate the aforementioned problem in Subspace K-means;
    • Besides being succinct and efficient, it works well on a broad range of real-world datasets.
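
To illustrate how an entropy term spreads the weights across dimensions, here is a Python sketch under stated assumptions: the weight-dependent part of the objective is taken to be sum_j mj*Dj + gamma * sum_j mj*log(mj) with sum(mj)=1, where Dj is the within-cluster dispersion along dimension j and `gamma` is a hypothetical regularization strength. Under that assumption the Lagrange-multiplier solution is a softmax of -Dj/gamma. The repository's actual objective (for example, per-cluster weights) may differ.

```python
import math

def dimension_dispersions(points, labels, centroids):
    """D_j: sum over points of the squared deviation from the assigned
    centroid along dimension j."""
    p = len(points[0])
    dispersions = [0.0] * p
    for x, label in zip(points, labels):
        for j in range(p):
            dispersions[j] += (x[j] - centroids[label][j]) ** 2
    return dispersions

def entropy_weights(dispersions, gamma):
    """Weights minimizing sum_j m_j*D_j + gamma*sum_j m_j*log(m_j)
    subject to sum_j m_j = 1, i.e. softmax(-D_j / gamma)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    d_min = min(dispersions)  # shift for numerical stability
    scores = [math.exp(-(d - d_min) / gamma) for d in dispersions]
    total = sum(scores)
    return [s / total for s in scores]
```

With a small `gamma` the weights concentrate on the dimensions with the lowest dispersion, which is the behaviour of plain Subspace K-means noted above; a larger `gamma` pushes them toward the uniform value 1/p.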
