Giter Site home page Giter Site logo

ryokki / kdd2019_k-multiple-means Goto Github PK

View Code? Open in Web Editor NEW

This project forked from chlwr/kdd2019_k-multiple-means

0.0 0.0 0.0 2.97 MB

Implementation for the paper "K-Multiple-Means: A Multiple-Means Clustering Method with Specified K Clusters,", which has been accepted by KDD'2019 as an ORAL paper, in the Research Track.

License: GNU General Public License v3.0

MATLAB 100.00%

kdd2019_k-multiple-means's Introduction

KDD2019_K-Multiple-Means (KMM)

Implementation for the paper "K-Multiple-Means: A Multiple-Means Clustering Method with Specified K Clusters", which has been accepted by KDD'2019 as an Oral Paper, in the Research Track.

Paper: https://dl.acm.org/citation.cfm?id=3330846

Abstract

In this paper, we make an extension of K-means for the clustering of multiple means. The popular K-means clustering uses only one center to model each class of data. However, the assumption on the shape of the clusters prohibits it to capture the non-convex patterns. Moreover, many categories consist of multiple subclasses which obviously cannot be represented by a single prototype. We propose a K-Multiple-Means (KMM) method to group the data points with multiple sub-cluster means into specified k clusters. Unlike the methods which use the agglomerative strategies, the proposed method formalizes the multiple-means clustering problem as an optimization problem and updates the partitions of m subcluster means and k clusters by an alternating optimization strategy. Notably, the partition of the original data with multiple-means representation is modeled as a bipartite graph partitioning problem with the constrained Laplacian rank. We also show the theoretical analysis of the connection between our method and the K-means clustering. Meanwhile, KMM is linear scaled with respect to n. Experimental results on several synthetic and well-known realworld data sets are conducted to show the effectiveness of the proposed algorithm.

Short demo

Run 'test_KMM_toy.m' in MATLAB. A example of updating process of multiple-means is as follows:

Short Promotion Video

Welcome to click 'thumbs-up' for our work. https://youtu.be/HswEYH2td8w

IMAGE ALT TEXT

Oral Presentation Video

which will be released soon.

Others

KMM can not only obtain the clustering result but also obtain the prototypes corresponding to each sub-cluster, which can be applied to many fields such as vector quantization, cluster analysis, feature learning, nearest-neighbor search, data compression, etc. KMM can also be used as an anchor-based spectral clustering method, which can find better anchors and achieve better clustering result.

If you find this code useful in your research, please cite the paper.

Reference:

Feiping Nie, Cheng-Long Wang, Xuelong Li, "K-Multiple-Means: A Multiple-Means Clustering Method with Specified K Clusters," in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'19), Anchorage, AK, USA, August 4โ€“8, 2019.


Update Log

Fixed a bug on random seeds generation! Oct-12th-2019

kdd2019_k-multiple-means's People

Contributors

chlwr avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.