rymc / n2d

A deep clustering algorithm. Code to reproduce results for our paper N2D: (Not Too) Deep Clustering via Clustering the Local Manifold of an Autoencoded Embedding.

License: GNU General Public License v3.0

Python 88.27% Shell 11.73%

clustering deep-clustering-algorithms machine-learning

n2d's Issues

Number of Dimensions equals to clusters number

Hello,

I read the paper a couple of times, and everything was clear to me (for now) except for two points.

First, why do you set the number of dimensions equal to the number of clusters (when using a manifold learning algorithm such as UMAP)?
Second, for the visualization, do you change the number of dimensions to 2, or do you add one more manifold learning step with the number of dimensions set to 2?

Dimitris.

I cannot reproduce results when retraining

I am trying to reproduce the results in the paper on the MNIST dataset by retraining the model from scratch. These are the parameters and the output:

Using GPU

Missing MulticoreTSNE package.. Only important if evaluating other manifold learners.

Namespace(ae_weights=None, batch_size=256, cluster='GMM', dataset='mnist', eval_all=False, gpu='0', manifold_learner='UMAP', n_clusters=10, pretrain_epochs=1000, save_dir='MYEXPS', umap_dim=10, umap_metric='euclidean', umap_min_dist='0.00', umap_neighbors=20, visualize=False)

Time to train the autoencoder: 1251.820315361023

=======================================

mnist | UMAP on autoencoded embedding with GMM - N2D
ACC 0.83611
NMI 0.8986
ARI 0.82823

============================================

They are well below the ones in the paper (which I can reproduce using the provided weights):
ACC 0.979
NMI 0.942

Could you help me? Did I make some mistake?
Thanks
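
For readers comparing the numbers above: in the deep clustering literature, ACC usually means unsupervised clustering accuracy, that is, the best accuracy over one-to-one mappings between predicted cluster ids and true classes. The sketch below is a brute-force illustration of that definition only. It is not the repository's evaluation code, `clustering_accuracy` is a hypothetical helper name, and it assumes there are no more clusters than classes.

```python
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Best-match accuracy between true classes and predicted cluster ids.

    Brute force over every one-to-one mapping of clusters to classes,
    so it is only practical for a handful of clusters.
    """
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    if len(clusters) > len(classes):
        raise ValueError("this sketch assumes no more clusters than classes")

    # counts[(cluster, cls)] = number of samples of class cls in that cluster
    counts = {}
    for t, p in zip(y_true, y_pred):
        counts[(p, t)] = counts.get((p, t), 0) + 1

    best_hits = 0
    # Each permutation assigns every cluster to a distinct class.
    for assigned in permutations(classes, len(clusters)):
        hits = sum(counts.get((c, k), 0) for c, k in zip(clusters, assigned))
        best_hits = max(best_hits, hits)
    return best_hits / len(y_true)


# Cluster ids are arbitrary, so a relabelled perfect clustering scores 1.0.
perfect = clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
partial = clustering_accuracy([0, 0, 1, 1], [1, 1, 1, 0])
```

Here `perfect` is 1.0 and `partial` is 0.75. With 10 clusters the brute-force search has to try 10! mappings, which is why practical implementations typically solve the same assignment problem with the Hungarian algorithm.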

The theoretical derivation of the model

Hello,
Thank you for your paper and code. I have a question: the theoretical derivation of the model is not given in the paper. Is that because each component is trained independently, so no formula derivation is required?
I hope to get your advice. Thanks!

Name of the shallow clustering algorithm used to cluster (after learning the manifold)?

Training

Hello, can you provide an example of how to train the N2D network?

Results on Fashion-MNIST are worse than the results in your paper

I tried to run n2d on the Fashion-MNIST dataset, but I got the results below, which are worse than the results in your paper.

0.51409
0.55714
0.37784

I used the same parameters as in run.sh and the trained model downloaded with the code.
Can you please tell me how to achieve the ACC 0.672 and NMI 0.684 results on this dataset?

Meaning of the plots

Hello,
I successfully applied this technique to my own dataset and it is producing really good results. Thanks for that!
But I have one question: what is the difference between the two plots called "n2d-predicted.png" and "n2d.png"?

Problem with running the code

Hello,

When I want to run the code for UMAP, I get the following error:

Compilation is falling back to object mode WITH looplifting enabled because Function "make_euclidean_tree" failed type inference due to: Cannot unify RandomProjectionTreeNode(array(int64, 1d, C), bool, none, none, none, none) and RandomProjectionTreeNode(none, bool, array(float32, 1d, C), float64, RandomProjectionTreeNode(array(int64, 1d, C), bool, none, none, none, none), RandomProjectionTreeNode(array(int64, 1d, C), bool, none, none, none, none)) for '$46call_function.15', defined at/anaconda3/envs/n2d/lib/python3.7/site-packages/umap/rp_tree.py (457)

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/rp_tree.py", line 457:
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):

    left_node = make_euclidean_tree(data, left_indices, rng_state, leaf_size)
    ^

During: resolving callee type: recursive(type(CPUDispatcher(<function make_euclidean_tree at 0x7f959eb37dd0>)))
During: typing of call at/anaconda3/envs/n2d/lib/python3.7/site-packages/umap/rp_tree.py (457)

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/rp_tree.py", line 457:
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):

    left_node = make_euclidean_tree(data, left_indices, rng_state, leaf_size)
    ^

@numba.jit()
/anaconda3/envs/n2d/lib/python3.7/site-packages/numba/core/object_mode_passes.py:178: NumbaWarning: Function "make_euclidean_tree" was compiled in object mode without forceobj=True.

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/rp_tree.py", line 451:
@numba.jit()
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):
^

state.func_ir.loc))
/anaconda3/envs/n2d/lib/python3.7/site-packages/numba/core/object_mode_passes.py:188: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/rp_tree.py", line 451:
@numba.jit()
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):
^

state.func_ir.loc))
/anaconda3/envs/n2d/lib/python3.7/site-packages/umap/nndescent.py:92: NumbaPerformanceWarning:
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.

To find out why, try turning on parallel diagnostics, see https://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/utils.py", line 409:
@numba.njit(parallel=True)
def build_candidates(current_graph, n_vertices, n_neighbors, max_candidates, rng_state):
^

current_graph, n_vertices, n_neighbors, max_candidates, rng_state
/anaconda3/envs/n2d/lib/python3.7/site-packages/numba/core/typed_passes.py:314: NumbaPerformanceWarning:
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.

To find out why, try turning on parallel diagnostics, see https://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/nndescent.py", line 47:
@numba.njit(parallel=True)
def nn_descent(
^

state.func_ir.loc))
/anaconda3/envs/n2d/lib/python3.7/site-packages/umap/umap_.py:349: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "fuzzy_simplicial_set" failed type inference due to: Untyped global name 'nearest_neighbors': cannot determine Numba type of <class 'function'>

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/umap_.py", line 467:
def fuzzy_simplicial_set(

if knn_indices is None or knn_dists is None:
knn_indices, knn_dists, _ = nearest_neighbors(
^

@numba.jit()
/anaconda3/envs/n2d/lib/python3.7/site-packages/numba/core/object_mode_passes.py:178: NumbaWarning: Function "fuzzy_simplicial_set" was compiled in object mode without forceobj=True.

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/umap_.py", line 350:
@numba.jit()
def fuzzy_simplicial_set(
^

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "../anaconda3/envs/n2d/lib/python3.7/site-packages/umap/umap_.py", line 350:
@numba.jit()
def fuzzy_simplicial_set(
^

state.func_ir.loc))
Traceback (most recent call last):
File "n2d.py", line 391, in
hl, y, label_names)
File "n2d.py", line 192, in cluster_manifold_in_embedding
min_dist=md).fit_transform(hl)
File "/anaconda3/envs/n2d/lib/python3.7/site-packages/umap/umap_.py", line 1596, in fit_transform
self.fit(X, y)
File "/anaconda3/envs/n2d/lib/python3.7/site-packages/umap/umap_.py", line 1454, in fit
self._search_graph.transpose()
File "/anaconda3/envs/n2d/lib/python3.7/site-packages/scipy/sparse/lil.py", line 437, in transpose
return self.tocsr(copy=copy).transpose(axes=axes, copy=False).tolil(copy=False)
File "/anaconda3/envs/n2d/lib/python3.7/site-packages/scipy/sparse/lil.py", line 462, in tocsr
_csparsetools.lil_get_lengths(self.rows, indptr[1:])
File "_csparsetools.pyx", line 109, in scipy.sparse._csparsetools.lil_get_lengths
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

Could you please help?
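
When reporting an error like the one above, it helps to state the installed versions of the packages that appear in the traceback, since one possibility (not confirmed in this thread) is a mismatch between them. The snippet below is a minimal sketch using only the standard library (Python 3.8 or later). `installed_versions` is a hypothetical helper name, and the distribution names in the example call are assumptions about what is installed in the environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names assumed from the packages seen in the traceback.
    report = installed_versions(["umap-learn", "numba", "scipy", "numpy"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to the traceback lets others compare the environment against the versions the repository was developed with.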

Do y'all intend on converting this to a library?

I love this research, and would love to see it in an even more portable/applicable fashion in the form of a library. I have started with an object oriented framework for this stuff here, https://github.com/josephsdavid/N2D-OOP, and would love to make this into an entire library where it can be widely used :)

Regardless, keep up the great work!

Feature Selection

Hello,

Could you share the HAR feature selection code?
Thank you

Exception after exception - code is not running

Hello!
I hope there are still people hanging around in the repo who may know what the issue is. After completing the steps described in the README, I get errors when I try to run the model. Here is what I got:

/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Traceback (most recent call last):
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/errors.py", line 744, in new_error_context
yield
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 230, in lower_block
self.lower_inst(inst)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 328, in lower_inst
self.storevar(val, inst.target.name)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 1278, in storevar
raise AssertionError(msg)
AssertionError: Storing i64 to ptr of i32 ('dim'). FE type int32

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "n2d.py", line 15, in
import umap
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/umap/__init__.py", line 1, in
from .umap_ import UMAP
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/umap/umap_.py", line 54, in
from umap.layouts import (
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/umap/layouts.py", line 36, in
"dim": numba.types.int32,
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/decorators.py", line 221, in wrapper
disp.compile(sig)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/dispatcher.py", line 909, in compile
cres = self._compiler.compile(args, return_type)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/dispatcher.py", line 79, in compile
status, retval = self._compile_cached(args, return_type)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/dispatcher.py", line 93, in _compile_cached
retval = self._compile_core(args, return_type)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/dispatcher.py", line 111, in _compile_core
pipeline_class=self.pipeline_class)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler.py", line 606, in compile_extra
return pipeline.compile_extra(func)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler.py", line 353, in compile_extra
return self._compile_bytecode()
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler.py", line 415, in _compile_bytecode
return self._compile_core()
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler.py", line 395, in _compile_core
raise e
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler.py", line 386, in _compile_core
pm.run(self.state)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler_machinery.py", line 339, in run
raise patched_exception
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler_machinery.py", line 330, in run
self._runPass(idx, pass_inst, state)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
return func(*args, **kwargs)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler_machinery.py", line 289, in _runPass
mutated |= check(pss.run_pass, internal_state)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/compiler_machinery.py", line 262, in check
mangled = func(compiler_state)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/typed_passes.py", line 463, in run_pass
NativeLowering().run_pass(state)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/typed_passes.py", line 384, in run_pass
lower.lower()
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 136, in lower
self.lower_normal_function(self.fndesc)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 190, in lower_normal_function
entry_block_tail = self.lower_function_body()
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 216, in lower_function_body
self.lower_block(block)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/lowering.py", line 230, in lower_block
self.lower_inst(inst)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/numba/core/errors.py", line 751, in new_error_context
raise newerr.with_traceback(tb)
numba.core.errors.LoweringError: Failed in nopython mode pipeline (step: nopython mode backend)
Storing i64 to ptr of i32 ('dim'). FE type int32

File "../miniconda3/envs/n2d2/lib/python3.7/site-packages/umap/layouts.py", line 52:
def rdist(x, y):

result = 0.0
dim = x.shape[0]
^

During: lowering "dim = static_getitem(value=$8load_attr.2, index=0, index_var=$const10.3, fn=)" at /home/anton/miniconda3/envs/n2d2/lib/python3.7/site-packages/umap/layouts.py (52)

Tested on Ubuntu 20.04 running under WSL2 and on another workstation with Ubuntu 18.04.
Any idea what I have done wrong?

Thanks.

How to use n2d on other datasets?

Hi author,
I want to apply n2d to my own dataset. Its shape is [90, 20000] (90 samples, 20000 features). What should I do?
Thanks

Choice of architecture/relation to t-sne

Hi Ryan, when you get the chance do you think you could walk through the choice of the [500, 500, 2000, c] architecture, especially with relation to t-sne (as mentioned in the paper)? I’ve been trying to understand it so I can confidently explain it to my advisor, but coming up pretty short :)

Best,

David

Unable to replicate the results

I am not sure whether this question belongs under issues, but I am going to ask anyway.

Hi @rymc, I am trying to replicate the results on the MNIST dataset alone. I am a little confused by the way the model is trained:

Creating an autoencoder model (#line354).
Extracting only the modules related to the encoder and creating a separate model for the encoder (#line357).
Then training the autoencoder model.
But predicting on images using the encoder model (#line385)?

How did the encoder model get trained, given that we use it to predict on images?
Thanks in advance.
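
A common explanation for the pattern described above is weight sharing: if the encoder model is built from the same layer objects as the autoencoder, training the autoencoder also updates the encoder. The toy below illustrates that idea without any deep learning framework. It assumes the script is structured this way, based only on the steps listed in the question, and `Scale`, `encoder`, `autoencoder` and `train_autoencoder` are hypothetical names.

```python
class Scale:
    """A one-weight 'layer' that multiplies its input by w."""

    def __init__(self, w):
        self.w = w

    def __call__(self, x):
        return self.w * x


enc_layer = Scale(0.1)  # used by both the encoder and the autoencoder
dec_layer = Scale(0.1)  # used by the autoencoder only


def encoder(x):
    return enc_layer(x)


def autoencoder(x):
    return dec_layer(enc_layer(x))


def train_autoencoder(data, lr=0.01, epochs=300):
    """Plain SGD on the reconstruction loss 0.5 * (autoencoder(x) - x) ** 2."""
    for _ in range(epochs):
        for x in data:
            h = enc_layer(x)
            err = dec_layer(h) - x
            grad_dec = err * h                # d loss / d dec_layer.w
            grad_enc = err * dec_layer.w * x  # d loss / d enc_layer.w
            dec_layer.w -= lr * grad_dec
            enc_layer.w -= lr * grad_enc


before = encoder(1.0)
train_autoencoder([0.5, 1.0, 1.5])  # only the autoencoder is trained
after = encoder(1.0)                # yet the encoder output has changed
```

Only `train_autoencoder` is ever called, but `encoder` gives a different output afterwards because both functions use the same `enc_layer` object.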

how to use it if there's no labels

Thanks for the implementation, I am trying to use the code on an unsupervised clustering problem. Is it possible to not feed labels? Thanks!
