Autoencoder from not too deep method is used and then UMAP is used for dimensionality reduction. On which kmeans and gmm are tested to clustter the image.
N2D: (Not Too) Deep Clustering via Clustering the Local Manifold of an Autoencoded Embedding. https://github.com/rymc/n2d