aidos-lab / pytorch-topological

A topological machine learning framework based on PyTorch

Home Page: https://pytorch-topological.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

deep-learning pytorch topological-data-analysis topological-machine-learning

pytorch-topological's Introduction

`pytorch-topological`: A topological machine learning framework for `pytorch`

pytorch-topological (or torch_topological) is a topological machine learning framework for PyTorch. It aims to collect loss terms and neural network layers in order to simplify building the next generation of topology-based machine learning tools.

Topological machine learning in a nutshell

Topological machine learning refers to a new class of machine learning algorithms that are able to make use of topological features in data sets. In contrast to methods based on a purely geometrical point of view, topological features are capable of focusing on connectivity aspects of a data set. This provides an interesting fresh perspective that can be used to create powerful hybrid algorithms, capable of yielding more insights into data.
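
To make "connectivity aspects" concrete, here is a minimal sketch of zero-dimensional persistent homology for a small point cloud. It is not part of torch_topological; the function name `zero_dim_persistence` is hypothetical, and the sketch assumes the convention that a connected component dies at the length of the edge that merges it into another component (conventions differ by a factor of two between libraries).

```python
from itertools import combinations
from math import dist


def zero_dim_persistence(points):
    """Return the scales at which connected components merge.

    Every point starts as its own component at scale 0. A component
    "dies" at the length of the edge that merges it into another one,
    which is the same bookkeeping as single-linkage clustering.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sort all pairwise edges by length, as in Kruskal's algorithm.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components become connected: one of them dies here.
            parent[root_i] = root_j
            deaths.append(length)

    return deaths


# Two well-separated clusters of two points each.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = zero_dim_persistence(points)
print(deaths)
```

The two short-lived components merge at scale 1, while the merge of the two clusters happens only at scale 9. The large gap between these values is the kind of connectivity signal that purely geometric summaries do not expose directly.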

This is an emerging research field, firmly rooted in computational topology and topological data analysis. If you want to learn more about how topology and geometry can work in tandem, here are a few resources to get you started:

Amézquita et al., The Shape of Things to Come: Topological Data Analysis and Biology, from Molecules to Organisms, Developmental Dynamics Volume 249, Issue 7, pp. 816--833, 2020.
Hensel et al., A Survey of Topological Machine Learning Methods, Frontiers in Artificial Intelligence, 2021.

Installation and requirements

torch_topological requires Python 3.9. More recent versions might work but may require building some dependencies yourself; Python 3.9 currently offers the smoothest experience. It is recommended to use the excellent poetry framework to install torch_topological:

poetry add torch-topological

Alternatively, use pip to install the package:

pip install -U torch-topological

A note on older versions. Older versions of Python are not explicitly supported, and things may break in unexpected ways. If you want to use a different version, check pyproject.toml and adjust the Python requirement to your preference. This may or may not work, good luck!

Usage

torch_topological is still a work in progress. You can browse the documentation or, if code reading is more your thing, dive directly into some example code.

Here is a list of other projects that are using torch_topological:

SHAPR, a method for predicting the 3D cell shape of individual cells based on 2D microscopy images

This list is incomplete; you can help expand it by using torch_topological in your own projects! 😇

Contributing

Check out the contribution guidelines or the road map of the project.

Acknowledgements

Our software and research do not exist in a vacuum. pytorch-topological is standing on the shoulders of proverbial giants. In particular, we want to thank the following projects for constituting the technical backbone of the project:

`giotto-tda` and `gudhi`

Furthermore, pytorch-topological draws inspiration from several projects that provide a glimpse into the wonderful world of topological machine learning:

difftda by Mathieu Carrière
Ripser by Ulrich Bauer
Teaspoon by Elizabeth Munch and her team
TopologyLayer by Rickard Brüel Gabrielsson
topological-autoencoders by Michael Moor, Max Horn, and Bastian Rieck
torchph by Christoph Hofer and Roland Kwitt

Finally, pytorch-topological makes heavy use of POT, the Python Optimal Transport Library. We are indebted to the many contributors of all these projects.


pytorch-topological's Issues

TypeError: 'PersistenceInformation' object is not iterable

How can I handle the non-iterable 'PersistenceInformation' object in the TopologicalModel classification model?
The error comes from: pers_info = make_tensor(pers_info). I also tried torch.tensor(pers_info) and got an error saying that pers_info is not a sequence.

Finally, pers_info is the VR complex of weights in a torch.tensor. I am not sure if that matters.

Thank you.

About SignatureLoss

Thank you for sharing this repo.
I have run into a problem, though:
calling the SignatureLoss function raises 'AttributeError: 'list' object has no attribute 'pairing''.

self.vr = VietorisRipsComplex(dim=0)
self.topo_loss = SignatureLoss(p=2)

a = x
pi_x = self.vr(a)
b = x
pi_z = self.vr(b)

topo_loss = self.topo_loss([a, pi_x],[b, pi_z])

Support for different persistent homology backends

As stated in the roadmap, gudhi and giotto-ph are currently used for various purposes. We want to allow the user to specify which backend to use, and also provide sane defaults if the user does not care or has no preference.

Support `ripser++` as a backend for Vietoris–Rips complex calculations

Some previous benchmarks by @crisbodnar: https://colab.research.google.com/drive/1CbLnKu4v964Gxb2FDsxp1kgPwe0pOKpl?usp=sharing
Suggestion for integration: use str parameter for VietorisRipsComplex
Add as optional dependency

CubicalComplex of 1d feature vs 2d image

Thanks for the useful framework.
I have two questions about processing images.

Is it advised to pass an image through a linear layer (convolve) and then get the persistence, or is it better to do it in the reverse order?

If I have a [28,28] tensor image X, I can directly do

cc = CubicalComplex()
pers = cc(X)

If I have a representation Z, e.g. a tensor of dimension 128 output by a ResNet,
is this a correct way to proceed, or would you advise a better solution?

cc = CubicalComplex()
pers = cc(Z.view(1, -1))

Many thanks

Make TOGL feature complete

Currently, the implementation of TOGL ignores the following features:

handling higher-order information properly
expanding simplicial complexes
making use of the dimension of features

cannot run example scripts

This is the image-smoothing script in the examples folder of the source package.

import numpy as np
import matplotlib.pyplot as plt

from torch_topological.nn import Cubical
from torch_topological.nn import SummaryStatisticLoss

from sklearn.datasets import make_circles

import torch


def _make_data(n_cells, n_samples=1000):
    X = make_circles(n_samples, shuffle=True, noise=0.05)[0]

    heatmap, *_ = np.histogram2d(X[:, 0], X[:, 1], bins=n_cells)
    heatmap -= heatmap.mean()
    heatmap /= heatmap.max()

    return heatmap


class TopologicalSimplification(torch.nn.Module):
    def __init__(self, theta):
        super().__init__()

        self.theta = theta

    def forward(self, x):
        persistence_information = cubical(x)
        persistence_information = [persistence_information[0]]

        gens, pd = persistence_information[0]

        persistence = (pd[:, 1] - pd[:, 0]).abs()
        indices = persistence <= self.theta

        gens = gens[indices]

        indices = torch.vstack((gens[:, 0:2], gens[:, 2:]))

        indices = np.ravel_multi_index(
            (indices[:, 0], indices[:, 1]), x.shape
        )

        x.ravel()[indices] = 0.0

        persistence_information = cubical(x)
        persistence_information = [persistence_information[0]]

        return x, persistence_information


if __name__ == '__main__':

    np.random.seed(23)

    Y = _make_data(50)
    Y = torch.as_tensor(Y, dtype=torch.float)
    X = torch.as_tensor(
        Y + np.random.normal(scale=0.05, size=Y.shape), dtype=torch.float
    )

    theta = torch.nn.Parameter(
        torch.as_tensor(1.0), requires_grad=True,
    )

    topological_simplification = TopologicalSimplification(theta)

    optimizer = torch.optim.Adam(
        [theta], lr=1e-2
    )
    loss_fn = SummaryStatisticLoss('total_persistence', p=1)

    cubical = Cubical()

    persistence_information_target = cubical(Y)
    persistence_information_target = [persistence_information_target[0]]

    for i in range(500):
        X, persistence_information = topological_simplification(X)

        optimizer.zero_grad()

        loss = loss_fn(
            persistence_information,
            persistence_information_target
        )

        print(loss.item(), theta.item())

        theta.backward()
        optimizer.step()

    X = X.detach().numpy()

    plt.imshow(X)
    plt.show()

Trying to run it results in

ImportError: cannot import name 'Cubical' from 'torch_topological.nn' (/home/bernard/opt/python310/lib/python3.10/site-packages/torch_topological/nn/__init__.py)

Changing Cubical to CubicalComplex results in a different error

Traceback (most recent call last):

  File ~/lib/python3.10/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File ~/Examples/torch_topological/examples/image_smoothing.py:82
    X, persistence_information = topological_simplification(X)

  File ~/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl
    return forward_call(*args, **kwargs)

  File ~/Examples/torch_topological/examples/image_smoothing.py:34 in forward
    gens, pd = persistence_information[0]

TypeError: 'PersistenceInformation' object is not iterable
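
The traceback points at tuple unpacking of a single object. The following sketch reproduces that failure mode without the library and shows attribute access as a possible alternative. `FakePersistenceInformation` is a hypothetical stand-in, not the library's class; the attribute names `pairing` and `diagram` are assumptions, suggested by the `pairing` attribute named in the SignatureLoss error above, and should be checked against the installed version.

```python
from dataclasses import dataclass


@dataclass
class FakePersistenceInformation:
    """Hypothetical stand-in for illustration; NOT the library's class."""

    pairing: list       # assumed: indices of the generators
    diagram: list       # assumed: (birth, death) tuples
    dimension: int = 0  # assumed: homology dimension


info = FakePersistenceInformation(
    pairing=[(0, 1), (2, 3)],
    diagram=[(0.0, 0.5), (0.1, 0.9)],
)

try:
    # Mirrors `gens, pd = persistence_information[0]` from the script:
    # unpacking requires an iterable, which this object is not.
    gens, pd = info
    unpack_error = None
except TypeError as exc:
    unpack_error = str(exc)

print(unpack_error)

# Reading the fields by name avoids iterating over the object.
gens, pd = info.pairing, info.diagram
print(gens, pd)
```

If the real object exposes the same fields, replacing the unpacking line in `forward` with attribute access would be the corresponding change in the example script.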

pip installation issue

Hi, I was trying to install pytorch-topological using 'pip install torch-topological' command but I got the error messages below:

Collecting torch-topological
  Using cached torch_topological-0.1.7-py3-none-any.whl (51 kB)
Collecting POT<0.9.0,>=0.8.0 (from torch-topological)
  Using cached POT-0.8.2.tar.gz (255 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
INFO: pip is looking at multiple versions of torch-topological to determine which version is compatible with other requirements. This could take a while.
Collecting torch-topological
  Using cached torch_topological-0.1.6-py3-none-any.whl (51 kB)
  Using cached torch_topological-0.1.5-py3-none-any.whl (50 kB)
  Using cached torch_topological-0.1.4-py3-none-any.whl (325 kB)
  Using cached torch_topological-0.1.3-py3-none-any.whl (323 kB)
  Using cached torch_topological-0.1.2-py3-none-any.whl (316 kB)
  Using cached torch_topological-0.1.1-py3-none-any.whl (38 kB)
  Using cached torch_topological-0.1.0-py3-none-any.whl (9.2 kB)
INFO: pip is still looking at multiple versions of torch-topological to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install torch-topological==0.1.0, torch-topological==0.1.1, torch-topological==0.1.2, torch-topological==0.1.3, torch-topological==0.1.4, torch-topological==0.1.5, torch-topological==0.1.6 and torch-topological==0.1.7 because these package versions have conflicting dependencies.

The conflict is caused by:
    torch-topological 0.1.7 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.6 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.5 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.4 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.3 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.2 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.1 depends on giotto-ph<0.3.0 and >=0.2.0
    torch-topological 0.1.0 depends on giotto-ph<0.3.0 and >=0.2.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

I'm currently using a Python 3.11 environment. Is this a compatibility issue, or do I need to change some settings? Thanks for the help!

Alternative (faster) approach for constructing SimplexTree in TOGL example code.

Dear Bastian,

Thank you so much for maintaining and actively improving this library. Your previous works, TOGL and GFL, have given me so much inspiration!

I have been using the TOGL example code to build a topological model for 3D point cloud analysis. Although gudhi's SimplexTree (line 204) allows for constructing general simplices, this method is computationally prohibitive when handling large-scale point clouds (N > 100k).

I also checked the library from one of your colleagues, but it has no suitable method for constructing abstract simplicial complexes the way SimplexTree does.

Could you recommend any other computationally feasible way to compute the persistent homology based on the graph filtration?

Regards,
Zexian.

TOGL example: use of GPU not specified (was: About TOGL demo)

Thanks for your repo on TOGL.
When I run the demo (graph.py), it fails with the error 'Segmentation fault (core dumped)'.

Encounter empty tensor output from VietorisRipsComplex class.

Dear Bastian and your team,

Thank you all for this project!

I am trying to build a topological GIN following the recipe of your graph filtration learning and topological GNN papers (great work, btw!).

Given that my custom dataset is a 2D dataset with a non-fixed number of points (i.e., [n, d], where n is not fixed), I use torch_geometric to handle my data and batching. I followed the example in example/classification.py to build my forward function; however, VietorisRipsComplex returns an empty tensor after the make_tensor function. Are empty tensors expected?

Below is the code I have written:

class SomeNN (torch.nn.Module):
    def __init__(self, 
                 in_channels,
                 hid_channels, 
                 out_channels):
        super(SomeNN, self).__init__()
        
        self.n_elements = 10
        self.vl = StructureElementLayer(self.n_elements)
        self.vr = VietorisRipsComplex(dim=0)

    def forward(self, x, edge_index, batch):
        ph_info = []
        max_size = batch.max()
        
        for size in list(range(0, max_size)):
            ph = self.vr(x[batch[batch==size]].view(1, -1, 2))
            ph = make_tensor(ph)
            ph = self.vl(ph)
            ph_info.append(ph) 
        ...
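
One thing worth checking in the loop above is the indexing expression. In PyTorch, `batch[batch == size]` yields the value `size` repeated, and using that as an index into `x` selects row `size` several times instead of the rows belonging to graph `size`. In addition, `range(0, max_size)` stops before the last graph. The sketch below mirrors both behaviours with plain Python lists so that it runs without torch; the helper names are hypothetical, and whether this explains the empty tensors is an assumption, not a confirmed diagnosis.

```python
# Node features for two graphs: graph 0 has three points, graph 1 has two.
x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
batch = [0, 0, 0, 1, 1]


def select_as_written(x, batch, graph_id):
    # Mirrors x[batch[batch == graph_id]]: the inner selection is the
    # value graph_id repeated, which is then used as a row index into x.
    indices = [b for b in batch if b == graph_id]
    return [x[i] for i in indices]


def select_by_mask(x, batch, graph_id):
    # Mirrors x[batch == graph_id]: keep rows whose batch entry matches.
    return [row for row, b in zip(x, batch) if b == graph_id]


as_written = select_as_written(x, batch, 1)
by_mask = select_by_mask(x, batch, 1)
print(as_written)  # row 1 of x, twice: identical points
print(by_mask)     # the two points that belong to graph 1

# range(0, max) skips the last graph; range(0, max + 1) covers all of them.
graph_ids_as_written = list(range(0, max(batch)))
graph_ids_all = list(range(0, max(batch) + 1))
print(graph_ids_as_written, graph_ids_all)
```

With the expression as written, every selected point cloud consists of copies of a single point, so all pairwise distances are zero. Selecting with the boolean mask and iterating up to `batch.max() + 1` would pass the actual per-graph point clouds to VietorisRipsComplex.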

The second question may not be relevant, but how are the PH diagrams calculated in pytorch-topological guaranteed to be differentiable, compared to other backends such as giotto-tda? What advantages does pytorch-topological provide in terms of differentiability and integration into deep NN layers?

In both of the papers mentioned above, the PH diagrams are differentiable because the birth-death tuples of the data X are pairwise distinct. Is this the reason the PH information in pytorch-topological is differentiable?

Thanks!

Request to add: Example for binary image segmentation task

CubicalComplex documentation could be more inviting for higher-dimensional data

Hi Bastian,

As I wrote in my E-mail to you I was a little confused and deterred by the documentation of the CubicalComplex class. I expected that your implementation only works for 2D images and was worried that, when feeding a 3D volume the third dimension would simply be considered channel information. I think especially the explicit example in the forward() method threw me off there:

     1. Tensor of `dim = 2`: a single 2D image
     2. Tensor of `dim = 3`: a single 2D image with channels
     3. Tensor of `dim = 4`: a batch of 2D images with channels

Now, after having spent a little time reading the code and documentation again, I can see that you were correctly describing the behavior of the method and that there was a misunderstanding on my side. Still, I think the docs could be more inviting for higher-dimensional data.

One way I thought of to improve the docs is to add an example for higher-dimensional data, or to mention the usage of volumes etc. in the docs.

Another way could be to slightly restructure the code and to simply set the default value of dim not to dim=None but dim=2, which would in my eyes result in the same behavior and to explain the behavior of treating "extra dimensions" of the data as channels/batch in a more general way with a specific example perhaps for 2D.
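
A more general description of the "extra dimensions" rule could look like the following sketch. `describe_input` is a hypothetical helper written for this illustration and is not part of the library; it assumes that the trailing `dim` axes are spatial, that one extra leading axis means channels, and that two extra leading axes mean batch and channels, which matches the three cases quoted from the docstring for `dim = 2`.

```python
def describe_input(shape, dim=2):
    """Split a tensor shape into batch, channel, and spatial parts.

    Hypothetical helper for illustration only. The last `dim` axes are
    treated as spatial; any leading axes are read as (channels,) or
    (batch, channels).
    """
    extra = len(shape) - dim
    if extra not in (0, 1, 2):
        raise ValueError(
            f"expected {dim}, {dim + 1}, or {dim + 2} axes, got {len(shape)}"
        )

    spatial = tuple(shape[extra:])
    channels = shape[extra - 1] if extra >= 1 else None
    batch = shape[0] if extra == 2 else None

    return {"batch": batch, "channels": channels, "spatial": spatial}


# The three documented 2D cases.
print(describe_input((28, 28)))
print(describe_input((3, 28, 28)))
print(describe_input((8, 3, 28, 28)))

# A single-channel 3D volume, read with dim=3.
print(describe_input((1, 64, 64, 64), dim=3))
```

Phrasing the docs in terms of "the last `dim` axes are spatial" would cover 2D images and 3D volumes with one rule, with the 2D list kept as a worked example.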

Thanks again for your help with our Project and your awesome work in the TDA field!

run alpha_complex.py with error

Hi, developers,

When I run the example, python alpha_complex.py, I got the following error:

Traceback (most recent call last):
  File "/data/code13/pytorch-topological/torch_topological/examples/alpha_complex.py", line 13, in <module>
    from torch_topological.utils import SelectByDimension
ImportError: cannot import name 'SelectByDimension' from 'torch_topological.utils' (/root2/anaconda3/envs/pytorch-topological/lib/python3.9/site-packages/torch_topological/utils/__init__.py)

Is there something missing in the import location?

Thanks~

About Topological Graph Neural Networks

Does the pytorch-topological repo support the implementation of the TOGL (Topological Graph Neural Networks) framework?

Permit arbitrary distance matrices for Vietoris–Rips complex

The interface to ripser_parallel already supports that anyway so we can make our life easier here.

Training is much slower after adding a topology loss term (cubical complex)

I found that my code is very slow after adding a topology loss (using a cubical complex). GPU usage drops to about 10%, compared with around 90% without the topology loss. Is this normal?

about installation

Why can't I install it through pip? Is there any other way to install it without poetry?

Do torch_topological.nn and functions like SignatureLoss and VietorisRipsComplex use the GPU?

Hi,

The code and documentation are very fluid and easy to understand. Thanks for the code.
I wanted to know whether torch_topological.nn and its various functions support GPU computations (i.e., do they run on the GPU?).
If not, is there any way to modify the code to allow them to run on the GPU?

Eagerly awaiting a response.

Thanks,
Prashant

About classification.py in this repo

In classification.py, can I use an Alpha complex or a Cubical complex in place of the Vietoris--Rips complex?

Write quickstart guide for documentation

Just covering the basics for new users: suppose you don't care about topology at all, what's the smallest thing you can add to your code to make it 'topology-aware'?

Visualization of PD

Hi,

I feel like there should be built-in functions for visualizing persistence diagrams.

Right now, I am using matplotlib to scatter-plot the diagram of a PersistenceInformation object, but it requires extra steps such as detaching the tensor.
Plus, I don't know what would be the best way to visualize the pairing.
So, I feel there should be something in the documentation or a tutorial on how to do so!

Add support for PersLayer

Import PersLayer to PyTorch topological. As part of this effort, it would be nice to include the following deliverables:

PyTorch layer
Unit tests
Example of usage
Simple documentation on how this is used and why someone would find this layer useful to integrate into their network architecture

Useful resources for completing this task:

PersLayer paper.
Current PersLayer implementation in TensorFlow which is used by Gudhi.

support for python<3.9

I know this is probably futile, since upgrading is the natural flow of things. But I feel some people (like me) have a lot of other projects/envs using 3.5<=python<=3.8, so upgrading to python3.9 breaks some libraries, especially when some of them support only certain versions of CUDA/torch.
It would be great if there was support for any of python3.5/6/7/8!

Add example and unit tests for TOGL

Currently, there are no examples showing how to use TOGL. It would be good to provide an example use case.