Giter Site home page Giter Site logo

Comments (15)

astamm avatar astamm commented on June 14, 2024

Is there some randomness in the way the cubical complex computes persistence which could explain such differences?

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

Hi Aymeric, is this an issue that only happens with R, or is it possible to reproduce it directly with Python?
I don't think there is supposed to be any randomness in computing the persistence diagram of a periodic cubical complex, but it is hard to say without a full example, and I don't know what to look at...

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

Hey Marc. Well I only see through the R interface but I don't see any obvious reason why that would be R-specific as I am only calling Python classes and methods. Specifically the unit test setup in R is:

n <- 10
X <- cbind(seq(0, 1, len = n), seq(0, 1, len = n))
cc <- CubicalComplex$new(top_dimensional_cells = X)
cc$compute_persistence()$persistence()

the Python equivalent of which would be probably something like:

import numpy as np
import gudhi as gd
n = 10
X = np.linspace(0, 1, num = n)
X = [X, X]
cc = gd.CubicalComplex(top_dimensional_cells = X)
cc.compute_persistence()
cc.persistence()

Then we would need to test this Python code on the various platforms I mentioned.

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

As expected, this prints

[(0, (0.0, inf))]

on linux, whatever the version of python used.

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

cc.compute_persistence()
cc.persistence()

By the way, this is redundant, persistence already calls compute_persistence, so this computes it twice.

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

As expected, this prints

[(0, (0.0, inf))]

on linux, whatever the version of python used.

Which Python versions did you include in your tests ? 3.8 to 3.10 ?
which Linux distribution ? The problem seems to be specific to Ubuntu 20.04.

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

Ok there is actually no issues with the cubical complex, only with the periodic cc.
When I do:

cc = gd.PeriodicCubicalComplex(top_dimensional_cells = X, periodic_dimensions = True)
cc.persistence()

I get on most systems a diagram with points for dimension 0 and 1 but on Ubuntu 20.04 for some Python and R versions, I get something in dimension 2 as well so that .persistence(), .betti_numbers(), .num_simplices(), .persistent_betti_numbers()and.cofaces_of_persistence_pairs()` does not return the expected value.

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

cc = gd.PeriodicCubicalComplex(top_dimensional_cells = X, periodic_dimensions = True)

periodic_dimensions is supposed to be a vector<bool>. If you pass True, I don't know, maybe it converts True to 1, constructs a vector of size 1, and accessing the second element is then undefined behavior...

It does seem easy to misuse periodic_dimensions, we should either diagnose the error, or interpret a bool the same as an array with the right size and constant value.

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

Actually, when I try passing periodic_dimensions = True, I get an error

TypeError: 'bool' object is not iterable

I guess it could depend on the version of [cp]ython used, but are you sure that's really the code you tried?

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

Yes, what I run in R is with

periodic_dimensions = TRUE

which is converted by reticulate::r_to_py() into True.
So indeed since in this example it actually would need a length-2 boolean vector, it then has to infer the second element.
In R, this is silently achieved by recycling, i.e., in this case it would automatically extend to c(TRUE , TRUE) but it might not understand always that it has to recycle because it needs to understand that from the Python side in this case.
Plus, in any event, recycling silently is a bad thing.
So I guess I'll start by passing a boolean vector and see if that improves things.

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

Ok that seems to be the reason.

With the second element to TRUE:

> pcc <- PeriodicCubicalComplex$new(
+     top_dimensional_cells = X,
+     periodic_dimensions = c(TRUE, TRUE)
+   )
> pcc$persistence()
# A tibble: 4 × 3
  dimension birth death
      <int> <dbl> <dbl>
1         2     1   Inf
2         1     0   Inf
3         1     1   Inf
4         0     0   Inf

which is what I get on ubuntu 20.04 when my test fails. And with the second element to FALSE:

> pcc <- PeriodicCubicalComplex$new(
+     top_dimensional_cells = X,
+     periodic_dimensions = c(TRUE, FALSE)
+   )
> pcc$persistence()
# A tibble: 2 × 3
  dimension birth death
      <int> <dbl> <dbl>
1         1     1   Inf
2         0     0   Inf

which is what I get on the rest of platforms.

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

cc.compute_persistence()
cc.persistence()

By the way, this is redundant, persistence already calls compute_persistence, so this computes it twice.

I added in all R classes a private attribute that tracks whether the persistence has already been computed. That way, whenever the user calls .compute_persistence(), it actually only runs if that attribute is FALSE. Maybe that could be interesting to do so also in the Python classes or even at the C++ level?

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

I added in all R classes a private attribute that tracks whether the persistence has already been computed. That way, whenever the user calls .compute_persistence(), it actually only runs if that attribute is FALSE.

Then you have to take care to set it back to false whenever someone for instance changes a filtration value or inserts a simplex in the SimplexTree. Also, you need to remember what options were passed to compute_persistence, in case the second call has different options.

from gudhi-devel.

astamm avatar astamm commented on June 14, 2024

Yes you are right. It needs some care in its implementation to get things right.

from gudhi-devel.

mglisse avatar mglisse commented on June 14, 2024

I think this one was only on the R side, so we can close it.

from gudhi-devel.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.