Comments (15)
Is there some randomness in the way the cubical complex computes persistence which could explain such differences?
from gudhi-devel.
Hi Aymeric, is this an issue that only happens with R, or is it possible to reproduce it directly with Python?
I don't think there is supposed to be any randomness in computing the persistence diagram of a periodic cubical complex, but it is hard to say without a full example, and I don't know what to look at...
from gudhi-devel.
Hey Marc. Well, I only see it through the R interface, but I don't see any obvious reason why it would be R-specific, as I am only calling Python classes and methods. Specifically, the unit test setup in R is:
n <- 10
X <- cbind(seq(0, 1, len = n), seq(0, 1, len = n))
cc <- CubicalComplex$new(top_dimensional_cells = X)
cc$compute_persistence()$persistence()
The Python equivalent would probably be something like:
import numpy as np
import gudhi as gd
n = 10
X = np.linspace(0, 1, num = n)
X = [X, X]
cc = gd.CubicalComplex(top_dimensional_cells = X)
cc.compute_persistence()
cc.persistence()
Then we would need to test this Python code on the various platforms I mentioned.
from gudhi-devel.
As expected, this prints
[(0, (0.0, inf))]
on Linux, whatever the version of Python used.
from gudhi-devel.
> cc.compute_persistence()
> cc.persistence()
By the way, this is redundant, persistence already calls compute_persistence, so this computes it twice.
from gudhi-devel.
> As expected, this prints
> [(0, (0.0, inf))]
> on Linux, whatever the version of Python used.
Which Python versions did you include in your tests? 3.8 to 3.10?
Which Linux distribution? The problem seems to be specific to Ubuntu 20.04.
from gudhi-devel.
OK, there is actually no issue with the cubical complex, only with the periodic cubical complex.
When I do:
cc = gd.PeriodicCubicalComplex(top_dimensional_cells = X, periodic_dimensions = True)
cc.persistence()
On most systems I get a diagram with points in dimensions 0 and 1, but on Ubuntu 20.04, for some Python and R versions, I get something in dimension 2 as well, so that `.persistence()`, `.betti_numbers()`, `.num_simplices()`, `.persistent_betti_numbers()` and `.cofaces_of_persistence_pairs()` do not return the expected values.
from gudhi-devel.
> cc = gd.PeriodicCubicalComplex(top_dimensional_cells = X, periodic_dimensions = True)
`periodic_dimensions` is supposed to be a `vector<bool>`. If you pass True, I don't know, maybe it converts True to 1, constructs a vector of size 1, and accessing the second element is then undefined behavior...
It does seem easy to misuse `periodic_dimensions`; we should either diagnose the error, or interpret a `bool` the same as an array with the right size and constant value.
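The two options mentioned here (diagnose the error, or broadcast a single bool) could be sketched as a small validation step. This is only an illustration: `normalize_periodic_dimensions` and its `ndim` argument are hypothetical names, not part of GUDHI.

```python
from collections.abc import Iterable


def normalize_periodic_dimensions(periodic_dimensions, ndim):
    """Return a list of `ndim` booleans, or raise on a length mismatch.

    Hypothetical helper, not part of GUDHI.
    """
    # A single Python bool is interpreted as "the same flag on every axis".
    if isinstance(periodic_dimensions, bool):
        return [periodic_dimensions] * ndim
    if not isinstance(periodic_dimensions, Iterable):
        raise TypeError("periodic_dimensions must be a bool or an iterable of bools")
    flags = [bool(flag) for flag in periodic_dimensions]
    # Diagnose a wrong length instead of reading past the end of the vector.
    if len(flags) != ndim:
        raise ValueError(f"expected {ndim} periodic flags, got {len(flags)}")
    return flags
```

With such a check, `normalize_periodic_dimensions(True, 2)` would give `[True, True]`, while a length-1 list for a 2-dimensional complex would raise a `ValueError` rather than produce platform-dependent results.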
from gudhi-devel.
Actually, when I try passing `periodic_dimensions = True`, I get an error:
TypeError: 'bool' object is not iterable
I guess it could depend on the version of [cp]ython used, but are you sure that's really the code you tried?
from gudhi-devel.
Yes, what I run in R uses `periodic_dimensions = TRUE`, which `reticulate::r_to_py()` converts into `True`.
So indeed, since this example actually needs a length-2 boolean vector, the second element has to be inferred.
In R, this is silently achieved by recycling, i.e., in this case TRUE would automatically be extended to `c(TRUE, TRUE)`, but R might not always realize that it has to recycle, because here that requirement comes from the Python side.
Plus, in any event, silent recycling is a bad thing.
So I guess I'll start by passing a boolean vector and see if that improves things.
from gudhi-devel.
OK, that seems to be the reason.
With the second element set to `TRUE`:
> pcc <- PeriodicCubicalComplex$new(
+ top_dimensional_cells = X,
+ periodic_dimensions = c(TRUE, TRUE)
+ )
> pcc$persistence()
# A tibble: 4 × 3
  dimension birth death
      <int> <dbl> <dbl>
1         2     1   Inf
2         1     0   Inf
3         1     1   Inf
4         0     0   Inf
which is what I get on Ubuntu 20.04 when my test fails. And with the second element set to `FALSE`:
> pcc <- PeriodicCubicalComplex$new(
+ top_dimensional_cells = X,
+ periodic_dimensions = c(TRUE, FALSE)
+ )
> pcc$persistence()
# A tibble: 2 × 3
  dimension birth death
      <int> <dbl> <dbl>
1         1     1   Inf
2         0     0   Inf
which is what I get on the other platforms.
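As a sanity check on the two outputs above: a grid that is periodic along an axis behaves like a circle along that axis, and like an interval otherwise, so the number of infinite bars per dimension should match the Betti numbers of the corresponding product (torus or cylinder). A minimal sketch, assuming field coefficients; `betti_numbers` is a hypothetical helper that does not call GUDHI:

```python
def betti_numbers(periodic_dimensions):
    """Betti numbers of a product of circles (periodic axes) and intervals."""
    betti = [1]  # Poincaré polynomial of a point
    for periodic in periodic_dimensions:
        if periodic:
            # Multiply by (1 + t), the Poincaré polynomial of a circle.
            betti = [a + b for a, b in zip(betti + [0], [0] + betti)]
        # A non-periodic axis is an interval, which contributes a factor of 1.
    return betti


print(betti_numbers([True, True]))   # [1, 2, 1]
print(betti_numbers([True, False]))  # [1, 1]
```

The first result agrees with the four infinite bars seen with `c(TRUE, TRUE)` (one in dimension 0, two in dimension 1, one in dimension 2), and the second with the two bars seen with `c(TRUE, FALSE)`.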
from gudhi-devel.
> cc.compute_persistence()
> cc.persistence()
> By the way, this is redundant, persistence already calls compute_persistence, so this computes it twice.
I added in all R classes a private attribute that tracks whether the persistence has already been computed. That way, whenever the user calls `.compute_persistence()`, it actually only runs if that attribute is `FALSE`. Maybe it would be interesting to do the same in the Python classes, or even at the C++ level?
from gudhi-devel.
> I added in all R classes a private attribute that tracks whether the persistence has already been computed. That way, whenever the user calls `.compute_persistence()`, it actually only runs if that attribute is `FALSE`.
Then you have to take care to set it back to false whenever someone for instance changes a filtration value or inserts a simplex in the SimplexTree. Also, you need to remember what options were passed to compute_persistence, in case the second call has different options.
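The two caveats above (invalidate on mutation, remember the options) could look roughly like this. It is a toy sketch, not GUDHI code: `CachedPersistence`, `set_cell`, `_expensive` and the `runs` counter are hypothetical, and the option names merely mirror those of `compute_persistence`.

```python
class CachedPersistence:
    """Toy stand-in for a complex; hypothetical, not a GUDHI class."""

    def __init__(self, cells):
        self._cells = list(cells)
        self._cache_key = None  # options used for the cached result
        self._cached = None
        self.runs = 0           # how many times the computation actually ran

    def set_cell(self, index, value):
        self._cells[index] = value
        # Any change to the filtration invalidates the cached result.
        self._cache_key = None
        self._cached = None

    def compute_persistence(self, homology_coeff_field=11, min_persistence=0.0):
        key = (homology_coeff_field, min_persistence)
        # Recompute if nothing is cached or if the options differ.
        if self._cache_key != key:
            self.runs += 1
            self._cached = self._expensive(*key)
            self._cache_key = key
        return self._cached

    def _expensive(self, homology_coeff_field, min_persistence):
        # Placeholder for the real computation.
        return (min(self._cells), float("inf"))
```

Calling `compute_persistence()` twice in a row runs the computation once; calling it with different options, or after `set_cell`, runs it again.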
from gudhi-devel.
Yes, you are right. It needs some care to implement correctly.
from gudhi-devel.
I think this one was only on the R side, so we can close it.
from gudhi-devel.