geostat-framework / pentapy

A Python toolbox for pentadiagonal linear systems
Home Page: https://pentapy.readthedocs.io/
License: MIT License
Very nice work, this is super useful!
I have a quick question, more of a clarification. The documentation says

```python
# create a flattened pentadiagonal matrix
M_flat = (np.random.random((5, size)) - 0.5) * 1e-5
```

Here, I interpret M_flat to represent each of the 5 diagonals of a pentadiagonal matrix. I believe this is similar to a banded structure.
I see that pentapy supports the matrix being input as Full and Flat. The Full option makes sense to me: the numpy array has shape (N_col, M_row). I'm a bit confused about Flat because flat typically refers to an (N, N) -> (N^2, 1) transformation (see https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flatten.html). However, the above leads me to believe that Flat is actually just the 5 diagonals, i.e. a numpy array of shape (5, N).
So, this is a long-winded way of asking: is it safe to assume that pentapy supports solving pentadiagonal systems represented in a sparse format, using just the 5 diagonals to represent the operator (and the right-hand side)?
We should simplify our branching structure and use the new default "main" that was suggested by GitHub. Things to take care of:
- update links in index.rst and README.md (hopefully auto-redirect as promised for older docs)
- update CONTRIBUTING.md to refer to main
- set the default branch to main
At the moment the core algorithms are implemented in Cython and Python at the same time:
- py_solver.py
- solver.pyx
Since we are building wheels with pre-compiled code for almost every system, there is no need to provide Python fallback implementations. Another side effect is that the setup will be much cleaner.
This is a neat and well-defined package that provides a big performance improvement over existing Python algorithms for solving linear systems with pentadiagonal matrices, and I found it to work as claimed. My main reservation is the documentation, which I think could be improved somewhat, but if the points I make below are addressed/defended satisfactorily, I would be happy to accept it for publication in JOSS.
The package installed correctly via pip. Dependencies are dealt with by setup.py, but scipy (which is required for some of the algorithms) is not installed automatically and has to be installed manually. Perhaps this could be automatic; see also my comment in the Documentation section below.
For solver=1,2 there is no check for a singular matrix, so you get a ZeroDivisionError: float division. Algorithms 3,4,5 report this situation specifically (as either an error or a warning). If performance is not impacted, could you check for this? Or simply catch numerical errors and re-raise with a more suggestive error message.
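To illustrate the "catch and re-raise" suggestion, here is a hypothetical wrapper (the name `solve_checked` and the dispatch shape are my own, not pentapy's API):

```python
def solve_checked(solver, mat_flat, rhs):
    """Hypothetical wrapper: turn the low-level ZeroDivisionError raised by
    a zero pivot in PTRANS-I/II into a more suggestive error message."""
    try:
        return solver(mat_flat, rhs)
    except ZeroDivisionError as err:
        raise ValueError(
            "matrix appears singular or unsuitable for PTRANS-I/II; "
            "consider solver=3 (scipy.linalg.solve_banded)"
        ) from err


def failing_solver(mat_flat, rhs):
    # stand-in for a solver hitting a zero pivot
    raise ZeroDivisionError("float division")
```

This keeps the hot path unchanged and only costs anything when the solver actually fails.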
I found the solver to work correctly on some basic but non-exhaustive tests of my own: discretising the heat equation both tri- and pentadiagonally and comparing the result to an analytic solution.
I think there should be more tests than are currently implemented in the test suite, and they should include some more realistic examples as well as the random matrix tests. There should also be some basic unit tests of the matrix utilities, e.g. check full->banded->full gives the original matrix.
Documentation on how to run (and extend) the package tests should be provided in the community guidelines, if the author is willing to accept contributions.
My tests agree with the claims in the paper. But the code only supports 5 of the 7 algorithms compared in the graphs in the README/readthedocs, could those graphs be made consistent with the one in the paper?
The README looks a little rushed, e.g. "pentapy is a toolbox to deal with pentadiagonal matrices in Python" is a bit vague and to a large extent the README just duplicates the readthedocs documentation. Personally, I would have a README containing only a statement of intent, authorship and community guidelines, basic installation instructions, and links to detailed documentation and examples (in readthedocs).
Re: community guidelines, I don't see "clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support".
Whilst help(pentapy) provides API-level documentation (some typos/spelling issues here), I think this should also be in the main documentation; the solve and create_full methods are only dealt with in the one example given. Other methods (diag_indices, shift_banded, create_banded) are not mentioned at all.
The Installation and Requirements sections of the README could be merged and made clearer, esp. that scipy is an optional dependency that only some solvers require.
I think the "unique selling point" of this package is a very significant improvement in performance over current implementations/algorithms, and this should be emphasised more in the paper (and the README).
Hi, thanks for the great library. I have a suggestion that could slightly improve performance when using the solvers with numpy array inputs.
In the solve function, you use np.array for the input matrix and right-hand-side array, which creates new arrays of the inputs even if they are already numpy arrays. I know NumPy's and Scipy's solvers do not modify the input arrays by default so they don't require copying, and it doesn't seem like your two solvers modify them either unless you call shift_banded before the solver. Therefore, I would suggest you use np.asarray whenever you do not require calling shift_banded (similar to how you use np.asanyarray within the tools functions), so that a new array is created only if the input is not already an array with the correct dtype.
I modified your 03_perform_simple.py example program and made it into a gist that shows the difference in solver times for the PTRANS-I and PTRANS-II solvers when using np.asarray (https://gist.github.com/derb12/3e2670a509e86051ee9dfe5aa76a0a95). The modified solve function is solve_new. I also included just the time required to call np.array and np.asarray on the inputs. An image of the gist's output is below. For small N (< 10000), you can reduce time by ~5-10%, and for large N (> ~10000) the reduction is ~30%.
If you're interested in making this change, feel free to copy and/or modify any of the code in the gist since it's basically just copies of your code anyway, or I can submit a pull request. You could simplify the code by always using np.asarray and simply calling shift_banded with copy=True, but I wanted to modify as little as possible for the gist.
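The core of the suggestion can be sketched in a few lines (`prepare_inputs` is a hypothetical helper, not pentapy code):

```python
import numpy as np

def prepare_inputs(mat, rhs):
    """Hypothetical helper: np.asarray returns the input object unchanged
    when it is already an ndarray of the requested dtype, whereas np.array
    always allocates and copies."""
    mat = np.asarray(mat, dtype=np.float64)
    rhs = np.asarray(rhs, dtype=np.float64)
    return mat, rhs

a = np.zeros((5, 100))
b = np.zeros(100)
a2, b2 = prepare_inputs(a, b)
# a2/b2 share memory with a/b: no copy was made
```

Lists or arrays of the wrong dtype are still converted, so the input handling stays just as forgiving.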
The special case of a 3x3 matrix is not handled correctly by either algorithm I or algorithm II. They have the access pattern first 2 rows - all rows in between - last 2 rows, but for a 3x3 matrix the first and last 2 rows overlap, because the second row belongs to both. This is an edge case, yet it can happen and leads to wrong memory access. Unfortunately, the tests do not capture this case because they only cover a single size of n=1_000.
I have already fixed this for #11 by falling back to np.linalg.solve for this special case.
The coverage shows 80%, but only because some lines were hit during the unit tests, which only consider one error case and one normal case with n=1_000. I would consider this a bug even though it works, because it basically means that the package is mostly untested despite the coverage. Edge cases are not covered, which led to #24.
On top of that, the most important functionality - the input and error checks - had zero coverage, which led to #23. It might seem tedious and a waste of time to test those, but cutting short here can easily backfire.
Besides, testing whether the x solving Ax = b is correct by looking at the residuals is very risky. A is initialised randomly, but random pentadiagonal matrices are not guaranteed to be well-conditioned. Looking at the residuals can thus be completely meaningless: the solver might work correctly but still run into numerical issues simply because A doesn't want to be solved.
I've written a dedicated function to generate a well-conditioned pentadiagonal A, and as one can see, a lot of logic is required to guarantee good condition numbers. With this, the solution x can be checked, still not via the residuals, but against the solution x_ref obtained by a well-tested standard like the LAPACK partially pivoted banded LU solver wrapped in scipy.linalg.solve_banded.
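One simple way to guarantee good conditioning is strict diagonal dominance. This is only a sketch of the idea, not the actual test utility from the PR (`make_well_conditioned_penta` is a hypothetical name):

```python
import numpy as np
from scipy.linalg import solve_banded

def make_well_conditioned_penta(n, seed=0):
    """Hypothetical sketch: a random pentadiagonal matrix in scipy's (5, n)
    banded storage, made strictly diagonally dominant (per column) so it is
    guaranteed nonsingular and reasonably conditioned."""
    rng = np.random.default_rng(seed)
    ab = rng.uniform(-1.0, 1.0, size=(5, n))
    # zero the unused corners of the banded storage
    ab[0, :2] = 0.0
    ab[1, :1] = 0.0
    ab[3, -1:] = 0.0
    ab[4, -2:] = 0.0
    # make each column's diagonal dominate the off-diagonal absolute sum
    ab[2] = np.abs(ab).sum(axis=0) + 1.0
    return ab

n = 50
ab = make_well_conditioned_penta(n)
b = np.arange(1.0, n + 1)
x_ref = solve_banded((2, 2), ab, b)  # well-tested LAPACK reference solution

# reconstruct the full matrix from the banded storage (ab[2 + i - j, j] = A[i, j])
A = np.zeros((n, n))
for j in range(n):
    for i in range(max(0, j - 2), min(n, j + 3)):
        A[i, j] = ab[2 + i - j, j]
```

With such a matrix, x from pentapy could then be compared elementwise against x_ref instead of relying on residuals.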
I fixed all of this in one go with #11 by completely dropping unittest in favour of pure pytest and leveraging parametrized tests that exercise all relevant combinations with the same test function (including edge cases and all different sorts of input handling). To reduce the load when running the tests, I added pytest-xdist for parallelized tests. Required utility functions are doctested during the pytest runs to ensure that they are also operating correctly.
With this, many different sizes and matrix layouts (full, banded row, banded column) can be tested with just a single test function.
Python 3.8 was released on Oct. 14, 2019.
It is supported by TravisCI and Appveyor.
It is also already implemented in cibuildwheel, but not released yet. See: pypa/cibuildwheel#170.
I think we should wait for cibuildwheel v0.13 and then add py3.8 to the travisCI workflow.
While #27 is open and changes a lot of internal logic, I want to propose another feature for a version upgrade: the computation of the log-determinant. It would fit into #27 very easily with a few lines of code.
If we look at how pentapy solves the system Ax = b, it basically factorises A and transforms b as follows for algorithm I:
A = LU

where L is a lower triangular matrix whose entries will not be discussed here (its lower triangle is dense) and U is the unit upper triangular matrix where
Since the main diagonal of U holds only ones, and the log-determinant of a triangular matrix is simply the sum of the logarithms of its main diagonal elements, this part is 0. So, the determinant of A is completely encoded in L, for which we then also need the main diagonal elements, given that it is triangular as well. Consequently, the same computations as for U apply. But how should this be done? It cannot be computed when it's dense because that would ruin performance. Well, this is where the next point comes in:
zeta = inv(L) @ b

How does that help us? Well, we now need to look at how the update with inv(L) is achieved:

The parts I marked in red help us get the main diagonal of inv(L), because this means that

Combining all of this with the fact that the main diagonal elements of the inverse of a triangular matrix are just the reciprocals of its main diagonal elements, we get
Long story short, that means that if we sum up the logarithms of the
For multiple right-hand sides, the additional load will be vanishing because the factorization and determinant computation only need to be done once.
The error case of
I tested this for algorithm I against np.linalg.slogdet and it works ✅.
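The principle can be sketched with scipy's dense LU as a stand-in (note scipy's convention is mirrored: L has the unit diagonal, so the log-determinant collapses onto the diagonal of U, whereas in pentapy's algorithm I it lives in L):

```python
import numpy as np
from scipy.linalg import lu

# Sketch of the principle only, not pentapy internals:
# for A = P @ L @ U with unit-diagonal L, log|det A| is the
# sum of log-magnitudes of U's diagonal.
rng = np.random.default_rng(42)
A = rng.standard_normal((6, 6)) + 6.0 * np.eye(6)
P, L, U = lu(A)                        # A = P @ L @ U
d = np.diag(U)
logdet = np.sum(np.log(np.abs(d)))     # log of |det A|
sign = np.linalg.det(P) * np.prod(np.sign(d))
```

The result matches np.linalg.slogdet, which returns exactly this (sign, log|det|) pair.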
If we allow the user to set a flag like with_logdet, this could be returned along with the solution or not. If the default is False, the Python API does not change, but we would need to
In chemotools, we rely heavily on the log-determinants for Bayesian regression, but currently we use SciPy's solvers for that. Therefore, this feature would enable us to use pentapy to a far greater extent than we currently do.
1) The determinant can however NOT be used to check whether the matrix is singular. This is a common mistake that comes from math classes where everything was done in exact arithmetic. On a computer with finite precision, this is the worst check one can perform; only the condition number provides true insight here.
Since Python 2 will reach its EOL on Jan 1, 2020, we will drop support for it.
Python 3.4 has already reached its EOL on 2019-03-18, so we are dropping support here, too.
The next minor release (v1.1) will then be Python 3 only.
Since TravisCI supports Windows but only for Python 3 we can then drop the use of Appveyor and stick to one CI system.
At the moment only one dependent variable b is allowed when solving A @ x = b for x.
In other algorithms, multiple dependent variables could be passed, e.g.:
https://numpy.org/devdocs/reference/generated/numpy.linalg.solve.html
Since the solver is implemented in Cython, this could be optimized by using prange.
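For reference, the calling convention np.linalg.solve uses for multiple right-hand sides (one column per RHS) is what pentapy could mirror:

```python
import numpy as np

# Stack the right-hand sides as columns; np.linalg.solve returns
# one solution column per RHS.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = np.stack([np.array([1.0, 2.0, 3.0]),
              np.array([0.0, 1.0, 0.0])], axis=1)  # shape (3, 2)
X = np.linalg.solve(A, B)                          # shape (3, 2)
```

Since the factorization is shared across right-hand sides, solving k systems this way is much cheaper than k separate solves.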
Firstly, I think this is a nicely written library and the code was very easy to read.
I do have some comments, some of which I think need to be addressed as part of the review and some that I think might be nice but are not essential.
I will start with the points that I think need to be addressed as part of the review process:
```python
import unittest

import numpy as np
from numpy.testing import assert_almost_equal

import pentapy as pp


class TestMuGammaNonZeroExampleFromPaper(unittest.TestCase):
    """Example taken from https://www.hindawi.com/journals/mpe/2015/232456/"""

    def setUp(self):
        self.mat = np.array([
            [1, 2, 1, 0, 0, 0, 0, 0, 0, 0],
            [3, 2, 2, 5, 0, 0, 0, 0, 0, 0],
            [1, 2, 3, 1, -2, 0, 0, 0, 0, 0],
            [0, 3, 1, -4, 5, 1, 0, 0, 0, 0],
            [0, 0, 1, 2, 5, -7, 5, 0, 0, 0],
            [0, 0, 0, 5, 1, 6, 3, 2, 0, 0],
            [0, 0, 0, 0, 2, 2, 7, -1, 4, 0],
            [0, 0, 0, 0, 0, 2, 1, -1, 4, -3],
            [0, 0, 0, 0, 0, 0, 2, -2, 1, 5],
            [0, 0, 0, 0, 0, 0, 0, -1, 4, 8],
        ])
        self.vec = np.array([8, 33, 8, 24, 29, 98, 99, 17, 57, 108])
        self.expected = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    def test_pp1(self):
        sol = pp.solve(self.mat, self.vec)
        assert_almost_equal(sol, self.expected)

    def test_pp2(self):
        sol = pp.solve(self.mat, self.vec, solver=2)
        assert_almost_equal(sol, self.expected)

    def test_validity(self):
        sol = self.mat @ self.expected.reshape((10, 1))
        assert_almost_equal(sol.reshape((10,)), self.vec)
```
In addition to the above, I suggest some minor issues that I think will improve the code:
```python
import warnings

warnings.warn("Could not import the Cython solver, falling back to Python solver", ImportWarning)
```
Add a Solvers enum that wraps the possible solvers. This will make the code easier to read and more explicit about which solver is which (without having to read the documentation to find the number to use). For example:

```python
from enum import IntEnum


class Solvers(IntEnum):
    PTRANS_I = 1
    PTRANS_II = 2
    LAPACK_BANDED = 3
    SP_SOLVE = 4
    SP_SOLVE_UMF = 5
```
Since IntEnum is a subclass of int, you wouldn't have to change any of your other code.
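A quick self-contained demonstration of that int compatibility (using the same member names as suggested above):

```python
from enum import IntEnum


class Solvers(IntEnum):
    PTRANS_I = 1
    PTRANS_II = 2
    LAPACK_BANDED = 3
    SP_SOLVE = 4
    SP_SOLVE_UMF = 5


# IntEnum members compare equal to plain ints and pass isinstance(int)
# checks, so existing integer-based dispatch keeps working unchanged.
chosen = Solvers.PTRANS_II
```

Callers could thus pass solver=Solvers.PTRANS_II or the bare solver=2 interchangeably.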
I am happy to discuss any of these points, just respond to this issue.
We could provide the solvers penta_solver1 and penta_solver2 for other Cython projects by providing a *.pxd file.
The Cython compiler options in this line

```cython
# cython: language_level=3, boundscheck=True, wraparound=False, cdivision=True
```
are counterproductive when the following checks are made here

```python
try:
    return penta_solver1(mat_flat, rhs)
except ZeroDivisionError:
    warnings.warn("pentapy: PTRANS-I not suitable for input-matrix.")
    return np.full_like(rhs, np.nan)
```
because cdivision=True explicitly prevents the compiled C code from ever raising a ZeroDivisionError. This branch of the code was always dead. This is also indicated by the coverage, which shows that these lines were never hit.
Actually, the code produced nan-values naturally, because a single nan will propagate and contaminate the full solution, but the error handling was never triggered. For the warning to be issued, cdivision=False is required here.
Even though the Cython file will be relatively yellow, there is no need to hesitate when setting cdivision=False. Cython does a very good job when it comes to handling the more elaborate division because it helps the branch predictor by making the zero case the unlikely one. With this, the performance impact is negligible.
I will fix this in the same go as #11.
Requiring the main diagonal of the inverse is not uncommon when solving pentadiagonal systems, e.g., during cross-validation for spline smoothing. Similar to #28, this should be easily doable from the factorization A = LU, where we have:
Since the main diagonal of L is known, this can also be written as A = LDU where
Now, special formulas can be applied for computing selected main diagonal elements of the inverse inv(A), as taken from the master's thesis: Fasllija, E., Parallel computation of the diagonal of the inverse of a sparse matrix, 2017, pp. 8-9. All of this applies in our case, and it would be a minimum effort to add a function that computes this from a factorization.
Given that one can rely solely on U, the fact that L's lower triangle is dense should not pose a problem.
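The A = LDU rewrite itself is easy to verify with scipy's dense LU as a stand-in (scipy's convention is mirrored: its L is unit lower triangular, so the diagonal is pulled out of U instead):

```python
import numpy as np
from scipy.linalg import lu

# Sketch of the A = L D U rewrite, not pentapy internals.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 5.0 * np.eye(5)
P, L, U = lu(A)                  # A = P @ L @ U, L unit lower triangular
d = np.diag(U)
D = np.diag(d)
U_unit = U / d[:, None]          # rescale rows so U_unit has a unit diagonal
# now A = P @ L @ D @ U_unit: both triangular factors have unit diagonals
diag_inv_ref = np.diag(np.linalg.inv(A))   # dense O(n^3) reference value
```

The proposed feature would recover the same diag(inv(A)) vector in O(n) from the banded factors.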