linear_operator's Issues

[Bug] `pivoted_cholesky` fails with KeOps kernels

πŸ› Bug

When using a KeOps kernel in GPyTorch and making predictions (with data larger than settings.min_preconditioning_size), the pivoted Cholesky decomposition fails. This seems to be due to covar_func returning a pykeops LazyTensor rather than a LinearOperator, which makes to_dense() fail.

To reproduce

model is a SingleTaskGP with train_inputs of size [100, 2000, 6].

We use a KeOps kernel as laid out in the GPyTorch tutorials.

gpytorch.kernels.ScaleKernel(
    base_kernel=gpytorch.kernels.keops.MaternKernel(
        nu=2.5,
        ard_num_dims=ard_num_dims,
        batch_shape=batch_shape,
        lengthscale_prior=gpytorch.priors.torch_priors.GammaPrior(3.0, 6.0),
    ),
    batch_shape=batch_shape,
    outputscale_prior=gpytorch.priors.torch_priors.GammaPrior(2.0, 0.15),
)
# this fails
preds = model(X)  # X being shape [100, 2000, 6] in my case

** Stack trace/error message **

  File "/fsx/home_dirs/fegt/BOSS/boss/acquisition.py", line 137, in forward
    preds = model(X)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_gp.py", line 333, in __call__
    ) = self.prediction_strategy.exact_prediction(full_mean, full_covar)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_prediction_strategies.py", line 289, in exact_prediction
    self.exact_predictive_mean(test_mean, test_train_covar),
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_prediction_strategies.py", line 306, in exact_predictive_mean
    if len(self.mean_cache.shape) == 4:
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/utils/memoize.py", line 59, in g
    return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_prediction_strategies.py", line 256, in mean_cache
    mean_cache = train_train_covar.evaluate_kernel().solve(train_labels_offset).squeeze(-1)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 2334, in solve
    return func.apply(self.representation_tree(), False, right_tensor, *self.representation())
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/functions/_solve.py", line 53, in forward
    solves = _solve(linear_op, right_tensor)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/functions/_solve.py", line 20, in _solve
    preconditioner = linear_op.detach()._solve_preconditioner()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 806, in _solve_preconditioner
    base_precond, _, _ = self._preconditioner()

# Starting from here, we fail because we exceed settings.min_preconditioning_size

  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/added_diag_linear_operator.py", line 126, in _preconditioner
    self._piv_chol_self = self._linear_op.pivoted_cholesky(rank=max_iter)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 1965, in pivoted_cholesky
    res, pivots = func(self.representation_tree(), rank, error_tol, *self.representation())
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/functions/_pivoted_cholesky.py", line 24, in forward
    matrix_diag = matrix._approx_diagonal()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/constant_mul_linear_operator.py", line 74, in _approx_diagonal
    res = self.base_linear_op._approx_diagonal()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 492, in _approx_diagonal
    return self._diagonal()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/utils/memoize.py", line 59, in g
    return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/kernel_linear_operator.py", line 233, in _diagonal
    diag_mat = to_dense(self.covar_func(x1, x2, **tensor_params, **self.nontensor_params))
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 2987, in to_dense
    raise TypeError("object of class {} cannot be made into a Tensor".format(obj.__class__.__name__))
TypeError: object of class LazyTensor cannot be made into a Tensor

Expected Behavior

The LazyTensor should probably somehow be cast to a KernelLinearOperator.
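
As a stopgap (an untested sketch, not a fix), disabling the preconditioner sidesteps the pivoted Cholesky path entirely; this assumes model and X from the snippet above and may slow down CG on large problems:

import gpytorch

# Workaround sketch: a preconditioner size of 0 skips the pivoted Cholesky
# preconditioner, so the KeOps LazyTensor is never passed to to_dense().
with gpytorch.settings.max_preconditioner_size(0):
    preds = model(X)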

System information

Please complete the following information:

  • linear-operator==0.5.1
  • pykeops==2.1.2
  • gpytorch==1.11
  • torch==2.0.1


`DenseLinearOperator` does not have `operator` attribute

I ran into this while porting over this PR here.

When I execute the test suite, I get 122 errors of the form

AttributeError: 'DenseLinearOperator' object has no attribute 'operator'

When I add an operator attribute here and set it to tsr, all tests pass.

It seems like @gpleiss is about to merge this, and I am not sure if there are any other pending changes coming into this repository. Let me know if I should hold off until the merge is complete, or if I should make a PR for this.
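
For reference, a minimal sketch of the workaround described above (hypothetical; it assumes DenseLinearOperator keeps its dense tensor in the .tensor attribute):

import torch
from linear_operator.operators import DenseLinearOperator

# Hypothetical patch sketch: alias the wrapped tensor under the attribute name
# that the ported test suite looks for.
op = DenseLinearOperator(torch.randn(3, 3))
if not hasattr(op, "operator"):
    op.operator = op.tensor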

[Bug] Parameters missing from graph when KeOpsLinearOperator is used

πŸ› Bug

I have implemented a KeOps periodic kernel in cornellius-gp/gpytorch#2296; however, the raw_lengthscale parameter does not have gradients computed (see cornellius-gp/gpytorch#2296 (comment)). I have managed to track the issue down to the matrix multiplication implemented in LinearOperator.matmul (see Expected Behavior). This matrix multiplication is called when doing LinearOperator.sum (as in the example below).

To reproduce

** Code snippet to reproduce **

import torch
from gpytorch.kernels.keops import PeriodicKernel as KeOpsPeriodicKernel #implementation from pull request 2296
import gpytorch

torch.manual_seed(7)

M, N, D = 1000, 2000, 3
x = torch.randn(M, D).double()
y = torch.randn(N, D).double()
k = KeOpsPeriodicKernel(ard_num_dims=3).double()
k.lengthscale = torch.tensor(1.0).double()
k.period_length = torch.tensor(1.0).double()

# context manager used so that type(covar) is KeOpsLinearOperator, not LazyEvaluatedKernelTensor
with gpytorch.settings.lazily_evaluate_kernels(False):
    covar = k(x, y)
    print(type(covar))
    # Calls `LinearOperator.sum`, which subsequently calls `LinearOperator.matmul`
    # `LinearOperator.matmul` uses a custom torch.Function for matrix multiplication
    res2 = covar.sum(dim=1) # res2 is a torch.Tensor here
    res2 = res2.sum()
    print(res2)
    g_x = torch.autograd.grad(res2, [k.raw_lengthscale, k.raw_period_length])
    print(g_x)

** Stack trace/error message **

<class 'linear_operator.operators.keops_linear_operator.KeOpsLinearOperator'>
tensor(202237.5145, dtype=torch.float64, grad_fn=<SumBackward0>)
Traceback (most recent call last):
  File "/home/julian/Desktop/test/keops_periodic_low_level/issue_keops_linear_operator.py", line 23, in <module>
    g_x = torch.autograd.grad(res2, [k.raw_lengthscale, k.raw_period_length])
  File "/home/julian/.venv/ichor/lib/python3.10/site-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

where k.raw_lengthscale is causing the issue.

Expected Behavior

Compute the gradients for both k.raw_lengthscale and k.raw_period_length. The exact place where the issue occurs is here:

return (self @ ones).squeeze(-1)

which calls LinearOperator.matmul, which loses track of the gradients. I am summing across the columns in the example, but the same issue occurs if summing across the rows. Adding the following check

# Case: summing across columns
if dim == (self.dim() - 1):
    ones = torch.ones(self.size(-1), 1, dtype=self.dtype, device=self.device)
    from .keops_linear_operator import KeOpsLinearOperator
    if isinstance(self, KeOpsLinearOperator):
        return self.covar_mat.sum(dim=1)
    return (self @ ones).squeeze(-1)

gives gradients for both raw_lengthscale and raw_period_length as the custom Matmul is never called.

<class 'linear_operator.operators.keops_linear_operator.KeOpsLinearOperator'>
tensor(202237.5145, dtype=torch.float64, grad_fn=<SumBackward0>)
(tensor([[70682.4975, 70796.7652, 70631.9364]], dtype=torch.float64), tensor([[ 70.2535,  19.3231, -47.2902]], dtype=torch.float64))

This is probably not the best solution; perhaps the Matmul forward/backward methods could be changed so that the gradients are computed correctly?

As a check, the same numbers are returned if the normal periodic kernel is used

<class 'linear_operator.operators.dense_linear_operator.DenseLinearOperator'>
tensor(202237.5145, dtype=torch.float64, grad_fn=<SumBackward0>)
(tensor([[70682.4975, 70796.7652, 70631.9364]], dtype=torch.float64), tensor([[ 70.2535,  19.3231, -47.2902]], dtype=torch.float64))
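
In the meantime, a user-space analogue of the check above can serve as a hedged workaround (a sketch only; it relies on KeOpsLinearOperator exposing its pykeops LazyTensor as covar_mat, and reuses k, x, and y from the snippet above):

import torch
import gpytorch

# Workaround sketch: reduce the pykeops LazyTensor directly so autograd never
# routes through the custom Matmul function.
with gpytorch.settings.lazily_evaluate_kernels(False):
    covar = k(x, y)  # KeOpsLinearOperator, as above
    res = covar.covar_mat.sum(dim=1).sum()
    g_x = torch.autograd.grad(res, [k.raw_lengthscale, k.raw_period_length])
    print(g_x)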

System information

Please complete the following information:
linear_operator version: 0.3.0
torch version: 1.13.1+cu117


[Bug] Failing to slice the CatLinearOperator when indices are negative or when using a boolean array

πŸ› Bug

When slicing the CatLinearOperator using a negative index, the final shape of the slice does not match the expected shape and an error is returned. This fails at least for ToeplitzLinearOperator, the DiagLinearOperator and the IdentityLinearOperator.

To reproduce

** Code snippet to reproduce **

from linear_operator.operators import IdentityLinearOperator as Ops
from linear_operator.operators import cat as cat_ops

N = 8
base = cat_ops([Ops(N) for _ in range(2)], dim=1)
print(base.shape)  # should be (8, 16)
print(base[:, 3:base.shape[-1] - 3].shape)  # should be (8, 10)
print(base[:, 3:-3].shape)  # fails...

** Stack trace/error message **

torch.Size([8, 16])
torch.Size([8, 10])
Traceback (most recent call last):
  File "/home/moise/Program/moise/linear_operator/debug.py", line 8, in <module>
    print(base[:,3:-3].shape) #fail...
  File "/home/moise/Program/moise/linear_operator/linear_operator/operators/_linear_operator.py", line 2870, in __getitem__
    raise RuntimeError(
RuntimeError: CatLinearOperator.__getitem__ failed! Expected a final shape of size torch.Size([8, 10]), got torch.Size([8, 5]). This is a bug with LinearOperator, or your custom LinearOperator.

Expected Behavior

The slice should behave the same way as it does when using positive indices.

System information

LinearOperator Version 0.5.2
PyTorch Version 2.0.1
Ubuntu 22.04
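
A possible interim workaround (a sketch, assuming base from the snippet above) is to normalize negative slice bounds to positive ones before indexing:

# Workaround sketch: convert the negative bounds with slice.indices() first.
start, stop, _ = slice(3, -3).indices(base.shape[-1])  # -> (3, 13, 1) for size 16
print(base[:, start:stop].shape)  # torch.Size([8, 10])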

Python 3.7 support - Google Colab related

Hi,

Thanks for the nice library. I just saw the package on PyPI (https://pypi.org/project/linear-operator/) and wanted to quickly try it on Google Colab (which currently runs Python 3.7.14), but haven't managed to get it set up yet due to the required Python version being >=3.8:
https://github.com/cornellius-gp/linear_operator/blob/main/setup.py

Would it be possible to relax the Python requirement to support running on Google Colab?

For the record, the error message was also rather difficult to read:

!pip install linear-operator

led to

ERROR: Could not find a version that satisfies the requirement linear-operator (from versions: none)

while a more explicit command was a bit more readable:

!pip install https://files.pythonhosted.org/packages/e6/3c/a2cbf56429c4e370cdcd76155fc1068d21a812c67655244f0776a27697c7/linear_operator-0.1.1-py3-none-any.whl

led to

ERROR: Package 'linear-operator' requires a different Python: 3.7.14 not in '>=3.8'

Best,
Tom

Missing potential `unsqueeze` for `initial_guess` in `linear_cg`

If initial_guess is a vector, linear_cg doesn't do the right thing, which leads to CUDA running out of memory. Similar to

is_vector = rhs.ndimension() == 1
if is_vector:
    rhs = rhs.unsqueeze(-1)

a check for the need to unsqueeze initial_guess should probably be performed here:
if initial_guess is None:
    initial_guess = torch.zeros_like(rhs)
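
A minimal sketch of the suggested handling (a hypothetical helper, not the library's code):

import torch

def prepare_initial_guess(rhs: torch.Tensor, initial_guess=None) -> torch.Tensor:
    # Mirror the rhs handling above: give a 1-D initial_guess a trailing
    # column dimension so it broadcasts like the unsqueezed rhs.
    if initial_guess is None:
        return torch.zeros_like(rhs)
    if initial_guess.ndimension() == 1:
        initial_guess = initial_guess.unsqueeze(-1)
    return initial_guess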

Steps to reproduce the bug on colab:

import sys
print("Python version:",sys.version)

import torch
print('PyTorch version:',torch.__version__)

torchdevice = torch.device('cpu')
if torch.cuda.is_available():
  torchdevice = torch.device('cuda')
  print('Default GPU is ' + torch.cuda.get_device_name(torch.device('cuda')))
print('Running on ' + str(torchdevice))

# see https://github.com/cornellius-gp/linear_operator/issues/20
!pip install gpytorch
import gpytorch
print('gpytorch version:',gpytorch.__version__)

# Dimension of the square sparse matrix
n = 30000
# Number of non-zero elements (up to duplicates)
nnz = 10000

# Create sparse SPD matrix
rowidx = torch.randint(low=0, high=n, size=(nnz,), device=torchdevice)
colidx = torch.randint(low=0, high=n, size=(nnz,), device=torchdevice)
itemidx = torch.vstack((rowidx,colidx))
values = torch.randn(nnz, device=torchdevice)
sparse_mat = torch.sparse_coo_tensor(itemidx, values, size=(n,n)).coalesce()
sparse_mat = sparse_mat.t() @ sparse_mat
id_mat = torch.sparse_coo_tensor(torch.stack((torch.arange(n,device=torchdevice),torch.arange(n,device=torchdevice))), torch.ones(n,device=torchdevice), (n,n))
sparse_mat = id_mat+sparse_mat
print('sparse_mat:',sparse_mat)

b = torch.randn(n, device=torchdevice)

# unsqueezing x0 fixes the issue
x0 = torch.zeros_like(b)

closure = lambda xtest : sparse_mat @ xtest
x = gpytorch.utils.linear_cg(closure, b, initial_guess=x0)
print('cg:',x)

[Bug] solve_triangular(Tensor, LinearOperator) not supported

πŸ› Bug

Passing a linear operator as the right-hand-side argument to solve() does not work if the solve decides to use solve_triangular().

To reproduce

Code snippet to reproduce

import torch
from linear_operator import to_linear_operator

a_root = torch.tensor([[1.0, 0.5, 2.0], [0.0, 0.1, 0.2], [0.0, 0.0, 2.5]])
a = a_root @ a_root.T
b = torch.tensor([[3.0, 1.0], [4.0, 4.0], [1.0, 3.0]])
torch.linalg.solve(a, b)
torch.linalg.solve(to_linear_operator(a), b)
torch.linalg.solve(to_linear_operator(a), to_linear_operator(b))

Stack trace/error message

Traceback (most recent call last):
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/triangular_linear_operator.py", line 68, in _cholesky_solve
    res = self._tensor._cholesky_solve(rhs=rhs, upper=upper)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/dense_linear_operator.py", line 31, in _cholesky_solve
    return torch.cholesky_solve(rhs, self.to_dense(), upper=upper)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2785, in __torch_function__
    raise NotImplementedError(f"torch.{name}({arg_classes}, {kwarg_classes}) is not implemented.")
NotImplementedError: torch.cholesky_solve(DenseLinearOperator, Tensor, upper=bool) is not implemented.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/path/to/virtualenv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3460, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-29-e6878d4d649f>", line 1, in <module>
    torch.linalg.solve(to_linear_operator(a), to_linear_operator(b))
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2789, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2205, in solve
    return func.apply(self.representation_tree(), False, right_tensor, *self.representation())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/functions/_solve.py", line 53, in forward
    solves = _solve(linear_op, right_tensor)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/functions/_solve.py", line 17, in _solve
    return linear_op.cholesky()._cholesky_solve(rhs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/triangular_linear_operator.py", line 76, in _cholesky_solve
    w = self.solve(rhs)
        ^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/triangular_linear_operator.py", line 181, in solve
    res = torch.linalg.solve_triangular(self.to_dense(), right_tensor, upper=self.upper)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2775, in __torch_function__
    raise NotImplementedError(f"torch.{name}({arg_classes}, {kwarg_classes}) is not implemented.")
NotImplementedError: torch.linalg.solve_triangular(Tensor, DenseLinearOperator, upper=bool) is not implemented.

Expected Behavior

A solution is found.

System information

Please complete the following information:

  • linear_operator 0.3.0
  • pytorch 1.13.1
  • Fedora Linux 37

Additional context

NNVariationalStrategy in GPyTorch can end up making such a call (here).
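
Until this is supported, an interim workaround (a sketch based on the reproduction above) is to pass the right-hand side as a plain tensor, which is the variant that already works:

import torch
from linear_operator import to_linear_operator

# Workaround sketch: keep the left operand lazy and densify only the RHS.
a_root = torch.tensor([[1.0, 0.5, 2.0], [0.0, 0.1, 0.2], [0.0, 0.0, 2.5]])
a = a_root @ a_root.T
b = torch.tensor([[3.0, 1.0], [4.0, 4.0], [1.0, 3.0]])
x = torch.linalg.solve(to_linear_operator(a), b)  # plain-tensor RHS works per the report above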

`LinearOperatorTestCase`'s `_test_triangular_linear_op_inv_quad_logdet` is skipped and fails otherwise

LinearOperatorTestCase's _test_triangular_linear_op_inv_quad_logdet is not executed in the current test suite because of the leading underscore. When this underscore is removed, the test executes but fails.

In particular, the test calls _test_inv_quad_logdet and checks whether this makes a call to Cholesky. However, derived test classes like TestLowRankRootAddedDiagLinearOperator, TestLowRankRootAddedDiagLinearOperatorBatch and TestSumKroneckerLinearOperator apparently don't use Cholesky-based inference for this step (see here for the low-rank test), which makes the test fail.

[Bug] Indexing `ConstantMulLinearOperator` with a `SumBatchLinearOperator` base operator

πŸ› Bug

To reproduce

from torch import rand
from linear_operator import operators as ops

A = ops.DenseLinearOperator(rand(4, 3, 2, 2))
B = ops.SumBatchLinearOperator(A, block_dim=-3)
C = ops.ConstantMulLinearOperator(B, rand([]))
C[:, -1:, :].to_dense()

The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-e4d937430d51> in <module>
----> 1 C[:, -1:, :].to_dense()
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/sum_batch_linear_operator.py in to_dense(self)
     59 
     60     def to_dense(self):
---> 61         return self.base_linear_op.to_dense().sum(dim=-3)  # BlockLinearOperators always use dim3 for the block_dim
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/utils/memoize.py in g(self, *args, **kwargs)
     57         kwargs_pkl = pickle.dumps(kwargs)
     58         if not _is_in_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl):
---> 59             return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
     60         return _get_from_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl)
     61 
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/constant_mul_linear_operator.py in to_dense(self)
    164     def to_dense(self):
    165         res = self.base_linear_op.to_dense()
--> 166         return res * self.expanded_constant
    167 
    168     @cached(name="root_decomposition")
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1

Expected Behavior

The code is expected to behave in the same way its dense analogue would.

Symmetric matrices and convolution operators

Hi!
I'm interested in the library to reduce the memory and computational complexity of computing covariance matrices of a convolutional operation. The covariance is symmetric by construction, which would approximately halve the burden. The result may be consumed by other convolutional layers/other operations, or in some cases only the diagonal may be kept (which should in turn reduce the burden to the square root of the full covariance).

Edit: it would actually be symmetric tensors with shape channel*width*height*channel*width*height

Is this possible with the library at the moment? Should there be concerns about performance relative to the cuDNN implementations of convolutions?

A possible approach I thought of involved unfolding the input, which I'm not sure is fully supported by the library, and which is somewhat inefficient in regular PyTorch.

[Feature Request] Block matrix support

πŸš€ Feature Request

Represent [TN, TM] tensors by T x T blocks of N x M lazy tensors. While block matrices are supported, the efficient representation exists only when there is a diagonal structure over the T dimensions.

Motivation

Here is an example that linear_operator cannot deal with:

import torch
import itertools

T, N, M = 2, 4, 3
As = [torch.rand(N, M) for _ in range(T)]
Bs = [[torch.rand(M, M) for _ in range(T)] for _ in range(T)]
Cs = [torch.rand(N, N) for _ in range(T)]
L = torch.rand(T, T)

A_bl = torch.zeros((N * T, M * T))  # BlockDiag (non-square)
B_bl = torch.zeros((M * T, M * T))  # Dense
C_bl = torch.zeros((N * T, N * T))  # BlockDiag
L_bl = torch.kron(L, torch.eye(N))  # Kronecker

for t in range(T):
    A_bl[N * t : N * (t + 1), M * t : M * (t + 1)] = As[t]
    C_bl[N * t : N * (t + 1), N * t : N * (t + 1)] = Cs[t]

for t1, t2 in itertools.product(range(T), range(T)):
    B_bl[M * t1 : M * (t1 + 1), M * t2 : M * (t2 + 1)] = Bs[t1][t2]

# Desired calculation
print("inefficient method")
print(torch.diag(L_bl @ (C_bl + A_bl @ B_bl @ A_bl.T) @ L_bl.T))

This calculation turns up in some multi-output GP models. It has a straightforward efficient implementation:

M_diag = {}
# We only need the diagonal of each block of M
for t1, t2 in itertools.product(range(T), range(T)):
    r = (As[t1].T * (Bs[t1][t2] @ As[t2].T)).sum(0)
    if t1 == t2:
        r += torch.diag(Cs[t1])
    M_diag[(t1, t2)] = r

# The rotation is applied blockwise due to the kron structure
R = {}
for t in range(T):  # we don't need the off-diag blocks
    r = 0
    for i1, i2 in itertools.product(range(T), range(T)):
        r += L[t, i1] * M_diag[(i1, i2)] * L[t, i2]
    R[t] = r

print("fast way")
print(torch.concat([R[t] for t in range(T)]))

Currently, this calculation could be implemented inside linear_operator like this

from linear_operator.operators import (
    to_linear_operator,
    IdentityLinearOperator,
    BlockDiagLinearOperator,
    BlockLinearOperator,
    MatmulLinearOperator,
    KroneckerProductLinearOperator,
)


class BlockDiagLinearOperatorNonSquare(BlockLinearOperator):
    _add_batch_dim = BlockDiagLinearOperator._add_batch_dim
    _remove_batch_dim = BlockDiagLinearOperator._remove_batch_dim
    _get_indices = BlockDiagLinearOperator._get_indices
    _size = BlockDiagLinearOperator._size
    num_blocks = BlockDiagLinearOperator.num_blocks

    def __init__(self, base_linear_op, block_dim=-3):
        super().__init__(base_linear_op, block_dim)

A_lo = BlockDiagLinearOperatorNonSquare(torch.stack(As, 0))
B_lo = to_linear_operator(B_bl)
C_lo = BlockDiagLinearOperator(to_linear_operator(torch.stack(Cs, 0)))
L_lo = KroneckerProductLinearOperator(L, IdentityLinearOperator(N))

M = MatmulLinearOperator(A_lo, (MatmulLinearOperator(B_lo, A_lo.T)))

print("using linear operator, with to_dense()")
print(
    MatmulLinearOperator(L_lo, MatmulLinearOperator(C_lo + M, L_lo.T))
    .to_dense()
    .diagonal()
)

Removing the to_dense() gives an error, however.

Pitch

Add a block linear operator class that can keep track of the [T, T] block structure, represented as T^2 lazy tensors of the same shape. Implement matrix multiplication between block matrices in terms of the appropriate linear operators on the blocks.

As a workaround, I have written manual implementations of specific cases, such as the one above.

I'm willing to work on a PR for this.

Additional context

None

[Feature Request] Tracking torch namespace function coverage

πŸš€ Feature Request

Currently, there are a number of functions in the torch namespace that are not registered to dispatch to LinearOperator. Closing this gap will be necessary to make LinearOperator a true drop-in replacement for torch.Tensor.

The purpose of this issue is to track which functions are currently not registered, and prioritize the work on closing these gaps. I will try to keep this issue updated with things that come up and cross out things as we close them out.

Missing functions

  • torch.linalg.solve_triangular - #29
  • torch.linalg.cholesky_ex
  • torch.isclose
  • ...
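
A quick way to probe whether a given function is registered (a sketch; unregistered functions raise NotImplementedError from __torch_function__, as in the tracebacks elsewhere on this page):

import torch
from linear_operator.operators import DiagLinearOperator

# Probe sketch: unregistered torch-namespace functions fail loudly.
A = DiagLinearOperator(torch.rand(3))
try:
    torch.isclose(A, A)
    print("torch.isclose dispatches to LinearOperator")
except NotImplementedError as err:
    print("not registered:", err)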

[Bug] linear_cg fails on sparse matrices

πŸ› Bug

The linear_cg function fails to produce correct values when operating on sparse matrices.

To reproduce

** add the following code snippet to test_linear_cg.py to reproduce **

    def test_cg_with_sparse(self):
        # Create a sample sparse CSR tensor
        N = size = 1000
        nnz = 10000
        indices = torch.randint(0, N, (2, nnz), dtype=torch.int32)
        values = torch.randn(nnz, dtype=torch.float64)
        matrix = torch\
            .sparse_coo_tensor(indices, values, (N, N), dtype=torch.float64)\
            .to_sparse_csr()

        # set up vector rhs
        rhs = torch.randn(size, dtype=torch.float64)

        # basic solve
        solves = linear_cg(matrix.matmul, rhs=rhs, max_iter=size)

        # solve with init value
        init = torch.randn(size, dtype=torch.float64)
        solves_with_init = linear_cg(matrix.matmul, rhs=rhs, max_iter=size, initial_guess=init)

        # Check cg
        matrix, rhs = matrix.to_dense(), rhs.to_dense()
        actual = torch.linalg.solve(matrix, rhs)
        self.assertTrue(torch.allclose(solves, actual, atol=1e-3, rtol=1e-4))
        self.assertTrue(torch.allclose(solves_with_init, actual, atol=1e-3, rtol=1e-4))

** Stack trace/error message **

NumericalWarning: CG terminated in 1000 iterations with average residual norm 97.1439134614844 which is larger than the tolerance of 1 specified by linear_operator.settings.cg_tolerance. If performance is affected, consider raising the maximum number of CG iterations by running code in a linear_operator.settings.max_cg_iterations(value) context.

FAIL: test_cg_with_sparse (__main__.TestLinearCG)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/test_linear_cg.py", line 90, in test_cg_with_sparse
    self.assertTrue(torch.allclose(solves, actual, atol=1e-3, rtol=1e-4))
AssertionError: False is not true

----------------------------------------------------------------------

Expected Behavior

Expect the result of running linear_cg to be close to that of torch.linalg.solve.

System information

Please complete the following information:

  • LinearOperator Version 0.3.0
  • PyTorch Version 2.0.0+cu117
  • macOS Monterey 12.6.3
  • Python version 3.10.9

Additional context

Other warnings produced when running the test:

  warnings.warn(
.../path/test_linear_cg.py:115: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at ../aten/src/ATen/SparseCsrTensorImpl.cpp:54.)
  .to_sparse_csr()
/path/lib/python3.10/site-packages/linear_operator/utils/linear_cg.py:323: NumericalWarning: CG terminated in 1000 iterations with average residual norm 97.1439134614844 which is larger than the tolerance of 1 specified by linear_operator.settings.cg_tolerance. If performance is affected, consider raising the maximum number of CG iterations by running code in a linear_operator.settings.max_cg_iterations(value) context.

I suppose this may also be a PyTorch issue but I am not sure how to investigate further from here.
Appreciate your time to review this issue. Thanks!

Upgrade required typeguard version

Greetings,

LinearOperator seems to require typeguard~=2.13.3, which is from December 2021, while typeguard is currently at v4.1.5. It would be nice to upgrade this dependency to something more recent.

Thank you.

[Bug] `mul` with mixed LinearOperator and Tensor gives incorrect shape

πŸ› Bug

Calling the LinearOperator mul method with a torch.Tensor argument results in the wrong shape (although converting the result to a regular tensor appears to give the correct values). This caused GPyTorch to crash when using a ScaleKernel with an AdditiveKernel (in the multi-task case only).

To reproduce

import torch
from linear_operator import to_linear_operator

a = torch.randn(2, 1, 1)
b = torch.randn(1, 1000, 10000)

print(b.mul(a).shape)

a_lo = to_linear_operator(a)
b_lo = to_linear_operator(b)

print(b_lo.mul(a).shape)
print(b_lo.mul(a).to_dense().shape)

** Output **

torch.Size([2, 1000, 10000])
torch.Size([1, 1000, 10000])
torch.Size([2, 1000, 10000])

System information

linear-operator==0.1.1

ZeroLinearOperators used with CatLinearOperators create representation issues.

Discussed in #78

Originally posted by AndreaBraschi September 13, 2023
Hi all,

I've written a function to concatenate 2 square LinearOperator objects with different shapes:

from linear_operator.operators import (
    CatLinearOperator,
    KroneckerProductLinearOperator,
    SumLinearOperator,
    ZeroLinearOperator,
)

def cat_LinearOperators(*linear_operators):
    oper_A = linear_operators[0]
    oper_B = linear_operators[1]
    oper_A_size = oper_A.shape[0]
    oper_B_size = oper_B.shape[0]

    # mat_A
    a = ZeroLinearOperator(oper_A_size, oper_B_size)
    cat_1 = CatLinearOperator(a, oper_A, dim=1)
    c = ZeroLinearOperator(oper_B_size, oper_B_size)
    cat_2 = CatLinearOperator(a.transpose(-1, -2), c, dim=1)
    tensor_A = CatLinearOperator(cat_2, cat_1, dim=0)

    # mat_B
    e = ZeroLinearOperator(oper_B_size, oper_A_size)
    cat_3 = CatLinearOperator(oper_B, e, dim=1)
    c = ZeroLinearOperator(oper_A_size, oper_A_size)
    cat_4 = CatLinearOperator(c, e, dim=0)
    tensor_B = CatLinearOperator(cat_3, cat_4.transpose(-1, -2), dim=0)

    cat_operator = SumLinearOperator(tensor_A, tensor_B)

    return cat_operator

The function works, but I get the following RuntimeError when I evaluate the resulting CatLinearOperator:
Representation of a LazyTensor should consist only of Tensors.

You can evaluate the function yourself:

import torch

tensor_A = torch.randn(38, 38)
tensor_B = torch.randn(50, 50)
tensor_C = torch.randn(4, 4)
Kron_A = KroneckerProductLinearOperator(tensor_A, tensor_B)
Kron_B = KroneckerProductLinearOperator(tensor_C, tensor_B)
cat_operator = cat_LinearOperators(Kron_B, Kron_A)
cat_operator.to_dense()

I'm starting to believe that the mistake is in how I use the ZeroLinearOperator object?

Any help or suggestion will be greatly appreciated.

Many thanks,
Andrea

[Bug] ConstantDiagLinearOperator._mul_constant

πŸ› Bug

ConstantDiagLinearOperator._mul_constant seems to be bugged.

To reproduce

from torch import rand
from linear_operator import operators

A = operators.DenseLinearOperator(rand(4, 2, 3, 3))
B = A.add_jitter(1.0)
C = operators.ConstantMulLinearOperator(B, rand([]))
C[..., -1:, :]

Stack trace/error message

The size of tensor a (2) must match the size of tensor b (4) at non-singleton dimension 1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-11-690c97df3908> in <module>
----> 1 C[..., -1:, :]
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/_linear_operator.py in __getitem__(self, index)
   2632             res = self._get_indices(row_index, col_index, *batch_indices)
   2633         else:
-> 2634             res = self._getitem(row_index, col_index, *batch_indices)
   2635 
   2636         # If we selected a single row and/or column (or did tensor indexing), we'll be retuning a tensor
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/constant_mul_linear_operator.py in _getitem(self, row_index, col_index, *batch_indices)
    102         constant = self._constant.expand(self.batch_shape)[batch_indices]
    103         constant = constant.view(*constant.shape, 1, 1)
--> 104         return base_linear_op * constant
    105 
    106     def _matmul(self, rhs):
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/_linear_operator.py in __mul__(self, other)
   2665     @_implements_second_arg(torch.Tensor.mul)
   2666     def __mul__(self, other: Union[torch.Tensor, LinearOperator, float]) -> LinearOperator:
-> 2667         return self.mul(other)
   2668 
   2669     @_implements_second_arg(torch.Tensor.add)
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/_linear_operator.py in mul(self, other)
   1753                 return self._mul_constant(other.squeeze())
   1754             elif other.shape[-2:] == torch.Size((1, 1)):
-> 1755                 return self._mul_constant(other.view(*other.shape[:-2]))
   1756 
   1757         return self._mul_matrix(to_linear_operator(other))
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/sum_linear_operator.py in _mul_constant(self, other)
     44     def _mul_constant(self, other):
     45         # We're using a custom method here - the constant mul is applied to the base_linear_ops
---> 46         return self.__class__(*[lt._mul_constant(other) for lt in self.linear_ops])
     47 
     48     def _bilinear_derivative(self, left_vecs, right_vecs):
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/interpolated_linear_operator.py in _mul_constant(self, other)
    218         # This preserves the interpolated structure
    219         return self.__class__(
--> 220             self.base_linear_op._mul_constant(other),
    221             self.left_interp_indices,
    222             self.left_interp_values,
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/diag_linear_operator.py in _mul_constant(self, constant)
    272 
    273     def _mul_constant(self, constant: Tensor) -> "ConstantDiagLinearOperator":
--> 274         return self.__class__(self.diag_values * constant, diag_shape=self.diag_shape)
    275 
    276     def _mul_matrix(self, other: Union[Tensor, LinearOperator]) -> Union[Tensor, LinearOperator]:
RuntimeError: The size of tensor a (2) must match the size of tensor b (4) at non-singleton dimension 1

Additional context

The following seems to work, but I don't know enough to claim that it's a fix.

from unittest.mock import patch


def _mul_constant(self, constant):
    return self.__class__(self.diag_values * constant.unsqueeze(-1), diag_shape=self.diag_shape)


with patch.object(operators.ConstantDiagLinearOperator, "_mul_constant", new=_mul_constant):
    C[..., -1:, :]

Argument name inconsistency in setting the jitter value with GPyTorch

  • Change the __init__ argument names of the class _dtype_value_context.

    • Currently the init function for _dtype_value_context in linear_operator/settings.py is:
      def __init__(self, float=None, double=None, half=None):

    • This is not consistent with the init function for _dtype_value_context in gpytorch/settings.py, which is:
      def __init__(self, float_value=None, double_value=None, half_value=None):

    • I would suggest unifying the argument names, for example, making both __init__ functions:
      def __init__(self, float_value=None, double_value=None, half_value=None):

[Bug] Passing `other: Tensor` to `LinearOperator.to`.

πŸ› Bug

torch.Tensor.to allows the user to provide a reference tensor in lieu of explicitly passing device and dtype; however, this logic seems to be missing from LinearOperator.to.

To reproduce

import torch
from linear_operator.operators import DiagLinearOperator

A = DiagLinearOperator(torch.rand(3, dtype=torch.float64))
B = A.to(torch.rand(3, dtype=torch.float32))
B.dtype
> torch.float64

Expected Behavior

The code should either follow the same convention as torch.Tensor.to or raise an exception.
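
For comparison, a minimal sketch of the torch.Tensor.to behavior being referenced:

import torch

# torch.Tensor.to accepts another tensor and adopts its dtype (and device).
x = torch.rand(3, dtype=torch.float64)
y = x.to(torch.rand(3, dtype=torch.float32))
print(y.dtype)  # torch.float32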

Use a fixed random seed for the tests

πŸš€ Feature Request

Use a fixed random seed for the tests.

Motivation

Hello!

In a CI build of linear_operator, we are sometimes experiencing a build failure due to overly tight tolerances. For instance, here:

https://bordeaux.guix.gnu.org/build/cecb897d-366c-4d76-a501-7e7507978b53/log

The fact that it happens randomly makes it hard to build the package reliably.

Pitch

Could you please use a fixed random seed to run the tests, at least for CI builds? I don’t know enough Python to help much.

Best regards,

Vivien
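
A minimal sketch of one way to do this (hypothetical per-test seeding; not the project's current setup):

import unittest
import torch

class SeededTestCase(unittest.TestCase):
    # Fix the RNG state before every test so tolerance-based assertions
    # produce the same numbers on every CI build.
    def setUp(self):
        torch.manual_seed(0)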

Pythonic Sum is broken with LinearOperators

Attempting to use the Pythonic sum() on arbitrary iterables of LinearOperators raises a TypeError. This is because the built-in sum() function starts with int(0) and sums iteratively, as follows:

sum([a,b,..z]) = 0 + a + b + .. + z

Minimal reproducible example (with workaround)

import gpytorch
import torch
import traceback
import warnings

x1 = torch.rand([16,1])
x2 = torch.rand([16,1])

k1 = gpytorch.kernels.RBFKernel()(x1)
k2 = gpytorch.kernels.RBFKernel()(x2)

k3 = k1+k2

print("'+' operator works fine.")

# This does not
try:
    k4 = sum([k1,k2])
except Exception as e: traceback.print_exc()

# A workaround now: functools.reduce works, but is a little unsightly

from functools import reduce
import operator

k5 = reduce(operator.add,[k1,k2])

A Fix

This can be fixed by simply including logic for LinearOperator to add int(0) in the same way as it adds a ZeroLinearOperator, by returning self, as follows:

import numbers
# Specifically only allow numbers.Number without raising TypeError in the case where other==0
if isinstance(other, numbers.Number) and other==0:
    return self

I'd recommend adding this case at the end of the conditions in LinearOperator.__add__, on line 2571, so it's only checked after the more common use cases, as follows:

def __add__(self, other: Union[torch.Tensor, LinearOperator, float]) -> LinearOperator:
    from torch import Tensor

    from .added_diag_linear_operator import AddedDiagLinearOperator
    from .dense_linear_operator import to_linear_operator
    from .diag_linear_operator import DiagLinearOperator
    from .root_linear_operator import RootLinearOperator
    from .sum_linear_operator import SumLinearOperator
    from .zero_linear_operator import ZeroLinearOperator
    
    import numbers

    if isinstance(other, ZeroLinearOperator):
        return self
    elif isinstance(other, DiagLinearOperator):
        return AddedDiagLinearOperator(self, other)
    elif isinstance(other, RootLinearOperator):
        return self.add_low_rank(other.root)
    elif isinstance(other, Tensor):
        other = to_linear_operator(other)
        shape = torch.broadcast_shapes(self.shape, other.shape)
        new_self = self if self.shape[:-2] == shape[:-2] else self._expand_batch(shape[:-2])
        new_other = other if other.shape[:-2] == shape[:-2] else other._expand_batch(shape[:-2])
        return SumLinearOperator(new_self, new_other)
    elif isinstance(other, numbers.Number) and other==0:
        return self
    else:
        return SumLinearOperator(self, other)

[Bug] torch.cat fails for linear operators

πŸ› Bug

torch.cat fails for linear operators.

To reproduce

** Code snippet to reproduce **

from linear_operator.operators import DiagLinearOperator
import torch

D = DiagLinearOperator(torch.randn(2, 3, 100))  # Represents an operator of size 2 x 3 x 100
torch.cat([D, D], dim=-2)

** Stack trace/error message **

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/miniconda3/envs/env/lib/python3.12/site-packages/linear_operator/operators/_linear_operator.py", line 2948, in __torch_function__
    raise NotImplementedError(f"torch.{name}({arg_classes}, {kwarg_classes}) is not implemented.")
NotImplementedError: torch.cat(list, dim=int) is not implemented.

Expected Behavior

According to the documentation, torch.cat should work on linear operators.

System information

Please complete the following information:

  • LinearOperator version: 0.5.2
  • PyTorch Version: 2.2.1
  • OS: Ubuntu 20.04.6 LTS
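
Until torch.cat is registered for dispatch, a dense fallback works as a stopgap (a sketch; it loses the lazy structure):

import torch
from linear_operator.operators import DiagLinearOperator

# Workaround sketch: densify before concatenating.
D = DiagLinearOperator(torch.randn(2, 3, 100))
C = torch.cat([D.to_dense(), D.to_dense()], dim=-2)
print(C.shape)  # torch.Size([2, 3, 200, 100])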

Representation of a LinearOperator should consist only of Tensors

Duplicate of β€œZeroLinearOperators used with CatLinearOperators create representation issues” above, with the same code and error.

Originally posted by @AndreaBraschi in #78
