linear_operator's Issues

[Bug] `pivoted_cholesky` fails with KeOps kernels

πŸ› Bug

When using a KeOps kernel in GPyTorch and making predictions (with data larger than settings.min_preconditioning_size), the pivoted Cholesky decomposition fails. This seems to be due to covar_func returning a pykeops LazyTensor rather than a LinearOperator, which makes to_dense() fail.

To reproduce

model is a SingleTaskGP with train_inputs of size [100, 2000, 6].

We use a KeOps kernel as laid out in the GPyTorch tutorials.

gpytorch.kernels.ScaleKernel(
    base_kernel=gpytorch.kernels.keops.MaternKernel(
        nu=2.5,
        ard_num_dims=ard_num_dims,
        batch_shape=batch_shape,
        lengthscale_prior=gpytorch.priors.torch_priors.GammaPrior(3.0, 6.0),
    ),
    batch_shape=batch_shape,
    outputscale_prior=gpytorch.priors.torch_priors.GammaPrior(2.0, 0.15),
)
# this fails
preds = model(X)  # X being shape [100, 2000, 6] in my case

** Stack trace/error message **

  File "/fsx/home_dirs/fegt/BOSS/boss/acquisition.py", line 137, in forward
    preds = model(X)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_gp.py", line 333, in __call__
    ) = self.prediction_strategy.exact_prediction(full_mean, full_covar)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_prediction_strategies.py", line 289, in exact_prediction
    self.exact_predictive_mean(test_mean, test_train_covar),
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_prediction_strategies.py", line 306, in exact_predictive_mean
    if len(self.mean_cache.shape) == 4:
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/utils/memoize.py", line 59, in g
    return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/gpytorch/models/exact_prediction_strategies.py", line 256, in mean_cache
    mean_cache = train_train_covar.evaluate_kernel().solve(train_labels_offset).squeeze(-1)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 2334, in solve
    return func.apply(self.representation_tree(), False, right_tensor, *self.representation())
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/functions/_solve.py", line 53, in forward
    solves = _solve(linear_op, right_tensor)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/functions/_solve.py", line 20, in _solve
    preconditioner = linear_op.detach()._solve_preconditioner()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 806, in _solve_preconditioner
    base_precond, _, _ = self._preconditioner()

# Starting from here, we fail because we exceed settings.min_preconditioning_size

  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/added_diag_linear_operator.py", line 126, in _preconditioner
    self._piv_chol_self = self._linear_op.pivoted_cholesky(rank=max_iter)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 1965, in pivoted_cholesky
    res, pivots = func(self.representation_tree(), rank, error_tol, *self.representation())
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/functions/_pivoted_cholesky.py", line 24, in forward
    matrix_diag = matrix._approx_diagonal()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/constant_mul_linear_operator.py", line 74, in _approx_diagonal
    res = self.base_linear_op._approx_diagonal()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 492, in _approx_diagonal
    return self._diagonal()
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/utils/memoize.py", line 59, in g
    return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/kernel_linear_operator.py", line 233, in _diagonal
    diag_mat = to_dense(self.covar_func(x1, x2, **tensor_params, **self.nontensor_params))
  File "/nfs_home/users/fegt/.conda/envs/botorch/lib/python3.10/site-packages/linear_operator/operators/_linear_operator.py", line 2987, in to_dense
    raise TypeError("object of class {} cannot be made into a Tensor".format(obj.__class__.__name__))
TypeError: object of class LazyTensor cannot be made into a Tensor

Expected Behavior

The LazyTensor should probably somehow be cast to a KernelLinearOperator.
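
As a stopgap (an untested sketch, not a fix), disabling the preconditioner sidesteps the pivoted Cholesky path entirely; this assumes model and X from the snippet above and may slow down CG on large problems:

import gpytorch

# Workaround sketch: a preconditioner size of 0 skips the pivoted Cholesky
# preconditioner, so the KeOps LazyTensor is never passed to to_dense().
with gpytorch.settings.max_preconditioner_size(0):
    preds = model(X)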

System information

Please complete the following information:

  • linear-operator==0.5.1
  • pykeops==2.1.2
  • gpytorch==1.11
  • torch==2.0.1


`DenseLinearOperator` does not have `operator` attribute

I ran into this while porting over this PR here.

When I execute the test suite, I get 122 errors of the form

AttributeError: 'DenseLinearOperator' object has no attribute 'operator'

When I add an operator attribute here and set it to tsr, all tests pass.

It seems like @gpleiss is about to merge this, and I am not sure if there are any other pending changes coming into this repository. Let me know if I should hold off until the merge is complete, or if I should make a PR for this.
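
For reference, a minimal sketch of the workaround described above (hypothetical; it assumes DenseLinearOperator keeps its dense tensor in the .tensor attribute):

import torch
from linear_operator.operators import DenseLinearOperator

# Hypothetical patch sketch: alias the wrapped tensor under the attribute name
# that the ported test suite looks for.
op = DenseLinearOperator(torch.randn(3, 3))
if not hasattr(op, "operator"):
    op.operator = op.tensor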

[Bug] Parameters missing from graph when KeOpsLinearOperator is used

πŸ› Bug

I have implemented a KeOps periodic kernel in cornellius-gp/gpytorch#2296; however, the raw_lengthscale parameter does not have gradients computed (see cornellius-gp/gpytorch#2296 (comment)). I have managed to track the issue down to the matrix multiplication implemented in LinearOperator.matmul (see Expected Behavior). This matrix multiplication is called when doing LinearOperator.sum (as in the example below).

To reproduce

** Code snippet to reproduce **

import torch
from gpytorch.kernels.keops import PeriodicKernel as KeOpsPeriodicKernel #implementation from pull request 2296
import gpytorch

torch.manual_seed(7)

M, N, D = 1000, 2000, 3
x = torch.randn(M, D).double()
y = torch.randn(N, D).double()
k = KeOpsPeriodicKernel(ard_num_dims=3).double()
k.lengthscale = torch.tensor(1.0).double()
k.period_length = torch.tensor(1.0).double()

# context manager used so that type(covar) is KeOpsLinearOperator, not LazyEvaluatedKernelTensor
with gpytorch.settings.lazily_evaluate_kernels(False):
    covar = k(x, y)
    print(type(covar))
    # Calls `LinearOperator.sum`, which subsequently calls `LinearOperator.matmul`
    # `LinearOperator.matmul` uses a custom torch.Function for matrix multiplication
    res2 = covar.sum(dim=1) # res2 is a torch.Tensor here
    res2 = res2.sum()
    print(res2)
    g_x = torch.autograd.grad(res2, [k.raw_lengthscale, k.raw_period_length])
    print(g_x)

** Stack trace/error message **

<class 'linear_operator.operators.keops_linear_operator.KeOpsLinearOperator'>
tensor(202237.5145, dtype=torch.float64, grad_fn=<SumBackward0>)
Traceback (most recent call last):
  File "/home/julian/Desktop/test/keops_periodic_low_level/issue_keops_linear_operator.py", line 23, in <module>
    g_x = torch.autograd.grad(res2, [k.raw_lengthscale, k.raw_period_length])
  File "/home/julian/.venv/ichor/lib/python3.10/site-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

where k.raw_lengthscale is causing the issue.

Expected Behavior

Compute the gradients for both k.raw_lengthscale and k.raw_period_length. The exact place where the issue occurs is here:

return (self @ ones).squeeze(-1)

which calls LinearOperator.matmul, which loses track of the gradients. I am summing across the columns in the example, but the same issue occurs if summing across the rows. Adding the following check

# Case: summing across columns
if dim == (self.dim() - 1):
    ones = torch.ones(self.size(-1), 1, dtype=self.dtype, device=self.device)
    from .keops_linear_operator import KeOpsLinearOperator
    if isinstance(self, KeOpsLinearOperator):
        return self.covar_mat.sum(dim=1)
    return (self @ ones).squeeze(-1)

gives gradients for both raw_lengthscale and raw_period_length as the custom Matmul is never called.

<class 'linear_operator.operators.keops_linear_operator.KeOpsLinearOperator'>
tensor(202237.5145, dtype=torch.float64, grad_fn=<SumBackward0>)
(tensor([[70682.4975, 70796.7652, 70631.9364]], dtype=torch.float64), tensor([[ 70.2535,  19.3231, -47.2902]], dtype=torch.float64))

This is probably not the best solution; perhaps the Matmul forward/backward methods could be changed so that the gradients are computed correctly?

As a check, the same numbers are returned if the normal periodic kernel is used

<class 'linear_operator.operators.dense_linear_operator.DenseLinearOperator'>
tensor(202237.5145, dtype=torch.float64, grad_fn=<SumBackward0>)
(tensor([[70682.4975, 70796.7652, 70631.9364]], dtype=torch.float64), tensor([[ 70.2535,  19.3231, -47.2902]], dtype=torch.float64))
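
In the meantime, a user-space analogue of the check above can serve as a hedged workaround (a sketch only; it relies on KeOpsLinearOperator exposing its pykeops LazyTensor as covar_mat, and reuses k, x, and y from the snippet above):

import torch
import gpytorch

# Workaround sketch: reduce the pykeops LazyTensor directly so autograd never
# routes through the custom Matmul function.
with gpytorch.settings.lazily_evaluate_kernels(False):
    covar = k(x, y)  # KeOpsLinearOperator, as above
    res = covar.covar_mat.sum(dim=1).sum()
    g_x = torch.autograd.grad(res, [k.raw_lengthscale, k.raw_period_length])
    print(g_x)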

System information

Please complete the following information:
linear_operator version: 0.3.0
torch version: 1.13.1+cu117


[Bug] Failing to slice the CatLinearOperator when indices are negative or when using a boolean array

πŸ› Bug

When slicing the CatLinearOperator using a negative index, the final shape of the slice does not match the expected shape and an error is returned. This fails at least for ToeplitzLinearOperator, the DiagLinearOperator and the IdentityLinearOperator.

To reproduce

** Code snippet to reproduce **

from linear_operator.operators import IdentityLinearOperator as Ops
from linear_operator.operators import cat as cat_ops

N = 8
base = cat_ops([Ops(N) for _ in range(2)], dim=1)
print(base.shape)  # should be (8, 16)
print(base[:, 3:base.shape[-1] - 3].shape)  # should be (8, 10)
print(base[:, 3:-3].shape)  # fails...

** Stack trace/error message **

torch.Size([8, 16])
torch.Size([8, 10])
Traceback (most recent call last):
  File "/home/moise/Program/moise/linear_operator/debug.py", line 8, in <module>
    print(base[:,3:-3].shape) #fail...
  File "/home/moise/Program/moise/linear_operator/linear_operator/operators/_linear_operator.py", line 2870, in __getitem__
    raise RuntimeError(
RuntimeError: CatLinearOperator.__getitem__ failed! Expected a final shape of size torch.Size([8, 10]), got torch.Size([8, 5]). This is a bug with LinearOperator, or your custom LinearOperator.

Expected Behavior

The slice should behave the same way as it does when using positive indices.

System information

LinearOperator Version 0.5.2
PyTorch Version 2.0.1
Ubuntu 22.04
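
A possible interim workaround (a sketch, assuming base from the snippet above) is to normalize negative slice bounds to positive ones before indexing:

# Workaround sketch: convert the negative bounds with slice.indices() first.
start, stop, _ = slice(3, -3).indices(base.shape[-1])  # -> (3, 13, 1) for size 16
print(base[:, start:stop].shape)  # torch.Size([8, 10])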

Python 3.7 support - Google Colab related

Hi,

Thanks for the nice library. I just saw the package on PyPI (https://pypi.org/project/linear-operator/) and wanted to quickly try it on Google Colab (which currently runs Python 3.7.14), but haven't managed to get it set up yet due to the required Python version being >=3.8:
https://github.com/cornellius-gp/linear_operator/blob/main/setup.py

Would it be possible to relax the Python requirement to support running on Google Colab?

For the record, the error message was also rather difficult to read:

!pip install linear-operator

led to

ERROR: Could not find a version that satisfies the requirement linear-operator (from versions: none)

while a more explicit command was a bit more readable:

!pip install https://files.pythonhosted.org/packages/e6/3c/a2cbf56429c4e370cdcd76155fc1068d21a812c67655244f0776a27697c7/linear_operator-0.1.1-py3-none-any.whl

led to

ERROR: Package 'linear-operator' requires a different Python: 3.7.14 not in '>=3.8'

Best,
Tom

Missing potential `unsqueeze` for `initial_guess` in `linear_cg`

If initial_guess is a vector, linear_cg doesn't do the right thing, which leads to CUDA running out of memory. Similar to

is_vector = rhs.ndimension() == 1
if is_vector:
    rhs = rhs.unsqueeze(-1)

a check for the need to unsqueeze initial_guess should probably be performed here:
if initial_guess is None:
    initial_guess = torch.zeros_like(rhs)
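
A minimal sketch of the suggested handling (a hypothetical helper, not the library's code):

import torch

def prepare_initial_guess(rhs: torch.Tensor, initial_guess=None) -> torch.Tensor:
    # Mirror the rhs handling above: give a 1-D initial_guess a trailing
    # column dimension so it broadcasts like the unsqueezed rhs.
    if initial_guess is None:
        return torch.zeros_like(rhs)
    if initial_guess.ndimension() == 1:
        initial_guess = initial_guess.unsqueeze(-1)
    return initial_guess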

Steps to reproduce the bug on colab:

import sys
print("Python version:",sys.version)

import torch
print('PyTorch version:',torch.__version__)

torchdevice = torch.device('cpu')
if torch.cuda.is_available():
  torchdevice = torch.device('cuda')
  print('Default GPU is ' + torch.cuda.get_device_name(torch.device('cuda')))
print('Running on ' + str(torchdevice))

# see https://github.com/cornellius-gp/linear_operator/issues/20
!pip install gpytorch
import gpytorch
print('gpytorch version:',gpytorch.__version__)

# Dimension of the square sparse matrix
n = 30000
# Number of non-zero elements (up to duplicates)
nnz = 10000

# Create sparse SPD matrix
rowidx = torch.randint(low=0, high=n, size=(nnz,), device=torchdevice)
colidx = torch.randint(low=0, high=n, size=(nnz,), device=torchdevice)
itemidx = torch.vstack((rowidx,colidx))
values = torch.randn(nnz, device=torchdevice)
sparse_mat = torch.sparse_coo_tensor(itemidx, values, size=(n,n)).coalesce()
sparse_mat = sparse_mat.t() @ sparse_mat
id_mat = torch.sparse_coo_tensor(torch.stack((torch.arange(n,device=torchdevice),torch.arange(n,device=torchdevice))), torch.ones(n,device=torchdevice), (n,n))
sparse_mat = id_mat+sparse_mat
print('sparse_mat:',sparse_mat)

b = torch.randn(n, device=torchdevice)

# unsqueezing x0 fixes the issue
x0 = torch.zeros_like(b)

closure = lambda xtest : sparse_mat @ xtest
x = gpytorch.utils.linear_cg(closure, b, initial_guess=x0)
print('cg:',x)

[Bug] solve_triangular(Tensor, LinearOperator) not supported

πŸ› Bug

Passing a linear operator as the right-hand-side argument to solve() does not work if the solve decides to use solve_triangular().

To reproduce

Code snippet to reproduce

import torch
from linear_operator import to_linear_operator

a_root = torch.tensor([[1.0, 0.5, 2.0], [0.0, 0.1, 0.2], [0.0, 0.0, 2.5]])
a = a_root @ a_root.T
b = torch.tensor([[3.0, 1.0], [4.0, 4.0], [1.0, 3.0]])
torch.linalg.solve(a, b)
torch.linalg.solve(to_linear_operator(a), b)
torch.linalg.solve(to_linear_operator(a), to_linear_operator(b))

Stack trace/error message

Traceback (most recent call last):
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/triangular_linear_operator.py", line 68, in _cholesky_solve
    res = self._tensor._cholesky_solve(rhs=rhs, upper=upper)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/dense_linear_operator.py", line 31, in _cholesky_solve
    return torch.cholesky_solve(rhs, self.to_dense(), upper=upper)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2785, in __torch_function__
    raise NotImplementedError(f"torch.{name}({arg_classes}, {kwarg_classes}) is not implemented.")
NotImplementedError: torch.cholesky_solve(DenseLinearOperator, Tensor, upper=bool) is not implemented.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/path/to/virtualenv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3460, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-29-e6878d4d649f>", line 1, in <module>
    torch.linalg.solve(to_linear_operator(a), to_linear_operator(b))
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2789, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2205, in solve
    return func.apply(self.representation_tree(), False, right_tensor, *self.representation())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/functions/_solve.py", line 53, in forward
    solves = _solve(linear_op, right_tensor)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/functions/_solve.py", line 17, in _solve
    return linear_op.cholesky()._cholesky_solve(rhs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/triangular_linear_operator.py", line 76, in _cholesky_solve
    w = self.solve(rhs)
        ^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/triangular_linear_operator.py", line 181, in solve
    res = torch.linalg.solve_triangular(self.to_dense(), right_tensor, upper=self.upper)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/virtualenv/lib/python3.11/site-packages/linear_operator/operators/_linear_operator.py", line 2775, in __torch_function__
    raise NotImplementedError(f"torch.{name}({arg_classes}, {kwarg_classes}) is not implemented.")
NotImplementedError: torch.linalg.solve_triangular(Tensor, DenseLinearOperator, upper=bool) is not implemented.

Expected Behavior

A solution is found.

System information

Please complete the following information:

  • linear_operator 0.3.0
  • pytorch 1.13.1
  • Fedora Linux 37

Additional context

NNVariationalStrategy in GPyTorch can end up making such a call (here).
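
Until this is supported, an interim workaround (a sketch based on the reproduction above) is to pass the right-hand side as a plain tensor, which is the variant that already works:

import torch
from linear_operator import to_linear_operator

# Workaround sketch: keep the left operand lazy and densify only the RHS.
a_root = torch.tensor([[1.0, 0.5, 2.0], [0.0, 0.1, 0.2], [0.0, 0.0, 2.5]])
a = a_root @ a_root.T
b = torch.tensor([[3.0, 1.0], [4.0, 4.0], [1.0, 3.0]])
x = torch.linalg.solve(to_linear_operator(a), b)  # plain-tensor RHS works per the report above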

`LinearOperatorTestCase`'s `_test_triangular_linear_op_inv_quad_logdet` is skipped and fails otherwise

LinearOperatorTestCase's _test_triangular_linear_op_inv_quad_logdet is not executed in the current test suite because of the leading underscore. When this underscore is removed, the test executes but fails.

In particular, the test calls _test_inv_quad_logdet and checks whether this makes a call to Cholesky. However, derived test classes like TestLowRankRootAddedDiagLinearOperator, TestLowRankRootAddedDiagLinearOperatorBatch and TestSumKroneckerLinearOperator apparently don't use Cholesky-based inference for this step (see here for the low-rank test), which makes the test fail.

[Bug] Indexing `ConstantMulLinearOperator` with a `SumBatchLinearOperator` base operator

πŸ› Bug

To reproduce

from torch import rand
from linear_operator import operators as ops

A = ops.DenseLinearOperator(rand(4, 3, 2, 2))
B = ops.SumBatchLinearOperator(A, block_dim=-3)
C = ops.ConstantMulLinearOperator(B, rand([]))
C[:, -1:, :].to_dense()

The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-e4d937430d51> in <module>
----> 1 C[:, -1:, :].to_dense()
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/sum_batch_linear_operator.py in to_dense(self)
     59 
     60     def to_dense(self):
---> 61         return self.base_linear_op.to_dense().sum(dim=-3)  # BlockLinearOperators always use dim3 for the block_dim
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/utils/memoize.py in g(self, *args, **kwargs)
     57         kwargs_pkl = pickle.dumps(kwargs)
     58         if not _is_in_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl):
---> 59             return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
     60         return _get_from_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl)
     61 
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/constant_mul_linear_operator.py in to_dense(self)
    164     def to_dense(self):
    165         res = self.base_linear_op.to_dense()
--> 166         return res * self.expanded_constant
    167 
    168     @cached(name="root_decomposition")
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1

Expected Behavior

The code is expected to behave in the same way its dense analogue would.

Symmetric matrices and convolution operators

Hi!
I'm interested in the library to reduce the memory and computational complexity of computing covariance matrices of a convolutional operation. The covariance is symmetric by construction, which would approximately halve the burden. The result may be consumed by other convolutional layers/other operations, or in some cases only the diagonal may be kept (which should in turn reduce the burden to the square root of the full covariance).

Edit: it would actually be symmetric tensors with shape channel*width*height*channel*width*height

Is this possible with the library at the moment? Should there be concerns about performance relative to the cuDNN implementations of convolutions?

A possible approach I thought of involved unfolding the input, which I'm not sure is fully supported by the library, and which is somewhat inefficient in regular PyTorch.

[Feature Request] Block matrix support

πŸš€ Feature Request

Represent [TN, TM] tensors by T x T blocks of N x M lazy tensors. While block matrices are supported, the efficient representation exists only when there is a diagonal structure over the T dimensions.

Motivation

Here is an example that linear_operator cannot deal with:

import torch
import itertools

T, N, M = 2, 4, 3
As = [torch.rand(N, M) for _ in range(T)]
Bs = [[torch.rand(M, M) for _ in range(T)] for _ in range(T)]
Cs = [torch.rand(N, N) for _ in range(T)]
L = torch.rand(T, T)

A_bl = torch.zeros((N * T, M * T))  # BlockDiag (non-square)
B_bl = torch.zeros((M * T, M * T))  # Dense
C_bl = torch.zeros((N * T, N * T))  # BlockDiag
L_bl = torch.kron(L, torch.eye(N))  # Kronecker

for t in range(T):
    A_bl[N * t : N * (t + 1), M * t : M * (t + 1)] = As[t]
    C_bl[N * t : N * (t + 1), N * t : N * (t + 1)] = Cs[t]

for t1, t2 in itertools.product(range(T), range(T)):
    B_bl[M * t1 : M * (t1 + 1), M * t2 : M * (t2 + 1)] = Bs[t1][t2]

# Desired calculation
print("inefficient method")
print(torch.diag(L_bl @ (C_bl + A_bl @ B_bl @ A_bl.T) @ L_bl.T))

This calculation turns up in some multi-output GP models. It has a straightforward efficient implementation:

M_diag = {}
# We only need the diagonal of each block of M
for t1, t2 in itertools.product(range(T), range(T)):
    r = (As[t1].T * (Bs[t1][t2] @ As[t2].T)).sum(0)
    if t1 == t2:
        r += torch.diag(Cs[t1])
    M_diag[(t1, t2)] = r

# The rotation is applied blockwise due to the kron structure
R = {}
for t in range(T):  # we don't need the off-diag blocks
    r = 0
    for i1, i2 in itertools.product(range(T), range(T)):
        r += L[t, i1] * M_diag[(i1, i2)] * L[t, i2]
    R[t] = r

print("fast way")
print(torch.concat([R[t] for t in range(T)]))

Currently, this calculation could be implemented inside linear_operator like this

from linear_operator.operators import (
    to_linear_operator,
    IdentityLinearOperator,
    BlockDiagLinearOperator,
    BlockLinearOperator,
    MatmulLinearOperator,
    KroneckerProductLinearOperator,
)


class BlockDiagLinearOperatorNonSquare(BlockLinearOperator):
    _add_batch_dim = BlockDiagLinearOperator._add_batch_dim
    _remove_batch_dim = BlockDiagLinearOperator._remove_batch_dim
    _get_indices = BlockDiagLinearOperator._get_indices
    _size = BlockDiagLinearOperator._size
    num_blocks = BlockDiagLinearOperator.num_blocks

    def __init__(self, base_linear_op, block_dim=-3):
        super().__init__(base_linear_op, block_dim)

A_lo = BlockDiagLinearOperatorNonSquare(torch.stack(As, 0))
B_lo = to_linear_operator(B_bl)
C_lo = BlockDiagLinearOperator(to_linear_operator(torch.stack(Cs, 0)))
L_lo = KroneckerProductLinearOperator(L, IdentityLinearOperator(N))

M = MatmulLinearOperator(A_lo, (MatmulLinearOperator(B_lo, A_lo.T)))

print("using linear operator, with to_dense()")
print(
    MatmulLinearOperator(L_lo, MatmulLinearOperator(C_lo + M, L_lo.T))
    .to_dense()
    .diagonal()
)

Removing the to_dense() gives an error, however.

Pitch

Add a block linear operator class that can keep track of the [T, T] block structure, represented as T^2 lazy tensors of the same shape. Implement matrix multiplication between block matrices in terms of the appropriate linear operators on the blocks.

As a workaround, I have written manual implementations of specific cases, such as the one above.

I'm willing to work on a PR for this.

Additional context

None

[Feature Request] Tracking torch namespace function coverage

πŸš€ Feature Request

Currently, there are a number of functions in the torch namespace that are not registered to dispatch to LinearOperator. Closing this gap will be necessary to make LinearOperator a true drop-in replacement for torch.Tensor.

The purpose of this issue is to track which functions are currently not registered, and prioritize the work on closing these gaps. I will try to keep this issue updated with things that come up and cross out things as we close them out.

Missing functions

  • torch.linalg.solve_triangular - #29
  • torch.linalg.cholesky_ex
  • torch.isclose
  • ...
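
A quick way to probe whether a given function is registered (a sketch; unregistered functions raise NotImplementedError from __torch_function__, as in the tracebacks elsewhere on this page):

import torch
from linear_operator.operators import DiagLinearOperator

# Probe sketch: unregistered torch-namespace functions fail loudly.
A = DiagLinearOperator(torch.rand(3))
try:
    torch.isclose(A, A)
    print("torch.isclose dispatches to LinearOperator")
except NotImplementedError as err:
    print("not registered:", err)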

[Bug] linear_cg fails on sparse matrices

πŸ› Bug

The linear_cg function fails to produce correct values when operating on sparse matrices.

To reproduce

** add the following code snippet to test_linear_cg.py to reproduce **

    def test_cg_with_sparse(self):
        # Create a sample sparse CSR tensor
        N = size = 1000
        nnz = 10000
        indices = torch.randint(0, N, (2, nnz), dtype=torch.int32)
        values = torch.randn(nnz, dtype=torch.float64)
        matrix = torch\
            .sparse_coo_tensor(indices, values, (N, N), dtype=torch.float64)\
            .to_sparse_csr()

        # set up vector rhs
        rhs = torch.randn(size, dtype=torch.float64)

        # basic solve
        solves = linear_cg(matrix.matmul, rhs=rhs, max_iter=size)

        # solve with init value
        init = torch.randn(size, dtype=torch.float64)
        solves_with_init = linear_cg(matrix.matmul, rhs=rhs, max_iter=size, initial_guess=init)

        # Check cg
        matrix, rhs = matrix.to_dense(), rhs.to_dense()
        actual = torch.linalg.solve(matrix, rhs)
        self.assertTrue(torch.allclose(solves, actual, atol=1e-3, rtol=1e-4))
        self.assertTrue(torch.allclose(solves_with_init, actual, atol=1e-3, rtol=1e-4))

** Stack trace/error message **

NumericalWarning: CG terminated in 1000 iterations with average residual norm 97.1439134614844 which is larger than the tolerance of 1 specified by linear_operator.settings.cg_tolerance. If performance is affected, consider raising the maximum number of CG iterations by running code in a linear_operator.settings.max_cg_iterations(value) context.

FAIL: test_cg_with_sparse (__main__.TestLinearCG)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/test_linear_cg.py", line 90, in test_cg_with_sparse
    self.assertTrue(torch.allclose(solves, actual, atol=1e-3, rtol=1e-4))
AssertionError: False is not true

----------------------------------------------------------------------

Expected Behavior

Expect the result of running linear_cg to be close to that of torch.linalg.solve.

System information

Please complete the following information:

  • LinearOperator Version 0.3.0
  • PyTorch Version 2.0.0+cu117
  • macOS Monterey 12.6.3
  • Python version 3.10.9

Additional context

Other warnings produced when running the test:

  warnings.warn(
.../path/test_linear_cg.py:115: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at ../aten/src/ATen/SparseCsrTensorImpl.cpp:54.)
  .to_sparse_csr()
/path/lib/python3.10/site-packages/linear_operator/utils/linear_cg.py:323: NumericalWarning: CG terminated in 1000 iterations with average residual norm 97.1439134614844 which is larger than the tolerance of 1 specified by linear_operator.settings.cg_tolerance. If performance is affected, consider raising the maximum number of CG iterations by running code in a linear_operator.settings.max_cg_iterations(value) context.

I suppose this may also be a PyTorch issue but I am not sure how to investigate further from here.
Appreciate your time to review this issue. Thanks!

Upgrade required typeguard version

Greetings,

LinearOperator seems to require typeguard~=2.13.3, which is from December 2021, while typeguard is currently at v4.1.5. It would be nice to upgrade this dependency to something more recent.

Thank you.

[Bug] `mul` with mixed LinearOperator and Tensor gives incorrect shape

πŸ› Bug

Calling the LinearOperator mul method with a torch.Tensor argument results in the wrong shape (although converting the result to a regular tensor appears to give the correct values). This caused GPyTorch to crash when using a ScaleKernel with an AdditiveKernel (in the multi-task case only).

To reproduce

import torch
from linear_operator import to_linear_operator

a = torch.randn(2, 1, 1)
b = torch.randn(1, 1000, 10000)

print(b.mul(a).shape)

a_lo = to_linear_operator(a)
b_lo = to_linear_operator(b)

print(b_lo.mul(a).shape)
print(b_lo.mul(a).to_dense().shape)

** Output **

torch.Size([2, 1000, 10000])
torch.Size([1, 1000, 10000])
torch.Size([2, 1000, 10000])

System information

linear-operator==0.1.1

ZeroLinearOperators used with CatLinearOperators create representation issues.

Discussed in #78

Originally posted by AndreaBraschi September 13, 2023
Hi all,

I've written a function to concatenate 2 square LinearOperator objects with different shapes:

from linear_operator.operators import (
    CatLinearOperator,
    KroneckerProductLinearOperator,
    SumLinearOperator,
    ZeroLinearOperator,
)

def cat_LinearOperators(*linear_operators):
    oper_A = linear_operators[0]
    oper_B = linear_operators[1]
    oper_A_size = oper_A.shape[0]
    oper_B_size = oper_B.shape[0]

    # mat_A
    a = ZeroLinearOperator(oper_A_size, oper_B_size)
    cat_1 = CatLinearOperator(a, oper_A, dim=1)
    c = ZeroLinearOperator(oper_B_size, oper_B_size)
    cat_2 = CatLinearOperator(a.transpose(-1, -2), c, dim=1)
    tensor_A = CatLinearOperator(cat_2, cat_1, dim=0)

    # mat_B
    e = ZeroLinearOperator(oper_B_size, oper_A_size)
    cat_3 = CatLinearOperator(oper_B, e, dim=1)
    c = ZeroLinearOperator(oper_A_size, oper_A_size)
    cat_4 = CatLinearOperator(c, e, dim=0)
    tensor_B = CatLinearOperator(cat_3, cat_4.transpose(-1, -2), dim=0)

    cat_operator = SumLinearOperator(tensor_A, tensor_B)

    return cat_operator

The function works, but I get the following RuntimeError when I evaluate the resulting CatLinearOperator:
Representation of a LazyTensor should consist only of Tensors.

You can evaluate the function yourself:

import torch

tensor_A = torch.randn(38, 38)
tensor_B = torch.randn(50, 50)
tensor_C = torch.randn(4, 4)
Kron_A = KroneckerProductLinearOperator(tensor_A, tensor_B)
Kron_B = KroneckerProductLinearOperator(tensor_C, tensor_B)
cat_operator = cat_LinearOperators(Kron_B, Kron_A)
cat_operator.to_dense()

I'm starting to believe that the mistake is in how I use the ZeroLinearOperator object?

Any help or suggestion will be greatly appreciated.

Many thanks,
Andrea

[Bug] ConstantDiagLinearOperator._mul_constant

πŸ› Bug

ConstantDiagLinearOperator._mul_constant seems to be bugged.

To reproduce

from torch import rand
from linear_operator import operators

A = operators.DenseLinearOperator(rand(4, 2, 3, 3))
B = A.add_jitter(1.0)
C = operators.ConstantMulLinearOperator(B, rand([]))
C[..., -1:, :]

Stack trace/error message

The size of tensor a (2) must match the size of tensor b (4) at non-singleton dimension 1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-11-690c97df3908> in <module>
----> 1 C[..., -1:, :]
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/_linear_operator.py in __getitem__(self, index)
   2632             res = self._get_indices(row_index, col_index, *batch_indices)
   2633         else:
-> 2634             res = self._getitem(row_index, col_index, *batch_indices)
   2635 
   2636         # If we selected a single row and/or column (or did tensor indexing), we'll be retuning a tensor
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/constant_mul_linear_operator.py in _getitem(self, row_index, col_index, *batch_indices)
    102         constant = self._constant.expand(self.batch_shape)[batch_indices]
    103         constant = constant.view(*constant.shape, 1, 1)
--> 104         return base_linear_op * constant
    105 
    106     def _matmul(self, rhs):
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/_linear_operator.py in __mul__(self, other)
   2665     @_implements_second_arg(torch.Tensor.mul)
   2666     def __mul__(self, other: Union[torch.Tensor, LinearOperator, float]) -> LinearOperator:
-> 2667         return self.mul(other)
   2668 
   2669     @_implements_second_arg(torch.Tensor.add)
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/_linear_operator.py in mul(self, other)
   1753                 return self._mul_constant(other.squeeze())
   1754             elif other.shape[-2:] == torch.Size((1, 1)):
-> 1755                 return self._mul_constant(other.view(*other.shape[:-2]))
   1756 
   1757         return self._mul_matrix(to_linear_operator(other))
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/sum_linear_operator.py in _mul_constant(self, other)
     44     def _mul_constant(self, other):
     45         # We're using a custom method here - the constant mul is applied to the base_linear_ops
---> 46         return self.__class__(*[lt._mul_constant(other) for lt in self.linear_ops])
     47 
     48     def _bilinear_derivative(self, left_vecs, right_vecs):
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/interpolated_linear_operator.py in _mul_constant(self, other)
    218         # This preserves the interpolated structure
    219         return self.__class__(
--> 220             self.base_linear_op._mul_constant(other),
    221             self.left_interp_indices,
    222             self.left_interp_values,
/mnt/xarfuse/uid-22150/d091ed77-seed-nspid4026533386_cgpid5352405-ns-4026533383/linear_operator/operators/diag_linear_operator.py in _mul_constant(self, constant)
    272 
    273     def _mul_constant(self, constant: Tensor) -> "ConstantDiagLinearOperator":
--> 274         return self.__class__(self.diag_values * constant, diag_shape=self.diag_shape)
    275 
    276     def _mul_matrix(self, other: Union[Tensor, LinearOperator]) -> Union[Tensor, LinearOperator]:
RuntimeError: The size of tensor a (2) must match the size of tensor b (4) at non-singleton dimension 1

Additional context

The following seems to work, but I don't know enough to claim that it's a fix.

from unittest.mock import patch


def _mul_constant(self, constant):
    return self.__class__(self.diag_values * constant.unsqueeze(-1), diag_shape=self.diag_shape)


with patch.object(operators.ConstantDiagLinearOperator, "_mul_constant", new=_mul_constant):
    C[..., -1:, :]

Argument name inconsistency in setting the jitter value with GPyTorch

  • Change the __init__ argument names of the class _dtype_value_context.

    • Currently the init function for _dtype_value_context in linear_operator/settings.py is:
      def __init__(self, float=None, double=None, half=None):

    • This is not consistent with the init function for _dtype_value_context in gpytorch/settings.py, which is:
      def __init__(self, float_value=None, double_value=None, half_value=None):

    • I would suggest unifying the argument names, for example, making both __init__ functions:
      def __init__(self, float_value=None, double_value=None, half_value=None):

[Bug] Passing `other: Tensor` to `LinearOperator.to`.

πŸ› Bug

torch.Tensor.to allows the user to provide a reference tensor in lieu of explicitly passing device and dtype; however, this logic seems to be missing from LinearOperator.to.

To reproduce

import torch
from linear_operator.operators import DiagLinearOperator

A = DiagLinearOperator(torch.rand(3, dtype=torch.float64))
B = A.to(torch.rand(3, dtype=torch.float32))
B.dtype
> torch.float64

Expected Behavior

The code should either follow the same convention as torch.Tensor.to or raise an exception.
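
For comparison, a minimal sketch of the torch.Tensor.to behavior being referenced:

import torch

# torch.Tensor.to accepts another tensor and adopts its dtype (and device).
x = torch.rand(3, dtype=torch.float64)
y = x.to(torch.rand(3, dtype=torch.float32))
print(y.dtype)  # torch.float32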

Use a fixed random seed for the tests

πŸš€ Feature Request

Use a fixed random seed for the tests.

Motivation

Hello!

In a CI build of linear_operator, we are sometimes experiencing a build failure due to overly tight tolerances. For instance, here:

https://bordeaux.guix.gnu.org/build/cecb897d-366c-4d76-a501-7e7507978b53/log

The fact that it happens randomly makes it hard to build the package reliably.

Pitch

Could you please use a fixed random seed to run the tests, at least for CI builds? I don’t know enough Python to help much.

Best regards,

Vivien
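
A minimal sketch of one way to do this (hypothetical per-test seeding; not the project's current setup):

import unittest
import torch

class SeededTestCase(unittest.TestCase):
    # Fix the RNG state before every test so tolerance-based assertions
    # produce the same numbers on every CI build.
    def setUp(self):
        torch.manual_seed(0)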

Pythonic Sum is broken with LinearOperators

Attempting to use the Pythonic sum() on arbitrary iterables of LinearOperators raises a TypeError. This is because the built-in sum() function starts with int(0) and sums iteratively, as follows:

sum([a,b,..z]) = 0 + a + b + .. + z

Minimal reproducible example (with workaround)

import gpytorch
import torch
import traceback
import warnings

x1 = torch.rand([16,1])
x2 = torch.rand([16,1])

k1 = gpytorch.kernels.RBFKernel()(x1)
k2 = gpytorch.kernels.RBFKernel()(x2)

k3 = k1+k2

print("'+' operator works fine.")

# This does not
try:
    k4 = sum([k1,k2])
except Exception as e: traceback.print_exc()

# A workaround now: functools.reduce works, but is a little unsightly

from functools import reduce
import operator

k5 = reduce(operator.add,[k1,k2])

A Fix

This can be fixed by simply including logic for LinearOperator to add int(0) in the same way as it adds a ZeroLinearOperator, by returning self, as follows:

import numbers
# Specifically only allow numbers.Number without raising TypeError in the case where other==0
if isinstance(other, numbers.Number) and other==0:
    return self

I'd recommend adding this case at the end of the conditions in LinearOperator.__add__, on line 2571, so it's only checked after the more common use cases, as follows:

def __add__(self, other: Union[torch.Tensor, LinearOperator, float]) -> LinearOperator:
    from torch import Tensor

    from .added_diag_linear_operator import AddedDiagLinearOperator
    from .dense_linear_operator import to_linear_operator
    from .diag_linear_operator import DiagLinearOperator
    from .root_linear_operator import RootLinearOperator
    from .sum_linear_operator import SumLinearOperator
    from .zero_linear_operator import ZeroLinearOperator
    
    import numbers

    if isinstance(other, ZeroLinearOperator):
        return self
    elif isinstance(other, DiagLinearOperator):
        return AddedDiagLinearOperator(self, other)
    elif isinstance(other, RootLinearOperator):
        return self.add_low_rank(other.root)
    elif isinstance(other, Tensor):
        other = to_linear_operator(other)
        shape = torch.broadcast_shapes(self.shape, other.shape)
        new_self = self if self.shape[:-2] == shape[:-2] else self._expand_batch(shape[:-2])
        new_other = other if other.shape[:-2] == shape[:-2] else other._expand_batch(shape[:-2])
        return SumLinearOperator(new_self, new_other)
    elif isinstance(other, numbers.Number) and other==0:
        return self
    else:
        return SumLinearOperator(self, other)

[Bug] torch.cat fails for linear operators

πŸ› Bug

torch.cat fails for linear operators.

To reproduce

** Code snippet to reproduce **

from linear_operator.operators import DiagLinearOperator
import torch

D = DiagLinearOperator(torch.randn(2, 3, 100))  # Represents an operator of size 2 x 3 x 100
torch.cat([D, D], dim=-2)

** Stack trace/error message **

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/miniconda3/envs/env/lib/python3.12/site-packages/linear_operator/operators/_linear_operator.py", line 2948, in __torch_function__
    raise NotImplementedError(f"torch.{name}({arg_classes}, {kwarg_classes}) is not implemented.")
NotImplementedError: torch.cat(list, dim=int) is not implemented.

Expected Behavior

According to the documentation, torch.cat should work on linear operators.

System information

Please complete the following information:

  • LinearOperator version: 0.5.2
  • PyTorch Version: 2.2.1
  • OS: Ubuntu 20.04.6 LTS
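
Until torch.cat is registered for dispatch, a dense fallback works as a stopgap (a sketch; it loses the lazy structure):

import torch
from linear_operator.operators import DiagLinearOperator

# Workaround sketch: densify before concatenating.
D = DiagLinearOperator(torch.randn(2, 3, 100))
C = torch.cat([D.to_dense(), D.to_dense()], dim=-2)
print(C.shape)  # torch.Size([2, 3, 200, 100])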

Representation of a LinearOperator should consist only of Tensors

Duplicate of β€œZeroLinearOperators used with CatLinearOperators create representation issues” above, with the same code and error.

Originally posted by @AndreaBraschi in #78
