
Error messages to improve · pytorch · 26 comments · CLOSED

pytorch commented on April 27, 2024
Error messages to improve

Comments (26)

shkr commented on April 27, 2024

I just received this error. It is very confusing:

size '[-1 x 400]' is invalid for input of with 952576 elements at /Users/soumith/code/builder/wheel/pytorch-src/torch/lib/TH/THStorage.c:

soumith commented on April 27, 2024

@hughperkins "inconsistent tensor size" is fixed in master to be much more informative.

soumith commented on April 27, 2024

Running https://gist.github.com/panovr/2977d9f26866b05583b0c40d88a315bf raises std::bad_cast instead of a proper error message.
As posted here: https://discuss.pytorch.org/t/runtimeerror-std-bad-cast-trying-to-fine-tune-resnet18/1592

Atcold commented on April 27, 2024

The following error is hard to parse:

TypeError: FloatClassNLLCriterion_updateOutput received an invalid combination of arguments - got (int, torch.FloatTensor, torch.FloatTensor, torch.FloatTensor, bool, NoneType, torch.FloatTensor), but expected (int state, torch.FloatTensor input, torch.LongTensor target, torch.FloatTensor output, bool sizeAverage, [torch.FloatTensor weights or None], torch.FloatTensor total_weight)

It would be nicer to see something like "target is expected to be a LongTensor, got FloatTensor instead". Moreover, I couldn't figure out where the error was thrown (i.e. which assert produced it).

soumith commented on April 27, 2024

@alexholdenmiller reports:

shoot I had just tried torch.gather(source, 1, batch_of_indices), and got the error "Input tensor must have same dimensions as output tensor", which gave me the impression that gather wouldn't work.

Turns out this error was meant to communicate "Input tensor must have same number of dimensions as index tensor"; the working command was torch.gather(source, 1, batch_of_indices.view(-1, 1)).
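
A minimal sketch of the mismatch (shapes are made up):

import torch

source = torch.randn(4, 3)                # 2D input
idx = torch.tensor([0, 2, 1, 0])          # 1D index tensor
# torch.gather(source, 1, idx)            # fails: index must have the same
                                          # number of dimensions as the input
torch.gather(source, 1, idx.view(-1, 1))  # works: index reshaped to 2D (4x1)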

lwolfsonkin commented on April 27, 2024

Consider the following error:

In [64]: Em
Out[64]: 
Variable containing:
 0.3989  0.1153  0.2762
 0.3676  0.6075  0.9249
 0.8584  0.1807  0.9608
 0.7265  0.8045  0.1511
 0.9004  0.1337  0.0259
 0.7794  0.2576  0.1897
 0.6640  0.4450  0.0762
 0.0892  0.0540  0.7068
[torch.FloatTensor of size 8x3]

In [65]: Em[0]
Out[65]: 
Variable containing:
 0.3989
 0.1153
 0.2762
[torch.FloatTensor of size 3]

In [66]: Em[0].t()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-66-ddcc6c77b14a> in <module>()
----> 1 Em[0].t()

/usr/local/lib/python3.6/site-packages/torch/autograd/variable.py in t(self)
    663 
    664     def t(self):
--> 665         return Transpose(0, 1)(self)
    666 
    667     def transpose(self, dim1, dim2):

/usr/local/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py in forward(self, i)
     77 
     78     def forward(self, i):
---> 79         result = i.transpose(*self.dims)
     80         self.mark_shared_storage((i, result))
     81         return result

RuntimeError: out of range at /Users/soumith/code/pytorch-builder/wheel/pytorch-src/torch/lib/TH/generic/THTensor.c:408

Trying to transpose a vector (1-dimensional) gives an unintelligible error when the vector is wrapped in a Variable.

RuntimeError: out of range at /Users/soumith/code/pytorch-builder/wheel/pytorch-src/torch/lib/TH/generic/THTensor.c:408

The equivalent operation on a Tensor returns:

RuntimeError: t() expects a 2D tensor, but self is 1D

This is an example of the last member of the above checklist.
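
(For completeness, a standalone sketch of the workaround: make the slice 2D before transposing.)

import torch

Em = torch.randn(8, 3)
row = Em[0].unsqueeze(0)  # shape (1, 3): now 2D, so t() is well defined
row_t = row.t()           # shape (3, 1)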

Stonesjtu commented on April 27, 2024

Environment:

version: 0.2.0+bcea678 (built from source)
CUDA: 8.0
cuDNN: 6

import torch
import torch.nn as nn
from torch.autograd import Variable

bn = nn.BatchNorm1d(100)        # module stays on CPU
input = Variable(torch.randn(10, 100).cuda())

bn(input)  # CPU module, CUDA input: produces std::bad_cast

bn.cuda()
bn(input)  # works fine

soumith commented on April 27, 2024

reported by @y0ast

When calling the Linear module with a Tensor of more than 2 dimensions, the error is: RuntimeError: matrices expected, got 3D, 2D tensors. It was quite confusing to me; maybe we can change it to RuntimeError: matrices expected, got 3D tensors and a matrix, or even better, RuntimeError: 2D tensors expected, got 3D, 2D tensors.
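
A sketch of the situation (note: recent PyTorch versions accept batched 3D input to Linear, so this only errored on the old versions discussed here):

import torch
import torch.nn as nn

linear = nn.Linear(4, 2)
x = torch.randn(3, 5, 4)  # 3D input
# On the versions discussed here this raised
# "RuntimeError: matrices expected, got 3D, 2D tensors";
# the "2D" in the message refers to the layer's own weight matrix,
# not to anything the user passed in, which is what made it confusing.
y = linear(x)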

hughperkins commented on April 27, 2024

Just generally, I think all C errors should be caught and converted to something more meaningful at the Python level. The C errors are good for debugging the C libraries themselves, but are a real pain when trying to write actual models. Another error, which might or might not be included above (hard to tell):

Traceback (most recent call last):
  File "seq2seq_attention.py", line 116, in <module>
    state, enc_loss = encode(input_encoded, state)
  File "seq2seq_attention.py", line 105, in encode
    enc_loss += criterion(pred_c, autograd.Variable(torch.LongTensor([input_c_encoded.item()])))
  File "/Users/hugh2/conda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/hugh2/conda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 36, in forward
    return backend_fn(self.size_average, weight=self.weight)(input, target)
  File "/Users/hugh2/conda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/_functions/thnn/auto.py", line 41, in forward
    output, *self.additional_args)
RuntimeError: Assertion `THIndexTensor_(size)(target, 0) == batch_size' failed.  at /Users/soumith/miniconda2/conda-bld/pytorch_1493757886437/work/torch/lib/THNN/generic/ClassNLLCriterion.c:50

There are a few issues really:

  • it's hard to search through the stack trace to figure out which line of one's own code is the last thing before it goes into torch code
    • solution: cut the stack trace off wherever it passes into the torch library (in this case, cut everything from module.py line 206 onwards)
    • provide an option to display this stuff anyway, like torch.StackTraces.show_all(True), or whatever (no opinion on the exact command; it's mostly pytorch-dev only anyway)
  • it's hard to figure out which tensors are at fault, e.g. you pass three tensors into a torch Python method: which ones are mismatched?
    • solution: somehow map the error message to the correct tensor. I realize this is generally really hard/impossible. But I'm at least making the observation that it would be nice, if someone can figure out how to make it possible :-)
  • the message doesn't show the actual and expected sizes of the tensors going in. Like, 'When you called foobar(spam, eggs, chips), len(spam[:,0,:]) was 10, and len(eggs[0,:,:]) was 5, and they should be the same', or something
    • solution: at least print out the sizes of the relevant tensors, even from the C code, rather than just calling assert

One challenge is that it's hard to tell where the entry point into the pytorch library is from someone's code: an entry point in one person's code might be used by the entry-point function in another person's code. And putting exception machinery all over the place probably won't be great for performance. One option that occurs to me, just after finishing writing the above, could be to make some kind of 'prettifier' that takes all the stack trace output and automatically beautifies it a bit, so a user can just call print(prettify(some_exception)) or similar (a rough sketch follows the list below).

  • obviously this would need all the required information to be present somehow/somewhere in the stack trace, e.g. the C code should print out the tensor dimensions etc. in some easy-to-parse format
  • however, it does have a few advantages:
    • it could be a separate project from core, could even be pluggable, so it doesn't involve changing core code (beyond making sure core code dumps all required information somewhere during an exception)
    • it won't have any performance impact unless someone explicitly calls prettify
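
Something like this rough, hypothetical sketch (prettify and the path-based frame filtering are assumptions, not an existing torch API):

import traceback

def prettify(exc, pkg="torch"):
    """Format a traceback with frames inside the given package trimmed out."""
    frames = traceback.extract_tb(exc.__traceback__)
    user_frames = [f for f in frames if f"/{pkg}/" not in f.filename]
    out = ["Traceback (user frames only):\n"]
    out += traceback.format_list(user_frames)
    out.append(f"{type(exc).__name__}: {exc}\n")
    return "".join(out)

# usage: print(prettify(some_exception))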

Note that I'm not really thinking of actually contributing to any of these fixes :-P. Just logging it here for completeness. I might occasionally add a single error message or something, though it'd be nice to somehow decide on/document a standard framework/pattern for doing this first, perhaps.

hughperkins commented on April 27, 2024

It's in the list above, but I think it would be easier to debug "inconsistent tensor size" if the message stated the sizes of the two tensors, to save having to add prints to the program to get this information:

RuntimeError: inconsistent tensor size at /Users/soumith/miniconda2/conda-
bld/pytorch_1493757886437/work/torch/lib/TH/generic/THTensorMath.c:831
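
For reference, a sketch of the kind of call that produced this (modern versions do include both shapes in the message):

import torch

a = torch.randn(3, 4)
b = torch.randn(3, 5)
a.add_(b)  # size mismatch; old versions reported only "inconsistent tensor size"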

Stonesjtu commented on April 27, 2024

std::bad_cast is still not fixed in the newest master. Is anyone working on this?

soumith commented on April 27, 2024

@Stonesjtu in what context did you get std::bad_cast? Happy to fix, but need some context.

Stonesjtu commented on April 27, 2024

Something goes wrong when doing mixed advanced indexing, and the error message is not helpful in this case.

In [1]: import torch

In [2]: from torch.autograd import Variable

In [3]: va = Variable(torch.rand(5,5))

In [4]: index = torch.LongTensor([1,2,3])

In [5]: va[1,index]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-c87759241b4e> in <module>()
----> 1 va[1,index]

/slwork/users/kys10/anaconda2/lib/python2.7/site-packages/torch/autograd/variable.pyc in __getitem__(self, key)
     73                 return IndexSelect.apply(self, 0, key)
     74             # else fall through and raise an error in Index
---> 75         return Index.apply(self, key)
     76 
     77     def __setitem__(self, key, value):

/slwork/users/kys10/anaconda2/lib/python2.7/site-packages/torch/autograd/_functions/tensor.pyc in forward(ctx, i, index)
     14         ctx.input_size = i.size()
     15         ctx.index = index
---> 16         result = i.index(ctx.index)
     17         ctx.advanced_indexing = i._check_advanced_indexing(index)
     18         if not ctx.advanced_indexing:

TypeError: indexing a tensor with an object of type torch.LongTensor. The only supported types are integers, slices, numpy scalars and torch.LongTensor or torch.ByteTensor as the only argument.
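
(For what it's worth, current PyTorch accepts this kind of mixed advanced indexing directly; a quick sketch:)

import torch

va = torch.rand(5, 5)
index = torch.tensor([1, 2, 3])
print(va[1, index])  # selects elements (1, 1), (1, 2), (1, 3)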

jdily commented on April 27, 2024

Hi, I am on v0.2 and have the same issue mentioned by @alexholdenmiller,
and I already tried the fix as follows:
current_Q_values = Q(obs_batch).gather(1, act_batch.view(-1,1))
but I still got the following error:
Index tensor must have same dimensions as input tensor at /home/jdily/Desktop/project/lib/pytorch/torch/lib/THC/generic/THCTensorScatterGather.cu:111

thx

idansc commented on April 27, 2024

torch.mm on a 2D and a 3D tensor returns:
RuntimeError: matrices expected, got 2D, 3D tensors at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/TH/generic/THTensorMath.c:2028

I'd change the term "matrices" to "2D tensors".
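
A minimal repro sketch (the exact message varies by version):

import torch

a = torch.randn(2, 3)
b = torch.randn(2, 3, 4)
# torch.mm(a, b)        # errors: mm only accepts two 2D tensors
torch.matmul(a, b[0])   # works; use torch.bmm or torch.matmul for batched operands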

ezyang commented on April 27, 2024

@idansc Sure, sounds reasonable. Are you interested in submitting a PR?

kielpins commented on April 27, 2024

@idansc @ezyang I'm not sure why this needs to be a RuntimeError in the first place. For instance, the corresponding numpy error is ValueError: operands could not be broadcast together with shapes FOO, BAR. That is a much more specific error than RuntimeError, which is practically a catch-all. Personally, I feel a lot more comfortable writing a try-except block against ValueError than RuntimeError.
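
A sketch of what that distinction means for user code (assuming the historical RuntimeError behavior):

import torch

try:
    torch.mm(torch.randn(2, 3), torch.randn(4, 5))
except ValueError as e:
    print("value error:", e)    # what numpy-style code would catch
except RuntimeError as e:
    print("runtime error:", e)  # what PyTorch actually raises for the mismatch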

ezyang commented on April 27, 2024

@kielpins Sure, refining the error to be a ValueError seems fine to me, and I'd accept PRs for it. We are horribly, horribly inconsistent about this at the moment, however, because we don't have any convenient macros for saying "hey, this is a value error" specifically, so even if you wanted to catch ValueError it wouldn't work most of the time.

ezyang commented on April 27, 2024

This is a pretty old issue, but don't close me just yet! If you want to close me, please go through the list and check if we've already fixed the error message problems.

pifparfait commented on April 27, 2024

Hi world,
I got this error after declaring and calling the following variables:

# randomly initialize parameters from a normal distribution
params = np.random.normal(0, np.pi, (nr_qubits, nr_layers, 3))
params = Variable(torch.tensor(params), requires_grad=True)

The error is:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
     15 # the final stage of optimization isn't always the best, so we keep track of
     16 # the best parameters along the way
---> 17 best_cost = cost_fn(params)
     18 best_params = np.zeros((nr_qubits, nr_layers, 3))
     19

<ipython-input> in cost_fn(params)
      2     cost = 0
      3     for k in range(3):
----> 4         cost += torch.abs(circuit(params, A=Paulis[k]) - bloch_v[k])
      5
      6     return cost

<ipython-input> in circuit(params, A)
      3     # repeatedly apply each layer in the circuit
      4     for j in range(nr_layers):
----> 5         layer(params, j)
      6
      7     # returns the expectation of the input matrix A on the first qubit

<ipython-input> in layer(params, j)
     11 def layer(params, j):
     12     for i in range(nr_qubits):
---> 13         qml.RX(params[i, j, 0], wires=i)
     14         qml.RY(params[i, j, 1], wires=i)
     15         qml.RZ(params[i, j, 2], wires=i)

~/opt/anaconda3/lib/python3.7/site-packages/pennylane/operation.py in __init__(self, wires, do_queue, *params)
    713         assert self.grad_recipe is None, "Gradient recipe is only used by the A method!"
    714
--> 715         super().__init__(*params, wires=wires, do_queue=do_queue)
    716
    717

~/opt/anaconda3/lib/python3.7/site-packages/pennylane/operation.py in __init__(self, wires, do_queue, *params)
    388         if self.do_check_domain:
    389             for p in params:
--> 390                 self.check_domain(p)
    391         self.data = list(params)  #: list[Any]: parameters of the operator
    392

~/opt/anaconda3/lib/python3.7/site-packages/pennylane/operation.py in check_domain(self, p, flattened)
    451         if not isinstance(p, numbers.Real):
    452             raise TypeError(
--> 453                 "{}: Real scalar parameter expected, got {}.".format(self.name, type(p))
    454             )
    455

TypeError: RX: Real scalar parameter expected, got <class 'torch.Tensor'>.

pifparfait commented on April 27, 2024

(Same content as the previous comment, reposted verbatim.)

SOLVED!

KushajveerSingh commented on April 27, 2024

@apaszke I can work on this issue. Can you give me some pointers on where I should start?

hughperkins commented on April 27, 2024

@KushajveerSingh I assume this is for some kind of project? What you could do is:

  • first work down the list at the top of this thread and document which items are still current (it's a really old list; some things might already have been fixed)
    • e.g. use a function that generates an out-of-range error and check what message comes back
  • then work your way down the list, fixing the messages that are still poor

One slight variant on this would be to add unit tests that feed out-of-specification inputs to pytorch functions and check the resulting error messages. The advantage of this approach is that you have a deliverable even if you fail to find anything broken. For bonus points, you could try to test in as scalable a way as possible.
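
A sketch of such a test (the message regex is illustrative and may need adjusting for the version under test):

import unittest
import torch

class TestErrorMessages(unittest.TestCase):
    def test_t_rejects_3d_with_clear_message(self):
        # t() should refuse >2D input and say so in plain language
        with self.assertRaisesRegex(RuntimeError, "expects a tensor with <= 2"):
            torch.randn(2, 2, 2).t()

if __name__ == "__main__":
    unittest.main()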

KushajveerSingh commented on April 27, 2024

@hughperkins Thank you for the information. This is not for a project, just wanted to start contributing to pytorch in some way.

hughperkins commented on April 27, 2024

A new one for you:

  • if you access .grad and there is no gradient, you get a warning like
    "UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com//pull/30531 for more informations."
    • this is intended behavior, sort of, ish
  • now if you call zero_grad() on an nn.Module instance, and one of the parameters has no gradient populated, you get a warning like:
    "/Users/hp/.pyenv/versions/ulfs/lib/python3.7/site-packages/torch/nn/modules/module.py:1332: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com//pull/30531 for more informations."
    • there are a few problems with this:
      • firstly, the module and line number are uninterpretable and sit outside the user's script
      • it doesn't really explain the higher-level reason for what is going on, e.g. we don't know which parameter triggered it
  • I haven't really thought about how to solve this error message, but I feel it is fairly hard to interpret
    (maybe calling zero_grad on something without a grad should just skip that something?)
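
A minimal sketch that triggers the first warning:

import torch

x = torch.randn(3, requires_grad=True)  # leaf tensor
mid = x * 2                             # non-leaf tensor
mid.sum().backward()

print(x.grad)    # populated; no warning
print(mid.grad)  # None, and accessing it emits the UserWarning quoted above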

mruberry commented on April 27, 2024

Closing this issue due to age and lack of updates. Many of the cases described in the original issue have been made moot by other changes in PyTorch. If there are issues or possible improvements for PyTorch's current error messages, please file new issues.
