Comments (7)
right, so if you don't want lags, set that array to `[1]` and increase your context length by one more time step... can you check if that works?
from transformers.
It seems like you are encountering a runtime error in the Informer model while generating predictions, specifically at line 2029 in `modeling_informer.py`. You have identified that the issue might be related to the `shift=1` parameter at line 2020, which could be causing the tensor shape mismatch.
To address this issue, you can try the following possible solutions:
- Change `shift=1` at line 2020 to `shift=0`: update the `shift` parameter to `0` at line 2020 in `modeling_informer.py`. This change may help align tensor shapes correctly during prediction generation.
- Adjust `repeated_features` at line 2026: instead of using `k+1` in line 2026 for `repeated_features`, consider using `k` so that the tensor shapes are consistent with `reshaped_lagged_sequence`.
- Seek clarification on the author's intent: if you are unsure about the reasoning behind the `shift` parameter or the use of `k+1`, refer to the documentation or reach out to the author or the community for clarification on the intended design and behavior.
By implementing one of these solutions, or a combination of them, you may be able to resolve the tensor shape mismatch and avoid the `RuntimeError` during prediction generation in the Informer model. Testing the changes and observing the model's behavior afterwards will help confirm whether the issue has been addressed.
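The shape bookkeeping behind the second suggestion can be sketched with placeholder arrays. All sizes below (batch, feature widths, the value of `k`) are invented for illustration and are not taken from `modeling_informer.py`; only the dimension-1 accounting mirrors the reported mismatch:

```python
import numpy as np

# Illustrative shapes only: batch=2, k=4 decoded steps so far,
# feature widths 3 and 5 are made up for this sketch.
k = 4
reshaped_lagged_sequence = np.zeros((2, k, 3))
repeated_features = np.zeros((2, 10, 5))

# Slicing repeated_features up to k+1 yields length 5 at dimension 1
# while the lagged sequence has length 4, so concatenating along the
# feature axis would fail. Slicing up to k keeps both at length 4:
decoder_input = np.concatenate(
    [reshaped_lagged_sequence, repeated_features[:, :k]], axis=-1
)
print(decoder_input.shape)  # (2, 4, 8)
```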
In the context of the error you encountered while generating predictions with an Informer model, the issue appears to be related to the manipulation of tensor shapes and, specifically, to the handling of the `shift` parameter in the code snippet provided.
After reviewing the relevant portion of the code in the Transformers library's `modeling_informer.py` file and considering the observed error message, it appears that the discrepancy in tensor sizes leading to the `RuntimeError` could be attributed to the way the `shift` parameter is utilized within the code.
Here are some possible solutions and considerations for addressing this issue:
- Investigating the `shift` parameter (`shift=1`):
  - The `shift` parameter in line 2020 of the code snippet you referenced may indeed affect the tensor shapes, potentially causing dimension mismatches during prediction generation.
  - Setting `shift=0`, or adjusting the logic around how `shift` is used, may align the tensor sizes properly and eliminate the mismatch.
- Reviewing the tensor reshaping logic:
  - The reshaping of tensors, specifically `reshaped_lagged_sequence` and `repeated_features`, plays a critical role in keeping tensor sizes consistent for subsequent operations.
  - Ensure that the reshaping logic in lines 2020 and 2026 handles the dimensions of the tensors involved appropriately, to prevent size discrepancies.
- Author's intuition on the `shift` parameter:
  - Understanding the author's intent behind the `shift` parameter can provide insight into the rationale for a specific value (such as `shift=1`) and its impact on tensor operations.
  - Consider revisiting the documentation or comments related to `shift` in the code to clarify its intended purpose and how it influences tensor manipulations.
- Testing and validation:
  - Experiment with different values of `shift` (e.g., 0 and 1) in line 2020 and observe how they affect tensor shapes and the overall prediction generation process.
  - Validate the model's behavior and predictions after the adjustment to confirm that the dimension mismatch has been resolved.
In summary: by investigating the logic surrounding the `shift` parameter, reviewing the tensor reshaping procedures, and considering the author's reasoning behind its usage, you can address the dimension mismatch encountered during prediction generation with the Informer model. Experimenting with different values of `shift` and ensuring consistent tensor shapes should resolve the `RuntimeError`.
cc @kashif
@jhzsquared so the intention was that the model is learning the next step's distribution given the past as well as the covariates up till the time step at which one is forecasting...
can you paste in your `lag_seq` vectors that you are using?
I'm not using any lag right now, so have an initial model input `lags_sequence = [0]`.

And thanks! Conceptually that makes sense... functionally though, when `k=0`, the `get_lagged_subsequences` function with `shift=1` manifests as a `context_length`-sized tensor at dimension 1, while `repeated_features[:,k+1]` is always size `k+1` at dimension 1, of course. When `k>0`, the `lagged_sequence` shape at dimension 1 is always 1 less than the size of the corresponding subset of `repeated_features` it is supposed to be combined with at line 2026.
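The off-by-one described above can be reproduced with a toy re-implementation of the slicing that `get_lagged_subsequences` performs. This is a simplified sketch, not the exact Hugging Face code, and the sequence length and `subsequences_length` below are placeholder values:

```python
import numpy as np

def lagged_subsequences(sequence, lags, subsequences_length, shift=0):
    # Simplified sketch of the slicing inside get_lagged_subsequences:
    # each requested lag is shifted, then a window of
    # subsequences_length is cut relative to the end of the sequence.
    indices = [lag - shift for lag in lags]
    slices = []
    for lag_index in indices:
        begin = -lag_index - subsequences_length
        end = -lag_index if lag_index > 0 else None
        slices.append(sequence[:, begin:end])
    return np.stack(slices, axis=-1)

seq = np.arange(10, dtype=float).reshape(1, 10)  # (batch=1, time=10)

# lags_sequence=[0] with shift=1 -> shifted index -1 -> window is one
# step short of the requested length 5:
print(lagged_subsequences(seq, lags=[0], subsequences_length=5, shift=1).shape)
# (1, 4, 1)

# lags_sequence=[1] with shift=1 -> shifted index 0 -> expected length 5:
print(lagged_subsequences(seq, lags=[1], subsequences_length=5, shift=1).shape)
# (1, 5, 1)
```

This matches the report: with `[0]` and `shift=1` the lagged window is always one step shorter than the features it is concatenated with, while `[1]` lines the shapes up.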
Ohh okay, did not realize that should have been `[1]`. That fixed it! Thank you so much!
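For reference, the resolved setup can be sketched via `InformerConfig` from `transformers`. The `prediction_length` and `context_length` values here are placeholders, not numbers from this thread:

```python
from transformers import InformerConfig

# With shift=1 the smallest usable lag is 1, so use lags_sequence=[1]
# instead of [0], and allow one extra time step of history to cover
# that lag (placeholder lengths below).
config = InformerConfig(
    prediction_length=24,
    context_length=25,   # one extra step for the lag of 1
    lags_sequence=[1],   # minimum lag; [0] triggers the shape mismatch
)
print(config.lags_sequence)  # [1]
```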