
da-rnn's Introduction

Hey! I'm Sean Aubin!

I'm a punk trained in Electrical Engineering and Theoretical Neuroscience, which means I'm a data generalist competent in Machine Learning. I have a blog that usually updates once a year.

Further details about my professional experience are (unfortunately) on LinkedIn.

Feel free to contact me via email or LinkedIn. I'm trying to avoid using Twitter and have not yet adopted another social network.

Computery Interests

  • Dynamicland: Making computing collaborative and humane by embodying it in the world.
  • Zig: Replacing C for low-level computation.
  • Oils: An upgrade path from Bash to a better shell language.

Other Interests

  • Social System Design: Designing systems (both digital and in-person) which allow for meaningful collaborations.

AFK

I run the Toronto Applied Rationality Meetup where we discuss complex topics with compassion, nuance, and mindfulness.

da-rnn's Issues

It's weird that this code only performs well when predicting 'NDX'

I found that this da-rnn can only predict 'NDX'.
If I try to use it to predict other columns such as 'YHOO' or 'XLNX', the results are bad.
And here is the weirdest thing: I modified it to do single-variable prediction (e.g. using 'NDX' as both the input and the target), and it still only works well on 'NDX'. The results on other stocks are bad (the loss doesn't decrease).
Can this be explained?
[Figure: the performance on 'NDX']
[Figure: the performance on 'AAL']

Data overlapping in train/test split

The current version of the predict function creates overlapping first-element batch indexes for the train and test X and y_history tensors: the last item of X in train is the first item of X in test. Also, because of the gap between y_hist and y_targ mentioned in issue #4, one sequence is missing from the last chunk of the split y_pred. For example, with a dummy dataset whose targets are the numbers 1 to 60, the last item in the last batch would be 58 with y_targ = [60], leaving the time window for 59 out.
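
One way to picture a split without this problem is sketched below. It is not the repository's code: `window_indices` and `split_windows` are hypothetical helpers, and it assumes each target at row t is predicted from the T - 1 rows immediately before it. The windows are split by target index, so no target lands in both sets and no window is skipped.

```python
def window_indices(n_rows, T):
    """Return (history_slice, target_index) for every row that has T - 1 rows before it."""
    return [(slice(t - (T - 1), t), t) for t in range(T - 1, n_rows)]


def split_windows(n_rows, T, train_size):
    """Split windows by target index so train and test targets never overlap."""
    windows = window_indices(n_rows, T)
    train = [w for w in windows if w[1] < train_size]
    test = [w for w in windows if w[1] >= train_size]
    return train, test


# Dummy dataset from the issue: 60 rows (row index 59 holds the value 60).
train, test = split_windows(n_rows=60, T=10, train_size=42)
print(len(train), len(test))  # number of train and test windows
print(test[-1])               # last window: history rows 50..58, target row 59
```

Note that the first test windows still read history rows that belong to the training range; only the targets are kept disjoint here.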

Error using CUDA

I get this error when trying to use the code with GPU (it works fine with CPU):

2019-05-14 19:51:48,675 - VOC_TOPICS - INFO - Shape of data: (40560, 82).
Missing in data: 0.
2019-05-14 19:51:48,785 - VOC_TOPICS - INFO - Training size: 28392.
2019-05-14 19:51:51,329 - VOC_TOPICS - INFO - Iterations per epoch: 221.812 ~ 222.
Traceback (most recent call last):
  File "main.py", line 196, in <module>
    iter_loss, epoch_loss = train(model, data, config, n_epochs=10, save_plots=save_plots)
  File "main.py", line 84, in train
    loss = train_iteration(net, t_cfg.loss_func, feats, y_history, y_target)
  File "main.py", line 143, in train_iteration
    input_weighted, input_encoded = t_net.encoder(numpy_to_tvar(X))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/da-rnn/da-rnn/modules.py", line 43, in forward
    input_data.permute(0, 2, 1)), dim=2)  # batch_size * input_size * (2*hidden_size + T - 1)
RuntimeError: Expected object of backend CPU but got backend CUDA for sequence element 2 in sequence argument at position #1 'tensors'
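
The message suggests that the third tensor passed to torch.cat (the input data) is on the GPU while the first two are still on the CPU. A minimal sketch of one possible fix follows, assuming the first two tensors are the encoder's initial hidden and cell states. The sizes and variable names are made up for illustration and are not the repository's.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative sizes only.
batch_size, input_size, hidden_size, T = 4, 3, 8, 10
input_data = torch.randn(batch_size, T - 1, input_size, device=device)

# Create the initial states on the same device as the input instead of the default (CPU).
hidden = torch.zeros(1, batch_size, hidden_size, device=input_data.device)
cell = torch.zeros(1, batch_size, hidden_size, device=input_data.device)

# Every tensor given to torch.cat now lives on one device.
x = torch.cat(
    (
        hidden.repeat(input_size, 1, 1).permute(1, 0, 2),
        cell.repeat(input_size, 1, 1).permute(1, 0, 2),
        input_data.permute(0, 2, 1),
    ),
    dim=2,
)  # batch_size * input_size * (2*hidden_size + T - 1)
print(x.shape, x.device)
```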

RuntimeError

win7 64bit, python 3.7.0, cuda 9

Warning (from warnings module):
  File "E:\python370\lib\site-packages\sklearn\externals\joblib\externals\cloudpickle\cloudpickle.py", line 47
    import imp
DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
2018-11-06 22:37:50,303 - VOC_TOPICS - INFO - Using computation device: cuda:0
2018-11-06 22:37:52,505 - VOC_TOPICS - INFO - Shape of data: (40560, 82).
Missing in data: 0.
2018-11-06 22:37:52,637 - VOC_TOPICS - INFO - Training size: 28392.
2018-11-06 22:38:02,391 - VOC_TOPICS - INFO - Iterations per epoch: 221.812 ~ 222.
Traceback (most recent call last):
  File "E:\python370\Scripts\da-rnn-master\main.py", line 196, in <module>
    iter_loss, epoch_loss = train(model, data, config, n_epochs=10, save_plots=save_plots)
  File "E:\python370\Scripts\da-rnn-master\main.py", line 84, in train
    loss = train_iteration(net, t_cfg.loss_func, feats, y_history, y_target)
  File "E:\python370\Scripts\da-rnn-master\main.py", line 143, in train_iteration
    input_weighted, input_encoded = t_net.encoder(numpy_to_tvar(X))
  File "E:\python370\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "E:\python370\Scripts\da-rnn-master\modules.py", line 43, in forward
    input_data.permute(0, 2, 1)), dim=2) # batch_size * input_size * (2*hidden_size + T - 1)
RuntimeError: Expected a Tensor of type torch.FloatTensor but found a type torch.cuda.FloatTensor for sequence element 2 in sequence argument at position #1 'tensors'

Regarding scaling of data

I have seen that standardscaler.fit(X) is being used, which scales the entire data set. But the usual practice is to fit on the training data and apply the same mean and standard deviation to the test and validation data sets. I am new to this field and don't know how to preprocess time series data. Kindly reply.
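
The practice described above can be sketched as follows. This is a NumPy illustration rather than the repository's preprocessing: `fit_scaler` and `transform` are hypothetical helpers, the data is random, and the split is chronological (the first 70 rows are treated as training data).

```python
import numpy as np


def fit_scaler(train):
    """Compute per-column mean and standard deviation from the training rows only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return mean, std


def transform(x, mean, std):
    """Standardize x with statistics that were fitted elsewhere."""
    return (x - mean) / std


rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # made-up series, 3 columns

train_size = 70
train, test = data[:train_size], data[train_size:]

mean, std = fit_scaler(train)                # statistics come from the training split
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)     # test rows reuse the training statistics
```

The test split will generally not have exactly zero mean and unit variance after this, which is expected: no information from the test period is used to choose the scaling.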

Is there any room for gpu memory improvements?

Hi, it looks like the loop inside the decoder consumes too much gpu memory very quickly if you try to increase history size because we keep tracking hidden and cell units. I wonder if there is room for any improvements. Would it kill the purpose of the model if we detach some things somehow?

Does this model generalize well on your (other) dataset?

Thank you very much for the implementation. I wonder whether anyone has applied this method to other datasets, and how it performed.

When I apply this method to my datasets (a traffic dataset and a disease dataset), there are two problems: 1. the loss is very large, i.e., the model cannot learn the pattern and is much worse than a vanilla LSTM, which is weird. 2. in some cases, the val loss drops quickly but then increases explosively (within 1-2 epochs).

I have tried using the minmax scaler and gradient norm clipping to address the problem, but these don't work. As the encoder uses the whole sequence for attention, T cannot be too big, which limits the information input to the model. But I still think it is hard to tune this model on other datasets. Does anyone have a similar experience?

Regarding predicting the output

The paper treats this as a NARX problem, and as such the predicted value at a time step should be used for the next prediction. But here it is not used. When this code is used as is, the predictions are very good. But when I modified the code to use predicted values for the next prediction, it did not give good results. According to the paper, their RMSE is very low even under this setting, yet I couldn't find any difference between the implementation and the paper.
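
For reference, feeding predictions back in looks roughly like the sketch below. It is a toy illustration, not the repository's decoder: `rollout` and `linear_extrapolation` are hypothetical names, the one-step predictor is a stand-in for the trained model, and the exogenous inputs are left out entirely.

```python
from collections import deque


def rollout(predict_one_step, y_history, n_steps):
    """Predict n_steps ahead, appending each prediction to the history window."""
    window = deque(y_history, maxlen=len(y_history))  # oldest value drops out on append
    preds = []
    for _ in range(n_steps):
        y_next = predict_one_step(list(window))
        preds.append(y_next)
        window.append(y_next)  # the prediction replaces the unavailable true value
    return preds


def linear_extrapolation(history):
    """Toy one-step predictor: last value plus the last observed difference."""
    return history[-1] + (history[-1] - history[-2])


print(rollout(linear_extrapolation, [1.0, 2.0, 3.0], n_steps=3))  # [4.0, 5.0, 6.0]
```

Errors compound in this setting because every later step is conditioned on earlier predictions, so a gap to the one-step results is not surprising in itself.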

I got an error when running main_predict.py after running main.py successfully

Traceback (most recent call last):
  File "D:/graduate/Code/Research/DA_RNN/Seanny123/main_predict.py", line 74, in <module>
    final_y_pred = predict(enc, dec, data, **da_rnn_kwargs)
  File "D:/graduate/Code/Research/DA_RNN/Seanny123/main_predict.py", line 49, in predict
    y_pred[y_slc] = decoder(input_encoded, y_history).cpu().data.numpy()
  File "B:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\graduate\Code\Research\DA_RNN\Seanny123\modules.py", line 108, in forward
    y_tilde = self.fc(torch.cat((context, y_history[:, t]), dim=1)) # (batch_size, out_size)
  File "B:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "B:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "B:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py", line 1369, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [128 x 40624], m2: [65 x 1] at C:\w\1\s\tmp_conda_3.7_021303\conda\conda-bld\pytorch_1565316900252\work\aten\src\TH/generic/THTensorMath.cpp:752
I have read the similar issue #2, but there isn't an "unsqueeze(1)" in the newest code.
I have tried adding unsqueeze(1) back at the same location as in the old code (like Chandler Zuo did), but then neither main.py nor main_predict.py runs correctly.
Could someone tell me how I can solve it, please?

Thank you.

Why not use tanh in the encoder when it is used in the decoder?

Firstly, thanks for your code; it really helps me a lot to understand the paper.
But when I debugged the code, I found that in modules.py Seanny used tanh in the decoder while omitting it in the encoder, whereas in the paper formulas 8 and 12 both use tanh to calculate part of the attention weight.
I don't know why. Can anybody offer some help? Thanks in advance!
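
For readers without the paper at hand, the two attention scores referred to above are, as far as I recall them, of the following form. The symbol names are my transcription and should be checked against the paper itself; x^k is the k-th driving series, h and s are the encoder hidden and cell states, and d and s' are the decoder hidden and cell states.

```latex
% Input attention score (formula 8 in the paper, as recalled)
e_t^k = \mathbf{v}_e^\top \tanh\left( \mathbf{W}_e [\mathbf{h}_{t-1}; \mathbf{s}_{t-1}] + \mathbf{U}_e \mathbf{x}^k \right)

% Temporal attention score (formula 12 in the paper, as recalled)
l_t^i = \mathbf{v}_d^\top \tanh\left( \mathbf{W}_d [\mathbf{d}_{t-1}; \mathbf{s}'_{t-1}] + \mathbf{U}_d \mathbf{h}_i \right)
```

Both scores pass through tanh before the softmax, which is the point the question is about.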

Why use companies' stock prices to predict the NASDAQ-100 Index?

In your code at main.py line 189, the code for getting the data is below. Why do you use companies' stock prices to predict the NASDAQ-100 Index?

raw_data = pd.read_csv(os.path.join("data", "nasdaq100_padding.csv"), nrows=100 if debug else None)
logger.info(f"Shape of data: {raw_data.shape}.\nMissing in data: {raw_data.isnull().sum().sum()}.")
targ_cols = ("NDX",)
data, scaler = preprocess_data(raw_data, targ_cols)

NDX should be calculated from these stock prices, shouldn't it? Why do you have to learn the calculation formula with an RNN?
The DA-RNN paper gives a time series prediction model, right? But where is your time series prediction? I am confused.

That's what I found when I read the code repeatedly. If I got something wrong or missed something, please tell me.
Thank you.

Error while executing main_predict.py

Got this error trying just to run code from github:
Traceback (most recent call last):
  File "main_predict.py", line 76, in <module>
    final_y_pred = predict(enc, dec, data, **da_rnn_kwargs)
  File "main_predict.py", line 51, in predict
    y_pred[y_slc] = decoder(input_encoded, y_history).cpu().data.numpy()
  File "/home/halcyon/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/halcyon/darnn/modules.py", line 106, in forward
    y_tilde = self.fc(torch.cat((context, y_history[:, t]), dim=1)) # (batch_size, out_size)
  File "/home/halcyon/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/halcyon/anaconda3/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/halcyon/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1024, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [128 x 40624], m2: [65 x 1] at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/THC/generic/THCTensorMathBlas.cu:249
Running main.py gives no error, so it looks like the tensor sizes somehow got mismatched at the prediction stage.
The only modification I've made to the code is adding this line at the beginning:
torch.set_default_tensor_type('torch.cuda.FloatTensor')
Without that line I got a 'type mismatch' from torch.

tensor problem

Hi, I got this error:
RuntimeError: All input tensors must be on the same device. Received cpu and cuda:0

Any ideas?

Are current values of external factors used for prediction in your code?

I am confused about Chandler's change to the code. He wrote in his blog: "Unlike the experiment presented in the paper, which uses the contemporary values of exogenous factors to predict the target variable, I exclude them." Does this mean that, instead of inputting synchronous external factors when predicting the target series, he only used the past values of all external series to predict the target time series? That is, current values of the external factors are not used for prediction in his code?

Multi-Step Prediction

With the current implementation (afaik) only a single step will be predicted. A modification for the Decoder part would be great

How many epochs should I choose?

I noticed that Seanny used 10 epochs in this project, but the training loss is actually still falling at that point.
So I want to ask: in this project, how many epochs should I choose?
If I use 200 epochs to train on my data, it takes quite a long time. How can I speed it up? If you know how to change this project from CPU to GPU, could you please reply in detail?
Thank you in advance!
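
One common way to avoid picking the epoch count by hand is early stopping on a validation loss. The sketch below is generic and not part of this project: `train_with_early_stopping` is a hypothetical helper, and `run_epoch` stands in for a function that trains one epoch and returns the validation loss.

```python
def train_with_early_stopping(run_epoch, max_epochs=200, patience=5, min_delta=0.0):
    """Stop once the validation loss has not improved for `patience` epochs in a row."""
    best = float("inf")
    stale = 0
    history = []
    for epoch in range(max_epochs):
        val_loss = run_epoch(epoch)
        history.append(val_loss)
        if val_loss < best - min_delta:
            best = val_loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, history


# Made-up validation losses standing in for real training.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74]
best, history = train_with_early_stopping(
    lambda epoch: losses[min(epoch, len(losses) - 1)], patience=3
)
print(best, len(history))  # 0.7 6
```

With this, 200 becomes an upper bound rather than the number of epochs that always run.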

Evaluation mode missing on validation and predict

This section in predict()

da-rnn/main.py

Lines 178 to 181 in 8585806

y_history = numpy_to_tvar(y_history)
_, input_encoded = t_net.encoder(numpy_to_tvar(X))
y_pred[y_slc] = t_net.decoder(input_encoded, y_history).cpu().data.numpy()

should be changed to

with torch.no_grad():
  y_history = numpy_to_tvar(y_history)
  _, input_encoded = t_net.encoder(numpy_to_tvar(X))
  y_pred[y_slc] = t_net.decoder(input_encoded, y_history).cpu().data.numpy()

This is required to disable any gradient calculation during validation / prediction in the training loop. From my understanding, if it is not included, it will update the weights on each validation / prediction iteration.

[Figure: original final_prediction output]

[Figure: final_prediction output with grad turned off during validation / prediction]
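
A small standalone sketch of the evaluation pattern is below; the model is a made-up stand-in, not the DA-RNN encoder/decoder. As far as I know, torch.no_grad() does not by itself prevent weight updates (those only happen in optimizer.step()); it skips building the autograd graph, which saves memory and time. Switching layers such as dropout to inference behaviour is a separate step done with eval().

```python
import torch
from torch import nn

# Stand-in model with a dropout layer, so eval() has a visible effect.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
x = torch.randn(16, 4)

model.eval()                # dropout (and similar layers) switch to inference behaviour
with torch.no_grad():       # no autograd graph is recorded inside this block
    y_pred = model(x)
    y_again = model(x)
model.train()               # restore training mode before the next training iteration

print(y_pred.requires_grad)           # False
print(torch.equal(y_pred, y_again))   # True, because dropout is inactive in eval mode
```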
