Introduction
- Differences from the article version
Dependencies
Dataset
Build
- Running
- Configurations
Logging
Model
Acknowledgement
Citation
References

Introduction

This repository contains the PyTorch code for our paper “Relation-Aware Transformer for Portfolio Policy Learning”^[1], published at IJCAI 2020.

Differences from the article version

Note that this library is a part of our main project, and it is several versions ahead of the article.

In the decision-making layer, we construct the leverage operation with two softmax heads; its performance is similar to the result of using three softmax heads.

Dependencies

python 3.7.3 (Anaconda)

pytorch 0.4.1.post2

cudnn 7.4.1

Dataset

The raw data of Crypto-A is accessed through Poloniex^[2]. We provide the Crypto-A dataset at the link in ^[3], where data selection follows the method in ^[4]. Please download ./database to the same directory as main.py. The statistics of Crypto-A are summarized below.

Dataset	Assets	Training	Test
Crypto-A	12	2016.01-2017.11	2017.11-2018.01

Build

The file main.py contains the construction of the RAT network, the data preprocessing, and the model fitting and testing processes. The file run_main.sh contains the parameter configurations for training RAT.

Running

cd ${RAT_ROOT}/RAT-master

./run_main.sh

Configurations

The figure shows the entire structure of RAT. Some related parameter configurations in run_main.sh are detailed below.

--x_window_size

The length of the price series.

--local_context_length

The length of local price context.

--daily_interest_rate

The interest rate of the loan for one day.

--log_dir

The directory to save the log.

--model_dir

The directory to save the model.

--model_index

Set a unique ID for the model.
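
As a rough illustration, the flags above could be parsed along these lines. This is a minimal sketch, not the parser in main.py: the flag names come from the list above, while the types, the default values, and the `build_parser` helper are assumptions made for the example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags listed above.

    All types and defaults are assumptions, not values from run_main.sh.
    """
    parser = argparse.ArgumentParser(description="RAT options (sketch)")
    # Length of the price series.
    parser.add_argument("--x_window_size", type=int, default=31)
    # Length of the local price context.
    parser.add_argument("--local_context_length", type=int, default=5)
    # Interest rate of the loan for one day.
    parser.add_argument("--daily_interest_rate", type=float, default=0.001)
    # Directory to save the log.
    parser.add_argument("--log_dir", type=str, default="./log")
    # Directory to save the model.
    parser.add_argument("--model_dir", type=str, default="./model")
    # Unique ID for the model.
    parser.add_argument("--model_index", type=int, default=0)
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--x_window_size", "31", "--model_index", "7"])
print(args.x_window_size, args.local_context_length, args.model_index)
```

Passing an explicit list to `parse_args` keeps the example reproducible; in a script the list would normally be omitted so that the command-line arguments are used.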

Logging

After training, the model is saved in ${SAVE_MODEL_DIR}/${MODEL_INDEX}.pkl. After testing, the backtest results are saved in ${LOG_DIR}/train_summary.csv, which contains metrics such as fAPV, SR, CR and backtest_history.

Model

We provide a model with 495.28 fAPV in https://drive.google.com/drive/folders/11MK2QSj30G9pYE8qx_-80zgCDmJOHS9U?usp=sharing. You can download the model to ${SAVE_MODEL_DIR}/ and test it. When testing the given model, please comment out line 1443 in main.py to avoid the training process.

Acknowledgement

This project is built on the following open-source projects:

PGPortfolio toolbox (https://github.com/ZhengyaoJiang/PGPortfolio)
Zhang Y, Zhao P, Li B, et al. Cost-sensitive portfolio selection via deep reinforcement learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2020 (https://github.com/Vanint/PPN)

Citation

If you use this code for your research, please consider citing:

@inproceedings{xu-relation,
  title = {Relation-Aware Transformer for Portfolio Policy Learning},
  author = {Xu, Ke and Zhang, Yifan and Ye, Deheng and Zhao, Peilin and Tan, Mingkui},
  booktitle = {Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence}, 
  pages = {4647--4653},
  year = {2020},
  month = {7},
  note = {Special Track on AI in FinTech}
}

References

[1] Ke Xu, Yifan Zhang, Deheng Ye, Peilin Zhao, and Mingkui Tan. Relation-Aware Transformer for Portfolio Policy Learning. IJCAI, 2020.

[2] Poloniex’s official API.

[3] Crypto-A dataset.

[4] Zhengyao Jiang, Dixing Xu, and Jinjun Liang. A deep reinforcement learning framework for the financial portfolio management problem. arXiv, 2017.

rat's Issues

Question on 'previous action' implementation

Hi, I have a question on the 'previous action' functionality implementation in the code. I might be wrong, but it seems like you are sampling the previous action from self.__PVM. Every time a new batch is sampled, this function gets called. From the code I can see self.__PVM is initialized here.

The question:

However, it seems that the order of sampling previous actions for the RAT network is:

1. batch=DM.next_batch() --> called when the next batch is requested
2. last_w = self.__PVM.values[indexs-1, :] --> sample the 'previous action' and return it as last_w, using a random starting index (say index A)
3. reshape the 'previous action' as previous_w

    previous_w=torch.tensor(batch_last_w,dtype=torch.float).to(device)
    previous_w=torch.unsqueeze(previous_w,1)

4. out = model.forward(src, currt_price, previous_w, price_series_mask, trg_mask, padding_price) --> pass previous_w to the RAT model
5. set the 'new action' at the same index A in self.__PVM

    def setw(w):
        self.__PVM.iloc[indexs, :] = w

6. repeat by sampling from a new random starting index (say index B).

In this sense, the RAT model is not really receiving a 'previous action', because of the different starting index; it is just receiving the initialization.

I might have missed something, but any clarification on this is appreciated! :)
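
The read-at-index-minus-one, write-at-index pattern described above can be reproduced in a toy form. This is only a sketch in plain Python: the list-based `pvm`, the helpers `read_previous` and `write_current`, and all sizes and weights are made up for illustration and are not taken from main.py.

```python
# Toy portfolio-vector memory (PVM): one weight vector per time step.
# All names, sizes and weights here are illustrative assumptions.
num_steps, num_assets = 6, 3
uniform = [1.0 / num_assets] * num_assets
pvm = [list(uniform) for _ in range(num_steps)]  # initial state


def read_previous(indices):
    """Return the stored weights one step before each sampled index."""
    return [pvm[i - 1] for i in indices]


def write_current(indices, weights):
    """Store the freshly computed weights at the sampled indices."""
    for i, w in zip(indices, weights):
        pvm[i] = list(w)


# A batch starting at index A = 2 reads rows 1..2 and writes rows 2..3.
batch_a = [2, 3]
first_read = read_previous(batch_a)
write_current(batch_a, [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]])

# A later batch starting at index B = 4 reads rows 3..4: row 3 was written
# by batch A, while row 4 still holds the initial uniform vector.
second_read = read_previous([4, 5])
print(first_read)
print(second_read)
```

In this toy version, a read returns something other than the initialization only at positions that an earlier batch has already written.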

How to reproduce the results on more recent data?

Hi,

I am trying to replicate the results of this algo, but every time I try I get very, very poor results. I am wondering how the data was preprocessed. Could you clarify that? Any help is very welcome.

Thanks!

TypeError: Panel() takes no arguments

Hi, I'm getting this error when running with the Crypto-A dataset:

Traceback (most recent call last):
  File "main.py", line 1372, in <module>
    portion_reversed=False                            )
  File "main.py", line 362, in __init__
    features=type_list)
  File "main.py", line 133, in get_global_panel
    panel = pd.Panel(items=features, major_axis=coins, minor_axis=time_index, dtype=np.float32)
TypeError: Panel() takes no arguments
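
This error shows up with recent pandas releases, in which `pd.Panel` has been removed. One possible workaround, sketched under the assumption that only a 3-D float container indexed by feature, coin and time is needed, is a plain NumPy array with lookup tables. The `make_panel` helper and the sample labels are hypothetical and not part of main.py; pinning an older pandas version is the other obvious option.

```python
import numpy as np


def make_panel(features, coins, time_index):
    """Hypothetical stand-in for pd.Panel: a NaN-filled 3-D float32 array
    plus label-to-position lookups for each axis."""
    data = np.full((len(features), len(coins), len(time_index)),
                   np.nan, dtype=np.float32)
    lookup = {
        "feature": {name: i for i, name in enumerate(features)},
        "coin": {name: i for i, name in enumerate(coins)},
        "time": {t: i for i, t in enumerate(time_index)},
    }
    return data, lookup


features = ["close", "high", "low"]
coins = ["ETH", "LTC"]            # sample labels only
time_index = [0, 1800, 3600]      # sample timestamps in seconds

panel, lookup = make_panel(features, coins, time_index)
f = lookup["feature"]["close"]
c = lookup["coin"]["ETH"]
t = lookup["time"][1800]
panel[f, c, t] = 12.5             # write one value by label
print(panel.shape, panel[f, c, t])
```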

Data preparation

Hey,

After analyzing the dataset that is used I found the following information:
The maximum value of each of the following variables has been set to a specific value:
3300 for high, 1500 for low, 2500 for open, 2000 for close.

Were the rows in which any of these variables exceeded its maximum removed, and if so, why?
If that wasn't the case, how was the maximum value for each variable determined?

Thanks

preprocessing - element-wise dividing the prices regarding the last period in the price series

Hi Sir,

In Section 5.1 of the paper there is "...element-wise dividing the prices regarding the last period in the price series." What does this mean? Can you clarify on this?

Thanks a lot!
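
One common reading of that sentence, and it is only an assumption here, is that every price in an input window is divided by the price of the window's final period, so that the last entry becomes 1 and earlier entries are expressed relative to it. A minimal sketch with made-up prices (`normalize_by_last` is a hypothetical helper, not a function from main.py):

```python
def normalize_by_last(window):
    """Divide each price in the window by the price of its last period."""
    last = window[-1]
    return [p / last for p in window]


closes = [90.0, 95.0, 110.0, 100.0]   # made-up closing prices
print(normalize_by_last(closes))      # [0.9, 0.95, 1.1, 1.0]
```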

Repo with refactored code and Binance provider

Hey guys,

I did a fork of that project and made the following changes:

- Updated to the latest python/pandas/torch versions
- Refactored the single file main.py into distinct python modules
- Moved from the Poloniex data provider to Binance
- Made the package pip installable
- Experiments with train / val / test defined in a clearer way in the input configs

Here is the repo:
https://github.com/louisoutin/rat_crypto_trader

I haven't managed to get portfolio values greater than 1 so far (as this post mentions too: #8).

Attention mask unused?

        ass_mask=torch.ones(q_size2*q_size1,1,1,q_size0).cuda()  #[31*128,1,1,11]
        x, self.attn_asset = attention(ass_query, ass_key, ass_value, mask=None, 
                             dropout=self.dropout)

Within MultiHeadedAttention, ass_mask is not passed into the attention method here and appears to be unused. IIUC, the attention mask is necessary to prevent look-ahead bias in the attention mechanism and should be masking off future values when calculating attention.

If this mask is unused, what was its intent? Where is attention being masked? And how should that be applied?
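
For reference, masking off future values along a time axis can be sketched as follows. This is a generic illustration in plain Python, not the repository's `attention` function; `causal_mask`, `masked_softmax` and the scores are made up, and whether such a mask belongs in the call quoted above is left open.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax in which masked-out entries get zero weight."""
    out = []
    for row, keep in zip(scores, mask):
        exps = [math.exp(s) if k else 0.0 for s, k in zip(row, keep)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


scores = [[0.0, 1.0, 2.0],
          [0.5, 0.5, 0.5],
          [2.0, 1.0, 0.0]]
weights = masked_softmax(scores, causal_mask(3))
for row in weights:
    print([round(w, 3) for w in row])
```

Each row still sums to 1, but position i places no weight on positions after i.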

Coin selection

Hi,

The algo selects the coins with the highest volume over the last 30 days. There are some coins that just don't make sense, for example:

A coin that was delisted, or only introduced recently - no data for training.

Another kind of bad coin data is reversed coins; I believe that will really mess things up for ML since the pair is reversed.

Another concern is using coins that are not pegged to the same pair, like ETH/BTC and ETH/USDT. There is supposed to be a zero-risk cash, be that BTC or USDT.

Yet another case is using the same coin pegged to different stable coins, like ETH/USDT and ETH/USDC. Those are actually two pairs but virtually the same data.

Do you filter out those coins? How do you deal with that?

Thanks again,

Jose

How to reproduce the results on Stock Data?

Hi,

Thanks for kindly sharing the code! I see that RAT has achieved fantastic results on S&P500. However, the implementation details for the stock market are not described in the paper. So I am wondering how the hyperparameters were set on S&P500, such as the number of selected stocks and the volume averaging days. Any help is very welcome.

Thanks!

How to reproduce the performance in the paper?

Thanks for the great job!

I get this when I run run_main.sh.

Not up to the performance in the paper.

How can I achieve the performance in the paper?

SP500 data: loss and portfolio_value is nan

I have tried to run this code on SP500 data, after changing some of the SQL statements. But when I run the code, the loss and portfolio_value are nan. My data and running result are shown below.

I would like to know whether the data needs to be preprocessed (normalization or anything else) before running the code.

Looking forward to your reply!

Total asset from three-headed output

Hi, I have a question about total wealth.
In your work, the output is the sum of three heads: the initial, short, and reinvest vectors.

The total sum of the three heads is 1 + (-1) + 1 = 1.
But this value does not reflect the total asset volume.
The absolute asset is 3. So, I think it's like managing three times the assets.

(Of course, if an asset has non-zero outputs in both the positive and the negative heads, the absolute values will partly offset each other, because the final portfolio vector is obtained by addition.
However, if you end up with any short position, you will always use more than 1.)

For example, for 3 stocks, the final portfolio vector at time t looks like this:
A_t = [0.8, -0.3, 0.5].
Abs(A_t) = 0.8 + -(-0.3) + 0.5 = 1.6.
So, the final portfolio vector has to be scaled by 1.6.
The proper portfolio vector is
A_t’ = [0.8/1.6, -0.3/1.6, 0.5/1.6].

If I misunderstood something, please let me know.
Thank you!
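
The rescaling proposed in this example can be written out as a short sketch. The `scale_to_unit_exposure` helper is hypothetical and only reproduces the arithmetic above; it is not taken from main.py or from the paper.

```python
def scale_to_unit_exposure(weights):
    """Divide a signed weight vector by the sum of its absolute values."""
    gross = sum(abs(w) for w in weights)
    return [w / gross for w in weights], gross


a_t = [0.8, -0.3, 0.5]            # the example vector from the post
scaled, gross = scale_to_unit_exposure(a_t)
print(gross)                      # sum of absolute weights, about 1.6
print(scaled)                     # roughly [0.5, -0.1875, 0.3125]
```

After scaling, the absolute values of the weights sum to 1, while the signed sum drops from 1 to 1/1.6.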

I am trying to run your code, but it fails with the following exception:
Exception: Have to be online

I use a conda env with same versions as defined in your readme and run ./run_main.sh.

Thanks a lot,
Louis