
diplomathesis's Introduction

DiplomaThesis

Can Machines Explain Stock Returns?

Abstract

Recent research shows that neural networks predict stock returns better than any other model. The networks' mathematically complicated nature is both their advantage, enabling them to uncover complex patterns, and their curse, making them less readily interpretable, which obscures their strengths and weaknesses and complicates their usage. This thesis is one of the first attempts at overcoming this curse in the domain of stock return prediction. Using some of the recently developed machine learning interpretability methods, it explains the networks' superior return forecasts. This gives new answers to the long-standing question of which variables explain differences in stock returns and clarifies the unparalleled ability of networks to identify future winners and losers among the stocks in the market. Building on 50 years of asset pricing research, this thesis is likely the first to uncover whether neural networks support the economic mechanisms proposed by the literature. To a finance practitioner, the thesis offers the transparency of decomposing any prediction into its drivers, while maintaining state-of-the-art profitability in terms of Sharpe ratio. Additionally, a novel metric is proposed that is particularly suited to interpreting return-predicting networks in financial practice. This thesis offers a usable and economically explainable account of how machines make stock return predictions.

diplomathesis's People

Contributors

karolinachalupova


diplomathesis's Issues

Correlated features problems

  • Feature importance measures are calculated by nulling a single input feature. But some features are highly correlated (calculated slightly differently from the same underlying data), so it does not make sense to leave the correlated features un-nulled while one of them is nulled.
  • A possible way around this is to choose only a single feature from each group of highly correlated features. I am considering doing this manually, using the correlation matrix and my own judgment to decide.
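The manual selection could be supported by a small helper that greedily keeps one representative per correlated group. This is only a sketch: the function name, the scan order, and the 0.9 threshold are illustrative assumptions, not part of the thesis pipeline.

```python
import pandas as pd

def select_representatives(corr: pd.DataFrame, threshold: float = 0.9):
    """Greedily keep one feature per group of highly correlated features.

    Features are scanned in column order; a feature is dropped if its
    absolute correlation with any already-kept feature exceeds `threshold`.
    """
    kept = []
    for col in corr.columns:
        if all(abs(corr.loc[col, k]) <= threshold for k in kept):
            kept.append(col)
    return kept

# Toy example: f1 and f2 are near-duplicates, f3 is independent.
corr = pd.DataFrame(
    [[1.0, 0.98, 0.1],
     [0.98, 1.0, 0.2],
     [0.1, 0.2, 1.0]],
    index=["f1", "f2", "f3"], columns=["f1", "f2", "f3"])
print(select_representatives(corr))  # ['f1', 'f3']
```

Scanning in a fixed order makes the choice of representative deterministic; the manual, correlation-matrix-plus-judgment approach described above could then be used to override individual picks.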

Use smaller number of features

It might also make sense to use a smaller set of features, because across Kelly's work (Gu et al., 2018), and in fact across the rest of the literature, the models seem to largely agree on which variables are important, so for now I could take those as given. With fewer features, the interpretability results would be much easier to survey, which I think is good, because both I and the reader will find them easier to navigate. I would aim for around 30 features instead of the current 150.

@martinhronec what do you think?

Can I somehow reduce time dimension to make my life easier?

Right now, the same model is trained multiple times (call it T times) on an expanding window, so there are essentially T models and T corresponding test sets (same as Gu et al., 2018; we have discussed this, @martinhronec). This makes sense from the perspective of using the model in practice for trading.
However, it is burdensome for purposes of interpretability. Given that I would like to try more architectures (simpler to more complex, as in Gu et al., 2018) and different seeds, there are a lot of models: the number of models is (number of architectures) × (number of seeds) × T. It would make my life easier if I could have only T = 1, in other words, a single train-validation-test split for the whole data. It would be a lot easier to code as well as to handle the results.
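For concreteness, the expanding-window scheme can be sketched as below; the function and the window lengths are illustrative placeholders, not the actual configuration used in the thesis.

```python
def expanding_window_splits(first_year, last_year, train_min, val_years):
    """Yield (train, val, test) year lists for an expanding window.

    The training window grows by one year per split; validation and test
    windows slide forward. This mirrors the Gu et al. (2018)-style scheme
    only in spirit -- the exact window lengths here are placeholders.
    """
    splits = []
    test_year = first_year + train_min + val_years
    while test_year <= last_year:
        train = list(range(first_year, test_year - val_years))
        val = list(range(test_year - val_years, test_year))
        splits.append((train, val, [test_year]))
        test_year += 1
    return splits

# Small sample: data 2000-2006, at least 3 training years, 2 validation years.
for train, val, test in expanding_window_splits(2000, 2006, 3, 2):
    print(len(train), val, test)
```

Each element of the returned list corresponds to one of the T retrainings; collapsing to T = 1 would mean keeping only the last split.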
The question is: can I do it? There are two ways of doing this:

  • ASSUMING IT: getting permission to do it from prior literature, e.g. Gu et al. (2018). Is the model stable in time? I need to look.
  • SHOWING IT: demonstrating myself that the model is stable in time. This means showing, e.g., that feature importance is stable in time.
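The SHOWING IT route could start with a simple check like the one below: rank-correlate the feature-importance vectors of consecutive expanding-window models. The helper name is hypothetical, and Spearman correlation is computed via manual ranking to stay dependency-light; values near 1 across all pairs would support time stability.

```python
import numpy as np

def importance_stability(importances):
    """Spearman rank correlation of feature importances between
    consecutive time-window models.

    `importances`: list of 1-D arrays, one per time window.
    Returns one correlation per consecutive pair of windows.
    """
    def rank(v):
        order = np.argsort(v)
        ranks = np.empty(len(v), dtype=float)
        ranks[order] = np.arange(len(v))
        return ranks

    corrs = []
    for a, b in zip(importances[:-1], importances[1:]):
        ra, rb = rank(np.asarray(a)), rank(np.asarray(b))
        corrs.append(float(np.corrcoef(ra, rb)[0, 1]))
    return corrs

# Three windows whose importance rankings are identical.
print(importance_stability([[0.5, 0.3, 0.2],
                            [0.6, 0.3, 0.1],
                            [0.4, 0.35, 0.25]]))
```

Rank correlation (rather than Pearson on raw importances) tolerates the importance scale drifting over time, as long as the ordering of features is preserved.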

@martinhronec any thoughts?

Liquidity filter - do I have the right data from @martinhronec?

@martinhronec Tomas mentioned that there was a problem with the liquidity filter. I do not have the code of your liquidity filter, just the dscodes that pass it. Tomas said that the filter was stricter than it should be. I just want to make sure I have the right data. Even if I don't, this may be immaterial, as the filter is not wrong, just strict.

A few models predict the same number for all inputs (just for some seeds). What should I do?

@martinhronec I took a closer look at the models by random seed, for models trained on 12 years and models trained on 13 years.
I discovered that a few of them (3 of 592 models, i.e. about 0.5 percent) learned to predict the same number no matter the input.
It appears in deep models in particular. The following models suffer from the issue:

  • among models trained on 12 years:
    • architecture with 4 hidden layers, 5th seed
    • architecture with 5 hidden layers, 8th seed
  • among models trained on 13 years:
    • architecture with 5 hidden layers, 9th seed

The predicted number is always very close to the mean of the training data.

What do you think I should do about these models?
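Whatever the remedy, a simple automated screen could flag these degenerate runs before they enter any ensemble or importance calculation. The function name and tolerance below are hypothetical, chosen only to illustrate the failure mode described above (near-constant predictions close to the training mean).

```python
import numpy as np

def is_degenerate(predictions, train_mean, tol=1e-6):
    """Flag a model whose predictions collapse to (almost) a single
    value close to the training-set mean."""
    preds = np.asarray(predictions, dtype=float)
    constant = float(preds.std()) < tol          # no variation across inputs
    near_mean = abs(float(preds.mean()) - train_mean) < tol
    return constant and near_mean

print(is_degenerate([0.012, 0.012, 0.012], train_mean=0.012))  # True
print(is_degenerate([0.05, -0.02, 0.01], train_mean=0.012))    # False
```

Run over every (architecture, seed, window) triple, this would catch the 3 affected models automatically instead of by manual inspection.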

Understanding Fisher et al. 2019: All Models are Wrong, but Many are Useful

I think this paper could be crucial. https://arxiv.org/abs/1801.01489
With a single measure:

  • it gives confidence intervals for feature importance
  • it helps understand why ensembles work
  • it helps with the problem of interpreting models with correlated features

If I understand correctly, MR (model reliance) is permutation feature importance (based on the decrease in loss). MCR (model class reliance) provides a confidence interval for MR, using the epsilon-Rashomon set.

  • Issue A: I cannot find a Python implementation. I can try to code it up. There is an R implementation from the authors: https://github.com/aaronjfisher/mcr
  • Issue B: I do not understand which models constitute the empirical epsilon-Rashomon set. What is their general family: different models, architectures, seeds?
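As a starting point for Issue A, MR-style permutation importance is straightforward to sketch in Python. This is only a rough sketch of the idea, not the authors' reference implementation: `model_fn` is a placeholder for any fitted predictor, and MSE stands in for whatever loss the thesis actually uses.

```python
import numpy as np

def permutation_importance(model_fn, X, y, n_repeats=5, seed=0):
    """Permutation feature importance in the spirit of Fisher et al.'s
    model reliance: the increase in MSE when one feature is shuffled.

    `model_fn`: callable mapping an (n, p) array to predictions.
    """
    rng = np.random.default_rng(seed)
    base = np.mean((model_fn(X) - y) ** 2)        # loss on intact data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        losses = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break feature j's link to y
            losses.append(np.mean((model_fn(Xp) - y) ** 2))
        importances[j] = np.mean(losses) - base
    return importances

# Toy model that only uses the first of two features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 2 * X[:, 0]
imp = permutation_importance(lambda A: 2 * A[:, 0], X, y)
print(imp)  # first feature clearly important, second near zero
```

Computing MCR on top of this would then require evaluating MR across the whole epsilon-Rashomon set of near-optimal models, which is exactly the part Issue B asks about.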
