LOBFrame

We release LOBFrame (see the two papers Deep Limit Order Book Forecasting and HLOB - Information Persistence and Structure in Limit Order Books), a novel, open-source code base for processing large-scale Limit Order Book (LOB) data. The framework integrates recent insights from scientific research (see Lucchese et al., Prata et al.) into a cohesive system. Its strength lies in the completeness of the implemented pipeline, which covers data transformation and processing, a fast implementation of the training, validation, and testing steps, and the evaluation of a model's outputs through trading simulations. It is also designed to accommodate new models, so it can be adapted to future advancements in the field.

Introduction

In this tutorial, we show how to replicate the experiments presented in the two papers titled "Deep Limit Order Book Forecasting: A microstructural guide" and "HLOB - Information Persistence and Structure in Limit Order Books".

Before starting, please remember to ALWAYS CITE OUR WORKS as follows:

@article{briola2024deep,
  title={Deep Limit Order Book Forecasting},
  author={Briola, Antonio and Bartolucci, Silvia and Aste, Tomaso},
  journal={arXiv preprint arXiv:2403.09267},
  year={2024}
}
@misc{briola2024hlob,
      title={HLOB -- Information Persistence and Structure in Limit Order Books}, 
      author={Antonio Briola and Silvia Bartolucci and Tomaso Aste},
      year={2024},
      eprint={2405.18938},
      archivePrefix={arXiv},
      primaryClass={q-fin.TR}
}

Pre-requisites

Install the required packages:

pip3 install -r requirements.txt

If you are using macOS, install from the macOS-specific requirements file instead:

pip3 install -r requirements_mac_os.txt
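
If the installation is scripted, the choice between the two requirements files can be made automatically. The following is a minimal sketch, not part of the repository; the helper name `requirements_file` is hypothetical, and it relies on `platform.system()` returning `"Darwin"` on macOS.

```python
import platform


def requirements_file(system_name=None):
    """Return the requirements file matching the operating system.

    `system_name` defaults to the value reported by the running interpreter.
    This helper is hypothetical and only illustrates the choice described above.
    """
    system_name = system_name or platform.system()
    if system_name == "Darwin":  # macOS
        return "requirements_mac_os.txt"
    return "requirements.txt"


# Print the install command for the current machine.
print(f"pip3 install -r {requirements_file()}")
```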

Data

All the code in this repository uses LOBSTER data. For an overview of the data structure, please refer to the official LOBSTER documentation.
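
To give a feel for the format, the sketch below parses a LOBSTER orderbook file using only the standard library. It is an illustration, not the parsing code used by this repository. It assumes the layout described in the LOBSTER documentation: a header-less CSV with four columns per level (ask price, ask size, bid price, bid size) and prices stored as integers equal to the dollar price times 10000. The column labels and the functions `orderbook_columns`, `read_orderbook` and `mid_price` are hypothetical names, and the two sample rows are made up.

```python
import csv
import io


def orderbook_columns(levels):
    """Build column labels for an orderbook file with `levels` price levels."""
    columns = []
    for level in range(1, levels + 1):
        columns += [
            f"ask_price_{level}",
            f"ask_size_{level}",
            f"bid_price_{level}",
            f"bid_size_{level}",
        ]
    return columns


def read_orderbook(handle, levels):
    """Yield one dictionary per snapshot, with prices converted to dollars."""
    names = orderbook_columns(levels)
    for row in csv.reader(handle):
        snapshot = {}
        for name, value in zip(names, row):
            # Assumption: prices are stored as dollars * 10000.
            snapshot[name] = int(value) / 10000 if "price" in name else int(value)
        yield snapshot


def mid_price(snapshot):
    """Average of the best ask and the best bid."""
    return (snapshot["ask_price_1"] + snapshot["bid_price_1"]) / 2


# Two made-up snapshots with a single level each.
sample = io.StringIO("5859400,200,5853300,18\n5859500,100,5853300,18\n")
snapshots = list(read_orderbook(sample, levels=1))
print(mid_price(snapshots[0]))
```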

Preliminary operations

Before starting any experiment:

  • Open the lightning_batch_gd.py file and insert the Weights & Biases project's name and API key (search for TODOs).
  • Open the utils.py file and set the default values of the parameters.
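
The TODO markers mentioned above can be located with a short script. This is a minimal sketch under stated assumptions: the helper `find_todos` is hypothetical, and the path `optimizers/lightning_batch_gd.py` is taken from the folder structure shown later in this document.

```python
from pathlib import Path


def find_todos(path):
    """Return (line_number, text) pairs for every line containing 'TODO'."""
    hits = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if "TODO" in line:
                hits.append((number, line.strip()))
    return hits


# Assumed location of the file, relative to the repository root.
target = Path("optimizers/lightning_batch_gd.py")
if target.exists():
    for number, text in find_todos(target):
        print(f"{target}:{number}: {text}")
```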

Usage

To start an experiment from scratch, you need to follow these steps:

  • Place the raw data in the data/nasdaq/raw folder. The data must be in the LOBSTER format and each folder must be named with the asset's name (e.g. AAPL for Apple stock).
  • Run the following command to pre-process data:
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --stages "data_processing"
  • Run the following command to prepare the torch datasets (this reduces the training time):
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --stages "torch_dataset_preparation" --prediction_horizon 10
    If you also want to perform the backtest stage, run the following command instead:
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --stages "torch_dataset_preparation,torch_dataset_preparation_backtest" --prediction_horizon 10
  • If you are planning to use the HLOB model (see the paper titled HLOB - Information Persistence and Structure in Limit Order Books), it is mandatory to execute the following command:
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --stages "complete_homological_structures_preparation"
  • Run the following command to train the model:
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --stages "training"
    Currently available models are:
    • deeplob
    • transformer
    • itransformer
    • lobtransformer
    • dla
    • cnn1
    • cnn2
    • binbtabl
    • binctabl
    • axiallob
    • hlob
  • Run the following command to evaluate the model:
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --experiment_id "<experiment_id_generated_in_the_training_stage>" --stages "evaluation"
  • Run the following command to analyze the results:
      python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --experiment_id "<experiment_id_generated_in_the_training_stage>" --stages "backtest,post_trading_analysis"
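
Before launching the first step above, it can help to confirm that every requested stock has its own non-empty folder of raw data. The following is a minimal sketch, not part of the repository; the helper `missing_stock_folders` is hypothetical, and the root path is passed as a parameter so it can be adjusted to the actual layout.

```python
from pathlib import Path


def missing_stock_folders(raw_root, stocks):
    """Return the tickers that have no non-empty folder under `raw_root`."""
    root = Path(raw_root)
    missing = []
    for ticker in stocks:
        folder = root / ticker
        # A folder counts as present only if it exists and contains something.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(ticker)
    return missing


# Path taken from the first step of the list above; adjust it if needed.
missing = missing_stock_folders("data/nasdaq/raw", ["CSCO"])
if missing:
    print("No raw data found for:", ", ".join(missing))
```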

Multiple (compatible) stages can be executed at the same time. Consider the following example:

python3 main.py --training_stocks "CSCO" --target_stocks "CSCO" --stages "data_processing,torch_dataset_preparation,torch_dataset_preparation_backtest,training,evaluation,backtest,post_trading_analysis"

Each experiment can be resumed and re-run by specifying its ID in the experiment_id parameter.
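
When experiments are launched from a script, the command line can be assembled programmatically. The sketch below is an illustration only: the helper `build_command` is hypothetical, it assumes the entry point is the `main.py` file listed in the folder structure, and it uses only the flags that appear in the commands above. Stage names are joined with commas, as in the example above.

```python
import shlex


def build_command(training_stocks, target_stocks, stages,
                  prediction_horizon=None, experiment_id=None):
    """Assemble the argument list for one run of the pipeline."""
    command = [
        "python3", "main.py",  # assumed entry point
        "--training_stocks", training_stocks,
        "--target_stocks", target_stocks,
    ]
    if experiment_id is not None:
        # Resume or re-run an existing experiment.
        command += ["--experiment_id", experiment_id]
    command += ["--stages", ",".join(stages)]
    if prediction_horizon is not None:
        command += ["--prediction_horizon", str(prediction_horizon)]
    return command


command = build_command(
    "CSCO",
    "CSCO",
    ["evaluation", "backtest", "post_trading_analysis"],
    experiment_id="<experiment_id_generated_in_the_training_stage>",
)
# Print a shell-quoted version; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```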

The typical folder structure before an experiment is run is shown below:

.
├── README.md
├── data
│   └── nasdaq
│       ├── raw_data
│       │   ├── <Stock1_Name>
│       │   └── <Stock2_Name>
│       ├── scaled_data
│       │   ├── test
│       │   ├── training
│       │   └── validation
│       └── unscaled_data
│           ├── test
│           ├── training
│           └── validation
├── data_processing
│   ├── data_process.py
│   ├── data_process_utils.py
│   └── complete_homological_utils.py
├── loaders
│   └── custom_dataset.py
├── loggers
│   ├── logger.py
│   └── results
├── main.py
├── models
│   ├── AxialLob
│   │   └── axiallob.py
│   ├── CNN1
│   │   └── cnn1.py
│   ├── CNN2
│   │   └── cnn2.py
│   ├── DeepLob
│   │   └── deeplob.py
│   ├── DLA
│   │   └── DLA.py
│   ├── iTransformer
│   │   └── itransformer.py
│   ├── LobTransformer
│   │   └── lobtransformer.py
│   ├── TABL
│   │   ├── bin_nn.py
│   │   ├── bin_tabl.py
│   │   ├── bl_layer.py
│   │   └── tabl_layer.py
│   ├── Transformer
│   │   └── transformer.py
│   └── CompleteHCNN
│       └── complete_hcnn.py
├── optimizers
│   ├── executor.py
│   └── lightning_batch_gd.py
├── requirements.txt
├── simulator
│   ├── market_sim.py
│   ├── post_trading_analysis.py
│   └── trading_agent.py
├── torch_datasets
│   └── threshold_1e-05
│       └── batch_size_32
│           └── 10
│               ├── test_dataset.pt
│               ├── test_dataset_backtest.pt
│               ├── training_dataset.pt
│               └── validation_dataset.pt
├── results
└── utils.py
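
The nesting of the torch_datasets folder in the tree above can be reproduced with a small helper. This is a sketch under stated assumptions, not the repository's own code: the function `dataset_dir` is hypothetical, and reading the innermost folder name (`10`) as the prediction horizon is an assumption based on the `--prediction_horizon 10` flag used in the commands.

```python
from pathlib import Path

# File names as they appear in the tree above.
DATASET_FILES = [
    "test_dataset.pt",
    "test_dataset_backtest.pt",
    "training_dataset.pt",
    "validation_dataset.pt",
]


def dataset_dir(threshold, batch_size, prediction_horizon, root="torch_datasets"):
    """Build the folder path for one (threshold, batch size, horizon) setting."""
    return (
        Path(root)
        / f"threshold_{threshold}"
        / f"batch_size_{batch_size}"
        / str(prediction_horizon)  # assumed to be the prediction horizon
    )


folder = dataset_dir(1e-05, 32, 10)
for name in DATASET_FILES:
    print((folder / name).as_posix())
```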

License

Copyright 2024 Antonio Briola, Silvia Bartolucci, Tomaso Aste.

Licensed under the CC BY-NC-ND 4.0 License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at:

https://creativecommons.org/licenses/by-nc-nd/4.0/

Software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the provided link for the specific language governing permissions and limitations under the License.

lobframe's Issues

Hi, I have a question about data.

Is it possible to run the code using LOBSTER's sample data (AAPL, AMZN, GOOG, INTC and MSFT)? If not, do I need to pay for LOBSTER's data?
