ritchieng / fractional_differencing_gpu

Rapid large-scale fractional differencing with NVIDIA RAPIDS and GPU to minimize memory loss while making a time series stationary. 6x-400x speed up over CPU implementation.

License: MIT License

Jupyter Notebook 100.00%
fractional-differencing rapids cudf gpu-computing hpc-applications nvidia python time-series stationarity

fractional_differencing_gpu's Introduction

Fractional Differencing with GPU (GFD)

This is a GPU implementation of fractional differencing (we call it GFD). It enables rapid, large-scale fractional differencing that minimizes memory loss while achieving stationarity for time series data.

Experiment with Our Code Instantly on Google Colaboratory

Open In Colab

Run the whole tutorial in a self-contained Jupyter Notebook on Google Colaboratory by pressing the button above. The notebook handles the whole process, including pulling all data and dependencies and running the GFD code, so you can run it as is.

Summary Results

Columns give the number of data points; values are the time taken in seconds. You can reach similar speed-ups on Google Colab or on more powerful machines via GCP, AWS, or your own local servers.

Configuration                  100k     1m       10m      100m
GCP 8x vCPUs                   9.18     89.62    891.24   9803.11
GCP 1x T4 GPU                  1.44     1.33     3.75     29.88
GCP 1x V100 GPU                0.93     1.07     3.17     23.81
Speed-up 1x T4 vs 8x vCPUs     6.38x    67.38x   237.66x  328.08x
Speed-up 1x V100 vs 8x vCPUs   9.87x    83.76x   281.15x  411.72x
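
Each speed-up figure is the CPU time divided by the GPU time for the same dataset size. As a quick sanity check, the following sketch recomputes the ratios from the timings reported above; the variable and function names are illustrative only.

```python
# Timings in seconds, copied from the results table (100k, 1m, 10m, 100m points).
cpu_times = [9.18, 89.62, 891.24, 9803.11]
t4_times = [1.44, 1.33, 3.75, 29.88]
v100_times = [0.93, 1.07, 3.17, 23.81]

def speedups(cpu, gpu):
    """Return CPU-time / GPU-time for each dataset size."""
    return [c / g for c, g in zip(cpu, gpu)]

for name, gpu in (("T4", t4_times), ("V100", v100_times)):
    print(name, ", ".join(f"{s:.2f}x" for s in speedups(cpu_times, gpu)))
```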

Optimized Version by NVIDIA

Full credit to NVIDIA, who built on our work and sped things up further, resulting in an almost 10,000x speed-up over a CPU implementation. You can find NVIDIA's more complex and less intuitive but highly performant version of GFD in this notebook.

Simple GFD Function

We've created a simple function in the notebook: pass your pandas dataframe into it and it returns the fractionally differenced time series as a dataframe.

Arguments

  • d: order of differencing. 0 means no differencing, 1 means ordinary first-order (integer) differencing, and any value strictly between 0 and 1 is fractional differencing.
  • floor: cutoff for fixed-window fractional differencing; weights whose magnitude falls below this value are ignored.
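
To illustrate how d and floor interact, here is a minimal sketch in plain Python of the standard fractional differencing weight recursion, w_0 = 1 and w_k = -w_(k-1) * (d - k + 1) / k, truncated once a weight's magnitude drops to floor or below. The function name floored_weights and the max_k parameter are hypothetical; the notebook's own implementation may differ in its details.

```python
def floored_weights(d, max_k, floor=5e-5):
    """Illustrative fixed-window weights for differencing order d.

    max_k caps the number of weights; floor stops the recursion early
    once a weight becomes negligibly small.
    """
    weights = [1.0]  # w_0 is always 1
    for k in range(1, max_k):
        w_k = -weights[-1] * (d - k + 1) / k
        if abs(w_k) <= floor:
            break  # remaining weights are treated as zero
        weights.append(w_k)
    return weights

print(floored_weights(d=1.0, max_k=10))  # integer differencing: two weights
print(floored_weights(d=0.5, max_k=4))   # fractional: slowly decaying weights
```

With d=1 the recursion collapses to the two weights of an ordinary first difference, while a fractional d yields a longer, slowly decaying window, which is why the floor is needed to keep the window finite.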

Notes

  • Your dataframe (df_raw) must be ordered from lag k (oldest time) at the top to lag 0 (latest time) at the bottom for this function to work correctly.
    • Future: we plan to handle this automatically so that the order of your dataframe no longer matters, but keep it in mind in the meantime.
  • We tested for dataframes up to 100 million data points per function call (per dataframe essentially).

GPU implementation

gfd, weights = frac_diff_gpu(df_raw, d=0.5, floor=5e-5)

CPU implementation

fd, weights = frac_diff(df_raw, d=0.5, floor=5e-5)
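
For intuition about what both calls compute, the sketch below applies fixed-window fractional differencing to a plain Python list ordered oldest to newest. It is an illustration under stated assumptions, not the notebook's frac_diff or frac_diff_gpu: the helper names floored_weights and frac_diff_list are hypothetical, and the real functions take and return dataframes.

```python
def floored_weights(d, max_k, floor=5e-5):
    """Weights w_0 = 1, w_k = -w_(k-1) * (d - k + 1) / k, cut off at floor."""
    weights = [1.0]
    for k in range(1, max_k):
        w_k = -weights[-1] * (d - k + 1) / k
        if abs(w_k) <= floor:
            break
        weights.append(w_k)
    return weights

def frac_diff_list(values, d, floor=5e-5):
    """Fixed-window fractional differencing of values (oldest first).

    Returns (differenced values, weights). The first len(weights) - 1
    points are dropped because they lack a full window of history.
    """
    weights = floored_weights(d, len(values), floor)
    window = len(weights)
    out = []
    for t in range(window - 1, len(values)):
        # weights[0] multiplies the latest value, weights[k] the value k steps back
        out.append(sum(weights[k] * values[t - k] for k in range(window)))
    return out, weights

prices = [1.0, 2.0, 4.0, 7.0, 11.0]
fd, weights = frac_diff_list(prices, d=1.0)
print(fd)  # with d=1 this reduces to ordinary first differences
```

Each output point depends only on its own window of inputs, so the points can be computed independently of one another, which is what makes the operation a natural fit for a GPU.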

Important Links to Presentation and Code Repository

GFD Repository Plans

  • Make GFD more efficient
    • Chunk size implementation: currently the whole dataset is processed as a single chunk whose size equals the dataset's length (haha).
  • Run GFD on thousands of time series datasets, creating a grid of t-stats and time benchmarks.
  • Package GFD functions into a pip package for quick running

Release Notes

The next release will include multiple 1D blocks in a 1D grid instead of a single 1D block of 518/1024 threads. This will help users understand multiple blocks vs a single block.
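
As a rough illustration of the difference between one block and several, the sketch below simulates CUDA-style 1D indexing in plain Python. A single block covers at most its own thread count of elements at a time, whereas a 1D grid of several blocks gives each element its own global index, block_idx * threads_per_block + thread_idx. This is a CPU-side simulation with hypothetical helper names, not the repository's kernel code.

```python
import math

def blocks_needed(n_items, threads_per_block):
    """Number of 1D blocks required so every item gets its own thread."""
    return math.ceil(n_items / threads_per_block)

def global_indices(n_items, threads_per_block):
    """List (block_idx, thread_idx, global_idx) for every in-range thread."""
    result = []
    for block_idx in range(blocks_needed(n_items, threads_per_block)):
        for thread_idx in range(threads_per_block):
            idx = block_idx * threads_per_block + thread_idx
            if idx < n_items:  # guard: the last block may be only partly used
                result.append((block_idx, thread_idx, idx))
    return result

print(blocks_needed(100_000, 1024))
print(global_indices(5, 2))
```

The bounds check mirrors the guard a real kernel needs, since the item count is rarely an exact multiple of the block size.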

Beyond the next release, we'll move on to explaining the use of multi-dimensional blocks and grids.

Citation Reference to Repository/Presentation

If you use the code, please cite using this link alongside Prado/Hosking papers.

Credits and Special Thanks

  1. NVIDIA (Ettikan, Chris and Nick), Amazon AWS, ensemblecap.ai, and NExT++ (NUS School of Computing)
  2. Marcos Lopez de Prado for his recent push on the use of fractional differencing, on which this guide is based.
  3. Hosking for his paper in 1981 on fractional differencing.

Help Wanted

  1. Feel free to raise any issue for feedback on bugs or improvements.
  2. I'll be implementing more critical GPU-accelerated functions, and we are looking for collaborators.
  3. If you find this repository useful, please star it!

fractional_differencing_gpu's Issues

plot fracdiff and raw at same chart

In ["Fixed Window Fractional Differencing Function (CPU)"], please also plot the raw data, so the difference between the two charts is easy to see =)
Great work, congrats!

Plot

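import matplotlib.pyplot as plt

# figsize, df_raw_fd and df_raw are assumed to be defined earlier in the notebook
plt.figure(figsize=figsize)
plt.subplot(2, 1, 1)  # top panel: fractionally differenced series
plt.plot(df_raw_fd)
plt.subplot(2, 1, 2)  # bottom panel: raw series
plt.plot(df_raw)
plt.show()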
