Indicators and dataframes about pybroker HOT 5 CLOSED

edtechre commented on June 6, 2024
Indicators and dataframes

from pybroker.

Comments (5)

rokups commented on June 6, 2024

Hmm, I did not think about using lambdas. Thank you for the example. I suppose this is solved then?

edtechre commented on June 6, 2024

First off, thank you for your thoughtful input @rokups, it is much appreciated. My thoughts are below:

df['spread'] = df['high'] - df['low']
df['spread_sma20'] = ta.SMA(df['spread'], 20)
df['spread_sma40'] = ta.SMA(df['spread'], 40)

This looks trivial on the surface, and of course it is nothing PyBroker cannot do, but it is actually very powerful.

To achieve something like this in PyBroker, we would have to create custom indicator functions for spread_sma20 and spread_sma40. But then the spread column is calculated twice, which is wasted work.

PyBroker computes indicators in parallel using a process pool. To keep this simple, the work is split into one task per ticker and indicator function pair, and these tasks are distributed across multiple processes. This means that there are no dependencies between indicators, making their computation easily parallelizable.
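
A minimal sketch of that decomposition, using only the standard library rather than PyBroker's internals: each (symbol, indicator) pair becomes an independent task that reads only its own symbol's data, so any concurrent.futures executor can run the tasks in any order. The names sma, total_return, run_task and compute_all are hypothetical, and a thread pool is used here only to keep the example portable; a process pool would mirror the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def sma(closes, period=3):
    """Simple moving average; None until enough bars are available."""
    out = []
    for i in range(len(closes)):
        if i + 1 < period:
            out.append(None)
        else:
            window = closes[i + 1 - period : i + 1]
            out.append(sum(window) / period)
    return out


def total_return(closes):
    """Return of each bar relative to the first bar."""
    return [c / closes[0] - 1 for c in closes]


def run_task(task):
    """Compute one indicator for one symbol; needs no other task's result."""
    symbol, name, fn, closes = task
    return (symbol, name), fn(closes)


def compute_all(data, indicators, executor):
    """Fan out one task per (symbol, indicator) pair and collect the results."""
    tasks = [
        (symbol, name, indicators[name], data[symbol])
        for symbol, name in product(data, indicators)
    ]
    return dict(executor.map(run_task, tasks))


# Hypothetical close prices for two symbols.
data = {"AAPL": [10.0, 11.0, 12.0, 13.0], "MSFT": [20.0, 22.0, 21.0, 23.0]}
indicators = {"sma3": sma, "ret": total_return}

with ThreadPoolExecutor(max_workers=4) as pool:
    results = compute_all(data, indicators, pool)

print(results[("AAPL", "sma3")])
```

Because no task depends on another task's output, the results are the same regardless of how the executor schedules them.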

If you need to share custom data between indicators, you can register a custom data column with PyBroker and then create your own DataSource class or pass your own DataFrame to PyBroker. The Creating a Custom DataSource notebook shows how to do this. In your example, you would calculate the spread column in your DataFrame and then register it using pybroker.register_columns. The custom column will then be made available on the BarData instance passed to your indicator function.
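
The underlying idea, computing the shared column once and letting every indicator read it, can be sketched in plain Python. This is not PyBroker code: the dictionary below merely stands in for a DataFrame with a precomputed spread column, the price values are made up, and sma is a hypothetical helper.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= period:
            # Drop the value that just left the window.
            running -= values[i - period]
        out.append(running / period if i + 1 >= period else None)
    return out


# Stand-in for the DataFrame columns of a single symbol.
columns = {
    "high": [12.0, 13.0, 15.0, 14.0, 16.0],
    "low": [10.0, 10.0, 11.0, 12.0, 12.0],
}

# Compute the shared column once, up front ...
columns["spread"] = [h - l for h, l in zip(columns["high"], columns["low"])]

# ... so that each indicator only reads it instead of recomputing it.
spread_sma2 = sma(columns["spread"], 2)
spread_sma4 = sma(columns["spread"], 4)

print(columns["spread"])
print(spread_sma2)
print(spread_sma4)
```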

It is also rather cumbersome to use indicator libraries like TA-Lib or pandas_ta. These libraries already provide one-call indicator functions that we now must wrap in another function to make them usable with PyBroker.

I am considering creating a wrapper around ta-lib. You should already be able to use pandas_ta by using a custom data source and registering custom columns, as explained previously. Perhaps I can add an example of pandas_ta to the custom DataSources notebook.

This dataframe needs to be split anyway. Besides, merging dataframes of different symbols puts a burden on the user to make sure that the dataframes of all queried symbols are of equal length, and the user must merge them properly in case there are missing candles. If everyone has to do it, it might as well be done in the library.
Then, if the dataframes were separate, we could also have a user-implemented indicators_fn(df) in the same spirit as exec_fn, which would allow massaging the dataframe in any way we see fit and using the full power of pandas.

Creating multiple DataFrames would introduce extra overhead and complexity. External APIs for historical data are designed to return a single DataFrame to maintain simplicity and performance. However, a bigger concern is that multiple DataFrames may not parallelize efficiently across multiple processes due to memory limitations, and they would also severely slow down serialization given PyBroker's current implementation. NumPy arrays, on the other hand, can easily be memory-mapped across processes and can be accelerated using Numba.

There is one special case where my proposed approach is not good enough: pairs trading. We need the price data of two symbols in order to calculate the necessary metrics.

You can retrieve the indicator of another symbol using ExecContext#indicator(), as well as OHLCV + custom column data with ExecContext#foreign().

I agree that support for multi-symbol indicators would make sense. It is something that I considered during the design phase, but I limited the implementation to single-symbol indicators for the sake of simplicity in the initial release (V1). I need to give this more thought, but my plan would be to add support for multi-symbol indicators as a configuration option that groups data for all symbols per indicator. If you have any suggestions, please let me know. In the meantime, you can calculate the multi-symbol indicator outside of PyBroker, save it to a DataFrame column, and then register the custom column with PyBroker.
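
As a rough illustration of that workaround, a two-symbol metric can be computed up front and stored as one more column per bar. The sketch below uses plain dictionaries keyed by date instead of a DataFrame, keeps only the dates on which both symbols have a candle, and relies on a hypothetical price_ratio helper and made-up prices; registering the resulting column with PyBroker is not shown.

```python
def price_ratio(closes_a, closes_b):
    """Ratio of two close series on the dates where both symbols traded."""
    common_dates = sorted(closes_a.keys() & closes_b.keys())
    return {d: closes_a[d] / closes_b[d] for d in common_dates}


# Close prices keyed by ISO date; symbol B is missing the 2024-01-03 candle.
closes_a = {"2024-01-02": 50.0, "2024-01-03": 51.0, "2024-01-04": 54.0}
closes_b = {"2024-01-02": 25.0, "2024-01-04": 24.0}

ratio = price_ratio(closes_a, closes_b)

# One row per (date, symbol) bar, each carrying the shared "ratio" column.
rows = [
    {"date": d, "symbol": s, "close": c[d], "ratio": ratio[d]}
    for d in ratio
    for s, c in (("A", closes_a), ("B", closes_b))
]

print(ratio)
```

Dropping dates that are missing for either symbol is only one possible policy; forward-filling the last known close would be another.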

rokups commented on June 6, 2024

Hmm, what you say does make sense...

I am considering creating a wrapper around ta-lib

Here is a little help on that: talibgen.py.txt

This is an updated and fixed script from TA-Lib/ta-lib-python#212; it should simplify the process.

edtechre commented on June 6, 2024

Great, thank you!

edtechre commented on June 6, 2024

After reviewing TA-Lib again, I am unsure if creating a wrapper for it adds significant value. It's already fairly straightforward to integrate TA-Lib with PyBroker by using lambdas as shown in the following example:

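import pybroker
import talib

# Wrap the TA-Lib call in a lambda to register it as a PyBroker indicator.
rsi_20 = pybroker.indicator('rsi_20', lambda data: talib.RSI(data.close, timeperiod=20))
# df is a DataFrame of bar data.
rsi_20(df)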

I added this example to the Writing Indicators notebook.
