quantopian / pyfolio
Portfolio and risk analytics in Python
Home Page: https://quantopian.github.io/pyfolio
License: Apache License 2.0
A simple version that doesn't include a benchmark or MAR (Minimum Acceptable Return). So it's just: (Total Profit %) / (Total Losses %).
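Under that definition, a minimal sketch could look like this (hypothetical helper name, not the actual quantrisk implementation):

```python
import pandas as pd

def simple_gain_to_pain(returns):
    """Simplified ratio with no benchmark or MAR subtracted:
    (sum of positive returns) / (absolute sum of negative returns)."""
    gains = returns[returns > 0].sum()
    losses = returns[returns < 0].abs().sum()
    if losses == 0:
        return float('inf')
    return gains / losses

rets = pd.Series([0.01, -0.02, 0.03, -0.01])
print(simple_gain_to_pain(rets))  # gains 0.04 / losses 0.03
```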
For the time being, quant_notebooks will depend on the internal quant_utils repo. Once things stabilize over here, we should make a concerted effort to port all our NBs over to use this new library. The overlapping parts in quant_utils will then get removed.
New things should, however, go into quantrisk.
Seeing this on the research server, using the tearsheet_quantrisk NB.
ValueError: List of boxplot statistics and `positions` values must have same the length
Repro expression:
rets, pos, txn_daily = quantrisk.internals.get_and_analyze_algo('5586028584f4829e9800025e',
backtest_min_years=5,
plot_risk_factors=True,
include_positions=True)
Error stack trace:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-f997b0a20ee9> in <module>()
2 backtest_min_years=5,
3 plot_risk_factors=True,
----> 4 include_positions=True)
/opt/code/quantrisk/quantrisk/internals.pyc in get_and_analyze_algo(*args, **kwargs)
281 fetcher_urls=fetcher_urls,
282 algo_create_date=algo_create_date,
--> 283 cone_std=cone_std, bayesian=bayesian)
284
285 return df_rets, df_pos, df_txn
/opt/code/quantrisk/quantrisk/tears.pyc in create_full_tear_sheet(df_rets, df_pos, df_txn, gross_lev, fetcher_urls, algo_create_date, bayesian, backtest_days_pct, cone_std)
227 benchmark2_rets = utils.get_symbol_rets('IEF') # 7-10yr Bond ETF.
228
--> 229 create_returns_tear_sheet(df_rets, algo_create_date=algo_create_date, backtest_days_pct=backtest_days_pct, cone_std=cone_std, benchmark_rets=benchmark_rets, benchmark2_rets=benchmark2_rets)
230
231 create_interesting_times_tear_sheet(df_rets, benchmark_rets=benchmark_rets)
/opt/code/quantrisk/quantrisk/tears.pyc in create_returns_tear_sheet(df_rets, algo_create_date, backtest_days_pct, cone_std, benchmark_rets, benchmark2_rets)
116 ax=ax_daily_similarity_no_var_no_mean)
117
--> 118 plotting.plot_return_quantiles(df_rets, df_weekly, df_monthly, ax=ax_return_quantiles)
119
120
/opt/code/quantrisk/quantrisk/plotting.pyc in plot_return_quantiles(df_rets, df_weekly, df_monthly, ax, **kwargs)
595 sns.boxplot([df_rets, df_weekly, df_monthly],
596 names=['daily', 'weekly', 'monthly'],
--> 597 ax=ax, **kwargs)
598 ax.set_title('Return quantiles')
599 return ax
/opt/miniconda/lib/python2.7/site-packages/seaborn/categorical.pyc in boxplot(x, y, hue, data, order, hue_order, orient, color, palette, saturation, width, fliersize, linewidth, whis, notch, ax, **kwargs)
1621 kwargs.update(dict(whis=whis, notch=notch))
1622
-> 1623 plotter.plot(ax, kwargs)
1624 return ax
1625
/opt/miniconda/lib/python2.7/site-packages/seaborn/categorical.pyc in plot(self, ax, boxplot_kws)
516 def plot(self, ax, boxplot_kws):
517 """Make the plot."""
--> 518 self.draw_boxplot(ax, boxplot_kws)
519 self.annotate_axes(ax)
520 if self.orient == "h":
/opt/miniconda/lib/python2.7/site-packages/seaborn/categorical.pyc in draw_boxplot(self, ax, kws)
453 positions=[i],
454 widths=self.width,
--> 455 **kws)
456 color = self.colors[i]
457 self.restyle_boxplot(artist_dict, color, kws)
/opt/miniconda/lib/python2.7/site-packages/matplotlib/axes/_axes.pyc in boxplot(self, x, notch, sym, vert, whis, positions, widths, patch_artist, bootstrap, usermedians, conf_intervals, meanline, showmeans, showcaps, showbox, showfliers, boxprops, labels, flierprops, medianprops, meanprops, capprops, whiskerprops, manage_xticks)
3116 meanline=meanline, showfliers=showfliers,
3117 capprops=capprops, whiskerprops=whiskerprops,
-> 3118 manage_xticks=manage_xticks)
3119 return artists
3120
/opt/miniconda/lib/python2.7/site-packages/matplotlib/axes/_axes.pyc in bxp(self, bxpstats, positions, widths, vert, patch_artist, shownotches, showmeans, showcaps, showbox, showfliers, boxprops, whiskerprops, flierprops, medianprops, capprops, meanprops, meanline, manage_xticks)
3383 positions = list(xrange(1, N + 1))
3384 elif len(positions) != N:
-> 3385 raise ValueError(datashape_message.format("positions"))
3386
3387 # width
ValueError: List of boxplot statistics and `positions` values must have same the length
get_backtest() on the Quantopian research platform (QRP) returns backtest results, positions and transactions. For quantrisk to be useful on the QRP, it has to work with that format.
I imagine this could go two ways: either we switch internally to always use that format if it's convenient, or we transform whatever get_backtest() returns.
Most of the code that actually creates the tearsheet lives in https://github.com/quantopian/quant_notebooks/blob/master/analyses/algo_tear_sheet_contest.ipynb. We should refactor that NB in a major way so that the one huge function is broken up into individual plots. Then the tear sheet would just call the individual functions.
This is a darn legacy spelling typo on my part :(
https://en.wikipedia.org/wiki/Calmar_ratio
Gus, giving it to you since I figure you know best as to how renaming will affect dependencies.
Once quantrisk is in an installable state, we should add it to research so that we can start to remove the overlapping chunks from quant_utils.
We want two example notebooks, one that uses a single stock and one that uses a zipline algo.
First plot in returns tearsheet. I tried to fix this myself but I wasn't sure of an easy way.
Gonna assign to @justinlent, but if you're not sure/don't have time I can take another stab.
Use e.g. autopep8 to make the syntax conform to PEP 8. Also, the functions use different naming conventions; change all of them to use underscore_style rather than camelCase.
@twiecki I want to propose we change the default 'style' parameter used for annual_return to be "compound" instead of "calendar". Calendar isn't very conventional (it's just the simple arithmetic annual return) and was originally implemented just to match zipline results. Here's an example showing how different they can be, using the benchmark_rets timeseries in tearsheet_standalone. Returning the geometrically compounded value, which is what style='compound' does, simply makes more sense for a publicly facing API, I think.
Given that annual_return is called in the calculation of quite a few other performance statistics, this can have a huge impact downstream (especially for df_rets that span many years), so I think we should consider it carefully. Maybe you can take a quick look at the exact implementation of the code for a sanity check as well.
Thoughts?
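To illustrate the size of the difference, here is a rough sketch of the two styles (assumed semantics; the exact quantrisk implementation should be checked against this):

```python
import pandas as pd

TRADING_DAYS = 252

def annual_return_calendar(returns):
    # 'calendar' style: simple arithmetic annualization of the mean daily return.
    return returns.mean() * TRADING_DAYS

def annual_return_compound(returns):
    # 'compound' style: geometric annualization of the total compounded growth.
    num_years = len(returns) / TRADING_DAYS
    total_growth = (1 + returns).prod()
    return total_growth ** (1 / num_years) - 1

# A steady 0.1% daily return over four years:
rets = pd.Series([0.001] * (4 * TRADING_DAYS))
print(annual_return_calendar(rets))  # ~0.252
print(annual_return_compound(rets))  # ~0.2865; the gap widens with horizon
```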
I'm seeing this locally on master. It is not a problem on the internal research server, so one of two things is going on:
I've copied in some images below, but now I'm noticing that nearly all my plots don't have x-axis labels. Even the x-axis dates from the main cumulative returns plot are missing.
Somehow the title and y-axis got labeled "alpha". Also, some extraneous code that computed alpha, which we don't need right now, was merged into the body. I'm commenting out what I think is wrong and have committed it here for others' review: becac6c.
If all looks good we can get rid of the commented lines.
First off, we need to define which sids we want to use to proxy these. I think I said I'd do that, but if @justinlent wants to weigh in, please do!
Traditional single factor betas: Equity, Bonds, Credit, Gold, Crude Oil, Volatility
It will be a bit tricky in the interim period where we still have both libraries side-by-side. We definitely don't want to have to fix bugs twice.
One proposal is to first remove the overlapping functions in quant_utils, and then do a selective import from quantrisk.
If a function with the same name and call signature now exists in quantrisk, remove it from quant_utils and add an alias. For example, cum_returns exists in both quantrisk/timeseries.py and quant_utils/timeseries.py. Remove cum_returns from quant_utils/timeseries.py and add a from quantrisk.timeseries import cum_returns. That way, the imports from the NBs will continue to work. If the name changed, add an alias with the old name (e.g. if we previously had the name cumReturns, we'd do from quantrisk.timeseries import cum_returns as cumReturns).
When there is negative cash, that should likely be made obvious to the user.
Install quantrisk as a module onto the quantopian research platform. I don't think we need to blacklist anything.
Not sure who could help us with that, any idea @KarenRubin?
Example: https://github.com/quantopian/zipline/blob/master/zipline/algorithm.py#L1, for every Python file. Will probably just require us to adjust print statements.
I already went through and deleted some obvious cases but I bet there is more stale code lying around.
Just assuming equal weight for each algo is acceptable for the phase 1 implementation.
Currently, we have max_drawdown and get_max_draw_down, the latter likely being the better version.
Allow users to pass in a different number of benchmarks (not just two), most likely just in the cumulative returns plot.
The more each function is tested the better.
The question arises as to how to define ground truth. zipline does this via an Excel spreadsheet and we could copy that approach, but it's a bit cumbersome. Finding some small cases where the truth can be calculated manually would be a better first step.
Once that is in place we should activate Travis.
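As a first step toward hand-calculated ground truth, a test might look like this (the cum_returns below is a minimal reference reimplementation, assuming it compounds simple returns from a starting value as in quantrisk/timeseries.py):

```python
import pandas as pd

def cum_returns(returns, starting_value=1.0):
    # Reference implementation for the sketch: compound simple returns.
    return (1 + returns).cumprod() * starting_value

def test_cum_returns_hand_computed():
    rets = pd.Series([0.1, -0.05, 0.2])
    result = cum_returns(rets, starting_value=1.0)
    # By hand: 1.0 * 1.1 = 1.1; 1.1 * 0.95 = 1.045; 1.045 * 1.2 = 1.254
    expected = [1.1, 1.045, 1.254]
    for got, want in zip(result, expected):
        assert abs(got - want) < 1e-9

test_cum_returns_hand_computed()
```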
here is one example algo id: 557e96eb8ba119ea30000409
IndexError: index out of bounds
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-2-e2266cc245fd> in <module>()
3 # B algo: 54349eefcd2e3f7f57000025
4 #a, b, c = internals.get_and_analyze_algo('54349eefcd2e3f7f57000025', contest='MEGARUN_1');
----> 5 a, b, c = internals.get_and_analyze_algo('557e96eb8ba119ea30000409', backtest_min_years=5, cone_std=1.0);
/Users/jlent/github_projects/quantrisk/quantrisk/internals.pyc in get_and_analyze_algo(algo_id, contest, cone_std, plot_risk_factors, algo_live_date, include_positions, backtest_min_years, backtest_max_years)
283
284 tears.create_full_tear_sheet(df_rets, df_pos=df_pos, df_txn=df_txn_daily, gross_lev=gross_lev, fetcher_urls=fetcher_urls,
--> 285 algo_create_date=algo_create_date, cone_std=cone_std)
286
287 return df_rets, df_pos, df_txn_daily
/Users/jlent/github_projects/quantrisk/quantrisk/tears.pyc in create_full_tear_sheet(df_rets, df_pos, df_txn, gross_lev, fetcher_urls, algo_create_date, backtest_days_pct, cone_std)
111 backtest_days_pct=0.5, cone_std=1.0):
112
--> 113 create_returns_tear_sheet(df_rets, algo_create_date=algo_create_date, backtest_days_pct=backtest_days_pct, cone_std=cone_std)
114
115 if df_pos is not None:
/Users/jlent/github_projects/quantrisk/quantrisk/tears.pyc in create_returns_tear_sheet(df_rets, algo_create_date, backtest_days_pct, cone_std)
36 print '\n'
37
---> 38 plotting.show_perf_stats(df_rets, algo_create_date, benchmark_rets)
39
40 plotting.plot_rolling_returns(
/Users/jlent/github_projects/quantrisk/quantrisk/plotting.pyc in show_perf_stats(df_rets, algo_create_date, benchmark_rets)
299
300 diff_pct = timeseries.out_of_sample_vs_in_sample_returns_kde(timeseries.cum_returns(df_rets_backtest , 1.0),
--> 301 timeseries.cum_returns(df_rets_live, 1.0) )
302
303 consistency_pct = int( 100*(1.0 - diff_pct) )
/Users/jlent/github_projects/quantrisk/quantrisk/timeseries.pyc in cum_returns(df_rets, starting_value)
64 # Note that we can't add that ourselves as we don't know which dt
65 # to use.
---> 66 if pd.isnull(df_rets.iloc[0]):
67 df_rets.iloc[0] = 0.
68
/Users/jlent/anaconda/lib/python2.7/site-packages/pandas/core/indexing.pyc in __getitem__(self, key)
1215 return self._getitem_tuple(key)
1216 else:
-> 1217 return self._getitem_axis(key, axis=0)
1218
1219 def _getitem_axis(self, key, axis=0):
/Users/jlent/anaconda/lib/python2.7/site-packages/pandas/core/indexing.pyc in _getitem_axis(self, key, axis)
1506 self._is_valid_integer(key, axis)
1507
-> 1508 return self._get_loc(key, axis=axis)
1509
1510 def _convert_to_indexer(self, obj, axis=0, is_setter=False):
/Users/jlent/anaconda/lib/python2.7/site-packages/pandas/core/indexing.pyc in _get_loc(self, key, axis)
90
91 def _get_loc(self, key, axis=0):
---> 92 return self.obj._ixs(key, axis=axis)
93
94 def _slice(self, obj, axis=0, kind=None):
/Users/jlent/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in _ixs(self, i, axis)
486 values = self.values
487 if isinstance(values, np.ndarray):
--> 488 return _index.get_value_at(values, i)
489 else:
490 return values[i]
pandas/index.pyx in pandas.index.get_value_at (pandas/index.c:2358)()
pandas/src/util.pxd in util.get_value_at (pandas/index.c:15287)()
IndexError: index out of bounds
Talked to @justinlent yesterday and he said there were some updates to the cone code that have not made it into quantrisk yet.
Need to find a way to link up MS asset classifications, or some other external data source, to do this one.
Sectors (10-GIC levels to start, or whatever similar we have from Morningstar)
Table: Per year and overall sector weight longs, shorts, net, absolute and relative performance of each
Plot (timeseries): portfolio weight in each sector over time
Table: Top and Bottom 10 holdings by contribution to performance ever, and per year (long, short, overall) - performance is over the time position was held in portfolio.
Performance, weight and relative contribution for top 10 holdings, ranked from largest contribution to smallest
"best" and "worst" positions basically
When the full list of holdings from all time is large (>10 or so), it doesn't show up nicely in the tearsheet.
Might be useful to have a breakdown of win/loss percentage
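For example (hypothetical input format: a per-trade PnL series; the actual transaction data in quantrisk may be shaped differently):

```python
import pandas as pd

def win_loss_breakdown(pnl):
    """Fraction of winning vs. losing trades from a per-trade PnL series;
    flat (zero-PnL) trades are excluded from the denominator."""
    wins = (pnl > 0).sum()
    losses = (pnl < 0).sum()
    total = wins + losses
    return {'win_pct': wins / total, 'loss_pct': losses / total}

pnl = pd.Series([120.0, -40.0, 15.0, -5.0, 60.0])
print(win_loss_breakdown(pnl))  # 3 wins, 2 losses -> 60% / 40%
```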
Some functions are only useful to people with access to quantopian's DB. These should be distilled out into a separate file. Later, we can worry about how these internal tools are made available.
Ran this command:
rets, pos, txn_daily = quantrisk.internals.get_and_analyze_algo('5552d8b7c4b238fb42000578',
backtest_min_years=10,
plot_risk_factors=True,
include_positions=True,
contest='all_contest_entries')
Got this traceback:
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-40-6670bc8a3ecd> in <module>()
3 backtest_min_years=5,
4 plot_risk_factors=True,
----> 5 include_positions=True)
/opt/code/quantrisk/quantrisk/internals.py in get_and_analyze_algo(*args, **kwargs)
295
296 tears.create_full_tear_sheet(df_rets, df_pos=df_pos, df_txn=df_txn, gross_lev=gross_lev, fetcher_urls=fetcher_urls,
--> 297 algo_create_date=algo_create_date, cone_std=cone_std)
298
299 return df_rets, df_pos, df_txn
/opt/code/quantrisk/quantrisk/tears.py in create_full_tear_sheet(df_rets, df_pos, df_txn, gross_lev, fetcher_urls, algo_create_date, backtest_days_pct, cone_std)
114
115 if df_pos is not None:
--> 116 create_position_tear_sheet(df_rets, df_pos, gross_lev=gross_lev)
117
118 if df_txn is not None:
/opt/code/quantrisk/quantrisk/tears.py in create_position_tear_sheet(df_rets, df_pos_val, gross_lev)
89 df_pos_alloc = positions.get_portfolio_alloc(df_pos_val)
90
---> 91 plotting.plot_exposures(df_cum_rets, df_pos_alloc)
92
93 plotting.show_and_plot_top_positions(df_cum_rets, df_pos_alloc)
/opt/code/quantrisk/quantrisk/plotting.py in plot_exposures(df_cum_rets, df_pos_alloc)
448 df_long_short = positions.get_long_short_pos(df_pos_alloc)
449 df_long_short.plot(
--> 450 kind='area', color=['lightblue', 'green', 'coral'], alpha=1.0)
451 plt.xlim((df_cum_rets.index[0], df_cum_rets.index[-1]))
452 plt.title("Long/Short/Cash Exposure")
/opt/miniconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in plot_frame(data, x, y, kind, ax, subplots, sharex, sharey, layout, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, secondary_y, sort_columns, **kwds)
2486 yerr=yerr, xerr=xerr,
2487 secondary_y=secondary_y, sort_columns=sort_columns,
-> 2488 **kwds)
2489
2490
/opt/miniconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in _plot(data, x, y, subplots, ax, kind, **kwds)
2322 plot_obj = klass(data, subplots=subplots, ax=ax, kind=kind, **kwds)
2323
-> 2324 plot_obj.generate()
2325 plot_obj.draw()
2326 return plot_obj.result
/opt/miniconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in generate(self)
912 self._compute_plot_data()
913 self._setup_subplots()
--> 914 self._make_plot()
915 self._add_table()
916 self._make_legend()
/opt/miniconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in _make_plot(self)
1623 kwds['label'] = label
1624
-> 1625 newlines = plotf(ax, x, y, style=style, column_num=i, **kwds)
1626 self._add_legend_handle(newlines[0], label, index=i)
1627
/opt/miniconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in plotf(ax, x, y, style, column_num, **kwds)
1743 if column_num == 0:
1744 self._initialize_prior(len(self.data))
-> 1745 y_values = self._get_stacked_values(y, kwds['label'])
1746 lines = f(ax, x, y_values, style=style, **kwds)
1747
/opt/miniconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in _get_stacked_values(self, y, label)
1638 else:
1639 raise ValueError('When stacked is True, each column must be either all positive or negative.'
-> 1640 '{0} contains both positive and negative values'.format(label))
1641 else:
1642 return y
ValueError: When stacked is True, each column must be either all positive or negative.cash contains both positive and negative values
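One possible workaround (a sketch, not the committed fix) is to split any mixed-sign column such as cash into non-negative and non-positive parts before the stacked-area plot, since pandas only requires each individual column to be single-signed:

```python
import pandas as pd

def split_mixed_sign_column(df, col):
    """Split one column with both signs into '<col>_pos' and '<col>_neg'
    so that df.plot(kind='area', stacked=True) can render it."""
    out = df.drop(columns=[col]).copy()
    out[col + '_pos'] = df[col].clip(lower=0)
    out[col + '_neg'] = df[col].clip(upper=0)
    return out

df_alloc = pd.DataFrame({'long': [0.6, 0.7], 'short': [-0.3, -0.2],
                         'cash': [0.1, -0.1]})
plottable = split_mixed_sign_column(df_alloc, 'cash')
# plottable.plot(kind='area', stacked=True)  # no longer raises ValueError
```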
Currently, every plotting function creates its own figure and then plots an axes object inside it (the axes is what is actually the plot; the figure is just the window that contains it). This is fairly inflexible, as you may want one large layout and to place the individual subplots into axes defined outside of the plotting function.
A common pattern, used e.g. by seaborn, is to have every plotting function take an ax kwarg (e.g. https://github.com/mwaskom/seaborn/blob/master/seaborn/distributions.py#L33). That way, the caller can control where the plot is going to be placed.
The pattern would be:
fig, (ax1, ax2) = plt.subplots(2, sharex=True)
quantrisk.plotting.plot_returns(df, ax=ax1)
quantrisk.plotting.plot_beta(df, ax=ax2)
Another nice pattern is to also take in kws that are passed on to the individual plotting calls (https://github.com/mwaskom/seaborn/blob/master/seaborn/distributions.py#L56). For example, if I want to change the line width of the plot, we wouldn't have to have an explicit linewidth kwarg in our function definition, but could rather do:
quantrisk.plotting.plot_returns(df, plot_kws={'linewidth': 3})
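A minimal sketch of what such an ax-accepting function could look like (hypothetical signature; not the actual quantrisk API):

```python
import matplotlib.pyplot as plt

def plot_returns(df_rets, ax=None, **kwargs):
    """Plot cumulative returns onto a caller-supplied axes.

    Falls back to the current axes if none is given, following the
    seaborn convention. Extra kwargs pass straight through to plot().
    """
    if ax is None:
        ax = plt.gca()
    ax.plot((1 + df_rets).cumprod() - 1, **kwargs)
    ax.set_title('Cumulative returns')
    return ax

# Caller controls the layout; linewidth passes through without an
# explicit kwarg in the function definition:
# fig, (ax1, ax2) = plt.subplots(2, sharex=True)
# plot_returns(df, ax=ax1, linewidth=3)
```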
Every function should have a docstring (in the numpy format: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt) explaining what it does, the arguments it takes and what those do, as well as what it returns. Preferably with a small example code snippet.
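For instance, a numpy-style docstring on a hypothetical helper (the function itself is only illustrative):

```python
import pandas as pd

def max_drawdown(returns):
    """Determine the maximum drawdown of a returns series.

    Parameters
    ----------
    returns : pd.Series
        Daily simple returns of the strategy, non-cumulative.

    Returns
    -------
    float
        Maximum drawdown as a negative fraction (e.g. -0.25 for a 25% drop).

    Examples
    --------
    >>> dd = max_drawdown(pd.Series([0.10, -0.50, 0.30]))  # ~ -0.5
    """
    cumulative = (1 + returns).cumprod()
    running_max = cumulative.cummax()
    return ((cumulative - running_max) / running_max).min()
```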
We should look into directly pulling this data since the csv can get out of date really quickly. Maybe we supply a default csv (if we can) as well as expose a function to pull the data from the university page
We also might want to confirm if there are any licenses or issues with redistributing it. @twiecki maybe you have some experience understanding redistribution of data in the OSS world?
What might make more sense visually is to have the long+short+cash plot's y-axis values match up with the gross exposure y-axis.
Assume you hedge the algo with its rolling 6-month SPY beta every day, to see if it improves the algo's Sharpe ratio. Plot it on the Cumulative Returns chart along with the algo and the benchmarks.
The number of holdings plotted daily can be difficult to read over a long backtest period.
It will be good to get a visual sense of how closely the two cones track one another, depending on the algo. It should be pretty easy to do, just by passing the ax to the existing cumulative performance figure.
Function definitions and calls need to be tidied up and standardized.
Old code that was commented out that can be removed should be removed.
plotly is building a cool dashboard (that can also be used without plotly) here: https://github.com/chriddyp/dash. Could be great to integrate that.
It's useful info, if it's available. In the screenshot, it sounds like the out-of-sample date was less than 5 days from the end of the available backtest data from the test harness run, and thus an automatic 85%/15% split was used (which is totally fine, and a good default). But if the out-of-sample date is available, let's just show it.
Same logic as issue #12: the daily plot is hard to read.