predict-idlab / plotly-resampler

Visualize large time series data with plotly.py

Home Page: https://predict-idlab.github.io/plotly-resampler/latest

License: MIT License

Python 97.67% Makefile 0.14% JavaScript 2.20%

data-analysis data-science data-visualization plotly plotly-dash python time-series visualization

plotly-resampler's Issues

How to stop the underlying graph server?

Attribute Error when running examples

When running your example notebook I get the following error:

AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')

Maybe it's a plotly version issue?

Cheers

plans for 2D resampling support (e.g. heatmaps, images, ...)

Do you have plans to add support for heatmap-series resampling? it's just a 2D picture and there are many implementations of fast efficient image resampling algorithms...

Problem with pyinstaller

Hi,

First of all, thanks for your work 👍 Plotly-Resampler allows me to handle plots with 30 billion data points. Really great job!
Unfortunately, I have a problem with your library combined with PyInstaller. The code works properly inside PyCharm, but when I add "from plotly_resampler import FigureResampler" to the code, build a standalone .exe file, and run it, it crashes without any error.
Do you know what the root cause could be, or how I can work around this issue?
Of course, I added all needed dependencies to the *.spec file.

Update: when I removed Jupyter and the show_dash() call, the .exe file started working, so maybe there is some routing conflict between the two local servers?

Thanks,
Piotr

Double Clicking on Dash App Does not Reset Axes

Horizontal drag causes autoscale

expected behavior - keep y-range

first graph update seems significantly slower than consecutive ones when dealing with large data

Cannot seem to reproduce this bug with the use case of 3 traces, each having 90M data points. So will close this issue for now.

Unexpected inverted y-axis from `FigureResampler`

First, thanks for creating this! Our data can be very oversized for Plotly and this has the potential to save us a lot of development time!

Background:
I'm developing a library of Plotly figure functions for our varying datatypes (meteorological and oceanographic) and our rain sensor samples at 1-minute intervals, but our data extend from 10 months to 2+ years.

Problem:
It's on the verge of perfect, but I cannot determine why the y-axis is inverting.
Here is the code that generates a standard Plotly Scattergl plot that will always crash a browser or notebook:

fig = plotly.graph_objects.Figure()
fig.add_trace(
    plotly.graph_objects.Scattergl(
        x=pt0.RAIN.data.index,
        y=pt0.RAIN.data.loc[:, IDX[1,"1046",'Vol']],
        mode="markers",
        showlegend=True,
        name="Rain Vol",
    )
)

And here is the code that I'm attempting to use to generate a Resampler figure:

fig = FigureResampler(plotly.graph_objects.Figure())
fig.add_trace(
    plotly.graph_objects.Scattergl(
        name="Rain Vol", 
        showlegend=True,
        mode="markers",
    ),
    hf_x=pt0.RAIN.data.index, 
    hf_y=pt0.RAIN.data.loc[:, IDX[1,"1046",'Vol']],
    downsampler=EveryNthPoint(interleave_gaps=False),
    max_n_samples=int(pt0.RAIN.data.index.size / 4),
    limit_to_view=True,
)

fig.show_dash(mode="inline")

lttbc -> rounding errors when passing int-indexed data

Fact: lttbc requires an int/float index as input and not a (datetime64) time-index.
As a result, this code was written, where the time-index series is converted into an int-index representing the time in nanoseconds.

However, we observed that rounding errors occur because this int-index is internally converted by lttbc into a float index, after which we again try to derive an int index ➡️ rounding errors.

As a result, this code adjustment was made, mitigating this rounding error in most cases.

Note that this is not 100% solved! Rounding errors can still occur. An ideal solution would be for lttbc to just return the index positions of the selected data points.
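The precision loss described in this issue can be reproduced without lttbc. The following is a minimal sketch with made-up nanosecond timestamps: a float64 has a 53-bit mantissa, so epoch values near 1.65e18 ns can only be represented in steps of 256 ns. Converting offsets relative to the first timestamp is one common mitigation; it is shown only as an illustration and is not necessarily the adjustment that was made in the library.

```python
# Illustrative nanosecond timestamps (hypothetical values, roughly April 2022).
t0 = 1_650_000_000_123_456_789
timestamps = [t0 + i * 1_000_003 for i in range(5)]

# Naive round trip: int -> float64 -> int. Python floats are IEEE 754 doubles.
naive = [int(float(t)) for t in timestamps]
naive_errors = [n - t for n, t in zip(naive, timestamps)]

# Mitigation sketch: convert offsets relative to the first timestamp. These
# stay far below 2**53 and are therefore exactly representable as floats.
relative = [int(float(t - t0)) + t0 for t in timestamps]
relative_errors = [r - t for r, t in zip(relative, timestamps)]

print("naive errors (ns):   ", naive_errors)
print("relative errors (ns):", relative_errors)
```

The relative variant is only exact while the covered time span stays below 2**53 ns (about 104 days), which is why returning index positions, as suggested above, would be the more robust fix.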

Streamlit support

Hi,

I tried to use the lib with Streamlit. The figure is displayed correctly, but the resampling does not seem to work. Is there a trick to make it work with Streamlit, or is this not supported yet? If not, is there any way to support it?

Thanks,

Ghiles

Improve selenium tests

Perform Wireshark (request) logging to see whether the data effectively gets resampled
Log the console output to check whether no errors are thrown
Maybe also look at the events

Trying to make the resampler work with dynamic graphs

So I made this minimal example but I can not figure out why I can't get the callbacks to work.

"""
Minimal dynamic dash app example.
"""

import numpy as np
import plotly.graph_objects as go
import trace_updater
from dash import Dash, Input, Output, State, dcc, html
from plotly_resampler import FigureResampler

x = np.arange(1_000_000)
noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000

app = Dash(__name__)

fig = FigureResampler(go.Figure(go.Scatter(x=x, y=noisy_sin)))


app.layout = html.Div(
    [
        html.Div(
            children=[
                html.Button("Add Chart", id="add-chart", n_clicks=0),
            ]
        ),
        html.Div(id="container", children=[]),
    ]
)


@app.callback(
    Output("container", "children"),
    Input("add-chart", "n_clicks"),
    State("container", "children"),
)
def display_graphs(n_clicks: int, div_children: list[html.Div]) -> list[html.Div]:
    """
    This function is called when the button is clicked. It adds a new graph to the div.
    """
    figure = fig
    figure.register_update_graph_callback(
        app=app,
        graph_id=f"graph-id-{n_clicks}",
        trace_updater_id=f"trace-updater-id-{n_clicks}",
    )

    new_child = html.Div(
        children=[
            dcc.Graph(id=f"graph-id-{n_clicks}", figure=fig),
            trace_updater.TraceUpdater(
                id=f"trace-updater-id-{n_clicks}", gdID=f"graph-id-{n_clicks}"
            ),
        ],
    )
    div_children.append(new_child)
    return div_children


if __name__ == "__main__":
    app.run_server(debug=True)

No module named _lzma

While running the dash_app.py demo I am getting this error

I've installed the required libraries (dash_bootstrap_components and some others), so I'm not sure why this is happening.
The code requires lzma because of the import `from functional import seq`.

FigureWidget() update

Hi, very useful project; I have dreamed about such a thing for my whole career. It seems that it can make plotly usable in real life, not only on the iris dataset.

Is there a way to dynamically update the resampled FigureWidget instance? For example, in JupyterLab:

The last cell updates the data in the chart if fig is a FigureWidget instance, but does not update it if the instance is a FigureResampler(go.FigureWidget()).

Test case:

import numpy as np
import plotly.graph_objects as go
from plotly_resampler import FigureResampler

x = np.arange(1_000_000)
noisy_sin = (3 + np.sin(x / 15000) + np.random.randn(len(x)) / 10) * x / 1_000

fig = FigureResampler(go.FigureWidget())
fig.add_scattergl(name='noisy sine', showlegend=True, x=x, y=noisy_sin)

fig.update_layout(autosize=True, height=300, template=None, legend=dict(x=0.1, y=1, orientation="h"),
                  margin=dict(l=45, r=15, b=20, t=30, pad=3))
fig.show()

# does not update chart if fig is FigureResampler instance
with fig.batch_update():
    fig.data[0].y = -fig.data[0].y

PS: It seems that resampling only works in dash, but not in jupyterlab?

Flickering while updating one series

I often have several independently updated series on my charts, so I found that the chart redraws them all when updating only one, causing heavy flickering. Ideally, it would be nice if updating the hf_data property caused only that series to be redrawn, without the need to explicitly call reload_data(), which cannot determine which series has changed (at most you can add a parameter there that points to the changed series?)

Here is the demo and testcase:

dual.resampler.zip

update example notebooks

Main points:

add register_plotly_resampler / unregister_plotly_resampler
- show register_plotly_resampler + pd.options.plotting.backend = 'plotly'
show how add_traces can be used
show how add_trace and add_traces can be used with dict_input

to discuss: maybe also add the hack that when the trace type is not specified with a dict input, that we assume a scattergl will be used; instead of the default scatter behavior.

`FigureResampler` address already in use - takes way too long to throw error / quit

I think it is possible to catch this issue ourselves by checking up front whether the port is in use before feeding it to the traceupdater.
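A minimal sketch of such an up-front check, using only the standard library. The helper name `is_port_in_use` and the example port are hypothetical and not part of plotly-resampler. The check is best-effort: another process can still grab the port between the check and the actual server start.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding to (host, port) fails, i.e. the port is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"; other bind errors
            # (e.g. missing permissions) are also reported as unavailable.
            return True
    return False


# Usage sketch: fail fast before handing the port to the dash app.
port = 8050  # hypothetical port, used here only as an example value
if is_port_in_use(port):
    print(f"Port {port} is already in use; pick another one.")
else:
    print(f"Port {port} is free.")
```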

Box & lasso select do not work when resampling

Using plotly-resampler with dashapp?

I'm having some issues when rendering this figure with dashapp.

Firstly, I make a dashapp with the following controls:

controls = [
            dcc.Graph(
                id='uptime-graph',
                # ... some additional styling ...
            ),
            dcc.Graph(
                id='timeseries-graph',
                figure={
                    'data': []
                    
                }
            )
        ]

I'm using an uptime graph to select specific trace segments I want to look at. then, I update 'timeseries-graph' with a callback upon selection within the uptime graph:

def update_timeseries(self, new_coords):  # new_coords: the relayoutData dict from the callback
    if new_coords is None or 'autosize' in new_coords.keys() or 'xaxis.autorange' \
        in new_coords.keys():
            return None
    start = new_coords['xaxis.range[0]']
    end   = new_coords['xaxis.range[1]']
    dict_frame = self.model.get_timeseries(start,end)
    n_titles, plotting_dict = self._restructure_data(dict_frame)

    fig = FigureResampler(
                        make_subplots(
                            rows=len(plotting_dict.keys()),
                            cols=1,
                            row_titles=n_titles,
                            vertical_spacing=0.001,
                            shared_xaxes=True),
                            default_n_shown_samples=5_000,
                                verbose=False,
                    )
    fig['layout'].update(height=1700)
    row_iterator = 1
    has_legend = {'ex':False,'ey':False,'hx':False,'hy':False,'hz':False}
    for station_key in plotting_dict.keys():
        for trace_data in plotting_dict[station_key]:
            color = self._get_trace_color(station_key)
            name, showlegend =self._legend_name_parser(has_legend,station_key)
            fig.add_trace(go.Scattergl(name=name, showlegend=showlegend,connectgaps=False,
                                                   line={'color':color,'dash':'solid'}), 
                            hf_x=trace_data.time, hf_y=trace_data['value'],row=row_iterator,col=1)
        row_iterator+=1
    print('updated timeseries figure')
    fig.show_dash(mode='inline')
    return fig


@dashapp.callback(
    Output('timeseries-graph', 'figure'),
    Input('uptime-graph', 'relayoutData'))
def uptime_data_select(relayoutData):
    fig = controller.update_timeseries_daterange(relayoutData)
    return fig

It kinda works, then begins to spit the same error every four seconds, preventing any further interaction with the webapp


Traceback (most recent call last):
  File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 2077, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 1525, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 1523, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 1509, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/dash/dash.py", line 1382, in dispatch
    raise KeyError(msg.format(output)) from missing_callback_function
KeyError: "Callback function not found for output 'timeseries-graph.figure', perhaps you forgot to prepend the '@'?"
2022-05-11T15:10:10 [line 1455] local_app.log_exception - ERROR: Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/Users/kevinmendoza/miniconda3/envs/mtpytest/lib/python3.9/site-packages/dash/dash.py", line 1344, in dispatch
    cb = self.callback_map[output]
KeyError: 'timeseries-graph.figure'

I suppose it's possible I'm not using it correctly, but if I am, there appears to be an error with the resampling hooking back into the graph.
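The guard at the top of update_timeseries can be isolated into a small helper, which makes the handling of the relayoutData keys easier to test on its own. This is a sketch: the function name `extract_x_range` is hypothetical, and it only covers the keys that appear in the snippet above. It does not address the KeyError from the traceback.

```python
from typing import Optional, Tuple


def extract_x_range(relayout_data: Optional[dict]) -> Optional[Tuple[object, object]]:
    """Return (start, end) if the relayout event carries an explicit x-range."""
    if not relayout_data:
        return None
    # The initial render and an axis reset carry no explicit range.
    if "autosize" in relayout_data or "xaxis.autorange" in relayout_data:
        return None
    try:
        return relayout_data["xaxis.range[0]"], relayout_data["xaxis.range[1]"]
    except KeyError:
        # For example an event that only changes the y-axis.
        return None


print(extract_x_range({"xaxis.range[0]": "2022-05-01", "xaxis.range[1]": "2022-05-02"}))
print(extract_x_range({"xaxis.autorange": True}))
```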

Make `FigureResampler` work on existing figures

i.e., when we wrap a go.Figure that already has some high-dim traces registered, we do not yet extract these traces and process them as new traces.

This issue addresses this problem and suggests that these pre-existing traces should be handled as potential to-be-downsampled traces.

e.g.

import plotly.express as px

# now this just shows the non-aggregated RAW data, very slow
FigureResampler(px.line(very_large_df, x='time', y='sensor_name'))

Fair questions:

How slow is px.line on high-freq data? (w.r.t. creating a scatter instead of whole go.Figure )

no support for dict representation of plotly traces

The following code works in plotly, but does not work when wrapping the constructor with FigureResampler.

trace = {
    "type": "scatter",
        #  "x": [1, 2, 3],
    "y": np.arange(5_000),
    "name": "some_trace",  # is not even required
}
fig = go.Figure()  # wrapping this will fail
fig.add_trace(trace)

We can support the dict representation in two places:

in the constructor of AbstractFigureAggregator (-> not really sure if we want to do this)
in the add_trace method -> our type hinting indicates that we support dict input, but this is clearly not the case
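One way to support both inputs in add_trace is to normalize the trace to a plain dict first. The sketch below is an illustration only: `normalize_trace` is a hypothetical helper, not the library's code. It relies on plotly graph objects exposing `to_plotly_json()`, and the "scattergl" default is the idea floated in the notebook-update issue above, not confirmed behavior.

```python
from typing import Any, Mapping


def normalize_trace(trace: Any, default_type: str = "scattergl") -> dict:
    """Return a plain dict for either a dict trace or a trace-like object."""
    if isinstance(trace, Mapping):
        # Shallow copy, so the caller's dict is left untouched.
        trace_dict = dict(trace)
    elif hasattr(trace, "to_plotly_json"):
        trace_dict = dict(trace.to_plotly_json())
    else:
        raise TypeError(f"Unsupported trace type: {type(trace).__name__}")
    # Assumption: fall back to a default type when none is specified.
    trace_dict.setdefault("type", default_type)
    return trace_dict


trace = {"y": list(range(5_000)), "name": "some_trace"}
normalized = normalize_trace(trace)
print(normalized["type"], normalized["name"], len(normalized["y"]))
```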

Wondering about the license

Hello.

I really like this software and I have been using it a lot.

Now I would like to use it in our company, where we run a local dash app. The app is used by me and my colleagues to share results with each other. But after reading the license, it seems that I cannot use this internally within our company?

Current LTTB (lttbc) algorithm is sometimes numerically unstable

There appear to be (very minor) changes in floating point numbers when using the lttbc LTTB implementation.

This causes test test_fwr_time_based_data_s to fail sometimes

A fix for this would be using our own lttb algorithm (also implemented in C) -> see #84

As for now, we will disable this test (as this only occurs in very niche use-cases) and leave this issue open until this problem is resolved (by merging #84)

Support for plotly express

Hello - I really like this tool! Just wondering if it currently supports plotly express figures (specifically px.scatter). If not, is this in the roadmap for the near future?

Thanks!

Sam

On click callback in notebooks.

Is it possible to have the plotly on_click() listener work in a Jupyter notebook? If so, could you extend the sample notebook to demonstrate this?

Parallelize the resampling

How will the verbose-argument work then?
-> maybe if verbose, set n_jobs to 0
use multiprocess to deal with local scope methods
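A rough sketch of how the verbose question could be handled: fall back to sequential execution whenever verbose output is requested, so log lines stay in a deterministic order. All names here (`resample_traces`, `every_nth`, `n_jobs`) are hypothetical. The sketch uses a thread pool from the standard library rather than the multiprocess package mentioned above, so it sidesteps the pickling of locally scoped methods but would not speed up pure-Python aggregators.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def every_nth(values: Sequence[float], n_out: int) -> List[float]:
    """Toy aggregator: keep roughly n_out evenly spaced samples."""
    step = max(1, len(values) // n_out)
    return list(values[::step])


def resample_traces(
    traces: Dict[str, Sequence[float]],
    n_out: int,
    aggregate: Callable[[Sequence[float], int], List[float]] = every_nth,
    n_jobs: int = 4,
    verbose: bool = False,
) -> Dict[str, List[float]]:
    """Aggregate every trace, in parallel unless verbose output is requested."""
    if verbose or n_jobs <= 1:
        # Sequential path: log lines appear in trace order.
        result = {}
        for name, values in traces.items():
            if verbose:
                print(f"resampling {name}: {len(values)} -> ~{n_out} samples")
            result[name] = aggregate(values, n_out)
        return result

    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = {
            name: pool.submit(aggregate, values, n_out)
            for name, values in traces.items()
        }
        return {name: future.result() for name, future in futures.items()}


traces = {"a": list(range(1_000)), "b": list(range(2_000))}
parallel = resample_traces(traces, n_out=100)
sequential = resample_traces(traces, n_out=100, verbose=True)
print(parallel == sequential)
```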

FigureWidget(Resampler) does not integrate well in google colab

In google colab plotly FigureWidget & plotly-resampler FigureWidgetResampler:

🎉 work perfectly when creating an empty figure and displaying it while adding traces in another cell

Example with plotly FigureWidget:

Example with plotly-resampler FigureWidgetResampler:

😿 do not work when creating the figure and adding the data + displaying it in the same cell

Example for plotly FigureWidget & plotly-resampler FigureWidgetResampler

-> This relates to this issue; googlecolab/colabtools#2871
(however for FigureWidgetResampler it does not work for integer datatypes, whereas for FigureWidget this does work)

What is the impact of this problem?

Registering plotly-resampler in google colab will result in figures that are not displayed.

FYI: to use plotly FigureWidgets in google colab you should first execute the following code (see __init__.py):

from google.colab import output
output.enable_custom_widget_manager()

Cannot assign hf_data series when initial size is small

I make a graph that initially has a small amount of data (or no data at all), and then data is added incrementally. I have found that if I don't set hf_y at all (or pass fewer than 1000 points), the hf_data property is not created (because there is no need to resample, I guess). Is there a way to create it later, or to create it even for a small amount of data?

Testcase:
resampler.debug.zip

Is there a simple way to convert a `FigureResampler` figure to a standard plotly Figure?

So that the standard plotly Figure includes all the points. My global objective is to save a full-points figure to HTML, and if I just do write_html on the FigureResampler figure, I only get a reduced-resolution version. It's OK if there is no way, but any thoughts would be appreciated.

I have tried

# given that `fig` is a resampled `FigureResampler` figure

from plotly_resampler import FigureResampler
fig2 = FigureResampler(fig, default_n_shown_samples = 100000)
fig2.write_html('overview.html')

# and

import plotly.graph_objs as go
fig3 = go.Figure(fig)
fig3.write_html('overview.html')

but it still saves a rescaled version.

Thanks! :)

Django support (django-plotly-dash library)

How does this library integrate with Django?

(Issue indicated by @thanosam in #45)

Question: does it work with large 2D-heatmaps?

I wonder if plotly-resampler works primarily with timelines or can be applied to large two dimensional heatmaps or histograms.

🐛 add_trace seems slow when using a series with `hf_x` and `hf_y` set as a series

Improve docs

Add more docs (+ examples) about:

How to integrate plotly-resampler with your custom dash app
Plotly-resampler & non hf-traces
Tips & tricks:
- optimizing your code for faster figure construction
- how to (not) add hf-traces and why you should do so
- Aliasing
Position plotly-resampler relative to other tools (Plotjuggler, DataShader, (Holoviews), FigureWidgets + interaction ...)
see plotly-resampler benchmarks

`FigureResampler` replace not working as it should when using a `go.Figure`

Unable to install `plotly-resampler` on some linux distributions

I am using plotly-resampler, which installs correctly on my local Windows machine and on another Linux machine I have access to (by just using pip install plotly-resampler). However, we also run Gitlab-CI tests in a controlled environment and installing with pip kept failing in that environment. The exact error was

Building wheel for lttbc (setup.py): started
  Building wheel for lttbc (setup.py): finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5/setup.py'"'"'; __file__='"'"'/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-6yato32h
       cwd: /tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5/
  Complete output (16 lines):
  running bdist_wheel
  running build
  running build_ext
  building 'lttbc' extension
  creating build
  creating build/temp.linux-x86_64-3.9
  gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5 -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5 -I/usr/local/include/python3.9 -c lttbc.c -o build/temp.linux-x86_64-3.9/lttbc.o
  In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/syslimits.h:7,
                   from /usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:34,
                   from /usr/local/include/python3.9/Python.h:11,
                   from lttbc.c:2:
  /usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:195:15: fatal error: limits.h: No such file or directory
    195 | #include_next <limits.h>  /* recurse down to the real one */
        |               ^~~~~~~~~~
  compilation terminated.
  error: command '/usr/bin/gcc' failed with exit code 1
  ----------------------------------------
  ERROR: Failed building wheel for lttbc
  Running setup.py clean for lttbc
 Running setup.py install for lttbc: started
    Running setup.py install for lttbc: finished with status 'error'
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"'; __file__='"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-uwdc83fp/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.9/lttbc
         cwd: /tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/
    Complete output (16 lines):
    running install
    running build
    running build_ext
    building 'lttbc' extension
    creating build
    creating build/temp.linux-x86_64-3.9
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851 -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851 -I/usr/local/include/python3.9 -c lttbc.c -o build/temp.linux-x86_64-3.9/lttbc.o
    In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/syslimits.h:7,
                     from /usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:34,
                     from /usr/local/include/python3.9/Python.h:11,
                     from lttbc.c:2:
    /usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:195:15: fatal error: limits.h: No such file or directory
      195 | #include_next <limits.h>  /* recurse down to the real one */
          |               ^~~~~~~~~~
    compilation terminated.
    error: command '/usr/bin/gcc' failed with exit code 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"'; __file__='"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-uwdc83fp/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.9/lttbc Check the logs for full command output.

I am posting it here in case it helps other folks who might encounter the same problem.

I have played around with the test environment and was able to install all packages by executing

apt update && apt install -yqq --no-install-recommends gcc musl-dev linux-headers-amd64 libc-dev

before the pip command. This allowed me to install the apparently missing Linux header, lttbc, and plotly-resampler. However, for some reason this resulted in an incompatibility with numpy:

------------------------------- Captured stderr --------------------------------
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
___________________ ERROR collecting pv/test_pv_generator.py ___________________
ImportError while importing test module '/builds/--YGsyLe/2/optim/statistical_analysis/tests/pv/test_pv_generator.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test.py:12: in <module>
    from plotly_resampler import FigureResampler
/usr/local/lib/python3.9/site-packages/plotly_resampler/__init__.py:8: in <module>
    from .figure_resampler import FigureResampler
/usr/local/lib/python3.9/site-packages/plotly_resampler/figure_resampler.py:28: in <module>
    from .downsamplers import AbstractSeriesDownsampler, LTTB
/usr/local/lib/python3.9/site-packages/plotly_resampler/downsamplers/__init__.py:5: in <module>
    from .downsamplers import LTTB, EveryNthPoint, AggregationDownsampler
/usr/local/lib/python3.9/site-packages/plotly_resampler/downsamplers/downsamplers.py:8: in <module>
    import lttbc
E   ImportError: numpy.core.multiarray failed to import

So I gave up.

As I said, I am publishing this info here in case someone stumbles on a similar issue, so feel free to close. However, I saw that lttbc is a top-level dependency of plotly-resampler and is still in early stages (version <1) and has not been updated since 2020. So there is little chance its python wheels will be changed anytime soon. So I wonder, whether on the plotly-resampler side we could add a try-except for lttbc import and fall back onto another resampler if lttbc is unavailable for import? Or, perhaps, if you have any idea of how to install the lttbc dependency without gcc compiling, it would be much appreciated!

I understand this is not directly related to plotly-resampler. I have thought about posting in lttbc instead, but the repo does not seem to be actively maintained. Thanks again for the resampler. Great idea!
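The try-except fallback suggested in this issue could look roughly like the sketch below. It is an illustration, not the library's code: `every_nth_point` and `downsample` are hypothetical names, and the call `lttbc.downsample(x, y, n_out)` on NumPy arrays is an assumption about the lttbc interface. Note that the numpy API mismatch shown in the log surfaces as an ImportError, so this guard would catch it as well.

```python
try:
    import lttbc  # optional C extension; may fail to build or to import
except ImportError:
    lttbc = None

HAS_LTTBC = lttbc is not None


def every_nth_point(x, y, n_out):
    """Pure-Python fallback: keep every k-th sample, at most n_out in total."""
    n = len(x)
    if n <= n_out:
        return list(x), list(y)
    step = -(-n // n_out)  # ceiling division, so at most n_out samples remain
    idx = range(0, n, step)
    return [x[i] for i in idx], [y[i] for i in idx]


def downsample(x, y, n_out):
    """Use lttbc when it imported successfully, else the simple fallback."""
    if HAS_LTTBC:
        import numpy as np

        # Assumption: lttbc exposes downsample(x, y, n_out) on NumPy arrays.
        return lttbc.downsample(np.asarray(x), np.asarray(y), n_out)
    return every_nth_point(x, y, n_out)


print("lttbc available:", HAS_LTTBC)
xs, ys = every_nth_point(list(range(1_000)), list(range(1_000)), 100)
print(len(xs), len(ys))
```

The fallback does not preserve the visual shape of the signal as well as LTTB does, so a warning when it is used would be a reasonable addition.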

Deprecation warnings from pandas

With pandas 1.4 or later I am getting deprecation warnings from figure_resampler.py:

pandas.UInt64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.

Missing points

Why are points missing?
From watching the GIF linked from your research paper, I thought that it would resample again when I zoom in.

Code for the data

import numpy as np
from plotly_resampler import FigureResampler

x = np.arange(10_000)
noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000

[Screenshots: before zooming (not resampled, resampled); after zooming (not resampled, resampled)]

reset_axes() has no effect on subplot

I tried adding the bottom subplot and the behavior of reload_data and reset_axes is different from the top graph on it (even by manual cell run) - these calls have no effect on the second graph. Are they handled differently?

The test script: Untitled.zip

Different behavior with hf_x with type `int64` vs `float64`

Hello !
This is a really neat project ! Great job !

I have experienced a weird behavior. I work in a jupyter notebook inside vscode. The data has a shape of (6261089,).

This snippet works perfectly fine.

fig = FigureResampler(go.Figure())
fig.add_trace(go.Scattergl(name='Trace', showlegend=True), hf_x=x.astype(np.int64), hf_y=raw_data)
fig.show_dash(mode='inline')

But when I change the parameter to hf_x=x.astype(np.float64)

fig = FigureResampler(go.Figure())
fig.add_trace(go.Scattergl(name='Trace', showlegend=True), hf_x=x.astype(np.float64), hf_y=raw_data)
fig.show_dash(mode='inline')

I get the following error :

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/data_inspection.ipynb Cell 9' in <cell line: 13>()
     10 print(raw_data.shape, times.shape)
     12 fig = FigureResampler(go.Figure())
---> 13 fig.add_trace(go.Scattergl(name='Trace', showlegend=True), hf_x=ranged_arr.astype(np.float64), hf_y=raw_data)
     14 fig.show_dash(mode='inline')

File ~/.venv/lib/python3.9/site-packages/plotly_resampler/figure_resampler/figure_resampler_interface.py:743, in AbstractFigureAggregator.add_trace(self, trace, max_n_samples, downsampler, limit_to_view, hf_x, hf_y, hf_hovertext, **trace_kwargs)
    732 trace = {
    733     k: trace[k]
    734     for k in set(trace.keys()).difference(
    735         {"text", "hovertext", "x", "y"}
    736     )
    737 }
    739 # NOTE:
    740 # If all the raw data needs to be sent to the javascript, and the trace
    741 # is high-frequency, this would take significant time!
    742 # Hence, you first downsample the trace.
--> 743 trace = self._check_update_trace_data(trace)
    744 assert trace is not None
    745 super(self._figure_class, self).add_trace(trace=trace, **trace_kwargs)

File ~/.venv/lib/python3.9/site-packages/plotly_resampler/figure_resampler/figure_resampler_interface.py:240, in AbstractFigureAggregator._check_update_trace_data(self, trace, start, end)
    238 # Downsample the data and store it in the trace-fields
    239 downsampler: AbstractSeriesAggregator = hf_trace_data["downsampler"]
--> 240 s_res: pd.Series = downsampler.aggregate(
    241     hf_series, hf_trace_data["max_n_samples"]
    242 )
    243 trace["x"] = s_res.index
    244 trace["y"] = s_res.values

File ~/.venv/lib/python3.9/site-packages/plotly_resampler/aggregation/aggregation_interface.py:142, in AbstractSeriesAggregator.aggregate(self, s, n_out)
    138     s = s.astype("uint8")
    140 if len(s) > n_out:
    141     # More samples that n_out -> perform data aggregation
--> 142     s = self._aggregate(s, n_out=n_out)
    144     # When data aggregation is performed -> we do not "insert" gaps but replace
    145     # The end of gap periods (i.e. the first non-gap sample) with None to
    146     # induce such gaps
    147     if self.interleave_gaps:

File ~/.venv/lib/python3.9/site-packages/plotly_resampler/aggregation/aggregators.py:240, in EfficientLTTB._aggregate(self, s, n_out)
    238 def _aggregate(self, s: pd.Series, n_out: int) -> pd.Series:
    239     if s.shape[0] > n_out * 1_000:
--> 240         s = self.minmax._aggregate(s, n_out * 50)
    241     return self.lttb._aggregate(s, n_out)

File ~/.venv/lib/python3.9/site-packages/plotly_resampler/aggregation/aggregators.py:134, in MinMaxOverlapAggregator._aggregate(self, s, n_out)
    127 offset = np.arange(
    128     0, stop=s.shape[0] - block_size - argmax_offset, step=block_size
    129 )
    131 # Calculate the argmin & argmax on the reshaped view of `s` &
    132 # add the corresponding offset
    133 argmin = (
--> 134     s[: block_size * offset.shape[0]]
    135     .values.reshape(-1, block_size)
    136     .argmin(axis=1)
    137     + offset
    138 )
    139 argmax = (
    140     s[argmax_offset : block_size * offset.shape[0] + argmax_offset]
    141     .values.reshape(-1, block_size)
   (...)
    144     + argmax_offset
    145 )
    146 # Sort the argmin & argmax (where we append the first and last index item)
    147 # and then slice the original series on these indexes.

ValueError: cannot reshape array of size 6260945 into shape (251)

Thank you!
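For context on the final error: NumPy's reshape(-1, block_size) only succeeds when the array length is an exact multiple of block_size. The sketch below is not plotly-resampler code. The toy array, the block size, and all variable names are assumptions chosen for illustration. It reproduces the same kind of ValueError and then shows a block-wise argmin after trimming with a positional slice.

```python
import numpy as np

block_size = 251  # same block size as in the error message above
values = np.arange(1_000, dtype=np.float32)  # toy data, length not a multiple of 251

# Reshaping fails when the length is not divisible by the block size.
try:
    values.reshape(-1, block_size)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)

# Trim positionally to a whole number of blocks, then reshape.
n_blocks = values.shape[0] // block_size
trimmed = values[: n_blocks * block_size]
blocks = trimmed.reshape(-1, block_size)

# Block-wise argmin, shifted back to positions in the original array.
offsets = np.arange(n_blocks) * block_size
argmin_per_block = blocks.argmin(axis=1) + offsets
```

If the slice in front of the reshape does not produce exactly n_blocks * block_size values, the reshape raises the error shown in the traceback.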

Fix nans when resampling with LTTB

minimal snippet:

import numpy as np
import pandas as pd

nb_samples = 10_000_000

x = np.arange(nb_samples).astype(np.uint32)
y = np.sin(x/300).astype(np.float32) + np.random.randn(nb_samples) / 5

s = pd.Series(index=x, data=y)

from context_aware.visualizations.plotly.downsamplers import LTTB
out = LTTB().downsample(s, n_out=3000)
out
# 0          0.263548
# 1374      -1.791638
# 4183       1.557843
# 7118      -1.570702
# 10017      1.331199
#             ...   
# 9990631    1.491713
# 9993480   -1.437206
# 9999224   -1.541181
# 9999224         NaN
# 9999999    0.778644
out.isna().sum()
# 149
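The output above contains the index label 9999224 twice, once with a NaN value. A minimal sketch of how such an output could be checked, assuming a hand-built series that mimics the tail of the printed result (it is not produced by the LTTB downsampler itself, and the variable names are illustrative):

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the downsampler output shown above.
out = pd.Series(
    index=[0, 1374, 9999224, 9999224, 9999999],
    data=[0.263548, -1.791638, -1.541181, np.nan, 0.778644],
)

# Number of NaN values in the output.
n_nan = int(out.isna().sum())

# Index labels that occur more than once.
duplicated_labels = out.index[out.index.duplicated()].unique().tolist()

# Whether any NaN sits on a duplicated index label.
nan_on_duplicates = bool(out[out.index.duplicated(keep=False)].isna().any())
```

In this stand-in the NaN sits on the duplicated label; checking both conditions on the real output would show whether all 149 NaNs follow that pattern.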

Add windows and mac as platforms to the pytest testing matrix

Should be roughly the same as we did with the tsflex packages @jvdd.

Resampler fails for pd.Series if the column is not selected

Normally, plotly allows one not to specify (the only) column of a series one is plotting. So the following code works fine:

import numpy as np
import pandas as pd
import plotly.graph_objects as go

fig = go.Figure()
ser = pd.Series(index=np.arange(100), data={'a': np.arange(100)})
fig.add_trace(
    go.Scattergl(
        x=ser.index,
        y=ser,
    )
)
fig.show()

However, it fails when combining it with the FigureResampler with the following error:

C:\tools\miniconda3\envs\stat-analysis-3.7\lib\site-packages\plotly_resampler\figure_resampler.py in add_trace(self, trace, max_n_samples, downsampler, limit_to_view, hf_x, hf_y, hf_hovertext, **trace_kwargs)
    531             if pd.isna(hf_y).any():
    532                 not_nan_mask = ~pd.isna(hf_y)
--> 533                 hf_x = hf_x[not_nan_mask]
    534                 hf_y = hf_y[not_nan_mask]
    535                 if isinstance(hf_hovertext, np.ndarray):

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

The problem is solved by specifying the column to plot, i.e. y=ser['a']. It took me several minutes to find this, though, and I assume plotly-resampler is designed to be transparent.

Perhaps an isinstance(hf_y, pd.Series) check could be added to handle this specific case? Or is there another way of checking for NaNs that also works for such series?

In any case, thanks for the great software!
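One possible way to make the NaN masking robust against single-column 2-D input is to flatten it to 1-D first. The sketch below is not the library's fix. The helper to_1d is hypothetical, and the toy DataFrame stands in for the data from the report.

```python
import numpy as np
import pandas as pd


def to_1d(values) -> np.ndarray:
    """Hypothetical helper: flatten single-column 2-D input to a 1-D array."""
    arr = np.asarray(values)
    if arr.ndim == 2 and 1 in arr.shape:
        arr = arr.ravel()
    if arr.ndim != 1:
        raise ValueError(f"expected 1-D data, got shape {arr.shape}")
    return arr


# Toy single-column frame with one missing value.
df = pd.DataFrame({"a": np.arange(5, dtype=float)}, index=np.arange(5))
df.iloc[2, 0] = np.nan

hf_x = np.asarray(df.index)
hf_y = to_1d(df)  # shape (5,) instead of (5, 1)

# The mask is now 1-D, so it can index hf_x and hf_y alike.
not_nan_mask = ~pd.isna(hf_y)
hf_x = hf_x[not_nan_mask]
hf_y = hf_y[not_nan_mask]
```

Raising on genuinely multi-column input keeps the failure explicit instead of silently picking a column.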

Are Candlestick charts supported?

Hello and thank you for this awesome project.

It works fine for normal arrays, but it doesn't seem to do anything for candlestick charts.
Is this normal?

Thx in advance!
Piotr

Look into anti-aliasing downsamplers

Resample only the visible traces

Only resample the traces that are shown, and do not resample the not shown (i.e., toggled off) traces.

This is not straightforward, as toggling a trace in the legend does not trigger a callback event (i.e., it is a pure front-end event).
To fix this, we should pass the front-end visibility state via the trace-updater. This needs some more thought.
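A rough sketch of the selection step, assuming the front-end visibility state arrives as a list of trace dictionaries. Plotly's visible attribute can be True, False, or "legendonly", and defaults to True when absent. The function name and the payload shape are assumptions, not part of plotly-resampler or the trace-updater.

```python
from typing import Any, Dict, List


def visible_trace_indices(traces: List[Dict[str, Any]]) -> List[int]:
    """Return the indices of traces that are actually drawn.

    A missing "visible" key counts as visible; False and "legendonly"
    (toggled off in the legend) are skipped.
    """
    return [i for i, trace in enumerate(traces) if trace.get("visible", True) is True]


# Hypothetical front-end state for four traces.
front_end_state = [
    {"name": "a"},
    {"name": "b", "visible": "legendonly"},
    {"name": "c", "visible": True},
    {"name": "d", "visible": False},
]

to_resample = visible_trace_indices(front_end_state)
```

Only the traces in to_resample would then need new data on a zoom or pan event.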

LTTB - 0.2.1 not working

TODO:

Look into using other (more optimized) LTTB versions, such as @mcourteaux's awesome LTTB implementation
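As a baseline for comparing such versions, here is a plain NumPy sketch of the textbook LTTB (largest-triangle-three-buckets) procedure. It is neither plotly-resampler's implementation nor @mcourteaux's; the function name lttb_indices and the bucket-edge scheme are assumptions made for this example.

```python
import numpy as np


def lttb_indices(x: np.ndarray, y: np.ndarray, n_out: int) -> np.ndarray:
    """Return the positions selected by a textbook LTTB pass."""
    n = len(x)
    if n_out >= n or n_out < 3:
        return np.arange(n)

    # n_out - 2 buckets over the interior points 1 .. n - 2.
    edges = np.linspace(1, n - 1, n_out - 1).astype(int)
    selected = np.empty(n_out, dtype=int)
    selected[0], selected[-1] = 0, n - 1  # first and last point are always kept

    a = 0  # position of the previously selected point
    for i in range(n_out - 2):
        start, end = edges[i], edges[i + 1]
        if i < n_out - 3:
            # Third triangle corner: mean of the next bucket.
            cx = x[edges[i + 1]:edges[i + 2]].mean()
            cy = y[edges[i + 1]:edges[i + 2]].mean()
        else:
            # Final bucket: use the last point instead.
            cx, cy = x[n - 1], y[n - 1]
        bx, by = x[start:end], y[start:end]
        # Twice the triangle area for every candidate in the bucket.
        area = np.abs((x[a] - cx) * (by - y[a]) - (x[a] - bx) * (cy - y[a]))
        a = start + int(area.argmax())
        selected[i + 1] = a
    return selected


# Flat signal with a single spike; the spike should survive downsampling.
x = np.arange(100, dtype=float)
y = np.zeros(100)
y[50] = 100.0
idx = lttb_indices(x, y, n_out=10)
```

The Python-level loop over buckets is the obvious target for optimization; the vectorized area computation inside each bucket is already cheap.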

Triple click within graph resets axes, but data doesn't get updated

Still need to investigate where it goes awry.

Inconsistent behavior of `show` method

Currently, when you call the show method on a FigureResampler object, it displays a downsampled version of the figure that is not resampled on zoom, which looks like a "broken" figure when zooming in. This behavior is inconsistent with standard Plotly figures and could cause confusion.
I see two possible solutions to this problem:

Make the behavior consistent with Plotly by showing the original (not downsampled) figure when show is called. This could be annoying, though, if you are working with a lot of points and inadvertently call show, crashing your kernel/browser as a result.
Keep the current behavior, but emit a warning, as the result might not be what the user expects.
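The second option could look roughly like the sketch below. DownsampledFigure is a hypothetical stand-in class, not FigureResampler, and the warning text and the n_shown default are assumptions.

```python
import warnings


class DownsampledFigure:
    """Hypothetical stand-in for a figure wrapper with a static show()."""

    def __init__(self, n_points: int, n_shown: int = 1000) -> None:
        self.n_points = n_points
        self.n_shown = n_shown

    def show(self) -> int:
        """Warn if data was downsampled; return the number of rendered points."""
        if self.n_points > self.n_shown:
            warnings.warn(
                "show() renders a static, downsampled figure; "
                "zooming will not load more detail.",
                UserWarning,
                stacklevel=2,
            )
        return min(self.n_points, self.n_shown)


# Usage: capture the warning instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    n_rendered = DownsampledFigure(n_points=1_000_000).show()
warning_messages = [str(w.message) for w in caught]
```

Small figures that fit within n_shown stay silent, so the warning only appears when the displayed data actually differs from the raw data.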

Roadmap

This issue is a request to the community to share their vision of plotly-resampler: which features are still worth implementing?

Some features which I find worth pursuing:

Summary dataset statistics of the respective view (initially in a table)
-> e.g. a Table with df.describe() for each series that is shown.
Options to annotate your data
-> No idea how to implement this with plotly; I would think you need to define your own OO structure to do so (which reuses the underlying annotations), but this would still imply a lot of logic and design decisions (e.g. loading annotations, saving annotations, ...)
Other aggregation methods (e.g. using width of trace for uncertainty / downsampling rate)
-> Just a gist, but using a plot-per-pixel ratio of 2 with LTTB seems a rather good idea to me.
-> Also, playing with the line width seems a valid path to embark upon.
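The first roadmap item (summary statistics of the current view) could be prototyped with plain pandas, as sketched below. The column names, the time range, and the view bounds are made up for illustration; in practice the bounds would come from the figure's current x-axis range.

```python
import numpy as np
import pandas as pd

# Toy data: two series on a shared minute-resolution time index.
index = pd.date_range("2022-01-01", periods=1_000, freq="min")
df = pd.DataFrame(
    {
        "sensor_a": np.arange(1_000, dtype=float),
        "sensor_b": np.arange(1_000, dtype=float)[::-1],
    },
    index=index,
)

# Hypothetical view bounds (label-based slicing includes both ends).
view_start, view_end = index[100], index[199]

# One describe() column per series, restricted to the visible range.
view_stats = df.loc[view_start:view_end].describe()
```

The resulting frame has one column per series and could be rendered directly as a table next to the graph.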

FigureResampler removes annotations

Don't know if this is an enhancement or a bug, but it seems that wrapping a figure in FigureResampler removes the text labels from the plot, at least if the number of points exceeds default_n_shown_samples. Here is a simple example illustrating the issue in a Dash application.

To see the labels, uncomment the fig that is created without FigureResampler.

I guess the desired behavior, if LTTB removes the point that carries a label, is to not show that label. That is probably also the easiest to implement.

import numpy as np
import plotly.graph_objects as go
import trace_updater
from dash import Dash, dcc, html
from plotly_resampler import FigureResampler

app = Dash(__name__)


# Create labels for the plot, but only on every 500th point
labels = [""] * 10_000
labels[500::501] = ["label" for x in labels[500::501]]

# Create the x and y values for the plot
x = np.arange(10_000)
noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000


fig = FigureResampler(go.Figure())
fig.add_trace(
    go.Scatter(
        x=x,
        y=noisy_sin,
        mode="lines+text",
        text=labels,
        textposition="top center",
    ),
)
fig.register_update_graph_callback(
    app=app, graph_id="graph-id", trace_updater_id="trace-updater"
)

# fig = go.Figure()
# fig.add_trace(
#     go.Scatter(
#         x=x,
#         y=noisy_sin,
#         mode="lines+text",
#         text=labels,
#         textposition="top center",
#     )
# )


app.layout = html.Div(
    [
        dcc.Graph(id="graph-id", figure=fig, mathjax=True),
        trace_updater.TraceUpdater(
            id="trace-updater",
            gdID="graph-id",
            sequentialUpdate=False,
        ),
    ],
)


if __name__ == "__main__":
    app.run_server(debug=True)
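The desired behavior described above (show a label only if its point survives downsampling) amounts to indexing the text array with the same positions as x and y. In the sketch below, evenly spaced positions stand in for the LTTB selection, which is an assumption made purely for brevity; all variable names are illustrative.

```python
import numpy as np

n = 10_000
x = np.arange(n)

# Sparse labels, laid out like in the example above.
labels = np.full(n, "", dtype="U5")
labels[500::501] = "label"

# Stand-in for the downsampler: evenly spaced positions instead of LTTB.
n_out = 1_000
keep = np.linspace(0, n - 1, n_out).astype(int)

# Index x and text with the same positions so they stay aligned.
x_shown = x[keep]
text_shown = labels[keep]

n_labels_total = int((labels != "").sum())
n_labels_shown = int((text_shown != "").sum())
```

Labels whose points are dropped simply disappear, while the surviving ones stay attached to the right x positions.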

🎉 Works perfectly when creating an empty figure and displaying it, while adding traces in another cell

😿 Does not work when creating the figure, adding the data, and displaying it in the same cell

What is the impact of this problem?

Code for the data

[Figures: before zooming and after zooming, each comparing the not-resampled and the resampled plot]
