predict-idlab / plotly-resampler
Visualize large time series data with plotly.py
Home Page: https://predict-idlab.github.io/plotly-resampler/latest
License: MIT License
When running your example notebook I get the following error:
AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')
Maybe it's a plotly version issue?
Cheers
Do you have plans to add support for heatmap-series resampling? It is just a 2D picture, and there are many fast, efficient image-resampling algorithms available.
Hi,
First of all, thanks for your work. Plotly-Resampler allows me to handle plots with 30bln datapoints. Really great job!
Unfortunately, I have a problem with your library combined with PyInstaller. The code works properly inside PyCharm, but when I add "from plotly_resampler import FigureResampler" to the code, build a standalone .exe file, and run it, it crashes without any error.
Do you know what the root cause could be, or how I can work around this issue?
Of course, I added all the needed dependencies to the *.spec file.
Update: when I removed Jupyter and the show_dash() function, the .exe file started working, so maybe there is some routing conflict between two local servers?
Thanks,
Piotr
Cannot seem to reproduce this bug with the use case of 3 traces, each having 90M data points. So will close this issue for now.
First, thanks for creating this! Our data can be very oversized for Plotly and this has the potential to save us a lot of development time!
Background:
I'm developing a library of Plotly figure functions for our varying datatypes (meteorological and oceanographic) and our rain sensor samples at 1-minute intervals, but our data extend from 10 months to 2+ years.
Problem:
It's on the verge of perfect, but I cannot determine why the y-axis is inverting.
Here is the code that generates a standard Plotly Scattergl plot that will always crash a browser or notebook:
fig = plotly.graph_objects.Figure()
fig.add_trace(
    plotly.graph_objects.Scattergl(
        x=pt0.RAIN.data.index,
        y=pt0.RAIN.data.loc[:, IDX[1,"1046",'Vol']],
        mode="markers",
        showlegend=True,
        name="Rain Vol",
    )
)
And here is the code that I'm attempting to use to generate a Resampler figure:
fig = FigureResampler(plotly.graph_objects.Figure())
fig.add_trace(
    plotly.graph_objects.Scattergl(
        name="Rain Vol",
        showlegend=True,
        mode="markers",
    ),
    hf_x=pt0.RAIN.data.index,
    hf_y=pt0.RAIN.data.loc[:, IDX[1,"1046",'Vol']],
    downsampler=EveryNthPoint(interleave_gaps=False),
    max_n_samples=int(pt0.RAIN.data.index.size / 4),
    limit_to_view=True,
)
fig.show_dash(mode="inline")
Fact: lttbc requires an int/float index as input, not a (datetime64) time index.
As a result, this code was written, in which the time-index series is converted into an int index representing the time in nanoseconds.
However, we observed that rounding errors occur because lttbc internally converts this int index into a float index, after which we try to derive an int index again, and that round trip introduces rounding errors.
As a result, this code adjustment was made, mitigating this rounding error in most cases.
Note that this is not 100% solved! Rounding errors can still occur. An ideal solution would be for lttbc to simply return the index positions of the selected data points.
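The rounding effect described above can be reproduced without lttbc. The sketch below is only an illustration, assuming a nanosecond epoch timestamp stored as an integer and a float64 round trip; the timestamp value and the helper name `roundtrip_error_ns` are made up for this example.

```python
def roundtrip_error_ns(t_ns: int) -> int:
    """Return the error (in ns) introduced by an int -> float64 -> int round trip."""
    # Python floats are IEEE-754 doubles with a 53-bit significand, so integers
    # above 2**53 cannot all be represented exactly.
    return int(float(t_ns)) - t_ns


# Hypothetical timestamp: roughly April 2022, in nanoseconds since the epoch.
t_ns = 1_650_000_000_123_456_789

error = roundtrip_error_ns(t_ns)
print(f"round-trip error: {error} ns")

# Between 2**60 and 2**61, neighbouring float64 values are 256 apart,
# so the round trip can be off by up to half of that spacing.
print(2**60 <= t_ns < 2**61, abs(error) <= 128)
```

Index positions stay far below 2**53 for any realistic series length, which is why returning positions instead of converted timestamps would sidestep the problem.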
Hi,
I tried to use the lib with Streamlit. The figure is displayed correctly, but the resampling does not seem to work. Is there a trick to make it work with Streamlit, or is this not supported yet? If not, is there any way to support it?
Thanks,
Ghiles
So I made this minimal example but I can not figure out why I can't get the callbacks to work.
"""
Minimal dynamic dash app example.
"""
import numpy as np
import plotly.graph_objects as go
import trace_updater
from dash import Dash, Input, Output, State, dcc, html
from plotly_resampler import FigureResampler

x = np.arange(1_000_000)
noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000

app = Dash(__name__)
fig = FigureResampler(go.Figure(go.Scatter(x=x, y=noisy_sin)))

app.layout = html.Div(
    [
        html.Div(
            children=[
                html.Button("Add Chart", id="add-chart", n_clicks=0),
            ]
        ),
        html.Div(id="container", children=[]),
    ]
)


@app.callback(
    Output("container", "children"),
    Input("add-chart", "n_clicks"),
    State("container", "children"),
)
def display_graphs(n_clicks: int, div_children: list[html.Div]) -> list[html.Div]:
    """
    This function is called when the button is clicked. It adds a new graph to the div.
    """
    figure = fig
    figure.register_update_graph_callback(
        app=app,
        graph_id=f"graph-id-{n_clicks}",
        trace_updater_id=f"trace-updater-id-{n_clicks}",
    )
    new_child = html.Div(
        children=[
            dcc.Graph(id=f"graph-id-{n_clicks}", figure=fig),
            trace_updater.TraceUpdater(
                id=f"trace-updater-id-{n_clicks}", gdID=f"graph-id-{n_clicks}"
            ),
        ],
    )
    div_children.append(new_child)
    return div_children


if __name__ == "__main__":
    app.run_server(debug=True)
Hi, very useful project; I have dreamed of such a thing my whole career. It seems that it can make plotly usable in real life, not only on the iris dataset.
Is there a way to dynamically update the resampled FigureWidget instance? For example, in JupyterLab:
The last cell causes an update of the data in the chart if fig is a FigureWidget instance, but does not update it if the instance is a FigureResampler(go.FigureWidget())
Test case:
import numpy as np
import plotly.graph_objects as go
from plotly_resampler import FigureResampler

x = np.arange(1_000_000)
noisy_sin = (3 + np.sin(x / 15000) + np.random.randn(len(x)) / 10) * x / 1_000
fig = FigureResampler(go.FigureWidget())
fig.add_scattergl(name='noisy sine', showlegend=True, x=x, y=noisy_sin)
fig.update_layout(autosize=True, height=300, template=None, legend=dict(x=0.1, y=1, orientation="h"),
                  margin=dict(l=45, r=15, b=20, t=30, pad=3))
fig.show()
# does not update chart if fig is FigureResampler instance
with fig.batch_update():
    fig.data[0].y = -fig.data[0].y
PS: It seems that resampling only works in dash, but not in jupyterlab?
I often have several independently updated series on my charts, so I found that the chart redraws them all when updating only one, causing heavy flickering. Ideally, it would be nice if updating the hf_data property caused only that series to be redrawn, without the need to explicitly call reload_data(), which cannot determine which series has changed (at most, you could add a parameter there that points to the changed series?)
Here is the demo and testcase:
dual.resampler.zip
Main points:
- register_plotly_resampler / unregister_plotly_resampler
- register_plotly_resampler + pd.options.plotting.backend = 'plotly'
- add_traces can be used
- add_trace and add_traces can be used with dict_input
To discuss: maybe also add the hack that, when the trace type is not specified with a dict input, we assume a scattergl will be used instead of the default scatter behavior.
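The idea under discussion could look roughly like the sketch below. It is not plotly-resampler code: `with_default_trace_type` is a hypothetical helper, and the only assumption taken from the text is that a dict trace without a "type" key would be treated as scattergl instead of plotly's default scatter.

```python
from typing import Any, Dict


def with_default_trace_type(
    trace: Dict[str, Any], default: str = "scattergl"
) -> Dict[str, Any]:
    """Return a copy of a dict trace with "type" filled in when it is missing."""
    normalized = dict(trace)  # shallow copy, the caller's dict is left untouched
    normalized.setdefault("type", default)
    return normalized


untyped = {"y": [1, 2, 3], "name": "some_trace"}
typed = {"type": "scatter", "y": [1, 2, 3]}

print(with_default_trace_type(untyped)["type"])  # scattergl
print(with_default_trace_type(typed)["type"])    # scatter
```

An explicitly given type is never overridden, so only untyped dict input would change behavior.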
I'm having some issues when rendering this figure with dashapp.
Firstly, I make a dashapp with the following controls:
controls = [
    dcc.Graph(
        id='uptime-graph',
        # ... some additional styling ...
    ),
    dcc.Graph(
        id='timeseries-graph',
        figure={
            'data': []
        }
    )
]
I'm using an uptime graph to select specific trace segments I want to look at. Then, I update 'timeseries-graph' with a callback upon selection within the uptime graph:
def update_timeseries(relayoutData):
    if new_coords is None or 'autosize' in new_coords.keys() or 'xaxis.autorange' \
            in new_coords.keys():
        return None
    start = new_coords['xaxis.range[0]']
    end = new_coords['xaxis.range[1]']
    dict_frame = self.model.get_timeseries(start,end)
    n_titles, plotting_dict = self._restructure_data(dict_frame)
    fig = FigureResampler(
        make_subplots(
            rows=len(plotting_dict.keys()),
            cols=1,
            row_titles=n_titles,
            vertical_spacing=0.001,
            shared_xaxes=True),
        default_n_shown_samples=5_000,
        verbose=False,
    )
    fig['layout'].update(height=1700)
    row_iterator = 1
    has_legend = {'ex':False,'ey':False,'hx':False,'hy':False,'hz':False}
    for station_key in plotting_dict.keys():
        for trace_data in plotting_dict[station_key]:
            color = self._get_trace_color(station_key)
            name, showlegend =self._legend_name_parser(has_legend,station_key)
            fig.add_trace(go.Scattergl(name=name, showlegend=showlegend,connectgaps=False,
                                       line={'color':color,'dash':'solid'}),
                          hf_x=trace_data.time, hf_y=trace_data['value'],row=row_iterator,col=1)
        row_iterator+=1
    print('updated timeseries figure')
    fig.show_dash(mode='inline')
    return fig

@dashapp.callback(
    Output('timeseries-graph', 'figure'),
    Input('uptime-graph', 'relayoutData'))
def uptime_data_select(relayoutData):
    fig = controller.update_timeseries_daterange(relayoutData)
    return fig
It kind of works, but then it begins to emit the same error every four seconds, preventing any further interaction with the web app:
Traceback (most recent call last):
File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 2077, in wsgi_app
response = self.full_dispatch_request()
File "/Users.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 1525, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 1523, in full_dispatch_request
rv = self.dispatch_request()
File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/flask/app.py", line 1509, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
File "/Users/.../miniconda3/envs/mtpytest/lib/python3.9/site-packages/dash/dash.py", line 1382, in dispatch
raise KeyError(msg.format(output)) from missing_callback_function
KeyError: "Callback function not found for output 'timeseries-graph.figure', perhaps you forgot to prepend the '@'?"
2022-05-11T15:10:10 [line 1455] local_app.log_exception - ERROR: Exception on /_dash-update-component [POST]
Traceback (most recent call last):
File "/Users/kevinmendoza/miniconda3/envs/mtpytest/lib/python3.9/site-packages/dash/dash.py", line 1344, in dispatch
cb = self.callback_map[output]
KeyError: 'timeseries-graph.figure'
I suppose it's possible I'm not using it correctly, but if I am, there appears to be an error with the resampling hooking back into the graph.
I.e., when we wrap a go.Figure that already has some high-dim traces registered, we do not yet extract these traces and process them as new traces.
This issue addresses this problem and suggests that these pre-existing traces should be handled as potential to-be-downsampled traces.
e.g.
import plotly.express as px
# now this just shows the non-aggregated RAW data, very slow
FigureResampler(px.line(very_large_df, x='time', y='sensor_name'))
Fair questions:
The following code works in plotly, but does not work when wrapping the constructor with FigureResampler.
import numpy as np
import plotly.graph_objects as go

trace = {
    "type": "scatter",
    # "x": [1, 2, 3],
    "y": np.arange(5_000),
    "name": "some_trace",  # is not even required
}
fig = go.Figure()  # wrapping this will fail
fig.add_trace(trace)
We can support dict representation in 2 places:
- AbstractFigureAggregator (-> not really sure if we want to do this)
- add_trace method -> our type hinting indicates that we support dict input, but this is clearly not the case
Hello.
I really like this software and I have been using it a lot.
Now I would like to use it in our company, where we run a local dash app. The app is used by me and my colleagues to share results with each other. But after reading the license, it seems that I cannot use this internally within our company?
There appear to be (very minor) changes in floating point numbers when using the lttbc LTTB implementation.
This causes the test test_fwr_time_based_data_s to fail sometimes.
A fix for this would be using our own lttb algorithm (also implemented in C) -> see #84
As for now, we will disable this test (as this only occurs in very niche use-cases) and leave this issue open until this problem is resolved (by merging #84)
Hello - I really like this tool! Just wondering if it currently supports plotly express figures (specifically px.scatter). If not, is this on the roadmap for the near future?
Thanks!
Sam
Is it possible to have the plotly on_click() listener work in a Jupyter notebook? If so, could you extend the sample notebook to demonstrate this?
In Google Colab, plotly FigureWidget & plotly-resampler FigureWidgetResampler:
Example with plotly FigureWidget:
Example with plotly-resampler FigureWidgetResampler:
Example for plotly FigureWidget & plotly-resampler FigureWidgetResampler
-> This relates to this issue: googlecolab/colabtools#2871 (however, for FigureWidgetResampler it does not work for integer datatypes, whereas for FigureWidget this does work)
Registering plotly-resampler in Google Colab will result in figures that are not displayed.
FYI: to use plotly FigureWidgets in Google Colab, you should first execute the following code (see __init__.py):
from google.colab import output
output.enable_custom_widget_manager()
I make a graph that initially has a small amount of data (or no data at all), and then data is added incrementally. I have found that if I don't set hf_y at all (or pass fewer than 1000 points), the hf_data property is not created (because there is no need to resample, I guess). Is there a way to create it later, or to create it even for a small amount of data?
Testcase:
resampler.debug.zip
So that the standard plotly Figure includes all the points. My global objective is to save a full-points figure to HTML, and if I just do write_html on the FigureResampler figure, I only get a reduced-resolution version. It's OK if there is no way, but any thoughts would be appreciated.
I have tried
# given that `fig` is a resampled `FigureResampler` figure
from plotly_resampler import FigureResampler
fig2 = FigureResampler(fig, default_n_shown_samples = 100000)
fig2.write_html('overview.html')
# and
import plotly.graph_objs as go
fig3 = go.Figure(fig)
fig3.write_html('overview.html')
but it still saves a rescaled version.
Thanks! :)
I wonder if plotly-resampler works primarily with timelines or can be applied to large two-dimensional heatmaps or histograms.
Add more docs (+ examples) about:
I am using plotly-resampler, which installs correctly on my local Windows machine and on another Linux machine I have access to (by just using pip install plotly-resampler). However, we also run Gitlab-CI tests in a controlled environment, and installing with pip kept failing in that environment. The exact error was
Building wheel for lttbc (setup.py): started
Building wheel for lttbc (setup.py): finished with status 'error'
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5/setup.py'"'"'; __file__='"'"'/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-6yato32h
cwd: /tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5/
Complete output (16 lines):
running bdist_wheel
running build
running build_ext
building 'lttbc' extension
creating build
creating build/temp.linux-x86_64-3.9
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5 -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-jgc_t9dg/lttbc_4949a59daf574371b0f97218e19bdac5 -I/usr/local/include/python3.9 -c lttbc.c -o build/temp.linux-x86_64-3.9/lttbc.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/syslimits.h:7,
from /usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:34,
from /usr/local/include/python3.9/Python.h:11,
from lttbc.c:2:
/usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:195:15: fatal error: limits.h: No such file or directory
195 | #include_next <limits.h> /* recurse down to the real one */
| ^~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1
----------------------------------------
ERROR: Failed building wheel for lttbc
Running setup.py clean for lttbc
Running setup.py install for lttbc: started
Running setup.py install for lttbc: finished with status 'error'
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"'; __file__='"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-uwdc83fp/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.9/lttbc
cwd: /tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/
Complete output (16 lines):
running install
running build
running build_ext
building 'lttbc' extension
creating build
creating build/temp.linux-x86_64-3.9
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851 -I/usr/local/lib/python3.9/site-packages/numpy/core/include -I/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851 -I/usr/local/include/python3.9 -c lttbc.c -o build/temp.linux-x86_64-3.9/lttbc.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/syslimits.h:7,
from /usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:34,
from /usr/local/include/python3.9/Python.h:11,
from lttbc.c:2:
/usr/lib/gcc/x86_64-linux-gnu/10/include/limits.h:195:15: fatal error: limits.h: No such file or directory
195 | #include_next <limits.h> /* recurse down to the real one */
| ^~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"'; __file__='"'"'/tmp/pip-install-lgmdp697/lttbc_3817a2fad4f7468ea159ce68739ae851/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-uwdc83fp/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.9/lttbc Check the logs for full command output.
I am posting it here in case it helps other folks who might encounter the same problem.
I have played around with the test environment and was able to install all packages by executing
apt update && apt install -yqq --no-install-recommends gcc musl-dev linux-headers-amd64 libc-dev
before the pip command. This allowed me to install the apparently missing linux header, lttbc, and plotly-resampler. However, for some reason this resulted in an incompatibility with numpy:
------------------------------- Captured stderr --------------------------------
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
___________________ ERROR collecting pv/test_pv_generator.py ___________________
ImportError while importing test module '/builds/--YGsyLe/2/optim/statistical_analysis/tests/pv/test_pv_generator.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/test.py:12: in <module>
from plotly_resampler import FigureResampler
/usr/local/lib/python3.9/site-packages/plotly_resampler/__init__.py:8: in <module>
from .figure_resampler import FigureResampler
/usr/local/lib/python3.9/site-packages/plotly_resampler/figure_resampler.py:28: in <module>
from .downsamplers import AbstractSeriesDownsampler, LTTB
/usr/local/lib/python3.9/site-packages/plotly_resampler/downsamplers/__init__.py:5: in <module>
from .downsamplers import LTTB, EveryNthPoint, AggregationDownsampler
/usr/local/lib/python3.9/site-packages/plotly_resampler/downsamplers/downsamplers.py:8: in <module>
import lttbc
E ImportError: numpy.core.multiarray failed to import
So I gave up.
As I said, I am publishing this info here in case someone stumbles on a similar issue, so feel free to close. However, I saw that lttbc is a top-level dependency of plotly-resampler, is still in early stages (version <1), and has not been updated since 2020. So there is little chance its Python wheels will be changed anytime soon. So I wonder whether, on the plotly-resampler side, we could add a try-except around the lttbc import and fall back onto another resampler if lttbc is unavailable for import? Or, perhaps, if you have any idea of how to install the lttbc dependency without gcc compiling, it would be much appreciated!
I understand this is not directly related to plotly-resampler. I have thought about posting in lttbc instead, but the repo does not seem to be actively maintained. Thanks again for the resampler. Great idea!
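The fallback suggested above might be sketched as follows. This is an illustration under stated assumptions, not how plotly-resampler handles the import: `HAS_LTTBC`, `every_nth_point`, and `pick_downsampler_name` are hypothetical names, and the pure-Python fallback simply keeps every n-th point rather than running LTTB.

```python
import math
from typing import List, Sequence, Tuple

try:
    import lttbc  # C extension; may be missing when no wheel or compiler is available
    HAS_LTTBC = True
except ImportError:
    lttbc = None
    HAS_LTTBC = False


def every_nth_point(
    x: Sequence[float], y: Sequence[float], n_out: int
) -> Tuple[List[float], List[float]]:
    """Pure-Python fallback: keep at most n_out evenly strided points."""
    if n_out <= 0:
        raise ValueError("n_out must be positive")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) <= n_out:
        return list(x), list(y)
    step = math.ceil(len(x) / n_out)
    return list(x[::step]), list(y[::step])


def pick_downsampler_name() -> str:
    """Report which downsampler would be used in this environment."""
    return "lttb" if HAS_LTTBC else "every_nth_point"


x = list(range(10_000))
y = [math.sin(v / 200) for v in x]
xs, ys = every_nth_point(x, y, n_out=1_000)
print(pick_downsampler_name(), len(xs))
```

The trade-off is visual quality: striding can miss peaks that LTTB would keep, so a fallback like this would mainly serve to keep imports working in environments where the extension cannot be built.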
With pandas 1.4 or later, I am getting deprecation warnings from figure_resampler.py:
pandas.UInt64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
Why are points missing?
From watching the GIF linked from your research paper, I thought that it would resample again when I zoom in.
import numpy as np
from plotly_resampler import FigureResampler
x = np.arange(10_000)
noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000
I tried adding the bottom subplot, and the behavior of reload_data and reset_axes on it is different from the top graph (even by manual cell run): these calls have no effect on the second graph. Are they handled differently?
The test script: Untitled.zip
Hello !
This is a really neat project! Great job!
I have experienced a weird behavior. I work in a jupyter notebook inside vscode. The data has a shape of (6261089,).
This snippet works perfectly fine.
fig = FigureResampler(go.Figure())
fig.add_trace(go.Scattergl(name='Trace', showlegend=True), hf_x=x.astype(np.int64), hf_y=raw_data)
fig.show_dash(mode='inline')
But when I changed the parameter to hf_x=x.astype(np.float64)
fig = FigureResampler(go.Figure())
fig.add_trace(go.Scattergl(name='Trace', showlegend=True), hf_x=x.astype(np.float64), hf_y=raw_data)
fig.show_dash(mode='inline')
I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/data_inspection.ipynb Cell 9' in <cell line: 13>()
10 print(raw_data.shape, times.shape)
12 fig = FigureResampler(go.Figure())
---> 13 fig.add_trace(go.Scattergl(name='Trace', showlegend=True), hf_x=ranged_arr.astype(np.float64), hf_y=raw_data)
14 fig.show_dash(mode='inline')
File ~/.venv/lib/python3.9/site-packages/plotly_resampler/figure_resampler/figure_resampler_interface.py:743, in AbstractFigureAggregator.add_trace(self, trace, max_n_samples, downsampler, limit_to_view, hf_x, hf_y, hf_hovertext, **trace_kwargs)
732 trace = {
733 k: trace[k]
734 for k in set(trace.keys()).difference(
735 {"text", "hovertext", "x", "y"}
736 )
737 }
739 # NOTE:
740 # If all the raw data needs to be sent to the javascript, and the trace
741 # is high-frequency, this would take significant time!
742 # Hence, you first downsample the trace.
--> 743 trace = self._check_update_trace_data(trace)
744 assert trace is not None
745 super(self._figure_class, self).add_trace(trace=trace, **trace_kwargs)
File ~/.venv/lib/python3.9/site-packages/plotly_resampler/figure_resampler/figure_resampler_interface.py:240, in AbstractFigureAggregator._check_update_trace_data(self, trace, start, end)
238 # Downsample the data and store it in the trace-fields
239 downsampler: AbstractSeriesAggregator = hf_trace_data["downsampler"]
--> 240 s_res: pd.Series = downsampler.aggregate(
241 hf_series, hf_trace_data["max_n_samples"]
242 )
243 trace["x"] = s_res.index
244 trace["y"] = s_res.values
File ~/.venv/lib/python3.9/site-packages/plotly_resampler/aggregation/aggregation_interface.py:142, in AbstractSeriesAggregator.aggregate(self, s, n_out)
138 s = s.astype("uint8")
140 if len(s) > n_out:
141 # More samples that n_out -> perform data aggregation
--> 142 s = self._aggregate(s, n_out=n_out)
144 # When data aggregation is performed -> we do not "insert" gaps but replace
145 # The end of gap periods (i.e. the first non-gap sample) with None to
146 # induce such gaps
147 if self.interleave_gaps:
File ~/.venv/lib/python3.9/site-packages/plotly_resampler/aggregation/aggregators.py:240, in EfficientLTTB._aggregate(self, s, n_out)
238 def _aggregate(self, s: pd.Series, n_out: int) -> pd.Series:
239 if s.shape[0] > n_out * 1_000:
--> 240 s = self.minmax._aggregate(s, n_out * 50)
241 return self.lttb._aggregate(s, n_out)
File ~/.venv/lib/python3.9/site-packages/plotly_resampler/aggregation/aggregators.py:134, in MinMaxOverlapAggregator._aggregate(self, s, n_out)
127 offset = np.arange(
128 0, stop=s.shape[0] - block_size - argmax_offset, step=block_size
129 )
131 # Calculate the argmin & argmax on the reshaped view of `s` &
132 # add the corresponding offset
133 argmin = (
--> 134 s[: block_size * offset.shape[0]]
135 .values.reshape(-1, block_size)
136 .argmin(axis=1)
137 + offset
138 )
139 argmax = (
140 s[argmax_offset : block_size * offset.shape[0] + argmax_offset]
141 .values.reshape(-1, block_size)
(...)
144 + argmax_offset
145 )
146 # Sort the argmin & argmax (where we append the first and last index item)
147 # and then slice the original series on these indexes.
ValueError: cannot reshape array of size 6260945 into shape (251)
Thank you !
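The sizes in the traceback above can be checked by hand. The sketch below only does arithmetic on the numbers from the error message; the reading that the slice returned one element more than intended (for instance because a slice on a float index was resolved by label, endpoint included, rather than by position) is a guess that fits these numbers, not a confirmed diagnosis.

```python
size_in_error = 6_260_945  # from "cannot reshape array of size 6260945"
block_size = 251           # from "into shape (251)"

# reshape(-1, block_size) needs the size to be a multiple of block_size.
print(size_in_error % block_size)        # remainder of the failing size
print((size_in_error - 1) % block_size)  # remainder with one element fewer

# Plain Python lists illustrate the off-by-one between an end-exclusive
# positional slice and an end-inclusive selection over the same bound.
values = list(range(10))
bound = 6
exclusive = values[:bound]
inclusive = [v for v in values if v <= bound]
print(len(exclusive), len(inclusive))
```

The failing size is exactly one more than a multiple of the block size, which is what an end-inclusive slice would produce.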
minimal snippet:
import numpy as np
import pandas as pd
nb_samples = 10_000_000
x = np.arange(nb_samples).astype(np.uint32)
y = np.sin(x/300).astype(np.float32) + np.random.randn(nb_samples) / 5
s = pd.Series(index=x, data=y)
from context_aware.visualizations.plotly.downsamplers import LTTB
out = LTTB().downsample(s, n_out=3000)
out
# 0 0.263548
# 1374 -1.791638
# 4183 1.557843
# 7118 -1.570702
# 10017 1.331199
# ...
# 9990631 1.491713
# 9993480 -1.437206
# 9999224 -1.541181
# 9999224 NaN
# 9999999 0.778644
out.isna().sum()
# 149
Should be roughly the same as we did with the tsflex packages @jvdd.
Normally, plotly allows one not to specify (the only) column of a series one is plotting. So the following code works fine:
fig = go.Figure()
ser = pd.Series(index = np.arange(100), data = {'a': np.arange(100)})
fig.add_trace(
    go.Scattergl(
        x=ser.index,
        y=ser,
    )
)
fig.show()
However, it fails when combining it with the FigureResampler, with the following error:
C:\tools\miniconda3\envs\stat-analysis-3.7\lib\site-packages\plotly_resampler\figure_resampler.py in add_trace(self, trace, max_n_samples, downsampler, limit_to_view, hf_x, hf_y, hf_hovertext, **trace_kwargs)
531 if pd.isna(hf_y).any():
532 not_nan_mask = ~pd.isna(hf_y)
--> 533 hf_x = hf_x[not_nan_mask]
534 hf_y = hf_y[not_nan_mask]
535 if isinstance(hf_hovertext, np.ndarray):
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
The problem is solved by specifying the column of the series to plot, i.e. y=ser['a']. It took me several minutes to find it, though, and I guess plotly-resampler was designed to be transparent.
Perhaps an isinstance check against pd.Series could be added to treat this specific case? Or could there be another way of checking isna that works for such series as well?
In any case, thanks for the great software!
Hello and thank you for this awesome project.
It works fine for normal arrays, but it doesn't seem to do anything for candlestick charts.
Is this normal?
Thx in advance!
Piotr
Only resample the traces that are shown, and do not resample the not shown (i.e., toggled off) traces.
This is not straightforward, as toggling a trace in the legend does not result in a callback event (i.e., pure front-end event).
To fix this, we should pass the visibility state of the front-end via the trace-updater. Needs some more thought..
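Once the visibility state is available on the back end, the selection step itself is small. The sketch below assumes the front end would report, per trace, plotly's `visible` attribute (True, False, or "legendonly"); how that state would be transported through the trace-updater is left open, and `traces_to_resample` is a hypothetical helper.

```python
from typing import Any, Dict, List


def traces_to_resample(traces: List[Dict[str, Any]]) -> List[int]:
    """Return the indices of traces that are currently drawn.

    A missing "visible" key is treated as visible, which matches plotly's default.
    """
    return [
        idx
        for idx, trace in enumerate(traces)
        if trace.get("visible", True) is True
    ]


# Hypothetical front-end state: the second trace was toggled off in the legend.
state = [
    {"name": "sensor_a"},
    {"name": "sensor_b", "visible": "legendonly"},
    {"name": "sensor_c", "visible": True},
    {"name": "sensor_d", "visible": False},
]
print(traces_to_resample(state))  # [0, 2]
```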
TODO:
Still need to investigate where it goes awry.
Currently, when you call the show method on a FigureResampler object, it returns a downsampled version of the figure that does not resample, which looks like a "broken" figure when zooming in. This behavior is inconsistent with standard Plotly figures and could create confusion.
I see two possible solutions to solve this problem:
show is called. Although this could be annoying if you are working with a lot of points and inadvertently call show and crash your kernel/browser as a result.
This issue is a request to the community to submit what their vision of plotly-resampler is: which features are still worth implementing?
Some features which I find worth pursuing:
Don't know if this is an enhancement or a bug. But it seems that calling FigureResampler removes the text labels from the plot, at least if the number of points exceeds default_n_shown_samples. Here is a simple example illustrating the issue in a dash application.
To see the labels you can uncomment the fig without the FigureResampler function.
I guess the wanted behavior, if lttb removes the point with the label, is to not show that label. That is also probably the easiest to implement.
import numpy as np
import plotly.graph_objects as go
import trace_updater
from dash import Dash, dcc, html
from plotly_resampler import FigureResampler

app = Dash(__name__)

# Create labels for the plot, but only on every 500th point
labels = [""] * 1_000_0
labels[500::501] = ["label" for x in labels[500::501]]

# Create the x and y values for the plot
x = np.arange(1_000_0)
noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000

fig = FigureResampler(go.Figure())
fig.add_trace(
    go.Scatter(
        x=x,
        y=noisy_sin,
        mode="lines+text",
        text=labels,
        textposition="top center",
    ),
)
fig.register_update_graph_callback(
    app=app, graph_id="graph-id", trace_updater_id="trace-updater"
)

# fig = go.Figure()
# fig.add_trace(
#     go.Scatter(
#         x=x,
#         y=noisy_sin,
#         mode="lines+text",
#         text=labels,
#         textposition="top center",
#     )
# )

app.layout = html.Div(
    [
        dcc.Graph(id="graph-id", figure=fig, mathjax=True),
        trace_updater.TraceUpdater(
            id="trace-updater",
            gdID="graph-id",
            sequentialUpdate=False,
        ),
    ],
)

if __name__ == "__main__":
    app.run_server(debug=True)
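The behavior proposed in this issue (drop a label when its point is dropped) amounts to indexing the labels with the same positions the downsampler keeps. The sketch below is a minimal illustration under that assumption; `downsample_with_text` is a hypothetical helper, and the kept positions are chosen by simple striding rather than by LTTB.

```python
from typing import List, Sequence, Tuple


def downsample_with_text(
    y: Sequence[float], text: Sequence[str], keep: Sequence[int]
) -> Tuple[List[float], List[str]]:
    """Select the same positions from the values and from their text labels."""
    if len(y) != len(text):
        raise ValueError("y and text must have the same length")
    return [y[i] for i in keep], [text[i] for i in keep]


y = [float(i) for i in range(20)]
text = [""] * 20
text[5] = "label"  # sits on a kept position below
text[6] = "label"  # sits on a dropped position below

keep = list(range(0, 20, 5))  # stand-in for the indices a downsampler selects
y_out, text_out = downsample_with_text(y, text, keep)
print(y_out)     # [0.0, 5.0, 10.0, 15.0]
print(text_out)  # ['', 'label', '', '']
```

Keeping values and labels aligned this way means a label can only disappear together with its point, never attach itself to a different one.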