Comments (21)
As far as I see plotly template is not JSON but JavaScript code, which is problematic from security perspective.
Not sure if I understand. At least on the DVC side, plotly would behave "exactly" as vega.
For an example linear plot, we would have a JSON template with placeholders:
vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"width": 300,
"height": 300,
"layer": [
{
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
},
"layer": [
{
"mark": "line"
},
{
"selection": {
"label": {
"type": "single",
"nearest": true,
"on": "mouseover",
"encodings": [
"x"
],
"empty": "none",
"clear": "mouseout"
}
},
"mark": "point",
"encoding": {
"opacity": {
"condition": {
"selection": "label",
"value": 1
},
"value": 0
}
}
}
]
},
{
"transform": [
{
"filter": {
"selection": "label"
}
}
],
"layer": [
{
"mark": {
"type": "rule",
"color": "gray"
},
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative"
}
}
},
{
"encoding": {
"text": {
"type": "quantitative",
"field": "<DVC_METRIC_Y>"
},
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative"
}
},
"layer": [
{
"mark": {
"type": "text",
"align": "left",
"dx": 5,
"dy": -5
},
"encoding": {
"color": {
"type": "nominal",
"field": "rev"
}
}
}
]
}
]
}
]
}
plotly
{
"legendgroup": "<DVC_METRIC_COLOR>",
"line": {
"dash": "solid"
},
"marker": {
"symbol": "circle"
},
"mode": "lines",
"name": "<DVC_METRIC_COLOR>",
"orientation": "v",
"showlegend": true,
"type": "scatter",
"xaxis": "<DVC_METRIC_X_LABEL>",
"yaxis": "<DVC_METRIC_Y_LABEL>",
"x": "<DVC_METRIC_X>",
"y": "<DVC_METRIC_Y>"
}
That would get filled with datapoints collected by DVC plots and embedded in an HTML div:
vega
<div id = "{id}">
<script type = "text/javascript">
var spec = {partial};
vegaEmbed('#{id}', spec);
</script>
</div>
plotly
<div id = "{id}">
<script type = "text/javascript">
var plotly_data = {partial};
Plotly.newPlot("{id}", plotly_data.data, plotly_data.layout);
</script>
</div>
The {partial} placeholder in the HTML div is filled, in both cases, with a plain JSON object, which is also what's currently returned when using the --show-vega option.
So, --show-plotly (or --show-json) would return a similar plain JSON object.
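To make the flow above concrete, here is a minimal sketch of filling a placeholder template and embedding the result in the plotly div. It is not dvc-render's implementation: `fillTemplate`, `plotlyDiv`, and the sample values are hypothetical, and it assumes the x/y placeholders are filled with arrays of values. Escaping `<` in the serialized JSON keeps a string value from closing the script tag early.

```javascript
// Hypothetical helper: replace "<DVC_METRIC_*>" string values in a parsed
// JSON template. The placeholder names come from the templates above; the
// function name and the `values` map are assumptions for illustration.
function fillTemplate(node, values) {
  if (typeof node === 'string') {
    return Object.prototype.hasOwnProperty.call(values, node)
      ? values[node]
      : node
  }
  if (Array.isArray(node)) {
    return node.map(item => fillTemplate(item, values))
  }
  if (node !== null && typeof node === 'object') {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [
        key,
        fillTemplate(value, values)
      ])
    )
  }
  return node
}

// Hypothetical helper: build the plotly HTML snippet. The {id} and {partial}
// placeholders from the thread become the `id` and `partial` arguments.
function plotlyDiv(id, partial) {
  // "<" only occurs inside JSON strings, so escaping it is safe.
  const json = JSON.stringify(partial).replace(/</g, '\\u003c')
  return [
    `<div id="${id}">`,
    '<script type="text/javascript">',
    `var plotly_data = ${json};`,
    `Plotly.newPlot("${id}", plotly_data.data, plotly_data.layout);`,
    '</script>',
    '</div>'
  ].join('\n')
}

const template = {
  type: 'scatter',
  mode: 'lines',
  name: '<DVC_METRIC_COLOR>',
  x: '<DVC_METRIC_X>',
  y: '<DVC_METRIC_Y>'
}

// Sample values are made up for the example.
const trace = fillTemplate(template, {
  '<DVC_METRIC_COLOR>': 'workspace',
  '<DVC_METRIC_X>': [0, 1, 2],
  '<DVC_METRIC_Y>': [0.27, 0.41, 0.5]
})

const html = plotlyDiv('plot-1', { data: [trace], layout: {} })
```

The object passed to `plotlyDiv` is the same plain JSON that a --show-json style option could print.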
from dvc-render.
If we can start trying to build towards it being a drop-in replacement for vega-lite, I think it would be nice.
from dvc-render.
Haven't really thought about the cross-product implications.
from dvc-render.
Looks like a very cool library, and certainly much easier to use than Vega! The defaults don't quite match our design, but it seems there are enough ways to hook into hover and click events to make up for that, and the ability to set colors on lines is much friendlier than the equivalent scale API in Vega.
I'd say if the goal is to encourage users to develop tools based on these plots, a more high-level alternative backend like plotly here could do the job. Worth noting it could add some complexity in features like --show-vega, but that can likely be handled without too much issue.
Maybe that could be handled by changing --show-vega to --show-json with a new --backend or --plots-backend flag that's vega by default and can be set to plotly? The names would make sense if we added images to the --show-vega output from iterative/dvc#6752.
from dvc-render.
I assume that mixing schemas in that output could be a little problematic for integrations 🤔 (Is that assumption correct?)
I think that would be the case, but it could be handled if there was some way to distinguish the plots with different schemas from each other.
from dvc-render.
Another reason plotly would be useful: it has built-in support for more ML/DS/analytical visualizations, like smoothing (see iterative/vscode-dvc#3837).
from dvc-render.
I could start moving this forward in the CLI first and then try to get something working on Studio by myself (probably with some help) after.
from dvc-render.
The main reasons to migrate to plotly would be:
- It's a more popular library, especially with the ML community.
- It opens options for users to eventually develop custom plots using the familiar plotly python api and visualize them with dvc.
A distant 3rd reason is UI improvements over vega lite, but I think we can already see that there will likely be as many drawbacks as advantages to the plotly UI. I think the first 2 points are strong enough that it's worth moving, but I don't think we have time to work towards the 2nd point now, and we have already put a ton of time into plots, so I would consider plotly a "nice to have" rather than an urgent priority.
from dvc-render.
@shcheklein @tapadipti @Suor @rogermparent @mattseddon Interested to hear your thoughts on this.
from dvc-render.
I'd say if the goal is to encourage users to develop tools based on these plots, a more high-level alternative backend like plotly here could do the job. Worth noting it could add some complexity in features like --show-vega, but that can likely be handled without too much issue. Maybe that could be handled by changing --show-vega to --show-json with a new --backend or --plots-backend flag that's vega by default and can be set to plotly? The names would make sense if we added images to the --show-vega output from iterative/dvc#6752.
Side note: I agree that --show-json would be a better name, even without plotly.
This is a very good point to discuss. My original idea was for the backend to be a property of each individual plot, or even inferred from the template, so that users could mix plots with different backends in a single dvc.yaml:
stages:
  train:
    cmd: python train.py
    plots:
      - prc.json:
          cache: false
          x: recall
          y: precision
          template: linear_plotly.json  # Inferred
      - roc.json:
          cache: false
          x: fpr
          y: tpr
          backend: plotly  # Explicit
The main motivation (besides giving users flexibility) was that plotly (or another backend) would probably introduce some type of plot that is not easily supported in vega, and we don't want to commit ourselves to maintaining feature parity across plot backends.
However, when considering --show-json, I assume that mixing schemas in that output could be a little problematic for integrations 🤔 (Is that assumption correct, @rogermparent?)
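As a rough illustration of the per-plot backend idea, the sketch below resolves a backend for each plot entry and groups the plots so that an integration could keep the schemas apart. Everything here is hypothetical: `resolveBackend`, `groupByBackend`, the rule that a template name containing "plotly" implies the plotly backend, and the `loss.json` entry are assumptions, not existing DVC behaviour.

```javascript
// Hypothetical rule: an explicit `backend` wins; otherwise infer plotly from
// the template name; otherwise fall back to vega.
function resolveBackend(plotConfig) {
  if (plotConfig.backend) {
    return plotConfig.backend
  }
  const template = plotConfig.template || ''
  return /plotly/i.test(template) ? 'plotly' : 'vega'
}

// Group plot paths by resolved backend so mixed schemas stay distinguishable.
function groupByBackend(plots) {
  const groups = {}
  for (const [path, config] of Object.entries(plots)) {
    const backend = resolveBackend(config)
    if (!groups[backend]) {
      groups[backend] = []
    }
    groups[backend].push(path)
  }
  return groups
}

// The first two entries mirror the dvc.yaml example above; loss.json is a
// made-up entry with neither key set.
const plots = {
  'prc.json': { x: 'recall', y: 'precision', template: 'linear_plotly.json' },
  'roc.json': { x: 'fpr', y: 'tpr', backend: 'plotly' },
  'loss.json': { x: 'step', y: 'loss' }
}

const groups = groupByBackend(plots)
```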
from dvc-render.
Supporting plotly was one of the ideas that came up during Studio ideas brainstorming sessions a while back, and it is one of the items we have on the roadmap for next year. I'm not sure how much work it would be in Studio to support this (maybe @Suor would have some idea), but eventually it needs to be supported (at least as per the current roadmap / plan).
from dvc-render.
As far as I see plotly template is not JSON but JavaScript code, which is problematic from security perspective.
from dvc-render.
I've started looking at this from the VS Code perspective.
I can see in #88 (not sure if that PR is active or not) that the current idea is for DVC to hold the required data in the same format for both Vega & Plotly. Might it make sense to change that approach given that dvc-render/vscode-dvc/studio would all have to contain the same(-ish) logic to convert the datapoints from that format into what is required by Plotly? If the idea is to keep the data decoupled from the render, is there a better/more general format to hold it in?
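For a sense of the conversion logic in question, here is a minimal sketch that turns flat datapoints into one Plotly trace per revision. It assumes each datapoint is an object carrying a `rev` key next to its x and y fields; `datapointsToTraces` and the sample data are hypothetical.

```javascript
// Hypothetical conversion: group flat datapoints by revision and collect the
// x/y values into one scatter trace per revision.
function datapointsToTraces(datapoints, xField, yField) {
  const byRev = new Map()
  for (const point of datapoints) {
    if (!byRev.has(point.rev)) {
      byRev.set(point.rev, {
        type: 'scatter',
        mode: 'lines',
        name: point.rev,
        x: [],
        y: []
      })
    }
    const trace = byRev.get(point.rev)
    trace.x.push(point[xField])
    trace.y.push(point[yField])
  }
  // Map preserves insertion order, so traces follow first appearance.
  return [...byRev.values()]
}

// Made-up datapoints for two revisions.
const datapoints = [
  { rev: 'workspace', step: 0, acc: 0.27 },
  { rev: 'workspace', step: 1, acc: 0.41 },
  { rev: 'main', step: 0, acc: 0.25 },
  { rev: 'main', step: 1, acc: 0.38 }
]

const traces = datapointsToTraces(datapoints, 'step', 'acc')
```

This is the kind of logic that would otherwise have to be repeated in each product.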
For the extension it would be good to get the contents of data / layout provided separately under the --split option through plots diff. My suggestion for the json output from plots diff for Plotly plots would be:
{
  "data": {
    "dvc.yaml::name": [
      {
        "type": "plotly",
        "revisions": ["workspace"],
        "layout": {LAYOUT},
        "data": {DATA}
      }
    ]
  }
}
That way we'll be able to update the marker.color (or equivalent) for each experiment where appropriate (e.g. linear/scatter plots). We can also continue to hold the equivalent of templates/data separately internally.
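A minimal sketch of that colour update, assuming the `data` part of an entry is an array of traces whose `name` matches a revision. `applyRevisionColors` and the colour map are hypothetical, and setting both `marker.color` and `line.color` is one reading of "or equivalent".

```javascript
// Hypothetical post-processing step: return a copy of the plot entry with
// per-revision colours applied to every matching trace.
function applyRevisionColors(plot, colors) {
  return {
    ...plot,
    data: plot.data.map(trace => {
      const color = colors[trace.name]
      if (color === undefined) {
        return trace // no colour assigned to this revision
      }
      return {
        ...trace,
        marker: { ...trace.marker, color },
        line: { ...trace.line, color }
      }
    })
  }
}

// Made-up entry in the shape suggested above.
const entry = {
  type: 'plotly',
  revisions: ['workspace'],
  layout: {},
  data: [
    { type: 'scatter', mode: 'lines', name: 'workspace', x: [0, 1], y: [0.27, 0.41] }
  ]
}

const colored = applyRevisionColors(entry, { workspace: '#945dd6' })
```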
LMK what you think. If we can agree on the approach I have the capacity to make contributions here and in DVC to get this moving.
from dvc-render.
Might it make sense to change that approach given that dvc-render/vscode-dvc/studio would all have to contain the same(-ish) logic to convert the datapoints from that format into what is required by Plotly?
I can see this would be a more involved change because Studio reaches directly into DVC and calls repo.plots.collect() here:
from dvc-render.
Might it make sense to change that approach given that dvc-render/vscode-dvc/studio would all have to contain the same(-ish) logic to convert the datapoints from that format into what is required by Plotly?
I can see this would be a more involved change because Studio reaches directly into DVC and calls repo.plots.collect() here:
The draft PR's motivation was to keep the "status quo" of the Vega implementation, introducing Plotly in a way that is transparent for DVC.
Doubts about the right approach (and the capacity of the teams to work on it) for the UIs (VSCode, Studio) were the reason for not continuing the work on it.
For the extension it would be good to get the contents of data / layout provided separately under the --split option through plots diff. My suggestion for the json output from plots diff for Plotly plots would be:
For the --split option, we can do whatever postprocessing we want for the extension. We are already "breaking" the internal format used today for Vega.
LMK what you think. If we can agree on the approach I have the capacity to make contributions here and in DVC to get this moving.
I will do a minor update to the dvc-render PR, as it is currently missing the layout part.
Then we can discuss how to handle the --json and --split in DVC. It would be great if we could think of a way to unify how VSCode and Studio (FE) do the postprocessing of the plots, so we can update Studio (BE and FE).
We also need to decide how/when to enable Plotly. Options off the top of my head:
A) Have a feature flag in DVC like dvc config plots.plotly True
B) Do a silent drop-in replacement for default plots. Just start rendering using plotly in case there is no custom template.
C) Allow users to explicitly enable plotly for some plots (i.e. introduce and handle special template names like plotly_linear)
from dvc-render.
I am going to catch up with @daavoo today about this (thanks for sending an invite David).
The plan for me right now is to build a thin vertical slice along the lines of option B above. Ideally in the next two sprints, I'd like to be able to replace the smooth/linear/scatter templates with Plotly implementations (feels ambitious).
Findings so far:
I have been playing around with Plotly, and the biggest difference with respect to Vega seems to be that the data and template are much less separate and get mangled together in order to create the desired output. As the "smooth" template seems to be the hardest of the three to generate, I've been working on that.
I've managed to adapt the below examples to generate a demo of what is possible in terms of "smoothing" (not worrying about style yet):
https://plotly.com/javascript/sliders/#add-a-play-button-to-control-a-slider
https://plotly.com/javascript/gapminder-example/#animating-with-a-slider
Screen.Recording.2023-09-06.at.12.30.11.pm.mov
Code for the demo
function smoothTriangle(data, degree) {
const triangle = [
...Array(degree + 1).keys(),
...[...Array(degree).keys()].reverse()
] // up then down
const smoothed = []
for (let i = degree; i < data.length - degree * 2; i++) {
const point = data
.slice(i, i + triangle.length)
.map((x, j) => x * triangle[j])
smoothed.push(
point.reduce((a, b) => a + b) / triangle.reduce((a, b) => a + b)
)
}
// Handle boundaries
const halfDegree = Math.floor(degree / 2)
const leftBoundary = Array(halfDegree + 1).fill(smoothed[0])
const rightBoundary = Array(data.length - smoothed.length).fill(
smoothed[smoothed.length - 1]
)
return [...leftBoundary, ...smoothed, ...rightBoundary]
}
const y = [
0.2707333333333333, 0.40696666666666664, 0.4991833333333333,
0.6582666666666667, 0.5437333333333333, 0.6674, 0.6644, 0.6833166666666667,
0.7272, 0.68985, 0.7435333333333334, 0.6868166666666666, 0.76165,
0.7097833333333333, 0.7694, 0.7323666666666667, 0.7824166666666666,
0.7494666666666666, 0.7894, 0.7608166666666667, 0.7819833333333334,
0.7650833333333333, 0.7718833333333334, 0.7713, 0.7773166666666667, 0.77915,
0.7855166666666666, 0.7837166666666666, 0.7916333333333333,
0.7893666666666667, 0.7960833333333334, 0.7940166666666667,
0.7951333333333334, 0.7986333333333333, 0.7998666666666666, 0.80405,
0.8076333333333333, 0.8097, 0.8160333333333334, 0.81405, 0.82245,
0.8198166666666666, 0.8292666666666667, 0.8251166666666667,
0.8348666666666666, 0.8303333333333334, 0.8397333333333333,
0.8357833333333333, 0.8434, 0.8401166666666666, 0.8468333333333333,
0.8441833333333333, 0.8502, 0.8476833333333333, 0.8530833333333333, 0.8513,
0.8561666666666666, 0.8553166666666666, 0.85905, 0.8595666666666667,
0.8616666666666667, 0.8631, 0.8645833333333334, 0.8659666666666667,
0.8678833333333333, 0.86965, 0.87255, 0.8734, 0.8752666666666666,
0.8785333333333334, 0.8778166666666667, 0.8829833333333333,
0.8794166666666666, 0.8860833333333333, 0.8793666666666666,
0.8881166666666667, 0.8799666666666667, 0.8906, 0.8814666666666666,
0.8921166666666667, 0.8832333333333333, 0.8939333333333334,
0.8849333333333333, 0.8918666666666667, 0.8869, 0.8937833333333334,
0.8885333333333333, 0.8953166666666666, 0.8903833333333333,
0.8961166666666667, 0.8915333333333333, 0.89725, 0.8925, 0.8984333333333333,
0.8935333333333333, 0.8996666666666666, 0.8948333333333334,
0.9006833333333333, 0.8959, 0.9020166666666666
]
const trace2 = {
y,
x: [
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99
],
mode: 'lines',
name: 'workspace',
line: {
color: 'rgb(255, 217, 102)'
},
type: 'scatter'
}
const trace2A = {
y,
x: [
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99
],
mode: 'lines',
opacity: 0.1,
type: 'scatter',
showlegend: false
}
var data = [trace2,trace2A];
Plotly.newPlot('myDiv', {data,
layout:{
sliders: [{
pad: {t: 30},
x: 0.05,
len: 0.95,
currentvalue: {
xanchor: 'right',
prefix: 'degree: ',
font: {
color: '#888',
size: 20
}
},
transition: {duration: 500},
// By default, animate commands are bound to the most recently animated frame:
steps: [
{
label: '0',
method: 'animate',
args: [['0'], {
mode: 'immediate',
transition: {duration: 300},
frame: {duration: 300, redraw: false}, }]},
{
label: '1',
method: 'animate',
args: [['1'], {
mode: 'immediate',
transition: {duration: 300},
frame: {duration: 300, redraw: false}, }]},
{
label: '2',
method: 'animate',
args: [['2'], {
mode: 'immediate',
transition: {duration: 300},
frame: {duration: 300, redraw: false}, }]},
{
label: '3',
method: 'animate',
args: [['3'], {
mode: 'immediate',
transition: {duration: 300},
frame: {duration: 300, redraw: false}, }]
}, {
label: '4',
method: 'animate',
args: [['4'], {
mode: 'immediate',
transition: {duration: 300},
frame: {duration: 300, redraw: false}, }]
}, {
label: '5',
method: 'animate',
args: [['5'], {
mode: 'immediate',
transition: {duration: 300},
frame: {duration: 300, redraw: false}, }]
}]
}],
updatemenus: [{
type: 'buttons',
showactive: false,
x: 0.05,
y: 0,
xanchor: 'right',
yanchor: 'top',
pad: {t: 60, r: 20},
buttons: [{
label: 'Play',
method: 'animate',
args: [null, {
fromcurrent: true,
transition: {duration: 300},
frame: {duration: 500, redraw: false} }]
}]
}]
},
// The slider itself does not contain any notion of timing, so animating a slider
// must be accomplished through a sequence of frames. Here we'll change the color
// and the data of a single trace:
frames: [
{
name: '0',
data: [{
y
}],
},{
name: '1',
data: [{
y: smoothTriangle(y,1)
}],
},{
name: '2',
data: [{
y: smoothTriangle(y,2)
}],
},
{
name: '3',
data: [{
y: smoothTriangle(y,3)
}],
}, {
name: '4',
data: [{
y: smoothTriangle(y,4)}]
}, {
name: '5',
data: [{
y: smoothTriangle(y,5)
}]
}]
});
This does use the triangular moving average function mentioned previously (shown here), but that function is something that we have to implement on our own. We can also forgo the play button, but it seems that in order to show different smoothed options we have to calculate all of the new y values ourselves and load each set of values into distinct frames. The second example above shows how this can be done programmatically, but it is going to get complicated when we add multiple data sources + revisions (seems like ordering is the only thing that ties them together).
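As a sketch of the programmatic route, the frames and slider steps from the demo can be generated from a list of degrees instead of being written out by hand. `buildSmoothingFrames` is hypothetical, and `movingAverage` is a simple stand-in for the smoothing function, not the triangular average used in the demo.

```javascript
// Stand-in smoothing function (plain centred moving average), used only to
// keep this example self-contained.
function movingAverage(values, degree) {
  return values.map((_, i) => {
    const window = values.slice(Math.max(0, i - degree), i + degree + 1)
    return window.reduce((a, b) => a + b, 0) / window.length
  })
}

// Hypothetical helper: one frame and one slider step per smoothing degree.
// Frame names and step args use the same string, which is what ties a slider
// step to its frame.
function buildSmoothingFrames(y, degrees, smooth) {
  const frames = degrees.map(degree => ({
    name: String(degree),
    data: [{ y: degree === 0 ? y : smooth(y, degree) }]
  }))
  const steps = degrees.map(degree => ({
    label: String(degree),
    method: 'animate',
    args: [
      [String(degree)],
      {
        mode: 'immediate',
        transition: { duration: 300 },
        frame: { duration: 300, redraw: false }
      }
    ]
  }))
  return { frames, steps }
}

const sampleY = [0.27, 0.41, 0.5, 0.66, 0.54, 0.67]
const { frames, steps } = buildSmoothingFrames(sampleY, [0, 1, 2], movingAverage)
```

With several revisions, each frame's `data` array would need one entry per trace in the same order as the traces, which is the ordering dependency mentioned above.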
Edit: Demo using ema as smoothing function -
Screen.Recording.2023-09-06.at.3.49.03.pm.mov
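The code for the ema variant was not posted. For reference, a textbook exponential moving average looks like the sketch below; the function name and the `alpha` parameter are assumptions and may not match how the demo parametrises the smoothing.

```javascript
// Textbook exponential moving average (not the code used in the demo):
// s[0] = y[0], s[t] = alpha * y[t] + (1 - alpha) * s[t - 1].
// alpha is in (0, 1]; smaller values smooth more, alpha = 1 returns the input.
function ema(values, alpha) {
  const smoothed = []
  for (const value of values) {
    const previous =
      smoothed.length > 0 ? smoothed[smoothed.length - 1] : value
    smoothed.push(alpha * value + (1 - alpha) * previous)
  }
  return smoothed
}

const emaExample = ema([0, 1, 1], 0.5)
```

Unlike the triangular window, this needs no special handling at the boundaries, since every output depends only on earlier values.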
from dvc-render.
Today I've been looking at Vega. I have opened the above PR to add zoom/pan to plots in VS Code and have been able to come up with these tooltips for linear plots.
PTAL and LMK what you think/if this changes anything.
from dvc-render.
@mattseddon Is your point that we should reconsider plotly?
from dvc-render.
@mattseddon Is your point that we should reconsider plotly?
I am really not sure. I think both Vega and Plotly have their own benefits and constraints.
Let's chat about whether or not we still want to take this on when we meet this week.
In the meantime, I am going to attempt to update the default templates to add zoom + pan and new tooltips. E.g. for smooth/linear, we will end up with:
Screen.Recording.2023-09-11.at.9.33.00.am.mov
As you can see from the above screen recording, the template is not perfect, as the tooltip contains {rev::filename::field} as an identifier. Right now this is the only way we can get around the fact that all templates are generalised.
I am also going to look further into the Studio/DVC code. Whatever we decide we need to start on removing parts of the legacy process.
New proposed smooth template
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": "<DVC_METRIC_DATA>"
},
"title": "<DVC_METRIC_TITLE>",
"width": "container",
"height": "container",
"params": [
{
"name": "smooth",
"value": 0.001,
"bind": {
"input": "range",
"min": 0.001,
"max": 1,
"step": 0.001
}
}
],
"layer": [
{
"encoding": {
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
},
"layer": [
{ "mark": "line" },
{
"transform": [{ "filter": { "param": "hover", "empty": false } }],
"mark": "point"
}
],
"transform": [
{
"loess": "<DVC_METRIC_Y>",
"on": "<DVC_METRIC_X>",
"groupby": ["rev", "filename", "field", "filename::field"],
"bandwidth": {
"signal": "smooth"
}
}
]
},
{
"params": [{ "bind": "scales", "name": "grid", "select": "interval" }],
"mark": {
"type": "line",
"opacity": 0.2
},
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
}
},
{
"mark": {
"type": "circle",
"size": 10,
"tooltip": {
"content": "encoding"
}
},
"encoding": {
"x": {
"aggregate": "max",
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
},
"y": {
"aggregate": {
"argmax": "step"
},
"field": "<DVC_METRIC_Y>",
"type": "quantitative",
"title": "<DVC_METRIC_Y_LABEL>",
"scale": {
"zero": false
}
},
"color": {
"field": "rev",
"type": "nominal"
}
}
},
{
"transform": [
{
"calculate": "datum.rev + '::' + datum.filename + '::' + datum.field",
"as": "tooltip-group"
},
{
"pivot": "tooltip-group",
"value": "acc",
"groupby": ["<DVC_METRIC_X>"]
}
],
"mark": { "type": "rule", "tooltip": { "content": "data" } },
"encoding": {
"opacity": {
"condition": { "value": 0.3, "param": "hover", "empty": false },
"value": 0
}
},
"params": [
{
"name": "hover",
"select": {
"type": "point",
"fields": ["<DVC_METRIC_X>"],
"nearest": true,
"on": "mouseover",
"clear": "mouseout"
}
}
]
}
],
"encoding": {
"x": {
"field": "<DVC_METRIC_X>",
"type": "quantitative",
"title": "<DVC_METRIC_X_LABEL>"
}
}
}
Demo VS Code
Screen.Recording.2023-09-11.at.1.35.07.pm.mov
In order to implement this I think we need to consolidate the post-processing of data in the three products (due to the use of "calculate": "datum.rev + '::' + datum.filename + '::' + datum.field", "as": "tooltip-group" for the new tooltips).
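To show what that calculate + pivot pair does to the data, here is a rough plain-JavaScript equivalent. It assumes every datapoint carries `rev`, `filename` and `field` keys and that each group has at most one value per x; `pivotForTooltip` and the sample rows are hypothetical.

```javascript
// Rough equivalent of the calculate + pivot transforms: build one row per x
// value, with one column per "rev::filename::field" group.
function pivotForTooltip(datapoints, xField, yField) {
  const rows = new Map()
  for (const point of datapoints) {
    // Same concatenation as the "calculate" expression in the template.
    const group = `${point.rev}::${point.filename}::${point.field}`
    const x = point[xField]
    if (!rows.has(x)) {
      rows.set(x, { [xField]: x })
    }
    rows.get(x)[group] = point[yField]
  }
  return [...rows.values()]
}

// Made-up datapoints for two revisions of the same metric file.
const points = [
  { rev: 'workspace', filename: 'metrics.json', field: 'acc', step: 0, acc: 0.27 },
  { rev: 'main', filename: 'metrics.json', field: 'acc', step: 0, acc: 0.25 },
  { rev: 'workspace', filename: 'metrics.json', field: 'acc', step: 1, acc: 0.41 }
]

const tooltipRows = pivotForTooltip(points, 'step', 'acc')
```

The tooltip only works if the datapoints reaching the template have all three keys, which is why the post-processing has to agree across the products.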
from dvc-render.
For anyone following this issue:
This has been temporarily deprioritised whilst iterative/dvc#9940 is worked on.
from dvc-render.
@shcheklein I think @dberenbaum summed it up well in the last comment. Plotly would not be a silver bullet and I don't think we can justify the effort for the benefits that we would get right now.
from dvc-render.