rstudio / ggvis

Interactive grammar of graphics for R

License: Other

ggvis's Introduction

ggvis

Status

ggvis is currently dormant. We fundamentally believe in the ideas that underlie ggvis: reactive programming is the right foundation for interactive visualisation. However, we are not currently working on ggvis because we do not see it as the most pressing issue for the R community: you can only use interactive graphics once you've successfully tackled the rest of the data analysis process.

We hope to come back to ggvis in the future; in the meantime you might want to try out plotly or creating interactive graphics with shiny.

Introduction

The goal of ggvis is to make it easy to describe interactive web graphics in R. It combines:

a grammar of graphics from ggplot2,
reactive programming from shiny, and
data transformation pipelines from dplyr.

ggvis graphics are rendered with vega, so you can generate both raster graphics with HTML5 canvas and vector graphics with svg. ggvis is less flexible than raw d3 or vega, but is much more succinct and is tailored to the needs of exploratory data analysis.

If you find a bug, please file a minimal reproducible example at https://github.com/rstudio/ggvis/issues. If you're not sure whether something is a bug, would like to discuss new features, or have any other questions about ggvis, please join us on the mailing list: https://groups.google.com/group/ggvis.

Installation

Install the latest release version from CRAN with:

install.packages("ggvis")

Install the latest development version with:

# install.packages("devtools")
devtools::install_github("rstudio/ggvis")

Getting started

You construct a visualisation by piping pieces together with %>%. The pipeline starts with a data set, flows into ggvis() to specify default visual properties, then layers on some visual elements:

mtcars %>% ggvis(~mpg, ~wt) %>% layer_points()

The vignettes, available from https://ggvis.rstudio.com/, provide many more details. Start with the introduction, then work your way through the more advanced topics. Also check out the various demos in the demo/ directory. See the basics in demo/scatterplot.r, then check out the coolest demos, demo/interactive.r and demo/tourr.r.

ggvis's Issues

Need gigvis_node constructor

Used by gigvis, node, and mark. Should check required inputs are of the correct type. is.gigvis_node function should live in same file.

Complete remaining inputs

checkboxInput
checkboxGroupInput
dateInput
dateRangeInput
numericInput
textInput

I wonder if we should also add generic update methods for the delayed reactives (they'd just dispatch to update*Input)

R colours should be converted to hex

wherever they are used as constants.
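
A minimal sketch of such a conversion in base R, assuming colours reach the spec as character constants. The helper name col_to_hex is hypothetical and not part of ggvis; col2rgb() and rgb() come from grDevices.

```r
# Hypothetical helper: convert any R colour specification to "#RRGGBB".
col_to_hex <- function(colour) {
  channels <- grDevices::col2rgb(colour)  # 3 x n matrix: red, green, blue rows
  grDevices::rgb(channels[1, ], channels[2, ], channels[3, ],
                 maxColorValue = 255)
}

# Named colours are translated; hex strings pass through unchanged.
col_to_hex(c("red", "steelblue", "#000000"))
# "#FF0000" "#4682B4" "#000000"
```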

Mapped props should evaluate in environment function

i.e. so the current code doesn't fail

f <- function(n) {
  force(n)
  props(x ~ n + wt)
}
prop_value(f(5)$x, mtcars)

That implies that props() should also capture the environment and store it, and prop_value needs to use the environment. Maybe that implies that this particular type of prop should be a different class?
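
One way this could work, sketched in base R under the assumption that a prop stores its expression together with the environment in which it was created. The names env_prop and env_prop_value are hypothetical, not ggvis functions.

```r
# Hypothetical prop that remembers where it was created.
env_prop <- function(expr, env = parent.frame()) {
  structure(list(expr = expr, env = env), class = "env_prop")
}

# Evaluate the expression in the data first, then fall back to the
# captured environment for anything the data does not define.
env_prop_value <- function(prop, data) {
  eval(prop$expr, data, prop$env)
}

f <- function(n) {
  force(n)
  env_prop(quote(n + wt))  # `n` lives in f's environment, `wt` in the data
}

head(env_prop_value(f(5), mtcars))
```

Here wt is found in mtcars and n is found in the environment of f, so the call no longer depends on where the value is eventually computed.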

Revert transforms to fixed variables

(instead of taking from data)

gigvis should automatically detect if plot is dynamic or static

One way to do this would be for connect to return either non-reactive or reactive objects. A plot is then dynamic if any of the node data are reactive, and vega_spec would include something like:

datasets <- lapply(data_names, function(name) {
  data <- data_table[[name]]
  if (is.reactive(data)) {
    list(name = name)
  } else {
    data_table[[name]]
  }
})

Props should automatically convert underscores to camelCase

So you write props(font_size = 20) not props(fontSize = 20)
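
The renaming itself is a one-line regular expression. The sketch below uses a hypothetical helper name, to_camel_case, and assumes property names contain only lower-case letters and underscores.

```r
# Hypothetical helper: turn "font_size" into "fontSize".
to_camel_case <- function(x) {
  # With perl = TRUE, \\U upper-cases the captured letter after each underscore.
  gsub("_([a-z])", "\\U\\1", x, perl = TRUE)
}

to_camel_case(c("font_size", "stroke_opacity", "fill"))
# "fontSize" "strokeOpacity" "fill"
```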

Modifying plots

Currently there's no easy way to create a plot and then modify it. We could add additional functions to change data, modify props and add/delete/update other components. Alternatively we could create + methods

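That would let us reduce the two near-identical calls below to a single base plot plus an update_props() call (or a + method) for each variant: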

g1 <- gigvis(mtcars, props(x ~ wt, y ~ mpg),
  mark_symbol(),
  branch_smooth(span = span_slider)
)
g2 <- gigvis(mtcars, props(x ~ wt, y ~ hp),
  mark_symbol(),
  branch_smooth(span = span_slider)
)

g <- gigvis(mtcars, props(x ~ wt),
  mark_symbol(),
  branch_smooth(span = span_slider)
)

g1 <- update_props(g, props(y ~ mpg))
g2 <- update_props(g, props(y ~ hp))

# or

g + props(y ~ mpg)
g + props(y ~ hp)

Better automatic detection for scales and axes

So that this example:

ggvis(mtcars, props(y ~ mpg), 
  mark_symbol(props(x = prop(quote(disp), scale = "xdisp"))),
  mark_symbol(props(x = prop(quote(wt), scale = "xwt"))),
  dscale("x", "numeric", name = "xdisp"),
  dscale("x", "numeric", name = "xwt"),
  axis("x", "xdisp", orient = "bottom"),
  axis("x", "xwt", orient = "bottom")
)

only needs to be

ggvis(mtcars, props(y ~ mpg), 
  mark_symbol(props(x = prop(quote(disp), scale = "xdisp"))),
  mark_symbol(props(x = prop(quote(wt), scale = "xwt")))
)

It would also be nice to have some standard way of adding on to the default scales/axes/legends instead of completely overriding them.

Should scaled constants be embedded in the data

Because we can't currently add constants to a scale: https://github.com/trifacta/vega/issues/103

Then it would work the same way as reactive constants.

Scaled constants are useful for cases like this:

ggvis(mtcars, props(x ~ wt, y ~ mpg), 
  mark_symbol(),
  branch_smooth(props(stroke = "lm"), method = lm),
  branch_smooth(props(stroke = "loess"))
)

Add resizing to static plot view

Suboptimal error message for invalid prop

p <- ggvis(mtcars, props(x ~ disp, y ~ mpg, strke ~ cyl), mark_symbol())
# Error when attempting to display it
print(p)
# Error: Don't know how to make default scale for strke with variable of type numeric
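
A friendlier message could be produced by validating property names before any scales are built. The sketch below is only an illustration: known_props is a hypothetical, incomplete list, and check_prop_names is not a ggvis function. It uses utils::adist() to suggest the closest known name.

```r
# Hypothetical list of known property names; the real set would come from
# the mark definitions.
known_props <- c("x", "y", "stroke", "fill", "opacity", "size")

check_prop_names <- function(prop_names, known = known_props) {
  unknown <- setdiff(prop_names, known)
  if (length(unknown) == 0) return(invisible(TRUE))

  # Suggest the known name with the smallest edit distance.
  distances <- utils::adist(unknown[1], known)
  suggestion <- known[which.min(distances)]
  stop("Unknown property '", unknown[1], "'. Did you mean '", suggestion, "'?",
       call. = FALSE)
}

msg <- tryCatch(check_prop_names(c("x", "y", "strke")),
                error = function(e) conditionMessage(e))
msg
# "Unknown property 'strke'. Did you mean 'stroke'?"
```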

Clone method for delayed reactives?

So e.g. you could do:

vars <- input_select(c("disp", "wt"))
gigvis(mtcars, props(
  x = prop(clone(vars), constant = FALSE), 
  y = prop(clone(vars), constant = FALSE)), 
  mark_symbol())

It would be straightforward to implement because it would just create a new random id.

Properties for colour components

https://github.com/trifacta/vega/wiki/Marks#color-references

tourr.r requires dscale_y_numeric from scale_defaults.R

Couldn't do the tourr (very cool btw) without first opening and executing scale_defaults.R

(Don't know the proper way to load that from within the tourr.r file, so creating an issue instead of a pull request...)

Need better default behaviour for text

Currently need to request it not be scaled:

df <- data.frame(x = runif(5), y = runif(5), 
  labels = c("a", "b", "b", "a", "b"))
ggvis(df, props(x ~ x, y ~ y, text = prop(quote(labels), scale = FALSE)), mark_text())

maybe:

props(x ~ x, y ~ y, text = quote(labels))

Install fails for install_github("gigvis", "rstudio")

The error message is:

Error : object 'stopApp' is not exported by 'namespace:shiny'

Add default scale function

e.g.

dscale("x", "numeric", domain = c(-1, 1))

Data envir should be captured at gigvis, not vega_spec

Right now, the data passed to gigvis()/node() is a string that is later evaluated in the environment given to vega(). Can we/I change data to just being an R expression that is captured using shiny::exprToFunction?

Histogram with 1-pixel wide bins

How could we make this possible? (Even as the user resizes the plot)

Error on install

Hi,
I tried installing and I got an error

install_github("ggvis", "rstudio")
Installing github repo(s) ggvis/master from rstudio
Downloading ggvis.zip from https://github.com/rstudio/ggvis/archive/master.zip
Installing package from /var/folders/tl/_8_djcq15pl01ht8z6hy9tww0000gn/T//RtmpOd4AEL/ggvis.zip
Installing ggvis
'/usr/local/Cellar/r/2.15.2/R.framework/Resources/bin/R' --vanilla CMD INSTALL  \
  '/private/var/folders/tl/_8_djcq15pl01ht8z6hy9tww0000gn/T/RtmpOd4AEL/ggvis-master'  \
  --library='/Users/nacho/Library/R/2.15/library' --with-keep.source --install-tests

* installing *source* package 'ggvis' ...
** R
** demo
** inst
** tests
** preparing package for lazy loading
** help
Error : /private/var/folders/tl/_8_djcq15pl01ht8z6hy9tww0000gn/T/RtmpOd4AEL/ggvis-master/man/branch_density.Rd: Sections \title, and \name must exist and be unique in Rd files
ERROR: installing Rd objects failed for package 'ggvis'
* removing '/Users/nacho/Library/R/2.15/library/ggvis'
Error: Command failed (1)

Generate controls at render time (not creation)

To make it easier to use information about the controls in other functions.

Constant properties should only take single value

i.e. props(x = c(30, 50)) should be an error.
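
A minimal sketch of the check, using a hypothetical helper name (check_constant) that is not part of ggvis:

```r
# Hypothetical validator: a constant property must be a single value.
check_constant <- function(value, name) {
  if (length(value) != 1) {
    stop("Constant property '", name, "' must have length 1, not ",
         length(value), ".", call. = FALSE)
  }
  value
}

check_constant(30, "x")  # returns 30

# A length-2 constant is rejected with an informative message.
err <- tryCatch(check_constant(c(30, 50), "x"),
                error = function(e) conditionMessage(e))
err
```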

Somewhat related to #30, because we could choose to do the recycling ourselves and save in the data frame (but I think that's probably a bad idea)

Need auto_split() pipeline

Autosplit should have the same basic behaviour as ggplot2: i.e. it should split by every categorical variable in the layer. It should be straightforward - it's similar to by_group but it figures out which variables to split by in connect.

Then we could add branch_line() which would look like:

branch_line <- function(props, ...) {
  node(auto_split(...), mark_line(props))
}

Once we have transform_sort it might look like:

branch_line <- function(props, sort = TRUE, ...) {
  node(
    c(auto_split(), if (sort) transform_sort()),
    mark_line(props)
  )
}
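
The splitting rule itself can be sketched in base R. This is an assumption about what "every categorical variable" means (factor or character columns), and auto_split_df is a hypothetical name that works on a plain data frame rather than a ggvis pipeline.

```r
# Hypothetical helper: split a data frame by all of its categorical columns.
auto_split_df <- function(data) {
  is_categorical <- vapply(
    data,
    function(col) is.factor(col) || is.character(col),
    logical(1)
  )
  # Nothing to split by: return the data as a single group.
  if (!any(is_categorical)) return(list(data))

  # split() combines several grouping columns via interaction();
  # drop = TRUE discards combinations that never occur.
  split(data, data[is_categorical], drop = TRUE)
}

groups <- auto_split_df(iris)
names(groups)
# "setosa" "versicolor" "virginica"
```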

Transforms need consistent names for output variables

Always use __ suffix? (Not sure if we need this or not). Need to update branch functions similarly, and use standard way of setting default properties.

`rep_len` not found

I'm running R version 2.15.2, and when I try demo(scatterplot, package="ggvis"), I get the following:

    demo(scatterplot)
    ---- ~~~~~~~~~~~

Type  <Return>   to start : 

> library(ggvis)

> # Basic scatter plot
> ggvis(mtcars, props(x = ~ wt, y = ~ mpg),
+   mark_symbol()
+ )
Error in FUN(X[[1L]], ...) : could not find function "rep_len"
Error in FUN(X[[1L]], ...) : could not find function "rep_len"

ggvis says it supports R (>= 2.15.1), so I guess either that function needs to be changed or the prereq version changed?

Detecting inherited props

In branch_density, it inspects the specified properties so that fill and fillOpacity go to the area mark, and stroke and strokeOpacity go to the line mark. But this doesn't work for properties inherited from ancestor nodes.

# As expected: semitransparent fill
ggvis(faithful, props(x = ~waiting), branch_density(props(fill := "red")))

# Not as expected: opaque fill for the line mark (not the area mark)
ggvis(faithful, props(x = ~waiting, fill := "red"), branch_density())

Is prop_value ever called with processed = TRUE?

And can it be removed? Similarly for plot_type

Options for more streamlined DSL for props()

These are some of the possibilities for the DSL for specifying properties within the props() function. This function has approximately the same purpose as the aes() function in ggplot2.

A property is either scaled or unscaled. Examples of scaled values are: numeric value mapped to x position, and a categorical variable mapped to colors. Examples of unscaled values are: specifying a constant color for points, like "red", or a column in a data set that contains raw color names, like "#000000", "#ffffff", "red", etc.

The value can be an expression to be evaluated at render time, a constant value, or a (reactive) input object.

There are six different combinations. Here are some examples of their use:

1. Scaled expression: Mapping a variable to a property. Similar to aes(x = wt) in ggplot2.
2. Scaled constant: Used for setting the x or y position of a line/point.
3. Scaled reactive: For interactive input - perhaps for adding lines at particular x/y values?
4. Unscaled expression: An expression, which, when evaluated in the data set, returns a vector of raw values, like c("red", "red", "black").
5. Unscaled constant: A raw value like "red".
6. Unscaled reactive: Interactive inputs setting color/size/opacity of points.

1 and 5 will probably be the most common use cases.

The options shown below are a subset of a long list of alternatives.

Option 1.5

scaled expr: x ~ quote(mpg)
scaled cnst: x ~ 1
scaled rctv: x ~ input_slider()
unscld expr: fill = quote(col)
unscld cnst: fill = "red"
unscld rctv: fill = input_slider()
prop obj : y = prop(...)

Cons: most common operation is longest (~ + quote).

Option 3

scaled expr: x ~ mpg
scaled cnst: x ~ 1
scaled rctv: x ~ input_slider()
unscld expr: fill ~ I(col)
unscld cnst: fill ~ I("red")
unscld rctv: fill ~ I(input_slider())
prop obj : y = prop(...)

Cons:

Option 4

scaled expr: x ~ mpg
scaled cnst: x ~ 1
scaled rctv: y = input_slider()
unscld expr: fill ~ I(col)
unscld cnst: fill = "red", or fill = I("red")
unscld rctv: fill = I(input_slider())
prop obj : y = prop(...)

Pros: No non-standard evaluation, Hadley can implement easily
Cons: lack of symmetry, confusing

Option 5

By default, expressions are evaluated
~ means to not evaluate the expression

= means to use scale = TRUE
:= means to use scale = FALSE

scaled expr: x = ~mpg
scaled cnst: x = 1
scaled rctv: x = input_slider()
unscld expr: fill := ~col
unscld cnst: fill := "red"
unscld rctv: fill := input_slider()
prop obj : y = prop(...)
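
Option 5 depends on R's parser accepting := inside a call, which it does. The sketch below shows how a function could tell the two forms apart without evaluating anything. sketch_props is a hypothetical name, and this is only an illustration of the parsing step, not the ggvis implementation.

```r
# Hypothetical parser: `name = value` means scaled, `name := value` unscaled.
sketch_props <- function(...) {
  args <- eval(substitute(alist(...)))  # capture arguments unevaluated
  arg_names <- names(args)
  if (is.null(arg_names)) arg_names <- rep("", length(args))

  out <- list()
  for (i in seq_along(args)) {
    arg <- args[[i]]
    if (is.call(arg) && identical(arg[[1]], as.name(":="))) {
      # `fill := "red"` parses as the call `:=`(fill, "red")
      out[[as.character(arg[[2]])]] <- list(value = arg[[3]], scale = FALSE)
    } else if (nzchar(arg_names[i])) {
      out[[arg_names[i]]] <- list(value = arg, scale = TRUE)
    } else {
      stop("Arguments must use `=` or `:=`.", call. = FALSE)
    }
  }
  out
}

p <- sketch_props(x = ~mpg, fill := "red")
p$x$scale     # TRUE
p$fill$scale  # FALSE
```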

Option 9

~ means to not evaluate the expression. Also default to scale=TRUE.
= means to evaluate the expression. Also default to scale=FALSE.

scaled() means to force the object to have scale=TRUE
unscaled() means to force the object to have scale=FALSE

scaled expr: x ~ mpg
scaled cnst: x = scaled(1)
scaled rctv: x = scaled(input_slider())
unscld expr: fill ~ unscaled(col)
unscld cnst: fill = "red"
unscld rctv: fill = input_slider()
prop obj : y = prop(...)

Option 10

By default, expressions are evaluated
~ means to not evaluate the expression

Default is to use scale=TRUE
I() means to use scale=FALSE

scaled expr: x = ~mpg
scaled cnst: x = 1
scaled rctv: x = input_slider()
unscld expr: fill = ~I(col)
unscld cnst: fill = I("red")
unscld rctv: fill = I(input_slider())
prop obj : y = prop(...)

When resizing plots, result is off by one pixel

I believe this is the result of a bug in jQuery UI 1.10.3, which I filed here:
http://bugs.jqueryui.com/ticket/9547

Prop classes

A property has three binary options:

reactive or static
constant or variable
scaled or unscaled

and these are currently exposed through three classes:

prop_reactive = reactive
variable = static + variable
constant = static + constant

which doesn't seem like a great design. Maybe we only need one prop class with three binary inputs?
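
A single-class design might look like the sketch below. The constructor name prop_sketch and its defaults are assumptions made for illustration only.

```r
# Hypothetical single prop class carrying the three binary options as flags.
prop_sketch <- function(value, reactive = FALSE, constant = TRUE,
                        scale = FALSE) {
  flags <- list(reactive = reactive, constant = constant, scale = scale)
  # Each flag must be a single TRUE or FALSE.
  ok <- vapply(flags, function(f) is.logical(f) && length(f) == 1 && !is.na(f),
               logical(1))
  if (!all(ok)) stop("reactive, constant and scale must be TRUE or FALSE.")

  structure(c(list(value = value), flags), class = "prop_sketch")
}

is.prop_sketch <- function(x) inherits(x, "prop_sketch")

# A scaled variable and an unscaled constant share one class.
mapped <- prop_sketch(quote(wt), constant = FALSE, scale = TRUE)
fixed  <- prop_sketch("red")
```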

Do we need a way to remove props?

e.g.

merge_props(props(x ~ a, y ~ b), props(x = NULL))

It's probably not as important as ggplot2 since we have a richer data hierarchy and inherits = FALSE, but it might be handy from time-to-time.
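
If props were stored as a plain named list, base R's utils::modifyList() already has these semantics: an element set to NULL in the second list is dropped from the result. The sketch below assumes that representation; merge_props_sketch is a hypothetical stand-in, not the ggvis function.

```r
# Hypothetical merge: child values override parent values, NULL removes.
merge_props_sketch <- function(parent, child) {
  utils::modifyList(parent, child)
}

merged <- merge_props_sketch(
  list(x = quote(a), y = quote(b)),
  list(x = NULL)
)
names(merged)
# "y"
```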

Add dots argument to transform

To represent other arguments passed on down to the underlying statistical transformation function.

Add way to make it easy to interactively subset data

For example, maybe a transform that goes in the pipeline:

pipeline(
 mtcars,
 transform_custom(
   input_slider(0, 10), 
   function(val, data) {
     data[data$cyl < val, ]
   })
)

Or a delayed_subset function:

delayed_subset(mtcars, cyl < input_slider(0, 10))

slider <- input_slider(0, 10)
delayed_subset(mtcars, cyl < slider)

Axis and legends need prefix

To avoid clash with base functions - maybe vega_axis and vega_legend since they do little apart from creating the needed json spec.

Better names for branches and nodes

It would be nice to have names that stuck to a consistent metaphor, and also fit with the other current functions for making graph components ggvis and mark_*

Add button to interactive plots for downloading static png/svg image

Error requires restart

Once you execute this code:

# Slider and select input in a transform
ggvis(mtcars, props(x ~ wt, y ~ mpg),
  mark_symbol(),
  branch_smooth(
    n = sliderInput('foo', 2, 80, value = 5, step = 1, label = "Interpolation points"),
    method = input_select(c("Linear" = "lm", "LOESS" = "loess"), label = "Method")
  )
)

Future dynamic ggvis invocations don't work, until you restart your R session.

Create the classic ggvis example dataset

The equivalent of diamonds for ggplot2.

Default html display tweaks for Rstudio

remove title
make renderer option less prominent (maybe show in small font at bottom-right of plot)
need autosize option so that plot fits the pane exactly
remove quit button
make download button less prominent
make default font size smaller

Should node and ggvis just take ...?

i.e. remove props and data and automatically assign and collapse based on their classes.

This would mean that instead of:

ggvis(, , mark_symbol(props(x ~ wt, y ~ mpg, stroke = "red"), mtcars))

you could just write

ggvis(mark_symbol(props(x ~ wt, y ~ mpg, stroke = "red"), mtcars))

S3 wrappers for math transforms of inputs

Not sure if this is a good idea or not - we can't make it work for every single function, but we can cover the most common.

input <- function(builder, map, args) {
  assert_that(is.function(map), length(formals(map)) == 1)

  structure(list(builder = builder, map = map, args = args),
    class = "input")
}
is.input <- function(x) inherits(x, "input")

input_slider <- function(min, max, value = min, step = NULL, round = FALSE,
  format = "#,##0.#####", locale = "us", ticks = TRUE,
  animate = FALSE, label = "", id = rand_id("slider_"),
  map = identity) {

  input("sliderInput", map = map, args = list(id, label, min = min, max = max, 
    value = value, step = step, round = round, format = format, locale = locale, 
    ticks = ticks)) 
}


controls.input <- function(x, session = NULL, ...) {
  builder <- match.fun(x$builder)
  do.call(builder, x$args)
}

Math.input <- function(x) {
  f <- match.fun(.Generic)
  map <- function(y) f(x$map(y))
  input(x$builder, map, x$args)
} 
Ops.input <- function(e1, e2) {
  f <- match.fun(.Generic)
  if (is.input(e1)) {
    map <- function(y) f(e1$map(y), e2)
    input(e1$builder, map, e1$args)
  } else {
    map <- function(y) f(e1, e2$map(y))
    input(e2$builder, map, e2$args)
  }
} 

x <- input_slider(0, 10)

(x + 1)$map(10)
(10 ^ x)$map(2)

Support delayed reactive components on scales etc?

Should we support including delayed reactives in any arbitrary component, even though that would require resending the spec to vega? This would make it possible to easily implement zoom (by changing the domains of the scales), but it's not obvious if it's needed in general, or should just be something that could be implemented inside a shiny app.

Need some way to suppress axes/legends

In case you don't want one for some reason.

Make dynamic ggvis less chatty

i.e. eliminate:

Shiny URLs starting with /ggvis will mapped to /Users/hadley/R/ggvis/www

Listening on port 4343

Need prop_group

See https://github.com/trifacta/vega/issues/31 and https://github.com/trifacta/vega/wiki/Marks#value-references

Faster toJSON

Same as rstudio/shiny#228, but even more of a performance issue for the types of data structures that ggvis is likely to create.

pp <- function (n, r = 4) {
  width <- 2 * r * pi / (n - 1)

  mid <- seq(-r * pi, r * pi, len = n)
  df <- expand.grid(x = mid, y = mid)
  df$r <- sqrt(df$x^2 + df$y^2)
  df$z <- cos(df$r^2)*exp(-df$r/6)

  df$y2 <- df$y + width
  df$x2 <- df$x + width
  df$x <- df$x - width
  df$y <- df$y - width
  df
}

# Create the data
system.time(dat <- pp(100))
#    user  system elapsed 
#   0.003   0.001   0.004 

# Convert to a D3 data format that's ready for conversion to JSON
system.time(dfj <- df_to_json(dat))
#    user  system elapsed 
#   0.099   0.001   0.101 

# Two different functions for converting to JSON
system.time(json <- RJSONIO::toJSON(dfj))
#    user  system elapsed 
#    4.47    0.01    4.48 
system.time(json <- rjson::toJSON(dfj))
#    user  system elapsed 
#   0.166   0.001   0.168

Unfortunately, the output isn't quite the same from the two toJSON functions.

Consider supporting all property elements

Not just update, but also exit, enter, hover, or even custom values used by custom js events.

Name change proposal

ggvis -> gigplot3 - clear homage to ggplot2, while indicating that it's an evolution that adds interactivity
gigvis() -> gigplot() by analogy to ggplot2, and because plot is more of a verb than vis.

What do you think?

Error: Can't send message on a closed WebSocket

I've figured out how to reproduce this error. This is probably related to #44, with old reactive things hanging around.

library(ggvis)
library(shiny)

# This gives an error
ggvis(mtcars, props(x = ~ wt, y = ~ mpg),
  mark_symbol(),
  branch_smooth(
    n = sliderInput('foo', 2, 80, value = 5, step = 1, label = "Interpolation points"),
    method = input_select(c("Linear" = "lm", "LOESS" = "loess"), label = "Method")
  )
)
# Listening on port 8100
# Guess transform_smooth(formula = y ~ x)
# Error : length(trans$n) not equal to 1
# Error : length(trans$n) not equal to 1
# Error : length(trans$n) not equal to 1
# Error : length(trans$n) not equal to 1
# <break>

# This shouldn't give an error, but does - run it twice
ggvis(mtcars, props(x = ~wt, y = ~mpg),
  mark_symbol(),
  branch_smooth(
    method = input_select(c("Linear" = "lm", "LOESS" = "loess"), label = "Method")
  )
)
# Listening on port 8100
# Error : length(trans$n) not equal to 1
# <break>

# Run the previous one again
ggvis(mtcars, props(x = ~wt, y = ~mpg),
  mark_symbol(),
  branch_smooth(
    method = input_select(c("Linear" = "lm", "LOESS" = "loess"), label = "Method")
  )
)
# Listening on port 8100
# Guess transform_smooth(formula = y ~ x)
# Error in .websocket$send(json) : Can't send message on a closed WebSocket

At this point, running flushReact() a few times will give the WebSocket error, but eventually they clear out and it's possible to make dynamic plots again.