julianstanley / gfpopgui
A Shiny-based GUI for GFPOP. Project for Google Summer of Code 2020.
Home Page: https://julianstanley.shinyapps.io/gfpopgui/
License: Other
[...] to have two linked displays, one with a zoomed out overview, and
another with zoom to details in a specific region. something like range
selector from dygraphs http://dygraphs.com/gallery/#g/range-selector
So, a nice-to-have enhancement: something like a range selector to make it easier for a user to stay oriented while zooming all around the changepoints visualization.
@tdhock Shinytest can be used with Travis, so I'll plan to set that up once I have some tests.
This is a big Phase 2 feature that I should be all set to implement now.
I think it is impossible to have client-side communication between the visNetwork and plotly visualizations.
So, for now, that's going to have to be server-side. My (crude/brute-force) implementation idea right now is just to have a column shared between the changepoint segments and the nodes/edges in the graph. When a user hovers over a point, I can grab that information and use that to change node/edge/changepoint opacity.
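A minimal base R sketch of that shared-column idea. The column names (`state`, `opacity`) and the helper `highlight_state()` are hypothetical, and the data is made up for illustration; in the app, something like this would presumably be called from a server-side hover handler.

```r
# Hypothetical data: changepoint segments and graph nodes share a `state` column.
segments <- data.frame(
  start = c(1, 21, 41),
  end   = c(20, 40, 60),
  state = c("Std", "up", "Std"),
  stringsAsFactors = FALSE
)
nodes <- data.frame(
  id    = c("Std", "up"),
  state = c("Std", "up"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: full opacity for rows that match the hovered state,
# dimmed opacity for everything else.
highlight_state <- function(df, hovered_state, dim = 0.2) {
  df$opacity <- ifelse(df$state == hovered_state, 1, dim)
  df
}

# Simulate the user hovering over a segment in the "up" state.
segments_hl <- highlight_state(segments, "up")
nodes_hl <- highlight_state(nodes, "up")
```

Because both tables are filtered on the same column, one hover value is enough to update the plot and the graph together.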
General best practices: https://community.rstudio.com/t/best-practices-shiny-development/1694/5
I'll need to learn more about using modules properly: https://mastering-shiny.org/scaling-modules.html
I also need to decide whether it's worth the effort to use golem to structure the package. I'll go ahead and make a branch that I'll use to prototype setting up the app with golem. Here it is
I initially tried designing modules for each page in the app. I knew that that wasn't best practice, but I wanted some modularization.
In #11 I ended up deleting the code from those modules and just putting them in the main UI and server functions because the "Welcome" and "Analysis" pages needed to communicate with each other, and modules should be independent.
So I'm going to read https://mastering-shiny.org/scaling-modules.html more carefully and learn more about how it applies to this project.
A lot of little bugs can be traced back to graphdf_to_visNetwork() and visNetwork_to_graphdf(), in fct_visNetwork_helpers.R.
That's because, when I convert a visNetwork list to graphdf, I still fail to keep track of all edge ids. I keep track of node ids, but no longer edge ids.
My goal should be to never mess with ids. They should stay the same, always. That's part of why #55 was important--I needed to make a layer between the user's labels and the ids to make sure they didn't confuse visNetwork.
I may need to keep track of edge ids in a reactiveValue...maybe my graphdf can always be a list, with the real graphdf and an array of edge ids? I can keep the node ids in the same df, since they're the same length as the df itself, but that's not the case for the edge values.
This bug will be introduced when I merge pull request #32. It's workable but annoying.
I feel like this should be resolved if I just add a button to re-render the graph and then isolate all of the reactive components in the graph function. Hopefully, that's all it takes :)
Previously thought that I had to wrap reactive events into functions for testing purposes. This isn't necessary and reduces code coverage.
Cleaning this up may require changing eventReactives into observeEvents, since I was using eventReactives incorrectly. Note that:
library(shiny)
exampleServer <- function(id) {
  moduleServer(
    id,
    function(input, output, session) {
      exampleVal <- reactiveValues(changeme = FALSE)
      eventReactive(input$exampleInput, {
        exampleVal$changeme <- TRUE
      })
    }
  )
}
testServer(exampleServer, {
  # changeme begins false
  stopifnot(!exampleVal$changeme)
  # changeme should change to true when exampleInput changes
  session$setInputs(exampleInput = 1)
  stopifnot(exampleVal$changeme)
})
Does not work, but:
library(shiny)
exampleServer <- function(id) {
  moduleServer(
    id,
    function(input, output, session) {
      exampleVal <- reactiveValues(changeme = FALSE)
      observeEvent(input$exampleInput, {
        exampleVal$changeme <- TRUE
      })
    }
  )
}
testServer(exampleServer, {
  # changeme begins false
  stopifnot(!exampleVal$changeme)
  # changeme should change to true when exampleInput changes
  session$setInputs(exampleInput = 1)
  stopifnot(exampleVal$changeme)
})
Does.
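A likely explanation, stated as general Shiny behavior rather than anything specific to this app: eventReactive() returns a lazy reactive expression whose body only runs when something reads its value, while observeEvent() runs eagerly for its side effects. The sketch below uses eventReactive() for computing a value and has the test read that value; the name `doubled` is hypothetical.

```r
library(shiny)

exampleServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    # eventReactive returns a value; nothing runs until doubled() is read.
    doubled <- eventReactive(input$exampleInput, {
      input$exampleInput * 2
    })
  })
}

testServer(exampleServer, {
  session$setInputs(exampleInput = 1)
  # Reading the reactive is what triggers the computation.
  stopifnot(doubled() == 2)
})
```

So a rough rule of thumb is: observeEvent() when the point is a side effect (such as updating a reactiveValues field), eventReactive() when the point is a returned value.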
One of my big goals is to have a minimally functional application up-and-running ASAP, and then keep the app functional as I build up features.
This fits with deploying the app to a free-tier shinyapps.io in Phase 1 and continuously re-deploying from travis.
It looks like doing that will be super straightforward, thanks to a tutorial from Jarret Meyer.
I made preliminary GUI sketches on May 14th, which I'll paste below.
I need to update those sketches with the following considerations:
On analysis page: add a dropbox to change the penalty score
Would also be nice to keep track of previously-run analyses to make them easy to re-run/refer to.
Would also be nice to have a button to get the R code corresponding to the current graph constraint (this corresponds to a common use case where you have the R code with a current graph, and just want to modify that).
Analysis and annotation should be the same plot, but should maybe have a button to hide the gfpop results and just add annotation.
Since Guillem isn't using GitHub, I'll copy part of one of his emails here:
A. When I modify the id of a node I get a duplication?
B. The default value for K should be inf (Gaussian model) and not 1 I think.
C. I am not able to build the up-dw graph starting from std graph.
I proceeded as follow:
- I started from std graph
- I added a node (labeled up) and made the connection to Std
got-this code in the bottom
gfpop::graph(
gfpop::Edge(state1 = 'Std', state2 = 'Std', type = 'null', gap = 1, penalty = 0, K = 1, a = 0),
gfpop::Edge(state1 = 'up', state2 = 'up', type = 'null', gap = 1, penalty = 0, K = 1, a = 0),
gfpop::Edge(state1 = 'Std', state2 = 'up', type = 'std', gap = 0, penalty = 10, K = 1, a = 0),
gfpop::Edge(state1 = 'up', state2 = 'Std', type = 'std', gap = 0, penalty = 10, K = 1, a = 0)
)
- I ran gfpop and everything is fine.
- I then replaced edges type to "up" and "down"
get-this code in the bottom (some edges are missing!)
gfpop::graph(
gfpop::Edge(state1 = 'Std', state2 = 'Std', type = 'null', gap = 1, penalty = 0, K = 1, a = 0),
gfpop::Edge(state1 = 'up', state2 = 'up', type = 'null', gap = 1, penalty = 0, K = 1, a = 0)
)
- then gfpop does not run because of the missing edges
My impression is that up and down types are not recognized at step 4?
Let me know what I should do.
Cheers,
Guillem
And then my response:
This is helpful, I'll work on this and email you when these bugs are fixed.
In the meantime, I'm able to replicate your issues except for the K=1 default--from my end, K defaults to Inf.
More Info
When I modify the id of a node I get a duplication?
Yeah, this is a big to-do for me. I recently made a change so that users should never have to modify the id (they should just modify labels, and everything else is taken care of). The big reason for this is that modifying IDs causes problems, like duplications. So, I'm going to work on disabling the feature that lets users modify the id of a node.
The default value for K should be inf (Gaussian model) and not 1 I think.
Before making any changes, my graph code through the app looks like this:
gfpop::graph(
gfpop::Edge(state1 = 'Std', state2 = 'Std', type = 'null', gap = 1, penalty = 0, K = Inf, a = 0),
gfpop::Edge(state1 = 'Std', state2 = 'Std', type = 'std', gap = 0, penalty = 15, K = Inf, a = 0)
)
And K is still Inf after making some changes, both via the graph visualization and the View/Edit tab:
gfpop::graph(
gfpop::Edge(state1 = 'Std', state2 = 'Std', type = 'null', gap = 1, penalty = 0, K = Inf, a = 0),
gfpop::Edge(state1 = 'Std', state2 = 'Std', type = 'std', gap = 0, penalty = 15, K = Inf, a = 0),
gfpop::Edge(state1 = 'up', state2 = 'up', type = 'null', gap = 1, penalty = 0, K = Inf, a = 0),
gfpop::Edge(state1 = 'Std', state2 = 'up', type = 'up', gap = 0, penalty = 10, K = Inf, a = 0)
)
From your end, when did K start to equal 1?
I am not able to build the up-dw graph starting from std graph.
Ahh, I suspect that you were editing the "type" parameter in the data table under "Current Graph", right?
I was able to follow your instructions and successfully build the up-dw graph from std, but only by using the add/edit buttons that are built into the graph visualization.
But, if I edit those in the data table, things start to break. So, it looks like I missed a bug there. I'll get started on fixing that now!
Need functionality to indicate which nodes are starting and ending.
Can't add a new recursive edge, because the angle automatically overlaps with the default null recursive edge
RSelenium has a vignette dedicated to testing shiny apps. I'm not sure about the relationship between shinytest and RSelenium yet, so I'm going to do more reading on those. I'll likely end up using both.
As mentioned in #1, RSelenium can also be integrated with Travis.
I added some very basic tests in #6 through testthat. In the near-term, I should play around with writing some very basic tests with shinytest and RSelenium to get a feel for that and make sure I can get Travis to run them.
To have something to test at least, I should go ahead and design a home page based on the sketches in #3. I'll make a separate issue for that (#8)
There's a lot of redundancy associated with the way that I label edges. I should work to remove that redundancy.
Putting this as a Phase 2 bug because deployment still works, it would just be nice to have continuous deployment moving forward.
In #20 and #21, I saw issues with the deployment at shinyapps.io.
When I re-deployed from my local copy, those issues went away.
There are some potential sticking points with the builds: (1) the build relies on installing gfpop-gui, which travis may not update since it caches packages, and (2) when I build, I only upload select files: uploading files in website/ causes the build to timeout. I select those files manually (doing this automatically is a known issue in golem--see issue here).
Travis seems to think it deployed successfully, here's part of the log from when it built c3b5534:
> setAccountInfo(
+ name = Sys.getenv("shinyapps_name"),
+ token = Sys.getenv("shinyapps_token"),
+ secret = Sys.getenv("shinyapps_secret"))
>
> rsconnect::deployApp(forceUpdate = TRUE)
Preparing to deploy application...DONE
Uploading bundle for application: 2366586...DONE
Deploying bundle: 3202109 for application: 2366586 ...
Waiting for task: 738409997
building: Parsing manifest
building: Building image: 3580399
building: Fetching packages
building: Building package: gfpopgui
building: Installing packages
building: Installing files
building: Pushing image: 3580399
deploying: Starting instances
rollforward: Activating new instances
terminating: Stopping old instances
Application successfully deployed to https://[secure].shinyapps.io/gfpopgui/
Newly-created edges still go by their randomly-generated ID names, which is not great for readability. Should be easy to fix up
Somewhat low priority:
Custom graphs (specifically: the datatable on the 'home' page showing the graph) are working on my local version, but not the shinyapp deployed version.
I think this might be because my travis.yml caches R packages. This repo itself (julianstanley/gfpop-gui) is also a travis R package dependency. So the weirdness might be because of the cache? So then, just re-installing gfpop-gui from the deploy script should do the trick.
So this is just a reminder to myself to look into that more, and maybe make a branch with that modified script and see if that helps.
The crosstalk feature slows everything down a bit, so it would be nice to be able to enable/disable. Both for testing purposes—to see how much that’s contributing to the slowness—and for practical features, for when I’m just editing the graph and don’t need that feature.
Candidates:
Moving forward:
(a) playing with my forked version of visNetwork to see if I can get it to match our specifications, and
(b) playing with shinyDAG's plotly graph implementation to see if that would end up being easier.
There are two obstacles to using visNetwork: (1) I'll need to make edits to visNetwork's core javascript file (for example, allowing edge labels to be updated and having more input parameters for edges/nodes), and (2) I don't think visNetwork can easily interface with plotly (for the "hover over changepoint-->highlight associated node" feature). If I just had two plotly graphs, then I could use crosstalk, but visNetwork doesn't support crosstalk just yet.
I created issue 337 in visNetwork asking about the two obstacles expressed above.
I previously thought that saving the network visualization (after the user edits it) would be a problem, based on issue 356. But I found a StackOverflow post addressing that which I added as comment on that issue.
Issue #2 (making visNetwork interface with other plots) was already brought up with visNetwork about two years ago when a user asked about crosstalk integration. At the time, the repo maintainer seems to have tried to implement that integration unsuccessfully. I asked about that in the crosstalk issue, number 229.
If I go down this route, looking at shinyDAG may be a good option.
This issue has been bothering me for a while now:
You shouldn't need a graph refresh after editing an edge when there's no server calculation necessary.
Playing around with examples, I think it's because I allow the "hidden" parameter to be edited by the user. I think hidden: "false" is converted to a string, making the edge disappear. There's no need for the user to edit this parameter though, so just removing it should fix the problem.
Users should be able to customize the information shown on edge labels
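One way this could look, as a base R sketch: a hypothetical helper `make_edge_label()` that builds the label from whichever columns the user picks. The default fields mirror the "Std_Std (null | 0)" label format that appears in the console output elsewhere on this page; the column names are assumptions.

```r
# Hypothetical helper: build an edge label from the user's chosen columns.
make_edge_label <- function(edges, fields = c("type", "penalty")) {
  stopifnot(all(fields %in% names(edges)))
  # Paste the chosen columns together row by row, separated by " | ".
  details <- do.call(paste, c(edges[fields], sep = " | "))
  paste0(edges$from, "_", edges$to, " (", details, ")")
}

edges <- data.frame(
  from = c("Std", "Std"),
  to = c("Std", "Std"),
  type = c("null", "std"),
  penalty = c(0, 0),
  stringsAsFactors = FALSE
)

labels <- make_edge_label(edges)
type_only <- make_edge_label(edges, fields = "type")
```

The `fields` argument could be wired to a set of checkboxes so that the label changes without touching edge ids.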
I should look into using cpVis functions to improve the app.
At first, this was just for testing purposes.
But I think there should be a proper "generate data" section on the home page, especially for demonstration purposes.
When editing an edge, if you accidentally set the type as "Std" instead of "std", the whole application crashes (oops).
So, there needs to be some user input validation. At least lowercase values. And, if values are invalid, maybe send a warning and default to null?
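A minimal sketch of that validation in base R. The helper `validate_edge_type()` is hypothetical, and the list of allowed types only includes the ones that appear in examples on this page; the set that gfpop actually accepts may be larger.

```r
# Edge types seen in this page's examples (assumption: not necessarily complete).
known_edge_types <- c("null", "std", "up", "down")

# Hypothetical helper: lowercase and trim the user's input, then fall back to
# "null" with a warning when the result is not a known type.
validate_edge_type <- function(type, allowed = known_edge_types) {
  cleaned <- tolower(trimws(type))
  if (!cleaned %in% allowed) {
    warning("Unknown edge type '", type, "'; defaulting to 'null'.")
    return("null")
  }
  cleaned
}

fixed <- validate_edge_type("Std")
fallback <- suppressWarnings(validate_edge_type("sideways"))
```

Running the check before the value reaches gfpop means a typo turns into a warning rather than a crash.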
Guillem asked for this feature during the first intro Zoom meeting.
It should be super straightforward to do server-side. I can just have a reactiveValues that contains (1) an array/list of lists/linked lists of all values needed to recreate an analysis, and (2) a smaller dataframe identifying each of those previous runs. I show that dataframe to the user and, when they click a row, I can re-run the analysis with the same parameters as before.
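A rough sketch of that two-part structure, using a plain list as a stand-in for reactiveValues. The helpers `save_run()` and `get_run()` and the stored fields are hypothetical.

```r
# Plain-list stand-in for a reactiveValues object (assumption: same shape).
saved <- list(
  runs = list(),
  index = data.frame(
    id = integer(0), graph_type = character(0), penalty = numeric(0),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: store everything needed to re-run an analysis, plus one
# summary row that can be shown to the user.
save_run <- function(saved, data, graph_type, penalty) {
  id <- length(saved$runs) + 1L
  saved$runs[[id]] <- list(data = data, graph_type = graph_type, penalty = penalty)
  saved$index <- rbind(
    saved$index,
    data.frame(id = id, graph_type = graph_type, penalty = penalty,
               stringsAsFactors = FALSE)
  )
  saved
}

# Hypothetical helper: fetch the stored parameters for a clicked table row.
get_run <- function(saved, row) {
  saved$runs[[saved$index$id[row]]]
}

saved <- save_run(saved, data = c(1, 2, 3), graph_type = "std", penalty = 15)
saved <- save_run(saved, data = c(4, 5, 6), graph_type = "updown", penalty = 10)
clicked <- get_run(saved, 2)
```

Only the small `index` table needs to be rendered; the heavier `runs` list stays on the server until a row is clicked.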
There might be some other ways to approach the same thing. For example, shiny bookmarking might be able to do the trick. Or, it would be nice if I could store all that data on the client-side. So, this enhancement request may have layers.
Probably can get this done towards the end of Phase 1 (next week?)
(Low priority item--aligns more with Phase 2/3)
I'm pretty happy with the recent enhancements to the analysis page (screenshots below). You can edit the graph now and the data plot is actually useful. Now, I'm going to go back and clean things up and test better.
However, there is one thing that's bothering me. Models with just a few changepoints generate reasonably quickly (variations on the first screenshot below take up to a second or so) --I decoupled the main scatterplot and the overlain changepoints to make that faster.
However, as expected, trying to render hundreds of changepoints is really, really slow and can potentially crash the application. For example, the default std graph with a penalty of 1 with 1000 datapoints (ended up being 336 changepoints) takes on the order of a minute or so.
Should I put a cap on these? I guess I could just show an error if the user tries to render X number of changepoints (or just show a static plot in that scenario)--does that sound like a good choice, or better to just let the users generate however many changepoints they're willing to wait for?
Either way, I'm going to work on making it more efficient--right now I add a new plotly trace/layer for every changepoint, but can probably try and get away with one trace for all of them. But, that might just move the reasonable upper limit from 1,000 to 5,000, etc.--I guess that there's always going to be some point where it's damn slow.
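One way the single-trace idea could work, sketched in base R: put every segment into one x/y path with an NA row after each segment. This assumes that the plotting library leaves a gap at NA values when drawing a line, which I believe is plotly's default but have not verified for this app. The helper `segments_to_path()` and the segment table are hypothetical.

```r
# Hypothetical segment table: one row per fitted segment.
segments <- data.frame(
  start = c(1, 21, 41),
  end   = c(20, 40, 60),
  mean  = c(0.1, 2.3, -0.4)
)

# Hypothetical helper: expand each segment into (start, end, NA) so that all
# segments can be drawn as one line with gaps between them.
segments_to_path <- function(segments) {
  data.frame(
    x = as.vector(rbind(segments$start, segments$end, NA)),
    y = as.vector(rbind(segments$mean, segments$mean, NA))
  )
}

path <- segments_to_path(segments)
```

The resulting data frame has three rows per segment, so 336 changepoints would still be a single trace of roughly a thousand points instead of hundreds of separate traces.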
Need to refactor a lot of my tables. I bet I can make things a bit more efficient with data.table, like Toby mentioned early on.
EDIT: I fixed this before getting around to posting this issue (and will close this issue once it's merged), but I'm going to post it anyways for documentation/history purposes.
I fell for a typical blunder: when I ran read.csv, I needed to set stringsAsFactors = FALSE. After that (and a few other little changes: for example, dropping all columns that don't have a header name that gfpop::graph recognizes) things seem to be working dandily.
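A small base R illustration of that blunder, independent of gfpop. The table below is made up; the point is only that the same CSV comes back with factor columns or character columns depending on stringsAsFactors (which defaulted to TRUE before R 4.0.0).

```r
# A small edge-like table with character columns, written to a temporary CSV.
graph_like <- data.frame(
  state1 = c("Std", "Std"),
  state2 = c("Std", "up"),
  type = c("null", "up"),
  penalty = c(0, 10),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(graph_like, csv_path, row.names = FALSE)

# Set explicitly here so the example behaves the same on any R version.
as_factors <- read.csv(csv_path, stringsAsFactors = TRUE)
as_strings <- read.csv(csv_path, stringsAsFactors = FALSE)

class(as_factors$type)  # "factor"
class(as_strings$type)  # "character"
```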
Original issue:
I'm having trouble with getting gfpop to accept a user-uploaded graph.
I generated a graph with gfpop::graph, which works with gfpop. Then, when I export that graph to csv and then re-import the graph, gfpop does not recognize it.
Passing the imported csv to gfpop::graph works, but leads to a different error when I try to run gfpop ("replacement has length zero.")
I'll put a portable snippet below. Does this look like an error in gfpop, or am I doing something wrong?
# Generate some data
data <- gfpop::dataGenerator(50, changepoints = c(1), parameters = c(1))
# Create a working graph
iso_graph <- gfpop::graph(penalty = 15, type = "isotonic")
gfpop::gfpop(data = data, mygraph = iso_graph, type = "mean")$changepoints
# [1] 499 1000
# Export and re-import the working graph
write.csv(iso_graph, "graphtest.csv", row.names = FALSE)
iso_graph_import <- read.csv("graphtest.csv")
# Attempt: imported graph
gfpop::gfpop(data = data, mygraph = iso_graph_import, type = "mean")$changepoints
# Error: Error in gfpop::gfpop(data = gfpop_data$primary_input$Y, mygraph = iso_graph_import, :
# Your graph is not a graph created with the graph function of the gfpop package.
# Attempt 2: imported graph passed through gfpop::graph
iso_graph_import_2 <- gfpop::graph(iso_graph_import)
gfpop::gfpop(data = data, mygraph = iso_graph_import_2, type = "mean")$changepoints
# Error: Error in x[[jj]][iseq] <- vjj : replacement has length zero
# Attempt 3: Maybe the NAs are causing trouble?
iso_graph_import_3 <- iso_graph_import_2 %>% dplyr::select_if(~ !any(is.na(.)))
gfpop::gfpop(data = data, mygraph = iso_graph_import_3, type = "mean")$changepoints
# Error: Error in x[[jj]][iseq] <- vjj : replacement has length zero
Hi @tdhock, here's something Guillem and I discussed today that I need your help/input on. This is a pretty long post, so no rush responding--I still have other things to do before getting to this, so this isn't a bottleneck.
The fact that the graph refreshes when new edges are added is a huge annoyance.
Example of problem:
With many nodes, this can be super disorienting. Guillem thinks that users could get used to it, but would be frustrated at first.
I need to refresh the graph whenever I want to send some data from R to HTML/JS.
Right now, that's important for three purposes:
(1) Setting default values for an edge.
(2) Validating that the parameters that the user gives are reasonable.
(3) Updating the pop-up "edit edge" window to have correct values after the user edits them.
A small part of this problem is a visNetwork bug--when you edit the penalty of an edge, the "edit edge" pop-up should reflect that change.
But most of it is a broader incompatibility with the way that visNetwork is designed:
The visNetwork canvas can't take any input from R without refreshing. So, if the graphdf changes, visNetwork needs to refresh to reflect that change. For example, if I want to send an "edit edge" command to R, have R check whether the "type" or "penalty" or etc of the edge is valid, R can't send that result back to visNetwork without a refresh.
Or, if I want to construct the "label" of an edge as a concatenation of different edge parameters, I have the same problem.
To demonstrate this, here's an example of adding/editing edges, but manually refreshing:
You can see that (1) everything is undefined when you make a new edge, (2) the edge label doesn't update, and (3) when you try to edit the penalty of the edge, the edge disappears!
Some of these problems could be solved from visNetwork's end, but others are pretty specific to our app--e.g. data validation.
I can work more with visNetwork on this, but Guillem proposed a separate solution:
He thinks it might be easier to separate the list of parameters from the visNetwork visualization. In that implementation, the visNetwork graph has nodes (with labels) and edges (with labels), and that's it. When you click a node/edge, the corresponding row in the gfpop graphdf (shown below the graph) is highlighted. Then you edit the parameters in that dataframe directly.
I think I agree with him that that could be a good idea. The problems presented here are probably fixable (by modifying visNetwork.js) but might take a while. His solution would be a lot faster, and might actually be the more practical implementation in a lot of cases--for example, if you want to quickly change the penalty in 4 different edges, clicking on each edge and clicking "edit", "save" individually is a bit tedious, but faster to just edit the dataframe directly.
So, the flow here would be: (1) draw the graph topology that you want in visNetwork, (2) tweak the parameters in the dataframe below the visNetwork graph, with cross-brushing between the dataframe and the network, (3) click checkboxes with how you want edges/nodes to be annotated, (4) option to export graph visualization and/or R code
But we both wanted to check with you to see if you had any thoughts/feedback on that!
Node IDs are long and complicated to avoid name conflicts.
Guillem suggested that they should be shorter--just have the same ID and label--to make things simpler for users.
This only happens in the shinyapps.io version. Maybe related to #20?
gif showing problem attached below.
Everything is all set with SauceLabs (it was so much easier to get that setup. It just works), so I'm going to work on writing some proper integration tests soon, especially as we get into Phase 2.
That means that I'm planning to just run RSelenium on the shinyapps.io version of the app, since SauceLabs integrates really seamlessly that way.
Related: In my latest pull request (#36), I started using the brand-new shiny::testServer functionality. It's easy to write and does a lot--the devs call it integration testing, but it's somewhere in-between--for example, you can test whether a button works when pressed, but not whether that button actually shows up on the screen. I've used it, for example, to test whether the visNetwork graph can be edited (assuming that the user sends the input I expect). So, now I think shiny::testServer can be our first big line of defense (it can cover most of the code and is super easy to write) and then hopefully RSelenium can be that second line on the deployed app.
On the "Analysis" tab, users can save a bunch of different graphs/analyses.
However, when they export their data to RData on the "Sharing" tab, only the current analysis is saved.
Really, that big RData export should include all of the analyses that they saved. This should be straightforward to implement, it just means that I need to also pass save data from mod_home to mod_analysis.
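A bare-bones sketch of what the export could contain, using base R's save() and load(). The object names and the fields stored for each analysis are hypothetical; the only point is that the saved analyses travel in the same RData file as the current one.

```r
# Hypothetical objects standing in for what the two modules would hold.
current_analysis <- list(graph_type = "std", penalty = 15)
saved_analyses <- list(
  list(graph_type = "std", penalty = 15),
  list(graph_type = "updown", penalty = 10)
)

# Write both objects into one RData file.
export_path <- tempfile(fileext = ".RData")
save(current_analysis, saved_analyses, file = export_path)

# Load into a fresh environment to check what a recipient would get back.
restored <- new.env()
load(export_path, envir = restored)
```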
docusaurus has too much overhead for the purposes of this package. Probably easier to maintain a pkgdown website. Just turn the timeline into one vignette document, with headers for each phase/week.
@tdhock (btw: just tagging you in every issue, but lmk if I should tag more selectively/not at all. Not sure what best practices for that are)
In Phase 1, I should have an application shell with tabs and some basic functionality.
So:
Visualizing null edges isn't too much of a problem in some cases:
> generate_visNetwork(graphdf_to_visNetwork(gfpop::graph(type = "updown")))
But it is in others. When you have two recursive edges, they overlap so that you can only see one:
> graph <- graphdf_to_visNetwork(gfpop::graph(type = "std"))
> graph
$nodes
   id label
1 Std   Std
$edges
            id              label  to from type parameter penalty   K a min max
1 Std_Std_null Std_Std (null | 0) Std  Std null         1       0 Inf 0  NA  NA
2  Std_Std_std  Std_Std (std | 0) Std  Std  std         0       0 Inf 0  NA  NA
> generate_visNetwork(graph)