Comments (9)
This is entirely my opinion, but for what it is worth...
networkD3
whitelists columns but not for security, and I often find myself adding back lost data. As far as I know, no other htmlwidgets
constrain passed data. leaflet
might (but again unintentionally I believe) depending on the method. While users should be aware that all data might get passed as json
, I don't think the htmlwidget
author should bear responsibility for determining what columns might or might not be important and what columns might or might not contain sensitive information.
from timevis.
A bit more context:
In order to create a timevis widget, the user calls the timevis(data)
function in R, where data
is a data.frame containing the data to show, and it needs to have some specific columns in order for the visualization to work. For example:
data <- data.frame(content = c("event 1", "event 2"),
start = c("2016-01-10", "2016-01-12"),
end = c("2016-01-13", "2016-01-14"))
timevis::timevis(data)
results in
Since this package is a binding to a third party javascript library and that javascript library is responsible for building the visualization, the data
dataframe that is given to the timevis()
R function gets passed directly to the javascript layer. This dataframe is accessible in the javascript console as a json object. This means that if you have additional columns in the dataframe that are not actually needed to build the visualization, they will still be available, hence the potential "information leakage".
I want to hear from others if they think this is a big security issue and if timevis should explicitly remove any non standard columns before passing the data to javascript. I don't love that idea because 1. then the package will have to play catchup with visjs to make sure we always have a full list of all the supported
columns, and 2. it's possible that the names of columns is not a finite list, maybe there are some options that allow the user to define custom columns that are used (the documentation for visjs is long with a lot of options and I haven't gone through it in detail recently).
I think simply documenting the data
parameter to say that the entire dataframe will be visible to the client is sufficient. Would like to hear some opinions.
I wonder if other htmlwidgets whitelist columns in the data they use?
from timevis.
@greg-minshall My thoughts are along the same as what @timelyportfolio said. For now I would feel better about just adding a few words in the parameter documentation rather than making assumptions about and changing the data being passed in
from timevis.
hi. so, i sort of convinced myself other than the two of you. let me see if i can explain, both my proposed solution, and (harder) my rationale. my proposed solution:
-
timevis() drops unknown columns from the input dataframe. but, it does so "noisily", i.e., by issuing a warning. the warning includes a comment that hitherto unknown (to timevis()) columns will be allowed through if they are named in the (brand new) "passcolumns=" parameter.
-
the "passcolumns=" parameter, nominally NULL, can contain a list ("passcolumns=c("mightynew", "newest")) naming additional columns to allow through.
the warning in #1 is to (try to) prevent mystification of the user when their "mightynew" column doesn't seem to be having any effect. "passcolumns=" then decouples, to some extent, releases of timevis() from upgrades in vis.js() Timeline.
my reasoning is that security is a hard problem. while we can't fix the problem, we can semi-plug one hole, maybe stopping some very embarrassing leak from happening in the future. (if we do prevent it, we'll never know it; if we don't, we'll possibly hear about it.) on the other hand, every small possibility increases the overall likelihood of security failures.
from timevis.
I do understand your viewpoint eg. this is a security issue that timevis can potentially take some measure against. But in my opinion this is overstepping what timevis is, it's simply a visualization, if someone has sensitive data then it's on the user to not pass that sensitive data around if not needed, to any function. You shouldn't pass it to R functions if it's truly sensitive information that is not needed for the function you're calling, because every function you call is a blackbox inside and you don't know what it does with it (unless you actually look at the source code). To me it doesn't seem like it makes sense for such a simplistic project that's an R binding to a JS library to try to solve this problem.
If others will agree that it's the correct thing to do, I'm open to changing my mind, which is why I specifically asked on Twitter for people to comment here if they have an opinion.
Or, alternatively, if you'll find several other htmlwidgets that also have that line of thinking that variables should be whitelisted for security reasons, then it would also prompt me to reconsider. But the argument of a security hole sounds too weak for me because it seems like the user should just not use a dataframe with information that they really don't want to share.
I'll leave this issue open for a while to see if others chime in!
from timevis.
Dean, @timelyportfolio,
one thing to keep in mind is that, by definition, the user we are protecting is the user who (after the warning is on the man page) is, in fact, doing the wrong thing, making a mistake. the user who reads, understands, and pays attention to the warning is not the person we are trying to protect. rather, it is the person who, for whatever reason, doesn't do all those things. but, you know how we are when we are trying to get something to work, we read this and that, look at google, stackexchange, etc. then, we try this and that, possibly forgetting -- certainly at my age! -- some things we previously read while doing so. the programmer who reads all the documentation, follows all the guidelines, imho, is a bit like homo economicus.
in a scenario, the department clerk is told to put together a presentation for the next City Council meeting of what how the department employees are scheduled for the next six months. remembering that Janet did something similar, he asks and she sends him some code, using timevis. working on that, the clerk notices Janet gets the employee task database using the office API (which maybe involves, behind the curtains, a few joins), and then forms a new datafram with "name", "group", etc. but, the clever clerk says, i already have a dataframe, i'll just rename a few columns ("Name" to "name"; "Section" to "group", etc.), pass it to timevis(), and wow! since the meeting is public, the resulting web page is published online for the council members, and interested members of the public, to see. to see (well, for hackers), including, the non-timevis, non-vis.js columns such as mobile phone numbers, salary, etc.
timevis is a really nice package, and hopefully it will get an expanding user base. but, the likelihood of a leak is, i would guess, proportional to the number of users. (of course, i already did a leak; luckily not that horrible a one, just some e-mail address that, presumably, nobody noticed.)
so, obviously it's you decision, but i still offer up my solution (which i'm happy to code up): new columns can be passed, but they must be explicitly specified as new columns.
from timevis.
i think i did a boo boo. reopening.
from timevis.
You did in fact boo boo, let's keep it open :) I completely understand your argument. However, I still personally think that would be overstepping the boundary of the package, and that if someone has sensitive info the burden of being careful with it and not giving it away to other services is on them, not on the service to try to throw away what's given to it. This is clearly not a black and white matter and involves some personal judgment on what the solution (or no solution) is, so it would help if I would see that other htmlwidgets do take this into consideration. I haven't looked, but I don't think they do, and I imagine it's from similar reasons to mine. But I may be wrong .
from timevis.
As mentioned in PR #42 , instead of keeping this open forever, I'm closing this issue as a non-fix
from timevis.
Related Issues (20)
- timezone argument fails in shiny but works in console HOT 1
- Online demo license expired HOT 2
- timeline-arrows package HOT 1
- Deselecting Items does not trigger observeEvent(input$timeline_selected,{}) (RShiny) HOT 1
- Changing language in the viewer HOT 1
- About the language HOT 1
- On which version of visjs-Timeline is Timevis running? HOT 1
- How to format dates in the data retrieved from the widget? HOT 2
- wrong date on timeline HOT 3
- timevis not running in Quarto .qmd HOT 3
- name of the month is an abbreviation in Ukrainian HOT 1
- Crosstalk error: Error: timevis: 'data' must be a data.frame HOT 2
- timevis is excellent HOT 1
- No support for portuguese? HOT 1
- Error using `{crosstalk}` with `timevis()` HOT 1
- Cannot assign HTML attributes to table in content HOT 3
- Handlebars Template stripped out of classes HOT 3
- Stacking performance - update vis.js version? HOT 1
- Show warning or error on duplicate id's HOT 1
- Visualisation Shaking HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from timevis.