phuse-org / e2e-os-guidance
Collaboration area for PHUSE End-to-End Open-Source Guidance
Home Page: https://phuse-org.github.io/E2E-OS-Guidance/
License: MIT License
In #4, it was flagged more was wanted.
James to email contributors for feedback on the section
## Licences: releasing a project
in releasing.qmd
Noticed minor spelling error. Consider adding workflow to evaluate spelling across qmd files: https://github.com/ropensci/spelling
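As a rough illustration of what such a check does, here is a minimal Python sketch. It is not the ropensci `spelling` package (which is an R tool); the word list `KNOWN_MISSPELLINGS` and the function `find_misspellings` are hypothetical names introduced only for this example, and a real workflow would rely on a full dictionary rather than a hand-written map.

```python
import re

# Hypothetical, hand-written list for illustration only;
# a real spell check would use a proper dictionary.
KNOWN_MISSPELLINGS = {
    "repoducibility": "reproducibility",
    "incoporate": "incorporate",
}

def find_misspellings(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, word found, suggested fix) for each known misspelling."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            fix = KNOWN_MISSPELLINGS.get(word.lower())
            if fix:
                hits.append((line_no, word, fix))
    return hits

sample = "Open source supports repoducibility.\nAll good here."
print(find_misspellings(sample))
```

Run over the text of each qmd file in CI, a check like this could fail the build when the returned list is non-empty.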
Need to incorporate below (many thanks to Ross for driving this):
I personally found this section a bit bloated: https://phuse-org.github.io/E2E-OS-Guidance/usingos/#how-active-are-the-community-behind-a-project - the bullet points were enough in my mind.
Out of interest, how come riskmetric never made it into https://phuse-org.github.io/E2E-OS-Guidance/usingos/#what-can-help-me-understand-the-risks-around-using-an-open-source-project ?
In your post-competitive explanation (https://phuse-org.github.io/E2E-OS-Guidance/releasingOS/ip/) I wonder if just talking of eCRF -> esub narrows this definition too much. I have a more generalised view: it's anything where companies need to solve the same challenge and solving it collaboratively offers greater value than solving it in isolation (e.g. there are similar collaborations across companies in the patient-level data sharing space, which we'd justify in the same way).
Minor bugbear: the inconsistent spelling of license (licence).
Licenses: I expected a little more on copyleft and the implications of making such a choice.
This would be useful for others to use in their publications/presentations.
"well as collaboration on and creation of open-source projects used by data scientists in clinical reporting workflows"
You could include statistical programmer, researcher, scientist etc. PhUSE members consist of many roles etc.
"Open source: the what and why"
Open source is also a step towards insuring "repoducibility"... should be "reproducibility" yea?
Also, some good info below on that topic:
https://ropensci-archive.github.io/reproducibility-guide/
"Open source: the what and why"
Given the historical context, and given the R package examples later on, you could, depending on the focus, reference work by John Chambers, Robert Gentleman, Ross Ihaka, JJ Allaire, Hadley Wickham and Joe Cheng etc. This could help bridge into the examples later on in the paper.
"How can I see the activity of an open-source project?"
This jumps into OS projects and more specifically packages. There could be a section about the languages and how they are built and maintained etc., like R core etc. R foundation etc. and how packages extend the capabilities by the community. Might take a look at the appendix here for some ideas:
How do I select an R package for my clinical workflow?
https://www.lexjansen.com/phuse-us/2019/tt/TT11.pdf
"How do I find open-source projects?"
Maybe list Bioconductor and work by ropensci. Could list some examples from the Python sciences packages/ecosystem too.
"In such cases, you may need to extend, or start a new package."
Maybe define fork/clone.
"What can help me understand the risks around using an open-source project?"
Lots of good info here as well:
https://www.pharmar.org/white-paper/
This section could also list the Fred Hutch, GSK and Phuse work on:
https://github.com/phuse-org/valtools
"Licenses: using a project"
Could be worth mentioning that companies can create a curated list of packages based on packages they want for various reasons, like licenses etc.
"When is a good time to open source?"
Caret is a good example from Pfizer:
https://www.r-project.org/conferences/useR-2010/slides/Kuhn.pdf
"Pfizer’s Statistics leadership for providing the time and support to create R packages"
https://www.nytimes.com/2009/01/07/technology/business-computing/07program.html
Also, Targets by Will Landau at Lilly is a good example:
https://books.ropensci.org/targets/
https://cran.r-project.org/web/packages/targets/LICENSE
"Company Github orgs"
https://github.com/orgs/Merck/repositories
"Collaboration and governance models"
Some good info here:
https://ropensci.org/stat-software-review/
"3.8 When do we need contracts?"
Might list an example from Julia, Python or JS etc. Also, lots of interesting examples of collaborative work in the pkpd space in Open Source:
https://nlmixrdevelopment.github.io/nlmixr/articles/xgxr-nlmixr-ggpmx.html
https://github.com/MetrumResearchGroup
https://pharmpy.github.io/latest/index.html
https://github.com/metrumresearchgroup/2021-r-in-pharma
Thank you for the invitation to review and provide feedback on the guidance. Please let me know if you'd like for me to create separate issues and/or submit a PR.
Line 55 in 0891e63
I think this example needs more details. It writes "This function exists in another GPL-3 copy left licenced project" - does that imply that the project which copies the function and then goes open-source at a later point in time will also be released under GPLv3?
If that is the assumption, then they are not necessarily in the wrong as per my understanding (is the original project attributed and referenced explicitly?):
https://fossa.com/blog/open-source-software-licenses-101-gpl-v3/
On the other hand, releasing the project, including a piece of code under GPLv3 and then marking the project as MIT would be an infringement, seeing as MIT is less strict than GPLv3 about permissions.
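The rule of thumb in this comment can be sketched in Python. This is a deliberately simplified model and not legal advice: the `COPYLEFT` set and the function `can_release_under` are hypothetical names for this example, and real licence compatibility has more cases than an exact-match check.

```python
# Simplified assumption: code under a copyleft licence forces the whole
# project to be released under that same licence. Real compatibility
# rules are more nuanced than this.
COPYLEFT = {"GPL-3.0"}

def can_release_under(project_licence: str, included_code_licences: list[str]) -> bool:
    """Return False if copyleft code is included but the project uses a different licence."""
    for licence in included_code_licences:
        if licence in COPYLEFT and project_licence != licence:
            return False
    return True

# Copying a GPL-3.0 function and releasing the project as GPL-3.0 passes.
print(can_release_under("GPL-3.0", ["GPL-3.0", "MIT"]))
# Copying a GPL-3.0 function and marking the project as MIT does not.
print(can_release_under("MIT", ["GPL-3.0"]))
```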
A project could also have no activity because it was abandoned, either before or after it reached v1.0.
A summary and recommendation of licence types, with particular focus on permissive vs copyleft licences and the ramifications for code built on top of your project.
Relevance of licences present in dependencies, direct vs transitive dependencies, and the issues around compiling with dependencies that could occur in something like a public Shiny app.
In /using.html#what-is-the-open-source-health-of-the-package:
There are no universal magic metrics to summarise whether an OS project is ‘healthy’; for example if a project has had no activity for 12 months, is that because the product has been abandoned/superseded, or could it be it had a small well-defined scope and is now stable and feature complete?
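To make the 12-month example concrete, here is a minimal Python sketch of an inactivity signal. The function names and the default threshold are assumptions for illustration. In line with the passage above, the result only flags a project for a human to look at; it cannot tell an abandoned project from a stable, feature-complete one.

```python
from datetime import date

def months_inactive(last_commit: date, today: date) -> int:
    """Whole calendar months between the last commit and today (days are ignored)."""
    return (today.year - last_commit.year) * 12 + (today.month - last_commit.month)

def needs_human_review(last_commit: date, today: date, threshold_months: int = 12) -> bool:
    """Flag the project for a closer look; this does not decide whether it is 'healthy'."""
    return months_inactive(last_commit, today) >= threshold_months

print(months_inactive(date(2022, 1, 15), date(2023, 1, 15)))
print(needs_human_review(date(2022, 6, 1), date(2023, 1, 15)))
```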
In /using.html#how-active-are-the-community-behind-a-project:
By looking through the issues, subjective impressions on community health can be made, for instance whether it’s a few people giving feedback and one person developing, does it have stale issues no-one replies to, or does it have a lively community engaged in discussion and coordination.
Lines in many sections grow quite long because newlines are seldom used. This is no problem in editors that wrap text to fit the window (GitHub, for example), but it is awkward in editors that do not wrap. In those, the reading experience is poor, and it is also hard for git to show clearly what has actually changed within one or more lines.
Gitlab wrote a blog post about this: https://about.gitlab.com/blog/2016/10/11/wrapping-text/
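The effect on diffs can be reproduced with a small Python sketch using the standard-library `difflib`. The sentence splitter is deliberately naive (it does not handle abbreviations), and the sample paragraph and function names are made up for this example.

```python
import difflib
import re

def one_sentence_per_line(paragraph: str) -> list[str]:
    """Naive split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def changed_lines(old: list[str], new: list[str]) -> list[str]:
    """Keep only the removed ('- ') and added ('+ ') lines of a line-based diff."""
    return [line for line in difflib.ndiff(old, new) if line.startswith(("- ", "+ "))]

old = "Choose a licence early. A permissive licence is common. Document it in the README."
new = old.replace("licence", "license", 1)  # fix only the first occurrence

# Whole paragraph on one line: the diff repeats the entire paragraph.
unwrapped_diff = changed_lines([old], [new])
# One sentence per line: the diff shows only the sentence that changed.
wrapped_diff = changed_lines(one_sentence_per_line(old), one_sentence_per_line(new))

print(unwrapped_diff)
print(wrapped_diff)
```

Both diffs report one removed and one added line, but in the wrapped version each of those lines is a single short sentence, so the actual change is much easier to spot.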
Consider the following git diff on my ongoing PR. The first two changes (licence -> license) are quite apparent, but in the remaining sections it is harder to see what has actually changed, because each change is buried in a large block of text:
https://github.com/phuse-org/E2E-OS-Guidance/blob/main/README.md#installation
This section reads a little bit like a split between contribution guideline (fork and make a PR) and a bit like a setup guide, but on a very high level. Think it is worth considering a detailed step-by-step that can take a user from zero to having it running locally, even for a user that is fairly unfamiliar with the tool stack used.
Thanks all for allowing Posit to review this doc! It's a great read and will certainly be an invaluable resource for developers in pharma looking to adopt open source tools for clin reporting. I provided some quick typo/sentence corrections, and also wanted to share (compliments of our Open Source Program Director, Tracy Teal) an article on writing clean and reliable open-source code which might be a helpful resource to link to.
I merged #10 but some links look like they are broken in "Built-with"
Built With
[![Next][Next.js]][Next-url]
[![React][React.js]][React-url]
[![Vue][Vue.js]][Vue-url]
[![Angular][Angular.io]][Angular-url]
[![Svelte][Svelte.dev]][Svelte-url]
[![Laravel][Laravel.com]][Laravel-url]
[![Bootstrap][Bootstrap.com]][Bootstrap-url]
[![JQuery][JQuery.com]][JQuery-url]
@epijim - any thoughts?