phuse-org / e2e-os-guidance

Collaboration area for PHUSE End-to-End Open-Source Guidance

Home Page: https://phuse-org.github.io/E2E-OS-Guidance/

License: MIT License

TeX 4.19% SCSS 3.93% HTML 8.04% CSS 20.84% R 63.00%

e2e-os-guidance's People

Contributors

aebilgrau, epijim, kimjj93, mstackhouse, tkqt

e2e-os-guidance's Issues

Feedback from Pharmaverse council

Need to incorporate the feedback below (many thanks to Ross for driving this):

  • I personally found this section a bit bloated: https://phuse-org.github.io/E2E-OS-Guidance/usingos/#how-active-are-the-community-behind-a-project. The bullet points were enough in my mind.
  • Out of interest, how come riskmetric never made it into https://phuse-org.github.io/E2E-OS-Guidance/usingos/#what-can-help-me-understand-the-risks-around-using-an-open-source-project ?
  • In your post-competitive explanation (https://phuse-org.github.io/E2E-OS-Guidance/releasingOS/ip/), I wonder if just talking of eCRF -> esub narrows this definition too much. I have a more generalised view: it is anything where companies need to solve the same challenge, and solving it collaboratively offers greater value than solving it in isolation (e.g. there are similar collaborations across companies in the patient-level data sharing space, which we'd justify in the same way).
  • Minor bugbear: the inconsistent spelling of license (licence).
  • Licenses: I expected a little more on copyleft and the implications of making such a choice.

Feedback by Phil (Posit)

"well as collaboration on and creation of open-source projects used by data scientists in clinical reporting workflows"
You could include statistical programmers, researchers, scientists etc. PHUSE members hold many different roles.

"Open source: the what and why"
Open source is also a step towards insuring "repoducibility"... should be "reproducibility" yea?

Also, some good info below on that topic:

https://nuest.staff.ifgi.de/N%C3%BCst-and-Pebesma_2020_AAM_Practical-Reproducibility-in-Geography-and-Geosciences.pdf

https://ropensci-archive.github.io/reproducibility-guide/

"Open source: the what and why"
Given the historical context, and given the R package examples later on, you could, depending on the focus, reference work by John Chambers, Robert Gentleman, Ross Ihaka, JJ Allaire, Hadley Wickham and Joe Cheng etc. This could help bridge into the examples later on in the paper.

"How can I see the activity of an open-source project?"
This jumps straight into OS projects, and more specifically packages. There could be a section on the languages themselves and how they are built and maintained (R Core, the R Foundation etc.), and on how community packages extend their capabilities. The appendix of this paper might give some ideas:

How do I select an R package for my clinical workflow?
https://www.lexjansen.com/phuse-us/2019/tt/TT11.pdf

"How do I find open-source projects?"
Maybe list Bioconductor and work by ropensci. Could list some examples from the Python sciences packages/ecosystem too.

"In such cases, you may need to extend, or start a new package."
Maybe define fork/clone.

"What can help me understand the risks around using an open-source project?"
Lots of good info here as well:

https://www.pharmar.org/white-paper/

This section could also list the Fred Hutch, GSK and Phuse work on:

https://github.com/phuse-org/valtools

"Licenses: using a project"
It could be worth mentioning that companies can maintain a curated list of approved packages, selected on criteria such as their licenses.

"When is a good time to open source?"
Caret is a good example from Pfizer:

https://www.r-project.org/conferences/useR-2010/slides/Kuhn.pdf
"Pfizer’s Statistics leadership for providing the time and support to create R packages"

https://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

Also, Targets by Will Landau at Lilly is a good example:

https://books.ropensci.org/targets/
https://cran.r-project.org/web/packages/targets/LICENSE

"Company Github orgs"
https://github.com/orgs/Merck/repositories

"Collaboration and governance models"
Some good info here:

https://ropensci.org/stat-software-review/

"3.8 When do we need contracts?"
Might list an example from Julia, Python or JS etc. There are also lots of interesting examples of collaborative open-source work in the PK/PD space:

https://nlmixrdevelopment.github.io/nlmixr/articles/xgxr-nlmixr-ggpmx.html
https://github.com/MetrumResearchGroup
https://pharmpy.github.io/latest/index.html
https://github.com/metrumresearchgroup/2021-r-in-pharma

Feedback from Pfizer

Thank you for the invitation to review and provide feedback on the guidance. Please let me know if you'd like for me to create separate issues and/or submit a PR.

Example clarification

It is possible that decisions made before open sourcing could become a risk after open sourcing. As an example of a plausible scenario; a team need to implement a new function. This function exists in another GPL-3 copy left licenced project. To add that project would introduce multiple dependencies that aren't used by that particular function so a member of the team decides to copy the function into the package. One year later, the package is open sourced with the licence infringing code. Such an occurrence could be lessened by a Contributor Licence Agreement (CLA; see [the bot contributor-assistant](https://github.com/contributor-assistant/github-action) for an example of CLA automation). A CLA helps ensure that anyone contributing to a project acknowledges specific terms expected of contributions, like the contributions are novel code and the author will abide by the projects licence terms. In the absence of a CLA it is important to ensure that all code within the package is original, and there is no culture of cannibalising external code and infringing on people's copyright within the development team even for internal projects.

I think this example needs more detail. It says "This function exists in another GPL-3 copy left licenced project". Does that imply that the project which copies the function, and is open sourced at a later point in time, will also be released under GPLv3?

If that is the assumption, then they are not necessarily in the wrong as per my understanding (is the original project attributed and referenced explicitly?):
https://fossa.com/blog/open-source-software-licenses-101-gpl-v3/

On the other hand, releasing a project that includes a piece of GPLv3 code and then marking the whole project as MIT would be an infringement, because MIT is more permissive than GPLv3 and would drop the copyleft obligations attached to that code.
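The check described above could be partly automated before a package is open sourced. The following is a minimal sketch only: the two licence sets are illustrative and far from exhaustive, the function name `copyleft_conflicts` and the component names are hypothetical, and it covers only the case of a permissively licensed project that contains copyleft code. Real compatibility also depends on how the code is combined and distributed.

```python
# SPDX identifiers; both sets are illustrative assumptions, not a complete list.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "GPL-3.0-or-later", "AGPL-3.0-only"}
PERMISSIVE = {"MIT", "BSD-3-Clause", "Apache-2.0"}


def copyleft_conflicts(project_licence: str, component_licences: dict[str, str]) -> list[str]:
    """Return the components whose copyleft licence clashes with a permissive project licence.

    Only the "permissive project, copyleft component" case is handled;
    any other project licence returns an empty list.
    """
    if project_licence not in PERMISSIVE:
        return []
    return sorted(
        name for name, licence in component_licences.items() if licence in COPYLEFT
    )


# Hypothetical inventory of copied code and dependencies for one package.
components = {
    "copied_helper_function": "GPL-3.0-only",
    "plotting_dependency": "MIT",
}

conflicts = copyleft_conflicts("MIT", components)
print(conflicts)
```

A flagged component is a prompt for a closer review, not a verdict: the same inventory released under GPLv3 would not be flagged by this sketch.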

Additional feedback from Pfizer

A project could also have no activity because it was abandoned, either before or after it reached v1.0.

  • A summary and recommendation of licence types, with particular focus on permissive vs copyleft licences and the ramifications for code built on top of your project.
  • The relevance of licences present in dependencies, direct vs transitive dependencies, and the issues around compiling with dependencies that could occur in something like a public Shiny app.

  • Consider breaking down/simplifying sentences to improve readability:

In /using.html#what-is-the-open-source-health-of-the-package:

There are no universal magic metrics to summarise whether an OS project is ‘healthy’; for example if a project has had no activity for 12 months, is that because the product has been abandoned/superseded, or could it be it had a small well-defined scope and is now stable and feature complete?

In /using.html#how-active-are-the-community-behind-a-project:

By looking through the issues, subjective impressions on community health can be made, for instance whether it’s a few people giving feedback and one person developing, does it have stale issues no-one replies to, or does it have a lively community engaged in discussion and coordination.

Consider line character limits

Lines in many sections grow quite long because newlines are seldom used. This is no problem in editors that wrap the text to fit the window (GitHub, for example), but it is difficult in editors that do not wrap. There it is not a great user experience, and it also makes it hard for git to show clearly what has actually changed within one or more lines.

GitLab wrote a blog post about this: https://about.gitlab.com/blog/2016/10/11/wrapping-text/
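The effect on diffs can be illustrated with a small sketch using Python's standard `difflib`. The sample sentences, the helper names `changed_lines` and `one_sentence_per_line`, and the naive "newline after every full stop" rule are all assumptions made for illustration; they are not taken from the guidance or from the blog post.

```python
import difflib


def changed_lines(old: str, new: str) -> list[str]:
    """Return only the added/removed lines of a unified diff (no headers, no context)."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="", n=0)
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


def one_sentence_per_line(text: str) -> str:
    """Naive wrapping rule: start a new line after each sentence-ending full stop."""
    return text.replace(". ", ".\n")


# The same one-word change (licence -> license) in a three-sentence paragraph.
old = "Pick a licence early. Check your dependencies. Document the decision."
new = "Pick a license early. Check your dependencies. Document the decision."

unwrapped = changed_lines(old, new)
wrapped = changed_lines(one_sentence_per_line(old), one_sentence_per_line(new))

print("unwrapped diff:", unwrapped)
print("wrapped diff:  ", wrapped)
```

With the paragraph on a single line, the diff repeats the whole paragraph twice; with one sentence per line, only the sentence that changed appears.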

Consider the following git diff on my ongoing PR. The first two changes (licence -> license) are easy to spot, but in the remaining sections it is much harder to see what has actually changed, because each change is buried in a large block of text:
[Screenshot: git diff from the PR]

Feedback from Ryan (Posit)

Thanks all for allowing Posit to review this doc! It's a great read and will certainly be an invaluable resource for developers in pharma looking to adopt open source tools for clin reporting. I provided some quick typo/sentence corrections, and also wanted to share (compliments of our Open Source Program Director, Tracy Teal) an article on writing clean and reliable open-source code which might be a helpful resource to link to.

Section 2.2.1

  • Title should read "How active is the community..." or "how active are the communities..."
  • "Some things to consider when trying to establish the activity of a community are..." I feel that this sentence should refer to the "community activity for a given package/project."
  • This sentence can be re-written for clarity: "What is the recent and trends in commit activity?" Maybe have it read "What are the most recent commits and are there any trends in commit activity?"
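
The commit-activity question above can be approximated with a few lines of code. This is a minimal sketch, assuming the commit dates have already been exported as ISO strings (for example with `git log --date=short --format=%cd`); the function names and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def commits_per_month(commit_dates: list[str]) -> dict[str, int]:
    """Count commits per calendar month from ISO dates (YYYY-MM-DD)."""
    counts = Counter(d[:7] for d in commit_dates)  # "YYYY-MM" prefix identifies the month
    return dict(sorted(counts.items()))


def months_since_last_commit(commit_dates: list[str], today: date) -> int:
    """Whole calendar months between the most recent commit and `today`."""
    last = date.fromisoformat(max(commit_dates))  # ISO strings sort chronologically
    return (today.year - last.year) * 12 + (today.month - last.month)


# Hypothetical commit history for illustration.
sample = ["2023-01-05", "2023-01-20", "2023-02-11", "2023-06-02"]

trend = commits_per_month(sample)
idle = months_since_last_commit(sample, today=date(2024, 6, 30))

print(trend)
print(idle)
```

As the guidance itself notes, a long idle period does not distinguish an abandoned project from a stable, feature-complete one, so a number like this is only a starting point for looking at the issues and releases.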

Section 2.2.3

  • Typo: "It also and indexes and"

Section 2.4

  • Can maybe link to https://r-pkgs.org/ when discussing the idea of creating your own R package.

Broken links in Readme

I merged #10 but some links look like they are broken in "Built-with"

Built With
[![Next][Next.js]][Next-url]
[![React][React.js]][React-url]
[![Vue][Vue.js]][Vue-url]
[![Angular][Angular.io]][Angular-url]
[![Svelte][Svelte.dev]][Svelte-url]
[![Laravel][Laravel.com]][Laravel-url]
[![Bootstrap][Bootstrap.com]][Bootstrap-url]
[![JQuery][JQuery.com]][JQuery-url]

@epijim - any thoughts?
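
One possible cause, offered here only as an assumption, is that the badges use reference-style Markdown links whose labels (such as `Next.js` or `Next-url`) have no matching `[label]: url` definition elsewhere in the README. The sketch below lists such undefined labels; the function name, the regular expressions, and the sample README text with its `example.invalid` URLs are hypothetical and simplified compared with the full Markdown rules.

```python
import re

# A reference used as "...][label]", e.g. the two labels in [![Next][Next.js]][Next-url]
REF_USE = re.compile(r"\]\[([^\]\[]+)\]")
# A link reference definition at the start of a line: "[label]: destination"
REF_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:[ \t]*\S+", re.MULTILINE)


def undefined_reference_labels(markdown: str) -> list[str]:
    """Return reference labels that are used but never defined, in order of first use."""
    defined = {label.lower() for label in REF_DEF.findall(markdown)}  # labels are case-insensitive
    missing: list[str] = []
    for label in REF_USE.findall(markdown):
        if label.lower() not in defined and label not in missing:
            missing.append(label)
    return missing


# Hypothetical README fragment: only the Next.js badge has its definitions.
readme = """\
[![Next][Next.js]][Next-url]
[![React][React.js]][React-url]

[Next.js]: https://example.invalid/next-badge.svg
[Next-url]: https://example.invalid/next
"""

missing = undefined_reference_labels(readme)
print(missing)
```

Running a check like this against the real README would show whether the entries need definitions added or should simply be removed because the project does not use those frameworks.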
