nhs-r-community / statements-on-tools

The NHS-R Community statements on the use of tools including (but not exclusively) R and R Studio.

Home Page: https://tools.nhsrcommunity.com/

License: Creative Commons Zero v1.0 Universal

Language: R 100.00%
Topics: community, book

statements-on-tools's Introduction

This is an evolving document which describes how and why to use R and other data science tools and to share and reuse code safely in health and social care settings.

The scope and content are expanding all the time as the community collaboratively produces a definitive statement of the NHS-R way.

Please file issues, make pull requests, and get involved. We're very happy to hear from friends from inside and outside of NHS-R.

statements-on-tools's People

Contributors

bclarke-nes, chrisbeeley, lextuga007, matt-dray, tomjemmett, wbryant

statements-on-tools's Issues

Cannot locally build book because of child documents

Currently it is not possible to build the book locally because of the use of child documents. While you can get bookdown to render the document, it will only ever pull in the files from the main branch, making it impossible to verify your own changes. The only way to check that your changes work is to commit to main and then check.

I suggest that we remove the use of child documents and move to a single repo (#19, #23 seem to be handling this), then move to use bookdown in a more traditional sense with individual files. I'm happy to sort this.

Options then would be to name each file as "00-filename.rmd", setting the numbers to order the chapters as we wish; to set the order in _bookdown.yml (e.g., as r4ds did in the first edition); or to move to quarto (e.g., as r4ds has since done). My shout would be the first option: I think it typically makes it clear how the book will be built just from looking in the files pane.
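
For the second option, a minimal sketch of setting the order in _bookdown.yml might look like the following. The chapter file names are hypothetical placeholders, not files from this repo; `rmd_files` is the bookdown field that lists the files to build, in order.

```yaml
# _bookdown.yml (sketch only; the file names below are hypothetical)
book_filename: "statements-on-tools"
rmd_files:
  - index.Rmd
  - purpose.Rmd
  - tools.Rmd
  - glossary.Rmd
```

With `rmd_files` set, bookdown builds only the listed files, in the listed order, rather than sorting the files by name.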

Once we have done this we can sort of resolve #18, though this will not be run once per day but on merges into main (usethis::use_github_action("bookdown") will probably be sufficient, and the action will need no tweaking).

Glossary of terms

Although there should be descriptions of each technical term within each chapter it would be great to collate them into a quick reference guide.

Don't know if we'd need to have internal links from this to relevant chapters?

IT security

Highlight that IT security needs to be contacted about access to sites such as GitHub, published GitHub pages, Slack and Netlify. People often get permissions that later change as security is built upon, and the IT helpdesk may not understand the implications where IT security staff would.

Addressing other current challenges to modern analytics adoption

On top of issues around the basics of getting open source tools installed onto Trust machines, there are further topics that must be addressed in order to move towards the analytics vision set out in the Goldacre Report, the NHSX Open Source policy and the NHSD RAP Community.

These include, but are not limited to:

  • Open code: publication of code in public repositories,
  • Synthetic data: the creation and authorisation of synthetic/dummy datasets in order to improve reproducibility (amongst other things),
  • Balancing analytics processes and software engineering.

NHSR Vision

The vision was originally written from a workshop in 2021 but was very focussed on research and statistics, and refers to clinicians rather than analysts and data scientists. It might be that the vision needs rewriting completely to include other languages like Python, tools like Git/GitHub, and relationships to other communities like Government Data Science Campus and NHS.pycom.

Minor formatting questions

  • Do we want to replace "document" with "chapter" within each chapter?
  • Any strong feelings about whether to number sections/chapters?
  • Anything else you'd like to see on the cover page (link to NHS-R twitter account?)

Maybe @bclarke-nes or @Lextuga007 would like to offer an opinion?

Pulling together some thoughts for Purpose chapter draft

I wanted to start bringing some thoughts together about the overarching purpose chapter. Roughly:

  • I'll move some of the material from section 3.1 (about tools) into the main purposes chapter. It'll do more work there as part of a general introduction to the handbook, and allow us to cut to the chase on the specific tools/packages issues
  • I'll copy some of the points over from 2.1 too
  • And make some general comments about R / use in the NHS, based on @wbryant 's discussion in #4

Question: what else should I look at for this? Any suggestions very welcome.

Integrate the content from this book with NHS-R Way

There are a few crossovers in content with NHS-R Way, and there is now also a difference in book type between the two, as NHS-R Way is built with Quarto (it was released many months after this book was created).

In order to keep this open to R and Python users, should this book be moved to quarto or merged directly into the NHS-R Way?

Generalise to Python (and Docker?)

The arguments around R and R packages will be similar for Python and Docker, though their repos (e.g. PyPI and Docker Hub) are run somewhat differently to CRAN, as I understand it.

Can this therefore be generalised, but with a few R/Python specific chunks mixed in there?

Not to overcomplicate things but I just discovered RMarkdown parameters; I could just imagine having a document here where you can select Python or R at the beginning and the entire document adjusts accordingly ...
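
As a rough sketch of that idea, the YAML header of an R Markdown document can declare a parameter with a fixed set of choices. The parameter name `language` and the title are assumptions made up for illustration.

```yaml
---
title: "Statement on tools"   # hypothetical title
output: html_document
params:
  language:
    label: "Language"
    value: R                  # default used when nothing else is supplied
    input: select
    choices: [R, Python]
---
```

Chunks could then be switched on or off with a chunk option such as `eval = params$language == "R"`, and the value supplied at render time, for example `rmarkdown::render("doc.Rmd", params = list(language = "Python"))` (the file name is a placeholder). Whether this scales to a whole bookdown book is untested here.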

Curriculum for training and learning

NHS-R needs to have a considered view on a "curriculum" or competency framework for an NHS-R analyst, and this document could be a place for that.

@Lextuga007 will have good ideas, as will others no doubt

Locally hosted Git and GitHub

There are various locally hosted Git platforms with varying costs and implications. These might also require a particular workflow from private to public.

Posit Cloud

Guidance is needed on Posit Cloud, particularly around sensitive data: it should never be used for analysis with patient identifiable or sensitive data.

It is often used for training, where packages and data are pre-installed. The section should cover the reasons why this can be helpful, and how workspaces are available through organisations with paid subscriptions (like NHS-R Community) as well as on individual accounts.

Section on licences

Cover what happens if code/GitHub repositories don't have a licence, with suggestions or a quick reference to what can be used and any implications.

Setting up git/github safely with .ignore files

A key risk of using git/GitHub in healthcare settings, where work involves identifiable or potentially identifiable patient-level data, is not thinking carefully enough about project/repo structure and which locations within a project/repo to add to .gitignore. For example, if someone filters half a dozen records from a secure database, it's important that no commit contains these records, even though they may want to include the code that performs the filter.

This suggests it's important to have both a clear understanding of how to .gitignore locations and a priori agreement about which folders inside a project should contain which kinds of data. Some discussion of data security roles within an active repo might be important to include too, so that there isn't any kind of 'incident' involving this sort of accidental release of data, which could set back progress on collaborative coding and version control very quickly.
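
One way to make such an agreement concrete is a .gitignore committed at the very start of a project. The sketch below assumes a hypothetical layout in which all extracts live under `data/` and all outputs under `outputs/`; these folder names are illustrative, not an agreed NHS-R convention.

```gitignore
# Never commit anything from the agreed data folders (hypothetical layout)
data/
outputs/

# Common file types that may hold record-level data
*.csv
*.xlsx
*.rds

# R session files that can contain objects or commands with data in them
.Rhistory
.RData
```

Running `git check-ignore -v <path>` shows which rule, if any, causes a given file to be ignored, and `git status --ignored` lists what is currently excluded. Note that .gitignore only affects untracked files: anything already committed stays in the history.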

Guide to deployment

Lots of people have questions about deploying Shiny and RMarkdown in the NHS. Someone should write this stuff down.

Data in packages

I was writing out the difference between R, R Studio and R packages:

R is a programming language.
RStudio is an integrated development environment which supports R and other languages.
[R packages](https://r-pkgs.org/intro.html) bundle together R code, data, documentation and tests.

I then realised that the definition of R packages says they include data, which needs to be addressed given IG/security concerns. The only data in other people's packages will be (or should be) publicly available, so the issue of our own data should only arise when we create packages ourselves.

This probably needs to be spelled out for clarity.

Is renv necessary?

Sorry if this is a stupid question but is renv necessary in this repo?

GitHub action failed

Check too whether this relates to #42, where the _main files (.epub, .pdf and .tex) are not being updated (at least from a local build).

Practical guidance for adopting open source tooling and open coding

Part of what-is-the-need-in-the-system that could be of great benefit would be a knowledgebase/playbook for adopting and supporting open source tools and mindsets in the NHS. For instance someone shared some great resources addressing (I think) getting authorisation for R installation and running open source tools.

It's quite a broad topic so could be honed for NHS-R but I think it's a great community to be involved in that sort of work.
