
kwb-r / fakin.doc


Best Practices in Research Data Management

Home Page: https://kwb-r.github.io/fakin.doc

TeX 33.54% CSS 4.80% R 57.20% Shell 2.27% HTML 2.18%
r rstats publication project-fakin research-data-management r-bookdown best-practices

fakin.doc's Introduction

fakin.doc

Documents to be used in our FAKIN project (in German)

fakin.doc's People

Contributors

hsonne, mrustl

fakin.doc's Issues

Adapt Hadley's R package checklist to KWB needs (packages, workflows, and so on)?

  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Polish NEWS
  • If new failures, update email.yml then revdepcheck::revdep_email_maintainers()
  • Bump version (in DESCRIPTION and NEWS)
  • devtools::check_win_devel() (again!)
  • devtools::submit_cran()
  • pkgdown::build_site()
  • Approve email
  • Tag release
  • Bump dev version
  • Write blog post
  • Tweet

Template from r-lib/usethis#338

Too advanced for us, as the focus is on a CRAN release (as for ggplot2 3.0, tidyverse/ggplot2#2568), but we can use it as a starting point. For R packages, Andi's "kwb.resilience" could be our first use case.

  • How to create a Github styled TO DO list in rmarkdown (for details see issue #23)
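As a minimal sketch of one possible approach (not necessarily the solution discussed in issue #23), a checklist can be generated from a character vector using GitHub's task list syntax ("- [ ]" for open and "- [x]" for completed items). The helper name to_task_list() and the selected tasks are assumptions made for illustration only.

```r
# Turn a character vector of tasks into GitHub-style task list lines.
# to_task_list() is a hypothetical helper, not part of any package.
to_task_list <- function(tasks, done = rep(FALSE, length(tasks))) {
  stopifnot(
    is.character(tasks),
    is.logical(done),
    length(tasks) == length(done)
  )
  paste0("- [", ifelse(done, "x", " "), "] ", tasks)
}

# A few items taken from the checklist above
release_tasks <- c(
  "Polish NEWS",
  "Bump version (in DESCRIPTION and NEWS)",
  "Tag release"
)

task_lines <- to_task_list(release_tasks, done = c(TRUE, FALSE, FALSE))

# Inside an R Markdown chunk with the option results = "asis",
# cat() writes the lines as raw markdown into the document.
cat(task_lines, sep = "\n")
```

Whether the lines render as clickable check boxes depends on the output format; GitHub renders them in issues and in markdown files.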

Add more References

Reminder to myself to add the following references in the relevant chapters of the report, e.g.:

Why to share Code?

Uncategorized

Best practices scientific computing

Why GitHub?

All above references found in:
Lowndes, Julia S Stewart, Benjamin D Best, Courtney Scarborough, Jamie C Afflerbach, Melanie R Frazier, Casey C O’Hara, Ning Jiang, and Benjamin S Halpern. 2017. “Our Path to Better Science in Less Time Using Open Data Science Tools.” Nature Ecology & Evolution 1 (6). Nature Publishing Group
https://www.nature.com/articles/s41559-017-0160

In addition, add the new R Markdown book (https://bookdown.org/yihui/rmarkdown/) and a link to the R package rticles, which provides templates for writing journal articles in R Markdown.

Sackler Colloquium on Improving the Reproducibility of Scientific Research (Free Online) http://www.pnas.org/content/115/11 , e.g.:

Add link to web page https://regex101.com/

This page lets you test regular expressions. Do we already have a chapter about the importance of regular expressions? We should.
I found this link in the DataCamp course "Intermediate R - Practice".
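To illustrate why regular expressions matter for data management, here is a small sketch in base R that checks file names against a naming convention and extracts the date they encode. The file names and the convention itself are made up for this example.

```r
# Hypothetical file names that encode a site, a date and a parameter
files <- c(
  "siteA_2018-03-07_flow.csv",
  "siteB_2018-03-08_turbidity.csv",
  "notes.txt"
)

# Assumed convention: <site>_<YYYY-MM-DD>_<parameter>.csv
pattern <- "^([A-Za-z0-9]+)_([0-9]{4}-[0-9]{2}-[0-9]{2})_([a-z]+)\\.csv$"

# Which file names follow the convention?
matches <- grepl(pattern, files)

# Extract the second capture group (the date) from the matching names
dates <- as.Date(sub(pattern, "\\2", files[matches]))

print(files[!matches])  # names that violate the convention
print(dates)
```

A pattern like this can be tried out interactively on regex101.com before it is used in a script.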

Link to "Research Compendium" website

A research compendium accompanies, enhances, or is a scientific publication providing data, code, and documentation for reproducing a scientific workflow. It can be published on different platforms using the label (or tag, community, ...) research-compendium (applied on GitHub, Zenodo, OSF) or, as a fallback, the term "research compendium" in the description (used on GitLab). The Zenodo community even has a curation policy for the accepted records.

https://research-compendium.science/

Tools: R packages for "Metadata"

codemetar (https://ropensci.github.io/codemetar)

"We recommend you to use the codemetar (https://ropensci.github.io/codemetar) package for creating and updating a JSON CodeMeta metadata file (https://codemeta.github.io/) for your package via codemetar::write_codemeta(). It will automatically include all useful information, including GitHub topics. CodeMeta uses schema.org terms so as it gains popularity the JSON metadata of your package might be used by third-party services, maybe even search engines. " (https://ropensci.github.io/dev_guide/building.html#creating-metadata-for-your-package)

dataspice (https://github.com/ropenscilabs/dataspice)

The goal of dataspice is to make it easier for researchers to create basic, lightweight and concise metadata files for their datasets. These basic files can then be used to:

  • make useful information available during analysis.
  • create a helpful dataset README webpage.
  • produce more complex metadata formats to aid dataset discovery.

Metadata fields are based on schema.org and other metadata standards.

R package: dirdf - Extracts Metadata from Directory and File Names https://github.com/ropenscilabs/dirdf
Create tidy data frames of file metadata from directory and file names.
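The idea of deriving metadata from directory and file names can be sketched in base R as follows. This does not use the dirdf API; the function paths_to_metadata() and the path layout <project>/<year>/<site>_<parameter>.csv are assumptions made for illustration only.

```r
# Hypothetical relative paths: <project>/<year>/<site>_<parameter>.csv
paths <- c(
  "ogre/2017/siteA_flow.csv",
  "ogre/2018/siteB_turbidity.csv"
)

# Split each path into its components and return one row per file.
# paths_to_metadata() is a hypothetical helper, not part of dirdf.
paths_to_metadata <- function(paths) {
  parts <- strsplit(paths, "/", fixed = TRUE)
  stopifnot(all(lengths(parts) == 3))

  # File name without extension, e.g. "siteA_flow"
  file_stem <- tools::file_path_sans_ext(vapply(parts, `[`, character(1), 3))
  name_parts <- strsplit(file_stem, "_", fixed = TRUE)
  stopifnot(all(lengths(name_parts) == 2))

  data.frame(
    project = vapply(parts, `[`, character(1), 1),
    year = as.integer(vapply(parts, `[`, character(1), 2)),
    site = vapply(name_parts, `[`, character(1), 1),
    parameter = vapply(name_parts, `[`, character(1), 2),
    path = paths,
    stringsAsFactors = FALSE
  )
}

metadata <- paths_to_metadata(paths)
print(metadata)
```

The stopifnot() checks make the function fail early when a path does not follow the assumed layout, which is itself a useful test of a naming convention.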

QMS: define a company wide strategy for publishing code

After talking with @daniel-wicke today about publicly publishing two R packages used in the project-ogre (see KWB-R/kwb.ogre#2 and KWB-R/kwb.ogre.model#1), it became obvious that we currently lack a company-wide strategy for publishing code.

For this a workflow should be developed within FAKIN and implemented in the QMS. This for sure requires that the KWB management and the department leaders

I would propose the following:

  • 100% publicly funded projects (e.g. BMBF, EU, and so on): source code will always be published on https://github.com/kwb-r as a public repository (i.e. accessible to everyone), provided that the code does not contain security-critical paths (e.g. to our company server) or confidential data. Ideally, code should be developed in such a way that it contains neither. Making the code openly available will also reduce the effort needed to install it (e.g. not every student needs to get an "access" token to install private repositories, as required for "contract" projects, see below).

  • Contract projects (BWB, Veolia): code will be published on https://github.com/kwb-r as private repositories by default, unless the funder pre-defines a specific workflow.

Could this topic also be addressed at one of the next management meetings, @chsprenger?

UFOPLAN BaSaR: data workflow/structure recommendation

Within the new KWB project UFOPLAN BaSaR, R scripts already used in OgRe should be reused.

According to Daniel, now is a good time to optimise the data workflow, as the project is just beginning and regular sampling starts next week.

He hopes to get some recommendations on how to improve the current folder structure so that it can easily be integrated later into the workflow proposed by FAKIN.

However, the R scripts must also work on field laptops without a connection to the KWB intranet (i.e. it must be possible to adapt folder paths with minimal effort).
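One way to meet this requirement is to define the root folder in a single place and to build all other paths relative to it. The following is a minimal sketch under assumptions: the environment variable name BASAR_ROOT, the default folder "basar" and the helper names are hypothetical and not part of the existing scripts.

```r
# Resolve the project root: prefer an environment variable, otherwise fall
# back to a local default folder. "BASAR_ROOT" and "basar" are hypothetical.
get_project_root <- function(default = file.path(path.expand("~"), "basar")) {
  root <- Sys.getenv("BASAR_ROOT", unset = NA)
  if (is.na(root) || !nzchar(root)) default else root
}

# Build every path relative to the root, so that no script needs to
# hard-code a server path.
project_path <- function(...) {
  file.path(get_project_root(), ...)
}

# Example: the same call works in the office and on a field laptop
raw_data_dir <- project_path("rawdata")
print(raw_data_dir)
```

In the office, the variable could point to the network share; on a field laptop, it is either left unset or set to a local copy of the data, and the scripts themselves stay unchanged.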

Code Ocean for running code (alternative to mybinder)

www.codeocean.com

Hi Michael, we've upgraded your quota for the time being to 10 hours/50 GB and we'll reset the official plan designation later today. But you have all of the privileges of the pro account as of now, and anyone else who signs up via the www.kompetenz-wasser.de domain should be automatically upgraded as well.

Best,

Seth Green
Developer Advocate

From FAQ: https://codeocean.com/plans

What is included in the Researcher plan?
The researcher plan includes everything you need to get started, explore and run code, download code and data, unlimited compute capsule publishing, privately modify published code, collaborate with peers, embed code onto your personal site. Everyone is allotted 5GB of storage and 1 hour of compute time per month. Use your academic email to get 20GB of storage and 10 hours of compute time per month.

Batch Script For File Tracking

Develop a batch script that tracks all file/folder changes in the following folders on the KWB servers:

/projekte$
/processing
/rawdata

To be used for the Brownbag and later as a general tool to identify when (un)intended changes in the folder structure occurred.
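The tracking logic can be sketched as "take a snapshot of the folder tree, then compare two snapshots". The sketch below uses R instead of a batch script, and the function names snapshot_tree() and compare_snapshots() are hypothetical. It only detects changes in size or modification time, and it lists files only, so empty folders are not tracked.

```r
# Record relative path, size and modification time of every file below `root`
snapshot_tree <- function(root) {
  files <- list.files(root, recursive = TRUE, all.files = TRUE, no.. = TRUE)
  info <- file.info(file.path(root, files))
  data.frame(
    path = files,
    size = info$size,
    mtime = info$mtime,
    stringsAsFactors = FALSE
  )
}

# Compare two snapshots and report added, removed and modified files
compare_snapshots <- function(old, new) {
  common <- intersect(old$path, new$path)
  old_common <- old[match(common, old$path), ]
  new_common <- new[match(common, new$path), ]
  changed <- old_common$size != new_common$size |
    old_common$mtime != new_common$mtime
  list(
    added = setdiff(new$path, old$path),
    removed = setdiff(old$path, new$path),
    modified = common[changed]
  )
}

# Usage example on a temporary folder (stands in for a server folder)
demo_root <- tempfile("tracking-demo-")
dir.create(demo_root)
writeLines("first version", file.path(demo_root, "data.csv"))
snapshot_before <- snapshot_tree(demo_root)

writeLines("a new file", file.path(demo_root, "readme.txt"))
snapshot_after <- snapshot_tree(demo_root)

print(compare_snapshots(snapshot_before, snapshot_after))
```

For regular tracking, each snapshot could be stored with saveRDS() under a time-stamped file name and compared with the previous one by a scheduled task.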
