Comments (5)
How about:
1. Is there data?
2. If yes to 1, where can it be found?
3. What is the data format?
4. Is there software to interpret the data format?
5. If yes to 4, where can the software be found?
6. Is there documentation on the content and structure of the data?
7. Where can additional documentation on the data be found? E.g. scientific papers
8. Where can additional documentation be found on the process that produced the data, e.g. scientific sensor equipment documentation?
9. Has the data been used in science before? If yes, where and how?
10. Are links between subsets of the data present, are they clear, and are they correct?
11. Is there domain specific documentation on how to verify the quality of the data?
12. Is the dataset complete?
13. What is the size of the data and is there a plan for storing/archiving it?
14. In case of animal or human data: Has approval been obtained to collect the data?
15. If yes to 14, what is the approval reference number?
16. What are the desired access rights for the data?
17. What obligations does the owner of the data have in terms of data protection?
18. Is there documentation on how the data should be cited?
from guide.
Hi Anand, can you add some more information? Specifically, what kind of information should a guide to the data review
process contain?
from guide.
The purpose of this guide could be, for example:
- Consistency of the data throughout the process flow, at the various steps where data transformation happens.
- Cross-verifying that the data matches its metadata, source, and location.
- Data processing steps.
- Creation of simulated data to cross-check against.
- The things you mentioned under Data test could also be added here.
- How could someone review this? (Personally, I do not know much about this either.)
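To make the "cross-verify against metadata" point concrete, here is a minimal sketch of what such a check could look like. It assumes the metadata records a file size and a SHA-256 checksum; the keys `size_bytes` and `sha256` and the function names are hypothetical, not part of any existing guide or tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_against_metadata(path: Path, metadata: dict) -> list[str]:
    """Compare a data file with its metadata record.

    `metadata` is assumed (hypothetically) to contain the keys
    "size_bytes" and "sha256". Returns a list of problems found;
    an empty list means the file matches the record.
    """
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    size = path.stat().st_size
    if size != metadata["size_bytes"]:
        problems.append(f"size is {size}, expected {metadata['size_bytes']}")
    if sha256_of(path) != metadata["sha256"]:
        problems.append("SHA-256 checksum does not match")
    return problems
```

A reviewer could run `check_against_metadata` for every file listed in the metadata and treat any non-empty result as a finding to follow up on.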
from guide.
What exactly is this a guide for: releasing data, or just using it?
I'd like to add some legal issues, of course:
- If you want to use an external data set, does it come with any restrictions on how it can be used, e.g. only non-commercial or academic use? Does your project fulfill these restrictions? (Think about commercial partners and such, just because we're a scientific not-for-profit organisation doesn't automatically make everything we do non-commercial.)
- Does the data you want to release incorporate (modified versions of) third-party data that you got elsewhere?
- If so, under what licenses are these other data sets available, and do you have the necessary rights (copyright, database right where applicable, click-wrap licenses, disclaimers, etc.) to redistribute them?
from guide.
- How can one cross-verify that the data matches its metadata, source, and location?
- Can we reproduce data from a previous step (as mentioned in point 7 in the statement above from Vincent)?
- How can simulated data be created with the same expected characteristics as the original data?
- Are there any guidelines for data-driven use cases?
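As one possible answer to the simulated-data question, the sketch below draws synthetic values that share the mean and standard deviation of an original numeric column. It assumes the values are roughly normally distributed, which will not hold for every dataset; the function name `simulate_like` is hypothetical.

```python
import random
import statistics


def simulate_like(original: list[float], n: int, seed: int = 0) -> list[float]:
    """Draw n synthetic values with the same mean and standard deviation
    as `original`.

    Assumption: a normal distribution is an acceptable stand-in for the
    real data. A fixed seed keeps the simulated data reproducible.
    """
    mean = statistics.fmean(original)
    stdev = statistics.stdev(original)
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n)]
```

Running the same processing steps on the output of `simulate_like` and on the original data, then comparing summary statistics, gives a simple cross-check that the pipeline behaves as expected.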
from guide.