Comments (5)

vincentvanhees avatar vincentvanhees commented on May 26, 2024 1

How about:

  1. Is there data?
  2. If yes to 1, where can it be found?
  3. What is the data format?
  4. Is there software to interpret the data format?
  5. If yes to 4, where can the software be found?
  6. Is there documentation on the content and structure of the data?
  7. Where can additional documentation on the data be found, e.g. scientific papers?
  8. Where can additional documentation be found on the process that produced the data, e.g. scientific sensor equipment documentation?
  9. Has the data been used in science before? If yes, where and how?
  10. Are links between subsets of the data present, are they clear, and are they correct?
  11. Is there domain specific documentation on how to verify the quality of the data?
  12. Is the dataset complete?
  13. What is the size of the data and is there a plan for storing/archiving it?
  14. In case of animal or human data: Has approval been obtained to collect the data?
  15. If yes to 14, what is the approval reference number?
  16. What are the desired access rights for the data?
  17. What obligations does the owner of the data have in terms of data protection?
  18. Is there documentation on how the data should be cited?

from guide.

mkuzak avatar mkuzak commented on May 26, 2024

Hi Anand, can you add some more information? Specifically, what kind of information should a guide to the data review process contain?

anandgavai avatar anandgavai commented on May 26, 2024

The purpose of this guide could be, for example:

  1. Checking that data stays consistent throughout the process flow, at each step where a data transformation happens.
  2. Cross-verifying that the data matches its metadata, its source and its location.
  3. Data processing steps.
  4. Creating simulated data to cross-check against.
  5. The items you mentioned under Data test could also be added here.
  6. How could someone review this? (Personally, I do not know much about this either.)
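
As a rough illustration of the cross-verification item above, here is a minimal sketch in Python. It assumes the data is a UTF-8 CSV file and that the metadata is a dictionary with the keys `sha256`, `columns` and `n_rows`; those key names and the function name `verify_against_metadata` are made up for this example and do not come from any existing standard or tool.

```python
import csv
import hashlib
from pathlib import Path


def verify_against_metadata(data_path, metadata):
    """Return a list of mismatches between a CSV file and its metadata.

    `metadata` is a hypothetical record with the keys "sha256",
    "columns" and "n_rows"; adapt these to whatever your project stores.
    """
    raw = Path(data_path).read_bytes()
    problems = []

    # Checksum: detects any change to the file since the metadata was written.
    actual_sha = hashlib.sha256(raw).hexdigest()
    if actual_sha != metadata["sha256"]:
        problems.append("checksum differs from metadata")

    # Structure: compare the header row and the number of data rows.
    rows = list(csv.reader(raw.decode("utf-8").splitlines()))
    header = rows[0] if rows else []
    body = rows[1:]
    if header != metadata["columns"]:
        problems.append(f"columns are {header}, expected {metadata['columns']}")
    if len(body) != metadata["n_rows"]:
        problems.append(f"found {len(body)} rows, expected {metadata['n_rows']}")

    return problems
```

An empty list means the file agrees with its metadata on these three points. It says nothing about whether the values themselves are correct, which would still need domain-specific checks.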

LourensVeen avatar LourensVeen commented on May 26, 2024

What exactly is this a guide for: releasing data, or just using it?

I'd like to add some legal issues, of course:

  • If you want to use an external data set, does it come with any restrictions on how it can be used, e.g. only non-commercial or academic use? Does your project fulfill these restrictions? (Think about commercial partners and such. Just because we're a scientific not-for-profit organisation doesn't automatically make everything we do non-commercial.)

  • Does the data you want to release incorporate (modified versions of) third-party data that you got elsewhere?

  • If so, under what licenses are these other data sets available, and do you have the necessary rights (copyright, database right where applicable, click-wrap licenses, disclaimers, etc.) to redistribute them?

anandgavai avatar anandgavai commented on May 26, 2024

  1. How can one cross-verify that the data matches its metadata, source and location?
  2. Can we reproduce data from a previous step (as mentioned in point 7 of Vincent's list above)?
  3. How can we create simulated data with the same expected characteristics as the original data?
  4. Is there any guideline for data-driven use cases?
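
For the simulated-data question, one possible starting point is sketched below in Python. It assumes a single numeric column whose values are roughly normally distributed, which will often not hold for real data, and it only matches the sample mean and standard deviation. The function name `simulate_like` is hypothetical.

```python
import random
import statistics


def simulate_like(values, n, seed=0):
    """Draw n simulated values matching the mean and spread of `values`.

    Assumption: the original values are approximately normal. For skewed
    or multi-modal data a different model would be needed.
    """
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation
    rng = random.Random(seed)         # fixed seed keeps the output reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


original = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = simulate_like(original, n=10_000)

# Cross-check: the simulated summary statistics should be close to the originals.
print(statistics.mean(original), statistics.mean(simulated))
print(statistics.stdev(original), statistics.stdev(simulated))
```

Running the same processing pipeline on `simulated` and on the original data, and comparing the results, is one way to cross-check that the pipeline behaves as expected.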
