opensciencemooc / module-5-open-research-software-and-open-source

Module 5: Open Research Software and Open Source

Home Page: https://eliademy.com/catalog/oer/module-5-open-research-software-and-open-source.html

License: MIT License

Jupyter Notebook 2.76% HTML 97.24%
open-source opensource open-source-design open-source-licensing open-source-hardware open-source-project open-source-community open-science open-research

module-5-open-research-software-and-open-source's People

Contributors

alexmorley, arfon, danielskatz, ericdwilkey, fkohrt, gabr-orl, heidiseibold, hsodaci, inacsmith, jcolomb, jolyphil, konrad, lhehnke, lmatthia, luiscamachocaballero, lwjohnst86, mooholl, mrchristian, nmstreethran, pablobernabeu, paultgriffiths, protohedgehog, raulcanay, rvosa, sarahsauve, sdruskat, tosteiner, trallard, zoranpandovski, zuphilip

module-5-open-research-software-and-open-source's Issues

Getting reproducible, from the Nature blog: food for thought?

from https://www.nature.com/articles/d41586-018-05990-5, while it is more for module 3 on data analysis, there is probably a connection.
just wondering...

Use code. Instead of pointing and clicking, use programming languages to download, filter, process and output your data, and command-line scripts to document how those tools are executed.

Go open-source. Code transparency is key to reproducibility, so use open-source tools whenever possible. “If you give me a black box with no source code and it just gives me numbers, as far as I am concerned, it’s a random-number generator,” says mathematician Les Hatton of Kingston University in London.

Track your versions. Using version-control software such as Git and GitHub, researchers can document and share the precise version of the tools that they use, and retrieve specific versions as necessary.

Document your analyses. Use computational notebooks such as Jupyter to interleave code, results and explanatory text in a single file.

Archive your data. Freeze data sets at key points — when submitting an article for publication, for example — with archiving services such as Zenodo, Figshare or the Open Science Framework.

Replicate your environment. Software ‘containers’, such as Docker, bundle code, data and a computing environment into a single package; by unboxing it, users can recreate the developer’s system. ReproZip, developed in the lab of New York University computer scientist Juliana Freire, simplifies container creation by watching program execution to identify its requirements. The commercial service Code Ocean and an open-source alternative, Binder, enable researchers to create and share executable Docker containers that users can explore in a web browser.

Automate. Automation provides reproducibility without users really having to think about it, says bioinformatician Casey Greene at the University of Pennsylvania in Philadelphia. Continuous integration services such as Travis CI can automate quality-control checks, for instance, and the Galaxy biocomputing environment automatically logs details of the jobs it runs.

Get help. Resources abound for interested researchers; see practicereproducibleresearch.org, for instance, or find a Software Carpentry workshop near you to learn basic computing skills.
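A few of the tips above (use code, track your versions, archive your data) can be sketched in a handful of lines of Python. This is only an illustration and is not taken from the Nature article: the `provenance_record` helper, the script name and the sample data are all hypothetical.

```python
import hashlib
import json
import platform
import sys


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(data: bytes, script_name: str) -> dict:
    """Collect a small record of what was run, on which data, with which Python.

    The field names are hypothetical; adapt them to your own project.
    """
    return {
        "script": script_name,
        "data_sha256": sha256_of_bytes(data),
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }


if __name__ == "__main__":
    # Hypothetical in-memory data set standing in for a real file.
    example_data = b"sample_id,value\n1,0.5\n2,0.7\n"
    record = provenance_record(example_data, "analysis.py")
    print(json.dumps(record, indent=2))
```

Storing such a record next to the archived data set would let a reader check that they are working with the same bytes and a comparable Python version.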

Question about 'existing platforms and tools for Open Source Software' section

Existing platforms and tools for Open Source Software

Virtual environments and machines are becoming increasingly popular as high-powered research workflow enablers. Popular services include Google Cloud and Amazon Web Services, which also assist with database storage and content delivery, as well as computational power. InsideDNA is a computing platform for reproducible research in bioinformatics, genomics and the life sciences.

This first paragraph reads a little strangely to me and it's not clear how it's connected to the Open Source topic area. A couple of things you could be trying to say here:

  1. Virtual environments and machines are becoming increasingly popular as high-powered research workflow enablers and many of these are built upon open source software (operating systems, programming languages, data processing frameworks etc.)
  2. Popular services such as Google Cloud and Amazon Web Services, which also assist with database storage and content delivery as well as computational power, provide a great venue for running your open source software.
  3. Something else?

expand content to talk about code

One stated objective is to transform code one is writing for oneself into code others can use.

It would be nice to have something about that in the course (write code for re-use, docking, licenses, ....).

Any thoughts?

There's nothing special (line)

Hi ,

There is a line about Zenodo and other DOI services.

https://github.com/OpenScienceMOOC/Module-5-Open-Research-Software-and-Open-Source/blame/master/content_development/MAIN.md#L210

I will clarify that this is specifically about DOIs and not the type of repository, as they are all very different from one another.

There is a point here that is worth logging for later, though: neither Figshare nor Zenodo offers adequate long-term preservation guarantees, and what they do say on this matter is very basic. To be fair, that is not what they are there for.

Simon

Quiz

All the right answers are "false"; maybe we could rephrase some. I think we could go to 20 questions, with more technical aspects to it?

how to contribute to existing open source research projects

I think it would be nice to have a little module on open-source code, with R, but also Python and Julia, as possible languages. The idea is to develop a little module with open-source code so that learners can contribute. The final goal, although it sounds like hard work, is to contribute to one of these open source projects.

This idea is similar to what has been proposed by @jcolomb, but maybe at a much lower level. I would be happy to contribute to an idea like the one proposed by JC, but it sounds aimed at an expert audience. I believe that, as this is an introductory course, it would be nice to show people how to actively contribute to an existing project.

Please let me know your suggestions.

External Links

It would be great if external links opened in a new tab automatically. For the markdown version, opening links in a new tab works with ctrl+click, but it doesn't work in the Python version. This can probably be fixed by tweaking the html, but as noted below, this might need another fix as well.

Also, internal links (Table of Contents) don't seem to work in the Python version. This is probably because of the conversion not working with html.
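One way the html tweak could look is sketched below, assuming the converted page is available as an HTML string. The function name is hypothetical and nothing here is taken from the module's actual conversion pipeline. It uses a regular expression for brevity; a proper HTML parser would be more robust.

```python
import re

# Matches opening <a ...> tags whose href starts with http:// or https://.
# Internal links such as href="#toc" are deliberately left alone.
_EXTERNAL_LINK = re.compile(
    r'<a\s+([^>]*href="https?://[^"]*"[^>]*)>', re.IGNORECASE
)


def open_external_links_in_new_tab(html: str) -> str:
    """Add target="_blank" to external links that do not already set a target."""

    def add_target(match: re.Match) -> str:
        attrs = match.group(1)
        if "target=" in attrs.lower():
            # Respect an explicit target chosen by the author.
            return match.group(0)
        return f'<a {attrs} target="_blank" rel="noopener">'

    return _EXTERNAL_LINK.sub(add_target, html)


if __name__ == "__main__":
    page = '<a href="https://example.org">external</a> <a href="#toc">contents</a>'
    print(open_external_links_in_new_tab(page))
```

The `rel="noopener"` attribute is added alongside `target="_blank"` as a common precaution so the new tab cannot script the page that opened it.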

Content: writing good documentation

In line with issue #29, this came across my Twitter timeline and made me realize that, unless I have overlooked it, I have not seen anything yet in the MOOC about writing documentation (what it entails, what guidelines to follow, what tools might exist).

Is that something worth considering or adding?

I was not sure whether it was better suited to this module or to module 3.

Note at end of Software citation section

I think we need a note at the end of the software citation section to instruct readers that they need to use the next section and task two to learn how to 'make code citable'.

Otherwise, the way headers are organized at the moment the logical order does not indicate that subheaders are necessarily linked.

A box with something along these lines would do.

For instructions on 'making your code citable' read the 'Using GitHub and Zenodo' section as well as 'Task 2: Linking GitHub and Zenodo'.

Add the titles of articles in Reading material folder.

Is it possible to change the names of the articles in the Reading Material folder on GitHub? I think that if we renamed each article with its title, it would be easier to choose the paper you want to read.

Please let me know if this is possible and how we could do it.

software citation learning outcome

I would like to suggest a learning outcome:

  • Software developers will be able to make their software citable, and software users will understand how to cite the software they use.
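To make the "how to cite" half of this outcome concrete, here is a minimal sketch of assembling a software citation from its basic metadata. The `SoftwareRecord` fields, the output layout and the sample values (including the placeholder DOI) are hypothetical; the module has not settled on a citation style.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoftwareRecord:
    """Basic metadata needed to cite a piece of software (hypothetical fields)."""

    authors: List[str]
    title: str
    version: str
    year: int
    doi: str


def format_software_citation(record: SoftwareRecord) -> str:
    """Render one possible citation layout; adapt it to the chosen style."""
    names = ", ".join(record.authors)
    return (
        f"{names} ({record.year}). {record.title} "
        f"(Version {record.version}) [Computer software]. "
        f"https://doi.org/{record.doi}"
    )


if __name__ == "__main__":
    # All values below are placeholders, not a real software record.
    example = SoftwareRecord(
        authors=["Doe, J.", "Roe, R."],
        title="Example Tool",
        version="1.2.0",
        year=2018,
        doi="10.5281/zenodo.0000000",
    )
    print(format_software_citation(example))
```

The point of the sketch is that version and a persistent identifier are part of the citation, so a reader can retrieve the exact release that was used.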

Citation Style and starting a Zotero Group?

What citation style are we using?

Can we start a Zotero group to store the sources being used? It's more efficient and lots can be done with it. FYI, we can embed all the citations so that a user can choose to download all or some of the citations in one go.

I could start one up!

Simon

authors in Zenodo

Hi,

We need clarification about where authors are derived from in Zenodo when importing a GitHub repo. Is it from the GitHub user account, or is it from the CONTRIBUTOR file?

Also note in Zenodo you can add authors.

I could test out where the authors are derived from but if someone has the answer that will be easier.

When we have clarification I'll edit para.

Simon
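One option that may sidestep the question: to my knowledge, Zenodo's GitHub integration reads an optional `.zenodo.json` file in the repository root and takes authors from its `creators` list instead of guessing them. Please treat that as an assumption to verify against the Zenodo documentation. The sketch below only builds such a file's content; the helper name, title, names and affiliations are hypothetical.

```python
import json
from typing import Dict, List, Tuple


def build_zenodo_metadata(title: str, creators: List[Tuple[str, str]]) -> Dict:
    """Build metadata in the shape assumed for a .zenodo.json file.

    `creators` is a list of (name, affiliation) pairs.
    """
    return {
        "title": title,
        "upload_type": "software",
        "creators": [
            {"name": name, "affiliation": affiliation}
            for name, affiliation in creators
        ],
    }


if __name__ == "__main__":
    # Placeholder values only; replace with the real project metadata.
    metadata = build_zenodo_metadata(
        "Example project",
        [("Doe, Jane", "Example University"), ("Roe, Richard", "Example Institute")],
    )
    # Writing this output to a file named .zenodo.json is left to the user.
    print(json.dumps(metadata, indent=2))
```

If this works as assumed, the paragraph could recommend stating authors explicitly rather than relying on whatever Zenodo derives by default.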

mention alternative to github-zenodo

At the moment, the content focuses on the GitHub-Zenodo integration and workflow.

It seems to me that the content is not software specific. I have a non-software project (rdmpromotion.rbind.io) which follows the GitHub-Zenodo scheme. Other topics should come forward (write code for re-use, docking, licenses, ....)

On the other hand, focusing on one tool is not the way I would like to see this going. The principles are more important. Two years from now, a GitLab/Figshare workflow may exist and be better (who knows?). But the principles of a version-controlled platform allowing collaboration for development and backup, plus a repository for long-term storage and citation, will still be there(?). (BTW, Zenodo gives no guarantee about how long the storage will last (I think), so other repositories may be more useful.) So it should be clear that the tasks are examples (even if that's the easiest way to go at the moment).

Am I too early with these comments?

Checklist for citing your project

Hi,

This needs clarification as currently it's not clear.

https://github.com/OpenScienceMOOC/Module-5-Open-Research-Software-and-Open-Source/blame/master/content_development/Task_2.md#L81

I would suggest the following:

GitHub project has been linked to Zenodo and you have the DOI button pasted on your README file in GitHub

Zenodo and GitHub integrated setup works nicely.
(this needs a sub-list of things to check, i.e., what are the parts that define nicely. Will have a think)

Simon
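The first checklist item could even be checked automatically. The sketch below builds the Markdown for a DOI badge and tests whether a README contains it. The badge URL pattern is my assumption about what Zenodo serves and should be checked against the badge Zenodo actually offers for the record. The function names and the DOI are placeholders.

```python
def doi_badge_markdown(doi: str) -> str:
    """Return Markdown for a DOI badge linking to the DOI resolver.

    The badge image URL pattern is an assumption, not confirmed by the module.
    """
    badge_url = f"https://zenodo.org/badge/DOI/{doi}.svg"
    target_url = f"https://doi.org/{doi}"
    return f"[![DOI]({badge_url})]({target_url})"


def readme_has_badge(readme_text: str, doi: str) -> bool:
    """Check whether the README already contains the badge for this DOI."""
    return doi_badge_markdown(doi) in readme_text


if __name__ == "__main__":
    placeholder_doi = "10.5281/zenodo.0000000"  # not a real record
    readme = "# My project\n\n" + doi_badge_markdown(placeholder_doi) + "\n"
    print(readme_has_badge(readme, placeholder_doi))
```

Further checks for the "works nicely" item would need the sub-list mentioned above before they could be written down.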

Writing style guide: we need one

OSMOOC should have a page for a writing style guide. It could be that the project uses an existing one, like Chicago, and then lists variations such as mixing US and UK spellings.

Extra guidance can be given on styles, for example on using or not using bold for emphasis. The Nielsen guides are good on web styles.

Nielsen, Jakob. 1997. ‘How Users Read on the Web’. Nielsen Norman Group (blog). 1 October 1997. https://www.nngroup.com/articles/how-users-read-on-the-web/.

‘Writing Digital Copy for Domain Experts’. n.d. Nielsen Norman Group. Accessed 17 November 2017. https://www.nngroup.com/articles/writing-domain-experts/.

There will also be preferences of terms and acronyms.

Cover research software

There is very little about research software, and quite a lot about generic open source (understandable, because there is more material to borrow from). As examples of open source software, we name LibreOffice and Firefox, which don't make the case at all: most people use Microsoft Office and Chrome, and the alternatives seem like, well, alternatives. What we need to discuss here, I think, is actual open source software used in research, such as R and RStudio, Python tools, QGIS, basically all of the bioinformatics tools, etc., and point out the culture that surrounds them, where scientists develop tools and libraries, and publish them.

Incorrect description of OSS vs. free/libre software

In the section The Open Source community and its governance it reads:

The core principle of re-use is what separates OSS from ‘Free Software’.

and:

The big difference between free software and OSS is that the former must distribute updated versions under the same license as the original, whereas newer versions of OSS can be distributed under different licenses. FOSS combines the best of both worlds.

What I really like about how the module is currently written is that it makes clear that open source is not just about the source being accessible. However, the cited paragraphs are probably misleading in the difference between free software and OSS: It's not about copyleft/re-use. I'll detail that in the following, along with some other suggestions for improvement. The last paragraph contains an idea on how I would explain the difference between free and OSS in this module.

Differentiation between software and license

I think it would be better to differentiate more precisely between the software and its license. Free software, as understood by the FSF, respects the user's freedom. A necessary requirement is that it is accompanied by a free license, because the default is that all rights stay with the author. [It is probably not the only requirement, as some software under a free license still artificially limits its users (example here). For now, let's call software under a free license free and software under an open source license open source.]

Accordingly, for a software license to be free, the license does not need to have a copyleft clause. Take, for example, the MIT license (Expat or X11, to be precise): it's not a copyleft license but still a free software license (source; more about this overlap here).

Differentiation between license requirement and license itself

I think the relationship between license requirements and licenses could also be highlighted more: OSI's OSS definition is a license requirement, as is the FSF's definition of free software (and thus, free software licenses). Various licenses may then be either free, open source, or both. It then can be argued that free may be a subset of open source:

Differences between free and open source software

It is expected by the FSF that all free software should also be considered OSS by the OSI. For the other way round, there may be cases where this is not true, but most OSS should be free software as well. If requiring free software over OSS is not so much about the amount of software matching those criteria, what else is it about?

Differences between requiring free software vs. OSS

Free software is about protecting the user's freedom. From this perspective, nonfree software is a social problem. OSS instead has a more pragmatic approach; it is centered around the product being better in the end. Nonfree software just cuts efficiency on that road.

This makes clear that requiring free software instead of OSS is about spreading freedom as a value.

FOSS

Now, regarding FOSS: It's a term derived to cover both OSS and free software. Its practical value may be limited, however, as it only adds to the complexity and stays neutral about the actual difference between the two. I certainly wouldn't call it “the best of both worlds”, which sounds as if open source and free are features that can be meaningfully combined, and not philosophies with different goals. Free software should (see above) also be OSS, so the only “best of both worlds” would be to choose free software.

Copyleft vs. public domain

While the difference between OSS licenses and free software licenses is not about copyleft clauses, the copyleft vs. public domain conflict may still be worth mentioning.

For someone who believes in the user's freedom, is it better to aim for the most permissive licensing (releasing into the public domain), thus allowing others to restrict the users of their derived versions again? Or is it better to make one's own work more restrictive in the first place (copyleft), contrary to the higher goal of freedom, but more resistant in the long run?

The OpenStreetMap's wiki has some thoughts about this here.

Further thoughts

Similar to the comparison between open source and free is that between Fecher's and Friesike's Pragmatic School and Democratic School: again, it's about (research) efficiency vs. access to knowledge as a means on its own. These different goals and values may still lead to the same practical behavior.

This actually may be the best way to introduce the reader to the differences without adding too much complexity.

Another interesting comparison is that to Stalder's Commons vs Post-democracy, which are about input and output legitimation. But probably not for this guide ;)

Bold for emphasis?

I notice bolding isn't being used for emphasis.

For web reading it's a good idea to use it to facilitate scan reading, see:

Nielsen, Jakob. 1997. ‘How Users Read on the Web’. Nielsen Norman Group (blog). 1 October 1997. https://www.nngroup.com/articles/how-users-read-on-the-web/.

I would suggest a style guide is added.

For the existing module or complete MOOC it would make sense for this to be done in bulk copy editing sweeps.

Simon

Software Citation - section addition

Hi,

At the Open Science Lab I am coordinating examination of the topic of 'Software Citation' as an editorial theme to run over May 2018 on a new Open Science Platform. OSL: https://www.tib.eu/en/research-development/open-science/

We can offer to add information on the subject of 'Software Citation'.

We are asking community members to make blog contributions, take part in discussion, create how-tos (or pointers to existing help resources).

The topic is outlined here https://wiki.tib.eu/confluence/display/osp/Theme%3A+Software+Citation

The FORCE11 Software Citation working groups give a good basis for understanding the topic, especially the paper Software Citation Principles (very big tip of hat):

Smith, Arfon M., Daniel S. Katz, and Kyle E. Niemeyer. ‘Software Citation Principles’. PeerJ Computer Science 2 (19 September 2016): e86. https://doi.org/10/bw3g.

Thanks

Simon Worthington - Open Science Lab
TIB – German National Library of Science and Technology

PLOS Open Source Toolkit

A potentially useful resource: https://channels.plos.org/open-source-toolkit

The Open Source Toolkit features articles and online projects describing hardware and software that can be used in a research and/or science education settings across different fields, from basic to applied research. The Channel Editors aim to showcase how open source tools can lead to innovation, democratization and increased reproducibility.

Channel Editors select content from PLOS journals and also highlight articles from the broader literature. Additional resources such as open source projects, preprints, information about repositories and policy discussions that are deemed of particular interest are also featured.

If you have already published your open source research at PLOS or elsewhere and would like to see it featured in the Open Source Toolkit please email [email protected] and your paper will be sent to the Channel Editors for consideration, or tweet us @PLOSChannels with the hashtag #PLOSOpenSource

Thanks,

Nathaniel

Add in to next release

The NumFOCUS is a nonprofit organization that supports and promotes world-class, innovative, open source scientific software (http://www.numfocus.org). The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. A list of sponsored projects is available at http://goo.gl/VQgw0M. Amongst these:

  • The IPython (Interactive Python, http://ipython.org), with the Jupyter Notebook available at https://jupyter.org, and a gallery of interactive Notebooks available at https://goo.gl/z3HgwH.

  • The rOpenSci (R Open Science, http://ropensci.org), which promotes the open source R statistical environment for transparent and reproducible research. A list of open source rOpenSci packages is available at http://ropensci.org/packages/. Some of these packages enable communication with widely used repositories, such as Figshare and Dryad.

  • The Software Carpentry (http://software-carpentry.org) is a training organization that runs workshops and lessons to teach scientists basic computing skills. All educational materials are developed collaboratively online on GitHub (https://github.com/swcarpentry), and are distributed under the open CC-BY license.

  • The Data Carpentry (http://www.datacarpentry.org) is a sister organization to Software Carpentry, and aims to teach basic concepts, skills and tools for working more effectively with data. Again, lessons and workshop materials are available online and distributed under the open CC-BY license.

Footnote 8 in the source: a gallery of Notebooks is available at http://nb.bianp.net

Source: https://doi.org/10.7287/peerj.preprints.2689v1

hyperlinks need to be corrected
