carpentries-lab / reviews
Open peer review of lessons from The Carpentries community.
License: Other
Introduction to Geospatial Raster and Vector Data with Python
https://github.com/carpentries-incubator/geospatial-python
The data used in this lesson include optical satellite images from the Copernicus Sentinel-2 mission and public geographical datasets from the dedicated distribution platform of the Dutch government. These are real-world datasets with sufficient complexity to teach many aspects of data analysis and management. They have been selected to allow students to focus on the core ideas and skills being taught, while offering the chance to encounter common challenges with geospatial data.
The lesson walks the learner through accessing, fetching, inspecting, munging, and visualizing satellite imagery and geospatial vector data.
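To give a flavour of the kind of raster computation such a lesson builds toward, here is a minimal sketch (illustrative only, not taken from the lesson) of computing a vegetation index from two Sentinel-2 bands represented as NumPy arrays; the reflectance values are made up:

```python
import numpy as np

# Hypothetical reflectance values for the red (B04) and near-infrared (B08)
# Sentinel-2 bands; in practice these would be read from a (Cloud Optimized)
# GeoTIFF rather than typed in by hand.
red = np.array([[0.10, 0.20], [0.30, 0.40]])
nir = np.array([[0.50, 0.60], [0.70, 0.80]])

# NDVI = (NIR - red) / (NIR + red), a standard vegetation index;
# values near 1 indicate dense vegetation, values near 0 bare ground.
ndvi = (nir - red) / (nir + red)
print(ndvi)
```

The same element-wise arithmetic applies unchanged to full-size satellite scenes, since NumPy broadcasts the operation across every pixel.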
@rbavery @fnattino @rogerkuou @mkuzak
No response
This lesson was originally ported from the Geospatial R lesson, using the same datasets.
The lesson has been revamped to highlight Python libraries for fetching raster and vector data from cloud-hosted data stores and to take advantage of the Cloud Optimized GeoTIFF format.
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
Toby Hodges
Any other carpentries community members who are into geospatial!
FAIR in (biological) practice
https://github.com/carpentries-incubator/fair-bio-practice
https://carpentries-incubator.github.io/fair-bio-practice/
Open Science is disruptive. It will change how we do research and how society benefits from it. Making data re-usable is key to this, and the FAIR principles are a way to achieve it.
But what does it mean in practice?
How can a biologist incorporate those principles in their workflow?
We will learn that becoming FAIR and following OS practices is a process.
We will learn how to work more efficiently with data.
We will teach you how, by planning ahead and using the right set of tools, you can make your outputs ready for public sharing and reuse.
This hands-on workshop, delivered in four half-day sessions, covers the basics of Open Science and FAIR practices and looks at how to use these ideas in your own projects. The workshop is a mix of lectures and hands-on lessons in which you will apply the approaches learned and implement some of the discussed practices.
The course is aimed at active researchers in biomedical science (PhD students, postdocs, technicians, young PIs, etc.) who are interested in Open Science, the FAIR (Findable, Accessible, Interoperable and Reusable) principles, and efficient data management. This training is aimed at those who want to become familiar with these concepts and apply them throughout their project's life cycle. The course is covered in four half days.
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Introduction to Bioinformatics workflows with Nextflow and nf-core
https://github.com/carpentries-incubator/workflows-nextflow
https://carpentries-incubator.github.io/workflows-nextflow/
This lesson is a three-day introduction to the workflow manager Nextflow and to nf-core, a community effort to collect a curated set of analysis pipelines built using Nextflow.
Nextflow enables scalable and reproducible scientific workflows using software environments such as Conda. It allows the adaptation of pipelines written in the most common scripting languages, such as Bash, R and Python. Nextflow is a Domain Specific Language (DSL) that simplifies the implementation and deployment of complex parallel and reactive workflows on clouds and clusters.
This lesson also introduces nf-core: a framework that provides a community-driven, peer reviewed platform for the development of best practice analysis pipelines written in Nextflow.
This lesson motivates the use of Nextflow and nf-core as a development tool for building and sharing computational pipelines that facilitate reproducible (data) science workflows.
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Responsible machine learning in Python
https://github.com/carpentries-incubator/machine-learning-responsible-python
https://carpentries-incubator.github.io/machine-learning-responsible-python/
This lesson explores key topics on the responsible application of machine learning. The lesson is presented as a series of case studies that illustrate real world examples. Sections cover a broad range of topics, including reproducibility, bias, and interpretability. Broadly the topics are ordered chronologically, appearing as they would when thinking through a research study.
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Reproducible Computational Environments Using Containers: Introduction to Docker
https://github.com/carpentries-incubator/docker-introduction
https://carpentries-incubator.github.io/docker-introduction/
This lesson aims to teach researchers how to use and create containers using Docker. It requires knowledge of navigating the Unix shell and using a text editor, but is otherwise intended for individuals with no prior experience using Docker or containers.
@dme26 @jcohen02 @ChristinaLK @aturner-epcc @sstevens2
No response
To our knowledge this lesson is not similar to any of the lessons in the Carpentries Lab or Lesson programs.
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
mkuzak colinsuaze gcapes mmalenta
Short-format lesson ideas (1.5 to 3 hours):
Workshop ideas (1 to 2 days):
Thank you for your interest in submitting your lesson for review in The Carpentries Lab!
Please respond to the prompts below to complete your submission.
Check boxes by adding an 'x' between the square brackets at the start of each point,
or submit the issue and check off the boxes afterwards.
What is the title of the lesson?
Metagenomics Workshop Overview
Provide URLs to
The lesson repository:
https://github.com/carpentries-incubator/metagenomics-workshop
The lesson homepage:
https://carpentries-incubator.github.io/metagenomics-workshop/
Briefly describe the lesson (50 words or fewer).
What does it aim to teach and to whom?
This workshop teaches data management and analysis for metagenomics research, including best practices for organizing bioinformatics projects and data, and connecting to and using cloud computing. It also provides experience using command-line utilities and tools to analyze sequence quality, and RStudio and R libraries to compare diversity between samples.
If you are submitting this lesson for review on behalf
of multiple authors, list the GitHub usernames below for
all authors who should receive notifications relating to the review.
AbrahamAvelar, aaronejaime, fabel134, Vanessaarfer, Czirion, Bedxxe, nselem, BwanyaBrian, EdderDaniel, ahmedmoustafa
Provide URLs to workshop webpages and/or an issue
on the lesson repository from any beta pilots of the lesson.
(A beta pilot is a workshop where the lesson was taught
by any instructor who was not part of the lesson development team
before the pilot took place.)
Alpha workshop https://betterlabmx.github.io/2020-11-27-BetterLab/
First beta pilot https://czirion.github.io/2021-06-30-BetterLab-online/
The second beta pilot was a university course. Here we provide the university's letter to the professor: Guanajuato University Letter
(Optional) If you have obtained a DOI for the lesson via Zenodo,
paste that DOI below.
10.5281/zenodo.4285901
If the lesson is similar in topic to any other lesson
already included in The Carpentries Lab and/or
The Carpentries Lesson Programs (Software, Library, and Data Carpentry), briefly describe how this lesson differs and why a separate lesson was developed.
This workshop is a curriculum comprising four lessons. The first two lessons are adapted to metagenomics from the Data Carpentry Genomics curriculum. The third part includes a brief introduction to R, and the fourth lesson teaches a complete shotgun metagenomics workflow using public data, which was not previously included in The Carpentries' lessons but is a topic of interest to the biological community.
Check the boxes to confirm that the lesson
If you wish to submit the lesson for publication in
the Journal of Open Source Education (JOSE):
(see the repository README for more details):
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
Introduction to genomic data analysis with R and Bioconductor
https://github.com/carpentries-incubator/bioc-intro/
https://carpentries-incubator.github.io/bioc-intro/
This is the first part of a series of lessons prepared by the Bioconductor teaching group. It introduces general data science with R and the tidyverse, and the Bioconductor SummarizedExperiment data structure for quantitative omics assays. It is meant for complete beginners and has no prerequisites.
No response
The lesson is similar to the ecology lesson, but with a focus on biomedical data, and has a dedicated chapter on Bioconductor.
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
High-dimensional Statistics with R
https://github.com/carpentries-incubator/high-dimensional-stats-r
https://carpentries-incubator.github.io/high-dimensional-stats-r/
This course is intended for those who have a working knowledge of statistics and linear models with R and wish to learn high-dimensional statistical methods with R.
This is a short course aimed at familiarising learners with statistical and computational methods for the extremely high-dimensional data commonly found in biomedical and health sciences (e.g., gene expression, DNA methylation, health records). These datasets can be challenging to approach, as they often contain many more features than observations, and it can be difficult to distinguish meaningful patterns from natural underlying variability. To this end, we will introduce and explain a range of methods and approaches to disentangle these patterns from natural variability. After completion of this course, learners will be able to understand, apply, and critically analyse a broad range of statistical methods. In particular, we focus on providing a strong grounding in high-dimensional regression, dimensionality reduction, and clustering.
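The course itself teaches these methods in R. Purely to illustrate the idea of dimensionality reduction on data with more features than observations, here is a small Python sketch (not from the lesson) of PCA computed via the singular value decomposition of a synthetic matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# A toy "high-dimensional" matrix: 10 observations, 50 features,
# i.e. many more features than observations, as described above.
X = rng.normal(size=(10, 50))

# Centre each feature, then take the SVD of the centred matrix;
# the rows of Vt are the principal axes, ordered by explained variance.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project the data onto the first two principal components.
scores = Xc @ Vt[:2].T
print(scores.shape)  # (10, 2)
```

The singular values `s` come back in decreasing order, so the first components always capture the largest share of the variability.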
@alanocallaghan
@catavallejos
@ailithewing
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
I would like to propose a lesson on Machine Learning with scikit-learn and Python.
I've created a lesson which I ran at a Spring school earlier this year. I then ran a session on this at Carpentry Connect Manchester last month where some additional ideas for developing this course were gathered. These still need to be implemented.
repository: https://github.com/machinelearningcarpentry/machine-learning-novice
webpage: https://machinelearningcarpentry.github.io/machine-learning-novice/
Slides talking about the motivation for developing the course and experience of running it the first time: https://github.com/machinelearningcarpentry/machine-learning-novice/blob/ccmcr19/slides.pdf
Material from the discussion at carpentry connect:
https://github.com/machinelearningcarpentry/machine-learning-novice/tree/ccmcr19
I would like to submit to the CarpentriesLab the lesson on Programming with GAP:
I wrote a couple of blog posts about this lesson some time ago:
and also described it in my CarpentryCon talk this morning: https://doi.org/10.6084/m9.figshare.8330060.v1
Use this: https://blog.github.com/2018-01-25-multiple-issue-and-pull-request-templates/
and have one template for people who want to suggest an idea for a lesson, and another one for people who are ready to submit a lesson proposal.
Introduction to Tree Models in Python
https://github.com/carpentries-incubator/machine-learning-trees-python
https://carpentries-incubator.github.io/machine-learning-trees-python/
Decision trees are a family of algorithms that are based around a tree-like structure of decision rules. These algorithms often perform well in tasks such as prediction and classification. This lesson explores the properties of tree models in the context of mortality prediction.
The dataset that we will be using for this project is a subset of the eICU Collaborative Research Database that has been created for demonstration purposes.
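As a toy illustration (not taken from the lesson or its dataset) of the kind of decision rule that tree models chain together, consider a single hand-written stump; the feature names and thresholds here are entirely made up:

```python
# A toy decision stump: one tree-like chain of decision rules.
# The thresholds and the risk labels are hypothetical, for illustration only.
def predict_risk(age, heart_rate):
    """Return a made-up risk label from two hand-picked splits."""
    if age > 65:
        if heart_rate > 100:
            return "high"
        return "medium"
    return "low"

print(predict_risk(70, 110))  # "high"
print(predict_risk(40, 80))   # "low"
```

A fitted decision tree learns such split thresholds from data rather than taking them from a human, and typically stacks many more levels of rules.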
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
I have developed a lesson that provides an introduction to Conda for (data) scientists.
Conda is an open source package and environment management system that runs on Windows, macOS and Linux. Conda installs, runs, and updates packages and their dependencies, and easily creates, saves, loads, and switches between environments on your local computer. While Conda was created for Python programs, it can package and distribute software for any language, such as R, Ruby, Lua, Scala, Java, JavaScript, C/C++ and Fortran. This lesson motivates the use of Conda as a development tool for building and sharing project-specific software environments that facilitate reproducible (data) science workflows.
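To give a flavour of the create/save/switch workflow described above, a typical Conda session looks roughly like this (the environment name and package list are illustrative, not from the lesson):

```shell
# Create a project-specific environment with pinned packages
conda create --name my-project python=3.10 numpy pandas

# Switch into it, work, then record the environment for sharing
conda activate my-project
conda env export > environment.yml

# A collaborator can recreate the same environment from that file
conda env create --file environment.yml
```

Committing the exported `environment.yml` alongside the analysis code is what makes the workflow reproducible by others.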
Link to the actual lesson:
https://kaust-vislab.github.io/introduction-to-conda-for-data-scientists/
Link to the repo:
https://github.com/kaust-vislab/introduction-to-conda-for-data-scientists
I teach this lesson to an audience of graduate students (MSc and PhD), post-docs, research scientists and faculty at KAUST. I look forward to getting feedback from the community!
RNA-seq analysis with Bioconductor
https://github.com/carpentries-incubator/bioc-rnaseq
https://carpentries-incubator.github.io/bioc-rnaseq/
This is the second part of a series of lessons prepared by the Bioconductor teaching group. It is designed to equip participants with the essential skills and knowledge needed to analyze RNA-seq data using the Bioconductor ecosystem. It is aimed at learners who already have a basic knowledge of R.
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Introduction to Machine Learning in Python
https://github.com/carpentries-incubator/machine-learning-novice-python
https://carpentries-incubator.github.io/machine-learning-novice-python/
This lesson provides an introduction to some of the common methods and terminologies used in machine learning research. We cover areas such as data preparation and resampling, model building, and model evaluation.
It is a prerequisite for the other lessons in the machine learning curriculum. In later lessons we explore tree-based models for prediction, neural networks for image classification, and responsible machine learning.
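To sketch the data-preparation, resampling, and evaluation steps mentioned above, here is a minimal hold-out split and accuracy computation in Python (all data and the trivial "model" are synthetic, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: 100 samples, 3 features, with labels defined by feature 0.
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)

# Resampling via a hold-out split: shuffle indices, keep 80% for training.
idx = rng.permutation(len(X))
train, test = idx[:80], idx[80:]

# A deliberately trivial "model": predict 1 when feature 0 is positive.
pred = (X[test, 0] > 0).astype(int)

# Model evaluation: accuracy on the held-out test set.
accuracy = (pred == y[test]).mean()
print(accuracy)  # 1.0 here, because the labels were defined from feature 0
```

In a real study the model would of course be fitted on the training indices only, and evaluated on the untouched test indices to estimate generalisation.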
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
placeholder issue to create review badge for "Python for Atmosphere and Ocean Scientists" lesson
Introduction to deep learning
https://github.com/carpentries-incubator/deep-learning-intro
https://carpentries-incubator.github.io/deep-learning-intro/
This is a hands-on introduction to the first steps in Deep Learning, intended for researchers who are familiar with (non-deep) Machine Learning.
The use of Deep Learning has seen a sharp increase in popularity and applicability over the last decade. While Deep Learning can be a useful tool for researchers from a wide range of domains, taking the first steps in the world of Deep Learning can be somewhat intimidating. This introduction aims to cover the basics of Deep Learning in a practical and hands-on manner, so that upon completion you will be able to train your first neural network and understand what next steps to take to improve the model.
We start by explaining the basic concepts of neural networks and then go through the different steps of a Deep Learning workflow. Learners will learn how to prepare data for deep learning, how to implement a basic Deep Learning model in Python with Keras, how to monitor and troubleshoot the training process, and how to implement different layer types such as convolutional layers.
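The lesson implements its models with Keras; purely to illustrate what a neural network computes, here is a single forward pass through a tiny network sketched in plain NumPy (the weights are made up, not learned):

```python
import numpy as np

def relu(x):
    # Rectified linear unit: the standard hidden-layer activation.
    return np.maximum(0, x)

def sigmoid(x):
    # Squashes the output into (0, 1), e.g. for binary classification.
    return 1.0 / (1.0 + np.exp(-x))

# Made-up weights for a tiny network: 2 inputs -> 3 hidden units -> 1 output.
W1 = np.array([[0.5, -0.2, 0.1],
               [0.3,  0.8, -0.5]])
b1 = np.zeros(3)
W2 = np.array([[1.0], [-1.0], [0.5]])
b2 = np.zeros(1)

x = np.array([1.0, 2.0])             # one input sample
hidden = relu(x @ W1 + b1)           # hidden layer activations
output = sigmoid(hidden @ W2 + b2)   # network output, a value in (0, 1)
print(output)
```

Training, as covered in the lesson, is the process of adjusting `W1`, `b1`, `W2`, `b2` so that outputs match known labels; Keras automates both this forward pass and the weight updates.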
@dsmits @psteinb @cpranav93 @colinsauze @CunliangGeng
10.5281/zenodo.8308392
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Introduction to artificial neural networks in Python
https://github.com/carpentries-incubator/machine-learning-neural-python
https://carpentries-incubator.github.io/machine-learning-neural-python/
This lesson gives an introduction to artificial neural networks. We begin by outlining an important application of machine learning in healthcare: the development of algorithms for the classification of chest X-ray images. During the lesson we explore how to prepare and visualise data for algorithm development, and construct a neural net that is able to classify disease.
No response
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Snakemake for Bioinformatics
https://github.com/carpentries-incubator/snakemake-novice-bioinformatics
https://carpentries-incubator.github.io/snakemake-novice-bioinformatics/
Researchers needing to implement data analysis workflows face a number of common challenges, including the need to organise tasks, make effective use of compute resources, handle any errors in processing, and document and share their methods. The Snakemake workflow system provides effective solutions to these problems. By the end of the course, you will be confident in using Snakemake to run real workflows in your day-to-day research.
Snakemake workflows are described by special scripts that define steps in the workflow as rules, and these are then used by Snakemake to construct and execute a sequence of shell commands to yield the desired output. Re-calculation of existing results is avoided where possible, so you can add or update input data, then efficiently generate an updated result. Workflows can be seamlessly scaled to server, cluster, grid and cloud environments without the need to modify the workflow definition.
This course is primarily intended for researchers who need to automate data analysis tasks for biological research involving next-generation sequence data, for example RNA-seq analysis, variant calling, ChIP-seq, bacterial genome assembly, etc. However, Snakemake has many uses beyond this, and the course does not assume any specialist biological knowledge. The language used to write Snakemake workflows is Python-based, but no prior knowledge of Python is required or assumed either. We do require that attendees have familiarity with using the Linux command line (pipes, redirects, variables, …).
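The rule structure described above looks roughly like the following sketch of a Snakefile fragment (the file names and shell command are illustrative, not taken from the course):

```
# A minimal Snakemake rule: declared inputs and outputs plus a shell command.
# Snakemake fills in {sample} by matching requested output file names.
rule count_reads:
    input:
        "reads/{sample}.fastq"
    output:
        "counts/{sample}.txt"
    shell:
        "wc -l {input} > {output}"
```

From such rules Snakemake derives the dependency graph, runs only the commands whose outputs are missing or out of date, and can dispatch the resulting jobs to a cluster or cloud unchanged.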
@
No response
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
No response
Good Enough Practices in Scientific Computing
https://github.com/carpentries-incubator/good-enough-practices
https://carpentries-incubator.github.io/good-enough-practices
This repository contains a three-hour Carpentries-format lesson covering Good Enough Practices in Scientific Computing (Wilson et al., 2017): "a set of good computing practices that every researcher can adopt, regardless of their current level of computational skill".
The workshop is targeted at a broad audience of researchers who want to learn how to be more efficient and effective in their data analysis and computing, whatever their career stage.
No response
This lesson is more basic than existing Carpentries lessons. It covers topics like the motivations for scripting your data analysis and organising files into folders. It does not teach programming.
In particular, Good Enough Practices aims to complement the Carpentries Git Novice lesson by providing the background on organisation, naming, and the need for version control that motivates using git.
paper.md
and paper.bib
files as described in the JOSE submission guide for learning modules
The lesson is heavily adapted from the Good Enough Practices paper, so authors of that paper would be ideal reviewers:
Greg Wilson, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, Tracy K. Teal