datacarpentry / semester-biology

Forkable teaching materials for course on working with data in R

Home Page: http://datacarpentry.org/semester-biology

License: Other

HTML 75.53% Ruby 0.01% Python 0.79% CSS 1.52% R 0.54% Jupyter Notebook 21.40% JavaScript 0.03% TeX 0.19%
teaching-materials biology data-science data-carpentry sql r spatial-data

semester-biology's Introduction

Data Carpentry for Biologists - Semester Course

JOSE DOI Zenodo DOI

Forkable teaching materials for course on working with data in R.

This repository contains the complete teaching materials (excluding exams and answers to assignments) and website for a university style and self-guided course teaching computational data skills to biologists. The course is designed to work primarily as a flipped classroom, with students reading and viewing videos before coming to class and then spending the bulk of class time working on exercises with the teacher answering questions and demoing the concepts.

Helpful information is available regarding the structure and function of the course and website materials for customized development and delivery of the course.

We encourage collaborative development. This repository was used by @ethanwhite to teach a version of this course (Fall 2016) at the University of Florida. The course remains under active development. We welcome contributions to all aspects of the course/site and are especially seeking exercises and assignments for a range of disciplines. Key site and course materials are available as templates for contributions of new materials and other materials that are specific to the course (e.g., the syllabus) are developed in a way to facilitate easy customization.

Here are some examples of courses using the infrastructure and material from this course:

Where is everything

Core teaching materials are stored in exercises/, lectures/, and materials/.

Class specific materials are stored in the syllabus, schedule and assignments/.

Most of the other folders and files support creating the course website using Jekyll.

How to contribute

We use standard GitHub flow, so fork the repository, add or change material, and submit a pull request.

The goal of making this course forkable is to facilitate collaboration on developing this kind of material for university courses. The central component of a flipped computing course is the exercises, so one of the primary forms of contribution will be adding exercises to the pool of exercises. Individual instructors can then select from a rich pool of exercises the ones that fit the topics, languages, and scientific domains that best fit the material they want to cover in the course.

There are lots of great resources for being introduced to the individual concepts being taught in courses like this. Our philosophy is to use and improve these external resources when available instead of creating new versions of the same content. In particular, we actively use Data Carpentry and Software Carpentry workshop materials. However, in cases where the necessary material doesn't exist elsewhere it can certainly be added here.

Accessibility

New pull requests to this site are scanned using pa11y and pa11y-ci to ensure that additions to the site follow best practices for accessibility. If you discover any accessibility issues with the site please open an issue and we'll get them fixed.

Using Jekyll to build your own course website

Simple setup

The website is set up to be easy to run automatically through GitHub:

  1. Fork or import the repository to https://github.com/yourusername/semester-biology.
  2. Update # Setup information in _config.yml in the main directory for proper site rendering.
    • You must push this change to your repository to build and browse your forked version.
    • In a few minutes you should be able to see the site at: https://yourusername.github.io/semester-biology/
  3. Edit any of the markdown (.md) files
  4. Commit and push the changes
    • The changes should now be reflected on the website
  5. If you want to use a custom domain name instead of github.io, follow GitHub's instructions for setting up a custom domain.

If you have any problems please let us know and we'll be happy to help.

Previewing changes locally

If you want to view your changes locally, before pushing them to the live website, you'll need to set up Jekyll locally. GitHub provides a good introduction on how to do this.

If you have Jekyll properly installed, you can then run

bundle exec jekyll serve --baseurl ''

from the command line and navigate to http://localhost:4000/ in your browser to preview the current state of the website.

Creating new pages

If you want to add new exercises, lecture notes, etc. you do this by creating a markdown file in the appropriate directory. Each markdown file needs to start with some information that tells Jekyll what the page is. This is done using something called YAML, and the standard YAML for a new exercise would look like this:

---
layout: exercise
topic: Topic group of exercise
title: Name of exercise
language: [R, Python, SQL]
---

This is placed at the very beginning of the markdown file and provides information on what kind of content it is (e.g., exercise, page, etc.), the title of the page, and what language it applies to.
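As a rough illustration of how this header is structured, here is a minimal Python sketch that reads the front matter of a page and reports any missing keys. It is not part of the site or of Jekyll: the function names are hypothetical, treating the four keys from the example above as required is an assumption, and the parser only handles simple `key: value` lines rather than full YAML.

```python
# Assumption: the four keys shown in the example header are the ones to check for.
REQUIRED_KEYS = {"layout", "topic", "title", "language"}


def read_front_matter(text):
    """Return the front matter of a markdown page as a dict of raw strings."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("page must start with a '---' line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter: everything after this is page content.
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("front matter is missing its closing '---' line")


def missing_keys(text):
    """List the expected keys that the page's front matter does not define."""
    return sorted(REQUIRED_KEYS - set(read_front_matter(text)))


page = """---
layout: exercise
topic: Topic group of exercise
title: Name of exercise
language: [R, Python, SQL]
---

Exercise text goes here.
"""

print(missing_keys(page))  # prints []
```

A check like this could be run over new files before opening a pull request, but Jekyll itself remains the authority on what it accepts.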

The page should then be available at a url based on where the file is located and what the file name is. So if you created a new exercise in the exercises/ folder called my_awesome_exercise.md it would be located at:

Locally: http://localhost:4000/exercises/my_awesome_exercise

After pushing to GitHub: https://yourusername.github.io/semester-biology/exercises/my_awesome_exercise
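The mapping from file location to URL described above can be sketched in a few lines of Python. This is only a model of the rule as stated here (drop the `.md` extension and append the path to the site root); the `page_url` helper is hypothetical and does not reflect how Jekyll computes permalinks internally.

```python
from pathlib import PurePosixPath


def page_url(source_path, site_root):
    """Build the expected page URL from a markdown file's path in the repository."""
    # Drop the .md extension but keep the directory structure.
    page = PurePosixPath(source_path).with_suffix("")
    return f"{site_root.rstrip('/')}/{page}"


print(page_url("exercises/my_awesome_exercise.md", "http://localhost:4000"))
print(page_url("exercises/my_awesome_exercise.md",
               "https://yourusername.github.io/semester-biology/"))
```

The two calls reproduce the local and GitHub URLs listed above.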

Dependencies

Building the site locally requires a local Ruby installation with 3 packages (gems):

  • jekyll
  • github-pages
  • jekyll-sitemap

For help with installation see:

Once you have installed Ruby and the jekyll gem, go to the root of the site repository and run:

bundle install

to install the rest of the dependencies.

Acknowledgements

Development of this material is funded by the Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative through Grant GBMF4563 to Ethan White and the National Science Foundation as part of a CAREER award to Ethan White.

semester-biology's People

Contributors

andrewmarx, asntech, atyre2, beastyblacksmith, brymz, danieleweeks, davharris, dependabot[bot], drlabratory, ethanwhite, garezana, gvwilson, hlapp, katrinleinweber, kristinariemer, marconis, mikoontz

semester-biology's Issues

R curriculum discussion

Re: Basic-Python2.md exercise comments.
I include a mention of built-in functions in this exercise. I'm not sure that custom functions are required here, so maybe it would be best to introduce the idea in a later lesson.
We should introduce the various data classes (character, factor, numeric) and organizational structures (list, matrix, array, data frame) in an early lesson. Not sure where is the best place.

Add Tidy Data problems based on Data Carpentry messy data

Rather than have students use a database with some problems for several weeks and potentially internalize poor structure, we've switched over to using the Portal Project Teaching Database for the main SQL exercises. This means we need messy data for database structure/tidy data problems.

Data Carpentry has messy data designed for looking at this problem. See, e.g.,
http://datacarpentry.github.io/spreadsheet-ecology-lesson/01-format-data.html

We should convert the current database structure problems to something based on this data, or add new "Tidy Data" problems based on this data and tweak the existing database structure problems to have the students download the original database file that has all of the structural issues in it.

Translate Lists-2 to Lists-9

(Subissue #1 )

There are nine Lists exercises, but only Lists-1 is used in an assignment. Are we going to use them?
Lists-2 and Lists-3 follow-up on Lists-1.
Lists-5 can introduce matrices.
Also, we should come up with an exercise to introduce lists. I use them when I have tables or lists of multiple data types that are entered from the script. Data with multiple data types in .csv get entered as data frames.

***I've been meaning to have this 'what to do with unused exercises' chat more broadly.

Add links to output solutions

Add these links to:

  1. The bottom of each problem in an assignment
  2. Parenthetically after each exercise on the Exercises page

Code blocks

Reduce length of code blocks to match web translation.
Code chunks that take up a whole line should be placed in a code block.

Consider revising schedule.md

While looking through the schedule.md, it struck me that we could organize the order of videos/readings to follow the order of the exercises. In my mind, the structure would look like:

  • Topic 1 [reading link] | [video link]
  • Topic 2 [reading link]
  • Topic 3 [video link]
  • etc.

Create outcome solutions for all exercises

For both R and Python exercises we need a way to help both self-directed learners and university students check their work, but without giving them answers in code that they could just cut and paste for assignments. By showing them what the outcome of successfully running the code should look like, we both clarify the intent of the question and help students check their work. This also begins to introduce the benefits of testing.

The result here would be a new folder containing the "solutions" (i.e., what the output should look like) for each exercise, using the same naming structure as the associated exercise. Separate solutions will be necessary for R and Python since the details of the output won't be the same.

Update urls to gh-pages links

(Subtask of #1 )

OLD: [Functions 5]({{ site.baseurl }}/exercises/Functions-5/)

NEW: [Functions 5]({{ site.baseurl }}/exercises/Functions-5-R/)
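If many links need this change, it could be scripted. The following Python sketch rewrites exercise links of the OLD form into the NEW form; the regular expression and the `add_language_suffix` helper are assumptions for illustration, not an existing tool in this repository, and any bulk rewrite should be reviewed by hand before committing.

```python
import re

# Matches "{{ site.baseurl }}/exercises/<name>/" and captures the prefix and the name.
EXERCISE_LINK = re.compile(r"(\{\{ site\.baseurl \}\}/exercises/)([^/)\s]+)/")


def add_language_suffix(markdown, suffix="-R"):
    """Append a language suffix to exercise links that do not already have it."""
    def rewrite(match):
        prefix, name = match.group(1), match.group(2)
        if name.endswith(suffix):
            return match.group(0)  # already updated, leave unchanged
        return f"{prefix}{name}{suffix}/"
    return EXERCISE_LINK.sub(rewrite, markdown)


old = "[Functions 5]({{ site.baseurl }}/exercises/Functions-5/)"
print(add_language_suffix(old))
```

Running the function a second time on its own output leaves the link unchanged, so it is safe to apply to files that are already partly updated.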

Translate Making-choices-4

(subissue #1)

I will skip this exercise for now because it is not in the assignments list, but I'd like to revisit it as it looks like a strong exercise.

Translate Loops-4 & Loops-5

(Subissue #1)

Loops-4 seems like a useful extension of Loops-2 (old name: 'Loops-3')
Not sure what Loops-5 is about.

Update file names

(mentioned in discussion for #1)
(related to #35)

Make sure file names / titles that were changed get updated in all files and urls.

Jekyll Formatting

(For down the road.)

I saw the newest Software Carpentry lesson (http://swcarpentry.github.io/web-data-python/) and it made me think about the way we format our exercises and how that will look when rendered by Jekyll.

I'd like to look through a couple examples to gather some thoughts and chat with you sometime. We can also look through some of the Data Carpentry lessons, though they look like they are still mostly 'generic' github wiki pages.

Advanced Topics

(sub issue #1; related to #46)

I have gone through the advanced course exercises and chosen a small(ish) set of topics and exercises I think would be worth considering for inclusion in the project. My idea is that these would provide an opportunity for classroom students to continue on from the course and have a direction for what is next if they are to continue pursuit of scientific programming and for at-home students to learn a handful of important, but a bit more complicated, skills.

After completing this list, #1 will be complete. We can also decide any or all are not worth it, and can be done with #1 now.

The list of exercises breaks into two categories.

  1. New 'advanced' skills:
    -'Higher Order Functions 2'
    -'Regular Expressions 1'
    -'Debugging'
    -'Tests 1'
  2. Challenging review:
    -'Basic 1'
    -'Basic 2'
    -'Making Choices 4'
    -'Scientific Python 3'

Organizing page links by title always chooses Python exercises

(Subissue #85, Related PR #91)

assignments/index.md directs Jekyll to find a list of exercises and arrange an assignments page using the exercise titles. Python and R assignments share titles, which means that currently the R assignments list is populated by Python exercises. Will have to code in the language from yaml here or change the titles throughout.
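The collision can be modelled outside Jekyll with a small Python sketch. The site itself uses Jekyll templates, so this only illustrates the logic; the page records, the Python URL, and both helper names are hypothetical.

```python
# Hypothetical page metadata: two exercises that share a title but differ in language.
exercises = [
    {"title": "Functions 5", "language": ["Python"], "url": "/exercises/Functions-5-Python/"},
    {"title": "Functions 5", "language": ["R"], "url": "/exercises/Functions-5-R/"},
]


def find_by_title(pages, title):
    """Return the first page with a matching title (the behaviour described above)."""
    for page in pages:
        if page["title"] == title:
            return page
    return None


def find_by_title_and_language(pages, title, language):
    """Return the first page matching both the title and the requested language."""
    for page in pages:
        if page["title"] == title and language in page["language"]:
            return page
    return None


print(find_by_title(exercises, "Functions 5")["url"])
print(find_by_title_and_language(exercises, "Functions 5", "R")["url"])
```

Matching on title alone picks whichever page comes first, here the Python one; adding the language from the front matter to the lookup selects the R page without renaming any exercises.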

Issues with formatting in assignments

Something got mixed up a little in the formatting of assignments. Compare:

I think this is happening because the assignment name is capitalized. See the commit message for my solution to this:
ethanwhite/progbio@79c5bbc

This means that the assignments need to start with lower case letters and the exercises start with capital letters. Yes, it is awful.

Descriptive Titles

(subissue #1)

'Graphing 3' used to be called 'Graphing adult size vs newborn size'. Simplifying the descriptive title to a number made me wonder if all of the problems should have a descriptive title. The descriptive title would identify the new problem/solution presented in the exercise. One of the strings exercises might get a descriptive title of 'Basic stringr functions'. A making choices exercise might get a descriptive title of 'Using mathematical operators' or 'if else statements'.

I think it makes sense to organize the directory using the current Name-X 'titles' and add a 'descriptive title' or 'subtitle' to the exercise yaml.

Update Python material to Python 3

All necessary modules are now available in Python 3 and it will make teaching easier by removing common points of confusion like integer division. The most common thing we'll need to fix is changing print statements from:

print x

to

print(x)
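A short Python 3 snippet shows both points raised in this issue, the print function and the division behaviour. The value of `x` is arbitrary and chosen only for illustration.

```python
x = 7

# In Python 3, print is a function, so the parentheses are required.
print(x)

# The / operator always performs true division in Python 3.
print(x / 2)   # 3.5

# Floor division has to be requested explicitly with //.
print(x // 2)  # 3
```

In Python 2 the expression `7 / 2` evaluates to 3, which is the source of confusion mentioned above.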

`dplyr` module

(Reference PRs #49, #60)

We need a module set that introduces dplyr. I like the Dr. Granger - shrub carbon problem set for this. The order would be 'Scientific 0', 'Combining Basics', 'Statistics 2'.
