fredhutchio / intro-bioinformatics

Website and slides for intro to bioinformatics class at Fred Hutch

Home Page: http://fredhutchio.github.io/intro-bioinformatics/

HTML 58.88% Makefile 0.27% Shell 1.20% CSS 21.55% Python 0.34% JavaScript 17.76%

Intro to Bioinformatics

A short series offered by fredhutch.io to members of Fred Hutch.



Overview of class

Class outline:

  1. Overview & remote server access
  2. Unix Shell I - overview
  3. Unix Shell II - scripting
  4. Version Control with Git
  5. Python I - syntax and data types
  6. Python II - building data structures
  7. Python III - writing a program

The class loosely follows Bioinformatics Data Skills for the overview, Unix, and version control sections. For the Python sections, we follow our own curriculum and refer to the Codecademy course for practice with the concepts.

Each class comes with a few resources, which for your convenience are aggregated (and somewhat extended) on the Resources page.

Prerequisites (connecting to the remote server)

You will need to bring a laptop to class to participate in the interactive sessions. However, all our work will be happening on a remote server to which we connect. Thus, it is vital that you ensure you can connect to this remote server before coming to the first class. (Specifically, make sure you can connect over the Marconi wifi network while on the Hutch campus.)

Connecting to the remote server requires the use of a terminal program, which lets us control a computer with text-based commands. How you open the terminal program depends a bit on what operating system you are using. However, once you have the terminal open, you'll execute the following command:

ssh <username>@rhino

Here, <username> should be replaced by your Fred Hutch network username. When this command runs, you'll be prompted for a password; use your Fred Hutch network password. You should see an informational message about the system, followed by a prompt that looks something like:

<username>@rhino04:~$

Try typing echo "it is working", and hitting enter. The terminal should print out "it is working" on the following line, and return a new prompt.

Next, try typing cd ~ and hitting enter. The terminal should return a new prompt.

If instead of this behavior you see something that looks like an error message, please notify us so we can help you get set up before the first class starts. (Note that you may get rhino01, rhino02 or rhino03 instead of rhino04; this is ok).
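
If you would rather script this check, the following is a minimal Python sketch; it is not part of the class materials. It checks whether an ssh client is on your PATH and builds the command shown above. The function names and the example username jdoe are hypothetical.

```python
import shutil


def ssh_available():
    """Return True if an `ssh` executable can be found on the PATH."""
    return shutil.which("ssh") is not None


def build_ssh_command(username, host="rhino"):
    """Build the argument list for `ssh <username>@<host>`."""
    if not username:
        raise ValueError("username must not be empty")
    return ["ssh", "{}@{}".format(username, host)]


if __name__ == "__main__":
    # "jdoe" is a placeholder; substitute your Fred Hutch network username.
    print("ssh client found:", ssh_available())
    print("command to run:", " ".join(build_ssh_command("jdoe")))
```

The sketch only reports what it finds; actually connecting still means running the ssh command in a terminal and entering your password.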

Below are the details about how you can get a terminal program with SSH, given your operating system.

Linux or OSX

Linux operating systems (such as Ubuntu) and Mac OSX come with SSH-capable terminals pre-installed. It should be straightforward to Google how to find and run the terminal program for your particular operating system.

Note that if you are using a Linux operating system, you may have problems with the wifi on campus, due to some unusual network settings. If you are unable to connect via wifi on your laptop, please ping the Fred Hutch computing Help Desk or Sci Comp to get you running.

Windows

Windows does not ship with an SSH-capable terminal. As such, you will need to do one of the following:

  • [Recommended] If you are on Windows 10, install "Bash on Windows" (aka the "Windows Subsystem for Linux", or WSL). You can follow the "How to Install ..." section of the instructions here.
  • If you aren't running Windows 10, we recommend you upgrade to Windows 10 if possible.
  • If not, we recommend you install PuTTY, which will let you SSH into the rhino servers.
  • You can also use NoMachine on the laptop you plan to bring to class to log into a remote Ubuntu/Linux desktop to use its Terminal (see directions here).
  • Finally, you can also install Virtual Box, and an Ubuntu virtual machine if you want to tinker with your own Linux environment and use the terminal from there.

Other prereqs

You should also be able to copy and paste text to and from the terminal. In Mac OSX, you can do this with Command-c and Command-v, as you normally would. On a Linux terminal, you need Ctrl-Shift-C and Ctrl-Shift-V. For Windows/PuTTY users, copying and pasting is accomplished using select and right click (see here). Make sure to try pasting and copying from the terminal before coming to the first class, so that you can copy and paste longer commands from the slides.

You will also need a GitHub account for the course. You can sign up for free here.

We also ask that you read the first two chapters of the book (particularly the first) before coming to class, as they'll give you some nice big picture context to work from.

Text editors

In the first class, we'll teach you how to use the vi/vim text editor, a useful and powerful tool for editing text and code when you are restricted to using a terminal. However, for the remainder of the classes you can use a desktop text editor as long as you can connect to your shared Hutch drive (frequently mounted as the "H" drive, at least on Macs), and edit your ~/bioinfclass files from your laptop's text editor. If you choose this path, we recommend you look at Atom or Sublime for text editors.

You may also choose to continue to use vim for the remainder of the classes. If you do so, there will be occasional guidance and tips in class which may be helpful.

Technical Notes

The slides (source code here) are written in a little extension of Markdown. After processing, the Markdown gets converted to a reveal.js presentation. HTML is rendered by the brilliant pandoc.

Source files are in the src/ directory, and have extension .mds. You can compile the file src/somefile.mds by typing make somefile.html, and open the resulting file with firefox or chromium. You may need to install pandoc and some Python-related things to run this code. See the Makefile for a demonstration of make, if you're interested in using make to build bioinformatics pipelines, or in customizing details such as where files go in your fork. I may rewrite the build in SCons as well, so you have an example of that (and so I can work with this project more sanely).

License

You are free to use and modify this code under the terms of the Creative Commons license:

Intro Bioinformatics curricula and slide processing by Christopher Small & Frederick A. Matsen IV is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

intro-bioinformatics's People

Contributors

matsen, metasoarous, seaaan

intro-bioinformatics's Issues

Prune some of the duplicate Unix intro stuff

We're covering quite a bit more in the first class now, and while the extra time is making the second class SO much better, it still felt like it would have been nice to have a little more time for the interesting piping and redirection stuff at the end. It felt slow then fast, and it could be more balanced if we tidied up the material in the first part of the class.

python2 versus python3

There is apparently some significant momentum behind python3 adoption. There's this survey which shows a non-trivial uptick in use, particularly amongst Windows users (whodathunkit?). I think something that is also notable is that the Software Carpentry project is going to start teaching 3 in upcoming releases.

So I know, barn doors, cows, close. However, I'm thinking we need to start looking towards starting down the path towards python 3. Maybe for the next iteration? Or maybe a python3 workshop independent of this?
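
For a sense of what students would run into, here is a small sketch (not taken from the class slides) of one of the most visible differences. In Python 3, / between two integers is true division, whereas Python 2 truncates the result to an integer. The function name is made up for illustration, and the snippet is written for Python 3.

```python
def describe_division(a, b):
    """Return both kinds of division so they can be compared side by side."""
    return {
        "true": a / b,    # Python 3: 7 / 2 is 3.5 (Python 2 would give 3)
        "floor": a // b,  # floor division behaves the same in 2 and 3
    }


result = describe_division(7, 2)

# print is a function in Python 3, so the parentheses are required.
print(result["true"])
print(result["floor"])
```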

Switch from nano to jed?

@bcclaywell brought up this idea today. Might make things easier for folks on the editor front. Main downside I see is that jed won't be as standard on installs, while nano is. I think I'll be sold though if I can figure out a way to have it give us smarter indentation and copy/paste action.

spelling error in presentation slides for intro bioinformatics class #1

On one of the last slides in the presentation for class #1 - the slide entitled "Manual detach" - there is a spelling error. Inside the highlighted box it says "tmux detatch" but it should say "tmux detach"

Hootie Warren

BTW, this class is great.... Thanks so much for offering it, and thanks so much for letting me take it

Fix pasting indentation in nano

We set them up with a nanorc that auto-indents, but there's no :paste mode like in vim. Need to find a way around this, or remove that line from the nanorc.

Get everyone on zsh?

They use it in the book, and the improved autocomplete really is quite helpful... In theory it shouldn't be too difficult to have everyone make the switch, but for some reason chsh fails for us...

"apropos" command

The slides for class 2 of introduction to bioinformatics suggest running the command:

"apropos calculator"

When I do this, I encounter:

"calculator: nothing appropriate."

Is "calculator" part of a module that I need to load?

Hootie Warren

Clarify tmux vs tmux attach

If people don't have a tmux session open, tmux attach will fail. Need to make sure they're aware of this early on and make it an early prereq. Relates to #23.
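
One way to sidestep the failure is to check for a session before attaching. The following is a minimal sketch, written in Python rather than the shell used in class. It assumes tmux is installed, and the helper names are hypothetical. It relies on tmux has-session, which exits with a non-zero status when no session exists.

```python
import subprocess


def tmux_command(has_session):
    """Pick the tmux invocation: attach if a session exists, else start one."""
    return ["tmux", "attach"] if has_session else ["tmux", "new-session"]


def has_tmux_session():
    """Return True if `tmux has-session` succeeds (a session is running)."""
    try:
        result = subprocess.run(
            ["tmux", "has-session"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # tmux is not installed on this machine.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Only report the command; running it is left to the user's terminal.
    print(" ".join(tmux_command(has_tmux_session())))
```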

Maybe first class should be required, so we're all on the same page.

Fix tmux switching directories on new pane

This is rather annoying and makes it hard to assume things about what directories people are in. There should be a way to set this, but I think it unfortunately depends on the version of tmux...

Vanilla instructor account?

@atombaby - Do you think it would be possible to set up a csmall2 (or some such) user that wouldn't have things so customized? Might lead to less confusion, but I don't know how much red tape there is on that sort of thing. Thoughts?

Drop all method calls from class 5

Instead of teaching "string".upper(), [1,2,3].append(4), etc., do str.upper("string") and list.append([1,2,3], 4). This way, everything is just a function, and the full brilliance of python OOP will hopefully fall as a greater eureka once we actually cover it more carefully in class 6.
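
The equivalence this proposal relies on can be checked directly. A minimal sketch (the variable names are illustrative): calling a method on an object gives the same result as calling the function on its type with the object as the first argument.

```python
# Method-call style, as currently taught.
shouted = "string".upper()
numbers = [1, 2, 3]
numbers.append(4)

# Function-call style proposed here: the object is just the first argument.
shouted_fn = str.upper("string")
numbers_fn = [1, 2, 3]
list.append(numbers_fn, 4)

# Both styles give identical results.
assert shouted == shouted_fn == "STRING"
assert numbers == numbers_fn == [1, 2, 3, 4]
```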

Slide numbers?

A student suggested this as an easier way of referring to issues with slides. Not a bad idea, but could introduce confusion if slides are deleted/added in editing.

Have to see if there's a slidy option for this.

intro-bio module broken?

@atombaby - Students yesterday were unable to load the intro-bio module. @bcclaywell said that there were some changes made to the module system, but that it shouldn't have affected the intro-bio module. Was something overlooked in this perhaps?

suggested revisions for python intro from office hours

  • a little more about namespaces and import (a student suggested that namespaces are like "folders" of names)
  • shebang for python scripts
  • an argument to a function substitutes the specified value
  • pip is like the App Store or Android Marketplace
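
Several of these points could be shown together in one short script. This is a sketch of one possible slide example, not existing class material; the greet function and its argument are made up for illustration.

```python
#!/usr/bin/env python
# The shebang line above lets the shell run this file directly
# (after `chmod +x`), without typing `python` first.

import math            # names live in the `math` namespace: math.sqrt
from math import pi    # binds the single name `pi` in our own namespace


def greet(name):
    """Inside the function, `name` stands for whatever value was passed in."""
    return "Hello, " + name + "!"


print(math.sqrt(16))   # look up `sqrt` inside the `math` "folder"
print(pi == math.pi)   # the same value, reached by two different names
print(greet("Hutch"))  # the argument "Hutch" is substituted for `name`
```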

Replace email example with code that actually runs

Someone is always going to try to run it... And in general it's probably better if they can run all the code and tinker with it.

This somewhat plays into #21, since if we do that we need to replace the email example anyway.

Biggest challenge here I think is replacing the Company class example with something analogous for pointing out the flexibility trade-off with OOP. Probably not the biggest concern if we can't find one; can look for another place to make that point.

Add location-specific output to the data versioned in the git repo

If we switch things around with git 3rd, à la #18, we may want to later have them version in the location-based tree data they were supposed to compute themselves once they do the current class 3 (the shell scripting class). This way they have it for things later that need it. Maybe the final results of all class work should be a separate commit or branch they can check out? Takes away some of the consistency problem...

sshfs from homebrew may need a few more steps

On a student's MacBook running Yosemite today, getting sshfs installed required the following:

$ brew install Caskroom/cask/osxfuse
$ brew tap homebrew/fuse
$ brew install sshfs

(For some reason I didn't have to tap homebrew/fuse on my MacBook.)

It's probably also worth noting that the user may need to supply a username to the sshfs command if their Mac username is different than their Hutch username.

Move explanation of certain things up front and highlight them

The following should all be made very very clear and bold somewhere, since they tend to cause confusion and slow things down greatly. They should be upfront and loud.

  • Use Ctrl-C, Ctrl-D, q, Esc, exit or quit to exit
  • Our <insert-thing-specific-to-you-here-syntax> needs to be standardized, and upfront
  • Formalized cues for when you need to nano <file>
  • [Edit 2015/11/16: Add] Clarify that # ... means "stuff that was in this file before"
