
mangul-lab-usc / review-technology-dictates-algorithms


A systematic survey of algorithmic foundations and methodologies across 107 alignment methods (1988-2021), for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. Described by Alser et al. at https://arxiv.org/abs/2003.00110.

License: MIT License

Jupyter Notebook 99.66% Python 0.02% R 0.31%
read-alignments needleman-wunsch smith-waterman heuristics read-mapping sequence-alignment hts nanopore-sequencing illumina-sequencing pacbio-sequencing


Technology dictates algorithms: Recent developments in read alignment


Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today’s diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.


Directory structure

review-technology-dictates-algorithms-master
├───1. figures
├───2. multi_panel
├───3. notebooks
├───4. raw_data
├───5. scripts
├───6. summary_data
  1. In the "figures" directory, you will find all figures used in our study.
  2. In the "multi_panel" directory, you will find all figures used in our study.
  3. In the "notebooks" directory, you will find all Python notebooks used to produce Figures 2, 3, 4, and the supplementary figures.
  4. In the "raw_data" directory, you will find the raw data used for generating the figures and running the Python notebooks in the "notebooks" directory.
  5. In the "scripts" directory, you will find the R code used for the statistical analyses.
  6. In the "summary_data" directory, you will find CSV files with the data collected about all studied read alignment tools from 1988 until 2021.

Datasets

We used 10 whole-genome sequencing (WGS) datasets with the following accession numbers: ERR009309, ERR013127, ERR013138, ERR045708, ERR050158, ERR162843, ERR181410, ERR183377, SRR061640, SRR360549.
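As a minimal sketch, the accession list above can be kept in a small Python structure so that it is easy to check and iterate over when fetching the data. The names `ACCESSIONS` and `group_by_prefix` are illustrative and do not come from this repository. The grouping relies only on the usual convention that ERR run accessions are issued by ENA and SRR run accessions by NCBI SRA.

```python
from collections import defaultdict

# The 10 WGS run accessions listed in this README.
ACCESSIONS = [
    "ERR009309", "ERR013127", "ERR013138", "ERR045708", "ERR050158",
    "ERR162843", "ERR181410", "ERR183377", "SRR061640", "SRR360549",
]


def group_by_prefix(accessions):
    """Group run accessions by their three-letter prefix (e.g. ERR, SRR)."""
    groups = defaultdict(list)
    for accession in accessions:
        groups[accession[:3]].append(accession)
    return dict(groups)


if __name__ == "__main__":
    # Print one line per archive prefix with the number of runs and their IDs.
    for prefix, runs in sorted(group_by_prefix(ACCESSIONS).items()):
        print(prefix, len(runs), ", ".join(runs))
```

How the runs are then downloaded is left open here, since this README does not state which download tool was used.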

Reproducing results

  1. Install Jupyter Notebook:
     pip3 install jupyter
  2. Install the remaining dependencies:
     pip3 install wheel
     pip3 install pandas
     pip3 install seaborn
     pip3 install ipysankeywidget
     pip3 install floweaver
  3. Start Jupyter Notebook, which opens a new tab in your web browser:
     jupyter notebook
  4. In your Jupyter Notebook session, navigate to review-technology-dictates-algorithms-master/notebooks and make sure the session is trusted (click "Not trusted" in the top-right corner of the session page) so that you can save the figures to your machine.
  5. Open the notebook for the figure you want to reproduce and run it using "Cell --> Run All".
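Before opening the notebooks, it can help to confirm that the packages from the installation steps are importable. The following is a minimal sketch, not part of this repository: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, and the import names are assumed to match the pip package names (with `notebook` standing in for the `jupyter` metapackage).

```python
from importlib.util import find_spec

# Import names assumed to correspond to the pip packages installed above;
# "notebook" is the module installed as part of "pip3 install jupyter".
REQUIRED_MODULES = ["notebook", "pandas", "seaborn", "ipysankeywidget", "floweaver"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If any package is reported missing, rerun the corresponding `pip3 install` command before starting the Jupyter Notebook session.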

How to cite this study?

If you use our study in your work, please cite:

Mohammed Alser, Jeremy Rotman, Kodi Taraszka, Huwenbo Shi, Pelin Icer Baykal, Harry Taegyun Yang, Victor Xue, Sergey Knyazev, Benjamin D. Singer, Brunilda Balliu, David Koslicki, Pavel Skums, Alex Zelikovsky, Can Alkan, Onur Mutlu, Serghei Mangul. "Technology dictates algorithms: Recent developments in read alignment" arXiv preprint arXiv:2003.00110 (2020).

The BibTeX entry for this citation is:

@article{alser2020technology,
  title={Technology dictates algorithms: Recent developments in read alignment},
  author={Alser, Mohammed and Rotman, Jeremy and Taraszka, Kodi and Shi, Huwenbo and Baykal, Pelin Icer and Yang, Harry Taegyun and Xue, Victor and Knyazev, Sergey and Singer, Benjamin D and Balliu, Brunilda and others},
  journal={arXiv preprint arXiv:2003.00110},
  year={2020}
}

License

This repository is released under the MIT License. For more information, please read our LICENSE file.

Contact

Please do not hesitate to contact us ([email protected], [email protected]) if you have any comments, suggestions, or clarification requests regarding the study or if you would like to contribute to this resource. If you encounter bugs or have further questions or requests, you can raise an issue at the issue page.
