
guigolab / lyric


Long RNA-seq analysis workflow

License: GNU General Public License v3.0

Python 47.56% R 4.12% AngelScript 0.14% Shell 3.89% Perl 41.56% Awk 0.08% Raku 2.64%
bioinformatics snakemake-workflow cls rnaseq pacbio-iso-seq transcriptomics gencode

lyric's Introduction

LyRic is a versatile automated transcriptome annotation and analysis workflow written in the Snakemake language. Its core functionality is the production of:

  1. a set of high-quality RNA Transcript Models (TMs) mapped onto a genome sequence, based on Long-Read (LR) RNA sequencing data.
  2. various summary statistics plots and analysis results that describe the input and output data in detail.
  3. an interactive HTML table reporting statistics for each input sample, enabling easy and intuitive sample-to-sample comparison.
  4. a UCSC Track Hub to display output TMs, as well as various other tracks produced by LyRic.

(Note that features 2, 3 and 4 can easily be switched on and off.)

LyRic is platform-agnostic, i.e. it can deal with FASTQ data coming from both the ONT and PacBio platforms.

Full LyRic documentation is here.

lyric's People

Contributors

emi80, julienlag

lyric's Issues

keep track of unmapped reads

Currently, the "unmapped reads" coming from Minimap2 are discarded by LyRic. Is it possible to create a BAM file of the unmapped reads to keep track of them?
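
To illustrate what "keeping track of unmapped reads" could mean in practice, here is a minimal sketch that splits SAM text records into mapped and unmapped sets using bit 0x4 of the FLAG field, which the SAM specification defines as "segment unmapped". This is not LyRic's implementation: the function name split_sam and the toy records are hypothetical, and a real workflow would more likely filter the BAM file directly (for example with samtools view -f 4).

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def split_sam(lines):
    """Split SAM text lines into (header, mapped, unmapped) lists.

    Hypothetical helper for illustration only; it works on plain SAM text,
    not on binary BAM.
    """
    header, mapped, unmapped = [], [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header lines are kept so either subset can be written
            # back out as a valid SAM file.
            header.append(line)
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if flag & UNMAPPED_FLAG:
            unmapped.append(line)
        else:
            mapped.append(line)
    return header, mapped, unmapped


# Toy records (made up for this example): read2 carries FLAG 4.
EXAMPLE_SAM = "\n".join([
    "@HD\tVN:1.6\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\t*",
    "read3\t16\tchr1\t200\t60\t4M\t*\t0\t0\tGGGG\t*",
])

header, mapped, unmapped = split_sam(EXAMPLE_SAM.splitlines())
```

Writing the header plus the unmapped list to a separate file would give a per-sample record of the reads that failed to align, instead of discarding them.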

Rule makeHtmlSummaryDashboard fails complaining about missing columns

This is the error message:

Error: Can't subset columns that don't exist.
✖ Column `intron.x` doesn't exist.
Backtrace:
     █
  1. ├─dplyr::relocate(ntCoverageStats, intron.x, .after = exonOfPseudo.y)
  2. ├─dplyr:::relocate.data.frame(ntCoverageStats, intron.x, .after = exonOfPseudo.y)
  3. │ ├─base::unname(tidyselect::eval_select(expr(c(...)), .data))
  4. │ └─tidyselect::eval_select(expr(c(...)), .data)
  5. │   └─tidyselect:::eval_select_impl(...)
  6. │     ├─tidyselect:::with_subscript_errors(...)
  7. │     │ ├─base::tryCatch(...)
  8. │     │ │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
  9. │     │ │   └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
 10. │     │ │     └─base:::doTryCatch(return(expr), name, parentenv, handler)
 11. │     │ └─tidyselect:::instrument_base_errors(expr)
 12. │     │   └─base::withCallingHandlers(...)
 13. │     └─tidyselect:::vars_select_eval(...)
 14
Execution halted
[Thu Jul 28 18:08:30 2022]
Error in rule makeHtmlSummaryDashboard

Use POSIX atomic file install instead of copying from TMPDIR

LyRic would be faster and more robust if it used POSIX atomic file creation instead of writing to TMPDIR and then copying into place.

TMPDIR is usually on a separate file system, so

mv {TMPDIR}/$uuidTmpOut {output.bam}

would usually be a copy rather than a rename. If there is an error, such as running out of disk space, an incomplete output file would exist under the final name.

If the temporary file is instead created on the same file system as the output file, the mv is an atomic rename.
In bash I would do:

tmpout=${outbam}.tmp
do_something  > ${tmpout}
mv -f ${tmpout} ${outbam}

Assuming bash is run with -e, the file doesn't exist under the final name until it has been written successfully, and there is no copy overhead.
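
The same pattern can be sketched in Python, which may be convenient inside a Snakemake rule. This is a minimal sketch, not LyRic code: atomic_write is a hypothetical helper name. It creates the temporary file in the output's own directory, so that os.replace is a rename within one file system, which POSIX makes atomic.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to `path` so that `path` never holds a partial file.

    Hypothetical helper: the temporary file lives next to the target, so
    the final os.replace() is a same-file-system rename, not a copy.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file with owner-only permissions (0600);
    # adjust with os.chmod afterwards if the output must be shared.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # On any failure, remove the partial temporary file and leave
        # the final name untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If writing fails part-way (for example, because the disk is full), only the temporary file is affected and it is removed, so downstream rules never see a truncated output under the final name.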

lyric can't be checked out on macOS

The default macOS file system is case-insensitive, and lyric has two files whose names differ only by case:

utils/HashToGff.pm
utils/hashToGff.pm

While running lyric on macOS is not that useful, it is useful to have a local copy of the source when debugging.
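
A check like the following could catch such collisions before they reach a case-insensitive checkout. It is a minimal sketch with a hypothetical function name, find_case_collisions. It groups paths by their case-folded form, which only approximates how a real case-insensitive file system compares names.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths that differ only by letter case.

    Hypothetical helper: str.casefold() approximates, but is not identical
    to, the comparison a case-insensitive file system performs.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Keep only the keys shared by more than one distinct path.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


collisions = find_case_collisions([
    "utils/HashToGff.pm",
    "utils/hashToGff.pm",
    "README.md",
])
```

Feeding it the list of tracked files (for example, the output of git ls-files, one path per line) would report every pair that cannot coexist on such a file system.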
