adarob / express

Streaming fragment assignment and quantification for high-throughput sequencing.

Home Page: bio.math.berkeley.edu/eXpress

License: Other

CSS 11.17% Shell 0.64% C++ 58.06% C 0.64% CMake 0.68% HTML 28.81%

express's People

Contributors

adarob, hartzell, jonchang, pimentel

express's Issues

double free or corruption

Probably caused by my shoddy TPM hack...

You are using eXpress v1.5.0, which is the most recent release.

2013-Nov-25 18:12:50 - Attempting to read 'snapout.sam' in BAM format...
2013-Nov-25 18:12:50 - Input is not in BAM format. Trying SAM...
2013-Nov-25 18:12:50 - Loading target sequences and measuring bias background...

2013-Nov-25 18:13:11 - Processing input fragment alignments...
2013-Nov-25 18:13:13 - COMPLETED: Processed 25261 mapped fragments, targets are in 49061 bundles.
2013-Nov-25 18:13:13 - WARNING: Not enough fragments observed to accurately learn bias parameters. Either disable bias correction (--no-bias-correct) or provide a file containing auxiliary parameters (--aux-param-file).
2013-Nov-25 18:13:13 - Writing results to file...
2013-Nov-25 18:13:14 - Done.
*** Error in `/home/rds45/apps/express-1.5.0-linux_x86_64/express': double free or corruption (out): 0x0000000002f39b20 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x80a46)[0x7fee74f10a46]
/home/rds45/apps/express-1.5.0-linux_x86_64/express(_ZN13MismatchTableD1Ev+0x6a)[0x4e843a]
/home/rds45/apps/express-1.5.0-linux_x86_64/express(_ZN5boost6detail17sp_counted_impl_pI13MismatchTableE7disposeEv+0x12)[0x4e84d2]
/home/rds45/apps/express-1.5.0-linux_x86_64/express(_ZN7LibraryD1Ev+0x15a)[0x4e0bba]
/home/rds45/apps/express-1.5.0-linux_x86_64/express(_ZNSt6vectorI7LibrarySaIS0_EED1Ev+0x1b)[0x4e0eab]
/home/rds45/apps/express-1.5.0-linux_x86_64/express(_Z15estimation_mainv+0x10fc)[0x4de4cc]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fee74eb1ea5]
/home/rds45/apps/express-1.5.0-linux_x86_64/express(_ZNSt15basic_streambufIcSt11char_traitsIcEE6xsputnEPKcl+0x61)[0x48e179]
======= Memory map: ========
00400000-00640000 r-xp 00000000 08:01 51643144                           /home/rds45/apps/express-1.5.0-linux_x86_64/express
00840000-00843000 rw-p 00240000 08:01 51643144                           /home/rds45/apps/express-1.5.0-linux_x86_64/express
00843000-00844000 rw-p 00000000 00:00 0
02831000-2d235000 rw-p 00000000 00:00 0                                  [heap]
7fee6c000000-7fee6c23c000 rw-p 00000000 00:00 0
7fee6c23c000-7fee70000000 ---p 00000000 00:00 0
7fee73c4f000-7fee73c50000 ---p 00000000 00:00 0
7fee73c50000-7fee74650000 rw-p 00000000 00:00 0
7fee74650000-7fee74666000 r-xp 00000000 08:01 13897435                   /lib/x86_64-linux-gnu/libresolv-2.17.so
7fee74666000-7fee74866000 ---p 00016000 08:01 13897435                   /lib/x86_64-linux-gnu/libresolv-2.17.so
7fee74866000-7fee74867000 r--p 00016000 08:01 13897435                   /lib/x86_64-linux-gnu/libresolv-2.17.so
7fee74867000-7fee74868000 rw-p 00017000 08:01 13897435                   /lib/x86_64-linux-gnu/libresolv-2.17.so
7fee74868000-7fee7486a000 rw-p 00000000 00:00 0
7fee74870000-7fee74876000 r-xp 00000000 08:01 13897388                   /lib/x86_64-linux-gnu/libnss_dns-2.17.so
7fee74876000-7fee74a75000 ---p 00006000 08:01 13897388                   /lib/x86_64-linux-gnu/libnss_dns-2.17.so
7fee74a75000-7fee74a76000 r--p 00005000 08:01 13897388                   /lib/x86_64-linux-gnu/libnss_dns-2.17.so
7fee74a76000-7fee74a77000 rw-p 00006000 08:01 13897388                   /lib/x86_64-linux-gnu/libnss_dns-2.17.so
7fee74a78000-7fee74a7a000 r-xp 00000000 08:01 13893671                   /lib/libnss_mdns4_minimal.so.2
7fee74a7a000-7fee74c79000 ---p 00002000 08:01 13893671                   /lib/libnss_mdns4_minimal.so.2
7fee74c79000-7fee74c7a000 r--p 00001000 08:01 13893671                   /lib/libnss_mdns4_minimal.so.2
7fee74c7a000-7fee74c7b000 rw-p 00002000 08:01 13893671                   /lib/libnss_mdns4_minimal.so.2
7fee74c80000-7fee74c8c000 r-xp 00000000 08:01 13897390                   /lib/x86_64-linux-gnu/libnss_files-2.17.so
7fee74c8c000-7fee74e8b000 ---p 0000c000 08:01 13897390                   /lib/x86_64-linux-gnu/libnss_files-2.17.so
7fee74e8b000-7fee74e8c000 r--p 0000b000 08:01 13897390                   /lib/x86_64-linux-gnu/libnss_files-2.17.so
7fee74e8c000-7fee74e8d000 rw-p 0000c000 08:01 13897390                   /lib/x86_64-linux-gnu/libnss_files-2.17.so
7fee74e90000-7fee7504e000 r-xp 00000000 08:01 13897317                   /lib/x86_64-linux-gnu/libc-2.17.so
7fee7504e000-7fee7524d000 ---p 001be000 08:01 13897317                   /lib/x86_64-linux-gnu/libc-2.17.so
7fee7524d000-7fee75251000 r--p 001bd000 08:01 13897317                   /lib/x86_64-linux-gnu/libc-2.17.so
7fee75251000-7fee75253000 rw-p 001c1000 08:01 13897317                   /lib/x86_64-linux-gnu/libc-2.17.so
7fee75253000-7fee75258000 rw-p 00000000 00:00 0
7fee75258000-7fee7526c000 r-xp 00000000 08:01 13897342                   /lib/x86_64-linux-gnu/libgcc_s.so.1
7fee7526c000-7fee7546c000 ---p 00014000 08:01 13897342                   /lib/x86_64-linux-gnu/libgcc_s.so.1
7fee7546c000-7fee7546d000 r--p 00014000 08:01 13897342                   /lib/x86_64-linux-gnu/libgcc_s.so.1
7fee7546d000-7fee7546e000 rw-p 00015000 08:01 13897342                   /lib/x86_64-linux-gnu/libgcc_s.so.1
7fee75470000-7fee75573000 r-xp 00000000 08:01 13897365                   /lib/x86_64-linux-gnu/libm-2.17.so
7fee75573000-7fee75773000 ---p 00103000 08:01 13897365                   /lib/x86_64-linux-gnu/libm-2.17.so
7fee75773000-7fee75774000 r--p 00103000 08:01 13897365                   /lib/x86_64-linux-gnu/libm-2.17.so
7fee75774000-7fee75775000 rw-p 00104000 08:01 13897365                   /lib/x86_64-linux-gnu/libm-2.17.so
7fee75778000-7fee7585d000 r-xp 00000000 08:01 8396585                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7fee7585d000-7fee75a5c000 ---p 000e5000 08:01 8396585                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7fee75a5c000-7fee75a64000 r--p 000e4000 08:01 8396585                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7fee75a64000-7fee75a66000 rw-p 000ec000 08:01 8396585                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7fee75a66000-7fee75a7b000 rw-p 00000000 00:00 0
7fee75a80000-7fee75a98000 r-xp 00000000 08:01 13897429                   /lib/x86_64-linux-gnu/libpthread-2.17.so
7fee75a98000-7fee75c97000 ---p 00018000 08:01 13897429                   /lib/x86_64-linux-gnu/libpthread-2.17.so
7fee75c97000-7fee75c98000 r--p 00017000 08:01 13897429                   /lib/x86_64-linux-gnu/libpthread-2.17.so
7fee75c98000-7fee75c99000 rw-p 00018000 08:01 13897429                   /lib/x86_64-linux-gnu/libpthread-2.17.so
7fee75c99000-7fee75c9d000 rw-p 00000000 00:00 0
7fee75ca0000-7fee75cb6000 r-xp 00000000 08:01 13897470                   /lib/x86_64-linux-gnu/libz.so.1.2.7
7fee75cb6000-7fee75eb5000 ---p 00016000 08:01 13897470                   /lib/x86_64-linux-gnu/libz.so.1.2.7
7fee75eb5000-7fee75eb6000 r--p 00015000 08:01 13897470                   /lib/x86_64-linux-gnu/libz.so.1.2.7
7fee75eb6000-7fee75eb7000 rw-p 00016000 08:01 13897470                   /lib/x86_64-linux-gnu/libz.so.1.2.7
7fee75eb8000-7fee75edb000 r-xp 00000000 08:01 13897293                   /lib/x86_64-linux-gnu/ld-2.17.so
7fee760b7000-7fee760b8000 rw-p 00000000 00:00 0
7fee760d4000-7fee760da000 rw-p 00000000 00:00 0
7fee760da000-7fee760db000 r--p 00022000 08:01 13897293                   /lib/x86_64-linux-gnu/ld-2.17.so
7fee760db000-7fee760dd000 rw-p 00023000 08:01 13897293                   /lib/x86_64-linux-gnu/ld-2.17.so
7fee760dd000-7fee760e0000 rw-p 00000000 00:00 0
7fffe9b3e000-7fffe9f01000 rw-p 00000000 00:00 0                          [stack]
7fffe9f58000-7fffe9f5a000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Command terminated by signal 6

TPM is always NaN or 0

I recently updated my eXpress version in order to take advantage of the TPM calculation. Unfortunately, every TPM value I get is either NaN or 0. I would appreciate any help or explanation!

I ran eXpress in the following way:

#!/bin/sh
#$ -S /bin/sh
#$ -l mem_free=6G,mem_token=6G,h_vmem=6G
#$ -q centos5.q

source /home/mkreitzman/.bashrc
date
echo "step 1: aligning with bowtie"
bowtie2 -x Homo_sapiens.GRCh37.69.cdna.all -1 NanoString_validation/wgEncodeCaltechRnaSeqHepg2R2x75Il200FastqRd1Rep2/all.fastq -2 NanoString_validation/wgEncodeCaltechRnaSeqHepg2R2x75Il200FastqRd2Rep2/all.fastq | samtools view -Sb - > NanoString_validation/bowtie2_alignment/hits.bam 
date
echo "step 2: eXpress"
cd expression_quantification_testing/testing/eXpress/HepG2rep2
express Homo_sapiens.GRCh37.69.cdna.all.fa NanoString_validation/bowtie2_alignment/hits.bam
date

Here are a few lines of the results file:

bundle_id  target_id   length  eff_length  tot_counts  uniq_counts est_counts  eff_counts  ambig_distr_alpha   ambig_distr_beta    fpkm    fpkm_conf_low   fpkm_conf_high  solvable    tpm
1   ENST00000530128 775 485.194377  5   5   5.000000    7.986490    0.000000e+00    0.000000e+00    1.617381e-01    1.617381e-01    1.617381e-01    T   nan
2   ENST00000483629 1818    1469.073410 495 495 495.000000  612.569797  0.000000e+00    0.000000e+00    5.288346e+00    5.288346e+00    5.288346e+00    T   nan
3   ENST00000398831 2574    3017.226332 315 315 315.000000  268.726940  0.000000e+00    0.000000e+00    1.638554e+00    1.638554e+00    1.638554e+00    T   nan
4   ENST00000502938 810 553.658872  7   7   7.000000    10.240963   0.000000e+00    0.000000e+00    1.984330e-01    1.984330e-01    1.984330e-01    T   nan
5   ENST00000539234 440 269.236664  1   1   1.000000    1.634250    0.000000e+00    0.000000e+00    5.829402e-02    5.829402e-02    5.829402e-02    T   nan
6   ENST00000373539 4476    4147.779393 168 168 168.000000  181.294116  0.000000e+00    0.000000e+00    6.356994e-01    6.356994e-01    6.356994e-01    T   nan

And here is the output of uniq on the last (TPM) column:

 cut -f 15 /projects/mkreitzman_prj/expression_quantification_testing/testing/eXpress/HepG2rep2/results.xprs | sort | uniq
0.000000e+00
nan
tpm
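
As a stopgap, TPM can be recomputed from the FPKM column, because TPM is FPKM rescaled so that all values sum to one million. The following is a minimal sketch under stated assumptions: it expects tab-separated lines with the `target_id` and `fpkm` column names shown in the results file, and the helper names `read_fpkm` and `tpm_from_fpkm` are hypothetical, not part of eXpress. It does not explain why eXpress itself writes nan.

```python
import math

def tpm_from_fpkm(fpkm_values):
    """Rescale FPKM values so that they sum to one million (TPM)."""
    total = math.fsum(fpkm_values)
    if total <= 0:
        # Guard against 0/0, which would give nan for every target.
        return [0.0 for _ in fpkm_values]
    return [v / total * 1e6 for v in fpkm_values]

def read_fpkm(lines):
    """Return (target_id, fpkm) pairs from tab-separated results lines."""
    header = lines[0].rstrip("\n").split("\t")
    id_col = header.index("target_id")
    fpkm_col = header.index("fpkm")
    rows = [ln.rstrip("\n").split("\t") for ln in lines[1:] if ln.strip()]
    return [(r[id_col], float(r[fpkm_col])) for r in rows]

# Two targets taken from the results shown in this report.
example = [
    "target_id\tfpkm\n",
    "ENST00000530128\t1.617381e-01\n",
    "ENST00000483629\t5.288346e+00\n",
]
pairs = read_fpkm(example)
tpm = tpm_from_fpkm([fpkm for _, fpkm in pairs])
```

The explicit guard for a zero total matters here: dividing by a zero sum is one generic way a whole column can end up as nan, although whether that is what happens inside eXpress is not established by this report.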

Please add release tags

Hi,
eXpress is packaged for Debian. To help us spot new versions quickly, some of our automatic tools could rely on tags on GitHub. It would therefore be very helpful if you would consider adding release tags to mark the code that should be distributed to users.
Thanks for considering, Andreas.

Tests

Having even a very simple testing setup would make contributing to the repo a better experience. Perhaps it could start out as a script that runs eXpress for a bunch of different, small input datasets and sanity checks the output?
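
One possible starting point for such a script is a checker that is run on each results file after eXpress finishes. This is only a sketch: it assumes tab-separated output with the `target_id`, `est_counts`, `fpkm` and `tpm` columns that eXpress writes to results.xprs, and `check_results` is a hypothetical helper, not something that exists in the repository.

```python
import math

REQUIRED = ("target_id", "est_counts", "fpkm", "tpm")
NUMERIC = ("est_counts", "fpkm", "tpm")

def check_results(lines):
    """Return a list of human-readable problems found in a results table."""
    header = lines[0].rstrip("\n").split("\t")
    missing = [c for c in REQUIRED if c not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    col = {name: header.index(name) for name in REQUIRED}
    problems = []
    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        target = fields[col["target_id"]]
        for name in NUMERIC:
            text = fields[col[name]]
            value = float(text)
            # Abundances should be finite and non-negative.
            if math.isnan(value) or math.isinf(value) or value < 0:
                problems.append(f"line {number}: {name}={text} for {target}")
    return problems
```

A test runner could call this on the output of each small dataset and fail if the returned list is non-empty.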

Build fails with gcc > 6 (will PR the fix from the Debian team)

I'm building eXpress within the Spack framework.

Building express fails with the following:

     383    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/logger.h: In member function 'void Logger::severe(const char*, ...) const':
     384    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/logger.h:72:37: warning: unnecessary parentheses in declaration of '_mut' [-Wparentheses]
     385         boost::unique_lock<boost::mutex>(_mut);
     386                                         ^
     387    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.cpp: In member function 'double Target::sample_likelihood(bool, const std::vector<const Target*>*) const':
  >> 388    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.cpp:116:62: error: no matching function for call to 'Target::cached_effective_length(const boost::shared_ptr<BiasBoss>&) const'
     389       double tot_eff_len = cached_effective_length(lib.bias_table);
     390                                                                  ^
     391    In file included from /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.cpp:10:
     392    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.h:399:10: note: candidate: 'double Target::cached_effective_length(bool) const'
     393       double cached_effective_length(bool with_bias=true) const;
     394              ^~~~~~~~~~~~~~~~~~~~~~~
     395    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.h:399:10: note:   no known conversion for argument 1 from 'const boost::shared_ptr<BiasBoss>' to 'bool'
  >> 396    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.cpp:121:77: error: no matching function for call to 'Target::cached_effective_length(const boost::shared_ptr<BiasBoss>&) const'
     397                                 neighbor->cached_effective_length(lib.bias_table));
     398                                                                                 ^
     399    In file included from /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.cpp:10:
     400    /home/ghartzell/spack/var/spack/stage/express-1.5.2-6423bq5bjovjwpq2hc4tyi6lwslzubv7/eXpress-1.5.2/src/targets.h:399:10: note: candidate: 'double Target::cached_effective_length(bool) const'
     401       double cached_effective_length(bool with_bias=true) const;
     402              ^~~~~~~~~~~~~~~~~~~~~~~

There's a fix for this problem in the Debian package repository. It works well when applied here. I'll PR it.

No idea where to submit this - piano-scribe is AMAZING, but there's a few issues...

Piano-scribe is an indispensable transcription program, and I want to thank you for creating it (or co-creating it, no idea). I haven't found even commercial software that can match its quality.

I've noticed two issues though, and I also have a question. There's no way to contact you on the project itself, so I'm asking here:

  1. Uploading audio clips longer than 1 minute simply makes the page hang indefinitely, and nothing happens. I've managed to bypass this by splitting my audio into 1 minute segments before uploading.

  2. When the transcription is done and it plays in the browser, it sounds (and looks) excellent. However, the downloaded MIDI files have the note lengths wrong: there are a few enormously long notes, while all other notes are incredibly short. Here's what that looks like, and here are links to the buggy MIDI in the image and the original audio input for comparison. I've managed to work around the issue by manually correcting the note lengths, but it can get quite tedious.

  3. How hard would it be to make this into a standalone program? Online services have a way of disappearing, and it would be an incredible shame to lose this project forever if Glitch goes bankrupt, gets bought out, or whatever.

I also asked the other person listed in the project, asking here too just in case.

Fail on sorted bam input

The documentation covers the requirement for randomly ordered input; however, it would be nice to add a check that a BAM file is not coordinate-sorted before attempting to process it. (See https://twitter.com/KeywanHP/status/646661366774923264)

Testing for a stretch of reads from the same chromosome with monotonically increasing positions is probably a reliable test. If that's hard, at minimum you could check for the presence of an SO:coordinate tag in the @HD header line.
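
Both checks can be sketched on SAM text as follows. This is an illustration under assumptions, not eXpress code: the function names and the `min_run` threshold are hypothetical, and a BAM file would first have to be decoded to SAM records by some other means. The position heuristic reports "sorted" only if, over the first `min_run` mapped records, positions never decrease within a reference and no reference is revisited after the input has moved on from it.

```python
def header_says_coordinate_sorted(sam_lines):
    """True if the @HD header line carries the SO:coordinate tag."""
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines precede all alignment records
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@HD" and "SO:coordinate" in fields[1:]:
            return True
    return False

def looks_coordinate_sorted(sam_lines, min_run=1000):
    """Heuristic check on the first `min_run` mapped records."""
    finished = set()          # references the input has already left
    current, last_pos = None, 0
    seen = 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname == "*":
            continue          # unmapped record, no position to compare
        if rname != current:
            if rname in finished:
                return False  # came back to an earlier reference
            if current is not None:
                finished.add(current)
            current, last_pos = rname, pos
        elif pos < last_pos:
            return False      # position went backwards
        else:
            last_pos = pos
        seen += 1
        if seen >= min_run:
            return True
    return False              # too few records to decide
```

The header check is cheap but depends on the aligner or sorting tool having written the tag, so the position heuristic is the more robust of the two.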

eXpress can't handle empty lines in FASTA files

Today I tried to run eXpress, and it gave an error that a particular sequence was not found in the supplied FASTA file. In fact, the sequence was present as the second entry in the file.

I eventually realised that the FASTA was formatted like this:

>seq1
ATGC

>seq2
GTCA

After removing the empty lines, eXpress ran fine.

So, bug: eXpress can't handle empty lines between records in FASTA files, despite this being fairly common in practice.
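
For illustration, a reader that tolerates such files only needs to skip blank lines before deciding whether a line is a header or sequence. The sketch below is not eXpress's parser (which is written in C++); `read_fasta` is a hypothetical helper that shows the intended behaviour on the example above.

```python
def read_fasta(lines):
    """Parse FASTA records into a dict, ignoring blank lines."""
    records = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate empty lines anywhere in the file
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            # Use the first whitespace-delimited token as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records

records = read_fasta([">seq1\n", "ATGC\n", "\n", ">seq2\n", "GTCA\n"])
```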

Soft-clipped parts of reads are considered as indels

eXpress seems to treat soft clipping in reads as indels. This can be demonstrated with the option --max-indel-size. I have reads with soft clips ranging from 3 to 40 bases. With the default --max-indel-size (i.e., 10), eXpress could only quantify reads whose largest indel, including soft clips, was <= 10 bases. I lost most of my perfectly matching reads in quantification because they have soft-clipped parts longer than 10 bases. Changing --max-indel-size to 40 allowed most of the reads to be quantified, but sometimes led to a segfault.
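
In the SAM format, soft clips (S) are a separate CIGAR operation from insertions (I) and deletions (D), so a filter on indel size could in principle count only I and D. The sketch below shows that distinction; it is not eXpress code, `cigar_summary` is a hypothetical helper, and how eXpress actually applies --max-indel-size is exactly what this report is questioning.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_summary(cigar):
    """Return (largest indel length, total soft-clipped bases)."""
    max_indel = 0
    soft_clipped = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "ID":
            max_indel = max(max_indel, n)   # true insertion or deletion
        elif op == "S":
            soft_clipped += n               # clipped bases, not an indel
    return max_indel, soft_clipped

# A perfectly matching read with a 40-base soft clip has no indel at all.
print(cigar_summary("40S60M"))      # (0, 40)
print(cigar_summary("3S50M2I45M"))  # (2, 3)
```

Under this reading, the first read would pass a maximum indel size of 10, since its largest indel is 0.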
