Automatically exported from code.google.com/p/rna-star
What steps will reproduce the problem?
1. Download STAR_2.3.1z15.tgz
2. Start cygwin 1.7.31 64bit version
3. Extract and use command make in the folder STAR_2.3.1z15
What is the expected output? What do you see instead?
I wanted to compile STAR_2.3.1z15 with the command make, but it produced
errors, starting with:
Makefile:56: Depend.list: No such file or directory
What version of the product are you using? On what operating system?
STAR_2.3.1z15 with cygwin 1.7.31 64bit version
Please provide any additional information below.
Unfortunately my system language is German, but I attached the whole error
message and hope it helps. I am aware that cygwin is not officially supported,
but I hoped someone could help regardless, since I must rely on cygwin.
Original issue reported on code.google.com by [email protected]
on 5 Aug 2014 at 2:42
STAR 2.3.0e
Linux dna 3.2.0-41-generic #66-Ubuntu SM
THIS WORKED:
* The fasta file is a 2.8 Mbp bacteria
STAR --runMode genomeGenerate --genomeDir ref --genomeFastaFiles 6008.fna --runThreadN 32
THIS CORE DUMPED:
* Read file is FASTQ, 31bp reads, in Phred+64 format.
Core was generated by `STAR --genomeDir ref --readFilesIn 6008_mRNA.fastq
--runThreadN 32'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000405e4b in compareSeqToGenome(char**, unsigned long long,
unsigned long long, unsigned long long, char*, PackedArray&, unsigned long
long, bool, bool&, Parameters*) ()
Original issue reported on code.google.com by [email protected]
on 6 Jun 2013 at 6:07
What steps will reproduce the problem?
1. downloaded STAR_2.3.0e.Linux_x86_64.tgz
2. uploaded to /home/username/src/
3. unzipped following directions at
http://www.linuxforums.org/forum/linux-tutorials-howtos-reference-material/64958-how-install-software-linux.html
4. could not find any installation instructions
5. unable to go any further in installation process
Please provide any additional information below.
I'm completely new to linux. I'm trying to install the newest version of STAR
because the old version I was using (2.2.0c) has the same segmentation fault
documented here (https://groups.google.com/forum/#!topic/rna-star/j8KomjbDfW0)
when trying to generate a genome file from a small fasta file. This was the
recommended fix.
A way to make the installation work, or another way around the segmentation
fault, would be very much appreciated!
Original issue reported on code.google.com by [email protected]
on 29 Apr 2014 at 12:47
What steps will reproduce the problem?
1. STAR runs successfully; the problem is an inconsistency in the alignment.
What is the expected output? What do you see instead?
Below is a pair of reads with three alignment records: one mate aligns to chr1,
and the other aligns to both chr1 and chr3 as a fusion read. I compared the
reads against the reference sequences and found no mismatches, but the nM field
indicates two mismatches.
C185NACXX121031:7:1214:18090:96585 337 chr3 33430704 3 68S33M chr1 569497 0 TTCT
AGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAGCCCCTAAATCATCAC
CAGAATGTCTATCCATG >CADC>:DBBB@CDDBEEBDDDBDHEHECCGGFIGHF@IGGGJJJIIJIJIIHIIJ
JIIIBIIJJHJIJJJIIHHIGEIHGHFJJJIGHGHHHFFFFFCCC NH:i:2 HI:i:1 AS:i:36 nM:i:2
C185NACXX121031:7:1214:18090:96585 163 chr1 569497 3 101M = 569722 293 TAGTTATTA
TCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTC
ATGCACCTAATT BCCDFFFFGHHHHJJJJJJJIIJJIJJJJIJJJJIJJJHJFIJJGIJJJIBFHIJJJJGIHHHF
FDDEDCDDDDDDDCDDDDDBDDDDDDCDDDDDDDCD: NH:i:2 HI:i:2 AS:i:168 nM:i:2
C185NACXX121031:7:1214:18090:96585 83 chr1 569722 3 68M33S = 569497 -293 TTCTAGT
AAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAGCCCCTAAATCATCACCAG
AATGTCTATCCATG >CADC>:DBBB@CDDBEEBDDDBDHEHECCGGFIGHF@IGGGJJJIIJIJIIHIIJJIIIBIIJ
JHJIJJJIIHHIGEIHGHFJJJIGHGHHHFFFFFCCC NH:i:2 HI:i:2 AS:i:168 nM:i:2
What version of the product are you using? On what operating system?
STAR_2.3.0e_r291
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 6 May 2014 at 4:54
Hey Alex,
I am having problems indexing hg19 on EC2 (m3.2xlarge) with STAR_2.3.0e_r291.
Any thoughts on what is going on? There is 31GB of RAM, so it should all fit:
cat /proc/meminfo | grep Mem
MemTotal: 30828584 kB
MemFree: 29358328 kB
Log.out:
Aug 05 00:41:01 ..... Started STAR run
Aug 05 00:41:01 ... Starting to generate Genome files
Aug 05 00:42:23 ... starting to sort Suffix Array. This may take a long time...
Aug 05 00:42:44 ... sorting Suffix Array chunks and saving them to disk...
Aug 05 01:07:44 ... loading chunks from disk, packing SA...
Aug 05 01:12:00 ... writing Suffix Array to disk ...
Aug 05 01:13:46 ... Finished generating suffix array
Aug 05 01:13:46 ... starting to generate Suffix Array index...
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
run.sh: line 5: 11810 Aborted STAR --runMode genomeGenerate
--genomeDir .../human_g1k_v37/star/ --genomeFastaFiles
....human_g1k_v37/human_g1k_v37.fasta --sjdbGTFfile .../human_g1k_v37/...gtf
--runThreadN 8
Original issue reported on code.google.com by [email protected]
on 5 Aug 2014 at 1:24
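A note on the memory figures in the report above: /proc/meminfo reports its values in kB, where 1 kB is 1024 bytes, so the MemTotal shown works out to roughly 29.4 GiB rather than 31 GiB. The following is a minimal Python sketch of that conversion. The function names are illustrative only, and how much memory genome generation needs for this genome is not stated in the report.

```python
def parse_meminfo(lines):
    """Map /proc/meminfo keys to integer values in kB (1 kB = 1024 bytes)."""
    values = {}
    for line in lines:
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def kb_to_gib(kb):
    """Convert a /proc/meminfo value in kB to GiB."""
    return kb / (1024 * 1024)


# The two lines quoted in the report.
meminfo = parse_meminfo(["MemTotal: 30828584 kB", "MemFree: 29358328 kB"])
for key in ("MemTotal", "MemFree"):
    print(f"{key}: {kb_to_gib(meminfo[key]):.2f} GiB")
```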
I'm trying to generate a custom genome using the provided sjdb file by running
this:
"$star_Path --runMode genomeGenerate --genomeDir $star_genome_dir
--genomeFastaFiles $genomeFastaFiles --runThreadN 8 --sjdbFileChrStartEnd
$sjdbOverhangFile --sjdbOverhang 91"
I'm using the mm9.fa and the Mus_musculus.NCBIM37.66.gtf.sjdb file both
provided under genome downloads, however this is the output:
EXITING because of FATAL error, the sjdb chromosome Y is not found among the
genomic chromosomes
SOLUTION: fix your file
sjdbFileChrStartEnd=...../Mus_musculus.NCBIM37.66.gtf.sjdb_OG.txt at line #1
What is wrong?
Olivier
Original issue reported on code.google.com by [email protected]
on 12 Sep 2013 at 11:52
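A common cause of this kind of error is a naming mismatch: one file uses names such as "chrY" and the other uses "Y". Whether that is the case here is an assumption, since the report does not show the FASTA headers. The Python sketch below lists chromosome names that appear in the junction file but not in the FASTA. It assumes the chromosome name is the first whitespace-delimited token on each FASTA header line and the first column of each junction line. The inputs are hypothetical.

```python
def fasta_chromosomes(fasta_lines):
    """Collect sequence names: the first token after '>' on each header line."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def sjdb_chromosomes(sjdb_lines):
    """Collect chromosome names from the first column of a junction file."""
    names = set()
    for line in sjdb_lines:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def missing_chromosomes(fasta_lines, sjdb_lines):
    """Names used in the junction file that the FASTA does not define."""
    return sorted(sjdb_chromosomes(sjdb_lines) - fasta_chromosomes(fasta_lines))


# Hypothetical inputs: a FASTA with 'chr'-prefixed names, junctions without.
fasta = [">chr1", "ACGT", ">chrY", "ACGT"]
sjdb = ["Y\t100\t200\t+", "1\t300\t400\t-"]
print(missing_chromosomes(fasta, sjdb))
```

If the list is non-empty, renaming the chromosomes in one of the two files so that both use the same convention is the usual remedy.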
Hello,
I'm getting a segmentation fault while trying to align reads to a bacterial
genome. I was able to isolate the problem to the following read:
@M01793:3:000000000-A5GLB:1:1101:14294:3045 1:N:0:4
CCGAAGGACATTGCAGCACCGTTCTGAGACTTAACAGCAGCCAGGTAGCCGAAGTAAGTCCACTGGATAGCTTCTGTCTC
TTATACACATCTCCGAGCCCACGAGACTCCTGAGCATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAA
+
ABB?ADBFFFBFGGFG4GGGGGGHGGHFFHHHGHHHHFHGHHHHH3DGFGEEFGAGHEHFHHHHFHHHGFBBGHGHDGGH
HHHHHHHHGGHHHHGGGGGGHGGCEGDFHHHGF2FFEHEE0??EBBHFDFGGGHGFHGGFGGFG//>A@-
My guess is that it has something to do with the run of A's at the end of the
read. I'd like to use the aligner, but would like to refrain from diving into
the code. Perhaps this case would be easy to debug and fix, given the single
read on which it fails.
Cheers,
Alexey
Original issue reported on code.google.com by [email protected]
on 15 Nov 2013 at 12:22
On several occasions the unmapped reads were not transferred from the _tmp
folder to the main file; they seem to just get deleted (the Unmapped file is
empty).
If I run the software without threading, everything is fine.
Aligned reads are reported in both cases without problems.
Original issue reported on code.google.com by [email protected]
on 22 Jan 2014 at 12:16
Hi,
I'd like to rank multi-mappers based on alignment quality.
Following the manual, I can distinguish the secondary multi-mappers from the
primary:
"For multi-mappers, all alignments except one are marked with 0x100 (secondary
alignment) in the FLAG column 2. The un-marked alignment is either the best one
(i.e. highest scoring), or is randomly selected from the alignments of equal
quality."
Is there a way I could rank the secondary multi-mappers?
Best,
M.
Original issue reported on code.google.com by [email protected]
on 3 Feb 2014 at 6:17
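One way to approach the ranking question above, sketched in Python: group SAM records by read name and sort each group by the AS (alignment score) tag, which appears in STAR output quoted elsewhere on this page. This post-processes the SAM file and is not a STAR feature. The function names and the shortened records are hypothetical.

```python
from collections import defaultdict


def alignment_score(fields):
    """Return the integer value of the AS:i tag, or None if it is absent."""
    for tag in fields[11:]:
        if tag.startswith("AS:i:"):
            return int(tag[5:])
    return None


def rank_by_score(sam_lines):
    """Group records by read name and order each group by AS, best first."""
    by_read = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        is_secondary = bool(int(fields[1]) & 0x100)
        by_read[fields[0]].append(
            (alignment_score(fields), fields[2], int(fields[3]), is_secondary)
        )
    # Records without an AS tag sort after all scored records.
    return {
        name: sorted(hits, key=lambda h: (h[0] is None, -(h[0] or 0)))
        for name, hits in by_read.items()
    }


# Hypothetical, shortened records for a read with two alignments.
records = [
    "readA\t256\tchr3\t100\t3\t4M\t*\t0\t0\tACGT\tFFFF\tNH:i:2\tHI:i:1\tAS:i:36",
    "readA\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tFFFF\tNH:i:2\tHI:i:2\tAS:i:168",
]
for hit in rank_by_score(records)["readA"]:
    print(hit)
```

For paired-end data each mate is a separate record, so a group contains one entry per mate per alignment.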
I have been using STAR with Illumina 101bp paired end reads. The first set of
libraries I sequenced work great going through the pipeline, but I have had a
very strange problem with the most recent libraries.
I run STAR with the following command:
Star_Directory/STAR --genomeDir Star_Directory/STAR_2.3.0/Genome --readFilesIn
$f $f2 --outSAMstrandField intronMotif --runThreadN 3
where f and f2 are the paired end reads:
1-Nq-C96_S94_L001_R1_001_val_1.fq
1-Nq-C96_S94_L001_R2_001_val_2.fq
which have been trimmed by trim_galore with the call:
trim_galore -q 15 --phred33 --paired --length 50 -a CTGTCTCTTATACACATCT
--stringency 3 $f $f2
where f and f2 are the untrimmed fastq files:
1-Nq-C96_S94_L001_R2_001.fastq
1-Nq-C96_S94_L001_R1_001.fastq
For these runs the log.out file shows something like this:
Started job on | Sep 17 13:16:13
Started mapping on | Sep 17 13:17:17
Finished on | Sep 17 13:17:47
Mapping speed, Million of reads per hour | 21.76
Number of input reads | 181350
Average input read length | 179
UNIQUE READS:
Uniquely mapped reads number | 1973
Uniquely mapped reads % | 1.09%
Average mapped length | 176.75
Number of splices: Total | 24
Number of splices: Annotated (sjdb) | 0
Number of splices: GT/AG | 23
Number of splices: GC/AG | 1
Number of splices: AT/AC | 0
Number of splices: Non-canonical | 0
Mismatch rate per base, % | 0.39%
Deletion rate per base | 0.04%
Deletion average length | 2.22
Insertion rate per base | 0.00%
Insertion average length | 1.50
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 948
% of reads mapped to multiple loci | 0.52%
Number of reads mapped to too many loci | 22
% of reads mapped to too many loci | 0.01%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 98.37%
% of reads unmapped: other | 0.01%
However, looking at the FASTQ files, the reads appear for the most part
adequate.
I've attached abbreviated versions of the two paired-end FASTQ files.
I've also attached abbreviated versions of two paired-end FASTQ files that
mapped with a unique mapping percentage of approximately 90% (called
read1/2_goodMappers.fq).
I am new to RNA-seq analysis, so this may be a trivial issue. I would
appreciate any help.
I am using STAR 2.3.0 on Mac OSX.
Thanks so much.
Original issue reported on code.google.com by [email protected]
on 18 Sep 2014 at 4:17
What steps will reproduce the problem?
1. Running STAR alignment (stranded paired-end)
What is the expected output? What do you see instead?
The alignment files are generated and have some content, but the program is
stuck after printing:
Nov 19 17:18:30 ..... Started STAR run
Nov 19 17:18:36 ..... Started mapping
What version of the product are you using? On what operating system?
STAR_2.3.0e.Linux_x86_64
Running on Linux CentOS 6.4, Kernel Linux 2.6.32-358.0.1.el6.x86_64, GNOME 2.2.8.2
Please provide any additional information below.
The command is:
/rdata/ngseq/Playground/guy/STAR/STAR_2.3.0e.Linux_x86_64/STAR --genomeDir
/rdata/ngseq/Playground/guy/STAR/Genome --readFilesIn
/rdata/ngseq/original_data/rna/illumina/2013-05-05_Guy/GW1/fastq/R1.fastq
/rdata/ngseq/original_data/rna/illumina/2013-05-05_Guy/GW1/fastq/R2.fastq
--runThreadN 8
Thanks,
Guy
Original issue reported on code.google.com by [email protected]
on 19 Nov 2013 at 11:04
What steps will reproduce the problem?
1. Start STAR on a RNA dataset
2. Specify --runThreadN 60 (on a 160Core Machine)
3. Watch STAR use only 6~7 cores (top shows <800% CPU)
What is the expected output? What do you see instead?
In your publication you write that STAR scales well. I was hoping for that,
but I can't get it to scale.
What version of the product are you using? On what operating system?
...since STAR does not have a version-output, hard to say, but I can try to
update...
Fedora 20, 64bit
Please provide any additional information below.
I also played with the genomeLoad parameter, because a colleague told me he had
seen some strange behavior and advised me to disable it.
However, I cannot see a difference in speed (or scaling) when using "genomeLoad
NoSharedMemory" or "genomeLoad LoadAndKeep" (followed by "genomeLoad
LoadAndKeep").
I'll try updating STAR and see.
Original issue reported on code.google.com by [email protected]
on 24 Feb 2014 at 1:25
What steps will reproduce the problem?
1. star --genomeDir ./hg19 --genomeLoad LoadAndExit
What is the expected output? What do you see instead?
Expected: a loaded genome.
Often, however, we receive error messages in the 'Log.out' file claiming,
"Another job is still loading the genome, sleeping for 1 min", which isn't true.
What version of the product are you using? On what operating system?
STAR_2.1.4a_r178 on
Red Hat Enterprise Linux Server release 6.3 (Santiago)
Linux version 2.6.32-279.el6.x86_64 (gcc version 4.4.6 20120305 (Red Hat
4.4.6-4) (GCC))
Original issue reported on code.google.com by [email protected]
on 21 Nov 2012 at 2:30
What steps will reproduce the problem?
1. File
ftp://ftp2.cshl.edu/gingeraslab/tracks/STARrelease/STARgenomes/hg19_Gencode19.tgz
is unavailable
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
Please provide any additional information below.
The mouse reference index is available but not the human.
Thank you
Original issue reported on code.google.com by [email protected]
on 25 Mar 2014 at 10:33
The STAR command is not very helpful. This will annoy potential users,
especially as the documentation is a PDF and the src/ distribution doesn't
even include it. An ASCII README with basic usage and some pointers to the
documentation would help.
% STAR
EXITING because of fatal input ERROR: could not open readInFile=Read1
% STAR -h
% STAR --help
EXITING: FATAL INPUT ERROR: empty value for paramter "-h" in input
"Command-Line-Initial"
SOLUTION: use non-empty value for this parameter
Original issue reported on code.google.com by [email protected]
on 6 Jun 2013 at 6:10
I get this error using g++-4.9:
Genome.cpp:218:57: error: 'SHM_NORESERVE' was not declared in this scope
shmID = shmget(shmKey, shmSize, IPC_CREAT | SHM_NORESERVE | 0666); // shmID = shmget(shmKey, shmSize, IPC_CREAT | SHM_NORESERVE | SHM_HUGETLB | 0666);
^
make: *** [Genome.o] Error 1
Original issue reported on code.google.com by [email protected]
on 22 Feb 2014 at 12:28
What steps will reproduce the problem?
1. Download STAR_2.3.1q.tgz
2. Start cygwin 2.831
3. Extract and use command make in the folder STAR_2.3.1q
What is the expected output? What do you see instead?
I wanted to compile STAR_2.3.1q with the command make, but it produced the
following error:
Genome.cpp:218:57: Fehler: »SHM_NORESERVE« was not declared in this scope
What version of the product are you using? On what operating system?
STAR_2.3.1q with cygwin 2.831
Please provide any additional information below.
Unfortunately my system language is German, but I attached the whole error
message and hope it helps. I am aware that cygwin is not officially supported,
but I hoped someone could help regardless, since I must rely on cygwin.
Original issue reported on code.google.com by [email protected]
on 5 Mar 2014 at 10:34
It would be nice to have some kind of an input flag that will print the version
for when the STAR binary is installed by itself somewhere like under
/usr/local/bin. Something like STAR --version or STAR -V
Original issue reported on code.google.com by mariogiov
on 20 Nov 2013 at 10:27
What steps will reproduce the problem?
1. Use the following as input.
$ cat offending_R1.fa
>HS13_186:2:2106:7552:5018/1
TTGTTTTTTGTGTCTCAAATTAACAACCTAACATCATAACTGAAAGAATAAGTGAAGCAAGAACAAATCAAC
$ cat offending_R2.fa
>HS13_186:2:2106:7552:5018/2
TTGTTTTTTGTGTCTCAATTTCCTTCGATTCAGCTCTGATTTTGGTTATTTCTTATCTTCTGCTAGCTTTGG
2. Run STAR like this:
STAR_2.3.1z1/STAR --genomeDir /raid/references-and-indexes/hg19/star/2013_03_04
--genomeLoad NoSharedMemory --readFilesIn
offending_R1.fa offending_R2.fa --outStd SAM > test.sam
What is the expected output? What do you see instead?
STAR quits with the following error:
EXITING because of FATAL ERROR in reads input: short read sequence line: 1
Read Name=>HS13_186:2:2106:7552:5018/2
Read Sequence====
DEF_readNameLengthMax=50000
DEF_readSeqLengthMax=50000
What version of the product are you using? On what operating system?
STAR_2.3.1z1
$ cat /etc/SuSE-release
openSUSE 12.3 (x86_64)
VERSION = 12.3
CODENAME = Dartmouth
Please provide any additional information below.
I was having a problem with this input using STAR version STAR_2.3.0e although
I think the error was different. I upgraded to see if the problem was fixed but
it does not seem to be.
-Chris DeBoever
Original issue reported on code.google.com by [email protected]
on 14 Apr 2014 at 8:51
What steps will reproduce the problem?
What is the expected output? What do you see instead?
I have multiple gzipped fastq files.
I run STAR with --readFilesIn set to:
--readFilesIn
CRC0321-1_FCC4JJ9ACXX_L6_HUMhiaTACYRBAPEI-220_1.fq.gz,CRC0321-1_FCC4JJ9ACXX_L7_HUMhiaTACYRBAPEI-220_1.fq.gz,CRC0321-1_FCC4JJ9ACXX_L8_HUMhiaTACYRBAPEI-220_1.fq.gz
CRC0321-1_FCC4JJ9ACXX_L6_HUMhiaTACYRBAPEI-220_2.fq.gz,CRC0321-1_FCC4JJ9ACXX_L7_HUMhiaTACYRBAPEI-220_2.fq.gz,CRC0321-1_FCC4JJ9ACXX_L8_HUMhiaTACYRBAPEI-220_2.fq.gz
Below is the ending part of the log generated by STAR
...
Starting to map file # 0
mate 1:
/home/hbi16088/data/projects/pdx/fastq/CRC0321-1_FCC4JJ9ACXX_L6_HUMhiaTACYRBAPEI-220_1.fq.gz
mate 2:
/home/hbi16088/data/projects/pdx/fastq/CRC0321-1_FCC4JJ9ACXX_L6_HUMhiaTACYRBAPEI-220_2.fq.gz
Created thread # 7
Created thread # 8
Created thread # 9
Completed: thread #7
Completed: thread #1
Completed: thread #5
Completed: thread #3
Completed: thread #2
Completed: thread #4
Completed: thread #6
Completed: thread #9
Completed: thread #0
Joined thread # 1
Joined thread # 2
Joined thread # 3
Joined thread # 4
Joined thread # 5
Joined thread # 6
Joined thread # 7
Completed: thread #8
Joined thread # 8
Joined thread # 9
ALL DONE!
--genomeLoad=LoadAndKeep .
STAR only used the first pair of fastq files instead of all three pairs.
What version of the product are you using? On what operating system?
STAR svn revision compiled=STAR_2.3.1z4_r419
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 3 Jun 2014 at 2:51
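The report above passes two comma-separated lists to --readFilesIn, one per mate. When such a command is assembled by a script, stray spaces or line breaks inside a list are easy to introduce. The Python sketch below builds the two lists as single tokens. The function name and the short file names are hypothetical, and the report does not confirm whether this STAR revision processes every file in the lists.

```python
def read_files_args(mate1_files, mate2_files):
    """Build the --readFilesIn arguments: one comma-joined token per mate."""
    if len(mate1_files) != len(mate2_files):
        raise ValueError("mate 1 and mate 2 lists must have the same length")
    for name in list(mate1_files) + list(mate2_files):
        if "," in name or any(ch.isspace() for ch in name):
            raise ValueError(f"file name cannot be used in a list: {name!r}")
    return ["--readFilesIn", ",".join(mate1_files), ",".join(mate2_files)]


# Hypothetical lane files, paired in the same order for both mates.
mate1 = ["L6_1.fq.gz", "L7_1.fq.gz", "L8_1.fq.gz"]
mate2 = ["L6_2.fq.gz", "L7_2.fq.gz", "L8_2.fq.gz"]
print(" ".join(read_files_args(mate1, mate2)))
```

Keeping both lists in the same lane order matters, because mates are matched by position in the lists.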
Hi,
I'm trying to figure out how to run a second mapping pass, as described in the
paper
"It is also possible to run a second mapping pass, supplying it with splice
junction loci found in the first mapping pass. In this case, STAR will not
discover any new junctions but will align spliced reads with short overhangs
across the previously detected junctions." (Dobin et al., 2012)
I didn't find any examples, so I don't understand how to run it...
Thanks in advance,
M.
Original issue reported on code.google.com by [email protected]
on 25 Jun 2013 at 9:15
What steps will reproduce the problem?
Any time I run STAR genomeGenerate with more than 1 thread, the SA file is not
generated (see command line)
1. STAR --runMode genomeGenerate --runThreadN 6 --genomeDir ${gdir}
--genomeFastaFiles ${faFiles}
What is the expected output? What do you see instead?
All output files are present, except for the SA file.
What version of the product are you using? On what operating system?
STAR 2.3 on Red Hat 6.3
Please provide any additional information below.
Error output:
Jan 25 21:35:27 ..... Started STAR run
Jan 25 21:35:27 ... Starting to generate Genome files
Jan 25 21:35:27 ... starting to sort Suffix Array. This may take a long time...
Floating point exception
Original issue reported on code.google.com by [email protected]
on 26 Jan 2014 at 5:39
What steps will reproduce the problem?
1. Run make on Ubuntu
What is the expected output? What do you see instead?
Compilation
What version of the product are you using? On what operating system?
STAR_2.3.0e.tgz
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 25 Sep 2013 at 9:25
What steps will reproduce the problem?
1. input: paired-end read
2. Log.final.out
What is the expected output?
statistics for both paired-end reads
read1: 188881802
read2: 188881802
What do you see instead?
Statistics for just one file?
e.g.
Number of input reads | 188881802
Average input read length | 102
UNIQUE READS:
Uniquely mapped reads number | 159058244
Uniquely mapped reads % | 84.21%
However, Aligned.out.sam has reads from both ends mapped:
418527540 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
379423218 + 0 mapped (90.66%:-nan%)
418527540 + 0 paired in sequencing
209249304 + 0 read1
209278236 + 0 read2
379423218 + 0 properly paired (90.66%:-nan%)
379423218 + 0 with itself and mate mapped
What version of the product are you using? On what operating system?
STAR 2.3
Linux tbi-pbs1 2.6.37.6-0.20-default #1 SMP 2011-12-19 23:39:38 +0100 x86_64
x86_64 x86_64 GNU/Linux
Please provide any additional information below.
any comments would be appreciated very much!
Original issue reported on code.google.com by [email protected]
on 25 Feb 2013 at 7:50
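One possible reading of the numbers above, offered as an assumption rather than a confirmed diagnosis: Log.final.out counts each read pair once, while the second listing (which looks like samtools flagstat output) counts every SAM record. Under that reading, both mates and any additional alignments of multi-mappers add to the record counts. The Python sketch below does the arithmetic on the figures quoted in the report; the variable and function names are illustrative.

```python
# Figures quoted in the report.
input_pairs = 188881802      # "Number of input reads" in Log.final.out
read1_records = 209249304    # "read1" in the second listing
read2_records = 209278236    # "read2" in the second listing
total_records = 418527540    # "in total" in the second listing


def extra_records(records, input_reads):
    """Records beyond one per input read, e.g. extra multi-mapper alignments."""
    return records - input_reads


# The per-mate record counts add up to the reported total.
print(read1_records + read2_records == total_records)
print("extra read1 records:", extra_records(read1_records, input_pairs))
print("extra read2 records:", extra_records(read2_records, input_pairs))
```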
What steps will reproduce the problem?
1. Cannot open the link.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 17 Mar 2014 at 10:54
What steps will reproduce the problem?
1. use multiple fasta (e.g. contigs) with many entries (>10000) as a reference
file to be mapped to
What is the expected output? What do you see instead?
expected: clean run, seen: crash
What version of the product are you using? On what operating system?
newest (2.3 or so)
Original issue reported on code.google.com by [email protected]
on 12 Feb 2013 at 10:18
If you set outSAMattributes to All, the generated SAM file is incompatible
with Cufflinks. The error is something like "no XS tag for a spliced read".
When outSAMattributes is left in Standard mode and the SAM file is passed to
Cufflinks, no error is thrown. I think the many new tags in All mode confuse
Cufflinks.
Also, I would appreciate it if you could add a command line option to add RG,
LB, SM, PL tags to the SAM file.
What version of the product are you using? On what operating system?
Latest released version 2.3.0.1 on Linux
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 15 Apr 2013 at 5:42
What steps will reproduce the problem?
1. Running star with parameters:
star --genomeDir $GENOME_DIR \
--readFilesIn $READS1 $READS2 \
--runThreadN $TREADS \
--genomeLoad LoadAndKeep \
--alignIntronMax 500000 \
--alignMatesGapMax 500000 \
--outFileNamePrefix $OUT/ \
--outFilterMultimapNmax 6 \
--outFilterMismatchNmax 3 \
--outFilterMismatchNoverLmax 0.05 \
--outFilterMatchNmin 16 \
--outFilterScoreMinOverLread 0 \
--outFilterMatchNminOverLread 0 \
--outSAMunmapped None \
--outReadsUnmapped Fastx \
--sjdbFileChrStartEnd $GENOME_DIR \
--sjdbOverhang $SJ_DB_OVERHANG \
--chimSegmentMin $CHIM_SEGMENT_MIN \
--chimScoreMin $CHIM_SCORE_MIN \
--clip3pAdapterSeq TCGTATGCCGTCTTCTGCTTG \
--clip3pAdapterMMp 0.1
2. When htseq-count is used as the read counter, in some cases it fails and
reports an error.
What is the expected output? What do you see instead?
htseq-count normally produces a table with counts for each gene ID.
Instead I got errors similar to this one:
Error occured when processing SAM input (line 3773018 of file
/III_pREP_Input/star/Aligned.out.sam):
Python int too large to convert to C long
[Exception type: OverflowError, raised in _HTSeq.pyx:1313]
That line does not look right at all: it contains a huge number and/or a field
merged with the read sequence:
HWI-ST1149:193:C4309ACXX:3:1114:1490:42590 355 chr2 27274108 0 28S23M = 27274082
18446744073709551615CCGGGGGGATTAGCTCCAATGGTAGAGCCTCGCTTGGCTTGCGAGAGGTAG =?=DBDD
:0:>>AAA3((383>388(8>=A87:<==<AA?:?:0055=339 NH:i:5 HI:i:4 AS:i:28 nM:i:2
What version of the product are you using? On what operating system?
star231z1
Please provide any additional information below.
See other examples of problematic lines from .sam files below:
HWI-ST1149:193:C4309ACXX:3:1104:17209:8771 339 chr14 70236478 3 28S16M7S = 69795
937 -440557 ATTGCTCTCGTTACCTCGGGAATTGAGGTTCCGAATAAGAGGTCATTGGCG HJJJJIIIJJHHFJII
JJJJJJJJJJJJJJJJJJJJJJHHHHHFFFDDB:B NH:i:2 HI:i:2 AS:i:20 nM:i:0
HWI-ST1149:193:C4309ACXX:3:1109:17901:52090 355 chr2 70230204 0 22S15M14S = 7023
0182 18446744073709551611 GGGGCAATACAGAATGTTCGTCGAGTTAAATCCTCTGTAGACGACTTAAAT BB
CDFFFFHHHHGIJJIIJJJJJEHGHJIJJHIIIJJIJIGlsHJJJJJIHIJ NH:i:5 HI:i:2 AS:i:16 nM:i:0
HWI-ST1149:193:C4309ACXX:3:1114:1490:42590 355 chr2 27274108 0 28S23M = 27274082
18446744073709551615CCGGGGGGATTAGCTCCAATGGTAGAGCCTCGCTTGGCTTGCGAGAGGTAG =?=DBDD
:0:>>AAA3((383>388(8>=A87:<==<AA?:?:0055=339 NH:i:5 HI:i:4 AS:i:28 nM:i:2
HWI-ST1149:193:C4309ACXX:1:2105:10512:63315 355 chr2 70230204 1 35S15M1S = 70230
182 18446744073709551611 GCGACGATATTTCACCACAATACAGAATGTTGGTCGAGTTAAATCCTCTGT BB@
DFFFFHHHHDIJIJJJIJJJJJIJIJHIIIHIJFIFHIJJIIJIIJIC NH:i:4 HI:i:4 AS:i:16 nM:i:0
Thanks for your help.
Vladimir
Original issue reported on code.google.com by [email protected]
on 4 Apr 2014 at 1:09
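The huge values in the records above are what small negative numbers look like when printed as unsigned 64-bit integers: 18446744073709551615 is 2^64 - 1, which is -1 when reinterpreted as signed. The Python sketch below does that reinterpretation and also splits a token in which the number runs into the read sequence. The function names are hypothetical and the sequence in the usage example is shortened. This illustrates the symptom only and says nothing about where in STAR the value originates.

```python
import re

MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


def split_fused_tlen(token):
    """Split a token like '18446744073709551615CCGG' into (number, sequence).

    Returns None when the token is not a number fused with a sequence.
    """
    match = re.fullmatch(r"(-?\d+)([ACGTNacgtn]+)", token)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


for raw in (18446744073709551615, 18446744073709551611):
    print(raw, "->", to_signed64(raw))

# Shortened version of the fused token from the report.
number, sequence = split_fused_tlen("18446744073709551615CCGGGGGGATTAGC")
print(to_signed64(number), sequence)
```

The SAM specification limits the template length field to the signed 32-bit range, which is consistent with htseq-count rejecting these values.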