Comments (11)
Hi Duma,
Thanks for spotting this. It was due to a bug I introduced in the last update, where I revised the UMI extraction. I fixed it and pushed an update to the repository.
Thanks,
- Paul
from split-seq_demultiplexing.
Hi Paul,
Thank you for your reply. Just one last question.
Does SPLiT-Seq_demultiplexing 0.1.1 search for the barcodes at specific positions in the Read 2 reads?
According to the SPLiT-seq protocol, Barcode3 should start at position 11 nt, Barcode2 at position 48 nt, and Barcode1 at position 86 nt in Read 2.
Thank you again in advance for your reply.
with best wishes
Duma
Hi Duma,
Yes, SPLiT-Seq_demultiplexing searches for the barcodes in read 2. However, rather than using fixed barcode positions (11 nt, 48 nt, 86 nt, etc.), it searches for barcodes based on their sequence plus a static flanking sequence. See the Round1_barcodes_new5.txt, Round2_barcodes_new4.txt, and Round3_barcodes_new4.txt barcode files for examples. Additionally, the barcodes must be found in the correct order. With this approach, demultiplexing remains accurate even when a read starts at a slightly different position or contains insertions or deletions at non-barcode positions.
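The idea of an order-aware, position-independent search can be sketched in a few lines of shell. This is only an illustration, not the tool's actual implementation: the function name find_in_order and the sequences are made up, and the sketch ignores any mismatch tolerance.

```shell
# Hypothetical helper for illustration; the sequences in the calls below are
# made up and are not taken from the Round*_barcodes files.
find_in_order() {
    # $1 = read sequence; $2, $3, $4 = barcode+flank patterns in expected order.
    # ".*" between the patterns lets each one sit at any offset, as long as
    # the order is preserved.
    if printf '%s\n' "$1" | grep -Eq "${2}.*${3}.*${4}"; then
        echo "found"
    else
        echo "not found"
    fi
}

find_in_order "NNAAAACCGGTTTTGGNNNCCCCAA" "AAAACC" "TTTTGG" "CCCCAA"   # prints found
find_in_order "CCCCAANNTTTTGGNNAAAACC" "AAAACC" "TTTTGG" "CCCCAA"      # prints not found
```

Because nothing in the pattern depends on absolute offsets, a read that starts a base early or late still matches, while a read with the patterns in the wrong order does not.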
Hi Paul,
Thank you for your reply. Now it's clear. I am trying to demultiplex our SPLiT-seq data (27 GB), where the sequencing did not work very well: the Read 2 reads are 84 nt long (rather than the 94 nt defined by the SPLiT-seq protocol). So we thought that Barcode1 could not be present in Read 2 because the reads are too short.
When I demultiplexed this data with splitseqdemultiplex_0.1.1, it produced a 23 MB MergedCells_1.fastq file, so I was wondering whether the reads in MergedCells_1.fastq were correct.
Now I understand that your software does not look for barcodes at specific positions in Read 2, which explains my result.
Thank you again for this great work.
Keep it up.
with best wishes
Dipankar
Hi Paul,
One more thing. Even when I run splitseqdemultiplex_0.1.1 with a different Round 1 barcode file (where the barcodes differ from the default Round 1 barcodes you provided) using the option -1 ./testbc1.txt (where ./testbc1.txt is my test barcode file), it always uses the default Round 1 barcode file, Round1_barcodes_new5.txt, and produces the same results.
I even used an empty Round 1 barcode file for testing, but splitseqdemultiplex_0.1.1 still used the default Round 1 barcodes and produced the same results.
So, do you think it's a bug, or do I need to adjust the parameters?
Thank you again in advance
with best wishes
Duma
Hi Duma,
Thanks again. Yes, this was a bug; it should be fixed now. Let me know if things work as you expect.
Thanks,
- Paul
Hi Paul,
Thank you very much. It works now with a different barcode file.
There are a few more things I need to ask you.
- If I use the STAR aligner with the -a option, must I also provide a genome annotation file in GTF format with the -y option, or is this optional? Your example command line had no -y parameter.
- Does one need to use another annotation file in SAF format with the -s option? Will it not work if I use an annotation file in GTF format? If not, do you know of any tools that can convert GTF to SAF format?
Thank you again for your patience in replying to me and for making the changes to your code.
with best wishes
Duma
Hi Duma,
I just updated the syntax of the -y and -s arguments and confirmed that both work. Please pull the latest version of the software and use the following instructions:
- If you use the -a star option, then you must also provide either a -y (.gtf format) or -s (.saf format) annotation file.
- How to choose an annotation file: if you want to assign and count reads that align only to exonic positions, use -y (the GTF option). If you want to assign and count reads that align to exonic and intronic regions, use -s (the SAF option).
- Syntax for the -y and -s options: you must provide "GTF" or "SAF" followed by the path to the annotation file, with a space between the two entries, as in the following examples.
-y GTF /path/to/my/.gtf
-s SAF /path/to/my/.saf
NOTES: The only way to assign intronic reads is to use a .saf file. A .saf file can be created using BioMart: http://useast.ensembl.org/biomart/martview/5a9818755a5c89e3c5865cc947b3d44b
The format is as follows, with NO header:
GeneName Chromosome GeneStart GeneEnd Strand
I hope this helps, let me know if you have further questions or issues.
Thanks,
- Paul
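As an alternative to BioMart, a file in the five-column layout described above can be derived from a GTF file. The following is a minimal sketch, not part of the tool: the function name gtf_to_saf is hypothetical, and it assumes a standard nine-column GTF with "gene" feature rows, a gene_name (or at least gene_id) attribute, and tab-separated SAF output.

```shell
# Hypothetical helper, not part of splitseqdemultiplex_0.1.1.sh.
# Assumes: tab-separated GTF input with "gene" rows, and tab-separated,
# header-less output in the order GeneName Chromosome GeneStart GeneEnd Strand.
gtf_to_saf() {
    awk -F '\t' -v OFS='\t' '
        $0 !~ /^#/ && $3 == "gene" {
            name = ""
            # Prefer gene_name; fall back to gene_id when it is absent.
            if (match($9, /gene_name "[^"]+"/)) {
                name = substr($9, RSTART + 11, RLENGTH - 12)
            } else if (match($9, /gene_id "[^"]+"/)) {
                name = substr($9, RSTART + 9, RLENGTH - 10)
            }
            if (name != "") print name, $1, $4, $5, $7
        }
    ' "$1"
}

# Usage (paths are placeholders):
#   gtf_to_saf genes.gtf > genes.saf
```

Using whole-gene intervals rather than exon rows is what lets reads in introns fall inside an annotated region.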
Hi Paul,
Thank you again. I will test the alignment part of **splitseqdemultiplex_0.1.1** today and let you know if I find any more bugs or other issues.
Thank you,
with best wishes,
Duma
Hi Paul,
Thank you very much. I tested splitseqdemultiplex_0.1.1 and was finally able to get it to work.
There is a small error when I do not use the -s SAF option:
splitseqdemultiplex_0.1.1.sh: line 278: [: =: unary operator expected
Also, on the command line I had to put quotes (" ") around the value after -s or -y. Otherwise the script cannot find the .saf or .gtf file. This may be specific to Ubuntu (I am using Ubuntu 18.04).
So, in the end I used a command like this for my GTF files:
bash splitseqdemultiplex_0.1.1.sh -n 4 -e 1 -m 10 -1 Round1_barcodes_new5.txt -2 Round2_barcodes_new4.txt -3 Round3_barcodes_new4.txt -f SRR6750041_1_smalltest.fastq -r SRR6750041_2_smalltest.fastq -o results -t 20000 -g 100000 -c true -a star -x /full/file/path_of_starindex/star -y "GTF /full/file/path_of_annotation_gtf_file/genes.gtf"
Thank you again.
with best wishes
Duma
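The need for quotes is consistent with ordinary shell word splitting rather than anything specific to one distribution: without quotes, the format keyword and the path reach the script as two separate arguments. A minimal sketch, assuming the script expects the whole "FORMAT path" pair as a single value (the helper names count_args and split_annotation_arg are hypothetical and not taken from the script):

```shell
# Hypothetical helpers for illustration; not taken from splitseqdemultiplex_0.1.1.sh.

# Print how many arguments the shell actually passed.
count_args() {
    echo "$#"
}

# Split a single "FORMAT path" value at the first space.
split_annotation_arg() {
    ann_fmt=${1%% *}     # text before the first space, e.g. GTF
    ann_path=${1#* }     # text after the first space, e.g. /path/to/genes.gtf
    printf '%s|%s\n' "$ann_fmt" "$ann_path"
}

count_args -y GTF /path/to/genes.gtf            # prints 3: the value is split in two
count_args -y "GTF /path/to/genes.gtf"          # prints 2: the value stays together
split_annotation_arg "GTF /path/to/genes.gtf"   # prints GTF|/path/to/genes.gtf
```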
Hi Duma,
Glad you have it working. Thanks for pointing out the unary operator error: I forgot to quote a variable in an if statement. I fixed that and pushed another update, along with updates to the README indicating that the values for the -y and -s options should be quoted.
Thanks for your help in tracking down these bugs. I appreciate it.
- Paul
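For readers who hit the same message elsewhere: "unary operator expected" is what the [ command reports when an unquoted, empty variable vanishes from the test expression. A minimal sketch of the quoted form follows; the variable and function names are hypothetical, since line 278 of the script is not shown in this thread.

```shell
# Hypothetical names; line 278 of the real script is not shown in the thread.
annotation_is_saf() {
    annotation_format=$1   # may be empty, e.g. when -s is not given
    # Unquoted, an empty value would expand to `[ = "SAF" ]`, and bash reports
    #   [: =: unary operator expected
    # Quoted, it expands to `[ "" = "SAF" ]`, a valid comparison that is false.
    if [ "$annotation_format" = "SAF" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

annotation_is_saf ""      # prints no
annotation_is_saf "SAF"   # prints yes
```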