Hi, Thank you for this good work. I am testing your last version

Questions about UMI and Barcode Position (split-seq_demultiplexing, CLOSED)

paulranum11 commented on June 12, 2024

Questions about UMI and Barcode Position

from split-seq_demultiplexing.

Comments (11)

paulranum11 commented on June 12, 2024

Hi Duma,

Thanks for spotting this. It was due to a bug I introduced in the last update, where I revised the UMI extraction. I have fixed it and pushed an update to the repository.

Thanks,

Paul

dumaatravaie commented on June 12, 2024

Hi Paul,
Thank you for your reply. Just one last question.

Does SPLiT-Seq_demultiplexing 0.1.1 search for the barcodes at specific positions in the Read 2 reads?

According to the SPLiT-seq protocol, Barcode3 should start at position 11 nt, Barcode2 at position 48 nt, and Barcode1 at position 86 nt in Read 2.

Thank you again in advance for your reply.
with best wishes
Duma

paulranum11 commented on June 12, 2024

Hi Duma,

Yes, SPLiT-Seq_demultiplexing searches for the barcodes in read 2. However, rather than using fixed barcode positions (11 nt, 48 nt, 86 nt, etc.), SPLiT-Seq_demultiplexing searches for barcodes based on their sequence plus a static flanking sequence. See the Round1_barcodes_new5.txt, Round2_barcodes_new4.txt, and Round3_barcodes_new4.txt barcode files for examples. Additionally, the barcodes must be found in the correct order. With this approach, demultiplexing remains accurate even when a read starts at a slightly different position or contains insertions or deletions at non-barcode positions.
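
The idea of order-based (rather than position-based) matching can be illustrated with a minimal sketch. The function name `find_in_order` and the toy sequences are hypothetical and are not taken from splitseqdemultiplex_0.1.1.sh or its barcode files; each pattern stands in for "barcode + static flanking sequence".

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not the tool's actual code.
# Usage: find_in_order READ PATTERN1 PATTERN2 ...
# Prints the 0-based offset of each pattern if all patterns occur in the
# given order; returns 1 if a pattern is missing or out of order.
find_in_order() {
    local rest="$1" consumed=0 offsets="" pat prefix
    shift
    for pat in "$@"; do
        case $rest in
            *"$pat"*) ;;            # pattern present in the unsearched part
            *) return 1 ;;          # missing, or only present out of order
        esac
        prefix=${rest%%"$pat"*}     # text before the first match
        offsets="${offsets}${offsets:+ }$(( consumed + ${#prefix} ))"
        consumed=$(( consumed + ${#prefix} + ${#pat} ))
        rest=${rest#"$prefix$pat"}  # continue searching after this match
    done
    echo "$offsets"
}

# Toy read with two toy patterns.
find_in_order TTAAAAGGCCCCGGTTTT AAAAGG CCCCGG    # prints: 2 8
```

If the read gains one extra base at its start, the same call still succeeds and simply reports offsets shifted by one, whereas a fixed-position lookup would miss both patterns.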

dumaatravaie commented on June 12, 2024

Hi Paul,
Thank you for your reply. Now it's clear. I am trying to demultiplex our SPLiT-seq data (27 GB), where the sequencing did not work very well: the Read 2 reads are 84 nt long (rather than the 94 nt defined by the SPLiT-seq protocol). So we thought that there can't be a Barcode1 in Read 2, as the reads are too short.
When I tried to demultiplex this data with splitseqdemultiplex_0.1.1, it produced a 23 MB MergedCells_1.fastq file. So I was wondering whether the reads in MergedCells_1.fastq were correct or not.

Now I understand that your software does not look for barcodes at specific positions in Read 2, which makes sense of my result.

Thank you again for this great work.
Keep it up.
with best wishes
Dipankar

dumaatravaie commented on June 12, 2024

Hi Paul,
One more thing. Even when I run splitseqdemultiplex_0.1.1 with a different Round 1 barcode file (whose barcodes differ from the default Round 1 barcodes you provide), using the option -1 ./testbc1.txt (where ./testbc1.txt is my test barcode file), it always uses the default Round 1 barcode file Round1_barcodes_new5.txt and produces the same results!

I even used an empty Round 1 barcode file for testing, but splitseqdemultiplex_0.1.1 still uses the default Round 1 barcodes and produces the same results!

So, do you think it's a bug? Or do I need to adjust the parameters?

Thank you again in advance
with best wishes
Duma

paulranum11 commented on June 12, 2024

Hi Duma,

Thanks again. Yes, this was a bug; it should be fixed now. Let me know if things work as you expect.

Thanks,

Paul

dumaatravaie commented on June 12, 2024

Hi Paul,
Thank you very much. It works now with a different barcode file.

There are a few more things I need to ask you.

If I use the STAR aligner with the option -a, must I also provide a genome annotation file in GTF format with the option -y, or is this optional? Your example command line had no -y parameter.
Does one need to use another annotation file in SAF format with the option -s? Will it not work if I use an annotation file in GTF format? If not, do you know of any tools that can convert GTF to SAF format?

Thank you again for your patience in replying to me and making the changes in your code.
with best wishes
Duma

paulranum11 commented on June 12, 2024

Hi Duma,

I just updated the syntax of the -y and -s arguments and confirmed that both work. Please pull the latest version of the software and use the following instructions:

If you use the -a star option then you must also provide either a -y (.gtf format) or -s (.saf format) annotation file.
How to choose an annotation file: If you want to assign and count reads that align to only exonic positions use -y (the GTF option). If you want to assign and count reads that align to exonic and intronic regions use -s (the SAF option).
Syntax for the -y and -s commands: You must provide "GTF" or "SAF" with the path to the annotation file. There must be a space in between the two entries as in the following example.

-y GTF /path/to/my/.gtf
-s SAF /path/to/my/.saf

NOTES: The only way to assign intronic reads is to use a .saf file. A .saf file can be created using BioMart (http://useast.ensembl.org/biomart/martview/5a9818755a5c89e3c5865cc947b3d44b). The format is as follows, with NO header:
GeneName Chromosome GeneStart GeneEnd Strand
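
As an alternative to BioMart, the five-column layout above can also be derived from a GTF file. The following is a minimal sketch, not part of the tool: the function name `gtf_to_saf` is hypothetical, it assumes an Ensembl/GENCODE-style GTF that contains "gene" feature rows with a `gene_name` or `gene_id` attribute, and it assumes tab-separated output is acceptable. Because each row spans the whole gene, it covers both exonic and intronic regions.

```shell
#!/usr/bin/env bash
# Hypothetical converter for illustration only.
# Writes: GeneName <tab> Chromosome <tab> GeneStart <tab> GeneEnd <tab> Strand
# with no header line, one row per GTF "gene" feature.
gtf_to_saf() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
    $0 !~ /^#/ && $3 == "gene" {
        name = ""
        # Prefer gene_name; fall back to gene_id when gene_name is absent.
        if (match($9, /gene_name "[^"]+"/))
            name = substr($9, RSTART + 11, RLENGTH - 12)
        else if (match($9, /gene_id "[^"]+"/))
            name = substr($9, RSTART + 9, RLENGTH - 10)
        if (name != "") print name, $1, $4, $5, $7
    }' "$1"
}
```

A typical call would be `gtf_to_saf genes.gtf > genes.saf`, where both file names are placeholders.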

I hope this helps, let me know if you have further questions or issues.
Thanks,

Paul

dumaatravaie commented on June 12, 2024

Hi Paul,

Thank you again. I will test the alignment part of splitseqdemultiplex_0.1.1 today, and I will let you know if I find more bugs or other issues.

Thank you,
with best wishes,
Duma

dumaatravaie commented on June 12, 2024

Hi Paul,
Thank you very much. I tested your splitseqdemultiplex_0.1.1, and I was finally able to get it to work.

There is a small error when I do not use the -s SAF option:

splitseqdemultiplex_0.1.1.sh: line 278: [: =: unary operator expected

Also, on the command line I had to put quotes " " around the value after -s or -y. Otherwise the script is unable to see the .saf or .gtf file. Maybe this error is specific to Ubuntu (I am using Ubuntu 18.04).

So, finally I used a command like this for my GTF files:

bash splitseqdemultiplex_0.1.1.sh -n 4 -e 1 -m 10 -1 Round1_barcodes_new5.txt -2 Round2_barcodes_new4.txt -3 Round3_barcodes_new4.txt -f SRR6750041_1_smalltest.fastq -r SRR6750041_2_smalltest.fastq -o results -t 20000 -g 100000 -c true -a star -x /full/file/path_of_starindex/star -y "GTF /full/file/path_of_annotation_gtf_file/genes.gtf"
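
A likely explanation for the quoting requirement, independent of Ubuntu, is ordinary shell word splitting: without quotes, `GTF` and the path reach the script as two separate arguments, and an option that takes one value receives only `GTF`. The sketch below assumes getopts-style parsing; `parse_annotation` is a hypothetical function and not the script's actual parser.

```shell
#!/usr/bin/env bash
# Hypothetical option parser for illustration only.
# Splits the value of -y or -s into a format word and a path.
parse_annotation() {
    OPTIND=1                              # reset getopts state for each call
    local opt format="" path=""
    while getopts "y:s:" opt; do
        case $opt in
            y|s)
                format=${OPTARG%% *}      # first word, e.g. GTF or SAF
                path=${OPTARG#"$format"}  # remainder after the first word
                path=${path# }            # drop the separating space
                ;;
            *) return 2 ;;
        esac
    done
    echo "format=${format} path=${path}"
}

parse_annotation -y "GTF /data/genes.gtf"   # quoted: one argument, path is kept
parse_annotation -y GTF /data/genes.gtf     # unquoted: -y only receives "GTF"
```

In the unquoted call the path is left over as a separate positional argument, so the parsed path is empty.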

Thank you again.
with best wishes
Duma

paulranum11 commented on June 12, 2024

Hi Duma,

Glad you have it working. Thanks for pointing out the unary operator error. I forgot to quote a variable in an if statement. I fixed that and pushed another update, along with updates to the README indicating that the -y and -s options should be quoted.
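
For readers who hit the same message in their own scripts, the class of bug can be reproduced in a few lines. This is a minimal sketch; the variable name `saf_option` is hypothetical, since line 278 of the real script is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction for illustration only.
saf_option=""    # simulates a run where -s was not given

# Unquoted: the empty variable disappears, so the test becomes `[ = SAF ]`
# and bash reports "[: =: unary operator expected" (exit status 2).
check_unquoted() { [ $saf_option = "SAF" ]; }

# Quoted: the test becomes `[ "" = SAF ]`, which is simply false (status 1).
check_quoted() { [ "$saf_option" = "SAF" ]; }

check_unquoted || echo "unquoted test failed with status $?"
check_quoted   || echo "quoted test is false (status $?), no error"
```

Quoting every variable inside `[ ... ]` avoids the error whether or not the option was supplied.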

Thanks for your help in tracking down these bugs. I appreciate it.

Paul
