Giter Site home page Giter Site logo

Comments (11)

marcelm avatar marcelm commented on August 29, 2024

According to the log, the error appears to actually be in the igdiscover_igblast rule. If the below doesn’t fix the issue, can you show the contents of the file /lustre/home/acct-medsn/medsn/IMQ1_BCR/.snakemake/log/2023-07-06T155201.795624.snakemake.log mentioned in the very last line of the log?

Regading barcodes/UMIs: What IgDiscover calls "barcode" is actually the UMI, so your thinking is correct. However, IgDiscover expects the UMI to be at the 5' end of the read, so if you set barcode_length_5prime to 10, the first 10 bases of the 10X barcode would be interpreted as the UMI. I think this would mean that all reads from one GEM would be merged into a single one by IgDiscover. This might even explain the above igdiscover_igblast error.

My suggestion is to try setting barcode_length_5prime to 26 to make IgDiscover treat the combined 10X barcode plus UMI as UMI. (Disclaimer: I have not used IgDiscover myself with 10X data.)

from igdiscover-legacy.

Jyoung-0108 avatar Jyoung-0108 commented on August 29, 2024

According to the log, the error appears to actually be in the igdiscover_igblast rule. If the below doesn’t fix the issue, can you show the contents of the file /lustre/home/acct-medsn/medsn/IMQ1_BCR/.snakemake/log/2023-07-06T155201.795624.snakemake.log mentioned in the very last line of the log?

Regading barcodes/UMIs: What IgDiscover calls "barcode" is actually the UMI, so your thinking is correct. However, IgDiscover expects the UMI to be at the 5' end of the read, so if you set barcode_length_5prime to 10, the first 10 bases of the 10X barcode would be interpreted as the UMI. I think this would mean that all reads from one GEM would be merged into a single one by IgDiscover. This might even explain the above igdiscover_igblast error.

My suggestion is to try setting barcode_length_5prime to 26 to make IgDiscover treat the combined 10X barcode plus UMI as UMI. (Disclaimer: I have not used IgDiscover myself with 10X data.)

Thanks for your reply. And I've tried to corrected the barcode_length_5prime to 26 but it didn't fix my issue. I still got the problem. and i've attached the log file here.
2023-07-07T095124.994995.snakemake.log

from igdiscover-legacy.

Jyoung-0108 avatar Jyoung-0108 commented on August 29, 2024

hello,i've updated my issue and i'm waiting for your reply. Thanks!

from igdiscover-legacy.

marcelm avatar marcelm commented on August 29, 2024

Can you please run igdiscover igblastwrap --sequence-type=Ig --threads=16 iteration-01/database reads/sequences.fasta.gz > /dev/null and paste the output here? There should be an error message.

from igdiscover-legacy.

Jyoung-0108 avatar Jyoung-0108 commented on August 29, 2024

Thanks! I ran the commands and got the following output.

INFO: Running IgBLAST on input reads
Traceback (most recent call last):
File "/lustre/home/acct-medsn/medsn/.conda/envs/igdiscover/bin/igdiscover", line 10, in
sys.exit(main())
File "/lustre/home/acct-medsn/medsn/.conda/envs/igdiscover/lib/python3.10/site-packages/igdiscover/main.py", line 92, in main
to_run()
File "/lustre/home/acct-medsn/medsn/.conda/envs/igdiscover/lib/python3.10/site-packages/igdiscover/main.py", line 90, in
to_run = lambda: module.main(args)
File "/lustre/home/acct-medsn/medsn/.conda/envs/igdiscover/lib/python3.10/site-packages/igdiscover/cli/igblastwrap.py", line 122, in main
"Processed {:10,d} sequences at {:.1f} ms/sequence".format(n, elapsed / n * 1e3)
ZeroDivisionError: float division by zero

from igdiscover-legacy.

marcelm avatar marcelm commented on August 29, 2024

Thanks, that is helpful. It appears that no reads remain after preprocessing. That is, IgBLAST works on zero input sequences. There are a couple of preprocessing steps and it is unclear to me at the moment which one filters out all reads. What are the contents of reads/2-pear.log?

from igdiscover-legacy.

Jyoung-0108 avatar Jyoung-0108 commented on August 29, 2024

I uploaded the log file here.
2-pear.log

from igdiscover-legacy.

marcelm avatar marcelm commented on August 29, 2024

Ok, that step went fine. What are the contents of the files stats/trimmed.json and stats/groups.json?

from igdiscover-legacy.

marcelm avatar marcelm commented on August 29, 2024

... and of reads/4-sequences.fasta.gz.log?

from igdiscover-legacy.

Jyoung-0108 avatar Jyoung-0108 commented on August 29, 2024

it seemed that all sequences were trimmed.
groups.json:
{
"unique_barcodes": 0,
"barcode_singletons": 0,
"groups_written": 0,
"group_size_1": 0,
"group_size_2": 0,
"group_size_3plus": 0
}

and trimmed.json:
{"trimmed": 21081671}

4-sequences.fasta.gz.log

from igdiscover-legacy.

marcelm avatar marcelm commented on August 29, 2024

The large number in trimmed.json is actually the number of sequences remaining after trimming, so it is good that it is high. ("Trimming" actually refers to primer removal, but since you haven’t configured any primers in your configuration, no trimming is done anyway.)

The problem appears to be in the "grouping" step where IgDiscover mainly groups together sequences that have the same UMI, but it also does some minor filtering. In particular, it filters out sequences that are too short, and it seems that in your case, it considers all sequences to be too short:

INFO: 21081671 sequences in input
INFO: 0 sequences long enough

Your configuration file has this:

minimum_merged_read_length: 300

So it appears that all your sequences end up as being shorter than 300 bp.

What is actually the read length of your 10X reads? Maybe the data isn’t suitable for IgDiscover.

from igdiscover-legacy.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.