Comments (24)

zhangrengang avatar zhangrengang commented on June 23, 2024

No. The homology should be easy to determine by aligning your chromosomes to the Fragaria vesca genome with a tool such as minimap2. You should get a figure like the one below:

fan-fve

So the sg.config configuration can be set like:

1-1 1-2 1-3 1-4
2-1 2-2 2-3 2-4
...
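
As a rough illustration of how such a config could be derived from an alignment, here is a minimal sketch that reads minimap2 PAF records and groups each query chromosome under the reference chromosome it matches best. The function name `config_rows_from_paf`, the `min_mapq` parameter, and the toy records are assumptions made for this example; they are not part of SubPhaser or minimap2, and real data may need manual review against a dot plot.

```python
from collections import defaultdict


def config_rows_from_paf(paf_lines, min_mapq=0):
    """Group query chromosomes by the reference chromosome they align to best.

    Hypothetical helper: each returned string is one candidate sg.config row.
    """
    # aligned[query][target] = total residue matches (PAF column 10)
    aligned = defaultdict(lambda: defaultdict(int))
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip lines without the 12 mandatory PAF columns
        query, target = fields[0], fields[5]
        matches, mapq = int(fields[9]), int(fields[11])
        if mapq >= min_mapq:
            aligned[query][target] += matches

    # Assign each query chromosome to its best-supported reference chromosome.
    groups = defaultdict(list)
    for query, per_target in aligned.items():
        best_target = max(per_target, key=per_target.get)
        groups[best_target].append(query)

    # One row per reference chromosome, homoeologs separated by spaces.
    return [" ".join(sorted(groups[target])) for target in sorted(groups)]


if __name__ == "__main__":
    # Toy PAF records with made-up names and numbers, not real alignments.
    demo = [
        "1-1\t100\t0\t90\t+\tFvb1\t120\t0\t90\t85\t90\t60",
        "1-2\t100\t0\t80\t+\tFvb1\t120\t0\t80\t70\t80\t60",
        "2-1\t100\t0\t95\t-\tFvb2\t110\t0\t95\t90\t95\t60",
        "2-2\t100\t0\t95\t+\tFvb2\t110\t0\t95\t88\t95\t60",
    ]
    for row in config_rows_from_paf(demo):
        print(row)
```

Summing matched bases per reference chromosome is a deliberately simple choice; it can mis-assign chromosomes that carry large translocations, so the rows are best treated as a starting point.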

from subphaser.

zhangrengang avatar zhangrengang commented on June 23, 2024

Alternatively, the Hi-C contact heatmap can also show diagonal signals between homoeologous chromosomes, like:
hic-heatmap
So the sg.config configuration can be set like:

chr01a chr01b chr01c chr01d
chr02a chr02b chr02c chr02d
...

Note that the Hi-C heatmap should not be filtered by the mapping quality of reads.


hushaoqiang avatar hushaoqiang commented on June 23, 2024

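Thank you for your answer. I will use the approach you recommended for the homology analysis. I also have another question, which I originally wrote in Chinese so that I could describe it in more detail.
I wanted to do a test run with the Camarosa strawberry genome, using the published genome sequence. The config file contains: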
Fvb1-1 Fvb1-2 Fvb1-3 Fvb1-4
Fvb2-1 Fvb2-2 Fvb2-3 Fvb2-4
Fvb3-1 Fvb3-2 Fvb3-3 Fvb3-4
Fvb4-1 Fvb4-2 Fvb4-3 Fvb4-4
Fvb5-1 Fvb5-2 Fvb5-3 Fvb5-4
Fvb6-1 Fvb6-2 Fvb6-3 Fvb6-4
Fvb7-1 Fvb7-2 Fvb7-3 Fvb7-4

The command was: subphaser -i F_ana_Camarosa_6-28-17_hardmasked.fasta -c sg.config
But it produced the following:
22-04-28 18:21:39 [INFO] Consistent with subgenome assignment: 16 (6.93%); potential exchange: 149 (64.50%)
22-04-28 18:21:39 [INFO] Output: /home/hsq/hhh/subphaser/kmls/phase-results/k15_q200_f2.bin.enrich
22-04-28 18:21:39 [INFO] ###Step: LTR
22-04-28 18:21:39 [INFO] Identifying LTR-RTs by ['ltr_harvest']
22-04-28 18:21:45 [INFO] Check point file: /home/hsq/hhh/subphaser/kmls/tmp/LTR.scn.ok exists; skip this step
22-04-28 18:21:45 [INFO] 8186 LTRs identified
22-04-28 18:21:45 [INFO] Check point file: /home/hsq/hhh/subphaser/kmls/tmp/LTR.tesort.ok exists; skip this step
22-04-28 18:21:45 [INFO] By TEsorter, 190 (2.3%) are classified as LTRs, of which 0 (0.0%) are intact with complete protein domains
22-04-28 18:21:45 [INFO] After filtering, 186 / 8186 (2.3%) LTRs retained
22-04-28 18:21:45 [INFO] Outputing coordinate - LTR maps to /home/hsq/hhh/subphaser/kmls/phase-results/k15_q200_f2.ltr.bin.count
22-04-28 18:21:45 [INFO] Start Pool with 32 process(es)
22-04-28 18:21:45 [INFO] Processed 0 sequences
Traceback (most recent call last):
  File "/home/hsq/miniconda3/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/home/hsq/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/main.py", line 779, in main
    pipeline.run()
  File "/home/hsq/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/main.py", line 516, in run
    ltr_bedlines, enrich_ltr_bedlines = self.step_ltr(d_kmers) if not self.disable_ltr else ([],[])
  File "/home/hsq/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/main.py", line 563, in step_ltr
    Seqs.map_kmer3([ltrfile], d_kmers, fout=fout, k=self.k, ncpu=self.ncpu,
  File "/home/hsq/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/Seqs.py", line 97, in map_kmer3
    logger.info('{} ({:.2%}) sequences contain subgenome-specific kmers'.format(mapped_seqs, mapped_seqs/i))
ZeroDivisionError: division by zero

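k15_q200_f2.kmer_pca.pdf was generated successfully, but the PCA plot differs from the one in your supplementary material.
Do I need to adjust the parameters, or is something wrong with my other settings?
I look forward to your reply.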


zhangrengang avatar zhangrengang commented on June 23, 2024

Do not use hard-masked genome sequences: SubPhaser relies on repetitive sequences (TEs in most cases), and hard-masking replaces them with N.
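
A quick way to check whether a FASTA file is hard-masked before running the pipeline is to measure how much of it is N. The sketch below is only an illustration; `masking_summary` and `open_text` are hypothetical helper names, and the point at which an N fraction counts as "hard-masked" is left to the reader.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed FASTA file as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def masking_summary(fasta_lines):
    """Return (total_bases, n_fraction, lowercase_fraction) for FASTA lines.

    A high N fraction suggests hard-masking; a high lowercase fraction
    suggests soft-masking (lowercase 'n' is counted in both fractions).
    """
    total = n_count = lower_count = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and sequence headers
        total += len(line)
        n_count += line.count("N") + line.count("n")
        lower_count += sum(1 for base in line if base.islower())
    if total == 0:
        return 0, 0.0, 0.0
    return total, n_count / total, lower_count / total


if __name__ == "__main__":
    # Toy sequence lines; for a real file use: masking_summary(open_text(path))
    demo = [">chr1", "ACGTNNNNacgt", ">chr2", "NNNN"]
    total, n_frac, lower_frac = masking_summary(demo)
    print(f"{total} bases, {n_frac:.1%} N, {lower_frac:.1%} lowercase")
```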


hushaoqiang avatar hushaoqiang commented on June 23, 2024

Which file did you use when you analyzed the Camarosa strawberry (Fig. S27 Subgenome phasing with SubPhaser in the Fragaria x ananassa genome.)?


zhangrengang avatar zhangrengang commented on June 23, 2024

F_ana_Camarosa_6-28-17.fasta.gz. You may not get identical results, but they should be similar.


hushaoqiang avatar hushaoqiang commented on June 23, 2024

I just checked the GDR website again and only found two files: F_ana_Camarosa_6-28-17_hardmasked.fasta.gz and F_ana_Camarosa_6-28-17_unmasked.fasta.gz. Where can I download the F_ana_Camarosa_6-28-17.fasta.gz file you mentioned? Sorry for the trouble.


zhangrengang avatar zhangrengang commented on June 23, 2024

F_ana_Camarosa_6-28-17_unmasked.fasta.gz


hushaoqiang avatar hushaoqiang commented on June 23, 2024

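It ran successfully, and the four subgenomes were phased. The figures look great. Thank you for your patient guidance and for developing SubPhaser for the research community.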


hushaoqiang avatar hushaoqiang commented on June 23, 2024

Teacher, I have successfully found the homologous chromosomes using MUMmer, and then phased the subgenomes using SubPhaser. The results are very good.
Thank you!


zhangrengang avatar zhangrengang commented on June 23, 2024

Great. Thanks for your feedback.


kashiff007 avatar kashiff007 commented on June 23, 2024

Hi @zhangrengang, thanks for this excellent software. I have been using it on my newly assembled allopolyploid genome (2n = 4x = 40). The genome has two subgenomes, but I don't know the homologous pairs. So, as you suggested in previous comments, I used MUMmer and minimap2 to find the homologous pairs and listed them in the configuration file.

But after running SubPhaser with default parameters, I got 19 chromosomes in SG1 and only one in SG2. Ideally, there should be 10 chromosomes in each subgenome.
image
The differential k-mer heatmap shows a similar result:
image

Can you suggest why this happened? Is my approach correct?


zhangrengang avatar zhangrengang commented on June 23, 2024

@kashiff007 It seems that your genome is contig-level rather than chromosome-level?


kashiff007 avatar kashiff007 commented on June 23, 2024

Yes, it is a newly assembled genome, and based on cytometry we suspect that the largest 20 contigs account for the full genome length.


zhangrengang avatar zhangrengang commented on June 23, 2024

@kashiff007 Can you show the homoeologous relationships (for example, a dot plot) and your config file? What percentage of the whole genome do the largest 20 contigs account for?


kashiff007 avatar kashiff007 commented on June 23, 2024

image

Here the y-axis is my new genome; aligning it against a closely related species, Oropetium (x-axis), gives the plot above. You can see that each x-axis sequence has two aligned positions on the y-axis, which indicates that these are homologous.

Based on this I made a config file which looks like:
ptg000016l ptg000003l
ptg000013l ptg000004l
ptg000022l ptg000015l
ptg000011l ptg000007l
ptg000012l ptg000006l
ptg000019l ptg000008l
ptg000017l ptg000018l
ptg000002l ptg000010l
ptg000001l ptg000005l
ptg000020l ptg000014l
Each row has two contigs (assuming homologous chromosomes) separated by space.

The total genome size is 587 Mb (587146212 bp), and the top 20 contigs add up to about 574 Mb (573812465 bp), or about 97.7% of the total genome.
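
For reference, the share of the assembly held by the largest contigs can be computed directly from sequence lengths, for example from a samtools faidx index. This is a small sketch; `lengths_from_fai` and `top_contig_fraction` are hypothetical names chosen for this example, and the two totals in the demo are the figures quoted in this comment.

```python
def lengths_from_fai(fai_lines):
    """Read sequence lengths from .fai index lines (column 2 is the length)."""
    return [int(line.split("\t")[1]) for line in fai_lines if line.strip()]


def top_contig_fraction(lengths, top_n=20):
    """Fraction of the total assembly length held by the top_n longest sequences."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no sequence lengths given")
    return sum(ordered[:top_n]) / total


if __name__ == "__main__":
    # Totals quoted in the thread: top 20 contigs versus the whole assembly.
    top20_bp, genome_bp = 573812465, 587146212
    print(f"{top20_bp / genome_bp:.2%}")
```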


zhangrengang avatar zhangrengang commented on June 23, 2024

@kashiff007 That looks fine. Do you have evidence that the genome is an allopolyploid?


kashiff007 avatar kashiff007 commented on June 23, 2024

I don't have evidence for allopolyploidy. My primary task was to separate the subgenomes, so I followed this thread. The circos plot looks like this:
circos


kashiff007 avatar kashiff007 commented on June 23, 2024

Is it something to do with the clustering?


zhangrengang avatar zhangrengang commented on June 23, 2024

@kashiff007 If there are not many assembly errors (switch errors, etc.), your genome is likely to be an autopolyploid or heavily recombined, in which case it cannot be phased by SubPhaser. You can try changing the parameters, such as setting -q 100, or reducing the number of chromosome sets, like:

ptg000016l ptg000003l
ptg000013l ptg000004l
ptg000022l ptg000015l
ptg000011l ptg000007l
ptg000012l 
ptg000006l
ptg000019l 
...
ptg000020l 
ptg000014l

But these may not work either.
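
The reduced layout above can be produced from the original paired config with a few lines of Python. This is a minimal sketch assuming the layout shown in this comment (the first few rows stay as pairs, the remaining chromosomes each get their own row); `reduce_chromosome_sets` and its `keep` argument are hypothetical names, not SubPhaser options.

```python
def reduce_chromosome_sets(rows, keep):
    """Keep the first `keep` rows as sets; put every other chromosome on its own row."""
    reduced = []
    for index, row in enumerate(rows):
        names = row.split()
        if index < keep:
            reduced.append(" ".join(names))  # still a homoeologous set
        else:
            reduced.extend(names)  # one chromosome per row
    return reduced


if __name__ == "__main__":
    # First five rows of the config posted earlier in the thread.
    paired = [
        "ptg000016l ptg000003l",
        "ptg000013l ptg000004l",
        "ptg000022l ptg000015l",
        "ptg000011l ptg000007l",
        "ptg000012l ptg000006l",
    ]
    print("\n".join(reduce_chromosome_sets(paired, keep=4)))
```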


kashiff007 avatar kashiff007 commented on June 23, 2024

OK, I appreciate the explanation.


kedduck avatar kedduck commented on June 23, 2024

No. It may be very easy to know the homology by aligning your chromosomes to the genome of Fragaria vesca using tools such as minimap2 etc. You may be able to get a figure like below:

fan-fve

So the sg.config configuration can be set like:

1-1 1-2 1-3 1-4
2-1 2-2 2-3 2-4
...

Hi, @zhangrengang

I'm sorry to disturb you. The plot looks great. Was it made with "dotPlotly"?

Thanks!


zhangrengang avatar zhangrengang commented on June 23, 2024

@kedduck No, it is from the paper: Edger P P, Poorten T J, VanBuren R et al. Origin and evolution of the octoploid strawberry genome [J]. Nature Genet., 2019, 51 (3): 541–547 [http://doi.org/10.1038/s41588-019-0356-4]


kedduck avatar kedduck commented on June 23, 2024

Thanks for your quick reply!

