
zhangrengang / subphaser


Phase, partition and visualize subgenomes of a neoallopolyploid or hybrid based on the subgenome-specific repetitive kmers.

Home Page: https://doi.org/10.1111/nph.18173

License: GNU General Public License v3.0

Python 99.03% Perl 0.17% Shell 0.80%
allopolyploid subgenome kmer partition phasing exchange

subphaser's Issues

matplotlib raise RuntimeError ('Invalid DISPLAY variable')

Hi, when plotting the kmer_freq figure, SubPhaser reported the following error:

"23-09-13 23:23:33 [INFO] Plot k15_q200_f2.kmer_freq.pdf
Traceback (most recent call last):
  File "~/.conda/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.6', 'console_scripts', 'subphaser')())
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 790, in main
    pipeline.run()
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 415, in run
    d_mat = dumps.filter(d_mat, lengths, self.sgs, outfig=histfig, #d_targets=d_targets, 
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/Jellyfish.py", line 504, in filter
    plot_histogram(tot_freqs, outfig, vline=None)
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/Jellyfish.py", line 647, in plot_histogram
    plt.figure(figsize=(7,5), dpi=300, tight_layout=True)
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/pyplot.py", line 797, in figure
    manager = new_figure_manager(
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/pyplot.py", line 316, in new_figure_manager
    return _backend_mod.new_figure_manager(*args, **kwargs)
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/backend_bases.py", line 3545, in new_figure_manager
    return cls.new_figure_manager_given_figure(num, fig)
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/backend_bases.py", line 3550, in new_figure_manager_given_figure
    canvas = cls.FigureCanvas(figure)
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/backends/backend_qt5agg.py", line 21, in __init__
    super().__init__(figure=figure)
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/backends/backend_qt5.py", line 213, in __init__
    _create_qApp()
  File "~/.conda/envs/SubPhaser/lib/python3.8/site-packages/matplotlib/backends/backend_qt5.py", line 108, in _create_qApp
    raise RuntimeError('Invalid DISPLAY variable')
RuntimeError: Invalid DISPLAY variable"

How can I solve it?
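The traceback ends in matplotlib's Qt5 backend, which needs a usable X display. A common workaround on headless servers, sketched here in Python, is to select matplotlib's non-interactive `Agg` backend through the `MPLBACKEND` environment variable, which matplotlib reads when it is imported. The helper name `headless_env` and the input file names are made up for illustration, and the thread does not confirm whether SubPhaser sets a backend of its own.

```python
import os

def headless_env(base_env=None):
    """Return a copy of the environment that forces matplotlib's Agg backend."""
    env = dict(os.environ if base_env is None else base_env)
    env["MPLBACKEND"] = "Agg"   # file-only backend, needs no X server
    return env

# Hypothetical input names; replace them with your own files.
cmd = ["subphaser", "-i", "genome.fasta", "-c", "sg.config"]
env = headless_env()
print("would run:", " ".join(cmd), "with MPLBACKEND =", env["MPLBACKEND"])
# To launch for real: import subprocess; subprocess.run(cmd, env=env, check=True)
```

The shell equivalent is to export `MPLBACKEND=Agg` before calling `subphaser`.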

ValueError: 0 kmer with fold > 2. Please reset the filter options.

When I analyze using the default parameters, the following error occurs. What should I set the kmer parameters to?

23-12-30 15:24:20 [INFO] After filtering, remained 0 (0.00%) differential (freq >= 200) and 0 (0.00%) candidate (freq > 0) kmers
Traceback (most recent call last):
  File "/home/zuozd/miniconda3/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.6', 'console_scripts', 'subphaser')())
  File "/home/zuozd/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 797, in main
    pipeline.run()
  File "/home/zuozd/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 422, in run
    d_mat = dumps.filter(d_mat, lengths, self.sgs, outfig=histfig, #d_targets=d_targets,
  File "/home/zuozd/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/Jellyfish.py", line 502, in filter
    raise ValueError('0 kmer with fold > {}. Please reset the filter options.'.format(min_fold))
ValueError: 0 kmer with fold > 2. Please reset the filter options.

Thank you!

ValueError: n_components=3 must be between 0 and min

Dear author,
Thanks for your useful pipeline. I ran into an error when using SubPhaser.
My command:
subphaser -i groups_genome.fasta -c groups_sg.config
My config file:
[image not preserved]
The error I get:
[image not preserved]
What am I doing wrong?

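Subgenome analysis

Dear Prof. Zhang, I am using SubPhaser to separate the A and B subgenomes of an allotetraploid and have obtained preliminary results that I would like you to look at. The assembly has 22 chromosomes in total; after the SubPhaser analysis, one group contains 12 chromosomes and the other 10. I am confused by this and would appreciate your guidance. The results are shown below:
[Attached: k15_q50_f2 circos plot and two screenshots; images not preserved]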

Singularity container fails if environmental variable `R_LIBS_USER` is set

Hi!

I was able to finish the pipeline with the Singularity version, but I had to reset the path to the R libraries manually (I have a custom R library path set in my .bashrc). The following was sufficient:

export R_LIBS_USER=/share/home/app/bin/miniconda3/envs/SubPhaser/lib/R/library/

Maybe it's worth adding that variable to the container recipe.

Also, the mafft stage fails if $TMPDIR points to a path other than /tmp (it was /scratch in my case); I had to specify the bind path manually when running the container. (Could this be fixed by adding $TMPDIR to the default bind paths or to the SINGULARITY_BIND variable?)

Thanks again!
Nikita
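The two workarounds described in this report can be combined in a small launcher, sketched here in Python. It relies on two documented Singularity features: a host variable named `SINGULARITYENV_<NAME>` is exported as `<NAME>` inside the container, and `--bind` makes an extra host path visible inside it. The image name `subphaser.sif`, the input file names and the helper name are assumptions for illustration; the R library path is the one quoted above.

```python
import os

# Path taken from the report above; it is the R library inside the container.
CONTAINER_R_LIBS = "/share/home/app/bin/miniconda3/envs/SubPhaser/lib/R/library/"

def container_invocation(image, tool_args, host_env=None):
    """Build (argv, env) for `singularity exec` without running anything."""
    env = dict(os.environ if host_env is None else host_env)
    # SINGULARITYENV_<NAME> is exported as <NAME> inside the container.
    env["SINGULARITYENV_R_LIBS_USER"] = CONTAINER_R_LIBS
    argv = ["singularity", "exec"]
    tmpdir = env.get("TMPDIR", "/tmp")
    if tmpdir != "/tmp":
        argv += ["--bind", tmpdir]   # make a non-default TMPDIR visible inside
    argv += [image, "subphaser"] + list(tool_args)
    return argv, env

argv, env = container_invocation(
    "subphaser.sif",                               # hypothetical image name
    ["-i", "genome.fasta", "-c", "sg.config"],     # hypothetical inputs
    host_env={"TMPDIR": "/scratch"},
)
print(" ".join(argv))
```

The function only assembles the command and environment; passing them to `subprocess.run(argv, env=env)` would start the container.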

`TEsorter` cannot find `rexdb` in Singularity container

Hi and thanks for the tool! It looks very promising.

I had to use Singularity because of some cluster vs. conda Qt conflicts that I could not resolve. However, with Singularity I found myself unable to proceed beyond the TEsorter stage because of the following error:

Apptainer> cat /netscratch/dep_mercier/grp_novikova/software/SubPhaser/example_data/tmp/LTR.inner.fa.tesort.log
2023-12-08 17:33:23,593 -INFO- VARS: {'sequence': '/netscratch/dep_mercier/grp_novikova/software/SubPhaser/example_data/tmp/LTR.inner.fa', 'hmm_database': 'rexdb', 'seq_type': 'nucl', 'prefix': '/netscratch/dep_mercier/grp_novikova/software/SubPhaser/example_data/tmp/LTR.inner.fa', 'force_write_hmmscan': False, 'processors': 48, 'tmp_dir': '/netscratch/dep_mercier/grp_novikova/software/SubPhaser/example_data/tmp/LTR', 'min_coverage': 20, 'max_evalue': 0.001, 'disable_pass2': True, 'pass2_rule': '80-80-80', 'no_library': False, 'no_reverse': False, 'no_cleanup': False}
2023-12-08 17:33:23,594 -INFO- checking dependencies:
Traceback (most recent call last):
  File "/share/home/app/bin/miniconda3/envs/SubPhaser/bin/TEsorter", line 10, in <module>
    sys.exit(main())
  File "/share/home/app/bin/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/TEsorter/app.py", line 1014, in main
    pipeline(Args())
  File "/share/home/app/bin/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/TEsorter/app.py", line 145, in pipeline
    Dependency().check_hmmer(db=DB[args.hmm_database])
  File "/share/home/app/bin/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/TEsorter/app.py", line 952, in check_hmmer
    dp_version = self.get_hmm_version(db)[:3]
  File "/share/home/app/bin/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/TEsorter/app.py", line 967, in get_hmm_version
    line = open(db).readline()
**FileNotFoundError: [Errno 2] No such file or directory: '/share/home/app/bin/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm'**

Turns out databases are not loaded at all:

Apptainer> ls /share/home/app/bin/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/TEsorter/
__init__.py  __main__.py  app.py       modules/     version.py

How would I fix that?

Cheers,
Nikita
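A quick way to confirm this diagnosis before rerunning the pipeline is to test for the database file directly. The sketch below, in Python, checks a package directory for a `database` subdirectory containing the HMM file named in the error message. The helper name `missing_databases` is made up, and the demonstration runs on a temporary directory that mimics the `ls` output above rather than on a real TEsorter installation.

```python
import os
import tempfile

# File name copied from the FileNotFoundError above.
REXDB_HMM = "REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm"

def missing_databases(package_dir, names=(REXDB_HMM,)):
    """Return the database files that are absent from <package_dir>/database."""
    db_dir = os.path.join(package_dir, "database")
    return [n for n in names if not os.path.isfile(os.path.join(db_dir, n))]

# Demonstration on a throwaway directory without a database/ subdirectory.
with tempfile.TemporaryDirectory() as pkg:
    open(os.path.join(pkg, "app.py"), "w").close()
    print("missing:", missing_databases(pkg))
```

Inside the container, `package_dir` would be the `.../site-packages/TEsorter` directory shown in the listing above.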

Unbalanced chromosome numbers and differential kmer numbers among subgenomes

Hi,
I have run into a whole new set of problems:
[image not preserved]

One of the subgenomes has abnormally few subgenome-specific kmers, and the numbers of chromosomes assigned to the subgenomes are abnormally unbalanced. I have tried -k (8, 13, 15, 17, 22, 27, 33, 37, 45, 50), -q (10, 200, 600, 1000) and -f (1.5, 2), but none of these resolved the problem. Do you have any suggestions?
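A parameter sweep like the one described here is easier to track when each run gets its own output prefix. The Python sketch below only generates the command lines, using the `-i`, `-c`, `-k`, `-q`, `-f` and `-pre` options that appear elsewhere on this page. The file names and the prefix pattern are assumptions, and nothing here implies that any particular combination will balance the subgenomes.

```python
from itertools import product

def sweep_commands(genome, config, ks, qs, fs):
    """Yield one subphaser command line per (k, q, f) combination."""
    for k, q, f in product(ks, qs, fs):
        prefix = f"sweep_k{k}_q{q}_f{f}_"   # keeps the outputs of each run apart
        yield ["subphaser", "-i", genome, "-c", config,
               "-k", str(k), "-q", str(q), "-f", str(f), "-pre", prefix]

# Hypothetical input names and a small subset of the values tried above.
cmds = list(sweep_commands("genome.fasta", "sg.config",
                           ks=(13, 15, 17), qs=(10, 200), fs=(1.5, 2)))
for cmd in cmds:
    print(" ".join(cmd))
```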

Division by zero when trying to build trees?

Hi, when running SubPhaser I get the following error:

Traceback (most recent call last):
  File "/home/531734/.conda/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 779, in main
    pipeline.run()
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 516, in run
    ltr_bedlines, enrich_ltr_bedlines = self.step_ltr(d_kmers) if not self.disable_ltr else ([],[])
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 615, in step_ltr
    d_files = tree.build(job_args=job_args)
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/LTR.py", line 210, in build
    ncpus = [max(1, int(self.ncpu*v/tprop)) for v in prop]
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/LTR.py", line 210, in <listcomp>
    ncpus = [max(1, int(self.ncpu*v/tprop)) for v in prop]
ZeroDivisionError: division by zero

I believe this could be because only one scaffold was identified as a subgenome. Does this sound possible?

Many thanks

Mike
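The failing line divides by `tprop`, so the crash occurs whenever that total is zero. The Python sketch below shows one way such an allocation could be guarded. It assumes that `tprop` is the sum of `prop`, which the traceback does not show, and it is a hypothetical workaround rather than the author's fix.

```python
def allocate_cpus(ncpu, prop):
    """Split ncpu across jobs in proportion to prop, with at least 1 CPU per job."""
    tprop = sum(prop)   # assumption: the total is the sum of the weights
    if tprop == 0:
        # Nothing to weight by: give every job the minimum instead of dividing by zero.
        return [1 for _ in prop]
    return [max(1, int(ncpu * v / tprop)) for v in prop]

print(allocate_cpus(8, [3, 1]))   # proportional split
print(allocate_cpus(8, [0, 0]))   # all-zero weights no longer raise
```

The guard only avoids the exception; an all-zero total may still mean that there is nothing meaningful for the tree-building step to work on.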

Is there a limitation on chromosome counts?

Hi Rengang,

Thanks for your pipeline! SubPhaser is very useful for our project.

Is there a limit on the number of chromosomes that SubPhaser can handle? My species has a very large number of chromosomes.

Best,

Cheng

The output subgenomes are not paired

Hi,
I used this software to analyze the subgenomes. The input was paired chromosome files, but the output subgenomes contain 11 and 13 chromosomes, respectively. Is this result correct? How should I interpret it?

Looking forward to your reply!
Hang

Can't install SubPhaser: Found conflicts! Looking for incompatible packages.

Dear Dr Zhang,
Thanks for developing this useful tool.
Unfortunately, I am stuck at the installation step of SubPhaser.
When I run
conda env create -f SubPhaser.yaml
the conda environment cannot be set up, and I get errors like:
Collecting package metadata (repodata.json): done
Solving environment: Found conflicts! Looking for incompatible packages.
...
The following specifications were found to be incompatible with your system:

  • feature:/linux-64::__glibc==2.17=0
  • feature:|@/linux-64::__glibc==2.17=0
  • biopython==1.79=py38h497a2fe_0 -> libgcc-ng[version='>=9.3.0'] -> __glibc[version='>=2.17']
  • blast==2.11.0=pl526he19e7b1_0 -> libgcc-ng[version='>=7.5.0'] -> __glibc[version='>=2.17']
  ...
Your installed version is: 2.17

Looking forward to your response.

Too few markers

Hi,

I am trying to use SubPhaser to phase the subgenomes of my species. The parental species are unknown, and I built a good chromosome-level assembly with 99% complete BUSCOs. I named the subgenomes after a synteny analysis with a closely related species, Sorghum bicolor. SubPhaser did phase the subgenomes, but there seem to be very few kmer markers and no LTRs were found, far fewer than in your example files. I used the parameters -k 13 -q 100 -f 2 -disable_ltr.
Is this result trustworthy? What could be the reason?

Attachment: k13_q100_f2.0.circos.pdf

Thanks,
Cui

IndexError: cannot do a non-empty take from an empty axes.

Hi, I got the following error with my dataset when I tried to pre-assign all 40 chromosomes to 2 subgenomes. Apparently, SubPhaser re-assigned almost all chromosomes to SG1. On the same genome, SubPhaser completed successfully with a smaller number of homologous chromosome assignments, as you suggested in #7.

22-12-23 03:08:56 [INFO] Version: 1.2.5
22-12-23 03:08:56 [INFO] Arguments: {'genomes': ['/gfe_data/species_genome/Nepenthes_gracilis_male_HiC.fa.gz'], 'sg_cfgs': ['/gfe_data/species_subphaser_cfg/Nepenthes_gracilis_subphaser_cfg.txt'], 'labels': None, 'no_label': True, 'target': None, 'sg_assigned': None, 'sep': '|', 'custom_features': None, 'prefix': 'Nepenthes_gracilis.', 'outdir': 'Nepenthes_gracilis.subphaser', 'tmpdir': 'Nepenthes_gracilis.tmp', 'k': 15, 'min_fold': 2, 'min_freq': 200, 'baseline': 1, 'lower_count': 3, 'min_prop': None, 'max_freq': 1000000000.0, 'max_prop': None, 'low_mem': None, 'by_count': False, 're_filter': False, 'nsg': None, 'replicates': 1000, 'jackknife': 50, 'max_pval': 0.05, 'test_method': 'ttest_ind', 'figfmt': 'pdf', 'heatmap_colors': ('green', 'black', 'red'), 'heatmap_options': "Rowv=T,Colv=T,scale='col',dendrogram='row',labCol=F,trace='none',key=T,key.title=NA,density.info='density',main=NA,xlab='Differential kmers',margins=c(2.5,12)", 'just_core': False, 'disable_ltr': False, 'ltr_detectors': ['ltr_harvest'], 'ltr_finder_options': '-w 2 -D 15000 -d 1000 -L 7000 -l 100 -p 20 -C -M 0.8', 'ltr_harvest_options': '-seqids yes -similar 80 -vic 10 -seed 20 -minlenltr 100 -maxlenltr 7000 -mintsd 4 -maxtsd 6', 'tesorter_options': '-db rexdb -dp2', 'all_ltr': False, 'intact_ltr': False, 'exclude_exchanges': False, 'non_specific': False, 'mu': 1.3e-08, 'disable_ltrtree': False, 'subsample': 1000, 'ltr_domains': ['INT', 'RT', 'RH'], 'trimal_options': '-automated1', 'tree_method': 'FastTree', 'tree_options': '', 'ggtree_options': "branch.length='none', layout='circular'", 'disable_circos': False, 'window_size': 1000000, 'disable_blocks': False, 'aligner': 'minimap2', 'aligner_options': '-x asm20 -n 10', 'min_block': 100000, 'alt_cfgs': None, 'chr_ordered': None, 'ncpu': 4, 'max_memory': '32', 'cleanup': False, 'overwrite': False}
22-12-23 03:08:56 [INFO] Target chromosomes: ['scaffold2', 'scaffold1', 'scaffold8', 'scaffold11', 'scaffold12', 'scaffold3', 'scaffold17', 'scaffold23', 'scaffold24', 'scaffold40', 'scaffold4', 'scaffold22', 'scaffold30', 'scaffold33', 'scaffold39', 'scaffold5', 'scaffold13', 'scaffold16', 'scaffold18', 'scaffold26', 'scaffold6', 'scaffold15', 'scaffold20', 'scaffold32', 'scaffold38', 'scaffold7', 'scaffold14', 'scaffold27', 'scaffold28', 'scaffold29', 'scaffold9', 'scaffold19', 'scaffold21', 'scaffold34', 'scaffold36', 'scaffold10', 'scaffold25', 'scaffold31', 'scaffold35', 'scaffold37']
22-12-23 03:08:56 [INFO] Splitting genomes by chromosome into `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.`
22-12-23 03:09:08 [INFO] New check point file: `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.split.ok`
22-12-23 03:09:08 [INFO] Chromosomes: ['scaffold2', 'scaffold1', 'scaffold8', 'scaffold11', 'scaffold12', 'scaffold3', 'scaffold17', 'scaffold23', 'scaffold24', 'scaffold40', 'scaffold4', 'scaffold22', 'scaffold30', 'scaffold33', 'scaffold39', 'scaffold5', 'scaffold13', 'scaffold16', 'scaffold18', 'scaffold26', 'scaffold6', 'scaffold15', 'scaffold20', 'scaffold32', 'scaffold38', 'scaffold7', 'scaffold14', 'scaffold27', 'scaffold28', 'scaffold29', 'scaffold9', 'scaffold19', 'scaffold21', 'scaffold34', 'scaffold36', 'scaffold10', 'scaffold25', 'scaffold31', 'scaffold35', 'scaffold37']
22-12-23 03:09:08 [INFO] Chromosome Number: 40
22-12-23 03:09:08 [INFO] CONFIG: [[['scaffold2'], ['scaffold1', 'scaffold8', 'scaffold11', 'scaffold12']], [['scaffold3'], ['scaffold17', 'scaffold23', 'scaffold24', 'scaffold40']], [['scaffold4'], ['scaffold22', 'scaffold30', 'scaffold33', 'scaffold39']], [['scaffold5'], ['scaffold13', 'scaffold16', 'scaffold18', 'scaffold26']], [['scaffold6'], ['scaffold15', 'scaffold20', 'scaffold32', 'scaffold38']], [['scaffold7'], ['scaffold14', 'scaffold27', 'scaffold28', 'scaffold29']], [['scaffold9'], ['scaffold19', 'scaffold21', 'scaffold34', 'scaffold36']], [['scaffold10'], ['scaffold25', 'scaffold31', 'scaffold35', 'scaffold37']]]
22-12-23 03:09:08 [INFO] Genome size: 746,713,351 bp
22-12-23 03:09:08 [INFO] ###Step: Kmer Count
22-12-23 03:09:08 [INFO] Counting kmer by jellyfish
22-12-23 03:09:08 [INFO] Start Pool with 4 process(es)
.
.
.
22-12-23 03:13:26 [INFO] Bootstrap: mean Adjusted Rand-Index: 0.9428; mean V-measure score: 0.9295
22-12-23 03:13:26 [INFO] Subgenome assignments: OrderedDict([('scaffold2', 'SG1'), ('scaffold1', 'SG1'), ('scaffold8', 'SG1'), ('scaffold11', 'SG1'), ('scaffold12', 'SG1'), ('scaffold3', 'SG1'), ('scaffold17', 'SG1'), ('scaffold23', 'SG1'), ('scaffold24', 'SG1'), ('scaffold40', 'SG1'), ('scaffold4', 'SG1'), ('scaffold22', 'SG1'), ('scaffold30', 'SG2'), ('scaffold33', 'SG2'), ('scaffold39', 'SG1'), ('scaffold5', 'SG1'), ('scaffold13', 'SG1'), ('scaffold16', 'SG1'), ('scaffold18', 'SG1'), ('scaffold26', 'SG1'), ('scaffold6', 'SG1'), ('scaffold15', 'SG1'), ('scaffold20', 'SG1'), ('scaffold32', 'SG1'), ('scaffold38', 'SG1'), ('scaffold7', 'SG1'), ('scaffold14', 'SG1'), ('scaffold27', 'SG1'), ('scaffold28', 'SG1'), ('scaffold29', 'SG1'), ('scaffold9', 'SG1'), ('scaffold19', 'SG1'), ('scaffold21', 'SG1'), ('scaffold34', 'SG1'), ('scaffold36', 'SG1'), ('scaffold10', 'SG1'), ('scaffold25', 'SG1'), ('scaffold31', 'SG1'), ('scaffold35', 'SG1'), ('scaffold37', 'SG1')])
22-12-23 03:13:26 [INFO] Outputing `chromosome` - `subgenome` assignments to `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.chrom-subgenome.tsv`
22-12-23 03:13:26 [INFO] Outputing significant differiential `kmer` - `subgenome` maps to `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.sig.kmer-subgenome.tsv`
22-12-23 03:13:26 [INFO] Start Pool with 4 process(es)
22-12-23 03:13:26 [INFO] 9 significant subgenome-specific kmers
22-12-23 03:13:26 [INFO] 	9 SG2-specific kmers
22-12-23 03:13:27 [INFO] run CMD: `Rscript /gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.kmer.mat.R`
22-12-23 03:13:27 [INFO] Outputing PCA plot to `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.kmer_pca.pdf`
22-12-23 03:13:28 [INFO] Outputing `coordinate` - `subgenome` maps to `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.subgenome.bin.count`
22-12-23 03:13:28 [INFO] Start Pool with 4 process(es)
.
.
.
22-12-23 03:14:47 [INFO] Processed 94 sequences
22-12-23 03:14:47 [INFO] 92 (97.87%) sequences contain subgenome-specific kmers
22-12-23 03:14:47 [INFO] 100.00% of 9 subgenome-specific kmers are mapped
22-12-23 03:14:47 [INFO] New check point file: `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.Nepenthes_gracilis.k15_q200_f2.subgenome.bin.count.ok`
22-12-23 03:14:47 [INFO] Enriching subgenome by chromosome window (size: 1000000)
22-12-23 03:14:47 [INFO] Start Pool with 4 process(es)
.
.
.
22-12-23 03:21:31 [INFO] finished with 0 commands uncompleted
22-12-23 03:21:32 [INFO] New check point file: `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.LTR.scn.ok`
22-12-23 03:21:32 [INFO] 23051 LTRs identified
22-12-23 03:21:32 [INFO] Extracting inner sequences of LTRs to classify by `TEsorter`
22-12-23 03:21:32 [INFO] run CMD: `TEsorter /gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.LTR.inner.fa -db rexdb -dp2 -p 4 -pre /gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.LTR.inner.fa -tmp /gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.LTR &> /gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.LTR.inner.fa.tesort.log`
22-12-23 03:39:13 [INFO] New check point file: `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.LTR.tesort.ok`
22-12-23 03:39:13 [INFO] By TEsorter, 13396 (58.1%) are classified as LTRs, of which 5538 (41.3%) are intact with complete protein domains
22-12-23 03:39:13 [INFO] After filtering, 13202 / 23051 (57.3%) LTRs retained
22-12-23 03:39:13 [INFO] Outputing `coordinate` - `LTR` maps to `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.ltr.bin.count`
22-12-23 03:39:13 [INFO] Start Pool with 4 process(es)
22-12-23 03:39:23 [INFO] Processed 13202 sequences
22-12-23 03:39:23 [INFO] 204 (1.55%) sequences contain subgenome-specific kmers
22-12-23 03:39:23 [INFO] 44.44% of 9 subgenome-specific kmers are mapped
22-12-23 03:39:25 [INFO] New check point file: `/gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.tmp/Nepenthes_gracilis.Nepenthes_gracilis.k15_q200_f2.ltr.bin.count.ok`
22-12-23 03:39:25 [INFO] Enriching subgenome-specific LTR-RTs
22-12-23 03:39:25 [INFO] Start Pool with 4 process(es)
/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/Stats.py:157: RuntimeWarning: invalid value encountered in divide
  ratios = np.array(row) / np.array(total)
/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/Stats.py:157: RuntimeWarning: invalid value encountered in divide
  ratios = np.array(row) / np.array(total)
/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/Stats.py:157: RuntimeWarning: invalid value encountered in divide
  ratios = np.array(row) / np.array(total)
/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/Stats.py:157: RuntimeWarning: invalid value encountered in divide
  ratios = np.array(row) / np.array(total)
22-12-23 03:39:25 [INFO] Output: /gfe_data/tmp/14_Nepenthes_gracilis/Nepenthes_gracilis.subphaser/Nepenthes_gracilis.k15_q200_f2.ltr.enrich
22-12-23 03:39:25 [INFO] 0 significant subgenome-specific LTR-RTs
22-12-23 03:39:28 [INFO] Summary of overall LTR insertion age (million years):
/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3432: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/opt/conda/envs/biotools/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 784, in main
    pipeline.run()
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 518, in run
    ltr_bedlines, enrich_ltr_bedlines = self.step_ltr(d_kmers) if not self.disable_ltr else ([],[])
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 602, in step_ltr
    enrich_ltrs = LTR.plot_insert_age(ltrs, d_enriched, prefix, 
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/LTR.py", line 515, in plot_insert_age
    d_info = summary_ltr_time(d_data, fout)
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/LTR.py", line 601, in summary_ltr_time
    np.median(xages), abs(np.percentile(xages, 2.5)), np.percentile(xages, 97.5)))
  File "<__array_function__ internals>", line 180, in percentile
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/lib/function_base.py", line 4166, in percentile
    return _quantile_unchecked(
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/lib/function_base.py", line 4424, in _quantile_unchecked
    r, k = _ureduce(a,
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3725, in _ureduce
    r = func(a, **kwargs)
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/lib/function_base.py", line 4593, in _quantile_ureduce_func
    result = _quantile(arr,
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/lib/function_base.py", line 4699, in _quantile
    take(arr, indices=-1, axis=DATA_AXIS)
  File "<__array_function__ internals>", line 180, in take
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 190, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.
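The log above reports "0 significant subgenome-specific LTR-RTs" just before the crash, and the final error shows that a percentile was requested from an empty array. The Python sketch below shows how a summary step could skip empty groups instead of failing. It uses a hand-written linear-interpolation percentile so that it needs only the standard library. The function names are made up, and this is a hypothetical guard, not the code in `LTR.py`.

```python
import math

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in 0..100) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def summarize_ages(ages):
    """Return (median, 2.5th, 97.5th percentile), or None when there are no ages."""
    if not ages:
        return None   # e.g. no subgenome-specific LTR-RTs were found
    vals = sorted(ages)
    return percentile(vals, 50), percentile(vals, 2.5), percentile(vals, 97.5)

print(summarize_ages([]))                           # empty group is skipped
print(summarize_ages([1.0, 2.0, 3.0, 4.0, 5.0]))    # made-up ages
```

Until the tool handles this case, rerunning with `-disable_ltr`, an option used in another report on this page, would presumably bypass the step that fails here.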

Failed to install SubPhaser

Hi,
When I type 'conda env create -f SubPhaser.yaml', an error is shown:
[image not preserved]
Do you know how to solve this?
Best wishes.

Triploid genome

[image not preserved]
Hello Professor,
The triploid genome was partitioned like this, with 17 chromosomes per haploid set. Can this still be improved?

Changing mutation rate

Hello,

I'm curious why the densities in the ***.ltr.insert.density.pdf plot change by an order of magnitude when the mutation rate is changed, while ***.ltr.insert.histo.pdf remains the same. I expected only the x-axis to change with the new mutation rate, but the density (y-axis) of LTRs changes as well. I attached the small genome example run with the defaults and with a different mutation rate of -mu 6.7e-09.

I used the default example script then I ran this:

prefix=Arabidopsis_suecica
DT=$(date +"%y%m%d%H%M")
options="-pre ${prefix}_" # to avoid conflicts
subphaser -i ${prefix}_genome.fasta.gz -c ${prefix}_sg.config -max_memory 128G -disable_circos -intact_ltr -mu 1.75e-09 $options 2>&1 | tee ${prefix}.log.$DT

I checked without the -intact_ltr flag and the results are the same.

Any insights would be greatly appreciated.

Arabidopsis_suecica_k15_q200_f2.ltr.insert.histo.default.pdf
Arabidopsis_suecica_k15_q200_f2.ltr.insert.density.1.75.pdf
Arabidopsis_suecica_k15_q200_f2.ltr.insert.histo.1.75.pdf
Arabidopsis_suecica_k15_q200_f2.ltr.insert.density.default.pdf

Thank you.
Crystal
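One possible explanation, offered as a sketch rather than a statement about SubPhaser's internals: if insertion age is computed as T = K / (2 * mu), where K is the divergence between the two LTRs of an element, then changing mu rescales every age by the same factor. A histogram of counts keeps its bar heights under such a rescaling, but a density is normalised to unit area, so stretching the x-axis by a factor c lowers the y-values by the same factor. Going from 1.3e-08 (the `mu` value in the argument dump of another report on this page) to 1.75e-09 stretches the ages about 7.4-fold. The divergence values and function names in the Python example below are made up.

```python
def insertion_ages(divergences, mu):
    """Ages in million years, assuming T = K / (2 * mu)."""
    return [k / (2.0 * mu) / 1e6 for k in divergences]

def density_histogram(values, nbins=5):
    """Return (bin_width, heights); heights are scaled so the total area is 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        counts[min(int((v - lo) / width), nbins - 1)] += 1
    return width, [c / (len(values) * width) for c in counts]

# Made-up LTR divergences (substitutions per site), for illustration only.
divergences = [0.001, 0.004, 0.006, 0.010, 0.012, 0.020, 0.025, 0.030]

peaks = {}
for mu in (1.3e-08, 1.75e-09):
    width, heights = density_histogram(insertion_ages(divergences, mu))
    peaks[mu] = max(heights)
    print(f"mu={mu:g}: bin width {width:.3f} My, peak density {peaks[mu]:.4f}")

print("peak ratio:", peaks[1.3e-08] / peaks[1.75e-09])
```

Under these assumptions the counts per bin are identical for both rates, and only the bin width, and therefore the density scale, differs.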

ValueError: All singletons are not allowed

When I run the command subphaser -i brg.fa -c brg.txt, it fails with the error ValueError: All singletons are not allowed.

How can I solve this? Thanks!

Traceback (most recent call last):
  File "/home/zuozd/miniconda3/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.6', 'console_scripts', 'subphaser')())
  File "/home/zuozd/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 797, in main
    pipeline.run()
  File "/home/zuozd/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 422, in run
    d_mat = dumps.filter(d_mat, lengths, self.sgs, outfig=histfig, #d_targets=d_targets,
  File "/home/zuozd/miniconda3/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/Jellyfish.py", line 479, in filter
    raise ValueError('All singletons are not allowed')
ValueError: All singletons are not allowed

Installation problem

When I use conda to install this software, it prompts me as follows:
Grid computing is not available because DRMAA not configured properly: Could not find drmaa library.
How can I solve this problem?
Many thanks.

IndexError: index -1 is out of bounds for axis 0 with size 0

Hi, thanks for developing the tool. I ran the ginger example successfully, but when I used my own triploid genome (3n=63), I got an error. My config file is as follows:
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 21 22
23 24 25
27 28 29
30 31 32
33 34 35
36 37 38
39 40 41
42 43 45
47 48 49
50 51 52
53 54 59
60 64 65
67 68 71
72 73 74
75 76 77

The command was 'subphaser -i ref.fa -c config.txt -pre out'. Then I got the following error:

22-06-02 16:24:49 [INFO] Summary of overall LTR insertion age (million years):
/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 779, in main
    pipeline.run()
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 516, in run
    ltr_bedlines, enrich_ltr_bedlines = self.step_ltr(d_kmers) if not self.disable_ltr else ([],[])
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 600, in step_ltr
    enrich_ltrs = LTR.plot_insert_age(ltrs, d_enriched, prefix, shared=d_shared,
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/LTR.py", line 513, in plot_insert_age
    d_info = summary_ltr_time(d_data, fout)
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/LTR.py", line 578, in summary_ltr_time
    np.median(xages), abs(np.percentile(xages, 2.5)), np.percentile(xages, 97.5)))
  File "<__array_function__ internals>", line 5, in percentile
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3867, in percentile
    return _quantile_unchecked(
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3986, in _quantile_unchecked
    r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3564, in _ureduce
    r = func(a, **kwargs)
  File "/home/wangyue/software/miniconda2/envs/SubPhaser/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4098, in _quantile_ureduce_func
    n = np.isnan(ap[-1])
IndexError: index -1 is out of bounds for axis 0 with size 0

In addition, the file "outk15_q200_f2.chrom-subgenome.tsv" shows a different number of chromosomes for each genotype.

I don't know what the problem is. Could you give me any advice? Thanks a lot.

Error in os.link(figfile, dstfig)

I came across a seemingly rare error, which occurred when I tried to run SubPhaser installed to a Singularity container on macOS using Vagrant. This is not a big issue to me because it didn't happen when I used the same container on my main environment on a Linux server, but I would like to report it here.

I found a similar problem on hard links reported elsewhere:
https://neurostars.org/t/qsiprep-raw-src-qc-os-link-self-inputs-src-file-linked-src-file-permissionerror-errno-1-operation-not-permitted/19076

(Please note that I used Arabidopsis thaliana as input for testing purposes only.)

22-12-22 12:11:09 [INFO] ###Step: Circos
22-12-22 12:11:09 [INFO] Limit memory 4.6G per process with total memory 1.2
22-12-22 12:11:09 [INFO] Using 1 processes to align chromosome sequences
22-12-22 12:11:09 [INFO] Check point file: `/gfe_data/tmp/1_Arabidopsis_thaliana/Arabidopsis_thaliana.tmp/Arabidopsis_thaliana.Blocks/Chr1-Chr2.paf.ok` exists; skip this step
22-12-22 12:11:09 [INFO] Start Pool with 1 process(es)
22-12-22 12:11:10 [INFO] Copy `/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/circos` to `/gfe_data/tmp/1_Arabidopsis_thaliana/Arabidopsis_thaliana.subphaser/`
using cutoff: upper 43576.5 for SG1
using cutoff: upper 694.5 for SG2
22-12-22 12:11:12 [INFO] run CMD: `cd /gfe_data/tmp/1_Arabidopsis_thaliana/Arabidopsis_thaliana.subphaser/Arabidopsis_thaliana.k15_q200_f2.circos && circos -conf ./circos.conf`
Traceback (most recent call last):
  File "/opt/conda/envs/biotools/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 784, in main
    pipeline.run()
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 524, in run
    self.step_circos(
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 684, in step_circos
    Circos.circos_plot(self.chromfiles, wkdir, *args, **kargs)
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/Circos.py", line 515, in circos_plot
    os.link(figfile, dstfig)
PermissionError: [Errno 1] Operation not permitted: '/gfe_data/tmp/1_Arabidopsis_thaliana/Arabidopsis_thaliana.subphaser/Arabidopsis_thaliana.k15_q200_f2.circos/circos.png' -> '/gfe_data/tmp/1_Arabidopsis_thaliana/Arabidopsis_thaliana.subphaser/Arabidopsis_thaliana.k15_q200_f2.circos.png'
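
The traceback shows `os.link` raising `PermissionError` while hard-linking the Circos figure into the output directory. Some shared or virtualized filesystems may not permit hard links, which would be consistent with the error appearing under Vagrant but not on the Linux server. Below is a minimal sketch of one possible workaround, assuming a plain copy is an acceptable substitute for the link. The helper name `link_or_copy` is hypothetical and is not part of SubPhaser.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Hard-link src to dst, falling back to a copy if linking is refused.

    Hypothetical helper for illustration; not SubPhaser's own code.
    """
    if os.path.lexists(dst):
        os.remove(dst)  # os.link fails if the destination already exists
    try:
        os.link(src, dst)
        return "link"
    except OSError:
        # PermissionError (errno 1) is a subclass of OSError, so filesystems
        # that refuse hard links end up here and get a regular copy instead.
        shutil.copyfile(src, dst)
        return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "circos.png")
        dst = os.path.join(tmp, "copy.png")
        with open(src, "wb") as fh:
            fh.write(b"demo")
        print(link_or_copy(src, dst))
```

The return value only reports which path was taken; either way the destination ends up with the same bytes as the source.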

Arabidopsis_suecica_LTR.inner.fa.cls.tsv not found

Thank you for developing this excellent tool! I installed SubPhaser in a singularity container with the minimal dependencies listed in #10 (comment) and tried to run the Arabidopsis test dataset, but an error occurred. Another run with my own dataset stopped with the same error message. I would be grateful if you could provide me with potential solutions.

22-12-18 06:54:20 [INFO] finished with 0 commands uncompleted
22-12-18 06:54:20 [INFO] New check point file: `/home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR.scn.ok`
22-12-18 06:54:20 [INFO] 5566 LTRs identified
22-12-18 06:54:20 [INFO] Extracting inner sequences of LTRs to classify by `TEsorter`
22-12-18 06:54:20 [INFO] run CMD: `TEsorter /home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR.inner.fa -db rexdb -dp2 -p 128 -pre /home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR.inner.fa -tmp /home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR &> /home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR.inner.fa.tesort.log`
22-12-18 06:54:56 [INFO] New check point file: `/home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR.tesort.ok`
Traceback (most recent call last):
  File "/opt/conda/envs/biotools/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 784, in main
    pipeline.run()
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 518, in run
    ltr_bedlines, enrich_ltr_bedlines = self.step_ltr(d_kmers) if not self.disable_ltr else ([],[])
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/__main__.py", line 556, in step_ltr
    ltrs, ltrfile = pipeline.run()
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/LTR.py", line 335, in run
    d_class = self.classfify(ltrs)
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/LTR.py", line 397, in classfify
    for classification in CommonClassifications(clsfile):
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/subphaser-1.2.5-py3.9.egg/subphaser/api/TEsorter/app.py", line 339, in _parse
    for i, line in enumerate(open(self.clsfile)):
  File "/opt/conda/envs/biotools/lib/python3.9/site-packages/xopen/__init__.py", line 1291, in xopen
    opened_file = open(filename, mode, **text_mode_kwargs)  # type: ignore
FileNotFoundError: [Errno 2] No such file or directory: '/home/kfuku/docker_img/gfe/usr/local/bin/SubPhaser/example_data/Arabidopsis_suecica_tmp/Arabidopsis_suecica_LTR.inner.fa.cls.tsv'
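
The log records a `.tesort.ok` checkpoint even though the `.cls.tsv` file is missing, so the failure only surfaces later in the parser. The TEsorter command shown above redirects its output to a `.tesort.log` file, which is a natural place to look first. Below is a minimal sketch of checking for an expected output file and surfacing the tail of such a log. The helper name `require_output` and its parameters are hypothetical and are not part of SubPhaser or TEsorter.

```python
import os


def require_output(path, logfile=None, tail=20):
    """Return path if it exists and is non-empty; otherwise raise with log context.

    Hypothetical helper for illustration only.
    """
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return path
    message = f"expected output not found or empty: {path}"
    if logfile and os.path.isfile(logfile):
        # Read the log leniently so odd bytes do not hide the real error.
        with open(logfile, errors="replace") as fh:
            last_lines = fh.readlines()[-tail:]
        message += f"\n--- last {len(last_lines)} line(s) of {logfile} ---\n"
        message += "".join(last_lines)
    raise FileNotFoundError(message)
```

Called right after the external command finishes, a check like this reports the tool's own error message instead of a bare missing-file traceback.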

Only one pair of homologous chromosomes were not phased

Hi~
SubPhaser is a great piece of software, but I ran into a problem when using it to phase my diploid genome.

After Hi-C scaffolding, I obtained 22 superscaffolds, which I want to divide into two sets (2n=2x=22).
I ran SubPhaser (-k 17 -q 50 -f 1.5). 20 superscaffolds were phased, and only one pair of homologous scaffolds (scaffold_9 and scaffold_10) was not phased.
k17_q50_f1.5.kmer_freq.pdf
k17_q50_f1.5.kmer_pca.pdf
k17_q50_f1.5.ltr.insert.density.pdf

How can I solve this problem?
Looking forward to your reply!
Yang

Used for contig-level assembly

Hi, thanks for developing such a useful tool. I wonder if it can be used for a contig-level assembly.
Thank you for your reply in advance~

ModuleNotFoundError: No module named 'TEsorter'

Hi, I got a module error (No. 1) when I ran subphaser.
To solve the error, I installed TEsorter using both new and old-school methods, but subphaser still can't find that module.
When I ran "from TEsorter.app import CommonClassifications" in python3, I got an ImportError (No. 2).
Could you suggest a solution?

Thank you !
Jung

(No. 1)
(SubPhaser)$ subphaser

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/subphaser-1.2.6-py3.8.egg/subphaser/LTR.py", line 9, in <module>
    from TEsorter.app import CommonClassifications
ModuleNotFoundError: No module named 'TEsorter'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/subphaser", line 11, in <module>
    load_entry_point('subphaser==1.2.6', 'console_scripts', 'subphaser')()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 490, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2854, in load_entry_point
    return ep.load()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2445, in load
    return self.resolve()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2451, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/local/lib/python3.8/dist-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 15, in <module>
    from . import LTR
  File "/usr/local/lib/python3.8/dist-packages/subphaser-1.2.6-py3.8.egg/subphaser/LTR.py", line 11, in <module>
    from .api.TEsorter.app import CommonClassifications
  File "/usr/local/lib/python3.8/dist-packages/subphaser-1.2.6-py3.8.egg/subphaser/api/TEsorter/app.py", line 37, in <module>
    from .modules.get_record import get_records
  File "/usr/local/lib/python3.8/dist-packages/subphaser-1.2.6-py3.8.egg/subphaser/api/TEsorter/modules/get_record.py", line 6, in <module>
    from TEsorter.modules.small_tools import open_file as open
ModuleNotFoundError: No module named 'TEsorter'

(No. 2)

from TEsorter.app import CommonClassifications
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'CommonClassifications' from 'TEsorter.app' (/home/super/miniconda3/envs/SubPhaser/lib/python3.10/site-packages/TEsorter/app.py)
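
The two tracebacks point at different Python installations: (No. 1) runs from `/usr/local/lib/python3.8/dist-packages`, while (No. 2) finds TEsorter under a conda environment's `python3.10/site-packages`. Below is a minimal sketch, using only the standard library, for printing which interpreter is running and where a module would be imported from. The function name `describe_module` is hypothetical.

```python
import importlib.util
import sys


def describe_module(name):
    """Return (interpreter path, module origin or None) for a top-level module name."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        spec = None
    origin = spec.origin if spec is not None else None
    return sys.executable, origin


if __name__ == "__main__":
    for mod in ("TEsorter", "subphaser"):
        exe, origin = describe_module(mod)
        print(f"{mod}: interpreter={exe} origin={origin}")
```

Running this with the same interpreter that launches `subphaser` shows whether both packages are visible from one environment.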

cannot allocate memory

Hi,

Thanks a lot for the very nice tool!

I am trying to phase the subgenomes from this hexaploid haplotype-phased genome (9Gb), but somehow I always get stuck with the error message cannot allocate memory, despite changing the memory option several times... Any help with that is appreciated.

Cheers
André
...
24-01-25 07:23:35 [INFO] Loading kmer matrix from jellyfish
24-01-25 07:23:35 [INFO] Start Pool with 40 process(es)
24-01-25 07:23:57 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_53.fasta_15.fa
24-01-25 07:28:54 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_60.fasta_15.fa
24-01-25 07:29:21 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_5.fasta_15.fa
24-01-25 07:30:13 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_57.fasta_15.fa
24-01-25 07:30:47 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_61.fasta_15.fa
24-01-25 07:30:52 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_54.fasta_15.fa
24-01-25 07:31:00 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_22.fasta_15.fa
24-01-25 07:31:36 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_50.fasta_15.fa
24-01-25 07:31:46 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_52.fasta_15.fa
24-01-25 07:32:25 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_48.fasta_15.fa
24-01-25 07:32:31 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_42.fasta_15.fa
24-01-25 07:32:38 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_47.fasta_15.fa
24-01-25 07:32:44 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_55.fasta_15.fa
24-01-25 07:32:49 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_4.fasta_15.fa
24-01-25 07:33:38 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_35.fasta_15.fa
24-01-25 07:33:47 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_40.fasta_15.fa
24-01-25 07:33:53 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_25.fasta_15.fa
24-01-25 07:34:02 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_27.fasta_15.fa
24-01-25 07:34:12 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_38.fasta_15.fa
24-01-25 07:34:22 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_37.fasta_15.fa
24-01-25 07:35:11 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_41.fasta_15.fa
24-01-25 07:35:17 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_26.fasta_15.fa
24-01-25 07:35:28 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_33.fasta_15.fa
24-01-25 07:35:40 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_65.fasta_15.fa
24-01-25 07:35:52 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_28.fasta_15.fa
24-01-25 07:36:01 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_7.fasta_15.fa
24-01-25 07:36:12 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_17.fasta_15.fa
24-01-25 07:36:21 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_36.fasta_15.fa
24-01-25 07:36:32 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_30.fasta_15.fa
24-01-25 07:36:44 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_14.fasta_15.fa
24-01-25 07:37:41 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_18.fasta_15.fa
24-01-25 07:37:57 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_63.fasta_15.fa
24-01-25 07:38:08 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_1.fasta_15.fa
24-01-25 07:38:19 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_16.fasta_15.fa
24-01-25 07:38:27 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_31.fasta_15.fa
24-01-25 07:38:36 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_12.fasta_15.fa
24-01-25 07:38:49 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_11.fasta_15.fa
24-01-25 07:39:01 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_62.fasta_15.fa
24-01-25 07:39:07 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_23.fasta_15.fa
24-01-25 07:39:18 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_64.fasta_15.fa
24-01-25 07:39:23 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_66.fasta_15.fa
24-01-25 07:39:37 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_39.fasta_15.fa
24-01-25 07:39:55 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_15.fasta_15.fa
24-01-25 07:40:08 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_3.fasta_15.fa
24-01-25 07:40:19 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_21.fasta_15.fa
24-01-25 07:40:29 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_24.fasta_15.fa
24-01-25 07:41:21 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_29.fasta_15.fa
24-01-25 07:41:31 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_34.fasta_15.fa
24-01-25 07:41:40 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_32.fasta_15.fa
24-01-25 07:41:52 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_56.fasta_15.fa
24-01-25 07:42:08 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_8.fasta_15.fa
24-01-25 07:42:20 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_9.fasta_15.fa
24-01-25 07:42:32 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_10.fasta_15.fa
24-01-25 07:42:43 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_13.fasta_15.fa
24-01-25 07:42:55 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_2.fasta_15.fa
24-01-25 07:43:08 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_19.fasta_15.fa
24-01-25 07:43:20 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_20.fasta_15.fa
24-01-25 07:43:30 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_45.fasta_15.fa
24-01-25 07:43:38 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_46.fasta_15.fa
24-01-25 07:43:44 [INFO] Loading /netscratch/dep_mercier/grp_marques/marques/LPA/CBC/SubPhaser/wgdi/non-necessary/CBC_tmp/CBC_chromosomes/scaffold_6.fasta_15.fa
24-01-25 07:43:51 [INFO] 62557073 kmers in total
24-01-25 07:43:51 [INFO] Filtering differential kmers
Traceback (most recent call last):
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.6', 'console_scripts', 'subphaser')())
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 797, in main
    pipeline.run()
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/__main__.py", line 422, in run
    d_mat = dumps.filter(d_mat, lengths, self.sgs, outfig=histfig, #d_targets=d_targets,
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/Jellyfish.py", line 487, in filter
    for kmer, freqs, tot_freq in pool_func(_filter_kmer, args, self.ncpu,
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/site-packages/subphaser-1.2.6-py3.8.egg/subphaser/RunCmdsMP.py", line 336, in pool_func
    pool = multiprocessing.Pool(processors)
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/netscratch/dep_mercier/grp_marques/bin/marques-envs/SGphasing/lib/python3.8/multiprocessing/popen_fork.py", line 70, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
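
The failure occurs in `os.fork()` while `multiprocessing.Pool` starts its workers (the log shows a pool of 40 processes), after the k-mer matrix has been loaded. One common mitigation, offered here as an assumption rather than a confirmed fix for this case, is to start fewer worker processes so that the forked children fit within the memory the system is willing to commit. The sketch below caps a requested worker count from a memory budget. `cap_workers` and its parameters are hypothetical and do not correspond to SubPhaser options.

```python
def cap_workers(requested, mem_budget_bytes, per_worker_bytes):
    """Return the largest worker count <= requested that fits the memory budget.

    Hypothetical helper: per_worker_bytes is the caller's own estimate of
    what each worker needs; at least one worker is always returned.
    """
    if requested < 1:
        raise ValueError("requested must be >= 1")
    if per_worker_bytes <= 0:
        raise ValueError("per_worker_bytes must be > 0")
    fit = mem_budget_bytes // per_worker_bytes
    return max(1, min(requested, fit))


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Illustrative numbers only: 40 requested processes, a 64 GiB budget,
    # and an estimated 8 GiB per worker.
    print(cap_workers(40, 64 * GIB, 8 * GIB))
```

The resulting number could then be passed as the process count when the pool is created.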

Invalid specifier: '>=3.6:'

With Python 3.9, I got the following error from python setup.py install when installing the latest SubPhaser.

> python setup.py install
error in subphaser setup command: 'python_requires' must be a string containing valid version specifiers; Invalid specifier: '>=3.6:'

The installation worked well after replacing python_requires='>=3.6:' with python_requires='>=3.6' in setup.py.
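
The stray colon makes `'>=3.6:'` an invalid version specifier, which is why the setup command rejects it. A deliberately simplified sketch of that kind of check follows. The regular expression is an assumption for illustration and covers only basic "operator plus dotted version" clauses, not the full PEP 440 grammar that the packaging tools enforce. `looks_like_specifier` is a hypothetical name.

```python
import re

# Simplified pattern: a comparison operator followed by a dotted numeric
# version, optionally ending in ".*" (not the complete PEP 440 grammar).
_CLAUSE = re.compile(r"^(===|==|~=|!=|<=|>=|<|>)\s*\d+(\.\d+)*(\.\*)?$")


def looks_like_specifier(spec):
    """Return True if every comma-separated clause matches the simplified pattern."""
    clauses = [clause.strip() for clause in spec.split(",")]
    return all(_CLAUSE.match(clause) is not None for clause in clauses)


if __name__ == "__main__":
    for candidate in (">=3.6:", ">=3.6"):
        print(repr(candidate), looks_like_specifier(candidate))
```

Under this simplified pattern the string with the trailing colon is rejected and the corrected string is accepted, mirroring the fix described above.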

No differential kmers

Hi,

I am trying to use SubPhaser to phase the subgenomes of my species. However, after filtering, no differential kmers remained. I used the parameters -k 15 -q 50 -f 2. I get the same result even if I keep reducing -k and -q.

23-12-27 17:32:21 [INFO] 125035 kmers in total
23-12-27 17:32:21 [INFO] Filtering differential kmers
23-12-27 17:32:22 [INFO] Start Pool with 112 process(es)
23-12-27 17:32:25 [INFO] After filtering, remained 0 (0.00%) differential (freq >= 25) and 0 (0.00%) candidate (freq > 0) kmers

Do you have any suggestions?

thanks,
Chen
