satta commented on July 30, 2024

Just to debug: Have you tried calling the tool by using its full path? I.e.

/home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua

instead of however you called it before (so no relative paths)?
Also, what GenomeTools (gt) version are you using? Can you try calling the tool as:

gt /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua

and see what happens?

from companion.

xinliu005 commented on July 30, 2024

Thanks for your prompt reply.
xin@compare-vm-1:/analysis/xin/parasite/bin$ gt /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua
tool '/home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua' not found; option -help lists possible tools
xin@compare-vm-1:/analysis/xin/parasite/bin$ ls -l /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua
-rwxrwxr-x 1 xin xin 14960 Jun 4 17:17 /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua

gt -version
gt (GenomeTools) 0.6.5 (2018-06-04 10:46:00)
Copyright (c) 2003-2007 Gordon Gremme [email protected]
Copyright (c) 2003-2007 Center for Bioinformatics, University of Hamburg
See LICENSE file or http://genometools.org/license.html for license details.

Used compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
Compile flags: -Wall -Os -I/homes/xin/Downloads/genometools-0.6.5/src -I/homes/xin/Downloads/genometools-0.6.5/obj

satta commented on July 30, 2024

Thanks! First of all, please install a newer GenomeTools version; the one you seem to be using is quite old and does not support some of the features the Lua script needs to run. GenomeTools is currently at version 1.5.10 (http://genometools.org/pub/).
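
A quick way to confirm the installed version before rerunning the script is to compare it against a minimum. The sketch below is only an illustration: the helpers `version_ge` and `gt_version_from_banner` are hypothetical, the comparison relies on GNU `sort -V`, and the 1.5.10 threshold is simply the release named above, not a documented minimum requirement.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The first line of `gt -version` looks like:
#   gt (GenomeTools) 0.6.5 (2018-06-04 10:46:00)
# so the third whitespace-separated field is the version number.
gt_version_from_banner() {
    printf '%s\n' "$1" | awk 'NR == 1 { print $3 }'
}

required="1.5.10"   # assumption: the release named in this thread
if command -v gt >/dev/null 2>&1; then
    installed=$(gt_version_from_banner "$(gt -version)")
    if version_ge "$installed" "$required"; then
        echo "GenomeTools $installed looks recent enough"
    else
        echo "GenomeTools $installed is older than $required"
    fi
else
    echo "gt not found on PATH"
fi
```

If several GenomeTools builds are installed, `command -v gt` shows which one the shell actually picks up.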

xinliu005 commented on July 30, 2024

Many thanks for your support!

xinliu005 commented on July 30, 2024

@satta
I am considering downloading the reference genomes from ftp://ftp.sanger.ac.uk/pub/project/pathogens/companion/CryptoDB.org/. There are references.json and references-in.json files, but no config file.
Can they be used directly with the command line "nextflow run sanger-pathogens/companion", or does update_references.lua still need to be run to import the reference data?

satta commented on July 30, 2024

update_references.lua is only needed to build new references from basic sequence+annotation files. The pre-compiled files should be usable directly by downloading them and using their location in the Companion workflow's config file as the value of the ref_dir variable. For example:

ref_dir = "/path/to/my/companion/CryptoDB.org"
ref_species = "Cryptosporidium_parvum_Iowa_II"

etc.
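
Before launching the workflow, it may help to check that the downloaded directory looks as expected. The sketch below is hypothetical: `check_ref_dir` is not part of Companion, and it assumes that references.json sits at the top of ref_dir and mentions the species name as a quoted string. The demo JSON is invented purely so the sketch runs anywhere.

```shell
#!/bin/sh
# Hypothetical sanity check; the layout (references.json at the top of
# ref_dir, species name quoted inside it) is an assumption.
check_ref_dir() {
    ref_dir=$1
    ref_species=$2
    if [ ! -f "$ref_dir/references.json" ]; then
        echo "missing $ref_dir/references.json" >&2
        return 1
    fi
    if ! grep -q "\"$ref_species\"" "$ref_dir/references.json"; then
        echo "$ref_species not listed in references.json" >&2
        return 1
    fi
    echo "ok: $ref_species found in $ref_dir"
}

# Demo on a throwaway directory with made-up JSON content.
demo_dir=$(mktemp -d)
printf '{ "species": { "Cryptosporidium_parvum_Iowa_II": {} } }\n' \
    > "$demo_dir/references.json"
check_ref_dir "$demo_dir" "Cryptosporidium_parvum_Iowa_II"
```

For a real run, the two arguments would be the same values given as ref_dir and ref_species in the config.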

xinliu005 commented on July 30, 2024

There is WEIGHT_FILE = "" in the config file. Is this field required? If yes, how can I get the weight file for a species such as Cryptosporidium_parvum_Iowa_II?

satta commented on July 30, 2024

If you don't have a kinetoplastid genome (or anything else with weird polycistronic stuff) just use the plasmodium weight file.

xinliu005 commented on July 30, 2024

@satta
Got the following error while running companion:

xin@compare-vm-1:/analysis/xin/parasite/bin$ ./nextflow run sanger-pathogens/companion -profile /analysis/xin/parasite/data/companion/CryptoDB.org/Cryptosporidium_parvum_Iowa_II.config.txt
N E X T F L O W  ~  version 0.29.1
Launching `sanger-pathogens/companion` [clever_dalembert] - revision: db91c7dc11 [master]
Unknown configuration profile: '/analysis/xin/parasite/data/companion/CryptoDB.org/Cryptosporidium_parvum_Iowa_II.config.txt'

Is there a problem with the config file? /analysis/xin/parasite/data/companion/CryptoDB.org/Cryptosporidium_parvum_Iowa_II.config.txt is attached:
Cryptosporidium_parvum_Iowa_II.config.txt

satta commented on July 30, 2024

Without looking at the file, shouldn't it be -c and not -profile?
Example:

nextflow run -c Cryptosporidium_parvum_Iowa_II.config.txt sanger-pathogens/companion -profile docker

xinliu005 commented on July 30, 2024

Thanks! But I got the following error:
.command.stub: line 45: ps: command not found
ZOE ERROR (from /usr/lib/snap/snap): error opening file (/usr/share/snap/Zoe/HMM/snap.hmm)

  1. ps is available on my server:
    xin@compare-vm-1:~/.nextflow/assets/sanger-pathogens/companion$ which ps
    /bin/ps
  2. There is no /usr/share/snap/ on my server, and I am not root, so even if I install snap, I cannot put it into /usr/share/

The full output follows:
nextflow run sanger-pathogens/companion -c /analysis/xin/parasite/data/Cryptosporidium_parvum_Iowa_II.config.txt -profile docker
N E X T F L O W ~ version 0.30.0
Launching sanger-pathogens/companion [loving_torricelli] - revision: db91c7d [master]

C O M P A N I O N ~ version 1.0.2
query : /analysis/xin/parasite/working_dir/assembly/scaffolds.fasta
reference : Cryptosporidium_parvum_Iowa_II
reference directory : /analysis/xin/parasite/data/companion/CryptoDB.org
WARN: Access to undefined parameter dist_dir -- Initialise it to a default value eg. params.dist_dir = some_value

[warm up] executor > local
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel ncrna_cmindex
WARN: Access to undefined parameter TRANSCRIPT_FILE -- Initialise it to a default value eg. params.TRANSCRIPT_FILE = some_value
[f3/bf5099] Submitted process > press_ncRNA_cms
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel pseudochr_last_index
[84/96dc71] Submitted process > truncate_input_headers
[43/f15da4] Submitted process > exonerate_empty_hints
[8f/8cf2d2] Submitted process > ratt_make_ref_embl
[29/28e63b] Submitted process > transcript_empty_hints
[6c/084117] Submitted process > pseudogene_indexing
[3f/6b4fbf] Submitted process > make_ref_input_for_orthomcl
[df/a31d6e] Submitted process > sanitize_input
[11/47014d] Submitted process > merge_hints
[9c/5c455f] Submitted process > contiguate_pseudochromosomes
[d6/8c6913] Submitted process > predict_tRNA
[26/634d2a] Submitted process > make_distribution_seqs
[59/5e1a05] Submitted process > run_augustus_contigs
[f8/310cfd] Submitted process > run_snap
WARN: Access to undefined parameter print_paths -- Initialise it to a default value eg. params.print_paths = some_value
[33/a13d6b] Submitted process > run_ratt
ERROR ~ Error executing process > 'run_snap'

Caused by:
Process run_snap terminated with an error exit status (2)

Command executed:

echo '##gff-version 3' > snap.gff3
snap -gff -quiet snap.hmm pseudo.pseudochr.fasta > snap.tmp
snap_gff_to_gff3.lua snap.tmp > snap.tmp.2
if [ -s 1 ]; then
gt gff3 -sort -tidy -retainids snap.tmp.2 > snap.gff3;
fi

Command exit status:
2

Command output:
(empty)

Command error:
.command.stub: line 45: ps: command not found
ZOE ERROR (from /usr/lib/snap/snap): error opening file (/usr/share/snap/Zoe/HMM/snap.hmm)
ZOE library version 2013-02-16

Work dir:
/home/xin/.nextflow/assets/sanger-pathogens/companion/work/f8/310cfde2451caac375658810e06b5f

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (2)

xinliu005 commented on July 30, 2024

Of the following steps, which are required to annotate parasite genomes: run_exonerate, run_snap, run_ratt, do_contiguation, do_circos, do_pseudo, make_embl, use_reference, fix_polycistrons, truncate_input_headers?

satta commented on July 30, 2024

That depends on what you want as output and what features you want in your annotation:

  • run_exonerate: use this if you want to use reference protein alignments to aid in gene model building
  • run_snap: also run SNAP to find genes (I would disable this, it might also help with your earlier error)
  • run_ratt: use this if you want to try and map reference gene models directly on the new sequence
  • do_contiguation: use this if you want to order your input sequence to be like the reference chromosomes -- this will assign chromosome names as in the reference and obviously requires that the reference is good enough to have chromosomes
  • do_circos: generate Circos plots with reference vs. new genome matches, gene locations, etc. - see Companion paper or website for example
  • do_pseudo: try to detect pseudogenes
  • make_embl: create EMBL files as output (e.g. to speed up ENA submission)
  • use_reference: if this is false then only ab initio detection is done
  • fix_polycistrons: needed to make sure that polycistronic translation units are not broken by interspersed antisense genes (only important for kinetoplastids)
  • truncate_input_headers: fixes some issues with input data, shouldn't hurt keeping this on

Please keep in mind that some of the options only work when supported by the reference (e.g. do_contiguation) or by some of the tools used in the workflow (e.g. run_snap).
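
Putting the recommendations in this thread together, a config for a non-kinetoplastid genome might look roughly like the one written below. This is a sketch, not the config shipped with Companion: the `params { ... }` layout, the file name `my_companion.config`, and the chosen true/false values are assumptions. Only the option names come from the list above.

```shell
#!/bin/sh
# Write a hypothetical Nextflow config; check every name and value
# against the config file shipped with Companion before using it.
cat > my_companion.config <<'EOF'
params {
    run_exonerate          = false
    run_snap               = false   // suggested above to be disabled
    run_ratt               = true
    do_contiguation        = true    // needs a reference with chromosomes
    do_circos              = true
    do_pseudo              = true
    make_embl              = false
    use_reference          = true
    fix_polycistrons       = false   // only important for kinetoplastids
    truncate_input_headers = true    // "shouldn't hurt keeping this on"
}
EOF

# Show the switches that were written.
grep '=' my_companion.config
```

The file could then be passed with -c, for example: nextflow run -c my_companion.config sanger-pathogens/companion -profile docker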

satta commented on July 30, 2024

As for the ps part: Nextflow jobs are run in a Docker container, so you don't need to have all the dependent software for Companion installed. No idea why SNAP won't run but as I stated above, just disable it for now. There are no pre-generated gene models for Cryptosporidium in the container anyway, it's optimized for kinetoplastids at the moment. Using SNAP would probably need some extra development work and preparation of the correct models.

xinliu005 commented on July 30, 2024

@satta
Many thanks for your reply. Unfortunately, another error appeared:
Command error:
.command.stub: line 45: ps: command not found
warning: line 1 in file "stdin" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
gt gff3: error: line 1 in file "stdin" does not contain 9 tab (\t) separated fields

The full output follows:
N E X T F L O W ~ version 0.30.0
Launching sanger-pathogens/companion [furious_khorana] - revision: db91c7d [master]

C O M P A N I O N ~ version 1.0.2
query : /analysis/xin/parasite/working_dir/assembly/scaffolds.fasta
reference : Cryptosporidium_parvum_Iowa_II
reference directory : /analysis/xin/parasite/data/companion/CryptoDB.org
WARN: Access to undefined parameter dist_dir -- Initialise it to a default value eg. params.dist_dir = some_value

[warm up] executor > local
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel ncrna_cmindex
WARN: Access to undefined parameter TRANSCRIPT_FILE -- Initialise it to a default value eg. params.TRANSCRIPT_FILE = some_value
[2a/16e905] Submitted process > truncate_input_headers
[6a/b6c1d4] Submitted process > press_ncRNA_cms
WARN: The into operator should be used to connect two or more target channels -- consider to replace it with .set { integrated_gff3_processed }
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel pseudochr_last_index
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel core_comp_circos_chr
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel core_comp_circos_bin
[72/d923fb] Submitted process > exonerate_empty_hints
[b2/c73b4e] Submitted process > ratt_make_ref_embl
[d2/cc41d4] Submitted process > transcript_empty_hints
[18/0c296c] Submitted process > make_empty_snap
[fb/ecaf05] Submitted process > pseudogene_indexing
[da/bbc4fe] Submitted process > make_ref_input_for_orthomcl
[b3/b03478] Submitted process > make_empty_circos_clusters
[d6/97a69b] Submitted process > sanitize_input
[9b/3998bf] Submitted process > merge_hints
[5a/f383f6] Submitted process > contiguate_pseudochromosomes
[6b/2acfa2] Submitted process > predict_tRNA
[f9/673055] Submitted process > make_distribution_seqs
[42/f3900c] Submitted process > run_augustus_contigs
WARN: Access to undefined parameter print_paths -- Initialise it to a default value eg. params.print_paths = some_value
[1a/c7a5d1] Submitted process > run_ratt
[e7/1cb383] Submitted process > pseudogene_last (1)
[db/23430c] Submitted process > predict_ncRNA (1)
[f9/b71129] Submitted process > run_augustus_pseudo
[7a/7c4e7f] Submitted process > pseudogene_last (2)
[41/630edd] Submitted process > pseudogene_last (3)
[79/90216d] Submitted process > predict_ncRNA (2)
[6b/6e8ed1] Submitted process > blast_for_circos
[3f/1f0634] Submitted process > ratt_to_gff3
[06/6b3916] Submitted process > merge_genemodels
[07/69c9f8] Submitted process > integrate_genemodels
[32/219bc3] Submitted process > remove_exons
[a3/ff1c95] Submitted process > pseudogene_calling (1)
[d1/4c488b] Submitted process > merge_ncrnas (1)
[25/1057d1] Submitted process > merge_structural (1)
[57/68f989] Submitted process > add_gap_features (1)
[b0/ffb5db] Submitted process > split_splice_models_at_gaps (1)
[14/7891f7] Submitted process > add_polypeptides (1)
ERROR ~ Error executing process > 'add_polypeptides (1)'
Caused by:
Process add_polypeptides (1) terminated with an error exit status (1)

Command executed:

create_polypeptides.lua input.gff3 "CM0004(%w+)" | gt gff3 -sort -retainids -tidy > output.gff3

Command exit status:
1

Command output:
(empty)

Command error:
.command.stub: line 45: ps: command not found
warning: line 1 in file "stdin" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
gt gff3: error: line 1 in file "stdin" does not contain 9 tab (\t) separated fields

Work dir:
/home/xin/work/14/7891f764130fb085fb9af9c0251d8f

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details

satta commented on July 30, 2024

Looks like it got much further this time :)
Difficult to debug without access to the data though... can you provide the contents of your work directory so I can understand what is wrong with the GFF3? If you want, you can give me only /home/xin/work/14/7891f764130fb085fb9af9c0251d8f, but please make sure all files symlinked from outside this directory are included.
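
In the meantime, one way to narrow down an error like "does not contain 9 tab (\t) separated fields" is to look for the first line in the stream that is neither a comment nor a 9-column tab-separated record. The sketch below is a hypothetical helper (`first_bad_gff3_line` is not part of Companion or GenomeTools) that uses only `awk`, and it does not handle an embedded ##FASTA section.

```shell
#!/bin/sh
# Print the first non-comment, non-empty line that does not have exactly
# 9 tab-separated fields; exit status is 1 if such a line exists.
first_bad_gff3_line() {
    awk -F '\t' '
        /^#/ || /^$/ { next }              # skip directives, comments, blanks
        NF != 9 {
            printf "line %d has %d field(s): %s\n", NR, NF, $0
            bad = 1
            exit
        }
        END { exit bad }
    ' "$@"
}

# Demo on two inline samples (tabs are produced by printf).
printf '##gff-version 3\nchr1\tsrc\tgene\t1\t9\t.\t+\t.\tID=g1\n' \
    | first_bad_gff3_line && echo "sample 1: ok"
printf 'some stray text\n' \
    | first_bad_gff3_line || echo "sample 2: malformed"
```

In the failing command the stream fed to gt gff3 is the output of create_polypeptides.lua, so piping that output into the helper from inside the process work directory would show what the script printed on its first line.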

xinliu005 commented on July 30, 2024

There are only 2 files under ~/work/14/7891f764130fb085fb9af9c0251d8f, and one of them is empty; the other is attached:
lrwxrwxrwx 1 xin xin 64 Jun 7 19:01 input.gff3 -> /home/xin/work/b0/ffb5db1c4a5a8d99ba967179826917/merged_out.gff3
-rw-r--r-- 1 xin xin 0 Jun 7 19:01 output.gff3

/home/xin/work/b0/ffb5db1c4a5a8d99ba967179826917/merged_out.gff3 attached:
merged_out.gff3.txt

satta commented on July 30, 2024

OK there are indeed annotations in there. I'll need to redo the part of Companion that failed for you, which will take some time as I am not working full time on Companion anymore.

xinliu005 commented on July 30, 2024

@satta
">>I'll need to redo the part of Companion that failed for you"
How is it going? We are trying to build a parasite analysis pipeline and hope to use Companion as the annotation tool. Thanks.

ybdong919 commented on July 30, 2024

lua5.3 ../../bin/update_references.lua
lua5.3: ../../bin/update_references.lua:20: attempt to index a nil value (global 'gt')
stack traceback:
../../bin/update_references.lua:20: in main chunk
[C]: in ?

ybdong919 commented on July 30, 2024

When I run "../../bin/update_references.lua", I get the following error:
"
gt: error: could not execute script ../../bin/update_references.lua:268: bad argument #1 to 'pairs' (table expected, got nil)
"
