MitoFinder: efficient automated large-scale extraction of mitogenomic data from high throughput sequencing data

MitoFinder version 1.4.2

Allio, R., Schomaker-Bastos, A., Romiguier, J., Prosdocimi, F., Nabholz, B., & Delsuc, F. (2020) Mol Ecol Resour. 20, 892-905. (publication link)

MitoFinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed sequencing reads.

MitoFinder is also designed to find and annotate mitochondrial sequences in existing genomic assemblies (generated from HiFi/PacBio/Nanopore/Illumina sequencing data, etc.).

MitoFinder is distributed under the license.

Requirements

This software is suitable for all Linux-like systems (unfortunately not Windows < v.10) that have either automake, autoconf, and gcc installed, or Singularity version >= 3.0 installed. The pipeline is mainly written in Python 2.7.

Table of contents

  1. Installation guide for MitoFinder
  2. How to use MitoFinder
  3. Detailed options
  4. INPUT FILES
  5. OUTPUT FILES
  6. Particular cases
  7. UCE annotation
  8. How to cite MitoFinder
  9. How to get reference mitochondrial genomes from ncbi
  10. How to submit your annotated mitochondrial genome(s) to GenBank NCBI

Installation guide for MitoFinder

Run MitoFinder with Singularity

In many cases, using a Singularity container is easier than installing MitoFinder from source. In particular, if you are working on a server without administrator rights, or if your default version of Python is 3.x, I recommend using the Singularity container to run MitoFinder.

How to install Singularity version >= 3.0

Once you have Singularity version >= 3.0 installed, you can pull MitoFinder's container directly with the following command:

singularity pull --arch amd64 library://remiallio/default/mitofinder:v1.4.2 

and then run it as follows:

singularity run mitofinder_v1.4.2.sif -h  

or:

mitofinder_v1.4.2.sif -h  

Then, you can get MitoFinder test cases (optional)
See MitoFinder_container github page. To clone this reduced repository:

git clone https://github.com/RemiAllio/MitoFinder_container.git

Add MitoFinder's container to your path

cd PATH/TO/MITOFINDER_IMAGE/
p=$(pwd)
echo -e "\n#Path to MitoFinder image \nexport PATH=\$PATH:$p" >> ~/.bashrc 
source ~/.bashrc  

WARNING: If you previously installed MitoFinder on your system and want to install a new version, you should replace the old MitoFinder PATH with the updated one in your ~/.bashrc file. To do so, edit your ~/.bashrc file, remove the lines that add MitoFinder to the PATH, and close your terminal. Then open a new terminal and re-run the commands above.
TIP: If you are connected to a cluster, you can use either nano or vi to edit the ~/.bashrc file.

To check if the right version of MitoFinder is actually in your PATH:

mitofinder_v1.4.2.sif -v

Get and install MitoFinder (Linux)

WARNING: MitoFinder is written in Python 2.7. I recommend using the Singularity image if your default Python version is 3.x; otherwise, you may encounter issues with the Python 2.7 dependencies.

Before starting, note that installing MitoFinder requires automake and autoconf. These two tools are often already installed on Linux systems; if they are not, here is the command to install them:

sudo apt-get install automake autoconf  

If you want to use MiTFi for the tRNA annotation step (recommended), java needs to be installed. Here is the command to install it:

sudo apt install default-jre

(optional) If you want to generate images (annotated contigs), you will need the package pillow:

pip2.7 install pillow==6.2.2

Clone MitoFinder from GitHub

git clone https://github.com/RemiAllio/MitoFinder.git
cd MitoFinder
./install.sh

PATH/TO/MITOFINDER/mitofinder -h  

or download master.zip

wget https://github.com/RemiAllio/MitoFinder/archive/master.zip
unzip master.zip
mv MitoFinder-master MitoFinder
cd MitoFinder
./install.sh

PATH/TO/MITOFINDER/mitofinder -h  

Add MitoFinder to your path

cd PATH/TO/MITOFINDER/
p=$(pwd)
echo -e "\n#Path to MitoFinder \nexport PATH=\$PATH:$p" >> ~/.bashrc 
source ~/.bashrc  

WARNING: If you previously installed MitoFinder on your system and want to install a new version, you should replace the old MitoFinder PATH with the updated one in your ~/.bashrc file. To do so, edit your ~/.bashrc file, remove the lines that add MitoFinder to the PATH, and close your terminal. Then open a new terminal and re-run the commands above.
TIP: If you are connected to a cluster, you can use either nano or vi to edit the ~/.bashrc file.

To check if the right version of MitoFinder is actually in your PATH:

mitofinder -v

Get MitoFinder and install dependencies (Mac OS)

To install MitoFinder on Mac OS, automake and autoconf must be installed. If necessary, the easiest way to install them is to use Homebrew as follows:

brew install autoconf automake

Get MitoFinder

Clone MitoFinder from GitHub

git clone https://github.com/RemiAllio/MitoFinder.git
cd MitoFinder
./install.sh

or download master.zip

wget https://github.com/RemiAllio/MitoFinder/archive/master.zip
unzip master.zip
mv MitoFinder-master MitoFinder
cd MitoFinder
./install.sh

Install dependencies

Once installed, you need to indicate the paths to the directories containing the executables in the Mitofinder.config file.
TIPS:
(1) If the executable is in your PATH, to find it you can use which. For example, which megahit.
(2) If not, you can go to the directory containing the executable and use pwd to get the PATH. Then, you can copy the PATH in the Mitofinder.config file.
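
The first tip can also be scripted. The sketch below is only an illustration and is not part of MitoFinder: it uses Python 3's standard `shutil.which` to look up an executable in your PATH and prints the directory you would then copy into Mitofinder.config. The helper name `find_tool_dir` is hypothetical, and the tool names are simply the ones mentioned in this README.

```python
import os
import shutil


def find_tool_dir(tool, search_path=None):
    """Return the directory containing `tool`, ending with a path separator,
    or None if the tool is not found in the search path (default: PATH)."""
    found = shutil.which(tool, path=search_path)
    if found is None:
        return None
    # os.path.join(directory, "") appends the trailing separator.
    return os.path.join(os.path.dirname(os.path.abspath(found)), "")


# Report where each dependency lives, or that it is missing from PATH.
for tool in ("megahit", "metaspades.py", "makeblastdb", "blastn", "blastx"):
    print(tool, "->", find_tool_dir(tool) or "not found in PATH")
```

A tool reported as "not found in PATH" falls under the second tip: go to its directory and use pwd.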

BLAST

Because MitoFinder uses makeblastdb, blastn, and blastx, you need to download the associated BLAST+ binaries. TIP: wget can be installed with Homebrew using brew install wget

wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.12.0+-x64-macosx.tar.gz  
tar -xvf ncbi-blast-*-x64-macosx.tar.gz 
cd ncbi-blast-*+/bin/

Once downloaded, you need to indicate the PATH to the directory containing the binaries makeblastdb, blastn, etc. in the Mitofinder.config file.

Assemblers

To get MitoFinder to work, you need to install at least one of the following assemblers.

MEGAHIT: installation with conda (miniconda2 or anaconda2):

conda install -c bioconda megahit

Once installed, you need to indicate the PATH to the directory containing the executable megahit in the Mitofinder.config file.
If you install megahit with conda, the executable will be in the miniconda/anaconda bin directory.
For example: /Users/remiallio/bin/miniconda2/bin/

MetaSPAdes: we recommend downloading the pre-compiled binaries:

curl http://cab.spbu.ru/files/release3.14.0/SPAdes-3.14.0-Darwin.tar.gz -o SPAdes-3.14.0-Darwin.tar.gz
tar -zxf SPAdes-3.14.0-Darwin.tar.gz
cd SPAdes-3.14.0-Darwin/bin/

Once downloaded, you need to indicate the PATH to the directory containing the executable metaspades.py in the Mitofinder.config file.

To our knowledge, IDBA-UD is not supported for Mac OS at the moment.

tRNA annotation

  • MiTFi and tRNAscan-SE should be installed by MitoFinder when you run install.sh.

However, MiTFi requires Java. If Java is not installed, you can download the .dmg installer to install it.

The arwen source code is available in the arwen directory of MitoFinder. However, it is compiled for Linux. So, to make it executable you need to compile it on your own Mac OS system using gcc.

cd PATH/TO/MITOFINDER/arwen/
gcc arwen1.2.3.c
mv a.out arwen

Once it is compiled, you can test it by running:

./arwen -h

How to use MitoFinder

Assembler

First, you can choose the assembler using one of the following options:
--megahit (default: faster)
--metaspades (recommended: a bit slower but more efficient, see the associated paper. WARNING: not compatible with single-end reads)
--idba

tRNA annotation step

Second, you can choose the tool for the tRNA annotation step of MitoFinder using the -t option:
-t mitfi (default, MiTFi: slower but really efficient)
-t arwen (ARWEN: faster)
-t trnascan (tRNAscan-SE)

Mitochondrial genome assembly and annotation

TIP: use mitofinder --example to print basic usage examples.

Trimmed paired-end reads

mitofinder -j [seqid] -1 [left_reads.fastq.gz] -2 [right_reads.fastq.gz] -r [genbank_reference.gb] -o [genetic_code] -p [threads] -m [memory]   

WARNING: If you are using capture data (e.g. UCE), consider specifying the --min-contig-size parameter. The default value is 1,000 bp which may be too high for mitochondrial contigs assembled from off-target reads. The same applies for the parameter --blast-size (default: 30%).
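
For capture data, the command above can be extended with the two thresholds mentioned in the warning. The sketch below builds such a command line in Python 3 without running it. The function name, file names, and threshold values (500 bp and 20%) are hypothetical placeholders, not recommendations from the MitoFinder authors; only the option names are taken from this README.

```python
import shlex


def build_mitofinder_command(seqid, left_reads, right_reads, reference,
                             genetic_code, threads, memory,
                             min_contig_size=None, blast_size=None):
    """Assemble a paired-end MitoFinder command line as a list of arguments."""
    cmd = [
        "mitofinder",
        "-j", seqid,
        "-1", left_reads,
        "-2", right_reads,
        "-r", reference,
        "-o", str(genetic_code),
        "-p", str(threads),
        "-m", str(memory),
    ]
    # Add the optional thresholds only when the caller overrides the defaults.
    if min_contig_size is not None:
        cmd += ["--min-contig-size", str(min_contig_size)]
    if blast_size is not None:
        cmd += ["--blast-size", str(blast_size)]
    return cmd


# Hypothetical values for a UCE capture dataset; tune them to your data.
cmd = build_mitofinder_command(
    "sample1", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz", "reference.gb",
    genetic_code=5, threads=4, memory=10,
    min_contig_size=500, blast_size=20,
)
print(shlex.join(cmd))
```

Once mitofinder is in your PATH, the same list can be passed to `subprocess.run(cmd, check=True)`.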

Trimmed single-end reads

mitofinder -j [seqid] -s [SE_reads.fastq.gz] -r [genbank_reference.gb] -o [genetic_code] -p [threads] -m [memory]

WARNING: If you are using capture data (e.g. UCE), consider specifying the --min-contig-size parameter. The default value is 1,000 bp which may be too high for mitochondrial contigs assembled from off-target reads. The same applies for the parameter --blast-size (default: 30%).

Find and/or annotate a mitochondrial genome

MitoFinder can also be run directly on a previously computed assembly (one or several contigs in FASTA format):

mitofinder -j [seqid] -a [assembly.fasta] -r [genbank_reference.gb] -o [genetic_code] -p [threads] -m [memory]

WARNING: If you are using capture data (e.g. UCE), consider specifying the --min-contig-size parameter. The default value is 1,000 bp which may be too high for mitochondrial contigs assembled from off-target reads. The same applies for the parameter --blast-size (default: 30%).

Restart

Use the same command line.
WARNING: If you want to compute the assembly again (for example, because it failed), you have to remove the assembly results directory or use the --override option. If not, MitoFinder will skip the assembly step.

Test cases

If you want to run a test with assembly and annotation steps:

cd PATH/TO/MITOFINDER/test_case/
../mitofinder -j Aphaenogaster_megommata_SRR1303315 -1 Aphaenogaster_megommata_SRR1303315_R1_cleaned.fastq.gz -2 Aphaenogaster_megommata_SRR1303315_R2_cleaned.fastq.gz -r reference.gb -o 5 -p 5 -m 10   

If you want to run a test with only the annotation step:

cd PATH/TO/MITOFINDER/test_case/
../mitofinder -j Hospitalitermes_medioflavus_NCBI -a Hospitalitermes_medioflavus_NCBI.fasta -r Hospitalitermes_medioflavus_NCBI.gb -o 5

Detailed options

usage: mitofinder [-h] [--megahit] [--idba] [--metaspades] [-t TRNAANNOTATION]
                  [-j PROCESSNAME] [-1 PE1] [-2 PE2] [-s SE] [-c CONFIG]
                  [-a ASSEMBLY] [-m MEM] [-l SHORTESTCONTIG]
                  [-p PROCESSORSTOUSE] [-r REFSEQFILE] [-e BLASTEVAL]
                  [-n NWALK] [--override] [--adjust-direction] [--ignore]
                  [--new-genes] [--allow-intron] [--numt]
                  [--intron-size INTRONSIZE] [--max-contig MAXCONTIG]
                  [--cds-merge] [--out-gb] [--min-contig-size MINCONTIGSIZE]
                  [--max-contig-size MAXCONTIGSIZE] [--rename-contig RENAME]
                  [--blast-identity-nucl BLASTIDENTITYNUCL]
                  [--blast-identity-prot BLASTIDENTITYPROT]
                  [--blast-size ALIGNCUTOFF] [--circular-size CIRCULARSIZE]
                  [--circular-offset CIRCULAROFFSET] [-o ORGANISMTYPE] [-v]
                  [--example] [--citation]

Mitofinder is a pipeline to assemble and annotate mitochondrial DNA from
trimmed sequencing reads.

optional arguments:
  -h, --help            show this help message and exit
  --megahit             Use Megahit for assembly. (Default)
  --idba                Use IDBA-UD for assembly.
  --metaspades          Use MetaSPAdes for assembly.
  -t TRNAANNOTATION, --tRNA-annotation TRNAANNOTATION
                        "arwen"/"mitfi"/"trnascan" tRNA annotater to use.
                        Default = mitfi
  -j PROCESSNAME, --seqid PROCESSNAME
                        Sequence ID to be used throughout the process
  -1 PE1, --Paired-end1 PE1
                        File with forward paired-end reads
  -2 PE2, --Paired-end2 PE2
                        File with reverse paired-end reads
  -s SE, --Single-end SE
                        File with single-end reads
  -c CONFIG, --config CONFIG
                        Use this option to specify another Mitofinder.config
                        file.
  -a ASSEMBLY, --assembly ASSEMBLY
                        File with your own assembly
  -m MEM, --max-memory MEM
                        max memory to use in Go (MEGAHIT or MetaSPAdes)
  -l SHORTESTCONTIG, --length SHORTESTCONTIG
                        Shortest contig length to be used (MEGAHIT). Default =
                        100
  -p PROCESSORSTOUSE, --processors PROCESSORSTOUSE
                        Number of threads Mitofinder will use at most.
  -r REFSEQFILE, --refseq REFSEQFILE
                        Reference mitochondrial genome in GenBank format
                        (.gb).
  -e BLASTEVAL, --blast-eval BLASTEVAL
                        e-value of blast program used for contig
                        identification and annotation. Default = 0.00001
  -n NWALK, --nwalk NWALK
                        Maximum number of codon steps to be tested on each
                        size of the gene to find the start and stop codon
                        during the annotation step. Default = 5 (30 bases)
  --override            This option forces MitoFinder to override the previous
                        output directory for the selected assembler.
  --adjust-direction    This option tells MitoFinder to adjust the direction
                        of selected contig(s) (given the reference).
  --ignore              This option tells MitoFinder to ignore the non-
                        standart mitochondrial genes.
  --new-genes           This option tells MitoFinder to try to annotate the
                        non-standard animal mitochondrial genes (e.g. rps3 in
                        fungi). If several references are used, make sure the
                        non-standard genes have the same names in the several
                        references
  --allow-intron        This option tells MitoFinder to search for genes with
                        introns. Recommendation : Use it on mitochondrial
                        contigs previously found with MitoFinder without this
                        option.
  --numt                This option tells MitoFinder to search for both
                        mitochondrial genes and NUMTs. Recommendation : Use it
                        on nuclear contigs previously found with MitoFinder
                        without this option.
  --intron-size INTRONSIZE
                        Size of intron allowed. Default = 5000 bp
  --max-contig MAXCONTIG
                        Maximum number of contigs matching to the reference to
                        keep. Default = 0 (unlimited)
  --cds-merge           This option tells MitoFinder to not merge the exons in
                        the NT and AA fasta files.
  --out-gb              Do not create annotation output file in GenBank
                        format.
  --min-contig-size MINCONTIGSIZE
                        Minimum size of a contig to be considered. Default =
                        1000
  --max-contig-size MAXCONTIGSIZE
                        Maximum size of a contig to be considered. Default =
                        25000
  --rename-contig RENAME
                        "yes/no" If "yes", the contigs matching the
                        reference(s) are renamed. Default is "yes" for de novo
                        assembly and "no" for existing assembly (-a option)
  --blast-identity-nucl BLASTIDENTITYNUCL
                        Nucleotide identity percentage for a hit to be
                        retained. Default = 50
  --blast-identity-prot BLASTIDENTITYPROT
                        Amino acid identity percentage for a hit to be
                        retained. Default = 40
  --blast-size ALIGNCUTOFF
                        Percentage of overlap in blast best hit to be
                        retained. Default = 30
  --circular-size CIRCULARSIZE
                        Size to consider when checking for circularization.
                        Default = 45
  --circular-offset CIRCULAROFFSET
                        Offset from start and finish to consider when looking
                        for circularization. Default = 200
  -o ORGANISMTYPE, --organism ORGANISMTYPE
                        Organism genetic code following NCBI table (integer):
                        1. The Standard Code 2. The Vertebrate Mitochondrial
                        Code 3. The Yeast Mitochondrial Code 4. The Mold,
                        Protozoan, and Coelenterate Mitochondrial Code and the
                        Mycoplasma/Spiroplasma Code 5. The Invertebrate
                        Mitochondrial Code 6. The Ciliate, Dasycladacean and
                        Hexamita Nuclear Code 9. The Echinoderm and Flatworm
                        Mitochondrial Code 10. The Euplotid Nuclear Code 11.
                        The Bacterial, Archaeal and Plant Plastid Code 12. The
                        Alternative Yeast Nuclear Code 13. The Ascidian
                        Mitochondrial Code 14. The Alternative Flatworm
                        Mitochondrial Code 16. Chlorophycean Mitochondrial
                        Code 21. Trematode Mitochondrial Code 22. Scenedesmus
                        obliquus Mitochondrial Code 23. Thraustochytrium
                        Mitochondrial Code 24. Pterobranchia Mitochondrial
                        Code 25. Candidate Division SR1 and Gracilibacteria
                        Code
  -v, --version         Version 1.4.2
  --example             Print getting started examples
  --citation            How to cite MitoFinder

INPUT FILES

MitoFinder needs several files to run, depending on the method you have chosen (see above):

  • Reference_file.gb containing at least one reference mitochondrial genome extracted from NCBI
  • left_reads.fastq.gz containing the left reads of paired-end sequencing
  • right_reads.fastq.gz containing the right reads of paired-end sequencing
  • SE_reads.fastq.gz containing the reads of single-end sequencing
  • assembly.fasta containing the assembly in which MitoFinder has to find and annotate mitochondrial contig(s)

OUTPUT FILES

Results' folder

MitoFinder returns several files for each mitochondrial contig found:

  • [Seq_ID]_final_genes_NT.fasta containing the nucleotide sequences of the final genes selected from all contigs found by MitoFinder
  • [Seq_ID]_final_genes_AA.fasta containing the amino acid sequences of the final genes selected from all contigs found by MitoFinder
  • [Seq_ID]_mtDNA_contig.fasta containing a mitochondrial contig
  • [Seq_ID]_mtDNA_contig.gff containing the final annotation for a given contig (GFF3 format)
  • [Seq_ID]_mtDNA_contig.tbl containing the final annotation for a given contig (GenBank submission format)
  • [Seq_ID]_mtDNA_contig.gb containing the final annotation for a given contig (GenBank format for visualization)
  • [Seq_ID]_mtDNA_contig_genes_NT.fasta containing the nucleotide sequences of annotated genes for a given contig
  • [Seq_ID]_mtDNA_contig_genes_AA.fasta containing the amino acid sequences of annotated genes for a given contig
  • [Seq_ID]_mtDNA_contig.png schematic representation of the annotation of the mtDNA contig
  • [Seq_ID]_mtDNA_contig.infos containing the initial contig name, the length of the contig and the GC content
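
To sanity-check a contig file against its .infos summary, the length and GC content can be recomputed independently. The Python 3 sketch below is one simple way to do this. How MitoFinder itself computes GC content (for example, how it treats ambiguous bases such as N) is not stated here, so this version assumes GC is counted over every character of the sequence; `contig_stats` is a hypothetical helper name.

```python
import io


def contig_stats(fasta_handle):
    """Yield (name, length, gc_percent) for each record in a FASTA stream."""

    def summarize(name, chunks):
        seq = "".join(chunks).upper()
        gc = seq.count("G") + seq.count("C")
        percent = 100.0 * gc / len(seq) if seq else 0.0
        return name, len(seq), percent

    name, chunks = None, []
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield summarize(name, chunks)
            name, chunks = line[1:].strip(), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield summarize(name, chunks)


# Tiny in-memory example; real contigs are read from a file instead.
example = io.StringIO(">contig_1\nATGCGC\nAT\n>contig_2\nGGCC\n")
for name, length, gc in contig_stats(example):
    print(name, length, round(gc, 1))
```

For a real run, replace the in-memory example with `open(...)` on the [Seq_ID]_mtDNA_contig.fasta file of interest.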

Particular cases

/!\ Close reference required /!\

For the particular cases below, we recommend using MitoFinder in two different steps. First, you can use it to assemble and/or identify mitochondrial-like contigs, then use it in a second step to annotate these particular contigs (option -a) with the corresponding additional options.
Also, these options are recommended for cases in which a (really) close reference is available.

Annotation of mitochondrial genes containing intron(s)

/!\ Close reference required /!\

In some taxa (e.g. fungi), mitochondrial genes can contain intron(s). For these cases, we added the --allow-intron option (to be combined with --intron-size and --cds-merge). However, it is important to note that, although the search for start and stop codons works with this option, there is no search for intron boundaries: the exon annotation is based only on similarity with the reference. That is why a close reference is necessary, and even with a good reference we recommend double-checking the exon annotation.

Annotation of NUMTs

/!\ Close reference required /!\

Once you have identified nuclear contigs that may contain NUMTs, you can use MitoFinder to find the NUMTs using the --numt option. Basically, this option allows MitoFinder to find the same gene several times in a contig. Given that NUMTs can be full of stop codons, we recommend limiting the number of walks that MitoFinder can do when looking for start and stop codons (--nwalk 0) to improve the annotation.

UCE annotation

MitoFinder starts by assembling both mitochondrial and nuclear reads using de novo metagenomic assemblers. It is only in a second step that mitochondrial contigs are identified and extracted. MitoFinder thus provides UCE contigs that are already assembled and the annotation can be done from the following file:

  • [Seq_ID]_link_[assembler].scafSeq containing all assembled contigs from raw reads.

To do so, we recommend the use of the PHYLUCE pipeline which is specifically designed to annotate ultraconserved elements (Faircloth 2015; Tutorial: https://phyluce.readthedocs.io/en/latest/tutorial-one.html#finding-uce-loci).
You can thus use the file [Seq_ID]_link_[assembler].scafSeq and start the Phyluce pipeline at the "Finding UCE" step.

How to cite MitoFinder

If you use MitoFinder, please cite:

  • Allio, R, Schomaker‐Bastos, A, Romiguier, J, Prosdocimi, F, Nabholz, B, Delsuc, F. (2020). MitoFinder: Efficient automated large‐scale extraction of mitogenomic data in target enrichment phylogenomics. Mol Ecol Resour. 20, 892-905. https://doi.org/10.1111/1755-0998.13160

Please also cite the following references depending on the option chosen for the assembly step in MitoFinder:

  • Li, D., Luo, R., Liu, C. M., Leung, C. M., Ting, H. F., Sadakane, K., Yamashita, H. & Lam, T. W. (2016). MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods, 102(6), 3-11.
  • Nurk, S., Meleshko, D., Korobeynikov, A., & Pevzner, P. A. (2017). metaSPAdes: a new versatile metagenomic assembler. Genome research, 27(5), 824-834.
  • Peng, Y., Leung, H. C., Yiu, S. M., & Chin, F. Y. (2012). IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics, 28(11), 1420-1428.

For tRNAs annotation, depending on the option chosen:

  • MiTFi: Juhling, F., Putz, J., Bernt, M., Donath, A., Middendorf, M., Florentz, C., & Stadler, P. F. (2012). Improved systematic tRNA gene annotation allows new insights into the evolution of mitochondrial tRNA structures and into the mechanisms of mitochondrial genome rearrangements. Nucleic acids research, 40(7), 2833-2845.
  • Laslett, D., & Canback, B. (2008). ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics, 24(2), 172-175.
  • Chan, P. P., & Lowe, T. M. (2019). tRNAscan-SE: searching for tRNA genes in genomic sequences. In Gene Prediction (pp. 1-14). Humana, New York, NY.

For UCEs extraction:

  • Faircloth, B. C. (2016). PHYLUCE is a software package for the analysis of conserved genomic loci. Bioinformatics, 32(5), 786-788.

HOW TO GET REFERENCE MITOCHONDRIAL GENOMES FROM NCBI

Using the NCBI web interface

  1. Go to NCBI
  2. Select "Nucleotide" in the search bar
  3. Search for mitochondrion genomes:
     • RefSeq (if available)
     • Sequence length from 12000 to 20000
  4. Download complete record in GenBank format

Depending on how close your reference is to your species, you can adjust the following parameters: --nwalk; --blast-eval; --blast-identity-nucl; --blast-identity-prot; --blast-size

Using entrez-direct utilities

  1. Install entrez-direct utilities (instructions here, or with conda: conda install -c bioconda entrez-direct)
  2. Use the following in a shell to batch download all mitochondrial genomes associated with Carnivora:
taxa=Carnivora 
esearch -db nuccore -query "\"mitochondrion\"[All Fields] AND (\"${taxa}\"[Organism]) AND (refseq[filter] AND mitochondrion[filter] AND (\"12000\"[SLEN] : \"20000\"[SLEN]))" | efetch -format gbwithparts > reference.gb
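
To reuse the same search for other taxa or length ranges, the query string can be generated rather than edited by hand. The Python 3 sketch below only builds the string shown in the esearch command above. It does not contact NCBI, and `build_refseq_mito_query` is a hypothetical helper name.

```python
def build_refseq_mito_query(taxon, min_len=12000, max_len=20000):
    """Return the Entrez query used above, parameterized by taxon and length."""
    return (
        f'"mitochondrion"[All Fields] AND ("{taxon}"[Organism]) '
        f'AND (refseq[filter] AND mitochondrion[filter] '
        f'AND ("{min_len}"[SLEN] : "{max_len}"[SLEN]))'
    )


print(build_refseq_mito_query("Carnivora"))
```

The printed string can then be given to esearch with the -query option, as in the shell command above.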

How to submit your annotated mitochondrial genome(s) to NCBI GenBank

Submission with BankIt

If you have few mitochondrial genomes to submit, you should be able to do it with BankIt through the NCBI submission portal.

Submission with tbl2asn

If you want to submit several complete or partial mitogenomes, we designed MitoFinder to streamline the submission process using tbl2asn.
tbl2asn requires:

  • Template file containing a text ASN.1 Submit-block object (suffix .sbt). Create submission template.
  • Nucleotide sequence data containing the mitochondrial sequence(s) and associated information (suffix .fsa).
  • Feature Table containing annotation information for the mitochondrial sequence(s).
  • Comment file containing assembly and annotation method information (assembly.cmt). Create comment template

Creating a compatible FASTA file

Because tbl2asn requires the FASTA file to contain information associated with the data, we wrote a script to create a FASTA file containing the mitochondrial contig(s) found by MitoFinder for each species (Seq_ID) with the associated information. This script and the associated example files can be found in the MitoFinder directory named "NCBI_submission".

INPUT

  • index_file.csv A CSV file (comma-delimited table) containing the metadata information.

The headers of the index file are as follows: Directory path, Seq ID, organism, location, mgcode, SRA, keywords ...

The first two columns are mandatory and the names cannot be changed but you can complete the index file with the different source modifiers of NCBI by adding columns in the index file.

The directory path corresponds to the path where the [Seq_ID]_mtDNA_contig.fasta file, or the [Seq_ID]_mtDNA_contig_*.fasta files if you have several contigs for the same individual, can be found. If left blank, the script will search for the contig in the directory from which you run the script (./).

/PATH/TO/MITOFINDER/NCBI_submission/create_tbl2asn_files.py -i index_file.csv

TIPS:
(1) You can copy or link (symbolic links) all your FASTA and TBL contig files in the same directory and run the script from this directory.
(2) You can leave blanks in the index file if some species do not need a given source modifier.
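
As an illustration of the index file layout, the Python 3 sketch below writes a small comma-delimited table with the column names listed above. All paths, sequence IDs, and metadata values are hypothetical placeholders, and the exact header spellings expected by create_tbl2asn_files.py should be checked against the example files in the NCBI_submission directory.

```python
import csv
import io

# Column names as listed above; the first two are mandatory.
HEADERS = ["Directory path", "Seq ID", "organism", "location",
           "mgcode", "SRA", "keywords"]

# Hypothetical rows: paths, IDs, and metadata are placeholders.
rows = [
    {"Directory path": "/path/to/sample1_results", "Seq ID": "sample1",
     "organism": "Genus species", "location": "mitochondrion", "mgcode": "5"},
    # Blank directory path: the script then looks in the current directory.
    {"Directory path": "", "Seq ID": "sample2",
     "organism": "Genus species", "location": "mitochondrion", "mgcode": "5"},
]


def write_index(handle, rows):
    """Write the index table; missing optional fields are left blank."""
    writer = csv.DictWriter(handle, fieldnames=HEADERS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_index(buffer, rows)
print(buffer.getvalue())
```

To produce an actual file, call `write_index` on a handle opened with `open("index_file.csv", "w", newline="")`.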

OUTPUT

  • [Seq_ID].fsa new FASTA file containing all mtDNA contigs and the information for a given [Seq_ID]
  • [Seq_ID].tbl new TBL file containing all mtDNA contigs and the information for a given [Seq_ID]

Command line to run tbl2asn

Once your FASTA and TBL files have been created, you can run tbl2asn (download here: ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/tbl2asn/) as follows:

tbl2asn -t template.sbt -i [Seq_ID].fsa -V v -w assembly.cmt -a s

This command will create several files:

  • [Seq_ID].sqn Submission file (.sqn) to be sent by e-mail to [email protected]
  • [Seq_ID].val Containing ERROR and WARNING values associated with tbl2asn. (ERROR explanations here)

If you don't have any error and you are happy with the annotation, you can submit your mitochondrial contig(s) by sending the .sqn files to [email protected]

mitofinder's Issues

Continue stopped assembly job

Hi all,
thanks for the nice piece of software! I had an assembly running, which was stopped because I did not allocate enough memory. I used metaspades as assembler which has the option "--continue" to continue started jobs, and I was wondering if there is a way to tell Mitofinder to tell metaspades to use this option. Or generally how can I change parameters in the programs Mitofinder uses.

Cheers!

Python 3 support

It would be nice to run MitoFinder under Python 3. This will enable to mix this nice pipeline with others in one environment.

Metaspades has stuck when it is processing reversed reads

Hi, Remi,

I found some of my jobs have stuck in processing R2 reads when I do assembly using MitoFinder 1.4.

To investigate whether it was because of Spades assembler or Mitofinder itself, I tried Spades 1.13 using the same sample and the assembly worked well.

here I attached the metaspades.log file generated by both Mitofinder and Spades.

Spades_spades.log

MitoFinder_metaspades.log

In addition, I have some samples sequenced using different sequencing platforms, so I also want to know if it is
ok if I use four input files? I know it is possible for Spades, but how about Mitofinder?

git clone is now up to speed with git-lfs clone

Hi and thanks for MitoFinder!

Just wanted to let you know that I followed your instruction and got a warning message telling me that git clone now has comparable speed to git lfs clone (working with git 2.20.1, git lfs 1.5.2);
that should make the installation instruction for singularity a bit simpler.

cheers,

About the output file

Hi! When I used mitofinder, I meet something wrong. In my output folder, I didn't find files such as [Seq_ID]_final_genes_NT.fasta . I only found [checkpoints.txt done intermediate_contigs options.json wyj_test_megahit.contigs.fa wyj_test_megahit.log] in [Seq_id]_megahit folder. Is that right? How can I fix it ? Thanks

Installation Problem

Hello, sorry to disturb you, I install mitofinder following your README. but after I add the PATH, it still has such error:

peng@peng-System-Product-Name:~/mitofinder/MitoFinder$ mitofinder -v
File "/home/peng/mitofinder/MitoFinder/mitofinder", line 148
print "\n # For trimmed paired-end reads:\nmitofinder --megahit -j [seqid] \\n\t-1 [left_reads.fastq.gz] \\n\t-2 [right_reads.fastq.gz] \\n\t-r [genbank_reference.gb] \\n\t-o [genetic_code] \\n\t-p [threads] \\n\t-m [memory]\n\n # For trimmed single-end reads:\nmitofinder --megahit -j [seqid] \\n\t-s [SE_reads.fastq.gz] \\n\t-r [genbank_reference.gb] \\n\t-o [genetic_code] \\n\t-p [threads] \\n\t-m [memory]\n\n # For one assembly (one or more contig(s))\nmitofinder -j [seqid] \\n\t-a [assembly.fasta] \\n\t-r [genbank_reference.gb] \\n\t-o [genetic_code] \\n\t-p [threads] \\n\t-m [memory]\n"
^
SyntaxError: invalid syntax

It's the first installation.
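
One possible reading of this traceback: line 148 of the `mitofinder` script uses the Python 2 `print` statement, and the README states that the pipeline is mainly written in Python 2.7, so a `SyntaxError` on that line is what a Python 3 interpreter would report. The sketch below only illustrates that behaviour; `parses_under_this_python` is a hypothetical helper and not part of MitoFinder.

```python
import sys


def parses_under_this_python(source):
    """Return True if `source` compiles with the interpreter running this code."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


py2_style = 'print "hello"'    # Python 2 print statement, as on line 148
py3_style = 'print("hello")'   # accepted by both Python 2.7 and Python 3

print("Interpreter:", sys.version.split()[0])
print("print statement parses:", parses_under_this_python(py2_style))
print("print() call parses:   ", parses_under_this_python(py3_style))
```

If the first check reports `False`, the default `python` on the machine is Python 3. The README suggests using the Singularity container in that situation.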

Would it work with Wolbachia (or other symbionts) ?

Hi Remi,

I was wondering how specific MitoFinder is to mitochondrial genomes.
Do you think it would work on Wolbachia, for instance? To me it looks like it should, given a set of Wolbachia genes instead of a set of mitochondrial genes.
We are looking for integrated and "free-living" Wolbachia; the parallel with NUMTs / free-living mitochondria made me think we could use MitoFinder for this task.

Thanks !

ncbi-blast-2.12.0+/binmakeblastdb is not executable

Hello,

I am trying to run MitoFinder on macOS Big Sur and have installed the dependencies needed to run MitoFinder (at least MetaSPAdes, BLAST, and the tRNA annotation dependencies).

Currently I have MitoFinder as a directory by itself, and BLAST is outside of it. MitoFinder is in my path, as is blast. Every time I try to run MitoFinder, it says:

Now running MitoFinder ...

Start time : 2021-09-01 14:49:08

Job name = CS

Creating Output directory : /Users/name/PhyloPrograms/MitoTest/CS
All results will be written here

Program folders:
MEGAHIT = /Users/name/PhyloPrograms/MitoFinder/megahit/
MetaSPAdes folder = /Users/name/PhyloPrograms/SPAdes-3.14.0-Darwin/bin
IDBA-UD folder = /Users/name/PhyloPrograms/MitoFinder/idba/bin/
Blast folder = /Users/name/PhyloPrograms/ncbi-blast-2.12.0+/bin
ARWEN folder = /Users/name/PhyloPrograms/MitoFinder/arwen/
MiTFi folder = /Users/name/PhyloPrograms/MitoFinder/mitfi/
tRNAscan-SE folder = /Users/name/PhyloPrograms/MitoFinder/trnascanSE/tRNAscan-SE-2.0/

/Users/name/PhyloPrograms/ncbi-blast-2.12.0+/binmakeblastdb is not executable
Please check the installation and the path indicated above and restart MitoFinder.
Aborting 

I have tried removing the + sign from the ncbi-blast directory name, moving the directory around, and setting 'default' as the path in the MitoFinder.config file, but no matter what I do, it thinks that 'bin' is part of the word 'makeblastdb'.
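
The name `binmakeblastdb` would be consistent with the folder and the program name being glued together as plain strings: every other folder in the log ends with `/`, while the Blast folder does not. Below is a small sketch of that failure mode. It assumes (this is not confirmed from the MitoFinder source) that the path is built by simple concatenation.

```python
import os

# Folder exactly as printed in the log above: no trailing slash.
blast_folder = "/Users/name/PhyloPrograms/ncbi-blast-2.12.0+/bin"

# Plain concatenation silently drops the separator.
naive = blast_folder + "makeblastdb"

# os.path.join adds the separator only when it is missing.
joined = os.path.join(blast_folder, "makeblastdb")
joined_with_slash = os.path.join(blast_folder + "/", "makeblastdb")

print(naive)
print(joined)
print(joined_with_slash)
```

Under that assumption, checking whether the Blast folder entry in the config file ends with `/` would be a first thing to try.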

Has anyone ever encountered this before? Do you know what the issue might be? Thank you!

Best,
Justin

Warning: None of paired reads aligned properly

Hi,

I am working with formalin (degraded) samples for which I have 30-50 million reads of paired-end data. I know the degraded samples are a hassle to begin with and may not yield results, but I am trying to see if I can obtain any mitochondrial data from some sequence capture data. One of these worked well and I obtained two mitochondrial genes, but for another 7 samples I got the following errors:

======= SPAdes pipeline finished WITH WARNINGS!

=== Error correction and assembling warnings:
 * 0:51:23.455     8G / 9G    WARN    General                 (pair_info_count.cpp       : 341)   Unable to estimate insert size for paired library #0
 * 0:51:23.455     8G / 9G    WARN    General                 (pair_info_count.cpp       : 347)   None of paired reads aligned properly. Please, check orientation of your read pairs.
 * 0:51:23.458     8G / 9G    WARN    General                 (repeat_resolving.cpp      :  63)   Insert size was not estimated for any of the paired libraries, repeat resolution module will not run.
 * 1:25:43.121     8G / 8G    WARN    General                 (pair_info_count.cpp       : 175)   Single reads are not used in metagenomic mode
======= Warnings saved to /Volumes/BackupPlus/Mitolopsidae/Cryn_370/Cryn_370_metaspades/warnings.log

Is the orientation error possibly due to fragmented DNA? The first time I ran MitoFinder it worked fine, and I have my R1.fastq.gz file for the -1 flag and the R2.fastq.gz file for the -2 flag. I'm trying to figure out whether the 'unable to estimate insert size' issue and the other warnings are simply due to this being from a degraded sample. Do you have any suggestions?

Lastly, I assume I should subsample these from 30-50 million reads to ~10 million reads if I want to run them on a local desktop (8 cores, 32 GB RAM) rather than on the cluster, which is having issues? Thank you!
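
If subsampling turns out to be the route taken, the main constraint is that R1 and R2 must stay in sync. The following is a minimal sketch of reservoir sampling over read pairs, assuming both files list reads in the same order; `fastq_records` and `subsample_pairs` are hypothetical helper names, and the toy reads stand in for real data.

```python
import random


def fastq_records(lines):
    """Yield 4-line FASTQ records (header, sequence, '+', qualities) as tuples."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def subsample_pairs(r1_lines, r2_lines, n, seed=1):
    """Reservoir-sample n read pairs in one pass, keeping mates together."""
    rng = random.Random(seed)
    reservoir = []
    pairs = zip(fastq_records(r1_lines), fastq_records(r2_lines))
    for i, pair in enumerate(pairs):
        if i < n:
            reservoir.append(pair)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = pair
    return reservoir


# Toy input: 100 read pairs instead of 30-50 million.
toy_r1, toy_r2 = [], []
for i in range(100):
    toy_r1 += ["@read%d/1" % i, "ACGT", "+", "IIII"]
    toy_r2 += ["@read%d/2" % i, "TGCA", "+", "IIII"]

sample = subsample_pairs(toy_r1, toy_r2, n=10)
print(len(sample), "pairs kept")
```

For real files, the line iterables could come from `gzip.open(path, "rt")`. Memory use scales with the number of pairs kept, and the sampled pairs are not returned in their original order.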

My script for running this was the following:
mitofinder --metaspades -j Cryn_370 -1 Cryn_370/6112-JK-12_16_S1_L005_R1_001.fastq.gz -2 Cryn_370/6112-JK-12_16_S1_L005_R2_001.fastq.gz -r /Volumes/BackupPlus/HP_mtDNAref.gb -o 2 -p 4 -m 22

I've tried running this on the cluster, but I keep getting this error:
-bash: /home/jmb689/ncbi-blast-2.12.0+/bin/makeblastdb: cannot execute binary file
This happens even though I have the / at the end of /bin/ (I used the Linux installation; our cluster couldn't install git-lfs).

Best,
Justin

Problem when running MitoFinder

Traceback (most recent call last):
  File "/data/01/user164/workspace/try/MitoFinder/mitofinder", line 1205, in <module>
    rename = Popen(args1, stdout=open(os.devnull, 'wb'))
  File "/usr/lib64/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

The error is shown above, and I have also attached the log file:
Mf_MitoFinder.log

I don't know what I should do.
Thank you so much.
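
An `OSError: [Errno 2]` raised from inside `Popen` generally means that something needed to launch the child process could not be found, typically the program itself or the working directory. The sketch below reproduces both cases. It is written for Python 3 (the traceback above comes from Python 2.7), and `try_popen` and the non-existent names are placeholders made up for the demonstration.

```python
import errno
import subprocess
import sys


def try_popen(args, cwd=None):
    """Start a process; return 0 on success or the errno if it cannot launch."""
    try:
        proc = subprocess.Popen(args, cwd=cwd,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        proc.wait()
        return 0
    except OSError as exc:
        return exc.errno


# Case 1: the executable does not exist.
missing_program = try_popen(["no_such_program_for_this_demo"])

# Case 2: the executable exists but the working directory does not.
missing_cwd = try_popen([sys.executable, "-c", "pass"],
                        cwd="/no/such/directory/for/this/demo")

# Control: everything exists.
working = try_popen([sys.executable, "-c", "pass"])

print(missing_program == errno.ENOENT, missing_cwd == errno.ENOENT, working)
```

Because both cases produce the same message, it can help to check the first element of the argument list and any `cwd` value separately when tracking down which path is missing.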

multi-user / HPC support for MitoFinder

Hi @RemiAllio,

Congrats on this nice project. I wanted to create a Bioconda package for MitoFinder but found a few problems that prevent MitoFinder from being used in multi-user settings or HPC environments.

The main point that currently prevents us from using MitoFinder is that the installation is not decoupled from its execution. You rely on a lot of path mangling, such as pathToMegahitFolder = os.path.join(module_dir, 'megahit/'); it would be better, imho, to just assume megahit is on the PATH. The user or the package manager is then responsible for putting it there.
Another point is that Mitofinder.config seems to be assumed to sit next to the main Python file. Is that correct? I could not find a way to change it - maybe I missed it.
On HPC systems or in multi-user systems, the installation path is not writable and a user cannot modify the config file.
It would be nice if Mitofinder.config could be passed to MitoFinder via a command-line argument.
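
The two suggestions above could be sketched roughly as follows. This is an illustration only: the `--config` option, the `mitofinder-sketch` program name, and the helper functions are hypothetical and do not describe an existing MitoFinder interface.

```python
import argparse
import os
import shutil


def build_parser():
    parser = argparse.ArgumentParser(prog="mitofinder-sketch")
    # Hypothetical option: lets a user point to a config file they can write.
    parser.add_argument("--config", default=None,
                        help="path to a user-writable config file")
    return parser


def resolve_config(cli_value, install_dir):
    """Prefer the command-line path, else the file next to the main script."""
    if cli_value:
        return os.path.abspath(cli_value)
    return os.path.join(install_dir, "Mitofinder.config")


def resolve_tool(name, bundled_dir=None):
    """Prefer an executable found on PATH, else a bundled copy, else None."""
    on_path = shutil.which(name)
    if on_path:
        return on_path
    if bundled_dir:
        candidate = os.path.join(bundled_dir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


args = build_parser().parse_args(["--config", "/tmp/my_mitofinder.config"])
print(resolve_config(args.config, "/opt/MitoFinder"))
print(resolve_config(None, "/opt/MitoFinder"))
print(resolve_tool("megahit"))  # None unless megahit is on PATH
```

Looking on `PATH` first keeps package-manager installs working, while the bundled-directory fallback preserves the current layout for users who install from source.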

We have recently collected a few tips on how developers can create tools that are easily deployable on HPC and cloud systems [1]. Maybe that helps a little bit. Please let me know if I can help in any way. I would really like to see this tool running as part of our pipelines. I think that, with some restructuring, this project could easily be used by many more researchers.

[1] https://academic.oup.com/gigascience/article/8/5/giz054/5497810

blast problem on ubuntu 22.04

Hi Remi,

I have a problem with the makeblastdb available with MitoFinder on ubuntu 22.04:
MitoFinder/blast/bin/makeblastdb: error while loading shared libraries: libidn.so.11: cannot open shared object file: No such file or directory

I tried making a symbolic link to my own version of blast, but the pipeline doesn't work and I fear it could be because of that (not sure).

Cheers,
Benoit

Licensing

Hey Remi,
Congrats, this is a great pipeline!

I would like to implement some steps of your MitoFinder in a pipeline of mine. If your intention is to have it free, could you add an MIT or similar license to your pipeline?

Thank you!
Marcela.

mitofinder installation

Hi,
I successfully installed MitoFinder on an old computer, but I am now trying on a new one and it is not working. I tried to use Singularity but did not manage to.
I have created a new conda environment with Python 2.7 and MitoFinder:
conda create -n mitofinder -c bioconda mitofinder python=2.7

but when trying to run the program, I get this message:

/home/goliath/miniconda3/envs/mitofinder2/lib/python2.7/site-packages/Bio/SearchIO/__init__.py:211: BiopythonExperimentalWarning: Bio.SearchIO is an experimental submodule which may undergo significant changes prior to its future official release.
  BiopythonExperimentalWarning)

Command line: /home/goliath/miniconda3/envs/mitofinder2/bin/mitofinder -j 12953 -1 /mnt/DiscoA/cerastes/raw_data/wgs/filtered/12953_filtered_1.fastq.gz -2 /mnt/DiscoA/cerastes/raw_data/wgs/filtered/12953_filtered_2.fastq.gz -r Vip_berus_mito.gb -o 2 -p 40 -m 80 --override

Now running MitoFinder ...

Start time : 2023-06-07 15:20:30

Job name = 12953

ERROR: MitoFinder is not installed.
No such file or directory: /home/goliath/miniconda3/envs/mitofinder2/bin/install.sh.ok
Please run ./install.sh in the MitoFinder directory.
Aborting.

Do you know what's going on?

Thanks in advance!

Annotation discrepancies

Dear Rémi,

I hope you are well.

We have been making some tests and comparisons between annotations retrieved by MitoFinder to those from MITOS2, and found some discrepancies.

We assembled reads using metaSPAdes, by modifying lines #96 and #98 from runMetaspades.py to specify k-mers (21, 31, 41, 51, 71 and 91).
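
For reference, a k-mer list like the one above is normally passed to SPAdes through its `-k` option as a comma-separated string. The sketch below validates such a list before formatting it. The constraints checked (odd values, below 128, ascending order) are stated from memory of the SPAdes manual and should be verified against the installed version; `spades_kmer_args` is a hypothetical helper, and nothing here reflects how runMetaspades.py actually builds its command.

```python
def spades_kmer_args(kmers):
    """Validate k-mer sizes and return them as SPAdes '-k' arguments."""
    kmers = list(kmers)
    if kmers != sorted(set(kmers)):
        raise ValueError("k-mer sizes must be unique and in ascending order")
    for k in kmers:
        if k % 2 == 0 or k >= 128:
            raise ValueError("k-mer sizes must be odd and below 128: %d" % k)
    return ["-k", ",".join(str(k) for k in kmers)]


print(spades_kmer_args([21, 31, 41, 51, 71, 91]))
```

Recording the exact `-k` value used for both the MitoFinder run and the standalone metaSPAdes run makes it easier to confirm that the two assemblies really used the same parameters.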

Here is the annotation performed directly by MitoFinder: 1191_mtDNA_contig.txt

And here, the one performed by MITOS2 by inputting the same contig:
mitofinder_to_mitos

As you can see, MitoFinder fails at retrieving tRNA-Trp and tRNA-Cys between ND2 and COX1; and tRNA-Val between rrnL and rrnS. In general, it also appears that there is a slight variation for START and STOP positions.

For further comparison, we ran metaSPAdes (from SPAdes 3.11.1) installed separately from MitoFinder, using the same parameters as for the assembly performed by MitoFinder. The largest contig was submitted to MITOS, with results as follows:
metaspades_to_mitos

As you can see, annotation is consistent with the previous scaffold produced by MitoFinder and fed to MITOS, with the exception that the former one has a lot of internal stops, which is not the case when using metaSPAdes outside of MitoFinder.

Would the observed discrepancies be linked to the assembly step?

Additionally, we see that the default value of --circular-size is set to 45. Is it meant to be 45 kb? Knowing that our model has a size of roughly 16 kb, would we benefit from changing this option?

Thanks a lot in advance for your reply,
Simon

Not finding nucleotide sequence in reference file

Hello! I am running MitoFinder on Linux, using the Bioconda install. All the program files load fine, but then I get this error in the log: "ERROR: MitoFinder didn't found any nucleotide sequence in the reference(s) file(s)."
I downloaded the reference from GenBank (https://www.ncbi.nlm.nih.gov/nuccore/OX608057.1), and the file looks to be in GenBank format. I checked for any odd whitespace characters, on the off chance that my upload to the super changed something, but couldn't find anything. I'm sure I am missing something simple, but I am fairly new to all of this, so any help would be much appreciated!
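
Two things that may be worth ruling out with an error like this are a reference record that has no annotated genes and a record saved without its sequence block. Whether either applies to this file is a guess. The sketch below counts CDS features and sequence letters in GenBank flat-file text; `genbank_summary` and the toy records are made up for illustration.

```python
def genbank_summary(text):
    """Count sequence letters after ORIGIN and CDS features in a GenBank record."""
    in_origin = False
    n_bases = 0
    n_cds = 0
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            in_origin = False
            continue
        if in_origin:
            # Sequence lines hold a position number followed by blocks of letters.
            n_bases += sum(ch.isalpha() for ch in line)
        elif line.startswith("     CDS "):
            n_cds += 1
    return {"bases": n_bases, "cds": n_cds}


with_sequence = "\n".join([
    "LOCUS       TOY                 12 bp    DNA     circular",
    "FEATURES             Location/Qualifiers",
    "     CDS             1..12",
    '                     /gene="COX1"',
    "ORIGIN",
    "        1 atgaccaacc ta",
    "//",
])
without_sequence = "\n".join(with_sequence.splitlines()[:4] + ["//"])

print(genbank_summary(with_sequence))
print(genbank_summary(without_sequence))
```

A record that reports zero CDS features or zero bases would not give an annotation pipeline much to work with, so either result would be a useful clue.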
Thank you~
Teagan

Absolute path to the scaffold file is not accepted (IOError: [Errno 2] No such file or directory)

I am having an issue testing MitoFinder before running it on all of my samples. I can only run this program if my scaffolds are in the directory from which I call the program, even though I can copy the scaffold.fasta file into my working directory, or even the directory above, using the exact same path. If I use the absolute path I get this error:

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd ~/mito/test2

mitofinder_v1.4.1.sif -j Acilius_canaliculatus_SRR12339052/ \
-a /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta \
-r ../reference.gb \
-m 25 -p 4 -o 5 \
--min-contig-size 500

Traceback (most recent call last):
  File "/opt/MitoFinder/mitofinder", line 168, in <module>
    logfile=open(Logfile,"w")
IOError: [Errno 2] No such file or directory: '/data/crcardenas/mito/test2/Acilius_canaliculatus_SRR12339052/_MitoFinder.log'

If I run cp /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta ./
and then run the code again with -a ./scaffold.fasta, it works fine.

It does not seem that the program can accept the absolute path.
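
One detail stands out in the traceback: the log path ends in `Acilius_canaliculatus_SRR12339052/_MitoFinder.log`, which is what would result if the job name given to `-j` (written with a trailing `/` in the script above) were simply prefixed to `_MitoFinder.log`. The sketch below shows that effect. The naming scheme in `log_path` is inferred from the traceback and is an assumption, not taken from the MitoFinder source.

```python
import os


def log_path(workdir, job_name):
    # Assumed naming scheme, inferred from the path in the traceback:
    # <workdir>/<job_name>_MitoFinder.log
    return os.path.join(workdir, job_name + "_MitoFinder.log")


workdir = "/data/crcardenas/mito/test2"
job_as_posted = "Acilius_canaliculatus_SRR12339052/"

as_posted = log_path(workdir, job_as_posted)
cleaned = log_path(workdir, job_as_posted.rstrip("/"))

print(as_posted)  # points inside a directory that may not exist yet
print(cleaned)    # a plain file in the working directory
```

If that reading is right, the absolute path given to `-a` may not be the cause, and dropping the trailing slash from the `-j` value would be worth testing first.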

Issue running current version

Hi there!

I am getting an issue with the current installation. The assembly step runs absolutely fine, but when I try to run the numt identification step on the contigs from the first step I get this error:

Traceback (most recent call last):
  File "/scratch/PI/dpetrov/armstrong/applications/MitoFinder/mitofinder", line 1249, in <module>
    fifthStep = Popen(args1, cwd=pathOfFinalResults, stdout=open('geneChecker.log', 'a'), stderr=open('geneChecker_erreur.log', 'a'))
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

I have tried it on multiple different systems and using git clone/etc, so I think it may be something to do with the python script being solved, but please let me know if you have any ideas!

Here is my code:
//MitoFinder/mitofinder -j sample1-numt-test1 -a sample1-mito-test1_link_megahit.scafSeq -r refsample.gb -o 2 -p 40 --numt --nwalk 0

Thank you so much!

NameError: name 'featureName' is not defined

This is the command I used to run MitoFinder on fungal genomes:

singularity run mitofinder_v1.4.1.sif -j A5_mito -s A5_HiFi_reads.fastq -r Rhizophagus_irregularis_DAOM197198_mtDNA.gb -o 1 -p 5 -m 10

I am getting the following error:
Command line: /opt/MitoFinder/mitofinder -j A5_mito -s A5_HiFi_reads.fastq -r Rhizophagus_irregularis_DAOM197198_mtDNA.gb -o 1 -p 5 -m 10

Now running MitoFinder ...

Start time : 2023-07-31 16:01:30

Job name = A5_mito

Creating Output directory : /home/eniac/A5_mito
All results will be written here

Program folders:
MEGAHIT = /opt/MitoFinder/megahit/
MetaSPAdes folder = /opt/MitoFinder/metaspades/bin/
IDBA-UD folder = /opt/MitoFinder/idba/bin/
Blast folder = /opt/MitoFinder/blast/bin/
ARWEN folder = /opt/MitoFinder/arwen/
MiTFi folder = /opt/MitoFinder/mitfi/
tRNAscan-SE folder = /opt/MitoFinder/trnascanSE/tRNAscan-SE-2.0/

Traceback (most recent call last):
  File "/opt/MitoFinder/mitofinder", line 574, in <module>
    featureName = ''.join(featureName.split())
NameError: name 'featureName' is not defined
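
A `NameError` on a line that both reads and assigns `featureName` usually means that the branch which should have set the name earlier never ran. The sketch below reproduces that pattern on a feature whose qualifiers lack the expected keys. The key names `gene` and `product` and both helper functions are assumptions for illustration; the actual logic around line 574 is not shown in this report.

```python
def feature_name_unsafe(qualifiers):
    # featureName is only bound when one of the expected keys is present.
    if "gene" in qualifiers:
        featureName = qualifiers["gene"][0]
    elif "product" in qualifiers:
        featureName = qualifiers["product"][0]
    return "".join(featureName.split())


def feature_name_safe(qualifiers):
    """Same lookup, but returns None when no usable qualifier exists."""
    for key in ("gene", "product"):
        values = qualifiers.get(key)
        if values:
            return "".join(values[0].split())
    return None


unannotated = {"note": ["hypothetical protein"]}

try:
    feature_name_unsafe(unannotated)
    failed = False
except NameError:  # inside a function this is UnboundLocalError, a NameError subclass
    failed = True

print(failed, feature_name_safe(unannotated), feature_name_safe({"gene": ["COX 1"]}))
```

If this is what happens, inspecting the reference GenBank file for features that carry neither a gene name nor a product could narrow the problem down.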

Naive question

Hi - thanks for this useful program. I am using it on existing assemblies, and I would like to know whether it outputs somewhere the name of the contig that was found to be the mitochondrial genome (I could always do a blast back). That would be useful for some applications. Cheers!

Add annotations to assembled mitogenomes

Great tool,
I have several mitogenomes already assembled belonging to mammals. I wonder if I could use MitoFinder to do the annotations and formatting for GenBank. What would the command look like?
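
The usage text quoted earlier on this page (in the installation traceback) lists a form for existing assemblies: `mitofinder -j [seqid] -a [assembly.fasta] -r [genbank_reference.gb] -o [genetic_code] -p [threads] -m [memory]`. The sketch below builds that command for several assemblies. The file names, thread count, and memory value are placeholders, and genetic code 2 (vertebrate mitochondrial in the NCBI translation tables) is assumed because the question concerns mammals.

```python
import shlex


def annotation_command(seqid, assembly, reference, genetic_code, threads, memory_gb):
    """Return the argument list for annotating one existing assembly."""
    return ["mitofinder", "-j", seqid, "-a", assembly, "-r", reference,
            "-o", str(genetic_code), "-p", str(threads), "-m", str(memory_gb)]


# Placeholder inputs: sequence ID -> assembled mitogenome FASTA.
assemblies = {
    "sample_A": "sample_A_mitogenome.fasta",
    "sample_B": "sample_B_mitogenome.fasta",
}

commands = [
    annotation_command(seqid, fasta, "mammal_reference.gb", 2, 4, 8)
    for seqid, fasta in sorted(assemblies.items())
]

for cmd in commands:
    print(" ".join(shlex.quote(part) for part in cmd))
```

Each printed line can be pasted into a shell, or the lists can be passed directly to `subprocess.run`.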

Thanks a lot !

ERROR: MetaSPAdes didn't run well

Hello!

I was trying to run MitoFinder and got the error "ERROR: MetaSPAdes didn't run well".
When checking the metaspades.log within the created result directory, it showed the following:

The program was terminated by segmentation fault
=== Stack Trace ===
'tuple' object has no attribute 'release'
Traceback (most recent call last):
  File "/opt/MitoFinder/metaspades/bin/metaspades.py", line 626, in main
    executor.execute(commands)
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/executors/executor_local.py", line 37, in execute
    command.run(self.log)
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/commands_parser.py", line 38, in run
    support.sys_call(self.to_list(), log)
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/support.py", line 330, in sys_call
    sys_error(cmd, log, proc.returncode)
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/support.py", line 91, in sys_error
    hints_str = get_error_hints(exit_code)
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/support.py", line 87, in get_error_hints
    return wsl_check()
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/support.py", line 78, in wsl_check
    if in_wsl():
  File "/opt/MitoFinder/metaspades/share/spades/spades_pipeline/support.py", line 76, in in_wsl
    return 'microsoft' in uname().release.lower()
AttributeError: 'tuple' object has no attribute 'release'
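
The `AttributeError` at the end of this trace looks like a secondary failure: it is raised while SPAdes is building an error hint, after the segmentation fault reported on the first line. In Python 2, `uname()` returns a plain tuple without a `.release` attribute, so the message is consistent with that helper running under Python 2. Which interpreter runs inside the container is an assumption here. A portable version of the same check could look like this:

```python
import platform


def in_wsl():
    """Detect WSL via platform.release(), which returns a string on Python 2 and 3."""
    return "microsoft" in platform.release().lower()


print("Kernel release:", platform.release())
print("Running under WSL:", in_wsl())
```

Fixing the hint code would only change the final message. The segmentation fault itself would still need to be investigated separately, for example by checking the memory and thread settings requested for the job.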

I used the following script to run it:

#!/bin/bash

#PBS -N 001_mitofinder_SPopGen_2023
#PBS -l select=1:ncpus=1:mem=70gb:scratch_local=300gb
#PBS -l walltime=23:59:00
#PBS -o /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/PBS/out/001_mitofinder_SPopGen.txt
#PBS -e /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/PBS/err/001_mitofinder_SPopGen.txt

#clean scratch after the end
trap 'clean_scratch' TERM EXIT

# go to scratch directory
cd $SCRATCHDIR || exit 1

export TMPDIR=$SCRATCHDIR

source /storage/brno2/home/pedroribeiro/.bashrc
cd /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads

singularity run -B $SCRATCHDIR ~/software/mitofinder_latest.sif \
 --metaspades -j PR001.mitogenome \
 -1 /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/PR001_R1.fastq \
 -2 /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/PR001_R2.fastq \
 -r /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/mito_reference/mito_reference.gb \
 -o 5 \
 -p 8 \
 -m 70

Any tips on how to proceed? Let me know if anything else from my part would help!

Thank you

Pedro Ribeiro

tRNA annotation

Dear Remi,
I'm using your MitoFinder for annotation of termite mitochondrial genomes and I'm super-satisfied. With a large reference database of termite mtGenomes it outperforms in my hands both Mitos and MitoZ for protein coding and rRNA genes.

However, I found that the tRNA predictions are rather poor (compared to manual annotation or to the two pipelines above).
I tested your annotation pipeline using 40 published termite mtGenomes (KP026260-KP026298). MitoFinder predicted on average 16 tRNAs per genome (out of 22) while the other two pipelines were above 21 average predicted tRNAs per genome.
The summary stats of annotated tRNAs for the 40 genomes are below, showing the total number of each predicted tRNA among the 40 genomes. The summary stats also show fuzzy naming of tRNAs: there should be a single copy of each tRNA, except for tRNA-Leu1/2 and tRNA-Ser1/2.

34 Name=tRNA-Ala 
  3 Name=tRNA-Ala2 
 34 Name=tRNA-Arg 
  2 Name=tRNA-Arg2 
 31 Name=tRNA-Asn 
  6 Name=tRNA-Asn2 
 38 Name=tRNA-Asp 
  1 Name=tRNA-Asp2 
 38 Name=tRNA-Gln 
 35 Name=tRNA-Glu 
 34 Name=tRNA-Gly 
  3 Name=tRNA-Gly2 
 31 Name=tRNA-His 
  7 Name=tRNA-His2 
 29 Name=tRNA-Ile 
 37 Name=tRNA-Leu 
  1 Name=tRNA-Leu2 
 36 Name=tRNA-Lys 
  1 Name=tRNA-Lys2 
 38 Name=tRNA-Met 
  1 Name=tRNA-Met2 
 35 Name=tRNA-Phe 
 35 Name=tRNA-Pro 
  2 Name=tRNA-Pro2 
 33 Name=tRNA-Ser 
 17 Name=tRNA-Ser2 
 34 Name=tRNA-Thr 
  4 Name=tRNA-Thr2 
  1 Name=tRNA-Trp 
  2 Name=tRNA-Trp2 
 35 Name=tRNA-Tyr 
  3 Name=tRNA-Tyr2 
 12 Name=tRNA-Val 
  4 Name=tRNA-Val2
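
A tally like the one above can be produced, and the unexpected numbered copies flagged, with a few lines of Python. This is a sketch on made-up input: the two toy genomes and the helper names are hypothetical, and the only rule encoded is the one stated in the post, namely that tRNA-Leu and tRNA-Ser are the only tRNAs expected in two copies.

```python
import re
from collections import Counter


def tally_trnas(names_per_genome):
    """Sum tRNA name occurrences over a list of per-genome name lists."""
    totals = Counter()
    for names in names_per_genome:
        totals.update(names)
    return totals


def suffixed_copies(totals, allowed=("tRNA-Leu", "tRNA-Ser")):
    """Return names like 'tRNA-Ala2' whose base name should be single-copy."""
    flagged = {}
    for name, count in totals.items():
        match = re.match(r"^(tRNA-[A-Za-z]+)(\d+)$", name)
        if match and match.group(1) not in allowed:
            flagged[name] = count
    return flagged


toy_genomes = [
    ["tRNA-Ala", "tRNA-Leu", "tRNA-Leu2", "tRNA-Ser"],
    ["tRNA-Ala2", "tRNA-Leu", "tRNA-Ser2", "tRNA-Val"],
]

totals = tally_trnas(toy_genomes)
print(sorted(totals.items()))
print(suffixed_copies(totals))
```

Dividing the sum of all counts by the number of genomes gives the average number of predicted tRNAs per genome quoted in the comparison.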

I'm wondering whether anything can be done from within the MitoFinder pipeline to tune the performance of the ARWEN module for tRNA annotation?
Thanks in advance and best regards,
Ales

Unprivileged user installation of dependencies.

Hi,

I am trying to install MitoFinder on a shared HPC system. Unfortunately, the following make install step from install.sh:

cd "$wd"/trnascanSE
tar -xvf infernal-1.1.3.tar.gz
cd infernal-1.1.3
./configure
make
make install

fails because the --prefix is not set to anything, thus defaulting to /usr/local. What should the correct prefix be?
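
For context, configure scripts generated by autoconf accept a `--prefix` option, so an unprivileged build generally points it at a directory the user can write to. The sketch below only assembles such a command. The example directory is a placeholder, `configure_command` is a hypothetical helper, and which location MitoFinder expects Infernal to be installed in remains the open question of this report.

```python
import os


def configure_command(prefix):
    """Build a ./configure call that installs under a user-writable prefix."""
    prefix = os.path.abspath(os.path.expanduser(prefix))
    return ["./configure", "--prefix=" + prefix]


# Placeholder location; any directory writable by the user would do.
cmd = configure_command("/tmp/infernal-1.1.3-install")
print(" ".join(cmd))
```

With a prefix set this way, the later `make install` step writes under that directory instead of `/usr/local`.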

Thanks.

Can't install MitoFinder

Hi,

I am quite thrilled about MitoFinder and wanted to install it this week. Unfortunately, I got an error, and it is a strange one: apparently the script is looking for something which is not there. I tried to fix it with a colleague, but we were not able to get it running. Can you help us?
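
The repeated "command not found" lines for `aclocal`, `autoconf` and `automake` in the log below suggest that the GNU autotools are not installed; the README lists automake, autoconf and gcc as requirements (`aclocal` ships with automake). A quick way to check before running install.sh is sketched here. The tool list is an assumption based on those requirements and on the commands visible in the log.

```python
import shutil

# Tools named in the README requirements or called in the log.
BUILD_TOOLS = ["aclocal", "autoconf", "automake", "make", "gcc"]


def missing_tools(tools):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(BUILD_TOOLS)
if missing:
    print("Missing build tools:", ", ".join(missing))
else:
    print("All build tools found.")
```

Any tool reported as missing would need to be installed through the system package manager before the IDBA-UD build step can succeed.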
Please find the log below:

florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ./install.sh
./build.sh: line 1: aclocal: command not found
./build.sh: line 2: autoconf: command not found
./build.sh: line 3: automake: command not found
./build.sh: line 4: ./configure: No such file or directory
make: *** No rule to make target 'clean'. Stop.
make: *** No targets specified and no makefile found. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cat install.sh
chmod 764 ///
chmod 764 //*
chmod 764 /
chmod 764 *
cd idba
./build.sh
cd ..
cd metaSpades/bin/
p=$(pwd)
ln -s "$p"/spades.py "$p"/metaspades.py
cd ../../
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cd idba/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 build.sh configure.ac install-sh NEWS stamp-h1
AUTHORS ChangeLog depcomp lib README.md test
bin config.h Dockerfile Makefile.am script WORKSPACE
BUILD config.h.in gtest_src missing src
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ./build.sh
./build.sh: line 1: aclocal: command not found
./build.sh: line 2: autoconf: command not found
./build.sh: line 3: automake: command not found
./build.sh: line 4: ./configure: No such file or directory
make: *** No rule to make target 'clean'. Stop.
make: *** No targets specified and no makefile found. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ cd..
cd..: command not found
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 build.sh configure.ac install-sh NEWS stamp-h1
AUTHORS ChangeLog depcomp lib README.md test
bin config.h Dockerfile Makefile.am script WORKSPACE
BUILD config.h.in gtest_src missing src
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ cd ..
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio megahit
blast metaSpades
CHANGELOG mitofinder
circularizationCheck.py Mitofinder.config
circularizationCheck.pyc README.md
export_mitochondrial_contigs.py rename_fasta_seqID.py
extract_genes.py runIDBA.py
extract_seq.py runIDBA.pyc
FirstBuildChecker.py runMegahit.py
FirstBuildChecker.pyc runMegahit.pyc
genbankOutput.py runMetaspades.py
genbankOutput.pyc runMetaspades.pyc
geneChecker_fasta.py sort_gff.py
geneChecker.py testcase
geneChecker.pyc tRNAscan
idba tRNAscanChecker_arwen.py
image tRNAscanChecker.py
install.sh tRNAscanChecker.pyc
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ make clean
make: *** No rule to make target 'clean'. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ make
make: *** No targets specified and no makefile found. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cd Bio/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/Bio$ ls
Affy FSSP NeuralNetwork SeqRecord.py
Align GA Nexus SeqRecord.pyc
AlignIO GenBank NMR Sequencing
Alphabet Geo pairwise2.py SeqUtils
Application Graphics ParserSupport.py Statistics
bgzf.py HMM ParserSupport.pyc SubsMat
bgzf.pyc HotRand.py Pathway SVDSuperimposer
Blast Index.py PDB SwissProt
CAPS init.py Phylo TogoWS
Cluster init.pyc PopGen trie.c
Compass KDTree _py3k triefind.py
cpairwise2module.c KEGG Restriction trie.h
Crystal kNN.py SCOP triemodule.c
Data LogisticRegression.py SearchIO UniGene
DocSQL.py MarkovModel.py Search.py UniProt
Emboss MaxEntropy.py SeqFeature.py _utils.py
Entrez Medline SeqFeature.pyc _utils.pyc
ExPASy Motif SeqIO Wise
File.py motifs Seq.py
File.pyc NaiveBayes.py Seq.pyc
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/Bio$ cd ..
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio megahit
blast metaSpades
CHANGELOG mitofinder
circularizationCheck.py Mitofinder.config
circularizationCheck.pyc README.md
export_mitochondrial_contigs.py rename_fasta_seqID.py
extract_genes.py runIDBA.py
extract_seq.py runIDBA.pyc
FirstBuildChecker.py runMegahit.py
FirstBuildChecker.pyc runMegahit.pyc
genbankOutput.py runMetaspades.py
genbankOutput.pyc runMetaspades.pyc
geneChecker_fasta.py sort_gff.py
geneChecker.py testcase
geneChecker.pyc tRNAscan
idba tRNAscanChecker_arwen.py
image tRNAscanChecker.py
install.sh tRNAscanChecker.pyc
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ./install.sh
./build.sh: line 1: aclocal: command not found
./build.sh: line 2: autoconf: command not found
./build.sh: line 3: automake: command not found
./build.sh: line 4: ./configure: No such file or directory
make: *** No rule to make target 'clean'. Stop.
make: *** No targets specified and no makefile found. Stop.
ln: failed to create symbolic link '/home/florian/Downloads/MitoFinder-master/metaSpades/bin/metaspades.py': File exists
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ emacs install.sh -nw
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ which bash
/bin/bash
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ emacs install.sh -nw
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ./install.sh
bash: ./install.sh: /usr/bin/bash: bad interpreter: No such file or directory
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ emacs install.sh -nw
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ./install.sh
./build.sh: line 1: aclocal: command not found
./build.sh: line 2: autoconf: command not found
./build.sh: line 3: automake: command not found
./build.sh: line 4: ./configure: No such file or directory
make: *** No rule to make target 'clean'. Stop.
make: *** No targets specified and no makefile found. Stop.
ln: failed to create symbolic link '/home/florian/Downloads/MitoFinder-master/metaSpades/bin/metaspades.py': File exists
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cat install.sh
#!/bin/bash

chmod 764 ///
chmod 764 //*
chmod 764 /
chmod 764 *
cd idba
./build.sh
cd ..
cd metaSpades/bin/
p=$(pwd)
ln -s "$p"/spades.py "$p"/metaspades.py
cd ../../
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio megahit
blast metaSpades
CHANGELOG mitofinder
circularizationCheck.py Mitofinder.config
circularizationCheck.pyc README.md
export_mitochondrial_contigs.py rename_fasta_seqID.py
extract_genes.py runIDBA.py
extract_seq.py runIDBA.pyc
FirstBuildChecker.py runMegahit.py
FirstBuildChecker.pyc runMegahit.pyc
genbankOutput.py runMetaspades.py
genbankOutput.pyc runMetaspades.pyc
geneChecker_fasta.py sort_gff.py
geneChecker.py testcase
geneChecker.pyc tRNAscan
idba tRNAscanChecker_arwen.py
image tRNAscanChecker.py
install.sh tRNAscanChecker.pyc
install.sh

florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cd idba/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 build.sh configure.ac install-sh NEWS stamp-h1
AUTHORS ChangeLog depcomp lib README.md test
bin config.h Dockerfile Makefile.am script WORKSPACE
BUILD config.h.in gtest_src missing src
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ./config.h
./config.h: line 1: /bin: Is a directory
./config.h: line 2: /bin: Is a directory
./config.h: command substitution: line 4: unexpected EOF while looking for matching ''
./config.h: command substitution: line 14: syntax error: unexpected end of file
./config.h: line 4: /bin: Is a directory
./config.h: line 73: unexpected EOF while looking for matching ''
./config.h: line 75: syntax error: unexpected end of file
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ lsls

Command 'lsls' not found, did you mean:

command 'fsls' from deb python-fs
command 'lsns' from deb util-linux

Try: sudo apt install

florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 bin build.sh config.h configure.ac Dockerfile install-sh Makefile.am NEWS script stamp-h1 WORKSPACE
AUTHORS BUILD ChangeLog config.h.in depcomp gtest_src lib missing README.md src test
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ./configure
bash: ./configure: No such file or directory
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ less install-sh
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ less README.md
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ./configure.ac
./configure.ac: line 4: syntax error near unexpected token `2.59'
./configure.ac: line 4: `AC_PREREQ(2.59)'
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ./configure
bash: ./configure: No such file or directory
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ make
make: *** No targets specified and no makefile found. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 bin build.sh config.h configure.ac Dockerfile install-sh Makefile.am NEWS script stamp-h1 WORKSPACE
AUTHORS BUILD ChangeLog config.h.in depcomp gtest_src lib missing README.md src test
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ less install-sh
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ./install-sh
./install-sh: no input file specified.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ cd bin/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba/bin$ ls
Makefile.am
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba/bin$ make
make: *** No targets specified and no makefile found. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba/bin$ ls
Makefile.am
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba/bin$ cd ..
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 bin build.sh config.h configure.ac Dockerfile install-sh Makefile.am NEWS script stamp-h1 WORKSPACE
AUTHORS BUILD ChangeLog config.h.in depcomp gtest_src lib missing README.md src test
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ cd src/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba/src$ ls
assembly basic container graph misc release sequence test tools
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba/src$ cd ..
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ l
saclocal.m4* BUILD* config.h* depcomp* install-sh* missing* script/ test/
AUTHORS* build.sh* config.h.in* Dockerfile* lib/ NEWS* src/ WORKSPACE*
bin/ ChangeLog* configure.ac* gtest_src/ Makefile.am* README.md* stamp-h1*
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ less BUILD
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ ls
aclocal.m4 bin build.sh config.h configure.ac Dockerfile install-sh Makefile.am NEWS script stamp-h1 WORKSPACE
AUTHORS BUILD ChangeLog config.h.in depcomp gtest_src lib missing README.md src test
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/idba$ cd ..
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio FirstBuildChecker.py image rename_fasta_seqID.py testcase
blast FirstBuildChecker.pyc install.sh runIDBA.py tRNAscan
CHANGELOG genbankOutput.py install.sh~ runIDBA.pyc tRNAscanChecker_arwen.py
circularizationCheck.py genbankOutput.pyc megahit runMegahit.py tRNAscanChecker.py
circularizationCheck.pyc geneChecker_fasta.py metaSpades runMegahit.pyc tRNAscanChecker.pyc
export_mitochondrial_contigs.py geneChecker.py mitofinder runMetaspades.py
extract_genes.py geneChecker.pyc Mitofinder.config runMetaspades.pyc
extract_seq.py idba README.md sort_gff.py
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ make
make: *** No targets specified and no makefile found. Stop.
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ös
ös: command not found
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio FirstBuildChecker.py image rename_fasta_seqID.py testcase
blast FirstBuildChecker.pyc install.sh runIDBA.py tRNAscan
CHANGELOG genbankOutput.py install.sh~ runIDBA.pyc tRNAscanChecker_arwen.py
circularizationCheck.py genbankOutput.pyc megahit runMegahit.py tRNAscanChecker.py
circularizationCheck.pyc geneChecker_fasta.py metaSpades runMegahit.pyc tRNAscanChecker.pyc
export_mitochondrial_contigs.py geneChecker.py mitofinder runMetaspades.py
extract_genes.py geneChecker.pyc Mitofinder.config runMetaspades.pyc
extract_seq.py idba README.md sort_gff.py
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cd ..
florian@florian-ThinkPad-P50:~/Downloads$ ls
Fake-Gelelektrophorese.pdf MitoFinder-master MitoFinder-master.zip
florian@florian-ThinkPad-P50:~/Downloads$ cd MitoFinder-master/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio FirstBuildChecker.py image rename_fasta_seqID.py testcase
blast FirstBuildChecker.pyc install.sh runIDBA.py tRNAscan
CHANGELOG genbankOutput.py install.sh~ runIDBA.pyc tRNAscanChecker_arwen.py
circularizationCheck.py genbankOutput.pyc megahit runMegahit.py tRNAscanChecker.py
circularizationCheck.pyc geneChecker_fasta.py metaSpades runMegahit.pyc tRNAscanChecker.pyc
export_mitochondrial_contigs.py geneChecker.py mitofinder runMetaspades.py
extract_genes.py geneChecker.pyc Mitofinder.config runMetaspades.pyc
extract_seq.py idba README.md sort_gff.py
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ less CHANGELOG
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio FirstBuildChecker.py image rename_fasta_seqID.py testcase
blast FirstBuildChecker.pyc install.sh runIDBA.py tRNAscan
CHANGELOG genbankOutput.py install.sh~ runIDBA.pyc tRNAscanChecker_arwen.py
circularizationCheck.py genbankOutput.pyc megahit runMegahit.py tRNAscanChecker.py
circularizationCheck.pyc geneChecker_fasta.py metaSpades runMegahit.pyc tRNAscanChecker.pyc
export_mitochondrial_contigs.py geneChecker.py mitofinder runMetaspades.py
extract_genes.py geneChecker.pyc Mitofinder.config runMetaspades.pyc
extract_seq.py idba README.md sort_gff.py
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ sh install.sh
./build.sh: 1: ./build.sh: aclocal: not found
./build.sh: 2: ./build.sh: autoconf: not found
./build.sh: 3: ./build.sh: automake: not found
./build.sh: 4: ./build.sh: ./configure: not found
make: *** No rule to make target 'clean'. Stop.
make: *** No targets specified and no makefile found. Stop.
ln: failed to create symbolic link '/home/florian/Downloads/MitoFinder-master/metaSpades/bin/metaspades.py': File exists
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ cd blast/
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/blast$ ls
bin ChangeLog doc LICENSE ncbi_package_info README
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master/blast$ cd ..
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio FirstBuildChecker.py image rename_fasta_seqID.py testcase
blast FirstBuildChecker.pyc install.sh runIDBA.py tRNAscan
CHANGELOG genbankOutput.py install.sh~ runIDBA.pyc tRNAscanChecker_arwen.py
circularizationCheck.py genbankOutput.pyc megahit runMegahit.py tRNAscanChecker.py
circularizationCheck.pyc geneChecker_fasta.py metaSpades runMegahit.pyc tRNAscanChecker.pyc
export_mitochondrial_contigs.py geneChecker.py mitofinder runMetaspades.py
extract_genes.py geneChecker.pyc Mitofinder.config runMetaspades.pyc
extract_seq.py idba README.md sort_gff.py
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ less Mitofinder.config
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ less mitofinder
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ ls
Bio FirstBuildChecker.py image rename_fasta_seqID.py testcase
blast FirstBuildChecker.pyc install.sh runIDBA.py tRNAscan
CHANGELOG genbankOutput.py install.sh~ runIDBA.pyc tRNAscanChecker_arwen.py
circularizationCheck.py genbankOutput.pyc megahit runMegahit.py tRNAscanChecker.py
circularizationCheck.pyc geneChecker_fasta.py metaSpades runMegahit.pyc tRNAscanChecker.pyc
export_mitochondrial_contigs.py geneChecker.py mitofinder runMetaspades.py
extract_genes.py geneChecker.pyc Mitofinder.config runMetaspades.pyc
extract_seq.py idba README.md sort_gff.py
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$ sudo ./install.sh
./build.sh: line 1: aclocal: command not found
./build.sh: line 2: autoconf: command not found
./build.sh: line 3: automake: command not found
./build.sh: line 4: ./configure: No such file or directory
make: *** No rule to make target 'clean'. Stop.
make: *** No targets specified and no makefile found. Stop.
ln: failed to create symbolic link '/home/florian/Downloads/MitoFinder-master/metaSpades/bin/metaspades.py': File exists
florian@florian-ThinkPad-P50:~/Downloads/MitoFinder-master$
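
The `aclocal`, `autoconf` and `automake` "not found" messages in the log above suggest that the GNU build tools are not on the PATH. A minimal sketch for checking this before running install.sh follows. It is written in Python 3 (not the Python 2.7 that MitoFinder itself uses). The tool list combines the README requirements (automake, autoconf, gcc) with the two other commands that appear in the log, and `missing_tools` is a hypothetical helper name, not part of MitoFinder.

```python
import shutil

# Tools named in the README requirements (automake, autoconf, gcc), plus
# aclocal and make, which appear in the error messages above.
BUILD_TOOLS = ["aclocal", "autoconf", "automake", "gcc", "make"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(BUILD_TOOLS)
    if missing:
        print("Missing build tools: " + ", ".join(missing))
    else:
        print("All build tools found on PATH.")
```

If anything is reported missing, installing it with the system package manager and re-running install.sh is the obvious next step; `aclocal` is shipped as part of automake.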

NCBI submission

Hi Remi,
I used MitoFinder 1.4.1. (singularity) to annotate mitochondrial genes from UCE library assemblies. I then used the script create_tbl2asn_files.py to generate sqn files for NCBI submission. However, when I ran table2asn (the updated version of tbl2asn) to validate my sequences, I got warnings like the following in the .val file:

Warning: valid [SEQ_FEAT.PartialProblemNotSpliceConsensus3Prime] 3' partial is not at end of sequence, gap, or consensus splice site FEATURE: CDS: NADH dehydrogenase subunit 4 <7> [lcl|NODE_1_length_16677_cov_24.806642:c6164-<4788] [lcl|NODE_1_length_16677_cov_24.806642: raw, dna len= 16677] -> [lcl|NODE_1_length_16677_cov_24.806642_4]

I only got one mitochondrial contig in this particular case and the length seems near complete. But I wonder if this warning indicates something wrong with my MitoFinder annotation. Could you shed some light on it? Thanks a lot!

Best,
Ling

Error while annotating assembled mitochondrion

Hi!
I have an assembled mitochondrion from a microalga (Scenedesmaceae). I was trying to annotate it with this command line:

mitofinder -j Sdimorphus -a mt_Sdim_flye_consensus.fasta -r mitochondrions_sphaeropleales_refseq.gb -o 22

But I got this error:

Traceback (most recent call last):
  File "/home/kim/MitoFinder/mitofinder", line 635, in <module>
    geneOut=open("ref_"+name+"_database.fasta",'a')
IOError: [Errno 2] No such file or directory: 'ref_hypotheticalreversetranscriptase/maturase_database.fasta'

I guess it has something to do with the reference file? I downloaded a genbank file directly from NCBI using "mitochondrion" AND "sphaeropleales" search terms and filtering by RefSeq and sequence length, as described here.

I appreciate any suggestions.

Thank you very much!
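
The file name in the traceback, 'ref_hypotheticalreversetranscriptase/maturase_database.fasta', contains a "/", so `open()` treats the first part as a directory that does not exist. A minimal sketch for locating such names in the reference file follows, assuming they come from single-line `/gene=` or `/product=` qualifiers (which qualifier MitoFinder actually reads is not shown in the traceback). `names_with_slash` and `safe_name` are hypothetical helpers, not MitoFinder functions.

```python
import re

# Matches /gene="..." or /product="..." qualifiers on a single line of a
# GenBank flat file. Multi-line qualifier values are not handled here.
QUALIFIER = re.compile(r'/(gene|product)="([^"]*)"')


def names_with_slash(genbank_text):
    """Return sorted gene/product names that contain a path separator."""
    found = set()
    for _, value in QUALIFIER.findall(genbank_text):
        if "/" in value or "\\" in value:
            found.add(value)
    return sorted(found)


def safe_name(name):
    """Replace path separators so the name can be used inside a file name."""
    return re.sub(r"[/\\]", "_", name)
```

Running `names_with_slash(open("mitochondrions_sphaeropleales_refseq.gb").read())` would list the entries to rename or remove in the reference before running MitoFinder again.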

./install error

I ran into errors when running ./install. Here is the report:
mkdir -p /home/hm/bioapp/MitoFinder/mitfi/infernal-1.0.2/exec/bin
for file in cmalign cmbuild cmcalibrate cmemit cmscore cmsearch cmstat; do
cp src/$file /home/hm/bioapp/MitoFinder/mitfi/infernal-1.0.2/exec/bin/;
done
cp: cannot stat ‘src/cmalign’: No such file or directory
cp: cannot stat ‘src/cmbuild’: No such file or directory
cp: cannot stat ‘src/cmcalibrate’: No such file or directory
cp: cannot stat ‘src/cmemit’: No such file or directory
cp: cannot stat ‘src/cmscore’: No such file or directory
cp: cannot stat ‘src/cmsearch’: No such file or directory
cp: cannot stat ‘src/cmstat’: No such file or directory
Makefile:205: recipe for target 'install' failed
make: *** [install] Error 1
I checked those directories and files; they all exist, but with the suffix '.c'.
Does this mean I need to install C++ or something?
Thank you very much for your response. :)

Problem with creating summary statistics for the mtDNA contig

Hello Remi!

I have been trying to run MitoFinder directly on a previously computed assembly via phyluce, and all works well until I receive the following error:

Command line: /home/sallesmath/MitoFinder/mitofinder -j amcc204510 -a amcc204510.contigs.fasta -r sequence.gb -o 2 -p 4

Now running MitoFinder ...

Start time : 2022-04-15 15:19:16

Job name = amcc204510

Creating Output directory : /home/sallesmath/dados_uce/phyluce_dados_uce_tropidurus/mitofinder/amcc204510
All results will be written here

Program folders:
MEGAHIT = /home/sallesmath/MitoFinder/megahit/
MetaSPAdes folder = /home/sallesmath/MitoFinder/metaspades/bin/
IDBA-UD folder = /home/sallesmath/MitoFinder/idba/bin/
Blast folder = /home/sallesmath/ncbi-blast-2.13.0+/bin/
ARWEN folder = /home/sallesmath/MitoFinder/arwen/
MiTFi folder = /home/sallesmath/MitoFinder/mitfi/
tRNAscan-SE folder = /home/sallesmath/MitoFinder/trnascanSE/tRNAscan-SE-2.0/

Formatting database for mitochondrial contigs identification...
Running mitochondrial contigs identification step...

MitoFinder found a single mitochondrial contig
Checking resulting contig for circularization...

Evidences of circularization could not be found, but everyother step was successful

Creating summary statistics for the mtDNA contig
Traceback (most recent call last):
  File "/home/sallesmath/MitoFinder/mitofinder", line 1020, in <module>
    rename = Popen(args1, stdout=open(os.devnull, 'wb'))
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

It seems that something is causing an error in the annotation step. MitoFinder does find the final mtDNA contig, but there is only one file (.info) in the final results folder.

I'm not sure what the cause of this error is. Any suggestions?

Best regards,
Matheus.
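
When `Popen` raises `OSError: [Errno 2] No such file or directory`, it typically means that the program named in the first element of the argument list could not be found. The traceback does not show what `args1` contains, so the sketch below only illustrates the kind of check that can narrow this down. It is Python 3, and `describe_command` is a hypothetical helper rather than anything in MitoFinder.

```python
import os
import shutil


def describe_command(args):
    """Explain whether the program in args[0] can be launched."""
    program = args[0]
    resolved = shutil.which(program)
    if resolved is not None:
        return "ok: " + resolved
    if os.path.exists(program):
        return "not executable: " + program
    return "not found: " + program
```

Calling it on each path listed under "Program folders" in the log (joined with the expected binary name) would show which tool is missing or was not built during installation.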

run time for assembly step very long

Hi!

Thanks for putting this resource together! I'm hoping that you can help me with some issues I'm running into.

I'm trying to use MitoFinder to pull mitochondrial genomes from whole genome shotgun data of non-human primates. MitoFinder is running, but the assembly step is taking a very long time with megahit (> 18 hours with 8 threads and 64 GB of memory) and with metaspades it failed with the error message: "The reads contain too many k-mers to fit into available memory. You need approx. 166.78GB of free RAM to assemble your dataset."

This is much longer and much more memory than is suggested to be necessary in your paper. I'm using paired-end data with ~ 200 million reads total (~100 million per direction). Do I need to downsample my input data or is something else going on? If downsampling is the answer, how many reads would you recommend as input?
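
If downsampling turns out to be the way to go, the two read files have to be subsampled together so that mates stay paired. A minimal sketch follows, assuming plain 4-line FASTQ records in the same order in both files. The function names and the fixed seed are illustrative, and the sketch does not answer how many reads are appropriate.

```python
import random


def read_fastq(handle):
    """Yield FASTQ records as lists of four lines (assumes 4-line records)."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            return
        yield record


def subsample_pairs(in1, in2, out1, out2, fraction, seed=1):
    """Write a random fraction of read pairs, keeping mates in sync."""
    rng = random.Random(seed)
    kept = 0
    for rec1, rec2 in zip(read_fastq(in1), read_fastq(in2)):
        # One random draw per pair, so both mates are kept or dropped together.
        if rng.random() < fraction:
            out1.writelines(rec1)
            out2.writelines(rec2)
            kept += 1
    return kept
```

The four arguments are open text file handles, for example `subsample_pairs(open("R1.fastq"), open("R2.fastq"), open("R1.sub.fastq", "w"), open("R2.sub.fastq", "w"), 0.1)` to keep roughly 10% of the pairs (the file names here are placeholders).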

Java Error

Hi Remi,
I've been trying to run MitoFinder on a dataset of 34 whole genome files and I'm running into some issues. I'm running this on a cluster using LAUNCHER to run in parallel, and I seem to be alternating between two errors: 1.) ERROR: java is not installed/loaded. and 2.) ERROR: MEGAHIT didn't run well
Please check log file : /scratch/05104/thomm/22111Hil_N22089/22111Hil_H21189_S187_L004/megahit.log
Launcher: Job 11 completed in 597 seconds.
Launcher: Task 92 done. Exiting.

I assumed the second to be an issue with memory allocation, but I'm puzzled with the first error. I have java downloaded in my mitofinder environment. I've also tried loading the cluster's java module. The thing that really confuses me is that I am not getting this error consistently, but every time I think I might have the memory issue solved, I'm hit with the Java error again. Perhaps this is an issue I should take up with my cluster's IT department, but just wanted to see if you had any suggestions.

Thanks!
Thom

AttributeError: 'NoneType' object has no attribute 'extract'

Hi,

I've been running mitofinder on 294 samples, using our own assemblies (-a option) and the program has largely been successful. However, I have 4 samples that keep throwing "AttributeError: 'NoneType' object has no attribute 'extract'". I cannot seem to pinpoint what is specifically causing the issue, but am curious if there are suggestions on what I can try.

/apps/mitofinder/1.0.2/Bio/GenBank/__init__.py:1096: BiopythonParserWarning: Couldn't parse feature location: '-2..238'
  % (location_line)))
Traceback (most recent call last):
  File "/apps/mitofinder/1.0.2/mitofinder", line 1264, in <module>
    out_fasta.write(str(feature.extract(record).seq) + '\n')
  File "/apps/mitofinder/1.0.2/Bio/SeqFeature.py", line 339, in extract
    return self.location.extract(parent_sequence)
AttributeError: 'NoneType' object has no attribute 'extract'
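
The parser warning shows a feature location of '-2..238', which Biopython cannot parse. The feature is then left without a location, and the later `extract` call fails. A minimal sketch for finding such features in a GenBank file follows. It assumes the standard flat-file layout (five spaces, feature key, location) and only looks for coordinates that start with a minus sign. `negative_locations` is a hypothetical helper.

```python
import re

# A feature line in a GenBank flat file: five spaces, the feature key,
# padding, then the location (for example "CDS             -2..238").
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)\s*$")
# A minus sign at the start of the location or right after "(" or ",".
NEGATIVE_COORD = re.compile(r"(?:^|[(,])-\d")


def negative_locations(genbank_text):
    """Return (line_number, feature_key, location) for suspicious features."""
    hits = []
    for number, line in enumerate(genbank_text.splitlines(), start=1):
        match = FEATURE_LINE.match(line)
        if match and NEGATIVE_COORD.search(match.group(2)):
            hits.append((number, match.group(1), match.group(2)))
    return hits
```

Applied to the .gb files of the 4 failing samples, this would show which gene is annotated as starting before position 1 of its contig.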

Mitochondrial contigs were detected after blast, but no results were written in to <Sample_ID>_mitfi_Final_Results folder

Hi Remi,

Recently, the blast results (IDXX_blast_ouT.txt) showed that mitochondrial contigs were detected, but no results were written into the <Sample_ID>_mitfi_Final_Results folder.

We assumed that this was probably due to some threshold settings. Is there any way to change the settings, or are there any other possible reasons?

I have attached one example here.

73.txt

73.zip

Best wishes,
Menglin

Failure during checking for circularization

Hi, I have been running the pipeline and all works well until I receive the following error:

MitoFinder found a single mitochondrial contig
Checking resulting contig for circularization...
Traceback (most recent call last):
  File "/home/nikolasjohnston/MitoFinder/mitofinder", line 961, in <module>
    fourthStep = Popen(args1, stdout=subprocess.PIPE, stderr=open(os.devnull, 'wb')).communicate()[0]
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

I'm not sure what the cause of this error is. Any suggestions?

Cheers,
Nik

Annotation with MitFi fails (v1.4)

Hi Rémi, hi Frederic,

I thought it would be better to put this in a new issue. I downloaded version 1.4 and ran MitoFinder on the test_case files.

It looks like the annotation step fails, but I'm not sure why. I'm attaching the stdout file and the geneChecker.log.

The file geneChecker_error.log has the following message:

ln: failed to create symbolic link './cmsearch': File exists
Traceback (most recent call last):
  File "/home/mcj43/software/MitoFinder/geneChecker_fasta.py", line 452, in <module>
    tRNAs = assemblyCheck.tRNAs
AttributeError: 'bool' object has no attribute 'tRNAs'

Let me know if you need anything else!

Best,
Mareike

geneChecker.log.txt
test_MitoFinder.out.txt

problem_in_producing_the_[Seq_ID]_final_genes_NT.fasta

Hi,
Thanks a lot for this useful program. I have two questions:

  1. I am using MitoFinder for 40 samples. For 7 of them, MitoFinder does not produce the final result file "[Seq_ID]_final_genes_NT.fasta" and instead provides the file "[Seq_ID]_mtDNA_contig_genes_NT.fasta".
    In addition, the end of the log files for these 7 samples only mentions "… genes were found in mtDNA_contig" and does not give the contig names and so on.

Could you please let me know what the problem is and how I can solve it?

  2. At the end of the log file for some of my samples, I receive one warning message:
    "genes were found more than once suggesting either fragmentation, NUMT annotations, or potential contamination of your sequencing data.
    Different contigs may be part of different organisms."
    Some of them have high coverage, and I am sure they are not contaminated.

Can I ignore this warning message?

Please let me know if you have any ideas about these issues.

Cheers
Niloo

Using scafSeq output for Phyluce

Hello!

I'm a newbie in bioinformatics and have been recently tackling my way through phyluce
https://phyluce.readthedocs.io/en/latest/tutorial-one.html#finding-uce-loci

I'm trying to use the match contigs to probes function, which requires the --contigs input. I thought I could use the scafSeq files produced by MitoFinder (as suggested here!).

However, I keep getting the same error saying that it cannot create the database.
I'm wondering if you or anyone else has tried to use .scafseq files for UCE mining with phyluce. Any ideas would help!

P.S. I already tried renaming the input files and still get the same error.

script:

IN_DIR="/flash/BourguignonU/Nonno/Phyluce/scafseq"
PROBE="/flash/BourguignonU/Nonno/Phyluce/uce-loci/termite-master.fasta"
phyluce_assembly_match_contigs_to_probes \
        --contigs ${IN_DIR} \
        --probes $PROBE \
        --output uce-search-results \
        --log-path log

error:
2020-11-06 16:04:13,217 - phyluce_assembly_match_contigs_to_probes - CRITICAL - Database already exists
2020-11-06 16:04:13,217 - phyluce_assembly_match_contigs_to_probes - CRITICAL - Cannot create database

Problem running the program

Hello, I tried to run MitoFinder again, but I get the following problem:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/mitofinder2/bin/MitoFinder/mitofinder", line 1209, in <module>
    rename = Popen(args1, stdout=open(os.devnull, 'wb'))
  File "/opt/anaconda3/envs/mitofinder2/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/opt/anaconda3/envs/mitofinder2/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

The program seems to find contigs but doesn't finish.

I hope you can help me with this. I ran MitoFinder last year without problems.

Best

MitoFinder not installing in mac. Kindly help!

configure: Configuring Infernal for your system.
checking build system type... configure: error: /bin/sh ./config.sub -apple-darwin22.3.0 failed
configure: WARNING: cache variable ac_cv_build contains a newline
make: *** No targets specified and no makefile found. Stop.
make: *** No rule to make target `install'. Stop.

Evidences of circularization could not be found, but everyother step was successful

Hello,
I have been trying to assemble a full mitogenome with MitoFinder using megahit. I have been able to assemble a contig of ~16,500 bp, which should be quite close to the size of the genome. However, it seems that I am falling short at the last step, as it does not find "Evidences of circularization"... Could you give me a hint or recommendation on how I should proceed to generate a whole mitogenome for which circularization is detected? Please bear in mind that I do not have a reference genome available for the species I am dealing with. Some details regarding the input: 33 million paired-end reads with an insert size of ~350 bp, trimmed and quality-filtered with AdapterRemoval. I have also attached the log file.

PRN_megahit_MitoFinder.log
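
One way to look at a contig by hand is to check whether its end repeats its beginning, which is what an assembler tends to produce when it walks once around a circular molecule. The sketch below is a simple exact-match heuristic written for illustration; it is not the method used by MitoFinder's circularizationCheck.py, and `terminal_overlap` and the 20 bp default are assumptions.

```python
def terminal_overlap(sequence, min_overlap=20):
    """Return the length of the longest suffix that equals a prefix.

    Only overlaps of at least min_overlap and at most half the sequence
    are considered. Returns 0 when no such overlap exists.
    """
    sequence = sequence.upper()
    for length in range(len(sequence) // 2, min_overlap - 1, -1):
        if sequence[:length] == sequence[-length:]:
            return length
    return 0
```

A result of 0 for a ~16,500 bp contig does not by itself mean the mitogenome is incomplete. It only means the two ends do not share an exact repeat of the tested length, and sequencing errors or a missing control-region fragment would both give the same outcome.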

Annotating error

Error when I run: mitofinder -j Phidippus_adonis_DGF0948 -a Phidippus_adonis_DGF0948.contigs.fasta -r Habronattus_oregonensis_genome.gb -o 5 -p 4 -m 8

Any advice?

Traceback (most recent call last):
  File "/opt/anaconda3/envs/mitofinder/bin/mitofinder", line 1139, in <module>
    if check_if_string_in_file(pathtowork + "/geneChecker.log","MiTFi failed.") or (os.stat(pathOfFinalResults+"MiTFi.log").st_size != 0 and not check_if_string_in_file(pathOfFinalResults+"MiTFi.log","hits")) :
OSError: [Errno 2] No such file or directory: '/Users/luiscarloshernandez/Desktop/UCEs-Phidippus/Analysis/Mitofinder analisis/Phidippus_adonis_DGF0948/Phidippus_adonis_DGF0948_MitoFinder_mitfi_Final_Results/MiTFi.log'
(mitofinder) luiscarloshernandez@MacBook-Air-de-Luis Mitofinder analisis % mitofinder -j Phidippus_adonis_DGF0948_CO1 -a Phidippus_adonis_DGF0948.contigs.fasta -r COI_reference_Phidippus.gb -o 5 -p 4 -m 8

Thanks a lot

Not finding the mito in assembled CLR data

Hello,

I tried to find the mitochondrial sequences in my assembly.
I assembled CLR PacBio reads using CANU for a mollusc genome.
As a reference, I used 2 mitochondrial genomes of the same species, with genetic code 5 (Mito Invertebrates), but MitoFinder said that it could not find any mitochondrial sequence in contigs shorter than 25,000 bp.

Does anyone know what is happening? Is it better to use MitoFinder on the reads in this particular case?

Thanks for your help

Aurélien

tRNAs and rRNAs not printed in final reports

Dear Remi,

We just tried MitoFinder but encountered a problem. Everything seems to run correctly, and tRNA annotation with Arwen ran well, as you can see below:

Starting Assembly step with MetaSPAdes 
Result files will be saved here: 
/work/BourguignonU/Menglin/Phylogeny/Neotropical_termite/NovoGen/Mitofinder/1197/1197_metaspades/
Formatting database for mitochondrial contigs identification...
Running mitochondrial contigs identification step...

MitoFinder found a single mitochondrial contig
Checking resulting contig for circularization...

Evidences of circularization could not be found, but everyother step was successful
Creating summary statistics for the mtDNA contig

Annotating

tRNA annotation with Arwen run well.

Annotation completed
Creating GFF and fasta files.

Note: 
15 genes were found in mtDNA_contig

However, the sequences are not printed out in the final report (see gb file attachment).

Any idea of what is going wrong?

Thanks in advance for looking into it,
Simon

1197_mtDNA_contig.gb.txt

cElementTree.ParseError

Hi Rémi,

I have been annotating a lot of samples recently with MitoFinder v1.4, but one consistently fails for no obvious reasons.

Here is what the log says:

Command line: /apps/unit/BourguignonU/MitoFinder/1.4/mitofinder --seqid Glossotermes_M2_ID283_S26_L002 --assembly M2_ID283_S26_L002_spades_scaffolds.fasta --tRNA-annotation mitfi --processors 20 --organism 5 --refseq sequence.gb

Start time : 2022-10-13 13:08:37
Job name = M2_ID283_S26_L002

Creating Output directory : M2_ID283_S26_L002
All results will be written here
Program folders:
MEGAHIT = /apps/unit/BourguignonU/MitoFinder/1.4/megahit/
Blast folder = /apps/unit/BourguignonU/MitoFinder/1.4/blast/bin/
IDBA-UD folder = /apps/unit/BourguignonU/MitoFinder/1.4/idba/bin/
MetaSPAdes folder = /apps/unit/BourguignonU/MitoFinder/1.4/metaspades/bin/
ARWEN folder = /apps/unit/BourguignonU/MitoFinder/1.4/arwen/
MiTFi folder = /apps/unit/BourguignonU/MitoFinder/1.4/mitfi/
tRNAscan-SE folder = /apps/unit/BourguignonU/MitoFinder/1.4/trnascanSE/tRNAscan-SE-2.0/

Formatting database for mitochondrial contigs identification...
Running mitochondrial contigs identification step...

MitoFinder found 176 contigs matching provided mitochondrial reference(s)
Did not check for circularization

.........
.........
.........

Creating summary statistics for mtDNA contig 65
Looking for best reference genes for mtDNA contig 65
Annotating mtDNA contig 65
tRNA annotation with MitFi run well.
Annotation completed

Creating summary statistics for mtDNA contig 66
Looking for best reference genes for mtDNA contig 66
Annotating mtDNA contig 66
tRNA annotation with MitFi run well.
ERROR: Gene annotation failed for mtDNA contig 66.
Please check  M2_ID283_S26_L002/geneChecker_error.log to see what happened
Aborting

And here is what is found in M2_ID283_S26_L002/geneChecker_error.log:

Traceback (most recent call last):
  File "/apps/unit/BourguignonU/MitoFinder/1.4/geneChecker_fasta.py", line 445, in <module>
    x = geneCheck(fastaReference, resultFile, percent_equality_prot, percent_equality_nucl, True, blastFolder, organismType, alignCutOff)
  File "/apps/unit/BourguignonU/MitoFinder/1.4/geneChecker_fasta.py", line 263, in geneCheck
    for qresult in blastparse: #in each query, let's look for a good hit
  File "/hpcshare/appsunit/BourguignonU/MitoFinder/1.4/Bio/SearchIO/__init__.py", line 314, in parse
    generator = iterator(source_file, **kwargs)
  File "/hpcshare/appsunit/BourguignonU/MitoFinder/1.4/Bio/SearchIO/BlastIO/blast_xml.py", line 190, in __init__
    self._meta, self._fallback = self._parse_preamble()
  File "/hpcshare/appsunit/BourguignonU/MitoFinder/1.4/Bio/SearchIO/BlastIO/blast_xml.py", line 204, in _parse_preamble
    for event, elem in self.xml_iter:
  File "<string>", line 107, in next
cElementTree.ParseError: no element found: line 1, column 0

I checked the contig #66 but could not find any issue with it.

Any idea what is happening here?

Thanks in advance for your reply.

Cheers,
Simon
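
The message "no element found: line 1, column 0" is what the XML parser reports when the file it is given is empty, so the BLAST XML output for contig 66 was probably written with no content. The log does not show the name of the file that geneChecker_fasta.py parses, so the sketch below takes the path as an argument. `check_xml` is a hypothetical helper.

```python
import os
import xml.etree.ElementTree as ET


def check_xml(path):
    """Classify an XML file as 'missing', 'empty', 'malformed', or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    try:
        ET.parse(path)
    except ET.ParseError:
        return "malformed"
    return "ok"
```

An "empty" result would point at the BLAST step itself (for example a crash or an interrupted write) rather than at the contig sequence.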

MEGAHIT didn't run well - OSError: [Errno 2] No such file or directory: /tmp/k29/29.sdbg.24

Hello! I am running MitoFinder v1.4.1 installed with conda on a linux server and my job is failing during MEGAHIT. My command:

module load anaconda3/2021.05
source activate mitofinder 
cd /home/mdebiasse/borgstore/dlab/01-CCGP/03-SCARIOSUS/08-MITOFINDER
/home/mdebiasse/.conda/envs/mitofinder/bin/mitofinder -j semi_cari -1 /home/mdebiasse/borgstore/dlab/01-CCGP/01-DATA/18-SCARI_OMNIC/ftps.fulgentgenetics.com/SE8481/SA190613_14_trim_1.fastq -2 /home/mdebiasse/borgstore/dlab/01-CCGP/01-DATA/18-SCARI_OMNIC/ftps.fulgentgenetics.com/SE8481/SA190613_14_trim_2.fastq -r NC_039849.gb -o 5 -p 28 -m 240

The output file says MEGAHIT didnt run well. The megahit.log states:

2023-05-10 13:33:21 - MEGAHIT v1.2.9
2023-05-10 13:33:21 - Using megahit_core with POPCNT and BMI2 support
2023-05-10 13:33:22 - Convert reads to binary library
2023-05-10 13:44:53 - INFO  sequence/io/sequence_lib.cpp  :   75 - Lib 0 (/home/mdebiasse/borgstore/dlab/01-CCGP/01-DATA/18-SCARI_OMNIC/ftps.fulgentgenetics.com/SE8481/SA190613_14_trim_1.fastq,/home/mdebiasse/borgstore/dlab/01-CCGP/01-DATA/18-SCARI_OMNIC/ftps.fulgentgenetics.com/SE8481/SA190613_14_trim_2.fastq): pe, 740058186 reads, 151 max length
2023-05-10 13:44:53 - INFO  utils/utils.h                 :  152 - Real: 691.9458	user: 510.5266	sys: 61.5730	maxrss: 250200
2023-05-10 13:44:53 - k-max reset to: 141
2023-05-10 13:44:53 - Start assembly. Number of CPU threads 28
2023-05-10 13:44:53 - k list: 21,29,39,59,79,99,119,141
2023-05-10 13:44:53 - Memory used: 240000000000
2023-05-10 13:44:53 - Extract solid (k+1)-mers for k = 21
2023-05-10 14:03:56 - Build graph for k = 21
2023-05-10 14:07:56 - Assemble contigs from SdBG for k = 21
2023-05-10 15:25:26 - Local assembly for k = 21
2023-05-10 15:42:31 - Extract iterative edges from k = 21 to 29
2023-05-10 16:06:34 - Build graph for k = 29
2023-05-10 16:09:57 - Assemble contigs from SdBG for k = 29
Traceback (most recent call last):
  File "/home/mdebiasse/.conda/envs/mitofinder/bin/megahit", line 1038, in <module>
    main()
  File "/home/mdebiasse/.conda/envs/mitofinder/bin/megahit", line 1015, in main
    assemble(next_k)
  File "/home/mdebiasse/.conda/envs/mitofinder/bin/megahit", line 264, in checked_or_call
    func(*args, **kwargs)
  File "/home/mdebiasse/.conda/envs/mitofinder/bin/megahit", line 904, in assemble
    remove_temp_after_assemble(cur_k)
  File "/home/mdebiasse/.conda/envs/mitofinder/bin/megahit", line 656, in remove_temp_after_assemble
    remove_if_exists(graph_prefix(kmer_k) + '.sdbg.' + str(i))
  File "/home/mdebiasse/.conda/envs/mitofinder/bin/megahit", line 127, in remove_if_exists
    os.remove(file_name)
OSError: [Errno 2] No such file or directory: '/mnt/borgstore/mdawson/dlab/01-CCGP/03-SCARIOSUS/08-MITOFINDER/semi_cari/semi_cari_megahit/tmp/k29/29.sdbg.24'

When I cd into /tmp/k29/ I see these files:

29.edges.info  29.sdbg.25  29.sdbg.26  29.sdbg.27  29.sdbg_info

I can provide the record of the conda install if that is useful. Any advice to correct this error is appreciated! Thank you!

"gene not recognized"

Hi Remi,

I am having more trouble with gene name errors in v 1.4 than in 1.2. Clearly, the gene COX2 is in the ref list. Any ideas? I could use --new-genes or --ignore, but I am wondering if I am missing something.
Thanks!
Andrea

ERROR: Gene named "COX2" in the reference file(s) are not recognized by MitoFinder
This gene is not a standard mitochondrial gene (use --ignore or --new-genes options) or please change it to one of the following gene names:
COX1; COX2; COX3; CYTB; ND1; ND2; ND3; ND4; ND4L; ND5; ND6; ATP8; ATP6; rrnL; rrnS
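
One thing that can be ruled out cheaply is a gene name that looks like COX2 but is not an exact match, for example because of a trailing space or different capitalisation. Whether that is the cause here is only an assumption. The sketch below compares names against the list printed in the error message and shows the non-matching ones with `repr()` so that hidden characters become visible. `unrecognized` is a hypothetical helper.

```python
# Gene names listed in the error message above.
STANDARD_GENES = {
    "COX1", "COX2", "COX3", "CYTB", "ND1", "ND2", "ND3", "ND4",
    "ND4L", "ND5", "ND6", "ATP8", "ATP6", "rrnL", "rrnS",
}


def unrecognized(names):
    """Return repr() of each name that is not an exact match."""
    return [repr(name) for name in names if name not in STANDARD_GENES]
```

For example, `unrecognized(["COX2", "COX2 ", "cox2"])` flags the second and third names but not the first.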

Varying Lengths and Missing Genes in Outputs

I am trying to assemble whole mitogenomes in MitoFinder using MetaSPAdes as the assembler. I have clean paired-end UCE data for 4 octocoral species (90 specimens per species), and I am using a GenBank reference genome (18,947 bp) from one of these species, which is closely related to the other 3. However, my MitoFinder outputs vary in length. Roughly half of my assemblies do not contain all 16 genes, and the other half, which do have every gene accounted for, vary in length: some are ~24,000 bp long whereas others are ~15,000 bp, and I expected them to be similar in length to the reference.

The main thing I'm unsure about is why my assembly outputs vary so much in length. I want to create whole genomes for each specimen. For the assemblies > 20,000 bp, how can I tell which data are additional (potentially duplicated), so that I can remove them and create reliable whole mitogenomes? For the genomes that are smaller than the reference genome, what could explain the information missing from the final outputs?

I can upload any files if that would help but I would really appreciate feedback on what could be causing these issues in my outputs as I am confused and not sure why I’m seeing this.

Thank you very much!
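
With 90 specimens per species, a quick table of contig lengths relative to the 18,947 bp reference can help sort assemblies into "too long", "too short" and "about right" before looking at individual cases. A minimal sketch follows, assuming the contigs are available as FASTA text. Both function names are illustrative.

```python
def fasta_lengths(text):
    """Return {sequence_id: length} for FASTA-formatted text."""
    lengths = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            lengths[current] = 0
        elif line and current is not None:
            lengths[current] += len(line)
    return lengths


def compare_to_reference(lengths, reference_length):
    """Return (sequence_id, length, length / reference_length) rows."""
    return [
        (name, size, round(size / reference_length, 2))
        for name, size in sorted(lengths.items())
    ]
```

Ratios well above 1 flag contigs worth checking for repeated ends or extra sequence, and ratios well below 1 flag fragmented or partial assemblies.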

subprocess.py

Hi Remi,

Have you seen this error before? I checked my /usr/lib/python2.7 directory and that subprocess.py file is present.

Cheers,
Andrea

Formatting database for mitochondrial contigs identification...
Running mitochondrial contigs identification step...

MitoFinder found 3 contigs matching provided mitochondrial reference(s)
Did not check for circularization

Creating summary statistics for mtDNA contig 1

Traceback (most recent call last):
  File "/data/mcfadden/aquattrini/PROGRAMS/MitoFinder/mitofinder", line 1209, in <module>
    rename = Popen(args1, stdout=open(os.devnull, 'wb'))
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Missing [Seq_ID]_final_genes_NT.fasta file in final results

Dear Rémi,

These are rather general questions, but I think this is still the best channel to ask them since you were very helpful with a previous inquiry of mine.
I used the following array script to run MitoFinder with Megahit. I had done it before with a very similar script and in the end I got a [Seq_ID]_final_genes_NT.fasta file containing all genes found. Nonetheless, this time I am only getting .gb and .fasta files for the contigs. Is there any particular reason why this might be happening?
Also, do you have any recommendations about how to do a "scaffolding" with the assembled contigs? For some samples there is only a single contig of around 15,500 bp (which is expected for my species), but sometimes there are up to 6 contigs.

The script is as follows:

#!/bin/bash

#PBS -N mitofinder_ARRAY_SPopGen_2023
#PBS -l select=1:ncpus=8:mem=50gb:scratch_local=300gb
#PBS -l walltime=47:59:00
#PBS -o /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/PBS/out/^array_index^_mitofinder_ARRAY_SPopGen.txt
#PBS -e /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/PBS/err/^array_index^_mitofinder_ARRAY_SPopGen.txt
#PBS -J 1-88

#clean scratch after the end
trap 'clean_scratch' TERM EXIT

# go to scratch directory
cd $SCRATCHDIR || exit 1

export TMPDIR=$SCRATCHDIR

source /storage/brno2/home/pedroribeiro/.bashrc

forward=`sed -n -e "$PBS_ARRAY_INDEX p" /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/forward.list`
reverse=`sed -n -e "$PBS_ARRAY_INDEX p" /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/reverse.list`
seqid=`sed -n -e "$PBS_ARRAY_INDEX p" /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/seqid.list`

cp /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/$forward.fastq $SCRATCHDIR
cp /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/$reverse.fastq $SCRATCHDIR

singularity run -B $SCRATCHDIR  /storage/brno2/home/pedroribeiro/software/mitofinder_latest.sif \
 --megahit -j $seqid.mitogenome \
 -1 /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/$forward.fastq \
 -2 /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/raw_reads/$reverse.fastq \
 -r /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/mito_reference/mito_reference.gb \
 -o 5 \
 -p 8 \
 -m 50

cp -r *.mitogenome* /storage/brno2/home/pedroribeiro/Projects/Spicauda_PopGen/out_mitogenes_for_iqtree/

Thank you very much!

All the best,

Pedro Ribeiro
