Comments (4)

fasnicar commented on September 18, 2024

Hi!

  1. The scripts in the examples folders include the .py extension because they assume you got PhyloPhlAn by cloning the repo and not through the conda package (which does not provide the examples).

  2. Apologies. The script had not been updated to match the latest version of the phylophlan_metagenomic.py script. It is now updated in the repo (136f2f1). Basically, -d is now a required parameter, to ensure that different database versions are not used to process batches of samples from the same project.

  3. This shouldn't happen if you already have the phylophlan_metagenomic.txt file. Is the file in the correct folder?
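
To illustrate point 2, here is a minimal sketch of how a required -d option behaves with argparse. The parser, its prog name, and the help strings are hypothetical and only mirror the -i and -d options seen in this thread; this is not the actual phylophlan_metagenomic.py code.

```python
import argparse


def build_parser():
    # Hypothetical parser for illustration only; it mirrors the -i and -d
    # options mentioned in this thread, not the real PhyloPhlAn CLI.
    parser = argparse.ArgumentParser(prog="metagenomic_sketch")
    parser.add_argument("-i", "--input", required=True,
                        help="folder containing the input bins")
    parser.add_argument("-d", "--database", required=True,
                        help="database name, e.g. SGB.Jul20")
    return parser


if __name__ == "__main__":
    # With -d present the arguments parse normally.
    args = build_parser().parse_args(["-i", "bins", "-d", "SGB.Jul20"])
    print(args.database)
```

With required=True, omitting -d makes argparse print a usage error and exit instead of silently falling back to a default database.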

from phylophlan.

Hocnonsense commented on September 18, 2024

As we all know (laugh cry), we cannot download from Dropbox directly.

My bash history looks like this:

(phylophlan) [clsxx@cas556 ~/Work/2020-09-MgAffect/Analyze/phylophlan]$phylophlan \
>     -i ${bin_dir} \
>     -d phylophlan --diversity high -f supertree_aa.cfg \
>     --genome_extension .fa \
>     --nproc 4 \
>     --maas phylophlan_substitution_models/phylophlan.tsv \
>     --verbose\
>     2>&1 | tee tmp2.log
PhyloPhlAn version 3.0.59 (10 November 2020)

Command line: /lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/phylophlan -i /lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect//F-06-MAG/03_modify/7_final/ -d phylophlan --diversity high -f supertree_aa.cfg --genome_extension .fa --nproc 4 --maas phylophlan_substitution_models/phylophlan.tsv --verbose

Automatically setting "input=7_final" and "input_folder=/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify"
[e] "/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/lib/python3.9/site-packages/PhyloPhlAn-3.0.1-py3.9.egg/phylophlan/phylophlan_configs/" folder does not exists
Creating folder "7_final_phylophlan"
Creating folder "7_final_phylophlan/tmp"
"high-accurate" preset
Setting "sort=True" because "database=phylophlan"
Setting "min_num_markers=100" since no value has been specified and the "database=phylophlan"     
Arguments: {'input': '7_final', 'clean': None, 'output': '7_final_phylophlan', 'database': 'phylophlan', 'db_type': None, 'config_file': 'supertree_aa.cfg', 'diversity': 'high', 'accurate': True, 
'fast': False, 'clean_all': False, 'database_list': False, 'submat': 'pfasum60', 'submat_list': False, 'submod_list': False, 'nproc': 4, 'min_num_proteins': 1, 'min_len_protein': 50, 'min_num_markers': 100, 'trim': 'greedy', 'gap_perc_threshold': 0.67, 'not_variant_threshold': 0.95, 'subsample': <function twentyfive at 0x2b6080061f70>, 'unknown_fraction': 0.3, 'scoring_function': <function trident at 0x2b60800621f0>, 'sort': True, 'remove_fragmentary_entries': False, 'fragmentary_threshold': 0.75, 'min_num_entries': 4, 'maas': 'phylophlan_substitution_models/phylophlan.tsv', 'remove_only_gaps_entries': False, 'mutation_rates': False, 'force_nucleotides': False, 'input_folder': 
'/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final', 'data_folder': '7_final_phylophlan/tmp', 'databases_folder': 'phylophlan_databases/', 'submat_folder': '/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/lib/python3.9/site-packages/PhyloPhlAn-3.0.1-py3.9.egg/phylophlan/phylophlan_substitution_matrices/', 'submod_folder': 'phylophlan_substitution_models/', 'configs_folder': None, 'output_folder': '', 'genome_extension': '.fa', 'proteome_extension': '.faa', 'update': False, 'verbose': True}
Loading configuration file "supertree_aa.cfg"
Checking configuration file
Checking "/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/diamond"
Checking "/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/mafft"
Checking "/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/trimal"
Checking "/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/FastTree"
Checking "/lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/raxmlHPC"
Checking "java"
File "phylophlan_databases/phylophlan_databases.txt" present
Downloading "https://zenodo.org/record/4005620/files/phylophlan.tar?download=1" to "phylophlan_databases/phylophlan.tar"
Downloading file of size: 64.05 MB
^C03 MB 4.73 %   0.37 MB/sec  2 min 47 sec
(phylophlan) [clsxx@cas556 ~/Work/2020-09-MgAffect/Analyze/phylophlan]$ls -l phylophlan_databases/
total 3112
-rw-rw-r-- 1 clsxx clsxx     323 Nov 13 16:47 phylophlan_databases.txt  
-rw-rw-r-- 1 clsxx clsxx    3027 Nov 13 16:47 phylophlan_metagenomic.txt
-rw-rw-r-- 1 clsxx clsxx 3178496 Nov 13 16:51 phylophlan.tar
(phylophlan) [clsxx@cas556 ~/Work/2020-09-MgAffect/Analyze/phylophlan]$phylophlan_metagenomic -i ${bin_dir} -d SGB.Jul20 --database_folder ~/software/phylophlan/phylophlan_databases --nproc 4 --verbose 2>&1 | tee tmp1.log
phylophlan_metagenomic.py version 3.0.34 (18 August 2020)

Command line: /lustre/home/acct-clsxx/clsxx/software/anaconda3/envs/phylophlan/bin/phylophlan_metagenomic -i /lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect//F-06-MAG/03_modify/7_final/ -d SGB.Jul20 --database_folder /lustre/home/acct-clsxx/clsxx/software/phylophlan/phylophlan_databases --nproc 4 --verbose

Setting --database_folder to "/lustre/home/acct-clsxx/clsxx/software/phylophlan/phylophlan_databases"
Setting input extension to ".fa"
Setting output prefix to "/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final"
Output prefix is a folder, setting it to "/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final/7_final"
Folder "/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final/7_final_sketches" already present
Folder "/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final/7_final_sketches/inputs" already present
Folder "/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final/7_final_dists" already present

Arguments: {'input': '/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect//F-06-MAG/03_modify/7_final/', 'output_prefix': '/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final/7_final', 'database': 'SGB.Jul20', 'database_list': False, 'database_update': False, 'input_extension': '.fa', 'how_many': 10, 'nproc': 4, 'database_folder': '/lustre/home/acct-clsxx/clsxx/software/phylophlan/phylophlan_databases', 'only_input': False, 'add_ggb': False, 'add_fgb': False, 'overwrite': False, 'verbose': True, 'mapping': 'SGB.Jul20.txt.bz2'}

Checking "mash"
Downloading "https://www.dropbox.com/s/xdqm836d2w22npb/phylophlan_metagenomic.txt?dl=1" to "phylophlan_metagenomic.txt"
[e] unable to download "https://www.dropbox.com/s/xdqm836d2w22npb/phylophlan_metagenomic.txt?dl=1"
(phylophlan) [clsxx@cas556 ~/Work/2020-09-MgAffect/Analyze/phylophlan]$

fasnicar commented on September 18, 2024

Hi and thanks for the log.

As you can see, phylophlan_metagenomic.py set database_folder to the path /lustre/home/acct-clsxx/clsxx/software/phylophlan/phylophlan_databases.

Arguments: {'input': '/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect//F-06-MAG/03_modify/7_final/', 'output_prefix': '/lustre/home/acct-clsxx/clsxx/Work/2020-09-MgAffect/F-06-MAG/03_modify/7_final/7_final', 'database': 'SGB.Jul20', 'database_list': False, 'database_update': False, 'input_extension': '.fa', 'how_many': 10, 'nproc': 4, 'database_folder': '/lustre/home/acct-clsxx/clsxx/software/phylophlan/phylophlan_databases', 'only_input': False, 'add_ggb': False, 'add_fgb': False, 'overwrite': False, 'verbose': True, 'mapping': 'SGB.Jul20.txt.bz2'}

That is not the location where you downloaded the phylophlan_metagenomic.txt file, which appears to be ~/Work/2020-09-MgAffect/Analyze/phylophlan/phylophlan_databases/.

So, when running phylophlan_metagenomic.py, you can use the --database_folder param to specify the path to the folder containing the phylophlan_metagenomic.txt file you downloaded, and this should solve the download issue.

Many thanks,
Francesco
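
Before re-running with a different --database_folder, it can help to check which candidate folder actually holds the file. The sketch below is a hypothetical helper (has_mapping_file is not part of PhyloPhlAn); the two folder paths are the ones mentioned in this thread.

```python
import os


def has_mapping_file(database_folder, filename="phylophlan_metagenomic.txt"):
    """Return True if `filename` exists directly inside `database_folder`."""
    folder = os.path.expanduser(database_folder)  # expand a leading "~"
    return os.path.isfile(os.path.join(folder, filename))


if __name__ == "__main__":
    # Candidate folders taken from the log and the reply above.
    candidates = [
        "~/software/phylophlan/phylophlan_databases",
        "~/Work/2020-09-MgAffect/Analyze/phylophlan/phylophlan_databases",
    ]
    for folder in candidates:
        print(folder, has_mapping_file(folder))
```

The folder for which this prints True is the one to pass to --database_folder.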

Hocnonsense commented on September 18, 2024

I have now found the bug.
In phylophlan.py, you modified database_download:

database_download = os.path.join(args.databases_folder, os.path.basename(DATABASE_DOWNLOAD_URL).replace('?dl=1', ''))
download(DATABASE_DOWNLOAD_URL, database_download, overwrite=args.update, verbose=args.verbose)

However, in phylophlan_metagenomic.py, the file is still downloaded to the current working directory:
sgbs_url = os.path.basename(DOWNLOAD_URL).replace('?dl=1', '')
download(DOWNLOAD_URL, sgbs_url, verbose=args.verbose)
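
One possible way to bring the two scripts in line is to build the destination path inside the database folder, as phylophlan.py does. The sketch below only computes that path; mapping_destination is a hypothetical helper, and the URL is copied from the log above on the assumption that it matches DOWNLOAD_URL. It is not the fix actually committed to the repo.

```python
import os

# URL copied from the log earlier in this thread; assumed to match DOWNLOAD_URL.
DOWNLOAD_URL = "https://www.dropbox.com/s/xdqm836d2w22npb/phylophlan_metagenomic.txt?dl=1"


def mapping_destination(database_folder, url=DOWNLOAD_URL):
    """Build the local path for the mapping file inside `database_folder`."""
    # Same filename derivation as the snippets quoted above.
    filename = os.path.basename(url).replace("?dl=1", "")
    # Join with the database folder instead of relying on the current directory.
    return os.path.join(database_folder, filename)


if __name__ == "__main__":
    print(mapping_destination("phylophlan_databases"))
```

The returned path could then be passed as the second argument to download(), so that an already-downloaded phylophlan_metagenomic.txt in the database folder is found regardless of where the command is run.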
