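I get an error when trying to build a KrakenUniq database with flextaxd-create: krakenuniq-build rejects an option ("Unknown option: skip-maps") and prints its usage text, yet FlexTaxD still logs "krakenuniq database created". Have you encountered this error before? The command I ran and its output follow.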
flextaxd-create -db databases/NCBI_GTDB_merge.db -o taxonomy_krakenuniq --genomes_path "/shared-nfs/MEN/silico_reads/gtdb_202/krakenuniq_gtdb_202_no_dust/library/" -p 30 --verbose --logs build_kraken_logs --create_db --db_name krakenuniqdb_test --dbprogram krakenuniq --test
2021-12-03 09:51:24,361 create_databases [INFO ] FlexTaxD-create logging initiated!
2021-12-03 09:51:24,366 create_databases [INFO ] Processing files; create kraken seq.map
2021-12-03 09:51:24,367 DatabaseConnection [INFO ] databases/NCBI_GTDB_merge.db opened successfully.
2021-12-03 09:51:24,693 ProcessDirectory [INFO ] Number of genomes annotated in database 265625
2021-12-03 09:51:24,693 ProcessDirectory [INFO ] Process genome path (/shared-nfs/MEN/silico_reads/gtdb_202/krakenuniq_gtdb_202_no_dust/library/)
2021-12-03 09:51:25,310 ProcessDirectory [INFO ] Processed 57311 genomes
2021-12-03 09:51:25,539 create_databases [INFO ] Genome annotations with no matching source: 220070
2021-12-03 09:51:25,798 create_databases [INFO ] Loading module: CreateKrakenDatabase
2021-12-03 09:51:25,812 create_databases [INFO ] Get genomes from input directory!
2021-12-03 09:51:25,812 DatabaseConnection [INFO ] databases/NCBI_GTDB_merge.db opened successfully.
2021-12-03 09:51:26,133 CreateKrakenDatabase [INFO ] krakenuniqdb_test
2021-12-03 09:51:26,192 create_databases [INFO ] --- process finished in 0 minutes 1.8346500396728516 seconds---
2021-12-03 09:51:26,192 CreateKrakenDatabase [INFO ] Test use only 10 genomes
2021-12-03 09:51:26,195 CreateKrakenDatabase [INFO ] Create library directory
2021-12-03 09:51:26,199 CreateKrakenDatabase [INFO ] Processing files; create kraken seq.map
2021-12-03 09:51:27,360 CreateKrakenDatabase [INFO ] Number of genomes succesfully added to the krakenuniq database: 10
2021-12-03 09:51:27,360 create_databases [INFO ] Genome folder preprocessing completed!
2021-12-03 09:51:27,360 create_databases [INFO ] --- process finished in 0 minutes 3.002981424331665 seconds---
2021-12-03 09:51:27,360 create_databases [INFO ] Create database
2021-12-03 09:51:27,360 CreateKrakenDatabase [INFO ] mkdir -p krakenuniqdb_test/taxonomy
2021-12-03 09:51:27,397 CreateKrakenDatabase [INFO ] cp taxonomy_krakenuniq/*.dmp krakenuniqdb_test/taxonomy
2021-12-03 09:51:27,796 CreateKrakenDatabase [INFO ] cp taxonomy_krakenuniq/*.map krakenuniqdb_test
2021-12-03 09:51:27,796 CreateKrakenDatabase [INFO ] krakenuniq-build --build --db krakenuniqdb_test --threads 30
Unknown option: skip-maps
Usage: krakenuniq-build [task option] [options]
Task options (exactly one can be selected -- default is build):
--download-taxonomy Download NCBI taxonomic information
--download-library TYPE Download partial library (TYPE = one of "refseq/bacteria", "refseq/archaea", "refseq/viral").
Use krakenuniq-download for more options.
--add-to-library FILE Add FILE to library
--build Create DB from library (requires taxonomy d/l'ed and at
least one file in library)
--rebuild Create DB from library like --build, but remove
existing non-library/taxonomy files before build
--clean Remove unneeded files from a built database
--shrink NEW_CT Shrink an existing DB to have only NEW_CT k-mers
--standard Download and create default database, which contains complete genomes
for archaea, bacteria and viruses from RefSeq, as well as viral strains
from NCBI. Specify --taxids-for-genomes and --taxids-for-sequences
separately, if desired.
--help Print this message
--version Print version information
Options:
--db DBDIR Kraken DB directory (mandatory except for --help/--version)
--threads # Number of threads (def: 1)
--new-db NAME New Kraken DB name (shrink task only; mandatory
for shrink task)
--kmer-len NUM K-mer length in bp (build/shrink tasks only;
def: 31)
--minimizer-len NUM Minimizer length in bp (build/shrink tasks only;
def: 15)
--jellyfish-hash-size STR Pass a specific hash size argument to jellyfish
when building database (build task only)
--jellyfish-bin STR Use STR as Jellyfish 1 binary.
--max-db-size SIZE Shrink the DB before full build, making sure
database and index together use <= SIZE gigabytes
(build task only)
--shrink-block-offset NUM When shrinking, select the k-mer that is NUM
positions from the end of a block of k-mers
(default: 1)
--work-on-disk Perform most operations on disk rather than in
RAM (will slow down build in most cases)
--taxids-for-genomes Add taxonomy IDs (starting with 1 billion) for genomes.
Only works with 3-column seqid2taxid map with third
column being the name
--taxids-for-sequences Add taxonomy IDs for sequences, starting with 1 billion.
Can be useful to resolve classifications with multiple genomes
for one taxonomy ID.
--min-contig-size NUM Minimum contig size for inclusion in database.
Use with draft genomes to reduce contamination, e.g. with values between 1000 and 10000.
--library-dir DIR Use DIR for reference sequences instead of DBDIR/library.
--taxonomy-dir DIR Use DIR for taxonomy instead of DBDIR/taxonomy.
Experimental:
--uid-database Build a UID database (default no)
--lca-database Build a LCA database (default yes)
--no-lca-database Do not build a LCA database
--lca-order DIR1 Impose a hierarchical order for setting LCAs.
--lca-order DIR2 The directories must be specified relative to the libary directory
... (DBDIR/library). When setting the LCAs, k-mers from sequences in
DIR1 will be set first, and only unset k-mers will be set from
DIR2, etc, and final from the whole library.
Use this option when including low-confidence draft genomes,
e.g use --lca-order Complete_Genome --lca-order Chromosome to
prioritize more complete assemblies.
Keep in mind that this option takes considerably longer.
Incomplete database, clean aborted.
2021-12-03 09:51:28,228 CreateKrakenDatabase [INFO ] krakenuniq database created
2021-12-03 09:51:28,228 create_databases [INFO ] --- Time summary 0 minutes 3.871030569076538 seconds---