Comments (9)

cfz1998 avatar cfz1998 commented on June 18, 2024 1

Thank you!

from hgtector.

qiyunzhu avatar qiyunzhu commented on June 18, 2024

Hello @cfz1998, thanks for reporting this issue. The problem occurred because your command didn't specify how many genomes to sample per taxonomic group (e.g., using --sample 1).

I have tweaked the code so that it defaults to including all genomes, so your current command will now work. To update the program, you can run:

pip install --force-reinstall --no-cache-dir git+https://github.com/qiyunlab/HGTector.git
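
To illustrate the idea of sampling a fixed number of genomes per taxonomic group, with "no limit" as the default, here is a minimal sketch. It is not HGTector's actual code: the function name `sample_per_group`, the `(genome_id, group)` input shape, and the example IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def sample_per_group(genomes, n=None):
    """Keep up to n genomes per taxonomic group; n=None keeps all of them.

    genomes: iterable of (genome_id, group) pairs, already in priority order.
    (Hypothetical helper, not part of HGTector.)
    """
    kept = defaultdict(list)
    for genome_id, group in genomes:
        # With n=None there is no cap, mirroring a "keep all genomes" default.
        if n is None or len(kept[group]) < n:
            kept[group].append(genome_id)
    # Flatten, preserving the order in which groups were first seen.
    return [g for ids in kept.values() for g in ids]


# Made-up genome IDs and groups.
genomes = [("G1", "sp_A"), ("G2", "sp_A"), ("G3", "sp_B")]
one_each = sample_per_group(genomes, n=1)
everything = sample_per_group(genomes)
```

With `n=1` only the first genome of each group is kept; leaving `n` unset keeps every genome, which is the behavior the updated default is described as having.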

cfz1998 avatar cfz1998 commented on June 18, 2024

Hi @qiyunzhu,
I found a new issue.

The command:
hgtector database -o db_dir --cat plant,archaea,bacteria,fungi,protozoa --threads 32

Error message:

Database building started at 2022-09-14 10:51:27.131615.
Downloading NCBI taxonomy database... done.
Reading NCBI taxonomy database... done.
  Total number of TaxIDs: 2442723.
Downloading RefSeq assembly summary... done.
Reading RefSeq assembly summary... done.
  Total number of genomes: 272849.
Genome categories: plant, archaea, bacteria, fungi, protozoa
Downloading genome list per RefSeq category...
  plant: 154
  archaea: 1322
  bacteria: 1322
  fungi: 1322
  protozoa: 1322
Done.
  Total number of genomes in categories: 1476.
Filtering genomes...
Done.
Filtering genomes by taxonomy...
Done.
Total number of sampled genomes: 1476.
Downloading non-redundant genomic data from NCBI...
Traceback (most recent call last):
  File "/data/chaofan/miniconda/envs/hgtector/bin/hgtector", line 96, in <module>
    main()
  File "/data/chaofan/miniconda/envs/hgtector/bin/hgtector", line 35, in main
    module(args)
  File "/data/chaofan/miniconda/envs/hgtector/lib/python3.10/site-packages/hgtector/database.py", line 152, in __call__
    self.download_genomes()
  File "/data/chaofan/miniconda/envs/hgtector/lib/python3.10/site-packages/hgtector/database.py", line 622, in download_genomes
    makedirs(ldir, exist_ok=True)
  File "/data/chaofan/miniconda/envs/hgtector/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: 'db_dir/download/faa'
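
A note on the traceback above: in Python, `os.makedirs(path, exist_ok=True)` still raises `FileExistsError` when `path` exists but is not a directory, for example a regular file or a dangling symlink. Whether that is what happened here is not confirmed in this thread; the sketch below only reproduces that general behavior in a temporary folder. The helper names `diagnose_path` and `makedirs_fails` are hypothetical, and the `db_dir/download/faa` layout is copied from the error message.

```python
import os
import tempfile


def diagnose_path(path):
    """Classify a path that os.makedirs(path, exist_ok=True) is about to create."""
    if os.path.isdir(path):
        return "directory"        # makedirs(exist_ok=True) succeeds
    if os.path.islink(path) and not os.path.exists(path):
        return "broken symlink"   # link target is missing; mkdir fails
    if os.path.exists(path):
        return "non-directory"    # e.g. a regular file occupies the name
    return "missing"              # makedirs will create it


def makedirs_fails(path):
    """Return True if os.makedirs(path, exist_ok=True) raises FileExistsError."""
    try:
        os.makedirs(path, exist_ok=True)
    except FileExistsError:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    download = os.path.join(tmp, "db_dir", "download")
    os.makedirs(download)
    faa = os.path.join(download, "faa")
    open(faa, "w").close()                      # a plain file named 'faa'
    state_before = diagnose_path(faa)
    failed = makedirs_fails(faa)                # raises despite exist_ok=True
    os.remove(faa)                              # clear the offending entry
    failed_after_cleanup = makedirs_fails(faa)  # now the directory is created
    state_after = diagnose_path(faa)
```

If `diagnose_path('db_dir/download/faa')` reports anything other than "directory" or "missing" on the affected machine, removing or renaming that entry before rerunning the command is one thing to try.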

cfz1998 avatar cfz1998 commented on June 18, 2024

Hi! @qiyunzhu
When I download data for a list of genomes (1,336 entries), only the default non-redundant proteins (archaea, bacteria, fungi, protozoa; 1,322 entries) are downloaded.
genbank.txt

qiyunzhu avatar qiyunzhu commented on June 18, 2024

Hi @cfz1998 I don't quite understand your last question. Can you elaborate?

cfz1998 avatar cfz1998 commented on June 18, 2024

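Hello! I built the database using the -g parameter, and the list of NCBI assembly accessions I provided includes 1,336 species. Of these, 1,322 belong to archaea, bacteria, fungi, and protozoa; these are the same 1,322 species that are downloaded automatically with the --default parameter. The other 14 species are plants. However, hgtector only built a database of the 1,322 species and did not include the 14 plant species I added.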

qiyunzhu avatar qiyunzhu commented on June 18, 2024

@cfz1998 Most likely the 14 additional species lack certain metadata fields, especially taxonomic classification, so HGTector skipped those organisms. For now, the workaround is to download the proteomes manually and then edit the genome.tsv file and the taxonomy files by hand. This isn't easy, though. HGTector may support taxonomy-free analysis in the future, but not now...
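
To see exactly which requested genomes were skipped, one option is to compare the accession list that was passed in against the genome table of the built database. The sketch below assumes a tab-separated table with the accession in the first column and an input list with one accession per line; neither layout is confirmed in this thread, the helper names are hypothetical, and the accessions are made up.

```python
import csv
import io


def read_accessions(handle):
    """Read one accession per line, skipping blank lines and '#' comments."""
    return [ln.strip() for ln in handle if ln.strip() and not ln.startswith("#")]


def read_table_accessions(handle, column=0):
    """Collect accessions from one column of a tab-separated table."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[column] for row in reader if row and not row[0].startswith("#")}


def find_skipped(requested, built):
    """Return requested accessions absent from the built table, in input order."""
    return [acc for acc in requested if acc not in built]


# Made-up data standing in for the accession list and the genome table.
requested_txt = io.StringIO("GCF_000000001.1\nGCF_000000002.1\nGCF_000000003.1\n")
table_tsv = io.StringIO(
    "#genome\tname\n"
    "GCF_000000001.1\tExample archaeon\n"
    "GCF_000000003.1\tExample bacterium\n"
)

skipped = find_skipped(
    read_accessions(requested_txt), read_table_accessions(table_tsv)
)
print(skipped)
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` handles. The resulting list shows which organisms would need the manual workaround.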

cfz1998 avatar cfz1998 commented on June 18, 2024

Thank you for your help! I will try other ways.

cfz1998 avatar cfz1998 commented on June 18, 2024
plant: 154

Hi @qiyunzhu,
How can I resolve this error?
image
