Comments (14)
Hello ElsaFu,
I think this is a network connection problem. You may be behind a firewall
at your institute that prevents a direct FTP connection to the NCBI server.
You could try it on your home computer with a regular Internet service to
see if it works.
Best,
Qiyun
On Thu, Jul 21, 2016 at 1:05 AM, ElsaFu [email protected] wrote:
hi, Qiyun,
I ran into a problem when creating a database. It shows the following error:
Traceback (most recent call last):
  File "/home/chenzhiduan/liruiqi/build/HGTector-master/scripts/databaser.py", line 67, in <module>
    ftp = ftplib.FTP('ftp.ncbi.nlm.nih.gov', 'anonymous', '')
  File "/build/Cellar/python/2.7.6/lib/python2.7/ftplib.py", line 120, in __init__
    self.connect(host)
  File "/build/Cellar/python/2.7.6/lib/python2.7/ftplib.py", line 135, in connect
    self.sock = socket.create_connection((self.host, self.port), self.timeout)
  File "/build/Cellar/python/2.7.6/lib/python2.7/socket.py", line 553, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno -3] Temporary failure in name resolution
I don't know how to resolve the problem. Could you help me, please?
Thank you!!
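The traceback ends in socket.gaierror, which Python raises when a host name cannot be resolved. As a rough way to tell a name-resolution failure apart from a blocked FTP connection, here is a minimal sketch. The function names diagnose_ftp and try_anonymous_login are made up for this illustration and are not part of HGTector; the host and the anonymous login are the ones shown in the traceback.

```python
import ftplib
import socket


def try_anonymous_login(host, timeout):
    """Open an anonymous FTP session on port 21 and close it again."""
    ftp = ftplib.FTP()
    ftp.connect(host, 21, timeout)
    ftp.login("anonymous", "")
    ftp.quit()


def diagnose_ftp(host, timeout=10, resolve=socket.getaddrinfo,
                 login=try_anonymous_login):
    """Report whether name resolution or the FTP session itself fails.

    `resolve` and `login` are injectable only so the function can be
    exercised without network access (an illustrative design choice).
    """
    try:
        # Same lookup that fails at the bottom of the traceback.
        resolve(host, 21, 0, socket.SOCK_STREAM)
    except socket.gaierror as err:
        return "name resolution failed: %s" % err
    try:
        login(host, timeout)
    except ftplib.all_errors as err:
        return "host resolved, but FTP connection failed: %s" % err
    return "ok"


# Example call (needs network access):
# print(diagnose_ftp("ftp.ncbi.nlm.nih.gov"))
```

If the first message is returned, the machine cannot look up the host name at all, which points at DNS or proxy settings rather than at the FTP server.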
from hgtector.
Qiyun, thank you very much!
I have another question. I want to discover HGT from bacteria in a plant genome. Do you think the program will work? The plant I'm working on is an early land plant.
Elsa
from hgtector.
Hello Elsa,
You are welcome! Technically, our program can work on any type of genome.
However, to achieve sufficient statistical power, the most important thing
is to have abundant and evenly distributed genomes related to your target
genome. For a plant species, there may not be many sister plant genomes
available in GenBank. Therefore, detecting HGT using our program can be
tricky. You may set the close group as all eukaryotes and then see if
you can get reasonable results.
Best,
Qiyun
from hgtector.
Qiyun,
Thanks for your suggestion. I am wondering if I can use the nr database. I have a local nr database downloaded from NCBI. Do you think I should try it?
Best,
Elsa
from hgtector.
Hello Elsa,
Yes, you can try the nr database. In fact, HGTector was originally designed
with the nr database. Remember to set ignoreSubspecies=1 in the
configuration file; otherwise you will see loads of junk information.
Best,
Qiyun
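For readers who prefer to script the change, here is a minimal sketch of setting that option. It assumes the configuration file holds one key=value pair per line, which is inferred only from the ignoreSubspecies=1 form quoted above. The helper name set_option and the someOtherOption key are hypothetical and not part of HGTector.

```python
def set_option(config_text, key, value):
    """Return config_text with `key=value` set.

    Assumes one key=value pair per line (an assumption about the file
    format). An existing line for `key` is replaced; otherwise a new
    line is appended.
    """
    lines = config_text.splitlines()
    new_line = "%s=%s" % (key, value)
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' sign.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# "someOtherOption" is a placeholder, not a real HGTector setting.
example = "someOtherOption=abc\nignoreSubspecies=0\n"
updated = set_option(example, "ignoreSubspecies", 1)
```

Editing the file by hand in a text editor achieves the same thing; the sketch only shows the replace-or-append logic.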
from hgtector.
OK, thank you!
from hgtector.
Hello Qiyun, sorry to disturb you again.
My HGTector run stops at the first step. The error file reads:
"Use of uninitialized value $s in scalar chomp at /home/chenzhiduan/liruiqi/build/HGTector-master/scripts/searcher.pl line 541.
BLAST Database error: No alias or index file found for nucleotide database [/home/chenzhiduan/liruiqi/data/HGTector/BacPlant] in search path [/home/chenzhiduan/liruiqi/build/HGTector-master::]
Use of uninitialized value in split at /home/chenzhiduan/liruiqi/build/HGTector-master/scripts/searcher.pl line 595.
Error: Cannot identify the taxonomy of Anthoceros.
Error: Execution of searcher.pl failed. HGTector exists.
vncserver: The HOME environment variable is not set."
and the outfile is
"Step 1: Searcher - batch protein sequence homology search.
-> Searcher: Batch sequence homology searching and filtering. <-
Reading input data...
Anthoceros: 122 proteins.
Done. 122 proteins from 1 set(s) to query.
Reading taxonomy database... done. 1496307 records read.
Reading protein-to-TaxID dictionary... done. 33985291 records read.
Enter the TaxID of Anthoceros, or press Enter if you don't know:Taxonomy of input protein sets:
Attempting to identify taxonomy of 1 protein set(s) :
Anthoceros"
My input file, Anthoceros.faa, is like the following.
Aau001790 Anthoceros angustus
MSAKILVVDDEPSIVKSIQYSLEKEGYQVVTASDGQQALEVARREKPNLVVLDVMLPSLDGYEVCRQLRQELPVPVIMLT
AKGEEIDKVVGLEIGADEYVTKPFSLRELLARVKALLRLVNRYSEAKQQQPDKIEIGDLIIDLTRHEVTLGGKVLSLTLK
EYELLKLLALNANKVLSREYLIEQVWGYDFTGEGRTVDVHIHWLREKIEKDPNHPMRIQTVRGVGYRFERRTRPVEV
It is a new genome that has not been published, so it is not in the RefSeq database. I used databaser.py to download a database of bacteria and plants. In your experience, what could cause these errors?
Thank you!
Best!
Elsa
from hgtector.
Hello Elsa,
I read the error message, and think the following statement is the problem:
BLAST Database error: No alias or index file found for nucleotide database
[/home/chenzhiduan/liruiqi/data/HGTector/BacPlant] in search path
[/home/chenzhiduan/liruiqi/build/HGTector-master::]
Therefore, the database may have some problems. Can you check the path to
the database to make sure it is there?
Best,
Qiyun
from hgtector.
Hello Qiyun,
I checked the database. It is in my working directory "/home/chenzhiduan/liruiqi/data/HGTector/", together with these files: "BacPlant.faa gi2taxid.txt log sampled_genomes.txt taxonomy config.txt input representative_genomes.txt taxdump". The database contains protein sequences. Why does the error say "nucleotide database"?
Thank you!
Best,
Elsa
from hgtector.
Hello Elsa,
This file list looks quite normal to me. I am also confused why the program
thinks it is a nucleotide database. At this time maybe you can try to test
the database by running some blast commands manually, e.g., blastp -query
myseq.fa -db /home/chenzhiduan/liruiqi/data/HGTector/BacPlant. If that
works, we shall then move on to see if HGTector has problems.
Best,
Qiyun
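The error message refers to missing alias or index files. As a hedged illustration, the sketch below checks whether the files that BLAST+ normally creates for a protein database (.pin, .phr and .psq, or a .pal alias file for multi-volume databases) exist next to a given database prefix. The function name missing_protein_index_files is hypothetical and not part of HGTector or BLAST+.

```python
import os

# Files BLAST+ writes for a single-volume protein database.
PROTEIN_INDEX_EXTS = (".pin", ".phr", ".psq")
# Alias file used by multi-volume protein databases such as nr.
PROTEIN_ALIAS_EXT = ".pal"


def missing_protein_index_files(prefix):
    """List the protein index extensions missing for a database prefix.

    Returns an empty list when an alias file exists or when all three
    index files are present next to `prefix`.
    """
    if os.path.isfile(prefix + PROTEIN_ALIAS_EXT):
        return []
    return [ext for ext in PROTEIN_INDEX_EXTS
            if not os.path.isfile(prefix + ext)]


# Example call, using the prefix from the error message:
# missing_protein_index_files("/home/chenzhiduan/liruiqi/data/HGTector/BacPlant")
```

If only a FASTA file such as BacPlant.faa is present, one possible remedy is to format it first, for example with `makeblastdb -in BacPlant.faa -dbtype prot -out BacPlant`. Whether that applies to this case is an assumption, not something confirmed in the thread.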
from hgtector.
Hello Qiyun,
I have tried blastp, but it didn't work. It shows the same error: "BLAST Database error: No alias or index file found for protein database [/home/chenzhiduan/liruiqi/data/HGTector/BacPlant] in search path [/home/chenzhiduan/liruiqi/data/HGTector::]". This suggests there is something wrong with the database, so I tried the nr database instead. There is still an error: "
Use of uninitialized value $s in scalar chomp at /home/chenzhiduan/liruiqi/build/HGTector-master/scripts/searcher.pl line 541.
Error: Aau001790: OID not found
Error: Aau001790: OID not found
BLAST query/options error: Entry or entries not found in BLAST database
Please refer to the BLAST+ user manual.
Use of uninitialized value in split at /home/chenzhiduan/liruiqi/build/HGTector-master/scripts/searcher.pl line 595.
Error: Cannot identify the taxonomy of Anthoceros.
Error: Execution of searcher.pl failed. HGTector exists."
and the output file reads: "Validating task...
Done.
Step 1: Searcher - batch protein sequence homology search.
-> Searcher: Batch sequence homology searching and filtering. <-
Reading input data...
Anthoceros: 122 proteins.
Done. 122 proteins from 1 set(s) to query.
Reading taxonomy database... done. 1496307 records read.
Reading protein-to-TaxID dictionary... done. 33985291 records read.
Reading taxonomy records from previous run(s)... done. 0 taxa and 0 ranks read.
Enter the TaxID of Anthoceros, or press Enter if you don't know:Taxonomy of input protein sets:
Attempting to identify taxonomy of 1 protein set(s) :
Anthoceros"
Comparing the two errors, I wonder whether there is something wrong with my input file.
Sorry for taking so much of your time!
Best,
Elsa
from hgtector.
Well, then your blast is not properly installed... I guess. Maybe you
should refer to some blast tutorials. - Qiyun
from hgtector.
Hi, Qiyun
Sorry for disturbing you again! I ran HGTector with your sample input file and it worked successfully, so the problem must be my input file. I don't know how to set up my input file and the config file.
My input file looks like this:
Aau001790 Anthoceros angustus
MSAKILVVDDEPSIVKSIQYSLEKEGYQVVTASDGQQALEVARREKPNLVVLDVMLPSLDGYEVCRQLRQELPVPVIMLT
AKGEEIDKVVGLEIGADEYVTKPFSLRELLARVKALLRLVNRYSEAKQQQPDKIEIGDLIIDLTRHEVTLGGKVLSLTLK
EYELLKLLALNANKVLSREYLIEQVWGYDFTGEGRTVDVHIHWLREKIEKDPNHPMRIQTVRGVGYRFERRTRPVEV
Aau002007 Anthoceros angustus
MAENDVFEKVKKIVVDRLGVTDDQVTMEASFTEDLGADSLDTVELVMAFEEEFNIEIPDEDAEKIATVKDAVTYITKAQ
Aau001399 Anthoceros angustus
MSTTSAPTRKKIAQDVTELIGNTPLVRLNRVTKGLEATILAKLDYFNPACSVKDRIGSAMILDAERQGLITPGETTLIEP
TSGNTGIALAMVAAARGYRLILTMPETMSLERRKVLRIYGAELVLTPGPMGMKGAIAKAEELLAATPNSYMLQQFKNPAN
VAVHRATTAEEIWHDTDGAVDVFVGGVGTGGTVTAVGEVLKVRKPELKVFAVEPTESPVIAGGQPGPHKIQGIGAGFIPE
NLHTEVLDGTVSVSSDEAFTMSRRLAREEGLMVGISSGAACHAAIELAKRPENKGKTIVVMFPSFGERYLSTALFAGLFE
EDAQPT
but your sample input file looks like this:
EME25735
EME25820
EME25933
EME26043
EME26273
EME26368
My sequences have not been submitted to NCBI. Could you tell me how to set up my input file and config file?
Thank you!
Best,
Elsa
from hgtector.
Hello Elsa,
I see. The input file should be in multi-FASTA format, with only the sequence ID on each header line. For example:
Aau001790
MSAKILVVDDEPSIVKSIQYSLEKEGYQVVTASDGQQALEVARREKPNLVVLDVMLPSLDGYEVCRQLRQELPVPVIMLT
AKGEEIDKVVGLEIGADEYVTKPFSLRELLARVKALLRLVNRYSEAKQQQPDKIEIGDLIIDLTRHEVTLGGKVLSLTLK
EYELLKLLALNANKVLSREYLIEQVWGYDFTGEGRTVDVHIHWLREKIEKDPNHPMRIQTVRGVGYRFERRTRPVEV
Aau002007
MAENDVFEKVKKIVVDRLGVTDDQVTMEASFTEDLGADSLDTVELVMAFEEEFNIEIPDEDAEKIATVKDAVTYITKAQ
Aau001399
MSTTSAPTRKKIAQDVTELIGNTPLVRLNRVTKGLEATILAKLDYFNPACSVKDRIGSAMILDAERQGLITPGETTLIEP
TSGNTGIALAMVAAARGYRLILTMPETMSLERRKVLRIYGAELVLTPGPMGMKGAIAKAEELLAATPNSYMLQQFKNPAN
VAVHRATTAEEIWHDTDGAVDVFVGGVGTGGTVTAVGEVLKVRKPELKVFAVEPTESPVIAGGQPGPHKIQGIGAGFIPE
NLHTEVLDGTVSVSSDEAFTMSRRLAREEGLMVGISSGAACHAAIELAKRPENKGKTIVVMFPSFGERYLSTALFAGLFE
EDAQPT
This is the safest way. So you don't have to worry about NCBI numbers.
Forget about the sample input.
Best,
Qiyun
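The header change suggested above can be scripted. Here is a minimal sketch, assuming standard FASTA headers that begin with ">" (the leading ">" is not visible in the examples as rendered in this thread). The function name simplify_fasta_headers is hypothetical and not part of HGTector.

```python
def simplify_fasta_headers(text):
    """Keep only the first whitespace-delimited token on each header line.

    Header lines are those starting with '>'; sequence lines are
    passed through unchanged.
    """
    out = []
    for line in text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            out.append(">" + (tokens[0] if tokens else ""))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# One record taken from the thread, with the assumed '>' prefix restored.
record = (
    ">Aau002007 Anthoceros angustus\n"
    "MAENDVFEKVKKIVVDRLGVTDDQVTMEASFTEDLGADSLDTVELVMAFEEEFNIEIPDEDAEKIATVKDAVTYITKAQ\n"
)
cleaned = simplify_fasta_headers(record)
```

After this step the header reads ">Aau002007" and the sequence line is left as it was.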
from hgtector.