meningotype

In silico typing of Neisseria meningitidis contigs

  • Serotyping
  • MLST
  • Finetyping (porA, fetA, porB)
  • Bexsero antigen sequence typing (BAST) (fHbp, NHBA, NadA, PorA)

Quick start

# install
$ pip install git+https://github.com/MDU-PHL/meningotype.git

# just serotype
$ meningotype NMA.fasta

SAMPLE_ID  SEROGROUP  ctrA    MLST    PorA    FetA    PorB    fHbp    NHBA    NadA    BAST
NMA.fasta  A          ctrA    -       -       -       -       -       -       -       -

# include all genotypes
$ meningotype --all NMA.fasta

SAMPLE_ID  SEROGROUP  ctrA    MLST    PorA       FetA    PorB            fHbp    NHBA    NadA    BAST
NMA.fasta  A          ctrA    4       7,13-1     F1-5    NEIS2020_28     5       29      0       639

# type lots of files at once
$ meningotype --all *.fna

SAMPLE_ID  SEROGROUP  ctrA    MLST    PorA       FetA    PorB            fHbp    NHBA    NadA    BAST
A.fna      A          ctrA    4       7,13-1     F1-5    NEIS2020_28     5       29      0       639
B.fna      B          ctrA    8       5-2,10-1   F3-6    NEIS2020_12     16      20      8       150
C.fna      C          ctrA    177     21,26-2    F1-5    NEIS2020_3      17      101     9       118
W.fna      W          ctrA    11      5,2        F1-1    NEIS2020_244    623     29      6       141
X.fna      X          ctrA    181     5-1,10-1   F4-23   NEIS2020_509    391     358     0       -
Y.fna      Y          ctrA    23      5-2,10-1   F4-1    NEIS2020_67     25      7       0       228

Installation

Dependencies

The simplest way to install dependencies is to use the Homebrew (macOS) or Linuxbrew (Linux) packaging system.

$ brew tap brewsci/bio
$ brew install ispcr blast mlst

Installing

The easiest way of installing meningotype is using pip:

$ pip install --user git+https://github.com/MDU-PHL/meningotype.git

The --user option will install the package locally, rather than in the global python directory.

Thus, by default, this will install the package in $HOME/.local/, and the executable in $HOME/.local/bin/. To install the executable in a custom location (e.g., $HOME/bin), use the following:

$ pip install --install-option="--install-scripts=$HOME/bin" --user git+https://github.com/MDU-PHL/meningotype.git

To upgrade to a newer version:

$ pip install --upgrade --install-option="--install-scripts=$HOME/bin" --user git+https://github.com/MDU-PHL/meningotype.git

Testing

Once installed, you can run the following to ensure meningotype is successfully working:

$ meningotype.py --test

If everything works, you will see the following:

$ meningotype.py --test
Running meningotype.py on test examples ... 
$ meningotype.py A.fna B.fna C.fna W.fna X.fna Y.fna
SAMPLE_ID	SEROGROUP	ctrA	MLST	PorA	FetA	PorB	fHbp	NHBA	NadA	BAST
meningotype/test/A.fna	A	ctrA	-	-	-	-	-	-	-	-
meningotype/test/B.fna	B	ctrA	-	-	-	-	-	-	-	-
meningotype/test/C.fna	C	ctrA	-	-	-	-	-	-	-	-
meningotype/test/W.fna	W	ctrA	-	-	-	-	-	-	-	-
meningotype/test/X.fna	X	ctrA	-	-	-	-	-	-	-	-
meningotype/test/Y.fna	Y	ctrA	-	-	-	-	-	-	-	-

or to check finetyping:

$ meningotype.py --test --finetype
Running meningotype.py on test examples ... 
$ meningotype.py A.fna B.fna C.fna W.fna X.fna Y.fna
SAMPLE_ID	SEROGROUP	ctrA	MLST	PorA	        FetA	PorB	fHbp	NHBA	NadA	BAST
meningotype/test/A.fna	A	ctrA	-	7,13-1		F1-5	-	-	-	-	-
meningotype/test/B.fna	B	ctrA	-	5-2,10-1	F3-6	-	-	-	-	-
meningotype/test/C.fna	C	ctrA	-	21,26-2		F1-5	-	-	-	-	-
meningotype/test/W.fna	W	ctrA	-	5,2		F1-1	-	-	-	-	-
meningotype/test/X.fna	X	ctrA	-	5-1,10-1	F4-23	-	-	-	-	-
meningotype/test/Y.fna	Y	ctrA	-	5-2,10-1	F4-1	-	-	-	-	-

or to check finetyping and Bexsero antigen sequence typing:

$ meningotype.py --test --all
Running meningotype.py on test examples ... 
$ meningotype.py A.fna B.fna C.fna W.fna X.fna Y.fna
SAMPLE_ID	SEROGROUP	ctrA	MLST	PorA		FetA	PorB		fHbp	NHBA	NadA	BAST
meningotype/test/A.fna	A	ctrA	4	7,13-1		F1-5	NEIS2020_28		5	29	0	639
meningotype/test/B.fna	B	ctrA	8	5-2,10-1	F3-6	NEIS2020_12		16	20	8	150
meningotype/test/C.fna	C	ctrA	177	21,26-2		F1-5	NEIS2020_3		17	101	9	118
meningotype/test/W.fna	W	ctrA	11	5,2		F1-1	NEIS2020_244	623	29	6	141
meningotype/test/X.fna	X	ctrA	181	5-1,10-1	F4-23	NEIS2020_509	391	358	0	-
meningotype/test/Y.fna	Y	ctrA	23	5-2,10-1	F4-1	NEIS2020_67		25	7	0	228

Usage

$ meningotype.py -h
usage: 
  meningotype.py [OPTIONS] <fasta1> <fasta2> <fasta3> ... <fastaN>

In silico typing for Neisseria meningitidis
Default: Serotyping, MLST and ctrA PCR

PCR Serotyping Ref: Mothershed et al, J Clin Microbiol 2004; 42(1): 320-328
PorA and FetA typing Ref: Jolley et al, FEMS Microbiol Rev 2007; 31: 89-96
Bexsero antigen sequence typing (BAST) Ref: Brehony et al, Vaccine 2016; 34(39): 4690-4697
See also http://www.neisseria.org/nm/typing/

positional arguments:
  FASTA       input FASTA files eg. fasta1, fasta2, fasta3 ... fastaN

optional arguments:
  -h, --help  show this help message and exit
  --finetype  perform porA and fetA fine typing (default=off)
  --porB      perform porB sequence typing (NEIS2020) (default=off)
  --bast      perform Bexsero antigen sequence typing (BAST) (default=off)
  --mlst      perform MLST (default=off)
  --all       perform MLST, porA, fetA, porB, BAST typing (default=off)
  --db DB     specify custom directory containing allele databases for porA/fetA typing
              directory must contain database files: "FetA_VR.fas", "PorA_VR1.fas", "PorA_VR2.fas"
              for Bexsero typing: "fHbp_peptide.fas", "NHBA_peptide.fas", "NadA_peptide.fas", "BASTalleles.txt"
  --printseq  save porA/fetA or BAST allele sequences to file (default=off)
  --updatedb  update allele database from <pubmlst.org>
  --test      run test example
  --version   show program's version number and exit

Examples

To perform in silico serotyping on FASTA files:

$ meningotype <fasta1> <fasta2> <fasta3> ... <fastaN>

The serotypes are printed in tab-separated format to stdout. To save results to a tab-separated text file, redirect stdout:

$ meningotype <fasta1> <fasta2> <fasta3> ... <fastaN>  > results.txt
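
Because the output is plain tab-separated text with a header row, the saved file can be read back with standard tools. A minimal Python sketch, assuming the results were redirected to results.txt as above (read_results is a hypothetical helper, not part of meningotype):

```python
import csv


def read_results(path):
    """Read a tab-separated meningotype results file into a list of dicts.

    Each dict is keyed by the column names in the header row
    (SAMPLE_ID, SEROGROUP, ctrA, ...).
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Example usage (assumes results.txt exists in the current directory):
# for row in read_results("results.txt"):
#     print(row["SAMPLE_ID"], row["SEROGROUP"])
```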

To perform in silico serotyping AND finetyping of the porA and fetA genes:

$ meningotype --finetype <fasta1> <fasta2> <fasta3> ... <fastaN>

To save finetyping sequences of the alleles to a file (eg. for uploading "new" sequences to http://pubmlst.org/neisseria/):

$ meningotype --finetype --printseq <fasta1> <fasta2> <fasta3> ... <fastaN>

These are placed into a folder called printseq in the current directory.

Updating the allele databases

To update the allele databases from http://pubmlst.org/neisseria/:

$ meningotype.py --updatedb

A copy of the original database is saved to *.old just in case, but is overwritten with each subsequent --updatedb. Ensure you back up your old databases if you wish to keep them.

Citation

Kwong JC, Gonçalves da Silva A, Stinear TP, Howden BP, Seemann T.
meningotype: in silico typing for Neisseria meningitidis.
GitHub https://github.com/MDU-PHL/meningotype

Bugs

Software Licence

GPL3

Authors

  • Jason Kwong (@kwongjc)
  • Anders Gonçalves da Silva (@drandersg)
  • Torsten Seemann (@torstenseemann)

References

meningotype's Issues

Provide an option to not print the header on the output

Sometimes analyses of different FASTA files are run in parallel, resulting in multiple output files, which might then be concatenated with cat. While it is possible to skip the header in the output of meningotype (using e.g. tail), the tool should make this possible itself.
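
Until such an option exists, the concatenation step can drop the repeated headers. This is only a workaround sketch, assuming each per-sample output file starts with exactly one header line; merge_results is a hypothetical helper, not part of meningotype:

```python
import sys


def merge_results(paths, out=sys.stdout):
    """Concatenate result files, writing the header line only once."""
    header_written = False
    for path in paths:
        with open(path) as handle:
            for line_number, line in enumerate(handle):
                if line_number == 0:
                    # The first line of every file is assumed to be the header.
                    if header_written:
                        continue
                    header_written = True
                out.write(line)


# Example usage:
# merge_results(["A.tsv", "B.tsv", "C.tsv"])
```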

porB

Add support for porB typing.

Meningotype does not fail if any of the subprocesses it calls (isPcr, mlst) fail

Subprocesses are called via subprocess.Popen(). If they error out for whatever reason (e.g. a dependency needed by the subprocess is missing), the error is simply passed on as the output of the subprocess. Instead, meningotype should pass on the error message and error code and exit gracefully.

def nm_mlst(f):
	proc = subprocess.Popen(['mlst', '--scheme=neisseria', '--quiet', f], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
	PCRout = proc.communicate()[0].decode('UTF-8')
	return PCRout.split('\t')[2]

For example, when this function is called and mlst throws an error, the error is not raised and meningotype does not exit. Instead, the error message is passed into PCRout, and PCRout.split('\t')[2] then throws an IndexError, because the list returned by str.split() has length 1 (the error message is unlikely to contain a tab).

Raising this issue as a reminder to include a fix for this in next release.
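
One possible shape for the fix, as a sketch only: check the return code of the subprocess and raise with its stderr before parsing anything. run_tool is a hypothetical helper name; the mlst arguments and the field index are copied from the snippet above, and the "at least three fields" check is an assumption about what counts as valid output.

```python
import subprocess


def run_tool(cmd):
    """Run an external command and return its stdout as text.

    Raises RuntimeError if the executable is missing or exits non-zero,
    instead of silently handing back unusable output.
    """
    try:
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True
        )
    except FileNotFoundError as err:
        raise RuntimeError("dependency not found: {}".format(cmd[0])) from err
    except subprocess.CalledProcessError as err:
        message = err.stderr.decode("utf-8", "replace").strip()
        raise RuntimeError(
            "{} exited with code {}: {}".format(cmd[0], err.returncode, message)
        ) from err
    return proc.stdout.decode("utf-8")


def nm_mlst(f):
    """Return the third tab-separated field of the mlst output for file f."""
    out = run_tool(["mlst", "--scheme=neisseria", "--quiet", f])
    fields = out.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise RuntimeError("unexpected mlst output: {!r}".format(out))
    return fields[2]
```

The caller can then catch RuntimeError in one place, print the message to stderr and exit with a non-zero code.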

incorrectly calls truncated alleles

Due to BLAST scoring, some truncated alleles are called incorrectly.
For example:

>PorA_VR2_2
HFVQQTPKSQPTLVP
>PorA_VR2_2-24
HFVQQTPTHFVQQTPKSQPTLVP
>PorA_VR2_2-48
HFVQQIPTSQPTLVPAQNSKSAYTPAHFVQQIPTSQPTLVP
>PorA_VR2_2-59
HFVQQTP

A nucleotide sequence matching 2-59 followed by a stop codon may have a higher BLAST identity score for other alleles, despite being truncated. Due to the length of the match, 2-48 > 2-24 > 2 > 2-59. Need to ensure 100% ID match.
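
To illustrate what "100% ID match" could mean here, a minimal sketch that accepts an allele only when the translated peptide, cut at the first stop codon, equals the allele sequence over its full length. The four alleles are the ones listed above; exact_allele is a hypothetical helper, and using "*" as the stop symbol is an assumption about how the peptide was translated.

```python
ALLELES = {
    "PorA_VR2_2": "HFVQQTPKSQPTLVP",
    "PorA_VR2_2-24": "HFVQQTPTHFVQQTPKSQPTLVP",
    "PorA_VR2_2-48": "HFVQQIPTSQPTLVPAQNSKSAYTPAHFVQQIPTSQPTLVP",
    "PorA_VR2_2-59": "HFVQQTP",
}


def exact_allele(peptide, alleles=ALLELES):
    """Return the allele name whose sequence equals the peptide exactly.

    Only residues before the first stop symbol are compared, so a truncated
    sequence can only match an allele of the same (short) length.
    Returns None when there is no full-length, 100% identical allele.
    """
    peptide = peptide.split("*")[0]
    for name, sequence in alleles.items():
        if sequence == peptide:
            return name
    return None
```

With this rule, a peptide of HFVQQTP followed by a stop resolves to 2-59 rather than to a longer allele that merely contains it.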

Add --checkdeps and check within application too?

The dependencies mlst, isPcr, blastx etc should be checked for existence at start of script.

There should also be an option --checkdeps which checks them all, standalone.

Need a function like def exe_exists(exe) or something. Ask @schultzm as he recently found a python function which makes this very easy.
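
One candidate is shutil.which from the standard library (Python 3.3+), which returns the full path of an executable found on PATH, or None. A sketch, assuming the list of required tools below; the exact list and the function names are illustrative, not taken from the meningotype source:

```python
import shutil
import sys

# Illustrative list; adjust to whatever the script actually calls.
REQUIRED = ["mlst", "isPcr", "blastn", "blastx"]


def missing_dependencies(executables=REQUIRED):
    """Return the executables that cannot be found on PATH."""
    return [exe for exe in executables if shutil.which(exe) is None]


def check_deps(executables=REQUIRED):
    """Report missing executables on stderr; return True if none are missing."""
    missing = missing_dependencies(executables)
    for exe in missing:
        sys.stderr.write(
            "ERROR: required dependency not found in PATH: {}\n".format(exe)
        )
    return not missing
```

A --checkdeps option could call check_deps() and exit, while a normal run could call it once at start-up and stop early if it returns False.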

Barf nicer on non-existent filename

It seems a missing file is still BLASTed?
Need to check it exists first, if not, print a warning to stderr, and continue on.

meningotype/meningotype.py NOT_HERE.fasta

SAMPLE_ID       SEROGROUP       ctrA    MLST    PorA    FetA    PorB    fHbp    NHBA    NadA    BAST
Traceback (most recent call last):
  File "meningotype/meningotype.py", line 515, in <module>
    main()
  File "meningotype/meningotype.py", line 460, in main
    seroCOUNT = '/'.join(seroTYPE(f, seroPRIMERS, allelesDB))
  File "meningotype/meningotype.py", line 114, in seroTYPE
    stdout, stderr = seroBLAST()
  File "/home/linuxbrew/.linuxbrew/Cellar/python/2.7.13/lib/python2.7/site-packages/Bio/Application/__init__.py", line 516, in __call__
    stdout_str, stderr_str)
Bio.Application.ApplicationError: Non-zero return code 1 from 'blastn -outfmt "6 sseqid pident length" -query NOT_HERE.fasta -db meningotype/db/blast/seroALLELES -evalue 1e-20 -culling_limit 1 -task blastn -perc_identity 90', message 'Command line argument error: Argument "query". File is not accessible:  `NOT_HERE.fasta\''
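
A sketch of the check described above: filter the input list before any BLAST call, warn on stderr for each missing file, and carry on with the rest. readable_inputs is a hypothetical helper name.

```python
import os
import sys


def readable_inputs(paths):
    """Return the paths that exist as regular files, warning about the others."""
    found = []
    for path in paths:
        if os.path.isfile(path):
            found.append(path)
        else:
            sys.stderr.write(
                "WARNING: skipping '{}': file not found\n".format(path)
            )
    return found
```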

Meningotype seems to have trouble identifying FetA F5-1

I have recently incorporated meningotype in the pipeline we use at our lab, and was comparing output to external and internal reference datasets.

While checking the dataset compiled by Bogaerts et al., I ran into six strains for which FetA identification is a bit inconsistent. Isolates were WGSed in triplicate by Bogaerts et al. For some replicates FetA could not be identified while in other replicates of the same strain, FetA was typed correctly. Five out of six strains for which this happened were F5-1. SRA accessions, correct FetA (from PubMLST) and meningotype results are listed in samples_fetA.txt. If it helps I can share the shovill assemblies of the run accessions listed in the sheet.

Using a custom ABRicate database of FetA, I can find the whole fetA gene and after translation to protein sequence, the exact F5-1 VR protein motif as well (GEFEISGKKKDPKDPKKEIDKTDEEKAKDKKDMDLVHSYKLS, from here). The whole fetA genes seem to be assembled and usually >100 bp away from contig ends.

Switching on some msg calls in meningotype seems to indicate the isPcr call in

proc = subprocess.Popen(['isPcr', f, finetypeprimers, 'stdout', '-maxSize=800', '-tileSize=10', '-minPerfect=8', '-stepSize=3'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
is not finding FetA and therefore finetypeBLAST is not called for fetA. Setting stepSize to 2 instead of 3 on line 229 causes isPcr to "amplify" the FetA VR.

The weird thing is, stepSize=3 works well for some samples such as SRR6953990:

(base) boas$ isPcr genomes/SRR6953990.fasta primers.tsv stdout -maxSize=800 -tileSize=10 -minPerfect=8 -stepSize=3
>contig00011:875-1333 fetA_seq 459bp TTCAACTTCGACAGCCGCCTT TTGCAGCGCGTCATACAGGCG
TTCAACTTtGACAGCCGCCTTgccgaacaaaccctgttgaaatacggtat
caactaccgccatcaggaaatcaaaccgcaagcgtttttgaacggcgaat
ttgagatctccggtaagaagaaagatccgaaagatcccaaaaaagaaata
gataagaccgatgaagaaaaagcgaaagacaagaaagatatggatcttgt
ccattcctacaaactgtccaacccgaccaaaaccgataccggcgcgtata
tcgaagccattcacgagcttgacggctttaccctgaccggcgggctgcgt
tacgaccgcttcaaggtgaaaacccacgacggcaaaaccgtttcaagcag
caaccttaacccgagtttcggcgtgatttggcagccgcacgaacactgga
gcttcagctcaagccacaactacgccagccgcagcccgCGCCTGTATGAC
GCGCTGCAA

But not for others such as SRR6953892:

(base) boas$ isPcr genomes/SRR6953892.fasta primers.tsv stdout -maxSize=800 -tileSize=10 -minPerfect=8 -stepSize=3

However setting stepSize=2 for SRR6953892 gives an identical match to the stepSize=3 match for sample SRR6953990. I don't have a clue why.

(base) boas$ isPcr genomes/SRR6953892.fasta primers.tsv stdout -maxSize=800 -tileSize=10 -minPerfect=8 -stepSize=2
>contig00071:1814+2272 fetA_seq 459bp TTCAACTTCGACAGCCGCCTT TTGCAGCGCGTCATACAGGCG
TTCAACTTtGACAGCCGCCTTgccgaacaaaccctgttgaaatacggtat
caactaccgccatcaggaaatcaaaccgcaagcgtttttgaacggcgaat
ttgagatctccggtaagaagaaagatccgaaagatcccaaaaaagaaata
gataagaccgatgaagaaaaagcgaaagacaagaaagatatggatcttgt
ccattcctacaaactgtccaacccgaccaaaaccgataccggcgcgtata
tcgaagccattcacgagcttgacggctttaccctgaccggcgggctgcgt
tacgaccgcttcaaggtgaaaacccacgacggcaaaaccgtttcaagcag
caaccttaacccgagtttcggcgtgatttggcagccgcacgaacactgga
gcttcagctcaagccacaactacgccagccgcagcccgCGCCTGTATGAC
GCGCTGCAA

Would it make sense to set stepSize to 2 instead of 3? For 18 samples, this increased analysis time from 48 to 59 seconds, but it managed to identify all FetA types missed with stepSize=3.

Thanks in advance!

meningotype can not work

Hi
I installed meningotype v0.8.4 with pip and tried to run it, but it shows this error:

Traceback (most recent call last):
  File "/home/chen1i6c04/miniconda3/envs/meningotype/bin/meningotype", line 8, in <module>
    sys.exit(main())
  File "/home/chen1i6c04/miniconda3/envs/meningotype/lib/python3.6/site-packages/meningotype/meningotype.py", line 438, in main
    check_db_files(porA1alleles, porA1URL)
  File "/home/chen1i6c04/miniconda3/envs/meningotype/lib/python3.6/site-packages/meningotype/meningotype.py", line 97, in check_db_files
    update_db(f, db_url)
  File "/home/chen1i6c04/miniconda3/envs/meningotype/lib/python3.6/site-packages/meningotype/meningotype.py", line 88, in update_db
    urllib.urlretrieve(db_url, db_file)
AttributeError: module 'urllib' has no attribute 'urlretrieve'

Did I make a mistake?

Thanks
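
For context: urllib.urlretrieve only exists in Python 2. In Python 3 the same function lives in urllib.request, which is why the call fails under python3.6. A minimal sketch of an import that works on both; download_db is a hypothetical name used for illustration and is not the tool's own update_db:

```python
try:
    # Python 3 location
    from urllib.request import urlretrieve
except ImportError:
    # Python 2 location
    from urllib import urlretrieve


def download_db(db_url, db_file):
    """Download db_url and save it to db_file."""
    urlretrieve(db_url, db_file)
```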

Installing dependencies: homebrew/science is deprecated

homebrew/science has been deprecated. blast is now in homebrew core, ispcr and mlst are in brewsci/bio.
So I think

$ brew tap homebrew/science
$ brew tap tseemann/homebrew-bioinformatics-linux

can be changed to

$ brew tap brewsci/bio
and the installation will work.
#31

Make --version more standards compliant

Currently

meningotype/meningotype.py --version
=====================================
meningotype.py v0.8-beta
Updated 25-Mar-2017 by Jason Kwong
Dependencies: isPcr, mlst, BLAST+, BioPython
=====================================

First line should look like this and print to stdout:

meningotype 0.7b

You can still add the gumpf if you want, but after.

DB should have "version number" and this version number should be reported in output

The DB used by this tool is sourced from a number of locations, including various PubMLST pages. Updating the DB can result in the tool giving different output as the contents of the DB changes. The DB should thus have a version number of some kind (and ideally a "schema version" in case the schema of the DB changes) and this should be reported as part of the tool output to ensure reproducibility.
This "version number" should be updated when the DB is updated.
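
One way such a version could be derived, as a sketch only: hash the contents of the database files, so the identifier changes whenever any file changes. This is an assumption about a possible scheme, not something the tool does; db_fingerprint and the choice of file suffixes are hypothetical.

```python
import hashlib
import os


def db_fingerprint(db_dir, suffixes=(".fas", ".txt")):
    """Return a short hex digest over the names and contents of the DB files.

    Files are visited in sorted order so the result is reproducible.
    """
    digest = hashlib.sha256()
    for name in sorted(os.listdir(db_dir)):
        if name.endswith(suffixes):
            digest.update(name.encode("utf-8"))
            with open(os.path.join(db_dir, name), "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()[:12]
```

A content hash identifies the DB exactly but carries no ordering, so it would complement rather than replace a date or release number.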

Does --db need to be BLAST formatted?

  --db DB     specify custom directory containing allele databases for porA/fetA typing
              directory must contain database files: "FetA_VR.fas", "PorA_VR1.fas", "PorA_VR2.fas"
              for Bexsero typing: "fHbp_peptide.fas", "NHBA_peptide.fas", "NadA_peptide.fas", "BASTalleles.txt"

Does it need to be formatted?

Missing exact hits to PorA

From Una (NZ):

This should be an exact hit to PorA_VR2_30 but fails.

>porA
ACCGCCCTCGTATTGTCCGCACTGCCGCTTGCGGCCGTTGCCGATGTCAGCCTGTACGGC
GAAATCAAAGCCGGCGTGGAAGGCAGGAACTACCAGCTGCAATTGACTGAAGCACAAGCC
GCTAACGGTGGAGCGAGCGGTCAGGTAAAAGTTACTAAAGTTACTAAGGCCAAAAGCCGC
ATCAGGACGAAAATCAGTGATTTCGGCTCGTTTATCGGCTTTAAGGGGAGTGAGGATTTG
GGCGAAGGGCTGAAGGCTGTTTGGCAGCTTGAGCAAGACGTATCCGTTGCCGGCGGCGGC
GCGACCCAGTGGGGCAACAGGGAATCCTTTATCGGCTTGGCAGGCGAATTCGGTACGCTG
CGCGCCGGTCGCGTTGCGAATCAGTTTGACGATGCCAGCCAAGCCATTGATCCTTGGGAC
AGCAACAATGATGTGGCTTCGCAATTGGGTATTTTCAAACGCCACGACGATATGCCGGTT
TCCGTACGCTACGACTCTCCGGAATTTTCCGGTTTTAGCGGCAGCGTCCAATTCGTTCCG
GCTCAAAACAGCAAGTCCGCCTATACGCCGGCTCATTATACTACTGTGTATAATGCTACT
ACTACTACTACTACTTTCGTTCCGGCTGTTGTCGGCAAGCCCGGATCGGATGTGTATTAT
GCCGGTCTGAATTAC

User can't run --updatedb if installed globally

Check permissions first for --db folder ? (or catch exception)
Allow local DB in $HOME/.config/ ?

% meningotype  --updatedb

Updating "/home/linuxbrew/.linuxbrew/Cellar/python/2.7.13/lib/python2.7/site-packages/meningotype/db/PorA_VR1.fas" ...
Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/bin/meningotype", line 11, in <module>
    load_entry_point('meningotype==0.4b0', 'console_scripts', 'meningotype')()
  File "/home/linuxbrew/.linuxbrew/Cellar/python/2.7.13/lib/python2.7/site-packages/meningotype/meningotype.py", line 327, in main
    update_db(porA1alleles, porA1URL)
  File "/home/linuxbrew/.linuxbrew/Cellar/python/2.7.13/lib/python2.7/site-packages/meningotype/meningotype.py", line 77, in update_db
    os.rename(db_file, db_file+'.old')
OSError: [Errno 13] Permission denied
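
A sketch of the permission check suggested above: use the install-time DB directory when it is writable, otherwise fall back to a per-user location. The fallback path under $HOME/.config/ follows the suggestion in this issue, but the exact subdirectory and the name choose_db_dir are assumptions.

```python
import os

# Hypothetical per-user location, following the $HOME/.config/ suggestion.
USER_DB = os.path.join(os.path.expanduser("~"), ".config", "meningotype", "db")


def choose_db_dir(install_db, user_db=USER_DB):
    """Return install_db if it is a writable directory, else user_db."""
    if os.path.isdir(install_db) and os.access(install_db, os.W_OK):
        return install_db
    return user_db
```

os.access only reports the current permissions, so catching OSError around the actual rename is still worthwhile.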

error in menwy.py with non-P/G/S amino acid in EX7E motif

If the amino acid at position 310 is not P(W), G(Y) or S(W/Y), it generates a key error:

Traceback (most recent call last):
  File "/home/jasonk1/scripts/bin/meningotype", line 11, in <module>
    load_entry_point('meningotype==0.7b0', 'console_scripts', 'meningotype')()
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/meningotype.py", line 459, in main
    seroCOUNT = '/'.join(seroTYPE(f, seroPRIMERS, allelesDB))
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/meningotype.py", line 138, in seroTYPE
    sero = seroWY(f, sero)
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/meningotype.py", line 144, in seroWY
    wyTYPE = menwy.menwy(f, False)
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/menwy.py", line 61, in menwy
    serogroup = seroDICT[EX7E[3]]
KeyError: 'F'
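
A minimal sketch of a lookup that does not crash on an unexpected residue. The mapping is the one stated above (P gives W, G gives Y, S gives W/Y); the function name and the "-" placeholder for an unrecognised residue are assumptions.

```python
# Residue at position 310 of the EX7E motif -> serogroup, as described above.
SERO_BY_RESIDUE = {"P": "W", "G": "Y", "S": "W/Y"}


def serogroup_from_ex7e(residue, unknown="-"):
    """Return the serogroup for the residue, or a placeholder if unrecognised."""
    return SERO_BY_RESIDUE.get(residue, unknown)
```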

Adding other serotype E, cnl etc

Hi MDU-PHL,
I see that users can update the allele databases from PubMLST themselves, but it looks like the serotyping alleles can't be updated. In the db directory there is a seroALLELES.fa, which has the main serotyping alleles but not other serotypes like MenE, MenH, CNL etc. I wonder if there is a chance to include these newly defined serogroups?
Cheers,
Lex

seroWY doesn't handle gaps

If there are gaps in the assembly, the sequence cannot be translated:

$ meningotype --all fasta/NM_NM115.fna 
SAMPLE_ID	SEROGROUP	ctrA	MLST	PorA	FetA	PorB	fHbp	NHBA	NadA	BAST
Traceback (most recent call last):
  File "/home/jasonk1/scripts/bin/meningotype", line 11, in <module>
    load_entry_point('meningotype==0.8b0', 'console_scripts', 'meningotype')()
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/meningotype.py", line 504, in main
    seroCOUNT = '/'.join(seroTYPE(f, seroPRIMERS, allelesDB))
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/meningotype.py", line 154, in seroTYPE
    sero = seroWY(f, sero)
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/meningotype.py", line 160, in seroWY
    wyTYPE = menwy.menwy(f, False)
  File "/home/jasonk1/.local/lib/python2.7/site-packages/meningotype/menwy.py", line 60, in menwy
    EX7E = str(EX7E_SEQ.translate())
  File "/home/jasonk1/.local/lib/python2.7/site-packages/Bio/Seq.py", line 1025, in translate
    cds, gap=gap)
  File "/home/jasonk1/.local/lib/python2.7/site-packages/Bio/Seq.py", line 2098, in _translate_str
    "Codon '{0}' is invalid".format(codon))
Bio.Data.CodonTable.TranslationError: Codon 'AT-' is invalid

Refactor code

Need to replace all this

porASEQS = []
fetASEQS = []
porBSEQS = []
fHbpSEQS = []
NHBASEQS = []
NadASEQS = []
sero = None
porA = None
fet = None
porB = None
fHbp = None
NHBA = None
NadA = None

with either a dictionary eg. gene['porA']['SEQS'] etc

or a python class eg. gene['porA'].seqs etc
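
A sketch of the second option, combining both ideas: a small class per gene, held in a dictionary keyed by gene name. GeneResult, GENES and new_results are hypothetical names, and dataclasses needs Python 3.7 or later.

```python
from dataclasses import dataclass, field
from typing import List, Optional

GENES = ["porA", "fetA", "porB", "fHbp", "NHBA", "NadA"]


@dataclass
class GeneResult:
    """Typing result for one gene: the called allele and any saved sequences."""
    allele: Optional[str] = None
    seqs: List[str] = field(default_factory=list)


def new_results():
    """Return a fresh result object for every gene, keyed by gene name."""
    return {name: GeneResult() for name in GENES}


# Example usage:
# gene = new_results()
# gene["porA"].allele = "7,13-1"
# gene["porA"].seqs.append("ACGT")
```

Using default_factory gives every gene its own list, which avoids the shared-mutable-default pitfall.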

Updating DB outside of tool-installation directory should be possible

Currently the tool can update DB by downloading updated files from PubMLST, and, with the specification of a --db location, this update can run even if the tool's install location is read-only (as in the case of e.g. a system-wide install on a multi-user system or in the case of the tool encapsulated in a Docker container). The resulting updated DB will, however, be incomplete.

A work-around is possible where the install-time DB is combined with the updated files downloaded from PubMLST, but preferably the tool should allow the DB to be updated to an arbitrary DB location without resulting in an incomplete DB.

Additionally, the tool should only download files if a newer file is available, i.e. it should use the 'If-Modified-Since' header for HTTP/HTTPS requests. While this is not super-easy with urlretrieve, it is possible (see https://stackoverflow.com/a/59602931).
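
A sketch of a conditional download using urllib.request directly rather than urlretrieve. It assumes the server honours If-Modified-Since and answers 304 Not Modified when the file is unchanged, which has not been verified for PubMLST; conditional_request and download_if_newer are hypothetical names.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def conditional_request(url, local_path):
    """Build a request that asks only for content newer than local_path."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        # HTTP dates must be in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        request.add_header("If-Modified-Since", formatdate(mtime, usegmt=True))
    return request


def download_if_newer(url, local_path):
    """Download url to local_path; return False if the server reports 304."""
    request = conditional_request(url, local_path)
    tmp_path = local_path + ".tmp"
    try:
        with urllib.request.urlopen(request) as response, open(tmp_path, "wb") as out:
            out.write(response.read())
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # local copy is already up to date
        raise
    # Replace the old file only after the download has completed.
    os.replace(tmp_path, local_path)
    return True
```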

Handle new Y+W serotype

“the molecular basis of this dual antigenic specificity has been determined to be due to a single amino acid change at position 310 in the EX7E motif of the capsule polymerase enzyme synG (also referred to as siaDY) (from glycine to serine) or synF (siaDW-135) (from proline to serine).”

http://jcm.asm.org/content/49/1/472.full.pdf+html

Not compatible with latest version of MLST

At line 512 of meningotype.py there is the following:

mlst = nm_mlst(f).split('\t')[11]

The new version of mlst has only 10 fields.

In addition, the parsing of the output of MLST should be moved to the nm_mlst function. And, the identification of the ST position should not rely on the position being fixed. It should parse the header line from MLST and identify the index of ST.
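
A sketch of header-based parsing as proposed above. It assumes mlst is run in a mode that prints a tab-separated header line containing a column named ST, followed by one result line; the column names in the usage example are illustrative and st_from_mlst is a hypothetical helper.

```python
def st_from_mlst(output):
    """Return the ST value from mlst output that starts with a header line.

    The ST column is located by name, so the result does not depend on
    how many other columns the installed mlst version prints.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a result line")
    header = lines[0].split("\t")
    values = lines[1].split("\t")
    try:
        st_index = header.index("ST")
    except ValueError:
        raise ValueError("no ST column in header: {!r}".format(header)) from None
    if st_index >= len(values):
        raise ValueError("result line is shorter than the header")
    return values[st_index]


# Example usage with illustrative column names:
# st_from_mlst("FILE\tSCHEME\tST\tabcZ\nA.fna\tneisseria\t4\t1\n")
```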
