sanger-pathogens / seroba

k-mer based Pipeline to identify the Serotype from Illumina NGS reads

Home Page: https://sanger-pathogens.github.io/seroba/

License: Other

Shell 2.71% Python 96.62% Dockerfile 0.67%
genomics sequencing next-generation-sequencing research bioinformatics bioinformatics-pipeline global-health infectious-diseases pathogen

seroba's Introduction

SeroBA

SeroBA is a k-mer based pipeline to identify the serotype from Illumina NGS reads for given references. You can use SeroBA to download references from PneumoCaT (https://github.com/phe-bioinformatics/PneumoCaT) to identify the capsular type of Streptococcus pneumoniae.

Introduction

SeroBA predicts serotypes by identifying the cps locus directly from raw whole genome sequencing read data, with 98% concordance, using a k-mer based method. It can process 10,000 samples in just over 1 day on a standard server and can call serotypes at a coverage as low as 10x. SeroBA is implemented in Python3 and is freely available under the open source GPLv3 license.

Installation

SeroBA has the following dependencies:

Required dependencies

  • Python3 version >= 3.3.2
  • KMC version >= 3.0
  • MUMmer version >= 3.1
  • Ariba
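
Before installing, it can help to check which of these tools are already on your PATH. The following is a minimal sketch rather than part of SeroBA; the executable names (`kmc`, `kmc_tools`, `nucmer`, `ariba`) are assumptions about how each dependency is installed on your system, and the sketch does not check version numbers.

```python
import shutil

# Executable names are assumptions: KMC is expected to provide `kmc` and
# `kmc_tools`, MUMmer to provide `nucmer`, and ARIBA to provide `ariba`.
REQUIRED_TOOLS = ("kmc", "kmc_tools", "nucmer", "ariba")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required executables were found on PATH.")
```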

There are a number of ways to install SeroBA and details are provided below. If you encounter an issue when installing SeroBA please contact your local system administrator. If you encounter a bug please log it on the issues page.

conda

Set up bioconda channel:

conda config --add channels bioconda

Install SeroBA:

conda install -c bioconda seroba

CentOS 7

Ensure you have a development environment setup (you may have done this already):

yum -y update
yum -y groupinstall 'Development Tools'
yum -y install https://centos7.iuscommunity.org/ius-release.rpm

Install SeroBA and its dependencies:

yum -y install python36u python36u-pip python36u-devel zlib-devel wget which python36u-tkinter
ln -s $(which pip3.6) /usr/bin/pip3
bash <(curl -fsSL https://raw.githubusercontent.com/sanger-pathogens/seroba/master/install_dependencies.sh)

Make sure to add the PATHs output by this script to your .bashrc file (or equivalent). Finally install SeroBA:

pip3 install seroba

Debian Testing / Ubuntu 17.10

Install the dependencies:

sudo apt-get update
sudo apt-get install ariba python3-pip wget

Manually install KMC version 3 (version 2 is the latest in Debian but is incompatible). Add the binaries to your PATH (e.g. in your bash profile).

mkdir kmc && cd kmc
wget https://github.com/refresh-bio/KMC/releases/download/v3.0.0/KMC3.linux.tar.gz
tar xvfz KMC3.linux.tar.gz
export PATH=$PWD:$PATH

Finally install SeroBA:

pip3 install seroba

Ubuntu 16.04 (Xenial)

Install the dependencies:

apt-get update
apt-get install --no-install-recommends -y build-essential cd-hit curl git libbz2-dev liblzma-dev mummer python python3-dev python3-setuptools python3-pip python3-tk python3-matplotlib unzip wget zlib1g-dev
wget -q https://raw.githubusercontent.com/sanger-pathogens/seroba/master/install_dependencies.sh && bash ./install_dependencies.sh

Once the dependencies are installed, install SeroBA using pip:

pip3 install seroba

Docker

Install Docker. We have a docker container which gets automatically built from the latest version of SeroBA. To install it:

docker pull sangerpathogens/seroba

To use it you would use a command such as this (substituting in your directories), where your files are assumed to be stored in /home/ubuntu/data:

docker run --rm -it -v /home/ubuntu/data:/data sangerpathogens/seroba seroba runSerotyping seroba/database /data/read_1.fastq.gz /data/read_2.fastq.gz  /data/output_folder
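
Inside the container only the mounted directory is visible, so the read and output paths have to be given relative to the /data mount point rather than as host paths. The sketch below builds the same command as the example above from a host directory and file names; the function name and its parameters are hypothetical helpers for illustration, not part of SeroBA or Docker.

```python
import shlex
from pathlib import PurePosixPath


def docker_seroba_command(host_dir, read1, read2, prefix,
                          image="sangerpathogens/seroba",
                          database="seroba/database"):
    """Build the `docker run` argument list from the README example.

    `read1`, `read2` and `prefix` are names relative to `host_dir`; they are
    rewritten to the /data mount point that the container sees.
    """
    mount = PurePosixPath("/data")
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_dir}:{mount}",
        image, "seroba", "runSerotyping", database,
        str(mount / read1), str(mount / read2), str(mount / prefix),
    ]


cmd = docker_seroba_command("/home/ubuntu/data", "read_1.fastq.gz",
                            "read_2.fastq.gz", "output_folder")
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` if you prefer to launch the container from Python.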

Running the tests

The tests can be run from the top level directory:

python setup.py test

Usage

Setting up the database

You can use the capsular type variant (CTV) database of PneumoCaT by running seroba getPneumocat. You can also add new serotypes by adding their reference sequences to the "references.fasta" file in the database folder. Running seroba createDBs builds a TSV file from the information in this database. You can add further genetic information for any of these serotypes in the same format.

Since SeroBA v0.1.3, an updated variant of the PneumoCaT CTV is provided in the SeroBA package. It includes the serotypes 6E, 6F, 11E, 10X and 39X and two NT references, so it is no longer necessary to run seroba getPneumocat. For SeroBA version 0.1.3 and greater, download the database provided within this git repository:

For git users

Clone the git repository:
git clone https://github.com/sanger-pathogens/seroba.git

Copy the database to a directory:

cp -r seroba/database my_directory

Delete the git repository to clean up your system:

rm -r seroba

For svn users
Install svn. Checkout the database directory:

svn checkout "https://github.com/sanger-pathogens/seroba/trunk/database"

Continue with Step 2.

For SeroBA version 0.1.2 and earlier:

usage: seroba  getPneumocat <database dir>

Downloads PneumoCat and build an tsv formatted meta data file out of it

positional arguments:
  database dir      directory to store the PneumoCats capsular type variant (CTV) database

Usage

usage: seroba createDBs  <database dir> <kmer size>

Creates a Database for kmc and ariba

positional arguments:
    database dir     output directory for kmc and ariba Database
    kmer size   kmer_size you want to use for kmc , recommended = 71

    Example : seroba createDBs my_database/ 71

    usage: seroba runSerotyping [options]  <databases directory> <read1> <read2> <prefix>

Identify serotype of your input data

    positional arguments:
      database dir         path to database directory
      read1              forward read file
      read2              reverse read file
      prefix             unique prefix

    optional arguments:
      -h, --help         show this help message and exit

    Other options:
      --noclean NOCLEAN  Do not clean up intermediate files (assemblies, ariba
                         report)
      --coverage COVERAGE  threshold for k-mer coverage of the reference sequence (default = 20)                         
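
When many samples need to be typed, the runSerotyping call can be generated once per read pair. This is a minimal sketch that only builds the argument lists; the file-name suffixes and the function name are assumptions for illustration, so adjust them to match your own read files.

```python
from pathlib import Path


def serotyping_commands(reads_dir, database_dir, out_dir,
                        fwd_suffix="_1.fastq.gz", rev_suffix="_2.fastq.gz"):
    """Return one `seroba runSerotyping` argument list per read pair.

    The suffixes are assumptions about how the read files are named.
    Samples whose reverse read file is missing are skipped.
    """
    commands = []
    for fwd in sorted(Path(reads_dir).glob(f"*{fwd_suffix}")):
        # Strip the forward suffix to recover the sample name.
        sample = fwd.name[: -len(fwd_suffix)]
        rev = fwd.with_name(sample + rev_suffix)
        if not rev.exists():
            continue
        commands.append([
            "seroba", "runSerotyping", str(database_dir),
            str(fwd), str(rev), str(Path(out_dir) / sample),
        ])
    return commands
```

Each list can be run with `subprocess.run(command, check=True)`. Using one shared `out_dir` for all prefixes keeps every sample's output directory under a single parent folder.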



Summarises the output in one TSV file

usage: seroba summary  <output folder>

positional arguments:
  output folder   directory where the output directories from seroba runSerotyping are stored

Output

In the folder 'prefix' you will find pred.tsv, which contains your predicted serotype, and detailed_serogroup_info.txt, which lists the SNPs, genes and alleles found in your reads. Running "seroba summary" creates a file called summary.tsv with three columns (sample ID, serotype, comments). Since v0.1.3, serotypes that do not match any reference are marked as "untypable".
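
As an illustration of working with the summary file, the sketch below counts how often each serotype appears. It assumes that summary.tsv is tab-separated, has no header row, and holds the serotype in the second column; if your file has a header line, that line would be counted as well. The function name is hypothetical.

```python
import csv
from collections import Counter


def serotype_counts(summary_path):
    """Count how often each serotype occurs in a summary.tsv file.

    Assumes tab-separated rows of (sample ID, serotype, comments) and
    tolerates rows where the comments column is absent.
    """
    counts = Counter()
    with open(summary_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                counts[row[1]] += 1
    return counts
```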

detailed_serogroup_info example:

Predicted Serotype:       23F
Serotype predicted by ariba:    23F
assembly from ariba has an identity of:   99.77    with this serotype

Serotype       Genetic Variant
23F            allele  wchA

The detailed information shows the final predicted serotype as well as the serotype whose reference was closest within that serogroup according to ARIBA. It also shows the sequence identity between the assembled sequence and the reference sequence.
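
The headline fields of this file can be pulled out with a few regular expressions. The sketch below is based only on the example shown above, so the exact wording of the three lines is an assumption about what other versions of SeroBA write; the function name is hypothetical.

```python
import re


def parse_detailed_info(text):
    """Extract the headline fields from detailed_serogroup_info text."""
    patterns = {
        "predicted": r"^Predicted Serotype:\s*(\S+)",
        "ariba": r"^Serotype predicted by ariba:\s*(\S+)",
        "identity": r"identity of:\s*([0-9.]+)",
    }
    result = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        # Fields that are absent from the file are reported as None.
        result[field] = match.group(1) if match else None
    if result["identity"] is not None:
        result["identity"] = float(result["identity"])
    return result
```

A low `identity` value returned here corresponds to Case 2 in the troubleshooting notes.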

Troubleshooting

  • Case 1:

    • SeroBA predicts 'untypable'. An 'untypable' prediction can either be a real 'untypable' strain or can be caused by different problems. Possible problems are: bad quality of your input data, submission of the wrong species, or too low coverage of your sequenced reads. Please check your data again and run a quality control.
  • Case 2:

    • Low alignment identity in the 'detailed_serogroup_info' file. This can be a hint of a mosaic serotype.
    • Possible solution: perform a blast search on the whole genome assembly
  • Case 3:

    • The third column in the summary.tsv indicates "contamination". This means that at least one heterozygous SNP was detected in the read data with at least 10% of the mapped reads at the specific position supporting the SNP.
    • Possible solution: please check the quality of your data and have a look for contamination within your reads
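
The 10% rule in Case 3 is a simple read-fraction threshold. The sketch below only illustrates that arithmetic; it is not how ARIBA or SeroBA detect heterozygous SNPs internally, and the function name and parameters are hypothetical.

```python
def supports_het_snp(alt_reads, total_reads, min_fraction=0.10):
    """Return True when at least `min_fraction` of the reads mapped at a
    position support the variant."""
    if total_reads <= 0:
        return False
    return alt_reads / total_reads >= min_fraction
```

For example, 12 supporting reads out of 100 mapped reads would pass the threshold, while 5 out of 100 would not.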

License

SeroBA is free software, licensed under GPLv3.

Feedback/Issues

Please report any issues to the issues page.

Citation

SeroBA: rapid high-throughput serotyping of Streptococcus pneumoniae from whole genome sequence data
Epping L, van Tonder AJ, Gladstone RA, GPS Consortium, Bentley SD, Page AJ, Keane JA. Microbial Genomics 2018, doi: 10.1099/mgen.0.000186

Further Information

Tutorial

A tutorial for SeroBA can be found here:

https://github.com/sanger-pathogens/pathogen-informatics-training/

seroba's People

Contributors

andrewjpage, antunderwood, eppinglen, gindar, harryhung, kathryn1995, lfulcrum, ssjunnebo

seroba's Issues

No detailed output file

Hello.

Nice tool! Thank you.

I seem to have run into an unexpected behaviour. When I run version 1.0.1 on my test case, I only get a pred.tsv file with three columns. I see no detailed_serogroup_info.txt. Admittedly, I have only run it on a single sample, and the third column suggests it might be contaminated. I am wondering if I missed out on the detailed_serogroup_info because the sample appears contaminated.

Thank you.

Anders.

Can't build DB? False | ariba prepare_ref | no such file

False
no such file
ariba prepareref -f /home/linuxbrew/db/temp_aribaX5nkguz0r/temp_fasta_ref.fasta -m /home/linuxbrew/db/temp_aribaX5nkguz0r/temp_meta_ref.tsv --cdhit_clusters /home/linuxbrew/db/temp_aribaX5nkguz0r/cdhit_cluster_ref seroba/ariba_db/01/ref
False
no such file
ariba prepareref -f /home/linuxbrew/db/temp_aribaXxkf8ld23/temp_fasta_ref.fasta -m /home/linuxbrew/db/temp_aribaXxkf8ld23/temp_meta_ref.tsv --cdhit_clusters /home/linuxbrew/db/temp_aribaXxkf8ld23/cdhit_cluster_ref seroba/ariba_db/02/ref
False
no such file
ariba prepareref -f /home/linuxbrew/db/temp_aribaXhf22fy64/temp_fasta_ref.fasta -m /home/linuxbrew/db/temp_aribaXhf22fy64/temp_meta_ref.tsv --cdhit_clusters /home/linuxbrew/db/temp_aribaXhf22fy64/cdhit_cluster_ref seroba/ariba_db/03/ref
False
no such file
ariba prepareref -f /home/linuxbrew/db/temp_aribaXkkhn_2b7/temp_fasta_ref.fasta -m /home/linuxbrew/db/temp_aribaXkkhn_2b7/temp_meta_ref.tsv --cdhit_clusters /home/linuxbrew/db/temp_aribaXkkhn_2b7/cdhit_cluster_ref seroba/ariba_db/04/ref
False

Default for runSerotyping coverage not being set correctly

Dear seroba team,

I have run into an issue where serotyping won't run unless I explicitly set coverage.

root@d2d570a8775d:/# seroba runSerotyping seroba/database/ data/testsample_1.fastq.gz data/testsample_2.fastq.gz data/TESTcontainer
Traceback (most recent call last):
  File "/usr/local/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.2', 'seroba')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 650, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/seroba/tasks/sero_run.py", line 13, in run
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/seroba/serotyping.py", line 34, in __init__
TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'

It seems like a default value for cov is not being set? I can get it to run by explicitly specifying coverage:

root@d2d570a8775d:/# seroba runSerotyping --coverage 20 /seroba/database/ /data/ERR1438805_1.fastq.gz /data/ERR1438805_2.fastq.gz /data/TESTcontainer

I am using the latest docker container from sangerpathogens/seroba (b4f4e60ee092)
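
For reference, this is how a missing default typically produces such a traceback with argparse: when an option is omitted and no `default=` is given, the attribute is None, and any later arithmetic on it fails. The sketch below is not the SeroBA source; it only shows a parser where the documented default of 20 is set explicitly, and the help text is paraphrased.

```python
import argparse


def build_parser():
    """Build a parser with an explicit default for --coverage."""
    parser = argparse.ArgumentParser(prog="seroba runSerotyping")
    # Without `default=`, argparse stores None when the flag is omitted.
    parser.add_argument(
        "--coverage", type=int, default=20,
        help="threshold for k-mer coverage (default: %(default)s)",
    )
    return parser


# Parsing an empty argument list falls back to the default value.
args = build_parser().parse_args([])
print(args.coverage)  # prints 20
```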

"Bash Run_nucmer.sh" Error

Dear Developer,

In what case will seroba run run_nucmer.sh? And where does the assembly file that nucmer requires, "/data/xxxxxxx/result/assemblies.fa", come from?

Do I need to provide assembly for seroba?

This probably means that very few reads were mapped at all. No local assemblies will be run
WARNING: not enough proper read pairs (found 0) to determine insert size.
This probably means that very few reads were mapped at all. No local assemblies will be run
The following command failed with exit code 255
bash run_nucmer.sh

The output was:

bash: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8)
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = "en_US.utf8",
	LC_COLLATE = "C",
	LANG = "en_AU.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
reading input file "p.ntref" of length 8097 
construct suffix tree for sequence of length 8097
(maximum reference length is 536870908)
(maximum query length is 4294967295)
CONSTRUCTIONTIME /usr/bin/mummer p.ntref 0.00
/usr/bin/mummer: cannot open file "/data/xxxxxxxx/result/assemblies.fa" or file "/data/xxxxxxx/result/assemblies.fa" is empty
ERROR: mummer and/or mgaps returned non-zero
ERROR: Could not parse delta file, p.delta
error no: 400
ERROR: Could not parse delta file, p.delta.filter
error no: 402
ERROR: Could not parse delta file, p.delta.filter
error no: 402

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pymummer/syscall.py", line 20, in run
    output = subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)
  File "/usr/lib/python3.7/subprocess.py", line 395, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 487, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'bash run_nucmer.sh' returned non-zero exit status 255.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.1', 'seroba')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.1-py3.7.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.1-py3.7.egg/seroba/tasks/sero_run.py", line 19, in run
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.1-py3.7.egg/seroba/serotyping.py", line 481, in run
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.1-py3.7.egg/seroba/serotyping.py", line 453, in _prediction
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.1-py3.7.egg/seroba/serotyping.py", line 269, in _find_serotype
  File "/usr/lib/python3/dist-packages/pymummer/nucmer.py", line 144, in run
    syscall.run('bash ' + script, verbose=self.verbose)
  File "/usr/lib/python3/dist-packages/pymummer/syscall.py", line 26, in run
    raise Error('Error running command:', cmd)
pymummer.syscall.Error: ('Error running command:', 'bash run_nucmer.sh')

Josh

seroba stopping signal received 28

cluster detected 1 threads available to it
cluster reported completion
The following command failed with exit code 1
rm -rf /media/crlkims/Data_Vol_1/ROSE/Wghole_genome_sequencing_pneumoniae/RawData_385/ERR1638455/fastq_files/seroba_out/ref/ariba.tmp.1_w_k7u0/cluster

The output was:

rm: cannot remove '/media/crlkims/Data_Vol_1/ROSE/Wghole_genome_sequencing_pneumoniae/RawData_385/ERR1638455/fastq_files/seroba_out/ref/ariba.tmp.1_w_k7u0/cluster': Directory not empty

Stopping! Signal received: 28
Traceback (most recent call last):
  File "/home/crlkims/anaconda2/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.1', 'seroba')
  File "/home/crlkims/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/crlkims/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1460, in run_script
    exec(script_code, namespace, namespace)
  File "/home/crlkims/anaconda2/lib/python3.6/site-packages/seroba-1.0.1-py3.6.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/home/crlkims/anaconda2/lib/python3.6/site-packages/seroba-1.0.1-py3.6.egg/seroba/tasks/sero_run.py", line 19, in run
  File "/home/crlkims/anaconda2/lib/python3.6/site-packages/seroba-1.0.1-py3.6.egg/seroba/serotyping.py", line 480, in run
  File "/home/crlkims/anaconda2/lib/python3.6/site-packages/seroba-1.0.1-py3.6.egg/seroba/serotyping.py", line 94, in _run_ariba_on_cluster
  File "/home/crlkims/anaconda2/lib/python3.6/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'seroba_out/genes/assembled_genes.fa.gz'

I am getting report.tsv, but it is not complete.

Job runs for 17+ hours without completion

Hi there,

I have run a few hundred samples through SeroBA v1.0.2 using the sangerpathogens/seroba docker image. Of these, 6 have failed to complete the job, producing the following log files and continuing to run for 17+ hours before I aborted the job. QC of reads and assemblies from these samples has looked fine. Do you know what might be causing this issue and how we might avoid it?

Thanks in advance, Emma

Stage 1: 0% Stage 1: 26% Stage 1: 52% Stage 1: 78% Stage 1: 100%
Stage 2: 100%
1st stage: 3.11065s
2nd stage: 0.071989s
Total : 3.18264s
Tmp size : 0MB

Stats:
No. of k-mers below min. threshold : 0
No. of k-mers above max. threshold : 0
No. of unique k-mers : 0
No. of unique counted k-mers : 0
Total no. of k-mers : 0
Total no. of reads : 1476720
Total no. of super-k-mers : 0
in1: 0% in1: 0% in2: 0%

Empty summary.tsv file

Hi,

I downloaded seroba via conda and ran it as recommended. However, although I get pred.tsv output for all genomes, I don't get a summary.tsv output. I used a for loop to run the commands for all my genome files since it was easier that way (see below):

#define serotype
for f in ./*_1P.fastq
do
 base=$(basename $f "_1P.fastq")
 basebam=$(basename $f "_L1_out_1P.fastq")
 seroba runSerotyping $out/Pneumocat-dir/ ${base}_1P.fastq ${base}_2P.fastq $out/summary_out/seroBA_${basebam} &&
 seroba summary $out/summary_out/seroBA_${basebam}
done

When I realized that I didn't get the summary output file, I decided to run the seroba summary command for just one output folder (instead of the for loop), but that still didn't work. As you can see in the snapshot below, the size of the file remains 0 KB.

[Screenshot: output folder listing showing summary.tsv with a size of 0 KB]

Is there something I did wrong? Thanks for your help in advance.

Swiss_NT

Hello
I am trying to understand the difference between "Swiss_NT" and "untypable" serotypes in Seroba. I have a set of non-encapsulated S. pneumoniae strains from the published literature which I serotyped again using Seroba. Some are called "Swiss_NT" while some are predicted as "untypable" by Seroba. I can not find any information on "Swiss_NT" in the Seroba manual. Is there any paper that explains this serotype?

Thanks
Tauqeer

gzip: No such file or directory

Hey everybody,

because I couldn't fix the problem executing seroba installed from source, I tried to run it using the provided docker image.
Unfortunately I am still not able to get it to work. I get the following error message. Could someone help me with that?

sudo docker run --rm -it -v /home/user/Test:/data sangerpathogens/seroba seroba runSerotyping seroba/database '/home/user/RKI4410_S1_L001_R1.fastq.gz' '/home/user/RKI4410_S1_L001_R2.fastq.gz' '/home/user/Test/output_test'
gzip: /home/user/RKI4410_S1_L001_R1.fastq.gz: No such file or directory
Traceback (most recent call last):
  File "/usr/local/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.2', 'seroba')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1469, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.2-py3.7.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.2-py3.7.egg/seroba/tasks/sero_run.py", line 19, in run
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.2-py3.7.egg/seroba/serotyping.py", line 468, in run
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.2-py3.7.egg/seroba/serotyping.py", line 60, in _run_kmc
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.2-py3.7.egg/seroba/kmc.py", line 10, in run_kmc
  File "/usr/local/lib/python3.7/dist-packages/seroba-1.0.2-py3.7.egg/seroba/common.py", line 42, in detect_sequence_format
  File "/usr/lib/python3/dist-packages/pyfastaq/utils.py", line 15, in open_file_read
    raise Error("Error opening for reading gzipped file '" + filename + "'")
pyfastaq.utils.Error: Error opening for reading gzipped file '/home/user/RKI4410_S1_L001_R1.fastq.gz'

Thank you very much in advance!

Karsten

Can't build the ariba database

Hi,
After installation with conda install -c bioconda seroba, the ariba database cannot be built with seroba createDBs my_database/ 71 because the installed tbb=2021.2.0 is incompatible with bowtie2 (needed by ariba). After downgrading to tbb=2020.2 the bowtie2 issue is gone, but I get another error while trying to build the database (see below and attached), namely the prepare_ref issue related to issue #37, despite PR #38. This also results in a segmentation fault (core dumped). Any help would be greatly appreciated.
See also attached log.
False
no such file
ariba prepareref -f /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/temp_aribaXjuhh_i6n/temp_fasta_ref.fasta -m /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/temp_aribaXjuhh_i6n/temp_meta_ref.tsv --max_noncoding_length 50000 --cdhit_clusters /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/temp_aribaXjuhh_i6n/cdhit_cluster_ref /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/ariba_db/01/ref
False

-ci1 -m1 -t1 -fm
/home/wmiellet/anaconda3/envs/env-seroba/bin/kmc -k71 -ci1 -m1 -t1 -fm /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/kmer_db/01/01.fasta /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/kmer_db/01/01 /mnt/c/Users/wille/Documents/Pneumococcus/ngsserotyping/db/kmer_db/01
Segmentation fault (core dumped)
seroba_log.txt

The "cd_cluster.tsv" created recently is different from the previous one

cd_cluster_old.txt
cd_cluster_new.txt

cd_cluster_old.txt is the previous file, and cd_cluster_new.txt is the new cd_cluster.tsv that the program created when I tried to build another copy on another device.
Is this due to the new version of KMC or Python3 that I used on the new device?

The new cd_cluster.tsv appears to have some problems and makes seroba serotyping end with errors for some serotypes.
Please let me know whether you can reproduce the problem and whether there is a solution.

Josh

TypeError: argument of type 'NoneType' is not iterable

On a sample, SeroBA encounters a fatal Python error

cluster detected 1 threads available to it
cluster reported completion
cluster_3 detected 1 threads available to it
cluster_3 reported completion
cluster_4 detected 1 threads available to it
cluster_4 reported completion
cluster_6 detected 1 threads available to it
cluster_6 reported completion

0.013121071707115657
/seroba-1.0.2/build/kmc_tools simple /home/ubuntu/local-repo/gps-unified-pipeline/work/64/bd269cc67f479cb03ad3532be5311d/temp.kmcl69cg5cc/NP-0087-IDRL-AKU_S92_trimmed seroba/database/kmer_db/11F/11F intersect /home/ubuntu/local-repo/gps-unified-pipeline/work/64/bd269cc67f479cb03ad3532be5311d/temp.kmcl69cg5cc/inter
0.03264454465908326
/seroba-1.0.2/build/kmc_tools simple /home/ubuntu/local-repo/gps-unified-pipeline/work/64/bd269cc67f479cb03ad3532be5311d/temp.kmcl69cg5cc/NP-0087-IDRL-AKU_S92_trimmed seroba/database/kmer_db/06C/06C intersect /home/ubuntu/local-repo/gps-unified-pipeline/work/64/bd269cc67f479cb03ad3532be5311d/temp.kmcl69cg5cc/inter
0.034005116366132154
/seroba-1.0.2/build/kmc_tools simple /home/ubuntu/local-repo/gps-unified-pipeline/work/64/bd269cc67f479cb03ad3532be5311d/temp.kmcl69cg5cc/NP-0087-IDRL-AKU_S92_trimmed seroba/database/kmer_db/10A/10A intersect /home/ubuntu/local-repo/gps-unified-pipeline/work/64/bd269cc67f479cb03ad3532be5311d/temp.kmcl69cg5cc/inter
0.018366189193022266
15C
{'15A': 0, '15B': 0, '15C': 0, '15F': 16}
15A
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
15B
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
15C
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
15C
15F
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
{'15A': -1, '15B': 0, '15C': -2.5, '15F': 15}
{'15A': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}, '15B': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}, '15C': {'genes': [], 'pseudo': ['wciZ'], 'allele': [], 'snps': []}, '15F': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}}
['15A', '15C']
15A
15B/15C
15C
None
Traceback (most recent call last):
  File "/usr/local/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.2', 'seroba')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 658, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1445, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.6/dist-packages/seroba-1.0.2-py3.6.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/usr/local/lib/python3.6/dist-packages/seroba-1.0.2-py3.6.egg/seroba/tasks/sero_run.py", line 19, in run
  File "/usr/local/lib/python3.6/dist-packages/seroba-1.0.2-py3.6.egg/seroba/serotyping.py", line 481, in run
  File "/usr/local/lib/python3.6/dist-packages/seroba-1.0.2-py3.6.egg/seroba/serotyping.py", line 453, in _prediction
  File "/usr/local/lib/python3.6/dist-packages/seroba-1.0.2-py3.6.egg/seroba/serotyping.py", line 397, in _find_serotype
TypeError: argument of type 'NoneType' is not iterable

The relevant part in the code is

seroba/seroba/serotyping.py

Lines 392 to 397 in 8138dc8

if mixed_serotype != None:
    for key in min_keys:
        print(key)
        print(mixed_serotype)
        if key not in mixed_serotype:
            mixed_serotype = None

I think this piece of code has a logic flaw.

While iterating through min_keys in line 393:

  • when if key not in mixed_serotype at line 396 is true, mixed_serotype is set to None.
  • In the next iteration, if key not in mixed_serotype at line 396 effectively becomes if key not in None, which raises the Python error TypeError: argument of type 'NoneType' is not iterable

Fixing this seems to be trivial, but I am not sure which one is the right approach:

  • looping through all min_keys, only when all keys are not in mixed_serotype, then mixed_serotype should be set to None
  • looping through all min_keys, when any key is not in mixed_serotype, mixed_serotype should be set to None and exit the loop
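
The two approaches can be sketched as standalone functions. This is an illustration only, not the SeroBA source: the function names are hypothetical, and which behaviour is biologically correct is for the maintainers to decide.

```python
def reset_if_any_missing(mixed_serotype, min_keys):
    """Second approach: return None as soon as one key is missing.

    Returning early means `in` is never evaluated against None.
    """
    if mixed_serotype is None:
        return None
    for key in min_keys:
        if key not in mixed_serotype:
            return None
    return mixed_serotype


def reset_if_all_missing(mixed_serotype, min_keys):
    """First approach: return None only when every key is missing."""
    if mixed_serotype is None:
        return None
    if min_keys and all(key not in mixed_serotype for key in min_keys):
        return None
    return mixed_serotype
```

With the values from the log above (`min_keys = ['15A', '15C']` and `mixed_serotype = '15B/15C'`), the second approach returns None because '15A' is missing, while the first approach keeps '15B/15C' because '15C' is present.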

pkg_resources.ResolutionError: No script named 'seroba'

I reinstalled from git (git clone seroba && cd seroba && python3 setup.py install):

% seroba

Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==0.1.5', 'seroba')
  File "/home/linuxbrew/.linuxbrew/opt/python3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 748, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/linuxbrew/.linuxbrew/opt/python3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1509, in run_script
    raise ResolutionError("No script named %r" % script_name)
pkg_resources.ResolutionError: No script named 'seroba'

errors during installation via conda/mamba

Dear Lennard,

is the tool seroba still maintained?
We had issues with the installation via conda and mamba.

#Standard installation from readme.md
conda install -c bioconda seroba
--> error with incompatible glibc versions.

#Standard installation from bioconda
mamba install seroba
--> seroba does not exist

Could not solve for environment specs
The following package could not be installed
└─ seroba does not exist (perhaps a typo or a missing channel).

#Installation with custom yml file with information from this post
#conda conda env create -f seroba.yml
--> worked, but after setting up databases, we got this error:

ERROR: I tried to get the version of nucmer with: "/mnt/localdata/homes/user/miniconda3/envs/seroba/bin/nucmer --version" and the output didn't match this regular expression: "^NUCmer (NUCleotide MUMmer) version ([0-9.]+)"
Something wrong with at least one dependency. Please see the above error message(s)
Traceback (most recent call last):
  File "/mnt/localdata/homes/user/miniconda3/envs/seroba/bin/seroba", line 3, in <module>
    import seroba
  File "/mnt/localdata/homes/user/miniconda3/envs/seroba/lib/python3.6/site-packages/seroba/__init__.py", line 16, in <module>
    from seroba import *
  File "/mnt/localdata/homes/user/miniconda3/envs/seroba/lib/python3.6/site-packages/seroba/kmc.py", line 6, in <module>
    ext_progs = external_progs.ExternalProgs()
  File "/mnt/localdata/homes/user/miniconda3/envs/seroba/lib/python3.6/site-packages/seroba/external_progs.py", line 90, in __init__
    raise Error('Dependency error(s). Cannot continue')

We then removed lines 15 and 27 in the script external_progs.py.

Now we can call seroba but not sure if it really works now.

Thanks for your time and input on this topic!

All the best,
Markus
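
The error message shows that the version check expects the banner line "NUCmer (NUCleotide MUMmer) version X". A possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that the installed nucmer prints only a bare version string. The sketch below shows a more tolerant check; both patterns and the function name are hypothetical and are not taken from external_progs.py.

```python
import re

# Pattern as printed in the error message, with the literal parentheses
# escaped (an assumption about what the original source intends).
BANNER = re.compile(r"^NUCmer \(NUCleotide MUMmer\) version ([0-9.]+)")
# Hypothetical fallback for output that is just a version string.
BARE = re.compile(r"^([0-9]+(?:\.[0-9]+)+)")


def nucmer_version(output):
    """Return the version found in `nucmer --version` output, or None."""
    for line in output.splitlines():
        match = BANNER.match(line) or BARE.match(line)
        if match:
            return match.group(1)
    return None
```

A check like this would avoid having to delete lines from external_progs.py, although whether SeroBA works correctly with that nucmer version would still need to be tested.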

Readme mentions SVN on a git database

For SeroBA version 0.1.3 and greater, download the database provided within this git repository:

Install svn
svn checkout "https://github.com/sanger-pathogens/seroba/trunk/database"

serotyping from assemblies as input

Dear Friends,

I don't have reads, but assemblies (one or more contigs) of S. pneumoniae.
How can I assign the serotype?
SeroBA doesn't seem to support FASTA input, right?

Bests,
Alex

AttributeError: 'Namespace' object has no attribute 'database_dir'

% seroba getPneumocat db

Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==0.1.4', 'seroba')
  File "/home/linuxbrew/.linuxbrew/opt/python3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 742, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/linuxbrew/.linuxbrew/opt/python3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1510, in run_script
    exec(script_code, namespace, namespace)
  File "/home/linuxbrew/.linuxbrew/opt/python3/lib/python3.6/site-packages/seroba-0.1.4-py3.6.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/home/linuxbrew/.linuxbrew/opt/python3/lib/python3.6/site-packages/seroba-0.1.4-py3.6.egg/seroba/tasks/getPneumocat.py", line 6, in run
AttributeError: 'Namespace' object has no attribute 'database_dir'

Error:

Hi,
I have been using seroba v1.0.1 recently, but after changing to seroba v1.0.2, I started getting the error "min memory must be at least 2GB". The same thing happened even after reverting to v1.0.1...

Error: Wrong parameret: min memory must be at least 2GB
/phe/tools/miniconda3/envs/phetype/bin/kmc_tools simple /scratch/iidlleo/Spneumoniae/temp.kmctmwj04if/2212515304 /phe/tools/seroba/database/kmer_db/35B/35B intersect /scratch/iidlleo/Spneumoniae/temp.kmctmwj04if/inter
Error: Cannot open file /scratch/iidlleo/Spneumoniae/temp.kmctmwj04if/2212515304.kmc_pre
Error: Cannot open file /scratch/iidlleo/Spneumoniae/temp.kmctmwj04if/inter.kmc_pre
Traceback (most recent call last):
  File "/phe/tools/miniconda3/envs/phetype/bin/seroba", line 86, in <module>
    args.func(args)
  File "/phe/tools/miniconda3/envs/phetype/lib/python3.6/site-packages/seroba/tasks/sero_run.py", line 19, in run
    sero.run()
  File "/phe/tools/miniconda3/envs/phetype/lib/python3.6/site-packages/seroba/serotyping.py", line 468, in run
    self._run_kmc()
  File "/phe/tools/miniconda3/envs/phetype/lib/python3.6/site-packages/seroba/serotyping.py", line 68, in _run_kmc
    with open( temp_hist, 'r') as fobj:
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/iidlleo/Spneumoniae/temp.kmctmwj04if/hist'

Is there something missing in my dependencies?

YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated,

seroba  getPneumocat seroba
--2019-04-03 09:34:36--  https://github.com/phe-bioinformatics/PneumoCaT/archive/v1.1.tar.gz
Resolving github.com (github.com)... 192.30.255.113, 192.30.255.112
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/phe-bioinformatics/PneumoCaT/tar.gz/v1.1 [following]
--2019-04-03 09:34:37--  https://codeload.github.com/phe-bioinformatics/PneumoCaT/tar.gz/v1.1
Resolving codeload.github.com (codeload.github.com)... 192.30.255.120, 192.30.255.121
Connecting to codeload.github.com (codeload.github.com)|192.30.255.120|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: 'v1.1.tar.gz'

v1.1.tar.gz                  [              <=>               ] 320.07M  3.77MB/s    in 78s     

2019-04-03 09:35:56 (4.09 MB/s) - 'v1.1.tar.gz' saved [335618305]

/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/seroba/get_pneumocat_data.py:47: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  allele_snp=yaml.load( open( os.path.join(serogroup_dir,subdir,'mutationdb.yml'), "rb" ) )

Dependencies issues when using seroba through conda

Hello! I have encountered a few issues while trying to run seroba in a conda environment. One is the same as reported in issue #59 with biopython, and the other is with bowtie2. In the latter, seroba runs properly at the beginning but then stops with an error about not being able to get the bowtie2 --version. The specific error that bowtie2 throws is this:

error while loading shared libraries: libtbb.so.2: cannot open shared object file: No such file or directory

I tracked down both errors, and it turns out that in order to use seroba you need to list biopython=1.74 (see this forum) and tbb=2020.3 (see this other forum) as dependencies. I ran a few tests and it seems to work fine for me (with seroba 1.0.0 and 1.0.2).

It would be really helpful to have these dependencies properly documented (I assume the error is not unique to the conda version, but I did not test it) or simply have them included in the installation.

Conda install 'createDBs' and 'getPneumocat' errors

Hello!

I was thinking about including the databases in the bioconda recipe, but I'm running into some errors.

I installed Seroba using mamba

mamba create -n test-seroba -c conda-forge -c bioconda seroba

I'm getting the following errors:

createDBs

seroba createDBs database 71
Traceback (most recent call last):
  File "/home/robert_petit/miniconda3/envs/test-seroba/bin/seroba", line 86, in <module>
    args.func(args)
  File "/home/robert_petit/miniconda3/envs/test-seroba/lib/python3.8/site-packages/seroba/tasks/createDBs.py", line 10, in run
    ref_db.run()
  File "/home/robert_petit/miniconda3/envs/test-seroba/lib/python3.8/site-packages/seroba/ref_db_creator.py", line 236, in run
    self.meta_dict = self._read_meta_data_tsv(self.meta_data_tsv)
  File "/home/robert_petit/miniconda3/envs/test-seroba/lib/python3.8/site-packages/seroba/ref_db_creator.py", line 183, in _read_meta_data_tsv
    with open(meta_data_tsv,'r') as fobj:
FileNotFoundError: [Errno 2] No such file or directory: 'database/meta.tsv'

getPneumocat

seroba getPneumocat database
--2022-01-26 02:36:35--  https://github.com/phe-bioinformatics/PneumoCaT/archive/v1.1.tar.gz
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/phe-bioinformatics/PneumoCaT/tar.gz/v1.1 [following]
--2022-01-26 02:36:35--  https://codeload.github.com/phe-bioinformatics/PneumoCaT/tar.gz/v1.1
Resolving codeload.github.com (codeload.github.com)... 140.82.114.9
Connecting to codeload.github.com (codeload.github.com)|140.82.114.9|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘v1.1.tar.gz’

v1.1.tar.gz                  [              <=>               ] 320.07M  24.1MB/s    in 13s

2022-01-26 02:36:48 (24.0 MB/s) - ‘v1.1.tar.gz’ saved [335618305]

Traceback (most recent call last):
  File "/home/robert_petit/miniconda3/envs/test-seroba/bin/seroba", line 86, in <module>
    args.func(args)
  File "/home/robert_petit/miniconda3/envs/test-seroba/lib/python3.8/site-packages/seroba/tasks/getPneumocat.py", line 7, in run
    pneumo.run()
  File "/home/robert_petit/miniconda3/envs/test-seroba/lib/python3.8/site-packages/seroba/get_pneumocat_data.py", line 105, in run
    self._pneumocat_db_2_tsv(self.serogroup_dir,self.out_file)
  File "/home/robert_petit/miniconda3/envs/test-seroba/lib/python3.8/site-packages/seroba/get_pneumocat_data.py", line 47, in _pneumocat_db_2_tsv
    allele_snp=yaml.load( open( os.path.join(serogroup_dir,subdir,'mutationdb.yml'), "rb" ) )
TypeError: load() missing 1 required positional argument: 'Loader'

Here's my conda env if you think it might be helpful

conda env export
name: test-seroba
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=1_gnu
  - _sysroot_linux-64_curr_repodata_hack=3=h5bd9786_13
  - ariba=2.14.6=py38hc37a69a_2
  - bcftools=1.14=hde04aa1_1
  - beautifulsoup4=4.10.0=pyha770c72_0
  - biopython=1.77=py38h1e0a361_1
  - bowtie2=2.2.5=py38h8c62d01_8
  - brotli=1.0.9=h7f98852_6
  - brotli-bin=1.0.9=h7f98852_6
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.18.1=h7f98852_0
  - ca-certificates=2021.10.26=h06a4308_2
  - cd-hit=4.8.1=h2e03b76_5
  - certifi=2021.10.8=py38h578d9bd_1
  - cycler=0.11.0=pyhd8ed1ab_0
  - dendropy=4.5.2=pyh3252c3a_0
  - fonttools=4.29.0=py38h497a2fe_0
  - freetype=2.11.0=h70c0345_0
  - gettext=0.21.0=hf68c758_0
  - giflib=5.2.1=h516909a_2
  - gsl=2.7=he838d99_0
  - htslib=1.14=h5138463_1
  - icu=69.1=h9c3ff4c_0
  - jpeg=9d=h516909a_0
  - kernel-headers_linux-64=3.10.0=h4a8ded7_13
  - kiwisolver=1.3.2=py38h1fd1430_1
  - kmc=3.2.1=h95f258a_1
  - krb5=1.19.2=hcc1bbae_3
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.36.1=hea4e1c9_2
  - libblas=3.9.0=13_linux64_openblas
  - libbrotlicommon=1.0.9=h7f98852_6
  - libbrotlidec=1.0.9=h7f98852_6
  - libbrotlienc=1.0.9=h7f98852_6
  - libcblas=3.9.0=13_linux64_openblas
  - libcurl=7.81.0=h2574ce0_0
  - libdeflate=1.9=h7f98852_0
  - libedit=3.1.20210714=h7f8727e_0
  - libev=4.33=h516909a_1
  - libffi=3.4.2=h7f98852_5
  - libgcc-ng=11.2.0=h1d223b6_12
  - libgfortran-ng=11.2.0=h69a702a_12
  - libgfortran5=11.2.0=h5c6108e_12
  - libgomp=11.2.0=h1d223b6_12
  - libiconv=1.16=h516909a_0
  - libidn2=2.3.2=h7f98852_0
  - liblapack=3.9.0=13_linux64_openblas
  - libnghttp2=1.46.0=hce63b2e_0
  - libnsl=2.0.0=h7f98852_0
  - libopenblas=0.3.18=pthreads_h8fe5266_0
  - libpng=1.6.37=hed695b0_2
  - libssh2=1.10.0=ha56f1ee_2
  - libstdcxx-ng=11.2.0=he4da1e4_12
  - libtiff=4.2.0=hf544144_3
  - libunistring=0.9.10=h14c3975_0
  - libwebp=1.2.0=h3452ae3_0
  - libwebp-base=1.2.0=h7f98852_2
  - libxml2=2.9.12=h885dcf4_1
  - libzlib=1.2.11=h36c2ea0_1013
  - llvm-openmp=8.0.1=hc9558a2_0
  - lz4-c=1.9.3=h9c3ff4c_1
  - matplotlib-base=3.5.1=py38hf4fb855_0
  - mummer=3.23=pl5321h1b792b2_13
  - munkres=1.1.4=pyh9f0ad1d_0
  - ncurses=6.2=h58526e2_4
  - numpy=1.22.1=py38h6ae9a64_0
  - olefile=0.46=pyh9f0ad1d_1
  - openmp=8.0.1=0
  - openssl=1.1.1m=h7f8727e_0
  - packaging=21.3=pyhd8ed1ab_0
  - perl=5.32.1=1_h7f98852_perl5
  - pillow=8.4.0=py38h5aabda8_0
  - pip=21.3.1=pyhd8ed1ab_0
  - pyfastaq=3.17.0=py_2
  - pymummer=0.10.3=py_2
  - pyparsing=3.0.7=pyhd8ed1ab_0
  - pysam=0.17.0=py38h104f7d5_1
  - python=3.8.12=hb7a2778_2_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python_abi=3.8=2_cp38
  - pyyaml=6.0=py38h497a2fe_3
  - readline=8.1=h46c0cb4_0
  - samtools=1.14=hb421002_0
  - seroba=1.0.2=py_0
  - setuptools=60.5.0=py38h578d9bd_0
  - six=1.16.0=pyh6c4a22f_0
  - soupsieve=2.3.1=pyhd8ed1ab_0
  - spades=3.15.3=h95f258a_1
  - sqlite=3.37.0=h9cd32fc_0
  - sysroot_linux-64=2.17=h4a8ded7_13
  - tk=8.6.11=h27826a3_1
  - unicodedata2=14.0.0=py38h497a2fe_0
  - wget=1.20.3=ha56f1ee_1
  - wheel=0.37.1=pyhd8ed1ab_0
  - xz=5.2.5=h516909a_1
  - yaml=0.2.5=h7f98852_2
  - zlib=1.2.11=h36c2ea0_1013
  - zstd=1.5.2=ha95c52a_0
prefix: /home/robert_petit/miniconda3/envs/test-seroba
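
The TypeError in the getPneumocat traceback above is consistent with the pyyaml=6.0 entry in this environment: PyYAML 6.0 made the Loader argument of yaml.load() mandatory, whereas the 5.x releases only emitted the YAMLLoadWarning. A minimal sketch of a call that avoids the error, assuming the mutationdb.yml files contain only plain YAML data (the helper name and the sample document are hypothetical, not taken from seroba):

```python
import io

import yaml  # PyYAML


def load_mutation_db(stream):
    """Parse a YAML stream without relying on an implicit default Loader.

    Hypothetical helper: stands in for the bare yaml.load(...) call in
    get_pneumocat_data.py, assuming the file holds plain data only.
    """
    return yaml.safe_load(stream)


# Made-up sample content; the real mutationdb.yml layout may differ.
sample = io.BytesIO(b"allele:\n  geneA: 1\nsnps:\n  - 42\n")
data = load_mutation_db(sample)
print(data)
```

Pinning an older PyYAML 5.x release in the environment might be another way around the error, at the cost of keeping the deprecation warning.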

biopython issue!

I installed seroba with conda and encountered this message.

$ seroba
Traceback (most recent call last):
  File "/home/ctsui/.conda/envs/seroBA/bin/seroba", line 3, in <module>
    import seroba
  File "/home/ctsui/.conda/envs/seroBA/lib/python3.6/site-packages/seroba/__init__.py", line 16, in <module>
    from seroba import *
  File "/home/ctsui/.conda/envs/seroBA/lib/python3.6/site-packages/seroba/tasks/__init__.py", line 10, in <module>
    from seroba.tasks import *
  File "/home/ctsui/.conda/envs/seroBA/lib/python3.6/site-packages/seroba/tasks/getPneumocat.py", line 2, in <module>
    from seroba import get_pneumocat_data
  File "/home/ctsui/.conda/envs/seroBA/lib/python3.6/site-packages/seroba/get_pneumocat_data.py", line 6, in <module>
    from Bio.Alphabet import generic_dna
  File "/home/ctsui/.conda/envs/seroBA/lib/python3.6/site-packages/Bio/Alphabet/__init__.py", line 21, in <module>
    "Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the molecule_type as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information."
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the molecule_type as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information.
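
Bio.Alphabet was removed in Biopython 1.78, so the import on line 6 of get_pneumocat_data.py fails with any newer release. The sketch below is a small version check that could be used to decide whether an environment needs an older Biopython pin; the function name is hypothetical and it only inspects the "major.minor" part of the version string.

```python
def has_bio_alphabet(biopython_version):
    """Return True if this Biopython version still ships Bio.Alphabet.

    Bio.Alphabet was removed in Biopython 1.78; only the leading
    "major.minor" part of the version string is compared.
    """
    major, minor = (int(part) for part in biopython_version.split(".")[:2])
    return (major, minor) < (1, 78)


# Example versions: an older pin, the removal release, and a later one.
for version in ("1.74", "1.78", "1.79"):
    print(version, has_bio_alphabet(version))
```

In a live environment the installed version string is available as Bio.__version__.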

KeyError:'24B'

Hey everyone,

I am getting a KeyError when I am running seroba on my test data.

Traceback (most recent call last):
  File "/home/user/.local/bin/seroba", line 86, in <module>
    args.func(args)
  File "/home/user/.local/lib/python3.8/site-packages/seroba/tasks/sero_run.py", line 19, in run
    sero.run()
  File "/home/user/.local/lib/python3.8/site-packages/seroba/serotyping.py", line 479, in run
    cluster = self.serotype_cluster_dict[self.best_serotype]
KeyError: '24B'

I am not sure what it means or how to fix it. Maybe someone had the same or a similar problem ?

Thank you in advance!

nucmer dependency error

Hi,

I have installed KMC and Mummer. They are in the path and work. However, when I try to run seroba, it complains:

ERROR: I tried to get the version of nucmer with: "/apps/mummer/4.0.0.beta2/bin/nucmer --version" and the output didn't match this regular expression: "^NUCmer \(NUCleotide MUMmer\) version ([0-9\.]+)"

When I run /apps/mummer/4.0.0.beta2/bin/nucmer --version I get 4.0.0beta2, which indeed does not match what seroba expects.
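
The mismatch can be reproduced with the regular expression quoted in the error message. The sketch below applies that pattern to the MUMmer 4 output reported above and to an illustrative MUMmer 3 style banner (constructed here to fit the pattern, not copied from a real run). The second, more tolerant pattern is hypothetical and only shows one way such a check could accept both formats; it is not seroba's code.

```python
import re

# Pattern quoted in the error message above.
EXPECTED = re.compile(r"^NUCmer \(NUCleotide MUMmer\) version ([0-9\.]+)")

# Hypothetical, more tolerant pattern: the banner prefix is optional, so a
# bare version string such as the one MUMmer 4 prints is also accepted.
TOLERANT = re.compile(r"^(?:NUCmer \(NUCleotide MUMmer\) version )?([0-9][0-9.]*)")


def parse_version(pattern, output):
    """Return the captured version string, or None if the output does not match."""
    match = pattern.search(output)
    return match.group(1) if match else None


mummer3_output = "NUCmer (NUCleotide MUMmer) version 3.1"  # illustrative banner
mummer4_output = "4.0.0beta2"  # output reported in this issue

print(parse_version(EXPECTED, mummer3_output))
print(parse_version(EXPECTED, mummer4_output))
print(parse_version(TOLERANT, mummer4_output))
```

With the quoted pattern the MUMmer 4 string yields no match, which is what triggers the dependency error; the tolerant pattern captures only the numeric part and drops the "beta2" suffix.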

11A/11C Misidentification

Hello!

I work at the Minnesota Department of Health. Internally, we have been using Seroba as a replacement for our conventional/molecular serotyping of Strep pneumo for a few months now.

We recently sequenced a handful of 11As (previously serotyped by quellung) that Seroba calls 11C. We've tried running Seroba in a number of different environments/containers and it consistently predicts 11C. Manually mapping the reads to each cps locus also seems to give better results for 11A than for 11C.

Any idea what might be happening here? I'd be happy to share the sequence data privately.

Thanks!

Run Seroba with PneumoCat v1.2.1??

Hi,
I noticed an issue while setting up the database for Seroba: the program downloads an outdated version of the PneumoCaT database (v1.1). Subsequent versions of PneumoCaT include important revisions, particularly to serotype 15A and serogroup 19.

Is there a way to run Seroba using the Pneumocat database v1.2.1??
Thanks.

latest build on docker hub is non functional

Greetings,

The latest build on dockerhub (tag latest) is non-functional. The program is exiting with the following error:

(base) cimendes@lobo-1:~/pneumo/in_silico_serotype/seroba_fabio_2021$ srun --nodes=1 --ntasks=1 --cpus-per-task=4 shifter --image=sangerpathogens/seroba:latest seroba runSerotyping /seroba/database/ /home/cimendes/pneumo/in_silico_serotype/seroba_fabio_2021/concatenated_reads/2017PP664_1.fastq.gz /home/cimendes/pneumo/in_silico_serotype/seroba_fabio_2021/concatenated_reads/2017PP664_2.fastq.gz /mnt/nfs/lobo/ONEIDA-NFS/cimendes/pneumo/in_silico_serotype/seroba_fabio_2021/outdir/2017PP664/seroba;
Traceback (most recent call last):
  File "/usr/local/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.2', 'seroba')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 650, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/seroba/tasks/sero_run.py", line 13, in run
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/seroba/serotyping.py", line 34, in __init__
TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'
srun: error: compute-1: task 0: Exited with exit code 1
srun: launch/slurm: _step_signal: Terminating StepId=2113593.0

Running the same command with build sangerpathogens/seroba:remove_sanger_pathogen_email works as expected.

error

Hello,
I installed seroba and downloaded the database from PneumoCaT, but I am not able to create the database.
(py36)[ctsui@grl-salk Strep_sero]$ seroba createDBs pneumoDB 71
Traceback (most recent call last):
  File "/home/ctsui/.conda/envs/py36/bin/seroba", line 86, in <module>
    args.func(args)
  File "/home/ctsui/.conda/envs/py36/lib/python3.6/site-packages/seroba/tasks/createDBs.py", line 10, in run
    ref_db.run()
  File "/home/ctsui/.conda/envs/py36/lib/python3.6/site-packages/seroba/ref_db_creator.py", line 237, in run
    os.makedirs(os.path.join(self.out_dir,'ariba_db'))
  File "/home/ctsui/.conda/envs/py36/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: 'pneumoDB/ariba_db'

I would appreciate any advice. Thanks,

Clement
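
The traceback above shows os.makedirs() failing because pneumoDB/ariba_db already exists, possibly left over from an earlier createDBs attempt. The sketch below reproduces that behaviour in a temporary directory and shows one possible workaround, deleting the leftover directory before retrying. Whether seroba expects ariba_db to be absent on every run is an assumption here, and removing the directory discards whatever a previous run wrote there.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so nothing real is touched.
base = tempfile.mkdtemp()
ariba_db = os.path.join(base, "pneumoDB", "ariba_db")

os.makedirs(ariba_db)  # first run: the directory is created

try:
    os.makedirs(ariba_db)  # second run: same kind of call as in ref_db_creator.py
    second_run_failed = False
except FileExistsError:
    second_run_failed = True

# Possible workaround (assumption): delete the leftover directory, then retry.
shutil.rmtree(ariba_db)
os.makedirs(ariba_db)
recreated = os.path.isdir(ariba_db)

shutil.rmtree(base)  # clean up the temporary tree
print(second_run_failed, recreated)
```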

Got an error message through docker with Ubuntu 20.04 LTS

When I run seroBA using docker 20.10.1 on Ubuntu 20.04 LTS, I get the following error message:

Traceback (most recent call last):
  File "/usr/local/bin/seroba", line 4, in <module>
    __import__('pkg_resources').run_script('seroba==1.0.2', 'seroba')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 650, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/EGG-INFO/scripts/seroba", line 86, in <module>
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/seroba/tasks/sero_run.py", line 13, in run
  File "/usr/local/lib/python3.8/dist-packages/seroba-1.0.2-py3.8.egg/seroba/serotyping.py", line 34, in __init__
TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'

How can I solve this problem?
