qhery

Identification of mutations in SARS-CoV-2 associated with resistance to treatment.

qhery was developed by the Q-PHIRE Genomics team at Forensic and Scientific Services, Queensland Health.

Installation
- git
- Requirements
- Optional requirements
Example usage
Output
- Example output
- Columns
TODO
Arguments
- subcommands
  - list_rx
  - run
  - mutations
Method

Installation

git

While in development, qhery can only be installed by cloning the git repository:

git clone https://github.com/mjsull/qhery.git

Requirements:

Python >= 3.9.12
bcftools >= 1.10.2
curl >= 7.83.1
wget >= 1.20.3

Optional requirements:

ncbi-blast+ >= 2.9.0+ - used to generate a BLASTx alignment of the genome for visualization.
lofreq >= 2.1.5 - if a BAM file is provided, qhery uses lofreq to look for minor alleles in the alignment.
samtools >= 1.7 - used to determine the sequencing depth along the genome, and therefore which resistance mutations cannot be reported on due to lack of coverage.
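Before running qhery it can help to confirm that these external tools are on the PATH. The following is a minimal sketch and not part of qhery: the helper name `missing_tools` is hypothetical, the executable names (in particular `blastx` for ncbi-blast+) are assumptions, and versions are not checked.

```python
import shutil

# Executable names are assumptions based on the requirements listed above.
REQUIRED = ["bcftools", "curl", "wget"]
OPTIONAL = ["blastx", "lofreq", "samtools"]


def missing_tools(names, which=shutil.which):
    """Return the subset of names that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_tools(tools)
        status = "all found" if not missing else "missing " + ", ".join(missing)
        print(f"{label}: {status}")
```

The `which` parameter exists only so the lookup can be swapped out when testing.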

Example usage:

qhery run --database_dir database_dir --vcf sample.vcf --pipeline_dir output_dir --lineage Omicron/BA.1 --sample_name mysample --rx_list Sotrovimab

Determines the amino acid changes caused by the mutations listed in sample.vcf and then compares them to a list of mutations that cause a reduction in Sotrovimab binding.

qhery run --database_dir database_dir --vcf sample.vcf --pipeline_dir output_dir --lineage Omicron/BA.1 --sample_name mysample --rx_list Sotrovimab Remdesivir --fasta sample.consensus.fasta --bam sample.primertrimmed.rg.sorted.bam

Determines the amino acid changes caused by the mutations listed in sample.vcf, and additionally uses lofreq to find minor alleles in the BAM file. The resulting mutations are then compared to a list of mutations that cause a reduction in Sotrovimab binding or a reduction in Remdesivir efficacy.

qhery list_rx --database_dir database_dir

List treatments for which resistance information exists.
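When qhery is called from another script, the command lines above can be assembled programmatically. This is a sketch under the assumption that `qhery` is on the PATH; the function names `build_run_command` and `run_qhery` are hypothetical, and only flags shown in the examples above are used.

```python
import subprocess


def build_run_command(database_dir, vcf, pipeline_dir, lineage, sample_name,
                      rx_list, fasta=None, bam=None):
    """Build the argument list for `qhery run` from the flags shown above."""
    cmd = [
        "qhery", "run",
        "--database_dir", database_dir,
        "--vcf", vcf,
        "--pipeline_dir", pipeline_dir,
        "--lineage", lineage,
        "--sample_name", sample_name,
        "--rx_list", *rx_list,
    ]
    # The consensus FASTA and the BAM file are optional inputs.
    if fasta is not None:
        cmd += ["--fasta", fasta]
    if bam is not None:
        cmd += ["--bam", bam]
    return cmd


def run_qhery(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.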

Output

qhery produces two tables.

<sample_name>.full.tsv

Contains all mutations detected in the sample and all mutations associated with the treatments listed by the user.

<sample_name>.final.tsv

Contains all mutations detected in the sample that are both in genes associated with resistance to the treatments listed and are not lineage-defining mutations. It also contains resistance mutations that do not have enough read depth to be called as present or absent (default 20x read depth).

Both tables have the same format (described below in example output).

Finally, qhery also produces a BLASTx alignment of the query to mature proteins and a bammix plot of the epitopes of the treatments the user listed (if available).

Example output:

Mutation	alt_names	in_sample	in_variant	covered	resistance_mutation	Remdesivir_average_fold_reduction	Remdesivir_fold_reductions	Remdesivir_in_epitope	Sotrovimab_average_fold_reduction	Sotrovimab_fold_reductions	Sotrovimab_in_epitope
E:T9I	-	True	True	True	False	0	-	False	0	-	False
M:D3G	-	True	True	True	False	0	-	False	0	-	False
N:ERS31-33∆	-	True	True	False	False	0	-	False	0	-	False
ORF3a:L52F	-	True	False	True	False	0	-	False	0	-	False
RdRP:802D	-	False	False	True	True	2.54	=2.54	False	0	-	False
S:R214ins	S:R214R_EPE	True	True	True	True	0	-	False	3.00	=3.0	False
S:P337T	-	True	False	True	True	0	-	False	8.00	=5.4,=10.6	True
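The tables are plain tab-separated text, so they can be read with standard tools. The sketch below lists resistance mutations that were detected in the sample. It assumes the booleans are written as the literal strings `True` and `False` and that the first column header is `Mutation`, as in the example output; the function name `resistance_hits` is hypothetical.

```python
import csv
import io


def resistance_hits(handle):
    """Return names of resistance mutations that are present in the sample."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["Mutation"]
        for row in reader
        if row["in_sample"] == "True" and row["resistance_mutation"] == "True"
    ]


# Small in-memory stand-in for a qhery table (a subset of the columns above).
EXAMPLE = (
    "Mutation\talt_names\tin_sample\tin_variant\tcovered\tresistance_mutation\n"
    "E:T9I\t-\tTrue\tTrue\tTrue\tFalse\n"
    "RdRP:802D\t-\tFalse\tFalse\tTrue\tTrue\n"
    "S:P337T\t-\tTrue\tFalse\tTrue\tTrue\n"
)

print(resistance_hits(io.StringIO(EXAMPLE)))  # ['S:P337T']
```

For a real run, the same function can be given an open file, for example `open("mysample.final.tsv", newline="")`.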

Columns

column	header	description
1	Mutation	The mutation name: the gene name comes before the colon, followed by the reference amino acid, the position and the sample amino acid
2	alt_names	Discrepancy between the database mutation name and the csq mutation name
3	in_sample	Is mutation in the query
4	in_variant	Is mutation a lineage defining mutation
5	covered	Is the mutation covered by 20 or more reads
6	resistance_mutation	Is there evidence the mutation may confer some resistance to one of the treatments listed
7	rx1_average_fold_reduction	Average fold reduction of listed fold reductions
8	rx1_fold_reductions	Fold reductions listed in the database
9	rx1_in_epitope	Is the mutation in the epitope of this treatment (MABs only)
10	rx2_average_fold_reduction	The previous 3 columns repeat for each treatment provided by the user
11	rx2_fold_reductions	...
12	rx2_in_epitope	...
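The relationship between the `fold_reductions` and `average_fold_reduction` columns can be illustrated with the example row for S:P337T, where `=5.4,=10.6` averages to 8.00. This is only a sketch of that arithmetic, not qhery's own code: the function names are hypothetical, and the handling of `<` and `>` prefixes is an assumption, since the example output only shows `=`.

```python
def parse_fold_reductions(field):
    """Parse a field such as '=5.4,=10.6' into a list of floats ('-' is empty)."""
    if field.strip() in ("", "-"):
        return []
    # Assumption: each entry is a number with an optional comparison prefix.
    return [float(entry.strip().lstrip("=<>")) for entry in field.split(",")]


def average_fold_reduction(field):
    """Arithmetic mean of the listed fold reductions, or 0.0 if none are listed."""
    values = parse_fold_reductions(field)
    return sum(values) / len(values) if values else 0.0


print(f"{average_fold_reduction('=5.4,=10.6'):.2f}")  # 8.00
```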

TODO

Need a phasing step between lofreq and bcftools csq (or to switch to a vcf caller that does phasing)
Add allele frequency information to output table

Arguments:

-h, --help

show this help message and exit

subcommands

list_rx

List all drugs for which resistance information is available.

Takes no arguments

run

Determines mutations in samples and then checks against resistance data.

arguments

-n, --sample_name <sample_name>

Sample name, output files will be prefixed with this.

-v, --vcf <sample.vcf>

VCF file of variants called against the Wuhan-Hu-1 reference (MN908947.3).

-b, --bam <sample.sorted.bam>

Sorted BAM file of read alignments for the sample, mapped against the Wuhan-Hu-1 reference (MN908947.3).

-d, --database_dir <path/to/database_dir>

Directory with the latest version of the Stanford resistance database. If the latest version is not in this folder it will be downloaded to this location.

-p, --pipeline_dir <path/to/pipeline_dir>

All script output and intermediate files will be put here. The script will create the directory if it does not exist.

-l, --lineage <BA.1>

Lineage of the query (BA.1/BA.2/BA.3/Delta etc.)

-rx, --rx_list <Sotrovimab Remdesivir>

List of treatments to interrogate.

--fasta, --fasta <sample.fasta>

Fasta file of the consensus sequence of the sample, only used to generate a BLASTx alignment for double checking mutations.

mutations

Only lists mutations, without resistance information.

arguments

-n, --sample_name <sample_name>

Sample name, output files will be prefixed with this.

-v, --vcf <sample.vcf>

VCF file of variants called against the Wuhan-Hu-1 reference (MN908947.3).

-b, --bam <sample.sorted.bam>

Sorted BAM file of read alignments for the sample, mapped against the Wuhan-Hu-1 reference (MN908947.3).

-d, --database_dir <path/to/database_dir>

Directory with the latest version of the Stanford resistance database. If the latest version is not in this folder it will be downloaded to this location.

-p, --pipeline_dir <path/to/pipeline_dir>

All script output and intermediate files will be put here. The script will create the directory if it does not exist.

-l, --lineage <BA.1>

Lineage of the query (BA.1/BA.2/BA.3/Delta etc.)

-k, --keep_lineage

Report lineage-defining mutations as well.

Method

A flowchart of how qhery run works
