mitts

MITTS - Mutations In Twenty Three S

A simple tool that estimates the number of 23S rRNA gene copies that carry a given point mutation, given a reference sequence and a known position.
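
The README does not spell out how the estimate is computed. As a minimal sketch, assuming the number of mutated copies is taken to be the fraction of reads carrying the variant at the position, scaled by the copy number and rounded, the idea could look like this. The function name and its arguments are hypothetical and are not taken from mitts.py.

```python
def estimate_mutated_copies(variant_reads, total_reads, copy_number, mincov=60):
    """Estimate how many 23S copies carry the variant (hypothetical helper).

    Assumption: the variant read fraction is proportional to the number
    of mutated copies. This is not confirmed to be what mitts does.
    """
    if total_reads < mincov:
        # Coverage is below the threshold, so no call is made.
        return None
    fraction = variant_reads / total_reads
    # Round the expected number of mutated copies to the nearest integer.
    return int(fraction * copy_number + 0.5)


# 52 of 150 reads carry the variant in a genome with 6 copies of 23S.
print(estimate_mutated_copies(52, 150, 6))
```

With these made-up counts, about a third of the reads carry the variant, which corresponds to roughly two of six copies.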

UNDER DEVELOPMENT

Installation

conda create -n mitts pysam pysamstats bowtie2 samtools

conda activate mitts

git clone https://github.com/stroehleina/mitts

Quick start

python mitts.py --help

usage: mitts.py [-h] [--version] --species SPECIES --fastq1 FASTQ1 --fastq2
                FASTQ2 [--threads THREADS] [--output OUTPUT] [--mincov MINCOV]

MITTS - Mutations In Twenty Three S

optional arguments:
  -h, --help            show this help message and exit
  --version, -v         show program's version number and exit
  --species SPECIES, -s SPECIES
                        Species that you want to investigate 23S mutations
                        for, choose from ['ENTEROCOCCUS', 'NEISSERIA']
                        (default: None)
  --fastq1 FASTQ1, -1 FASTQ1
                        R1 reads in FASTQ format (can be gzipped) (default:
                        None)
  --fastq2 FASTQ2, -2 FASTQ2
                        R2 reads in FASTQ format (can be gzipped) (default:
                        None)
  --threads THREADS, -t THREADS
                        Number of threads to use (default: 1)
  --output OUTPUT, -o OUTPUT
                        Define prefix for output bam file (default: None)
  --mincov MINCOV, -m MINCOV
                        Minimum coverage at mutation position to make a call
                        (default: 60)

Defining references

References are defined in the class Species in species.py as follows:

    'MYSPECIESNAME' : {
      'reference' : '/path/to/fasta/reference.fa', # Should only contain one sequence
      'chrom' : 'my_fasta_header',  # The header of the fasta sequence (without ">")
      'bowtie_ref' : '/path/to/bowtie/database/bt_myspecies',  # Path to bowtie-2 database files created with bowtie2-build -f reference.fa bt_myspecies
      'positions' : [2589, 2518], # List of positions (1-based) in the reference that you want to investigate for mutations
      'reference_nucls' : ['G', 'G'], # List of nucleotides in the reference at the respective positions
      'variant_nucls' : ['T', 'A'], # List of mutated nucleotides that you want to report
      'mutations' : ['G2576T', 'G2505A'], # Names of the mutations (e.g., in E. coli numbering, which may differ from the positions in your reference)
      'copy_number' : 6 # Number of 23S copies present in the genome
    }
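
Because the four lists in an entry are matched up by index, a quick consistency check before running mitts can catch off-by-one errors in the 1-based positions. The following is a sketch only: `validate_entry` is a hypothetical helper, not part of mitts, and it takes the reference sequence as a plain string rather than reading the FASTA file.

```python
def validate_entry(entry, sequence):
    """Return a list of problems found in a species entry (empty if none)."""
    problems = []
    keys = ('positions', 'reference_nucls', 'variant_nucls', 'mutations')
    lengths = {key: len(entry[key]) for key in keys}
    if len(set(lengths.values())) != 1:
        problems.append(f'list lengths differ: {lengths}')
        return problems
    for pos, ref in zip(entry['positions'], entry['reference_nucls']):
        if not 1 <= pos <= len(sequence):
            problems.append(f'position {pos} is outside the reference')
        elif sequence[pos - 1].upper() != ref.upper():
            # Positions are 1-based, so index pos - 1 in the string.
            problems.append(
                f'position {pos}: expected {ref}, found {sequence[pos - 1]}')
    if entry['copy_number'] < 1:
        problems.append('copy_number must be at least 1')
    return problems


# Toy sequence and entry, for illustration only.
toy_sequence = 'ACGTGGA'
toy_entry = {
    'positions': [5, 6],
    'reference_nucls': ['G', 'G'],
    'variant_nucls': ['T', 'A'],
    'mutations': ['G5T', 'G6A'],
    'copy_number': 6,
}
print(validate_entry(toy_entry, toy_sequence))
```

An empty list means the positions and reference nucleotides agree with the sequence.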

If you have a reference sequence and know the positions of the point mutations you are interested in, add an entry for them to this file and then call mitts with the --species flag, e.g. --species myspecies for an entry named 'MYSPECIES'. mitts automatically converts the name to uppercase, so Myspecies, myspecies, MySpecies and MYSPECIES all work.
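
The case-insensitive lookup described above can be sketched as follows. The dictionary and the function name are placeholders chosen for illustration. The actual structure of the Species class in species.py may differ.

```python
# Placeholder registry; real entries carry the full set of keys shown above.
SPECIES = {
    'MYSPECIES': {'copy_number': 6},
}


def lookup_species(name, species=SPECIES):
    """Return the entry for a species name, ignoring case (hypothetical helper)."""
    key = name.upper()
    if key not in species:
        raise KeyError(f'unknown species {name!r}, choose from {sorted(species)}')
    return species[key]


print(lookup_species('MySpecies')['copy_number'])
```

Since the name is converted to uppercase before the lookup, the key of a new entry should itself be written in uppercase.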

Choice of reference

When choosing a reference, make sure it contains a reasonably sized flanking region (depending on your read length and insert size) upstream and downstream of all positions of interest. Otherwise, fewer reads will map to those positions than are actually present, so your call becomes less accurate, or is skipped altogether if the coverage drops below --mincov (default: 60).
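
One way to check a candidate reference is to compare the distance from each position of interest to both ends of the sequence against a minimum flank. This is a sketch under assumptions: the helper is hypothetical, and the reference length of 2700 and the flank of 150 (roughly one read length) are made-up values that you should replace with your own.

```python
def positions_with_short_flanks(positions, reference_length, min_flank):
    """Return the 1-based positions whose flank on either side is too short."""
    short = []
    for pos in positions:
        upstream = pos - 1                    # bases before the position
        downstream = reference_length - pos   # bases after the position
        if min(upstream, downstream) < min_flank:
            short.append(pos)
    return short


# Positions from the example entry, with a hypothetical 2700 bp reference.
print(positions_with_short_flanks([2589, 2518], 2700, 150))
```

Any position returned here sits too close to an end of the reference, so extending the reference on that side is worth considering.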
