Guido

Guido is a Python package developed to search for gRNA targets in any reference genome or DNA sequence. It integrates MMEJ prediction and scoring, off-target search, and allows users to define their own data layers that can be used in the gRNA evaluation and ranking.

Installation

Install guido from PyPI and azimuth (a dependency) from GitHub using pip:

$ pip install guido
$ pip install git+https://github.com/Biomatters/Azimuth

Please note that guido requires bowtie v1.3.1 to be installed on your system. Please follow the instructions on the bowtie website to install it.

Additionally, guido requires tabix to be installed in order to search for off-targets. Please follow the instructions on the htslib website to install it.

Alternatively, you can install both bowtie and tabix by using conda:

$ conda install -c bioconda bowtie
$ conda install -c bioconda tabix
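
Before building a genome it can be useful to confirm that both external tools are reachable from Python. The following is a minimal sketch using only the standard library; the helper name missing_tools is hypothetical and not part of guido, and the check only looks for the executables on PATH (it does not verify that bowtie is v1.3.1).

```python
import shutil


def missing_tools(tools=("bowtie", "tabix")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("bowtie and tabix are both available")
```

If a tool is reported as missing after a conda install, the conda environment is probably not the one that is currently active.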

Usage

You can use guido to search for gRNAs in a reference genome or DNA sequence. The following example shows how to search for gRNAs in the malaria mosquito (Anopheles gambiae) reference genome (AgamP4):

Create a Genome instance

First we need to create a Genome instance. This instance will be used to search for gRNAs in the genome. The Genome class will link the genome sequence to the annotation file and will create a bowtie index for the genome. The Genome class takes the following arguments: genome_name (name of the genome), genome_file_abspath (FASTA sequence), and annotation_file_abspath (GTF annotation file).

>>> import guido

>>> genome = guido.Genome(genome_name='AgamP4',
                      genome_file_abspath='data/AgamP4.fa',
                      annotation_file_abspath='data/AgamP4.12.gtf')

Build the FASTA index and bowtie index files. This also creates an AgamP4.guido file in the same directory as the genome file (genome_file_abspath). The file contains the genome information and can be used to create a Genome instance next time without building the genome indices again.

>>> genome.build(n_threads=2)

The Genome class also has a bowtie_index_abspath argument that can be used to specify the directory and bowtie index name. If this argument is not specified, the bowtie index will be created when running genome.build().

To load a Genome instance from an AgamP4.guido file that was created using genome.build(), use the load_genome_from_file() function:

>>> import guido
>>> genome = guido.load_genome_from_file(guido_file='data/AgamP4.guido')
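
The build-once, load-afterwards workflow can be wrapped in a small helper. This is only a sketch: guido_file_for and load_or_build are hypothetical names, and the location of the .guido file (next to the FASTA file, named after genome_name) is an assumption inferred from the paths in the examples above. The guido calls themselves are the ones shown in this README.

```python
from pathlib import Path


def guido_file_for(genome_fasta, genome_name):
    """Guess where the .guido file lives (assumed: next to the FASTA file)."""
    return Path(genome_fasta).parent / f"{genome_name}.guido"


def load_or_build(genome_name, genome_fasta, annotation_gtf, n_threads=2):
    """Load an existing .guido file, or build the genome indices if it is absent."""
    # Imported here so the path helper above can be used without guido installed.
    import guido

    guido_file = guido_file_for(genome_fasta, genome_name)
    if guido_file.exists():
        return guido.load_genome_from_file(guido_file=str(guido_file))

    genome = guido.Genome(genome_name=genome_name,
                          genome_file_abspath=str(genome_fasta),
                          annotation_file_abspath=str(annotation_gtf))
    genome.build(n_threads=n_threads)
    return genome
```

With the example data, load_or_build('AgamP4', 'data/AgamP4.fa', 'data/AgamP4.12.gtf') would build the indices on the first call and reuse data/AgamP4.guido afterwards, provided the file location assumption holds.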

Create a Locus instance and search for gRNAs

Locus instances are used to search for gRNAs in a specific genomic region. The Locus class takes the following arguments: genome (a Genome instance), chromosome (chromosome name), start (start position), and end (end position). The start and end positions are 1-based and inclusive.
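
Because Python slices are 0-based and end-exclusive, 1-based inclusive coordinates need a small adjustment when you slice a sequence yourself. A minimal sketch of that conversion follows; the helper names are hypothetical and not part of guido.

```python
def to_python_slice(start, end):
    """Convert 1-based inclusive coordinates to a 0-based, end-exclusive slice."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return slice(start - 1, end)


def interval_length(start, end):
    """Number of bases covered by a 1-based inclusive interval."""
    return end - start + 1


chromosome = "ACGTACGTAC"  # toy sequence, not real genome data
print(chromosome[to_python_slice(3, 6)])     # GTAC
print(interval_length(48714541, 48714666))   # 126
```

The second call shows that the region used in the example below spans 126 bases.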

>>> import guido

>>> genome = guido.load_genome_from_file(guido_file='data/AgamP4.guido')
>>> loc = guido.locus_from_coordinates(genome, 'AgamP4_2R', 48714541, 48714666)

>>> loc.find_guides()
    ['gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|)',
     'gRNA-2(CGCAATACCACCCGTCAGAGTGG|AgamP4_2R:48714561-48714583|-|)',
     ...
     'gRNA-7(GTTTAACACAGGTCAAGCGGTGG|AgamP4_2R:48714637-48714659|-|)',
     'gRNA-8(TATGTTTAACACAGGTCAAGCGG|AgamP4_2R:48714640-48714662|-|)']
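
The output above mixes '+' and '-' strand guides. Judging from the listed sequences and coordinates, the sequence of a '-' guide appears to be written 5' to 3' on its own strand while the coordinates refer to the reference. The sketch below checks that reading on gRNA-1 and gRNA-2, which overlap at 48714561-48714572; it uses plain Python only, and reverse_complement is a hypothetical helper rather than a guido function.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


# (sequence, start, end) copied from the output above.
grna_1 = ("AAGTTTATCATCCACTCTGACGG", 48714550, 48714572)  # '+' strand
grna_2 = ("CGCAATACCACCCGTCAGAGTGG", 48714561, 48714583)  # '-' strand

# Bring the '-' strand guide onto the reference orientation.
grna_2_forward = reverse_complement(grna_2[0])

# Bases both guides cover: from the start of gRNA-2 to the end of gRNA-1.
overlap_from_1 = grna_1[0][grna_2[1] - grna_1[1]:]
overlap_from_2 = grna_2_forward[:grna_1[2] - grna_2[1] + 1]

print(overlap_from_1 == overlap_from_2)  # True
```

The overlapping bases agree only after gRNA-2 is reverse-complemented, which is consistent with the reading described above.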

Alternatively, you can use the locus_from_gene() function to create a Locus instance and limit the search for gRNAs to a specific feature that is defined in the genome annotation file:

>>> import guido
>>> loc = guido.locus_from_gene(genome, 'AGAP005958')

>>> loc.find_guides(selected_features='exon')

After running loc.find_guides(), the Locus instance will contain a list of gRNA instances that can be accessed using the loc.guides attribute.

>>> loc.guides

    ['gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|)',
     'gRNA-2(CGCAATACCACCCGTCAGAGTGG|AgamP4_2R:48714561-48714583|-|)',
     ...
     'gRNA-7(GTTTAACACAGGTCAAGCGGTGG|AgamP4_2R:48714637-48714659|-|)',
     'gRNA-8(TATGTTTAACACAGGTCAAGCGG|AgamP4_2R:48714640-48714662|-|)']

You can access a gRNA by its index or by its name:

>>> loc.guides[0]
    'gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|)'

>>> loc.guides['gRNA-1']
    'gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|)'
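
If you only have the printed form of a guide (for example in a log file), its fields can be recovered with a regular expression. This is a sketch under the assumption that the text follows the name(sequence|chromosome:start-end|strand|) layout shown above; GuideRecord and parse_guide are hypothetical names and not part of guido, which may expose these fields directly on the gRNA instance.

```python
import re
from typing import NamedTuple

GUIDE_PATTERN = re.compile(
    r"^(?P<name>[^(]+)\("
    r"(?P<sequence>[ACGTN]+)\|"
    r"(?P<chromosome>[^:|]+):(?P<start>\d+)-(?P<end>\d+)\|"
    r"(?P<strand>[+-])\|"
    r"(?P<extra>[^)]*)\)$"  # trailing field is empty in the examples above
)


class GuideRecord(NamedTuple):
    name: str
    sequence: str
    chromosome: str
    start: int
    end: int
    strand: str


def parse_guide(text):
    """Parse the printed form of a guide into a GuideRecord."""
    match = GUIDE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised guide string: {text!r}")
    return GuideRecord(
        name=match["name"],
        sequence=match["sequence"],
        chromosome=match["chromosome"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


record = parse_guide("gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|)")
print(record.chromosome, record.start, record.end, record.strand)
```
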

Docs

More extensive documentation can be found on Read the Docs.

Examples

You can find a Jupyter notebook with examples in the notebooks/ directory.

Cite as

Nace Kranjc, & Courty Thomas. (2023). nkran/guido: v0.1.3 (v0.1.3). Zenodo. https://doi.org/10.5281/zenodo.8056051

Developer setup

Install poetry:

$ pip install poetry

Create development environment:

$ cd guido
$ poetry install

Activate development environment:

$ poetry shell

Install pre-commit hooks:

$ pre-commit install

Run pre-commit checks (isort, black, blackdoc, flake8, ...) manually:

$ pre-commit run --all-files

Bump version, build and publish to PyPI:

$ poetry version prerelease
$ poetry build
$ poetry publish
