This project forked from cmdoret/acastellanii_legionella_infection

Analysis of genomic changes induced in Acanthamoeba castellanii C3 following infection by Legionella pneumophila

License: GNU General Public License v3.0

Python 83.88% R 16.12%

Genomic changes in A. castellanii during infection of amoeba by L. pneumophila

Background

This repository contains the analysis of Acanthamoeba castellanii infection by Legionella pneumophila. We investigate how the host genome is remodelled during infection by an intracellular bacterium. We use Hi-C to measure changes in 3D chromatin organisation and RNA-seq to measure changes in gene expression. The design comprises two biological replicates of uninfected A. castellanii (strain C3) and two infected replicates at 5 h post infection.

A frozen copy of this repository and its output data are available for download at the corresponding Zenodo record.

Dependencies

The pipeline is written using snakemake and has the following dependencies:

  • python >= 3.7
  • conda >= 4.8
  • snakemake >= 5.5
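The minimum versions above can be checked before launching the pipeline. The following is a minimal sketch, not part of the repository: the helper names (`parse_version`, `meets_minimum`, `installed_version`, `report`) are hypothetical, and it assumes each tool prints a dotted version number when called with `--version`.

```python
import re
import shutil
import subprocess

# Minimum versions listed in this README.
MINIMUMS = {"python": "3.7", "conda": "4.8", "snakemake": "5.5"}


def parse_version(text):
    """Extract the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found, minimum):
    """Return True if the version in `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def installed_version(tool):
    """Return the output of `<tool> --version`, or None if the tool is not on PATH."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return (result.stdout or result.stderr).strip()


def report():
    """Print one status line per dependency."""
    for tool, minimum in MINIMUMS.items():
        found = installed_version(tool)
        if found is None:
            status = "not found on PATH"
        elif meets_minimum(found, minimum):
            status = f"ok ({found})"
        else:
            status = f"too old ({found}, need >= {minimum})"
        print(f"{tool}: {status}")
```

Calling `report()` prints the status of each tool. On systems where the interpreter is only available as `python3`, the `python` entry will be reported as not found.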

Each rule is encapsulated in a conda environment where its dependencies are managed automatically. Fastq files containing the Hi-C and RNA-seq reads are also downloaded automatically from SRA. Input files (genomes, annotations, ...) are automatically downloaded from the corresponding Zenodo record.

Installation

You need to have a working conda installation on your machine and install snakemake (>=5.5) via pip or conda.

Usage

You can then run the pipeline with:

snakemake -j6 --use-conda

The pipeline fetches the required packages and data as it runs.

Configuration

Some metadata files are provided with the pipeline to help understand the design and modify parameters. The following files may be of interest:

  • samples.tsv: samples used in the analyses and associated information
  • units.tsv: sequencing libraries used in the pipeline, file paths for the reads and metadata
  • config.yaml: path to key files and general parameters to control results of the pipeline.
  • cluster_slurm.json: cluster resource requirements for running the pipeline on an HPC cluster with the SLURM scheduler. In that case, use the following command to run the pipeline instead:
    • snakemake --rerun-incomplete --use-conda --cluster-config cluster_slurm.json --cluster "sbatch -n {cluster.ntasks} -c {cluster.ncpus} --mem {cluster.mem} --qos {cluster.queue}" --jobs 30
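To illustrate how the `{cluster.…}` placeholders in the command above are filled in, here is a minimal sketch of the lookup. It assumes the usual layout of a Snakemake cluster configuration file, with a `__default__` entry that rule-specific entries override. The values and the rule name `align_hic` are hypothetical and are not taken from the repository's cluster_slurm.json.

```python
import json
from types import SimpleNamespace

# Hypothetical stand-in for cluster_slurm.json; the real values may differ.
EXAMPLE_CONFIG = json.loads("""
{
  "__default__": {"ntasks": 1, "ncpus": 1, "mem": "4G", "queue": "normal"},
  "align_hic": {"ncpus": 8, "mem": "32G"}
}
""")

# Submission template copied from the command in this README.
SBATCH_TEMPLATE = (
    "sbatch -n {cluster.ntasks} -c {cluster.ncpus} "
    "--mem {cluster.mem} --qos {cluster.queue}"
)


def resolve_cluster(config, rule):
    """Merge the defaults with the rule-specific overrides, if any."""
    merged = dict(config.get("__default__", {}))
    merged.update(config.get(rule, {}))
    return merged


def submit_command(config, rule, template=SBATCH_TEMPLATE):
    """Fill the template with the resources resolved for `rule`."""
    cluster = SimpleNamespace(**resolve_cluster(config, rule))
    return template.format(cluster=cluster)


print(submit_command(EXAMPLE_CONFIG, "align_hic"))
```

With these made-up values, a rule without its own entry falls back entirely on `__default__`, while `align_hic` overrides only the CPU count and memory.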

Pipeline

The pipeline is subdivided into submodules for the processing and downstream analysis of Hi-C and RNA-seq data. Starting from fastq files, it generates Hi-C matrices and differential expression results. It also computes statistics and detects patterns on the Hi-C contact maps to generate figures and tables, which are then used by tailored analyses in Jupyter notebooks.

[Figure: visual summary of pipeline steps and their dependencies]

For a more detailed visual summary showing input and output files, see the filegraph.
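As a rough illustration of how a graph of step dependencies determines the execution order, the sketch below sorts a small dependency graph topologically. The step names are hypothetical placeholders loosely based on the description above, not the actual rule names in the Snakefile, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each key maps to the set of steps it depends on.
STEP_DEPENDENCIES = {
    "hic_matrix": {"fastq"},
    "hic_patterns": {"hic_matrix"},
    "expression_counts": {"fastq"},
    "differential_expression": {"expression_counts"},
    "figures_and_tables": {"hic_patterns", "differential_expression"},
}


def execution_order(dependencies):
    """Return the steps in an order where each step follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


for step in execution_order(STEP_DEPENDENCIES):
    print(step)
```

Several valid orders exist for a graph like this one. Independent branches, such as the Hi-C and RNA-seq steps here, do not constrain each other, which is what allows a workflow manager to run them in parallel.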

Analyses

Analyses are described in jupyter notebooks located in the docs/notebooks folder. These notebooks are numbered to reflect the logical order in which analyses should be done. They should be executed in that order as some will generate files for the next notebook.

  • Notebook: Statistical exploration of chromatin loop changes
  • Notebook: Visual exploration of global contact changes during infection
  • Notebook: Analysis of interchromosomal contacts changes
  • Notebook: Detection and overview of chromatin insulation domains
  • Notebook: Analysis of the relationship between expression and contacts changes during infection
  • Notebook: Analysis of gene coexpression versus contact changes using lifted-over expression data from Li et al. 2020
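Because the notebooks must run in numeric order, a small helper can sort them by their numeric prefix and execute them one after another. This is a minimal sketch under assumptions: the function names are hypothetical, it assumes each file name starts with its number, and it assumes `jupyter nbconvert` is installed.

```python
import re
import subprocess
from pathlib import Path


def numeric_prefix(name):
    """Return the leading integer of a file name; unnumbered names sort last."""
    match = re.match(r"(\d+)", name)
    return int(match.group(1)) if match else float("inf")


def ordered_notebooks(names):
    """Sort names by numeric prefix, so that 2 comes before 10."""
    return sorted(names, key=lambda name: (numeric_prefix(name), name))


def run_all(folder="docs/notebooks"):
    """Execute every notebook in `folder` in place, stopping at the first failure."""
    names = [path.name for path in Path(folder).glob("*.ipynb")]
    for name in ordered_notebooks(names):
        subprocess.run(
            ["jupyter", "nbconvert", "--to", "notebook", "--execute",
             "--inplace", str(Path(folder) / name)],
            check=True,
        )
```

Sorting on the parsed integer rather than on the raw string avoids placing a notebook numbered 10 before one numbered 2. `check=True` stops the sequence as soon as one notebook fails, so later notebooks never run on missing inputs.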
