Repository with workflow for competitive read mapping and allele-specific expression counts

Description of the repository

The main directory contains four bash scripts, numbered 1 to 4. Run them in that order once the software listed under Requirements is installed.

Script 1

Downloads the raw reads directly from SRA with fasterq-dump, using the accession numbers stored in metadata-7938743-processed-ok.tsv, and renames the resulting files. It then filters and trims the reads with Trimmomatic.
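The download-and-trim step could look roughly like the sketch below. It is not the repository's script 1. The function names are hypothetical, and the sketch assumes that the metadata file has a header row with the accession in its first tab-separated column, that the reads are single-end, and that Trimmomatic is installed with a `trimmomatic` wrapper on the PATH. The thread count and trimming thresholds are placeholders.

```bash
# Sketch only: column layout, file names, and trimming settings are assumptions,
# not taken from the repository's script 1.

# Print run accessions from a TSV, assuming a header row and the accession in column 1.
list_accessions() {
    awk -F '\t' 'NR > 1 && $1 != "" { print $1 }' "$1"
}

# Download one run and quality-trim it, assuming single-end reads.
fetch_and_trim() {
    acc=$1
    outdir=${2:-reads}
    mkdir -p "$outdir"
    # For a single-end run, fasterq-dump writes <accession>.fastq into the output directory.
    fasterq-dump --outdir "$outdir" --threads 4 "$acc" || return 1
    trimmomatic SE -threads 4 \
        "$outdir/$acc.fastq" "$outdir/$acc.trimmed.fastq" \
        SLIDINGWINDOW:4:15 MINLEN:36
}
```

With those assumptions, every accession in the metadata file could be processed with `list_accessions metadata-7938743-processed-ok.tsv | while read -r acc; do fetch_and_trim "$acc"; done`.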

Script 2

Generates genome indices for STAR from the files stored in references.
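As a rough illustration, a STAR index for one reference could be built as sketched below. The function name and its arguments are hypothetical, and the file names inside references are not given here, so they are passed in as parameters. The `--sjdbOverhang 99` value assumes 100 bp reads (read length minus one) and should be adjusted to the actual data.

```bash
# Sketch only: input file names and the overhang value are assumptions.
build_star_index() {
    fasta=$1      # genome FASTA
    gtf=$2        # gene annotation in GTF format
    index_dir=$3  # directory that will hold the index
    mkdir -p "$index_dir"
    STAR --runMode genomeGenerate \
        --runThreadN 4 \
        --genomeDir "$index_dir" \
        --genomeFastaFiles "$fasta" \
        --sjdbGTFfile "$gtf" \
        --sjdbOverhang 99
}
```

The function would be called once per reference genome, each with its own index directory.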

Script 3

Maps reads with STAR and generates BAM files along with their indices.
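A minimal sketch of this step for one sample is shown below, assuming single-end reads and coordinate-sorted BAM output. The function name, arguments, and thread count are hypothetical. The BAM file name follows STAR's convention of appending `Aligned.sortedByCoord.out.bam` to the output prefix.

```bash
# Sketch only: single-end input and the thread count are assumptions.
map_reads() {
    index_dir=$1  # STAR index directory
    fastq=$2      # trimmed reads
    prefix=$3     # output prefix, e.g. mapped/sample1_
    mkdir -p "$(dirname "$prefix")"
    STAR --runThreadN 4 \
        --genomeDir "$index_dir" \
        --readFilesIn "$fastq" \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix "$prefix" || return 1
    # Index the sorted BAM so downstream tools can access it by region.
    samtools index "${prefix}Aligned.sortedByCoord.out.bam"
}
```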

Script 4

Counts reads for the parent-species alignments with featureCounts and, for hybrid data and allele-specific expression, with CompMap. It also generates tables of raw counts for each sample.
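For the featureCounts part only, the counting and table extraction could be sketched as follows. CompMap is left out because its command line is not described here. The function names and the thread count are hypothetical. The table step relies on the standard featureCounts output layout: one comment line, then a header whose first six columns are Geneid, Chr, Start, End, Strand, and Length, followed by one count column per BAM file.

```bash
# Sketch only: covers featureCounts, not CompMap.

# Count reads per gene for one or more BAM files.
count_reads() {
    gtf=$1  # gene annotation in GTF format
    out=$2  # featureCounts output file
    shift 2
    featureCounts -T 4 -a "$gtf" -o "$out" "$@"
}

# Reduce featureCounts output to gene IDs plus raw counts (column 1 and columns 7 onward).
counts_table() {
    grep -v '^#' "$1" | cut -f 1,7-
}
```

For example, `count_reads annotation.gtf counts.txt sample1.bam sample2.bam` followed by `counts_table counts.txt > raw_counts.tsv` would give a tab-separated raw count table; all of these file names are placeholders.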

Simulations

This directory contains Bash and R scripts that run RNA-seq data simulations for allele-specific expression using the R package polyester.

Counts

All the raw count tables are included here.

Analyses

This directory contains two subdirectories: scripts and tables.

scripts has further subdirectories with R scripts that generate summary tables from differential expression (and other) analyses. tables holds all the raw summary tables used for this work.

Running some of the R scripts will overwrite the files already in that directory, so run them with care.
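One way to guard against accidental overwrites is to copy the tables directory before running anything. The helper below is a hypothetical sketch, not part of the repository. It copies a directory to a timestamped sibling and prints the backup path.

```bash
# Sketch only: a hypothetical helper for backing up a directory before it is overwritten.
backup_dir() {
    src=${1%/}                                   # drop a trailing slash, if any
    dest="$src.bak.$(date +%Y%m%d-%H%M%S)"       # e.g. tables.bak.20240101-120000
    cp -R "$src" "$dest" && echo "$dest"
}
```

It would be called with the path to the tables subdirectory as its only argument.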

Extra files

briggsae.nigoni.1-to-1.orto.dnds_final.csv --> dN and dS estimates for each ortholog
c_briggsae_genes_conv_to_wormbase.csv --> CSV table of equivalent gene names for the current C. briggsae WormBase release (PRJNA10731)
orthologs.txt --> A list with C. briggsae and C. nigoni orthologs
samples.txt --> List of sample names used for renaming files

Requirements

Unix command-line programs

samtools
CompMap
GNU Parallel
STAR
BWA
trimmomatic
SRA Toolkit (sra-tools)
featureCounts
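Before running the scripts, it can help to confirm that the programs above are on the PATH. The helper below is a hypothetical sketch that prints the name of every command it cannot find. The executable names in the usage note are the usual ones for these tools, but they can differ between installations; in particular, the `trimmomatic` wrapper depends on how Trimmomatic was installed. CompMap is left out because its command name is not stated here.

```bash
# Sketch only: prints each requested command that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, `missing_tools samtools parallel STAR bwa trimmomatic fasterq-dump featureCounts` prints nothing when all of those commands are available.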

R packages

polyester
Glimma
edgeR
DESeq2
MASS
RColorBrewer
cowplot
dplyr
ggVennDiagram
ggforce
ggplot2
ggplotify
ggpointdensity
ggrepel
ggridges
ggstance
ggtext
grid
gridExtra
gtable
lemon
parallel
scales
tibble
tidyr
venneuler
viridis
