BOVINE-circQTL is a code repository for the pipeline of circQTL identification and downstream analysis in bovine muscle tissue.

License: GNU General Public License v3.0


BOVINE-circQTL (Breed-specific circQTL regulatory networks in bovine muscle tissue: Insights into circRNA biogenesis and multifunctional mechanisms)



Copyright (C) 2020-2023 Northwest A&F University, Mingzhi Liao, Xianyong Lan

Authors: Hongfei Liu

Contact: [email protected]




Overview of pipeline

  • 1.genotype_dataset_preprocess

    The raw genotype datasets comprise a VCF file from BGVD (Bovine Genome Variant Database) and two SNP-chip datasets (GSE95358 and GSE100038). This preprocessing step covers formatting, imputation, and remapping (from the btau 6 to the btau 9 genome assembly).

  • 2.RNA_seq_datasets_preprocess

    RNA-seq datasets were retrieved from the NCBI SRA database. This preprocessing step includes quality control and circRNA and mRNA identification.

  • 3.summary_statistics_genotype

    This step produces a summary statistics report of the genotype datasets and combines the genotype datasets from the various sources.

  • 4.summary_statistics_rnaseq

    CircRNA identification was performed with several tools, including CircMarker, CIRI2, and circRNAfinder; the detection results are merged in this step.

  • 5.matrixeqtl

    Both circQTLs and eQTLs were identified with MatrixEQTL in a base R environment. Enrichment analyses of functional genomic regions and phenotypes were also carried out.

  • 6.trans-circQTL

    A substantial number of trans-circQTLs were found in this study, so this step covers a preliminary investigation of the characteristics of trans-circQTLs.

  • 7.ABS_events

    First, a profile of ABS (alternative back-splicing) events was constructed to investigate the overall patterns and the divergence among breeds. Then, to explore how ABS event features (intron length and the number and pairing ability of SINE (short interspersed nuclear element)/Alu-like elements) are associated with flanking circQTLs mediated by cis-elements (SINEs), we extracted the flanking SINE elements closest to ABS-circRNAs.

  • 8.circRNA_functions

    CircRNA functions mainly include acting as miRNA sponges and interacting with RBPs in the cytoplasm, so we predicted the miRNAs and RBPs that potentially interact with circRNAs and used these predictions to construct circRNA-miRNA/RBP interaction networks. To evaluate the distribution and effect of circQTLs, we also examined the distribution of circQTLs within binding sites and then analyzed the changes in binding ability, enrichment degree, and secondary structure caused by altering the genotype of circQTLs within circRNAs.
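
The association test in step 5 and the cis/trans distinction in step 6 can be sketched as follows. This is a minimal pure-Python illustration of an additive linear model (circRNA expression regressed on genotype dosage 0/1/2), not MatrixEQTL's API and not the repository's code. The function names, the toy data, and the 1 Mb cis window are assumptions made for illustration.

```python
from math import sqrt

# Assumed cis distance threshold in bp (hypothetical; not taken from the repository).
CIS_WINDOW = 1_000_000


def additive_association(dosages, expression):
    """Regress expression on genotype dosage (0/1/2); return (slope, pearson_r)."""
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has zero variance")
    return sxy / sxx, sxy / sqrt(sxx * syy)


def classify_pair(snp_chrom, snp_pos, circ_chrom, circ_start, circ_end,
                  window=CIS_WINDOW):
    """Label a SNP-circRNA pair 'cis' if on the same chromosome within the window."""
    if snp_chrom != circ_chrom:
        return "trans"
    if circ_start <= snp_pos <= circ_end:
        distance = 0  # SNP lies inside the circRNA locus
    else:
        distance = min(abs(snp_pos - circ_start), abs(snp_pos - circ_end))
    return "cis" if distance <= window else "trans"


if __name__ == "__main__":
    # Toy data: six samples, dosage of the alternative allele and circRNA expression.
    dosages = [0, 0, 1, 1, 2, 2]
    expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
    slope, r = additive_association(dosages, expression)
    print(f"slope={slope:.3f} r={r:.3f}")
    print(classify_pair("chr1", 150_000, "chr1", 100_000, 120_000))
```

A real analysis would also adjust for covariates and correct for multiple testing, which this sketch omits.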

Config

  • btau9.yml: a YAML-formatted config file that tells CIRIquant where to find the required software and reference files
  • CM_config.ini: config file for CircMarker, which contains the paths of the reference genome FASTA file, the annotation file, and reads 1/2, plus other required or optional parameters. Of these, Reference, GTF, Reads1/2, and the options in the Parameter section are required.
  • software_list.txt: a list of the three circRNA detection tools
  • SRR_list.txt: SRA accession ID list for RNA-seq dataset
  • SRR_breeds.csv: SRA accession ID and corresponding breed name
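
As an illustration of how SRR_breeds.csv might be consumed, the sketch below groups SRA run accessions by breed. The column order (accession first, breed second), the absence of a header row, and the placeholder accessions and breed names are all assumptions; check them against the actual file before reuse.

```python
import csv
import io
from collections import defaultdict


def group_runs_by_breed(handle):
    """Map each breed to its SRA run accessions from a two-column CSV.

    Assumes rows of the form `accession,breed` with no header row.
    """
    groups = defaultdict(list)
    for row in csv.reader(handle):
        # Skip blank or malformed rows rather than failing.
        if len(row) < 2 or not row[0].strip():
            continue
        run, breed = row[0].strip(), row[1].strip()
        groups[breed].append(run)
    return dict(groups)


if __name__ == "__main__":
    # Placeholder content; real accessions and breed names live in SRR_breeds.csv.
    example = io.StringIO(
        "SRR0000001,BreedA\n"
        "SRR0000002,BreedB\n"
        "SRR0000003,BreedA\n"
    )
    for breed, runs in group_runs_by_breed(example).items():
        print(breed, runs)
```

With a real file, the same function can be called as `group_runs_by_breed(open("SRR_breeds.csv", newline=""))`.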

Required scripts

Python scripts

  • extract_exons.py: Part of HISAT2; extracts exons from a GTF annotation file.
  • extract_flanking_introns.py: Extract flanking introns closer to ABS-circRNAs
  • extract_splice_sites.py: Part of HISAT2; extracts splice sites from a GTF annotation file.
  • extract_random_seq_bed.py: Generate a custom number of random sequences from the bovine genome (ARS-UCD1.2 genome assembly) according to the specific normal distribution of sequence length.
  • filt.py: Extract the specific sequence set from a fasta file based on a sequence ID list.
  • getTPM.py: Extract the TPM quantity matrix at genes and transcripts levels from the output generated by stringtie -e.
  • get_seed_region.py: Get the seed region of miRNAs
  • summary_rna_seq.py: Merge circRNA detection results and then report the summary statistics.
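
To make the kind of task handled by filt.py concrete, here is a minimal sketch of selecting FASTA records by a list of sequence IDs. It is not the repository's implementation: the function names are hypothetical, and matching on the first whitespace-delimited token of the header is an assumption.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(handle, wanted_ids):
    """Return records whose ID (first token of the header) is in wanted_ids."""
    wanted = set(wanted_ids)
    selected = []
    for header, sequence in read_fasta(handle):
        tokens = header.split()
        seq_id = tokens[0] if tokens else ""
        if seq_id in wanted:
            selected.append((header, sequence))
    return selected


if __name__ == "__main__":
    fasta = io.StringIO(">seq1 example\nACGT\nACGT\n>seq2\nTTTT\n>seq3\nGGCC\n")
    for header, sequence in filter_fasta(fasta, ["seq1", "seq3"]):
        print(f">{header}\n{sequence}")
```

Records are returned in file order, so the output order does not depend on the order of the ID list.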

R scripts

  • annotation.r: merge the annotated circRNA list (circularRNA_known.txt) with the known circRNAs
  • combine.r: combine the BGVD genotype dataset with the two SNP-chip datasets
  • conversion.r: convert raw genome coordinates (0-based or 1-based) to 0-based
  • prep_quant.r: convert raw genome coordinates (0-based or 1-based) to 0-based for CIRIquant calibration
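
The coordinate conversion done by conversion.r and prep_quant.r can be illustrated with a short Python sketch (the repository scripts themselves are written in R). The description above says only "0-based"; the half-open, BED-style convention used here is an assumption, and the function name is hypothetical.

```python
def to_zero_based(start, end, one_based=True):
    """Return a 0-based, half-open (start, end) interval.

    A 1-based closed interval [start, end] becomes [start - 1, end);
    an interval that is already 0-based half-open is returned unchanged.
    """
    if one_based:
        if start < 1 or end < start:
            raise ValueError("invalid 1-based closed interval")
        return start - 1, end
    if start < 0 or end < start:
        raise ValueError("invalid 0-based half-open interval")
    return start, end


if __name__ == "__main__":
    # A 1-based closed interval covering 100 bases.
    print(to_zero_based(101, 200))
    # Already 0-based half-open: passed through unchanged.
    print(to_zero_based(100, 200, one_based=False))
```

Under this convention the interval length is simply `end - start`, which is why only the start coordinate shifts when converting from 1-based input.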
