toney823 / allhic_components

This project forked from sc-zhang/allhic_components

Introduction

Some components that speed up and reduce resource cost for original ALLHiC.

Dependencies

  • pysam
  • numpy
  • matplotlib
  • jcvi
  • h5py

Installation

git clone https://github.com/sc-zhang/ALLHiC_components.git
cd ALLHiC_components
chmod +x bin/*.*

# install ALLHiC_prune
cd src/
make && make install

Usage

ALLHiC_prune prunes Hi-C signals between allelic chromosomes. It was rewritten to run faster and use less memory.

************************************************************************
    Usage: ./ALLHiC_prune -i Allele.ctg.table -b sorted.bam
      -h : help and usage.
      -i : Allele.ctg.table
      -b : sorted.bam
************************************************************************
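When ALLHiC_prune is called from a Python pipeline, the usage line above can be assembled as an argument list. The sketch below is only an illustration: the helper names `build_prune_command` and `missing_inputs` are hypothetical, and the default executable path `./ALLHiC_prune` is simply copied from the usage line.

```python
import shlex
from pathlib import Path


def build_prune_command(allele_table, sorted_bam, executable="./ALLHiC_prune"):
    """Return the argv list matching the ALLHiC_prune usage line.

    Only the documented flags -i and -b are used.
    """
    return [str(executable), "-i", str(allele_table), "-b", str(sorted_bam)]


def missing_inputs(*paths):
    """Return the subset of paths that are not existing regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    cmd = build_prune_command("Allele.ctg.table", "sorted.bam")
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(cmd))
    absent = missing_inputs("Allele.ctg.table", "sorted.bam")
    if absent:
        print("missing inputs:", ", ".join(absent))
    # To actually run it, one could call subprocess.run(cmd, check=True).
```

Passing the command as a list avoids shell quoting problems when file names contain spaces.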

partition_gmap.py splits the BAM file and the contig-level FASTA by chromosome, using the allele table.

usage: partition_gmap.py [-h] -r REF -g ALLELETABLE [-b BAM] [-d WORKDIR]
                         [-t THREAD]

optional arguments:
  -h, --help            show this help message and exit
  -r REF, --ref REF     reference contig level assembly
  -g ALLELETABLE, --alleletable ALLELETABLE
                        Allele.gene.table
  -b BAM, --bam BAM     bam file, default: prunning.bam
  -d WORKDIR, --workdir WORKDIR
                        work directory, default: wrk_dir
  -t THREAD, --thread THREAD
                        threads, default: 10

ALLHiC_partition.py is an experimental script for clustering contigs into haplotypes.

usage: ALLHiC_partition.py [-h] -r REF -b BAM -d BED -a ANCHORS -p POLY
                           [-e EXCLUDE] [-o OUT]

optional arguments:
  -h, --help            show this help message and exit
  -r REF, --ref REF     Contig level assembly fasta
  -b BAM, --bam BAM     Prunned bam file
  -d BED, --bed BED     dup.bed
  -a ANCHORS, --anchors ANCHORS
                        anchors file with dup.mono.anchors
  -p POLY, --poly POLY  Ploid count of polyploid
  -e EXCLUDE, --exclude EXCLUDE
                        A list file contains exclude contigs for partition,
                        default=""
  -o OUT, --out OUT     Output directory, default=workdir

ALLHiC_rescue.py is a new version of rescue that uses jcvi to prevent collinear contigs from being rescued into the same group.

usage: ALLHiC_rescue.py [-h] -r REF -b BAM -c CLUSTER -n COUNTS -g GFF3 -j
                        JCVI [-e EXCLUDE] [-w WORKDIR]

optional arguments:
  -h, --help            show this help message and exit
  -r REF, --ref REF     Contig level assembly fasta
  -b BAM, --bam BAM     Unprunned bam
  -c CLUSTER, --cluster CLUSTER
                        Cluster file of contigs
  -n COUNTS, --counts COUNTS
                        count REs file
  -g GFF3, --gff3 GFF3  Gff3 file generated by gmap cds to contigs
  -j JCVI, --jcvi JCVI  CDS file for jcvi, bed file with same prefix must
                        exist in the same position
  -e EXCLUDE, --exclude EXCLUDE
                        cluster which need no rescue, default="", split by
                        comma
  -w WORKDIR, --workdir WORKDIR
                        Work directory, default=wrkdir
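The help text for -j says a bed file with the same prefix must sit next to the CDS file. A small pre-flight check can catch a missing file before a long run. This is a minimal sketch under one assumption: "same prefix" is read as "same path with the last extension replaced by .bed". The function names are hypothetical and are not part of ALLHiC_rescue.py.

```python
from pathlib import Path


def expected_bed_path(cds_path):
    """Return the BED path assumed to accompany the CDS file.

    Assumption: the prefix is the CDS path without its final extension.
    """
    return Path(cds_path).with_suffix(".bed")


def check_jcvi_inputs(cds_path):
    """Return a list of human-readable problems (empty if both files exist)."""
    cds = Path(cds_path)
    bed = expected_bed_path(cds)
    problems = []
    if not cds.is_file():
        problems.append(f"missing CDS file: {cds}")
    if not bed.is_file():
        problems.append(f"missing BED file: {bed}")
    return problems


if __name__ == "__main__":
    # "sample.cds" is a placeholder file name used only for illustration.
    for problem in check_jcvi_inputs("sample.cds"):
        print(problem)
```

If the script resolves the prefix differently, only `expected_bed_path` needs to change.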

ALLHiC_plot.py plots heatmaps of Hi-C signal. Compared with the original version, it uses less memory and makes it easier to plot heatmaps at other resolutions.

# Notice: bam file must be indexed
usage: ALLHiC_plot.py [-h] -b BAM -l LIST [-a AGP] [-5 H5] [-m MIN_SIZE] [-s SIZE] [-c CMAP] [-o OUTDIR] [--line | --block] [--linecolor LINECOLOR] [-t THREAD]

options:
  -h, --help            show this help message and exit
  -b BAM, --bam BAM     Input bam file
  -l LIST, --list LIST  Chromosome list, contain: ID Length
  -a AGP, --agp AGP     Input AGP file, if bam file is a contig-level mapping, agp file is required
  -5 H5, --h5 H5        h5 file of hic signal, optional, if not exist, it will be generate after reading hic signals, or it will be loaded for drawing other resolution of heatmap
  -m MIN_SIZE, --min_size MIN_SIZE
                        Minium bin size of heatmap, default=50k
  -s SIZE, --size SIZE  Bin size of heatmap, can be a list separated by comma, default=500k, notice: it must be n times of min_size (n is integer) or we will adjust it to nearest one
  -c CMAP, --cmap CMAP  CMAP for drawing heatmap, default="YlOrRd"
  -o OUTDIR, --outdir OUTDIR
                        Output directory, default=workdir
  --line                Draw dash line for each chromosome
  --block               Draw dash block for each chromosome
  --linecolor LINECOLOR
                        Color of dash line or dash block, default="grey"
  -t THREAD, --thread THREAD
                        Threads for reading bam, default=1
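The -s help text says each bin size must be an integer multiple of min_size and is otherwise adjusted to the nearest one. The sketch below shows what that adjustment could look like. It is not taken from ALLHiC_plot.py: the function names are hypothetical, the "m" and "g" suffixes and the rounding of exact ties upward are assumptions, and only the "k" suffix appears in the documented defaults.

```python
def parse_size(text):
    """Convert a size such as '50k', '1m' or '500000' to base pairs.

    Assumption: suffixes are case-insensitive decimal multipliers.
    """
    units = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def snap_to_multiple(size, min_size):
    """Round size to the nearest positive multiple of min_size.

    Integer arithmetic is used so that exact ties round upward.
    """
    n = max(1, (size + min_size // 2) // min_size)
    return n * min_size


def resolve_bin_sizes(sizes="500k", min_size="50k"):
    """Parse a comma-separated size list and snap each entry to min_size."""
    base = parse_size(min_size)
    return [snap_to_multiple(parse_size(s), base) for s in sizes.split(",")]


if __name__ == "__main__":
    # 120k is not a multiple of 50k, so it is adjusted in this sketch.
    print(resolve_bin_sizes("500k,120k,1m", "50k"))
```

Keeping every requested size a multiple of min_size is what allows signal binned once at min_size to be merged into coarser bins without reading the BAM file again.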

Other scripts are under development and are not recommended for use.
