MACIE (Multi-dimensional Annotation Class Integrative Estimation)

Description

Thank you for your interest in MACIE. MACIE is an unsupervised multivariate mixed-model framework for assessing the multi-dimensional functional impact of both coding and non-coding variants in the human genome. It integrates a variety of functional annotations, including protein function scores, evolutionary conservation scores, and epigenetic annotations from ENCODE and Roadmap Epigenomics, and estimates, for each genetic variant, the joint posterior probabilities of being functional.

Data Availability and Code Reproducibility

The MACIE scores (and other integrative scores) used in all benchmarking examples are available for download here. Precomputed MACIE scores for every possible variant in the human genome are available for download via Zenodo: Part 1 (Chr1 - Chr3), Part 2 (Chr4 - Chr7), Part 3 (Chr8 - Chr13), Part 4 (Chr14 - Chr22). The files are compressed with the bgzip utility and indexed with tabix, both of which are part of HTSlib, distributed by the Samtools project. Because the files are indexed, tabix can efficiently extract the subset of the data that falls within a given genomic region. For example, the command

tabix MACIE_hg19_noncoding_chr1.tab.bgz 1:20000-30000 > Subset.txt

extracts all variants on chromosome 1 from position 20,000 through 30,000 and writes them to the file Subset.txt. In this example, the tabix index file, MACIE_hg19_noncoding_chr1.tab.bgz.tbi, needs to be in the same directory as the main data file, MACIE_hg19_noncoding_chr1.tab.bgz. Samtools, including bgzip and tabix, is available here.
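
The same extraction can be wrapped in a small script that checks for the index file before calling tabix. This is a minimal sketch, not part of the MACIE distribution: the function names `region_string` and `extract_region` are hypothetical, and it assumes `tabix` is on the `PATH` and that the index sits next to the data file with a `.tbi` suffix, as described above.

```shell
#!/usr/bin/env bash
# Sketch: query a bgzip-compressed, tabix-indexed score file by region.
# region_string and extract_region are illustrative names, not MACIE tools.

# Build a tabix region string such as "1:20000-30000".
# tabix region strings use 1-based, inclusive coordinates.
region_string() {
    local chrom="$1" start="$2" end="$3"
    if (( start < 1 || end < start )); then
        echo "invalid range: ${start}-${end}" >&2
        return 1
    fi
    printf '%s:%d-%d\n' "$chrom" "$start" "$end"
}

# Extract one region into an output file, failing early when the
# .tbi index is not in the same directory as the data file.
extract_region() {
    local data_file="$1" chrom="$2" start="$3" end="$4" out_file="$5"
    if [[ ! -f "${data_file}.tbi" ]]; then
        echo "missing index: ${data_file}.tbi" >&2
        return 1
    fi
    local region
    region="$(region_string "$chrom" "$start" "$end")" || return 1
    tabix "$data_file" "$region" > "$out_file"
}
```

With these functions sourced, `extract_region MACIE_hg19_noncoding_chr1.tab.bgz 1 20000 30000 Subset.txt` runs the same query as the command above.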

The code used for training MACIE models is available here.

All genomic coordinates are given in NCBI Build 37/UCSC hg19.

Reference

Xihao Li*, Godwin Yung*, Hufeng Zhou, Ryan Sun, Zilin Li, Kangcheng Hou, Martin Jinye Zhang, Yaowu Liu, Theodore Arapoglou, Chen Wang, Iuliana Ionita-Laza, and Xihong Lin (2022). A multi-dimensional integrative scoring framework for predicting functional variants in the human genome. The American Journal of Human Genetics, 109(3), 446-456. PMID: 35216679. PMCID: PMC8948160. DOI: 10.1016/j.ajhg.2022.01.017.
