N2V-HC

N2V-HC: A novel method for disease module identification based on deep representation learning of multi-layer biological networks.

Introduction

N2V-HC implements a disease module identification method based on deep representation learning of multi-layer biological networks. The method first builds an integrated network from the human interactome and summary data from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies. Node features are then extracted from this network by deep representation learning. Hierarchical clustering with the dynamic tree cut method is applied to discover modules containing disease-related genes that are regulated by GWAS variants. The modules containing eGenes are then extracted as a sub-network and clustered again, iteratively, until the number of nodes in the network no longer changes.
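The iterative step described above can be sketched in Python as a fixed-point loop. This is only an illustration: `refine_until_stable` and `connected_components` are hypothetical names, and connected components are used as a stand-in clustering function, not the hierarchical clustering with dynamic tree cut that N2V-HC applies.

```python
from collections import defaultdict, deque


def connected_components(nodes, edges):
    """Stand-in clustering (an assumption): one module per connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, modules = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        module, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    module.add(v)
                    queue.append(v)
        modules.append(module)
    return modules


def refine_until_stable(nodes, edges, egenes, cluster, max_rounds=100):
    """Cluster, keep only modules containing an eGene, and repeat on the
    resulting sub-network until the number of nodes stops changing."""
    nodes, egenes = set(nodes), set(egenes)
    kept_modules = []
    for _ in range(max_rounds):
        # Restrict the edge list to the current sub-network.
        sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
        kept_modules = [m for m in cluster(nodes, sub_edges) if m & egenes]
        kept_nodes = set().union(*kept_modules) if kept_modules else set()
        if len(kept_nodes) == len(nodes):
            break  # node count unchanged: converged
        nodes = kept_nodes
    return kept_modules
```

Any function that takes a node set and an edge list and returns a list of node sets can be passed as `cluster`, so the stand-in could be swapped for a real hierarchical clustering routine.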

Input

'diseaseSNP'     Disease-SNP relationships; the columns, from left to right, are proxy SNP, independent SNP, and disease.
'eqtl.edgelist'  eQTL network, one eQTL per line; the columns, from left to right, are gene and SNP.
'gene.edgelist'  Human interactome network, two vertices per line, representing one edge.
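As a rough illustration of how these three inputs fit together, the sketch below parses them and merges them into one edge set. It is not the project's own code. It assumes whitespace-delimited columns in the order listed above, treats interactome edges as undirected, and leaves any SNP missing from 'diseaseSNP' unchanged; all function names are hypothetical.

```python
def read_disease_snp(lines):
    """Return {proxy SNP: independent SNP} and {independent SNP: set of diseases}."""
    proxy_to_inde, inde_to_disease = {}, {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        proxy, inde = fields[0], fields[1]
        disease = " ".join(fields[2:])
        proxy_to_inde[proxy] = inde
        inde_to_disease.setdefault(inde, set()).add(disease)
    return proxy_to_inde, inde_to_disease


def read_eqtl(lines, proxy_to_inde):
    """Return (gene, SNP) edges with each proxy SNP replaced by its independent SNP."""
    edges = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        gene, snp = fields[0], fields[1]
        # Assumption: SNPs absent from 'diseaseSNP' are kept as they are.
        edges.add((gene, proxy_to_inde.get(snp, snp)))
    return edges


def read_gene_edges(lines):
    """Return undirected gene-gene edges, one canonical tuple per pair."""
    edges = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        edges.add(tuple(sorted(fields[:2])))
    return edges


def integrate(gene_edges, eqtl_edges):
    """Merge both layers into one sorted edge list plus a name-to-ID map."""
    edges = sorted(gene_edges | eqtl_edges)
    names = sorted({node for edge in edges for node in edge})
    node_ids = {name: i for i, name in enumerate(names)}
    return edges, node_ids
```

Each reader accepts any iterable of lines, so an open file handle such as `open("eqtl.edgelist")` can be passed directly.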

Output

'eGene.nodeID2name'          Mapping between each eGene and its node ID.
'eqtl.edgelist.indeSNP'      eQTL edges after each proxy SNP has been converted to its corresponding independent SNP; the columns, from left to right, are gene and SNP.
'eqtl.nodeID2name'           Gene and SNP nodes in 'eqtl.edgelist.indeSNP' and their node IDs.
'network.edgelist.nodeName'  The integrated network; each row holds two node names and represents one edge.
'network.nodeID2name'        Mapping between node names and node IDs for 'network.edgelist.nodeName'.
'network.emb'                node2vec result file; the file name can be specified by the user.
'network.edgelist.nodeID'    The integrated network; each row holds two node IDs (as defined in 'network.nodeID2name') and represents one edge.
'network.pred_label'         Hierarchical clustering result file; each line holds one node ID and its module.
'result'                     Converged result files; the file name can be specified by the user. The columns, from left to right, are module label, module node count, module edge count, disease1_SNP__count, disease1_eGene_count, disease1_SNP, disease1_eGene, ..., other gene.
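For downstream analysis, two of these outputs could be loaded roughly as follows. This sketch assumes that 'network.emb' uses the word2vec text format (a header line with node count and dimension, then one node ID followed by its vector per line) and that 'network.pred_label' is whitespace-delimited with the node ID first and the module label second. Neither layout is confirmed here, and the function names are hypothetical.

```python
from collections import defaultdict


def parse_embedding(lines):
    """Parse an embedding in the assumed word2vec text format into {node ID: vector}."""
    it = iter(lines)
    header = next(it).split()
    n_nodes, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in it:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        vector = [float(x) for x in fields[1:]]
        if len(vector) != dim:
            raise ValueError("unexpected vector length for node %s" % fields[0])
        vectors[fields[0]] = vector
    if len(vectors) != n_nodes:
        raise ValueError("header announces %d nodes, found %d" % (n_nodes, len(vectors)))
    return vectors


def parse_pred_label(lines):
    """Group node IDs by module label (assumed column order: node ID, module)."""
    modules = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        modules[fields[1]].append(fields[0])
    return dict(modules)
```

Node IDs are kept as strings so they can be looked up in a mapping file such as 'network.nodeID2name' without conversion.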

Environment

python2
R 3.5.3

Demo

## Hierarchical Clustering in Dolphins social network
Rscript src/HierarchicalClustering.R -e dolphins/embedding -g dolphins/edgelist -o dolphins/pred_label -s 10 -r 2

## execution script
bash run.sh

Figure. The clustering result of N2V-HC on the Dolphins social network (Lusseau et al., 2003). (A) The topology of the original network, with colors representing the ground-truth communities. (B) The hierarchical clustering dendrogram constructed by N2V-HC, where each leaf node represents a member of the original network. The two predicted modules are colored in red and blue. Node '40', which is misclassified, is labeled in yellow.

Contact

If you need help, please contact [email protected] or [email protected].

Full paper

Full paper has been submitted to Frontiers in Genetics.
