Significant Network Interval Mining

License: GNU General Public License v3.0

SiNIMin

This repository contains the code for the Significant Network Interval Mining approach, SiNIMin for short, and its permutation-testing-based counterpart SiNIMin-WY. The methods are described in "Network-guided detection of candidate intervals that exhibit genetic heterogeneity" (under review).

Data formatting

We assume a data set of n samples with d binary features. Examples of all files can be found in the folder examples.

The method requires the following input:

  1. data file with d rows, corresponding to the features, and n columns, corresponding to the n samples. Important: the features are assumed to follow a natural ordering, such as genetic variants ordered by their position on the DNA. All values must be binary.
  2. label file with n rows, containing the binary phenotype of the n samples. Samples are assumed to be in the same order as in the data file.
  3. feature file with d rows, containing the names of the d features. Features are assumed to be in the same order as in the data file.
  4. edge file, where each row contains the names of the two nodes joined by an edge, tab-separated.
  5. mapping file, linking the features to the nodes in the network. Each row contains the name of a node followed by a whitespace-separated list of feature names.
  6. target FWER, the target family-wise error rate (default: 0.05).
  7. covariate file (optional) with n rows. Each row contains the index of the class of the corresponding sample. Samples are assumed to be in the same order as in the data file.
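
As a minimal sketch of the layout described above, the following Python script writes a toy set of input files and checks that their dimensions agree. The file names, the single-space delimiter in the data file, and the helper names write_toy_inputs and check_dimensions are assumptions made for illustration and are not part of SiNIMin; the files in the examples folder remain the reference for the exact format.

```python
import tempfile
from pathlib import Path


def write_toy_inputs(directory):
    """Write a toy data set with d = 4 features and n = 3 samples.

    All file names and the single-space delimiter are assumptions.
    """
    directory = Path(directory)
    files = {
        # d rows (features, in their natural order) x n columns (samples)
        "data.txt": ["0 1 0", "1 1 0", "0 0 1", "1 0 1"],
        # n rows: binary phenotype, same sample order as the data columns
        "labels.txt": ["1", "0", "1"],
        # d rows: feature names, same order as the data rows
        "features.txt": ["snp1", "snp2", "snp3", "snp4"],
        # one edge per row: the two adjacent node names, tab-separated
        "edges.txt": ["geneA\tgeneB"],
        # node name followed by the features mapped to that node
        "mapping.txt": ["geneA snp1 snp2", "geneB snp3 snp4"],
        # optional, n rows: class index of each sample
        "covariates.txt": ["0", "1", "0"],
    }
    for name, rows in files.items():
        (directory / name).write_text("\n".join(rows) + "\n")
    return {name: directory / name for name in files}


def check_dimensions(paths):
    """Return (d, n) after checking that all files agree on them."""
    def rows(name):
        text = paths[name].read_text()
        return [line for line in text.splitlines() if line.strip()]

    data = [row.split() for row in rows("data.txt")]
    d, n = len(data), len(data[0])
    assert all(len(row) == n for row in data), "ragged data matrix"
    assert all(v in {"0", "1"} for row in data for v in row), "non-binary value"
    assert len(rows("labels.txt")) == n, "label file must have n rows"
    assert len(rows("features.txt")) == d, "feature file must have d rows"
    if "covariates.txt" in paths:  # the covariate file is optional
        assert len(rows("covariates.txt")) == n, "covariate file must have n rows"
    # every feature named in the mapping file must exist in the feature file
    known = set(rows("features.txt"))
    for line in rows("mapping.txt"):
        node, *features = line.split()
        assert set(features) <= known, f"unknown feature mapped to {node}"
    return d, n


with tempfile.TemporaryDirectory() as tmp:
    toy_paths = write_toy_inputs(tmp)
    toy_shape = check_dimensions(toy_paths)
    print(toy_shape)
```

Running such a check before calling sinimin can catch a transposed data matrix or a feature file that is out of step with the data rows.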

Usage information

Compilation (manual)

Note that the package relies on the Eigen library, which must be available when the method is recompiled. OpenMP is used to parallelize permutation testing.

We provide a Makefile that may have to be adjusted for the compilation to work. You can compile the program using the following steps:

$ cd SiNIMin/C
$ make

If the compile step does not work, please try adjusting the compiler settings in the Makefile or use another compilation method.

Compilation (CMake)

Alternatively, the package can be compiled using CMake. For Mac OS X, we recommend installing the following packages using Homebrew:

$ brew install cmake gcc eigen

After cloning this repository, the following steps are required to compile the package:

$ cd SiNIMin/C
$ mkdir build
$ cd build
$ cmake -DCMAKE_CXX_COMPILER=g++-9 ../
$ make

Optionally, the compiler version can also be changed if a more recent compiler is present. Compiling the package with the Apple version of the clang compiler (which is sometimes confusingly also present as g++ in the system) currently does not work.

Once compiled, the package can optionally be installed by issuing

$ make install

from the build directory created above.

Installation using Homebrew (Mac OS X)

For Mac OS X, we recommend installing the package using the Homebrew package manager:

$ brew install --cc=gcc BorgwardtLab/mlcb/sinimin

Afterwards, the package can be used directly on the command line.

Example usage

Examples of how to run the methods SiNIMin and SiNIMin-WY can be found in examples/runs, with the corresponding data in examples/data. The executable for both methods is called sinimin and can be found in SiNIMin/compiled.

./sinimin \
  -i "${data_file}" \
  -l "${labels_file}" \
  -c "${covariate_file}" \
  -m "${mapping_file}" \
  -e "${edge_file}" \
  -s "${feature_file}" \
  -f 0.05 \
  -o "${output_prefix}"

There exist additional flags that can be set, namely:

  -d ${maxlen} \
  -n ${number_threads} \
  -p ${number_permutations} 

The -d flag sets the maximum length of the intervals to be tested. For example, if -d is set to 1, only interactions between single features are tested. The -p flag sets the number of permutations. If this flag is set, SiNIMin-WY is executed, i.e. Westfall-Young permutations are used to estimate family-wise error rates. The -n flag sets the number of threads; it only yields a speed-up for permutation testing, which sinimin parallelizes with OpenMP.
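
The flags described above can also be assembled programmatically, for instance when scripting many runs. The following Python sketch builds the argument list for one run using only the flags listed in this README. The function name build_sinimin_command and its parameter names are hypothetical, and the sketch assumes that optional flags can simply be omitted and that flag order does not matter.

```python
import shlex


def build_sinimin_command(data_file, labels_file, mapping_file, edge_file,
                          feature_file, output_prefix, fwer=0.05,
                          covariate_file=None, maxlen=None, threads=None,
                          permutations=None, executable="./sinimin"):
    """Assemble the argument list for one sinimin run.

    Only flags listed in this README are used; the Python parameter
    names are hypothetical.
    """
    cmd = [executable,
           "-i", data_file,
           "-l", labels_file,
           "-m", mapping_file,
           "-e", edge_file,
           "-s", feature_file,
           "-f", str(fwer),
           "-o", output_prefix]
    if covariate_file is not None:
        cmd += ["-c", covariate_file]
    if maxlen is not None:
        cmd += ["-d", str(maxlen)]
    if permutations is not None:
        # setting -p selects SiNIMin-WY (Westfall-Young permutations)
        cmd += ["-p", str(permutations)]
    if threads is not None:
        # only speeds up permutation testing
        cmd += ["-n", str(threads)]
    return cmd


cmd = build_sinimin_command("data.txt", "labels.txt", "mapping.txt",
                            "edges.txt", "features.txt", "out",
                            permutations=1000, threads=4)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the executable and the input files exist at the given paths.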

Help

If you have questions concerning SiNIMin or encounter problems when building the tool on your own system, please open an issue in the issue tracker. Please describe the issue in enough detail for us to be able to help you.

Contact

[email protected]
