rodriados / museqa

Multiple Sequence Aligner using hybrid parallel computing

License: GNU General Public License v3.0

Makefile 0.94% C++ 56.20% Cuda 31.32% Shell 3.43% Python 6.81% C 0.77% Perl 0.54%
cuda cpp hybrid-parallelism

museqa's Introduction

Hello, I am Rodrigo!

Welcome to my GitHub profile!
I am a software developer and languages enthusiast from Brazil.
I am always open to collaborating on new projects and innovative ideas.
Here you will find the projects I've been working on lately.

museqa's Issues

Improve installation process

Currently, the project cannot be easily installed on a new machine: the entrypoint depends entirely on relative paths to find the scripts it needs to run.

A basic Docker installation was provided in #25, but the container did not serve its purpose: it was neither easy to use nor able to run the project. It was therefore removed in 3adae10.

make install and make uninstall targets should be available for installing the project on, and uninstalling it from, the user's system.

Introduce middlewares on pipeline modules

Museqa's pipeline modularity is one of the fundamental pieces of its framework. As such, a module must be extensible enough to allow its functionality to be customized depending on each particular use case, without the need of creating a whole new module for it.

This issue suggests the creation of pipeline::middleware, which could be attached to a pipeline::module. Each middleware implements a run method that is responsible for calling a next method, which passes execution down the line to the next middleware or to the wrapped module.
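The chain described above could be sketched as follows. This is only an illustration under stated assumptions: the payload type, the wrapped class, and the tracer and worker helpers are hypothetical names, and the issue does not specify the real signatures of run or next.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace pipeline
{
    // Hypothetical payload passed along the chain (an assumption).
    using payload = std::vector<std::string>;

    // Continuation given to each middleware: calling it moves execution
    // to the next middleware or, at the end, to the wrapped module.
    using next = std::function<void(payload&)>;

    struct module
    {
        virtual ~module() = default;
        virtual void run(payload& data) = 0;
    };

    struct middleware
    {
        virtual ~middleware() = default;
        virtual void run(payload& data, const next& proceed) = 0;
    };

    // Wraps a module with middlewares; the first one attached runs outermost.
    class wrapped : public module
    {
        public:
            explicit wrapped(std::shared_ptr<module> target)
            :   m_target {std::move(target)}
            {}

            void attach(std::shared_ptr<middleware> mw)
            {
                m_chain.push_back(std::move(mw));
            }

            void run(payload& data) override
            {
                dispatch(0, data);
            }

        private:
            void dispatch(std::size_t index, payload& data)
            {
                if (index == m_chain.size()) {
                    m_target->run(data);
                    return;
                }

                m_chain[index]->run(data, [this, index](payload& d) {
                    dispatch(index + 1, d);
                });
            }

            std::shared_ptr<module> m_target;
            std::vector<std::shared_ptr<middleware>> m_chain;
    };
}

// Example middleware: records when it is entered and left.
struct tracer : pipeline::middleware
{
    explicit tracer(std::string name) : m_name {std::move(name)} {}

    void run(pipeline::payload& data, const pipeline::next& proceed) override
    {
        data.push_back(m_name + ":before");
        proceed(data);
        data.push_back(m_name + ":after");
    }

    std::string m_name;
};

// Example module: stands in for the real work of a pipeline stage.
struct worker : pipeline::module
{
    void run(pipeline::payload& data) override
    {
        data.push_back("module");
    }
};
```

With two tracers attached in the order a, b, running the wrapped module records a:before, b:before, module, b:after, a:after. A middleware that does not call proceed stops the chain, which is one way a customization could skip the wrapped module.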

msarun should take care of error handling

As issues were found when trying to use threads to take care of errors, msarun should become the piece responsible for handling errors.

This can be achieved by calling a shell script in each cluster node (via mpirun inside msarun) and passing messages between these scripts. Each script shall end its process as needed. Thus, #1 will also be resolved.

Install Docker and CI integrations

A Docker image with the required environment for the project should be created. This would make the project easier to use, ship, and install.

Also, once a Docker container is ready, continuous integration should be set up so that the project is tested automatically after each change.

Implement gap extension penalties in pairwise module

The pairwise alignment heuristic should treat gap extensions differently from gap openings. Biologically, a few long gaps are much more likely to occur than many short ones.

Thus, gap penalties should be smaller if an already existing gap is being extended, and higher when a new gap is being introduced into the alignment.
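A common way to express this is an affine gap model, in which a gap of length k costs one opening penalty plus k - 1 extension penalties. The sketch below computes a global alignment score with that model using three dynamic-programming tables (Gotoh's formulation). It is not the project's hybrid implementation; the function name, the match/mismatch scoring, and the default penalty values are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Global alignment score with affine gaps. Penalties are negative numbers:
// a gap of length k contributes gap_open + (k - 1) * gap_extend.
int affine_score(
    const std::string& a, const std::string& b
  , int match = 1, int mismatch = -1
  , int gap_open = -5, int gap_extend = -1
) {
    // "Minus infinity" that can still absorb a few additions without overflow.
    const int ninf = std::numeric_limits<int>::min() / 2;
    const std::size_t n = a.size(), m = b.size();

    using matrix = std::vector<std::vector<int>>;

    // M: alignment ends with a[i-1] paired with b[j-1].
    // X: alignment ends with a[i-1] paired with a gap.
    // Y: alignment ends with b[j-1] paired with a gap.
    matrix M(n + 1, std::vector<int>(m + 1, ninf));
    matrix X = M, Y = M;

    M[0][0] = 0;

    for (std::size_t i = 1; i <= n; ++i)
        X[i][0] = gap_open + static_cast<int>(i - 1) * gap_extend;

    for (std::size_t j = 1; j <= m; ++j)
        Y[0][j] = gap_open + static_cast<int>(j - 1) * gap_extend;

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int sub = (a[i - 1] == b[j - 1]) ? match : mismatch;

            M[i][j] = std::max({M[i-1][j-1], X[i-1][j-1], Y[i-1][j-1]}) + sub;

            // Opening a new gap costs gap_open; extending one costs gap_extend.
            X[i][j] = std::max(
                std::max(M[i-1][j], Y[i-1][j]) + gap_open
              , X[i-1][j] + gap_extend
            );
            Y[i][j] = std::max(
                std::max(M[i][j-1], X[i][j-1]) + gap_open
              , Y[i][j-1] + gap_extend
            );
        }
    }

    return std::max({M[n][m], X[n][m], Y[n][m]});
}
```

Setting gap_extend equal to gap_open reduces this to a linear gap penalty, which makes it easy to compare both behaviours on the same pair of sequences.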

Add new input sequence parsers

There are many file formats for storing biological sequences, such as DNA and RNA. Some formats for which parsers could be implemented are:

  • NBRF/PIR
  • EMBL/SwissProt
  • Clustal
  • GDE
  • GCG/MSF
  • RSF

Avoid creation of zombie processes

When running the project in a cluster, the creation of zombie processes has been observed. This must be avoided.

Probable causes for this problem:

  • Watchdog does not kill all processes when finalizing execution on error.
  • MPI does not finalize all processes when execution is over.

Avoid GPU memory allocation by master node in pairwise module

When executing the hybrid Needleman algorithm for the pairwise module, the cluster's master node tries to allocate memory and transfer data into a GPU. The cluster's master node, though, may not have access to a GPU and thus should not try interacting with one.

Introduce assert guards

There are plenty of cases where one must assert whether a given condition is met or not. For these cases, an assert statement might be useful to make sure everything is okay. This is currently being achieved by ugly and unreadable code blocks like the following:

#if defined(msa_compile_cython) && !defined(msa_compile_cuda)
    if(static_cast<unsigned>(offset) >= getSize())
        throw Exception("buffer offset out of range");
#endif

A much more elegant solution would be the introduction of exception guards, which assert that a condition holds and throw the exception otherwise.

msa::guard(static_cast<unsigned>(offset) < getSize(), "buffer offset out of range");
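A guard of this kind might look like the sketch below. It assumes the guard receives the condition that must hold and throws when it does not. The msa::exception type is a hypothetical stand-in for the project's Exception class, and checked_offset is an invented helper that only shows a call site.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>

namespace msa
{
    // Hypothetical stand-in for the project's Exception class.
    struct exception : std::runtime_error
    {
        using std::runtime_error::runtime_error;
    };

    // Throws an E built from args unless the given condition holds.
    template <typename E = exception, typename ...T>
    inline void guard(bool condition, T&&... args)
    {
        if (!condition)
            throw E(std::forward<T>(args)...);
    }
}

// Invented call site: validates an offset against a buffer size.
inline std::size_t checked_offset(std::ptrdiff_t offset, std::size_t size)
{
    msa::guard(offset >= 0, "buffer offset out of range");
    msa::guard(static_cast<std::size_t>(offset) < size, "buffer offset out of range");
    return static_cast<std::size_t>(offset);
}
```

The exception type can be overridden per call, for example msa::guard<std::out_of_range>(condition, "message"). The preprocessor conditions from the original snippet could be moved inside guard, so that call sites stay free of #if blocks.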

Improvements to the pair generation algorithm

Currently, the pair generation algorithm, performed as an initialization step of the pairwise module, is naive: every compute node, and also the master node, must generate all sequence pairs before aligning any of them, even if most of the generated pairs are not relevant to the node generating them.

This can be optimized by taking advantage of the OEIS sequence A002024. Thus, each node will be able to generate only the pairs it needs for its pairwise module's execution.
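One possible reading of this suggestion: if the pairs (i, j) with i > j are enumerated row by row, the row of the k-th pair (counting from 1) is exactly A002024(k) = floor(1/2 + sqrt(2k)), so any pair can be computed directly from its index. The sketch below assumes that enumeration and a round-robin split of indices between nodes. The function names and the distribution scheme are assumptions, not the project's actual design.

```cpp
#include <cmath>
#include <cstdint>
#include <utility>
#include <vector>

using seqpair = std::pair<std::uint64_t, std::uint64_t>;

// OEIS A002024 for n >= 1: "n appears n times" (1, 2, 2, 3, 3, 3, ...).
inline std::uint64_t a002024(std::uint64_t n)
{
    auto r = static_cast<std::uint64_t>(
        std::floor(0.5 + std::sqrt(2.0 * static_cast<double>(n)))
    );

    // Correct possible floating-point rounding for large n, so that
    // r(r-1)/2 < n <= r(r+1)/2 holds exactly.
    while (r * (r - 1) / 2 >= n) --r;
    while (r * (r + 1) / 2 < n) ++r;

    return r;
}

// Maps a zero-based pair index k to the pair (i, j), with i > j >= 0.
inline seqpair pair_at(std::uint64_t k)
{
    const std::uint64_t i = a002024(k + 1);
    const std::uint64_t j = k - i * (i - 1) / 2;
    return {i, j};
}

// Pairs assigned to one node under an assumed round-robin distribution.
inline std::vector<seqpair> pairs_for(
    std::uint64_t rank, std::uint64_t nodes, std::uint64_t sequences
) {
    std::vector<seqpair> result;
    const std::uint64_t total = sequences * (sequences - 1) / 2;

    for (std::uint64_t k = rank; k < total; k += nodes)
        result.push_back(pair_at(k));

    return result;
}
```

With 4 sequences and 2 nodes, the node with rank 1 would generate only the pairs at indices 1, 3 and 5, that is (2, 0), (3, 0) and (3, 2), without ever materializing the other three.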

Implement unit tests for the phylogeny module

Unit tests must be implemented for the phylogeny module. At the moment, unit tests are only available for one heuristic step, namely the pairwise module.

It is important to implement unit tests for phylogeny as well, in order to be sure the module is still behaving as expected through future code base changes.

Standardization of Reflection using Loophole and Reflector

When the reflected object contains an array, the Reflection module produces different results depending on whether the Loophole or the Reflector back-end is used.

struct Object
{
    int v[3];
};

To access the second element in v, the Loophole back-end requires ReflectionTuple<Object>::get<1>(), whereas the Reflector back-end requires ReflectionTuple<Object>::get<0>()[1].

The first interface is preferred.

Allow multiple alignments per GPU block in hybrid-needleman algorithm

Currently, when executing the pairwise module with the hybrid-needleman algorithm, a GPU block can only process a single alignment per kernel call. This limitation is inefficient, and its cost grows with the number of sequences.

For each kernel call, a large amount of memory is allocated on the device and, subsequently, a large amount of data is transferred between device and host memories. To counter this inefficiency, blocks should be allowed to process more than one alignment pair per kernel call whenever enough device memory is available.

Improve time measurements

In addition to timing each heuristic stage as a whole, it would be useful to report execution times broken down by category, such as time spent in communication I/O and in GPU processing, alongside the whole execution time.
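A small accumulator keyed by category would be enough to collect such a breakdown. The sketch below is a minimal illustration built on std::chrono::steady_clock. The stopwatch class and the category names in the usage note are hypothetical, not part of the project.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulates wall-clock time per named category.
class stopwatch
{
    public:
        using clock = std::chrono::steady_clock;
        using duration = std::chrono::duration<double>;

        // Runs the given callable and adds its elapsed time to a category.
        // Calls may be nested, e.g. a "total" measurement around the others.
        template <typename F>
        void measure(const std::string& category, F&& work)
        {
            const auto start = clock::now();
            std::forward<F>(work)();
            m_elapsed[category] += clock::now() - start;
        }

        // Seconds accumulated for a category, or zero if it was never measured.
        double seconds(const std::string& category) const
        {
            const auto it = m_elapsed.find(category);
            return it != m_elapsed.end() ? it->second.count() : 0.0;
        }

    private:
        std::map<std::string, duration> m_elapsed;
};
```

A stage could then wrap its whole body in watch.measure("total", ...) and its transfers and kernel launches in nested calls such as watch.measure("communication", ...) and watch.measure("gpu", ...), reporting each category with watch.seconds(...) at the end. For GPU work, the measured callable would need to synchronize with the device before returning, otherwise only the launch overhead is timed.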

Adjust guide-tree rerooting when using phylogeny's njoining algorithm

The phylogeny module's algorithm njoining is not currently performing the correct logic at its final step. The current behavior chooses to execute the exact same node-joining logic until only one node is left at the star tree. This final node is, consequently, the guide-tree's root node.

The correct behavior, though, is that this aforementioned logic halts when 3 nodes are left in the star tree and a tree rerooting algorithm is run to find the best guide-tree.

Hybrid needleman wrong results

A bug has been found in the hybrid needleman algorithm: whenever the algorithm tries to reuse the last calculated column, the first thread (and only the first) will get an invalid value from global memory.

The cause for this bug is yet to be determined.

Sequential neighbor-joining algorithm crashes when running in parallel

When running the phylogeny module with a distributed sequential version of the neighbor-joining algorithm, a crash happens when the number of sequences to align is too low.

The thrown exception's message is related to a buffer offset out of range, which is a clue that a slave node is trying to do work when there's none assigned to it.

A possible fix is to limit the number of workers depending on the number of nodes in the algorithm's star-tree.

Improve const-qualifier consistency and enforcement

There are many const-qualifier inconsistencies throughout the project's classes and files. For instance, the following code should not be valid:

const Buffer<int> buffer {10};
for(size_t i = 0; i < buffer.getSize(); ++i)
    buffer[i] = i;

If buffer is const-qualified, then one should not be able to change buffer's contents after its initialization. In this case, buffer is only usable when copy- or move-constructed.
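The usual way to enforce this is to overload element access on constness, so that a const object only hands out read-only references. The sketch below uses a simplified, hypothetical Buffer that owns its storage through std::unique_ptr. The project's real Buffer may manage memory differently.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

template <typename T>
class Buffer
{
    public:
        // Allocates a value-initialized array with the given size.
        explicit Buffer(std::size_t size)
        :   m_data {new T[size]()}
        ,   m_size {size}
        {}

        // Non-const access returns a mutable reference.
        T& operator[](std::size_t offset)
        {
            return m_data[offset];
        }

        // Const access returns a read-only reference, so writing through
        // a const Buffer is rejected at compile time.
        const T& operator[](std::size_t offset) const
        {
            return m_data[offset];
        }

        std::size_t getSize() const
        {
            return m_size;
        }

    private:
        std::unique_ptr<T[]> m_data;
        std::size_t m_size;
};

// Compile-time check: elements of a const Buffer<int> cannot be assigned to.
static_assert(
    !std::is_assignable<
        decltype(std::declval<const Buffer<int>&>()[0]), int
    >::value
  , "a const Buffer must not allow element assignment"
);
```

With this overload pair, the loop from the example above fails to compile for a const Buffer<int>, while the same loop over a non-const buffer keeps working.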

Hybrid pairwise using shared-memory Scoring Table

In the hybrid version of the Needleman-Wunsch algorithm used by the pairwise module, in src/needleman/hybrid.cu, keeping the scoring table in shared memory may yield a significant performance increase.

Compress FastaSequence outside Pairwise module

The Fasta sequences should be compressed before being sent to slave nodes and processing modules such as pairwise.

This will decrease the pairwise module's execution time and total memory consumption in all nodes.
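As an illustration of the kind of compression that could happen before sequences are distributed, the sketch below packs a nucleotide sequence into 2 bits per symbol. This is an assumption made for the example only: it handles just the symbols A, C, G and T, whereas the project's actual encoding, alphabet and type names are not described in this issue.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical packed representation: four symbols per byte.
struct packed_sequence
{
    std::vector<std::uint8_t> bytes;
    std::size_t length = 0;
};

// Maps a nucleotide to its 2-bit code (assumed alphabet: A, C, G, T).
inline std::uint8_t encode(char symbol)
{
    switch (symbol) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  throw std::invalid_argument("unsupported symbol");
    }
}

// Packs a sequence so it can be sent to other nodes in a quarter of the size.
inline packed_sequence compress(const std::string& sequence)
{
    packed_sequence out;
    out.length = sequence.size();
    out.bytes.assign((sequence.size() + 3) / 4, 0);

    for (std::size_t i = 0; i < sequence.size(); ++i) {
        const unsigned shift = 2 * (i % 4);
        out.bytes[i / 4] |= static_cast<std::uint8_t>(encode(sequence[i]) << shift);
    }

    return out;
}

// Restores the original text from its packed form.
inline std::string decompress(const packed_sequence& in)
{
    static const char symbols[] = {'A', 'C', 'G', 'T'};
    std::string out(in.length, 'A');

    for (std::size_t i = 0; i < in.length; ++i)
        out[i] = symbols[(in.bytes[i / 4] >> (2 * (i % 4))) & 3];

    return out;
}
```

Doing this once, right after parsing, means every node receives and stores the packed form, instead of each module compressing the same sequences again.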

Implement subprograms and routines

In addition to #26, the project's interface should be upgraded to offer subprograms and utilities via the command line, ultimately allowing the user to manage everything they can without directly editing files or modifying code.

Subprogram examples:

  • museqa host find: runs the hostfinder and updates the user's .hostfile automatically.
  • museqa host add <host>: adds a host to the user's .hostfile.
  • museqa [options] run <file>: runs with the given mpirun options.
