
UpPipe is an RNA abundance quantification design for a real processing-near-memory system (UPMEM DPU). The paper describing this project was published at the Design Automation Conference (DAC) 2023.

Home Page: https://doi.org/10.1109/DAC56929.2023.10247915



UpPipe


Citation

Liang-Chi Chen, Chien-Chung Ho, and Yuan-Hao Chang, “UpPipe: A Novel Pipeline Management on In-Memory Processors for RNA-seq Quantification,” ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, July 9-13, 2023.

@inproceedings{chen2023uppipe,
  title={UpPipe: A Novel Pipeline Management on In-Memory Processors for RNA-seq Quantification},
  author={Chen, Liang-Chi and Ho, Chien-Chung and Chang, Yuan-Hao},
  booktitle={2023 60th ACM/IEEE Design Automation Conference (DAC)},
  pages={1--6},
  year={2023},
  organization={IEEE}
}

Materials

Hardware/System Prerequisites

The project must be run on a system equipped with UPMEM DRAM Processing Units (DPUs), and the UPMEM SDK must be installed on that system.

Start

git clone https://github.com/chi-0828/UpPipe.git
cd UpPipe
chmod +x build.sh
./build.sh
make -j4

Usage

Allocate transcriptome to DPU(s)

  • KMER SIZE must be an odd number from 3 to 31 (3, 5, ..., 31)
  • We suggest keeping NUMBER OF DPU(s) in a PIPELINE WORKER below 64
./UpPipe build \
            -k KMER SIZE  \
            -i OUTPUT INDEX FILE PATH \
            -d NUMBER OF DPU(s) in a PIPELINE WORKER \
            -f TRANSCRIPTOME FILE PATH
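
The two constraints listed above can be checked before invoking `./UpPipe build`. The following is a minimal sketch of such a pre-flight check; the function name `check_build_args` is hypothetical and not part of UpPipe, and the lower bound of one DPU per worker is an assumption added for completeness.

```python
from typing import List


def check_build_args(kmer_size: int, dpus_per_worker: int) -> List[str]:
    """Return a list of problems; an empty list means the values look usable.

    Hypothetical helper, not part of UpPipe. It only encodes the two rules
    stated in this README for the `build` step.
    """
    problems = []
    # README rule: k-mer size is one of 3, 5, ..., 31 (odd, within range).
    if not (3 <= kmer_size <= 31 and kmer_size % 2 == 1):
        problems.append(f"k-mer size {kmer_size} is not an odd number in 3..31")
    # Assumption: a pipeline worker needs at least one DPU.
    if dpus_per_worker < 1:
        problems.append("need at least one DPU per pipeline worker")
    # README suggestion: fewer than 64 DPUs per pipeline worker.
    elif dpus_per_worker >= 64:
        problems.append(
            f"{dpus_per_worker} DPUs per worker is not below the suggested 64"
        )
    return problems


if __name__ == "__main__":
    print(check_build_args(11, 60))  # values used in the Test section
    print(check_build_args(12, 64))  # even k and too many DPUs
```

Note that the DPU limit is a suggestion in this README rather than a hard requirement, so a caller may prefer to treat that message as a warning.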

Run alignment step for quantification

  • The k-mer size is already fixed in INPUT INDEX FILE and cannot be changed in this step
./UpPipe alignment \
            -i INPUT INDEX FILE PATH \
            -r NUMBER OF PIPELINE WORKER(s) \
            -f INPUT RNA READ FILE PATH

Parameters setting (dpu_app/dpu_def.h)

  • A KMER SIZE less than 7 may lead to inaccurate mapping results
  • NUMBER OF DPU(s) in a PIPELINE WORKER should be less than 64 for optimal performance
  • The number of transcripts divided by NUMBER OF DPU(s) in a PIPELINE WORKER must be less than 200 (COUNT_LEN in dpu_app/dpu_def.h)
  • Set READ_LEN to the sequence length of the RNA reads
  • Set WRAM_READ_LEN to a number that is larger than READ_LEN and divisible by 8
  • WRAM_PREFETCH_SIZE is the size used for WRAM prefetching; 16 is the optimal size in most situations
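
The arithmetic behind two of these settings can be sketched as follows. This is an illustrative helper rather than code from the repository: the function names are hypothetical, choosing the smallest valid WRAM_READ_LEN is only one possible choice, and the use of a ceiling when dividing transcripts across DPUs is an assumption (the README does not say how an uneven split is handled).

```python
import math

COUNT_LEN = 200  # limit stated in this README (COUNT_LEN in dpu_app/dpu_def.h)


def wram_read_len(read_len):
    """Smallest multiple of 8 that is strictly larger than read_len."""
    return (read_len // 8 + 1) * 8


def transcripts_fit(num_transcripts, dpus_per_worker, count_len=COUNT_LEN):
    """Check that transcripts per DPU stays below count_len.

    Assumption: transcripts are spread evenly, so the fullest DPU holds
    ceil(num_transcripts / dpus_per_worker) of them.
    """
    return math.ceil(num_transcripts / dpus_per_worker) < count_len


if __name__ == "__main__":
    print(wram_read_len(100))          # 104
    print(wram_read_len(96))           # 104, because 96 itself is not larger
    print(transcripts_fit(11000, 60))  # True: ceil(183.33) = 184 < 200
    print(transcripts_fit(12000, 60))  # False: 200 is not less than 200
```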

Test

  • Build the index file with 11-mers and allocate it to 60 DPUs
./UpPipe build \
            -k 11  \
            -i test/test.idx \
            -d 60 \
            -f test/tran.fa
  • Run alignment with 40 pipeline workers
./UpPipe alignment \
            -i test/test.idx \
            -r 40 \
            -f test/read.fa
  • Performance: UpPipe with 40 pipeline workers
real    0m2.747s
  • Performance: UpPipe with 20 pipeline workers
real    0m3.584s
real    0m4.003s
  • Note that UpPipe's efficiency is more apparent on large datasets because of its processing-in-memory design
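
To compare the timings above, the `real` lines can be converted to seconds and divided. The sketch below uses the 40-worker figure and the first figure listed under 20 workers; `parse_real` is a hypothetical helper that assumes the `real <minutes>m<seconds>s` format shown in this section.

```python
import re


def parse_real(line):
    """Convert a line such as 'real    0m2.747s' to seconds."""
    m = re.fullmatch(r"real\s+(\d+)m([\d.]+)s", line.strip())
    if m is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


t40 = parse_real("real    0m2.747s")  # 40 pipeline workers
t20 = parse_real("real    0m3.584s")  # 20 pipeline workers (first listed figure)
speedup = t20 / t40
print(f"{speedup:.2f}x")  # prints 1.30x
```

On this small test set, doubling the number of pipeline workers gives roughly a 1.3x reduction in wall-clock time, which fits the remark that the benefit shows more clearly on larger inputs.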
