
Slorado

Slorado is a simplified version of Dorado built on top of the S/BLOW5 format, with reduced dependencies so that it can be (relatively) easily compiled. A minimum g++ version of 5.4 is required. Currently, slorado only supports Linux on x86_64 or ARM64 Jetson-based devices.

Slorado is mainly for research and educational purposes, and performance is currently not the key goal. Slorado will only support a minimal set of features and may not be up to date with Dorado. A feature-rich, fast, and up-to-date version of Dorado that supports S/BLOW5 (called slow5-dorado) can be found here.

Compilation and Running

1. Dependencies

sudo apt-get install zlib1g-dev   #install zlib development libraries
git clone --recursive https://github.com/BonsonW/slorado
cd slorado

The commands to install zlib development libraries on some popular distributions:

On Debian/Ubuntu: sudo apt-get install zlib1g-dev
On Fedora/CentOS: sudo dnf/yum install zlib-devel
On OS X: brew install zlib
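
If you want to confirm that the zlib development files are visible to the compiler before building, a minimal check (not part of the build itself) is to compile and link a trivial program against zlib:

printf '#include <zlib.h>\nint main(void){return 0;}\n' | g++ -x c++ - -lz -o /dev/null && echo "zlib development files found"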

2. Downloading Models

Download the fast, high accuracy, and super accuracy simplex basecalling models ([email protected], [email protected] and [email protected]). We have tested slorado only on these models.

scripts/download-models.sh
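
The basecaller examples below assume the downloaded models end up under a models/ directory; assuming the script uses that layout, you can confirm the download with:

ls models/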

3. Make

Building for x86_64 architecture

Option 1: CUDA GPU version that uses ONT's closed-source koi library binaries (CUDA >= 11.3 required). This is the fastest option:
scripts/install-torch12.sh
make cuda=1 koi=1 -j

If you do not have CUDA 11.3 or higher installed system-wide, you can install CUDA 11.3 using the following commands:

wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda_11.3.0_465.19.01_linux.run
chmod +x cuda_11.3.0_465.19.01_linux.run
./cuda_11.3.0_465.19.01_linux.run --toolkit --toolkitpath=/local/path/cuda/

Then compile slorado by passing the custom CUDA location through the CUDA_ROOT variable:

make cuda=1 koi=1 -j CUDA_ROOT=/local/path/cuda/
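
If you are unsure which CUDA toolkit version the build will pick up, you can check the toolkit at the custom location (a quick check, assuming the installer path used above):

/local/path/cuda/bin/nvcc --version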

Option 2: CUDA GPU version without the closed-source koi library (CUDA >= 10.2 is adequate). This uses the CPU decoder and is therefore considerably slower:
scripts/install-torch12.sh
make cuda=1 -j
./slorado basecaller models/[email protected] test/oneread_r10.blow5

Option 3: CPU-only version (horribly slow):
scripts/install-torch12.sh
make -j
./slorado basecaller -x cpu models/[email protected] test/oneread_r10.blow5

Building for ARM64 Jetson-based devices

  1. Install and activate python venv.

    sudo apt install python3.8-venv
    python3 -m venv pytorch_venv
    source pytorch_venv/bin/activate
    
  2. Update pip and install PyTorch for your specific NVIDIA JetPack version. You can find your JetPack version by running sudo apt-cache show nvidia-jetpack | grep "Version", or browse https://developer.download.nvidia.com/compute/redist/jp/ to find a suitable PyTorch build. We tested on a Jetson Xavier board with JetPack 5.0 installed, and the commands used were as follows (an optional sanity check for the installed wheel appears after step 3):

    pip3 install --upgrade pip
    pip3 install --no-cache  https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+8a1a93a9.nv22.5-cp38-cp38-linux_aarch64.whl
    
  3. Clone and build.

    git clone --recursive https://github.com/BonsonW/slorado.git
    cd slorado
    make -j cuda=1 jetson=1 cxx11_abi=1 LIBTORCH_DIR=/path/to/pytorch_venv/lib64/python3.8/site-packages/torch/
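
  Optionally, you can verify that the PyTorch wheel installed in step 2 can see the Jetson GPU before or after building. This is a quick sanity check run inside the activated venv, not a required step:

    python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"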
    

Advanced Options

  • Custom libtorch path:

    make cuda=1 LIBTORCH_DIR=/path/to/libtorch
    
  • C++11 ABI:

    make cxx11_abi=1
    
  • You can optionally enable zstd support for the built-in slow5lib when building slorado by invoking make zstd=1 (an example combining this with other build flags is shown below). This requires the zstd 1.3 development libraries to be installed on your system (libzstd1-dev package for apt, libzstd-devel for yum/dnf and zstd for Homebrew).
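
    For example, a sketch combining zstd with the Option 1 build flags from above (adjust to whichever configuration you actually use):

    make cuda=1 koi=1 zstd=1 -j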

4. Running, options and testing

./slorado basecaller models/[email protected] test/oneread_r10.blow5

Using a large batch size may consume a significant amount of RAM at run time. Similarly, the GPU batch size determines how much GPU memory is used. Currently, slorado does not implement automatic batch size selection based on available memory. Thus, if you see an out-of-RAM error, reduce the batch size using -K or -B. If you see an out-of-GPU-memory error, reduce the GPU batch size using the -C option. All options supported by slorado are detailed below, and an example command follows the table:

Option           Description                                                Default
-t INT           number of processing threads                               8
-K INT           batch size (maximum number of reads loaded at once)        2000
-C INT           GPU batch size (maximum number of chunks loaded at once)   800
-B FLOAT[K/M/G]  maximum number of bytes loaded at once                     20.0M
-o FILE          output to file                                             stdout
-c INT           chunk size                                                 8000
-p INT           overlap                                                    150
-x DEVICE        device to use (e.g., cpu, cuda:0, cuda:1,2 or cuda:all)    cuda:0
-h               show the help message and exit                             -
--verbose INT    verbosity level                                            4
--version        print the version and exit                                 -
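
As an illustration of how these options interact (the values below are hypothetical and should be tuned to your hardware), the following command halves the default read and GPU batch sizes to reduce RAM and GPU memory usage, limits the bytes loaded at once, pins the run to the first CUDA device, and writes the output to a file:

./slorado basecaller -K 1000 -C 400 -B 10.0M -x cuda:0 -o reads.fastq models/[email protected] test/oneread_r10.blow5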

A script to calculate basecalling accuracy is provided:

# set the environment variable MINIMAP2 if minimap2 is not in PATH
scripts/calculate_basecalling_accuarcy.sh /genome/hg38noAlt.idx reads.fastq
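
For instance, if minimap2 is not in your PATH, point MINIMAP2 at the binary before invoking the script (the path below is only a placeholder):

export MINIMAP2=/path/to/minimap2
scripts/calculate_basecalling_accuarcy.sh /genome/hg38noAlt.idx reads.fastq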

Acknowledgement
