bio-boris / kbparallel

This project forked from kbaseapps/kbparallel

prototype module for parallel job execution for KBase

License: MIT License

Ruby 0.81% Makefile 0.61% Python 98.30% Shell 0.21% Dockerfile 0.07%

kbparallel's Introduction

KBParallel

Execute batch jobs in KBase.

Installation

kb-sdk install KBParallel

Example Usage

Below is an example of running KBParallel.run_batch -- read the comments for details:

# invoke from an App like any other KBase SDK function call
parallel_runner = KBParallel(self.callback_url)

# build a list of tasks
# ---------------------
# The 'parameters' dict of each task ('parameters': { ... }) is passed to the
# function being called (here align_reads_to_assembly_app). For instance, to
# align multiple fastq files in parallel, build one task per file, each with
# its own parameters.
tasks = [
  {
    'module_name': 'kb_Bowtie2',
    'function_name': 'align_reads_to_assembly_app',
    'version': 'dev',
    'parameters': { ... }  # your app parameters
  },
  ...
]

# NOTE: modules called by kbparallel (e.g. kb_Bowtie2) have to be registered
# in appdev as 'release', 'beta', or 'dev'.


# configure how tasks are run
# ---------------------------
# You can set how many concurrent jobs run on the local machine and how
# many njsw nodes run in parallel.
# For example, in this case, if you have 5 tasks, 2 will be submitted to
# 2 njsw nodes and the remaining 3 will be run in serial on the local machine.

batch_run_params = {'tasks': tasks,
                    'runner': 'parallel', # parallel | local_parallel | local_serial
                    'concurrent_local_tasks': 1,
                    'concurrent_njsw_tasks': 2,
                    'max_retries': 2 # how many attempts at running a task before admitting defeat
                    }

# submit the tasks
results = parallel_runner.run_batch(batch_run_params)
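When the task list is built from many inputs, a small helper can keep the task dictionaries consistent. The following is a minimal sketch, not part of KBParallel: `build_tasks` is a hypothetical helper, and the `input_ref` key and the reference strings are placeholders for whatever parameters your app actually expects.

```python
def build_tasks(param_sets, module_name='kb_Bowtie2',
                function_name='align_reads_to_assembly_app', version='dev'):
    """Return one task dict per parameter set (hypothetical helper)."""
    return [
        {
            'module_name': module_name,
            'function_name': function_name,
            'version': version,
            'parameters': params,
        }
        for params in param_sets
    ]


# Placeholder parameter sets: 'input_ref' and the refs are assumptions.
param_sets = [{'input_ref': ref} for ref in ('1/2/3', '1/4/1', '1/5/2')]
tasks = build_tasks(param_sets)

# Same batch settings as in the example above.
batch_run_params = {
    'tasks': tasks,
    'runner': 'parallel',
    'concurrent_local_tasks': 1,
    'concurrent_njsw_tasks': 2,
    'max_retries': 2,
}
```

The resulting `batch_run_params` can then be passed to `run_batch` in the same way as the hand-written dictionary.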

KBParallel returns nested Python dictionaries containing the results of every task that was run. Below is a description of this data structure.

# results data structure
# ----------------------
# The results of the function called by kbparallel
# (align_reads_to_assembly_app) must be returned, or they will not be
# accessible; for instance, if align_reads_to_assembly_app creates a new
# alignment file, the path to this file must be returned in the output
# dictionary.

{
  'results': [
    {
      'is_error': 0,
      'result_package': {
        'error': None,
        'function': {
          'function_name': 'align_reads_to_assembly_app',
          'module_name': 'kb_Bowtie2',
          'version': 'dev'
        },
        'result': [ ... Method call return data ... ],
        'run_context': {
          'job_id': '81e3f2c4-5386-45f3-9bbf-d5a7bd23731a',
          'location': 'local'
        }
      }
    },
    ...
  ]
}
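A caller will usually want to separate successful tasks from failed ones. The sketch below walks the structure described above using only the keys shown there. `split_results` is a hypothetical helper, and the sample data is invented for illustration: the shape of `'error'` for a failed task is not documented here, so a plain string stands in for it.

```python
def split_results(batch_output):
    """Split run_batch output into (succeeded, failed) lists."""
    succeeded, failed = [], []
    for entry in batch_output['results']:
        package = entry['result_package']
        if entry['is_error']:
            # Keep which function failed alongside its error payload.
            failed.append((package['function'], package['error']))
        else:
            succeeded.append(package['result'])
    return succeeded, failed


# Invented sample data following the documented layout.
function_info = {
    'function_name': 'align_reads_to_assembly_app',
    'module_name': 'kb_Bowtie2',
    'version': 'dev',
}
example_output = {
    'results': [
        {'is_error': 0,
         'result_package': {'error': None, 'function': function_info,
                            'result': ['alignment.bam'],
                            'run_context': {'job_id': 'a', 'location': 'local'}}},
        {'is_error': 1,
         'result_package': {'error': 'example error message',
                            'function': function_info,
                            'result': None,
                            'run_context': {'job_id': 'b', 'location': 'local'}}},
    ]
}

succeeded, failed = split_results(example_output)
```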

Some Examples

A simple hello world example runs 3 tasks in parallel; each job creates a .txt file. The jobs are run on 1 local and 2 njsw nodes. To try it, search for kbparallel example in 'dev' (also see https://gitlab.com/jfroula/kbparallel_example.git).

For an example that does real work, search for bowtie2 or Align Reads using Bowtie2 v2.3.2 (also see https://github.com/kbaseapps/kb_Bowtie2).

This example is tricky because it calls the same function ("align" in Bowtie2Aligner.py) twice: once to set up the parallel tasks (the section that runs first, if input_info['run_mode'] == 'sample_set') and then again to run each task (the section that runs second, if input_info['run_mode'] == 'single_library'). The second section does the actual work by calling single_reads_lib_run.
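The two-pass pattern can be sketched as follows. This is a simplified illustration, not the code in Bowtie2Aligner.py: the `libraries` and `library` keys, the stub `single_reads_lib_run`, and `fake_run_batch` (a local stand-in for `KBParallel.run_batch` that returns a flattened result) are all assumptions made so the sketch runs on its own.

```python
def single_reads_lib_run(input_info):
    # Stub for the real worker; returns a made-up result.
    return {'aligned': input_info['library']}


def align(input_info, run_batch):
    """Dispatch on run_mode: fan out for a sample set, do the work for one library."""
    if input_info['run_mode'] == 'sample_set':
        # First pass: build one task per library, each re-entering align()
        # in 'single_library' mode.
        tasks = [
            {
                'module_name': 'kb_Bowtie2',
                'function_name': 'align_reads_to_assembly_app',
                'version': 'dev',
                'parameters': {'run_mode': 'single_library', 'library': lib},
            }
            for lib in input_info['libraries']
        ]
        return run_batch({'tasks': tasks, 'runner': 'parallel'})
    if input_info['run_mode'] == 'single_library':
        # Second pass: do the actual work for a single library.
        return single_reads_lib_run(input_info)
    raise ValueError('unknown run_mode: %r' % input_info['run_mode'])


def fake_run_batch(params):
    # Local stand-in: calls align() once per task, in order.
    return {'results': [align(task['parameters'], fake_run_batch)
                        for task in params['tasks']]}


output = align({'run_mode': 'sample_set', 'libraries': ['lib_a', 'lib_b']},
               fake_run_batch)
```

Routing both passes through one function lets a single registered app method serve as the entry point and as the per-task worker; the run mode in the parameters decides which path is taken.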

Development

Project anatomy

  • lib/KBParallel/utils/task_manager.py - The TaskManager creates all the tasks, starts jobs, manages the local and remote job queues, and polls for the statuses of running jobs.
  • lib/KBParallel/utils/task.py - A Task represents a KBase module and method with parameters to be run either locally or remotely. A task can have multiple jobs (if some fail).
  • lib/KBParallel/utils/job.py - A Job represents an attempt to run a Task either on NJS or locally.
  • lib/KBParallel/utils/validate_params.py - Utility that validates the parameters passed into KBParallel.run_batch and sets default values for any that are missing.
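The relationship between a Task and its Jobs can be illustrated with a small sketch. It is not the code in task.py or job.py: the class bodies, the `run` method, and the `call` argument are assumptions, and `max_retries` is treated as the total number of attempts, following the comment in the usage example.

```python
class Job:
    """One attempt to run a task at a given location."""

    def __init__(self, location):
        self.location = location
        self.error = None


class Task:
    """A module/function call with parameters; it may spawn several jobs."""

    def __init__(self, module_name, function_name, parameters, max_retries=2):
        self.module_name = module_name
        self.function_name = function_name
        self.parameters = parameters
        self.max_retries = max_retries
        self.jobs = []

    def run(self, call, location='local'):
        """Try `call` until it succeeds or max_retries attempts are used."""
        while len(self.jobs) < self.max_retries:
            job = Job(location)
            self.jobs.append(job)
            try:
                return call(self.parameters)
            except Exception as err:
                # Record the failure on the job and try again with a new job.
                job.error = str(err)
        return None


# A callable that fails on the first attempt and succeeds on the second.
attempts = []

def flaky(parameters):
    attempts.append(parameters)
    if len(attempts) < 2:
        raise RuntimeError('transient failure')
    return 'ok'


task = Task('kb_Bowtie2', 'align_reads_to_assembly_app', {'x': 1}, max_retries=2)
result = task.run(flaky)
```

Here the task ends up with two jobs: the first records the failure and the second carries the successful attempt.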

Testing

Edit test_local/test.cfg with the following settings:

test_token=<your_appdev_developer_token>
kbase_endpoint=https://appdev.kbase.us/services
auth_service_url=https://appdev.kbase.us/services/auth/api/legacy/KBase/Sessions/Login
auth_service_url_allow_insecure=false

Then run:

kb-sdk test 

kbparallel's People

Contributors

sjyoo, jayrbolton, jfroula, briehl, sean-mccorkle, jamesjeffryes, sychan
