Bionode submitted projects together with BioJS for GSoC 2015. Unfortunately, BioJS wasn't accepted as an organisation this year (Google turned down many previously accepted organisations and accepted new ones, which is fair).
However, those projects can still be carried out by anyone interested in them, or by students looking for a project.
The following is a copy of the Bionode projects listed on the BioJS page. If you are interested, please reply here or at gitter.im/bionode/bionode. You can also send me an email at [email protected]
Bionode Pipeline Building GUI
Rationale & Approach
Making an easy-to-use graphical user interface for building interactive pipelines would lower the barrier to entry for non-bioinformaticians and non-programmers who want to use Bionode. This could be achieved through integration with projects like Galaxy; however, a more interactive and advanced interface such as Node-RED is what we are aiming for. Another good source of interface inspiration would be the NoFlo project. Node-RED or any other open source project can and should be used or adapted as much as possible instead of writing a new interface from scratch.
The resulting interface should output a descriptive text file representation of the pipeline that can be run on the command line without requiring the GUI, for example as a Gasket, datscript, hackfile or Makefile.
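As a rough illustration of the "GUI in, text file out" idea, here is a minimal sketch that renders a linear pipeline as a plain shell script. It does not use Gasket, datscript or any Bionode API: the `pipeline` structure, its `id` and `command` fields, and the `tool-a`/`tool-b` commands are hypothetical placeholders for whatever the GUI would actually hold.

```javascript
// Hypothetical in-memory description of a linear pipeline, as a GUI might
// hold it. Field names and commands are placeholders, not real Bionode tools.
const pipeline = [
  { id: 'search', command: 'tool-a search sra "some species"' },
  { id: 'filter', command: 'tool-b filter --min-length 100' },
  { id: 'count', command: 'wc -l' }
];

// Render the pipeline as a shell script: a short header, then the commands
// joined by pipes, one command per line.
function toShellScript(steps) {
  if (steps.length === 0) {
    throw new Error('pipeline has no steps');
  }
  const header = [
    '#!/usr/bin/env bash',
    `# steps: ${steps.map((step) => step.id).join(' | ')}`
  ];
  // Each command ends with a line continuation; the next starts with a pipe.
  const body = steps.map((step) => step.command).join(' \\\n  | ');
  return header.concat(body).join('\n') + '\n';
}

console.log(toShellScript(pipeline));
```

Because the output is ordinary text, it can be committed to version control, diffed and shared like any other script. A real implementation would also need to handle branching pipelines, which a single chain of pipes cannot express.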
Challenges
- Integration between available interfaces and bionode pipeline
- Producing a simple text format representation of those pipelines for easy versioning, distribution and collaboration.
Involved Tools / Libraries
- Node-RED
- NoFlo (for ideas)
- Galaxy (for ideas)
- Gasket, Datscript, Hackfiles, Makefiles (for text representation of pipeline).
Needed Skills
- Backend JavaScript/Node.js
- Frontend JavaScript
- Bash
- CoffeeScript (for NoFlo)
Mentors
Bionode team (contact: Bruno Vieira)
- Boris Adryan: Scientist: @Flyjedi, genome gazer. Geek: Founder of @thingslearn, #IoT tinkerer
- Bruno Vieira: Bioinformatics PhD student at Queen Mary University of London and Node.JS Web Developer. Working on population genomics, bionode.io and dat-data.com
- Dave C-J: Node-RED developer
- Karissa McKelvey: Programmer and idea jockey based in Oakland, CA. Former academic experienced in building interactive data visualization and collaboration tools
- Mathias Buus: Programmer based in Copenhagen, Denmark. Co-creator of node-modules.com and co-founder of ge.tt. open mouth, open source
- Max Ogden: Programmer based in Portland, OR. Max works on or has worked on things like CSVConf, Code for America, JavaScript for Cats, and Voxel.js
- Nicholas O'Leary: IBM Emerging Technologies geek. All things MQTT and IoT. Creator of @nodered and one of the @BeardyDads
- Steve Moss: Passionate Computational and Data Scientist specialising in Bioinformatics, DevOps and SysAdmin
- Yannick Wurm: Population Genomics, Bioinformatics, Evolution of Social Insects. Senior Lecturer at Queen Mary University of London
Bionode integration
Rationale & Approach
Bionode focuses on modular pipelines for data manipulation and analysis, while BioJS focuses on visualisation. It would be interesting to combine both tools to solve a biologically relevant problem, while testing and solving issues with the integration between the two projects.
For example, one interesting use case is to use Bionode to get transcriptomic data from the Sequence Read Archive (SRA) for any species/experiment and visualise the expression levels of genes with BioJS. During your project you should be able to work on at least three different use cases.
As some file types (e.g. SAM/BAM) can become very large, one should be able to use streams to communicate with Bionode modules.
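To make the streaming point concrete, here is a minimal sketch using only Node.js core streams. It parses FASTA text chunk by chunk, so the whole file never has to be in memory. FASTA is used here only because it is simple text; SAM/BAM would need a proper parser. The names `createFastaParser` and `fastaStream` are made up for this example and are not part of any Bionode module.

```javascript
const { Transform } = require('node:stream');

// Incremental FASTA parser: accepts text chunks of any size and returns the
// records completed so far. A record is only returned once the next header
// (or the end of input) shows that it is complete.
function createFastaParser() {
  let buffered = '';  // trailing partial line from the previous chunk
  let current = null; // record currently being assembled

  function consumeLine(line, done) {
    const trimmed = line.trim();
    if (trimmed === '') return;
    if (trimmed.startsWith('>')) {
      if (current) done.push(current);
      current = { id: trimmed.slice(1), seq: '' };
    } else if (current) {
      current.seq += trimmed;
    }
  }

  return {
    push(chunk) {
      const done = [];
      const lines = (buffered + chunk).split('\n');
      buffered = lines.pop(); // last element may be an incomplete line
      for (const line of lines) consumeLine(line, done);
      return done;
    },
    flush() {
      const done = [];
      consumeLine(buffered, done);
      buffered = '';
      if (current) done.push(current);
      current = null;
      return done;
    }
  };
}

// Wrap the parser in a Transform with an object-mode readable side, so it
// can sit in a .pipe() chain and emit one object per record.
function fastaStream() {
  const parser = createFastaParser();
  return new Transform({
    readableObjectMode: true,
    transform(chunk, _encoding, callback) {
      // Assumes ASCII input, so chunk boundaries cannot split a character.
      for (const record of parser.push(chunk.toString())) this.push(record);
      callback();
    },
    flush(callback) {
      for (const record of parser.flush()) this.push(record);
      callback();
    }
  });
}
```

A typical use would be `process.stdin.pipe(fastaStream()).on('data', (record) => console.log(record.id, record.seq.length))`, which also shows how such a module could be combined with other UNIX tools through pipes.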
Challenges
- Getting several modules from both projects to work together
- Might require some architectural changes to those modules.
Involved Tools / Libraries
Needed Skills
- Frontend JavaScript
- Backend JavaScript/Node.js
Mentors
Bruno Vieira (Bionode) and Miguel Pignatelli (BioJS)
Bionode distribution on HPC Grid
Rationale & Approach
Bionode pipelines can currently only run on one machine, but we would like them to scale and be distributed across the nodes of a high performance computing (HPC) cluster. There are several ways to distribute Node apps across several CPUs or machines, using native Node.js or libraries. However, for a scenario where the user does not have administrative access to the cluster and must rely on established queuing tools (e.g., Sun Grid Engine), integrating or wrapping Bionode around those tools might be the best approach.
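One way to picture the wrapping approach is a small function that turns a pipeline step into a Sun Grid Engine job script. This is only a sketch under assumptions: the `step` fields (`name`, `command`, `slots`) are hypothetical, and the parallel environment name (`smp` by default here) is site-specific and must match what the cluster administrators have configured.

```javascript
// Build the text of a Sun Grid Engine job script for one pipeline step.
// The step fields and the default parallel environment name are
// illustrative assumptions, not an existing Bionode interface.
function toSgeScript(step, options = {}) {
  const parallelEnvironment = options.parallelEnvironment || 'smp';
  const lines = [
    '#!/bin/bash',
    `#$ -N ${step.name}`, // job name
    '#$ -cwd',            // run from the submission directory
    '#$ -j y'             // merge stderr into stdout
  ];
  if (step.slots > 1) {
    // Request several slots in the given parallel environment.
    lines.push(`#$ -pe ${parallelEnvironment} ${step.slots}`);
  }
  lines.push(step.command);
  return lines.join('\n') + '\n';
}

console.log(toSgeScript({ name: 'align', command: 'echo "aligning"', slots: 4 }));
```

The generated text would be written to a file and submitted with `qsub`, which needs no administrative access. Tracking job state and chaining dependent steps are left out of this sketch.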
Challenges
- Development will require access to a cluster of several machines or a simulated environment. We already have a Docker container that provides Sun Grid Engine.
- If the student is interested in using Node.js queuing/distribution libraries, it will require a review of the existing options and adapting to bionode pipelines.
- If the student has more interest or experience with other queuing systems, it will require wrapping those systems with bionode/node.js code.
- We only expect the student to pursue one approach, but a very skilled student could do both.
Involved Tools / Libraries
- Node queuing systems
- Other queuing systems (e.g., SGE)
Needed Skills
- Node.js/JavaScript
- HPC experience
- Docker (could be useful for development)
Mentors
Steve Moss and Max Ogden
Bionode modules
Rationale & Approach
There are several modules that would be useful for bionode, which can be grouped into:
- Data access (from web APIs)
- Data parsing/wrangling
- Tools wrappers
The student could work on improving an existing module or writing from scratch a module that has been requested. If the student is interested in several small modules, improving their architecture and integration among themselves and other UNIX tools could have a huge impact on the usability of the project.
Challenges
The challenges will depend on the module(s) the student is interested in, but there are enough options to adapt to a very diverse range and level of skills.
Involved Tools / Libraries
- Depends on the module, but anything from web APIs (e.g., NCBI) to command line tools (e.g., SAMtools).
- Node.js/JavaScript
Needed Skills
JavaScript/Node.js
Mentors
Bionode team (contact: Bruno Vieira)