umms-biocore / dolphinnext

A graphical user interface for distributed data processing of high throughput genomics

Home Page: https://dolphinnext.umassmed.edu

PHP 63.58% CSS 3.24% JavaScript 25.44% Hack 2.29% Python 0.14% Nextflow 3.31% Shell 0.13% Perl 0.04% Twig 1.61% SCSS 0.24%
rna-seq chip-seq atac-seq dolphinnext pipelines workflows share-pipelines pipeline nextflow amazon-cloud

dolphinnext's People

Contributors

dependabot[bot], nephantes, onuryukselen

dolphinnext's Issues

Published dirs

If the published directory is not empty, a resumed job cannot overwrite it and fails with an error.

Unexpected scheduler agent error on Amazon Ignite

Nextflow fails with the following error when autoscale is used, especially when the run involves large files.

ERROR ~ === Unexpected scheduler agent error
May-01 05:08:26.675 [scheduler-agent] ERROR nextflow.scheduler.SchedulerAgent - === Unexpected scheduler agent error
java.lang.IllegalStateException: class org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): nextflow.cache.pendingtasks
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.enter(GridCacheGateway.java:164)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1684)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.query(GatewayProtectedCacheProxy.java:365)
    at nextflow.scheduler.SchedulerAgent$AgentProcessor.processPendingTasks0(SchedulerAgent.groovy:290)
    at nextflow.scheduler.SchedulerAgent$AgentProcessor.processPendingTasks(SchedulerAgent.groovy:277)
    at nextflow.scheduler.SchedulerAgent$AgentProcessor.run(SchedulerAgent.groovy:144)
Caused by: org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): nextflow.cache.pendingtasks
... 6 common frames omitted

Input Reads for Run not seen by other group members

When I go to a previous run by someone else in my group, and click on the
Input "reads" item name, I do not see what files were specified for the input reads. It is blank.

I would have expected to see a list of all the input files.

I am using Fedora Core 25 and the Chrome browser.

Run scheduler

Could an option to set a start time for a run be added to the advanced section?

Option to remove bam files

Add a yes/no option to remove alignment files from STAR, HISAT2, and Tophat2 after successful completion of a pipeline.

Log Timeline display unreadable and/or wrong

When looking at the timeline in the Log file from a pipeline with lots of processes, it seems like
all the process timeline bars are squished to the far right. Also, the x-axis labels (i.e., the time)
start from 0 and go to 20, then start again at 0 and go to 3. So I am guessing that the process
bars are all using just the last 3 values along the x-axis.
This can be seen in ChIP Run1 on the GHPCC cluster.

I would expect to see something like
https://www.nextflow.io/docs/latest/tracing.html#timeline-report

Maybe if the parent nextflow process waits a long time in the queue
before it runs, that eats up
a lot of time and then all results are squished to the right? But why would the x-axis time
labels start over? If this is the reason, would there be some way to exclude the initial
waiting time from the output timeline?
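
One way the requested exclusion could work is to rebase every process interval on the earliest process start, so that queue wait before the first task no longer stretches the axis. The following is a minimal Python sketch, not DolphinNext or Nextflow code: the function name `rebase_timeline` and the `(name, start_ms, end_ms)` record layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical record layout: (process name, start in ms, end in ms).
Interval = Tuple[str, int, int]


def rebase_timeline(records: List[Interval]) -> List[Interval]:
    """Shift all intervals so the earliest process start becomes time 0."""
    if not records:
        return []
    offset = min(start for _, start, _ in records)
    return [(name, start - offset, end - offset) for name, start, end in records]


# Two processes that only started after a two-hour queue wait.
records = [("fastqc", 7_200_000, 7_260_000), ("star", 7_260_000, 7_800_000)]
print(rebase_timeline(records))
```

With the offset removed, the bars would span the full width of the plot instead of the last few ticks.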

see screen shots:

XaxisLabels
TheFirstProcesses

I have tried this with both Chrome and Firefox on Fedora Core 25. I have also clicked on
the "web link" button to see it larger. That does not fix the problem.

create new revision button

While developing a pipeline, the autosave option triggers and asks to confirm a revision. Instead, just warn and ask for "save and continue" (replacing Save on the existing button).
The Revision button will be separate.

error reporting

Remove the keyword "error" from run status checking criteria. Use more specific terms.
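
A minimal Python sketch of what "more specific terms" could look like: the log is matched against a short list of failure markers instead of the bare word "error", so file names or messages that merely contain that word do not flip the run status. The pattern list and the function name `run_failed` are assumptions for illustration, not the criteria DolphinNext actually uses.

```python
import re

# Hypothetical failure markers; a real deployment would tune this list.
FAILURE_PATTERNS = [
    re.compile(r"^ERROR ~ ", re.MULTILINE),   # console error prefix, as in the trace quoted on this page
    re.compile(r"Error executing process"),   # assumed task-failure message
]


def run_failed(log_text: str) -> bool:
    """Return True only if the log contains one of the specific failure markers."""
    return any(pattern.search(log_text) for pattern in FAILURE_PATTERNS)


print(run_failed("ERROR ~ === Unexpected scheduler agent error"))
print(run_failed("Copying sample_error_rates.txt to the report directory"))
```

The second call shows the case the issue is about: the word "error" appears in a file name, but no failure marker matches.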

User Groups

Users should be added to groups by searching for their names.

input file name collision

If files are added in paired-end format, a name collision occurs because the user-defined name is used when the files are renamed. Autofill for the mate should be added according to the selected samples.
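
The mate autofill could be sketched roughly as follows in Python. This assumes the common `_R1`/`_R2` or `_1`/`_2` file naming convention; the function name `guess_mate` and the regular expression are hypothetical and not part of DolphinNext.

```python
import re
from typing import Optional

# Matches the mate-1 tag ("_R1" or "_1") that sits right before an optional
# lane/chunk suffix and the .fastq/.fq extension. Assumed naming convention.
_MATE1 = re.compile(r"(_R?)1(?=(_\d+)?\.f(ast)?q)")


def guess_mate(r1_name: str) -> Optional[str]:
    """Suggest the mate-2 file name for a mate-1 file, or None if no tag is found."""
    match = _MATE1.search(r1_name)
    if match is None:
        return None
    return r1_name[: match.start()] + match.group(1) + "2" + r1_name[match.end():]


for name in ["sampleA_R1.fastq.gz", "sampleA_1.fq", "sampleA.fastq"]:
    print(name, "->", guess_mate(name))
```

Returning `None` when no tag is found leaves the mate field empty for the user to fill in rather than guessing.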

Run Status should show runs not yet submitted

If I create a run (e.g., by copying a previous one), and assign it to a project and save it - but
do not submit it to run - I cannot see that run when I click on "Run Status" at the top
of the page. I can see it if I go to my projects. I find that confusing.

I think it would be nice to be able to see it in both places (although I suppose if I had lots of runs which I
hadn't yet submitted it might get crowded - but maybe the search box on the Run Status
page could also search on the Status field which might say: Not Submitted).

Since the search box doesn't seem to look at the Status column at all, maybe there should be
a dropdown menu or check box that lets you filter the runs you see based on their status.
Or allow the search box to search the status field too.
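
Both suggestions (a status filter and a search box that also looks at the status field) can be combined in one small function. This is a Python sketch under assumptions: the run records are plain dictionaries with `name` and `status` keys, and `filter_runs` is a hypothetical helper, not the Run Status page's real code.

```python
from typing import Dict, Iterable, List, Optional, Set


def filter_runs(
    runs: Iterable[Dict[str, str]],
    query: str = "",
    statuses: Optional[Set[str]] = None,
) -> List[Dict[str, str]]:
    """Keep runs whose name or status contains `query` and whose status is allowed."""
    needle = query.lower()
    kept = []
    for run in runs:
        haystack = f"{run['name']} {run['status']}".lower()
        if needle in haystack and (statuses is None or run["status"] in statuses):
            kept.append(run)
    return kept


runs = [
    {"name": "ChIP Run1", "status": "Completed"},
    {"name": "ChIP Run1-copy", "status": "Not Submitted"},
]
print(filter_runs(runs, query="not submitted"))   # free-text search hits the status
print(filter_runs(runs, statuses={"Completed"}))  # dropdown or checkbox style filter
```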

show all outputs

Add a checkbox to the process circle to show all output tooltips at the same time. A different color could also be used to emphasize unconnected nodes.

Cluster termination

Add an option to the advanced tab to terminate the Amazon cluster when a pipeline completes, whether it succeeds or fails.

Bug in sequential mapping

The custom sequence set is skipped, so it does not work.
When I don't add any other common RNAs, the for loop looks like this:

 for rna_set in 

When I add rRNA, GFP, and miRNA, the for loop looks like this:

 for rna_set in rRNA  miRNA

But this part is correct:

prev="reads"
IFS=',' read -r -a paramsListAr <<< "-N 1,-N 1,-N 1" #create comma separated array 
IFS=',' read -r -a filtersListAr <<< "Yes,Yes,Yes"
IFS=',' read -r -a indexesListAr <<< "/share/data/umw_biocore/genome_data/human/hg19/commondb/rRNA/rRNA,/project/umw_robert_kotin/data/gfp/gfp,/share/data/umw_biocore/genome_data/human/hg19/commondb/miRNA/miRNA"

So it doesn't add name_of_the_index_file to the list.
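
One possible fix is to derive the loop list from the index paths themselves, so a custom index can never be dropped. The Python sketch below only illustrates that idea; the actual DolphinNext template is a bash script and may build the list differently, and `rna_set_names` is a hypothetical name.

```python
import posixpath
from typing import List


def rna_set_names(index_list: str) -> List[str]:
    """Derive one set name per index path, using the last path component."""
    return [posixpath.basename(path) for path in index_list.split(",") if path]


# The comma-separated index list quoted in the report above.
indexes = (
    "/share/data/umw_biocore/genome_data/human/hg19/commondb/rRNA/rRNA,"
    "/project/umw_robert_kotin/data/gfp/gfp,"
    "/share/data/umw_biocore/genome_data/human/hg19/commondb/miRNA/miRNA"
)
print(rna_set_names(indexes))
```

Because the names come from the same list as the indexes, the two always have the same length, and an empty list yields an empty loop rather than a malformed `for` statement.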

Missing Project Files upload button

Describe the bug
Project Files section is missing an upload files button as described in the documentation (see screenshot below).

To Reproduce
Steps to reproduce the behavior:

  1. Build and run docker version from latest commit.
  2. Install php7.2-ldap in the container
  3. Register new user, activate and login as new user
  4. Create new project
  5. Scroll in the project to "Project Files" section

Expected behavior
Upload files button present to upload new files.

Screenshots
image

Run Status shows a shared run, clicking on it shows nothing

I click on Run Status. At the top of the list I see ID 1578 ChIP Run1-copy completed
by rw54w at 15:47. If I click on the run name, or go to Options>view run
I get sent to a blank Biocore Run Generation page (ie, no info is filled in the fields).
I would have expected to see the Run name, run settings, etc.

NoRunInfo

I am not sure why this is happening and how to reproduce the error.
I have been able to see results from other runs by rw54w.
I did NOT see this run listed in Run Status before it completed - I'm not sure
if that is related to this problem or not.

Using FC25 and the Chrome browser.

Running DolphinNext locally

Hi, is there any documentation on running your own instance of DolphinNext? I would like to try running it locally.

To that end, are you considering providing a Docker image for DolphinNext? It should be straightforward: you would just need to define all dependencies and the server execution command. Then, anyone could simply run something like docker run UMMS-Biocore/dolphinnext to execute their own server.

Github integration

Add a feature to publish a pipeline to GitHub and initiate a run from that repository. Add a "run in command line" option that prints all the commands and parameters needed to execute the pipeline in a local environment. The pipeline will download missing files from genioHub before the run starts.

Amazon EC2 is not accessible

Warn the user when the cluster cannot be reached, with a message like "We cannot connect to Amazon. Would you like to shut it down?", for example after 5 failed attempts made every 2 minutes.
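
The "5 trials, every 2 minutes" policy could be sketched like this in Python. The connectivity check is injected as a callable, so nothing here assumes how DolphinNext actually reaches the cluster; `cluster_unreachable` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def cluster_unreachable(
    check: Callable[[], bool],
    attempts: int = 5,
    interval_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if every connection attempt fails, i.e. the user should be warned."""
    for attempt in range(attempts):
        if check():
            return False
        if attempt < attempts - 1:
            sleep(interval_s)  # wait before the next attempt, but not after the last one
    return True


# Demo with a check that always fails and a no-op sleep so it finishes instantly.
if cluster_unreachable(lambda: False, sleep=lambda _seconds: None):
    print("We cannot connect to Amazon. Would you like to shut it down?")
```

Passing `sleep` as a parameter keeps the waiting policy testable without actually pausing for ten minutes.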

Improve drag and drop of processes into pipeline

When I am creating a pipeline, I first select a process in the left pane and then drag it into
the pipeline development region in the upper right. But if I scroll down to pick a process that is far
down the left pane, the development region is no longer on my screen, and, at least in Chrome on FC25, I can no
longer drag and drop the process into the development region. I have tried many different keystroke
combinations to try to get the browser to auto-scroll up towards the development region (which
is off the top of the window), but nothing seems to enable that.

Perhaps in addition to drag and drop there could be some sort of copy and paste which would let
me "copy" the process from the left pane, then let me manually scroll my window until the
pipeline development region was in view, and then "paste" the process into the development
region.

Right now I am just shrinking text, etc., in my browser way down so that I can fit both the
left pane process and the right pane pipeline development window onto the same display, and
then expanding things back up to work on them.

License is contradictory with README

Either the software is released under the GNU GPL license, as stated in the LICENSE file, and is therefore free to use for any purpose and can be redistributed for a fee; or it is only available for nonprofit academic use, as stated in the README, and is therefore not free software (it may be "free" as in "free beer", but not as in "freedom").

Note that if DolphinNext is only available for nonprofit academic use then it will most likely be excluded from being packaged in all free software linux distributions (Debian, Fedora, Ubuntu) because yes, free software allows us to sell Debian CDs if we want to.

I'm a non-profit academic user, so it does not directly affect me, but you may be interested in learning about free software vs "academic only" licenses.

Upgrade Add File Feature

Project and run environment dropdowns should be added, and both should be preselected when the feature is first opened.

Tutorial Requests

Prepare tutorials for:

  1. Passing fastq names to single/paired-end channels
  2. Building Singularity and Docker images and integrating them with DolphinNext

Enable unique docker container for each process

Is your feature request related to a problem? Please describe.
Currently, a docker container can only be specified for the whole workflow.

Describe the solution you'd like
Nextflow easily enables unique containers for each process with the container 'image_name_1' setting at the top of a process. Currently I have not figured out a way to specify this in DolphinNext as the header, body and footer of a process are contained within the quotes, although I am quite possibly missing something.

Describe alternatives you've considered
I can make my own Docker container with the dependencies for all of the processes, publish it, and run the full pipeline in that container, although this is less convenient than using premade biocontainers or the like.

Ability to add comments after a run

After a run I look at the output and make some observations and conclusions.
It would be nice to be able to add them to the associated run info.
Maybe just the ability to edit the Run Settings>Run Description after the run?
Or maybe another entry box specifically designed for that purpose?

Newly created pipeline loses all of its contents!

I am using the pipeline editor to string together two simple processes (proc1 and proc2) into
a pipeline called proc1 to proc2 (another attempt, which is still around, is called p1 to p2).
I create the pipeline and make sure that I click the save
icon near the top. If I then click the Run button to create a run, then choose a project to associate it with, then choose a run name, I get into a vanilla Run Generation page which has almost NOTHING
filled in (i.e., the Run name is missing, the pipeline name IS there, but I can't choose a Run environment and no inputs or outputs are visible). The Workflow tab gives me just the text description; NO GUI workflow is shown. So I deleted the run and got out of the pipeline
altogether. Then I chose that pipeline (which IS listed now) from the left pane, in order to try
a run that way. Only the text description of the pipeline populates the pipeline page; there is no
workflow GUI, and nothing is listed in the Processes, Inputs, or Outputs tabs.

This has happened to me 3 times in a row now. I've tried it with a completely new (and I deleted the old "proc1 to proc2") pipeline
with a new name ("p1 to p2"). The same problem comes up. I am completely stuck in the water.
Very annoying.

Maybe some previous attempt is somehow preventing a new pipeline?
I am really at a loss ...

E-mail the recent updates

Add a checkbox to the profile for receiving a newsletter.
Two main categories:
A) User related news
B) Developer related news (mostly summary of recent upgrades)

Add ability to see whole module name - enhancement

Both on the left pane (where the list of modules and processes is) and in the
pipeline GUI (ie, the circles with the process or module name in them), the names
of the associated process or pipeline get cut off after about 10 characters. There
doesn't seem to be any way to click on them or scroll over them to see their entire
name. This is very frustrating as typically it seems like all the elements associated
with a pipeline (ie, its processes, and its inputs and output icons) get named with
the same beginning (e.g., chipseq_output_bam, chipseq_output_txt, etc.). So the
distinguishing part of the name (eg, bam, txt) gets cut off and it is hard to tell from the name
how they differ.

Maybe when the cursor is positioned over the name a tooltip could pop up with the
entire name (for names in both the left pane and in the GUI pipeline builder window).

I am on FC25 with chrome browser.

Improve GEO import feature

fasterq-dump fails when the network connection is not stable (ncbi/sra-tools#139).

Example error:
fasterq-dump.2.9.4 sys: connection failed while opening file within cryptographic module - mbedtls_ssl_handshake returned -76 ( NET - Reading information from the socket failed )
fasterq-dump.2.9.4 sys: connection failed while opening file within cryptographic module - ktls_handshake failed while accessing '165.112.9.232' from '172.31.52.78'
fasterq-dump.2.9.4 sys: connection failed while opening file within cryptographic module - Failed to create TLS stream for 'sra-download.ncbi.nlm.nih.gov' (165.112.9.232) from '172.31.52.78'
fasterq-dump.2.9.4 err: connection failed while opening file within cryptographic module - error with https open 'https://sra-download.ncbi.nlm.nih.gov/traces/sra77/SRR/008399/SRR8601574'

Replace this tool with alternatives (such as prefetch) to improve the performance.
