hmenager / workflow-is-cwl

This project forked from ebi-metagenomics/workflow-is-cwl

This repository contains CWL descriptions of the various tools which will allow you to build workflows for the annotation of transcripts

Home Page: https://www.elixir-europe.org/

Common Workflow Language 63.40% Shell 5.03% Pep8 13.58% Roff 16.96% Dockerfile 1.03%

workflow-is-cwl's Introduction

ELIXIR Workflow Implementation Study - CWL descriptions

This repository contains CWL descriptions of the various tools (TransDecoder, Diamond, InterProScan and PHMMER), which will allow you to build workflows for the annotation of transcriptomes.

workflow-is-cwl's People

Contributors

mscheremetjew, hmenager, stain

workflow-is-cwl's Issues

galaxy+cwltool - TransDecoder

$namespaces: 
  gx: "http://galaxyproject.org/cwl#"
hints:
  - class: DockerRequirement
    dockerPull: greatfireball/ime_transdecoder:5.0.2
  - class: gx:interface
    gx:inputs:
      - gx:name: geneToTranscriptMap
        gx:type: data
        gx:format: 'txt'
        gx:optional: True
      - gx:name: geneticCode
        gx:type: data
        gx:format: 'txt'
        gx:optional: True
      - gx:name: minimumProteinLength
        gx:type: integer
        gx:optional: True
      - gx:name: strandSpecific
        gx:type: boolean
        gx:optional: True
      - gx:name: transcriptsFile
        gx:type: data
        gx:format: 'txt'

e5cb9a7

galaxy+cwltool - HMMER

Ran successfully in Galaxy after adding the hints below to phmmer-v3.2.cwl.

$namespaces:
  gx: "http://galaxyproject.org/cwl#"
hints:
  gx:interface:
    gx:inputs:
      - gx:name: bitscoreThreshold
        gx:type: integer
        gx:optional: True
      - gx:name: cpu
        gx:type: integer
        gx:optional: True
      - gx:name: seqFile
        gx:type: data
        gx:format: 'txt'
      - gx:name: seqdb
        gx:type: data
        gx:format: 'txt'
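
The gx:interface entries in these issues follow a regular pattern (inputs listed by name, File inputs as 'data' with format 'txt', optional inputs flagged). A minimal Python sketch of that pattern follows. The gx_inputs helper and the CWL_TO_GX table are hypothetical, inferred from the hand-written hints in this thread rather than taken from Galaxy, and the CWL types given for the phmmer inputs are assumed.

```python
# Hypothetical helper: derive Galaxy "gx:interface" input entries from CWL input types.
# The mapping mirrors the hand-written hints in this thread; it is an assumption,
# not Galaxy's own logic.
CWL_TO_GX = {
    "File": "data",
    "Directory": "directory",
    "int": "integer",
    "float": "float",
    "boolean": "boolean",
    "string": "text",
}


def gx_inputs(cwl_inputs):
    """Return a list of gx:inputs entries, sorted by input name."""
    entries = []
    for name in sorted(cwl_inputs):
        cwl_type = cwl_inputs[name]
        optional = cwl_type.endswith("?")  # CWL shorthand for an optional input
        base = cwl_type.rstrip("?")
        entry = {"gx:name": name, "gx:type": CWL_TO_GX[base]}
        if base == "File":
            entry["gx:format"] = "txt"  # the hints in this thread use 'txt' throughout
        if optional:
            entry["gx:optional"] = True
        entries.append(entry)
    return entries


# Input names come from the phmmer hints above; the CWL types are assumed.
phmmer_inputs = {
    "seqFile": "File",
    "seqdb": "File",
    "cpu": "int?",
    "bitscoreThreshold": "int?",
}

for entry in gx_inputs(phmmer_inputs):
    print(entry)
```

The resulting dictionaries would still need to be written out as YAML under hints by hand or with a YAML library.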

cwltool - Infernal/cmsearch

$ pip show cwltool
Name: cwltool
Version: 1.0.20180508202931

$ cd workflow-is-cwl/tools/Infernal/cmsearch
$ cwl-runner infernal-cmsearch-v1.1.2.cwl infernal-cmsearch.test.job.yaml
Final process status is success

Job outputs moved to "workflow-is-cwl/tools/Infernal/cmsearch/expected_output" folder.

cwltool - BUSCO

$ pip show cwltool
Name: cwltool
Version: 1.0.20180508202931

$ cd workflow-is-cwl/tools/BUSCO
$ cwl-runner BUSCO-v3.cwl BUSCO-v3.test.job.yaml
Final process status is success

Job outputs moved to "workflow-is-cwl/tools/BUSCO/expected_output" folder.

sync201811 - galaxy+cwltool - cmsearch-multimodel-wf

$ cwltool --pack cmsearch-multimodel-wf.cwl > cmsearch-multimodel-wf-packed.cwl

Importing the packed workflow into Galaxy fails with the following error:

Traceback (most recent call last):
  File "lib/galaxy/web/framework/decorators.py", line 283, in decorator
    rval = func(self, trans, *args, **kwargs)
  File "lib/galaxy/webapps/galaxy/api/workflows.py", line 337, in create
    return self.__api_import_from_archive(trans, archive_data, source="uploaded file", from_path=os.path.abspath(uploaded_file_name))
  File "lib/galaxy/webapps/galaxy/api/workflows.py", line 596, in __api_import_from_archive
    raw_workflow_description = self.__normalize_workflow(trans, data)
  File "lib/galaxy/webapps/galaxy/api/workflows.py", line 679, in __normalize_workflow
    return self.workflow_contents_manager.normalize_workflow_format(trans, as_dict)
  File "lib/galaxy/managers/workflows.py", line 315, in normalize_workflow_format
    workflow_path += "#" + object_id
TypeError: unsupported operand type(s) for +=: 'NoneType' and 'str'
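
The TypeError indicates that workflow_path was None when the object id was appended. The Python sketch below only reproduces that failure mode and shows one way to fail with a clearer message. build_workflow_ref is a hypothetical stand-in, not Galaxy's normalize_workflow_format, and "main" is used as an example object id.

```python
# Minimal sketch of the failure mode; build_workflow_ref is a hypothetical
# stand-in, not Galaxy's normalize_workflow_format.
def build_workflow_ref(workflow_path, object_id):
    """Append '#<object_id>' to a path, failing clearly when the path is missing."""
    if workflow_path is None:
        raise ValueError(
            "no workflow path available; cannot append object id %r" % object_id
        )
    return workflow_path + "#" + object_id


# Reproduce the original error: in-place concatenation on None.
path = None
try:
    path += "#" + "main"
except TypeError as exc:
    print(exc)
```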

cwltool - cmsearch-deoverlap

$ cd workflow-is-cwl/tools/cmsearch-deoverlap
$ cwl-runner cmsearch-deoverlap-v0.02.cwl cmsearch-deoverlap-v0.02.test.job.yaml 
Final process status is success

Job outputs moved to "workflow-is-cwl/tools/cmsearch-deoverlap/expected_output" folder.

sync201811 - galaxy+cwltool - Diamond

Add Tool link in tools_conf.xml

<tool file="../../workflow-is-cwl/tools/Diamond/Diamon.makedb-v0.9.21.cwl" />

Running the tool fails with the following error:

/home/jra001k/snapshot/galaxy/database/jobs_directory/000/12/tool_script.sh: line 10: diamond: command not found
Traceback (most recent call last):
  File "/home/jra001k/snapshot/galaxy/database/jobs_directory/000/12/relocate_dynamic_outputs.py", line 1, in <module>
    from galaxy_ext.cwl.handle_outputs import relocate_dynamic_outputs; relocate_dynamic_outputs()
  File "/home/jra001k/snapshot/galaxy/lib/galaxy_ext/cwl/handle_outputs.py", line 21, in relocate_dynamic_outputs
    handle_outputs()
  File "/home/jra001k/snapshot/galaxy/lib/galaxy/tools/cwl/runtime_actions.py", line 117, in handle_outputs
    outputs = job_proxy.collect_outputs(tool_working_directory)
  File "/home/jra001k/snapshot/galaxy/lib/galaxy/tools/cwl/parser.py", line 578, in collect_outputs
    return self.cwl_job().collect_outputs(tool_working_directory)
  File "/home/jra001k/snapshot/galaxy/.venv/local/lib/python2.7/site-packages/cwltool/command_line_tool.py", line 539, in collect_output_ports
    compute_checksum=compute_checksum)
  File "/home/jra001k/snapshot/galaxy/.venv/local/lib/python2.7/site-packages/schema_salad/sourceline.py", line 168, in __exit__
    raise self.makeError(six.text_type(exc_value))
cwltool.errors.WorkflowException: Error collecting output for parameter 'diamondDatabaseFile':
:1:1: Did not find output file with glob pattern: '['uniref90_subset.dmnd']'
]

galaxy+cwltool - Infernal/cmsearch

hints:
  - class: DockerRequirement
    dockerPull: 'biocontainers/infernal:v1.1.2-1-deb_cv1'
  - class: gx:interface
    gx:inputs:
      - gx:name: covariance_model_database
        gx:type: data
        gx:format: 'txt'
      - gx:name: cpu
        gx:type: integer
        gx:optional: True
      - gx:name: cut_ga
        gx:type: boolean
        gx:optional: True
      - gx:name: omit_alignment_section
        gx:type: boolean
        gx:optional: True
      - gx:name: only_hmm
        gx:type: boolean
        gx:optional: True
      - gx:name: query_sequences
        gx:type: data
        gx:format: 'txt'
      - gx:name: search_space_size
        gx:type: integer
        gx:default_value: 1000
        gx:optional: True

abc3d8f

cwltool - HMMER

$ pip show cwltool
Name: cwltool
Version: 1.0.20180508202931

$ cd workflow-is-cwl/tools/HMMER
$ cwl-runner phmmer-v3.2.cwl phmmer-v3.2.test.job.yaml
Final process status is success.

Job outputs moved to "workflow-is-cwl/tools/HMMER/expected_output" folder.

Tools label / description too large

screenshot_2018-07-17_06-26-42

The "label" and "doc" attributes are too long to be displayed in the tool list (left panel).

Notes:

  • With a Galaxy tool XML descriptor, the displayed text is built from the "name" and "description" attributes.
  • With a CWL tool descriptor, the displayed text is built from the "label" and "doc" attributes.
  • The mapping between those attributes takes place in lib/galaxy/tools/cwl/parser.py (line 284).
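
One possible way to keep the list readable is to show only a shortened first line of "doc". The Python sketch below illustrates the idea. short_description and the 60-character width are hypothetical and are not part of Galaxy's parser.py.

```python
import textwrap


def short_description(doc, width=60):
    """Return the first non-empty line of a CWL 'doc' string, cut to `width` characters."""
    lines = [line.strip() for line in (doc or "").splitlines() if line.strip()]
    if not lines:
        return ""
    # textwrap.shorten collapses whitespace and truncates on word boundaries.
    return textwrap.shorten(lines[0], width=width, placeholder="...")


print(short_description("Short summary.\nLonger explanation that should not appear in the list."))
```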

cwltool - TransDecoder

$ pip show cwltool
Name: cwltool
Version: 1.0.20180508202931

$ cd workflow-is-cwl/tools/TransDecoder
$ cwl-runner TransDecoder.LongOrfs-v5.cwl TransDecoder.LongOrfs-v5.test.job.yaml 
Final process status is success.
$ cwl-runner TransDecoder.Predict-v5.cwl TransDecoder.Predict-v5.test.job.yaml 
Final process status is success

Job outputs moved to "workflow-is-cwl/tools/TransDecoder/expected_output" folder.

sync201811 - galaxy+cwltool - BUSCO

Add Tool link in tools_conf.xml

<tool file="../../workflow-is-cwl/tools/BUSCO/BUSCO-v3.cwl" />

Running the tool fails with the following error:

Traceback (most recent call last):
  File "lib/galaxy/jobs/runners/__init__.py", line 214, in prepare_job
    job_wrapper.prepare()
  File "lib/galaxy/jobs/__init__.py", line 871, in prepare
    tool_evaluator.set_compute_environment(compute_environment, get_special=get_special)
  File "lib/galaxy/tools/evaluation.py", line 123, in set_compute_environment
    self.tool.exec_before_job(self.app, inp_data, out_data, param_dict)
  File "lib/galaxy/tools/__init__.py", line 2457, in exec_before_job
    cwl_command_line = cwl_job_proxy.command_line
  File "lib/galaxy/tools/cwl/parser.py", line 516, in command_line
    if self.is_command_line_job:
  File "lib/galaxy/tools/cwl/parser.py", line 428, in is_command_line_job
    self._ensure_cwl_job_initialized()
  File "lib/galaxy/tools/cwl/parser.py", line 452, in _ensure_cwl_job_initialized
    *args, **kwargs
  File "/home/jra001k/snapshot/galaxy/.venv/local/lib/python2.7/site-packages/cwltool/command_line_tool.py", line 342, in job
    builder = self._init_job(job_order, **kwargs)
  File "/home/jra001k/snapshot/galaxy/.venv/local/lib/python2.7/site-packages/cwltool/process.py", line 594, in _init_job
    raise WorkflowException("Invalid job input record:\n" + Text(e))
WorkflowException: Invalid job input record:
the `lineage` field is not valid because
  Expected class 'Directory' but this is 'File'
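
The message says the tool expects a CWL Directory object for 'lineage' but received a File object. For reference, the Python sketch below builds a job input object in which the two classes differ only in the "class" field. It shows what cwltool expects on the command line and does not change how Galaxy passes the value. cwl_path_object is a hypothetical helper, and the paths and the 'mode' and 'outputName' values are placeholders.

```python
import json


def cwl_path_object(path, is_directory=False):
    """Build a CWL File or Directory input object for a job file."""
    return {"class": "Directory" if is_directory else "File", "path": path}


# Input names come from the BUSCO hints in this thread; all values are placeholders.
job = {
    "lineage": cwl_path_object("path/to/lineage_dir", is_directory=True),
    "sequenceFile": cwl_path_object("path/to/transcripts.fasta"),
    "mode": "placeholder_mode",
    "outputName": "placeholder_output",
}

# JSON is also valid YAML, so this output can be saved as a job file.
print(json.dumps(job, indent=2))
```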

cwltool - InterProScan

$ cd workflow-is-cwl/tools/InterProScan
$ cwl-runner InterProScan-v5.cwl --proteinFile test_single_protein.fasta
'interproscan.sh' not found
[job InterProScan-v5.cwl] completed permanentFail
{}
Final process status is permanentFail
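
The run fails because 'interproscan.sh' is not on PATH. A quick pre-flight check along these lines can catch that before cwl-runner is started. This is a minimal Python sketch, and missing_executables is a hypothetical helper name.

```python
import shutil


def missing_executables(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_executables(["interproscan.sh", "diamond"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
```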

galaxy+cwltool - cmsearch-deoverlap

hints:
  - class: DockerRequirement
    dockerPull: biocrusoe/cmsearch-deoverlap
  - class: gx:interface
    gx:inputs:
      - gx:name: clan_information
        gx:type: data
        gx:format: 'txt'
        gx:optional: True
      - gx:name: cmsearch_matches
        gx:type: data
        gx:format: 'txt'

galaxy+cwltool - InterProScan

Docker is disabled for now. Install InterProScan manually and set PATH accordingly before starting the Galaxy server.

hints:
#  - class: DockerRequirement
#    dockerPull: 'olat/interproscan:latest'
  - class: gx:interface
    gx:inputs:
      - gx:name: applications
        gx:type: text
        gx:optional: True
      - gx:name: proteinFile
        gx:type: data
        gx:format: 'txt'

d9d1610

galaxy+cwltool - Diamond

Add hints in Diamon.makedb-v0.9.21.cwl

$namespaces:
  gx: "http://galaxyproject.org/cwl#"

hints:
  - class: DockerRequirement
    dockerPull: 'buchfink/diamond:version0.9.21'
  - class: gx:interface
    gx:inputs:
      - gx:name: inputRefDBFile
        gx:type: data
        gx:format: 'txt'
      - gx:name: taxonMapFile
        gx:type: data
        gx:format: 'txt'
        gx:optional: True
      - gx:name: taxonNodesFiles
        gx:type: data
        gx:format: 'txt'
        gx:optional: True
      - gx:name: threads
        gx:type: integer
        gx:optional: True

92ee2b9

sync201811 - cwltool - cmsearch-multimodel-wf

$ cwltool --pack cmsearch-multimodel-wf.cwl > cmsearch-multimodel-wf-packed.cwl
$ cwltool --beta_relaxed_fmt_check cmsearch-multimodel-wf-packed.cwl cmsearch-multimodel-wf.test.job.yaml

[workflow main] completed success
{   
    "deoverlapped_matches": {
        "checksum": "sha1$4d88865ee67b33fbf3db268a211fce9dbff715dd",
        "basename": "result.deoverlapped",
        "location": "file:///home/jra001k/snapshot/workflow-is-cwl/workflows/result.deoverlapped",
        "path": "/home/jra001k/snapshot/workflow-is-cwl/workflows/result.deoverlapped",
        "class": "File",
        "size": 7490
    }
}
Final process status is success
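
The "checksum" field in the output object is the SHA-1 of the file contents prefixed with "sha1$". To compare a produced file against an expected output, the value could be recomputed as in the Python sketch below. cwl_sha1 and matches_output are hypothetical helper names.

```python
import hashlib


def cwl_sha1(path, chunk_size=65536):
    """Compute a CWL-style checksum string ('sha1$<hex digest>') for a file."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large outputs do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha1$" + digest.hexdigest()


def matches_output(output):
    """Check a CWL output object (with 'path' and 'checksum') against the file on disk."""
    return cwl_sha1(output["path"]) == output["checksum"]
```

For the run above, matches_output would be called with the "deoverlapped_matches" object from the printed JSON.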

cwltool - Diamond

$ cwl-runner Diamon.blastx-v0.9.21.cwl Diamon.blastx-v0.9.21.test.job.yaml

[job Diamon.blastx-v0.9.21.cwl] completed success
{
    "matches": {
        "format": "http://edamontology.org/format_2333",
        "checksum": "sha1$021f319cc66ca5c34211992209d88ed84b247ce6",
        "basename": "test_transcripts.fasta.diamond_matches",
        "location": "file:///home/jra001k/snapshot/workflow-is-cwl_h/tools/Diamond/test_transcripts.fasta.diamond_matches", 
        "path": "/home/jra001k/snapshot/workflow-is-cwl_h/tools/Diamond/test_transcripts.fasta.diamond_matches", 
        "class": "File",
        "size": 12519
    }
}
Final process status is success
$ cwl-runner --debug Diamon.makedb-v0.9.21.cwl Diamon.makedb-v0.9.21.test.job.yaml

Traceback (most recent call last):
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/main.py", line 602, in main
    **vars(args))
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/executors.py", line 31, in __call__
    return self.execute(*args, **kwargs)
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/executors.py", line 72, in execute
    self.run_jobs(t, job_order_object, logger, **kwargs)
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/executors.py", line 101, in run_jobs
    for r in jobiter:
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/command_line_tool.py", line 341, in job
    builder = self._init_job(job_order, **kwargs)
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/process.py", line 626, in _init_job
    builder.bindings.extend(builder.bind_input(self.inputs_record_schema, builder.job, discover_secondaryFiles=kwargs.get("toplevel")))
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/builder.py", line 182, in bind_input
    bindings.extend(self.bind_input(f, datum[f["name"]], lead_pos=lead_pos, tail_pos=f["name"], discover_secondaryFiles=discover_secondaryFiles))
  File "/home/jra001k/.local/lib/python2.7/site-packages/cwltool/builder.py", line 242, in bind_input
    raise WorkflowException("Expected value of '%s' to have format %s but\n  %s" % (schema["name"], schema["format"], ve))
WorkflowException: Expected value of 'inputRefDBFile' to have format http://edamontology.org/format_1929 but
  File has no 'format' defined: {
    "class": "File",
    "location": "file:///home/jra001k/snapshot/workflow-is-cwl_h/tools/Diamond/test-input/uniref90_subset.fasta", 
    "size": 4752,
    "basename": "uniref90_subset.fasta",
    "nameroot": "uniref90_subset",
    "nameext": ".fasta"
}
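
The exception reports that the input File object has no 'format' field while the tool expects http://edamontology.org/format_1929. One likely remedy is to declare the format on the input in the job file. The Python sketch below shows the shape of such an input object. with_format is a hypothetical helper, the relative path is taken from the location in the error message, and the other inputs of the job file are omitted.

```python
import json

# Format IRI quoted in the error message above.
FASTA_FORMAT = "http://edamontology.org/format_1929"


def with_format(file_object, format_iri):
    """Return a copy of a CWL File object with its 'format' field set."""
    updated = dict(file_object)
    updated["format"] = format_iri
    return updated


ref_db = {"class": "File", "path": "test-input/uniref90_subset.fasta"}
job = {"inputRefDBFile": with_format(ref_db, FASTA_FORMAT)}

# JSON is also valid YAML, so this fragment can be merged into the job file.
print(json.dumps(job, indent=2))
```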

uniref90_subset.fasta.zip

cwltool - cmsearch-multimodel-wf

$ cd workflow-is-cwl/workflows
$ sed -i "s/foobar/$USER/" cmsearch-multimodel-wf.cwl # temporary hack
$ cwl-runner cmsearch-multimodel-wf.cwl cmsearch-multimodel-wf.test.job.yaml
Final process status is success

Output files produced

workflow-is-cwl/tools/Infernal/cmsearch/mrum-genome.fa.cmsearch.out
workflow-is-cwl/tools/Infernal/cmsearch/mrum-genome.fa.cmsearch_matches.tbl
workflow-is-cwl/tools/cmsearch-deoverlap/1.cmsearch.tblout.deoverlapped
workflow-is-cwl/workflows/result.deoverlapped

galaxy+cwltool - BUSCO

hints:
  - class: gx:interface
    gx:inputs:
      - gx:name: blastSingleCore
        gx:type: boolean
        gx:optional: True
      - gx:name: cpu
        gx:type: integer
        gx:optional: True
      - gx:name: evalue
        gx:type: float
        gx:optional: True
      - gx:name: force
        gx:type: boolean
        gx:optional: True
      - gx:name: help
        gx:type: boolean
        gx:optional: True
      - gx:name: lineage
        gx:type: directory
      - gx:name: long
        gx:type: boolean
        gx:optional: True
      - gx:name: mode
        gx:type: text
      - gx:name: outputName
        gx:type: text
      - gx:name: quiet
        gx:type: boolean
        gx:optional: True
      - gx:name: regionLimit
        gx:type: integer
        gx:optional: True
      - gx:name: restart
        gx:type: boolean
        gx:optional: True
      - gx:name: sequenceFile
        gx:format: txt
        gx:type: data
      - gx:name: species
        gx:type: text
        gx:optional: True
      - gx:name: tarzip
        gx:type: boolean
        gx:optional: True
      - gx:name: tempPath
        gx:type: directory
        gx:optional: True
      - gx:name: version
        gx:type: boolean
        gx:optional: True

galaxy+cwltool - cmsearch-multimodel-wf

Importing the 'cmsearch-multimodel-wf.cwl' workflow into Galaxy raises the following exception:

raise exceptions.MessageException("The data content does not appear to be a valid workflow.")
