Module documentation format about modules HOT 8 CLOSED

nf-core commented on May 19, 2024
Module documentation format

from modules.

Comments (8)

sven1103 commented on May 19, 2024

After some cross-talk with @ewels, let's try simple YAML. It takes almost no effort to parse in most languages, and a regex like \/\*(\*(?!\/)|[^*])*\*\/ fetches everything within a comment block:

/*
My process description.
*/

Everything that does not look like YAML can be easily ignored (probably a usual code comment).
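
As a rough illustration of the extraction step, the sketch below applies that regex with Python's `re` module and returns the text inside each `/* ... */` block. The function name `extract_comment_blocks` and the sample input are made up for this example and are not part of any nf-core tooling; the pattern is the one quoted above, only with a non-capturing group and without the unnecessary `\/` escapes.

```python
import re

# Same pattern as quoted in the comment; '/' needs no escaping in Python.
COMMENT_BLOCK = re.compile(r"/\*(?:\*(?!/)|[^*])*\*/")


def extract_comment_blocks(source: str) -> list[str]:
    """Return the inner text of every /* ... */ block, in order of appearance."""
    blocks = []
    for match in COMMENT_BLOCK.finditer(source):
        # Drop the leading '/*' and trailing '*/', then trim surrounding whitespace.
        blocks.append(match.group(0)[2:-2].strip())
    return blocks


example = """/*
My process description.
*/
process fastqc {
    script:
    "fastqc -q reads.fastq"
}
"""

print(extract_comment_blocks(example))  # prints ['My process description.']
```

The returned strings could then be handed to whatever YAML parser the tooling ends up using.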

What information do we want to display? I will start with a list:

  • Description
  • Keywords
  • Tools
  • Input
  • Output
  • Authors

Description

Just a general description about the purpose of the process / function.

Keywords

One or more keywords to be able to group processes by keyword.

Tools

A list of tool objects used in a process. A tool object can contain fields like

  • description
  • url
  • doi

Input

Input is a list of Nextflow input definitions, which follow the format

 <input qualifier> <input name> [from <source channel>] [attributes]

Maybe two fields here: the definition and a description?

Output

Same as input.

Authors

A list of GitHub users who contributed to the process.

Example

What would this look like:

/*
description: Simply FASTQC
keywords:
    - Quality Control
    - QC
tools:
    - fastqc:
        description: <description here>
        homepage: https://superhomepage.edu
        doi: <doi here>
input:
    - reads:
        type: file
        description: <description here>
    - sample_id:
        type: string
        description: <description here>
output:
    - report:
        type: file
        schema: "*_fastqc.{zip,html}"
authors:
    - "@sven1103"
    - "@drharshil"
*/
process fastqc {
    tag "$sample_id"
    publishDir "${params.outdir}/fastqc", mode: 'copy',
        saveAs: {filename -> filename.indexOf(".zip") > 0 ? "zips/$filename" : "$filename"}

    input:
    set val(sample_id), file(reads)

    output:
    file "*_fastqc.{zip,html}"

    script:
    """
    fastqc -q $reads
    fastqc --version &> fastqc.version.txt
    """
}

This is just an example; we can work out the details. But seeing the code makes it easier to communicate what we are talking about :D

ewels commented on May 19, 2024

"Everything that does not look like YAML can be easily ignored (probably a usual code comment)."

I think we should try to parse everything inside the comment block as YAML. Guessing which bits are YAML and which bits are comment is a bit of a faff (there can always be yaml comments!).

Otherwise, I think this all looks great! The only thing I notice is that the inputs should be a list of lists, as there can be multiple input channels, each of which can have multiple definitions. So more like:

input:
  - - reads:
        type: file
        description: <description here>
    - sample_id:
        type: string
        description: <description here>

Then you can have, for example:

input:
  -
    - reads:
        type: file
        description: <description here>
    - sample_id:
        type: string
        description: <description here>
  -
    - index:
        type: file
        description: Second input channel for a reference or whatever

This YAML syntax is a bit confusing to look at, so will definitely need some linting with nice helpful error messages 😉
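
A linter for this nested layout could look roughly like the sketch below. It assumes the YAML has already been loaded into plain Python lists and dicts by some YAML library, and that each definition is a single-key mapping from the input name to its `type` and `description` fields (the nesting used in the first example in this thread). The function name `lint_inputs` and the message wording are hypothetical.

```python
def lint_inputs(inputs):
    """Check the 'list of channels, each a list of definitions' layout.

    Returns a list of human-readable error messages (empty if all is well).
    """
    if not isinstance(inputs, list):
        return ["'input' must be a list of channels"]
    errors = []
    for c, channel in enumerate(inputs, start=1):
        if not isinstance(channel, list):
            errors.append(
                f"input channel {c}: expected a list of definitions, "
                f"got {type(channel).__name__} (is the second '-' missing?)"
            )
            continue
        for d, definition in enumerate(channel, start=1):
            where = f"input channel {c}, definition {d}"
            if not isinstance(definition, dict) or len(definition) != 1:
                errors.append(f"{where}: expected a single '<name>:' mapping")
                continue
            name, fields = next(iter(definition.items()))
            if not isinstance(fields, dict):
                errors.append(f"{where} ('{name}'): fields must be nested under the name")
                continue
            for key in ("type", "description"):
                if key not in fields:
                    errors.append(f"{where} ('{name}'): missing '{key}'")
    return errors


# Two channels with the same names as in the example (already parsed from YAML).
good = [
    [
        {"reads": {"type": "file", "description": "..."}},
        {"sample_id": {"type": "string", "description": "..."}},
    ],
    [{"index": {"type": "file", "description": "..."}}],
]
# A flat list of definitions, i.e. the outer list level is missing.
flat = [{"reads": {"type": "file", "description": "..."}}]

print(lint_inputs(good))
print(lint_inputs(flat))
```

Collecting all messages instead of stopping at the first problem means a module author sees every issue in one run.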

ewels commented on May 19, 2024

From discussion at the hackathon: the suggestion is that we should keep this meta information in a separate file so that it is easier for other tools (including Nextflow itself) to parse. If it's in a comment, it will be very difficult to get into Nextflow.

We could copy bioconda and have a meta.yml for each module.

Note that we need things to be organised in directories for this. But we should probably have that anyway.
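
To make the one-directory-per-module idea concrete, here is a small sketch that walks a modules folder and reports directories lacking a `meta.yml`. It assumes every immediate subdirectory of the root is one module; the function name `modules_missing_meta`, the `example_tool` directory, and that flat layout are assumptions for illustration, not an agreed repository structure.

```python
import tempfile
from pathlib import Path


def modules_missing_meta(root):
    """Return names of module directories under `root` that lack a meta.yml."""
    missing = []
    for module_dir in sorted(Path(root).iterdir()):
        # Assumption: each immediate subdirectory is exactly one module.
        if module_dir.is_dir() and not (module_dir / "meta.yml").is_file():
            missing.append(module_dir.name)
    return missing


# Build a throwaway layout: one module with a meta.yml, one without.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "fastqc").mkdir()
    (Path(tmp) / "fastqc" / "meta.yml").write_text("description: Simply FASTQC\n")
    (Path(tmp) / "example_tool").mkdir()
    print(modules_missing_meta(tmp))
```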

ewels commented on May 19, 2024

Suggestions: don't prefix each line with * (no need for comment and makes it harder to write & parse); use valid YAML 😉 - keywords should be prefixed with - to make it an array, description should start with : > to make it multi-line; don't use capitalisation in keys maybe?

sven1103 commented on May 19, 2024

ok, I agree. All-or-nothing parsing :) But people could still have usual comment blocks, and we should not restrict them from doing so.

So I suggest letting the linter emit warnings if a comment block cannot be parsed as YAML.
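
One way to read this suggestion is sketched below: each comment block is handed to a parser, and a failure produces a warning instead of an error, so ordinary comments stay legal. The parser is passed in as a function so the sketch does not depend on a particular YAML library; `lint_comment_blocks` and the warning text are hypothetical.

```python
import re
import warnings

# C-style comment block pattern, as discussed earlier in the thread.
COMMENT_BLOCK = re.compile(r"/\*(?:\*(?!/)|[^*])*\*/")


def lint_comment_blocks(source, parse):
    """Parse every /* ... */ block with `parse`; warn about blocks that fail.

    `parse` is any callable that raises an exception on invalid input.
    Returns the successfully parsed documents.
    """
    parsed = []
    for match in COMMENT_BLOCK.finditer(source):
        body = match.group(0)[2:-2]
        # 1-based line number where the comment block starts.
        line = source.count("\n", 0, match.start()) + 1
        try:
            parsed.append(parse(body))
        except Exception as err:  # error types differ between parsers
            warnings.warn(f"comment block at line {line} is not valid YAML: {err}")
    return parsed
```

With PyYAML installed, `parse` could be `yaml.safe_load`. Note that plain prose often parses as a YAML string rather than raising, so a real linter would probably also check that the result is a mapping.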

ewels commented on May 19, 2024

Addressed in #9

grst commented on May 19, 2024

In the context of the discussion in #8, I was wondering if the meta.yml could become a valid conda build recipe.

Name, description etc. are standard fields in a recipe already, and the rest could go into the extra section. (https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#extra-section)

ewels commented on May 19, 2024

Discussion at another hackathon - general consensus was that the current system of using separate YAML files is probably best. I think that we can close this issue now.
