SflE


  • Download the SflE software (sfle.tar.gz), then follow the steps to install and run.

  • Please direct questions to the SflE Google group (subscribe to receive SflE news).

  • If you use this environment, the included scripts, or any related code in your work, please let us know.


SflE (pronounced "souffle") is a Scientific workFLow Environment built around the SCons build system. SflE's goal is to rapidly, reliably, and reproducibly turn input data into analysis products. It allows researchers to combine Python scripts or other data processing programs with Sphinx reStructuredText documentation, much like Sweave combines R scripts with LaTeX documentation. SflE can include R scripts, command line programs, or any combination of steps needed to produce calculations, text, or publication-ready figures from input data. It can automatically download files from remote servers, parallelize and document analysis steps, minimize recalculations on an as-needed basis, and ensure you get the correct results quickly, every time. SflE workflows are packaged as "projects," and the quickest way to get started is to look at some of the example analyses provided with SflE itself.

Getting started with SflE

Prerequisites

SflE has several prerequisites that must be installed and executable before it will run. We also suggest a few optional tools that will make your analyses more pleasant.

Required

  1. Python (version >= 2.7)
  2. SCons (version >= 2.1)
  3. Sphinx (version >= 1.1.2)
  4. Operating system (Linux or Mac)
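
One way to confirm that the required tools are reachable is a small check like the sketch below. It is not part of SflE. It assumes Python 3.3 or later (for shutil.which) and assumes the prerequisites are exposed as the commands scons and sphinx-build. It only checks that the commands are on your PATH and does not compare version numbers.

```python
import shutil
import sys

# Commands assumed to correspond to the required prerequisites;
# "sphinx-build" is Sphinx's standard command-line entry point.
REQUIRED_COMMANDS = ["scons", "sphinx-build"]


def missing_prerequisites(which=shutil.which):
    """Return the required commands that cannot be found on PATH."""
    return [cmd for cmd in REQUIRED_COMMANDS if which(cmd) is None]


if __name__ == "__main__":
    print("Python version:", ".".join(str(n) for n in sys.version_info[:3]))
    missing = missing_prerequisites()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("scons and sphinx-build were found on PATH")
```

The `which` parameter exists only so the lookup can be replaced when testing; in normal use, call missing_prerequisites() with no arguments.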

Recommended

  1. R
  2. Matplotlib (version >= 1.1.0)

Installation

  1. Download and unpack the SflE software

    • Download the software: sfle.tar.gz
    • $ tar -xzf sfle.tar.gz
  2. Add the SflE software to your PYTHONPATH

    • $ export PYTHONPATH=$PYTHONPATH:`pwd`/src
  3. (Optional) Run SflE on all of its default workflows to test the install

    • $ scons

How to run

For a quick start, from the SflE directory, just type: $ scons

That's it! SflE automatically performs the following steps:

  1. Set up a default environment including convenience scripts and Sphinx utilities.
  2. Look for each project subdirectory in input, containing a file tree:
input                   # Main sfle input directory
  project1              # First project
    SConscript          # Workflow rules for project1
    input               # Input files specific to project1
      data.pcl
      metadata.pcl
      ...
    src                 # Scripts specific to project1
      script.py
      script.R
      ...
  project2              # Second project
    ...
  3. Run each project's SConscript rules, generating intermediate and final output files:

output                  # Main sfle output directory
  tmp                   # Main sfle intermediate file directory
    project1            # Intermediate files specific to project1
      intermediate1.txt
      intermediate2.txt
      ...
    project2            # Intermediate files specific to project2
      ...
  project1              # Output files specific to project1
    output1.txt
    output2.txt
    ...
  project2              # Output files specific to project2
    ...
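
The layout above can be sketched in a few lines of Python. This is an illustration of the directory convention only, not SflE's actual implementation. The helper names find_projects and project_dirs are hypothetical, and treating "a subdirectory that contains an SConscript" as the definition of a project is an assumption.

```python
import os
import tempfile


def find_projects(input_dir):
    """Return names of subdirectories of input_dir that contain an SConscript."""
    return sorted(
        name for name in os.listdir(input_dir)
        if os.path.isfile(os.path.join(input_dir, name, "SConscript"))
    )


def project_dirs(project, output_dir="output"):
    """Return (final output dir, intermediate dir) for a project."""
    return (
        os.path.join(output_dir, project),
        os.path.join(output_dir, "tmp", project),
    )


# Demonstrate on a throwaway tree that mimics the input layout.
with tempfile.TemporaryDirectory() as root:
    input_dir = os.path.join(root, "input")
    for name in ("project1", "project2"):
        os.makedirs(os.path.join(input_dir, name, "input"))
        open(os.path.join(input_dir, name, "SConscript"), "w").close()
    os.makedirs(os.path.join(input_dir, "notes"))  # no SConscript, so ignored
    projects = find_projects(input_dir)

for project in projects:
    print(project, project_dirs(project))
```

Keeping intermediate files under output/tmp and final products under output/<project> means the intermediates can be deleted without touching the results.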

Dependency-based workflows

SflE manages projects, each of which represents a workflow for transforming inputs (typically data files) into outputs (typically calculation results, figures, documents, or reports). Each workflow is a list of one or more modules, and each module performs exactly one processing step.

For example, a workflow might consist of the following steps:

  1. Normalize an input data file
  2. Combine the normalized file with a metadata file
  3. Generate a report on the combined file

Each step represents one or a few modules, and each module runs exactly one command to generate a new output. Normalizing an input file creates one new, normalized output file. Combining two files might require creating two intermediate files and then joining them into another new output file.

SflE defines commands within modules, and assembles modules into workflows, using the SCons build system. SCons, like make, is a dependency-based "language" in which rules are descriptive, not imperative.

That is, rather than scripting a series of steps to run in order:

  1. Call program1 to convert infile1 to tmpfile1
  2. Call program2 to combine infile2 and tmpfile1 into tmpfile2
  3. Call program3 to convert tmpfile2 into outfile1

You should instead describe what inputs and outputs each step requires:

  • tmpfile1 requires calling program1 to convert infile1
  • tmpfile2 requires calling program2 to combine infile2 and tmpfile1
  • outfile1 requires calling program3 to convert tmpfile2

This distinction is subtle, but notice that in the second form, you've only defined rules; you've not executed any of them yet! Nothing happens until you ask for the final target, outfile1. SCons then works out which rules must run, and in what order, to create it.
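
To make the idea concrete, here is a toy dependency resolver in plain Python. It is neither SCons nor SflE code. It reuses the file names from the example above, and program1 through program3 are only labels written to a log, not real executables. Defining RULES runs nothing; work happens only when a target is requested.

```python
# Each rule describes one target: (sources it needs, program that makes it).
# Defining the rules executes nothing.
RULES = {
    "tmpfile1": (["infile1"], "program1"),
    "tmpfile2": (["infile2", "tmpfile1"], "program2"),
    "outfile1": (["tmpfile2"], "program3"),
}


def build(target, rules, built, log):
    """Build `target` after its sources, running each rule at most once."""
    if target in built or target not in rules:
        # Already built, or a plain input file with no rule to run.
        return
    sources, program = rules[target]
    for source in sources:
        build(source, rules, built, log)
    # Stand-in for actually invoking the program.
    log.append(f"{program}: {', '.join(sources)} -> {target}")
    built.add(target)


log = []
build("outfile1", RULES, set(), log)
for line in log:
    print(line)
# program1: infile1 -> tmpfile1
# program2: infile2, tmpfile1 -> tmpfile2
# program3: tmpfile2 -> outfile1
```

Requesting only tmpfile1 would run program1 alone. SCons additionally skips targets whose inputs have not changed since the last run, which this sketch does not model.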

If you're not familiar with software build systems like SCons or make, we recommend the following background reading:

  1. Sections 1-4 of the GNU make manual
  2. Sections 2-3, 6-7, and 18 of the SCons user guide

