union-tool

This script takes two tab-delimited files and merges them based on shared column(s).

Usage

Usage: union.pl --input1=FILENAME --input2=FILENAME --col1=INT(,INT,INT...) --col2=INT(,INT,INT...) --header=INT --output=FILENAME

   --input1|-1 (FILENAME): name of first input file. Must be a tab delimited file. MANDATORY.
   --input2|-2 (FILENAME): name of second input file. Must be a tab delimited file. MANDATORY.
   --output|-o (FILENAME): name of output file (default STDOUT).
   --col1|-c (List of INT): list of column numbers in the first input file used to merge the files (default: 1).
   --col2|-d (List of INT): list of column numbers in the second input file used to merge the files (default: 1).
   --header|-H (INT): number of header lines (default: none).

Goal

This tool computes the union of two datasets based on one or more common fields. Columns are referenced by number: for example, 3 refers to the 3rd column of a tab-delimited file. Specify a comma-separated list of numbers to join datasets on several columns at once. Identifiers, i.e. the common field(s), must be exactly identical (the comparison is case sensitive). Unmatched lines from either file are still output, and the fields they lack are left blank.
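The column referencing described above can be illustrated with a minimal Python sketch. It is not taken from union.pl; the helper names `parse_columns` and `make_key` are hypothetical and only show how a comma-separated list of 1-based column numbers could be turned into a case-sensitive composite key.

```python
def parse_columns(spec):
    """Turn a 1-based column list such as "1,4" into 0-based indices."""
    indices = []
    for part in spec.split(","):
        number = int(part)
        if number < 1:
            raise ValueError(f"column numbers start at 1, got {number}")
        indices.append(number - 1)
    return indices


def make_key(line, indices):
    """Build the join key from the selected fields of a tab-delimited line."""
    fields = line.rstrip("\n").split("\t")
    # Fields are compared as-is, so "geneA" and "GeneA" are different keys.
    return tuple(fields[i] for i in indices)


cols = parse_columns("1,4")
print(make_key("chr1\t10\t20\tgeneA\n", cols))
```

Using a tuple as the key lets one or several columns be handled the same way.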

Example

Dataset1:

chr1 10 20 geneA
chr1 50 80 geneB
chr5 10 40 geneL

Dataset2:

geneA tumor-supressor
geneB Foxp2
geneC Gnas1
geneE INK4a

Joining the 4th column of Dataset1 with the 1st column of Dataset2 will yield:

geneA   chr1    10      20      tumor-supressor
geneB   chr1    50      80      Foxp2
geneL   chr5    10      40
geneC                           Gnas1
geneE                           INK4a
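For readers who want to reproduce this result outside the script, here is a rough Python sketch of the same kind of union. It is an illustration under assumptions, not the logic of union.pl: the function name `union` is hypothetical, keys are assumed to be unique within the second dataset, every row of a dataset is assumed to have the same number of fields, and the output column order (key first, then the remaining columns of each dataset) is inferred from the example above.

```python
def union(rows1, rows2, cols1, cols2):
    """Full outer join of two lists of field lists on 0-based key columns."""
    def split(row, cols):
        key = tuple(row[i] for i in cols)
        rest = [f for i, f in enumerate(row) if i not in cols]
        return key, rest

    # Number of non-key columns, used to pad unmatched lines with blanks.
    width1 = len(rows1[0]) - len(cols1) if rows1 else 0
    width2 = len(rows2[0]) - len(cols2) if rows2 else 0

    # Index the second dataset by key (assumption: keys are unique in it).
    index2 = {}
    for row in rows2:
        key, rest = split(row, cols2)
        index2[key] = rest

    merged, seen = [], set()
    # Lines of the first dataset, matched or not, in their original order.
    for row in rows1:
        key, rest = split(row, cols1)
        seen.add(key)
        merged.append(list(key) + rest + index2.get(key, [""] * width2))

    # Lines of the second dataset that found no partner.
    for key, rest in index2.items():
        if key not in seen:
            merged.append(list(key) + [""] * width1 + rest)
    return merged


dataset1 = [["chr1", "10", "20", "geneA"],
            ["chr1", "50", "80", "geneB"],
            ["chr5", "10", "40", "geneL"]]
dataset2 = [["geneA", "tumor-supressor"],
            ["geneB", "Foxp2"],
            ["geneC", "Gnas1"],
            ["geneE", "INK4a"]]

# 4th column of Dataset1 and 1st column of Dataset2, as 0-based indices.
result = union(dataset1, dataset2, [3], [0])
for fields in result:
    print("\t".join(fields))
```

The blank padding keeps every output line at the same number of tab-separated fields, which matches the "empty fields are left blank" behaviour described in the Goal section.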

Contributors

slegras
