nbisweden / gaas

Genome Assembly and Annotation Service code

License: GNU General Public License v3.0

Perl 75.61% Shell 6.20% Ruby 12.15% Python 0.93% R 2.02% HTML 2.04% Awk 0.21% Groovy 0.25% HCL 0.58%

annotation assembly knowledge scripts cheatsheet

gaas's Introduction

GAAS

Genome Assembly Annotation Service (GAAS)

Suite of tools related to Genome Assembly Annotation Service tasks.

What can GAAS do for you?
Installation
- Using conda
  - Install
  - Update
  - Uninstall
- Old school
Usage
Repository structure

What can GAAS do for you?

The repository mainly contains tools and knowledge related to bioinformatics, most often genome annotation. To access and install the tools, follow the installation procedures below. For the knowledge, visit the knowledge part of the repo or, if you are looking specifically for genome assembly knowledge, The Genome Assembly Workshop Knowledge Base.

Installation

Using conda

Install

conda install -c bioconda gaas

Update

conda update gaas

Uninstall

conda uninstall gaas

Old school

Prerequisites

Perl >= 5.8 and a list of Perl modules that can be installed using cpan, cpanm or conda:

Install perl modules with cpanm

cpanm install bioperl
cpanm install Clone
cpanm install Graph::Directed
cpanm install LWP::UserAgent
cpanm install Statistics::R
cpanm install Sort::Naturally
cpanm install File::Share
cpanm install Moose
cpanm install File::ShareDir::Install
cpanm install Bio::DB::EUtilities

Install perl modules with conda

conda env create -f conda_environment_GAAS.yml
conda activate gaas

Install

git clone https://github.com/NBISweden/GAAS.git # Clone GAAS
cd GAAS                                         # move into GAAS folder
perl Makefile.PL                                # Check all the dependencies*
make                                            # Compile
make test                                       # Test
make install                                    # Install

*If dependencies are missing, you can install them using cpan/cpanm, or use conda and load the environment conda_environment_GAAS.yml.

Remark: On MS Windows, instead of make you will probably have to use dmake or nmake, depending on the toolchain you have.

Update

From the folder where the repository is located.

git pull                                        # Update to the latest GAAS
perl Makefile.PL                                # Check all the dependencies*
make                                            # Compile
make test                                       # Test
make install                                    # Install

Change to a specific version

From the folder where the repository is located.

git pull                                        # Update the code
git checkout v0.1.1                             # Use version v0.1.1 (see the releases tab for a list of available versions)
perl Makefile.PL                                # Check all the dependencies*
make                                            # Compile
make test                                       # Test
make install                                    # Install

Uninstall

perl uninstall_GAAS

Usage

script_name.pl -h

Repository structure

annotation

The annotation directory contains everything related to the annotation side of the service.

Shortcuts:

knowledge
Genome annotation workshop
Tools
=> All GFF-related work has been moved into AGAT (11/2019)
Pipelines

assembly

The assembly directory contains development related to the assembly side of the service.

Shortcuts:

gaas's Issues

gff3_sp_statistics.pl

Find a way to avoid counting overlapping parts more than once when isoforms are present, when computing the percentage of the genome covered.
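
One possible approach is to merge overlapping intervals before summing their lengths. The sketch below is in Python rather than the script's Perl, and it is not how gff3_sp_statistics.pl is implemented; the function names, the coordinates and the genome length are hypothetical. It assumes all intervals lie on the same sequence and use 1-based inclusive coordinates, as in GFF.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def covered_fraction(intervals, genome_length):
    """Fraction of the genome covered, counting each base only once."""
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return covered / genome_length


# Two hypothetical isoforms of the same gene that share part of their exons.
isoform_a = [(100, 200), (300, 400)]
isoform_b = [(150, 200), (300, 450)]
print(merge_intervals(isoform_a + isoform_b))
print(covered_fraction(isoform_a + isoform_b, 1000))
```

Summing the four exons directly would count the shared bases twice; after merging, each base contributes once.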

run agat_convert_sp_gff2gtf.pl after conda install

I used the commands below:
1 - conda activate agat
2 - I opened a terminal in the directory containing my GFF file
3 - perl agat_convert_sp_gff2gtf.pl --gff file.gff -o file.gtf

ERROR : Can't locate Sort/Naturally.pm in @INC (you may need to install the Sort::Naturally module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /home/kali/AGAT/bin/agat_convert_sp_gff2gtf.pl line 7.
BEGIN failed--compilation aborted at /home/kali/AGAT/bin/agat_convert_sp_gff2gtf.pl line 7.

Does anyone have an idea what the problem is?

gff3_sp_keep_longest_isoform.pl error

While running "gff3_sp_keep_longest_isoform.pl" I am getting the following error
Can't use an undefined value as an ARRAY reference at /anno/sanyalab/SOFTWARE/GAAS/annotation/BILS/Handler/GXFhandler.pm line 1293

I have attached a small chunk of the file that causes the problem. Please advise.
ptn_genes_modi.txt

Originally posted by @sanyalab in #7 (comment)

gff3_add_locus_tag.pl has to work by locus/record

Currently it behaves like a sub-part of gff3_manageAttributes.pl.
It should create a locus tag from level1 and then transmit it to sub-features.

maker_merge_outputs_from_datastore

Do not modify output name when provided

missing perl module: 'NBIS::GFF3::Omniscient'

Dear team,

I am trying to use gff3_sp_fix_feature_duplication.pl, and I encountered the following error.
Can't locate NBIS/GFF3/Omniscient.pm in @INC (you may need to install the NBIS::GFF3::Omniscient module)
However, I cannot find or install this module, either through the CPAN website or with cpanm.
Could you please help me to solve this problem?
Thank you very much and happy Christmas!

Kind regards,

Wei

gxf_to_gff3.pl: PATH problem when creating the output file

gxf_to_gff3.pl creates the output file in the directory where it is run, not at the PATH given to it.

gff3_sp_fix_features_locations_duplicated.pl

After using this script, a few problems appear in _check_all_level1_positions, so find a way to fix that. See analis data for fixing it.

sub fil_cds_frame

The function assumes the CDS starts in frame 0, which is not necessarily the case if the gene is incomplete. So we should add the genome as input, extract the first codon, and check that it is a start codon according to the codon table in use.
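
The check described here could look roughly like the following. This is a Python sketch, not the Perl code of fil_cds_frame; the function names and example sequences are hypothetical, coordinates are assumed to be 1-based and inclusive as in GFF, and only the start codons of the standard genetic code (NCBI translation table 1) are listed.

```python
# Start codons of the standard genetic code (NCBI translation table 1).
# Other codon tables use different sets and would need their own entries.
START_CODONS = {1: {"ATG", "TTG", "CTG"}}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def first_codon(sequence, cds_start, cds_end, strand):
    """Return the first codon of a CDS given 1-based inclusive coordinates."""
    if strand == "+":
        return sequence[cds_start - 1:cds_start + 2].upper()
    # On the minus strand the CDS begins at its highest coordinate,
    # so take the last three bases and reverse-complement them.
    codon = sequence[cds_end - 3:cds_end]
    return codon.translate(_COMPLEMENT)[::-1].upper()


def starts_with_start_codon(sequence, cds_start, cds_end, strand, table=1):
    """True if the CDS begins with a start codon of the given codon table."""
    return first_codon(sequence, cds_start, cds_end, strand) in START_CODONS[table]


print(starts_with_start_codon("CCATGAAA", 3, 8, "+"))
print(starts_with_start_codon("TTTCATCC", 1, 6, "-"))
```

When the check fails, the CDS may be incomplete at its 5' end, and frame 0 should not be assumed for the first CDS piece.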

Bug in gaas_filter_by_size.pl

Hi,

in: gaas_filter_by_size.pl

you guys are not actually using the "-o" option to designate an output file. And in addition, the line used to name the output file has an error in the regexp:

/.fa/ -> should be /\.fa/

However, I was actually annotating some dog data, with the file name "Canis_familiaris.proteins.fa" - and that actually produced:

Canisamiliaris.proteins.30.fa

Obviously, this is a problem for nextflow or other tools using name tracking. So maybe just use the "-o" option so users can choose the name.
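
The effect of the unescaped dot can be shown in isolation. The snippet below uses Python's re module, whose behaviour for these two patterns is the same as Perl's; it does not reproduce the renaming logic of gaas_filter_by_size.pl, only which part of the file name each pattern matches. Anchoring the pattern with `$` is an extra suggestion, not part of the report above.

```python
import re

name = "Canis_familiaris.proteins.fa"

# Unescaped dot: "." matches any character, so the first hit is the "_fa"
# inside "Canis_familiaris", not the file extension.
unescaped = re.search(r".fa", name)

# Escaped dot, anchored at the end of the name: only the extension matches.
escaped = re.search(r"\.fa$", name)

print(unescaped.group(), unescaped.start())
print(escaped.group(), escaped.start())
```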

Better description of statistics module

Hello,

I was wondering if you could help me understand some of the stats generated by GAAS module gff3_sp_statistics.pl.

What do you mean by "Number of intron in cds" and "Number of intron in exon"? I am confused: how can introns exist in exons?
When calculating "Number of cdss", "Number of exons", "Number of exon in cds", "mean exons per mrna", "Total mrna length", "Total cds length" and "Total exon length", do you consider all transcripts for a gene or only the longest transcript?
"Total intron length per cds", "Total intron length per exon", "Total intron length per five_prime_utr" and "Total intron length per three_prime_utr" are unnaturally high. What is meant by these?
What is the difference between "mean cds length" and "mean cds piece length"?
What is the difference between "mean five_prime_utr length" and "mean five_prime_utr piece length"?

Please give a better description of the stats. Attached is a stat file I generated
Thank you
Abhijit
test_stat_gaas.txt

Add expose feature_level option

Add an expose feature_level option to copy the 3 feature level files into the local folder.

Add a method that automatically checks for the presence of a feature_level JSON file in the local folder, to be used in priority over the default one.
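
The lookup order could be sketched as follows. This is a Python illustration, not the project's Perl code; the function name, the directory arguments and the file name used in the comment are hypothetical.

```python
from pathlib import Path


def resolve_feature_level_file(filename, local_dir, default_dir):
    """Return the local copy of a feature level file if present, else the default."""
    local = Path(local_dir) / filename
    if local.is_file():
        # A file exposed in the local folder takes priority.
        return local
    return Path(default_dir) / filename


# Hypothetical call: look in the current directory first, then in the
# directory where the tool ships its defaults.
# resolve_feature_level_file("features_level1.json", ".", "/path/to/defaults")
```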

gff3_sp_extract_attributes.pl

Please add a way to put all extracted gene information in the same output file instead of one file by attribute.

gff3_sp_clip_UTRs.pl to finish fix and new implementation

Develop terraform provisioning code for deploying webapollo instances

Develop Terraform code for deploying webapollo instances on SNIC cloud for enhanced reproducibility.

gff3_sp_keep_longest_isoform.pl PATH problem

The output is created in the directory where the script is launched, not in the location we ask for.

Gff3_sp_extract_sequence improve help

The cdna option does not really produce cDNA because the sequence is not reversed. Make this precise in the help to avoid confusion.

gff3_sp_manage_introns.pl error when no output folder provided

agat_sp_extract_sequences.pl

warning when phase is ".":
Argument "." isn't numeric in numeric ne (!=) at /sw/anaconda/2019.10/envs/agat/bin/agat_sp_extract_sequences.pl line 392.
Argument "." isn't numeric in numeric ne (!=) at /sw/anaconda/2019.10/envs/agat/bin/agat_sp_extract_sequences.pl line 395.
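
The warning appears when the phase column holds ".", which GFF3 uses for features that have no phase, and that value is then compared numerically. A minimal Python sketch of treating "." as undefined before any numeric comparison (this is not the Perl code of agat_sp_extract_sequences.pl, and parse_phase is a hypothetical helper):

```python
def parse_phase(value):
    """Return the phase as an int (0, 1 or 2), or None when it is "." (undefined)."""
    if value == ".":
        return None
    phase = int(value)
    if phase not in (0, 1, 2):
        raise ValueError(f"invalid phase: {value!r}")
    return phase


phase = parse_phase(".")
# Only compare numerically when a phase is actually defined.
if phase is not None and phase != 0:
    print("CDS piece does not start in phase 0")
```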

agat_sp_manage_functional_annotation.pl formatting issue

Hello,

I installed AGAT using bioconda. I am trying to run the following: agat_sp_manage_functional_annotation.pl -gff braker1+2_combined.gtf -i genome_AA.faa.tsv -o braker.interproscan

The input GTF file is the output of the TSEBRA combiner tool that combines the predictions from Braker1 and Braker2. I am getting the error:

gff3 reader error level1: No ID attribute found @ for the feature: scf7180000002648     AUGUSTUS        gene    57422     208439  .       +       .
gff3 reader error level2: No ID attribute found @ for the feature: scf7180000002648     AUGUSTUS        transcript57422   208439  .       +       .
WARNING level2: No Parent attribute found @ for the feature: scf7180000002648   AUGUSTUS        transcript      57422     208439  .       +       .       ID "transcript-1"
WARNING gff3 reader: Hmmm, be aware that your feature doesn't contain any Parent and locus tag. No worries, we will handle it by considering it as strictly sequential. If you disagree, please provide an ID or a comon tag by locus. @ the feature is:

I saw other people had issues with TSEBRA output formatting, so I ran the rename_gtf.py script to clean up the gtf file but the error persists.

Here are the top lines of the 'clean' gtf file:

scf7180000002648        AUGUSTUS        transcript      57422   208439  .       +       .       g1.t1
scf7180000002648        AUGUSTUS        start_codon     57422   57424   .       +       0       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        CDS     57422   57455   0.38    +       0       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        exon    57422   57455   .       +       .       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        intron  57456   74207   0.54    +       .       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        CDS     74208   74305   0.54    +       2       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        exon    74208   74305   .       +       .       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        intron  74306   75613   0.53    +       .       transcript_id "g1.t1"; gene_id "g1";
scf7180000002648        AUGUSTUS        CDS     75614   75730   0.56    +       0       transcript_id "g1.t1"; gene_id "g1";

Do you have any suggestions on how to fix this?

Thanks in advance!
Julia
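
In the lines shown above, the transcript line carries a bare g1.t1 in the attribute column instead of key "value" pairs. A small Python sketch for locating such lines before running a converter; the function names are hypothetical, and splitting on any whitespace is an assumption made so that the pasted sample also parses (a real GTF file is tab-separated):

```python
import re

# One GTF attribute: a key followed by a double-quoted value.
ATTRIBUTE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_attributes(column9):
    """Parse a GTF attribute column (key "value"; ...) into a dict."""
    return dict(ATTRIBUTE.findall(column9))


def lines_without_attributes(lines):
    """Yield (line number, feature type) for lines lacking key "value" attributes."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 8)  # at most 9 columns
        if len(fields) < 9 or not parse_gtf_attributes(fields[8]):
            yield number, fields[2] if len(fields) > 2 else ""


sample = [
    'scf7180000002648 AUGUSTUS transcript 57422 208439 . + . g1.t1',
    'scf7180000002648 AUGUSTUS CDS 57422 57455 0.38 + 0 transcript_id "g1.t1"; gene_id "g1";',
]
print(list(lines_without_attributes(sample)))
```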

File copy warning with gaas_maker_merge_outputs_from_datastore.pl

The following warning messages were observed:

$ ~/git/NBIS/GAAS/annotation/tools/maker/gaas_maker_merge_outputs_from_datastore.pl \
   -i genome.maker.output_mixabinitio_abinitio_pacbio/ \
   -o genome.maker.output_mixabinitio_abinitio_pacbio_output_processed

[...]
Now save a copy of the Maker option files ...
Copy failed: No such file or directory genome.maker.output_mixabinitio_abinitio_pacbio_output_processed/maker_opts.ctl
Copy failed: No such file or directory genome.maker.output_mixabinitio_abinitio_pacbio_output_processed/maker_exe.ctl
Copy failed: No such file or directory genome.maker.output_mixabinitio_abinitio_pacbio_output_processed/maker_evm.ctl
Copy failed: No such file or directory genome.maker.output_mixabinitio_abinitio_pacbio_output_processed/maker_bopts.ctl

Now protecting the maker_annotation.gff annotation by making it readable only...

Now performing the statistics of the annotation file genome.maker.output_mixabinitio_abinitio_pacbio_output_processed/maker_annotation.gff...
WARNING get_longest_cds_level2: NO exon or cds to select the longest l2 for evm-000115f-processed-gene-1.0 l1 ! We will take one randomly ! @

There are possibly two kinds of errors observed here. The first is the failure to copy the control files.
This is addressed in the pull request (#47).

The second is the warning from get_longest_cds_level2. This has not yet been addressed.

One issue related to the error with paths and folders is that the script
searches for output folders from Maker ending in maker.output (line #59), but the case I
was given has folders ending in something else.

Trouble installing gaas with conda

I've been having trouble installing gaas via Conda.

Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package ncurses conflicts for:
python=3.9 -> readline[version='>=8.0,<9.0a0'] -> ncurses[version='>=6.1,<7.0a0']
python=3.9 -> ncurses[version='>=6.2,<7.0a0|>=6.3,<7.0a0']

Package xz conflicts for:
python=3.9 -> xz[version='>=5.2.5,<6.0a0|>=5.2.6,<6.0a0|>=5.2.8,<6.0a0']
gaas -> r-base[version='>=3.5,<3.6.0a0'] -> xz[version='>=5.2.4,<6.0a0']

Package libcxx conflicts for:
python=3.9 -> libffi[version='>=3.3,<3.4.0a0'] -> libcxx[version='>=4.0.1']
python=3.9 -> libcxx[version='>=10.0.0|>=12.0.0|>=14.0.6']

I've tried to force Conda to use a specific version of Python, like 3.7, 3.8, 3.10 and 3.11, they all fail. I don't get it, is this happening to other people as well? I've tried in our cluster, and locally.

Add 2 extra scripts

Based on filter_sort.pl from Andreas:
1 create a script to filter CDS without start and/or stop.
2 create a script to filter by inter-loci distance

=> Then we can add those steps to the course for a complete workflow to filter datasets for ab initio training.

gff3_sp_manage_functional_annotation.pl

Dear GAAS Developers,

I installed GAAS through Conda. I was looking for the script "gff3_sp_manage_functional_annotation.pl" to merge/integrate InterProScan output with Maker's gff3 file, but I couldn't find (invoke) it. I was wondering if this script is available separately elsewhere.

Thanks in advance for the help!

Update gaas_create_annotation_project.pl to current work-flow demands

There is a need to update gaas_create_annotation_project.pl to current work-flow demands. A pull-request is soon to be submitted. This note is for adding it to the list of tasks to do.

gff3_sp_functional_statistics.pl add functions in txt files

as for gff3_sp_manage_functional_annotation.pl

How to use gff3_sp_statistics.pl to compute statistics on cufflinks output (transcripts.gff3)

Hi Juke,
I want to use gff3_sp_statistics.pl to compute statistics on the output from cufflinks, that is, transcripts.gff3. How can I achieve this?

Thanks!

installation via conda: package incompatibility

Hi, I am trying to install gaas via conda install -c bioconda gaas, and some conflicts between package versions have been reported.

If I am understanding this correctly, gaas requires r-base >=3.5,<3.6.0a0, which requires krb5 < 1.16.4, which requires openssl < 1.1.2a, while mamba requires openssl to be >=3.1.4, thus the conflict. Or I could be totally wrong.

Any help would be appreciated.

groovy

i) When I run this on rackham I get this error:

`$source GAAS/profiles/activate_nbis_env
Admin privileges not granted
Checking for tool dependencies
Ruby is installed ... yes : ruby 2.0.0p648 (2015-12-16) [x86_64-linux]
Groovy is installed ... no
Perl is installed ... yes :
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
(with 33 registered patches, see perl -V for more detail)

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

Rscript is installed ... yes : Fatal error: cannot open file '-v': No such file or directory
make: *** [check] Error 2`

ii) When I run it on my own server (Ubuntu server 18.04 LTS), which has groovy installed I get:

Fatal error: cannot open file '-v': No such file or directory
~/Makefile:2: recipe for target 'check' failed
make: *** [check] Error 2

Any thoughts?

Versioning is not correct.

The versioning is not correct in gaas_fasta_removeSeqFromIDlist.pl

How to reproduce:

docker run -it --rm quay.io/biocontainers/gaas:1.2.0--pl526r35_0 bash
gaas_fasta_removeSeqFromIDlist.pl --help

Output:


 ------------------------------------------------------------------------------
|   Genome Assembly Annotation Service (AGAT) - Version: v1.1.0              |
|   https://github.com/NBISweden/AGAT                                          |
|   National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se         |
 ------------------------------------------------------------------------------
  
< truncated >

General problem:
I need to be able to extract version information reliably when using these scripts in the nf-core modules way.
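
For reference, pulling the version string out of the banner could be sketched like this in Python. The regular expression and function name are hypothetical, and the sketch only returns whatever the banner reports, so it does not fix the mismatch between the banner (v1.1.0) and the container tag (1.2.0) described above.

```python
import re

# Matches "Version: v1.1.0" or "Version: 1.1.0" anywhere in the help text.
VERSION = re.compile(r"Version:\s*(v?\d+(?:\.\d+)*)")


def extract_version(help_text):
    """Return the version string found in a help banner, or None if absent."""
    match = VERSION.search(help_text)
    return match.group(1) if match else None


banner = "|   Genome Assembly Annotation Service (AGAT) - Version: v1.1.0              |"
print(extract_version(banner))
```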

Replace Travis by github actions

Travis is too limited. We have only a few builds per month and in many cases the CI tests are not run.
It would be nice to move to GitHub Actions. See AGAT or pipelines-nextflow for an example.

gaas_maker_merge_outputs_from_datastore.pl does not create the maker_mix.gff file anymore

Hi,
I ran gaas_maker_merge_outputs_from_datastore.pl on the genome.maker.output folder of maker on the 30th of September and no maker_mix.gff was created.
The last time I ran the script and got a maker_mix.gff file was on the 29th of July (on another species).
I ran gaas_maker_merge_outputs_from_datastore.pl again on the same folder as on the 29th of July, but no maker_mix.gff was created.

I checked the git log but I don't see anything that was done that would change gaas_maker_merge_outputs_from_datastore.pl

any ideas?

Thank you,

Lucile

GXF Handler - When parsing using locus tag - duplicates are not removed.

see RAST gff annotation

Increase the usage of combined assignment operators

👀 Some source code analysis tools can help to find opportunities for improving software components.
💭 I propose to increase the usage of combined operators accordingly.

diff --git a/annotation/tools/abinitio/augustus/gaas_junctions2hints.pl b/annotation/tools/abinitio/augustus/gaas_junctions2hints.pl
index a74a6ef..d3ee10f 100755
--- a/annotation/tools/abinitio/augustus/gaas_junctions2hints.pl
+++ b/annotation/tools/abinitio/augustus/gaas_junctions2hints.pl
@@ -47,8 +47,8 @@ while (<INFILE>) {
   unless ($_ =~ /^track/){
     my @bed_line=split(/\t/, $_);
     my ($startblock,$endblock)=split(/\,/, $bed_line[10]);
-    $bed_line[1]=$bed_line[1]+$startblock+1;
-    $bed_line[2]=$bed_line[2]-$endblock;
+    $bed_line[1] += $startblock + 1;
+    $bed_line[2] -= $endblock;
     my $key = join (':',$bed_line[0],$bed_line[1],$bed_line[2]);
     $junctions{$key} +=$bed_line[4];
   }
diff --git a/annotation/tools/comparative_genomic/gaas_orthomcl_analyzeOG.pl b/annotation/tools/comparative_genomic/gaas_orthomcl_analyzeOG.pl
index d16dacc..d01aa93 100755
--- a/annotation/tools/comparative_genomic/gaas_orthomcl_analyzeOG.pl
+++ b/annotation/tools/comparative_genomic/gaas_orthomcl_analyzeOG.pl
@@ -409,7 +409,7 @@ foreach my $keyID (keys %LossIdByAppearance){
             print $message; if ( $opt_output ){ print $outReport $message; }
 
             if(exists ($lossMergedByTaxid{$keyID2})){
-                $lossMergedByTaxid{$keyID2}=$lossMergedByTaxid{$keyID2}+$value;
+                $lossMergedByTaxid{$keyID2} += $value;
             }
             else{
                 $lossMergedByTaxid{$keyID2}=$value;
diff --git a/annotation/tools/comparative_genomic/gaas_prepare_matrice_by_window.pl b/annotation/tools/comparative_genomic/gaas_prepare_matrice_by_window.pl
index e4c6fa3..53c7823 100755
--- a/annotation/tools/comparative_genomic/gaas_prepare_matrice_by_window.pl
+++ b/annotation/tools/comparative_genomic/gaas_prepare_matrice_by_window.pl
@@ -263,7 +263,7 @@ foreach my $chr (keys %hash_chr2name){
     if( (1000000*$currentLimit) > $highvalue){$stop++};
 
     my $nbPrintedVal=0;
-    $currentLimit = $currentLimit + $opt_windowsSize;
+    $currentLimit += $opt_windowsSize;
     $currentCenter=$currentLimit-($opt_windowsSize/2);
     $currentCenter=sprintf "%.1f",$currentCenter;
     $path=$dir."/".$currentCenter."Mb.aln";
diff --git a/annotation/tools/fasta/gaas_fasta_domain_extractor.pl b/annotation/tools/fasta/gaas_fasta_domain_extractor.pl
index abb042c..200df2f 100755
--- a/annotation/tools/fasta/gaas_fasta_domain_extractor.pl
+++ b/annotation/tools/fasta/gaas_fasta_domain_extractor.pl
@@ -114,7 +114,7 @@ if(length($seq) < $end){print "End position for extraction is over the sequence
 #start is 1-based coordinate system
 # The extraction compute in 0-based coordinate system
 # Lets change the 1-based coordinate system in 0-based coordinate system for the start
-$start=$start-1;
+$start -= 1;
 my $lengtExtraction=$end-$start; #Length in 0-based coordinate (in 1-based coordinate we must add +1)
 print "Length sequence extracted: $lengtExtraction\n";
 my $extractedPart=substr($seq, $start, $lengtExtraction);
diff --git a/annotation/tools/fasta/gaas_fasta_extract_sequence_from_id.pl b/annotation/tools/fasta/gaas_fasta_extract_sequence_from_id.pl
index d962388..3408f05 100755
--- a/annotation/tools/fasta/gaas_fasta_extract_sequence_from_id.pl
+++ b/annotation/tools/fasta/gaas_fasta_extract_sequence_from_id.pl
@@ -81,7 +81,7 @@ if (-f $opt_name){
   if (! defined $col){
     $col=0;
   }
-  else{$col=$col -1 ;}
+  else{$col -= 1 ;}
 
   #Manage line to avoid
   if (! defined $lineToAvoid){
diff --git a/annotation/tools/fasta/gaas_fasta_splitter.pl b/annotation/tools/fasta/gaas_fasta_splitter.pl
index 44290ed..8767440 100755
--- a/annotation/tools/fasta/gaas_fasta_splitter.pl
+++ b/annotation/tools/fasta/gaas_fasta_splitter.pl
@@ -342,7 +342,7 @@ sub split_fasta{
 
 								if ($next_size < ceil($split_size/2) ){
 
-									$size = $size+$next_size;
+									$size += $next_size;
 									$continue = undef;
 									print "attach: attach because next_size $next_size < $split_size / 2(ceil) \n";
 
diff --git a/annotation/tools/fastq/gaas_fastq_check_sync_pair1_pair2.pl b/annotation/tools/fastq/gaas_fastq_check_sync_pair1_pair2.pl
index 42fbfc1..098d37e 100755
--- a/annotation/tools/fastq/gaas_fastq_check_sync_pair1_pair2.pl
+++ b/annotation/tools/fastq/gaas_fastq_check_sync_pair1_pair2.pl
@@ -114,7 +114,7 @@ while (!eof($in1) and !eof($in2)) {
 }
 
 my $percent = ($read_fail / $read_cpt) * 100 ;
-$percent = $percent * 100; # make it percent
+$percent *= 100; # make it percent
 $percent = sprintf("%.2f", $percent);
 
 my $end_run = time();
@@ -141,7 +141,7 @@ sub concat_list_from_left{
 
   my $result="";
   foreach my $element (@{$list}){
-    $result = $result.$element;
+    $result .= $element;
   }
 
   return $result;
diff --git a/annotation/tools/fastq/gaas_fastq_deinterleave_bash.pl b/annotation/tools/fastq/gaas_fastq_deinterleave_bash.pl
index 9176713..1771ae0 100755
--- a/annotation/tools/fastq/gaas_fastq_deinterleave_bash.pl
+++ b/annotation/tools/fastq/gaas_fastq_deinterleave_bash.pl
@@ -112,7 +112,7 @@ sub concat_list_from_left{
 
   my $result="";
   foreach my $element (@{$list}){
-    $result = $result.$element;
+    $result .= $element;
   }
 
   return $result;
diff --git a/annotation/tools/ncbi/gaas_ncbi_get_sequence_from_list.pl b/annotation/tools/ncbi/gaas_ncbi_get_sequence_from_list.pl
index 524846d..3c620a4 100755
--- a/annotation/tools/ncbi/gaas_ncbi_get_sequence_from_list.pl
+++ b/annotation/tools/ncbi/gaas_ncbi_get_sequence_from_list.pl
@@ -82,7 +82,7 @@ else{
 if (! defined $col){
 	$col=0;
 }
-else{$col=$col -1 ;}
+else{$col -= 1 ;}
 
 #Manage line to avoid
 if (! defined $lineToAvoid){
diff --git a/bin/gaas_fasta_domain_extractor.pl b/bin/gaas_fasta_domain_extractor.pl
index abb042c..200df2f 100755
--- a/bin/gaas_fasta_domain_extractor.pl
+++ b/bin/gaas_fasta_domain_extractor.pl
@@ -114,7 +114,7 @@ if(length($seq) < $end){print "End position for extraction is over the sequence
 #start is 1-based coordinate system
 # The extraction compute in 0-based coordinate system
 # Lets change the 1-based coordinate system in 0-based coordinate system for the start
-$start=$start-1;
+$start -= 1;
 my $lengtExtraction=$end-$start; #Length in 0-based coordinate (in 1-based coordinate we must add +1)
 print "Length sequence extracted: $lengtExtraction\n";
 my $extractedPart=substr($seq, $start, $lengtExtraction);
diff --git a/bin/gaas_fasta_extract_sequence_from_id.pl b/bin/gaas_fasta_extract_sequence_from_id.pl
index d962388..3408f05 100755
--- a/bin/gaas_fasta_extract_sequence_from_id.pl
+++ b/bin/gaas_fasta_extract_sequence_from_id.pl
@@ -81,7 +81,7 @@ if (-f $opt_name){
   if (! defined $col){
     $col=0;
   }
-  else{$col=$col -1 ;}
+  else{$col -= 1 ;}
 
   #Manage line to avoid
   if (! defined $lineToAvoid){
diff --git a/bin/gaas_fasta_splitter.pl b/bin/gaas_fasta_splitter.pl
index 44290ed..8767440 100755
--- a/bin/gaas_fasta_splitter.pl
+++ b/bin/gaas_fasta_splitter.pl
@@ -342,7 +342,7 @@ sub split_fasta{
 
 								if ($next_size < ceil($split_size/2) ){
 
-									$size = $size+$next_size;
+									$size += $next_size;
 									$continue = undef;
 									print "attach: attach because next_size $next_size < $split_size / 2(ceil) \n";
 
diff --git a/bin/gaas_fastq_check_sync_pair1_pair2.pl b/bin/gaas_fastq_check_sync_pair1_pair2.pl
index 42fbfc1..098d37e 100755
--- a/bin/gaas_fastq_check_sync_pair1_pair2.pl
+++ b/bin/gaas_fastq_check_sync_pair1_pair2.pl
@@ -114,7 +114,7 @@ while (!eof($in1) and !eof($in2)) {
 }
 
 my $percent = ($read_fail / $read_cpt) * 100 ;
-$percent = $percent * 100; # make it percent
+$percent *= 100; # make it percent
 $percent = sprintf("%.2f", $percent);
 
 my $end_run = time();
@@ -141,7 +141,7 @@ sub concat_list_from_left{
 
   my $result="";
   foreach my $element (@{$list}){
-    $result = $result.$element;
+    $result .= $element;
   }
 
   return $result;
diff --git a/bin/gaas_fastq_deinterleave_bash.pl b/bin/gaas_fastq_deinterleave_bash.pl
index 9176713..1771ae0 100755
--- a/bin/gaas_fastq_deinterleave_bash.pl
+++ b/bin/gaas_fastq_deinterleave_bash.pl
@@ -112,7 +112,7 @@ sub concat_list_from_left{
 
   my $result="";
   foreach my $element (@{$list}){
-    $result = $result.$element;
+    $result .= $element;
   }
 
   return $result;
diff --git a/bin/gaas_junctions2hints.pl b/bin/gaas_junctions2hints.pl
index a74a6ef..d3ee10f 100755
--- a/bin/gaas_junctions2hints.pl
+++ b/bin/gaas_junctions2hints.pl
@@ -47,8 +47,8 @@ while (<INFILE>) {
   unless ($_ =~ /^track/){
     my @bed_line=split(/\t/, $_);
     my ($startblock,$endblock)=split(/\,/, $bed_line[10]);
-    $bed_line[1]=$bed_line[1]+$startblock+1;
-    $bed_line[2]=$bed_line[2]-$endblock;
+    $bed_line[1] += $startblock + 1;
+    $bed_line[2] -= $endblock;
     my $key = join (':',$bed_line[0],$bed_line[1],$bed_line[2]);
     $junctions{$key} +=$bed_line[4];
   }
diff --git a/bin/gaas_ncbi_get_sequence_from_list.pl b/bin/gaas_ncbi_get_sequence_from_list.pl
index 524846d..3c620a4 100755
--- a/bin/gaas_ncbi_get_sequence_from_list.pl
+++ b/bin/gaas_ncbi_get_sequence_from_list.pl
@@ -82,7 +82,7 @@ else{
 if (! defined $col){
 	$col=0;
 }
-else{$col=$col -1 ;}
+else{$col -= 1 ;}
 
 #Manage line to avoid
 if (! defined $lineToAvoid){
diff --git a/bin/gaas_orthomcl_analyzeOG.pl b/bin/gaas_orthomcl_analyzeOG.pl
index d16dacc..d01aa93 100755
--- a/bin/gaas_orthomcl_analyzeOG.pl
+++ b/bin/gaas_orthomcl_analyzeOG.pl
@@ -409,7 +409,7 @@ foreach my $keyID (keys %LossIdByAppearance){
             print $message; if ( $opt_output ){ print $outReport $message; }
 
             if(exists ($lossMergedByTaxid{$keyID2})){
-                $lossMergedByTaxid{$keyID2}=$lossMergedByTaxid{$keyID2}+$value;
+                $lossMergedByTaxid{$keyID2} += $value;
             }
             else{
                 $lossMergedByTaxid{$keyID2}=$value;
diff --git a/bin/gaas_prepare_matrice_by_window.pl b/bin/gaas_prepare_matrice_by_window.pl
index e4c6fa3..53c7823 100755
--- a/bin/gaas_prepare_matrice_by_window.pl
+++ b/bin/gaas_prepare_matrice_by_window.pl
@@ -263,7 +263,7 @@ foreach my $chr (keys %hash_chr2name){
     if( (1000000*$currentLimit) > $highvalue){$stop++};
 
     my $nbPrintedVal=0;
-    $currentLimit = $currentLimit + $opt_windowsSize;
+    $currentLimit += $opt_windowsSize;
     $currentCenter=$currentLimit-($opt_windowsSize/2);
     $currentCenter=sprintf "%.1f",$currentCenter;
     $path=$dir."/".$currentCenter."Mb.aln";

gff3_sq_remove_redundant_entries.pl error

I'm training a model for my own species using Augustus. I want to get a gff3 file without redundant proteins, and I don't know how to do this, so I hope the Perl script gff3_sq_remove_redundant_entries.pl can help me.
However, I found an error in this script. When I try to use it, it gives me the following error:

Global symbol "$help" requires explicit package name (did you forget to declare "my $help"?) at ./gff3_sq_remove_redundant_entries.pl line 38.
Execution of ./gff3_sq_remove_redundant_entries.pl aborted due to compilation errors.

I would also like to know whether this script can remove redundant proteins from a gff3 file.

gaas_fasta_statistics.pl output a markdown table not tsv or csv

This is a feature request. Can you make ./gaas_fasta_statistics.pl output a csv file or tsv file instead of a markdown format output. It's good for visual inspection of assembly stats but I need to use GAAS to make a multi assembly summary file.
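
Until such an option exists, a pipe-delimited Markdown table can be converted after the fact. The Python sketch below assumes a generic Markdown table layout; the exact output format of gaas_fasta_statistics.pl is not shown here, and the function name and the example table are hypothetical.

```python
def markdown_table_to_tsv(text):
    """Convert pipe-delimited Markdown table rows to tab-separated lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append("\t".join(cells))
    return "\n".join(rows)


# Hypothetical table, for illustration only.
example = """
| Metric | Value |
|---|---|
| Number of sequences | 12 |
"""
print(markdown_table_to_tsv(example))
```

Running this on the output for each assembly and concatenating the rows would give one summary file for several assemblies.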
