molbio-dresden / flexidot

Highly customizable, ambiguity-aware dotplots for visual sequence analyses

License: GNU Lesser General Public License v2.1

Python 100.00%
sequence-analysis visualization dotplot repetitive-dna longest-common-subsequence pairwise-comparisons biology bioinformatics

flexidot's Issues

Conda dist

Hello Flexidot team,

I can't use the software; it keeps giving an error. Here it is:

$ python2 ../flexidot/code/flexidot_v1.05.py -h
Installing Python module: matplotlib
	python -m pip install --upgrade matplotlib

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Requirement already up-to-date: matplotlib in /home/zarul/.local/lib/python2.7/site-packages (2.2.4)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.1 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (2.8.1)
Requirement already satisfied, skipping upgrade: subprocess32 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (3.5.4)
Requirement already satisfied, skipping upgrade: cycler>=0.10 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (0.10.0)
Requirement already satisfied, skipping upgrade: six>=1.10 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (1.14.0)
Requirement already satisfied, skipping upgrade: backports.functools-lru-cache in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (1.6.1)
Requirement already satisfied, skipping upgrade: pytz in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (2019.3)
Requirement already satisfied, skipping upgrade: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (2.4.6)
Requirement already satisfied, skipping upgrade: numpy>=1.7.1 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (1.16.6)
Requirement already satisfied, skipping upgrade: kiwisolver>=1.0.1 in /home/zarul/.local/lib/python2.7/site-packages (from matplotlib) (1.1.0)
Requirement already satisfied, skipping upgrade: setuptools in /home/zarul/.local/lib/python2.7/site-packages (from kiwisolver>=1.0.1->matplotlib) (44.0.0)


Please install module matplotlib manually
Traceback (most recent call last):
  File "../flexidot/code/flexidot_v1.05.py", line 3315, in <module>
    load_modules()
  File "../flexidot/code/flexidot_v1.05.py", line 61, in load_modules
    import matplotlib.colors as mcolors
  File "/home/zarul/.local/lib/python2.7/site-packages/matplotlib/__init__.py", line 126, in <module>
    from . import cbook
  File "/home/zarul/.local/lib/python2.7/site-packages/matplotlib/cbook/__init__.py", line 34, in <module>
    import numpy as np
  File "/home/zarul/.local/lib/python2.7/site-packages/numpy/__init__.py", line 140, in <module>
    from . import _distributor_init
ImportError: cannot import name _distributor_init

Could you please provide it as a conda package through the Bioconda channel?
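
The traceback above ends inside numpy (`cannot import name _distributor_init`) rather than in FlexiDot itself, so it may help to test each dependency import in isolation under the same interpreter that runs the script. Below is a minimal sketch; the helper name `check_imports` is hypothetical and not part of FlexiDot, and the module list is taken from the log above.

```python
from __future__ import print_function

import importlib
import sys


def check_imports(module_names):
    """Try to import each module; map name -> None on success or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError is typical, broken builds may raise others
            results[name] = "{}: {}".format(type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    # Shows which interpreter is actually running, e.g. python2.7 vs. python3.
    print("Interpreter:", sys.executable)
    for name, error in check_imports(["numpy", "matplotlib"]).items():
        print(name, "OK" if error is None else "FAILED -> " + error)
```

If numpy alone already fails here, the problem lies in the local numpy installation and not in the dotplot script.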

Self plot shading

Dear author,

I went through SupplementaryData.pdf and found the example of using a .gff3 file to annotate the plot. But how can I add a figure legend? I did not see a legend with my data or with your test data.

Thanks!

Flexidot computational time

Hi
Thanks a lot for developing this toolkit; it looks really amazing.
However, may I know the computational time and number of threads needed to obtain the graphs?
I am trying it, but it seems to take a long time to perform even a first calculation.
Does it need to be run on a server?
Best Regards
PL

Comparing two genomes

Say I have two genomes. One is complete (pseudomolecule-level assembly) and the other is not complete yet (a few Mb-sized scaffolds). My question is: is it possible to plot these two genomes against each other to study genome rearrangements? I'm confused by the -p options.

sub sequence

Hi,
great tool, it has made my life so much easier!
One thing that I think might be missing (unless I am mistaken) is the possibility of reading in a very long FASTA file, say an entire genome, and then passing a list of regions (in BED format or another format) for the program to work on.
That way the plots would keep the proper genomic coordinates rather than arbitrary numbers.
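
As a workaround outside FlexiDot, the regions could be cut out beforehand and the genomic coordinates stored in the sequence names. The following is a minimal sketch under that assumption; `read_fasta` and `extract_regions` are hypothetical helpers, and this only preserves the coordinates in the FASTA header, it does not change the axis numbering of the plot.

```python
def read_fasta(lines):
    """Parse FASTA text lines into {sequence_id: sequence}."""
    sequences, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            # Use the first word of the header as the sequence ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def extract_regions(sequences, bed_lines):
    """Yield (header, subsequence) for each BED record.

    BED intervals are 0-based and half-open, so start/end map directly
    onto Python slices. The header records 1-based, inclusive coordinates.
    """
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        start, end = int(start), int(end)
        header = "{}:{}-{}".format(chrom, start + 1, end)
        yield header, sequences[chrom][start:end]
```

Writing each `(header, subsequence)` pair back out as a FASTA record gives input files whose names carry the original positions.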

Thank you a lot!

It is a really brilliant tool!!! I was looking for something like it for a long time. I am working with different genome assemblies, and this tool helps me compare them.

Use blast output as input?

Hey, great tool! Looks very promising to combine it with RepeatMasker gff3 annotation, and I love the way of installation - so easy! Every python program should do this!

So I tried it on a 44K-sequence region with heavily nested TE insertions. It runs quite slow on my laptop with 4 threads. Immediately I am thinking: can the program take BLAST outputs (i.e., -outfmt=6) as inputs, which may be faster for the alignment procedure?
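
To illustrate what such an input path might involve, here is a minimal sketch that reads BLAST tabular output (`-outfmt 6`, default columns) into line segments that a dotplot could draw. Nothing in this thread indicates that FlexiDot accepts this format; `Segment` and `parse_blast_tabular` are hypothetical names.

```python
from collections import namedtuple

# One alignment, reduced to what is needed to draw a line in a dotplot.
Segment = namedtuple("Segment", "query subject q_start q_end s_start s_end reverse")


def parse_blast_tabular(lines, min_length=0):
    """Parse default `-outfmt 6` lines into Segment tuples.

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    segments = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[3]) < min_length:  # column 4 is the alignment length
            continue
        q_start, q_end = int(fields[6]), int(fields[7])
        s_start, s_end = int(fields[8]), int(fields[9])
        # For nucleotide searches, minus-strand hits are reported with sstart > send.
        reverse = s_start > s_end
        segments.append(
            Segment(fields[0], fields[1], q_start, q_end, s_start, s_end, reverse)
        )
    return segments
```

Each segment would then be drawn from `(q_start, s_start)` to `(q_end, s_end)`, with reverse hits in a second color.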

Another thought: I tried it on our CentOS 7 server, and the program seems to require a graphical display.

self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

Can it be run on a server and only generate png/pdf files for visualization on a laptop?

Thanks again!
Shujun
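
The `_tkinter.TclError` above is what matplotlib raises when a Tk-based backend is selected on a machine without an X display. A common workaround is to select the non-interactive `Agg` backend before matplotlib is first imported. The sketch below does this through matplotlib's `MPLBACKEND` environment variable; the helper name is hypothetical, the `DISPLAY` check is a Linux/X11 heuristic, and whether FlexiDot sets a backend itself (which would override this) is not known from this thread.

```python
import os


def select_headless_backend(environ=None):
    """Set MPLBACKEND=Agg when no X display is set and no backend was chosen.

    Must be called before matplotlib is imported for the first time.
    Returns the value of MPLBACKEND afterwards (or None if left unset).
    """
    environ = os.environ if environ is None else environ
    if not environ.get("DISPLAY") and "MPLBACKEND" not in environ:
        environ["MPLBACKEND"] = "Agg"
    return environ.get("MPLBACKEND")


if __name__ == "__main__":
    print("Backend:", select_headless_backend())
    # Import matplotlib only after this point; files can then be written
    # with savefig() without opening a window.
```

The same effect can be had without touching any code by prefixing the command with `MPLBACKEND=Agg`, or inside a script by calling `matplotlib.use("Agg")` before importing `matplotlib.pyplot`.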

License?

Could you please add a license to the repo? I would like to add the tool to Bioconda.

Error reading fasta sequences - please check input files!

Hi, thanks for this great software. I am unable to get started because I keep getting the error "Error reading fasta sequences - please check input files!". The error also occurs with test-seqs.fas.

Full log output pasted below:

python2.7 ~/Software/flexidot/code/flexidot_v1.00.py -i ~/Software/flexidot/test-data/test-seqs.fas -g ~/Software/flexidot/test-data/example2.gff3 -G ~/Software/flexidot/test-data/gff_color.config -o Cluster_94644.txt.bed.pdf -p 2
Installing Python module: colour
	python -m pip install colour

Requirement already satisfied: colour in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (0.1.5)


Please install module colour manually
Installing Python module: colormap
	python -m pip install colormap

Installing Python module: easydev
	python -m pip install easydev

Requirement already satisfied: colormap in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (1.0.4)
Requirement already satisfied: easydev in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (0.12.0)
Requirement already satisfied: colorama in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (from easydev) (0.4.4)
Requirement already satisfied: pexpect in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (from easydev) (4.8.0)
Requirement already satisfied: colorlog in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (from easydev) (6.6.0)
Requirement already satisfied: ptyprocess>=0.5 in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (from pexpect->easydev) (0.7.0)


Please install module colormap manually
Installing Python module: biopython
	python -m pip install biopython

Requirement already satisfied: biopython in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (1.79)
Requirement already satisfied: numpy in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (from biopython) (1.21.5)


Please install module biopython manually
Installing Python module: regex
	python -m pip install regex

Requirement already satisfied: regex in /Users/nawazk/opt/anaconda3/lib/python3.9/site-packages (2022.3.15)


Please install module regex manually

...reading input arguments...
fasta file #1: /Users/nawazk/Software/flexidot/test-data/test-seqs.fas
GFF file #1: /Users/nawazk/Software/flexidot/test-data/example2.gff3

----------------------------------------------------------------------

INPUT/OUTPUT OPTIONS...

Input fasta file:                                  /Users/nawazk/Software/flexidot/test-data/test-seqs.fas
Automatic fasta collection from current directory: False
Collage output:                                    True
Number of columns per page:                        4
Number of rows per page:                           5
File format:                                       png
Residue type is nucleotide:                        True


CALCULATION PARAMETERS...

Wordsize:                                          7
Plotting mode:                                     2
                                                   all-against-all 
Ambiguity handling:                                False
Reverse complement scanning:                       True
Alphabetic sorting:                                False
Prefix for output files:                           Cluster_94644.txt.bed.pdf


LCS SHADING OPTIONS (plotting_mode 'all-against-all' only)...

LCS shading:                                       False
LCS shading interval number:                       5
LCS shading reference:                             maximal LCS length
LCS shading orientation:                           forward


GRAPHIC FORMATTING...

Plot size:                                         10
Line width:                                        1
Line color:                                        black
Reverse line color:                                #009243
X label position:                                  True
Label size:                                        10
Spacing:                                           0.04
Title length (limit number of characters):         inf
Length scaling:                                    False
----------------------------------------------------------------------


==================================================

Reading GFF color configuration file
----------------------------

=> /Users/nawazk/Software/flexidot/test-data/gff_color.config

Updating GFF color configuration with custom specifications

1 feature type specifications overwritten:
	repeat_region

GFF color specification updated acc. to /Users/nawazk/Software/flexidot/test-data/gff_color.config
	misc_feat, misc_feature, spacer3, spacer2, spacer1, cds, misc, ltr_retrotransposon, ppt, tandem_repeat, target_site_duplication, intron, spacerzoom, repeat, ltr-retro, long_terminal_repeat, exon, others, orf, transposable_element, ltr, pbs, repeat_region, utr, repeat_region_rev, orf_rev, gene


==================================================

Running plotting modes 2
Error reading fasta sequences - please check input files!

Failed to load sequences

	 0.000445127487183 seconds

No image files were created!

==================================================

##################################################
##################################################

Thank you for using FlexiDot!
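
One detail in the log above: the script was started with `python2.7`, while every `python -m pip install` call reports packages under an Anaconda `python3.9` site-packages directory. That suggests the modules are being installed for a different interpreter than the one running the script. A minimal sketch of building the pip command from the running interpreter instead is shown below; `pip_install_command` is a hypothetical helper, not FlexiDot's own code.

```python
import sys


def pip_install_command(module_name, upgrade=False):
    """Build a pip command bound to the interpreter that runs this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(module_name)
    return command


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.check_call() to actually install.
    print(" ".join(pip_install_command("biopython")))
```

Using `sys.executable` avoids depending on whichever `python` happens to be first on the `PATH`.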



How best to decrease computation time?

Thanks for developing such a versatile tool; the ability to integrate annotations is especially nice. However, I find that running comparisons of relatively long sequences is prohibitively slow.
Specifically, I started an all-vs-all comparison of three plant genome sequences of 0.9, 2.7, and 5.6 Mb and it is still running after almost one day. Below are the settings used:
flexidot_v1.06.py -i JAATIP010000026.1.fas,JAATIP010000103.1.fas,UZAU01000736.fas -k 70 -p 2 -f 2 -w 0 -L 1 -g genes.gff -G gff_color.config
Are such runtimes expected for such sequence lengths, and if so, do you have any recommendations for speeding things up?

Thanks
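
For a sense of scale, the size of the dotplot search space can be estimated from the sequence lengths alone. The sketch below counts the cells of every dotplot in an all-against-all run, assuming each unordered pair (including self comparisons) is computed once. This is a naive upper bound and not FlexiDot's actual algorithm, which matches words rather than evaluating every cell; the lengths are the rounded values from the post above.

```python
from itertools import combinations_with_replacement


def dotplot_cells(lengths):
    """Sum of len_a * len_b over all unordered pairs, self comparisons included."""
    return sum(a * b for a, b in combinations_with_replacement(lengths, 2))


if __name__ == "__main__":
    lengths = [900000, 2700000, 5600000]  # 0.9, 2.7 and 5.6 Mb
    print("{:.3e} cells".format(dotplot_cells(lengths)))
```

Because the count grows with the product of the lengths, halving each sequence cuts the search space to roughly a quarter.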

annotation - pairwise comparisons

Dear Flexidot Developers,

Thank you for FlexiDot - this is just what we needed.
We would like to compare two different genomes in a pairwise dotplot with annotation, but I cannot figure out how to do so.

  1. It seems that the annotation option only works for "self" and "all-against-all" comparisons (-p 0/2) but not for pairwise dotplots.
    Even for "all-against-all", the annotations only show on the diagonal tiles, which are self dotplots.

Is there a way to show annotations in pairwise comparison dotplots?

  2. Also, is there any way to automatically assign color shading for the annotations without manually creating a color config file?

I'm currently using the Python 3 version ("flexidot_v2.01_alpha01_py3").

Thank you!
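
On the second question, colors could in principle be generated outside FlexiDot from the feature types present in the GFF3 file. The following is a minimal sketch, assuming evenly spaced hues are acceptable; `feature_types` and `assign_colors` are hypothetical helpers, and because the layout of the color config file is not shown in this thread, the sketch stops at a type-to-color mapping and does not write a config file.

```python
import colorsys


def feature_types(gff_lines):
    """Collect the distinct feature types (GFF3 column 3), sorted by name."""
    types = set()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 3:
            types.add(columns[2])
    return sorted(types)


def assign_colors(types, saturation=0.65, value=0.85):
    """Map each feature type to a hex color with evenly spaced hues."""
    colors = {}
    for index, feature in enumerate(types):
        hue = index / float(len(types))
        rgb = colorsys.hsv_to_rgb(hue, saturation, value)
        colors[feature] = "#{:02x}{:02x}{:02x}".format(
            *[int(round(channel * 255)) for channel in rgb]
        )
    return colors
```

The resulting mapping would still need to be written out in whatever format the `-G` option expects.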
