smdabdoub / phylotoast

Tools for phylogenetic data analysis including visualization and cluster-computing support.

Home Page: http://phylotoast.org

License: MIT License

Python 100.00%
python phylogenetics qiime visualization 16s microbiome microbiome-analysis microbial-genomics bioinformatics computational-biology

phylotoast's People

Contributors

akshayparopkari, smdabdoub

phylotoast's Issues

Switch to Python3

  • Planned changes:
    • Changing Python2 code syntax to align with Python3 syntax.
      • Tackling backward compatibility issues.
      • Continued usage of PEP8 and PEP257 guidelines.
    • Checking and updating package dependencies.

AttributeError: 'module' object has no attribute 'rstyle'

When attempting to run the PCoA_bubble.py script, I run into the below error:

Traceback (most recent call last):
  File "/usr/local/bin/PCoA_bubble.py", line 254, in <module>
    main()
  File "/usr/local/bin/PCoA_bubble.py", line 251, in main
    category_colors, xr, yr, args.output_dir)
  File "/usr/local/bin/PCoA_bubble.py", line 109, in plot_PCoA
    graph_util.rstyle(ax)
AttributeError: 'module' object has no attribute 'rstyle'

From my googling of the issue, I assume this has to do with matplotlib, but beyond that I'm helpless. As an aside, there are pretty major differences between the input of this script as described in the online documentation and the script as installed, but that could be the difference between the pip-installed version and the GitHub-installed version, correct?

My Environment

  • Version used: pip installed (So I assume up-to-date)
  • Operating System and version: Ubuntu 16.04.1 LTS

Update PCoA_bubble

The PCoA_bubble script was written very early on, and never updated along with the rest of the visualization scripts. The code needs to be cleaned up in general, but a few specific things need to be addressed:

  • Remove dependence on the custom file format for specifying group names (mapping file column), colors (colors column in the mapping file), and OTU names (which can now be automatically generated by the otu_calc module)
  • Use the relative abundance method now available in the biom_calc module
  • Replace link_samples_to_categories() with the gather_categories method in the util module
  • Make applying the ggplot2 style optional

Integrating network analysis into PhyloToAST

  • Incremental contributions
    • ability to create network graph objects and generate .gexf files for usage in Gephi, Cytoscape, etc.
    • possibly generate network plots in Python, in addition to using external visualization tools mentioned above.

phylotoast is not in PyPI

The docs say that phylotoast is in PyPI, but the command pip install phylotoast fails. A search of the pypi.org website also fails to locate the package.

Modify parse_map_file()

Either use a csv.DictReader or have the function return the header line along with existing list of entries.

List of parse_map_file occurrences:
bin/PCoA_bubble.py: imap = util.parse_map_file(args.mapping)
bin/barcode_filter.py: barcodes = util.parse_map_file(mapFN, 1).keys()
bin/iTol.py: imap = util.parse_map_file(args.mapping)
bin/transpose_biom.py: mapping = util.parse_map_file(args.mapping)
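
The csv.DictReader option could look roughly like the sketch below. It is only an illustration, not the current util.parse_map_file() implementation: the function name parse_map_file_with_header, the (header, entries) return shape, and the example column names are all assumptions made for this example. It assumes a tab-delimited mapping file whose first line is the header and whose first column holds the sample ID.

```python
import csv
import io
from collections import OrderedDict


def parse_map_file_with_header(handle):
    """Hypothetical variant of parse_map_file() that also returns the header.

    handle -- an open text file (or file-like object) for a tab-delimited
              mapping file whose first line is the header.
    Returns (header, entries): header is the list of column names, entries
    maps each sample ID (first column) to a dict of column name -> value.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = list(reader.fieldnames or [])
    id_column = header[0] if header else None
    entries = OrderedDict()
    for row in reader:
        sample_id = row[id_column]
        # Skip comment lines that appear after the header.
        if sample_id is None or sample_id.startswith("#"):
            continue
        entries[sample_id] = row
    return header, entries


# Made-up mapping file content, for illustration only.
example = io.StringIO(
    "#SampleID\tBarcodeSequence\tTreatment\n"
    "S1\tACGT\tControl\n"
    "S2\tTGCA\tTreated\n"
)
header, entries = parse_map_file_with_header(example)
```

Because entries is still keyed by sample ID, callers that only need the keys (as barcode_filter.py does) would keep working after unpacking the tuple, while callers that need column names can read them from header.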

LDA/PCoA update

  • LDA
    • Remove --input_data_type parameter.
      • utilize argparse checks for relevant input files for bubble plots.
    • Consolidate figure sizes for LDA and LDA bubble plots.
    • Add optional 3D plotting functionality.
  • PCoA
    • allow users to input distance matrices for PCoA plotting purposes.
    • transfer PCoA computation to PhyloToAST backend.
      • allow users to calculate a variety of beta diversity metrics for downstream analysis.
      • add a parameter which lists out all the available beta diversity metrics which users can use.

Update otu_to_tax_name.py

Currently, this script uses OTUIDs to get OTU genus-species identifiers. Add a reverse option: the ability to get OTUIDs from genus-species identifiers.
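
One way the reverse lookup could work is to invert the OTUID-to-name mapping. This is a minimal sketch under the assumption that the forward mapping is available as a plain dict; the function name invert_otu_names and the OTU IDs and names in the example are made up for illustration and are not taken from otu_to_tax_name.py.

```python
from collections import defaultdict


def invert_otu_names(otu_to_name):
    """Map each genus-species identifier to the sorted list of its OTU IDs.

    Several OTUs can share one genus-species identifier, so each name
    maps to a list rather than to a single OTU ID.
    """
    name_to_otus = defaultdict(list)
    for otu_id, name in otu_to_name.items():
        name_to_otus[name].append(otu_id)
    return {name: sorted(ids) for name, ids in name_to_otus.items()}


# Illustrative data only.
forward = {
    "OTU_1": "Streptococcus_mitis",
    "OTU_2": "Veillonella_parvula",
    "OTU_3": "Streptococcus_mitis",
}
reverse = invert_otu_names(forward)
```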

Editing raw_abundance() in biom_calc.py

Changes to biom_calc.py file

  • split the raw_abundance() function into two functions
  • one calculates total abundance, and the other takes the total abundance output and transforms it using a function
    • by default, the transformation function would be the base 10 logarithm.
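
The proposed split might be sketched as follows. This is not the biom_calc code: it uses a nested dict in place of a BIOM table, and the names total_abundance and transform_abundance are hypothetical. Leaving zero totals at 0.0 instead of passing them to the logarithm is also an assumption of this sketch, since the issue does not say how zeroes should be handled.

```python
import math


def total_abundance(counts, sample_ids):
    """Sum each OTU's counts over the given sample IDs.

    counts -- dict of OTU ID -> dict of sample ID -> count (a stand-in
              for a BIOM table in this sketch).
    Sample IDs missing for an OTU contribute zero.
    """
    return {
        otu_id: sum(per_sample.get(sid, 0) for sid in sample_ids)
        for otu_id, per_sample in counts.items()
    }


def transform_abundance(totals, func=math.log10):
    """Apply func to each positive total; zero totals stay at 0.0.

    Skipping zeroes is an assumption of this sketch: math.log10(0)
    raises ValueError, so some policy for them is needed.
    """
    return {
        otu_id: func(value) if value > 0 else 0.0
        for otu_id, value in totals.items()
    }


# Illustrative data only.
counts = {
    "OTU_1": {"S1": 90, "S2": 10},
    "OTU_2": {"S1": 0, "S2": 0},
}
totals = total_abundance(counts, ["S1", "S2"])
log_totals = transform_abundance(totals)
```

Keeping the transformation as a parameter lets callers pass another function, such as math.log2 or math.sqrt, without touching the summation step.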

Replace brewer2mpl

The brewer2mpl library has been replaced by Palettable. The code in PCoA.py and diversity.py should be updated to reflect this.

iTol.py: No warning given if mapping file sample IDs are not in the BIOM table

When calculating raw abundance (for example), if there is no overlap in sample IDs between the mapping file and the BIOM table, the output will be zero for each OTU.

While the mapping file is not strictly necessary for calculating raw abundance, it is for the other calculations. As such it is a required argument to the program. Additionally, for greater code simplicity the abundance calculation methods in the biom_calc module take a list of sample IDs. In the case of iTol.py calculating total OTU abundance, it just passes all the sample IDs in the mapping file to the abundance calculation methods. If none of those sample IDs are found in the BIOM table, nothing will be calculated and all zeroes will be output.

For these reasons, changing the code just for the raw abundance use case would add unnecessary complexity. Instead, a warning should be output if no overlap in sample IDs is found between the mapping file and the BIOM table.
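
A check along these lines could produce that warning. It is a sketch only: the function name check_sample_overlap is hypothetical, and it assumes the sample IDs from the mapping file and from the BIOM table are already available as plain iterables of strings.

```python
import warnings


def check_sample_overlap(map_sample_ids, biom_sample_ids):
    """Return the shared sample IDs, warning if there are none."""
    shared = set(map_sample_ids) & set(biom_sample_ids)
    if not shared:
        warnings.warn(
            "No sample IDs are shared between the mapping file and the "
            "BIOM table; abundance output will be all zeroes."
        )
    return shared
```

Calling this once before the abundance calculation would leave the biom_calc methods unchanged, which matches the goal of not adding complexity for the raw abundance case.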
